How does Tonic.ai actually work for creating safe test data from a production database?
We are a healthcare software company, and one of our persistent challenges is that our developers and QA team need realistic test data, but we cannot use actual patient data in non-production environments for obvious compliance reasons. Right now we use a manually created set of fake records, which is time-consuming to maintain and does not reflect the complexity and variety of real production data. As a result, we miss edge cases in testing fairly regularly.
[Tonic.ai](http://Tonic.ai) has been mentioned as a tool that generates synthetic data that preserves the statistical properties and relational structure of real production data without containing any actual personal information. That sounds exactly like what we need, but I want to understand how it handles healthcare data specifically, which tends to have a lot of interdependencies between tables and domain-specific data patterns that need to look realistic.
Has anyone used Tonic.ai in a healthcare or similarly complex regulated data environment? I want to understand the implementation process, specifically how it connects to and analyses your production schema, how much configuration is required to get the synthetic data looking realistic, and whether referential integrity across related tables is preserved in the output. I'm also curious about the compliance documentation it provides for audit purposes.