Tonic: generating real fake data for the 40-50% of developers who use production data in pre-production environments
Real production data in pre-production environments exposes actual customer PII, actual financial records and actual sensitive business data to development and testing infrastructure that has weaker security controls than production. The regulatory exposure this creates (GDPR, HIPAA, SOX) varies by industry, but the legal liability is real.
The technical solution is to generate real fake data: data that maintains the statistical properties, referential integrity and business rule consistency of production data without containing any actual sensitive information. The value proposition is that the test coverage benefit is preserved while the security risk is eliminated.
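To make those three properties concrete, here is a minimal sketch in plain Python. It is not Tonic's API or method. The REAL_ORDERS rows, the synthesize function, the FAKE- identifier scheme and the choice of a log-normal fit for amounts are all assumptions invented for illustration.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical "production" rows, invented for this illustration only.
REAL_ORDERS = [
    {"customer_id": "C-9001", "amount": 19.99, "status": "paid"},
    {"customer_id": "C-9001", "amount": 250.00, "status": "paid"},
    {"customer_id": "C-9002", "amount": 5.49, "status": "refunded"},
    {"customer_id": "C-9003", "amount": 1200.00, "status": "paid"},
    {"customer_id": "C-9003", "amount": 42.10, "status": "pending"},
]


def synthesize(real_orders, n_customers, n_orders, seed=0):
    """Generate fake customers and orders from summary statistics only."""
    rng = random.Random(seed)

    # Statistical properties: fit a log-normal to the amounts (an assumed
    # model) and keep the observed frequency of each status.
    log_amounts = [math.log(o["amount"]) for o in real_orders]
    mu = statistics.mean(log_amounts)
    sigma = statistics.stdev(log_amounts)
    status_counts = Counter(o["status"] for o in real_orders)
    statuses = list(status_counts)
    weights = [status_counts[s] for s in statuses]

    # Parent rows get generated identifiers; no real ID is ever copied.
    customers = [{"customer_id": f"FAKE-{i:04d}"} for i in range(n_customers)]

    orders = []
    for i in range(n_orders):
        orders.append({
            "order_id": i,
            # Referential integrity: each foreign key points at a fake customer.
            "customer_id": rng.choice(customers)["customer_id"],
            # Business rule (assumed): amounts are at least 0.01, two decimals.
            "amount": max(0.01, round(rng.lognormvariate(mu, sigma), 2)),
            "status": rng.choices(statuses, weights=weights)[0],
        })
    return customers, orders


customers, orders = synthesize(REAL_ORDERS, n_customers=3, n_orders=100)
```

Only aggregate statistics cross from the real rows into the output, so no original identifier or record is copied. A real tool would need to handle far more than this, such as correlations between columns, multi-table schemas and free-text fields.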
The broader implication for development practices: test environments that use data with the same statistical characteristics as production catch the same categories of edge cases that production data would. Random or obviously synthetic data misses the edge cases that only appear in data with real-world distributions.
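A small sketch of why distribution shape matters, under stated assumptions: the 1000.0 limit stands in for a hypothetical edge case (say, a fixed-width amount field), and the log-normal parameters are illustrative rather than measured from any real dataset.

```python
import random

FIELD_LIMIT = 1000.0  # hypothetical threshold where an edge case begins


def count_over_limit(values, limit=FIELD_LIMIT):
    """Count values that would exercise the edge case guarded by `limit`."""
    return sum(1 for v in values if v >= limit)


rng = random.Random(42)
n = 10_000

# "Obviously synthetic" data: uniform amounts in a tidy range.
uniform_amounts = [rng.uniform(0, 100) for _ in range(n)]

# Heavy-tailed data, a shape often seen in real monetary amounts
# (parameters are assumptions chosen for illustration).
heavy_tailed_amounts = [rng.lognormvariate(4.0, 1.5) for _ in range(n)]

uniform_hits = count_over_limit(uniform_amounts)
heavy_hits = count_over_limit(heavy_tailed_amounts)

print(f"uniform rows over limit:      {uniform_hits}")
print(f"heavy-tailed rows over limit: {heavy_hits}")
```

The uniform sample can never reach the limit, so a test suite fed with it never runs the code path above 1000. The heavy-tailed sample does reach it, which is the kind of case that otherwise first shows up in production.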
For security or compliance leaders: what is your organisation's current policy on production data in pre-production environments, and how is that policy actually enforced rather than just documented?