Stanford HAI's AI privacy piece covers the specific ways your data is at risk that differ from traditional privacy concerns
Traditional data privacy is about who stores your data and whether it is kept secure. AI-era privacy is a broader problem, and Stanford HAI's piece (https://hai.stanford.edu/news/privacy-ai-era-how-do-we-protect-our-personal-information) covers the dimensions that are genuinely new rather than just variations on familiar concerns.
The inference problem is the one worth spending time on. AI systems can infer sensitive attributes from non-sensitive inputs. Browsing patterns predict health status. Voice tone predicts emotional state. Aggregate behaviour predicts political views. None of those inferences require accessing sensitive data directly. They emerge from combining innocuous data at scale with a model that has learned the correlations.
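To make the inference point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the signal names, the toy records, and the tiny naive-Bayes-style scorer are assumptions, not anything taken from the Stanford HAI piece. The point is only that the model never sees the sensitive attribute at prediction time, yet still guesses it from innocuous signals.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (innocuous browsing signals, sensitive label).
# The label is only used for training; it is never an input at prediction time.
TRAIN = [
    ({"pharmacy_site", "late_night"}, "condition"),
    ({"pharmacy_site", "symptom_search"}, "condition"),
    ({"symptom_search", "late_night"}, "condition"),
    ({"sports_site", "news_site"}, "no_condition"),
    ({"news_site", "recipes"}, "no_condition"),
    ({"sports_site", "recipes"}, "no_condition"),
]

def fit(records):
    """Count how often each signal co-occurs with each label."""
    label_counts = Counter()
    signal_counts = defaultdict(Counter)
    for signals, label in records:
        label_counts[label] += 1
        for s in signals:
            signal_counts[label][s] += 1
    return label_counts, signal_counts

def predict(model, signals):
    """Pick the label with the highest naive-Bayes-style score."""
    label_counts, signal_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, 0.0
    for label, n in label_counts.items():
        score = n / total  # prior
        for s in signals:
            # Add-one smoothing so unseen signals do not zero out the score.
            score *= (signal_counts[label][s] + 1) / (n + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = fit(TRAIN)
guess = predict(model, {"pharmacy_site", "late_night"})
```

Real systems use far richer features and models, but the shape is the same: correlations learned at scale turn non-sensitive inputs into a sensitive guess.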
The LLM memorisation problem is the one most developers have not thought through. Models trained on data that includes identifiable information can reproduce that information in outputs in ways that are difficult to predict or detect. The training data is not a sealed vault once the model is trained.
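One rough way to see why this is hard to detect: the simplest check is to scan model output for word sequences that appear verbatim in known training records. The sketch below is an assumption-laden illustration, not a method from the article. The function name `leaked_spans`, the window size, and the sample strings are all hypothetical, and the check only works if you already know which records to look for and the model repeats them word for word.

```python
def leaked_spans(output: str, training_records: list[str], window: int = 5) -> list[str]:
    """Return one matching word window per training record that appears verbatim in output."""
    out_tokens = output.lower().split()
    # Every run of `window` consecutive words in the model output.
    out_windows = {
        tuple(out_tokens[i:i + window])
        for i in range(len(out_tokens) - window + 1)
    }
    hits = []
    for record in training_records:
        tokens = record.lower().split()
        for i in range(len(tokens) - window + 1):
            candidate = tuple(tokens[i:i + window])
            if candidate in out_windows:
                hits.append(" ".join(candidate))
                break  # one hit per record is enough to flag it
    return hits

# Hypothetical records and output, invented for this example.
records = [
    "Jane Example lives at 12 Sample Street Springfield",
    "the weather was mild that year",
]
model_output = "Sure! Jane Example lives at 12 Sample Street, I think."
found = leaked_spans(model_output, records)
```

A paraphrased or partially altered reproduction slips straight past a check like this, which is part of why memorised data is difficult to predict or detect in practice.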
The voice cloning and synthetic identity dimensions are the ones making current headlines, and the article covers them in the context of the broader privacy architecture rather than as isolated concerns.
What specific category of personal data would you never input into an AI tool, regardless of the privacy policy, and why?