From Privacy Compliance to AI Governance: Sourcing Training Data
Publish Date: 2026-06-04 07:48:00
Source Domain: natlawreview.com
For years, internet privacy compliance focused mainly on consumer-facing disclosures: privacy policies, cookie banners, and notices explaining how personal data would be collected, used, stored, and shared. The integration of generative AI into everyday digital services has disrupted this model, shifting legal attention away from disclosure and toward how data is collected, scraped, licensed, retained, and used to build and train AI systems. Recent litigation involving AI systems reflects increasing legal scrutiny of how AI training data is sourced, processed, and deployed.
Recent privacy law developments demonstrate that privacy compliance can no longer end with consumer disclosures. The Texas Data Privacy and Security Act, which took effect on July 1, 2024, requires companies not only to provide privacy notices but also to limit personal data collection to what is “reasonably necessary,” implement reasonable safeguards, and conduct data protection assessments for high-risk processing activities such as targeted advertising or profiling that may influence significant decisions about consumers. These obligations make compliance depend on how companies actually manage personal data, rather than on disclosures alone. The California Privacy Protection Agency (CPPA) has publicly highlighted enforcement concerns about excessive data collection and dark patterns that manipulate consent, and it continues rulemaking tied to risk assessments, cybersecurity audits, and automated decision-making. These developments push companies to look beyond front-end disclosures and toward internal data governance, especially where personal information is repurposed for model development, profiling, or automated outputs.
The same trend is visible in AI-specific legislation. The relevant provisions of the EU AI Act generally become applicable on August 2, 2026, and extend compliance beyond disclosure to the governance of high-risk AI systems. Article 6 and Annex III classify…