Canada’s Privacy Ruling on AI Training Data Sets a Bad Precedent | Blogs | May 12, 2026
Publish Date: 2026-05-12 13:27:00
Source Domain: itif.org
Canada’s privacy regulators are taking a misguided approach to AI training data. In a recent decision, the federal Office of the Privacy Commissioner (OPC) and several provincial authorities concluded that OpenAI violated Canadian privacy law, in part by using publicly accessible Internet data and licensed third-party datasets to train ChatGPT.
The OPC acknowledged that OpenAI’s broader purpose, developing and deploying generative AI systems, was appropriate. It also recognized that user interaction data could legitimately be used to improve model performance. But it concluded that OpenAI’s use of publicly accessible online information was “overbroad” and failed to satisfy Canadian consent requirements because individuals would not have reasonably expected their public data to be used to train AI systems. The regulators in British Columbia and Alberta went further, finding the consent problem unresolved regardless of OpenAI’s mitigation measures.
That conclusion reflects a flawed understanding of how modern AI systems are developed and risks placing Canada on the wrong side of global AI competition.
Large language models (LLMs) depend on access to large-scale datasets to learn how language, reasoning, and information retrieval work. Publicly accessible websites, discussion forums, academic content, and licensed datasets are foundational inputs for training these systems. Restricting access to those materials would not only constrain a single company but also undermine the development of advanced AI systems across the broader ecosystem, including by startups, researchers, and open-source developers.
The OPC places significant weight on the claim that people did not reasonably expect publicly available information to be used for AI training because the practice was “novel” and “not widely understood” at the time. But novelty alone is not a sound basis for restricting technological development. Indeed, limiting businesses to using data only in ways consumers already…