{"id":235216,"date":"2026-04-24T14:39:00","date_gmt":"2026-04-24T18:39:00","guid":{"rendered":"https:\/\/news-you-need.com\/index.php\/2026\/04\/24\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/"},"modified":"2026-04-24T14:45:10","modified_gmt":"2026-04-24T18:45:10","slug":"benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data","status":"publish","type":"post","link":"https:\/\/news-you-need.com\/index.php\/2026\/04\/24\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/","title":{"rendered":"Benchmarking OpenAI&#8217;s Privacy Filter: What it gets right, and where PII detection still needs real data"},"content":{"rendered":"<p><a href=\"https:\/\/securityboulevard.com\/2026\/04\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/\">Benchmarking OpenAI&#8217;s Privacy Filter: What it gets right, and where PII detection still needs real data<\/a><\/p>\n<p><a href=\"https:\/\/securityboulevard.com\/2026\/04\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/\">https:\/\/securityboulevard.com\/2026\/04\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/<\/a><\/p>\n<p>Publish Date: 2026-04-24 14:39:00<\/p>\n<p>Source Domain: <a href=\"https:\/\/securityboulevard.com\/\">securityboulevard.com<\/a><\/p>\n<p>A benchmark, a mechanistic look under the hood, and a fine-tuning curve.<\/p>\n<p>Yesterday, OpenAI released <strong>Privacy Filter<\/strong> (OPF), an open-source 1.5B-parameter mixture-of-experts model for detecting PII in text. 
It\u2019s a thoughtful release: Apache 2.0 licensed, small enough to run in a browser or on a laptop, and state-of-the-art on <strong>PII-Masking-300k<\/strong>, a widely used synthetic PII benchmark.<\/p>\n<p>We spend our time at Tonic.ai thinking about exactly this problem, and we were curious how the model performs on the kind of data our customers actually send through our redaction pipelines: electronic-health-record notes, call-center transcripts, loan contracts, and general web scrapes. We also wanted to understand why it behaves the way it does, and what it would take to close any gaps we found.<\/p>\n<p>This post covers three things:<\/p>\n<ol>\n<li>A head-to-head benchmark of OPF against our production redactor, Tonic Textual, on four real-data test groups.<\/li>\n<li>A brief mechanistic look at where OPF succeeds and where it falls short.<\/li>\n<li>A fine-tuning experiment to understand how much labeled data is needed to make OPF competitive on specific domains.<\/li>\n<\/ol>\n<p>The short version: OPF is an excellent <strong>base model<\/strong> for PII detection, in roughly the way BERT or RoBERTa are excellent base models for token classification. What it is not \u2014 at least out of the box \u2014 is a drop-in replacement for a mature, domain-tuned redactor. The difference between the two is training data, and a lot of it.<\/p>\n<h2>PII detection benchmark: OpenAI Privacy Filter vs. Tonic Textual on real data<\/h2>\n<p>OPF exposes 8 PII categories: account_number, private_address, private_email, private_person, private_phone, private_url, private_date, and secret. Textual emits 26 finer-grained labels. 
For a clean comparison we project both systems into OPF\u2019s 8-class space and evaluate at the <strong>token level<\/strong>, which sidesteps the boundary mismatches that come from Textual splitting \u201c123 Main St, Boston, MA 02115\u201d into address \/ city \/ state \/ zip while OPF treats it as one&#8230;<\/p>\n<p><a href=\"https:\/\/securityboulevard.com\/2026\/04\/benchmarking-openais-privacy-filter-what-it-gets-right-and-where-pii-detection-still-needs-real-data\/\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Benchmarking OpenAI&#8217;s Privacy Filter: What it gets right, and where PII detection still needs real&#8230;<\/p>\n","protected":false},"author":1,"featured_media":235217,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/cdn.prod.website-files.com\/62e28cf08913e80aefba2c44\/69ebac396aafe2cc4080feef_plot_prf%201.png","fifu_image_alt":"","footnotes":""},"categories":[16],"tags":[],"class_list":["post-235216","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-privacy"],"_links":{"self":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/235216"}],"collection":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/comments?post=235216"}],"version-history":[{"count":1,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/235216\/revisions"}],"predecessor-version":[{"id":235218,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/235216\/revisions\/235218"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json
\/wp\/v2\/media\/235217"}],"wp:attachment":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/media?parent=235216"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/categories?post=235216"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/tags?post=235216"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}