Get started with differential privacy and privacy budgeting in BigQuery data clean rooms


We are excited to announce that differential privacy enforcement with privacy budgeting is now available in BigQuery data clean rooms to help organizations prevent data from being re-identified when it is shared.

Differential privacy is an anonymization technique that limits the personal information that is revealed in a query output. Differential privacy is considered to be one of the strongest privacy protections that exist today because it:

  • is provably private
  • supports multiple differentially private queries on the same dataset
  • can be applied to many data types
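
To make "provably private" concrete, here is a minimal Python sketch of the classic Laplace mechanism for a count. It illustrates the general technique only and is not BigQuery's implementation. The function names are hypothetical, and the sketch assumes each individual contributes at most one record, so that adding or removing one person changes the count by at most 1.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match `predicate`.

    Assumption: one record per individual, so the sensitivity of the
    count is 1 and the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Illustrative data only: the true count of ages >= 40 is 4.
rng = random.Random(7)
ages = [23, 35, 41, 29, 52, 38, 61, 47]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon adds more noise and reveals less about any one record. Production systems also guard against floating-point side channels, which this sketch ignores.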

Differential privacy is used by advertisers, healthcare companies, and education companies to perform analysis without exposing individual records. It is also used by public sector organizations that comply with the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), the Family Educational Rights and Privacy Act (FERPA), and the California Consumer Privacy Act (CCPA).

What can I do with differential privacy?

With differential privacy, you can:

  • protect individual records from re-identification without moving or copying your data
  • protect against privacy leaks and re-identification
  • use one of the anonymization standards most favored by regulators

BigQuery customers can use differential privacy to:

  • share data in BigQuery data clean rooms while preserving privacy
  • anonymize query results on AWS and Azure data with BigQuery Omni
  • share anonymized results with Apache Spark stored procedures and Dataform pipelines so they can be consumed by other applications
  • enhance differential privacy implementations with technology from Google Cloud partners Gretel.ai and Tumult Analytics
  • call frameworks like PipelineDP.io

So what is BigQuery differential privacy exactly?

BigQuery differential privacy is three capabilities:

Parameter-driven privacy budgeting in BigQuery data clean rooms – When you apply a differential privacy analysis rule, you also set a privacy budget to limit the data that is revealed when your shared data is queried. BigQuery uses parameter-driven privacy budgeting to give you more granular control over your data than query thresholds do and to prevent further queries on that data when the budget is exhausted.
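
The budgeting idea can be sketched in a few lines of Python. This is a conceptual model, not BigQuery's accounting: it assumes simple additive composition (each query's epsilon and delta are added to a running total), and the class and field names are hypothetical.

```python
from dataclasses import dataclass


class BudgetExhausted(Exception):
    """Raised when a query would push spending past the budget."""


@dataclass
class PrivacyBudget:
    epsilon_budget: float          # total epsilon available for the shared data
    delta_budget: float            # total delta available for the shared data
    max_epsilon_per_query: float   # cap on what a single query may spend
    epsilon_spent: float = 0.0
    delta_spent: float = 0.0

    def charge(self, epsilon: float, delta: float) -> None:
        """Record one query's cost, or refuse it if the budget cannot cover it."""
        if epsilon <= 0 or delta < 0:
            raise ValueError("epsilon must be positive and delta non-negative")
        if epsilon > self.max_epsilon_per_query:
            raise ValueError("query exceeds the per-query epsilon cap")
        if (self.epsilon_spent + epsilon > self.epsilon_budget
                or self.delta_spent + delta > self.delta_budget):
            raise BudgetExhausted("privacy budget exhausted; query refused")
        self.epsilon_spent += epsilon
        self.delta_spent += delta


# Illustrative values: three queries fit in the budget, a fourth is refused.
budget = PrivacyBudget(epsilon_budget=3.0, delta_budget=1e-4,
                       max_epsilon_per_query=1.0)
for _ in range(3):
    budget.charge(epsilon=1.0, delta=1e-5)
try:
    budget.charge(epsilon=1.0, delta=1e-5)
except BudgetExhausted as exc:
    print(exc)
```

Unlike a query-count threshold, a budget of this kind lets a subscriber choose between a few precise queries and many noisier ones.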

BigQuery differential privacy enforcement in action

Here’s how to enable the differential privacy analysis rule and configure a privacy budget when you add data to a BigQuery data clean room.

figure 1

Subscribers of that clean room must then use differential privacy to query your shared data.

figure 2
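
As a rough illustration of what a subscriber's query might look like, the Python sketch below assembles a GoogleSQL statement that uses the SELECT WITH DIFFERENTIAL_PRIVACY clause. The helper, the table name, and the column names are all hypothetical, the identifiers are assumed to come from trusted input, and the sketch does not check the values that a particular clean room's analysis rule may require.

```python
def build_dp_query(table: str, group_col: str, value_col: str, unit_col: str,
                   epsilon: float, delta: float, max_groups: int,
                   lower: float, upper: float) -> str:
    """Build a differentially private AVG query as a GoogleSQL string.

    Hypothetical helper: identifiers are interpolated directly, so they
    must come from trusted input, not from end users.
    """
    return (
        "SELECT WITH DIFFERENTIAL_PRIVACY\n"
        f"  OPTIONS(epsilon={epsilon}, delta={delta}, "
        f"max_groups_contributed={max_groups}, "
        f"privacy_unit_column={unit_col})\n"
        f"  {group_col},\n"
        f"  AVG({value_col}, contribution_bounds_per_group => "
        f"({lower}, {upper})) AS avg_{value_col}\n"
        f"FROM `{table}`\n"
        f"GROUP BY {group_col}"
    )


# Hypothetical table and columns, for illustration only.
sql = build_dp_query(
    table="my_project.my_clean_room.purchases",
    group_col="item", value_col="quantity", unit_col="customer_id",
    epsilon=1.0, delta=0.01, max_groups=2, lower=0, upper=100,
)
print(sql)
```

The resulting string could then be submitted like any other query, for example through the BigQuery console or a client library.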

Subscribers of that clean room cannot query your shared data once the privacy budget is exhausted.

figure 3

Get started with BigQuery differential privacy

BigQuery differential privacy is configured when a data owner or contributor shares data in a BigQuery data clean room. A data owner or contributor can share data using any compute pricing model and does not incur compute charges when a subscriber queries that data. Subscribers of a data clean room incur compute charges when querying shared data that is protected with a differential privacy analysis rule. Those subscribers are required to use on-demand pricing (charged per TB) or the Enterprise Plus edition (charged per slot hour).

Create a clean room today in which all queries are protected with differential privacy, and let us know where you need help.
