
Get started with Decentriq UI

Below, you will work through a practical example of the following use case:

A bank and an insurance provider want to know the overlap of their customer bases, but neither can share its CRM data with the other party. This is an example of a collaboration workflow that the Decentriq platform makes possible. Using the Decentriq platform, the parties can securely connect their sensitive customer data while keeping it private, and run the overlap computation on it in a straightforward workflow.

In this example, we will define the computations in SQL. You can use the following files to reproduce the steps below and create your first Data Clean Room:

Simple example material ➞


Step 1 - Access the platform

  • Navigate to https://platform.decentriq.com/
  • Log in with your credentials. If you do not have credentials yet, please contact your reference person on the Decentriq team.
    Platform login

Step 2 - Create a Data Clean Room

  • Click the New Data Clean Room button.
    Choose DCR type
  • If your organization has multiple DCR types enabled, choose Advanced Analytics for this tutorial.
    Create DCR
  • Give it a name. The example Data Clean Room will compute a confidential overlap between a bank and an insurance provider.
  • Here you can decide whether to start from scratch or to import an existing Data Clean Room configuration. For this example, please select the JSON file available in the 'simple example material' folder linked at the beginning of this page.

Step 3 - Define the datasets

Define the datasets to be provisioned by Data Owners:

Datasets

  • Add a new Table when using structured datasets (CSV) with an expected schema of columns and types.
    • Add new columns, select the data type, and choose whether each column accepts empty values.
      • Note: for some data types it is possible to require the value to be hashed before upload.
    • Additional configuration
      • It is possible to require column values, or the combination of multiple column values, to be unique.
    • All values will be validated as soon as a dataset is provisioned to the table.
  • Add a new File when working with unstructured datasets (JSON, TXT, ZIP or any other kind).
  • For both types, by default, datasets must be provisioned before computations that depend on them can be run. This can be toggled via the checkbox.
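The hashing requirement mentioned above boils down to replacing a raw identifier with its digest before the file leaves your machine. As an illustration only (the column name email and the choice of SHA-256 are assumptions of this sketch, not something the platform prescribes), hashing a column could look like:

```python
import hashlib

def hash_column(rows, column):
    """Return copies of the rows with `column` replaced by its SHA-256 hex digest."""
    out = []
    for row in rows:
        row = dict(row)  # do not mutate the caller's data
        # Normalize before hashing so both parties produce identical digests.
        value = row[column].strip().lower().encode()
        row[column] = hashlib.sha256(value).hexdigest()
        out.append(row)
    return out

customers = [{"email": "Alice@example.com", "zip": "8001"}]
hashed = hash_column(customers, "email")
print(hashed[0]["email"])  # 64-character hex digest
```

Normalizing the value (trimming and lower-casing) before hashing matters: two parties hashing 'Alice@example.com' and 'alice@example.com' without normalization would miss a genuine match.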

Step 4 - Define the computations

These can be SQL, Python, R or Synthetic Data. In this example, we will use a SQL query to define a computation that calculates the overlap. For a list of supported data types and SQL clauses, check the SQL Computation section.

SQL Computation

  • Type in the query content
  • Use the Table browser for a quick reference of the tables and columns available. Click the copy icon next to each item and paste it directly into the editor for a faster experience.
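To make the shape of such a query concrete, here is a sketch run locally against SQLite. The table and column names (bank_customers, insurance_customers, email_hash) are hypothetical; inside a Data Clean Room you would reference the tables defined in Step 3, and the SQL dialect supported there may differ from SQLite's:

```python
import sqlite3

# Toy stand-ins for the two provisioned tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bank_customers (email_hash TEXT);
    CREATE TABLE insurance_customers (email_hash TEXT);
    INSERT INTO bank_customers VALUES ('h1'), ('h2'), ('h3');
    INSERT INTO insurance_customers VALUES ('h2'), ('h3'), ('h4');
""")

# The overlap computation: count identifiers present in both tables.
overlap = conn.execute("""
    SELECT COUNT(DISTINCT b.email_hash) AS overlap_size
    FROM bank_customers b
    JOIN insurance_customers i ON b.email_hash = i.email_hash
""").fetchone()[0]
print(overlap)  # 2 ('h2' and 'h3' appear in both tables)
```

Only the aggregate count appears in the result; the individual identifiers never do, which is the point of the collaboration.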

Additionally, you could create a new Synthetic Data computation, which takes a sensitive table as its source and produces artificial data with the same schema as the source: Synthetic data

  • Mask the columns whose values should not appear in the results - these will be replaced with random values of the appropriate type.
  • All other columns will be synthesized using differential privacy while keeping statistical properties similar to the source.
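To illustrate just the masking half (the differential-privacy synthesis of the remaining columns is far more involved and is handled by the platform), replacing a masked column with unlinkable random values could be sketched as:

```python
import secrets

def mask_column(rows, column):
    # Replace each value in `column` with a random token; other columns are untouched.
    # (Illustration only: the platform substitutes a random value of the column's own type.)
    return [{**row, column: secrets.token_hex(8)} for row in rows]

rows = [{"email": "a@example.com", "age": 34}]
masked = mask_column(rows, "email")
```

The key property is that the masked value carries no information about the original, while unmasked columns keep their utility for analysis.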

Add as many computations as you wish, combining different languages and referencing results from each other.

Once completed, press the Test all computations button to make sure they will work once the Data Clean Room is published.

note

This will test the computation with empty datasets and only return the expected result schema.

After publishing, Data Owners can provision datasets and the computations can be run.

Step 5 - Set permissions

Define the participants that will be invited to the collaboration, and assign them permissions to interact with the tables, files and/or computations:

Permissions

  • By default, Data Clean Rooms are interactive, allowing participants to request new computations (to be approved by the affected Data Owners) after publishing. To prevent any modifications after publishing, check This data clean room is immutable.
  • Enable development environment to give participants access to a tab where they can run arbitrary computations based on the data and computation results they have permissions for.
  • Use the dropdown boxes to assign Data Owner and Analyst permissions to each participant on each dataset and computation.
  • Add a new participant by typing in their email - an invitation will be sent as soon as the Data Clean Room is published.

Step 6 - Encrypt and publish the Data Clean Room

  • Click the Publish button at the top-right side.
    Publish DCR
  • The Data Clean Room definition will be encrypted and enforced in our confidential computing environment once published.
  • Note that you can duplicate the DCR, or export its definition in JSON format to save it offline at any moment.
  • Now, participants can start collaborating in the published Data Clean Room.
note

To protect the Data Clean Room with a password, click the ... menu icon next to the Encrypt and publish button, then toggle the switch to ON.
At the time of publishing, you can define a password and share it with the participants before they can interact with this Data Clean Room.

Step 7 - Provision datasets and run computations

The Overview tab contains all datasets you are a Data Owner of, and all computations you have Analyst permissions for. To see the entire DCR definition, switch off the show only actionable items toggle.
Provision and run

  • The Data Owners can provision datasets in CSV format to the tables by following the Datasets guide. Analysts can then run the computations and get the results back.
  • You can find the CSVs needed to run the example in the ZIP folder linked at the beginning of this page. Once the datasets are provisioned, you can run your computations.
  • It is also possible to provision unstructured datasets if a File was defined in the Data tab when drafting the DCR.
  • Once all necessary data is available, click the Run button of each computation and get the results.
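If you prefer to generate the input files yourself instead of using the provided ZIP, toy CSVs are enough to see the flow end to end. The single email_hash column below is an assumption matching the hypothetical schema sketched earlier; adjust it to whatever schema you actually defined in Step 3, and check whether your table expects a header row:

```python
import csv
import hashlib

def write_customer_csv(path, emails):
    # One hashed identifier per row, no header (adjust if your table expects one).
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for email in emails:
            digest = hashlib.sha256(email.strip().lower().encode()).hexdigest()
            writer.writerow([digest])

# Hypothetical sample inputs for the two parties; "bob" is the intended overlap.
write_customer_csv("bank_customers.csv", ["alice@example.com", "bob@example.com"])
write_customer_csv("insurance_customers.csv", ["bob@example.com", "carol@example.com"])
```

Provisioning these two files to the respective tables and running the overlap computation should report an overlap of one.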

Step 8 - Browse provisioned datasets

From the sidebar, access the overview of all your provisioned data in the Datasets page:

Dataset statistics
Here you can see the full list of the datasets you have uploaded, see which Data Clean Rooms they are provisioned to, and more:

  • Dataset size
  • Dataset schema (if tabular)
  • Which Data Clean Rooms it is provisioned to, with the option to deprovision it directly from here
  • Delete the dataset from the platform
  • Export the dataset to other platforms using Data Connectors

Step 9 - Check the tamper-proof audit log

All participants of DCR created via the UI have auditing permissions, for full transparency and to build trust among them:

Audit log
This means that all of them can:

  • Inspect the Data Clean Room definition
  • Be aware of who uploaded data
  • Access and download the audit log, the register of all activities in the Data Clean Room together with the user who performed each one