
Keychain

The Keychain lets you confidentially store dataset encryption keys and other secrets.

Among other features, this enables you to reuse datasets across data clean rooms without having to re-upload them to Decentriq.

How it works

The Keychain operates like a traditional password manager. It derives an encryption key from your password and uses that key to encrypt, locally, the secrets (e.g. dataset encryption keys) you want to store in the Keychain. The encrypted secrets are then stored on the Decentriq Platform. When they are needed, they are retrieved and decrypted with the same password-derived encryption key.

As Decentriq does not have access to your password, Decentriq can never access the secrets stored in the Keychain.
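
The derivation step can be illustrated with Python's standard library. This is a minimal sketch, not Decentriq's implementation: the text above does not say which key-derivation function, salt handling, or iteration count the Keychain uses, so PBKDF2-HMAC-SHA256, the 16-byte salt, and the iteration count below are all assumptions chosen for illustration.

```python
import hashlib
import os


def derive_key(password: bytes, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a password (illustrative parameters only)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


# In this sketch, a random salt is kept next to the encrypted secrets.
# The salt is not secret; only the password is.
salt = os.urandom(16)

key_at_store_time = derive_key(b"KeychainPassword1234", salt)
key_at_retrieve_time = derive_key(b"KeychainPassword1234", salt)

# The same password and salt always yield the same key, so secrets that were
# encrypted earlier can be decrypted later from the password alone.
assert key_at_store_time == key_at_retrieve_time
```

Because the key is recomputed from the password each time, only the encrypted secrets need to be stored on the server side.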

Activate the Keychain

To interact with the Decentriq Platform, your Keychain must be activated. You will be able to create a password as soon as you sign in.

Keychain setup

note

If you lose your password, you will have to reset it and lose access to all previously stored secrets.

For convenience, a key derived from your Keychain password is cached in your browsing session so that you don't need to type the password every time it's required.

On subsequent visits to the platform, you only need to enter your password when prompted.

Keychain login

If you forget your password, you can reset it. However, all your stored keys will be deleted from your Keychain.

Datasets will remain provisioned to all data clean rooms. No data will be lost.

If you want to change your password, you can do it at any time by accessing the Keychain page from the sidebar and clicking Change Keychain password in the options menu.

Keychain change password

Store a dataset encryption key

As a Data Owner, start by clicking the Provision dataset button in a specific data node within a Data Clean Room:

DCR provision dataset

Click on Import from my computer, as this is a new dataset:

Select dataset source

Select the file from your computer. In this example, our file is called Bank Dataset.csv.

Select local dataset

Notice there is an option selected by default to store the encryption key in your Keychain.

Follow the steps on the screen.

In the last step, an encryption key is generated locally to encrypt your dataset before it is uploaded. Once the process is complete, this encryption key is stored in your Keychain.

Browse the Keychain

To access the Keychain, open the Keychain page from the sidebar in the Decentriq UI.

Sidebar menu

Here you can find all your stored keys:

Browse Keychain

  • Notice that the key used to encrypt your dataset is now stored in your Keychain.
  • You can check the dataset details by clicking the view icon on the right side.
  • Deleting the encryption key from your Keychain will not delete the dataset itself.

Reprovision a dataset to another Data Clean Room

As a Data Owner, if you have already provisioned a dataset from your computer and have its encryption key stored in your Keychain, you can provision the same dataset to another Data Clean Room without having to upload it again. To do so, start by clicking the Provision dataset button in a specific data node of a Data Clean Room.

DCR provision dataset

Click on Choose from my stored datasets, as this is a dataset already stored in the Decentriq platform:

Select dataset source

Now you can immediately select the desired dataset. In this example, our file is the same Bank Dataset.csv.

Select existing dataset

The next step retrieves the encryption key for the selected dataset from your Keychain and provisions the dataset to the other Data Clean Room.

Reprovisioning completed

From the Data Clean Room or from the Datasets page, you can see the details of the dataset:

Dataset details

  • Notice that it is now provisioned to 2 DCRs.
  • You can deprovision it from any DCR directly from this screen by clicking the “unlink” icon on the right side. The dataset will not be deleted, whether or not it is still provisioned to a DCR.
  • To deprovision a dataset from all DCRs and delete it from the Decentriq platform, click the Delete dataset button.

Python SDK integration

The same Decentriq UI flow can be achieved programmatically, using the Decentriq Python SDK.

Please follow the Get started with Python SDK tutorial to learn how to connect with the Decentriq platform, create DCRs, provision datasets and run computations.

The Keychain feature was introduced in version 0.14.0.

pip install decentriq-platform==0.14.0

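Let's start by importing the necessary packages. Note that the Keychain is accessible from decentriq_platform.keychain. The dqsql alias is used further down to read and upload tabular data.

import decentriq_platform as dq
import decentriq_platform.sql as dqsql
from decentriq_platform.keychain import Keychain, KeychainEntry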

Establish a connection with the Decentriq platform

# user_email and api_token hold your Decentriq account email and API token.
client = dq.create_client(user_email, api_token)
enclave_specs = dq.enclave_specifications.versions([
    "decentriq.driver:v10",
    "decentriq.sql-worker:v10",
])

auth, _ = client.create_auth_using_decentriq_pki(enclave_specs)
session = client.create_session(auth, enclave_specs)
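
The connection snippet uses user_email and api_token without defining them. One way to supply them without hard-coding secrets in a script is to read them from environment variables. The helper below is a sketch under that assumption; the function name and the variable names DECENTRIQ_USER_EMAIL and DECENTRIQ_API_TOKEN are hypothetical and not defined by the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (user_email, api_token) read from environment variables.

    The variable names are hypothetical; use whatever your setup defines.
    """
    try:
        return env["DECENTRIQ_USER_EMAIL"], env["DECENTRIQ_API_TOKEN"]
    except KeyError as missing:
        # Report which variable is absent instead of a bare KeyError.
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None
```

With this helper, the two values can be obtained with user_email, api_token = load_credentials() before calling dq.create_client().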

Activate the Keychain for the first time

If the Keychain has not been activated via the Decentriq Platform UI, it can be done for the first time with the Keychain.create_new_keychain() method. Otherwise, this step can be skipped.

Keychain.create_new_keychain(client, b"KeychainPassword1234")

Note that the password must be passed as a bytes object (note the b prefix on the literal), not as a regular string.
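
If the password arrives as ordinary text, for example from an interactive prompt, it has to be encoded before it is handed to the Keychain methods. The helper below is a small sketch; its name is hypothetical, and the choice of UTF-8 is an assumption. For ASCII-only passwords such as the one in this guide, UTF-8 yields exactly the bytes of the b"..." literal. Whether it matches the encoding used by the Decentriq UI for non-ASCII passwords is not stated here.

```python
def to_keychain_password(password: str) -> bytes:
    """Encode a text password as UTF-8 bytes for the Keychain calls."""
    if not password:
        raise ValueError("Keychain password must not be empty")
    return password.encode("utf-8")


password_bytes = to_keychain_password("KeychainPassword1234")

# For an ASCII password the result equals the bytes literal used in this guide.
assert password_bytes == b"KeychainPassword1234"
```

The result can be passed wherever this guide uses b"KeychainPassword1234", and the text password can come from getpass.getpass() so that it is not echoed to the terminal.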

Encrypt and provision a new dataset to an existing DCR

Assume you have:

  • a published Data Clean Room with a data node (leaf) to which you have Data Owner permissions to provision a dataset

DCR_ID = "1b56c9252eb6f4cf69ac0daf3d2525bcea4706d3f4ce748aebb41ac14a4857d9"
LEAF_NODE_ID = "4f368f91ed20e323f34c133fbc7ba56ca4d6930fc9254714d377d2c783213e13"

First, you generate a key to encrypt a new dataset, upload it to the Decentriq platform and provision it to the data node inside an existing DCR.

data = dqsql.read_input_csv_file("/path/to/dataset.csv", has_header=True)

encryption_key = dq.Key()

dataset_id = dqsql.upload_and_publish_tabular_dataset(
    data, encryption_key, DCR_ID,
    table=LEAF_NODE_ID,
    session=session,
    description="dataset uploaded via the sdk",
    validate=True
)

Alternatively, you can use the client.upload_dataset() method to only upload an encrypted dataset to the Decentriq platform.

dataset_id = client.upload_dataset(data, encryption_key, "myfile")

Store the encryption key in the Keychain

Now that the dataset has been uploaded and has a known ID, you can store the generated encryption key under that dataset ID so that it is easy to retrieve later.

This can be achieved by initializing the Keychain with your password and calling the insert() method that expects an entry of type dataset_key.

Note that the password must be passed as a bytes object (note the b prefix on the literal), not as a regular string.

my_keychain = Keychain.init_with_password(client, b"KeychainPassword1234")

my_keychain.insert(KeychainEntry("dataset_key", dataset_id, encryption_key.material))

Provision a dataset to other DCRs

Later, you can retrieve the encryption key of the same dataset from the Keychain and provision the dataset to other DCRs.

Assume you have:

  • another published Data Clean Room with a data node (leaf) to which you have Data Owner permissions to provision a dataset

DCR_ID2 = "cd9948b4f93e0dd1fbf053ad541294fcd9ddf8767f40eeddc95bf0f90b1af90a"
LEAF_NODE_ID2 = "3c52ca99ad00a81131f0222d89ecc09ba365d6bd4010f7e8a44a84fd61172a40"

Retrieve the encryption key from the Keychain by calling the get() method with the ID of the previously uploaded dataset.

retrieved_key = my_keychain.get("dataset_key", dataset_id)
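
The insert() and get() calls in this guide address an entry by a pair: its kind ("dataset_key") and a key (the dataset ID). To exercise code that depends on this behaviour without contacting the platform, a small in-memory stand-in can be handy. The sketch below is hypothetical and is not the SDK's Keychain class: the names FakeEntry and InMemoryKeychain, the field names kind and key, and the KeyError on a missing entry are all assumptions. Only the value attribute and the call shapes are taken from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeEntry:
    kind: str     # e.g. "dataset_key"
    key: str      # e.g. the dataset ID
    value: bytes  # the secret itself, e.g. the raw key material


class InMemoryKeychain:
    """Hypothetical test double; NOT the Decentriq Keychain class."""

    def __init__(self):
        self._entries = {}

    def insert(self, entry):
        # Entries are addressed by the (kind, key) pair.
        self._entries[(entry.kind, entry.key)] = entry

    def get(self, kind, key):
        # Raises KeyError if absent; the real SDK's behaviour may differ.
        return self._entries[(kind, key)]


keychain = InMemoryKeychain()
keychain.insert(FakeEntry("dataset_key", "dataset-123", b"\x00" * 32))

# The stored secret comes back unchanged under the same (kind, key) pair.
assert keychain.get("dataset_key", "dataset-123").value == b"\x00" * 32
```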

Provision the dataset to another DCR using the publish_dataset() method. This can be repeated for as many DCRs as needed by specifying a different DCR ID and data node ID each time.

session.publish_dataset(DCR_ID2, dataset_id, f"{LEAF_NODE_ID2}_leaf", dq.Key(retrieved_key.value))

note

The publish_dataset() method expects the data node ID to be suffixed with _leaf
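
When the same dataset goes to several DCRs, the suffix handling and the repeated call can be wrapped in two small helpers. This is a sketch, not part of the SDK: leaf_name and provision_to_many are hypothetical names, and the only SDK call used is publish_dataset() with the positional arguments shown above. The suffix check assumes that a bare data node ID never ends in _leaf by itself.

```python
LEAF_SUFFIX = "_leaf"


def leaf_name(node_id: str) -> str:
    """Return the data node ID with the _leaf suffix, adding it only if missing."""
    return node_id if node_id.endswith(LEAF_SUFFIX) else node_id + LEAF_SUFFIX


def provision_to_many(session, dataset_id, key, targets):
    """Publish one dataset to several (dcr_id, node_id) targets.

    `session` must expose publish_dataset() with the positional arguments
    used in this guide: DCR ID, dataset ID, suffixed node ID, key.
    """
    for dcr_id, node_id in targets:
        session.publish_dataset(dcr_id, dataset_id, leaf_name(node_id), key)
```

For the example in this section, the call would be provision_to_many(session, dataset_id, dq.Key(retrieved_key.value), [(DCR_ID2, LEAF_NODE_ID2)]).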

Deprovision a dataset

To remove a dataset from a DCR, only the DCR ID and data node ID need to be specified.

This will not delete the dataset nor remove the encryption key from the Keychain.

session.remove_published_dataset(
    DCR_ID,
    f"{LEAF_NODE_ID}_leaf"
)

Delete a dataset

To delete a dataset from the Decentriq platform, call the delete_dataset() method specifying the dataset ID. This will additionally deprovision the dataset from all DCRs. The encryption key will remain in the Keychain and must be removed separately.

client.delete_dataset(dataset_id)

Delete an encryption key

To delete an encryption key from the Keychain, call the remove() method specifying the referenced dataset ID. This will neither delete the dataset nor deprovision it from any DCR. For either of those, see the sections above.

my_keychain.remove("dataset_key", dataset_id)
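
Since deleting a dataset leaves its key in the Keychain, a full cleanup needs both calls. The helper below is a minimal sketch that combines them; its name is hypothetical, and it uses only delete_dataset() and remove() as shown in this guide. Error handling is omitted because this guide does not document how either call fails.

```python
def delete_dataset_and_key(client, keychain, dataset_id):
    """Delete a dataset, then remove its encryption key from the Keychain."""
    # Delete the dataset first; this also deprovisions it from all DCRs.
    client.delete_dataset(dataset_id)
    # The key is not removed automatically, so remove it explicitly.
    keychain.remove("dataset_key", dataset_id)
```

The dataset is deleted before the key so that, if the first call raises, the key is still in the Keychain and the dataset can still be provisioned. Usage would be delete_dataset_and_key(client, my_keychain, dataset_id).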