Migrating your account after the v4.0 release

In the v4.0 release, we changed how authentication and secrets management work. This change improves the user experience and enables us to release new features faster.

1. What you need to know

  • When you log into the platform after the release, you will be prompted to 'accept the migration' to the new authentication system. You can postpone the migration for up to 90 days.
  • To migrate and log into your account thereafter, you must have access to the email account used as your username.
  • Before migrating:
    1. You will not have access to the latest features of v4.0.
    2. You will not be able to use the SDK to interact with Media DCRs created by users who already migrated.
  • After migrating:
    1. When logging in, you will occasionally be asked to enter a one-time password sent to your email account. For service accounts where you don't have access to the email account, see the section below.
    2. You will have to update the Decentriq SDK to the latest version and make minor changes to your code. See the section below for how to migrate your code.
    3. Old API tokens will no longer be valid, and you will have to create new ones.
    4. You will no longer be asked to enter your Keychain password.

If you have any questions, do not hesitate to contact your customer success representative or support@decentriq.com. We are here to help you.

2. How to deal with service accounts

Some of our customers use service accounts whose aliases are not associated with an active mailbox. Often, these accounts have been shared by multiple people: one person used the account to upload data, another to create DCRs and provision data.

Shared accounts without an existing mailbox will no longer be usable, because entering an email one-time password is required. For security reasons, Decentriq recommends avoiding shared accounts altogether and instead using personal accounts together with the new 'dataset sharing' feature.

The first person uploads data with their personal user, then uses the dataset sharing feature to share the data with the second person's personal user. The second user can then use the dataset in their DCRs, as sketched below.
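
As a rough sketch only (the exact method name and signature may differ; refer to the dataset sharing documentation for the actual API), the flow could look like the following, using a hypothetical share_dataset helper:

import decentriq_platform as dq

# Person A uploads the dataset with their personal account
client_a = dq.create_client("person.a@example.com", "@@ PERSON A TOKEN HERE @@")
encryption_key = dq.Key()
with open("/path/to/dataset.csv", "rb") as dataset:
    dataset_id = client_a.upload_dataset(dataset, encryption_key, "dataset_name")

# Hypothetical sharing call -- the real method may be named differently;
# consult the dataset sharing documentation for the exact API
client_a.share_dataset(dataset_id, "person.b@example.com")

# Person B, logged in with their own personal account, can then
# provision the shared dataset to their DCRs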

3. Migrating to the New SDK

Platform version 4.0 ships together with SDK version 0.34.0, bringing several critical updates and improvements. Older SDK versions are not compatible with the new one: they cannot interact with data clean rooms created through the new SDK or the UI by a user who has already migrated.
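
If you install the SDK from PyPI, upgrading is a single command (this assumes the package is published under the same name as the Python import, which is the case for decentriq_platform):

pip install decentriq_platform==0.34.0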

3.1 API Token Changes in the SDK

The new SDK version requires generating new API tokens, as old tokens are no longer valid. Note that the new API tokens will also not work with previous SDK versions.

3.2 Code Changes in the SDK

The new SDK version slightly changes the syntax, most notably by removing the Keychain component. Secrets, including dataset keys, are now stored directly within a secure enclave, eliminating the need for a Keychain and simplifying secret management.

3.2.1 Client and Keychain Initialization

In previous SDK versions, initializing the client required both an API token and a Keychain instance, which needed a separate password for accessing secrets.

Before:

import decentriq_platform as dq

USER_EMAIL = "@@ YOUR EMAIL HERE @@"
API_TOKEN = "@@ YOUR TOKEN HERE @@"
KEYCHAIN_PASSWORD = "@@ YOUR KEYCHAIN PASSWORD HERE @@"

client = dq.create_client(USER_EMAIL, API_TOKEN)
my_keychain = dq.Keychain.get_or_create_unlocked_keychain(client, bytes(KEYCHAIN_PASSWORD, 'utf8'))

After:

import decentriq_platform as dq

USER_EMAIL = "@@ YOUR EMAIL HERE @@"
NEW_API_TOKEN = "@@ YOUR TOKEN HERE @@"

client = dq.create_client(USER_EMAIL, NEW_API_TOKEN)

Now only a Client instance is needed, simplifying the setup. The new SDK handles secure storage of secrets directly, so dataset keys and other secrets are managed within the enclave.

3.2.2 File Upload and Encryption Key Management

In previous SDK versions, uploading a dataset required generating an encryption key and explicitly storing it in the Keychain.

Before:

# Generate an encryption key
encryption_key = dq.Key()

# Read dataset locally, encrypt, upload, and provision it to DCR
with open("/path/to/dataset.csv", "rb") as dataset:
    DATASET_ID = client.upload_dataset(
        dataset,
        encryption_key,
        "dataset_name",
        store_in_keychain=my_keychain
    )

After:

# Generate an encryption key
encryption_key = dq.Key()

# Read dataset locally, encrypt, upload, and provision it to DCR
with open("/path/to/dataset.csv", "rb") as dataset:
    DATASET_ID = client.upload_dataset(
        dataset,
        encryption_key,
        "dataset_name",
    )

The encryption key is now securely stored in the enclave by default. Users can still override this default to grant dataset access to multiple users. For more information, refer to the Dataset Management guide.

3.2.3 Dataset Provisioning and Key Retrieval for Advanced Analytics DCR

Previously, provisioning a dataset to a DCR required retrieving the dataset key from the Keychain.

Before:

dcr = client.retrieve_analytics_dcr(DCR_ID)

retrieved_key = my_keychain.get("dataset_key", DATASET_ID)

# Reprovision the existing dataset to a DCR
dcr.get_node("my-raw-data-node").publish_dataset(
    DATASET_ID,
    dq.Key(retrieved_key.value)
)

After:

dcr = client.retrieve_analytics_dcr(DCR_ID)

retrieved_key = client.get_dataset_key(DATASET_ID)

# Reprovision the existing dataset to a DCR
dcr.get_node("my-raw-data-node").publish_dataset(
    DATASET_ID,
    retrieved_key
)

The key is now accessed directly from the client, making the code simpler and ensuring secure storage within the enclave.
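
Putting these pieces together, the following minimal sketch (using only the calls shown above; email, token, and IDs are placeholders) uploads a dataset, retrieves its key later, and provisions it to an existing Advanced Analytics DCR:

import decentriq_platform as dq

client = dq.create_client("@@ YOUR EMAIL HERE @@", "@@ YOUR TOKEN HERE @@")

# Upload: the dataset key is stored in the enclave automatically
encryption_key = dq.Key()
with open("/path/to/dataset.csv", "rb") as dataset:
    dataset_id = client.upload_dataset(dataset, encryption_key, "dataset_name")

# Later, possibly from another session: retrieve the key from the enclave
retrieved_key = client.get_dataset_key(dataset_id)

# Provision the dataset to a data node in an existing DCR
dcr = client.retrieve_analytics_dcr("@@ DCR ID HERE @@")
dcr.get_node("my-raw-data-node").publish_dataset(dataset_id, retrieved_key)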

3.2.4 Data Lab Provisioning to a Media Insights Clean Room

Provisioning a data lab:

Before:

data_lab.provision_local_datasets(
    file_encryption_key,
    publisher_keychain,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)
...
data_lab.provision_to_media_insights_data_room(dcr_id, publisher_keychain)

After:

data_lab.provision_local_datasets(
    file_encryption_key,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)
data_lab.provision_to_media_insights_data_room(dcr_id)

3.2.5 Dataset Provisioning and Key Retrieval for Legacy Code

Tabular datasets (.CSV files) to table nodes:

Before:

encryption_key = dq.Key()

with open("/path/to/dataset.csv", "rb") as tabular_dataset:
    dataset_id = dq.data_science.provision_tabular_dataset(
        tabular_dataset,
        name="dataset_name",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=DCR_ID,
        # Table ID copied when hovering over it in the Decentriq UI
        data_node=TABLE_NODE_ID,
        store_in_keychain=keychain
    )

After:

encryption_key = dq.Key()

with open("/path/to/dataset.csv", "rb") as tabular_dataset:
    dataset_id = dq.data_science.provision_tabular_dataset(
        tabular_dataset,
        name="dataset_name",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=DCR_ID,
        # Table ID copied when hovering over it in the Decentriq UI
        data_node=TABLE_NODE_ID,
    )

Unstructured (or “raw”) datasets (.JSON, .TXT, .ZIP, etc.) to file nodes:

Before:

encryption_key = dq.Key()

with open("/path/to/file.json", "rb") as raw_dataset:
    dq.data_science.provision_raw_dataset(
        raw_dataset,
        name="My Dataset",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=DCR_ID,
        # File ID copied when hovering over it in the Decentriq UI
        data_node=FILE_NODE_ID,
        store_in_keychain=keychain
    )

After:

encryption_key = dq.Key()

with open("/path/to/file.json", "rb") as raw_dataset:
    dq.data_science.provision_raw_dataset(
        raw_dataset,
        name="My Dataset",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=DCR_ID,
        # File ID copied when hovering over it in the Decentriq UI
        data_node=FILE_NODE_ID,
    )

Data lab provisioning:

Before:

file_encryption_key = dq.Key()

data_lab.provision_local_datasets(
    file_encryption_key,
    my_keychain,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)

After:

file_encryption_key = dq.Key()

data_lab.provision_local_datasets(
    file_encryption_key,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)

Lookalike Clean Room data lab provisioning:

Before:

data_lab.provision_to_lookalike_media_data_room(LMDCR_HASH, my_keychain)

After:

data_lab.provision_to_lookalike_media_data_room(LMDCR_HASH)

Lookalike Clean Room audiences provisioning:

Before:

# Generate an encryption key
encryption_key = dq.Key()
# Read dataset locally, encrypt, upload and provision it to DCR
with open("/path/to/audiences_data.csv", "rb") as audience_dataset:
    dq.lookalike_media.provision_dataset(
        audience_dataset,
        name="My Dataset",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=LMDCR_HASH,
        store_in_keychain=keychain,
        dataset_type=dq.lookalike_media.DatasetType.AUDIENCES
    )

After:

# Generate an encryption key
encryption_key = dq.Key()
# Read dataset locally, encrypt, upload and provision it to DCR
with open("/path/to/audiences_data.csv", "rb") as audience_dataset:
    dq.lookalike_media.provision_dataset(
        audience_dataset,
        name="My Dataset",
        session=session,
        key=encryption_key,
        # Data Clean Room ID copied from Decentriq UI
        data_room_id=LMDCR_HASH,
        dataset_type=dq.lookalike_media.DatasetType.AUDIENCES
    )