Provisioning a new dataset
This doc gives an example of how to provision a dataset to a Media Data Clean Room (DCR), using a direct upload from a local source.
import decentriq_platform as dq
user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"
client = dq.create_client(user_email, api_token)
enclave_specs = dq.enclave_specifications.latest()
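Hardcoding the email and API token is fine for a quick test, but for scripts that are shared or committed it may be preferable to read them from the environment. The following is a minimal sketch using only the Python standard library. The variable names DECENTRIQ_USER_EMAIL and DECENTRIQ_API_TOKEN and the helper load_credentials are assumptions made for this example, not names defined by the Decentriq SDK.

```python
import os


def load_credentials(env=None):
    """Return (user_email, api_token) read from environment variables.

    The variable names below are hypothetical; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    names = ("DECENTRIQ_USER_EMAIL", "DECENTRIQ_API_TOKEN")
    # Collect every missing or empty variable so the error lists them all at once.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can then be passed to dq.create_client(user_email, api_token) exactly as in the snippet above.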
Setup Script
If you want to test this functionality and don't have a clean room set up yet, you can use this script to create a suitable environment for the rest of this guide.
import decentriq_platform as dq
from decentriq_platform.media import MediaDcrBuilder
advertiser_email = "@@ YOUR EMAIL HERE @@"
advertiser_api_token = "@@ YOUR TOKEN HERE @@"
publisher_email = "@@ EMAIL OF PUBLISHER PARTICIPANT @@"
advertiser_client = dq.create_client(advertiser_email, advertiser_api_token)
builder = MediaDcrBuilder(client=advertiser_client)
dcr_definition = builder.\
    with_name("My DCR").\
    with_insights().\
    with_lookalike().\
    with_retargeting().\
    with_matching_id_format(dq.types.MatchingId.STRING).\
    with_publisher_emails([publisher_email]).\
    with_advertiser_emails([advertiser_email]).\
    with_agency_emails(["test@agency.com"]).\
    with_observer_emails(["test@observer.com"]).\
    build()
media_dcr = advertiser_client.publish_media_dcr(dcr_definition)
dcr_id = media_dcr.id
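The upload scripts in the following sections refer to dcr_id. If they run in a separate process from the setup script, the ID has to be carried over somehow. One option, sketched here with the standard library only, is to write it to a small JSON file. The file name dcr_state.json and both helper names are assumptions for illustration.

```python
import json
from pathlib import Path


def save_dcr_id(dcr_id, path="dcr_state.json"):
    """Write the DCR ID to a JSON file so later scripts can pick it up."""
    Path(path).write_text(json.dumps({"dcr_id": dcr_id}), encoding="utf-8")


def load_dcr_id(path="dcr_state.json"):
    """Read back the DCR ID written by save_dcr_id."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["dcr_id"]
```

Calling save_dcr_id(media_dcr.id) at the end of the setup script and dcr_id = load_dcr_id() at the top of an upload script would connect the two.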
Direct Upload
Advertisers can use the Session function publish_dataset to provision their data. While technically optional, using a Keychain will make any future operations with this dataset easier. This example upload script also places the key into the Keychain. You will not need to remember the key to reprovision this data; you only need the password for the Keychain.
import decentriq_platform as dq
from decentriq_platform import Keychain, KeychainEntry
user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"
keychain_password = "@@ YOUR KEYCHAIN PASSWORD HERE @@"
dataset_name = "audiences.csv"
dataset_path = "/path/to/advertiser_data.csv"
client = dq.create_client(user_email, api_token)
user_keychain = Keychain.get_or_create_unlocked_keychain(client, bytes(keychain_password, 'utf8'))
# dcr_id is the ID of the Media DCR, e.g. media_dcr.id from the setup script
data_room_descriptions = {description['id']: description for description in client.get_data_room_descriptions()}
data_room_description = data_room_descriptions[dcr_id]
session = client.create_session_from_data_room_description(data_room_description)
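# Generate a fresh encryption key for the dataset
key = dq.Key()
with open(dataset_path, "rb") as f:
    dataset_id = client.upload_dataset(
        f,
        key,
        dataset_name,
        store_in_keychain=user_keychain
    )
# Store the key in the Keychain under the dataset ID, replacing any existing entry
user_keychain.remove("dataset_key", dataset_id)
user_keychain.insert(KeychainEntry("dataset_key", dataset_id, key.material))
# Provision the uploaded dataset to the "audiences" node of the DCR
session.publish_dataset(dcr_id, dataset_id, "audiences", key)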
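The dictionary lookup data_room_descriptions[dcr_id] raises a bare KeyError when the ID is not among the returned descriptions, for example because of a typo. A small helper can make that failure easier to diagnose. This sketch assumes only what the script above shows, namely that each description is a mapping with an 'id' entry. The name find_data_room_description is hypothetical and not part of the SDK.

```python
def find_data_room_description(descriptions, dcr_id):
    """Return the description whose 'id' equals dcr_id, or raise LookupError."""
    descriptions = list(descriptions)  # accept any iterable of mappings
    for description in descriptions:
        if description["id"] == dcr_id:
            return description
    raise LookupError(
        f"No data room with id {dcr_id!r} among {len(descriptions)} descriptions"
    )
```

It would be called as find_data_room_description(client.get_data_room_descriptions(), dcr_id), and the result passed to client.create_session_from_data_room_description as before.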
Upload from Keychain
This example provisions a previously uploaded dataset to a Media DCR, using the key stored in the Keychain. You will not need to remember the key to reprovision this data; you only need the password for the Keychain.
import decentriq_platform as dq
from decentriq_platform import Keychain
user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"
keychain_password = "@@ YOUR KEYCHAIN PASSWORD HERE @@"
dataset_id = "@@ YOUR DATASET ID HERE @@"
client = dq.create_client(user_email, api_token)
user_keychain = Keychain.get_or_create_unlocked_keychain(client, bytes(keychain_password, 'utf8'))
key = dq.Key(user_keychain.get("dataset_key", dataset_id).value)
data_room_descriptions = {description['id']: description for description in client.get_data_room_descriptions()}
data_room_description = data_room_descriptions[dcr_id]
session = client.create_session_from_data_room_description(data_room_description)
session.publish_dataset(dcr_id, dataset_id, "audiences", key)
# Alternate way to call without a session:
# dcr = client.retrieve_media_dcr(dcr_id)
# dcr.get_node("audiences").publish_dataset(
#     dataset_id,
#     key
# )
No Keychain
For completeness, this example upload script does the same operation without using the Keychain. You will need to use the same key again to reprovision this data.
import decentriq_platform as dq
user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"
dataset_name = "audiences.csv"
dataset_path = "/path/to/advertiser_data.csv"
client = dq.create_client(user_email, api_token)
data_room_descriptions = {description['id']: description for description in client.get_data_room_descriptions()}
data_room_description = data_room_descriptions[dcr_id]
session = client.create_session_from_data_room_description(data_room_description)
key = dq.Key()
with open(dataset_path, "rb") as f:
    dataset_id = client.upload_dataset(f, key, dataset_name)
session.publish_dataset(dcr_id, dataset_id, "audiences", key)
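Without a Keychain, the key has to be kept somewhere by the caller. As one possibility, the raw key bytes could be written to a local file that only the current user can read. The sketch below assumes that key.material (used in the Direct Upload script) is a bytes value and that dq.Key(...) accepts those bytes back, as the Keychain example suggests. The helper names and file path are hypothetical, and a dedicated secret store may be more appropriate in production.

```python
import os


def save_key_material(material, path):
    """Write raw key bytes to path, requesting owner-only permissions (0o600)."""
    # O_EXCL refuses to overwrite an existing file, so a stored key is never
    # clobbered silently. The permission bits take effect on POSIX systems.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(material)


def load_key_material(path):
    """Read back the raw key bytes written by save_key_material."""
    with open(path, "rb") as f:
        return f.read()
```

After the upload, save_key_material(key.material, "audiences.key") would store the key, and dq.Key(load_key_material("audiences.key")) would rebuild it before calling publish_dataset again.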