Getting started with the Python SDK for the Media DCR
This tutorial shows the steps required to build and run a Media Data Clean Room (DCR) from scratch. You will learn how to:
- create a Client for interacting with the Decentriq platform
- create a new Media DCR
- provision data to a Media DCR
- download audiences from a Media DCR
- find and interact with an existing Media DCR
Creating a Client
A Client object handles communication with the Decentriq platform. It provides facilities for creating DCRs, retrieving existing DCRs, and uploading data.
The snippet below shows how to create a Client object by providing user credentials.
import decentriq_platform as dq
advertiser_email = "@@ YOUR EMAIL HERE @@"
advertiser_api_token = "@@ YOUR TOKEN HERE @@"
advertiser_client = dq.create_client(advertiser_email, advertiser_api_token)
NOTE: API tokens can be created and managed via the Decentriq UI - API tokens page.
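Hardcoding an API token in a script makes it easy to leak through version control. One alternative, sketched below, is to read both values from environment variables. The variable names DECENTRIQ_EMAIL and DECENTRIQ_API_TOKEN and the helper load_credentials are illustrative assumptions, not part of the SDK.

```python
import os


def load_credentials(
    email_var="DECENTRIQ_EMAIL",      # hypothetical variable name
    token_var="DECENTRIQ_API_TOKEN",  # hypothetical variable name
    environ=None,
):
    """Return (email, token) read from the environment, failing early if unset."""
    env = os.environ if environ is None else environ
    missing = [name for name in (email_var, token_var) if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env[email_var], env[token_var]
```

The returned pair can then be passed to dq.create_client(email, token) in the same way as the hardcoded values in the snippet above.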
Creating a New Media DCR
To create a Media DCR via the SDK, you must first import the media package.
The MediaDcr object is used to create a new Media DCR. Its constructor takes configuration options as arguments, including:
- client: The client instance for API communication
- name: Name of the DCR
- participants: Participants of the DCR
- collaboration_types: Collaboration types supported by the DCR
- matching_ids: Matching IDs used by the DCR
- hide_absolute_values: A flag determining whether to hide absolute values from outputs
An example of building a Media DCR is shown below:
import decentriq_platform as dq
from decentriq_platform.media import MediaDcr, Participant, CollaborationType, Permission
advertiser_email = "@@ YOUR EMAIL HERE @@"
advertiser_api_token = "@@ YOUR TOKEN HERE @@"
advertiser_client = dq.create_client(advertiser_email, advertiser_api_token)
publisher_email = "@@ EMAIL OF PUBLISHER PARTICIPANT @@"
media_dcr = MediaDcr(
    client=advertiser_client,
    name="my_media_dcr",
    participants=[
        Participant(
            role="Publisher",
            emails=[publisher_email],
            permissions=[
                Permission.VIEW_OVERLAP,
                Permission.VIEW_INSIGHTS,
                Permission.PROVIDE_BASE_AUDIENCE,
                Permission.EXPORT_AUDIENCE,
            ],
        ),
        Participant(
            role="Advertiser",
            emails=[advertiser_email],
            permissions=[
                Permission.VIEW_OVERLAP,
                Permission.VIEW_INSIGHTS,
                Permission.PROVIDE_SEED_AUDIENCE,
                Permission.EXPORT_AUDIENCE,
                Permission.CREATE_CUSTOM_AUDIENCE,
            ],
        ),
    ],
    collaboration_types=[
        CollaborationType.INSIGHTS,
        CollaborationType.LOOKALIKE,
        CollaborationType.REMARKETING,
        CollaborationType.RULE_BASED,
    ],
    matching_ids=[dq.types.MatchingId.HASHED_EMAIL],
)
The participants field comprises a list of DCR collaborators. Each Participant represents a group of users with the same attributes.
The collaboration_types field represents the features supported by the DCR.
The matching_ids field denotes the type of matching IDs used by the DCR when matching users between the advertiser and publisher.
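With HASHED_EMAIL as the matching ID, both parties need to derive the ID from an email address in the same way, otherwise identical users will not match. The sketch below assumes a hex-encoded SHA-256 digest of the trimmed, lowercased address. That normalisation is an assumption made for illustration, and the exact format the platform expects should be confirmed with Decentriq. hash_email is a hypothetical helper, not an SDK function.

```python
import hashlib


def hash_email(email: str) -> str:
    """Hex SHA-256 of the trimmed, lowercased address (assumed normalisation)."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Differently formatted spellings of one address map to the same ID.
assert hash_email(" Alice@Example.com ") == hash_email("alice@example.com")
```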
Provisioning data to a Media DCR
Provisioning advertiser data
Advertisers can use their data in a Media DCR by provisioning a dataset directly to it.
key = dq.Key()
with open("/path/to/advertiser_data.csv", "rb") as file:
    # Upload the local dataset to the Dataset Portal.
    advertiser_dataset_manifest_hash = advertiser_client.upload_dataset(
        file,
        key,
        "advertiser.csv",
    )
# Provision the uploaded dataset to the Media DCR.
media_dcr.provision_seed_audiences(advertiser_dataset_manifest_hash)
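Before uploading, it can help to catch obviously malformed files locally. The following sketch checks that a CSV file is non-empty and that every row has the same number of columns. It makes no assumption about the column schema the Media DCR expects, and check_csv is a hypothetical helper rather than an SDK function.

```python
import csv


def check_csv(path):
    """Return (row_count, column_count); raise ValueError on empty or ragged files."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))  # loads the whole file: fine for a quick check
    if not rows:
        raise ValueError(f"{path} is empty")
    width = len(rows[0])
    for line_no, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"{path}: row {line_no} has {len(row)} columns, expected {width}"
            )
    return len(rows), width
```

Calling check_csv("/path/to/advertiser_data.csv") before upload_dataset surfaces ragged rows without a round trip to the platform.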
Provisioning publisher data
Before publisher datasets can be provisioned to a Media DCR, a data lab must be created. This is an intermediate step that helps check for internal consistency in the data. Please contact customer success for additional support when working with data labs.
The snippet below is provided to help brands try out clean rooms in a "quick start" setup.
import decentriq_platform as dq
from decentriq_platform.media import MediaDcr
publisher_email = "@@ EMAIL OF PUBLISHER PARTICIPANT @@"
publisher_api_token = "@@ PUBLISHER TOKEN HERE @@"
publisher_client = dq.create_client(publisher_email, publisher_api_token)
builder = dq.data_lab.DataLabBuilder(publisher_client)
builder.with_name("tutorial-data-lab")
builder.with_matching_id_format(dq.types.MatchingId.HASHED_EMAIL)
builder.with_embeddings(50)
builder.with_demographics()
builder.with_segments()
data_lab = builder.build()
file_encryption_key = dq.Key()
data_lab.provision_local_datasets(
    file_encryption_key,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)
data_lab.run()
validation_report = data_lab.get_validation_report()
statistics_report = data_lab.get_statistics_report()
publisher_dcr = MediaDcr.from_existing(media_dcr.id, publisher_client)
publisher_dcr.provision_base_audience(data_lab.data_lab_id)
Managing audiences
Retrieving audiences
Audiences can be retrieved from the DCR using the get_audiences method. This returns a Job object that can be used to retrieve the audiences once the job has completed.
Publishers calling get_audiences will only see audiences that have been made available to them by the advertiser.
audiences_job = media_dcr.get_audiences()
audiences_job.wait_for_completion(timeout=60 * 5)
audiences = audiences_job.result()
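The call to wait_for_completion above blocks until the job finishes or the timeout elapses (60 * 5 is presumably five minutes expressed in seconds). The SDK's internals are not shown in this tutorial, so the following is only a generic illustration of the poll-with-timeout pattern. The wait_until helper and its is_done callback are hypothetical stand-ins.

```python
import time


def wait_until(is_done, timeout, interval=0.5):
    """Poll is_done() until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not finished after {timeout} seconds")
        time.sleep(interval)
```

Using time.monotonic rather than time.time keeps the deadline unaffected by system clock adjustments.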
Creating audiences
Creating a rule-based audience
A RuleBasedAudienceBuilder is used to create rule-based audiences from an existing "source" audience. AudienceFilters and AudienceCombinators may be specified, allowing the rule-based audience to be constructed as required. As a convenience, the audience can be made immediately available to the publisher participant using the with_share_with_participants method.
AudienceFilters allow users within an audience to be selected based on the desired attributes.
AudienceCombinators are created using the AudienceCombinatorBuilder. They allow multiple audiences to be combined and also allow AudienceFilters to be applied to the audience being combined.
from decentriq_platform.media import (
    RuleBasedAudienceBuilder,
    AudienceFilters,
    Filter,
    FilterOperator,
    AudienceCombinatorBuilder,
    CombineOperator,
    MatchOperator,
)
participants = media_dcr.get_participants()
publisher_participant = [participant for participant in participants if participant.role == "Publisher"][0]
audience_filters = AudienceFilters(
    filters=[
        Filter(
            attribute="gender",
            values=["M"],
            operator=FilterOperator.CONTAINS_ALL,
        ),
        Filter(
            attribute="age",
            values=["21-30", "31-40"],
            operator=FilterOperator.CONTAINS_ANY,
        ),
    ],
    operator=MatchOperator.MATCH_ALL,
)
combinator_source_audience = audiences.get_seed_audience("insurance")
audience_combinator = (
    AudienceCombinatorBuilder(
        operator=CombineOperator.UNION,
        source_audience=combinator_source_audience,
    )
    .with_filters(
        filters=AudienceFilters(
            filters=[
                Filter(
                    attribute="gender",
                    values=["M"],
                    operator=FilterOperator.CONTAINS_ALL,
                )
            ],
            operator=MatchOperator.MATCH_ALL,
        ),
    )
    .build()
)
source_audience = audiences.get_seed_audience("shoes")
rule_based_audience_definition = (
    RuleBasedAudienceBuilder(
        name="rule-based-audience",
        source_audience=source_audience,
    )
    .with_share_with_participants([publisher_participant])
    .with_filters(audience_filters)
    .with_combinator([audience_combinator])
    .build()
)
rb_audience = media_dcr.create_rule_based_audience(rule_based_audience_definition)
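The operator names used above suggest the following semantics, modelled here in plain Python on in-memory records so they can be tried out locally. This is an interpretation of the names, not the SDK's implementation: CONTAINS_ANY is assumed to match when a user has at least one of the listed values, CONTAINS_ALL when the user has every listed value, and MATCH_ALL when every filter matches. The user records and the filter_matches and match_all helpers are hypothetical.

```python
def filter_matches(user, attribute, values, operator):
    """Evaluate one filter against a user's attribute values (assumed semantics)."""
    present = set(user.get(attribute, []))
    if operator == "CONTAINS_ALL":
        return set(values) <= present
    if operator == "CONTAINS_ANY":
        return bool(set(values) & present)
    raise ValueError(f"unknown operator: {operator}")


def match_all(user, filters):
    """MATCH_ALL: the user is selected only if every filter matches."""
    return all(filter_matches(user, *f) for f in filters)


# Same two filters as in the tutorial snippet, written as plain tuples.
filters = [
    ("gender", ["M"], "CONTAINS_ALL"),
    ("age", ["21-30", "31-40"], "CONTAINS_ANY"),
]
# Made-up users for illustration only.
users = {
    "u1": {"gender": ["M"], "age": ["21-30"]},
    "u2": {"gender": ["F"], "age": ["31-40"]},
    "u3": {"gender": ["M"], "age": ["41-50"]},
}
selected = [uid for uid, attrs in users.items() if match_all(attrs, filters)]
```

Under these assumptions only u1 is selected: u2 fails the gender filter and u3 fails the age filter.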
Creating a lookalike audience
A LookalikeAudienceBuilder is used to create lookalike audiences from an existing "source" audience. The reach is specified as a percentage between 1 and 30, and the audience can optionally be made immediately available to the publisher participant using the with_share_with_participants method.
from decentriq_platform.media import LookalikeAudienceBuilder
source_audience = audiences.get_seed_audience("shoes")
participants = media_dcr.get_participants()
publisher_participant = [participant for participant in participants if participant.role == "Publisher"][0]
lookalike_audience_definition = (
    LookalikeAudienceBuilder(
        name="lookalike-audience",
        reach=10,
        source_audience=source_audience,
    )
    .with_share_with_participants([publisher_participant])
    .build()
)
lal_audience = media_dcr.create_lookalike_audience(lookalike_audience_definition)
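Since the reach must fall between 1 and 30, a script that takes it from user input or configuration may want to check the value before building the audience. The sketch below assumes the bounds are inclusive and that reach is an integer, as in the example above. Neither is stated explicitly in this tutorial, and validate_reach is a hypothetical helper.

```python
def validate_reach(reach):
    """Return reach unchanged if it is an integer in [1, 30], else raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(reach, bool) or not isinstance(reach, int):
        raise ValueError(f"reach must be an integer, got {reach!r}")
    if not 1 <= reach <= 30:
        raise ValueError(f"reach must be between 1 and 30, got {reach}")
    return reach
```

The validated value can then be passed as reach=validate_reach(value) to the LookalikeAudienceBuilder.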
Creating a remarketing audience
A RemarketingAudienceBuilder is used to create remarketing audiences from an existing source audience type. As a convenience, the audience can be made immediately available to the publisher participant using the with_share_with_participants method.
from decentriq_platform.media import RemarketingAudienceBuilder
remarketing_audience_definition = (
    RemarketingAudienceBuilder(
        name="remarketing-audience",
        source_audience_type="shoes",
    )
    .with_share_with_participants([publisher_participant])
    .build()
)
remarketing_audience = media_dcr.create_remarketing_audience(remarketing_audience_definition)
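The snippets above select the publisher participant with a list comprehension followed by [0], which raises a bare IndexError if no participant has that role. A slightly more defensive lookup might look like the sketch below. find_participant is a hypothetical helper, and the SimpleNamespace objects are stand-ins that expose only the role attribute used in the snippets above.

```python
from types import SimpleNamespace


def find_participant(participants, role):
    """Return the first participant with the given role, or raise LookupError."""
    match = next((p for p in participants if p.role == role), None)
    if match is None:
        available = sorted({p.role for p in participants})
        raise LookupError(
            f"No participant with role {role!r}; found roles: {available}"
        )
    return match


# Stand-in objects in place of the result of media_dcr.get_participants().
participants = [SimpleNamespace(role="Advertiser"), SimpleNamespace(role="Publisher")]
publisher_participant = find_participant(participants, "Publisher")
```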
Exporting audiences as a publisher
As a publisher, you can export the users of any audience that has been made available to you by calling the get_audience_user_list function.
The example below shows how to export such an audience as a CSV file.
audience_user_list_job = publisher_dcr.get_audience_user_list(lal_audience)
audience_user_list_job.wait_for_completion(timeout=60 * 5)
audience_user_list = audience_user_list_job.result()
# Write a CSV file with the list of users in the output
outpath = "audience_user_list.csv"
with open(outpath, "w") as f:
    for user in audience_user_list:
        f.write(f"{user}\n")
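If the exported identifiers could ever contain commas, quotes, or line breaks, the standard csv module handles the escaping that a plain f.write does not. The sketch below assumes, as the loop above does, that the result is an iterable with one user per item. write_user_list is a hypothetical helper, not an SDK function.

```python
import csv


def write_user_list(users, outpath):
    """Write one user per row and return the number of rows written."""
    count = 0
    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(outpath, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for user in users:
            writer.writerow([user])
            count += 1
    return count
```

With the variables from the example above, this would be called as write_user_list(audience_user_list, "audience_user_list.csv").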
Interacting with an existing DCR
A DCR might have been created in the Decentriq UI or as part of another script. In this case it can be retrieved using the from_existing method of the MediaDcr object.
dcr_id = "@@ ID OF THE EXISTING DCR @@"
existing_media_dcr = MediaDcr.from_existing(dcr_id, publisher_client)
Quickstart Script
You can use the following script to get started quickly:
import decentriq_platform as dq
from decentriq_platform.media import (
    MediaDcr,
    Participant,
    CollaborationType,
    Permission,
    LookalikeAudienceBuilder,
    RuleBasedAudienceBuilder,
    AudienceFilters,
    Filter,
    FilterOperator,
    AudienceCombinatorBuilder,
    CombineOperator,
    MatchOperator,
)
advertiser_email = "@@ YOUR EMAIL HERE @@"
advertiser_api_token = "@@ YOUR TOKEN HERE @@"
advertiser_client = dq.create_client(advertiser_email, advertiser_api_token)
publisher_email = "@@ EMAIL OF PUBLISHER PARTICIPANT @@"
media_dcr = MediaDcr(
    client=advertiser_client,
    name="my_media_dcr",
    participants=[
        Participant(
            role="Publisher",
            emails=[publisher_email],
            permissions=[
                Permission.VIEW_OVERLAP,
                Permission.VIEW_INSIGHTS,
                Permission.PROVIDE_BASE_AUDIENCE,
                Permission.EXPORT_AUDIENCE,
            ],
        ),
        Participant(
            role="Advertiser",
            emails=[advertiser_email],
            permissions=[
                Permission.VIEW_OVERLAP,
                Permission.VIEW_INSIGHTS,
                Permission.PROVIDE_SEED_AUDIENCE,
                Permission.EXPORT_AUDIENCE,
                Permission.CREATE_CUSTOM_AUDIENCE,
            ],
        ),
    ],
    collaboration_types=[
        CollaborationType.INSIGHTS,
        CollaborationType.LOOKALIKE,
        CollaborationType.REMARKETING,
        CollaborationType.RULE_BASED,
    ],
    matching_ids=[dq.types.MatchingId.HASHED_EMAIL],
)
# Upload and provision data
key = dq.Key()
with open("/path/to/advertiser_data.csv", "rb") as file:
    # Upload the local dataset to the Dataset Portal.
    advertiser_dataset_manifest_hash = advertiser_client.upload_dataset(
        file,
        key,
        "advertiser.csv",
    )
# Provision the uploaded dataset to the Media DCR.
media_dcr.provision_seed_audiences(advertiser_dataset_manifest_hash)
"""
Continue initialising the Media DCR as a publisher.
The publisher must provision a data lab to the DCR before any further
advertiser interactions are permitted.
"""
publisher_api_token = "@@ PUBLISHER TOKEN HERE @@"
publisher_client = dq.create_client(publisher_email, publisher_api_token)
builder = dq.data_lab.DataLabBuilder(publisher_client)
builder.with_name("tutorial-data-lab")
builder.with_matching_id_format(dq.types.MatchingId.HASHED_EMAIL)
builder.with_embeddings(50)
builder.with_demographics()
builder.with_segments()
data_lab = builder.build()
file_encryption_key = dq.Key()
data_lab.provision_local_datasets(
    file_encryption_key,
    "/path/to/matching_data.csv",
    "/path/to/segments_data.csv",
    demographics_data_path="/path/to/demographics_data.csv",
    embeddings_data_path="/path/to/embeddings_data.csv",
)
data_lab.run()
validation_report = data_lab.get_validation_report()
statistics_report = data_lab.get_statistics_report()
publisher_dcr = MediaDcr.from_existing(media_dcr.id, publisher_client)
publisher_dcr.provision_base_audience(data_lab.data_lab_id)
"""
DCR fully initialised. Continue interacting with the DCR as an advertiser.
"""
# Retrieve the list of audiences
audiences_job = media_dcr.get_audiences()
audiences_job.wait_for_completion(timeout=60 * 5)
audiences = audiences_job.result()
# Get the publisher participant
participants = media_dcr.get_participants()
publisher_participant = [participant for participant in participants if participant.role == "Publisher"][0]
# Create a lookalike audience
source_audience = audiences.get_seed_audience("shoes")
lookalike_audience_definition = (
    LookalikeAudienceBuilder(
        name="lookalike-audience",
        reach=10,
        source_audience=source_audience,
    )
    .with_share_with_participants([publisher_participant])
    .build()
)
lal_audience = media_dcr.create_lookalike_audience(lookalike_audience_definition)
# Create a rule-based audience
audience_filters = AudienceFilters(
    filters=[
        Filter(
            attribute="gender",
            values=["M"],
            operator=FilterOperator.CONTAINS_ALL,
        ),
        Filter(
            attribute="age",
            values=["21-30", "31-40"],
            operator=FilterOperator.CONTAINS_ANY,
        ),
    ],
    operator=MatchOperator.MATCH_ALL,
)
combinator_source_audience = audiences.get_seed_audience("insurance")
audience_combinator = (
    AudienceCombinatorBuilder(
        operator=CombineOperator.UNION,
        source_audience=combinator_source_audience,
    )
    .with_filters(
        filters=AudienceFilters(
            filters=[
                Filter(
                    attribute="gender",
                    values=["M"],
                    operator=FilterOperator.CONTAINS_ALL,
                )
            ],
            operator=MatchOperator.MATCH_ALL,
        ),
    )
    .build()
)
source_audience = audiences.get_seed_audience("shoes")
rule_based_audience_definition = (
    RuleBasedAudienceBuilder(
        name="rule-based-audience",
        source_audience=source_audience,
    )
    .with_share_with_participants([publisher_participant])
    .with_filters(audience_filters)
    .with_combinator([audience_combinator])
    .build()
)
rb_audience = media_dcr.create_rule_based_audience(rule_based_audience_definition)