Download datasets
After running a computation in a Data Clean Room, you can retrieve the results in different formats depending on your needs.
Setup
First, import the necessary modules:
import decentriq_platform as dq
from decentriq_platform import create_client
from decentriq_platform.analytics import AnalyticsDcrBuilder, PythonComputeNodeDefinition
Then create the Client instance used to communicate with the Decentriq platform, and fetch the latest enclave specifications:
user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"
client = create_client(user_email, api_token)
enclave_specs = dq.enclave_specifications.latest()
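Rather than pasting credentials into the script, the email and token can be read from the environment. The following is a minimal sketch using only the standard library; the variable names DECENTRIQ_EMAIL and DECENTRIQ_API_TOKEN and the helper load_credentials are hypothetical and not part of the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (email, token) read from a mapping of environment variables."""
    # DECENTRIQ_EMAIL and DECENTRIQ_API_TOKEN are hypothetical names chosen
    # for this sketch; use whatever naming your deployment prefers.
    try:
        return env["DECENTRIQ_EMAIL"], env["DECENTRIQ_API_TOKEN"]
    except KeyError as missing:
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None


# Demonstration with an explicit mapping standing in for the real environment
email, token = load_credentials(
    {"DECENTRIQ_EMAIL": "analyst@example.com", "DECENTRIQ_API_TOKEN": "dummy-token"}
)
```

The returned values can then be passed to create_client in place of the hard-coded placeholders.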
Retrieving computation results
Once you have published a DCR and have access to a computation node, you can run the computation and retrieve its results. The following example builds and publishes a DCR, then downloads the results as a ZIP file:
# Build and publish your DCR
builder = AnalyticsDcrBuilder(client=client)
script = """with open("/output/results.txt", "w") as output_file:
    output_file.write("Computation results")
"""
dcr_definition = (
    builder.with_name("My Analytics DCR")
    .with_owner(user_email)
    .with_description("Example DCR for downloading results")
    .add_node_definitions(
        [
            PythonComputeNodeDefinition(
                name="python_node",
                script=script,
            ),
        ]
    )
    .add_participant(user_email, analyst_of=["python_node"])
    .build()
)
dcr = client.publish_analytics_dcr(dcr_definition)
# Get the computation node
node = dcr.get_node("python_node")
# Retrieve results as a ZIP file (recommended for multiple output files)
result_zip = node.run_computation_and_get_results_as_zip()
# Access files in the ZIP
for filename in result_zip.namelist():
    content = result_zip.read(filename).decode()
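To keep the output files on disk rather than only in memory, the archive can be extracted into a directory. The sketch below assumes the returned object behaves like a standard zipfile.ZipFile (it exposes namelist() and read() in the example above); save_results is a hypothetical helper, and an in-memory archive stands in for a real computation result.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def save_results(result_zip, target_dir):
    """Extract all members into target_dir and return the relative file paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    result_zip.extractall(target)
    return sorted(
        path.relative_to(target).as_posix()
        for path in target.rglob("*")
        if path.is_file()
    )


# Build a stand-in archive so the sketch runs without a DCR
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("results.txt", "Computation results")
stand_in = zipfile.ZipFile(buffer)

with tempfile.TemporaryDirectory() as tmp:
    saved = save_results(stand_in, tmp)
    text = (Path(tmp) / "results.txt").read_text()
```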
Download methods
The SDK provides different methods for retrieving computation results:
1. As ZIP file (recommended)
Use run_computation_and_get_results_as_zip() when your computation produces multiple output files:
result_zip = node.run_computation_and_get_results_as_zip()
# List all files in the results
print(result_zip.namelist())
# Read a specific file
content = result_zip.read("results.txt").decode()
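Reading a member that is not in the archive raises a KeyError, so it can help to check for the file first. This is a minimal sketch, again assuming the result behaves like a standard zipfile.ZipFile; read_text_member is a hypothetical helper and the archive is a local stand-in.

```python
import io
import zipfile


def read_text_member(result_zip, name, encoding="utf-8"):
    """Return the decoded content of `name`, or None if it is not in the archive."""
    if name not in result_zip.namelist():
        return None
    return result_zip.read(name).decode(encoding)


# Stand-in archive with the same file name as the example computation
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("results.txt", "Computation results")
stand_in = zipfile.ZipFile(buffer)

found = read_text_member(stand_in, "results.txt")
missing = read_text_member(stand_in, "missing.txt")
```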
2. As bytes
Use run_computation_and_get_results_as_bytes() when you need the raw binary data:
result_bytes = node.run_computation_and_get_results_as_bytes()
# Save to file
with open("results.zip", "wb") as f:
    f.write(result_bytes)
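The raw bytes can also be inspected in memory without first writing them to disk. The sketch below assumes the bytes form a ZIP archive, which the results.zip file name above suggests but which should be verified for your node type; open_results is a hypothetical helper, and the bytes are generated locally as a stand-in.

```python
import io
import zipfile


def open_results(result_bytes):
    """Wrap raw result bytes in a ZipFile, rejecting data that is not a ZIP archive."""
    if not zipfile.is_zipfile(io.BytesIO(result_bytes)):
        raise ValueError("result bytes are not a ZIP archive")
    return zipfile.ZipFile(io.BytesIO(result_bytes))


# Stand-in bytes so the sketch runs without a DCR
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("results.txt", "Computation results")
stand_in_bytes = buffer.getvalue()

archive = open_results(stand_in_bytes)
names = archive.namelist()
```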
Working with existing DCRs
To download results from an existing DCR:
dcr_id = dcr.id
# Retrieve an existing DCR by its ID
existing_dcr = client.retrieve_analytics_dcr(
    dcr_id=dcr_id,
    enclave_specs=enclave_specs.values(),
)
# Get the computation node and retrieve results
computation_node = existing_dcr.get_node("python_node")
results = computation_node.run_computation_and_get_results_as_zip()