Importing from GCS

Setup

First, import the necessary modules:

from decentriq_platform import create_client, Key
from decentriq_platform.analytics import (
    AnalyticsDcrBuilder,
    RawDataNodeDefinition,
)
from decentriq_platform.data_connectors import (
    GcsImportConnectorDefinition,
    GcsCredentials,
)

Then, create the Client instance with which you can communicate with the Decentriq platform:

user_email = "@@ YOUR EMAIL HERE @@"
api_token = "@@ YOUR TOKEN HERE @@"

import decentriq_platform as dq

client = create_client(user_email, api_token)
enclave_specs = dq.enclave_specifications.latest()
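
The email and API token above are written directly into the script. As an alternative, they can be read from the environment. The following is a minimal sketch: the helper load_decentriq_settings and the variable names DECENTRIQ_EMAIL and DECENTRIQ_API_TOKEN are assumptions made for illustration and are not defined by the SDK.

```python
import os


def load_decentriq_settings(env=None):
    """Return (user_email, api_token) read from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    env = os.environ if env is None else env
    required = ("DECENTRIQ_EMAIL", "DECENTRIQ_API_TOKEN")
    # Collect every missing or empty variable so the error names all of them.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["DECENTRIQ_EMAIL"], env["DECENTRIQ_API_TOKEN"]
```

The returned pair can then be passed to create_client in place of the hard-coded values.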

Example: Import a file from GCS

This example shows how to import a file from Google Cloud Storage into your Data Clean Room.

# Build the Data Clean Room
builder = AnalyticsDcrBuilder(client=client)

dcr_definition = (
    builder.with_name("GCS Import DCR")
    .with_owner(user_email)
    .with_description("Import a file from GCS")
    .add_node_definitions([
        # Node to hold GCS credentials
        RawDataNodeDefinition(
            name="gcs-credentials",
            is_required=True,
        ),
        # Import connector node
        GcsImportConnectorDefinition(
            name="gcs-import",
            object_key="integration_test_import.txt",
            bucket="@@ GCS BUCKET NAME HERE @@",
            credentials_dependency="gcs-credentials",
        ),
    ])
    .add_participant(
        user_email,
        analyst_of=["gcs-import"],
        data_owner_of=["gcs-credentials"],
    )
    .build()
)

# Publish the Data Clean Room
dcr = client.publish_analytics_dcr(dcr_definition)

# Upload GCS credentials
gcs_credentials = dcr.get_node("gcs-credentials")
gcs_credentials.upload_and_publish_dataset(
    GcsCredentials(
        credentials_json="@@ GCS SERVICE ACCOUNT JSON HERE @@",
    ).as_binary_io(),
    Key(),
    "credentials.txt",
)

# Import the data from GCS
gcs_import_connector = dcr.get_node("gcs-import")
result = gcs_import_connector.run_computation_and_get_results_as_bytes()
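
The example ends with the computation result held in memory as bytes. This page does not state whether those bytes are the raw file contents or an archive, so the sketch below handles both cases by checking for a ZIP signature first. The helper save_import_result and the fallback file name imported_file.bin are hypothetical and not part of the SDK.

```python
import io
import zipfile
from pathlib import Path


def save_import_result(result: bytes, target_dir: str) -> list:
    """Write the returned bytes under target_dir and return the created paths.

    Assumption: the payload is either a ZIP archive or the raw file contents.
    """
    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    buffer = io.BytesIO(result)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            archive.extractall(out_dir)
            # Report extracted files only, skipping directory entries.
            return sorted(
                str(out_dir / name)
                for name in archive.namelist()
                if not name.endswith("/")
            )

    # Not an archive: store the bytes unchanged under a placeholder name.
    path = out_dir / "imported_file.bin"
    path.write_bytes(result)
    return [str(path)]
```

Calling save_import_result(result, "output") after the import step would leave the imported data on local disk for inspection.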

Parameters

The GcsImportConnectorDefinition requires the following parameters:

  • name: The name of the import connector node
  • object_key: The GCS object key (path) of the file to import
  • bucket: The name of the GCS bucket containing the file
  • credentials_dependency: The name of the node containing GCS credentials
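
File locations in GCS are often written as a single gs:// URI, while the connector takes bucket and object_key separately. The sketch below splits such a URI into the two values. The helper split_gcs_uri is hypothetical, and it assumes that object_key is the path inside the bucket without a leading slash, as in the example above.

```python
def split_gcs_uri(uri: str):
    """Split 'gs://bucket/path/to/object' into (bucket, object_key)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {uri!r}")
    # Everything up to the first slash is the bucket, the rest is the key.
    bucket, _, object_key = uri[len(prefix):].partition("/")
    if not bucket or not object_key:
        raise ValueError(f"URI must contain a bucket and an object key: {uri!r}")
    return bucket, object_key


bucket, object_key = split_gcs_uri("gs://my-bucket/data/integration_test_import.txt")
print(bucket)      # my-bucket
print(object_key)  # data/integration_test_import.txt
```

The two values can then be supplied as the bucket and object_key parameters of GcsImportConnectorDefinition.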

The imported file will be available in the Decentriq Platform and can be used in your Data Clean Rooms.
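
The credentials upload step in the example passes the service account key as a JSON string. A quick local check before uploading can catch a wrong or truncated key file early. The sketch below is one way to do that: the helper load_service_account_json is hypothetical, and the list of checked fields is a small subset of what a Google service account key file contains, chosen for illustration.

```python
import json

# A subset of the fields found in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account_json(path: str) -> str:
    """Read a key file, run basic sanity checks, and return it as a JSON string."""
    with open(path, "r", encoding="utf-8") as handle:
        raw = handle.read()

    key = json.loads(raw)  # raises ValueError if the file is not valid JSON
    if not isinstance(key, dict):
        raise ValueError("Key file must contain a JSON object")
    if key.get("type") != "service_account":
        raise ValueError("Key file is not of type 'service_account'")

    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError("Key file is missing fields: " + ", ".join(missing))
    return raw
```

The returned string could then be used as the credentials_json value when constructing GcsCredentials.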