decentriq_platform.data_lab

Classes

DataLab

DataLab(
    client: decentriq_platform.client.Client,
    cfg: decentriq_platform.data_lab.data_lab.DataLabConfig,
    existing_data_lab: Optional[decentriq_platform.data_lab.data_lab.ExistingDataLab] = None,
)

deprovision_dataset

def deprovision_dataset(
    self,
    dataset_type: decentriq_platform.types.DataLabDatasetType,
)

get_statistics_report

def get_statistics_report(
    self,
    timeout: Optional[int] = None,
)

Retrieve the statistics report. This function blocks until the report is ready unless a timeout is specified.

Parameters:

timeout: Maximum time to wait, in seconds, for the statistics report to become available.

get_validation_report

def get_validation_report(
    self,
    timeout: Optional[int] = None,
)

Retrieve the validation report. This function blocks until the report is ready unless a timeout is specified.

Parameters:

timeout: Maximum time to wait, in seconds, for the validation report to become available.

is_validation_passed

def is_validation_passed(
    self,
    validation_report: Dict[str, str],
) -> bool

Check whether DataLab validation has passed.

Parameters:

validation_report: Result of calling get_validation_report on this DataLab.

provision_dataset

def provision_dataset(
    self,
    manifest_hash: str,
    key: decentriq_platform.storage.Key,
    dataset_type: decentriq_platform.types.DataLabDatasetType,
)

provision_local_datasets

def provision_local_datasets(
    self,
    key: decentriq_platform.storage.Key,
    matching_data_path: str,
    segments_data_path: Optional[str] = None,
    demographics_data_path: Optional[str] = None,
    embeddings_data_path: Optional[str] = None,
    *,
    secret_store_options: Optional[decentriq_platform.client.SecretStoreOptions] = None,
)

Upload local datasets and provision them to the DataLab.

Parameters:

key: The key used to encrypt the dataset.
matching_data_path: The file path to the "match" dataset.
segments_data_path: The file path to the "segments" dataset (optional).
demographics_data_path: The file path to the "demographics" dataset (optional).
embeddings_data_path: The file path to the "embeddings" dataset (optional).
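As an illustration of how the optional path arguments might be passed, the following is a minimal sketch rather than part of the SDK. The helper name provision_from_paths and the constant OPTIONAL_PATH_ARGS are hypothetical; only provision_local_datasets and its keyword names come from the signature above. Checking that each file exists before the call is a choice made by this sketch, not documented SDK behavior.

```python
import os
from typing import Any, Dict, Optional

# Keyword names taken from the provision_local_datasets signature.
OPTIONAL_PATH_ARGS = (
    "segments_data_path",
    "demographics_data_path",
    "embeddings_data_path",
)


def provision_from_paths(
    data_lab: Any,
    key: Any,
    matching_data_path: str,
    optional_paths: Optional[Dict[str, str]] = None,
) -> None:
    """Hypothetical helper: check the inputs locally, then provision them.

    `data_lab` is any object exposing provision_local_datasets as documented
    in this section; `key` is the encryption key passed through unchanged.
    """
    optional_paths = dict(optional_paths or {})

    # Reject misspelled keyword names before anything is uploaded.
    unknown = sorted(set(optional_paths) - set(OPTIONAL_PATH_ARGS))
    if unknown:
        raise ValueError(f"Unknown dataset arguments: {unknown}")

    # Fail early if a file is missing (an assumption of this sketch).
    for path in [matching_data_path, *optional_paths.values()]:
        if not os.path.isfile(path):
            raise FileNotFoundError(path)

    data_lab.provision_local_datasets(key, matching_data_path, **optional_paths)
```

A call such as provision_from_paths(data_lab, key, "matching.csv", {"segments_data_path": "segments.csv"}) would forward only the datasets that are present and leave the others at their None defaults.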

run

def run(
    self,
    /,
    *,
    dry_run: Optional[decentriq_platform.types.DryRunOptions] = None,
    parameters: Optional[Mapping[str, str]] = None,
)

Running the DataLab starts the validation jobs and the statistics job. This function does not block while waiting for the results; to retrieve them, call get_validation_report or get_statistics_report.
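The sequence of run, get_validation_report, is_validation_passed and get_statistics_report could be combined roughly as follows. This is a sketch under assumptions: the helper name run_and_collect_reports and the shape of the returned dictionary are hypothetical, and skipping the statistics report when validation fails is a choice of this example, not documented behavior. It relies only on the method signatures listed in this section.

```python
from typing import Any, Dict, Optional


def run_and_collect_reports(data_lab: Any, timeout: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: run a DataLab and gather its reports.

    `data_lab` is any object exposing run, get_validation_report,
    is_validation_passed and get_statistics_report as documented here.
    """
    # run() only starts the jobs; it does not wait for their results.
    data_lab.run()

    # Waits for the validation report (the timeout, in seconds, is passed through).
    validation_report = data_lab.get_validation_report(timeout=timeout)
    passed = data_lab.is_validation_passed(validation_report)

    result: Dict[str, Any] = {
        "validation_report": validation_report,
        "passed": passed,
        "statistics_report": None,
    }

    # Assumption of this sketch: only fetch statistics for validated data.
    if passed:
        result["statistics_report"] = data_lab.get_statistics_report(timeout=timeout)
    return result
```

Returning both reports in one dictionary keeps the caller free to inspect the validation report even when the check fails.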

DataLabBuilder

DataLabBuilder(
    client: decentriq_platform.client.Client,
)

A helper class to build a DataLab.

build

def build(
    self,
) -> decentriq_platform.data_lab.data_lab.DataLab

Build the DataLab.
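A builder is typically configured through its with_* methods before build is called. The following sketch shows one possible way to wrap that sequence. The helper name build_data_lab and its keyword arguments are hypothetical. The builder methods are called as separate statements because their return values are not documented in this section, so method chaining is not assumed.

```python
from typing import Any, Optional


def build_data_lab(
    builder: Any,
    name: str,
    matching_id: Any,
    *,
    segments: bool = False,
    demographics: bool = False,
    num_embeddings: Optional[int] = None,
) -> Any:
    """Hypothetical helper: configure a DataLabBuilder and build the DataLab.

    `builder` is any object exposing the with_* methods and build() listed
    for DataLabBuilder; `matching_id` is expected to be a MatchingId member.
    """
    builder.with_name(name)
    builder.with_matching_id_format(matching_id)

    # Optional features are enabled only when requested.
    if segments:
        builder.with_segments()
    if demographics:
        builder.with_demographics()
    if num_embeddings is not None:
        builder.with_embeddings(num_embeddings)

    return builder.build()
```

With the SDK installed, the builder argument would be a DataLabBuilder created from a client, and matching_id a member such as MatchingId.STRING.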

from_existing

def from_existing(
    self,
    data_lab_id: str,
)

Construct a new DataLab from an existing DataLab with the given ID.

Parameters:

data_lab_id: The ID of the existing DataLab.

with_demographics

def with_demographics(
    self,
)

Enable demographics in the DataLab.

with_disable_drop_invalid_rows

def with_disable_drop_invalid_rows(
    self,
)

Disable dropping of invalid rows in the DataLab.

with_embeddings

def with_embeddings(
    self,
    num_embeddings: int,
)

Enable embeddings in the DataLab.

Parameters:

num_embeddings: The number of embeddings the DataLab should use.

with_matching_id_format

def with_matching_id_format(
    self,
    matching_id: decentriq_platform.types.MatchingId,
)

Set the matching ID format.

Parameters:

matching_id: The type of matching ID to use.

with_name

def with_name(
    self,
    name: str,
)

Set the name of the DataLab.

Parameters:

name: Name to be used for the DataLab.

with_segments

def with_segments(
    self,
)

Enable segments in the DataLab.

DataLabDatasetType

DataLabDatasetType(
    value,
    names=None,
    *,
    module=None,
    qualname=None,
    type=None,
    start=1,
)

An enumeration.

Ancestors (in MRO)

enum.Enum

MatchingId

MatchingId(
    value,
    names=None,
    *,
    module=None,
    qualname=None,
    type=None,
    start=1,
)

The type of Matching ID to use.

Members:

- STRING
- EMAIL
- HASHED_EMAIL
- PHONE_NUMBER
- HASHED_PHONE_NUMBER

Ancestors (in MRO)

builtins.str
enum.Enum

MatchingIdFormat

MatchingIdFormat(
    value,
    names=None,
    *,
    module=None,
    qualname=None,
    type=None,
    start=1,
)

An enumeration.

Ancestors (in MRO)

builtins.str
enum.Enum
