decentriq_platform.analytics

Sub-modules

  • decentriq_platform.analytics.python_environment_compute_nodes

Functions

read_input_csv_file

def read_input_csv_file(
path: str,
/,
*,
has_header: bool = True,
check_header: bool = True,
encoding='utf-8',
**kwargs,
) -> _io.BytesIO

Read CSV data from a file and turn it into a BytesIO object of the correct format so that it can be uploaded to the Decentriq platform.

Parameters:

  • path: The path to the CSV file.
  • has_header: Whether the file contains a header row.
  • check_header: Whether the function should try to determine whether the file has a header. If the file has a header row but you didn't set the has_header flag, an exception will be raised. If you're sure that the way you use the function is correct, you can disable this check using this parameter.
  • encoding: The encoding of the CSV file. If you wrote the CSV file using a library like pandas, you need to check the documentation to see what encoding they use by default when writing files (likely "utf-8" in which case this can be left at its default value).
  • delimiter: The delimiter used in the CSV file. Default is the comma.
  • **kwargs: Additional keyword arguments passed to the Python CSV parser. Refer to the official documentation for a list of supported arguments.

Returns: A BytesIO object that can be passed to the methods responsible for uploading data.
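
For example, a minimal sketch of reading a local CSV file. The file path is hypothetical, and the upload step is only referenced in a comment because the upload methods are documented elsewhere in the SDK:

from decentriq_platform.analytics import read_input_csv_file

# Read a local CSV file that includes a header row. The delimiter and any
# other csv-module options can be passed as additional keyword arguments.
data = read_input_csv_file(
    "/path/to/measurements.csv",  # hypothetical path
    has_header=True,
    delimiter=",",
)

# `data` is an io.BytesIO object that can be passed to the SDK's data
# upload methods (e.g. the Client's dataset upload method, not shown here).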

Classes

AnalyticsDcr

AnalyticsDcr(
session: Session,
dcr_id: str,
high_level: Dict[str, str],
nodes: List[NodeDefinition],
*,
client: Client,
)

A class representing an Analytics DCR.

Initialise an Analytics DCR.

Parameters:

  • session: A Session object which can be used for communication with the enclave.
  • dcr_id: ID of the Analytics DCR.
  • high_level: High level representation of the Analytics DCR.
  • nodes: List of Data Node Definitions in the Analytics DCR.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.

get_node

def get_node(
self,
name: str,
) -> Optional[Union[decentriq_platform.analytics.high_level_node.ComputationNode, decentriq_platform.analytics.high_level_node.DataNode]]

Retrieve a node with the given name.

Parameters:

  • name: Node name.
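
For example, a small sketch assuming an existing AnalyticsDcr object named dcr and a node named "report" (both names are hypothetical):

# get_node returns None if no node with the given name exists in the DCR.
node = dcr.get_node("report")
if node is None:
    raise ValueError("Node 'report' not found in this DCR")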

participants

def participants(
self,
) -> List[str]

Retrieve the participants of the Analytics DCR as a list.

retrieve_audit_log

def retrieve_audit_log(
self,
) -> str

Retrieve the Analytics DCR audit log.

stop

def stop(
self,
)

Stop the Analytics DCR.

AnalyticsDcrBuilder

AnalyticsDcrBuilder(
*,
client: Client,
enclave_specs: Optional[Dict[str, EnclaveSpecification]] = None,
)

A builder for constructing Analytics Data Clean Rooms.

Initialise an Analytics DCR builder.

Parameters:

  • client: A Client object that can be used to retrieve information about the platform.
  • enclave_specs: Determines the types of enclaves that will be supported by this Data Clean Room. If not specified, the latest enclave specifications known to this SDK version will be used.

Instance variables

node_definitions : The current list of Node Definitions that will be added to the Data Clean Room.

permissions : The list of permissions that will be added to the Data Clean Room.

add_node_definition

def add_node_definition(
self,
definition: NodeDefinition,
) -> Self

Add a single node definition to this builder.

A node definition defines how a Compute or Data Node should be constructed.

add_node_definitions

def add_node_definitions(
self,
definitions: List[NodeDefinition],
) -> Self

Add a list of node definitions to this builder.

Each node definition defines how the respective Compute or Data Node should be constructed.

add_participant

def add_participant(
self,
email: str,
*,
analyst_of: List[str] = [],
data_owner_of: List[str] = [],
) -> Self

Add a participant to the DCR being built.

If the participant isn't assigned a role, the user can still view the DCR but cannot interact with it.

Parameters:

  • email: The email address of the participant.
  • analyst_of: The names of the Compute Nodes that the user can run.
  • data_owner_of: The names of the Data Nodes to which the user can connect a dataset.

build

def build(
self,
) -> decentriq_platform.analytics.analytics_dcr.AnalyticsDcrDefinition

Build the Data Clean Room.

In order to use the DCR, the output of this method should be passed to client.publish_analytics_dcr.
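
A hedged end-to-end sketch of the builder workflow using only the methods and node definition classes documented on this page. The client object is assumed to have been created beforehand with the SDK's usual client setup, and all node, participant, and DCR names are hypothetical:

from decentriq_platform.analytics import (
    AnalyticsDcrBuilder,
    PythonComputeNodeDefinition,
    RawDataNodeDefinition,
)

builder = AnalyticsDcrBuilder(client=client)  # `client` created elsewhere

dcr_definition = (
    builder
    .with_name("My Analytics DCR")
    .with_owner("owner@example.com")
    .with_description("Example clean room built with the Python SDK")
    .add_node_definitions([
        RawDataNodeDefinition(name="input_data", is_required=True),
        PythonComputeNodeDefinition(
            name="report",
            script="print('hello from the enclave')",
            dependencies=["input_data"],
        ),
    ])
    .add_participant("dataowner@example.com", data_owner_of=["input_data"])
    .add_participant("analyst@example.com", analyst_of=["report"])
    .build()
)

# The resulting definition is published via the client, as noted above;
# publish_analytics_dcr is assumed here to return the published AnalyticsDcr.
dcr = client.publish_analytics_dcr(dcr_definition)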

with_description

def with_description(
self,
description: str,
) -> Self

Set the description of the Data Clean Room.

Parameters:

  • description: Description of the Data Clean Room.

with_name

def with_name(
self,
name: str,
) -> Self

Set the name of the Data Clean Room.

Parameters:

  • name: Name to be used for the Data Clean Room.

with_owner

def with_owner(
self,
email: str,
) -> Self

Set the owner of the Data Clean Room.

Parameters:

  • email: The email address of the owner of the Data Clean Room.

AnalyticsDcrDefinition

AnalyticsDcrDefinition(
name: str,
high_level: Dict[str, Any],
enclave_specs: Optional[Dict[str, EnclaveSpecification]] = None,
)

A class representing an Analytics DCR Definition.

Column

Column(
format_type: FormatType,
name: str,
is_nullable: bool,
hash_with: Optional[HashingAlgorithm] = None,
in_range: Optional[NumericRangeRule] = None,
)

DatasetSinkComputationNode

DatasetSinkComputationNode(
**data: Any,
)

Usage docs: https://docs.pydantic.dev/2.9/concepts/models/

A base class for creating Pydantic models.

Attributes:

  • class_vars: The names of the class variables defined on the model.
  • private_attributes: Metadata about the private attributes of the model.
  • signature: The synthesized `__init__` [Signature][inspect.Signature] of the model.
  • __pydantic_complete__: Whether model building is completed, or if there are still undefined fields.
  • __pydantic_core_schema__: The core schema of the model.
  • __pydantic_custom_init__: Whether the model has a custom `__init__` function.
  • __pydantic_decorators__: Metadata containing the decorators defined on the model. This replaces `Model.__validators__` and `Model.__root_validators__` from Pydantic V1.
  • __pydantic_generic_metadata__: Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
  • __pydantic_parent_namespace__: Parent namespace of the model, used for automatic rebuilding of models.
  • __pydantic_post_init__: The name of the post-init method for the model, if defined.
  • __pydantic_root_model__: Whether the model is a [`RootModel`][pydantic.root_model.RootModel].
  • __pydantic_serializer__: The `pydantic-core` `SchemaSerializer` used to dump instances of the model.
  • __pydantic_validator__: The `pydantic-core` `SchemaValidator` used to validate instances of the model.
  • __pydantic_extra__: A dictionary containing extra values, if [`extra`][pydantic.config.ConfigDict.extra] is set to `'allow'`.
  • __pydantic_fields_set__: The names of fields explicitly set during instantiation.
  • __pydantic_private__: Values of private attributes set on the model instance.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Ancestors (in MRO)

  • pydantic.main.BaseModel

DatasetSinkComputeNodeDefinition

DatasetSinkComputeNodeDefinition(
name: str,
dataset_name: str,
dependency: str,
encryption_key_dependency: str,
input_type: SinkInput,
is_key_hex_encoded: Optional[bool] = False,
id: Optional[str] = None,
)

Class representing a DatasetSink Compute Node Definition.

Initialise a DatasetSinkComputeNodeDefinition. This class is used to construct DatasetSinkComputeNodes.

Parameters:

  • name: Name of the DatasetSinkComputeNodeDefinition.
  • dataset_name: Name of the dataset when it is stored in the Decentriq Platform.
  • dependency: Name of the node whose data will be stored.
  • encryption_key_dependency: Name of the node storing the encryption key that will be used to encrypt the dataset in the Decentriq Platform.
  • input_type: The type of input data to be stored (raw, list of files in a zip, entire zip contents).
  • is_key_hex_encoded: Flag indicating whether or not the encryption key is hex encoded (False indicates raw bytes).
  • id: Optional ID of the dataset sink node.
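
A sketch of a possible definition, using the SinkInputFormat factory described further down this page; the node names "model_output" and "encryption_key" are hypothetical:

from decentriq_platform.analytics import (
    DatasetSinkComputeNodeDefinition,
    SinkInputFormat,
)

sink = DatasetSinkComputeNodeDefinition(
    name="store_model_output",
    dataset_name="model-output",
    dependency="model_output",                    # node producing the data to store
    encryption_key_dependency="encryption_key",   # node providing the key material
    input_type=SinkInputFormat.files(["model.bin", "metrics.json"]),
    is_key_hex_encoded=False,                     # key is provided as raw bytes
)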

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> DatasetSinkComputeNode

Construct a DatasetSinkComputeNode from the definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the DatasetSink Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

FileContent

FileContent(
name: str,
content: str,
)

Descendants

  • decentriq_platform.analytics.script.Script

FormatType

FormatType(
*args,
**kwds,
)

str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.

Ancestors (in MRO)

  • builtins.str
  • enum.Enum

Static methods

from_primitive_type

def from_primitive_type(
tpe: str,
) -> decentriq_platform.analytics.table_data_nodes.FormatType

to_primitive_type

def to_primitive_type(
fmt: FormatType,
) -> str

MaskType

MaskType(
*args,
**kwds,
)

str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.

Ancestors (in MRO)

  • builtins.str
  • enum.Enum

MatchingComputeNode

MatchingComputeNode(
id: str,
name: str,
dcr_id: str,
config: MatchingComputeNodeConfig,
dependencies: List[str],
client: Client,
session: Session,
node_definition: NodeDefinition,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
)

Class representing a Matching Compute Node.

A MatchingComputeNode can be used to join two datasets based on common columns.

Initialise a MatchingComputeNode:

Parameters:

  • name: Name of the MatchingComputeNode.
  • dcr_id: ID of the DCR this node is part of.
  • config: Configuration of the MatchingComputeNode.
  • dependencies: Nodes that the MatchingComputeNode depends on.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition with which this node was built.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results will be written.
  • id: Optional ID of the MatchingComputeNode.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ContainerComputationNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

MatchingComputeNodeConfig

MatchingComputeNodeConfig(
query: List[str],
round: int,
epsilon: int,
sensitivity: int,
dependency_paths: List[str],
)

MatchingComputeNodeDefinition

MatchingComputeNodeDefinition(
name: str,
config: MatchingComputeNodeConfig,
dependencies: List[str],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
)

Class defining the structure of a MatchingComputeNode.

A MatchingComputeNode can be used to join two datasets based on common columns.

Initialise a MatchingComputeNodeDefinition:

Parameters:

  • name: Name of the MatchingComputeNodeDefinition.
  • config: Configuration of the MatchingComputeNodeDefinition.
  • dependencies: Nodes that the MatchingComputeNodeDefinition depends on.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results will be written.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> MatchingComputeNode

Construct a MatchingComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Matching Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

PreviewComputeNode

PreviewComputeNode(
id: str,
name: str,
dcr_id: str,
dependency: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
quota_bytes: Optional[int] = 0,
)

Class representing a Preview (Airlock) Computation node.

Initialise a PreviewComputeNode:

Parameters:

  • name: Name of the PreviewComputeNode.
  • dcr_id: ID of the DCR this node is part of.
  • dependency: Node that the PreviewComputeNode depends on.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition of the Preview Node.
  • quota_bytes: Threshold for amount of data that can be previewed.
  • id: ID of the PreviewComputeNode.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

PreviewComputeNodeDefinition

PreviewComputeNodeDefinition(
name: str,
dependency: str,
quota_bytes: Optional[int] = 0,
id: Optional[str] = None,
)

Class representing a Preview (Airlock) Compute Node Definition.

Initialise a PreviewComputeNodeDefinition:

Parameters:

  • name: Name of the PreviewComputeNodeDefinition.
  • dependency: Node that the PreviewComputeNodeDefinition depends on.
  • quota_bytes: Threshold for amount of data that can be previewed.
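
For example (a sketch; "report" is a hypothetical compute node name):

from decentriq_platform.analytics import PreviewComputeNodeDefinition

# Allow at most 1 MB of the "report" node's output to be previewed.
preview = PreviewComputeNodeDefinition(
    name="report_preview",
    dependency="report",
    quota_bytes=1_000_000,
)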

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> PreviewComputeNode

Construct a PreviewComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Preview Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

PrimitiveType

PrimitiveType(
*args,
**kwds,
)

str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.

Ancestors (in MRO)

  • builtins.str
  • enum.Enum

PythonComputeNode

PythonComputeNode(
id: str,
name: str,
dcr_id: str,
script: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
dependencies: List[str] = [],
additional_files: Optional[List[FileContent]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
custom_environment: Optional[str] = None,
)

A PythonComputeNode is a node that is able to run arbitrary Python code.

Initialise a PythonComputeNode:

Parameters:

  • id: ID of the PythonComputeNode.
  • name: Name of the PythonComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • script: The Python computation as a string.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition with which this node was built.
  • dependencies: Nodes that the PythonComputeNode depends on.
  • additional_files: Other files that can be used by the PythonComputeNode.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results will be written.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ContainerComputationNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

PythonComputeNodeDefinition

PythonComputeNodeDefinition(
name: str,
script: str,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
custom_environment: Optional[str] = None,
)

Class representing a Python Compute Node Definition.

Initialise a PythonComputeNodeDefinition.

This class is used to construct PythonComputeNodes.

Parameters:

  • name: Name of the PythonComputeNodeDefinition.
  • script: The Python computation.
  • additional_files: Other files that can be used by the PythonComputeNodeDefinition.
  • dependencies: Nodes that the PythonComputeNodeDefinition depends on.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results will be written.
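
A sketch of a definition with an additional file; the in-enclave input path and the availability of pandas depend on the configured environment and are assumptions here:

from decentriq_platform.analytics import (
    FileContent,
    PythonComputeNodeDefinition,
)

script = """
import pandas as pd  # assumed to be available in the chosen environment

# Hypothetical layout: dependencies are mounted under /input/<node name>.
df = pd.read_csv("/input/input_data/dataset.csv")
df.describe().to_csv("/output/summary.csv")
"""

python_node = PythonComputeNodeDefinition(
    name="summary_stats",
    script=script,
    dependencies=["input_data"],  # hypothetical data node name
    additional_files=[FileContent(name="config.json", content='{"threshold": 0.5}')],
    enable_logs_on_error=True,
)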

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> PythonComputeNode

Construct a PythonComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Python Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

PythonEnvironmentComputeNode

PythonEnvironmentComputeNode(
id: str,
name: str,
requirements_txt: str,
dcr_id: str,
session: Session,
node_definition: NodeDefinition,
*,
client: Client,
)

Class representing an Environment.

Initialise an instance of a PythonEnvironmentComputeNode.

Parameters:

  • id: ID of the PythonEnvironmentComputeNode.
  • name: Name of the PythonEnvironmentComputeNode.
  • requirements_txt: Content of the requirements.txt file which lists the packages for the environment.
  • dcr_id: ID of the DCR the node is a member of.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition with which the node was built.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ContainerComputationNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

get_installation_report_as_dict

def get_installation_report_as_dict(
self,
) -> Optional[Dict[str, str]]

Retrieve the virtual environment creation report for this PythonEnvironmentComputeNode.

PythonEnvironmentComputeNodeDefinition

PythonEnvironmentComputeNodeDefinition(
name: str,
requirements_txt: str,
id: Optional[str] = None,
)

Class representing a Python Environment Definition.

Initialise a PythonEnvironmentComputeNodeDefinition.

Parameters:

  • name: Name of the PythonEnvironmentComputeNodeDefinition.
  • requirements_txt: Content of the requirements.txt file which lists the packages for the environment.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> PythonEnvironmentComputeNode

Construct a PythonEnvironmentComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Python Environment Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

PythonScript

PythonScript(
name: str,
content: str,
)

Class representing a Python script.

Ancestors (in MRO)

  • decentriq_platform.analytics.script.Script
  • decentriq_platform.analytics.script.FileContent

RComputeNode

RComputeNode(
id: str,
name: str,
dcr_id: str,
script: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
)

Class representing an R Computation node.

An RComputeNode is able to run arbitrary R code.

Initialise an RComputeNode:

Parameters:

  • id: ID of the RComputeNode.
  • name: Name of the RComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • script: The R computation.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition with which this node was built.
  • additional_files: Other files that can be used by the RComputeNode.
  • dependencies: Nodes that the RComputeNode depends on.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results should be written.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ContainerComputationNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

RComputeNodeDefinition

RComputeNodeDefinition(
name: str,
script: str,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
)

Class representing an R Compute Node Definition.

Initialise an RComputeNodeDefinition:

Parameters:

  • name: Name of the RComputeNodeDefinition.
  • script: The R computation as a string.
  • additional_files: Other files that can be used by the RComputeNodeDefinition.
  • dependencies: Nodes that the RComputeNodeDefinition depends on.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • output: Directory where the results should be written.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> RComputeNode

Construct an RComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the R Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

RScript

RScript(
name: str,
content: str,
)

Class representing an R script.

Ancestors (in MRO)

  • decentriq_platform.analytics.script.Script
  • decentriq_platform.analytics.script.FileContent

RawDataNode

RawDataNode(
id: str,
name: str,
is_required: bool,
dcr_id: str,
client: Client,
session: Session,
node_definition: RawDataNodeDefinition,
)

Class representing a Raw Data node.

Data that is provisioned to a Raw Data Node is assumed to be unstructured, which means that none of the SQL node types can read from such a Data Node. This is the preferred node type for data such as images or binary data. It can, of course, also be used for tabular data files such as CSV or Excel; in this case, however, the code reading from the Data Node has to interpret the data correctly.

Initialise a RawDataNode instance.

Parameters:

  • id: ID of the RawDataNode.
  • name: Name of the RawDataNode.
  • is_required: Flag determining if the RawDataNode must be present for dependent computations.
  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Raw Data Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.DataNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

RawDataNodeDefinition

RawDataNodeDefinition(
name: str,
is_required: bool,
id: Optional[str] = None,
)

Class representing a Raw Data Node Definition.

Initialise a RawDataNodeDefinition:

Parameters:

  • name: Name of the RawDataNodeDefinition.
  • is_required: Defines if the RawDataNodeDefinition is required.
  • id: Optional id of the RawDataNodeDefinition.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> RawDataNode

Construct a RawDataNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Raw Data Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

S3Provider

S3Provider(
*args,
**kwds,
)

str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.

Ancestors (in MRO)

  • builtins.str
  • enum.Enum

S3SinkComputeNode

S3SinkComputeNode(
id: str,
name: str,
dcr_id: str,
credentials_dependency_id: str,
endpoint: str,
region: str,
dependency: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
provider: Optional[S3Provider] = S3Provider.AWS,
)

Class representing an S3 Sink Computation node.

Initialise an S3SinkComputeNode:

Parameters:

  • id: ID of the S3SinkComputeNode.
  • name: Name of the S3SinkComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • credentials_dependency_id: ID of the S3SinkComputeNode dependency.
  • endpoint: Endpoint where data will be uploaded.
  • region: Region where the data will be uploaded.
  • dependency: Node that the S3SinkComputeNode depends on.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • node_definition: Definition of the S3 Sink Compute Node.
  • provider: Type of S3 provider (AWS/GCS).

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

S3SinkComputeNodeDefinition

S3SinkComputeNodeDefinition(
name: str,
credentials_dependency_id: str,
endpoint: str,
region: str,
dependency: str,
provider: S3Provider = S3Provider.AWS,
id: Optional[str] = None,
)

Class representing an S3 Sink Computation node.

Initialise an S3SinkComputeNodeDefinition:

Parameters:

  • name: Name of the S3SinkComputeNodeDefinition.
  • credentials_dependency_id: ID of the S3SinkComputeNodeDefinition dependency.
  • endpoint: Endpoint where data will be uploaded.
  • region: Region where the data will be uploaded.
  • dependency: Node that the S3SinkComputeNodeDefinition depends on.
  • provider: Type of S3 provider (AWS/GCS).
  • id: Optional ID of the S3SinkComputeNodeDefinition.
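
A sketch of a possible definition; "credentials" and "report" are hypothetical node names (the former is expected to provide the S3 access credentials, the latter the data to upload):

from decentriq_platform.analytics import S3Provider, S3SinkComputeNodeDefinition

s3_sink = S3SinkComputeNodeDefinition(
    name="export_to_s3",
    credentials_dependency_id="credentials",
    endpoint="https://s3.eu-west-1.amazonaws.com",
    region="eu-west-1",
    dependency="report",
    provider=S3Provider.AWS,
)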

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> S3SinkComputeNode

Construct an S3SinkComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the S3 Sink Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

SinkInputFormat

SinkInputFormat(
)

Factory for creating the desired SinkInput type.

Static methods

all

def all(
) -> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput

Store all files in a zip to the Decentriq Platform.

files

def files(
files: List[str],
) -> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput

Store the specified files in a zip to the Decentriq Platform.

raw

def raw(
) -> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput

Store a single raw file to the Decentriq Platform.

SqlComputeNode

SqlComputeNode(
id: str,
name: str,
dcr_id: str,
query: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
*,
dependencies: Optional[List[str]] = None,
minimum_rows_count: Optional[int] = None,
)

Class representing an SQL Computation Node.

Initialise a SqlComputeNode:

Parameters:

  • id: ID of the SqlComputeNode.
  • name: Name of the SqlComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • query: SQL query string.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • dependencies: Nodes that the SqlComputeNode depends on.
  • minimum_rows_count: Minimum number of rows required by the SqlComputeNode.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.StructuredOutputNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

SqlComputeNodeDefinition

SqlComputeNodeDefinition(
name: str,
query: str,
dependencies: Optional[List[str]] = None,
minimum_rows_count: Optional[int] = None,
id: Optional[str] = None,
)

Class representing an SQL Computation Node Definition.

Initialise a SqlComputeNodeDefinition:

Parameters:

  • name: Name of the SqlComputeNodeDefinition.
  • query: SQL query string.
  • dependencies: Node ids that the SQL node depends on.
  • minimum_rows_count: Minimum number of rows required by the SqlComputeNodeDefinition.
  • id: Optional ID of the SqlComputeNodeDefinition.
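
For example (a sketch; "customers" is a hypothetical TableDataNode name, and the query is assumed to see that dependency under its node name):

from decentriq_platform.analytics import SqlComputeNodeDefinition

aggregate = SqlComputeNodeDefinition(
    name="customer_counts",
    query="SELECT country, COUNT(*) AS n FROM customers GROUP BY country",
    dependencies=["customers"],
    minimum_rows_count=10,  # minimum number of rows required (see above)
)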

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> SqlComputeNode

Construct a SqlComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the SQL Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

SqliteComputeNode

SqliteComputeNode(
id: str,
name: str,
dcr_id: str,
query: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
dependencies: Optional[List[str]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
)

Class representing an SQLite Computation node.

Initialise a SqliteComputeNode:

Parameters:

  • id: ID of the SqliteComputeNode.
  • name: Name of the SqliteComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • query: SQLite query string.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • dependencies: Nodes that the SqliteComputeNode depends on.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.StructuredOutputNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

SqliteComputeNodeDefinition

SqliteComputeNodeDefinition(
name: str,
query: str,
dependencies: Optional[List[str]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
id: Optional[str] = None,
)

Class representing an SQLite Computation Node Definition.

Initialise a SqliteComputeNodeDefinition:

Parameters:

  • name: Name of the SqliteComputeNodeDefinition.
  • query: SQLite query string.
  • dependencies: Mappings between node id and the table name under which they should be made available.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • id: Optional ID of the SqliteComputeNodeDefinition.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> SqliteComputeNode

Construct a SqliteComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the SQLite Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

SyntheticDataComputeNode

SyntheticDataComputeNode(
id: str,
name: str,
dcr_id: str,
columns: List[SyntheticNodeColumn],
dependency: str,
epsilon: float,
client: Client,
session: Session,
node_definition: NodeDefinition,
output_original_data_statistics: bool = False,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
)

Class representing a Synthetic Data Computation Node.

Initialise a SyntheticDataComputeNode:

Parameters:

  • id: ID of the SyntheticDataComputeNode.
  • name: Name of the SyntheticDataComputeNode.
  • dcr_id: ID of the DCR the node is a member of.
  • columns: Columns defined for the SyntheticDataComputeNode.
  • dependency: Node that the SyntheticDataComputeNode depends on.
  • epsilon: Amount of noise to add to the data.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.
  • output_original_data_statistics: Include the original statistics in the output.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.ContainerComputationNode
  • decentriq_platform.analytics.high_level_node.ComputationNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

get_results_as_string

def get_results_as_string(
self,
interval: int = 5,
timeout: Optional[int] = None,
) -> Optional[str]

Retrieve the results of a computation as a string.

Parameters:

  • interval: Time interval (in seconds) to check for results.
  • timeout: Time (in seconds) after which results are no longer checked.

run_computation_and_get_results_as_string

def run_computation_and_get_results_as_string(
self,
interval: int = 5,
timeout: Optional[int] = None,
) -> Optional[str]

This is a blocking call to run a computation and get the results as a string.

Parameters:

  • interval: Time interval (in seconds) to check for results.
  • timeout: Time (in seconds) after which results are no longer checked.

SyntheticDataComputeNodeDefinition

SyntheticDataComputeNodeDefinition(
name: str,
columns: List[SyntheticNodeColumn],
dependency: str,
epsilon: float,
output_original_data_statistics: bool = False,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
id: Optional[str] = None,
)

Class representing a Synthetic Data Computation node.

Initialise a SyntheticDataComputeNodeDefinition:

Parameters:

  • name: Name of the SyntheticDataComputeNodeDefinition.
  • columns: Columns defined for the SyntheticDataComputeNodeDefinition.
  • dependency: Node that the SyntheticDataComputeNodeDefinition depends on.
  • epsilon: Amount of noise to add to the data.
  • output_original_data_statistics: Include the original statistics in the output.
  • enable_logs_on_error: Enable logs in the event of an error.
  • enable_logs_on_success: Enable logs when the computation is successful.
  • id: Optional ID of the SyntheticDataComputeNodeDefinition.

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> SyntheticDataComputeNode

Construct a SyntheticDataComputeNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Synthetic Data Compute Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

SyntheticNodeColumn

SyntheticNodeColumn(
data_type: PrimitiveType,
index: int,
mask_type: MaskType,
should_mask_column: bool,
is_nullable: bool = True,
name: Optional[Optional[str]] = None,
)

TableDataNode

TableDataNode(
id: str,
name: str,
columns: List[Column],
is_required: bool,
dcr_id: str,
client: Client,
session: Session,
node_definition: TableDataNodeDefinition,
)

Class representing a Table Data Node.

Initialise a TableDataNode instance.

Parameters:

  • id: ID of the TableDataNode.
  • name: Name of the TableDataNode.
  • columns: Definition of the columns that make up the TableDataNode.
  • is_required: Flag determining if the TableDataNode must be present for dependent computations.
  • dcr_id: ID of the DCR the node is a member of.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.

Ancestors (in MRO)

  • decentriq_platform.analytics.high_level_node.DataNode
  • decentriq_platform.analytics.high_level_node.HighLevelNode
  • abc.ABC

get_validation_report_as_dict

def get_validation_report_as_dict(
self,
) -> Optional[Dict[str, str]]

Retrieve the validation report corresponding to this TableDataNode.

publish_dataset

def publish_dataset(
self,
manifest_hash: str,
key: Key,
)

Publish data to the TableDataNode.

Parameters:

  • manifest_hash: Hash identifying the dataset to be published.
  • key: Encryption key used to decrypt the dataset.
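
A hedged sketch of provisioning a dataset to a table node. Only publish_dataset, get_node, and read_input_csv_file are documented on this page; the Key class and the client's dataset upload method are assumptions about the wider SDK and may differ in name or signature:

from decentriq_platform import Key  # assumed import path
from decentriq_platform.analytics import read_input_csv_file

data = read_input_csv_file("customers.csv", has_header=True)  # hypothetical file
key = Key()  # assumed to generate a new encryption key

# upload_dataset is assumed to return the manifest hash of the uploaded dataset.
manifest_hash = client.upload_dataset(data, key, "customers.csv")

# "customers" is a hypothetical TableDataNode name in an existing AnalyticsDcr `dcr`.
table_node = dcr.get_node("customers")
table_node.publish_dataset(manifest_hash, key)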

remove_published_dataset

def remove_published_dataset(
self,
) -> None

Remove any dataset that is published to this node.

TableDataNodeDefinition

TableDataNodeDefinition(
name: str,
columns: List[Column],
is_required: bool,
id: Optional[str] = None,
unique_column_combinations: list[list[int]] = [],
)

Class representing a Table Data Node Definition.

Initialise a TableDataNodeDefinition instance.

Parameters:

  • name: Name of the TableDataNodeDefinition.
  • columns: Definition of the columns that make up the TableDataNodeDefinition.
  • is_required: Flag determining if the TableDataNodeDefinition must be present for dependent computations.
  • id: Optional ID of the TableDataNodeDefinition.
  • unique_column_combinations: Check that the given combinations of columns are unique across the dataset. This should be a list of lists, where each inner list contains the 0-based column indices. For example, passing [[0], [0, 1]] would make the enclave check that all values in the first column are unique, and further that all tuples formed by the first and second column are unique.
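
For example (a sketch; the FormatType members used below, such as STRING and INTEGER, are assumptions and should be checked against the enum in your SDK version):

from decentriq_platform.analytics import (
    Column,
    FormatType,
    TableDataNodeDefinition,
)

columns = [
    Column(format_type=FormatType.STRING, name="customer_id", is_nullable=False),
    Column(format_type=FormatType.INTEGER, name="age", is_nullable=True),
]

customers = TableDataNodeDefinition(
    name="customers",
    columns=columns,
    is_required=True,
    unique_column_combinations=[[0]],  # values in column 0 (customer_id) must be unique
)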

Ancestors (in MRO)

  • decentriq_platform.analytics.node_definitions.NodeDefinition
  • abc.ABC

Instance variables

required_workers :

build

def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) -> TableDataNode

Construct a TableDataNode from the Node Definition.

Parameters:

  • dcr_id: ID of the DCR the node is a member of.
  • node_definition: Definition of the Table Data Node.
  • client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
  • session: The session with which to communicate with the enclave.