decentriq_platform.analytics
Sub-modules
- decentriq_platform.analytics.python_environment_compute_nodes
Functions
read_input_csv_file
def read_input_csv_file(
path: str,
/,
*,
has_header: bool = True,
check_header: bool = True,
encoding='utf-8',
**kwargs,
) ‑> _io.BytesIO
Read CSV from a file and turn it into a bytes array of the correct format so that it can be uploaded to the Decentriq platform.
Parameters:
path
: The path to the CSV file.
has_header
: Whether the file contains a header row.
check_header
: Whether the function should try to determine whether the file has a header. If the file has a header row but you didn't set the has_header flag, an exception will be raised. If you're sure that the way you use the function is correct, you can disable this check using this parameter.
encoding
: The encoding of the CSV file. If you wrote the CSV file using a library like pandas, check its documentation to see what encoding it uses by default when writing files (likely "utf-8", in which case this can be left at its default value).
delimiter
: The delimiter used in the CSV file. Default is the comma.
**kwargs
: Additional keyword arguments passed to the Python CSV parser. Refer to the official documentation for a list of supported arguments.
Returns: A BytesIO object that can be passed to the methods responsible for uploading data.
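To illustrate what such a helper does, here is a minimal stdlib-only sketch (not the actual implementation, which performs additional normalization before upload) that re-serializes a CSV file into an in-memory, upload-ready buffer:

```python
import csv
import io

def read_csv_as_bytes(path: str, *, encoding: str = "utf-8",
                      delimiter: str = ",") -> io.BytesIO:
    """Illustrative sketch: re-serialize a CSV file into an in-memory buffer.

    Mimics the general shape of read_input_csv_file; the real function
    additionally validates headers and formats the data for the platform.
    """
    out = io.StringIO()
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f, delimiter=delimiter)
        writer = csv.writer(out, delimiter=",")  # normalize to comma-separated
        for row in reader:
            writer.writerow(row)
    return io.BytesIO(out.getvalue().encode("utf-8"))
```

The returned `BytesIO` can then be handed to whatever upload method expects a binary stream.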
Classes
AnalyticsDcr
AnalyticsDcr(
session: Session,
dcr_id: str,
high_level: Dict[str, str],
nodes: List[NodeDefinition],
*,
client: Client,
)
A class representing an Analytics DCR.
Initialise an Analytics DCR.
Parameters:
session
: A Session object which can be used for communication with the enclave.
dcr_id
: ID of the Analytics DCR.
high_level
: High-level representation of the Analytics DCR.
nodes
: List of Data Node Definitions in the Analytics DCR.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
get_node
def get_node(
self,
name: str,
) ‑> Union[decentriq_platform.analytics.high_level_node.ComputationNode, decentriq_platform.analytics.high_level_node.DataNode, ForwardRef(None)]
Retrieve a node with the given name.
Parameters:
name
: Node name.
participants
def participants(
self,
) ‑> List[str]
Retrieve the participants of the Analytics DCR as a list.
retrieve_audit_log
def retrieve_audit_log(
self,
) ‑> str
Retrieve the Analytics DCR audit log.
stop
def stop(
self,
)
Stop the Analytics DCR.
AnalyticsDcrBuilder
AnalyticsDcrBuilder(
*,
client: Client,
enclave_specs: Optional[Dict[str, EnclaveSpecification]] = None,
)
A builder for constructing Analytics Data Clean Rooms.
Initialise an Analytics DCR builder.
Parameters:
client
: A Client object that can be used to retrieve information about the platform.
enclave_specs
: Determines the types of enclaves that will be supported by this Data Clean Room. If not specified, the latest enclave specifications known to this SDK version will be used.
Instance variables
node_definitions
: The current list of Node Definitions that will be added to the Data Clean Room.
permissions
: The list of permissions that will be added to the Data Clean Room.
add_node_definition
def add_node_definition(
self,
definition: NodeDefinition,
) ‑> Self
Add a single node definition to this builder.
A node definition defines how a Compute or Data Node should be constructed.
add_node_definitions
def add_node_definitions(
self,
definitions: List[NodeDefinition],
) ‑> Self
Add a list of node definitions to this builder.
Each node definition defines how the respective Compute or Data Node should be constructed.
add_participant
def add_participant(
self,
email: str,
*,
analyst_of: List[str] = [],
data_owner_of: List[str] = [],
) ‑> Self
Add a participant to the DCR being built.
If the participant isn't assigned a role, the user can still view the DCR but cannot interact with it.
Parameters:
email
: The email address of the participant.
analyst_of
: The names of the Compute Nodes that the user can run.
data_owner_of
: The names of the Data Nodes to which the user can connect a dataset.
build
def build(
self,
) ‑> decentriq_platform.analytics.analytics_dcr.AnalyticsDcrDefinition
Build the Data Clean Room.
In order to use the DCR, the output of this method should be passed to client.publish_analytics_dcr.
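The builder exposes a fluent interface: each configuration call returns the builder itself, so calls chain until build() produces the definition. The following stand-in sketch (hypothetical simplified classes, not the real SDK types) illustrates that call pattern:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical stand-in types -- NOT the real SDK classes; they only
# illustrate the documented fluent call pattern.
@dataclass
class DcrSketch:
    name: str = ""
    owner: str = ""
    participants: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)

class DcrBuilderSketch:
    def __init__(self) -> None:
        self._dcr = DcrSketch()

    def with_name(self, name: str) -> "DcrBuilderSketch":
        self._dcr.name = name
        return self  # each setter returns self, enabling chaining

    def with_owner(self, email: str) -> "DcrBuilderSketch":
        self._dcr.owner = email
        return self

    def add_participant(self, email: str, *, analyst_of=(), data_owner_of=()):
        # A participant may be an analyst of compute nodes and/or a data
        # owner of data nodes; with no role they can only view the DCR.
        self._dcr.participants[email] = {
            "analyst_of": list(analyst_of),
            "data_owner_of": list(data_owner_of),
        }
        return self

    def build(self) -> DcrSketch:
        return self._dcr

dcr = (
    DcrBuilderSketch()
    .with_name("My DCR")
    .with_owner("owner@example.com")
    .add_participant("analyst@example.com", analyst_of=["python_node"])
    .build()
)
```

With the real builder the same chain ends in build(), whose result is then passed to client.publish_analytics_dcr.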
with_description
def with_description(
self,
description: str,
) ‑> Self
Set the description of the Data Clean Room.
Parameters:
description
: Description of the Data Clean Room.
with_name
def with_name(
self,
name: str,
) ‑> Self
Set the name of the Data Clean Room.
Parameters:
name
: Name to be used for the Data Clean Room.
with_owner
def with_owner(
self,
email: str,
) ‑> Self
Set the owner of the Data Clean Room.
Parameters:
email
: The email address of the owner of the Data Clean Room.
AnalyticsDcrDefinition
AnalyticsDcrDefinition(
name: str,
high_level: Dict[str, Any],
enclave_specs: Optional[Dict[str, EnclaveSpecification]] = None,
)
A class representing an Analytics DCR Definition.
Column
Column(
format_type: FormatType,
name: str,
is_nullable: bool,
hash_with: Optional[HashingAlgorithm] = None,
in_range: Optional[NumericRangeRule] = None,
)
DatasetSinkComputationNode
DatasetSinkComputationNode(
**data: Any,
)
Usage docs: https://docs.pydantic.dev/2.9/concepts/models/
A base class for creating Pydantic models.
Create a new model by parsing and validating input data from keyword arguments. Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
Ancestors (in MRO)
- pydantic.main.BaseModel
DatasetSinkComputeNodeDefinition
DatasetSinkComputeNodeDefinition(
name: str,
dataset_name: str,
dependency: str,
encryption_key_dependency: str,
input_type: SinkInput,
is_key_hex_encoded: Optional[bool] = False,
id: Optional[str] = None,
)
Helper class that provides a standard way to create an ABC using inheritance.
Initialise a DatasetSinkComputeNodeDefinition.
This class is used to construct DatasetSinkComputeNodes.
Parameters:
name
: Name of the DatasetSinkComputeNodeDefinition.
dataset_name
: Name of the dataset when it is stored in the Decentriq Platform.
dependency
: Name of the node whose data will be stored.
encryption_key_dependency
: Name of the node storing the encryption key that will be used to encrypt the dataset in the Decentriq Platform.
input_type
: The type of input data to be stored (raw, list of files in a zip, entire zip contents).
is_key_hex_encoded
: Flag indicating whether the encryption key is hex-encoded (False indicates raw bytes).
id
: Optional ID of the dataset sink node.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> DatasetSinkComputeNode
Construct a DatasetSinkComputeNode from the definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the DatasetSink Compute Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
FileContent
FileContent(
name: str,
content: str,
)
Descendants
- decentriq_platform.analytics.script.Script
FormatType
FormatType(
*args,
**kwds,
)
A string-valued enumeration of column format types.
Ancestors (in MRO)
- builtins.str
- enum.Enum
Static methods
from_primitive_type
def from_primitive_type(
tpe: str,
) ‑> decentriq_platform.analytics.table_data_nodes.FormatType
to_primitive_type
def to_primitive_type(
fmt: FormatType,
) ‑> str
MaskType
MaskType(
*args,
**kwds,
)
A string-valued enumeration of mask types.
Ancestors (in MRO)
- builtins.str
- enum.Enum
MatchingComputeNode
MatchingComputeNode(
id: str,
name: str,
dcr_id: str,
config: MatchingComputeNodeConfig,
dependencies: List[str],
client: Client,
session: Session,
node_definition: NodeDefinition,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
)
Class representing a Matching Compute Node.
A MatchingComputeNode can be used to join two datasets based on common columns.
Initialise a MatchingComputeNode:
Parameters:
name
: Name of the MatchingComputeNode.
dcr_id
: ID of the DCR this node is part of.
config
: Configuration of the MatchingComputeNode.
dependencies
: Nodes that the MatchingComputeNode depends on.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition with which this node was built.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results will be written.
id
: Optional ID of the MatchingComputeNode.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ContainerComputationNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
MatchingComputeNodeConfig
MatchingComputeNodeConfig(
query: List[str],
round: int,
epsilon: int,
sensitivity: int,
dependency_paths: List[str],
)
MatchingComputeNodeDefinition
MatchingComputeNodeDefinition(
name: str,
config: MatchingComputeNodeConfig,
dependencies: List[str],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
)
Class defining the structure of a MatchingComputeNode.
A MatchingComputeNode can be used to join two datasets based on common columns.
Initialise a MatchingComputeNodeDefinition:
Parameters:
name
: Name of the MatchingComputeNodeDefinition.
config
: Configuration of the MatchingComputeNodeDefinition.
dependencies
: Nodes that the MatchingComputeNodeDefinition depends on.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results will be written.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> MatchingComputeNode
Construct a MatchingComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the Matching Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
PreviewComputeNode
PreviewComputeNode(
id: str,
name: str,
dcr_id: str,
dependency: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
quota_bytes: Optional[int] = 0,
)
Class representing a Preview (Airlock) Computation node.
Initialise a PreviewComputeNode:
Parameters:
name
: Name of the PreviewComputeNode.
dcr_id
: ID of the DCR this node is part of.
dependency
: Node that the PreviewComputeNode depends on.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition of the Preview Node.
quota_bytes
: Threshold for the amount of data that can be previewed.
id
: ID of the PreviewComputeNode.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
PreviewComputeNodeDefinition
PreviewComputeNodeDefinition(
name: str,
dependency: str,
quota_bytes: Optional[int] = 0,
id: Optional[str] = None,
)
Class representing a Preview (Airlock) Compute Node Definition.
Initialise a PreviewComputeNodeDefinition:
Parameters:
name
: Name of the PreviewComputeNodeDefinition.
dependency
: Node that the PreviewComputeNodeDefinition depends on.
quota_bytes
: Threshold for the amount of data that can be previewed.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> PreviewComputeNode
Construct a PreviewComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the Preview Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
PrimitiveType
PrimitiveType(
*args,
**kwds,
)
A string-valued enumeration of primitive data types.
Ancestors (in MRO)
- builtins.str
- enum.Enum
PythonComputeNode
PythonComputeNode(
id: str,
name: str,
dcr_id: str,
script: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
dependencies: List[str] = [],
additional_files: Optional[List[FileContent]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
custom_environment: Optional[str] = None,
)
A PythonComputeNode is a node that is able to run arbitrary Python code.
Initialise a PythonComputeNode:
Parameters:
id
: ID of the PythonComputeNode.
name
: Name of the PythonComputeNode.
dcr_id
: ID of the DCR the node is a member of.
script
: The Python computation as a string.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition with which this node was built.
dependencies
: Nodes that the PythonComputeNode depends on.
additional_files
: Other files that can be used by the PythonComputeNode.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results will be written.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ContainerComputationNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
PythonComputeNodeDefinition
PythonComputeNodeDefinition(
name: str,
script: str,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
custom_environment: Optional[str] = None,
)
Initialise a PythonComputeNodeDefinition.
This class is used to construct PythonComputeNodes.
Parameters:
name
: Name of the PythonComputeNodeDefinition.
script
: The Python computation as a string.
additional_files
: Other files that can be used by the PythonComputeNodeDefinition.
dependencies
: Nodes that the PythonComputeNodeDefinition depends on.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results will be written.
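The script parameter is plain Python source that is executed inside the enclave, with results conventionally written under the node's output directory (default "/output"). A hedged sketch of such a script follows; the output path is injected as a variable here so the snippet runs locally, and the in-enclave file layout is an assumption:

```python
import os
import tempfile

# Source for a hypothetical Python compute node. Inside the enclave the
# results directory would be the node's `output` parameter; here OUTPUT_DIR
# is injected so the sketch can run anywhere.
SCRIPT = """
import json, os

result = {"row_count": 3}  # stand-in for a real computation over dependency data
with open(os.path.join(OUTPUT_DIR, "result.json"), "w") as f:
    json.dump(result, f)
"""

# Execute the script locally for illustration, standing in for the enclave runtime.
out_dir = tempfile.mkdtemp()
exec(SCRIPT, {"OUTPUT_DIR": out_dir})
print(sorted(os.listdir(out_dir)))
```

In the real SDK the string would simply be passed as the script parameter of PythonComputeNodeDefinition.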
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> PythonComputeNode
Construct a PythonComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the Python Compute Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
PythonEnvironmentComputeNode
PythonEnvironmentComputeNode(
id: str,
name: str,
requirements_txt: str,
dcr_id: str,
session: Session,
node_definition: NodeDefinition,
*,
client: Client,
)
Class representing an Environment.
Initialise a PythonEnvironmentComputeNode.
Parameters:
id
: ID of the PythonEnvironmentComputeNode.
name
: Name of the PythonEnvironmentComputeNode.
requirements_txt
: Content of the requirements.txt file which lists the packages for the environment.
dcr_id
: ID of the DCR the node is a member of.
session
: The session with which to communicate with the enclave.
node_definition
: Definition with which the node was built.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ContainerComputationNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
get_installation_report_as_dict
def get_installation_report_as_dict(
self,
) ‑> Optional[Dict[str, str]]
Retrieve the virtual environment creation report for this PythonEnvironmentComputeNode.
PythonEnvironmentComputeNodeDefinition
PythonEnvironmentComputeNodeDefinition(
name: str,
requirements_txt: str,
id: Optional[str] = None,
)
Class representing a Python Environment Definition.
Initialise a PythonEnvironmentComputeNodeDefinition.
Parameters:
name
: Name of the PythonEnvironmentComputeNodeDefinition.
requirements_txt
: Content of the requirements.txt file which lists the packages for the environment.
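Since requirements_txt is simply the file's content as a string, a hypothetical value might pin packages like this (the package names and versions are examples only):

```python
# Hypothetical requirements.txt content for an environment definition;
# one pinned package per line, as in a regular requirements file.
requirements_txt = "\n".join([
    "numpy==1.26.4",
    "pandas==2.2.2",
])
```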
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> PythonEnvironmentComputeNode
Construct a PythonEnvironmentComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the Python Environment Compute Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
PythonScript
PythonScript(
name: str,
content: str,
)
Class representing a Python script.
Ancestors (in MRO)
- decentriq_platform.analytics.script.Script
- decentriq_platform.analytics.script.FileContent
RComputeNode
RComputeNode(
id: str,
name: str,
dcr_id: str,
script: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
)
Class representing an R Computation node.
An RComputeNode is able to run arbitrary R code.
Initialise an RComputeNode:
Parameters:
id
: ID of the RComputeNode.
name
: Name of the RComputeNode.
dcr_id
: ID of the DCR the node is a member of.
script
: The R computation as a string.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition with which this node was built.
additional_files
: Other files that can be used by the RComputeNode.
dependencies
: Nodes that the RComputeNode depends on.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results should be written.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ContainerComputationNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
RComputeNodeDefinition
RComputeNodeDefinition(
name: str,
script: str,
additional_files: Optional[List[FileContent]] = None,
dependencies: List[str] = [],
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
output: Optional[str] = '/output',
id: Optional[str] = None,
)
Initialise an RComputeNodeDefinition:
Parameters:
name
: Name of the RComputeNodeDefinition.
script
: The R computation as a string.
additional_files
: Other files that can be used by the RComputeNodeDefinition.
dependencies
: Nodes that the RComputeNodeDefinition depends on.
enable_logs_on_error
: Enable logs in the event of an error.
enable_logs_on_success
: Enable logs when the computation is successful.
output
: Directory where the results should be written.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> RComputeNode
Construct an RComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the R Compute Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
RScript
RScript(
name: str,
content: str,
)
Class representing an R script.
Ancestors (in MRO)
- decentriq_platform.analytics.script.Script
- decentriq_platform.analytics.script.FileContent
RawDataNode
RawDataNode(
id: str,
name: str,
is_required: bool,
dcr_id: str,
client: Client,
session: Session,
node_definition: RawDataNodeDefinition,
)
Class representing a Raw Data node.
Data that is provisioned to a Raw Data Node is assumed to be unstructured. This means that any of the SQL node types cannot read from such a Data Node. This is the preferred node type for data such as images or binary data. It can, of course, also be used for tabular data files such as CSV or Excel. In this case, however, the code reading from the Data Node will have to interpret the data correctly.
Initialise a RawDataNode instance.
Parameters:
id
: ID of the RawDataNode.
name
: Name of the RawDataNode.
is_required
: Flag determining whether the RawDataNode must be present for dependent computations.
dcr_id
: ID of the DCR the node is a member of.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition of the Raw Data Node.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.DataNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
RawDataNodeDefinition
RawDataNodeDefinition(
name: str,
is_required: bool,
id: Optional[str] = None,
)
Initialise a RawDataNodeDefinition:
Parameters:
name
: Name of the RawDataNodeDefinition.
is_required
: Defines whether the RawDataNodeDefinition is required.
id
: Optional ID of the RawDataNodeDefinition.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> RawDataNode
Construct a RawDataNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the Raw Data Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
S3Provider
S3Provider(
*args,
**kwds,
)
A string-valued enumeration of supported S3 providers (AWS/GCS).
Ancestors (in MRO)
- builtins.str
- enum.Enum
S3SinkComputeNode
S3SinkComputeNode(
id: str,
name: str,
dcr_id: str,
credentials_dependency_id: str,
endpoint: str,
region: str,
dependency: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
provider: Optional[S3Provider] = S3Provider.AWS,
)
Class representing an S3 Sink Computation node.
Initialise an S3SinkComputeNode:
Parameters:
id
: ID of the S3SinkComputeNode.
name
: Name of the S3SinkComputeNode.
dcr_id
: ID of the DCR the node is a member of.
credentials_dependency_id
: ID of the S3SinkComputeNode dependency.
endpoint
: Endpoint where data will be uploaded.
region
: Region where the data will be uploaded.
dependency
: Node that the S3SinkComputeNode depends on.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
node_definition
: Definition of the S3 Sink Compute Node.
provider
: Type of S3 provider (AWS/GCS).
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
S3SinkComputeNodeDefinition
S3SinkComputeNodeDefinition(
name: str,
credentials_dependency_id: str,
endpoint: str,
region: str,
dependency: str,
provider: S3Provider = S3Provider.AWS,
id: Optional[str] = None,
)
Class representing an S3 Sink Computation node.
Initialise an S3SinkComputeNodeDefinition:
Parameters:
name
: Name of the S3SinkComputeNodeDefinition.
credentials_dependency_id
: ID of the S3SinkComputeNodeDefinition dependency.
endpoint
: Endpoint where data will be uploaded.
region
: Region where the data will be uploaded.
dependency
: Node that the S3SinkComputeNodeDefinition depends on.
provider
: Type of S3 provider (AWS/GCS).
id
: Optional ID of the S3SinkComputeNodeDefinition.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> S3SinkComputeNode
Construct an S3SinkComputeNode from the Node Definition.
Parameters:
dcr_id
: ID of the DCR the node is a member of.
node_definition
: Definition of the S3 Sink Compute Node.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
SinkInputFormat
SinkInputFormat(
)
Factory for creating the desired SinkInput type.
Static methods
all
def all(
) ‑> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput
Store all files in a zip to the Decentriq Platform.
files
def files(
files: List[str],
) ‑> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput
Store the specified files in a zip to the Decentriq Platform.
raw
def raw(
) ‑> decentriq_platform.analytics.dataset_sink_compute_nodes.SinkInput
Store a single raw file to the Decentriq Platform.
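The three factory methods correspond to the three input shapes a dataset sink accepts: everything in a zip (all), selected files from a zip (files), or a single raw file (raw). A hypothetical stand-in showing the same factory-method pattern with plain tagged values (not the real SinkInput type):

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for SinkInput: a tagged value describing which part
# of an upstream result should be stored.
@dataclass
class SinkInputSketch:
    kind: str                        # "all" | "files" | "raw"
    files: Optional[List[str]] = None

class SinkInputFormatSketch:
    @staticmethod
    def all() -> SinkInputSketch:
        # Store all files in the upstream zip.
        return SinkInputSketch(kind="all")

    @staticmethod
    def files(files: List[str]) -> SinkInputSketch:
        # Store only the named files from the upstream zip.
        return SinkInputSketch(kind="files", files=files)

    @staticmethod
    def raw() -> SinkInputSketch:
        # Store a single raw file.
        return SinkInputSketch(kind="raw")

selected = SinkInputFormatSketch.files(["report.csv", "model.bin"])
```

The resulting value plays the role of the input_type parameter of DatasetSinkComputeNodeDefinition.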
SqlComputeNode
SqlComputeNode(
id: str,
name: str,
dcr_id: str,
query: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
*,
dependencies: Optional[List[str]] = None,
minimum_rows_count: Optional[int] = None,
)
Class representing an SQL Computation Node.
Initialise an SqlComputeNode:
Parameters:
id
: ID of the SqlComputeNode.
name
: Name of the SqlComputeNode.
dcr_id
: ID of the DCR the node is a member of.
query
: SQL query string.
client
: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
session
: The session with which to communicate with the enclave.
dependencies
: Nodes that the SqlComputeNode depends on.
minimum_rows_count
: Minimum number of rows required by the SqlComputeNode.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.StructuredOutputNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
SqlComputeNodeDefinition
SqlComputeNodeDefinition(
name: str,
query: str,
dependencies: Optional[List[str]] = None,
minimum_rows_count: Optional[int] = None,
id: Optional[str] = None,
)
Class representing an SQL Computation Node Definition.
Initialise an SqlComputeNodeDefinition:
Parameters:
name
: Name of the SqlComputeNodeDefinition.
query
: SQL query string.
dependencies
: Node IDs that the SQL node depends on.
minimum_rows_count
: Minimum number of rows required by the SqlComputeNodeDefinition.
id
: Optional ID of the SqlComputeNodeDefinition.
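As an illustration of the kind of query such a node evaluates, the sketch below uses stdlib sqlite3 as a stand-in for the enclave's SQL engine; the table name dataset_a is a hypothetical dependency name, standing in for a table exposed by an upstream node:

```python
import sqlite3

# Stand-in for the enclave SQL engine: each dependency is exposed as a table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_a (age INTEGER)")
conn.executemany("INSERT INTO dataset_a VALUES (?)", [(25,), (31,), (47,)])

# A query string of the kind passed as `query` to SqlComputeNodeDefinition.
query = "SELECT COUNT(*) FROM dataset_a WHERE age >= 30"
(count,) = conn.execute(query).fetchone()
print(count)
```

In the real SDK only the query string is supplied; the platform wires the dependency tables and runs the query inside the enclave.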
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
:
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> SqlComputeNode
Construct a SqlComputeNode from the Node Definition.
Parameters:
- dcr_id: ID of the DCR the node is a member of.
- node_definition: Definition of the SQL Compute Node.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
SqliteComputeNode
SqliteComputeNode(
id: str,
name: str,
dcr_id: str,
query: str,
client: Client,
session: Session,
node_definition: NodeDefinition,
dependencies: Optional[List[str]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
)
Class representing an SQLite Computation Node.
Initialise a SqliteComputeNode.
Parameters:
- id: ID of the SqliteComputeNode.
- name: Name of the SqliteComputeNode.
- dcr_id: ID of the DCR the node is a member of.
- query: SQLite query string.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
- dependencies: Nodes that the SqliteComputeNode depends on.
- enable_logs_on_error: Enable logs in the event of an error.
- enable_logs_on_success: Enable logs when the computation is successful.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.StructuredOutputNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
SqliteComputeNodeDefinition
SqliteComputeNodeDefinition(
name: str,
query: str,
dependencies: Optional[List[str]] = None,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
id: Optional[str] = None,
)
Class representing an SQLite Computation Node Definition.
Initialise a SqliteComputeNodeDefinition.
Parameters:
- name: Name of the SqliteComputeNodeDefinition.
- query: SQLite query string.
- dependencies: Mappings between node ids and the table names under which they should be made available.
- enable_logs_on_error: Enable logs in the event of an error.
- enable_logs_on_success: Enable logs when the computation is successful.
- id: Optional ID of the SqliteComputeNodeDefinition.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> SqliteComputeNode
Construct a SqliteComputeNode from the Node Definition.
Parameters:
- dcr_id: ID of the DCR the node is a member of.
- node_definition: Definition of the SQLite Compute Node.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
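As an illustration of the kind of query such a node runs, the following standalone sketch emulates a dependency table with Python's built-in sqlite3 module. The table name users and its contents are made up for this example; on the platform, dependency datasets are instead exposed as tables inside the enclave.

```python
import sqlite3

# Standalone illustration only: emulate a dependency table ("users" is a
# made-up name) in an in-memory database and run a typical aggregate query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "ch"), ("bob", "ch"), ("carol", "de")],
)

query = "SELECT country, COUNT(*) AS n FROM users GROUP BY country ORDER BY country"
rows = conn.execute(query).fetchall()
print(rows)  # [('ch', 2), ('de', 1)]
```

The query string here is exactly what would be passed as the query parameter of a SqliteComputeNodeDefinition.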
SyntheticDataComputeNode
SyntheticDataComputeNode(
id: str,
name: str,
dcr_id: str,
columns: List[SyntheticNodeColumn],
dependency: str,
epsilon: float,
client: Client,
session: Session,
node_definition: NodeDefinition,
output_original_data_statistics: bool = False,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
)
Class representing a Synthetic Data Computation Node.
Initialise a SyntheticDataComputeNode.
Parameters:
- id: ID of the SyntheticDataComputeNode.
- name: Name of the SyntheticDataComputeNode.
- dcr_id: ID of the DCR the node is a member of.
- columns: Columns defined for the SyntheticDataComputeNode.
- dependency: Node that the SyntheticDataComputeNode depends on.
- epsilon: Amount of noise to add to the data.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
- output_original_data_statistics: Include the original statistics in the output.
- enable_logs_on_error: Enable logs in the event of an error.
- enable_logs_on_success: Enable logs when the computation is successful.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.ContainerComputationNode
- decentriq_platform.analytics.high_level_node.ComputationNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
get_results_as_string
def get_results_as_string(
self,
interval: int = 5,
timeout: Optional[int] = None,
) ‑> Optional[str]
Retrieve the results of a computation as a string.
Parameters:
- interval: Time interval (in seconds) to check for results.
- timeout: Time (in seconds) after which results are no longer checked.
run_computation_and_get_results_as_string
def run_computation_and_get_results_as_string(
self,
interval: int = 5,
timeout: Optional[int] = None,
) ‑> Optional[str]
This is a blocking call to run a computation and get the results as a string.
Parameters:
- interval: Time interval (in seconds) to check for results.
- timeout: Time (in seconds) after which results are no longer checked.
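The interval/timeout pair follows a standard polling pattern: check every interval seconds and give up once timeout seconds have elapsed. The sketch below shows the general shape of such a loop; the fetch callable is a hypothetical stand-in for the enclave result lookup, not the platform's actual implementation.

```python
import time
from typing import Callable, Optional

def poll_for_result(
    fetch: Callable[[], Optional[str]],
    interval: int = 5,
    timeout: Optional[int] = None,
) -> Optional[str]:
    # Poll `fetch` every `interval` seconds until it returns a result,
    # or until `timeout` seconds have passed (None means wait forever).
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if deadline is not None and time.monotonic() >= deadline:
            return None  # results are no longer checked after the timeout
        time.sleep(interval)
```

With timeout=None the loop blocks indefinitely, which matches the default behaviour described above.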
SyntheticDataComputeNodeDefinition
SyntheticDataComputeNodeDefinition(
name: str,
columns: List[SyntheticNodeColumn],
dependency: str,
epsilon: float,
output_original_data_statistics: bool = False,
enable_logs_on_error: bool = False,
enable_logs_on_success: bool = False,
id: Optional[str] = None,
)
Class representing a Synthetic Data Computation Node.
Initialise a SyntheticDataComputeNodeDefinition.
Parameters:
- name: Name of the SyntheticDataComputeNodeDefinition.
- columns: Columns defined for the SyntheticDataComputeNodeDefinition.
- dependency: Node that the SyntheticDataComputeNodeDefinition depends on.
- epsilon: Amount of noise to add to the data.
- output_original_data_statistics: Include the original statistics in the output.
- enable_logs_on_error: Enable logs in the event of an error.
- enable_logs_on_success: Enable logs when the computation is successful.
- id: Optional ID of the SyntheticDataComputeNodeDefinition.
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> SyntheticDataComputeNode
Construct a SyntheticDataComputeNode from the Node Definition.
Parameters:
- dcr_id: ID of the DCR the node is a member of.
- node_definition: Definition of the Synthetic Data Compute Node.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
SyntheticNodeColumn
SyntheticNodeColumn(
data_type: PrimitiveType,
index: int,
mask_type: MaskType,
should_mask_column: bool,
is_nullable: bool = True,
name: Optional[Optional[str]] = None,
)
TableDataNode
TableDataNode(
id: str,
name: str,
columns: List[Column],
is_required: bool,
dcr_id: str,
client: Client,
session: Session,
node_definition: TableDataNodeDefinition,
)
Class representing a Table Data Node.
Initialise a TableDataNode instance.
Parameters:
- id: ID of the TableDataNode.
- name: Name of the TableDataNode.
- columns: Definition of the columns that make up the TableDataNode.
- is_required: Flag determining if the TableDataNode must be present for dependent computations.
- dcr_id: ID of the DCR the node is a member of.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.
Ancestors (in MRO)
- decentriq_platform.analytics.high_level_node.DataNode
- decentriq_platform.analytics.high_level_node.HighLevelNode
- abc.ABC
get_validation_report_as_dict
def get_validation_report_as_dict(
self,
) ‑> Optional[Dict[str, str]]
Retrieve the validation report corresponding to this TableDataNode.
publish_dataset
def publish_dataset(
self,
manifest_hash: str,
key: Key,
)
Publish data to the TableDataNode.
Parameters:
- manifest_hash: Hash identifying the dataset to be published.
- key: Encryption key used to decrypt the dataset.
remove_published_dataset
def remove_published_dataset(
self,
) ‑> None
Remove any dataset that is published to this node.
TableDataNodeDefinition
TableDataNodeDefinition(
name: str,
columns: List[Column],
is_required: bool,
id: Optional[str] = None,
unique_column_combinations: list[list[int]] = [],
)
Class representing a Table Data Node Definition.
Initialise a TableDataNodeDefinition instance.
Parameters:
- name: Name of the TableDataNodeDefinition.
- columns: Definition of the columns that make up the TableDataNodeDefinition.
- is_required: Flag determining if the TableDataNodeDefinition must be present for dependent computations.
- id: Optional ID of the TableDataNodeDefinition.
- unique_column_combinations: Check that the given combinations of columns are unique across the dataset. This should be a list of lists, where each inner list contains 0-based column indices. For example, passing [[0], [0, 1]] would result in the enclave checking that all values in the first column are unique, and further that all tuples formed by the first and second columns are unique.
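The semantics of this check can be sketched in plain Python (illustrative only; the actual validation runs inside the enclave):

```python
def check_unique_combinations(rows, combinations):
    """Return True if, for every index combination, the projected tuples
    are unique across all rows.

    Mirrors the semantics of unique_column_combinations: [[0], [0, 1]]
    checks column 0 alone, then columns (0, 1) together.
    """
    for combo in combinations:
        projected = [tuple(row[i] for i in combo) for row in rows]
        if len(projected) != len(set(projected)):
            return False
    return True

rows = [
    ("alice", "ch"),
    ("bob", "ch"),
    ("alice", "de"),
]
# Column 0 alone is not unique ("alice" appears twice) ...
print(check_unique_combinations(rows, [[0]]))     # False
# ... but the (column 0, column 1) pairs are all distinct.
print(check_unique_combinations(rows, [[0, 1]]))  # True
```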
Ancestors (in MRO)
- decentriq_platform.analytics.node_definitions.NodeDefinition
- abc.ABC
Instance variables
required_workers
build
def build(
self,
dcr_id: str,
node_definition: NodeDefinition,
client: Client,
session: Session,
) ‑> TableDataNode
Construct a TableDataNode from the Node Definition.
Parameters:
- dcr_id: ID of the DCR the node is a member of.
- node_definition: Definition of the Table Data Node.
- client: A Client object which can be used to perform operations such as uploading data and retrieving computation results.
- session: The session with which to communicate with the enclave.