Features
Computation requests: So far, data clean rooms have been immutable. This made it necessary to create new data clean rooms to run different computations on the same data. Now it is possible to request additional computations in existing data clean rooms which were not part of the original specification. To keep the data secure, the data owners have to approve each request first: when you create a request, the platform automatically determines the owners of the affected datasets and asks them for approval. Once all owners have approved, the new computation can be added to the data clean room.
Development tab: This is a scratch space that allows you to run arbitrary computations on data and results you already have access to without the need for approval. This allows prototyping computations which subsequently can be turned into a request.
Join then synthesize: Our UI now allows you to create synthetic data not only based on tables, but also based on the output of a SQL computation. This allows generating synthetic copies of joined datasets from multiple sources, enabling much richer synthetic datasets.
Privacy filter for individual computations: You can now set an independent k-anonymity privacy filter for each SQL computation; previously, this was only possible globally.
Mandatory datasets: You can now set which tables and files must be provisioned before computations that depend on them can be run.
New UI look: We did our spring cleaning and removed several borders and other unnecessary elements. We hope you like it too!
UX improvements: Several enhancements that improve the experience of writing and debugging code in the platform and of preparing the data clean room draft.
New Python SDK version: A new version of the Python SDK, version 0.11.0, has been released with support for all new features available in the platform.
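The approval rule for computation requests described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the SDK's API: the `ComputationRequest` class, the helper functions, the dataset names and the e-mail addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ComputationRequest:
    """Hypothetical model of a request to add a computation to an existing data clean room."""
    name: str
    datasets: List[str]                              # datasets the new computation reads
    approvals: Set[str] = field(default_factory=set)  # owners who have approved so far


def required_approvers(request: ComputationRequest, dataset_owners: Dict[str, str]) -> Set[str]:
    """Return the owners of every dataset the requested computation touches."""
    return {dataset_owners[dataset] for dataset in request.datasets}


def can_be_added(request: ComputationRequest, dataset_owners: Dict[str, str]) -> bool:
    """The computation may be added only once every affected owner has approved."""
    return required_approvers(request, dataset_owners) <= request.approvals


# Hypothetical datasets and owners.
owners = {
    "sales": "alice@example.com",
    "customers": "bob@example.com",
    "logs": "carol@example.com",
}
request = ComputationRequest(name="join_sales_customers", datasets=["sales", "customers"])

request.approvals.add("alice@example.com")
print(can_be_added(request, owners))  # False: bob has not approved yet

request.approvals.add("bob@example.com")
print(can_be_added(request, owners))  # True: carol owns no affected dataset
```

Only the owners of datasets that the new computation depends on are asked, so unrelated participants do not block the request.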
Bugfixes
Removed some performance bottlenecks for large data clean rooms
Pagination bug in ‘My Datasets’ page is fixed
The list of data clean rooms in the sidebar is now updated automatically after creating a new one
It is now possible to have multiple files as dependencies for computations
Enclave versions
Driver enclave
Version 4 (new):
Identifier:
decentriq.driver:v4
Hash:
d7af541df8de018effbe537269a624850868e93c6982acc31e3f81e5babaf9a5
Version 2:
Identifier:
decentriq.driver:v2
Hash:
e3c40c52ccaa92ab420d30d31afcbfe898861d9d67e7ebee2df261db9c0372c4
SQL worker
Version 4 (new):
Identifier:
decentriq.sql-worker:v4
Hash:
3cf64f9d80f538e7d1e48725c63b3ee2091d7cf5508e69e0db3a8ed2d98595c5
Version 2:
Identifier:
decentriq.sql-worker:v2
Hash:
3742070ece17bf1c0f37191bdbba4e1ebdb1be0f6194e1594a53bea49484f98d
Python worker (AWS Nitro-based):
Version 2 (new):
Identifier:
decentriq.python-ml-worker:v2
Hash:
0c3952473d23707bf8cd6a909c8162e2c9f13c5a5af34228518a1c4ad35d358e
Version 1:
Identifier:
decentriq.python-ml-worker:v1
Hash:
47dd5eee8bbebf33a25dfffd825196bcb4276f4fe06aef09e27d0bdf5f7da43c
R worker (AWS Nitro-based):
Version 2 (new):
Identifier:
decentriq.r-latex-worker:v2
Hash:
b58b53ecaa9636d9a1f75d471000226a556bb36c9238dd20e0b063163b5922f1
Synthetic data worker (AWS Nitro-based):
Version 2 (new):
Identifier:
decentriq.python-synth-data-worker:v2
Hash:
73a18198fdd98ded93cd9be57c3457e5a83eb7f998dcfdc625c8bb25c0751a30
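One way to use the identifier and hash pairs listed above is to pin them in client code and refuse any enclave whose reported hash differs. The sketch below assumes the reported hash is already available as a hex string; how it is obtained from an attestation is outside its scope, and `PINNED_ENCLAVES` and `is_pinned` are hypothetical names rather than part of the SDK.

```python
import hmac

# Identifier/hash pairs copied from the list above (a subset, for brevity).
PINNED_ENCLAVES = {
    "decentriq.driver:v4": "d7af541df8de018effbe537269a624850868e93c6982acc31e3f81e5babaf9a5",
    "decentriq.sql-worker:v4": "3cf64f9d80f538e7d1e48725c63b3ee2091d7cf5508e69e0db3a8ed2d98595c5",
    "decentriq.python-ml-worker:v2": "0c3952473d23707bf8cd6a909c8162e2c9f13c5a5af34228518a1c4ad35d358e",
}


def is_pinned(identifier: str, reported_hash: str) -> bool:
    """Return True only if the identifier is known and its hash matches the pinned value."""
    expected = PINNED_ENCLAVES.get(identifier)
    if expected is None:
        return False  # unknown enclave identifiers are never trusted
    # compare_digest avoids leaking the position of the first mismatch through timing
    return hmac.compare_digest(expected, reported_hash.lower())


print(is_pinned(
    "decentriq.driver:v4",
    "d7af541df8de018effbe537269a624850868e93c6982acc31e3f81e5babaf9a5",
))  # True
print(is_pinned("decentriq.driver:v4", "0" * 64))  # False
```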
Features
Synthetic data generation: generate from any table a differentially-private synthetic copy of the data with similar statistical properties. This allows you to prototype your scripts locally before running them on the original data in a dedicated data clean room.
R computations: you can now also write scripts in the R language.
Multi-file scripts: Python and R computations support more than just a single script; they can now consist of multiple scripts and file types.
Preview script results: the CSV result from Python and R scripts can now be viewed directly in the UI.
Unstructured data support: it's now possible to provision datasets of any kind, not only tabular but also unstructured data such as texts, JSON and images.
Bugfixes
Some datasets with floating point numbers couldn’t be provisioned
Some strings were not wrapped in quotes when downloading results as CSV
Fixed a compatibility issue with Firefox
Improved specific error messages for Python scripts
Improved performance for data clean rooms with a large amount of computations
Enclave versions
Driver enclave
Identifier:
decentriq.driver:v2
Hash:
e3c40c52ccaa92ab420d30d31afcbfe898861d9d67e7ebee2df261db9c0372c4
SQL worker
Identifier:
decentriq.sql-worker:v2
Hash:
3742070ece17bf1c0f37191bdbba4e1ebdb1be0f6194e1594a53bea49484f98d
Python worker (AWS Nitro-based):
Identifier:
decentriq.python-ml-worker:v1
Hash:
47dd5eee8bbebf33a25dfffd825196bcb4276f4fe06aef09e27d0bdf5f7da43c
R worker (AWS Nitro-based):
Identifier:
decentriq.r-latex-worker:v1
Hash:
77dab20c5a42a8383083162e9f5f7c6251f5e5ac28b007968dc216b8e9f01750
Synthetic data worker (AWS Nitro-based):
Identifier:
decentriq.python-synth-data-worker:v1
Hash:
a5324a143bed8f63639fc48cc45046e288ad0d1feb79dc0c323a8ebfe9d6ed07
We are delighted to announce that the Decentriq platform version 2.0 is released! We are introducing compute nodes that support the execution of Python scripts. This opens the door to confidential machine learning and other exciting applications in order to unlock the value of your sensitive data assets, all powered by confidential computing! Available in Switzerland and globally.
What's new in version 2.0
The most important improvement is our new compute graph-based platform architecture: the Decentriq platform now consists of driver and worker enclaves. While the driver enforces the data clean rooms permissions and orchestrates the execution, the worker enclaves execute the computations. This architecture allows us to combine different trusted execution environment technologies according to their strengths: Intel SGX, AMD SEV/SNP, AWS Nitro...
Processing your datasets just got more powerful! Besides writing SQL queries, you can now take your analyses to the next level by writing Python scripts, also running on confidential computing!
Brand new Python SDK, fully compatible with data clean rooms created using our web platform. Your entire workflow can be automated: manage data clean rooms, datasets and analyses programmatically. Documentation and step-by-step guides available.
Additional Features
Tables and files browser: writing a query or script is made easier now by just having all available tables, columns and files in a handy side panel.
Several UI improvements: finding your way in the UI is just getting more intuitive and efficient.
Bugfixes
When importing a data clean room from a file, the description field was not being read correctly
Switching between data clean rooms was slightly time consuming
Some brand logos were cropped
Some error messages were not very intuitive
Enclave versions
The following enclave versions will be available.
Driver enclave
Identifier:
decentriq.driver:v2
Hash:
e3c40c52ccaa92ab420d30d31afcbfe898861d9d67e7ebee2df261db9c0372c4
SQL worker
Identifier:
decentriq.sql-worker:v2
Hash:
3742070ece17bf1c0f37191bdbba4e1ebdb1be0f6194e1594a53bea49484f98d
Python worker (AWS Nitro-based):
Identifier:
decentriq.python-ml-worker:v1
Hash:
47dd5eee8bbebf33a25dfffd825196bcb4276f4fe06aef09e27d0bdf5f7da43c
Features
Data portal: An overview of all datasets you encrypted and connected to data clean rooms is available in the new ‘Datasets’ menu. Keep full control over your data!
Dataset statistics: When connecting your datasets, the platform computes summary statistics that you can share with other participants of the data clean room. Get and share instant insights into the data quality!
Fuzzy string matching: The fuzzystrmatch function now takes an optional parameter such that only the best match is returned.
Several UI improvements
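The difference between returning all fuzzy matches and returning only the best one can be illustrated in plain Python. This sketch uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; it is not the platform's matching algorithm, and the function name, the `threshold` value and the `best_only` parameter are assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import List


def fuzzy_matches(query: str, candidates: List[str],
                  threshold: float = 0.6, best_only: bool = False) -> List[str]:
    """Return candidates whose similarity to the query reaches the threshold.

    Results are ordered from most to least similar. With best_only=True,
    at most one candidate (the most similar) is returned.
    """
    scored = []
    for candidate in candidates:
        score = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
        if score >= threshold:
            scored.append((score, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    matches = [candidate for _, candidate in scored]
    return matches[:1] if best_only else matches


names = ["John Smith", "Jane Smyth", "Bob Brown"]
print(fuzzy_matches("Jon Smith", names))                  # every candidate above the threshold
print(fuzzy_matches("Jon Smith", names, best_only=True))  # only the closest candidate
```

Returning only the best match keeps a fuzzy join from multiplying rows when several candidates are close to the same key.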
Bugfixes
Fixed a computation bug of the STDEV and STDEVP SQL commands
When CSV datasets were truncated, the file size was not correctly calculated
Fixed a bug when editing column names of a table
Fixed a bug that forced users to login and logout twice to switch accounts
Enclave version
1978873e5be413527f9025d18b39e8a7071fbfeea90669064bf8322596c0a595
Features
Technical documentation: Easy access to centralised documentation about the platform, SDKs and step-by-step tutorials now available at docs.decentriq.com - check it out!
Dataset upload wizard: Uploading CSVs got more powerful - now you can adjust the parsing parameters and preview your data before connecting it to a data clean room, besides adding some context to it such as name and description.
Dataset metadata: On every data clean room table, it's possible to see more information about the uploaded datasets, such as number of rows, file size, name and description.
Previewing query results with many columns is now possible, with horizontal scrolling.
Bugfixes
Fixed a bug that was sending multiple requests when retrieving the audit log.
Fixed a bug that was breaking the chronological order of the rendered audit log.
Now all participants are always copied when duplicating a data clean room.
Enclave version
e6546a05f73a23c0f7fa88fbabe3e0feca4107d82b1336797235626b0554d981
Features
Data deletion: Now, when you delete your dataset, all the results, derived datasets and metadata are also deleted from the encrypted data store.
Improved multitenancy: Enclaves now prioritize fast requests (such as retrieving the audit log or the data clean room definition) over longer running tasks (executing queries) to give you the best possible user experience.
Data clean room stopping: The data clean room creator can now stop the data clean room, so that no participant can upload datasets or run queries. Stopped data clean rooms still allow data deletion and retrieving the audit log.
Major changes of data clean room creation UI/UX: The UX of the data clean room creation has been improved thanks to several changes: Modifications of titles or elements are now auto-saved, you can re-order table columns easily, the data clean room creation elements are easier to modify, and multiple styling improvements.
API tokens self-service: You can create and manage your API tokens on your own from within the platform UI.
The SQL capabilities page is now more easily accessible through a link in the sidebar.
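The multitenancy change above amounts to serving short requests ahead of long-running ones. A minimal sketch of that idea with a priority queue follows; the two priority classes, the `RequestQueue` class and the request names are illustrative assumptions, not the enclave's actual scheduler.

```python
import heapq
from itertools import count

FAST, SLOW = 0, 1  # lower number is served first


class RequestQueue:
    """Serve fast requests before slow ones; keep arrival order within each class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves first-come, first-served

    def submit(self, priority: int, name: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def next_request(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
queue.submit(SLOW, "execute query")
queue.submit(FAST, "retrieve audit log")
queue.submit(FAST, "retrieve data clean room definition")

served = [queue.next_request() for _ in range(len(queue))]
print(served)
# ['retrieve audit log', 'retrieve data clean room definition', 'execute query']
```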
Bugfixes
Fixed a bug that sometimes prevented the audit log from being rendered properly.
The sidebar now gets refreshed every time that a DCR is created or deleted.
Enclave version
e6546a05f73a23c0f7fa88fbabe3e0feca4107d82b1336797235626b0554d981
Features
Privacy filter: To protect the privacy of individuals, we implemented a configurable privacy filter that enforces a minimum level of aggregation in the output. If enabled, only aggregating queries (with a GROUP BY in the last step) are allowed and all resulting groups with fewer rows than a specified threshold are filtered out before being returned to the analyst.
Result file naming: Result files are now named according to the query.
Various small UI/UX improvements
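The effect of the privacy filter can be sketched in Python as a group count followed by a minimum-size check. The function name, the sample rows and the threshold of 3 are hypothetical; the platform applies the filter inside the enclave on the result of the SQL query.

```python
from collections import Counter
from typing import Dict, List


def aggregate_with_privacy_filter(rows: List[dict], group_by: str,
                                  min_group_size: int) -> Dict[str, int]:
    """Count rows per group and drop every group smaller than min_group_size."""
    counts = Counter(row[group_by] for row in rows)
    return {group: n for group, n in counts.items() if n >= min_group_size}


rows = [
    {"city": "Zurich"}, {"city": "Zurich"}, {"city": "Zurich"},
    {"city": "Basel"}, {"city": "Basel"},
    {"city": "Lugano"},
]

result = aggregate_with_privacy_filter(rows, "city", min_group_size=3)
print(result)  # {'Zurich': 3}
```

Groups with only one or two rows never reach the analyst, which limits what can be learned about any single individual from the output.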
Bugfixes
Fixed a rare concurrency issue when executing long-running queries that could lead to a situation where the query never seems to finish
'Last edited' timestamp bug has been fixed
Enclave version
59b7768e3c455cec14aac84187618c46a2a49c21cac754bcd021deb5cd0299ea
Features
Better support for long-running queries: You can now leave the page when a query is triggered without losing the results. The query is run anyway and you are able to retrieve the results once computed
Query constraint: Added the option to run queries only when all datasets have been provided to the data clean room
Better support for query development: When creating a data room, you can now validate your queries for correctness without publishing the data room. To boot, you also get the schema of each query result
Better data room descriptions: The description field now supports long, rich-formatted texts for beautiful descriptions
Branding of data rooms: You can now personalize the data clean rooms you create with your company logo, using the toggle from the top-right menu
Simplified participant view: We simplified the data room participation by combining the STATUS and OVERVIEW tabs
Improved audit log: We added a more expressive human-readable column to the audit log and also log more information
Bugfixes
In published data clean rooms, the text is now in read-only mode
Fixed link to change password
Fixed formatting of the headers in the results preview
Enclave version
597731febdecd0b5fcdfd4331c1b0c238e40f2c4c51e346e9036135ce1a81287
Features
SQL engine and stack optimization: The platform now performs computations up to 5x faster
The platform now supports single sign-on via Microsoft Active Directory
Full support for DISTINCT function and casting to VARCHAR
More control of your data: Improved user experience for dataset de-provisioning
The description box is now a long rich-text field that can contain instructions and useful information to use the data clean room in a well-formatted manner
The audit log is now more readable
Improved the validation of identifiers (table names, column names, query names)
Improved user experience in the analysis and action tab: It is now easier to scroll, run queries, and download results
The UI is now able to load and display large result tables
Reordering table columns is now more intuitive
Other minor UI and UX improvements
Bugfixes
The data clean room publication validation now checks the consistency of the data type when variables are joined in an SQL statement
Fixed bug regarding SUM on a window function (it used to be implemented as a sum over the partition but it should be a running sum)
The analysis tab now refreshes the content properly when a query is deleted
Fixed a bug during login workflow
Fixed a bug that froze text input in the modal window for creating a new data room
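The window-function fix listed above concerns the difference between a sum over the whole partition and a running sum. In standard SQL, a window with an ORDER BY clause uses a default frame that ends at the current row, which yields the running sum. The two behaviours look like this on a small example (the values are made up):

```python
from itertools import accumulate

# Values of one partition, already in ORDER BY order.
values = [10, 20, 30]

# Previous behaviour: every row received the total of the whole partition.
partition_sum = [sum(values)] * len(values)

# Corrected behaviour: each row receives the sum of all rows up to and including itself.
running_sum = list(accumulate(values))

print(partition_sum)  # [60, 60, 60]
print(running_sum)    # [10, 30, 60]
```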
Enclave version
9288101a27978bfbc67ec393d3060ce5c96139bbf9aab320061a25c562bdac4b
V1.0 brings major changes in the web interface of the platform, with a whole new look & feel, an improved UI and more intuitive UX.
New features
(1) Expanded SQL functionality:
Added support for 'IS NULL' and 'IS NOT NULL'
ORDER BY ASC/DESC function is now fully supported
NTILE function is now supported
ROUND function is now supported
(2) New UI functionalities:
You can now directly duplicate an existing data clean room
Error messages are more actionable and closer to natural language
You are now able to see who has already contributed data to a data clean room
Headers are now included in the preview and the .CSV of the results
Archiving data rooms without deleting them is now possible
General improvements and improved visualization of the audit log
User roles and permissions are now explicit and have a dedicated section of the data clean room
User documentation is available directly from the platform
Table creation is now supported by a UI table builder
It is now possible to delete your encrypted datasets from the data clean room
Enclave version
5d93bfef5324f984d7781b55ac0cc4cd13dc21eed1f9a7750e1747016abddaf0
Features
Fuzzy matching now supports any type of JOIN in the FROM statement
Casting into int64 is now supported
Computation speed of the fuzzy matching algorithm has been substantially improved
Bugfixes
Timeout error during upload is now fixed
Enclave version
561fd910346334b5a245b21b81287393310bd90bc1ae09028a7680291ab48d4a
Features
Audit log: It is now possible to download a CSV file that shows all the interactions that happened in the secure enclave for a specific data clean room. This tamper-proof log gives data owners full transparency of how their data has been used.
Queries can now use other queries in their FROM part. This UX feature makes it much easier to write longer queries.
Results can now be arbitrarily large without causing problems.
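When queries can read from other queries, they form a dependency graph, and each query has to run after the queries it reads from. The sketch below shows one way to derive such an order with Python's standard `graphlib` module (Python 3.9+). The query names are hypothetical, and this is not a description of how the platform schedules queries internally.

```python
from graphlib import TopologicalSorter

# Hypothetical queries: each one maps to the other queries named in its FROM part.
query_dependencies = {
    "active_customers": set(),
    "orders_per_customer": {"active_customers"},
    "top_customers": {"orders_per_customer"},
}


def execution_order(dependencies):
    """Return the queries in an order where every query follows the ones it reads from."""
    # static_order() raises graphlib.CycleError if queries reference each other in a loop
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(query_dependencies)
print(order)  # ['active_customers', 'orders_per_customer', 'top_customers']
```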
Enclave version
b6f5fe60884d309951128d02e53e623c66b246d0ae9425974edb28296d3eac4d
Features
You can now set a password during the data room definition that needs to be used to interact with the data room once published.
UNION is now supported.
Email notifications are sent out when a new account or data room is created or you are invited to collaborate in a data room.
Performance improvement: Now the platform switches automatically from single enclave to distributed mode, allowing computation on bigger datasets.
Table and query editors are now resizable.
Personalised platform branding has been added.
Bugfixes
Fixed bug in parsing the table name from the CREATE TABLE statement.
Minor bug fixes in query validation.
Features
Now the platform supports NULL values and all types of JOINs
Added query polling to prevent the browser from timing out
Improved messages and UX for data room validation errors
Now the uploaded datasets get validated before being ingested
Tables and queries definition can be downloaded also after publication
Improved UI for participants' actions
Bugfixes
Now tables and queries cannot be submitted without having at least one assigned analyst
The table name entry is now a read-only field, which prevents errors
Solved an issue regarding the results preview.