Python SDK: Container Compute Module

This module adds functionality for running computations inside containers. It enables Python scripts to be executed within the trusted execution environment and to process both structured and unstructured input data. This functionality is currently limited to Python, but support for additional languages, such as R, will be available in the near future.

Refer to the "Python SDK: Quick Start" page to see this module in action.

When using this module, you define your computations as compute nodes of class StaticContainerCompute. This requires adding a suitable enclave specification when creating the DataRoomBuilder object, because only certain enclaves are capable of executing these computations.

The currently available enclave specifications for Python workers are listed in the documentation of the main SDK package.


"""
.. include:: ../../../decentriq_platform_docs/container_getting_started.md
---
"""
__docformat__ = "restructuredtext"

from .compute import StaticContainerCompute
from .helpers import read_result_as_zipfile


__all__ = [
    "StaticContainerCompute",
    "read_result_as_zipfile",
]
#   class StaticContainerCompute(decentriq_platform.node.Node):
class StaticContainerCompute(Node):
    """
    Compute node which allows the execution of programs inside a fixed
    container image.
    """

    def __init__(
            self,
            name: str,
            command: List[str],
            mount_points: List[MountPoint],
            output_path: str,
            enclave_type: str,
            include_container_logs_on_error: bool = False,
            include_container_logs_on_success: bool = False
    ) -> None:
        """
        Create a container compute node.

        **Parameters**:
        - `name`: The name of the node. This serves as an identifier for the
            node and needs to be specified when you interact with the node
            (e.g. run its computation or retrieve its results).
        - `command`: The command to execute within the container.
        - `mount_points`: A list of mount points that tell the enclave
            which input should be mounted at which file system path
            (e.g. the contents of a data node or the output of another
            compute node).
        - `output_path`: A path to a directory that will contain all the
            output written by the command executed in this container.
            Files within this directory will be zipped and downloadable.
        - `enclave_type`: The particular enclave to use for this container.
            This setting controls the environment in which the given `command`
            will be run, i.e. what programs and libraries are available.
            This identifier corresponds to the worker name without the version suffix,
            e.g. `"decentriq.python-ml-worker"`.
        - `include_container_logs_on_error`: Whether to report the internal
            container logs to the outside in case of an error. These logs
            could contain sensitive data and therefore this setting should
            only be used for debugging.
        - `include_container_logs_on_success`: Whether to report the internal
            container logs as part of the result zip file.
            Note that these logs could contain sensitive data and therefore this
            setting should only be used for debugging.
        """
        configuration = ContainerWorkerConfiguration()
        configuration.static.command.extend(command)
        configuration.static.mountPoints.extend(mount_points)
        configuration.static.outputPath = output_path
        configuration.static.includeContainerLogsOnError = include_container_logs_on_error
        configuration.static.includeContainerLogsOnSuccess = include_container_logs_on_success
        config = serialize_length_delimited(configuration)
        dependencies = list(map(lambda a: a.dependency, mount_points))

        super().__init__(
            name,
            config=config,
            enclave_type=enclave_type,
            dependencies=dependencies,
            output_format=ComputeNodeFormat.ZIP
        )

Compute node which allows the execution of programs inside a fixed container image.

#   StaticContainerCompute(name: str, command: List[str], mount_points: List[compute_container_pb2.MountPoint], output_path: str, enclave_type: str, include_container_logs_on_error: bool = False, include_container_logs_on_success: bool = False)

Create a container compute node.

Parameters:

  • name: The name of the node. This serves as an identifier for the node and needs to be specified when you interact with the node (e.g. run its computation or retrieve its results).
  • command: The command to execute within the container.
  • mount_points: A list of mount points that tell the enclave which input should be mounted at which file system path (e.g. the contents of a data node or the output of another compute node).
  • output_path: A path to a directory that will contain all the output written by the command executed in this container. Files within this directory will be zipped and downloadable.
  • enclave_type: The particular enclave to use for this container. This setting controls the environment in which the given command will be run, i.e. what programs and libraries are available. This identifier corresponds to the worker name without the version suffix, e.g. "decentriq.python-ml-worker".
  • include_container_logs_on_error: Whether to report the internal container logs to the outside in case of an error. These logs could contain sensitive data and therefore this setting should only be used for debugging.
  • include_container_logs_on_success: Whether to report the internal container logs as part of the result zip file. Note that these logs could contain sensitive data and therefore this setting should only be used for debugging.
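
To illustrate how command, mount_points and output_path relate to each other, here is a minimal sketch of the kind of script such a node might execute. It is not taken from the SDK: the function name summarize, the file names data.csv and summary.txt, and the directory layout are all hypothetical. In a real data room the input location would be determined by a mount point and the output location by output_path; here temporary directories stand in for both so that the sketch can be run locally.

```python
import csv
import tempfile
from pathlib import Path


def summarize(input_file: str, output_dir: str) -> Path:
    """Count the data rows of a CSV file and write the count into output_dir."""
    with open(input_file, newline="") as f:
        rows = list(csv.reader(f))
    # Treat the first row as a header; an empty file has zero data rows.
    row_count = max(len(rows) - 1, 0)

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    result_file = out / "summary.txt"
    result_file.write_text(f"rows={row_count}\n")
    return result_file


# Local dry run: temporary directories stand in for the (hypothetical)
# mounted input path and the node's output path.
with tempfile.TemporaryDirectory() as tmp:
    input_file = Path(tmp) / "input" / "data.csv"
    input_file.parent.mkdir()
    input_file.write_text("id,value\n1,10\n2,20\n")

    result = summarize(str(input_file), str(Path(tmp) / "output"))
    summary_text = result.read_text()
```

Everything the script writes below the output directory corresponds to what the description of output_path says will be zipped and made downloadable.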
#   def read_result_as_zipfile(result: bytes):
import io
import zipfile


def read_result_as_zipfile(
    result: bytes
):
    """
    Read the given raw computation result as a `zipfile.ZipFile` object.
    Use the `read(name: str)` method on the returned object to read a specific
    file contained in the archive.

    Refer to the [official documentation](https://docs.python.org/3/library/zipfile.html)
    of the zipfile library for all the available methods.
    """
    return zipfile.ZipFile(io.BytesIO(result), "r")

Read the given raw computation result as a zipfile.ZipFile object. Use the read(name: str) method on the returned object to read a specific file contained in the archive.

Refer to the official documentation of the zipfile library (https://docs.python.org/3/library/zipfile.html) for all the available methods.
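
As a rough illustration of how the returned object can be used, the sketch below builds a small zip archive in memory as a stand-in for a raw computation result and then reads a file from it. The helper is re-declared locally (copied from the source above) so that the example runs without the SDK installed; the file name summary.txt and its contents are made up for this example, and in real use the bytes would come from retrieving a compute node's result.

```python
import io
import zipfile


def read_result_as_zipfile(result: bytes) -> zipfile.ZipFile:
    # Local copy of the helper shown above, so the sketch runs without the SDK.
    return zipfile.ZipFile(io.BytesIO(result), "r")


# Build a stand-in for a raw computation result: an in-memory zip archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("summary.txt", "rows=2\n")
raw_result = buffer.getvalue()

# List the archive members and read one of them by name.
with read_result_as_zipfile(raw_result) as zf:
    file_names = zf.namelist()
    content = zf.read("summary.txt").decode("utf-8")
```

Note that read returns bytes, so text files need to be decoded explicitly, as done above.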