Analyze data stored in a public S3 repository in parallel ========================================================= Description ----------- We will show how to use `dask `_ to analyze an IDR image stored in a public S3 repository We will show: - How to connect to IDR to retrieve the image metadata. - How to load the Zarr binary stored in a public repository. - How to run a segmentation on each plane in parallel. Setup ----- We recommend to use a Conda environment to install the OMERO Python bindings. Please read first :doc:`setup`. Step-by-Step ------------ In this section, we go through the steps required to analyze the data. The script used in this document is :download:`public_s3_segmentation_parallel.py <../scripts/public_s3_segmentation_parallel.py>`. Load the image and reate a dask array from the Zarr storage format: .. literalinclude:: ../scripts/public_s3_segmentation_parallel.py :start-after: # Load-binary :end-before: # Segment-image Define the analysis function: .. literalinclude:: ../scripts/public_s3_segmentation_parallel.py :start-after: # Segment-image :end-before: # Prepare-call Make our function lazy using ``dask.delayed``. It records what we want to compute as a task into a graph that we will run later in parallel: .. literalinclude:: ../scripts/public_s3_segmentation_parallel.py :start-after: # Prepare-call :end-before: # Compute We are now ready to run in parallel using the default number of workers see `Configure dask.compute `_: .. literalinclude:: ../scripts/public_s3_segmentation_parallel.py :start-after: # Compute :end-before: # main In order to use the methods implemented above in a proper standalone script: **Wrap it all up** in ``main``: .. literalinclude:: ../scripts/public_s3_segmentation_parallel.py :start-after: # main