Getting started with ilastik API and OMERO
Description
We will use a Python script showing how to analyze data stored in an OMERO server using the ilastik API. The code snippets are extracted from the Python script pixel_classification.py. A notebook is also available, see idr0062_pixel_classification.ipynb.
We will show:
How to connect to the server.
How to load images from a dataset using the OMERO API.
How to run ilastik using its Python API.
How to save the generated results as OMERO images. If you are accessing a public resource, e.g. idr.openmicroscopy.org, this step will not work.
Setup
We recommend using a Conda environment to install ilastik and the OMERO Python bindings. Please read Install ilastik and OMERO Python bindings first.
Resources
We will use an ilastik project created with ilastik version 1.3.3 to analyze 3D images of mouse blastocysts from the Image Data Resource (IDR).
Images from IDR idr0062.
For convenience, the IDR data have been imported into the training OMERO.server. This is necessary because IDR is a read-only OMERO.server, so results cannot be saved back to it.
Step-by-Step
In this section, we go over the various steps required to analyze the data. The script used in this document is pixel_classification.py.
Connect to the server:
from omero.gateway import BlitzGateway

def connect(hostname, username, password):
    conn = BlitzGateway(username, password,
                        host=hostname, secure=True)
    conn.connect()
    return conn
Load the images:
def load_images(conn, dataset_id):
    return conn.getObjects('Image', opts={'dataset': dataset_id})
We are now ready to analyze the images:
import os

import vigra
from ilastik import app
from ilastik.applets.dataSelection.opDataSelection import PreloadedArrayDatasetInfo

def analyze(conn, images, model, new_dataset):
    # Prepare ilastik
    os.environ["LAZYFLOW_THREADS"] = "2"
    os.environ["LAZYFLOW_TOTAL_RAM_MB"] = "2000"
    args = app.parse_args([])
    args.headless = True
    args.project = model
    shell = app.main(args)
    for image in images:
        input_data = load_numpy_array(image)
        # run ilastik headless
        print('running ilastik using %s and %s' % (model, image.getName()))
        data = [{"Raw Data": PreloadedArrayDatasetInfo(
            preloaded_array=input_data,
            axistags=vigra.defaultAxistags("tzyxc"))}]
        predictions = shell.workflow.batchProcessingApplet.run_export(
            data, export_to_array=True)
        for d in predictions:
            save_results(conn, image, d, new_dataset)
The ilastik project used here expects the data array to be in TZYXC order. The order will need to be adjusted to match the order expected by your own ilastik project.
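If your array arrives in a different order, numpy.transpose can rearrange the axes before handing the data to ilastik. A minimal sketch, using a hypothetical array in ZCTYX order (the sizes below are made up for the example):

```python
import numpy

# Hypothetical 5D array in ZCTYX order:
# 2 z-sections, 3 channels, 4 timepoints, 5x6 pixels.
zctyx = numpy.zeros((2, 3, 4, 5, 6))

# Reorder (z, c, t, y, x) -> (t, z, y, x, c):
# t is old axis 2, z is 0, y is 3, x is 4, c is 1.
tzyxc = numpy.transpose(zctyx, (2, 0, 3, 4, 1))

print(tzyxc.shape)  # (4, 2, 5, 6, 3)
```

Checking the resulting shape against the sizes reported by the image is a cheap way to catch an axis mix-up before running a long prediction.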
import numpy

def load_numpy_array(image):
    pixels = image.getPrimaryPixels()
    size_z = image.getSizeZ()
    size_c = image.getSizeC()
    size_t = image.getSizeT()
    size_y = image.getSizeY()
    size_x = image.getSizeX()
    # Build the list of planes to request: t-major, then z, then c
    zct_list = []
    for t in range(size_t):
        for z in range(size_z):  # get the Z-stack
            for c in range(size_c):  # all channels
                zct_list.append((z, c, t))
    values = []
    # Load all the planes as YX numpy arrays
    planes = pixels.getPlanes(zct_list)
    j = 0
    k = 0
    tmp_c = []
    tmp_z = []
    print("z:%s t:%s c:%s y:%s x:%s" % (size_z, size_t, size_c, size_y, size_x))
    # axis tzyxc
    print("Downloading image %s" % image.getName())
    for p in planes:
        if k < size_z:
            if j < size_c:
                tmp_c.append(p)
                j = j + 1
                if j == size_c:
                    # use dstack to have c at the end
                    tmp_z.append(numpy.dstack(tmp_c))
                    tmp_c = []
                    j = 0
                    k = k + 1
        if k == size_z:  # done with the stack
            values.append(numpy.stack(tmp_z))
            tmp_z = []
            k = 0
    return numpy.stack(values)
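To see what load_numpy_array builds, here is the same stacking logic applied to synthetic planes (the sizes and pixel values below are made up): channel planes are dstacked into (y, x, c) slabs, the slabs are stacked into (z, y, x, c) volumes, and the volumes into a (t, z, y, x, c) array.

```python
import numpy

# Made-up dimensions standing in for image.getSizeZ() etc.
size_z, size_c, size_t, size_y, size_x = 2, 3, 4, 5, 6

# Fake planes in the same order as zct_list in the script:
# t outermost, then z, then c; one YX array per (z, c, t).
planes = [numpy.full((size_y, size_x), i)
          for i in range(size_t * size_z * size_c)]

values, tmp_z, tmp_c = [], [], []
for p in planes:
    tmp_c.append(p)
    if len(tmp_c) == size_c:
        # dstack puts the channel axis last: (y, x, c)
        tmp_z.append(numpy.dstack(tmp_c))
        tmp_c = []
    if len(tmp_z) == size_z and not tmp_c:
        values.append(numpy.stack(tmp_z))  # (z, y, x, c)
        tmp_z = []

array = numpy.stack(values)  # (t, z, y, x, c)
print(array.shape)  # (4, 2, 5, 6, 3)
```

Each plane ends up at array[t, z, :, :, c]; for example the plane loaded for (t=1, z=1, c=2) is plane number 1*6 + 1*3 + 2 = 11.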
Let’s now save the generated data and create a new OMERO image:
def save_results(conn, image, data, dataset):
    filename, file_extension = os.path.splitext(image.getName())
    # Save the probabilities file as an image
    print("Saving Probabilities as an Image in OMERO")
    name = filename + "_Probabilities"
    desc = "ilastik probabilities from Image:%s" % image.getId()
    # Re-organise array from tzyxc to the zctyx order expected by OMERO
    data = data.swapaxes(0, 1).swapaxes(3, 4).swapaxes(2, 3).swapaxes(1, 2)

    def plane_gen():
        """
        Set up a generator of 2D numpy arrays.
        The createImage method below expects planes in the order specified here
        (for z.. for c.. for t..)
        """
        size_z = data.shape[0] - 1
        for z in range(data.shape[0]):  # all Z sections
            print('z: %s/%s' % (z, size_z))
            for c in range(data.shape[1]):  # all channels
                for t in range(data.shape[2]):  # all time-points
                    yield data[z][c][t]

    conn.createImageFromNumpySeq(plane_gen(), name, data.shape[0],
                                 data.shape[1], data.shape[2],
                                 description=desc, dataset=dataset)
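The chain of swapaxes calls used above can be checked in isolation. A small sketch, with a made-up TZYXC array, showing each intermediate order:

```python
import numpy

# Hypothetical predictions array in TZYXC order:
# 4 timepoints, 2 z-sections, 5x6 pixels, 3 channels.
data = numpy.zeros((4, 2, 5, 6, 3))

# tzyxc -> ztyxc -> ztycx -> ztcyx -> zctyx
data = data.swapaxes(0, 1).swapaxes(3, 4).swapaxes(2, 3).swapaxes(1, 2)

print(data.shape)  # (2, 3, 4, 5, 6), i.e. (z, c, t, y, x)
```

The same reordering can be written as a single call, numpy.transpose(data, (1, 4, 0, 2, 3)), which some readers may find easier to follow than the chain of swaps.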
When done, close the session:
def disconnect(conn):
    conn.close()