[BUG] Dimension mismatch after distributed segmentation of 2D image #1138

@drhochbaum

Distributed segmentation of a 2D zarr array runs through every chunk successfully, but the call then fails after all chunks have been processed:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[21], line 24
     13 cluster_kwargs = {
     14     'n_workers':1,    # if you only have 1 gpu, then 1 worker is the right choice
     15     'ncpus':8,
     16     'memory_limit':'64GB',
     17     'threads_per_worker':1,
     18 }
     20 # run segmentation
     21 # outputs:
     22 #     segments: zarr array containing labels
     23 #     boxes: list of bounding boxes around all labels (very useful for navigating big data)
---> 24 segments, boxes = distributed_eval(
     25     input_zarr=data_zarr,
     26     blocksize=(2048, 2048),
     27     write_path='/Users/XXX/YYY/data_chunk.zarr',
     28     model_kwargs=model_kwargs,
     29     eval_kwargs=eval_kwargs,
     30     cluster_kwargs=cluster_kwargs,
     31 )

File ~/miniforge3/envs/FISH/lib/python3.12/site-packages/cellpose/contrib/distributed_segmentation.py:357, in cluster.<locals>.create_or_pass_cluster(*args, **kwargs)
    355     with cluster_constructor(**kwargs['cluster_kwargs']) as cluster:
    356         kwargs['cluster'] = cluster
...
--> 182     raise RuntimeError('structure and input must have equal rank')
    183 for ii in structure.shape:
    184     if ii != 3:

RuntimeError: structure and input must have equal rank
------------------------------------------------------------------------------------------

I chunked a 2D image of DAPI-stained nuclei into a zarr array with chunks = (2048, 2048).

Then I ran the distributed segmentation:

# Run CELLPOSE3 distributed segmentation
from cellpose.contrib.distributed_segmentation import distributed_eval

# parameterize cellpose however you like
model_kwargs = {'gpu':True, 'model_type':'nuclei'}  # can also use 'pretrained_model'
eval_kwargs = {
    'diameter':100,
    'z_axis':0,
    'channels':[1,0],
    'do_3D':False,
}

# define compute resources for local workstation
cluster_kwargs = {
    'n_workers':1,    # if you only have 1 gpu, then 1 worker is the right choice
    'ncpus':8,
    'memory_limit':'64GB',
    'threads_per_worker':1,
}

# run segmentation
# outputs:
#     segments: zarr array containing labels
#     boxes: list of bounding boxes around all labels (very useful for navigating big data)
segments, boxes = distributed_eval(
    input_zarr=data_zarr,
    blocksize=(2048, 2048),
    write_path='/Users/XXX/YYY/chunked_data.zarr',
    model_kwargs=model_kwargs,
    eval_kwargs=eval_kwargs,
    cluster_kwargs=cluster_kwargs,
)

The error above is thrown only after all chunks have been processed.
The log file shows that nuclei were successfully detected in a majority of the chunks.
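
For reference, the message and the `structure.shape` check visible in the traceback look like the rank check in `scipy.ndimage.label`. That this is the exact call involved is an assumption, since the traceback above is truncated. The sketch below is independent of cellpose and uses a small made-up mask (`mask_2d` and `structure_3d` are illustrative names, not part of my data). It shows that passing a 2D array together with a 3D connectivity structure raises a `RuntimeError`, and that the same structure is accepted once the array has a leading singleton axis.

```python
import numpy as np
from scipy import ndimage

# Hypothetical stand-in for a 2D mask (not the data from this report).
mask_2d = np.zeros((8, 8), dtype=bool)
mask_2d[2:4, 2:4] = True

# A 3D connectivity structure, as code written for 3D volumes might use.
structure_3d = ndimage.generate_binary_structure(3, 1)

# Rank 2 input with a rank 3 structure: scipy rejects the combination.
try:
    ndimage.label(mask_2d, structure=structure_3d)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print("2D input:", error_message)

# With a leading singleton axis the input is rank 3 and the ranks agree.
mask_3d = mask_2d[np.newaxis, ...]
labels, n_labels = ndimage.label(mask_3d, structure=structure_3d)
print("3D input:", labels.shape, n_labels)
```

If this is what happens inside `distributed_eval`, storing the image with shape (1, height, width) and a matching three-element `blocksize` might avoid the mismatch, but I have not verified that against the cellpose code.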

dask_worker_0.log
