Single-cell identification

Functions to exploit detection and segmentation results, by identifying individual cells and their objects.

Identify and remove transcription sites

Define transcription sites as clustered RNAs detected inside the nucleus:

More generally, identify detected objects within a specific cellular region:

bigfish.multistack.remove_transcription_site(rna, clusters, nuc_mask, ndim)

Distinguish RNA molecules detected in a transcription site from the rest.

A transcription site is defined as a focus detected within the nucleus.

Parameters:
rna : np.ndarray

Coordinates of the detected RNAs with shape (nb_spots, 4) or (nb_spots, 3). One coordinate per dimension (zyx or yx coordinates) plus the index of the cluster assigned to the RNA. If no cluster was assigned, the value is -1.

clusters : np.ndarray

Array with shape (nb_clusters, 5) or (nb_clusters, 4). One coordinate per dimension for the cluster centroid (zyx or yx coordinates), the number of RNAs detected in the cluster and its index.

nuc_mask : np.ndarray, bool

Binary mask of the nuclei region with shape (y, x).

ndim : int

Number of spatial dimensions to consider (2 or 3).

Returns:
rna_out_ts : np.ndarray

Coordinates of the detected RNAs with shape (nb_spots, 4) or (nb_spots, 3). One coordinate per dimension (zyx or yx coordinates) plus the index of the focus assigned to the RNA. If no focus was assigned, the value is -1. RNAs from transcription sites are removed.

foci : np.ndarray

Array with shape (nb_foci, 5) or (nb_foci, 4). One coordinate per dimension for the focus centroid (zyx or yx coordinates), the number of RNAs detected in the focus and its index.

ts : np.ndarray

Array with shape (nb_ts, 5) or (nb_ts, 4). One coordinate per dimension for the transcription site centroid (zyx or yx coordinates), the number of RNAs detected in the transcription site and its index.
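The definition above can be illustrated with a minimal NumPy-only sketch. It is not the bigfish implementation: the helper name split_transcription_sites is hypothetical, and the sketch assumes integer coordinates that fall inside the mask, with the cluster index stored in the last column of clusters and in column ndim of rna.

```python
import numpy as np

def split_transcription_sites(rna, clusters, nuc_mask, ndim):
    """Hypothetical helper: split clusters into foci and transcription sites."""
    # The mask is 2-d, so only the y and x columns are used for the lookup.
    in_nuc = nuc_mask[clusters[:, ndim - 2], clusters[:, ndim - 1]]
    ts = clusters[in_nuc]      # clusters inside the nucleus
    foci = clusters[~in_nuc]   # clusters outside the nucleus
    # Drop RNAs whose cluster index (column `ndim`) belongs to a
    # transcription site; isolated RNAs (index -1) are kept.
    rna_out_ts = rna[~np.isin(rna[:, ndim], ts[:, -1])]
    return rna_out_ts, foci, ts

# Toy 2-d data: yx coordinates, then cluster size and/or cluster index.
nuc_mask = np.zeros((10, 10), dtype=bool)
nuc_mask[2:5, 2:5] = True
clusters = np.array([[3, 3, 2, 0], [8, 8, 2, 1]])
rna = np.array([[3, 3, 0], [3, 4, 0], [8, 8, 1], [8, 7, 1], [6, 6, -1]])
rna_out_ts, foci, ts = split_transcription_sites(rna, clusters, nuc_mask, ndim=2)
```

In this toy case, cluster 0 lies inside the nucleus and is reported as a transcription site, so its two RNAs are removed from rna_out_ts.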

bigfish.multistack.identify_objects_in_region(mask, coord, ndim)

Identify cellular objects in a specific region.

Parameters:
mask : np.ndarray, bool

Binary mask of the targeted region with shape (y, x).

coord : np.ndarray

Array with two dimensions. One object per row, zyx or yx coordinates in the first 3 or 2 columns.

ndim : int

Number of spatial dimensions to consider (2 or 3).

Returns:
coord_in : np.ndarray

Coordinates of the objects detected inside the region.

coord_out : np.ndarray

Coordinates of the objects detected outside the region.
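A minimal sketch of the idea, assuming integer coordinates that fall inside the mask. The helper name split_by_region is hypothetical and the code only illustrates the mask lookup; the library may handle edge cases differently.

```python
import numpy as np

def split_by_region(mask, coord, ndim):
    """Hypothetical helper: split rows of `coord` with a 2-d boolean mask."""
    # For zyx coordinates (ndim=3) the z column is ignored, because the
    # mask only has y and x axes.
    inside = mask[coord[:, ndim - 2], coord[:, ndim - 1]]
    return coord[inside], coord[~inside]

mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 1:4] = True
coord = np.array([[0, 2, 2], [1, 5, 5], [2, 3, 1]])  # zyx coordinates
coord_in, coord_out = split_by_region(mask, coord, ndim=3)
```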


Define and export single-cell results

Extract detection and segmentation results for every individual cell:

See an example of application here.

bigfish.multistack.extract_cell(cell_label, ndim, nuc_label=None, rna_coord=None, others_coord=None, image=None, others_image=None, remove_cropped_cell=True, check_nuc_in_cell=True)

Extract cell-level results for an image.

The function gathers different segmentation and detection results obtained at the image level and assigns each of them to the individual cells.

Parameters:
cell_label : np.ndarray, np.uint or np.int

Image with labelled cells and shape (y, x).

ndim : int

Number of spatial dimensions to consider (2 or 3).

nuc_label : np.ndarray, np.uint or np.int

Image with labelled nuclei and shape (y, x). If None, individual nuclei are not assigned to each cell.

rna_coord : np.ndarray

Coordinates of the detected RNAs with zyx or yx coordinates in the first 3 or 2 columns. If None, RNAs are not assigned to individual cells.

others_coord : Dict[np.ndarray]

Dictionary of coordinates arrays. For each array of the dictionary, the different elements are assigned to individual cells. Arrays should be organized the same way as spots: zyx or yx coordinates in the first 3 or 2 columns, np.int64 dtype, one element per row. Can be used to assign different detected elements to the segmented cells along with the spots. If None, no other elements are assigned to the individual cells.

image : np.ndarray, np.uint

Image in 2-d. If None, images of the individual cells are not extracted.

others_image : Dict[np.ndarray]

Dictionary of images to crop. If None, no other images of the individual cells are extracted.

remove_cropped_cell : bool

Remove cells cropped by the FoV frame.

check_nuc_in_cell : bool

Check that each nucleus is entirely localized within a cell.

Returns:
fov_results : List[Dict]

List of dictionaries, one per cell segmented in the image. Each dictionary includes information about the cell (image, masks, coordinates arrays). The minimal information is:

  • cell_id: Unique id of the cell.

  • bbox: bounding box coordinates with the order (min_y, min_x, max_y, max_x).

  • cell_coord: boundary coordinates of the cell.

  • cell_mask: mask of the cell.
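The assignment of detected spots to individual cells can be pictured with the NumPy-only sketch below. It is a simplified illustration, not the library's code: the helper name assign_spots_to_cells is hypothetical, label 0 is assumed to be background, and the sketch does not crop images, build masks or handle cells cut by the frame.

```python
import numpy as np

def assign_spots_to_cells(cell_label, rna_coord, ndim):
    """Hypothetical helper: group spots by the cell label under their yx position."""
    labels = cell_label[rna_coord[:, ndim - 2], rna_coord[:, ndim - 1]]
    results = []
    for cell_id in np.unique(cell_label):
        if cell_id == 0:  # assumption: 0 encodes the background
            continue
        results.append({
            "cell_id": int(cell_id),
            "rna_coord": rna_coord[labels == cell_id],
        })
    return results

cell_label = np.zeros((8, 8), dtype=np.int64)
cell_label[1:4, 1:4] = 1
cell_label[4:7, 4:7] = 2
rna_coord = np.array([[2, 2], [5, 5], [5, 6], [0, 0]])  # yx coordinates
results = assign_spots_to_cells(cell_label, rna_coord, ndim=2)
```

The spot at (0, 0) lies on the background and is therefore not attached to any cell.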

bigfish.multistack.extract_spots_from_frame(spots, ndim, z_lim=None, y_lim=None, x_lim=None)

Get the coordinates of the spots located within a given frame.

Parameters:
spots : np.ndarray

Coordinates of the spots. One coordinate per dimension first (zyx coordinates or yx coordinates) plus additional dimensions if necessary.

ndim : {2, 3}

Number of spatial dimensions to consider.

z_lim : tuple[int, int]

Minimum and maximum coordinate of the frame along the z axis.

y_lim : tuple[int, int]

Minimum and maximum coordinate of the frame along the y axis.

x_lim : tuple[int, int]

Minimum and maximum coordinate of the frame along the x axis.

Returns:
extracted_spots : np.ndarray

Coordinates of the spots. One coordinate per dimension first (zyx coordinates or yx coordinates) plus additional dimensions if necessary.
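The filtering step can be sketched as follows. The helper name spots_in_frame is hypothetical, and two behaviours are assumptions not stated above: the bounds are treated as inclusive, and the kept coordinates are returned unchanged (not shifted to the frame origin).

```python
import numpy as np

def spots_in_frame(spots, ndim, z_lim=None, y_lim=None, x_lim=None):
    """Hypothetical helper: keep the spots that fall within the given limits."""
    axes = [(ndim - 2, y_lim), (ndim - 1, x_lim)]
    if ndim == 3:
        axes.append((0, z_lim))
    keep = np.ones(len(spots), dtype=bool)
    for axis, lim in axes:
        if lim is not None:
            # Assumption: both bounds are inclusive.
            keep &= (spots[:, axis] >= lim[0]) & (spots[:, axis] <= lim[1])
    return spots[keep]

spots = np.array([[1, 2, 3], [4, 10, 3], [0, 5, 5]])  # zyx coordinates
extracted = spots_in_frame(spots, ndim=3, z_lim=(1, 5), y_lim=(0, 6))
```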

bigfish.multistack.summarize_extraction_results(fov_results, ndim, path_output=None, delimiter=';')

Summarize results extracted from an image and store them in a dataframe.

Parameters:
fov_results : List[Dict]

List of dictionaries, one per cell segmented in the image. Each dictionary includes information about the cell (image, masks, coordinates arrays). The minimal information is:

  • cell_id: Unique id of the cell.

  • bbox: bounding box coordinates with the order (min_y, min_x, max_y, max_x).

  • cell_coord: boundary coordinates of the cell.

  • cell_mask: mask of the cell.

ndim : int

Number of spatial dimensions to consider (2 or 3).

path_output : str, optional

Path to save the dataframe in a csv file.

delimiter : str, default=";"

Delimiter used to separate columns if the dataframe is saved in a csv file.

Returns:
df : pd.DataFrame

Dataframe with summarized results from the field of view, at the cell level. At least cell_id (unique id of the cell) and cell_area (2-d area of the cell, in pixels) are returned. Other indicators are summarized if available:

  • nuc_area: 2-d area of the nucleus, in pixels.

  • nb_rna: Number of RNAs detected in the cell.

  • nb_rna_in_nuc: Number of RNAs detected inside the nucleus.

  • nb_rna_out_nuc: Number of RNAs detected outside the nucleus.

Extra elements provided as coordinates arrays are also counted per cell and summarized.
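The indicators listed above could be computed per cell along these lines. This is only a sketch: the helper name summarize_cells is hypothetical, the dictionary keys nuc_mask and rna_coord are assumed names (only cell_id, bbox, cell_coord and cell_mask are listed as minimal information), RNA coordinates are assumed to be expressed in the frame of the cell masks, and the result is a list of rows rather than a pandas dataframe.

```python
import numpy as np

def summarize_cells(fov_results, ndim):
    """Hypothetical helper: compute one row of indicators per cell."""
    rows = []
    for cell in fov_results:
        row = {"cell_id": cell["cell_id"],
               "cell_area": int(cell["cell_mask"].sum())}
        nuc_mask = cell.get("nuc_mask")    # assumed key name
        rna_coord = cell.get("rna_coord")  # assumed key name
        if nuc_mask is not None:
            row["nuc_area"] = int(nuc_mask.sum())
        if rna_coord is not None:
            row["nb_rna"] = len(rna_coord)
            if nuc_mask is not None:
                in_nuc = nuc_mask[rna_coord[:, ndim - 2], rna_coord[:, ndim - 1]]
                row["nb_rna_in_nuc"] = int(in_nuc.sum())
                row["nb_rna_out_nuc"] = int((~in_nuc).sum())
        rows.append(row)
    return rows

cell_mask = np.ones((5, 5), dtype=bool)
nuc_mask = np.zeros((5, 5), dtype=bool)
nuc_mask[1:3, 1:3] = True
fov_results = [{"cell_id": 1, "cell_mask": cell_mask, "nuc_mask": nuc_mask,
                "rna_coord": np.array([[1, 1], [4, 4], [2, 2]])}]
rows = summarize_cells(fov_results, ndim=2)
```

Such a list of rows can be passed to the pd.DataFrame constructor to obtain a table with one line per cell.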


Manipulate surfaces, coordinates and boundaries

Convert identified surfaces into coordinates, delimit boundaries and manipulate coordinates:

bigfish.multistack.center_mask_coord(main, others=None)

Center a 2-d binary mask (surface or boundaries) or a 2-d localization coordinates array and pad it.

At least one mask or coordinates array should be provided (main). If other masks or arrays are provided (others), they will be transformed like main. All the provided masks should have the same shape.

Parameters:
main : np.ndarray, np.uint or np.int or bool

Binary image with shape (y, x) or array of coordinates with shape (nb_points, 2).

others : List(np.ndarray)

List of binary images with shape (y, x), arrays of coordinates with shape (nb_points, 2) or arrays of coordinates with shape (nb_points, 3).

Returns:
main_centered : np.ndarray, np.uint or np.int or bool

Centered binary image with shape (y, x).

others_centered : List(np.ndarray)

List of centered binary images with shape (y, x), centered arrays of coordinates with shape (nb_points, 2) or centered arrays of coordinates with shape (nb_points, 3).

bigfish.multistack.from_boundaries_to_surface(binary_boundaries)

Fill in the binary matrix representing the boundaries of an object.

Parameters:
binary_boundaries : np.ndarray, np.uint or np.int or bool

Binary image with shape (y, x).

Returns:
binary_surface : np.ndarray, bool

Binary image with shape (y, x).
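One generic way to fill a closed boundary is to flood-fill the background from outside the image and keep everything that was not reached. The sketch below uses this technique with NumPy and the standard library; it is not claimed to be the method used by the library, and the helper name fill_boundaries is hypothetical.

```python
from collections import deque

import numpy as np

def fill_boundaries(binary_boundaries):
    """Hypothetical helper: fill a closed boundary by flooding the outside."""
    # Pad with background so that the flood can go around the object.
    boundaries = np.pad(binary_boundaries.astype(bool), 1)
    outside = np.zeros_like(boundaries)
    outside[0, 0] = True
    queue = deque([(0, 0)])
    height, width = boundaries.shape
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and not boundaries[ny, nx] and not outside[ny, nx]):
                outside[ny, nx] = True
                queue.append((ny, nx))
    # Every pixel not reached from outside belongs to the surface.
    return ~outside[1:-1, 1:-1]

boundaries = np.zeros((5, 5), dtype=bool)
boundaries[1, 1:4] = boundaries[3, 1:4] = True
boundaries[1:4, 1] = boundaries[1:4, 3] = True
surface = fill_boundaries(boundaries)
```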

bigfish.multistack.from_surface_to_boundaries(binary_surface)

Convert the binary surface to binary boundaries.

Parameters:
binary_surface : np.ndarray, np.uint or np.int or bool

Binary image with shape (y, x).

Returns:
binary_boundaries : np.ndarray, np.uint or np.int or bool

Binary image with shape (y, x).
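A common way to obtain boundaries from a surface is to remove the interior pixels, that is, the surface pixels whose four direct neighbours also belong to the surface. The NumPy-only sketch below follows this definition; the helper name surface_to_boundaries is hypothetical and the library may use a different definition of the boundary (for instance another connectivity).

```python
import numpy as np

def surface_to_boundaries(binary_surface):
    """Hypothetical helper: keep the surface pixels that touch the background."""
    surface = np.pad(binary_surface.astype(bool), 1)
    center = surface[1:-1, 1:-1]
    # A pixel is interior if it and its four direct neighbours are set.
    interior = (center
                & surface[:-2, 1:-1] & surface[2:, 1:-1]
                & surface[1:-1, :-2] & surface[1:-1, 2:])
    return center & ~interior

square = np.zeros((5, 5), dtype=bool)
square[1:4, 1:4] = True
square_boundaries = surface_to_boundaries(square)
```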

bigfish.multistack.from_binary_to_coord(binary)

Extract coordinates from a 2-d binary matrix.

As the resulting coordinates represent the external boundaries of the object, the coordinate values can be negative.

Parameters:
binary : np.ndarray, np.uint or np.int or bool

Binary image with shape (y, x).

Returns:
coord : np.ndarray, np.int

Array of boundaries coordinates with shape (nb_points, 2).

bigfish.multistack.complete_coord_boundaries(coord)

Complete a 2-d coordinates array, by generating/interpolating missing points.

Parameters:
coord : np.ndarray, np.int

Array of coordinates to complete, with shape (nb_points, 2).

Returns:
coord_completed : np.ndarray, np.int

Completed coordinates arrays, with shape (nb_points, 2).
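Interpolating missing points can be pictured as walking along each segment between two consecutive coordinates, one pixel at a time. The sketch below does this with rounded linear interpolation. It is an illustration under assumptions: the helper name complete_boundaries is hypothetical, the points are assumed to be ordered along the contour, and the gap between the last and the first point is not closed.

```python
import numpy as np

def complete_boundaries(coord):
    """Hypothetical helper: add the missing pixels between consecutive points."""
    completed = [coord[:1]]
    for start, end in zip(coord[:-1], coord[1:]):
        n_steps = int(np.abs(end - start).max())
        if n_steps == 0:  # duplicated point, nothing to add
            continue
        steps = np.arange(1, n_steps + 1).reshape(-1, 1)
        offsets = np.rint(steps * (end - start) / n_steps).astype(coord.dtype)
        completed.append(start + offsets)
    return np.concatenate(completed)

sparse_coord = np.array([[0, 0], [0, 3], [2, 3]])  # yx coordinates
completed_coord = complete_boundaries(sparse_coord)
```

After completion, two consecutive points never differ by more than one pixel along any axis.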

bigfish.multistack.from_coord_to_frame(coord, external_coord=True)

Initialize a frame shape to represent coordinate values in a 2-d matrix.

If coordinates represent the external boundaries of an object, we add 1 to the minimum coordinate and subtract 1 from the maximum coordinate in order to build the frame. The frame centers the coordinates by default.

Parameters:
coord : np.ndarray, np.int

Array of cell boundaries coordinates with shape (nb_points, 2) or (nb_points, 3).

external_coord : bool

Coordinates represent external boundaries of object.

Returns:
frame_shape : tuple

Shape of the 2-d matrix.

min_y : int

Value to subtract from the y coordinate axis.

min_x : int

Value to subtract from the x coordinate axis.

marge : int

Value to add to the coordinates.
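The computation described above can be sketched as follows for coordinates with shape (nb_points, 2). The helper name frame_from_coord is hypothetical, the default marge=5 is an arbitrary value chosen for the example, and applying the padding on both sides of each axis is an assumption.

```python
import numpy as np

def frame_from_coord(coord, external_coord=True, marge=5):
    """Hypothetical helper: frame shape and offsets for yx coordinates."""
    min_y, max_y = int(coord[:, 0].min()), int(coord[:, 0].max())
    min_x, max_x = int(coord[:, 1].min()), int(coord[:, 1].max())
    if external_coord:
        # External boundaries lie one pixel outside the object.
        min_y, max_y = min_y + 1, max_y - 1
        min_x, max_x = min_x + 1, max_x - 1
    # Assumption: the padding is added on both sides of each axis.
    frame_shape = (max_y - min_y + 1 + 2 * marge,
                   max_x - min_x + 1 + 2 * marge)
    return frame_shape, min_y, min_x, marge

corners = np.array([[9, 19], [9, 31], [21, 31], [21, 19]])
frame_shape, min_y, min_x, marge = frame_from_coord(corners)
```

A point (y, x) would then be placed in the frame at (y - min_y + marge, x - min_x + marge).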

bigfish.multistack.from_coord_to_surface(cell_coord, nuc_coord=None, rna_coord=None, external_coord=True)

Convert 2-d coordinates to a binary matrix with the surface of the object.

If we manipulate the coordinates of the external boundaries, the relative binary matrix has two extra pixels in each dimension. We compensate by keeping only the inside pixels of the object surface.

If other coordinates are provided (nucleus and mRNAs), the relative binary matrices are built with the same shape as the main coordinates (cell).

Parameters:
cell_coord : np.ndarray, np.int

Array of cell boundaries coordinates with shape (nb_points, 2).

nuc_coord : np.ndarray, np.int

Array of nucleus boundaries coordinates with shape (nb_points, 2).

rna_coord : np.ndarray, np.int

Array of mRNAs coordinates with shape (nb_points, 2) or (nb_points, 3).

external_coord : bool

Coordinates represent external boundaries of object.

Returns:
cell_surface : np.ndarray, bool

Binary image of cell surface with shape (y, x).

nuc_surface : np.ndarray, bool

Binary image of nucleus surface with shape (y, x).

rna_binary : np.ndarray, bool

Binary image of mRNAs localizations with shape (y, x).

new_rna_coord : np.ndarray, np.int

Array of mRNAs coordinates with shape (nb_points, 2) or (nb_points, 3).