Generate 3D patches

Short Description

The generateSingle3DCube function cuts out a fixed-size sub-image (a 3D cube) centered on each specified, pre-segmented single cell. The cropped sub-images are used to train a deep learning model. Make sure to have the raw image, the computed single-cell spatial table, and the markers.csv file ready as input.
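
To illustrate what cutting a fixed-size cube around a cell centroid involves, here is a minimal NumPy sketch. It is not the library's implementation: `crop_cube` is a hypothetical helper, the volume is assumed to be a single-channel array in (z, y, x) order, and zero padding is only an assumption about how regions outside the image might be filled.

```python
import numpy as np

def crop_cube(volume, center, cropsize=75, z_cropsize=75, padding=True):
    """Hypothetical helper: cut a cube of shape (z_cropsize, cropsize, cropsize)
    around center = (z, y, x) from a single-channel volume in (z, y, x) order."""
    sizes = (z_cropsize, cropsize, cropsize)
    center = [int(round(c)) for c in center]
    if any(c < 0 or c >= dim for c, dim in zip(center, volume.shape)):
        raise ValueError("center must lie inside the volume")
    # Ideal window, which may extend past the volume borders
    starts = [c - s // 2 for c, s in zip(center, sizes)]
    stops = [st + s for st, s in zip(starts, sizes)]
    # Part of the window that actually overlaps the volume
    lo = [max(st, 0) for st in starts]
    hi = [min(sp, dim) for sp, dim in zip(stops, volume.shape)]
    cube = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    if cube.shape == sizes:
        return cube
    if not padding:
        return None  # an incomplete cube is discarded
    # Zero-pad each axis by the amount that fell outside the volume
    pad = [(l - st, sp - h) for st, sp, l, h in zip(starts, stops, lo, hi)]
    return np.pad(cube, pad, mode="constant")

# Toy 10 x 10 x 10 volume with values 1..1000
volume = np.arange(1, 1001).reshape(10, 10, 10)
inner = crop_cube(volume, (5, 5, 5), cropsize=4, z_cropsize=4)
corner = crop_cube(volume, (0, 0, 0), cropsize=4, z_cropsize=4)
dropped = crop_cube(volume, (0, 0, 0), cropsize=4, z_cropsize=4, padding=False)
print(inner.shape, corner.shape, dropped)
```

A cell in the interior yields a full cube directly. A cell at the border is either padded to the full size or dropped, which mirrors the two behaviors described for the padding parameter.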

Function

generateSingle3DCube(spatialTablePath, imagePath, imageName='test_image', markerChannelMapPath=None, markers=['DNA1'], markerColumnName='marker', channelColumnName='channel', x_coordinate='X_centroid', y_coordinate='Y_centroid', z_coordinate='Z_centroid', cell_indicator='cellID', cropsize=75, z_cropsize=75, padding=True, image_fraction=1, random_state=1, verbose=True, projectDir=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| spatialTablePath | str | Path to the single-cell spatial feature matrix (a .csv file). | required |
| imagePath | str | Path to the image file. Recognizes .ome.tif image files. | required |
| imageName | str | Image name used for the cropped sub-images. | 'test_image' |
| markerChannelMapPath | str | Path to a markers.csv file that maps channel numbers to markers. The file needs at least two columns, named 'channel' and 'marker' by default, and the channel numbers should use 1-based indexing. If None, only 'DNA1' is cropped, from the first channel. | None |
| markers | list | Markers for which single patches are generated. The names must match entries in the marker column of markers.csv. | ['DNA1'] |
| markerColumnName | str | The name of the column in the markers.csv file that holds the marker information. | 'marker' |
| channelColumnName | str | The name of the column in the markers.csv file that holds the channel information. | 'channel' |
| x_coordinate | str | The column in the single-cell spatial table that records the X coordinate of each cell. | 'X_centroid' |
| y_coordinate | str | The column in the single-cell spatial table that records the Y coordinate of each cell. | 'Y_centroid' |
| z_coordinate | str | The column in the single-cell spatial table that records the Z coordinate of each cell. | 'Z_centroid' |
| cell_indicator | str | The column in the single-cell spatial table that records the ID of each cell. | 'cellID' |
| cropsize | int | Crop size in pixels along x and y. | 75 |
| z_cropsize | int | Crop size in pixels along z. | 75 |
| padding | bool | If True, pad each cube to the expected crop size. Otherwise, discard cubes that do not reach the expected crop size. | True |
| image_fraction | float | Fraction of the cells (between 0 and 1) to crop and save. The default of 1 crops all cells in the single-cell spatial table. | 1 |
| random_state | int | Random seed used when sampling the cells to crop. Only applies if image_fraction is less than 1. | 1 |
| verbose | bool | If True, print detailed information about the process to the console. | True |
| projectDir | str | Path to the output directory. Results are written to projectDir/SpatialAE/Single3DPatch/. If None, the current working directory is used. | None |
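
A markers.csv file with the two default column names can be put together with pandas, as in the sketch below. The marker names come from the example on this page, but the channel numbers are placeholders chosen for illustration and must be replaced with the channel order of your own image. The second half shows one way to turn the 1-based channel numbers into zero-based indices for a list of requested markers.

```python
import io
import pandas as pd

# Channel numbers are placeholders (assumption); use the channel order of your own image.
markers_table = pd.DataFrame({
    "channel": [1, 2, 3, 4, 5],
    "marker": ["DNA1", "MART-1", "SOX10", "S100B", "Cytokeratin (pan)"],
})
# To save to disk instead, call markers_table.to_csv("markers.csv", index=False)
csv_text = markers_table.to_csv(index=False)

# Read the table back and map each requested marker to a zero-based channel index
loaded = pd.read_csv(io.StringIO(csv_text))
channel_of = dict(zip(loaded["marker"], loaded["channel"]))
requested = ["MART-1", "SOX10"]
marker_map = {name: int(channel_of[name]) - 1 for name in requested if name in channel_of}
print(marker_map)  # {'MART-1': 1, 'SOX10': 2}
```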

Returns:

| Name | Type | Description |
| --- | --- | --- |
| images | .tif | The 3D cubes, saved as .tif files under projectDir/SpatialAE/Single3DPatch/. |
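
Once the function has run, the generated cubes can be collected from the output folder. The sketch below uses only the standard library. `list_cubes` is a hypothetical helper, and because the file naming pattern is not documented here, it simply gathers every .tif file under projectDir/SpatialAE/Single3DPatch/, including sub-folders. The demo runs on a temporary directory with empty placeholder files.

```python
import pathlib
import tempfile

def list_cubes(project_dir):
    """Hypothetical helper: list the .tif files under projectDir/SpatialAE/Single3DPatch/."""
    out_dir = pathlib.Path(project_dir) / "SpatialAE" / "Single3DPatch"
    # rglob also finds files in sub-folders, in case outputs are grouped (an assumption)
    return sorted(out_dir.rglob("*.tif"))

# Demo on a temporary directory with two empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    out_dir = pathlib.Path(tmp) / "SpatialAE" / "Single3DPatch"
    out_dir.mkdir(parents=True)
    for name in ("a.tif", "b.tif"):
        (out_dir / name).touch()
    found = [p.name for p in list_cubes(tmp)]
print(found)
```

Each real output file can then be opened with a TIFF reader such as tifffile, which the source below already uses to load the raw image.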

Example
import spatialae
## crop DNA1 channel
spatialTablePath="/home/roh6824/ResearchProject/SpatialMolecular/3D/LSP13626_F8iic_metadata.csv"
imagePath="/n/scratch/users/r/xxx/data/LSP13626/Dataset1-LSP13626-invasive_margin.tif"

# marker = pd.read_table(markerChannelMapPath)
projectDir="/n/scratch/users/r/xxx/Results/LSP13626_DNA_padding/"

spatialae.datasets.generateSingle3DCube(spatialTablePath,
                imagePath,
                imageName = "LSP13626_3Dimage",
                markers = ["DNA1"],
                cropsize = 75,
                z_cropsize = 75,
                padding = True,
                cell_indicator="CellID",
                image_fraction = 1,
                projectDir=projectDir)

## crop multiple channels 
markerChannelMapPath="/n/scratch/users/r/roh6824/data/LSP13626/markers.csv"

projectDir="/n/scratch/users/r/roh6824/Results/LSP13626_DNA_padding/"

spatialae.datasets.generateSingle3DCube(spatialTablePath,
                imagePath,
                imageName = "LSP13626_3Dimage",
                markerChannelMapPath = markerChannelMapPath,
                markers = ["MART-1",  "SOX10", "S100B", "Cytokeratin (pan)"],
                cropsize = 75,
                z_cropsize = 75,
                padding = True,
                cell_indicator="CellID",
                image_fraction = 1,
                projectDir=projectDir)
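
When image_fraction is below 1, the source below subsamples the spatial table with pandas before cropping. The sketch reproduces that step on a toy table to show that a fixed random_state yields the same subset on every run. The table contents are made up for illustration, and only the column names follow the documented defaults.

```python
import pandas as pd

# Toy spatial table with the default column names (values are illustrative only)
table = pd.DataFrame({
    "cellID": range(1, 11),
    "X_centroid": range(10, 110, 10),
    "Y_centroid": range(20, 120, 10),
    "Z_centroid": range(5, 15),
})

# Keep 30% of the cells; the seed makes the selection reproducible
subset = table.sample(frac=0.3, random_state=1).reset_index(drop=True)
again = table.sample(frac=0.3, random_state=1).reset_index(drop=True)

print(len(subset))           # 3 of the 10 cells
print(subset.equals(again))  # True: same seed, same cells
```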
Source code in spatialae/datasets/pp3dimage.py
def generateSingle3DCube(spatialTablePath, 
                        imagePath,
                        imageName = "test_image",
                        markerChannelMapPath = None,
                        markers = ["DNA1"],
                        markerColumnName='marker',
                        channelColumnName='channel',
                        x_coordinate='X_centroid', 
                        y_coordinate='Y_centroid',
                        z_coordinate='Z_centroid',
                        cell_indicator="cellID",
                        cropsize = 75,
                        z_cropsize = 75,
                        padding = True,
                        image_fraction = 1,
                        random_state = 1,
                        verbose=True,
                        projectDir = None):
    """

Parameters:
        spatialTablePath (str):
            Path to the single-cell spatial feature matrix.

        imagePath (str):
            Path to the image file. Recognizes `.ome.tif` image file.

        imageName (str):
            Image name used for the cropped sub-images.

        markerChannelMapPath (str):
            Path to a `markers.csv` file that maps the channel number with the marker information. 
            Create a .csv file with at least two columns named 'channel' and 'marker' that 
            map the channel numbers to their corresponding markers. The channel number 
            should use 1-based indexing.

        markers (list):
            Markers for which single patches are generated. The names must match
            entries in the marker column of `markers.csv`.

        markerColumnName (str):
            The name of the column in the `markers.csv` file that holds the marker information. 

        channelColumnName (str):
            The name of the column in the `markers.csv` file that holds the channel information. 

        x_coordinate (str, optional):
            The column name in the `single-cell spatial table` that records the
            X coordinate of each cell. Defaults to 'X_centroid'.

        y_coordinate (str, optional):
            The column name in the `single-cell spatial table` that records the
            Y coordinate of each cell. Defaults to 'Y_centroid'.

        z_coordinate (str, optional):
            The column name in the `single-cell spatial table` that records the
            Z coordinate of each cell. Defaults to 'Z_centroid'.

        cell_indicator (str, optional):
            The column name in the `single-cell spatial table` that records the
            ID of each cell. Defaults to 'cellID'.

        cropsize (int, optional):
            Crop size in pixels along x and y. Defaults to 75.

        z_cropsize (int, optional):
            Crop size in pixels along z. Defaults to 75.

        padding (bool, optional):
            If True, pad each cube to the expected crop size.
            Otherwise, discard cubes that do not reach the expected crop size.

        image_fraction (float, optional):
            Fraction of the cells (between 0 and 1) to crop and save.
            The default of 1 crops all cells in the `single-cell spatial table`.

        random_state (int, optional):
            Random seed used when sampling the cells to crop. Only applies if
            image_fraction is less than 1.

        verbose (bool, optional):
            If True, print detailed information about the process to the console. 

        projectDir (string, optional):
            Path to output directory. The result will be located at
            `projectDir/SpatialAE/Single3DPatch/`.

Returns:
    images (.tif):  
        The 3D cubes, saved as .tif files under `projectDir/SpatialAE/Single3DPatch/`.

Example:
        ```python

        import spatialae
        ## crop DNA1 channel
        spatialTablePath="/home/roh6824/ResearchProject/SpatialMolecular/3D/LSP13626_F8iic_metadata.csv"
        imagePath="/n/scratch/users/r/xxx/data/LSP13626/Dataset1-LSP13626-invasive_margin.tif"

        # marker = pd.read_table(markerChannelMapPath)
        projectDir="/n/scratch/users/r/xxx/Results/LSP13626_DNA_padding/"

        spatialae.datasets.generateSingle3DCube(spatialTablePath,
                        imagePath,
                        imageName = "LSP13626_3Dimage",
                        markers = ["DNA1"],
                        cropsize = 75,
                        z_cropsize = 75,
                        padding = True,
                        cell_indicator="CellID",
                        image_fraction = 1,
                        projectDir=projectDir)

        ## crop multiple channels 
        markerChannelMapPath="/n/scratch/users/r/roh6824/data/LSP13626/markers.csv"

        projectDir="/n/scratch/users/r/roh6824/Results/LSP13626_DNA_padding/"

        spatialae.datasets.generateSingle3DCube(spatialTablePath,
                        imagePath,
                        imageName = "LSP13626_3Dimage",
                        markerChannelMapPath = markerChannelMapPath,
                        markers = ["MART-1",  "SOX10", "S100B", "Cytokeratin (pan)"],
                        cropsize = 75,
                        z_cropsize = 75,
                        padding = True,
                        cell_indicator="CellID",
                        image_fraction = 1,
                        projectDir=projectDir)
        ```



    """
    print("generateSingle3DCube starting")
    # create the output folder if it does not exist
    if projectDir is None:
        projectDir = os.getcwd()
    cropped_image_path = pathlib.Path(projectDir).joinpath("SpatialAE").joinpath("Single3DPatch")
    if not os.path.exists(cropped_image_path):
        os.makedirs(cropped_image_path)

    if markerChannelMapPath is None:
        # default to cropping DNA1 from the first channel (zero-based index 0)
        marker_map = {"DNA1": 0}
        markers = ["DNA1"]
    else:
        # read markers.csv to map marker names to channel numbers
        mapper = pd.read_csv(pathlib.Path(markerChannelMapPath))
        chmamap = dict(zip(mapper[markerColumnName], mapper[channelColumnName]))
        # keep only the requested markers listed in markers.csv, so that
        # marker names and channel indices stay aligned
        markers = [key for key in markers if key in chmamap]
        # map each marker to its channel index, converted from 1-based to zero-based
        marker_map = {key: chmamap[key] - 1 for key in markers}

    # load the csv to identify potential thumbnails
    metadata = pd.read_csv(pathlib.Path(spatialTablePath))
    if image_fraction < 1:
        metadata = metadata.sample(frac=image_fraction, random_state=random_state)
        metadata.reset_index(drop=True, inplace=True)
    locations = metadata[[x_coordinate, y_coordinate, z_coordinate]]
    boundary = boundarylocator(locations, cropsize, z_cropsize)

    # load the image to get the shape
    original_image = da.from_zarr(tifffile.imread(pathlib.Path(imagePath), aszarr=True, level=0))
    print("3D image shape:", original_image.shape)

    # zaxis_max_value = original_image.shape[0]
    # yaxis_max_value = original_image.shape[2]
    # xaxis_max_value = original_image.shape[3]
    # boundary = boundarylocator(locations, xaxis_max_value, yaxis_max_value, yaxis_max_value, cropsize, z_cropsize)
    image_shape = original_image[:, 0, :, :].shape  # all channels share the same shape, in (z, y, x) order


    # Run the function for each marker
    final = [processMarker(marker=marker,
                           marker_map=marker_map,
                           imagePath=imagePath,
                           locations=locations,
                           boundary=boundary,
                           metadata=metadata,
                           image_shape=image_shape,
                           cropped_image_path=cropped_image_path,
                           padding=padding,
                           imagename=imageName,
                           cell_indicator=cell_indicator,
                           verbose=verbose)
             for marker in markers]

    # Finish job: report the actual output folder
    if verbose:
        print('subimages have been generated, head over to "' + str(cropped_image_path) + '" to view results')