Train 2D images
Short Description
Train an autoencoder (a deep learning model) on a given dataset for one or more selected markers. The autoencoder learns to compress the dataset images into a lower-dimensional encoding and then reconstruct them from this encoding. To train a model with `aeTrain` (single marker) or `aeTrainMulti` (multiple markers), point the function at the `dataset_dir` folder.
Function
aeTrain(dataset_dir, outModelPath, input_dim=256, encoding_dim=64, max_epoch_num=100, batch_size=8, num_workers=10, prefetch_factor=8)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_dir | str | The file path to the directory that holds the training data. | required |
outModelPath | str | The file path where the trained model will be saved. | required |
input_dim | int | The input dimension of the model. Input images are assumed to be square (size x size), so the default of 256 corresponds to an input image of 16x16 pixels. | 256 |
encoding_dim | int | The size of the encoding layer, i.e., the dimensionality of the feature encoding learned by the autoencoder. | 64 |
max_epoch_num | int | The maximum number of epochs to run during training. | 100 |
batch_size | int | The number of images in each batch produced by the dataloader. | 8 |
num_workers | int | The number of worker subprocesses used for loading the data. A value of 0 means that the data will be loaded in the main process. | 10 |
prefetch_factor | int | The number of batches loaded in advance by each worker. | 8 |
Returns:
Name | Type | Description |
---|---|---|
model | model | The function saves the trained model to the specified path. |
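The relationship between input_dim and the patch size stated in the parameter table (256 corresponds to 16x16 pixels) can be checked before training. The following is a minimal sketch using only the standard library; `patch_side` and `compression_ratio` are hypothetical helper names and are not part of spatialae.

```python
import math


def patch_side(input_dim: int) -> int:
    """Return the side length of the square patch with input_dim pixels.

    Hypothetical helper (not part of spatialae): raises ValueError when
    input_dim is not a perfect square, since images are assumed square.
    """
    side = math.isqrt(input_dim)
    if side * side != input_dim:
        raise ValueError(f"input_dim={input_dim} is not a perfect square")
    return side


def compression_ratio(input_dim: int, encoding_dim: int) -> float:
    """Ratio between the input size and the encoding size."""
    return input_dim / encoding_dim


if __name__ == "__main__":
    # Documented defaults: input_dim=256, encoding_dim=64
    print(patch_side(256))             # side length of the square patch
    print(compression_ratio(256, 64))  # how many times smaller the encoding is
```

With the documented defaults, the encoding holds a quarter as many values as the input patch.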
Example
import spatialae as sa
# Define the parameters
input_dim = 256  # 16 * 16 pixels
encoding_dim = 64  # documented default
max_epoch_num = 100  # documented default
batch_size = 8  # documented default
# Replace with the path to your dataset directory
dataset_dir = "/n/scratch/users/r/roh6824/Results/CRC12image_update/SpatialAE/SinglePatch/DNA1/"
# Replace with your desired output model file path
outModelPath = "/n/scratch/users/r/roh6824/Results/CRC12image_update/SpatialAE/ln_autoencoder_DNA_validate_300_model.pth"
# Train the autoencoder
sa.aeTrain(dataset_dir, outModelPath, input_dim, encoding_dim, max_epoch_num, batch_size)
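Because prefetch_factor is counted per worker, the number of images held in flight grows with batch_size, num_workers, and prefetch_factor together. The sketch below estimates that number and a rough memory footprint. `prefetched_images` and `prefetch_bytes` are hypothetical helpers, and the 4 bytes per pixel figure assumes float32 storage, which the documentation does not confirm.

```python
def prefetched_images(batch_size: int, num_workers: int, prefetch_factor: int) -> int:
    """Estimate how many images the workers hold in advance.

    Hypothetical helper (not part of spatialae). With num_workers == 0 the
    data is loaded in the main process, so no worker prefetching is counted.
    """
    if num_workers == 0:
        return 0
    return batch_size * num_workers * prefetch_factor


def prefetch_bytes(n_images: int, input_dim: int, bytes_per_value: int = 4) -> int:
    """Rough memory estimate; bytes_per_value=4 assumes float32 pixels."""
    return n_images * input_dim * bytes_per_value


if __name__ == "__main__":
    # Documented defaults: batch_size=8, num_workers=10, prefetch_factor=8
    n = prefetched_images(batch_size=8, num_workers=10, prefetch_factor=8)
    print(n, "images,", prefetch_bytes(n, input_dim=256), "bytes")
```

For small 16x16 patches this footprint is modest; lowering num_workers or prefetch_factor mainly matters on machines with few CPU cores or limited memory.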
Source code in spatialae/models/aeTrain.py
aeTrainMulti(dataset_dir, outModelPath, channels, input_dim=256, encoding_dim=64, max_epoch_num=100, batch_size=8, num_workers=10, prefetch_factor=8)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_dir | str | The file path to the directory containing the training dataset. Each subdirectory within it is expected to represent a marker. | required |
outModelPath | str | The file path where the trained model will be saved. | required |
channels | list | A list of strings naming the markers/channels to use for model training (e.g., ['CD3D', 'CD4']). Only data corresponding to these channels will be used for training. | required |
input_dim | int | The input dimension of the model. Input images are assumed to be square (size x size), so the default of 256 corresponds to an input image of 16x16 pixels. | 256 |
encoding_dim | int | The size of the encoding layer, i.e., the dimensionality of the feature encoding learned by the autoencoder. | 64 |
max_epoch_num | int | The maximum number of epochs to run during training. | 100 |
batch_size | int | The number of images in each batch produced by the dataloader. | 8 |
num_workers | int | The number of worker subprocesses used for loading the data. A value of 0 means that the data will be loaded in the main process. | 10 |
prefetch_factor | int | The number of batches loaded in advance by each worker. | 8 |
Returns:
Name | Type | Description |
---|---|---|
model | model | The function saves the trained model to the specified path. |
Example
import spatialae as sa
# Define the parameters
channels = ["DNA1", "CD3", "KERATIN", "CD20", "CD68", "CD8A", "CD163", "ECAD", "CD31"]
input_dim = 256
encoding_dim = 64
batch_size = 32
max_epoch_num = 300
# Replace with the path to your dataset directory (one subdirectory per marker)
dataset_dir = "/n/scratch/users/r/roh6824/Results/CRC12image_update/SpatialAE/SinglePatch/"
# Replace with your desired output model file path
outModelPath = "/n/scratch/users/r/roh6824/Results/CRC12image_update/SpatialAE/ln_autoencoder_multi_validate_300_model_dim32.pth"
# Train the autoencoder
sa.aeTrainMulti(dataset_dir, outModelPath, channels, input_dim, encoding_dim, max_epoch_num, batch_size)
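Since aeTrainMulti expects each subdirectory of dataset_dir to represent a marker, it can help to confirm that every requested channel has a matching folder before starting a long training run. The sketch below is one way to do that with the standard library. `missing_channels` is a hypothetical helper, not part of spatialae, and it assumes that subdirectory names match the channel names exactly.

```python
import tempfile
from pathlib import Path


def missing_channels(dataset_dir, channels):
    """Return the requested channels that have no subdirectory in dataset_dir.

    Hypothetical helper (not part of spatialae); assumes each marker is
    stored in a subdirectory named exactly like the channel.
    """
    root = Path(dataset_dir)
    present = {p.name for p in root.iterdir() if p.is_dir()}
    return [c for c in channels if c not in present]


if __name__ == "__main__":
    # Build a throwaway dataset layout to demonstrate the check
    with tempfile.TemporaryDirectory() as tmp:
        for marker in ("DNA1", "CD3"):
            (Path(tmp) / marker).mkdir()
        print(missing_channels(tmp, ["DNA1", "CD3", "CD20"]))
```

An empty list means every requested channel has a folder; any names returned should be fixed in the channels list or on disk before calling aeTrainMulti.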
Source code in spatialae/models/aeTrain.py