
PixelPartitioner

Short Description

The PixelPartitioner function applies multi-class Otsu thresholding to a set of images and records, for each image, the percentage of pixels that fall in the highest class. Images whose percentage exceeds percentPositiveThreshold are re-processed with one additional class, and this repeats until no image exceeds the threshold. The results of every iteration accumulate in a DataFrame, which is saved as master_results.csv in a 'results' subfolder of the specified output folder.
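
To make the "percentage of pixels in the highest class" concrete, here is a minimal pure-Python sketch of the two-class case. It is an illustration only and not the package's implementation (the actual thresholding happens in `process_images`, which is not shown on this page). The function names `otsu_threshold` and `percent_in_top_class` are hypothetical, and the input is assumed to be a flat list of integer intensities in the range 0 to 255.

```python
def otsu_threshold(pixels, levels=256):
    """Return the two-class Otsu threshold for integer intensities in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # intensity sum of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # no pixels left for the upper class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor of 1 / total**2)
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def percent_in_top_class(pixels):
    """Percentage of pixels strictly above the Otsu threshold."""
    t = otsu_threshold(pixels)
    return 100 * sum(1 for p in pixels if p > t) / len(pixels)


# 95 dark pixels and 5 bright pixels: 5% land in the highest class
pixels = [10] * 95 + [200] * 5
print(percent_in_top_class(pixels))  # 5.0
```

With the default percentPositiveThreshold of 5, this image would not be re-processed, because the source compares with a strict `>`.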

Function

PixelPartitioner(imagePaths, outputFolder, num_classes=2, percentPositiveThreshold=5, verbose=True)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| imagePaths | list of str | A list of paths to images that will undergo pixel partitioning. | required |
| outputFolder | str | The directory where the output results, including a master DataFrame as a CSV file, will be saved. The function will create a 'results' subfolder in this directory for the CSV file. | required |
| num_classes | int | The initial number of classes to use for Otsu thresholding. | 2 |
| percentPositiveThreshold | int or float | The percentage threshold used to determine if an image has a greater percentage of pixels in the highest class than specified. Images exceeding this threshold will be re-processed in subsequent iterations with an increased number of classes. | 5 |
| verbose | bool | If True, the function will print verbose messages about its progress. | True |
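
The interaction between num_classes and percentPositiveThreshold can be sketched as a small loop. This is a simplified model of the control flow, not the library code: `escalate_classes` and the `percent_top` callback are hypothetical names, the callback stands in for the real thresholding step, and the `max_classes` cap is an added safeguard that the original function does not have.

```python
def escalate_classes(names, percent_top, num_classes=2, threshold=5, max_classes=10):
    """Re-run images whose top-class percentage exceeds `threshold`.

    `percent_top(name, k)` is a hypothetical callback returning the
    percentage of pixels in the highest of k classes for image `name`.
    """
    results = {name: {} for name in names}
    remaining = list(names)
    while remaining and num_classes <= max_classes:
        failed = []
        for name in remaining:
            pct = percent_top(name, num_classes)
            # Same column naming scheme as the master DataFrame
            results[name][f"{num_classes}_class_OTSU"] = pct
            if pct > threshold:
                failed.append(name)
        remaining = failed   # only failed images go to the next round
        num_classes += 1     # with one more class
    return results


# Made-up percentages for illustration only
fake = {("a.tif", 2): 1.2, ("b.tif", 2): 40.0, ("b.tif", 3): 3.1}
table = escalate_classes(["a.tif", "b.tif"], lambda name, k: fake[(name, k)])
print(table)
```

Here `a.tif` passes at 2 classes and gets a single entry, while `b.tif` exceeds the threshold at 2 classes and is re-run with 3, which mirrors how images that pass early have empty cells in the later columns of the result.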

Returns:

| Name | Type | Description |
| --- | --- | --- |
| DataFrame | pandas.DataFrame | A DataFrame containing the cumulative results of the pixel partitioning process, with columns representing the results of different num_classes iterations. |

Example
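# Assumes the package has already been imported under the alias pp,
# for example: import pixelpartitioner as pp
imagePaths = ['/path/to/images/img1.tif', '/path/to/images/img2.tif']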
outputFolder = '/path/to/output'
num_classes = 2
percentPositiveThreshold = 5

# Execute pixel partitioning
results_df = pp.PixelPartitioner(imagePaths=imagePaths, 
                              outputFolder=outputFolder, 
                              num_classes=num_classes, 
                              percentPositiveThreshold=percentPositiveThreshold)
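
After a run, the saved CSV can be inspected to see which images needed more than the initial number of classes. The sketch below uses only the standard library. It relies on the file name `master_results.csv`, the `results` subfolder, the `FileName` index column, and the `{n}_class_OTSU` column names that appear in the source code; the helper name `images_needing_more_classes` is hypothetical.

```python
import csv
import os


def images_needing_more_classes(output_folder, first_num_classes=2):
    """List files that have a value in any column after the first iteration."""
    path = os.path.join(output_folder, "results", "master_results.csv")
    first_col = f"{first_num_classes}_class_OTSU"
    reprocessed = []
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            # Images that passed in the first iteration have empty cells
            # in every later column (the outer join leaves them missing).
            later = [
                col for col, value in row.items()
                if col not in ("FileName", first_col) and value not in ("", None)
            ]
            if later:
                reprocessed.append(row["FileName"])
    return reprocessed


# Usage, with the output folder from the example above:
# print(images_needing_more_classes('/path/to/output'))
```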
Source code in pixelpartitioner/PixelPartitioner.py
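# Module-level dependencies of this excerpt. process_images is a helper
# that is not shown here (presumably defined earlier in the same module).
import os

import pandas as pd


def PixelPartitioner(imagePaths,
                     outputFolder,
                     num_classes=2,
                     percentPositiveThreshold=5,
                     verbose=True):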

    """
    Apply multi-class Otsu thresholding to a set of images, re-processing
    images whose highest class exceeds percentPositiveThreshold with an
    increasing number of classes, and save the cumulative results to CSV.

    Parameters:
        imagePaths (list of str):
            A list of paths to images that will undergo pixel partitioning.
        outputFolder (str):
            The directory where the output results, including a master DataFrame as a CSV file, will be saved. The function will create a 'results' subfolder in this directory for the CSV file.
        num_classes (int, optional):
            The initial number of classes to use for Otsu thresholding. Default is 2.
        percentPositiveThreshold (int or float, optional):
            The percentage threshold used to determine if an image has a greater percentage of pixels in the highest class than specified. Images exceeding this threshold will be re-processed in subsequent iterations with an increased number of classes. Default is 5.
        verbose (bool, optional):
            If True, the function will print verbose messages about its progress. Default is True.

    Returns:
        DataFrame (pandas.DataFrame):
            A DataFrame containing the cumulative results of the pixel partitioning process, with columns representing the results of different num_classes iterations.

    Example:
        ```python

        imagePaths = ['/path/to/images/img1.tif', '/path/to/images/img2.tif']
        outputFolder = '/path/to/output'
        num_classes = 2
        percentPositiveThreshold = 5

        # Execute pixel partitioning
        results_df = pp.PixelPartitioner(imagePaths=imagePaths, 
                                      outputFolder=outputFolder, 
                                      num_classes=num_classes, 
                                      percentPositiveThreshold=percentPositiveThreshold)
        ```

    """


    # Iterate over an increasing number of OTSU classes
    paths_to_remaining_files = imagePaths[:]  # Copy of the initial list of image paths
    master_df = pd.DataFrame()  # Initialize an empty DataFrame for accumulating results
    first_num_classes = num_classes  # Store the initial num_classes value

    # Loop until there are no remaining files to process
    while len(paths_to_remaining_files) > 0:

        # Print statements for verbose output
        if verbose: 
            print(f'Performing OTSU Thresholding with {num_classes} classes')

        # Perform multi-class OTSU thresholding
        df = process_images(image_paths=paths_to_remaining_files, outputFolder=outputFolder, num_classes=num_classes)

        # Identify images whose percentage of pixels in the highest class exceeds the user-specified threshold
        column_name = f"{num_classes}_class_OTSU"  # Dynamic column name based on num_classes
        failedSample = df[df[column_name] > percentPositiveThreshold]['FileName'].tolist()

        # Prepare the DataFrame for merging
        df.set_index('FileName', inplace=True)

        # Merge with the master DataFrame
        if master_df.empty:
            master_df = df[[column_name]].copy()
        else:
            # Ensuring all previous iterations are carried forward even if they're missing in the current df
            master_df = master_df.join(df[[column_name]], how='outer')

        # Update paths for the next iteration
        if failedSample:
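            # Rebuild full paths from the file names; this assumes every image
            # lives in the same directory as the first remaining file
            path_to_prepend = os.path.dirname(paths_to_remaining_files[0])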
            paths_to_remaining_files = [os.path.join(path_to_prepend, file_name) for file_name in failedSample]
            num_classes += 1  # Increase the number of classes for the next iteration
        else:
            break  # Exit loop if no failed samples

    # Sort the master DataFrame based on the first iteration results, from largest to smallest
    first_iteration_column = f"{first_num_classes}_class_OTSU"
    if not master_df.empty and first_iteration_column in master_df.columns:
        master_df.sort_values(by=first_iteration_column, ascending=False, inplace=True)

    # Save master_df that contains the cumulative results with num_classes iterations as columns
    results_folder = os.path.join(outputFolder, 'results')
    # Create the 'results' folder if it does not exist
    os.makedirs(results_folder, exist_ok=True)
    # Specify the filename for saving the DataFrame
    filename = 'master_results.csv'
    # Construct the full path to the file
    file_path = os.path.join(results_folder, filename)
    # Save the master_df DataFrame to CSV in the 'results' folder
    master_df.to_csv(file_path, index=True)
    if verbose:
        print("---------------------------------------------")
        print(f"Master DataFrame saved to: {file_path}")
        print(f"Thresholded Images saved to: {outputFolder}")

    # Return the cumulative results of all iterations
    return master_df
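
The path reconstruction in the loop above joins each failed file name onto the directory of the first remaining file, so it only works when all images share one directory. A possible alternative, sketched under the assumptions that the `FileName` column holds base names including the extension and that base names are unique across the input, is to look each failed name up in the original list. The helper name `remaining_paths` is hypothetical and not part of the package.

```python
import os


def remaining_paths(image_paths, failed_names):
    """Map failed file names back to their original full paths."""
    # Assumes base names are unique; a later duplicate would overwrite an earlier one
    by_name = {os.path.basename(p): p for p in image_paths}
    return [by_name[name] for name in failed_names if name in by_name]


paths = ['/data/a/img1.tif', '/data/b/img2.tif']
print(remaining_paths(paths, ['img2.tif']))  # ['/data/b/img2.tif']
```

Because the lookup always uses the original input list, it keeps working across iterations even when the failed images come from different folders.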