domid.dsets package¶

Submodules¶

domid.dsets.a_dset_mnist_color_rgb_solo module¶

Color MNIST with single color

class domid.dsets.a_dset_mnist_color_rgb_solo.ADsetMNISTColorRGBSolo(ind_color, path, subset_step=100, color_scheme='both', label_transform=<function mk_fun_label2onehot.<locals>.fun_label2onehot>, list_transforms=None, raw_split='train', flag_rand_color=False, inject_variable=None, args=None)[source]¶

Bases: Dataset

Color MNIST with single color

nominal domains: color palettes/range/spectrum
subdomains: color(foreground, background)
structure: each subdomain contains a combination of foreground+background color

abstract get_foreground_color(ind)[source]¶

abstract get_background_color(ind)[source]¶

abstract get_num_colors()[source]¶

__init__(ind_color, path, subset_step=100, color_scheme='both', label_transform=<function mk_fun_label2onehot.<locals>.fun_label2onehot>, list_transforms=None, raw_split='train', flag_rand_color=False, inject_variable=None, args=None)[source]¶

Parameters:

ind_color – index of a color palette
path – disk storage directory
color_scheme – num (paint according to number), back (only paint background), both (background and foreground)
list_transforms – torch transformations
raw_split – which raw split to use; defaults to the training part of MNIST
flag_rand_color – whether to randomly paint each image (deprecated)
label_transform – e.g. index to one hot vector
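The default label_transform shown in the signature is a closure returned by a factory (mk_fun_label2onehot). A minimal sketch of that pattern follows; the names mk_label2onehot and label2onehot are hypothetical stand-ins, and the actual implementation may return a tensor rather than a list.

```python
def mk_label2onehot(dim):
    """Build a function that maps an integer label to a one-hot list of length dim."""

    def label2onehot(label):
        # Reject labels that cannot be encoded in a vector of this length.
        if not 0 <= label < dim:
            raise ValueError(f"label {label} is outside [0, {dim})")
        onehot = [0.0] * dim
        onehot[label] = 1.0
        return onehot

    return label2onehot


# Ten classes, as for MNIST digits.
to_onehot = mk_label2onehot(10)
onehot_3 = to_onehot(3)
```

Passing such a function as label_transform lets the dataset return one-hot labels without knowing the number of classes itself.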

generate_dataframe()[source]¶

domid.dsets.dset_her2 module¶

class domid.dsets.dset_her2.DsetHER2(class_num, path, d_dim, inject_variable=None, metadata_path=None, transform=None)[source]¶

Bases: Dataset

Dataset of HER2 stained digital microscopy images. As currently implemented, the subdomains are the HER2 diagnostic classes 1, 2, and 3. There are also 4 data collection site/machine combinations.

__init__(class_num, path, d_dim, inject_variable=None, metadata_path=None, transform=None)[source]¶

Parameters:

class_num – an integer value from 0 to 2; only images of this class will be kept. Note that the actual classes are 1 to 3 (therefore, 1 is added in line 28)
path – path to data storage directory (typically passed through args.dpath)
d_dim – number of clusters for the clustering task
inject_variable – name of the variable to be injected for CDVaDE
metadata_path – path to the CSV file containing the to-be-injected variable for CDVaDE (typically passed through args.meta_data_csv); if not specified, defaults to “dataframe.csv” in the directory given by the “path” argument
transform – torch transformations
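The two conventions described in the parameter list (the zero-based class_num and the default metadata location) can be sketched as follows. Both helper names are hypothetical and are not part of domid; only the "dataframe.csv" default and the plus-one offset come from the text above.

```python
import os


def resolve_metadata_path(path, metadata_path=None):
    """Fall back to dataframe.csv inside the data directory when no CSV is given."""
    if metadata_path is None:
        return os.path.join(path, "dataframe.csv")
    return metadata_path


def her2_class_from_index(class_num):
    """Map the zero-based class_num (0 to 2) to the HER2 class (1 to 3)."""
    if class_num not in (0, 1, 2):
        raise ValueError(f"class_num must be 0, 1 or 2, got {class_num}")
    return class_num + 1


default_csv = resolve_metadata_path("data")
explicit_csv = resolve_metadata_path("data", "meta.csv")
her2_classes = [her2_class_from_index(i) for i in range(3)]
```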

domid.dsets.dset_mnist module¶

MNIST

class domid.dsets.dset_mnist.DsetMNIST(digit, args, list_transforms=None, raw_split='train')[source]¶

Bases: Dataset

MNIST dataset loading

subdomains: MNIST digit value
structure: each subdomain contains all images of a given digit

__init__(digit, args, list_transforms=None, raw_split='train')[source]¶

Parameters:

digit – an integer value from 0 to 9; only images of this digit will be kept.
path – disk storage directory
subset_step – used to subsample the dataset; a fraction of 1/subset_step images is kept
list_transforms – torch transformations
raw_split – which raw split to use; defaults to the training part of MNIST
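The combined effect of digit and subset_step can be illustrated with a small sketch. The helper select_digit is hypothetical, and the order of operations (filter by digit first, then keep every subset_step-th image) is an assumption; the actual class may subsample before filtering.

```python
def select_digit(images, labels, digit, subset_step=1):
    """Keep images of one digit, then keep every subset_step-th of those."""
    kept = [(img, lab) for img, lab in zip(images, labels) if lab == digit]
    # Stride slicing keeps roughly a fraction 1/subset_step of the filtered images.
    return kept[::subset_step]


# Toy data: 100 placeholder images whose labels cycle through 0..9.
labels = [i % 10 for i in range(100)]
images = [f"img_{i}" for i in range(100)]
subset = select_digit(images, labels, digit=3, subset_step=2)
```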

domid.dsets.dset_mnist_color_solo_default module¶

class domid.dsets.dset_mnist_color_solo_default.DsetMNISTColorSoloDefault(ind_color, path, subset_step=100, color_scheme='both', label_transform=<function mk_fun_label2onehot.<locals>.fun_label2onehot>, list_transforms=None, raw_split='train', flag_rand_color=False, inject_variable=None, args=None)[source]¶

Bases: ADsetMNISTColorRGBSolo

property palette¶

get_num_colors()[source]¶

get_background_color(ind)[source]¶

get_foreground_color(ind)[source]¶
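The relationship between the abstract base class and a concrete palette class can be sketched without torch. The class names, the three-color palette, and the rule for choosing the foreground color below are all hypothetical; only the three method names and the palette property come from this page.

```python
from abc import ABC, abstractmethod


class ColorSoloBase(ABC):
    """Stand-in for the abstract base: subclasses decide which colors exist."""

    def __init__(self, ind_color):
        self.ind_color = ind_color

    @abstractmethod
    def get_foreground_color(self, ind):
        ...

    @abstractmethod
    def get_background_color(self, ind):
        ...

    @abstractmethod
    def get_num_colors(self):
        ...


class ThreeColorPalette(ColorSoloBase):
    """Concrete palette with made-up RGB values."""

    @property
    def palette(self):
        return [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

    def get_num_colors(self):
        return len(self.palette)

    def get_background_color(self, ind):
        return self.palette[ind]

    def get_foreground_color(self, ind):
        # Hypothetical rule: use the next palette entry, wrapping around.
        return self.palette[(ind + 1) % self.get_num_colors()]


colors = ThreeColorPalette(ind_color=0)
```

Because the three methods are abstract, the base class itself cannot be instantiated; only subclasses that define all of them can.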

domid.dsets.dset_unittest module¶

class domid.dsets.dset_unittest.DsetUnitTest(digit, args, subset_step=1, list_transforms=None)[source]¶

Bases: Dataset

This dataset is used solely for unit testing of loss values. Each image is a tensor of ones with dimensions 1x16x16; the label is a random integer.

__init__(digit, args, subset_step=1, list_transforms=None)[source]¶

create_the_dataset(dpath)[source]¶

domid.dsets.dset_usps module¶

class domid.dsets.dset_usps.DsetUSPS(digit, args, subset_step=1, list_transforms=None)[source]¶

Bases: Dataset

__init__(digit, args, subset_step=1, list_transforms=None)[source]¶

get_original_indicies()[source]¶

domid.dsets.dset_wsi module¶

class domid.dsets.dset_wsi.DsetWSI(class_num, path, args, path_to_domain=None, transform=None)[source]¶

Bases: Dataset

Dataset of WEAH stained digital microscopy images. As currently implemented, the subdomains are the HER2 diagnostic classes 1, 2, and 3. There are also 4 data collection site/machine combinations.

__init__(class_num, path, args, path_to_domain=None, transform=None)[source]¶

Parameters:

class_num – an integer value from 0 to 2; only images of this class will be kept. Note that the actual classes are 1 to 3 (therefore, 1 is added in line 28)
path – path to root storage directory
d_dim – number of clusters for the clustering task
path_to_domain – path to a directory of previously predicted domain labels; must be specified if such labels are to be injected. The directory must contain domain_labels.txt with the to-be-injected labels.
transform – torch transformations

domid.dsets.generate_dataset_dataframe_her2 module¶

domid.dsets.generate_dataset_dataframe_her2.get_jpg_folders(path)[source]¶: Only keep folders of .jpg images, whose names by convention end in jpg.
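A minimal sketch of the folder-name convention, using only the standard library. The helper list_jpg_folders and the folder names in the demo are hypothetical; the actual function may return full paths or apply further filtering.

```python
import os
import tempfile


def list_jpg_folders(path):
    """Return names of sub-folders of path whose names end in 'jpg'."""
    return sorted(
        name
        for name in os.listdir(path)
        if name.endswith("jpg") and os.path.isdir(os.path.join(path, name))
    )


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for folder in ("site1_jpg", "site2_jpg", "notes"):
        os.mkdir(os.path.join(root, folder))
    # A stray file ending in jpg is not a folder and must be skipped.
    open(os.path.join(root, "stray.jpg"), "w").close()
    jpg_folders = list_jpg_folders(root)
```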

domid.dsets.generate_dataset_dataframe_her2.total_count_images(path)[source]¶

domid.dsets.generate_dataset_dataframe_her2.parse_machine_labels(image_names)[source]¶

domid.dsets.generate_dataset_dataframe_her2.mean_scores_per_experiment(scores, img_locs)[source]¶: Parser to get mean scores per image from the csv file. The image names in the folders differ slightly from the names in the csv file.

domid.dsets.make_graph module¶

class domid.dsets.make_graph.GraphConstructor(graph_method, topk=7)[source]¶

Bases: object

Class to construct a graph from features. It is used only when training the SDCN model.

__init__(graph_method, topk=7)[source]¶: Initializer of GraphConstructor. :param graph_method: the method to calculate distance between features; one of ‘heat’, ‘cos’, ‘ncos’. :param topk: number of connections per image

sparse_mx_to_torch_sparse_tensor(sparse_mx)[source]¶: Convert a scipy sparse matrix to a torch sparse tensor.

get_features_labels(dataset)[source]¶: This function is used to get features and labels from dataset. :param dataset: Image dataset that can be batched or unbatched :return: X: features from the image (flattened images), labels: domain labels, region_labels: region labels if the dataset is WSI images

normalize(mx)[source]¶: Row-normalize sparse matrix which is used to calculate the distance for normalized cosine method. :param mx: sparse matrix :return: row-normalized sparse matrix
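Row normalization divides every row by its sum so that each non-empty row sums to one. The sketch below uses a dense list of lists in place of a scipy sparse matrix to stay dependency-free; the function name row_normalize and the handling of all-zero rows (left as zeros) are assumptions.

```python
def row_normalize(matrix):
    """Divide each row by its sum; rows that sum to zero stay all-zero."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # Avoid division by zero for empty rows.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([value / total for value in row])
    return normalized


rows = row_normalize([[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]])
```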

distance_calc(features)[source]¶: This function is used to calculate distance between features. :param features: the batch of features from the dataset :return: distance matrix between features of the batch of images with the shape of (num_img, num_img)
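To illustrate the (num_img, num_img) output shape, here is a plain-Python sketch of two generic pairwise measures: cosine similarity and a heat kernel exp(-0.5 * squared distance). These are textbook formulas chosen for illustration, not necessarily what the 'cos' and 'heat' options compute in domid; note that both are similarities, so larger values mean closer images. The function name similarity_matrix is hypothetical.

```python
import math


def similarity_matrix(features, method="cos"):
    """Return an n x n matrix of pairwise similarities between feature vectors."""
    if method not in ("cos", "heat"):
        raise ValueError(f"unknown method: {method}")
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = features[i], features[j]
            if method == "cos":
                dot = sum(x * y for x, y in zip(a, b))
                norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
                sim[i][j] = dot / norm if norm > 0 else 0.0
            else:
                squared_dist = sum((x - y) ** 2 for x, y in zip(a, b))
                sim[i][j] = math.exp(-0.5 * squared_dist)
    return sim


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cos_sim = similarity_matrix(features, "cos")
heat_sim = similarity_matrix(features, "heat")
```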

connection_calc(features)[source]¶: This function is used to calculate the connection pairs between images for all the batches of dataset. :param features: flattened image from the batch of dataset :return: indices of top k connections per each image in the batch (shape: (num_img*self.topk, 2))
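The (num_img*topk, 2) shape of the return value can be seen in a small sketch that picks, for each image, the topk most similar other images. The function name top_k_connections is hypothetical, and excluding an image's connection to itself is an assumption about the actual behavior.

```python
def top_k_connections(similarity, topk):
    """Return [i, j] pairs linking each row i to its topk most similar columns j."""
    pairs = []
    for i, row in enumerate(similarity):
        # Consider every other image, most similar first.
        candidates = [j for j in range(len(row)) if j != i]
        candidates.sort(key=lambda j: row[j], reverse=True)
        for j in candidates[:topk]:
            pairs.append([i, j])
    return pairs


similarity = [
    [1.0, 0.9, 0.1, 0.5],
    [0.9, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.8],
    [0.5, 0.3, 0.8, 1.0],
]
pairs = top_k_connections(similarity, topk=2)
```

With four images and topk=2 the result has 4*2 rows of two indices each.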

mk_adj_mat(n, connection_pairs)[source]¶: This function is used to make the adjacency matrix for the graph for each batch of dataset. :param n: batchsize :param connection_pairs: top k connections per each image in the batch (shape: (num_img*self.topk, 2)) :return:
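One plausible way to turn connection pairs into an adjacency matrix is sketched below with dense lists. Making the matrix symmetric and adding self-loops are assumptions borrowed from common graph-convolution preprocessing; the actual method may differ and likely returns a sparse matrix. The function name adjacency_from_pairs is hypothetical.

```python
def adjacency_from_pairs(n, connection_pairs):
    """Build a symmetric n x n 0/1 adjacency matrix with self-loops."""
    adj = [[0.0] * n for _ in range(n)]
    for i, j in connection_pairs:
        # Treat every connection as undirected.
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    for i in range(n):
        adj[i][i] = 1.0
    return adj


adj = adjacency_from_pairs(3, [[0, 1], [1, 0], [2, 1]])
```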

construct_graph(dataset, experiment_folder)[source]¶: This function is used to construct the graph for all the batches of dataset. This is called in the trainer function of SDCN model. :param dataset: dataset containing all the batches of data (or unbatched data) :param graph_method: graph construction method :return: the adjacency matrix for all the batches of data

domid.dsets.make_graph_wsi module¶

class domid.dsets.make_graph_wsi.GraphConstructorWSI(graph_method, topk=7)[source]¶

Bases: GraphConstructor

Class to construct a graph from features of WSI images. It is used only when training the SDCN model on the WSI dataset.

__init__(graph_method, topk=7)[source]¶: Initializer of GraphConstructor. :param graph_method: the method to calculate distance between features; one of ‘heat’, ‘cos’, ‘ncos’, ‘patch_distance’. :param topk: number of connections per image

distance_calc_wsi(features=None, coordinates=None)[source]¶: This function is used to calculate distance between features. :param features: the batch of features from the dataset :param coordinates: if the image (patch in the batch) has coordinates specified, the distance between patches can be calculated from those coordinates :return: distance matrix between features of the batch of images with the shape of (num_img, num_img)

connection_calc(features, region_labels)[source]¶: This function is used to calculate the connection pairs between images for all the batches of dataset. :param features: flattened image from the batch of dataset :param region_labels: spatial information about the patches, used to calculate the distance between them (e.g. the string ‘1Carcinoma_coord_39100_39573_patchnumber_98_xy_0_0.png’) :return: indices of top k connections per each image in the batch (shape: (num_img*self.topk, 2))
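A sketch of how spatial information might be recovered from a region label such as the example string above. The regular expression, the helper names, and the interpretation of the numbers (the pair after coord as a position, the pair after xy as an offset) are assumptions; the second file name in the demo is made up.

```python
import math
import re

# Pattern for names like '..._coord_39100_39573_patchnumber_98_xy_0_0.png'.
PATCH_NAME = re.compile(r"coord_(\d+)_(\d+)_patchnumber_(\d+)_xy_(\d+)_(\d+)")


def parse_region_label(name):
    """Extract the integer fields embedded in a patch file name."""
    match = PATCH_NAME.search(name)
    if match is None:
        raise ValueError(f"unrecognized region label: {name}")
    cx, cy, patch, x, y = (int(group) for group in match.groups())
    return {"coord": (cx, cy), "patchnumber": patch, "xy": (x, y)}


def patch_distance(name_a, name_b):
    """Euclidean distance between the coord fields of two patch names."""
    ax, ay = parse_region_label(name_a)["coord"]
    bx, by = parse_region_label(name_b)["coord"]
    return math.hypot(ax - bx, ay - by)


first = "1Carcinoma_coord_39100_39573_patchnumber_98_xy_0_0.png"
second = "1Carcinoma_coord_39103_39577_patchnumber_99_xy_0_1.png"  # made-up name
parsed = parse_region_label(first)
distance = patch_distance(first, second)
```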

construct_graph(features, img_ids, experiment_folder)[source]¶: This function is used to construct the graph for all the batches of dataset. This is called in the trainer function of SDCN model. :param features: flattened image from the batch of dataset :img_ids: :experiment_folder: :return: the adjacency matrix for one batch of data

Module contents¶