vulcanai.datasets package¶

vulcanai.datasets.fashion module¶

class vulcanai.datasets.fashion.FashionData(root, train=True, transform=None, target_transform=None, download=False)¶

Bases: torch.utils.data.dataset.Dataset

Fashion-MNIST dataset, stored in the same format as MNIST (http://yann.lecun.com/exdb/mnist/).

Parameters:

root (string): Root directory of dataset where processed/training.pt and processed/test.pt exist.
train (bool, optional): If True, creates dataset from training.pt, otherwise from test.pt.
download (bool, optional): If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional): A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop.
target_transform (callable, optional): A function/transform that takes in the target and transforms it.

__init__(root, train=True, transform=None, target_transform=None, download=False)¶: Initialize self. See help(type(self)) for accurate signature.

download()¶: Download the Fashion-MNIST data (see urls) if it doesn’t exist in processed_folder already.

processed_folder = 'processed'¶

raw_folder = 'raw'¶

test_file = 'test.pt'¶

training_file = 'training.pt'¶

urls = ['http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz', 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz']¶

vulcanai.datasets.fashion.get_int(b)¶

vulcanai.datasets.fashion.parse_byte(b)¶

vulcanai.datasets.fashion.read_image_file(path)¶

vulcanai.datasets.fashion.read_label_file(path)¶
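
The four helpers above are listed without descriptions. Since the files in urls are IDX files (idx3-ubyte for images, idx1-ubyte for labels), the following is a minimal standard-library sketch of how the header of such a file can be decoded. It is not the module's implementation: get_int_sketch and read_idx_header are hypothetical names introduced for illustration, and the sketch assumes the usual IDX layout of a 4-byte big-endian magic number followed by one 4-byte big-endian size per dimension.

```python
import struct


def get_int_sketch(b):
    """Interpret a bytes object as a big-endian unsigned integer (hypothetical helper)."""
    return int.from_bytes(b, byteorder="big")


def read_idx_header(data):
    """Return (magic, dims) for the raw bytes of an IDX file (hypothetical helper)."""
    magic = get_int_sketch(data[:4])
    # Assumption: the low byte of the magic number is the number of dimensions.
    ndim = magic % 256
    dims = tuple(
        get_int_sketch(data[4 + 4 * i: 8 + 4 * i]) for i in range(ndim)
    )
    return magic, dims


# Build a tiny in-memory IDX3 "image file": 2 images of 2x3 pixels.
header = struct.pack(">IIII", 2051, 2, 2, 3)
payload = bytes(range(12))
magic, dims = read_idx_header(header + payload)
print(magic, dims)  # 2051 (2, 2, 3)
```

A real file from urls would first need to be decompressed (for example with gzip.open) before its bytes are passed to a function like this.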

vulcanai.datasets.multidataset module¶

Defines the MultiDataset Class

class vulcanai.datasets.multidataset.MultiDataset(dataset_tuples)¶

Bases: torch.utils.data.dataset.Dataset

Define a dataset for multi input networks.

Takes in a list of datasets, and whether or not their input_data and target data should be output.

Parameters:

dataset_tuples : list of tuples: Each tuple is (Dataset, use_data_boolean, use_target_boolean): the Dataset at index 0, a boolean for whether to include its input data at index 1, and a boolean for whether to include its target data at index 2. You can only specify one target at a time across all incoming datasets.

Returns:

multi_dataset : torch.utils.data.Dataset

__init__(dataset_tuples)¶: Initialize a dataset for multi input networks.
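
To make the shape of dataset_tuples concrete, here is a plain-Python stand-in that follows the description above. It is only a sketch and not vulcanai's implementation: MultiDatasetSketch is a hypothetical name, and the sketch assumes that every item of an underlying dataset is an (input_data, target) pair, that the combined length is that of the shortest dataset, and that an item is returned as (list_of_inputs, target). The real class may differ on all three points.

```python
class MultiDatasetSketch:
    """Plain-Python stand-in that combines several indexable datasets."""

    def __init__(self, dataset_tuples):
        # Each tuple: (dataset, use_data_boolean, use_target_boolean).
        targets = [ds for ds, _, use_target in dataset_tuples if use_target]
        if len(targets) > 1:
            raise ValueError("Only one dataset may supply the target.")
        self._inputs = [ds for ds, use_data, _ in dataset_tuples if use_data]
        self._target = targets[0] if targets else None
        # Assumption: the combined length is that of the shortest dataset.
        self._length = min(len(ds) for ds, _, _ in dataset_tuples)

    def __len__(self):
        return self._length

    def __getitem__(self, idx):
        if not 0 <= idx < self._length:
            raise IndexError(idx)
        # Assumption: every dataset item is an (input_data, target) pair.
        inputs = [ds[idx][0] for ds in self._inputs]
        target = self._target[idx][1] if self._target is not None else None
        return inputs, target


# Two toy "datasets" as lists of (input_data, target) pairs.
images = [("img0", 0), ("img1", 1), ("img2", 0)]
tabular = [([1.0, 2.0], 0), ([3.0, 4.0], 1), ([5.0, 6.0], 0)]

# Use the inputs of both datasets, but take the target only from the first.
multi = MultiDatasetSketch([(images, True, True), (tabular, True, False)])
print(len(multi), multi[1])  # 3 (['img1', [3.0, 4.0]], 1)
```

Requesting the target from more than one dataset raises an error in this sketch, which mirrors the one-target restriction stated for dataset_tuples.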

vulcanai.datasets.tabulardataset module¶

vulcanai.datasets.utils module¶

This file contains utility methods that may be useful to several dataset classes. check_split_ratio, stratify, rationed_split, and randomshuffler were all copied from torchtext because torchtext is not yet packaged for Anaconda and is therefore not yet a reasonable dependency. See https://github.com/pytorch/text/blob/master/torchtext/data/dataset.py

vulcanai.datasets.utils.check_split_ratio(split_ratio)¶

Check that the split ratio argument is not malformed.

Parameters:

split_ratio: desired split ratio, either a list of length 2 or 3, depending on whether the validation set is desired.

Returns:

split ratio as tuple
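
The check described above can be sketched as follows. This is a hedged approximation modeled on the torchtext function this module says it was copied from, not a copy of vulcanai's code: check_split_ratio_sketch is a hypothetical name, the (train, test, validation) ordering of the returned tuple is an assumption, and so are the acceptance of a single float and the normalization of ratios that do not sum to 1.

```python
def check_split_ratio_sketch(split_ratio):
    """Validate a split ratio and return it as a (train, test, validation) tuple."""
    if isinstance(split_ratio, float):
        # Assumption: a single float is the train share; the rest goes to test.
        if not 0.0 < split_ratio < 1.0:
            raise ValueError("Split ratio must be strictly between 0 and 1.")
        return (split_ratio, 1.0 - split_ratio, 0.0)

    if isinstance(split_ratio, list):
        if len(split_ratio) not in (2, 3):
            raise ValueError("Split ratio list must have length 2 or 3.")
        # Assumption: relative sizes are normalized so that they sum to 1.
        total = sum(split_ratio)
        ratios = [float(r) / total for r in split_ratio]
        if len(ratios) == 2:
            # No validation set requested.
            ratios.append(0.0)
        return tuple(ratios)

    raise ValueError("Split ratio must be a float or a list.")


print(check_split_ratio_sketch([0.8, 0.2]))  # (0.8, 0.2, 0.0)
print(check_split_ratio_sketch([6, 2, 2]))   # (0.6, 0.2, 0.2)
```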

vulcanai.datasets.utils.clean_dataframe(df)¶: Goes through the dataframe df and ensures that all nonsensical values are encoded as NaNs.

vulcanai.datasets.utils.rationed_split(df, train_ratio, test_ratio, validation_ratio)¶

Function to split a dataset given ratios. Assumes the ratios given are valid (checked using check_split_ratio).

Parameters:

df: Dataframe: The dataframe you want to split
train_ratio: float: proportion of the dataset that will go to the train split, between 0 and 1
test_ratio: float: proportion of the dataset that will go to the test split, between 0 and 1
validation_ratio: float: proportion of the dataset that will go to the validation split, between 0 and 1

Returns:

indices: tuple of lists of indices.
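
A ratio-based split of this kind can be sketched with the standard library alone. The sketch below is an illustration under stated assumptions, not vulcanai's implementation: rationed_split_sketch is a hypothetical name, it takes a row count and a seed instead of a dataframe, it shuffles the row indices before slicing, and it returns the index lists in (train, test, validation) order.

```python
import random


def rationed_split_sketch(n_rows, train_ratio, test_ratio, validation_ratio, seed=0):
    """Split the indices 0..n_rows-1 into train, test and validation lists."""
    indices = list(range(n_rows))
    # Assumption: rows are shuffled before they are assigned to splits.
    random.Random(seed).shuffle(indices)

    train_len = int(round(train_ratio * n_rows))
    if not validation_ratio:
        # Without a validation split, test takes everything that is left,
        # which avoids losing a row to rounding.
        test_len = n_rows - train_len
    else:
        test_len = int(round(test_ratio * n_rows))

    return (
        indices[:train_len],                       # train
        indices[train_len:train_len + test_len],   # test
        indices[train_len + test_len:],            # validation
    )


train_idx, test_idx, val_idx = rationed_split_sketch(10, 0.6, 0.2, 0.2)
print(len(train_idx), len(test_idx), len(val_idx))  # 6 2 2
```

With a dataframe, index lists like these could then be used to select rows for each split, for example through positional indexing.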
