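Datasets

Torchvision provides many built-in datasets in the torchvision.datasets module, as well as utility classes for building your own datasets.

Built-in datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e., they implement the __getitem__ and __len__ methods. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example: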

import torch
import torchvision

imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=4)  # number of loader worker processes (e.g. taken from a CLI argument)

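All the datasets have almost similar APIs. They all have two common arguments: transform and target_transform, which transform the input and the target respectively. You can also create your own datasets using the provided base classes.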

Image classification

`Caltech101`(root[, target_type, transform, ...])	Caltech 101 Dataset.
`Caltech256`(root[, transform, ...])	Caltech 256 Dataset.
`CelebA`(root[, split, target_type, ...])	Large-scale CelebFaces Attributes (CelebA) Dataset.
`CIFAR10`(root[, train, transform, ...])	CIFAR10 Dataset.
`CIFAR100`(root[, train, transform, ...])	CIFAR100 Dataset.
`Country211`(root[, split, transform, ...])	The Country211 Data Set from OpenAI.
`DTD`(root[, split, partition, transform, ...])	Describable Textures Dataset (DTD).
`EMNIST`(root, split, **kwargs)	EMNIST Dataset.
`EuroSAT`(root[, transform, target_transform, ...])	RGB version of the EuroSAT Dataset.
`FakeData`([size, image_size, num_classes, ...])	A fake dataset that returns randomly generated images as PIL images.
`FashionMNIST`(root[, train, transform, ...])	Fashion-MNIST Dataset.
`FER2013`(root[, split, transform, ...])	FER2013 Dataset.
`FGVCAircraft`(root[, split, ...])	FGVC Aircraft Dataset.
`Flickr8k`(root, ann_file[, transform, ...])	Flickr8k Entities Dataset.
`Flickr30k`(root, ann_file[, transform, ...])	Flickr30k Entities Dataset.
`Flowers102`(root[, split, transform, ...])	Oxford 102 Flower Dataset.
`Food101`(root[, split, transform, ...])	The Food-101 Data Set.
`GTSRB`(root[, split, transform, ...])	German Traffic Sign Recognition Benchmark (GTSRB) Dataset.
`INaturalist`(root[, version, target_type, ...])	iNaturalist Dataset.
`ImageNet`(root[, split])	ImageNet 2012 Classification Dataset.
`Imagenette`(root[, split, size, download, ...])	Imagenette image classification dataset.
`KMNIST`(root[, train, transform, ...])	Kuzushiji-MNIST Dataset.
`LFWPeople`(root[, split, image_set, ...])	LFW Dataset.
`LSUN`(root[, classes, transform, ...])	LSUN Dataset.
`MNIST`(root[, train, transform, ...])	MNIST Dataset.
`Omniglot`(root[, background, transform, ...])	Omniglot Dataset.
`OxfordIIITPet`(root[, split, target_types, ...])	Oxford-IIIT Pet Dataset.
`Places365`(root[, split, ...])	Places365 classification dataset.
`PCAM`(root[, split, transform, ...])	PCAM Dataset.
`QMNIST`(root[, what, compat, train])	QMNIST Dataset.
`RenderedSST2`(root[, split, transform, ...])	The Rendered SST2 Dataset.
`SEMEION`(root[, transform, target_transform, ...])	SEMEION Dataset.
`SBU`(root[, transform, target_transform, ...])	SBU Captioned Photo Dataset.
`StanfordCars`(root[, split, transform, ...])	Stanford Cars Dataset.
`STL10`(root[, split, folds, transform, ...])	STL10 Dataset.
`SUN397`(root[, transform, target_transform, ...])	The SUN397 Data Set.
`SVHN`(root[, split, transform, ...])	SVHN Dataset.
`USPS`(root[, train, transform, ...])	USPS Dataset.

Image detection or segmentation

`CocoDetection`(root, annFile[, transform, ...])	MS Coco Detection Dataset.
`CelebA`(root[, split, target_type, ...])	Large-scale CelebFaces Attributes (CelebA) Dataset.
`Cityscapes`(root[, split, mode, target_type, ...])	Cityscapes Dataset.
`Kitti`(root[, train, transform, ...])	KITTI Dataset.
`OxfordIIITPet`(root[, split, target_types, ...])	Oxford-IIIT Pet Dataset.
`SBDataset`(root[, image_set, mode, download, ...])	Semantic Boundaries Dataset.
`VOCSegmentation`(root[, year, image_set, ...])	Pascal VOC Segmentation Dataset.
`VOCDetection`(root[, year, image_set, ...])	Pascal VOC Detection Dataset.
`WIDERFace`(root[, split, transform, ...])	WIDERFace Dataset.

Optical Flow

`FlyingChairs`(root[, split, transforms])	FlyingChairs Dataset for optical flow.
`FlyingThings3D`(root[, split, pass_name, ...])	FlyingThings3D dataset for optical flow.
`HD1K`(root[, split, transforms])	HD1K dataset for optical flow.
`KittiFlow`(root[, split, transforms])	KITTI dataset for optical flow (2015).
`Sintel`(root[, split, pass_name, transforms])	Sintel Dataset for optical flow.

Stereo Matching

`CarlaStereo`(root[, transforms])	Carla simulator data linked in the CREStereo github repo.
`Kitti2012Stereo`(root[, split, transforms])	KITTI dataset from the 2012 stereo evaluation benchmark.
`Kitti2015Stereo`(root[, split, transforms])	KITTI dataset from the 2015 stereo evaluation benchmark.
`CREStereo`(root[, transforms])	Synthetic dataset used in training the CREStereo architecture.
`FallingThingsStereo`(root[, variant, transforms])	FallingThings dataset.
`SceneFlowStereo`(root[, variant, pass_name, ...])	Dataset interface for Scene Flow datasets.
`SintelStereo`(root[, pass_name, transforms])	Sintel Stereo Dataset.
`InStereo2k`(root[, split, transforms])	InStereo2k dataset.
`ETH3DStereo`(root[, split, transforms])	ETH3D Low-Res Two-View dataset.
`Middlebury2014Stereo`(root[, split, ...])	Publicly available scenes from the Middlebury dataset, 2014 version.

Image pairs

`LFWPairs`(root[, split, image_set, ...])	LFW Dataset.
`PhotoTour`(root, name[, train, transform, ...])	Multi-view Stereo Correspondence Dataset.

Image captioning

`CocoCaptions`(root, annFile[, transform, ...])	MS Coco Captions Dataset.

Video classification

`HMDB51`(root, annotation_path, frames_per_clip)	HMDB51 dataset.
`Kinetics`(root, frames_per_clip[, ...])	Generic Kinetics dataset.
`UCF101`(root, annotation_path, frames_per_clip)	UCF101 dataset.

Video prediction

`MovingMNIST`(root[, split, split_ratio, ...])	MovingMNIST Dataset.

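Base classes for custom datasets

`DatasetFolder`(root, loader[, extensions, ...])	A generic data loader.
`ImageFolder`(root[, transform, ...])	A generic data loader where the images are arranged in one subdirectory per class by default.
`VisionDataset`([root, transforms, transform, ...])	Base class for making datasets which are compatible with torchvision.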

Transforms v2

`wrap_dataset_for_transforms_v2`(dataset[, ...])	Wrap a torchvision.dataset for usage with torchvision.transforms.v2.