AutoMM Detection - Quick Start on a Tiny COCO Format Dataset
In this section, our goal is to fast finetune a pretrained model on a small dataset in COCO format, and evaluate it on its test set. Both the training and test sets are in COCO format. See Convert Data to COCO Format for how to convert other datasets to COCO format.
Setup Imports
Make sure mmcv and mmdet are installed:
#!pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 # To use object detection, downgrade the torch version if it's >=2.2
!mim install "mmcv==2.1.0" # For Google Colab, use the line below instead to install mmcv
#!pip install "mmcv==2.1.0" -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1.0/index.html
!pip install "mmdet==3.2.0"
To start, let's import MultiModalPredictor:
from autogluon.multimodal import MultiModalPredictor
/home/ci/opt/venv/lib/python3.11/site-packages/mmengine/optim/optimizer/zero_optimizer.py:11: DeprecationWarning: `TorchScript` support for functional optimizers is deprecated and will be removed in a future PyTorch release. Consider using the `torch.compile` optimizer instead.
from torch.distributed.optim import \
And also import some other packages that will be used in this tutorial:
import os
import time
from autogluon.core.utils.loaders import load_zip
Download Data
We have the sample dataset ready in the cloud. Let's download it:
zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip"
download_dir = "./tiny_motorbike_coco"
load_zip.unzip(zip_file, unzip_dir=download_dir)
data_dir = os.path.join(download_dir, "tiny_motorbike")
train_path = os.path.join(data_dir, "Annotations", "trainval_cocoformat.json")
test_path = os.path.join(data_dir, "Annotations", "test_cocoformat.json")
Downloading ./tiny_motorbike_coco/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip...
100%|██████████| 21.8M/21.8M [00:00<00:00, 96.1MiB/s]
While using COCO format datasets, the input is the json annotation file of the dataset split.
In this example, trainval_cocoformat.json is the annotation file of the train-and-validate split,
and test_cocoformat.json is the annotation file of the test split.
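The structure of such an annotation file can be illustrated with a minimal hand-built example using only the standard library (the image and box entries below are hypothetical, not taken from the tiny_motorbike dataset):

```python
import json

# A minimal COCO-format annotation file has three key sections:
# "images", "annotations", and "categories".
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels; area and iscrowd are required.
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 200, 150], "area": 200 * 150, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "motorbike"},
    ],
}

with open("example_cocoformat.json", "w") as f:
    json.dump(coco, f)

# Reading it back: the predictor infers the class list from "categories".
with open("example_cocoformat.json") as f:
    loaded = json.load(f)
print([c["name"] for c in loaded["categories"]])  # ['motorbike']
```

This is the same layout that `sample_data_path` relies on below when the predictor infers the dataset's categories.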
Create the MultiModalPredictor
We select the "medium_quality" presets, which uses a YOLOX-large model pretrained on the COCO dataset. This preset is fast to finetune or run inference with, and easy to deploy. We also provide the "high_quality" presets with a DINO-Resnet50 model and the "best_quality" presets with a DINO-SwinL model, which offer higher performance but are slower and require more GPU memory.
presets = "medium_quality"
We create the MultiModalPredictor with the selected presets.
We need to specify the problem_type to "object_detection",
and also provide a sample_data_path so the predictor can infer the categories of the dataset.
Here we provide the train_path; using any other split of the same dataset also works.
We also provide a path to save the predictor.
If path is not specified, the predictor will be saved to an automatically generated, timestamped directory under AutogluonModels.
# Init predictor
import uuid
model_path = f"./tmp/{uuid.uuid4().hex}-quick_start_tutorial_temp_save"
predictor = MultiModalPredictor(
problem_type="object_detection",
sample_data_path=train_path,
presets=presets,
path=model_path,
)
Finetune the Model
The learning rate, number of epochs, and batch size are included in the presets, so there is no need to specify them. Note that we use a two-stage learning rate option by default during finetuning: the learning rate of the model head is 100x larger than that of the backbone. Using a high learning rate only on the head layers makes the model converge faster during finetuning, and it usually gives better performance as well, especially on small datasets with hundreds or thousands of images. We also time the fitting process to better understand the speed. We run it on a g4.2xlarge EC2 machine on AWS, and part of the command outputs are shown below:
start = time.time()
predictor.fit(train_path) # Fit
train_end = time.time()
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Downloading yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth from https://download.openmmlab.com/mmdetection/v2.0/yolox/yolox_l_8x8_300e_coco/yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth...
Loads checkpoint by local backend from path: yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth
The model and loaded state dict do not match exactly
size mismatch for bbox_head.multi_level_conv_cls.0.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([10, 256, 1, 1]).
size mismatch for bbox_head.multi_level_conv_cls.0.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([10]).
size mismatch for bbox_head.multi_level_conv_cls.1.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([10, 256, 1, 1]).
size mismatch for bbox_head.multi_level_conv_cls.1.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([10]).
size mismatch for bbox_head.multi_level_conv_cls.2.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([10, 256, 1, 1]).
size mismatch for bbox_head.multi_level_conv_cls.2.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([10]).
=================== System Info ===================
AutoGluon Version: 1.2b20241127
Python Version: 3.11.9
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Tue Sep 24 10:00:37 UTC 2024
CPU Count: 8
Pytorch Version: 2.5.1+cu124
CUDA Version: 12.4
Memory Avail: 28.42 GB / 30.95 GB (91.8%)
Disk Space Avail: WARNING, an exception (FileNotFoundError) occurred while attempting to get available disk space. Consider opening a GitHub Issue.
===================================================
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
AutoMM starts to create your model. ✨✨✨
To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save
```
Seed set to 0
100%|█████████▉| 216M/217M [00:22<00:00, 12.3MiB/s]
/home/ci/opt/venv/lib/python3.11/site-packages/mmengine/runner/checkpoint.py:347: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(filename, map_location=map_location)
GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: Tesla T4
GPU 0 Memory: 0.43GB/15.0GB (Used/Total)
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params | Mode
-------------------------------------------------------------------------------
0 | model | MMDetAutoModelForObjectDetection | 54.2 M | train
1 | validation_metric | MeanAveragePrecision | 0 | train
-------------------------------------------------------------------------------
54.2 M Trainable params
0 Non-trainable params
54.2 M Total params
216.620 Total estimated model params size (MB)
592 Modules in train mode
0 Modules in eval mode
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/backbones/csp_darknet.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
/home/ci/opt/venv/lib/python3.11/site-packages/torch/functional.py:534: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3595.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/task_modules/assigners/sim_ota_assigner.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
Epoch 2, global step 15: 'val_map' reached 0.40029 (best 0.40029), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/epoch=2-step=15.ckpt' as top 1
Epoch 5, global step 30: 'val_map' reached 0.42549 (best 0.42549), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/epoch=5-step=30.ckpt' as top 1
Epoch 8, global step 45: 'val_map' was not in top 1
Epoch 11, global step 60: 'val_map' was not in top 1
Epoch 14, global step 75: 'val_map' was not in top 1
Epoch 17, global step 90: 'val_map' reached 0.42773 (best 0.42773), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/epoch=17-step=90.ckpt' as top 1
Epoch 20, global step 105: 'val_map' reached 0.45249 (best 0.45249), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/epoch=20-step=105.ckpt' as top 1
Epoch 23, global step 120: 'val_map' reached 0.45307 (best 0.45307), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/epoch=23-step=120.ckpt' as top 1
Epoch 26, global step 135: 'val_map' was not in top 1
Epoch 29, global step 150: 'val_map' was not in top 1
Epoch 32, global step 165: 'val_map' was not in top 1
Epoch 35, global step 180: 'val_map' was not in top 1
Epoch 38, global step 195: 'val_map' was not in top 1
/home/ci/autogluon/multimodal/src/autogluon/multimodal/utils/checkpoint.py:63: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
avg_state_dict = torch.load(checkpoint_paths[0], map_location=torch.device("cpu"))["state_dict"] # nosec B614
AutoMM has created your model. 🎉🎉🎉
To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save")
```
If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
Notice that at the end of each progress bar, if the checkpoint at the current stage is saved,
it prints the model's save path.
In this example, it's the model_path we specified above (ending in -quick_start_tutorial_temp_save).
Print out the time, and we can see that it's fast!
print("This finetuning takes %.2f seconds." % (train_end - start))
This finetuning takes 444.90 seconds.
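The fit above used the training schedule baked into the presets. To adjust it, fit also accepts a hyperparameters override; a minimal sketch follows. The key names use AutoGluon's "optimization.*" convention, but the exact names and defaults vary across versions, so treat them as assumptions and consult the customization docs for your installed release:

```python
# Hypothetical override of the preset training settings; verify these keys
# against the customization docs for your AutoGluon version before use.
custom_hyperparameters = {
    "optimization.learning_rate": 1e-4,  # base (backbone) learning rate
    "optimization.max_epochs": 20,       # fewer epochs for a quicker run
}

# The override would then be passed to fit, e.g.:
# predictor.fit(train_path, hyperparameters=custom_hyperparameters)
print(custom_hyperparameters)
```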
Evaluation
To evaluate the model we just trained, run the following code.
The evaluation results are shown in the command line output. The first line is the mAP under the COCO standard, and the second line is the mAP under the VOC standard (or mAP50). For more details about these metrics, see COCO's evaluation guideline. Note that to present a fast finetune, we used the "medium_quality" presets; you can get better results on this dataset by simply using the "high_quality" or "best_quality" presets, or by customizing your own model and hyperparameter settings (see Customization), with some other examples in Fast Finetune on COCO Format Dataset and High Performance Finetune on COCO Format Dataset.
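The mAP metrics reported below are all built on the intersection-over-union (IoU) between predicted and ground-truth boxes: a prediction counts as correct only if its IoU with a ground-truth box exceeds a threshold (0.50 for mAP50, a 0.50:0.95 sweep for the COCO-standard mAP). A minimal, self-contained IoU computation over COCO-style [x, y, width, height] boxes:

```python
def iou(box_a, box_b):
    """IoU of two boxes given in COCO [x, y, width, height] format."""
    ax1, ay1 = box_a[0], box_a[1]
    ax2, ay2 = box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1 = box_b[0], box_b[1]
    bx2, by2 = box_b[0] + box_b[2], box_b[1] + box_b[3]

    # Width/height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

# Two 100x100 boxes offset by 50 pixels horizontally:
# intersection = 50*100 = 5000, union = 10000 + 10000 - 5000 = 15000.
print(iou([0, 0, 100, 100], [50, 0, 100, 100]))  # ~0.3333
```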
predictor.evaluate(test_path)
eval_end = time.time()
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
saving file at /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101336/object_detection_result_cache.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.09s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.362
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.515
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.400
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.287
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.470
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.742
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.245
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.411
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.434
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.374
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.529
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.811
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/backbones/csp_darknet.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
A new predictor save path is created. This is to prevent you to overwrite previous predictor saved here. You could check current save path at predictor._save_path. If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_101336"
Print out the evaluation time:
print("The evaluation takes %.2f seconds." % (eval_end - train_end))
The evaluation takes 1.83 seconds.
We can load a new predictor with the previous save_path, and we can also reset the number of GPUs to use if not all the devices are available:
# Load and reset num_gpus
new_predictor = MultiModalPredictor.load(model_path)
new_predictor.set_num_gpus(1)
Load pretrained checkpoint: /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/tmp/6976828b4a1c47ada9615e6079ea1031-quick_start_tutorial_temp_save/model.ckpt
/home/ci/autogluon/multimodal/src/autogluon/multimodal/learners/base.py:2117: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
state_dict = torch.load(path, map_location=torch.device("cpu"))["state_dict"] # nosec B614
Evaluating the new predictor gives us exactly the same result:
# Evaluate new predictor
new_predictor.evaluate(test_path)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
saving file at /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101340/object_detection_result_cache.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.09s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.362
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.515
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.400
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.287
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.470
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.742
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.245
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.411
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.434
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.374
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.529
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.811
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/backbones/csp_darknet.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
A new predictor save path is created. This is to prevent you to overwrite previous predictor saved here. You could check current save path at predictor._save_path. If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_101340"
{'map': 0.36155918359870143,
'mean_average_precision': 0.36155918359870143,
'map_50': 0.5148674017830767,
'map_75': 0.3996074255623752,
'map_small': 0.2867007327076268,
'map_medium': 0.4702258124119172,
'map_large': 0.7422723199571868,
'mar_1': 0.24470237256283764,
'mar_10': 0.4106356589147287,
'mar_100': 0.4340122151750059,
'mar_small': 0.37416666666666665,
'mar_medium': 0.5293650793650794,
'mar_large': 0.8114619883040936}
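The evaluation returns a plain Python dict, so individual COCO metrics can be pulled out and reformatted directly. Below is a minimal sketch, assuming `metrics` holds the dict returned by `new_predictor.evaluate(test_path)`; the values are copied from the output above for illustration.

```python
# Assumed: `metrics` is the dict returned by new_predictor.evaluate(test_path).
# The values below are copied from the evaluation output above for illustration.
metrics = {
    "map": 0.36155918359870143,
    "map_50": 0.5148674017830767,
    "map_75": 0.3996074255623752,
}

# COCO metrics are fractions in [0, 1]; report them as rounded percentages.
summary = {name: round(value * 100, 1) for name, value in metrics.items()}
print(summary)  # {'map': 36.2, 'map_50': 51.5, 'map_75': 40.0}
```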
To learn how to set hyperparameters and finetune a model for higher performance, see AutoMM Detection - High Performance Finetune on COCO Format Dataset.
Inference¶
Now that we have finished setting up, finetuning, and evaluating the model, this section walks through the inference process in detail. Specifically, we show the steps to run predictions with the model and visualize the results.
To run inference on the entire test set, execute:
pred = predictor.predict(test_path)
print(pred)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
[<InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 0, 7, 3, 7, 8, 3])
bboxes: tensor([[ 1.9047e+02, 1.1102e+02, 2.5992e+02, 2.3801e+02],
[ 1.5984e+02, 1.7375e+02, 2.7844e+02, 2.4422e+02],
[-1.7180e-01, 2.2358e+02, 4.2115e+01, 3.1275e+02],
[ 1.1251e-01, 1.6193e+02, 4.1098e+01, 3.1541e+02],
[ 1.0890e+00, 1.7085e+02, 4.0806e+01, 3.3579e+02],
[-8.1238e-02, 2.2299e+02, 1.0958e+01, 3.2037e+02],
[ 6.6293e-01, 1.3410e+02, 2.9220e+01, 3.2215e+02]])
scores: tensor([0.9170, 0.8789, 0.5723, 0.2634, 0.0267, 0.0212, 0.0183])
) at 0x7f4c73ea58d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 2, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8])
bboxes: tensor([[115.5541, 155.2838, 180.5397, 201.1669],
[208.3319, 134.1505, 273.6994, 219.9552],
[221.0634, 106.2617, 273.0772, 185.3087],
[279.9424, 314.5258, 294.6670, 333.1045],
[435.7524, 277.8247, 450.1851, 328.3760],
[464.1591, 270.9555, 487.4034, 325.0833],
[421.2425, 267.0117, 438.1325, 321.9919],
[454.4393, 275.5787, 465.8732, 325.5410],
[382.3109, 271.1247, 398.9391, 299.5091],
[267.9264, 317.0689, 280.9017, 333.2973],
[397.5775, 277.3991, 417.2662, 306.9142],
[447.4636, 270.4652, 458.7864, 328.3095],
[360.3695, 317.2955, 376.7399, 333.0707],
[404.5516, 264.6689, 418.8860, 286.0319],
[447.9741, 269.1672, 463.7447, 324.1357],
[489.5655, 274.3198, 500.2783, 328.7542],
[ 8.7118, 174.4055, 19.5108, 209.0135],
[411.4486, 276.6526, 426.0514, 315.8685],
[ 0.6016, 164.3757, 10.8059, 213.1807],
[464.2011, 270.8110, 477.9864, 323.2735],
[ 0.9583, 161.8075, 9.1979, 247.7981],
[ 9.2450, 171.2233, 109.7003, 251.2802],
[488.1668, 249.8935, 500.1144, 273.4480],
[475.4756, 261.8313, 486.2431, 286.1335],
[432.0408, 271.1862, 439.8342, 291.6307],
[ 4.0836, 169.9634, 17.4252, 212.2831],
[431.8996, 269.7989, 440.7567, 286.7645],
[419.2566, 266.3104, 429.1809, 282.4361],
[464.8848, 271.2646, 476.5215, 299.7601],
[381.6506, 143.6793, 398.0370, 176.0320],
[489.6107, 269.4374, 501.0143, 298.4604],
[ 1.1278, 186.6927, 11.9215, 276.4587],
[493.7070, 273.5195, 500.0430, 302.9770],
[ 0.6661, 175.4241, 9.2339, 272.4844],
[ 1.7888, 171.9483, 12.4934, 213.8158],
[480.5421, 266.1275, 493.6767, 326.3936]])
scores: tensor([0.9355, 0.9121, 0.8867, 0.8340, 0.8115, 0.7969, 0.7305, 0.5557, 0.5347,
0.5337, 0.4910, 0.3965, 0.3853, 0.2722, 0.2361, 0.2330, 0.1621, 0.1302,
0.0889, 0.0753, 0.0684, 0.0615, 0.0511, 0.0414, 0.0389, 0.0243, 0.0235,
0.0209, 0.0189, 0.0173, 0.0164, 0.0148, 0.0141, 0.0141, 0.0129, 0.0112])
) at 0x7f4c73d67710>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([3, 8, 8, 8, 7, 7, 7, 7, 8, 8, 8, 8, 8, 7, 8, 9, 3, 8, 8])
bboxes: tensor([[ 4.5661e+02, 1.0522e+02, 4.9964e+02, 2.3521e+02],
[ 1.9646e+02, 8.1145e+01, 3.0744e+02, 3.2784e+02],
[ 2.7904e+02, 5.4151e+01, 3.3424e+02, 2.9702e+02],
[ 3.2966e+02, 1.9037e+01, 4.3597e+02, 3.8292e+02],
[ 1.1785e+00, -5.5230e-02, 5.0234e+02, 3.6021e+02],
[ 2.1811e+01, 1.4723e+02, 4.0085e+02, 3.6918e+02],
[ 2.1312e+02, 4.6128e+00, 5.0602e+02, 2.7234e+02],
[ 6.2347e+00, 6.1531e+00, 2.8517e+02, 3.4658e+02],
[ 1.9330e+02, 2.1784e+00, 5.1373e+02, 2.5173e+02],
[ 9.8422e+00, 1.7471e-01, 4.8899e+02, 3.5764e+02],
[ 7.3587e+00, 2.2232e+00, 2.9069e+02, 3.4465e+02],
[ 4.8249e+01, 4.4013e+01, 5.7219e+01, 6.9073e+01],
[ 2.0527e+02, 5.2552e+01, 3.3301e+02, 3.1210e+02],
[ 3.5634e-01, 5.2616e+00, 2.7230e+02, 1.9474e+02],
[ 2.6791e+02, 3.2485e-01, 5.0006e+02, 2.0163e+02],
[ 4.5059e+02, 1.6641e+01, 4.8535e+02, 5.8896e+01],
[ 4.6272e+02, 4.0970e+01, 5.0056e+02, 2.3423e+02],
[ 2.7003e+02, 4.5432e+01, 4.4442e+02, 3.7410e+02],
[ 3.8556e+02, 5.8893e+00, 4.9959e+02, 2.5642e+02]])
scores: tensor([0.9341, 0.8955, 0.8354, 0.8262, 0.7393, 0.7231, 0.3301, 0.2507, 0.2502,
0.1166, 0.1108, 0.0916, 0.0564, 0.0316, 0.0298, 0.0291, 0.0283, 0.0109,
0.0108])
) at 0x7f4c73d674d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 8, 7, 7, 7, 8, 7, 7, 7, 8, 7, 8, 8, 7, 8, 8, 7, 8, 7, 7, 8, 7, 7])
bboxes: tensor([[-3.2522e+01, 1.9648e+01, 5.0830e+02, 3.2918e+02],
[-1.0149e+01, 1.9682e+01, 4.8906e+02, 1.5610e+02],
[ 4.2916e-02, 2.4991e-01, 1.8146e+01, 5.7644e+01],
[-1.4898e+01, 1.6539e+01, 5.0318e+02, 2.2969e+02],
[ 3.6231e+02, 5.7808e+00, 5.0175e+02, 1.6082e+02],
[ 3.0306e+02, 2.4590e+01, 5.0085e+02, 1.6575e+02],
[ 3.8935e+02, 1.6814e-01, 4.4659e+02, 5.7775e+01],
[ 1.7842e+02, 1.7653e+01, 5.0010e+02, 1.7269e+02],
[ 4.7847e+02, 2.2646e+01, 5.0043e+02, 1.0565e+02],
[-1.4660e+00, 4.5003e+00, 3.0850e+02, 1.6014e+02],
[ 3.8304e+02, -8.8226e-01, 4.9977e+02, 1.4335e+02],
[ 4.0527e+02, 2.9020e+01, 5.0020e+02, 1.5859e+02],
[ 3.4958e+02, 1.5321e+00, 4.4886e+02, 1.0605e+02],
[ 2.0165e+02, 9.2959e-01, 2.8624e+02, 1.1046e+02],
[-1.0469e+00, 4.9753e-01, 4.8659e+02, 1.1549e+02],
[ 3.8168e+02, 2.6241e-01, 4.5425e+02, 1.3135e+02],
[ 3.5631e+02, -2.2387e+00, 4.5775e+02, 1.5682e+02],
[ 4.7834e+02, 2.2733e+01, 5.0056e+02, 1.6898e+02],
[ 7.2643e+01, 1.6134e+01, 8.9467e+01, 5.6123e+01],
[ 3.1649e+02, 1.3990e+01, 4.9835e+02, 2.8344e+02],
[-9.3102e-03, 2.7952e+01, 2.4376e+02, 1.7148e+02],
[ 3.8695e+02, 3.1417e+00, 4.1540e+02, 4.8059e+01],
[-6.6315e-01, 1.6343e+02, 5.4814e+01, 3.2786e+02],
[-5.2468e-01, 7.2292e+01, 1.1947e+02, 1.6221e+02]])
scores: tensor([0.9238, 0.2776, 0.2112, 0.0903, 0.0499, 0.0357, 0.0349, 0.0342, 0.0259,
0.0239, 0.0218, 0.0193, 0.0188, 0.0177, 0.0168, 0.0157, 0.0152, 0.0148,
0.0138, 0.0136, 0.0121, 0.0115, 0.0115, 0.0102])
) at 0x7f4c73e81d50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 7, 7, 8, 7, 8, 8, 7, 8, 8, 8, 8, 8, 8, 8, 7, 8, 7, 7, 8, 7, 8, 7,
8, 7, 7, 7, 8, 7, 7, 7, 7])
bboxes: tensor([[ 94.9157, 198.2999, 139.0687, 225.5283],
[273.4020, 192.0842, 305.8949, 216.9002],
[331.7704, 188.6993, 360.4171, 209.1523],
[218.4268, 189.7875, 251.8857, 217.6344],
[111.5938, 185.9720, 134.5000, 221.4499],
[161.8049, 189.9249, 212.2185, 220.6220],
[338.4800, 182.4427, 356.4418, 204.8620],
[177.6422, 177.9988, 196.5766, 213.4075],
[140.6445, 200.9436, 174.5899, 222.1033],
[147.8450, 185.8573, 169.3425, 219.6115],
[291.9177, 177.2163, 307.6917, 209.1118],
[233.5325, 176.8208, 251.2331, 211.8511],
[281.9614, 183.4100, 299.6793, 211.1213],
[285.9812, 178.8920, 302.6907, 210.7564],
[227.8931, 180.2697, 247.1069, 212.6991],
[294.3336, 179.3184, 308.7914, 204.4707],
[281.7528, 186.4072, 314.3409, 210.2725],
[226.2992, 185.1532, 244.0133, 210.3546],
[297.9124, 187.1051, 315.3689, 208.5980],
[256.7986, 283.7644, 498.6702, 373.2669],
[280.0393, 182.7196, 295.3513, 206.3429],
[231.7640, 245.1018, 444.7985, 304.8982],
[188.9067, 179.5305, 202.3042, 213.0476],
[281.4845, 185.8814, 304.4530, 213.5327],
[417.7891, 269.7563, 500.1797, 373.6031],
[260.6064, 343.9363, 351.8936, 374.8137],
[222.8555, 190.4381, 257.6133, 213.0775],
[259.5873, 333.6486, 498.2252, 374.5545],
[285.9018, 274.8684, 496.9107, 373.9597],
[343.1176, 249.3008, 445.5543, 300.3086],
[467.9727, 304.6523, 500.7773, 374.6446],
[264.0222, 341.9602, 418.3997, 374.4461],
[239.6679, 267.3108, 356.4259, 293.2361]])
scores: tensor([0.9131, 0.9126, 0.8979, 0.8662, 0.8496, 0.8447, 0.8237, 0.8027, 0.7969,
0.7822, 0.7773, 0.7471, 0.7456, 0.7056, 0.6768, 0.6719, 0.6567, 0.6514,
0.5469, 0.4600, 0.3318, 0.2878, 0.1995, 0.1854, 0.0976, 0.0856, 0.0618,
0.0585, 0.0284, 0.0229, 0.0172, 0.0172, 0.0118])
) at 0x7f4c73e95f10>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 7, 8, 8, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 3, 8, 7,
8, 8, 8, 8, 8, 8, 7, 8])
bboxes: tensor([[271.4857, 79.5765, 382.0300, 241.1644],
[253.8519, 131.5258, 382.0856, 316.0651],
[320.6353, 59.3875, 388.7397, 136.1648],
[179.7756, 70.6396, 227.6463, 119.5376],
[277.7370, 40.2577, 292.9661, 63.7242],
[410.8191, 41.3533, 423.5559, 89.5035],
[435.6531, 42.3452, 450.2845, 65.4479],
[319.4424, 29.9087, 368.4482, 114.7276],
[261.9485, 43.4770, 276.7234, 63.0457],
[425.8289, 43.4721, 435.8898, 87.8734],
[464.4441, 39.4610, 479.3059, 95.2072],
[291.7780, 41.5695, 303.5346, 62.9987],
[391.6432, 41.0292, 406.4037, 90.8049],
[305.6837, 39.6217, 316.1914, 64.2624],
[372.6173, 46.8283, 384.0233, 66.3398],
[455.1466, 43.9075, 467.5097, 65.5470],
[455.7914, 43.8520, 467.6461, 92.3798],
[157.4206, 41.4324, 170.3138, 62.0608],
[231.0185, 44.8651, 242.4190, 72.4076],
[214.5253, 44.4385, 231.1778, 83.5843],
[236.2109, 38.8920, 249.7266, 62.3535],
[435.9778, 40.8511, 449.9597, 86.9761],
[467.2784, 40.2974, 479.5966, 65.1504],
[384.3904, 41.6667, 396.0784, 88.7015],
[261.7603, 43.7015, 276.5209, 86.5690],
[254.7014, 133.0908, 380.4548, 250.3910],
[477.7160, 40.9153, 490.2527, 64.6302],
[294.6238, 42.7452, 306.5481, 63.0935],
[325.6184, 32.5843, 359.5378, 108.6316],
[426.3479, 42.7661, 435.3708, 65.0271],
[177.5767, 69.7012, 229.8452, 165.6261],
[331.3413, 35.4451, 370.6118, 111.0481],
[365.5200, 43.6287, 376.2769, 65.6304],
[277.8513, 39.4658, 292.4612, 89.3388],
[236.7991, 41.0505, 251.0915, 85.4085],
[316.0443, 42.1337, 331.2214, 78.6572],
[230.3798, 38.3001, 241.4952, 65.1931],
[317.6763, 42.3511, 331.5425, 65.0512],
[372.5433, 46.8092, 383.3160, 77.9885],
[215.4318, 44.9264, 222.4588, 57.7850],
[410.0192, 42.7507, 422.7933, 65.6289],
[361.8983, 43.5614, 372.8674, 65.3068],
[ 91.0902, 202.3433, 103.7340, 226.4839],
[304.8661, 39.0185, 316.2277, 87.8315],
[483.3349, 41.4188, 492.4464, 66.4721],
[321.2705, 37.9789, 348.2607, 103.9211],
[214.9194, 44.3499, 225.3149, 66.9615],
[392.4561, 42.1823, 405.9814, 64.9268],
[217.0000, 44.6373, 229.4844, 62.1786],
[213.9660, 47.5715, 240.3309, 86.4126],
[406.5304, 45.4100, 416.1259, 91.8968],
[291.0857, 41.4918, 304.2268, 89.2673],
[173.5340, 47.3544, 495.2160, 312.6729],
[317.1226, 30.0491, 366.0806, 64.4532],
[451.5135, 43.6241, 466.4553, 90.7508],
[359.4681, 43.7037, 369.8288, 65.6532],
[383.9344, 24.8846, 397.7062, 66.3926],
[296.7070, 39.2671, 313.4492, 90.1238],
[211.5619, 48.4769, 240.7818, 86.4844],
[340.1151, 40.3805, 357.5412, 65.4582],
[385.3858, 41.0304, 400.9423, 87.8719],
[316.7220, 41.0068, 340.3093, 84.0841],
[407.4102, 43.7461, 417.5898, 71.1812],
[321.5169, 42.5995, 338.6394, 63.5323],
[169.8591, 46.8453, 180.7269, 63.1956],
[384.8448, 44.4553, 396.7958, 70.3742],
[484.0365, 41.9558, 497.9948, 65.3487],
[168.2845, 88.7939, 365.3092, 251.2970],
[272.2848, 43.2156, 281.6215, 63.2094],
[ 15.9471, 53.6990, 36.2013, 64.4533],
[405.8932, 58.4059, 414.4193, 92.5828],
[222.9711, 48.6539, 241.0914, 85.4279],
[208.2356, 43.0693, 225.3581, 81.5330],
[489.9778, 41.6829, 499.8660, 65.3285],
[478.8498, 38.4072, 491.4628, 96.2610],
[328.6969, 43.6563, 345.1313, 64.5278],
[385.3545, 25.4544, 397.4581, 55.1706],
[233.1364, 38.2715, 245.3793, 61.8990],
[323.5156, 56.9981, 357.7344, 108.4542],
[369.3009, 45.7179, 381.8709, 78.2980]])
scores: tensor([0.9370, 0.9150, 0.8462, 0.8125, 0.7842, 0.7534, 0.7295, 0.7070, 0.7046,
0.6919, 0.6909, 0.6890, 0.6670, 0.6035, 0.5933, 0.5479, 0.4956, 0.4763,
0.4573, 0.4258, 0.4119, 0.4043, 0.3958, 0.3835, 0.3760, 0.3701, 0.2954,
0.2791, 0.2744, 0.2688, 0.2639, 0.2507, 0.2496, 0.2468, 0.2446, 0.2397,
0.2344, 0.2322, 0.2316, 0.1454, 0.1434, 0.1295, 0.1194, 0.1102, 0.1087,
0.1030, 0.0974, 0.0911, 0.0795, 0.0655, 0.0638, 0.0617, 0.0598, 0.0595,
0.0453, 0.0453, 0.0440, 0.0416, 0.0390, 0.0360, 0.0356, 0.0353, 0.0334,
0.0310, 0.0230, 0.0228, 0.0204, 0.0200, 0.0190, 0.0186, 0.0178, 0.0176,
0.0161, 0.0153, 0.0148, 0.0132, 0.0131, 0.0118, 0.0116, 0.0111])
) at 0x7f4c73e80a90>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 8, 3, 8, 7, 8, 8, 3, 7, 8, 7, 3, 7, 8, 3, 7])
bboxes: tensor([[112.4268, 49.4735, 304.9958, 465.7609],
[ 3.4786, 80.7808, 344.7644, 493.8286],
[100.5002, 136.8570, 138.5015, 207.6743],
[202.8846, 438.3407, 210.2387, 450.7218],
[229.8136, 238.3908, 270.0773, 276.4530],
[255.1657, 77.3897, 337.7463, 460.8916],
[ 92.5101, 51.7661, 335.4652, 361.1246],
[ 52.6380, 136.4516, 138.7784, 208.6656],
[ 56.2887, 135.1908, 138.6452, 301.1374],
[100.6135, 53.0106, 332.4428, 338.0050],
[221.2052, 173.1790, 336.5308, 490.4929],
[268.4396, 82.1483, 333.4618, 308.0861],
[131.2023, 170.0540, 338.2026, 487.3679],
[258.3805, 237.8141, 307.9540, 283.2797],
[272.2990, 179.5792, 332.7291, 437.6083],
[ 4.9937, 80.2133, 259.6085, 412.7555],
[ 6.2139, 54.5533, 368.8019, 413.4154],
[ 5.3096, 77.2884, 173.5997, 450.4460]])
scores: tensor([0.8682, 0.7993, 0.5518, 0.4126, 0.2849, 0.2507, 0.1484, 0.1393, 0.1383,
0.1283, 0.0696, 0.0582, 0.0525, 0.0484, 0.0346, 0.0288, 0.0231, 0.0128])
) at 0x7f4c73f645d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 9, 8, 9, 9, 9, 8, 8, 8, 3])
bboxes: tensor([[ 9.4612e+01, 1.3797e+02, 3.7961e+02, 3.0450e+02],
[ 1.9913e+02, 1.0446e+02, 2.9423e+02, 3.1772e+02],
[ 2.7475e+02, 1.0924e+02, 3.3072e+02, 2.5754e+02],
[ 2.0348e+02, 1.2987e+02, 2.1683e+02, 1.5087e+02],
[ 3.1149e+02, 1.2981e+02, 3.3147e+02, 1.8312e+02],
[ 1.8224e+02, 1.2654e+02, 2.0194e+02, 1.4913e+02],
[ 2.0361e+02, 1.0668e+02, 2.1865e+02, 1.5299e+02],
[ 2.0132e+02, 1.2597e+02, 2.1509e+02, 1.5184e+02],
[ 2.0314e+02, 1.2993e+02, 2.1600e+02, 1.4964e+02],
[ 4.7639e+02, -7.2472e-02, 5.0018e+02, 6.4209e+01],
[ 2.0742e+02, 1.0351e+02, 3.3047e+02, 2.4375e+02],
[ 4.8035e+02, 3.9456e+01, 5.0012e+02, 1.4461e+02]])
scores: tensor([0.9521, 0.9209, 0.9028, 0.6689, 0.6025, 0.2302, 0.1331, 0.0487, 0.0314,
0.0244, 0.0127, 0.0110])
) at 0x7f4c73d67fd0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 7, 8, 7, 8, 8, 7, 8, 8, 8, 7, 8, 7, 8, 7, 8, 7, 8, 8, 7, 8, 7, 7,
8, 8])
bboxes: tensor([[ 64.4969, 97.4722, 440.5812, 343.1528],
[ 40.7091, 49.8597, 97.1815, 77.6794],
[115.9910, 48.7624, 152.9543, 86.4915],
[162.5449, 38.2769, 172.4160, 54.2524],
[144.7386, 48.5168, 190.0270, 84.5886],
[150.0135, 38.2022, 159.9475, 52.8622],
[ 33.5306, 37.6492, 47.4753, 64.7922],
[105.0365, 54.0813, 119.7682, 79.6101],
[127.1023, 38.6426, 147.8977, 54.1796],
[118.6327, 36.3916, 151.0938, 84.5068],
[125.2549, 39.5044, 149.3544, 69.0894],
[ 35.1607, 47.3914, 57.2709, 75.9484],
[173.3051, 44.4945, 181.7731, 56.0915],
[ 34.2089, 46.5494, 46.7482, 67.1225],
[162.3474, 39.8791, 172.0276, 61.2927],
[ 37.2679, 47.8880, 69.6657, 77.0144],
[ 86.3865, 189.5554, 128.8479, 258.8821],
[ 87.8972, 50.3984, 113.0794, 77.3359],
[100.0741, 44.9206, 111.2541, 57.4232],
[ 87.4683, 190.7642, 120.3442, 248.6889],
[ 17.8292, 45.9603, 53.6552, 77.0865],
[148.5942, 36.3465, 160.7808, 56.9153],
[ 89.7671, 48.6790, 110.6235, 71.3405],
[183.9877, 28.1688, 249.9967, 93.2179],
[ 33.9724, 41.2328, 44.1526, 64.6266],
[162.4006, 36.7776, 175.2947, 50.9666]])
scores: tensor([0.8921, 0.8364, 0.7100, 0.6689, 0.6226, 0.5576, 0.2477, 0.2384, 0.2211,
0.2064, 0.1094, 0.0903, 0.0427, 0.0303, 0.0302, 0.0257, 0.0194, 0.0186,
0.0160, 0.0133, 0.0124, 0.0122, 0.0117, 0.0116, 0.0105, 0.0104])
) at 0x7f4c7280bb50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 8, 3, 2, 7, 9, 3, 7, 7, 3, 8, 3, 7, 3, 0, 3, 7])
bboxes: tensor([[ 3.1752e+02, 4.0739e+01, 4.8873e+02, 1.6239e+02],
[ 4.5143e+01, 7.2154e+01, 4.4548e+02, 3.1261e+02],
[ 3.5729e+02, 1.4798e+00, 4.5599e+02, 1.2625e+02],
[ 9.8589e-01, 1.7422e+01, 1.7021e+02, 1.6793e+02],
[ 5.3983e+00, 1.2770e+00, 1.7741e+02, 1.6083e+02],
[ 3.3786e+02, -3.4059e+00, 4.9417e+02, 1.1786e+02],
[ 1.3577e+02, 6.7244e-01, 1.6736e+02, 2.7453e+01],
[ 3.3819e+02, -3.4268e+00, 4.9775e+02, 1.1808e+02],
[ 1.2903e-02, 1.3679e+02, 7.7080e+00, 2.0930e+02],
[ 3.1802e-01, 1.2308e+02, 1.1242e+01, 2.0875e+02],
[-5.4418e-02, 1.4363e+02, 6.4509e+00, 2.1282e+02],
[ 2.2858e-02, 1.3889e+02, 7.3014e+00, 2.1111e+02],
[ 6.6161e-02, 1.3591e+02, 8.6741e+00, 2.0940e+02],
[ 7.0746e+00, -8.9061e-02, 1.9878e+02, 1.8124e+02],
[ 4.5079e+02, 1.5042e+01, 4.9999e+02, 1.0849e+02],
[-1.4470e-01, 1.3534e+02, 1.2913e+01, 2.0645e+02],
[ 3.3901e+02, 7.9218e-01, 4.5669e+02, 2.2816e+01],
[ 8.8935e+01, 7.3754e+01, 1.9524e+02, 1.6531e+02]])
scores: tensor([0.9468, 0.9136, 0.9004, 0.0602, 0.0569, 0.0513, 0.0397, 0.0368, 0.0296,
0.0232, 0.0187, 0.0186, 0.0173, 0.0160, 0.0135, 0.0123, 0.0123, 0.0118])
) at 0x7f4b41739f50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 8, 7, 7, 8, 7, 8, 7, 8, 7, 8, 8, 7, 7])
bboxes: tensor([[ 86.2171, 80.5538, 178.0407, 235.4185],
[369.5970, 113.5575, 469.4655, 176.0675],
[292.3818, 82.9536, 362.3058, 187.7403],
[207.8438, 81.0523, 287.0781, 213.2567],
[ 84.4058, 141.0876, 172.6255, 264.0751],
[290.7714, 129.2999, 362.3536, 204.6276],
[400.5898, 72.8207, 454.0977, 163.3291],
[197.8433, 131.9943, 284.1879, 230.8176],
[182.8734, 80.1364, 247.9860, 199.3400],
[486.6034, 127.2439, 500.1154, 159.6488],
[199.3866, 79.4162, 247.8791, 129.2153],
[487.9211, 101.5083, 500.3602, 160.4032],
[298.8225, 143.8950, 319.5368, 186.3244],
[295.3304, 140.6003, 326.1541, 190.7900],
[199.0445, 141.6606, 231.0336, 216.2722],
[488.7791, 63.7851, 500.2834, 162.8016]])
scores: tensor([0.9609, 0.9575, 0.9448, 0.9209, 0.9199, 0.9194, 0.8770, 0.8398, 0.8193,
0.6655, 0.1028, 0.0471, 0.0420, 0.0177, 0.0148, 0.0114])
) at 0x7f4c73ec1710>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 8, 8, 7, 8, 7, 3, 8, 7, 3])
bboxes: tensor([[1.7205e+02, 1.6686e+02, 2.9904e+02, 4.5697e+02],
[4.0024e+02, 2.3802e+02, 4.2319e+02, 3.0534e+02],
[4.3766e+02, 2.1643e+02, 4.6780e+02, 3.0154e+02],
[2.2520e+00, 1.9803e+02, 4.1611e+02, 4.6681e+02],
[3.3768e+02, 1.7922e+02, 3.8849e+02, 2.9851e+02],
[4.1755e+01, 9.2042e+00, 3.9574e+02, 4.6462e+02],
[4.5373e-01, 1.2322e+02, 4.7813e+01, 1.9787e+02],
[6.7030e+01, 3.9046e+01, 4.3297e+02, 4.4728e+02],
[1.3346e+02, 1.6755e+02, 4.6927e+02, 3.7893e+02],
[4.8785e-01, 1.3739e+02, 4.7950e+01, 1.7843e+02]])
scores: tensor([0.9219, 0.9062, 0.9028, 0.8711, 0.8394, 0.6646, 0.0244, 0.0160, 0.0130,
0.0115])
) at 0x7f4c73d67050>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 7, 7, 7])
bboxes: tensor([[181.9427, 92.3327, 352.8230, 316.3430],
[188.1480, 149.8185, 361.8520, 328.7263],
[258.8270, 53.5329, 409.9231, 293.8610],
[291.5276, 168.9651, 426.4412, 291.2342],
[241.2737, 119.7830, 427.0857, 293.1864],
[263.0099, 199.2441, 358.8651, 329.6533]])
scores: tensor([0.9004, 0.8691, 0.7969, 0.5977, 0.2439, 0.0105])
) at 0x7f4c73d67e10>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 8, 8, 7, 8, 8, 8])
bboxes: tensor([[ 1.8695e+01, 5.5144e+01, 4.3013e+02, 3.3958e+02],
[ 4.8639e+02, 5.9341e-01, 5.0032e+02, 1.6960e+01],
[ 5.1021e+01, 6.4488e+01, 3.3160e+02, 1.8493e+02],
[ 4.8688e+02, 2.0296e-01, 4.9984e+02, 4.2375e+01],
[ 4.4484e+02, 6.8057e-02, 4.9969e+02, 1.6216e+01],
[ 4.4175e+02, 7.5438e-02, 4.9888e+02, 1.3023e+01],
[ 4.8874e+02, -8.1093e-01, 5.0032e+02, 9.0655e+01],
[ 4.8610e+02, 2.3775e+00, 4.9984e+02, 7.4185e+01],
[ 4.4256e+02, -2.2997e-01, 4.7073e+02, 9.5622e+00]])
scores: tensor([0.8975, 0.0807, 0.0695, 0.0685, 0.0167, 0.0145, 0.0134, 0.0129, 0.0123])
) at 0x7f4c73e80590>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 3, 7, 7, 7, 7])
bboxes: tensor([[ 1.3613e+02, 1.0207e+02, 4.7325e+02, 3.2489e+02],
[ 2.4340e+02, 3.8751e+01, 4.0582e+02, 2.7141e+02],
[ 1.6503e+02, 1.9052e+01, 5.0215e+02, 2.2353e+02],
[ 2.2282e+00, -6.6140e-02, 1.1740e+02, 2.8631e+01],
[ 1.6346e+02, 1.7963e+01, 4.9084e+02, 2.5469e+02],
[ 2.4430e+00, -6.5794e-01, 1.8222e+02, 3.0614e+01],
[ 4.4309e+02, -1.9421e-01, 4.9988e+02, 5.0121e+01]])
scores: tensor([0.9370, 0.9180, 0.0336, 0.0185, 0.0176, 0.0119, 0.0117])
) at 0x7f4c73ea63d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8])
bboxes: tensor([[190.0439, 159.2768, 480.6593, 373.1451],
[264.9960, 155.5962, 408.0509, 315.4976]])
scores: tensor([0.9624, 0.8838])
) at 0x7f4c73eb1090>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 3, 3])
bboxes: tensor([[101.8878, 112.2385, 432.4872, 310.8084],
[205.6325, 77.2475, 364.2894, 293.2603],
[475.4175, 199.5767, 499.5826, 364.8764],
[473.7561, 99.1101, 499.6814, 368.4680]])
scores: tensor([0.9688, 0.9468, 0.2433, 0.0299])
) at 0x7f4c73ec2490>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 3, 3, 7, 3, 3, 0, 3])
bboxes: tensor([[ 7.2260e+01, 2.6017e+02, 4.5352e+02, 4.5765e+02],
[ 6.4062e-01, 2.8625e+02, 3.3832e+01, 3.7178e+02],
[ 2.2564e-02, 3.0095e+02, 9.5294e+00, 3.7506e+02],
[ 4.6958e+02, 2.0706e+00, 4.9995e+02, 9.3567e+01],
[-6.6004e-01, 2.5624e+02, 9.4264e+01, 3.7327e+02],
[ 8.7050e-02, 2.5616e+02, 9.3175e+01, 3.2997e+02],
[ 4.6956e+02, 2.2125e+00, 4.9997e+02, 9.4988e+01],
[ 3.5576e-01, 2.9125e+02, 1.1851e+01, 3.6952e+02]])
scores: tensor([0.9077, 0.2073, 0.0675, 0.0277, 0.0174, 0.0132, 0.0111, 0.0108])
) at 0x7f4c73eb1350>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 3, 7, 3, 3, 7, 3, 7, 7])
bboxes: tensor([[ 8.9975e+01, 4.2341e+01, 3.3795e+02, 4.8461e+02],
[ 1.8403e+02, 9.4142e+00, 3.3370e+02, 4.5777e+02],
[ 1.2438e+02, 3.5920e+01, 3.3361e+02, 2.7404e+02],
[ 1.8598e+02, 1.7294e+01, 3.3409e+02, 1.9189e+02],
[ 1.7443e+02, 2.5304e+01, 3.3627e+02, 3.7860e+02],
[ 1.4181e+02, 2.5746e+01, 3.3531e+02, 2.4554e+02],
[ 1.7565e+02, 1.7738e+01, 3.3544e+02, 3.2230e+02],
[-7.5937e-02, 3.9008e+02, 1.1194e+02, 4.9976e+02],
[ 1.8828e+02, 1.9382e+01, 3.3296e+02, 1.3189e+02],
[ 1.2246e+02, 2.7760e+01, 3.3006e+02, 1.8669e+02],
[ 1.8286e+02, -8.7825e+00, 3.4580e+02, 4.9238e+02]])
scores: tensor([0.8813, 0.0515, 0.0467, 0.0414, 0.0289, 0.0232, 0.0156, 0.0136, 0.0121,
0.0107, 0.0107])
) at 0x7f4c73e94110>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 3, 3, 8, 8, 7, 3, 8, 3, 3, 3, 3, 3, 3, 3, 8, 3, 3, 8, 7, 3])
bboxes: tensor([[ 84.3600, 82.4711, 269.3394, 410.4977],
[ 3.5905, 89.4917, 33.4484, 105.2349],
[ 58.6168, 89.0517, 76.8508, 103.8194],
[150.1357, 89.7497, 155.9353, 104.4886],
[155.1998, 91.5745, 160.8264, 104.8122],
[250.8009, 92.3562, 256.7148, 101.7845],
[160.9561, 88.7714, 177.5178, 97.6544],
[353.4520, 96.2032, 358.2412, 105.3593],
[215.5946, 89.2846, 228.6768, 97.6295],
[ 42.6352, 90.9048, 58.7704, 101.5757],
[ 74.9434, 91.4436, 81.6056, 100.7439],
[204.6635, 93.0420, 217.7457, 101.0986],
[261.4524, 92.7793, 280.0278, 99.1153],
[ 74.2842, 90.8829, 86.5592, 100.5233],
[ 0.4593, 95.3997, 8.1050, 106.3581],
[250.4535, 92.0200, 256.2814, 100.8511],
[ 81.0910, 92.3332, 94.0019, 99.1707],
[119.2429, 89.8981, 133.3438, 97.7972],
[ 98.3096, 71.9024, 270.2249, 385.5195],
[258.2211, 92.0678, 266.4720, 99.1431],
[369.8869, 98.1672, 373.8188, 105.1532]])
scores: tensor([0.9321, 0.8555, 0.8516, 0.2356, 0.1378, 0.0865, 0.0837, 0.0809, 0.0701,
0.0531, 0.0407, 0.0339, 0.0323, 0.0263, 0.0210, 0.0166, 0.0158, 0.0151,
0.0135, 0.0122, 0.0109])
) at 0x7f4c73eb0c50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 8, 7, 8, 7, 7, 8])
bboxes: tensor([[4.0554e-01, 1.3563e+02, 7.2690e+01, 3.9875e+02],
[9.6178e+01, 1.6634e+02, 1.8800e+02, 3.5475e+02],
[8.9262e+01, 2.5258e+02, 1.9746e+02, 3.9508e+02],
[3.6405e+02, 3.5873e+02, 3.7424e+02, 3.9362e+02],
[3.6403e+02, 3.6256e+02, 3.7464e+02, 3.9252e+02],
[3.6445e+02, 3.7256e+02, 3.7461e+02, 3.9385e+02],
[9.0739e-02, 1.3139e+02, 7.2517e+01, 3.0260e+02]])
scores: tensor([0.9507, 0.9395, 0.9126, 0.1044, 0.0992, 0.0494, 0.0132])
) at 0x7f4c73eba9d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 7])
bboxes: tensor([[ 50.9346, 45.9914, 222.2107, 217.8758],
[ 86.6260, 177.0684, 319.7704, 446.3692],
[ 84.4247, 173.0812, 252.0247, 358.1688]])
scores: tensor([0.9165, 0.9092, 0.1705])
) at 0x7f4c73eb09d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 3, 7, 3, 8])
bboxes: tensor([[175.2436, 187.1129, 306.0064, 275.7467],
[ 55.1460, 178.0697, 140.0689, 298.8634],
[332.5161, 123.4375, 497.1714, 372.2601],
[ 62.1126, 213.4315, 139.0592, 297.5124],
[374.0063, 96.3536, 500.9937, 275.2241],
[205.4092, 21.7157, 213.3408, 38.4873]])
scores: tensor([0.9502, 0.8789, 0.8652, 0.0678, 0.0538, 0.0126])
) at 0x7f4c73e81990>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 8, 7, 7, 7, 8, 7, 3, 7, 8, 8, 7, 8, 8, 7, 7, 8, 8, 8, 8, 7, 3, 8])
bboxes: tensor([[ 4.6331e+02, 1.4040e+00, 4.9997e+02, 8.8051e+01],
[ 4.2077e+02, 3.8005e-01, 4.6360e+02, 7.8278e+01],
[ 1.7378e+01, 3.2953e+01, 4.5450e+02, 2.8559e+02],
[ 2.6765e+02, 1.6813e+02, 3.5266e+02, 2.6219e+02],
[ 2.3051e+02, 1.6432e+02, 3.5504e+02, 2.6366e+02],
[ 2.7664e+02, 1.5958e+02, 3.5071e+02, 2.5003e+02],
[ 2.6882e+02, 2.0266e+02, 3.5267e+02, 2.6245e+02],
[ 1.7270e+02, 4.9276e-01, 2.5034e+02, 1.4530e+01],
[ 3.9116e+02, 6.4297e+00, 5.0103e+02, 8.5614e+01],
[ 2.5660e+02, 1.1762e+02, 3.5434e+02, 2.4743e+02],
[ 3.1872e+02, -2.6780e-01, 3.5081e+02, 2.7163e+01],
[ 1.9130e+02, 1.8490e+02, 3.5245e+02, 2.6340e+02],
[ 4.4639e+02, 2.4278e+02, 5.0048e+02, 3.3020e+02],
[ 4.6061e+02, 2.3281e-01, 4.8314e+02, 5.2287e+01],
[ 4.4431e+02, 2.9640e+02, 5.0022e+02, 3.3169e+02],
[ 4.3563e+02, 2.4040e+02, 5.0030e+02, 3.3023e+02],
[ 2.0002e+02, 1.1440e+02, 3.6404e+02, 2.5045e+02],
[ 4.0637e+02, 9.1383e-01, 4.2879e+02, 6.5823e+01],
[ 4.8508e+02, 3.1094e+02, 5.0008e+02, 3.3395e+02],
[ 4.9438e+02, -1.7780e+00, 4.9937e+02, 9.8952e+01],
[ 4.3606e+02, 2.8422e+02, 5.0066e+02, 3.3097e+02],
[ 2.0145e+02, 4.7423e-02, 2.4855e+02, 1.3620e+01],
[ 4.9355e+02, 1.9544e+00, 4.9941e+02, 6.5857e+01]])
scores: tensor([0.9224, 0.9170, 0.8809, 0.5864, 0.2654, 0.2267, 0.0762, 0.0508, 0.0414,
0.0413, 0.0403, 0.0313, 0.0307, 0.0280, 0.0210, 0.0192, 0.0181, 0.0179,
0.0131, 0.0111, 0.0108, 0.0105, 0.0104])
) at 0x7f4c73ba6890>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 3, 0, 7, 8, 8, 7, 7, 7])
bboxes: tensor([[ 68.1087, 24.5678, 441.2663, 350.0416],
[399.8750, 23.4408, 422.7812, 52.3405],
[131.0934, 124.9318, 156.7973, 191.0839],
[377.4912, 18.3602, 499.8526, 133.9836],
[398.9582, 37.9738, 422.9168, 54.3602],
[400.0855, 24.4216, 422.5707, 52.8733],
[131.4223, 126.7186, 155.6871, 170.7424],
[130.5047, 128.2851, 159.9250, 214.2930],
[ 71.8003, 23.2082, 292.4575, 262.7293],
[127.6538, 145.0532, 436.4087, 345.5718],
[374.7353, 21.5306, 499.4835, 132.3757]])
scores: tensor([0.8833, 0.5884, 0.3101, 0.1367, 0.0371, 0.0358, 0.0337, 0.0333, 0.0262,
0.0184, 0.0101])
) at 0x7f4c73ebbb90>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 8, 7])
bboxes: tensor([[202.4139, 127.8639, 455.3986, 290.3403],
[ 77.0534, 126.4936, 295.9934, 263.1789],
[253.3534, 118.1280, 323.2091, 170.1203],
[353.0914, 165.3600, 453.9399, 283.3302]])
scores: tensor([0.9517, 0.8901, 0.0188, 0.0116])
) at 0x7f4c73eb3310>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 3, 7, 3, 7, 8, 3, 8])
bboxes: tensor([[ 6.6523e+01, 9.5990e+01, 4.0496e+02, 3.2589e+02],
[ 1.5785e+02, 3.9114e+01, 3.3278e+02, 2.9917e+02],
[ 4.2682e+02, 7.3618e-01, 4.9896e+02, 2.7853e+01],
[ 1.1840e+00, 2.9183e+01, 1.0917e+02, 7.6774e+01],
[ 4.8295e+02, -1.5784e-01, 5.0065e+02, 1.6003e+01],
[ 4.2664e+02, 8.9144e-01, 4.9992e+02, 2.7771e+01],
[ 8.9379e-01, 9.7689e-01, 7.9819e+01, 7.6953e+01],
[ 1.2308e+02, 3.1398e+01, 3.5935e+02, 2.4360e+02],
[ 6.4923e-01, 2.9145e+01, 7.6255e+01, 7.4957e+01]])
scores: tensor([0.9521, 0.9238, 0.0670, 0.0509, 0.0327, 0.0267, 0.0240, 0.0151, 0.0130])
) at 0x7f4c73eb2d90>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 8, 7, 8, 7, 7, 8, 8, 8, 3, 8, 7, 8, 7, 8, 8, 7, 7, 8])
bboxes: tensor([[ 1.4569e+01, 3.9193e+01, 3.6375e+02, 3.6979e+02],
[ 2.9350e+02, 1.6898e+01, 4.5337e+02, 3.7158e+02],
[ 3.3951e+02, 3.1465e+01, 4.4135e+02, 3.9393e+02],
[ 9.5742e+01, 6.2416e-01, 3.1559e+02, 9.3712e+01],
[ 8.4019e+00, -2.1591e-01, 5.0566e+02, 3.5725e+02],
[ 1.2224e+02, 5.3763e+00, 4.8675e+02, 3.7275e+02],
[ 7.5386e+00, 6.0591e+00, 4.4637e+02, 2.4511e+02],
[ 1.1132e+02, 2.3276e+02, 2.4415e+02, 3.4771e+02],
[ 3.3004e+02, 2.1862e+00, 5.1137e+02, 3.6813e+02],
[ 3.8840e+01, 4.1425e+00, 3.4221e+02, 1.9068e+02],
[ 4.7846e+00, 5.6009e+00, 3.2334e+02, 2.8893e+02],
[ 2.7977e+02, 4.9454e+00, 5.2023e+02, 3.5755e+02],
[ 1.4390e+01, -6.8709e-01, 2.8639e+02, 9.3314e+01],
[ 3.3984e+01, 1.8297e+00, 4.0352e+02, 1.3645e+02],
[ 2.5127e+02, 4.7020e-01, 3.1748e+02, 7.2333e+01],
[ 2.4434e+01, 1.3920e+02, 3.9978e+02, 3.7135e+02],
[ 7.9091e+00, -4.2418e-01, 5.1201e+02, 2.4574e+02],
[ 3.3463e+02, 5.0553e+01, 4.0834e+02, 3.8577e+02],
[ 1.1314e+02, 2.0632e+02, 3.3256e+02, 3.4524e+02],
[ 4.6638e+01, 1.3677e+00, 3.7211e+02, 9.5166e+01],
[ 1.9558e+02, 1.1931e-01, 3.1770e+02, 9.5974e+01]])
scores: tensor([0.8730, 0.7031, 0.4932, 0.3430, 0.2742, 0.0988, 0.0688, 0.0538, 0.0529,
0.0528, 0.0501, 0.0261, 0.0247, 0.0179, 0.0137, 0.0136, 0.0120, 0.0118,
0.0117, 0.0107, 0.0105])
) at 0x7f4c73ea7e50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 9, 7])
bboxes: tensor([[1.1731e+02, 2.2979e+01, 2.2742e+02, 2.3112e+02],
[2.4811e+01, 1.0539e+02, 4.5097e+02, 3.4773e+02],
[2.9172e+02, 2.1828e+01, 3.1414e+02, 5.4832e+01],
[1.0893e+01, 1.3974e-01, 6.1812e+01, 6.1921e+01],
[1.7540e+01, 7.8309e+00, 4.8285e+02, 3.5838e+02]])
scores: tensor([0.9644, 0.9438, 0.7646, 0.0572, 0.0207])
) at 0x7f4b41715490>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 7, 7, 8, 8, 8, 8, 8, 8, 7, 8, 8, 7, 8,
8, 7, 7, 7])
bboxes: tensor([[104.4134, 184.1334, 247.9613, 347.5072],
[106.9834, 238.7496, 289.5113, 391.3285],
[ 92.4884, 194.2419, 321.3811, 389.3519],
[105.0645, 245.0990, 228.5690, 389.2760],
[184.2158, 113.9310, 276.8973, 207.7487],
[ 95.4052, 24.7347, 122.4619, 54.4645],
[156.9257, 245.7783, 257.3342, 330.7842],
[191.4511, 253.8183, 257.5583, 330.1661],
[127.6749, 216.4084, 257.1066, 335.9353],
[139.8645, 57.9741, 202.9444, 114.8774],
[303.1578, 21.4669, 314.1324, 46.5019],
[263.0184, 136.0066, 326.1599, 209.5013],
[135.0083, 283.1389, 289.0127, 386.7830],
[227.6159, 54.5608, 258.4855, 102.9587],
[ 96.6895, 25.2100, 117.2732, 38.9013],
[179.1599, 80.8773, 233.5382, 118.3415],
[155.0054, 58.4106, 202.4451, 104.7730],
[143.4763, 56.5634, 232.5202, 117.8507],
[177.9607, 80.9174, 217.7532, 122.5983],
[220.8810, 134.2856, 326.5200, 212.9800],
[196.4847, 137.5971, 324.7566, 351.4654],
[ 8.7171, 175.3551, 244.8756, 387.1450],
[189.1525, 121.5860, 325.0609, 216.6952],
[174.1985, 78.9641, 220.5393, 144.4734],
[219.2163, 64.6356, 256.3432, 104.7980],
[ 7.8722, 182.5243, 261.7286, 387.3976],
[195.4762, 286.2000, 286.7208, 355.9875],
[219.7416, 63.5509, 260.1127, 102.6600]])
scores: tensor([0.7734, 0.6772, 0.5713, 0.4756, 0.3015, 0.1567, 0.1562, 0.1321, 0.0756,
0.0677, 0.0526, 0.0501, 0.0494, 0.0347, 0.0310, 0.0292, 0.0271, 0.0242,
0.0224, 0.0213, 0.0192, 0.0181, 0.0164, 0.0161, 0.0142, 0.0140, 0.0119,
0.0116])
) at 0x7f4c73ba6990>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 8])
bboxes: tensor([[138.1911, 119.8278, 409.0745, 314.9378],
[158.5271, 33.0113, 323.5042, 282.6137],
[128.7116, 39.6673, 172.2650, 55.7917],
[458.6322, 161.4878, 499.1803, 262.7310]])
scores: tensor([0.9678, 0.9482, 0.0118, 0.0103])
) at 0x7f4c73e96710>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 7, 8, 8, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 8,
8, 7, 7, 8, 7, 7, 2, 7, 3, 8, 8, 8, 7, 8, 8, 3, 8, 7, 8, 8, 8, 7, 7, 2,
7, 8, 7, 8, 7, 7, 8, 8, 8, 8, 7, 8, 8, 8, 3, 8, 7, 7, 2, 8, 7, 8, 8, 7,
7, 8, 3])
bboxes: tensor([[ 2.0102e+02, 5.2484e+01, 3.1070e+02, 2.3675e+02],
[ 1.9946e+02, 1.3422e+02, 3.1968e+02, 2.7250e+02],
[ 3.0117e+01, 9.7336e+01, 4.6202e+01, 1.4233e+02],
[ 3.8109e+02, 1.2191e+02, 4.0954e+02, 1.4254e+02],
[ 4.5878e+02, 1.0833e+02, 4.7950e+02, 1.6041e+02],
[ 3.1248e+02, 1.1808e+02, 3.2229e+02, 1.3954e+02],
[ 8.9655e+01, 1.1686e+02, 1.2753e+02, 1.4115e+02],
[ 3.2368e+02, 1.2292e+02, 3.5210e+02, 1.4114e+02],
[ 1.2580e+02, 1.1203e+02, 1.5408e+02, 1.4091e+02],
[ 7.4249e+01, 9.6497e+01, 8.7275e+01, 1.3770e+02],
[ 3.0925e+02, 1.2525e+02, 3.2903e+02, 1.4095e+02],
[ 5.7717e+01, 1.1627e+02, 9.8631e+01, 1.4037e+02],
[ 6.5089e+01, 9.8526e+01, 7.6318e+01, 1.3723e+02],
[ 1.5208e+02, 1.1402e+02, 1.6803e+02, 1.4067e+02],
[ 1.6508e+02, 1.1388e+02, 1.8433e+02, 1.4217e+02],
[ 1.3634e+02, 1.1199e+02, 1.5428e+02, 1.4016e+02],
[ 4.3252e+02, 1.2111e+02, 4.5107e+02, 1.3963e+02],
[ 1.8579e+02, 1.1544e+02, 2.1323e+02, 1.3847e+02],
[ 4.2008e+02, 1.1137e+02, 4.2679e+02, 1.3024e+02],
[ 1.9567e+02, 1.1589e+02, 2.1449e+02, 1.3860e+02],
[ 1.2469e+02, 1.1185e+02, 1.4524e+02, 1.4050e+02],
[ 1.8495e+02, 1.1610e+02, 2.0392e+02, 1.3859e+02],
[ 3.9085e+02, 1.1380e+02, 4.0837e+02, 1.4011e+02],
[ 4.8035e+02, 1.1496e+02, 4.9230e+02, 1.4032e+02],
[ 3.0975e+02, 1.1655e+02, 3.2345e+02, 1.3834e+02],
[ 4.7890e+02, 1.2238e+02, 4.9923e+02, 1.4266e+02],
[ 4.0690e+02, 1.1915e+02, 4.2123e+02, 1.4140e+02],
[ 2.2936e+02, 1.1903e+02, 2.4330e+02, 1.3879e+02],
[ 4.7931e+02, 1.2309e+02, 4.9022e+02, 1.4233e+02],
[ 4.3054e+02, 1.1584e+02, 4.4758e+02, 1.2928e+02],
[ 1.3689e+00, 4.6122e+01, 8.5057e+01, 1.3558e+02],
[ 2.3204e+02, 1.2003e+02, 2.4296e+02, 1.3954e+02],
[ 4.3085e+02, 1.1525e+02, 4.4728e+02, 1.2929e+02],
[ 3.2888e+02, 1.1461e+02, 3.4300e+02, 1.3813e+02],
[ 1.2566e+02, 1.0817e+02, 1.4562e+02, 1.3559e+02],
[ 4.7670e+02, 1.1581e+02, 4.9048e+02, 1.3946e+02],
[ 1.7436e+02, 1.1646e+02, 1.8638e+02, 1.3998e+02],
[ 1.2493e+02, 1.0976e+02, 1.3601e+02, 1.3166e+02],
[ 4.9029e+02, 1.0749e+02, 5.0033e+02, 1.2729e+02],
[ 3.1652e+02, 1.0206e+02, 3.9949e+02, 1.2804e+02],
[ 4.5118e+02, 1.1168e+02, 4.5975e+02, 1.3716e+02],
[ 4.8117e+02, 1.1438e+02, 5.0008e+02, 1.4167e+02],
[ 4.8180e+02, 1.1258e+02, 5.0023e+02, 1.4191e+02],
[ 1.5479e+02, 1.1085e+02, 1.6649e+02, 1.3798e+02],
[ 1.6786e+02, 1.0931e+02, 1.7960e+02, 1.3836e+02],
[ 1.7776e+02, 1.1882e+02, 1.8826e+02, 1.3900e+02],
[ 2.1025e+02, 1.2131e+02, 2.2217e+02, 1.3865e+02],
[ 2.9493e+02, 1.0234e+02, 4.0000e+02, 1.2463e+02],
[ 4.4904e+02, 1.1544e+02, 4.5956e+02, 1.3983e+02],
[ 4.0782e+02, 1.1251e+02, 4.1718e+02, 1.3730e+02],
[ 1.1281e+02, 1.1654e+02, 1.2840e+02, 1.4050e+02],
[ 1.3053e+02, 1.0893e+02, 1.5209e+02, 1.3913e+02],
[ 4.1679e+02, 1.1568e+02, 4.2617e+02, 1.3920e+02],
[ 2.9661e+02, 1.2476e+02, 3.2917e+02, 1.4105e+02],
[ 4.0709e+02, 1.1192e+02, 4.1400e+02, 1.2247e+02],
[ 2.4550e+02, 1.0757e+02, 2.6505e+02, 1.5200e+02],
[ 4.9082e+02, 1.0687e+02, 4.9981e+02, 1.2147e+02],
[ 2.3219e+02, 1.1801e+02, 2.4554e+02, 1.3980e+02],
[ 3.9285e+02, 1.1546e+02, 4.1027e+02, 1.4001e+02],
[ 4.5261e+02, 1.1157e+02, 4.6224e+02, 1.3707e+02],
[ 1.3712e+02, 1.0972e+02, 1.5292e+02, 1.3795e+02],
[ 4.0819e+02, 1.1515e+02, 4.1994e+02, 1.4071e+02],
[ 4.3031e+02, 1.1234e+02, 4.5641e+02, 1.3162e+02],
[ 2.3761e+02, 1.3249e+02, 2.9129e+02, 2.2232e+02],
[ 2.1094e+02, 1.1473e+02, 2.2304e+02, 1.3879e+02],
[ 1.5765e+02, 1.1739e+02, 1.7087e+02, 1.4159e+02],
[ 1.9166e+00, 9.1637e+01, 6.9031e+01, 1.3671e+02],
[ 4.4956e+02, 1.1750e+02, 4.6059e+02, 1.4050e+02],
[ 4.7707e+02, 1.2525e+02, 4.8934e+02, 1.4408e+02],
[ 3.4516e+02, 1.2022e+02, 3.5367e+02, 1.3661e+02],
[ 1.8153e+02, 1.1741e+02, 1.9308e+02, 1.4002e+02],
[ 3.8553e+02, 1.1866e+02, 4.0744e+02, 1.4091e+02],
[ 1.2267e+02, 1.1646e+02, 1.3886e+02, 1.4135e+02],
[ 4.2792e+02, 1.1455e+02, 4.3927e+02, 1.2980e+02],
[-4.5452e-02, 1.0515e+02, 5.1413e+01, 1.3607e+02]])
scores: tensor([0.9243, 0.9106, 0.8857, 0.8179, 0.8115, 0.7607, 0.7388, 0.7314, 0.7236,
0.7192, 0.7002, 0.6958, 0.6860, 0.6587, 0.6274, 0.6177, 0.6118, 0.5913,
0.5820, 0.5557, 0.5288, 0.5093, 0.4768, 0.4612, 0.4072, 0.3889, 0.3633,
0.3633, 0.3503, 0.2937, 0.2932, 0.2930, 0.2917, 0.2795, 0.2423, 0.2134,
0.2061, 0.1835, 0.1674, 0.1543, 0.1501, 0.1405, 0.1317, 0.1138, 0.1109,
0.0893, 0.0891, 0.0850, 0.0672, 0.0599, 0.0546, 0.0540, 0.0509, 0.0503,
0.0492, 0.0489, 0.0471, 0.0442, 0.0432, 0.0403, 0.0379, 0.0360, 0.0346,
0.0342, 0.0340, 0.0260, 0.0221, 0.0213, 0.0212, 0.0163, 0.0161, 0.0147,
0.0128, 0.0124, 0.0121])
) at 0x7f4c73ba5810>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 7])
bboxes: tensor([[102.3617, 166.5556, 385.1383, 369.0728],
[168.0341, 59.4446, 321.8097, 326.7470],
[170.6449, 63.9343, 321.1520, 218.3352],
[ 40.5326, 51.0319, 407.9049, 314.6487]])
scores: tensor([0.9390, 0.8247, 0.2191, 0.0120])
) at 0x7f4c7284fcd0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 8, 8, 8, 8, 7, 7, 8, 8, 8, 7, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8,
8, 8, 7, 8, 7, 8, 8, 7])
bboxes: tensor([[ 2.9486e+02, 1.0858e+01, 3.6959e+02, 1.2830e+02],
[ 2.4417e+01, 3.2956e+01, 4.5644e+02, 3.0650e+02],
[ 1.3387e-01, 1.7160e+01, 5.4114e+01, 1.9729e+02],
[ 4.0168e+02, 2.7506e+01, 4.9441e+02, 2.2152e+02],
[ 3.5094e+02, 2.7713e+01, 3.7523e+02, 9.8947e+01],
[ 3.7757e+02, 3.9901e+01, 3.9782e+02, 9.7599e+01],
[ 3.8794e+02, 4.7856e+01, 4.1675e+02, 9.4038e+01],
[ 3.0208e+02, 1.4453e+01, 5.0182e+02, 1.8281e+02],
[ 3.3720e+02, 1.8142e+00, 4.0577e+02, 2.4626e+01],
[ 4.1048e+02, 1.6823e+01, 4.4421e+02, 9.2650e+01],
[ 4.7800e+02, 3.4628e+01, 5.0013e+02, 1.6928e+02],
[ 4.0457e+02, 1.8622e+01, 4.4543e+02, 1.8587e+02],
[ 4.0197e+02, 1.8631e+01, 5.0037e+02, 1.6721e+02],
[ 4.7468e+02, 3.4282e+01, 4.9954e+02, 9.0425e+01],
[ 3.9910e+02, 1.0159e+01, 5.0012e+02, 1.5859e+02],
[ 3.1136e+02, 1.1680e+01, 5.0270e+02, 1.6527e+02],
[ 4.8519e+02, 4.7174e+01, 4.9996e+02, 1.7001e+02],
[ 4.0958e+02, 1.7096e+01, 4.9901e+02, 1.0829e+02],
[ 3.8121e+02, 6.6811e+01, 4.1215e+02, 9.7545e+01],
[ 3.0803e-01, 1.1522e+02, 1.9077e+01, 1.9455e+02],
[ 4.9000e+02, 3.8323e+01, 5.0063e+02, 1.1011e+02],
[ 3.4072e+02, 2.6013e+01, 3.7256e+02, 1.3170e+02],
[ 4.8859e+02, 6.4065e+01, 4.9969e+02, 1.6875e+02],
[ 3.6478e+02, 3.9287e+01, 3.7740e+02, 1.0095e+02],
[ 4.9228e+02, 3.8884e+01, 4.9990e+02, 9.4808e+01],
[ 4.9225e+02, 3.2880e+01, 4.9916e+02, 1.4808e+02],
[ 7.7724e+01, 7.9648e+00, 4.9728e+02, 2.2309e+02],
[ 4.7965e+02, 1.6421e+01, 5.0081e+02, 1.1747e+02],
[ 3.5498e+02, -1.1327e-01, 4.2783e+02, 2.5162e+01],
[ 3.6882e+02, 4.1484e+01, 3.8040e+02, 9.8067e+01],
[ 4.0930e+02, 4.6326e+01, 4.1883e+02, 9.1174e+01],
[ 1.0096e+02, -4.1558e-01, 2.9611e+02, 4.5801e+01]])
scores: tensor([0.9346, 0.9214, 0.9111, 0.9048, 0.7725, 0.7295, 0.6919, 0.6631, 0.3394,
0.3242, 0.2250, 0.1907, 0.1378, 0.0721, 0.0685, 0.0428, 0.0349, 0.0267,
0.0265, 0.0227, 0.0222, 0.0213, 0.0198, 0.0184, 0.0148, 0.0142, 0.0139,
0.0121, 0.0119, 0.0112, 0.0110, 0.0107])
) at 0x7f4c73e837d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8, 8, 3, 3, 7, 7, 7, 8, 7, 8, 7, 7, 8, 7, 8, 7])
bboxes: tensor([[-1.0313e+00, 2.3967e+01, 2.1920e+02, 2.3245e+02],
[ 6.2661e+01, -2.5749e+00, 1.8695e+02, 1.9719e+02],
[ 1.0182e+00, 2.9399e+00, 1.0816e+02, 1.8376e+02],
[-9.1974e-01, -3.0710e+00, 1.7504e+02, 1.9944e+02],
[ 1.7057e+02, 1.9200e+00, 5.0052e+02, 1.5685e+02],
[-2.5987e+01, 6.4490e-01, 5.0958e+02, 1.5422e+02],
[-4.0247e+01, -5.5010e-02, 5.1017e+02, 1.5473e+02],
[ 1.1992e-01, 1.6981e+01, 3.9382e+01, 1.6630e+02],
[ 1.9230e-01, 1.8545e+01, 2.5003e+01, 1.6103e+02],
[ 8.2145e+00, 3.0153e+00, 1.0673e+02, 6.9341e+01],
[ 1.6442e+02, 1.1382e+02, 2.3558e+02, 1.8381e+02],
[ 1.6399e+02, 1.3589e+02, 2.1531e+02, 1.8419e+02],
[-1.4666e-01, 1.6936e+01, 8.2324e+01, 1.7221e+02],
[-4.4486e-01, 2.4443e+01, 3.2232e+01, 2.1655e+02],
[ 1.6429e+02, 1.1218e+02, 2.3532e+02, 1.8389e+02],
[ 9.0525e-01, 4.3966e+01, 8.5911e+01, 2.0796e+02],
[ 2.7777e-01, 2.0573e+01, 1.1880e+01, 1.6281e+02],
[ 3.5768e+01, 8.3345e+00, 5.0017e+02, 3.2562e+02]])
scores: tensor([0.9165, 0.8623, 0.8076, 0.3440, 0.2700, 0.1531, 0.0961, 0.0955, 0.0627,
0.0281, 0.0266, 0.0228, 0.0212, 0.0202, 0.0155, 0.0134, 0.0131, 0.0130])
) at 0x7f4c73ea7410>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([3, 3, 7, 7, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 8, 8, 3, 3, 8,
3, 3, 3, 8])
bboxes: tensor([[ 1.1710e+02, 6.0195e+01, 2.0829e+02, 1.4234e+02],
[ 3.8051e+02, 3.9745e+01, 5.0074e+02, 1.8252e+02],
[ 3.1119e+02, 3.7844e+01, 3.9779e+02, 9.6726e+01],
[ 1.2916e+02, 6.9513e+01, 4.4896e+02, 3.2951e+02],
[ 2.0787e+02, 1.6675e+01, 3.9408e+02, 7.5903e+01],
[ 8.6767e+01, 3.2137e+01, 1.0640e+02, 4.5304e+01],
[ 6.4715e-01, 2.3922e+01, 2.7453e+01, 4.2436e+01],
[ 5.9496e-01, 2.5238e+01, 1.8448e+01, 4.4391e+01],
[ 1.1880e+02, 2.8865e+01, 1.3745e+02, 4.3938e+01],
[ 1.6451e+01, 2.4945e+01, 3.6333e+01, 3.8629e+01],
[ 4.9243e-01, 2.5550e+01, 1.1031e+01, 4.6325e+01],
[ 8.6653e-02, 2.8712e+01, 4.5490e+00, 4.7118e+01],
[ 8.5918e+01, 3.1210e+01, 1.0432e+02, 4.1983e+01],
[ 9.5329e+01, 2.9045e+01, 1.3514e+02, 4.5076e+01],
[ 1.2791e+02, 6.5420e+01, 4.5529e+02, 3.3106e+02],
[-2.7498e+00, 2.4025e+01, 1.4737e+01, 4.0135e+01],
[ 1.2412e+02, 3.0119e+01, 1.4658e+02, 4.2831e+01],
[ 3.6104e-01, 2.7464e+01, 1.4971e+01, 4.7877e+01],
[ 1.3059e+02, 3.1486e+01, 1.5144e+02, 4.4198e+01],
[ 2.0177e+02, 2.4681e+01, 2.1073e+02, 4.7145e+01],
[ 2.0503e+02, 2.8852e+01, 2.1372e+02, 4.9810e+01],
[ 1.6753e+01, 2.6847e+01, 2.8243e+01, 4.2196e+01],
[ 1.1581e+02, 4.2497e+01, 3.3224e+02, 1.4276e+02],
[ 2.0131e+02, 2.3769e+01, 2.0884e+02, 4.4102e+01],
[ 2.1681e+02, 2.0373e+01, 2.6874e+02, 3.4314e+01],
[ 1.7918e+02, 3.2215e+01, 1.9446e+02, 4.2101e+01],
[ 3.6741e+01, 2.4949e+01, 4.9685e+01, 3.4377e+01],
[ 1.6990e+02, 2.7650e+01, 1.7444e+02, 3.7096e+01]])
scores: tensor([0.9556, 0.9551, 0.9424, 0.8208, 0.7290, 0.6938, 0.5439, 0.5293, 0.4810,
0.4763, 0.2434, 0.2385, 0.1681, 0.1359, 0.1209, 0.1169, 0.0720, 0.0696,
0.0481, 0.0346, 0.0292, 0.0284, 0.0278, 0.0271, 0.0189, 0.0154, 0.0145,
0.0138])
) at 0x7f4c73eb9450>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 7, 8, 7, 8])
bboxes: tensor([[-2.3193e+00, 2.3117e+02, 3.9649e+02, 4.8914e+02],
[ 2.6090e+02, 3.8444e+02, 2.7469e+02, 4.1790e+02],
[ 1.9678e+02, 2.3472e+02, 3.9077e+02, 3.3754e+02],
[ 1.9546e+02, 2.3212e+02, 3.9443e+02, 4.2647e+02],
[-6.5199e-02, 3.7058e+02, 9.5874e+00, 4.0872e+02],
[ 2.0066e+02, 2.3363e+02, 3.1383e+02, 2.9723e+02],
[ 2.5208e+02, 3.7938e+02, 2.7375e+02, 4.2062e+02]])
scores: tensor([0.9414, 0.0848, 0.0797, 0.0565, 0.0271, 0.0183, 0.0181])
) at 0x7f4c72931e90>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 7, 8, 7, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 8])
bboxes: tensor([[363.4390, 131.7631, 426.7954, 292.6947],
[-24.7383, 41.9014, 410.6758, 327.0564],
[ 3.1171, 1.0296, 189.6563, 306.9563],
[ -3.8765, 257.1475, 204.2672, 330.6835],
[358.0988, 12.1673, 362.2137, 17.3415],
[496.6452, 41.6129, 499.4486, 71.2436],
[294.4947, 72.9084, 299.6459, 81.8662],
[ 2.2965, 29.8816, 214.5004, 330.4776],
[495.9322, 39.3294, 499.3804, 94.6327],
[443.6203, 0.5641, 477.4735, 14.3369],
[496.7606, 41.6251, 499.3332, 61.8512],
[ 0.6624, 1.3462, 175.2166, 163.9813],
[358.8210, 12.7607, 363.0540, 18.7512],
[ 66.2434, -1.6253, 184.1472, 283.0337],
[498.0081, 43.7743, 500.4294, 58.2363]])
scores: tensor([0.9199, 0.8799, 0.8271, 0.1023, 0.0630, 0.0605, 0.0545, 0.0407, 0.0383,
0.0349, 0.0334, 0.0271, 0.0181, 0.0170, 0.0117])
) at 0x7f4c73eb1ad0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 1, 8, 8, 7, 7, 8, 7, 7, 8, 7, 8])
bboxes: tensor([[-6.8324e-01, 2.5061e+01, 4.9561e+02, 3.1342e+02],
[ 2.6011e+02, 3.6636e+02, 2.7661e+02, 3.7583e+02],
[ 9.0145e+01, 1.8297e+02, 1.1669e+02, 2.2602e+02],
[ 8.3383e+01, 1.8242e+02, 1.1632e+02, 2.3945e+02],
[ 3.0971e+02, 2.9398e+02, 4.8092e+02, 3.7438e+02],
[ 3.6359e+02, 3.1951e+02, 4.8172e+02, 3.7424e+02],
[ 1.5943e+02, 8.4796e+01, 1.7983e+02, 1.3552e+02],
[ 2.6013e+02, 3.6659e+02, 2.7620e+02, 3.7560e+02],
[ 4.6684e+02, -2.0683e+00, 5.0035e+02, 1.7424e+02],
[ 4.6988e+02, 1.5740e+00, 5.0043e+02, 2.0057e+02],
[ 4.6625e+02, 7.5516e-02, 5.0016e+02, 9.4260e+01],
[ 4.6680e+02, 2.3584e+00, 4.9961e+02, 8.7632e+01]])
scores: tensor([0.9023, 0.1505, 0.1073, 0.0786, 0.0746, 0.0422, 0.0254, 0.0237, 0.0234,
0.0133, 0.0123, 0.0122])
) at 0x7f4c73d67510>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 8])
bboxes: tensor([[163.4342, 31.9882, 352.5814, 150.2696],
[243.1980, 27.1711, 304.0677, 118.1267],
[244.8346, 45.1710, 287.5873, 106.1890]])
scores: tensor([0.9360, 0.8032, 0.0283])
) at 0x7f4c73e83350>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 7, 8, 8, 3])
bboxes: tensor([[ 48.9178, 90.6664, 463.5822, 261.6773],
[186.5760, 93.1506, 322.0177, 169.3494],
[233.9870, 162.7791, 455.0756, 266.9084],
[190.7153, 194.7435, 204.0113, 218.1472],
[152.2129, 269.8406, 158.7246, 277.0344],
[288.1075, 1.9895, 496.6581, 98.5964]])
scores: tensor([0.9019, 0.0814, 0.0624, 0.0240, 0.0199, 0.0153])
) at 0x7f4c73d677d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 7, 7, 7, 7, 3, 7, 3, 7, 3, 7, 7, 8, 3, 7, 8, 7, 8, 8, 8, 8, 8, 8,
7, 8, 8, 7, 7, 8, 7, 7, 8, 7, 8, 8, 8, 8, 8, 7, 8, 7])
bboxes: tensor([[ 2.2612e+01, 1.8107e+01, 5.0551e+02, 3.0614e+02],
[ 2.4715e+02, 2.9132e+00, 4.8761e+02, 1.2487e+02],
[-4.6253e-03, 1.0893e+00, 2.2500e+02, 1.9713e+02],
[ 4.4889e+02, 2.0340e+02, 4.9955e+02, 3.3350e+02],
[ 3.1590e+02, 8.0462e+00, 4.2863e+02, 8.0039e+01],
[ 1.5060e-01, 5.7119e+01, 1.5483e+02, 2.0001e+02],
[ 4.6880e+02, 2.8145e+01, 4.9995e+02, 2.1162e+02],
[ 1.8293e+02, 9.5839e-01, 2.7254e+02, 8.2298e+01],
[ 4.3314e+02, 1.8549e+01, 5.0045e+02, 2.0971e+02],
[ 2.6429e+02, 4.1855e+00, 4.3024e+02, 7.8827e+01],
[ 4.7222e+02, 7.7413e+01, 4.9966e+02, 2.0957e+02],
[ 3.0090e+02, 2.2624e+00, 4.8894e+02, 8.3920e+01],
[ 1.2259e+02, 2.1786e-01, 2.7077e+02, 8.2355e+01],
[ 3.6559e+02, 7.1596e-01, 4.8988e+02, 8.0638e+01],
[ 4.3124e+02, 1.3196e+01, 5.0001e+02, 1.2064e+02],
[ 1.8098e+02, -4.1123e-01, 3.6902e+02, 8.5082e+01],
[ 4.3382e+02, -6.4725e-02, 4.9977e+02, 3.2767e+01],
[ 8.2624e+00, -1.2800e+00, 4.9877e+02, 1.3882e+02],
[ 3.8484e+02, -2.1366e-01, 4.2844e+02, 1.6723e+01],
[ 4.2810e+02, 7.4436e-01, 5.0003e+02, 9.9339e+01],
[ 5.9509e+01, 3.3575e+01, 1.5612e+02, 1.2494e+02],
[ 4.2025e+01, 2.7713e-02, 1.3678e+02, 1.0513e+01],
[ 4.2870e+02, 1.9189e+01, 5.0021e+02, 2.0556e+02],
[ 1.4984e+00, 3.5575e-01, 1.5954e+02, 4.6296e+01],
[ 4.1513e+02, 5.5887e+00, 4.9971e+02, 2.0745e+02],
[ 4.7239e+02, 3.2335e-01, 4.9949e+02, 3.1233e+01],
[-1.5110e-02, 1.8815e+00, 1.7492e+02, 6.3280e+01],
[-1.4726e+00, -5.3557e+00, 2.1534e+02, 1.2729e+02],
[ 3.2539e+02, 9.6910e+00, 4.9961e+02, 2.0920e+02],
[-1.2709e+00, -3.7706e+00, 1.9346e+02, 1.1536e+02],
[ 1.4384e+02, 6.6114e-01, 2.1241e+02, 1.3464e+02],
[ 2.3672e+01, 2.7747e-01, 4.6422e+02, 9.3514e+01],
[ 2.3612e+02, -3.8021e-02, 3.6310e+02, 7.7880e+01],
[ 2.8233e+00, 3.4833e+00, 4.9718e+02, 2.1795e+02],
[ 2.0233e+01, -1.8263e+01, 5.0125e+02, 2.8359e+02],
[ 2.9815e+02, 3.5670e-01, 4.9521e+02, 8.4606e+01],
[ 1.0033e+02, 1.1446e+00, 2.6725e+02, 8.3282e+01],
[ 3.4014e+02, -2.3646e+00, 5.0126e+02, 1.2127e+02],
[ 4.3322e+02, 2.3206e-01, 4.8474e+02, 5.3224e+01],
[ 6.6729e+01, -4.7621e-01, 2.2663e+02, 1.3694e+02],
[ 4.0154e+02, -4.7930e+00, 5.0158e+02, 1.5599e+02],
[ 2.4862e+02, 1.6185e+01, 3.1739e+02, 1.2575e+02]])
scores: tensor([0.9033, 0.8467, 0.8076, 0.7339, 0.3081, 0.2340, 0.2080, 0.1628, 0.1089,
0.0922, 0.0763, 0.0618, 0.0356, 0.0341, 0.0314, 0.0311, 0.0250, 0.0241,
0.0234, 0.0212, 0.0198, 0.0193, 0.0191, 0.0191, 0.0182, 0.0177, 0.0165,
0.0152, 0.0145, 0.0135, 0.0134, 0.0131, 0.0130, 0.0127, 0.0125, 0.0125,
0.0115, 0.0109, 0.0108, 0.0108, 0.0101, 0.0101])
) at 0x7f4c73eb1a50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([8, 8, 7, 7, 8, 7, 8, 7, 8, 7, 8, 7, 7, 8, 8, 3, 7, 8, 3, 3, 3, 3, 8, 7,
3, 8, 3, 7, 8, 8, 7, 8, 7, 8, 8, 7, 7])
bboxes: tensor([[ 1.2344e+02, 8.0375e+01, 1.7460e+02, 2.4248e+02],
[ 3.5781e+02, 8.7783e+01, 4.3790e+02, 2.4737e+02],
[ 3.8148e+01, 1.1488e+02, 1.4466e+02, 2.8981e+02],
[ 3.5583e+02, 1.2638e+02, 4.2854e+02, 2.6776e+02],
[ 3.2463e+02, 9.4089e+01, 3.7107e+02, 2.3911e+02],
[ 2.1035e+02, 9.8244e+01, 3.3340e+02, 3.2949e+02],
[ 2.7799e+02, 1.8434e+02, 3.4936e+02, 2.8558e+02],
[ 4.9350e+02, 1.1716e+02, 5.0025e+02, 1.5471e+02],
[ 4.9296e+02, 1.1626e+02, 5.0001e+02, 1.5444e+02],
[ 5.0335e-02, 2.1489e+02, 5.1315e+00, 2.7027e+02],
[ 6.9383e-02, 2.0902e+02, 5.1705e+00, 2.6949e+02],
[ 4.9450e+02, 1.2420e+02, 5.0003e+02, 1.5588e+02],
[ 1.8489e-01, 1.8612e+02, 5.2167e+00, 2.6974e+02],
[ 3.8967e+01, 1.0966e+02, 1.4140e+02, 2.3566e+02],
[ 1.2722e-01, 1.7124e+02, 5.1004e+00, 2.7290e+02],
[ 3.9251e+01, 1.0694e+02, 5.8844e+01, 1.1592e+02],
[ 4.8960e+02, 1.1543e+02, 5.0024e+02, 1.5528e+02],
[-7.9864e-02, 1.3420e+02, 4.4439e+00, 2.7479e+02],
[-1.3017e-01, 1.0867e+02, 9.5784e+00, 1.1730e+02],
[ 7.7167e+00, 1.0885e+02, 2.6194e+01, 1.1752e+02],
[ 6.8754e+01, 1.0304e+02, 8.6910e+01, 1.1083e+02],
[ 1.3864e+00, 1.0863e+02, 1.8120e+01, 1.1793e+02],
[-1.0107e-01, 9.1903e+01, 8.9023e+00, 2.6786e+02],
[ 3.2286e+02, 9.4378e+01, 4.2832e+02, 2.4722e+02],
[ 5.7158e+01, 1.0524e+02, 6.9697e+01, 1.1097e+02],
[ 1.0064e-01, 1.9960e+02, 6.9367e+00, 2.6993e+02],
[ 3.9055e+01, 1.0783e+02, 5.0789e+01, 1.1638e+02],
[-1.6304e-01, 2.3339e+02, 5.0184e+00, 2.7403e+02],
[-6.8258e-02, 1.4891e+02, 7.1239e+00, 2.6984e+02],
[ 5.1473e+01, 1.9500e+02, 7.4113e+01, 2.4484e+02],
[ 1.0762e-01, 1.7992e+02, 9.7313e+00, 2.7321e+02],
[-2.2908e-01, 5.5771e+01, 1.0770e+01, 2.5751e+02],
[ 3.9640e+02, 1.4888e+02, 4.1688e+02, 2.0287e+02],
[ 4.9704e+01, 2.1480e+02, 7.8128e+01, 2.4575e+02],
[ 2.6641e+02, 1.1339e+02, 3.5664e+02, 2.8427e+02],
[ 3.2496e+02, 1.2196e+02, 3.7269e+02, 2.3781e+02],
[-1.1021e-01, 2.2071e+02, 1.0199e+01, 2.7773e+02]])
scores: tensor([0.9419, 0.9355, 0.9209, 0.9141, 0.8765, 0.8516, 0.7178, 0.6069, 0.4463,
0.2023, 0.1921, 0.1720, 0.1648, 0.1320, 0.1043, 0.1031, 0.0668, 0.0539,
0.0504, 0.0493, 0.0477, 0.0374, 0.0365, 0.0322, 0.0305, 0.0276, 0.0261,
0.0261, 0.0199, 0.0175, 0.0161, 0.0149, 0.0146, 0.0119, 0.0113, 0.0111,
0.0102])
) at 0x7f4c73d64bd0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 2, 8, 8, 8, 8, 7, 2, 7, 8, 8, 8,
8, 8, 8, 8, 8, 7, 2])
bboxes: tensor([[ 3.8448e+02, 1.0300e+02, 4.6318e+02, 1.9521e+02],
[ 2.9650e+02, 1.1040e+02, 3.7889e+02, 1.9796e+02],
[ 1.0244e+02, 1.0244e+02, 2.0772e+02, 2.0846e+02],
[ 2.5453e+02, 9.8049e+01, 3.2125e+02, 1.8357e+02],
[ 3.9554e+02, 6.0650e+01, 4.6540e+02, 1.7687e+02],
[ 1.2334e+02, 7.1516e+01, 2.0986e+02, 2.0269e+02],
[ 3.0908e+02, 7.0008e+01, 3.8038e+02, 1.7707e+02],
[ 2.6410e+02, 7.0086e+01, 3.2184e+02, 1.6587e+02],
[ 1.5255e+02, 4.0767e+01, 1.7870e+02, 8.0626e+01],
[ 4.4303e+02, 3.9880e+01, 4.7182e+02, 1.1391e+02],
[ 4.2424e+02, 3.9218e+01, 4.4841e+02, 6.6854e+01],
[ 1.2570e+02, 7.3808e+01, 1.7410e+02, 1.1394e+02],
[ 2.0647e+01, 3.4617e+01, 3.3162e+01, 6.3551e+01],
[ 2.5217e+02, 7.4287e+00, 4.9978e+02, 1.0079e+02],
[ 6.8253e+01, 3.4682e+01, 8.0965e+01, 6.5633e+01],
[ 4.9507e+02, 4.9954e+01, 4.9946e+02, 1.2101e+02],
[-1.6405e-02, 4.0138e+01, 1.3688e+01, 6.6227e+01],
[ 4.4074e+02, 3.8045e+01, 4.5145e+02, 5.6854e+01],
[ 5.9148e+01, 4.1689e+01, 7.9817e+01, 6.1944e+01],
[ 3.5397e+02, 1.1919e+00, 4.9759e+02, 9.5122e+01],
[ 4.0559e+02, 6.6192e+01, 4.7566e+02, 1.1736e+02],
[ 4.2374e+02, 4.1040e+01, 4.5048e+02, 9.6259e+01],
[ 4.9578e+02, 6.4668e+01, 4.9953e+02, 1.2415e+02],
[ 4.9462e+02, 2.3952e+01, 4.9913e+02, 1.2593e+02],
[ 4.9393e+02, 4.4303e+01, 5.0060e+02, 1.2022e+02],
[ 4.5613e-01, 3.9425e+01, 1.6573e+01, 9.1921e+01],
[ 4.3530e+02, 3.8808e+01, 4.5064e+02, 6.3556e+01],
[ 4.8950e+02, 4.3486e+01, 4.9956e+02, 1.2416e+02],
[ 6.3631e+01, 3.5897e+01, 8.3048e+01, 6.4321e+01],
[ 1.2469e+02, 7.4679e+01, 2.3742e+02, 1.1161e+02],
[ 2.5305e+02, 1.9583e-01, 4.9539e+02, 1.5203e+02]])
scores: tensor([0.9497, 0.9424, 0.9365, 0.9351, 0.9321, 0.9204, 0.9194, 0.8989, 0.8389,
0.8130, 0.8052, 0.6689, 0.6401, 0.5947, 0.5566, 0.2825, 0.2534, 0.1823,
0.1710, 0.1158, 0.1070, 0.1034, 0.0844, 0.0440, 0.0426, 0.0352, 0.0295,
0.0272, 0.0195, 0.0127, 0.0110])
) at 0x7f4c73ba7850>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 8])
bboxes: tensor([[ 51.3095, 134.0120, 310.7999, 285.5193],
[273.1259, 58.9821, 394.0617, 277.5413],
[269.0843, 116.9513, 403.1813, 230.3143],
[152.7141, 139.4707, 160.3718, 152.3262]])
scores: tensor([0.9658, 0.9624, 0.0568, 0.0120])
) at 0x7f4c73eb28d0>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 0, 7, 7, 8, 7, 7, 8, 8, 8])
bboxes: tensor([[ 1.6816e+02, 1.0975e+02, 4.0489e+02, 2.6485e+02],
[ 2.2691e+02, 9.4515e+01, 3.5590e+02, 2.4240e+02],
[ 1.9421e+02, 3.2945e+02, 2.7532e+02, 3.7368e+02],
[ 5.2750e+00, 1.7068e-01, 1.8262e+02, 2.2827e+01],
[ 1.8483e+02, 3.2824e+02, 3.7884e+02, 3.7450e+02],
[ 1.6726e+02, -2.0679e-02, 1.8118e+02, 9.4445e+00],
[ 8.5513e+01, 5.7839e-02, 1.8460e+02, 1.5921e+01],
[ 1.2194e+02, -1.9368e-01, 1.8567e+02, 1.3048e+01],
[ 1.4649e+02, -3.5056e-01, 1.8008e+02, 1.0025e+01],
[ 2.5532e+02, 3.3007e+02, 2.7124e+02, 3.5508e+02],
[ 1.1766e+02, 7.5046e-02, 1.7980e+02, 1.0759e+01]])
scores: tensor([0.9653, 0.9131, 0.0481, 0.0350, 0.0258, 0.0216, 0.0155, 0.0148, 0.0144,
0.0116, 0.0110])
) at 0x7f4c73c36b10>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 8, 8, 8, 7, 8, 8, 7, 8, 7, 7, 8, 8, 7, 8, 8, 7, 7, 8, 7, 8, 8,
7, 8, 7, 8, 8, 0])
bboxes: tensor([[ 60.1871, 94.8003, 386.2973, 301.6841],
[223.1392, 102.4838, 230.7670, 121.3444],
[264.1303, 126.3315, 296.8072, 164.6841],
[266.3674, 120.7412, 294.5701, 166.5635],
[201.1645, 152.5987, 215.6324, 181.3857],
[218.9001, 132.7385, 249.0686, 178.7850],
[182.9373, 219.4694, 317.0627, 262.1713],
[272.5203, 121.1257, 286.8547, 160.3196],
[275.0216, 121.1683, 285.9159, 131.9567],
[218.6263, 104.7136, 298.1706, 166.9661],
[265.8151, 121.4964, 284.1849, 166.7848],
[175.0369, 152.0790, 212.0725, 182.1007],
[242.8973, 110.6611, 297.7277, 165.7061],
[272.9492, 120.8199, 288.7695, 145.7817],
[220.4158, 132.3303, 237.3967, 158.4900],
[220.5662, 135.1572, 252.0901, 168.9444],
[211.5210, 125.5684, 222.4634, 150.6035],
[210.7893, 128.0862, 225.9294, 156.2888],
[176.5171, 120.5339, 251.6079, 182.7864],
[238.9776, 128.8168, 280.9443, 167.6676],
[212.0682, 134.7090, 230.5099, 160.2129],
[197.8998, 214.7596, 352.8815, 261.0217],
[211.2747, 127.2713, 234.4284, 157.8850],
[266.0341, 133.6446, 279.2784, 166.7460],
[176.4193, 154.8448, 197.6041, 178.9443],
[171.8957, 154.8875, 184.7449, 178.7062],
[171.3519, 147.1142, 218.2966, 179.2530],
[174.3686, 155.8312, 187.5455, 176.2001],
[112.2356, 141.2250, 131.9050, 160.5328],
[209.2535, 99.6867, 297.3871, 163.7899]])
scores: tensor([0.9243, 0.2998, 0.1368, 0.0953, 0.0846, 0.0651, 0.0598, 0.0551, 0.0496,
0.0470, 0.0427, 0.0423, 0.0416, 0.0401, 0.0390, 0.0366, 0.0355, 0.0235,
0.0131, 0.0125, 0.0120, 0.0119, 0.0114, 0.0112, 0.0111, 0.0111, 0.0108,
0.0107, 0.0107, 0.0105])
) at 0x7f4c73e83b50>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 7, 8, 7, 8, 7, 8, 8, 7, 7, 8, 3, 7, 7, 7, 8, 8, 8, 7, 8, 7, 3, 8,
7, 8, 8, 8, 7, 8, 7, 8, 8, 8, 8, 7, 8, 7, 8, 3, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 7, 8, 7, 3, 8, 8, 3, 2])
bboxes: tensor([[2.2695e-01, 5.6731e+01, 6.0564e+01, 1.0604e+02],
[1.2620e+02, 3.9722e+01, 2.9411e+02, 2.3058e+02],
[3.7562e+01, 1.0333e+02, 3.1595e+02, 2.6417e+02],
[2.0386e+02, 1.7666e+01, 2.5004e+02, 7.0988e+01],
[6.9262e+01, 1.2385e+02, 4.9793e+02, 3.2698e+02],
[2.7292e+02, 1.8640e+01, 3.0091e+02, 9.9630e+01],
[1.2141e+02, 5.8094e+01, 1.5808e+02, 1.0702e+02],
[4.3634e+02, 2.2636e+01, 4.5194e+02, 3.6938e+01],
[3.3623e+02, 2.8689e+01, 5.0361e+02, 2.8748e+02],
[5.8907e+01, 5.7775e+01, 1.0476e+02, 1.0011e+02],
[1.6418e+02, 1.5822e+02, 5.0262e+02, 3.2579e+02],
[3.3435e+02, 2.8127e+01, 3.9104e+02, 7.6677e+01],
[2.9912e+02, 2.0135e+01, 3.9112e+02, 7.1154e+01],
[2.8740e+02, 3.7704e+01, 5.0127e+02, 1.4165e+02],
[2.9257e+02, 6.3039e+01, 4.2696e+02, 1.3203e+02],
[1.3460e+02, 8.2611e+01, 2.0817e+02, 1.5393e+02],
[1.2495e+02, 2.5384e+01, 1.6392e+02, 9.6301e+01],
[7.2091e+01, 2.6348e+01, 9.2948e+01, 6.5819e+01],
[4.5118e+02, 2.5975e+01, 4.6366e+02, 3.7746e+01],
[9.6951e+01, 5.5319e+01, 1.2649e+02, 1.0101e+02],
[3.8191e+02, 1.9677e+01, 3.9856e+02, 6.5464e+01],
[3.3638e+02, 2.7933e+01, 5.0425e+02, 3.0775e+02],
[4.8197e+02, 3.2860e+01, 5.0006e+02, 4.6474e+01],
[2.7362e+02, 2.0304e+01, 3.0099e+02, 5.2395e+01],
[1.8346e+01, 2.6952e+01, 5.0314e+02, 2.9975e+02],
[1.3062e+02, 2.7691e+01, 1.6255e+02, 6.3841e+01],
[3.3174e+02, 4.2583e+01, 4.5458e+02, 2.7280e+02],
[3.3708e+02, 4.5492e+01, 4.3675e+02, 1.7641e+02],
[1.3668e+02, 6.7682e+01, 2.9106e+02, 1.6125e+02],
[4.3661e+02, 2.2470e+01, 4.5324e+02, 6.2524e+01],
[3.1273e+01, 9.5674e+01, 4.3318e+02, 2.8724e+02],
[3.2957e+02, 3.6428e+01, 4.8215e+02, 1.3259e+02],
[1.0264e+02, 3.7868e+01, 1.2646e+02, 6.7229e+01],
[9.9680e+01, 3.5695e+01, 1.2688e+02, 1.0297e+02],
[4.5067e+02, 2.6439e+01, 4.6496e+02, 5.1529e+01],
[2.9154e+02, 7.3483e+01, 3.5885e+02, 1.3164e+02],
[3.9716e+02, 3.4749e+01, 4.8487e+02, 1.2412e+02],
[1.4949e+02, 2.5130e+01, 4.9192e+02, 2.5396e+02],
[3.3206e+02, 3.0648e+01, 4.0700e+02, 1.3261e+02],
[1.5886e+02, 1.1092e+01, 4.6106e+02, 1.7607e+02],
[3.3802e+02, 2.8549e+01, 3.8854e+02, 6.2642e+01],
[1.6600e+02, 4.6635e+01, 1.9318e+02, 7.2903e+01],
[3.9799e+02, 3.5959e+01, 4.3951e+02, 8.3385e+01],
[1.9022e+02, 3.5634e+01, 2.2033e+02, 7.0536e+01],
[1.6556e+02, 2.8567e+01, 1.9518e+02, 7.1748e+01],
[4.0250e+02, 3.1081e+01, 5.0063e+02, 2.0351e+02],
[3.8445e+02, 2.5351e+01, 5.0383e+02, 3.2321e+02],
[1.7388e+02, 2.6214e+01, 5.1166e+02, 3.0440e+02],
[1.6306e+02, 5.0857e+01, 2.0334e+02, 8.5759e+01],
[1.6580e+02, 3.5800e+01, 2.0569e+02, 7.5932e+01],
[4.4071e+02, 2.5465e+01, 4.6163e+02, 6.3872e+01],
[4.5682e+02, 1.4812e+02, 4.9943e+02, 3.2847e+02],
[3.9356e+02, 1.2106e+02, 4.9863e+02, 3.2743e+02],
[1.2142e+02, 4.5419e+01, 2.8834e+02, 2.3113e+02],
[1.7151e+02, 1.4604e+01, 3.1013e+02, 1.7080e+02],
[1.5270e+02, 4.8800e+01, 1.6664e+02, 7.0641e+01],
[4.5224e+01, 4.9213e+01, 5.4239e+01, 6.6423e+01],
[1.7718e+02, 1.8378e+01, 2.9978e+02, 5.3102e+01],
[3.0026e+02, 1.8608e+01, 4.1498e+02, 7.1607e+01]])
scores: tensor([0.9243, 0.8784, 0.8501, 0.7622, 0.6919, 0.6904, 0.6729, 0.6567, 0.6406,
0.6284, 0.6250, 0.6240, 0.6099, 0.6069, 0.5947, 0.5806, 0.5210, 0.4287,
0.3643, 0.3557, 0.3433, 0.3049, 0.2588, 0.2236, 0.2159, 0.2092, 0.1938,
0.1757, 0.0855, 0.0801, 0.0739, 0.0610, 0.0506, 0.0470, 0.0409, 0.0384,
0.0360, 0.0336, 0.0284, 0.0278, 0.0234, 0.0231, 0.0216, 0.0216, 0.0207,
0.0196, 0.0194, 0.0190, 0.0129, 0.0127, 0.0125, 0.0123, 0.0122, 0.0122,
0.0114, 0.0111, 0.0109, 0.0104, 0.0103])
) at 0x7f4c73ba6e10>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 7, 8, 8, 7, 7, 7, 8, 8, 7,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 8, 8, 8, 8, 8, 8, 8, 7, 7, 8,
8, 8, 8, 8])
bboxes: tensor([[ 5.0690e+01, 2.2368e+01, 4.3720e+02, 3.4912e+02],
[-6.5878e+00, 1.6895e+00, 3.0581e+02, 1.8073e+02],
[ 3.1509e+02, 1.0630e+00, 3.6108e+02, 7.3986e+01],
[ 9.2416e+01, 3.0911e-01, 2.0426e+02, 8.1185e+01],
[ 3.9245e+02, 5.4134e+00, 5.0052e+02, 3.7857e+02],
[ 2.9586e+02, -5.7960e-01, 3.2015e+02, 7.6459e+01],
[ 3.1375e+02, 1.8623e+00, 5.0734e+02, 3.8837e+02],
[ 4.4138e+02, 4.1040e+01, 5.0003e+02, 3.8826e+02],
[ 2.1063e+02, 2.7041e-01, 2.5187e+02, 1.1522e+01],
[ 2.8436e+02, 5.3258e-01, 2.9650e+02, 3.8188e+01],
[ 4.2511e+02, 1.7663e-01, 4.6317e+02, 1.2111e+02],
[-2.0334e+01, -5.8637e+00, 4.9572e+02, 1.5137e+02],
[ 4.2683e+02, 6.5137e-01, 4.9661e+02, 1.2230e+02],
[ 3.9568e+02, 4.8363e-01, 4.5979e+02, 1.2237e+02],
[ 3.8194e+02, -3.1306e-01, 4.3603e+02, 5.0606e+01],
[ 1.7865e+02, -1.6821e+00, 4.9908e+02, 1.5881e+02],
[ 2.4205e+02, 2.4906e+00, 5.0053e+02, 3.3052e+02],
[-8.5809e-02, -1.5850e-01, 1.0645e+01, 4.4690e+01],
[ 3.6314e+01, 1.3628e+00, 2.7677e+02, 1.0498e+02],
[ 2.9582e+02, -1.4637e+00, 4.8309e+02, 1.4668e+02],
[-1.4907e+01, 1.1476e+00, 4.3170e+02, 2.1252e+02],
[ 3.3322e+02, -2.3527e+00, 4.9881e+02, 1.4278e+02],
[ 2.0000e+01, 2.4586e-01, 1.3098e+02, 2.8636e+01],
[-2.3046e+00, -2.0540e-01, 4.3824e+02, 1.0733e+02],
[ 2.2215e+01, 8.4842e-01, 2.0376e+02, 8.4210e+01],
[ 3.7601e+02, 2.2619e-01, 4.3259e+02, 1.6693e+01],
[ 2.7711e-01, -3.0779e-01, 2.3453e+01, 4.3716e+01],
[ 4.5963e+02, -3.5754e-02, 4.9428e+02, 3.1335e+01],
[ 2.5037e+02, 1.2067e+00, 2.8948e+02, 7.2719e+01],
[ 4.7282e+01, -1.1378e-01, 1.3123e+02, 9.0066e+00],
[ 4.5408e+02, -1.4758e+00, 5.0061e+02, 1.5411e+02],
[ 1.7423e+02, -1.7266e+00, 4.8359e+02, 1.4860e+02],
[ 2.0229e+02, 2.0019e+00, 2.6100e+02, 7.3584e+01],
[ 8.6903e+01, -6.2880e-01, 1.3400e+02, 2.4066e+01],
[ 2.8675e+02, -1.6475e+00, 4.7184e+02, 1.4208e+02],
[ 7.8144e-01, -1.6031e-01, 8.7549e+01, 3.9565e+01],
[ 4.5647e+02, 1.5268e+00, 5.0057e+02, 1.0619e+02],
[ 3.1477e+02, 8.0895e-01, 4.5671e+02, 7.1213e+01],
[ 7.6046e+01, -5.0016e-01, 1.2962e+02, 7.8610e+00],
[ 6.2024e+01, -3.8208e+00, 2.6219e+02, 1.8644e+02],
[ 8.9823e-02, 1.1775e+00, 1.2886e+01, 7.4457e+01],
[ 8.3456e+01, -3.4331e+00, 2.1811e+02, 1.7150e+02],
[ 7.3255e+00, 2.9569e-01, 4.9307e+02, 1.4472e+02],
[-7.3873e-02, 4.8964e-01, 9.2902e+00, 2.9418e+01],
[ 3.4330e+01, -3.7662e-01, 1.1381e+02, 7.4933e+00],
[ 2.5344e+02, -1.6359e+00, 5.0593e+02, 2.1433e+02],
[ 2.1626e+02, 5.9205e+00, 5.0014e+02, 3.6478e+02],
[ 6.4862e+01, 4.9707e-02, 1.3338e+02, 2.7172e+01],
[-4.8466e-04, 1.8248e+00, 8.1670e+00, 7.4054e+01],
[ 4.6231e+02, -1.3082e-01, 4.9941e+02, 3.9486e+01],
[ 1.9911e+02, 6.9879e-01, 2.8995e+02, 7.3764e+01],
[ 7.8816e+01, 5.9946e-01, 2.8095e+02, 8.1871e+01]])
scores: tensor([0.8740, 0.8105, 0.7700, 0.7373, 0.6636, 0.5986, 0.2229, 0.2003, 0.1940,
0.1799, 0.1438, 0.1238, 0.1061, 0.0933, 0.0831, 0.0768, 0.0753, 0.0721,
0.0627, 0.0627, 0.0614, 0.0591, 0.0550, 0.0493, 0.0446, 0.0374, 0.0321,
0.0313, 0.0298, 0.0253, 0.0242, 0.0235, 0.0232, 0.0197, 0.0193, 0.0187,
0.0186, 0.0171, 0.0164, 0.0157, 0.0147, 0.0143, 0.0141, 0.0140, 0.0140,
0.0132, 0.0125, 0.0124, 0.0118, 0.0118, 0.0116, 0.0106])
) at 0x7f4c73ec2450>, <InstanceData(
META INFORMATION
DATA FIELDS
labels: tensor([7, 8, 3, 7, 8, 7, 7, 8])
bboxes: tensor([[ 4.5216, 192.0680, 387.2752, 503.2445],
[110.7337, 81.4261, 290.8288, 420.5270],
[111.6346, 79.0848, 315.7091, 360.7589],
[105.6097, 72.8126, 331.4996, 413.9062],
[109.9213, 77.7690, 285.1959, 299.7700],
[ 5.7558, 224.4943, 242.4864, 430.9744],
[ 6.5127, 71.4787, 356.3779, 452.7401],
[300.0851, 352.3588, 320.6181, 426.1568]])
scores: tensor([0.8887, 0.8018, 0.1181, 0.0549, 0.0540, 0.0239, 0.0179, 0.0128])
) at 0x7f4c73ba5950>]
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/backbones/csp_darknet.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
A new predictor save path is created. This is to prevent you to overwrite previous predictor saved here. You could check current save path at predictor._save_path. If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_101342"
Saved detection results to /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101342/result.txt
输出 pred
是一个 pandas
DataFrame
,它有两列,image
和 bboxes
。
在image
中,每一行包含图像路径
在bboxes
中,每一行是一个字典列表,每个字典代表一个边界框:{"class":
请注意,默认情况下,predictor.predict
不会将检测结果保存到文件中。
要运行推理并保存结果,请运行以下内容:
pred = predictor.predict(test_path, save_results=True)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
saving file at /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101344-001/result.json
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
/home/ci/opt/venv/lib/python3.11/site-packages/mmdet/models/backbones/csp_darknet.py:118: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(enabled=False):
A new predictor save path is created. This is to prevent you to overwrite previous predictor saved here. You could check current save path at predictor._save_path. If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_101344"
Saved detection results to /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101344/result.txt
A new predictor save path is created. This is to prevent you to overwrite previous predictor saved here. You could check current save path at predictor._save_path. If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20241127_101344-001"
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `model.mmdet_image.coco_root=...` in hyperparameters if you think it is wrong.
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/lib/python3.11/logging/__init__.py", line 1110, in emit
msg = self.format(record)
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/logging/__init__.py", line 953, in format
return fmt.format(record)
^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/logging/__init__.py", line 687, in format
record.message = record.getMessage()
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/logging/__init__.py", line 377, in getMessage
msg = msg % self.args
~~~~^~~~~~~~~~~
TypeError: not all arguments converted during string formatting
Call stack:
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel_launcher.py", line 18, in <module>
app.launch_new_instance()
File "/home/ci/opt/venv/lib/python3.11/site-packages/traitlets/config/application.py", line 1075, in launch_instance
app.start()
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/kernelapp.py", line 739, in start
self.io_loop.start()
File "/home/ci/opt/venv/lib/python3.11/site-packages/tornado/platform/asyncio.py", line 205, in start
self.asyncio_loop.run_forever()
File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/opt/conda/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 545, in dispatch_queue
await self.process_one()
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 534, in process_one
await dispatch(*args)
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
await result
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 362, in execute_request
await super().execute_request(stream, ident, parent)
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 778, in execute_request
reply_content = await reply_content
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 449, in do_execute
res = shell.run_cell(
File "/home/ci/opt/venv/lib/python3.11/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
return super().run_cell(*args, **kwargs)
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3009, in run_cell
result = self._run_cell(
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3064, in _run_cell
result = runner(coro)
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
coro.send(None)
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3269, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3448, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/home/ci/opt/venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_4462/4018775541.py", line 1, in <module>
pred = predictor.predict(test_path, save_results=True)
File "/home/ci/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 640, in predict
return self._learner.predict(
File "/home/ci/autogluon/multimodal/src/autogluon/multimodal/learners/object_detection.py", line 755, in predict
save_result_coco_format(
File "/home/ci/autogluon/multimodal/src/autogluon/multimodal/utils/object_detection.py", line 1610, in save_result_coco_format
logger.info(25, f"Saved detection result to {result_path}")
Message: 25
Arguments: ('Saved detection result to /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101344-001/result.json',)
Saved detection results as coco to /home/ci/autogluon/docs/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20241127_101344-001/result.json
在这里,我们将pred
保存到一个.txt
文件中,该文件完全遵循与pred
相同的布局。
你可以使用以任何方式初始化的预测器(即微调的预测器、带有预训练模型的预测器等)。
可视化结果¶
要运行可视化,请确保已安装opencv
。如果尚未安装,请通过运行以下命令安装opencv
:
!pip install opencv-python
Requirement already satisfied: opencv-python in /home/ci/opt/venv/lib/python3.11/site-packages (4.10.0.84)
Requirement already satisfied: numpy>=1.21.2 in /home/ci/opt/venv/lib/python3.11/site-packages (from opencv-python) (1.26.4)
要可视化检测边界框,请运行以下内容:
from autogluon.multimodal.utils import ObjectDetectionVisualizer
conf_threshold = 0.4 # Specify a confidence threshold to filter out unwanted boxes
image_result = pred.iloc[30]
img_path = image_result.image # Select an image to visualize
visualizer = ObjectDetectionVisualizer(img_path) # Initialize the Visualizer
out = visualizer.draw_instance_predictions(image_result, conf_threshold=conf_threshold) # Draw detections
visualized = out.get_image() # Get the visualized image
from PIL import Image
from IPython.display import display
img = Image.fromarray(visualized, 'RGB')
display(img)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[17], line 4
1 from autogluon.multimodal.utils import ObjectDetectionVisualizer
3 conf_threshold = 0.4 # Specify a confidence threshold to filter out unwanted boxes
----> 4 image_result = pred.iloc[30]
6 img_path = image_result.image # Select an image to visualize
8 visualizer = ObjectDetectionVisualizer(img_path) # Initialize the Visualizer
AttributeError: 'list' object has no attribute 'iloc'
在您自己的数据上进行测试¶
你也可以使用各种输入格式来预测你自己的图像。以下是一个示例:
下载示例图片:
from autogluon.multimodal import download
image_url = "https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg"
test_image = download(image_url)
对COCO格式的json文件中的数据运行推理(有关COCO格式的更多详细信息,请参见将数据转换为COCO格式)。请注意,由于根目录默认是注释文件的父文件夹,因此我们将注释文件放在一个文件夹中:
import json
# create a input file for demo
data = {"images": [{"id": 0, "width": -1, "height": -1, "file_name": test_image}], "categories": []}
os.mkdir("input_data_for_demo")
input_file = "input_data_for_demo/demo_annotation.json"
with open(input_file, "w+") as f:
json.dump(data, f)
pred_test_image = predictor.predict(input_file)
print(pred_test_image)
对图像文件名列表中的数据运行推理:
pred_test_image = predictor.predict([test_image])
print(pred_test_image)
Other Examples¶
You may go to AutoMM Examples to explore other examples about AutoMM.
Customization¶
To learn how to customize AutoMM, please refer to Customize AutoMM.
Citation¶
@article{DBLP:journals/corr/abs-2107-08430,
author = {Zheng Ge and
Songtao Liu and
Feng Wang and
Zeming Li and
Jian Sun},
title = {{YOLOX:} Exceeding {YOLO} Series in 2021},
journal = {CoRR},
volume = {abs/2107.08430},
year = {2021},
url = {https://arxiv.org/abs/2107.08430},
eprinttype = {arXiv},
eprint = {2107.08430},
timestamp = {Tue, 05 Apr 2022 14:09:44 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2107-08430.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
}