Dataset

The Dataset component stores the ground truth and the model predictions.
It provides distribution analyses that help identify possible bias in the data set.

DatasetClassification

The DatasetClassification class can be used to store the ground truth and predictions for classification models.

Parameters

dataset_gt_param: str; Path of the ground truth .json file.
task_type: TaskType; Problem task type. It can be:
TaskType.CLASSIFICATION_BINARY, TaskType.CLASSIFICATION_SINGLE_LABEL, TaskType.CLASSIFICATION_MULTI_LABEL.
proposals_paths: str or list of tuple, optional; Path of the proposals of a single model, or a list of tuples, each containing a model name and the corresponding proposals path.
(default is None)
observations_set_name: str, optional; Name of the data set.
(default is 'test')
observations_abs_path: str, optional; Path of the observation directory.
(default is None)
result_saving_path: str, optional; Path used to save results.
(default is './results/')
similar_classes: list of list, optional; List of groups of ids of categories which are similar to each other.
(default is None)
properties_file: str, optional; Name of the file used to store the property names and values and the category names.
(default is 'properties.json')
for_analysis: bool, optional; Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
load_properties: bool, optional; Indicates whether the properties should be loaded.
(default is True)
match_on_filename: bool, optional; Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
save_graphs_as_png: bool, optional; Indicates whether plots should be saved as .png images.
(default is True)
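
As a rough illustration of the two shapes accepted by proposals_paths and of the nesting expected by similar_classes, the values could look like the sketch below. The model names, directories, and category ids are hypothetical placeholders, not values defined by Odin.

```python
# Single model: one path to the directory holding its predictions.
single_model_proposals = "/path/to/predictions"

# Several models: one (model name, proposals path) tuple per model.
# Names and paths are hypothetical.
multi_model_proposals = [
    ("model_a", "/path/to/model_a/predictions"),
    ("model_b", "/path/to/model_b/predictions"),
]

# Each inner list groups the ids of categories considered similar to each other.
# The ids are placeholders and should match the "id" values of the ground truth categories.
similar_classes = [[1, 2], [3, 4, 5]]
```

Either proposals value would then be passed as proposals_paths, and the nested list as similar_classes, when constructing the dataset.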

Input format

Ground Truth

Odin accepts as ground truth a .json file in the following format:

"categories" :[       # list of categories in the data set
  {
    "id": ...,        # id of the category
    "name": ...       # name of the category
  },
  ...],

"observations":[      # list of observations
  {
    "id": ...,        # id of the observation
    "file_name": ..., # filename of the observation (it is mandatory only if match_on_filename=True)
    "category": ...,  # id of the category present in the observation (it is mandatory only for TaskType.CLASSIFICATION_BINARY and TaskType.CLASSIFICATION_SINGLE_LABEL classification tasks)
    "categories": ...,# list of ids of the categories present in the observation (it is mandatory only for TaskType.CLASSIFICATION_MULTI_LABEL classification task)
    "...": ...        # any property name with the corresponding property value
  },
  ...]
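
A minimal sketch of how such a file could be produced with the standard json module is shown below. It assumes that the two lists are keys of a single top-level JSON object; the category names, file names, and the "weather" property are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical single-label ground truth; "weather" is an example of a custom property.
ground_truth = {
    "categories": [
        {"id": 1, "name": "cat"},
        {"id": 2, "name": "dog"},
    ],
    "observations": [
        {"id": 1, "file_name": "img_001.jpg", "category": 1, "weather": "sunny"},
        {"id": 2, "file_name": "img_002.jpg", "category": 2, "weather": "rainy"},
    ],
}

# Write the file to a temporary directory (any writable location works).
gt_path = os.path.join(tempfile.mkdtemp(), "gt.json")
with open(gt_path, "w") as f:
    json.dump(ground_truth, f, indent=2)
```

For a multi-label task, each observation would carry a "categories" list of ids in place of the single "category" id.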

Predictions

Predictions must be stored in .txt files, one per category, all in the same directory. Each line of a file represents one prediction and must use the following format:

# category_name_a.txt

observation_id confidence        # if match_on_filename=False
...
observation_file_name confidence # if match_on_filename=True
...
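
The sketch below shows one way to write model scores into this layout, one file per category, with match_on_filename=False in mind. It assumes whitespace-separated fields as shown above and files named after the category; the scores are invented for illustration.

```python
import os
import tempfile
from collections import defaultdict

# Hypothetical model output: (observation_id, category_name, confidence).
scores = [
    (1, "cat", 0.91),
    (1, "dog", 0.09),
    (2, "cat", 0.20),
    (2, "dog", 0.80),
]

# Group the predictions by category, since each category gets its own file.
by_category = defaultdict(list)
for observation_id, category_name, confidence in scores:
    by_category[category_name].append((observation_id, confidence))

# Write "<category_name>.txt" files into a single directory.
predictions_dir = tempfile.mkdtemp()
for category_name, rows in by_category.items():
    file_path = os.path.join(predictions_dir, f"{category_name}.txt")
    with open(file_path, "w") as f:
        for observation_id, confidence in rows:
            f.write(f"{observation_id} {confidence}\n")
```

The resulting directory is what proposals_paths would point to.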

Example

from odin.classes import DatasetClassification, TaskType

# define the path of the GT .json file
dataset_gt_param = "/path/to/gt/file.json"

# define the path of the folder that contains the predictions .txt files of the model
path_to_detections = "/path/to/predictions"

classification_type = TaskType.CLASSIFICATION_MULTI_LABEL

my_dataset = DatasetClassification(dataset_gt_param, classification_type, proposals_paths=path_to_detections)

DatasetLocalization

The DatasetLocalization class can be used to store the ground truth and predictions for localization models, such as object detection and instance segmentation.

Parameters

dataset_gt_param: str; Path of the ground truth .json file.
task_type: TaskType; Problem task type. It can be:
TaskType.OBJECT_DETECTION, TaskType.INSTANCE_SEGMENTATION.
proposals_paths: str or list of tuple, optional; Path of the proposals of a single model, or a list of tuples, each containing a model name and the corresponding proposals path.
(default is None)
images_set_name: str, optional; Name of the data set.
(default is 'test')
images_abs_path: str, optional; Path of the images directory.
(default is None)
result_saving_path: str, optional; Path used to save results.
(default is './results/')
similar_classes: list of list, optional; List of groups of ids of categories which are similar to each other.
(default is None)
properties_file: str, optional; Name of the file used to store the property names and values and the category names.
(default is 'properties.json')
for_analysis: bool, optional; Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
load_properties: bool, optional; Indicates whether the properties should be loaded.
(default is True)
match_on_filename: bool, optional; Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
save_graphs_as_png: bool, optional; Indicates whether plots should be saved as .png images.
(default is True)

Input format

Ground Truth

Odin accepts as ground truth a .json file in the following format:

"categories" :[           # list of categories in the data set
  {
    "id": ...,            # id of the category
    "name": ...           # name of the category
  },
  ...],

"images":[                # list of images
  {
    "id": ...,            # id of the image
    "file_name": ...,     # filename of the image (it is mandatory only if match_on_filename=True)
  },
  ...],

"annotations": [          # list of annotations
  {
    "id": ...,            # id of the annotation
    "image_id": ...,      # id of the image the annotation refers to
    "category_id": ...,   # id of the category present in the annotation
    "segmentation": ...,  # segmentation mask of the annotation (it is mandatory only for TaskType.INSTANCE_SEGMENTATION localization task)
    "bbox": ...,          # bounding box of the annotation (it is mandatory only for TaskType.OBJECT_DETECTION)
    "...": ...            # any property name with the corresponding property value
  }
]
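
Before loading a ground truth file, it can help to check that every annotation refers to an existing image and category. The helper below is a hypothetical sanity check written for this purpose, not part of Odin; the example data and the [min_x, min_y, width, height] layout of "bbox" are assumptions.

```python
def find_gt_problems(gt, needs_bbox=True):
    """Return a list of human-readable problems found in a ground truth dict."""
    problems = []
    image_ids = {image["id"] for image in gt.get("images", [])}
    category_ids = {category["id"] for category in gt.get("categories", [])}
    # "bbox" is required for object detection, "segmentation" for instance segmentation.
    required_key = "bbox" if needs_bbox else "segmentation"

    for annotation in gt.get("annotations", []):
        annotation_id = annotation.get("id")
        if annotation.get("image_id") not in image_ids:
            problems.append(f"annotation {annotation_id}: unknown image_id")
        if annotation.get("category_id") not in category_ids:
            problems.append(f"annotation {annotation_id}: unknown category_id")
        if required_key not in annotation:
            problems.append(f"annotation {annotation_id}: missing {required_key}")
    return problems


# Hypothetical ground truth with one valid and one inconsistent annotation.
gt = {
    "categories": [{"id": 1, "name": "car"}],
    "images": [{"id": 10, "file_name": "img_010.jpg"}],
    "annotations": [
        {"id": 1, "image_id": 10, "category_id": 1, "bbox": [5, 5, 20, 30]},
        {"id": 2, "image_id": 10, "category_id": 99, "bbox": [0, 0, 10, 10]},
    ],
}

problems = find_gt_problems(gt)
print(problems)
```

Passing needs_bbox=False switches the check to the "segmentation" field.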

Predictions

Predictions must be stored in .txt files, one per category, all in the same directory. Each line of a file represents one prediction and must use the following format:

# category_name_a.txt

# For TaskType.OBJECT_DETECTION
image_id confidence min_x min_y width height        # if match_on_filename=False
...
image_file_name confidence min_x min_y width height # if match_on_filename=True
...

# For TaskType.INSTANCE_SEGMENTATION
image_id confidence x1 y1 x2 y2 ... xn yn           # if match_on_filename=False
...
image_file_name confidence x1 y1 x2 y2 ... xn yn    # if match_on_filename=True
...
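
The two line layouts can be produced with small formatting helpers such as the ones sketched here. The function names are hypothetical, and the sketch assumes whitespace-separated fields as in the listing above, so file names containing spaces are not handled.

```python
def detection_line(image_ref, confidence, min_x, min_y, width, height):
    """Format one object detection prediction (image id or file name first)."""
    return f"{image_ref} {confidence} {min_x} {min_y} {width} {height}"


def segmentation_line(image_ref, confidence, polygon):
    """Format one instance segmentation prediction from a list of (x, y) vertices."""
    coordinates = " ".join(f"{x} {y}" for x, y in polygon)
    return f"{image_ref} {confidence} {coordinates}"


# match_on_filename=False: the first field is the image id.
print(detection_line(7, 0.85, 10, 20, 30, 40))

# match_on_filename=True: the first field is the image file name.
print(segmentation_line("img_007.jpg", 0.5, [(1, 2), (3, 4), (5, 6)]))
```

Each returned string corresponds to one line of the .txt file of the predicted category.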

Example

from odin.classes import DatasetLocalization, TaskType

# define the path of the GT .json file
dataset_gt_param = "/path/to/gt/file.json"

# define the path of the folder that contains the predictions .txt files of the model
path_to_detections = "/path/to/predictions"

localization_type = TaskType.OBJECT_DETECTION

my_dataset = DatasetLocalization(dataset_gt_param, localization_type, proposals_paths=path_to_detections)

DatasetCAMs

The DatasetCAMs class can be used to store the ground truth and the Class Activation Maps generated by classification models.

Parameters

dataset_gt_param: str; Path of the ground truth .json file.
task_type: TaskType; Problem task type. It can be:
TaskType.CLASSIFICATION_BINARY, TaskType.CLASSIFICATION_SINGLE_LABEL, TaskType.CLASSIFICATION_MULTI_LABEL.
cams_paths: str or list of tuple, optional; Path of the CAMs of a single model, or a list of tuples, each containing a model name and the corresponding CAMs path.
(default is None)
annotation_type: AnnotationType, optional; Indicates whether the annotation is a bounding box (AnnotationType.BBOX) or a segmentation mask (AnnotationType.SEGMENTATION).
(default is AnnotationType.BBOX)
proposals_paths: str or list of tuple, optional; Path of the proposals of a single model, or a list of tuples, each containing a model name and the corresponding proposals path.
(default is None)
observations_set_name: str, optional; Name of the data set.
(default is 'test')
observations_abs_path: str, optional; Path of the observation directory.
(default is None)
result_saving_path: str, optional; Path used to save results.
(default is './results/')
similar_classes: list of list, optional; List of groups of ids of categories which are similar to each other.
(default is None)
properties_file: str, optional; Name of the file used to store the property names and values and the category names.
(default is 'properties.json')
for_analysis: bool, optional; Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
match_on_filename: bool, optional; Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
save_graphs_as_png: bool, optional; Indicates whether plots should be saved as .png images.
(default is True)

Input format

Ground Truth

Odin accepts as ground truth a .json file in the following format:

"categories" :[       # list of categories in the data set
  {
    "id": ...,        # id of the category
    "name": ...       # name of the category
  },
  ...],

"observations":[      # list of observations
  {
    "id": ...,        # id of the observation
    "file_name": ..., # filename of the observation (it is mandatory only if match_on_filename=True)
    "category": ...,  # id of the category present in the observation (it is mandatory only for TaskType.CLASSIFICATION_BINARY and TaskType.CLASSIFICATION_SINGLE_LABEL classification tasks)
    "categories": ...,# list of ids of the categories present in the observation (it is mandatory only for TaskType.CLASSIFICATION_MULTI_LABEL)
    "...": ...        # any property name with the corresponding property value
  },
  ...],

"annotations": [          # list of annotations
  {
    "id": ...,            # id of the annotation
    "image_id": ...,      # id of the image the annotation refers to
    "category_id": ...,   # id of the category present in the annotation
    "segmentation": ...,  # segmentation mask of the annotation (it is mandatory only for AnnotationType.SEGMENTATION)
    "bbox": ...           # bounding box of the annotation (it is mandatory only for AnnotationType.BBOX)
  }
]

CAMs

The CAMs must be stored in .npy files, one per image, all in the same directory. Each file holds an array of shape h x w x c, where h and w are the height and width of the image and c is the number of categories.
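
A minimal sketch of saving one CAM array with NumPy follows. The image size, the number of categories, and the file name "img_001.npy" are hypothetical; the naming convention Odin expects for these files is not described here, so the name is only a placeholder.

```python
import os
import tempfile

import numpy as np

# Hypothetical image size and number of categories.
height, width, num_categories = 224, 224, 3

# One activation map per category, stacked along the last axis (h x w x c).
cam = np.zeros((height, width, num_categories), dtype=np.float32)
cam[50:100, 60:120, 0] = 1.0  # fake activation region for the first category

# All CAM files of a model go into the same directory.
cams_dir = tempfile.mkdtemp()
cam_path = os.path.join(cams_dir, "img_001.npy")  # placeholder file name
np.save(cam_path, cam)

# Reload to confirm the stored shape.
loaded = np.load(cam_path)
print(loaded.shape)
```

The directory holding these files is what cams_paths would point to.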

Example

from odin.classes import DatasetCAMs, TaskType

# define the path of the GT .json file
dataset_gt_param = "/path/to/gt/file.json"

# define the path of the folder that contains the CAMs .npy files of the model
path_to_cams_detections = "/path/to/cams/predictions"

classification_type = TaskType.CLASSIFICATION_MULTI_LABEL

my_dataset = DatasetCAMs(dataset_gt_param, classification_type, cams_paths=path_to_cams_detections)