Dataset
The Dataset component stores the ground truth and the predictions of the model.
It provides different distribution analyses in order to identify possible bias in the data set.
DatasetClassification
The DatasetClassification class can be used to store the ground truth and predictions for classification models.
Parameters
- dataset_gt_param
str - Path of the ground truth .json file.
- task_type
TaskType - Problem task type. It can be: TaskType.CLASSIFICATION_BINARY, TaskType.CLASSIFICATION_SINGLE_LABEL, TaskType.CLASSIFICATION_MULTI_LABEL.
- proposals_paths
str or list of tuple, optional - Path of the proposals of a single model, or a list of pairs. Each pair contains the model name and the corresponding proposals path.
(default is None)
- observations_set_name
str, optional - Name of the data set.
(default is 'test')
- observations_abs_path
str, optional - Path of the observation directory.
(default is None)
- result_saving_path
str, optional - Path used to save results.
(default is './results/')
- similar_classes
list of list, optional - List of groups of ids of categories which are similar to each other.
(default is None)
- properties_file
str, optional - Name of the file used to store the names and values of the properties and the names of the categories.
(default is 'properties.json')
- for_analysis
bool, optional - Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
- load_properties
bool, optional - Indicates whether the properties should be loaded.
(default is True)
- match_on_filename
bool, optional - Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
- save_graphs_as_png
bool, optional - Indicates whether plots should be saved as .png images.
(default is True)
Input format
Ground Truth
Odin accepts the ground truth as a .json file in the following format:
{
  "categories": [  # list of categories in the data set
    {
      "id": ...,  # id of the category
      "name": ...  # name of the category
    },
    ...
  ],
  "observations": [  # list of observations
    {
      "id": ...,  # id of the observation
      "file_name": ...,  # filename of the observation (mandatory only if match_on_filename=True)
      "category": ...,  # id of the category present in the observation (mandatory only for TaskType.CLASSIFICATION_BINARY and TaskType.CLASSIFICATION_SINGLE_LABEL)
      "categories": ...,  # list of ids of the categories present in the observation (mandatory only for TaskType.CLASSIFICATION_MULTI_LABEL)
      "...": ...  # any property name with the corresponding property value
    },
    ...
  ]
}
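A file in this layout can be produced with Python's standard json module. The following is only a minimal sketch: the helper name build_ground_truth, the category names, the file names, and the "brightness" property are hypothetical and not part of Odin.

```python
import json
import os
import tempfile


def build_ground_truth(categories, observations):
    """Assemble a ground-truth dictionary in the layout described above.

    `categories` maps category id -> category name; `observations` is a list
    of dicts that already hold "id", "file_name", "category" and any property.
    """
    return {
        "categories": [{"id": cid, "name": name} for cid, name in categories.items()],
        "observations": list(observations),
    }


# Hypothetical single-label data set with one custom property ("brightness").
ground_truth = build_ground_truth(
    categories={1: "cat", 2: "dog"},
    observations=[
        {"id": 1, "file_name": "img_001.jpg", "category": 1, "brightness": "high"},
        {"id": 2, "file_name": "img_002.jpg", "category": 2, "brightness": "low"},
    ],
)

# Written to a temporary directory here; use your own path in practice.
gt_path = os.path.join(tempfile.mkdtemp(), "gt.json")
with open(gt_path, "w") as f:
    json.dump(ground_truth, f, indent=2)
```

For a multi-label task, each observation would carry a "categories" list of ids instead of a single "category" id.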
Predictions
The predictions must be stored in .txt files, one file per category, all in the same directory. Each line of a file represents one prediction and must be in the following format:
# category_name_a.txt
observation_id confidence # if match_on_filename=False
...
observation_file_name confidence # if match_on_filename=True
...
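As an illustration, the sketch below writes one prediction file per category from a list of scores. It assumes that fields are separated by a single space and that each file is named after its category, as the "# category_name_a.txt" comment suggests. The helper write_predictions and the sample scores are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def write_predictions(predictions, output_dir):
    """Write one "<category_name>.txt" file per category.

    `predictions` is an iterable of (category_name, observation_key, confidence).
    observation_key is the observation id when match_on_filename=False,
    or the observation file name when match_on_filename=True.
    """
    lines_by_category = defaultdict(list)
    for category_name, observation_key, confidence in predictions:
        lines_by_category[category_name].append(f"{observation_key} {confidence}")

    os.makedirs(output_dir, exist_ok=True)
    for category_name, lines in lines_by_category.items():
        path = os.path.join(output_dir, f"{category_name}.txt")
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")


# Hypothetical scores for two observations, referenced by id.
predictions_dir = os.path.join(tempfile.mkdtemp(), "predictions")
write_predictions(
    [("cat", 1, 0.91), ("dog", 1, 0.09), ("cat", 2, 0.2), ("dog", 2, 0.8)],
    predictions_dir,
)
```

The resulting directory is what would be passed as proposals_paths.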
Example
from odin.classes import DatasetClassification, TaskType
# path of the ground truth .json file
dataset_gt_param = "/path/to/gt/file.json"
# path of the folder that contains the prediction .txt files of the model
path_to_detections = "/path/to/predictions"
classification_type = TaskType.CLASSIFICATION_MULTI_LABEL
my_dataset = DatasetClassification(dataset_gt_param, classification_type, proposals_paths=path_to_detections)
DatasetLocalization
The DatasetLocalization class can be used to store the ground truth and predictions for localization models, such as object detection and instance segmentation.
Parameters
- dataset_gt_param
str - Path of the ground truth .json file.
- task_type
TaskType - Problem task type. It can be: TaskType.OBJECT_DETECTION, TaskType.INSTANCE_SEGMENTATION.
- proposals_paths
str or list of tuple, optional - Path of the proposals of a single model, or a list of pairs. Each pair contains the model name and the corresponding proposals path.
(default is None)
- images_set_name
str, optional - Name of the data set.
(default is 'test')
- images_abs_path
str, optional - Path of the images directory.
(default is None)
- result_saving_path
str, optional - Path used to save results.
(default is './results/')
- similar_classes
list of list, optional - List of groups of ids of categories which are similar to each other.
(default is None)
- properties_file
str, optional - Name of the file used to store the names and values of the properties and the names of the categories.
(default is 'properties.json')
- for_analysis
bool, optional - Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
- load_properties
bool, optional - Indicates whether the properties should be loaded.
(default is True)
- match_on_filename
bool, optional - Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
- save_graphs_as_png
bool, optional - Indicates whether plots should be saved as .png images.
(default is True)
Input format
Ground Truth
Odin accepts the ground truth as a .json file in the following format:
{
  "categories": [  # list of categories in the data set
    {
      "id": ...,  # id of the category
      "name": ...  # name of the category
    },
    ...
  ],
  "images": [  # list of images
    {
      "id": ...,  # id of the image
      "file_name": ...  # filename of the image (mandatory only if match_on_filename=True)
    },
    ...
  ],
  "annotations": [  # list of annotations
    {
      "id": ...,  # id of the annotation
      "image_id": ...,  # id of the image the annotation refers to
      "category_id": ...,  # id of the category present in the annotation
      "segmentation": ...,  # segmentation mask of the annotation (mandatory only for TaskType.INSTANCE_SEGMENTATION)
      "bbox": ...,  # bounding box of the annotation (mandatory only for TaskType.OBJECT_DETECTION)
      "...": ...  # any property name with the corresponding property value
    },
    ...
  ]
}
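Before loading a file, it can help to check that it follows this layout. The sketch below is not part of Odin: the function check_localization_ground_truth and the sample data are hypothetical, and the check only looks at which keys are present and whether ids are consistent, without interpreting the bbox or segmentation values.

```python
def check_localization_ground_truth(gt, needs_segmentation=False):
    """Return a list of problems found in a localization ground-truth dict.

    Use needs_segmentation=True for instance segmentation and
    needs_segmentation=False for object detection.
    """
    problems = []
    for key in ("categories", "images", "annotations"):
        if key not in gt:
            problems.append(f"missing top-level key: {key}")
    if problems:
        return problems

    image_ids = {image["id"] for image in gt["images"]}
    category_ids = {category["id"] for category in gt["categories"]}
    geometry_key = "segmentation" if needs_segmentation else "bbox"

    for annotation in gt["annotations"]:
        ann_id = annotation.get("id")
        if annotation.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id")
        if annotation.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        if geometry_key not in annotation:
            problems.append(f"annotation {ann_id}: missing {geometry_key}")
    return problems


# Hypothetical data: the second annotation points to an image that does not exist.
# The bbox values are placeholders and are not interpreted by the check.
example_gt = {
    "categories": [{"id": 1, "name": "car"}],
    "images": [{"id": 10, "file_name": "street.jpg"}],
    "annotations": [
        {"id": 100, "image_id": 10, "category_id": 1, "bbox": [5, 5, 20, 10]},
        {"id": 101, "image_id": 11, "category_id": 1, "bbox": [0, 0, 4, 4]},
    ],
}
problems = check_localization_ground_truth(example_gt)
```

An empty list means that no structural problem was found by this particular check.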
Predictions
The predictions must be stored in .txt files, one file per category, all in the same directory. Each line of a file represents one prediction and must be in the following format:
# category_name_a.txt
# For TaskType.OBJECT_DETECTION
image_id confidence min_x min_y width height # if match_on_filename=False
...
image_file_name confidence min_x min_y width height # if match_on_filename=True
...
# For TaskType.INSTANCE_SEGMENTATION
image_id confidence x1 y1 x2 y2 ... xn yn # if match_on_filename=False
...
image_file_name confidence x1 y1 x2 y2 ... xn yn # if match_on_filename=True
...
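Many detectors output boxes as corner coordinates, whereas the format above expects the top-left corner plus width and height. The sketch below shows one way to build such lines. It assumes the detector returns (x_min, y_min, x_max, y_max) and that fields are separated by a single space; the helper names and sample values are hypothetical.

```python
def detection_line(image_key, confidence, x_min, y_min, x_max, y_max):
    """Format one detection as "image_key confidence min_x min_y width height".

    image_key is the image id (match_on_filename=False) or the image
    file name (match_on_filename=True).
    """
    width = x_max - x_min
    height = y_max - y_min
    return f"{image_key} {confidence} {x_min} {y_min} {width} {height}"


def segmentation_line(image_key, confidence, polygon):
    """Format one mask as "image_key confidence x1 y1 x2 y2 ... xn yn".

    `polygon` is a list of (x, y) vertices.
    """
    coords = " ".join(f"{x} {y}" for x, y in polygon)
    return f"{image_key} {confidence} {coords}"


# Hypothetical predictions: one box referenced by id, one mask by file name.
bbox_line = detection_line(10, 0.87, 5, 5, 25, 15)
mask_line = segmentation_line("street.jpg", 0.6, [(0, 0), (4, 0), (4, 3)])
```

Each returned string is one line of the .txt file of the predicted category.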
Example
from odin.classes import DatasetLocalization, TaskType
# path of the ground truth .json file
dataset_gt_param = "/path/to/gt/file.json"
# path of the folder that contains the prediction .txt files of the model
path_to_detections = "/path/to/predictions"
localization_type = TaskType.OBJECT_DETECTION
my_dataset = DatasetLocalization(dataset_gt_param, localization_type, proposals_paths=path_to_detections)
DatasetCAMs
The DatasetCAMs class can be used to store the ground truth and the Class Activation Maps generated by classification models.
Parameters
- dataset_gt_param
str - Path of the ground truth .json file.
- task_type
TaskType - Problem task type. It can be: TaskType.CLASSIFICATION_BINARY, TaskType.CLASSIFICATION_SINGLE_LABEL, TaskType.CLASSIFICATION_MULTI_LABEL.
- cams_paths
str or list of tuple, optional - Path of the CAMs of a single model, or a list of pairs. Each pair contains the model name and the corresponding CAMs path.
(default is None)
- annotation_type
AnnotationType, optional - Indicates whether the annotation is a bounding box (AnnotationType.BBOX) or a segmentation mask (AnnotationType.SEGMENTATION).
(default is AnnotationType.BBOX)
- proposals_paths
str or list of tuple, optional - Path of the proposals of a single model, or a list of pairs. Each pair contains the model name and the corresponding proposals path.
(default is None)
- observations_set_name
str, optional - Name of the data set.
(default is 'test')
- observations_abs_path
str, optional - Path of the observation directory.
(default is None)
- result_saving_path
str, optional - Path used to save results.
(default is './results/')
- similar_classes
list of list, optional - List of groups of ids of categories which are similar to each other.
(default is None)
- properties_file
str, optional - Name of the file used to store the names and values of the properties and the names of the categories.
(default is 'properties.json')
- for_analysis
bool, optional - Indicates whether the properties and the predictions have to be loaded. If False, only the ground truth is loaded.
(default is True)
- match_on_filename
bool, optional - Indicates whether the predictions refer to the ground truth by file_name (set to True) or by id (set to False).
(default is False)
- save_graphs_as_png
bool, optional - Indicates whether plots should be saved as .png images.
(default is True)
Input format
Ground Truth
Odin accepts the ground truth as a .json file in the following format:
{
  "categories": [  # list of categories in the data set
    {
      "id": ...,  # id of the category
      "name": ...  # name of the category
    },
    ...
  ],
  "observations": [  # list of observations
    {
      "id": ...,  # id of the observation
      "file_name": ...,  # filename of the observation (mandatory only if match_on_filename=True)
      "category": ...,  # id of the category present in the observation (mandatory only for TaskType.CLASSIFICATION_BINARY and TaskType.CLASSIFICATION_SINGLE_LABEL)
      "categories": ...,  # list of ids of the categories present in the observation (mandatory only for TaskType.CLASSIFICATION_MULTI_LABEL)
      "...": ...  # any property name with the corresponding property value
    },
    ...
  ],
  "annotations": [  # list of annotations
    {
      "id": ...,  # id of the annotation
      "image_id": ...,  # id of the image the annotation refers to
      "category_id": ...,  # id of the category present in the annotation
      "segmentation": ...,  # segmentation mask of the annotation (mandatory only for AnnotationType.SEGMENTATION)
      "bbox": ...  # bounding box of the annotation (mandatory only for AnnotationType.BBOX)
    },
    ...
  ]
}
CAMs
The CAMs must be stored in .npy files, one file per image, all in the same directory. Each file contains an array of shape h x w x c, where h and w are the height and width of the image and c is the number of categories.
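A CAM file with the expected shape can be written with NumPy. The following is a minimal sketch and not Odin's own tooling: the helper save_cams is hypothetical, random values stand in for real activation maps, and the file name is a placeholder because the naming rule that links a .npy file to its image is not stated here.

```python
import os
import tempfile

import numpy as np


def save_cams(cams, output_path, image_height, image_width, num_categories):
    """Save a CAM array after checking that its shape is (h, w, c)."""
    expected = (image_height, image_width, num_categories)
    if cams.shape != expected:
        raise ValueError(f"expected shape {expected}, got {cams.shape}")
    np.save(output_path, cams)


# Hypothetical 4 x 6 image with 3 categories; random values replace real CAMs.
rng = np.random.default_rng(0)
cams = rng.random((4, 6, 3))

# Placeholder file name in a temporary directory; use your own path in practice.
cam_path = os.path.join(tempfile.mkdtemp(), "example_image.npy")
save_cams(cams, cam_path, image_height=4, image_width=6, num_categories=3)
```

Checking the shape before saving catches a common mistake, such as storing the category axis first (c x h x w) instead of last.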
Example
from odin.classes import DatasetCAMs, TaskType
# path of the ground truth .json file
dataset_gt_param = "/path/to/gt/file.json"
# path of the folder that contains the CAMs .npy files of the model
path_to_cams_detections = "/path/to/cams/predictions"
classification_type = TaskType.CLASSIFICATION_MULTI_LABEL
my_dataset = DatasetCAMs(dataset_gt_param, classification_type, cams_paths=path_to_cams_detections)