data_management package

This package contains tools for:

Converting frequently-used metadata formats to COCO Camera Traps format
Converting the output of AI models (especially YOLOv5) to the format used for AI results throughout this repo
Creating, visualizing, and editing COCO Camera Traps .json databases

Submodules

data_management.cct_json_utils module

cct_json_utils.py

Utilities for working with COCO Camera Traps .json databases:

https://github.com/agentmorris/MegaDetector/blob/main/megadetector/data_management/README.md#coco-cameratraps-format

class megadetector.data_management.cct_json_utils.CameraTrapJsonUtils[source]

Bases: object

Miscellaneous utility functions for working with COCO Camera Traps databases

static annotations_to_class_names(annotations, cat_id_to_name)[source]

Given a list of annotations and a mapping from class IDs to names, produces a list of class names, sorted alphabetically.

Parameters:

annotations (list) – a list of annotation dicts
cat_id_to_name (dict) – a dict mapping category IDs to category names

Returns:

a list of class names present in [annotations]

Return type:

list

static annotations_to_string(annotations, cat_id_to_name)[source]

Given a list of annotations and a mapping from class IDs to names, produces a comma-delimited string containing a list of class names, sorted alphabetically.

Parameters:

annotations (list) – a list of annotation dicts
cat_id_to_name (dict) – a dict mapping category IDs to category names

Returns:

a comma-delimited list of class names

Return type:

str
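The two helpers above can be pictured with a small stand-alone sketch. This is not the library's code; the function names are hypothetical, and it assumes each annotation dict carries a 'category_id' field and that duplicate class names are collapsed before sorting.

```python
def class_names_for_annotations(annotations, cat_id_to_name):
    """Return the sorted class names present in a list of annotation dicts."""
    # Assumption: duplicates are collapsed, so each class appears once
    names = {cat_id_to_name[ann['category_id']] for ann in annotations}
    return sorted(names)


def class_string_for_annotations(annotations, cat_id_to_name):
    """Return the same class names as a comma-delimited string."""
    return ','.join(class_names_for_annotations(annotations, cat_id_to_name))


annotations = [
    {'id': 'a0', 'image_id': 'im0', 'category_id': 2},
    {'id': 'a1', 'image_id': 'im0', 'category_id': 1},
    {'id': 'a2', 'image_id': 'im0', 'category_id': 2},
]
cat_id_to_name = {1: 'deer', 2: 'coyote'}

names = class_names_for_annotations(annotations, cat_id_to_name)
name_string = class_string_for_annotations(annotations, cat_id_to_name)
```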

static get_entries_for_locations(db, locations)[source]

Given a dict representing a JSON database in the COCO Camera Trap format, returns a dict with the ‘images’ and ‘annotations’ fields in the CCT format; each is an array containing only the entries from the original [db] whose location is in the [locations] set.

Parameters:

db (dict) – a dict representing a JSON database in the COCO Camera Trap format
locations (set) – a set or list of locations to include; each item is a string

Returns:

a dict with the ‘images’ and ‘annotations’ fields in the CCT format

Return type:

dict
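As a rough illustration of the filtering described above, the following sketch keeps images whose 'location' is in the requested set, then keeps annotations that refer to those images. The function name is hypothetical, and matching annotations to images via 'image_id' is an assumption based on the CCT format rather than a copy of the library's logic.

```python
def entries_for_locations(db, locations):
    """Filter a CCT-style dict down to a set of locations (illustrative only)."""
    locations = set(locations)
    # Keep images at the requested locations
    images = [im for im in db['images'] if im.get('location') in locations]
    # Keep annotations that refer to one of the retained images
    image_ids = {im['id'] for im in images}
    annotations = [ann for ann in db['annotations'] if ann['image_id'] in image_ids]
    return {'images': images, 'annotations': annotations}


db = {
    'images': [
        {'id': 'im0', 'file_name': 'a.jpg', 'location': 'loc_a'},
        {'id': 'im1', 'file_name': 'b.jpg', 'location': 'loc_b'},
    ],
    'annotations': [
        {'id': 'a0', 'image_id': 'im0', 'category_id': 1},
        {'id': 'a1', 'image_id': 'im1', 'category_id': 1},
    ],
}
subset = entries_for_locations(db, ['loc_a'])
```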

static group_annotations_by_image_field(db_indexed, image_field='seq_id')[source]

Given an instance of IndexedJsonDb, group annotation entries by a field in the image entry. Typically used to find all the annotations associated with a sequence.

Parameters:

db_indexed (IndexedJsonDb) – an initialized IndexedJsonDb, typically loaded from a COCO Camera Traps .json file
image_field (str, optional) – a field by which to group annotations (defaults to ‘seq_id’)

Returns:

a dict mapping objects (typically strings, in fact typically sequence IDs) to lists of annotations

Return type:

dict
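The grouping can be sketched without IndexedJsonDb by working on plain lists. The function name and its list-based signature are hypothetical; the real function takes an IndexedJsonDb instance.

```python
from collections import defaultdict


def group_annotations(images, annotations, image_field='seq_id'):
    """Group annotation dicts by a field of the image each one refers to."""
    image_id_to_image = {im['id']: im for im in images}
    grouped = defaultdict(list)
    for ann in annotations:
        image = image_id_to_image[ann['image_id']]
        grouped[image[image_field]].append(ann)
    return dict(grouped)


images = [
    {'id': 'im0', 'seq_id': 's0'},
    {'id': 'im1', 'seq_id': 's0'},
    {'id': 'im2', 'seq_id': 's1'},
]
annotations = [
    {'id': 'a0', 'image_id': 'im0'},
    {'id': 'a1', 'image_id': 'im1'},
    {'id': 'a2', 'image_id': 'im2'},
]
by_sequence = group_annotations(images, annotations)
```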

static order_db_keys(db)[source]

Given a dict representing a JSON database in the COCO Camera Trap format, returns an OrderedDict with keys in the order of ‘info’, ‘categories’, ‘annotations’ and ‘images’. When this OrderedDict is serialized with json.dump(), the order of the keys is preserved.

Parameters: db (dict) – a JSON database in the COCO Camera Trap format
Returns: the same content as [db] but as an OrderedDict with keys ordered for readability
Return type: dict
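A minimal sketch of the key ordering, using only the standard library. The function name is hypothetical, and appending any other keys after the four preferred ones is an assumption, not documented behavior.

```python
import json
from collections import OrderedDict


def ordered_db(db):
    """Return the content of db with keys in a readable, fixed order."""
    preferred = ['info', 'categories', 'annotations', 'images']
    ordered = OrderedDict()
    for key in preferred:
        if key in db:
            ordered[key] = db[key]
    # Assumption: keys outside the preferred list are kept, after the others
    for key in db:
        if key not in ordered:
            ordered[key] = db[key]
    return ordered


db = {'images': [], 'annotations': [], 'categories': [], 'info': {'version': '1.0'}}
serialized = json.dumps(ordered_db(db))
```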

class megadetector.data_management.cct_json_utils.IndexedJsonDb(json_filename, b_normalize_paths=False, filename_replacements=None, b_convert_classes_to_lower=True, b_force_forward_slashes=True)[source]

Bases: object

Wrapper for a COCO Camera Traps database.

Handles boilerplate dictionary creation that we do almost every time we load a .json database.

get_annotations_for_image(image)[source]

Finds all the annotations associated with the image dict [image].

Parameters: image (dict) – an image dict loaded from a CCT .json file. Only the ‘id’ field is used.
Returns: list of annotations associated with this image. Returns None if the db has not been loaded, or [] if no annotations are available for this image.
Return type: list

get_classes_for_image(image)[source]

Returns a list of class names associated with [image].

Parameters: image (dict) – an image dict loaded from a CCT .json file. Only the ‘id’ field is used.
Returns: list of class names associated with this image. Returns None if the db has not been loaded, or [] if no annotations are available for this image.
Return type: list

class megadetector.data_management.cct_json_utils.SequenceOptions[source]

Bases: object

Options parameterizing the grouping of images into sequences by time.

datetime_conversion_failure_behavior: How to handle invalid datetimes: ‘error’ or ‘none’

episode_interval_seconds: Images separated by <= this duration will be grouped into the same sequence.

verbose: Enable additional debug output

megadetector.data_management.cct_json_utils.create_sequences(image_info, options=None)[source]

Synthesizes episodes/sequences/bursts for the images in [image_info].

Modifies [image_info] in place, populating the ‘seq_id’, ‘seq_num_frames’, and ‘frame_num’ fields for each image.

Parameters:

image_info (str, dict, or list) – a dict in CCT format, a CCT .json file, or just the ‘images’ component of a CCT dataset (a list of dicts with fields ‘file_name’ (str), ‘datetime’ (datetime), and ‘location’ (str)).
options (SequenceOptions, optional) – options parameterizing the assembly of images into sequences; see the SequenceOptions class for details.

Returns:

if [image_info] is passed as a list, returns the list, otherwise returns a CCT-formatted dict.

Return type:

list or dict
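The idea of time-based sequence assembly can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes images are grouped per location, that the gap is measured between consecutive images, that frame numbers are zero-based, and that every image has a valid datetime. The function name, the default interval, and the seq_id string format are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def assign_sequences(images, episode_interval_seconds=60):
    """Populate seq_id, seq_num_frames, and frame_num in place (sketch)."""
    by_location = defaultdict(list)
    for im in images:
        by_location[im['location']].append(im)

    for location, location_images in by_location.items():
        location_images.sort(key=lambda im: im['datetime'])
        sequences = []
        for im in location_images:
            start_new = True
            if sequences:
                previous = sequences[-1][-1]
                gap = (im['datetime'] - previous['datetime']).total_seconds()
                # Images separated by <= the interval share a sequence
                start_new = gap > episode_interval_seconds
            if start_new:
                sequences.append([im])
            else:
                sequences[-1].append(im)
        for i_seq, seq in enumerate(sequences):
            for i_frame, im in enumerate(seq):
                # Hypothetical ID format, chosen only for this example
                im['seq_id'] = 'location_{}_sequence_{:05d}'.format(location, i_seq)
                im['seq_num_frames'] = len(seq)
                im['frame_num'] = i_frame
    return images


images = [
    {'file_name': 'a1.jpg', 'location': 'A', 'datetime': datetime(2021, 1, 1, 10, 0, 0)},
    {'file_name': 'a2.jpg', 'location': 'A', 'datetime': datetime(2021, 1, 1, 10, 0, 5)},
    {'file_name': 'a3.jpg', 'location': 'A', 'datetime': datetime(2021, 1, 1, 10, 5, 0)},
    {'file_name': 'b1.jpg', 'location': 'B', 'datetime': datetime(2021, 1, 1, 10, 0, 2)},
]
assign_sequences(images, episode_interval_seconds=10)
```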

megadetector.data_management.cct_json_utils.parse_datetimes_from_cct_dict(d, conversion_failure_behavior='error')[source]

Given a COCO camera traps dictionary that may just have been loaded from file, converts all string-formatted datetime fields to Python datetimes, making reasonable assumptions about datetime representations. Modifies [d] in place if [d] is supplied as a dict.

Parameters:

d (dict or str) – a dict in CCT format or a filename pointing to a CCT .json file
conversion_failure_behavior (str, optional) – determines what happens on a failed conversion; can be “error” (raise an error), “str” (leave as a string), or “none” (convert to None)

Returns:

the CCT dict with converted datetimes.

Return type:

dict
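The three values of conversion_failure_behavior can be illustrated with a single-field parser. The function name and the list of candidate formats are assumptions made for this sketch; the library's own parsing rules are not reproduced here.

```python
from datetime import datetime

# Assumption: a few common string layouts, tried in order
CANDIDATE_FORMATS = ('%Y-%m-%d %H:%M:%S', '%Y-%m-%dT%H:%M:%S', '%Y:%m:%d %H:%M:%S')


def parse_datetime_field(value, conversion_failure_behavior='error'):
    """Convert one datetime string, handling failures as requested."""
    if isinstance(value, datetime):
        return value
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except (ValueError, TypeError):
            continue
    if conversion_failure_behavior == 'error':
        raise ValueError('Could not parse datetime: {!r}'.format(value))
    if conversion_failure_behavior == 'str':
        return value  # leave the original string untouched
    if conversion_failure_behavior == 'none':
        return None
    raise ValueError('Unrecognized failure behavior: {}'.format(conversion_failure_behavior))


parsed = parse_datetime_field('2021-03-04 05:06:07')
left_as_string = parse_datetime_field('not a date', 'str')
converted_to_none = parse_datetime_field('not a date', 'none')
```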

megadetector.data_management.cct_json_utils.parse_datetimes_from_cct_image_list(images, conversion_failure_behavior='error', verbose=False)[source]

Given the “images” field from a COCO camera traps dictionary, converts all string-formatted datetime fields to Python datetimes, making reasonable assumptions about datetime representations. Modifies [images] in place.

Parameters:

images (list) – a list of dicts in CCT images format
conversion_failure_behavior (str, optional) – determines what happens on a failed conversion; can be “error” (raise an error), “str” (leave as a string), or “none” (convert to None)
verbose (bool, optional) – enable additional debug output

Returns:

the input list, with datetimes converted (after modifying in place)

Return type:

list

megadetector.data_management.cct_json_utils.write_object_with_serialized_datetimes(d, json_fn)[source]

Writes the object [d] to the .json file [json_fn] with a standard approach to serializing Python datetime objects.

Parameters:

d (obj) – the object to write, typically a dict
json_fn (str) – the output filename
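Python's json module cannot serialize datetime objects on its own, which is why a helper like this is useful. The sketch below shows one common approach, a default handler that emits ISO 8601 strings; whether the library uses exactly this representation is an assumption, and the function names are hypothetical.

```python
import json
from datetime import date, datetime


def serialize_datetime(obj):
    """json 'default' hook: turn datetimes into ISO 8601 strings."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError('Object of type {} is not serializable'.format(type(obj).__name__))


def write_with_datetimes(d, json_fn):
    """Write d to json_fn, serializing any datetime values along the way."""
    with open(json_fn, 'w') as f:
        json.dump(d, f, indent=1, default=serialize_datetime)


as_text = json.dumps({'t': datetime(2020, 1, 2, 3, 4, 5)}, default=serialize_datetime)
```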

data_management.cct_to_md module

cct_to_md.py

“Converts” a COCO Camera Traps file to a MD results file. Currently ignores non-bounding-box annotations, and gives all annotations a confidence of 1.0.

The only reason to do this is if you are going to add information to an existing CCT-formatted dataset, and you want to do that in Timelapse.

Currently assumes that width and height are present in the input data; does not read them from images.

megadetector.data_management.cct_to_md.cct_to_md(input_filename, output_filename=None)[source]

“Converts” a COCO Camera Traps file to a MD results file. Currently ignores non-bounding-box annotations. If the semi-standard “score” field is present in an annotation, or the totally non-standard “conf” field is present, it will be transferred to the output, otherwise a confidence value of 1.0 is assumed for all annotations.

The main reason to run this script is the scenario where you are going to add information to an existing CCT-formatted dataset, and you want to do that in Timelapse.

Currently assumes that width and height are present in the input data; does not read them from images.

Parameters:

input_filename (str) – the COCO Camera Traps .json file to read
output_filename (str, optional) – the .json file to write in MD results format

Returns:

MD-formatted results, identical to the content of [output_filename] if [output_filename] is not None

Return type:

dict
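The per-annotation part of the conversion can be sketched as below. It assumes COCO boxes are absolute [x, y, width, height] in pixels and that MD results store boxes normalized to the image size, with 'category', 'conf', and 'bbox' fields. The function name and the category mapping argument are hypothetical.

```python
def cct_annotation_to_md_detection(ann, image, category_id_to_md_category):
    """Convert one CCT annotation to an MD-style detection dict (sketch)."""
    if 'bbox' not in ann:
        return None  # non-bounding-box annotations are ignored
    x, y, w, h = ann['bbox']
    # Use 'score' if present, then 'conf', otherwise assume 1.0
    conf = ann.get('score', ann.get('conf', 1.0))
    return {
        'category': category_id_to_md_category[ann['category_id']],
        'conf': conf,
        # Normalize pixel coordinates by the image size
        'bbox': [x / image['width'], y / image['height'],
                 w / image['width'], h / image['height']],
    }


image = {'id': 'im0', 'file_name': 'a.jpg', 'width': 200, 'height': 100}
mapping = {7: '1'}  # hypothetical mapping from CCT category ID to MD category ID
detection = cct_annotation_to_md_detection(
    {'image_id': 'im0', 'category_id': 7, 'bbox': [20, 10, 100, 50]}, image, mapping)
scored = cct_annotation_to_md_detection(
    {'image_id': 'im0', 'category_id': 7, 'bbox': [0, 0, 10, 10], 'score': 0.8}, image, mapping)
no_box = cct_annotation_to_md_detection({'image_id': 'im0', 'category_id': 7}, image, mapping)
```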

data_management.cct_to_wi module

cct_to_wi.py

Converts COCO Camera Traps .json files to the Wildlife Insights batch upload format.

This is very much just a demo script; all the relevant constants are hard-coded at the top of main().

But given that caveat, it works. You need to set up all the paths in the “paths” cell at the top of main().

Also see:

megadetector.data_management.cct_to_wi.main()[source]: Converts COCO Camera Traps .json files to the Wildlife Insights batch upload format; to use this, you need to modify all the paths in the “Paths” cell.

data_management.coco_to_labelme module

coco_to_labelme.py

Converts a COCO dataset to labelme format (one .json per image file).

If you want to convert YOLO-formatted data to labelme format, use yolo_to_coco, then coco_to_labelme.

megadetector.data_management.coco_to_labelme.coco_to_labelme(coco_data, image_base, overwrite=False, bypass_image_size_check=False, verbose=False)[source]

For all the images in [coco_data] (a dict or a filename), write a .json file in labelme format alongside the corresponding relative path within image_base.

Parameters:

coco_data (dict or str) – path to a COCO-formatted .json file, or an already-loaded COCO-formatted dict
image_base (str) – path where images live (filenames in [coco_data] should be relative to [image_base]); this is also where labelme files will be written
overwrite (bool, optional) – overwrite existing .json files
bypass_image_size_check (bool, optional) – if you’re sure that the COCO data already has correct ‘width’ and ‘height’ fields, this bypasses the somewhat-slow loading of each image to fetch image sizes
verbose (bool, optional) – enable additional debug output

megadetector.data_management.coco_to_labelme.get_labelme_dict_for_image_from_coco_record(im, annotations, categories, info=None)[source]

For the given image struct in COCO format and associated list of annotations, reformats the detections into labelme format.

Parameters:

im (dict) – image dict, as loaded from a COCO .json file; ‘height’ and ‘width’ are required
annotations (list) – a list of annotations that refer to this image (this function errors if that’s not the case)
categories (list) – a list of category dicts in COCO format ({‘id’:x,‘name’:‘s’})
info (dict, optional) – a dict to store in a non-standard “custom_info” field in the output

Returns:

a dict in labelme format, suitable for writing to a labelme .json file

Return type:

dict
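A rough sketch of the reformatting follows. It assumes labelme rectangles are stored as two corner points and that the output carries 'imagePath', 'imageWidth', 'imageHeight', and 'shapes' fields. The function name is hypothetical, and the library's exact corner convention may differ from the simple x + w, y + h used here.

```python
def labelme_dict_from_coco(im, annotations, categories):
    """Build a labelme-style dict for one COCO image record (sketch)."""
    category_id_to_name = {c['id']: c['name'] for c in categories}
    shapes = []
    for ann in annotations:
        # Every annotation is expected to refer to this image
        assert ann['image_id'] == im['id']
        if 'bbox' not in ann:
            continue
        x, y, w, h = ann['bbox']
        shapes.append({
            'label': category_id_to_name[ann['category_id']],
            'shape_type': 'rectangle',
            'points': [[x, y], [x + w, y + h]],  # assumed corner convention
        })
    return {
        'imagePath': im['file_name'],
        'imageWidth': im['width'],
        'imageHeight': im['height'],
        'shapes': shapes,
    }


im = {'id': 'im0', 'file_name': 'a.jpg', 'width': 200, 'height': 100}
annotations = [{'image_id': 'im0', 'category_id': 1, 'bbox': [10, 20, 30, 40]}]
categories = [{'id': 1, 'name': 'deer'}]
labelme_dict = labelme_dict_from_coco(im, annotations, categories)
```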

coco_to_labelme - CLI interface

Convert a COCO database to labelme annotation format

coco_to_labelme [-h] [--overwrite] coco_file image_base

coco_to_labelme positional arguments

coco_file - Path to COCO data file (.json)
image_base - Path to images (also the output folder)

coco_to_labelme options

-h, --help - show this help message and exit
--overwrite - Overwrite existing labelme .json files

data_management.coco_to_yolo module

coco_to_yolo.py

Converts a COCO-formatted dataset to a YOLO-formatted dataset, flattening the dataset (to a single folder) in the process.

If the input and output folders are the same, writes .txt files to the input folder, and neither moves nor modifies images.

Currently ignores segmentation masks, and errors if an annotation has a segmentation polygon but no bbox.

Has only been tested on a handful of COCO Camera Traps data sets; if you use it for more general COCO conversion, YMMV.

megadetector.data_management.coco_to_yolo.coco_to_yolo(input_image_folder, output_folder, input_file, source_format='coco', overwrite_images=False, create_image_and_label_folders=False, class_file_name='classes.txt', allow_empty_annotations=False, clip_boxes=False, image_id_to_output_image_json_file=None, images_to_exclude=None, path_replacement_char='#', category_names_to_exclude=None, category_names_to_include=None, write_output=True, flatten_paths=False, empty_image_handling='write_empty')[source]

Converts a COCO-formatted dataset to a YOLO-formatted dataset, optionally flattening the dataset to a single folder in the process.

If the input and output folders are the same, writes .txt files to the input folder, and neither moves nor modifies images.

Currently ignores segmentation masks, and errors if an annotation has a segmentation polygon but no bbox.

Parameters:

input_image_folder (str) – the folder where images live; filenames in the COCO .json file [input_file] should be relative to this folder
output_folder (str) – the base folder for the YOLO dataset; can be the same as [input_image_folder], in which case images are left alone
input_file (str) – a .json file in COCO format
source_format (str, optional) – can be ‘coco’ (default) or ‘coco_camera_traps’. The only difference is that when source_format is ‘coco_camera_traps’, we treat an image with a non-bbox annotation as a special case, i.e. that’s how an empty image is indicated. The original COCO standard is a little ambiguous on this issue. If source_format is ‘coco’, we either treat images as empty or error, depending on the value of [allow_empty_annotations]. [allow_empty_annotations] has no effect if source_format is ‘coco_camera_traps’.
overwrite_images (bool, optional) – over-write images in the output folder if they exist
create_image_and_label_folders (bool, optional) – whether to create separate folders called ‘images’ and ‘labels’ in the YOLO output folder. If create_image_and_label_folders is False, a/b/c/image001.jpg will become a#b#c#image001.jpg, and the corresponding text file will be a#b#c#image001.txt. If create_image_and_label_folders is True, a/b/c/image001.jpg will become images/a#b#c#image001.jpg, and the corresponding text file will be labels/a#b#c#image001.txt.
class_file_name (str, optional) – .txt file (relative to the output folder) that we should populate with a list of classes (or None to omit)
allow_empty_annotations (bool, optional) – if this is False and [source_format] is ‘coco’, we’ll error on annotations that have no ‘bbox’ field
clip_boxes (bool, optional) – whether to clip bounding box coordinates to the range [0,1] before converting to YOLO xywh format
image_id_to_output_image_json_file (str, optional) – an optional output file, to which we will write a mapping from image IDs to output file names
images_to_exclude (list, optional) – a list of image files (relative paths in the input folder) that we should ignore
path_replacement_char (str, optional) – only relevant if [flatten_paths] is True; this is used to replace path separators, e.g. if [path_replacement_char] is ‘#’ and [flatten_paths] is True, a/b/c/d.jpg becomes a#b#c#d.jpg
category_names_to_exclude (str, optional) – category names that should not be represented in the YOLO output; only impacts annotations, does not prevent copying images. There’s almost no reason you would want to specify this and [category_names_to_include].
category_names_to_include (str, optional) – allow-list of category names that should be represented in the YOLO output; only impacts annotations, does not prevent copying images. There’s almost no reason you would want to specify this and [category_names_to_exclude].
write_output (bool, optional) – determines whether we actually copy images and write annotations; setting this to False mostly puts this function in “dry run” mode. The class list file is written regardless of the value of write_output.
flatten_paths (bool, optional) – replace /’s in image filenames with [path_replacement_char], which ensures that the output folder is a single flat folder.
empty_image_handling (str, optional) – whether to omit .txt files for images with no annotations (‘omit’) or write empty .txt files (‘write_empty’). Both are generally considered valid YOLO.

Returns:

information about the coco –> yolo mapping, containing at least the fields:

class_list_filename: the filename to which we wrote the flat list of class names required by the YOLO format.
source_image_to_dest_image: a dict mapping source images to destination images
coco_id_to_yolo_id: a dict mapping COCO category IDs to YOLO category IDs

Return type:

dict
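Two pieces of the conversion lend themselves to a short sketch: turning a COCO box into a YOLO label line, and flattening a relative path. The sketch assumes COCO boxes are absolute [x, y, width, height] and YOLO lines are "class x_center y_center width height" normalized to [0,1]. The function names and the six-decimal output precision are choices made for this example only.

```python
def coco_box_to_yolo_line(bbox, image_width, image_height, yolo_class_id, clip_boxes=False):
    """Format one COCO box as a YOLO label line (sketch)."""
    x, y, w, h = bbox
    # Normalized corner coordinates
    x0, y0 = x / image_width, y / image_height
    x1, y1 = (x + w) / image_width, (y + h) / image_height
    if clip_boxes:
        x0, y0, x1, y1 = (min(1.0, max(0.0, v)) for v in (x0, y0, x1, y1))
    x_center, y_center = (x0 + x1) / 2, (y0 + y1) / 2
    return '{} {:.6f} {:.6f} {:.6f} {:.6f}'.format(
        yolo_class_id, x_center, y_center, x1 - x0, y1 - y0)


def flatten_relative_path(path, path_replacement_char='#'):
    """Replace path separators so every file lands in one flat folder."""
    return path.replace('\\', '/').replace('/', path_replacement_char)


line = coco_box_to_yolo_line([50, 25, 100, 50], 200, 100, 0)
clipped = coco_box_to_yolo_line([-20, 0, 120, 100], 200, 100, 3, clip_boxes=True)
flat_name = flatten_relative_path('a/b/c/image001.jpg')
```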

megadetector.data_management.coco_to_yolo.write_yolo_dataset_file(yolo_dataset_file, dataset_base_dir, class_list, train_folder_relative=None, val_folder_relative=None, test_folder_relative=None)[source]

Write a YOLOv5 dataset.yaml file to the absolute path [yolo_dataset_file] (should have a .yaml extension, though it’s only a warning if it doesn’t).

Parameters:

yolo_dataset_file (str) – the file, typically ending in .yaml or .yml, to write. Does not have to be within dataset_base_dir.
dataset_base_dir (str) – the absolute base path of the YOLO dataset
class_list (list or str) – an ordered list of class names (the first item will be class 0, etc.), or the name of a text file containing an ordered list of class names (one per line, starting from class zero).
train_folder_relative (str, optional) – train folder name, used only to populate dataset.yaml. Can also be a filename (e.g. a .txt file with image files).
val_folder_relative (str, optional) – val folder name, used only to populate dataset.yaml. Can also be a filename (e.g. a .txt file with image files).
test_folder_relative (str, optional) – test folder name, used only to populate dataset.yaml. Can also be a filename (e.g. a .txt file with image files).
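For orientation, a dataset file of this kind might be assembled as in the sketch below. The layout (path, train/val/test, nc, names) follows the commonly seen YOLOv5 dataset.yaml structure; the exact fields and ordering written by this function are an assumption, and the helper name is hypothetical.

```python
def build_yolo_dataset_yaml(dataset_base_dir, class_list, train=None, val=None, test=None):
    """Return the text of a YOLOv5-style dataset file (sketch)."""
    lines = ['path: {}'.format(dataset_base_dir)]
    # Only write the splits that were supplied
    for key, value in (('train', train), ('val', val), ('test', test)):
        if value is not None:
            lines.append('{}: {}'.format(key, value))
    lines.append('nc: {}'.format(len(class_list)))
    lines.append('names:')
    for class_id, name in enumerate(class_list):
        lines.append('  {}: {}'.format(class_id, name))
    return '\n'.join(lines) + '\n'


yaml_text = build_yolo_dataset_yaml('/data/yolo', ['deer', 'coyote'], train='train', val='val')
```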

coco_to_yolo - CLI interface

Convert COCO-formatted data to YOLO format, flattening the image structure

coco_to_yolo [-h] [--create_bounding_box_editor_symlinks]
             input_folder output_folder input_file

coco_to_yolo positional arguments

input_folder - Path to input images
output_folder - Path to flat, YOLO-formatted dataset
input_file - Path to COCO dataset file (.json)

coco_to_yolo options

-h, --help - show this help message and exit
--create_bounding_box_editor_symlinks - Prepare symlinks so the whole folder appears to contain "images" and "labels" folders

data_management.generate_crops_from_cct module

generate_crops_from_cct.py

Given a .json file in COCO Camera Traps format, creates a cropped image for each bounding box.

megadetector.data_management.generate_crops_from_cct.generate_crops_from_cct(cct_file, image_dir, output_dir, padding=0, flat_output=True)[source]

Given a .json file in COCO Camera Traps format, creates a cropped image for each bounding box.

Parameters:

cct_file (str) – the COCO .json file from which we should load data
image_dir (str) – the folder where the images live; filenames in the .json file should be relative to this folder
output_dir (str) – the folder where we should write cropped images
padding (float, optional) – number of pixels we should expand each box before cropping
flat_output (bool, optional) – if False, folder structure will be preserved in the output, e.g. the image a/b/c/d.jpg will result in image files in the output folder called, e.g., a/b/c/d_crop_000_id_12345.jpg. If [flat_output] is True, the corresponding output image will be a_b_c_d_crop_000_id_12345.jpg.
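The padding and output naming described above can be sketched with plain arithmetic and string handling. Clamping the padded box to the image bounds is an assumption, as are the function names; the output naming pattern follows the examples given for [flat_output].

```python
import os


def padded_crop_box(bbox, image_width, image_height, padding=0):
    """Expand a COCO [x, y, w, h] box by padding pixels; return (left, top, right, bottom)."""
    x, y, w, h = bbox
    # Assumption: the padded box is clamped to the image bounds
    left = max(0, x - padding)
    top = max(0, y - padding)
    right = min(image_width, x + w + padding)
    bottom = min(image_height, y + h + padding)
    return (left, top, right, bottom)


def crop_output_name(relative_path, crop_index, annotation_id, flat_output=True):
    """Build the output filename for one crop."""
    stem, ext = os.path.splitext(relative_path.replace('\\', '/'))
    name = '{}_crop_{:03d}_id_{}{}'.format(stem, crop_index, annotation_id, ext)
    if flat_output:
        name = name.replace('/', '_')
    return name


box = padded_crop_box([10, 20, 30, 40], 100, 100, padding=15)
nested_name = crop_output_name('a/b/c/d.jpg', 0, 12345, flat_output=False)
flat_name = crop_output_name('a/b/c/d.jpg', 0, 12345, flat_output=True)
```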

megadetector.data_management.generate_crops_from_cct.main()[source]: Command-line interface to generate crops from a COCO Camera Traps .json file.

data_management.get_image_sizes module

get_image_sizes.py

Given a json-formatted list of image filenames, retrieves the width and height of every image, optionally writing the results to a new .json file.

megadetector.data_management.get_image_sizes.get_image_sizes(filenames, image_prefix=None, output_file=None, n_workers=1, use_threads=True, recursive=True)[source]

Gets the width and height of all images in [filenames], which can be:

A .json-formatted file containing a list of strings
A folder
A list of files

…returning a list of (path,w,h) tuples, and optionally writing the results to [output_file].

Parameters:

filenames (str or list) – the image filenames for which we should retrieve sizes, can be the name of a .json-formatted file containing a list of strings, a folder in which we should enumerate images, or a list of files.
image_prefix (str, optional) – optional prefix to add to images to get to full paths; useful when [filenames] contains relative files, in which case [image_prefix] is the base folder for the source images.
output_file (str, optional) – a .json file to write the image sizes
n_workers (int, optional) – number of parallel workers to use, set to <=1 to disable parallelization
use_threads (bool, optional) – whether to use threads (True) or processes (False) for parallelization; not relevant if [n_workers] <= 1
recursive (bool, optional) – only relevant if [filenames] is actually a folder, determines whether image enumeration within that folder will be recursive

Returns:

list of (path,w,h) tuples

Return type:

list

get_image_sizes - CLI interface

get_image_sizes [-h] [--image_prefix IMAGE_PREFIX] [--n_threads N_THREADS]
                filenames output_file

get_image_sizes positional arguments

filenames - Folder from which we should fetch image sizes, or .json file with a list of filenames
output_file - Output file (.json) to which we should write image size information

get_image_sizes options

-h, --help - show this help message and exit
--image_prefix IMAGE_PREFIX - Prefix to append to image filenames, only relevant if [filenames] points to a list of relative paths
--n_threads N_THREADS - Number of concurrent workers, set to <=1 to disable parallelization (default 1)

data_management.labelme_to_coco module

labelme_to_coco.py

Converts a folder of labelme-formatted .json files to COCO.

megadetector.data_management.labelme_to_coco.find_empty_labelme_files(input_folder, recursive=True)[source]

Returns a dict that lists the image files in [input_folder] associated with .json files that have no boxes in them, the images with no associated .json files, and the images whose .json files contain at least one box.

Parameters:

input_folder (str) – the folder to search for empty (i.e., box-less) Labelme .json files
recursive (bool, optional) – whether to recurse into [input_folder]

Returns:

a dict with fields:

images_with_empty_json_files: a list of all image files in [input_folder] associated with .json files that have no boxes in them
images_with_no_json_files: a list of images in [input_folder] with no associated .json files
images_with_non_empty_json_files: a list of images in [input_folder] associated with .json files that have at least one box

Return type:

dict

megadetector.data_management.labelme_to_coco.labelme_to_coco(input_folder, output_file=None, category_id_to_category_name=None, empty_category_name='empty', empty_category_id=None, info_struct=None, relative_paths_to_include=None, relative_paths_to_exclude=None, use_folders_as_labels=False, recursive=True, no_json_handling='skip', validate_image_sizes=True, max_workers=1, use_threads=True)[source]

Finds all images in [input_folder] that have corresponding .json files, and converts to a COCO .json file.

Currently supports bounding box and polygon annotations, as well as image-level flags (i.e., does not support point annotations). Polygon annotations produce COCO annotations with a “segmentation” field and an axis-aligned bounding box.

Labelme’s image-level flags don’t quite fit the COCO annotations format, so they are attached to image objects, rather than annotation objects.

If output_file is None, just returns the resulting dict, does not write to file.

If use_folders_as_labels is False (default), the output labels come from the labelme .json files. If use_folders_as_labels is True, the lowest-level folder name containing each .json file will determine the output label. E.g., if use_folders_as_labels is True, and the folder contains:

images/train/lion/image0001.json

…all boxes in image0001.json will be given the label “lion”, regardless of the labels in the file. Empty images in the “lion” folder will still be given the label “empty” (or [empty_category_name]).

Parameters:

input_folder (str) – input folder to search for images and Labelme .json files
output_file (str, optional) – output file to which we should write COCO-formatted data; if None this function just returns the COCO-formatted dict
category_id_to_category_name (dict, optional) – dict mapping category IDs to category names; really used to map Labelme category names to COCO category IDs. IDs will be auto-generated if this is None.
empty_category_name (str, optional) – if images are present without boxes, the category name we should use for whole-image (and not-very-COCO-like) empty categories.
empty_category_id (int, optional) – category ID to use for the not-very-COCO-like “empty” category; also see the no_json_handling parameter.
info_struct (dict, optional) – dict to stash in the “info” field of the resulting COCO dict
relative_paths_to_include (list, optional) – allowlist of relative paths to include in the COCO dict; there’s no reason to specify this along with relative_paths_to_exclude.
relative_paths_to_exclude (list, optional) – blocklist of relative paths to exclude from the COCO dict; there’s no reason to specify this along with relative_paths_to_include.
use_folders_as_labels (bool, optional) – if this is True, class names will be pulled from folder names, useful if you have images like a/b/cat/image001.jpg, a/b/dog/image002.jpg, etc.
recursive (bool, optional) – whether to recurse into [input_folder]
no_json_handling (str, optional) – how to deal with image files that have no corresponding .json files; can be:
- ‘skip’: ignore image files with no corresponding .json files
- ‘empty’: treat image files with no corresponding .json files as empty
- ‘error’: throw an error when an image file has no corresponding .json file
validate_image_sizes (bool, optional) – whether to load images to verify that the sizes specified in the labelme files are correct
max_workers (int, optional) – number of workers to use for parallelization, set to <=1 to disable parallelization
use_threads (bool, optional) – whether to use threads (True) or processes (False) for parallelization, not relevant if max_workers <= 1

Returns:

a COCO-formatted dictionary, identical to what’s written to [output_file] if [output_file] is not None.

Return type:

dict
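Two details of the conversion are easy to show in isolation: turning a labelme rectangle into a COCO box, and choosing a label when use_folders_as_labels is True. The sketch assumes labelme rectangles are stored as two corner points in either order; the function names are hypothetical.

```python
from pathlib import PurePosixPath


def labelme_rectangle_to_coco_bbox(points):
    """Convert two corner points to a COCO [x, y, width, height] box."""
    (x0, y0), (x1, y1) = points
    # Corners may be stored in either order
    x_min, x_max = sorted((x0, x1))
    y_min, y_max = sorted((y0, y1))
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def label_for_shape(shape_label, json_relative_path, use_folders_as_labels=False):
    """Pick the output label for one labelme shape."""
    if use_folders_as_labels:
        # The lowest-level folder containing the .json file wins
        return PurePosixPath(json_relative_path).parent.name
    return shape_label


bbox = labelme_rectangle_to_coco_bbox([[30, 40], [10, 20]])
folder_label = label_for_shape('cat', 'images/train/lion/image0001.json', use_folders_as_labels=True)
file_label = label_for_shape('cat', 'images/train/lion/image0001.json')
```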

labelme_to_coco - CLI interface

Convert labelme-formatted data to COCO

labelme_to_coco [-h] input_folder output_file

labelme_to_coco positional arguments

input_folder - Path to images and .json annotation files
output_file - Output filename (.json)

labelme_to_coco options

-h, --help - show this help message and exit

data_management.rename_images module

rename_images.py

Copies images from a possibly-nested folder structure to a flat folder structure, including EXIF timestamps in each filename. Loosely equivalent to camtrapR’s imageRename() function.

megadetector.data_management.rename_images.main()[source]

megadetector.data_management.rename_images.rename_images(input_folder, output_folder, dry_run=False, verbose=False, read_exif_options=None, n_copy_workers=8)[source]

Copies images from a possibly-nested folder structure to a flat folder structure, including EXIF timestamps in each filename. Loosely equivalent to camtrapR’s imageRename() function.

Parameters:

input_folder (str) – the folder to search for images, always recursive
output_folder (str) – the folder to which we will copy images; cannot be the same as [input_folder]
dry_run (bool, optional) – only map images, don’t actually copy
verbose (bool, optional) – enable additional debug output
read_exif_options (ReadExifOptions, optional) – parameters controlling the reading of EXIF information
n_copy_workers (int, optional) – number of parallel threads to use for copying

Returns:

a dict mapping relative filenames in the input folder to relative filenames in the output folder

Return type:

dict

rename_images - CLI interface

Copies images from a possibly-nested folder structure to a flat folder structure, adding datetime information from EXIF to each filename

rename_images [-h] [--dry_run] input_folder output_folder

rename_images positional arguments

input_folder - The folder to search for images, always recursive
output_folder - The folder to which we should write the flattened image structure

rename_images options

-h, --help - show this help message and exit
--dry_run - Only map images, don’t actually copy

data_management.ocr_tools module

ocr_tools.py

Use OCR (via the Tesseract package) to pull metadata (particularly times and dates) from camera trap images.

The general approach is:

Crop a fixed percentage from the top and bottom of an image, slightly larger than the largest examples we’ve seen of how much space is used for metadata.
Define the background color as the median pixel value, and find rows that are mostly that color to refine the crop.
Crop to the refined crop, then run pytesseract to extract text.
Use regular expressions to find time and date.
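Two of the steps above (finding background rows and matching a datetime with a regular expression) can be sketched without any imaging or OCR libraries. The sketch works on a grayscale image represented as a list of rows of integers and on text that has already been extracted. The function names, the default thresholds, and the single datetime pattern are all assumptions made for illustration; the module's own patterns and parameters are not reproduced here.

```python
import re
from datetime import datetime
from statistics import median


def find_background_rows(rows, background_tolerance=2, min_fraction=0.9):
    """Return (background value, indices of rows that are mostly background)."""
    background = median(p for row in rows for p in row)
    background_rows = []
    for i_row, row in enumerate(rows):
        n_match = sum(1 for p in row if abs(p - background) <= background_tolerance)
        if n_match / len(row) >= min_fraction:
            background_rows.append(i_row)
    return background, background_rows


# Hypothetical pattern: YYYY-MM-DD or YYYY/MM/DD followed by HH:MM:SS
DATETIME_PATTERN = re.compile(r'(\d{4})[-/](\d{2})[-/](\d{2})\s+(\d{2}):(\d{2}):(\d{2})')


def find_datetime(text):
    """Return the first datetime found in text, or None."""
    match = DATETIME_PATTERN.search(text)
    if match is None:
        return None
    try:
        return datetime(*(int(g) for g in match.groups()))
    except ValueError:
        return None  # digits matched, but not a valid date or time


rows = [
    [0] * 10,
    [0, 0, 0, 255, 255, 0, 0, 255, 0, 0],  # a row containing bright "text" pixels
    [0] * 10,
    [0] * 10,
]
background, background_rows = find_background_rows(rows)
found = find_datetime('TEMP 12C 2021-03-04 05:06:07 CAM1')
```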

Prior to using this module:

Install Tesseract from https://tesseract-ocr.github.io/tessdoc/Installation.html
pip install pytesseract

Known limitations:

Semi-transparent overlays (which I’ve only seen on consumer cameras) usually fail.

class megadetector.data_management.ocr_tools.DatetimeExtractionOptions[source]

Bases: object

Options used to parameterize datetime extraction in most functions in this module.

apply_sharpening_filter: Whether to apply PIL’s ImageFilter.SHARPEN prior to OCR

background_crop_fraction_of_rough_crop: Within that rough crop, how much should we use for determining the background color?

background_tolerance: When we’re looking for pixels that match the background color, allow some tolerance around the dominant color

crop_padding: Pad each crop with a few pixels to make tesseract happy

force_all_ocr_options: If this is False, and one set of options appears to succeed for an image, we’ll stop there. If this is True, we always run all option sets on every image.

image_crop_fraction: What fraction of the [top,bottom] of the image should we use for our rough crop?

min_background_fraction: We need to see a consistent color in at least this fraction of pixels in our rough crop to believe that we actually found a candidate metadata region.

min_background_fraction_for_background_row: A row is considered a probable metadata row if it contains at least this fraction of the background color. This is used only to find the top and bottom of the crop area, so not every row needs to meet this criterion, only the rows that are generally above and below the text.

min_text_length: Discard short text, typically text from the top of the image

p_crop_success_threshold: Using a semi-arbitrary metric of how much it feels like we found the text-containing region, discard regions that appear to be extraction failures

tesseract_cmd

Tesseract should be on your system path, but you can also specify the path explicitly, e.g. you can do either of these:

os.environ['PATH'] += r';C:\Program Files\Tesseract-OCR'
self.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

tesseract_config_strings

Try these configuration strings in order until we find a valid datetime.

psm 6: “assume a single uniform block of text”
psm 13: raw line
oem: 0 == legacy, 1 == lstm

tesseract_config_string = '--oem 0 --psm 6'
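The try-in-order behavior described for tesseract_config_strings, combined with force_all_ocr_options, can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's code: run_ocr and fake_ocr are hypothetical stand-ins for a real OCR pass, and the second and third config strings are made up for the example.

```python
from datetime import datetime

def first_valid_datetime(config_strings, run_ocr, force_all=False):
    """Try each config string in order; return (first_datetime_or_None, all_results).

    run_ocr is a hypothetical callable that takes a config string and
    returns a datetime or None. It stands in for a real OCR pass.
    """
    all_results = []
    first_hit = None
    for config in config_strings:
        result = run_ocr(config)
        all_results.append(result)
        if result is not None and first_hit is None:
            first_hit = result
            if not force_all:
                break  # stop at the first option set that succeeds
    return first_hit, all_results

# Hypothetical OCR stand-in that only succeeds with '--psm 13'
def fake_ocr(config):
    return datetime(2021, 3, 14, 9, 26, 53) if '--psm 13' in config else None

configs = ['--oem 0 --psm 6', '--oem 0 --psm 13', '--oem 1 --psm 6']
found, attempts = first_valid_datetime(configs, fake_ocr)
found_all, attempts_all = first_valid_datetime(configs, fake_ocr, force_all=True)
```

With force_all=False the third config string is never tried; with force_all=True every config string is run, but the first successful result is still the one returned.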

megadetector.data_management.ocr_tools.crop_to_solid_region(rough_crop, crop_location, options=None)[source]

Given a rough crop from the top or bottom of an image, finds the background color and crops to the metadata region.

Within a region of an image (typically a crop from the top-ish or bottom-ish part of an image), tightly crop to the solid portion (typically a region with a black background).

The success metric is just a binary indicator right now: 1.0 if we found a region we believe contains a solid background, 0.0 otherwise.

Parameters:

rough_crop (Image) – the PIL Image to crop
crop_location (str) – ‘top’ or ‘bottom’
options (DatetimeExtractionOptions, optional) – OCR parameters

Returns:

a tuple containing (a cropped_image (Image), p_success (float), padded_image (Image))

Return type:

tuple
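The row-based search for a solid band can be illustrated with a small pure-Python sketch. It is not the implementation used by crop_to_solid_region: the function name, the grayscale list-of-lists input, and the parameter names are assumptions that loosely mirror background_tolerance and min_background_fraction_for_background_row.

```python
from collections import Counter

def find_solid_rows(pixels, min_row_fraction=0.9, tolerance=0):
    """Return (top, bottom) indices of the rows dominated by the background
    value, or None if no row qualifies.

    pixels is a non-empty list of rows, each a non-empty list of grayscale ints.
    """
    # Treat the most common value in the rough crop as the background color
    background = Counter(v for row in pixels for v in row).most_common(1)[0][0]

    def background_fraction(row):
        matches = sum(1 for v in row if abs(v - background) <= tolerance)
        return matches / len(row)

    solid = [i for i, row in enumerate(pixels)
             if background_fraction(row) >= min_row_fraction]
    if not solid:
        return None
    return solid[0], solid[-1]

rough_crop = [
    [200, 180, 90, 60],  # scene content
    [0, 0, 0, 0],        # solid band, above the text
    [0, 255, 0, 0],      # band row containing a "text" pixel
    [0, 0, 0, 0],        # solid band, below the text
]
bounds = find_solid_rows(rough_crop)
print(bounds)  # (1, 3)
```

The text row falls below the 0.9 threshold, but the rows above and below it still define the crop, which is why only those bounding rows need to meet the criterion.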

megadetector.data_management.ocr_tools.find_text_in_crops(rough_crops, options=None, tesseract_config_string=None)[source]

Finds all text in each Image in the dict [rough_crops]; those images should be pretty small regions by the time they get to this function, roughly the top or bottom 20% of an image.

Parameters:

rough_crops (dict) – dict of Image objects (keyed by ‘top’ and ‘bottom’) that have been cropped close to text
options (DatetimeExtractionOptions, optional) – OCR parameters
tesseract_config_string (str, optional) – optional CLI argument to pass to tesseract.exe

Returns:

a dict with keys “top” and “bottom”, where each value is a dict with keys ‘text’ (text found, if any) and ‘crop_to_solid_region_results’ (metadata about the OCR pass)

Return type:

dict

megadetector.data_management.ocr_tools.get_datetime_from_image(image, include_crops=True, options=None)[source]

Tries to find the datetime string (if present) in an image.

Parameters:

image (Image or str) – the PIL Image object or image filename in which we should look for datetime information.
include_crops (bool, optional) – whether to include cropped images in the return dict (set this to False if you’re worried about size and you’re processing a zillion images)
options (DatetimeExtractionOptions or list, optional) – OCR parameters, either one DatetimeExtractionOptions object or a list of options to try

Returns:

a dict with fields:

datetime: Python datetime object, or None

text_results: length-2 list of strings

all_extracted_datetimes: if we ran multiple option sets, this will contain the datetimes extracted for each option set

ocr_results: detailed results from the OCR process, including crops as PIL images; only included if include_crops is True

Return type:

dict

megadetector.data_management.ocr_tools.get_datetimes_for_folder(folder_name, output_file=None, n_to_sample=-1, options=None, n_workers=16, use_threads=False)[source]

The main entry point for this module. Tries to retrieve metadata from pixels for every image in [folder_name], optionally writing the results to the .json file [output_file].

Parameters:

folder_name (str) – the folder of images to process recursively
output_file (str, optional) – the .json file to which we should write results; if None, just returns the results
n_to_sample (int, optional) – for debugging only, used to limit the number of images we process
options (DatetimeExtractionOptions or list, optional) – OCR parameters, either one DatetimeExtractionOptions object or a list of options to try for each image
n_workers (int, optional) – the number of parallel workers to use; set to <= 1 to disable parallelization
use_threads (bool, optional) – whether to use threads (True) or processes (False) for parallelization; not relevant if n_workers <= 1

Returns:

a dict mapping filenames to datetime extraction results; see try_get_datetime_from_image for the format of each value in the dict.

Return type:

dict

megadetector.data_management.ocr_tools.make_rough_crops(image, options=None)[source]

Crops the top and bottom regions out of an image.

Parameters:

image (Image or str) – a PIL Image or file name
options (DatetimeExtractionOptions, optional) – OCR parameters

Returns:

a dict with fields ‘top’ and ‘bottom’, each pointing to a new PIL Image

Return type:

dict

megadetector.data_management.ocr_tools.try_get_datetime_from_image(filename, include_crops=False, options=None)[source]

Try/catch wrapper for get_datetime_from_image, optionally trying multiple option sets until we find a datetime.

Parameters:

filename (Image or str) – the PIL Image object or image filename in which we should look for datetime information.
include_crops (bool, optional) – whether to include cropped images in the return dict (set this to False if you’re worried about size and you’re processing a zillion images)
options (DatetimeExtractionOptions or list, optional) – OCR parameters, either one DatetimeExtractionOptions object or a list of options to try

Returns:

A dict with fields:

datetime: Python datetime object, or None
text_results: length-2 list of strings
all_extracted_datetimes: if we ran multiple option sets, this will contain the datetimes extracted for each option set
ocr_results: detailed results from the OCR process, including crops as PIL images; only included if include_crops is True

Return type:

dict

data_management.remove_exif module

remove_exif.py

Removes all EXIF/IPTC/XMP metadata from a folder of images, without making backup copies, using pyexiv2. Ignores non-jpeg images.

This module is rarely used, and pyexiv2 is not thread-safe, so pyexiv2 is not included in package-level dependency lists. YMMV.

megadetector.data_management.remove_exif.remove_exif(image_base_folder, recursive=True, n_processes=1)[source]

Removes all EXIF/IPTC/XMP metadata from a folder of images, without making backup copies, using pyexiv2. Ignores non-jpeg images.

Parameters:

image_base_folder (str) – the folder from which we should remove EXIF data
recursive (bool, optional) – whether to process [image_base_folder] recursively
n_processes (int, optional) – number of concurrent workers. Because pyexiv2 is not thread-safe, only process-based parallelism is supported.

megadetector.data_management.remove_exif.remove_exif_from_image(fn)[source]

Remove EXIF information from a single image

pyexiv2 is not thread safe, do not call this function in parallel within a process.

Parallelizing across processes is fine.

Parameters:

fn (str) – image file from which we should remove EXIF information

Returns:

whether EXIF removal succeeded

Return type:

bool

remove_exif - CLI interface

Removes EXIF/IPTC/XMP metadata from images in a folder

remove_exif [-h] [--nonrecursive] [--n_processes N_PROCESSES] image_base_folder

remove_exif positional arguments

image_base_folder - Folder to process for EXIF removal

remove_exif options

-h, --help - show this help message and exit
--nonrecursive - Don’t recurse into [image_base_folder] (default is recursive)
--n_processes N_PROCESSES - Number of concurrent processes for EXIF removal (default: 1)

data_management.read_exif module

read_exif.py

Given a folder of images, reads relevant metadata (EXIF/IPTC/XMP) fields from all images, and writes them to a .json or .csv file.

This module can use either PIL (which can only reliably read EXIF data) or exiftool (which can read everything). The latter approach expects that exiftool is available on the system path. No attempt is made to be consistent in format across the two approaches.

class megadetector.data_management.read_exif.ExifResultsToCCTOptions[source]

Bases: object

Options controlling the behavior of exif_results_to_cct(), which reformats the datetime information extracted by read_exif_from_folder().

exif_datetime_tag: The EXIF tag from which to pull datetime information

filename_to_location_function: Function for extracting location information, should take a string and return a string. Defaults to ct_utils.image_file_to_camera_folder. If this is None, location is written as “unknown”.

min_valid_timestamp_year: Timestamps older than this are assumed to be junk; lots of cameras use a default time in 2000.

class megadetector.data_management.read_exif.ReadExifOptions[source]

Bases: object

Parameters controlling metadata extraction.

allow_write_error: If this is True and an output file is specified for read_exif_from_folder, and we encounter a serialization issue, we’ll return the results but won’t error.

byte_handling

How should we handle byte-formatted EXIF tags?

‘convert_to_string’: convert to a Python string
‘delete’: don’t include at all
‘raw’: include as a byte string

exiftool_command_name: The command line to invoke if using exiftool, can be an absolute path to exiftool.exe, or can be just “exiftool”, in which case it should be on your system path.

n_workers: Number of concurrent workers, set to <= 1 to disable parallelization

processing_library: Should we use exiftool or PIL?

tag_types_to_ignore: “File” and “ExifTool” are tag types used by ExifTool to report data that doesn’t come from EXIF, rather from the file (e.g. file size).

tags_to_exclude: Include/exclude specific tags (tags_to_include and tags_to_exclude are mutually incompatible)

tags_to_include

Include/exclude specific tags (tags_to_include and tags_to_exclude are mutually incompatible)

A useful set of tags one might want to limit queries for:

options.tags_to_include = minimal_exif_tags

use_threads

Should we use threads (vs. processes) for parallelization?

Not relevant if n_workers is <= 1.

verbose: Enable additional debug console output

megadetector.data_management.read_exif.exif_results_to_cct(exif_results, cct_output_file=None, options=None)[source]

Given the EXIF results for a folder of images read via read_exif_from_folder, create a COCO Camera Traps .json file that has no annotations, but attaches image filenames to locations and datetimes.

Parameters:

exif_results (str or list) – the filename (or loaded list) containing the results from read_exif_from_folder
cct_output_file (str, optional) – the filename to which we should write COCO-Camera-Traps-formatted data
options (ExifResultsToCCTOptions, optional) – options guiding the generation of the CCT file, particularly location mapping

Returns:

a COCO Camera Traps dict (with no annotations).

Return type:

dict

megadetector.data_management.read_exif.format_datetime_as_exif_datetime_string(dt)[source]

Returns a Python datetime object rendered using the standard EXIF datetime string format (‘%Y:%m:%d %H:%M:%S’)

Parameters:

dt (datetime) – datetime object to format

Returns:

[dt] as a string in standard EXIF format

Return type:

str
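For reference, the standard EXIF format string named above can be applied with the standard library alone. This is a minimal illustration of the format, not a call into this module; the constant name is made up for the example.

```python
from datetime import datetime

# The EXIF convention uses colons in the date portion as well as the time
EXIF_DATETIME_FORMAT = '%Y:%m:%d %H:%M:%S'

dt = datetime(2023, 7, 4, 18, 30, 5)
exif_string = dt.strftime(EXIF_DATETIME_FORMAT)
print(exif_string)  # 2023:07:04 18:30:05
```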

megadetector.data_management.read_exif.get_exif_lat_lon(gps, verbose=False)[source]

Convert an EXIF GPS dict to lat,lon.

Parameters:

gps (dict) – dict with fields GPSLatitude, GPSLongitude, GPSLatitudeRef, and GPSLongitudeRef
verbose (bool, optional) – print warnings on unsuccessful conversions

Returns:

lat,lon, or None if the data are not valid GPS coordinates

Return type:

tuple
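The conversion from EXIF GPS fields to decimal degrees is typically done as sketched below. This is not necessarily how get_exif_lat_lon is written: it assumes each coordinate is a (degrees, minutes, seconds) sequence of values that float() accepts, and the helper names are hypothetical.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere ref to decimal degrees."""
    degrees, minutes, seconds = (float(v) for v in dms)
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern and western hemispheres are negative by convention
    if ref in ('S', 'W'):
        value = -value
    return value

def gps_dict_to_lat_lon(gps):
    """Return (lat, lon), or None if fields are missing, malformed, or out of range."""
    try:
        lat = dms_to_decimal(gps['GPSLatitude'], gps['GPSLatitudeRef'])
        lon = dms_to_decimal(gps['GPSLongitude'], gps['GPSLongitudeRef'])
    except (KeyError, TypeError, ValueError, ZeroDivisionError):
        return None
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon

example_gps = {
    'GPSLatitude': (47, 36, 0), 'GPSLatitudeRef': 'N',
    'GPSLongitude': (122, 19, 48), 'GPSLongitudeRef': 'W',
}
lat_lon = gps_dict_to_lat_lon(example_gps)  # approximately (47.6, -122.33)
```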

megadetector.data_management.read_exif.get_gps_info(im, verbose=False, check_for_null_island=True)[source]

Given a filename, PIL image, dict of EXIF tags, or dict containing an ‘exif_tags’ field, return GPS location information if available.

Parameters:

im (str, PIL.Image.Image, dict) – image for which we should read GPS metadata
verbose (bool, optional) – enable additional debug information
check_for_null_island (bool, optional) – treat 0,0 as being “not GPS”

Returns:

a dict with keys ‘status’ and ‘gps_info’. ‘status’ will be ‘success’, ‘read_error’, ‘no_exif_info’, ‘no_gps_info’, or ‘null_island’. If not None, ‘gps_info’ contains at least the keys GPSVersionID, GPSLatitudeRef, GPSLatitude, GPSLongitudeRef, and GPSLongitude. Values are not decoded to, e.g., degrees; they are left as reported in EXIF.

Return type:

dict

megadetector.data_management.read_exif.has_gps_info(im)[source]

Given a filename, PIL image, dict of EXIF tags, or dict containing an ‘exif_tags’ field, determine whether GPS location information is present in this image. Does not retrieve location info, currently only used to determine whether it’s present.

Parameters:

im (str, PIL.Image.Image, dict) – image for which we should determine GPS metadata presence

Returns:

whether GPS metadata is present, or None if we failed to read EXIF data from a file.

Return type:

bool

megadetector.data_management.read_exif.parse_exif_datetime_string(s, verbose=False)[source]

EXIF datetimes are strings, but in a standard format:

%Y:%m:%d %H:%M:%S

Parses one of those strings into a Python datetime object.

Parameters:

s (str) – datetime string to parse, should be in standard EXIF datetime format
verbose (bool, optional) – enable additional debug output

Returns:

the datetime object created from [s]

Return type:

datetime
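A standard-library sketch of this kind of parsing is shown below. Returning None on failure, and the optional year cutoff (in the spirit of min_valid_timestamp_year in ExifResultsToCCTOptions), are assumptions made for the example rather than documented behavior of parse_exif_datetime_string.

```python
from datetime import datetime

def parse_exif_datetime(s, min_valid_year=None):
    """Parse an EXIF datetime string.

    Returns None if the string can't be parsed or if the year falls
    before min_valid_year (both behaviors are assumptions of this sketch).
    """
    try:
        dt = datetime.strptime(s.strip(), '%Y:%m:%d %H:%M:%S')
    except (ValueError, AttributeError):
        return None
    if min_valid_year is not None and dt.year < min_valid_year:
        return None
    return dt

parsed = parse_exif_datetime('2023:07:04 18:30:05')
junk = parse_exif_datetime('2000:01:01 00:00:00', min_valid_year=2001)
invalid = parse_exif_datetime('not a datetime')
```

Here parsed is a datetime for 4 July 2023, while junk and invalid are both None.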

megadetector.data_management.read_exif.read_exif_from_folder(input_folder, output_file=None, options=None, filenames=None, recursive=True)[source]

Read EXIF data for a folder of images.

Parameters:

input_folder (str) – folder to process; if this is None, [filenames] should be a list of absolute paths
output_file (str, optional) – .json file to which we should write results; if this is None, results are returned but not written to disk
options (ReadExifOptions, optional) – parameters controlling metadata extraction
filenames (list, optional) – allowlist of relative filenames (if [input_folder] is not None) or a list of absolute filenames (if [input_folder] is None)
recursive (bool, optional) – whether to recurse into [input_folder], not relevant if [input_folder] is None.

Returns:

list of dicts, each of which contains EXIF information for one image. Fields include at least:

’file_name’: the relative path to the image
’exif_tags’: a dict of EXIF tags whose exact format depends on [options.processing_library].
’status’ and ‘error’: only populated for images where EXIF reading failed

Return type:

list

megadetector.data_management.read_exif.read_exif_tags_for_image(file_path, options=None)[source]

Get relevant fields from EXIF data for an image

Parameters:

file_path (str) – image from which we should read EXIF data
options (ReadExifOptions, optional) – see ReadExifOptions

Returns:

a dict with fields ‘status’ (str) and ‘tags’. The exact format of ‘tags’ depends on options.processing_library:

For exiftool, ‘tags’ is a list of lists, where each element is (type/tag/value)

For PIL, ‘tags’ is a dict (str:str)

Return type:

dict

megadetector.data_management.read_exif.read_pil_exif(im, options=None)[source]

Read all the EXIF data we know how to read from an image, using PIL. This is primarily an internal function; the main entry point for single-image EXIF information is read_exif_tags_for_image().

Parameters:

im (str or PIL.Image.Image) – image (as a filename or an Image object) from which we should read EXIF data.
options (ReadExifOptions, optional) – see ReadExifOptions

Returns:

a dictionary mapping EXIF tag names to their values

Return type:

dict

read_exif - CLI interface

Read EXIF information from all images in a folder, and write the results to .csv or .json

read_exif [-h] [--n_workers N_WORKERS] [--use_threads]
          [--processing_library PROCESSING_LIBRARY]
          input_folder output_file

read_exif positional arguments

input_folder - Folder of images from which we should read EXIF information
output_file - Output file (.json) to which we should write EXIF information

read_exif options

-h, --help - show this help message and exit
--n_workers N_WORKERS - Number of concurrent workers to use (defaults to 1)
--use_threads - Use threads (instead of processes) for multitasking
--processing_library PROCESSING_LIBRARY - Processing library (exif or pil)

data_management.speciesnet_to_md module

speciesnet_to_md.py

Converts the WI (SpeciesNet) predictions.json format to MD .json format. This is just a command-line wrapper around utils.wi_taxonomy_utils.generate_md_results_from_predictions_json.

speciesnet_to_md - CLI interface

speciesnet_to_md [-h] [--base_folder BASE_FOLDER] predictions_json_file md_results_file

speciesnet_to_md positional arguments

predictions_json_file - .json file to convert from SpeciesNet predictions.json format to MD format
md_results_file - output file to write in MD format

speciesnet_to_md options

-h, --help - show this help message and exit
--base_folder BASE_FOLDER - leading string to remove from each path in the predictions.json file (to convert from absolute to relative paths)

data_management.mewc_to_md module

mewc_to_md.py

Converts the output of the MEWC inference scripts to the MD output format.

megadetector.data_management.mewc_to_md.mewc_to_md(mewc_input_folder, output_file=None, mount_prefix='/images/', category_name_column='class_id', mewc_out_filename='mewc_out.csv', md_out_filename='md_out.json')[source]

Converts the output of the MEWC inference scripts to the MD output format.

Parameters:

mewc_input_folder (str) – the folder we’ll search for MEWC output files
output_file (str, optional) – .json file to write with class information
mount_prefix (str, optional) – string to remove from all filenames in the MD .json file, typically the prefix used to mount the image folder.
category_name_column (str, optional) – column in the MEWC results .csv to use for category naming.
mewc_out_filename (str, optional) – MEWC-formatted .csv file that should be in [mewc_input_folder]
md_out_filename (str, optional) – MD-formatted .json file (without classification information) that should be in [mewc_input_folder]

Returns:

an MD-formatted dict, the same as what’s written to [output_file]

Return type:

dict

mewc_to_md - CLI interface

mewc_to_md [-h] [--mount_prefix MOUNT_PREFIX] [--category_name_column CATEGORY_NAME_COLUMN]
           input_folder output_file

mewc_to_md positional arguments

input_folder - Folder containing images and MEWC .json/.csv files
output_file - .json file where output will be written

mewc_to_md options

-h, --help - show this help message and exit
--mount_prefix MOUNT_PREFIX - prefix to remove from each filename in MEWC results, typically the Docker mount point
--category_name_column CATEGORY_NAME_COLUMN - column in the MEWC .csv file to use for category names

data_management.zamba_to_md module

zamba_to_md.py

Convert a labels.csv file produced by Zamba Cloud to a MD results file suitable for import into Timelapse.

Columns are expected to be:

video_uuid (not used)
original_filename (assumed to be a relative path name)
top_k_label,top_k_probability, for k = 1..N
[category name 1],[category name 2],…
corrected_label

Because the MD results file fundamentally stores detections, what we’ll actually do is create bogus detections that fill the entire image.

There is no special handling of empty/blank categories; because these results are based on a classifier, rather than a detector (where “blank” would be the absence of all other categories), “blank” can be queried in Timelapse just like any other class.
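The idea of a placeholder detection that fills the whole image can be sketched as follows, using the MD convention of normalized [x_min, y_min, width, height] boxes. This is not the code in zamba_to_md.py: the function name is hypothetical, and using category ‘1’ and the top classifier probability for the placeholder detection are assumptions.

```python
def whole_image_entry(relative_path, label_probabilities, category_name_to_id):
    """Build one MD-style image entry with a single full-frame detection.

    label_probabilities maps class names to classifier probabilities and
    must not be empty. Category '1' and the top probability are used for
    the placeholder detection purely for illustration.
    """
    ranked = sorted(label_probabilities.items(), key=lambda kv: kv[1], reverse=True)
    classifications = [[category_name_to_id[name], prob] for name, prob in ranked]
    detection = {
        'category': '1',
        'conf': ranked[0][1],
        'bbox': [0.0, 0.0, 1.0, 1.0],  # [x_min, y_min, width, height], normalized
        'classifications': classifications,
    }
    return {'file': relative_path, 'detections': [detection]}

entry = whole_image_entry(
    'site1/video001.mp4',
    {'blank': 0.10, 'elephant': 0.85, 'leopard': 0.05},
    {'blank': '0', 'elephant': '1', 'leopard': '2'},
)
```

Because ‘blank’ is just another classification category here, it appears in the classifications list like any other class.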

megadetector.data_management.zamba_to_md.zamba_results_to_md_results(input_file, output_file=None)[source]

Converts the .csv file [input_file] to the MD-formatted .json file [output_file].

If [output_file] is None, ‘.json’ will be appended to the input file.

Parameters:

input_file (str) – the .csv file to convert
output_file (str, optional) – the output .json file (defaults to [input_file].json)

zamba_to_md - CLI interface

Convert a Zamba-formatted .csv results file to a MD-formatted .json results file

zamba_to_md [-h] [--output_file OUTPUT_FILE] input_file

zamba_to_md positional arguments

input_file - input .csv file

zamba_to_md options

-h, --help - show this help message and exit
--output_file OUTPUT_FILE - output .json file (defaults to input file appended with ".json")

data_management.animl_to_md module

animl_to_md.py

Convert a .csv file produced by the Animl package:

https://github.com/conservationtechlab/animl-py

…to a MD results file suitable for import into Timelapse.

Columns are expected to be:

file
category (MD category identifiers: 1==animal, 2==person, 3==vehicle)
detection_conf
bbox1,bbox2,bbox3,bbox4
class
classification_conf

megadetector.data_management.animl_to_md.animl_results_to_md_results(input_file, output_file=None)[source]

Converts the .csv file [input_file] to the MD-formatted .json file [output_file].

If [output_file] is None, ‘.json’ will be appended to the input file.

animl_to_md - CLI interface

Convert an Animl-formatted .csv results file to MD-formatted .json results file

animl_to_md [-h] [--output_file OUTPUT_FILE] input_file

animl_to_md positional arguments

input_file - input .csv file

animl_to_md options

-h, --help - show this help message and exit
--output_file OUTPUT_FILE - output .json file (defaults to input file appended with ".json")

data_management.camtrap_dp_to_coco module

camtrap_dp_to_coco.py

Parse a very limited subset of the Camtrap DP data package format:

https://camtrap-dp.tdwg.org/

…and convert to COCO format. Assumes that all required metadata files have been put in the same directory (which is standard).

Does not currently parse bounding boxes, just attaches species labels to images.

Currently supports only sequence-level labeling.

megadetector.data_management.camtrap_dp_to_coco.camtrap_dp_to_coco(camtrap_dp_folder, output_file=None)[source]

Convert the Camtrap DP package in [camtrap_dp_folder] to COCO.

Does not validate images, just converts. Use integrity_check_json_db to validate the resulting COCO file.

Optionally writes the results to [output_file]

Parameters:

camtrap_dp_folder (str) – input folder, containing a CamtrapDP package
output_file (str, optional) – COCO-formatted output file

camtrap_dp_to_coco - CLI interface

Convert Camtrap DP to COCO format

camtrap_dp_to_coco [-h] [--output_file OUTPUT_FILE] camtrap_dp_folder

camtrap_dp_to_coco positional arguments

camtrap_dp_folder - Input folder, containing a CamtrapDP package

camtrap_dp_to_coco options

-h, --help - show this help message and exit
--output_file OUTPUT_FILE - COCO-formatted output file (defaults to [camtrap_dp_folder]_coco.json)

data_management.remap_coco_categories module

remap_coco_categories.py

Given a COCO-formatted dataset, remap the categories to a new mapping. A common use case is to take a fine-grained dataset (e.g. with species-level categories) and map them to coarse categories (typically MD categories).

megadetector.data_management.remap_coco_categories.remap_coco_categories(input_data, output_category_name_to_id, input_category_name_to_output_category_name, output_file=None, allow_unused_categories=False)[source]

Given a COCO-formatted dataset, remap the categories to a new categories mapping, optionally writing the results to a new file.

Parameters:

input_data (str or dict) – a COCO-formatted dict or a filename. If it’s a dict, it will be copied, not modified in place.
output_category_name_to_id (dict) – a dict mapping strings to ints. Categories not in this dict will be ignored or will result in errors, depending on allow_unused_categories.
input_category_name_to_output_category_name (dict) – a dict mapping strings to strings. Annotations using categories not in this dict will be omitted or will result in errors, depending on allow_unused_categories.
output_file (str, optional) – output file to which we should write remapped COCO data
allow_unused_categories (bool, optional) – should we ignore categories not present in the input/output mappings? If this is False and we encounter an unmapped category, we’ll error.

Returns:

COCO-formatted dict

Return type:

dict
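The remapping logic described above can be sketched on a tiny in-memory dataset. This is an illustrative reimplementation under assumptions, not the body of remap_coco_categories: the function name is hypothetical, and it only touches the ‘categories’ and ‘annotations’ fields.

```python
import copy

def remap_categories(coco, output_name_to_id, input_name_to_output_name,
                     allow_unused=False):
    """Return a remapped copy of a COCO-style dict (illustrative sketch only)."""
    coco = copy.deepcopy(coco)  # leave the caller's dict untouched
    input_id_to_name = {c['id']: c['name'] for c in coco['categories']}
    remapped = []
    for ann in coco['annotations']:
        input_name = input_id_to_name[ann['category_id']]
        output_name = input_name_to_output_name.get(input_name)
        if output_name is None or output_name not in output_name_to_id:
            if allow_unused:
                continue  # drop annotations whose category is unmapped
            raise ValueError('Unmapped category: {}'.format(input_name))
        ann['category_id'] = output_name_to_id[output_name]
        remapped.append(ann)
    coco['annotations'] = remapped
    coco['categories'] = [{'id': i, 'name': n} for n, i in output_name_to_id.items()]
    return coco

species_db = {
    'images': [{'id': 'a', 'file_name': 'a.jpg'}],
    'categories': [{'id': 0, 'name': 'red_fox'}, {'id': 1, 'name': 'hiker'},
                   {'id': 2, 'name': 'fog'}],
    'annotations': [{'id': 'x', 'image_id': 'a', 'category_id': 0},
                    {'id': 'y', 'image_id': 'a', 'category_id': 1},
                    {'id': 'z', 'image_id': 'a', 'category_id': 2}],
}
coarse_db = remap_categories(species_db,
                             {'animal': 1, 'person': 2},
                             {'red_fox': 'animal', 'hiker': 'person'},
                             allow_unused=True)
```

With allow_unused=True the ‘fog’ annotation is dropped; with the default of False the same call raises an error instead.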

remap_coco_categories - CLI interface

Remap categories in a COCO-formatted dataset

remap_coco_categories [-h] [--allow_unused_categories]
                      input_coco_file output_category_map_file
                      input_to_output_category_map_file output_coco_file

remap_coco_categories positional arguments

input_coco_file - Path to the input COCO .json file
output_category_map_file - Path to a .json file mapping output category names to integer IDs (e.g., {'cat':0, 'dog':1})
input_to_output_category_map_file - Path to a .json file mapping input category names to output category names (e.g., {'old_cat_name':'cat', 'old_dog_name':'dog'})
output_coco_file - Path to save the remapped COCO .json file

remap_coco_categories options

-h, --help - show this help message and exit
--allow_unused_categories - Allow unmapped categories (by default, errors on unmapped categories)

data_management.yolo_output_to_md_output module

yolo_output_to_md_output.py

Converts the output of YOLOv5’s detect.py or val.py to the MD output format.

Converting .txt files

detect.py writes a .txt file per image, in YOLO training format. Converting from this format does not currently support recursive results, since detect.py doesn’t save filenames in a way that allows easy inference of folder names. Requires access to the input images, because the YOLO format uses the absence of a results file to indicate that no detections are present.

YOLOv5 output has one text file per image, like so:

0 0.0141693 0.469758 0.0283385 0.131552 0.761428

That’s [class, x_center, y_center, width_of_box, height_of_box, confidence]

val.py can write in this format as well, using the --save-txt argument.

In both cases, a confidence value is only written to each line if you include the --save-conf argument. Confidence values are required by this conversion script.
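Converting one line of this format to an MD-style detection amounts to shifting the box from center-based to corner-based coordinates. The sketch below shows the arithmetic only; the function name and the error handling are assumptions, and this is not the module's conversion code.

```python
def yolo_line_to_md_detection(line, offset_class_ids=True):
    """Convert 'class x_center y_center width height conf' (all normalized)
    to an MD-style detection dict."""
    tokens = line.split()
    if len(tokens) != 6:
        raise ValueError('Expected 6 values (was --save-conf used?): ' + line)
    class_id = int(tokens[0])
    x_center, y_center, width, height, conf = (float(t) for t in tokens[1:])
    if offset_class_ids:
        class_id += 1  # YOLO classes start at 0; MD categories start at 1
    return {
        'category': str(class_id),
        'conf': conf,
        # MD boxes are [x_min, y_min, width, height], normalized
        'bbox': [x_center - width / 2, y_center - height / 2, width, height],
    }

detection = yolo_line_to_md_detection(
    '0 0.0141693 0.469758 0.0283385 0.131552 0.761428')
```

For the sample line above, the resulting box starts at roughly x=0.0, y=0.404, since the box is centered very close to the left edge of the image.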

Converting .json files

val.py can also write a .json file in COCO-ish format. It’s “COCO-ish” because it’s just the “images” portion of a COCO .json file.

Converting from this format also requires access to the original images, since the format written by YOLOv5 uses absolute coordinates, but MD results are in relative coordinates.
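The absolute-to-relative conversion is the reason image dimensions are needed. A minimal sketch, assuming each box is stored as pixel-space [x_min, y_min, width, height] (an assumption worth checking against your own .json files) and using a hypothetical function name:

```python
def absolute_to_relative_bbox(bbox, image_width, image_height):
    """Normalize a pixel-space [x_min, y_min, width, height] box to the 0..1 range."""
    x_min, y_min, width, height = bbox
    return [x_min / image_width, y_min / image_height,
            width / image_width, height / image_height]

relative = absolute_to_relative_bbox([192, 108, 960, 540], 1920, 1080)
print(relative)  # [0.1, 0.1, 0.5, 0.5]
```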

megadetector.data_management.yolo_output_to_md_output.read_classes_from_yolo_dataset_file(fn)[source]

Reads a dictionary mapping integer class IDs to class names from a YOLOv5/YOLOv8 dataset.yaml file or a .json file. A .json file should contain a dictionary mapping integer category IDs to string category names.

Parameters:

fn (str) – YOLOv5/YOLOv8 dataset file with a .yml or .yaml extension, a .json file mapping integer category IDs to category names, or a .txt file with a flat list of classes.

Returns:

a mapping from integer category IDs to category names

Return type:

dict

megadetector.data_management.yolo_output_to_md_output.yolo_json_output_to_md_output(yolo_json_file, image_folder, output_file, yolo_category_id_to_name, detector_name='unknown', image_id_to_relative_path=None, offset_yolo_class_ids=True, truncate_to_standard_md_precision=True, image_id_to_error=None, convert_slashes=True)[source]

Converts a YOLOv5/YOLOv8 .json file to MD .json format.

Parameters:

yolo_json_file (str) – the YOLO-formatted .json file to convert to MD output format
image_folder (str) – the base folder for the relative path names in the .json file
output_file (str) – the MD-formatted .json file to write
yolo_category_id_to_name (str or dict) – the .json results file contains only numeric identifiers for categories, but we want names and numbers for the output format; yolo_category_id_to_name provides that mapping either as a dict or as a YOLOv5 dataset.yaml file.
detector_name (str, optional) – a string that gets put in the output file, not otherwise used within this function
image_id_to_relative_path (dict, optional) – YOLOv5 .json uses only basenames (e.g. abc1234.JPG); by default these will be appended to the input path to create pathnames. If you have a flat folder, this is fine. If you want to map base names to relative paths in a more complicated way, use this parameter.
offset_yolo_class_ids (bool, optional) – YOLOv5 class IDs always start at zero; if you want to make the output classes start at 1, set offset_yolo_class_ids to True.
truncate_to_standard_md_precision (bool, optional) – YOLOv5 .json includes lots of (not-super-meaningful) precision, set this to truncate to COORD_DIGITS and CONF_DIGITS.
image_id_to_error (dict, optional) – if you want to include image IDs in the output file for which you couldn’t prepare the input file in the first place due to errors, include them here.
convert_slashes (bool, optional) – force all slashes to be forward slashes in the output file

megadetector.data_management.yolo_output_to_md_output.yolo_txt_output_to_md_output(input_results_folder, image_folder, output_file, detector_tag=None, truncate_to_standard_md_precision=True)[source]

Converts a folder of YOLO-output .txt files to MD .json format.

Less finished than the .json conversion function; this .txt conversion assumes a hard-coded mapping representing the standard MD categories (in MD indexing, 1/2/3=animal/person/vehicle; in YOLO indexing, 0/1/2=animal/person/vehicle).

Parameters:

input_results_folder (str) – the folder containing YOLO-output .txt files
image_folder (str) – the folder where images live, may be the same as [input_results_folder]
output_file (str) – the MD-formatted .json file to which we should write results
detector_tag (str, optional) – string to put in the ‘detector’ field in the output file
truncate_to_standard_md_precision (bool, optional) – set this to truncate to COORD_DIGITS and CONF_DIGITS, like the standard MD pipeline does.

yolo_output_to_md_output - CLI interface

Converts YOLOv5 output (.json or .txt) to MD output format.

yolo_output_to_md_output [-h] {json,txt} ...

yolo_output_to_md_output options

-h, --help - show this help message and exit

yolo_output_to_md_output json

Convert YOLO-formatted .json results.

yolo_output_to_md_output json [-h] [--detector_name DETECTOR_NAME]
                              [--image_id_to_relative_path_file IMAGE_ID_TO_RELATIVE_PATH_FILE]
                              [--offset_yolo_class_ids {true,false}]
                              [--truncate_to_standard_md_precision {true,false}]
                              [--convert_slashes {true,false}]
                              yolo_json_file image_folder output_file
                              yolo_category_id_to_name_file

yolo_output_to_md_output json positional arguments

yolo_json_file - Path to the input YOLO-formatted .json results file
image_folder - Path to the image folder
output_file - Path to the MD-formatted .json output file
yolo_category_id_to_name_file - Path to the .yml, .yaml, .json, or .txt file mapping YOLO category IDs to names

yolo_output_to_md_output json options

-h, --help - show this help message and exit
--detector_name DETECTOR_NAME - Detector name to store in the output file (default: 'unknown')
--image_id_to_relative_path_file IMAGE_ID_TO_RELATIVE_PATH_FILE - Path to a .json file mapping image IDs to relative paths
--offset_yolo_class_ids OFFSET_YOLO_CLASS_IDS - Offset YOLO class IDs in the output (default: 'true')
--truncate_to_standard_md_precision TRUNCATE_TO_STANDARD_MD_PRECISION - Truncate coordinates and confidences to standard MD precision (default: 'true')
--convert_slashes CONVERT_SLASHES - Convert backslashes to forward slashes in output file paths (default: 'true')

yolo_output_to_md_output txt

Convert YOLO-formatted .txt results from a folder

yolo_output_to_md_output txt [-h] [--detector_tag DETECTOR_TAG]
                             [--truncate_to_standard_md_precision {true,false}]
                             input_results_folder image_folder output_file

yolo_output_to_md_output txt positional arguments

input_results_folder - Path to the folder containing YOLO .txt output files
image_folder - Path to the image folder
output_file - Path to the MD-formatted .json file output

yolo_output_to_md_output txt options

-h, --help - show this help message and exit
--detector_tag DETECTOR_TAG - Detector tag to store in the output file
--truncate_to_standard_md_precision TRUNCATE_TO_STANDARD_MD_PRECISION - Truncate coordinates and confidences to standard MD precision (default: 'true').

data_management.yolo_to_coco module

yolo_to_coco.py

Converts a folder of YOLO-formatted annotation files to a COCO-formatted dataset.

megadetector.data_management.yolo_to_coco.load_yolo_class_list(class_name_file)[source]

Loads a dictionary mapping zero-indexed IDs to class names from the text/yaml file [class_name_file].

Parameters:

class_name_file (str or list) – this can be:
- a .yml or .yaml file in YOLO’s dataset.yaml format
- a .txt or .data file containing a flat list of class names
- a list of class names

Returns:

a dict mapping zero-indexed integer IDs to class names

Return type:

dict
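
The mapping described above can be sketched for the two simplest input types, a Python list and a flat text file. This is a minimal sketch, not the library's implementation: load_class_list_sketch is a hypothetical name, the .yml/.yaml case is omitted to avoid a YAML dependency, and the sketch assumes one class name per line with blank lines ignored.

```python
def load_class_list_sketch(class_names):
    """Hypothetical helper: map zero-indexed integer IDs to class names.

    Accepts either a list of names or the path to a flat text file
    containing one class name per line.
    """
    if isinstance(class_names, str):
        # Assumption: blank lines are ignored and whitespace is stripped
        with open(class_names, "r", encoding="utf-8") as f:
            names = [line.strip() for line in f if line.strip()]
    else:
        names = list(class_names)
    return {i_class: name for i_class, name in enumerate(names)}


id_to_name = load_class_list_sketch(["animal", "person", "vehicle"])
print(id_to_name)
```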

megadetector.data_management.yolo_to_coco.validate_label_file(label_file, category_id_to_name=None, verbose=False)[source]

Verifies that [label_file] is a valid YOLO label file. Does not check the extension.

Parameters:

label_file (str) – the .txt file to validate
category_id_to_name (dict, optional) – a dict mapping integer category IDs to names; if this is not None, this function errors if the file uses a category that’s not in this dict
verbose (bool, optional) – enable additional debug console output

Returns:

a dict with keys ‘file’ (the same as [label_file]) and ‘errors’ (a list of errors (if any) that we found in this file)

Return type:

dict
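
To make the idea of a "valid YOLO label file" concrete, here is a minimal sketch of the kind of per-line checks such a validator might perform. It is an assumption-laden illustration, not this function's actual logic: validate_label_lines_sketch is a hypothetical name, and the checks (five tokens per line, an integer class ID, normalized coordinates in [0, 1]) reflect the standard YOLO label layout rather than anything confirmed by this documentation.

```python
def validate_label_lines_sketch(label_file, lines, category_id_to_name=None):
    """Hypothetical helper: check YOLO label lines, return {'file', 'errors'}."""
    errors = []
    for i_line, line in enumerate(lines):
        tokens = line.split()
        if len(tokens) == 0:
            continue  # Assumption: blank lines are tolerated
        if len(tokens) != 5:
            errors.append(f"line {i_line}: expected 5 tokens, found {len(tokens)}")
            continue
        try:
            category_id = int(tokens[0])
            coords = [float(t) for t in tokens[1:]]
        except ValueError:
            errors.append(f"line {i_line}: non-numeric token")
            continue
        if category_id_to_name is not None and category_id not in category_id_to_name:
            errors.append(f"line {i_line}: unknown category {category_id}")
        if any(c < 0.0 or c > 1.0 for c in coords):
            errors.append(f"line {i_line}: coordinate outside [0, 1]")
    return {"file": label_file, "errors": errors}


good = validate_label_lines_sketch("a.txt", ["0 0.5 0.5 0.2 0.2"], {0: "animal"})
bad = validate_label_lines_sketch("b.txt", ["3 0.5 0.5 1.2 0.2"], {0: "animal"})
print(good)
print(bad)
```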

megadetector.data_management.yolo_to_coco.validate_yolo_dataset(input_folder, class_name_file, n_workers=1, pool_type='thread', verbose=False)[source]

Verifies all the labels in a YOLO dataset folder. Does not yet support the case where the labels and images are in different folders (yolo_to_coco() supports this).

Looks for:

Image files without label files
Text files without image files
Illegal classes in label files
Invalid boxes in label files

Parameters:

input_folder (str) – the YOLO dataset folder to validate
class_name_file (str or list) – a list of classes, a flat text file, or a yolo dataset.yml/.yaml file. If it’s a dataset.yml file, that file should point to input_folder as the base folder, though this is not explicitly checked.
n_workers (int, optional) – number of concurrent workers, set to <= 1 to disable parallelization
pool_type (str, optional) – ‘thread’ or ‘process’, worker type to use for parallelization; not used if [n_workers] <= 1
verbose (bool, optional) – enable additional debug console output

Returns:

validation results, as a dict with fields:

image_files_without_label_files (list)
label_files_without_image_files (list)
label_results (list of dicts with fields ‘filename’ and ‘errors’) (list)

Return type:

dict
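
The first two checks listed above (images without labels, labels without images) amount to comparing filename stems. The sketch below shows one way to do that on a plain list of filenames. The helper name find_unpaired_files_sketch and the set of image extensions are assumptions made for illustration; the library's own matching rules may differ.

```python
import os

# Assumption: a small, illustrative set of image extensions
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unpaired_files_sketch(filenames):
    """Hypothetical helper: pair images and .txt label files by filename stem."""
    image_stem_to_file = {}
    label_stem_to_file = {}
    for fn in filenames:
        stem, ext = os.path.splitext(fn)
        if ext.lower() in IMAGE_EXTENSIONS:
            image_stem_to_file[stem] = fn
        elif ext.lower() == ".txt":
            label_stem_to_file[stem] = fn
    return {
        "image_files_without_label_files": sorted(
            fn for stem, fn in image_stem_to_file.items() if stem not in label_stem_to_file
        ),
        "label_files_without_image_files": sorted(
            fn for stem, fn in label_stem_to_file.items() if stem not in image_stem_to_file
        ),
    }


unpaired = find_unpaired_files_sketch(["a.jpg", "a.txt", "b.JPG", "c.txt"])
print(unpaired)
```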

megadetector.data_management.yolo_to_coco.yolo_to_coco(input_folder, class_name_file, output_file=None, empty_image_handling='no_annotations', empty_image_category_name='empty', error_image_handling='no_annotations', allow_images_without_label_files=True, n_workers=1, pool_type='thread', recursive=True, exclude_string=None, include_string=None, overwrite_handling='overwrite', label_folder=None, supercategory=None, force_integer_ids=False, include_area=False, include_crowd=False, invalid_annotation_handling='error', precision=3)[source]

Converts a YOLO-formatted dataset to a COCO-formatted dataset.

All images will be assigned an “error” value, usually None.

Parameters:

input_folder (str) – the YOLO dataset folder to convert. If the image and label folders are different, this is the image folder, and [label_folder] is the label folder.
class_name_file (str or list) – a list of classes, a flat text file, or a yolo dataset.yml/.yaml file. If it’s a dataset.yml file, that file should point to input_folder as the base folder, though this is not explicitly checked.
output_file (str, optional) – .json file to which we should write COCO .json data
empty_image_handling (str, optional) –
how to handle images with no boxes; whether this includes images with no .txt files depends on the value of [allow_images_without_label_files]. Can be:
- ’no_annotations’: include the image in the image list, with no annotations
- ’empty_annotations’: include the image in the image list, and add an annotation without any bounding boxes, using a category called [empty_image_category_name].
- ’skip’: don’t include the image in the image list
- ’error’: there shouldn’t be any empty images
empty_image_category_name (str, optional) – if we’re going to be inserting annotations for images with no boxes, what category name should we use?
error_image_handling (str, optional) –
how to handle images that don’t load properly; can be:
- ’skip’: don’t include the image at all
- ’no_annotations’: include with no annotations
allow_images_without_label_files (bool, optional) – whether to silently allow images with no label files (True) or raise errors for images with no label files (False)
n_workers (int, optional) – number of concurrent workers, set to <= 1 to disable parallelization
pool_type (str, optional) – ‘thread’ or ‘process’, worker type to use for parallelization; not used if [n_workers] <= 1
recursive (bool, optional) – whether to recurse into [input_folder]
exclude_string (str, optional) – exclude any images whose filename contains a string
include_string (str, optional) – include only images whose filename contains a string
overwrite_handling (str, optional) – behavior if output_file exists (‘load’, ‘overwrite’, or ‘error’)
label_folder (str, optional) – label folder, if different from the image folder
supercategory (str, optional) – populate the ‘supercategory’ field, currently only supports None (don’t populate) or a single supercategory for the whole dataset. This is mostly only here because RF-DETR requires something to be populated in this field.
force_integer_ids (bool, optional) – force image and annotation IDs to be integers
include_area (bool, optional) – add the “area” field for boxes
include_crowd (bool, optional) – include the “iscrowd” field (always 0) for annotations
invalid_annotation_handling (str, optional) – how to handle invalid annotations, e.g. negative-height bounding boxes. Can be ‘error’, ‘warn’, or ‘exclude’. ‘exclude’ implies ‘warn’.
precision (int, optional) – round box coordinates to this many decimal places, or None to bypass rounding.

Returns:

COCO-formatted data, the same as what’s written to [output_file]

Return type:

dict
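
The core coordinate change in a YOLO-to-COCO conversion can be sketched as follows. This is an illustration under stated assumptions, not the function's own code: yolo_box_to_coco_box is a hypothetical helper, YOLO boxes are taken to be normalized (x_center, y_center, width, height), and COCO boxes are taken to be absolute-pixel [x_min, y_min, width, height]. The precision argument mirrors the parameter of the same name described above.

```python
def yolo_box_to_coco_box(yolo_box, image_width, image_height, precision=3):
    """Hypothetical helper: normalized center-based box -> absolute top-left box."""
    x_center, y_center, width, height = yolo_box

    # Scale normalized sizes to pixels
    box_width = width * image_width
    box_height = height * image_height

    # Move from the box center to the top-left corner
    x_min = x_center * image_width - box_width / 2
    y_min = y_center * image_height - box_height / 2

    coco_box = [x_min, y_min, box_width, box_height]
    if precision is not None:
        coco_box = [round(v, precision) for v in coco_box]
    return coco_box


coco_box = yolo_box_to_coco_box((0.5, 0.5, 0.25, 0.5), image_width=800, image_height=600)
print(coco_box)  # [300.0, 150.0, 200.0, 300.0]
```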

yolo_to_coco - CLI interface

Convert a YOLO-formatted dataset to COCO format

yolo_to_coco [-h] [--label_folder LABEL_FOLDER]
             [--empty_image_handling {no_annotations,empty_annotations,skip,error}]
             [--empty_image_category_name EMPTY_IMAGE_CATEGORY_NAME]
             [--error_image_handling {skip,no_annotations}]
             [--allow_images_without_label_files {true,false}] [--n_workers N_WORKERS]
             [--pool_type {thread,process}] [--recursive {true,false}]
             [--exclude_string EXCLUDE_STRING] [--include_string INCLUDE_STRING]
             [--overwrite_handling {load,overwrite,error}]
             input_folder class_name_file output_file

yolo_to_coco positional arguments

input_folder - Path to the YOLO dataset folder (image folder)
class_name_file - Path to the file containing class names (e.g., classes.txt or dataset.yaml)
output_file - Path to the output COCO .json file.

yolo_to_coco options

-h, --help - show this help message and exit
--label_folder LABEL_FOLDER - Label folder, if different from the image folder. Default: None (labels are in the image folder)
--empty_image_handling EMPTY_IMAGE_HANDLING - How to handle images with no bounding boxes.
--empty_image_category_name EMPTY_IMAGE_CATEGORY_NAME - Category name for empty images if empty_image_handling is "empty_annotations"
--error_image_handling ERROR_IMAGE_HANDLING - How to handle images that fail to load
--allow_images_without_label_files ALLOW_IMAGES_WITHOUT_LABEL_FILES - Whether to allow images that do not have corresponding label files (true/false)
--n_workers N_WORKERS - Number of workers for parallel processing. <=1 for sequential
--pool_type POOL_TYPE - Type of multiprocessing pool if n_workers > 1
--recursive RECURSIVE - Whether to search for images recursively in the input folder (true/false)
--exclude_string EXCLUDE_STRING - Exclude images whose filename contains this string
--include_string INCLUDE_STRING - Include images only if filename contains this string
--overwrite_handling OVERWRITE_HANDLING - Behavior if output_file exists.

data_management.labelme_to_yolo module

labelme_to_yolo.py

Create YOLO .txt files in a folder containing labelme .json files.

megadetector.data_management.labelme_to_yolo.labelme_file_to_yolo_file(labelme_file, category_name_to_category_id, yolo_file=None, required_token=None, overwrite_behavior='overwrite')[source]

Convert the single .json file labelme_file to yolo format, writing the results to the text file yolo_file (defaults to s/json/txt).

If required_token is not None and the dict in labelme_file does not contain the key [required_token], this function no-ops (i.e., does not generate a YOLO file).

overwrite_behavior should be ‘skip’ or ‘overwrite’ (default).

Parameters:

labelme_file (str) – .json file to convert
category_name_to_category_id (dict) – category name –> ID mapping
yolo_file (str, optional) – output .txt file; defaults to s/json/txt
required_token (str, optional) – only process files whose labelme dict contains this key
overwrite_behavior (str, optional) – “skip” or “overwrite”

megadetector.data_management.labelme_to_yolo.labelme_folder_to_yolo(labelme_folder, category_name_to_category_id=None, required_token=None, overwrite_behavior='overwrite', relative_filenames_to_convert=None, n_workers=1, use_threads=True)[source]

Given a folder with images and labelme .json files, convert the .json files to YOLO .txt format. If category_name_to_category_id is None, first reads all the labels in the folder to build a zero-indexed name –> ID mapping.

If required_token is not None and a labelme_file does not contain the key [required_token], it won’t be converted. Typically used to specify a field that indicates which files have been reviewed.

If relative_filenames_to_convert is not None, this should be a list of .json (not image) files that should get converted, relative to the base folder.

overwrite_behavior should be ‘skip’ or ‘overwrite’ (default).

Returns a dict with:

‘category_name_to_category_id’: the category mapping, whether it was passed in or constructed
‘image_results’: a list of results for each image (converted, skipped, error)

Parameters:

labelme_folder (str) – folder of .json files to convert
category_name_to_category_id (dict, optional) – category name –> ID mapping; if None, built from the labels found in the folder
required_token (str, optional) – only process files whose labelme dict contains this key
overwrite_behavior (str, optional) – “skip” or “overwrite”
relative_filenames_to_convert (list of str, optional) – only process filenames on this list
n_workers (int, optional) – parallelism level
use_threads (bool, optional) – whether to use threads (True) or processes (False) for parallelism
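
The per-file conversion can be illustrated with a small sketch that turns labelme rectangle shapes into YOLO label lines. It is not the module's implementation. The helper labelme_dict_to_yolo_lines is hypothetical, and the sketch assumes the usual labelme fields ("imageWidth", "imageHeight", and "shapes" entries with "label", "points", and "shape_type"), rectangles stored as two corner points, and six decimal places in the output.

```python
def labelme_dict_to_yolo_lines(labelme_data, category_name_to_category_id):
    """Hypothetical helper: convert labelme rectangles to YOLO label lines."""
    image_width = labelme_data["imageWidth"]
    image_height = labelme_data["imageHeight"]
    lines = []
    for shape in labelme_data["shapes"]:
        # Assumption: only rectangle shapes are converted
        if shape.get("shape_type", "rectangle") != "rectangle":
            continue
        (x0, y0), (x1, y1) = shape["points"]
        x_min, x_max = min(x0, x1), max(x0, x1)
        y_min, y_max = min(y0, y1), max(y0, y1)

        # Normalize center and size by the image dimensions
        x_center = (x_min + x_max) / 2 / image_width
        y_center = (y_min + y_max) / 2 / image_height
        box_width = (x_max - x_min) / image_width
        box_height = (y_max - y_min) / image_height

        category_id = category_name_to_category_id[shape["label"]]
        lines.append(
            f"{category_id} {x_center:.6f} {y_center:.6f} {box_width:.6f} {box_height:.6f}"
        )
    return lines


sample = {
    "imageWidth": 1000,
    "imageHeight": 500,
    "shapes": [
        {"label": "animal", "shape_type": "rectangle", "points": [[100, 50], [300, 250]]}
    ],
}
yolo_lines = labelme_dict_to_yolo_lines(sample, {"animal": 0})
print(yolo_lines)  # ['0 0.200000 0.300000 0.200000 0.400000']
```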

labelme_to_yolo - CLI interface

Convert a folder of Labelme .json files to YOLO .txt format

labelme_to_yolo [-h] [--output_category_file OUTPUT_CATEGORY_FILE]
                [--required_token REQUIRED_TOKEN] [--overwrite_behavior {skip,overwrite}]
                [--n_workers N_WORKERS] [--use_processes]
                labelme_folder

labelme_to_yolo positional arguments

labelme_folder - Folder of Labelme .json files to convert

labelme_to_yolo options

-h, --help - show this help message and exit
--output_category_file OUTPUT_CATEGORY_FILE - Path to save the generated category mapping (.json)
--required_token REQUIRED_TOKEN - Only process files containing this token as a key in the Labelme JSON dict
--overwrite_behavior OVERWRITE_BEHAVIOR - Behavior if YOLO .txt files exist (default: 'overwrite')
--n_workers N_WORKERS - Number of workers for parallel processing (default: 1)
--use_processes - Use processes instead of threads for parallelization (defaults to threads)

data_management.resize_coco_dataset module

resize_coco_dataset.py

Given a COCO-formatted dataset, resizes all the images to a target size, scaling bounding boxes accordingly.

class megadetector.data_management.resize_coco_dataset.TestResizeCocoDataset[source]

Bases: object

Test class for the resize_coco_dataset function.

set_up()[source]

tear_down()[source]

test_resize_sequential_vs_parallel()[source]: Test driver for sequential vs. parallel COCO dataset resizing.

megadetector.data_management.resize_coco_dataset.resize_coco_dataset(input_folder, input_filename, output_folder, output_filename=None, target_size=(-1, -1), correct_size_image_handling='copy', unavailable_image_handling='error', n_workers=1, pool_type='thread', no_enlarge_width=True, verbose=False)[source]

Given a COCO-formatted dataset (images in input_folder, data in input_filename), resizes all the images to a target size (in output_folder) and scales bounding boxes accordingly.

Parameters:

input_folder (str) – the folder where images live; filenames in [input_filename] should be relative to [input_folder]
input_filename (str) – the (input) COCO-formatted .json file containing annotations
output_folder (str) – the folder to which we should write resized images; can be the same as [input_folder], in which case images are over-written
output_filename (str, optional) – the COCO-formatted .json file we should generate that refers to the resized images
target_size (list or tuple of ints, optional) – this should be a tuple/list of ints, with length 2 (w,h). If either dimension is -1, aspect ratio will be preserved. If both dimensions are -1, this means “keep the original size”. If both dimensions are -1 and correct_size_image_handling is ‘copy’, this function is basically a no-op.
correct_size_image_handling (str, optional) – what to do in the case where the original size already matches the target size. Can be ‘copy’ (in which case the original image is just copied to the output folder) or ‘rewrite’ (in which case the image is opened via PIL and re-written, attempting to preserve the same quality). The only reason to use ‘rewrite’ is the case where you’re superstitious about biases coming from images in a training set being written by different image encoders.
unavailable_image_handling (str, optional) – what to do when a file can’t be opened. Can be ‘error’ or ‘omit’.
n_workers (int, optional) – number of workers to use for parallel processing. Defaults to 1 (no parallelization). If <= 1, processing is sequential.
pool_type (str, optional) – type of multiprocessing pool to use (‘thread’ or ‘process’). Defaults to ‘thread’. Only used if n_workers > 1.
no_enlarge_width (bool, optional) – if True, and the target width is larger than the original image width, does not modify the image, but still writes it
verbose (bool, optional) – enable additional debug output

Returns:

the COCO database with resized images, identical to the content of [output_filename]

Return type:

dict
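
The target_size rules described above (-1 meaning "preserve aspect ratio", or "keep the original size" when both values are -1) and the matching box scaling can be sketched as below. Both helpers are hypothetical and only illustrate the arithmetic. Rounding the computed dimension to the nearest integer and using absolute-pixel [x, y, w, h] boxes are assumptions, not confirmed behavior.

```python
def compute_output_size(original_size, target_size):
    """Hypothetical helper: resolve -1 entries in (w, h) against the original size."""
    width, height = original_size
    target_width, target_height = target_size
    if target_width == -1 and target_height == -1:
        return (width, height)  # keep the original size
    if target_width == -1:
        scale = target_height / height
        return (round(width * scale), target_height)
    if target_height == -1:
        scale = target_width / width
        return (target_width, round(height * scale))
    return (target_width, target_height)


def scale_box(box, original_size, output_size):
    """Hypothetical helper: scale an absolute [x, y, w, h] box to the new image size."""
    scale_x = output_size[0] / original_size[0]
    scale_y = output_size[1] / original_size[1]
    x, y, w, h = box
    return [x * scale_x, y * scale_y, w * scale_x, h * scale_y]


output_size = compute_output_size((1600, 1200), (800, -1))
scaled_box = scale_box([400, 300, 200, 100], (1600, 1200), output_size)
print(output_size)  # (800, 600)
print(scaled_box)   # [200.0, 150.0, 100.0, 50.0]
```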

megadetector.data_management.resize_coco_dataset.test_resize_coco_dataset_main()[source]: Driver for the TestResizeCocoDataset() class.

resize_coco_dataset - CLI interface

Resize images in a COCO dataset and scale annotations

resize_coco_dataset [-h] [--target_size TARGET_SIZE]
                    [--correct_size_image_handling {copy,rewrite}] [--n_workers N_WORKERS]
                    [--pool_type {thread,process}]
                    input_folder input_filename output_folder output_filename

resize_coco_dataset positional arguments

input_folder - Path to the folder containing original images
input_filename - Path to the input COCO .json file
output_folder - Path to the folder where resized images will be saved
output_filename - Path to the output COCO .json file for resized data

resize_coco_dataset options

-h, --help - show this help message and exit
--target_size TARGET_SIZE - Target size as "width,height". Use -1 to preserve aspect ratio for a dimension. E.g., "800,600" or "1024,-1".
--correct_size_image_handling CORRECT_SIZE_IMAGE_HANDLING - How to handle images already at target size
--n_workers N_WORKERS - Number of workers for parallel processing. <=1 for sequential
--pool_type POOL_TYPE - Type of multiprocessing pool if n_workers > 1

data_management.threshold_coco_dataset module

threshold_coco_dataset.py

Given a COCO-formatted dataset that stores confidence in the semi-standard “score” field, remove annotations below a threshold.

megadetector.data_management.threshold_coco_dataset.threshold_coco_dataset(input_filename, confidence_threshold=0.0, output_filename=None, confidence_field='score', missing_confidence_handling='error')[source]

Given a COCO-formatted dataset that stores confidence in the semi-standard “score” field, remove annotations below a threshold.

Parameters:

input_filename (str) – the (input) COCO-formatted .json file containing annotations
confidence_threshold (float, optional) – discard annotations below this confidence value
output_filename (str, optional) – write the thresholded output here
confidence_field (str, optional) – the field within annotations that represents confidence values
missing_confidence_handling (str, optional) – what to do if a confidence value is missing (should be ‘error’ or ‘warning’)

Returns:

the thresholded COCO database

Return type:

dict
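
The thresholding step itself is a simple filter over the annotation list, as in the sketch below. This is not the library function: threshold_annotations_sketch is a hypothetical name, and keeping annotations that lack a confidence value in ‘warning’ mode is an assumption made only for this illustration.

```python
def threshold_annotations_sketch(coco_data, confidence_threshold=0.0,
                                 confidence_field="score",
                                 missing_confidence_handling="error"):
    """Hypothetical helper: drop annotations whose confidence is below a threshold."""
    kept = []
    for ann in coco_data["annotations"]:
        if confidence_field not in ann:
            if missing_confidence_handling == "error":
                raise ValueError(f"annotation {ann.get('id')} has no '{confidence_field}' field")
            # Assumption: in 'warning' mode, report the annotation and keep it
            print(f"warning: annotation {ann.get('id')} has no '{confidence_field}' field")
            kept.append(ann)
            continue
        if ann[confidence_field] >= confidence_threshold:
            kept.append(ann)

    # Return a shallow copy so the input dict is not modified
    output = dict(coco_data)
    output["annotations"] = kept
    return output


data = {
    "images": [{"id": "im0", "file_name": "im0.jpg"}],
    "annotations": [
        {"id": "a0", "image_id": "im0", "score": 0.9},
        {"id": "a1", "image_id": "im0", "score": 0.2},
    ],
}
thresholded = threshold_annotations_sketch(data, confidence_threshold=0.5)
print([ann["id"] for ann in thresholded["annotations"]])  # ['a0']
```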

threshold_coco_dataset - CLI interface

Threshold a COCO dataset, write the results to a new file

threshold_coco_dataset [-h] [--confidence_field CONFIDENCE_FIELD]
                       [--missing_confidence_handling {error,warning}]
                       input_filename output_filename confidence_threshold

threshold_coco_dataset positional arguments

input_filename - Path to the input COCO .json file
output_filename - Path to the .json file where thresholded data will be saved
confidence_threshold - Confidence threshold

threshold_coco_dataset options

-h, --help - show this help message and exit
--confidence_field CONFIDENCE_FIELD - Field to use for confidence values, default "score"
--missing_confidence_handling MISSING_CONFIDENCE_HANDLING - Whether to error on annotations that are missing the confidence field

data_management.wi_download_csv_to_coco module

wi_download_csv_to_coco.py

Converts a .csv file (or a folder of .csv files) from a Wildlife Insights project export to a COCO camera traps .json file.

Currently assumes that common names are unique identifiers, which is convenient but unreliable.

megadetector.data_management.wi_download_csv_to_coco.wi_download_csv_to_coco(csv_file_in, coco_file_out=None, image_folder=None, exclude_missing_images=False, image_flattening='deployment', verbose=True, category_remappings={'.*human.*': 'human', '.*vehicle.*': 'vehicle', 'atv': 'vehicle', 'homo species': 'human', 'misfire': 'blank', 'no cv result': 'unknown', 'truck': 'vehicle'}, blank_disagreement_handling='trust_label', include_blanks=True)[source]

Converts a .csv file (or folder of .csv files) from a Wildlife Insights project export to a COCO Camera Traps .json file.

TODO: currently relies on uniqueness of common names, which is not guaranteed. Prints warnings for non-unique common names.

Parameters:

csv_file_in (str) – a downloaded .csv file we should convert to COCO, or a folder containing images…csv files.
coco_file_out (str, optional) – the .json file we should write; if [coco_file_out] is None, returns data, but doesn’t write it
image_folder (str, optional) – the folder where images live, only relevant if [exclude_missing_images] is True
exclude_missing_images (bool, optional) – whether to exclude images not present on disk; if this is True, [image_folder] must be a valid folder. This has no impact on blank images if “include_blanks” is False.
image_flattening (str, optional) – if ‘none’, relative paths will be stored as the entire URL for each image, other than gs://. Can be ‘guid’ (just store [GUID].JPG), ‘deployment’ (store as [deployment]/[GUID].JPG), or ‘project’ (store as [project]/[deployment]/[GUID].JPG).
verbose (bool, optional) – enable additional debug console output
category_remappings (dict, optional) – str –> str dict that maps WI category names to output category names. Regular expressions allowed in keys.
blank_disagreement_handling (str, optional) – what to do when the “common_name” field disagrees with the “is_blank” field; can be “trust_label” (default), “trust_is_blank”, or “error”
include_blanks (bool, optional) – whether to include blank images in the COCO file

Returns:

COCO-formatted data, identical to what’s written to [coco_file_out]

Return type:

dict
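
Because category_remappings allows regular expressions in its keys, a remapping pass can be sketched as below, using the default mapping from the signature above. The helper remap_category is hypothetical. Lowercasing the input, requiring a full-string match, and taking the first matching key are assumptions; the library may match differently.

```python
import re

# Default mapping, copied from the function signature above
DEFAULT_REMAPPINGS = {
    ".*human.*": "human",
    ".*vehicle.*": "vehicle",
    "atv": "vehicle",
    "homo species": "human",
    "misfire": "blank",
    "no cv result": "unknown",
    "truck": "vehicle",
}


def remap_category(name, category_remappings=DEFAULT_REMAPPINGS):
    """Hypothetical helper: map a WI category name to an output category name."""
    # Assumption: names are compared in lowercase, with surrounding whitespace removed
    name = name.lower().strip()
    for pattern, output_name in category_remappings.items():
        # Assumption: the whole name must match the pattern
        if re.fullmatch(pattern, name):
            return output_name
    return name  # no remapping applies


for wi_name in ["Human-Camera Trapper", "truck", "white-tailed deer"]:
    print(wi_name, "->", remap_category(wi_name))
```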