API Reference
This section provides a detailed reference to the classes and functions in the vision-explanation-methods package.
vision_explanation_methods
Module for creating explanations for vision models.
vision_explanation_methods.explanations
Module for image explanation methods.
vision_explanation_methods.explanations.drise
Implementation of DRISE.
A black box explainability method for object detection.
- vision_explanation_methods.explanations.drise.DRISE_saliency(model: vision_explanation_methods.explanations.common.GeneralObjectDetectionModelWrapper, image_tensor: torch.Tensor, target_detections: List[vision_explanation_methods.explanations.common.DetectionRecord], number_of_masks: int, mask_res: Tuple[int, int] = (16, 16), mask_padding: Optional[int] = None, device: str = 'cpu', verbose: bool = False) List[torch.Tensor] [source]
Compute DRISE saliency map.
- Parameters
model (OcclusionModelWrapper) – Object detection model wrapped for occlusion
target_detections (List of Detection Records) – Baseline detections to get saliency maps for
number_of_masks (int) – Number of masks to use for saliency
mask_res (Tuple of ints) – Resolution of mask before scale up
mask_padding (Optional int) – How much to pad the mask before cropping
device (str) – Device to use to run the function
- Returns
A list of tensors, one tensor for each image. Each tensor is of shape [D, 3, W, H], and [i, 3, W, H] is the saliency map associated with detection i.
- Return type
List[torch.Tensor]
- vision_explanation_methods.explanations.drise.DRISE_saliency_for_mlflow(model, image_tensor: pandas.core.frame.DataFrame, target_detections: List[vision_explanation_methods.explanations.common.DetectionRecord], number_of_masks: int, mask_res: Tuple[int, int] = (16, 16), mask_padding: Optional[int] = None, device: str = 'cpu', verbose: bool = False) List[torch.Tensor] [source]
Compute DRISE saliency map.
- Parameters
model (OcclusionModelWrapper) – Object detection model wrapped for occlusion
target_detections (List of Detection Records) – Baseline detections to get saliency maps for
number_of_masks (int) – Number of masks to use for saliency
mask_res (Tuple of ints) – Resolution of mask before scale up
mask_padding (Optional int) – How much to pad the mask before cropping
device (str) – Device to use to run the function
- Returns
A list of tensors, one tensor for each image. Each tensor is of shape [D, 3, W, H], and [i, 3, W, H] is the saliency map associated with detection i.
- Return type
List[torch.Tensor]
- class vision_explanation_methods.explanations.drise.MaskAffinityRecord(mask: torch.Tensor, affinity_scores: List[torch.Tensor])[source]
Bases: object
Class for keeping track of masks and associated affinity score.
- Parameters
mask (torch.Tensor) – 3xHxW mask
affinity_scores (List of Tensors) – Scores for each detection in each image associated with mask.
- get_weighted_masks() List[torch.Tensor] [source]
Return the masks weighted by the affinity scores.
- Returns
Masks weighted by affinity scores: N tensors of shape Dx3xHxW, where N is the number of images in the batch and D is the number of detections in an image (D varies from image to image).
- Return type
List of Tensors
- vision_explanation_methods.explanations.drise.compute_affinity_scores(base_detections: vision_explanation_methods.explanations.common.DetectionRecord, masked_detections: vision_explanation_methods.explanations.common.DetectionRecord) torch.Tensor [source]
Compute highest affinity score between two sets of detections.
- Parameters
base_detections (Detection Record) – Set of detections to get affinity scores for
masked_detections (Detection Record) – Set of detections to score against
- Returns
Set of affinity scores, one for each base detection
- Return type
Tensor of shape D, where D is number of base detections
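To make "highest affinity score" more concrete, the plain-Python sketch below follows the similarity idea described in the D-RISE paper: box IoU multiplied by the cosine similarity of the class-score vectors, keeping the best match for each base detection. The helper names `iou`, `cosine`, and `highest_affinity`, and the dictionary layout of a detection, are hypothetical and not part of this package. The package's own implementation may differ, for example in how objectness scores are handled.

```python
import math
from typing import Dict, List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two class-score vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def highest_affinity(base: List[Dict], masked: List[Dict]) -> List[float]:
    """For each base detection, keep the best score over all masked detections."""
    return [
        max(
            (iou(b["box"], m["box"]) * cosine(b["scores"], m["scores"]) for m in masked),
            default=0.0,  # no masked detections at all: affinity is zero
        )
        for b in base
    ]
```

The result has one entry per base detection, which matches the "shape D" return type described above.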
- vision_explanation_methods.explanations.drise.convert_base64_to_tensor(b64_img: str, device: str) torch.Tensor [source]
Convert base64 image to tensor.
- vision_explanation_methods.explanations.drise.convert_tensor_to_base64(img_tens: torch.Tensor) Tuple[str, Tuple[int, int]] [source]
Convert image tensor to base64 string.
- Parameters
img_tens (Tensor) – Image tensor
- Returns
Base64 encoded image
- Return type
Tuple[str, Tuple[int, int]]
- vision_explanation_methods.explanations.drise.fuse_mask(img_tensor: torch.Tensor, mask: torch.Tensor) torch.Tensor [source]
Mask an image tensor.
- Parameters
img_tensor (Tensor) – Image to be masked
mask (Tensor) – Mask for image
- Returns
Masked image
- Return type
Tensor
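A common way to apply an occlusion mask is elementwise multiplication, so that mask values near 0 hide a pixel and values near 1 keep it. The sketch below shows that idea on nested lists for a single-channel image. The function name `fuse_mask_sketch` is hypothetical, and elementwise multiplication is an assumption about how `fuse_mask` combines its inputs, not something this reference states.

```python
from typing import List

Grid = List[List[float]]  # H x W values for a single channel


def fuse_mask_sketch(img: Grid, mask: Grid) -> Grid:
    """Multiply each pixel by the mask value at the same position."""
    if len(img) != len(mask) or any(len(r) != len(m) for r, m in zip(img, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [pixel * weight for pixel, weight in zip(img_row, mask_row)]
        for img_row, mask_row in zip(img, mask)
    ]
```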
- vision_explanation_methods.explanations.drise.generate_mask(base_size: Tuple[int, int], img_size: Tuple[int, int], padding: int, device: str) torch.Tensor [source]
Create a random mask for image occlusion.
- Parameters
- Returns
Occlusion mask for image, same shape as image
- Return type
Tensor
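The parameter names suggest a three-step recipe: draw a coarse random grid of size `base_size`, scale it up past the image size by `padding`, then crop a window of `img_size`. The sketch below illustrates that recipe in plain Python under several assumptions: sizes are ordered (height, width), the grid is binary, and nearest-neighbour upscaling is swapped in for whatever interpolation the package actually uses. The name `generate_mask_sketch` and the `keep_prob` and `rng` arguments are hypothetical.

```python
import random
from typing import List, Optional, Tuple


def generate_mask_sketch(
    base_size: Tuple[int, int],
    img_size: Tuple[int, int],
    padding: int,
    keep_prob: float = 0.5,               # hypothetical: chance a coarse cell is kept
    rng: Optional[random.Random] = None,  # hypothetical: injectable for reproducibility
) -> List[List[float]]:
    rng = rng or random.Random()
    base_h, base_w = base_size
    img_h, img_w = img_size

    # 1. Coarse random binary grid.
    grid = [
        [1.0 if rng.random() < keep_prob else 0.0 for _ in range(base_w)]
        for _ in range(base_h)
    ]

    # 2. Nearest-neighbour upscale to the padded size.
    big_h, big_w = img_h + padding, img_w + padding
    upscaled = [
        [grid[y * base_h // big_h][x * base_w // big_w] for x in range(big_w)]
        for y in range(big_h)
    ]

    # 3. Random crop back to the image size, so the grid is shifted per mask.
    off_y = rng.randint(0, padding)
    off_x = rng.randint(0, padding)
    return [row[off_x:off_x + img_w] for row in upscaled[off_y:off_y + img_h]]
```

The random crop offset means the coarse cell boundaries fall in different places for different masks, which avoids always occluding along the same grid lines.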
- vision_explanation_methods.explanations.drise.saliency_fusion(affinity_records: List[vision_explanation_methods.explanations.drise.MaskAffinityRecord], device: str, normalize: Optional[bool] = True, verbose: bool = False) torch.Tensor [source]
Create a fused mask based on the affinity scores of the different masks.
- Parameters
affinity_records (List of affinity records) – List of affinity records computed for mask
device (String) – Torch string describing device, e.g. ‘cpu’ or ‘cuda:0’
normalize (Optional bool) – Whether to normalize the image by subtracting the average affinity score; defaults to True
- Returns
List of saliency maps - one list of maps for each image in batch, and one map per detection in each image
- Return type
List of Tensors - one tensor for each image, and each tensor of shape Dx3xHxW, where D is the number of detections in that image.
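For a single detection, the fusion step can be pictured as averaging the masks after weighting each by its affinity score, then optionally shifting the result by the mean score. The sketch below is one possible reading of the description above. The name `fuse_saliency_sketch` is hypothetical, and both the averaging and the exact form of the normalization are assumptions, not the package's confirmed behavior.

```python
from typing import List

Grid = List[List[float]]  # H x W values for a single channel


def fuse_saliency_sketch(masks: List[Grid], scores: List[float], normalize: bool = True) -> Grid:
    """Average the affinity-weighted masks for one detection."""
    if not masks or len(masks) != len(scores):
        raise ValueError("need one affinity score per mask")
    n = len(masks)
    height, width = len(masks[0]), len(masks[0][0])

    # Pixels that stay visible in high-affinity masks accumulate high values.
    fused = [
        [sum(s * m[y][x] for m, s in zip(masks, scores)) / n for x in range(width)]
        for y in range(height)
    ]

    if normalize:
        # Assumed normalization: subtract the average affinity score from every pixel.
        mean_score = sum(scores) / n
        fused = [[value - mean_score for value in row] for row in fused]
    return fused
```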
vision_explanation_methods.evaluation
Module for evaluation.
vision_explanation_methods.evaluation.pointing_game
Defines a variety of explanation evaluation tools.
- class vision_explanation_methods.evaluation.pointing_game.PointingGame(model: Any, device='auto')[source]
Bases: object
A class for the high energy pointing game.
- calculate_gt_salient_pixel_overlap(saliency_scores: List[torch.Tensor], gt_bbox: List)[source]
Calculate percent of overlap between salient pixels and gt bbox.
- Formula: number of salient pixels in the gt bbox / number of pixels in the gt bbox
- Parameters
saliency_scores (List[Tensor]) – 2D matrix representing the saliency scores of each pixel in an image
gt_bbox (List) – bounding box for ground truth prediction
- Returns
Percent of salient pixel overlap with the ground truth bounding box
- Return type
Float
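The overlap formula can be sketched in a few lines of plain Python. This version assumes the bounding box is given as `(x1, y1, x2, y2)` in pixel coordinates with exclusive upper bounds, that a pixel counts as salient when its score is above 0 (non-salient pixels being marked -1), and that the result is expressed on a 0 to 100 scale. The name `salient_overlap_percent` and the `salient_above` argument are hypothetical.

```python
from typing import List, Sequence


def salient_overlap_percent(
    saliency: List[List[float]],
    gt_bbox: Sequence[float],
    salient_above: float = 0.0,  # hypothetical cut-off separating salient from -1 pixels
) -> float:
    """Salient pixels inside the box divided by all pixels inside the box, as a percent."""
    x1, y1, x2, y2 = (int(v) for v in gt_bbox)
    total = 0
    salient = 0
    for row in saliency[y1:y2]:
        for value in row[x1:x2]:
            total += 1
            if value > salient_above:
                salient += 1
    return 100.0 * salient / total if total else 0.0
```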
- pointing_game(imagelocation: str, index: int, threshold: float = 0.8, num_masks: int = 100)[source]
Calculate the saliency scores for a given object detection prediction.
The calculated value is a matrix of saliency scores. Values below the threshold are set to -1. The goal here is to filter out insignificant saliency scores, and identify highly salient pixels. That is why it is called a pointing game - we want to “point”, i.e. identify, all highly salient pixels. That way we can easily determine if these highly salient pixels overlap with the gt bounding box.
- Parameters
imagelocation (str) – Path of the image location
index (int) – Index of the desired object within the given image to evaluate
threshold (float) – Threshold between 0 and 1 that determines whether a pixel is salient. Saliency scores below the threshold are set to -1
num_masks (int) – Number of masks to run D-RISE with
- Returns
2D matrix of highly salient pixels
- Return type
List[Tensor]
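The thresholding step described above can be illustrated on a small nested list. The sketch assumes the saliency scores have already been scaled to the range 0 to 1 and that a score equal to the threshold is kept. The name `keep_highly_salient` is hypothetical.

```python
from typing import List


def keep_highly_salient(scores: List[List[float]], threshold: float = 0.8) -> List[List[float]]:
    """Replace every score below the threshold with -1, leaving the rest untouched."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [[value if value >= threshold else -1.0 for value in row] for row in scores]
```

Using -1 as the marker keeps the output the same shape as the input, so it can still be overlaid on the image or compared against a bounding box.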
- visualize_highly_salient_pixels(img, saliency_scores, gt_bbox: Optional[List] = None)[source]
Create figure of highly salient pixels.
- Parameters
img (PIL.Image) – PIL test image
saliency_scores (List[Tensor]) – 2D matrix representing the saliency scores of each pixel in an image
gt_bbox (List) – Bounding box for ground truth prediction. If None, no ground truth bounding box is drawn
- Returns
Overlay of the saliency scores on top of the image
- Return type
Figure
vision_explanation_methods.error_labeling
Module for error labeling.
vision_explanation_methods.error_labeling.error_labeling
Defines the Error Labeling Manager class.
- class vision_explanation_methods.error_labeling.error_labeling.ErrorLabelType(value)[source]
Bases: object
Enum providing types of error labels.
If none, then the detection is not an error. It is a correct prediction.
- BACKGROUND = 'background'
- CLASS_LOCALIZATION = 'class_localization'
- CLASS_NAME = 'class_name'
- DUPLICATE_DETECTION = 'duplicate_detection'
- LOCALIZATION = 'localization'
- MATCH = 'match'
- MISSING = 'missing'
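Because the class derives from `enum.Enum`, the `ErrorLabelType(value)` signature is the standard lookup-by-value call. The sketch below rebuilds the members listed above in a stand-in class, hypothetically named `ErrorLabelTypeSketch`, to show how such a lookup behaves. It is not an import of the package's own class.

```python
from enum import Enum


class ErrorLabelTypeSketch(Enum):
    """Stand-in with the same member names and values as listed above."""

    BACKGROUND = 'background'
    CLASS_LOCALIZATION = 'class_localization'
    CLASS_NAME = 'class_name'
    DUPLICATE_DETECTION = 'duplicate_detection'
    LOCALIZATION = 'localization'
    MATCH = 'match'
    MISSING = 'missing'


# Calling the enum with a value returns the matching member.
label = ErrorLabelTypeSketch('duplicate_detection')

# Iterating yields the members in definition order.
all_values = [member.value for member in ErrorLabelTypeSketch]
```

An unknown value, such as `ErrorLabelTypeSketch('unknown')`, raises `ValueError`.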
- class vision_explanation_methods.error_labeling.error_labeling.ErrorLabeling(task_type: str, pred_y: list, true_y: list, iou_threshold: float = 0.5)[source]
Bases: object
Defines a wrapper class of Error Labeling for vision scenario.
Only supported for object detection at this point.
vision_explanation_methods.DRISE_runner
Method for generating saliency maps for object detection models.
- vision_explanation_methods.DRISE_runner.get_drise_saliency_map(imagelocation: str, model: Optional[object], numclasses: int, savename: str, nummasks: int = 25, maskres: Tuple[int, int] = (4, 4), maskpadding: Optional[int] = None, devicechoice: Optional[str] = None, max_figures: Optional[int] = None)[source]
Run D-RISE on image and visualize the saliency maps.
- Parameters
imagelocation (str) – Path of the image location
model (PyTorch model) – Input model for D-RISE. If None, Faster R-CNN model will be used.
numclasses (int) – Number of classes the model predicts
savename (str) – Path of the saved output figure
nummasks (int) – Number of masks to use for saliency
maskres (Tuple of ints) – Resolution of mask before scale up
maskpadding (Optional int) – How much to pad the mask before cropping
max_figures (Optional int) – Maximum number of figures to generate, for use when memory is limited
- Returns
Tuple of Matplotlib figure list, path to where the output figure is saved, list of labels
- Return type
- vision_explanation_methods.DRISE_runner.get_instance_segmentation_model(num_classes: int)[source]
Load in pre-trained Faster R-CNN model with resnet50 backbone.
- Parameters
num_classes (int) – Number of classes the model predicts
- Returns
Faster R-CNN PyTorch model
- Return type
PyTorch model
- vision_explanation_methods.DRISE_runner.plot_img_bbox(ax: matplotlib.axes._subplots.AxesSubplot, box: numpy.ndarray, label: str, color: str)[source]
Plot predicted bounding box and label on the D-RISE saliency map.
- Parameters
ax (Matplotlib AxesSubplot) – Axis on which the d-rise saliency map was plotted
box (numpy.ndarray) – Bounding box the model predicted
label (str) – Label the model predicted
color (single letter color string) – Color of the bounding box based on predicted label
- Returns
Axis with the predicted bounding box and label plotted on top of d-rise saliency map
- Return type
Matplotlib AxesSubplot
vision_explanation_methods.version
Metadata including name and version of package.