From 9eaa367fc67294f449050c0ca8ef3e4bcacf23d0 Mon Sep 17 00:00:00 2001
From: j-t-1 <120829237+j-t-1@users.noreply.github.com>
Date: Sun, 2 Mar 2025 10:58:33 +0000
Subject: [PATCH] Tiny change of ResNet TorchVision ablation tutorial

---
 tutorials/Resnet_TorchVision_Ablation.ipynb | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tutorials/Resnet_TorchVision_Ablation.ipynb b/tutorials/Resnet_TorchVision_Ablation.ipynb
index d231553367..238a64f295 100644
--- a/tutorials/Resnet_TorchVision_Ablation.ipynb
+++ b/tutorials/Resnet_TorchVision_Ablation.ipynb
@@ -13,13 +13,13 @@
    "source": [
     "This notebook tutorial demonstrates how feature ablation in Captum can be applied to inspect computer vision models. \n",
     "\n",
-    "**Task:** Classification into ImageNet-1k categories\n",
+    "**Task:** Classification into ImageNet-1k categories.\n",
     "\n",
-    "**Model:** A ResNet18 trained on ImageNet-1k\n",
+    "**Model:** A ResNet18 trained on ImageNet-1k.\n",
     "\n",
-    "**Data to inspect:** Samples from PASCAL VOC 2012\n",
+    "**Data to inspect:** Samples from PASCAL VOC (Visual Object Classes) 2012.\n",
     "\n",
-    "**Ablation based on:** Segmentation masks\n",
+    "**Ablation based on:** Segmentation masks.\n",
     "\n",
     "We will use the visualization functions in Captum to show how each semantic part impacts the model output.\n",
     " \n",
@@ -81,9 +81,9 @@
    "source": [
     "A straightforward way to demonstrate feature ablation on images is to ablate semantic image areas.\n",
     "\n",
-    "Therefore, we will load sample images from PASCAL VOC, as these images come along with annotated segmentation masks.\n",
+    "Therefore, we will load sample images from VOC 2012, as these images come along with annotated segmentation masks.\n",
     "\n",
-    "**Note**: The VOC dataset is 2GB. If you do not want to download it, you can skip the next step and provide your own image and segmentation mask in the step next."
+    "**Note**: The VOC 2012 dataset is 2GB. If you do not want to download it, you can skip the next step and provide your own image and segmentation mask in the step after it."
    ]
   },
   {
@@ -150,7 +150,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "According to the segmentation mask, the image contains three bottles, and two TV monitors, with the rest considered background. All of `background`, `bottle`, and `tvmonitor` are among the 20 categories in PASCAL VOC 2012. This dataset also features a `void` category, used to annotate pixels that are not considered part of any class. These pixels represent border between the objects in the above example."
+    "According to the segmentation mask, the image contains three bottles and two TV monitors, with the rest considered background. All of `background`, `bottle`, and `tvmonitor` are among the 20 categories in VOC 2012. This dataset also features a `void` category, used to annotate pixels that are not considered part of any class. These pixels represent the border between the objects in the above example."
    ]
   },
   {
@@ -159,7 +159,7 @@
    "source": [
     "Let us also load ImageNet class labels to understand the output when we classify the samples using a classifier trained on ImageNet-1k.\n",
     "\n",
-    "**Note**: wget should be available as a command in your environment. You might need to install it. You can skip the next two steps if you are OK with class index as classification output (in that case, use `classify` in the next sections with `print_result`=`False`). "
+    "**Note**: wget should be available as a command in your environment, although you might need to install it. You can skip the next two steps if you are OK with the class index as classification output (in that case, use `classify` in the next sections with `print_result`=`False`). "
    ]
   },
   {
@@ -332,7 +332,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "These ids correspond to the VOC labels for `background`, `bottle`, `tvmonitor` and `void`.\n",
+    "These ids correspond to the VOC 2012 labels for `background`, `bottle`, `tvmonitor` and `void`.\n",
     "\n",
     "While they would work, Captum expects consecutive group ids and would hence consider that there are 256 feature groups (most of them empty). This would result in slow execution.\n",
     "\n",
@@ -461,7 +461,7 @@
     "\n",
     "attribution_map = attribution_map.squeeze().cpu().detach().numpy()\n",
     "# adjust shape to height, width, channels \n",
-    "attribution_map = np.transpose(attribution_map, (1,2,0))\n",
+    "attribution_map = np.transpose(attribution_map, (1,2,0))\n",
     "\n",
     "_ = viz.visualize_image_attr(attribution_map,\n",
     "                             method=\"heat_map\",\n",