
Commit

Fixed SAM example notebook small issues (#1927)
sergiopaniego authored Sep 16, 2024
1 parent ff969fe commit 8ec9402
Showing 3 changed files with 42 additions and 62 deletions.
78 changes: 34 additions & 44 deletions examples/vision/ipynb/sam.ipynb
@@ -6,12 +6,12 @@
"colab_type": "text"
},
"source": [
"# Segment Anything Model with \ud83e\udd17Transformers\n",
"# Segment Anything Model with 🤗Transformers\n",
"\n",
"**Authors:** [Merve Noyan](https://twitter.com/mervenoyann) & [Sayak Paul](https://twitter.com/RisingSayak)<br>\n",
"**Date created:** 2023/07/11<br>\n",
"**Last modified:** 2023/07/11<br>\n",
"**Description:** Fine-tuning Segment Anything Model using Keras and \ud83e\udd17 Transformers."
"**Description:** Fine-tuning Segment Anything Model using Keras and 🤗 Transformers."
]
},
{
@@ -45,9 +45,8 @@
"segmentation and edge detection. The goal of SAM is to enable all of these downstream\n",
"segmentation tasks through prompting.\n",
"\n",
"In this example, we'll learn how to use the SAM model from \ud83e\udd17 Transformers for performing\n",
"inference and fine-tuning.\n",
""
"In this example, we'll learn how to use the SAM model from 🤗 Transformers for performing\n",
"inference and fine-tuning.\n"
]
},
{
@@ -61,7 +60,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -81,7 +80,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -109,12 +108,9 @@
"\n",
"SAM has the following components:\n",
"\n",
"|\n",
"![](https://imgur.com/oLfdwuB)\n",
"| ![](https://imgur.com/oLfdwuB.png) |\n",
"|:--:|\n",
"| Image taken from the official\n",
"[SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |\n",
"|"
"| Image taken from the official [SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |\n"
]
},
{
@@ -174,7 +170,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -197,7 +193,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -315,8 +311,7 @@
" show_mask(mask, axes[i])\n",
" axes[i].title.set_text(f\"Mask {i+1}, Score: {score.numpy().item():.3f}\")\n",
" axes[i].axis(\"off\")\n",
" plt.show()\n",
""
" plt.show()\n"
]
},
{
@@ -333,7 +328,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -357,7 +352,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -380,7 +375,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -416,7 +411,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -443,9 +438,7 @@
"As can be noticed, all the masks are _valid_ masks for the point prompt we provided.\n",
"\n",
"SAM is flexible enough to support different visual prompts and we encourage you to check\n",
"out [this\n",
"notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipy\n",
"nb) to know more about them!"
"out [this notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipynb) to know more about them!"
]
},
{
@@ -467,7 +460,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -494,7 +487,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -553,7 +546,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -606,8 +599,7 @@
" y_max = min(H, y_max + np.random.randint(0, 20))\n",
" bbox = [x_min, y_min, x_max, y_max]\n",
"\n",
" return bbox\n",
""
" return bbox\n"
]
},
{
@@ -634,7 +626,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -667,7 +659,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -696,7 +688,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -723,12 +715,12 @@
},
"source": [
"We will now write DICE loss. This implementation is based on\n",
"[MONAI DICE loss](https://docs.monai.io/en/stable/_modules/monai/losses/dice.html#DiceLoss)."
"[MONAI DICE loss](https://docs.monai.io/en/stable/losses.html#diceloss)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -752,8 +744,7 @@
" loss = 1.0 - (2.0 * intersection + 1e-5) / (denominator + 1e-5)\n",
" loss = tf.reduce_mean(loss)\n",
"\n",
" return loss\n",
""
" return loss\n"
]
},
{
@@ -762,15 +753,15 @@
"colab_type": "text"
},
"source": [
"##\u00a0Fine-tuning SAM\n",
"## Fine-tuning SAM\n",
"\n",
"We will now fine-tune SAM's decoder part. We will freeze the vision encoder and prompt\n",
"encoder layers."
]
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -806,8 +797,7 @@
" grads = tape.gradient(loss, trainable_vars)\n",
" optimizer.apply_gradients(zip(grads, trainable_vars))\n",
"\n",
" return loss\n",
""
" return loss\n"
]
},
{
@@ -822,7 +812,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -859,7 +849,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -880,7 +870,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -906,7 +896,7 @@
},
{
"cell_type": "code",
"execution_count": 0,
"execution_count": null,
"metadata": {
"colab_type": "code"
},
@@ -947,4 +937,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
13 changes: 4 additions & 9 deletions examples/vision/md/sam.md
@@ -85,12 +85,9 @@ import os

SAM has the following components:

|
![](https://imgur.com/oLfdwuB)
| ![](https://imgur.com/oLfdwuB.png) |
|:--:|
| Image taken from the official
[SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |
|
| Image taken from the official [SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |

The image encoder is responsible for computing image embeddings. When interacting with
SAM, we compute the image embedding one time (as the image encoder is heavy) and then
@@ -345,9 +342,7 @@ And there we go!
As can be noticed, all the masks are _valid_ masks for the point prompt we provided.

SAM is flexible enough to support different visual prompts and we encourage you to check
out [this
notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipy
nb) to know more about them!
out [this notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipynb) to know more about them!

---
## Fine-tuning
@@ -559,7 +554,7 @@ ground_truth_mask (2, 256, 256) <dtype: 'int32'> True
### Training

We will now write DICE loss. This implementation is based on
[MONAI DICE loss](https://docs.monai.io/en/stable/_modules/monai/losses/dice.html#DiceLoss).
[MONAI DICE loss](https://docs.monai.io/en/stable/losses.html#diceloss).
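
As a rough illustration of the loss this hunk refers to, here is a minimal NumPy sketch of a Dice loss. Only the final expression `1.0 - (2.0 * intersection + 1e-5) / (denominator + 1e-5)` and the mean reduction appear in the diff; the sigmoid on the predictions, the choice of reduction axes, and the name `dice_loss_sketch` are assumptions, and the example itself is written with TensorFlow rather than NumPy.

```python
import numpy as np


def dice_loss_sketch(y_true, y_pred_logits, smooth=1e-5):
    """Dice loss for a batch of masks shaped (batch, height, width)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Assumption: predictions are raw logits, squashed to [0, 1] with a sigmoid.
    y_pred = 1.0 / (1.0 + np.exp(-np.asarray(y_pred_logits, dtype=np.float64)))

    # Assumption: reduce over every axis except the batch axis.
    axes = tuple(range(1, y_true.ndim))
    intersection = np.sum(y_true * y_pred, axis=axes)
    denominator = np.sum(y_true, axis=axes) + np.sum(y_pred, axis=axes)

    # Same expression as in the diff; the smoothing term avoids division by zero.
    loss = 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)
    return float(np.mean(loss))
```

A prediction that matches the mask gives a loss near 0, while a prediction that misses it entirely gives a loss near 1.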


```python
13 changes: 4 additions & 9 deletions examples/vision/sam.py
@@ -67,12 +67,9 @@
SAM has the following components:
|
![](https://imgur.com/oLfdwuB)
| ![](https://imgur.com/oLfdwuB.png) |
|:--:|
| Image taken from the official
[SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |
|
| Image taken from the official [SAM blog post](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/) |
"""

@@ -298,9 +295,7 @@ def show_masks_on_image(raw_image, masks, scores):
As can be noticed, all the masks are _valid_ masks for the point prompt we provided.
SAM is flexible enough to support different visual prompts and we encourage you to check
out [this
notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipy
nb) to know more about them!
out [this notebook](https://github.com/huggingface/notebooks/blob/main/examples/segment_anything.ipynb) to know more about them!
"""

"""
@@ -484,7 +479,7 @@ def get_bounding_box(self, ground_truth_map):

"""
We will now write DICE loss. This implementation is based on
[MONAI DICE loss](https://docs.monai.io/en/stable/_modules/monai/losses/dice.html#DiceLoss).
[MONAI DICE loss](https://docs.monai.io/en/stable/losses.html#diceloss).
"""


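
The `get_bounding_box` hunk in the notebook diff loosens the ground-truth box by a random number of pixels before it is used as a prompt. A minimal NumPy sketch of that idea follows. Only the `y_max` perturbation, the `[x_min, y_min, x_max, y_max]` ordering, and the method name appear in the diff; the mask-to-box step via `np.where`, the matching padding on the other three edges, the empty-mask fallback, and the standalone signature of `get_bounding_box_sketch` are assumptions.

```python
import numpy as np


def get_bounding_box_sketch(ground_truth_map, max_jitter=20, rng=None):
    """Return [x_min, y_min, x_max, y_max] around the nonzero pixels of a 2D mask."""
    rng = np.random.default_rng() if rng is None else rng
    H, W = ground_truth_map.shape

    # Assumption: the box is the tight extent of the nonzero mask pixels.
    y_indices, x_indices = np.where(ground_truth_map > 0)
    if y_indices.size == 0:
        # Assumption: fall back to the whole image when the mask is empty.
        return [0, 0, W, H]
    x_min, x_max = int(x_indices.min()), int(x_indices.max())
    y_min, y_max = int(y_indices.min()), int(y_indices.max())

    # Loosen each edge by a random amount in [0, max_jitter), clamped to the image.
    x_min = max(0, x_min - int(rng.integers(0, max_jitter)))
    x_max = min(W, x_max + int(rng.integers(0, max_jitter)))
    y_min = max(0, y_min - int(rng.integers(0, max_jitter)))
    y_max = min(H, y_max + int(rng.integers(0, max_jitter)))

    return [x_min, y_min, x_max, y_max]
```

The random padding keeps the box prompt from being a pixel-perfect fit to the mask, so the returned box always contains the tight box and never leaves the image.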
