Hi @andimarafioti, many thanks for this amazing code/repo!
Just wondering, do you know if it's possible to extract the output vision embeddings from Florence-2? I'm trying to run HuggingFaceM4/DocumentVQA through Florence-2 and feed the resulting embeddings to UMAP for visualization (similar to https://cs.stanford.edu/people/karpathy/cnnembed/).
Here's what I've tried (many thanks for any help!):
```python
from unittest.mock import patch

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoProcessor
from transformers.dynamic_module_utils import get_imports


def fixed_get_imports(filename):
    """Strip the flash_attn import from the remote modeling file (the usual CPU workaround)."""
    imports = get_imports(filename)
    if str(filename).endswith("modeling_florence2.py") and "flash_attn" in imports:
        imports.remove("flash_attn")
    return imports


data = load_dataset("HuggingFaceM4/DocumentVQA", split="train")[0]
model_id = "microsoft/Florence-2-base-ft"

# workaround for CPU / avoid flash_attn requirement
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

inputs = processor(text=data["question"], images=data["image"], return_tensors="pt", padding=True)
model.vision_tower(inputs["pixel_values"])
```
Unfortunately, this yields:
```
Traceback (most recent call last):
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-30-82c1d1be9daf>", line 1, in <module>
    model.vision_tower(inputs["pixel_values"])
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.cache/huggingface/modules/transformers_modules/microsoft/Florence-2-base-ft/9803f52844ec1ae5df004e6089262e9a23e527fd/modeling_florence2.py", line 662, in forward
    x = self.forward_features(x)
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.cache/huggingface/modules/transformers_modules/microsoft/Florence-2-base-ft/9803f52844ec1ae5df004e6089262e9a23e527fd/modeling_florence2.py", line 657, in forward_features
    x = self.norms(x)
        ^^^^^^^^^^
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DaViT' object has no attribute 'norms'
```
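Update: reading the traceback, the failure seems to be that `DaViT.forward_features` references a `norms` attribute that doesn't exist on the loaded checkpoint, and the model's own generation path apparently never goes through that method. Assuming the remote code for this revision exposes the private `_encode_image` helper (which appears to route through `forward_features_unpool` instead), a sketch like the following might pull per-image embeddings and project them with UMAP. The method names are assumptions about the remote code, not a confirmed API, so treat this as untested:

```python
import torch
import umap  # pip install umap-learn

# ASSUMPTION: model._encode_image (and vision_tower.forward_features_unpool,
# which it calls) are private helpers in the remote modeling_florence2.py for
# this checkpoint revision; they may change or disappear in other revisions.
with torch.no_grad():
    feats = model._encode_image(inputs["pixel_values"])  # (batch, num_tokens, dim)

# Mean-pool the visual tokens so each image becomes a single vector.
pooled = feats.mean(dim=1).cpu().numpy()

# UMAP needs several samples to build its neighbor graph, so run this over a
# batch of DocumentVQA images rather than a single example.
reducer = umap.UMAP(n_components=2)
points_2d = reducer.fit_transform(pooled)
```

Calling `model.vision_tower.forward_features_unpool(inputs["pixel_values"])` directly might also sidestep the missing `norms` attribute, though (if I'm reading the remote code right) that would skip the projection applied before the features reach the language model.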