Hi @andimarafioti, many thanks for this amazing code/repo!
Just wondering, do you know if it's possible to extract the output vision embeddings from Florence-2? I'm trying to run HuggingFaceM4/DocumentVQA through Florence-2 and feed the resulting embeddings to UMAP for visualization (similar to https://cs.stanford.edu/people/karpathy/cnnembed/).
Here's what I've tried (many thanks for any help!):
```python
from unittest.mock import patch

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoProcessor
from transformers.dynamic_module_utils import get_imports


def fixed_get_imports(filename):
    """Strip the flash_attn import from the remote modeling file (the usual CPU workaround)."""
    imports = get_imports(filename)
    if str(filename).endswith("modeling_florence2.py") and "flash_attn" in imports:
        imports.remove("flash_attn")
    return imports


data = load_dataset("HuggingFaceM4/DocumentVQA", split="train")[0]
model_id = "microsoft/Florence-2-base-ft"

# workaround for CPU / avoid flash_attn requirement
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

inputs = processor(text=data["question"], images=data["image"], return_tensors="pt", padding=True)
model.vision_tower(inputs["pixel_values"])
```
Unfortunately, this yields:
```
Traceback (most recent call last):
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-30-82c1d1be9daf>", line 1, in <module>
    model.vision_tower(inputs["pixel_values"])
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.cache/huggingface/modules/transformers_modules/microsoft/Florence-2-base-ft/9803f52844ec1ae5df004e6089262e9a23e527fd/modeling_florence2.py", line 662, in forward
    x = self.forward_features(x)
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.cache/huggingface/modules/transformers_modules/microsoft/Florence-2-base-ft/9803f52844ec1ae5df004e6089262e9a23e527fd/modeling_florence2.py", line 657, in forward_features
    x = self.norms(x)
        ^^^^^^^^^^
  File "~/miniforge3/envs/florence2/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DaViT' object has no attribute 'norms'
```
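Update: reading the traceback, the failure seems to be that `DaViT.forward_features` references a `norms` attribute that doesn't exist on the loaded checkpoint, and the model's own generation path apparently never goes through that method. Assuming the remote code for this revision exposes the private `_encode_image` helper (which appears to route through `forward_features_unpool` instead), a sketch like the following might pull per-image embeddings and project them with UMAP. The method names are assumptions about the remote code, not a confirmed API, so treat this as untested:

```python
import torch
import umap  # pip install umap-learn

# ASSUMPTION: model._encode_image (and vision_tower.forward_features_unpool,
# which it calls) are private helpers in the remote modeling_florence2.py for
# this checkpoint revision; they may change or disappear in other revisions.
with torch.no_grad():
    feats = model._encode_image(inputs["pixel_values"])  # (batch, num_tokens, dim)

# Mean-pool the visual tokens so each image becomes a single vector.
pooled = feats.mean(dim=1).cpu().numpy()

# UMAP needs several samples to build its neighbor graph, so run this over a
# batch of DocumentVQA images rather than a single example.
reducer = umap.UMAP(n_components=2)
points_2d = reducer.fit_transform(pooled)
```

Calling `model.vision_tower.forward_features_unpool(inputs["pixel_values"])` directly might also sidestep the missing `norms` attribute, though (if I'm reading the remote code right) that would skip the projection applied before the features reach the language model.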