Issues: huggingface/optimum-nvidia
- #21 Supporting Bert / Roberta (labels: enhancement, new model), opened Dec 6, 2023 by michaelfeil
- #34 Enhancing Compatibility and Extending Support for Optimum-NVIDIA Across Diverse Workloads, opened Dec 11, 2023 by yihong1120
- #44 Segmentation fault: address not mapped to object at address 0xb1fe8, opened Dec 14, 2023 by SinanAkkoyun
- #47 FileNotFoundError: [Errno 2] No such file or directory: '/data/Dilip/models/llama-2-7b-chat-hf/build.json' (label: bug), opened Dec 15, 2023 by dilip467
- #61 Not able to run 'Generate' from QuickStart section (label: bug), opened Jan 3, 2024 by harikrishnaapc
- #68 Error when Running LLAMA with tensor parallelism = 2 (label: bug), opened Jan 23, 2024 by TheCodeWrangler
- #71 llama.py with fp8 is broken (inference produces garbage results), opened Feb 10, 2024 by urimerhav
- #72 How do you use the library in your scripts after pulling and running the Docker image?, opened Feb 15, 2024 by jddunn
- #90 Incorrect tensorrt_llm config class initialization (label: bug), opened Mar 7, 2024 by Wojx