
Kindly add OpenVINO backend support #32

Open · gericho opened this issue May 6, 2024 · 7 comments
@gericho commented May 6, 2024

Would it be possible to add OpenVINO support for Intel processors? The repo by @zhuzilin here shows a speed improvement of nearly 50%, so users could run larger models without sacrificing current performance. Thank you!

[screenshot attached]

@zachs-55

You can use OpenVINO with whisper.cpp (though I personally found CLBlast a little faster on my weak Celeron N5095):
https://github.com/ggerganov/whisper.cpp#openvino-support

Then you can use this for a Wyoming endpoint: https://github.com/ser/wyoming-whisper-api-client
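
For anyone following that route, the steps from the linked whisper.cpp README boil down to roughly the following (flag and script names as documented there at the time; check the current README, since these occasionally change):

# Build whisper.cpp with the OpenVINO encoder enabled
cmake -B build -DWHISPER_OPENVINO=1
cmake --build build -j --config Release

# Convert a ggml Whisper model's encoder to OpenVINO IR
# (the conversion script ships in whisper.cpp's models/ directory)
python3 models/convert-whisper-to-openvino.py --model base.en

The Wyoming client linked above can then bridge the resulting endpoint to whatever consumes the Wyoming protocol.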

@gericho (Author) commented May 14, 2024

Thank you very much, I'll give it a try soon!

@tannisroot

faster-whisper uses CTranslate2, which doesn't have OpenVINO support.

@monoamin commented Jan 8, 2025

For future reference, and for anyone who stumbles in here trying to get Whisper to use their Intel GPU: I have created a working demo Dockerfile and Compose file here:

https://github.com/monoamin/wyoming-whispercpp-openvino-gpu
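
The repo has the full wiring, but the core ingredient of any setup like this is passing the host's DRI render nodes into the container so OpenVINO can see the GPU. A minimal sketch of that pattern (the image name and port here are placeholders, not the ones from the repo):

# Expose the Intel GPU to the container via the /dev/dri render nodes
docker run -d \
  --device /dev/dri:/dev/dri \
  -p 10300:10300 \
  wyoming-whispercpp-openvino   # hypothetical image tag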

@MaximumFish

This is perfect timing, as I just started looking at this yesterday. @monoamin have you run any comparison tests at all? Interestingly, the faster-whisper page pegs it as slightly faster with no acceleration than whisper.cpp with OpenVINO. I'd be very curious whether you see similar results.

@monoamin commented Jan 13, 2025

@MaximumFish I have not tested this extensively, but I can give you some quick numbers.
My primary intent in running it on the GPU was to make the most of the hardware I already have.
Hardware-wise, my server runs an Intel Arc A380 with 6 GB VRAM and a Ryzen 7 5700X with 32 GB RAM.

This is with whisper.cpp, first CPU-only then GPU, on a random German speech sample (roughly: "The Internet is uncharted territory for all of us, and of course it also gives enemies and opponents of our democratic order entirely new ways and approaches to endanger our way of life"):

root on monoamin in ~
$ ffmpeg -i speechtest_fixed.wav 2>&1 | grep Duration
  Duration: 00:00:24.17, bitrate: 256 kb/s

root on monoamin in ~
$ python3 testwhispertime.py  # CPU only
Transcription: {"text":" Das Internet ist für uns alle Neuland und es ermöglicht auch Feinden und Gegnern unserer demokratischen Grundordnung natürlich mit völlig neuen Möglichkeiten und völlig neuen Herangehensweisen unsere Art zu leben in Gefahr zu bringen.\n"}
Time taken: 6.303544759750366 seconds

root on monoamin in ~
$ python3 testwhispertime.py  # With GPU
Transcription: {"text":" Das Internet ist für uns alle Neuland und es ermöglicht auch Feinden und Gegnern unserer demokratischen Grundordnung natürlich, mit völlig neuen Möglichkeiten und völlig neuen Herangehensweisen unsere Art zu leben in Gefahr zu bringen.\n"}
Time taken: 4.317137241363525 seconds

I currently don't have a faster-whisper container running but I'll see if I can set one up today to compare results.
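
(For anyone wanting to reproduce this: the test script itself isn't shown here, but the {"text": ...} output matches the JSON returned by whisper.cpp's bundled server example, so a minimal stand-in using its /inference endpoint, with host and port assumed, looks like this:)

# Time a transcription request against a running whisper.cpp server.
# Endpoint and field names follow whisper.cpp's server example;
# adjust the URL to whatever your container actually exposes.
time curl -s http://localhost:8080/inference \
  -F file=@speechtest_fixed.wav \
  -F response_format=json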

@MaximumFish

Thanks for testing it! Definitely interested to see the results vs faster-whisper.
