[BUG] llm_on_genie install instruction #24

Open
BrickDesignerNL opened this issue Dec 1, 2024 · 10 comments
Labels
question Further information is requested

Comments

BrickDesignerNL commented Dec 1, 2024

https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie

After:

python3.10 -m venv llm_on_genie_venv

the tutorial says:

source llm_on_genie_venv/bin/activate

but that is the Linux/macOS form. On Windows it's typically:

llm_on_genie_venv\Scripts\activate
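
To be precise, python -m venv generates a separate activation script per shell on Windows. In CMD:

llm_on_genie_venv\Scripts\activate.bat

and in PowerShell:

llm_on_genie_venv\Scripts\Activate.ps1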

@BrickDesignerNL (Author)

Please also add instructions to log in to Hugging Face first and provide an access token in the terminal. Otherwise, Windows CMD won't be able to download the Llama model dependencies from Hugging Face.

pip install huggingface_hub
huggingface-cli login

and then, when prompted, paste a token that you created on Hugging Face.
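
As a non-interactive alternative, recent versions of huggingface_hub also read the token from the HF_TOKEN environment variable (older releases used HUGGING_FACE_HUB_TOKEN), so in PowerShell you could set it like this (placeholder token shown):

$env:HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxx"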

BrickDesignerNL commented Dec 1, 2024

System used
CPU: x1e-78-100 (Snapdragon X Elite)
Mem: 32GB
OS: Windows 11

Python: AMD64, v3.10
Torch: 2.5.1+cpu

Swap: can be expanded to 80 GB (to ensure everything would fit in memory).

@BrickDesignerNL (Author)

cp ai-hub-apps/tutorials/llm_on_genie/configs/htp/htp_backend_ext_config.json.template genie_bundle/htp_backend_ext_config.json

is not a Windows CMD command (and the slashes go the other way), so please extend the instructions for Snapdragon X Elite users.
Thank you.
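
For reference, the direct equivalent of that line in CMD would be:

copy ai-hub-apps\tutorials\llm_on_genie\configs\htp\htp_backend_ext_config.json.template genie_bundle\htp_backend_ext_config.json

In PowerShell, cp is an alias for Copy-Item and forward slashes are accepted, so the original line works there as-is.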

@BrickDesignerNL (Author)

Why do I need the tokenizer from https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main? What is the difference from the one at https://huggingface.co/meta-llama/Llama-3.2-3B/tree/main for llama-v3-2-3b-chat-quantized?
(There is no exact name match.)

BrickDesignerNL commented Dec 2, 2024

For some reason $QNN_SDK_ROOT is not automatically defined.

It isn't defined in PowerShell either; I just learned that you assume PowerShell and not CMD ;)

So it might be nice to mention that C:\Qualcomm\AIStack\QAIRT\2.28.2.241116\ is the root folder, since it's not under the default Program Files or Microsoft SDK folders.
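
For the current PowerShell session, defining it would look like this, using the path above (adjust the version segment to your install):

$env:QNN_SDK_ROOT = "C:\Qualcomm\AIStack\QAIRT\2.28.2.241116"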

BrickDesignerNL commented Dec 2, 2024

When compiling the ChatApp, I get the following errors:

Error (active) E1696: cannot open source file "GenieCommon.h" (ChatApp, C:\Users\<username>\cleansheet\ChatApp\ChatApp.hpp, line 9)
Error (active) E1696: cannot open source file "GenieDialog.h" (ChatApp, C:\Users\<username>\cleansheet\ChatApp\ChatApp.hpp, line 10)

While

genie-t2t-run.exe -c genie_config.json -p "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nWhat is France's capital?<|eot_id|><|start_header_id|>assistant<|end_header_id|>"

gave the correct response and utilized the NPU, using 3.2 GB of memory for the Llama 3.2 3B model.
But once the response runs past the token limit, it keeps repeating the last words indefinitely.

genie-t2t-run.exe -c genie_config.json -p "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nSchrijf een Sinterklaas gedicht in het Nederlands<|eot_id|><|start_header_id|>assistant<|end_header_id|>"

(The prompt asks for a Sinterklaas poem in Dutch.) It also produces the same answer every time you start with the same prompt.
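
Identical output for an identical prompt points at deterministic (greedy, temperature-0) decoding. If your genie_config.json exposes sampler settings, raising the temperature or enabling top-k/top-p sampling should vary the output between runs. A sketch of what such a section might look like:

"sampler": {
    "seed": 42,
    "temp": 0.8,
    "top-k": 40,
    "top-p": 0.95
}

All four field names above are an assumption, not verified against the Genie config schema.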

@bhushan23

Thanks for the feedback, @BrickDesignerNL.

For #24 (comment):
Are you referring to ChatApp on X Elite? If so, have you followed https://github.com/quic/ai-hub-apps/tree/main/apps/windows/cpp/ChatApp?

Possibly the path is not set correctly, so the Genie header files cannot be found.

Could you please check whether the following files exist?

$(QNN_SDK_ROOT)\include\Genie\GenieCommon.h
$(QNN_SDK_ROOT)\include\Genie\GenieDialog.h
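
For instance, in PowerShell (assuming QNN_SDK_ROOT is set as an environment variable), each of these should print True:

Test-Path "$env:QNN_SDK_ROOT\include\Genie\GenieCommon.h"
Test-Path "$env:QNN_SDK_ROOT\include\Genie\GenieDialog.h"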

@mestrona-3 added the question (Further information is requested) label on Dec 2, 2024
BrickDesignerNL commented Dec 2, 2024

@bhushan23 Thank you!

I have followed the instructions at
https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie#for-windows-device
for llama_v3_2_3b_chat_quantized.

There it says:

cp $QNN_SDK_ROOT/lib/hexagon-v73/unsigned/* genie_bundle
cp $QNN_SDK_ROOT/lib/aarch64-windows-msvc/* genie_bundle
cp $QNN_SDK_ROOT/bin/aarch64-windows-msvc/genie-t2t-run.exe genie_bundle
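
(In PowerShell, with QNN_SDK_ROOT set as an environment variable, the equivalents would be:

Copy-Item "$env:QNN_SDK_ROOT\lib\hexagon-v73\unsigned\*" genie_bundle
Copy-Item "$env:QNN_SDK_ROOT\lib\aarch64-windows-msvc\*" genie_bundle
Copy-Item "$env:QNN_SDK_ROOT\bin\aarch64-windows-msvc\genie-t2t-run.exe" genie_bundle)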

Indeed, C:\Qualcomm\AIStack\QAIRT\2.28.0.241029\include\Genie contains:

GenieCommon.h
GenieDialog.h
GenieEmbedding.h

I've now copied them to the main folder of ChatApp, as I assume that is what should be done.

I now get:

ChatApp.cpp
Main.cpp
PromptHandler.cpp
Generating Code...
LINK : fatal error LNK1104: cannot open file '\lib\aarch64-windows-msvc\Genie.lib'

This file can be found at C:\Qualcomm\AIStack\QAIRT\2.28.2.241116\lib\aarch64-windows-msvc\Genie.lib. It was (and is) also already present at C:\Users\<username>\cleansheet\ChatApp\genie_bundle\Genie.lib.

Putting that copy into the root of ChatApp doesn't solve this, and I can't quickly find which file refers to this path, so I can't change it.

BrickDesignerNL commented Dec 3, 2024

@bhushan23 / @mestrona-3 Do you have a tip on how to fix this last issue?

@DexterWoo

> @bhushan23 / @mestrona-3 Do you have a tip on how to fix this last issue?

That one is easy.
If you haven't configured QNN_SDK_ROOT before opening the solution, define it in your user or system environment variables.

If you have already configured QNN_SDK_ROOT, update the VC++ Directories setting instead:
[screenshot: VC++ Directories settings in the Visual Studio project properties]
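
A minimal sketch of the first option, using the install path quoted earlier in this thread (setx only affects newly started processes, so restart Visual Studio afterwards):

setx QNN_SDK_ROOT "C:\Qualcomm\AIStack\QAIRT\2.28.2.241116"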
