Fix genai-perf command line for LLM model type #959
base: main
Conversation
Model Analyzer no longer supports LLMs (as you have noted, the interface has changed). I would encourage you to use GenAI-Perf directly, as the ability to both checkpoint and sweep through stimulus parameters has recently been added.
Thank you for your response! I have a few follow-up questions:
Looking forward to your insights!
No, GenAI-Perf does not support automatic Triton configuration tuning. Can you share what parameters you are interested in tuning?
Thank you for the clarification! I have a few more questions regarding configuration tuning for LLMs:
Looking forward to your insights!
Fix for the issue reported in #935.
If we run model-analyzer from the nvcr.io/nvidia/tritonserver:24.08-py3-sdk Docker container for a model with the LLM model type, it fails with the following error:
Command:
genai-perf -m my_model -- -b 1 -u server:8001 -i grpc -f my_model-results.csv --verbose-csv --concurrency-range 64 --measurement-mode count_windows --collect-metrics --metrics-url http://server:8002 --metrics-interval 1000
Error:
2024-10-01 10:42 [INFO] genai_perf.parser:803 - Detected passthrough args: ['-b', '1', '-u', 'server:8001', '-i', 'grpc', '-f', 'my_model-results.csv', '--verbose-csv', '--concurrency-range', '64', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://server:8002', '--metrics-interval', '1000']
usage: genai-perf [-h] [--version] {compare,profile} ...
genai-perf: error: argument subcommand: invalid choice: 'my_model' (choose from 'compare', 'profile')
It looks like the genai-perf command line created by model_analyzer is missing the required subcommand (genai-perf profile ...).
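With the missing subcommand added, the same invocation would presumably look like this (all other arguments unchanged):
genai-perf profile -m my_model -- -b 1 -u server:8001 -i grpc -f my_model-results.csv --verbose-csv --concurrency-range 64 --measurement-mode count_windows --collect-metrics --metrics-url http://server:8002 --metrics-interval 1000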
It seems that genai-perf has changed its CLI and now requires the profile subcommand. We could fix this by adding profile to line 328 of perf_analyzer.py, which should then look like:
cmd = ["genai-perf", "profile -m", self._config.models_name()]