We are trying to match the output of a fine-tuned model deployed with TGI against a model deployed using TGI's multi-LoRA functionality (a base model, Starcoder2-3B, plus two different fine-tuned adapters).
Even after keeping all the inference parameters the same, we get completely different outputs for the same prompts.
System Info
Hi team,
We are trying to get the default parameter values that are used when invoking a fine-tuned model deployed using TGI (latest version).
In the logs, we are able to see the information below.
The objective of this exercise is to get the same level of accuracy from the model output between the fine-tuned model and the base model + LoRA adapters (deployed using TGI's multi-LoRA functionality).
We get the expected output from the fine-tuned model, but when using multi-LoRA the output accuracy drops drastically.
We are using the configs below when invoking the two deployments.
While using the fine-tuned model:
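A minimal sketch of this kind of request (the endpoint, prompt, and parameter values here are illustrative placeholders, not our exact config):

```sh
# Illustrative request to the fine-tuned model deployment.
# Prompt and parameter values are placeholders, not the exact config.
curl 127.0.0.1:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
          "inputs": "def fibonacci(n):",
          "parameters": {
            "max_new_tokens": 128,
            "do_sample": false
          }
        }'
```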
While using the multi-LoRA functionality:
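The multi-LoRA request differs only in that an `adapter_id` is passed in the parameters to select one of the loaded adapters (the adapter name below is a placeholder):

```sh
# Illustrative request to the multi-LoRA deployment; adapter_id selects one
# of the adapters loaded at startup (name is a placeholder).
curl 127.0.0.1:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
          "inputs": "def fibonacci(n):",
          "parameters": {
            "max_new_tokens": 128,
            "do_sample": false,
            "adapter_id": "my-org/starcoder2-3b-adapter-1"
          }
        }'
```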
We referred to the link below:
https://github.com/huggingface/text-generation-inference/blob/38773453ae0d29fba3dc79a38d589ebdc5451093/router/src/lib.rs
Could you tell us whether there is any difference between the default values used in the two methodologies mentioned above? And could you suggest a way to increase the output accuracy while using multi-LoRA?
Thanks.
Information
Tasks
Reproduction
Multi-LoRA deployment:
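A minimal sketch of this kind of launch, assuming the standard Docker image and the `--lora-adapters` startup flag (the adapter repo IDs are placeholders):

```sh
# Illustrative multi-LoRA launch: base model plus two adapters loaded at
# startup via --lora-adapters. Adapter repo IDs are placeholders.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$PWD/data:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id bigcode/starcoder2-3b \
    --lora-adapters my-org/starcoder2-3b-adapter-1,my-org/starcoder2-3b-adapter-2
```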
Fine-tuned model deployment:
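And the equivalent fully fine-tuned deployment, again as a sketch (the model repo ID is a placeholder):

```sh
# Illustrative launch of the fully fine-tuned model.
# Model repo ID is a placeholder.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$PWD/data:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id my-org/starcoder2-3b-finetuned
```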
Expected behavior
With the same parameter values, we should get the same output (or at least the same output accuracy).