[Bug]: 您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？ #38

13416157913 · 2024-11-07T08:46:40Z

Model Series

Qwen2.5

What are the models used?

Qwen2.5-Math-72B-Instruct、Qwen2.5-Math-7B-Instruct

What is the scenario where the problem happened?

vllm

Is this a known issue?

I have followed the GitHub README.
I have checked the Qwen documentation and cannot find an answer there.
I have checked the documentation of the related framework and cannot find useful information.
I have searched the issues and there is not a similar one.

Information about environment

您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？（其中评测数据集有Math）

Log output

您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？（其中评测数据集有Math）

Description

您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？（其中评测数据集有Math）

jnanliu · 2024-11-26T09:31:02Z

请问这个有发现这个问题是因为什么吗

William-WSJ · 2024-12-05T03:03:04Z

是的，我也有这个问题，7B的模型有时候会在输出回答后再输出一些无关紧要的内容，有时候也只是无限重复原问题，直到token数量上限。
Yes, I have this question either. The 7B model sometimes appends some unrelated content after giving an answer, or it will endlessly repeat the original question until it hits the token limit.

jnanliu · 2024-12-05T08:20:51Z

是的，我也有这个问题，7B的模型有时候会在输出回答后再输出一些无关紧要的内容，有时候也只是无限重复原问题，直到token数量上限。 Yes, I have this question either. The 7B model sometimes appends some unrelated content after giving an answer, or it will endlessly repeat the original question until it hits the token limit.

将温度系数从1.0更改为0.7会好很多。
In my settings, changing the temperature coefficient from 1.0 to 0.7 is helpful.

13416157913 · 2024-12-05T11:37:43Z

是的，我也有这个问题，7B的模型有时候会在输出回答后再输出一些无关紧要的内容，有时候也只是无限重复原问题，直到token数量上限。 Yes, I have this question either. The 7B model sometimes appends some unrelated content after giving an answer, or it will endlessly repeat the original question until it hits the token limit.

将温度系数从1.0更改为0.7会好很多。 In my settings, changing the temperature coefficient from 1.0 to 0.7 is helpful.

我这边温度设置很低的0.2，

William-WSJ · 2024-12-06T04:02:33Z

是的，我也有这个问题，7B的模型有时候会在输出回答后再输出一些无关紧要的内容，有时候也只是无限重复原问题，直到token数量上限。 Yes, I have this question either. The 7B model sometimes appends some unrelated content after giving an answer, or it will endlessly repeat the original question until it hits the token limit.

将温度系数从1.0更改为0.7会好很多。 In my settings, changing the temperature coefficient from 1.0 to 0.7 is helpful.

我这边温度设置很低的0.2，

是这样的，我使用英文输入时，7B模型可以正确解答问题，但是会在解答之后又会出现以Human:打头的无关题目的输出，直到达到max token。如果是中文题目，大概率会一直重复我的问题，输出结果和配置如下图：
输出结果英文：

输出结果中文：

配置文件：

这里已经将qwen2.5-math-7B模型部署了，使用端口访问的

pengwenzhi · 2024-12-26T07:56:40Z

math_eval.py里有个stop_words的选项，可以加

13416157913 · 2024-12-26T11:05:45Z

math_eval.py里有个stop_words的选项，可以加

通过stop_words来解决，感觉没真正从根本上解决问题，模型的回答本质上还是很长，只是通过stop_words截断而已；这种解决方法，从长远来看，不够合理，因为不知道模型回答中，会不会出现不在stop_words中停止符号。

qwerty3564 · 2024-12-31T09:27:19Z

是的，我也有这个问题，7B的模型有时候会在输出回答后再输出一些无关紧要的内容，有时候也只是无限重复原问题，直到token数量上限。 Yes, I have this question either. The 7B model sometimes appends some unrelated content after giving an answer, or it will endlessly repeat the original question until it hits the token limit.

将温度系数从1.0更改为0.7会好很多。 In my settings, changing the temperature coefficient from 1.0 to 0.7 is helpful.

我这边温度设置很低的0.2，

是这样的，我使用英文输入时，7B模型可以正确解答问题，但是会在解答之后又会出现以Human:打头的无关题目的输出，直到达到max token。如果是中文题目，大概率会一直重复我的问题，输出结果和配置如下图：输出结果英文：

输出结果中文：配置文件：

这里已经将qwen2.5-math-7B模型部署了，使用端口访问的

请问你解决没

jklj077 transferred this issue from QwenLM/Qwen2.5 Nov 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: 您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？ #38

[Bug]: 您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？ #38

13416157913 commented Nov 7, 2024

jnanliu commented Nov 26, 2024

William-WSJ commented Dec 5, 2024

jnanliu commented Dec 5, 2024

13416157913 commented Dec 5, 2024

William-WSJ commented Dec 6, 2024

pengwenzhi commented Dec 26, 2024

13416157913 commented Dec 26, 2024

qwerty3564 commented Dec 31, 2024

[Bug]: 您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？ #38

[Bug]: 您好，使用vllm部署的Qwen2.5-Math-72B-Instruct和Qwen2.5-Math-7B-Instruct数学模型，为什么评测时模型怎么回答了很多其他乱七八糟的内容，直到达到限制token数才停止？ #38

Comments

13416157913 commented Nov 7, 2024

Model Series

What are the models used?

What is the scenario where the problem happened?

Is this a known issue?

Information about environment

Log output

Description

jnanliu commented Nov 26, 2024

William-WSJ commented Dec 5, 2024

jnanliu commented Dec 5, 2024

13416157913 commented Dec 5, 2024

William-WSJ commented Dec 6, 2024

pengwenzhi commented Dec 26, 2024

13416157913 commented Dec 26, 2024

qwerty3564 commented Dec 31, 2024