Commit
[Feature] Add support for MiniMax API (#548)
* update requirement

* update requirement

* update with minimax

* update api model

* Update readme

* fix error

---------

Co-authored-by: zhangsongyang <[email protected]>
tonysy authored Nov 6, 2023
1 parent 1ccdfaa commit 239c2a3
Showing 17 changed files with 368 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -38,6 +38,7 @@ Just like a compass guides us on our journey, OpenCompass will guide you through
## 🚀 What's New <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>

- **\[2023.11.06\]** We have supported several API-based models, including ChatGLM Pro@Zhipu, ABAB-Chat@MiniMax and Xunfei. Welcome to the [Models](https://opencompass.readthedocs.io/en/latest/user_guides/models.html) section for more details. 🔥🔥🔥.
- **\[2023.10.24\]** We release a new benchmark for evaluating LLMs’ multi-turn dialogue capabilities. Welcome to [BotChat](https://github.com/open-compass/BotChat) for more details. 🔥🔥🔥.
- **\[2023.09.26\]** We update the leaderboard with [Qwen](https://github.com/QwenLM/Qwen), one of the best-performing open-source models currently available. Welcome to our [homepage](https://opencompass.org.cn) for more details. 🔥🔥🔥.
- **\[2023.09.20\]** We update the leaderboard with [InternLM-20B](https://github.com/InternLM/InternLM). Welcome to our [homepage](https://opencompass.org.cn) for more details. 🔥🔥🔥.
@@ -46,7 +47,6 @@ Just like a compass guides us on our journey, OpenCompass will guide you through
- **\[2023.09.08\]** We update the leaderboard with Baichuan-2/Tigerbot-2/Vicuna-v1.5. Welcome to our [homepage](https://opencompass.org.cn) for more details.
- **\[2023.09.06\]** The [**Baichuan2**](https://github.com/baichuan-inc/Baichuan2) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
- **\[2023.09.02\]** We have supported the evaluation of [Qwen-VL](https://github.com/QwenLM/Qwen-VL) in OpenCompass.
- **\[2023.08.25\]** The [**TigerBot**](https://github.com/TigerResearch/TigerBot) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.

> [More](docs/en/notes/news.md)
2 changes: 1 addition & 1 deletion README_zh-CN.md
@@ -38,6 +38,7 @@
## 🚀 What's New <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>

- **\[2023.11.06\]** We have supported several API-based models, including ChatGLM Pro@Zhipu, ABAB-Chat@MiniMax and Xunfei. Welcome to the [Models](https://opencompass.readthedocs.io/en/latest/user_guides/models.html) section for more details. 🔥🔥🔥.
- **\[2023.10.24\]** We have released BotChat, a new benchmark for evaluating the multi-turn dialogue capabilities of large language models. Welcome to [BotChat](https://github.com/open-compass/BotChat) for more details. 🔥🔥🔥.
- **\[2023.09.26\]** We update the leaderboard with [Qwen](https://github.com/QwenLM/Qwen), one of the best-performing open-source models currently available. Welcome to our [homepage](https://opencompass.org.cn) for more details. 🔥🔥🔥.
- **\[2023.09.20\]** We update the leaderboard with [InternLM-20B](https://github.com/InternLM/InternLM). Welcome to our [homepage](https://opencompass.org.cn) for more details. 🔥🔥🔥.
@@ -46,7 +47,6 @@
- **\[2023.09.08\]** We update the leaderboard with Baichuan-2/Tigerbot-2/Vicuna-v1.5. Welcome to our [homepage](https://opencompass.org.cn) for more details.
- **\[2023.09.06\]** The [**Baichuan2**](https://github.com/baichuan-inc/Baichuan2) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
- **\[2023.09.02\]** We have supported the evaluation of [Qwen-VL](https://github.com/QwenLM/Qwen-VL) in OpenCompass.
- **\[2023.08.25\]** The [**TigerBot**](https://github.com/TigerResearch/TigerBot) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.

> [More](docs/zh_cn/notes/news.md)
37 changes: 37 additions & 0 deletions configs/eval_minimax.py
@@ -0,0 +1,37 @@
from mmengine.config import read_base
from opencompass.models import MiniMax
from opencompass.partitioners import NaivePartitioner
from opencompass.runners import LocalRunner
from opencompass.runners.local_api import LocalAPIRunner
from opencompass.tasks import OpenICLInferTask

with read_base():
    # from .datasets.collections.chat_medium import datasets
    from .summarizers.medium import summarizer
    from .datasets.ceval.ceval_gen import ceval_datasets

datasets = [
    *ceval_datasets,
]

models = [
    dict(
        abbr='minimax_abab5.5-chat',
        type=MiniMax,
        path='abab5.5-chat',
        key='xxxxxxx',  # please provide your API key
        group_id='xxxxxxxx',  # please provide your group_id
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
]

infer = dict(
    partitioner=dict(type=NaivePartitioner),
    runner=dict(
        type=LocalAPIRunner,
        max_num_workers=4,
        concurrent_users=4,
        task=dict(type=OpenICLInferTask)),
)
50 changes: 50 additions & 0 deletions configs/eval_xunfei.py
@@ -0,0 +1,50 @@
from mmengine.config import read_base
from opencompass.models.xunfei_api import XunFei
from opencompass.partitioners import NaivePartitioner
from opencompass.runners import LocalRunner
from opencompass.runners.local_api import LocalAPIRunner
from opencompass.tasks import OpenICLInferTask

with read_base():
    # from .datasets.collections.chat_medium import datasets
    from .summarizers.medium import summarizer
    from .datasets.ceval.ceval_gen import ceval_datasets

datasets = [
    *ceval_datasets,
]

models = [
    dict(
        abbr='Spark-v1-1',
        type=XunFei,
        appid="xxxx",
        path='ws://spark-api.xf-yun.com/v1.1/chat',
        api_secret="xxxxxxx",
        api_key="xxxxxxx",
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
    dict(
        abbr='Spark-v3-1',
        type=XunFei,
        appid="xxxx",
        domain='generalv3',
        path='ws://spark-api.xf-yun.com/v3.1/chat',
        api_secret="xxxxxxxx",
        api_key="xxxxxxxxx",
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
]

infer = dict(
    partitioner=dict(type=NaivePartitioner),
    runner=dict(
        type=LocalAPIRunner,
        max_num_workers=2,
        concurrent_users=2,
        task=dict(type=OpenICLInferTask)),
)
36 changes: 36 additions & 0 deletions configs/eval_zhihu.py
@@ -0,0 +1,36 @@
from mmengine.config import read_base
from opencompass.models import ZhiPuAI
from opencompass.partitioners import NaivePartitioner
from opencompass.runners import LocalRunner
from opencompass.runners.local_api import LocalAPIRunner
from opencompass.tasks import OpenICLInferTask

with read_base():
    # from .datasets.collections.chat_medium import datasets
    from .summarizers.medium import summarizer
    from .datasets.ceval.ceval_gen import ceval_datasets

datasets = [
    *ceval_datasets,
]

models = [
    dict(
        abbr='chatglm_pro',
        type=ZhiPuAI,
        path='chatglm_pro',
        key='xxxxxxxxxxxx',
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
]

infer = dict(
    partitioner=dict(type=NaivePartitioner),
    runner=dict(
        type=LocalAPIRunner,
        max_num_workers=2,
        concurrent_users=2,
        task=dict(type=OpenICLInferTask)),
)
2 changes: 1 addition & 1 deletion docs/en/index.rst
@@ -69,7 +69,7 @@ We always welcome *PRs* and *Issues* for the betterment of OpenCompass.
.. _Tools:
.. toctree::
   :maxdepth: 1
   :caption: tools
   :caption: Tools

   tools.md

1 change: 1 addition & 0 deletions docs/en/notes/news.md
@@ -1,5 +1,6 @@
# News

- **\[2023.08.25\]** The [**TigerBot**](https://github.com/TigerResearch/TigerBot) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
- **\[2023.08.21\]** [**Lagent**](https://github.com/InternLM/lagent) has been released, which is a lightweight framework for building LLM-based agents. We are working with the Lagent team to support the evaluation of general tool-use capability, stay tuned!
- **\[2023.08.18\]** We have supported evaluation for **multi-modality learning**, including **MMBench, SEED-Bench, COCO-Caption, Flickr-30K, OCR-VQA, ScienceQA** and so on. The leaderboard is on the way. Feel free to try multi-modality evaluation with OpenCompass!
- **\[2023.08.18\]** The [dataset card](https://opencompass.org.cn/dataset-detail/MMLU) is now online. We welcome new evaluation benchmarks to join OpenCompass!
13 changes: 12 additions & 1 deletion docs/en/user_guides/models.md
@@ -70,7 +70,9 @@ model = HuggingFaceCausalLM(
Currently, OpenCompass supports API-based model inference for the following:

- OpenAI (`opencompass.models.OpenAI`)
- More coming soon
- ChatGLM (`opencompass.models.ZhiPuAI`)
- ABAB-Chat from MiniMax (`opencompass.models.MiniMax`)
- Spark from XunFei (`opencompass.models.XunFei`)

Let's take the OpenAI configuration file as an example to see how API-based models are used in the
configuration file.
@@ -94,6 +96,15 @@ models = [
]
```

We have provided several example configurations for API-based models. Please refer to:

```bash
configs
├── eval_zhihu.py
├── eval_xunfei.py
└── eval_minimax.py
```
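
For reference, the MiniMax example added in this commit (`configs/eval_minimax.py`) follows the same pattern as the OpenAI entry above; a minimal sketch with placeholder credentials (the inline comments describe the apparent role of each field):

```python
from opencompass.models import MiniMax

models = [
    dict(
        abbr='minimax_abab5.5-chat',  # name shown in result tables
        type=MiniMax,                 # MiniMax API wrapper added in this commit
        path='abab5.5-chat',          # MiniMax model identifier
        key='xxxxxxx',                # your MiniMax API key
        group_id='xxxxxxxx',          # your MiniMax group id
        query_per_second=1,           # client-side request throttle
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
]
```

As with other configs, such a file would typically be launched via the repository's entry script, e.g. `python run.py configs/eval_minimax.py`.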

## Custom Models

If the above methods do not support your model evaluation requirements, you can refer to
1 change: 1 addition & 0 deletions docs/zh_cn/notes/news.md
@@ -1,5 +1,6 @@
# News

- **\[2023.08.25\]** The [**TigerBot**](https://github.com/TigerResearch/TigerBot) team adopts OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
- **\[2023.08.21\]** [**Lagent**](https://github.com/InternLM/lagent) has been officially released. It is a lightweight, open-source framework for building LLM-based agents. We are working closely with the Lagent team to support the evaluation of tool-use capability based on Lagent, stay tuned!
- **\[2023.08.18\]** OpenCompass now supports **multi-modality evaluation** with 10+ multi-modal datasets, including **MMBench, SEED-Bench, COCO-Caption, Flickr-30K, OCR-VQA, ScienceQA** and so on. The multi-modality leaderboard is coming soon, stay tuned!
- **\[2023.08.18\]** The [dataset card](https://opencompass.org.cn/dataset-detail/MMLU) page is now live on the OpenCompass website. We welcome more community evaluation datasets to join OpenCompass!
13 changes: 12 additions & 1 deletion docs/zh_cn/user_guides/models.md
@@ -63,7 +63,9 @@ model = HuggingFaceCausalLM(
Currently, OpenCompass supports API-based model inference for the following:

- OpenAI (`opencompass.models.OpenAI`)
- Coming soon
- ChatGLM@Zhipu (`opencompass.models.ZhiPuAI`)
- ABAB-Chat@MiniMax (`opencompass.models.MiniMax`)
- Spark@XunFei (`opencompass.models.XunFei`)

Below, we take the OpenAI configuration file as an example to show how API-based models are used in the configuration file.

@@ -86,6 +88,15 @@ models = [
]
```

We also provide evaluation examples for API-based models. Please refer to:

```bash
configs
├── eval_zhihu.py
├── eval_xunfei.py
└── eval_minimax.py
```
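
Similarly, the ChatGLM Pro example added in this commit (`configs/eval_zhihu.py`) reduces to a single model entry; a minimal sketch with a placeholder key:

```python
from opencompass.models import ZhiPuAI

models = [
    dict(
        abbr='chatglm_pro',      # name shown in result tables
        type=ZhiPuAI,            # Zhipu API wrapper
        path='chatglm_pro',      # Zhipu model identifier
        key='xxxxxxxxxxxx',      # your Zhipu API key
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=8),
]
```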

## Custom Models

If the above methods do not meet your model evaluation needs, please refer to [Supporting New Models](../advanced_guides/new_model.md) to add support for new models in OpenCompass.
3 changes: 2 additions & 1 deletion opencompass/models/__init__.py
@@ -6,6 +6,7 @@
from .huggingface import HuggingFaceCausalLM # noqa: F401, F403
from .intern_model import InternLM # noqa: F401, F403
from .llama2 import Llama2, Llama2Chat # noqa: F401, F403
from .minimax_api import MiniMax # noqa: F401
from .openai_api import OpenAI # noqa: F401
from .xunfei_api import XunFei # noqa: F401
from .zhipuai import ZhiPuAI # noqa: F401
from .zhipuai_api import ZhiPuAI # noqa: F401
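
With these lines, the new API wrappers are re-exported from the package root, so a quick import is enough to confirm the wiring; a minimal check, assuming this revision of opencompass is installed:

```python
# All three API wrappers touched by this commit resolve through the
# package root rather than their private modules.
from opencompass.models import MiniMax, XunFei, ZhiPuAI

print(MiniMax.__module__, XunFei.__module__, ZhiPuAI.__module__)
```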