Merge branch 'baidubce:master' into master
C9luster authored Sep 13, 2024
2 parents 860f464 + cb57e6b commit ada166a
Showing 41 changed files with 1,699 additions and 385 deletions.
20 changes: 17 additions & 3 deletions .github/workflows/python-package.yml
@@ -76,14 +76,28 @@ jobs:
git fetch upstream
git remote -v
git status
changed_files=$(git diff --name-only --diff-filter=ACMRT master -- '*.py' '*.sh')
echo "Changed py/sh files:"
# Find the common ancestor commit of the current branch and upstream/master
merge_base=$(git merge-base HEAD upstream/master)
echo "merge_base=$merge_base"
# Diff the current branch against merge_base
changed_files=$(git diff --name-only --diff-filter=ACMRT $merge_base)
changed_files_py_sh=$(git diff --name-only --diff-filter=ACMRT $merge_base -- '*.py' '*.sh')
echo "Changed files:"
echo "$changed_files"
if [ -n "$changed_files" ]; then
echo "Changed py/sh files:"
if [ -n "$changed_files_py_sh" ]; then
export APPBUILDER_PYTHON_TESTS=True
echo "$changed_files_py_sh"
else
export APPBUILDER_PYTHON_TESTS=False
echo "No changed Python or Shell files detected"
fi
echo "APPBUILDER_PYTHON_TESTS=$APPBUILDER_PYTHON_TESTS" >> $GITHUB_ENV
pwd
- name: Install dependencies
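
The step above diffs against the merge base with `upstream/master` and enables the Python tests only when a `*.py` or `*.sh` file changed. A minimal Python sketch of that decision, assuming the list of changed paths has already been collected (for example from `git diff --name-only`). The helper name `should_run_python_tests` is hypothetical and not part of the repository:

```python
from typing import Iterable, List, Tuple

# Suffixes that the workflow treats as relevant for the Python test suite.
TEST_RELEVANT_SUFFIXES = (".py", ".sh")


def should_run_python_tests(changed_files: Iterable[str]) -> Tuple[bool, List[str]]:
    """Return (run_tests, matching_files) for a list of changed paths.

    Hypothetical helper: mirrors the shell logic that sets
    APPBUILDER_PYTHON_TESTS, but works on an in-memory list of paths.
    """
    matching = [path for path in changed_files if path.endswith(TEST_RELEVANT_SUFFIXES)]
    return bool(matching), matching


if __name__ == "__main__":
    run_tests, files = should_run_python_tests(["README.md", "appbuilder/__init__.py"])
    # The workflow writes the equivalent value into $GITHUB_ENV.
    print(f"APPBUILDER_PYTHON_TESTS={run_tests}")
```

Keeping the full list and the filtered list separate, as the workflow does, lets a docs-only change still be logged while the test suite is skipped.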
3 changes: 2 additions & 1 deletion README.md
@@ -76,7 +76,7 @@ AppBuilder-SDK not only provides the basic capability components offered by Baidu AI Cloud, but also

## How to install AppBuilder-SDK

#### Baidu AI Cloud Qianfan AppBuilder-SDK latest version 0.9.3 (2024-08-20)
#### Baidu AI Cloud Qianfan AppBuilder-SDK latest version 0.9.4 (2024-09-10)

For the Baidu AI Cloud Qianfan AppBuilder-SDK changelog and latest features, see our [release notes](/docs/quick_start/changelog.md)

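@@ -263,6 +263,7 @@ Hook:
| Basic capability components | [Serving basic components](/cookbooks/components/agent_runtime.ipynb) | Basic components can be deployed as a service via flask, or as an interactive front end via chainlit, and integrated into your system |
| Workflow orchestration | [Assistant SDK](/cookbooks/pipeline/assistant_function_call.ipynb) | Learn how to build an Agent application purely in code, with a custom workflow and FunctionCall |
| End-to-end applications | [AppBuilder Client SDK](/cookbooks/agent_builder.ipynb) | After creating and publishing an Agent application in the AppBuilder web console, integrate it into your system through the AppBuilderClient SDK |
| End-to-end applications | [Building an Agent that links device-side and cloud components with AppBuilder-ToolCall](/cookbooks/end2end_application/agent/tool_call.ipynb) | Learn about Agent and FunctionCall, and build an Agent that calls local components |
| End-to-end applications | [Resume screening assistant](/cookbooks/end2end_application/rag/rag.ipynb) | Parse, chunk, and index the resumes in a local resume library to screen resumes against a JD, then summarize the Top 1 match |
| End-to-end applications | [Enterprise Q&A system](/cookbooks/end2end_application/rag/qa_system_2_dialogue.ipynb) | Learn how to combine the SDK with the web platform for offline knowledge base production and online Q&A |
| Advanced applications | [Deploy a public cloud service with appbuilder_bce_deploy](/cookbooks/advanced_application/cloud_deploy.ipynb) | Deploy your own service to Baidu AI Cloud in one step; a public IP is generated automatically after deployment and can be linked to the API node of a workflow |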
2 changes: 1 addition & 1 deletion README_en.md
@@ -47,7 +47,7 @@ Baidu AI Cloud Qianfan AppBuilder-SDK offers the following essential features fo

## How to install?

#### The latest version of Baidu AI Cloud Qianfan AppBuilder SDK is 0.9.3 (2024-08-20)
#### The latest version of Baidu AI Cloud Qianfan AppBuilder SDK is 0.9.4 (2024-09-10)

Baidu AI Cloud Qianfan AppBuilder SDK ReleaseNote please refer to our [version description](/docs/quick_start/changelog.md)

2 changes: 1 addition & 1 deletion README_ja.md
@@ -44,7 +44,7 @@ For AI application developers, Baidu AI Cloud Qianfan AppBuilder-SDK

## How to install?

#### The latest version of Baidu AI Cloud Qianfan AppBuilder SDK is 0.9.3 (2024-08-20)
#### The latest version of Baidu AI Cloud Qianfan AppBuilder SDK is 0.9.4 (2024-09-10)

For the Baidu AI Cloud Qianfan AppBuilder SDK release notes, see the [version description](/docs/quick_start/changelog.md).

4 changes: 3 additions & 1 deletion appbuilder/__init__.py
@@ -13,7 +13,7 @@
# limitations under the License.


__version__ = '0.9.3'
__version__ = '0.9.4'

import os
import sys
@@ -90,6 +90,7 @@ def get_default_header():
from .core.components.retriever.baidu_vdb.baiduvdb_retriever import BaiduVDBVectorStoreIndex
from .core.components.retriever.baidu_vdb.baiduvdb_retriever import BaiduVDBRetriever
from .core.components.retriever.baidu_vdb.baiduvdb_retriever import TableParams
from .core.components.retriever.reranker.rerank import Reranker
from .core.components.ppt_generation_from_instruction.component import PPTGenerationFromInstruction
from .core.components.ppt_generation_from_paper.component import PPTGenerationFromPaper
from .core.components.ppt_generation_from_file.component import PPTGenerationFromFile
@@ -184,6 +185,7 @@ def get_default_header():
"BaiduVDBVectorStoreIndex",
"BaiduVDBRetriever",
"TableParams",
"Reranker",
"HallucinationDetection",

'DishRecognition',
74 changes: 42 additions & 32 deletions appbuilder/core/agent.py
@@ -322,8 +322,9 @@ def warp():
user_id = request.headers.get("X-Appbuilder-User-Id", None)

init_context(session_id=session_id, request_id=request_id, user_id=user_id)
logging.debug(
f"[request_id={request_id}, session_id={session_id}] message={message}, stream={stream}, data={data}")
logging.info(
f"request_id={request_id}, session_id={session_id}] message={message},"
f" stream={stream}, data={data}, start run...")

def gen_sse_resp():
with app.app_context():
@@ -332,7 +333,16 @@ def gen_sse_resp():
while retry_count < MAX_RETRY_COUNT:
try:
answer = self.chat(message, stream, **data)
except Exception as e: # self.chat raised an error; return immediately
code = 500 if not hasattr(e, "code") else e.code
err_resp = {"code": code, "message": "InternalServerError", "result": None}
logging.error(
f"request_id={request_id}, session_id={session_id}, err={e}, execute self.chat failed", exc_info=True)
yield "data: " + json.dumps(err_resp, ensure_ascii=False) + "\n\n"
return
else: # self.chat succeeded; start emitting streaming events
content_iterator = iter(answer.content)
answer.content = None
result = None
try:
for sub_content in content_iterator:
@@ -349,10 +359,21 @@ def gen_sse_resp():
received_first_packet = True
except Exception as e:
retry_count += 1
if not received_first_packet:
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}, "
f"retry_count={retry_count}", exc_info=True)
# If the first packet has not been received and the retry count is below the maximum, run self.chat once more
if not received_first_packet and retry_count < MAX_RETRY_COUNT:
continue
else:
raise e
else: # in all other cases, return an error response
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}, "
f"retry_count={retry_count}, received_first_packet={received_first_packet}"
, exc_info=True)
code = 500 if not hasattr(e, "code") else e.code
err_resp = {"code": code, "message": "InternalServerError", "result": None}
yield "data: " + json.dumps(err_resp, ensure_ascii=False) + "\n\n"
return
result.content = ""
yield "data: " + json.dumps({
"code": 0, "message": "",
@@ -362,41 +383,30 @@ def gen_sse_resp():
"answer_message": json.loads(result.json(exclude_none=True))
}
}, ensure_ascii=False) + "\n\n"
logging.info(
f"request_id={request_id}, session_id={session_id}]"
f"retry_count={retry_count}, success response", exc_info=True)
self.user_session._post_append()
break
except Exception as e:
code = 500 if not hasattr(
e, "code") else e.code
err_resp = {"code": code,
"message": "InternalServerError", "result": None}
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}", exc_info=True)
yield "data: " + json.dumps(err_resp, ensure_ascii=False) + "\n\n"
return # normal return

try:
if stream:
return Response(stream_with_context(gen_sse_resp()), 200,
{'Content-Type': 'text/event-stream; charset=utf-8'},
)
else:
if stream: # streaming
return Response(stream_with_context(gen_sse_resp()), 200,
{'Content-Type': 'text/event-stream; charset=utf-8'})
if not stream: # non-streaming
try:
answer = self.chat(message, stream, **data)
blocking_result = json.loads(
copy.deepcopy(answer).json(exclude_none=True))
logging.info(
f"[request_id={request_id}, session_id={session_id}] blocking_result={blocking_result}")
blocking_result = json.loads(copy.deepcopy(answer).json(exclude_none=True))
logging.debug(f"[request_id={request_id}, session_id={session_id}] blocking_result={blocking_result}")
self.user_session._post_append()
return {
"code": 0, "message": "",
"result": {"session_id": session_id, "answer_message": blocking_result}
}
except Exception as e:
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}", exc_info=True)
raise e
except Exception as e:
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}", exc_info=True)
raise e
except Exception as e:
logging.error(
f"[request_id={request_id}, session_id={session_id}] err={e}", exc_info=True)
code = 500 if not hasattr(e, "code") else e.code
return {"code": code, "message": "InternalServerError", "result": None}

app.add_url_rule(url_rule, 'chat', warp, methods=['POST'])
return app
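
The reworked `gen_sse_resp` retries `self.chat` only while no streamed packet has reached the client, and otherwise ends the stream with an error event instead of raising. A self-contained sketch of that control flow, under stated assumptions: `sse_events` and `make_stream` are hypothetical names, the payload shape is simplified, and the value of `MAX_RETRY_COUNT` is a guess because the diff does not show it.

```python
import json
from typing import Callable, Iterable, Iterator

MAX_RETRY_COUNT = 3  # assumed value; the real constant is not shown in the diff


def sse_events(make_stream: Callable[[], Iterable[str]],
               max_retries: int = MAX_RETRY_COUNT) -> Iterator[str]:
    """Yield SSE lines, retrying only while nothing has been sent yet."""
    retry_count = 0
    received_first_packet = False
    while retry_count < max_retries:
        try:
            for chunk in make_stream():
                yield "data: " + json.dumps({"code": 0, "result": chunk}) + "\n\n"
                received_first_packet = True
        except Exception as e:
            retry_count += 1
            # Retrying is only safe if the client has seen no output so far.
            if not received_first_packet and retry_count < max_retries:
                continue
            # Otherwise report the failure as a final SSE event and stop.
            code = getattr(e, "code", 500)
            err = {"code": code, "message": "InternalServerError", "result": None}
            yield "data: " + json.dumps(err) + "\n\n"
            return
        return  # stream finished normally
```

Emitting the error as a `data:` event matters because the HTTP status line has already been sent once streaming begins, so an exception raised at that point cannot change the response code.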
69 changes: 55 additions & 14 deletions appbuilder/core/components/doc_parser/doc_parser.py
@@ -26,7 +26,9 @@
from appbuilder.utils.logger_util import logger
from appbuilder.core._client import HTTPClient
from appbuilder.core.components.doc_parser.base import ParserConfig, ParseResult
from appbuilder.utils.trace.tracer_wrapper import components_run_trace, components_run_stream_trace
from appbuilder.utils.trace.tracer_wrapper import (
components_run_trace,
)


class DocParser(Component):
@@ -46,6 +48,7 @@ class DocParser(Component):
parse_result = parser(msg)
"""

name: str = "doc_parser"
tool_desc: Dict[str, Any] = {"description": "parse document content"}
base_url: str = "/v1/bce/xmind/parser"
@@ -61,16 +64,24 @@ def make_parse_result(self, response: Dict):
"""
Convert the content of the parse result into the ParseResult structure
"""
para_nodes = response["para_nodes"] if response["para_nodes"] is not None else []
para_nodes = (
response["para_nodes"] if response["para_nodes"] is not None else []
)
catalog = response["catalog"] if response["catalog"] is not None else []
pdf_data = response["pdf_data"]
title_node_ids = [title["node_id"] for title in catalog] if catalog else []
page_contents = []
for content in response["file_content"]:
page_content = {"page_num": content["page_num"], "page_width": int(content["page_size"]["width"]),
"page_height": int(content["page_size"]["height"]), "page_angle": int(content["page_angle"]),
"page_type": content["page_content"]["type"], "page_layouts": [], "page_titles": [],
"page_tables": []}
page_content = {
"page_num": content["page_num"],
"page_width": int(content["page_size"]["width"]),
"page_height": int(content["page_size"]["height"]),
"page_angle": int(content["page_angle"]),
"page_type": content["page_content"]["type"],
"page_layouts": [],
"page_titles": [],
"page_tables": [],
}
for layout_item in content["page_content"]["layout"]:
if layout_item["node_id"] in title_node_ids:
continue
@@ -81,8 +92,16 @@ def make_parse_result(self, response: Dict):
table_row = []
for i in range(len(layout_item["matrix"])):
cell_index = layout_item["matrix"][i]
row_markdown = "|" + "|".join(
[layout_item["children"][index]["text"] for index in set(cell_index)]) + "|"
row_markdown = (
"|"
+ "|".join(
[
layout_item["children"][index]["text"]
for index in set(cell_index)
]
)
+ "|"
)
if i != len(layout_item["matrix"]) - 1:
row_markdown += "\n"
table_row.append(row_markdown)
@@ -94,9 +113,18 @@ def make_parse_result(self, response: Dict):
for title in catalog:
page_num = title["position"][0]["pageno"]
page_contents[page_num]["page_titles"].append(
{"text": title["text"], "type": title["level"], "box": title["position"][0]["box"],
"node_id": title["node_id"]})
parse_result = {"para_node_tree": para_nodes, "page_contents": page_contents, "pdf_data": pdf_data}
{
"text": title["text"],
"type": title["level"],
"box": title["position"][0]["box"],
"node_id": title["node_id"],
}
)
parse_result = {
"para_node_tree": para_nodes,
"page_contents": page_contents,
"pdf_data": pdf_data,
}
# parse_result = ParseResult.parse_obj(parse_result)
return parse_result

@@ -123,13 +151,26 @@ def run(self, input_message: Message, return_raw=False) -> Message:
payload = json.dumps({"file_list": [param]})
headers = self.http_client.auth_header()
headers["Content-Type"] = "application/json"
response = self.http_client.session.post(url=self.http_client.service_url(self.base_url), headers=headers, data=payload)
response = self.http_client.session.post(
url=self.http_client.service_url(self.base_url),
headers=headers,
data=payload,
)
self.http_client.check_response_header(response)
self.http_client.check_response_json(response.json())
request_id = self.http_client.response_request_id(response)
response = response.json()
if response["error_code"] != 0:
logger.error("doc parser service log_id {} err {}".format(response["log_id"], response["error_msg"]))
raise AppBuilderServerException(response["error_msg"])
logger.error(
"doc parser service log_id {} err {}".format(
response["log_id"], response["error_msg"]
)
)
raise AppBuilderServerException(
request_id=request_id,
service_err_code=response["error_code"],
service_err_message=response["error_msg"],
)
parse_result = self.make_parse_result(response["result"]["result_list"][0])
if return_raw:
parse_result["raw"] = response
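
The table branch of `make_parse_result` builds one Markdown row per entry of `matrix`, where each entry lists the child-cell indices for that row and repeated indices (merged cells) are collapsed with `set(...)`. A small sketch of the same idea, assuming the layout shape visible in the diff (`matrix` and `children[...]["text"]`). Two things here are assumptions rather than the commit's code: the function name `table_rows_to_markdown` is hypothetical, and `dict.fromkeys` is swapped in for `set` so that cell order is preserved explicitly. Joining rows with newlines is also assumed to match the elided part of the hunk.

```python
from typing import Any, Dict, List


def table_rows_to_markdown(layout_item: Dict[str, Any]) -> str:
    """Render a parsed table layout as pipe-delimited Markdown rows."""
    rows: List[str] = []
    for cell_index in layout_item["matrix"]:
        # dict.fromkeys drops duplicate indices while keeping first-seen order.
        unique_cells = dict.fromkeys(cell_index)
        texts = [layout_item["children"][i]["text"] for i in unique_cells]
        rows.append("|" + "|".join(texts) + "|")
    return "\n".join(rows)


if __name__ == "__main__":
    sample = {
        "matrix": [[0, 1], [2, 2]],  # second row is one merged cell
        "children": [{"text": "name"}, {"text": "age"}, {"text": "merged"}],
    }
    print(table_rows_to_markdown(sample))
```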