Release PaddleSeg release/2.9.2, adding new models and supporting all-in-one full development tools (#3815)
Showing 3 changed files with 209 additions and 15 deletions.
@@ -0,0 +1,97 @@
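## Contents
- [Introduction to all-in-one, full-workflow development](#1)
- [Supported image segmentation capabilities](#2)
- [Image segmentation pipelines and tutorials](#3)
- [Image segmentation single-function modules and tutorials](#4)

<a name="1"></a>

## 1. Introduction to all-in-one, full-workflow development

[PaddleX](https://github.com/PaddlePaddle/PaddleX/tree/release/3.0-beta1), PaddlePaddle's all-in-one, full-workflow development tool, builds on PaddleSeg to provide **all-in-one, full-workflow** development for image segmentation. It makes models simple and efficient to use, combine, and customize, which **shortens model development time**, **lowers development difficulty**, and speeds up the adoption of models in industry. Its main features are:

* 🎨 **Rich model library, one-call usage**: the **19 models** for general semantic segmentation and image anomaly detection are organized into 2 model pipelines that can be called through a minimal **Python API** to try out model results quickly. The same API also supports **200+ models** for image classification, object detection, intelligent analysis of text images, general OCR, time-series forecasting, and other tasks, grouped into 20+ single-function modules that developers can **combine**.

* 🚀 **Higher efficiency, lower barrier to entry**: models can be used, combined, and customized through either **unified commands** or a **graphical interface**. Supported deployment modes include **high-performance deployment, service deployment, and edge deployment**. Model development can also **switch seamlessly** between mainstream hardware such as **NVIDIA GPUs, Kunlunxin, Ascend, Cambricon, and Hygon**.

>**Note**: PaddleX aims to support model training, inference, and deployment at the pipeline level. A model pipeline is a predefined development workflow for a specific AI task; it combines single models (single-function modules), each of which can complete one type of task on its own.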
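<a name="2"></a>

## 2. Supported image segmentation capabilities

Both image segmentation pipelines in PaddleX support local **quick inference**, and some also offer an **online demo**, so you can quickly evaluate each pipeline's pretrained models. If you are satisfied with the results, you can go straight to [high-performance deployment](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_deploy/high_performance_deploy.md), [service deployment](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_deploy/service_deploy.md), or [edge deployment](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_deploy/lite_deploy.md). If not, you can use the pipeline's **custom development** capability to improve the results. For the complete pipeline development workflow, see the [PaddleX pipeline usage overview](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_usage/pipeline_develop_guide.md) or the tutorial for each pipeline.

In addition, PaddleX offers a full-workflow development tool based on a [cloud-based graphical development interface](https://aistudio.baidu.com/pipeline/mine). For details, see the tutorial ["Zero-barrier development of industrial-grade AI models"](https://aistudio.baidu.com/practical/introduce/546656605663301).
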
<table>
<tr>
<td></td>
<td>Online demo</td>
<td>Quick inference</td>
<td>High-performance deployment</td>
<td>Service deployment</td>
<td>Edge deployment</td>
<td>Custom development</td>
<td><a href = "https://aistudio.baidu.com/pipeline/mine">Xinghe zero-code pipeline</a></td>
</tr>
<tr>
<td>General semantic segmentation</td>
<td><a href = "https://aistudio.baidu.com/community/app/100061/webUI?source=appMineRecent">Link</a></td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
<td>Image anomaly detection</td>
<td>🚧</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
<td>🚧</td>
<td>✅</td>
<td>🚧</td>
</tr>
</table>

> ❗Note: all the features above are implemented on GPU/CPU. PaddleX also supports quick inference and custom development on mainstream hardware such as Kunlun, Ascend, Cambricon, and Hygon. The table below lists pipeline support in detail; for the specific models supported, see the [model list (NPU)](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/support_list/model_list_npu.md) // [model list (XPU)](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/support_list/model_list_xpu.md) // [model list (MLU)](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/support_list/model_list_mlu.md) // [model list (DCU)](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/support_list/model_list_dcu.md). We are also adapting more models and working to bring high-performance and service deployment to mainstream hardware.

**🚀 Support for Chinese domestic hardware**

<table>
<tr>
<th>Pipeline</th>
<th>Ascend 910B</th>
<th>Kunlun R200/R300</th>
<th>Cambricon MLU370X8</th>
<th>Hygon Z100</th>
</tr>
<tr>
<td>General semantic segmentation</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
<td>✅</td>
</tr>
</table>

<a name="3"></a>

## 3. Image segmentation pipelines and tutorials

- **General semantic segmentation pipeline**: [Tutorial](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.md)
- **Image anomaly detection pipeline**: [Tutorial](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md)

<a name="4"></a>

## 4. Image segmentation single-function modules and tutorials

- **Semantic segmentation module**: [Tutorial](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/module_usage/tutorials/cv_modules/semantic_segmentation.md)
- **Image anomaly detection module**: [Tutorial](https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/module_usage/tutorials/cv_modules/anomaly_detection.md)
@@ -0,0 +1,101 @@
# Quick start

>**Note:**
>* [PaddleX](https://github.com/PaddlePaddle/PaddleX/tree/release/3.0-beta1), PaddlePaddle's all-in-one, full-workflow development tool, builds on PaddleSeg to provide **all-in-one, full-workflow** development for image segmentation. It makes models simple and efficient to use, combine, and customize.
>* PaddleX aims to support model training, inference, and deployment at the pipeline level. A model pipeline is a predefined development workflow for a specific AI task; it combines single models (single-function modules), each of which can complete one type of task on its own. This document covers quick usage of the **image segmentation pipelines**; for quick usage of single-function modules and other features, see the relevant sections of [PaddleSeg all-in-one, full-workflow development](./overview.md).

### 🛠️ Installation

> ❗Before installing PaddleX, make sure you have a basic **Python runtime environment**.
* **Install PaddlePaddle**
```bash
# CPU
python -m pip install paddlepaddle

# GPU; this command applies only to machines with CUDA 11.8
python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# GPU; this command applies only to machines with CUDA 12.3
python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
```
> ❗ For more PaddlePaddle wheel versions, see the [PaddlePaddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html).
* **Install PaddleX**

```bash
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/whl/paddlex-3.0.0.beta1-py3-none-any.whl
```

> ❗ For other installation methods, see the [PaddleX installation guide](./installation/installation.md).
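
The choice between the three PaddlePaddle install commands above depends only on the CUDA version. The sketch below is a hypothetical helper (the name `paddle_install_command` is not part of PaddleX or PaddlePaddle) that returns the matching `pip` arguments; it assumes only the package names and index URLs listed in this document.

```python
import sys

# Index URLs copied from the install commands above, keyed by CUDA version.
GPU_INDEX_URLS = {
    "11.8": "https://www.paddlepaddle.org.cn/packages/stable/cu118/",
    "12.3": "https://www.paddlepaddle.org.cn/packages/stable/cu123/",
}


def paddle_install_command(cuda_version=None):
    """Return the pip argv for PaddlePaddle; None selects the CPU build."""
    base = [sys.executable, "-m", "pip", "install"]
    if cuda_version is None:
        return base + ["paddlepaddle"]
    if cuda_version not in GPU_INDEX_URLS:
        # Only the two CUDA versions named in this document are covered here.
        raise ValueError(f"no install command listed for CUDA {cuda_version}")
    return base + ["paddlepaddle-gpu==3.0.0b1", "-i", GPU_INDEX_URLS[cuda_version]]


# Print the command that would be used for a CUDA 11.8 machine.
print(" ".join(paddle_install_command("11.8")))
```

The returned list can be passed to `subprocess.run(..., check=True)` to perform the installation; for any other CUDA version, consult the PaddlePaddle website rather than extending the table by guesswork.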
### 💻 Command-line usage

A single command is enough to try out a pipeline. The unified command format is:

```bash
paddlex --pipeline [pipeline name] --input [input image] --device [device]
```

Only three parameters are required:
* `pipeline`: the pipeline name
* `input`: local path or URL of the input image to process
* `device`: the GPU to use (for example, `gpu:0` selects GPU 0); `cpu` selects the CPU
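
When the command is launched from a script, the three parameters can be assembled programmatically. The following is a minimal sketch: `build_paddlex_command` is a hypothetical helper, `demo.png` is a placeholder file name, and the only thing taken from PaddleX is the `--pipeline`, `--input`, and `--device` flags described above.

```python
import shlex


def build_paddlex_command(pipeline, input_path, device="cpu"):
    """Return the argv list for the unified `paddlex` command format."""
    return [
        "paddlex",
        "--pipeline", pipeline,
        "--input", input_path,
        "--device", device,
    ]


# "demo.png" is a placeholder; substitute a real local path or URL.
cmd = build_paddlex_command("OCR", "demo.png", device="gpu:0")
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell-quoting problems when the input path contains spaces.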

Taking the general OCR pipeline as an example:
```bash
paddlex --pipeline OCR --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png --device gpu:0
```
<details>
<summary><b>👉 Click to view the output </b></summary>

```bash
{'img_path': '/root/.paddlex/predict_input/general_ocr_002.png', 'dt_polys': [[[5, 12], [88, 10], [88, 29], [5, 31]], [[208, 14], [249, 14], [249, 22], [208, 22]], [[695, 15], [824, 15], [824, 60], [695, 60]], [[158, 27], [355, 23], [356, 70], [159, 73]], [[421, 25], [659, 19], [660, 59], [422, 64]], [[337, 104], [460, 102], [460, 127], [337, 129]], [[486, 103], [650, 100], [650, 125], [486, 128]], [[675, 98], [835, 94], [835, 119], [675, 124]], [[64, 114], [192, 110], [192, 131], [64, 134]], [[210, 108], [318, 106], [318, 128], [210, 130]], [[82, 140], [214, 138], [214, 163], [82, 165]], [[226, 136], [328, 136], [328, 161], [226, 161]], [[404, 134], [432, 134], [432, 161], [404, 161]], [[509, 131], [570, 131], [570, 158], [509, 158]], [[730, 138], [771, 138], [771, 154], [730, 154]], [[806, 136], [817, 136], [817, 146], [806, 146]], [[342, 175], [470, 173], [470, 197], [342, 199]], [[486, 173], [616, 171], [616, 196], [486, 198]], [[677, 169], [813, 166], [813, 191], [677, 194]], [[65, 181], [170, 177], [171, 202], [66, 205]], [[96, 208], [171, 205], [172, 230], [97, 232]], [[336, 220], [476, 215], [476, 237], [336, 242]], [[507, 217], [554, 217], [554, 236], [507, 236]], [[87, 229], [204, 227], [204, 251], [87, 254]], [[344, 240], [483, 236], [483, 258], [344, 262]], [[66, 252], [174, 249], [174, 271], [66, 273]], [[75, 279], [264, 272], [265, 297], [76, 303]], [[459, 297], [581, 295], [581, 320], [459, 322]], [[101, 314], [210, 311], [210, 337], [101, 339]], [[68, 344], [165, 340], [166, 365], [69, 368]], [[345, 350], [662, 346], [662, 368], [345, 371]], [[100, 459], [832, 444], [832, 465], [100, 480]]], 'dt_scores': [0.8183103704439653, 0.7609575621092027, 0.8662357274035412, 0.8619508290334809, 0.8495855993183273, 0.8676840017933314, 0.8807986687956436, 0.822308525056085, 0.8686617037621976, 0.8279022169854463, 0.952332847006758, 0.8742692553015098, 0.8477013022907575, 0.8528771493227294, 0.7622965906848765, 0.8492388224448705, 0.8344203789965632, 
0.8078477124353284, 0.6300434587457232, 0.8359967356998494, 0.7618617265751318, 0.9481573079350023, 0.8712182945408912, 0.837416955846334, 0.8292475059403851, 0.7860382856406026, 0.7350527486717117, 0.8701022267947695, 0.87172526903969, 0.8779847108088126, 0.7020437651809734, 0.6611684983372949], 'rec_text': ['www.997', '151', 'PASS', '登机牌', 'BOARDING', '舱位 CLASS', '序号SERIALNO.', '座位号SEATNO', '航班 FLIGHT', '日期DATE', 'MU 2379', '03DEC', 'W', '035', 'F', '1', '始发地FROM', '登机口 GATE', '登机时间BDT', '目的地TO', '福州', 'TAIYUAN', 'G11', 'FUZHOU', '身份识别IDNO.', '姓名NAME', 'ZHANGQIWEI', '票号TKTNO.', '张祺伟', '票价FARE', 'ETKT7813699238489/1', '登机口于起飞前10分钟关闭GATESCLOSE1OMINUTESBEFOREDEPARTURETIME'], 'rec_score': [0.9617719054222107, 0.4199012815952301, 0.9652514457702637, 0.9978302121162415, 0.9853208661079407, 0.9445787072181702, 0.9714463949203491, 0.9841841459274292, 0.9564052224159241, 0.9959094524383545, 0.9386572241783142, 0.9825271368026733, 0.9356589317321777, 0.9985442161560059, 0.3965512812137604, 0.15236201882362366, 0.9976775050163269, 0.9547433257102966, 0.9974752068519592, 0.9646636843681335, 0.9907559156417847, 0.9895358681678772, 0.9374122023582458, 0.9909093379974365, 0.9796401262283325, 0.9899340271949768, 0.992210865020752, 0.9478569626808167, 0.9982215762138367, 0.9924325942993164, 0.9941263794898987, 0.96443772315979]} | ||
......
```

The visualization is shown below:

![alt text](./imgs/boardingpass.png)

</details>

For other pipelines, just set the `pipeline` parameter to the corresponding pipeline name. The table below lists the command for each pipeline:

| Pipeline | Command |
|----------|---------|
| Document scene information extraction | |
| General image classification | `paddlex --pipeline image_classification --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg --device gpu:0` |
| General OCR | `paddlex --pipeline OCR --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png --device gpu:0` |
| General table recognition | `paddlex --pipeline table_recognition --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg --device gpu:0` |
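
### 📝 Python script usage

A few lines of code are enough to run quick inference with a pipeline. The unified Python script format is:
```python
from paddlex import create_pipeline

# Replace the bracketed placeholders with a pipeline name and an input image
pipeline = create_pipeline(pipeline="[pipeline name]")
output = pipeline.predict("[input image]")
for batch in output:
    for item in batch:
        res = item['result']
        res.print()                    # print the prediction result
        res.save_to_img("./output/")   # save the visualization image
        res.save_to_json("./output/")  # save the result as JSON
```
The script performs the following steps:

* `create_pipeline()` instantiates the pipeline object
* The image is passed to the pipeline object's `predict` method to run inference
* The prediction results are processed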
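
The three steps can be wrapped in a reusable function. This is a sketch rather than PaddleX's own API: `run_pipeline` is a hypothetical helper, and it relies only on the calls shown in the script above (`create_pipeline`, `predict`, `print`, `save_to_img`, `save_to_json`). The factory is passed in as an argument so the function can be tried with a stand-in object when PaddleX is not installed.

```python
from pathlib import Path


def run_pipeline(create_pipeline, pipeline_name, image, output_dir="./output/"):
    """Run one pipeline on one input, save every result, and return them.

    `create_pipeline` is normally `paddlex.create_pipeline`.
    """
    # Assumption: creating the directory up front is harmless. The source does
    # not say whether save_to_img/save_to_json create it themselves.
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    pipeline = create_pipeline(pipeline=pipeline_name)  # step 1: instantiate
    results = []
    for batch in pipeline.predict(image):               # step 2: run inference
        for item in batch:
            res = item["result"]                        # step 3: process results
            res.print()
            res.save_to_img(output_dir)
            res.save_to_json(output_dir)
            results.append(res)
    return results
```

With PaddleX installed, a call might look like `run_pipeline(create_pipeline, "OCR", "general_ocr_002.png")` after `from paddlex import create_pipeline`.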

For other pipelines, just set the `pipeline` argument of `create_pipeline()` to the corresponding pipeline name. The table below lists each pipeline's argument value and a link to detailed usage:

| Pipeline | Argument | Details |
|----------|----------------------|------|
| General OCR pipeline | `OCR` | [Python script usage for the general OCR pipeline](./pipeline_usage/OCR.md#222-python脚本方式集成) |
| General table recognition pipeline | `table_recognition` | [Python script usage for the general table recognition pipeline](./pipeline_usage/table_recognition.md#22-python脚本方式集成) |
| PP-ChatOCRv3 pipeline | `pp_chatocrv3` | [Python script usage for the PP-ChatOCRv3 pipeline](./pipeline_usage/document_scene_information_extraction.md#222-python脚本方式集成) |