Rhubarb

Rhubarb is a lightweight Python framework that makes it easy to build document understanding applications using multi-modal Large Language Models (LLMs) and embedding models. Rhubarb was built from the ground up to work with Amazon Bedrock and supports multiple foundation models, including Anthropic Claude V3 multi-modal language models and Amazon Nova models for document processing, along with the Amazon Titan Multi-modal Embedding model for embeddings.

What can I do with Rhubarb?

Visit Rhubarb documentation.

Rhubarb can perform multiple document processing tasks, such as:

✅ Document Q&A
✅ Streaming chat with documents (Q&A)
✅ Document Summarization
- 🚀 Page level summaries
- 🚀 Full summaries
- 🚀 Summaries of specific pages
- 🚀 Streaming Summaries
✅ Structured data extraction
✅ Extraction Schema creation assistance
✅ Named entity recognition (NER)
- 🚀 With 50 built-in common entities
✅ PII recognition with built-in entities
✅ Figure and image understanding from documents
- 🚀 Explain charts, graphs, and figures
- 🚀 Perform table reasoning (as figures)
✅ Document Classification with vector sampling using multi-modal embedding models
✅ Logs token usage to help keep track of costs
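
As a rough illustration of the token-usage point above, the sketch below accumulates token counts across several calls and turns them into a cost estimate. It assumes usage records shaped like the `usage` field of a Bedrock Converse API response (`inputTokens`, `outputTokens`); `UsageTracker` is a hypothetical helper, not part of Rhubarb, and the prices are placeholders you would replace with the rates for your model and region.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulates token counts across calls (hypothetical helper, not a Rhubarb class)."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage: dict) -> None:
        # Assumes Converse-style keys; missing keys count as zero.
        self.input_tokens += usage.get("inputTokens", 0)
        self.output_tokens += usage.get("outputTokens", 0)

    def estimated_cost(self, input_price_per_1k: float, output_price_per_1k: float) -> float:
        # Prices are supplied by the caller; none are hard-coded here.
        input_cost = (self.input_tokens / 1000) * input_price_per_1k
        output_cost = (self.output_tokens / 1000) * output_price_per_1k
        return input_cost + output_cost


tracker = UsageTracker()
# Made-up usage records for two calls, for illustration only.
tracker.add({"inputTokens": 1200, "outputTokens": 300})
tracker.add({"inputTokens": 800, "outputTokens": 200})
print(tracker.input_tokens, tracker.output_tokens)
```

Keeping the prices as arguments avoids baking in rates that change over time or differ between models.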

Rhubarb comes with built-in system prompts that make it easy to use for a number of different document understanding use cases. You can customize Rhubarb by passing in your own system prompts. It supports output generation that conforms exactly to a JSON schema, which makes it easy to integrate into downstream applications.
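
To make the schema-based output idea concrete, here is a minimal sketch of what a downstream application might do with a schema-conforming response. The schema, its field names, and the sample response are all hypothetical and chosen only for illustration; `conforms` is a shallow stand-in for a real JSON Schema validator and does not show how Rhubarb itself enforces a schema.

```python
import json

# Hypothetical extraction schema; field names are illustrative, not from Rhubarb.
employee_schema = {
    "type": "object",
    "properties": {
        "employee_name": {"type": "string"},
        "employee_id": {"type": "integer"},
    },
    "required": ["employee_name", "employee_id"],
}

# Maps JSON Schema type names to Python types for a shallow check.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(payload, schema: dict) -> bool:
    """Shallow check: required keys are present and top-level types match."""
    if not isinstance(payload, dict):
        return False
    if any(key not in payload for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key not in payload:
            continue
        value = payload[key]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and spec.get("type") != "boolean":
            return False
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            return False
    return True


# Made-up response text standing in for model output.
sample = json.loads('{"employee_name": "Jane Doe", "employee_id": 1042}')
print(conforms(sample, employee_schema))
```

Because the output shape is known in advance, the consuming code can read fields directly instead of parsing free-form text.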

Supports PDF, TIFF, PNG, JPG, and DOCX files (support for Excel, PowerPoint, CSV, WebP, and EML files is coming soon)
Performs document to image conversion internally to work with the multi-modal models
Works on local files or files stored in S3
Supports specifying page numbers for multi-page documents
Supports chat-history based chat for documents
Supports streaming and non-streaming mode
Supports Converse API
Supports Cross-Region Inference

Installation

Start by installing Rhubarb using pip.

pip install pyrhubarb

Usage

Create a boto3 session.

import boto3
session = boto3.Session()

Call Rhubarb

Local file

from rhubarb import DocAnalysis

# Analyze a local file; `session` is the boto3 session created above.
da = DocAnalysis(file_path="./path/to/doc/doc.pdf",
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
print(resp)

With file in Amazon S3

from rhubarb import DocAnalysis

# Analyze a file stored in S3; `session` must have permission to read the object.
da = DocAnalysis(file_path="s3://path/to/doc/doc.pdf",
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
print(resp)
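
The two examples above differ only in the form of `file_path`. As a small sketch of how an `s3://` path breaks down, the helper below splits a URI into bucket and key using only the standard library. `split_s3_uri` is a hypothetical name and not a Rhubarb function, and the bucket and key in the usage lines are made up.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_s3_uri(file_path: str) -> Optional[Tuple[str, str]]:
    """Return (bucket, key) for an s3:// URI, or None for any other path."""
    parsed = urlparse(file_path)
    if parsed.scheme != "s3":
        return None
    # The first segment after s3:// is the bucket; the rest is the object key.
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_uri("s3://my-bucket/reports/doc.pdf"))
print(split_s3_uri("./path/to/doc/doc.pdf"))
```

Under this reading, the placeholder `s3://path/to/doc/doc.pdf` names a bucket called `path` and an object key of `to/doc/doc.pdf`, so substitute your own bucket name as the first segment.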

For more usage examples, see the cookbooks.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
