A Python framework for multi-modal document understanding with Amazon Bedrock

awslabs/rhubarb

Rhubarb

Rhubarb is a lightweight Python framework that makes it easy to build document understanding applications using multi-modal Large Language Models (LLMs) and embedding models. Rhubarb is built from the ground up to work with Amazon Bedrock. For document processing it supports multiple foundation models, including the Anthropic Claude 3 multi-modal models and the Amazon Nova models; for embeddings it uses the Amazon Titan Multi-modal Embedding model.

What can I do with Rhubarb?

Visit Rhubarb documentation.

Rhubarb can perform multiple document processing tasks, such as:

  • ✅ Document Q&A
  • ✅ Streaming chat with documents (Q&A)
  • ✅ Document Summarization
    • 🚀 Page level summaries
    • 🚀 Full summaries
    • 🚀 Summaries of specific pages
    • 🚀 Streaming Summaries
  • ✅ Structured data extraction
  • ✅ Extraction Schema creation assistance
  • ✅ Named entity recognition (NER)
    • 🚀 With 50 built-in common entities
  • ✅ PII recognition with built-in entities
  • ✅ Figure and image understanding from documents
    • 🚀 Explain charts, graphs, and figures
    • 🚀 Perform table reasoning (as figures)
  • ✅ Document Classification with vector sampling using multi-modal embedding models
  • ✅ Logs token usage to help keep track of costs

Rhubarb comes with built-in system prompts that make it easy to use for a number of different document understanding use cases. You can customize Rhubarb by passing in your own system prompts. It supports output generation that conforms exactly to a JSON schema, which makes it easy to integrate into downstream applications.
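
To make the schema-based output idea concrete, here is a minimal sketch that defines a small JSON Schema as a Python dict and checks a response against it using only the standard library. The field names (`employee_name`, `employee_id`), the sample response, and the `missing_required` helper are all hypothetical and are not part of Rhubarb; how a schema is actually passed to Rhubarb should be taken from the Rhubarb documentation.

```python
import json

# Hypothetical schema for illustration only; the field names are assumptions.
EMPLOYEE_SCHEMA = {
    "type": "object",
    "properties": {
        "employee_name": {"type": "string", "description": "Full name of the employee"},
        "employee_id": {"type": "string", "description": "Employee identifier"},
    },
    "required": ["employee_name", "employee_id"],
}


def missing_required(schema: dict, payload: dict) -> list[str]:
    """Return the required top-level keys that are absent from payload."""
    return [key for key in schema.get("required", []) if key not in payload]


# A made-up model response, used only to exercise the check.
sample_response = json.loads('{"employee_name": "Jane Doe"}')
print(missing_required(EMPLOYEE_SCHEMA, sample_response))  # ['employee_id']
```

A downstream application could run a check like this before trusting a response, although a full JSON Schema validator would also verify types and nested structure.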

  • Supports PDF, TIFF, PNG, JPG, and DOCX files (support for Excel, PowerPoint, CSV, WebP, and EML files is coming soon)
  • Performs document-to-image conversion internally to work with the multi-modal models
  • Works on local files or files stored in S3
  • Supports specifying page numbers for multi-page documents
  • Supports chat-history based chat for documents
  • Supports streaming and non-streaming mode
  • Supports the Amazon Bedrock Converse API
  • Supports cross-region inference

Installation

Start by installing Rhubarb using pip.

pip install pyrhubarb

Usage

Create a boto3 session.

import boto3
session = boto3.Session()

Call Rhubarb

Local file

from rhubarb import DocAnalysis

da = DocAnalysis(file_path="./path/to/doc/doc.pdf", 
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
resp

With file in Amazon S3

from rhubarb import DocAnalysis

da = DocAnalysis(file_path="s3://path/to/doc/doc.pdf", 
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
resp
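
The two examples differ only in the form of `file_path`. As a rough illustration of how an `s3://` URI breaks down into a bucket and an object key, the sketch below uses the standard library's `urlparse`. The `split_s3_uri` helper and the bucket name `my-bucket` are hypothetical; this is not how Rhubarb is known to resolve paths internally.

```python
from urllib.parse import urlparse


def split_s3_uri(file_path: str):
    """Return (bucket, key) for an s3:// URI, or None for any other path."""
    parsed = urlparse(file_path)
    if parsed.scheme != "s3":
        return None  # treated as a local file path in this sketch
    # The URI authority is the bucket; the remainder is the object key.
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_uri("s3://my-bucket/docs/doc.pdf"))  # ('my-bucket', 'docs/doc.pdf')
print(split_s3_uri("./path/to/doc/doc.pdf"))        # None
```

Note that the first segment after `s3://` is always read as the bucket name, so a placeholder such as `s3://path/to/doc/doc.pdf` refers to a bucket named `path`.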

For more usage examples see cookbooks.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.