<h2><img align="center" src="https://github.com/user-attachments/assets/cbe0d62f-c856-4e0b-b3ee-6184b7c4d96f">NVIDIA AI Blueprint: PDF to Podcast</h2>

## Overview
A microservice-driven implementation for transforming PDFs into engaging audio content.

For a deeper dive into the system architecture, please see the diagram below:

View a mermaid diagram of our system [here](docs/README.md).

## Quick Start Guide

1. **Environment Variables**:
Set the following environment variables:
```bash
# Create .env file with required variables
echo "ELEVENLABS_API_KEY=your_key" > .env
echo "NIM_KEY=your_key" >> .env
echo "MAX_CONCURRENT_REQUESTS=1" >> .env
```

> **Note:** the NVIDIA Eleven Labs API key used in this example can handle concurrent requests. For local development, set `MAX_CONCURRENT_REQUESTS=1` to avoid rate-limiting issues. Generate your own API key for free [here](https://elevenlabs.io/).
2. **Install Dependencies**:
We use [UV](https://pypi.org/project/uv/) to manage Python dependencies.

```bash
make uv
```
If you open a new terminal window and want to re-use the same environment, run `make uv` again.

3. **Start Development Server**:
Start the entire stack with:

```bash
make all-services
```
You can also set `DETACH=1` to run the services in detached mode, which allows you to continue using your terminal while the services are running.
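
   For example, a minimal sketch, assuming the Makefile reads `DETACH` from the environment:

   ```bash
   # Start the full stack in the background and keep the terminal free
   DETACH=1 make all-services
   ```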

4. **Run Podcast Generation**:

```bash
source .venv/bin/activate
python tests/test.py --target <pdf1.pdf> --context <pdf2.pdf>
```


This will generate a 2-person podcast. Add the `--monologue` flag to generate a 1-person podcast. Check out the test file for more examples.
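
For instance, a single-speaker run might look like the sketch below (see `tests/test.py` for the exact flags it accepts):

```bash
# Generate a 1-person monologue from a single target PDF
python tests/test.py --target <pdf1.pdf> --monologue
```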

## Customization

### Host the PDF service on a separate machine

This blueprint uses [docling](https://github.com/DS4SD/docling) as the default PDF extraction service.

To run the PDF extraction service on a separate machine, add the following to your `.env` file:
```bash
echo "MODEL_API_URL=<nv-ingest-url>/v1" >> .env
echo "MODEL_API_URL=<pdf-model-service-url" >> .env
```
The `make model-dev` target will let you spin up only the docling service.
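
As a hypothetical end-to-end sketch (the host and port are placeholders, not values from this repo):

```bash
# On the machine running the main stack: point at the remote PDF service
echo "MODEL_API_URL=http://pdf-host.example.com:8000" >> .env

# On the separate machine: spin up only the docling service
make model-dev
```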

### Use a Self-hosted NIM

We currently use an ensemble of three LLMs to generate these podcasts. Out of the box, we recommend using the Llama 3.1-70B NIM. If you would like to use a different model, update the `models.json` file with the desired model. The default `models.json` calls an NVIDIA-hosted NIM. Feel free to use it as you develop locally. When you deploy, please use our NIM API Catalog endpoints.
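
As a purely illustrative sketch (check the actual schema and identifier strings in `models.json` before editing; the model ids below are assumptions based on NIM catalog naming):

```bash
# Hypothetical: swap the 70B model for the 8B variant wherever it appears
# (GNU sed shown; on macOS use `sed -i ''`)
sed -i 's#meta/llama-3.1-70b-instruct#meta/llama-3.1-8b-instruct#g' models.json
```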

### Change the Default Models and GPU Assignments

Due to our design, it is relatively easy to swap out different pieces of our stack to optimize for GPU usage and available hardware. For example, you could swap each model with the smaller Llama 3.1-8B NIM and disable GPU usage for `docling` in `docker-compose.yaml`.
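
For example, after editing `docker-compose.yaml` you could verify the change (the service name is an assumption; check the compose file for the real one):

```bash
# Confirm the docling service no longer reserves a GPU in the rendered config
docker compose config | grep -A10 "docling"
```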

## Development Tools

### Enable Tracing
We expose a Jaeger instance at `http://localhost:16686/` for tracing. This is useful for debugging and monitoring the system.
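
A quick way to confirm the instance is up (same port as above):

```bash
# Expect a successful response from the Jaeger UI when the stack is running
curl -sSf http://localhost:16686/ > /dev/null && echo "Jaeger UI is reachable"
```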

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests: `python tests/test.py --target <pdf1> --context <pdf2>`
5. Run linting: `make ruff`
6. Submit a pull request

### Code Quality
The project uses `ruff` for linting and formatting. You must run `make ruff` before your PR can be merged:

```bash
make ruff # Runs both lint and format
```

### CI/CD
We use GitHub Actions for CI/CD. We run the following actions:

- `ruff`: Runs linting and formatting
- `pr-test`: Runs an end-to-end podcast test on the PR
- `build-and-push`: Builds and pushes a new container image to the remote repo. This is used to update production deployments.
## Security Considerations

- Implement proper certificate management
- Configure appropriate security headers
- Follow other web security best practices
