Commit

Fix broken links (#374)
nkmcalli authored Jan 23, 2025
1 parent de6983c commit 10cabf2
Showing 3 changed files with 26 additions and 11 deletions.
27 changes: 17 additions & 10 deletions README.md
@@ -20,7 +20,7 @@ NVIDIA Ingest enables parallelization of the process of splitting documents into

## Introduction

### What NVIDIA-Ingest is ✔️
### What NVIDIA-Ingest Is ✔️

A microservice that:

@@ -30,7 +30,7 @@ A microservice that:
- Supports multiple methods of extraction for each document type in order to balance trade-offs between throughput and accuracy. For example, for PDF documents we support extraction via pdfium, Unstructured.io, and Adobe Content Extraction Services.
- Supports various types of pre and post processing operations, including text splitting and chunking; transform, and filtering; embedding generation, and image offloading to storage.

### What NVIDIA-Ingest is not ✖️
### What NVIDIA-Ingest Is Not ✖️

A service that:

@@ -65,7 +65,7 @@ To get started using NVIDIA Ingest, you need to do a few things:
4. [Inspect and consume results](#step-4-inspecting-and-consuming-results) 🔍

Optional:
1. [Direct Library Deployment](docs/deployment.md) 📦
1. [Direct Library Deployment](docs/docs/user-guide/developer-guide/deployment.md) 📦

### Step 1: Starting containers

@@ -74,14 +74,14 @@ This example demonstrates how to use the provided [docker-compose.yaml](docker-c
> [!IMPORTANT]
> NIM containers on their first startup can take 10-15 minutes to pull and fully load models.
If preferred, you can also [start services one by one](docs/deployment.md), or run on Kubernetes via [our Helm chart](helm/README.md). Also of note are [additional environment variables](docs/environment-config.md) you may wish to configure.
If you prefer, you can also [start services one by one](docs/docs/user-guide/developer-guide/deployment.md), or run on Kubernetes via [our Helm chart](helm/README.md). Also of note are [additional environment variables](docs/docs/user-guide/developer-guide/environment-config.md) you may wish to configure.

1. Git clone the repo:
`git clone https://github.com/nvidia/nv-ingest`
2. Change directory to the cloned repo
`cd nv-ingest`.

3. [Generate API keys](docs/ngc-api-key.md) and authenticate with NGC with the `docker login` command:
3. [Generate API keys](docs/docs/user-guide/developer-guide/ngc-api-key.md) and authenticate with NGC with the `docker login` command:
```shell
# This is required to access pre-built containers and NIM microservices
$ docker login nvcr.io
@@ -112,7 +112,7 @@ NVIDIA_BUILD_API_KEY=... # Optional, set this is you are using build.nvidia.com
`docker compose up`

> [!TIP]
> By default we have [configured log levels to be verbose](docker-compose.yaml#L27).
> By default we have [configured log levels to be verbose](docker-compose.yaml).
>
> It's possible to observe service startup proceeding: you will notice _many_ log messages. Disable verbose logging by configuring `NIM_TRITON_LOG_VERBOSE=0` for each NIM in [docker-compose.yaml](docker-compose.yaml).
>
@@ -183,7 +183,7 @@ pip install .
```
> [!NOTE]
> Interacting from the host depends on the appropriate port being exposed from the nv-ingest container to the host as defined in [docker-compose.yaml](docker-compose.yaml#L141).
> Interacting from the host depends on the appropriate port being exposed from the nv-ingest container to the host as defined in [docker-compose.yaml](docker-compose.yaml).
>
> If you prefer, you can disable exposing that port, and interact with the nv-ingest service directly from within its container.
>
@@ -211,7 +211,11 @@ In the below examples, we are doing text, chart, table, and image extraction:
> [!IMPORTANT]
> `extract_tables` controls extraction for both tables and charts. You can optionally disable chart extraction by setting `extract_charts` to false.
#### In Python (you can find more documentation and examples [here](./client/client_examples/examples/python_client_usage.ipynb)):
#### In Python
> [!NOTE]
> You can find more examples [here](client/client_examples/examples/).
```python
import logging, time
@@ -265,7 +269,10 @@ result = client.fetch_job_result(job_id, timeout=60)
print(f"Got {len(result)} results")
```
#### Using the the `nv-ingest-cli` (you can find more nv-ingest-cli examples [here](./client/client_examples/examples/cli_client_usage.ipynb)):
#### Using the the `nv-ingest-cli`

> [!NOTE]
> You can find more examples [here](client/client_examples/examples/).
```shell
nv-ingest-cli \
@@ -326,7 +333,7 @@ multimodal_test.pdf.metadata.json
processed_docs/text:
multimodal_test.pdf.metadata.json
```
You can view the full JSON extracts and the metadata definitions [here](docs/content-metadata.md).
You can view the full JSON extracts and the metadata definitions [here](/docs/docs/user-guide/developer-guide/content-metadata.md).

#### We also provide a script for inspecting [extracted images](src/util/image_viewer.py)
First, install `tkinter` by running the following commands depending on your OS.
8 changes: 8 additions & 0 deletions docs/docs/user-guide/developer-guide/content-metadata.md
@@ -47,7 +47,15 @@ Metadata: Descriptive data which can be associated with Sources, Content(Image o
| | Axis | TODO | Extracted | |
| | uploaded\_image\_uri | Mirrors source\_metadata.source\_location | Generated | |


<!--
2025-01-23 NKM: Commenting out this section
I can find only the first (text) file, and it is empty
I can't find the other 2 files (images, charts and tables) at all
If we get the files, we can add this back
## Example Text Extracts for multimodal_test.pdf:
1. [text](example_processed_docs/text/multimodal_test.pdf.metadata.json)
2. [images](example_processed_docs/image/multimodal_test.pdf.metadata.json)
3. [charts and tables](example_processed_docs/structured/multimodal_test.pdf.metadata.json)
-->
2 changes: 1 addition & 1 deletion docs/docs/user-guide/developer-guide/kubernetes-dev.md
@@ -169,7 +169,7 @@ helm repo add \
https://charts.bitnami.com/bitnami
```

For the full list of repositories, refer to the `dependencies` section in the [Chart.yaml](../../../../helm/Chart.yaml) file of this project.
For the full list of repositories, refer to the `dependencies` section in the [Chart.yaml](https://github.com/NVIDIA/nv-ingest/blob/main/helm/Chart.yaml) file.

#### NVIDIA GPU Support

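
Broken relative links of the kind this commit repairs can be caught before merge with a small check. The following is a minimal sketch, not part of the nv-ingest repository: `find_broken_links` and its regex are hypothetical helpers written for illustration. It assumes inline `[text](target)` links only, treats a leading `/` as relative to the repository root, and ignores `#fragment` suffixes rather than validating them.

```python
import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target).
# Reference-style links and nested image links are not handled (assumption).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(markdown_file: Path, repo_root: Path) -> list[str]:
    """Return link targets in markdown_file whose file path does not exist."""
    broken = []
    text = markdown_file.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        # Skip external URLs, mail links, and pure in-page anchors.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        # Drop any "#fragment"; only the file path is checked.
        path_part = target.split("#", 1)[0]
        if path_part.startswith("/"):
            # Assumption: a leading "/" means "relative to the repo root".
            resolved = repo_root / path_part.lstrip("/")
        else:
            resolved = markdown_file.parent / path_part
        if not resolved.exists():
            broken.append(target)
    return broken
```

Calling `find_broken_links(Path("README.md"), Path("."))` from the repository root would return any link target whose file is missing, for example a stale `docs/deployment.md` if nothing exists at that path.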
