Merge pull request google-gemini#132 from google-gemini/io-features
Merge io-features branch.
MarkDaoust authored May 14, 2024
2 parents 6c11013 + e5fc0ad commit 8f65f49
Showing 4 changed files with 747 additions and 347 deletions.
358 changes: 321 additions & 37 deletions quickstarts/Counting_Tokens.ipynb

Large diffs are not rendered by default.

181 changes: 151 additions & 30 deletions quickstarts/File_API.ipynb
@@ -54,8 +54,7 @@
"source": [
"The Gemini API supports prompting with text, image, and audio data, also known as *multimodal* prompting. You can include text, image,\n",
"and audio in your prompts. For small images, you can point the Gemini model\n",
"directly to a local file when providing a prompt. For larger images, videos\n",
"(sequences of image frames), and audio, upload the files with the [File\n",
"directly to a local file when providing a prompt. For larger text files, images, videos, and audio, upload the files with the [File\n",
"API](https://ai.google.dev/api/rest/v1beta/files) before including them in\n",
"prompts.\n",
"\n",
@@ -64,10 +63,9 @@
"your API key for generation within that time period. It is available at no cost in all regions where the [Gemini API is\n",
"available](https://ai.google.dev/available_regions).\n",
"\n",
"For information on valid file formats (MIME types) and supported models, see [Supported file formats](https://ai.google.dev/tutorials/prompting_with_media#supported_file_formats).\n",
"\n",
"Note: Videos must be converted into image frames before uploading to the File\n",
"API.\n",
"For information on valid file formats (MIME types) and supported models, see the documentation on\n",
"[supported file formats](https://ai.google.dev/tutorials/prompting_with_media#supported_file_formats)\n",
"and view the text examples at the end of this guide.\n",
"\n",
"This guide shows how to use the File API to upload a media file and include it in a `GenerateContent` call to the Gemini API. For more information, see the [code\n",
"samples](https://github.com/google-gemini/cookbook/tree/main/quickstarts/file-api).\n"
@@ -79,17 +77,7 @@
"metadata": {
"id": "_d_yY8XWGQ12"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m142.1/142.1 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m663.6/663.6 kB\u001b[0m \u001b[31m13.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h"
]
}
],
"outputs": [],
"source": [
"!pip install -U -q google-generativeai"
]
@@ -112,7 +100,7 @@
"id": "YdyC6Z6wqxz-"
},
"source": [
"### Authentication Overview\n",
"## Authentication\n",
"\n",
"**Important:** The File API uses API keys for authentication and access. Uploaded files are associated with the API key's cloud project. Unlike other Gemini APIs that use API keys, your API key also grants access data you've uploaded to the File API, so take extra care in keeping your API key secure. For best practices on securing API keys, refer to Google's [documentation](https://support.google.com/googleapi/answer/6310037)."
]
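Outside Colab, `userdata.get` is not available, so one common way to keep the key out of the notebook itself is to read it from an environment variable. The following is a minimal sketch under that assumption; the helper names `load_api_key` and `redact` are hypothetical and are not part of the `google-generativeai` SDK.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Raises RuntimeError instead of returning an empty key, so a missing
    configuration fails early rather than at the first API call.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def redact(key: str, visible: int = 4) -> str:
    """Mask all but the last few characters, so the key is safe to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

The returned value could then be passed to `genai.configure(api_key=...)` as in the notebook cell; logging only `redact(key)` avoids leaking a key that also grants access to uploaded files.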
@@ -138,7 +126,7 @@
"source": [
"from google.colab import userdata\n",
"\n",
"GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
"GOOGLE_API_KEY = userdata.get(\"GOOGLE_API_KEY\")\n",
"genai.configure(api_key=GOOGLE_API_KEY)"
]
},
@@ -148,7 +136,7 @@
"id": "c-z4zsCUlaru"
},
"source": [
"## Upload a file to the File API\n",
"## Upload file\n",
"\n",
"The File API lets you upload a variety of multimodal MIME types, including images and audio formats. The File API handles inputs that can be used to generate content with [`model.generateContent`](https://ai.google.dev/api/rest/v1/models/generateContent) or [`model.streamGenerateContent`](https://ai.google.dev/api/rest/v1/models/streamGenerateContent).\n",
"\n",
@@ -161,7 +149,7 @@
"id": "2wsJ0vHNNtdJ"
},
"source": [
"First, we will prepare a sample image to upload to the API.\n",
"First, you will prepare a sample image to upload to the API.\n",
"\n",
"Note: You can also [upload your own files](https://github.com/google-gemini/cookbook/tree/main/examples/Upload_files_to_Colab.ipynb) to use."
]
@@ -196,7 +184,7 @@
],
"source": [
"!curl -o image.jpg \"https://storage.googleapis.com/generativeai-downloads/images/jetpack.jpg\"\n",
"Image(filename='image.jpg')"
"Image(filename=\"image.jpg\")"
]
},
{
@@ -205,7 +193,7 @@
"id": "EEoXN0f3N2yc"
},
"source": [
"Next, we'll upload that file to the File API."
"Next, you will upload that file to the File API."
]
},
{
@@ -219,13 +207,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Uploaded file 'Sample drawing' as: https://generativelanguage.googleapis.com/v1beta/files/p0dsmt12b68\n"
"Uploaded file '' as: https://generativelanguage.googleapis.com/v1beta/files/p0dsmt12b68\n"
]
}
],
"source": [
"sample_file = genai.upload_file(path=\"image.jpg\",\n",
" display_name=\"Sample drawing\")\n",
"sample_file = genai.upload_file(path=\"image.jpg\", display_name=\"Sample drawing\")\n",
"\n",
"print(f\"Uploaded file '{sample_file.display_name}' as: {sample_file.uri}\")"
]
@@ -282,7 +269,9 @@
"source": [
"## Generate content\n",
"\n",
"After uploading the file, you can make `GenerateContent` requests that reference the File API URI. In this example, we create prompt that starts with a text followed by the uploaded image."
"After uploading the file, you can make `GenerateContent` requests that reference the file by providing the URI. In the Python SDK you can pass the returned object directly.\n",
"\n",
"Here you create a prompt that starts with text and includes the uploaded image."
]
},
{
@@ -303,7 +292,9 @@
"source": [
"model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
"\n",
"response = model.generate_content([\"Describe the image with a creative description.\", sample_file])\n",
"response = model.generate_content(\n",
" [\"Describe the image with a creative description.\", sample_file]\n",
")\n",
"\n",
"print(response.text)"
]
@@ -314,7 +305,7 @@
"id": "IrPDYdQSKTg4"
},
"source": [
"## Delete Files\n",
"## Delete files\n",
"\n",
"Files are automatically deleted after 2 days or you can manually delete them using `files.delete()`."
]
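The 2-day retention window can be tracked locally if you record when each file was uploaded. The sketch below assumes you keep that timestamp yourself; the names `RETENTION`, `expires_at`, and `is_expired` are hypothetical, and the service remains the authority on when a file is actually removed.

```python
from datetime import datetime, timedelta, timezone

# Retention window stated in the guide: files are deleted after 2 days.
RETENTION = timedelta(days=2)


def expires_at(uploaded_at: datetime) -> datetime:
    """Return the time at which an uploaded file is expected to be deleted."""
    return uploaded_at + RETENTION


def is_expired(uploaded_at: datetime, now: datetime) -> bool:
    """Return True once the retention window has fully elapsed."""
    return now >= expires_at(uploaded_at)


uploaded = datetime(2024, 5, 14, 12, 0, tzinfo=timezone.utc)
print(expires_at(uploaded).isoformat())
```

Using timezone-aware datetimes keeps the comparison unambiguous when the upload and the check happen on machines in different time zones.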
@@ -336,7 +327,137 @@
],
"source": [
"genai.delete_file(sample_file.name)\n",
"print(f'Deleted {sample_file.display_name}.')"
"print(f\"Deleted {sample_file.display_name}.\")"
]
},
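If generation fails between the upload and the delete call, the uploaded file stays around until it expires. One way to guard against that is a `try`/`finally` wrapper. The helper below, `with_uploaded_file`, is a hypothetical sketch that takes the three steps as callables so it does not depend on any particular SDK version.

```python
from typing import Any, Callable


def with_uploaded_file(
    upload: Callable[[], Any],
    use: Callable[[Any], Any],
    delete: Callable[[Any], None],
) -> Any:
    """Upload a file, use it, and delete it even if `use` raises."""
    uploaded = upload()
    try:
        return use(uploaded)
    finally:
        # Runs on success and on error, so the upload never lingers.
        delete(uploaded)
```

In this notebook, `upload` could be `lambda: genai.upload_file(path="image.jpg")`, `use` a function that calls `model.generate_content([...])` with the file, and `delete` `lambda f: genai.delete_file(f.name)`.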
{
"cell_type": "markdown",
"metadata": {
"id": "u_aF5anOvKsO"
},
"source": [
"## Supported text types\n",
"\n",
"As well as supporting media uploads, the File API can be used to embed text files, such as Python code, or Markdown files, into your prompts.\n",
"\n",
"This example shows you how to load a markdown file into a prompt using the File API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3Hz37jFBSr9l"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"## Steps to Take Before Contributing to the Gemini API Cookbook:\n",
"\n",
"Here's what you should do before you begin writing:\n",
"\n",
"**1. Contributor License Agreement (CLA):**\n",
"\n",
"* Visit https://cla.developers.google.com/ to check if you or your employer have already signed the Google CLA. If not, you'll need to sign one to allow the project to use and redistribute your contributions.\n",
"\n",
"**2. Familiarize Yourself with Style Guides:**\n",
"\n",
"* Read the highlights of the technical writing style guide: https://developers.google.com/style/highlights \n",
"* Review the style guide for the programming language you'll be using: https://google.github.io/styleguide/\n",
"\n",
"**3. Consider Using pyink (for Python notebooks):**\n",
"\n",
"* While not mandatory, running `pyink` on your *.ipynb files can help maintain consistent style and avoid potential issues.\n",
"\n",
"**4. Propose Your Contribution:**\n",
"\n",
"* Before writing anything, create an issue on the GitHub repository (https://github.com/google-gemini/cookbook/issues) to discuss your idea and receive guidance on structuring your content. This helps ensure your contribution aligns with the project's goals and avoids wasted effort.\n",
"\n",
"**5. Understand the Evaluation Criteria:**\n",
"\n",
"* The project considers factors like originality, pedagogical value, and quality when accepting new guides. Aim to make your contribution as strong as possible in these areas. \n",
"\n"
]
}
],
"source": [
"# Download a markdown file and ask a question.\n",
"\n",
"!curl -so contrib.md https://raw.githubusercontent.com/google-gemini/cookbook/main/CONTRIBUTING.md\n",
"\n",
"md_file = genai.upload_file(path=\"contrib.md\", display_name=\"Contributors guide\", mime_type=\"text/markdown\")\n",
"\n",
"model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
"response = model.generate_content(\n",
" [\n",
" \"What should I do before I start writing, when following these guidelines?\",\n",
" md_file,\n",
" ]\n",
")\n",
"print(response.text)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pmmVaBz4Ss3W"
},
"source": [
"Some common text formats are automatically detected, such as `text/x-python`, `text/html` and `text/markdown`. If you are using a file that you know is text, but is not automatically detected by the API as such, you can specify the MIME type as `text/plain` explicitly."
]
},
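The fallback described above can be sketched locally: keep a `text/*` type when one can be guessed, and otherwise fall back to `text/plain`. This uses the standard-library `mimetypes` table rather than the File API's own detection, so the two may disagree; `text_mime_type` is a hypothetical helper name.

```python
import mimetypes


def text_mime_type(path: str, default: str = "text/plain") -> str:
    """Return a text/* MIME type for a file that is known to contain text."""
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is not None and guessed.startswith("text/"):
        return guessed
    # Unknown or non-text guess: force plain text, as the guide suggests.
    return default


print(text_mime_type("page.html"))
print(text_mime_type("notes.unknownext123"))
```

The result could be passed as the `mime_type` argument of `genai.upload_file`, as the notebook's upload cells do with an explicit value.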
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8m4qpfTqzE9o"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"## Program Breakdown: Gemma Language Model Example\n",
"\n",
"This C++ program demonstrates how to use the Gemma language model for text generation. Let's break down what each part does:\n",
"\n",
"**1. Headers and Setup:**\n",
"\n",
"* Includes necessary libraries like `iostream` for input/output, `gemma.h` for the Gemma model, and others for thread management and argument parsing.\n",
"* Defines a `tokenize` function that prepares the input prompt string by adding specific start/end tokens and converting it into a sequence of integer tokens using the provided tokenizer.\n",
"\n",
"**2. Main Function:**\n",
"\n",
"* **Argument Parsing:** Uses `LoaderArgs` to parse command-line arguments related to the model, tokenizer, weights, and other settings.\n",
"* **Thread Pool Creation:** Creates a thread pool based on the available hardware concurrency for efficient parallel processing.\n",
"* **Model and Cache Initialization:**\n",
" * Loads the Gemma model using the specified tokenizer and weights.\n",
" * Creates a Key-Value (KV) cache, which is used for caching intermediate results during generation. \n",
"* **Random Number Generator:** Sets up a random number generator using `std::mt19937` for stochastic aspects of the model.\n",
"* **Prompt Tokenization:** Tokenizes the example instruction \"Write a greeting to the world.\" using the `tokenize` function. \n",
"* **Stream Token Callback:** Defines a callback function `stream_token` that is called for each generated token. It keeps track of the generation progress and prints the generated text.\n",
"* **Text Generation:** Calls the `GenerateGemma` function to generate text based on the provided prompt, model, KV cache, and various parameters like maximum token limits and temperature. The `stream_token` callback is used to process each generated token.\n",
"* **Output:** Prints the final generated text to the console. \n",
"\n",
"**In essence, this program takes an instruction as input, uses the Gemma language model to generate text based on that instruction, and then outputs the generated text to the user.** \n",
"\n"
]
}
],
"source": [
"# Download some C++ code and force the MIME as text when uploading.\n",
"\n",
"!curl -so gemma.cpp https://raw.githubusercontent.com/google/gemma.cpp/main/examples/hello_world/run.cc\n",
"\n",
"cpp_file = genai.upload_file(\n",
" path=\"gemma.cpp\", display_name=\"gemma.cpp\", mime_type=\"text/plain\"\n",
")\n",
"\n",
"model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
"response = model.generate_content([\"What does this program do?\", cpp_file])\n",
"print(response.text)"
]
}
],