From c6aba18ed1fc739bb56748063da3574da7d39cc4 Mon Sep 17 00:00:00 2001 From: Luciano Martins Date: Tue, 14 May 2024 17:08:15 -0300 Subject: [PATCH] Include Gemini Flash model quickstart (#133) --- quickstarts/Gemini_Flash_Introduction.ipynb | 494 ++++++++++++++++++++ 1 file changed, 494 insertions(+) create mode 100644 quickstarts/Gemini_Flash_Introduction.ipynb diff --git a/quickstarts/Gemini_Flash_Introduction.ipynb b/quickstarts/Gemini_Flash_Introduction.ipynb new file mode 100644 index 000000000..db35347c0 --- /dev/null +++ b/quickstarts/Gemini_Flash_Introduction.ipynb @@ -0,0 +1,494 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "cDzZKCF4ea5n" + }, + "source": [ + "##### Copyright 2024 Google LLC." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "cxsdQaqTeihY" + }, + "outputs": [], + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FjjEC1DHenXF" + }, + "source": [ + "# Gemini Flash Introduction" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HQQSrHovfBan" + }, + "source": [ + "\n", + " \n", + "
\n", + " Run in Google Colab\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1LEBXIg0fq83" + }, + "source": [ + "The Gemini 1.5 Flash is a new model from Gemini ecosystem providing better quality and lower latency for existing Gemini 1.0 Pro developers and users.\n", + "\n", + "It simplifies your tests and adoption due to feature parity with the currently available Gemini models.\n", + "\n", + "In this notebook you will experiment with different scenarios (including text, chat and multimodal examples) where the only change required is changing the model you want to interact with - all the code is simply the same." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZxjOSybzhS5F" + }, + "source": [ + "## Installing the latest version of the Gemini SDK" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "7WIjD40XBMEM" + }, + "outputs": [], + "source": [ + "!pip install -q -U google-generativeai # Install the Python SDK" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0DUxJvIwhWQI" + }, + "source": [ + "## Import the Gemini python SDK" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "kmyjiZKSBYej" + }, + "outputs": [], + "source": [ + "import google.generativeai as genai" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xvJsVuNQhcED" + }, + "source": [ + "## Set up your API key\n", + "\n", + "To run the following cell, your API key must be stored it in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "g3SXoJCLBpFs" + }, + "outputs": [], + "source": [ + "from google.colab import userdata\n", + "\n", + "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", + "genai.configure(api_key=GOOGLE_API_KEY)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pWLzoSm3xs5V" + }, + "source": [ + "## Working with text scenarios\n", + "\n", + "In the first scenario of this notebook, you will work with text only scenarios. You will send direct requests, in text format, to the Gemini API and handle the results. 
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "iEREbartxmXx"
+   },
+   "source": [
+    "## Working with chat scenarios\n",
+    "\n",
+    "The next experiment works with chats. Again, the first step is to pick the model you want to play with. As in the text example, you can pick any of the models below:\n",
+    "\n",
+    "- `models/gemini-1.5-flash-latest`\n",
+    "- `models/gemini-1.5-pro-latest`\n",
+    "- `models/gemini-1.0-pro-latest`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "E5WcsAIGvznk"
+   },
+   "outputs": [],
+   "source": [
+    "version = 'models/gemini-1.5-flash-latest' # @param [\"models/gemini-1.5-flash-latest\", \"models/gemini-1.5-pro-latest\", \"models/gemini-1.0-pro-latest\"]\n",
+    "model = genai.GenerativeModel(version)\n",
+    "chat = model.start_chat(history=[])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "3hnEUnrik0D1"
+   },
+   "source": [
+    "Using the `genai.get_model()` method, you can explore details about the model, such as its `input_token_limit` and `output_token_limit`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "NX3xgV2NYggV"
+   },
+   "outputs": [],
+   "source": [
+    "model_info = genai.get_model(version)\n",
+    "print(f'{version} - input limit: {model_info.input_token_limit}, output limit: {model_info.output_token_limit}')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "T_RfgpAPk58V"
+   },
+   "source": [
+    "You can also count the tokens of your prompt using the `model.count_tokens()` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "y8Megh7SYm-7"
+   },
+   "outputs": [],
+   "source": [
+    "prompt = \"How can I start learning artificial intelligence?\"\n",
+    "model.count_tokens(prompt)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "ZIZyFu5tk-oW"
+   },
+   "source": [
+    "Then you can send your message to the Gemini API - no matter which model version you chose, the request code stays the same:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "rzUMuKSXvzhN"
+   },
+   "outputs": [],
+   "source": [
+    "response = chat.send_message(\"How can I start learning artificial intelligence?\")\n",
+    "print(response.text)"
+   ]
+  },
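+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because the `chat` session keeps the conversation history, a follow-up message is answered in the context of the previous turns. The follow-up question below is just an illustrative example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# This follow-up relies on the chat history for its context.\n",
+    "response = chat.send_message(\"Which of those steps should I start with?\")\n",
+    "print(response.text)"
+   ]
+  },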
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "IB_88tt_lCmR"
+   },
+   "source": [
+    "The same way you can count the tokens of your prompts, you can count the tokens of your whole chat history too, using the same `model.count_tokens()` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "vrVI9zvqvzfI"
+   },
+   "outputs": [],
+   "source": [
+    "model.count_tokens(chat.history)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "vu8blCSpapTF"
+   },
+   "source": [
+    "## Working with multimodal scenarios\n",
+    "\n",
+    "Finally, you can run a multimodal experiment - in other words, send different data modalities (like text and images) together in the same request prompt.\n",
+    "\n",
+    "First, pick the model version you want to experiment with from the listbox below. The available models are:\n",
+    "\n",
+    "- `models/gemini-1.5-flash-latest`\n",
+    "- `models/gemini-1.5-pro-latest`\n",
+    "- `models/gemini-pro-vision`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "LfuCWtHetcuA"
+   },
+   "outputs": [],
+   "source": [
+    "version = 'models/gemini-pro-vision' # @param [\"models/gemini-1.5-flash-latest\", \"models/gemini-1.5-pro-latest\", \"models/gemini-pro-vision\"]\n",
+    "model = genai.GenerativeModel(version)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "ZYlc6EZhmR4W"
+   },
+   "source": [
+    "Using the `genai.get_model()` method, you can explore details about the model, such as its `input_token_limit` and `output_token_limit`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "YOQVyY7XavyR"
+   },
+   "outputs": [],
+   "source": [
+    "model_info = genai.get_model(version)\n",
+    "print(f'{version} - input limit: {model_info.input_token_limit}, output limit: {model_info.output_token_limit}')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "aeug2Lk3mXV5"
+   },
+   "source": [
+    "Now pick a test image to use in your multimodal prompt. Here you will use a sample croissant image:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "Ur7rfzAbbIcQ"
+   },
+   "outputs": [],
+   "source": [
+    "import PIL.Image\n",
+    "from IPython.display import display, Image\n",
+    "\n",
+    "!curl -s -o image.jpg \"https://storage.googleapis.com/generativeai-downloads/images/croissant.jpg\"\n",
+    "img = PIL.Image.open('image.jpg')\n",
+    "display(Image('image.jpg', width=300))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "nOnmWlsimfGR"
+   },
+   "source": [
+    "As you did for the text and chat prompts, you can count the tokens of your image as well. Here you will first show the image resolution (using `img.size`) and then the number of tokens that represent the image, using the `model.count_tokens()` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "HXMvnNALbmWP"
+   },
+   "outputs": [],
+   "source": [
+    "print(img.size)\n",
+    "print(model.count_tokens(img))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "D3ziijDAm6VE"
+   },
+   "source": [
+    "Now it is time to define the text prompt to send together with your test image - in this case, a request to extract some information from the image: what is in it, which country the food is associated with, and what it pairs best with."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "mGh0OwvYlLEW"
+   },
+   "outputs": [],
+   "source": [
+    "prompt = \"\"\"\n",
+    "Describe this image, including which country is famous for having this food and what is the best pairing for it.\n",
+    "\"\"\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "N2Inb0TRny4X"
+   },
+   "outputs": [],
+   "source": [
+    "response = model.generate_content([prompt, img])\n",
+    "print(response.text)"
+   ]
+  },
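+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can also influence how the model generates its answer by passing a generation configuration with the request. The sketch below reuses the same `prompt` and `img` - the `temperature` and `max_output_tokens` values are illustrative assumptions, not recommended settings:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Illustrative settings: a lower temperature makes the output more\n",
+    "# deterministic, and max_output_tokens caps the response length.\n",
+    "config = genai.GenerationConfig(temperature=0.4, max_output_tokens=256)\n",
+    "\n",
+    "response = model.generate_content([prompt, img], generation_config=config)\n",
+    "print(response.text)"
+   ]
+  },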
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "sDrDAo1xnu9u"
+   },
+   "source": [
+    "## Learning more\n",
+    "\n",
+    "* To learn how to use a model for prompting, see the [Prompting](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Prompting.ipynb) quickstart.\n",
+    "\n",
+    "* For token counting, see the [count_tokens](https://ai.google.dev/api/python/google/generativeai/GenerativeModel#count_tokens) Python API reference and the [Counting Tokens](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Counting_Tokens.ipynb) quickstart.\n",
+    "\n",
+    "* For more information on models, visit the [Gemini models](https://ai.google.dev/models/gemini) documentation."
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "name": "Gemini_Flash_Introduction.ipynb",
+   "toc_visible": true
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "name": "python3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}