diff --git a/examples/Entity_Extraction.ipynb b/examples/Entity_Extraction.ipynb new file mode 100644 index 000000000..23ecdb104 --- /dev/null +++ b/examples/Entity_Extraction.ipynb @@ -0,0 +1,428 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "KLHiTPXNTf2a" + }, + "source": [ + "##### Copyright 2024 Google LLC." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "cellView": "form", + "id": "oTuT5CsaTigz" + }, + "outputs": [], + "source": [ + "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HUP7eCxvtblC" + }, + "source": [ + "# Gemini API: Entity extraction\n", + "\n", + "Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer.\n", + "\n", + "This notebook shows how to extract entities into a list." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "b04e5041d418" + }, + "source": [ + "\n", + " \n", + "
\n", + " Run in Google Colab\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lu51_vs315hS" + }, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "esUtyazO2TU5" + }, + "outputs": [], + "source": [ + "!pip install -U -q google-generativeai" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "4hDSLsOk2buH" + }, + "outputs": [], + "source": [ + "import google.generativeai as genai\n", + "from IPython.display import Markdown" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6Z8xjVQX2fuB" + }, + "source": [ + "## Configure your API key\n", + "\n", + "To run the following cell, your API key must be stored it in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "uwbcbjun2gb3" + }, + "outputs": [], + "source": [ + "from google.colab import userdata\n", + "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", + "\n", + "genai.configure(api_key=GOOGLE_API_KEY)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ky3Njb3t0riS" + }, + "source": [ + "# Examples" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "vV4RsTkJ2BAp" + }, + "source": [ + "You will use Gemini 1.5 Flash model for fast responses." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "pjMaGHKm2iUH" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel('gemini-1.5-flash-latest')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D14cShRh4DNl" + }, + "source": [ + "### Extracting few entities at once\n", + "\n", + "This block of text is about possible ways to travel from the airport to the Colosseum. \n", + "\n", + "Let's extract all street names and proposed forms of transportation from it." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "Qc0w1Ylb7HQB" + }, + "outputs": [], + "source": [ + "directions = \"\"\"To reach the Colosseum from Rome's Fiumicino Airport (FCO), your options are diverse.\n", + "Take the Leonardo Express train from FCO to Termini Station, then hop on metro line A towards Battistini and alight at Colosseo station.\n", + "Alternatively, hop on a direct bus, like the Terravision shuttle, from FCO to Termini, then walk a short distance to the Colosseum on Via dei Fori Imperiali.\n", + "If you prefer a taxi, simply hail one at the airport and ask to be taken to the Colosseum.\n", + "The taxi will likely take you through Via del Corso and Via dei Fori Imperiali.\n", + "A private transfer service offers a direct ride from FCO to the Colosseum, bypassing the hustle of public transport.\n", + "If you're feeling adventurous, consider taking the train from FCO to Ostiense station,\n", + "then walking through the charming Trastevere neighborhood, crossing Ponte Palatino to reach the Colosseum, passing by the Tiber River and Via della Lungara.\n", + "Remember to validate your tickets on the metro and buses, and be mindful of pickpockets, especially in crowded areas.\n", + "No matter which route you choose, you're sure to be awed by the grandeur of the Colosseum.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "bxPfNhRi-JsW" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "```json\n", + "{\n", + " \"Street\": [\n", + " \"Via dei Fori Imperiali\",\n", + " \"Via del Corso\",\n", + " \"Via della Lungara\"\n", + " ],\n", + " \"Transport\": [\n", + " \"train\",\n", + " \"metro\",\n", + " \"bus\",\n", + " \"Terravision shuttle\",\n", + " \"taxi\",\n", + " \"private transfer service\"\n", + " ]\n", + "}\n", + "```" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "directions_prompt = f\"\"\"\n", + " From the given text, extract the following entities and return a list of them.\n", + " Entities to extract: street name, form of transport.\n", + " Text: {directions}\n", + " Street = []\n", + " Transport = [] \"\"\"\n", + "\n", + "Markdown(model.generate_content(directions_prompt).text)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VYqOM7P76lua" + }, + "source": [ + "You can modify the form of the answer for your extracted entities even more:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "DZdGgTsN5s6z" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "Here are the extracted entities:\n", + "\n", + "**Street:**\n", + "- Via dei Fori Imperiali\n", + "- Via del Corso\n", + "- Via della Lungara \n", + "\n", + "**Transport:**\n", + "- Train (Leonardo Express)\n", + "- Metro (line A)\n", + "- Bus (Terravision shuttle)\n", + "- Taxi\n", + "- Private transfer service \n" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "directions_list_prompt = f\"\"\"\n", + " From the given text, extract the following entities and return a list of them.\n", + " Entities to extract: street name, form of transport.\n", + " Text: {directions}\n", + " Return your answer as two lists:\n", + " Street = [street names]\n", + " Transport = [forms of transport]\"\"\"\n", + "\n", + "Markdown(model.generate_content(directions_list_prompt).text)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4d7Om_vjuffj" + }, + "source": [ + "### Numbers\n", + "\n", + "Try entity extraction of phone numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "id": "tkAi37jQoJFs" + }, + "outputs": [], + "source": [ + "customer_service_email = \"\"\"\n", + "Hello,\n", + "Thank you for reaching out to our customer support team regarding your recent purchase of our premium subscription service.\n", + "We will send activation code to +87 668 098 344\n", + "Additionally, if you require immediate assistance, feel free to contact us directly at +1 (800) 555-1234.\n", + "Our team is available Monday through Friday from 9:00 AM to 5:00 PM PST.\n", + "For after-hours support, please call our dedicated emergency line at +87 455 555 678.\n", + "We appreciate your business and look forward to resolving any issues you may encounter promptly.\n", + "Thank you.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "J_cHeX-wwmeN" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "The phone numbers are:\n", + "- +87 668 098 344\n", + "- +1 (800) 555-1234\n", + "- +87 455 555 678 \n" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "phone_prompt = f\"\"\"\n", + " From the given text, extract the following entities and return a list of them.\n", + " Entities to extract: phone numbers.\n", + " Text: {customer_service_email}\n", + " Return your answer in a list:\"\"\"\n", + "\n", + "Markdown(model.generate_content(phone_prompt).text)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "o_tNxJhQur-x" + }, + "source": [ + "### URLs\n", + "\n", + "\n", + "Try entity extraction of URLs and get response as a clickable link." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "J1Ufx9qkxPfJ" + }, + "outputs": [], + "source": [ + "url_text = \"\"\"\n", + "Gemini API billing FAQs\n", + "\n", + "This page provides answers to frequently asked questions about billing for the Gemini API. For pricing information, see the pricing page https://ai.google.dev/pricing.\n", + "For legal terms, see the terms of service https://ai.google.dev/gemini-api/terms#paid-services.\n", + "\n", + "What am I billed for?\n", + "Gemini API pricing is based on total token count, with different prices for input tokens and output tokens. For pricing information, see the pricing page https://ai.google.dev/pricing.\n", + "\n", + "Where can I view my quota?\n", + "You can view your quota and system limits in the Google Cloud console https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas.\n", + "\n", + "Is GetTokens billed?\n", + "Requests to the GetTokens API are not billed, and they don't count against inference quota.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "id": "eaFWpN1IyOLW" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "- https://ai.google.dev/pricing\n", + "- https://ai.google.dev/gemini-api/terms#paid-services\n", + "- https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas \n" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "url_prompt = f\"\"\"\n", + " From the given text, extract the following entities and return a list of them.\n", + " Entities to extract: URLs.\n", + " Text: {url_text}\n", + " Do not duplicate entities.\n", + " Return your answer in a markdown format:\"\"\"\n", + "\n", + "Markdown(model.generate_content(url_prompt).text)" + ] + } + ], + "metadata": { + "colab": { + "name": "Entity_Extraction.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/examples/README.md b/examples/README.md index 99a8347f5..62b22dea5 100644 --- a/examples/README.md +++ b/examples/README.md @@ -18,6 +18,7 @@ This is a colletion of fun examples for the Gemini API. * [Voice Memos](https://github.com/google-gemini/cookbook/blob/main/examples/Voice_memos.ipynb): You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written. * [Translate a public domain](https://github.com/google-gemini/cookbook/blob/main/examples/Translate_a_Public_Domain_Book.ipynb): In this notebook, you will explore Gemini model as a translation tool, demonstrating how to prepare data, create effective prompts, and save results into a `.txt` file. * [Working with Charts, Graphs, and Slide Decks](https://github.com/google-gemini/cookbook/blob/main/examples/Working_with_Charts_Graphs_and_Slide_Decks.ipynb): Gemini models are powerful multimodal LLMs that can process both text and image inputs. This notebook shows how Gemini 1.5 Flash model is capable of extracting data from various images. +* [Entity extraction](https://github.com/google-gemini/cookbook/blob/main/examples/Entity_Extraction.ipynb): Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer. Folders * [Prompting examples](https://github.com/google-gemini/cookbook/tree/main/examples/prompting): A directory with examples of various prompting techniques.