Migrate to Gen AI SDK: gemini_prompt_engineering.ipynb #570

Open · wants to merge 1 commit into base: sdk_update
148 changes: 43 additions & 105 deletions notebooks/vertex_genai/labs/gemini_prompt_engineering.ipynb
@@ -9,23 +9,23 @@
"\n",
"**Learning Objective**\n",
"\n",
"1. Learn how query the Vertex Gemini API\n",
"1. Learn how to setup the Gemini API parameters \n",
"1. Learn how to use Google Gen AI SDK to call Gemini\n",
"1. Learn how to setup the Gemini parameters \n",
"1. Learn prompt engineering for text generation\n",
"1. Learn prompt engineering for chat applications\n",
"\n",
"\n",
"The Vertex AI Gemini API lets you test, customize, and deploy instances of Google's Gemini large language models (LLM) so that you can leverage the capabilities of Gemini in your applications. The Gemini family of models supports text completion, multi-turn chat, and text embeddings generation.\n",
"The Google Gen AI SDK lets you test, customize, and deploy instances of Google's Gemini large language models (LLM) so that you can leverage the capabilities of Gemini in your applications. The Gemini family of models supports text completion, multi-turn chat, and text embeddings generation.\n",
"\n",
"This notebook will provide examples of accessing pre-trained Gemini models with the API for use cases like text classification, summarization, extraction, and chat."
"This notebook will provide examples of accessing pre-trained Gemini models with the SDK for use cases like text classification, summarization, extraction, and chat."
]
},
{
"cell_type": "markdown",
"id": "3f30622e-7ae5-4092-bebf-80ecd3b874f6",
"metadata": {},
"source": [
"### Setup"
"## Setup"
]
},
{
@@ -37,13 +37,9 @@
},
"outputs": [],
"source": [
"from IPython.display import Markdown\n",
"from vertexai.generative_models import (\n",
" Content,\n",
" GenerationConfig,\n",
" GenerativeModel,\n",
" Part,\n",
")"
"from google import genai\n",
"from google.genai.types import Content, Part\n",
"from IPython.display import Markdown"
]
},
{
@@ -53,7 +49,18 @@
"source": [
"## Text generation\n",
"\n",
"The cell below implements the helper function `generate_content` to generate responses from the Gemini API. "
"The cell below implements the helper function `generate_content_stream` to generate stream responses from Gemini using the Gen AI SDK. <br>\n",
"If you don't need streaming outputs, use `generate_content` instead."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45189fc6-4f04-4d42-ac9e-a95b005510ae",
"metadata": {},
"outputs": [],
"source": [
"client = genai.Client(vertexai=True, location=\"us-central1\")"
]
},
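Reviewer note: `genai.Client(vertexai=True, location="us-central1")` picks the project up from the environment. For context, a minimal sketch of the two client modes the SDK supports; the explicit `project` argument and the API-key variant are illustrative assumptions, not part of this PR:

```python
import os

from google import genai

# Vertex AI backend: auth comes from Application Default Credentials; the
# project can be passed explicitly or read from GOOGLE_CLOUD_PROJECT.
client = genai.Client(
    vertexai=True,
    project=os.environ["GOOGLE_CLOUD_PROJECT"],  # assumes the env var is set
    location="us-central1",
)

# Gemini Developer API backend: same client surface, keyed by an API key.
# Shown for contrast only; this notebook uses the Vertex backend above.
# client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
```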
{
@@ -65,10 +72,11 @@
"source": [
"def generate(\n",
" prompt,\n",
" model_name=\"gemini-1.0-pro\",\n",
" model_name=\"gemini-2.0-flash-001\",\n",
"):\n",
" model = GenerativeModel(model_name)\n",
" responses = model.generate_content(prompt, stream=True)\n",
" responses = client.models.generate_content_stream(\n",
" model=model_name, contents=prompt\n",
" )\n",
" return responses"
]
},
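Reviewer note: since the markdown now points readers at `generate_content` for non-streaming use, here is a hedged sketch of that path, including parameter setup via `GenerateContentConfig`; the prompt and config values are illustrative, not taken from this PR:

```python
from google.genai.types import GenerateContentConfig

# Non-streaming call: returns a single GenerateContentResponse.
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Classify the sentiment: 'The service was wonderful.'",
    config=GenerateContentConfig(
        temperature=0.2,  # illustrative values, not set anywhere in this PR
        max_output_tokens=256,
    ),
)
print(response.text)

# The streaming helper above yields chunks instead; consume it like this:
for chunk in generate("Summarize prompt engineering in one sentence."):
    print(chunk.text, end="")
```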
@@ -250,13 +258,13 @@
"\n",
"Service Rep: What seems to be the problem? \n",
"\n",
"Customer: I am trying to use the PaLM API but I keep getting an error. \n",
"Customer: I am trying to use Gemini but I keep getting an error. \n",
"\n",
"Service Rep: Can you share the error with me? \n",
"\n",
"Customer: Sure. The error says: \"ResourceExhausted: 429 Quota exceeded for \n",
" aiplatform.googleapis.com/online_prediction_requests_per_base_model \n",
-" with base model: text-bison\"\n",
+" with base model: gemini-2.0-flash-001\"\n",
" \n",
"Service Rep: It looks like you have exceeded the quota for usage. Please refer to \n",
" https://cloud.google.com/vertex-ai/docs/quotas for information about quotas\n",
@@ -398,9 +406,9 @@
"id": "10ba2c36-d280-433f-b124-0106c8ce2c9c",
"metadata": {},
"source": [
"The Vertex AI Gemini API for chat is optimized for multi-turn chat. Multi-turn chat is when a model tracks the history of a chat conversation and then uses that history as the context for responses.\n",
"The Gen AI SDK for chat is optimized for multi-turn chat. Multi-turn chat is when a model tracks the history of a chat conversation and then uses that history as the context for responses.\n",
"\n",
"Gemini enables you to have freeform conversations across multiple turns. The ChatSession class simplifies the process by managing the state of the conversation, so unlike with generate_content, you do not have to store the conversation history as a list.\n",
"Gemini enables you to have freeform conversations across multiple turns. The `Chat` class simplifies the process by managing the state of the conversation, so unlike with `generate_content`, you do not have to store the conversation history as a list.\n",
"\n",
"Let's initialize the chat:"
]
@@ -412,8 +420,7 @@
"metadata": {},
"outputs": [],
"source": [
"model = GenerativeModel(\"gemini-1.0-pro\")\n",
"chat = model.start_chat(history=[])\n",
"chat = client.chats.create(model=\"gemini-2.0-flash-001\")\n",
"chat"
]
},
@@ -422,7 +429,7 @@
"id": "f239950b-cb28-4cb5-aedf-0278ed356e06",
"metadata": {},
"source": [
"The ChatSession.send_message method returns the same GenerateContentResponse type as GenerativeModel.generate_content. It also appends your message and the response to the chat history:"
"The `Chat.send_message` method returns the same `GenerateContentResponse` type as `client.models.generate_content`. It also appends your message and the response to the chat history:"
]
},
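Reviewer note: the `send_message` cell itself is unchanged and collapsed in this diff; for context, a minimal sketch of the call the paragraph above describes (the prompts are illustrative):

```python
# Each call appends the user message and the model's reply to the history.
response = chat.send_message("In one sentence, how does a computer work?")
print(response.text)

# A follow-up turn is answered with the previous exchange as context.
response = chat.send_message("Now explain it to a high school student.")
print(response.text)
```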
{
@@ -443,7 +450,7 @@
"id": "825730b6-168f-455b-9ac3-f98a04e5dee2",
"metadata": {},
"source": [
"Recall that within a chat session, history is preserved. This enables the model to remember things within a given chat session for context. You can see this history in the `history` attribute of the chat session object. Notice that the history is simply a list of previous input/output pairs."
"Recall that within a chat session, history is preserved. This enables the model to remember things within a given chat session for context. You can see this history in the `_curated_history` attribute of the chat session object. Notice that the history is simply a list of previous input/output pairs."
]
},
{
@@ -453,7 +460,7 @@
"metadata": {},
"outputs": [],
"source": [
"chat.history"
"chat._curated_history"
]
},
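Reviewer note: `_curated_history` is an underscore-prefixed attribute, i.e. private to the SDK and liable to change between releases. A sketch of walking it, with the public accessor that newer `google-genai` versions expose noted as an alternative:

```python
# History is a list of Content objects alternating user and model roles.
for content in chat._curated_history:
    # Each Content holds a role plus a list of Parts; text turns sit in parts.
    print(f"{content.role}: {content.parts[0].text}")

# If the installed google-genai version provides it, prefer the public API:
# history = chat.get_history()
```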
{
@@ -477,74 +484,6 @@
"response.text"
]
},
-{
-"cell_type": "markdown",
-"id": "6a86a0c0-5668-4efa-8049-1eb7b4d56cdc",
-"metadata": {
-"tags": []
-},
-"source": [
-"### Adding Context\n",
-"\n",
-"While the `ChatSession` class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit into this chat implementation it's good to remember that `ChatSession` is just a wrapper around `GenerativeModel.generate_content`. In addition to single requests, it can handle multi-turn conversations.\n",
-"\n",
-"The individual messages are `protos.Content objects` or compatible dictionaries, as seen in previous sections. As a dictionary, the message requires role and parts keys. The role in a conversation can either be the user, which provides the prompts, or model, which provides the responses.\n",
-"\n",
-"Pass a list of `protos.Content` objects and it will be treated as multi-turn chat:\n"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "4b641bdf-fca4-46a6-aad5-e310456f6615",
-"metadata": {},
-"outputs": [],
-"source": [
-"model = GenerativeModel(\"gemini-1.5-flash\")\n",
-"messages = [\n",
-" {\n",
-" \"role\": \"user\",\n",
-" \"parts\": [\"Briefly explain how a computer works to a young child.\"],\n",
-" }\n",
-"]\n",
-"responses = model.generate_content(str(messages), stream=True)\n",
-"\n",
-"for response in responses:\n",
-" print(response.text)"
-]
-},
-{
-"cell_type": "markdown",
-"id": "0da257d4-e31e-45ed-884b-fac663862e30",
-"metadata": {},
-"source": [
-"To continue the conversation, add the response and another message."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "2d8a4cc8-412b-4f4e-83ce-2c0654ded06d",
-"metadata": {},
-"outputs": [],
-"source": [
-"messages.append({\"role\": \"model\", \"parts\": [response.text]})\n",
-"\n",
-"messages.append(\n",
-" {\n",
-" \"role\": \"user\",\n",
-" \"parts\": [\n",
-" \"Okay, how about a more detailed explanation to a high school student?\"\n",
-" ],\n",
-" }\n",
-")\n",
-"\n",
-"responses = model.generate_content(str(messages), stream=True)\n",
-"\n",
-"for response in responses:\n",
-" print(response.text)"
-]
-},
{
"cell_type": "markdown",
"id": "d062fa76-69b4-461c-b2ff-37ace5427fd8",
Expand All @@ -570,25 +509,24 @@
},
"outputs": [],
"source": [
"chat2 = model.start_chat(\n",
"chat2 = client.chats.create(\n",
" model=\"gemini-2.0-flash-001\",\n",
" history=[\n",
" Content(\n",
" role=\"user\",\n",
" parts=[\n",
" Part.from_text(\n",
" \"\"\"\n",
" My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.\n",
" Who do you work for?\n",
" \"\"\"\n",
" text=\"\"\"My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.Who do you work for?\"\"\"\n",
" )\n",
" ],\n",
" ),\n",
" Content(role=\"model\", parts=[Part.from_text(\"I work for Ned.\")]),\n",
" Content(role=\"user\", parts=[Part.from_text(\"What do I like?\")]),\n",
" Content(role=\"model\", parts=[Part.from_text(text=\"I work for Ned.\")]),\n",
" Content(role=\"user\", parts=[Part.from_text(text=\"What do I like?\")]),\n",
" Content(\n",
" role=\"model\", parts=[Part.from_text(\"Ned likes watching movies.\")]\n",
" role=\"model\",\n",
" parts=[Part.from_text(text=\"Ned likes watching movies.\")],\n",
" ),\n",
" ]\n",
" ],\n",
")\n",
"\n",
"response = chat2.send_message(\"Are my favorite movies based on a book series?\")\n",
@@ -632,9 +570,9 @@
"metadata": {
"environment": {
"kernel": "conda-base-py",
"name": "workbench-notebooks.m123",
"name": "workbench-notebooks.m128",
"type": "gcloud",
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m123"
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m128"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel) (Local)",
@@ -651,7 +589,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
"version": "3.10.16"
}
},
"nbformat": 4,