Migrate to Gen AI SDK: gemini_prompt_engineering.ipynb #570

Open · wants to merge 1 commit into base: sdk_update
148 changes: 43 additions & 105 deletions notebooks/vertex_genai/labs/gemini_prompt_engineering.ipynb
@@ -9,23 +9,23 @@
"\n",
"**Learning Objective**\n",
"\n",
"1. Learn how query the Vertex Gemini API\n",
"1. Learn how to setup the Gemini API parameters \n",
"1. Learn how to use Google Gen AI SDK to call Gemini\n",
"1. Learn how to setup the Gemini parameters \n",
"1. Learn prompt engineering for text generation\n",
"1. Learn prompt engineering for chat applications\n",
"\n",
"\n",
"The Vertex AI Gemini API lets you test, customize, and deploy instances of Google's Gemini large language models (LLM) so that you can leverage the capabilities of Gemini in your applications. The Gemini family of models supports text completion, multi-turn chat, and text embeddings generation.\n",
"The Google Gen AI SDK lets you test, customize, and deploy instances of Google's Gemini large language models (LLM) so that you can leverage the capabilities of Gemini in your applications. The Gemini family of models supports text completion, multi-turn chat, and text embeddings generation.\n",
"\n",
"This notebook will provide examples of accessing pre-trained Gemini models with the API for use cases like text classification, summarization, extraction, and chat."
"This notebook will provide examples of accessing pre-trained Gemini models with the SDK for use cases like text classification, summarization, extraction, and chat."
]
},
{
"cell_type": "markdown",
"id": "3f30622e-7ae5-4092-bebf-80ecd3b874f6",
"metadata": {},
"source": [
"### Setup"
"## Setup"
]
},
{
@@ -37,13 +37,9 @@
},
"outputs": [],
"source": [
"from IPython.display import Markdown\n",
"from vertexai.generative_models import (\n",
" Content,\n",
" GenerationConfig,\n",
" GenerativeModel,\n",
" Part,\n",
")"
"from google import genai\n",
"from google.genai.types import Content, Part\n",
"from IPython.display import Markdown"
]
},
{
@@ -53,7 +49,18 @@
"source": [
"## Text generation\n",
"\n",
"The cell below implements the helper function `generate_content` to generate responses from the Gemini API. "
"The cell below implements the helper function `generate_content_stream` to generate stream responses from Gemini using the Gen AI SDK. <br>\n",
"If you don't need streaming outputs, use `generate_content` instead."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45189fc6-4f04-4d42-ac9e-a95b005510ae",
"metadata": {},
"outputs": [],
"source": [
"client = genai.Client(vertexai=True, location=\"us-central1\")"
]
},
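Reviewer note: `genai.Client(vertexai=True, location="us-central1")` picks the project up from the environment. For context, a minimal sketch of the two client modes the SDK supports; the explicit `project` argument and the API-key variant are illustrative assumptions, not part of this PR:

```python
import os

from google import genai

# Vertex AI backend: auth comes from Application Default Credentials; the
# project can be passed explicitly or read from GOOGLE_CLOUD_PROJECT.
client = genai.Client(
    vertexai=True,
    project=os.environ["GOOGLE_CLOUD_PROJECT"],  # assumes the env var is set
    location="us-central1",
)

# Gemini Developer API backend: same client surface, keyed by an API key.
# Shown for contrast only; this notebook uses the Vertex backend above.
# client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
```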
{
@@ -65,10 +72,11 @@
"source": [
"def generate(\n",
" prompt,\n",
" model_name=\"gemini-1.0-pro\",\n",
" model_name=\"gemini-2.0-flash-001\",\n",
"):\n",
" model = GenerativeModel(model_name)\n",
" responses = model.generate_content(prompt, stream=True)\n",
" responses = client.models.generate_content_stream(\n",
" model=model_name, contents=prompt\n",
" )\n",
" return responses"
]
},
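Reviewer note: since the markdown now points readers at `generate_content` for non-streaming use, here is a hedged sketch of that path, including parameter setup via `GenerateContentConfig`; the prompt and config values are illustrative, not taken from this PR:

```python
from google.genai.types import GenerateContentConfig

# Non-streaming call: returns a single GenerateContentResponse.
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Classify the sentiment: 'The service was wonderful.'",
    config=GenerateContentConfig(
        temperature=0.2,  # illustrative values, not set anywhere in this PR
        max_output_tokens=256,
    ),
)
print(response.text)

# The streaming helper above yields chunks instead; consume it like this:
for chunk in generate("Summarize prompt engineering in one sentence."):
    print(chunk.text, end="")
```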
@@ -250,13 +258,13 @@
"\n",
"Service Rep: What seems to be the problem? \n",
"\n",
"Customer: I am trying to use the PaLM API but I keep getting an error. \n",
"Customer: I am trying to use Gemini but I keep getting an error. \n",
"\n",
"Service Rep: Can you share the error with me? \n",
"\n",
"Customer: Sure. The error says: \"ResourceExhausted: 429 Quota exceeded for \n",
" aiplatform.googleapis.com/online_prediction_requests_per_base_model \n",
-" with base model: text-bison\"\n",
+" with base model: gemini-2.0-flash-001\"\n",
" \n",
"Service Rep: It looks like you have exceeded the quota for usage. Please refer to \n",
" https://cloud.google.com/vertex-ai/docs/quotas for information about quotas\n",
@@ -398,9 +406,9 @@
"id": "10ba2c36-d280-433f-b124-0106c8ce2c9c",
"metadata": {},
"source": [
"The Vertex AI Gemini API for chat is optimized for multi-turn chat. Multi-turn chat is when a model tracks the history of a chat conversation and then uses that history as the context for responses.\n",
"The Gen AI SDK for chat is optimized for multi-turn chat. Multi-turn chat is when a model tracks the history of a chat conversation and then uses that history as the context for responses.\n",
"\n",
"Gemini enables you to have freeform conversations across multiple turns. The ChatSession class simplifies the process by managing the state of the conversation, so unlike with generate_content, you do not have to store the conversation history as a list.\n",
"Gemini enables you to have freeform conversations across multiple turns. The `Chat` class simplifies the process by managing the state of the conversation, so unlike with `generate_content`, you do not have to store the conversation history as a list.\n",
"\n",
"Let's initialize the chat:"
]
@@ -412,8 +420,7 @@
"metadata": {},
"outputs": [],
"source": [
"model = GenerativeModel(\"gemini-1.0-pro\")\n",
"chat = model.start_chat(history=[])\n",
"chat = client.chats.create(model=\"gemini-2.0-flash-001\")\n",
"chat"
]
},
@@ -422,7 +429,7 @@
"id": "f239950b-cb28-4cb5-aedf-0278ed356e06",
"metadata": {},
"source": [
"The ChatSession.send_message method returns the same GenerateContentResponse type as GenerativeModel.generate_content. It also appends your message and the response to the chat history:"
"The `Chat.send_message` method returns the same `GenerateContentResponse` type as `client.models.generate_content`. It also appends your message and the response to the chat history:"
]
},
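Reviewer note: the `send_message` cell itself is unchanged and collapsed in this diff; for context, a minimal sketch of the call the paragraph above describes (the prompts are illustrative):

```python
# Each call appends the user message and the model's reply to the history.
response = chat.send_message("In one sentence, how does a computer work?")
print(response.text)

# A follow-up turn is answered with the previous exchange as context.
response = chat.send_message("Now explain it to a high school student.")
print(response.text)
```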
{
@@ -443,7 +450,7 @@
"id": "825730b6-168f-455b-9ac3-f98a04e5dee2",
"metadata": {},
"source": [
"Recall that within a chat session, history is preserved. This enables the model to remember things within a given chat session for context. You can see this history in the `history` attribute of the chat session object. Notice that the history is simply a list of previous input/output pairs."
"Recall that within a chat session, history is preserved. This enables the model to remember things within a given chat session for context. You can see this history in the `_curated_history` attribute of the chat session object. Notice that the history is simply a list of previous input/output pairs."
]
},
{
@@ -453,7 +460,7 @@
"metadata": {},
"outputs": [],
"source": [
"chat.history"
"chat._curated_history"
]
},
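Reviewer note: `_curated_history` is an underscore-prefixed attribute, i.e. private to the SDK and liable to change between releases. A sketch of walking it, with the public accessor that newer `google-genai` versions expose noted as an alternative:

```python
# History is a list of Content objects alternating user and model roles.
for content in chat._curated_history:
    # Each Content holds a role plus a list of Parts; text turns sit in parts.
    print(f"{content.role}: {content.parts[0].text}")

# If the installed google-genai version provides it, prefer the public API:
# history = chat.get_history()
```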
{
@@ -477,74 +484,6 @@
"response.text"
]
},
-{
-"cell_type": "markdown",
-"id": "6a86a0c0-5668-4efa-8049-1eb7b4d56cdc",
-"metadata": {
-"tags": []
-},
-"source": [
-"### Adding Context\n",
-"\n",
-"While the `ChatSession` class shown earlier can handle many use cases, it does make some assumptions. If your use case doesn't fit into this chat implementation it's good to remember that `ChatSession` is just a wrapper around `GenerativeModel.generate_content`. In addition to single requests, it can handle multi-turn conversations.\n",
-"\n",
-"The individual messages are `protos.Content objects` or compatible dictionaries, as seen in previous sections. As a dictionary, the message requires role and parts keys. The role in a conversation can either be the user, which provides the prompts, or model, which provides the responses.\n",
-"\n",
-"Pass a list of `protos.Content` objects and it will be treated as multi-turn chat:\n"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "4b641bdf-fca4-46a6-aad5-e310456f6615",
-"metadata": {},
-"outputs": [],
-"source": [
-"model = GenerativeModel(\"gemini-1.5-flash\")\n",
-"messages = [\n",
-" {\n",
-" \"role\": \"user\",\n",
-" \"parts\": [\"Briefly explain how a computer works to a young child.\"],\n",
-" }\n",
-"]\n",
-"responses = model.generate_content(str(messages), stream=True)\n",
-"\n",
-"for response in responses:\n",
-" print(response.text)"
-]
-},
-{
-"cell_type": "markdown",
-"id": "0da257d4-e31e-45ed-884b-fac663862e30",
-"metadata": {},
-"source": [
-"To continue the conversation, add the response and another message."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "2d8a4cc8-412b-4f4e-83ce-2c0654ded06d",
-"metadata": {},
-"outputs": [],
-"source": [
-"messages.append({\"role\": \"model\", \"parts\": [response.text]})\n",
-"\n",
-"messages.append(\n",
-" {\n",
-" \"role\": \"user\",\n",
-" \"parts\": [\n",
-" \"Okay, how about a more detailed explanation to a high school student?\"\n",
-" ],\n",
-" }\n",
-")\n",
-"\n",
-"responses = model.generate_content(str(messages), stream=True)\n",
-"\n",
-"for response in responses:\n",
-" print(response.text)"
-]
-},
{
"cell_type": "markdown",
"id": "d062fa76-69b4-461c-b2ff-37ace5427fd8",
Expand All @@ -570,25 +509,24 @@
},
"outputs": [],
"source": [
"chat2 = model.start_chat(\n",
"chat2 = client.chats.create(\n",
" model=\"gemini-2.0-flash-001\",\n",
" history=[\n",
" Content(\n",
" role=\"user\",\n",
" parts=[\n",
" Part.from_text(\n",
" \"\"\"\n",
" My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.\n",
" Who do you work for?\n",
" \"\"\"\n",
" text=\"\"\"My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.Who do you work for?\"\"\"\n",
" )\n",
" ],\n",
" ),\n",
" Content(role=\"model\", parts=[Part.from_text(\"I work for Ned.\")]),\n",
" Content(role=\"user\", parts=[Part.from_text(\"What do I like?\")]),\n",
" Content(role=\"model\", parts=[Part.from_text(text=\"I work for Ned.\")]),\n",
" Content(role=\"user\", parts=[Part.from_text(text=\"What do I like?\")]),\n",
" Content(\n",
" role=\"model\", parts=[Part.from_text(\"Ned likes watching movies.\")]\n",
" role=\"model\",\n",
" parts=[Part.from_text(text=\"Ned likes watching movies.\")],\n",
" ),\n",
" ]\n",
" ],\n",
")\n",
"\n",
"response = chat2.send_message(\"Are my favorite movies based on a book series?\")\n",
@@ -632,9 +570,9 @@
"metadata": {
"environment": {
"kernel": "conda-base-py",
"name": "workbench-notebooks.m123",
"name": "workbench-notebooks.m128",
"type": "gcloud",
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m123"
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m128"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel) (Local)",
@@ -651,7 +589,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
"version": "3.10.16"
}
},
"nbformat": 4,