From the course: Advanced Gemini API for Integration

Using system instructions

- [Instructor] One of the biggest advantages and selling points of large language models like Gemini is that they are autonomous. They can generate amazing results without a lot of handholding. You give them a prompt on anything from cooking to fishing to investing in stocks, and they are capable of giving you very reasonable answers delivered in a very comprehensible format.
This autonomous capability is also one of the flaws of LLMs. Because LLMs are often optimized for generating results as fast as possible, they tend to assume any extra information they need about a task without asking. They can generate information based on an assumed context when none is provided, assume key information that should have been provided in the query, and deliver in a format they see fit. While this works well for most casual use cases, for focused work, especially when you are building applications using the Gemini API, you need more control. You need to be able to instruct the model on the role it should perform, provide all the necessary information, specific or general, that the model requires to perform the task, and also correctly define the context it is operating within. This way, you can get optimal results from Gemini that are tailored to the needs of your application.
Fortunately, the Gemini API provides a utility to achieve what we have just described. This utility is system instructions. System instructions help you steer the behavior of a model to the needs of your application. Through this setting in the request to the API, you can provide instructions that give additional context, define the delivery format, specify the model's role, or just restrict the scope of the output.
This way, the model can better understand the task, adhere to the specific guidelines that you've provided, and give more customized responses. This is really beneficial to application developers using the Gemini API, as it helps them get more accurate behavior from the Gemini model.
System instructions are set when you initialize the model. These instructions then persist through all your interactions with the model, across multiple user and model prompt-response cycles. For example, if your application teaches developers how to get better at writing JavaScript, you can instruct the model to always respond in the role of an experienced JavaScript teacher. This will provide responses that are more suited to helping your users learn JavaScript, empathize with their situation as learners, and produce more comprehensible examples of JavaScript code. Without system instructions, the responses will still be okay, but may not really help students become better at JavaScript because their situation as learners is not taken into account.
Aside from defining a role for Gemini to operate in, let's see a list of other powerful controls that you can have with system instructions. System instructions can also help you define goals and rules for a task. For example, let's say you want to create an app that helps homeowners fix things that break in their house, such as kitchen sink leaks, garage door jams, changing light bulbs, changing a tire, and so on, without going out to call an artisan. You can provide system instructions telling Gemini that the steps it generates for fixing the issues should only include tools and items that can be found around the house. No professional tools. You can also set a default budget.
Let's say the fix should not cost more than 50 bucks, just in case the homeowner needs to run to the hardware store to get stuff like glue or a screwdriver. This way, you have defined the underlying goal and set rules around the type of response you are expecting from Gemini. Anyone using your application can now simply type in how to fix a leaking sink, and Gemini will generate an appropriate response. Because the model is now working within the boundaries of your instructions, it won't tell the user to go call an expert or suggest an expensive fix.
Another advantage of using system instructions is in defining the output style and tone. This guides the model in better understanding the emotions, state, biases, age range, demographic, gender, and so on of the target audience. For example, generating stories for kids is a lot different from generating fantasy novels for middle-aged women. Providing system instructions that describe your target audience and also define the delivery tone for the output helps extract the best results from Gemini.
Next, you can also use system instructions to provide additional context for Gemini to take into account. For example, if you're building an application that helps users manage their taxes, it is quite important to take the user's country into account. This will enable Gemini to use the tax laws that apply to the user's country to advise the user on how to do their taxes. If the country is not provided, the model will simply assume a set of tax laws that most likely won't be applicable to the user's country of residence.
The last use case of system instructions that we'll be looking at is defining the format in which you want Gemini to deliver its response. You can specify that answers should be delivered as a list or in a specialized format like JSON or Markdown. You can also specify a custom structure that your responses should be delivered in.
For example, let's assume you want to create an application that gives its users a workout routine. You can simply define the format that is currently displayed on the screen. This format starts by describing the workout routine. It then follows that up with a list of steps to perform the workout, and finally it ends the response with the benefits of performing the workout routine. With this specified, if the user types in how to do pull-ups, the model will strictly adhere to the format, and any other workout routine the user queries will be displayed in this format as well.
Now that we have a good understanding of what system instructions are and what they can help us achieve, let's take a look at a Gemini request demo that makes use of system instructions. We are going to try as much as possible to be language and framework agnostic with our demos, so we'll be using the REST API for almost all of them. We'll only use client SDKs in cases where the REST API does not support the task we are trying to perform. While we can use a command-line tool like cURL to interact with the REST API, I'll be using Postman to write the API requests in this course. Postman is a REST API client that provides a graphical user interface that helps developers create and send API requests. You can get Postman on their website, postman.com, by navigating to the downloads page, postman.com/downloads. On the downloads page, or in the hero section of the homepage as you can see here, you can download the appropriate client for your operating system. Once you have Postman installed, run it and pull it up as I have done here. You can then create a new tab for a new HTTP request.
To begin constructing our request to the Gemini API, we start by writing the API endpoint. Now, at this point, I would like to note a very important fact. If you're new to the Gemini API and to constructing endpoints and requests, I'd advise you to check out our Gemini API Getting Started course.
The Getting Started course is a prerequisite to this course, as mentioned in the introductory videos, and it teaches you all you need to know about the structure of the Gemini API and how to write requests. However, if you're already familiar with the Gemini API and writing requests to it, you can continue with this course.
So, let's write our API endpoint. First, this is a POST request, so make sure you select POST from the request method dropdown. We then begin by writing the service URL, which is https://generativelanguage.googleapis.com. Next, we'll be using the v1beta version of the API. This is the version that supports system instructions as of the time of recording this course, so let's add /v1beta. Next, we use the models resource, and from this resource, we are going to be accessing the Gemini 1.5 Flash model. On this model, we then call the generateContent function. This is the function responsible for the generative features of the model. And lastly, on the URL, let's add our API key to ensure that our request is authenticated. Without the key, the Gemini API will not allow our request to be processed. We now have our request pointing to the Gemini 1.5 Flash model and calling the generateContent method.
Now, let's build our request body. Below the request URL field in Postman, click on the Body tab. Under the request format options, click raw, and on the dropdown that appears next to the options, select the JSON format. Awesome. Now, in the request body window, let's start typing our request body in the JSON format. First, let's start with a simple prompt that has no system instruction. Now that we have our request body set up, what should we ask Gemini? In this demo, I'm going to be asking Gemini to tell me all about the Eiffel Tower, so I'm just going to write, "Tell me about the Eiffel Tower." Now let's go ahead and hit Send to see what we get back. Our response is back.
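The baseline request from this step can be sketched outside Postman as well. The Python below is a minimal sketch, not the course's own code: the model ID `gemini-1.5-flash`, the `:generateContent?key=...` URL form, and the `contents` / `parts` / `text` body shape are assumptions based on the public REST API rather than something spelled out in the narration, and the helper names `build_endpoint`, `build_body`, and `send` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

SERVICE_URL = "https://generativelanguage.googleapis.com"
API_VERSION = "v1beta"
MODEL = "gemini-1.5-flash"  # assumed model ID for "Gemini 1.5 Flash"


def build_endpoint(api_key: str) -> str:
    """Assemble the URL in the same order as the walkthrough:
    service URL, API version, models resource, model, method, API key."""
    query = urllib.parse.urlencode({"key": api_key})
    return f"{SERVICE_URL}/{API_VERSION}/models/{MODEL}:generateContent?{query}"


def build_body(prompt: str) -> dict:
    """Request body with a single user prompt and no system instruction.
    The contents/parts/text layout is an assumption about the REST schema."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def send(api_key: str, body: dict) -> dict:
    """POST the JSON body and return the parsed response (needs a real key)."""
    request = urllib.request.Request(
        build_endpoint(api_key),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the request instead of sending it, so no API key is required.
    print(build_endpoint("YOUR_API_KEY"))
    print(json.dumps(build_body("Tell me about the Eiffel Tower"), indent=2))
```

Calling `send(api_key, build_body(...))` with a valid key would perform the same POST that the Send button performs in Postman.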
If we pull this up, as you can see, Gemini responded with a full description. However, let's assume we're building a travel guide app and we don't want to waste the traveler's time with too many details about the places they choose to inquire about. We can simply instruct the model to summarize its descriptions so that the traveler can quickly get the information that they need. So let's go ahead and add that rule as a system instruction.
Let's pull down the response window, and back in our request, let's begin adding our system instruction. A system instruction is added as another property in the main request object using the system_instruction property name. This property is set to an object, and this object contains a parts property, which is also set to an object. The parts object contains a text property, which will hold our instruction. Now, within the quotes of the value of this text property, let us add the instruction, "Responses should be restricted to three main facts about the query." This instruction will ensure that Gemini keeps its results short and focused.
So now let us run the query once again. Let's hit Send. Let's inspect the response, and as you can see, we now have a shortened version of the description of the Eiffel Tower. It contains just three sentences of facts about the monument, and Gemini has selected these three facts as the most important for anyone inquiring about the tower. By using system instructions, we have been able to scope our response to the requirements of our application. This gives developers and application builders a lot more power over the results they can get from Gemini.
In the next video, we'll be taking a look at another interesting parameter called the candidate count. See you in the next video.
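For reference, the request body from this demo can be sketched in Python. This is an illustrative sketch under stated assumptions: the `system_instruction` property with a `parts` object holding `text` follows the description in the narration, while the `contents` layout for the user prompt is assumed from the public REST API, and `build_body` is a hypothetical helper name.

```python
import json
from typing import Optional


def build_body(prompt: str, system_instruction: Optional[str] = None) -> dict:
    """Build a generateContent request body, optionally with a system instruction."""
    # User prompt; the contents/parts/text layout is an assumption.
    body: dict = {"contents": [{"parts": [{"text": prompt}]}]}
    if system_instruction is not None:
        # As described in the demo: system_instruction -> parts -> text.
        body["system_instruction"] = {"parts": {"text": system_instruction}}
    return body


if __name__ == "__main__":
    body = build_body(
        "Tell me about the Eiffel Tower",
        "Responses should be restricted to three main facts about the query.",
    )
    # This JSON is what would be pasted into the raw body field in Postman.
    print(json.dumps(body, indent=2))
```

Omitting the second argument yields the baseline body with no system instruction, which makes it easy to compare the two responses side by side.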
