From the course: Enhancing Your Notebook Workflow with Jupyter AI

Generating code snippets

- [Instructor] The key purpose of Jupyter AI is to bring generative AI into our workflow to enhance productivity, reduce repetitive tasks, and support learning through intelligent assistance. Now that we've successfully set up Jupyternaut, the conversational assistant, let's practice generating some code snippets. Our hypothetical use case is to apply Z-score normalization to a column of data.
For context, let's first ask Jupyternaut, what is Z-score normalization? So now Jupyternaut is bringing up the definition of Z-score normalization. It explains that, in statistics, it is a method used to standardize data by rescaling it to have a mean of zero and a standard deviation of one. And you can read all of the definition it's brought up. I should mention that the definition may be slightly different for you, but it's going to answer that question.
Now that we have a clear explanation, we can follow up with a prompt to generate the data set we will work with. So let's ask Jupyternaut to create a pandas DataFrame named df with 100 rows. And we want it to have columns like Age, and let's give it the range: we want Age to be between 18 and 65. Income, and the income should be between 25,000 and 120,000. And City, where we want the names of cities. Let's provide a list of cities. Let's say it should choose randomly from London, Manchester, Birmingham (I missed the G), Glasgow, and Edinburgh. So that's the name of our possible (indistinct) base, and let's also specify that it should use NumPy's random module: random integers for the numbers, and random.choice for the cities.
This is a pretty explicit prompt, and the reason we're going this route is that this is a pretty simple model, and the clearer you are, the better it interprets the task. If it were a more powerful model, we wouldn't need to give all these details. So there we go, Jupyternaut has gone ahead and created a code snippet that addresses all the things that we said.
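For readers following along without the video, the prompt described above might produce something along these lines. This is a minimal sketch, not Jupyternaut's actual output: the seed, the variable names `n_rows` and `cities`, and the use of `np.random.randint` for the "random integers" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed so the dummy data is reproducible (not part of the prompt)
np.random.seed(42)

n_rows = 100
cities = ["London", "Manchester", "Birmingham", "Glasgow", "Edinburgh"]

df = pd.DataFrame({
    # randint excludes the upper bound, so use 66 to allow an age of 65
    "Age": np.random.randint(18, 66, size=n_rows),
    # likewise 120_001 so that 120,000 itself is a possible income
    "Income": np.random.randint(25_000, 120_001, size=n_rows),
    # pick one city at random for each row
    "City": np.random.choice(cities, size=n_rows),
})

print(df.head())
```

Because the values are random, the rows you get will differ from the ones in the video unless the same seed and call order are used.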
We said it should use random. Okay, so now that we have the code, you can copy it to the clipboard, or you can insert it directly below the active cell, or above the active cell if that's what you prefer. So this is our code. I wouldn't want it to print the outcome yet. Or let's print out the entire outcome, since it's just 100 rows. We can reduce the size of this, and go ahead and run the cell.
Okay, it says there is no module named pandas. So let's go ahead and pip install pandas and numpy. So now it's installing the necessary libraries. Let's wait a bit, and then run the code again. Okay, so there we go, we have our hypothetical DataFrame. You can save this DataFrame now if you want to keep the original. Again, this is not going to look exactly the same for you and me, but it's going to be similar, and it shows how you can generate a simple dummy data set that works well for you.
Now, our original task was to apply Z-score normalization to the Income column. So let's ask Jupyternaut to apply Z-score normalization to the Income column in the DataFrame. The name of our DataFrame is df, using pandas and NumPy. Add the results as a new column called Income Z, or whatever you prefer. Again, part of the reason why I'm explicit that it should use pandas and NumPy is that this is a lightweight model, and the more explicit you are in the instruction, the better it can interpret it and give you what you want. For bigger models with larger capacity, you wouldn't need to restrict it by saying something like "use pandas and NumPy". It can use other libraries. But in this instance, this is what we want.
Okay, since we're not trying to run the entire code, let's copy just the part where it's applying the Z-score normalization. If this runs well, then when we display df again, we should have a new column called Income Z, just like we specified. Now you can go ahead and save this DataFrame with df.to_csv.
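The normalization step described above can be sketched as follows. This is an illustrative version rather than the code Jupyternaut generated: the small stand-in DataFrame and the column name `Income_Z` (an assumed spelling of "Income Z") are hypothetical. The Z-score of each value is (value - mean) / standard deviation.

```python
import numpy as np
import pandas as pd

# Stand-in data so the sketch runs on its own; in the notebook, df already exists
df = pd.DataFrame({"Income": [30_000, 45_000, 60_000, 75_000, 90_000]})

income = df["Income"].to_numpy()
mean_income = np.mean(income)
std_income = np.std(income)  # NumPy default is the population std (ddof=0)

# Z-score: how many standard deviations each income lies from the mean
df["Income_Z"] = (df["Income"] - mean_income) / std_income

print(df)
```

One design choice worth knowing about: `np.std` divides by n by default, whereas the pandas `Series.std()` method divides by n - 1, so the two give slightly different Z-scores. Either is a reasonable convention as long as it is applied consistently.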
Let's call the file Salaries Z score dot CSV. We wouldn't like it to have an extra index column, so let's set index to False. So there we go, df.to_csv. And if you check our files, we have this. It's good practice when you save a file to verify that it works. So let's copy the path, and then do pd.read_csv with the file that we just saved, the Salaries Z score CSV. Remember that you can save to other file formats if that's what you need. Now that you have this saved, you can right-click on it and download it to your local machine.
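The save-and-verify step above can be sketched like this. The file name `salaries_z_score.csv` is an assumed spelling of the name spoken in the video, and the temporary directory and stand-in data are only there so the example runs on its own; in the notebook you would save next to your notebook file.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in data; in the notebook this is the df with the new Z-score column
df = pd.DataFrame({
    "Income": [30_000, 60_000, 90_000],
    "Income_Z": [-1.2247, 0.0, 1.2247],
})

# Hypothetical file name, written to a temporary folder for this sketch
out_path = Path(tempfile.mkdtemp()) / "salaries_z_score.csv"

# index=False keeps pandas from writing the row index as an extra column
df.to_csv(out_path, index=False)

# Read the file back to confirm it was saved as expected
df_check = pd.read_csv(out_path)
print(df_check.columns.tolist())
print(df_check.shape == df.shape)
```

If `index=False` were left out, reading the file back would typically show an additional `Unnamed: 0` column holding the old row index.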
