From the course: Multimodal Programming Interfaces

Course introduction

- Multimodal programming is one of those things that can help you get results from AI tooling faster. Sometimes it's tricky to describe exactly what you need by typing everything into a prompt. Sometimes what you actually need is to capture a screenshot, or perhaps you already have an image, and you use that to enhance the prompt and let the large language model help you out. Throughout the course, we'll see how to put that together, put many different tools to good use, and then build something from scratch. We'll be doing some web development that brings all of this together with screenshots and a range of tools, including MCP and agents. Let's get started and see how multimodal programming can help us out.

Contents