From the course: AI Data Pipelines with Spring

Introducing Spring Cloud Stream for data pipelines

- [Instructor] In this section, I'll introduce you to Spring Cloud Stream. I'll review some concepts and how it can be used to build streaming data pipeline applications. Spring Cloud Stream is a Spring project that implements common design patterns for streaming applications. Check out the details at spring.io. I will use Spring Cloud Stream along with Spring Boot to build streaming applications.

With Spring Cloud Stream, data flows from one application to another using a messaging system. The streaming data pipeline applications do not communicate with each other directly. The messaging system provides an abstraction between applications. This introduces flexibility to the data pipeline: you can evolve and improve your applications independently when using a messaging system. In this course, I'll use the RabbitMQ messaging system. RabbitMQ is one of the most popular open source messaging systems. If you're not familiar with RabbitMQ, especially using it with Spring, check out my course on data resiliency with Spring and RabbitMQ event streaming.

A Spring Cloud Stream application can be a source application, a processor application, or a sink application. A source application starts the data flow, for example, reading data from a file or accepting data over an HTTP API and sending it to the messaging system. The sink application ends the data pipeline, for example, writing data received from the messaging system to a database. A processor is an intermediate step: it takes data from the messaging system, performs some sort of calculation or operation, and returns the output to the messaging system.

For our examples, when using RabbitMQ, data will always be sent to something called an exchange. You can think of a RabbitMQ exchange as an address to send data. Data will be read from a RabbitMQ queue. A queue is like a mailbox that holds messages sent to an exchange. The source application will send its output data to the exchange.
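The three application types map onto the standard `java.util.function` interfaces that Spring Cloud Stream binds to the messaging system: a `Supplier` for a source, a `Function` for a processor, and a `Consumer` for a sink. Here is a minimal sketch of that idea, run locally without a broker; in a real Spring Cloud Stream application these would be `@Bean` methods in a `@SpringBootApplication` class, and the method names (`lines`, `uppercase`, `store`) are hypothetical examples, not names from the course:

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

public class PipelineSketch {

    // Source: produces data to start the pipeline
    // (in Spring Cloud Stream, a Supplier @Bean published to an exchange).
    static Supplier<String> lines() {
        return () -> "sensor,42";
    }

    // Processor: consumes input, transforms it, returns the output
    // (a Function @Bean bound between two destinations).
    static Function<String, String> uppercase() {
        return s -> s.toUpperCase();
    }

    // Sink: ends the pipeline, e.g. by writing to a database
    // (a Consumer @Bean reading from a queue).
    static Consumer<String> store() {
        return s -> System.out.println("stored: " + s);
    }

    public static void main(String[] args) {
        // Wire the three stages together directly, standing in for
        // the messaging system that would normally sit between them.
        String raw = lines().get();
        String processed = uppercase().apply(raw);
        store().accept(processed);
    }
}
```

With the framework on the classpath, Spring Cloud Stream discovers beans of these functional types and handles the publishing and consuming for you, so the application code stays free of messaging-system details.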
The sink application will read data from the RabbitMQ queue. It will also associate a queue with an exchange using something called a binding in RabbitMQ. In our example, all messages sent to the exchange will be saved in the queue. The processor application will consume input data from a queue and publish its output data to a RabbitMQ exchange. In the next section, I'll show you how to run a Spring Cloud Stream source application to start our first data pipeline.
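Those exchange and queue associations are typically expressed in configuration rather than code. As a hedged sketch, a processor's `application.yml` might look like this, where the function name `uppercase` and the destination names `raw-data` and `processed-data` are illustrative assumptions, not values from the course:

```yaml
# Hypothetical processor configuration. Binding names follow Spring Cloud
# Stream's convention: <functionName>-in-0 and <functionName>-out-0.
spring:
  cloud:
    function:
      definition: uppercase
    stream:
      bindings:
        uppercase-in-0:
          destination: raw-data        # RabbitMQ exchange to consume from
          group: pipeline              # consumer group; RabbitMQ binder creates a durable queue bound to the exchange
        uppercase-out-0:
          destination: processed-data  # RabbitMQ exchange to publish to
```

Setting a consumer `group` is what makes the RabbitMQ binder create a named, durable queue and bind it to the exchange, so messages survive application restarts.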
