Beginner's guide to creating and deploying a Spring Boot Java application on Elastic Beanstalk as a Worker application
In this guide I share my experience, and the important points to know, when creating and deploying a Spring Boot Java application on AWS Elastic Beanstalk as a background Worker application.
Asynchronous, worker-based processing is a useful design pattern for tasks that need not block the critical path of a data pipeline and can instead be handled by an asynchronous processing pipeline.
Typically asynchronous processing is achieved as follows:
- Producers submit requests to a request queue.
- Consumers read requests from the request queue and process them as soon as possible.
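The two steps above can be sketched in plain Java. This is only an illustration of the pattern, not the setup used later in this guide: an in-memory BlockingQueue stands in for the real request queue, and the names QueueWorkerSketch, run and STOP are made up for the example.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class QueueWorkerSketch {

    // Hypothetical stop marker so the demo terminates; a real queue consumer would keep polling.
    private static final String STOP = "__STOP__";

    /**
     * Submits every request to a shared queue and lets consumerCount
     * consumer threads drain it. Returns the processed results
     * (order is not guaranteed when there are several consumers).
     */
    static List<String> run(List<String> requests, int consumerCount) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new CopyOnWriteArrayList<>();
        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);

        for (int i = 0; i < consumerCount; i++) {
            consumers.submit(() -> {
                try {
                    while (true) {
                        String request = queue.take();     // blocks until a request arrives
                        if (STOP.equals(request)) {
                            return;
                        }
                        processed.add("done:" + request);  // stand-in for the real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer side: enqueue and move on without waiting for the work to finish.
        for (String request : requests) {
            queue.put(request);
        }
        // One stop marker per consumer so that every thread terminates.
        for (int i = 0; i < consumerCount; i++) {
            queue.put(STOP);
        }

        consumers.shutdown();
        consumers.awaitTermination(10, TimeUnit.SECONDS);
        return processed;
    }
}
```

Calling QueueWorkerSketch.run(List.of("a", "b", "c"), 2) processes the three requests with two consumers. Raising the consumer count is the in-process analogue of launching more consumer instances.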
The throughput and effectiveness of this pipeline depend on how quickly the consumers can process requests. The throughput of a single consumer depends on the resources (for example, RAM) available to its server. Once that ceiling is reached, the way to scale consumers is horizontally, i.e. by automatically launching new consumers to take on the higher volume of requests and increase overall throughput. This is where AWS Elastic Beanstalk is a big help: it provides a mechanism to auto-scale based on various metrics such as NetworkIn/NetworkOut, the number of requests in the queue, CPU, and so on.
Now, coming to the actual problem I was solving: I wanted to design a scalable pipeline that offloads a) image downloading, and b) updating the status of the image download in the database. For (b), I created a Spring Boot Java application deployed on Elastic Beanstalk as a Worker application, receiving its tasks from an SQS queue.
The Java Application
The application was developed in Java with Spring Boot 2.0.5.RELEASE. I used Gradle with the Gradle wrapper to build and package the application. The typical dependencies required are:
"org.springframework.boot:spring-boot-starter"
"org.springframework.boot:spring-boot-starter-web"
"org.springframework.boot:spring-boot-starter-data-jpa"
"org.springframework.cloud:spring-cloud-starter-aws"
"org.springframework.cloud:spring-cloud-aws-messaging"
"com.google.code.gson:gson:2.8.5"
"org.springframework.boot:spring-boot-gradle-plugin"
A design requirement of the Elastic Beanstalk Worker environment is that the input to the application comes from an SQS queue. As part of the Beanstalk architecture, an sqsd daemon is installed on each instance. It acts as a manager of messages: it reads messages from the SQS queue and passes them on to the application as HTTP POST calls.
This means that the Java application should expose an HTTP POST endpoint to which sqsd can send the messages to be processed. So remember to create this HTTP POST endpoint in your application. An example endpoint is:
// Declare this method inside a @RestController (or @Controller) class.
@RequestMapping(value = "/processMessage", method = RequestMethod.POST)
public void processMessage(@RequestBody InputMessageJavaObject message) throws Exception {
    // application code goes here
}
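To make the contract with sqsd more concrete, here is a small sketch that uses only the JDK (Java 11+) instead of Spring, so it can be run on its own. It assumes, as I understand the worker environment, that sqsd treats a 200 OK response as "message processed" and deletes the message, while any other status makes the message available for a retry later. The class WorkerEndpointSketch, its methods, and the rule "an empty body is a failure" are hypothetical and exist only for this illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class WorkerEndpointSketch {

    /** Starts a tiny HTTP server exposing POST /processMessage (port 0 = any free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/processMessage", exchange -> {
            int status;
            if (!"POST".equals(exchange.getRequestMethod())) {
                status = 405; // only POST is accepted
            } else {
                String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                // Hypothetical rule for the demo: an empty body counts as a failed message.
                status = body.isBlank() ? 500 : 200;
            }
            exchange.sendResponseHeaders(status, -1); // -1 = no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Plays the role of sqsd: POSTs one message body and returns the HTTP status. */
    static int post(int port, String json) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/processMessage"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }
}
```

In the Spring version, the equivalent behaviour would be to let the method return normally on success and to let an exception propagate (which produces a non-200 response) when the message should be retried.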
Packaging of Java Application
I did not find any great examples of how to package the Java application to be compatible with Elastic Beanstalk. After many trials, I arrived at the package that Elastic Beanstalk expects. To prepare the package I created a custom task in my build.gradle file. Some of my learnings and notes from the experience are below:
- Elastic Beanstalk expects the Java application to be packaged as a ZIP file, so the goal was to create a ZIP file.
- The bare minimum files required in the ZIP file are:
.ebextensions
.ebextensions/*.config
yourapplication.jar
Procfile
- yourapplication.jar -> this is the application package generated by a typical Gradle build. To check whether the jar will work in Elastic Beanstalk, make sure it runs with the command “java -jar yourapplication.jar”. Note: the name of the jar can be anything you wish.
- Procfile -> this file is used to run the Java application when the new environment is created, so it contains the same command you would typically use to run the application from the command line. Note: “web” denotes the main command that runs the application.
- One issue I encountered is that the environment variables set through the Elastic Beanstalk Software configuration are not directly available in the Procfile. I needed to pass different profiles to my application, which I achieved by creating a separate Procfile for each profile and building a separate ZIP package for each profile. For example, the Procfile for the dev profile:
web: java -Dspring.profiles.active=dev -jar yourapplication-1.0.jar $JAVA_ARGS
- .ebextensions/*.config -> these files can be used to set up any custom infrastructure the application needs, for example creating a log folder to store application logs, changing the ownership/permissions of files, and so on. I used one to create the application log folder as follows:
commands:
  01_create_log_directory:
    command: "mkdir -p /var/log/yourapplication"
  02_set_log_permissions:
    command: "chmod -R 777 /var/log/yourapplication"
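The bundle layout described in this section can also be sketched in Java. This is not the Gradle task I used; it is a stand-alone illustration of which entries end up in the ZIP, written with the JDK's java.util.zip. The class BundleSketch and the method writeBundle are hypothetical names. Note that entry names inside a ZIP use forward slashes, so the config file goes in as .ebextensions/yourfile.config.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class BundleSketch {

    /**
     * Writes a ZIP file from a map of "path inside the ZIP" -> "file on disk".
     * Keys are used verbatim as entry names, so they should use forward slashes.
     */
    static void writeBundle(Path zipFile, Map<String, Path> entries) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, Path> entry : entries.entrySet()) {
                zip.putNextEntry(new ZipEntry(entry.getKey()));
                Files.copy(entry.getValue(), zip); // stream the file content into the current entry
                zip.closeEntry();
            }
        }
    }
}
```

A typical call would map "Procfile", "yourapplication-1.0.jar" and ".ebextensions/logs.config" (a made-up file name) to the corresponding files on disk. Building one bundle per profile then only means swapping the Procfile entry.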
The Elastic Beanstalk
I created a Worker environment with “Java” as the platform and, for the application code, uploaded my application ZIP file. The ZIP file can also be uploaded to S3, in which case you provide its S3 path instead.
Following are some important points when configuring the application.
- Software: here we configure real-time logging and environment variables. An important point is that Elastic Beanstalk expects the Java application to listen on port 5000 (unlike port 8080, on which a Spring Boot application listens by default). You can set the port the application listens on by creating a new environment variable:
SERVER_PORT: 5000
- Worker: here we configure the input SQS queue, the HTTP POST endpoint, the input MIME type, the number of concurrent connections, the timeout period, and so on. The most important settings are the SQS queue, the HTTP endpoint (example: /processMessage), and the input MIME type (application/json).
- Capacity: here we configure whether the application should run as a single instance or scale based on environment metrics. This is easy to set up.
- Instances: here you configure the instance type to launch and, most importantly, the security group (which decides what this environment can access).
I created and saved multiple deployment configurations to suit my dev and production needs.
Conclusion
I was pleasantly surprised by how smooth the setup and configuration through the UI were. To verify the auto-scaling, I generated a consistent, heavy load in a controlled manner and saw auto-scaling kick in to launch new instances and start the consumer Java application, which successfully read messages from the SQS queue and processed them.
Overall, I think AWS Elastic Beanstalk provides a very easy and effective option to scale your architecture components with minimal DevOps support.
I encourage you all to try it out and share your experience.
Good luck!