Solving The Cloud Native Microservice Autonomy Problem: Implementing Search With Cosmos DB Graph and Azure Search (Part 1)
Ella: Here's the next set of UX concepts along with the UI we got from Anthony.
Jake: Awesome. Let me take a look.
...a few minutes later.
Jake: Hmm, what do you think of the search experience? The two microservices our team built, which own 100% of this data, are both using graph databases, right?
Ella: That's right. We're using Cosmos DB's Gremlin API. In terms of fulfilling the UX needs we have for search, are you asking me about querying and our microservice boundaries and sharing data?
Jake: Exactly what I was thinking. How can we avoid sharing? It seems almost unavoidable now.
Ella: Well, my assessment is that we're going to run aground if we apply SRP where we're creating single query mechanisms all over the place. The search UX that we're implementing doesn't mean we have to violate our service boundaries though. Each microservice will still retain data ownership. I've done a preliminary evaluation of Azure's Search service before coming to you...
[Conversation continues on.]
***
In the dialog you just imagined, we've read up to the point where things are getting interesting. For starters, how can we maintain service boundaries and yet have a search on the backend where the microservices continue to be loosely coupled? How is Ella going to reconcile this seeming paradox? Second, how are they going to achieve an acceptable search experience using a graph database, which wasn't intended for these query patterns?
Let's dive into the details, shall we? Because these are non-trivial problems, I'm breaking this down into two parts.
We'll start with 1) our understanding of microservice boundaries in the cloud, and then 2) how we're actually going to solve Ella and Jake's dilemma with Cosmos DB Graph and Search.
Microservice boundaries in the cloud
Sprinkled into the conversation above, you'll find some videos and articles on boundaries. However, the best definition of a service you'll probably ever find was written by Udi Dahan--aaaaaaaaallll the waaaaaay back in the middle ages:
2010.
He states: "A service is the technical authority for a specific business capability. Any piece of data or rule must be owned by only one service." - Udi Dahan
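One way to picture that rule is the minimal sketch below. The `OwnershipRegistry` class, the service names, and the `work_order` and `technician` data names are all hypothetical, invented purely for illustration. The only point is that a second claim on the same piece of data gets rejected.

```python
class OwnershipError(Exception):
    """Raised when a second service claims data that already has an owner."""


class OwnershipRegistry:
    """Hypothetical registry: each piece of data or rule maps to exactly one service."""

    def __init__(self):
        self._owners = {}

    def claim(self, item, service):
        owner = self._owners.get(item)
        if owner is not None and owner != service:
            raise OwnershipError(f"{item!r} is already owned by {owner}")
        self._owners[item] = service

    def owner_of(self, item):
        return self._owners[item]


registry = OwnershipRegistry()
registry.claim("work_order", "service-a")
registry.claim("technician", "service-b")

try:
    # service-b tries to become a second owner of service-a's data
    registry.claim("work_order", "service-b")
    conflict = None
except OwnershipError as err:
    conflict = str(err)

print(conflict)
```

Nothing here says where the data physically lives. The registry only records who has authority over it.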
In many of the conversations I have with a business's technical team, they either think they are already doing microservices or believe they understand microservices well enough to do them. Unfortunately, "microservice" is still in a state of what Martin Fowler calls "semantic diffusion."
The situation gets even more interesting once we throw Azure's cloud resources at the problem and try to merge in our past habits of deploying software to VMs in a datacenter. What does "ownership of the data" mean anyway for my "physical architecture" if I'm ultimately using a multi-tenant Azure something or other (Cosmos DB anyone)?
This question alone is worth an extended post, so a short answer will need to suffice for now.
Here it is: microservice data and rule ownership in the cloud (*cough* Azure), at the physical architecture, becomes operational autonomy. In other words, can the microservice be deployed, operated, restarted, etc. autonomously? (I love the word autonomous, but it can often be misunderstood, so I'll use self-determining from now on.)
But what about search, huh?? I got you! You're about to tell me to create an instance of Azure Search, and then put all my microservice eggs into that one basket!
Logical not "physical"
Before you get too bent out of shape, we should clear up another characteristic of microservice boundaries which is absolutely fundamental to our conversation:
A microservice boundary is first and foremost a logical one.
Let's take Ella and Jake's scenario and approach it with all too common mistakes that either consciously or unconsciously disregard the "logical" bit of the statement above.
The "best practices" architecture
So we have the two microservices. We'll call them GREEN and BLUE. We're using Azure, and you and I are leads--just like Ella and Jake.
What are we going to do? Keep in mind as each mistake is highlighted, that creating an Azure resource is a good thing. How else are we going to build a cloud native system? So pay close attention to the reasoning just before Azure comes into play.
Mistake #1. Perhaps you're going to read up on some best practices. Why else would they be best practices? And why else would the Cloud Solution Architect be validating those best practices if they weren't the best? Best is what everyone wants. Right?
(Back to our microservices.)
They both use Cosmos DB with the Graph API as the backing SoR (system of record)--nothing wrong per se with this.
Oh, and we're also delivering the UI through a tablet. How else is our user going to search for the data they need to look up and submit their work order tickets while they're sitting in that truck out in the real world?
Mistake #2. Adhering to our best practices of microservice deployment, we ensure GREEN and BLUE microservices are on separate database resources. We conclude we should create two separate Cosmos DB accounts, one per service, so that each has its own database.
Mistake #3. We need to have the tablet app communicate with something, right? Right. And of course, GREEN and BLUE are microservices the app needs to consume.
Mistake #4. So we decide to go with the best practice microservice architecture and make them "REST APIs". How else are we going to do anything useful with GREEN and BLUE?
We create an API Management instance to protect and define our APIs (a good thing in and of itself). Now our client app can call GREEN and BLUE. Done. We just need to wire up API Management to a backend.
Mistake #5. The microservices must be a single deployable unit. This is obvious! *cough* Containers are the future and we've read blog after blog that this is the best practice. We pack GREEN and BLUE into separate docker images, and feel really good.
We need a registry and compute, so creating a private container registry and spinning up an AKS cluster with a few quick commands is the next obvious action. We deploy each microservice, I mean Docker image, to Kubernetes.
We wire up the backend of API Management and start making test calls from Postman. We're in business!
Mistake #6. One thing we notice while we do this is that one particular HTTP request to GREEN is slow. I mention it to you and you remark that GREEN is making an internal HTTP call to BLUE to fetch some data to build up its response.
Being the best practice guru I am, I remark that, "We don't want to prematurely optimize at this point." You agree and put it on the backlog as a technical spike for later.
Bam! We haven't solved that pesky search yet, but we'll get there. We practice "emergent architecture". We've got a loosely coupled architecture. Right?
Not really.
We've made so many critical missteps thus far in this fictitious scenario, we've caused future problems with reliability, scalability, resiliency, and COST.
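To put rough numbers on the reliability point, here is a back-of-the-envelope sketch of Mistake #6. The availability and latency figures are made-up assumptions, not measurements, and the model assumes the two services fail independently. When GREEN must synchronously call BLUE to build its response, the request only succeeds if both are up, and it waits on both.

```python
def chained_availability(*availabilities):
    """Availability of a request that needs every service in the chain to be up.

    Assumes the services fail independently of each other.
    """
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def chained_latency_ms(*latencies_ms):
    """Latency of a request that waits on each hop in sequence."""
    return sum(latencies_ms)


# Assumed, purely illustrative figures
green_alone = 0.999
blue_alone = 0.999

green_calling_blue = chained_availability(green_alone, blue_alone)
total_latency_ms = chained_latency_ms(40, 60)

print(f"GREEN alone:        {green_alone:.4%}")
print(f"GREEN calling BLUE: {green_calling_blue:.4%}")
print(f"Total latency:      {total_latency_ms} ms")
```

Every extra synchronous hop multiplies in another availability term and adds another latency term, which is why that "technical spike for later" is more than a performance tweak.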
A cloud native architecture is first and foremost about resourcefulness; and resourcefulness starts with people who are resourceful.
A healthy, working relationship
Our fictional Ella and Jake are not making these mistakes. They understand that microservice boundaries will be exercised in what I'll call the operational relationship. Another word for operational is working.
Have you ever had a healthy working relationship with someone? To have this type of relationship, there's a complex mix of personality compatibility, social rules, etc. Microservices are the same.
To achieve a healthy microservice composed system, we must combine certain fundamental design principles and engineering disciplines.
Remember that tablet/mobile app? Which microservice owns that code that's making a call to GREEN? Or BLUE?
Hint: If that app on that tablet doesn't have the GREEN and BLUE microservices on it, then we've already made a mistake and violated the fundamental design principles of autonomy and loose coupling.
Where many smart and well-intentioned architects and developers get tripped up is when they think deploying microservices GREEN and BLUE into the same process is violating autonomy. In fact, this is a choice that must be made.
Let me explain.
What thing is "allowed" to make an HTTP call to GREEN, logically? It must be GREEN. This means that we're going to end up with one or more microservices running in that process on that tablet!
Turtles all the way down, my friend.
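A minimal sketch of the idea, under stated assumptions: the class names, the paths, and the `fake_transport` function are all hypothetical, and the transport is a stand-in so the example runs without a network. The code that calls GREEN's backend belongs to GREEN, the code that calls BLUE's backend belongs to BLUE, and the tablet app merely hosts both in one process.

```python
def fake_transport(method, path):
    """Stand-in for an HTTP call so the sketch runs offline (hypothetical paths)."""
    canned = {
        ("GET", "/green/tickets"): ["ticket-1", "ticket-2"],
        ("GET", "/blue/assets"): ["asset-9"],
    }
    return canned[(method, path)]


class GreenComponent:
    """Owned and shipped by GREEN: the only code that knows GREEN's backend API."""

    def __init__(self, transport):
        self._transport = transport

    def tickets(self):
        return self._transport("GET", "/green/tickets")


class BlueComponent:
    """Owned and shipped by BLUE: the only code that knows BLUE's backend API."""

    def __init__(self, transport):
        self._transport = transport

    def assets(self):
        return self._transport("GET", "/blue/assets")


class TabletApp:
    """The app shell composes both components in a single process.

    It never builds a GREEN or BLUE request itself.
    """

    def __init__(self, green, blue):
        self.green = green
        self.blue = blue

    def home_screen(self):
        return {"tickets": self.green.tickets(), "assets": self.blue.assets()}


app = TabletApp(GreenComponent(fake_transport), BlueComponent(fake_transport))
screen = app.home_screen()
print(screen)
```

If GREEN changes its backend API, only `GreenComponent` changes. The shell and BLUE are untouched, even though all three run in the same process on the tablet.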
Wrapping it up
Getting back to our narrative.
Ella is on the right path with Azure Search since it fulfills the needs the UX is driving for the tablet, but Jake still has difficulty reconciling his understanding of microservices with implementing a search that can be used in the UI and still keeps each microservice's data ownership intact.
I'll begin expanding on the full solution in the next article of this series.
Did this bring back memories for you 😉?