Cost Transparency in the Cloud
Semantics to analyze and reduce Operational Costs in complex (Micro-)Service Environments
Increasingly, we not only use services from the cloud, from providers such as Amazon, Microsoft, Google, IBM, Digital Ocean or Heroku, but we also offer our own services through them. Here complex costs arise from basic fees and usage rates, depending on scaling requirements, the number of requests, messages or users, usage times or transfer volume, CPU or RAM requirements or even license costs.
Our (micro-)services use one or more of the low-level services of the various cloud providers, and our apps utilize our (micro-)services. Now, what are the resulting operational costs of our apps? How do I calculate a fair price for my customers? How do I lower my operating costs in a competitive market environment? Which of my services are used to what extent and by whom? Which services are worth expanding, which might need a revision, and which might even need to be discontinued?
To answer these questions, we first need a model of our application infrastructure, i.e. which components and which relations and dependencies exist between them. Then: What types of costs exist and where do these costs arise? Which stakeholders are involved and use which apps or services? And finally: Where and how do I get all this information and how good and complete is it?
The applications and services and their dependencies are ideally suited to being represented in a graph. Just a few classes like Application and Service, Customer and Vendor, together with properties like usesApplication and usesService, quickly create a picture of your software infrastructure. Transitive properties like usesService follow paths through the graph and expose which app uses which low-level service without an explicit association. Inverse properties such as isUsedBy allow both bottom-up queries such as "Which low-level service is used by which applications and customers" and top-down queries such as "How many applications and customers use which low-level service". This is where the power of a semantic graph lies.
In a semantic graph database, classes are structured hierarchically in the form of a taxonomy. A taxonomy for our cost analysis could contain the superclass Cost with the subclasses CostPerTime and CostPerUsage, and CostPerUsage in turn with the subclasses CostPerRequest and CostPerVolume.
If each service is now linked by one or more hasCost properties to its respective cost types, each of which is created only once, our graph already contains the necessary knowledge about the relations between the applications, the services and their costs.
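To make the transitive and inverse properties a little more tangible, here is a minimal sketch in plain Python rather than in an actual graph database. All app and service names (ShopApp, OrderService, MessageQueue and so on) are made up for illustration, and the traversal only imitates what a semantic graph store would infer for you.

```python
from collections import defaultdict

# Hypothetical direct usesService edges: consumer -> services it calls directly.
USES_SERVICE = {
    "ShopApp": {"OrderService"},
    "AdminApp": {"OrderService", "ReportService"},
    "OrderService": {"ObjectStorage", "MessageQueue"},
    "ReportService": {"ObjectStorage"},
}


def reachable(node, edges):
    """Follow a property transitively and return every node reachable from `node`."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for target in edges.get(current, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def invert(edges):
    """Derive the inverse property (isUsedBy) from the usesService edges."""
    inverse = defaultdict(set)
    for source, targets in edges.items():
        for target in targets:
            inverse[target].add(source)
    return dict(inverse)


IS_USED_BY = invert(USES_SERVICE)

# Top-down: which services does ShopApp depend on, directly or indirectly?
shop_services = reachable("ShopApp", USES_SERVICE)

# Bottom-up: which apps and services depend on ObjectStorage?
storage_consumers = reachable("ObjectStorage", IS_USED_BY)
```

Note that ShopApp never references ObjectStorage explicitly; the dependency only appears by following usesService through OrderService.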
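The taxonomy and the hasCost links could be sketched roughly as follows. The class names come from the taxonomy described above; the cost instances, the services and the rates are invented placeholders, not real provider prices.

```python
# Taxonomy from the text: subclass -> direct superclass.
SUBCLASS_OF = {
    "CostPerTime": "Cost",
    "CostPerUsage": "Cost",
    "CostPerRequest": "CostPerUsage",
    "CostPerVolume": "CostPerUsage",
}


def is_a(cost_class, ancestor):
    """True if `cost_class` equals `ancestor` or is one of its (indirect) subclasses."""
    while cost_class is not None:
        if cost_class == ancestor:
            return True
        cost_class = SUBCLASS_OF.get(cost_class)
    return False


# Hypothetical cost instances, each created once and shared by reference.
# The rates are illustrative numbers only.
COSTS = {
    "queue_request_fee": {"type": "CostPerRequest", "rate": 0.002},
    "storage_monthly_fee": {"type": "CostPerTime", "rate": 5.0},
    "storage_egress_fee": {"type": "CostPerVolume", "rate": 0.09},
}

# hasCost: service -> identifiers of its cost instances (hypothetical services).
HAS_COST = {
    "MessageQueue": ["queue_request_fee"],
    "ObjectStorage": ["storage_monthly_fee", "storage_egress_fee"],
}


def costs_of(service, cost_class="Cost"):
    """Return the cost instances of a service that fall under `cost_class`."""
    return [
        cost_id
        for cost_id in HAS_COST.get(service, [])
        if is_a(COSTS[cost_id]["type"], cost_class)
    ]
```

Asking for `costs_of("ObjectStorage", "CostPerUsage")` returns only the egress fee, because CostPerVolume sits below CostPerUsage in the taxonomy while CostPerTime does not.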
Linked Information in a Semantic Graph
This is the basis for a cost estimate based on expected user behavior and on experience, which is a good start for the initial pricing of an app or service to be offered. In practice, however, it is difficult to accurately predict user behavior and thus the actual use of the paid low-level services.
Strategic goals such as reducing costs or focusing on profitable apps or customers require not only a model, but also the most complete and high-quality data possible – primarily from operations, but also from development and support.
Many cloud providers offer APIs for accessing usage data. Together with our infrastructure model, this data is the foundation for quantitative and qualitative cost analysis. Ultimately, if your own (micro-)services also track their usage, you gain cost transparency not just at every level of your software stack but, by utilizing the knowledge about the relations in the model (within the graph), also across customers and cloud providers.
Initially, using mappings to a reference model, the semantic model in the graph database harmonizes heterogeneous data from different sources and provides a central terminology. In our scenario, this creates a common semantic understanding of the individual cost information of the various cloud providers, e.g. for cross-platform reports and expressive KPIs.
Even on its own, the model provides insight into dependencies through the relations between the apps and the various services, and thus supports, for instance, effort estimates for necessary changes (impact analysis).
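As a rough illustration of such a mapping, the following sketch translates usage records from two providers into one shared vocabulary. The provider names and all field names are assumptions made up for this example; they do not reflect the billing schema of any real cloud provider.

```python
# Hypothetical mappings: provider-specific field name -> reference model term.
MAPPINGS = {
    "provider_a": {"svc": "service", "units": "quantity", "amount_usd": "cost"},
    "provider_b": {"product": "service", "usage": "quantity", "charge": "cost"},
}


def harmonize(provider, record):
    """Rename a provider-specific usage record to the reference model's terms."""
    mapping = MAPPINGS[provider]
    unified = {reference: record[source] for source, reference in mapping.items()}
    unified["provider"] = provider  # keep the origin for cross-provider reports
    return unified


# Two records with different shapes end up with identical keys.
record_a = harmonize("provider_a", {"svc": "queue", "units": 10, "amount_usd": 0.5})
record_b = harmonize("provider_b", {"product": "storage", "usage": 3, "charge": 1.2})
```

Once every record uses the same terms, a cross-platform report is an ordinary aggregation over `service`, `quantity` and `cost`.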
The largest cost drivers, and thus the achievable savings potential, become visible as soon as the extensive operational data, generated by the services themselves and stored in suitable big data platforms, is combined with the semantic model. The information about a request to a service in the logs is already sufficient to evaluate the CostPerRequest per app, per customer or per provider. The other cost types work analogously; all of them can be aggregated upwards along the taxonomy and validated directly against the provider invoice.
The knowledge about the app and service infrastructure persisted in the semantic model is the basis for formulating suitable queries to the big data platforms. The combination of both technologies delivers the desired statements for business-critical calculations and decisions.
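A small sketch of this evaluation, assuming each log entry records the calling app, the customer, the provider and the service requested. The log entries, names and per-request rates below are invented for illustration; in a real setup the log would come from the big data platform and the rates from the cost instances in the graph.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical per-request rates, as they might be stored on CostPerRequest instances.
RATE_PER_REQUEST = {
    "MessageQueue": Decimal("0.002"),
    "ObjectStorage": Decimal("0.005"),
}

# Hypothetical request log: one entry per request to a low-level service.
LOG = [
    {"app": "ShopApp", "customer": "Acme", "provider": "provider_a", "service": "MessageQueue"},
    {"app": "ShopApp", "customer": "Acme", "provider": "provider_b", "service": "ObjectStorage"},
    {"app": "ShopApp", "customer": "Globex", "provider": "provider_a", "service": "MessageQueue"},
    {"app": "AdminApp", "customer": "Globex", "provider": "provider_b", "service": "ObjectStorage"},
]


def cost_per_request_by(key, log=LOG, rates=RATE_PER_REQUEST):
    """Sum the per-request cost of every log entry, grouped by `key`."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so missing groups start at 0
    for entry in log:
        totals[entry[key]] += rates[entry["service"]]
    return dict(totals)


by_app = cost_per_request_by("app")
by_customer = cost_per_request_by("customer")
by_provider = cost_per_request_by("provider")
```

Whichever dimension is chosen, the grouped totals add up to the same overall amount, which is the figure one would then compare with the provider invoice.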
For instance: Which applications generate which costs in a certain period of time with which services? With which customer do I achieve the highest or lowest profit or loss with which apps? Which cloud providers generate the highest costs and which services could be cheaper using other providers with known user behavior? Which contribution margins are achieved by which apps and services?
This case study shows how much potential for cost analysis and cost reduction semantic data management can unlock, even in complex service infrastructures in the cloud, with a comparatively manageable implementation effort.
Interested in further articles on semantic data management? Follow this blog.