Know Thy API

I am a fan of perimeter security when it comes to ingesting APIs from the outside world. Build a castle, then build a moat around it with a drawbridge and guards. What does that mean once you step out of the medieval era?

  • Use a fortified API gateway that exposes HTTPS endpoints to receive API requests. 
  • Put up a web application firewall, or something that can catch the OWASP API Security Top 10 threats. 
  • Do your input validation right there. 
  • Authenticate the user, service account, or client (anything that is sending API requests to you) right there. 
  • Don’t put all your credentials or user databases at the gateway, of course; instead use a classic RADIUS-like AAA model, with the API gateway acting as the authentication edge and your authentication service in the backend. This way you can delegate authentication to any type of identity provider on the market and hopefully even get some GDPR mileage out of your vendor. 
  • Do the same with authorization: delegate it to a service that can do role-based access control (RBAC) and granular, service-level authorization. This means avoiding generic, buffet-style, all-you-can-eat tokens and forcing authorization to be specific to a service, so that lateral movement within the system is prevented. Just because you came in through the door should not mean you can go anywhere and eat anything: follow a zero-trust model, scope your authorization tokens, and tie them to the microservices you deem warranted. But please don’t put any personally identifiable information (PII) inside the token unless you intend to encrypt it all the way to the client.
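The scoped-token idea above can be sketched in a few lines. Everything here is hypothetical (the signing key, the service names, the scope strings), and a real deployment would use a vetted JWT library and a managed signing key; the point is only to show a token tied to one audience, carrying scopes and no PII, being rejected by every other service:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # hypothetical; use a managed secret in production

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_token(subject: str, audience: str, scopes: list[str], ttl: int = 300) -> str:
    """Mint a JWT-style token scoped to a single service (the audience)."""
    header = {"alg": "HS256", "typ": "JWT"}
    # Claims carry only opaque identifiers and scopes -- no PII.
    claims = {"sub": subject, "aud": audience,
              "scope": " ".join(scopes), "exp": int(time.time()) + ttl}
    signing_input = (b64url(json.dumps(header).encode())
                     + "." + b64url(json.dumps(claims).encode()))
    sig = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

def authorize(token: str, service: str, needed_scope: str) -> bool:
    """A service accepts a token only if it is addressed to it and carries the scope."""
    head, body, sig = token.split(".")
    expected = hmac.new(SECRET, f"{head}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), sig):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    return (claims["aud"] == service
            and needed_scope in claims["scope"].split()
            and claims["exp"] > time.time())

tok = mint_token("client-7f3a", audience="billing-service", scopes=["invoices:read"])
print(authorize(tok, "billing-service", "invoices:read"))       # True
print(authorize(tok, "user-profile-service", "profiles:read"))  # False: no lateral movement
```

The second call is the zero-trust point: the same valid, signed token is useless at any service other than the one it was minted for.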

There are new API intelligence tools that give you visibility into what API traffic is coming through your gateways. They show you not only every API request and response that passes through the gateway, but also which endpoints the calls are going to and coming from. For a security team this is a great way to build visibility into API traffic, get organized, establish baseline patterns, and spot anomalies, then decide whether they are merely software or deployment anomalies or actual security incidents that need more attention. Whether you call this machine learning or just advanced monitoring is up to you. Whether you call it supervised learning or configuration, either way some legwork is obviously required. For instance, you need to tag your endpoints with labels that make sense for your analysis: whether you tag by product name, cluster name, environment type (dev or production), or even the name of the product engineering lead to email, is up to you.
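A minimal sketch of the baseline-and-tagging idea, assuming nothing about any particular vendor's tool: endpoint tags (the products, environments, and email addresses here are all made up) plus a rolling per-endpoint request-count baseline that flags intervals far outside the norm:

```python
import statistics
from collections import defaultdict

# Hypothetical endpoint tags: product, environment, owning engineering lead.
ENDPOINT_TAGS = {
    "/v1/orders":  {"product": "checkout",  "env": "prod", "lead": "alice@example.com"},
    "/v1/reports": {"product": "analytics", "env": "dev",  "lead": "bob@example.com"},
}

class Baseline:
    """Rolling per-endpoint request-count baseline; flags large deviations."""
    def __init__(self, window: int = 50):
        self.window = window
        self.history = defaultdict(list)  # endpoint -> counts per interval

    def observe(self, endpoint: str, count: int) -> bool:
        """Record one interval's request count; return True if it looks anomalous."""
        hist = self.history[endpoint]
        anomalous = False
        if len(hist) >= 10:  # need some history before judging anything
            mean = statistics.mean(hist)
            stdev = statistics.pstdev(hist) or 1.0
            anomalous = abs(count - mean) > 3 * stdev  # simple 3-sigma rule
        hist.append(count)
        del hist[:-self.window]  # keep only the rolling window
        return anomalous

b = Baseline()
for c in [100, 98, 103, 99, 101, 102, 97, 100, 99, 101]:
    b.observe("/v1/orders", c)           # building the baseline
if b.observe("/v1/orders", 950):         # sudden spike
    tags = ENDPOINT_TAGS["/v1/orders"]
    print(f"anomaly on /v1/orders ({tags['product']}, {tags['env']}); notify {tags['lead']}")
```

Real tools do far more than a 3-sigma rule, but the legwork is the same: the tags are what turn "an anomaly on some endpoint" into "a production checkout problem, and here is who to email."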

For architecture teams trying to set architectural patterns, establish API standards, and build maturity in the API ecosystem, these tools are also valuable: for instance, ask the product team to provide an API specification for the new product, load it into the tool, and detect any violations up to your chosen sensitivity level. Was the API expecting specific headers, or specific types in a payload, and what you see in the field does not match? You can catch that, or even block it. Whether you think you are dealing with reconnaissance or malicious payload injection patterns, you can detect that as well and investigate according to how severe you judge the anomaly to be.
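The spec-versus-traffic check can be illustrated with a toy spec. The format below is invented for the example (real tools consume OpenAPI or similar), but the mechanics are the same: compare each observed request to what the spec says and report every mismatch:

```python
# Hypothetical, hand-rolled "spec": required headers and payload field types
# per endpoint. Illustrative only -- not any real specification format.
SPEC = {
    "POST /v1/orders": {
        "required_headers": {"content-type", "x-request-id"},
        "payload_types": {"sku": str, "quantity": int},
    }
}

def check_request(method: str, path: str, headers: dict, payload: dict) -> list:
    """Return a list of spec violations for one observed request."""
    spec = SPEC.get(f"{method} {path}")
    if spec is None:
        return [f"unknown endpoint {method} {path}"]  # possible reconnaissance
    violations = []
    missing = spec["required_headers"] - {h.lower() for h in headers}
    violations += [f"missing header: {h}" for h in sorted(missing)]
    for field, expected in spec["payload_types"].items():
        if field not in payload:
            violations.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            violations.append(f"field {field}: expected {expected.__name__}, "
                              f"got {type(payload[field]).__name__}")
    return violations

print(check_request("POST", "/v1/orders",
                    {"Content-Type": "application/json"},
                    {"sku": "A-100", "quantity": "3"}))
# ['missing header: x-request-id', 'field quantity: expected int, got str']
```

Whether a given violation is merely logged, alerted on, or blocked outright is the "sensitivity level" knob the text describes.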

On the other end, some of these tools tell you where the API traffic is coming from: if it is external, which IP address is it coming from? Is it a special client associated with a customer? Is it a known user using that special client? Keep the customer support team close by, so you can let your customer know you are receiving anomalous traffic from their IP addresses, employees, or clients.

Knowing baseline patterns and being able to detect anomalies goes a long way even for dealing with out-of-the-norm internal traffic (inside the castle). Is it somebody stress-testing the wrong production endpoint instead of a dev environment endpoint? Or is it an attacker who is already inside and exfiltrating data? This is where having a good inventory of internal IP address ranges comes in handy: which IP ranges are the product workload clusters (say, Kubernetes) using? Heck, you can even talk to your SRE team and find out whether the Kubernetes clusters are all running up-to-date operating systems on their worker nodes and pulling container images free of CVEs. And if not, it is time for that product team to deal with those old JIRA tickets and fix those CVEs. Similarly, verify your service architecture: which clusters talk to which, and whether you have any kind of secure communication between them. 
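The internal IP inventory lends itself to a short sketch. The cluster names and CIDR ranges below are made up, but the lookup logic is exactly what lets you tell "known workload cluster" from "private address nobody claims" from "external":

```python
import ipaddress

# Hypothetical inventory of internal ranges, keyed by owning cluster.
CLUSTER_RANGES = {
    "k8s-checkout-prod": ipaddress.ip_network("10.12.0.0/16"),
    "k8s-analytics-dev": ipaddress.ip_network("10.44.0.0/16"),
    "corp-vpn":          ipaddress.ip_network("172.16.8.0/22"),
}

def classify_source(ip_str: str) -> str:
    """Map a source IP to a known internal cluster, or flag it as unknown/external."""
    ip = ipaddress.ip_address(ip_str)
    for name, net in CLUSTER_RANGES.items():
        if ip in net:
            return name
    # Private but unclaimed addresses are the interesting case: who is that?
    return "internal-unknown" if ip.is_private else "external"

print(classify_source("10.12.3.7"))  # k8s-checkout-prod
print(classify_source("10.99.0.1"))  # internal-unknown: worth investigating
print(classify_source("8.8.8.8"))    # external
```

The "internal-unknown" bucket is where the stress-tester-versus-attacker question from the text gets asked.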

Before we close, one more important thing: examine the sensitive and personal data flowing through the APIs for your privacy compliance. How far through the infrastructure do those sensitive payloads travel, and how are they protected? Is the masking that the product team is telling you about really effective? Following the API is one of your best ways to figure out what sort of consent is needed from users and where you need to go to delete data when users want to be forgotten. Also note that sensitive data is not only PII but also business-sensitive information that your customers don’t want their competitors to find out. A tool that gives you visibility into what flows through your API infrastructure can be supervised, or customized, to detect what is sensitive for your customers and therefore for your company.
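A toy version of that sensitive-data detection, assuming simple regex detectors (real tools would be tuned or trained per customer, and these three patterns are purely illustrative):

```python
import re

# Illustrative detectors; real tooling would be tuned/supervised per customer.
PATTERNS = {
    "email":       re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn":         re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_payload(payload: str) -> dict:
    """Return which sensitive-data categories appear in an API payload."""
    return {name: bool(rx.search(payload)) for name, rx in PATTERNS.items()}

hits = scan_payload('{"user": "jane@example.com", "note": "ref 12345"}')
print(hits)  # {'email': True, 'card_number': False, 'ssn': False}
```

Run against live traffic, a scanner like this answers the masking question directly: if the "masked" field still trips the email detector three hops deep into the infrastructure, the masking is not as effective as the product team says.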



By Madjid Nakhjiri
