SIEM and Web data: a match made in hell
It's a new year, and I hope everyone has made it into 2015 safe and sound. While I took a much-needed break, I also took some time to assess the log and SIEM landscape based on my last few posts. As Marc Andreessen famously said, "Software is eating the world", which from a log perspective appears to be true. As I noted in my last post, the log landscape is changing: we're ingesting fewer typical "enterprise" logs, while the volume of web-based logs is growing at an astounding rate.
Now let's just face it, we're all interacting less with hardware and more with web-based software. That's just how it is. We used to interface with hardware via some kind of console (remember those horrible Java or Flash based consoles? Ugh); today we're talking to web pages that represent multiple devices and enterprise services. We've started de-aggregating our data from the devices that generate it, and we now talk to web-based APIs to retrieve it. We're not using VPNs to connect to partner networks, we're setting up partner APIs that tie right to the data sources. How many of us use SaaS, PaaS or IaaS today, and how do you interface with them? And how does this then impact the security landscape around it?
Now think about the billions of web sites and APIs that already exist, and the fact that thousands more pop up every day. We all know that these apps and APIs aren't created securely, and the chances of developers suddenly programming everything securely are about as good as my chances of winning the lottery, then getting hit by lightning, twice, then falling into the water and being eaten by a shark. Ok, that might be a bit extreme.
Here is a very enlightening Twitter post I saw today from Jeremiah Grossman, the CEO and founder of WhiteHat Security, who has been waving this flag for years (disclaimer: We use WhiteHat Sentinel where I work):
So should we just lean on WAF and SIEM systems to hopefully protect those applications and our transactions over them? Are those technologies even working for us today? Well, according to a new SANS analytics survey (released: Dec 14) they aren't, but I think we all already knew that. Spending on WAF, SIEM and IDS/IPS is going down, fast, and we're still missing that security visibility into our applications (noted as one of the highest rated priorities in 2015). What's more, we want to become more proactive, hence the spike in spending on behavioral analysis type tools.
But again, none of these guys are solving the web problem; they are focused on the enterprise based issues: insider threats, data exfiltration, APTs and the like. They are doing a good job finding those rare and complex events that SIEM is incapable of finding, which does fill a huge gap. Meanwhile web fraud has turned into a multi-billion-dollar problem, and short of transaction level checks, we still don't have anything capable of finding these kinds of attacks. I saw some numbers the other day regarding EMV credit cards (smart chip credit cards) from Canada and the EU, which show that once crooks could no longer clone cards, they simply moved online to commit their fraud. So while card cloning dropped dramatically, online fraud rocketed upwards, which matches the difference I've been seeing between our European and US customers. And the US is switching to EMV cards this year.
So I started working with Splunk a while back, trying to see what kind of visibility I could get into our web data. The basics weren't too hard; they run along the same lines as a conventional device: X failed logins in Y minutes, multiple logins from different IPs at the same time, repeated invalid URL/URI requests over Y minutes, and other simple scenarios. That's right in SIEM's wheelhouse.
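To make the "X failed logins in Y minutes" scenario concrete, here is a minimal sketch of that kind of threshold rule in Python. It is not how Splunk or any particular SIEM implements it; the class name, the threshold of 5, the 10 minute window and the sample IP are all made up for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedLoginRule:
    """Hypothetical rule: fire when one source IP has `threshold`
    failed logins inside a sliding time `window`."""

    def __init__(self, threshold=5, window=timedelta(minutes=10)):
        self.threshold = threshold
        self.window = window
        # source IP -> timestamps of its recent failed logins
        self._failures = defaultdict(deque)

    def observe(self, ts, src_ip, success):
        """Feed one login event (in time order); return True if the rule fires."""
        if success:
            return False
        recent = self._failures[src_ip]
        recent.append(ts)
        # Drop failures that have aged out of the window.
        while ts - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold


rule = FailedLoginRule(threshold=5, window=timedelta(minutes=10))
start = datetime(2015, 1, 5, 9, 0)
# Six failures from one (documentation-range) IP, one minute apart.
burst = [
    rule.observe(start + timedelta(minutes=i), "203.0.113.7", False)
    for i in range(6)
]
```

In this sketch the rule starts firing on the fifth failure, because everything it needs is a count and a clock. That is the kind of event-level check a SIEM handles well.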
But when I looked at the real issues that I'm facing now, like the low, slow brute force attacks, the account takeover and misuse cases, and the fraudulent transactions we see every day, I couldn't do anything. Each "use case" is a use case in words only: in each instance the behavior itself will be different. How could I write a rule to find that? What would the transitions of that state machine need to look like? Sorry, you simply can't do it. You can look for specific events, but you can't look for specific behavior. Ask a police officer how they spot suspicious behavior, and they will tell you "I know it when I see it". SIEMs can't do that; they don't "know" anything.
So what makes these cases so difficult? Let's examine a few:
- First, the low, slow brute force attack. Say a bad guy is using several different IPs and attempting to log in with a handful of accounts a day, trying each account once or twice at most, spacing the attempts several minutes apart, and keeping this up over several days. No threshold is ever reached, and this isn't a static attack that can be found via a rule. This is similar to an APT, but since it's web based there aren't any current APT tools (that I'm aware of) that even look at this data.
- Next, let's take an account takeover or misuse. If someone logs in with a valid account, how do we tell whether the person who logged in and the actions they take are legitimate? Say for e-commerce, if someone logs in, changes their shipping address, then purchases and ships something, how can we tell if that's legit before we get the chargeback or a report of fraud? Again, if it's a legitimate login and a legitimate credit card, you can't detect it until after it's happened.
- Lastly, let's take a legitimate user who does something that's outside the ordinary. Say this user normally logs in, spends a few minutes browsing, then buys a couple of low priced items every few weeks. But now this user suddenly logs in and buys an expensive item without browsing for other items. What if the site offered a one day sale, where this expensive item was on sale and in high demand? The current tools wouldn't know this, and would flag it as fraud or even block the purchase at the transaction level. That's an equation for unhappy users. A WAF or SIEM is again worthless.
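The last case can be sketched with a naive per-user baseline. This is only an illustration of the problem, not a tool I actually run: the function name, the z-score cutoff of 3 and the purchase amounts are all invented.

```python
from statistics import mean, stdev


def is_unusual_purchase(history, amount, z_cutoff=3.0):
    """Hypothetical check: True if `amount` is more than `z_cutoff`
    standard deviations above this user's past purchase amounts."""
    if len(history) < 2:
        return False  # not enough history to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past purchases were identical; anything larger stands out.
        return amount > mu
    return (amount - mu) / sigma > z_cutoff


# Made-up history: a user who buys a couple of cheap items every few weeks.
past_purchases = [12.0, 18.5, 9.99, 15.0, 22.0, 11.5]

flag_usual = is_unusual_purchase(past_purchases, 16.0)   # a typical order
flag_sale = is_unusual_purchase(past_purchases, 499.0)   # the one day sale item
```

The baseline flags the 499.0 purchase and leaves the 16.0 one alone, but it has no idea whether the expensive order is a stolen account or a loyal customer grabbing a sale item. The amount alone carries no context, which is exactly why blocking on this kind of check leads to unhappy users.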
SIEMs, WAFs and IDS/IPSs just aren't made to handle these kinds of use cases. If someone tries a cross site scripting (XSS) attack, sure, a WAF will do the job and a SIEM will report on it. But according to Mandiant's yearly M-Trends report, 100% of targeted attacks used valid credentials, not an XSS attack or anything a WAF might catch.
So here I am, at the world's largest Cloud Service Brokerage (CSB), but unable to successfully detect these web based problems. Abuse our APIs? Until you break it, I can't see it. Take over a user account and purchase goods? Until we get the complaint we don't even know it was fraudulent. I'm tired of being purely reactive, and from talking with my peers at a recent CISO summit, they feel the same way.
But hey, I'm only human.
* All views expressed in this post are mine, and don't reflect the views of my employer, professional groups or any organizations I may be a member of.
** And no one proof reads my stuff before I post, so of course I have spelling and grammar errors :)
Erik Bloch writes for fun from San Francisco, California and Göteborg, Sweden, mostly about struggles he's had, or is trying to overcome, that others may face as well.
On twitter: @ejbloch
Where are you getting that spending on SIEM is going down? There's fairly substantial growth in this market based on all other research. And SIEMs are expanding beyond what people think of as traditional SIEMs. I agree on the trend that people are tired of being reactive. Remember those days of automatic blocking based on behavior and anomaly detection? They're back. CISOs are willing to take the risk of blocking a user, application, protocol, etc., which may disrupt some business, as opposed to taking the risk of getting breached.
Excellent post - and thanks for the linkage to the SANS doc, missed that 1 until now :)
Behavioral analysis on web data? We should probably get that coffee soon. :)