Log analysis with the Elastic Stack
A slow application can trigger a series of escalating calls through an IT organization, and whenever an application starts misbehaving we need to troubleshoot and fix it as soon as possible. Log files are normally among the first places we go when troubleshooting begins. A modern web application environment consists of multiple log sources, which collectively emit thousands of log lines of hard-to-read machine data. Take a LAMP setup: there are PHP, Apache, and MySQL logs to go through, plus framework logs and system logs, which together create an endless pile of machine data. All the information is there, scattered across files, but extracting anything useful from it means doing the dirty work with cat, grep, tail, and friends.
Now let's talk about ELK and see how it can solve this puzzle. According to Elastic, ELK is a great way to centralize logs from multiple sources, identify correlations, and perform deep data analysis. Elasticsearch is a search-and-analytics engine based on Apache Lucene that lets users search and analyze large amounts of data in near real time. Logstash can ingest logs from almost anywhere and forward them almost anywhere. Kibana is a dashboarding tool with a user interface that lets us query, visualize, and explore Elasticsearch data easily. I am not going to explain the installation process here; in the next article I will cover each component separately, from installation to implementation.
The next step after installation is to set up a log pipeline into Elasticsearch for indexing and analysis with Kibana. There are various ways of forwarding data into Elasticsearch, but I am going to use Logstash. Logstash configuration files live under /etc/logstash/conf.d and are written in Logstash's own configuration syntax (it looks JSON-like but is not JSON). A configuration consists of three sections: input, filter, and output. I am going to create a demo configuration file 'logs-apache.conf' for Apache logs, starting with the input section. The filter section is used to modify events before they reach the output section; for now I am skipping it to keep the topic simple. Next is the output section:
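As a minimal sketch, 'logs-apache.conf' with just an input and an output section could look like the following; the log path and index name are assumptions, so adjust them to your environment:

```
# Hypothetical logs-apache.conf (input and output only, no filter).
input {
  file {
    # Assumed Apache access log location on Ubuntu; change as needed.
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Index name chosen to match the _search URL used later.
    index => "logs_apache"
  }
}
```

The file input plugin tails the log like `tail -f`, so new lines are picked up as Apache writes them.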
We have created the Logstash configuration file with input and output sections, and now we need to start Logstash with the new configuration. I am using Ubuntu 17.04 here, so adjust the paths below to match the Logstash setup on your operating system:
/usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/logs-apache.conf
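Before starting Logstash for real, it can be worth validating the configuration first; the --config.test_and_exit flag parses the file and reports any errors without starting the pipeline:

```
/usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/conf.d/logs-apache.conf
```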
We can check the log data in Elasticsearch by querying the index that Logstash created:
http://localhost:9200/logs_apache/_search
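The _search endpoint returns JSON. As a rough sketch, the response shape looks like the sample below (this particular document is made up for illustration); each indexed log line sits under a hit's _source field:

```python
import json

# Illustrative sample of an Elasticsearch _search response body.
# A real response comes from http://localhost:9200/logs_apache/_search
sample_response = json.loads("""
{
  "hits": {
    "total": 1,
    "hits": [
      {
        "_index": "logs_apache",
        "_source": {
          "message": "127.0.0.1 - - [10/Oct/2017:13:55:36 +0000] \\"GET / HTTP/1.1\\" 200 3700",
          "path": "/var/log/apache2/access.log"
        }
      }
    ]
  }
}
""")

# Pull the raw Apache log line out of each hit.
for hit in sample_response["hits"]["hits"]:
    print(hit["_source"]["message"])
```

If the hits list is empty, Logstash has not shipped anything yet; check that the input path in the config actually matches your Apache log file.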
Now that we have the Apache logs in Elasticsearch, the next step is to display them in Kibana. I will show this process through a series of images.
First, configure an index pattern in Kibana by providing the name of the index, then select the time filter field name and click Create.
The next screen shows the index with its fields and their data types. (I will explain data types and other Elasticsearch details in my next article.)
The Discover tab of Kibana shows the Apache log data with search capabilities.
Now try searching for some keywords (as in the image below):
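To give a feel for the search bar, here are a few hypothetical Lucene query-string searches. Note that field-level queries such as response:404 only work once the filter section (for example a grok filter) has parsed the raw log line into fields, which we skipped above; until then, full-text searches on the message field are your main tool:

```
message:"GET /index.php"   # full-text search on the raw log line
response:404               # needs a grok filter to extract 'response'
response:[400 TO 599]      # range query: any client or server error
```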
Now that everything is in place, let's play: access some local websites to generate Apache access logs. Logstash is already tailing the log file, so these messages will be indexed into Elasticsearch and displayed in Kibana, where you can explore and analyze the data.
This is a very simple tutorial explaining how to configure the Elastic Stack and use it to monitor Apache logs. The Elastic Stack can be used with Beats to ship file, network, and system information; it can be connected to an existing application to monitor application performance and to build a great dashboard of key performance indicators; and it can be used as a standalone system by pushing in data from any RDBMS or file-based data source. We can not only display and search data but also perform analysis on top of it. In case of any confusion, do comment.