Month: June 2015

Health check and Analytics for enterprise application using InfluxDB and Grafana


Problem:

We need to monitor real-time performance metrics of an enterprise application, together with server stats, for the following parameters:

  1. Log the response time of each web API call.
  2. Log the time required to fetch data from external sources such as a cache, a database, or a web API.
  3. Log application-specific exceptions.
  4. Log server stats such as CPU, memory, disk, and network usage.
  5. Log the pages visited along with the search criteria.

For each stat, metadata such as the machine name and tenant-specific or consumer-specific information must also be provided.
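
As a rough sketch of items 1 and 2 combined with the metadata requirement, the Python snippet below times a block of code and builds a record that carries the machine name and a tenant tag. The names `timed_metric`, `series`, `sink`, and `tenant` are illustrative assumptions and do not come from any library mentioned in this post.

```python
import socket
import time
from contextlib import contextmanager


@contextmanager
def timed_metric(series, sink, **metadata):
    """Time the enclosed block and append one metric record to `sink`.

    `series`, `sink`, and the metadata keys are hypothetical names
    chosen for this sketch.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed calls are timed too.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        record = {
            "series": series,
            "time": int(time.time()),         # seconds since epoch
            "duration_ms": elapsed_ms,
            "machine": socket.gethostname(),  # machine-name metadata
        }
        record.update(metadata)               # e.g. tenant or consumer info
        sink.append(record)


if __name__ == "__main__":
    metrics = []
    with timed_metric("webapi.response_time", metrics, tenant="tenant-a"):
        time.sleep(0.05)  # stand-in for a web API call or cache lookup
    print(metrics[0])
```

Collecting records in a plain list keeps the timing logic separate from the transport, so the same record could later be handed to whatever client pushes data to the time series database.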

Solution:

One newer school of thought on this problem is to use InfluxDB, a time series database, to collect the log data, and Grafana to monitor the time series for each metric, instead of using StatsD with Graphite.

To push application performance metrics to InfluxDB, we developed the InfluxDB.UDP client, which sends the data to InfluxDB in the JSON format that InfluxDB supports. The InfluxDB UDP plug-in needs to be enabled to receive the UDP data.
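
To make the idea concrete, here is a minimal Python sketch of what such a UDP push might look like. It is not the InfluxDB.UDP library itself (which is a .NET client). The payload shape assumes the `name`/`columns`/`points` JSON layout used by InfluxDB 0.8, and the host, port 4444, and series name are placeholders that must match your own UDP input configuration.

```python
import json
import socket
import time


def build_payload(series, columns, values):
    """Serialize one point in the (assumed) InfluxDB 0.8 JSON layout."""
    if len(columns) != len(values):
        raise ValueError("columns and values must have the same length")
    body = [{"name": series, "columns": list(columns), "points": [list(values)]}]
    return json.dumps(body).encode("utf-8")


def send_metric(host, port, payload):
    """Fire-and-forget send: UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    payload = build_payload(
        "webapi.response_time",
        ["time", "duration_ms", "machine"],
        [int(time.time()), 42.5, socket.gethostname()],
    )
    send_metric("127.0.0.1", 4444, payload)  # placeholder host and port
```

Because UDP is connectionless, the application never blocks waiting for the database, at the cost of silently losing a metric if a datagram is dropped.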

To push server stats or performance counters such as CPU and network usage, we used CollectD on Linux servers and CollectM, a Node.js-based utility, on Windows servers.

The overall solution to this problem is as shown below:

InfluxDB supports aggregations such as min, max, mean, and count on time series data by creating time buckets (e.g. the count of a particular request per 30-minute window). It also supports a SQL-like query DSL.
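
The bucketing idea can be illustrated without a database. The Python sketch below counts events per 30-minute window, which is roughly what a query along the lines of `select count(duration_ms) from response_time group by time(30m)` would compute. That query text and the series name are assumptions for illustration, so check the query syntax of your InfluxDB version.

```python
from collections import Counter

BUCKET_SECONDS = 30 * 60  # 30-minute window


def count_per_bucket(timestamps, bucket_seconds=BUCKET_SECONDS):
    """Count events per time bucket.

    Each epoch timestamp (in seconds) is floored to the start of its
    bucket, then occurrences per bucket start are counted.
    """
    counts = Counter((ts // bucket_seconds) * bucket_seconds for ts in timestamps)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Three requests in the first half hour, one in the second.
    requests = [0, 600, 1799, 1800]
    print(count_per_bucket(requests))  # {0: 3, 1800: 1}
```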

Setup Guide InfluxDB, Grafana, CollectD, CollectM and InfluxDB.UDP

I have used Docker to install InfluxDB and Grafana on Ubuntu 14.04.

Prerequisite

  1. Ubuntu 14.04
  2. Install Docker 1.6

Step 1:

Install InfluxDB using its Docker image; instructions are provided here.

Step 2:

Install Grafana with Docker by running the following command:

sudo docker run -d -p 3000:3000 grafana/grafana

After this, open Grafana in a browser on port 3000 and configure InfluxDB as a data source as per the provided instructions.

Step 3:

Install CollectD on Linux to monitor server stats.

sudo apt-get update
sudo apt-get install collectd collectd-utils

Edit the CollectD configuration file '/etc/collectd/collectd.conf' so that CollectD sends its data to InfluxDB:


Hostname "your-host-name"

LoadPlugin network

<Plugin network>
  # InfluxDB client configuration
  Server "influxDb-ip-or-domain-name" "25826"
</Plugin>

and restart the collectd service:

sudo service collectd restart

Step 4:

Install CollectM on Windows and change its configuration to point to InfluxDB.

Open Installation Directory\CollectM\config\default.json and change the following settings:

….
"Interval": 10,

"Network": {
  "servers": [
    {
      "hostname": "influxDb-ip-or-domain-name",
      "port": 25826
    }
  ]
},

Step 5:

InfluxDB.UDP is a .NET library for pushing application performance metrics to InfluxDB. The instructions are available here.

Conclusion:

This setup is useful for DevOps teams to check application health and performance in real time in any environment, such as Dev, QA, Stage, or Production. As future work, alerts or notifications need to be created based on the available stats.