Health check and Analytics for enterprise application using InfluxDB and Grafana


Problem:

We need to monitor real-time performance metrics of an enterprise application, along with server stats, for the following parameters:

  1. Log the response time taken by each web API call.
  2. Log the time required to fetch data from external sources such as a cache, database, or web API.
  3. Log application-specific exceptions.
  4. Log server stats such as CPU, memory, disk, and network usage.
  5. Log the pages visited along with the search criteria.

For each stat, metadata also needs to be captured, such as the machine name and tenant- or consumer-specific info.

Solution:

A newer approach to this problem is to use the InfluxDB time-series database to collect the log data and Grafana to monitor the time series for each metric, instead of using StatsD with Graphite.

To push application performance metrics to InfluxDB, we developed an InfluxDB.UDP client, which sends the data to InfluxDB in the JSON format that InfluxDB supports. The InfluxDB UDP plug-in needs to be enabled to receive the UDP data.
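Our client itself is .NET, but purely to illustrate the mechanism, here is a minimal Java sketch of pushing one metric point over UDP; the port (4444) and the JSON payload shape are assumptions for illustration, so use whatever your InfluxDB version and UDP plug-in configuration actually expect:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class MetricUdpClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical point: one response-time sample with metadata tags.
        String json = "[{\"name\":\"api_response_time\","
                + "\"columns\":[\"value\",\"machine\",\"tenant\"],"
                + "\"points\":[[120,\"web-01\",\"tenant-a\"]]}]";
        byte[] payload = json.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            // Fire-and-forget: UDP keeps instrumentation overhead low,
            // at the cost of possible metric loss under heavy load.
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName("influxdb-host"), 4444));
        }
    }
}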

To push server stats or performance counters (CPU, network usage, etc.), we used collectd on Linux servers and CollectM, a Node.js-based utility, on Windows servers.

The overall solution to this problem is as shown below:

InfluxDB supports aggregations such as min, max, mean, and count on time-series data by creating time buckets (e.g., the count of a particular request per 30-minute window), and it exposes a SQL-like query DSL; see the example below.
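For instance, a query counting requests in 30-minute buckets might look like the following (measurement and field names are hypothetical):

SELECT COUNT(value) FROM api_requests WHERE time > now() - 1d GROUP BY time(30m)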

Setup Guide: InfluxDB, Grafana, collectd, CollectM and InfluxDB.UDP

I used Docker to install InfluxDB and Grafana on Ubuntu 14.04.

Prerequisites

  1. Ubuntu 14.04
  2. Install Docker 1.6

Step 1:

Install InfluxDB using Docker images; instructions are provided here.

Step 2:

Install Grafana by running the following Docker command:

sudo docker run -d -p 3000:3000 grafana/grafana

Then access Grafana in a browser on port 3000 and configure InfluxDB as a data source per the provided instructions.

Step 3:

Install CollectD on Linux to monitor server stats.

sudo apt-get update
sudo apt-get install collectd collectd-utils

Edit the collectd configuration file ‘/etc/collectd/collectd.conf’ so that collectd ships its metrics to InfluxDB:


Hostname "your-host-name"

LoadPlugin network

# influxdb client configuration
<Plugin network>
  Server "influxDb-ip-or-domain-name" "25826"
</Plugin>

and restart the collectd service:

sudo service collectd restart

Step 4:

Install CollectM on Windows and point its configuration at InfluxDB.

Open Installation Directory\CollectM\config\default.json and change the following:

...
"Interval": 10,

"Network": {
  "servers": [
    {
      "hostname": "influxDb-ip-or-domain-name",
      "port": 25826
    }
  ]
},

Step 5:

InfluxDB.UDP is a .NET library for pushing application performance metrics to InfluxDB. The instructions are available here.

Conclusion:

This setup is useful for DevOps to check application health and performance in real time in any environment, whether Dev, QA, Stage, or Production. A future enhancement would be alerting or notification based on the available stats.

Custom query DSL for NoSQL


Problem:  

We need a custom query DSL for domain objects, so that users can write various queries in a SQL-like format against a NoSQL database.

e.g. Select * from User where firstName = "vinayak" and age > 18 order by id limit 10.

The result should be the same irrespective of the underlying storage, whether Elasticsearch, MongoDB, or MySQL.

Solution:

The solution is to create a custom query language with a SQL-like syntax, or grammar. A query written in this grammar is then translated into the query language of the target data source, such as the Elasticsearch query DSL. The tool evaluated for this purpose is ANTLR, which is widely used for designing custom domain-specific languages. I highly appreciate the work done by Terence Parr in this area.

Language translation for a target data source using ANTLR works in three steps.

Step I:

ANTLR provides a Java-based tool that parses the grammar and generates code for a target runtime such as C#, Java, or JavaScript. The grammar format is an extended Backus-Naur Form; a small grammar sketch is shown below.
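For illustration, a minimal, hypothetical grammar (say, Query.g4; the rule and token names are my own) covering the User query example above could look like this, with code generated by a command such as java -jar antlr-4.x-complete.jar -Dlanguage=Java Query.g4:

grammar Query;

query     : 'select' '*' 'from' ID ('where' condition)? ;
condition : ID OP value ('and' ID OP value)* ;
value     : STRING | INT ;

OP        : '=' | '>' | '<' ;
ID        : [a-zA-Z]+ ;
INT       : [0-9]+ ;
STRING    : '"' .*? '"' ;
WS        : [ \t\r\n]+ -> skip ;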

Step II:

The code generated in Step I for the target runtime is used to parse custom queries written in the grammar. The output of Step II is a parse tree (or abstract syntax tree).

Step III:

To translate the parse tree into the desired output, i.e. the target data source query, ANTLR also provides parse tree traversal using depth-first search. Step I generates a listener interface and an empty base implementation of that interface by default, and can optionally generate a visitor for the parse tree as well. The listener methods are called back during tree traversal, and in these callbacks you perform whatever actions the translation needs; a sketch follows.
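To make Step III concrete, here is a minimal Java sketch, assuming a recent ANTLR 4 runtime and the classes generated from the hypothetical Query.g4 grammar above (QueryLexer, QueryParser, QueryBaseListener). It only collects conditions; a real translator would map '=' to an Elasticsearch term query and '>' to a range query:

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class QueryTranslator {

    // Listener callbacks fire during the depth-first walk of the parse tree.
    static class ConditionListener extends QueryBaseListener {
        final StringBuilder out = new StringBuilder();

        @Override
        public void enterCondition(QueryParser.ConditionContext ctx) {
            // Collect each field/operator/value triple from the where clause;
            // this is where the target query DSL would be built.
            for (int i = 0; i < ctx.value().size(); i++) {
                out.append(ctx.ID(i).getText()).append(' ')
                   .append(ctx.OP(i).getText()).append(' ')
                   .append(ctx.value(i).getText()).append('\n');
            }
        }
    }

    public static void main(String[] args) {
        QueryLexer lexer = new QueryLexer(
                CharStreams.fromString("select * from User where age > 18"));
        QueryParser parser = new QueryParser(new CommonTokenStream(lexer));
        ParseTree tree = parser.query();               // Step II: parse tree
        ConditionListener listener = new ConditionListener();
        ParseTreeWalker.DEFAULT.walk(listener, tree);  // Step III: DFS traversal
        System.out.println(listener.out);              // prints: age > 18
    }
}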

This tool helps you design your own custom language and its translation, as per your needs, in a very simple way. For more information, explore the documentation and The Definitive ANTLR 4 Reference book.

References:

1. http://www.antlr.org/
2. https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference
3. http://java.dzone.com/articles/creating-external-dsls-using
4. https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Getting+Started+with+ANTLR+v4

Distributed Transaction with NoSQL using 2-Phase Commit


Problem:

Distributed transactions need to be supported across MongoDB and Elasticsearch from an application. Neither MongoDB nor Elasticsearch supports distributed transactions natively. Here I explain how this was solved using the well-known two-phase commit (2PC) protocol.

Solution:

The basic components/actors involved in the two-phase commit protocol for MongoDB and Elasticsearch are shown below:

[Diagram: two-phase commit components/actors]

Resource managers for MongoDB and Elasticsearch register with a transaction coordinator to participate in two-phase commit. During phase I the transaction coordinator sends a prepare notification to each registered resource manager, and each resource manager responds with either ready-to-commit or rollback. Also during phase I, a transaction log is created in Redis; if an update or delete operation needs to be performed, the initial state of the data is logged as well, and a lock is applied to the transactional entity, in Redis only. The actual changes are applied to the resource in phase I.

If all participating resource managers voted success in phase I, the transaction coordinator sends a commit notification to all participants in phase II. If any one of them failed in phase I, it sends a rollback notification to the remaining participating resource managers. In phase II each participating resource manager updates the transaction log and releases the lock on the transactional entity.

If any participant fails to commit or roll back in phase II, the transaction is logged as an in-doubt transaction, and a watchdog eventually rolls back the corresponding changes on the participating resource managers.

If the transaction coordinator itself fails during or after phase I, or during phase II, the watchdog rolls back the transaction after the transaction timeout and releases the lock. In this case the data remains in an inconsistent state until the watchdog performs recovery. A sketch of the coordinator loop is shown below.
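To make the flow concrete, here is a minimal Java sketch of the coordinator loop under the design above; the ResourceManager interface and method names are hypothetical illustrations, not the POC's actual API:

import java.util.List;

// Hypothetical participant contract: prepare applies the change and votes;
// commit/rollback finalize the transaction log and release the Redis lock.
interface ResourceManager {
    boolean prepare(String txId);  // phase I: apply change, vote ready/failed
    void commit(String txId);      // phase II: mark committed, release lock
    void rollback(String txId);    // phase II: restore initial state, release lock
}

class TransactionCoordinator {
    void run(String txId, List<ResourceManager> participants) {
        boolean allReady = true;
        for (ResourceManager rm : participants) {
            // Phase I: a participant that fails here rolls itself back.
            if (!rm.prepare(txId)) { allReady = false; break; }
        }
        for (ResourceManager rm : participants) {
            try {
                // Phase II: commit only if everyone voted ready.
                if (allReady) rm.commit(txId); else rm.rollback(txId);
            } catch (RuntimeException e) {
                // Log as in-doubt; the watchdog retries the rollback and
                // releases the lock after the transaction timeout.
                markInDoubt(txId);
            }
        }
    }

    void markInDoubt(String txId) { /* write in-doubt record to Redis */ }
}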

Two-phase commit failure scenarios and their remedies

Scenario 1: MongoDB fails in phase I (data is consistent)

TransactionLog ID | Distributed Tx ID | Participant | State               | Phase
t1                | dt1               | Mongo       | Failed & rollback   | I
t2                | dt1               | ES          | Ready To Commit     | I
t1                | dt1               | Mongo       | Rollback not called | II
t2                | dt1               | ES          | Rollback            | II

Scenario 2: Elasticsearch fails in phase I (data is consistent)

TransactionLog ID | Distributed Tx ID | Participant | State               | Phase
t1                | dt1               | Mongo       | Ready To Commit     | I
t2                | dt1               | ES          | Failed & rollback   | I
t1                | dt1               | Mongo       | Rollback            | II
t2                | dt1               | ES          | Rollback not called | II

Scenario 3: MongoDB fails in phase II commit (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Commit Failed   | II
t2                | dt1               | ES          | Committed       | II
t3                | dt1               | TC          | InDoubt         | End of phase II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 4: Elasticsearch fails in phase II commit (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Committed       | II
t2                | dt1               | ES          | Commit Failed   | II
t3                | dt1               | TC          | InDoubt         | End of phase II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 5: MongoDB fails in phase II rollback (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Rollback Failed | II
t2                | dt1               | ES          | Rollback        | II
t3                | dt1               | TC          | InDoubt         | End of phase II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 6: Elasticsearch fails in phase II rollback (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Rollback        | II
t2                | dt1               | ES          | Rollback Failed | II
t3                | dt1               | TC          | InDoubt         | End of phase II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 7: Transaction coordinator fails before phase I (data is consistent)

TransactionLog ID | Distributed Tx ID | Participant | State   | Phase
t1                | dt1               | Mongo       | Ready   | 0
t2                | dt1               | ES          | Ready   | 0
t1                | dt1               | Mongo       | Waiting | I
t2                | dt1               | ES          | Waiting | I

Watchdog action: no action needed; just delete the transaction.

Scenario 8: Transaction coordinator fails after phase I (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Waiting         | II
t2                | dt1               | ES          | Waiting         | II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 9: Transaction coordinator fails during phase II (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State           | Phase
t1                | dt1               | Mongo       | Ready To Commit | I
t2                | dt1               | ES          | Ready To Commit | I
t1                | dt1               | Mongo       | Committed       | II
t2                | dt1               | ES          | Waiting         | II

Watchdog action: retry the transaction rollback and release the lock.

Scenario 10: Transaction coordinator fails after phase I with a participant failure (data is inconsistent)

TransactionLog ID | Distributed Tx ID | Participant | State             | Phase
t1                | dt1               | Mongo       | Failed & rollback | I
t2                | dt1               | ES          | Ready To Commit   | I
t1                | dt1               | Mongo       | Waiting           | II
t2                | dt1               | ES          | Waiting           | II

Watchdog action: retry the transaction rollback and release the lock.

Two-phase commit has been supported by the .NET Framework since version 2.0 via the System.Transactions namespace. It is extensible to any other resource, such as MongoDB, SQL, or a message queue, through the IEnlistmentNotification interface.

The source code for the POC is available here.

Behavior Driven Acceptance Testing for Web API


Problem

A modern enterprise application based on SOA has multiple web APIs. The service provider needs to run functional and integration tests for each web API against the given scenarios and provide documentation to service consumers. This takes integration pain away from the service consumer and also ensures functional automation of the web API. The problem is: how do we automate functional testing of web APIs in continuous integration?

Solution

In Agile software development, Behavior-Driven Development (BDD) is the usual way to solve this problem, but it requires developer involvement and has to be followed from the beginning of development.

But suppose you are using TDD instead of BDD, and a separate testing team is responsible for making releases of your web API. Then the following approach comes to the rescue:

The goal is to do functional and integration testing for the web API, as well as to generate reports/documentation for stakeholders such as business analysts, product owners, and service consumers, all within continuous integration.

I decided to use Cucumber for its Gherkin support, SoapUI for the functional and integration testing of the web API, and JUnit as the bridge between Cucumber and SoapUI. Jenkins drives continuous integration via Maven, and Cucumber reports are visualized using the Cucumber test result plugin. The workflow is as below:

[Diagram: BDAT workflow]

To give an example, I want to test a scenario where a user is authenticated with his credentials and, after successful authentication, receives an access token. Using this access token, another service provides the user profile.

To support this, my IdentityManagement.feature file is as below:

  Feature: Identity Management
    In order to manage user identity
    As a service consumer
    I want to authenticate a user and view his profile

    @soapui
    Scenario Outline: Authenticate user and get user profile
      Given I have a <username> and <password>
      When I call authentication service
      Then I should get an Access Token
      And I call user service to fetch the user profile

      Examples: Successful Parameters
        | username | password    |
        | vinayak  | secretAgent |
        | john     | secretAgent |

From this scenario Cucumber generates the JUnit test cases; the Cucumber and JUnit integration is provided by cucumber-jvm. From these unit tests, the SoapUI test cases/test steps are executed; SoapUI provides nice documentation for this. A sketch of the glue code follows.
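As an illustration, the glue might look like the following Java sketch: a JUnit runner that picks up the @soapui-tagged scenarios, plus one step definition that drives the SoapUI test case programmatically. The project file, suite, and test case names are assumptions matching the feature above, and cucumber-jvm annotation details vary slightly by version:

import com.eviware.soapui.impl.wsdl.WsdlProject;
import com.eviware.soapui.model.testsuite.TestCase;
import com.eviware.soapui.model.testsuite.TestRunner;
import com.eviware.soapui.support.types.StringToObjectMap;
import cucumber.api.CucumberOptions;
import cucumber.api.java.en.When;
import cucumber.api.junit.Cucumber;
import org.junit.runner.RunWith;

// JUnit entry point: Cucumber scans the feature files and runs every
// scenario tagged @soapui through the step definitions below.
@RunWith(Cucumber.class)
@CucumberOptions(features = "classpath:features", tags = {"@soapui"})
public class RunApiAcceptanceTest { }

class IdentityManagementSteps {
    @When("^I call authentication service$")
    public void callAuthenticationService() throws Exception {
        // Load the SoapUI project and run the matching test case.
        WsdlProject project =
                new WsdlProject("IdentityManagement-soapui-project.xml");
        TestCase testCase = project
                .getTestSuiteByName("Identity Management")
                .getTestCaseByName("Authenticate user and get user profile");
        TestRunner runner = testCase.run(new StringToObjectMap(), false);
        if (runner.getStatus() != TestRunner.Status.FINISHED) {
            throw new AssertionError(runner.getReason());
        }
    }
}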

In brief, here is the SoapUI project structure and how it relates to the Cucumber feature. The following image is self-explanatory.

SoapUI Project Structure

Cucumber terminology maps to SoapUI building blocks as below:

Cucumber   | SoapUI
Feature    | Test Suite
Scenario   | Test Case
Given      | Test Step
When       | Test Step
Then       | Test Step

The source code for the skeleton project is here.

Happy automation testing for Web API 🙂

Command query responsibility segregation pattern with MongoDB, Redis and ElasticSearch


Problem:

In a large-scale, data-centric enterprise application the read/write ratio is very high. We therefore want fast read access against the enterprise database across many different search criteria. This is a common requirement for any large-scale enterprise application.

Solution:

For most enterprise applications, data storage and retrieval is fundamental. The Command Query Responsibility Segregation (CQRS) pattern is well known for addressing this kind of requirement, for multiple reasons.

Here the pattern is implemented using MongoDB as primary storage, with Elasticsearch and Redis as secondary storage. The following solution diagram represents this.

Every create, update, or delete action for a domain object goes through the following steps in the Command Service (see the sketch after this list):

1. The create, update, or delete of the domain object happens on MongoDB as the primary storage.
2. The Redis cache entry is invalidated based on the primary key of the domain object.
3. The Elasticsearch index entry for the domain object is created, updated, or deleted.
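A minimal Java sketch of the command side, assuming the MongoDB Java driver, Jedis for Redis, and a thin wrapper around the Elasticsearch client (EsIndexer and all names here are hypothetical illustrations):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;
import redis.clients.jedis.Jedis;

public class UserCommandService {
    private final MongoCollection<Document> users; // primary storage
    private final Jedis redis;                     // cache
    private final EsIndexer esIndexer;             // hypothetical ES wrapper

    public UserCommandService(MongoCollection<Document> users, Jedis redis,
                              EsIndexer esIndexer) {
        this.users = users;
        this.redis = redis;
        this.esIndexer = esIndexer;
    }

    public void save(String id, Document user) {
        // 1. Write to MongoDB as the primary storage (upsert).
        users.replaceOne(Filters.eq("_id", id), user,
                new ReplaceOptions().upsert(true));
        // 2. Invalidate the cache entry keyed by the primary key.
        redis.del("user:" + id);
        // 3. Update the Elasticsearch index used for search-by-criteria.
        esIndexer.index("users", id, user.toJson());
    }
}

// Hypothetical interface hiding Elasticsearch client version details.
interface EsIndexer {
    void index(String index, String id, String json);
}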

Every query or read for a domain object with search criteria goes through the following stages in the Query Service (a sketch follows the list):

1. Get the primary keys from the Elasticsearch index based on the search criteria, unless the search is by primary key itself.
2. Using the primary key from the previous step, try to get the domain object from the Redis cache.
3. If the domain object is not in the cache, fetch it from MongoDB. (Retrieval by primary key is always efficient in MongoDB or any SQL provider.)
4. Put the domain object into the Redis cache and return the result.
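And the matching query-side sketch in Java, under the same assumptions (EsSearcher, key names, and the TTL are hypothetical):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import redis.clients.jedis.Jedis;

public class UserQueryService {
    private final MongoCollection<Document> users;
    private final Jedis redis;
    private final EsSearcher esSearcher; // hypothetical ES wrapper

    public UserQueryService(MongoCollection<Document> users, Jedis redis,
                            EsSearcher esSearcher) {
        this.users = users;
        this.redis = redis;
        this.esSearcher = esSearcher;
    }

    public Document findOne(String criteriaJson) {
        // 1. Resolve the search criteria to a primary key via Elasticsearch.
        String id = esSearcher.findId("users", criteriaJson);
        // 2. Try the Redis cache first.
        String cached = redis.get("user:" + id);
        if (cached != null) return Document.parse(cached);
        // 3. Cache miss: load by primary key from MongoDB.
        Document user = users.find(Filters.eq("_id", id)).first();
        // 4. Refresh the cache, then return the result.
        if (user != null) redis.setex("user:" + id, 3600, user.toJson());
        return user;
    }
}

// Hypothetical interface returning the first matching document id.
interface EsSearcher {
    String findId(String index, String criteriaJson);
}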

Using this pattern you can achieve better response times for all your read queries, at the cost of some write overhead, and you won't be surprised into changing your design after performance testing.

Scalable load testing for SOAP and REST services on the cloud



For every web application or web service, an application developer needs to do performance/load testing before going into production, so you must have a load testing framework in your toolbox, and ideally a cost-effective one. Here I explore the capabilities of open-source SoapUI, along with Elasticsearch, MySQL, Jenkins, and Groovy, to load test SOAP/REST services on the AWS cloud.

Problem

Perform load testing for SOAP and REST services in order to:

1. Get the average response time and tests per second of the web service under a given load.
2. Establish the maximum capacity (requests per second) of the web server on the current hardware infrastructure.
3. Get the CPU, memory, disk I/O, and network usage of the web server under load.
4. Generate a load test report for analysis.

Solution

JMeter and SoapUI are two popular, freely available open-source load testing frameworks. JMeter is really good for load testing for multiple reasons, and we used it for load testing our ASP.NET MVC 4 applications. But for SOAP service testing you need to generate the SOAP XML requests, and JMeter does not give the kind of flexibility required. Visual Studio Ultimate edition also provides a load testing framework based on test cases, but we needed a cost-effective solution.

SoapUI is our choice for multiple reasons. It generates requests and responses from the data contract provided through WSDL or WADL. SoapUI also has built-in Groovy scripting, which helps to manipulate requests and responses as well as to load and save test data from/to external sources.

SoapUI has a nice GUI for creating test suites, test cases, and test steps. It also supports property-transfer test steps, which parse the response of a previous request and set properties at the project, test suite, or test case level. Properties are key-value pairs that can be accessed inside a request using an expression language, e.g. ${#Project#PropertyName}. Here you can learn more about SoapUI.

You can also use a Groovy script test step to parse the response, apply use-case logic, set properties for the next request, or store responses in external sources for further processing.

When we run a load test we are interested in numbers like average response time, tests per second, and request counts, depending on the load test strategy. The SoapUI community edition provides only basic CSV file reporting, so we integrated Elasticsearch with SoapUI for load test result reporting. The distributed SoapUI load test results are dumped into Elasticsearch, and the reports are visualized using the Kibana web UI; a sketch of the indexing call is below.
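As a sketch, a result document can be pushed to Elasticsearch with a plain HTTP POST; the host, index/type names, and field values below are illustrative assumptions:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LoadTestResultPublisher {
    public static void main(String[] args) throws Exception {
        // One illustrative result document per executed test case.
        String doc = "{\"testCase\":\"AuthenticateUser\","
                + "\"avgResponseTimeMs\":120,\"testsPerSecond\":42,"
                + "\"timestamp\":\"2015-06-01T10:00:00Z\"}";
        URL url = new URL("http://es-master:9200/loadtest/result");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(doc.getBytes(StandardCharsets.UTF_8));
        }
        // 201 Created indicates the document was indexed.
        System.out.println("HTTP " + conn.getResponseCode());
    }
}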

When the system is under load we also need performance metrics like CPU, memory, disk I/O, and network utilization. We gather these using the JMeter PerfMon Server Agent and a custom TCP client for it; this TCP PerfMon client also inserts its results into Elasticsearch. That makes it possible to analyze the load test stats alongside the performance metrics.

The above setup works well from a single instance, but load testing needs to scale to multiple instances to generate more load on the server. In that case the current setup becomes the master node, hosting Elasticsearch, MySQL, the PerfMon TCP client, the SoapUI load test runner, and Jenkins for managing master/slave nodes. The Jenkins master distributes the SoapUI load test runner job to SSH-connected slave machines in the network. Each slave node runs the SoapUILoadTestRunner job and publishes its load test results to the Elasticsearch instance hosted on the master node. Scalability is achieved by adding any number of slave nodes in the cloud; these slave nodes are Ubuntu-based EC2 instances.

This is a cost-effective load testing solution using open-source SoapUI, as against paying the large license costs of SoapUI Pro and LoadUI Pro, and it gets us more or less the same result. By using this solution we saved at least USD 10,000.

Happy Load testing  🙂

OpenID Connect 1.0 with OAuth 2.0 for managing authentication, authorization, and single sign-on in service-oriented or distributed enterprise applications


Every service-oriented or distributed enterprise application needs authentication, authorization, and single sign-on (SSO) to manage its users. In the era of Service-Oriented Architecture this need was fulfilled by WS-Federation, WS-Trust, and SAML, along with the WS-* specifications. In some cases a custom framework was designed for it, and those can continue as they are; there is no harm in doing so. But things have changed in the last few years with the wide adoption of the internet, smartphones, and social networking sites like Facebook and Google Plus.

Problem Definition: The demand of the time is to let users from social networking sites register with the enterprise application; in most cases users already have an account on some social networking site, and the application pulls the existing user profile. We need a common protocol to manage user authentication, authorization, and SSO across different types of applications, such as server-side web applications, native mobile applications, browser-based client-side applications, or desktop applications, in any technology. We also need to protect access to publicly exposed web services.

Solution: This solution uses terms from the OAuth 2.0 and OpenID Connect 1.0 specifications. If you are not familiar with them, please go through them once.

To begin with the scenario, let me introduce the players/actors involved using the following picture:

[Diagram: SSO actors]

 

Here the relying party is the web application the user is interested in. The relying party provides some set of services to users, like Gmail, WordPress, or any other web application. It depends on another web application for authentication, known as the identity provider (IDP). The responsibility of the IDP is to authenticate the user against an underlying authentication provider such as a database, Active Directory, or another source like Facebook. Once the user is authenticated by the IDP, the user is redirected back to the relying party. The user accesses the web application through a user agent (web browser). So in this scenario we have the user as resource owner, the web browser as user agent, the relying party, and the identity provider.

Now we can explore the OpenID Connect 1.0 basic client flow. Here the identity server acts as the OpenID Connect 1.0 and OAuth 2.0 provider, aka the identity provider (IDP). The relying party is the web application acting as the OpenID Connect 1.0 / OAuth 2.0 client. The relying party can access service endpoints, such as the Google API or Facebook Graph API, which are secured by OAuth 2.0 access tokens, aka the resource server. The authentication and authorization code flow between relying party, IDP, and resource server is shown below.

[Diagram: OpenID Connect basic client code flow]
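For reference, here is a hedged sketch of the messages in the authorization code flow; the endpoint paths (IdentityServer-style /connect/*), hosts, and parameter values are illustrative placeholders:

1. User agent -> IDP authorize endpoint:
   GET /connect/authorize?response_type=code&scope=openid+profile
       &client_id=relying-party&redirect_uri=https://rp.example/callback
       &state=xyz

2. IDP -> user agent (after login and consent):
   302 Location: https://rp.example/callback?code=AUTH_CODE&state=xyz

3. Relying party -> IDP token endpoint (back channel, with client id/secret):
   POST /connect/token
   grant_type=authorization_code&code=AUTH_CODE
       &redirect_uri=https://rp.example/callback

4. IDP -> relying party: { "id_token": "...", "access_token": "...", ... }

5. Relying party -> resource server: Authorization: Bearer ACCESS_TOKEN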

Single Sign On

Single sign-on is achieved through a browser HTTP-only, secure session cookie. Once a relying party's user is authenticated by the IDP, a secure HTTP session cookie is added to the response. When the user tries to access another relying party backed by the same identity provider, the IDP-authenticated session can be resumed from that earlier secure session cookie.

Single sign-out across relying parties is achieved via the session management profile of the OpenID Connect 1.0 specification. In this case the IDP, as OpenID Provider, exposes two more endpoints: a check-session iframe endpoint for checking that the session is still valid, and an end-session endpoint for initiating logout from relying parties.

 

References:

1. http://openid.net/specs/openid-connect-session-1_0.html
2. http://tools.ietf.org/html/rfc6749
3. http://openid.net/connect/
4. http://vimeo.com/70556512
5. https://www.youtube.com/watch?v=hEewiXlynyc
6. Thinktecture.IdentityServer.v3
7. MITREid Connect