Docker

How to set up a MongoDB sharded cluster using Docker with SkyDNS


Problem: Setting up a MongoDB sharded cluster in a development environment requires a replica set, a config server, and a Mongo router, all on a single machine.

Solution: One way to set up a MongoDB sharded cluster on a single machine is Docker. The MongoDB containers all run on one host but stay isolated from each other. The MongoDB sharded cluster using Docker containers is shown below:

Prerequisite

  1. Ubuntu server 14.04
  2. Install Docker host

Install Skydock and skydns for docker service discovery

Step 1: Check the IP address of docker0, which is the Docker networking gateway.

 ifconfig docker0

If the docker0 bridge IP address is 172.17.0.1, go ahead; otherwise use whichever address is assigned in the steps that follow.
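If you would rather not read the address off the output by eye, a small helper can extract it. This is a minimal sketch, assuming the iproute2 ip command is installed; the function name bridge_ip is only illustrative and is not part of Docker.

```shell
# Print the first IPv4 address (without the /prefix) found in
# `ip -4 addr show` output read from stdin.
bridge_ip() {
  awk '$1 == "inet" { split($2, parts, "/"); print parts[1]; exit }'
}

# Only query the bridge when it actually exists on this host.
if command -v ip >/dev/null 2>&1 && ip -4 addr show docker0 >/dev/null 2>&1; then
  ip -4 addr show docker0 | bridge_ip
fi
```

The printed address is the one to use for --bip, --dns and the skydns port binding in the following steps.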

Step 3: Edit /etc/default/docker

DOCKER_OPTS="--bip=172.17.0.1/16 --dns=172.17.0.1 --dns 8.8.8.8 --dns 8.8.4.4"

Step 4: Restart docker service

    sudo service docker restart

Step 5: Start the skydns container to manage the docker service discovery with skydock

docker run  -d -p 172.17.0.1:53:53/udp --name skydns crosbymichael/skydns -nameserver 8.8.8.8:53 -domain docker

docker run  -d -v /var/run/docker.sock:/docker.sock --name skydock crosbymichael/skydock -ttl 30 -environment dev -s /docker.sock -domain docker -name skydns

MongoDB sharded cluster on docker

Set up the replica set:

Start the first member of replica set rs1 (this will be the primary):

docker run  --name rs1-srv1 -d mongo mongod --storageEngine wiredTiger --replSet rs1

Start the second member of replica set rs1 (the secondary):

docker run  --name rs1-srv2 -d mongo mongod --storageEngine wiredTiger --replSet rs1

Start the arbiter for replica set rs1:

docker run  --name rs1-arb -d mongo mongod --storageEngine wiredTiger --replSet rs1

Each Docker container is registered in skydns under a domain name of the form

container-name.image-name.environment.domain-name

The default value for the environment is dev and the default domain name is docker.

For example, the domain name for the primary is rs1-srv1.mongo.dev.docker, where the container name is rs1-srv1 and the image name is mongo.
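The naming rule above can be written down as a tiny helper. This is only a sketch to make the pattern concrete; the function name skydns_name is hypothetical, and the fallback values are the dev and docker defaults mentioned above.

```shell
# Build the skydns name: container.image.environment.domain
# The last two arguments fall back to the defaults used in this post.
skydns_name() {
  container="$1"
  image="$2"
  environment="${3:-dev}"
  domain="${4:-docker}"
  printf '%s.%s.%s.%s\n' "$container" "$image" "$environment" "$domain"
}

skydns_name rs1-srv1 mongo   # prints rs1-srv1.mongo.dev.docker
```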

Connect to the primary to initiate the replica set using the MongoDB shell client:

docker run -i -t mongo mongo --host rs1-srv1.mongo.dev.docker

Then initiate the replica set and add the secondary node and the arbiter to it.

config = { _id: "rs1", members:[{ _id : 0, host : "rs1-srv1.mongo.dev.docker:27017" }]};

rs.initiate(config);

rs.add("rs1-srv2.mongo.dev.docker:27017");
rs.addArb("rs1-arb.mongo.dev.docker:27017");
rs.status();
exit
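The same setup can be scripted instead of typed into an interactive shell. The sketch below is a variation on the steps above, not the exact same commands: it lists all three members in a single rs.initiate call and marks the arbiter with arbiterOnly, which avoids having to wait for the first node to become primary before adding the others. The helper name rs_init_js is hypothetical, and it only prints the JavaScript.

```shell
# Emit one mongo shell statement that initiates a replica set with a
# primary candidate, a secondary, and an arbiter.
# Arguments: replica set name, first host, second host, arbiter host.
rs_init_js() {
  printf 'rs.initiate({ _id: "%s", members: [' "$1"
  printf '{ _id: 0, host: "%s" }, ' "$2"
  printf '{ _id: 1, host: "%s" }, ' "$3"
  printf '{ _id: 2, host: "%s", arbiterOnly: true }' "$4"
  printf '] });\n'
}

rs_init_js rs1 \
  rs1-srv1.mongo.dev.docker:27017 \
  rs1-srv2.mongo.dev.docker:27017 \
  rs1-arb.mongo.dev.docker:27017
```

Assuming the legacy mongo client used in this post, the printed statement could be passed to it with the --eval option instead of being pasted by hand.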

Start the MongoDB config server:

docker run   --name cfg1 -d mongo mongod --configsvr --port 27017 --dbpath /data/db

Start the MongoDB router:

docker run  -p 27017:27017 --name mongo-router -d mongo mongos --configdb cfg1.mongo.dev.docker:27017

Connect to the MongoDB router using the MongoDB shell client to enable sharding:

docker run -i -t mongo mongo --host mongo-router.mongo.dev.docker

Add the shard (replica set rs1) to the config database from the MongoDB router:

sh.addShard("rs1/rs1-srv1.mongo.dev.docker:27017");
sh.status();

The MongoDB sharded cluster is now ready. Next, enable sharding for a database on this cluster.
As an example, sharding is enabled here on my_test_db. To balance data distribution across the cluster, use a hashed shard key, which requires a hashed index.

use my_test_db
sh.enableSharding("my_test_db")

db.my_collection.createIndex( { _id : "hashed" } )

sh.shardCollection("my_test_db.my_collection", { "_id": "hashed" } )
exit
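To get a feel for what hashing a key does to placement, here is a toy illustration in shell. It uses the POSIX cksum CRC as a stand-in and is not MongoDB's actual hash function; the bucket count of 3 and the name bucket_for are arbitrary assumptions. It only shows that the bucket is derived from the hash of the key rather than from the key's order.

```shell
# Map a key to a bucket number in the range 0..(buckets-1) by hashing it.
# cksum stands in for a real hash function here.
bucket_for() {
  key="$1"
  buckets="${2:-3}"
  printf '%s' "$key" | cksum | awk -v n="$buckets" '{ print $1 % n }'
}

# Show where ten consecutive ids would go.
for id in 1 2 3 4 5 6 7 8 9 10; do
  echo "id=$id bucket=$(bucket_for "$id")"
done
```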

Start the Docker containers on reboot using Upstart

sudo vi /etc/init/docker-mongo-cluster.conf

then paste the following content:

    description "Docker container"
    author "Vinayak Bhadage"
    start on filesystem and started docker
    stop on runlevel [!2345]
    respawn
    script
        /usr/bin/docker start skydns
        /bin/sleep 10s
        /usr/bin/docker start skydock
        /bin/sleep 10s
        /usr/bin/docker start rs1-srv1
        /bin/sleep 10s
        /usr/bin/docker start rs1-srv2
        /bin/sleep 10s
        /usr/bin/docker start rs1-arb
        /bin/sleep 10s
        /usr/bin/docker start cfg1
        /bin/sleep 10s
        /usr/bin/docker start -a mongo-router
    end script

If your containers have different names, change the container names (for example “rs1-srv1”) in the script accordingly, then start the service:

sudo service docker-mongo-cluster start
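Before relying on the Upstart job, it can help to rehearse the start order by hand. The following is a minimal sketch, not a replacement for the job: the names CONTAINERS, DOCKER, DELAY and start_cluster are illustrative, and unlike the job it does not attach to mongo-router at the end.

```shell
# Names in start order: DNS first, then the replica set members, the
# config server, and finally the router.
CONTAINERS="skydns skydock rs1-srv1 rs1-srv2 rs1-arb cfg1 mongo-router"

# DOCKER and DELAY can be overridden, e.g. DOCKER=echo for a dry run.
start_cluster() {
  for name in $CONTAINERS; do
    "${DOCKER:-docker}" start "$name" || return 1
    sleep "${DELAY:-10}"
  done
}

# Dry run: print what would be started without touching Docker.
DOCKER=echo DELAY=0 start_cluster
```

Calling start_cluster without the overrides runs the real docker start commands with a 10 second pause between them.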

Conclusion

The development environment is now ready with a MongoDB sharded cluster. You can use the MongoChef GUI client to connect to the Mongo router.

References

  1. https://github.com/crosbymichael/skydock
  2. https://hub.docker.com/_/mongo/
  3. https://medium.com/@gargar454/deploy-a-mongodb-cluster-in-steps-9-using-docker-49205e231319

How to import data from MS SQL Server into Elasticsearch 1.6


Problem:

Analytics and visualization are needed for audit log data that is stored in a relational database.

Solution

One solution to this problem is to visualize the data in an open source tool like Kibana. Kibana, however, uses Elasticsearch for search and storage.

So the selected records need to be imported from the relational database into Elasticsearch 1.6. A new index will be created in Elasticsearch for this data, and Kibana will use it.

Prior to Elasticsearch 1.6 the river plugin was available for this purpose, but it is now deprecated.

To solve the same problem, another standalone Java utility known as elasticsearch-jdbc is available.

Here I am going to show you how to use this utility through Docker, so that whenever you need it, it is only a three-step process: clone it, build the image, and start the container with parameters.

Prerequisite:

  1. Ubuntu 14.04
  2. Install Docker Host
  3. Install elasticsearch
    docker run -d -p 9200:9200 -p 9300:9300 elasticsearch 
    
  4. Install Kibana

Step 1: Check out the Dockerfile from https://github.com/vinayakbhadage/data-importer

git clone https://github.com/vinayakbhadage/data-importer.git

Step 2: Change the required parameters in the file dataimport.sh as mentioned here

Step 3: Build the image from the Dockerfile

docker build -t data-importer .

Step 4: Run the data-importer by setting the following parameters

1. LAST_EXECUTION_START="2014-06-06T09:08:00.948Z"

This date and time is used to import data from the log table of your database. All records in that table with a timestamp column value greater than this will be imported into Elasticsearch.

2. INDEX_NAME=** Provide the value **

This is the index name for Elasticsearch.

3. CLUSTER=** Provide the value **

Provide the Elasticsearch cluster name.

4. ES_HOST=** Provide the value **

Provide the Elasticsearch host name or IP address.

5. ES_PORT="9300"

Provide the Elasticsearch host port number.

6. SCHEDULE="0 0/10 * * * ?"

The default interval for data-importer is 10 minutes. This is Quartz cron trigger syntax.

7. SQL_SERVER_HOST="Provide the value"

This should be the SQL Server database IP address or hostname.

8. DB_NAME="Provide the value"

This should be the SQL Server database name.

9. DB_USER_NAME="Provide the value"

This should be the SQL Server user name; SQL Server authentication is required here.

10. DB_PASSWORD="Provide the value"

This should be the SQL Server user password; SQL Server authentication is required here.

Note: Please change the environment variables as per your requirements

docker run -d --name data-importer -e LAST_EXECUTION_START="2014-06-06T09:08:00.948Z" \
  -e INDEX_NAME="myindex"  -e CLUSTER="elasticsearch" -e ES_HOST="myeshost" \
  -e ES_PORT="9300" -e SCHEDULE="0 0/10 * * * ?" -e SQL_SERVER_HOST="mydb" \
  -e DB_NAME="mydb" -e DB_USER_NAME="myuser" -e DB_PASSWORD="find-out" data-importer
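Since the container depends on ten environment variables, it is easy to forget one. The following is a small sketch of a pre-flight check that could run before docker run; the helper name require_vars is hypothetical and is not part of data-importer.

```shell
# Return non-zero and report each name in "$@" whose variable is unset
# or empty. Only pass trusted variable names, since eval is used.
require_vars() {
  missing=0
  for var in "$@"; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example with two of the variable names from the list above.
INDEX_NAME="myindex"
ES_HOST="myeshost"
if require_vars INDEX_NAME ES_HOST; then
  echo "ok to start data-importer"
fi
```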

Lastly, check the status of the Elasticsearch index; you should find the imported data there.
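One way to check is to ask Elasticsearch how many documents the index holds. The sketch below only builds the URL of the _count endpoint; the helper name es_count_url is hypothetical, and it assumes the HTTP port 9200 published in the prerequisites (9300, the value given for ES_PORT, is a different port).

```shell
# Build the URL of the document-count endpoint for an index.
# Arguments: host, index name, optional HTTP port (default 9200).
es_count_url() {
  host="$1"
  index="$2"
  port="${3:-9200}"
  printf 'http://%s:%s/%s/_count\n' "$host" "$port" "$index"
}

es_count_url myeshost myindex
# Possible use (not run here): curl -s "$(es_count_url myeshost myindex)"
```

A count that grows after each scheduled run suggests the importer is working.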

Health check and Analytics for enterprise application using InfluxDB and Grafana


Problem:

We need to monitor real-time performance metrics of an enterprise application, together with server stats, for the following parameters.

  1. Log the response time taken by each web API.
  2. Log the time required to fetch data from external sources such as a cache, database, web API, etc.
  3. Log the application-specific exceptions.
  4. Log server stats such as CPU, memory, disk, or network usage.
  5. Log the pages visited, with their search criteria.

For each stat, metadata such as the machine name and tenant-specific or consumer-specific info also needs to be provided.

Solution:

One of the newer schools of thought on this problem is to use the InfluxDB time series database to collect the log data and Grafana to monitor the time series for each metric, instead of using StatsD with Graphite.

To push the application performance metrics to InfluxDB, we developed the InfluxDB.UDP client, which pushes the data into InfluxDB in the JSON format supported by InfluxDB. The InfluxDB UDP plug-in needs to be enabled to receive the UDP data.

To push server stats or performance counters such as CPU and network usage, we used CollectD on Linux servers and CollectM, a Node.js-based utility, on Windows servers.

The overall solution to this problem is as shown below:

InfluxDB supports aggregations such as min, max, mean, and count on time series data by creating time buckets (e.g. the count of a particular request per 30-minute window). It supports a SQL-like query DSL.
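To make the idea of time buckets concrete, here is a small sketch in shell and awk that counts events per fixed window. InfluxDB does this on the server through its query language; this stand-in only illustrates the arithmetic, and the function name bucket_counts and the sample data are made up.

```shell
# Read "epoch_seconds value" lines on stdin and print "bucket_start count"
# for each window of the given size in seconds (default 1800 = 30 min).
bucket_counts() {
  window="${1:-1800}"
  awk -v w="$window" '
    { start = $1 - ($1 % w); hits[start]++ }
    END { for (start in hits) print start, hits[start] }
  ' | sort -n
}

# Three sample requests: two in the first half hour, one in the second.
printf '%s\n' '60 120' '900 95' '2000 110' | bucket_counts 1800
```

Each timestamp is rounded down to the start of its window, so 60 and 900 share bucket 0 while 2000 falls into the bucket starting at 1800.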

Setup Guide InfluxDB, Grafana, CollectD, CollectM and InfluxDB.UDP

I have used docker to install the InfluxDB and Grafana on Ubuntu 14.04.

Prerequisite

  1. Ubuntu 14.04
  2. Install Docker 1.6

Step 1:

Install InfluxDB using the Docker image; instructions are provided here

Step 2:

Install Grafana with Docker by running the following command:

sudo docker run -d -p 3000:3000 grafana/grafana

After this, open Grafana in a browser on port 3000 and configure InfluxDB as a data source as per the provided instructions.

Step 3:

Install CollectD on Linux to monitor server stats.

sudo apt-get update
sudo apt-get install collectd collectd-utils

Change the collectd configuration file '/etc/collectd/collectd.conf' to use InfluxDB as the CollectD storage:

Hostname "your-host-name"

LoadPlugin network

#influxdb client configuration
<Plugin network>
    Server "influxDb-ip-or-domain-name" "25826"
</Plugin>

and restart the collectd service:

sudo service collectd restart

Step 4

Install CollectM on Windows and change its configuration to point to InfluxDB.

Go to Installation Directory\CollectM\config\default.json and
change the following settings:

….
"Interval": 10,

"Network": {
  "servers": [
    {
      "hostname": "influxDb-ip-or-domain-name",
      "port": 25826
    }
  ]
},

Step 5

InfluxDB.UDP is a .NET library that pushes application performance metrics to InfluxDB. The instructions are available here.

Conclusion:

This is going to be useful for DevOps to check application health and performance in real time in any environment, such as Dev, QA, Stage, or Production. As future development, alerts or notifications need to be created based on the available stats.