DevOps Analytics: Five Steps To Visualise Your Jenkins/UrbanCode Deploy Delivery Pipeline In Action!
Developers are delivering code into source control, let's say Git as an example, and from there you're doing automated builds in Jenkins and using IBM's UrbanCode Deploy to automate deployments.
Everyone is happy because releases are getting out the door quicker than before and with less effort! And the automated tests you've added to the pipeline are helping weed out issues earlier, so quality is improving as well.
But one thing that isn't as easy is getting a sense of what is happening in your pipeline, both in terms of seeing where a particular set of changes is in the pipeline, and of understanding how well the pipeline is performing.
Enter DevOps Analytics!
By this I mean the following three capabilities:
- A live data set representing all of the activity that is occurring in your delivery pipeline.
- A way of visualising this data. More specifically the ability to visualise the answers to certain questions of the data set.
- The capability to inspect this data in ways that allow you to draw useful conclusions and support investigative activities. Essentially this means being able to keep rephrasing the questions you are asking of the data set so that the answers become more and more interesting.
But importantly, it provides this information in a dynamic fashion, allowing different questions to be asked, and different aspects of the data to be "zoomed in" on.
For example, I might start by asking what builds a change went into. That would let me identify a particular build that I could then focus on. The next question I'd ask is where and when this build had been deployed. I could narrow down the question and ask for deploys to a particular environment, then look at a timeline of those deploys and pick out a deploy I was interested in at a particular point in time. I could then "zoom out" and look at tests that had been run against that environment to see what the quality was like after that deploy. If I find a test failure, I could again zoom out and see what other environments that test was failing in to see if it might be environment specific.
This is all actually quite easy to do!
I've created a solution, to great effect, based on the following components:
- ElasticSearch as the datastore for the pipeline data set.
- curl to make REST-based calls to post data to ElasticSearch from the various stages in my pipeline.
- Kibana to allow me to create powerful dashboards that visualise the data, and allow me to filter the views in ways that give me insights into what is happening in the pipeline.
|Figure 3 - DevOps Analytics solution components.|
Kibana provides the second and third capabilities: A way of visualising the data, as well as a way of inspecting the data by asking and rephrasing questions to give increasingly interesting answers.
Step 1 - Decide On A Pipeline Data Model
- A common model for looking at activity across the entire pipeline.
- Who/what was performing an activity.
- What application/component was involved.
- What technology stack this application/component is from.
- Where the activity was taking place.
- When it was occurring.
- How long the activity took.
- What was the input.
- What change did it produce.
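As a concrete sketch, a single pipeline event covering the points above might look something like this (the field names and values here are my own illustration; figure 4 shows the actual model):

```json
{
  "tool": "Jenkins",
  "actor": "build-agent-01",
  "application": "PaymentServices",
  "component": "soapPaymentServices",
  "runtimePlatform": "IBM Integration Bus",
  "pipelinePhase": "build",
  "environment": "BUILD",
  "startTimestamp": "2017-10-17T13:06:00Z",
  "endTimestamp": "2017-10-17T13:07:30Z",
  "input": "git commit 1a2b3c4",
  "output": "build 127"
}
```

The key point is that the same flat document shape is used for every phase - build, deploy, or test - so questions can be asked across the whole pipeline.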
Step 2 - Provision The Tools
Follow the instructions on the Elastic website to get ElasticSearch and Kibana running as Docker containers.
Note: You can skip the steps in those instructions for setting up LogStash and Beats for now. They're not necessary to get the basic solution up and running.
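As a rough sketch of what those instructions boil down to (the image tags here are an assumption from the 5.x era - check the Elastic website for current versions and options):

```shell
# Start ElasticSearch in single-node mode; the 5.x images ship with
# X-Pack security enabled by default, so we switch it off for this demo.
docker run -d --name elasticsearch -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:5.6.3

# Start Kibana and point it at the ElasticSearch container.
docker run -d --name kibana -p 5601:5601 \
  --link elasticsearch:elasticsearch \
  docker.elastic.co/kibana/kibana:5.6.3
```

With both containers up, ElasticSearch answers on port 9200 and Kibana on port 5601.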
Once you're finished with those instructions you should have your two Docker containers running.
|Figure 5 - Running containers for ElasticSearch and Kibana.|
And you should be able to access Kibana with your browser.
|Figure 6 - Kibana - no data yet!|
All we need now is some data :)
Step 3 - Instrument Your Pipeline With Data Emitters
I've referred to using a data emitter to send pipeline activity data to my ElasticSearch datastore. This is really just some scripted logic that uses curl to post the data to the ElasticSearch REST API.
So in full my data emitter consists of:
- Scripted logic to map activity data to the data model presented in figure 4.
- A JSON template file that is used by this mapping logic to create a data file ready for posting.
- Scripted logic to post the JSON data file to the ElasticSearch REST API using curl.
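Pulled together, the emitter can be as small as the sketch below. The field names follow my illustration of the data model, and the index name and ElasticSearch URL are assumptions for this example - adapt them to your own setup:

```shell
#!/bin/sh
# Minimal data emitter sketch: build an event document from a template
# and post it to the ElasticSearch REST API with curl.

ES_URL="${ES_URL:-http://localhost:9200}"

# Instantiate the JSON template with the activity's details.
# Args: tool pipelinePhase component runtimePlatform start end
build_event_json() {
  cat <<EOF
{
  "tool": "$1",
  "pipelinePhase": "$2",
  "component": "$3",
  "runtimePlatform": "$4",
  "startTimestamp": "$5",
  "endTimestamp": "$6"
}
EOF
}

# Post the event to the pipeline_events index; ElasticSearch creates
# the index automatically on the first write.
post_event() {
  build_event_json "$@" | curl -s -XPOST "$ES_URL/pipeline_events/event" \
    -H 'Content-Type: application/json' -d @-
}
```

A Jenkins build step would then call something like `post_event Jenkins build soapPaymentServices "IBM Integration Bus" "$START" "$END"` at the end of the build.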
|Figure 8 - Instantiating our JSON template.|
|Figure 9 - Invoking curl to send a message to our ElasticSearch REST API endpoint.|
|Figure 10 - Adding our data emitter to a Jenkins build.|
Then add your data emitters to your component processes running in UrbanCode Deploy. In my example, shown in figure 11, I have added my data emitter logic to an existing Ant script that is wrapped as a UCD plugin, so I don't really need to do anything special in UCD. You may need to add an additional step to your component process where you call your data emitter - either wrapped as a plugin, or just called from a Shell Script step.
|Figure 11 - Options for adding your data emitter to a UCD component process.|
One quick point - you'll notice we didn't create a table space or anything similar in our ElasticSearch repository before putting data into it. In ElasticSearch, your data lives in an index. And when you post your data, all you need to do is specify the index name and it will be created for you should it not already exist. Great :). I called my index pipeline_events.
Also note that in this example I've talked about data coming from Jenkins and UCD, but I've also been adding data emitters to other tools in the pipeline, such as test tools and SCM tools.
Step 4 - Visualise The Data That Is Rolling In
Open up Kibana, and you should see data building up. Fantastic!
|Figure 12 - DevOps Analytics data building up.|
4.1 Question 1 - What Activity Is Happening In My Pipeline?
We'll use a simple donut chart visualization for this that shows a break-down of the activity in the pipeline by pipeline phase. In the example in figure 13 you can see the green section shows deploys, the blue section shows builds, and the purple section shows automated tests. As you'd expect, we do more deploys than builds. Hovering over the green section we discover we've done 25 deploys which represents 48.08% of the activity in our pipeline. In this example we can also see that there aren't many automated tests being run.
|Figure 13 - A "donut chart" showing the break-down of activity in the pipeline.|
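Under the covers, a break-down like this is just a terms aggregation over the phase field. The donut chart Kibana builds is equivalent to posting a request body like the one below to the pipeline_events index's `_search` endpoint (the field name is assumed from the step 1 model; with ElasticSearch's dynamic mapping you'd typically aggregate on the `.keyword` sub-field):

```json
{
  "size": 0,
  "aggs": {
    "activity_by_phase": {
      "terms": { "field": "pipelinePhase.keyword" }
    }
  }
}
```

The response buckets - deploy, build, test, with their document counts - are exactly the slices of the donut.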
4.2 Question 2 - When Did This Activity Happen?
Most importantly, we'll want to consider when this activity is happening. Referring again to figure 4, this is represented in my data model using the start and end timestamps. This will allow us to ask time-based questions of our data set. You'll see later in step 5 how this becomes very useful, but for now let's just note that we'll be visualising the amount of activity that is happening, and when it is happening.
For this we'll use a timeline visualization, which plots the amount of activity happening during a specific period as a line on a graph. In the example shown in figure 14 we can see there were 10 events in the pipeline at some point before 2017-09-23.
|Figure 14 - A timeline showing the amount of activity in the pipeline for a period.|
|Figure 15 - "Zoomed in" view showing time of events across a shorter period.|
4.3 Question 3 - What Technologies Are Involved In This Activity?
Most teams start by setting up a delivery pipeline that supports a single application development language/technology - commonly just plain Java. But the more interesting ones apply these same concepts to all of the application development technologies that form part of their complete Enterprise Architecture stack.
So assuming our pipeline extends to cover all of these, we'll want to be able to ask the question of which application runtime technology the activity in our pipeline is related to. (As an aside, you may think of these as separate pipelines but we'll want them all covered by our DevOps Analytics so we can ask interesting questions that span the runtime technology boundaries.)
This information is represented in my data model as runtimePlatform. To visualize this, we'll use a tag cloud. Figure 16 shows that our pipeline is building, deploying and testing applications for IBM Integration Bus, IBM WebSphere Liberty, and IBM BPM (Core and AAIM). The relative sizes of each platform name gives an indication as to how much activity there has been in the pipeline for that platform.
|Figure 16 - A tag cloud of runtime platforms that are being serviced by the pipeline.|
So let's say that we'd also like to ask the question of which tools are involved in the activity in our pipeline. Figure 17 shows that our pipeline activity is occurring in Jenkins and UrbanCode Deploy.
|Figure 17 - A tag cloud of pipeline tools servicing our pipeline.|
4.4 Putting This All Together On A Dashboard
Step 5 - Asking Interesting Questions Of Your Data
Figure 19 shows that we've zoomed in on 7 IBM Integration Bus events that all occurred between 13:06 and 13:24 on the 17th October.
|Figure 19 - Zoom in to a specific period on the timeline.|
|Figure 20 - Zoom in on just build events.|
|Figure 21 - Zoom in on the soapPaymentServices component.|
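Each of these zoom-ins corresponds to a filter you could equally type into Kibana's query bar using Lucene query syntax (the field names again assume the step 1 model):

```
pipelinePhase:build
runtimePlatform:"IBM Integration Bus"
component:soapPaymentServices AND pipelinePhase:build
```

Combined with the time picker, filters like these are how you keep rephrasing the question until the answer gets interesting.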
Try out the steps suggested in this blog post and visualize what is going on in your own delivery pipeline! Please do let me know how you get on.
Copyright © Continualoop Blog 2017. All rights reserved.