Last year, I posted an article on my experience ingesting a CSV file into Druid. Today, I recreated that work and am posting a step-by-step guide. You may be wondering what you can do with Druid data, so I wrote up a quick follow-up.
For this step, I’m going to use the open-source visualization tool Swiv, which used to be Pivot. Swiv is a web-based exploratory visualization UI built on top of a library called Plywood.
And voilà, http://localhost:9090 leads to Swiv, a data visualization tool! For some reason, the datasource names were not showing in the UI, but I have only two, so it was easy to find my seattle-weather data. A few drag and drops later, I got the following:
Let’s face it, though, the distribution of counts of various precipitation values is not very interesting. Let’s add some new metrics. The options for aggregations are shown in the Druid documentation at http://druid.io/docs/latest/querying/aggregations.html.
Replace the contents of metricsSpec in seattle-weather.json with the following:
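As a rough sketch of what such a metricsSpec could look like, the Python snippet below builds one and prints it as JSON. It uses Druid's doubleMax and doubleMin aggregators from the documentation linked above. The input column names TMAX and TMIN are assumptions for illustration, as are the output names maximum and minimum; substitute whatever your CSV header and ingestion spec actually use.

```python
import json

# Hypothetical column names: the real ones depend on the CSV header
# referenced by seattle-weather.json.
MAX_TEMP_COLUMN = "TMAX"
MIN_TEMP_COLUMN = "TMIN"


def build_metrics_spec(max_col=MAX_TEMP_COLUMN, min_col=MIN_TEMP_COLUMN):
    """Return a metricsSpec list with a row count plus max/min temperature."""
    return [
        # Number of raw rows rolled up into each Druid row.
        {"type": "count", "name": "count"},
        # Highest value of the max-temperature column per rolled-up row.
        {"type": "doubleMax", "name": "maximum", "fieldName": max_col},
        # Lowest value of the min-temperature column per rolled-up row.
        {"type": "doubleMin", "name": "minimum", "fieldName": min_col},
    ]


if __name__ == "__main__":
    # Paste the printed list into the metricsSpec field of the ingestion spec.
    print(json.dumps({"metricsSpec": build_metrics_spec()}, indent=2))
```

Generating the fragment from code is optional; editing the JSON by hand works just as well for two metrics.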
Then, disable the data source, and re-ingest the data in Druid. I had to restart Swiv to get it to pick up the two new metrics.
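If you would rather script these two steps than click through the console, here is a minimal sketch using only the Python standard library. It assumes a local Druid of roughly this vintage, with the coordinator on port 8081 and the overlord on port 8090, where a DELETE on the coordinator's datasource endpoint disables the data source and a POST to the overlord's task endpoint submits an ingestion spec. The ports and helper names are my assumptions, so check them against your own install.

```python
import json
import urllib.request

# Assumed default ports for a local Druid install; adjust to your setup.
COORDINATOR = "http://localhost:8081"
OVERLORD = "http://localhost:8090"


def disable_request(datasource, coordinator=COORDINATOR):
    """Build the request that disables a data source on the coordinator."""
    url = f"{coordinator}/druid/coordinator/v1/datasources/{datasource}"
    return urllib.request.Request(url, method="DELETE")


def ingest_request(spec, overlord=OVERLORD):
    """Build the request that submits an ingestion task to the overlord."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        f"{overlord}/druid/indexer/v1/task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running Druid cluster and the spec file from this post.
    with open("seattle-weather.json") as handle:
        task_spec = json.load(handle)
    print(send(disable_request("seattle-weather")))
    print(send(ingest_request(task_spec)))
```

The requests are built separately from the call that sends them, so you can inspect the URL and method before anything touches the cluster.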
When I did, I was able to group by maximum and minimum temperature. Here’s an example:
Here, you can see the variation of high temperatures (the ‘maximum’) in Seattle during January 2015. By default, the graph is split by hour. If you click on the ‘Time (Hour)’ control, you can change the granularity to ‘1 day’.
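To get a feel for what the granularity switch means on the Druid side, here is a sketch of a native timeseries query that asks for the daily high over January 2015. This is not the exact query Swiv sends (it generates its queries through Plywood), and the metric name maximum is an assumption carried over from the metricsSpec. On a default local setup, a query like this would be POSTed to the broker's /druid/v2 endpoint.

```python
import json


def daily_high_query(datasource="seattle-weather", metric="maximum"):
    """Build a Druid timeseries query for the daily maximum of a metric."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        # "day" matches the '1 day' choice in the UI; "hour" is the default split.
        "granularity": "day",
        # ISO 8601 interval covering January 2015 (end is exclusive).
        "intervals": ["2015-01-01/2015-02-01"],
        "aggregations": [
            # Re-aggregate the ingested metric column with doubleMax.
            {"type": "doubleMax", "name": metric, "fieldName": metric},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(daily_high_query(), indent=2))
```

Changing "granularity" to "hour" gives the hourly buckets the graph starts with.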
With this example in hand, you should be well on your way toward happy visualization in Swiv using Druid as a data source!