Using Redis TimeSeries with Grafana for Real-Time Analytics

Key Takeaways

  • Time-series data management is critical for data analytics initiatives in organizations. Examples of time-series data are stock prices or CPU performance metrics.
  • Purpose-built time-series databases like RedisTimeSeries address the needs of handling time-series data and also remove the limitations enforced by relational databases.
  • Other databases purpose-built for time-series data include InfluxDB and Prometheus.
  • By integrating Grafana with RedisTimeSeries, you can zoom in or zoom out on the charts in real time.

Time-series data is broadly defined as a series of data stored in time order. Examples of time-series data can range from stock prices over a period of many years to CPU performance metrics from the past few hours. Time-series data is widely used across many industry verticals. It has carved out its own category of databases, because relational, document-oriented and streaming databases do not fulfill the needs of this particular type of data.

Characteristics of time-series data

Due to its distinct characteristics (listed below), time-series data is typically managed inefficiently by general-purpose databases:

  1. High-speed data ingest: Whether it is an IoT use case or market analysis data, you have a steady stream of data that arrives at high speeds and often in bursts. For most solutions, the data arrives 24/7, 365 days a year.
  2. Immutable data: Once inserted in the database, a data point does not undergo any changes until it is expired or deleted. The data is typically log data with a timestamp and a few data points.
  3. Unstructured labels: Time-series data is generally produced continuously over a long period of time by many sources. For example, in an IoT use case, every sensor is a source of time-series data. In such situations, each data point in the series stores the source information and other sensor measurements as labels. Data labels from every source may not conform to the same structure or order.
  4. Diminishing value over time: Only an aggregated summary of data with an appropriate time range would be relevant in the future. For example, a year from now, most users will not require every data point stored at millisecond granularity. Only data aggregated and generalized by minute, hour or day would make sense in that case.
  5. Queries are aggregated by time intervals: Charts based on time-series data enable you to zoom in and out. They do so by aggregating their data by time intervals. Typically, time-series data queries are aggregations. This is in contrast to retrieving individual records from the database.
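Point 5 can be sketched in a few lines of pure Python: aggregating raw points into wider or narrower time buckets is all a chart needs to zoom in or out. This is an illustrative sketch of the idea, not RedisTimeSeries code:

```python
from collections import defaultdict

def aggregate(points, bucket_ms):
    """Average (timestamp_ms, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket
        buckets[ts - ts % bucket_ms].append(value)
    return sorted((ts, sum(vals) / len(vals)) for ts, vals in buckets.items())

points = [(0, 10.0), (500, 20.0), (1000, 30.0), (1500, 40.0)]
aggregate(points, 1000)  # zoomed in:  [(0, 15.0), (1000, 35.0)]
aggregate(points, 2000)  # zoomed out: [(0, 25.0)]
```

Widening the bucket from one second to two halves the number of points the chart has to render, which is exactly what happens when you zoom out.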

Problems with using traditional databases for time-series use cases

Many solutions still store time-series data in a relational database. This approach has many drawbacks, because relational databases:

  • Are designed and optimized for transactional use cases.
  • Carry the overhead of locking and synchronization that are not required for the immutable time-series data. This results in slower-than-required performance for both ingest and queries. Enterprises then end up investing in additional compute resources to scale out.
  • Enforce a rigid structure for labels and cannot accommodate unstructured data.
  • Require scheduled jobs for cleaning up old data.
  • Are used for multiple use cases. Overuse of running time-series queries may affect other workloads.

Rethinking the time-series database

A purpose-built time-series database addresses the needs of handling time-series data. It also removes the limitations enforced by relational databases. RedisTimeSeries is purpose-built to collect, manage and deliver time-series data at scale. It delivers:

  • Fast data ingest: As an in-memory database, RedisTimeSeries can ingest over 500,000 records per second on a standard node. Our benchmarks show that you can ingest over 11.5 million records per second with a cluster of 16 Redis shards.
  • Resource efficiency: With RedisTimeSeries, you can add rules to compact data by downsampling. For example, if you have collected more than one billion data points in a day, you could aggregate the data by every minute in order to downsample it, thereby reducing the dataset size to 24 * 60 = 1,440 data points. You can also set data retention policies and expire data by time when you no longer need it.

Figure 1: Downsampling and aggregation using a time-series database

  • Easy, fast queries: RedisTimeSeries allows you to aggregate data by average, minimum, maximum, sum, count, range, first and last. You can run over 100,000 aggregation queries per second with sub-millisecond latency. You can also perform reverse lookups on the labels in a specific time range.

Some other databases purpose-built for time-series data include InfluxDB and Prometheus.

A typical time-series database is built to manage only time-series data, so one challenge it faces is with use cases that involve computation on top of that data. Take, for example, a live video feed captured in a time-series database: to apply an AI model for face recognition, you would have to extract the time-series data, apply some data transformation and then run the computation. This is not ideal for a real-time use case. Multi-model databases that also manage other data models solve for these use cases by letting multiple data models be manipulated in place.

A quick start guide to RedisTimeSeries

The quickest way to get started with RedisTimeSeries is to add it as a data source to your Grafana dashboard.  In the next section of this article, I’ll walk you through how I loaded sample time-series data into RedisTimeSeries and viewed the data in a Grafana dashboard.

I chose to compare the performance of stock prices for Apple Inc. (AAPL) and Intel Corporation (INTC) over 19 years, using a chart on a Grafana panel:


Figure 2: Comparison of Apple and Intel stock performance using RedisTimeSeries and Grafana

My RedisTimeSeries setup

I started out by downloading the RedisTimeSeries source code from GitHub and building it locally. I then imported the “.so” file into Redis using the command:

MODULE LOAD [path to]/redistimeseries.so

I could also have loaded the module by inserting the following line in redis.conf:

loadmodule [path to]/redistimeseries.so

If you prefer using Docker, you can give it a try by issuing the following command:

docker run -p 6379:6379 -it --rm redislabs/redistimeseries

Once my Redis server was up, I checked whether Redis had successfully loaded the module by running “module list.” Lo and behold, “timeseries” was listed as one of the modules:

127.0.0.1:6379> module list
1) 1) "name"
   2) "timeseries"
   3) "ver"
   4) (integer) 200

Sample dataset: Over 19 years of stock market data

I downloaded daily stock prices for AAPL and INTC from the Wall Street Journal. The file included prices from the year 2000 until now in CSV (comma-separated values) format. Here’s some sample data for AAPL:

2006-01-03,10.34,10.68,10.32,10.68,201853036,AAPL
2006-01-04,10.73,10.85,10.64,10.71,155225609,AAPL
2006-01-05,10.69,10.7,10.54,10.63,112396081,AAPL
2006-01-06,10.75,10.96,10.65,10.9,176139334,AAPL
2006-01-09,10.96,11.03,10.82,10.86,168861224,AAPL
2006-01-10,10.89,11.7,10.83,11.55,570088246,AAPL
2006-01-11,11.98,12.11,11.8,11.99,373548882,AAPL
2006-01-12,12.14,12.34,11.95,12.04,320201966,AAPL
2006-01-13,12.14,12.29,12.09,12.23,194153393,AAPL
2006-01-17,12.24,12.34,11.98,12.1,209215265,AAPL
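Before each row can be stored, its date string has to become an epoch timestamp in milliseconds. As a standalone sketch of that conversion — using calendar.timegm, which interprets the date as UTC, whereas the import script below uses time.mktime and therefore the machine's local timezone:

```python
import calendar
import time

def to_epoch_ms(date_str):
    """Convert a YYYY-MM-DD date string to a UTC epoch timestamp in milliseconds."""
    return calendar.timegm(time.strptime(date_str, '%Y-%m-%d')) * 1000

to_epoch_ms('2006-01-03')  # 1136246400000
```

Either variant works for charting, as long as every row is converted consistently; the UTC version has the advantage of producing the same timestamps on any machine.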

Next, I wrote a Python script to import this data into RedisTimeSeries:

import sys
import csv
import time
import redis

# The ticker symbol is passed as a command-line argument
ticker = sys.argv[1] if len(sys.argv) > 1 else 'test'
file = ticker + '.csv'

r = redis.Redis(host='localhost', port=6379, db=0)

with open(file) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    r.execute_command('TS.CREATE', 'stock:' + ticker)
    count = 0
    for row in csv_reader:
        # Convert the YYYY-MM-DD date to an epoch timestamp in milliseconds
        time_tuple = time.strptime(row[0], '%Y-%m-%d')
        time_epoch = int(time.mktime(time_tuple) * 1000)
        # row[1] is the day's opening price
        r.execute_command('TS.ADD', 'stock:' + ticker, time_epoch, row[1])
        count += 1

print(f'Imported {count} lines')

As you can see, I used RedisTimeSeries’ TS.CREATE command to establish the new time-series data structure, and its TS.ADD command to populate that data structure. For the symbol AAPL, this program created a data structure called stock:aapl. An example command for adding data looks like:

TS.ADD stock:aapl 1513324800000 173.04

I next ran TS.RANGE to verify the data. Note that the timestamps used in this query are in milliseconds.

127.0.0.1:6379> TS.RANGE stock:aapl 1513324800000 1514324800000 aggregation avg 1
1) 1) (integer) 1513324800000
   2) "173.63"
2) 1) (integer) 1513584000000
   2) "174.88"
3) 1) (integer) 1513670400000
   2) "174.99000000000001"
4) 1) (integer) 1513756800000
   2) "174.87"
5) 1) (integer) 1513843200000
   2) "174.16999999999999"
6) 1) (integer) 1513929600000
   2) "174.68000000000001"
7) 1) (integer) 1514275200000
   2) "170.80000000000001"

In the next step, I’ll explain how I used Grafana to view and compare the stock prices.

Viewing RedisTimeSeries data in Grafana

In this section, I will walk through how I installed Grafana and used the SimpleJSON data connector to read data from RedisTimeSeries. To do this, I developed a new SimpleJSON data source application. It’s an intermediary HTTP-based Node.js application that translates SimpleJSON calls into Redis calls, and RedisTimeSeries data into JSON data.

Step 1: Install Grafana

First, I used the Homebrew utility to install Grafana on my Mac (if you’re using a PC, follow Grafana’s instruction manual to install and set it up). I ran the following commands to get Grafana up and running:

$ brew install grafana
$ brew services start grafana
==> Successfully started `grafana` (label: homebrew.mxcl.grafana)

With Grafana now running on port 3000, I could log in using http://localhost:3000.

Step 2: Develop and run the SimpleJSON data source application

Grafana comes with a built-in data source connector called SimpleJSON, which connects to any application with an HTTP server that supports “/”, “/search”, and “/query”. Since RedisTimeSeries doesn’t have its own connector as yet, I developed a new Node.js application supporting the HTTP protocol and the required queries for the SimpleJSON data source application. You can download my code from GitHub and run it in your local Node.js environment.

Each HTTP query in the SimpleJSON data source application has a unique purpose, and I developed my program with the following design principles for each one:

1. “/”: This is a default request that should respond with an arbitrary message. It is used to test the connection (like a ping test).

2. “/search”: This should return the list of keys that hold the time-series data. (With other databases, this could be a list of table names instead of keys; however, since Redis is a key-value store, the application returns the list of keys that are of the special time-series type.)

To get this list of keys, I used the safer SCAN command instead of KEYS. For each key, I checked whether it was of type TSDB-TYPE, the internal name used for time-series keys. The program maintains an array of all the keys of that type, and returns the array in JSON format.

3. “/query”: The query command receives input arguments that include the list of keys, start time, end time and bucket time. The application returns time-series data in JSON format based on the input commands.

There is also a fourth HTTP request, called “/annotations”, but I did not require that request for this sample application.
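To make those responses concrete, here is a minimal pure-Python sketch of the response-shaping logic. The helper names are mine, the Redis calls themselves are omitted, and the datapoints layout — [value, timestamp-in-milliseconds] pairs — follows the SimpleJSON convention:

```python
def search_response(keys_with_types):
    """'/search': keep only the keys whose Redis type marks them as time series."""
    return [key for key, rtype in keys_with_types if rtype == 'TSDB-TYPE']

def query_response(target, ts_range_reply):
    """'/query': reshape a TS.RANGE-style reply ([[ts_ms, 'value'], ...]) into
    the [value, timestamp_ms] datapoint pairs that SimpleJSON expects."""
    return {
        'target': target,
        'datapoints': [[float(value), ts] for ts, value in ts_range_reply],
    }

search_response([('stock:aapl', 'TSDB-TYPE'), ('user:1', 'hash')])
# ['stock:aapl']
query_response('stock:aapl', [[1513324800000, '173.63']])
# {'target': 'stock:aapl', 'datapoints': [[173.63, 1513324800000]]}
```

Note that TS.RANGE returns pairs as (timestamp, value) while SimpleJSON wants (value, timestamp), so the translation layer has to swap the order and convert the string value to a number.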

Once I had the code ready, I ran the node application. The sample code listened to the HTTP requests on port 3333, so I could test it on a browser by calling http://localhost:3333. It returned, “I have a quest for you!”

Step 3: Connect Grafana with RedisTimeSeries

This was the easiest of all the steps. After logging in to Grafana, I added a data source by going to Configuration > Data Sources, and clicking “Add data source.”


I searched for the SimpleJSON option and selected it.


This opened a configuration screen, where I entered the URL to connect to my Node.js application.


Now that I had the data source configured, I could add new panels to my dashboard. For this example, I added a panel with two queries: one time-series query for each stock ticker. As shown in the picture, the drop down menu for my query had already populated the time-series keys stock:aapl and stock:intc. The chart also populated with the data as soon as I selected the time-series keys. Behind the scenes, the SimpleJSON connector had called our application with appropriate queries (“/search” and “/query”).


Here’s the end result: a Grafana panel querying RedisTimeSeries. It was quite simple to set up RedisTimeSeries and connect it with Grafana.

In conclusion, RedisTimeSeries combines all the benefits of Redis and a purpose-built time-series database. It can help your business in many ways, including by saving on resources, supporting more end users and bringing your apps to market faster through easy integration. By integrating Grafana with RedisTimeSeries, you can zoom in or zoom out on the charts in real time. You can also handle more queries per second, letting your dashboards show more data points in their panels. On top of that, you can add more panels and serve more end users.

About the Author


Roshan Kumar is a senior product manager at Redis Labs, Inc. He has extensive experience in software development and product management in the technology sector. In the past, Kumar has worked at Hewlett-Packard, and a few successful Silicon Valley startups. He holds a bachelor’s degree in Computer Science, and an MBA from Santa Clara University, California, USA.
