NanoTDB - A compact database for sensor data on low-resource hardware.

Pitch

NanoTDB is a lightweight, append-only time-series database tailored for long-running sensor data on modest hardware such as Raspberry Pi and IoT gateways. It requires no external dependencies and simplifies data management by storing everything in plain files within a single directory.

Description

NanoTDB is an embedded time-series database for resource-limited environments such as Raspberry Pi devices, edge computing nodes, and Internet of Things (IoT) gateways. It is designed for simplicity: it has no external dependencies and stores all data as plain files under a single root directory.

Architecture Overview

The architecture of NanoTDB revolves around the Engine, the single entry point for all database operations. The Engine manages multiple named databases, for example:

prod - production data
sensors - sensor data
internal - engine self-metrics

Storage Layers

Each database consists of three layers to ensure data integrity and performance:

Write-Ahead Log (WAL): Ensures crash safety by recording every sample before it is written to the data files.
Catalog: A mapping of metric names to compact MetricIDs and value types, maintained as a JSON file.
Data Files: Compressed, immutable pages that are flushed from memory and organized into partitioned files.

Data Flow

Data is ingested through the AddLine method, which accepts a sample in line-protocol format. Here is a concise example:

AddLine("prod/room.temp 21.5 1715000000000000000")

The method parses the incoming line, records the sample in the WAL, and compresses it into the appropriate data files. Timestamp rules are strict: out-of-order samples are rejected.

Querying Data

To retrieve data, NanoTDB provides a QueryRange function that scans the partitions covering the requested timestamp range. A sample query might look like this:

QueryRange("prod", "room.temp", fromTS, toTS, stride, callback)

The call names the database and the metric, bounds the scan with fromTS and toTS, and takes a stride and a callback.
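
To make the shape of such a range scan concrete, here is a small in-memory Go sketch. It is not NanoTDB's implementation: scanRange and point are hypothetical names, the bounds are assumed to be inclusive, and stride is assumed to mean "deliver every stride-th sample", which may not match the real semantics.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical in-memory sample.
type point struct {
	TS    int64
	Value float64
}

// scanRange calls fn for every stride-th point with fromTS <= TS <= toTS.
// pts must be sorted by TS. fn returns false to stop the scan early.
func scanRange(pts []point, fromTS, toTS int64, stride int, fn func(point) bool) {
	if stride < 1 {
		stride = 1
	}
	// Binary search for the first point at or after fromTS.
	start := sort.Search(len(pts), func(i int) bool { return pts[i].TS >= fromTS })
	for i := start; i < len(pts) && pts[i].TS <= toTS; i += stride {
		if !fn(pts[i]) {
			return
		}
	}
}

func main() {
	pts := []point{{10, 1}, {20, 2}, {30, 3}, {40, 4}, {50, 5}}
	scanRange(pts, 20, 50, 2, func(p point) bool {
		fmt.Println(p.TS, p.Value)
		return true
	})
}
```

A callback-based interface like this lets the caller process samples one at a time instead of holding the whole result set in memory, which suits low-resource hardware.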

Line Protocol

NanoTDB uses a simple line protocol for recording and processing data, which is structured as follows:

DB/metric.name value [ts]

The database creates a metric automatically on first write and assigns its value type based on the initial data entries, so metrics do not have to be declared before ingestion.
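
A minimal Go sketch of parsing this format follows. It is illustrative only: ParseLine and Sample are hypothetical names, the value is assumed to be a float (other value types are not handled), and the timestamp is assumed to be nanoseconds since the Unix epoch, which is what the sample value 1715000000000000000 suggests.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Sample is a hypothetical parsed form of one line-protocol line.
type Sample struct {
	DB     string
	Metric string
	Value  float64
	TS     int64 // assumed unit: nanoseconds since the Unix epoch
	HasTS  bool  // false when the optional timestamp was omitted
}

// ParseLine parses "DB/metric.name value [ts]".
func ParseLine(line string) (Sample, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 || len(fields) > 3 {
		return Sample{}, errors.New("expected: DB/metric.name value [ts]")
	}
	// Split the first field into database and metric at the first slash.
	db, metric, ok := strings.Cut(fields[0], "/")
	if !ok || db == "" || metric == "" {
		return Sample{}, errors.New("expected DB/metric.name as first field")
	}
	v, err := strconv.ParseFloat(fields[1], 64)
	if err != nil {
		return Sample{}, fmt.Errorf("bad value %q: %w", fields[1], err)
	}
	s := Sample{DB: db, Metric: metric, Value: v}
	if len(fields) == 3 {
		ts, err := strconv.ParseInt(fields[2], 10, 64)
		if err != nil {
			return Sample{}, fmt.Errorf("bad timestamp %q: %w", fields[2], err)
		}
		s.TS, s.HasTS = ts, true
	}
	return s, nil
}

func main() {
	s, err := ParseLine("prod/room.temp 21.5 1715000000000000000")
	fmt.Println(s, err)
}
```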

Configuration and Management

The configuration file engine.toml is created automatically on first start. Its key settings customize server behavior and include:

Listening address
Maximum segment size for WAL
Durability profiles influencing data persistence strategies

Binaries

NanoTDB provides two main utilities:

nanotdb – Server

This command starts the server and exposes a small HTTP API, enabling integration with other systems.

nanocli – Offline CLI Tool

This utility allows users to perform offline operations on the database, such as inspecting databases and importing or exporting data in line protocol.

Example CLI Commands

nanocli inspect db --root <dir>
nanocli import --root <dir> --in <file.lp>
nanocli export --root <dir> --db <name> --out <file.lp>
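
Since import and export both work on line protocol, a file for nanocli import could be produced by any program that emits lines in that format. The Go sketch below shows one way to format such lines; FormatLine is a hypothetical helper, and float values with nanosecond timestamps are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// FormatLine renders one sample as "DB/metric.name value ts".
// Hypothetical helper; assumes a float value and an integer timestamp.
func FormatLine(db, metric string, value float64, ts int64) string {
	return db + "/" + metric + " " +
		strconv.FormatFloat(value, 'f', -1, 64) + " " +
		strconv.FormatInt(ts, 10)
}

func main() {
	// Two samples one second apart, written to standard output.
	base := int64(1715000000000000000)
	fmt.Println(FormatLine("prod", "room.temp", 21.5, base))
	fmt.Println(FormatLine("prod", "room.temp", 21.75, base+1_000_000_000))
}
```

Redirecting the output to a file gives something that can be passed to the --in option shown above.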

Embedding the Engine

For developers looking to embed the database engine, NanoTDB offers a simple API:

e, err := engine.OpenEngine("/data", 0)
// Handle err here before using e.
defer e.Close()

This lets client applications manage time-series data in-process through the same Engine.

In summary, NanoTDB is designed for time-series data handling in environments where performance and resource efficiency are critical, with sensor data on small devices as its primary use case.
