
Chronik Stream


A high-performance streaming platform built in Rust that implements core Kafka wire protocol functionality with comprehensive Write-Ahead Log (WAL) durability and automatic recovery.

Latest Release: v2.2.18 - Performance fix for hot-path logging. See CHANGELOG.md for full release history.

✨ What's New in v2.2.18

  • πŸš€ Performance Fix: Reduced hot-path logging from INFO to TRACE level
  • πŸ“ˆ +168% Throughput: 146K β†’ 391K msg/s (default logging, no RUST_LOG override needed)
  • ⚑ 99% Lower p99 Latency: 46ms β†’ 0.5ms by eliminating per-request logging overhead

Fixed Logs (changed from info! to trace!):

  • Produce request logging in kafka_handler
  • PARTITION_DEBUG logging in produce_handler
  • PRODUCEβ†’BUFFER logging in produce_handler
  • Fetch result logging in fetch_handler

Upgrade Recommendation: All users should upgrade to v2.2.18 for optimal performance.

πŸš€ Features

  • Kafka Wire Protocol: Full Kafka wire protocol with consumer group and transactional support
  • Searchable Topics: Opt-in real-time full-text search with Tantivy (3% overhead) - see docs/SEARCHABLE_TOPICS.md
  • Full Compression Support: All Kafka compression codecs (Gzip, Snappy, LZ4, Zstd) - see COMPRESSION_SUPPORT.md
  • WAL-based Metadata: ChronikMetaLog provides event-sourced metadata persistence
  • GroupCommitWal: PostgreSQL-style group commit with per-partition background workers and batched fsync
  • Zero Message Loss: WAL ensures durability for all acks modes (0, 1, -1) even during unexpected shutdowns (see the client example after this list)
  • Automatic Recovery: WAL records are automatically replayed on startup to restore state with 100% accuracy
  • Real Client Testing: Tested with kafka-python, confluent-kafka, KSQL, and Apache Flink
  • Stress Tested: Verified at scale with millions of messages, zero duplicates, 300K+ msgs/sec throughput
  • Transactional APIs: Full support for Kafka transactions (InitProducerId, AddPartitionsToTxn, EndTxn)
  • High Performance: Async architecture with zero-copy networking optimizations
  • Multi-Architecture: Native support for x86_64 and ARM64 (Apple Silicon, AWS Graviton)
  • Container Ready: Docker deployment with proper network configuration
  • Simplified Operations: Single-process architecture reduces operational complexity
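
The durability and compression features above can be exercised directly from a standard client. Below is a minimal kafka-python sketch (hedged: these are ordinary kafka-python producer options, not Chronik-specific flags; the topic name is illustrative):

# Hedged example: acks and compression settings from a stock kafka-python producer
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    api_version=(0, 10, 0),
    acks='all',                 # 0, 1, and 'all' (-1) are all covered by WAL durability
    compression_type='lz4'      # gzip, snappy, lz4, and zstd are supported codecs
)
producer.send('durable-topic', b'persisted to the WAL before acknowledgement')
producer.flush()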

πŸ—οΈ Architecture - 3-Tier Seamless Storage

Chronik implements a unique 3-tier storage system with automatic fallback between tiers that provides infinite retention without requiring infinite local disk:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Chronik 3-Tier Seamless Storage                     β”‚
β”‚                   (Infinite Retention Design)                    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Tier 1: WAL (Hot - Local Disk)                                 β”‚
β”‚  β”œβ”€ Location: ./data/wal/{topic}/{partition}/                   β”‚
β”‚  β”œβ”€ Latency: <1ms (in-memory buffer)                            β”‚
β”‚  └─ Retention: Until sealed (250MB or 30min by default)         β”‚
β”‚        ↓ Background WalIndexer (every 30s)                       β”‚
β”‚                                                                   β”‚
β”‚  Tier 2: Raw Segments in S3 (Warm - Object Storage)             β”‚
β”‚  β”œβ”€ Location: s3://bucket/segments/{topic}/{partition}/{range}  β”‚
β”‚  β”œβ”€ Latency: 50-200ms (download + deserialize)                  β”‚
β”‚  β”œβ”€ Retention: Unlimited (cheap object storage)                 β”‚
β”‚  └─ Purpose: Message consumption after local WAL deletion        β”‚
β”‚        ↓ PLUS ↓                                                  β”‚
β”‚                                                                   β”‚
β”‚  Tier 3: Tantivy Indexes in S3 (Cold - Searchable)              β”‚
β”‚  β”œβ”€ Location: s3://bucket/indexes/{topic}/partition-{p}/...     β”‚
β”‚  β”œβ”€ Latency: 100-500ms (download + decompress + search)         β”‚
β”‚  β”œβ”€ Retention: Unlimited                                         β”‚
β”‚  └─ Purpose: Full-text search WITHOUT downloading raw data       β”‚
β”‚                                                                   β”‚
β”‚  Consumer Fetch Flow (Automatic Fallback):                      β”‚
β”‚    Phase 1: Try WAL buffer (hot, in-memory) β†’ ΞΌs latency        β”‚
β”‚    Phase 2: Try local WAL (warm, local disk) β†’ ms latency       β”‚
β”‚    Phase 3: Download raw segment from S3 β†’ 50-200ms latency     β”‚
β”‚    Phase 4: Search Tantivy index β†’ 100-500ms latency            β”‚
β”‚                                                                   β”‚
β”‚  Local Disk Cleanup:                                             β”‚
β”‚    - WAL files DELETED after successful upload to S3             β”‚
β”‚    - Old messages still accessible from S3 indefinitely          β”‚
β”‚    - No infinite local disk space required!                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚   Kafka Client  β”‚  (kafka-python, Java clients, KSQL, etc.)
    β”‚  (Any Language) β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             β”‚
             β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚         Chronik Server                  β”‚
    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
    β”‚  β”‚ Kafka Proto  β”‚  β”‚ ChronikMetaLog  β”‚ β”‚
    β”‚  β”‚ Handler      β”‚  β”‚ (WAL Metadata)  β”‚ β”‚
    β”‚  β”‚ (Port 9092)  β”‚  β”‚                 β”‚ β”‚
    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
    β”‚  β”‚   Search     β”‚  β”‚  Storage Mgr    β”‚ β”‚
    β”‚  β”‚  (Tantivy)   β”‚  β”‚  (3-Tier)       β”‚ β”‚
    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚    Object Storage         β”‚
    β”‚  (S3/GCS/Azure/Local)     β”‚
    β”‚  β€’ Raw segments (Tier 2)  β”‚
    β”‚  β€’ Tantivy indexes (Tier 3)β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Differentiators vs Kafka Tiered Storage

| Feature | Kafka Tiered Storage | Chronik Layered Storage |
|---|---|---|
| Hot Storage | Local disk | WAL + Segments (local) |
| Cold Storage | S3 (raw data) | S3 raw segments + Tantivy indexes |
| Auto-archival | βœ… Yes | βœ… Yes (WalIndexer background task) |
| Query by Offset | βœ… Yes | βœ… Yes (download from S3 as needed) |
| Full-text Search | ❌ No | βœ… Yes (Tantivy indexes, no download!) |
| Local Disk | Grows forever | Bounded (old WAL deleted after S3 upload) |

Unique Advantage: Chronik's Tier 3 isn't just "cold storage" - it's a searchable indexed archive. You can query old data by content or timestamp range without downloading or scanning raw data!
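
Because the consumer fetch path falls back to S3 automatically, reading archived data uses exactly the same client API as reading hot data; only the latency differs. A minimal kafka-python sketch (hedged: assumes a topic whose older WAL segments have already been uploaded to S3 and deleted locally):

# Hedged example: seek to the beginning of a partition whose old segments
# now live only in S3. The Tier 2/3 fallback is transparent to the client.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers='localhost:9092', api_version=(0, 10, 0))
tp = TopicPartition('archived-topic', 0)
consumer.assign([tp])
consumer.seek_to_beginning(tp)   # old offsets may resolve to data stored only in S3
for record in consumer:
    print(record.offset, record.value)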

⚑ Quick Start

Using Docker (Recommended)

# Quick start - single node
docker run -d -p 9092:9092 \
  -e CHRONIK_ADVERTISED_ADDR=localhost \
  ghcr.io/lspecian/chronik-stream:latest start

# With persistent storage
docker run -d --name chronik \
  -p 9092:9092 \
  -v chronik-data:/data \
  -e CHRONIK_ADVERTISED_ADDR=localhost \
  -e RUST_LOG=info \
  ghcr.io/lspecian/chronik-stream:latest start

# Using docker-compose
curl -O https://raw.githubusercontent.com/lspecian/chronik-stream/main/docker-compose.yml
docker-compose up -d

With S3/MinIO Object Storage

# MinIO for development
docker run -d --name chronik \
  -p 9092:9092 \
  -e CHRONIK_ADVERTISED_ADDR=localhost \
  -e OBJECT_STORE_BACKEND=s3 \
  -e S3_ENDPOINT=http://minio:9000 \
  -e S3_BUCKET=chronik-storage \
  -e S3_ACCESS_KEY=minioadmin \
  -e S3_SECRET_KEY=minioadmin \
  -e S3_PATH_STYLE=true \
  ghcr.io/lspecian/chronik-stream:latest start

# AWS S3 for production (uses IAM role)
docker run -d --name chronik \
  -p 9092:9092 \
  -e CHRONIK_ADVERTISED_ADDR=localhost \
  -e OBJECT_STORE_BACKEND=s3 \
  -e S3_REGION=us-west-2 \
  -e S3_BUCKET=chronik-prod-archives \
  ghcr.io/lspecian/chronik-stream:latest start

⚠️ Critical Docker Configuration

IMPORTANT: When running Chronik Stream in Docker or binding to 0.0.0.0, you MUST set CHRONIK_ADVERTISED_ADDR:

# docker-compose.yml example
services:
  chronik-stream:
    image: ghcr.io/lspecian/chronik-stream:latest
    ports:
      - "9092:9092"
    environment:
      CHRONIK_BIND_ADDR: "0.0.0.0"  # Just host, no port
      CHRONIK_ADVERTISED_ADDR: "chronik-stream"  # REQUIRED - use container name for Docker networks
      # or "localhost" for host access, or your public hostname/IP for remote access

Without CHRONIK_ADVERTISED_ADDR, clients will receive 0.0.0.0:9092 in metadata responses and fail to connect.
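
A quick way to verify the advertised address from a client's perspective is to force a metadata round trip; with a missing or wrong CHRONIK_ADVERTISED_ADDR the initial bootstrap may succeed but follow-up connections to the advertised 0.0.0.0:9092 will hang or fail. A hedged kafka-python check:

# Hedged connectivity check: topics() forces a metadata fetch, so a bad
# advertised address usually surfaces here rather than on the first connect.
from kafka import KafkaConsumer

consumer = KafkaConsumer(bootstrap_servers='localhost:9092', api_version=(0, 10, 0))
print(consumer.topics())   # should return the topic list without hanging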

Test with Kafka Client

# Python example
from kafka import KafkaProducer, KafkaConsumer

# Single-node setup
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    api_version=(0, 10, 0)  # Important: specify version
)
producer.send('test-topic', b'Hello Chronik!')
producer.flush()

# Consumer
consumer = KafkaConsumer(
    'test-topic',
    bootstrap_servers='localhost:9092',
    api_version=(0, 10, 0),
    auto_offset_reset='earliest'
)
for message in consumer:
    print(f"Received: {message.value}")

⚠️ CRITICAL for Cluster Deployments: When using a multi-node cluster, ALWAYS configure clients with ALL cluster brokers; otherwise leadership changes can cause produce rejections and message loss:

# βœ… CORRECT - Cluster configuration (ALL brokers)
producer = KafkaProducer(
    bootstrap_servers='localhost:9092,localhost:9093,localhost:9094',  # All 3 brokers!
    api_version=(0, 10, 0)
)

# ❌ WRONG - Single broker causes leadership rejections and message loss
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',  # Only one broker - NOT RECOMMENDED for clusters!
    api_version=(0, 10, 0)
)

See docs/100_PERCENT_CONSUMPTION_INVESTIGATION.md for detailed analysis.

Using Binary

# Download latest release (Linux x86_64)
curl -L https://github.com/lspecian/chronik-stream/releases/latest/download/chronik-server-linux-amd64.tar.gz -o chronik-server.tar.gz
tar xzf chronik-server.tar.gz
./chronik-server start

# macOS (Apple Silicon)
curl -L https://github.com/lspecian/chronik-stream/releases/latest/download/chronik-server-darwin-arm64.tar.gz -o chronik-server.tar.gz
tar xzf chronik-server.tar.gz
./chronik-server start

Building from Source

# Clone repository
git clone https://github.com/lspecian/chronik-stream.git
cd chronik-stream

# Build release binary
cargo build --release --bin chronik-server

# Run single-node
./target/release/chronik-server start

# Or run 3-node cluster locally
./target/release/chronik-server start --config config/examples/cluster/chronik-cluster-node1.toml
./target/release/chronik-server start --config config/examples/cluster/chronik-cluster-node2.toml
./target/release/chronik-server start --config config/examples/cluster/chronik-cluster-node3.toml

🌟 KSQL Integration

Chronik Stream provides full compatibility with KSQLDB (Confluent's SQL engine for Kafka) including transactional support. Simply point KSQLDB at Chronik's Kafka endpoint:

# ksql-server.properties
bootstrap.servers=localhost:9092
ksql.service.id=ksql_service_1

For detailed KSQL setup and usage examples, see docs/KSQL_INTEGRATION_GUIDE.md.
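
Once KSQLDB is connected, it is driven exactly as it would be against Kafka, for example through its standard REST API on port 8088. A hedged sketch (the stream and topic names are illustrative only):

# Hedged example: submitting a ksqlDB statement via its standard REST API.
import json, urllib.request

statement = {
    "ksql": "CREATE STREAM clicks (user_id VARCHAR, url VARCHAR) "
            "WITH (KAFKA_TOPIC='clicks', VALUE_FORMAT='JSON', PARTITIONS=1);",
    "streamsProperties": {}
}
req = urllib.request.Request(
    "http://localhost:8088/ksql",
    data=json.dumps(statement).encode(),
    headers={"Content-Type": "application/vnd.ksql.v1+json"}
)
print(urllib.request.urlopen(req).read().decode())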

🎯 Operational Modes

The unified chronik-server binary supports two deployment modes via the start command:

Single-Node Mode (Default)

Perfect for development, testing, and single-node production deployments:

# Simplest - just start
./chronik-server start

# With custom data directory
./chronik-server start --data-dir /var/lib/chronik

# With advertised address (required for Docker/remote clients)
./chronik-server start --advertise my-hostname.com:9092

Features:

  • βœ… Full Kafka protocol compatibility
  • βœ… WAL-based durability (zero message loss)
  • βœ… Automatic crash recovery
  • βœ… 3-tier storage (local + S3/GCS/Azure)
  • βœ… Full-text search with Tantivy

Cluster Mode (Multi-Node Replication)

Available in v2.2.0+: Production-ready multi-node cluster with Raft consensus, automatic replication, and zero-downtime operations.

Minimum 3 nodes required for quorum-based replication.

Quick Start (Local Testing):

# Terminal 1 - Node 1
./chronik-server start --config config/examples/cluster/chronik-cluster-node1.toml

# Terminal 2 - Node 2
./chronik-server start --config config/examples/cluster/chronik-cluster-node2.toml

# Terminal 3 - Node 3
./chronik-server start --config config/examples/cluster/chronik-cluster-node3.toml

Production Setup (3 Machines):

# On each node, create config file with unique node_id
# Example node1.toml:
enabled = true
node_id = 1
replication_factor = 3
min_insync_replicas = 2

[[peers]]
id = 1
kafka = "node1.example.com:9092"
wal = "node1.example.com:9291"
raft = "node1.example.com:5001"

[[peers]]
id = 2
kafka = "node2.example.com:9092"
wal = "node2.example.com:9291"
raft = "node2.example.com:5001"

[[peers]]
id = 3
kafka = "node3.example.com:9092"
wal = "node3.example.com:9291"
raft = "node3.example.com:5001"

# Start each node
./chronik-server start --config /etc/chronik/node1.toml

Key Features:

  • βœ… Quorum-based replication (survives minority node failures)
  • βœ… Automatic leader election via Raft consensus
  • βœ… Strong consistency (linearizable reads/writes)
  • βœ… Zero-downtime node addition and removal (v2.2.0+)
  • βœ… Automatic partition rebalancing
  • βœ… Comprehensive monitoring via Prometheus metrics

Cluster Management:

# Add node to running cluster
export CHRONIK_ADMIN_API_KEY=<key>
./chronik-server cluster add-node 4 \
  --kafka node4:9092 \
  --wal node4:9291 \
  --raft node4:5001 \
  --config cluster.toml

# Query cluster status
./chronik-server cluster status --config cluster.toml

# Remove node gracefully
./chronik-server cluster remove-node 4 --config cluster.toml

Complete Guide: See docs/RUNNING_A_CLUSTER.md for step-by-step instructions.

Configuration Options

Commands:

chronik-server start [OPTIONS]          # Start server (auto-detects single-node or cluster)
chronik-server cluster <SUBCOMMAND>     # Manage cluster (add-node, remove-node, status)
chronik-server version                  # Display version info
chronik-server compact <SUBCOMMAND>     # Manage WAL compaction

Start Command Options:

chronik-server start [OPTIONS]

Options:
  -d, --data-dir <DIR>         Data directory (default: ./data)
  --config <FILE>              Cluster config file (enables cluster mode)
  --node-id <ID>               Override node ID from config
  --advertise <ADDR:PORT>      Advertised Kafka address (for remote clients)
  -l, --log-level <LEVEL>      Log level (error/warn/info/debug/trace)

Key Environment Variables:

# Server Configuration
CHRONIK_DATA_DIR             Data directory path (default: ./data)
CHRONIK_ADVERTISED_ADDR      Address advertised to clients (CRITICAL for Docker)
RUST_LOG                     Log level (error, warn, info, debug, trace)

# Performance Tuning
CHRONIK_WAL_PROFILE          WAL performance: low/medium/high/ultra (auto-detects)
CHRONIK_PRODUCE_PROFILE      Producer flush: low-latency/balanced/high-throughput
CHRONIK_WAL_ROTATION_SIZE    WAL segment size: 100KB/250MB (default)/1GB

# Cluster Management (v2.2.0+)
CHRONIK_ADMIN_API_KEY        Admin API authentication key (REQUIRED for production clusters)

# Object Store (3-Tier Storage)
OBJECT_STORE_BACKEND         Backend: s3/gcs/azure/local (default: local)

# S3/MinIO Configuration
S3_ENDPOINT                  S3 endpoint (e.g., http://minio:9000)
S3_REGION                    AWS region (default: us-east-1)
S3_BUCKET                    Bucket name (default: chronik-storage)
S3_ACCESS_KEY                Access key ID
S3_SECRET_KEY                Secret access key
S3_PATH_STYLE                Path-style URLs (default: true, required for MinIO)
S3_DISABLE_SSL               Disable SSL (default: false)

# GCS Configuration
GCS_BUCKET                   GCS bucket name
GCS_PROJECT_ID               GCP project ID

# Azure Configuration
AZURE_ACCOUNT_NAME           Storage account name
AZURE_CONTAINER              Container name

⚑ Performance Tuning

Chronik Stream provides two layers of performance tuning for different workloads:

WAL Performance Profiles

The Write-Ahead Log is the primary performance lever. It automatically detects system resources (CPU, memory, Docker/K8s limits) and selects an appropriate profile. Override with:

CHRONIK_WAL_PROFILE=low        # Containers, small VMs (≀1 CPU, <512MB) - 2ms batch
CHRONIK_WAL_PROFILE=medium     # Typical servers (2-4 CPUs, 512MB-4GB) - 10ms batch
CHRONIK_WAL_PROFILE=high       # Dedicated servers (4-16 CPUs, 4-16GB) - 50ms batch
CHRONIK_WAL_PROFILE=ultra      # Maximum throughput (16+ CPUs, 16GB+) - 100ms batch

Benchmark results use the high profile - see BASELINE_PERFORMANCE.md for the detailed methodology.

Producer Flush Profiles

Control when buffered messages become visible to consumers:

| Profile | Batches | Flush Interval | Buffer | Use Case |
|---|---|---|---|---|
| low-latency (default) | 1 | 10ms | 16MB | Real-time analytics, instant messaging |
| balanced | 10 | 100ms | 32MB | General-purpose workloads |
| high-throughput | 100 | 500ms | 128MB | Data pipelines, ETL, batch processing |
| extreme | 500 | 2000ms | 512MB | Bulk ingestion, data migrations |

# Set producer profile (low-latency is default, use high-throughput for batch workloads)
CHRONIK_PRODUCE_PROFILE=high-throughput ./chronik-server start
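
Because the profile only changes when buffered messages become visible to consumers, a simple sanity check is to measure produce-to-consume latency under the profile you picked. A hedged kafka-python sketch (not a benchmark; use chronik-bench for that):

# Hedged sketch: rough end-to-end visibility latency for the active produce profile.
import time
from kafka import KafkaProducer, KafkaConsumer

consumer = KafkaConsumer('latency-probe',
                         bootstrap_servers='localhost:9092',
                         api_version=(0, 10, 0),
                         auto_offset_reset='latest')
consumer.poll(timeout_ms=1000)   # force partition assignment before producing

producer = KafkaProducer(bootstrap_servers='localhost:9092', api_version=(0, 10, 0))
sent_at = time.time()
producer.send('latency-probe', b'probe')
producer.flush()

record = next(consumer)          # blocks until the message becomes visible
print(f"visible after {(time.time() - sent_at) * 1000:.1f} ms")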

Benchmarking

Run the built-in benchmark tool:

cargo build --release
./target/release/chronik-bench -c 128 -s 256 -d 30s -m produce

πŸ“¦ Docker Images

All images support both linux/amd64 and linux/arm64 architectures:

| Image | Tags | Description |
|---|---|---|
| ghcr.io/lspecian/chronik-stream | latest, v2.2.17, 2.2 | Chronik server with full KSQL support |

Supported Platforms

  • βœ… Linux x86_64 (amd64)
  • βœ… Linux ARM64 (aarch64) - AWS Graviton, Raspberry Pi 4+
  • βœ… macOS x86_64 (Intel)
  • βœ… macOS ARM64 (Apple Silicon M1/M2/M3)

βœ… Kafka Compatibility

Supported Kafka APIs (24 total)

| API | Version | Status | Description |
|---|---|---|---|
| Produce | v0-v9 | βœ… Full | Send messages to topics |
| Fetch | v0-v13 | βœ… Full | Retrieve messages from topics |
| ListOffsets | v0-v7 | βœ… Full | Query partition offsets |
| Metadata | v0-v12 | βœ… Full | Get cluster metadata |
| OffsetCommit | v0-v8 | βœ… Full | Commit consumer offsets |
| OffsetFetch | v0-v8 | βœ… Full | Retrieve consumer offsets |
| FindCoordinator | v0-v4 | βœ… Full | Find group coordinator |
| JoinGroup | v0-v9 | βœ… Full | Join consumer group |
| Heartbeat | v0-v4 | βœ… Full | Consumer heartbeat |
| LeaveGroup | v0-v5 | βœ… Full | Leave consumer group |
| SyncGroup | v0-v5 | βœ… Full | Sync group assignments |
| ApiVersions | v0-v3 | βœ… Full | Negotiate API versions |
| CreateTopics | v0-v7 | βœ… Full | Create new topics |
| DeleteTopics | v0-v6 | βœ… Full | Delete topics |
| DescribeGroups | v0-v5 | βœ… Full | Describe consumer groups |
| ListGroups | v0-v4 | βœ… Full | List all groups |
| SaslHandshake | v0-v1 | βœ… Full | SASL authentication |
| SaslAuthenticate | v0-v2 | βœ… Full | SASL auth exchange |
| CreatePartitions | v0-v3 | βœ… Full | Add partitions to topics |
| InitProducerId | v0-v4 | βœ… Full | Initialize transactional producer |
| AddPartitionsToTxn | v0-v3 | βœ… Full | Add partitions to transaction |
| AddOffsetsToTxn | v0-v3 | βœ… Full | Add consumer offsets to transaction |
| EndTxn | v0-v3 | βœ… Full | Commit or abort transaction |
| TxnOffsetCommit | v0-v3 | βœ… Full | Commit offsets within transaction |
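
The transactional rows above (InitProducerId, AddPartitionsToTxn, EndTxn, TxnOffsetCommit) are the APIs that confluent-kafka's transactional producer drives under the hood. A hedged sketch against a local Chronik node (topic and transactional.id are illustrative):

# Hedged example: transactional produce using confluent-kafka.
from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'transactional.id': 'example-txn-producer'
})
producer.init_transactions()
producer.begin_transaction()
producer.produce('txn-topic', key=b'k', value=b'committed atomically')
producer.commit_transaction()    # or producer.abort_transaction()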

Tested Clients

  • βœ… kafka-python - Python client (full compatibility)
  • βœ… confluent-kafka - High-performance C-based client
  • βœ… KSQLDB - Full support including transactional operations
  • βœ… Apache Flink - Stream processing integration

πŸ› οΈ Development

Prerequisites

  • Rust 1.75+
  • Docker & Docker Compose (for testing)
  • Python 3.8+ with kafka-python (for client testing)

Building

# Build all components
cargo build --release

# Run tests (unit and bin tests only)
cargo test --workspace --lib --bins

# Run integration tests (requires setup)
cargo test --test integration

# Run benchmarks
cargo bench

Project Structure

chronik-stream/
β”œβ”€β”€ crates/
β”‚   β”œβ”€β”€ chronik-server/      # Main server binary (unified)
β”‚   β”œβ”€β”€ chronik-protocol/    # Kafka wire protocol implementation
β”‚   β”œβ”€β”€ chronik-storage/     # Storage abstraction layer
β”‚   β”œβ”€β”€ chronik-search/      # Search engine integration (Tantivy)
β”‚   β”œβ”€β”€ chronik-query/       # Query processing
β”‚   β”œβ”€β”€ chronik-common/      # Shared utilities
β”‚   β”œβ”€β”€ chronik-auth/        # Authentication & authorization
β”‚   β”œβ”€β”€ chronik-monitoring/  # Metrics & observability
β”‚   β”œβ”€β”€ chronik-config/      # Configuration management
β”‚   β”œβ”€β”€ chronik-backup/      # Backup functionality
β”‚   β”œβ”€β”€ chronik-bench/       # Performance benchmarking tool
β”‚   β”œβ”€β”€ chronik-wal/         # Write-Ahead Log & metadata store
β”‚   β”œβ”€β”€ chronik-raft/        # Raft consensus implementation
β”‚   └── chronik-raft-bridge/ # Raft integration bridge
β”œβ”€β”€ tests/                   # Integration tests
β”œβ”€β”€ Dockerfile              # Multi-arch Docker build
β”œβ”€β”€ docker-compose.yml      # Local development setup
└── .github/workflows/      # CI/CD pipelines

⚑ Performance

Chronik Stream delivers exceptional performance across all deployment modes (concurrency 128, 256-byte messages):

Benchmarks

| Mode | Configuration | Throughput | p99 Latency |
|---|---|---|---|
| Standalone | acks=1 | 309K msg/s | 0.59ms |
| Standalone | acks=all | 348K msg/s | 0.56ms |
| Cluster (3 nodes) | acks=1 | 188K msg/s | 2.81ms |
| Cluster (3 nodes) | acks=all | 166K msg/s | 1.80ms |

Searchable Topics Impact

| Configuration | Non-Searchable | Searchable | Overhead |
|---|---|---|---|
| Standalone | 198K msg/s | 192K msg/s | 3% |
| Cluster (3 nodes) | 183K msg/s | 123K msg/s | 33% |

Key Performance Features

  • High Throughput: Up to 348K messages/second standalone, 188K cluster
  • Low Latency: Sub-millisecond p99 latency standalone, sub-3ms cluster
  • Efficient Memory: Zero-copy networking with minimal allocations
  • Recovery: 100% message recovery with zero duplicates
  • Search: Only 3% overhead for real-time Tantivy indexing (standalone)
  • Compression: Snappy, LZ4, Zstd for efficient storage

WAL Performance

  • Write Throughput: 300K+ msgs/sec with GroupCommitWal
  • Recovery Speed: Full recovery in seconds even for large datasets
  • Zero Data Loss: All acks modes (0, 1, -1) guaranteed durable
  • Group Commit: PostgreSQL-style batched fsync reduces I/O overhead
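
For intuition on why group commit helps, the core idea is that many concurrent appends share a single fsync. A deliberately simplified, hypothetical Python sketch of the pattern (Chronik's actual implementation is the Rust GroupCommitWal with per-partition background workers):

# Hypothetical sketch of group commit: writers enqueue records and wait while a
# single background worker appends the whole batch and fsyncs once per batch.
import os, queue, threading

class GroupCommitLog:
    def __init__(self, path):
        self.f = open(path, 'ab')
        self.q = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def append(self, record: bytes):
        done = threading.Event()
        self.q.put((record, done))
        done.wait()                    # returns only after the batch is fsynced

    def _worker(self):
        while True:
            batch = [self.q.get()]     # block for the first record...
            while not self.q.empty():  # ...then drain whatever else arrived
                batch.append(self.q.get_nowait())
            for record, _ in batch:
                self.f.write(record)
            self.f.flush()
            os.fsync(self.f.fileno())  # one fsync amortized across the batch
            for _, done in batch:
                done.set()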

See BASELINE_PERFORMANCE.md for detailed benchmark methodology and results.

πŸ”’ Security

SASL Authentication

Chronik Stream supports SASL authentication with the following mechanisms:

  • PLAIN - Username/password authentication
  • SCRAM-SHA-256 - Challenge-response authentication
  • SCRAM-SHA-512 - Challenge-response authentication (stronger)

Configuration:

# Production: Configure custom users
CHRONIK_SASL_USERS='myuser:securepass123,kafka-app:app-secret' ./chronik-server start

# Production: Disable all default users
CHRONIK_SASL_NO_DEFAULTS=1 ./chronik-server start

# Development only: Default test users (NOT for production!)
# - admin/admin123, user/user123, kafka/kafka-secret
# These are used if no CHRONIK_SASL_* env vars are set

# Python example with SASL/PLAIN
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    security_protocol='SASL_PLAINTEXT',
    sasl_mechanism='PLAIN',
    sasl_plain_username='myuser',
    sasl_plain_password='securepass123'
)

Additional Security Features

  • TLS/SSL: End-to-end encryption (infrastructure in chronik-auth crate)
  • ACLs: Topic and consumer group access control framework
  • Admin API: Secured with API key authentication (cluster management)

⚠️ Security Note: Default test users are for development only. Always set CHRONIK_SASL_USERS or CHRONIK_SASL_NO_DEFAULTS=1 in production.

πŸ“Š Monitoring

Prometheus Metrics

# Expose metrics endpoint
chronik --metrics-port 9093

# Key metrics:
- chronik_messages_received_total
- chronik_messages_stored_total
- chronik_produce_latency_seconds
- chronik_fetch_latency_seconds
- chronik_storage_usage_bytes
- chronik_active_connections
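
A quick check that metrics are being exported (hedged: this assumes the endpoint follows the usual Prometheus /metrics path convention on the configured metrics port):

# Hedged check: print the chronik_* series exposed on the metrics port.
import urllib.request

body = urllib.request.urlopen("http://localhost:9093/metrics").read().decode()
for line in body.splitlines():
    if line.startswith("chronik_"):
        print(line)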

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

πŸ“„ License

Apache License 2.0. See LICENSE for details.

πŸ“š Documentation

Getting Started

v2.2.8 Release (Critical Fixes)

Operations & Performance

Cluster Management (v2.2.0+)

Development
