@joe-clickhouse
Contributor

@joe-clickhouse joe-clickhouse commented Jan 15, 2026

Summary

This PR adds a native async HTTP client based on aiohttp (AiohttpAsyncClient in clickhouse_connect/driver/aiohttp_client.py) and wires it into the async client factory path. It replaces the default executor-wrapped sync client with true async I/O while keeping the public async API consistent with the existing AsyncClient surface. The legacy executor-wrapped path remains available (and is now explicitly deprecated) when passing a sync client.

Why this change

The previous async client was a thin wrapper around sync operations executed in a thread pool, which:

  • added thread overhead and context switching
  • limited async I/O benefits
  • complicated resource/session management

The new implementation performs HTTP I/O natively with aiohttp while preserving the established client API and behavior.

Key behavior and design points

Native async I/O with aiohttp

Requests use aiohttp.ClientSession with a configurable TCPConnector (pool limits, keepalive). HTTP response handling is fully async.

Streaming bridge for Native format

Native format parsing/serialization is still synchronous CPU-bound work. The client uses a bounded queue as a sync/async bridge so async network reads/writes can overlap with sync parsing/serialization in an executor.

On the async query path (StreamingResponseSource), the async producer reads from the aiohttp response and the sync consumer parses in an executor.

On the async insert path (StreamingInsertSource), the sync producer serializes in an executor and the async consumer streams to aiohttp.
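For reviewers who want a concrete picture of the bridge, here is a minimal sketch of the query-path direction using only the standard library. It is not the code in this PR: `fake_network_chunks`, `parse_chunks`, `bridge`, and the `_DONE` sentinel are hypothetical names, and a plain `queue.Queue` stands in for the actual bridge class.

```python
import asyncio
import queue

_DONE = object()  # hypothetical sentinel marking the end of the stream


async def fake_network_chunks():
    """Stand-in for chunks read asynchronously from an HTTP response body."""
    for i in range(5):
        await asyncio.sleep(0)  # yield to the event loop, as a real read would
        yield f"chunk-{i}".encode()


def parse_chunks(q):
    """Sync consumer: runs in an executor thread and blocks on the queue."""
    parsed = []
    while True:
        item = q.get()
        if item is _DONE:
            return parsed
        parsed.append(item.decode().upper())  # placeholder for real parsing


async def bridge():
    loop = asyncio.get_running_loop()
    q = queue.Queue(maxsize=2)  # bounded, so a slow parser applies backpressure
    consumer = loop.run_in_executor(None, parse_chunks, q)
    async for chunk in fake_network_chunks():
        # q.put blocks when the queue is full, so it runs off the event loop.
        await loop.run_in_executor(None, q.put, chunk)
    await loop.run_in_executor(None, q.put, _DONE)
    return await consumer
```

The insert path is the mirror image: the executor thread produces serialized blocks into the queue and the coroutine drains them into the request body.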

Preventing event loop blocking

The client uses two complementary strategies to prevent users from accidentally blocking the event loop.

For non-streaming queries (.query(), .query_df(), etc.) results are fully materialized inside the executor before returning to the event loop. By the time a QueryResult is returned, all data is already in memory, so synchronous iteration is safe and won't cause deadlocks.
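The materialization strategy can be illustrated roughly as follows. This is a sketch under assumptions, not the client's code: `lazy_rows`, `materialize`, and `query` are hypothetical names, and a generator stands in for lazy block parsing.

```python
import asyncio


def lazy_rows():
    """Hypothetical stand-in for sync parsing that yields rows lazily."""
    for i in range(3):
        yield (i, i * i)


def materialize():
    # Drain the lazy source entirely inside the worker thread.
    return list(lazy_rows())


async def query():
    loop = asyncio.get_running_loop()
    # Only a fully built list crosses back to the event loop.
    return await loop.run_in_executor(None, materialize)


async def main():
    rows = await query()
    # Plain synchronous iteration is safe here: nothing is left to fetch.
    return [square for _, square in rows]
```

Because the list is complete before `query()` returns, iterating it never waits on the network or on the executor.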

For streaming queries (.query_rows_stream(), .query_df_stream(), etc.) the AsyncSyncQueue bridge actively detects deadlock attempts. If you try to synchronously iterate a stream (for row in stream) from within an async def function, it raises a ProgrammingError immediately, prompting you to use async for instead.
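One way such a guard could work is sketched below. The detection mechanism (checking for a running event loop in the current thread) is an assumption on my part for illustration, `GuardedStream` is a hypothetical class, and `ProgrammingError` is redefined locally rather than imported from the driver.

```python
import asyncio


class ProgrammingError(Exception):
    """Local stand-in for the driver's exception type."""


class GuardedStream:
    """Hypothetical stream that rejects sync iteration on the event loop thread."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop is running in this thread, so blocking is harmless.
            return iter(self._rows)
        raise ProgrammingError(
            "synchronous iteration inside a running event loop; use 'async for'"
        )

    def __aiter__(self):
        return self._agen()

    async def _agen(self):
        for row in self._rows:
            await asyncio.sleep(0)  # stand-in for awaiting the next block
            yield row
```

With this sketch, `for row in stream` inside a coroutine fails fast, `async for row in stream` works, and plain iteration from a thread without a running loop still succeeds.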

Backward compatibility

AsyncClient(client=...) still wraps the sync client in an executor, but emits a deprecation warning. The recommended path is get_async_client(...), which now creates the aiohttp-based client.

Tests

Integration tests using param_client now exercise both sync and async clients, and new async-specific tests validate native async behaviors (concurrency, streaming cleanup, session protection, timeouts, etc.).

Migration and compatibility notes

  • Recommended: async_client = await clickhouse_connect.get_async_client(...)
  • Legacy (still supported, emits a deprecation warning): AsyncClient(client=sync_client)
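A minimal usage sketch of the recommended path, assuming a ClickHouse server reachable at `localhost` with default credentials (the host value and the query are placeholders, not part of this PR):

```python
import asyncio


async def main():
    # Imported lazily so the module loads even where the driver is not installed.
    import clickhouse_connect

    # Assumption: a server is listening on localhost with default settings.
    client = await clickhouse_connect.get_async_client(host="localhost")
    result = await client.query("SELECT 1")
    print(result.result_rows)
```

Run it with `asyncio.run(main())`. Client cleanup is left out of the sketch.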

Notable trade-offs

  • Non-streaming queries are fully materialized in the executor to keep the event loop safe.
  • Parsing/serialization still runs in a thread pool. The async benefit is in I/O and concurrency.

Performance notes

A preliminary benchmark comparing the executor-based async client (as it exists in clickhouse-connect v0.10.0) against the new async-native client was performed. The setup was as follows:

  • Server: ClickHouse Cloud (25.10.1.7186), 1–4 vCPUs (burstable), 8 GiB RAM, node type r5ad‑2xlarge
  • Client: Apple M4 Max (14 cores), 36 GB RAM

The observed speedups of the new async client over the executor-based client ranged from 2% to 95%, with an average of around 40%, depending on the workload. P95 latencies showed marked improvement as well. A detailed design/benchmark blog post is planned, and a link will be provided once it is published.

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

@joe-clickhouse joe-clickhouse linked an issue Jan 15, 2026 that may be closed by this pull request
@joe-clickhouse joe-clickhouse changed the title from "Joe/141 a database client should be based on asyncio" to "Implement native async client" Jan 16, 2026
@joe-clickhouse joe-clickhouse marked this pull request as ready for review January 21, 2026 00:40
@joe-clickhouse joe-clickhouse requested a review from genzgd January 21, 2026 00:42

Development

Successfully merging this pull request may close these issues.

A database client should be based on asyncio

2 participants