
Building a Multi-Engine Iceberg Lakehouse with QueryFlux

How to use QueryFlux as the SQL routing layer in front of Trino, DuckDB, and StarRocks — giving every client one endpoint while each query lands on the right Iceberg engine automatically.

David W

Iceberg · 11 min read

The Iceberg lakehouse pitch is compelling: store data once in open Parquet files and let every engine query it. But in practice, running multiple engines creates an operational headache. BI tools need a MySQL endpoint pointing at StarRocks. Data engineers prefer Trino over JDBC. Analysts want sub-second lookups that DuckDB handles best. Each engine has its own connection string, its own SQL dialect, and its own concurrency limits. Clients end up with N connection configurations, and nobody has a unified view of what is running where.

QueryFlux solves this. It is a Rust-based SQL proxy that sits between your clients and your Iceberg engines. Clients connect once — over Trino HTTP, PostgreSQL wire, MySQL wire, or Arrow Flight SQL — and QueryFlux routes each query to the right backend based on rules you define in a single YAML config. It translates SQL dialects automatically via sqlglot, enforces per-engine concurrency limits, and exposes unified Prometheus metrics. The measured p50 proxy overhead is approximately 0.35 ms, so it is not on the critical path for analytical queries.

This article walks through building a production multi-engine Iceberg lakehouse with QueryFlux. We will set up the full stack with Docker Compose, configure routing rules that dispatch queries by protocol, headers, SQL patterns, and custom Python logic, wire up dialect translation, and add SLA protection with concurrency caps and fallback routing.

Architecture Overview

The architecture has three layers. At the bottom, your Iceberg tables sit on S3 (or GCS, or HDFS), managed by a catalog like Lakekeeper, Polaris, or AWS Glue. In the middle, multiple engines — Trino for complex joins, StarRocks for low-latency dashboards, DuckDB for lightweight ad-hoc exploration — all read from the same Iceberg catalog. On top, QueryFlux provides one endpoint that every client connects to. BI tools, Python scripts, dbt, Airflow operators, and ad-hoc SQL clients all point at QueryFlux. The routing layer decides which engine handles each query.

The key insight is that QueryFlux separates where a query should go logically (a cluster group) from which physical instance serves it (a cluster). Routers pick the group; the cluster manager picks the instance within the group using strategies like round-robin, least-loaded, failover, or weighted balancing.
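To make the group-versus-cluster split concrete, here is a minimal Python sketch (an illustration of the concept, not QueryFlux's actual implementation) of a cluster manager that applies the least-loaded strategy within a group:

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    running: int = 0  # queries currently executing on this instance

@dataclass
class ClusterGroup:
    name: str
    members: list[Cluster] = field(default_factory=list)

    def pick_least_loaded(self) -> Cluster:
        # Routers decide the group; the cluster manager decides the
        # instance inside it -- here, the fewest running queries wins.
        return min(self.members, key=lambda c: c.running)

group = ClusterGroup("grp-trino", [Cluster("trino-1", running=7),
                                   Cluster("trino-2", running=3)])
print(group.pick_least_loaded().name)  # trino-2
```

Swapping the strategy (round-robin, failover, weighted) only changes the selection inside the group; the routing rules that picked the group stay untouched.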

Setting Up the Full Stack

QueryFlux provides a full-stack Docker Compose example with Trino, StarRocks, Lakekeeper (Iceberg REST catalog), MinIO (S3-compatible storage), PostgreSQL (for QueryFlux state), and QueryFlux Studio (web UI). Start the entire stack with three commands.

```bash
git clone https://github.com/lakeops-org/queryflux.git
cd queryflux/examples/full-stack
docker compose up -d --wait

# Load TPC-H sample data into Iceberg tables via Trino
docker compose --profile loader run --rm -T data-loader

# Register the Iceberg catalog in StarRocks so both engines see the same tables
docker compose --profile loader run --rm -T starrocks-catalog-setup
```

Once the stack is up, you have a working multi-engine lakehouse. Trino and StarRocks both query the same Iceberg tables stored in MinIO, managed by Lakekeeper. QueryFlux listens on port 8080 (Trino HTTP) and port 3306 (MySQL wire). Studio is at localhost:3000 for visual cluster management, routing configuration, and query history.

The QueryFlux Configuration

Everything in QueryFlux is driven by a single YAML config file. It defines frontends (which protocols to listen on), clusters (individual engine instances), cluster groups (pools of clusters with concurrency limits), routers (rules that pick a group for each query), and a fallback group for unmatched queries. The full reference is in the QueryFlux configuration docs. Here is a production-grade config for a three-engine Iceberg setup.

```yaml
queryflux:
  externalAddress: http://localhost:8080
  frontends:
    trinoHttp:
      enabled: true
      port: 8080
    mysqlWire:
      enabled: true
      port: 3306
    postgresWire:
      enabled: true
      port: 5432
  persistence:
    type: postgres
    url: postgres://queryflux:queryflux@postgres:5432/queryflux
  adminApi:
    port: 9000

clusters:
  trino-coord:
    engine: trino
    enabled: true
    endpoint: http://trino:8080
  starrocks-fe:
    engine: starRocks
    enabled: true
    endpoint: http://starrocks-fe:8030
    auth:
      type: basic
      username: root
      password: ""
  duckdb-embedded:
    engine: duckDb
    enabled: true

clusterGroups:
  grp-trino:
    enabled: true
    maxRunningQueries: 100
    members: [trino-coord]
    strategy:
      type: leastLoaded
  grp-starrocks:
    enabled: true
    maxRunningQueries: 50
    maxQueuedQueries: 200
    members: [starrocks-fe]
  grp-duckdb:
    enabled: true
    maxRunningQueries: 8
    members: [duckdb-embedded]

routers:
  - type: protocolBased
    mysqlWire: grp-starrocks
    postgresWire: grp-trino
  - type: header
    headerName: x-target-engine
    headerValueToGroup:
      duckdb: grp-duckdb
      starrocks: grp-starrocks
      trino: grp-trino
  - type: queryRegex
    rules:
      - regex: "(?i)^SELECT.*LIMIT\\s+\\d+\\s*$"
        targetGroup: grp-duckdb
  - type: tags
    rules:
      - tags:
          workload: etl
        targetGroup: grp-trino
      - tags:
          workload: dashboard
        targetGroup: grp-starrocks

routingFallback: grp-trino

translation:
  errorOnUnsupported: false
```

This config does several things at once. BI tools connecting over MySQL wire (port 3306) land on StarRocks automatically — they do not need to know StarRocks exists. PostgreSQL wire clients land on Trino. Any client can override routing with an x-target-engine header. Queries matching SELECT ... LIMIT N patterns (typically ad-hoc exploration) get routed to DuckDB for sub-second response. Tagged workloads (etl, dashboard) land on the right engine regardless of protocol.
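One way to picture the router chain is as a first-match list of functions, each returning a group name or None to pass the decision along. This sketch mirrors the config above (the function names and context shape are illustrative assumptions, not QueryFlux internals):

```python
import re

# Each router returns a cluster group name, or None to defer to the next one.
def protocol_router(query: str, ctx: dict):
    return {"mysqlWire": "grp-starrocks", "postgresWire": "grp-trino"}.get(ctx["protocol"])

def header_router(query: str, ctx: dict):
    target = ctx.get("headers", {}).get("x-target-engine")
    return {"duckdb": "grp-duckdb", "starrocks": "grp-starrocks",
            "trino": "grp-trino"}.get(target)

def regex_router(query: str, ctx: dict):
    if re.match(r"(?i)^SELECT.*LIMIT\s+\d+\s*$", query.strip()):
        return "grp-duckdb"
    return None

CHAIN = [protocol_router, header_router, regex_router]
FALLBACK = "grp-trino"

def route(query: str, ctx: dict) -> str:
    for router in CHAIN:
        group = router(query, ctx)
        if group is not None:
            return group  # first match wins
    return FALLBACK

# A Trino HTTP client sends an ad-hoc exploration query: no protocol or
# header match, so the regex rule sends it to DuckDB.
print(route("SELECT * FROM events LIMIT 5", {"protocol": "trinoHttp", "headers": {}}))
```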

Protocol-Based Routing

The simplest routing rule, and often the highest-leverage one, is protocol-based. In a multi-engine setup, different client types naturally use different protocols. BI tools like Tableau and Grafana typically connect over MySQL wire. Data engineering tools use Trino HTTP or PostgreSQL wire. By mapping protocols to engine groups, you achieve workload isolation without any client-side configuration.

The protocolBased router runs first in the chain. If a MySQL wire connection arrives, it goes to StarRocks. If a Trino HTTP request arrives, the chain continues to the next router (header, regex, tags) because we did not map trinoHttp in the protocol router — we want more granular control over Trino HTTP traffic.

Header and Tag-Based Routing

For Trino HTTP clients that need explicit engine selection, the header router inspects a custom HTTP header. This is useful for orchestration tools like Airflow or Dagster that can set headers per task.

```python
# Airflow operator that routes to StarRocks via QueryFlux
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

refresh_dashboard_aggs = SQLExecuteQueryOperator(
    task_id="refresh_dashboard_aggs",
    conn_id="queryflux_trino",  # points to QueryFlux:8080
    sql="""
        INSERT OVERWRITE analytics.dashboard_metrics
        SELECT
            date_trunc('hour', event_ts) AS hour,
            event_type,
            COUNT(*) AS event_count,
            COUNT(DISTINCT user_id) AS unique_users
        FROM iceberg.analytics.events
        WHERE event_ts >= DATE_ADD('day', -1, CURRENT_DATE)
        GROUP BY 1, 2
    """,
    # This header tells QueryFlux to route to StarRocks
    headers={"x-target-engine": "starrocks"},
)
```

For session-level routing that works across protocols, use the tags router. Trino clients attach tags via X-Trino-Client-Tags; MySQL and PostgreSQL wire clients use session variables. The tags router matches key-value pairs with AND logic within each rule — the first rule where all tags match wins.

Cost-Aware Routing with Python

The most powerful routing option is the Python script router. You write a route(query, ctx) function that inspects the SQL text, the protocol, headers, and user context, and returns a cluster group name or None to pass to the next router. The full context dictionary shape is documented in the routing and clusters architecture guide. This lets you encode cost-optimization logic directly in the routing layer.

Cloud engines charge differently. Trino bills by cluster uptime (compute-priced). Athena bills by bytes scanned (scan-priced). StarRocks bills by instance hours. A Python router can inspect query shape and steer each query to the cheapest engine that can serve it well.

```yaml
routers:
  - type: protocolBased
    mysqlWire: grp-starrocks
  - type: pythonScript
    script: |
      import re

      HEAVY_PATTERNS = [
          re.compile(r'(?i)\bJOIN\b.*\bJOIN\b'),   # multi-join
          re.compile(r'(?i)\bWINDOW\b|\bOVER\s*\('),  # window functions
          re.compile(r'(?i)\bGROUP BY\b.*,.*,'),   # multi-column aggregation
      ]
      LIGHT_PATTERN = re.compile(r'(?i)^SELECT\b.{0,200}\bLIMIT\s+\d+\s*$')

      def route(query: str, ctx: dict) -> str | None:
          sql = query.strip()

          # Lightweight exploration queries go to DuckDB
          if LIGHT_PATTERN.match(sql):
              return "grp-duckdb"

          # CPU-heavy analytical queries go to Trino
          for pattern in HEAVY_PATTERNS:
              if pattern.search(sql):
                  return "grp-trino"

          # Dashboard-style queries (simple aggregations) go to StarRocks
          return "grp-starrocks"
```
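The pricing comparison driving this router comes down to simple arithmetic. Here is a worked sketch with hypothetical rates (the dollar figures are illustrative assumptions, not quotes from any vendor):

```python
# Hypothetical price points -- real rates depend on your cloud contracts.
SCAN_PRICE_PER_TB = 5.00       # scan-priced engine (Athena-style billing)
COMPUTE_PRICE_PER_HOUR = 8.00  # compute-priced engine (cluster-uptime billing)

def scan_cost(bytes_scanned: float) -> float:
    """Cost on a scan-priced engine: dollars per terabyte read."""
    return bytes_scanned / 1e12 * SCAN_PRICE_PER_TB

def compute_cost(runtime_seconds: float) -> float:
    """Cost on a compute-priced engine: dollars per cluster-hour consumed."""
    return runtime_seconds / 3600 * COMPUTE_PRICE_PER_HOUR

# A query that scans 200 GB but occupies the cluster for 30 minutes:
# flat $1.00 on the scan-priced engine, $4.00 of cluster time otherwise.
print(f"scan-priced:    ${scan_cost(200e9):.2f}")     # $1.00
print(f"compute-priced: ${compute_cost(1800):.2f}")   # $4.00
```

Flip the shape (a tiny scan that runs in seconds on an always-on cluster) and the comparison inverts, which is exactly the mismatch a workload-aware router exploits.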

In QueryFlux benchmarking, this kind of workload-aware routing reduced total query costs by up to 56 percent compared with sending every query to a single default engine. The cost of an individual query sometimes dropped by up to 90 percent when it moved from a compute-priced engine to a scan-priced one (or vice versa) whose pricing model better matched the workload shape.

Dialect Translation

When a client speaks Trino SQL but the routed backend is StarRocks (MySQL dialect), the SQL must be rewritten. QueryFlux handles this automatically via an embedded sqlglot translator. It detects the client dialect from the frontend protocol and the target dialect from the engine type, and transpiles on the fly. For compatible dialects (e.g., MySQL wire to StarRocks), translation is skipped entirely.

For edge cases where sqlglot does not cover a specific function or syntax, you can add Python transform scripts that modify the AST after translation.

```yaml
translation:
  errorOnUnsupported: false
  pythonScripts:
    - |
      import sqlglot.expressions as exp

      def transform(ast, src, dst) -> None:
          # Strip catalog prefix for engines that do not support 3-part names
          if dst in ("starrocks", "duckdb"):
              for table in ast.find_all(exp.Table):
                  if table.args.get("catalog"):
                      table.set("catalog", None)
```

SLA Protection with Concurrency Caps

Without concurrency limits, a batch ETL job can saturate your StarRocks cluster and make every dashboard unresponsive. QueryFlux enforces per-group concurrency caps (maxRunningQueries) and optional queue depths (maxQueuedQueries). When a group is full, new queries queue at the proxy instead of hammering the backend. If the queue is also full, the query is rejected immediately with a clear error.
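The run/queue/reject decision is simple state machine logic. This sketch models the admission behavior described above (a conceptual model, not QueryFlux's internal code):

```python
class GroupAdmission:
    """Per-group admission control: run, queue, or reject a new query."""

    def __init__(self, max_running: int, max_queued: int):
        self.max_running = max_running  # maxRunningQueries
        self.max_queued = max_queued    # maxQueuedQueries
        self.running = 0
        self.queued = 0

    def admit(self) -> str:
        if self.running < self.max_running:
            self.running += 1
            return "run"            # backend has capacity: dispatch now
        if self.queued < self.max_queued:
            self.queued += 1
            return "queue"          # hold at the proxy, not at the backend
        return "reject"             # both full: fail fast with a clear error

grp = GroupAdmission(max_running=2, max_queued=1)
print([grp.admit() for _ in range(4)])  # ['run', 'run', 'queue', 'reject']
```

The point of queuing at the proxy is that the backend only ever sees `max_running` concurrent queries, no matter how bursty the clients are.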

Queue depth is exposed as a first-class Prometheus metric, so you can set a Grafana alert on it and know within seconds when a group is under pressure. You can also define fallback routing: if the StarRocks group is saturated, overflow queries spill to Trino automatically.

```yaml
clusterGroups:
  grp-starrocks:
    enabled: true
    maxRunningQueries: 50
    maxQueuedQueries: 200
    members: [starrocks-fe]
    strategy:
      type: leastLoaded
```

Transparent Engine Migration with Weighted Balancing

When migrating workloads between engines — say, moving dashboard queries from Trino to StarRocks — you want a gradual rollout, not a flag day. QueryFlux supports weighted load balancing across a cluster group containing multiple engines. Start StarRocks at 10 percent, monitor latency in Studio's query history, and ramp to 100 percent with zero client changes.

```yaml
clusters:
  trino-coord:
    engine: trino
    enabled: true
    endpoint: http://trino:8080
  starrocks-fe:
    engine: starRocks
    enabled: true
    endpoint: http://starrocks-fe:8030
    auth:
      type: basic
      username: root
      password: ""

clusterGroups:
  grp-dashboard:
    enabled: true
    maxRunningQueries: 80
    members: [trino-coord, starrocks-fe]
    strategy:
      type: weighted
      weights:
        trino-coord: 20
        starrocks-fe: 80
```
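One classic way to implement a weighted strategy like the one above is smooth weighted round-robin (the nginx-style algorithm; whether QueryFlux uses this exact algorithm internally is an assumption, but the observable 20/80 split is the same):

```python
from collections import Counter

def smooth_weighted_rr(weights: dict[str, int], n: int) -> list[str]:
    # Smooth weighted round-robin: on each pick every member gains its
    # weight, the member with the highest credit is chosen, and the
    # winner pays back the weight total. This interleaves picks evenly
    # instead of sending long bursts to the heavy member.
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        for name, w in weights.items():
            current[name] += w
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks

# A 20/80 migration split: 2 of every 10 queries still land on Trino.
picks = smooth_weighted_rr({"trino-coord": 20, "starrocks-fe": 80}, 10)
print(Counter(picks))  # Counter({'starrocks-fe': 8, 'trino-coord': 2})
```

Ramping the rollout is just a config change to the weights; no client ever sees a different endpoint.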

Observability Across All Engines

QueryFlux exposes Prometheus metrics on the admin API port (default 9000) covering query count, latency histograms, active queries, queue depth, and error rate — broken down by cluster group and engine. The project includes pre-built Grafana dashboards. This gives you one pane of glass across Trino, StarRocks, and DuckDB instead of three separate monitoring stacks.

QueryFlux Studio (the bundled web UI at port 3000) shows live cluster status, routing rule configuration, and a query history page that records every query with its routing decision, execution time, and target engine. This is invaluable for debugging routing rules and comparing engine performance side-by-side during migrations.

Smoke-Testing the Setup

Once the stack is running, verify that queries route correctly with a simple curl against the Trino HTTP frontend.

```bash
# This query hits QueryFlux on port 8080 (Trino HTTP)
# The router chain decides which engine handles it
curl -s -X POST http://localhost:8080/v1/statement \
  -H "X-Trino-User: analyst" \
  -d "SELECT event_type, COUNT(*) AS cnt FROM iceberg.analytics.events GROUP BY 1 ORDER BY cnt DESC LIMIT 10"

# Force routing to DuckDB via header override
curl -s -X POST http://localhost:8080/v1/statement \
  -H "X-Trino-User: analyst" \
  -H "x-target-engine: duckdb" \
  -d "SELECT * FROM iceberg.analytics.events LIMIT 5"

# Connect via MySQL wire to hit StarRocks automatically
mysql -h 127.0.0.1 -P 3306 -u root \
  -e "SELECT COUNT(*) FROM analytics.events WHERE event_ts >= '2026-04-01'"
```

Check the QueryFlux logs and Studio query history to confirm each query landed on the expected engine. The routing trace records which router matched and why, making it straightforward to debug unexpected routing decisions.

When to Use This Architecture

A multi-engine lakehouse with QueryFlux makes sense when your team runs at least two analytical engines against the same Iceberg tables and you are tired of managing separate connection strings, monitoring dashboards, and driver configurations for each one. The sweet spot is a platform team that serves multiple internal consumers — BI analysts who need fast dashboards, data engineers running complex transformations, and ML engineers doing ad-hoc exploration.

If you are running a single engine today, you do not need QueryFlux yet. But the moment you add a second engine — for cost optimization, latency requirements, or migration — QueryFlux eliminates the N-by-M integration problem and gives you a controlled, observable routing layer from day one. The getting started guide walks through the full Docker Compose setup in under five minutes.

The architecture we built in this article — Trino for heavy analytics, StarRocks for dashboards, DuckDB for exploration, all reading from the same Iceberg catalog through QueryFlux — handles the full spectrum of analytical workloads. Clients connect once, routing rules encode your cost and latency intent, dialect translation handles SQL differences transparently, and concurrency caps protect backend SLAs. The operational surface area is one YAML file and one Prometheus endpoint instead of three separate engine configurations.

Tags

Iceberg, queryflux, apache iceberg, multi-engine, lakehouse, trino, duckdb, starrocks, sql proxy
