
Apache Iceberg v3 is the most significant update to the table format since v2 introduced row-level deletes in 2020. Ratified in 2025 and already supported in preview by Snowflake, Databricks, and Amazon EMR, v3 adds eight headline features that address long-standing gaps in semi-structured data handling, delete performance, change tracking, geospatial workloads, and security. This article walks through each feature, explains why it matters, and shows how to use it in practice.
Before we get into the details, one reassuring fact: v3 is fully backward-compatible with v2 data files. Upgrading a table's format version does not require rewriting any existing data. The metadata layer gains new capabilities, but every Parquet and Avro file written under v2 remains valid. You can upgrade incrementally, one table at a time, as your query engines add support.
The Variant Type
The biggest headline feature in v3 is the Variant type. If you have ever stored JSON blobs as STRING columns because your schema was too unpredictable, Variant is the solution you have been waiting for. It stores semi-structured data — JSON objects, arrays, nested structures — in a compact binary encoding directly inside Parquet files.
Why is this better than a STRING column? Three reasons. First, query engines can push predicates into Variant columns without deserializing the entire JSON string at read time, which makes analytical queries significantly faster. Second, the binary encoding is more compact than raw JSON text, reducing storage costs. Third, Variant supports schema evolution naturally — you can add new fields to your JSON payloads without altering the Iceberg schema at all.
Here is how to create a table with a Variant column in Spark SQL:
```sql
-- Requires format-version 3
CREATE TABLE catalog.events (
  event_id BIGINT,
  event_type STRING,
  event_ts TIMESTAMP,
  payload VARIANT
)
USING iceberg
TBLPROPERTIES ('format-version' = '3');

-- Insert JSON data directly into the Variant column
INSERT INTO catalog.events VALUES (
  1001,
  'page_view',
  TIMESTAMP '2026-04-14 09:30:00',
  PARSE_JSON('{"url": "/pricing", "referrer": "google.com", "duration_ms": 4200}')
);
```

In Parquet files, each Variant value is stored as two binary fields — a metadata field containing type information and a value field containing the encoded data. This two-part structure lets engines decode only the fields they need during a query, skipping irrelevant nested structures entirely.
For teams migrating from v2, the path is straightforward: upgrade the table to v3, add a new Variant column, and start writing semi-structured payloads to it. Existing STRING columns with JSON data can be migrated in-place by casting.
```sql
-- Upgrade an existing table to v3
ALTER TABLE catalog.events
SET TBLPROPERTIES ('format-version' = '3');

-- Add a Variant column alongside the old STRING column
ALTER TABLE catalog.events ADD COLUMNS (
  payload_v VARIANT
);

-- Backfill Variant from existing JSON strings
MERGE INTO catalog.events t
USING (SELECT event_id, PARSE_JSON(payload_str) AS pv FROM catalog.events) s
ON t.event_id = s.event_id
WHEN MATCHED THEN UPDATE SET payload_v = s.pv;
```

Nanosecond Timestamps
Iceberg v2 supports microsecond-precision timestamps. That is sufficient for most analytics, but not for high-frequency trading, IoT sensor data, or distributed tracing where events happen within the same microsecond. v3 adds two new types: timestamp_ns (without timezone) and timestamptz_ns (with timezone). Both store values as 64-bit integers representing nanoseconds since the Unix epoch.
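A quick Python sketch shows why the extra precision matters: two events that land within the same microsecond are distinct as 64-bit nanosecond epoch values but collapse to the same value under v2-style microsecond truncation.

```python
from datetime import datetime, timezone

NS_PER_US = 1_000

def to_micros(ns_epoch: int) -> int:
    return ns_epoch // NS_PER_US            # v2-style truncation to microseconds

# Two events 210 ns apart, inside the same microsecond
base = int(datetime(2026, 4, 14, 9, 30, tzinfo=timezone.utc).timestamp()) * 1_000_000_000
event_a = base + 123_456_789
event_b = base + 123_456_999

assert event_a != event_b                        # distinct as timestamptz_ns
assert to_micros(event_a) == to_micros(event_b)  # identical once truncated
```

For ordered workloads like trade logs or trace spans, that collapsed ordering is exactly the bug v3's nanosecond types eliminate.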
The new types work with all existing partition transforms. You can partition by year, month, day, or hour on a nanosecond timestamp column the same way you would on a microsecond column. Iceberg handles the conversion internally.
```sql
CREATE TABLE catalog.sensor_readings (
  device_id STRING,
  reading_ns TIMESTAMPTZ_NS,
  temperature DOUBLE,
  pressure DOUBLE
)
USING iceberg
PARTITIONED BY (days(reading_ns))
TBLPROPERTIES ('format-version' = '3');
```

In Parquet files, nanosecond timestamps map to INT64 with NANOS precision. ORC files use the TIMESTAMP_NANO type. If your pipeline already writes nanosecond-precision data but truncates to microseconds when landing in Iceberg, upgrading to v3 removes that precision loss with no changes to your file format.
Geometry and Geography Types
v3 brings first-class geospatial support to Iceberg through two new primitive types: geometry and geography. The geometry type uses projected coordinate systems (like UTM) for flat-earth calculations. The geography type uses latitude and longitude on a spherical model for global analysis.
These types are not just metadata labels. Iceberg integrates them with partition transforms and column-level metrics. When you write geospatial data, Iceberg stores bounding boxes in the column metadata. Engines that understand spatial predicates — like DuckDB, Apache Sedona, and Wherobots — can use those bounding boxes to skip files that fall outside a query's spatial filter.
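The file-skipping mechanic is simple enough to sketch in a few lines of Python. Each data file advertises a bounding box in its column metrics; a spatial filter prunes any file whose box cannot intersect the query window. The file names and coordinates below are hypothetical, purely for illustration:

```python
# Prune data files whose bounding box cannot intersect the query window.
def intersects(a, b):
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

# Per-file (xmin, ymin, xmax, ymax) envelopes from column metadata
file_bounds = {
    "data/part-000.parquet": (-122.5, 37.2, -121.7, 38.0),  # Bay Area
    "data/part-001.parquet": (-74.3, 40.5, -73.6, 41.0),    # NYC
    "data/part-002.parquet": (2.2, 48.8, 2.5, 49.0),        # Paris
}

query_box = (-123.0, 37.0, -121.0, 38.5)   # "stores near San Francisco"
to_scan = [f for f, box in file_bounds.items() if intersects(box, query_box)]
print(to_scan)   # only the Bay Area file survives pruning
```

The real engines do this with the bounding boxes Iceberg stores in column-level metrics, which is why a spatially selective query can skip most of a large table without opening a single pruned file.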
```sql
CREATE TABLE catalog.store_locations (
  store_id BIGINT,
  name STRING,
  location GEOGRAPHY,
  opened_at DATE
)
USING iceberg
TBLPROPERTIES ('format-version' = '3');
```

The implementation aligns with the GeoParquet specification, so Parquet files written by Iceberg v3 are compatible with the broader GeoParquet ecosystem. This means you can query the same geospatial Iceberg table from Spark, Snowflake, DuckDB, or any GeoParquet-aware tool without converting data between formats.
Default Column Values
A seemingly simple feature with a big practical impact: columns in v3 tables can have default values defined in the schema metadata. When a writer creates a new data file and a column has a default value but the writer does not supply data for that column, the default is applied automatically.
This makes schema evolution instantaneous for additive changes. In v2, adding a new non-nullable column to a production table required a backfill step — you had to rewrite existing data files to include the new column. In v3, you set a default value and skip the rewrite entirely. Existing files return the default when the column is read, and new files include the actual value.
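The reader-side behavior can be sketched in miniature: files written before the column existed simply lack the key, and the default from table metadata fills it in at read time. A hedged illustration, not the actual reader code:

```python
# Defaults live in table metadata, not in data files. At read time the
# reader overlays stored values on top of the schema defaults.
schema_defaults = {"processing_version": 1}    # from the metadata JSON

def read_row(stored: dict, defaults: dict) -> dict:
    return {**defaults, **stored}              # stored values win over defaults

old_file_row = {"event_id": 1001}                           # pre-upgrade file
new_file_row = {"event_id": 1002, "processing_version": 2}  # post-upgrade file

assert read_row(old_file_row, schema_defaults)["processing_version"] == 1
assert read_row(new_file_row, schema_defaults)["processing_version"] == 2
```

Because the overlay happens at read time, no existing Parquet file is touched when the column is added — that is the entire trick.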
```sql
-- Add a column with a default value — no data rewrite needed
ALTER TABLE catalog.events ADD COLUMNS (
  processing_version INT DEFAULT 1
);

-- Existing rows return 1 for processing_version
-- New rows get the actual value at write time
SELECT event_id, processing_version
FROM catalog.events
LIMIT 5;
```

Default values are stored in the table's metadata JSON, not in the Parquet files. This keeps the feature backward-compatible — older readers that do not understand defaults simply see a null for the new column, which you can handle at the application layer.
Binary Deletion Vectors
In v2, row-level deletes generate positional delete files — small Parquet files that list the file path and row position of each deleted row. This works, but at scale the number of delete files explodes, especially in CDC pipelines where thousands of rows are deleted across hundreds of data files per commit. Each query must then merge these delete files at read time, which degrades scan performance.
v3 replaces positional delete files with deletion vectors. Instead of a separate Parquet file per delete set, each data file gets a compact Roaring bitmap that marks which row positions have been deleted. These bitmaps are stored inside Puffin files — a lightweight container format already used by Iceberg for statistics.
The performance difference is substantial. Roaring bitmaps compress extremely well for the kinds of position sets that occur in practice (sequential IDs, clustered deletes). Apache Iceberg's custom bitmap implementation benchmarks at roughly 2x faster than the standard library for mixed workloads and up to 7x faster for ordered position inserts.
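A deletion vector is conceptually just one bitmap per data file marking deleted row positions. The sketch below uses a plain Python integer as the bitset; the real implementation uses compressed Roaring bitmaps stored in a Puffin file, but the read-path logic is the same shape:

```python
# A deletion vector in miniature: one bitmap per data file marking
# deleted row positions. A Python int stands in for the Roaring bitmap.
class DeletionVector:
    def __init__(self):
        self.bits = 0

    def delete(self, pos: int):
        self.bits |= 1 << pos

    def is_deleted(self, pos: int) -> bool:
        return (self.bits >> pos) & 1 == 1

rows = ["a", "b", "c", "d", "e"]
dv = DeletionVector()
dv.delete(1)
dv.delete(3)

# The scan applies the single per-file bitmap instead of merging many
# positional delete files at read time.
live = [r for i, r in enumerate(rows) if not dv.is_deleted(i)]
print(live)   # ['a', 'c', 'e']
```

The win is that the bitmap is tiny, mergeable, and loaded once per file — versus opening and joining an ever-growing pile of positional delete Parquet files.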
```sql
-- Enable deletion vectors on an existing v3 table
ALTER TABLE catalog.events
SET TBLPROPERTIES (
  'write.delete.mode' = 'merge-on-read',
  'write.update.mode' = 'merge-on-read'
);

-- Deletes now produce compact deletion vectors
-- instead of positional delete files
DELETE FROM catalog.events
WHERE event_ts < TIMESTAMP '2026-01-01 00:00:00';
```

For CDC pipelines, this is transformative. A MERGE operation that touches 50,000 rows across 200 data files used to produce 200 separate positional delete files. With deletion vectors, it produces a single Puffin file containing 200 compact bitmaps. Subsequent reads are faster because the engine loads one file instead of hundreds, and compaction jobs run more efficiently because there are fewer files to consolidate.
Row Lineage
Row lineage is the most forward-looking feature in v3. It adds two metadata columns to every data file: _row_id (a unique identifier for each row) and _last_updated_sequence_number (the snapshot sequence number when the row was last modified). These columns are maintained automatically by the Iceberg specification — you do not need to add them manually.
In v2, tracking changes between snapshots gives you net differences: you can see which files were added or removed, but not which specific rows changed within those files. Row lineage closes that gap. You can now query a table and ask: show me every row that was modified since snapshot 42.
```sql
-- Query row lineage metadata columns (Spark)
SELECT
  _row_id,
  _last_updated_sequence_number AS last_modified_seq,
  event_id,
  event_type
FROM catalog.events
WHERE _last_updated_sequence_number > 42
ORDER BY _row_id;
```

This is immediately useful for three scenarios. First, incremental CDC: downstream consumers can pull only the rows that changed since their last checkpoint, without scanning the full table. Second, regulatory compliance: you can trace exactly when a row was created or modified, which is required by regulations like GDPR's right to know what data is held. Third, AI/ML pipelines: training data lineage becomes native to the storage layer instead of requiring a separate lineage system.
Under the hood, the implementation is efficient. Each snapshot commit records a first-row-id (assigned from the table's monotonically increasing row counter) and an added-rows count. This metadata lives in the manifest list, so it adds minimal overhead to commits. The _row_id and _last_updated_sequence_number columns are written into data files during MERGE and UPDATE operations by engines like Spark.
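The bookkeeping described above can be sketched as a tiny simulation — a table-level counter hands each commit a first-row-id and an added-rows count, and every new row's _row_id falls in that range. This is an illustration of the scheme, not the actual metadata code:

```python
# Simulate row-id assignment: a monotonically increasing table counter,
# with each snapshot commit recording first-row-id and added-rows.
class Table:
    def __init__(self):
        self.next_row_id = 0
        self.snapshots = []

    def commit(self, added_rows: int) -> dict:
        snap = {
            "sequence_number": len(self.snapshots) + 1,
            "first_row_id": self.next_row_id,   # recorded in the manifest list
            "added_rows": added_rows,
        }
        self.next_row_id += added_rows          # advance the table counter
        self.snapshots.append(snap)
        return snap

t = Table()
s1 = t.commit(added_rows=100)   # new rows get _row_id 0..99
s2 = t.commit(added_rows=50)    # new rows get _row_id 100..149

assert s1["first_row_id"] == 0
assert s2["first_row_id"] == 100
assert t.next_row_id == 150
```

Because only two small numbers are recorded per commit, lineage costs almost nothing at write time while still giving every row a stable identity.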
Multi-Argument Transforms
Iceberg's partition transforms have been limited to single-column inputs since v1. You could bucket by one column, truncate another, or extract a date part — but you could not combine columns in a single transform. v3 lifts this restriction.
Multi-argument transforms accept multiple source column IDs as input. The most immediate use case is multi-column bucketing, where you want to distribute data based on a hash of two or more columns together. Future use cases include Z-order transforms (for multi-dimensional clustering) and geo-partitioning (combining latitude and longitude into spatial tiles).
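In spirit, a multi-column bucket transform just hashes the combined key and takes it modulo the bucket count, so the same (customer_id, region) pair always lands in the same bucket. The sketch below uses crc32 as a stand-in hash — Iceberg actually specifies Murmur3 over a defined serialization, so these bucket numbers will NOT match what an engine computes:

```python
import zlib

# Multi-column bucketing in spirit: hash the combined key, mod by the
# bucket count. crc32 is an illustrative stand-in for Iceberg's Murmur3.
def bucket(n: int, *values) -> int:
    key = "\x00".join(str(v) for v in values).encode()
    return zlib.crc32(key) % n

b1 = bucket(16, 42, "emea")
b2 = bucket(16, 42, "emea")
b3 = bucket(16, 42, "apac")

assert b1 == b2        # deterministic: same pair, same bucket
assert 0 <= b3 < 16    # always within the declared bucket count
```

Determinism is the property that matters: any writer, on any engine, must route the same composite key to the same bucket for partition pruning to work.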
```sql
-- v3 enables multi-column bucket transforms
CREATE TABLE catalog.orders (
  order_id BIGINT,
  customer_id BIGINT,
  region STRING,
  order_ts TIMESTAMP,
  total DECIMAL(12, 2)
)
USING iceberg
PARTITIONED BY (
  days(order_ts),
  bucket(16, customer_id, region)  -- multi-arg: bucket by two columns
)
TBLPROPERTIES ('format-version' = '3');
```

This is a metadata-only change. Existing data files are not affected when you update a partition spec to use multi-argument transforms. New files are written with the new partitioning, and Iceberg's partition evolution handles the transition seamlessly.
Table Encryption
v3 formalizes encryption at the table level. In v2, encryption was handled outside the spec — you relied on storage-level encryption (like S3 SSE) or custom tooling. v3 brings it into the Iceberg metadata layer through an encryption-keys list in the table metadata and integration with external Key Management Services (KMS).
When enabled, Iceberg encrypts and tamper-proofs data files, delete files, manifest files, and manifest list files. The metadata.json file itself is not encrypted — it needs to remain readable so engines can locate and decrypt everything else. Each file is encrypted with a data encryption key (DEK) that is itself wrapped by a master key stored in your KMS (AWS KMS, Azure Key Vault, or GCP Cloud KMS).
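The envelope-encryption shape is worth seeing end to end: each file gets its own data encryption key, the DEK is wrapped by a master key that never leaves the KMS, and only the wrapped DEK is stored alongside the table metadata. The sketch below shows that structure only — the XOR keystream cipher is a toy stand-in for real authenticated encryption like AES-GCM and provides no actual security:

```python
import hashlib
import secrets

# TOY cipher: SHA-256 counter keystream XORed with the data. Structure
# demo only -- real implementations use AES-GCM, not this.
def keystream_xor(key: bytes, data: bytes) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

master_key = secrets.token_bytes(32)            # lives only in the KMS

# Writer: fresh DEK per file; data encrypted with the DEK; DEK wrapped
# by the master key and stored with the table metadata.
dek = secrets.token_bytes(32)
ciphertext = keystream_xor(dek, b"sensitive parquet bytes")
wrapped_dek = keystream_xor(master_key, dek)

# Reader: unwrap the DEK via the KMS, then decrypt the file.
unwrapped = keystream_xor(master_key, wrapped_dek)
plaintext = keystream_xor(unwrapped, ciphertext)
assert plaintext == b"sensitive parquet bytes"
```

The practical consequence of this design is key rotation: rotating the master key means re-wrapping a handful of small DEKs, not re-encrypting terabytes of data files.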
```sql
-- Enable table encryption with AWS KMS
ALTER TABLE catalog.sensitive_events
SET TBLPROPERTIES (
  'encryption.key-id' = 'arn:aws:kms:us-east-1:123456789:key/abc-def-123'
);
```

This is particularly valuable for regulated industries — healthcare, finance, government — where data-at-rest encryption with externally managed keys is a compliance requirement. The encryption is transparent to query engines: they decrypt files automatically using the KMS integration, so end users do not need to change their queries.
Upgrading from v2 to v3
The upgrade is a single metadata operation. No data is rewritten. You run one ALTER TABLE command and your table is on v3.
```sql
ALTER TABLE catalog.my_table
SET TBLPROPERTIES ('format-version' = '3');
```

There are two important caveats. First, the upgrade is irreversible — you cannot downgrade a v3 table back to v2. Test on non-production tables first. Second, not all engines support all v3 features yet. Before upgrading, verify that every engine in your pipeline can at least read v3 metadata. Engines that do not understand a specific v3 feature (like Variant) will simply skip those columns or return null, but they must be able to parse the v3 metadata format.
As of early 2026, here is the engine support landscape: Spark 4.0+ fully supports v3 including Variant and deletion vectors. Snowflake has v3 in public preview with support for Variant, default values, deletion vectors, and row lineage. Databricks supports v3 through Unity Catalog on Runtime 17.3+. Trino has experimental v3 support including Variant and nanosecond timestamps. Flink support is in progress. DuckDB reads v3 tables and supports geospatial types.
A practical migration strategy: start with tables that benefit most from a specific v3 feature. If you have CDC-heavy tables drowning in positional delete files, upgrade those first to get deletion vectors. If you have JSON-heavy tables, upgrade to get Variant. Do not upgrade every table in your catalog on day one — roll it out incrementally as engine support solidifies.
The bottom line: Iceberg v3 is not an incremental spec revision. It addresses five years of community pain points — from JSON handling to delete performance to change tracking — in a single backward-compatible release. The features are already landing in production engines. If you are running Iceberg v2 today, the question is not whether to upgrade, but which tables to upgrade first.




