Apache Iceberg

Free tier available

Open table format for huge analytic datasets

Data Lakes lakehouse table-format open-source

📖 Overview

Apache Iceberg is an open table format designed for huge analytic datasets. Created at Netflix and later donated to the Apache Software Foundation, it has become a widely adopted open standard for lakehouse tables. Because the format is engine-agnostic, the same tables can be used from Spark, Trino, and Flink, and increasingly from Snowflake and Databricks.

✨ Key Features

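✓ Engine Agnostic: The table format is independent of any single compute engine, so several engines can share the same tables
✓ Hidden Partitioning: Partition values are derived from ordinary columns, so queries need no extra partition columns
✓ Schema Evolution: Add, drop, rename, or reorder columns without rewriting data files
✓ Time Travel: Query historical snapshots by snapshot ID or timestamp
✓ Partition Evolution: Change the partition layout without rewriting existing data
✓ Row-level Updates: Row-level MERGE, UPDATE, and DELETE operations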
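
To make hidden partitioning and time travel more concrete, here is a minimal conceptual sketch in plain Python. It is a toy model, not Iceberg's API or file layout: `ToyTable`, `DataFile`, `Snapshot`, and the `data/day=...` paths are hypothetical names invented for illustration. It assumes a day-level transform on an event timestamp and append-only commits. The idea it shows is that readers filter on the timestamp alone, the table prunes files using the derived day value, and an older snapshot can still be read after newer commits.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class DataFile:
    path: str    # hypothetical path, for illustration only
    day: date    # partition value derived from event timestamps
    rows: tuple  # (event_time, payload) pairs stored in this file


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: datetime
    files: tuple  # every data file visible in this table version


class ToyTable:
    """Toy stand-in for a table: an append-only list of immutable snapshots."""

    def __init__(self):
        self.snapshots = []

    def append(self, rows, committed_at):
        # Hidden partitioning: the table derives the day from each timestamp;
        # the writer never supplies a separate partition column.
        by_day = {}
        for event_time, payload in rows:
            by_day.setdefault(event_time.date(), []).append((event_time, payload))
        next_id = len(self.snapshots) + 1
        new_files = tuple(
            DataFile(f"data/day={day}/snap-{next_id}.parquet", day, tuple(group))
            for day, group in sorted(by_day.items())
        )
        previous = self.snapshots[-1].files if self.snapshots else ()
        # Each commit creates a new snapshot; older snapshots are left untouched.
        self.snapshots.append(Snapshot(next_id, committed_at, previous + new_files))

    def snapshot_as_of(self, when):
        # Time travel: pick the newest snapshot committed at or before `when`.
        candidates = [s for s in self.snapshots if s.committed_at <= when]
        if not candidates:
            raise ValueError("no snapshot exists at or before the requested time")
        return max(candidates, key=lambda s: s.committed_at)

    def scan(self, start, end, as_of=None):
        # The caller filters on event_time only; files are pruned by derived day.
        if not self.snapshots:
            return [], []
        snapshot = self.snapshots[-1] if as_of is None else self.snapshot_as_of(as_of)
        kept = [f for f in snapshot.files if start.date() <= f.day <= end.date()]
        rows = [r for f in kept for r in f.rows if start <= r[0] <= end]
        return rows, [f.path for f in kept]


table = ToyTable()
table.append(
    [(datetime(2024, 1, 1, 9), "a"), (datetime(2024, 1, 2, 9), "b")],
    committed_at=datetime(2024, 1, 2, 12),
)
table.append([(datetime(2024, 1, 2, 15), "c")], committed_at=datetime(2024, 1, 3, 12))

# Current view: only the files for 2024-01-02 are read.
rows, files = table.scan(datetime(2024, 1, 2), datetime(2024, 1, 2, 23, 59))

# Time travel: the same query against the table as it was before the second commit.
old_rows, old_files = table.scan(
    datetime(2024, 1, 2), datetime(2024, 1, 2, 23, 59), as_of=datetime(2024, 1, 2, 18)
)
```

In this sketch the query for 2 January never touches the file for 1 January, and the `as_of` scan returns only the row that existed before the second commit. In a real deployment these behaviors are handled by the table format and the query engine rather than by application code.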

💰 Pricing

Model: Open source

Starting Price: $0

✓ Free tier available

👍 Pros

+ Open standard governed by the Apache Software Foundation, not a single vendor
+ Broad multi-engine support
+ Proven at Netflix at massive scale
+ Growing adoption across vendors, including Snowflake and Databricks

👎 Cons

− Requires more setup than Delta Lake
− Ecosystem still maturing
− No single vendor champion
− Some features vary by engine

🎯 Best For

Teams that want vendor independence and multi-engine flexibility. A safe bet for a long-term data lake strategy.
