Apache Hudi

Open-source data lake framework that brings streaming ingestion, ACID transactions, upserts, and incremental processing to data lake storage.

Category: Data Lakes | Tags: data-lake, streaming, acid

Key Features

  • Upserts: Insert or update based on key
  • Deletes: Soft and hard deletes
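  • Merge on read: Updates are logged and reconciled with base files at read or compaction time
  • Copy on write: Files are rewritten at commit time, so reads need no merging
  • Near real-time: Latency on the order of minutes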
  • Kafka integration: Direct streaming ingest
  • DeltaStreamer: Built-in ingestion utility (named Hudi Streamer in newer releases)
  • Flink support: Stream processing integration
  • Change streams: Track data changes
  • Incremental queries: Process only deltas
  • Time travel: Query historical versions
  • Rollback: Revert to previous state
  • Atomicity: All-or-nothing commits
  • Consistency: Schema enforcement
  • Isolation: Concurrent readers and writers
  • Durability: Committed data persists
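
To make a few of the features above concrete (upserts by key, incremental queries, time travel, and rollback), here is a minimal in-memory sketch in plain Python. It is a toy model, not Hudi's API: the `ToyTable` class, its method names, and the integer commit ids are all hypothetical and exist only for illustration. Hudi itself tracks commits on a timeline and stores data in files, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a keyed table with a commit timeline (hypothetical, not Hudi's API)."""

    # Each entry is (commit_id, {record_key: row}) in commit order.
    commits: list = field(default_factory=list)

    def upsert(self, rows, key="id"):
        """Record one all-or-nothing commit: new keys are inserted, existing keys overwritten."""
        commit_id = len(self.commits) + 1
        self.commits.append((commit_id, {row[key]: row for row in rows}))
        return commit_id

    def snapshot(self, as_of=None):
        """Replay commits in order. Passing `as_of` stops at that commit (time travel)."""
        state = {}
        for commit_id, changes in self.commits:
            if as_of is not None and commit_id > as_of:
                break
            state.update(changes)
        return state

    def incremental(self, after):
        """Return only the rows written by commits later than `after`."""
        changed = {}
        for commit_id, changes in self.commits:
            if commit_id > after:
                changed.update(changes)
        return changed

    def rollback(self):
        """Drop the most recent commit, restoring the previous state."""
        if self.commits:
            self.commits.pop()


table = ToyTable()
c1 = table.upsert([{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}])
c2 = table.upsert([{"id": 2, "city": "Quito"}, {"id": 3, "city": "Pune"}])

latest = table.snapshot()           # key 2 updated, key 3 inserted
before = table.snapshot(as_of=c1)   # state as of the first commit
delta = table.incremental(after=c1) # only rows touched by the second commit
```

A downstream job that has already processed the first commit would read only `delta` rather than rescanning `latest`, which is the idea behind incremental queries.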
