DataHub

Open-source metadata platform for the modern data stack

Overview

DataHub is an open-source metadata platform originally built at LinkedIn. It provides data discovery, lineage, and governance capabilities. Acryl Data offers a managed cloud version. DataHub is designed to be extended and was built to handle metadata at LinkedIn's scale.

Key Features

  • Metadata Ingestion: 50+ source connectors
  • Lineage: Column-level data lineage
  • Search & Discovery: Find data assets fast
  • Data Quality: Integrate quality metrics
  • Governance: Tags, glossary, ownership
  • Real-time Updates: Stream metadata changes
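
To make the column-level lineage feature concrete, here is a minimal sketch of the idea in plain Python. It does not use DataHub's API: the `COLUMN_LINEAGE` mapping, the column names, and the `upstream_columns` helper are all hypothetical, chosen only to show how a column can be traced back to the source columns it is derived from.

```python
from collections import deque

# Hypothetical column-level lineage: each downstream column maps to the
# upstream columns it is derived from. Names are illustrative only and
# are not DataHub identifiers.
COLUMN_LINEAGE = {
    "reports.revenue_daily.total": [
        "warehouse.orders.amount",
        "warehouse.orders.discount",
    ],
    "warehouse.orders.amount": ["raw.orders_export.amount_cents"],
    "warehouse.orders.discount": ["raw.orders_export.discount_cents"],
}


def upstream_columns(column, lineage):
    """Return every column that `column` depends on, directly or transitively."""
    seen = set()
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in seen:  # guards against cycles and repeats
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


if __name__ == "__main__":
    for name in upstream_columns("reports.revenue_daily.total", COLUMN_LINEAGE):
        print(name)
```

A catalog with column-level lineage answers this kind of question ("which source fields feed this report column?") from metadata it has ingested, rather than from a hand-maintained mapping like the one above.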

Pros

  • 👍 True open-source with active community
  • 👍 LinkedIn-proven scale
  • 👍 Highly extensible architecture
  • 👍 Good lineage capabilities
  • 👍 Self-hosted option
  • 👍 Acryl offers managed cloud

Cons

  • 👎 Complex to deploy self-hosted
  • 👎 Steep learning curve
  • 👎 UI less polished than commercial tools
  • 👎 Requires investment to configure

Best For

Organizations that want an open-source data catalog at scale. A good fit for teams with the engineering resources to customize and maintain it.

Founded: 2020
HQ: LinkedIn / Acryl Data