AWS Glue

Serverless data integration service on AWS

Overview

AWS Glue is a serverless data integration service for discovering, preparing, and combining data for analytics and machine learning. It includes a Data Catalog, a Spark-based ETL engine, and crawlers for automatic schema discovery.

Key Features

  • Serverless: No infrastructure to manage
  • Data Catalog: Centralized metadata repository
  • Crawlers: Auto-discover schemas
  • Visual ETL: Low-code job authoring
  • Spark Engine: Scalable processing
  • Job Bookmarks: Incremental processing
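
The idea behind incremental processing with job bookmarks can be sketched in plain Python. This is a conceptual illustration only, not Glue's actual implementation or API: the `SourceObject` type and the `select_new_objects` function are hypothetical names, and the bookmark is assumed to be a single "last modified" timestamp.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SourceObject:
    """Hypothetical stand-in for an input file (for example, an object in a bucket)."""
    key: str
    last_modified: int  # epoch seconds


def select_new_objects(
    objects: Iterable[SourceObject], bookmark: int
) -> Tuple[List[SourceObject], int]:
    """Return objects modified after `bookmark`, plus the advanced bookmark.

    Assumption: the bookmark is one timestamp. A real service may track
    richer state per job and per source.
    """
    new = sorted(
        (o for o in objects if o.last_modified > bookmark),
        key=lambda o: o.last_modified,
    )
    # If nothing is new, the bookmark stays where it was.
    new_bookmark = max((o.last_modified for o in new), default=bookmark)
    return new, new_bookmark


objects = [
    SourceObject("a.csv", 100),
    SourceObject("b.csv", 200),
    SourceObject("c.csv", 300),
]

# First run: everything newer than the stored bookmark is processed.
first_batch, bookmark = select_new_objects(objects, bookmark=150)

# Second run with no new data: nothing is reprocessed.
second_batch, bookmark_after = select_new_objects(objects, bookmark)
```

The point of the design is that a rerun picks up only data that arrived since the previous run instead of reprocessing the full dataset.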

Pros

  • 👍 Deep AWS integration
  • 👍 Serverless scaling
  • 👍 Data Catalog is useful standalone
  • 👍 Visual editor for simple jobs
  • 👍 Pay only for compute used

Cons

  • 👎 Can be expensive at scale
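  • 👎 Cold start latency: jobs can take time to start while resources are provisioned
  • 👎 Jobs limited to Spark (Python or Scala) and Python shell runtimes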
  • 👎 Complex pricing model
  • 👎 Debugging can be painful
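
To see how cost grows with scale, a rough estimate can be sketched as capacity times billed time times an hourly rate. The function name `estimate_job_cost`, the default rate of 0.44 USD per DPU-hour, and the 60-second billing minimum are all assumptions for illustration; actual rates and minimums vary by region, job type, and Glue version, so check current AWS pricing before relying on the numbers.

```python
def estimate_job_cost(
    dpus: float,
    runtime_seconds: float,
    rate_per_dpu_hour: float = 0.44,  # assumed rate, not authoritative
    minimum_seconds: float = 60.0,    # assumed billing minimum
) -> float:
    """Estimate the cost of one job run in USD.

    cost = DPUs * billed hours * rate per DPU-hour
    """
    # Runs shorter than the minimum are billed as if they ran for the minimum.
    billed_seconds = max(runtime_seconds, minimum_seconds)
    return dpus * (billed_seconds / 3600.0) * rate_per_dpu_hour


# A 10-DPU job running for 30 minutes under the assumed rate.
single_run = estimate_job_cost(dpus=10, runtime_seconds=1800)

# The same job scheduled hourly for a 30-day month.
monthly = single_run * 24 * 30

print(f"single run: ${single_run:.2f}, monthly: ${monthly:.2f}")
```

Under these assumptions a single run is cheap, but frequent schedules and larger capacity multiply quickly, which is where the cost concern at scale comes from.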

Best For

AWS-native organizations that need serverless ETL. It also suits teams without dedicated data engineers that need basic data integration.

Founded: 2017
HQ: Amazon Web Services