- Serverless ETL service for data integration
- Automatically discovers and catalogs data
- Transforms and moves data to targets such as S3 or Redshift, where it can be queried with services like Athena
Open AWS Glue
- Log in to the AWS Management Console.
- In the search bar, type “Glue” and open AWS Glue.
Create a Database
The database will store metadata about your tables.
- In the Glue console, go to Data Catalog, then Databases.
- Click Add database.
- Enter a name, for example mygluedb.
- Click Create.
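The same database can also be created from code. The following is a minimal sketch, assuming boto3 is installed and your credentials allow Glue access; the helper name create_glue_database and the description text are illustrative and not part of the console steps above.

```python
def create_glue_database(glue_client, name, description=""):
    """Create a Glue Data Catalog database and return the input that was sent."""
    database_input = {"Name": name}
    if description:
        database_input["Description"] = description
    # create_database is the Glue API call behind the console's "Add database"
    glue_client.create_database(DatabaseInput=database_input)
    return database_input


# Usage (assumes boto3 plus AWS credentials with Glue permissions):
#   import boto3
#   glue = boto3.client("glue")
#   create_glue_database(glue, "mygluedb", "Metadata for crawled S3 data")
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real account.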
Create a Crawler
A crawler scans your data source (like S3) and automatically builds a table in the Glue Data Catalog.
- Go to Data Catalog, then Crawlers.
- Click Create crawler.
- Give it a name, e.g., s3_data_crawler.
- Source type: Choose Data stores.
- Connection type: Select S3 and browse to your bucket (s3://sourcedata/).
- IAM Role: Choose Create new IAM role. Glue will create one automatically.
- Output: Choose your database (mygluedb).
- Review all settings and click Create crawler.
- Once created, click Run crawler.
When it finishes, you’ll see a new table under your Glue database.
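For repeatable setups, the crawler steps can be scripted as well. This is a rough sketch rather than a definitive implementation: it assumes a boto3 Glue client, an IAM role that already exists (the role name in the usage comment is hypothetical), and that the crawler has left the READY state by the time of the first poll. The helper name run_s3_crawler and the timeout values are illustrative.

```python
import time


def run_s3_crawler(glue_client, name, role, database, s3_path,
                   poll_seconds=15, timeout_seconds=900):
    """Create a crawler for one S3 path, start it, and wait until it is idle."""
    glue_client.create_crawler(
        Name=name,
        Role=role,  # IAM role name or ARN that Glue can assume
        DatabaseName=database,  # tables found by the crawler land here
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue_client.start_crawler(Name=name)

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        state = glue_client.get_crawler(Name=name)["Crawler"]["State"]
        if state == "READY":  # the crawler returns to READY after a run
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Crawler {name!r} did not finish within {timeout_seconds}s")


# Usage (the role name below is a placeholder, not a real role):
#   import boto3
#   glue = boto3.client("glue")
#   run_s3_crawler(glue, "s3_data_crawler", "AWSGlueServiceRole-demo",
#                  "mygluedb", "s3://sourcedata/")
```

Unlike the console flow, this sketch does not create the IAM role for you, so the role must exist before the call.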
Create an ETL Job
Now we’ll transform and move the data.
- Go to ETL Jobs, then Jobs, then Create job.
- Choose Visual with a source and target.
- Select your source table (from the crawler).
- Choose Amazon S3 as the target.
- Set a target path like s3://destdata/.
- Click Next to view the script. Glue automatically generates a PySpark script.
- Click Save and run job.
AWS Glue will now extract your data, apply the transformations, and load the result into the target bucket.
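Once the job has been saved, later runs do not have to go through the console. The sketch below starts a run and polls until it reaches a terminal state, assuming a boto3 Glue client and a job that already exists. The job name in the usage comment is hypothetical, since the steps above do not name the job, and run_glue_job is an illustrative helper.

```python
import time

# Run states after which Glue will not change the status again
FINISHED_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def run_glue_job(glue_client, job_name, poll_seconds=30, timeout_seconds=3600):
    """Start a saved Glue job and return its final JobRunState."""
    run_id = glue_client.start_job_run(JobName=job_name)["JobRunId"]

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in FINISHED_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_name!r} run {run_id} is still in progress")


# Usage (the job name is a placeholder for whatever you saved the job as):
#   import boto3
#   glue = boto3.client("glue")
#   print(run_glue_job(glue, "s3_to_s3_job"))
```

Returning the final state instead of raising on failure leaves it to the caller to decide how to react to a FAILED or TIMEOUT run.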
Check the Output
- Go to your target S3 bucket (s3://destdata/).
- You should see new files with transformed data.
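The same check can be done from code. This is a small sketch assuming a boto3 S3 client with read access to the bucket; list_output_keys is an illustrative helper, and a single list_objects_v2 call returns at most 1,000 keys, so larger outputs would need pagination.

```python
def list_output_keys(s3_client, bucket, prefix=""):
    """Return the object keys found in the bucket under an optional prefix."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when nothing matches
    return [obj["Key"] for obj in response.get("Contents", [])]


# Usage:
#   import boto3
#   s3 = boto3.client("s3")
#   for key in list_output_keys(s3, "destdata"):
#       print(key)
```

An empty list after a successful job run usually means the target path in the job differs from the bucket or prefix being checked.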
AWS Glue makes building and managing ETL pipelines simple and efficient. It’s an ideal solution for quickly turning raw data into analytics-ready formats, helping data engineers and analysts focus on insights rather than infrastructure.
If you’re looking for expert help to transform data with AWS Glue, our team at Skynats is here to assist. With our professional AWS Management Services and reliable DevOps Support Services, we ensure seamless data movement, transformation, and automation across your cloud environment. Contact Skynats today to get end-to-end AWS Glue implementation and 24/7 technical support.