{"id":14224,"date":"2025-02-07T12:08:58","date_gmt":"2025-02-07T06:38:58","guid":{"rendered":"https:\/\/www.skynats.com\/?p=14224"},"modified":"2025-02-07T12:09:01","modified_gmt":"2025-02-07T06:39:01","slug":"how-to-install-apache-airflow-on-ubuntu-24-04","status":"publish","type":"post","link":"https:\/\/www.skynats.com\/blog\/how-to-install-apache-airflow-on-ubuntu-24-04\/","title":{"rendered":"How to Install Apache Airflow on Ubuntu 24.04"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Apache Airflow is a powerful open-source tool designed for orchestrating complex workflows and data pipelines. It enables you to programmatically schedule and monitor workflows, making it perfect for automating tasks like data processing and machine learning pipelines. With dynamic pipeline generation, robust scheduling, and monitoring features, Airflow has become one of the top tools in the data engineering field. This guide walks through installing Apache Airflow on Ubuntu 24.04 with PostgreSQL as its metadata database.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Prerequisites:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">An Ubuntu 24.04 instance with at least 4 GB of RAM.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A domain (skynats.example.com) with an A record pointing to the instance\u2019s IP address.<\/p>\n\n\n\n<p class=\"has-normal-font-size wp-block-paragraph\"><strong>Step 1: Update Your System<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt update<\/code><\/pre>\n\n\n\n<p class=\"has-normal-font-size wp-block-paragraph\"><strong>Step 2: Verify Python Installation<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>python3 --version<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">If it is not installed, run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt install python3<\/code><\/pre>\n\n\n\n<p class=\"has-normal-font-size wp-block-paragraph\"><strong>Step 3: Install Required 
Packages<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt-get install build-essential libpq-dev python3-dev python3-venv<\/code><\/pre>\n\n\n\n<p class=\"has-normal-font-size wp-block-paragraph\"><strong>Step 4: Create a Virtual Environment<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new Python virtual environment called airflow_env in your home directory:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>python3 -m venv ~\/airflow_env<\/code><\/pre>\n\n\n\n<p class=\"has-normal-font-size wp-block-paragraph\"><strong>Step 5: Activate the Virtual Environment<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>source ~\/airflow_env\/bin\/activate<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Once activated, your prompt should be prefixed with (airflow_env).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-6-install-apache-airflow-with-postgresql-support\" style=\"font-size:18px\">Step 6: Install Apache Airflow with PostgreSQL Support<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install apache-airflow&#91;postgres] psycopg2<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-7-install-postgresql\" style=\"font-size:18px\">Step 7: Install PostgreSQL<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo apt install postgresql postgresql-contrib\nsudo systemctl start postgresql<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-8-configure-the-postgresql-for-airflow\" style=\"font-size:18px\">Step 8: Configure PostgreSQL for Airflow<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Access the <a href=\"https:\/\/www.postgresql.org\/\" target=\"_blank\" rel=\"noopener\">PostgreSQL<\/a> console:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo -u postgres psql<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Inside the PostgreSQL console, create a new user for Airflow and set the password:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE USER airflow PASSWORD 'yourpassword';<\/code><\/pre>\n\n\n\n<p 
class=\"wp-block-paragraph\">Replace &#8216;yourpassword&#8217; with your desired password. Grant the new user full privileges on all tables in the public schema:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO airflow;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new database for Airflow:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE DATABASE airflowdb;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Grant the Airflow user ownership of the database:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ALTER DATABASE airflowdb OWNER TO airflow;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Grant the Airflow user all privileges on the public schema and exit PostgreSQL:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>GRANT ALL ON SCHEMA public TO airflow;\nexit;<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-9-modify-the-airflow-configuration\" style=\"font-size:18px\">Step 9: Modify the Airflow Configuration<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If the Airflow home directory (~\/airflow) does not exist yet, initialize the database and start the scheduler to generate it:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>airflow db init; airflow scheduler<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Stop the scheduler using CTRL+C, then open the airflow.cfg file:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>nano ~\/airflow\/airflow.cfg<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Find the following lines and modify them:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>executor = LocalExecutor\nsql_alchemy_conn = postgresql+psycopg2:\/\/airflow:YourStrongPassword@localhost\/airflowdb<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Replace YourStrongPassword with the password you set earlier for the PostgreSQL airflow user. 
Save and close the file.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-10-initialize-airflow-s-metadata-database\" style=\"font-size:18px\">Step 10: Initialize Airflow\u2019s Metadata Database<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Apply the configuration changes by initializing the Airflow metadata database:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>airflow db init<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-11-create-an-admin-user-for-airflow\" style=\"font-size:18px\">Step 11: Create an Admin User for Airflow<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Create an administrative user for accessing the Apache Airflow UI. Replace skynats, the password, and the email address with your own values:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>airflow users create \\\n  --username skynats \\\n  --password yourSuperSecretPassword \\\n  --firstname Skynats \\\n  --lastname User \\\n  --role Admin \\\n  --email yourmail@gmail.com<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-step-12-start-the-airflow-web-server-and-scheduler\" style=\"font-size:18px\">Step 12: Start the Airflow Web Server and Scheduler<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Start the Airflow web server on port 8080 in the background and redirect its logs to a file. Then start the Airflow scheduler in the background and redirect its logs to another file:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>nohup airflow webserver -p 8080 > webserver.log 2>&amp;1 &amp;\nnohup airflow scheduler > scheduler.log 2>&amp;1 &amp;<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-step-13-configure-nginx-as-a-reverse-proxy\" style=\"font-size:18px\">Step 13: Configure Nginx as a Reverse Proxy<\/h4>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-14-access-the-airflow-ui\" style=\"font-size:18px\">Step 14: Access the Airflow UI<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Open your browser and navigate to your domain (skynats.example.com). 
The Apache Airflow login page should appear. Log in with the credentials you created in Step 11.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-step-15-secure-apache-airflow-with-ssl-certificates\" style=\"font-size:18px\">Step 15: Secure Apache Airflow with SSL Certificates<\/h2>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"h-step-16-now-create-and-run-dags-using-apache-airflow\" style=\"font-size:18px\">Step 16: Create and Run DAGs Using Apache Airflow<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Navigate to the Airflow web interface, locate your DAG, enable it, and trigger it manually. You can monitor the DAG&#8217;s execution through the Graph View and Event Log.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you need assistance or encounter any issues while following the steps to install Apache Airflow on Ubuntu 24.04, feel free to reach out to our support team. Our experts are ready to provide guidance and ensure a smooth installation process. <a href=\"https:\/\/www.skynats.com\/contact-us\/\">Contact us<\/a> today for professional support and get the help you need to successfully set up Apache Airflow on your Ubuntu system!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Airflow is a powerful open-source tool designed for orchestrating complex workflows and data pipelines. It enables you to programmatically schedule and monitor workflows, making it perfect for automating tasks like data processing and machine learning pipelines. 
With dynamic pipeline generation, robust scheduling, and monitoring features, Airflow has become one of the top tools in [&hellip;]<\/p>\n","protected":false},"author":16,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[799,960,899],"class_list":["post-14224","post","type-post","status-publish","format-standard","hentry","category-blog","tag-apache-airflow","tag-install-apache-airflow-on-ubuntu-24-04","tag-ubuntu-24-04"],"_links":{"self":[{"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/posts\/14224","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/comments?post=14224"}],"version-history":[{"count":0,"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/posts\/14224\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/media?parent=14224"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/categories?post=14224"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.skynats.com\/blog\/wp-json\/wp\/v2\/tags?post=14224"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}