Create an Azure Blob Storage connection in Airflow
An Azure storage account holds all of your Azure Storage data objects, including blobs, files, queues, and tables, and Azure Blob Storage is the service within it that stores blobs. Integrating your Azure storage account with Airflow lets you perform different kinds of operations on blob objects stored in the cloud. For example, you can create or delete a container, upload or read a blob, or download blobs using Airflow.
This guide explains how to set up an Azure Blob Storage connection using the Azure Blob Storage connection type. Astronomer recommends using this connection type because it utilizes the wasb protocol, which means you can connect with any Azure Storage account including Azure Data Lake Gen 1 and Azure Data Lake Gen 2.
Prerequisites
- The Astro CLI.
- A locally running Astro project.
- An Azure storage account.
- Permissions to access blob data from your local Airflow environment.
Get connection details
To create an Azure Blob Storage connection in Airflow, you can use any of the following methods:
- Shared access key
- Connection string
- SAS token
- Azure app service principal
Microsoft generates two shared access keys by default for every storage account. You can use them to give Airflow access to the data in your storage account.
An Azure Blob Storage connection using a shared access key requires the following information:
- Name of the storage account
- Shared access key
Complete the following steps to retrieve these values:
- In your Azure portal, open your storage account.
- Copy the name of your storage account.
- Follow Microsoft documentation to copy the storage account Key.
A connection string for a storage account includes the authorization information required to access data in your storage account.
An Azure Blob Storage connection using a connection string requires the following information:
- Storage account name
- Storage account connection string
Complete the following steps to retrieve these values:
- In your Azure portal, open your storage account.
- Copy the name of your storage account.
- Follow Microsoft documentation to copy the Connection string.
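Before pasting a connection string into Airflow, it can help to confirm that it contains the settings you expect. The following is a minimal sketch, assuming the common `Key=Value;Key=Value` layout of a storage account connection string. The helper name `parse_connection_string` and the sample values are hypothetical placeholders, not real credentials.

```python
def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split an Azure Storage connection string into its key/value settings."""
    settings = {}
    for part in conn_str.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: base64 account keys end in "=" padding.
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError("connection string segment is missing '='")
        settings[key] = value
    return settings


# Placeholder values for illustration only.
example = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=mystorageaccount;"
    "AccountKey=ZmFrZS1rZXk=;"
    "EndpointSuffix=core.windows.net"
)
parsed = parse_connection_string(example)
print(parsed["AccountName"])
```

Checking that `AccountName` matches the storage account you copied is a quick way to catch a string copied from the wrong account.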
A shared access signature (SAS) token provides granular access for a storage account.
An Azure Blob Storage connection using a SAS token requires the following information:
- Storage account name
- SAS token
Complete the following steps to retrieve these values:
- In your Azure portal, navigate to your Storage account view and select your subscription.
- Copy the name of your storage account.
- Follow the Microsoft documentation to generate your SAS token. Copy the SAS token.
A service principal for an Azure app provides granular access for a storage account.
An Azure Blob Storage connection using a service principal requires the following information:
- Storage account URL
- Application Client ID
- Tenant ID
- Client secret
Complete the following steps to retrieve these values:
- In your Azure portal, open your storage account.
- Follow Azure documentation to copy your Blob Service URL. It should be in the format https://mystorageaccount.blob.core.windows.net/.
- Open your Microsoft Entra ID application. Then, from the Overview tab, copy the Application (client) ID and Directory (tenant) ID.
- Create a new client secret for your application to be used in the Airflow connection. Copy the VALUE of the client secret that appears.
- Assign the Storage Blob Data Contributor role to your app so that Airflow can access blob objects in your storage account.
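The Blob Service URL format shown above can also be derived from the storage account name. This is a minimal sketch that assumes the default public-cloud endpoint suffix `blob.core.windows.net`. Accounts in sovereign clouds or behind custom domains use different URLs, so copy the value from the portal when in doubt. The helper name `blob_service_url` is hypothetical.

```python
import re

# Storage account names are 3 to 24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def blob_service_url(account_name: str) -> str:
    """Build the default Blob service URL for a storage account name."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(
            "storage account names must be 3-24 lowercase letters or digits"
        )
    return f"https://{account_name}.blob.core.windows.net/"


print(blob_service_url("mystorageaccount"))
```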
Create your connection
- Shared access key
- Connection string
- SAS token
- Azure app service principal
For a shared access key connection:
- Open your Astro project and add the following line to your requirements.txt file:
  apache-airflow-providers-microsoft-azure
  This installs the Microsoft Azure provider package, which makes the Azure Blob Storage connection type available in Airflow.
- Run astro dev restart to restart your local Airflow environment and apply your changes in requirements.txt.
- In the Airflow UI for your local Airflow environment, go to Admin > Connections. Click + to add a new connection, then choose the Azure Blob Storage connection type.
- Fill out the following connection fields using the information you retrieved from Get connection details:
  - Connection Id: Enter a name for the connection.
  - Blob Storage Login: Enter your storage account name.
  - Blob Storage Key: Enter your storage account Key.
- Click Test. After the connection test succeeds, click Save.
For a connection string connection:
- Open your Astro project and add the following line to your requirements.txt file:
  apache-airflow-providers-microsoft-azure
  This installs the Microsoft Azure provider package, which makes the Azure Blob Storage connection type available in Airflow.
- Run astro dev restart to restart your local Airflow environment and apply your changes in requirements.txt.
- In the Airflow UI for your local Airflow environment, go to Admin > Connections. Click + to add a new connection, then choose the Azure Blob Storage connection type.
- Fill out the following connection fields using the information you retrieved from Get connection details:
  - Connection Id: Enter a name for the connection.
  - Blob Storage Connection String: Enter your storage account connection string.
- Click Test. After the connection test succeeds, click Save.
If you want, you can replace the value in Blob Storage Connection String with the connection string for an SAS token.
For a SAS token connection:
- Open your Astro project and add the following line to your requirements.txt file:
  apache-airflow-providers-microsoft-azure
  This installs the Microsoft Azure provider package, which makes the Azure Blob Storage connection type available in Airflow.
- Run astro dev restart to restart your local Airflow environment and apply your changes in requirements.txt.
- In the Airflow UI for your local Airflow environment, go to Admin > Connections. Click + to add a new connection, then choose the Azure Blob Storage connection type.
- Fill out the following connection fields using the information you retrieved from Get connection details:
  - Connection Id: Enter a name for the connection.
  - Blob Storage Login: Enter the name of your storage account.
  - SAS Token: Enter your SAS token.
- Click Test. After the connection test succeeds, click Save.
For an Azure app service principal connection:
- Open your Astro project and add the following line to your requirements.txt file:
  apache-airflow-providers-microsoft-azure
  This installs the Microsoft Azure provider package, which makes the Azure Blob Storage connection type available in Airflow.
- Run astro dev restart to restart your local Airflow environment and apply your changes in requirements.txt.
- In the Airflow UI for your local Airflow environment, go to Admin > Connections. Click + to add a new connection, then choose the Azure Blob Storage connection type.
- Fill out the following connection fields using the information you retrieved from Get connection details:
  - Connection Id: Enter a name for the connection.
  - Account Name: Enter the Blob Service URL for your storage account.
  - Blob Storage Login: Enter your Application (client) ID.
  - Blob Storage Key: Enter your client secret Value.
  - Tenant Id: Enter your Directory (tenant) ID.
- Click Test. After the connection test succeeds, click Save.
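As an alternative to the UI, the same four connections can be sketched as JSON values for `AIRFLOW_CONN_<CONN_ID>` environment variables. This is a hedged sketch, not the documented Astronomer workflow. It assumes an Airflow version that accepts JSON-formatted connection environment variables (2.3 or later), and it assumes the UI fields map to `login`, `password`, `host`, and the extras `connection_string`, `sas_token`, and `tenant_id`. Those key names can differ between provider versions, so verify them against the provider you have installed. All function names and values are hypothetical placeholders.

```python
import json


def shared_key_conn(account_name: str, account_key: str) -> dict:
    # Assumed mapping: Blob Storage Login -> login, Blob Storage Key -> password.
    return {"conn_type": "wasb", "login": account_name, "password": account_key}


def connection_string_conn(connection_string: str) -> dict:
    # Assumed extra key name: "connection_string".
    return {"conn_type": "wasb", "extra": {"connection_string": connection_string}}


def sas_token_conn(account_name: str, sas_token: str) -> dict:
    # Assumed extra key name: "sas_token".
    return {"conn_type": "wasb", "login": account_name, "extra": {"sas_token": sas_token}}


def service_principal_conn(account_url: str, client_id: str,
                           client_secret: str, tenant_id: str) -> dict:
    # Assumed mapping: Blob Service URL -> host, client ID -> login,
    # client secret -> password, tenant ID -> extra "tenant_id".
    return {
        "conn_type": "wasb",
        "host": account_url,
        "login": client_id,
        "password": client_secret,
        "extra": {"tenant_id": tenant_id},
    }


def as_env_var(conn_id: str, conn: dict) -> tuple[str, str]:
    """Return the environment variable name and JSON value for a connection."""
    return f"AIRFLOW_CONN_{conn_id.upper()}", json.dumps(conn)


name, value = as_env_var(
    "azure_blob_sp",
    service_principal_conn(
        "https://mystorageaccount.blob.core.windows.net/",
        "my-client-id", "my-client-secret", "my-tenant-id",
    ),
)
print(name)
```

Connections defined through environment variables do not appear in the Airflow UI, so the Test button is not available for them.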
How it works
Airflow uses the Azure SDK for Python to connect to Azure services through the WasbHook.
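To illustrate how a task might use the hook once the connection exists, here is a minimal sketch. It assumes the provider package is installed and that a connection with the hypothetical ID `azure_blob_default` and a container named `my-container` already exist. The `WasbHook` import is deferred into `make_hook` so that the helper functions can be loaded and tested without Airflow present.

```python
def make_hook(conn_id: str):
    """Create a WasbHook for an existing Airflow connection ID."""
    # Imported lazily so this module loads even where the provider is not installed.
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

    return WasbHook(wasb_conn_id=conn_id)


def upload_text(hook, container_name: str, blob_name: str, text: str) -> bool:
    """Upload a string as a blob, then confirm that the blob exists."""
    hook.load_string(
        string_data=text,
        container_name=container_name,
        blob_name=blob_name,
    )
    return hook.check_for_blob(container_name=container_name, blob_name=blob_name)


# Example call inside a task (requires a working connection and container):
# upload_text(make_hook("azure_blob_default"), "my-container", "hello.txt", "Hello from Airflow")
```

Passing the hook in as an argument keeps the upload logic independent of any one connection, which makes it easier to exercise with a stand-in object during testing.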