Authenticate to cloud services with user credentials
When you develop Apache Airflow DAGs locally with the Astro CLI, testing with local data is the easiest way to get started. For more complex data pipelines, you might need to test DAGs locally with data that's stored in your organization's cloud, such as secret values in a secrets backend service.
To access data on the cloud while developing locally with the Astro CLI, export your cloud account user credentials to a secure configuration file and mount that file in the Docker containers running your local Airflow environment. After you configure this file, you can connect to your cloud without needing to configure additional credentials in Airflow connections. Airflow inherits all permissions from your cloud account and uses them to access your cloud.
Setup
- AWS
- GCP
- Azure
Prerequisites
- A user account on AWS with access to AWS cloud resources.
- The AWS CLI.
- The Astro CLI.
- An Astro project.
Retrieve AWS user credentials locally
Run the following command to obtain your user credentials locally:
aws configure
This command prompts you for your Access Key ID, Secret Access Key, region, and output format. If you log in to AWS using single sign-on (SSO), run aws configure sso instead.
The AWS CLI then stores your credentials in two separate files:
.aws/config
.aws/credentials
The location of these files depends on your operating system:
- Linux: /home/<username>/.aws
- Mac: /Users/<username>/.aws
- Windows: %UserProfile%/.aws
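If you want to confirm that these files resolve to a working identity before you mount them, a minimal local check with boto3 might look like the following. This is a sketch only and assumes boto3 is installed in your local Python environment; it isn't part of the setup itself.

# Minimal sanity check: boto3 reads ~/.aws/config and ~/.aws/credentials by
# default, so this prints the ARN of the identity you just configured.
import boto3

identity = boto3.client("sts").get_caller_identity()
print(identity["Arn"])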
Configure your Astro project
The Astro CLI runs Airflow in a Docker-based environment. To give Airflow access to your credential files, you'll mount the .aws
folder as a volume in Docker.
- In your Astro project, create a file named docker-compose.override.yml with the following configuration:

- Mac

version: "3.1"
services:
  scheduler:
    volumes:
      - /Users/<username>/.aws:/home/astro/.aws:rw
  webserver:
    volumes:
      - /Users/<username>/.aws:/home/astro/.aws:rw
  triggerer:
    volumes:
      - /Users/<username>/.aws:/home/astro/.aws:rw

- Linux

version: "3.1"
services:
  scheduler:
    volumes:
      - /home/<username>/.aws:/home/astro/.aws:rw
  webserver:
    volumes:
      - /home/<username>/.aws:/home/astro/.aws:rw
  triggerer:
    volumes:
      - /home/<username>/.aws:/home/astro/.aws:rw

- Windows

version: "3.1"
services:
  scheduler:
    volumes:
      - /c/Users/<username>/.aws:/home/astro/.aws:rw
  webserver:
    volumes:
      - /c/Users/<username>/.aws:/home/astro/.aws:rw
  triggerer:
    volumes:
      - /c/Users/<username>/.aws:/home/astro/.aws:rw
Depending on your Docker configurations, you might have to make your .aws
folder accessible to Docker. To do this, open Preferences in Docker Desktop and go to Resources -> File Sharing. Add the full path of your .aws
folder to the list of shared folders.
- In your Astro project's .env file, add the following environment variables. Make sure that the volume path is the same as the one you configured in the docker-compose.override.yml file.

AWS_CONFIG_FILE=/home/astro/.aws/config
AWS_SHARED_CREDENTIALS_FILE=/home/astro/.aws/credentials
When you run Airflow locally, all AWS connections without defined credentials automatically fall back to your user credentials when connecting to AWS. Airflow applies credentials to AWS connections in the following order, with later items overriding earlier ones:

- Mounted user credentials in the ~/.aws/config file.
- Configurations in aws_access_key_id, aws_secret_access_key, and aws_session_token.
- An explicit username and password provided in the connection.

For example, if you completed the configuration in this document and then created a new AWS connection with its own username and password, Airflow would use those credentials instead of the credentials in ~/.aws/config.
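To see this fallback in code, a task can resolve credentials through an AWS connection that defines none of its own. The sketch below is illustrative only and assumes the Amazon provider is installed and that an AWS connection named aws_standard exists with no credentials (you create one like it later in this guide).

# Sketch only: AwsBaseHook resolves credentials for a connection without its
# own keys, falling back to the mounted ~/.aws files inside the container.
from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

def show_fallback_credentials():
    hook = AwsBaseHook(aws_conn_id="aws_standard", client_type="sts")
    credentials = hook.get_credentials()
    # Print only a prefix so no secret material lands in the logs.
    print(f"Resolved access key: {credentials.access_key[:4]}****")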
Prerequisites
- A user account on GCP with access to GCP cloud resources.
- The Google Cloud SDK.
- The Astro CLI.
- An Astro project.
- Optional. Access to a secrets backend hosted on GCP, such as GCP Secret Manager.
Retrieve GCP user credentials locally
Run the following command to obtain your user credentials locally:
gcloud auth application-default login
The SDK provides a link to a webpage where you can log in to your Google Cloud account. After you complete your login, the SDK stores your user credentials in a file named application_default_credentials.json.
The location of this file depends on your operating system:
- Linux: $HOME/.config/gcloud/application_default_credentials.json
- Mac: /Users/<username>/.config/gcloud/application_default_credentials.json
- Windows: %APPDATA%/gcloud/application_default_credentials.json
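To confirm that Application Default Credentials resolve before you mount the file, you can run a quick check with the google-auth library. This is a sketch only and assumes google-auth is installed in your local Python environment.

# google.auth.default() reads application_default_credentials.json from its
# default location, so this confirms the file produced by gcloud is usable.
import google.auth

credentials, project_id = google.auth.default()
print(f"ADC loaded. Default project: {project_id}")  # project_id can be None if unset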
Configure your Astro project
The Astro CLI runs Airflow in a Docker-based environment. To give Airflow access to your credential file, mount it as a Docker volume.
- In your Astro project, create a file named docker-compose.override.yml with the following configuration:

- Mac

version: "3.1"
services:
  scheduler:
    volumes:
      - /Users/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  webserver:
    volumes:
      - /Users/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  triggerer:
    volumes:
      - /Users/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw

- Linux

version: "3.1"
services:
  scheduler:
    volumes:
      - /home/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  webserver:
    volumes:
      - /home/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  triggerer:
    volumes:
      - /home/<username>/.config/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw

- Windows

version: "3.1"
services:
  scheduler:
    volumes:
      - /c/Users/<username>/AppData/Roaming/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  webserver:
    volumes:
      - /c/Users/<username>/AppData/Roaming/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
  triggerer:
    volumes:
      - /c/Users/<username>/AppData/Roaming/gcloud/application_default_credentials.json:/usr/local/airflow/gcloud/application_default_credentials.json:rw
- In your Astro project's .env file, add the following environment variable. Ensure that this volume path is the same as the one you configured in docker-compose.override.yml.

GOOGLE_APPLICATION_CREDENTIALS=/usr/local/airflow/gcloud/application_default_credentials.json
When you run Airflow locally, all GCP connections without defined credentials automatically fall back to your user credentials when connecting to GCP. Airflow applies credentials to GCP connections in the following order, with later items overriding earlier ones:

- Mounted user credentials in the ~/gcloud/ folder.
- Configurations in gcp_keyfile_dict.
- An explicit username and password provided in the connection.

For example, if you completed the configuration in this document and then created a new GCP connection with its own username and password, Airflow would use those credentials instead of the credentials in ~/gcloud/application_default_credentials.json.
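To see this fallback in code, a task can call a Google client library directly and inherit your user permissions, because GOOGLE_APPLICATION_CREDENTIALS points at the mounted file. This sketch is illustrative only and assumes the Google provider (which bundles google-cloud-storage) is installed and that your account can list buckets in the default project.

# Sketch only: the storage client picks up the mounted Application Default
# Credentials, so it runs with your user account's permissions.
# Pass storage.Client(project="<my-project-id>") if no default project is set.
from google.cloud import storage

def list_my_buckets():
    client = storage.Client()
    for bucket in client.list_buckets():
        print(bucket.name)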
Prerequisites
- A user account on Azure with access to Azure cloud resources.
- The Azure CLI.
- The Astro CLI.
- An Astro project.
- If you're using Windows, Windows Subsystem for Linux (WSL).
Retrieve Azure user credentials locally
Run the following command to obtain your user credentials locally:
az login
The CLI provides you with a link to a webpage where you authenticate to your Azure account. Once you complete the login, the CLI stores your user credentials in your local Azure configuration folder. These developer account credentials are used in place of the credentials associated with a registered application (service principal) in Microsoft Entra ID.
The default location of the Azure configuration folder depends on your operating system:
- Linux: $HOME/.azure/
- Mac: /Users/<username>/.azure
- Windows: %USERPROFILE%/.azure/
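To confirm the az login session is usable before you mount the folder, you can request a token with the azure-identity package. This is a sketch only and assumes azure-identity is installed in your local Python environment.

# AzureCliCredential shells out to `az account get-access-token`, so this only
# works after a successful `az login`.
from azure.identity import AzureCliCredential

token = AzureCliCredential().get_token("https://management.azure.com/.default")
print(f"Token acquired; expires at (epoch seconds): {token.expires_on}")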
Configure your Astro project
The Astro CLI runs Airflow in a Docker-based environment. To give Airflow access to your credential files, mount the .azure
folder as a volume in Docker.
- In your Astro project, create a file named docker-compose.override.yml with the following configuration:

- Mac

version: "3.1"
services:
  scheduler:
    volumes:
      - /Users/<username>/.azure:/usr/local/airflow/.azure:rw
  webserver:
    volumes:
      - /Users/<username>/.azure:/usr/local/airflow/.azure:rw
  triggerer:
    volumes:
      - /Users/<username>/.azure:/usr/local/airflow/.azure:rw

- Windows and Linux

version: "3.1"
services:
  scheduler:
    volumes:
      - /home/<username>/.azure:/usr/local/airflow/.azure
  webserver:
    volumes:
      - /home/<username>/.azure:/usr/local/airflow/.azure
  triggerer:
    volumes:
      - /home/<username>/.azure:/usr/local/airflow/.azure

In Azure CLI versions 2.30.0 and later on Windows systems, credentials generated by the CLI are saved in an encrypted file and cannot be accessed from Astro Runtime Docker containers. See MSAL-based Azure CLI. To work around this limitation on a Windows computer, use Windows Subsystem for Linux (WSL) when completing this setup. If you installed the Azure CLI both in Windows and WSL, make sure that the ~/.azure path in your volume points to the configuration folder for the Azure CLI installed in WSL.
- Add the following lines after the FROM line in your Dockerfile to install the Azure CLI inside your Astro Runtime image:

# FROM ...
USER root
RUN curl -sL https://aka.ms/InstallAzureCLIDeb | bash
USER astro

If you're using an Apple M1 Mac, you must use the linux/amd64 distribution of Astro Runtime. Replace the first line in the Dockerfile of your Astro project with FROM --platform=linux/amd64 quay.io/astronomer/astro-runtime:<version>.
- Add the following environment variable to your .env file. Make sure the file path is the same volume location you configured in docker-compose.override.yml.

AZURE_CONFIG_DIR=/usr/local/airflow/.azure
When you run Airflow locally, all Azure connections without defined credentials automatically fall back to your user credentials when connecting to Azure. Airflow applies credentials to Azure connections in the following order, with later items overriding earlier ones:

- Mounted user credentials in ~/.azure.
- Configurations in azure_client_id, azure_tenant_id, and azure_client_secret.
- An explicit username and password provided in the connection.

For example, if you completed the configuration in this document and then created a new Azure connection with its own username and password, Airflow would use those credentials instead of the credentials in ~/.azure/config.
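To see this fallback in code inside the container, a task can authenticate with DefaultAzureCredential, which includes an Azure CLI credential backed by the mounted .azure folder (AZURE_CONFIG_DIR). This sketch is illustrative only and assumes azure-identity, which ships with the Microsoft Azure provider, plus the Azure CLI installed in your image as described above.

# Sketch only: DefaultAzureCredential falls back to AzureCliCredential, which
# reads the login state in AZURE_CONFIG_DIR (/usr/local/airflow/.azure here).
from azure.identity import DefaultAzureCredential

def check_azure_identity():
    credential = DefaultAzureCredential()
    token = credential.get_token("https://management.azure.com/.default")
    print(f"Authenticated with the mounted user credentials; token expires at {token.expires_on}")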
Test your credentials with a secrets backend
Now that Airflow has access to your user credentials, you can use them to connect to your cloud services. Use the following example setup to test your credentials by pulling values from different secrets backends.
- AWS
- GCP
- Azure
- Create a secret for an Airflow variable or connection in AWS Secrets Manager. All Airflow variable and connection keys must be prefixed with the following strings respectively:

airflow/variables/<my_variable_name>
airflow/connections/<my_connection_id>

For example, when adding the secret variable my_secret_var, you must give the secret the name airflow/variables/my_secret_var.

When setting the secret type, choose Other type of secret and select the Plaintext option. If you're creating a connection URI or a non-dict variable as a secret, remove the brackets and quotations that are pre-populated in the plaintext field. A standalone sketch for reading this secret outside Airflow follows these steps.
- Add the following environment variables to your Astro project .env file. For additional configuration options, see the Apache Airflow documentation. Make sure to specify your region_name.

AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "region_name": "<your-aws-region>"}

- Run the following command to start Airflow locally:

astro dev start
- Access the Airflow UI at localhost:8080 and create an Airflow AWS connection named aws_standard with no credentials. See Connections.

When you use this connection in your DAG, it will fall back to using your configured user credentials.
- Add a DAG which uses the secrets backend to your Astro project dags directory. You can use the following example DAG to retrieve <my_variable_name> and <my_connection_id> from the secrets backend and print them to the terminal:

from airflow.models.dag import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.decorators import task
from datetime import datetime

with DAG(
    'example_secrets_dag',
    start_date=datetime(2022, 1, 1),
    schedule=None
):
    @task
    def print_var():
        my_var = Variable.get("<my_variable_name>")
        print(f"My secret variable is: {my_var}")  # secrets will be masked in the logs!

        conn = BaseHook.get_connection(conn_id="<my_connection_id>")
        print(f"My secret connection is: {conn.get_uri()}")  # secrets will be masked in the logs!

    print_var()
- In the Airflow UI, unpause your DAG and click Play to trigger a DAG run.

- View logs for your DAG run. If the connection was successful, your masked secrets appear in your logs. See Airflow logging.
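If the DAG can't find your secret, the following standalone sketch reads the same secret directly with boto3, outside Airflow. It assumes your mounted user credentials allow secretsmanager:GetSecretValue and that you replace the placeholders with your region and secret name.

# Reads the secret exactly where the SecretsManagerBackend looks for it.
import boto3

client = boto3.client("secretsmanager", region_name="<your-aws-region>")
response = client.get_secret_value(SecretId="airflow/variables/<my_variable_name>")
print(response["SecretString"])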
- Create a secret for an Airflow variable or connection in GCP Secret Manager. You can do this using the Google Cloud Console or the gcloud CLI. All Airflow variable and connection keys must be prefixed with the following strings respectively:

airflow-variables-<my_variable_name>
airflow-connections-<my_connection_name>

For example, when adding the secret variable my_secret_var, you must give the secret the name airflow-variables-my_secret_var. A standalone sketch for reading this secret outside Airflow follows these steps.
- Add the following environment variables to your Astro project .env file. For additional configuration options, see the Apache Airflow documentation. Make sure to specify your project_id.

AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "project_id": "<my-project-id>"}

- Run the following command to start Airflow locally:

astro dev start
- Access the Airflow UI at localhost:8080 and create an Airflow GCP connection named gcp_standard with no credentials. See Connections.

When you use this connection in your DAG, it will fall back to using your configured user credentials.
- Add a DAG which uses the secrets backend to your Astro project dags directory. You can use the following example DAG to retrieve <my_variable_name> and <my_connection_name> from the secrets backend and print them to the terminal:

from airflow.models.dag import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.decorators import task
from datetime import datetime

with DAG(
    'example_secrets_dag',
    start_date=datetime(2022, 1, 1),
    schedule=None
):
    @task
    def print_var():
        my_var = Variable.get("<my_variable_name>")
        print(f"My secret variable is: {my_var}")

        conn = BaseHook.get_connection(conn_id="<my_connection_name>")
        print(f"My secret connection is: {conn.get_uri()}")

    print_var()
- In the Airflow UI, unpause your DAG and click Play to trigger a DAG run.

- View logs for your DAG run. If the connection was successful, your masked secrets appear in your logs. See Airflow logging.
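If the DAG can't find your secret, the following standalone sketch reads the same secret directly with the google-cloud-secret-manager client, outside Airflow. It assumes your user credentials have Secret Manager access and that you replace the placeholders with your project ID and secret name.

# Reads the latest version of the secret the CloudSecretManagerBackend resolves.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = "projects/<my-project-id>/secrets/airflow-variables-<my_variable_name>/versions/latest"
response = client.access_secret_version(name=name)
print(response.payload.data.decode("UTF-8"))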
- Create a secret for an Airflow variable or connection in Azure Key Vault. All Airflow variable and connection keys must be prefixed with the following strings respectively:

airflow-variables-<my_variable_name>
airflow-connections-<my_connection_name>

For example, to use a secret named mysecretvar in your DAG, you must name the secret airflow-variables-mysecretvar. You must store connections in URI format. A standalone sketch for reading this secret outside Airflow follows these steps.
- Add the following line to your Astro project requirements.txt file:

apache-airflow-providers-microsoft-azure
- Add the following environment variables to your Astro project .env file. For additional configuration options, see the Apache Airflow documentation. Make sure to specify your vault_url.

AIRFLOW__SECRETS__BACKEND=airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "vault_url": "<your-vault-url>"}

By default, this setup requires that you prefix any secret names in Key Vault with airflow-connections or airflow-variables. If you don't want to use prefixes in your Key Vault secret names, set the values for "connections_prefix" and "variables_prefix" to "" within AIRFLOW__SECRETS__BACKEND_KWARGS. You can find the vault_url on the overview page of your Key Vault under Vault URI.

- Run the following command to start Airflow locally:

astro dev start
- Access the Airflow UI at localhost:8080 and create an Airflow Azure connection named azure_standard with no credentials. See Connections.

When you use this connection in your DAG, it will fall back to using your configured user credentials.
- Add a DAG which uses the secrets backend to your Astro project dags directory. You can use the following example DAG to retrieve mysecretvar and mysecretconnection from the secrets backend and print them to the terminal:

from airflow.models.dag import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.decorators import task
from datetime import datetime

with DAG(
    'example_secrets_dag',
    start_date=datetime(2022, 1, 1),
    schedule=None
):
    @task
    def print_var():
        my_var = Variable.get("mysecretvar")
        print(f"My secret variable is: {my_var}")

        conn = BaseHook.get_connection(conn_id="mysecretconnection")
        print(f"My secret connection is: {conn.get_uri()}")

    print_var()
- In the Airflow UI, unpause your DAG and click Play to trigger a DAG run.

- View logs for your DAG run. If the connection was successful, your masked secrets appear in your logs. See Airflow logging.
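If the DAG can't find your secret, the following standalone sketch reads the same secret directly with the azure-keyvault-secrets client, outside Airflow. It assumes your user account has a Key Vault role that permits reading secrets and that you replace the placeholder with your Vault URI.

# Reads the secret the AzureKeyVaultBackend resolves, using your user credentials.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(vault_url="<your-vault-url>", credential=DefaultAzureCredential())
print(client.get_secret("airflow-variables-mysecretvar").value)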