Authorize an Astro Deployment to cloud resources using workload identity
When you create an Airflow connection from a Deployment to access cloud resources, Airflow uses your connection details to access those services. You can add credentials to your Airflow connections to authenticate, but it can be risky to add secrets like passwords to your Airflow environment.
To avoid adding secrets to your Airflow connection, you can directly authorize your Astro Deployment to access AWS or GCP cloud services using workload identity. Astronomer recommends using a workload identity in most cases to improve security and avoid managing credentials across your Deployments. If you have less strict security requirements, you can still use any of the methods described in Airflow connection guides to manage your connection authorization.
This guide explains how to authorize your Deployment to a cloud using workload identity. For each Deployment, you will:
- Authorize your Deployment to your cloud services.
- Create an Airflow connection to access your cloud services.
Watch the Astro Academy Customer Workload Managed Identity Learning Byte video to learn more about managed identities and how to set up passwordless authentication for GCP.
Prerequisites
The Astro cluster running your Deployment must be connected to your cloud's network. See Networking overview.
What is workload identity?
A workload identity is a Kubernetes service account that provides an identity to your Deployment. The Deployment can use this identity to authenticate to a cloud's API server, and the cloud can use this identity to authorize the Deployment to access different resources.
Setup
- AWS
- GCP
- Azure
Attach an IAM role to your Deployment
You can attach an AWS IAM role to your Deployment to grant the Deployment all of the role's permissions.
Using IAM roles provides the greatest amount of flexibility for authorizing Deployments to your cloud. For example, you can use existing IAM roles on new Deployments, or you can attach a single IAM role to multiple Deployments that all require the same level of access to your cloud.
Prerequisites
- Minimum Astro Runtime version:
- 9.15.0
- 10.9.0
- 11.5.0
- A new or existing IAM role in your data sources with the required permissions you want your Deployment to have.
- If using AWS CloudShell, the required CLIs are enabled by default.
- If you use a local terminal, the following CLIs are required:
Step 1: Authorize the Deployment to your IAM role
To authorize your Deployment, create an IAM role to assign as your Deployment's workload identity:
- Create an IAM role to delegate permissions to an AWS service. Grant the role any permissions that the Deployment will need in your AWS account. Copy the IAM role ARN to use later in this setup.
- In the Astro UI, select your Deployment and then click Details. In the Advanced section, click Edit.
- In the Workload Identity menu, select Customer Managed Identity.
- Enter your IAM role ARN when prompted, then copy and run the provided CLI command. Click Save Configuration to save the IAM role as a selectable configuration.
- Click Update Deployment to apply the selected IAM role to the Deployment.
- (Optional) Repeat these steps for each Astro Deployment that needs to access your AWS resources. Alternatively, to apply the Workload Identity to multiple Deployments, edit the <DeploymentNamespace> value in Condition as described in one of the following scenarios.
Specify Kubernetes service accounts
Available for both Standard and Dedicated clusters. If your organization does not allow you to use wildcards in your IAM Trust Policies, change the <DeploymentNamespace> value in Condition to specify the Kubernetes service accounts. The following shows an example:
{
  "Condition": {
    "StringLike": {
      "<clusterOIDCIssuerUrl>:aud": "sts.amazonaws.com",
      "<clusterOIDCIssuerUrl>:sub": [
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-kpo",
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-dag-processor-serviceaccount",
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-scheduler-serviceaccount",
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-triggerer-serviceaccount",
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-webserver-serviceaccount",
        "system:serviceaccount:<DeploymentNamespace>:<DeploymentNamespace>-worker-serviceaccount"
      ]
    }
  }
}
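The Condition above can also be generated rather than typed by hand. The following is a minimal sketch, not an Astronomer tool: the function name and the sample issuer and namespace values are hypothetical, and the service account suffixes are copied from the example above. It lists the subjects as a JSON array because a JSON object cannot repeat the same key.

```python
import json

# Service account suffixes copied from the example above.
SERVICE_ACCOUNT_SUFFIXES = [
    "kpo",
    "dag-processor-serviceaccount",
    "scheduler-serviceaccount",
    "triggerer-serviceaccount",
    "webserver-serviceaccount",
    "worker-serviceaccount",
]


def build_condition(oidc_issuer: str, namespace: str) -> dict:
    """Return a Condition block limited to one Deployment's service accounts."""
    subjects = [
        f"system:serviceaccount:{namespace}:{namespace}-{suffix}"
        for suffix in SERVICE_ACCOUNT_SUFFIXES
    ]
    return {
        "Condition": {
            "StringLike": {
                f"{oidc_issuer}:aud": "sts.amazonaws.com",
                f"{oidc_issuer}:sub": subjects,
            }
        }
    }


if __name__ == "__main__":
    # Both arguments are placeholders; use the values shown for your Deployment.
    condition = build_condition("oidc.example.com/id/EXAMPLE", "example-namespace-1234")
    print(json.dumps(condition, indent=2))
```

Paste the printed Condition into the trust policy statement for your IAM role after checking it against the command that the Astro UI provides.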
Dedicated clusters only: Share or re-use a managed identity using a wildcard
If you want to share or reuse the same customer managed identity across static or ephemeral Deployments on dedicated clusters, without updating the Trust Policy in your AWS account for every new Deployment, change the <DeploymentNamespace> value in Condition to include a wildcard. For security reasons, use a wildcard only on dedicated clusters. The following shows an example:
{
  "Condition": {
    "StringLike": {
      "<clusterOIDCIssuerUrl>:aud": "sts.amazonaws.com",
      "<clusterOIDCIssuerUrl>:sub": "system:serviceaccount:*:*"
    }
  }
}
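To see why the wildcard is broad, it can help to check which subjects the pattern covers. The sketch below approximates the StringLike wildcard with Python's `fnmatchcase`. This is only an approximation for illustration (`fnmatch` also gives `[` and `]` special meaning, which IAM does not), and the function name and sample subjects are hypothetical.

```python
from fnmatch import fnmatchcase

# Wildcard subject pattern from the example above.
WILDCARD_SUBJECT = "system:serviceaccount:*:*"


def is_covered(subject: str, pattern: str = WILDCARD_SUBJECT) -> bool:
    """Approximate whether a token subject would match the wildcard pattern."""
    return fnmatchcase(subject, pattern)


if __name__ == "__main__":
    # Any service account in any namespace on the cluster matches the pattern.
    print(is_covered("system:serviceaccount:ns-a:ns-a-worker-serviceaccount"))
    print(is_covered("system:serviceaccount:ns-b:ns-b-kpo"))
```

Because every service account in every namespace on the cluster matches, the pattern is only appropriate where you control the whole cluster.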
Step 2: Create an Airflow connection
Now that your Deployment is authorized, you can connect it to your cloud using an Airflow connection. Create an Amazon Web Services connection in either the Astro UI or the Airflow UI for your Deployment and specify the following fields:
- Connection Id: Enter a name for the connection.
If you don't see Amazon Web Services as a connection type in the Airflow UI, ensure you have installed its provider package in your Astro project's requirements.txt file. See Use Provider in the Astronomer Registry for the latest package.
If you use a mix of strategies for managing connections and define the same connection in multiple ways, Airflow uses the following order of precedence:
- Secrets Backend
- Environment Manager
- Environment Variables
- Airflow UI using the Airflow metadata database
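The order of precedence above can be read as a first-match lookup. The following sketch only illustrates that reading and is not how Airflow is implemented internally; the function name and the source labels are chosen for this example.

```python
from typing import Iterable, Optional

# Sources in the order of precedence listed above, highest first.
PRECEDENCE = [
    "Secrets Backend",
    "Environment Manager",
    "Environment Variables",
    "Airflow UI",
]


def winning_source(defined_in: Iterable[str]) -> Optional[str]:
    """Return the source whose definition of a connection takes effect."""
    defined = set(defined_in)
    for source in PRECEDENCE:
        if source in defined:
            return source
    return None


if __name__ == "__main__":
    # A connection defined both in the Airflow UI and as an environment variable.
    print(winning_source(["Airflow UI", "Environment Variables"]))
```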
Alternative setup: Authorize your Deployment with AWS IAM roles
Step 1: Authorize the Deployment in your cloud
To grant a Deployment access to a service that is running in an AWS account not managed by Astronomer, use AWS IAM roles to authorize your Deployment's workload identity. IAM roles on AWS are often used to manage the level of access a specific user, object, or group of users has to a resource, such as Amazon S3 buckets, Redshift instances, and secrets backends.
To authorize your Deployment, create an IAM role that is assumed by the Deployment's workload identity:
- In the Astro UI, select your Deployment and then click Details. Copy the Deployment's Workload Identity.
- In the AWS account that contains your AWS service, create an IAM role. See Creating a role to delegate permissions to an AWS service.
- In the AWS Management Console, go to the Identity and Access Management (IAM) dashboard.
- Click Roles and in the Role name column, select the role you created.
- Click Trust relationships.
- Click Edit trust policy and paste the workload identity you copied from the Astro UI in the trust policy. Your policy should look like the following:
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "AWS": ["<workload-identity-role>"]
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
- Click Update policy.
Repeat these steps for each Astro Deployment that needs to access your AWS resources.
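If several Deployments need the same role, the trust policy can be assembled programmatically. The sketch below builds the policy document shown above for one or more workload identities. The function name and the sample ARNs are hypothetical, and listing several principals in one policy is an assumption about how you choose to organize your roles.

```python
import json


def build_trust_policy(workload_identities: list) -> dict:
    """Return a trust policy that lets the given workload identities assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # De-duplicate and sort so the generated policy is stable.
                "Principal": {"AWS": sorted(set(workload_identities))},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs; copy the real Workload Identity values from the Astro UI.
    policy = build_trust_policy(
        [
            "arn:aws:iam::111111111111:role/example-deployment-a",
            "arn:aws:iam::111111111111:role/example-deployment-b",
        ]
    )
    print(json.dumps(policy, indent=2))
```

You can paste the printed JSON into the Edit trust policy screen. Applying it through an API call, such as boto3's `update_assume_role_policy`, is left out of this sketch.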
Step 2: Create an Airflow connection
Now that your Deployment is authorized, you can connect it to your cloud using an Airflow connection. Create an Amazon Web Services connection in either the Astro UI or the Airflow UI for your Deployment and specify the following fields:
- Connection Id: Enter a name for the connection.
- Extra:
  {
    "role_arn": "<your-role-arn>",
    "region_name": "<your-region>"
  }
If you don't see Amazon Web Services as a connection type in the Airflow UI, ensure you have installed its provider package in your Astro project's requirements.txt file. See Use Provider in the Astronomer Registry for the latest package.
If you use a mix of strategies for managing connections and define the same connection in multiple ways, Airflow uses the following order of precedence:
- Secrets Backend
- Environment Manager
- Environment Variables
- Airflow UI using the Airflow metadata database
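If you manage this connection as an environment variable instead of through a UI, the same fields can be expressed in Airflow's JSON connection format. The sketch below is one possible way to do that, assuming a version of Airflow that accepts JSON-formatted `AIRFLOW_CONN_*` variables. The function name, connection ID, ARN, and region are placeholders.

```python
import json


def aws_connection_env(conn_id: str, role_arn: str, region_name: str) -> tuple:
    """Return the (name, value) pair of an environment variable defining the connection."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    value = json.dumps(
        {
            "conn_type": "aws",
            # Same Extra fields as the connection form above.
            "extra": {"role_arn": role_arn, "region_name": region_name},
        }
    )
    return name, value


if __name__ == "__main__":
    name, value = aws_connection_env(
        "aws_workload_identity",  # hypothetical connection ID
        "arn:aws:iam::111111111111:role/example-role",  # placeholder ARN
        "us-east-1",  # placeholder region
    )
    print(f"{name}={value}")
```

No access key or secret appears in the value, because the Deployment's workload identity assumes the role.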
Attach a service account to your Deployment
You can attach a custom GCP service account to your Deployment to grant the Deployment all of the service account's permissions.
Using service accounts provides the greatest amount of flexibility for authorizing Deployments to your cloud. For example, you can use existing service accounts on new Deployments, or you can attach a single service account to multiple Deployments that all have the same level of access to your cloud.
- Create a service account in the GCP project that you want your Deployment to access. Grant the service account any permissions that the Deployment will need in your GCP project. Copy the service account ID to use later in this setup.
- In the Astro UI, select your Deployment, then click Details. In the Advanced section, click Edit.
- In the Workload Identity menu, select Customer Managed Identity.
- Enter your GCP service account ID when prompted, then copy and run the provided gcloud CLI command.
- Click Update Deployment. The service account is now selectable as a workload identity for the Deployment.
- Complete one of the following options for your Deployment to access your cloud resources:
  - Create a Google Cloud connection type in Airflow and configure the following values:
    - Connection Id: Enter a name for the connection.
    - Impersonation Chain: Enter the ID of the service account that your Deployment should impersonate.
  - To access resources in a secrets backend, run the following command to create an environment variable that grants access to the secrets backend:
    astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "project_id": "<your-secret-manager-project-id>", "impersonation_chain": "<your-gcp-service-account>"}'
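The value of AIRFLOW__SECRETS__BACKEND_KWARGS is JSON, which is easy to break with shell quoting. The sketch below builds the same command with the value serialized by `json.dumps` and quoted by `shlex.quote`. The function name and the sample IDs are hypothetical, and the command layout is copied from the command above.

```python
import json
import shlex


def backend_kwargs_command(deployment_id: str, project_id: str, service_account: str) -> str:
    """Return a shell-safe command that sets the secrets backend kwargs."""
    kwargs = {
        "connections_prefix": "airflow-connections",
        "variables_prefix": "airflow-variables",
        "project_id": project_id,
        "impersonation_chain": service_account,
    }
    assignment = "AIRFLOW__SECRETS__BACKEND_KWARGS=" + json.dumps(kwargs)
    parts = [
        "astro", "deployment", "variable", "create",
        "--deployment-id", shlex.quote(deployment_id),
        # Quote the whole KEY=VALUE pair so the shell passes the JSON through intact.
        shlex.quote(assignment),
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(
        backend_kwargs_command(
            "example-deployment-id",  # placeholder
            "example-secret-manager-project",  # placeholder
            "example-sa@example-project.iam.gserviceaccount.com",  # placeholder
        )
    )
```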
Alternative setup: Authorize your Deployment through GCP service account impersonation
If your organization has requirements over how service accounts are managed outside of your cloud, you can manually configure GCP service account impersonation to allow your Deployment's default workload identity to impersonate a service account in your GCP project.
- Create a service account in the GCP project that you want your Deployment to access. Grant the service account any permissions that the Deployment will need in your GCP project. Copy the service account ID to use later in this setup.
- In the Astro UI, select your Deployment, then click Details. Copy the Deployment's Workload Identity.
- In the Google Cloud Console, open the IAM & Admin > Service Accounts menu, then open the service account you just created.
- In the Actions column, click Manage Permissions, then click Grant Access. In the modal that appears, enter your Deployment's workload identity service account in the Add Principals field and select the Service Account Token Creator role in the Assign Roles field.
- Complete one of the following options for your Deployment to access your cloud resources:
  - Create a Google Cloud connection type in Airflow and configure the following values:
    - Connection Id: Enter a name for the connection.
    - Impersonation Chain: Enter the ID of the service account that your Deployment should impersonate.
    Note that this implementation requires apache-airflow-providers-google >= 10.8.0. See Add Python, OS-level packages, and Airflow providers.
  - Specify the impersonation chain in code when you instantiate a Google Cloud operator. See Airflow documentation. Note that if you configure both a connection type and an operator, the operator-level configuration takes precedence.
  - To access resources in a secrets backend, run the following command to create an environment variable that grants access to the secrets backend:
    astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "project_id": "<your-secret-manager-project-id>", "impersonation_chain": "<your-gcp-service-account>"}'
Alternative setup: Grant an IAM role to your Deployment workload identity
Complete this alternative setup if you don't have an existing Google service account that your Deployment workload identity can impersonate.
Step 1: Authorize the Deployment in your cloud
To grant a Deployment access to a service that is running in a GCP account not managed by Astronomer, use your Deployment's workload identity. Workload identity is a service account in GCP that's used to manage the level of access for a specific user, object, or group of users to a resource, such as Google BigQuery or a GCS bucket.
To authorize your Deployment, grant the required access to your Deployment's workload identity:
- In the Astro UI, select your Deployment, then click Details. In the Workload Identity dropdown menu, select Default Identity. Then, copy the workload identity that appears next to the dropdown menu.
- Grant your Deployment's workload identity an IAM role that has access to your external data service. To do this with the Google Cloud CLI, run:
  gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT --member=serviceAccount:<workload-identity> --role=roles/viewer
  To grant your workload identity an IAM role using the Google Cloud console, see Grant an IAM role.
Repeat these steps for each Deployment that needs to access your GCP resources.
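When several Deployments need access, the gcloud command above has to be repeated once per workload identity. The following sketch generates those commands so you can review them before running them. The function name, the project ID, and the sample identities are hypothetical, and the default role is the one used in the command above.

```python
import shlex


def grant_role_commands(project: str, workload_identities: list, role: str = "roles/viewer") -> list:
    """Return one gcloud command per workload identity, quoted for the shell."""
    return [
        shlex.join(
            [
                "gcloud", "projects", "add-iam-policy-binding", project,
                f"--member=serviceAccount:{identity}",
                f"--role={role}",
            ]
        )
        for identity in workload_identities
    ]


if __name__ == "__main__":
    # Placeholder identities; copy the real ones from each Deployment's Details page.
    for command in grant_role_commands(
        "example-project",
        [
            "deployment-a@example-project.iam.gserviceaccount.com",
            "deployment-b@example-project.iam.gserviceaccount.com",
        ],
    ):
        print(command)
```

The sketch only prints the commands. Replace roles/viewer with the narrowest role that covers what the Deployment needs.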
Step 2: Create an Airflow connection
Now that your Deployment is authorized, you can connect it to your cloud using an Airflow connection. Create a Google Cloud connection in either the Astro UI or the Airflow UI for your Deployment and specify the following fields:
- Connection Id: Enter a name for the connection.
- Project Id: Enter the ID of your Google Cloud Project where your services are running.
If you don't see Google Cloud as a connection type in the Airflow UI, ensure you have installed its provider package in your Astro project's requirements.txt file. See Use Provider in the Astronomer Registry for the latest package.
If you use a mix of strategies for managing connections and define the same connection in multiple ways, Airflow uses the following order of precedence:
- Secrets Backend
- Environment Manager
- Environment Variables
- Airflow UI using the Airflow metadata database
In this setup, you'll authorize an existing user-assigned managed identity to a resource on Azure, then give permissions to your Deployment to assume that managed identity.
Prerequisites
- A Microsoft Entra ID tenant with Global Administrator or Application Administrator privileges.
- A user-assigned managed identity on Azure. See Azure documentation.
- The Azure CLI.
You can use the same user-assigned managed identity for up to four Deployments only. If you need to authorize more than four Deployments to Azure, create additional user-assigned managed identities. For more information, see Microsoft Entra documentation.
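The four-Deployment limit determines how many managed identities you need. The sketch below works through that arithmetic and groups Deployments per identity. The function names and sample Deployment names are hypothetical, and the limit of four is taken from the note above.

```python
import math

# Limit stated in the note above.
MAX_DEPLOYMENTS_PER_IDENTITY = 4


def identities_needed(deployment_count: int) -> int:
    """Return how many user-assigned managed identities cover the Deployments."""
    return math.ceil(deployment_count / MAX_DEPLOYMENTS_PER_IDENTITY)


def group_deployments(deployments: list) -> list:
    """Split Deployments into groups that can each share one managed identity."""
    size = MAX_DEPLOYMENTS_PER_IDENTITY
    return [deployments[i:i + size] for i in range(0, len(deployments), size)]


if __name__ == "__main__":
    names = [f"deployment-{n}" for n in range(1, 10)]  # nine sample Deployments
    print(identities_needed(len(names)))
    for group in group_deployments(names):
        print(group)
```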
Step 1: Authorize the managed identity in Azure
- In your Azure portal, open the resource that your managed identity needs access to. Then, select Access control (IAM).
- Click Add > Add role assignment.
- Select the role for your managed identity, then click Next.
- In the Assign access to section, select Managed identity. Click + Select Members and choose your managed identity. After you add your managed identity, click Next.
- Review and finalize the assignment.
Step 2: Configure your Deployment
- In your Azure portal, open the Managed Identities menu.
- Search for your managed identity, click Properties, then copy its Name, Client ID, Tenant ID, and Resource group name.
- In the Astro UI, select your Deployment, click Details, then click How to Configure... under Workload Identity.
- In Managed Identity, enter the Name of the managed identity you assigned to the resource.
- In Resource Group, enter the Resource group name that your managed identity belongs to.
- Using the Azure CLI, copy and run the provided command in your local terminal.
- After the command completes, click Close on the modal in the Astro UI.
- (Optional) Repeat the preceding Astro UI steps for any other Deployments that need to be authorized to Azure.
Step 3: Create an Airflow connection
- In the Astro UI, click Environment in the main menu to open the Connections page.
- Click + Connection to add a new connection for your Workspace.
- Search for Azure, then select the Managed identity option.
- Configure your Airflow connection with the information you copied in the previous steps.
- Link the connection to the Deployment(s) where you configured your managed identity.
Any DAG that uses your connection will now be authorized to Azure through your managed identity.
If you use a mix of strategies for managing connections and define the same connection in multiple ways, Airflow uses the following order of precedence:
- Secrets Backend
- Environment Manager
- Environment Variables
- Airflow UI using the Airflow metadata database