Ask Astro: an end-to-end LLM-Chatbot reference architecture
Ask Astro is a free and open-source chatbot for answering questions about Apache Airflow® and Astronomer. It is built on Apache Airflow, which orchestrates data ingestion, embedding, and loading into a vector database, as well as a feedback loop. The full source code is available on GitHub.
Ask Astro serves a dual purpose:
- Resource for the Community: Ask Astro is automatically updated with the latest information about Apache Airflow and Astronomer, serving as a free resource for the community to answer Airflow-related questions.
- Learning Tool: Ask Astro is a reference architecture for building a chatbot with access to specialized information on top of Apache Airflow. It demonstrates how to use Airflow to orchestrate a generative AI application based on retrieval-augmented generation (RAG). To adapt Ask Astro to your use case, you can replace the sources of information with those about your own domain and adjust the prompts to fit your needs.
Architecture
Ask Astro consists of three main components:
- Data ingestion and embedding: Information about Apache Airflow and Astronomer is ingested from a large number of sources including the Airflow, Cosmos, and Astronomer documentation, and the Astronomer blog. This information is then embedded using OpenAI's models and stored in Weaviate, a vector database.
- Prompt generation: When a user asks a question, an enhanced RAG pattern is used to get the most accurate answer. First, an OpenAI model rewords the question to create three prompts, which are used to query Weaviate for related information. Next, a Cohere re-ranking model ranks the retrieved results, which are then checked for relevancy by another OpenAI model. Finally, an OpenAI model generates the final prompt and answer.
- Feedback loop: Users can provide feedback on the answers they receive, which is used to improve the quality of the answers in the future. The question, answer, and feedback are ingested back into Weaviate with a second Airflow pipeline.
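The prompt-generation flow described above (reword, retrieve, re-rank, filter) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Ask Astro implementation: every function here (reword_question, search, rerank, is_relevant, build_prompt) is a hypothetical stand-in, the model and Weaviate calls are replaced by simple word-overlap scoring over a toy in-memory corpus, and the sketch stops at assembling the prompt rather than generating an answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str


# Toy in-memory corpus standing in for the Weaviate collection (assumption).
CORPUS = [
    Doc("retries", "Airflow tasks can retry automatically after a delay."),
    Doc("mapping", "Dynamic task mapping creates tasks at runtime."),
    Doc("blog", "Astronomer publishes a blog about data orchestration."),
]


def tokens(text: str) -> set[str]:
    """Lowercase word set with basic punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def reword_question(question: str) -> list[str]:
    """Stand-in for the OpenAI rewording step: returns three query variants."""
    return [question, f"Airflow {question}", f"How to: {question}"]


def search(query: str, limit: int = 2) -> list[Doc]:
    """Stand-in for a vector search: ranks documents by shared-word count."""
    q = tokens(query)
    scored = [(len(q & tokens(d.text)), d) for d in CORPUS]
    hits = [d for score, d in sorted(scored, key=lambda p: -p[0]) if score > 0]
    return hits[:limit]


def rerank(question: str, docs: list[Doc]) -> list[Doc]:
    """Stand-in for the re-ranking model: orders by overlap with the question."""
    q = tokens(question)
    return sorted(docs, key=lambda d: -len(q & tokens(d.text)))


def is_relevant(question: str, doc: Doc) -> bool:
    """Stand-in for the relevancy check: requires at least two shared words."""
    return len(tokens(question) & tokens(doc.text)) >= 2


def build_prompt(question: str) -> str:
    candidates: dict[str, Doc] = {}
    for query in reword_question(question):
        for doc in search(query):
            # De-duplicate documents returned by more than one query.
            candidates.setdefault(doc.doc_id, doc)
    ranked = rerank(question, list(candidates.values()))
    context = [d.text for d in ranked if is_relevant(question, d)]
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"


prompt = build_prompt("How do tasks retry after a delay?")
print(prompt)
```

Querying with several rewordings widens recall, while the re-rank and relevancy steps narrow the merged candidates back down before they reach the final prompt.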
Airflow features
The DAGs that power Ask Astro highlight several key Airflow best practices and features:
- Airflow retries: To protect against transient API failures and rate limits, all tasks are configured to automatically retry after an adjustable delay.
- Notifications: If a DAG fails, a Slack notification is automatically sent to the ML and AI team using an on_failure_callback at the DAG level.
- Dynamic task mapping: Text processing and ingestion into Weaviate are split into multiple parallelized tasks, the number of which is determined at runtime based on the number of documents that need to be processed.
- Scheduling: All Ask Astro DAGs run on a time-based schedule to regularly and automatically update Weaviate with the newest information. Aside from time-based scheduling, Airflow offers advanced options such as data-driven scheduling or scheduling based on external events detected using sensors.
- Modularization: Functions defining how information is extracted and processed are modularized in the include folder and imported into the DAGs. This makes the DAG code more readable and offers the ability to reuse functions across multiple DAGs.
Next steps
Get the Astronomer GenAI cookbook to view more examples of how to use Airflow to build generative AI applications.
If you'd like to build your own chatbot, feel free to fork the Ask Astro repository and adapt it to your use case. We recommend deploying the Airflow pipelines using a free trial of Astro.