A LangGraph-based RAG (Retrieval-Augmented Generation) system for the Foundation Model Benchmarking Tool (FMBench). This repository showcases a design pattern for building and deploying LangGraph agents, progressing from local development to serverless deployment. The assistant helps users understand and work with FMBench, a Python package for running performance benchmarks for Foundation Models (FMs) deployed on AWS generative AI services.
We use the following tools and technologies:

- Ingest the FMBench documentation data into a local FAISS index.
- Amazon Bedrock for LLMs; Amazon API Gateway and AWS Lambda for hosting.
- LangGraph for agents and LangChain for RAG.
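To make the pattern concrete, here is a minimal sketch of the core agent: a FAISS retriever exposed as a tool to a LangGraph ReAct agent backed by Amazon Bedrock. The index path matches the one used later in this README; the Bedrock model IDs are illustrative assumptions:

```python
# Minimal sketch: FAISS retriever as a tool for a LangGraph ReAct agent.
# Model IDs are illustrative; swap in the ones enabled in your account.
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_community.vectorstores import FAISS
from langchain.tools.retriever import create_retriever_tool
from langgraph.prebuilt import create_react_agent

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
vectorstore = FAISS.load_local(
    "indexes/fmbench_index", embeddings, allow_dangerous_deserialization=True
)
retriever_tool = create_retriever_tool(
    vectorstore.as_retriever(search_kwargs={"k": 4}),
    name="fmbench_docs",
    description="Search the FMBench documentation.",
)

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")
agent = create_react_agent(llm, [retriever_tool])

result = agent.invoke({"messages": [("user", "What is FMBench?")]})
print(result["messages"][-1].content)
```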
The overall architecture, from agent building to AWS deployment:

```mermaid
graph LR
    %% Agent Building section
    A1["LangGraph"] --> D["FastAPI"]
    A2["Streamlit"] --> D
    A3["Amazon Bedrock"] --> D

    %% API & Packaging section
    D --> E["Docker Container"]

    %% AWS Deployment section
    E --> F["AWS Services"]
    F --> G1["AWS Lambda & Amazon API Gateway"]

    %% Subgraph definitions
    subgraph "Agent Building"
        A1
        A2
        A3
    end
    subgraph "API & Packaging"
        D
        E
    end
    subgraph "AWS Deployment"
        F
        G1
    end

    %% Styling
    classDef dev fill:#d1f0ff,stroke:#0077b6
    classDef mid fill:#ffe8d1,stroke:#b66300
    classDef aws fill:#ffd1e8,stroke:#b6007a
    class A1,A2,A3 dev
    class D,E mid
    class F,G1 aws
```
This project demonstrates a complete workflow for developing and deploying AI agents:
- Local Development: Build and test the agent locally
- FastAPI Server: Convert the agent to a FastAPI application
- Docker Containerization: Package the application in a Docker container
- AWS Lambda Deployment: Deploy the containerized application to AWS Lambda with API Gateway
Key components include:

- RAG System: Uses LangChain, FAISS, and Amazon Bedrock to provide information about the Foundation Model Benchmarking Tool (FMBench)
- LangGraph Agent: ReAct agent pattern with tools for retrieving FMBench information
- Streamlit Frontend: User-friendly chat interface for interacting with the agent
- FastAPI Backend: Serves the agent via HTTP endpoints
- AWS Lambda Integration: Serverless deployment with API Gateway
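As a rough sketch of how the FastAPI backend and Lambda integration fit together: a `/generate` route (matching the URLs used later in this README) served by FastAPI, adapted for Lambda with Mangum. The request schema and `agent_answer` helper are hypothetical; the repo's actual `app/server.py` may differ:

```python
# Sketch of the FastAPI backend pattern, adapted for AWS Lambda via Mangum.
# The /generate route and request schema are assumptions, not the repo's
# confirmed contract.
from fastapi import FastAPI
from mangum import Mangum
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    question: str
    session_id: str | None = None  # for conversation memory, if supported

@app.post("/generate")
def generate(req: GenerateRequest) -> dict:
    # In the real server this would invoke the LangGraph agent.
    answer = agent_answer(req.question, req.session_id)  # hypothetical helper
    return {"answer": answer}

# Lambda entry point: API Gateway events are translated to ASGI calls.
handler = Mangum(app)
```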
- Data Collection:
  - Process FMBench documentation data and save it as `documents_1.json`.
  - Place the processed data in the `data` folder.
- Index Building:
  - Run `build_index.py` to create the FAISS vector index.
- Local Testing:
  - Run the FastAPI server with `langchain serve`.
  - Test the API endpoints with the local webserver.
  - Test the user interface with Streamlit.
- Deployment:
  - Run `python deploy.py` to deploy to AWS Lambda and API Gateway.
  - Test the deployed application by running Streamlit with the API Gateway endpoint.
The end-to-end workflow, from data collection to deployment:

```mermaid
graph TD
    A[Crawl data with firecrawl.dev] --> B[Place data in data folder]
    B --> C[Run build_index.py]
    C --> D[Run langchain server]
    D --> E[Test with FastAPI/uvicorn local webserver]
    E --> F[Test with Streamlit]
    F --> G[Run python deploy.py]
    G --> H[Deploy to API Gateway and Lambda]
    H --> I[Run Streamlit with API Gateway endpoint]

    classDef data fill:#d1f0ff,stroke:#0077b6
    classDef local fill:#ffe8d1,stroke:#b66300
    classDef deploy fill:#ffd1e8,stroke:#b6007a
    class A,B,C data
    class D,E,F local
    class G,H,I deploy
```
- Python 3.11+
- `uv` for Python package management
- Docker (for containerization)
- AWS CLI configured with appropriate permissions
- AWS account with access to Bedrock and Lambda services
```bash
git clone https://github.com./yourusername/fmbench-assistant
cd fmbench-assistant
```
Create a `.env` file in the project root with your AWS credentials and configuration:

```bash
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
```
This project uses `uv` for Python package management:

```bash
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"

# Create a virtual environment and install dependencies
uv venv --python 3.11 && source .venv/bin/activate && uv pip install --requirement pyproject.toml
```
Before running the application, you need to build the vector index from the source documents:

```bash
python build_index.py
```

This will create a FAISS index in the `indexes/fmbench_index` directory.
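For reference, here is a plausible sketch of what an index-building script like `build_index.py` does: load the crawled documents, chunk them, embed with Bedrock, and persist a FAISS index. The `text` and `url` field names in `documents_1.json` are assumptions; check the actual file:

```python
# Plausible sketch of index building; field names and model ID are assumptions.
import json

from langchain_aws import BedrockEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

with open("data/documents_1.json") as f:
    raw_docs = json.load(f)

# Wrap each crawled page as a LangChain Document, keeping the source URL.
docs = [
    Document(page_content=d["text"], metadata={"source": d.get("url", "")})
    for d in raw_docs
]

# Split into overlapping chunks so retrieval returns focused passages.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
FAISS.from_documents(chunks, embeddings).save_local("indexes/fmbench_index")
```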
Start the FastAPI server locally:

```bash
langchain serve
```

Then, in a separate terminal, launch the Streamlit frontend against the local API:

```bash
streamlit run chatbot.py -- --api-server-url http://localhost:8000/generate
```
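You can also smoke-test the local endpoint directly, assuming `/generate` accepts a JSON body with a `question` field (a hypothetical schema; check `app/server.py` for the actual contract):

```python
# Quick smoke test of the local API; the request schema is an assumption.
import requests

resp = requests.post(
    "http://localhost:8000/generate",
    json={"question": "Which AWS services can FMBench benchmark?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```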
LangSmith helps trace, monitor, and debug LangChain applications. You can sign up for LangSmith here. If you don't have access, you can skip this section.

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # if not specified, defaults to "default"
```
The repository includes a script to build and push the Docker image to Amazon ECR:

```bash
chmod +x build_and_push.sh
./build_and_push.sh
```
Use the deployment script to create or update the Lambda function and API Gateway:

```bash
python deploy.py --function-name fmbench-assistant --role-arn YOUR_LAMBDA_ROLE_ARN --api-gateway
```
If you want to use Amazon Bedrock in a cross-account way, i.e., the Lambda exists in, say, Account A but you want to use Amazon Bedrock in Account B, then use the following command line:

```bash
python deploy.py --function-name fmbench-assistant --role-arn YOUR_LAMBDA_ROLE_ARN --bedrock-role-arn YOUR_ACCOUNT_B_BEDROCK_ROLE_ARN --api-gateway
```
The IAM role you use for the AWS Lambda needs Amazon Bedrock access (for example, via `AmazonBedrockFullAccess`) to use the models available via Amazon Bedrock, and the models need to be enabled within your AWS account; see the instructions available here.
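Under the hood, cross-account access typically means assuming the Account B role via STS and building the Bedrock client from the temporary credentials. A minimal sketch, with a placeholder role ARN:

```python
# Cross-account pattern: the Lambda in Account A assumes a role in Account B,
# then talks to Bedrock with the temporary credentials. The role ARN below
# is a placeholder for whatever you pass as --bedrock-role-arn.
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::ACCOUNT_B_ID:role/bedrock-access-role",  # placeholder
    RoleSessionName="fmbench-assistant",
)["Credentials"]

bedrock = boto3.client(
    "bedrock-runtime",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```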
This will:
- Create/update a Lambda function using the Docker image
- Set up an API Gateway with appropriate routes
- Configure permissions and API keys
- Output the deployed API URL
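For orientation, the create-or-update step can be approximated with boto3 as below; function name, image URI, and sizing are illustrative, and `deploy.py` remains the source of truth:

```python
# Rough sketch of the Lambda-from-container-image step that deploy.py automates.
import boto3

IMAGE_URI = "ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/fmbench-assistant:latest"

lam = boto3.client("lambda")
try:
    # First deployment: create the function from the container image.
    lam.create_function(
        FunctionName="fmbench-assistant",
        PackageType="Image",
        Code={"ImageUri": IMAGE_URI},
        Role="YOUR_LAMBDA_ROLE_ARN",
        Timeout=300,
        MemorySize=2048,
    )
except lam.exceptions.ResourceConflictException:
    # Function already exists: point it at the new image instead.
    lam.update_function_code(FunctionName="fmbench-assistant", ImageUri=IMAGE_URI)
```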
Once deployed, you can connect the Streamlit frontend to the deployed API:

```bash
streamlit run chatbot.py -- --api-server-url https://YOUR_API_ID.execute-api.us-east-1.amazonaws.com/prod/generate
```
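You can also hit the deployed endpoint directly for a smoke test. The sketch below assumes the stage requires an API key in the `x-api-key` header (the deployment script configures API keys) and the same hypothetical `question` schema as before:

```python
# Smoke test against the deployed API Gateway endpoint; header and schema
# are assumptions based on the deployment description above.
import requests

resp = requests.post(
    "https://YOUR_API_ID.execute-api.us-east-1.amazonaws.com/prod/generate",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={"question": "How do I run an FMBench benchmark?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```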
```text
fmbench-assistant/
├── app/                   # FastAPI application
│   ├── __init__.py
│   └── server.py          # FastAPI server implementation
├── data/                  # Source data
│   └── documents_1.json   # FMBench documentation data
├── indexes/               # Vector indexes
│   └── fmbench_index/     # FAISS index for FMBench data
├── .env                   # Environment variables (not in repo)
├── .gitignore             # Git ignore file
├── build_and_push.sh      # Script to build and push Docker image
├── build_index.py         # Script to build vector index
├── chatbot.py             # Streamlit frontend
├── deploy.py              # AWS Lambda deployment script
├── Dockerfile             # Docker configuration
├── dsan_rag_setup.py      # RAG system setup
├── pyproject.toml         # Project configuration
├── README.md              # Project documentation
└── requirements.txt       # Python dependencies
```
- Conversation Memory: Maintains chat history for contextual responses
- Vector Search: FAISS-based retrieval for efficient document search
- Amazon Bedrock Integration: Leverages foundation models available via Amazon Bedrock
- Cross-Account Access: Supports cross-account access to AWS Bedrock
- Streamlit UI: User-friendly interface for interacting with the agent
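For the conversation-memory feature, one common LangGraph approach is a checkpointer keyed by `thread_id`. A minimal sketch; the repo may manage history differently:

```python
# Conversation memory via a LangGraph checkpointer; sketch only, with an
# illustrative model ID and no tools attached.
from langchain_aws import ChatBedrock
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")
agent = create_react_agent(llm, tools=[], checkpointer=MemorySaver())

# All turns sharing a thread_id see the same chat history.
config = {"configurable": {"thread_id": "user-123"}}
agent.invoke({"messages": [("user", "What is FMBench?")]}, config)
# The follow-up reuses the thread, so the first question stays in context.
agent.invoke({"messages": [("user", "Which services does it support?")]}, config)
```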
See CONTRIBUTING.md for contribution guidelines.