Problem: Your LLM Can't See Your Private AWS Data
MCP AWS Knowledge Bases integration lets Claude and other AI agents query your private Amazon Bedrock Knowledge Bases directly — without writing custom Boto3 glue code, managing auth tokens, or building a separate retrieval pipeline.
If you've already embedded your internal docs, runbooks, or product data into an S3-backed Bedrock Knowledge Base, this MCP server is the missing bridge to expose that data to any MCP-compatible AI client.
You'll learn:
- How to tag and prepare an existing Bedrock Knowledge Base for MCP auto-discovery
- How to install and configure awslabs.bedrock-kb-retrieval-mcp-server with correct IAM permissions
- How to enable optional reranking for higher-quality retrieval results
- How to verify queries return grounded, cited responses
Time: 25 min | Difficulty: Intermediate
Why This Happens
Standard Bedrock RAG requires calling RetrieveAndGenerate or Retrieve APIs directly, handling auth, parsing nested JSON, and wiring results back into your agent prompt manually. Every new AI tool means a new custom integration.
MCP solves this with a single standardized server. Once bedrock-kb-retrieval-mcp-server is running, any MCP-compatible client — Cursor, Claude Code, Kiro, VS Code, or a custom agent — can query your private knowledge bases using natural language with zero per-client code.
Symptoms you need this:
- Your agent hallucinates because it can't access internal docs
- You're copy-pasting runbooks or AWS docs into system prompts as context
- You have a Bedrock Knowledge Base sitting idle because wiring it to your AI workflow is too much friction
Architecture Overview
AI Client (Cursor / Claude Code / custom agent)
│
│ MCP (stdio / HTTP, JSON-RPC 2.0)
▼
bedrock-kb-retrieval-mcp-server
│
│ BedrockAgentRuntime API (Retrieve)
▼
Amazon Bedrock Knowledge Base
│
│ Vector search
▼
S3 Bucket → Chunked docs → OpenSearch Serverless (or Aurora pgvector)
The MCP server abstracts the entire bottom half. Your agent issues a tool_call with a natural language query and gets back cited passages — no RAG plumbing in your agent code.
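To make "cited passages" concrete, here is a rough sketch of the flattening the server does for you, assuming the standard BedrockAgentRuntime Retrieve response shape. The function name and the sample response are illustrative, not the server's actual code or real data:

```python
# Sketch: flatten a Bedrock Retrieve response into cited passages.
# Response shape follows the BedrockAgentRuntime Retrieve API docs.

def to_cited_passages(response):
    """Extract (text, source_uri, score) tuples from a Retrieve response."""
    passages = []
    for result in response.get("retrievalResults", []):
        text = result.get("content", {}).get("text", "")
        uri = result.get("location", {}).get("s3Location", {}).get("uri", "unknown")
        passages.append((text, uri, result.get("score")))
    return passages

# Illustrative response, mimicking a KB hit on an internal runbook
sample_response = {
    "retrievalResults": [
        {
            "content": {"text": "Step 1: freeze deploys before release."},
            "location": {"type": "S3", "s3Location": {"uri": "s3://docs/runbook.md"}},
            "score": 0.87,
        }
    ]
}

for text, uri, score in to_cited_passages(sample_response):
    print(f"{score:.2f}  {uri}  {text}")
```

This is exactly the plumbing you no longer write per client: the server performs the Retrieve call and hands your agent the passages with their sources attached.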
Solution
Step 1: Tag Your Knowledge Base for MCP Discovery
The server uses tag-based discovery. Your Bedrock Knowledge Base must carry a specific tag before the server can find it.
In the AWS console, go to Amazon Bedrock → Knowledge Bases, select your KB, open the Tags tab, and add:
Key: mcp-multirag-kb
Value: true
Or via AWS CLI:
# Replace KB_ID and account ID with real values
aws bedrock-agent tag-resource \
--resource-arn arn:aws:bedrock:us-east-1:123456789012:knowledge-base/KB_ID \
--tags mcp-multirag-kb=true
If you prefer a custom tag key — for environment separation like dev-kb=true — set KB_INCLUSION_TAG_KEY in the server env config in Step 4. The default expected key is mcp-multirag-kb.
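If you want to predict which KBs the server will pick up, you can replicate the tag filter locally. A minimal sketch of the discovery predicate; the tag maps below are illustrative (in practice they come from aws bedrock-agent list-tags-for-resource per knowledge base), and the exact matching logic inside the server may differ:

```python
# Local sketch of tag-based discovery filtering.

TAG_KEY = "mcp-multirag-kb"  # override via KB_INCLUSION_TAG_KEY in Step 4

def is_discoverable(tags, tag_key=TAG_KEY):
    """True if a KB's tag map marks it for MCP discovery."""
    return tags.get(tag_key, "").strip().lower() == "true"

prod_kb = {"mcp-multirag-kb": "true", "team": "platform"}
dev_kb = {"env": "dev"}  # no discovery tag: the server skips it

print(is_discoverable(prod_kb))  # True
print(is_discoverable(dev_kb))   # False
```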
Step 2: Set Up IAM Permissions
Your AWS IAM role or user needs the following permissions. This is the minimum set for basic retrieval:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"bedrock:ListKnowledgeBases",
"bedrock:GetKnowledgeBase",
"bedrock:ListDataSources",
"bedrock:Retrieve",
"bedrock:RetrieveAndGenerate"
],
"Resource": "*"
}
]
}
If you plan to enable reranking (Step 5), add these two actions to the same statement:
"bedrock:Rerank",
"bedrock:InvokeModel"
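If you manage this policy as code, the two extra actions can be merged in programmatically rather than edited by hand. A hedged sketch; the base policy mirrors the one above, and the helper function is ours, not an AWS API:

```python
import json

base_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:ListKnowledgeBases",
                "bedrock:GetKnowledgeBase",
                "bedrock:ListDataSources",
                "bedrock:Retrieve",
                "bedrock:RetrieveAndGenerate",
            ],
            "Resource": "*",
        }
    ],
}

RERANK_ACTIONS = ["bedrock:Rerank", "bedrock:InvokeModel"]

def with_reranking(policy):
    """Return a copy of the policy with the reranking actions appended."""
    updated = json.loads(json.dumps(policy))  # cheap deep copy
    actions = updated["Statement"][0]["Action"]
    for action in RERANK_ACTIONS:
        if action not in actions:
            actions.append(action)
    return updated

print(json.dumps(with_reranking(base_policy), indent=2))
```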
Verify your CLI profile has access before proceeding:
aws bedrock-agent list-knowledge-bases --region us-east-1
Expected: A JSON list containing your KB name and ID. If you get AccessDeniedException, the policy isn't attached to the profile you're using — confirm with aws sts get-caller-identity.
Step 3: Install uv and the MCP Server
The server is published on PyPI as awslabs.bedrock-kb-retrieval-mcp-server. The recommended install method is uvx, which runs it without a manual virtualenv.
# Install uv (macOS / Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install Python 3.10 via uv (required minimum)
uv python install 3.10
# Smoke test — confirm the server binary is reachable
uvx awslabs.bedrock-kb-retrieval-mcp-server@latest --help
Expected output: CLI help text listing available flags.
If it fails:
- command not found: uvx → Restart your terminal or run source ~/.bashrc / source ~/.zshrc
- No Python 3.10 found → Run uv python install 3.10 explicitly before retrying
Step 4: Configure Your MCP Client
Add the server block to your MCP client config. For Cursor: ~/.cursor/mcp.json. For Claude Code: ~/.claude/mcp.json. For VS Code: workspace .vscode/mcp.json.
macOS / Linux:
{
"mcpServers": {
"awslabs.bedrock-kb-retrieval-mcp-server": {
"command": "uvx",
"args": ["awslabs.bedrock-kb-retrieval-mcp-server@latest"],
"env": {
"AWS_PROFILE": "your-profile-name",
"AWS_REGION": "us-east-1",
"FASTMCP_LOG_LEVEL": "ERROR",
"KB_INCLUSION_TAG_KEY": "mcp-multirag-kb",
"BEDROCK_KB_RERANKING_ENABLED": "false"
},
"disabled": false,
"autoApprove": []
}
}
}
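A malformed mcp.json can fail silently in some clients, so a quick sanity check before restarting is worthwhile. A minimal sketch, assuming the config shape shown above; the required-key list is this guide's suggestion, not a client requirement:

```python
import json

REQUIRED_ENV = {"AWS_PROFILE", "AWS_REGION"}

def check_mcp_config(raw):
    """Parse an mcp.json string and report missing env keys per server."""
    config = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    problems = {}
    for name, server in config.get("mcpServers", {}).items():
        missing = REQUIRED_ENV - set(server.get("env", {}))
        if missing:
            problems[name] = sorted(missing)
    return problems

# Illustrative config with AWS_REGION accidentally omitted
sample = json.dumps({
    "mcpServers": {
        "awslabs.bedrock-kb-retrieval-mcp-server": {
            "command": "uvx",
            "env": {"AWS_PROFILE": "your-profile-name"},
        }
    }
})

print(check_mcp_config(sample))  # flags the missing AWS_REGION
```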
Windows — the upstream examples launch the server through uv tool run, pointing at the package's .exe entry point:
{
"mcpServers": {
"awslabs.bedrock-kb-retrieval-mcp-server": {
"disabled": false,
"timeout": 60,
"type": "stdio",
"command": "uv",
"args": [
"tool", "run",
"--from", "awslabs.bedrock-kb-retrieval-mcp-server@latest",
"awslabs.bedrock-kb-retrieval-mcp-server.exe"
],
"env": {
"AWS_PROFILE": "your-aws-profile",
"AWS_REGION": "us-east-1",
"FASTMCP_LOG_LEVEL": "ERROR"
}
}
}
}
Docker — recommended for team or CI deployments where you want a pinned version and no local Python dependency:
# Build the image once from the awslabs monorepo
docker build \
-t awslabs/bedrock-kb-retrieval-mcp-server:latest \
https://github.com/awslabs/mcp.git#main:servers/bedrock-kb-retrieval-mcp-server
Then reference the container in mcp.json:
{
"mcpServers": {
"awslabs.bedrock-kb-retrieval-mcp-server": {
"command": "docker",
"args": [
"run", "--rm", "--interactive",
"--env", "FASTMCP_LOG_LEVEL=ERROR",
"--env", "BEDROCK_KB_RERANKING_ENABLED=false",
"--env", "AWS_REGION=us-east-1",
"--env-file", "/full/path/to/.env",
"awslabs/bedrock-kb-retrieval-mcp-server:latest"
],
"env": {},
"disabled": false,
"autoApprove": []
}
}
}
Store AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN in the .env file — never hardcode credentials in mcp.json.
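Before wiring the .env file into Docker, you can verify it carries the credentials the container needs. A small sketch; the parser handles the plain KEY=value lines that --env-file expects, and the sample content is illustrative:

```python
REQUIRED = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
# AWS_SESSION_TOKEN is needed only for temporary (STS) credentials

def parse_env_file(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_credentials(text):
    """List required credential keys absent or empty in the .env content."""
    env = parse_env_file(text)
    return [key for key in REQUIRED if not env.get(key)]

sample = "# creds for the MCP container\nAWS_ACCESS_KEY_ID=AKIAEXAMPLE\n"
print(missing_credentials(sample))  # ['AWS_SECRET_ACCESS_KEY']
```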
Step 5: Enable Reranking (Optional but Recommended for Production)
Reranking re-scores retrieved passages using a second Bedrock model pass, improving result relevance for ambiguous or multi-intent queries. It's disabled by default because it adds ~0.5–1s latency per query and requires extra IAM permissions.
To enable globally:
"BEDROCK_KB_RERANKING_ENABLED": "true"
The variable accepts true, 1, yes, or on (case-insensitive). Any other value disables it. Individual tool calls can override this global default per-query.
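That truthy-value rule is easy to express as a small predicate. A sketch mirroring the behavior described above; the function name is ours, not the server's:

```python
TRUTHY = {"true", "1", "yes", "on"}

def reranking_enabled(value):
    """Interpret BEDROCK_KB_RERANKING_ENABLED as described:
    true/1/yes/on (case-insensitive) enable it; anything else disables."""
    return str(value).strip().lower() in TRUTHY

print(reranking_enabled("True"))   # True
print(reranking_enabled("ON"))     # True
print(reranking_enabled("false"))  # False
print(reranking_enabled(""))       # False
```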
Region check: Reranking is only available in select AWS regions. As of early 2026, confirmed regions include us-east-1, us-west-2, and eu-west-1. Verify the current list at the Bedrock reranking supported regions docs before switching your AWS_REGION.
You also need to enable model access for the reranking foundation models in the Bedrock console under Model access → Manage model access.
Verification
Restart your MCP client after saving mcp.json. Then query your knowledge base with content you know is embedded in your documents.
In Cursor or Claude Code:
@bedrock-kb What are our internal deployment runbook steps for production releases?
You should see: A response with cited passages including the source document name and the exact retrieved text chunks from your S3-backed KB.
To verify the connection independently:
import boto3
# Confirm your KB is reachable and ACTIVE
client = boto3.client("bedrock-agent", region_name="us-east-1")
response = client.list_knowledge_bases()
for kb in response.get("knowledgeBaseSummaries", []):
    print(kb["knowledgeBaseId"], kb["name"], kb["status"])
Expected: Your KB listed with status: ACTIVE. If it shows CREATING, the initial sync is still running. If it shows FAILED, check the data source sync logs in the console under Knowledge Bases → Data sources → Sync history.
What You Learned
- The mcp-multirag-kb=true tag is what makes a Knowledge Base visible to the server; without it the server discovers nothing
- BEDROCK_KB_RERANKING_ENABLED is a global default; individual tool calls can still override it per-query
- Reranking requires the bedrock:Rerank and bedrock:InvokeModel IAM actions and is region-limited
- The server returns text and metadata only; image content blocks from KB results are not passed through
Limitation: Cross-account Knowledge Base access is not supported in the default config. All tagged KBs must be in the same AWS account as the configured profile.
Tested on awslabs.bedrock-kb-retrieval-mcp-server latest (March 2026), Python 3.10, AWS us-east-1 and us-west-2, macOS 14 and Ubuntu 24.04
FAQ
Q: How do I connect multiple Knowledge Bases to the same MCP server?
A: Tag each Knowledge Base with mcp-multirag-kb=true. The server auto-discovers all tagged KBs in the configured region and exposes them all as queryable sources. No extra config needed per KB.
Q: What is the difference between the Retrieve and RetrieveAndGenerate Bedrock APIs here?
A: The MCP server calls Retrieve by default, returning raw passages with source citations. RetrieveAndGenerate adds an LLM generation step on the AWS side, costing more and adding latency. For MCP workflows, Retrieve is preferable — your AI client handles generation.
Q: Does this work with Knowledge Bases backed by Aurora pgvector instead of OpenSearch Serverless?
A: Yes. The server calls the Bedrock Agent Runtime API, which is vector store-agnostic. Aurora pgvector, OpenSearch Serverless, MongoDB, Pinecone, and Redis Enterprise Cloud are all supported as backing stores — the MCP server never talks to the vector DB directly.
Q: Can I use this without an AWS CLI profile configured locally?
A: Not with stdio transport. You need either a local AWS profile or environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN). For team deployments, use the Docker config with an .env file, or run on EC2/ECS with an IAM instance role.
Q: What happens if my Knowledge Base data source goes out of sync?
A: Queries still succeed but return stale results from the last sync. Trigger a new sync via aws bedrock-agent start-ingestion-job --knowledge-base-id KB_ID --data-source-id DS_ID, or set up an EventBridge rule to re-sync whenever your S3 source bucket receives new objects.