AI Optimization for Mambu Documentation

This documentation has been optimized for AI consumption, including support for MCP (Model Context Protocol) servers, RAG (Retrieval-Augmented Generation) systems, and AI agents.

Overview

The Mambu Documentation Hub now provides multiple AI-optimized formats to enable:

  • Direct product configuration via MCP servers
  • Vector database integration for RAG systems
  • Efficient documentation crawling by AI agents
  • Semantic search and question-answering systems

Available Resources

1. LLM.txt (/llm.txt)

Purpose: Plain-text index that gives LLMs a quick, readable map of the documentation structure.

Location: https://docs.mambu.com/llm.txt

Use Cases:

  • Quick overview for AI agents
  • Initial context for LLM systems
  • MCP server integration points

Example:

curl https://docs.mambu.com/llm.txt

2. AI Manifest (/ai-manifest.json)

Purpose: Structured JSON index of all documentation with metadata.

Location: https://docs.mambu.com/ai-manifest.json

Structure:

{
  "meta": { "version", "generated", "baseUrl" },
  "overview": { "totalItems", "totalDocuments", "totalApiSpecs" },
  "categories": { "category_name": { "count", "items" } },
  "resources": { "llmIndex", "fullExport", "ragChunks" },
  "items": [ { "id", "title", "url", "type", "metadata" } ]
}

Use Cases:

  • Navigation for AI agents
  • Documentation discovery
  • Category-based filtering
  • MCP server resource mapping
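
As a sketch of how an agent might use the manifest for discovery and category-based filtering (field names follow the structure shown above; the exact schema may differ):

```python
import json
import urllib.request

MANIFEST_URL = "https://docs.mambu.com/ai-manifest.json"

def load_manifest(url=MANIFEST_URL):
    """Fetch and parse the AI manifest."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def items_by_category(manifest, category):
    """Return the items listed under a given category, or [] if absent."""
    return manifest.get("categories", {}).get(category, {}).get("items", [])
```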

3. RAG Chunks (/docs-rag-chunks.json)

Purpose: Optimally-sized chunks for vector database ingestion.

Location: https://docs.mambu.com/docs-rag-chunks.json

Features:

  • Semantic chunking with 1000-character target size
  • 200-character overlap for context preservation
  • Rich metadata for filtering and retrieval
  • Section-aware splitting
  • URL anchors for source attribution

Structure:

{
  "meta": { "version", "chunkConfig" },
  "statistics": { "totalChunks", "averageChunkLength" },
  "usage": { "recommendedEmbeddingModels", "storageRecommendations" },
  "chunks": [
    {
      "id": "unique-chunk-id",
      "text": "chunk content",
      "url": "source url",
      "metadata": { "title", "tags", "category" }
    }
  ]
}

Recommended Embedding Models:

  • OpenAI: text-embedding-3-small or text-embedding-3-large
  • Cohere: embed-english-v3.0
  • Open Source: all-MiniLM-L6-v2

Recommended Vector Databases:

  • Pinecone (managed)
  • Weaviate (open-source)
  • Qdrant (open-source)
  • ChromaDB (embedded)

4. AI Sitemap (/ai-sitemap.txt)

Purpose: Plain-text sitemap optimized for AI crawlers.

Location: https://docs.mambu.com/ai-sitemap.txt

Format:

URL | Title | Description | Last Modified | Priority | Category | Type

Use Cases:

  • Systematic crawling by AI agents
  • Resource discovery
  • Freshness checking (last modified dates)
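
Parsing the pipe-delimited format above is straightforward; a minimal sketch (field names taken from the format line, normalized to snake_case):

```python
# Column names from the sitemap's pipe-delimited format line
FIELDS = ["url", "title", "description", "last_modified",
          "priority", "category", "type"]

def parse_sitemap_line(line):
    """Split one pipe-delimited sitemap line into a dict of named fields.

    Returns None for lines that don't match the expected column count
    (e.g. comments or blank lines).
    """
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))
```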

5. Full Export (/docs-export.json)

Purpose: Complete documentation export (existing tool).

Location: https://docs.mambu.com/docs-export.json

Use Cases:

  • Bulk ingestion
  • Offline processing
  • Complete backups

NPM Scripts

Generate Individual Resources

# Generate AI manifest
npm run ai:manifest

# Generate RAG chunks
npm run ai:rag

# Generate AI sitemap
npm run ai:sitemap

# Generate full export (existing)
npm run export-docs

Generate All AI Resources

# Generate all AI resources at once
npm run ai:all

This will run all AI optimization scripts in sequence:

  1. Full documentation export
  2. AI manifest generation
  3. RAG chunks generation
  4. AI sitemap generation

Automatic Generation

All AI resources are automatically generated after each build:

npm run build
# Automatically runs: npm run ai:all
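
A sketch of how this wiring might look in package.json (the `postbuild` hook and the `build` command are assumptions about the implementation; only the `ai:*` script names come from this document):

```json
{
  "scripts": {
    "build": "your-build-command",
    "postbuild": "npm run ai:all",
    "ai:all": "npm run export-docs && npm run ai:manifest && npm run ai:rag && npm run ai:sitemap"
  }
}
```

npm runs `postbuild` automatically after `build` completes, which matches the behavior described above.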

MCP Server Integration

Quick Start

The documentation is optimized for MCP (Model Context Protocol) integration to enable direct product configuration via AI agents.

Key Integration Points:

  1. Discovery: Start with /ai-manifest.json to map available resources
  2. Context: Use /llm.txt for quick overview
  3. RAG: Ingest /docs-rag-chunks.json into vector database
  4. API Specs: Access OpenAPI specs at /swagger_files/v2/*.json

Example MCP Server Implementation

import { Server } from "@modelcontextprotocol/sdk/server";

// 1. Load AI manifest for resource discovery
const manifest = await fetch("https://docs.mambu.com/ai-manifest.json");
const resources = await manifest.json();

// 2. Load RAG chunks for semantic search
const ragChunks = await fetch("https://docs.mambu.com/docs-rag-chunks.json");
const chunks = await ragChunks.json();

// 3. Create vector embeddings and store in database
// (vectorDB and createEmbedding are placeholders for your own clients)
await vectorDB.upsert(
  await Promise.all(
    chunks.chunks.map(async (chunk) => ({
      id: chunk.id,
      values: await createEmbedding(chunk.text),
      metadata: chunk.metadata,
    }))
  )
);

// 4. Implement MCP resources and tools
const server = new Server({ name: "mambu-docs", version: "1.0.0" });
server.addResource({
  name: "mambu-docs",
  description: "Mambu API documentation",
  handler: async (query) => {
    const results = await vectorDB.query(query);
    return results;
  },
});

RAG System Integration

Step 1: Download RAG Chunks

curl -o docs-rag-chunks.json https://docs.mambu.com/docs-rag-chunks.json

Step 2: Create Embeddings

import json
import openai  # example uses the openai<1.0 SDK interface

# Load chunks
with open('docs-rag-chunks.json') as f:
    data = json.load(f)
chunks = data['chunks']

# Create embeddings
embeddings = []
for chunk in chunks:
    response = openai.Embedding.create(
        input=chunk['text'],
        model="text-embedding-3-small"
    )
    embeddings.append({
        'id': chunk['id'],
        'values': response['data'][0]['embedding'],
        'metadata': chunk['metadata']
    })

Step 3: Store in Vector Database

import pinecone

# Initialize Pinecone
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
index = pinecone.Index("mambu-docs")

# Upsert vectors
index.upsert(vectors=embeddings)

Step 4: Query

# Create query embedding
query = "How do I create a loan account?"
query_embedding = openai.Embedding.create(
    input=query,
    model="text-embedding-3-small"
)['data'][0]['embedding']

# Search
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True
)

# Results contain relevant documentation chunks
for match in results['matches']:
    print(f"Score: {match['score']}")
    print(f"URL: {match['metadata']['url']}")
    print(f"Text: {match['metadata']['text'][:200]}...")

Crawling Guidelines

For AI Agents

  1. Start with structured resources: Use JSON exports instead of crawling HTML
  2. Respect rate limits: 1-second delay between requests
  3. Use conditional requests: Send If-Modified-Since headers to skip unchanged resources
  4. Follow robots.txt: See /robots.txt for guidelines
  5. Check freshness: Use lastModified dates in metadata
Recommended crawl sequence:

  1. Fetch /ai-manifest.json (navigation index)
  2. Fetch /llm.txt (overview)
  3. Fetch /docs-rag-chunks.json (for RAG systems)
  4. Optionally fetch individual pages based on manifest
  5. Fetch OpenAPI specs for API reference details

File Locations

Source Scripts

  • scripts/export-docs-for-ai.js - Full documentation export
  • scripts/generate-ai-manifest.js - AI manifest generator
  • scripts/generate-rag-chunks.js - RAG chunks generator
  • scripts/generate-ai-sitemap.js - AI sitemap generator

Generated Files (in static/ directory)

  • static/llm.txt - LLM index (committed to repo)
  • static/ai-manifest.json - AI manifest (generated during build)
  • static/docs-rag-chunks.json - RAG chunks (generated during build)
  • static/ai-sitemap.txt - AI sitemap (generated during build)
  • static/docs-export.json - Full export (generated during build)

Configuration

  • package.json - NPM scripts for AI generation
  • static/robots.txt - Crawler guidelines with AI resources
  • .gitignore - Generated files are ignored (except llm.txt)

Key Features

Semantic Chunking

  • Chunks preserve heading hierarchy
  • Section-aware splitting maintains context
  • Configurable overlap (200 chars default)
  • Minimum chunk size filtering (100 chars)
  • Sentence boundary detection
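
A minimal sketch of the overlap strategy described above (target, overlap, and minimum sizes from this list; the production scripts additionally track headings and sections, which this omits):

```python
def chunk_text(text, target=1000, overlap=200, minimum=100):
    """Split text into ~target-sized chunks, preferring sentence boundaries,
    carrying `overlap` characters of context into each following chunk."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + target, len(text))
        if end < len(text):
            # back up to the last sentence boundary inside the window, if any
            cut = text.rfind(". ", start, end)
            if cut > start:
                end = cut + 1
        piece = text[start:end].strip()
        if len(piece) >= minimum:  # minimum chunk size filtering
            chunks.append(piece)
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)
    return chunks
```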

Rich Metadata

Each chunk includes:

  • Document title and ID
  • Section heading and level
  • Tags and categories
  • Last modified date
  • Source URL with anchor links
  • Content type (documentation, openapi)

OpenAPI Integration

  • All API endpoints extracted as separate chunks
  • HTTP method and path included
  • Parameters and response schemas
  • Operation IDs for direct reference
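
As an illustration of how endpoints can become per-operation chunks (a simplified sketch; the real generator also captures parameters and response schemas):

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

def endpoint_chunks(spec, base_url):
    """Yield one chunk per operation in an OpenAPI spec's paths object."""
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            op_id = op.get("operationId", f"{method}-{path}")
            yield {
                "id": op_id,
                "text": f"{method.upper()} {path}: {op.get('summary', '')}",
                "metadata": {
                    "method": method.upper(),
                    "path": path,
                    "url": f"{base_url}#{op_id}",
                },
            }
```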

Updates and Maintenance

Automatic Updates

All AI resources are regenerated on each build (npm run build).

Manual Updates

Run npm run ai:all to regenerate all AI resources without building.

Monitoring Freshness

Check the generated timestamp in each JSON file's meta section:

curl https://docs.mambu.com/ai-manifest.json | jq '.meta.generated'

Support and Feedback

For questions or issues with AI optimization, reach out through your usual Mambu support channels.

Example Use Cases

1. MCP Server for Product Configuration

Build an MCP server that lets AI agents configure Mambu products directly:

  • Read product templates from documentation
  • Extract configuration parameters from API specs
  • Create products via API calls guided by docs

2. Documentation Chatbot

Create a chatbot that answers questions about Mambu:

  • Ingest RAG chunks into vector database
  • Use semantic search for relevant context
  • Generate answers with source attribution
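
The answer-generation step might assemble retrieved chunks into a grounded prompt with source attribution roughly like this (a sketch; the retrieved list stands in for the vector search results shown earlier):

```python
def build_prompt(question, retrieved):
    """Combine retrieved chunks into a grounded prompt, citing each source URL."""
    context = "\n\n".join(
        f"[{i + 1}] ({c['url']})\n{c['text']}" for i, c in enumerate(retrieved)
    )
    return (
        "Answer using only the context below. Cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```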

3. API Code Generator

Generate code snippets from API documentation:

  • Parse OpenAPI specs for endpoint details
  • Use documentation examples as templates
  • Create language-specific implementations

4. Automated Integration Testing

Generate test cases from API documentation:

  • Extract endpoints and parameters from specs
  • Use examples from documentation
  • Create comprehensive test suites

Technical Specifications

Chunk Configuration

  • Target size: 1000 characters
  • Overlap: 200 characters
  • Minimum size: 100 characters
  • Strategy: Semantic splitting with sentence boundaries

Output Formats

  • JSON: UTF-8, 2-space indentation
  • TXT: UTF-8, line-delimited

Update Frequency

  • Build-time: Every deployment
  • On-demand: Via NPM scripts

File Sizes (approximate)

  • llm.txt: ~5 KB
  • ai-manifest.json: ~100-500 KB
  • docs-rag-chunks.json: ~5-20 MB
  • ai-sitemap.txt: ~50-200 KB
  • docs-export.json: ~10-30 MB

License

Same as main documentation (Mambu proprietary).


Last Updated: 2026-03-17
Version: 1.0
Generated by: AI Optimization Scripts