Nimbly AI Chatbot - Technical Design Document
Overview
Purpose: This technical design document describes the actual implementation of the Nimbly AI Chatbot, a WhatsApp-powered assistant that provides conversational access to organizational data.
Architecture: A straightforward Node.js/TypeScript Express application with direct agent execution, PostgreSQL for user management, MongoDB for operational data, and Parasail AI models for natural language processing.
Core Capabilities:
- WhatsApp webhook integration via Whapi
- Intent classification and routing to specialized agents
- Report, Issue, and Site information agents
- User authentication and session management
- Conversation memory using Mastra PostgreSQL storage
Users: Organization members access reports, manage issues, and get site information through WhatsApp conversations.
High-Level Architecture
graph TB
  subgraph "External"
    WA[WhatsApp] --> WC[Whapi Webhook]
  end
  subgraph "Express App"
    WC --> R[Router]
    R --> AC[Auth Controller]
    R --> WH[Whapi Controller]
    WH --> EX[executeAgent]
  end
  subgraph "AI Agents"
    EX --> IC[Intent Classifier]
    IC --> RA[Report Agent]
    IC --> IA[Issue Agent]
    IC --> SA[Site Agent]
  end
  subgraph "Data Layer"
    AC --> PG[(PostgreSQL)]
    RA --> MG[(MongoDB)]
    IA --> MG
    SA --> MG
    RA --> MM[Mastra Memory]
    IA --> MM
    SA --> MM
    MM --> PG
  end
  subgraph "External Services"
    AC --> SB[Supabase Auth]
    RA --> PA[Nimbly Platform APIs]
  end
Technology Stack
Core Infrastructure
- Runtime: Node.js 20.x with TypeScript
- Framework: Express.js for HTTP server and webhook handling
- Package Management: npm with standard scripts
AI & Models
- AI Framework: Mastra.ai for agent orchestration and memory management
- Model Provider: Parasail.io with custom integration
- Models Used:
- GLM-4.5 (zai-org/GLM-4.5V): Primary model for the report, issue, and site agents
- LLaMA-3 (meta-llama/Llama-3.3-70B-Instruct): Intent classification
Data Storage
- PostgreSQL: User management, authentication, sessions
- MongoDB: Operational reports, schedules, questionnaires
- Mastra Memory: Conversation history stored in PostgreSQL
External Integrations
- WhatsApp: Whapi.Cloud for message handling and webhooks
- Authentication: Supabase Auth for user authentication
- Platform APIs: Nimbly backend services for real-time data
System Components
Authentication & User Management
Flow
- WhatsApp message arrives via Whapi webhook
- System checks for an existing session in the PostgreSQL user_sessions table
- If no session exists, a login link is generated and authentication is required
- Users authenticate via Supabase, creating a session in PostgreSQL
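The authentication gate described above can be sketched as follows. This is an illustrative shape only: the function and dependency names (`validateSession`, `sendLoginLink`, `routeToAgents`) are assumptions, not the actual implementation.

```typescript
// Hypothetical sketch of the per-message session gate. Dependencies are
// injected as parameters so the flow itself stays self-contained.
type SessionContext = { sessionId: string; userId: string; organizationId: string };

async function handleIncomingMessage(
  whatsappNumber: string,
  text: string,
  validateSession: (n: string) => Promise<SessionContext | null>,
  sendLoginLink: (n: string) => Promise<void>,
  routeToAgents: (ctx: SessionContext, text: string) => Promise<void>,
): Promise<void> {
  const session = await validateSession(whatsappNumber);
  if (!session) {
    // No active session in user_sessions: ask the user to authenticate first.
    await sendLoginLink(whatsappNumber);
    return;
  }
  // Authenticated: hand the message to intent classification and the agents.
  await routeToAgents(session, text);
}
```

Injecting the session and routing dependencies keeps the gate trivially testable without a live database.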
Database Schema (PostgreSQL)
-- User access and permissions
CREATE TABLE user_chatbot_access (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
email varchar(255) NOT NULL,
organization_id varchar(255) NOT NULL,
whatsapp_number varchar(32),
is_enabled boolean DEFAULT true,
roles text[],
scopes jsonb DEFAULT '{}'::jsonb,
created_at timestamptz DEFAULT now()
);
-- Active sessions
CREATE TABLE user_sessions (
session_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
user_id uuid REFERENCES user_chatbot_access(id),
organization_id varchar(255) NOT NULL,
whatsapp_number varchar(32) NOT NULL,
auth_token text NOT NULL,
preferred_language varchar(5) DEFAULT 'en',
expires_at timestamptz NOT NULL,
created_at timestamptz DEFAULT now()
);
Session Management
// Simple session validation
async validateSession(whatsappNumber: string): Promise<SessionContext | null> {
const active = await this.sessionRepository.findActiveByPhone(whatsappNumber);
if (!active) return null;
const jwt = decrypt(active.auth_token);
verifyJwt(jwt);
return {
sessionId: active.session_id,
userId: active.user_id,
organizationId: active.organization_id,
whatsappNumber: active.whatsapp_number,
authToken: jwt,
};
}
Message Processing Flow
Webhook Handler
// Whapi webhook endpoint
this._router.post('/webhook', whapiWebhookValidation, (req: Request, res: Response) => {
return this.whapiController.handleWebhook({ req, res });
});
Agent Execution
// Simple agent executor in mastra/index.ts
export const executeAgent = async (
agentName: string,
message: string,
options?: ExecuteAgentOptions,
): Promise<AgentResponse> => {
const executor = agentExecutors[agentName];
if (!executor) {
throw new Error(`Agent ${agentName} not found`);
}
const response = await executor(message, options);
return response;
};
AI Agents
Intent Classifier
- Model: LLaMA-3 via Parasail
- Purpose: Classify messages into REPORT, ISSUE, SITE, IRRELEVANT, or OTHER categories
- Implementation: Direct function call with structured response
export async function classifyIntent(message: string): Promise<string> {
const parasailModel = createParaSailModel(LLAMA_MODELS.LLAMA_3, PARASAIL_API_KEY);
const response = await parasailModel.generate(
`Classify message: '${message}'\n\nCategories: REPORT, ISSUE, SITE, IRRELEVANT, OTHER\n\nClassification:`,
{ systemPrompt: 'Respond with exactly one word: REPORT, ISSUE, SITE, IRRELEVANT, or OTHER.' }
);
return response.trim();
}
Report Agent
- Model: GLM-4.5 via Parasail
- Purpose: Answer report queries using MongoDB aggregations
- Memory: Mastra PostgreSQL-based memory for conversation continuity
- Tools: Report data aggregation, dashboard KPIs, platform stats
Issue Agent
- Model: GLM-4.5 via Parasail
- Purpose: Manage and track support issues
- Memory: Shared Mastra memory
Site Agent
- Model: GLM-4.5 via Parasail
- Purpose: Provide location and facility information
- Memory: Shared Mastra memory
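Tying the classifier to the dispatcher, the per-message routing can be sketched as follows. The intent-to-agent name mapping and the fallback reply are illustrative assumptions, not the actual implementation; `classifyIntent` and `executeAgent` are injected so the sketch stands alone.

```typescript
// Illustrative routing from classified intent to a named agent.
type AgentResponse = { text: string };
type Classify = (message: string) => Promise<string>;
type Execute = (agentName: string, message: string) => Promise<AgentResponse>;

// Hypothetical mapping; actual agent names may differ.
const INTENT_TO_AGENT: Record<string, string> = {
  REPORT: 'reportAgent',
  ISSUE: 'issueAgent',
  SITE: 'siteAgent',
};

export async function routeMessage(
  message: string,
  classifyIntent: Classify,
  executeAgent: Execute,
): Promise<AgentResponse> {
  // Normalize the one-word classification before lookup.
  const intent = (await classifyIntent(message)).trim().toUpperCase();
  const agentName = INTENT_TO_AGENT[intent];
  if (!agentName) {
    // IRRELEVANT / OTHER: reply with a generic fallback instead of an agent.
    return { text: 'Sorry, I can help with reports, issues, and site information.' };
  }
  return executeAgent(agentName, message);
}
```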
Database Integration
MongoDB Usage
MongoDB collections are accessed directly via Mongoose in agent tools:
export const reportDataTool = createTool({
  id: 'report-data',
  description: 'Generates report analytics from MongoDB collections',
  async execute({ context }) {
    // Direct MongoDB aggregation; `filters` is built from the tool
    // context (its construction is elided in this excerpt)
    const db = mongoose.connection.db;
    const reportsCollection = db.collection('reports');
    const pipeline = [
      { $match: filters },
      { $group: { _id: null, total: { $sum: 1 } } }
    ];
    const result = await reportsCollection.aggregate(pipeline).toArray();
    return result;
  }
});
Memory Management
Mastra memory stores conversation history in PostgreSQL:
export const agentMemory = postgresStore
? new Memory({
storage: postgresStore,
options: {
lastMessages: 12,
semanticRecall: false,
workingMemory: { enabled: true, scope: 'resource' }
}
})
: undefined;
// Memory usage in agents
const response = await reportAnalysisAgent.generate(
finalPrompt,
memoryOption ? { memory: memoryOption } : undefined,
);
API Endpoints
Core Routes
POST /health - Basic health check
POST /api/v1/whapi/webhook - WhatsApp webhook endpoint
POST /api/v1/whapi/test-site-agent - Test endpoint for site agent
Authentication Routes
POST /auth/login - User authentication
POST /auth/refresh - Session refresh
Error Handling
Simple Error Pattern
try {
const response = await executeAgent(agentName, message, options);
return res.json({ success: true, response: response.text });
} catch (error) {
console.error('Agent error:', error);
return res.status(500).json({
error: 'Internal server error',
details: error instanceof Error ? error.message : 'Unknown error'
});
}
Logging
- Structured logging with Winston
- Error tracking with context
- Development vs production log levels
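The logging behavior above (structured entries, context, and environment-dependent levels) can be illustrated with a small dependency-free sketch. The real app uses Winston; this only shows the level-gated JSON log shape, and all names here are illustrative.

```typescript
// Minimal structured-logging sketch (not the actual Winston setup).
type Level = 'debug' | 'info' | 'warn' | 'error';
const order: Record<Level, number> = { debug: 0, info: 1, warn: 2, error: 3 };

function createLogger(minLevel: Level) {
  const log = (level: Level, message: string, context: object = {}) => {
    // Drop entries below the configured threshold.
    if (order[level] < order[minLevel]) return null;
    // Structured entry: level, message, timestamp, plus caller-supplied context.
    const entry = { level, message, timestamp: new Date().toISOString(), ...context };
    console.log(JSON.stringify(entry));
    return entry;
  };
  return {
    debug: (m: string, c?: object) => log('debug', m, c),
    info: (m: string, c?: object) => log('info', m, c),
    error: (m: string, c?: object) => log('error', m, c),
  };
}

// Development vs production log levels
const logger = createLogger(process.env.NODE_ENV === 'production' ? 'info' : 'debug');
```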
Deployment
Environment Variables
DATABASE_URL - PostgreSQL connection
MONGODB_URI - MongoDB connection
PARASAIL_API_KEY - AI model provider
WHAPI_API_KEY - WhatsApp integration
SUPABASE_URL & SUPABASE_ANON_KEY - Authentication
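A fail-fast check for these variables at startup might look like the sketch below. This is an assumption about how validation could be done, not a description of the current code, which may read the variables directly.

```typescript
// Hypothetical startup validation for the required environment variables.
const REQUIRED_ENV = [
  'DATABASE_URL',
  'MONGODB_URI',
  'PARASAIL_API_KEY',
  'WHAPI_API_KEY',
  'SUPABASE_URL',
  'SUPABASE_ANON_KEY',
] as const;

export function assertEnv(env: Record<string, string | undefined>): void {
  // Collect every missing or empty variable so the error lists them all at once.
  const missing = REQUIRED_ENV.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}

// At startup: assertEnv(process.env);
```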
Server Startup
// app.ts startup sequence
const startServer = async () => {
await connectToMongoDB(); // Optional, continues if fails in dev
if (process.env.DB_BOOTSTRAP === 'true') {
await ensureDbIndexes(); // Create PostgreSQL tables/indexes
}
app.listen(PORT, () => {
logger.info(`Nimbly AI Chatbot running on port ${PORT}`);
});
};
Testing
Agent Testing
// Test endpoint for agents
POST /api/v1/whapi/test-site-agent
{
"message": "What are your business hours?"
}
Health Monitoring
- Basic health endpoint returns service status
- Database connectivity checks
- Optional MongoDB connection monitoring
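A health response reflecting the checks above could be shaped as follows. The field names and the ok/degraded policy are assumptions for illustration; since the startup sequence treats MongoDB as optional, this sketch only degrades (rather than fails) on a Mongo outage.

```typescript
// Illustrative health payload: PostgreSQL is required, MongoDB is optional.
type HealthStatus = {
  status: 'ok' | 'degraded';
  postgres: boolean;
  mongodb: boolean;
};

export function buildHealthStatus(postgresUp: boolean, mongoUp: boolean): HealthStatus {
  return {
    // 'ok' only when the required PostgreSQL dependency is reachable;
    // MongoDB connectivity is reported but does not fail the check.
    status: postgresUp ? 'ok' : 'degraded',
    postgres: postgresUp,
    mongodb: mongoUp,
  };
}
```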
Scaling Considerations
Current Limitations
- Single instance deployment
- No horizontal scaling implemented
- Memory storage in PostgreSQL (no distributed caching)
Scaling Path
- Database: PostgreSQL connection pooling
- Application: Horizontal scaling behind load balancer
- Memory: Consider Redis for session caching if needed
- AI Models: Model-specific rate limiting and fallbacks
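As one concrete step on this path, the "model-specific rate limiting" item could be implemented with a token bucket per model. The sketch below is a minimal dependency-free version of that idea and is not part of the current app; all names and limits are illustrative.

```typescript
// Minimal token-bucket rate limiter keyed by model name (hypothetical).
export class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  tryAcquire(now: number = Date.now()): boolean {
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per model, e.g. 'zai-org/GLM-4.5V'; limits are illustrative.
const buckets = new Map<string, TokenBucket>();
export function allowRequest(model: string, capacity = 5, refillPerSecond = 1): boolean {
  let bucket = buckets.get(model);
  if (!bucket) {
    bucket = new TokenBucket(capacity, refillPerSecond);
    buckets.set(model, bucket);
  }
  return bucket.tryAcquire();
}
```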
This design document reflects the implementation as it actually exists: deliberately simple and direct, rather than a set of theoretical architectural patterns.