Nimbly AI Chatbot - Technical Design Document

Overview

Purpose: This technical design document describes the actual implementation of the Nimbly AI Chatbot, a WhatsApp-powered assistant that provides conversational access to organizational data.

Architecture: A straightforward Node.js/TypeScript Express application with direct agent execution, PostgreSQL for user management, MongoDB for operational data, and Parasail AI models for natural language processing.

Core Capabilities:

  • WhatsApp webhook integration via Whapi
  • Intent classification and routing to specialized agents
  • Report, Issue, and Site information agents
  • User authentication and session management
  • Conversation memory using Mastra PostgreSQL storage

Users: Organization members access reports, manage issues, and get site information through WhatsApp conversations.

High-Level Architecture

graph TB
    subgraph "External"
        WA[WhatsApp] --> WC[Whapi Webhook]
    end

    subgraph "Express App"
        WC --> R[Router]
        R --> AC[Auth Controller]
        R --> WH[Whapi Controller]
        WH --> EX[executeAgent]
    end

    subgraph "AI Agents"
        EX --> IC[Intent Classifier]
        IC --> RA[Report Agent]
        IC --> IA[Issue Agent]
        IC --> SA[Site Agent]
    end

    subgraph "Data Layer"
        AC --> PG[(PostgreSQL)]
        RA --> MG[(MongoDB)]
        IA --> MG
        SA --> MG
        RA --> MM[Mastra Memory]
        IA --> MM
        SA --> MM
        MM --> PG
    end

    subgraph "External Services"
        AC --> SB[Supabase Auth]
        RA --> PA[Nimbly Platform APIs]
    end

Technology Stack

Core Infrastructure

  • Runtime: Node.js 20.x with TypeScript
  • Framework: Express.js for HTTP server and webhook handling
  • Package Management: npm with standard scripts

AI & Models

  • AI Framework: Mastra.ai for agent orchestration and memory management
  • Model Provider: Parasail.io with custom integration
  • Models Used:
    • GLM-4.5 (zai-org/GLM-4.5V): Primary model for the report, issue, and site agents
    • LLaMA-3 (meta-llama/Llama-3.3-70B-Instruct): Intent classification

Data Storage

  • PostgreSQL: User management, authentication, sessions
  • MongoDB: Operational reports, schedules, questionnaires
  • Mastra Memory: Conversation history stored in PostgreSQL

External Integrations

  • WhatsApp: Whapi.Cloud for message handling and webhooks
  • Authentication: Supabase Auth for user authentication
  • Platform APIs: Nimbly backend services for real-time data

System Components

Authentication & User Management

Flow

  1. WhatsApp message arrives via Whapi webhook
  2. System checks for existing session in PostgreSQL user_sessions table
  3. If no session exists, the system generates a login link and requires the user to authenticate
  4. The user authenticates via Supabase, and a session is created in PostgreSQL

Database Schema (PostgreSQL)

-- User access and permissions
CREATE TABLE user_chatbot_access (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  email varchar(255) NOT NULL,
  organization_id varchar(255) NOT NULL,
  whatsapp_number varchar(32),
  is_enabled boolean DEFAULT true,
  roles text[],
  scopes jsonb DEFAULT '{}'::jsonb,
  created_at timestamptz DEFAULT now()
);
 
-- Active sessions
CREATE TABLE user_sessions (
  session_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid REFERENCES user_chatbot_access(id),
  organization_id varchar(255) NOT NULL,
  whatsapp_number varchar(32) NOT NULL,
  auth_token text NOT NULL,
  preferred_language varchar(5) DEFAULT 'en',
  expires_at timestamptz NOT NULL,
  created_at timestamptz DEFAULT now()
);

Session Management

// Simple session validation
async validateSession(whatsappNumber: string): Promise<SessionContext | null> {
  // Look up the active session for this WhatsApp number
  const active = await this.sessionRepository.findActiveByPhone(whatsappNumber);
  if (!active) return null;
 
  // auth_token is stored encrypted, so decrypt it before verifying
  const jwt = decrypt(active.auth_token);
  // The return value is unused, so verifyJwt must throw to reject an invalid token
  verifyJwt(jwt);
 
  return {
    sessionId: active.session_id,
    userId: active.user_id,
    organizationId: active.organization_id,
    whatsappNumber: active.whatsapp_number,
    authToken: jwt,
  };
}

Message Processing Flow

Webhook Handler

// Whapi webhook endpoint
this._router.post('/webhook', whapiWebhookValidation, (req: Request, res: Response) => {
  return this.whapiController.handleWebhook({ req, res });
});

Agent Execution

// Simple agent executor in mastra/index.ts
export const executeAgent = async (
  agentName: string,
  message: string,
  options?: ExecuteAgentOptions,
): Promise<AgentResponse> => {
  const executor = agentExecutors[agentName];
  if (!executor) {
    throw new Error(`Agent ${agentName} not found`);
  }
 
  const response = await executor(message, options);
  return response;
};
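The agentExecutors lookup used above is not shown in this document. One way such a registry could look is sketched below, with stub executors in place of the real agents. The names AgentReply, AgentRunOptions, executorRegistry, and runAgent are hypothetical, as are the agent keys.

```typescript
// Assumed shapes; the real AgentResponse and ExecuteAgentOptions types are not shown here.
interface AgentReply {
  text: string;
}

interface AgentRunOptions {
  sessionId?: string;
}

type AgentExecutor = (message: string, options?: AgentRunOptions) => Promise<AgentReply>;

// Stub executors stand in for the real Report, Issue, and Site agents.
const executorRegistry = new Map<string, AgentExecutor>([
  ['report', async (message) => ({ text: `[report] ${message}` })],
  ['issue', async (message) => ({ text: `[issue] ${message}` })],
  ['site', async (message) => ({ text: `[site] ${message}` })],
]);

// Same contract as executeAgent: resolve the executor by name or fail loudly.
async function runAgent(
  agentName: string,
  message: string,
  options?: AgentRunOptions,
): Promise<AgentReply> {
  const executor = executorRegistry.get(agentName);
  if (!executor) {
    throw new Error(`Agent ${agentName} not found`);
  }
  return executor(message, options);
}
```

A Map is used here rather than a plain object because a lookup such as `registry['toString']` on a plain object finds an inherited property and would pass the "not found" check.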

AI Agents

Intent Classifier

  • Model: LLaMA-3 via Parasail
  • Purpose: Classify messages into REPORT, ISSUE, SITE, IRRELEVANT, or OTHER categories
  • Implementation: Direct function call with structured response

export async function classifyIntent(message: string): Promise<string> {
  const parasailModel = createParaSailModel(LLAMA_MODELS.LLAMA_3, PARASAIL_API_KEY);
 
  // The category list in the prompt must match the labels allowed by the system prompt
  const response = await parasailModel.generate(
    `Classify message: '${message}'\n\nCategories: REPORT, ISSUE, SITE, IRRELEVANT, OTHER\n\nClassification:`,
    { systemPrompt: 'Respond with exactly one word: REPORT, ISSUE, SITE, IRRELEVANT, or OTHER.' }
  );
 
  return response.trim();
}
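Because classifyIntent returns the model's raw text, a caller may want to reduce it to one of the known labels before routing. The sketch below shows one possible approach. The helpers normalizeIntent and routeIntent, and the agent keys in AGENT_BY_INTENT, are hypothetical and not part of the documented code.

```typescript
const INTENTS = ['REPORT', 'ISSUE', 'SITE', 'IRRELEVANT', 'OTHER'] as const;
type Intent = (typeof INTENTS)[number];

// Reduce raw model output to a known label; anything unexpected becomes OTHER.
function normalizeIntent(raw: string): Intent {
  // Uppercase, then strip surrounding quotes, punctuation, and whitespace.
  const cleaned = raw.trim().toUpperCase().replace(/^[^A-Z]+|[^A-Z]+$/g, '');
  const match = INTENTS.find((intent) => intent === cleaned);
  return match ?? 'OTHER';
}

// Hypothetical agent keys; the real keys in agentExecutors are not shown in this document.
const AGENT_BY_INTENT: Record<Intent, string | null> = {
  REPORT: 'report',
  ISSUE: 'issue',
  SITE: 'site',
  IRRELEVANT: null, // no agent handles these messages
  OTHER: null,
};

// Returns the agent to run, or null when no agent should handle the message.
function routeIntent(raw: string): string | null {
  return AGENT_BY_INTENT[normalizeIntent(raw)];
}
```

Falling back to OTHER means a verbose or malformed model reply never selects an agent by accident.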

Report Agent

  • Model: GLM-4.5 via Parasail
  • Purpose: Answer report queries using MongoDB aggregations
  • Memory: Mastra PostgreSQL-based memory for conversation continuity
  • Tools: Report data aggregation, dashboard KPIs, platform stats

Issue Agent

  • Model: GLM-4.5 via Parasail
  • Purpose: Manage and track support issues
  • Memory: Shared Mastra memory

Site Agent

  • Model: GLM-4.5 via Parasail
  • Purpose: Provide location and facility information
  • Memory: Shared Mastra memory

Database Integration

MongoDB Usage

MongoDB collections are accessed directly via Mongoose in agent tools:

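import { createTool } from '@mastra/core/tools';
import mongoose from 'mongoose';

export const reportDataTool = createTool({
  id: 'report-data',
  description: 'Generates report analytics from MongoDB collections',
  async execute({ context }) {
    // Direct MongoDB aggregation through the shared Mongoose connection
    const db = mongoose.connection.db;
    if (!db) throw new Error('MongoDB is not connected');
    const reportsCollection = db.collection('reports');
 
    // Filters come from the tool input; the input shape is assumed here
    const filters = context?.filters ?? {};
 
    const pipeline = [
      { $match: filters },
      { $group: { _id: null, total: { $sum: 1 } } } // count the matching reports
    ];
 
    const result = await reportsCollection.aggregate(pipeline).toArray();
    return result;
  }
});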
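The $match stage above depends on how the filters are built. As an illustration, the sketch below assembles a pipeline scoped to one organization and an optional date range. The field names organizationId and submittedAt are assumptions, since the reports schema is not shown in this document, and buildReportCountPipeline is a hypothetical helper.

```typescript
interface ReportFilterInput {
  organizationId: string;
  from?: Date; // inclusive lower bound
  to?: Date; // exclusive upper bound
}

// Hypothetical field names (organizationId, submittedAt); adjust to the real reports schema.
function buildReportCountPipeline(input: ReportFilterInput): Record<string, unknown>[] {
  // Always scope the query to the caller's organization.
  const match: Record<string, unknown> = { organizationId: input.organizationId };

  if (input.from || input.to) {
    const range: Record<string, Date> = {};
    if (input.from) range['$gte'] = input.from;
    if (input.to) range['$lt'] = input.to;
    match['submittedAt'] = range;
  }

  return [
    { $match: match },
    { $group: { _id: null, total: { $sum: 1 } } }, // count the matching reports
  ];
}
```

Building the organization filter in one place makes it harder for a tool to run an aggregation across organizations by omission.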

Memory Management

Mastra memory stores conversation history in PostgreSQL:

export const agentMemory = postgresStore
  ? new Memory({
      storage: postgresStore,
      options: {
        lastMessages: 12,
        semanticRecall: false,
        workingMemory: { enabled: true, scope: 'resource' }
      }
    })
  : undefined;
 
// Memory usage in agents
const response = await reportAnalysisAgent.generate(
  finalPrompt,
  memoryOption ? { memory: memoryOption } : undefined,
);
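How memoryOption is built is not shown above. A minimal sketch follows, assuming the memory option takes a thread identifier and a resource identifier (the exact option names may differ between Mastra versions). MemoryScope, SessionIds, and buildMemoryScope are hypothetical names, and using the session as the thread and the user as the resource is an assumption.

```typescript
// Assumed option shape; check the installed Mastra version for the exact field names.
interface MemoryScope {
  thread: string; // one conversation thread
  resource: string; // the owner whose working memory is shared across threads
}

interface SessionIds {
  sessionId: string;
  userId: string;
  organizationId: string;
}

// Derive the memory scope from the authenticated session, or skip memory when there is none.
function buildMemoryScope(session: SessionIds | null): MemoryScope | undefined {
  if (!session) return undefined;
  return {
    // Prefix with the organization so identical user IDs in different organizations never collide.
    resource: `${session.organizationId}:${session.userId}`,
    thread: session.sessionId,
  };
}
```

With workingMemory scoped to 'resource', as in the configuration above, working memory would follow the user across sessions while message history stays per thread.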

API Endpoints

Core Routes

  • POST /health - Basic health check
  • POST /api/v1/whapi/webhook - WhatsApp webhook endpoint
  • POST /api/v1/whapi/test-site-agent - Test endpoint for site agent

Authentication Routes

  • POST /auth/login - User authentication
  • POST /auth/refresh - Session refresh

Error Handling

Simple Error Pattern

try {
  const response = await executeAgent(agentName, message, options);
  return res.json({ success: true, response: response.text });
} catch (error) {
  console.error('Agent error:', error);
  return res.status(500).json({
    error: 'Internal server error',
    details: error instanceof Error ? error.message : 'Unknown error'
  });
}
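The handler above always returns the error message in details. If that is undesirable outside development, the response body could be built by a small helper like the one sketched here. This is a possible variation, not the current behavior, and toErrorBody and ErrorBody are hypothetical names.

```typescript
interface ErrorBody {
  error: string;
  details?: string;
}

// Build the 500 response body; include the underlying message only when explicitly allowed.
function toErrorBody(error: unknown, exposeDetails: boolean): ErrorBody {
  const body: ErrorBody = { error: 'Internal server error' };
  if (exposeDetails) {
    body.details = error instanceof Error ? error.message : 'Unknown error';
  }
  return body;
}
```

A caller might pass `process.env.NODE_ENV !== 'production'` as exposeDetails, so internal messages reach the client only in development.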

Logging

  • Structured logging with Winston
  • Error tracking with context
  • Development vs production log levels

Deployment

Environment Variables

  • DATABASE_URL - PostgreSQL connection
  • MONGODB_URI - MongoDB connection
  • PARASAIL_API_KEY - AI model provider
  • WHAPI_API_KEY - WhatsApp integration
  • SUPABASE_URL & SUPABASE_ANON_KEY - Authentication
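A startup check for these variables might look like the sketch below. It treats every variable in the list as required, which is an assumption (MongoDB, for example, may be optional in development), and loadChatbotEnv is a hypothetical helper.

```typescript
const REQUIRED_ENV_VARS = [
  'DATABASE_URL',
  'MONGODB_URI',
  'PARASAIL_API_KEY',
  'WHAPI_API_KEY',
  'SUPABASE_URL',
  'SUPABASE_ANON_KEY',
] as const;

type RequiredEnvVar = (typeof REQUIRED_ENV_VARS)[number];
type ChatbotEnv = Record<RequiredEnvVar, string>;

// Fail fast with a single error that lists every missing variable.
function loadChatbotEnv(env: Record<string, string | undefined>): ChatbotEnv {
  const missing = REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === '';
  });
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  const resolved = {} as ChatbotEnv;
  for (const key of REQUIRED_ENV_VARS) {
    resolved[key] = env[key] as string; // safe: presence was checked above
  }
  return resolved;
}
```

Calling `loadChatbotEnv(process.env)` before the server starts would surface configuration problems at boot rather than on the first request.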

Server Startup

// app.ts startup sequence
const startServer = async () => {
  await connectToMongoDB(); // Optional: in development, startup continues if the connection fails
 
  if (process.env.DB_BOOTSTRAP === 'true') {
    await ensureDbIndexes(); // Creates PostgreSQL tables and indexes
  }
 
  app.listen(PORT, () => {
    logger.info(`Nimbly AI Chatbot running on port ${PORT}`);
  });
};

Testing

Agent Testing

// Test endpoint for agents
POST /api/v1/whapi/test-site-agent
{
  "message": "What are your business hours?"
}

Health Monitoring

  • Basic health endpoint returns service status
  • Database connectivity checks
  • Optional MongoDB connection monitoring
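The distinction between required and optional checks can be sketched as follows. This is an illustration under assumptions: HealthProbe, HealthReport, and buildHealthReport are hypothetical names, and the probes passed in would wrap the real PostgreSQL and MongoDB connectivity checks.

```typescript
type ProbeState = 'up' | 'down';

interface HealthProbe {
  label: string;
  required: boolean; // a failing optional probe does not mark the service unhealthy
  run: () => Promise<void>; // resolves when the dependency responds, rejects otherwise
}

interface HealthReport {
  healthy: boolean;
  checks: Record<string, ProbeState>;
}

// Run every probe in order and summarize the results.
async function buildHealthReport(probes: HealthProbe[]): Promise<HealthReport> {
  const report: HealthReport = { healthy: true, checks: {} };
  for (const probe of probes) {
    try {
      await probe.run();
      report.checks[probe.label] = 'up';
    } catch {
      report.checks[probe.label] = 'down';
      if (probe.required) report.healthy = false;
    }
  }
  return report;
}
```

Under this scheme, a MongoDB outage shows up in the report without failing the health endpoint, while a PostgreSQL outage does fail it.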

Scaling Considerations

Current Limitations

  • Single instance deployment
  • No horizontal scaling implemented
  • Memory storage in PostgreSQL (no distributed caching)

Scaling Path

  1. Database: PostgreSQL connection pooling
  2. Application: Horizontal scaling behind load balancer
  3. Memory: Consider Redis for session caching if needed
  4. AI Models: Model-specific rate limiting and fallbacks
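The model fallback mentioned in step 4 could be sketched as a small wrapper that tries each model in order. Nothing like this exists in the current implementation. ModelCall and generateWithFallback are hypothetical names, and each ModelCall would wrap a real provider call.

```typescript
// A model call takes a prompt and resolves to the generated text.
type ModelCall = (prompt: string) => Promise<string>;

// Try each model in order and return the first successful result.
async function generateWithFallback(prompt: string, models: ModelCall[]): Promise<string> {
  let lastError: unknown = new Error('No models configured');
  for (const call of models) {
    try {
      return await call(prompt);
    } catch (error) {
      lastError = error; // remember the failure and try the next model
    }
  }
  // Every model failed (or none were given): surface the most recent error.
  throw lastError;
}
```

The order of the models array expresses priority, so the primary model goes first and cheaper or more available models follow.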

This document describes the implementation as built, which is deliberately simple, rather than a target architecture.