Nimbly AI Chatbot Architecture

Last updated: 2026-03-03

1. System Context

flowchart LR
    U[WhatsApp User] --> W[Whapi Cloud]
    W -->|Webhook| API[Express App]

    API --> R[Routes + Middleware]
    R --> C[WhapiController]

    C --> H[Hybrid Routing + Intent Safety]
    C --> L[Language + Date Extraction]
    C --> A[Mastra Agent Runtime]

    A --> P1[Nimbly Platform API\nStatistics + Search Endpoints]
    A --> P2[Parasail LLM APIs\nGLM/Qwen/Llama/Gemma]

    API --> DB[(PostgreSQL\nDrizzle + Mastra Memory Store)]
    API --> SB[Supabase Auth]
    API --> WHO[Whapi outbound API]

Primary responsibilities

  • Webhook ingress and authentication/session gating.
  • Conversational routing across REPORT, ISSUE, SITE, plus fallback intents.
  • Agent orchestration with tool-calling into Nimbly platform APIs.
  • Session state, memory, language continuity, and response safety filtering.

2. Runtime Components

| Layer | Key files | Responsibility |
|---|---|---|
| App bootstrap | src/app.ts | Express init, middleware, route mounting, optional DB bootstrap (DB_BOOTSTRAP=true). |
| Route composition | src/routes/index.ts | Mounts /auth, /api/v1, /api/v1/whapi. |
| Webhook validation | src/middleware/whapi-validation.middleware.ts | Validates event shape, ignores echoed outbound messages, validates session. |
| Webhook orchestration | src/controllers/whapi.controller.ts | Full message pipeline, intent routing, agent execution, session updates. |
| Auth domain | src/domains/auth/*, src/services/auth.service.ts | Login-link flow, Supabase credential validation, session/token lifecycle. |
| Agent runtime | src/mastra/index.ts, src/mastra/agents/*, src/mastra/tools/* | Parent + specialist agents and tool integration. |
| Platform integration | src/services/dashboard/*, src/services/*search*.ts, src/services/platform-*.ts | Statistics/search APIs and token refresh/retry logic. |
| Persistence | src/db/*, src/domains/auth/repository/* | Drizzle schema/repositories and index bootstrap. |

3. HTTP Surface

| Method | Path | Handler | Notes |
|---|---|---|---|
| GET | /health | health.routes.ts | Health payload (status, timestamp, uptime, environment). |
| GET | /health/ping | health.routes.ts | Lightweight probe ({ pong: true }). |
| POST | /api/v1/whapi/webhook | WhapiController.handleWebhook | Main WhatsApp ingress path. |
| POST | /api/v1/whapi/test-site-agent | inline route in whapi.router.ts | Test-only direct site-agent invocation. |
| GET | /auth/login | AuthController.getLoginPage | Tokenized login page (HTML). |
| POST | /auth/authenticate | AuthController.authenticate | Validates credentials + creates/renews session. |
| POST | /auth/access | AuthController.createAccess | Admin/API path to create user_chatbot_access record. |
| POST | /api/v1/webhook/whatsapp | ChatController.webhook | Legacy stub path (returns { status: 'ok' }). |

4. Authenticated Message Flow

sequenceDiagram
    autonumber
    participant User as WhatsApp User
    participant Whapi as Whapi Cloud
    participant MW as whapiWebhookValidation
    participant WC as WhapiController
    participant HR as Hybrid Router
    participant AG as Mastra Agent
    participant Tool as KPI/Dashboard/Search Tool
    participant API as Nimbly Platform API
    participant DB as Postgres

    User->>Whapi: Send message
    Whapi->>MW: POST /api/v1/whapi/webhook
    MW->>DB: validateSession(whatsapp_number)
    DB-->>MW: Active session + context
    MW-->>WC: req.sessionContext set

    WC->>WC: Extract message / interactive payload
    WC->>WC: Greeting / command / shortcut checks
    WC->>HR: routeMessageWithHybridApproach(message, sessionContext)
    HR-->>WC: intent + confidence + routingLayer
    WC->>WC: resolveIntent safety net + language/date resolution
    WC->>DB: Persist intent, language, context_window, counters

    WC->>Whapi: Optional "processing" acknowledgement
    WC->>AG: executeAgent(report|issue|site)
    AG->>Tool: Delegated tool call(s)
    Tool->>API: Dashboard/KPI/search request
    API-->>Tool: Data
    Tool-->>AG: Structured result
    AG-->>WC: Final response text

    WC->>WC: cleanAIResponse + content safety
    WC->>Whapi: sendMessage(final text)
    WC->>DB: touchSession + pendingAction updates

Core pipeline behavior

  1. Webhook gating
    • Requires event.type === "messages".
    • Ignores outbound echo (from_me=true) and returns 200 { skipped: "outgoing" }.
    • Normalizes phone via normalizeWhatsappForStorage().
  2. Session state housekeeping
    • Session-memory TTL: 3h (SESSION_MEMORY_TTL_MS).
    • Expired state clears:
      • Mastra thread (agentMemory.deleteThread)
      • in-memory recent history (clearUserHistory)
      • DB conversation state (resetConversationState)
  3. Shortcuts before LLM routing
    • Greeting detection in multiple languages returns quick-button greeting.
    • Command handling (help, logout, clear, start/prompt, how to use, quick actions, account & settings).
    • Numeric quick prompts (1-5) and interactive IDs (qp_1 through qp_5).
  4. Intent + language + date resolution
    • Hybrid routing + resolver safety net determine final intent.
    • Language detection + follow-up continuity logic chooses response language.
    • AI date extraction (Qwen3-235B) accepted only when confidence >= 0.5.
  5. Execution and post-processing
    • Calls report-agent, issue-agent, or site-agent.
    • Cleans markdown/reasoning tags.
    • Applies profanity/insult output filter; unsafe outputs replaced with localized safe fallback.
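
The gating and date-confidence rules above can be sketched as follows. This is an illustrative outline, not the controller's actual code: the payload types, the gateWebhook and acceptDateRange names, and the 400 rejection status are assumptions. Only the event.type check, the from_me skip response, and the 0.5 threshold come from this document.

```typescript
// Simplified, hypothetical payload shapes; the real Whapi payload carries more fields.
interface WebhookMessage {
  from_me?: boolean;
  text?: { body?: string };
}

interface WebhookPayload {
  event?: { type?: string };
  messages?: WebhookMessage[];
}

type GateResult =
  | { kind: "reject"; status: number; body: { error: string } }
  | { kind: "skip"; status: number; body: { skipped: string } }
  | { kind: "process"; message: WebhookMessage };

function gateWebhook(payload: WebhookPayload): GateResult {
  const first = payload.messages?.[0];
  // Only "messages" events are processed (the 400 status is an assumption).
  if (payload.event?.type !== "messages" || first === undefined) {
    return { kind: "reject", status: 400, body: { error: "unsupported event" } };
  }
  // Echo of our own outbound message: acknowledge with 200 and skip.
  if (first.from_me === true) {
    return { kind: "skip", status: 200, body: { skipped: "outgoing" } };
  }
  return { kind: "process", message: first };
}

const DATE_MIN_CONFIDENCE = 0.5;

// Hypothetical result shape for the AI date extractor.
interface ExtractedRange {
  start: string;
  end: string;
  confidence: number;
}

// The extracted range is used only when confidence >= 0.5; otherwise it is dropped.
function acceptDateRange(range: ExtractedRange | null): ExtractedRange | null {
  return range !== null && range.confidence >= DATE_MIN_CONFIDENCE ? range : null;
}
```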

5. Unauthenticated Login Flow

sequenceDiagram
    autonumber
    participant User as WhatsApp User
    participant Whapi as Whapi Cloud
    participant MW as whapiWebhookValidation
    participant Auth as AuthUsecase
    participant DB as Postgres
    participant WA as Whapi Outbound
    participant Browser as User Browser
    participant Supa as Supabase
    participant Plat as Platform API

    User->>Whapi: Send first message
    Whapi->>MW: POST /api/v1/whapi/webhook
    MW->>Auth: validateSession(phone)
    Auth->>DB: findActiveByPhone
    DB-->>Auth: none
    Auth-->>MW: SESSION_EXPIRED

    MW->>Auth: generateLoginLink(phone)
    Auth->>DB: create login token (15 min TTL)
    Auth->>WA: sendUrlButtonMessage(auth link)
    MW-->>Whapi: 401 Authentication required

    User->>Browser: Open /auth/login?token=...&phone=...
    Browser->>Auth: GET login page
    Auth->>DB: diagnose token
    Auth-->>Browser: Render login form

    User->>Auth: POST /auth/authenticate
    Auth->>Supa: signInWithPassword
    Auth->>Plat: /v1.0/users/login + /v1.0/auth/retrieve-jwt
    Auth->>DB: upsert session (encrypted JWT + encrypted platform token)
    Auth->>WA: send login success buttons
    Auth->>DB: mark login token used
    Auth-->>Browser: Success page + redirect to WhatsApp

Auth/session details

  • Session JWT (issued and verified with JWT_SECRET) is encrypted at rest (AES-256-GCM, keyed by ENCRYPTION_KEY).
  • Session default TTL in DB: 720 hours.
  • Platform token is stored encrypted and decrypted in repository mapping.
  • Login token table enforces single-use token semantics (used flag + expiry).
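
As a rough illustration of the single-use login token semantics (used flag + 15-minute expiry), the check could look like the sketch below. The LoginToken shape and the issueLoginToken, diagnoseToken, and consumeToken helpers are hypothetical and only mirror the LOGIN_TOKENS columns described in this document; the real repository code may differ.

```typescript
const LOGIN_TOKEN_TTL_MS = 15 * 60 * 1000; // 15-minute login-link TTL

// Hypothetical in-memory view of a LOGIN_TOKENS row.
interface LoginToken {
  token: string;
  expiresAt: Date;
  used: boolean;
}

type TokenDiagnosis = "ok" | "expired" | "used";

function issueLoginToken(token: string, now: Date): LoginToken {
  return { token, expiresAt: new Date(now.getTime() + LOGIN_TOKEN_TTL_MS), used: false };
}

// A token is usable only if it has not been used and has not expired.
function diagnoseToken(t: LoginToken, now: Date): TokenDiagnosis {
  if (t.used) return "used";
  if (now.getTime() >= t.expiresAt.getTime()) return "expired";
  return "ok";
}

// Returns the consumed token (used = true), or null when it can no longer be used.
function consumeToken(t: LoginToken, now: Date): LoginToken | null {
  return diagnoseToken(t, now) === "ok" ? { ...t, used: true } : null;
}
```

Returning a new object from consumeToken keeps the sketch side-effect free; a database-backed version would more likely mark the row as used in place.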

6. Hybrid Intent Routing

flowchart TD
    A[Incoming message] --> B[Entity extraction]
    B --> C{Has entities + currentIntent?}
    C -->|Yes + strong entity match| D[entity_fast\nconfidence 0.85]
    C -->|No| E[LLM intent classifier]

    E --> F{Likely follow-up\nand currentIntent exists?}
    F -->|Yes| G[context_analyzer\nreuse previous intent]
    F -->|No| H{confidence >= 0.7?}

    H -->|Yes| I[classifier direct]
    H -->|No| J{confidence < 0.4\n+ contextWindow exists\n+ turn_count < 5\n+ topic_switch_count < 2}
    J -->|Yes| K[context analyzer model]
    J -->|No| L[classifier fallback\nor previous intent]

    D --> M[Intent resolver safety net]
    G --> M
    I --> M
    K --> M
    L --> M

    M --> N[Final intent]
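
The decision order in the flowchart can be read as a small pure function. The sketch below is an interpretation of the diagram, not the code in hybrid-router.service.ts: the RoutingInput fields, the pickRoutingLayer name, and the layer identifiers other than entity_fast and context_analyzer are assumptions.

```typescript
type Intent = "REPORT" | "ISSUE" | "SITE" | "IRRELEVANT" | "OTHER";

type RoutingLayer =
  | "entity_fast"
  | "context_analyzer"
  | "classifier_direct"
  | "context_analyzer_model"
  | "classifier_fallback";

// Hypothetical summary of the signals the router has at decision time.
interface RoutingInput {
  hasStrongEntityMatch: boolean;
  currentIntent?: Intent;
  likelyFollowUp: boolean;
  classifierConfidence: number;
  hasContextWindow: boolean;
  turnCount: number;
  topicSwitchCount: number;
}

const HIGH_CONFIDENCE = 0.7;
const MEDIUM_CONFIDENCE = 0.4;

function pickRoutingLayer(input: RoutingInput): RoutingLayer {
  const hasCurrentIntent = input.currentIntent !== undefined;
  // Fast path: entities strongly match the intent already in progress.
  if (input.hasStrongEntityMatch && hasCurrentIntent) return "entity_fast";
  // Follow-ups reuse the previous intent.
  if (input.likelyFollowUp && hasCurrentIntent) return "context_analyzer";
  // A confident classifier result is used directly.
  if (input.classifierConfidence >= HIGH_CONFIDENCE) return "classifier_direct";
  // The context analyzer model runs only for low confidence and within the guardrails.
  const contextEligible =
    input.classifierConfidence < MEDIUM_CONFIDENCE &&
    input.hasContextWindow &&
    input.turnCount < 5 &&
    input.topicSwitchCount < 2;
  return contextEligible ? "context_analyzer_model" : "classifier_fallback";
}
```

Whatever layer is chosen, the result still passes through the intent resolver safety net before it becomes the final intent.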

Safety-net heuristics (resolveIntent)

  • Reuses previous intent for short follow-ups and confirmations.
  • Hard cues can override classifier output:
    • issue terms/IDs (NIMBLY-123, bug, ticket)
    • report terms (report, KPI, RCR)
    • site/location terms
  • IRRELEVANT + confirmation can reuse prior domain intent.
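
A minimal sketch of how such heuristics might combine is shown below. The cue word lists, the confirmation list, the three-word cutoff for "short" follow-ups, and the precedence (issue, then report, then site) are all assumptions made for illustration; the actual rules live in intent-resolver.service.ts and may differ.

```typescript
type Intent = "REPORT" | "ISSUE" | "SITE" | "IRRELEVANT" | "OTHER";

// Assumed cue patterns, loosely based on the examples listed in this document.
const ISSUE_CUE = /\b(NIMBLY-\d+|bug|ticket)\b/i;
const REPORT_CUE = /\b(reports?|kpi|rcr)\b/i;
const SITE_CUE = /\b(sites?|locations?)\b/i;
const CONFIRMATION = /^(yes|yep|ok|okay|sure)$/i;
const SHORT_FOLLOW_UP_MAX_WORDS = 3; // assumed cutoff

function isDomainIntent(intent: Intent | undefined): intent is "REPORT" | "ISSUE" | "SITE" {
  return intent === "REPORT" || intent === "ISSUE" || intent === "SITE";
}

function resolveIntent(classified: Intent, previous: Intent | undefined, message: string): Intent {
  const text = message.trim();
  // Hard cues can override the classifier output.
  if (ISSUE_CUE.test(text)) return "ISSUE";
  if (REPORT_CUE.test(text)) return "REPORT";
  if (SITE_CUE.test(text)) return "SITE";
  if (isDomainIntent(previous)) {
    // Confirmations (including IRRELEVANT + confirmation) reuse the prior domain intent.
    if (CONFIRMATION.test(text)) return previous;
    // Short follow-ups such as "last month" also reuse it.
    const words = text.split(/\s+/).filter((w) => w.length > 0).length;
    if (words <= SHORT_FOLLOW_UP_MAX_WORDS) return previous;
  }
  return classified;
}
```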

7. Agent Architecture and Delegation

graph TD
    RPT[report-agent\nGLM-4.6V] --> RKPI[report-kpi-agent]
    RPT --> RDASH[report-dashboard-agent]
    RPT --> S1[search-sites]
    RPT --> U1[search-users]
    RPT --> Q1[search-questionnaires]

    ISS[issue-agent\nGLM-4.6V] --> IKPI[issue-kpi-agent]
    ISS --> IDASH[issue-dashboard-agent]
    ISS --> S2[search-sites]
    ISS --> U2[search-users]
    ISS --> Q2[search-questionnaires]

    SITE[site-agent\nGLM-4.7-FP8] --> S3[search-sites]
    SITE --> U3[search-users]
    SITE --> Q3[search-questionnaires]

    RKPI --> T1[report-kpi]
    RKPI --> T2[combined-kpi]
    RKPI --> T3[platform-stats]

    RDASH --> T4[report-dashboard]
    RDASH --> T5[issue-dashboard\nfor IRR merges]

    IKPI --> T6[issue-kpi]
    IKPI --> T7[combined-kpi]
    IKPI --> T8[platform-stats]

    IDASH --> T9[issue-dashboard]

Deterministic routing overrides

  • report-agent bypasses orchestration LLM in specific cases:
    • aggregate report-count intents across all branches/sites
    • combined report+issue overview intents
  • issue-agent can force KPI route for KPI-like intents and site-follow-up cases.

Site/user/questionnaire resolution

  • Search tools are used to resolve names into IDs before analytics calls.
  • resolvePlatformSiteIDs() handles direct IDs, numeric references, and disambiguation.
  • Numeric values that look like calendar years are excluded from site-ID candidates.
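
The year-exclusion rule can be illustrated with a short sketch. The 1900-2100 range and the helper names are assumptions; resolvePlatformSiteIDs() itself also handles direct IDs and disambiguation, which are not shown here.

```typescript
// Assumed definition of "looks like a calendar year".
function looksLikeCalendarYear(n: number): boolean {
  return Number.isInteger(n) && n >= 1900 && n <= 2100;
}

// Collects standalone numbers from the text and drops year-like values,
// so "site 42 in 2025" yields 42 as the only site-ID candidate.
function extractSiteIdCandidates(text: string): number[] {
  const matches = text.match(/\b\d+\b/g) ?? [];
  return matches.map((m) => Number(m)).filter((n) => !looksLikeCalendarYear(n));
}
```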

8. Tool-to-API Mapping

| Tool | Service | Endpoint family |
|---|---|---|
| report-kpi | fetchReportKPIs | POST /v1.0/statistics/dashboard/report/detail |
| issue-kpi | fetchIssueKPIs | POST /v1.0/statistics/dashboard/issue/detail |
| combined-kpi | fetchCombinedKPIs | Parallel issue/report detail endpoints |
| report-dashboard | fetchReportDashboard | POST /v1.0/statistics/dashboard/report |
| issue-dashboard | fetchIssueDashboard | POST /v1.0/statistics/dashboard/issue |
| platform-stats | StatisticsService | GET /v1.0/statistics/dashboard/home-page-metrics |
| search-sites | searchSites | GET /v1.0/sites/compact/paginate |
| search-users | searchUsers | GET /v1.0/users/paginate |
| search-questionnaires | searchQuestionnaires | GET /v1.0/questionnaires/questionnaireIndexes/paginate |

9. Token Refresh and Request Context

AsyncLocalStorage carries request-scoped session credentials into deep service calls.

sequenceDiagram
    autonumber
    participant Controller as WhapiController
    participant Ctx as AsyncLocalStorage
    participant Stats as callStatisticsDashboard / PlatformApiClient
    participant Refresh as platformTokenRefreshService
    participant DB as SessionRepository
    participant Plat as Platform Internal Auth API

    Controller->>Ctx: runWithRequestContext(sessionId, platformToken, ...)
    Stats->>Plat: API call with platform token
    Plat-->>Stats: 401 Unauthorized

    Stats->>Refresh: refresh(sessionId)
    Refresh->>DB: getSessionCredentials(sessionId)
    Refresh->>Refresh: decrypt existing token and extract platform userID
    Refresh->>Plat: GET /v1.0/auth/internal/{userID}/jwt-token
    Plat-->>Refresh: new JWT token
    Refresh->>DB: updatePlatformToken(encrypted)
    Refresh-->>Stats: new token

    Stats->>Ctx: setRequestPlatformToken(new token)
    Stats->>Plat: retry original request

Failure outcome

  • If refresh ultimately fails, PlatformAuthExpiredError is thrown.
  • Controller catches this, invalidates session, notifies user, and generates a new login link.
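
The 401, refresh, retry sequence above can be sketched as a small wrapper. This is a simplified illustration under stated assumptions: callWithTokenRefresh, ApiResult, and the injected request/refresh callbacks are hypothetical, and the class body for PlatformAuthExpiredError is a stand-in for the real one.

```typescript
// Error name comes from this document; the class body is a stand-in.
class PlatformAuthExpiredError extends Error {
  constructor(message = "Platform authentication expired") {
    super(message);
    this.name = "PlatformAuthExpiredError";
  }
}

// Hypothetical minimal response shape.
interface ApiResult<T> {
  status: number;
  data?: T;
}

async function callWithTokenRefresh<T>(
  token: string,
  request: (token: string) => Promise<ApiResult<T>>,
  refresh: () => Promise<string | null>,
): Promise<T> {
  let result = await request(token);
  if (result.status === 401) {
    // Token rejected: obtain a new one, then retry the original request once.
    const fresh = await refresh();
    if (fresh === null) throw new PlatformAuthExpiredError();
    result = await request(fresh);
    if (result.status === 401) throw new PlatformAuthExpiredError();
  }
  if (result.data === undefined) {
    throw new Error("Unexpected platform response status " + result.status);
  }
  return result.data;
}
```

In this sketch the caller is expected to catch PlatformAuthExpiredError and run the session invalidation and re-login steps listed under Failure outcome.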

10. Memory and Conversation State

Three memory layers are active:

  1. Session DB state (user_sessions)
    • current_intent, pending_action, turn_count, topic_switch_count
    • last_entities, context_window, confidence_history
  2. Mastra persistent memory (PostgresStore)
    • Thread/resource model built from organization + session/user identifiers.
    • Thread summarization/pruning when message count grows (>=40, keep last 6).
  3. In-process short history (Map)
    • Last 3 user utterances per thread for prompt priming.

stateDiagram-v2
    [*] --> Unauthenticated
    Unauthenticated --> PendingLogin: generateLoginLink()
    PendingLogin --> Authenticated: /auth/authenticate success

    state Authenticated {
      [*] --> Active
      Active --> Active: message turn / touchSession
      Active --> PendingAction: assistant asks confirmation
      PendingAction --> Active: confirmation or new non-confirmation message
      Active --> MemoryExpired: inactivity > 3h
      PendingAction --> MemoryExpired: inactivity > 3h
      MemoryExpired --> Active: state reset + new turn
    }

    Authenticated --> LoggedOut: logout command / platform auth expired / softDelete
    LoggedOut --> PendingLogin: next incoming message
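
Two of the memory rules in this section, the in-process short history and the summarization threshold, are simple enough to sketch. Apart from clearUserHistory, which this document names, the function names and signatures below are hypothetical, and the real modules (recent-user-history.ts, thread-summary.ts) may be structured differently.

```typescript
const MAX_RECENT_UTTERANCES = 3; // last 3 user utterances per thread
const SUMMARY_TRIGGER = 40;      // summarize once a thread reaches 40 messages
const KEEP_LAST = 6;             // keep the last 6 messages after pruning

const recentByThread = new Map<string, string[]>();

function rememberUtterance(threadId: string, text: string): void {
  const next = (recentByThread.get(threadId) ?? []).concat(text);
  recentByThread.set(threadId, next.slice(-MAX_RECENT_UTTERANCES));
}

function getRecentUtterances(threadId: string): string[] {
  return recentByThread.get(threadId) ?? [];
}

function clearUserHistory(threadId: string): void {
  recentByThread.delete(threadId);
}

// Splits a thread into the part to summarize and the part to keep verbatim.
function planThreadPrune<T>(messages: T[]): { toSummarize: T[]; toKeep: T[] } {
  if (messages.length < SUMMARY_TRIGGER) return { toSummarize: [], toKeep: messages };
  return { toSummarize: messages.slice(0, -KEEP_LAST), toKeep: messages.slice(-KEEP_LAST) };
}
```

Because the Map lives in process memory, this layer is lost on restart and is not shared between instances; the session DB state and the Mastra store are the durable layers.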

11. Data Model

erDiagram
    USER_CHATBOT_ACCESS ||--o{ USER_SESSIONS : has

    USER_CHATBOT_ACCESS {
        uuid id PK
        varchar username
        varchar email
        uuid supabase_user_id
        varchar organization_id
        varchar whatsapp_number
        boolean is_enabled
        text[] roles
        jsonb scopes
        jsonb rate_limit
        timestamptz created_at
        timestamptz updated_at
        timestamptz deleted_at
    }

    USER_SESSIONS {
        uuid session_id PK
        uuid user_id FK
        varchar organization_id
        varchar whatsapp_number
        text auth_token
        text platform_token
        varchar preferred_language
        timestamptz created_at
        timestamptz last_activity
        timestamptz expires_at
        timestamptz deleted_at
        varchar current_intent
        varchar pending_action
        timestamptz pending_action_updated_at
        int turn_count
        int topic_switch_count
        jsonb last_entities
        jsonb context_window
        jsonb confidence_history
    }

    LOGIN_TOKENS {
        uuid id PK
        varchar whatsapp_number
        varchar token UK
        timestamptz expires_at
        boolean used
        timestamptz created_at
    }

Important indexes/constraints

  • Partial unique indexes enforce active-record uniqueness, including:
    • one active access record per (email, organization_id)
    • one active session per (whatsapp_number, organization_id)
  • Login token uniqueness via idx_login_tokens_token.

12. Date/Time Semantics

  • Natural-language ranges are extracted in two stages:
    • extractDateRangeWithAI() for context-aware extraction in the webhook flow.
    • date-range.ts utilities for normalization/keyword expansion.
  • Supported keywords include static (this_week, last_month) and rolling (last_7_days, last_3_months).
  • Statistics period bucketing is inferred from the resolved span:
    • <=14 days: daily
    • <=45 days: weekly
    • <=120 days: monthly
    • <=365 days: quarterly
    • otherwise: yearly
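
The bucketing thresholds translate directly into a lookup function. In the sketch below, periodForSpan and spanDays are hypothetical names, and counting the span as inclusive UTC days is an assumption about how the span is measured.

```typescript
type Period = "daily" | "weekly" | "monthly" | "quarterly" | "yearly";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Inclusive day count between two YYYY-MM-DD dates, computed in UTC (an assumption).
function spanDays(start: string, end: string): number {
  const startMs = Date.parse(start + "T00:00:00Z");
  const endMs = Date.parse(end + "T00:00:00Z");
  return Math.round((endMs - startMs) / MS_PER_DAY) + 1;
}

// Maps a span length to the statistics bucketing period.
function periodForSpan(days: number): Period {
  if (days <= 14) return "daily";
  if (days <= 45) return "weekly";
  if (days <= 120) return "monthly";
  if (days <= 365) return "quarterly";
  return "yearly";
}
```

For example, under these assumptions a range from 2026-01-01 to 2026-01-14 spans 14 days and is bucketed daily, while extending it by one day switches to weekly.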

13. Security, Safety, and Guardrails

  • Token security: JWT + platform tokens encrypted in DB (AES-256-GCM).
  • Session gating: webhook processing blocked until valid session.
  • Prompt guardrails: explicit professional-conduct constraints in agent instructions.
  • Output safety filter: profanity/insult pattern check before outbound WhatsApp send.
  • No secret leakage policy: agent prompts explicitly forbid exposing internal tokens/IDs.

14. Observability and Reliability

  • Winston JSON logging (logs/error.log, logs/combined.log; console in non-prod).
  • Tool/agent execution logs include request/response previews and tool-call summaries.
  • Retry wrappers:
    • Parasail provider retries on retryable HTTP statuses and network errors.
    • Generic withRetry() handles transient transport failures.
    • Platform token refresh uses exponential backoff and deduplicated in-flight refreshes.
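
The retry building blocks listed above can be sketched in isolation. The retryable status set matches the Parasail provider settings documented in this file; the base delay, the cap, and the helper names (isRetryableStatus, backoffDelayMs, refreshOnce) are assumptions, since the actual values are not stated here.

```typescript
const RETRYABLE_STATUSES = new Set<number>([429, 500, 502, 503, 504]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Exponential backoff: delay doubles per attempt (0-based), up to a cap.
// baseMs and capMs are illustrative values, not taken from the codebase.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// De-duplicates concurrent refreshes: callers for the same session share one promise.
const inFlight = new Map<string, Promise<string>>();

function refreshOnce(
  sessionId: string,
  doRefresh: (sessionId: string) => Promise<string>,
): Promise<string> {
  const existing = inFlight.get(sessionId);
  if (existing !== undefined) return existing;
  const pending = doRefresh(sessionId);
  const release = (): void => {
    inFlight.delete(sessionId);
  };
  // Release the slot whether the refresh succeeds or fails.
  pending.then(release, release);
  inFlight.set(sessionId, pending);
  return pending;
}
```

Sharing the in-flight promise means several tool calls that hit a 401 at the same time trigger a single refresh request rather than one each.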

15. Deployment Topology

flowchart TD
    Dev["Developer / CI"] --> Build["Docker build (Node 20)"]
    Build --> ECR[Push image to AWS ECR]
    ECR --> ECS["ECS Fargate Service<br/>(staging or production)"]
    ECS --> Health["/health endpoint"]

    Dev --> GCF["Optional Google Cloud Functions deploy<br/>via deployment script"]

    Env["Environment config in GCS"] --> ECS
    Env --> GCF

Notable operational details

  • ECS deployment script injects environment-specific task/service config and waits for stable rollout.
  • GCS-backed .env sync scripts are used for staging/production env distribution.
  • Docker runtime exposes 8080 (while local TS dev defaults to 3133).

16. Known Legacy/Transitional Areas

  • POST /api/v1/webhook/whatsapp (ChatController) is currently a stub and not the primary production path.
  • MessageRouterService exists but current webhook path uses WhapiController + direct executeAgent(...) orchestration.
  • docker-compose.yml still provisions MongoDB, but runtime persistence is PostgreSQL/Drizzle/Mastra Postgres store.

17. Key Source Map

  • Entry: src/app.ts
  • Routing: src/routes/index.ts, src/routes/whapi.router.ts
  • Webhook pipeline: src/middleware/whapi-validation.middleware.ts, src/controllers/whapi.controller.ts
  • Auth/session: src/domains/auth/*, src/services/auth.service.ts
  • Hybrid routing: src/services/hybrid-router.service.ts, src/services/intent-resolver.service.ts
  • Agent runtime: src/mastra/index.ts, src/mastra/agents/*, src/mastra/tools/*
  • Dashboard integration: src/services/dashboard/*
  • Token refresh: src/services/platform-token-refresh.service.ts, src/services/request-context.ts
  • DB schema: src/domains/auth/auth.schema.ts, src/db/ensure-indexes.ts

18. Support Matrix

18.1 Channel and Message-Type Support

| Capability | Status | Details |
|---|---|---|
| WhatsApp ingress via Whapi webhook | Supported | POST /api/v1/whapi/webhook |
| Plain text user messages | Supported | Reads messages[0].text.body |
| Interactive quick-reply/list responses | Supported | Reads interactive.list_reply, interactive.button_reply, reply.list_reply, reply.buttons_reply |
| Media message understanding (image/video/audio/document/location/contact) | Not implemented | Message schema exists, but controller routing only consumes text/interactive title fields |
| Outbound text message | Supported | sendMessage() |
| Outbound interactive quick-reply buttons | Supported | sendReplyButtonMessage() |
| Outbound list message | Supported | sendListMessage() |
| Outbound URL button | Supported | sendUrlButtonMessage() |

18.2 Intent and Domain Support

| Intent | Status | Primary runtime path |
|---|---|---|
| REPORT | Supported | report-agent -> KPI/dashboard/search tools |
| ISSUE | Supported | issue-agent -> KPI/dashboard/search tools |
| SITE | Supported | site-agent -> search tools |
| IRRELEVANT | Supported | Localized help fallback response |
| OTHER | Supported | Clarification response / fallback to previous intent when appropriate |

18.3 Command Support (Current Behavior)

Commands are phrase-based exact matches on lowercased trimmed text:

| Input text | Behavior |
|---|---|
| help | Help buttons |
| start or prompt | Starter/help buttons |
| quick actions | Quick-action list |
| how to use | Usage explainer |
| account & settings | Clear/logout buttons |
| clear | Clears memory + session conversation state |
| logout | Soft-deletes session and prompts re-login |

Notes:

  • Numeric quick prompts are supported (1-5, 1️⃣-5️⃣, and qp_1 through qp_5 interactive IDs).
  • Slash variants like /help are not explicitly parsed by CommandService and are currently not first-class command tokens.
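
The exact-match behavior and quick-prompt forms described in this section can be sketched as below. The Command identifiers, matchCommand, and parseQuickPrompt are hypothetical names; CommandService may organize this differently.

```typescript
type Command =
  | "help"
  | "start"
  | "quick_actions"
  | "how_to_use"
  | "account_settings"
  | "clear"
  | "logout";

// Phrase table mirroring the documented command inputs.
const COMMAND_PHRASES = new Map<string, Command>([
  ["help", "help"],
  ["start", "start"],
  ["prompt", "start"],
  ["quick actions", "quick_actions"],
  ["how to use", "how_to_use"],
  ["account & settings", "account_settings"],
  ["clear", "clear"],
  ["logout", "logout"],
]);

// Exact match on lowercased, trimmed text; anything else is not a command.
function matchCommand(text: string): Command | null {
  return COMMAND_PHRASES.get(text.trim().toLowerCase()) ?? null;
}

// Accepts "1".."5", keycap emoji digits, and "qp_1".."qp_5" interactive IDs.
function parseQuickPrompt(input: string): number | null {
  const m = /^(?:qp_([1-5])|([1-5])\uFE0F?\u20E3?)$/.exec(input.trim());
  if (m === null) return null;
  return Number(m[1] ?? m[2]);
}
```

With exact matching, matchCommand("/help") returns null, which is consistent with slash variants not being treated as command tokens.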

18.4 Language Support

The codebase has multiple language layers with different coverage:

| Layer | Effective support |
|---|---|
| Full response-language pipeline (SupportedLanguage) | English, Indonesian, Malay, Spanish, Portuguese, Thai, Korean |
| System localization bundle (getLocalizedMessage) | English, Indonesian, Malay, Spanish, Portuguese, Thai, Korean |
| LLM language-ID model output codes | en, id, ms, es, pt, th, ko |
| Greeting detection (getGreetingLanguage) | English, Indonesian, Portuguese, Spanish, Japanese, Korean, Chinese, Thai |
| Greeting response strings | Includes English, Indonesian, Malay, Portuguese, Spanish, Japanese, Korean, Chinese, Thai, Vietnamese |
| Safety fallback messages | Includes English, Indonesian, Malay, Spanish, Portuguese, Thai, Korean, Vietnamese, Chinese |

Implication: end-to-end localized business responses are strongest for the 7 SupportedLanguage values; additional languages appear in greeting/safety paths but do not have full shared localization coverage.

18.5 Date/Time Input Support

Supported keyword forms accepted by dashboard/KPI tool schemas:

  • Static keywords: today, yesterday, this_week, last_week, this_month, last_month, this_year, last_year, last_30_days
  • Rolling keywords: last_<N>_days|weeks|months|years
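
A whitelist check for these keyword forms might look like the following sketch. sanitizeDateKeyword is a hypothetical name, and requiring N to be a positive integer without leading zeros is an assumption; the keyword lists themselves come from this section.

```typescript
const STATIC_KEYWORDS = new Set<string>([
  "today", "yesterday", "this_week", "last_week", "this_month",
  "last_month", "this_year", "last_year", "last_30_days",
]);

// last_<N>_days|weeks|months|years with N >= 1 (assumed constraint on N).
const ROLLING_KEYWORD = /^last_([1-9]\d*)_(days|weeks|months|years)$/;

// Returns the normalized keyword, or null when it is not on the whitelist.
function sanitizeDateKeyword(keyword: string | null | undefined): string | null {
  if (keyword === null || keyword === undefined) return null;
  const k = keyword.trim().toLowerCase();
  return STATIC_KEYWORDS.has(k) || ROLLING_KEYWORD.test(k) ? k : null;
}
```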

Additional date parsing supported in utility layer (utils/date-range.ts):

  • Explicit YYYY-MM-DD, DD/MM/YYYY, DD-MM-YYYY, and textual month forms
  • English/Indonesian/Thai month names
  • Thai Buddhist Era year normalization (CE = BE - 543)
  • Relative phrasing such as X days ago, last few weeks, weekday/weekend modifiers
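
As an illustration of the day-first formats and Buddhist Era handling, the sketch below parses DD/MM/YYYY or DD-MM-YYYY into an ISO date. The function names and the rule that years of 2400 or later are treated as Buddhist Era are assumptions; utils/date-range.ts may use a different detection rule.

```typescript
// Assumed heuristic: four-digit years >= 2400 are Buddhist Era (CE = BE - 543).
function normalizeBuddhistYear(year: number): number {
  return year >= 2400 ? year - 543 : year;
}

// Parses DD/MM/YYYY or DD-MM-YYYY into YYYY-MM-DD, or returns null if invalid.
function parseDayFirstDate(input: string): string | null {
  const m = /^(\d{1,2})[\/-](\d{1,2})[\/-](\d{4})$/.exec(input.trim());
  if (m === null) return null;
  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = normalizeBuddhistYear(Number(m[3]));
  const date = new Date(Date.UTC(year, month - 1, day));
  // Reject rollovers such as 31/02, which Date would silently move into March.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    return null;
  }
  return date.toISOString().slice(0, 10);
}
```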

18.6 Analytics Keyword-Preset Support

Report dashboard presets (report-dashboard):

  • completion_by_site, completion_by_department, completion_by_user, completion_by_questionnaire, completion_by_date, completion_by_site_group
  • missed_by_site, missed_by_department, missed_by_user
  • rcr_by_site, rcr_by_department, rcr_by_user
  • completion_time_by_site, completion_time_by_department
  • submitted_by_site, submitted_by_user
  • report_trend, report_status_breakdown, report_type_breakdown

Issue dashboard presets (issue-dashboard):

  • issues_by_site, issues_by_department, issues_by_user, issues_by_questionnaire, issues_by_date, issues_by_site_group
  • open_issues_by_site, open_issues_by_department, open_issues_by_user
  • resolution_time_by_site, resolution_time_by_department
  • issue_trend, issue_flags, recurring_issues, issue_heatmap, issue_insights

19. Configuration Reference

19.1 Runtime Environment Variables

| Variable | Required | Default | Used by | Behavior if missing |
|---|---|---|---|---|
| PORT | No | 3133 | app.ts | App binds on default local port |
| NODE_ENV | No | development | app.ts, logger, health | Affects startup branch, logging, error stack visibility |
| DB_BOOTSTRAP | No | false | app.ts | Skips ensureDbIndexes() when not true |
| DATABASE_URL | Yes (production) | none | Drizzle, Mastra memory | DB operations fail if absent/unreachable |
| PGSSL | No | false | Drizzle pool config | SSL disabled unless explicitly true |
| ENCRYPTION_KEY | Yes | none | token encrypt/decrypt | Throws when encrypt/decrypt is attempted |
| JWT_SECRET | Yes | none | auth JWT issue/verify | Throws on session token issue/validation |
| SUPABASE_URL | Yes (for login) | none | Supabase clients | Supabase public client init fails |
| SUPABASE_ANON_KEY | Yes (for login) | none | Supabase public client | Sign-in unavailable |
| SUPABASE_SERVICE_ROLE_KEY | No | none | Supabase admin client | Admin provisioning path disabled |
| PLATFORM_API_URL | No | https://api-staging.hellonimbly.com | Platform auth/API client | Uses staging endpoint by default |
| PLATFORM_AUTH_INTERNAL_TOKEN | Yes (for refresh path) | none | token refresh service | Refresh throws PlatformAuthExpiredError |
| STATISTICS_JWT_SECRET | Conditionally required | none | statistics fallback token | Fallback stats token generation fails |
| STATISTICS_API_BASE_URL | No | https://api-staging.hellonimbly.com/v1.0 | statistics dashboard client | Uses default base URL |
| STATISTICS_API_TIMEOUT_MS | No | 12000 | statistics dashboard client | Uses 12s request timeout |
| PARASAIL_API_KEY | Yes (for agents) | none | Mastra agents/date/lang/classifier/context | Agent/model init or inference paths fail |
| PARASAIL_API_BASE_URL | No | https://api.parasail.io/v1 | model provider | Uses default Parasail base URL |
| WHAPI_API_KEY | Required for outbound send | empty | Whapi client | Send functions no-op with warning logs |
| WHAPI_BASE_URL | No | https://gate.whapi.cloud | Whapi client | Uses default Whapi endpoint |
| APP_BASE_URL | No | http://localhost:3133 | login link generation | Login links point to default local URL |
| APP_LOGO_URL | No | hardcoded Firebase URL | auth HTML pages | Uses default logo URL |
| LOG_LEVEL | No | info | winston logger | Uses info level |
| LOG_DIR | No | logs | winston logger | Writes logs under logs/ |
| MASTRA_TELEMETRY_DISABLED | No | false | Mastra boot | Telemetry disabled only when true |

19.2 Script/Diagnostic Variables

| Variable | Used by | Purpose |
|---|---|---|
| TEST_ORG_ID | src/scripts/test-kpi-apis.ts | Target organization for KPI smoke tests |
| PLATFORM_EMAIL, PLATFORM_PASSWORD | src/scripts/test-platform-api.ts | Platform API test credential inputs |

19.3 Deployment Configuration Artifacts

| File(s) | Purpose |
|---|---|
| ecs-task-definition-*.json | ECS task CPU/memory/container/health definitions |
| ecs-service-*.json | ECS service network/load balancer/deployment settings |
| scripts/utils/populate-config.sh | Replaces placeholder IDs/ARNs with real AWS resources |
| scripts/deployment/deploy-ecs.sh | End-to-end image build/push/deploy/wait flow |
| scripts/deployment/deploy.sh | Optional Google Cloud Functions deployment path |
| scripts/env/*.sh | GCS-based .env sync helpers |

20. Operational Limits and Tuning Knobs

| Area | Current value | Location |
|---|---|---|
| Session conversation-memory TTL | 3 hours | whapi.controller.ts |
| Session auth TTL | 720 hours (30 days) | auth.usecase.ts + jwt.util.ts |
| Login link TTL | 15 minutes | auth.usecase.ts |
| AI date extraction minimum confidence | 0.5 | whapi.controller.ts |
| Hybrid classifier thresholds | high 0.7, medium 0.4, low 0.2 | hybrid-router.service.ts |
| Context analysis guardrails | turn_count < 5, topic_switch_count < 2 | hybrid-router.service.ts |
| Context window persistence | max 3 turns (6 messages) | updateContextWindow() |
| In-memory recent user history | max 3 entries/thread | recent-user-history.ts |
| Thread summarization trigger | at least 40 messages | thread-summary.ts |
| Thread prune keep-back | last 6 messages | thread-summary.ts |
| Agent short-circuit token guard | 60000 estimated tokens | report/issue/site agents |
| Statistics API timeout | 12000 ms default | statistics-api.client.ts |
| Search pagination defaults | page 1, limit 10 | search services |
| Platform refresh retries | 3 attempts with exponential backoff | platform-token-refresh.service.ts |
| Parasail provider retries | 3 attempts, retry on 429/500/502/503/504 + network | parasail-provider.ts |
| Generic agent-call retries | 2 retries for transient transport errors | mastra/utils/retry.ts |

21. Limitations and Non-Goals (Current Implementation)

  1. Media-content understanding is not implemented in the webhook controller; only text/interactive titles are routed.
  2. Streaming LLM responses are not implemented (streamChat throws "Streaming not implemented yet").
  3. Command recognition is exact phrase matching; slash-prefixed command tokens are not a first-class parse mode.
  4. Full multilingual coverage is uneven: core localized flows are 7-language, while greeting/safety layers include additional languages.
  5. Date keyword whitelist excludes values like tomorrow; if AI extraction returns unsupported keywords they are sanitized to null.
  6. Webhook route is defined for POST only; middleware includes a GET-verification branch that is not directly mounted by router configuration.
  7. docker-compose.yml still provisions MongoDB although runtime persistence is PostgreSQL + Drizzle + Mastra Postgres store.
  8. MessageRouterService and /api/v1/webhook/whatsapp are legacy/non-primary execution paths.
  9. Rate-limit policy fields exist in user_chatbot_access but no enforcement middleware/service currently consumes them.
  10. If WHAPI_API_KEY is missing, outbound messaging silently degrades to log-only behavior (no user delivery).

22. Failure Modes and User-Visible Outcomes

| Failure condition | Handling path | User-visible effect |
|---|---|---|
| No active session | webhook validation -> generateLoginLink() | 401 webhook response + WhatsApp auth-link message |
| Auth service DB/repository error | webhook validation catches AUTH_SERVICE_UNAVAILABLE | 503 response (Authentication service temporarily unavailable) |
| Platform token expired and refresh fails | raises PlatformAuthExpiredError, controller invalidates session | User receives expiry message + new login link |
| Agent/tool execution error | controller catch fallback | User gets generic processing-error message |
| Output flagged by safety filter | checkOutputSafety() | Replaced with professional localized safe fallback |
| Missing date for dashboard/KPI request | agent prompt policy requests clarification | Intended behavior is to ask user for a time period before fetching data |

23. Model Inventory (What Is Used Today)

23.1 Active Models in Runtime Paths

| Model | Provider path | Used in | Purpose |
|---|---|---|---|
| zai-org/GLM-4.6V | Parasail (OpenAI-compatible) | report-agent, issue-agent, report-kpi-agent, report-dashboard-agent, issue-kpi-agent, issue-dashboard-agent, thread-summary-agent | Main orchestration, KPI/dashboard reasoning, summarization |
| zai-org/GLM-4.7-FP8 | Parasail (OpenAI-compatible) | site-agent | Site/user/questionnaire lookup conversations |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Parasail custom chat client | extractDateRangeWithAI() | Multilingual date/time range extraction |
| meta-llama/Llama-3.3-70B-Instruct | Parasail custom chat client | classifyIntent(), analyzeWithContext() | Intent classification and low-confidence contextual disambiguation |
| google/gemma-3-27b-it | Parasail custom chat client | detectLanguageFromText() | Language identification (with heuristic fallback) |

23.2 Model Usage Behavior

| Flow stage | Model behavior |
|---|---|
| Hybrid routing | Llama classifier always runs; context analyzer runs only when confidence/turn-switch guardrails allow |
| Date extraction | Qwen runs in parallel with routing; result is discarded when confidence < 0.5 |
| Language detection | Gemma language-ID attempt first; regex/keyword heuristics/fallback language if needed |
| Report/Issue/Site response synthesis | GLM agents produce final user-facing text after tool calls |
| Long thread compression | GLM summary agent writes thread summary and prunes older messages |

23.3 Model Constants Available but Not Currently Used

| Model constant | Status |
|---|---|
| deepseek-ai/DeepSeek-R1-0528 | Defined in provider file, not currently wired to active execution paths |

24. Agent Catalog and Responsibilities

24.1 Full Agent Catalog

| Agent ID / function | Model | Scope | Memory | Primary tools | Main responsibility |
|---|---|---|---|---|---|
| intent-classifier (classifyIntent) | Llama 3.3 70B | Shared utility | No | None | Classify message into REPORT, ISSUE, SITE, IRRELEVANT, or OTHER with confidence |
| report-agent | GLM-4.6V | Parent orchestrator | Yes (Mastra + session context) | ask-report-kpi, ask-report-dashboard, search-* | Route report queries, resolve entities, enforce report response policy |
| issue-agent | GLM-4.6V | Parent orchestrator | Yes | ask-issue-kpi, ask-issue-dashboard, search-* | Route issue queries, resolve entities, enforce issue response policy |
| site-agent | GLM-4.7-FP8 | Domain agent | Yes | search-sites, search-users, search-questionnaires | Resolve site/user/questionnaire references and site info responses |
| report-kpi-agent | GLM-4.6V | Specialist | Stateless | report-kpi, combined-kpi, platform-stats | KPI-only report metrics and overview responses |
| report-dashboard-agent | GLM-4.6V | Specialist | Stateless | report-dashboard, issue-dashboard | Detailed report analytics and breakdowns; IRR merges when requested |
| issue-kpi-agent | GLM-4.6V | Specialist | Stateless | issue-kpi, combined-kpi, platform-stats | KPI-only issue metrics and overview responses |
| issue-dashboard-agent | GLM-4.6V | Specialist | Stateless | issue-dashboard | Detailed issue analytics and breakdowns |
| thread-summary-agent (lazy) | GLM-4.6V | Internal maintenance | N/A | None | Summarize/prune long thread history |

24.2 Delegation Behavior

| Parent agent | Delegation decisions |
|---|---|
| report-agent | KPI intents -> report-kpi-agent; detailed analysis -> report-dashboard-agent; deterministic overrides for aggregate counts/combined overview |
| issue-agent | KPI intents -> issue-kpi-agent; detailed analysis -> issue-dashboard-agent; deterministic KPI fallbacks for KPI-like queries |
| site-agent | Stays within search-tool domain; no KPI/dashboard delegation |

25. Response Generation Pipeline and Behaviors

flowchart TD
    A[Inbound message] --> B[Validation + Session Context]
    B --> C[Greeting/Command/Shortcut checks]
    C --> D[Hybrid intent routing + intent safety net]
    D --> E[Language detection + language continuity rules]
    E --> F[Parallel date extraction]
    F --> G[Domain agent execution]
    G --> H[Post-process: clean tags/markdown]
    H --> I[Safety filter]
    I --> J[Humanize dates]
    J --> K[Whapi outbound send]

25.1 How Responses Are Built

  1. Ingress message is normalized from text or interactive selection title.
  2. Greeting and command paths can short-circuit before intent/agent execution.
  3. Hybrid routing + heuristics pick the final intent.
  4. Language is resolved using detection plus follow-up continuity override.
  5. Date range is extracted in parallel and accepted only when confidence passes threshold.
  6. Chosen agent runs, calls tools, and synthesizes domain response.
  7. Controller cleans AI tags, applies safety filter, and sends via Whapi.

25.2 Domain-Specific Response Post-Processing

| Domain | Additional formatting behavior |
|---|---|
| Report | normalizeReportResponseBody() strips operational chatter/internal site-ID fragments and de-duplicates blocks; ensureReportPortalFooter() appends admin-portal footer in detected language |
| Issue | Humanized dates + localized fallback on failures |
| Site | Humanized dates + localized busy/error fallbacks |

25.3 Safety and Professionalism Behaviors

| Behavior | Implementation |
|---|---|
| Prompt-level professional conduct rules | Injected into agent instructions via PROFESSIONAL_CONDUCT_PROMPT |
| Output profanity/insult screening | checkOutputSafety() |
| Unsafe response handling | Replace with localized professional fallback |

26. Detailed Language and Behavior Rules

26.1 Language Selection Rules

  1. Detect language from latest message (Gemma + heuristics).
  2. If follow-up and message looks entity-only, preserve previous language.
  3. If explicit language-switch phrase appears, honor detected new language.
  4. Persist resolved language to session (preferred_language).
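
One way to read these rules as code is sketched below. The LanguageSignals shape and the resolveResponseLanguage name are hypothetical, and giving the explicit switch phrase precedence over follow-up continuity is an interpretation of rules 2 and 3 rather than something this document states outright.

```typescript
type SupportedLanguage = "en" | "id" | "ms" | "es" | "pt" | "th" | "ko";

// Hypothetical bundle of signals available after language detection.
interface LanguageSignals {
  detected: SupportedLanguage;
  previous?: SupportedLanguage;
  isFollowUp: boolean;
  looksEntityOnly: boolean;
  hasExplicitSwitchPhrase: boolean;
}

function resolveResponseLanguage(s: LanguageSignals): SupportedLanguage {
  // An explicit switch request always honors the newly detected language.
  if (s.hasExplicitSwitchPhrase) return s.detected;
  // Entity-only follow-ups (for example a bare site name) keep the previous language.
  if (s.isFollowUp && s.looksEntityOnly && s.previous !== undefined) return s.previous;
  return s.detected;
}
```

The returned value is what would then be persisted to preferred_language in the session.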

26.2 Conversation Behaviors

| Behavior | Current implementation |
|---|---|
| Short follow-ups (same, again, last month) | Often reuse previous intent via hybrid and resolver heuristics |
| Confirmation tracking | pending_action set when response contains follow-up trigger phrases (do you want me, proceed, etc.) |
| Pending-action expiry | Cleared after 3 hours or when non-confirmation message arrives |
| Topic-switch tracking | topic_switch_count increments when switching among REPORT, ISSUE, and SITE |
| Context window | Sliding window of last 3 turns (user+assistant pairs) stored in session |
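
The sliding window, pending-action expiry, and topic-switch counter can be sketched as three small helpers. updateContextWindow is named in this document, but its signature here is an assumption, as are the other two helper names and the choice of a strict "more than 3 hours" comparison.

```typescript
interface Turn {
  user: string;
  assistant: string;
}

type DomainIntent = "REPORT" | "ISSUE" | "SITE";

const MAX_CONTEXT_TURNS = 3; // 3 turns = 6 messages
const PENDING_ACTION_TTL_MS = 3 * 60 * 60 * 1000; // 3 hours

// Appends a turn and keeps only the most recent MAX_CONTEXT_TURNS.
function updateContextWindow(turns: Turn[], turn: Turn): Turn[] {
  return turns.concat(turn).slice(-MAX_CONTEXT_TURNS);
}

// A pending action is stale once more than 3 hours have passed since it was set.
function isPendingActionExpired(updatedAt: Date, now: Date): boolean {
  return now.getTime() - updatedAt.getTime() > PENDING_ACTION_TTL_MS;
}

// Counts a topic switch only when moving between different domain intents.
function nextTopicSwitchCount(
  count: number,
  previous: DomainIntent | undefined,
  next: DomainIntent,
): number {
  return previous !== undefined && previous !== next ? count + 1 : count;
}
```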

26.3 Date Behaviors

| Behavior | Current implementation |
|---|---|
| Missing date for analytics requests | Agents are instructed to ask user for time period |
| Explicit date range precedence | Explicit start/end should be preserved over relative keyword when present |
| Tool keyword whitelist enforcement | Unsupported keywords sanitized out before tool payload use |
| Date display in final text | ISO-like dates are humanized by humanizeResponseDates() |

26.4 Response-Style Behaviors from Agent Policies

| Rule | Scope |
|---|---|
| Keep concise, no process-chatter (I'll check, let me) | report/issue specialist and parent agents |
| Never reveal internal credentials/context IDs | all domain agents |
| Route KPI-vs-dashboard based on query type | report/issue parent agents |
| Apply site/user/questionnaire name resolution before analytics | report/issue/site agents via search-* tools |