System Architecture

Complete system design and database schemas

Chatbot Creation Framework - System Architecture

Overview

This document outlines the architecture for a chatbot creation framework that allows users to create and manage multiple chatbot projects with multimodal RAG capabilities, integrated with n8n workflows for document processing.

Technology Stack

  • Frontend & Backend: Next.js 14 with TypeScript
  • Authentication: NextAuth.js
  • Main Database: PostgreSQL
  • Storage: MinIO (S3-compatible)
  • Caching: Redis
  • Containerization: Docker & Docker Compose
  • Document Processing: n8n workflows
  • Vector Database: pgvector extension for PostgreSQL

System Architecture

graph TB
    subgraph "User Interface"
        UI[Next.js Frontend]
        Chat[Chat Interface]
        Dashboard[Project Dashboard]
        Upload[File Upload]
    end
    
    subgraph "Next.js Backend"
        API[API Routes]
        Auth[NextAuth.js]
        FileHandler[File Handler]
        DBManager[DB Manager]
        MinIOManager[MinIO Manager]
    end
    
    subgraph "Infrastructure Services"
        MainDB[(Main PostgreSQL)]
        Redis[(Redis Cache)]
        MinIO[(MinIO Storage)]
    end
    
    subgraph "External Services"
        N8N[n8n Workflow Engine]
    end
    
    subgraph "Per-Chatbot Resources"
        ChatbotDB[(Chatbot PostgreSQL)]
        ChatbotBucket[(MinIO Bucket)]
    end
    
    UI --> API
    Chat --> API
    Dashboard --> API
    Upload --> API
    
    API --> Auth
    API --> FileHandler
    API --> DBManager
    API --> MinIOManager
    
    Auth --> MainDB
    DBManager --> MainDB
    DBManager --> ChatbotDB
    MinIOManager --> MinIO
    FileHandler --> MinIO
    FileHandler --> ChatbotBucket
    
    API --> N8N
    N8N --> ChatbotDB
    N8N --> ChatbotBucket
    
    API --> Redis

Database Architecture

Main Application Database Schema

-- Users table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(255),
    image VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Chatbot projects table
CREATE TABLE chatbot_projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    system_prompt TEXT,
    n8n_chat_url VARCHAR(500),
    n8n_webhook_url VARCHAR(500),
    database_name VARCHAR(100) UNIQUE NOT NULL,
    minio_bucket_name VARCHAR(100) UNIQUE NOT NULL,
    status VARCHAR(50) DEFAULT 'active', -- active, inactive, processing
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Documents table
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
    filename VARCHAR(255) NOT NULL,
    original_filename VARCHAR(255) NOT NULL,
    file_type VARCHAR(50) NOT NULL, -- pdf, markdown, docx, etc.
    file_size BIGINT NOT NULL,
    minio_path VARCHAR(500) NOT NULL,
    processing_status VARCHAR(50) DEFAULT 'pending', -- pending, processing, completed, failed
    processing_error TEXT,
    uploaded_at TIMESTAMP DEFAULT NOW(),
    processed_at TIMESTAMP
);

-- API keys for chatbot access
CREATE TABLE chatbot_api_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
    api_key VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT NOW(),
    last_used_at TIMESTAMP
);

-- Processing jobs tracking
CREATE TABLE processing_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
    document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
    n8n_execution_id VARCHAR(255),
    status VARCHAR(50) DEFAULT 'pending', -- pending, running, completed, failed
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT
);

Per-Chatbot Database Schema

Each chatbot project will have its own PostgreSQL database with the following schema:

-- Enable pgvector extension for vector similarity search
CREATE EXTENSION IF NOT EXISTS vector;

-- Vector embeddings table for RAG
CREATE TABLE embeddings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id UUID NOT NULL,
    chunk_id VARCHAR(100) NOT NULL,
    content TEXT NOT NULL,
    metadata JSONB,
    embedding VECTOR(1536), -- Assuming OpenAI embeddings
    created_at TIMESTAMP DEFAULT NOW()
);

-- Chat conversations
CREATE TABLE conversations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Chat messages
CREATE TABLE messages (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    conversation_id UUID REFERENCES conversations(id) ON DELETE CASCADE,
    role VARCHAR(20) NOT NULL, -- user, assistant, system
    content TEXT NOT NULL,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Extracted images metadata
CREATE TABLE extracted_images (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id UUID NOT NULL,
    image_path VARCHAR(500) NOT NULL, -- Path in MinIO bucket
    image_description TEXT,
    page_number INTEGER,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Indexes for performance
CREATE INDEX idx_embeddings_document_id ON embeddings(document_id);
CREATE INDEX idx_embeddings_vector ON embeddings USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX idx_messages_conversation_id ON messages(conversation_id);
CREATE INDEX idx_messages_created_at ON messages(created_at);

Component Architecture

1. Authentication Layer

  • NextAuth.js with multiple providers (email, Google, GitHub)
  • JWT tokens for API authentication
  • Role-based access control

2. Project Management

  • CRUD operations for chatbot projects
  • Database and MinIO bucket provisioning
  • Configuration management

3. File Upload System

  • Multi-format support (PDF, Markdown, DOCX)
  • File validation and virus scanning
  • Chunked upload for large files
  • Progress tracking

4. Document Processing Pipeline

  • Integration with n8n workflows
  • Webhook endpoints for status updates
  • Queue management for batch processing
  • Error handling and retry logic

5. Chat Interface

  • Real-time messaging with WebSockets
  • Message history persistence
  • Context-aware responses
  • Rate limiting and security

6. Infrastructure Management

  • Automated database creation/deletion
  • MinIO bucket lifecycle management
  • Connection pooling
  • Health monitoring

API Endpoints

Authentication

  • POST /api/auth/* - NextAuth.js endpoints

Projects

  • GET /api/projects - List user's chatbot projects
  • POST /api/projects - Create new chatbot project
  • GET /api/projects/[id] - Get project details
  • PUT /api/projects/[id] - Update project configuration
  • DELETE /api/projects/[id] - Delete project and resources

Documents

  • GET /api/projects/[id]/documents - List project documents
  • POST /api/projects/[id]/documents - Upload new document
  • DELETE /api/documents/[id] - Delete document
  • POST /api/documents/[id]/process - Trigger document processing

Chat

  • POST /api/projects/[id]/chat - Send chat message
  • GET /api/projects/[id]/conversations - Get conversation history
  • WebSocket /api/projects/[id]/ws - Real-time chat connection

Webhooks

  • POST /api/webhooks/n8n/[projectId] - n8n processing status updates

Security Considerations

  1. Authentication & Authorization

    • Multi-factor authentication support
    • API key management for chatbot access
    • Rate limiting per user/project
  2. Data Protection

    • Encrypted file storage
    • Database encryption at rest
    • Secure communication (HTTPS/WSS)
  3. Input Validation

    • File type and size validation
    • Content sanitization
    • SQL injection prevention
  4. Resource Isolation

    • Separate databases per chatbot
    • Isolated MinIO buckets
    • Resource quotas and limits

Deployment Architecture

graph TB
    subgraph "Docker Environment"
        NextJS[Next.js App Container]
        MainDB[PostgreSQL Main Container]
        Redis[Redis Container]
        MinIO[MinIO Container]
        Nginx[Nginx Reverse Proxy]
    end
    
    subgraph "External Services"
        N8N[n8n Instance]
        ChatbotDBs[Dynamic Chatbot DBs]
    end
    
    Internet --> Nginx
    Nginx --> NextJS
    NextJS --> MainDB
    NextJS --> Redis
    NextJS --> MinIO
    NextJS --> N8N
    NextJS --> ChatbotDBs

Scalability Considerations

  1. Horizontal Scaling

    • Stateless Next.js application design
    • Database connection pooling
    • Redis session management
  2. Resource Management

    • Per-project resource limits
    • Automated cleanup of inactive projects
    • Storage optimization
  3. Performance Optimization

    • Vector similarity search optimization
    • Caching strategies
    • CDN integration for file delivery

Monitoring and Logging

  1. Application Monitoring

    • Health check endpoints
    • Performance metrics
    • Error tracking
  2. Infrastructure Monitoring

    • Database performance
    • Storage usage
    • Network metrics
  3. Business Metrics

    • User engagement
    • Document processing times
    • Chat interaction patterns

This architecture provides a robust, scalable foundation for the chatbot creation framework while maintaining security, performance, and maintainability.