Chatbot Creation Framework - System Architecture
Overview
This document outlines the architecture for a chatbot creation framework that allows users to create and manage multiple chatbot projects with multimodal RAG capabilities, integrated with n8n workflows for document processing.
Technology Stack
- Frontend & Backend: Next.js 14 with TypeScript
- Authentication: NextAuth.js
- Main Database: PostgreSQL
- Storage: MinIO (S3-compatible)
- Caching: Redis
- Containerization: Docker & Docker Compose
- Document Processing: n8n workflows
- Vector Database: pgvector extension for PostgreSQL
System Architecture
graph TB
subgraph "User Interface"
UI[Next.js Frontend]
Chat[Chat Interface]
Dashboard[Project Dashboard]
Upload[File Upload]
end
subgraph "Next.js Backend"
API[API Routes]
Auth[NextAuth.js]
FileHandler[File Handler]
DBManager[DB Manager]
MinIOManager[MinIO Manager]
end
subgraph "Infrastructure Services"
MainDB[(Main PostgreSQL)]
Redis[(Redis Cache)]
MinIO[(MinIO Storage)]
end
subgraph "External Services"
N8N[n8n Workflow Engine]
end
subgraph "Per-Chatbot Resources"
ChatbotDB[(Chatbot PostgreSQL)]
ChatbotBucket[(MinIO Bucket)]
end
UI --> API
Chat --> API
Dashboard --> API
Upload --> API
API --> Auth
API --> FileHandler
API --> DBManager
API --> MinIOManager
Auth --> MainDB
DBManager --> MainDB
DBManager --> ChatbotDB
MinIOManager --> MinIO
FileHandler --> MinIO
FileHandler --> ChatbotBucket
API --> N8N
N8N --> ChatbotDB
N8N --> ChatbotBucket
API --> Redis
Database Architecture
Main Application Database Schema
-- Users table
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255),
image VARCHAR(255),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Chatbot projects table
CREATE TABLE chatbot_projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
description TEXT,
system_prompt TEXT,
n8n_chat_url VARCHAR(500),
n8n_webhook_url VARCHAR(500),
database_name VARCHAR(100) UNIQUE NOT NULL,
minio_bucket_name VARCHAR(100) UNIQUE NOT NULL,
status VARCHAR(50) DEFAULT 'active', -- active, inactive, processing
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Documents table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
filename VARCHAR(255) NOT NULL,
original_filename VARCHAR(255) NOT NULL,
file_type VARCHAR(50) NOT NULL, -- pdf, markdown, docx, etc.
file_size BIGINT NOT NULL,
minio_path VARCHAR(500) NOT NULL,
processing_status VARCHAR(50) DEFAULT 'pending', -- pending, processing, completed, failed
processing_error TEXT,
uploaded_at TIMESTAMP DEFAULT NOW(),
processed_at TIMESTAMP
);
-- API keys for chatbot access
CREATE TABLE chatbot_api_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
api_key VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL,
is_active BOOLEAN DEFAULT true,
created_at TIMESTAMP DEFAULT NOW(),
last_used_at TIMESTAMP
);
-- Processing jobs tracking
CREATE TABLE processing_jobs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
chatbot_project_id UUID REFERENCES chatbot_projects(id) ON DELETE CASCADE,
document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
n8n_execution_id VARCHAR(255),
status VARCHAR(50) DEFAULT 'pending', -- pending, running, completed, failed
started_at TIMESTAMP,
completed_at TIMESTAMP,
error_message TEXT
);
Per-Chatbot Database Schema
Each chatbot project will have its own PostgreSQL database with the following schema:
-- Enable pgvector extension for vector similarity search
CREATE EXTENSION IF NOT EXISTS vector;
-- Vector embeddings table for RAG
CREATE TABLE embeddings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
document_id UUID NOT NULL,
chunk_id VARCHAR(100) NOT NULL,
content TEXT NOT NULL,
metadata JSONB,
embedding VECTOR(1536), -- Assuming OpenAI embeddings
created_at TIMESTAMP DEFAULT NOW()
);
-- Chat conversations
CREATE TABLE conversations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
-- Chat messages
CREATE TABLE messages (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID REFERENCES conversations(id) ON DELETE CASCADE,
role VARCHAR(20) NOT NULL, -- user, assistant, system
content TEXT NOT NULL,
metadata JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- Extracted images metadata
CREATE TABLE extracted_images (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
document_id UUID NOT NULL,
image_path VARCHAR(500) NOT NULL, -- Path in MinIO bucket
image_description TEXT,
page_number INTEGER,
created_at TIMESTAMP DEFAULT NOW()
);
-- Indexes for performance
CREATE INDEX idx_embeddings_document_id ON embeddings(document_id);
CREATE INDEX idx_embeddings_vector ON embeddings USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX idx_messages_conversation_id ON messages(conversation_id);
CREATE INDEX idx_messages_created_at ON messages(created_at);
Component Architecture
1. Authentication Layer
- NextAuth.js with multiple providers (email, Google, GitHub)
- JWT tokens for API authentication
- Role-based access control
2. Project Management
- CRUD operations for chatbot projects
- Database and MinIO bucket provisioning
- Configuration management
3. File Upload System
- Multi-format support (PDF, Markdown, DOCX)
- File validation and virus scanning
- Chunked upload for large files
- Progress tracking
4. Document Processing Pipeline
- Integration with n8n workflows
- Webhook endpoints for status updates
- Queue management for batch processing
- Error handling and retry logic
5. Chat Interface
- Real-time messaging with WebSockets
- Message history persistence
- Context-aware responses
- Rate limiting and security
6. Infrastructure Management
- Automated database creation/deletion
- MinIO bucket lifecycle management
- Connection pooling
- Health monitoring
API Endpoints
Authentication
POST /api/auth/*- NextAuth.js endpoints
Projects
GET /api/projects- List user's chatbot projectsPOST /api/projects- Create new chatbot projectGET /api/projects/[id]- Get project detailsPUT /api/projects/[id]- Update project configurationDELETE /api/projects/[id]- Delete project and resources
Documents
GET /api/projects/[id]/documents- List project documentsPOST /api/projects/[id]/documents- Upload new documentDELETE /api/documents/[id]- Delete documentPOST /api/documents/[id]/process- Trigger document processing
Chat
POST /api/projects/[id]/chat- Send chat messageGET /api/projects/[id]/conversations- Get conversation historyWebSocket /api/projects/[id]/ws- Real-time chat connection
Webhooks
POST /api/webhooks/n8n/[projectId]- n8n processing status updates
Security Considerations
-
Authentication & Authorization
- Multi-factor authentication support
- API key management for chatbot access
- Rate limiting per user/project
-
Data Protection
- Encrypted file storage
- Database encryption at rest
- Secure communication (HTTPS/WSS)
-
Input Validation
- File type and size validation
- Content sanitization
- SQL injection prevention
-
Resource Isolation
- Separate databases per chatbot
- Isolated MinIO buckets
- Resource quotas and limits
Deployment Architecture
graph TB
subgraph "Docker Environment"
NextJS[Next.js App Container]
MainDB[PostgreSQL Main Container]
Redis[Redis Container]
MinIO[MinIO Container]
Nginx[Nginx Reverse Proxy]
end
subgraph "External Services"
N8N[n8n Instance]
ChatbotDBs[Dynamic Chatbot DBs]
end
Internet --> Nginx
Nginx --> NextJS
NextJS --> MainDB
NextJS --> Redis
NextJS --> MinIO
NextJS --> N8N
NextJS --> ChatbotDBs
Scalability Considerations
-
Horizontal Scaling
- Stateless Next.js application design
- Database connection pooling
- Redis session management
-
Resource Management
- Per-project resource limits
- Automated cleanup of inactive projects
- Storage optimization
-
Performance Optimization
- Vector similarity search optimization
- Caching strategies
- CDN integration for file delivery
Monitoring and Logging
-
Application Monitoring
- Health check endpoints
- Performance metrics
- Error tracking
-
Infrastructure Monitoring
- Database performance
- Storage usage
- Network metrics
-
Business Metrics
- User engagement
- Document processing times
- Chat interaction patterns
This architecture provides a robust, scalable foundation for the chatbot creation framework while maintaining security, performance, and maintainability.