System architecture — Mastra framework, 4-layer memory, agent orchestration, and infrastructure.

Architecture

HeyCMO is built on the Mastra AI framework with a layered architecture designed for reliability, memory persistence, and multi-agent orchestration.

Tech Stack

Layer	Technology
HTTP Server	Hono (lightweight, edge-compatible)
AI Framework	Mastra (agents, tools, workflows, RAG)
LLM Providers	OpenAI GPT-4o, GPT-4o-mini (via Mastra model routing)
Database	PostgreSQL (via Prisma ORM + `@prisma/adapter-pg`)
Memory	`@mastra/memory` + `@mastra/rag` with pgvector
Embeddings	OpenAI `text-embedding-3-small`
Task Queue	Inngest (durable workflow execution, cron scheduling)
Billing	Stripe (checkout sessions, webhook handling)
Integrations	Composio (100+ platform connectors via MCP)
Observability	Langfuse (`@mastra/langfuse`) + Pino structured logging
Visual Rendering	Playwright (carousel, static, OG image generation)
Voice	ElevenLabs HTTP API (text-to-speech for audio content)

System Diagram

┌─────────────────────────────────────────────────────────┐
│                     Hono HTTP Server                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐  │
│  │ /mcp/sse │  │/api/health│  │/api/innge│  │/api/*  │  │
│  │/mcp/msg  │  │/api/h/rdy │  │  st      │  │billing │  │
│  └────┬─────┘  └──────────┘  └────┬─────┘  └───┬────┘  │
│       │                            │             │       │
│  ┌────▼────────────────────────────▼─────────────▼───┐  │
│  │              Mastra AI Framework                   │  │
│  │  ┌──────────────────────────────────────────────┐  │  │
│  │  │              Agent Layer (10 agents)          │  │  │
│  │  │  CMO → SEO Writer, Social, Email, Analyst,   │  │  │
│  │  │  Engagement, Researcher, CRO, Growth, Sales  │  │  │
│  │  └──────────────────┬───────────────────────────┘  │  │
│  │                     │                               │  │
│  │  ┌─────────┐  ┌────▼────┐  ┌────────────────────┐  │  │
│  │  │  Tools  │  │Workflows│  │  Memory (4-layer)  │  │  │
│  │  │  (15+)  │  │  (12)   │  │  Working│Semantic  │  │  │
│  │  └─────────┘  └─────────┘  │  Episodic│Procedural│  │  │
│  │                             └────────────────────┘  │  │
│  └─────────────────────────────────────────────────────┘  │
│                                                           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│  │PostgreSQL│  │ Inngest  │  │ Stripe   │  │ Composio │  │
│  │ + pgvec  │  │Cron/Queue│  │ Billing  │  │   MCP    │  │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘  │
└─────────────────────────────────────────────────────────┘

4-Layer Memory System

HeyCMO agents have persistent, contextual memory across all interactions:

Working Memory

Short-term context for the current conversation. Stores brand profile, recent instructions, and active task context. Every agent reads brand context from working memory before making decisions.

Semantic Memory (RAG)

Long-term knowledge stored as vector embeddings in PostgreSQL with pgvector. Past content, research findings, and brand guidelines are chunked, embedded with OpenAI's text-embedding-3-small, and retrievable via similarity search.

Content Query Tool — Agents search past content for reference and inspiration
Performance Query Tool — Agents query historical performance data for data-driven decisions
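
The similarity search described above can be illustrated without a database. The following is a minimal in-memory sketch, not HeyCMO's implementation: in production the ranking would happen inside PostgreSQL via pgvector. The names MemoryChunk, cosineSimilarity, and topK are hypothetical, and the three-dimensional vectors are toy stand-ins for real embeddings.

```typescript
// Hypothetical in-memory stand-in for a pgvector similarity query.
interface MemoryChunk {
  id: string;
  text: string;
  embedding: number[]; // assumed to come from an embedding model
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function topK(query: number[], chunks: MemoryChunk[], k: number): MemoryChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Toy vectors; real embeddings have far more dimensions.
const memoryChunks: MemoryChunk[] = [
  { id: "post-1", text: "Launch announcement", embedding: [1, 0, 0] },
  { id: "post-2", text: "SEO checklist", embedding: [0, 1, 0] },
  { id: "post-3", text: "Launch recap", embedding: [0.9, 0.1, 0] },
];
const nearest = topK([1, 0, 0], memoryChunks, 2);
```

Here nearest holds post-1 followed by post-3, since both point in nearly the same direction as the query vector.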

Episodic Memory

Records of past agent interactions and outcomes. Enables agents to learn from previous successes and failures — "last time we wrote about this topic, it scored 0.85 on brand voice."

Procedural Memory

Self-optimization loop that adjusts agent behavior based on patterns. The Self-Optimization Workflow analyzes telemetry data, detects patterns (declining quality, improving engagement), and proposes weight adjustments to scoring criteria.
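
As an illustration of the pattern-detection step, the sketch below classifies a series of scores as declining, improving, or stable by comparing the mean of the most recent window with the mean of the window before it. This is one simple way such a check could work, not the Self-Optimization Workflow's actual logic; detectTrend, the window size, and the tolerance are all assumptions.

```typescript
type Trend = "declining" | "improving" | "stable";

// Arithmetic mean of a non-empty list.
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Compare the mean of the last `window` scores with the mean of the
// `window` scores before them. Differences within `tolerance` count as stable.
function detectTrend(scores: number[], window: number, tolerance: number): Trend {
  if (window < 1 || scores.length < window * 2) return "stable"; // not enough data
  const recent = mean(scores.slice(-window));
  const previous = mean(scores.slice(-window * 2, -window));
  const delta = recent - previous;
  if (delta < -tolerance) return "declining";
  if (delta > tolerance) return "improving";
  return "stable";
}

// Hypothetical brand-voice scores for six consecutive pieces of content.
const qualityTrend = detectTrend([0.9, 0.9, 0.9, 0.7, 0.7, 0.7], 3, 0.05);
```

A "declining" result like the one above is the kind of signal that could trigger a proposed weight adjustment.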

Human-in-the-Loop Approval

HeyCMO uses a suspend/resume workflow pattern for human approval. When the content creation workflow produces content:

1. Content is scored by the eval system (brand voice, quality, engagement prediction)
2. The workflow suspends and waits for human approval
3. You review the content via list_pending_approvals in your MCP client
4. You call resume_workflow with an approve or reject decision
5. Approved content proceeds to publishing; rejected content goes back for revision

This ensures nothing publishes without your explicit approval, while every other step of the pipeline stays automated.
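
The suspend/resume pattern can be modeled as a small state machine. The sketch below is a simplified in-memory illustration, not the Mastra workflow API: ApprovalQueue and its methods are hypothetical, and they only mirror the idea behind list_pending_approvals and resume_workflow.

```typescript
type ApprovalDecision = "approve" | "reject";
type RunState = "suspended" | "publishing" | "revision";

interface PendingApproval {
  runId: string;
  content: string;
  state: RunState;
}

// Hypothetical in-memory model of runs waiting for human approval.
class ApprovalQueue {
  private runs: Record<string, PendingApproval> = {};

  // Called when the workflow reaches the approval step.
  suspend(runId: string, content: string): void {
    this.runs[runId] = { runId, content, state: "suspended" };
  }

  // Mirrors the idea behind list_pending_approvals.
  listPending(): PendingApproval[] {
    return Object.keys(this.runs)
      .map((id) => this.runs[id])
      .filter((run) => run.state === "suspended");
  }

  // Mirrors the idea behind resume_workflow: only suspended runs can resume.
  resume(runId: string, decision: ApprovalDecision): RunState {
    const run = this.runs[runId];
    if (!run) throw new Error(`Unknown run: ${runId}`);
    if (run.state !== "suspended") {
      throw new Error(`Run ${runId} is not awaiting approval`);
    }
    run.state = decision === "approve" ? "publishing" : "revision";
    return run.state;
  }
}

const queue = new ApprovalQueue();
queue.suspend("run-42", "Draft LinkedIn post");
const pendingBefore = queue.listPending().length; // 1
const nextState = queue.resume("run-42", "approve"); // "publishing"
```

Rejecting a second resume on the same run keeps a piece of content from being approved twice.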

Inngest Cron Jobs

HeyCMO uses Inngest for durable, scheduled workflow execution:

Daily Research — Automated content research runs on a schedule
Engagement Monitoring — Periodic checks for comments and DMs across platforms
Self-Optimization — Regular performance analysis and strategy adjustment
Events Cleanup — Periodic cleanup of stale event data

Inngest provides automatic retries, crash recovery, and observability for all scheduled functions. The /api/inngest endpoint serves as the function handler.

Real-Time Progress Events

Long-running workflows (like the research pipeline) emit real-time progress notifications via the MCP protocol:

Step 1/6: Analyzing brand context...
Step 2/6: Searching web sources via Exa...
Step 3/6: Extracting social signals from X/Twitter...
Step 4/6: Scoring and ranking ideas...
Step 5/6: Identifying SEO quick wins...
Step 6/6: Storing results in semantic memory...

Progress events are forwarded to your MCP client as notifications/progress messages, giving you live visibility into what HeyCMO is doing.
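
For reference, a notifications/progress message is a JSON-RPC 2.0 notification carrying a progress token, the current progress, and optionally a total and a human-readable message (the message field appears in recent revisions of the MCP specification). The sketch below builds such payloads for the six steps listed above. buildProgress and the token value are hypothetical; how HeyCMO constructs its notifications is not shown in this document.

```typescript
// Shape of an MCP progress notification (no `id`, because it is a notification).
interface ProgressNotification {
  jsonrpc: "2.0";
  method: "notifications/progress";
  params: {
    progressToken: string | number; // token supplied by the client's request
    progress: number;
    total?: number;
    message?: string;
  };
}

// Hypothetical helper that wraps one workflow step in a notification payload.
function buildProgress(
  progressToken: string | number,
  step: number,
  total: number,
  message: string
): ProgressNotification {
  return {
    jsonrpc: "2.0",
    method: "notifications/progress",
    params: { progressToken, progress: step, total, message },
  };
}

const researchSteps = [
  "Analyzing brand context...",
  "Searching web sources via Exa...",
  "Extracting social signals from X/Twitter...",
  "Scoring and ranking ideas...",
  "Identifying SEO quick wins...",
  "Storing results in semantic memory...",
];

// One notification per step, numbered from 1.
const notifications = researchSteps.map((label, index) =>
  buildProgress(
    "research-run-1", // hypothetical token
    index + 1,
    researchSteps.length,
    `Step ${index + 1}/${researchSteps.length}: ${label}`
  )
);
```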

Workflow Recovery

On server restart, HeyCMO automatically recovers interrupted workflows from PostgresStore. The recovery system checks for running or suspended workflow runs and restarts them, ensuring research pipelines and other long-running workflows survive server restarts without data loss.

Recoverable workflows: researchPipeline, contentCreation, crossChannelPublish, brandInterview, engagementResponse, analyticsReport, selfOptimization.
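
The selection step of recovery can be sketched as a filter over stored run records. This is an illustration under assumptions, not the actual recovery code: WorkflowRunRecord, the status values, and selectRunsToRecover are hypothetical, and loading the records from storage is left out.

```typescript
type RunStatus = "running" | "suspended" | "success" | "failed";

// Hypothetical shape of a persisted workflow run.
interface WorkflowRunRecord {
  runId: string;
  workflowName: string;
  status: RunStatus;
}

// Workflow names taken from the list above.
const RECOVERABLE_WORKFLOWS: string[] = [
  "researchPipeline",
  "contentCreation",
  "crossChannelPublish",
  "brandInterview",
  "engagementResponse",
  "analyticsReport",
  "selfOptimization",
];

// Keep only interrupted runs (running or suspended) of recoverable workflows.
function selectRunsToRecover(runs: WorkflowRunRecord[]): WorkflowRunRecord[] {
  return runs.filter(
    (run) =>
      (run.status === "running" || run.status === "suspended") &&
      RECOVERABLE_WORKFLOWS.indexOf(run.workflowName) !== -1
  );
}

const storedRuns: WorkflowRunRecord[] = [
  { runId: "a", workflowName: "researchPipeline", status: "running" },
  { runId: "b", workflowName: "contentCreation", status: "suspended" },
  { runId: "c", workflowName: "analyticsReport", status: "success" },
  { runId: "d", workflowName: "someOtherWorkflow", status: "running" },
];
const toRecover = selectRunsToRecover(storedRuns).map((run) => run.runId);
```

In this example only runs a and b are selected: c already finished, and d belongs to a workflow that is not on the recoverable list.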

Agent Safety Processors

Every agent has input and output processors for safety:

Processor	Purpose	Agents
UnicodeNormalizer	Prevents Unicode-based prompt injection	All agents
PromptInjectionDetector	Detects and warns on injection attempts	CMO
TokenLimiterProcessor	Caps input tokens to prevent abuse	CMO, Email, Researcher
ModerationProcessor	Blocks harmful content in outputs	Social Manager
PIIDetector	Redacts personal information from outputs	Engagement
LanguageDetector	Detects input language for proper routing	Engagement
ToolCallFilter	Validates tool call parameters	Social Manager
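
To show what Unicode normalization guards against, here is a minimal sketch of the general technique, not the UnicodeNormalizer processor's actual behavior. It applies NFKC normalization, which folds look-alike forms such as fullwidth letters into their plain equivalents, and then strips zero-width and bidirectional control characters that can hide text from a reviewer. normalizeInput and the character list are assumptions.

```typescript
// Zero-width and bidirectional control characters that can conceal text.
const INVISIBLE_CHARS = /[\u200B-\u200F\u202A-\u202E\u2060\uFEFF]/g;

// Hypothetical input sanitizer: fold compatibility forms, then drop invisibles.
function normalizeInput(text: string): string {
  return text.normalize("NFKC").replace(INVISIBLE_CHARS, "");
}

// Fullwidth "ignore" and a zero-width space both reduce to plain ASCII.
const fullwidth = normalizeInput("\uFF49\uFF47\uFF4E\uFF4F\uFF52\uFF45"); // "ignore"
const hiddenSpace = normalizeInput("ig\u200Bnore"); // "ignore"
```

Normalizing first means any later keyword or pattern check sees the same text regardless of how it was disguised.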

Eval System

Before any content is presented for approval, it is scored on three evaluation dimensions:

Brand Voice Match — Compares generated content against the brand profile for tone, vocabulary, and style consistency (0–1 score)
Content Quality — Evaluates structure, depth, readability, and SEO optimization (0–1 score)
Engagement Prediction — Predicts likely engagement based on historical performance patterns (0–1 score)

Content that scores below configurable thresholds is flagged for revision before human review.
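
A threshold gate over the three scores could look like the sketch below. The threshold values, EvalScores, and failingDimensions are hypothetical, since this document does not state the defaults or how the check is implemented.

```typescript
// One 0-1 score per evaluation dimension.
interface EvalScores {
  brandVoice: number;
  contentQuality: number;
  engagementPrediction: number;
}

// Hypothetical defaults; the real thresholds are configurable.
const DEFAULT_THRESHOLDS: EvalScores = {
  brandVoice: 0.7,
  contentQuality: 0.7,
  engagementPrediction: 0.5,
};

// Return the dimensions that fall below their threshold.
// An empty result means the content can go on to human review.
function failingDimensions(
  scores: EvalScores,
  thresholds: EvalScores = DEFAULT_THRESHOLDS
): (keyof EvalScores)[] {
  const dimensions: (keyof EvalScores)[] = [
    "brandVoice",
    "contentQuality",
    "engagementPrediction",
  ];
  return dimensions.filter((dim) => scores[dim] < thresholds[dim]);
}

const flagged = failingDimensions({
  brandVoice: 0.85,
  contentQuality: 0.6,
  engagementPrediction: 0.9,
}); // ["contentQuality"]
```

Returning the failing dimensions, rather than a single pass/fail flag, tells the revision step what to improve.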
