🧠 HeyCMO

Architecture

System architecture: Mastra framework, 4-layer memory, agent orchestration, and infrastructure.

HeyCMO is built on the Mastra AI framework with a layered architecture designed for reliability, memory persistence, and multi-agent orchestration.

Tech Stack

Layer | Technology
HTTP Server | Hono (lightweight, edge-compatible)
AI Framework | Mastra (agents, tools, workflows, RAG)
LLM Providers | OpenAI GPT-4o, GPT-4o-mini (via Mastra model routing)
Database | PostgreSQL (via Prisma ORM + @prisma/adapter-pg)
Memory | @mastra/memory + @mastra/rag with pgvector
Embeddings | OpenAI text-embedding-3-small
Task Queue | Inngest (durable workflow execution, cron scheduling)
Billing | Stripe (checkout sessions, webhook handling)
Integrations | Composio (100+ platform connectors via MCP)
Observability | Langfuse (@mastra/langfuse) + Pino structured logging
Visual Rendering | Playwright (carousel, static, OG image generation)
Voice | ElevenLabs HTTP API (text-to-speech for audio content)

System Diagram

┌────────────────────────────────────────────────────────────┐
│                      Hono HTTP Server                      │
│  ┌──────────┐  ┌───────────┐  ┌────────────┐  ┌─────────┐  │
│  │ /mcp/sse │  │/api/health│  │/api/inngest│  │ /api/*  │  │
│  │ /mcp/msg │  │/api/h/rdy │  │            │  │ billing │  │
│  └────┬─────┘  └───────────┘  └─────┬──────┘  └────┬────┘  │
│       │                             │              │       │
│  ┌────▼─────────────────────────────▼──────────────▼────┐  │
│  │                 Mastra AI Framework                  │  │
│  │  ┌────────────────────────────────────────────────┐  │  │
│  │  │            Agent Layer (17 agents)             │  │  │
│  │  │  CMO → SEO Writer, Social, Email, Analyst,     │  │  │
│  │  │  Engagement, Researcher, CRO, Growth, Sales    │  │  │
│  │  └─────────────────┬──────────────────────────────┘  │  │
│  │                    │                                 │  │
│  │  ┌─────────┐  ┌────▼────┐  ┌──────────────────────┐  │  │
│  │  │  Tools  │  │Workflows│  │   Memory (4-layer)   │  │  │
│  │  │  (15+)  │  │  (12)   │  │   Working│Semantic   │  │  │
│  │  └─────────┘  └─────────┘  │ Episodic│Procedural  │  │  │
│  │                            └──────────────────────┘  │  │
│  └──────────────────────────────────────────────────────┘  │
│                                                            │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│   │PostgreSQL│  │ Inngest  │  │  Stripe  │  │ Composio │   │
│   │+ pgvector│  │Cron/Queue│  │ Billing  │  │   MCP    │   │
│   └──────────┘  └──────────┘  └──────────┘  └──────────┘   │
└────────────────────────────────────────────────────────────┘

4-Layer Memory System

HeyCMO agents have persistent, contextual memory across all interactions:

Working Memory

Short-term context for the current conversation. Stores brand profile, recent instructions, and active task context. Every agent reads brand context from working memory before making decisions.

Semantic Memory (RAG)

Long-term knowledge stored as vector embeddings in PostgreSQL with pgvector. Past content, research findings, and brand guidelines are chunked, embedded with OpenAI's text-embedding-3-small, and retrievable via similarity search.

  • Content Query Tool: agents search past content for reference and inspiration
  • Performance Query Tool: agents query historical performance data for data-driven decisions
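
To make the similarity search concrete, here is a minimal in-memory sketch of the ranking step. It is an illustration only: in HeyCMO the comparison would run inside PostgreSQL via pgvector, and the `MemoryChunk` shape and the `topK` helper are hypothetical names, not part of Mastra or HeyCMO.

```typescript
// Hypothetical in-memory stand-in for a pgvector similarity query.
interface MemoryChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are closest to the query embedding.
function topK(query: number[], chunks: MemoryChunk[], k: number): MemoryChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

Real embeddings from text-embedding-3-small have far more dimensions than a toy example would use, but the ranking logic is the same.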

Episodic Memory

Records of past agent interactions and outcomes. Enables agents to learn from previous successes and failures, for example: "last time we wrote about this topic, it scored 0.85 on brand voice."

Procedural Memory

Self-optimization loop that adjusts agent behavior based on patterns. The Self-Optimization Workflow analyzes telemetry data, detects patterns (declining quality, improving engagement), and proposes weight adjustments to scoring criteria.

Human-in-the-Loop Approval

HeyCMO uses a suspend/resume workflow pattern for human approval. When the content creation workflow produces content:

  1. Content is scored by the eval system (brand voice, quality, engagement prediction)
  2. The workflow suspends and waits for human approval
  3. You review the content via list_pending_approvals in your MCP client
  4. You call resume_workflow with an approve or reject decision
  5. Approved content proceeds to publishing; rejected content goes back for revision

This ensures nothing publishes without your explicit approval while keeping the automation pipeline fully intact.
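
The approval steps above can be modeled as a small state machine. The sketch below is a simplified in-memory illustration, not Mastra's workflow API: the `ApprovalQueue` class, the status names, and the method names are assumptions chosen to mirror what list_pending_approvals and resume_workflow might do behind the scenes.

```typescript
type Decision = "approve" | "reject";
type RunStatus = "suspended" | "publishing" | "revising";

// Hypothetical record for a workflow run that is waiting on a human.
interface ApprovalRun {
  runId: string;
  content: string;
  status: RunStatus;
}

class ApprovalQueue {
  private runs = new Map<string, ApprovalRun>();

  // Steps 1-2: after scoring, the workflow parks the content and waits.
  suspend(runId: string, content: string): void {
    this.runs.set(runId, { runId, content, status: "suspended" });
  }

  // Step 3: roughly what a list_pending_approvals call might return.
  listPending(): ApprovalRun[] {
    return Array.from(this.runs.values()).filter((run) => run.status === "suspended");
  }

  // Steps 4-5: resume with a decision; only suspended runs can be resumed.
  resume(runId: string, decision: Decision): RunStatus {
    const run = this.runs.get(runId);
    if (!run || run.status !== "suspended") {
      throw new Error(`run ${runId} is not awaiting approval`);
    }
    run.status = decision === "approve" ? "publishing" : "revising";
    return run.status;
  }
}
```

Because a run leaves the pending list only through `resume`, nothing reaches the publishing state without an explicit decision.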

Inngest Cron Jobs

HeyCMO uses Inngest for durable, scheduled workflow execution:

  • Daily Research: automated content research runs on a schedule
  • Engagement Monitoring: periodic checks for comments and DMs across platforms
  • Self-Optimization: regular performance analysis and strategy adjustment
  • Events Cleanup: periodic cleanup of stale event data

Inngest provides automatic retries, crash recovery, and observability for all scheduled functions. The /api/inngest endpoint serves as the function handler.

Real-Time Progress Events

Long-running workflows (like the research pipeline) emit real-time progress notifications via the MCP protocol:

Step 1/6: Analyzing brand context...
Step 2/6: Searching web sources via Exa...
Step 3/6: Extracting social signals from X/Twitter...
Step 4/6: Scoring and ranking ideas...
Step 5/6: Identifying SEO quick wins...
Step 6/6: Storing results in semantic memory...

Progress events are forwarded to your MCP client as notifications/progress messages, giving you live visibility into what HeyCMO is doing.
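
For reference, a notifications/progress message is a JSON-RPC 2.0 notification. The sketch below builds one for a given step. The `progressToken`, `progress`, `total`, and `message` fields follow the MCP specification (`message` is optional and only present in recent revisions); how HeyCMO maps its step labels onto those fields is an assumption, and `progressNotification` is a hypothetical helper.

```typescript
// Shape of an MCP progress notification (JSON-RPC 2.0, no id because it is a notification).
interface ProgressNotification {
  jsonrpc: "2.0";
  method: "notifications/progress";
  params: {
    progressToken: string | number;
    progress: number;
    total?: number;
    message?: string;
  };
}

// Build the notification for one workflow step (hypothetical helper).
function progressNotification(
  progressToken: string | number,
  step: number,
  totalSteps: number,
  label: string,
): ProgressNotification {
  return {
    jsonrpc: "2.0",
    method: "notifications/progress",
    params: {
      progressToken,
      progress: step,
      total: totalSteps,
      message: `Step ${step}/${totalSteps}: ${label}`,
    },
  };
}
```

The progress token is supplied by the client in the original request, so the client can match each notification to the call that triggered it.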

Workflow Recovery

On server restart, HeyCMO automatically recovers interrupted workflows from PostgresStore. The recovery system checks for running or suspended workflow runs and restarts them, ensuring research pipelines and other long-running workflows survive server restarts without data loss.

Recoverable workflows: researchPipeline, contentCreation, crossChannelPublish, brandInterview, engagementResponse, analyticsReport, selfOptimization.
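
As a rough illustration of the selection step, the sketch below filters persisted runs down to the ones worth recovering. The `StoredRun` shape and its status values are assumptions; the actual PostgresStore schema may differ.

```typescript
// Workflow names taken from the list above.
const RECOVERABLE_WORKFLOWS = new Set<string>([
  "researchPipeline",
  "contentCreation",
  "crossChannelPublish",
  "brandInterview",
  "engagementResponse",
  "analyticsReport",
  "selfOptimization",
]);

// Hypothetical shape of a persisted workflow run.
interface StoredRun {
  runId: string;
  workflow: string;
  status: "running" | "suspended" | "success" | "failed";
}

// Keep only interrupted runs of workflows that support recovery.
function selectRunsToRecover(runs: StoredRun[]): StoredRun[] {
  return runs.filter(
    (run) =>
      RECOVERABLE_WORKFLOWS.has(run.workflow) &&
      (run.status === "running" || run.status === "suspended"),
  );
}
```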

Agent Safety Processors

Every agent has input and output processors for safety:

Processor | Purpose | Agents
UnicodeNormalizer | Prevents Unicode-based prompt injection | All agents
PromptInjectionDetector | Detects and warns on injection attempts | CMO
TokenLimiterProcessor | Caps input tokens to prevent abuse | CMO, Email, Researcher
ModerationProcessor | Blocks harmful content in outputs | Social Manager
PIIDetector | Redacts personal information from outputs | Engagement
LanguageDetector | Detects input language for proper routing | Engagement
ToolCallFilter | Validates tool call parameters | Social Manager

Eval System

Before any content is presented for approval, it passes through three evaluation dimensions:

  1. Brand Voice Match: compares generated content against the brand profile for tone, vocabulary, and style consistency (0–1 score)
  2. Content Quality: evaluates structure, depth, readability, and SEO optimization (0–1 score)
  3. Engagement Prediction: predicts likely engagement based on historical performance patterns (0–1 score)

Content scoring below configurable thresholds is flagged for revision before human review.
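
A threshold check over the three dimensions might look like the sketch below. The field names and the default threshold values are illustrative assumptions; the documentation only states that thresholds are configurable.

```typescript
// One 0-1 score per evaluation dimension.
interface EvalScores {
  brandVoice: number;
  quality: number;
  engagement: number;
}

// Illustrative defaults only; the real thresholds are configurable and not documented here.
const DEFAULT_THRESHOLDS: EvalScores = { brandVoice: 0.7, quality: 0.7, engagement: 0.5 };

// Return the dimensions that fall below their threshold; an empty array means "ready for review".
function dimensionsBelowThreshold(
  scores: EvalScores,
  thresholds: EvalScores = DEFAULT_THRESHOLDS,
): Array<keyof EvalScores> {
  const dimensions: Array<keyof EvalScores> = ["brandVoice", "quality", "engagement"];
  return dimensions.filter((dimension) => scores[dimension] < thresholds[dimension]);
}
```

With these example defaults, a draft scoring 0.85 on brand voice, 0.6 on quality, and 0.9 on engagement would be flagged for revision on quality alone.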

Infrastructure

Rate Limiting

Per-key and per-IP rate limiting with configurable burst rates per endpoint category (API, MCP, checkout, webhooks, admin).
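
The documentation does not say which algorithm backs the limiter. A token bucket is one common way to get a configurable burst rate, so the sketch below uses it as an assumption; `TokenBucketLimiter` and `BucketConfig` are hypothetical names. The key can be an API key or an IP address.

```typescript
interface BucketConfig {
  capacity: number;        // maximum burst size
  refillPerSecond: number; // sustained request rate
}

class TokenBucketLimiter {
  private config: BucketConfig;
  private buckets = new Map<string, { tokens: number; updatedAt: number }>();

  constructor(config: BucketConfig) {
    this.config = config;
  }

  // Returns true if the request identified by `key` may proceed at time `nowMs`.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const state = this.buckets.get(key) ?? { tokens: this.config.capacity, updatedAt: nowMs };
    const elapsedSec = Math.max(0, nowMs - state.updatedAt) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    const tokens = Math.min(
      this.config.capacity,
      state.tokens + elapsedSec * this.config.refillPerSecond,
    );
    const allowed = tokens >= 1;
    this.buckets.set(key, { tokens: allowed ? tokens - 1 : tokens, updatedAt: nowMs });
    return allowed;
  }
}
```

One limiter instance per endpoint category (API, MCP, checkout, webhooks, admin) would give each category its own burst and refill settings.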

Authentication

API keys use the hcmo_live_ prefix, are hashed with SHA-256 for storage, and validated with timing-safe comparison. MCP endpoints use token-based auth via query parameter.
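
A minimal sketch of that validation flow using Node's built-in crypto module is shown below. Only the hcmo_live_ prefix, SHA-256, and the timing-safe comparison come from the description above; the function names and the hex storage format are assumptions.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

const KEY_PREFIX = "hcmo_live_";

// Hash the raw key; only this digest would be stored, never the key itself.
function hashApiKey(apiKey: string): Buffer {
  return createHash("sha256").update(apiKey, "utf8").digest();
}

// Compare a presented key against a stored hex digest (hex storage is an assumption).
function verifyApiKey(presented: string, storedHashHex: string): boolean {
  if (!presented.startsWith(KEY_PREFIX)) return false;
  const presentedHash = hashApiKey(presented);
  const storedHash = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on length mismatch, so guard first.
  if (storedHash.length !== presentedHash.length) return false;
  return timingSafeEqual(presentedHash, storedHash);
}
```

Comparing digests with `timingSafeEqual` rather than `===` keeps the comparison time independent of how many leading bytes match.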

CORS & Security Headers

Strict CORS policy (only heycmo.ai and app.heycmo.ai), plus X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy, and Strict-Transport-Security headers on all responses.

Body Size Limits

Request body size is enforced per endpoint: 100KB for API routes, 1MB for Stripe webhooks, 10KB for admin routes.

SSRF Protection

All URL-accepting tools validate against private/internal IP ranges (localhost, 10.x, 172.16-31.x, 192.168.x, link-local, AWS metadata endpoint) to prevent server-side request forgery.
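
The range check can be sketched as follows. This is a simplified illustration with hypothetical function names: it covers only literal IPv4 hosts and localhost names, whereas a production check would also resolve DNS and validate the resolved addresses, and would handle IPv6 ranges more fully.

```typescript
// Parse a dotted-quad IPv4 literal into four octets, or return null.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((part) => (/^\d{1,3}$/.test(part) ? Number(part) : NaN));
  return octets.every((octet) => octet >= 0 && octet <= 255) ? octets : null;
}

// True for the private and internal ranges listed above.
function isPrivateIPv4(octets: number[]): boolean {
  const a = octets[0] ?? 0;
  const b = octets[1] ?? 0;
  return (
    a === 127 ||                         // loopback
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, includes 169.254.169.254 (AWS metadata)
  );
}

// Accept only http(s) URLs whose host is not a blocked name or address.
function isUrlAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost") || host === "[::1]") return false;
  const octets = parseIPv4(host);
  return octets === null || !isPrivateIPv4(octets);
}
```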
