Architecture Overview
Active Development
The architecture is evolving rapidly. New subsystems are added frequently. This document reflects the current state of the codebase.
High-Level Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ External Services │
│ GitHub · GitLab · Jira · Plane · Confluence │
└─────────────────────────┬───────────────────────────────────────┘
│ Webhooks (POST /webhook/:slug)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Aether Go Backend (Fiber) │
│ │
│ ┌──────────────┐ ┌───────────────┐ ┌──────────────────────┐ │
│ │ Webhook │─▶│ Task/Event │─▶│ LLM Service │ │
│ │ Handler │ │ Resolver │ │ (memory + tools + │ │
│ └──────────────┘ └───────────────┘ │ MCP + tool calls) │ │
│ └──────────┬───────────┘ │
│ ┌──────────────────────────────────┐ │ │
│ │ gRPC Agent Gateway (:9090) │ │ │
│ │ ┌──────┐ ┌──────┐ ┌──────────┐ │ │ │
│ │ │LLM │ │Tools │ │ Memory │ │ │ │
│ │ │Hub │ │Hub │ │ Hub │ │ │ │
│ │ └──────┘ └──────┘ └──────────┘ │ │ │
│ └────────────────┬─────────────────┘ │ │
└───────────────────│──────────────────────────────│──────────────┘
│ gRPC │ HTTP
┌───────────┴──────────┐ ┌──────────▼──────────┐
│ External Agents │ │ LiteLLM Proxy │
│ (Python/Go/TS/any) │ │ (OpenAI, Claude, │
└──────────────────────┘ │ Ollama, etc.) │
└─────────────────────┘
┌────────────┐ ┌────────────┐ ┌────────────┐
│ PostgreSQL │ │ Redis │ │ NATS │
│ +pgvector │ │(mem+ratelim│ │(JetStream) │
└────────────┘ └────────────┘ └────────────┘
```

Go Backend Structure
The backend follows a clean architecture pattern with three layers: Domain, Application, and Infrastructure.
```
goway/internal/
├── domain/
│ ├── entity/ # Domain models (pure Go structs, no external deps)
│ └── port/ # Interfaces defining contracts
│ ├── repository/ # Data access interfaces
│ ├── issuetracker/ # Tracker client interfaces
│ ├── memory/ # Memory storage interfaces
│ ├── messaging/ # Event publisher interfaces
│ ├── llm/ # LLM provider interfaces
│ ├── polling/ # Polling source interfaces
│ └── ingestion/ # Knowledge ingestion interfaces
├── application/
│ ├── task/ # Task registry and event mapping
│ ├── memory/ # Semantic search service
│ ├── routing/ # Capability-based agent routing
│ ├── tooling/ # Tool registry management
│ ├── mcp/ # MCP discovery and management
│ ├── metadata/ # Metadata sync services
│ ├── plugins/ # Plugin loader
│ └── dto/ # Data transfer objects
└── infrastructure/
└── adapter/
├── postgres/ # Repository implementations (33+ files)
├── redis/ # Short-term memory, session store
├── http/handler/ # Fiber HTTP handlers (25+ files)
├── llm/ # LiteLLM client and orchestration
├── embedding/ # Vector embedding generation
├── issuetracker/ # GitHub, GitLab, Jira, Plane clients
├── agentgateway/ # gRPC Agent Gateway server
├── ingestion/ # Knowledge connectors (Confluence)
├── nats/ # NATS event publisher
├── mcp/ # MCP HTTP client
├── ratelimiter/ # Redis-backed sliding window rate limiter
├── auth/ # Agent authentication (bcrypt)
        └── config/          # Environment configuration
```

Key principle: The domain layer has no external dependencies. Application and infrastructure layers depend on domain interfaces (ports), not implementations. This makes every layer independently testable.
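As a minimal sketch of the ports pattern (condensed: in the real tree the entity lives in domain/entity and the interface in domain/port, and the exact field and method signatures here are assumptions):

```go
// Domain-side contract: nothing here imports an adapter.
package port

import "context"

// Agent is a pure domain entity (fields follow the entity table
// below; the types are assumed for illustration).
type Agent struct {
	ID       int64
	Name     string
	Prompt   string
	Context  string
	Identity string
}

// AgentRepository is a port. The application layer depends on this
// interface; infrastructure/adapter/postgres supplies the concrete
// implementation, so application code can be tested with an
// in-memory fake.
type AgentRepository interface {
	GetByID(ctx context.Context, id int64) (*Agent, error)
	List(ctx context.Context) ([]Agent, error)
}
```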
Event Flow (Webhook → Response)
```
1. POST /webhook/:slug
└─ webhook.go handler
├─ Verify signature/token
├─ Persist raw event to webhook_events table
└─ Extract source + event + action
2. Normalize event
└─ SourceEventMappingRepository.GetCanonicalTrigger(source, event, action)
└─ Query: source_event_mappings table
└─ Returns: canonical_trigger (e.g., "issue.created")
└─ If no mapping: log and discard
3. Resolve tasks
└─ TaskRepository.ResolveTasks(canonical_trigger)
└─ Query: trigger_task_mappings → tasks (JOIN)
└─ Returns: []Task{ID, ActorID, ActorFallback, Description}
4. For each task:
a. Get agent from AgentRepository.GetByID(task.ActorID)
b. If agent has AGENT_GATEWAY session → stream task to external agent
c. Otherwise → call internal LLM service
5. Internal LLM call (llm/service.go):
a. Build system prompt from agent.Prompt + context + identity
b. Inject relevant memories via semantic search
c. Inject webhook event context
d. Call LiteLLM proxy (with tool definitions if tool calling enabled)
e. Handle tool calls in a loop (max LLM_MAX_TOOL_ROUNDS iterations)
f. Route response to issue tracker
6. Response routing (issuetracker/handler.go):
└─ Determine action based on task type:
- PM tasks → add_comment
- TA tasks → update_description (appends "Technical Analysis" section)
- QA tasks → add_comment
     - Dev tasks → add_comment
```
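Condensed into Go, steps 2–4 look roughly like this (the interface shapes are assumptions derived from the names above; error handling, persistence, and the actor fallback are trimmed):

```go
package dispatch

import "context"

// Hypothetical stand-ins for the ports used by the event flow.
type Task struct{ ID, ActorID int64 }
type Agent struct{ ID int64 }

type Mappings interface {
	GetCanonicalTrigger(ctx context.Context, source, event, action string) (string, error)
}
type Tasks interface {
	ResolveTasks(ctx context.Context, trigger string) ([]Task, error)
}
type Agents interface {
	GetByID(ctx context.Context, id int64) (*Agent, error)
}
type Gateway interface {
	HasSession(agentID int64) bool
	StreamTask(ctx context.Context, agentID int64, t Task) error
}
type LLM interface {
	Execute(ctx context.Context, a *Agent, t Task) error
}

// Dispatch mirrors steps 2–4 of the flow above.
func Dispatch(ctx context.Context, m Mappings, tr Tasks, ag Agents,
	gw Gateway, llm LLM, source, event, action string) error {
	// 2. Normalize (source, event, action) into a canonical trigger.
	trigger, err := m.GetCanonicalTrigger(ctx, source, event, action)
	if err != nil {
		return nil // no mapping: log and discard
	}
	// 3. Resolve the tasks that fire for this trigger.
	tasks, err := tr.ResolveTasks(ctx, trigger)
	if err != nil {
		return err
	}
	// 4. Route each task to an external agent or the internal LLM.
	for _, t := range tasks {
		a, err := ag.GetByID(ctx, t.ActorID)
		if err != nil {
			continue // the real flow consults task.ActorFallback
		}
		if gw.HasSession(a.ID) {
			_ = gw.StreamTask(ctx, a.ID, t) // external agent via gRPC
		} else {
			_ = llm.Execute(ctx, a, t) // internal LLM service
		}
	}
	return nil
}
```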
Domain Entities

The `internal/domain/entity/` directory contains 33 entity files representing the full data model:
Core Entities
| Entity | Key Fields | Purpose |
|---|---|---|
| Agent | id, name, role_id, prompt, context, identity | AI persona with system prompt |
| AgentRole | id, name | Role: pm, ta, qa, dev, hub |
| Project | id, name, source, source_project_id | External project binding |
| Task | id, actor_id, actor_fallback, description | Work unit definition |
Event Routing
| Entity | Purpose |
|---|---|
| SourceEventMapping | Maps (source, event, action) → canonical_trigger |
| TriggerTaskMapping | Maps canonical_trigger → task_id |
| WebhookEvent | Persisted record of every received webhook |
| WebhookMapping | Advanced routing rule with jq filter |
Agent Gateway (gRPC)
| Entity | Purpose |
|---|---|
| AgentSession | Active gRPC connection with heartbeat tracking |
| AgentCapability | Declared capability (e.g., code_review, security) |
| AgentAuditLog | Immutable audit trail of agent actions |
| TaskExecution | Task execution history with metrics |
Memory & Knowledge
| Entity | Purpose |
|---|---|
| AgentMemory | Persistent memory with pgvector embedding |
| KnowledgeSource | External knowledge source config (Confluence, etc.) |
| DocumentChunk | Ingested document fragment with embedding |
| IngestionJob | Knowledge ingestion job tracking |
Tools & MCP
| Entity | Purpose |
|---|---|
| ToolDefinition | Tool schema, parameters, and metadata |
| ToolExecution | Tool call history with timing and tokens |
| MCPServer | MCP (Model Context Protocol) server config |
Integration & Polling
| Entity | Purpose |
|---|---|
| Integration | Dynamic integration credentials (encrypted) |
| IntegrationAssignment | Agent-specific credential assignment |
| PollSource | Polling configuration for non-webhook sources |
| PollSourceState | Last-polled tracking |
Key Subsystems
Task Registry & Event Routing
`application/task/registry.go` — The central routing table:
- Manages canonical trigger → task mappings
- Resolves which tasks fire for a given webhook event
- Configuration lives in the database, seeded from YAML
Capability-Based Agent Routing
`application/routing/capability_router.go` — Selects the best agent for a task based on declared capabilities:
- Agents declare capabilities (e.g., `code_review`, `security`, `testing`)
- When a task fires, the router scores available agents by capability match
- Supports fallback chains when no matching agent is found
- Load balancing when multiple agents share the same capabilities
This complements explicit `actor_id` assignment in tasks; the two mechanisms work together.
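A minimal sketch of the scoring idea (the function and type names are invented; the real capability_router.go may score and load-balance differently):

```go
package routing

// CandidateAgent pairs an agent with its declared capabilities.
type CandidateAgent struct {
	ID           int64
	Capabilities map[string]bool
}

// score counts how many required capabilities an agent declares.
func score(required []string, a CandidateAgent) int {
	n := 0
	for _, c := range required {
		if a.Capabilities[c] {
			n++
		}
	}
	return n
}

// SelectAgent returns the best-scoring agent, or false when nothing
// matches (the caller then walks the fallback chain).
func SelectAgent(required []string, agents []CandidateAgent) (CandidateAgent, bool) {
	var best CandidateAgent
	bestScore := 0
	for _, a := range agents {
		if s := score(required, a); s > bestScore {
			best, bestScore = a, s
		}
	}
	return best, bestScore > 0
}
```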
LLM Service
`infrastructure/adapter/llm/service.go` — Orchestrates the full LLM interaction:
- Builds the full system prompt (agent identity + context + responsibilities)
- Runs semantic search to find relevant memories → appends to prompt
- Calls LiteLLM proxy with the message
- If the LLM returns tool calls, executes them (loop up to `LLM_MAX_TOOL_ROUNDS`; sketched below)
- Routes the final response to the appropriate issue tracker action
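A condensed sketch of that loop (the type names are stand-ins; the real orchestration in service.go also injects memories and routes the final response):

```go
package llm

import (
	"context"
	"errors"
)

// Minimal stand-in types for the sketch.
type Message struct{ Role, Content, ToolCallID string }
type ToolCall struct{ ID, Name, Arguments string }
type ChatResponse struct {
	Content   string
	ToolCalls []ToolCall
}

type ChatClient interface {
	Chat(ctx context.Context, msgs []Message, toolSchemas []string) (*ChatResponse, error)
}
type ToolRegistry interface {
	Schemas() []string
	Execute(ctx context.Context, name, args string) (string, error)
}

// runToolLoop keeps calling the model, executing any tool calls it
// returns, until it answers with plain text or the round budget
// (LLM_MAX_TOOL_ROUNDS) runs out.
func runToolLoop(ctx context.Context, client ChatClient, tools ToolRegistry,
	msgs []Message, maxRounds int) (string, error) {
	for round := 0; round < maxRounds; round++ {
		resp, err := client.Chat(ctx, msgs, tools.Schemas())
		if err != nil {
			return "", err
		}
		if len(resp.ToolCalls) == 0 {
			return resp.Content, nil // final answer
		}
		for _, tc := range resp.ToolCalls {
			out, err := tools.Execute(ctx, tc.Name, tc.Arguments)
			if err != nil {
				out = "tool error: " + err.Error()
			}
			// Feed each tool result back as a message for the next round.
			msgs = append(msgs, Message{Role: "tool", Content: out, ToolCallID: tc.ID})
		}
	}
	return "", errors.New("exceeded LLM_MAX_TOOL_ROUNDS")
}
```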
Memory System
Three-tier architecture:
- Short-term (Redis): TTL-based session memory. Cleared automatically.
- Long-term (PostgreSQL): Persistent memories stored in the `agent_memories` table.
- Semantic search (pgvector): Vector embeddings stored alongside memories. Natural language search finds relevant context.
Memory is automatically injected into LLM prompts for every call.
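For illustration, a top-k semantic lookup with pgvector could look like this (assuming a database/sql connection with $n placeholders; only the agent_memories table name comes from this doc, the column names are guesses):

```go
package memory

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// SearchMemories returns the k memories closest to the query
// embedding, using pgvector's cosine-distance operator (<=>).
// Column names are assumptions for illustration.
func SearchMemories(ctx context.Context, db *sql.DB, agentID int64,
	embedding []float32, k int) ([]string, error) {
	rows, err := db.QueryContext(ctx, `
		SELECT content
		FROM agent_memories
		WHERE agent_id = $1
		ORDER BY embedding <=> $2::vector
		LIMIT $3`, agentID, vectorLiteral(embedding), k)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []string
	for rows.Next() {
		var content string
		if err := rows.Scan(&content); err != nil {
			return nil, err
		}
		out = append(out, content)
	}
	return out, rows.Err()
}

// vectorLiteral renders a []float32 as a pgvector literal, e.g. "[0.1,0.2]".
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = fmt.Sprintf("%g", f)
	}
	return "[" + strings.Join(parts, ",") + "]"
}
```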
Tool System
Tools are the mechanism by which agents take actions in the world (post comments, update issues, etc.):
- Tool definitions live in the `tool_definitions` table
- Tool providers implement a common interface and are registered in the tool registry (see the sketch after the provider list)
- LLM tool calling — The LLM receives tool schemas and decides which to call
- MCP servers — External MCP servers can dynamically extend the tool set
Built-in tool providers:
- GitHub: `add_comment`, `update_description`, `get_issue`, `create_label`
- GitLab: `add_comment`, `update_description`, `get_issue`
- Jira: `add_comment`, `update_description`, `get_issue`, `transition_issue`
- Plane: `add_comment`, `update_description`, `get_issue`
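A sketch of the provider/registry shape (this interface is illustrative, not the actual tooling package API):

```go
package tooling

import (
	"context"
	"fmt"
)

// ToolProvider is a hypothetical common interface for built-in
// providers such as GitHub, GitLab, Jira, and Plane.
type ToolProvider interface {
	Name() string    // e.g. "github"
	Tools() []string // tool names this provider exposes
	Execute(ctx context.Context, tool string, args map[string]any) (string, error)
}

// Registry maps tool names to providers. In this sketch the last
// registration wins when two providers expose the same tool name.
type Registry struct{ byTool map[string]ToolProvider }

func NewRegistry() *Registry {
	return &Registry{byTool: map[string]ToolProvider{}}
}

func (r *Registry) Register(p ToolProvider) {
	for _, t := range p.Tools() {
		r.byTool[t] = p
	}
}

// Execute dispatches an LLM tool call to the owning provider.
func (r *Registry) Execute(ctx context.Context, tool string, args map[string]any) (string, error) {
	p, ok := r.byTool[tool]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", tool)
	}
	return p.Execute(ctx, tool, args)
}
```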
gRPC Agent Gateway
`infrastructure/adapter/agentgateway/` — Enables external agents to participate in Aether's task processing (agent-side sketch below):
- Agent registers with ID + secret → receives session ID
- Agent opens a streaming connection to receive tasks
- Agent uses Hub services (LLM, tools, memory, messaging) for processing
- Agent reports completion or failure
- Rate limiting enforced per trust level via Redis
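From the agent side, that lifecycle might look like the sketch below. The pb package, message fields, and RPC names are hypothetical placeholders (consult the actual proto definitions); only the register, stream, report shape follows the list above.

```go
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	pb "example.com/aether/agentgateway/pb" // hypothetical generated stubs
)

func main() {
	conn, err := grpc.NewClient("localhost:9090",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	client := pb.NewAgentGatewayClient(conn)
	ctx := context.Background()

	// 1. Register with agent ID + secret; receive a session ID.
	reg, err := client.Register(ctx, &pb.RegisterRequest{AgentId: "reviewer-1", Secret: "..."})
	if err != nil {
		log.Fatal(err)
	}

	// 2. Open the streaming connection that delivers tasks.
	stream, err := client.StreamTasks(ctx, &pb.StreamRequest{SessionId: reg.SessionId})
	if err != nil {
		log.Fatal(err)
	}
	for {
		task, err := stream.Recv()
		if err != nil {
			log.Fatal(err) // reconnect/backoff logic elided
		}
		// 3. Process the task, calling back into the Hub services
		//    (LLM, tools, memory, messaging) as needed.
		// 4. Report completion or failure.
		_, _ = client.ReportResult(ctx, &pb.ResultRequest{
			SessionId: reg.SessionId,
			TaskId:    task.Id,
			Success:   true,
		})
	}
}
```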
Knowledge Ingestion
`infrastructure/adapter/ingestion/` — Ingests external knowledge into the memory system:
- Confluence (implemented): Fetches pages, splits them into chunks (sketched below), generates embeddings, and stores them in `document_chunks`
- The framework supports additional connectors (see Roadmap)
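The chunking step can be as simple as a fixed window with overlap; a minimal sketch (the sizes are illustrative, not the connector's actual settings):

```go
package ingestion

// ChunkText splits a document into overlapping rune windows so each
// chunk can be embedded independently. Typical values might be
// size=512, overlap=64, but those are assumptions.
func ChunkText(text string, size, overlap int) []string {
	if size <= 0 || overlap >= size {
		return nil
	}
	runes := []rune(text)
	var chunks []string
	for start := 0; start < len(runes); start += size - overlap {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}
```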
MCP (Model Context Protocol)
`infrastructure/adapter/mcp/` — MCP server integration:
- MCP servers are registered in the `mcp_servers` table
- At runtime, Aether discovers available tools from MCP servers
- Agent-specific MCP server assignments are supported
- Tools from MCP servers are available to the LLM alongside built-in tools
Polling Scheduler
`application/polling/` — For sources that don't support webhooks (a scheduler-loop sketch follows the list):
- Configurable polling interval per source
- State tracking (last polled timestamp)
- Triggers same event pipeline as webhooks on new items
- Works alongside webhooks (both can be enabled simultaneously)
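A minimal version of such a loop (Source and Item are illustrative stand-ins; the real scheduler persists its cursor in PollSourceState rather than in memory):

```go
package polling

import (
	"context"
	"time"
)

// Source is a hypothetical polling port: Poll returns items that
// appeared since the given timestamp.
type Source interface {
	Poll(ctx context.Context, since time.Time) ([]Item, error)
}
type Item struct{ ID string }

// Run polls on a fixed interval and hands new items to emit, which
// feeds the same trigger → task pipeline used by webhooks.
func Run(ctx context.Context, src Source, interval time.Duration, emit func(Item)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	last := time.Now().Add(-interval)
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			items, err := src.Poll(ctx, last)
			if err != nil {
				continue // retry on the next tick
			}
			last = time.Now()
			for _, it := range items {
				emit(it)
			}
		}
	}
}
```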
Database Schema (42 Migrations)
The database schema has grown through 42 sequential migrations. Key milestones:
| Migrations | What Was Added |
|---|---|
| 001–003 | Webhook events, definitions, subscriptions |
| 004–009 | Agent roles, agents, projects, source mappings, project assignments |
| 010–011 | Agent memories, watchdog logging |
| 012–014 | Tasks, trigger mappings, source event mappings |
| 015–017 | Workflows, integrations, integration assignments |
| 018–021 | Agent email, poll sources and state |
| 022–024 | Webhook mappings (advanced), tool definitions, tool executions |
| 025–031 | Agent auth secrets, sessions, capabilities, audit logs, task execution tracking |
| 032–035 | pgvector extension, knowledge sources, document chunks, ingestion jobs |
| 036–042 | Agent budgets, full-text search indexes, memory metadata, MCP server config |
See Database Guide for migration commands.
Issue Tracker Integration
`infrastructure/adapter/issuetracker/` — Routes agent responses back to the originating platform:
```go
// Each tracker implements the IssueTrackerClient interface
type IssueTrackerClient interface {
AddComment(ctx context.Context, projectID, issueID, comment string) error
UpdateDescription(ctx context.Context, projectID, issueID, content string) error
GetIssue(ctx context.Context, projectID, issueID string) (*Issue, error)
}
```

Two registry types:
- Static registry — Configured from environment variables (`GITHUB_TOKEN`, `JIRA_URL`, etc.)
- Dynamic registry — Configured via API, credentials stored encrypted in the database
`issuetracker/handler.go` decides which tracker action to take based on the task type (PM → comment, TA → description update, etc.).
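That decision reduces to a small switch over the agent role (a sketch; the role names follow the AgentRole entity above, and the real handler logic is richer):

```go
package issuetracker

import "context"

// RouteResponse maps an agent role to a tracker action, using the
// IssueTrackerClient interface shown above.
func RouteResponse(ctx context.Context, c IssueTrackerClient,
	role, projectID, issueID, text string) error {
	switch role {
	case "ta":
		// TA output is appended as a "Technical Analysis" section.
		return c.UpdateDescription(ctx, projectID, issueID, text)
	default: // pm, qa, dev
		return c.AddComment(ctx, projectID, issueID, text)
	}
}
```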
Shared Utilities
Two utility packages exist to eliminate repeated boilerplate across the backend. When contributing, always use them instead of writing inline equivalents.
httputil — HTTP Handler Utilities
Path: `internal/infrastructure/adapter/http/handler/httputil/`
All ~25 Fiber HTTP handlers share a common set of helpers for error responses, parameter parsing, and service-availability guards.
```go
// Error responses (always use these — never write inline c.Status(...).JSON(...))
httputil.ErrInternal(c, err) // 500
httputil.ErrBadRequest(c, "message") // 400
httputil.ErrNotFound(c, "Resource not found") // 404
httputil.ErrUnavailable(c, "Service offline") // 503
// URL parameter parsing (returns (0, false) and writes 400 on failure)
id, ok := httputil.ParseIntParam(c, "id")
if !ok { return nil }
// Query string helpers
page, pageSize := httputil.PaginationParams(c, 20)
search := httputil.OptionalString(c.Query("search")) // *string or nil
active := httputil.OptionalBool(c.Query("active")) // *bool or nil
// Nil-service guard (returns false and writes 503 when nil)
if !httputil.RequireService(c, h.svc, "Service not configured") { return nil }
```
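Composed, a typical handler reads like this (AgentHandler and its svc field are hypothetical; the httputil helpers are the ones listed above, assumed to return Fiber-compatible errors):

```go
// GetAgent shows the helpers combined in one (hypothetical) handler.
func (h *AgentHandler) GetAgent(c *fiber.Ctx) error {
	if !httputil.RequireService(c, h.svc, "Agent service not configured") {
		return nil // 503 already written
	}
	id, ok := httputil.ParseIntParam(c, "id")
	if !ok {
		return nil // 400 already written
	}
	agent, err := h.svc.GetByID(c.UserContext(), id)
	if err != nil {
		return httputil.ErrInternal(c, err) // 500
	}
	if agent == nil {
		return httputil.ErrNotFound(c, "Agent not found") // 404
	}
	return c.JSON(agent)
}
```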
httpclient — Issue Tracker HTTP Client

Path: `internal/infrastructure/adapter/issuetracker/httpclient/`
All four issue tracker adapters (GitHub, GitLab, Jira, Plane) delegate their HTTP calls to `httpclient.DoRequest`. New tracker implementations must do the same rather than writing a new `doRequest` from scratch.
```go
func (c *Client) doRequest(ctx context.Context, method, url string, payload interface{}) (*issuetracker.Response, error) {
return httpclient.DoRequest(ctx, c.httpClient, method, url, payload, c.setHeaders, c.logger, "MyTracker")
}
```

HTTP errors (4xx/5xx) return `Response{Success: false}` rather than a Go error — call sites check `result.Success` rather than `err != nil` for provider-level failures.
`httpclient.BuildHTTPClient(insecureTLS, logger, "Provider")` creates the `*http.Client` with a 30s timeout and optional TLS skip, keeping TLS configuration consistent across all adapters.
Services Started at Boot
The `cmd/server/main.go` entry point initializes ~20 services in dependency order:
- Logger (Zap)
- Config (env vars)
- Database connection + auto-migrations
- All repositories (22+)
- Redis connection
- NATS/Redis event bus
- Auth service (bcrypt)
- LiteLLM client
- gRPC Agent Gateway (if `AGENT_GATEWAY_ENABLED=true`)
- Rate limiter (if `RATE_LIMIT_ENABLED=true`)
- Session cleanup service
- Poll scheduler
- Tool execution cleanup job
- Audit log cleanup job
- Issue tracker registry (static + dynamic)
- Tracker handler (response routing)
- Metadata sync service
- Tool registry + plugin loader
- MCP client + services
- LLM service (with tool calling)
- Embedding service + semantic search service
- Fiber HTTP app + all route handlers
- Graceful shutdown handler
Technology Stack
| Component | Technology | Why |
|---|---|---|
| Backend language | Go 1.24 | Performance, concurrency, type safety |
| HTTP framework | Fiber v2 | Fast, Express-like, middleware ecosystem |
| gRPC | google.golang.org/grpc | External agent protocol |
| Database | PostgreSQL 16 | ACID, full-text search, pgvector extension |
| Vector search | pgvector | Semantic memory search embedded in PostgreSQL |
| Cache/memory | Redis 8 | Fast TTL-based memory, sliding window rate limits |
| Message queue | NATS 2 (JetStream) | Async task processing, inter-agent messaging |
| LLM interface | LiteLLM | Provider-agnostic LLM access |
| Migrations | golang-migrate | Versioned, reversible migrations |
| Logging | Zap | Structured, high-performance logging |
| Frontend | React 18 + TypeScript | Modern, type-safe UI |
| Build tool | Vite | Fast dev server and bundling |
| Containerization | Docker Compose | Local development and deployment |
