MarvisX architecture paper (arXiv mirror)

Abstract

We present MarvisX, an open-source Company Brain designed to serve as a shared substrate for human developers and AI agents working on the same codebase across multiple projects. The system addresses three operational queries underserved by existing platforms: cross-project impact analysis (Q3), immutable decision audit (Q6), and agent-native knowledge sharing without re-onboarding cost.

The architecture comprises eight functional domains (capture, store, retrieve, reflect, execute, agent-native, trust, productization), spanning roughly forty-three capabilities and one hundred fifty building blocks. The core innovation is a knowledge graph with sixteen deterministic edge types (including calls, imports, defines, produces, contains, describes, documents, cites, applies_to, depends_on, mentions, refers_to, shares_tag, similar_to, and the bridge type resolves_to), combined with an append-only, foreign-key-linked audit log and a five-layer cognitive reflection loop.

We propose a five-form knowledge taxonomy — ADR (architecture decisions), Spec (formal requirements), Playbook (procedural runbooks), Tribal (institutional context), and External (third-party reference material) — as a typed schema for organising organisational memory across humans and agents. Each form has distinct lifecycle, validity, and retrieval semantics.

We describe the Constitution hook system (five immutable safety rules enforced at the system level via bash hooks) and discuss alignment with EU AI Act Article 12 record-keeping requirements. We release the architecture as a defensive publication; the implementation is available under the BSL classic license at github.com/emiliomartucci/marvisx-oss.

§1 — Introduction

This section establishes the operational problem statement: AI agents lose context across sessions, human developers lose context across projects, and organisations cannot answer simple audit questions ("who approved this change on date X?") without manual archaeology through Slack threads and Notion pages.

We position MarvisX in a landscape of 25 surveyed platforms (including mem0, Letta, Cognee, Graphiti, Glean, Atlan, and the PM incumbents) and identify two queries for which existing platforms provide no first-class primitive: cross-project deterministic impact analysis (Q3) and immutable decision audit (Q6).

§2 — Knowledge graph design

The MarvisX knowledge graph defines 16 deterministic edge types over a typed node space. Nodes carry the identifier format {prefix}:{kind}:{slug}, where prefix denotes the source domain (py, ts, task, pr, commit, handoff, solution, learning, audit, etc.) and kind denotes the role (function, file, module, artifact, sheet).
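The identifier format above can be illustrated with a small parser. This is a minimal sketch, not the project's implementation: the NodeId class, the parse_node_id function, the example identifier, and the choice to let the slug contain further colons are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeId:
    prefix: str  # source domain, e.g. "py", "task", "commit"
    kind: str    # role, e.g. "function", "file", "module"
    slug: str    # remainder of the identifier

    def __str__(self) -> str:
        return f"{self.prefix}:{self.kind}:{self.slug}"


def parse_node_id(raw: str) -> NodeId:
    # Split on the first two colons only, so the slug may itself contain ":"
    # (an assumption; the source does not say whether slugs can contain colons).
    parts = raw.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected prefix:kind:slug, got {raw!r}")
    return NodeId(*parts)


# Hypothetical identifier, used only to exercise the parser.
node = parse_node_id("py:function:billing.invoice.compute_total")
```

Keeping the identifier as a frozen value object means it can be used directly as a dictionary key or set member when building adjacency structures.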

Edge categories:

Code edges (calls, imports, defines): extracted automatically from the AST parse; they enable deterministic impact traversal.
Work chain (produces, contains): link commits → PRs → tasks → artefacts.
Knowledge chain (describes, documents, cites, applies_to): link prose artefacts (handoffs, learnings, plans) to code or to other artefacts.
Cross-project (depends_on, mentions, refers_to, shares_tag, similar_to): link artefacts across project boundaries, enabling cross-project impact analysis and pattern discovery.
Bridge (resolves_to): links module stubs to canonical files (invariant: stub → canonical, never the reverse).
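To make "deterministic impact traversal" concrete, the sketch below walks edges in reverse from a changed node and collects everything that transitively reaches it. It is an illustration under stated assumptions: edges are plain (src, dst, edge_type) tuples, the function name impacted_by and all node identifiers are hypothetical, and the real system presumably queries its graph store rather than an in-memory list.

```python
from collections import defaultdict, deque

CODE_EDGES = frozenset({"calls", "imports", "defines"})


def impacted_by(edges, changed, edge_types=CODE_EDGES):
    """Return every node that transitively reaches `changed` via `edge_types`."""
    # Index edges by destination so we can walk from a node to its dependants.
    reverse = defaultdict(set)
    for src, dst, etype in edges:
        if etype in edge_types:
            reverse[dst].add(src)

    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependant in reverse[current]:
            if dependant != changed and dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen


# Hypothetical edges spanning two projects.
edges = [
    ("py:function:api.checkout", "py:function:billing.total", "calls"),
    ("py:function:billing.total", "py:function:tax.rate", "calls"),
    ("handoff:artifact:q3-notes", "py:function:tax.rate", "mentions"),
    ("ts:module:web.cart", "py:function:api.checkout", "depends_on"),
]

code_impact = impacted_by(edges, "py:function:tax.rate")
cross_project_impact = impacted_by(
    edges, "py:function:tax.rate", CODE_EDGES | {"depends_on"}
)
```

Restricting the traversal to a chosen set of edge types is what keeps the result reproducible: a mentions edge does not pull a prose artefact into the blast radius unless the caller asks for it.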

The graph is populated by a watcher daemon (pir-kg-watcher) that listens to filesystem events and re-runs tree-sitter parsers over modified files. A separate ingester populates artefact nodes from a typed ingestion lane (files, meetings, sessions). Each edge insert is idempotent (an UPSERT keyed on (src_id, dst_id, edge_type)) and timestamped, which enables time-travel as_of queries.
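The idempotent insert and the as_of query can be sketched with SQLite's UPSERT clause. This is a hedged illustration only: the source does not name the database engine, and the table layout, the first_seen column, the integer timestamps, and the helper names are assumptions. The sketch also ignores edge removal, which a real time-travel store would need to record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE edges (
        src_id     TEXT NOT NULL,
        dst_id     TEXT NOT NULL,
        edge_type  TEXT NOT NULL,
        first_seen INTEGER NOT NULL,
        PRIMARY KEY (src_id, dst_id, edge_type)
    )
    """
)


def upsert_edge(src, dst, edge_type, ts):
    # Re-inserting the same (src, dst, type) triple is a no-op, so replaying
    # a parser over an unchanged file leaves the original timestamp intact.
    conn.execute(
        "INSERT INTO edges (src_id, dst_id, edge_type, first_seen) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (src_id, dst_id, edge_type) DO NOTHING",
        (src, dst, edge_type, ts),
    )


def edges_as_of(ts):
    # Time-travel read: only edges that already existed at time `ts`.
    rows = conn.execute(
        "SELECT src_id, dst_id, edge_type FROM edges "
        "WHERE first_seen <= ? ORDER BY first_seen, src_id",
        (ts,),
    )
    return rows.fetchall()


upsert_edge("py:file:a.py", "py:file:b.py", "imports", 100)
upsert_edge("py:file:a.py", "py:file:b.py", "imports", 200)  # replay, ignored
upsert_edge("py:file:c.py", "py:file:b.py", "imports", 300)
```

Choosing DO NOTHING over DO UPDATE keeps the first observation time stable, which is the property an as_of query relies on.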

§3 — The five knowledge forms taxonomy

Organisational memory is heterogeneous. A single "document" abstraction obscures critical lifecycle and retrieval differences. MarvisX defines five typed forms:

ADR (Architecture Decision Record): frozen at write time, immutable, FK-linked to author, date, and context. Superseded only via an explicit supersedes edge.
Spec: formal requirements, versioned, machine-readable. Lifecycle: draft → ratified → superseded.
Playbook: procedural runbook (incident response, deploy, onboarding). Mutable; revisions tracked.
Tribal: institutional context not yet codified (founder notes, war stories). Mutable, salience-scored.
External: third-party reference (vendor doc, academic paper, regulatory text). Read-only mirror, citation-tracked.

Each form has its own salience decay, default retrieval ranker, and approval gate.
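One way to read the taxonomy as a typed schema is an enumeration plus per-form rules. The sketch below encodes only what the list above states: which forms allow in-place edits and the Spec lifecycle. The names Form, IN_PLACE_EDITS, SPEC_TRANSITIONS, and advance_spec are hypothetical, and Spec is left out of the mutability table because the text describes it as versioned rather than mutable or immutable.

```python
from enum import Enum


class Form(Enum):
    ADR = "adr"
    SPEC = "spec"
    PLAYBOOK = "playbook"
    TRIBAL = "tribal"
    EXTERNAL = "external"


# In-place mutability as stated in the list above.
IN_PLACE_EDITS = {
    Form.ADR: False,       # frozen at write time
    Form.PLAYBOOK: True,   # mutable, revisions tracked
    Form.TRIBAL: True,     # mutable, salience-scored
    Form.EXTERNAL: False,  # read-only mirror
}

# Spec lifecycle: draft -> ratified -> superseded, with no way back.
SPEC_TRANSITIONS = {
    "draft": {"ratified"},
    "ratified": {"superseded"},
    "superseded": set(),
}


def advance_spec(current: str, target: str) -> str:
    """Return the new Spec state, or raise if the transition is not allowed."""
    if target not in SPEC_TRANSITIONS[current]:
        raise ValueError(f"illegal Spec transition: {current} -> {target}")
    return target


state = advance_spec("draft", "ratified")
```

Encoding the lifecycle as an explicit transition table makes illegal moves, such as reverting a ratified Spec to draft, fail loudly instead of silently rewriting history.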

§4 — Brain reflection: a five-layer cognitive loop

The Brain layer is a periodic (cron) cognitive cycle that operates over the substrate:

L1 Substrate — raw events (commits, task transitions, ingest insertions, agent tool calls) collected into an append-only stream.
L2 Digest + Journal — narrative aggregation per scope (company / program / project). Output: what_changed, decisions_observed, open_loops, notable_context, tomorrow_watch.
L3 Drift checker — observed-vs-expected gaps. Severity-scored, axis-tagged (taxonomy / coverage / contradiction / provenance / recurrence).
L4 Memory ops — proposed write operations (reinforce, consolidate, supersede candidate, provenance hardening, orphan, contradiction). Approval-gated.
L5 Learn findings — conclusions ratifiable into the substrate as ADRs, learnings, or guides. Confidence-tiered (low / medium / high), severity-tiered, approval-gated.
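The approval gate on L4 memory operations can be sketched as a proposal record that is only eligible for writing once approved. This is an assumption-laden illustration: the MemoryOp class, the snake_case spellings of the operation kinds, the partition_by_approval helper, and the node identifiers are all hypothetical and not taken from the implementation.

```python
from dataclasses import dataclass

# Operation kinds from the L4 description; the spellings are assumptions.
MEMORY_OP_KINDS = {
    "reinforce",
    "consolidate",
    "supersede_candidate",
    "provenance_hardening",
    "orphan",
    "contradiction",
}


@dataclass
class MemoryOp:
    kind: str
    target: str            # node id the operation would touch
    approved: bool = False  # proposals start unapproved

    def __post_init__(self):
        if self.kind not in MEMORY_OP_KINDS:
            raise ValueError(f"unknown memory op kind: {self.kind!r}")


def partition_by_approval(ops):
    """Split proposals into (ready to apply, still waiting for approval)."""
    ready = [op for op in ops if op.approved]
    held = [op for op in ops if not op.approved]
    return ready, held


# Hypothetical proposals produced by one reflection cycle.
proposals = [
    MemoryOp("reinforce", "learning:artifact:retry-policy", approved=True),
    MemoryOp("supersede_candidate", "audit:artifact:adr-0007"),
]
ready, held = partition_by_approval(proposals)
```

Defaulting approved to False means a proposal that skips the review step can never be applied by accident.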

§5 — Audit-grade memory

Every state-changing tool call writes to audit_log with foreign-key links to the task, PR, user (or agent), and project that originated the call. The table is append-only at the database level (no UPDATE/DELETE grants for the runtime user). Timestamps use the database transaction clock to prevent agent-side time skew.
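The append-only property can be demonstrated in a self-contained way with SQLite. Note the substitution: the text describes revoking UPDATE/DELETE grants from the runtime user, which SQLite does not support, so this sketch uses triggers as a stand-in for the same guarantee. The column names, the single tasks table, and the example actor are assumptions; the real schema also links PRs, users, and projects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE tasks (id TEXT PRIMARY KEY);
    CREATE TABLE audit_log (
        id          INTEGER PRIMARY KEY,
        task_id     TEXT NOT NULL REFERENCES tasks(id),
        actor       TEXT NOT NULL,
        action      TEXT NOT NULL,
        -- Timestamp comes from the database, not from the calling agent.
        recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- Stand-in for missing UPDATE/DELETE grants: reject both operations.
    CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_log
    BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
    CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_log
    BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
    """
)


def record(task_id, actor, action):
    conn.execute(
        "INSERT INTO audit_log (task_id, actor, action) VALUES (?, ?, ?)",
        (task_id, actor, action),
    )


def is_rejected(sql):
    """Return True if the database refuses to run the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


conn.execute("INSERT INTO tasks (id) VALUES ('task-1')")
record("task-1", "agent:reviewer", "approve_deploy")
```

With this in place, an audit question such as "who approved this deploy?" reduces to a SELECT over audit_log filtered by action and date, and the answer cannot have been edited after the fact by the runtime.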

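This pattern aligns with EU AI Act Art. 12 ("Record-keeping") and with DORA operational resilience requirements. It also lets an operator answer typical Series-B due-diligence audit questions ("who approved deploy on date X?") with a single query over audit_log, in seconds rather than through manual reconstruction.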

§6 — Constitution hooks: bash-enforced safety

Five immutable rules are enforced at the filesystem level via git hooks and pre-tool wrappers:

Rule 1 — Task First · every code change requires a tracked task. block-push-no-task.sh.
Rule 2 — No Hotfix Prod · no direct writes to deployed runtime. block-db-direct-write.sh.
Rule 3 — No Merge on Main · merges only after Triage approval. enforce-no-merge-main.sh.
Rule 4 — Worktree for Code · code edits only in isolated worktrees. enforce-worktree.sh.
Rule 5 — No Bypass Orchestrator · no subtree push or direct deploy. block-subtree-push.sh.

Hooks are deterministic, fail-closed, and apply identically to human and agent actors.
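The fail-closed property can be illustrated independently of bash. The sketch below is written in Python for consistency with the other examples and is not the content of block-push-no-task.sh: the TASK-123 identifier format, the has_task check, and the fail_closed wrapper are all hypothetical. The point is the wrapper's contract: an action is allowed only when the check explicitly returns True, and any error counts as a block.

```python
import re

# Hypothetical task-ID format; the real tracker's format is not given.
TASK_PATTERN = re.compile(r"\bTASK-\d+\b")


def has_task(commit_message: str) -> bool:
    """Rule 1 style check: the change must reference a tracked task."""
    return bool(TASK_PATTERN.search(commit_message))


def fail_closed(check, *args) -> bool:
    """Allow only when the check returns True; errors count as a block."""
    try:
        return check(*args) is True
    except Exception:
        # A crashing or misconfigured check must never let the action through.
        return False


allowed = fail_closed(has_task, "TASK-42: fix rounding in invoice totals")
blocked = fail_closed(has_task, "quick fix")
blocked_on_error = fail_closed(has_task, None)  # the check raises TypeError
```

Because the wrapper does not inspect who is calling it, the same decision applies to a human running git push and to an agent invoking a tool.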

§7 — EU AI Act alignment

The architecture aligns with three Article 12 requirements out of the box: automatic logging of system events, FK-linked record provenance, and time-stamped retention with privacy controls. Self-hosted deployment keeps records under the operator's direct control, which we read as supporting the Article 12 record-keeping obligation; multi-tenant managed SaaS cannot trivially offer the same guarantee without retrofitting per-tenant log isolation.

§8 — Related work

We survey 25 platforms across four categories: agent memory (mem0, Letta, Cognee, Graphiti), enterprise search (Glean, Lucidworks), data catalog (Atlan, Alation, Collibra), and PM incumbents (Linear, Notion, Coda). None provides first-class cross-project impact analysis with deterministic edges, and only a subset provides an append-only, FK-linked audit log. We position MarvisX as a Company Brain: a category orthogonal to all four and explicitly agent-native.

§9 — References

The full citation list is available in the arXiv submission. BibTeX: arXiv:XXXX.XXXXX (placeholder; final DOI assigned at submission).

Cite as: Martucci, E. (2026). MarvisX architecture: a 16-edge cross-project knowledge graph with audit-grade memory…. Defensive publication, arXiv:XXXX.XXXXX.


MarvisX architecture: a 16-edge cross-project knowledge graph with audit-grade memory, 5-knowledge-form taxonomy, and Brain-style reflection for agent-native development