Architecture Metadata Registry
The 15 YAML files that define the entire NovaTrek architecture — and why everything is modeled as structured data
What Is the Metadata Registry?
The architecture/metadata/ directory contains 15 YAML files that together form a complete, machine-readable model of the NovaTrek platform. Every service, domain, capability, integration, data store, actor, event, application, ticket, and deployment status is captured as structured data.
This is not documentation in the traditional sense. These files are the single source of truth that portal pages, CALM topology, sequence diagrams, and capability maps are all generated from. Editing a YAML file and pushing to main triggers CI to regenerate the entire architecture portal automatically.
Why YAML Instead of Documentation?
The problem with prose
Traditional architecture documentation is a collection of wiki pages, Word documents, and diagrams maintained by hand. This approach has well-known failure modes:
- Drift — documentation falls behind reality within weeks of being written
- Duplication — the same fact (e.g., "svc-payments is in the Support domain") appears in 12 places and is updated in 3 of them
- No validation — a wiki page can claim anything; there is no way to lint or test it
- No generation — every downstream artifact (diagrams, capability maps, service catalogs) must be manually maintained in parallel
The metadata-driven alternative
By encoding architecture facts as structured YAML, NovaTrek gains:
| Property | Prose Documentation | YAML Metadata |
|---|---|---|
| Single source of truth | Facts scattered across wikis | One file per concern; every artifact is derived from it |
| Machine-readable | Humans only | Parsed by generators, linters, CALM exporters |
| Diffable in PRs | Ambiguous prose changes | Structural changes reviewable field-by-field |
| Generates everything | Manual upkeep for each artifact | Portal pages, diagrams, topology — all auto-generated |
| Validatable | Cannot verify consistency | Scripts can check referential integrity across files |
| Version-controlled | Wiki history is opaque | Full git history with blame, diff, and revert |
The core principle: edit the data, not the documents. Documents are generated outputs.
File Reference
Domain and Service Model
| File | Purpose | Key Fields |
|---|---|---|
| domains.yaml | Groups all 22 services into 9 business domains with display colors and icons | domain name, color, light (pastel), icon, services list |
| label-to-svc.yaml | Maps human-readable labels (e.g., "Reservations") to service identifiers (e.g., svc-reservations) | flat key-value pairs |
| data-stores.yaml | Database engine, schema, tables, columns, constraints, and features for every service | engine, schema, tables (with column-level detail), volume, backup policy |
| cross-service-calls.yaml | Complete REST integration map — which endpoint calls which endpoint on which service | source endpoint, target service, target method/path, sync/async flag |
| pci.yaml | PCI DSS compliance scope — which services handle payment data and the data flow paths | in-scope services, external systems, data flow pairs |
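To illustrate how a generator might consume domains.yaml, the sketch below inverts a domain-to-services mapping into a service-to-domain lookup. The dictionary shape (a `color` and a `services` list under each domain name) is an assumption inferred from the key fields listed above, not the actual file layout. The "Booking" domain and the color values are hypothetical.

```python
from typing import Dict, List, TypedDict


class Domain(TypedDict):
    color: str
    services: List[str]


# Hypothetical excerpt of a parsed domains.yaml (shape is assumed).
DOMAINS: Dict[str, Domain] = {
    "Support": {"color": "#64748b", "services": ["svc-payments"]},
    "Booking": {"color": "#2563eb", "services": ["svc-reservations"]},
}


def index_services(domains: Dict[str, Domain]) -> Dict[str, str]:
    """Return a service -> domain lookup, rejecting services listed twice."""
    index: Dict[str, str] = {}
    for domain_name, domain in domains.items():
        for service in domain["services"]:
            if service in index:
                raise ValueError(
                    f"{service} is listed in both {index[service]} and {domain_name}"
                )
            index[service] = domain_name
    return index


SERVICE_TO_DOMAIN = index_services(DOMAINS)
```

Rejecting a service that appears under two domains keeps the "one fact, one place" property that motivates the registry.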
Business Capabilities
| File | Purpose | Key Fields |
|---|---|---|
| capabilities.yaml | L1/L2 business capability definitions organized by domain (34 capabilities across 9 domains) | domain id/name, capability id/name, status, owning services |
| capability-changelog.yaml | Single source of truth for all capability changes per solution — L3 capabilities emerge here | ticket, date, solution folder, affected capabilities, L3 sub-capabilities, ADR references |
Capability Data Ownership
For solved tickets, capability mappings live exclusively in capability-changelog.yaml. The tickets.yaml file only carries planned_capabilities for unsolved tickets (planning estimates). This avoids duplication.
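The ownership rule above could be checked mechanically. The following is a minimal sketch, assuming tickets.yaml parses to a mapping of ticket key to fields and that a solved ticket has `status: Solved`. The field layout, the second ticket key, and the capability identifiers are hypothetical.

```python
from typing import Any, Dict, List


def ownership_violations(tickets: Dict[str, Dict[str, Any]]) -> List[str]:
    """List solved tickets that still carry planning estimates."""
    problems: List[str] = []
    for key in sorted(tickets):
        ticket = tickets[key]
        if ticket.get("status") == "Solved" and ticket.get("planned_capabilities"):
            problems.append(
                f"{key}: solved ticket still lists planned_capabilities; "
                "move them to capability-changelog.yaml"
            )
    return problems


# Hypothetical parsed tickets.yaml content (shape and IDs are assumed).
TICKETS = {
    "NTK-10005": {"status": "Solved", "planned_capabilities": ["CAP-2.1"]},
    "NTK-10006": {"status": "Open", "planned_capabilities": ["CAP-3.4"]},
}

VIOLATIONS = ownership_violations(TICKETS)
```

Only the solved ticket is flagged; planning estimates on unsolved tickets are allowed by the rule.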
Actors and Applications
| File | Purpose | Key Fields |
|---|---|---|
| actors.yaml | External systems, frontend applications, and human actors that interact with the platform | actor name, type (person/system/app), description, domain |
| applications.yaml | Frontend app definitions with screen-by-screen user flows | app id, title, type, technology, team, screens with step-by-step service calls |
| app-titles.yaml | Short display names for applications used in portal UI rendering | flat key-value pairs |
| consumers.yaml | Maps each service to the application screens that consume it | service name, list of (app, screen) pairs |
Events
| File | Purpose | Key Fields |
|---|---|---|
| events.yaml | Domain event catalog — channels, producers, consumers, and triggers | event key, channel name, producing service, trigger endpoint, consuming services, domain |
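As a rough illustration of what the event catalog makes possible, the sketch below derives a producer-to-subscribers map from event entries. The field names (`channel`, `producer`, `consumers`), the event key, and the channel name are assumptions based on the key fields listed above; the real events.yaml may be shaped differently.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set

# Hypothetical parsed events.yaml entry (field names are assumed).
EVENTS: Dict[str, Dict[str, Any]] = {
    "reservation.created": {
        "channel": "novatrek.reservations.created",
        "producer": "svc-reservations",
        "consumers": ["svc-payments", "svc-check-in"],
    },
}


def subscribers_by_producer(events: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each producing service to the sorted services consuming its events."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for event in events.values():
        graph[event["producer"]].update(event["consumers"])
    return {producer: sorted(consumers) for producer, consumers in graph.items()}


PUB_SUB = subscribers_by_producer(EVENTS)
```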
Tickets and Delivery
| File | Purpose | Key Fields |
|---|---|---|
| tickets.yaml | Ticket registry with service mappings, priority, status, and capability links | ticket key, summary, status, priority, assignee, solution folder, user story, planned capabilities |
| delivery-status.yaml | Deployment progress across 6 delivery waves with Azure resource tracking | wave number, name, status, deployed date, resource list |
| pipeline-registry.yaml | Central CI/CD pipeline registry — GitHub workflows and per-service CI patterns | repository, global pipelines, per-service pipeline templates |
How These Files Power the Portal
Every portal page is generated from these metadata files, together with the OpenAPI specs in architecture/specs/. Nothing in the portal is maintained by hand.
```text
architecture/metadata/*.yaml
        │
        ▼
  Generator Scripts
  (portal/scripts/)
        │
        ├──► Microservice deep-dive pages (22 pages, 139 sequence diagrams)
        ├──► Application pages with screen flows
        ├──► Service Catalog with Swagger UI links
        ├──► Business Capability map
        ├──► Solution design pages
        ├──► Ticket pages
        ├──► CALM topology JSON (74 nodes, 146 relationships)
        └──► Event Catalog
```
Which scripts consume which files
| Generator Script | Metadata Files Used |
|---|---|
| generate-microservice-pages.py | domains, data-stores, cross-service-calls, label-to-svc, applications, capabilities |
| generate-application-pages.py | applications, consumers, label-to-svc |
| generate-swagger-pages.py | domains, data-stores |
| generate-capability-pages.py | capabilities, capability-changelog |
| generate-ticket-pages.py | tickets, capabilities |
| generate-solution-pages.py | capability-changelog, tickets |
| generate-calm.py | domains, data-stores, cross-service-calls, events, actors |
| ticket-client.py | tickets, capabilities |
All generators use the shared load_metadata.py module, which validates and caches metadata on load.
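The contents of load_metadata.py are not shown here. As a minimal sketch of the "validate and cache on load" idea, a loader could look like the following. The class name, the injected `parse` callable, and the specific validation (rejecting empty or scalar documents) are all assumptions for illustration, not the module's actual API.

```python
from pathlib import Path
from typing import Any, Callable, Dict


class MetadataRegistry:
    """Hypothetical loader: parses each metadata file at most once."""

    def __init__(self, root: Path, parse: Callable[[str], Any]) -> None:
        self.root = root      # e.g. Path("architecture/metadata")
        self.parse = parse    # text -> parsed data, e.g. PyYAML's yaml.safe_load
        self._cache: Dict[str, Any] = {}

    def load(self, name: str) -> Any:
        """Return the parsed content of <root>/<name>.yaml, cached after first use."""
        if name not in self._cache:
            path = self.root / f"{name}.yaml"
            data = self.parse(path.read_text(encoding="utf-8"))
            # Minimal validation: the document must be a non-empty mapping or list.
            if not isinstance(data, (dict, list)) or not data:
                raise ValueError(f"{path} is empty or not a mapping/list")
            self._cache[name] = data
        return self._cache[name]
```

Passing the parser in keeps the sketch free of third-party imports. With PyYAML installed, `MetadataRegistry(Path("architecture/metadata"), yaml.safe_load).load("domains")` would be one way to use it.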
Who Edits These Files — and How?
The architect workflow
Architects do not typically edit YAML metadata by hand. Instead, the AI agent (GitHub Copilot or Roo Code) modifies these files as part of executing the solution design workflow. Here is the typical flow:
```text
Architect receives ticket (e.g., NTK-10005)
        │
        ▼
Architect prompts AI agent to design a solution
        │
        ▼
AI agent reads ticket, specs, and metadata
        │
        ▼
AI agent produces solution design documents AND
updates YAML files as a side effect:
  • capability-changelog.yaml (new capability entries)
  • tickets.yaml (status → Solved, solution link)
  • cross-service-calls.yaml (new integration points)
  • OpenAPI specs (new/modified endpoints)
  • data-stores.yaml (new tables or columns)
        │
        ▼
Architect reviews YAML changes in PR diff
        │
        ▼
Merge to main → CI regenerates portal
```
The architect's role is to prompt, review, and approve — not to author YAML by hand. The AI agent understands the file formats, referential integrity rules, and naming conventions, so it produces structurally correct updates that the architect validates in a pull request diff.
When would an architect edit YAML manually?
Rarely, but it happens for structural or bootstrapping changes:
| Scenario | Example | Why Manual? |
|---|---|---|
| Initial setup | Creating domains.yaml with all 9 domains | One-time bootstrap — no prior data to derive from |
| Domain restructuring | Splitting "Support" into "Payments" and "Communication" | Strategic decision requiring human judgment about taxonomy |
| Fixing a typo | Correcting a team name in actors.yaml | Faster to fix directly than to prompt an AI agent |
| Adding a new service | New entry in domains.yaml, data-stores.yaml, label-to-svc.yaml | Minimal, predictable changes — may be manual or AI-assisted |
For anything involving analysis, cross-referencing, or producing multiple coordinated changes (the common case during solution design), the AI agent handles the updates.
Are these files generated from code?
No. These metadata files are authored data, not generated outputs. They are not extracted from running services, source code, or infrastructure. The flow is:
```text
Authored inputs:                      Generated outputs:
─────────────────                     ───────────────────
architecture/metadata/*.yaml   ──►    portal/docs/ (Markdown pages)
architecture/specs/*.yaml      ──►    portal/docs/microservices/ (deep-dive pages)
                               ──►    portal/docs/microservices/svg/ (sequence diagrams)
                               ──►    architecture/calm/ (CALM topology JSON)
                               ──►    portal/docs/services/api/ (Swagger UI)
```
The YAML files are inputs. The portal pages, diagrams, and CALM topology are outputs. This is the opposite of many "docs-as-code" approaches that extract documentation from source code — here, the architecture model is the primary artifact, and everything else is derived from it.
CI/CD integration
- Edit YAML files (typically via AI agent during solution design, occasionally by hand)
- Commit and push to main
- CI regenerates all portal pages automatically via generate-all.sh
For local preview before pushing:
```bash
# Regenerate all portal pages from metadata
bash portal/scripts/generate-all.sh

# Build the portal locally
cd portal && python3 -m mkdocs build

# Or serve locally for preview
cd portal && python3 -m mkdocs serve
```
Validation rules
- Referential integrity: Service names in cross-service-calls.yaml must exist in domains.yaml
- No duplication: Solved ticket capabilities go in capability-changelog.yaml only, never duplicated in tickets.yaml
- Consistent identifiers: Service names use svc- prefix everywhere (e.g., svc-check-in, never check-in)
- ISO 8601 dates: All date fields use YYYY-MM-DD format
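Three of these rules lend themselves to simple scripted checks. The sketch below is one possible shape for them, not the project's actual validator. It assumes each call record exposes `source` and `target` service names and that dates arrive as strings; both are hypothetical details.

```python
import re
from datetime import date
from typing import Dict, Iterable, List, Set

ISO_DAY = re.compile(r"\d{4}-\d{2}-\d{2}")


def bad_service_ids(names: Iterable[str]) -> List[str]:
    """Service identifiers that do not carry the svc- prefix."""
    return [name for name in names if not name.startswith("svc-")]


def unknown_call_services(known: Set[str], calls: Iterable[Dict[str, str]]) -> List[str]:
    """Services referenced by a call record but not declared in domains.yaml."""
    errors: List[str] = []
    for call in calls:
        for role in ("source", "target"):  # assumed field names
            if call[role] not in known:
                errors.append(f"{role} {call[role]} is not declared in domains.yaml")
    return errors


def bad_dates(values: Iterable[str]) -> List[str]:
    """Values that are not valid calendar dates written as YYYY-MM-DD."""
    invalid: List[str] = []
    for value in values:
        if not ISO_DAY.fullmatch(value):
            invalid.append(value)
            continue
        try:
            date.fromisoformat(value)  # rejects impossible dates such as 2025-02-30
        except ValueError:
            invalid.append(value)
    return invalid
```

The regular expression runs before `date.fromisoformat` because newer Python versions accept other ISO 8601 spellings that the YYYY-MM-DD rule excludes.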
Relationship Between Metadata and OpenAPI Specs
The metadata registry and the OpenAPI specs are complementary:
| Concern | Where It Lives |
|---|---|
| What endpoints does a service expose? | architecture/specs/svc-*.yaml (OpenAPI) |
| What domain does the service belong to? | architecture/metadata/domains.yaml |
| What database does it use? | architecture/metadata/data-stores.yaml |
| Which other services does it call? | architecture/metadata/cross-service-calls.yaml |
| Who publishes/subscribes to events? | architecture/metadata/events.yaml |
| What business capability does it support? | architecture/metadata/capabilities.yaml |
OpenAPI specs define the contract surface (endpoints, schemas, parameters). Metadata files define the architecture topology (how services relate to each other, to domains, to capabilities, and to actors). Together they form the complete architecture model.
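Because the two sources are complementary, it can be useful to confirm they cover the same services. The sketch below lists services that have no matching spec file. It assumes each spec is named after its service identifier (for example svc-check-in.yaml), which matches the svc-*.yaml pattern above but is not confirmed as a rule.

```python
from pathlib import Path
from typing import Iterable, List


def services_without_spec(services: Iterable[str], specs_dir: Path) -> List[str]:
    """Return services that have no <service>.yaml file in specs_dir."""
    # Path.stem drops the .yaml suffix, leaving the assumed service identifier.
    available = {spec.stem for spec in specs_dir.glob("svc-*.yaml")}
    return sorted(set(services) - available)
```

A call such as `services_without_spec(service_names, Path("architecture/specs"))` would then report any service present in the metadata but missing a contract, where `service_names` is whatever list the caller extracts from domains.yaml.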
Quick Reference
Adding a new service: Add entries to domains.yaml, data-stores.yaml, and label-to-svc.yaml at minimum. Create the OpenAPI spec in architecture/specs/. Run bash portal/scripts/generate-all.sh to regenerate portal pages.
Recording a solved ticket's capability impact: Add an entry to capability-changelog.yaml — not to tickets.yaml. Update the ticket's status in tickets.yaml only.
Checking what calls what: Open cross-service-calls.yaml for the full REST integration map, or events.yaml for async pub-sub relationships.
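For questions like "who calls svc-payments?", the REST map can also be queried programmatically instead of read by eye. This is a minimal sketch over hypothetical call records; the field names, paths, and the specific calls shown are invented for illustration and do not reflect the real cross-service-calls.yaml.

```python
from typing import Any, Dict, List

# Hypothetical parsed cross-service-calls.yaml records (shape is assumed).
CALLS: List[Dict[str, Any]] = [
    {"source": "svc-check-in", "target": "svc-reservations",
     "method": "GET", "path": "/reservations/{id}", "async": False},
    {"source": "svc-reservations", "target": "svc-payments",
     "method": "POST", "path": "/payments", "async": False},
    {"source": "svc-check-in", "target": "svc-payments",
     "method": "GET", "path": "/payments/{id}", "async": False},
]


def callers_of(service: str, calls: List[Dict[str, Any]]) -> List[str]:
    """Services that make at least one REST call to `service`."""
    return sorted({call["source"] for call in calls if call["target"] == service})


def callees_of(service: str, calls: List[Dict[str, Any]]) -> List[str]:
    """Services that `service` calls over REST."""
    return sorted({call["target"] for call in calls if call["source"] == service})
```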