Pipeline Security Gates

Every piece of content published to the NovaTrek Architecture Portal passes through a series of automated security gates in the CI/CD pipeline. No content can reach production without passing all gates. This is fundamentally different from wiki-based platforms where content is published the moment an author clicks "Save."

For the complete evidence base including NIST, SLSA, and OWASP citations, see Research Results.

Fictional Domain

Everything on this portal is entirely fictional. NovaTrek Adventures is a completely fictitious company. All pipeline references describe the NovaTrek proof-of-concept implementation.


Gate Architecture

Author writes content
Feature branch (git push)
    ├─── GitHub Push Protection ─── blocks commits containing secrets
Pull Request opened
    ├─── Gate 1: YAML Metadata Validation
    ├─── Gate 2: Solution Folder Structure Validation
    ├─── Gate 3: Data Isolation Audit
    ├─── Gate 4: Portal Build (link validation)
    ├─── Gate 5: Confluence Dry-Run (mirror validation)
    ├─── Gate 6: PR Review Approval (human gate)
Merge to main (only if ALL gates pass)
    ├─── Gate 7: Production Build
    ├─── Gate 8: Static Asset Integrity
Deploy to Azure Static Web Apps
    ├─── Gate 9: Azure Platform Security (DDoS protection, TLS)
Content live on portal

Pre-Merge Gates (PR Phase)

These gates run automatically on every pull request. All must pass before the PR can be merged.

Gate 1 — YAML Metadata Validation

What it checks: All YAML files in architecture/metadata/ are syntactically valid and parseable.

Why it matters: Malformed YAML could cause generators to produce incorrect output or fail silently, leading to missing or corrupted content on the portal.

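Implementation: The validate-solution.yml workflow parses every YAML file with Python's yaml.safe_load(). Using safe_load rather than an unrestricted yaml.load matters: safe_load constructs only plain data types (mappings, lists, scalars), which prevents YAML deserialization attacks that instantiate arbitrary Python objects.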

Blocks merge on failure: Yes.
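
A minimal sketch of what such a check could look like, assuming PyYAML is available in the CI environment. The function name find_invalid_yaml and the injectable parse argument are illustrative and are not taken from validate-solution.yml.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def find_invalid_yaml(
    root: Path, parse: Optional[Callable[[str], Any]] = None
) -> list[tuple[str, str]]:
    """Return (relative path, error message) for each YAML file that fails to parse."""
    if parse is None:
        import yaml  # PyYAML; assumed to be installed in the CI image

        parse = yaml.safe_load  # safe_load builds only plain data types

    failures = []
    for path in sorted(root.rglob("*")):
        if path.suffix not in {".yaml", ".yml"} or not path.is_file():
            continue
        try:
            parse(path.read_text(encoding="utf-8"))
        except Exception as exc:  # any parser error fails the gate
            failures.append((path.relative_to(root).as_posix(), str(exc)))
    return failures
```

A CI step could call find_invalid_yaml(Path("architecture/metadata")) and exit non-zero when the returned list is not empty, which is what makes the gate blocking.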

Gate 2 — Solution Folder Structure Validation

What it checks: Every solution folder under architecture/solutions/ contains the required artifacts:

  • A master document (*-solution-design.md)
  • A capabilities mapping (3.solution/c.capabilities/capabilities.md)

Why it matters: Incomplete solutions could reference non-existent files, causing broken links on the portal or missing capability rollup data.

Blocks merge on failure: Yes.
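
The structure check can be sketched as follows. This is an illustration only: it assumes the master document sits at the top level of each solution folder, and the function name missing_artifacts is hypothetical.

```python
from pathlib import Path

MASTER_DOC_GLOB = "*-solution-design.md"
CAPABILITIES_PATH = "3.solution/c.capabilities/capabilities.md"


def missing_artifacts(solutions_root: Path) -> dict[str, list[str]]:
    """Map each incomplete solution folder name to the artifacts it lacks."""
    problems: dict[str, list[str]] = {}
    for folder in sorted(p for p in solutions_root.iterdir() if p.is_dir()):
        missing = []
        # Assumption: the master document lives directly in the solution folder.
        if not any(folder.glob(MASTER_DOC_GLOB)):
            missing.append(MASTER_DOC_GLOB)
        if not (folder / CAPABILITIES_PATH).is_file():
            missing.append(CAPABILITIES_PATH)
        if missing:
            problems[folder.name] = missing
    return problems
```

An empty dictionary means every solution folder is complete; any entry would fail the gate and name the missing file.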

Gate 3 — Data Isolation Audit

What it checks: Scans all tracked files for patterns that indicate corporate data leakage:

  • Real company names or internal system identifiers
  • Real domain names (only *.novatrek.example.com is permitted)
  • Corporate email patterns
  • Internal project codes or system names
  • API keys, tokens, or credentials in content files

Why it matters: The NovaTrek workspace is synthetic by design. Any real corporate data appearing in the repository represents a data leakage incident. This gate catches it before it reaches the published site.

Implementation: scripts/audit-data-isolation.sh — a custom shell script that runs regex pattern matching against all tracked files.

Blocks merge on failure: Yes.
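
The actual audit is a shell script whose patterns are not reproduced here. As a rough Python illustration of the same idea, the sketch below flags URLs outside the permitted domain and two well-known credential formats. The patterns, the function name audit_text, and the decision to allow the bare novatrek.example.com host are all assumptions.

```python
import re

ALLOWED_DOMAIN = "novatrek.example.com"
URL_HOST_RE = re.compile(r"https?://([A-Za-z0-9.-]+)")
# Illustrative credential shapes: AWS access key IDs and GitHub personal access tokens.
CREDENTIAL_RE = re.compile(r"\b(?:AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36})\b")


def audit_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, finding) pairs for one file's contents."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for host in URL_HOST_RE.findall(line):
            host = host.lower().rstrip(".")
            allowed = host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)
            if not allowed:
                findings.append((lineno, f"disallowed domain: {host}"))
        if CREDENTIAL_RE.search(line):
            findings.append((lineno, "possible credential"))
    return findings
```

Matching on the suffix with a leading dot avoids accepting look-alike hosts such as evilnovatrek.example.com.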

Gate 4 — Portal Build

What it checks: The full MkDocs site builds successfully, including:

  • All generators run (microservice pages, solution pages, capability pages, ticket pages)
  • All internal links resolve to existing pages
  • All referenced assets (SVGs, images) exist
  • MkDocs configuration is valid

Why it matters: A successful build proves that the content is internally consistent. Broken links, missing files, or configuration errors are caught before they reach production.

Blocks merge on failure: Yes.
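
MkDocs reports unresolved internal links as build warnings, and mkdocs build --strict turns warnings into build failures; whether the NovaTrek workflow uses that flag is not stated here. To make the idea concrete, the sketch below shows a simplified, standalone version of a relative-link check. It is not how MkDocs works internally, and the function name broken_links is hypothetical.

```python
import re
from pathlib import Path

# Matches [text](target) and, as a substring, ![alt](target) image references.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME_RE = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_links(docs_root: Path) -> list[tuple[str, str]]:
    """Return (page, target) for relative links whose target file does not exist."""
    broken = []
    for page in sorted(docs_root.rglob("*.md")):
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # external URL, mailto link, or same-page anchor
            path_part = target.split("#", 1)[0]
            if not (page.parent / path_part).exists():
                broken.append((page.relative_to(docs_root).as_posix(), target))
    return broken
```

The sketch only handles relative targets and does not verify that an anchor exists inside the target page.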

Gate 5 — Confluence Dry-Run

What it checks: The Confluence mirror preparation script (confluence-prepare.py) runs successfully and the resulting Markdown passes mark --dry-run validation.

Why it matters: Even though Confluence is a read-only mirror, publishing failures there indicate content formatting issues that may also affect the primary portal.

Blocks merge on failure: Yes.

Gate 6 — PR Review Approval

What it checks: At least one designated reviewer has approved the pull request.

Why it matters: Automated gates catch structural and formatting issues but cannot evaluate content accuracy, architectural correctness, or appropriateness. The human review gate ensures that a second pair of eyes validates the substance of every change.

Configuration: GitHub branch protection rules on main require:

  • At least 1 approving review
  • Dismissal of stale approvals when new commits are pushed
  • No self-approval (the PR author cannot approve their own PR)

Blocks merge on failure: Yes.
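
For illustration, the review settings above map onto the request body of GitHub's REST endpoint for updating branch protection (PUT /repos/{owner}/{repo}/branches/{branch}/protection). The sketch below only builds that body. The status check names passed in are hypothetical, and the strict and enforce_admins values are assumptions rather than documented NovaTrek settings.

```python
import json


def branch_protection_payload(required_checks: list[str]) -> dict:
    """Build a branch protection request body matching the review rules above."""
    return {
        "required_status_checks": {
            "strict": True,  # assumption: branch must be up to date before merging
            "contexts": required_checks,  # names of the CI checks that must pass
        },
        "enforce_admins": True,  # assumption: administrators are not exempt
        "required_pull_request_reviews": {
            "required_approving_review_count": 1,
            "dismiss_stale_reviews": True,
        },
        "restrictions": None,  # no per-user push restrictions
    }


if __name__ == "__main__":
    # Hypothetical check names, printed as the JSON body a client would send.
    print(json.dumps(branch_protection_payload(["validate-solution", "portal-build"]), indent=2))
```

GitHub never lets a pull request author approve their own pull request, so the self-approval rule needs no field in the payload.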


Post-Merge Gates (Deploy Phase)

These gates run after the PR is merged to main, before content reaches production.

Gate 7 — Production Build

What it checks: The full site builds again from the merged main branch. This is not redundant: it catches semantic conflicts and timing issues where two PRs each passed on their own branch but break the build once combined on main.

Why it matters: Defense in depth. Even if a PR gate was somehow bypassed, the production build catches issues before deployment.

Blocks deployment on failure: Yes.

Gate 8 — Static Asset Integrity

What it checks: Non-Markdown assets (SVGs, OpenAPI specs, Swagger UI pages, staticwebapp.config.json) are correctly copied into the build output.

Why it matters: MkDocs only processes Markdown files. Static assets must be explicitly copied into the site/ output directory. Missing assets could break diagrams, API documentation, or security headers.

Implementation: The generate-all.sh script and post-build cp commands handle this.

Blocks deployment on failure: Yes (missing staticwebapp.config.json would remove all security headers).
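
One way to turn the copy step into a verifiable check is to compare each required asset in the build output against its source by hash. The sketch below assumes each asset keeps the same relative path in the source tree and in site/, which may not match the real layout; the function name verify_copied_assets is hypothetical.

```python
import hashlib
from pathlib import Path


def _sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_copied_assets(source_root: Path, site_root: Path, assets: list[str]) -> list[str]:
    """Return one problem description per asset that is absent or altered in the output."""
    problems = []
    for rel in assets:
        src, dst = source_root / rel, site_root / rel
        if not src.is_file():
            problems.append(f"missing from source: {rel}")
        elif not dst.is_file():
            problems.append(f"missing from build output: {rel}")
        elif _sha256(src) != _sha256(dst):
            problems.append(f"content differs from source: {rel}")
    return problems
```

Listing staticwebapp.config.json among the required assets would make its absence a hard failure instead of a silent loss of security headers.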

Gate 9 — Azure Platform Security

What it checks: Azure Static Web Apps provides platform-level protections:

  • DDoS protection (Azure-managed, included with the platform)
  • TLS termination (HTTPS only, managed certificates, TLS 1.2 minimum)
  • Global CDN (Azure Front Door edge nodes, reducing origin exposure)
  • Custom domain validation (prevents domain spoofing)
  • Staging environments (PR deployments go to isolated preview URLs, not production)

Why it matters: Even with perfect content security, the hosting platform must also be secure. Azure Static Web Apps is a managed platform with enterprise-grade security controls — the NovaTrek team does not manage web servers, load balancers, or TLS certificates.


Gate Comparison with Confluence

| Gate | Docs-as-Code | Confluence Equivalent |
| --- | --- | --- |
| Secret scanning | Automated, blocks push | Not available |
| YAML validation | Automated, blocks merge | Not applicable |
| Data isolation audit | Automated, blocks merge | Not available |
| Link validation | Automated, blocks merge | Not available |
| Pre-publish review | Required PR approval | Optional (page restrictions) |
| Build integrity | Automated, blocks deploy | Not applicable |
| Security headers | Version-controlled, gated | Atlassian-managed |
| Platform security | Azure (SOC 2, ISO 27001) | Atlassian (SOC 2, ISO 27001) |

Key difference: Confluence has zero automated gates between editing and publishing. Every control is either manual (page restrictions) or managed by Atlassian (platform security). The docs-as-code model provides 6 automated gates plus a required human review, all of which must pass before content reaches production.


SLSA Framework Alignment

The Supply-chain Levels for Software Artifacts (SLSA) framework, originally proposed by Google and now maintained as an OpenSSF project, provides a widely referenced blueprint for securing CI/CD pipelines against supply chain attacks. The NovaTrek documentation pipeline aligns with SLSA Build Levels 1 to 3:

| SLSA Level | Requirement | NovaTrek Implementation |
| --- | --- | --- |
| Build L1 | Fully scripted builds with provenance metadata | Entire build defined declaratively in GitHub Actions YAML |
| Build L2 | Hosted platform with cryptographically signed provenance | Deployments run exclusively on GitHub-hosted runners; artifacts tied to source commits |
| Build L3 | Hardened, ephemeral build environments | Each build spins up a clean, isolated runner that executes the MkDocs build, deploys, and destroys the environment |

Source: SLSA Framework specification and JFrog SLSA analysis.


OWASP CI/CD Risk Mitigation

The OWASP Top 10 CI/CD Security Risks identifies key pipeline threat categories. The docs-as-code model addresses the most critical risks:

| OWASP Risk | Risk Description | Docs-as-Code Mitigation |
| --- | --- | --- |
| CICD-SEC-1 | Insufficient Flow Control Mechanisms | Branch protection rules, required PR approvals, automated status checks |
| CICD-SEC-3 | Dependency Chain Abuse | Snyk SCA blocks malicious packages; Dependabot automates updates |
| CICD-SEC-4 | Poisoned Pipeline Execution (PPE) | Ephemeral runners, SLSA Level 2 provenance, immutable build environments |
| CICD-SEC-6 | Insufficient Credential Hygiene | Workload Identity Federation (OIDC) eliminates long-lived deployment credentials |

Source: OWASP CI/CD Security Cheat Sheet.


Snyk Integration

Snyk provides three distinct scanning capabilities, each deployed as a CI gate in the documentation pipeline:

Snyk Dependency Scan (snyk test)

What it checks: All Python packages in requirements-docs.txt against the Snyk vulnerability database.

Why it matters: MkDocs, pymdownx, and other build-time dependencies may contain vulnerabilities. Even though these packages only run at build time (not in production), a compromised build dependency could inject malicious content into the generated HTML.

Implementation:

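- name: Snyk dependency scan
  # @master is a moving ref; pin to a commit SHA in a real pipeline
  uses: snyk/actions/python-3.12@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    # --package-manager=pip tells Snyk how to read a non-default manifest name
    args: --severity-threshold=high --file=requirements-docs.txt --package-manager=pip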

Blocks merge on failure: Yes (HIGH or CRITICAL severity).

Snyk Code Analysis (snyk code test)

What it checks: Static analysis of Python generator scripts in portal/scripts/ for security issues including:

  • Path traversal vulnerabilities (generators process file paths from YAML input)
  • Unsafe deserialization (generators parse YAML metadata)
  • Injection risks (generators produce HTML output)
  • Hardcoded secrets or credentials

Why it matters: The generator scripts are the boundary between untrusted input (YAML metadata, OpenAPI specs) and trusted output (published HTML). Security flaws in generators could allow a crafted YAML file to produce malicious portal content.

Blocks merge on failure: Yes.
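
Two of the issue classes listed above, path traversal and HTML injection, have small standard-library mitigations that a generator could apply. The sketch below is illustrative only; the function names resolve_inside and render_heading are hypothetical and do not come from portal/scripts/.

```python
import html
from pathlib import Path


def resolve_inside(base_dir: Path, untrusted: str) -> Path:
    """Resolve a path taken from YAML input, refusing anything outside base_dir."""
    base = base_dir.resolve()
    candidate = (base / untrusted).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes {base}: {untrusted!r}")
    return candidate


def render_heading(title: str) -> str:
    """Escape untrusted text before embedding it in generated HTML."""
    return f"<h2>{html.escape(title)}</h2>"
```

Resolving both paths before comparing them means that ../ segments, absolute paths, and symlinks are all normalized before the containment check runs.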

Snyk Infrastructure-as-Code Scan (snyk iac test)

What it checks: Infrastructure and configuration files for security misconfigurations:

  • staticwebapp.config.json — overly permissive CSP, missing security headers
  • infra/*.bicep — Azure resource misconfigurations
  • .github/workflows/*.yml — overly broad workflow permissions, missing pinned action versions

Why it matters: A misconfigured staticwebapp.config.json could silently remove all security headers from the production site. Snyk IaC catches these misconfigurations before they are deployed.

Implementation:

- name: Snyk IaC scan
  # @master is a moving ref; pin to a commit SHA in a real pipeline
  uses: snyk/actions/iac@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    args: --severity-threshold=high
    file: portal/staticwebapp.config.json

Blocks merge on failure: Yes (HIGH or CRITICAL severity).

Continuous Monitoring

Beyond CI gates, Snyk's GitHub integration provides continuous monitoring:

  • New vulnerability alerts: If a CVE is published for a dependency that was clean at merge time, Snyk opens an automated PR with the fix
  • License compliance: Snyk can enforce that all dependencies use approved licenses (MIT, Apache-2.0, etc.)
  • Reporting dashboard: Security team gets a single-pane view of all vulnerability findings across the repository

This is a capability that Confluence cannot match — there is no way for an organization to scan Confluence's own dependencies or receive alerts when Confluence's build toolchain has a new vulnerability.


Secret Sprawl: The Scale of the Problem

The 2025 State of Secrets Sprawl report by GitGuardian quantifies the scale of credential exposure in modern software environments:

  • 23.77 million new hardcoded secrets found in public repositories in 2024
  • 25% year-over-year increase in secret exposure
  • 58% of all detected leaks are generic secrets (API keys, passwords, connection strings)

While Confluence relies on authors to avoid pasting secrets (with no automated detection), the docs-as-code pipeline provides two layers of defense:

  1. GitHub Push Protection — operates as a pre-receive hook that rejects commits containing detected secrets before they enter the repository history
  2. GitHub Secret Scanning — continuously monitors for secrets that bypass push protection, scanning for 200+ partner patterns plus custom organization-defined patterns

Adding More Gates

The pipeline is extensible. Additional gates that can be added with minimal effort:

| Gate | Tool | Purpose |
| --- | --- | --- |
| Markdown lint | markdownlint-cli | Enforce consistent formatting and catch common Markdown errors |
| Spell check | cspell | Catch typos and enforce terminology consistency |
| Accessibility check | pa11y-ci | Validate generated HTML meets WCAG guidelines |
| Link rot detection | lychee | Check external links still resolve (scheduled, not blocking) |
| Content policy check | Custom script | Enforce organization-specific content policies (e.g., no PII, no internal codenames) |

Each gate is a step in the GitHub Actions workflow — a YAML file that is itself version-controlled, reviewed, and auditable.
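
As a rough illustration of the custom content policy check, a gate can be any script that returns a non-zero exit code when it finds a violation. In the sketch below the banned terms, the email pattern, and the function names are all hypothetical placeholders.

```python
import re
from pathlib import Path

BANNED_TERMS = ("Project Falcon", "Project Osprey")  # hypothetical internal codenames
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def policy_violations(path: Path) -> list[str]:
    """Return one message per policy violation found in a single file."""
    found = []
    lines = path.read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        for term in BANNED_TERMS:
            if term.lower() in line.lower():
                found.append(f"{path}:{lineno}: banned term {term!r}")
        if EMAIL_RE.search(line):
            found.append(f"{path}:{lineno}: email address (possible PII)")
    return found


def main(paths: list[str]) -> int:
    """Print all violations and return the exit code the CI step should use."""
    violations = [v for p in paths for v in policy_violations(Path(p))]
    for violation in violations:
        print(violation)
    return 1 if violations else 0
```

A workflow step would run the script over the changed files and pass the result of main(sys.argv[1:]) to sys.exit, so that any violation fails the step and blocks the merge.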