Solution architecture · 2026

The Modern Solution Architect: A Strategic Framework for AI-Native, Cloud-Distributed Enterprise Systems

Linh Truong, MA (Harvard), MBA · Author & source: LinhTruong.com · Linh@Alumni.Harvard.edu

A working map for practicing architects: where AI, multi-cloud, platforms, zero trust, data and events, FinOps, and sustainability show up in real designs—and how to keep decisions tied to shipping systems, not slide decks.

AI-Native Design · Multi-Cloud · Event-Driven · Zero Trust · Platform Engineering · FinOps · GreenOps · TOGAF · C4 · arc42 · DORA · SPACE

01 · Executive Summary

The job is less “draw the system” and more “make sure the system keeps its promises.” You are trading off models, regions, data residency, agent workflows, and platform defaults every week. What follows is an operating model, a reference picture of the stack, decision habits that survive contact with engineering, and KPIs you can actually track (DORA, SPACE, FinOps-style economics—used as shorthand, not as doctrine).

5 strategic pillars · 12 architectural capabilities · 7 decision frameworks · 24 KPIs & fitness functions

02 · Strategic Framework — The Five Pillars

I group the work into five pillars—business alignment, AI-native design, platform delivery, security and trust, FinOps and sustainability. Each has its own principles and a few measurable checks; the point is to reuse the same mental model whether you are on one product team or across a federated org.

Figure 1 content: the Solution Architect sits at the centre of an outcome-driven operating model with five pillars. Pillar 01 Business Alignment (capability mapping, value-stream architecture, north-star metrics, Wardley mapping) · Pillar 02 AI-Native Design (LLM orchestration, agentic workflows, RAG & knowledge graphs, responsible-AI guardrails) · Pillar 03 Platform & Delivery (internal developer platform, golden paths / paved roads, GitOps & progressive delivery, DORA fitness functions) · Pillar 04 Security & Trust (zero trust by default, supply-chain integrity / SLSA, confidential computing, sovereign data residency) · Pillar 05 Sustainability & FinOps (cost-aware design, carbon-aware compute).
Figure 1 — Five strategic pillars of the modern Solution Architect operating model.
P1 · Business Alignment

Map systems to value, not technology

Every solution must trace to a measurable business capability and a strategic north-star metric. Wardley maps surface evolutionary stages; capability heatmaps expose duplication and gaps.

P2 · AI-Native Design

Treat models as first-class components

LLMs, retrieval, vector stores, and agents are now load-bearing primitives. Design for evaluations, drift, hallucination budgets, and human-in-the-loop checkpoints from day one.

P3 · Platform & Delivery

Reduce cognitive load, increase flow

Architects shape the Internal Developer Platform so product teams ship through paved roads. Treat the platform as a product with users, SLOs, and adoption metrics.

P4 · Security & Trust

Assume breach, prove integrity

Zero trust, SLSA-attested supply chains, signed artifacts, and confidential compute are baseline. Architectural decisions encode regulatory posture (DORA-EU, AI Act, NIS2, GDPR).

P5 · Sustainability & FinOps

Design for cost and carbon

Unit economics and gCO₂e per request are now non-functional requirements. Carbon-aware scheduling, right-sizing, and tier-based storage become design inputs.
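As a sketch of how cost and carbon become design inputs rather than afterthoughts, the following Python blends grid carbon intensity and unit cost to pick a deployment region under a latency budget. The region names and all figures are invented for illustration.

```python
# Sketch: carbon- and cost-aware region selection as a design-time input.
# Region names, intensities, prices, and latencies are illustrative.

from dataclasses import dataclass

@dataclass
class Region:
    name: str
    grid_intensity_gco2e_kwh: float   # grid carbon intensity of the region
    cost_per_1k_req_usd: float        # unit cost of serving traffic there
    p95_latency_ms: float             # latency to the user population

def pick_region(regions, latency_budget_ms, carbon_weight=0.5):
    """Choose the cleanest/cheapest region that still meets the latency budget."""
    eligible = [r for r in regions if r.p95_latency_ms <= latency_budget_ms]
    if not eligible:
        raise ValueError("no region meets the latency budget")
    # Normalise each dimension to [0, 1] across eligible regions, then blend.
    max_c = max(r.grid_intensity_gco2e_kwh for r in eligible)
    max_d = max(r.cost_per_1k_req_usd for r in eligible)
    def score(r):
        return (carbon_weight * r.grid_intensity_gco2e_kwh / max_c
                + (1 - carbon_weight) * r.cost_per_1k_req_usd / max_d)
    return min(eligible, key=score)

regions = [
    Region("eu-north", 40.0, 0.90, 120),
    Region("eu-west", 210.0, 0.80, 95),
    Region("us-east", 380.0, 0.70, 260),
]
print(pick_region(regions, latency_budget_ms=150).name)  # → eu-north
```

The same blended score works as a scheduler policy (pick region per batch job) or as a one-off landing-zone decision recorded in an ADR.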

Cross-cutting

Evolutionary by construction

Architecture is a continuous flow, not a milestone. Fitness functions, ADRs, and architectural intent in code keep the system honest as it evolves.

03 · Role Model — What a Solution Architect Owns

Most solution architects work at four altitudes at once—enterprise, domain, solution, and component. Naming the altitude saves you from living only in ivory-tower slides or only in someone else’s pull request.

Altitude 1 · Enterprise: strategy, capability model, investment portfolio, principles, target operating model. Horizon 3–5 years · stakeholders: CIO, CTO, business-unit leaders · artefacts: capability map, principles, ADR-0001.
Altitude 2 · Domain: bounded contexts, domain integration patterns, data products, platform alignment. Horizon 12–24 months · stakeholders: domain leads, product managers · artefacts: context maps, event-storming output, data contracts.
Altitude 3 · Solution: end-to-end solution design, NFRs, integration, deployment topology, risk register. Horizon 3–9 months · stakeholders: product team, security, SRE · artefacts: C4 L1–L3, threat model, sequence diagrams.
Altitude 4 · Component: service contracts, schema design, code-level patterns, fitness functions in pipelines. Horizon weeks–sprints · stakeholders: engineers · artefacts: OpenAPI, AsyncAPI, ADRs, executable architecture tests.
Figure 2 — Four altitudes the Solution Architect operates across. Each carries its own horizon, stakeholders, and artefacts.

Core deliverables by altitude

Altitude | Primary decisions | Key artefacts | Cadence
Enterprise | Build vs. buy vs. compose; sovereign vs. hyperscaler; reference standards | Capability map · principles · ADR registry · investment roadmap | Quarterly
Domain | Bounded-context boundaries; integration style; data ownership | Context map · event storm · data product catalogue · API standards | Per initiative
Solution | Topology · resilience strategy · cost envelope · threat model | C4 diagrams · NFR matrix · runbook · risk register | Per epic
Component | API contracts · schemas · concurrency · failure modes | OpenAPI / AsyncAPI · ADRs · architectural unit tests | Per sprint

04 · Reference Target Architecture (2026)

Below is a vendor-neutral reference stack: horizontal planes (experience through foundation) with security, observability, FinOps, and governance called out as cross-cutting. Adapt labels to your estate; the layering is the takeaway.

Experience: web / PWA / mobile (SSR, edge rendering) · conversational UX (copilots, agents, voice) · partner / B2B APIs (OAuth, mTLS, webhooks) · embedded / IoT (MQTT, edge inference) · CDN / WAF / edge (bot management, rate limiting)
Application: API gateway & BFF (GraphQL federation, REST, gRPC, WebSocket) · domain microservices (DDD bounded contexts, polyglot, stateless) · workflow orchestration (Temporal, Step Functions, sagas, long-running) · serverless / functions (event-driven compute, scale-to-zero)
Intelligence: LLM gateway (model routing, caching, prompt registry, PII redaction) · agent orchestration (tool use, MCP servers, planning, reflection, human-in-the-loop) · RAG & knowledge (vector + lexical + graph retrieval, freshness SLO) · ML platform (feature store, training, serving, model registry) · eval & guardrails (offline + online evals, policy enforcement, red-team harness)
Data & events: event backbone (Kafka, Pulsar, schema registry, CloudEvents) · operational stores (SQL, NoSQL, graph, vector, time-series, CDC outbox) · lakehouse (Iceberg, Delta; bronze / silver / gold; streaming SQL) · data mesh (data products, contracts, SLOs, federated governance) · MDM (customer 360, lineage, catalog)
Platform: internal developer platform (Backstage, golden paths, self-service templates) · CI/CD & GitOps (progressive delivery, SLSA-attested builds) · Kubernetes + mesh (multi-cluster, mTLS, eBPF networking) · observability (OTel traces, metrics, logs; SLOs & error budgets)
Foundation: hyperscaler region A (AWS, Azure, GCP) · hyperscaler region B (DR, active-active) · sovereign / private (on-prem, sovereign cloud) · edge POPs (low-latency inference, 5G MEC)
Cross-cutting: zero-trust security · observability (OTel) · FinOps · governance · responsible AI · sovereignty
Figure 3 — Vendor-neutral reference architecture organised into six capability planes with cross-cutting concerns.

05 · Decision Frameworks & Architectural Methods

Most of your leverage is in explicit decisions—what you chose, what you ruled out, and what would make you change your mind. The toolkit below mixes lightweight habits (ADRs, C4) with heavier enterprise methods where they earn their keep.

Decision Records

ADR (Architecture Decision Records)

Markdown-versioned, in-repo records of every significant decision: context, options, decision, consequences. Linked from C4 diagrams.
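A minimal ADR in the Nygard format might look like the following; the number, decision, and consequences are hypothetical placeholders:

```markdown
# ADR-0012: Route all model traffic through a central LLM gateway

## Status
Accepted (supersedes ADR-0007)

## Context
Product teams call model vendors directly. There is no shared rate
limiting, PII redaction, prompt caching, or per-team cost attribution.

## Decision
All model access goes through a platform-owned LLM gateway that routes
by capability, cost, and latency.

## Consequences
+ One place for caching, evals, redaction, and unit-economics reporting.
+ Application code is decoupled from model vendors.
- The gateway becomes a critical dependency and needs its own SLO.
```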

Visualisation

C4 Model + arc42

C4 for layered diagrams (Context → Container → Component → Code); arc42 as a structured documentation template. Diagrams-as-code via Structurizr / Mermaid.
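As an illustration of diagrams-as-code, here is a minimal C4 context diagram using Mermaid's C4 syntax; the system names are placeholders:

```mermaid
C4Context
    title Payments platform — system context (C4 Level 1)
    Person(customer, "Customer", "Pays via web or mobile")
    System(payments, "Payments Platform", "Orchestrates charges and refunds")
    System_Ext(psp, "Payment Service Provider", "External card processing")
    Rel(customer, payments, "Initiates payment", "HTTPS")
    Rel(payments, psp, "Authorises and captures", "REST / webhooks")
```

Because the diagram is text, it lives in the repo next to the ADRs and changes through the same review process as code.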

Strategy mapping

Wardley Mapping

Position components on a value-chain × evolution axis to surface what to build, buy, outsource, or retire. Drives platform vs. product investment.

Trade-off analysis

Quality Attribute Workshop (QAW) & ATAM

Elicit and prioritise NFRs; identify sensitivity and trade-off points across quality attributes. Outputs feed the risk register and the fitness functions.

Enterprise

TOGAF ADM & capability mapping

Use TOGAF's phases as a checklist, not a religion. Capability maps remain the most durable enterprise artefact for prioritisation.

Evolutionary

Fitness Functions

Automated tests in pipelines that enforce architectural characteristics — latency, dependency direction, coupling, license, cost-per-request — turning intent into code.
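A dependency-direction rule, for example, can be expressed as an ordinary unit test. The sketch below checks a module import graph against layering rules; the layer names and the graph are illustrative, and in practice you would derive the graph from your source tree (via the AST, or a tool such as import-linter).

```python
# Sketch: an ArchUnit-style fitness function as a plain assertion.
# Layer names and the import graph below are illustrative.

FORBIDDEN = {
    ("domain", "api"),     # domain must not depend on the delivery layer
    ("domain", "infra"),   # domain must not depend on infrastructure
    ("api", "infra"),      # delivery reaches infra only via domain ports
}

def layer_of(module: str) -> str:
    """First path segment names the layer, e.g. 'domain.orders' -> 'domain'."""
    return module.split(".")[0]

def violations(import_graph: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs that break the layering rules."""
    bad = []
    for module, imports in import_graph.items():
        for imported in imports:
            if (layer_of(module), layer_of(imported)) in FORBIDDEN:
                bad.append((module, imported))
    return bad

graph = {
    "api.orders_http": {"domain.orders"},
    "domain.orders": {"domain.pricing"},
    "infra.pg_repo": {"domain.orders"},  # infra may implement domain ports
}
assert violations(graph) == []           # the pipeline fails on any violation
```

Run in CI, this turns the ADR that declared the layering into an executable contract instead of a diagram annotation.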

Architect's decision flow

1 · Frame: problem, stakeholders, drivers, constraints
2 · Discover: domain, data, risks; event storming
3 · Options: at least three alternatives; QAW, spikes
4 · Decide: ADR, trade-offs, reversibility check
5 · Encode: fitness functions, diagrams-as-code
6 · Evolve: inspect, adapt, retire ADRs
A reversibility-aware decision loop: commit to one-way doors only when justified.
Figure 4 — Architect's decision flow. Every step produces an auditable artefact and feeds the next.

06 · Non-Functional Requirements & Fitness Functions

Where you can, turn quality attributes into checks in the pipeline—latency budgets, coupling rules, cost per request, whatever your team will actually run. The table is a starting template per solution, not a standard you must paste wholesale.

Quality attribute | Target (illustrative) | Fitness function / mechanism
Availability | 99.95% rolling 30d | SLO burn-rate alert · synthetic probes · chaos-engineered failover
Latency | p95 < 300 ms end-to-end | OTel-based SLI · perf gate in CI · k6 load profile
Throughput | 5k req/s sustained per region | Load test suite · autoscaling policy · capacity model
Security | Zero criticals; mTLS everywhere | SCA + SAST + DAST gates · SLSA L3 · IaC policy-as-code (OPA)
Privacy | GDPR Art. 5 by design | PII tagging · purpose binding · automated DSAR · differential privacy
AI safety | < 1% harmful output rate | Eval harness · jailbreak suite · content filters · human review on high-risk
Observability | 100% RED metrics · 90% traces | OTel auto-instrumentation · log structure linter
Cost | ≤ $0.0042 per request | Unit-economics dashboard · per-tenant tagging · FinOps budget gates
Sustainability | ≤ 0.8 gCO₂e per request | Carbon-aware scheduler · CCF metrics · region selection policy
Portability | Re-deploy in < 48 h | OCI containers · IaC modules · provider-neutral abstractions
Maintainability | Change-failure rate < 10% | DORA metrics · ArchUnit-style coupling tests · test pyramid
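As one concrete mechanism, a latency gate can run directly in CI against load-test output. This sketch uses the nearest-rank p95; the sample durations are made up for illustration.

```python
# Sketch: a CI latency gate over request durations exported from a
# load-test run. The sample values below are invented.

import math

def p95(samples_ms):
    """Nearest-rank 95th percentile: value at rank ceil(0.95 * n)."""
    ranked = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ranked))
    return ranked[rank - 1]

def latency_gate(samples_ms, budget_ms=300.0):
    """Exit non-zero (failing the pipeline) if p95 exceeds the budget."""
    observed = p95(samples_ms)
    if observed > budget_ms:
        raise SystemExit(f"p95 {observed:.0f} ms exceeds budget {budget_ms:.0f} ms")
    return observed

samples = [120, 140, 150, 160, 180, 200, 210, 220, 250, 280,
           95, 110, 130, 170, 190, 205, 230, 240, 260, 310]
print(latency_gate(samples))  # → 280
```

The same shape works for the cost and carbon rows: replace the percentile with cost-per-request or gCO₂e-per-request from your telemetry export.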

07 · AI-Native & Agentic Architecture Patterns

If models sit on the critical path, they get the same treatment as payments or auth: SLOs, evals, cost envelopes, and kill switches—not a sidebar “AI workstream.”

Pattern

LLM Gateway with model routing

Centralised gateway routes by capability/cost/latency, applies rate limits, PII redaction, prompt caching, and structured logging. Decouples app code from model vendor; enables A/B and shadow evaluation.
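The routing core of such a gateway can be small. The sketch below picks the cheapest model that covers the requested capabilities within a latency budget; the model names, prices, and latencies are invented placeholders, not vendor data.

```python
# Sketch: capability/cost-aware model routing inside an LLM gateway.
# Catalogue entries are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    capabilities: set        # e.g. {"chat", "tool-use", "reasoning"}
    usd_per_1k_tokens: float
    p95_latency_ms: float

CATALOGUE = [
    Model("small-distilled", {"chat"}, 0.0002, 150),
    Model("mid-tier", {"chat", "tool-use"}, 0.002, 400),
    Model("frontier", {"chat", "tool-use", "reasoning"}, 0.015, 1200),
]

def route(required: set, latency_budget_ms: float) -> Model:
    """Cheapest model covering the capabilities within the latency budget."""
    candidates = [m for m in CATALOGUE
                  if required <= m.capabilities
                  and m.p95_latency_ms <= latency_budget_ms]
    if not candidates:
        raise LookupError("no model satisfies the request")
    return min(candidates, key=lambda m: m.usd_per_1k_tokens)

print(route({"chat"}, 500).name)  # → small-distilled
```

Because callers declare capabilities instead of model names, A/B tests and vendor swaps become catalogue edits rather than application changes.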

Pattern

Retrieval-Augmented Generation (RAG 2.0)

Hybrid retrieval (BM25 + dense + graph), reranking, query rewriting, freshness SLOs, and groundedness evaluation. Knowledge contracts ensure source-of-truth alignment.
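One common fusion step for hybrid retrieval is reciprocal rank fusion (RRF), which needs only the rank positions from each retriever, no score normalisation. A sketch with illustrative document IDs:

```python
# Sketch: reciprocal rank fusion merging lexical (BM25) and dense
# result lists. Document IDs are illustrative.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists; k=60 is the constant from the original RRF paper."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc-a", "doc-b", "doc-c"]
dense = ["doc-a", "doc-c", "doc-d"]
print(rrf([bm25, dense])[:2])  # → ['doc-a', 'doc-c']
```

A reranker and groundedness eval then sit downstream of the fused list, which is where the freshness SLO and knowledge contracts get enforced.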

Pattern

Agentic workflows with MCP

Tool-using agents over Model Context Protocol servers, with explicit planning, budgets, sandboxes, and trace-replayable runs. Human-in-the-loop on irreversible actions.
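Budget and approval guards around tool calls can be enforced outside the model entirely. The sketch below is a hypothetical harness: the tool names, costs, and approval callback are placeholders for a real HITL integration.

```python
# Sketch: budget + human-in-the-loop guards around agent tool calls.
# Tool names, costs, and the approve() hook are hypothetical.

IRREVERSIBLE = {"send_email", "issue_refund", "delete_record"}

class BudgetExceeded(Exception):
    pass

class AgentRun:
    def __init__(self, max_tool_calls=10, max_cost_usd=0.50,
                 approve=lambda call: False):
        self.calls, self.cost = 0, 0.0
        self.max_tool_calls = max_tool_calls
        self.max_cost_usd = max_cost_usd
        self.approve = approve   # human-in-the-loop callback
        self.trace = []          # replayable run log

    def invoke(self, tool: str, cost_usd: float, **args):
        if (self.calls + 1 > self.max_tool_calls
                or self.cost + cost_usd > self.max_cost_usd):
            raise BudgetExceeded(f"refusing {tool}: budget exhausted")
        if tool in IRREVERSIBLE and not self.approve({"tool": tool, "args": args}):
            raise PermissionError(f"{tool} requires human approval")
        self.calls += 1
        self.cost += cost_usd
        self.trace.append((tool, args))
        return f"{tool} executed"  # stand-in for real tool dispatch

run = AgentRun(approve=lambda call: call["tool"] == "send_email")
print(run.invoke("search_kb", 0.01, query="refund policy"))
print(run.invoke("send_email", 0.02, to="customer@example.com"))
```

The trace list is what makes runs replayable for audits; the same structure can feed the eval harness described below.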

Pattern

Evaluation-driven development

Golden datasets, LLM-as-judge with calibration, regression suites in CI, and online quality SLOs. No model change ships without eval delta.
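The "no model change ships without eval delta" rule reduces to a small gate in CI. This sketch compares mean groundedness on a golden dataset against an SLO and a regression tolerance; all scores are invented.

```python
# Sketch: an eval regression gate for CI. Per-example groundedness
# scores from a golden dataset; the numbers below are invented.

def eval_gate(baseline, candidate, min_mean=0.85, max_regression=0.02):
    """Fail the pipeline if the candidate misses the SLO or regresses."""
    base_mean = sum(baseline) / len(baseline)
    cand_mean = sum(candidate) / len(candidate)
    delta = cand_mean - base_mean
    if cand_mean < min_mean:
        raise SystemExit(f"candidate mean {cand_mean:.3f} below SLO {min_mean}")
    if delta < -max_regression:
        raise SystemExit(f"regression {delta:+.3f} beyond tolerance")
    return {"baseline": base_mean, "candidate": cand_mean, "delta": delta}

report = eval_gate(baseline=[0.88, 0.91, 0.84, 0.90],
                   candidate=[0.90, 0.89, 0.86, 0.91])
print(f"eval delta {report['delta']:+.3f}")
```

A real gate would add per-slice breakdowns and judge calibration, but the shipping decision stays the same: a number, a threshold, and a failed build.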

Pattern

Responsible-AI guardrails

Input/output classifiers, policy engines, content provenance (C2PA), and audit logs aligned to the EU AI Act risk tiers and ISO/IEC 42001.

Pattern

Small + specialised models

Use frontier models for reasoning; distil or fine-tune small models for hot paths. Quantise for edge. Result: lower latency, lower cost, lower carbon.

08 · Security, Trust & Regulatory Posture

EU AI Act, DORA (where it applies), NIS2, GDPR, and sovereign-cloud expectations show up in landing zones, data flows, and control design—waiting for a compliance review to “sign off” the architecture is how you paint yourself into a corner.

Architectural rule: regulatory obligations are encoded as policy-as-code at the platform layer, so product teams inherit compliance by default rather than re-implementing it per service.
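In production this is typically OPA/Rego or Cedar at the admission layer; as a language-neutral sketch, here is the same idea in Python, with illustrative region lists and manifest fields.

```python
# Sketch: a platform admission policy standing in for OPA/Rego or
# Cedar. Region lists and manifest fields are illustrative.

EU_REGIONS = {"eu-west-1", "eu-central-1", "eu-north-1"}

def admit(manifest: dict) -> list[str]:
    """Return policy violations; an empty list means the deploy is admitted."""
    errors = []
    if (manifest.get("data_classification") == "personal"
            and manifest.get("region") not in EU_REGIONS):
        errors.append("personal data must stay in EU regions (GDPR residency)")
    if not manifest.get("encryption_at_rest", False):
        errors.append("encryption at rest is mandatory")
    if manifest.get("slsa_level", 0) < 3:
        errors.append("build provenance below SLSA L3")
    return errors

ok = {"region": "eu-west-1", "data_classification": "personal",
      "encryption_at_rest": True, "slsa_level": 3}
assert admit(ok) == []
print(admit({"region": "us-east-1", "data_classification": "personal"}))
```

Because the policy runs in the platform's admission path, a product team inherits the residency, encryption, and provenance rules without writing any of them.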

Zero Trust

Identity-aware proxies, mTLS service mesh, per-request authorisation (OPA/Cedar), short-lived workload identities (SPIFFE).

Supply Chain

SLSA L3+, signed artefacts (Sigstore/Cosign), SBOM in every release, reproducible builds, dependency-pinning policy.

Confidential Compute

TEE-backed enclaves for sensitive workloads (TDX/SEV-SNP), attested key release, encrypted memory and bus.

Data Sovereignty

Region pinning, sovereign-cloud landing zones, customer-managed keys (HYOK/BYOK), data-residency policies in IaC.

Resilience

DORA-aligned recovery objectives, chaos & failover drills, third-party concentration risk register, exit plans.

Post-Quantum Readiness

Crypto-agility, hybrid TLS (X25519 + ML-KEM), inventory of long-lived secrets and certificates.

09 · 12-Month Adoption Roadmap

Ship this as a quarterly plan: theme, a few measurable outcomes, and clear “we are done” lines. Architecture should be co-owned with engineering and product—you are not running a parallel PMO.

Q1 · Foundations: principles, ADR registry, capability map, NFR baseline, observability uplift (OTel)
Q2 · Platform: IDP MVP, golden paths, SLSA-attested pipelines, mesh + zero-trust rollout
Q3 · AI-native: LLM gateway, eval harness, first agentic workflow, RAG knowledge contracts
Q4 · Scale & optimise: FinOps unit economics, carbon-aware scheduling, DR & sovereign landing zones
N+1 · Evolve: federated architecture guild, continuous fitness functions
Figure 5 — Quarterly adoption roadmap. Each quarter delivers usable platform capabilities, not slideware.

10 · Measuring Architectural Success

If you cannot point to numbers, you will lose arguments to urgency. Mix delivery, reliability, developer experience, and unit economics—the exact targets below are examples; swap in what your org already reports.

Category | Metric | Target / direction
Delivery (DORA) | Deployment frequency | Multiple per day
Delivery (DORA) | Lead time for change | < 1 day
Delivery (DORA) | Change failure rate | < 10%
Delivery (DORA) | MTTR | < 1 hour
Developer experience (SPACE) | Time to first commit | < 1 day
Developer experience (SPACE) | Cognitive-load score | ↓ each quarter
Developer experience (SPACE) | Golden-path adoption | > 80%
Reliability | SLO attainment | ≥ 99.9% of SLOs green
Reliability | Error-budget burn | Within policy
Economics | Cost per business transaction | ↓ trending
Economics | Carbon per transaction | ↓ trending
AI quality | Groundedness score | ≥ 0.85
AI quality | Harmful output rate | < 1%

11 · Anti-Patterns to Avoid

Anti-pattern

Ivory-tower architecture

PDFs and diagrams disconnected from running code. Counter: diagrams-as-code, ADRs in the repo, fitness functions in CI.

Anti-pattern

Big design up front

Locking decisions before learning. Counter: distinguish one-way vs. two-way doors; defer reversible decisions to the last responsible moment.

Anti-pattern

AI as a feature flag

Bolting an LLM onto a UI without evals, guardrails, or cost controls. Counter: treat AI as a critical-path subsystem with its own SLOs.

Anti-pattern

Platform without users

Building an IDP nobody adopts. Counter: platform-as-product with PMs, SLOs, adoption metrics, and feedback loops.

Anti-pattern

Cloud lift-and-shift theatre

Re-hosting VMs with no architectural change. Counter: re-platform around managed services, events, and serverless where appropriate.

Anti-pattern

Single-vendor lock-in by default

Counter: provider-neutral abstractions for compute, identity, and data plane; measured exception process for managed services.

12 · Research Agenda & Open Questions

The list below is deliberately unfinished—places where I still want sharper methods and better data, not polished conclusions.

  1. Agentic system safety — provable bounds on tool-use blast radius and economic guardrails for autonomous agents.
  2. Continuous architecture — formalising fitness functions as the primary architectural contract instead of static diagrams.
  3. AI-assisted architecting — copilots that propose ADRs, detect drift, and synthesise C4 views from running systems.
  4. Sustainable software — standard methodologies for measuring software carbon intensity (SCI) at the request level.
  5. Post-quantum migration — pragmatic roadmaps for crypto-agility in long-lived enterprise systems.
  6. Sovereign multi-cloud — workload portability under conflicting jurisdictional requirements.

13 · Key Concepts (Glossary)

TOGAF 10 · arc42 · C4 Model · Wardley Mapping · DDD · Team Topologies · DORA · SPACE · SLSA · SPIFFE / SPIRE · OpenTelemetry · CloudEvents · AsyncAPI · OPA · Cedar · Sigstore · EU AI Act · DORA-EU · NIS2 · ISO/IEC 42001 · FinOps Framework · Green Software Foundation · SCI · MCP · C2PA

Shorthand I use in workshops and reviews so the same words mean the same thing across business, engineering, security, and regulators. For citable sources and specifications, see §14 · References.

14 · References

Figures, narrative, and synthesis in this playbook are original work by the author. The list below grounds methods, metrics, regulations, and platform primitives in primary documents—standards, authoritative books, and official specifications. It is not exhaustive; use it as a trailhead for procurement, security review, and audit.

Enterprise architecture, documentation, and decision records

  1. The Open Group. TOGAF Standard, 10th Edition.
    https://www.opengroup.org/togaf
  2. Starke, G., & Hruschka, P. arc42 — Template for architecture communication and documentation.
    https://arc42.github.io/
  3. Brown, S. The C4 model for visualising software architecture.
    https://c4model.com/
  4. Nygard, M. “Documenting Architecture Decisions.” 2011 — ADR pattern; community templates and tooling at ADR GitHub.
    https://adr.github.io/
  5. Wardley, S. Wardley Mapping (Creative Commons resources on mapping value chains and evolution).
    Wardley Mapping (CC edition)

Domain-driven design, teams, and socio-technical modelling

  1. Evans, E. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2004.
  2. Vernon, V. Implementing Domain-Driven Design. Addison-Wesley, 2013.
  3. Brandolini, A. Introducing EventStorming (leanpub / workshop method for collaborative domain modelling).
  4. Skelton, M., & Pais, M. Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution, 2019.

Quality attributes, trade-off analysis, and evolutionary architecture

  1. Software Engineering Institute (CMU). Architecture Tradeoff Analysis Method (ATAM).
    SEI — Architecture analysis methods
  2. Software Engineering Institute (CMU). Quality Attribute Workshop (QAW).
    SEI wiki — Quality Attribute Workshop
  3. Ford, N., Parsons, R., & Kua, P. Building Evolutionary Architectures: Automated Software Governance. O’Reilly, 2017 — fitness functions.

Delivery performance, developer experience, and reliability

  1. Forsgren, N., Humble, J., & Kim, G. Accelerate: The Science of Lean Software and DevOps. IT Revolution, 2018 — DORA metrics.
  2. Microsoft Research. “The SPACE of Developer Productivity.” ACM Queue, 2021 (SPACE framework).
    https://queue.acm.org/detail.cfm?id=3454124
  3. Google. Site Reliability Engineering (book).
    https://sre.google/sre-book/table-of-contents/
  4. Google. The Site Reliability Workbook.
    https://sre.google/workbook/table-of-contents/

Platform engineering, GitOps, and internal developer surfaces

  1. CNCF. Platforms Definition White Paper (cloud-native internal platforms).
    CNCF TAG App Delivery — Platforms white paper
  2. Backstage. Backstage documentation (open source developer portal / IDP).
    https://backstage.io/docs/overview/what-is-backstage
  3. OpenGitOps. GitOps Principles.
    https://opengitops.dev/
  4. FinOps Foundation. FinOps Framework.
    https://www.finops.org/framework/

Observability, events, APIs, and integration contracts

  1. OpenTelemetry Project. OpenTelemetry Specification.
    https://opentelemetry.io/docs/specs/otel/
  2. Cloud Native Computing Foundation. CloudEvents — Specification.
    CloudEvents core spec (GitHub)
  3. AsyncAPI Initiative. AsyncAPI Specification.
    https://www.asyncapi.com/docs/reference/specification/latest
  4. Hohpe, G., & Woolf, B. Enterprise Integration Patterns. Addison-Wesley, 2003.

Security, identity, supply chain, and policy-as-code

  1. NIST. Cybersecurity Framework (CSF) 2.0.
    https://www.nist.gov/cyberframework
  2. Rose, S., et al. NIST SP 800-207 — Zero Trust Architecture.
    NIST SP 800-207 (final)
  3. Linux Foundation / CNCF. SPIFFE — Secure Production Identity Framework for Everyone.
    https://spiffe.io/docs/latest/spiffe-about/overview/
  4. OpenSSF. SLSA: Supply-chain Levels for Software Artifacts.
    https://slsa.dev/
  5. OpenSSF. Sigstore — documentation (artifact signing and transparency).
    https://docs.sigstore.dev/
  6. Open Policy Agent. OPA Documentation.
    https://www.openpolicyagent.org/docs/latest/
  7. AWS. Cedar policy language — specification.
    https://www.cedarpolicy.com/

Privacy, EU digital regulation, and operational resilience

  1. European Union. General Data Protection Regulation (GDPR), Regulation (EU) 2016/679.
    EUR-Lex — GDPR
  2. European Parliament and Council. Regulation (EU) 2022/2554 — Digital Operational Resilience Act (DORA).
    EUR-Lex — DORA (2022/2554)
  3. European Parliament and Council. Directive (EU) 2022/2555 — NIS2 (measures for high common level of cybersecurity).
    EUR-Lex — NIS2
  4. European Parliament and Council. Regulation (EU) 2024/1689 — Artificial Intelligence Act.
    EUR-Lex — EU AI Act

AI management, risk, provenance, and application security

  1. NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1.
    https://doi.org/10.6028/NIST.AI.100-1
  2. ISO/IEC JTC 1/SC 42. ISO/IEC 42001:2023 — AI management system.
    ISO store — ISO/IEC 42001
  3. Mitchell, M., et al. “Model Cards for Model Reporting.” ACM FAT*, 2019. arXiv:1810.03993.
    https://arxiv.org/abs/1810.03993
  4. Lewis, P., et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” NeurIPS, 2020. arXiv:2005.11401.
    https://arxiv.org/abs/2005.11401
  5. Anthropic et al. Model Context Protocol (MCP).
    https://modelcontextprotocol.io/
  6. C2PA. Content Credentials Specification (C2PA technical specification).
    spec.c2pa.org — C2PA Specification
  7. OWASP Foundation. OWASP Top 10 for Large Language Model Applications.
    OWASP LLM Top 10

Sustainability and software carbon accounting

  1. Green Software Foundation. Software Carbon Intensity (SCI) Specification.
    https://sci.greensoftware.foundation/
  2. Green Software Foundation. Carbon Aware SDK / carbon-aware computing (patterns for carbon-aware workloads).
    https://greensoftware.foundation/