Solution architecture · 2026

The Modern Solution Architect: A Strategic Framework for AI-Native, Cloud-Distributed Enterprise Systems

Linh Truong, MA (Harvard), MBA · Author & source: LinhTruong.com · Linh@Alumni.Harvard.edu

A working map for practicing architects: where AI, multi-cloud, platforms, zero trust, data and events, FinOps, and sustainability show up in real designs—and how to keep decisions tied to shipping systems, not slide decks.

AI-Native Design · Multi-Cloud · Event-Driven · Zero Trust · Platform Engineering · FinOps · GreenOps · TOGAF · C4 · arc42 · DORA · SPACE

01 · Executive Summary

The job is less “draw the system” and more “make sure the system keeps its promises.” You are trading off models, regions, data residency, agent workflows, and platform defaults every week. What follows is an operating model, a reference picture of the stack, decision habits that survive contact with engineering, and KPIs you can actually track (DORA, SPACE, FinOps-style economics—used as shorthand, not as doctrine).

5 strategic pillars · 12 architectural capabilities · 7 decision frameworks · 24 KPIs & fitness functions

02 · Strategic Framework — The Five Pillars

I group the work into five pillars—business alignment, AI-native design, platform delivery, security and trust, FinOps and sustainability. Each has its own principles and a few measurable checks; the point is to reuse the same mental model whether you are on one product team or across a federated org.

Figure 1 content: the Solution Architect sits at the centre of an outcome-driven operating model with five pillars. Pillar 01 Business Alignment (capability mapping, value-stream architecture, north-star metrics, Wardley mapping) · Pillar 02 AI-Native Design (LLM orchestration, agentic workflows, RAG & knowledge graphs, responsible-AI guardrails) · Pillar 03 Platform & Delivery (internal developer platform, golden paths / paved roads, GitOps & progressive delivery, DORA fitness functions) · Pillar 04 Security & Trust (zero trust by default, supply-chain integrity / SLSA, confidential computing, sovereign data residency) · Pillar 05 Sustainability & FinOps (cost-aware design, carbon-aware compute).
Figure 1 — Five strategic pillars of the modern Solution Architect operating model.
P1 · Business Alignment

Map systems to value, not technology

Every solution must trace to a measurable business capability and a strategic north-star metric. Wardley maps surface evolutionary stages; capability heatmaps expose duplication and gaps.

P2 · AI-Native Design

Treat models as first-class components

LLMs, retrieval, vector stores, and agents are now load-bearing primitives. Design for evaluations, drift, hallucination budgets, and human-in-the-loop checkpoints from day one.

P3 · Platform & Delivery

Reduce cognitive load, increase flow

Architects shape the Internal Developer Platform so product teams ship through paved roads. Treat the platform as a product with users, SLOs, and adoption metrics.

P4 · Security & Trust

Assume breach, prove integrity

Zero trust, SLSA-attested supply chains, signed artifacts, and confidential compute are baseline. Architectural decisions encode regulatory posture (DORA-EU, AI Act, NIS2, GDPR).

P5 · Sustainability & FinOps

Design for cost and carbon

Unit economics and gCO₂e per request are now non-functional requirements. Carbon-aware scheduling, right-sizing, and tier-based storage become design inputs.
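As a sketch of how cost and carbon become design inputs rather than afterthoughts, the following Python blends grid carbon intensity and unit cost to pick a deployment region under a latency budget. The region names and all figures are invented for illustration.

```python
# Sketch: carbon- and cost-aware region selection as a design-time input.
# Region names, intensities, prices, and latencies are illustrative.

from dataclasses import dataclass

@dataclass
class Region:
    name: str
    grid_intensity_gco2e_kwh: float   # grid carbon intensity of the region
    cost_per_1k_req_usd: float        # unit cost of serving traffic there
    p95_latency_ms: float             # latency to the user population

def pick_region(regions, latency_budget_ms, carbon_weight=0.5):
    """Choose the cleanest/cheapest region that still meets the latency budget."""
    eligible = [r for r in regions if r.p95_latency_ms <= latency_budget_ms]
    if not eligible:
        raise ValueError("no region meets the latency budget")
    # Normalise each dimension to [0, 1] across eligible regions, then blend.
    max_c = max(r.grid_intensity_gco2e_kwh for r in eligible)
    max_d = max(r.cost_per_1k_req_usd for r in eligible)
    def score(r):
        return (carbon_weight * r.grid_intensity_gco2e_kwh / max_c
                + (1 - carbon_weight) * r.cost_per_1k_req_usd / max_d)
    return min(eligible, key=score)

regions = [
    Region("eu-north", 40.0, 0.90, 120),
    Region("eu-west", 210.0, 0.80, 95),
    Region("us-east", 380.0, 0.70, 260),
]
print(pick_region(regions, latency_budget_ms=150).name)  # → eu-north
```

The same blended score works as a scheduler policy (pick region per batch job) or as a one-off landing-zone decision recorded in an ADR.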

Cross-cutting

Evolutionary by construction

Architecture is a continuous flow, not a milestone. Fitness functions, ADRs, and architectural intent in code keep the system honest as it evolves.

03 · Role Model — What a Solution Architect Owns

Most solution architects work at four altitudes at once—enterprise, domain, solution, and component. Naming the altitude saves you from living only in ivory-tower slides or only in someone else’s pull request.

Altitude 1 · Enterprise: strategy, capability model, investment portfolio, principles, target operating model. Horizon 3–5 years · stakeholders: CIO, CTO, business-unit leaders · artefacts: capability map, principles, ADR-0001.
Altitude 2 · Domain: bounded contexts, domain integration patterns, data products, platform alignment. Horizon 12–24 months · stakeholders: domain leads, product managers · artefacts: context maps, event-storming output, data contracts.
Altitude 3 · Solution: end-to-end solution design, NFRs, integration, deployment topology, risk register. Horizon 3–9 months · stakeholders: product team, security, SRE · artefacts: C4 L1–L3, threat model, sequence diagrams.
Altitude 4 · Component: service contracts, schema design, code-level patterns, fitness functions in pipelines. Horizon weeks–sprints · stakeholders: engineers · artefacts: OpenAPI, AsyncAPI, ADRs, executable architecture tests.
Figure 2 — Four altitudes the Solution Architect operates across. Each carries its own horizon, stakeholders, and artefacts.

Core deliverables by altitude

Altitude | Primary decisions | Key artefacts | Cadence
Enterprise | Build vs. buy vs. compose; sovereign vs. hyperscaler; reference standards | Capability map · principles · ADR registry · investment roadmap | Quarterly
Domain | Bounded-context boundaries; integration style; data ownership | Context map · event storm · data product catalogue · API standards | Per initiative
Solution | Topology · resilience strategy · cost envelope · threat model | C4 diagrams · NFR matrix · runbook · risk register | Per epic
Component | API contracts · schemas · concurrency · failure modes | OpenAPI / AsyncAPI · ADRs · architectural unit tests | Per sprint

04 · Reference Target Architecture (2026)

Below is a vendor-neutral reference stack: horizontal planes (experience through foundation) with security, observability, FinOps, and governance called out as cross-cutting. Adapt labels to your estate; the layering is the takeaway.

Experience: web / PWA / mobile (SSR, edge rendering) · conversational UX (copilots, agents, voice) · partner / B2B APIs (OAuth, mTLS, webhooks) · embedded / IoT (MQTT, edge inference) · CDN / WAF / edge (bot management, rate limiting)
Application: API gateway & BFF (GraphQL federation, REST, gRPC, WebSocket) · domain microservices (DDD bounded contexts, polyglot, stateless) · workflow orchestration (Temporal, Step Functions, sagas, long-running) · serverless / functions (event-driven compute, scale-to-zero)
Intelligence: LLM gateway (model routing, caching, prompt registry, PII redaction) · agent orchestration (tool use, MCP servers, planning, reflection, human-in-the-loop) · RAG & knowledge (vector + lexical + graph retrieval, freshness SLO) · ML platform (feature store, training, serving, model registry) · eval & guardrails (offline + online evals, policy enforcement, red-team harness)
Data & events: event backbone (Kafka, Pulsar, schema registry, CloudEvents) · operational stores (SQL, NoSQL, graph, vector, time-series, CDC outbox) · lakehouse (Iceberg, Delta; bronze / silver / gold; streaming SQL) · data mesh (data products, contracts, SLOs, federated governance) · MDM (customer 360, lineage, catalog)
Platform: internal developer platform (Backstage, golden paths, self-service templates) · CI/CD & GitOps (progressive delivery, SLSA-attested builds) · Kubernetes + mesh (multi-cluster, mTLS, eBPF networking) · observability (OTel traces, metrics, logs; SLOs & error budgets)
Foundation: hyperscaler region A (AWS, Azure, GCP) · hyperscaler region B (DR, active-active) · sovereign / private (on-prem, sovereign cloud) · edge POPs (low-latency inference, 5G MEC)
Cross-cutting: zero-trust security · observability (OTel) · FinOps · governance · responsible AI · sovereignty
Figure 3 — Vendor-neutral reference architecture organised into six capability planes with cross-cutting concerns.

05 · Decision Frameworks & Architectural Methods

Most of your leverage is in explicit decisions—what you chose, what you ruled out, and what would make you change your mind. The toolkit below mixes lightweight habits (ADRs, C4) with heavier enterprise methods where they earn their keep.

Decision Records

ADR (Architecture Decision Records)

Markdown-versioned, in-repo records of every significant decision: context, options, decision, consequences. Linked from C4 diagrams.
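A minimal ADR in the Nygard format might look like the following; the number, decision, and consequences are hypothetical placeholders:

```markdown
# ADR-0012: Route all model traffic through a central LLM gateway

## Status
Accepted (supersedes ADR-0007)

## Context
Product teams call model vendors directly. There is no shared rate
limiting, PII redaction, prompt caching, or per-team cost attribution.

## Decision
All model access goes through a platform-owned LLM gateway that routes
by capability, cost, and latency.

## Consequences
+ One place for caching, evals, redaction, and unit-economics reporting.
+ Application code is decoupled from model vendors.
- The gateway becomes a critical dependency and needs its own SLO.
```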

Visualisation

C4 Model + arc42

C4 for layered diagrams (Context → Container → Component → Code); arc42 as a structured documentation template. Diagrams-as-code via Structurizr / Mermaid.
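As an illustration of diagrams-as-code, here is a minimal C4 context diagram using Mermaid's C4 syntax; the system names are placeholders:

```mermaid
C4Context
    title Payments platform — system context (C4 Level 1)
    Person(customer, "Customer", "Pays via web or mobile")
    System(payments, "Payments Platform", "Orchestrates charges and refunds")
    System_Ext(psp, "Payment Service Provider", "External card processing")
    Rel(customer, payments, "Initiates payment", "HTTPS")
    Rel(payments, psp, "Authorises and captures", "REST / webhooks")
```

Because the diagram is text, it lives in the repo next to the ADRs and changes through the same review process as code.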

Strategy mapping

Wardley Mapping

Position components on a value-chain × evolution axis to surface what to build, buy, outsource, or retire. Drives platform vs. product investment.

Trade-off analysis

Quality Attribute Workshop (QAW) & ATAM

Elicit and prioritise NFRs; identify sensitivity and trade-off points across quality attributes. Outputs feed the risk register and the fitness functions.

Enterprise

TOGAF ADM & capability mapping

Use TOGAF's phases as a checklist, not a religion. Capability maps remain the most durable enterprise artefact for prioritisation.

Evolutionary

Fitness Functions

Automated tests in pipelines that enforce architectural characteristics — latency, dependency direction, coupling, license, cost-per-request — turning intent into code.
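A dependency-direction rule, for example, can be expressed as an ordinary unit test. The sketch below checks a module import graph against layering rules; the layer names and the graph are illustrative, and in practice you would derive the graph from your source tree (via the AST, or a tool such as import-linter).

```python
# Sketch: an ArchUnit-style fitness function as a plain assertion.
# Layer names and the import graph below are illustrative.

FORBIDDEN = {
    ("domain", "api"),     # domain must not depend on the delivery layer
    ("domain", "infra"),   # domain must not depend on infrastructure
    ("api", "infra"),      # delivery reaches infra only via domain ports
}

def layer_of(module: str) -> str:
    """First path segment names the layer, e.g. 'domain.orders' -> 'domain'."""
    return module.split(".")[0]

def violations(import_graph: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs that break the layering rules."""
    bad = []
    for module, imports in import_graph.items():
        for imported in imports:
            if (layer_of(module), layer_of(imported)) in FORBIDDEN:
                bad.append((module, imported))
    return bad

graph = {
    "api.orders_http": {"domain.orders"},
    "domain.orders": {"domain.pricing"},
    "infra.pg_repo": {"domain.orders"},  # infra may implement domain ports
}
assert violations(graph) == []           # the pipeline fails on any violation
```

Run in CI, this turns the ADR that declared the layering into an executable contract instead of a diagram annotation.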

Architect's decision flow

1 · Frame: problem, stakeholders, drivers, constraints
2 · Discover: domain, data, risks; event storming
3 · Options: at least three alternatives; QAW, spikes
4 · Decide: ADR, trade-offs, reversibility check
5 · Encode: fitness functions, diagrams-as-code
6 · Evolve: inspect, adapt, retire ADRs
A reversibility-aware decision loop: commit to one-way doors only when justified.
Figure 4 — Architect's decision flow. Every step produces an auditable artefact and feeds the next.

06 · Non-Functional Requirements & Fitness Functions

Where you can, turn quality attributes into checks in the pipeline—latency budgets, coupling rules, cost per request, whatever your team will actually run. The table is a starting template per solution, not a standard you must paste wholesale.

Quality attribute | Target (illustrative) | Fitness function / mechanism
Availability | 99.95% rolling 30d | SLO burn-rate alert · synthetic probes · chaos-engineered failover
Latency | p95 < 300 ms end-to-end | OTel-based SLI · perf gate in CI · k6 load profile
Throughput | 5k req/s sustained per region | Load test suite · autoscaling policy · capacity model
Security | Zero criticals; mTLS everywhere | SCA + SAST + DAST gates · SLSA L3 · IaC policy-as-code (OPA)
Privacy | GDPR Art. 5 by design | PII tagging · purpose binding · automated DSAR · differential privacy
AI safety | < 1% harmful output rate | Eval harness · jailbreak suite · content filters · human review on high-risk
Observability | 100% RED metrics · 90% traces | OTel auto-instrumentation · log structure linter
Cost | ≤ $0.0042 per request | Unit-economics dashboard · per-tenant tagging · FinOps budget gates
Sustainability | ≤ 0.8 gCO₂e per request | Carbon-aware scheduler · CCF metrics · region selection policy
Portability | Re-deploy in < 48 h | OCI containers · IaC modules · provider-neutral abstractions
Maintainability | Change-failure rate < 10% | DORA metrics · ArchUnit-style coupling tests · test pyramid
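As one concrete mechanism, a latency gate can run directly in CI against load-test output. This sketch uses the nearest-rank p95; the sample durations are made up for illustration.

```python
# Sketch: a CI latency gate over request durations exported from a
# load-test run. The sample values below are invented.

import math

def p95(samples_ms):
    """Nearest-rank 95th percentile: value at rank ceil(0.95 * n)."""
    ranked = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ranked))
    return ranked[rank - 1]

def latency_gate(samples_ms, budget_ms=300.0):
    """Exit non-zero (failing the pipeline) if p95 exceeds the budget."""
    observed = p95(samples_ms)
    if observed > budget_ms:
        raise SystemExit(f"p95 {observed:.0f} ms exceeds budget {budget_ms:.0f} ms")
    return observed

samples = [120, 140, 150, 160, 180, 200, 210, 220, 250, 280,
           95, 110, 130, 170, 190, 205, 230, 240, 260, 310]
print(latency_gate(samples))  # → 280
```

The same shape works for the cost and carbon rows: replace the percentile with cost-per-request or gCO₂e-per-request from your telemetry export.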

07 · AI-Native & Agentic Architecture Patterns

If models sit on the critical path, they get the same treatment as payments or auth: SLOs, evals, cost envelopes, and kill switches—not a sidebar “AI workstream.”

Pattern

LLM Gateway with model routing

Centralised gateway routes by capability/cost/latency, applies rate limits, PII redaction, prompt caching, and structured logging. Decouples app code from model vendor; enables A/B and shadow evaluation.
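The routing core of such a gateway can be small. The sketch below picks the cheapest model that covers the requested capabilities within a latency budget; the model names, prices, and latencies are invented placeholders, not vendor data.

```python
# Sketch: capability/cost-aware model routing inside an LLM gateway.
# Catalogue entries are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    capabilities: set        # e.g. {"chat", "tool-use", "reasoning"}
    usd_per_1k_tokens: float
    p95_latency_ms: float

CATALOGUE = [
    Model("small-distilled", {"chat"}, 0.0002, 150),
    Model("mid-tier", {"chat", "tool-use"}, 0.002, 400),
    Model("frontier", {"chat", "tool-use", "reasoning"}, 0.015, 1200),
]

def route(required: set, latency_budget_ms: float) -> Model:
    """Cheapest model covering the capabilities within the latency budget."""
    candidates = [m for m in CATALOGUE
                  if required <= m.capabilities
                  and m.p95_latency_ms <= latency_budget_ms]
    if not candidates:
        raise LookupError("no model satisfies the request")
    return min(candidates, key=lambda m: m.usd_per_1k_tokens)

print(route({"chat"}, 500).name)  # → small-distilled
```

Because callers declare capabilities instead of model names, A/B tests and vendor swaps become catalogue edits rather than application changes.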

Pattern

Retrieval-Augmented Generation (RAG 2.0)

Hybrid retrieval (BM25 + dense + graph), reranking, query rewriting, freshness SLOs, and groundedness evaluation. Knowledge contracts ensure source-of-truth alignment.
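One common fusion step for hybrid retrieval is reciprocal rank fusion (RRF), which needs only the rank positions from each retriever, no score normalisation. A sketch with illustrative document IDs:

```python
# Sketch: reciprocal rank fusion merging lexical (BM25) and dense
# result lists. Document IDs are illustrative.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists; k=60 is the constant from the original RRF paper."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc-a", "doc-b", "doc-c"]
dense = ["doc-a", "doc-c", "doc-d"]
print(rrf([bm25, dense])[:2])  # → ['doc-a', 'doc-c']
```

A reranker and groundedness eval then sit downstream of the fused list, which is where the freshness SLO and knowledge contracts get enforced.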

Pattern

Agentic workflows with MCP

Tool-using agents over Model Context Protocol servers, with explicit planning, budgets, sandboxes, and trace-replayable runs. Human-in-the-loop on irreversible actions.
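Budget and approval guards around tool calls can be enforced outside the model entirely. The sketch below is a hypothetical harness: the tool names, costs, and approval callback are placeholders for a real HITL integration.

```python
# Sketch: budget + human-in-the-loop guards around agent tool calls.
# Tool names, costs, and the approve() hook are hypothetical.

IRREVERSIBLE = {"send_email", "issue_refund", "delete_record"}

class BudgetExceeded(Exception):
    pass

class AgentRun:
    def __init__(self, max_tool_calls=10, max_cost_usd=0.50,
                 approve=lambda call: False):
        self.calls, self.cost = 0, 0.0
        self.max_tool_calls = max_tool_calls
        self.max_cost_usd = max_cost_usd
        self.approve = approve   # human-in-the-loop callback
        self.trace = []          # replayable run log

    def invoke(self, tool: str, cost_usd: float, **args):
        if (self.calls + 1 > self.max_tool_calls
                or self.cost + cost_usd > self.max_cost_usd):
            raise BudgetExceeded(f"refusing {tool}: budget exhausted")
        if tool in IRREVERSIBLE and not self.approve({"tool": tool, "args": args}):
            raise PermissionError(f"{tool} requires human approval")
        self.calls += 1
        self.cost += cost_usd
        self.trace.append((tool, args))
        return f"{tool} executed"  # stand-in for real tool dispatch

run = AgentRun(approve=lambda call: call["tool"] == "send_email")
print(run.invoke("search_kb", 0.01, query="refund policy"))
print(run.invoke("send_email", 0.02, to="customer@example.com"))
```

The trace list is what makes runs replayable for audits; the same structure can feed the eval harness described below.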

Pattern

Evaluation-driven development

Golden datasets, LLM-as-judge with calibration, regression suites in CI, and online quality SLOs. No model change ships without eval delta.
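The "no model change ships without eval delta" rule reduces to a small gate in CI. This sketch compares mean groundedness on a golden dataset against an SLO and a regression tolerance; all scores are invented.

```python
# Sketch: an eval regression gate for CI. Per-example groundedness
# scores from a golden dataset; the numbers below are invented.

def eval_gate(baseline, candidate, min_mean=0.85, max_regression=0.02):
    """Fail the pipeline if the candidate misses the SLO or regresses."""
    base_mean = sum(baseline) / len(baseline)
    cand_mean = sum(candidate) / len(candidate)
    delta = cand_mean - base_mean
    if cand_mean < min_mean:
        raise SystemExit(f"candidate mean {cand_mean:.3f} below SLO {min_mean}")
    if delta < -max_regression:
        raise SystemExit(f"regression {delta:+.3f} beyond tolerance")
    return {"baseline": base_mean, "candidate": cand_mean, "delta": delta}

report = eval_gate(baseline=[0.88, 0.91, 0.84, 0.90],
                   candidate=[0.90, 0.89, 0.86, 0.91])
print(f"eval delta {report['delta']:+.3f}")
```

A real gate would add per-slice breakdowns and judge calibration, but the shipping decision stays the same: a number, a threshold, and a failed build.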

Pattern

Responsible-AI guardrails

Input/output classifiers, policy engines, content provenance (C2PA), and audit logs aligned to the EU AI Act risk tiers and ISO/IEC 42001.

Pattern

Small + specialised models

Use frontier models for reasoning; distil or fine-tune small models for hot paths. Quantise for edge. Result: lower latency, lower cost, lower carbon.

08 · Security, Trust & Regulatory Posture

EU AI Act, DORA (where it applies), NIS2, GDPR, and sovereign-cloud expectations show up in landing zones, data flows, and control design—waiting for a compliance review to “sign off” the architecture is how you paint yourself into a corner.

Architectural rule: regulatory obligations are encoded as policy-as-code at the platform layer, so product teams inherit compliance by default rather than re-implementing it per service.
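In production this is typically OPA/Rego or Cedar at the admission layer; as a language-neutral sketch, here is the same idea in Python, with illustrative region lists and manifest fields.

```python
# Sketch: a platform admission policy standing in for OPA/Rego or
# Cedar. Region lists and manifest fields are illustrative.

EU_REGIONS = {"eu-west-1", "eu-central-1", "eu-north-1"}

def admit(manifest: dict) -> list[str]:
    """Return policy violations; an empty list means the deploy is admitted."""
    errors = []
    if (manifest.get("data_classification") == "personal"
            and manifest.get("region") not in EU_REGIONS):
        errors.append("personal data must stay in EU regions (GDPR residency)")
    if not manifest.get("encryption_at_rest", False):
        errors.append("encryption at rest is mandatory")
    if manifest.get("slsa_level", 0) < 3:
        errors.append("build provenance below SLSA L3")
    return errors

ok = {"region": "eu-west-1", "data_classification": "personal",
      "encryption_at_rest": True, "slsa_level": 3}
assert admit(ok) == []
print(admit({"region": "us-east-1", "data_classification": "personal"}))
```

Because the policy runs in the platform's admission path, a product team inherits the residency, encryption, and provenance rules without writing any of them.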

Zero Trust

Identity-aware proxies, mTLS service mesh, per-request authorisation (OPA/Cedar), short-lived workload identities (SPIFFE).

Supply Chain

SLSA L3+, signed artefacts (Sigstore/Cosign), SBOM in every release, reproducible builds, dependency-pinning policy.

Confidential Compute

TEE-backed enclaves for sensitive workloads (TDX/SEV-SNP), attested key release, encrypted memory and bus.

Data Sovereignty

Region pinning, sovereign-cloud landing zones, customer-managed keys (HYOK/BYOK), data-residency policies in IaC.

Resilience

DORA-aligned recovery objectives, chaos & failover drills, third-party concentration risk register, exit plans.

Post-Quantum Readiness

Crypto-agility, hybrid TLS (X25519 + ML-KEM), inventory of long-lived secrets and certificates.

09 · 12-Month Adoption Roadmap

Ship this as a quarterly plan: theme, a few measurable outcomes, and clear “we are done” lines. Architecture should be co-owned with engineering and product—you are not running a parallel PMO.

Q1 · Foundations: principles, ADR registry, capability map, NFR baseline, observability uplift (OTel)
Q2 · Platform: IDP MVP, golden paths, SLSA-attested pipelines, mesh + zero-trust rollout
Q3 · AI-native: LLM gateway, eval harness, first agentic workflow, RAG knowledge contracts
Q4 · Scale & optimise: FinOps unit economics, carbon-aware scheduling, DR & sovereign landing zones
N+1 · Evolve: federated architecture guild, continuous fitness functions
Figure 5 — Quarterly adoption roadmap. Each quarter delivers usable platform capabilities, not slideware.

10 · Measuring Architectural Success

If you cannot point to numbers, you will lose arguments to urgency. Mix delivery, reliability, developer experience, and unit economics—the exact targets below are examples; swap in what your org already reports.

Category | Metric | Target / direction
Delivery (DORA) | Deployment frequency | Multiple per day
Delivery (DORA) | Lead time for change | < 1 day
Delivery (DORA) | Change failure rate | < 10%
Delivery (DORA) | MTTR | < 1 hour
Developer experience (SPACE) | Time to first commit | < 1 day
Developer experience (SPACE) | Cognitive-load score | ↓ each quarter
Developer experience (SPACE) | Golden-path adoption | > 80%
Reliability | SLO attainment | ≥ 99.9% of SLOs green
Reliability | Error-budget burn | Within policy
Economics | Cost per business transaction | ↓ trending
Economics | Carbon per transaction | ↓ trending
AI quality | Groundedness score | ≥ 0.85
AI quality | Harmful output rate | < 1%

11 · Anti-Patterns to Avoid

Anti-pattern

Ivory-tower architecture

PDFs and diagrams disconnected from running code. Counter: diagrams-as-code, ADRs in the repo, fitness functions in CI.

Anti-pattern

Big design up front

Locking decisions before learning. Counter: distinguish one-way vs. two-way doors; defer reversible decisions to the last responsible moment.

Anti-pattern

AI as a feature flag

Bolting an LLM onto a UI without evals, guardrails, or cost controls. Counter: treat AI as a critical-path subsystem with its own SLOs.

Anti-pattern

Platform without users

Building an IDP nobody adopts. Counter: platform-as-product with PMs, SLOs, adoption metrics, and feedback loops.

Anti-pattern

Cloud lift-and-shift theatre

Re-hosting VMs with no architectural change. Counter: re-platform around managed services, events, and serverless where appropriate.

Anti-pattern

Single-vendor lock-in by default

Counter: provider-neutral abstractions for compute, identity, and data plane; measured exception process for managed services.

12 · Research Agenda & Open Questions

The list below is deliberately unfinished—places where I still want sharper methods and better data, not polished conclusions.

  1. Agentic system safety — provable bounds on tool-use blast radius and economic guardrails for autonomous agents.
  2. Continuous architecture — formalising fitness functions as the primary architectural contract instead of static diagrams.
  3. AI-assisted architecting — copilots that propose ADRs, detect drift, and synthesise C4 views from running systems.
  4. Sustainable software — standard methodologies for measuring software carbon intensity (SCI) at the request level.
  5. Post-quantum migration — pragmatic roadmaps for crypto-agility in long-lived enterprise systems.
  6. Sovereign multi-cloud — workload portability under conflicting jurisdictional requirements.

13 · Key Concepts (Glossary)

TOGAF 10 · arc42 · C4 Model · Wardley Mapping · DDD · Team Topologies · DORA · SPACE · SLSA · SPIFFE / SPIRE · OpenTelemetry · CloudEvents · AsyncAPI · OPA · Cedar · Sigstore · EU AI Act · DORA-EU · NIS2 · ISO/IEC 42001 · FinOps Framework · Green Software Foundation · SCI · MCP · C2PA

Shorthand I use in workshops and reviews so the same words mean the same thing across business, engineering, security, and regulators. For citable sources and specifications, see §14 · References.

14 · References

Figures, narrative, and synthesis in this playbook are original work by the author. The list below grounds methods, metrics, regulations, and platform primitives in primary documents—standards, authoritative books, and official specifications. It is not exhaustive; use it as a trailhead for procurement, security review, and audit.

Enterprise architecture, documentation, and decision records

  1. The Open Group. TOGAF Standard, 10th Edition.
    https://www.opengroup.org/togaf
  2. Starke, G., & Hruschka, P. arc42 — Template for architecture communication and documentation.
    https://arc42.github.io/
  3. Brown, S. The C4 model for visualising software architecture.
    https://c4model.com/
  4. Nygard, M. “Documenting Architecture Decisions.” 2011 — ADR pattern; community templates and tooling at ADR GitHub.
    https://adr.github.io/
  5. Wardley, S. Wardley Mapping (Creative Commons resources on mapping value chains and evolution).
    Wardley Mapping (CC edition)

Domain-driven design, teams, and socio-technical modelling

  1. Evans, E. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2004.
  2. Vernon, V. Implementing Domain-Driven Design. Addison-Wesley, 2013.
  3. Brandolini, A. Introducing EventStorming (leanpub / workshop method for collaborative domain modelling).
  4. Skelton, M., & Pais, M. Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution, 2019.

Quality attributes, trade-off analysis, and evolutionary architecture

  1. Software Engineering Institute (CMU). Architecture Tradeoff Analysis Method (ATAM).
    SEI — Architecture analysis methods
  2. Software Engineering Institute (CMU). Quality Attribute Workshop (QAW).
    SEI wiki — Quality Attribute Workshop
  3. Ford, N., Parsons, R., & Kua, P. Building Evolutionary Architectures: Automated Software Governance. O’Reilly, 2017 — fitness functions.

Delivery performance, developer experience, and reliability

  1. Forsgren, N., Humble, J., & Kim, G. Accelerate: The Science of Lean Software and DevOps. IT Revolution, 2018 — DORA metrics.
  2. Microsoft Research. “The SPACE of Developer Productivity.” ACM Queue, 2021 (SPACE framework).
    https://queue.acm.org/detail.cfm?id=3454124
  3. Google. Site Reliability Engineering (book).
    https://sre.google/sre-book/table-of-contents/
  4. Google. The Site Reliability Workbook.
    https://sre.google/workbook/table-of-contents/

Platform engineering, GitOps, and internal developer surfaces

  1. CNCF. Platforms Definition White Paper (cloud-native internal platforms).
    CNCF TAG App Delivery — Platforms white paper
  2. Backstage. Backstage documentation (open source developer portal / IDP).
    https://backstage.io/docs/overview/what-is-backstage
  3. OpenGitOps. GitOps Principles.
    https://opengitops.dev/
  4. FinOps Foundation. FinOps Framework.
    https://www.finops.org/framework/

Observability, events, APIs, and integration contracts

  1. OpenTelemetry Project. OpenTelemetry Specification.
    https://opentelemetry.io/docs/specs/otel/
  2. Cloud Native Computing Foundation. CloudEvents — Specification.
    CloudEvents core spec (GitHub)
  3. AsyncAPI Initiative. AsyncAPI Specification.
    https://www.asyncapi.com/docs/reference/specification/latest
  4. Hohpe, G., & Woolf, B. Enterprise Integration Patterns. Addison-Wesley, 2003.

Security, identity, supply chain, and policy-as-code

  1. NIST. Cybersecurity Framework (CSF) 2.0.
    https://www.nist.gov/cyberframework
  2. Rose, S., et al. NIST SP 800-207 — Zero Trust Architecture.
    NIST SP 800-207 (final)
  3. Linux Foundation / CNCF. SPIFFE — Secure Production Identity Framework for Everyone.
    https://spiffe.io/docs/latest/spiffe-about/overview/
  4. OpenSSF. SLSA: Supply-chain Levels for Software Artifacts.
    https://slsa.dev/
  5. OpenSSF. Sigstore — documentation (artifact signing and transparency).
    https://docs.sigstore.dev/
  6. Open Policy Agent. OPA Documentation.
    https://www.openpolicyagent.org/docs/latest/
  7. AWS. Cedar policy language — specification.
    https://www.cedarpolicy.com/

Privacy, EU digital regulation, and operational resilience

  1. European Union. General Data Protection Regulation (GDPR), Regulation (EU) 2016/679.
    EUR-Lex — GDPR
  2. European Parliament and Council. Regulation (EU) 2022/2554 — Digital Operational Resilience Act (DORA).
    EUR-Lex — DORA (2022/2554)
  3. European Parliament and Council. Directive (EU) 2022/2555 — NIS2 (measures for high common level of cybersecurity).
    EUR-Lex — NIS2
  4. European Parliament and Council. Regulation (EU) 2024/1689 — Artificial Intelligence Act.
    EUR-Lex — EU AI Act

AI management, risk, provenance, and application security

  1. NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1.
    https://doi.org/10.6028/NIST.AI.100-1
  2. ISO/IEC JTC 1/SC 42. ISO/IEC 42001:2023 — AI management system.
    ISO store — ISO/IEC 42001
  3. Mitchell, M., et al. “Model Cards for Model Reporting.” ACM FAT*, 2019. arXiv:1810.03993.
    https://arxiv.org/abs/1810.03993
  4. Lewis, P., et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” NeurIPS, 2020. arXiv:2005.11401.
    https://arxiv.org/abs/2005.11401
  5. Anthropic et al. Model Context Protocol (MCP).
    https://modelcontextprotocol.io/
  6. C2PA. Content Credentials Specification (C2PA technical specification).
    spec.c2pa.org — C2PA Specification
  7. OWASP Foundation. OWASP Top 10 for Large Language Model Applications.
    OWASP LLM Top 10

Sustainability and software carbon accounting

  1. Green Software Foundation. Software Carbon Intensity (SCI) Specification.
    https://sci.greensoftware.foundation/
  2. Green Software Foundation. Carbon Aware SDK / carbon-aware computing (patterns for carbon-aware workloads).
    https://greensoftware.foundation/