Source / canonical copy: LinhTruong.com. If you forward this HTML, link there so attribution stays with the file.
Notes and SVGs I reuse when someone asks for a single picture of the whole road: foundations, systems, networking, distributed systems, data/ML, security, how we actually ship software, and a tight read on research and career moves. Slanted toward practitioners who present, review designs, or write—not a degree substitute.
Three forces keep showing up in the teams I talk to: AI bundled into products (often LLM-shaped), systems that are distributed by default (edge, cloud, device), and pressure to show your work—types, tests, traces, sometimes proof-ish reasoning. None of that is comfortable if you only live in one band of the stack.
Figure 1 — Five layers; lower layers fund the upper ones—the more bands you can reason about, the less surprised you are in review.
Trend 01
AI-native software
LLMs and agents move from features to infrastructure. Retrieval, evals, and tool use are first-class concerns alongside testing and logging.
Trend 02
Edge + Cloud + Device
WASM, on-device inference, and serverless make "where does code run?" a design decision per request — not per service.
Trend 03
Verified & Observed
Strong types (Rust, TS, Swift 6), property testing, OpenTelemetry, and SBOMs raise the floor for production-grade software.
2 · Core Foundations
Foundations are option value. When you actually remember complexity classes, basic probability, and information theory, architecture arguments get shorter and code reviews get more concrete.
Math
The non-negotiables
Discrete math — sets, relations, induction, combinatorics, graph theory.
Probability & statistics — distributions, Bayes, A/B testing, confidence intervals.
Linear algebra — vectors, matrices, eigenvalues (used everywhere in ML and graphics).
Information theory — entropy, compression, error correction.
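The information-theory bullet is the easiest to make concrete. A minimal sketch (standard Shannon entropy, nothing project-specific assumed):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries 1 bit per flip; a biased coin carries less,
# and that gap is exactly the headroom a lossless compressor exploits.
fair = entropy([0.5, 0.5])    # 1.0 bit
biased = entropy([0.9, 0.1])  # ~0.47 bits
```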
CS Theory
What every developer should be able to argue
Why P vs NP matters for problem framing and approximation.
How Turing machines and the halting problem bound what is computable.
Why a regex is not a parser, and when to reach for a CFG.
What "NP-hard but tractable" means in practice (SAT, ILP, heuristics).
How information-theoretic lower bounds drive lossless compression and hashing.
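The "regex is not a parser" argument fits in a few lines. Balanced parentheses need unbounded counting state, which regular languages lack; a counter (the degenerate case of a pushdown stack) handles it trivially. A small illustrative sketch:

```python
def balanced(s: str) -> bool:
    """Check balanced parentheses with a depth counter - a one-symbol
    stack. A regex can only approximate this up to a fixed nesting
    depth, because regular languages cannot count."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # closed more than we opened
                return False
    return depth == 0

assert balanced("((a)(b))") and not balanced("((a)")
```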
3 · Data Structures & Algorithms
You do not need to memorize every algorithm — you need a map of which structure solves which class of problem, and the cost of each operation. The diagram below organizes the canon by access pattern.
Figure 2 — Pick a structure by access pattern first, then refine by constraints (memory, concurrency, persistence).
The algorithm strategies you must own
Strategy | When to reach for it | Canonical example | Big-O hint
Divide & Conquer | Problem splits into independent subproblems | Merge sort, FFT | O(n log n)
Greedy | Local optimum provably implies global optimum | Huffman, Dijkstra | O(n log n)
Dynamic Programming | Overlapping subproblems + optimal substructure | LCS, knapsack, edit distance | O(n·m) typical
Backtracking / Branch & Bound | Search a space with prunable invariants | SAT, N-queens, scheduling | Exponential worst case
Graph traversal | Reachability, shortest path, cycles | BFS, DFS, A* | O(V+E)
Randomized / Sketches | Massive data, approximate answers OK | Bloom, HLL, MinHash | Sub-linear memory
Approximation | NP-hard but a 2× or (1+ε) bound is fine | Set cover, TSP | Polynomial w/ ratio
Online & streaming | Data arrives once, can't store it all | Reservoir sampling, EWMA | O(1) per item
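The dynamic-programming row is the one most worth internalizing. A minimal sketch of edit distance, using memoized recursion to exploit the overlapping subproblems:

```python
from functools import lru_cache

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: overlapping subproblems + optimal
    substructure give O(len(a) * len(b)) states instead of an
    exponential tree of naive recursive calls."""
    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0:
            return j                       # insert the rest of b
        if j == 0:
            return i                       # delete the rest of a
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(d(i - 1, j) + 1,        # delete
                   d(i, j - 1) + 1,        # insert
                   d(i - 1, j - 1) + cost) # substitute (or match)
    return d(len(a), len(b))
```

`edit_distance("kitten", "sitting")` returns 3, the classic worked example.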
4 · Systems & Architecture
Performance, reliability, and cost all bottom out in physics: how memory hierarchies, kernels, and hardware coordinate. The "latency numbers every programmer should know" table below is a developer's compass.
Figure 3 — The seven-orders-of-magnitude gap between a register and a cross-region call is why where code runs matters as much as what it does.
OS
Process & memory
Virtual memory, paging, page-cache, mmap, copy-on-write. Know your scheduler (CFS, EEVDF on Linux 6.x) and how io_uring changes I/O.
Concurrency
Three models
Threads + locks, async/await (cooperative), and actors/CSP (message passing). Pick one per service boundary, not per file.
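The cooperative model is the least familiar of the three to many reviewers, so a minimal sketch (standard library only; the "requests" are stand-in sleeps):

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # Cooperative model: each await is an explicit yield point, so
    # there is no preemption between them and no lock is needed here.
    await asyncio.sleep(delay)
    return name

async def main() -> list[str]:
    # Three "requests" overlap on one thread; wall time is roughly
    # max(delay), not the sum - the whole point of the model.
    return await asyncio.gather(fetch("a", 0.01), fetch("b", 0.02), fetch("c", 0.01))

results = asyncio.run(main())  # results keep argument order: a, b, c
```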
Hardware
Mechanical sympathy
Cache lines (64B), NUMA, SIMD, branch prediction. A tight loop that fits in L1 can run 100× faster than a naïve one that streams from RAM.
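Traversal order is the cheapest mechanical-sympathy win. The sketch below shows the two access patterns; note that CPython's boxed integers blunt the timing effect, so treat this as an illustration of the pattern you would write in C, Rust, or NumPy, where the gap is dramatic:

```python
N = 500
matrix = [[1] * N for _ in range(N)]

def sum_row_major(m):
    # Walks memory in the order rows are laid out: sequential,
    # prefetcher-friendly access in a flat-array language.
    return sum(x for row in m for x in row)

def sum_col_major(m):
    # Strides across rows on every step: in a flat-array layout each
    # access lands on a different cache line.
    return sum(m[i][j] for j in range(N) for i in range(N))

# Same answer either way; only the memory access pattern differs.
assert sum_row_major(matrix) == sum_col_major(matrix) == N * N
```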
5 · Networking & the Web
The web runs on a small number of layered abstractions. Understanding them lets you debug "why is my page slow?" with first principles instead of guesses.
Figure 4 — HTTP/3 over QUIC is now the default on most CDNs. Post-quantum hybrid key exchange is rolling out in TLS 1.3.
6 · Distributed Systems
Distributed systems are the discipline of partial failure. The CAP and PACELC framings, plus consensus, are the conceptual core.
Figure 5 — PACELC extends CAP with the steady-state trade-off between latency and consistency.
Consensus
Paxos, Raft, Viewstamped Replication
All solve the same problem: agree on a value despite failures. Raft is the de-facto teaching algorithm; Multi-Paxos and Viewstamped Replication power the largest systems. Modern variants (EPaxos, Flexible Paxos) trade latency for failure-domain flexibility.
Replication
Patterns
Leader/follower (Postgres, MySQL).
Multi-leader (CRDTs, geo-active-active).
Leaderless / quorum (Dynamo-style).
Chain replication (CRAQ for read scaling).
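The leaderless/quorum pattern reduces to one arithmetic rule worth being able to derive on a whiteboard. A minimal sketch:

```python
def quorum_ok(n: int, w: int, r: int) -> bool:
    """Dynamo-style quorum rule: any read quorum must intersect any
    write quorum (w + r > n), so every read touches at least one
    replica holding the latest acknowledged write."""
    return w + r > n and 0 < w <= n and 0 < r <= n

# n=3: w=2, r=2 guarantees intersection; w=1, r=1 can miss the
# latest write entirely, trading consistency for latency.
assert quorum_ok(3, 2, 2)
assert not quorum_ok(3, 1, 1)
```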
7 · Databases & Data Engineering
The database is usually your most expensive and least reversible decision. Choose by access pattern + consistency need + ops cost, in that order.
Figure 6 — Default to Postgres until a workload forces you out. Mix engines at the storage layer, not at the API.
Data engineering pipeline shape
source → ingest → store (lake/warehouse) → transform (dbt/Spark) → serve
store → catalog (Iceberg/Delta/Hudi) → lineage / quality
transform → metrics + features → ML training / RAG
8 · AI / ML / LLMs
In 2026 a software developer is expected to integrate AI competently even if they don't train models. The diagram below distinguishes the four common operating modes.
Figure 7 — Start with prompting, add RAG when grounded knowledge is needed, add tools when actions are needed, fine-tune last.
Evaluation is the moat
The most common failure mode in AI features is "looks good in the demo, regresses in prod." Build evals before you ship anything.
Golden datasets — 50–500 hand-curated examples with expected outputs.
Cost & latency budgets — p50/p95/p99 with regression alerts in CI.
A research paper or production launch in 2026 without an eval section is incomplete. The eval is the experiment.
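The golden-dataset bullet is a loop plus a CI gate. A minimal sketch of the shape; `call_model` is a hypothetical stub standing in for whatever model client you actually use, and the substring check stands in for a real grader:

```python
GOLDEN = [
    {"prompt": "2 + 2 =", "must_contain": "4"},
    {"prompt": "Capital of France?", "must_contain": "Paris"},
]

def call_model(prompt: str) -> str:
    # Hypothetical stub: replace with your real LLM call.
    return {"2 + 2 =": "4", "Capital of France?": "Paris"}[prompt]

def run_evals(cases, threshold: float = 0.95) -> float:
    passed = sum(c["must_contain"] in call_model(c["prompt"]) for c in cases)
    score = passed / len(cases)
    # Fail CI when the pass rate regresses below the budget,
    # exactly like any other test suite.
    assert score >= threshold, f"eval regression: {score:.0%}"
    return score

score = run_evals(GOLDEN)
```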
9 · Security & Cryptography
Threat modeling is cheap; incidents are not. Use the STRIDE mnemonic on every new feature, and pair it with the standard mitigations below.
Figure 8 — Apply STRIDE per data-flow boundary, not per feature.
2026 baseline you should be able to defend
Passkeys (WebAuthn) over passwords. SSO via OIDC.
OWASP Top 10 + OWASP LLM Top 10 (prompt injection, insecure output handling, model DoS).
Secrets in a vault, never in env files committed to git.
Supply chain — SBOMs, signed artifacts (Sigstore), pinned dependencies, lockfiles.
Post-quantum readiness — hybrid KEMs (ML-KEM/Kyber + X25519) where possible.
Zero-trust networking — identity per workload, not per IP.
10 · Software Engineering Practice
The most underrated CS skill is shipping software that other humans can change. The diagram below shows the modern feedback loop a high-functioning team runs.
Figure 9 — The loop closes when production behavior changes the next spec. Teams without that closure stall.
Code quality
The big six
Types & static analysis
Unit + property + integration tests
Code review with checklists
Linters & formatters in CI
Dependency upgrade bots
Mutation testing for hot paths
Workflow
Trunk-based delivery
Short-lived branches
Feature flags for unfinished work
Continuous deploy to staging
Canary + blue/green to prod
Rollback in <5 min
Docs
What to write
ADRs (architecture decisions)
RFCs for cross-team work
Runbooks per service
Onboarding 90-day plan
Glossary of domain terms
Testing pyramid (and the modern flip)
Level | Quantity | Speed | Confidence | 2026 note
Unit | Many | ms | Local | Property tests catch what examples miss
Integration | Some | seconds | Module | Use real DBs in containers, not mocks
End-to-end | Few | minutes | System | Playwright/Cypress; run on a deployed env
Eval / behavioral | Always | varies | AI features | The new tier for LLM-powered features
Chaos / load | Periodic | hours | Reliability | Run before launches, not after incidents
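The property-testing note in the Unit row deserves one concrete instance. Hypothesis is the usual library for this; the dependency-free sketch below shows the idea with a seeded random generator and a round-trip property (the encoder/decoder pair is an illustrative example, not from the original):

```python
import random

def rle_encode(s: str) -> list:
    """Run-length encode a string into (char, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs) -> str:
    return "".join(ch * n for ch, n in pairs)

# Property: decode(encode(s)) == s for *any* input, not just the
# handful of examples a unit test would hand-pick.
rng = random.Random(0)
for _ in range(200):
    s = "".join(rng.choice("ab ") for _ in range(rng.randrange(30)))
    assert rle_decode(rle_encode(s)) == s
```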
11 · Cloud, DevOps & Platform Engineering
Cloud is the substrate; platform engineering is how you make it usable. A good internal developer platform (IDP) turns "deploy" into a one-line action and bakes in compliance.
Figure 10 — A working IDP makes the secure, observable, scalable path the easy one.
Infrastructure-as-code stance
Treat infra like product code: review it, test it (with terraform plan diffs and OPA policies), and version it.
One repository of truth; drift is a bug, not a feature.
Cost is a first-class metric — tag resources, set budgets, alert on burn.
12 · Performance & Observability
"Make it work, make it right, make it fast" — in that order. Measure before you optimize. Modern observability is three pillars plus one.
Figure 11 — Pillars are not enough on their own; the value is in correlation through OpenTelemetry trace IDs.
The performance loop
Define the SLI (e.g., p95 request latency).
Measure — instrument, run load tests, capture flame graphs.
Find the bottleneck — Amdahl says fix the biggest slice first.
Change one thing, A/B it, re-measure.
Lock in with a regression test or budget alert.
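Steps 1 and 3 of the loop are both small computations worth having at hand. A minimal sketch (nearest-rank percentile for the SLI, plus Amdahl's bound; the sample latencies are illustrative):

```python
import math

def percentile(samples, q: float) -> float:
    """Nearest-rank percentile: smallest sample with q% of samples
    at or below it."""
    xs = sorted(samples)
    k = max(0, math.ceil(q / 100 * len(xs)) - 1)
    return xs[k]

def amdahl(p: float, s: float) -> float:
    """Amdahl's law: overall speedup when a fraction p of runtime is
    accelerated by factor s. Overall = 1 / ((1 - p) + p / s)."""
    return 1.0 / ((1.0 - p) + p / s)

latencies_ms = [12, 14, 15, 15, 16, 18, 21, 25, 40, 180]
p95 = percentile(latencies_ms, 95)  # dominated by the tail outlier

# Even an infinite speedup of a 40% slice caps overall gain below
# 1.67x - which is why you fix the biggest slice first.
assert amdahl(0.4, 1e9) < 1.67
```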
13 · Research-Paper Methodology for CS
Whether you are writing a conference paper, a tech report, or a launch retrospective, the structure is the same: question → method → evidence → claim. The diagram below is a research project as a pipeline.
Figure 12 — Treat the paper as a pipeline whose artifacts (data, code, figures) are reusable across drafts and venues.
IMRaD structure with CS-specific guidance
Section | Length | Key questions to answer | Common mistakes
Abstract | 150–250 words | What problem, what idea, what evidence, what impact? | Vague claims, no numbers
Introduction | 1–1.5 pages | Why now? What gap? Three-bullet contribution list. |
Recommended talk: "How to write a great research paper" — Simon Peyton Jones.
16 · References & sources
The diagrams here are my synthesis; the list below is where I point people for citable anchors—textbooks, papers, and standards that match §1–§15. It is not exhaustive: it anchors the core claims, not every vendor or tool named in the diagrams.
Note: Use published editions for bibliographies; arXiv is fine for preprints. RFCs and NIST publications are versioned—always confirm you cite the revision your design actually follows.
Algorithms, complexity & discrete foundations
Cormen, Leiserson, Rivest & Stein, Introduction to Algorithms (CLRS). MIT Press—standard reference for Big-O, data structures, and classical algorithms tied to §3. ISBN 978-0262046305.
Knuth, The Art of Computer Programming, Vol. 1 (Fundamental Algorithms). Addison-Wesley—historical rigor for analysis and combinatorial foundations.
Sipser, Introduction to the Theory of Computation. Cengage—automata, computability, complexity classes; connects to “NP-hard” discussions in §3.
Computer systems, OS & architecture
Bryant & O’Hallaron, Computer Systems: A Programmer’s Perspective (CSAPP). Pearson—memory hierarchy, linking, concurrency primitives; backs §4. ISBN 978-0134092669.
Hennessy & Patterson, Computer Architecture: A Quantitative Approach. Morgan Kaufmann—pipeline, caches, and performance modeling vocabulary in §4, §12.
Abelson & Sussman, Structure and Interpretation of Computer Programs (SICP). MIT Press—procedures, state, and metalinguistic abstraction; cited in the appendix and useful for §2 foundations.
Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System.” CACM 1978. Logical clocks—§6. Author PDF
Fischer, Lynch & Paterson, “Impossibility of Distributed Consensus with One Faulty Process.” J. ACM 1985 (FLP). Proves no deterministic protocol can reach consensus in a fully asynchronous system if even one process may crash—the result behind every timeout in §6.
Lamport, “The Part-Time Parliament.” ACM TOCS 1998—the original Paxos paper; see also Lamport’s more accessible Paxos Made Simple. Paxos Made Simple (PDF)
Ongaro & Ousterhout, “In Search of an Understandable Consensus Algorithm (Raft).” USENIX ATC 2014. https://raft.github.io/raft.pdf
Gilbert & Lynch, “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services.” SIGACT 2002—CAP formalization often referenced with §6–§7.
Databases, storage & data-intensive systems
Kleppmann, Designing Data-Intensive Applications (DDIA). O’Reilly—storage engines, replication, stream processing; core for §7. ISBN 978-1449373320.
Gray, “The Transaction Concept: Virtues and Limitations.” VLDB 1981—transaction semantics vocabulary. Author archive PDF