Computer science · field notes · May 2026

Computer Science — how I map the stack for working developers

Source / canonical copy: LinhTruong.com. If you forward this HTML, link there so attribution stays with the file.

Notes and SVGs I reuse when someone asks for a single picture of the whole road: foundations, systems, networking, distributed systems, data/ML, security, how we actually ship software, and a tight read on research and career moves. Slanted toward practitioners who present, review designs, or write—not a degree substitute.

Audience: developers, leads, anyone teaching CS-shaped material · Revised: May 2026 · Format: 14 inline diagrams + appendix + references

1 · The Computer Science Landscape (2026)

Three forces keep showing up in the teams I talk to: AI bundled into products (often LLM-shaped), systems that are distributed by default (edge, cloud, device), and pressure to show your work—types, tests, traces, sometimes proof-ish reasoning. None of that is comfortable if you only live in one band of the stack.

The Modern CS Stack — five layers a developer must navigate
  L1 · Theory & Foundations — discrete math · logic · complexity · automata · information theory · numerical methods
  L2 · Algorithms, Data Structures, Programming Languages — asymptotics · graphs · dynamic programming · type systems · compilers · concurrency models
  L3 · Systems: OS, Networks, Databases, Distributed — memory hierarchy · TCP/QUIC · storage engines · consensus · caching · schedulers
  L4 · Applied AI / ML / LLMs & Data Engineering — training · inference · RAG · vectors · evals · fine-tuning · agents · pipelines
  L5 · Products, Security, Ethics & Human Factors — UX · privacy · threat models · accessibility · policy · sustainability
Figure 1 — Five layers; lower layers fund the upper ones. The more bands you can reason about, the less surprised you are in review.
Trend 01

AI-native software

LLMs and agents move from features to infrastructure. Retrieval, evals, and tool use are first-class concerns alongside testing and logging.

Trend 02

Edge + Cloud + Device

WASM, on-device inference, and serverless make "where does code run?" a design decision per request — not per service.

Trend 03

Verified & Observed

Strong types (Rust, TS, Swift 6), property testing, OpenTelemetry, and SBOMs raise the floor for production-grade software.

2 · Core Foundations

Foundations are option value. When you actually remember complexity classes, basic probability, and information theory, architecture arguments get shorter and code reviews get more concrete.

Math

The non-negotiables

  • Discrete math — sets, relations, induction, combinatorics, graph theory.
  • Probability & statistics — distributions, Bayes, A/B testing, confidence intervals.
  • Linear algebra — vectors, matrices, eigenvalues (used everywhere in ML and graphics).
  • Logic & proofs — propositional/first-order logic, invariants, contracts.
  • Information theory — entropy, compression, error correction.
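Entropy from that last bullet is the formula here I reach for most often; a minimal sketch, assuming a discrete distribution given as probabilities that sum to 1:

  import math

  def shannon_entropy(probs):
      # Entropy in bits of a discrete distribution; zero-probability terms are skipped.
      return -sum(p * math.log2(p) for p in probs if p > 0)

  # A fair coin carries exactly 1 bit; a biased coin carries less.
  print(shannon_entropy([0.5, 0.5]))  # 1.0
  print(shannon_entropy([0.9, 0.1]))  # ~0.47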
CS Theory

What every developer should be able to argue

  • Why P vs NP matters for problem framing and approximation.
  • How Turing machines and the halting problem bound what is computable.
  • Why a regex is not a parser, and when to reach for a CFG.
  • What "NP-hard but tractable" means in practice (SAT, ILP, heuristics).
  • How information-theoretic lower bounds drive lossless compression and hashing.

3 · Data Structures & Algorithms

You do not need to memorize every algorithm — you need a map of which structure solves which class of problem, and the cost of each operation. The diagram below organizes the canon by access pattern.

Data structure map by access pattern
  Sequential — array (O(1) index) · linked list (O(1) insert) · stack / queue · ring buffer / deque · rope (large strings) · skip list
  Associative — hash map (avg O(1)) · B-tree / B+ tree · trie / radix tree · Bloom / cuckoo filter · LSM tree (DBs) · HNSW (vector search)
  Hierarchical — binary search tree · red-black / AVL · heap / priority queue · segment / Fenwick tree · Merkle tree · quad / k-d tree
  Graph & probabilistic — adjacency list / matrix · DAG / topological sort · union-find (DSU) · HyperLogLog (cardinality) · Count-Min sketch · CRDT (replicated)
Figure 2 — Pick a structure by access pattern first, then refine by constraints (memory, concurrency, persistence).

The algorithm strategies you must own

Strategy | When to reach for it | Canonical example | Big-O hint
Divide & conquer | Problem splits into independent subproblems | Merge sort, FFT | O(n log n)
Greedy | Local optimum provably implies global optimum | Huffman, Dijkstra | O(n log n)
Dynamic programming | Overlapping subproblems + optimal substructure | LCS, knapsack, edit distance | O(n·m) typical
Backtracking / branch & bound | Search a space with prunable invariants | SAT, N-queens, scheduling | Exponential worst case
Graph traversal | Reachability, shortest path, cycles | BFS, DFS, A* | O(V+E)
Randomized / sketches | Massive data, approximate answers OK | Bloom, HLL, MinHash | Sub-linear memory
Approximation | NP-hard but a 2× or (1+ε) bound is fine | Set cover, TSP | Polynomial, with a ratio guarantee
Online & streaming | Data arrives once, can't store it all | Reservoir sampling, EWMA | O(1) per item
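To make the dynamic-programming row concrete: a minimal edit-distance sketch (the classic O(n·m) table, rolled into two rows; the function name and layout are mine):

  def edit_distance(a: str, b: str) -> int:
      # prev[j] holds the distance between a[:i-1] and b[:j]; curr is the row for a[:i].
      prev = list(range(len(b) + 1))
      for i, ca in enumerate(a, 1):
          curr = [i] + [0] * len(b)
          for j, cb in enumerate(b, 1):
              cost = 0 if ca == cb else 1
              curr[j] = min(prev[j] + 1,         # delete from a
                            curr[j - 1] + 1,     # insert into a
                            prev[j - 1] + cost)  # substitute, or match for free
          prev = curr
      return prev[-1]

  print(edit_distance("kitten", "sitting"))  # 3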

4 · Systems & Architecture

Performance, reliability, and cost all bottom out in physics: how memory hierarchies, kernels, and hardware coordinate. The "latency numbers every programmer should know" table below is a developer's compass.

Latency numbers every developer should know (rough orders of magnitude, 2026)
  L1 cache ~1 ns · L2 cache ~4 ns · branch mispredict ~5 ns · DRAM access ~100 ns · NVMe SSD read ~10 µs · datacenter RTT ~500 µs · cross-region RTT 70–150 ms · cold-start LLM call 0.5–3 s
Figure 3 — The seven-orders-of-magnitude gap between a register and a cross-region call is why where code runs matters as much as what it does.
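These numbers are most useful as arithmetic. A back-of-envelope sketch using the rough 2026 figures from the chart (constants, not measurements):

  # Rough orders of magnitude from Figure 3, in seconds.
  DRAM = 100e-9
  NVME_READ = 10e-6
  CROSS_REGION_RTT = 100e-3

  # 50 sequential cross-region calls vs. one batched call:
  print(50 * CROSS_REGION_RTT)  # ~5 s: unusable in a request path
  print(1 * CROSS_REGION_RTT)   # ~0.1 s: fine
  # Touching 1,000 pages on NVMe vs. finding them in the page cache:
  print(1000 * NVME_READ)       # ~10 ms
  print(1000 * DRAM)            # ~0.1 ms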
OS

Process & memory

Virtual memory, paging, page-cache, mmap, copy-on-write. Know your scheduler (CFS, EEVDF on Linux 6.x) and how io_uring changes I/O.

Concurrency

Three models

Threads + locks, async/await (cooperative), and actors/CSP (message passing). Pick one per service boundary, not per file.
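The cooperative model is the easiest to show in a few lines; a minimal asyncio sketch, where the fetch function fakes I/O with a sleep:

  import asyncio

  async def fetch(name: str, delay: float) -> str:
      # Simulates an I/O-bound call; await yields control to the event loop.
      await asyncio.sleep(delay)
      return f"{name} done"

  async def main() -> None:
      # Tasks interleave at await points, so wall time is ~max(delays), not the sum.
      results = await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2))
      print(results)

  asyncio.run(main())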

Hardware

Mechanical sympathy

Cache lines (64B), NUMA, SIMD, branch prediction. A tight loop whose working set fits in L1 can be 100× faster than a naïve one streaming from RAM.

5 · Networking & the Web

The web runs on a small number of layered abstractions. Understanding them lets you debug "why is my page slow?" with first principles instead of guesses.

From bits to browsers — the 2026 web stack
  L7 · Application — HTTP/3, gRPC, GraphQL, WebSocket, WebTransport, SSE
  L6 · Presentation/Security — TLS 1.3, mTLS, ALPN, Encrypted ClientHello, post-quantum KEMs (ML-KEM)
  L5 · Session — QUIC streams, HTTP/2 multiplexing, 0-RTT
  L4 · Transport — TCP (BBRv3), UDP (QUIC), congestion control
  L3/L2 · IP / Ethernet / Wi-Fi 7 / 5G — routing, NAT, anycast, BGP
  L1 · Physical — fibre, wireless, satellite (LEO)
Figure 4 — HTTP/3 over QUIC is now the default on most CDNs. Post-quantum hybrid key exchange is rolling out in TLS 1.3.

What to debug, in order

  1. DNS — resolution time, TTL, regional anycast.
  2. TCP/QUIC handshake — 0-RTT eligibility, head-of-line blocking.
  3. TLS — cert chain, OCSP staple, ALPN.
  4. HTTP semantics — caching headers, range, compression (Brotli/zstd).
  5. Server processing — DB query, CPU, GC pauses.
  6. Client rendering — Critical CSS, hydration, Core Web Vitals (LCP, INP, CLS).
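A minimal sketch for attributing time across steps 1–4 with only the standard library (the host is a placeholder; real tooling such as curl's timing flags or browser devtools gives finer detail):

  import socket, ssl, time

  host, port = "example.com", 443  # placeholder target

  t0 = time.perf_counter()
  addr = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4][:2]
  t_dns = time.perf_counter()

  sock = socket.create_connection(addr, timeout=5)  # TCP handshake
  t_tcp = time.perf_counter()

  tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)  # TLS handshake
  t_tls = time.perf_counter()

  tls.sendall(f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode())
  tls.recv(1)  # first byte of the response
  t_ttfb = time.perf_counter()
  tls.close()

  for label, start, end in [("DNS", t0, t_dns), ("TCP", t_dns, t_tcp),
                            ("TLS", t_tcp, t_tls), ("TTFB", t_tls, t_ttfb)]:
      print(f"{label}: {(end - start) * 1e3:.1f} ms")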

6 · Distributed Systems

Distributed systems are the discipline of partial failure. The CAP and PACELC framings, plus consensus, are the conceptual core.

CAP + PACELC — choosing trade-offs explicitly
  CAP: consistency, availability, partition tolerance; under a partition you pick C or A.
  PACELC: if Partitioned, choose PA (prefers availability) or PC (prefers consistency); Else, choose EL (low latency) or EC (strong consistency).
  Real systems: DynamoDB · PA/EL; Cassandra · PA/EL; Spanner / CockroachDB · PC/EC; MongoDB (majority) · PC/EC; Redis Cluster · PA/EL
Figure 5 — PACELC extends CAP with the steady-state trade-off between latency and consistency.
Consensus

Paxos, Raft, Viewstamped Replication

All solve the same problem: agree on a value despite failures. Raft is the de facto teaching algorithm; Multi-Paxos and Viewstamped Replication power the largest systems. Modern variants (EPaxos, Flexible Paxos) trade latency for failure-domain flexibility.

Replication

Patterns

  • Leader/follower (Postgres, MySQL).
  • Multi-leader (CRDTs, geo-active-active).
  • Leaderless / quorum (Dynamo-style).
  • Chain replication (CRAQ for read scaling).
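The leaderless row hinges on one inequality: R + W > N, which forces read and write quorums to overlap. A toy sketch (all names are mine):

  N, W, R = 3, 2, 2
  assert R + W > N  # quorum overlap guarantee

  # Each replica stores (version, value); higher version wins.
  replicas = [(0, None)] * N

  def write(value, version):
      for i in range(W):            # pretend replica N-1 is slow or partitioned
          replicas[i] = (version, value)

  def read():
      responses = replicas[N - R:]  # worst case: the R least-recently-written replicas
      return max(responses)         # at least one of them saw the latest write

  write("v1", version=1)
  print(read())  # (1, 'v1'): any 2-of-3 read quorum intersects the 2-of-3 write quorum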

7 · Databases & Data Engineering

The database is usually your most expensive and least reversible decision. Choose by access pattern + consistency need + ops cost, in that order.

Database selection — a decision tree, not a tier list
  What is the workload?
  OLTP (rows, ACID) — Postgres / MySQL; CockroachDB / Spanner at global scale
  OLAP (analytics) — DuckDB · ClickHouse; Snowflake · BigQuery · Databricks
  Specialty — Redis (KV); Neo4j (graph)
  Vector (pgvector, Qdrant, Pinecone) — use when retrieval is semantic; pair with a reranker
  Time-series (Timescale, InfluxDB) — high write rate, append-mostly, range queries
  Search (OpenSearch, Meilisearch) — full-text, faceted, typo-tolerant
Figure 6 — Default to Postgres until a workload forces you out. Mix engines at the storage layer, not at the API.

Data engineering pipeline shape

source  →  ingest  →  store (lake/warehouse)  →  transform (dbt/Spark)  →  serve
                          ↓                              ↓
                       catalog (Iceberg/Delta/Hudi)   metrics + features
                          ↓                              ↓
                       lineage / quality             ML training / RAG

8 · AI / ML / LLMs

In 2026 a software developer is expected to integrate AI competently even if they don't train models. The diagram below distinguishes the four common operating modes.

Four operating modes for LLMs in production
  1 · Direct prompting — stateless call; few-shot examples; cheap, fast iteration. Eval: golden set. Risk: hallucination. Use for: drafting, summarization, classification. Latency: 100 ms–2 s.
  2 · RAG — retrieve and ground; hybrid vector + lexical; reranker on top-k. Eval: faithfulness + recall. Risk: stale or wrong docs. Use for: docs Q&A, support, internal search. Latency: 0.5–4 s.
  3 · Tool use / agents — plan → act → observe; MCP / function calling; loops, memory, budgets. Eval: trajectory + outcome. Risk: runaway loops, prompt injection. Use for: workflows, coding agents, ops. Latency: seconds–minutes.
  4 · Fine-tune / train — LoRA / DPO / RLHF; domain-specific data; owned weights. Eval: holdout + offline. Risk: drift, compute cost. Use for: style, safety, small specialist models. Latency: same as base.
Figure 7 — Start with prompting, add RAG when grounded knowledge is needed, add tools when actions are needed, fine-tune last.

Evaluation is the moat

The most common failure mode in AI features is "looks good in the demo, regresses in prod." Build evals before you ship anything.

A research paper or production launch in 2026 without an eval section is incomplete. The eval is the experiment.
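What "build evals before you ship" means mechanically: a golden set in the repo and a pass-rate gate in CI. A skeleton, where call_model and the cases are stand-ins rather than a real API:

  def call_model(prompt: str) -> str:
      raise NotImplementedError  # hypothetical: swap in your actual client

  GOLDEN_SET = [
      {"prompt": "Summarize: the meeting moved to Tuesday.", "must_contain": "Tuesday"},
      # ...dozens more cases, versioned alongside the prompt they test
  ]

  def run_evals(threshold: float = 0.95) -> bool:
      passed = sum(
          case["must_contain"].lower() in call_model(case["prompt"]).lower()
          for case in GOLDEN_SET
      )
      score = passed / len(GOLDEN_SET)
      print(f"golden-set pass rate: {score:.0%}")
      return score >= threshold  # gate the deploy on this, like any other CI check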

9 · Security & Cryptography

Threat modeling is cheap; incidents are not. Use the STRIDE mnemonic on every new feature, and pair it with the standard mitigations below.

STRIDE — six threat classes and their default mitigations
  Spoofing (identity faked) → strong authN; MFA, passkeys; mutual TLS
  Tampering (data altered) → signed payloads; integrity checks; audit logs
  Repudiation ("I didn't do it") → append-only logs; signed actions; timestamps
  Information disclosure (secrets leak) → encryption; least privilege; data minimization
  Denial of service (resource exhaustion) → rate limits; quotas, queues; circuit breakers
  Elevation of privilege → RBAC/ABAC; sandboxing; capabilities
Figure 8 — Apply STRIDE per data-flow boundary, not per feature.
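One mitigation from the denial-of-service row, sketched as a token bucket (rate and capacity are illustrative):

  import time

  class TokenBucket:
      # Allows `rate` requests per second, with bursts up to `capacity`.
      def __init__(self, rate: float, capacity: float):
          self.rate, self.capacity = rate, capacity
          self.tokens = capacity
          self.last = time.monotonic()

      def allow(self) -> bool:
          now = time.monotonic()
          # Refill in proportion to elapsed time, capped at capacity.
          self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
          self.last = now
          if self.tokens >= 1:
              self.tokens -= 1
              return True
          return False

  bucket = TokenBucket(rate=5, capacity=10)
  print(sum(bucket.allow() for _ in range(15)))  # ~10: the burst allowance, then throttled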

2026 baseline you should be able to defend

10 · Software Engineering Practice

The most underrated CS skill is shipping software that other humans can change. The diagram below shows the modern feedback loop a high-functioning team runs.

The modern delivery loop — each arrow is a feedback signal
  Plan / Spec (definition of done) → Build / Code → Test / Review (PRs, CI gates) → Release / Deploy (canary, feature flags) → Operate / Observe (SLOs, incidents) → Learn / Iterate (post-mortems, user signal) → back to Plan
Figure 9 — The loop closes when production behavior changes the next spec. Teams without that closure stall.
Code quality

The big six

  • Types & static analysis
  • Unit + property + integration tests
  • Code review with checklists
  • Linters & formatters in CI
  • Dependency upgrade bots
  • Mutation testing for hot paths
Workflow

Trunk-based delivery

  • Short-lived branches
  • Feature flags for unfinished work
  • Continuous deploy to staging
  • Canary + blue/green to prod
  • Rollback in <5 min
Docs

What to write

  • ADRs (architecture decisions)
  • RFCs for cross-team work
  • Runbooks per service
  • Onboarding 90-day plan
  • Glossary of domain terms

Testing pyramid (and the modern flip)

Level | Quantity | Speed | Confidence | 2026 note
Unit | Many | ms | Local | Property tests catch what examples miss
Integration | Some | seconds | Module | Use real DBs in containers, not mocks
End-to-end | Few | minutes | System | Playwright/Cypress; run on a deployed env
Eval / behavioral | Always | varies | AI features | The new tier for LLM-powered features
Chaos / load | Periodic | hours | Reliability | Run before launches, not after incidents
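The property-test note in the unit row, shown with Hypothesis (a real library; the round-trip property is a stock example, not from the original):

  from hypothesis import given, strategies as st

  def encode(s: str) -> bytes:
      return s.encode("utf-8")

  def decode(b: bytes) -> str:
      return b.decode("utf-8")

  @given(st.text())
  def test_roundtrip(s):
      # Holds for any string Hypothesis generates, not just hand-picked examples.
      assert decode(encode(s)) == s

  test_roundtrip()  # callable directly, or collected by pytest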

11 · Cloud, DevOps & Platform Engineering

Cloud is the substrate; platform engineering is how you make it usable. A good internal developer platform (IDP) turns "deploy" into a one-line action and bakes in compliance.

Internal developer platform (IDP) — the golden path
  Developer portal — service catalog; templates / scaffolds; docs & ownership (Backstage / Port)
  CI / CD — build & sign; SBOM + scan; progressive delivery (Argo / GitHub Actions)
  Compute — Kubernetes; serverless; edge / WASM; GPUs for inference
  Data & AI — managed DBs; object store; model gateway; feature store
  Observability & security — OpenTelemetry; SLO / SLI / error budgets; secrets vault; policy as code (OPA)
Figure 10 — A working IDP makes the secure, observable, scalable path the easy one.

Infrastructure-as-code stance

12 · Performance & Observability

"Make it work, make it right, make it fast" — in that order. Measure before you optimize. Modern observability is three pillars plus one.

Observability — three pillars plus profiling, unified by OpenTelemetry
  Metrics — counters, gauges, histograms; cardinality matters; RED & USE methods; SLOs / SLIs (Prometheus · Mimir)
  Logs — structured (JSON); sampled / leveled; correlated by trace-id; no PII in logs (Loki · OpenSearch)
  Traces — spans across services; critical-path analysis; tail sampling; propagate W3C tracecontext (Tempo · Jaeger)
  Continuous profiling — CPU, memory, allocations; eBPF for low overhead; flame graphs in prod; tie to deploy diffs (Pyroscope · Parca)
Figure 11 — Pillars are not enough on their own; the value is in correlation through OpenTelemetry trace IDs.

The performance loop

  1. Define the SLI (e.g., p95 request latency).
  2. Measure — instrument, run load tests, capture flame graphs.
  3. Find the bottleneck — Amdahl says fix the biggest slice first.
  4. Change one thing, A/B it, re-measure.
  5. Lock in with a regression test or budget alert.
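Step 1 in code, assuming you already collect per-request latencies (the sample data is invented):

  import statistics

  latencies_ms = [12, 15, 11, 180, 14, 13, 16, 240, 12, 15] * 10

  # statistics.quantiles returns n-1 cut points; index 94 is the 95th percentile.
  p50 = statistics.quantiles(latencies_ms, n=100)[49]
  p95 = statistics.quantiles(latencies_ms, n=100)[94]
  print(f"p50={p50:.0f} ms  p95={p95:.0f} ms")
  # An SLI like "p95 < 200 ms" only means something with enough samples per window.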

13 · Research-Paper Methodology for CS

Whether you are writing a conference paper, a tech report, or a launch retrospective, the structure is the same: question → method → evidence → claim. The diagram below is a research project as a pipeline.

A CS research paper as a pipeline — each box is a deliverable
  1 · Question — specific, falsifiable
  2 · Literature — position vs. prior work
  3 · Hypothesis — predicted effect & size
  4 · Method — design, baselines, metrics
  5 · Experiment — run, log, reproduce
  6 · Analysis — stats, ablations
  7 · Threats — validity, bias, limits
  8 · Artifact — code, data, seed, env
  9 · Write — IMRaD + figures
  10 · Review — internal, external, peer
  11 · Revise — address each comment
  12 · Publish — arXiv, venue, blog
  Loop back to step 1 with the next question — papers seed papers.
Figure 12 — Treat the paper as a pipeline whose artifacts (data, code, figures) are reusable across drafts and venues.

IMRaD structure with CS-specific guidance

Section | Length | Key questions to answer | Common mistakes
Abstract | 150–250 words | What problem, what idea, what evidence, what impact? | Vague claims, no numbers
Introduction | 1–1.5 pages | Why now? What gap? Three-bullet contribution list. | Backstory instead of stakes
Related work | 0.5–1 page | What does this paper do that prior work does not? | A list instead of a comparison
Method | 2–4 pages | Could a competent reader reimplement it? | Skipping non-obvious choices
Experiments | 2–4 pages | Baselines, datasets, metrics, ablations, statistical tests. | Single-seed results
Discussion | 0.5–1 page | What did we learn? When does this fail? | Restating results
Threats to validity | 0.25–0.5 page | What might be wrong, and what would change the conclusion? | Omitting the section
Reproducibility | Appendix | Code, data, hyperparameters, hardware, seeds, exact commands. | "Available on request"

Reproducibility — the 2026 bar

If your reviewer cannot run your experiment with one command on one machine, your paper is incomplete.
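The mechanical part of that bar is small. A sketch of what a one-command entry point pins down (the file name and fields are mine):

  # run.py: "python run.py" reproduces the experiment end to end.
  import json, platform, random

  CONFIG = {"seed": 42, "lr": 3e-4, "epochs": 10}  # illustrative hyperparameters

  def main():
      random.seed(CONFIG["seed"])  # also seed numpy/torch if you use them
      results = {"metric": 0.0}    # placeholder: train / evaluate here
      # Log config and environment next to results so the run is auditable.
      record = {"config": CONFIG, "python": platform.python_version(), "results": results}
      with open("run.json", "w") as f:
          json.dump(record, f, indent=2)

  if __name__ == "__main__":
      main()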

Common CS publication venues

Area | Top venues | Style
Systems | SOSP, OSDI, NSDI, EuroSys, ATC, FAST | Implementation + measurement
Networking | SIGCOMM, NSDI, CoNEXT | Protocols + measurement
Databases | SIGMOD, VLDB, ICDE, CIDR | Engines, query processing
PL | PLDI, POPL, OOPSLA, ICFP | Semantics, compilers, proofs
Theory | STOC, FOCS, SODA | Proofs, complexity
ML / AI | NeurIPS, ICML, ICLR, ACL, EMNLP | Empirical + analysis
Security | USENIX Security, IEEE S&P, CCS, NDSS | Attacks, defenses, measurement
HCI | CHI, UIST, CSCW | User studies, design
SE | ICSE, FSE, ASE | Tools, empirical SE

14 · Career & Learning Strategy for Software Developers

Skills compound; reputations compound faster. Aim for T-shaped early (one deep stem, broad horizontal), then π-shaped mid-career (two deep stems).

Skill shape over a 10-year career
  Year 0–2 · I-shape — one deep skill
  Year 2–5 · T-shape — broad context + one depth
  Year 5–10 · π-shape — two depths bridged by breadth
Figure 13 — Add the second stem deliberately; usually one is technical, one is adjacent (product, security, ML, distributed).
Learning

How to keep up without burning out

  • Spaced practice — weekly, not weekend cram.
  • Build, then read — write the toy version, then read the paper.
  • Teach — internal talks, blog posts, code reviews. Output forces understanding.
  • Cap inputs — 2 newsletters, 1 podcast, 3 papers/month beats infinite feeds.
  • Re-read classics — Designing Data-Intensive Applications, SICP, The Pragmatic Programmer.
Visibility

How to be known for something

  • Pick one topic you are willing to defend for 18 months.
  • Publish — blog, paper, OSS — on a regular cadence.
  • Show your work: benchmarks, reproductions, post-mortems.
  • Speak at one venue per year (lunch & learn counts).
  • Help one person publicly every week (issue, PR, review).

The 6-month focus plan (a template)

A 6-month skill sprint — one theme, three outputs
  M1 · Survey — 5 papers, 1 book; map the field
  M2 · Replicate — build the toy; match a baseline
  M3 · Vary — change 1 thing; measure the effect
  M4 · Ship — into a real product or an OSS lib
  M5 · Publish — blog + talk, or a paper draft
  M6 · Reflect — what next?
Figure 14 — Same shape works for "learn Rust", "ship an ML feature", or "write your first paper".

15 · Master Checklist — Everything You Should Be Able To Do

Code & algorithms

  • Pick a data structure by access pattern and justify the trade-off in writing.
  • Reason about Big-O and amortized cost without looking it up.
  • Write idiomatic code in at least two paradigms (OO + functional or systems).
  • Recognize when a problem is NP-hard and choose an approximation strategy.

Systems

  • Estimate latency, throughput, and storage on the back of an envelope.
  • Design a service that survives a single-AZ outage.
  • Trace a slow request from browser to database and back.
  • Write a runbook that an oncall engineer can use at 3am.

Data & AI

  • Design a RAG pipeline with a sensible eval harness.
  • Explain the difference between fine-tuning, RAG, and prompting to a PM.
  • Identify when a vector database is overkill vs. essential.
  • Diagnose a model regression with a holdout set.

Security & reliability

  • Run STRIDE on a feature design and produce mitigations.
  • Set an SLO and an error budget you actually enforce.
  • Lead a blameless post-mortem with clear action items.
  • Rotate a secret without downtime.

Practice & communication

  • Write an ADR that holds up six months later.
  • Give a 10-minute talk on something you built last quarter.
  • Review a PR with substantive feedback, not nits.
  • Mentor a junior to ship their first production change.

Research

  • Frame a falsifiable research question.
  • Run an experiment with baselines, ablations, and seeds.
  • Produce a one-command reproduction.
  • Publish or present at least one piece of work each year.

Appendix · Curated reading list

Foundations
  • Introduction to Algorithms — Cormen et al.
  • Structure and Interpretation of Computer Programs — Abelson & Sussman
  • Computer Systems: A Programmer's Perspective — Bryant & O'Hallaron
Systems
  • Designing Data-Intensive Applications — Kleppmann
  • Database Internals — Petrov
  • Site Reliability Engineering — Google
AI / ML
  • Deep Learning — Goodfellow et al.
  • Pattern Recognition and Machine Learning — Bishop
  • Papers: Attention Is All You Need, Chinchilla, RAG, Constitutional AI
Practice
  • The Pragmatic Programmer — Hunt & Thomas
  • Accelerate — Forsgren, Humble, Kim
  • A Philosophy of Software Design — Ousterhout
Security
  • Cryptography Engineering — Ferguson, Schneier, Kohno
  • The Tangled Web — Zalewski
  • OWASP Top 10 + LLM Top 10
Research craft
  • The Craft of Research — Booth et al.
  • Writing for Computer Science — Zobel
  • "How to write a great research paper" — Simon Peyton Jones (talk)

16 · References & sources

The diagrams here are my synthesis; the list below is where I point people for citable anchors—textbooks, papers, and standards that match §1–§15. It is not exhaustive; not every vendor or tool named in the diagrams appears below.

Note: Use published editions for bibliographies; arXiv is fine for preprints. RFCs and NIST publications are versioned—always confirm you cite the revision your design actually follows.

Algorithms, complexity & discrete foundations

  1. Cormen, Leiserson, Rivest & Stein, Introduction to Algorithms (CLRS). MIT Press—standard reference for Big-O, data structures, and classical algorithms tied to §3. ISBN 978-0262046305.
  2. Sedgewick & Wayne, Algorithms (4th ed.). Addison-Wesley—practical algorithms with running-cost intuition. https://algs4.cs.princeton.edu/
  3. Knuth, The Art of Computer Programming, Vol. 1 (Fundamental Algorithms). Addison-Wesley—historical rigor for analysis and combinatorial foundations.
  4. Sipser, Introduction to the Theory of Computation. Cengage—automata, computability, complexity classes; connects to “NP-hard” discussions in §3.

Computer systems, OS & architecture

  1. Bryant & O’Hallaron, Computer Systems: A Programmer’s Perspective (CSAPP). Pearson—memory hierarchy, linking, concurrency primitives; backs §4. ISBN 978-0134092669.
  2. Hennessy & Patterson, Computer Architecture: A Quantitative Approach. Morgan Kaufmann—pipeline, caches, and performance modeling vocabulary in §4, §12.
  3. Arpaci-Dusseau & Arpaci-Dusseau, Operating Systems: Three Easy Pieces (OSTEP). Free online textbook—processes, virtualization, persistence. https://pages.cs.wisc.edu/~remzi/OSTEP/

Programming languages & abstraction

  1. Abelson & Sussman, Structure and Interpretation of Computer Programs (SICP). MIT Press—procedures, state, and metalinguistic abstraction; cited in the appendix and useful for §2 foundations.

Networking & web protocols

  1. Tanenbaum & Wetherall, Computer Networks. Pearson—layered stack context for §5.
  2. Stevens, TCP/IP Illustrated, Vol. 1 (and Vol. 2 for implementation detail). Addison-Wesley.
  3. Postel, RFC 793 — Transmission Control Protocol. IETF (since obsoleted by RFC 9293). RFC 793
  4. Iyengar & Thomson (eds.), RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport. RFC 9000
  5. Fielding, Nottingham & Reschke (eds.), RFC 9110 — HTTP Semantics. IETF. RFC 9110

Distributed systems, consensus & consistency

  1. Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System.” CACM 1978. Logical clocks—§6. Author PDF
  2. Fischer, Lynch & Paterson, “Impossibility of Distributed Consensus with One Faulty Process.” J. ACM 1985 (FLP). Explains why asynchronous consensus is impossible with one crash fault without timeouts.
  3. Lamport, “The Part-Time Parliament.” ACM TOCS 1998—the original Paxos paper; see also Lamport’s Paxos Made Simple. Paxos Made Simple (PDF)
  4. Ongaro & Ousterhout, “In Search of an Understandable Consensus Algorithm (Raft).” USENIX ATC 2014. https://raft.github.io/raft.pdf
  5. Gilbert & Lynch, “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services.” ACM SIGACT News 2002—CAP formalization often referenced with §6–§7.

Databases, storage & data-intensive systems

  1. Kleppmann, Designing Data-Intensive Applications (DDIA). O’Reilly—storage engines, replication, stream processing; core for §7. ISBN 978-1449373320.
  2. Gray, “The Transaction Concept: Virtues and Limitations.” VLDB 1981—transaction semantics vocabulary. Author archive PDF
  3. Petrov, Database Internals. O’Reilly—B-trees, LSM, distributed storage mechanics complementing DDIA.

Machine learning, deep learning & retrieval-augmented generation

  1. Goodfellow, Bengio & Courville, Deep Learning. MIT Press—foundations for §8. https://www.deeplearningbook.org/
  2. Bishop, Pattern Recognition and Machine Learning. Springer—classical probabilistic ML.
  3. Vaswani et al., “Attention Is All You Need.” NeurIPS 2017—transformers underpin LLM sections. https://arxiv.org/abs/1706.03762
  4. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” NeurIPS 2020—RAG. https://arxiv.org/abs/2005.11401

Security, cryptography & web application safety

  1. Ferguson, Schneier & Kohno, Cryptography Engineering. Wiley—engineering practice for crypto systems; §9.
  2. OWASP Top 10 Web Application Security Risks. Community standard. OWASP Top 10
  3. OWASP Top 10 for Large Language Model Applications. OWASP LLM Top 10

Software engineering, reliability & platform practice

  1. Hunt & Thomas, The Pragmatic Programmer (20th Anniversary). Addison-Wesley—habits for §10.
  2. Forsgren, Humble & Kim, Accelerate. IT Revolution—DORA metrics and delivery science referenced in modern practice discussions. ISBN 978-1942788331.
  3. Ousterhout, A Philosophy of Software Design. Yaknyam Press—module design and complexity control.
  4. Google, Site Reliability Engineering (free). O’Reilly / Google—SLOs, error budgets, incident response; §11–§12. https://sre.google/sre-book/table-of-contents/
  5. Google, The Site Reliability Workbook. Companion for practical patterns. https://sre.google/workbook/table-of-contents/

Observability & performance analysis

  1. OpenTelemetry Project. Vendor-neutral telemetry model. https://opentelemetry.io/
  2. W3C Trace Context. Distributed trace propagation. https://www.w3.org/TR/trace-context/
  3. Gregg, Systems Performance (2nd ed.). Pearson—USE methodology, profiling, and systems observability depth for §12.

Research communication & CS writing

  1. Booth, Colomb & Williams, The Craft of Research (4th ed.). University of Chicago Press—question framing and argument structure; §13.
  2. Zobel, Writing for Computer Science. Springer—conventions for CS papers and reports.
  3. Peyton Jones, “How to Write a Great Research Paper.” Microsoft Research talk (series). Microsoft Research video listing
Disclaimer. Reading list only—links aren’t endorsements. Canonical page for this file: LinhTruong.com (Linh Truong).