Single-page notes on the data lifecycle, familiar ethics pillars, EU AI Act / GDPR / NIST / ISO frames (as engineers actually use them), controls, eval habits, and checklists. I reach for this in design reviews—not as legal advice, but so “ethics” becomes tickets and tests instead of a vague slide.
Source: if you redistribute this file, link LinhTruong.com so the copy people get still shows authorship.
For me, ethics stopped being “someone else’s slide deck” once it collided with audits, DPIAs, and release gates. Regulators can pursue serious penalties under frameworks like the EU AI Act; GDPR still bites on sloppy data handling; buyers ask for model cards; a bad launch is a trust problem, not just a metric blip. What follows is the map I use: controls, lifecycle, architecture sketches, and checklists—so “responsible AI” shows up as code and reviews, not vibes.
Non-compliant systems face fines, takedowns, and reputational harm.
Documented provenance and evaluation are now a procurement requirement.
When fairness, safety, and drift get the same care as latency, I see fewer ugly surprises after launch.
Reusable guardrails accelerate, not slow, future releases.
Software engineers are the last mile of ethical AI. Policies become real only when expressed as code: data filters, eval gates, RBAC rules, audit logs, content-safety checks, retention jobs, and rollback triggers. Treat ethics as a non-functional requirement on equal footing with availability, latency, and security.
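As one concrete instance of an "eval gate," here is a minimal sketch a CI job could run before promoting a model. The metric names, floor values, and results file are assumptions for illustration, not a standard.

# Minimal CI eval gate (illustrative; metric names, floors, and file path are assumptions)
import json, sys

FLOORS = {"helpfulness": 0.80, "toxicity_block_rate": 0.98}   # minimum scores required to ship

def gate(path="eval_results.json"):                            # written by an upstream eval job
    results = json.load(open(path))
    failures = {k: results.get(k, 0.0) for k, floor in FLOORS.items()
                if results.get(k, 0.0) < floor}
    if failures:
        print(f"Eval gate failed: {failures}", file=sys.stderr)
        sys.exit(1)                                            # blocks the release pipeline

if __name__ == "__main__":
    gate()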
OECD, UNESCO, NIST, and EU High-Level Expert Group language mostly lines up on six themes. I use them as a rubric when I'm sanity-checking a system—not as a substitute for your counsel's risk tiering.
Ethical risk is introduced, and must be mitigated, at every stage of the pipeline. The table below maps the eight canonical stages to the controls a software engineer is expected to implement at each; a sketch of one such control follows the table.
| Stage | Risk | Engineering control |
|---|---|---|
| 1. Framing | Solving the wrong problem; harmful use case | Algorithmic Impact Assessment (AIA); proportionality review; “should we build it?” memo |
| 2. Collection | Unlawful, biased, or non-consensual sourcing | Source whitelist; consent capture; robots.txt & ToS check; provenance signatures (C2PA) |
| 3. Labeling | Annotator bias; harmful labor practices | Annotator guidelines; agreement metrics (Krippendorff α); fair-pay attestation |
| 4. Training | Memorization of PII; poisoning | De-duplication; PII scrubbing; differential privacy; checkpoint signing |
| 5. Evaluation | Hidden subgroup failures | Disaggregated metrics; red-teaming; jailbreak suites; capability evals |
| 6. Deployment | Misuse, hallucination, prompt injection | System-prompt hardening; output filters; rate limits; user-visible disclosure |
| 7. Monitoring | Drift, regression, emerging harms | Production fairness dashboards; user-feedback loop; incident hotline |
| 8. Decommission | Stale models causing harm; retention violations | Sunset policy; deletion / unlearning workflows; archive of lineage |
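To make one row concrete: the stage-4 de-duplication control can start as a simple exact-match pass. This is a toy sketch only; real pipelines typically layer near-duplicate detection (e.g. MinHash) on top.

# Toy exact-dedup pass over a text corpus (stage 4 above); illustrative only
import hashlib

def dedup(texts):
    seen, kept = set(), []
    for text in texts:
        h = hashlib.sha256(" ".join(text.lower().split()).encode()).hexdigest()  # normalize, then hash
        if h not in seen:
            seen.add(h)
            kept.append(text)
    return kept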
Most regulated AI products in 2026 share a similar stack, where each layer pairs a capability with the controls that make it ethical and auditable.
Multiple regimes now apply concurrently. Engineers do not need to be lawyers, but must know which controls satisfy which regime so technical decisions don't undo legal posture.
| Regulation | Scope | Status (May 2026) | Engineer-facing requirements |
|---|---|---|---|
| EU AI Act | Risk-tiered AI systems sold/used in EU | In force; high-risk obligations apply from Aug 2026; GPAI obligations active | Risk classification, technical docs (Annex IV), logging, human oversight, post-market monitoring, conformity assessment |
| GDPR | Personal data of EU residents | In force | Lawful basis, DPIA, data minimization, Art. 22 automated-decision rights, deletion/portability |
| NIST AI RMF 1.0 | US guidance, gov & contractors | Voluntary, widely adopted | Map · Measure · Manage · Govern functions |
| ISO/IEC 42001 | AI management system standard | Certification track | AIMS policies, controls, internal audit, continual improvement |
| ISO/IEC 23894 | AI risk management guidance | Reference | Risk identification, treatment, monitoring |
| Colorado AI Act / NYC LL144 | US state & municipal AI laws | In force | Bias audits, candidate notice, impact assessments |
| UK AI Bill (2026) | UK domestic regime | Sectoral guidance from regulators | Transparency, safety testing for frontier models |
| HIPAA / FDA SaMD | US health AI | In force | Validation, change-control, predetermined change-control plans (PCCP) |
Bias is rarely a single bug. It is the cumulative effect of which data was collected, who labeled it, what objective was optimized, and who used the output. Engineers must measure it explicitly — disaggregated metrics, not aggregate accuracy.
No single metric satisfies all fairness definitions simultaneously; choose deliberately and document the trade-off.
# Disaggregated evaluation skeleton (illustrative; assumes df has y_true, y_pred, y_prob columns)
from sklearn.metrics import accuracy_score, recall_score, confusion_matrix

metrics = {}
for group in df["protected_attr"].unique():
    sub = df[df["protected_attr"] == group]
    tn, fp, fn, tp = confusion_matrix(sub.y_true, sub.y_pred, labels=[0, 1]).ravel()
    metrics[group] = {
        "n": len(sub),
        "accuracy": accuracy_score(sub.y_true, sub.y_pred),
        "tpr": recall_score(sub.y_true, sub.y_pred),                      # true-positive rate
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,                      # false-positive rate
        "calibration_ece": expected_calibration_error(sub.y_true, sub.y_prob),  # assumed project helper
    }

def max_gap(metrics, key):
    vals = [m[key] for m in metrics.values()]
    return max(vals) - min(vals)                                          # largest gap across groups

assert max_gap(metrics, "tpr") < 0.05, "TPR disparity exceeds threshold"
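The 0.05 gap used here is a placeholder, not a standard; pick the threshold together with the fairness definition you chose, and record the trade-off alongside the model card.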
AI systems inherit classical software risk and add a new attack surface (model behavior). Map your threats with frameworks like MITRE ATLAS and the OWASP Top 10 for LLM Applications.
| Threat | Where it hits | Mitigation |
|---|---|---|
| Prompt injection (direct & indirect) | LLM input from users / retrieved docs | Content provenance tagging, instruction-isolation, allowlisted tools, output validation |
| Training-data poisoning | Crawled or third-party corpora | Source vetting, anomaly detection, signed datasets, dedup & canaries |
| Model extraction / theft | Public APIs | Rate limits, watermarking, query budgets, auth |
| Membership inference | Models trained on PII | DP-SGD, regularization, output perturbation |
| Jailbreaks | Aligned models | System-prompt hardening, classifier ensembles, refusal training, red-team CI |
| Tool/agent abuse | Autonomous agents | Capability tokens, human approval for high-impact actions, sandboxing |
| Supply-chain | Models/weights from hubs | SBOM, hash pinning, provenance attestations (SLSA, Sigstore) |
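A minimal sketch of the "output validation" and "allowlisted tools" mitigations above: a model-proposed action is accepted only if it parses and names a permitted tool. The schema and tool names are assumptions.

# Illustrative output-validation gate for model-proposed tool calls
import json

ALLOWED_TOOLS = {"search_docs", "summarize"}           # hypothetical allowlist

def validate_action(model_output: str) -> dict:
    try:
        action = json.loads(model_output)              # require structured output
    except json.JSONDecodeError:
        raise ValueError("model output is not valid JSON; refusing to act")
    if action.get("tool") not in ALLOWED_TOOLS:
        raise ValueError(f"tool {action.get('tool')!r} is not allowlisted")
    if not isinstance(action.get("args"), dict):
        raise ValueError("tool args must be an object")
    return action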
Model cards: short, structured documentation of intended use, limits, evaluation, and ethical considerations.
Datasheets for datasets: source, collection method, motivation, composition, preprocessing, distribution.
Content provenance (C2PA): cryptographic provenance for generated media; required for many EU "limited-risk" disclosure cases.
Explainability methods: feature attributions (SHAP, IG), example-based (influence functions), surrogate models, chain-of-thought traces (with care).
Article 22 GDPR & AI Act: meaningful info about logic, significance, consequences.
Append-only logs: model version, prompt, context, output, user, override, justification.
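A minimal append-only log record matching the fields listed above; the schema and storage are illustrative, and real deployments should enforce immutability in the storage layer rather than in application code.

# Illustrative append-only decision-log writer
import json, hashlib
from datetime import datetime, timezone

def log_decision(path, *, model_version, prompt, context_ids, output, user,
                 override=False, justification=None):
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "prompt": prompt,
        "context_ids": context_ids,        # reference retrieved docs by id to keep PII out of logs
        "output": output,
        "user": user,
        "override": override,
        "justification": justification,
    }
    record["hash"] = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    with open(path, "a") as f:             # append-only by convention; back with WORM storage
        f.write(json.dumps(record) + "\n")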
The EU AI Act mandates "effective human oversight" for high-risk systems. That is concrete: a person must be able to interpret the output, override it, and stop the system.
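A sketch of what that can look like in code, assuming a review queue and an operator-controlled stop; the confidence threshold and the notion of "high impact" are policy choices, not constants.

# Illustrative human-oversight gate: interpret, override, stop
import queue, threading

kill_switch = threading.Event()            # an operator can stop automated decisioning
review_queue = queue.Queue()               # items awaiting human interpretation / override

def apply_decision(pred, confidence, high_impact):
    if kill_switch.is_set():
        raise RuntimeError("AI decisioning halted by operator")
    if high_impact or confidence < 0.80:   # defer instead of auto-applying
        review_queue.put(pred)
        return None                        # a human decides; their call is logged as an override
    return pred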
| Artifact | Purpose | Owner | Trigger |
|---|---|---|---|
| AI Use-Case Intake Form | Capture purpose, data, risk class | Product + Engineering | New AI feature proposed |
| Algorithmic Impact Assessment | Identify affected populations & harms | Ethics/Policy + Eng | Before training |
| DPIA | Privacy risk analysis | DPO + Eng | Personal data involved |
| Datasheet | Document dataset | Data Eng | Per dataset version |
| Model Card | Document model | ML Eng | Per model release |
| Risk Register | Track top risks & status | AI Governance | Continuous |
| Annex IV Tech File | EU AI Act conformity dossier | Eng Lead + Legal | High-risk systems |
| Post-market Monitoring Plan | Detect & respond to harms | SRE + ML Ops | At launch |
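A minimal model-card skeleton covering the fields described earlier (intended use, limits, evaluation, ethical considerations); every value below is illustrative.

# Illustrative model-card skeleton; serialize to YAML/JSON and version it with the model
MODEL_CARD = {
    "model": {"name": "example-classifier", "version": "1.4.0", "owner": "ml-eng"},
    "intended_use": "support-ticket triage; not for employment, credit, or health decisions",
    "limitations": ["English only", "quality degrades on very short inputs"],
    "training_data": {"datasheet": "datasets/tickets-2025q4/DATASHEET.md"},
    "evaluation": {"disaggregated_by": ["language", "region"], "tpr_gap_max": 0.05},
    "ethical_considerations": ["no PII retained in logs", "human review required for account closure"],
    "contact": "ai-governance@example.com",
}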
| Activity | Engineer | ML Lead | Product | Legal/DPO | AI Governance |
|---|---|---|---|---|---|
| Risk classification | C | C | R | A | C |
| Dataset approval | R | A | C | C | I |
| Model card | C | R/A | C | I | I |
| Bias evaluation | R | A | C | I | C |
| Production guardrails | R/A | C | C | I | I |
| Incident response | R | C | C | C | A |
AI incidents differ from classic outages: nothing is broken, the system is just wrong, harmful, or unsafe. Build an AI-specific IR playbook.
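One building block of such a playbook is a trigger that opens an incident on harmful behavior rather than on downtime; a rolling-rate sketch, with the window, threshold, and breach hook as assumptions.

# Illustrative harm-rate trigger for post-launch monitoring
from collections import deque

class HarmRateMonitor:
    def __init__(self, window=1000, threshold=0.01, on_breach=lambda rate: None):
        self.events = deque(maxlen=window)
        self.threshold = threshold
        self.on_breach = on_breach         # e.g. page on-call, freeze the feature flag, start rollback

    def record(self, harmful: bool):
        self.events.append(bool(harmful))
        rate = sum(self.events) / len(self.events)
        if len(self.events) == self.events.maxlen and rate > self.threshold:
            self.on_breach(rate)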
| Need | Open-source / standards | Notes |
|---|---|---|
| Fairness metrics | Fairlearn, AIF360, What-If Tool | Use disaggregated metrics, not a single number |
| Explainability | SHAP, Captum, InterpretML | Match technique to model class & audience |
| Privacy | Opacus (DP-SGD), TensorFlow Privacy, Presidio (PII) | Validate ε budget & downstream utility |
| Eval frameworks | lm-eval-harness, HELM, BIG-bench, Inspect | Add internal red-team suite |
| Guardrails | NeMo Guardrails, Guardrails AI, LLM Guard | Defense-in-depth, not single-layer |
| Provenance | C2PA, Sigstore, SLSA | Sign artifacts and content |
| Lineage / catalog | OpenLineage, DataHub, Unity Catalog | Required for "explain why" requests |
| Risk & governance | NIST AI RMF Playbook, ISO/IEC 42001 controls | Map controls to existing SOC2/ISO 27001 |
Diagrams and narrative in §1–§17 are my synthesis for engineering readers; §18 lists primary standards and papers to verify details. Author: Linh Truong · LinhTruong.com.
Educational engineering reference only—not legal, compliance, or security advice for your jurisdiction. Check current legal texts and qualified counsel before you rely on penalty figures or risk labels in production decisions.
© 2026 Linh Truong · Single-file HTML · print-friendly