The ModernEncyclopedia Est. 2026 · A living curriculum · Regularly updated
DAT-33 · Sciences · Fully written

Learn Data Science with any AI

Statistics & inference

Statistics is the science of learning from data under uncertainty — and data science is its modern, computational form. Together they are the discipline of turning messy numbers into defensible conclusions, and, just as importantly, of not fooling yourself along the way.

It may be the single most useful subject on this whole site: almost every argument you'll ever hear rests on a number, and this teaches you to interrogate it. Set your level below.

Build a prompt ↓

§01

Compose your prompt

Choose a prompt and a level, then copy
Prompt settings
Subject
DAT-33 · Data Science
This prompt is scoped to Data Science. Browse the full library to switch subjects.
Which prompt
Your first contact with a topic, pitched exactly at your level.
Level
How deep to pitch it — from a curious start to full university depth.
Topic — optional, narrows the focus
Study time — used by the syllabus builder
British English
Keeps spelling and exam framing UK-style. Turn off for US spelling.
§02

A map of the field

Learning from data, done honestly

From the foundations to the applied pipeline.

  • Statistical inference & experimental design — how to draw sound conclusions, and how to run a study worth trusting.
  • Probability — the mathematics of chance beneath it all.
  • Regression & modelling — finding and using relationships in data.
  • Causal inference — the hard step from correlation to cause.
  • Data engineering & visualisation — getting, cleaning and honestly showing data.
  • The ethics of data — bias, fairness and privacy.
§03

The canon

The statisticians who built the tools

Real figures behind the everyday methods.

  • Thomas Bayes (18th c.) — Bayes' theorem, the mathematics of updating belief with evidence.
  • Karl Pearson — correlation and much of the modern statistical toolkit.
  • Ronald Fisher (1920s) — experimental design and the p-value; foundational, and now hotly re-examined.
  • Neyman & Pearson — the framework of hypothesis testing.
  • John Tukey — exploratory data analysis, and the idea of really looking at your data first.
  • Judea Pearl — made causal reasoning rigorous, turning "correlation is not causation" into real mathematics.
§04

The live debates

Where statisticians disagree

Genuine debates, several of them urgent.

  • Bayesian vs frequentist. A deep, century-old split over what probability even means.
  • The p-value and the replication crisis. The overuse and abuse of significance testing — "p-hacking" — has shaken whole fields.
  • Correlation vs causation. The oldest warning, and Pearl's modern tools for actually crossing the gap.
  • Can models be fair? Algorithmic bias, and whether "fairness" can even be defined consistently.
  • When big data misleads. Simpson's paradox, sampling bias, and the illusion that more data means more truth.
§05

Where to start

A route in

A route in — everything runs from the panel above.

  1. Run Orientation on "what a p-value really means" — most people, including experts, get it wrong.
  2. Use Great Debates on Bayesian vs frequentist thinking.
  3. Practise reading real studies critically with the tutor: what would make this finding wrong?
  4. Read a clear statistics intro alongside a data-ethics source.

The goal isn't to compute — it's to not fool yourself, and to notice when others are fooling you.