Methodology

How a score becomes a measurement.

Every figure on a LabTest IQ result page is computed from a published formula. This page is the receipt: the math behind the test, with the constants and approximations spelled out.

The instrument

Thirty hand-authored items, evenly distributed across four reasoning domains: pattern recognition, numerical reasoning, verbal analogy, and spatial transformation. Each item is reviewed for clarity, single-answer correctness, and absence of cultural or linguistic confounds. Spatial and pattern items are language-neutral; verbal items remain in English to preserve calibration.

Difficulty calibration

Items are assigned to one of three difficulty tiers (1, 2, 3) at authoring time, then revised against observed accuracy distributions. Tier-3 items are answered correctly by fewer than 35% of test-takers and tier-1 items by more than 75%; tier-2 items fall in between. Difficulty determines the weight a correct answer contributes to the raw total.
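The tiering and weighting rule can be sketched as follows. This is a minimal illustration, not the code in lib/scoring.ts: the names tierFromAccuracy and rawTotal are hypothetical, and the assumption that a correct answer is worth its tier number (1, 2 or 3 points) is ours. The text says only that difficulty determines the weight.

```typescript
type Tier = 1 | 2 | 3;

interface ItemResult {
  tier: Tier;       // difficulty tier assigned to the item
  correct: boolean; // whether the test-taker answered it correctly
}

// Re-tier an item from its observed accuracy (fraction correct, 0..1).
// Thresholds follow the text: below 35% is tier 3, above 75% is tier 1.
// Treating everything in between as tier 2 is an inference, not a stated rule.
function tierFromAccuracy(observedAccuracy: number): Tier {
  if (observedAccuracy < 0.35) return 3;
  if (observedAccuracy > 0.75) return 1;
  return 2;
}

// ASSUMPTION: a correct answer contributes its tier number as weight.
// The real weights may differ.
function rawTotal(results: ItemResult[]): number {
  let total = 0;
  for (const r of results) {
    if (r.correct) total += r.tier;
  }
  return total;
}
```

Under that assumption, one correct answer in each tier adds 1 + 2 + 3 = 6 points to the raw total, and an incorrect answer adds nothing.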

Raw score → IQ

The raw total is mapped onto the standardised IQ scale via a piecewise-linear function anchored at the theoretical maximum of the item bank. The mapping is published in lib/scoring.ts. We use a piecewise-linear curve rather than a polynomial fit so that local slope corresponds to the difficulty composition of the items between two anchor points — this keeps the conversion legible and the score robust to small changes in the item bank.
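A piecewise-linear mapping of this kind can be sketched in a few lines. The anchor values below are hypothetical and chosen only for illustration; the published anchors are the ones in lib/scoring.ts. The function name rawToIq and the clamping behaviour at both ends are also assumptions.

```typescript
// [rawTotal, iq] pairs, sorted by rawTotal in ascending order.
type Anchor = readonly [number, number];

// HYPOTHETICAL anchors, for illustration only.
// 75 is the theoretical maximum used in the worked example on this page.
const EXAMPLE_ANCHORS: readonly Anchor[] = [
  [0, 70],
  [30, 100],
  [60, 125],
  [75, 145],
];

function rawToIq(raw: number, anchors: readonly Anchor[] = EXAMPLE_ANCHORS): number {
  if (anchors.length < 2) {
    throw new Error("need at least two anchors");
  }
  const first = anchors[0];
  const last = anchors[anchors.length - 1];

  // ASSUMPTION: scores outside the anchored range are clamped to the end anchors.
  if (raw <= first[0]) return first[1];
  if (raw >= last[0]) return last[1];

  // Find the segment containing `raw` and interpolate linearly within it.
  for (let i = 1; i < anchors.length; i++) {
    const [x0, y0] = anchors[i - 1];
    const [x1, y1] = anchors[i];
    if (raw <= x1) {
      return y0 + ((raw - x0) / (x1 - x0)) * (y1 - y0);
    }
  }
  return last[1]; // not reached: raw < last[0] is handled by the loop
}
```

With these illustrative anchors, a raw total of 54 falls on the segment from (30, 100) to (60, 125) and maps to 100 + (24 / 30) × 25 = 120. Because each segment is fitted independently, moving one anchor changes only the two segments that touch it.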

The standardised scale

IQ scores are reported on the Wechsler-convention scale: mean 100, standard deviation 15. This is the same scale used in clinical psychology since Wechsler (1939). Reporting on this scale lets a LabTest IQ score be compared like-for-like with scores from validated instruments — though clinical interpretation still requires a licensed practitioner.

Percentile

Your percentile is the standard normal cumulative distribution function evaluated at your z-score: percentile = Φ((IQ − 100) / 15) × 100. So an IQ of 115 (z = 1) is the 84.13th percentile; an IQ of 130 (z = 2) is the 97.72nd. We compute Φ from the Abramowitz & Stegun (1965) erf approximation 7.1.26, whose maximum absolute error is 1.5 × 10⁻⁷, comfortably below any rounding the report exposes.
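For readers who want to check the numbers, here is a minimal sketch of the percentile computation. The coefficients are those of Abramowitz & Stegun formula 7.1.26; the function names (erf, normalCdf, percentileFromIq) are illustrative and not necessarily those used in the codebase.

```typescript
// Abramowitz & Stegun 7.1.26:
//   erf(x) ≈ 1 − (a1·t + a2·t² + a3·t³ + a4·t⁴ + a5·t⁵)·exp(−x²),  t = 1 / (1 + p·x),  x ≥ 0
// Negative arguments use the odd symmetry erf(−x) = −erf(x).
function erf(x: number): number {
  const p = 0.3275911;
  const a1 = 0.254829592;
  const a2 = -0.284496736;
  const a3 = 1.421413741;
  const a4 = -1.453152027;
  const a5 = 1.061405429;

  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + p * ax);
  // Horner evaluation of a1·t + a2·t² + ... + a5·t⁵
  const poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal CDF: Φ(z) = ½ · (1 + erf(z / √2))
function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// percentile = Φ((IQ − mean) / sd) × 100, on the mean-100, SD-15 scale by default.
function percentileFromIq(iq: number, mean: number = 100, sd: number = 15): number {
  return normalCdf((iq - mean) / sd) * 100;
}
```

Rounded to two decimals, percentileFromIq(115) gives 84.13 and percentileFromIq(130) gives 97.72, matching the figures quoted above.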

Bands

Bands are conventional ranges used in clinical reporting and retained here for continuity: Average (85–115), Above Average (115–130), High (130–140), Very High (140–150), Exceptional (≥ 150). They summarise; they do not diagnose.
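A band lookup over these ranges might look like the sketch below. Two things are assumed because the ranges above do not settle them: shared boundaries (115, 130, 140, 150) are assigned to the higher band, and scores below 85 return null since no band is listed for them. The names BANDS and bandFor are hypothetical.

```typescript
interface Band {
  label: string;
  min: number; // lower bound, treated as inclusive (assumption)
  max: number; // upper bound, treated as exclusive (assumption)
}

const BANDS: readonly Band[] = [
  { label: "Average", min: 85, max: 115 },
  { label: "Above Average", min: 115, max: 130 },
  { label: "High", min: 130, max: 140 },
  { label: "Very High", min: 140, max: 150 },
  { label: "Exceptional", min: 150, max: Infinity },
];

// Returns the band label, or null when the score is below the lowest listed band.
function bandFor(iq: number): string | null {
  for (const band of BANDS) {
    if (iq >= band.min && iq < band.max) {
      return band.label;
    }
  }
  return null;
}
```

Under these assumptions an IQ of 120 falls in Above Average, and an IQ of exactly 130 falls in High rather than Above Average.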

Per-domain norms

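Per-domain sub-scores are computed by z-scoring the user's domain accuracy against the live cohort: domain_iq = 100 + 15 × ((acc − cohort_mean) / cohort_sd), where acc is the user's accuracy in that domain and cohort_mean and cohort_sd are the mean and standard deviation of accuracy across cohort attempts in the same domain. Cohort statistics are only surfaced once at least 30 attempts contribute to a domain; below that threshold the domain's bar is shown but no comparative claim is made.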
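The norming rule and its 30-attempt threshold can be sketched as follows. This is an illustration under stated assumptions, not the production code: the function name domainIq is hypothetical, the cohort is passed in as a plain array of accuracies, and the standard deviation is the population form (dividing by n), which the text does not specify.

```typescript
const MIN_COHORT_ATTEMPTS = 30; // threshold stated in the text

// acc: the user's accuracy in one domain (0..1).
// cohortAccuracies: one accuracy value per cohort attempt in the same domain.
// Returns null when no comparative claim should be made.
function domainIq(acc: number, cohortAccuracies: readonly number[]): number | null {
  const n = cohortAccuracies.length;
  if (n < MIN_COHORT_ATTEMPTS) return null;

  let sum = 0;
  for (const a of cohortAccuracies) sum += a;
  const mean = sum / n;

  let squaredDeviations = 0;
  for (const a of cohortAccuracies) squaredDeviations += (a - mean) * (a - mean);
  // ASSUMPTION: population standard deviation (divide by n, not n − 1).
  const sd = Math.sqrt(squaredDeviations / n);

  // A cohort with zero spread has no defined z-score; treated here as "no claim".
  if (sd === 0) return null;

  return 100 + 15 * ((acc - mean) / sd);
}
```

For a cohort with mean accuracy 0.5 and standard deviation 0.1, a user accuracy of 0.68 gives z = 1.8 and a sub-IQ of 127.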

Worked example

A user answers 22 of 30 items correctly across mixed difficulties, for a raw total of 54 out of 75. The piecewise-linear mapping returns IQ = 120, so z = (120 − 100) / 15 = 1.333 and Φ(1.333) = 0.9088, giving a percentile of 90.88. Band: Above Average. If their cohort-normed Pattern accuracy has a z-score of 1.8, their Pattern sub-IQ is 100 + 15 × 1.8 = 127 (96th percentile within domain).

What this is not

LabTest IQ is a screening instrument, not a clinical assessment. Diagnostic intelligence testing requires individualised administration by a licensed psychologist using validated, copyrighted instruments such as the WAIS-IV. Use this for self-evaluation and training; use a clinical instrument for any decision that matters.

Methodology FAQ

Is this clinically valid?

LabTest IQ is calibrated against the same scale (μ=100, σ=15) used in clinical instruments, but it is not itself a clinically validated instrument. Treat the result as a careful screening estimate, not a diagnosis.

Why μ=100 and σ=15 specifically?

Convention from Wechsler (1939). Reporting on this scale makes the score interoperable with the entire clinical literature — anyone comparing your number to a WAIS-IV result is reading the same units.

Why a piecewise-linear curve and not a polynomial fit?

Piecewise-linear keeps local slope tied to local item difficulty. A polynomial would smear across the bank and make small bank changes silently shift the whole curve. Piecewise is auditable; polynomial is not.

Why the erf approximation and not a lookup table?

The Abramowitz & Stegun 7.1.26 approximation is closed-form, runs in microseconds, and is accurate to 1.5 × 10⁻⁷. A lookup table is no faster and adds a binary asset to ship.

Can I retake the test?

Yes. Note that repeated exposure to the same item bank typically inflates scores on later attempts. We track form variants (A, B) to reduce overlap.

