MAT-144 · Mathematical Reasoning
Topic 05 · Statistics
Vocabulary & key terms
Every term defined across this topic, grouped by lesson.
30 terms in this topic. Skim before the review.
- Data set
- The collection of numbers we actually have in hand. 10 weights, 12 class sizes, 7 dog weights, 100 test scores — each is a data set.
- Population
- The full group we'd ideally like to know about. "All American adult voters," "all households in Phoenix," "all 11th graders." Usually too big to measure directly.
- Sample
- A subset of the population that we can measure. A poll of 1,000 voters is a sample from the population of all voters. The size of the sample drives how confident we can be in inferential claims.
- Descriptive claim
- A claim about the data set itself. "The median weight is 162 pounds." Exact for this data, no margin of error.
- Inferential claim
- A claim that extrapolates from the sample to the population. "The median weight of all 11th-grade boys in the US is about 162 pounds, ±5." Always has a margin of error.
- Mean (arithmetic average)
- mean = (x₁ + x₂ + … + xₙ) / n. The bar atop a variable, x̄, is the standard symbol for sample mean. Round when the answer isn't a whole number; ALEKS typically asks for one decimal.
- Median
- Sort the data first. With odd n, the median is the single middle value. With even n, average the two middle values. Don't forget to sort before reading off the middle.
- Mode
- Count how often each value appears. The most frequent value (or values) is the mode. A data set with two tied highest frequencies is bimodal; one with no repeats has no mode.
- "No mode"
- When every value in the data set appears exactly once, the mode is undefined. ALEKS gives you a "No mode" button for this case; do not type a number.
- Bimodal / multimodal
- Two values tied for most frequent → bimodal; three or more → multimodal. List all the modes, separated by commas, in the order they appear in the data set.
- Range
- Maximum value minus minimum value. Always non-negative. ALEKS Q4 is a direct range calculation: max − min, in whatever units the data carries (pounds, dollars, hours).
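The four measures above (mean, median, mode, range) can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the `weights` list and the helper names `modes_or_none` and `data_range` are made up for illustration and are not part of ALEKS or the course materials.

```python
from statistics import mean, median, multimode

# Hypothetical data set (weights in pounds), invented for illustration.
weights = [150, 162, 158, 171, 162, 149, 180]

def modes_or_none(data):
    """Return the modes in order of first appearance, or [] for "No mode"."""
    # If every value appears exactly once, the glossary treats the mode as undefined.
    if len(set(data)) == len(data):
        return []
    # multimode lists every value tied for most frequent, in first-seen order.
    return multimode(data)

def data_range(data):
    """Range = maximum value minus minimum value."""
    return max(data) - min(data)

print("mean:  ", round(mean(weights), 1))   # rounded to one decimal
print("median:", median(weights))           # median() sorts internally
print("mode:  ", modes_or_none(weights))
print("range: ", data_range(weights))
```

For this list the sum is 1132 over 7 values, so the mean rounds to 161.7; the sorted middle value is 162; 162 is the only repeated value; and the range is 180 − 149 = 31.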
- Standard deviation (SD or σ)
- The typical (root-mean-square) distance from the mean. Has the same units as the data. A data set in pounds has an SD in pounds; a data set in dollars has an SD in dollars.
- Sample SD vs population SD
- Sample SD divides the sum of squared deviations by n − 1; population SD divides by n. Most real data sets are samples, so the sample version is the default. Excel: =STDEV.S() (sample) vs =STDEV.P() (population).
- Variance
- The SD without the final square root. Variance has units of data² (pounds squared, dollars squared), which is why the SD is usually reported instead — its units match the data.
- Outlier
- A value far from the rest of the data. An outlier pulls the mean toward itself and inflates the SD (both are sensitive). The median barely moves. The range is also sensitive, because an extreme value is usually the new max or min.
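The difference between the sample and population formulas, and the effect of a single outlier, can be seen in a short Python sketch. It uses the standard `statistics` module, where `stdev` divides by n − 1 and `pstdev` divides by n (the counterparts of Excel's STDEV.S and STDEV.P). The data values are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean, median, pstdev, stdev, variance

# Hypothetical data set, invented for illustration (units: pounds).
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = stdev(data)           # sum of squared deviations / (n - 1), then sqrt
population_sd = pstdev(data)      # sum of squared deviations / n, then sqrt
sample_variance = variance(data)  # sample SD squared, in pounds squared

print("sample SD:      ", round(sample_sd, 2))
print("population SD:  ", round(population_sd, 2))
print("sample variance:", round(sample_variance, 2))

# Add one extreme value and compare the sensitive and resistant measures.
with_outlier = data + [50]
print("mean:  ", mean(data), "->", mean(with_outlier))
print("median:", median(data), "->", median(with_outlier))
print("SD:    ", round(sample_sd, 2), "->", round(stdev(with_outlier), 2))
```

Here the squared deviations from the mean of 5 sum to 32, so the population SD is √(32/8) = 2 and the sample SD is √(32/7) ≈ 2.14. Adding the value 50 moves the mean from 5 to 10 and the SD from about 2.14 to about 15.13, while the median only moves from 4.5 to 5.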
- Category (bar chart)
- A discrete label on the x-axis: "Monday," "Coffee," "Canada." Bar charts arrange one bar per category, with visible gaps between bars to signal that the x-axis is not numeric.
- Bin (histogram)
- An interval on a numeric x-axis. Values in the data set are tallied into the bin they fall inside, and bin counts become bar heights. Bins touch each other because the underlying axis is continuous.
- Five-number summary
- Min, Q1 (25th percentile), median (50th percentile), Q3 (75th percentile), max. These five numbers are what a box plot displays in a single picture.
- Quartile (Q1, Q3)
- The 25th and 75th percentile values. Q1 separates the lowest quarter of the data from the rest; Q3 separates the highest quarter. The middle 50% of the data sits inside the box of a box plot, between Q1 and Q3.
- Skew
- Asymmetry in the shape of a distribution. A right-skewed (positive) distribution has a longer tail to the right; a left-skewed (negative) distribution has a longer tail to the left. Skew shows up as a lopsided shape in a histogram or an off-center median inside a box plot's box.
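The five-number summary and quartiles defined above can be computed with a short function. Quartile conventions differ between textbooks and software, so treat this as a sketch under one stated assumption: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when n is odd. The function name `five_number_summary` and the sample data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention.

    Assumption: for odd n the middle value belongs to neither half.
    Needs at least two values so that both halves are non-empty.
    """
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # Skip the middle value when n is odd; split evenly when n is even.
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

# Hypothetical data sets, invented for illustration.
print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```

Other tools, for example Python's own `statistics.quantiles` or a spreadsheet's quartile function, interpolate differently and may give slightly different Q1 and Q3 values for the same data.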
- Normal distribution
- A symmetric bell-shaped distribution. Identified visually by its single peak at the mean and tails that taper smoothly on both sides. Also called a Gaussian distribution.
- Mean (μ)
- Greek letter mu. The center of a normal distribution — the value under the highest point of the curve. ALEKS Q5 labels this point V.
- Standard deviation (σ)
- Greek letter sigma. Sets the spread (width) of the bell. A larger σ makes the curve wider and shorter; a smaller σ makes it narrower and taller. Computed in Lesson 3.
- Empirical rule (68-95-99.7)
- About 68% of values lie within 1σ of the mean, about 95% within 2σ, and about 99.7% within 3σ. Memorize these three percentages; the rest of Q5 is arithmetic.
- U and W (ALEKS notation)
- Lower and upper boundary points equidistant from the mean. U = μ − kσ, W = μ + kσ for some k (1, 2, or 3 depending on the shaded region in the figure). V is always the mean μ.
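The empirical rule and the U, V, W labels reduce to a little arithmetic, sketched below in Python. The numbers μ = 100 and σ = 15 are hypothetical, and the helper names `boundaries` and `percent_in_one_tail` are invented for this example. The tail calculation relies only on the symmetry of the normal curve: whatever lies outside the central region is split evenly between the two tails.

```python
# Approximate percentage of values within k standard deviations of the mean.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def boundaries(mu, sigma, k):
    """Return (U, V, W) = (mu - k*sigma, mu, mu + k*sigma)."""
    return (mu - k * sigma, mu, mu + k * sigma)

def percent_in_one_tail(k):
    """Approximate percentage below U (or, by symmetry, above W)."""
    return (100 - EMPIRICAL_RULE[k]) / 2

# Hypothetical normal distribution: mean 100, standard deviation 15.
mu, sigma = 100, 15
for k in (1, 2, 3):
    u, v, w = boundaries(mu, sigma, k)
    print(f"k={k}: U={u}, V={v}, W={w}, "
          f"about {EMPIRICAL_RULE[k]}% between U and W, "
          f"about {percent_in_one_tail(k):.2f}% in each tail")
```

With k = 2, for example, U = 70 and W = 130, about 95% of values fall between them, and about 2.5% fall in each tail.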
- Sample size (n)
- The number of people (or items) actually interviewed or measured. Larger n is more expensive to collect but produces a tighter margin of error (MOE). National political polls typically use n between 600 and 1,500.
- Sample proportion (p̂)
- Pronounced "p-hat." The proportion of the sample that gave a particular answer (e.g., 0.45 = 45% favoring the candidate). Our best estimate of the population proportion p, but not the same number.
- Margin of error (MOE)
- Half the width of the confidence interval. Rule of thumb at 95% confidence: MOE ≈ 1 / √n, or as percentage points ≈ 100 / √n %. Reported alongside the sample proportion in any reputable poll.
- Confidence interval (CI)
- The range sample proportion ± MOE. For "45% ± 3%," the CI is 42% to 48%. Always reported with a confidence level (95% is the convention).
- Confidence level
- How often the confidence interval would capture the true population value if we repeated the poll many times, building a new interval from each new sample. 95% is standard; higher confidence (99%) gives a wider interval, lower confidence (90%) gives a narrower one.
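The margin-of-error rule of thumb and the resulting confidence interval can be sketched in Python as follows. The code uses only the approximation MOE ≈ 1 / √n at 95% confidence from the definitions above, not an exact formula. The function names and the poll numbers (n = 400, p̂ = 0.45) are hypothetical.

```python
from math import sqrt

def margin_of_error(n):
    """Rule-of-thumb 95% margin of error as a proportion: 1 / sqrt(n)."""
    return 1 / sqrt(n)

def confidence_interval(p_hat, n):
    """Return (low, high) = sample proportion minus and plus the MOE."""
    moe = margin_of_error(n)
    return (p_hat - moe, p_hat + moe)

# Hypothetical poll: 400 respondents, 45% favor the candidate.
n, p_hat = 400, 0.45
low, high = confidence_interval(p_hat, n)
print(f"MOE: about {100 * margin_of_error(n):.1f} percentage points")
print(f"95% CI: about {100 * low:.1f}% to {100 * high:.1f}%")

# Larger samples tighten the margin of error, but slowly.
for size in (100, 400, 1600):
    print(size, "->", round(100 * margin_of_error(size), 1), "points")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error: 100 respondents give about 10 points, 400 give about 5, and 1,600 give about 2.5.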