Statistics for the Behavioral Sciences

Basic Knowledge

Population and Sample

Parameter and Statistic

Nominal scale, ordinal scale, interval scale, and ratio scale

Frequency Distribution, Proportion and Percentage

Percentile, and Percentile Rank: the percentile rank of a score corresponds to the proportion of the distribution to the left of (at or below) the score in question.

Mean, Median, and Mode

Range, and Inter-quartile range

Sum of the Squared Deviation, Variance, and Standard Deviation


Z-score: Location of scores and standardized distributions

Standardized Distribution (z-scores):    μ = 0, and σ = 1;
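The z-score transformation can be sketched in a few lines; the scores and population parameters below are made-up illustration values:

```python
# Sketch: converting raw scores to z-scores, z = (X - mu) / sigma.
# mu and sigma are hypothetical population parameters.
mu, sigma = 100, 15

def z_score(x, mu, sigma):
    """Location of score x in standard-deviation units from the mean."""
    return (x - mu) / sigma

print(z_score(130, mu, sigma))   # 2.0: two SDs above the mean
print(z_score(85, mu, sigma))    # -1.0: one SD below the mean
```

Transforming every score in a distribution this way produces a standardized distribution with μ = 0 and σ = 1.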

Probability: the likelihood of any particular outcome, expressed as a fraction or proportion of all possible outcomes

Random Sample: must satisfy two requirements

  1. Every individual in the population has an equal chance of being selected;
  2. There must be sampling with replacement
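The two requirements can be illustrated with `random.choices`, which samples with replacement so every individual keeps an equal chance on every draw; the population here is hypothetical:

```python
import random

# Sketch of a simple random sample: random.choices samples WITH
# replacement, so each of the 100 hypothetical individuals has an
# equal chance of being selected on every draw.
population = list(range(1, 101))
random.seed(42)                    # for a reproducible illustration
sample = random.choices(population, k=10)
print(sample)
```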

Normal distribution

Binomial distribution

Distribution of sample means: the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population

Z-score for each M in the distribution of sample means (known population mean and standard deviation): z = (M - μ)/σ_M, where σ_M = σ/√n

Hypothesis Testing: a statistical method that uses sample data to evaluate a hypothesis about a population parameter.

  1. State the NULL hypothesis (H0), and select an alpha level (α): a probability value that is used to define the very unlikely sample outcomes if the NULL hypothesis is true (One-Tailed or Two-Tailed);
  2. Locate the critical region (z_table): extreme sample values that are very unlikely to be obtained if the NULL hypothesis is true. The boundaries for the critical region are determined by the alpha level;
  3. Collect the data, and compute the test statistic z = (M - μ)/σ_M, where σ_M = σ/√n;
  4. Make a decision
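The four steps above can be sketched for a two-tailed z-test; all numbers (μ, σ, M, n) are hypothetical illustration values:

```python
from statistics import NormalDist

# Sketch of the four hypothesis-testing steps for a z-test.
# All numbers (mu, sigma, M, n) are made-up illustration values.
mu, sigma = 50, 10        # Step 1 (H0): the treatment has no effect, mu = 50
n, M = 25, 54             # sample size and observed sample mean
alpha = 0.05              # two-tailed

# Step 2: critical region boundaries from the unit normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96

# Step 3: test statistic z = (M - mu) / sigma_M, with sigma_M = sigma / sqrt(n)
sigma_M = sigma / n ** 0.5
z = (M - mu) / sigma_M

# Step 4: decision
decision = "reject H0" if abs(z) > z_crit else "fail to reject H0"
print(f"z = {z:.2f}, critical = +/-{z_crit:.2f} -> {decision}")
```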

Type I Error: rejecting a true H0, determined by the alpha level (α)

Type II Error: failure to reject a false H0; represented by the symbol beta (β), the probability of a Type II Error.

Effect Size

Power (= 1 - β): The proportion of the treatment distribution located beyond the boundary (critical value) of the critical region. (1) As α increases, power increases; (2) one-tailed power > two-tailed power; (3) as sample size (n) increases, power increases.
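These relationships can be sketched for a z-test by computing the proportion of the treatment distribution beyond the upper critical boundary (the usual textbook simplification for a two-tailed test); all parameter values are hypothetical:

```python
from statistics import NormalDist

# Sketch: power of a z-test as the proportion of the treatment
# distribution located beyond the upper critical boundary.
# mu0, mu_treat, sigma, and n are hypothetical values.
def power(mu0, mu_treat, sigma, n, alpha=0.05):
    sigma_M = sigma / n ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed boundary
    boundary = mu0 + z_crit * sigma_M              # upper boundary for M
    # proportion of the treatment distribution beyond the boundary
    return 1 - NormalDist(mu_treat, sigma_M).cdf(boundary)

print(round(power(50, 54, 10, 25), 3))              # baseline
print(round(power(50, 54, 10, 100), 3))             # larger n -> more power
print(round(power(50, 54, 10, 25, alpha=0.10), 3))  # larger alpha -> more power
```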


t-statistic: used when the population standard deviation σ (or variance σ²) is unknown.
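A minimal sketch of the single-sample t-statistic, estimating the standard error from the sample; the data and H0 value are made up:

```python
from statistics import mean, stdev

# Sketch of the single-sample t-statistic when sigma is unknown.
# The sample data and mu0 are hypothetical.
sample = [53, 57, 49, 58, 54, 56, 52, 55]
mu0 = 50                             # value stated by H0

n = len(sample)
M, s = mean(sample), stdev(sample)   # stdev divides by n - 1 (sample SD)
s_M = s / n ** 0.5                   # estimated standard error
t = (M - mu0) / s_M                  # compare to the t table with df = n - 1
print(f"t = {t:.2f} with df = {n - 1}")
```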

Effect size

t-statistic (Independent-Measures): used when data from two separate samples are used to draw inferences about the mean difference between two populations or between two different treatment conditions

Effect size

Homogeneity of Variance: two or more populations have equal variances
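The independent-measures t-statistic with pooled variance (which assumes homogeneity of variance) can be sketched as follows; the two samples are hypothetical:

```python
from statistics import mean

# Sketch of the independent-measures t-statistic with pooled variance
# (assumes homogeneity of variance). The two samples are made up.
a = [8, 10, 9, 12, 11, 10]
b = [6, 7, 8, 5, 7, 9]

def ss(xs):
    """Sum of squared deviations from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(a), len(b)
df = n1 + n2 - 2
sp2 = (ss(a) + ss(b)) / df                 # pooled variance
s_diff = (sp2 / n1 + sp2 / n2) ** 0.5      # SE of the mean difference
t = (mean(a) - mean(b)) / s_diff
print(f"t = {t:.2f} with df = {df}")
```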

t-statistic (Repeated-Measures or Matched-Subjects Design): removes or reduces individual differences, which in turn lowers sample variability and tends to increase the chances of obtaining a significant result.

Effect size

Estimation: How much treatment effect there is


ANOVA (Independent-Measures): A statistical technique that is used to test for mean differences among two or more treatment conditions

Effect size

Post hoc test

  1. Tukey's HSD
    1. The studentized Range Statistic (q) (TABLE)
  2. Scheffe
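The independent-measures F-ratio described above can be sketched by partitioning variability between and within treatments; the three treatment groups are made-up data:

```python
from statistics import mean

# Sketch of an independent-measures one-way ANOVA F-ratio.
# The three treatment groups are hypothetical illustration data.
groups = [[4, 5, 6, 5], [7, 8, 6, 7], [10, 9, 11, 10]]

scores = [x for g in groups for x in g]
grand = mean(scores)
k, N = len(groups), len(scores)

ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

df_between, df_within = k - 1, N - k
F = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {F:.2f}")
```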

ANOVA (Repeated-Measures): for repeated-measures or matched-subject design. Eliminates the influence of individual differences from the analysis.

Effect size

Post hoc test

  1. Tukey's HSD
  2. Scheffe

ANOVA (two-factor design): Two independent variables A, and B. Independent subjects

Effect size

Post hoc test

  1. Further evaluation

ANOVA (two-factor design and repeated-measures design): Two independent variables A, and B. Repeated-measures or matched-subjects design

Correlation and regression

Correlation: Not a causal relationship, expressed in three parameters

  1. Direction
  2. Form
  3. Degree

Pearson Correlation

Spearman Correlation: Both variables are measured on ordinal scales

Point-Biserial correlation: One of the two variables is dichotomous

Phi-Coefficient: Both variables are dichotomous

Regression (Least-squares method)
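Both the Pearson correlation and the least-squares line come from the same building blocks (SP, SS_X, SS_Y); a minimal sketch with hypothetical paired scores:

```python
from statistics import mean

# Sketch of the Pearson correlation r = SP / sqrt(SS_X * SS_Y) and the
# least-squares regression line Y-hat = bX + a, where b = SP / SS_X and
# a = M_Y - b * M_X. X and Y are hypothetical paired scores.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

mx, my = mean(X), mean(Y)
sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of products
ss_x = sum((x - mx) ** 2 for x in X)
ss_y = sum((y - my) ** 2 for y in Y)

r = sp / (ss_x * ss_y) ** 0.5     # Pearson correlation
b = sp / ss_x                     # least-squares slope
a = my - b * mx                   # intercept
print(f"r = {r:.2f}; Y-hat = {b:.2f}X + {a:.2f}")
```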

Nonparametric technique

Chi-Square tests: nonparametric techniques that test hypotheses about the form of the entire frequency distribution; each observed frequency reflects a different individual, and no individual can produce a response classified in more than one category (or contribute more than one frequency to a category)

  1. Test for goodness of fit: compares the frequency distribution for a sample to the frequency distribution (fe) that is predicted by H0
  2. Test for independence: assesses the relationship between two variables, or tests whether two sample distributions share a common median
  3. Chi-Square Distribution (TABLE)
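The goodness-of-fit statistic chi² = Σ(fo − fe)²/fe can be sketched directly; the observed frequencies are hypothetical, and H0 here predicts equal preference across four categories:

```python
# Sketch of the chi-square test for goodness of fit:
# chi2 = sum((fo - fe)^2 / fe). Observed frequencies are made up;
# H0 predicts no preference (equal fe in every category).
fo = [19, 11, 13, 17]                 # observed frequencies
n = sum(fo)
fe = [n / len(fo)] * len(fo)          # expected frequencies under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(fo, fe))
df = len(fo) - 1                      # compare to the chi-square table
print(f"chi-square({df}) = {chi2:.2f}")
```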

Effect size for chi-square test

Sign test: evaluated with the binomial test

Mann-Whitney test (U): nonparametric alternative to the independent-measures t-statistic. A small value of U (near zero) is evidence of a difference between the two treatments.

Wilcoxon test (T): nonparametric alternative to the repeated-measures t-statistic. A small value of T (near zero) provides evidence of a difference.

  1. Compute a difference score for each individual;
  2. Rank the absolute values of the difference scores;
  3. Sum the ranks separately for the positive (P) and negative (N) difference scores; zero difference scores are excluded, or divided equally between P and N;
  4. Critical Values of T for the Wilcoxon Signed-Ranks Test (TABLE);
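The steps above can be sketched as follows; the difference scores are hypothetical, and tied absolute values share the average of the ranks they span:

```python
# Sketch of the Wilcoxon signed-ranks steps. Difference scores are
# hypothetical; tied |differences| share the average of their ranks.
diffs = [4, -2, 6, 1, -3, 5, 7, 2]        # one difference score per individual
nonzero = [d for d in diffs if d != 0]    # zero differences excluded here

# Step 2: rank the absolute values (average rank for ties)
by_abs = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
ranks = [0.0] * len(nonzero)
i = 0
while i < len(by_abs):
    j = i
    while (j + 1 < len(by_abs)
           and abs(nonzero[by_abs[j + 1]]) == abs(nonzero[by_abs[i]])):
        j += 1
    avg = (i + j + 2) / 2                 # ranks are 1-based
    for k in range(i, j + 1):
        ranks[by_abs[k]] = avg
    i = j + 1

# Step 3: sum the ranks separately by sign
pos = sum(r for d, r in zip(nonzero, ranks) if d > 0)
neg = sum(r for d, r in zip(nonzero, ranks) if d < 0)
T = min(pos, neg)                         # Step 4: compare to the table value
print(f"sum of positive ranks = {pos}, negative = {neg}, T = {T}")
```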

Kruskal-Wallis test: nonparametric alternative to the single-factor ANOVA. The test statistic, H, is equivalent to a chi-square statistic with degrees of freedom equal to the number of treatment conditions minus one.

  1. Combine and rank all scores across treatment conditions;
  2. Sum the ranks for each treatment (T)
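The two steps above feed the standard H formula, H = 12/(N(N+1)) · Σ(T²/n) − 3(N+1); the three treatment groups are hypothetical and contain no tied scores:

```python
# Sketch of the Kruskal-Wallis H statistic following the two steps above.
# Three hypothetical treatment groups; this example has no tied scores.
groups = [[3, 5, 8], [6, 9, 12], [14, 16, 18]]

scores = sorted(x for g in groups for x in g)
rank = {x: i + 1 for i, x in enumerate(scores)}   # 1-based ranks (no ties)
N = len(scores)

# sum of the ranks (T) within each treatment
T = [sum(rank[x] for x in g) for g in groups]
H = (12 / (N * (N + 1))
     * sum(t ** 2 / len(g) for t, g in zip(T, groups))
     - 3 * (N + 1))
print(f"H = {H:.2f} with df = {len(groups) - 1}")
```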

Extra Information

Relationship (or conversion)