The p-value is a crucial concept in hypothesis testing used in data science, statistics, and machine learning. This tutorial explains what a p-value is, how to interpret it, and how to calculate it using Python.
1. What Is a P-Value?
A p-value is the probability of obtaining test results at least as extreme as the observed data, assuming that the null hypothesis (\( H_0 \)) is true.
- It helps judge whether the observed result is consistent with random variation under \( H_0 \) or is statistically significant.
- A smaller p-value indicates stronger evidence against the null hypothesis.
Example Meaning of P-Value:
- If p = 0.05, results at least as extreme as the observed ones would occur about 5% of the time if \( H_0 \) were true.
- If p = 0.01, results at least as extreme would occur about 1% of the time if \( H_0 \) were true.
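The definition can be checked by simulation. The sketch below uses a hypothetical coin-flip experiment (60 heads in 100 flips, chosen only for illustration and not part of this tutorial's data): it simulates many experiments in a world where \( H_0 \) (a fair coin) is true and counts how often the result is at least as extreme as the observed one. The variable names and the number of simulations are assumptions.

```python
import numpy as np
from scipy import stats

# Hypothetical experiment (illustration only): 60 heads in 100 coin flips.
n_flips = 100
observed_heads = 60

# H0: the coin is fair. Simulate many experiments in a world where H0 is true.
rng = np.random.default_rng(seed=0)
simulated_heads = rng.binomial(n=n_flips, p=0.5, size=200_000)

# One-sided p-value: share of simulated results at least as extreme as observed.
p_simulated = np.mean(simulated_heads >= observed_heads)

# Exact value from the binomial distribution: P(X >= 60) = P(X > 59).
p_exact = stats.binom.sf(observed_heads - 1, n_flips, 0.5)

print(f"Simulated p-value: {p_simulated:.4f}")
print(f"Exact p-value:     {p_exact:.4f}")
```

The two numbers should agree closely, which shows that the p-value describes how often such data arise under \( H_0 \), not how likely \( H_0 \) itself is.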
2. Hypothesis Testing & P-Value
In hypothesis testing, we set up two hypotheses:
- Null Hypothesis (\( H_0 \)): Assumes there is no effect or no difference.
- Alternative Hypothesis (\( H_a \)): Assumes there is an effect or a difference.
Then, we calculate the p-value to determine whether to reject \( H_0 \) or fail to reject \( H_0 \).
3. How to Interpret the P-Value?
P-Value | Interpretation |
---|---|
p > 0.05 | Fail to reject \( H_0 \) (Not statistically significant) |
p ≤ 0.05 | Reject \( H_0 \) (Statistically significant) |
p ≤ 0.01 | Strong evidence to reject \( H_0 \) (Highly significant) |
p ≤ 0.001 | Very strong evidence to reject \( H_0 \) (Extremely significant) |
Example Interpretation:
- If p > 0.05: Fail to reject \( H_0 \) (No strong evidence to support \( H_a \)).
- If p ≤ 0.05: Reject \( H_0 \) (Significant evidence supporting \( H_a \)).
- If p is far below 0.01: Reject \( H_0 \) (Very strong evidence for \( H_a \)).
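The interpretation rules can be written as a small helper. This is only a sketch: the function name `interpret_p_value`, the 0.001 cutoff for the strongest tier, and the example p-values are assumptions made for illustration.

```python
def interpret_p_value(p: float) -> str:
    """Map a p-value to a conventional significance label."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("A p-value must lie between 0 and 1.")
    if p <= 0.001:  # assumed cutoff for the strongest tier
        return "Reject H0 (extremely significant)"
    if p <= 0.01:
        return "Reject H0 (highly significant)"
    if p <= 0.05:
        return "Reject H0 (statistically significant)"
    return "Fail to reject H0 (not statistically significant)"


# Illustrative p-values, one per tier.
for p in (0.20, 0.03, 0.004, 0.0002):
    print(f"p = {p}: {interpret_p_value(p)}")
```

The tiers are checked from strictest to loosest so that each p-value receives the strongest label it qualifies for.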
4. Example: P-Value in Action
Scenario:
A company claims that the average delivery time of their service is less than or equal to 30 minutes. A researcher collects a sample of 10 deliveries and finds an average delivery time of 29.2 minutes.
We conduct a hypothesis test:
- \( H_0 \): The mean delivery time is 30 minutes (\( \mu = 30 \))
- \( H_a \): The mean delivery time is less than 30 minutes (\( \mu < 30 \))
We will perform a one-sample t-test using Python.
5. P-Value Calculation in Python
```python
import numpy as np
from scipy import stats

# Sample data
data = np.array([28, 32, 29, 31, 27, 30, 28, 29, 30, 28])

# Null hypothesis: Mean delivery time is 30 minutes
mu_0 = 30

# Perform a one-sample t-test (returns a two-tailed p-value by default)
t_stat, p_value = stats.ttest_1samp(data, mu_0)

# Since we are testing if mean is "less than" 30, use one-tailed p-value.
# Halving is valid only when t_stat is negative (the direction H_a predicts).
p_value_one_tailed = p_value / 2 if t_stat < 0 else 1 - p_value / 2

# Print results
print(f"T-Statistic: {t_stat:.4f}")
print(f"P-Value (One-Tailed): {p_value_one_tailed:.4f}")

# Decision
alpha = 0.05  # 5% significance level
if p_value_one_tailed < alpha:
    print("Reject the null hypothesis: The average delivery time is significantly less than 30 minutes.")
else:
    print("Fail to reject the null hypothesis: No significant evidence that delivery time is less than 30 minutes.")
```
6. Explanation of the Code
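- We use `stats.ttest_1samp(data, mu_0)` to perform a one-sample t-test. By default it returns a two-tailed p-value.
- We divide the p-value by 2 to get the one-tailed p-value (because we are testing for “less than”). This shortcut is valid only when the t-statistic is negative, that is, when the sample mean falls on the side predicted by \( H_a \).
- If p ≤ 0.05, we reject \( H_0 \) (significant result).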
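As an alternative to halving by hand, SciPy 1.6 and newer accept an `alternative` argument in `ttest_1samp`, which requests the one-tailed test directly. The sketch below compares both routes on the same data. The variable names are illustrative, and the example assumes a SciPy version that supports this argument.

```python
import numpy as np
from scipy import stats

data = np.array([28, 32, 29, 31, 27, 30, 28, 29, 30, 28])
mu_0 = 30

# Two-sided test (the default behaviour).
t_stat, p_two_sided = stats.ttest_1samp(data, mu_0)

# One-sided test requested directly; needs SciPy 1.6 or newer.
t_less, p_less = stats.ttest_1samp(data, mu_0, alternative="less")

# Halving matches only when the t-statistic points the way H_a predicts.
p_halved = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2

print(f"Halved two-sided p-value: {p_halved:.4f}")
print(f"alternative='less':       {p_less:.4f}")
```

Passing `alternative="less"` removes the need to check the sign of the t-statistic manually, so it is usually the safer choice when it is available.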
7. Sample Output
```
T-Statistic: -1.6330
P-Value (One-Tailed): 0.0685
Fail to reject the null hypothesis: No significant evidence that delivery time is less than 30 minutes.
```
Since p ≈ 0.0685 > 0.05, we fail to reject \( H_0 \), meaning we do not have enough evidence that the mean delivery time is less than 30 minutes.
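To see where these numbers come from, the test can be reproduced by hand. For a one-sample t-test the statistic is \( t = (\bar{x} - \mu_0) / (s / \sqrt{n}) \), and the one-tailed p-value is the left-tail probability of a t-distribution with \( n - 1 \) degrees of freedom. The following is a minimal sketch of that calculation; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

data = np.array([28, 32, 29, 31, 27, 30, 28, 29, 30, 28])
mu_0 = 30

n = data.size
sample_mean = data.mean()
sample_sd = data.std(ddof=1)  # ddof=1 gives the sample standard deviation
standard_error = sample_sd / np.sqrt(n)

# t = (sample mean - hypothesized mean) / standard error
t_manual = (sample_mean - mu_0) / standard_error

# Left-tail probability of a t-distribution with n - 1 degrees of freedom
p_manual = stats.t.cdf(t_manual, df=n - 1)

print(f"Sample mean: {sample_mean:.2f}, sample SD: {sample_sd:.4f}")
print(f"t = {t_manual:.4f}, one-tailed p = {p_manual:.4f}")
```

Running this alongside `stats.ttest_1samp` is a quick way to confirm that the library call and the textbook formula agree.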
8. Common Mistakes with P-Values
- P-Value is NOT the Probability That \( H_0 \) is True
  - It only measures how likely data at least as extreme as ours would be if \( H_0 \) were true.
  - A small p-value does not prove that \( H_0 \) is false, only that the observed data would be unlikely if \( H_0 \) were true.
- P-Value Does NOT Measure Effect Size
  - A small p-value does not tell you how large the effect is.
  - Use confidence intervals or Cohen’s d to measure effect size.
- P-Value Can Be Influenced by Sample Size
  - Large sample sizes can produce small p-values even if the effect is not practically significant.
  - Small sample sizes may not give a small p-value even if there is a real effect.
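The last two points can be illustrated together. For a one-sample t-test, Cohen's d is \( d = (\bar{x} - \mu_0) / s \), so the t-statistic equals \( d \sqrt{n} \). The sketch below holds a hypothetical tiny effect fixed (d = 0.05, an assumed value) and varies only the sample size, then computes Cohen's d for the delivery data. The helper name `two_sided_p` and the chosen sample sizes are illustrative.

```python
import numpy as np
from scipy import stats


def two_sided_p(cohens_d: float, n: int) -> float:
    """Two-sided one-sample t-test p-value for a given effect size and sample size."""
    t_stat = cohens_d * np.sqrt(n)  # t = d * sqrt(n) for a one-sample test
    return 2 * stats.t.sf(abs(t_stat), df=n - 1)


# Hypothetical tiny effect: the sample mean sits 0.05 standard deviations from mu_0.
small_effect = 0.05
for n in (20, 1_000, 10_000):
    print(f"d = {small_effect}, n = {n:>6}: p = {two_sided_p(small_effect, n):.2e}")

# Effect size for the delivery-time sample used in this tutorial.
data = np.array([28, 32, 29, 31, 27, 30, 28, 29, 30, 28])
d_delivery = (data.mean() - 30) / data.std(ddof=1)
print(f"Cohen's d for the delivery data: {d_delivery:.2f}")
```

The effect size stays the same in every row of the loop, yet the p-value shrinks as n grows, which is why a p-value should be reported together with an effect size.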
9. Summary
- The p-value tells us how likely results at least as extreme as ours are under the null hypothesis.
- If p ≤ 0.05, we reject \( H_0 \) (statistically significant result).
- If p > 0.05, we fail to reject \( H_0 \) (not significant).
- P-values do not measure the probability that \( H_0 \) is true or the size of an effect.