Two Sample T-Test and Z-Test

Last Updated: 10th October, 2023

In statistics, hypothesis testing is a common method for deciding whether sample data provide enough evidence of a real difference, for example between a sample and a known population or between two groups. Two commonly used tests are the two-sample t-test and the z-test. In this lesson, we will explore the differences between the two tests and how they are used in practice.

Two-Sample t-Test

The Two-Sample t-Test is a statistical test used to compare the means of two independent samples. It is a hypothesis test that determines whether there is a significant difference between the means of the two populations from which the samples are drawn.

Assumptions

Before performing a Two-Sample t-Test, it is important to check that the following assumptions are met:

The samples are independent.
The populations from which the samples are drawn are normally distributed.
The variances of the two populations are equal.

If these assumptions are not met, then a different statistical test may be more appropriate.
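One alternative worth knowing when the equal-variance assumption is doubtful is Welch's t-test, which does not pool the variances. The sketch below computes its test statistic and approximate degrees of freedom (the Welch-Satterthwaite formula) from summary statistics. The function name `welch_t_statistic` and the input numbers are illustrative assumptions, not part of the lesson.

```python
import math


def welch_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from summary statistics."""
    # Per-group variance of the sample mean
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Hypothetical groups with clearly different spreads and sizes
t_stat, df = welch_t_statistic(10.0, 2.0, 15, 8.5, 4.0, 25)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

When both groups have the same size and the same standard deviation, the degrees of freedom reduce to n₁ + n₂ - 2, matching the pooled test.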

Example

Suppose we want to compare the mean heights of male and female students at a university. We collect a random sample of 20 male students and a random sample of 20 female students and measure their heights in inches. The data is summarized below:

Group	Mean Height (in)	Standard Deviation (in)
Males	70.5	2.5
Females	64.8	3.2

To test whether there is a significant difference in the mean heights of male and female students, we can perform a Two-Sample t-Test. We assume the conditions above are met and use a significance level of 0.05.

The null hypothesis is that the population mean heights of male and female students are equal (H₀: μ₁ = μ₂). The alternative hypothesis is that they differ (H₁: μ₁ ≠ μ₂).

We can calculate the test statistic as follows:

t = (x̄₁ - x̄₂) / (sₚ * sqrt(1/n₁ + 1/n₂))

where x̄₁ and x̄₂ are the sample means, sₚ is the pooled standard deviation, and n₁ and n₂ are the sample sizes.

The pooled standard deviation is calculated as follows:

sₚ = sqrt(((n₁ - 1) * s₁² + (n₂ - 1) * s₂²) / (n₁ + n₂ - 2))

where s₁ and s₂ are the sample standard deviations.
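The two formulas above can be sketched in a few lines of Python using only the standard library. The function name `pooled_t_statistic` is an assumption made for illustration; the inputs are the summary statistics from the height example.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a pooled-variance two-sample t-test from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    pooled_sd = math.sqrt(pooled_var)
    # Standard error of the difference between the two sample means
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error, df


# Males: mean 70.5, sd 2.5, n 20; females: mean 64.8, sd 3.2, n 20
t_stat, df = pooled_t_statistic(70.5, 2.5, 20, 64.8, 3.2, 20)
print(f"t = {t_stat:.2f}, df = {df}")  # t = 6.28, df = 38
```

Working from summary statistics keeps the sketch short; with raw measurements, the means and standard deviations would be computed first.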

Plugging in the values, we get:

t = (70.5 - 64.8) / (sqrt(((20 - 1) * 2.5² + (20 - 1) * 3.2²) / (20 + 20 - 2))
    * sqrt(1/20 + 1/20))

  = 5.7 / (sqrt(8.245) * sqrt(0.1))

  = 5.7 / (2.871 * 0.316)

t ≈ 6.28

The degrees of freedom for this test are n₁ + n₂ - 2 = 38.

Using a t-distribution table or statistical software, we can find the p-value for the test. The p-value is the probability of getting a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true.

At a significance level of 0.05 with 38 degrees of freedom, the two-tailed critical t-value is ±2.024, and the calculated test statistic lies far beyond it.

The p-value is on the order of 10⁻⁷, which is much smaller than 0.05. Therefore, we reject the null hypothesis and conclude that there is a significant difference between the mean heights of male and female students.
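For readers without a statistics package, the two-tailed p-value can be approximated by numerically integrating the tail of the t density. The following is a minimal sketch under stated assumptions: the helper names `t_pdf` and `two_sided_p_value` are hypothetical, the midpoint rule with a fixed number of panels is just one simple choice, and a dedicated statistics library is normally preferable.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, panels=20000):
    """Approximate P(|T| >= |t_stat|) with the midpoint rule.

    The substitution x = |t| / u maps the tail [|t|, inf) onto (0, 1],
    so the infinite upper limit never has to be truncated.
    """
    t_abs = abs(t_stat)
    if t_abs == 0:
        return 1.0
    h = 1.0 / panels
    tail = 0.0
    for i in range(panels):
        u = (i + 0.5) * h  # midpoint of the i-th panel
        tail += t_pdf(t_abs / u, df) * t_abs / (u * u)
    return 2 * tail * h


# Test statistic recomputed from the summary data of the height example
t_stat = (70.5 - 64.8) / math.sqrt(
    ((19 * 2.5**2 + 19 * 3.2**2) / 38) * (1 / 20 + 1 / 20)
)
p_value = two_sided_p_value(t_stat, 38)
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

As a sanity check, calling `two_sided_p_value(2.024, 38)` should return a value very close to 0.05, which agrees with the critical value quoted above.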

Z-Test

The z-Test is a statistical test used to determine whether a sample mean is significantly different from a known population mean when the population standard deviation is known. It is based on the normal distribution and is used when the sample size is large (usually more than 30).

Assumptions

Before performing a z-Test, it is important to check that the following assumptions are met:

The sample is random and independent.
The population from which the sample is drawn is normally distributed.
The population standard deviation is known.

If these assumptions are not met, then a different statistical test may be more appropriate.

Example

Suppose a company produces light bulbs and claims that the average lifespan of its bulbs is 5000 hours, with a standard deviation of 1000 hours. To test this claim, a random sample of 50 bulbs is selected and their lifespans are measured. The sample mean is found to be 4800 hours.

To determine whether this sample mean is significantly different from the population mean, we can perform a z-Test. We assume the conditions above are met and use a significance level of 0.05.

The null hypothesis is that the true mean lifespan equals the claimed 5000 hours (H₀: μ = 5000). The alternative hypothesis is that it differs from 5000 hours (H₁: μ ≠ 5000).

Solution

First, we need to calculate the z-score:

z = (sample mean - population mean) / 
        (population standard deviation / sqrt(sample size))

    = (4800 - 5000) / (1000 / sqrt(50)) = -200 / 141.42 ≈ -1.41

Next, we can use a standard normal distribution table or a calculator to find the p-value associated with the z-score. In this case, the two-tailed p-value is approximately 0.157.

Since the p-value is greater than 0.05, we fail to reject the null hypothesis: at a significance level of 0.05, the sample does not provide sufficient evidence that the mean lifespan differs from the claimed 5000 hours.
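The same calculation can be reproduced with Python's standard library, where `statistics.NormalDist` provides the standard normal CDF. The function name `one_sample_z_test` is an assumption made for this sketch; the inputs are the figures from the light bulb example.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value: probability mass in both tails beyond |z|
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


z, p = one_sample_z_test(4800, 5000, 1000, 50)
print(f"z = {z:.2f}, p = {p:.3f}")  # z = -1.41, p = 0.157
print("reject H0" if p < 0.05 else "fail to reject H0")
```

Using `-abs(z)` makes the function symmetric, so a sample mean of 5200 hours would give the same p-value as 4800 hours.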

Conclusion

In conclusion, the Two-Sample t-Test and z-Test are important statistical tests used to compare means and determine whether a sample mean is significantly different from a known population mean. The Two-Sample t-Test is used when comparing the means of two independent samples, while the z-Test is used when the population standard deviation is known and the sample size is large. It is important to check the assumptions for each test before performing the test to ensure its validity. The results of these tests can provide valuable insights and help make informed decisions in various fields such as science, business, and social sciences.

Key Takeaways

Here are the key takeaways from the above text:

The Two-Sample t-Test is a statistical test used to compare the means of two independent samples.
The z-Test is a statistical test used to determine whether a sample mean is significantly different from a known population mean when the population standard deviation is known.
Both tests have specific assumptions that must be met before they can be used.
The results of these tests can be used to make conclusions about whether there is a significant difference between the means of two populations or whether a sample mean is significantly different from a population mean.
It is important to choose the appropriate test based on the nature of the data and the research question being investigated.

Quiz

1. Which statistical test is used to compare the means of two independent samples?

a. One-Sample t-Test
b. Two-Sample t-Test
c. z-Test
d. ANOVA

Answer: b. Two-Sample t-Test

2. What is the null hypothesis for a Two-Sample t-Test?

a. There is a significant difference between the means of the two populations.
b. There is no significant difference between the means of the two populations.
c. The variances of the two populations are not equal.
d. The samples are not independent.

Answer: b. There is no significant difference between the means of the two populations.

3. Which assumption distinguishes a z-Test from a t-Test?

a. The sample is random and independent.
b. The population from which the sample is drawn is normally distributed.
c. The population standard deviation is known.
d. The variances of the two populations are equal.

Answer: c. The population standard deviation is known.

4. When is a z-Test appropriate?

a. When the population standard deviation is unknown and the sample size is small.
b. When the population standard deviation is known and the sample size is small.
c. When the population standard deviation is unknown and the sample size is large.
d. When the population standard deviation is known and the sample size is large.

Answer: d. When the population standard deviation is known and the sample size is large.

5. What is the alternative hypothesis for a Two-Sample t-Test?

a. There is a significant difference between the means of the two populations.
b. There is no significant difference between the means of the two populations.
c. The variances of the two populations are not equal.
d. The samples are not independent.

Answer: a. There is a significant difference between the means of the two populations.
