# Two-Sample t-Test and z-Test

Module - 7 Hypothesis Testing

Overview

In statistics, hypothesis testing is a common method for deciding whether sample data provide enough evidence of a real difference, for example between two groups or between a sample and a known population value. Two commonly used tests are the two-sample t-test and the z-test. In this lesson, we will explore the differences between the two tests and how they are used in practice.

Two-Sample t-Test

The Two-Sample t-Test is a statistical test used to compare the means of two independent samples. It is a hypothesis test that determines whether there is a significant difference between the means of the two populations from which the samples are drawn.

Assumptions

Before performing a Two-Sample t-Test, it is important to check that the following assumptions are met:

1. The samples are independent.
2. The populations from which the samples are drawn are normally distributed.
3. The variances of the two populations are equal.

If these assumptions are not met, then a different statistical test may be more appropriate, for example Welch's t-test when the population variances are unequal.

Example

Suppose we want to compare the mean heights of male and female students at a university. We collect a random sample of 20 male students and a random sample of 20 female students and measure their heights in inches. The data is summarized below:

| Group | Mean Height (in) | Standard Deviation (in) |
|---|---|---|
| Males | 70.5 | 2.5 |
| Females | 64.8 | 3.2 |

To test whether there is a significant difference in the mean heights of male and female students, we can perform a Two-Sample t-Test. Assuming the above assumptions are met, we can use a significance level of 0.05.
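Before relying on the equal-variance assumption, an informal check on the summary statistics can be useful. The sketch below compares the two sample standard deviations from the example. The helper name `sd_ratio` and the threshold of 2 are assumptions of this sketch (a common rule of thumb, not a formal test and not part of the original lesson).

```python
def sd_ratio(s1: float, s2: float) -> float:
    """Return the ratio of the larger to the smaller sample standard deviation."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger / smaller


# Sample standard deviations from the heights example (inches).
ratio = sd_ratio(2.5, 3.2)

# Assumption: a ratio below 2 is treated as "roughly equal" spread.
# This is only a rule of thumb, not a formal test of equal variances.
roughly_equal = ratio < 2.0

print(f"SD ratio = {ratio:.2f}, roughly equal spread: {roughly_equal}")
```

If the ratio were much larger, a test that does not pool the variances would likely be a safer choice.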

The null hypothesis is that the mean heights of male and female students are equal (H₀: μ₁ = μ₂). The alternative hypothesis is that they differ (H₁: μ₁ ≠ μ₂), so the test is two-tailed.

We can calculate the test statistic as follows:

t = (x̄₁ - x̄₂) / (sₚ * sqrt(1/n₁ + 1/n₂))


where x̄₁ and x̄₂ are the sample means, sₚ is the pooled standard deviation, and n₁ and n₂ are the sample sizes.

The pooled standard deviation is calculated as follows:

sₚ = sqrt(((n₁ - 1) * s₁² + (n₂ - 1) * s₂²) / (n₁ + n₂ - 2))


where s₁ and s₂ are the sample standard deviations.

Plugging in the values, we get:

t = (70.5 - 64.8) / (sqrt(((20 - 1) * 2.5² + (20 - 1) * 3.2²) / (20 + 20 - 2))
* sqrt(1/20 + 1/20))

t = 5.7 / (2.871 * 0.316)

t ≈ 6.28


The degrees of freedom for this test are (n₁ + n₂ - 2) = 38.
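The hand calculation can be reproduced from the summary statistics alone. The following is a minimal sketch using only the Python standard library; the function names `pooled_sd` and `two_sample_t` are illustrative choices, not part of any library.

```python
import math


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation of two samples, assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t statistic, degrees of freedom) for the pooled two-sample t-test."""
    sp = pooled_sd(s1, n1, s2, n2)
    standard_error = sp * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Summary statistics from the heights example (inches).
t_stat, df = two_sample_t(70.5, 2.5, 20, 64.8, 3.2, 20)
print(f"pooled SD = {pooled_sd(2.5, 20, 3.2, 20):.3f}")
print(f"t = {t_stat:.2f}, df = {df}")
```

Working from summary statistics keeps the sketch close to the formulas; with raw measurements, the means and standard deviations would be computed first.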

Using a t-distribution table or statistical software, we can find the p-value for the test. The p-value is the probability of getting a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true.

Assuming a significance level of 0.05, the critical t-value (two-tailed) is ±2.024.

The calculated test statistic is far beyond the critical value, and the corresponding p-value is on the order of 10⁻⁷, which is much smaller than 0.05. Therefore, we reject the null hypothesis and conclude that there is a significant difference between the mean heights of male and female students.
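For readers without a table at hand, the two-sided p-value can be approximated numerically. The sketch below integrates the Student's t density with Simpson's rule, a technique swapped in here in place of a table lookup or a statistics package. The function names and the step count are assumptions of this sketch.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat: float, df: int, steps: int = 10_000) -> float:
    """Two-sided p-value: 1 minus twice the integral of the density over [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3)


# Recompute the test statistic from the heights example, then its p-value.
sp = math.sqrt((19 * 2.5**2 + 19 * 3.2**2) / 38)
t_stat = (70.5 - 64.8) / (sp * math.sqrt(1 / 20 + 1 / 20))
p_value = two_sided_p(t_stat, df=38)

alpha = 0.05
reject_null = p_value < alpha
print(f"t = {t_stat:.2f}, p = {p_value:.2e}, reject H0: {reject_null}")
```

As a sanity check, passing the critical value 2.024 with 38 degrees of freedom to `two_sided_p` should return a value close to 0.05.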

z-Test

The z-Test is a statistical test used to determine whether a sample mean is significantly different from a known population mean when the population standard deviation is known. It is based on the normal distribution and is used when the sample size is large (usually more than 30).

Assumptions

Before performing a z-Test, it is important to check that the following assumptions are met:

1. The sample is random and independent.
2. The population from which the sample is drawn is normally distributed, or the sample is large enough for the sample mean to be approximately normal.
3. The population standard deviation is known.

If these assumptions are not met, then a different statistical test may be more appropriate.

Example

Suppose a company produces light bulbs and claims that the average lifespan of its bulbs is 5000 hours, with a standard deviation of 1000 hours. To test this claim, a random sample of 50 bulbs is selected and their lifespans are measured. The sample mean is found to be 4800 hours.

To determine whether this sample mean is significantly different from the population mean, we can perform a z-Test. Assuming the above assumptions are met, we can use a significance level of 0.05.

The null hypothesis is that the true mean lifespan equals the claimed 5000 hours (H₀: μ = 5000). The alternative hypothesis is that it differs from 5000 hours (H₁: μ ≠ 5000).

Solution

First, we need to calculate the z-score:

z = (sample mean - population mean) /
(population standard deviation / sqrt(sample size))

= (4800 - 5000) / (1000 / sqrt(50)) = -200 / 141.42 ≈ -1.41


Next, we can use a standard normal distribution table or a calculator to find the two-tailed p-value associated with the z-score. In this case, the p-value is approximately 0.157.

Since the p-value is greater than 0.05, we fail to reject the null hypothesis. At a significance level of 0.05, the sample does not provide sufficient evidence that the mean lifespan differs from the claimed 5000 hours.
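The z-score and its two-tailed p-value can be recomputed from the figures in the example. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the function name `one_sample_z` is an illustrative choice.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean: float, pop_mean: float, pop_sd: float, n: int):
    """Return (z statistic, two-tailed p-value) for a one-sample z-test."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-tailed: probability of a value at least as far from 0 as |z|.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Figures from the light bulb example (hours).
z_stat, p_value = one_sample_z(sample_mean=4800, pop_mean=5000, pop_sd=1000, n=50)

alpha = 0.05
reject_null = p_value < alpha
print(f"z = {z_stat:.2f}, p = {p_value:.3f}, reject H0: {reject_null}")
```

The decision rule is the same as for the t-test: reject the null hypothesis only when the p-value falls below the chosen significance level.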

Conclusion

In conclusion, the Two-Sample t-Test and z-Test are important statistical tests used to compare means and determine whether a sample mean is significantly different from a known population mean. The Two-Sample t-Test is used when comparing the means of two independent samples, while the z-Test is used when the population standard deviation is known and the sample size is large. It is important to check the assumptions for each test before performing the test to ensure its validity. The results of these tests can provide valuable insights and help make informed decisions in various fields such as science, business, and social sciences.

Key Takeaways

Here are the key takeaways from the above text:

• The Two-Sample t-Test is a statistical test used to compare the means of two independent samples.
• The z-Test is a statistical test used to determine whether a sample mean is significantly different from a known population mean when the population standard deviation is known.
• Both tests have specific assumptions that must be met before they can be used.
• The results of these tests can be used to draw conclusions about whether there is a significant difference between the means of two populations or whether a sample mean is significantly different from a population mean.
• It is important to choose the appropriate test based on the nature of the data and the research question being investigated.

Quiz

1. Which statistical test is used to compare the means of two independent samples?

a. One-Sample t-Test
b. Two-Sample t-Test
c. z-Test
d. ANOVA

Answer: b. Two-Sample t-Test

2. What is the null hypothesis for a Two-Sample t-Test?

a. There is a significant difference between the means of the two populations.
b. There is no significant difference between the means of the two populations.
c. The variances of the two populations are not equal.
d. The samples are not independent.

Answer: b. There is no significant difference between the means of the two populations.

3. Which of the following assumptions distinguishes a z-Test from a t-Test?

a. The sample is random and independent.
b. The population from which the sample is drawn is normally distributed.
c. The population standard deviation is known.
d. The variances of the two populations are equal.

Answer: c. The population standard deviation is known.

4. When is a z-Test appropriate?

a. When the population standard deviation is unknown and the sample size is small.
b. When the population standard deviation is known and the sample size is small.
c. When the population standard deviation is unknown and the sample size is large.
d. When the population standard deviation is known and the sample size is large.

Answer: d. When the population standard deviation is known and the sample size is large.

5. What is the alternative hypothesis for a Two-Sample t-Test?

a. There is a significant difference between the means of the two populations.
b. There is no significant difference between the means of the two populations.
c. The variances of the two populations are not equal.
d. The samples are not independent.

Answer: a. There is a significant difference between the means of the two populations.


© 2023 AlmaBetter