Limit Theorems (Central Limit Theorem, Law of Large Numbers)

Last Updated: 10th October, 2023

Limit theorems, such as the Law of Large Numbers (LLN) and the Central Limit Theorem (CLT), are essential concepts in probability theory that describe the behaviour of random variables as the sample size grows infinitely large. The LLN states that the sample mean of a large number of independent and identically distributed (i.i.d.) random variables converges to the true mean of the population, whereas the CLT states that the suitably standardized sum or average of a large number of i.i.d. random variables with finite variance converges in distribution to a normal distribution. Limit theorems have numerous important applications in areas such as finance, physics, biology, and machine learning, and provide a rigorous mathematical foundation for many important statistical and machine learning techniques.

Introduction to Limit Theorems

Limit theorems are fundamental results in probability theory that describe the behaviour of random variables as the sample size grows infinitely large. These theorems allow us to make predictions about the behaviour of random variables based on their underlying distributions, even when we have limited information about those distributions. Limit theorems are especially important in statistics, where they form the basis for many statistical tests and estimation methods.

Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most well-known limit theorems and is widely used in statistics. The CLT states that the sum or average of a large number of independent and identically distributed (i.i.d.) random variables with finite variance is approximately normally distributed, regardless of the distribution of the individual random variables themselves.

More formally, let X1, X2, ..., Xn be a sequence of i.i.d. random variables with mean μ and finite variance σ^2 > 0, and let S = X1 + X2 + ... + Xn be their sum. Then the standardized sum converges in distribution to a standard normal distribution. Specifically, as n → ∞:

(S - nμ) / (σ√n) → N(0, 1) in distribution

where N(0, 1) denotes a standard normal distribution with mean 0 and variance 1. This means that, for large n, S is approximately normal with mean nμ and variance nσ^2, and the sample mean S / n is approximately normal with mean μ and variance σ^2 / n.
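The statement can be checked by simulation. The sketch below is only an illustration under assumed settings: it uses Exp(1) variables (mean 1, variance 1, strongly skewed), and the function names, sample sizes, and seed are hypothetical choices, not part of the lesson. It standardizes many sums as in the formula and compares a few summary figures with those of a standard normal distribution.

```python
import math
import random
import statistics

def standardized_sum(n, rng):
    """Draw n Exp(1) variables and return (S - n*mu) / (sigma * sqrt(n))."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exp(1)
    s = sum(rng.expovariate(1.0) for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

def simulate(n, trials=3000, seed=0):
    """Return `trials` independent standardized sums of n variables each."""
    rng = random.Random(seed)
    return [standardized_sum(n, rng) for _ in range(trials)]

if __name__ == "__main__":
    z = simulate(n=200)
    share = sum(abs(v) <= 1.0 for v in z) / len(z)
    print("mean of standardized sums:", round(statistics.fmean(z), 3))
    print("std dev of standardized sums:", round(statistics.pstdev(z), 3))
    # For a standard normal variable this share is about 0.683.
    print("share within one standard deviation:", round(share, 3))
```

Even though a single Exp(1) variable is far from normal, the standardized sums should have mean near 0, standard deviation near 1, and roughly 68% of values within one standard deviation.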

The CLT has many important applications in statistics. For example, it is often used to construct approximate confidence intervals for the mean of a population or to test hypotheses about the mean. It also provides the large-sample justification for many other statistical tests, such as the t-test and the F-test, when the data are not normally distributed.
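As a minimal sketch of the confidence-interval use, assuming a reasonably large i.i.d. sample, the function below builds the usual normal-approximation interval: sample mean plus or minus z times the estimated standard error. The function name and the synthetic data are hypothetical.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, level=0.95):
    """Approximate CLT-based confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical data: 500 skewed observations with true mean 2.0.
    data = [rng.expovariate(0.5) for _ in range(500)]
    low, high = mean_confidence_interval(data)
    print("approximate 95% interval for the mean:", round(low, 3), round(high, 3))
```

For small samples the normal quantile is usually replaced by a t quantile, and the approximation can be poor if the data are heavy-tailed.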

Law of Large Numbers

The Law of Large Numbers (LLN) is another important limit theorem that describes the behaviour of the average of a sequence of random variables as the sample size grows large. Unlike the CLT, which requires a finite variance, the LLN for i.i.d. random variables requires only that the mean exists and is finite; it makes no other assumption about the shape of the distribution.

The LLN comes in two forms: the weak LLN and the strong LLN. The weak LLN states that, for a sequence of i.i.d. random variables X1, X2, ..., Xn with finite mean μ, the sample mean X̄n = (X1 + X2 + ... + Xn) / n converges in probability to the true mean μ. (A finite variance σ^2 is often assumed because it simplifies the proof, but it is not required.) This means that, for any ε > 0:

lim P(|X̄n - μ| > ε) = 0 as n → ∞

Intuitively, this means that for a large sample the sample mean is very likely to be close to the true mean.
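The probability in the weak LLN can be estimated by simulation. The sketch below is an illustration under assumed settings (fair die rolls with μ = 3.5, ε = 0.25, and a hypothetical function name): for each n it repeats the experiment many times and reports how often the sample mean misses μ by more than ε.

```python
import random

def deviation_probability(n, eps=0.25, trials=1000, seed=1):
    """Estimate P(|mean of n fair-die rolls - 3.5| > eps) by simulation."""
    rng = random.Random(seed)
    mu = 3.5
    exceed = 0
    for _ in range(trials):
        mean = sum(rng.randint(1, 6) for _ in range(n)) / n
        if abs(mean - mu) > eps:
            exceed += 1
    return exceed / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print("n =", n, "estimated probability:", deviation_probability(n))
```

The estimated probability should fall towards 0 as n increases, which is what convergence in probability asserts for this fixed ε.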

The strong LLN is a stronger version of the weak LLN, and states that the sample mean converges almost surely to the true mean μ. This means that, with probability 1:

lim X̄n = μ as n → ∞

Almost sure convergence implies convergence in probability, so the strong LLN is the more powerful result. For i.i.d. random variables it holds under the same assumption as the weak LLN, namely a finite mean; the two laws differ in the mode of convergence, not in what they require of the random variables.
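The strong LLN is a statement about a single realized sequence rather than about probabilities across repeated experiments. As a rough sketch, assuming fair coin flips (mean 0.5) and a hypothetical function name, the code below follows the running mean along one simulated sequence.

```python
import random

def running_means(n, seed=2):
    """Return the running sample means of one sequence of n fair coin flips."""
    rng = random.Random(seed)
    total = 0
    means = []
    for k in range(1, n + 1):
        total += rng.random() < 0.5  # True counts as 1, False as 0
        means.append(total / k)
    return means

if __name__ == "__main__":
    means = running_means(100_000)
    for k in (10, 100, 1_000, 10_000, 100_000):
        print("after", k, "flips the running mean is", round(means[k - 1], 4))
```

A finite simulation cannot demonstrate almost sure convergence, but it shows the typical behaviour: once the running mean gets close to 0.5 it tends to stay close.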

Proof of the Central Limit Theorem

There are many different proofs of the Central Limit Theorem, each with its own strengths and weaknesses. The most common textbook proof is based on characteristic functions together with Lévy's continuity theorem.

The characteristic function of a random variable X is defined as φX(t) = E[e^(itX)], where i is the imaginary unit. The characteristic function plays an important role in probability theory, as it uniquely characterizes the distribution of a random variable. In particular, if two random variables have the same characteristic function, then they have the same distribution.

The characteristic-function proof shows that the characteristic function of the standardized sum (S - nμ) / (σ√n), where S = X1 + X2 + ... + Xn, converges for every t to e^(-t^2/2), the characteristic function of the standard normal distribution. For i.i.d. random variables this requires only a finite variance: a second-order Taylor expansion of the characteristic function of (X1 - μ) / σ, evaluated at t/√n, gives 1 - t^2/(2n) plus a smaller remainder, and raising this to the nth power yields the limit. Lévy's continuity theorem then converts the convergence of characteristic functions into convergence in distribution.

Another famous proof is due to Lindeberg, and works by replacing the random variables one at a time with normal random variables that have the same mean and variance. This approach extends to independent random variables that are not identically distributed, provided they satisfy the Lindeberg condition, which roughly requires that no single variable contributes a non-negligible share of the total variance. The Lindeberg-Feller theorem states that this condition is sufficient for the CLT and, when each individual variance is negligible relative to the total, also necessary.
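The convergence of characteristic functions can be checked numerically in a case where everything is known in closed form. The sketch below assumes Exp(1) variables, whose characteristic function is 1 / (1 - it), so that X - 1 has characteristic function e^(-it) / (1 - it); the function names are hypothetical. It is a numerical illustration, not a proof.

```python
import cmath
import math

def cf_standardized_sum(t, n):
    """Characteristic function of (S - n) / sqrt(n), S a sum of n i.i.d. Exp(1) variables."""
    u = t / math.sqrt(n)
    cf_centered = cmath.exp(-1j * u) / (1 - 1j * u)  # CF of X - 1 evaluated at u
    return cf_centered ** n  # independence turns the sum into a product

def cf_standard_normal(t):
    """Characteristic function of N(0, 1)."""
    return math.exp(-t * t / 2)

if __name__ == "__main__":
    t = 1.0
    for n in (10, 100, 1000, 10000):
        gap = abs(cf_standardized_sum(t, n) - cf_standard_normal(t))
        print("n =", n, "distance to normal characteristic function:", gap)
```

The distance shrinks as n grows, in line with the limit e^(-t^2/2) used in the proof.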

Applications of Limit Theorems

Limit theorems have many important applications in a wide variety of fields, from finance to physics to biology. Here are a few examples:

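In finance, the CLT is used to model the behavior of stock prices and other financial instruments. For example, the Black-Scholes model, which is widely used to price options, assumes that stock prices follow a log-normal distribution. A heuristic justification is that the log of the price change over a period is the sum of many small returns; if these are roughly independent with finite variance, the CLT suggests that the sum is approximately normal.
In physics, the CLT is used to model quantities that arise as the sum of many small, roughly independent contributions in systems with many particles, such as gases or liquids. In the Maxwell-Boltzmann distribution, which describes the velocity distribution of particles in a gas, each velocity component is normally distributed; this is usually derived from statistical mechanics, but it can be motivated heuristically by a CLT-style picture of many independent collisions.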
In biology, the LLN is used to estimate the mean and variance of biological populations based on a sample of individuals. For example, if we want to estimate the average height of a population of trees, we can measure the heights of a sample of trees and use the LLN to estimate the true population mean.
In machine learning, the CLT and other limit theorems are used to justify inference about fitted models. For example, under standard regularity conditions the estimated coefficients of linear regression and logistic regression models are approximately normally distributed in large samples, which justifies the usual standard errors and confidence intervals.
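The finance example can be illustrated with a toy simulation. The sketch below makes strong simplifying assumptions that are not part of any real pricing model: each step multiplies the price by an independent factor drawn uniformly from [0.98, 1.02], and the function name is hypothetical. Because the log of a product is a sum, the CLT suggests the log price ratio is approximately normal, so the price itself is approximately log-normal.

```python
import math
import random
import statistics

def simulate_log_ratios(steps=250, paths=2000, seed=3):
    """Return log(final price / initial price) for each simulated path."""
    rng = random.Random(seed)
    logs = []
    for _ in range(paths):
        # Sum of log factors equals the log of the product of factors.
        log_ratio = sum(math.log(rng.uniform(0.98, 1.02)) for _ in range(steps))
        logs.append(log_ratio)
    return logs

if __name__ == "__main__":
    logs = simulate_log_ratios()
    mean = statistics.fmean(logs)
    sd = statistics.pstdev(logs)
    share = sum(abs(v - mean) <= sd for v in logs) / len(logs)
    # For a normal distribution this share is about 0.683.
    print("share of log ratios within one standard deviation:", round(share, 3))
```

Real returns are neither independent nor identically distributed, which is one reason the log-normal assumption is only an approximation.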

Limit Theorems in Practice

While limit theorems are powerful tools for making predictions about the behavior of random variables, there are several practical considerations that must be taken into account when using them in real-world situations.
One important consideration is the sample size. As the sample size grows larger, the distributions of sample means and sums become increasingly normal, and the CLT becomes more applicable. However, for small sample sizes, the normal approximation may not be accurate, and other methods may be more appropriate.
Another consideration is the distributional assumptions. The CLT assumes that the underlying random variables are i.i.d. with finite mean and variance. In practice, these assumptions may not hold, and other limit theorems or statistical methods may be needed.
Finally, it is important to keep in mind that limit theorems describe the behavior of random variables as the sample size grows infinitely large. In practice, we are usually working with finite sample sizes, and the behavior of the random variables may be quite different from what is predicted by the limit theorems.
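The effect of sample size on the normal approximation can be measured exactly for a binomial sum, where the true distribution is known. The sketch below assumes Bernoulli variables with success probability p = 0.1 and uses hypothetical function names; it compares the exact distribution function of the sum with its CLT approximation, without a continuity correction, and reports the largest gap.

```python
import math
from statistics import NormalDist

def exact_binomial_cdf(k, n, p):
    """Exact P(S <= k) for S ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_approx_cdf(k, n, p):
    """CLT approximation to P(S <= k), without continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k)

def max_cdf_error(n, p):
    """Largest absolute gap between exact and approximate CDF over k = 0..n."""
    return max(
        abs(exact_binomial_cdf(k, n, p) - normal_approx_cdf(k, n, p))
        for k in range(n + 1)
    )

if __name__ == "__main__":
    for n in (10, 50, 400):
        print("n =", n, "largest CDF error:", round(max_cdf_error(n, 0.1), 4))
```

The error is substantial for small n and shrinks as n grows, which is why small samples from skewed distributions call for exact methods or corrections rather than the plain normal approximation.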

Limit Theorems in Machine Learning

Limit theorems have many important applications in machine learning, where they are used to justify the use of certain algorithms and models. Here are a few examples:

In deep learning, limit theorems help explain why stochastic gradient descent works with mini-batches. A mini-batch gradient is an average of per-sample gradients, so when the samples are drawn independently the LLN implies that it approaches the full-data gradient as the batch size grows, and the CLT implies that its deviation from the full-data gradient is approximately normal, with a standard deviation that shrinks in proportion to 1/√B for batch size B.
In Bayesian machine learning, limit theorems are used to justify the use of Markov chain Monte Carlo (MCMC) methods for sampling from complex posterior distributions. MCMC samples are dependent, so the i.i.d. versions of the theorems do not apply directly. Instead, an ergodic theorem (an LLN for Markov chains) ensures that sample averages converge to posterior expectations, and a Markov chain CLT, which holds under additional mixing conditions, gives approximately normal Monte Carlo errors that can be used to assess the accuracy of the estimates.
In statistical hypothesis testing, limit theorems are used to construct test statistics and determine the significance of results. For example, the t-test for comparing the means of two samples is exact when the data are normally distributed; for non-normal data with large samples, it relies on the CLT to justify the use of a normal approximation to the sampling distribution of the difference in means.
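The behaviour of averaged gradients under random sampling can be sketched on a toy problem. Everything below is a hypothetical setup, not a real training loop: a one-parameter model y ≈ w * x with squared-error loss, synthetic data, and batches drawn with replacement so that the per-sample gradients in a batch are i.i.d. draws. The code measures how the spread of the mini-batch gradient changes with batch size.

```python
import random
import statistics

def make_data(n=1000, seed=4):
    """Synthetic (x, y) pairs with y = 2x plus small Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return data

def per_sample_gradients(w, data):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w, for each sample."""
    return [(w * x - y) * x for x, y in data]

def minibatch_gradient_spread(grads, batch_size, batches=2000, seed=5):
    """Standard deviation of the mini-batch mean gradient over many random batches."""
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(rng.choices(grads, k=batch_size)) for _ in range(batches)
    ]
    return statistics.pstdev(estimates)

if __name__ == "__main__":
    grads = per_sample_gradients(0.5, make_data())
    print("full-data gradient:", round(statistics.fmean(grads), 4))
    for b in (16, 64, 256):
        print("batch size", b, "spread:", round(minibatch_gradient_spread(grads, b), 4))
```

Multiplying the batch size by 16 should cut the spread by roughly a factor of 4, the 1/√B scaling that averaging independent terms produces.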

Conclusion

In conclusion, limit theorems such as the Law of Large Numbers (LLN) and Central Limit Theorem (CLT) are fundamental concepts in probability theory that describe the behavior of random variables as the sample size grows infinitely large. The LLN and CLT have many important applications in fields such as finance, physics, biology, and machine learning.

Key Takeaways

The Law of Large Numbers (LLN) and the Central Limit Theorem (CLT) are two important limit theorems that describe the behavior of random variables as the sample size grows infinitely large.
The LLN states that the sample mean of a large number of independent and identically distributed (i.i.d.) random variables converges to the true mean of the population as the sample size grows infinitely large.
The CLT states that the suitably standardized sum or average of a large number of i.i.d. random variables with finite variance converges in distribution to a normal distribution as the sample size grows infinitely large.
The proofs of the LLN and CLT are quite technical and require a strong background in probability theory.
Limit theorems have many important applications in a wide variety of fields, including finance, physics, biology, and machine learning.
When using limit theorems in practice, it is important to take into account factors such as sample size and distributional assumptions.
Limit theorems provide a rigorous mathematical foundation for many important statistical and machine-learning techniques.

Quiz

1. Which theorem describes the behavior of the sample mean of a large number of independent and identically distributed random variables as the sample size grows infinitely large?

A) Central Limit Theorem

B) Law of Large Numbers

C) Bayes' Theorem

D) Theorem of Total Probability

Answer: B) Law of Large Numbers

2. Which theorem describes the convergence of the sum or average of a large number of independent and identically distributed random variables to a normal distribution as the sample size grows infinitely large?

A) Central Limit Theorem

B) Law of Large Numbers

C) Bayes' Theorem

D) Theorem of Total Probability

Answer: A) Central Limit Theorem

3. Which field of study uses limit theorems to justify the use of Markov chain Monte Carlo (MCMC) methods for sampling from complex posterior distributions?

A) Finance

B) Physics

C) Biology

D) Bayesian Machine Learning

Answer: D) Bayesian Machine Learning

4. Which statistical test relies on the Central Limit Theorem to justify the use of a normal approximation to the sampling distribution of the difference in means?

A) F-test

B) ANOVA

C) Chi-squared test

D) t-test

Answer: D) t-test
