You have nothing to worry about, though! After reading this article, you will be able to gauge the importance of p-value in a layperson’s terms, instead of being sucked into the matrix of mathematics.
Statistical Significance is a measure of whether your research findings are meaningful or not.
By convention, the significance level is often taken as 5%; however, the right threshold depends on what is at stake in the problem statement.
As we already know, p tells you the probability of something happening randomly. If p is 5%, it means that a particular result in your study has a 5% chance of being just a coincidence.
However, at some point you might wonder whether a 5% chance is too little or too much. Well, it depends entirely on what's at stake.
The stakes are not the same in every piece of research, and the consequences of an error vary from one problem statement to another.
Now, let’s delve into an intuition-based understanding.
Note: All examples are constructed to help our readers understand things from a layperson's perspective by connecting relatable life experiences; we have no intention of hurting anyone's sentiments.
Suppose you are convinced that your best friend's wife is cheating on him. You are throwing a weekend party for him and you are debating whether or not to tell him. On one hand, you don't want to disturb their relationship. On the other hand, you think he has the right to know. What do you do?
You could be mistaken. However, before we get into that, there would be two possibilities:
"She is not having an affair". This is known as the "null hypothesis", since it is the proposition we should accept as true in the absence of sufficient evidence.
"She's having an affair". This is the "alternative hypothesis". You will only state the truth of this statement if you have enough data or evidence to verify it.
Before we proceed, we must first grasp the two categories of errors that can occur.
A Type 1 error occurs when you say that the null hypothesis is false and the alternative is true (stating that she is having an affair), i.e., incorrectly rejecting the null hypothesis.
A Type 2 error occurs when you accept the null as true (concluding that she is not having an affair) when in fact she is, i.e., incorrectly failing to reject the null hypothesis.
Your query relates to Type 1 errors, or more specifically, the probability of making a Type 1 error.
What level of certainty is required to determine the solution to this problem statement? How confident do you need to be that your allegations of cheating against his partner are true?
There is really no way to gauge your level of certainty in situations like these that might occur in real life.
However, in a statistical hypothesis test, you can be certain that the probability of a Type 1 error is no greater than the "significance level" that you specify. The probability of a Type 1 error can be thought of as the significance level, which is frequently represented by the Greek letter alpha.
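The claim above can be checked with a quick simulation. This is a minimal sketch (the z-test setup, sample size, and number of trials are our own illustrative choices, not from the article): we run many experiments in which the null hypothesis really is true and count how often a two-tailed test at alpha = 0.05 wrongly rejects it.

```python
# Sketch: when the null hypothesis is true, a test at significance level
# alpha should commit a Type 1 error about alpha of the time.
import math
import random
import statistics

random.seed(42)

ALPHA = 0.05
Z_CRIT = 1.96          # two-tailed critical z-value for alpha = 0.05
N = 30                 # sample size per experiment (illustrative choice)
TRIALS = 5000

false_rejections = 0
for _ in range(TRIALS):
    # Null is true by construction: the population mean really is 0.
    sample = [random.gauss(0, 1) for _ in range(N)]
    # z-statistic with known sigma = 1: mean / (sigma / sqrt(N))
    z = statistics.mean(sample) / (1 / math.sqrt(N))
    if abs(z) > Z_CRIT:                  # we (wrongly) reject the null
        false_rejections += 1

rate = false_rejections / TRIALS
print(f"Observed Type 1 error rate: {rate:.3f}")  # close to 0.05
```

Over many trials the observed false-rejection rate hovers around 0.05, which is exactly what "the probability of a Type 1 error is no greater than the significance level" means in practice.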
So, how sure do you need to be?
What will happen if you are incorrect? In that situation, your friend will unfriend you – and we are not just talking about Facebook! He will be mad at you. He might even become violent.
Hence, you want to have a very low chance of being wrong while making your statement. You want to use an extremely low alpha value. Would you be okay with a 5% chance that you are making something up? Probably not. You would rather have near certainty, with only a 0.01% risk of being wrong.
In such a case, the probability of being wrong should be very low. If you take 5% as the significance level here, that means accepting a 5% chance of being wrong. (If you tell your friend his wife is having an affair, the probability of being wrong should be very small, or you should have enough evidence to back up the claim.)
Now consider a different setting: reading academic research reports. A researcher who claims to have rejected the null hypothesis at an alpha level of 0.05 has up to a 5% chance of being mistaken.
However, there is a safeguard: everyone else in the academic research community is aware of that risk. Other researchers therefore repeat the same experiment to make sure the initial conclusion was accurate.
What has the initial researcher lost if it turns out that rejecting the null was incorrect? Just a little bit, and possibly not even that.
Let's get into another intuition-based understanding:
For instance, a pharmaceutical company is about to launch a new drug into the market and wants to ensure that the drug is both safe and effective.
Their null hypothesis is that the drug is either unsafe or ineffective. They will reject that null and only release the drug if they are confident that it is both safe and effective. They want to use a very low significance level because they could be sued for millions of dollars if they are wrong.
You might be wondering why we haven't mentioned the p-value yet. In actuality, we already have.
If the null hypothesis is true, the p-value is the probability of seeing an experimental result at least as extreme as the one observed. By "extreme," we mean a large departure from what we would expect if the null were true. In a one-tailed test (e.g., the null hypothesis is that the population mean is greater than or equal to a specified value), the null hypothesis is rejected if the sample mean is significantly less than that value.
In a two-tailed test (e.g., the null hypothesis is that the population mean is exactly equal to a specified value), the null hypothesis is rejected if the sample mean is significantly greater or lower than the hypothetical value.
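The one-tailed versus two-tailed distinction can be made concrete with a short sketch (the z-statistic value below is our own illustrative number, not from the article), using Python's built-in standard normal distribution.

```python
# Sketch: one-tailed vs two-tailed p-values for a hypothetical z-statistic.
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, std dev 1
z = 1.8                     # hypothetical observed test statistic

# One-tailed: only a deviation in one direction counts as evidence
# against the null, so the p-value is the area in a single tail.
p_one = std_normal.cdf(-abs(z))

# Two-tailed: a large deviation in either direction counts against the
# null, so the p-value is the area in both tails.
p_two = 2 * std_normal.cdf(-abs(z))

print(f"one-tailed p = {p_one:.4f}")   # ≈ 0.0359
print(f"two-tailed p = {p_two:.4f}")   # ≈ 0.0719
```

Note that the same observed statistic is "significant" at alpha = 0.05 under the one-tailed test but not under the two-tailed one, which is why the choice of test must be made before looking at the data.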
The decision rule, "reject the null hypothesis if the observed value of the test statistic is more extreme than the critical value", is the same as the decision rule, "reject the null hypothesis if the p-value is less than alpha", in a statistical hypothesis test.
This is because the critical value of the test statistic is the one for which the p-value equals the alpha level you specify.
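That equivalence between the two decision rules can be verified directly. The sketch below (illustrative z-values of our own choosing) computes the two-tailed critical value for alpha = 0.05 and checks that "statistic beyond the critical value" and "p-value below alpha" always agree.

```python
# Sketch: the critical-value rule and the p-value rule are the same rule.
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# Two-tailed critical value: the z for which the p-value equals alpha.
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # ≈ 1.96

for z in (1.5, 2.5):   # one non-significant and one significant statistic
    p_value = 2 * std_normal.cdf(-abs(z))
    # Both decision rules give the same verdict:
    assert (abs(z) > z_crit) == (p_value < alpha)
    print(f"z = {z}: p = {p_value:.4f}, reject = {p_value < alpha}")
```

For z = 1.5 the p-value is about 0.13, so neither rule rejects; for z = 2.5 it is about 0.012, so both do.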
Finally, the answer to your question: p-value is the probability of getting the actual observed results, if the null hypothesis is true.
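Under that definition, a p-value can be computed exactly for a simple made-up experiment (the coin-flip numbers below are our own illustration, not from the article): suppose you observe 60 heads in 100 flips and the null hypothesis is that the coin is fair.

```python
# Sketch: an exact two-tailed p-value for 60 heads in 100 fair-coin flips.
from math import comb

n, heads = 100, 60

def binom_pmf(k: int) -> float:
    """Probability of exactly k heads in n flips of a fair coin."""
    return comb(n, k) * 0.5 ** n

# P-value: probability, assuming the null (fair coin) is true, of a
# result at least as extreme as the one observed. The distribution is
# symmetric, so the two-tailed value is twice the upper-tail sum
# (60 or more heads, mirrored by 40 or fewer).
p_value = 2 * sum(binom_pmf(k) for k in range(heads, n + 1))

print(f"p-value = {p_value:.4f}")
```

The result comes out just above 0.05, so at the conventional 5% level this coin would narrowly escape being declared unfair, a nice reminder that 0.05 is a threshold, not a verdict.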
Well, if you want a quick and rough survey, and it doesn't matter much if you reach the wrong decision, you can use a significance level of 0.2 (80% confidence that you have got the answer right).
However, if lives depend on it, as in a drug trial, and you don't want to kill more people than the standard treatment would, then you choose a significance level of 0.01 (99% confidence).
Here's some bad science: choose a confidence level for your sponsored research, say 95%, then find you don't quite have a 'result', as in, your finding only reaches 93% confidence, and then decide to lower the threshold to, say, 90%, so that you can report a positive result and thereby ensure that you get further funding.
Conclusion:
A significance threshold of 0.05 means that, when the null hypothesis is actually true, you will wrongly reject it about 5 times out of 100.
0.05 is the conventional upper limit for significance, and a p-value below 0.01 is almost always treated as significant. Ultimately, though, the experimenter sets the significance level.
If you want to learn more about Hypothesis Testing and build a career in Data Science, you can sign up for AlmaBetter's Full Stack Data Science program.
Read our latest case study on Netflix Churn Prediction.