When conducting a hypothesis test with a binomial distribution (sometimes called a binomial test), there are three ways to calculate the P-value, with additional variations possible. The only exact calculation uses the binomial probability distribution itself. The other two methods are approximations based on the standard normal distribution, valid only when certain criteria are met: one works with the sample counts and the other with the sample proportions. In both approximating cases, a continuity correction can be applied to account for using a continuous distribution to approximate a discrete one.
This problem introduces the method for obtaining an approximate P-value using the standard normal distribution as a reasonable approximation for the binomial distribution of counts. The method demonstrates the continuity correction, and the demonstration uses Excel to obtain the answers. VERY IMPORTANT: This method works only if the minimum of $np$ and $n(1-p)$ is 10 or greater. If this criterion is not met, the normal approximation is not very accurate.
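The sample-size condition can be checked mechanically before any z-score is computed. The following is a minimal sketch, assuming the two quantities being compared are the expected number of successes $np$ and the expected number of failures $n(1-p)$; the function name `normal_approx_ok` is hypothetical.

```python
def normal_approx_ok(n, p, threshold=10):
    """Return True when both expected counts reach the threshold."""
    expected_successes = n * p        # np
    expected_failures = n * (1 - p)   # n(1 - p)
    return min(expected_successes, expected_failures) >= threshold


# Demonstration problem: 50 questions, 5 options each, so p = 0.2.
print(normal_approx_ok(50, 0.2))
# A shorter 20-question test with the same p has only 4 expected successes.
print(normal_approx_ok(20, 0.2))
```

When the check fails, the exact binomial calculation is the safer choice.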
For this demonstration problem, we will test the hypothesis that a student has performed better than chance on a multiple-choice test. The test is composed of 50 multiple-choice questions with 5 possible answers for each question. If a student is randomly guessing on each question, then it is reasonable to treat the responses as a random sample in which each answer is correct with probability $p = 1/5 = 0.2$. What can we conclude if a student scored 14 correct answers? In particular, can we conclude that this is significantly better than chance?
To start, we construct the hypotheses for this problem. With 5 options per question and 50 total questions, the average number of correct answers (successful observations) expected from guessing would be $\mu = np = 50(0.2) = 10$. Because the researcher is interested in a performance better than chance, this suggests a one-tailed test (as can be seen in the choice of $H_a$):
$$H_0: \mu = 10 \qquad H_a: \mu > 10$$
The distribution under examination is the binomial count distribution. As indicated above, this is a nearly normal distribution with $\mu = np = 10$ and $\sigma = \sqrt{np(1-p)} = \sqrt{50(0.2)(0.8)} \approx 2.828$. With this information, we can calculate a z-score as the test statistic for this scenario. With the observed value of 14 correct answers, we could use the following:
$$z = \frac{x - \mu}{\sigma} = \frac{14 - 10}{2.828} \approx 1.414$$
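The arithmetic for this uncorrected test statistic can be sketched in a few lines of Python. The variable names are illustrative only, and the values are those of the demonstration problem (50 questions, a 0.2 chance of guessing correctly, 14 correct answers).

```python
import math

n, p = 50, 0.2   # number of questions, chance of a correct guess
x = 14           # observed number of correct answers

mu = n * p                          # expected count under the null hypothesis
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of the count
z_uncorrected = (x - mu) / sigma    # z-score without continuity correction

print(f"mu = {mu:.1f}, sigma = {sigma:.3f}, z = {z_uncorrected:.3f}")
```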
While this provides an adequate approximation in most circumstances, a small adjustment can improve it slightly. In particular, the value 14 means something different on a continuous distribution than on a discrete one. On a continuous distribution, it is better to think of 14 as the range from 13.5 to 14.5. Using conventional probability notation, this is to suggest that
$$P(X = 14) \approx P(13.5 < Y < 14.5)$$
where $X$ is a discrete variable and $Y$ is a continuous variable.
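To see why 14 is treated as the range from 13.5 to 14.5, the exact binomial probability of exactly 14 correct answers can be compared with the area under the approximating normal curve over that range. This is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `normal_cdf` are hypothetical, and the normal curve is assumed to have the mean and standard deviation of the binomial count for 50 questions with success probability 0.2.

```python
import math


def binom_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2)))


n, p = 50, 0.2
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

exact_bar = binom_pmf(14, n, p)  # discrete: P(X = 14)
normal_strip = normal_cdf(14.5, mu, sigma) - normal_cdf(13.5, mu, sigma)

print(f"P(X = 14)          = {exact_bar:.4f}")
print(f"P(13.5 < Y < 14.5) = {normal_strip:.4f}")
```

The two printed values should be close, which is the sense in which the strip from 13.5 to 14.5 stands in for the single discrete value 14.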
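There is one complicating issue when using the continuity correction: do you use the lower bound or the upper bound? To answer this, it is best to consider the P-value. The P-value is the probability, assuming the null hypothesis is true, of the observed result or one more extreme. As such, it makes sense to choose the bound closer to the hypothesized mean, so that the entire range representing the observed value is included in the tail. For our example, we would have
$$z = \frac{(x - 0.5) - \mu}{\sigma}$$
Using this formula, the test statistic for this sample would be the z-score $z = \frac{13.5 - 10}{2.828} \approx 1.237$.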
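The choice of bound can be written as a small function. This is a sketch under stated assumptions: the name `corrected_z` and its `alternative` argument are hypothetical, an upper-tailed test shifts the observed count down by 0.5, and a lower-tailed test shifts it up by 0.5, so that in both cases the bound nearer the hypothesized mean is used.

```python
import math


def corrected_z(x, n, p, alternative="greater"):
    """z-score for a binomial count with a continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    if alternative == "greater":
        bound = x - 0.5   # P(X >= x) is approximated by P(Y >= x - 0.5)
    elif alternative == "less":
        bound = x + 0.5   # P(X <= x) is approximated by P(Y <= x + 0.5)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return (bound - mu) / sigma


# Demonstration problem: 14 correct out of 50 with p = 0.2, upper-tailed.
print(f"z = {corrected_z(14, 50, 0.2):.3f}")
```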
Using the standard normal distribution, we can now calculate the P-value for this scenario:
$$P\text{-value} = P(Z > 1.237) \approx 0.1080$$
This can be obtained from Excel using the following formula:
=1-NORMSDIST(1.237)
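The same upper-tail area can be reproduced outside Excel. The following is a minimal sketch in Python using only the standard library, where `math.erfc` supplies the standard normal tail; the helper name `upper_tail` is hypothetical.

```python
import math


def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_value = upper_tail(1.237)   # mirrors =1-NORMSDIST(1.237)
print(f"P-value = {p_value:.4f}")
```

Using the complementary error function directly avoids subtracting a cumulative probability from 1, which matters little here but loses precision for very large z-scores.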
Thus, with a traditional significance level of either $\alpha = 0.05$ or $\alpha = 0.01$, this P-value (approximately 0.108) exceeds $\alpha$ and would result in failing to reject the null hypothesis. There is not enough sample evidence to support the claim that the student performed any better than would be expected by chance or random guessing.
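For comparison, the exact binomial calculation can be carried out directly by summing the probabilities of 14 or more correct answers. This is a minimal sketch, assuming the demonstration values of 50 questions, a 0.2 chance of a correct guess, and an observed count of 14; the function name `exact_upper_tail` is hypothetical.

```python
import math


def exact_upper_tail(x, n, p):
    """Exact P(X >= x) for a binomial count X with n trials and success chance p."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(x, n + 1)
    )


exact_p = exact_upper_tail(14, 50, 0.2)
print(f"Exact binomial P-value = {exact_p:.4f}")
```

The exact tail is also larger than 0.05, so it leads to the same decision as the normal approximation in this example.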
Note: For a two-tailed hypothesis test, it would only be necessary to double the P-value obtained by this protocol.
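The doubling rule for a two-tailed test can be sketched as follows. The function name `two_tailed_p` is hypothetical, and capping the result at 1 is an added safeguard rather than part of the note above.

```python
import math


def two_tailed_p(z):
    """Two-tailed P-value: twice the one-tail area beyond |z|, capped at 1."""
    one_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return min(1.0, 2 * one_tail)


print(f"Two-tailed P-value = {two_tailed_p(1.237):.4f}")
```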
Exercise Problem
On a multiple-choice exam with 80 questions and 4 possible answers for each question, a student gets a score of . You wish to conduct a hypothesis test to determine if this score is significantly better than would be expected by chance (i.e., by simply guessing for each question). You decide to use a test with .
What is the hypothesized population proportion for this test?
(Report answer as a decimal accurate to 2 decimal places. Do not report using the percent symbol.)
Using this population proportion, what is the hypothesized average number of correct responses on this test from guessing each answer?
Based on the researcher's understanding of the situation, how many tails would this hypothesis test have?
Choose the correct pair of hypotheses for this situation:
Using the normal approximation for the binomial distribution with the continuity correction, what is the test statistic for this sample (this student's test score)?
(Report answer as a decimal accurate to 3 decimal places.)
You are now ready to calculate the P-value for this sample.
P-value =
(Report answer as a decimal accurate to 4 decimal places.)
This P-value (and test statistic) leads to a decision to...
As such, the final conclusion is that...