weba

Karma: 0

How do you calculate the p-value of a normal distribution?

Mean = 100, standard deviation = 25

0 Views: 406 Answers: 1 Posted: 14 years ago

1 Answer

anthonyr32

Karma: 135

Calculating a Single p Value From a Normal Distribution

We look at the steps necessary to calculate the p-value for a particular test. In the interest of simplicity we only look at a two-sided test, and we focus on one example. Here we want to test whether the mean differs from a fixed value, a.

H0: mu_x = a,
Ha: mu_x not = a.

The p-value is calculated for a particular sample mean. Here we assume that we obtained a sample mean, xbar, and want to find its p-value. It is the probability, assuming the null hypothesis is true, of obtaining a sample mean whose Z-score is greater than the absolute value of the observed Z-score or less than the negative of that absolute value.

For the special case of a normal distribution we also need the standard deviation. We will assume that we are given the standard deviation and call it s. The calculation of the p-value can be done in several ways. We will look at two ways here. The first way is to convert the sample mean to its associated Z-score. The other way is to simply specify the mean and standard deviation and let the computer do the conversion. At first glance it may seem like a no-brainer that we should just use the second method. Unfortunately, when using the t distribution we need to convert to the t-score ourselves, so it is a good idea to know both ways.

We first look at how to calculate the p-value using the Z-score. The Z-score is found by assuming that the null hypothesis is true, subtracting the assumed mean from the sample mean, and dividing by the standard deviation of the sample mean, s/sqrt(n), where n is the sample size. Once the Z-score is found, the probability that a value could be less than the Z-score is found using the pnorm command.

This is not enough to get the p-value. If the Z-score that is found is positive then we need to take one minus the associated probability. Also, for a two-sided test we need to multiply the result by two. Here we avoid these issues and ensure that the Z-score is negative by taking the negative of the absolute value.

We now look at a specific example. In the example below we will use a value of a of 5, a standard deviation of 2, and a sample size of 20. We then find the p-value for a sample mean of 7:

> a <- 5
> s <- 2
> n <- 20
> xbar <- 7
> z <- (xbar-a)/(s/sqrt(n))
> z
[1] 4.472136
> 2*pnorm(-abs(z))
[1] 7.744216e-06
We now look at the same problem, this time specifying the mean and standard deviation within the pnorm command. Note that for this case we cannot so easily force the use of the left tail. Since the sample mean is greater than the assumed mean, we have to take two times one minus the probability:

> a <- 5
> s <- 2
> n <- 20
> xbar <- 7
> 2*(1-pnorm(xbar,mean=a,sd=s/sqrt(n)))
[1] 7.744216e-06
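The steps above can be wrapped in a small helper. The following is a minimal sketch and not part of the original answer: the function name z_p_value is made up for illustration, and the sample mean of 110 and sample size of 30 are hypothetical values, chosen only to show how the mean of 100 and standard deviation of 25 from the question might be plugged in.

```r
# Two-sided p-value for a sample mean under a normal model with a known
# standard deviation. The name z_p_value is hypothetical.
z_p_value <- function(xbar, a, s, n) {
  z <- (xbar - a) / (s / sqrt(n))   # Z-score under the null hypothesis
  2 * pnorm(-abs(z))                # both tails, computed from the left tail
}

# Reproduces the example above (a = 5, s = 2, n = 20, xbar = 7).
p_example <- z_p_value(xbar = 7, a = 5, s = 2, n = 20)

# Hypothetical use with the question's mean of 100 and standard deviation of 25.
# The sample mean (110) and sample size (30) are assumed, not given in the question.
p_question <- z_p_value(xbar = 110, a = 100, s = 25, n = 30)

print(c(p_example, p_question))
```

Because the function always works with the negative of the absolute Z-score, it does not matter whether the sample mean falls above or below the assumed mean.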
Calculating a Single p Value From a t Distribution

Finding the p-value using a t distribution is very similar to using the Z-score as demonstrated above. The only difference is that you have to specify the number of degrees of freedom, which for a single sample of size n is n-1. Here we look at the same example as above but use the t distribution instead:

> a <- 5
> s <- 2
> n <- 20
> xbar <- 7
> t <- (xbar-a)/(s/sqrt(n))
> t
[1] 4.472136
> 2*pt(-abs(t),df=n-1)
[1] 0.0002611934
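To get a feel for how much the choice of distribution matters, the sketch below compares the t-based p-value with the normal-based one as the degrees of freedom grow. It is an added illustration using the same numbers as above; the particular values in dfs are arbitrary.

```r
# Same score as in the examples above.
a <- 5; s <- 2; n <- 20; xbar <- 7
score <- (xbar - a) / (s / sqrt(n))

# Two-sided p-values from the t distribution for increasing degrees of freedom.
dfs <- c(19, 100, 1000, 10000)        # arbitrary values chosen for illustration
p_t <- 2 * pt(-abs(score), df = dfs)  # pt is vectorized over df
p_z <- 2 * pnorm(-abs(score))         # normal-based p-value for comparison

print(data.frame(df = dfs, p_t = p_t, p_z = p_z))
```

The t-based values shrink toward the normal-based value as the degrees of freedom increase, which is why the two methods give noticeably different answers for a sample of only 20.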
We now look at an example where we have a univariate data set and want to find the p-value. In this example we use one of the data sets given in the data input chapter. We use the w1 data set:

> w1 <- read.csv(file="w1.dat",sep=",",header=TRUE)
> summary(w1)
      vals
 Min.   :0.130
 1st Qu.:0.480
 Median :0.720
 Mean   :0.765
 3rd Qu.:1.008
 Max.   :1.760
> length(w1$vals)
[1] 54
Here we use a two-sided hypothesis test,

H0: mu1 = 0.7,
Ha: mu1 not = 0.7.

So we calculate the sample mean and sample standard deviation in order to calculate the p-value:

> t <- (mean(w1$vals)-0.7)/(sd(w1$vals)/sqrt(length(w1$vals)))
> t
[1] 1.263217
> 2*pt(-abs(t),df=length(w1$vals)-1)
[1] 0.21204
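R also provides a built-in t.test function that performs a one-sample t test in a single step. The sketch below checks the manual calculation against it. Since the w1.dat file is not reproduced here, the vector vals is made-up data standing in for w1$vals, so the numbers will differ from those above.

```r
# Made-up data standing in for w1$vals (the real file is not included here).
vals <- c(0.43, 0.95, 0.61, 1.20, 0.72, 0.88, 0.35, 1.05, 0.79, 0.66)
mu0  <- 0.7   # value of the mean under the null hypothesis

# Manual two-sided p-value, following the same steps as above.
tscore   <- (mean(vals) - mu0) / (sd(vals) / sqrt(length(vals)))
p_manual <- 2 * pt(-abs(tscore), df = length(vals) - 1)

# The same test using the built-in one-sample t test.
p_builtin <- t.test(vals, mu = mu0)$p.value

print(c(manual = p_manual, builtin = p_builtin))
```

The two values should agree, which makes t.test a convenient check on a hand calculation.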
Calculating Many p Values From a t Distribution

Suppose that you want to find the p-values for many tests. This is a common task and most software packages will allow you to do this. Here we see how it can be done in R.

Here we assume that we want to do a hypothesis test for a number of comparisons. In particular we will look at three hypothesis tests, computing the one-sided (single-tail) p-value for each. The hypotheses are written in the following two-sided form:

H0: mu1 - mu2 = 0,
Ha: mu1 - mu2 not = 0,
We have three different sets of comparisons to make:

Comparison 1
            Mean    Std. Dev.   Number (pop.)
Group I     10      3           300
Group II    10.5    2.5         230

Comparison 2
            Mean    Std. Dev.   Number (pop.)
Group I     12      4           210
Group II    13      5.3         340

Comparison 3
            Mean    Std. Dev.   Number (pop.)
Group I     30      4.5         420
Group II    28.5    3           400

For each of these comparisons we want to calculate a p-value. Before we can do that we must first compute a standard error and a t-score. For each comparison there are two groups. We will refer to group one as the group whose results are in the first row of each comparison above, and to group two as the group whose results are in the second row. We will find general formulae, which are necessary in order to do all three calculations at once.

We assume that the means for the first group are defined in a variable called m1. The means for the second group are defined in a variable called m2. The standard deviations for the first group are in a variable called sd1. The standard deviations for the second group are in a variable called sd2. The sample sizes for the first group are in a variable called num1. Finally, the sample sizes for the second group are in a variable called num2.

With these definitions the standard error is the square root of (sd1^2)/num1+(sd2^2)/num2. The associated t-score is m1 minus m2, all divided by the standard error. The R commands to do this can be found below:

> m1 <- c(10,12,30)
> m2 <- c(10.5,13,28.5)
> sd1 <- c(3,4,4.5)
> sd2 <- c(2.5,5.3,3)
> num1 <- c(300,210,420)
> num2 <- c(230,340,400)
> se <- sqrt(sd1*sd1/num1+sd2*sd2/num2)
> t <- (m1-m2)/se
To see the values just type in the variable name on a line alone:

> m1
[1] 10 12 30
> m2
[1] 10.5 13.0 28.5
> sd1
[1] 3.0 4.0 4.5
> sd2
[1] 2.5 5.3 3.0
> num1
[1] 300 210 420
> num2
[1] 230 340 400
> se
[1] 0.2391107 0.3985074 0.2659216
> t
[1] -2.091082 -2.509364 5.640761
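To use the pt command we need to specify the number of degrees of freedom. Here we take the smaller of the two sample sizes in each comparison minus one, which can be computed using the pmin command. Note that there is also a command called min, but it does not work the same way: min returns a single minimum over all of its arguments, while pmin returns the minimum for each pair of elements. You need to use pmin to get the correct results. The numbers of degrees of freedom are pmin(num1,num2)-1. So the p-values can be found using the following R command: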

> pt(t,df=pmin(num1,num2)-1)
[1] 0.01881168 0.00642689 0.99999998
If you entered all of these commands into R you should have noticed that the last p-value is not correct. The pt command gives the probability that a score is less than the specified t. The t-score for the last entry is positive, and we want the probability that a t-score is bigger. One way around this is to make sure that all of the t-scores are negative. You can do this by taking the negative of the absolute value of the t-scores:

> pt(-abs(t),df=pmin(num1,num2)-1)
[1] 1.881168e-02 6.426890e-03 1.605968e-08
The command above gives the p-values for a one-sided test. For the two-sided alternative stated earlier, each value would be doubled, as in the single-test examples above. Carrying this out is left as an exercise.
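As a sketch of how the steps in this section might be collected in one place, the function below returns the t-scores, degrees of freedom, and both one- and two-sided p-values for all comparisons at once. The function name compare_groups and its return format are assumptions made for illustration. The two-sided value simply doubles the one-tail probability, in the same way as the single-test examples earlier. The last two lines show the difference between min and pmin.

```r
# Hypothetical helper: p-values for several two-group comparisons at once.
compare_groups <- function(m1, m2, sd1, sd2, num1, num2) {
  se     <- sqrt(sd1^2 / num1 + sd2^2 / num2)  # standard error of the difference
  tscore <- (m1 - m2) / se
  df     <- pmin(num1, num2) - 1               # elementwise minimum: one df per comparison
  one_sided <- pt(-abs(tscore), df = df)       # tail probability beyond |t|
  data.frame(t = tscore, df = df,
             one_sided = one_sided,
             two_sided = 2 * one_sided)
}

# The three comparisons from the tables above.
m1   <- c(10, 12, 30);     m2   <- c(10.5, 13, 28.5)
sd1  <- c(3, 4, 4.5);      sd2  <- c(2.5, 5.3, 3)
num1 <- c(300, 210, 420);  num2 <- c(230, 340, 400)

res <- compare_groups(m1, m2, sd1, sd2, num1, num2)
print(res)

# min collapses everything to a single number; pmin keeps one value per comparison.
print(min(num1, num2))    # 210
print(pmin(num1, num2))   # 230 210 400
```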

14 years ago. Rating: 0