set.seed(1)
S <- c(1, .3, .8, # create covariance matrix for sample data
.3, 1, .5,
.8, .5, 1)
Sigma <- matrix(S, ncol=3, byrow=TRUE)
n <- 10
library(MASS) # for mvrnorm
x <- mvrnorm(n = n, mu=c(0,0,0), Sigma = Sigma) # multivariate normally distributed data
d <- as.data.frame(x)
# dependent variable Y as linear combination plus error
d <- transform(d, Y = scale(V1) + scale(V2) + scale(V3) + rnorm(n, sd=.5))

Regression - Custom parameter hypotheses
Introduction
When performing a regression, the default output in most statistical programs tests every parameter against the null hypothesis that the population parameter equals zero, i.e. that there is no effect. Yet, in many cases this is not what we want. We may need to specify custom hypotheses like the following: Does a parameter equal some specific value other than zero, e.g. 1? Are the effects of two predictors identical?
Next, we will estimate a regression model and output the coefficients. The function summary returns several results for a regression model; we are only interested in the coefficients. The summary object s is made up of several smaller objects (type names(s)), one of which contains a matrix of the estimated coefficients.
m <- lm(Y ~ V1 + V2, data=d)
s <- summary(m)
b <- s$coefficients
b # b is a matrix of coefficients
             Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 0.6428166 0.2494192 2.577254 3.661856e-02
V1 1.6956870 0.2123248 7.986286 9.215308e-05
V2 1.7717522 0.3224852 5.494057 9.122638e-04
The standard test statistic, as given in the output, tests the default hypothesis of a zero effect, i.e. that the coefficient equals 0 in the population.
To retrieve the estimate, standard error (s.e.), t- and p-value for a coefficient, you can index the matrix b, e.g. for V1.
b["V1",]
    Estimate   Std. Error      t value     Pr(>|t|)
1.695687e+00 2.123248e-01 7.986286e+00 9.215308e-05
Before we test custom hypotheses, let's reproduce the t- and p-value of the intercept step by step, to get a better understanding of what happens.
b_h0 <- 0 # null hypothesis
b_hat <- b[1,1] # estimate for the intercept (first row)
b_se <- b[1,2] # s.e. for the intercept
ts <- (b_hat - b_h0) / b_se # test statistic, i.e. t-value
deg <- m$df.residual # degrees of freedom of residual, i.e. 10-3
2*pt(abs(ts), deg, lower=FALSE) # note that it is a two sided test, hence 2 x ...
[1] 0.03661856
As the test is two-sided, the t-value (test value) cuts off a region of the t-distribution at both ends. The size of the region equals the p-value.
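An equivalent view of the same two-sided test is the confidence interval: the null hypothesis is rejected at the 5% level exactly when the hypothesized value lies outside the 95% confidence interval. A short sketch, using b_hat, b_se and deg from above:

```r
alpha <- 0.05
crit <- qt(1 - alpha/2, deg)          # two-sided critical t value
b_hat + c(-1, 1) * crit * b_se        # 95% confidence interval for the intercept
confint(m)["(Intercept)", ]           # the built-in equivalent
```

Both lines should return the same interval; since 0 lies just outside it, the default test is (barely) significant at the 5% level, matching the p-value of 0.037.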

Testing if a parameter has a specific value
Now our goal is to test another hypothesis, not the default that the population parameter is equal to zero. There are several ways to do that in R; we will explore three.
1. t-Test
The first approach is straightforward. To test against another value than zero, we merely need to change the null hypothesis of the t-test. E.g. to test if the intercept equals 1:
b_h0 <- 1 # null hypothesis
ts <- (b_hat - b_h0) / b_se # test statistic, i.e. t-value
2*pt(abs(ts), deg, lower=FALSE) # note that it is a two sided test
[1] 0.1952273
For convenience, we can wrap the code into a function that allows us to select a coefficient and specify a custom null hypothesis value.
# m: lm model object
# b.index: index of model parameter to test
# b_h0: hypothesis to test parameter against
#
test_h0 <- function(m, b.index, b_h0)
{
  b <- summary(m)$coef            # get coefficients
  ts <- (b[b.index, 1] - b_h0) /
        b[b.index, 2]             # test statistic, i.e. t-value
  deg <- m$df.residual            # degrees of freedom of residual
  2*pt(abs(ts), deg, lower=FALSE) # note that it is a two sided test
}
Let's test our function.
test_h0(m, 1, 1)
[1] 0.1952273
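As a quick sanity check of the function, we can run it over all coefficients with the default null hypothesis of zero; this should reproduce the Pr(>|t|) column of the regression output. A minimal sketch:

```r
# apply test_h0 to each row of the coefficient matrix with b_h0 = 0
sapply(seq_len(nrow(summary(m)$coef)), test_h0, m = m, b_h0 = 0)
```

The returned vector matches the default p-values (0.0366, 9.2e-05, 9.1e-04).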
2. Comparing a model and a linearly restricted model
Another way to achieve the same is to compare our model to a linearly restricted model. This approach allows testing multiple hypotheses on multiple parameters at once. The function linearHypothesis from the car package performs a test for hypotheses of this kind. The argument hypothesis.matrix takes a matrix where every row specifies a linear combination of the parameters; the argument rhs takes the right-hand side, i.e. the hypothesized result of each combination. In our case we want to specify the hypothesis that the intercept equals 1.
library(car)
h <- linearHypothesis(m, hypothesis.matrix = c(1, 0, 0), rhs=1)
h
Linear hypothesis test
Hypothesis:
(Intercept) = 1
Model 1: restricted model
Model 2: Y ~ V1 + V2
Res.Df RSS Df Sum of Sq F Pr(>F)
1 8 4.8531
2 7 3.7535 1 1.0997 2.0508 0.1952
Note that the p-value is exactly the same as in the t-Test before.
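This is no coincidence: for a single linear restriction, the F statistic is simply the squared t statistic, so both tests are equivalent. A quick check using the h object from above:

```r
b <- summary(m)$coef
ts <- (b[1, 1] - 1) / b[1, 2] # t statistic for the hypothesis intercept = 1
ts^2                          # squared t statistic, approx. 2.0508
h$F[2]                        # F statistic from linearHypothesis, the same value
```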
3. Reparametrization
A third way to test this hypothesis is to slightly rewrite the regression model. This process is called reparametrization. Subtracting 1 from both sides of the model equation Y = b0 + b1*V1 + b2*V2 + e gives Y - 1 = (b0 - 1) + b1*V1 + b2*V2 + e. The intercept of a regression of Y - 1 on V1 and V2 is therefore b0 - 1, and the default zero test on this new intercept is exactly the test of b0 = 1.
We can now use this new parametrization to get what we want.
m2 <- lm(I(Y - 1) ~ V1 + V2, data=d)
summary(m2)$coef
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept) -0.3571834 0.2494192 -1.432061 1.952273e-01
V1 1.6956870 0.2123248 7.986286 9.215308e-05
V2 1.7717522 0.3224852 5.494057 9.122638e-04
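The new intercept is just the original intercept shifted by 1, which we can verify directly:

```r
coef(m)["(Intercept)"] - 1   # original intercept minus 1
coef(m2)["(Intercept)"]      # intercept of the reparametrized model, the same value
```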
Again note that the p-value for the intercept (0.1952) equals the one from the t-test above.
More complicated hypotheses
We may be interested in testing more complicated hypotheses, e.g. whether the effects of two parameters are identical, i.e. b1 = b2. With linearHypothesis this is straightforward. Just reformulate the hypothesis to get a scalar value on the right-hand side, i.e. b1 - b2 = 0.
h <- linearHypothesis(m, hypothesis.matrix = c(0, 1, -1), rhs=0)
h
Linear hypothesis test
Hypothesis:
V1 - V2 = 0
Model 1: restricted model
Model 2: Y ~ V1 + V2
Res.Df RSS Df Sum of Sq F Pr(>F)
1 8 3.7754
2 7 3.7535 1 0.021973 0.041 0.8453
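Since hypothesis.matrix accepts one row per restriction, several hypotheses can also be tested jointly. A sketch combining the two restrictions used so far into a single F test with 2 degrees of freedom:

```r
library(car)
H <- rbind(c(0, 1, -1),   # V1 - V2 = 0
           c(1, 0,  0))   # (Intercept) = 1
linearHypothesis(m, hypothesis.matrix = H, rhs = c(0, 1))
```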
Again, reparametrization can be used as well. Writing b1 = (b1 - b2) + b2, the model becomes Y = b0 + (b1 - b2)*V1 + b2*(V1 + V2) + e. The coefficient of V1 in this new parametrization is thus the difference b1 - b2, which we can test against zero with the default output.
m3 <- lm(Y ~ V1 + I(V1 + V2), data=d)
b3 <- summary(m3)$coef
b3
               Estimate Std. Error    t value     Pr(>|t|)
(Intercept) 0.64281660 0.2494192 2.5772544 0.0366185604
V1 -0.07606522 0.3757534 -0.2024339 0.8453354355
I(V1 + V2) 1.77175222 0.3224852 5.4940574 0.0009122638
The estimate for V1 now equals the difference of the two original slope estimates, and its p-value (0.8453) matches the linearHypothesis result above.
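The V1 coefficient of m3 can be verified directly as the difference of the original slope estimates:

```r
coef(m)["V1"] - coef(m)["V2"]   # difference of the original slopes
coef(m3)["V1"]                  # V1 coefficient in the reparametrized model, the same value
```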
Deriving standard errors for combinations of parameters
Another approach to conduct a statistical test is to derive the standard error for a weighted parameter combination directly. Let theta = w1*b1 + w2*b2 be a linear combination of two parameters with weights w1 and w2. Then the variance of theta is
Var(theta) = w1^2 * Var(b1) + w2^2 * Var(b2) + 2 * w1 * w2 * Cov(b1, b2),
and the standard error of theta is the square root of this expression. The s.e. for any combination of parameters can thus be derived from the estimated variance/covariance matrix of the regression parameters. Let's get it.
cm <- vcov(m)
cm
            (Intercept)          V1          V2
(Intercept) 0.062209914 0.003229363 0.029715547
V1 0.003229363 0.045081838 0.003943975
V2 0.029715547 0.003943975 0.103996706
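As a sanity check, this matrix can be reproduced by hand: for a linear model it equals the estimated residual variance times the inverse of the cross-product of the design matrix, i.e. sigma^2 * (X'X)^-1.

```r
X <- model.matrix(m)            # design matrix including the intercept column
sigma2 <- summary(m)$sigma^2    # estimated residual variance
sigma2 * solve(t(X) %*% X)      # equals vcov(m)
```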
The combination we are interested in is theta = b1 - b2, i.e. w1 = 1 and w2 = -1.
s2.theta <- 1^2*cm[2,2] + (-1)^2*cm[3,3] + 2*1*(-1)*cm[2,3] # variance of theta
s.theta <- s2.theta^.5 # standard error of theta
s.theta
[1] 0.3757534
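More generally, for any weight vector w the variance of the combination is the quadratic form w' V w, where V is the variance/covariance matrix. A compact equivalent of the computation above:

```r
w <- c(0, 1, -1)                     # weights: 0 for the intercept, 1 and -1 for V1 and V2
sqrt(drop(t(w) %*% vcov(m) %*% w))   # same standard error as s.theta
```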
Note that the value matches the result from the approach above. We can now perform a t-test using this result for the s.e. The results are already shown in the regression output from the reparametrization above. But for the sake of completeness we will do it here once more by hand.
b_h0 <- 0 # null hypothesis
ts <- (b3[2, 1] - b_h0) / s.theta # test statistic, i.e. t-value
2*pt(abs(ts), deg, lower=FALSE) # note that it is a two sided test
[1] 0.8453354
Note that the p-value equals the one shown in the regression output from the reparametrization above.
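For convenience, the car package also provides deltaMethod, which computes the estimate and standard error of a function of the parameters directly from the model; a sketch (assuming car is loaded):

```r
library(car)
deltaMethod(m, "V1 - V2") # estimate and s.e. for the difference of the two slopes
```

The reported standard error should match s.theta from the computation by hand.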