DescriptionChapter 5: Simple Regression

Introduction

The workhorse of economic statistics is the regression. It is used primarily to test theories to

see whether the data supports a hypothesis and to forecast future values, like stock prices or inflation.

In this chapter we will look at a simple regression where we look at the relationship between Y the

dependent variable and X the independent variable. In general we say that X determines Y.

Y = f (X )

Y = 0 + 1 X + U

Where 0 and 1 are the y-intercept and slope respectively. U is the error term. In class we show how

to calculate these parameters and to determine if they are statistically significant. In this project we will

understand how Excel and SAS calculate the parameters. We will further see how to use the output

from these programs to do analysis.

Example: GDP and Oil

One theory we can test is whether increases in oil prices is related with lower GDP. The idea is

that the US is a net oil importer and that when the price of oil goes up it costs more to produce goods so

GDP will be lower. In its simplest form the model is:

GDP = 0 + 1Oil + U

We would expect the slope will be negative and the y intercept a positive number. We can get the data

from a number of sources. Once source is the Economic Report of the President where we got this data.

We can put the data into Excel for the years 1959 to 2015.

Regression in Excel

We can use what we learned in the last four projects to do a regression analysis. The first set is to get

the data into a useable form. Each of the variables need to have the same number of observations. The

data set we will use here will look like this.

The variables are:

Year

GDP

M2

R

U

LR

Oil

Ex

1959 to 2015

Gross Domestic Product

Money Supply

Interest Rate (1 Year Treasury)

Unemployment Rate

Labor Force Participation Rate

Oil Price per barrel (Texas)

Exports

We will use a number of these variables in this and the next project. Notice that the labels are in row 1

and that there are no rows between the label and the data. This will be important later.

The simplest way to run a regression in Excel is to use the Data Analysis box under data.

This is the same box that we used in doing descriptive statistics and histograms in past projects. One of

the selections is regression. This is what we are going to use. Once you select regression and click ok.

You should see the following box:

In this case select the X and Y data by going to the data set and choosing GDP for Y and OIL for X.

Notice that the Label box is checked and that the data starts with the variable names in cells B1 and G1.

Also the New Worksheet button is checked which will put the results in a new worksheet, not with the

dataset.

Once you click OK a new worksheet will appear with the results. There will be two tables one with

ANOVA and the other with the regression coefficients. These are the results we can use to test our

hypothesis that when oil price rise GDP will fall.

Let’s look at the results to see what they say. First, the Adjusted R2 is .638 (in blue) this says that the

change in oil prices explain 63.8% of the change in GDP. The F statistic (in red) is greater that the critical

value – rule of thumb is if it is greater than 5 then the Adjusted R2 is statistically significant.

Next turn to the coefficients (in orange), OIL is positive and says that for every dollar oil prices rise GDP

goes up by $128 billion. This is the opposite of what we expected. Next we see if this coefficient is

significant and it is with a t statistic of 10 (in green) – the rule of thumb for t statistics is anything greater

than 2 is significant.

So what is wrong with are theory? What might be another theory that would explain this? (Hint: Time)

Single Functions:

If all we wanted was the coefficient 1 Excel has a formula for that called linest:

=LINEST(Data!B2:B58,Data!G2:G58)

Notice that data is first for Y and then for X separated by a comma. This will return 128.05.

=INTERCEPT(Data!B2:B58,Data!G2:G58)

Will give you 0 5458.06 just as the data analysis does.

You can also get the R2 use a formula called RSQ:

=RSQ(Data!B2:B58,Data!G2:G58)

This returns .645 just as above. The only problem is we have no easy way to get a t value for the

parameters.

Example of a regression using functions alone.

Here we have a calculation for the t test. In most cases using the Data Analysis Regression package will

be the easiest way to obtain regression results in Excel.

Regression in SAS

So the first step is to import the data using the import function. Remember if you are having

problems importing the data using the Excel workbook option try the Excel Workbook on PC files server

option

Then proceed as normal, click next, then give your data file a name like GDP.

Next you pick the worksheet with your data, remembering the first row are variable names and the

second row starts the data.

The last step is to give your dataset a name so SAS can find it.

Then click Finish. You should now have your data set ready to be used by SAS. If you look on the far left

under Explorer/Libraries/Work you should find your data with the name you gave it. If you click on it

you should see the data set. It is always good to check.

Next, turn to the Editor on the lower right of the program where again we will write the code that runs

the regression for us. Here we will do a simple regression, just as we did in Excel. We will look at the

relationship between GDP and Oil prices.

The first set is to create a new data set GDP1 from your original dataset GDP. (Note you do not have to

do this here but in the future if you want to create new variables this step will be helpful) The second

two lines give descriptive statistics, something we didn’t do in Excel but is always a good thing to do.

The last two lines run the regression for us.

PROC REG;

Calls the regression program and tells SAS to run it.

MODEL GDP=OIL;

The model statement tells SAS what the regression equation is. (Y variable = X variables). The run

commands just make sure each part of the program runs independently. Once you have typed in the

program as pictured above click on the running man button above in SAS. SAS will then take a few

seconds and then show you the output.

The first thing not reproduced here is the univariate output which tells you the mean, median, mode

and other statistics. The PROC REG output starts and gives the ANOVA with the R2 colored in purple and

the F stat in yellow. The parameter estimates (in green) and the t statistics (in red). Notice that they are

the same numbers we obtained in Excel.

SAS then gives a number of graphs to help you better understand the regression and the error terms.

These graphs allow us to see if the regression assumptions hold. You can see if there is a relationship

between the error terms, whether the errors are normally distributed and if the X (OIL) is independent

of the errors.

The last graph plots the trend line and the 95% confidence level. Again we can see the positive

relationship between OIL and GDP.

With just two lines all of these results are obtained in SAS. In the next chapter we will expand this

model to have multiple X variables and use other tests to ensure we have not violated any of our

assumptions.

EXAMPLE with DIFFERENCES

So we have a problem with our hypothesis. We were expecting a negative relationship but

instead it was positive. We might be worried that over time everything goes up and this might be

messing up the real relationship. One way to see if this is true is to look at the differences in GDP,

(GDP70-GDP69) and OIL (OIL70-OIL69), so we are looking at the change in GDP and OIL not GDP and OIL

itself. We saw a similar problem before when we indexed the data in our Graphing Chapter.

There are a number of ways to create differences. First you could just do it in Excel as part of

the data set.

DGDP is for difference in GDP for 1960 it is 3078.4-3052.2 =26.2. You will do the same for DOIL

expect the function will be =G3-G2. Once you have put these into I3 and J3 you can copy this for the

rest of the years. Notice you lose one year (1959) which is blank.

You can also create the difference in SAS, here I will call the two new variables GDPD and OILD

so as not to confuse them with the variables you created in Excel.

The editor program will look like this:

It is important to have the run command before you add the proc reg because you need to

create the variables before you can use them. To create the variables we use the lag command and the

just do the subtraction to get the difference. If you look at the dataset GDP1 in explore you will see that

GDPD and DGDP are the same.

Now let’s look at the results from our new regression. What can we say about the parameters

and the test statistics?

The R2 has dropped to 4.36% and the F stat tells us this is not significantly different from zero,

i.e., there is no relationship between GDP and oil prices. The parameter is still positive but the t statistic

says that we cannot reject the hypothesis that beta1 is zero. So time is important and it might be that

the positive relationship is because everything goes up with inflation. This is a cautionary note, that

even when the regression looks good we must be careful!

PROJECT

1. Look at the relationship between GDP and Exports. Normally we would think there should be a

positive relationship. The more you export the more money you make and the higher GDP. Run

a regression in both Excel and SAS.

2. Explain the results you obtained, specifically explain:

a. R2

b. F

c. Parameters

d. T stat

3. What can you say about this relationship? Is our hypothesis correct? Why?

State University of New York

College at Buffalo

Department of Economics and Finance

ECONOMETRICS

ECO 424 W

FINAL PROJECT

1.

Overview: The project involves four parts. The first part of the project is to find

the data you want to use. Second, you must produce a theory which can be tested

using a regression. The third thing is to run a SAS program which will estimate

the regression equation developed in the two parts of the project. The final part of

the project is to write up the theory and the statistical results in a 3 page paper.

2.

Data Sets: During the semester we have found and used a number of data sets on

the internet. You can use any of these or pick one of your own. Note if you are in

senior seminar, double dipping is allowed so you might want to use the same

dataset. Some examples are:

American Factfinder:

http://factfinder.census.gov/faces/nav/jsf/pages/searchresults.xhtml?refresh=t

BLS:

http://www.bls.gov/cpi/data.htm

BEA:

http://bea.gov/

Federal Reserve Fred:

https://research.stlouisfed.org/fred2/

Federal Reserve (Survey of Consumer Finance):

http://www.federalreserve.gov/econresdata/scf/scfindex.htm

NBER Datasets (A large number of datasets):

http://www.nber.org/data/

CDC Surveys:

http://www.cdc.gov/nchs/surveys.htm

Economic Report of the President:

https://www.gpo.gov/fdsys/browse/collection.action?collectionCode=ERP

3.

Developing a Theory: The theory which you develop will depend on which of

the data sets you use. After choosing a data set you must come up with a theory

which can be tested by running a regression. An example of a theory might be

that GDP (Y) is a function of the money supply (M), past GDP (Y-1) and real

interest rates (r), which can be written as:

Y = 0 + 1 M + 2 Y-1 + 3 r + U

For your theory you must have one dependant (Y) and three

independent (M, Y-1 , r) variables. You can create any theory which you

want. Do not worry about whether your theory is correct, you will not be

graded on the theory but on how you present the theory and the statistics.

4.

Analyzing the Theory: Any theory which you develop must involve the

determination of the sign pattern and the magnitude for each of the

coefficients, for example 0 might be assumed to be positive and about the

same size as last year’s GDP.

3 on the other hand should be negative, or

that GDP goes down as interest rates go up. The signs of the betas are what

makes up your theory and is what you will test when you run your SAS

project.

5.

SAS Project: Copy the data set which you have chosen from my account, as you

have done in the other computer projects this semester. Then use the following

statements:

Proc Reg;

Model Y = X1 X2 X3 /DW WHITE VIF;

Proc Autoreg;

Model Y = X1 X2 X3 / nlag=1;

Proc Corr;

Var X1 X2 X3;

Proc Means;

Var Y X1 X2 X3;

These statements will produce the regression results or will help in explaining and

testing your theory. Notice that these statements are all part of projects 4 and 5

which you will have completed before you start your project. If you are having

problems running your computer program review what you did in your past

projects.

6.

Statistical Results: The results which you obtain will test your theory. Did you

get the signs on the parameters, which you expected? If not can you explain why

you didn’t get the expected results. Are the size of the parameters what you

predicted and again if not why. You will also have to test to see if the parameters

are statistically significant by using a t-test. You will also want to test the

explanatory power of your theory using the R2 and the F-test. In order to make

sure that your model follows the standard assumptions, you must also use the

Durbin Watson test to rule out serial correlation, a plot of the residuals to check

for heteroscedasticity and the variance covariance matrix to check for

multicollinearity (VIF).

7.

Paper Outline: The paper should follow this general outline.

I.

Introduction and Explanation of Theory (1/2 page)

II.

Analysis of Theory (1/2 page)

a.

Variables

b.

Sign of parameters

c.

Size of parameters

d.

Other aspects of theory

III.

Statistical Results (1 page)

a.

Regression Results

b.

Test of parameters

c.

Test of the model

d.

Test for problems with the model

e.

Explanation of statistical results

f.

Other statistical results (such as mean of variables)

IV

Interpreting Results (1/2 page)

a.

Comparing Original Results with Actual Results

b.

Reinterpreting the Theory with the Statistical

Results

c.

Forecasting using results

V.

Concluding Statements ( 1/2 page)

a. What did you learn about your theory?

VI.

Note: if you want to add extra graphs and charts that is ok, the

minimum requirement is 3 pages. In the end you want to have a

report which is easy to understand and explains what you did.

The Paper is due at the exam, you may turn it in early and I will add comments so you

can rewrite the paper.

Purchase answer to see full

attachment

**We offer the bestcustom writing paper services. We have done this question before, we can also do it for you.**

#### Why Choose Us

- 100% non-plagiarized Papers
- 24/7 /365 Service Available
- Affordable Prices
- Any Paper, Urgency, and Subject
- Will complete your papers in 6 hours
- On-time Delivery
- Money-back and Privacy guarantees
- Unlimited Amendments upon request
- Satisfaction guarantee

#### How it Works

- Click on the “Place Order” tab at the top menu or “Order Now” icon at the bottom and a new page will appear with an order form to be filled.
- Fill in your paper’s requirements in the "
**PAPER DETAILS**" section. - Fill in your paper’s academic level, deadline, and the required number of pages from the drop-down menus.
- Click “
**CREATE ACCOUNT & SIGN IN**” to enter your registration details and get an account with us for record-keeping and then, click on “PROCEED TO CHECKOUT” at the bottom of the page. - From there, the payment sections will show, follow the guided payment process and your order will be available for our writing team to work on it.