Linear Regression Analysis
In most cases when we are working with data from an experiment, there
will be some "scatter" in the data due to errors. These errors may
come from lack of precision in instrumentation, constraints or factors
which were unaccounted for, human errors, or many other sources. In
many situations, we want to determine a relationship between the
variables from our data. To achieve this we perform a regression
analysis.
Least Squares Method:
One method which can be employed to determine a best fit line to
data is the method of Least Squares. Say that we have some data
points that we wish to find a best fit line through:
x y
1 1
4 2
6 8
8 11
11 12
16 14
19 22
21 25
The problem which develops when trying to select a "best-fit" line
is that there are a number of different lines which may look to be
the "best". We also may have some personal feelings as to how we
want our line to fit. It is therefore difficult to select a best-fit
line objectively.
Least squares is a method that permits the objective selection of a
line: it chooses the line which minimizes the sum of the squares of
the errors between the data points and the line.
Gottfried (pages 94-95) presents the derivation of the least squares
equations. The resulting equations are as follows:
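For reference, the standard least squares result for a line of the
form y = ax + b can be written as below. This is the common textbook
form expressed in the symbols this section defines; the notation in
Gottfried may differ, and every sum is assumed to run over all n data
points.

```latex
a = \frac{n\sum x y - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}},
\qquad
b = \frac{\sum y - a\sum x}{n}
```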
In the above expression:
a = slope of the line
b = y-intercept of the line
n = number of data points
x = x-coordinates of the individual points
y = y-coordinates of the individual points
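As a rough check, the standard least squares formulas can be applied
by hand to the eight data points listed earlier. The sums below were
computed from that table, and the slope and intercept are rounded to
three decimal places.

```latex
n = 8,\quad \sum x = 86,\quad \sum y = 95,\quad \sum xy = 1444,\quad \sum x^{2} = 1296

a = \frac{8(1444) - (86)(95)}{8(1296) - (86)^{2}} = \frac{3382}{2972} \approx 1.138

b = \frac{95 - (1.13795)(86)}{8} \approx -0.358
```

The best-fit line for this data is therefore approximately
y = 1.138x - 0.358.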
How well does the line fit?
The quality of the fit can be measured with r^2, which is given by
the following expression:
r^2 = 1 - SSE/SST
where SSE is the sum of the squares of the errors and SST is the
sum of the squares of the deviations about the mean, which are given
by:
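In the usual notation these two sums are commonly written as below,
where a and b are the slope and intercept of the fitted line. The
index i and the symbol \bar{y} for the mean of the y values are
introduced here for illustration and are not defined elsewhere in
these notes.

```latex
SSE = \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right)^{2},
\qquad
SST = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2},
\qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

For the eight data points in the table, these work out to roughly
SSE = 29.8 and SST = 510.9, giving an r^2 of about 0.942. A value
close to 1 indicates that the line accounts for most of the variation
in y.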
Regression Tools in Excel:
There are a few different methods of performing a linear regression
in Excel. One of the methods can be conducted after the data has
been graphed as an "XY Scatter" graph (with no line connecting the
data points):
1. Select the data by clicking on one of the data points. Several
of the points should become highlighted when they have been
selected.
2. In the menu bar, click Add Trendline under the Chart menu, or
click the right mouse button on a data point to open a pop-up menu
containing the same command.
3. Under the "Options" tab, select the options to display the
equation and the R-squared value on the chart.
4. Click OK.
The trendline can be removed by selecting it and pressing the Delete
key.
Regression Tools
The regression tools can also be employed in Excel to determine
the coefficients for the equation of the line.
The data does not have to be graphed if the regression tools are
selected.
1. Under Tools on the menu bar, select "Data Analysis". If "Data
Analysis" doesn’t appear under "Tools", select "Tools - Add-Ins" and
check "Analysis ToolPak".
2. Under Data Analysis select "Regression".
3. Specify the range of cells with the dependent variable (y).
Specify the range of cells with the independent variable (x).
Specify a cell in the spreadsheet for the output data. Select a
cell which has nothing in about 10 columns to its right.
4. Click OK. The slope and intercept are given under the heading
"Coefficients".
CIVE 1331 COMPUTING FOR ENGINEERS
LAB 9 - Mathematica Fundamentals
1. Introduction.
The objective of this lab is to introduce you to Mathematica.
2. Instructions.
a. Work through the Try It! exercises in Chapter 2 of the Mathematica
Engineers Toolkit on pages 12 to 25. Put your own name in the
appropriate cell. Save this notebook as lab08a.ma.
b. Look in the Labs directory on Fish for the file TAX.MA. Do
Exercises 4, 5, and 6 on pages 31 and 32. Save the notebook for
4 and 5 as lab08b.ma. Save the notebook for 6 as lab08c.ma. Note
that the notebook contents for exercise 6 are located on page 7.
3. Deliverables.
Write up a short report for this lab. The report should include a
summary of the lab and a hard copy of your notebooks. Submit your
report to the TA at the next lab meeting.
MATHEMATICA
Mathematica is a computer software system and language intended for
mathematical calculations. The calculations may be very simple or
quite complex, and the program may also be used for visualizing or
graphing functions and data.
Mathematica is a very general software system which is available on
many different platforms: PCs, Macintosh, Unix machines, and even
supercomputers. The general form of the program is the same
regardless of platform.
Windows Front-End:
As a user, you will interact with a "Windows front-end", which is the
graphical user interface (GUI) through which you communicate with the
program. The front-end allows the user to generate a notebook which
contains commands that instruct the program in the calculations to be
carried out, as well as comments or documentation of the problem
solving technique. Communication with the front-end is done by typing
on the keyboard as well as using the mouse.
The part of Mathematica which carries out the calculations is referred
to as the "kernel". The front-end allows you to prepare the
calculations or other input exactly the way that you want it and
then submit the input to the kernel for evaluation, after which the
results are returned to the front-end.
The operations that Mathematica carries out may be numerical or
symbolic (in terms of variables). One of the nice features of
Mathematica is the way it displays very complex expressions. Although
complex expressions are input on a single line, the kernel returns
the results written in a typeset form. For example we may input an
expression:
(x1+x2+x3)/(x1+x2)
The kernel will return the expression in a typeset form:
    x1 + x2 + x3
    ------------
      x1 + x2
A typeset form of the output allows you to see directly whether
or not you have input the expression correctly.
We will learn throughout this course that input for Mathematica must
follow a specific syntax or form. There are thousands of commands
in Mathematica and we will be learning a number of them as we work
with the program over the next few weeks.
GETTING STARTED with MATHEMATICA
As with the other programs that we have worked with in this
course, Mathematica is started by double clicking on the Mathematica
icon. The Mathematica window contains menu bars and tool bars just
like the other programs that we’ve worked with.
Once we start Mathematica we will be working in a specific notebook
which contains the particular project that we are working on. Each
notebook contains a series of cells into which we enter expressions
and commands. Brackets on the right-hand side of the notebook
indicate the extent of each cell. In many cases a group of
interrelated cells may be joined by a group bracket. There will
typically be a space between existing cells.
Four basic types of cells:
1. Text Cells: text cells are cells that you create to organize
your notebook so that your work makes sense to both you and any
other reader of your notebook. The kernel in Mathematica will
ignore the contents of these cells. (Cells may be designated text,
title, section, etc.)
2. Input Cells: input cells are also cells that you create.
In general, input cells contain commands to the kernel. Once we
input the expressions and submit them to the kernel, they will be
returned with In[number] put in front of the command. Do not type
In[number] yourself; Mathematica adds it automatically. If
desired, a comment may be included in an input cell by enclosing
the comment in (* *).
3. Output Cells: output cells are created by Mathematica based
on commands in the input cells that we create. Output cells are
labeled by Out[number]. Some input cells will create an output
cell and some will not. It depends on the particular command.
4. Graphics Cells: graphics cells are created by Mathematica
similar to output cells. The graphics cells are created by input
cells with commands such as Plot.
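A small sketch of what might be typed into input cells to produce the
other cell types. The expressions are arbitrary examples chosen for
illustration, not taken from the lab exercises.

```mathematica
(* An input cell; this comment is ignored by the kernel *)
2 + 3
(* evaluating the line above creates an output cell containing 5 *)

(* An input cell whose command creates a graphics cell *)
Plot[Sin[x], {x, 0, 2 Pi}]
```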
The default cell type is an input cell. If you want to define a
text cell, you must define the style of the cell.
Once we create a cell we can change its style by positioning the
cursor in the cell, clicking the mouse button, and selecting Style,
Cell Style from the menu bar. We can also go back to any cell and
edit the content or add new content by simply placing the cursor at
the location that we want to edit and clicking the left mouse button.
Take care in hitting the return key. To put some new input on a new
line, hit the return key to start a new line. At the end of the
content in a cell, however, do not hit the return key or you will
be inserting a blank line. Text cells have automatic word wrapping
so you do not have to insert a return to start a new line.
You may insert a new cell between two existing cells by placing the
cursor in the space between the two existing cells and pushing the
left mouse button. This will insert a new cell which you can begin
typing in at any time.
Removing a Cell
In order to remove a cell, select the cell by clicking the mouse on
the bracket in the right margin. The selected cell should become a
black rectangle that you can delete by simply pushing the delete (del)
key.
Creating Input Cells
Input cells are created just like text cells, however you don’t have
to define the style of the cell. Instead, an input cell needs to be
evaluated by the Mathematica kernel. This is accomplished by holding
down the shift key while hitting the return key. Another way to
evaluate an input cell is to click the evaluate button on the
toolbar.
After the input cell is evaluated, it will automatically insert
In[number] into the cell as well as creating the appropriate output
cell if required.
As mentioned previously, input operations in Mathematica may be
numerical or in terms of variables. You may also define a particular
variable and then refer to it by the variable name. It is important,
however, to remember that after you assign a numerical value to a
variable, Mathematica will keep this value for the remainder of the
notebook until the variable is redefined.
If you are sure that you are done with a particular variable, it is
a good idea to "clear" its value to avoid problems later in the
notebook if you forget that you have already defined it. This is
accomplished with Clear[variable name]. Note the square brackets:
Mathematica uses them, not parentheses, for function arguments.
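A minimal sketch of defining, using, and clearing a variable. The
name radius and its value are made up for illustration, and each line
is assumed to be evaluated in order.

```mathematica
radius = 3        (* assign a value; the output is 3 *)
Pi radius^2       (* uses the stored value; the output is 9 Pi *)
Clear[radius]     (* remove the stored value *)
Pi radius^2       (* the result now stays symbolic in terms of radius *)
```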
With the use of text cells and by adjusting the layout of the
individual cells, you can make a notebook very orderly, so that it
clearly presents a given exercise or series of calculations. In some
cases you may not want Mathematica to show a particular output line
for presentation reasons. If you do not want Mathematica to create
a particular output cell, end the input expression (the actual
expression, not a comment) with a semicolon.
For example: if x = 5/4 is the input expression the output
expression will simply be 5/4. If you wish this to be suppressed,
simply input
x=5/4;
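A short sketch showing that the semicolon only hides the output; the
assignment itself still takes effect. The second line is an added
illustration, not part of the original example.

```mathematica
x = 5/4;     (* no output cell is created *)
x + 1        (* the stored value is still used; the output is 9/4 *)
```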
Spaces in input cells are extremely important. A space between two
numbers or variables tells Mathematica that they are to be
multiplied.
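The effect of a space can be seen in a few one-line inputs. The names
a and b are placeholders and are assumed to have no values assigned.

```mathematica
2 3        (* the space means multiplication; the output is 6 *)
2*3        (* explicit multiplication; the output is also 6 *)
a b        (* the product of the two variables a and b *)
ab         (* no space: a single variable named ab *)
```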
Saving files in Mathematica:
Saving files in Mathematica is similar to any other program that we
have used. The extension that we use for a Mathematica notebook is
.ma
When a notebook is reopened it will not have any of the In[#] or
Out[#]. These are associated with a particular notebook and will
not be inserted until the notebook is evaluated by Mathematica.
All of the cells can be evaluated at once by selecting Evaluate
Notebook from the Action menu.
If you edit a particular input cell after it has been evaluated,
the original In[#] and Out[#] disappear until it is reevaluated
by pushing shift-Enter. A new # will be associated with the In[#]
and Out[#].
If we make a change to an input cell in the middle of a notebook,
other cells within the notebook are not automatically updated. They
are not updated until they are resubmitted to the kernel to be
evaluated. The numbering of the In and Out lines follows the order
in which the cells are submitted to the kernel. If you have had to
go back and edit a number of input cells, the numbering may be
quite confusing and will not be sequential from top to bottom.
Sequential numbering can be restored by selecting Evaluate Notebook
from the Action menu. The numbering will then be consecutive from
top to bottom in the notebook; however, it will not start from 1 but
will continue from the last execution number the kernel has
completed. There are ways to get around this. One way is to close
and reopen the notebook. The other is to select Disconnect Kernel
from the Options menu. When the kernel is reconnected and Evaluate
Notebook is selected, the numbering will begin from 1.
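A small sketch of why dependent cells need to be re-evaluated after
an edit. Each line is imagined as its own input cell, and the
variable names are arbitrary.

```mathematica
x = 2      (* first cell; the output is 2 *)
y = x^2    (* second cell; the output is 4 *)

(* Now suppose the first cell is edited to read x = 3
   and only that cell is re-evaluated. *)
x = 3      (* the output is 3 *)
y          (* still 4, because the second cell was not re-evaluated *)
```

Re-evaluating the second cell, or the whole notebook, would bring y
up to date with the new value of x.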