Linear Regression Analysis

In most cases when we are working with data from an experiment, there
 will be some "scatter" in the data due to errors.  These errors may 
come from lack of precision in instrumentation, constraints or factors
 which were unaccounted for, human errors, or many other sources.  In
 many situations, we want to determine a relationship between the 
variables from our data.  To achieve this we perform a regression 
analysis.  

Least Squares Method:

One method which can be employed to determine a best fit line to 
data is the method of Least Squares.  Say that we have some data 
points that we wish to find a best fit line through:
x	y
1	1
4	2
6	8
8	11
11	12
16	14
19	22
21	25

The problem which develops when trying to select a "best-fit" line
 is that there are a number of different lines which may look to be
 the "best".  We also may have some personal feelings as to how we 
want our line to fit.  It is therefore difficult to select a best-fit
 line objectively.  

Least squares is a method that permits the objective selection of a 
line: it chooses the line which minimizes the sum of the squares of 
the vertical distances (errors) between the data points and the line.  
 
 
Gottfried (pages 94-95) presents the derivation of the least squares 
equations.  The resulting equations are as follows:
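
For a best-fit line written as y = ax + b, the least squares equations 
take the following standard textbook form.  The notation is an 
assumption chosen to match the symbol definitions in this section and 
may differ from the notation used in Gottfried.  All sums run over the 
n data points.

```latex
a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {n \sum x_i^2 - \left( \sum x_i \right)^2},
\qquad
b = \frac{\sum y_i - a \sum x_i}{n}
```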
 

In the above expression:
			a = slope of the line
			b = y-intercept of the line
			n = number of data points
			x = x-coordinates of the individual points
			y = y-coordinates of the individual points
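
As a worked illustration, the slope and intercept can be computed 
directly from the sums in the least squares equations.  The sketch 
below uses Python, which is not part of the original exercise; the 
function name least_squares_line and the variable names are 
hypothetical, and the data are the eight points from the table above.

```python
# Data points from the table in this section.
xs = [1, 4, 6, 8, 11, 16, 19, 21]
ys = [1, 2, 8, 11, 12, 14, 22, 25]


def least_squares_line(x, y):
    """Return (a, b) for the least squares line y = a*x + b."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Standard closed-form least squares solution for a straight line.
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


a, b = least_squares_line(xs, ys)
print(f"slope a = {a:.4f}, intercept b = {b:.4f}")
```

For these eight points the slope comes out to about 1.138 and the 
intercept to about -0.358.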

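How well does the line fit?
The quality of the fit can be measured with r^2 (the coefficient of 
determination), which is given by the following expression:

r^2 = 1 - SSE/SST
where SSE is the sum of the squares of the errors (the differences 
between each data point and the line) and SST is the sum of the 
squares of the deviations of the y values about their mean.  A value 
of r^2 close to 1 indicates that the line fits the data well.  SSE 
and SST are given by: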
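
In their standard form, and assuming the line is written as 
y = ax + b with the mean of the y values denoted by \bar{y} (a symbol 
introduced here for illustration), the two sums are:

```latex
SSE = \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right)^2,
\qquad
SST = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2,
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```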

 
 

Regression Tools in Excel:

There are a few different methods of performing a linear regression 
in Excel.  One of the methods can be conducted after the data has 
been graphed as an "XY Scatter" graph (with no line connecting the
 data points):
 
1.  Select the data by clicking on one of the data points.  Several
 of the points should become highlighted when they have been 
selected.
2.  In the menu bar, click Add Trendline under the Chart menu, or 
click the right mouse button to get a quick reference pop-up panel.
3.  Under "Options", select the boxes to display the equation and 
the R-squared value on the chart.
4.  Click OK.
The trendline can be removed by selecting it and hitting the delete 
key.
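
The equation and R-squared value reported by a trendline can be 
cross-checked by hand.  The sketch below is written in Python rather 
than Excel (an assumption of this example, not part of the original 
exercise).  It applies the standard least squares sums and the 
definition r2 = 1 - SSE/SST to the data points listed earlier; all 
variable names are illustrative.

```python
# Data points from the least squares example earlier in this handout.
xs = [1, 4, 6, 8, 11, 16, 19, 21]
ys = [1, 2, 8, 11, 12, 14, 22, 25]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope (a) and intercept (b) of the least squares line y = a*x + b.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - a * sum_x) / n

# SSE: squared errors about the line.  SST: squared deviations about the mean.
mean_y = sum_y / n
sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - sse / sst

print(f"a = {a:.4f}, b = {b:.4f}, r2 = {r2:.4f}")
```

For this data set r2 is roughly 0.94, so the straight line accounts 
for most of the variation in y.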
 Regression Tools
The regression tools can also be employed in Excel to determine 
the coefficients for the equation of the line.
The data does not have to be graphed if the regression tools are
 selected.  
1.  Under Tools on the menu bar, select "Data Analysis".  If "Data 
Analysis" doesn’t appear under "Tools", select "Tools - Add-Ins - 
Analysis ToolPak".
2.  Under Data Analysis select "Regression".
3.  Specify the range of cells with the dependent variable (y).
    Specify the range of cells with the independent variable (x).
    Specify a point in the spreadsheet for the output data.  Select 
    a point which has about 10 empty columns to its right.  
4.  Click OK.  The slope and intercept are given under the heading 
"Coefficients".

CIVE 1331 COMPUTING FOR ENGINEERS
LAB 9 - Mathematica Fundamentals


1. Introduction. 

The objective of this lab is to introduce you to Mathematica.


2. Instructions.

a.  Work through the Try It! exercises in Chapter 2 of the Mathematica
 Engineers Toolkit on pages 12 to 25.  Replace your name in the 
appropriate cell.  Save this notebook as lab08a.ma.

b.  Look in the Labs directory on Fish for the file TAX.MA.  Do 
Exercises 4, 5, and 6 on pages 31 and 32.  Save the notebook for 
4 and 5 as lab08b.ma.  Save the notebook for 6 as lab08c.ma.  Note 
that the notebook contents for Exercise 6 are located on page 7.


3. Deliverables.

Write up a short report for this lab.  The report should include a 
summary of the lab and a hard copy of your notebooks.  Submit your 
report to the TA at the next lab meeting.


MATHEMATICA

 

Mathematica is a computer software system and language intended for 
mathematical calculations.  It may be used for anything from simple 
arithmetic to quite complex calculations, as well as for visualizing 
or graphing functions and data. 

Mathematica is a very general software system which is available on 
numerous platforms: PCs, Macintosh, Unix machines, and even 
supercomputers.  The general form of the program is the same 
regardless of platform.  

Windows Front-End:
As a user, you will interact with a "Windows front-end", a graphical 
user interface (GUI) that allows you to communicate with the 
program.  The front-end lets you create a notebook which contains 
the commands that instruct the program in the calculations to be 
carried out, as well as comments or documentation of the problem 
solving technique.  Communication with the front-end is done by 
typing on the keyboard as well as using the mouse.  

The part of Mathematica which carries out the calculations is 
referred to as the "kernel".  The front-end allows you to prepare 
your input exactly the way that you want it and then submit it to 
the kernel for evaluation, after which the results are returned to 
the front-end. 

The operations that Mathematica carries out may be numerical or 
symbolic (in terms of variables).  One of the nice features of 
Mathematica is the way it deals with very complex expressions.  
Although expressions are input on a single line, the kernel returns 
the results written in a typeset form.  

For example we may input an expression:

(x1+x2+x3)/(x1+x2)

The kernel will return the expression in a typeset form:

   x1 + x2 + x3
   ------------
     x1 + x2

A typeset form of the output allows you to see directly whether 
or not you have input the expression correctly.  

We will learn throughout this course that input for Mathematica must 
follow a specific syntax or form.  There are thousands of commands 
in Mathematica, and we will be learning a number of them as we work 
with the program over the next few weeks.

GETTING STARTED with MATHEMATICA

Similar to the other programs that we have worked with in this 
course, Mathematica is started by double clicking on the Mathematica 
icon.  The Mathematica window contains menu bars and tool bars just 
like other programs that we’ve worked with.  

Once we start Mathematica we will be working in a specific notebook 
which contains the particular project that we are working on.  Each 
notebook contains a series of cells into which we enter expressions 
and commands.  Brackets on the right-hand side of the notebook 
indicate the extent of each cell.  In many cases a group of 
interrelated cells is joined by a group bracket.  There will 
typically be a space between existing cells.  

Four basic types of cells:
1.  Text Cells: text cells are cells that you create to organize 
your notebook so that your work makes sense to both you and any 
other reader of your notebook.  The kernel will ignore the contents 
of a text cell.  (Cells may be designated text, title, section, 
etc.)
2.  Input Cells: input cells are also cells that you create.  In 
general, input cells contain commands to the kernel.  Once we enter 
the expressions and submit them to the kernel, they are returned 
with In[number] put in front of the command.  Do not type In[number] 
yourself; Mathematica adds it automatically.  If desired, a comment 
may be included in an input cell by enclosing it in (*       *).  
3.  Output Cells: output cells are created by Mathematica in 
response to the commands in our input cells.  Output cells are 
labeled Out[number].  Some input cells will create an output cell 
and some will not, depending on the particular command.
4.  Graphics Cells: graphics cells are created by Mathematica in the 
same way as output cells.  They are produced by input cells with 
commands such as Plot.  

The default cell type is an input cell.  If you want to define a 
text cell, you must define the style of the cell.

Once we create a cell we can change its style by positioning the 
cursor in the cell, pushing the mouse button, and selecting style, 
cell style from the menu bar.  We can also go back to any cell and 
edit the content or add new content by simply placing the cursor at 
the location that we want to edit and pushing the left mouse button.  

Take care in hitting the return key.  To put new input on a new 
line, hit the return key.  At the end of the content in a cell, 
however, do not hit the return key or you will insert a blank line.  
Text cells have automatic word wrapping, so you do not have to 
insert a return to start a new line.

You may insert a new cell between two existing cells by placing the 
cursor in the space between the two existing cells and pushing the 
left mouse button.  This will insert a new cell which you can begin 
typing in at any time.  

Removing a Cell
In order to remove a cell, select the cell by clicking the mouse on 
the bracket in the right margin.  The selected cell should become a 
black rectangle that you can delete by simply pushing the delete (del)
 key.
   
Creating Input Cells
Input cells are created just like text cells, except that you don’t 
have to define the style of the cell, since input is the default.  
An input cell does, however, need to be evaluated by the Mathematica 
kernel.  This is accomplished by holding down the shift key while 
hitting the return key.  Another way to evaluate an input cell is to 
hit the evaluate button on the toolbar.  

After the input cell is evaluated, Mathematica will automatically 
insert In[number] into the cell and create the appropriate output 
cell if required.  

As mentioned previously, input operations in Mathematica may be 
numerical or in terms of variables.  You may also assign a value to 
a variable and then refer to that value by the variable name.  It is 
important to remember, however, that after you assign a numerical 
value to a variable, Mathematica will keep using this value for the 
remainder of the notebook until the variable is redefined or cleared.
	Once you are sure that you are done with a particular variable, 
it is a good idea to "clear" its value.  This avoids problems later 
in the notebook if you forget that the variable has already been 
defined.  This is accomplished with Clear[variable name].  Note that 
Mathematica functions take their arguments in square brackets, not 
parentheses.

With the use of text cells and by adjusting the layout of the 
individual cells, you can make a notebook very orderly, something 
that you can present as a clear record of a given exercise or series 
of calculations.  In some cases you may not want Mathematica to show 
a particular output line for presentation reasons.  If you do not 
want an output cell to be created, end the input (the actual 
expression, not a comment) with a semicolon.
	For example: if x = 5/4 is the input expression, the output 
expression will simply be 5/4.  If you wish this to be suppressed, 
simply input
	x=5/4;

Spaces in input cells are extremely important.  A space between two 
numbers or variables tells Mathematica that they are to be 
multiplied.  For example, x y (with a space) is the product of x and 
y, while xy (without a space) is a single variable named xy.

Saving files in Mathematica:
Saving files in Mathematica is similar to saving in any other 
program that we have used.  The extension that we use for a 
Mathematica notebook is ".ma".

When a notebook is reopened it will not have any of the In[#] or 
Out[#] labels.  These are associated with a particular evaluation 
session and will not be inserted until the notebook is evaluated by 
Mathematica.  All of the cells can be evaluated at once by selecting 
Evaluate Notebook from the Action menu.  

If you edit a particular input cell after it has been evaluated, 
the original In[#]  and Out[#] disappear until it is reevaluated 
by pushing shift-Enter.  A new # will be associated with the In[#]  
and Out[#].

If we make a change to an input cell in the middle of a notebook, 
other cells within the notebook are not automatically updated.  They 
are not updated until they are resubmitted to the kernel to be 
evaluated.  The numbering of the In and Out lines follows the order 
in which the cells are submitted to the kernel.  If you have had to 
go back and edit a number of input cells, the numbering may be quite 
confusing and will not be sequential from top to bottom.  Sequential 
numbering can be restored by selecting Evaluate Notebook from the 
Action menu.  The numbering will then be consecutive from top to 
bottom in the notebook; however, it will not start from 1 but will 
continue from the last execution number the kernel has completed.  
There are ways to get around this.  One way is to close and reopen 
the notebook.  The other is to select Disconnect Kernel from the 
Options menu.  When the kernel is reconnected and Evaluate Notebook 
is selected, the numbering will begin from 1.