FITTING DATA
There are several methods that one can use to find a function that passes through a set of data points, thereby revealing a mathematical relationship. To perform a fit, the experimenter must choose a functional form.
The functional form defines the mathematical relationship between the dependent variable (y) and the independent variable (x). The form usually contains parameters whose values must be chosen to fix the relationship. Examples of some functional forms follow.
Line: y = m x + b
Parameters: m, b
Exponential: y = A_{o} e^{-\lambda x}
Parameters: A_{o}, \lambda
Polynomial: y = a + b x + c x^{2} + d x^{3}
Parameters: a, b, c, d
A computer program usually changes the parameters of interest in a pattern designed to find the best values efficiently. The y-values calculated with the fit function are compared to the data y-values, and the quality of the fit is judged by the difference between them. The method employed to search for the best parameters is unimportant as long as a good fit is found. Consider the following function:
[Equation: F(t), a function of t with parameters A, B, C, and D]
There are four parameters: A, B, C, and D. Choosing different values for these parameters results in different lines, as shown in the graph below. Both lines shown on the plot represent the same functional form F(t) but with different values for the parameters.
Neither function passes through the experimental values, shown as triangles with error bars. A fitting program keeps changing the parameter values and testing if the new line passes through the data points. When the line passes sufficiently close to the data (y-values from fit are close to the y-values of the data) the fitting program returns the values of the parameters. A good fitting routine can vary the parameters again to see how much a parameter can change while still passing through the error bars. This allows the routine to establish an uncertainty (range of possible values) for each parameter. In the laboratory, data analysis almost always requires both a value and an uncertainty.
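The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used by any particular fitting program. It assumes a straight-line form, a simple search that nudges one parameter at a time, and the common convention that a parameter's uncertainty is the shift that raises chi-square by 1 once the other parameters have been re-fitted. The names chi_square, pattern_search, and scan_uncertainty, and the data values, are made up for the example.

```python
def line(x, m, b):
    """Straight-line functional form, chosen here only for illustration."""
    return m * x + b


def chi_square(model, params, xs, ys, sigmas):
    """Sum of squared (data - fit) differences, each in units of its error bar."""
    return sum(((y - model(x, *params)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))


def pattern_search(model, start, xs, ys, sigmas, step=1.0, tol=1e-8, frozen=()):
    """Nudge each free parameter up and down, keep any change that improves
    the fit, and halve the step size whenever no nudge helps."""
    params = list(start)
    best = chi_square(model, params, xs, ys, sigmas)
    while step > tol:
        improved = False
        for i in range(len(params)):
            if i in frozen:          # parameters held fixed during a scan
                continue
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = chi_square(model, trial, xs, ys, sigmas)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return params, best


def scan_uncertainty(model, best_params, best_chi2, index, xs, ys, sigmas,
                     scan_step=0.001, max_steps=100000):
    """Move one parameter away from its best value, re-fitting the others,
    until chi-square has risen by 1 (an assumed convention)."""
    for k in range(1, max_steps + 1):
        trial = list(best_params)
        trial[index] += k * scan_step
        _, value = pattern_search(model, trial, xs, ys, sigmas, frozen=(index,))
        if value > best_chi2 + 1.0:
            return k * scan_step
    raise RuntimeError("chi-square never rose by 1 within the scan range")


# Made-up data lying on y = 2x + 1, each point with an error bar of 0.1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
sigmas = [0.1] * 5

params, chi2 = pattern_search(line, [0.0, 0.0], xs, ys, sigmas)
slope_err = scan_uncertainty(line, params, chi2, 0, xs, ys, sigmas)
print(f"m = {params[0]:.3f} +/- {slope_err:.3f}, b = {params[1]:.3f}")
```

Real fitting programs use far more efficient searches, but the logic is the same: change the parameters, compare fit y-values with data y-values, and keep the change if the agreement improves.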
The fitting routines provide values and uncertainties. Students therefore need to be able to pass data to one of these fitting programs, run the fit, and retrieve the parameters and their uncertainties. If one uses a routine that doesn't provide parameter uncertainties, then an alternative method to determine these uncertainties is required.
Trendline:
Excel provides trend lines for charts. These lines are made to pass through the data. The parameters can be viewed by displaying the trend line function on the chart. The disadvantage of this method is that it doesn't indicate the uncertainty in the parameters. Therefore this approach is NOT recommended on its own. However, this method in conjunction with the approach described below can be used to find a line that fits the data and an estimate for the parameter uncertainties.
Finding Uncertainty (repeated trials method): An uncertainty can be determined (for a trend line analysis) by measuring more than one data set. A trend line is placed on each data set, the parameter value from each data set (e.g. the slope of a straight line) is entered into a table, and the standard deviation (SD) of those values is used to estimate the uncertainty in the fitted parameter. This method requires the experimenter to repeat the experiment so that independent data sets are compared. It can be used to estimate an uncertainty for any fit method. As mentioned above, routines such as DataFit provide uncertainties based on one data set. The two methods should agree.
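The repeated trials method can be sketched as follows. This is a minimal illustration that assumes a straight-line fit and uses the ordinary least-squares slope in place of a chart trend line. The function names and the three data sets are made up for the example.

```python
import statistics


def fit_slope(xs, ys):
    """Least-squares slope of the best straight line through (x, y) pairs."""
    x_mean = statistics.fmean(xs)
    y_mean = statistics.fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def slope_from_trials(datasets):
    """Fit each independent data set, then summarize the slopes by their
    mean (best value) and sample standard deviation (uncertainty)."""
    slopes = [fit_slope(xs, ys) for xs, ys in datasets]
    return statistics.fmean(slopes), statistics.stdev(slopes), slopes


# Three made-up trials of the same experiment
trials = [
    ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.9]),
    ([1.0, 2.0, 3.0, 4.0], [1.8, 4.1, 5.9, 8.3]),
    ([1.0, 2.0, 3.0, 4.0], [2.2, 4.0, 6.1, 7.8]),
]

mean_slope, sd_slope, slopes = slope_from_trials(trials)
print(f"slope = {mean_slope:.2f} +/- {sd_slope:.2f} (from {len(slopes)} trials)")
```

At least two data sets are needed for a standard deviation to exist, and more trials give a more trustworthy estimate.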
DataFit: A separate program, DataFit, is one of the best tools for general fitting.
§ Start program using the DataFit icon.
§ Enter the number of independent variables (usually 1).
§ Decide if you want to have a column for y uncertainties (standard deviation column) or no column (usually no column). Typically, one wants to enter an uncertainty for each Y-data point.
§ Hit OK.
§ Paste the data to the data window.
§ Choose regression under the solve menu.
§ Choose nonlinear.
§ Choose the functional form from among the options or provide a custom function.
§ If the fit is successful, the results can be obtained by choosing Detailed… in the Results menu. Scroll down until you find the table "Regression Variable Results" and use the Value and Standard Error listed for each parameter.
The results contain the parameters and their uncertainties (standard error) as well as a host of plots and other indicators.
Ask your instructor to show you the procedure. While your instructor is demonstrating, add your own comments to the above procedure so that you can perform a fit on your own.
Graphical Analysis: This package is supplied as part of the data collection and analysis tools from Vernier Software. It allows the experimenter to enter or import data and to plot, calculate, graph, and fit data. It has a complete set of tools so that a full analysis can be performed. It provides text boxes for comments, and graphs with sophisticated display options. Graphical Analysis is, in effect, a fairly complete additional spreadsheet that is available for student use. Copies of the software are available for installation on your home computer. Ask your instructor.
Start Graphical Analysis by double clicking on the GA icon. Open a new analysis by choosing new under the file menu. Import or cut & paste data to the table window. Use the toolbar “Curve Fit” button or choose curve fit from the analyze menu. Choose the data to fit in the window that appears (y-column). Choose the functional form and click the “try fit” button. Complete the process by using the “OK” button. A window should appear on the graph showing the parameters and the associated uncertainties. The root mean square error, RMSE, is also given.
Vernier Tech Info Library TIL # 1014 (from website)
MSE: Mean Square Error. For every data point, you take the vertical distance from the point to the corresponding point on the curve fit and square the value. Then you add up those values for all data points and divide by the number of points. The squaring is done so that negative values do not cancel positive values. The smaller the Mean Square Error, the closer the fit is to the data.
RMSE: Root Mean Square Error is just the square root of the Mean Square Error. It is probably the most easily interpreted statistic, since it has the same units as the quantity plotted on the y axis. The RMSE is thus a typical (root-mean-square) distance of a data point from the fitted line, measured along a vertical line.
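The two definitions above translate directly into code. The sketch below follows the description given here (dividing by the number of points). The function names and sample values are made up, and a given program may use a slightly different normalization.

```python
import math


def mean_square_error(data_ys, fit_ys):
    """Average of the squared vertical distances between data and fit."""
    squared = [(y - f) ** 2 for y, f in zip(data_ys, fit_ys)]
    return sum(squared) / len(squared)


def root_mean_square_error(data_ys, fit_ys):
    """Square root of the MSE; same units as the y axis."""
    return math.sqrt(mean_square_error(data_ys, fit_ys))


# Made-up data y-values and the y-values of a fit at the same x positions
data_ys = [1.0, 2.0, 3.0, 4.0]
fit_ys = [0.0, 3.0, 1.0, 6.0]

print("MSE  =", mean_square_error(data_ys, fit_ys))
print("RMSE =", root_mean_square_error(data_ys, fit_ys))
```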
LoggerPro: Logger Pro, also provided by Vernier, has fitting functions available. These come in handy when recorded data needs to be fit quickly. Logger Pro does provide an estimate of the uncertainty for the fitted parameters.
If you do not see the parameter uncertainties in the fitting summary dialog box, right-click on the dialog box to get the options and check the appropriate boxes.
There are several interesting options available. You can vary the parameters and see how the function changes. You can define additional functions. Also see Logger Pro help files.
The following functions are useful for the restricted case of a straight-line relationship between the dependent and the independent variable.
Linest: The Excel function LINEST returns the slope, the intercept, and related statistics from a straight-line least-squares (LSQ) fit. Since several values are returned, you must:
· Enter the formula =LINEST(y range, x range, 1, 1) into a cell.
· Select a range of cells that has this formula at the upper left (2 cells across, 5 down).
· Press the F2 function key, followed by Ctrl+Shift+Enter (Excel's array entry).
· The slope, the intercept, and their uncertainties can be found in the resulting block of cells: the first row holds the slope and intercept, and the second row holds their standard errors.
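For readers who want to check a spreadsheet result by hand, the sketch below applies the standard least-squares formulas for a straight line and its parameter uncertainties. These should correspond to the slope, intercept, and standard errors that a straight-line regression reports, but treat that correspondence as an assumption to verify against your own output. The function name and data are made up for the example.

```python
import math


def linear_fit_with_errors(xs, ys):
    """Least-squares straight line y = slope * x + intercept, returning
    (slope, intercept, slope_error, intercept_error)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Scatter of the data about the line, with n - 2 degrees of freedom
    ss_resid = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    variance = ss_resid / (n - 2)

    slope_error = math.sqrt(variance / sxx)
    intercept_error = math.sqrt(variance * (1.0 / n + x_mean ** 2 / sxx))
    return slope, intercept, slope_error, intercept_error


# Made-up (x, y) data that lie close to a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b, dm, db = linear_fit_with_errors(xs, ys)
print(f"slope     = {m:.3f} +/- {dm:.3f}")
print(f"intercept = {b:.3f} +/- {db:.3f}")
```

At least three points are required, since the uncertainty estimate divides by n - 2.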
Regression Analysis: As used here, this is just a fancy name for straight-line fitting. It assumes that the relationship between the variables is linear. It can therefore be used to find the best straight line that passes through a set of (x, y) pairs. The regression function returns a full set of quantities that can be used to describe the quality of the fit. It also provides estimates of the uncertainty in the slope and intercept. To perform a regression in Excel:
§ Use the Data Analysis item in the Tools menu.
§ Choose regression.
§ Enter the y and x values.
§ Hit OK.
§ View the sheet with the results.
§ The intercept and uncertainty are tabulated.
§ A value for the slope (x variable 1) and an associated uncertainty for this value are also tabulated.