best fit line
(noun)
A line on a graph showing the general direction that a group of points seem to be heading.
Examples of best fit line in the following topics:
-
Least-Squares Regression
- The process of fitting the best- fit line is called linear regression.
- Finding the best fit line is based on the assumption that the data are scattered about a straight line.
- Any other potential line would have a higher SSE than the best fit line.
- Therefore, this best fit line is called the least squares regression line.
- The following figure shows how a best fit line can be drawn through the scatter plot graph: .
-
Optional Collaborative Classroom Activity
- We will plot a regression line that best "fits" the data.
- It turns out that the line of best fit has the equation:
- The best fit line always passes through the point .
- The process of fitting the best fit line is called linear regression.
- This best fit line is called the least squares regression line.
-
Assumptions in Testing the Significance of the Correlation Coefficient
- The regression line equation that we calculate from the sample data gives the best fit line for our particular sample.
- We want to use this best fit line for the sample as an estimate of the best fit line for the population.
- (We do not know the equation for the line for the population.
- Our regression line from the sample is our best estimate of this line in the population. )
- Assumption (1) above implies that these normal distributions are centered on the line: the means of these normal distributions of y values lie on the line.
-
The Coefficient of Determination
- r2 , when expressed as a percent, represents the percent of variation in the dependent variable y that can be explained by variation in the independent variable x using the regression (best fit) line.
- This can be seen as the scattering of the observed data points about the regression line.
- Approximately 44% of the variation (0.4397 is approximately 0.44) in the final exam grades can be ex- plained by the variation in the grades on the third exam, using the best fit regression line.
- Therefore approximately 56% of the variation (1 - 0.44 = 0.56) in the final exam grades can NOT be explained by the variation in the grades on the third exam, using the best fit regression line.
- (This is seen as the scattering of the points about the line. )
-
Line of Best Fit
- The trend line (line of best fit) is a line that can be drawn on a scatter diagram representing a trend in the data.
- The trend line, or line of best fit, is a line that can be drawn on a scatter diagram representing a trend in the data.
- The mathematical process which determines the unique line of best fit is based on what is called the method of least squares - which explains why this line is sometimes called the least squares line.
- draw the scatterplot on a grid and draw the line of best fit;
- This graph shows what happens when we draw the line of best fit from the first data to the last data.
-
Outliers
- We could guess at outliers by looking at a graph of the scatter plot and best fit line.
- where $\hat{y}$=-173.5+4.83x is the line of best fit.
- Y2 and Y3 have the same slope as the line of best fit.
- The next step is to compute a new best-fit line using the 10 remaining points.
- Using the new line of best fit, $\hat{y}$= −355.19+7.39(73) = 184.28.
-
Summary
- Line of Best Fit or Least Squares Line (LSL): $\hat{y}$= a+bx x = independent variable; y = dependent variable
- Used to determine whether a line of best fit is good for prediction.
- The closer r is to 1 or -1, the closer the original points are to a straight line.
- Sum of Squared Errors (SSE): The smaller the SSE, the better the original set of points fits the line of best fit.
- Outlier: A point that does not seem to fit the rest of the data.
-
Student Learning Outcomes
-
Plotting Lines
- In statistics, charts often include an overlaid mathematical function depicting the best-fit trend of the scattered data.
- This layer is referred to as a best-fit layer and the graph containing this layer is often referred to as a line graph.
- It is simple to construct a "best-fit" layer consisting of a set of line segments connecting adjacent data points; however, such a "best-fit" is usually not an ideal representation of the trend of the underlying scatter data for the following reasons:
- In either case, the best-fit layer can reveal trends in the data.
- Best-fit curves may vary from simple linear equations to more complex quadratic, polynomial, exponential, and periodic curves.
-
Fitting a Curve
- Curve fitting with a line attempts to draw a line so that it "best fits" all of the data.
- Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints.
- With linear regression, a line in slope-intercept form, $y=mx+b$ is found that "best fits" the data.
- To find the slope of the line of best fit, calculate in the following steps:
- Example: Write the least squares fit line and then graph the line that best fits the data