Chapter 4 Statistics

Question Answer
What is bivariate data? When values of two variables are collected from each individual in the population or sample, the resulting set of data values is called bivariate data. (For example: height and weight collected from each GPC student.)
What is the response variable in a bivariate set of data? The response variable is the dependent variable in the bivariate data set. It can be explained (at least in part) by the explanatory variable.
What is the explanatory or predictor variable in a bivariate set of data? The explanatory variable is the independent variable in the bivariate data set. Its value can be used to predict, though not perfectly, values of the dependent variable (the response variable).
What is a lurking variable? A lurking variable is a variable that is related to either the response variable or the predictor variable or both, but is not considered as part of the study. A lurking variable can lead to incorrect or misleading results in a study.
What is a scatter diagram? A scatter diagram (or scatter plot) is a graph of a set of points, which shows the relationship between two quantitative variables measured on the same individual with one point for each individual. The points are not connected.
Which variable is placed on which axis in a scatter diagram? The explanatory variable is plotted on the horizontal axis and the response variable is plotted on the vertical axis.
Why are scatter diagrams useful? A scatter diagram can be used to indicate a relationship or lack of a relationship between the explanatory variable and the response variable. For example: Do the dots line up approximately along a straight line?
What is meant by a positive linear association between two variables? Two variables that are linearly related are said to be positively associated if, as the value of the predictor variable increases, the value of the response variable also tends to increase. (The dots lie approximately on a line that goes up from left to right.)
What is meant by a negative linear association between two variables? Two variables that are linearly related are said to be negatively associated if, as the value of the predictor variable increases, the value of the response variable tends to decrease. (The dots lie approximately on a line that goes down from left to right.)
What is the correlation coefficient in a bivariate study? The linear correlation coefficient is a measure of the strength and direction of a linear relationship between two quantitative variables. For a sample, it is often represented by the letter r.
What are the possible values of the correlation coefficient? The correlation coefficient can take values between -1 and 1, inclusive.
What do the values of the correlation coefficient, r, indicate? An r close to 1 indicates a strong positive linear relationship between the variables. An r close to -1 indicates a strong negative linear relationship between the variables. An r close to 0 indicates little or no linear relationship between the variables (they may still be related in a nonlinear way).
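As an illustration of how r is obtained from paired data, here is a minimal Python sketch. The function name `correlation` and the five data pairs are made up for this example and do not come from the chapter; the formula is the usual sample correlation, the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical paired data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))  # 0.7746
```

Here r is about 0.77, which suggests a fairly strong positive linear relationship in this small made-up sample. The sketch assumes neither variable is constant; otherwise the denominator would be zero.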
What is the least-squares criterion? The least-squares criterion chooses the line for which the sum of the squares of the residuals (the differences between the observed y values and the predicted y values) is as small as possible.
What are the linear equations found using the least-squares criterion called? Linear equations found using the least-squares criterion are called linear regression equations.
What is the residual in a linear relationship? Residual = observed y - predicted y (Note the order.) The predicted y is found using the linear relationship.
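The least-squares line and its residuals can be sketched in a few lines of Python. This is only an illustration: the helper name `least_squares_line` and the data are hypothetical, and the slope and intercept use the standard closed-form formulas for a single explanatory variable.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed y - predicted y)^2 for a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical paired data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(x, y)
predicted = [intercept + slope * xi for xi in x]
# Residual = observed y - predicted y (note the order).
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]
sse = sum_squared_residuals(x, y, slope, intercept)

print(round(slope, 3), round(intercept, 3))  # 0.6 2.2
print([round(e, 3) for e in residuals])      # [-0.8, 0.6, 1.0, -0.6, -0.2]
print(round(sse, 3))                         # 2.4
```

Any other line through these points, for example one with slope 0.7 and intercept 2.2, gives a larger sum of squared residuals than 2.4, which is what the least-squares criterion means.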
How can you interpret the slope in a linear regression equation? The slope can be interpreted as the average rate of change of the response variable, y, with respect to the explanatory variable, x. Thus, when x increases by one unit, y is predicted to change, on average, by the amount of the slope.
How can you interpret the y-intercept in a linear regression equation? The y-intercept can be interpreted as the predicted value of the response variable when the predictor variable is zero.
Under what conditions does the interpretation of the y-intercept in a regression equation make sense? The interpretation makes sense only if a value of 0 for the explanatory variable is meaningful and there are observed values of the explanatory variable near 0. (Never make predictions too far from the observed values.)
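The caution about predicting far from the observed values can be made concrete with a short Python sketch. The function `predict`, the slope 0.6, the intercept 2.2, and the observed x values are all hypothetical. The range check used here (whether the new x lies between the smallest and largest observed x) is a simplifying assumption, not a rule stated in the chapter.

```python
def predict(x_new, slope, intercept, x_observed):
    """Return (predicted y, whether x_new lies within the observed x range)."""
    y_hat = intercept + slope * x_new
    within_range = min(x_observed) <= x_new <= max(x_observed)
    return y_hat, within_range

# Hypothetical regression line and observed x values, for illustration only.
slope, intercept = 0.6, 2.2
x_observed = [1, 2, 3, 4, 5]

y_at_3, ok_at_3 = predict(3, slope, intercept, x_observed)
y_at_0, ok_at_0 = predict(0, slope, intercept, x_observed)

print(round(y_at_3, 2), ok_at_3)  # 4.0 True
print(round(y_at_0, 2), ok_at_0)  # 2.2 False
```

The prediction at x = 0 is the y-intercept, 2.2, but 0 lies outside the observed x values, so in this made-up example the intercept should be interpreted with caution.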
What does the coefficient of determination measure? The coefficient of determination, R², measures the percentage of the total variation in the response variable, y, that is explained by the least-squares regression on the explanatory variable, x.
What values can the coefficient of determination take on? The coefficient of determination can take values between 0 and 1, inclusive.
What do the possible values of the coefficient of determination, R², indicate about the linear regression equation? R² = 1 means that 100% of the variation in the response variable is explained by the regression line. (The regression line fits the data exactly.) The closer R² is to zero, the less of the variation in y the regression line explains.
For linear regression equations (but not all regression equations) how might you find the coefficient of determination, R²? For a linear regression model, we just need to square the correlation coefficient, r, to find the value of the coefficient of determination, R².
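A short Python check can illustrate that, for a least-squares line with one explanatory variable, R² computed from the variation agrees with the square of r. The data are hypothetical, and the formula R² = 1 - (sum of squared residuals) / (total sum of squares about the mean of y) is the standard definition, assumed here because the chapter does not spell it out.

```python
from math import sqrt

# Hypothetical paired data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # total variation in y

# Correlation coefficient and least-squares line.
r = s_xy / sqrt(s_xx * s_yy)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Unexplained variation: sum of squared residuals.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / s_yy
print(round(r_squared, 4), round(r ** 2, 4))  # 0.6 0.6
```

In this made-up sample, about 60% of the variation in y is explained by the regression line, and squaring r gives the same value.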