Question#1
If I were to model the relationship between the mean (expected) number of
games won by a major-league team, y, and the team's batting average, x, I would
use a straight line, and its slope would be positive. This is because a positive
slope implies that y increases when x increases and decreases when x decreases,
and teams with higher batting averages are expected to win more games.
By contrast, a line with a negative slope, such as m = -4/3, indicates that
when x increases by 3, y decreases by 4, and when x decreases by 3, y increases by 4.
Question#2
The pattern revealed by the scattergram
agrees with my answer to part a.
To fit a simple linear regression to the data, a linear relationship
should exist between the two variables. Although there are several ways to
check whether such a relationship is present, the most direct is to
create a scatterplot (for example, in SPSS) in which the dependent variable is
plotted against the independent variable.
The equation of the least squares line is ŷ = a + bx.
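As a minimal sketch of how the coefficients a and b in ŷ = a + bx can be computed by hand, the following uses the usual sums of squares. The data values and the function name `least_squares_line` are hypothetical and are not the data from this assignment.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b = ss_xy / ss_xx          # slope
    a = y_bar - b * x_bar      # intercept
    return a, b


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

A statistics package such as SPSS reports the same two values in its coefficients table.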
Question#3
This graph shows that the least squares
line fits the points on my scattergram.
Question#4
After looking at the data, I found that
the mean (expected) number of games won is strongly related to a team's
batting average: the two variables are positively related, and the teams with
the highest batting averages also tend to win the most games.
Question#5
From the regression output, the least squares
line is ŷ = 119.86 + 0.346x. The model is statistically useful because the
p-value of the regression F test is less than 0.05. The estimates of the
intercept β0 and the slope β1 are 119.86 and 0.346, respectively. These values
were obtained from the regression table.
Question#6
The equation of the least squares line for
Brand A and Brand B is as follows:
y = mx + b
Here,
y = how far up (the vertical coordinate)
m = gradient or slope (how steep the line is)
x = how far along (the horizontal coordinate)
b = the y-intercept (the value of y where the line crosses
the y-axis)
Question#7
For the first brand:
For the second brand:
Question#8
I would like to use the least squares line to
predict useful life for a given cutting speed for the second brand, as its
value of y is better than the first brand’s value of y.
Question#9
The equation of the least squares line is as
follows:
Question#10
After testing at α = 0.05, I have found that
the straight-line model contributes information for predicting overhead costs.
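A test of whether the straight-line model contributes information is usually a t test on the slope. The sketch below shows that calculation under stated assumptions: the data are hypothetical (not the overhead cost data), the function name `slope_t_statistic` is made up for illustration, and the critical value is read from a t table rather than computed.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, t), where t tests H0: slope = 0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    # Residual sum of squares and estimated standard deviation of the error
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    t = b / (s / math.sqrt(ss_xx))
    return b, t


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, t = slope_t_statistic(xs, ys)
t_crit = 3.182  # two-sided critical value, alpha = 0.05, n - 2 = 3 df (t table)
reject = abs(t) > t_crit
print(f"b = {b:.3f}, t = {t:.3f}, reject H0: {reject}")
```

For this small made-up data set the slope is not significant. The conclusion stated above comes from the actual regression output, not from this sketch.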
Question#11
A scatterplot of the residuals allows us to check for
autocorrelation in the random error ϵ. The standard assumptions about ϵ in
this problem are that it has a mean of 0, a constant variance, a normal
distribution, and independence from one observation to the next. When computing
the matrix of Pearson's bivariate correlations among all independent variables,
every off-diagonal correlation should be smaller than 1 in absolute value.
Question#12
The slope of the least squares line is
positive as r is positive.
Question#13
The slope of the least squares line is
negative as r is negative.
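The link between the sign of r and the sign of the slope follows from the identity b = r · sqrt(SSyy / SSxx), because the square root is never negative. The sketch below checks this on two hypothetical data sets; the function name `r_and_slope` is an illustrative assumption.

```python
import math


def r_and_slope(xs, ys):
    """Return (r, b, ss_xx, ss_yy) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = ss_xy / math.sqrt(ss_xx * ss_yy)  # Pearson correlation coefficient
    b = ss_xy / ss_xx                     # least squares slope
    return r, b, ss_xx, ss_yy


# Hypothetical data: one upward trend, one downward trend
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 5]
ys_down = [5, 4, 5, 4, 2]

r_up, b_up, ss_xx, ss_yy_up = r_and_slope(xs, ys_up)
r_down, b_down, _, _ = r_and_slope(xs, ys_down)

print(f"upward:   r = {r_up:.3f}, b = {b_up:.3f}")
print(f"downward: r = {r_down:.3f}, b = {b_down:.3f}")
```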
Question#14
If the value of r is 0, then this will
indicate that there is no linear relationship between the data. It means that
as the value of x increases, the value of y shows no systematic linear increase
or decrease. In this situation, the slope of the least squares line will be 0.
Question#15
If the value of r² is 0.64, then r is either 0.8 or -0.8, and 64 percent of
the sample variation in y is explained by the linear relationship with x. The
value of r² alone does not reveal the direction of the relationship, so the
slope of the least squares line is nonzero, but its sign cannot be determined
without knowing the sign of r.
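The arithmetic behind r² = 0.64 can be checked directly: squaring removes the sign, so two values of r are consistent with it. A short sketch:

```python
import math

r_squared = 0.64
r_magnitude = math.sqrt(r_squared)

# Both signs of r give the same r squared, so the direction is not identified
candidates = (r_magnitude, -r_magnitude)
for r in candidates:
    print(f"r = {r:+.1f} gives r^2 = {r * r:.2f}")
```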
Question#16
The correlation coefficient for both sets of
data is as follows:
| Strength of Association | Coefficient, r (Positive) | Coefficient, r (Negative) |
|---|---|---|
| Small | 0.1 to 0.3 | -0.1 to -0.3 |
| Medium | 0.3 to 0.5 | -0.3 to -0.5 |
| Large | 0.5 to 1.0 | -0.5 to -1.0 |
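The table can be applied mechanically to any computed r. The sketch below is one way to do that; the function name `describe_r`, the "negligible" label for |r| below 0.1, and the choice to assign the shared boundaries 0.3 and 0.5 to the stronger category are all assumptions, since the table does not specify them.

```python
def describe_r(r):
    """Return (strength, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size >= 0.5:
        strength = "large"
    elif size >= 0.3:
        strength = "medium"
    elif size >= 0.1:
        strength = "small"
    else:
        strength = "negligible"  # assumed label; not part of the table
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


for value in (0.2, -0.4, 0.8):
    print(value, describe_r(value))
```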
Question#17
Weigh-in-motion data are generally less accurate
than data from a static weigh scale, where the environment is better
controlled. Without the correlation coefficients, it is not
possible to judge the effectiveness of the weigh-in-motion scale.
Question#18
The equation of the least squares line is as
follows:
Question#19
When the test is conducted at α = .05, we can
conclude that the data do not support this concept.
Question#20
The estimates of the intercept β0
and the slope β1 are 0 and 2, respectively.
Question#21
Yes, the annual energy consumption is
positively and linearly related to the shell area of the building.
Question#22
From this photo, it is evident that the
observed significance level of the test of part b is 28.
Question#23
The coefficient of determination for a linear
regression model with one independent variable is as follows:
\[
R^2 = \left\{ \frac{1}{N} \cdot \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \, \sigma_y} \right\}^2
\]
Here,
$N$ = the total number of observations in the model
$\sum$ = the summation symbol
$x_i$ = the x value for observation i
$\bar{x}$ = the mean x value
$y_i$ = the y value for observation i
$\bar{y}$ = the mean y value
$\sigma_x$ = the standard deviation of x
$\sigma_y$ = the standard deviation of y
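The formula can be evaluated term by term. The sketch below assumes that the standard deviations are population standard deviations (dividing by N), which is what makes the 1/N factor in the formula consistent; the data and the function name `r_squared` are hypothetical.

```python
from statistics import fmean, pstdev


def r_squared(xs, ys):
    """Coefficient of determination for one independent variable."""
    n = len(xs)
    x_bar = fmean(xs)
    y_bar = fmean(ys)
    # (1 / N) * sum of cross-products about the means
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    # pstdev divides by N, matching the 1 / N factor above
    r = covariance / (pstdev(xs) * pstdev(ys))
    return r ** 2


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

result = r_squared(xs, ys)
print(f"R^2 = {result:.2f}")
```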
Question#24
The predicted value of energy consumption can
be determined in the following way.
At the 95 percent confidence level, the
confidence interval around the regression line can be
calculated from the data.
Here
The standard error of the prediction is
For the specific value x0, the
predicted value will be
73.16.
This interval is useful because it quantifies the uncertainty
around the predicted value.
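For reference, a 95 percent prediction interval in simple linear regression is commonly computed as ŷ ± t · s · sqrt(1 + 1/n + (x0 - x̄)² / SSxx). The sketch below implements that formula under stated assumptions: the data are hypothetical (not the energy consumption data, so the output will not equal 73.16), the function name `prediction_interval` is made up, and the t value is taken from a t table for n - 2 degrees of freedom.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Return (y_hat, lower, upper) for a new observation at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    # Estimated standard deviation of the random error
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    y_hat = a + b * x0
    # Standard error of the prediction at x0
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    margin = t_crit * se_pred
    return y_hat, y_hat - margin, y_hat + margin


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
t_crit = 3.182  # t table value for 95% confidence with n - 2 = 3 df

y_hat, lower, upper = prediction_interval(xs, ys, x0=3, t_crit=t_crit)
print(f"prediction = {y_hat:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Dropping the leading 1 inside the square root gives the narrower confidence interval for the mean value of y at x0 rather than for a single new observation.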