MS 150 Statistics quiz three 4.2 linear regression • Name:

Grayware 2005
Month   Month num (x)   Threats in 10000s (y)
Jan      1               59
Feb      2               65
Mar      3               82
Apr      4               76
May      5               88
Jun      6               91
Jul      7               94
Aug      8               91
Sep      9              100
Oct     10              118
Nov     11              112
Dec     12              124

While the number of new malware threats (computer viruses, worms, exploits, and rootkits) appearing per month in 2005 leveled off, the number of grayware threats (computer adware, backdoors, downloaders, and droppers) appearing per month climbed steadily throughout 2005. Use the third column, the y data, to answer the following questions.

  1. __________ Find the mode of the threats column.
  2. __________ Find the median of the threats column.
  3. __________ Find the mean of the threats column.
  4. __________ Find the standard deviation of the threats column.
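
For anyone checking the spreadsheet results another way, the four statistics above can also be computed with a short script. This is a minimal sketch in Python using only the standard library; the variable names are assumptions introduced here, and the values are copied from the y column of the table.

```python
import statistics

# Threats in 10000s (y) for Jan through Dec 2005, copied from the table.
threats = [59, 65, 82, 76, 88, 91, 94, 91, 100, 118, 112, 124]

mode = statistics.mode(threats)      # most frequent value
median = statistics.median(threats)  # middle value of the sorted data
mean = statistics.mean(threats)      # sum of the values divided by n
stdev = statistics.stdev(threats)    # sample standard deviation (divides by n - 1)

print("mode:", mode)
print("median:", median)
print("mean:", round(mean, 2))
print("standard deviation:", round(stdev, 2))
```

Note that statistics.stdev is the sample standard deviation, which is assumed here to match the spreadsheet function =stdev(data).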

Use the second and third columns in the table above to find the linear regression (best fit) line through the data and to answer the questions below.

  1. ______________ Use the computer to plot the data. Does the relationship appear to be linear (roughly a straight line) or non-linear (curved)?
  2. ______________ Determine the slope of the linear regression for the data.
  3. ______________ Determine the y-intercept of the linear regression for the data.
  4. ______________ Determine the correlation coefficient r.
  5. ______________ Is the correlation positive or negative?
  6. ______________ Is the correlation none, weak, moderate, strong, or perfect?
  7. ______________ Determine the coefficient of determination.
  8. ______________ What percent of the variation in the grayware threats is "explained" by the variation in month number?
  9. ______________ Given that the trend has held to date, use the slope and intercept above to calculate the predicted grayware threat in September 2006 (for the month number use 21).
  10. ______________ Presume that the trend will continue. Use the slope and intercept to calculate the month number in which the threats will be 200.
  11. ______________ Toughie: What month name and year does the above month number correspond to?
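
The regression questions above can be sketched in Python as well. This is one possible approach, not the required method: it fits the least-squares line by hand, then uses the slope and intercept to predict a y value from a month number and to solve for the month number that gives a chosen y value. The names linear_fit and month_label are hypothetical helpers introduced for illustration, and month_label assumes that month number 1 is January 2005, as in the table.

```python
import math

# Month numbers (x) and threats in 10000s (y), copied from the table.
months = list(range(1, 13))
threats = [59, 65, 82, 76, 88, 91, 94, 91, 100, 118, 112, 124]


def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = linear_fit(months, threats)
r_squared = r ** 2  # coefficient of determination

# Predict y from x: y = slope * x + intercept (September 2006 is month 21).
predicted_sep_2006 = slope * 21 + intercept

# Solve for x given y: x = (y - intercept) / slope.
month_for_200 = (200 - intercept) / slope

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_label(month_number):
    """Map a whole month number (1 = Jan 2005) to a (name, year) pair."""
    index = month_number - 1
    return MONTH_NAMES[index % 12], 2005 + index // 12


print("slope:", round(slope, 4))
print("intercept:", round(intercept, 4))
print("r:", round(r, 4), "r^2:", round(r_squared, 4))
print("predicted threats for month 21:", round(predicted_sep_2006, 1))
print("month number for 200 threats:", round(month_for_200, 2))
print("month 21 is", month_label(21))
```

The month number for 200 threats will generally not be a whole number, so it has to be rounded before it is passed to month_label; whether to round up or down is left as a judgment call.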

Data based on the white paper The trend of threats today: 2005 Annual Roundup and 2006 Forecast 2006 by Trend Micro Incorporated.

Basic Statistics
Statistic or Parameter         Symbol    Equations   OpenOffice
Sample mode                                          =mode(data)
Sample median                                        =median(data)
Sample mean                    x̄         Σx/n        =average(data)
Sample standard deviation      sx or s               =stdev(data)

Linear Regression Functions
Statistic or Parameter         Math symbol   Stat symbol   OpenOffice
Slope                          m             b             =slope(y data;x data)
Intercept                      b             a             =intercept(y data;x data)
Correlation                                  r             =correl(y data;x data)
Coefficient of Determination                 r²            =(correl(y data;x data))^2
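
For reference, the quantities in the regression table can be written out using the stat symbols (b for slope, a for intercept). These are the usual least-squares definitions, which the spreadsheet functions are assumed to follow; the hat on y marks a predicted value and is notation introduced here.

```latex
\[
b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^{2}},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
r = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^{2}\,\sum (y - \bar{y})^{2}}}
\]
\[
\hat{y} = a + b\,x,
\qquad
x = \frac{\hat{y} - a}{b}
\]
```

The last line gives the two uses of the fitted line: predicting y from a month number, and solving for the month number that gives a chosen y.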

Statistics • Lee Ling • COMFSM