### MS 150 Statistics fx summer 2007 • Name:

On Sunday 08 July 2007 the Honolulu Advertiser ran an article covering the rising number of Micronesians using Hawaii's homeless shelters. The number nearly tripled between 2001 and 2006, and Micronesians now make up more than 20 percent of the state's total homeless population.

Micronesian homeless shelter users

| Year | Number |
|------|--------|
| 2001 | 286 |
| 2002 | 316 |
| 2003 | 554 |
| 2004 | 463 |
| 2005 | 513 |
| 2006 | 736 |

#### Part I: Basic Statistics

Use the number of Micronesians in homeless shelters in Hawaii ("number of shelter users") for the following calculations. Do not use the year data!

1. _________ What level of measurement is the number of shelter users data?
2. _________ Determine the sample size n.
3. _________ Calculate the sample mean x̄.
4. _________ Determine the median.
5. _________ Determine the mode.
6. _________ Determine the minimum.
7. _________ Determine the maximum.
8. _________ Calculate the range.
9. _________ Calculate the sample standard deviation sx.
10. _________ Calculate the sample Coefficient of Variation.
11. _________ Determine the class width. Use three bins (classes or intervals). Note that the number of bins is three!
12. Fill in the following table with the class upper limits in the first column, the frequencies in the second column, and the relative frequencies in the third column.

    | Bins (x) | Frequency f | RF p(x) |
    |----------|-------------|---------|
    | _________ | _________ | _________ |
    | _________ | _________ | _________ |
    | _________ | _________ | _________ |
    | Sums: | _________ | _________ |

13. Sketch a histogram of the relative frequency data.
14. __________________ What is the shape of the distribution?
15. __________________ Using the sample mean x̄ and the sample standard deviation sx calculated above, determine the z-score for the 736 Micronesian shelter users in 2006.
16. _________ Is the z-score for 736 Micronesian shelter users in 2006 an ordinary or extraordinary value?
17. _________ Calculate the standard error of the sample mean for the number of Micronesian shelter users.
18. _________ Find t-critical for a confidence level of 95% for the number of Micronesian shelter users.
19. _________ Determine the margin of error E for the sample mean.
20. Write out the 95% confidence interval for the population mean number of shelter users:
_________ ≤ μ ≤ _________

#### Part II: Hypothesis Testing using the t-test

The following data is also from the Honolulu Advertiser article cited above. This section examines whether there is a pairwise difference in the number of shelter users based on island ethnicity for the six years between 2001 and 2006.

Homeless shelter users

| Year | Hawaiians (x) | Micronesians (y) |
|------|---------------|------------------|
| 2001 | 1117 | 286 |
| 2002 | 1039 | 316 |
| 2003 | 864 | 554 |
| 2004 | 857 | 463 |
| 2005 | 756 | 513 |
| 2006 | 744 | 736 |

1. __________________ Use the paired TTEST function =TTEST(data_range_x;data_range_y;2;1) to determine the p-value for this paired two sample data.
2. __________________ Is the pairwise difference in the number of Hawaiians and the number of Micronesians statistically significant at a risk of a type I error of α = 0.05?
3. __________________ Would we fail to reject or reject a hypothesis of no mean pairwise difference in the numbers in homeless shelters in Hawaii based on ethnicity?
4. __________________ What is the maximum level of confidence we can have that the pairwise difference is statistically significant?

#### Part III: Linear Regression

Micronesian shelter users

| Year | Number |
|------|--------|
| 01 | 286 |
| 02 | 316 |
| 03 | 554 |
| 04 | 463 |
| 05 | 513 |
| 06 | 736 |

The data in this section examines whether there is a trend in the number of Micronesians entering homeless shelters from 2001 to 2006. Note that only the last two digits of the year are being used in this section.

1. _________ Calculate the slope of the best fit (least squares) line.
2. _________ Calculate the y-intercept of the best fit (least squares) line.
3. _________ Is the correlation positive, negative, or neutral?
4. _________ Calculate the predicted number of Micronesians in homeless shelters in Hawaii in 07.
5. _________ Calculate the year in which 1000 Micronesians are predicted to be entering homeless shelters in Hawaii.
6. _________ Calculate the linear correlation coefficient r for the data.
7. _________ Is the correlation none, low, moderate, high, or perfect?

The Honolulu Advertiser article is based on independent research by Michael D. Ullman. As always, statistics can be and are deployed to support particular positions. This data is likely being used by the state of Hawaii to seek reimbursement for Compact impact. The article makes an important note at the start: "[Micronesians] now make up more than 20% of the state's total homeless population." Given the data above, this suggests that roughly 20% are Hawaiian and that the 60% majority of the homeless are neither Hawaiian nor Micronesian.

#### Tables of Formulas and OpenOffice Calc functions

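Basic Statistics

| Statistic or Parameter | Symbol | Equations | OpenOffice |
|------------------------|--------|-----------|------------|
| Square root | | | =SQRT(number) |
| Sample standard deviation | sx or s | | =STDEV(data) |
| Sample Coefficient of Variation | CV | sx/x̄ | =STDEV(data)/AVERAGE(data) |
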
Confidence interval statistics

| Statistic or Parameter | Symbol | Equations | OpenOffice |
|------------------------|--------|-----------|------------|
| Degrees of freedom | df | = n-1 | =COUNT(data)-1 |
| Find a tc value from a confidence level c | tc | | =TINV(1-c;df) |
| Calculate the standard error of the mean | | | =sx/SQRT(n) |
| Calculate a margin of error for the mean E for n < 30 using sx. Should also be used for n ≥ 30. | E | | =tc*sx/SQRT(n) |
| Calculate a confidence interval for a population mean µ from a sample mean x̄ and an error tolerance E | | x̄ - E ≤ µ ≤ x̄ + E | |

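
The confidence interval rows above chain together: standard error, then margin of error, then the interval. The following is a minimal Python sketch of that chain, not part of the original formula sheet. The function name `mean_confidence_interval` and the sample values are made up for illustration, and the t-critical value is assumed to be looked up separately (for example with `=TINV(1-c;df)`), since the Python standard library has no inverse t-distribution.

```python
import math
import statistics


def mean_confidence_interval(data, tc):
    """Return (lower, upper) bounds for the population mean.

    tc is the t-critical value, looked up separately
    (for example with =TINV(1-c;df) in a spreadsheet).
    """
    n = len(data)
    mean = statistics.mean(data)          # sample mean
    sx = statistics.stdev(data)           # sample standard deviation (n - 1 divisor)
    standard_error = sx / math.sqrt(n)    # sx/SQRT(n)
    margin = tc * standard_error          # E = tc*sx/SQRT(n)
    return mean - margin, mean + margin


# Made-up illustrative sample, NOT the exam data.
sample = [2, 4, 4, 6, 9]
# 2.776 is the two-tailed 95% t-critical value for df = 4, taken from a t-table.
lower, upper = mean_confidence_interval(sample, tc=2.776)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

Passing `tc` in as an argument keeps the sketch free of third-party libraries; with a statistics package available, the lookup could be done in code instead.
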
Hypothesis Testing

| Statistic or Parameter | Symbol | Equations | OpenOffice |
|------------------------|--------|-----------|------------|
| Relationship between confidence level c and alpha α for two-tailed tests | | 1 − c = α | |
| Calculate t-critical for a two-tailed test | tc | | =TINV(α;df) |
| Calculate a t-statistic | t | | =(x̄ - µ)/(sx/SQRT(n)) |
| Calculate a two-tailed p-value from a t-statistic | p-value | | =TDIST(ABS(t);df;2) |
| Calculate a p-value for the difference of the means from two paired samples | | | =TTEST(data_range_x;data_range_y;2;1) |
| Calculate a p-value for the difference of the means from two independent samples, no presumption that σx = σy | | | =TTEST(data_range_x;data_range_y;2;3) |

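
A paired t-test is equivalent to applying the one-sample t-statistic formula to the pairwise differences with µ = 0. The sketch below illustrates that idea in Python under stated assumptions: the function name `paired_t_statistic` and the two small samples are hypothetical, and the decision is made by comparing |t| to a t-critical value from a table rather than by computing a p-value, because the standard library has no equivalent of `TDIST`.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """t-statistic for the mean pairwise difference, tested against µ = 0."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)       # mean of the pairwise differences
    s_d = statistics.stdev(diffs)         # sample standard deviation of the differences
    return mean_d / (s_d / math.sqrt(n))  # (mean - 0)/(s/SQRT(n))


# Made-up illustrative paired samples, NOT the exam data.
first = [10, 12, 9, 14, 11]
second = [8, 11, 7, 10, 10]

t = paired_t_statistic(first, second)
tc = 2.776                 # two-tailed t-critical for alpha = 0.05, df = 4, from a t-table
reject_null = abs(t) > tc  # True means the mean difference is statistically significant
print(f"t = {t:.3f}, reject null hypothesis: {reject_null}")
```
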
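Linear Regression Statistics

| Statistic or Parameter | Symbol | Equations | OpenOffice |
|------------------------|--------|-----------|------------|
| Slope | b | | =SLOPE(y data; x data) |
| Intercept | a | | =INTERCEPT(y data; x data) |
| Correlation | r | | =CORREL(y data; x data) |
| Coefficient of Determination | r² | | =(CORREL(y data; x data))^2 |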
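
For readers who want to see what the regression functions compute, here is a minimal Python sketch of the least squares slope, intercept, and correlation coefficient, followed by a forward prediction and an inverse prediction. The function name `least_squares` and the data points are made up for illustration; the spreadsheet functions listed above remain the intended tools.

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the best fit line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)        # linear correlation coefficient
    return slope, intercept, r


# Made-up illustrative points, NOT the exam data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r = least_squares(xs, ys)
predicted_y = slope * 6 + intercept       # predict y for a new x value of 6
x_for_target = (8 - intercept) / slope    # solve for the x at which y is predicted to reach 8
print(slope, intercept, r, r ** 2, predicted_y, x_for_target)
```
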