Statistics Homework

Final: Statistics II – 01:960:212:02 Spring 2021

NAME: _____________________________________________

Instructions:

a) Read the Instructions.

b) Put your name on the Final.

c) Show and explain all your work.
d) Clearly circle or underline your final answers where applicable.
e) If you get stuck on a problem, save it for the end. Do the easy ones first. Good Luck!!!

Problem 1 – TRUE or FALSE (Answer these on this exam sheet) – 1 pt each

a) A t-test assumes that the population standard deviation, σ, is known

( )True ( )False

b) One can reject the null hypothesis when the p-value is less than the level of significance the hypothesis test is performed at

( )True ( )False

c) When performing multiple comparisons, the Family Wise Error Rate is the probability of making at least one Type I error

( )True ( )False

d) When constructing a t-interval for a population mean, μ, increasing the sample size decreases the precision

( )True ( )False

e) Increasing the level of significance when performing a z-test entails an increase in the statistical power

( )True ( )False

f) An F-distribution is a continuous distribution on the non-negative real numbers

( )True ( )False

g) ANOVA is a method used to compare variances between two or more groups

( )True ( )False

h) A Binomial random variable is a discrete random variable

( )True ( )False

i) Statistical Power is 1 minus the probability of a Type I Error

( )True ( )False

j) A χ² distribution is a right-skewed continuous distribution

( )True ( )False

Problem 2 – Four brands of high-end earbuds are compared for sound quality. The four brands are Ear Light, Loud n’ Clear, Sound Aid and Crystal Clear. Without going into unnecessary details, the quality of sound can be determined objectively by measuring audio signals received by a robot head wearing the earbuds and then comparing them with the known signal wave that was sent. The unit of measure to quantify sound quality is referred to as a “Quali”, where a lower Quali value coincides with better sound quality. Below is a table displaying summary statistics for an experiment in which sound quality assessments were performed on the four earbud brands.

Earbud Type      Sample Size    Sample Average    Sample Standard Deviation
Ear Light        5              12                3.082207001
Loud n’ Clear    4              17                3.16227766
Sound Aid        7              16                1.414213562
Crystal Clear    6              15                1.673320053

a) Determine, at a level of significance α₀ = 0.05, whether there is any statistically significant difference in the sound quality of the earbud brands (the upper 5% cutoff value for an F-distribution on 3 numerator degrees of freedom and 18 denominator degrees of freedom is 3.16). – 4 pts

b) If one wanted to determine which pairs of earbuds have a statistically significant difference in sound quality, what level of significance should the individual pair-wise hypothesis tests be performed at to control the Family Wise Error Rate at 5% using the Bonferroni Method? (YOU DO NOT NEED TO PERFORM ANY OF THE PAIRWISE COMPARISONS). – 1 pt
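The calculations in Problem 2 can be checked from the summary statistics alone. The Python sketch below is one possible way to do so, not an official solution: it assumes the standard one-way ANOVA decomposition (between-group and within-group sums of squares) and the usual Bonferroni rule of dividing the family-wise level by the number of pairwise comparisons. All variable names are illustrative.

```python
from math import comb

# Summary statistics copied from the table in Problem 2:
# brand -> (sample size, sample average, sample standard deviation)
groups = {
    "Ear Light": (5, 12.0, 3.082207001),
    "Loud n' Clear": (4, 17.0, 3.16227766),
    "Sound Aid": (7, 16.0, 1.414213562),
    "Crystal Clear": (6, 15.0, 1.673320053),
}

k = len(groups)                                    # number of groups
n_total = sum(n for n, _, _ in groups.values())    # total sample size
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-group and within-group sums of squares
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = k - 1
df_within = n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

f_cutoff = 3.16            # upper 5% cutoff quoted in part (a)
reject_null = f_stat > f_cutoff

# Bonferroni: split the 5% family-wise level across all pairwise tests
num_pairs = comb(k, 2)
bonferroni_alpha = 0.05 / num_pairs

print(f"F = {f_stat:.4f} on ({df_between}, {df_within}) df, reject H0: {reject_null}")
print(f"{num_pairs} pairwise tests, each at level {bonferroni_alpha:.5f}")
```

The degrees of freedom computed here (3 and 18) match those quoted with the cutoff value in part (a).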

Problem 3: Multiple choice problems (There will be only one answer) – 2 pts each

I) Which of the following statements is true regarding a dotplot?

____

A. Dotplots depict the distribution of numerical data.

B. Scatterplot is another name for a dotplot.

C. Dotplots can only be constructed for data that is discrete.

D. Dotplots can be used to confirm that a data sample arises from a normal distribution.

II) When constructing a QQ-plot for determining whether it is plausible to assume a dataset
arises from a normal distribution, which of the following statements is true?

____

A. The empirical quantiles of the data must be on the vertical axis and the theoretical quantiles of a standard normal random variable must be on the horizontal axis.

B. The empirical quantiles of the data must be on the horizontal axis and the theoretical quantiles of a standard normal random variable must be on the vertical axis.

C. The empirical quantiles of the data can be on either axis and the other axis can display the theoretical
quantiles of any normal random variable.

D. The number of quantiles depicted on the QQ-plot (i.e. number of points), must be equal to the number of observations in your dataset.

III) Given an iid (independent and identically distributed) sample from a normal distribution,
the sampling distribution of the sample standard deviation follows a …

____

A. t-distribution

B. F-distribution

C. χ²-distribution

D. None of the above

IV) Which of the following is the correct frequentist interpretation of a 100(1 – α₀)% confidence interval for some parameter of interest?

____

A. The probability the parameter of interest falls within the confidence interval is α₀

B. The probability the parameter of interest falls within the confidence interval is (1 – α₀)

C. The probability the confidence interval covers the true value of the parameter is (1 – α₀)

D. The probability the confidence interval covers the true value of the parameter is α₀

V) Which of the following is NOT an assumption of the ANOVA model?

____

A. The observations within each group are an iid sample.

B. Observations in different groups are independent.

C. The number of observations in each group is the same.

D. The population standard deviation within each group is the same.

Problem 4 – State the Central Limit Theorem and why it is important. Be sure to list all the assumptions that are needed for the result of the Central Limit Theorem. – 4 pts
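The behaviour described by the Central Limit Theorem can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: it draws iid samples from an Exponential(1) distribution (mean 1, standard deviation 1, clearly non-normal), and the sample size of 50 and the 2000 repetitions are arbitrary choices. It then checks that the sample means behave approximately like a normal distribution with mean 1 and standard deviation 1/√50.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 50     # observations per sample (assumed for illustration)
NUM_SAMPLES = 2000   # number of repeated samples (assumed for illustration)

# Each entry is the average of one iid sample from Exponential(1)
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = 1.0 / SAMPLE_SIZE ** 0.5   # sigma / sqrt(n) with sigma = 1

# Share of sample means within 1.96 standard errors of the true mean;
# a normal approximation predicts roughly 95%
share_within = sum(
    abs(m - 1.0) <= 1.96 * theoretical_sd for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {theoretical_sd:.3f})")
print(f"share within 1.96 se: {share_within:.3f}")
```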

Problem 5 – The stacked barchart below reports the results of a random survey conducted in the cities of Byzantium and Constantinople. Individuals were randomly selected and asked whether they had been vaccinated against Bubonic Plague or not.

(a) How many individuals were sampled in Byzantium and Constantinople respectively? – 1 pt

(b) Provide an estimate of the percentage of individuals within each town who are vaccinated against Bubonic Plague. – 1 pt

(c) An epidemiologist wishes to prove that there is a difference in the percentage of individuals vaccinated in the two cities. Frame his research question as a hypothesis testing problem, explicitly describing the parameter/s involved and the respective hypotheses. – 1 pt

(d) Carry out the hypothesis test you specified in part (c) at level of significance α₀ = 0.2 using the data provided in the barchart. Would you accept/reject the null hypothesis in part (c)? Show and clearly explain all your work leading to your conclusion (the upper 10% cutoff value for a standard normal distribution is 1.28). – 3 pts
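The mechanics of a test like the one in part (d) can be sketched in Python. The barchart counts are not reproduced in this text, so the counts used below are HYPOTHETICAL placeholders and not the exam's data. The sketch also assumes the usual pooled two-proportion z-statistic for a two-sided test; the function name is illustrative.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z-statistic for H0: p1 = p2, given successes x and sample sizes n."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # common proportion estimated under H0
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# HYPOTHETICAL counts, for illustration only (not read from the barchart):
# 40 of 100 vaccinated in one city, 50 of 100 in the other
z = two_proportion_z(40, 100, 50, 100)

cutoff = 1.28                    # upper 10% normal cutoff quoted in part (d)
reject_null = abs(z) > cutoff    # two-sided test at level 0.2

print(f"z = {z:.4f}, reject H0: {reject_null}")
```

Because the test is two-sided at level 0.2, each tail receives 10%, which is why the quoted upper 10% cutoff is compared against the absolute value of z.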

Problem 6 – Recall that methods such as z-tests, t-tests, ANOVA, etc. all focus on analyzing the population mean, μ, of some variable of interest in a population. Explain why many statistical methods are framed in terms of the population mean, μ. Discuss not only the practical reasons but also any theoretical reasons. – 4 pts

Problem 7 – For each of the scenarios below, indicate which statistical method would be the best approach to answer the research question posed by putting an X to the right of the most appropriate method. – 2 pts each

(a)
An IT company wishes to improve its customer service by increasing the number of customer service representatives at its call center to handle calls. In order to inform the number of customer service representatives they need to hire, they needed to get a sense of how long customers are waiting on hold when calling their IT help line. They want to be 90% certain that they hire enough customer service representatives so that no customer is ever put on hold. Consequently, they conducted a survey to ascertain how long customers were put on hold. They randomly selected 75 incoming calls to the help line that were put on hold, and recorded the duration they were on hold. Let μ denote the mean waiting time on hold. What statistical method should they employ? (Pick one)

Compute a 90% Upper Bound for μ _______

Perform a Right-Tailed Test for μ _______

Perform a Left-Tailed Test for μ _______

Compute a 90% Lower Bound for μ _______
(b)
An aspirin manufacturer claims its bottles contain 500 grains of aspirin. Let μ represent the true mean weight of a tablet of aspirin. Since each bottle contains 100 tablets, if the manufacturer's claim is true then the true mean weight of the tablets should be 5 grains. Each of 100 tablets taken from a very large lot is weighed, resulting in a sample average weight of 4.87 grains and a sample standard deviation of 0.35 grain. An investigator wishes to know whether these data provide strong enough evidence to conclude that the company is short-changing the consumer. How should the investigator proceed? (Pick one)

Perform a Left-Tailed z-Test for μ _______

Perform a Left-Tailed t-Test for μ _______

Perform a Right-Tailed t-Test for μ _______

Compute a 90% Lower Bound for μ _______
(c)
In married couples, does one spouse (the husband or the wife) tend to live longer? By accessing public death registries, a Sociologist was able to obtain a simple random sample of 100 married couples who were born within 1 month of each other. Her goal is to answer the previous question. She recorded the age each partner passed away as well as which partner outlived their spouse. What is the best way to analyze this data to answer the question? (Pick one)
Perform a two sample, two-sided hypothesis test of equality of mean death ages _______

Perform a two-sided, paired hypothesis test where the null is that the common mean difference of death ages is 0 _______

Perform a right-tailed test on the population proportion of couples where wives outlive their husbands _______

Perform a left-tailed test on the population proportion of couples where husbands outlive their wives _______

(d)
Hedgehog Hedge Fund uses complex stock models to inform their trading strategies. One particular model, used for the car manufacturer Rocket Motors, utilizes various economic variables as model inputs. They are debating whether to take a position involving Rocket Motors stock. Consequently, they wish to be able to get a range of values for Rocket Motors’ future stock price which will cover the actual stock price with 95% certainty. As there is a lot of money at stake, the firm hires a Quantitative Analyst to tackle the previous problem. Recognizing that the model input that most affects the value of the estimated stock price is the ‘typical’ price of a gallon of unleaded regular gas, he requests the company’s market research group to provide him with the prices of a gallon of unleaded regular gas for a simple random sample of 1,000 gas stations across the US. How should he use this data to proceed with his task? (Pick one)

Calculate the sample average of the gas prices _______

Calculate the sample median of the gas prices _______

Compute a two-sided 95% confidence interval for the mean gas price _______

Calculate the sample standard deviation of the gas prices _______
(e)
An hourglass is a time-keeping device from antiquity used to keep track of time. The device comprises two glass bulbs arranged in a figure-eight pattern connected by a thin neck. Inside the bulbs is sand. When all the sand is collected in one bulb, the hourglass is turned upside down so that the bulb containing all the sand is on top. The sand will then spill through the neck to the lower bulb by the force of gravity. As the name implies, it should take an hour for all the sand to pour through the glass. However, as the device is crude, the actual time it takes for all the sand to empty from the top bulb could vary slightly from run to run due to slight irregularities in how the sand is distributed in the bulb (i.e. if it is piled highest towards the sides of the glass versus in the middle).

There are currently two main manufacturers of hourglasses – Time Stands Still and Sands of Time. An enthusiast of time pieces wishes to determine if one company’s hourglass provides more precise estimates of time than its competitor's. He purchases an hourglass from Time Stands Still, and one from Sands of Time. In his experiment, he activates a given hourglass and uses a stopwatch to determine the exact time it takes for the sand to empty from the top bulb to the bottom. He does this 8 times for the Time Stands Still hourglass, and 6 times for the Sands of Time version. How can he determine whether one hourglass is a more precise measure of an hour than the other? (Pick one)

Perform a two sample, two-sided hypothesis test of equality of population means _______

Perform an ANOVA (Analysis of Variance) _______

Perform a two sample, two-sided hypothesis test of equality of population standard deviations where the null is that the difference of population standard deviations is 0 _______

Perform a two sample, two-sided hypothesis test of equality of population standard deviations where the null is that the ratio of population standard deviations is 1 _______

Problem 8 – Below are QQ Plots for two different data sets.

Which of the datasets appears to be normally distributed, Dataset I or Dataset II? Explain your answer. – 3 pts

Problem 9 – Famed Climatologist, Dr. Lumvoir, wishes to publish the results of his recent study on climate change. The experiment entailed measuring the temperature at a fixed location at exactly the same time of day for one month, a task which he delegated to his Graduate Assistant (GA). The GA compiled the data and calculated the following summary statistics, which he reported to Dr. Lumvoir:
Average = 30.6ºF

Range = 15ºF

Median = 28ºF

IQR = 10ºF

Mode = 29ºF

Standard Deviation = 13.5ºF

Unfortunately, Dr. Lumvoir cannot use the summary statistics as is because they are in degrees Fahrenheit whereas scientific journals require data measurements to be reported using the metric system (recall temperature is measured in degrees Celsius in the metric system with the conversion formula from Celsius to Fahrenheit given by ºF = ºC x 1.8 + 32).
1) Dr. Lumvoir does not have access to raw data, so he asks his GA to recalculate the above summary statistics. Unfortunately, the GA is currently away on spring break. The submission deadline for the journal report is only a day away. Can Dr. Lumvoir somehow do the Fahrenheit to Celsius conversion for the above summary statistics without having the raw data? If so, what are the corresponding summary statistics when converted to degrees Celsius? – 3 pts
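The unit conversion asked about in part 1 can be sketched in Python. The sketch below is a possible check under a stated assumption, not an official solution: it assumes that measures of location (average, median, mode) follow the full conversion ºC = (ºF – 32) / 1.8, while measures of spread (range, IQR, standard deviation) are only rescaled by 1/1.8 because the shift of 32 cancels in differences. The function and dictionary names are illustrative.

```python
def fahrenheit_to_celsius(temp_f):
    """Invert the conversion F = C * 1.8 + 32 quoted in the problem."""
    return (temp_f - 32) / 1.8


# Summary statistics reported by the GA, in degrees Fahrenheit
location_f = {"Average": 30.6, "Median": 28.0, "Mode": 29.0}
spread_f = {"Range": 15.0, "IQR": 10.0, "Standard Deviation": 13.5}

# Location measures are shifted and rescaled
location_c = {name: fahrenheit_to_celsius(value) for name, value in location_f.items()}

# Spread measures are only rescaled, since the shift cancels in differences
spread_c = {name: value / 1.8 for name, value in spread_f.items()}

for name, value in {**location_c, **spread_c}.items():
    print(f"{name}: {value:.2f} C")
```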

2) To support his hypothesis, Dr. Lumvoir performed the hypothesis test below, where μ represents the mean temperature.
H0: μ ≥ 32ºF vs. HA: μ < 32ºF at level of significance α₀ = 0.05
Using the data measured in degrees Fahrenheit, he concluded that the null hypothesis should be rejected. However, as he must now convert the data to degrees Celsius, would his result change? Explain why or why not. (Hint: Think about how the forms of the above hypotheses would change and then look at the form of the standardized test statistic). – 2 pts

Problem 10 – The scatter plot below is for a bivariate sample where the horizontal axis is years of education and the vertical axis is yearly salary 10 years post schooling. The line in red is the least squares regression line estimated using the data. Comment on whether the assumptions necessary to perform inference on the regression model are satisfied. – 3 pts
