*
After collecting data, researchers are faced with pages of unorganized numbers, stacks of survey responses, and so on.
The goal of descriptive statistics is to aggregate the individual scores (data points) in a way that can be readily summarized.
The following are several options that can be used to get a “picture” of how scores are distributed.
*
*
Car Sales by Year in Millions
A bar chart is a very simple graph to construct that displays the frequency counts or percentages of a categorical variable.
It is particularly useful for categorical variables with 2 or more levels or categories.
It is one of the most effective ways to compare frequency counts or percentages graphically between the categories.
In a bar chart:
Horizontal axis: levels or categories of the variable of interest (categorical)
Vertical axis: frequency counts or percentages
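As a minimal sketch of the idea, the frequency counts behind a bar chart can be tallied and drawn as text. The `sales` list and the helper names `frequency_counts` and `text_bar_chart` are hypothetical, made up for illustration. For simplicity the bars are drawn sideways, so the categories run down the page rather than along the horizontal axis.

```python
from collections import Counter

# Hypothetical categorical data: the vehicle type recorded for 10 sales.
sales = ["sedan", "suv", "sedan", "truck", "suv",
         "sedan", "suv", "sedan", "truck", "sedan"]


def frequency_counts(values):
    """Return {category: count} for a categorical variable."""
    return dict(Counter(values))


def text_bar_chart(counts):
    """Render one bar per category; bar length equals the frequency count."""
    width = max(len(category) for category in counts)
    lines = []
    for category in sorted(counts):
        count = counts[category]
        lines.append(f"{category.ljust(width)} | {'#' * count} ({count})")
    return "\n".join(lines)


counts = frequency_counts(sales)
print(text_bar_chart(counts))
```

Replacing each count with `100 * count / len(sales)` would give the percentage version of the same chart.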
*
Useful in statistical analysis
Also excellent for huge quantities of data
Can show patterns otherwise invisible
In a scatter plot, each point is plotted based on the corresponding pair of values for one observation.
It can be used to spot outliers or possible errors, although these are not always apparent.
*
A histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
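The class frequencies described above can be sketched as follows. The data, the class boundaries, and the helper name `class_frequencies` are assumptions chosen for illustration. Each class includes its lower bound and excludes its upper bound.

```python
def class_frequencies(values, lower, width, n_classes):
    """Count how many values fall in each class [start, start + width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_classes:  # values outside all classes are ignored
            counts[index] += 1
    return counts


# Hypothetical measurements (for example, times in minutes).
times = [12, 15, 17, 18, 21, 22, 22, 24, 25, 27, 28, 31, 33, 38]
freqs = class_frequencies(times, lower=10, width=10, n_classes=3)

# Adjacent classes, one text bar per class; bar length is the class frequency.
for i, f in enumerate(freqs):
    start = 10 + 10 * i
    print(f"[{start}, {start + 10}): {'#' * f} ({f})")
```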
*
*
The statistics and graphs to report depend on the type of variable of interest:
For continuous variables
N, mean, standard deviation, minimum, maximum
histograms, dot plots, box plots, scatter plots
For categorical variables
frequency counts, percentages
one-way tables, two-way tables
bar charts
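For a continuous variable, the listed summary statistics could be computed along these lines. The `scores` values are hypothetical, and the sketch assumes the sample standard deviation (dividing by N - 1), which is what `statistics.stdev` computes.

```python
import statistics

# Hypothetical continuous measurements.
scores = [4, 8, 6, 5, 3, 7, 9, 6]

summary = {
    "N": len(scores),
    "mean": statistics.mean(scores),
    "sd": statistics.stdev(scores),  # sample SD (divides by N - 1)
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```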
*
*
A frequency distribution displays the number (or percent) of individuals that obtained a particular score or fell in a particular category
As such, these tables provide a picture of where people respond across the range of the measurement scale
One goal is to determine where the majority of respondents were located
*
Frequency
The number of individuals that obtained a particular score (or response)
Percent
The corresponding percentage of individuals that obtained a particular score
Cumulative Percent
The percentage of individuals that fell at or below a particular score (not relevant for nominal variables)
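The three columns defined above can be built in one pass over the sorted scores. This is a sketch on hypothetical 1-to-5 ratings; the function name `frequency_table` is made up for illustration.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (score, frequency, percent, cumulative percent)."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cumulative = 0
    for score in sorted(counts):
        cumulative += counts[score]
        rows.append((score,
                     counts[score],
                     100 * counts[score] / n,
                     100 * cumulative / n))
    return rows


# Hypothetical responses on a 1-to-5 rating scale.
responses = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
table = frequency_table(responses)

print("score  freq  percent  cum.percent")
for score, freq, pct, cum in table:
    print(f"{score:>5}  {freq:>4}  {pct:>7.1f}  {cum:>11.1f}")
```

Because the cumulative column relies on sorting the scores, it is only meaningful when the categories have a natural order, which is why it is not relevant for nominal variables.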
*
Frequency distribution showing the pizza delivery time of employees from a restaurant
*
*
*
The normal curve is often called the Gaussian distribution, after Carl Friedrich Gauss. Germany honored Gauss on the 10 Deutsche Mark banknote.
From http://www.willamette.edu/~mjaneba/help/normalcurve.html
*
A mathematical model, or an idealized conception, of the form a distribution might take under certain circumstances.
The mean of a sufficiently large random sample from almost any distribution is approximately normally distributed (Central Limit Theorem)
Many observations (height of adults, weight of children, intelligence) have approximately normal distributions
Shape
Bell shaped graph, most of data in middle
Symmetric, with mean, median and mode at same point
To indicate the spread of the distribution, we use the standard deviation (SD) around the mean as the unit.
±1 SD around the mean: about 68% of observations
±2 SD around the mean: about 95% of observations
±3 SD around the mean: about 99.7% of observations
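These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * share_within(k):.1f}%")
```

The same proportions hold for any mean and standard deviation, because the distance from the mean is measured in SD units.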
*
(from http://www.music.miami.edu/research/statistics/normalcurve/images/normalCurve1.gif)
*
Theoretical construction
Also called Bell Curve or Gaussian Curve
Perfectly symmetrical normal distribution
The mean of a distribution is the midpoint of the curve
The tails of the curve extend infinitely in both directions without touching the horizontal axis
Mean of the curve = median = mode
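Distances along the horizontal axis are measured in standard deviations from the mean, and the “area under the curve” between two points gives the proportion of observations in that range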
*
Whenever you see a normal curve, you should imagine the bar graph within it.
*
If your data fit a normal distribution, approximately 68% of your subjects will fall within one standard deviation of the mean.
Over 99% of your subjects will fall within three standard deviations of the mean.
*
When you have a subject’s raw score, you can use the mean and standard deviation to calculate his or her standardized score if the distribution of scores is normal. Standardized scores are useful when comparing a student’s performance across different tests, or when comparing students with each other. Your assignment for this unit involves calculating and using standardized scores.
z-score      -3   -2   -1    0    1    2    3
T-score      20   30   40   50   60   70   80
IQ-score     55   70   85  100  115  130  145
SAT-score   200  300  400  500  600  700  800
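The conversion behind these scales can be sketched as follows. The raw score, mean, and standard deviation are hypothetical. The scale parameters are the usual conventions consistent with the rows above: T-scores with mean 50 and SD 10, IQ scores with mean 100 and SD 15, SAT scores with mean 500 and SD 100. The function names are made up for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


def rescale(z, new_mean, new_sd):
    """Express a z-score on another standardized scale."""
    return new_mean + z * new_sd


# Hypothetical test: class mean 70, SD 8, and a student who scored 82.
z = z_score(raw=82, mean=70, sd=8)

t_score = rescale(z, new_mean=50, new_sd=10)
iq_score = rescale(z, new_mean=100, new_sd=15)
sat_score = rescale(z, new_mean=500, new_sd=100)

print(f"z = {z}, T = {t_score}, IQ = {iq_score}, SAT = {sat_score}")
```

Because every scale is a linear transformation of z, a student's relative standing is the same whichever scale is reported.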
*
Normal distributions (bell shaped) are a family of distributions that have the same general shape. They are symmetric (the left side is an exact mirror of the right side) with scores more concentrated in the middle than in the tails. Examples of normal distributions are shown to the right. Notice that they differ in how spread out they are.
Pearson defined kurtosis as a measure of departure from normality in a paper published in Biometrika. A distribution is platykurtic if it is flatter than the corresponding normal curve and leptokurtic if it is more peaked than the normal curve.
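The text does not give a usable formula, so the sketch below assumes one common moment-based definition: the fourth central moment divided by the squared variance, which equals 3 for a normal distribution. Under that assumption, values below 3 suggest a platykurtic shape and values above 3 a leptokurtic one. Both data sets and the function name `moment_kurtosis` are hypothetical.

```python
def moment_kurtosis(values):
    """Fourth central moment divided by the squared variance (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2


# Hypothetical data: evenly spread scores versus scores piled up in the middle.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]

print(f"flat:   {moment_kurtosis(flat):.2f}")
print(f"peaked: {moment_kurtosis(peaked):.2f}")
```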
*
The mean and standard deviation are useful ways to describe a set of scores. If the scores are grouped closely together, they will have a smaller standard deviation than if they are spread farther apart.
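A small sketch makes the point concrete. Both score sets are hypothetical and share the same mean; the sample standard deviation (`statistics.stdev`, dividing by N - 1) is assumed.

```python
import statistics

# Two hypothetical sets of scores with the same mean.
tight = [48, 49, 50, 51, 52]   # grouped closely together
spread = [30, 40, 50, 60, 70]  # spread farther apart

for name, scores in (("tight", tight), ("spread", spread)):
    print(f"{name}: mean = {statistics.mean(scores)}, "
          f"SD = {statistics.stdev(scores):.2f}")
```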
*
[Figure panels: Small Standard Deviation; Large Standard Deviation; Different Means, Different Standard Deviations; Different Means, Same Standard Deviations; Same Means, Different Standard Deviations]
[Figure: histogram with Length of Right Foot (4 to 14) on the horizontal axis and Number of People with that Shoe Size (1 to 8) on the vertical axis]
Data do not always form a normal distribution. When most of the scores are high, the distribution is not normal, but negatively (left) skewed.
Skew refers to the tail of the distribution.
*
Because the tail is on the negative (left) side of the graph, the distribution has a negative (left) skew.
[Figure: histogram with Length of Right Foot (4 to 14) on the horizontal axis and Number of People with that Shoe Size (1 to 8) on the vertical axis]
When most of the scores are low, the distribution is not normal, but positively (right) skewed.
Because the tail is on the positive (right) side of the graph, the distribution has a positive (right) skew.
*
When data are skewed, they do not possess the characteristics of the normal curve (distribution). For example, it is no longer true that about 68% of the subjects fall within one standard deviation above or below the mean, and the mean, mode, and median do not fall on the same score. The mode is still represented by the highest point of the distribution, but the mean is typically pulled toward the side with the tail, and the median typically falls between the mode and the mean.
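The ordering of the three measures can be seen on a small example. Both data sets are hypothetical; the second is the mirror image of the first. This ordering is a common pattern for single-peaked skewed data rather than a guarantee for every data set.

```python
import statistics

# Hypothetical positively (right) skewed scores: most are low, with a tail of high values.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 10]
# Mirror image: most scores are high, with a tail of low values (left skew).
left_skewed = [11 - x for x in right_skewed]


def center_measures(scores):
    """Return (mode, median, mean) for a list of scores."""
    return (statistics.mode(scores),
            statistics.median(scores),
            statistics.mean(scores))


for name, scores in (("right skew", right_skewed), ("left skew", left_skewed)):
    mode, median, mean = center_measures(scores)
    print(f"{name}: mode = {mode}, median = {median}, mean = {mean}")
```

In the right-skewed set the mean is the largest of the three, and in the left-skewed set it is the smallest, with the median between the mode and the mean in both cases.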
*
[Figure: Negative or Left Skew Distribution and Positive or Right Skew Distribution, each with the mean, median, and mode marked]