**1. What is the purpose of statistics?**

The purpose of statistics is to collect, organize, and analyze data (from a sample); interpret the results; and try to make predictions (about a population). One relies on statistics to determine how closely what actually happened matches what one anticipated would happen.

**2. What is the purpose of descriptive statistics?**

Descriptive statistics are used to summarize data in a clear and understandable way and enable the researcher to discern patterns and general trends. For example, suppose a researcher gave a test measuring the Mathematics achievement of 200 pupils at a Unity School. There are two basic methods to summarize data: numerical and graphical. Using the numerical approach, she or he might compute descriptive statistics such as the mean, percentages, frequencies, and standard deviation. These statistics convey information about the degree of achievement and the degree to which people differ in achievement. Using the graphical approach, one might create a stem-and-leaf display, histogram, pie chart, or box plot. These graphs contain detailed information about the distribution of achievement scores. Graphical methods are better suited than numerical methods for identifying patterns in the data, whereas numerical approaches are more precise and objective.
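
As a sketch of the numerical approach, the summary statistics mentioned above can be computed with Python's standard `statistics` module. The scores below are invented for illustration, not real study data:

```python
import statistics
from collections import Counter

# Hypothetical achievement test scores (invented for illustration)
scores = [72, 85, 90, 85, 78, 95, 60, 85]

mean = statistics.mean(scores)      # average level of achievement
median = statistics.median(scores)  # middle score
stdev = statistics.stdev(scores)    # how much pupils differ in achievement
frequencies = Counter(scores)       # how often each score occurs

print(mean)                          # 81.25
print(median)                        # 85.0
print(frequencies.most_common(1))    # [(85, 3)]
```

The mean and standard deviation together summarize the level of achievement and the spread of achievement, exactly the two pieces of information the paragraph above describes.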

**3. Why and how would one use inferential statistics?**

Inferential statistics are used to draw implications about a population from a sample. In inferential statistics, we compare a numerical result (test value) to a number that is reflective of a chance happening (critical value) and determine how significant the difference between these two numbers is. When we test a hypothesis or construct a confidence interval, we are doing inferential statistics.
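
The comparison of a test value to a critical value can be sketched with a one-sample z test. All numbers here are invented; 1.96 is the standard two-tailed critical value at the 0.05 level:

```python
import math

# Hypothetical setup: null hypothesis claims the population mean is 50,
# with a known population standard deviation of 10
mu_0 = 50.0    # value under the null hypothesis
sigma = 10.0   # population standard deviation (assumed known)
n = 100        # sample size
x_bar = 52.0   # observed sample mean

# Test value: standardized distance of the sample mean from mu_0
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

critical = 1.96  # two-tailed critical value for alpha = 0.05
reject_null = abs(z) > critical

print(z)            # 2.0
print(reject_null)  # True: the test value exceeds the critical value
```

Because the test value (2.0) exceeds the critical value (1.96), the difference is unlikely to be a chance happening, and the null hypothesis is rejected.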

**4. Are predictions indisputable in statistics?**

Statistics can be used to predict, but these predictions are not certainties. Statistics offers us a best guess. The fact that conclusions might be incorrect separates statistics from most other branches of mathematics. If a fair coin is tossed 10 times and 10 heads appear, the statistician would report that the coin is biased, which in this case would be incorrect. Such a conclusion is never certain; it is only a likely conclusion, reflecting the very low probability of getting 10 heads in 10 tosses of a fair coin.
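
That "very low probability" is easy to check: for a fair coin, each toss is an independent 50/50 event, so the chance of 10 heads in a row is (1/2) to the 10th power:

```python
# Probability of 10 heads in 10 tosses of a fair coin:
# each toss has probability 0.5, and the tosses are independent
p_ten_heads = 0.5 ** 10

print(p_ten_heads)  # 0.0009765625, i.e. about 1 chance in 1024
```

With less than a 0.1% chance of this outcome under a fair coin, "biased" is the likely conclusion, even though it happens to be wrong here.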

**5. What are hypotheses?**

Hypotheses are educated guesses (definitive statements) derived by logical analysis, using induction or deduction, from one's knowledge of the problem and from the purpose for conducting a study. They can range from very general statements to highly specific ones. We can translate our claims into substantive hypotheses (what you are trying to substantiate) and then turn each claim into null and alternative hypotheses. In a dissertation, a hypothesis usually is written in the present tense about a population, not a sample. For example, if you are studying the effect of PBL (project-based learning) at a particular charter school system, your hypothesis might be: There is a relationship between the quality of projects produced and achievement on standardized exams.

**6. What is statistical hypothesis testing?**

Statistical hypothesis testing, or tests of significance, is used to determine whether the differences between two or more descriptive statistics (such as a mean, percentage, proportion, or standard deviation) are statistically significant or more likely due to chance variations. It is a method of testing claims made about populations by using a sample (subset) from that population.

In hypothesis testing, descriptive numbers are standardized so that they can be compared to fixed values, which are found in tables and in computer programs and indicate how unusual it is to obtain the data collected. A statistical hypothesis to be tested is always written as a null hypothesis (no change). Generally, the null hypothesis will contain the symbol "=" to indicate the status quo, or no change. An appropriate test will tell us to either **reject the null hypothesis** or **fail to reject** (in essence, accept) **the null hypothesis**. We do not use the word *accept* when discussing statistical results.

Some people refer to the null hypothesis as the “no” hypothesis: *no* relationship, *no* change, *no* difference. If the null hypothesis is not rejected, this does not lead to the conclusion that no association or differences exist, but instead that the analysis *did not detect* any association or difference between the variables or groups. Failing to reject the null hypothesis is comparable to a finding of not guilty in a trial. The defendant is not declared innocent. Instead, there is not enough evidence to be convincing beyond a reasonable doubt. In the judicial system, a decision is made and the defendant is set free.

**7. Once I find an appropriate test for my hypothesis, is there anything else I need to be concerned about?**

Certain conditions are necessary prior to initiating a statistical test. One important condition is the **distribution of the data**. Once data are standardized and the significance level determined, a statistical test can be performed to analyze the data and possibly make inferences about a population (universe).

**8. What are p values?**

A *p* value (or probability value) is the probability of getting a value of the sample test statistic that is at least as extreme as the one found from the sample data, assuming the null hypothesis is true. Traditionally, statisticians used alpha (α) values that set up a dichotomy: reject/fail to reject the null hypothesis. A *p* value measures how confident we can be in rejecting a null hypothesis. If a *p* value is less than 0.01, we say the result is highly statistically significant, and there is very strong evidence against the null hypothesis. A *p* value between 0.01 and 0.05 indicates that the result is statistically significant and there is adequate evidence against the null hypothesis. For a *p* value greater than 0.05, there is generally insufficient evidence against the null hypothesis, and the null hypothesis is not rejected.

| p value | Interpretation |
| --- | --- |
| p < 0.01 | Very strong evidence against H₀ |
| 0.01 ≤ p < 0.05 | Moderate evidence against H₀ |
| 0.05 ≤ p < 0.10 | Suggestive evidence against H₀ |
| p ≥ 0.10 | Little or no real evidence against H₀ |
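
The cutoffs in the table can be expressed as a small helper function. Note that the boundaries at 0.01, 0.05, and 0.10 are conventions, not laws, and the labels below simply follow the table:

```python
def evidence_against_null(p: float) -> str:
    """Map a p value to the rough evidence categories used in the table."""
    if p < 0.01:
        return "very strong"
    elif p < 0.05:
        return "moderate"
    elif p < 0.10:
        return "suggestive"
    else:
        return "little or none"

print(evidence_against_null(0.004))  # very strong
print(evidence_against_null(0.03))   # moderate
print(evidence_against_null(0.42))   # little or none
```

In practice the exact p value should always be reported alongside any such verbal label.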


**9. What is the connection between hypothesis testing and confidence intervals?**

There is an extremely close relationship between confidence intervals and hypothesis testing. When a 95% confidence interval is constructed, all values in the interval are considered plausible values for the parameter being estimated. Values outside the interval are rejected as implausible. If the value of the parameter specified by the null hypothesis is contained in the 95% interval, then the null hypothesis cannot be rejected at the 0.05 level. If the value specified by the null hypothesis is not in the interval, then the null hypothesis can be rejected at the 0.05 level. If a 99% confidence interval is constructed, then values outside the interval are rejected at the 0.01 level.
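
A sketch of that equivalence, using the large-sample 95% interval x̄ ± 1.96·s/√n. The data are invented, and for a sample this small a t critical value would normally replace 1.96, but the logic is the same:

```python
import math
import statistics

# Hypothetical measurements (invented for illustration)
sample = [9.8, 10.2, 10.4, 10.6, 10.1, 10.3, 9.9, 10.7, 10.5, 10.0]

x_bar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
margin = 1.96 * se                      # z-based 95% margin of error
ci = (x_bar - margin, x_bar + margin)   # 95% confidence interval

# Null hypothesis H0: mu = 10.0
mu_0 = 10.0
reject_at_05 = not (ci[0] <= mu_0 <= ci[1])

print(ci)            # roughly (10.06, 10.44)
print(reject_at_05)  # True: 10.0 lies outside the 95% interval
```

Because 10.0 falls outside the interval, the same decision (reject at the 0.05 level) follows from the confidence interval as from the corresponding two-tailed test.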

**10. What does statistically significant mean?**

In English, *significant* means important. In statistics, it means *unlikely to be due to chance*. Significance levels show how likely it is that a result is due to chance alone. The most common level is 0.05: a finding reported at the 0.05 level of significance means there is less than a 5% probability of obtaining a result at least that extreme by chance if the null hypothesis is true. This corresponds to a 95% confidence level. To find the confidence level, subtract the significance level from 1. For example, a significance level of 0.01 corresponds to a 99% (1 – 0.01 = 0.99) confidence level.

**11. What are the different levels of measurement?**

Data come at four levels of measurement, which can be remembered by **NOIR**, the French word for black: nominal (lowest), ordinal, interval, and ratio (highest).

| Scale | Description |
| --- | --- |
| Nominal Scale | Measures in terms of names, designations, or discrete categories. Examples: gender, color of home, religion, etc. |
| Ordinal Scale | Measures in terms of such values as more or less, larger or smaller, but without specifying the size of the intervals. Examples: rating scales, ranking scales, Likert-type scales. |
| Interval Scale | Measures in terms of equal intervals or degrees of difference, but without a true zero point. Ratios do not apply. Examples: temperature, GPA, IQ. |
| Ratio Scale | Measures in terms of equal intervals and an absolute zero point of origin. Ratios apply. Examples: height, delay time, weight. |

A general and important guideline is that statistics based on one level of measurement should not be used for a lower level, but can be used for a higher level. An implication of this guideline is that data obtained from using a Likert-type scale (a scale in which people set their preferences from, say, 1 = *totally agree* to 7 = *totally disagree*) should, generally, not be used in parametric tests. However, there is controversy regarding treating Likert-type scales as interval data (see below). However, if you cannot use a parametric test, there is almost always an alternative approach using nonparametric tests.

Can Likert-type scales be considered interval? Likert-type scales are used to quantify results and obtain shades of perceptions. Choices (or categories of responses) usually range from *strongly disagree* to *strongly agree*. As the categories move from one to the next (e.g., from *strongly disagree* to *disagree*), the value will increase by one unit. The Likert-type scale has equal units as the categories move from most negative to most positive. This allows measurement of attitudes, beliefs, and perceptions, providing an efficient and effective means of quantifying data.

Although Likert-type scales are ordinal data, they are commonly used with interval procedures, provided the scale item has at least five and preferably seven categories. Most researchers would not use a 3-point Likert-type scale with a technique requiring interval data. The fewer the number of points, the more likely the departure from the **assumption of normal distribution** required for many tests. Here is a typical footnote inserted in research using interval techniques with Likert-type scales:

*In regard to the use of (insert name of test – such as t test or Pearson test), which assumes interval data, with ordinal Likert-type scale items, in a review of the literature on this topic, Jaccard and Wan (1996, p. 4) found, “for many statistical tests, rather severe departures (from intervalness) do not seem to affect Type I and Type II errors dramatically when scales of five or seven categories are used.”*

Under certain circumstances, there is general consensus that ranked data are interval. This would happen, for instance, in a survey of children’s allowances if all children in the sample got allowances of $5, $10, or $15 exactly and these were measured as low, medium, and high. That is, intervalness is an attribute of the data, not of the labels.

**12. What type of distributions can be found when data are collected?**

One of the most important characteristics related to the shape of a distribution is whether the distribution is skewed or symmetrical. Skewness (the degree of asymmetry) is important. A large degree of skewness causes the mean to be less acceptable and useful as the measure of central tendency. To use many parametric statistical tests requires a normal (symmetrical) distribution of the data. Graphical methods such as histograms are very helpful in identifying skewness in a distribution.

If the mean, median, and mode are identical, then the shape of the distribution will be unimodal, symmetric, and resemble a normal distribution. A distribution that is skewed to the right and unimodal will have a long right tail, whereas a distribution that is skewed to the left and unimodal will have a long left tail. A unimodal distribution that is skewed has its mean, median, and mode occur at different values. For highly skewed distributions, the median is the preferred measure of central tendency, since a mean can be greatly affected by a few extreme values on one end.
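
A quick illustration of why the median is the preferred measure for skewed data: a single extreme value drags the mean far from the bulk of the data but leaves the median untouched. The values are invented:

```python
import statistics

# Invented right-skewed data: one extreme value on the right
values = [1, 2, 3, 4, 100]

print(statistics.mean(values))    # 22, pulled far toward the outlier
print(statistics.median(values))  # 3, still describes the typical value
```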

Kurtosis is a parameter that describes whether the particular distribution concentrates its probability in a central peak or in the tails, or how pointed or flat a distribution looks. Kurtosis explains how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3. Platykurtic means flatter than a normal curve; leptokurtic means pointier than a normal curve.
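
These ideas can be computed directly from the moment-based definitions: skewness is the third standardized moment m₃/m₂^(3/2), and kurtosis is m₄/m₂², on which the normal distribution scores 3 (this is the population formula; sample-corrected versions differ slightly). The data below are invented:

```python
def skewness(data):
    """Third standardized moment: 0 for a perfectly symmetric distribution."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

def kurtosis(data):
    """Fourth standardized moment: 3 for a normal distribution."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

symmetric = [1, 2, 3, 4, 5]        # symmetric: skewness 0
right_skewed = [1, 1, 1, 2, 10]    # long right tail: positive skewness

print(skewness(symmetric))     # 0.0
print(skewness(right_skewed))  # positive (about 1.46)
print(kurtosis(symmetric))     # 1.7, flatter than normal (platykurtic)
```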

**13. What is the difference between a parametric and a nonparametric test?**

Most of the better-known statistical tests use parametric methods. These methods generally impose strict requirements such as the following:

1. The data should be ratio or interval.

2. The sample data must come from a normally distributed population.

Advantages of nonparametric methods are as follows:

1. They can be applied to a wider variety of situations and are distribution free.

2. They can be used with nominal and ranked data.

3. They use simpler computations and can be easier to understand.

Drawbacks of nonparametric methods are as follows:

1. They tend to waste information since most of the information is reduced to qualitative form.

2. They are generally less sensitive, so stronger evidence is needed to show significance, which could mean larger samples are needed for statistical data analysis.
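
As a sketch of how a rank-based nonparametric method works, here is the Mann-Whitney U statistic computed by direct pair counting. The groups are invented; a complete analysis would also need the associated p value (available, for example, from `scipy.stats.mannwhitneyu`):

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count the pairs in which a value from
    group_a exceeds a value from group_b, with ties counting one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

a = [1, 2, 3]
b = [4, 5, 6]

print(mann_whitney_u(a, b))  # 0.0: every value in a is below every value in b
print(mann_whitney_u(b, a))  # 9.0: the maximum possible, len(a) * len(b)
```

Note that only the ordering of the values matters, not their magnitudes, which is exactly why the method works with ranked data but "wastes" some of the information a parametric test would use.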
