Of the more than two million college degrees granted in the U.S. every year, including those earned at accredited online colleges nationwide, probably two-thirds require completion of a statistics class. That’s well over a million students taking Statistics 101, and even more when you consider that some don’t complete the course.
Everybody who has completed high school has learned some statistics. There are good reasons for that. Your class grades were averages of scores you received for tests and other efforts. Most of your classes were graded on a curve, requiring the concepts of the Normal distribution, standard deviations, and confidence limits. Your scores on standardized tests, like the SAT, were presented in percentiles. You learned about pie and bar charts, scatter plots, and maybe other ways to display data. You might even have learned about equations for lines and some elementary curves. So by the time you got to prom, you were exposed to at least enough statistics to read USA Today.
Faced with taking Statistics 101, you may be filled with excitement, ambivalence, trepidation, or just plain terror. Your instructor may intensify those feelings with his or her teaching style and class requirements. So to make things just a bit easier, here are a few concepts to remember.
Everything is Uncertain
The fundamental difference between statistics and most other types of data analysis is that in statistics, everything is uncertain. Input data have variabilities associated with them. If they don’t, they are of no interest. As a consequence, results are always expressed in terms of probabilities.
Every data measurement is variable, consisting of:
- Characteristic of Population—This is the part of a data value that you would measure if there were no variability. It’s the portion of a data value that is the same between a sample and the population the sample is from.
- Natural Variability—This part of a data value is the uncertainty or variability in population patterns. It’s the inherent differences between a sample and the population. In a completely deterministic world, there would be no natural variability.
- Sampling Variability—This is the difference between a sample and the population that is attributable to how uncharacteristic (non-representative) the sample is of the population.
- Measurement Variability—This is the difference between a sample and the population that is attributable to how data were measured or otherwise generated.
- Environmental Variability—This is the difference between a sample and the population that is attributable to extraneous factors.
The goal of most statistical procedures is to estimate the characteristic of the population, characterize the natural variability, and control and minimize the sampling, measurement, and environmental variability. Minimizing variance can be difficult because there are so many causes and because the causes are often impossible to anticipate or control. So if you’re going to conduct a statistical analysis, you’ll need to understand the three fundamentals of variance control—Reference, Replication, and Randomization.
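The components of a data value listed above can be sketched as a small simulation. This is only an illustration, assuming a data value is the population characteristic plus independent natural and measurement components. The function name, the Normal-shaped noise, and every number are hypothetical, and sampling and environmental variability are left out to keep it short.

```python
import random
import statistics

def simulate_measurements(n, true_mean=50.0, natural_sd=5.0,
                          measurement_sd=1.0, seed=0):
    """Return n simulated data values (all parameter values are made up)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    values = []
    for _ in range(n):
        natural = rng.gauss(0.0, natural_sd)          # natural variability
        measurement = rng.gauss(0.0, measurement_sd)  # measurement variability
        # data value = characteristic of population + variability components
        values.append(true_mean + natural + measurement)
    return values

small = simulate_measurements(10)
large = simulate_measurements(10_000)

print("mean of 10 values:    ", round(statistics.mean(small), 2))
print("mean of 10,000 values:", round(statistics.mean(large), 2))
print("spread of 10,000:     ", round(statistics.stdev(large), 2))
```

With more data, the mean should settle close to the assumed population characteristic of 50, while the spread stays near the combined natural and measurement variability.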
Statistics ♥ Models
Statistics and models are closely intertwined. Models serve as both inputs and outputs of statistical analyses. Statistical analyses begin and end with models.
Statistics uses distribution models (equations) to describe what the frequency distribution of your data would look like if the data were a perfect representation of the population. If data follow a particular distribution model, like the Normal distribution, the model can be used as a template for the data to represent data frequencies and error rates. This is the basis of parametric statistics; you evaluate your data as if they came from a population described by the model.
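Here is a rough sketch of using a distribution model as a template, assuming a Normal model fitted with the sample mean and standard deviation. The test scores and the helper function names are hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical test scores, made up for illustration.
scores = [62, 68, 71, 74, 75, 77, 78, 80, 81, 83, 85, 88, 91, 95]

# The distribution model: a Normal curve with the data's mean and spread.
model = NormalDist(mu=mean(scores), sigma=stdev(scores))

def expected_fraction(dist, low, high):
    """Fraction of values the model says should fall between low and high."""
    return dist.cdf(high) - dist.cdf(low)

def observed_fraction(data, low, high):
    """Fraction of the actual data values between low and high."""
    return sum(low <= x <= high for x in data) / len(data)

# Compare model and data within one standard deviation of the mean.
low = model.mean - model.stdev
high = model.mean + model.stdev
print("model says:", round(expected_fraction(model, low, high), 3))
print("data show: ", round(observed_fraction(scores, low, high), 3))
```

If the two fractions are close across several ranges, the model is a reasonable template for the data; large gaps suggest the data may not follow that model.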
Statistical techniques are also used to build models from data. Statistical analyses estimate the mathematical coefficients (parameters) for the terms (variables) in the model, and include an error term to incorporate the effects of variation. The resulting statistical model, then, provides an estimate of the measure being modeled along with the probability that the model might have occurred by chance, based on the distribution model.
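As a minimal sketch of building a model from data, the following fits a straight line by ordinary least squares, one common technique among many. The study-hours data and the function name are hypothetical, and the sketch stops at the coefficients and the error term (residuals); it does not compute the probability that the model occurred by chance.

```python
def fit_line(x, y):
    """Estimate slope and intercept of y = intercept + slope * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are the part of each value the line does not explain.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals

# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept, residuals = fit_line(hours, score)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("residuals:", [round(r, 2) for r in residuals])
```

The slope and intercept are the estimated coefficients (parameters), and the residuals play the role of the error term.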
Measurement Scales Shape Analyses
You may not hear very much about measurement scales in Statistics 101, but you should at least be aware of the difference between nominal scales, ordinal scales, and continuous scales. Nominal scales, also called grouping or categorical scales, are like stepping stones; each value of the scale is different from other values, but neither higher nor lower. Ordinal scales are like steps; each value of the scale has a distinct break from the next value, which is either higher or lower. Continuous scales are like ramps; each value of the scale is just a little bit higher or lower than the next value. There are many more types of scales, especially for measuring time, but that’s enough for Statistics 101.
The reason measurement scales are important is that they will help guide which graph or statistical procedure is most appropriate for an analysis. In some situations, you can’t even conduct a particular statistical procedure if the data scales are not appropriate.
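One way to picture how the scale guides the procedure is a small helper that picks a typical-value summary by scale. This is a simplified rule of thumb rather than a complete guide, and the function name, scale labels, and data are all hypothetical.

```python
import statistics

def summarize(values, scale):
    """Pick a 'typical value' summary that suits the measurement scale."""
    if scale == "nominal":
        # Categories have no order, so only the most common value makes sense.
        return statistics.mode(values)
    if scale == "ordinal":
        # Ordered steps: the middle value is meaningful, an average is not.
        return statistics.median_low(values)
    if scale == "continuous":
        # Values on a ramp can be averaged.
        return statistics.mean(values)
    raise ValueError(f"unknown scale: {scale}")

eye_color = ["brown", "blue", "brown", "green", "brown"]  # nominal
satisfaction = [1, 2, 2, 3, 5]                            # ordinal (1 = low, 5 = high)
height_cm = [160.0, 172.5, 181.0]                         # continuous

print(summarize(eye_color, "nominal"))
print(summarize(satisfaction, "ordinal"))
print(round(summarize(height_cm, "continuous"), 1))
```

Asking for the mean of eye colors would fail outright, which is the sense in which some procedures cannot be run on the wrong scale.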
Everything Starts with a Matrix
You may not realize it in Statistics 101, but all statistical procedures involve a matrix. Matrices are convenient ways to assemble data so that computers can perform mathematical calculations. If you go beyond Statistics 101, you’ll learn a lot about matrix algebra. But for Statistics 101, all you have to know is that a matrix is very much like a spreadsheet. In a spreadsheet you have rows and columns that define rectangular areas, called cells. In statistics, the rows of the spreadsheet represent individual samples, cases, records, observations, entities that you’re making measurements on, sample collection points, survey respondents, organisms, or any other point or object on which information is collected. The columns represent variables, the measurements or the conditions or the types of information you’re recording. The columns can correspond to instrument readings, survey responses, biological parameters, meteorological data, economic or business measures, or any other types of information. You usually have several sets of variables for a given set of samples. Together, the rows and the columns of the spreadsheet define the cells, which is where the data are stored. Samples (rows), variables (columns), and data (cells) are the matrix that goes into a statistical analysis. If you understand data matrices, you’ll be able to conduct statistical analyses even without your Statistics 101 instructor to help you.
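The spreadsheet picture can be sketched in a few lines, with rows as samples and columns as variables. The variable names, sample labels, and numbers below are hypothetical, and plain lists are used here only to keep the sketch free of extra libraries.

```python
# Columns are variables; rows are samples (here, survey respondents).
variables = ["age", "height_cm", "weight_kg"]
samples = ["respondent_1", "respondent_2", "respondent_3"]

# Each inner list is one row; each number is one cell of the matrix.
data = [
    [23, 170.0, 65.0],
    [35, 182.0, 80.0],
    [29, 165.0, 58.0],
]

def column(matrix, names, name):
    """Return every sample's value for one variable."""
    j = names.index(name)
    return [row[j] for row in matrix]

def column_means(matrix):
    """Return the mean of each variable (column) across all samples."""
    n = len(matrix)
    return [sum(col) / n for col in zip(*matrix)]

print(column(data, variables, "age"))
for name, avg in zip(variables, column_means(data)):
    print(name, round(avg, 2))
```

Most statistical software expects data in this shape, so arranging data this way is usually the first step of an analysis.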
Statistics is More than Description and Testing
In Statistics 101, you learn about probability, distribution models, populations, and samples. Eventually, this knowledge will enable you to describe the statistical properties of a population and to test the population for differences from other populations. But these capabilities, formidable though they are, don’t reveal the truly mind-boggling analyses you can do with statistics. You can:
- Describe—characterizing populations and samples using descriptive statistics, statistical intervals, correlation coefficients, and graphics.
- Compare and Test—detecting differences between statistical populations or reference values using simple hypothesis tests and analysis of variance and covariance.
- Identify and Classify—identifying known or hypothesized entities or classifying groups of entities using descriptive statistics, statistical tests, graphics, and multivariate techniques such as cluster analysis and data mining techniques.
- Predict—predicting measurements using regression and neural networks, forecasting using time-series modeling techniques, and interpolating spatial data.
- Explain—explaining latent aspects of phenomena using regression, cluster analysis, discriminant analysis, factor analysis, and other data mining techniques.
So don’t get discouraged if you can’t see how statistics will help you in your career based on Statistics 101. There’s a lot more out there. You just have to take the first step.
Read more about using statistics at the Stats with Cats blog. Join other fans at the Stats with Cats Facebook group and the Stats with Cats Facebook page. Order Stats with Cats: The Domesticated Guide to Statistics, Models, Graphs, and Other Breeds of Data Analysis at Wheatmark, amazon.com, barnesandnoble.com, or other online booksellers.
We wanted to let you know that your blog was included in our list of the top 50 statistics blogs of 2011. Our goal was to highlight blogs that students and prospective students will find useful and interesting in their exploration of the field.
You can view the entire list at http://www.thebestcolleges.org/best-statistics-blogs/
Right on target. As a teacher, I see a great deal of frustration from students who didn’t know what they were getting into by taking a college statistics class. They thought that with their high school statistics they would do great, and to some degree that is the case for the majority, but your mileage may vary considerably.
I have no interest in learning Statistics but still need to take it to fulfill the IGETC. It’s either that or another type of math, which I think would be even more difficult for me (Precalculus). I am going to read this page again several times and hope for the best when I take this class this coming Summer.
How did you end up doing in the class, Sophie?
Hi hyacinth69, sorry it took so long to reply! I ended up having an absolutely amazing instructor in the summer of 2014 and did well enough to pass with an A!! I was terrified but having the right instructor made all the difference in the world! There was no HW or quiz, just 4 exams, and he said if we were satisfied with our grade after the first two or three, then taking the fourth exam was optional. I got just enough total points to skim by in managing to snag an A so I was like, See Ya! I sent the instructor a small gift card as a thank you for being such a great teacher but I never have to deal with another math class, so yeah!
I have to take stat again after failing it my soph year. I need it for social work in order to take other classes, but I don’t get it at all.
Sorry to hear that Everett! I wish I could say to take it with the instructor I had but I know you probably don’t live in Orange County, CA. I am hopeless when it comes to anything mathematical but my instructor actually made it understandable to someone math averse like myself. I have read somewhere that Statistics For Dummies is a decent book but I don’t know for sure. Find a tutor if you can afford one or go to the tutoring center frequently, but don’t try to do it all on your own the second time. Wishing you the very best of luck!!!
Statistics helps in making decisions about everyday life. I like learning statistics. Thanks for the article.
I want to study statistics
I wish there were examples of the things you discussed in the article.
Here you are learning something in a business or other real-world context. I am walking away really feeling like I have an introductory knowledge of statistics.
Sophie, who was your instructor and where did you go to school for your Stats class?
Hi, I took the class in Irvine, CA, at Irvine Valley College with the instructor Seth Hochwald. Amazing teacher.
The teacher makes all the difference.
I just did my final in my Stats class, and I probably did terribly. It’s weird, because I’m usually great at math, and I went to every class, did all the homework, had a tutor, and still couldn’t answer a couple of questions on the final test, and the final counted for 30% of the final grade. 😦
I had class 3 hours per week, with a teacher who was difficult to understand (had a thick Asian accent), who kept changing the vocabulary/definitions, and who graded on following his steps to a ‘t’, regardless of the results or the understanding of the subject. Some of his questions didn’t make any sense.
So, yeah! Sometimes the teacher makes all the difference!