It was Professor Plot in the Diagram with a Graph

I’m looking for clues.

You probably were taught how to graph data in high school. Depending on your work, you may frequently plot data yourself or look at graphs prepared by others. Even if you don’t use graphs on your job, you may run into them during your leisure time, reading the newspaper, managing your finances, or playing Dungeons and Dragons. But there’s a big difference between looking at someone else’s graph and preparing one yourself. When you were learning how to graph in school, the teachers told you what kind of graph to use. They gave you carefully selected data that was matched to the graph you were supposed to create. There was help available if you had any questions. Now, it’s just you and your computer. So if you have no clue as to where to begin, here are a few tips that may help.

First let’s get past the jargon of plots, charts, graphs, and diagrams. All of these terms are defined as visual representations of data. All are used synonymously. All are used as both nouns and verbs. All have other meanings. To split hairs:

  • Plots tend to place more emphasis on individual data points.
  • Charts tend to involve lines and areas more than individual points.
  • Graphs tend to be more mathematically complex than charts and plots.
  • Diagrams tend to be more artistic and fill the entire data space.

Not everyone would agree with this, of course. That said, you can usually refer to a visual representation of data by any of the four terms without being called out by a smart-aleck critic. When you’re referring to a specific kind of visual representation of data, one of the four terms is usually preferred, as in bar charts, scatter plots, and block diagrams. Most specific kinds are called plots or charts and, to a much lesser extent, diagrams. The term graph is used mostly in a general sense, which is how it is used in this blog.

A Graph a Minute

The first thing you’ll need to do is figure out what kinds of graphs you could draw. Start by answering these questions:

  • Is your focus on variables or samples? Do you want to show how a number of samples are related to each other on the basis of one or more variables or do you want to show how a number of variables are related to each other for a very small number of samples?
  • Will you plot individual points or group means? How many data points do you have to plot? Do you want to show the points individually or do you want to show the averages of groups of data points (this is useful when you have a large number of data points)?
  • What is the aim of the graph? There are many reasons to plot data and most graphs have multiple goals. For simplicity, decide whether the primary aim is to show:
    • Data frequency and distribution
    • Relative proportions of the components of a mixture
    • Properties or values of data points
    • Trends, patterns, or other relationships among variables.
  • How many axes will you need? How many variables do you have? Are they measured on the same or different scales? Are the scales discrete or continuous?
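To make the second question concrete, here is a minimal Python sketch of collapsing individual data points into group means before plotting. The sample data, the group labels, and the function name `group_means` are made up for illustration; substitute your own groups and measurements.

```python
from collections import defaultdict
from statistics import mean

def group_means(points):
    """Collapse (group, value) pairs into one mean per group."""
    grouped = defaultdict(list)
    for group, value in points:
        grouped[group].append(value)
    # One summary value per group instead of one marker per data point.
    return {group: mean(values) for group, values in grouped.items()}

# Hypothetical measurements: (group label, measured value).
points = [("A", 2.0), ("A", 4.0), ("B", 10.0), ("B", 14.0), ("B", 12.0)]
means = group_means(points)
print(means)
```

With only five points you would probably plot them all. The averaging step starts to pay off when there are so many points that individual markers would overlap.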

Once you can answer those questions, you can use this table to help you choose some of the more common kinds of graphs to try with your data. There are, of course, a virtually uncountable number of kinds of graphs, subspecies of graphs, variations and extensions of graphs, and combinations of graphs. To start, focus on simple graphs you can get from the software you have available. Later, you can prepare the Piper plots you used to justify your purchase of that specialized piece of software you wanted.

Common Types of Graphs for General Data Analysis.

| Chart | Used to Show | Chart Axes | Data Scale: Horizontal Axis | Data Scale: Vertical Axis | Data Scale: Additional Axes | Availability |
|---|---|---|---|---|---|---|
| Box Plot | Distribution | Rectangular | Categorical, continuous (sample size) | Continuous | | Specialized software |
| Dot Plot | Distribution | Rectangular | Ordinal, continuous | Ordinal | | Specialized software |
| Histogram | Distribution | Rectangular | Ordinal, continuous | Ordinal | | Spreadsheet software |
| Probability Plot | Distribution | Rectangular | Ordinal, continuous | Continuous | | Specialized software |
| Q-Q Plot | Distribution | Rectangular | Ordinal | Ordinal | | Specialized software |
| Stem-Leaf Diagram | Distribution | Rectangular | Ordinal | Ordinal, continuous | | Specialized software |
| Ternary Plot | Mixtures | Triangular | Continuous (percentages) | Continuous (percentages) | Continuous (percentages) | Specialized software |
| Pie Chart | Mixtures | Circular | Categorical | Continuous (percentages) | | Spreadsheet software |
| Area Chart | Properties | Rectangular | Ordinal, continuous | Continuous | | Spreadsheet software |
| Bar Chart | Properties | Rectangular | Categorical | Continuous | | Spreadsheet software |
| Candlestick Chart | Properties | Rectangular | Continuous | Continuous | | Develop from scatter plot |
| Control Chart | Properties | Rectangular | Continuous | Continuous | | Specialized software |
| Deviation Plot | Properties | Rectangular | Continuous | Continuous | | Develop from scatter plot |
| Line Chart | Properties | Rectangular | Categorical, ordinal | Continuous | | Spreadsheet software |
| Map | Properties | Rectangular | Continuous | Continuous | Any | Specialized software |
| Matrix Plot | Properties | Rectangular | Nominal | Nominal | Text | Develop from table |
| Means Plot | Properties | Rectangular | Continuous | Continuous | | Develop from scatter plot |
| Spread Plot | Properties | Rectangular | Continuous | Continuous | | Develop from scatter plot |
| Block Diagram | Properties | Cubic | Nominal | Nominal | Nominal | Specialized software |
| Rose Diagram | Properties | Circular | Ordinal, continuous | Continuous | | Specialized software |
| Multivariable Plot | Relationships | Rectangular, circular, other | Any | Continuous | Continuous | Specialized software |
| Bubble Plot | Relationships | Rectangular | Continuous | Continuous | Continuous | Spreadsheet software |
| Contour Plot | Relationships | Rectangular | Continuous | Continuous | Continuous | Specialized software |
| Icon Plot | Relationships | Rectangular | Continuous | Continuous | Multivariable plot* | Specialized software |
| Scatter Plot: 2D | Relationships | Rectangular | Continuous | Continuous | | Spreadsheet software |
| Scatter Plot: 3D | Relationships | Cubic | Continuous | Continuous | Continuous | Specialized software |
| Surface Plot | Relationships | Cubic | Continuous | Continuous | Continuous | Specialized software |

* (e.g., Radar Plot, Sun Chart, Star Plot, Side-by-side bar charts, Polygon Plot, Sparklines, Chernoff faces)
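If you like, you can treat the table as a small lookup. The Python sketch below encodes a handful of rows from the table and filters them by the aim of the graph and, optionally, by availability. The names `CHARTS` and `suggest_charts` are hypothetical, and only a few rows are included to keep it short.

```python
# A few rows from the table: (chart, used to show, availability).
CHARTS = [
    ("Box Plot", "Distribution", "Specialized software"),
    ("Histogram", "Distribution", "Spreadsheet software"),
    ("Pie Chart", "Mixtures", "Spreadsheet software"),
    ("Ternary Plot", "Mixtures", "Specialized software"),
    ("Bar Chart", "Properties", "Spreadsheet software"),
    ("Line Chart", "Properties", "Spreadsheet software"),
    ("Scatter Plot: 2D", "Relationships", "Spreadsheet software"),
    ("Contour Plot", "Relationships", "Specialized software"),
]

def suggest_charts(aim, availability=None):
    """Return chart names whose aim matches, optionally limited by availability."""
    return [
        name
        for name, used_to_show, available in CHARTS
        if used_to_show == aim
        and (availability is None or available == availability)
    ]

# Every listed chart that shows a distribution.
print(suggest_charts("Distribution"))
# Only the property charts a spreadsheet can draw.
print(suggest_charts("Properties", "Spreadsheet software"))
```

A lookup like this only narrows the field. You still have to check that your data scales match the axes the chart expects.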

 

You Can’t Spell Chart without Art

There are competing philosophies of graphing, divided to some extent by perceptions about the audience for a graph. The philosophy of many art directors of newspapers and magazines is to keep the graph simple, interesting, and attractive in order to engage the reader. Look no further than USA Today, Newsweek, or Time to see three-dimensional exploded pie charts and bar charts made of little soldier icons or dollar bills or some other cutesy graphic. In contrast, Edward Tufte, perhaps the preeminent expert in informational graphics, espouses a philosophy that assumes the audience is knowledgeable and interested. Graphs should provide as much information as needed as efficiently as possible. Tufte makes many good points in his books, The Visual Display of Quantitative Information (1983, 2001), Envisioning Information (1990), Visual Explanations (1997), and Beautiful Evidence (2006), including:

  • The dimension of a chart must not be greater than the dimension of the data. For example, if you’re plotting two variables on a Cartesian (rectangular) graph, don’t add an extra axis (dimension) for depth. It may be visually appealing but it’s scientifically misleading.
  • Data must be presented in context. You shouldn’t show just part of a data set.
  • Label everything you need to make sure the data are presented accurately and meaningfully.
  • Maximize the data density and the data-ink ratio. Put enough data in your graph to make it worthwhile. Eliminate everything on the chart that isn’t data or doesn’t contribute to the interpretation of the data.
  • Eliminate chart junk, the unnecessary pictures, dimensionality, grid lines, fill patterns, and other objects that clutter a graph while adding no scientific value.

Tufte believes he has the audience’s attention, while the art directors believe they have to compete for it. Then there are authors like David McCandless (www.informationisbeautiful.net) who look at presenting data from an artistic perspective. Their graphics are truly works of art, though the graphs are based on data and aimed at engaged audiences. All of these graph developers make valid points. They simply have different perspectives, different audiences, different aims, and different data.

Read more about using statistics at the Stats with Cats blog. Join other fans at the Stats with Cats Facebook group and the Stats with Cats Facebook page. Order Stats with Cats: The Domesticated Guide to Statistics, Models, Graphs, and Other Breeds of Data Analysis at Wheatmark, amazon.com, barnesandnoble.com, or other online booksellers.

About statswithcats

Charlie Kufs has been crunching numbers for over thirty years. He retired in 2019 and is currently working on Stats with Kittens, the prequel to Stats with Cats.