Explore a dataset of your own choice, ask a question about the data, and create a set of visualizations responding to that question.
Start by finding datasets about places. You might look at demographics of neighborhoods within DC, a season’s box scores of high school baseball teams in your home county, or graduation rates for women’s colleges around the Northeast.
Ideally, this will be the first part of a two-part project, leading into the next project on mapping. In this first project, Neighborhoods, you’ll be making non-spatial visualizations like line graphs, bar graphs, and proportional area charts.
In the next project, Mapping, you’ll make a thematic map (of quantitative data) or reference map (showing where places and landscape features are) that builds on the same topic. The final product will be a map, accompanied by a series of visualizations, under the same title.
If you truly want, you can make your later map about some other subject, but aim to tie the two together.
A few examples of pieces that mix non-spatial visualization types with maps (mostly thematic/data maps, in these examples):
- Mix of visualization and mapping in FiveThirtyEight’s What Went Wrong in Flint
- Bar graphs, locator maps, and dot density maps in Amazon Doesn’t Consider the Race of Its Customers. Should It? at Bloomberg
- Slope graphs and choropleth maps in Away From Cities, Into Suburbs
- See also data placed on a simplified map of London in London Squared by After the Flood
- Trim size: open
- Color or grayscale
- The reader is a generally curious person with at least a high school education, holding your piece at a normal reading distance for a book or a magazine.
- Include your name, a title, and a concise introduction.
Provide notes about your sources, enough that we know where you got the data from. “Data from 2012–2017 American Community Survey estimates” would be a good answer. For your own purposes, save the exact web addresses so you can go back.
Phases of work
1. Explore datasets and ask a question
Bring your own datasets. You might return to one of the datasets we used in previous assignments, find something yourself, or try one of these sources:
- Census data, particularly decennial and ACS data. A good place to start exploring is the American FactFinder site.
- The American Time Use Survey from the U.S. Department of Labor, Bureau of Labor Statistics, offers an astonishing, granular look at how Americans spend their time.
- Want to find out a university’s budget, graduation rates, or staff salaries? Or more? Look at the Department of Education’s IPEDS data.
- Less statistically complicated: perhaps there’s something to say about subway system opening/closing hours (though you’ll have to add more detail).
- You might find something in a rundown of studies about ridehailing services (follow the links until you get to the source articles).
- Perhaps go read the paper; dig up the original studies if need be.
Start exploring the data, producing rough visualizations to help you understand what’s going on, and arrive at a question to explore. Then examine that question in more depth.
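If you are comfortable with a little scripting, a rough look at the data does not need charting software at all. The sketch below is one possible approach in plain Python, not a required method: it prints a quick text bar chart so you can see which places stand out. The function name `text_bars`, the column names, and the three rows of data are all made-up placeholders.

```python
# Rough exploration sketch: print a quick text bar chart for one column.
# The place names and values below are invented placeholders, not real data.

def text_bars(rows, label_key, value_key, width=40):
    """Return one line per row: label, a bar scaled to the largest value, the value."""
    largest = max(row[value_key] for row in rows)
    lines = []
    # Sort from largest to smallest so the ranking is visible at a glance.
    for row in sorted(rows, key=lambda r: r[value_key], reverse=True):
        bar_length = round(width * row[value_key] / largest) if largest else 0
        lines.append(f"{row[label_key]:<15} {'#' * bar_length} {row[value_key]}")
    return lines

sample_rows = [
    {"neighborhood": "Area A", "median_age": 31.5},
    {"neighborhood": "Area B", "median_age": 42.0},
    {"neighborhood": "Area C", "median_age": 36.2},
]

for line in text_bars(sample_rows, "neighborhood", "median_age"):
    print(line)
```

A spreadsheet chart does the same job; the point is only to get a fast, ugly picture that suggests a question worth asking.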
Help with data wrangling
I’m happy to sit with you to help you find datasets that will help you answer your question, so far as I can. (For non-spatial data, I’m best equipped to talk about the ACS and current Census data. I also might point you toward a subject matter librarian.) I may also be able to convert data whose organization or file format makes analysis difficult. Email me and schedule an appointment, or come by office hours.
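As an illustration of what "organization" can mean here: many tables arrive with one column per year, while most charting tools prefer one row per observation. The following is a minimal sketch of that conversion using only Python’s standard library. The function name `wide_to_long`, the column names, and the numbers in `WIDE_CSV` are hypothetical, chosen only for the example.

```python
import csv
import io

# Hypothetical wide-format table: one row per place, one column per year.
WIDE_CSV = """place,2015,2016,2017
Area A,120,130,125
Area B,80,95,110
"""

def wide_to_long(csv_text, id_column, variable_name="year", value_name="value"):
    """Turn one-column-per-year rows into one row per (place, year) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    long_rows = []
    for row in reader:
        for column, cell in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row instead
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,
                value_name: float(cell),
            })
    return long_rows

long_rows = wide_to_long(WIDE_CSV, "place")
for row in long_rows:
    print(row)
```

Spreadsheet programs can do the same reshaping by hand or with their own tools; use whichever route you trust.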
2. Design studies
Start experimenting with ways of visually presenting your data. Think of this as sketching—try different graphic forms, scales, and methods of encoding. At some stage, you’ll also need to consider how to fit this onto the page.
For this project, you need to sketch out several different approaches, with a small but representative amount of real data, but you do not need to develop multiple versions to the final stage.
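One way to get a "small but representative" subset, if your data has natural groups such as wards or counties, is to take a few rows from each group rather than the first rows of the file. The sketch below assumes that kind of grouping; the function name `representative_sample`, the `ward` field, and the generated rows are placeholders, and other sampling approaches would work just as well.

```python
import random
from collections import defaultdict

def representative_sample(rows, group_key, per_group=2, seed=0):
    """Pick up to `per_group` rows from each group so every group shows up in the sketch."""
    rng = random.Random(seed)  # fixed seed so the same sample comes back on every run
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample

# Placeholder data: 12 rows spread evenly across three invented wards.
rows = [{"ward": f"Ward {1 + i % 3}", "value": i} for i in range(12)]
subset = representative_sample(rows, "ward", per_group=2)
print(len(subset), "rows from", sorted({r["ward"] for r in subset}))
```

Keeping the seed fixed means each design study is drawn from the same rows, which makes the sketches easier to compare.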
3. Final design
Polish your rough design. You might want to produce new base graphics from a spreadsheet/stats program. Then move on to a more visually expressive and typographically oriented environment to prepare the final graphics: perhaps Illustrator, another drawing program, or (yes) working by hand.
- “Dataset” doesn’t mean “gigantic dataset”! There’s no minimum or maximum, just start and end dates for the project. You can choose a tiny or simple dataset and invest your time in exploring design.
- You will almost certainly need to show several visualizations on one page. A single, monster visualization will sink under its own weight. A lone, featherweight visualization will underwhelm. Give us substance.
- Truly, the labeling defaults on stats/spreadsheet software are terrible. Do not label everything; we won’t be able to see all the text and no one will care. But label the values that matter.
Tools and methods
Use what you like. For manipulating or analyzing data, and for preparing rough visualizations, a spreadsheet or Workbench would serve well. You might want to create your base visualizations using RawGraphs or Flourish—but edit and improve them in Illustrator, another drawing program, or by hand.
For background, return to chapters 4 and 5 from The Truthful Art.
This project leans on good typography as much as on the graphic side of information graphics. If you want type tips, look to the “Letter” and “Text” sections of Ellen Lupton’s Thinking with Type site, or check out the book.
Choose legible, well-drawn fonts.
Free/open source suggestions: Source Sans Pro, IBM Plex, Barlow, Cormorant Garamond, Libre Caslon, and Libre Baskerville. (More libre fonts.)
Commercial suggestions: Adobe Minion, Adobe Garamond, Caslon, Frutiger, Trade Gothic, Franklin Gothic, Myriad, Meta, DIN, Helvetica, Jenson, Archer, Gotham, and Whitney.