Drawing Donation Maps from FEC Data: A Step-by-Step Guide

Overview


The Federal Election Commission (FEC) enforces federal campaign finance law and documents the money that individuals, corporations, PACs and campaigns move during any given election cycle. Its records include data on who is donating money to the candidates and where the campaigns are spending it. These large datasets can provide great insight into where money is coming from, who it is going to, how much is exchanged and much more.

The goal of this walkthrough is to show how to draw maps from a general FEC data set for Presidential elections. The code in this document uses data specific to the state of Florida, but small tweaks can produce interesting maps for other states or even the whole country. You can also apply these guidelines to other FEC data sets to map other races.

The Data


Before we begin with the map itself, we need to get the data. Here, we will download it from the FEC website. The Bulk Data Download page offers large .csv files of campaign contributions and expenditures for each election year. However, if you are interested in a specific race in an election year, you can follow the steps below, which give you smaller and more manageable files.

When you launch the Bulk Data Download page, here is what you see.

In the left-hand panel, there are tabs that lead you to summary tables for each of the categories listed. In the main panel, you can expand the columns to download bulk .csv files as you desire.

To access the data for Presidential elections, select Candidates in the left-hand panel. Scroll through the main panel until you find “Presidential Candidates”, then click on the “Presidential Candidate Map” feature.

You will find a feature that presents the summary statistics for each state on all presidential candidates in either 2016 or 2020. On this screen, you will notice:

  • Year selection
  • List of Candidates
  • A general map shaded by how much each state contributed to the Presidential election in general
  • Options on the right for more specific data exploration and download.

You can click around on the page to explore data by state.

For the purposes of this walkthrough, I will select the 2016 election and look at data for Florida. To do so, I click on 2016 in the upper left and on Florida in the map. A big blue button then appears on the upper right to download raising data for Florida. Clicking it downloads a .zip file that unzips to a .csv with individual contribution data from Florida to all the candidates in the 2016 Presidential election.

Feel free to rename the .csv file to something shorter and more memorable as you please. I would also recommend moving this file somewhere more accessible, such as a Data/ or Working/ folder on your desktop.

Cleaning the Data


Now that we have the data, we need to clean it so that it can be used for a map. For this step, I recommend the tidyverse package in R to clean the data. We load the package as follows:

library(tidyverse)

We also load in the data that we just downloaded from the FEC site:

presfl <- read_csv("~/Desktop/Data/Finance/2016PresFL.csv")
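One thing worth checking right after loading is how the zip code column was parsed. If it is read as a number, zip codes that begin with 0 (which matters if you adapt this walkthrough to states outside Florida) lose their leading digit. The sketch below uses base R's read.csv() on a made-up two-row CSV so that it runs without any packages; the rows are invented for illustration, and only the column names come from the FEC file. With read_csv(), the col_types argument plays the same role as colClasses here.

```r
# Toy CSV standing in for the FEC download (rows are invented for illustration)
csv_text <- paste(
  'cand_nm,contbr_zip,contb_receipt_amt',
  '"Trump, Donald J.",328011234,50',
  '"Clinton, Hillary Rodham",070301234,25',
  sep = "\n"
)

# Default parsing guesses a numeric type, which drops the leading zero
guessed <- read.csv(text = csv_text)
class(guessed$contbr_zip)

# Forcing the column to character keeps every digit
toy <- read.csv(text = csv_text, colClasses = c(contbr_zip = "character"))
toy$contbr_zip
```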

For the map, there is one variable that we are particularly interested in, and that is the zip code. However, if you look at the contbr_zip variable in the data set, you will notice that it is a 9-digit value. We need the 5-digit code for the map.

As such, we break up the values in this variable as follows:

presfl <- presfl %>% separate(contbr_zip, into = c("zip", "pfour"), sep = 5, remove = FALSE)
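To see what this split does, here is a small base R sketch on invented zip values. It is not the tidyverse call above, just the same idea written with substr(): the first five characters become the zip code and whatever follows becomes the +4 extension.

```r
# Invented zip values: two ZIP+4 entries and one plain 5-digit entry
contbr_zip <- c("328011234", "331014321", "32003")

# Characters 1-5 are the zip code; anything after that is the +4 extension
zip   <- substr(contbr_zip, 1, 5)
pfour <- substr(contbr_zip, 6, 9)

data.frame(contbr_zip, zip, pfour)
```

Entries that only contain five digits simply end up with an empty extension in this sketch, so they still map correctly.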

Get the Map


Now, we are ready to set up a map. Here, we will use the tigris package, which provides many shape files, including Florida’s county and zip code boundaries. (If you are interested in other maps that can be drawn using tigris shape files, here is the GitHub link that contains all the information, as well as details on accessing historical shape files for each of the shapes that this package provides.)

library(tigris)

For this example, I will need both the county and zip code maps as I am shading by zip code but will put a county map over the zip code data (as we will see later). As such, I need to load both shape files. This might take some time.

florida_zip <- zctas(cb = TRUE, starts_with = c("32", "33", "34"))
florida_counties <- counties(12, cb = TRUE)

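Next, we need to convert the shapes into data frames that ggplot2 can draw. We will fortify the shapes as follows. Note that this step assumes tigris returned sp-class spatial objects, which is what fortify() with a region argument expects:

library(rgeos)
library(rgdal)
library(taRifx)

# Store each zip code's GEOID as a number to use as the region id
florida_zip$fips <- destring(florida_zip$GEOID10)
# Convert the zip code polygons to a data frame; its id column holds the fips value
fortified_fl_zip <- fortify(florida_zip, region = "fips")

# Same two steps for the county boundaries
florida_counties$fips <- destring(florida_counties$GEOID)
fortified_fl_counties <- fortify(florida_counties, region = "fips")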
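For the maps that follow, the important part of a fortified data frame is its id column, because geom_map() matches each row's map_id against it. As a rough illustration, the base R sketch below builds a tiny stand-in by hand (not real tigris output, and with only some of the columns fortify() produces) and checks which zip codes in an invented contribution table have a matching polygon.

```r
# Hand-built stand-in for a fortified map: one unit square per zip code
toy_map <- data.frame(
  long  = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("32003.1", "32004.1"), each = 4),
  id    = rep(c("32003", "32004"), each = 4),
  stringsAsFactors = FALSE
)

# Invented donor counts; "99999" has no polygon in the toy map
toy_donors <- data.frame(zip = c("32003", "32004", "99999"),
                         donor = c(12, 3, 7),
                         stringsAsFactors = FALSE)

# Rows whose zip has no matching id would not be drawn
toy_donors$has_polygon <- toy_donors$zip %in% unique(toy_map$id)
toy_donors
```

Running the same kind of check against fortified_fl_zip$id is a quick way to find out why a zip code you expected to see is missing from a map.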

Drawing Specific Maps


There are many maps that are possible from this data. For this walkthrough, I will demonstrate how to draw a map depicting the number of donors for a candidate and the amount donated to this candidate in dollars for Florida.

Maps for Number of Donors


Donald Trump

For this map, I will draw the number of donors in Florida who gave to his campaign for the general election. To do this, I will select the “general election” donations from the dataset.

presflgen <- presfl %>% filter(election_tp == "G2016")

I will then select Donald Trump from the recipient variable list:

trump <- presflgen %>% filter(cand_nm == "Trump, Donald J.")
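Because both filters rely on exact string matches, a typo in the election code or the candidate name silently returns zero rows. A quick tabulation before filtering helps confirm the exact spellings in your file. The sketch below does this in base R on invented rows; "P2016" is assumed here as a second election type purely for illustration.

```r
# Invented rows using the two columns filtered in the walkthrough
toy <- data.frame(
  cand_nm     = c("Trump, Donald J.", "Clinton, Hillary Rodham",
                  "Trump, Donald J.", "Clinton, Hillary Rodham"),
  election_tp = c("G2016", "G2016", "P2016", "G2016"),
  stringsAsFactors = FALSE
)

# Count rows for every candidate / election-type combination
table(toy$cand_nm, toy$election_tp)

# The same two filters as in the walkthrough, written in base R
toy_trump_gen <- toy[toy$election_tp == "G2016" &
                       toy$cand_nm == "Trump, Donald J.", ]
nrow(toy_trump_gen)
```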

To map the number of donors by zip code, we count the rows in the data for each zip code.

trumpdonor <- trump %>% group_by(zip) %>% summarise(donor = n())
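Note that n() counts rows. If the file holds one row per contribution, which is worth verifying in your own download, a person who gave twice is counted twice. The base R sketch below contrasts the two counts on invented data; the contributor-name column contbr_nm is an assumption, so check your file's header before relying on it. In dplyr, n_distinct() would be the analogous function.

```r
# Invented contributions: Jane Doe gave twice from the same zip code
toy <- data.frame(
  zip       = c("32003", "32003", "32003", "33101"),
  contbr_nm = c("DOE, JANE", "DOE, JANE", "ROE, RICHARD", "POE, ALEX"),
  stringsAsFactors = FALSE
)

# Rows per zip code: this is what n() counts
contributions <- tapply(toy$contbr_nm, toy$zip, length)

# Distinct contributor names per zip code
donors <- tapply(toy$contbr_nm, toy$zip, function(x) length(unique(x)))

data.frame(zip = names(donors),
           contributions = as.vector(contributions),
           donors = as.vector(donors))
```

Either count can make a sensible map; the point is to label the legend according to the one you actually computed.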

To be sure that all the donations actually come from Florida zip codes, we run the following:

trumpdonor <- trumpdonor %>% filter(zip %in% 32002:34998)
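The filter above compares a character column with a range of integers. That works because %in% converts both sides to a common type before matching. The base R sketch below shows this on invented values, next to an explicit numeric comparison that gives the same answer and also handles values that are not valid numbers.

```r
# Invented values: three Florida-range zips, one New York zip, one truncated entry
zips <- c("32003", "33101", "34997", "10001", "3280")

# %in% compares the values after converting both sides to character
keep_in <- zips %in% 32002:34998

# An explicit numeric comparison gives the same answer on these values
zip_num  <- as.integer(zips)
keep_cmp <- !is.na(zip_num) & zip_num >= 32002 & zip_num <= 34998

data.frame(zips, keep_in, keep_cmp)
```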

Now, we are ready to map! Here, I cap the color scale at 200 donors per zip code so that differences among zip codes with fewer donors remain visible; zip codes above the cap fall outside the scale limits and are drawn in the darker na.value color. In this map, I use zip code data but overlay county boundaries for reference.

ggplot() +
  geom_map(
    data = trumpdonor, 
    aes(map_id = zip, fill = donor), 
    map = fortified_fl_zip
  ) + 
  expand_limits(
    x = fortified_fl_zip$long, 
    y = fortified_fl_zip$lat
  ) +
  geom_polygon(
    data = fortified_fl_counties, 
    aes(x = long, y = lat, group = group), 
    color = "grey50", 
    fill = NA, 
    size = .25
  ) +
  coord_map(
    "lambert", 
    parameters = c(30,40)
  ) + 
  scale_fill_gradient(low = "#fcbba1", high = "#cb181d",
                      na.value = "#99000d", limits = c(0, 200))+
  theme_void() +
  labs(fill = "Number of\nDonors",
       title = "Distribution of Donors to Donald Trump by Zip Code"
  )+
  theme(title = element_text(size = 14, colour="black"),
        plot.title = element_text(hjust = 0.5))
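The cap in this map comes from the limits argument: counts above 200 fall outside the scale, and with ggplot2's default handling of out-of-range values on continuous scales they are drawn with na.value. An alternative, sketched below in base R on invented counts, is to clamp the values before plotting so that they share the top color of the gradient. The column names donor_capped and over_cap are made up for this example.

```r
# Invented donor counts; the last zip code is far above the cap
toy_donor <- data.frame(zip = c("32003", "33101", "34102"),
                        donor = c(15, 180, 950),
                        stringsAsFactors = FALSE)

cap <- 200

# Clamp counts at the cap so large values share the top of the color scale
toy_donor$donor_capped <- pmin(toy_donor$donor, cap)

# Keep track of which zip codes were affected by the cap
toy_donor$over_cap <- toy_donor$donor > cap

toy_donor
```

Mapping fill to donor_capped instead of donor would then keep every zip code inside the legend's range, at the cost of hiding how far above the cap the largest ones are.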

Hillary Clinton


We can follow a similar method to draw the same map for Hillary Clinton.

#Select candidate
clinton <- presflgen %>% filter(cand_nm == "Clinton, Hillary Rodham")

#Calculate number of donors by zip code
clintondonor <- clinton %>% group_by(zip) %>% summarise(donor = n())

#Make Sure Zip codes are for Florida
clintondonor <- clintondonor %>% filter(zip %in% 32002:34998)

#Generate map
#Cap at 200 donors
ggplot() +
  geom_map(
    data = clintondonor, 
    aes(map_id = zip, fill = donor), 
    map = fortified_fl_zip
  ) + 
  expand_limits(
    x = fortified_fl_zip$long, 
    y = fortified_fl_zip$lat
  ) +
  geom_polygon(
    data = fortified_fl_counties, 
    aes(x = long, y = lat, group = group), 
    color = "grey50", 
    fill = NA, 
    size = .25
  ) +
  coord_map(
    "lambert", 
    parameters = c(30,40)
  ) + 
  scale_fill_gradient(low = "#c6dbef", high = "#2171b5",
                      na.value = "#084594", limits = c(0, 200))+
  theme_void() +
  labs(fill = "Number of\nDonors",
       title = "Distribution of Donors to Hillary Clinton by Zip Code"
  )+
  theme(title = element_text(size = 14, colour="black"),
        plot.title = element_text(hjust = 0.5))

Maps for Amount of Dollars Donated


Hillary Clinton

In a similar fashion, we can generate a map that shows how much money was donated to the Clinton campaign and where in Florida it came from.

First, I select Hillary Clinton from the list.

clinton <- presflgen %>% filter(cand_nm == "Clinton, Hillary Rodham")

Then, we can calculate the total donated to Clinton grouped by zip code:

clintondonation <- clinton %>% group_by(zip) %>% summarise(total = sum(contb_receipt_amt))

Before we map, we need to make sure, again, that all the zip codes are actually in the range for Florida:

clintondonation <- clintondonation %>% filter(zip %in% 32002:34998)
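Since refunds appear as negative amounts in contb_receipt_amt, the sum for a zip code is a net total and can come out at or below zero. A negative total sits below the lower limit of a 0 to 10,000 color scale, so it would be colored like a value above the cap. The base R sketch below, on invented amounts, computes net totals and flags the negative ones so you can decide how to treat them before mapping.

```r
# Invented contributions and refunds (negative amounts) for three zip codes
toy <- data.frame(
  zip = c("32003", "32003", "33101", "33101", "34102"),
  contb_receipt_amt = c(250, -250, 1000, -100, -50),
  stringsAsFactors = FALSE
)

# Net total per zip code (refunds subtract from contributions)
totals <- aggregate(contb_receipt_amt ~ zip, data = toy, FUN = sum)
names(totals)[2] <- "total"

# Flag zip codes whose net total is negative
totals$negative <- totals$total < 0

totals
```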

Finally, we map! Here, I cap amounts at $10,000 so that we can better observe differences between the places that have smaller donation amounts.

ggplot() +
  geom_map(
    data = clintondonation, 
    aes(map_id = zip, fill = total), 
    map = fortified_fl_zip
  ) + 
  expand_limits(
    x = fortified_fl_zip$long, 
    y = fortified_fl_zip$lat
  ) +
  geom_polygon(
    data = fortified_fl_counties, 
    aes(x = long, y = lat, group = group), 
    color = "grey50", 
    fill = NA, 
    size = .25
  ) +
  coord_map(
    "lambert", 
    parameters = c(30,40)
  ) + 
  scale_fill_gradient(low = "#c6dbef", high = "#2171b5",
                      na.value = "#084594", limits = c(0, 10000))+
  theme_void() +
  labs(fill = "Dollars",
       title = "Distribution of Dollars to Hillary Clinton by Zip Code"
  )+
  theme(title = element_text(size = 14, colour="black"),
        plot.title = element_text(hjust = 0.5))

Donald Trump

We can follow a similar method to draw the same map for Donald Trump.

#Select candidate
trump <- presflgen %>% filter(cand_nm == "Trump, Donald J.")

#Donations to Trump -- is Negative for Refunds
trumpdonation <- trump %>% group_by(zip) %>% summarise(total = sum(contb_receipt_amt))
trumpdonation <- trumpdonation %>% filter(zip %in% 32002:34998)

#Draw Map
#Overlay County
#Cap at $10000
ggplot() +
  geom_map(
    data = trumpdonation, 
    aes(map_id = zip, fill = total), 
    map = fortified_fl_zip
  ) + 
  expand_limits(
    x = fortified_fl_zip$long, 
    y = fortified_fl_zip$lat
  ) +
  geom_polygon(
    data = fortified_fl_counties, 
    aes(x = long, y = lat, group = group), 
    color = "grey50", 
    fill = NA, 
    size = .25
  ) +
  coord_map(
    "lambert", 
    parameters = c(30,40)
  ) + 
  scale_fill_gradient(low = "#fcbba1", high = "#cb181d",
                      na.value = "#99000d", limits = c(0, 10000))+
  theme_void() +
  labs(fill = "Dollars",
       title = "Distribution of Dollars to Donald Trump by Zip Code"
  )+
  theme(title = element_text(size = 14, colour="black"),
        plot.title = element_text(hjust = 0.5))

Cautionary Tales and Concluding Thoughts


From this exploration, we can easily draw maps that reflect campaign contributions in Florida from publicly available FEC data. There are additional sources of data that one could explore for campaign finance projects, including OpenSecrets, which has more organized data tables for congressional elections, and the Florida Division of Elections, which contains data for Florida-specific races such as those for Governor, State House and State Senate.

There are many research opportunities within each of these datasets, but I should include a few cautionary tales reflecting some things I learned in the process.

  1. File Sizes and Labels: These data files are large and messy. For the most part, candidates are referred to by their FEC numbers rather than their names. I think the Presidential election file is unique in that it uses the candidate names. The FEC, and OpenSecrets for that matter, include reference tables for the candidates.
  2. OpenSecrets: What is this OpenSecrets you speak of? I know that I did not include steps for accessing this database, but I did explore it during the project, so I am including some notes about it in case you are interested in using it in your projects. In my opinion, OpenSecrets is both an upgrade and a downgrade from the FEC data. Its data are organized in a cleaner manner, and it provides easy-to-use reference tables for the variable names, candidate codes, and election types. However, it offers far less selection than the FEC data. In the FEC database, as we explored earlier, one can choose Florida data for a given election by pointing and clicking. With OpenSecrets, you are given the entire massive file and are expected to clean and filter out what you need. You are also required to register to use the database, but the account is free.
  3. FEC Database: While the FEC database is a more official source, it comes with some downsides of its own. It is harder to find variable labels, and the site layout changes quite frequently. For example, between the beginning of this project and the time that I wrote this document, the method to access Florida contributions data (as described earlier) changed. We can expect it to change again in the near future, which means that it would be wise to store data, especially those from earlier elections, in a Google Drive or other cloud-based storage system for ease of access. You also don’t want your computer to be running these massive downloads every day anyway.
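The advice in point 3 about keeping your own copy of the data can be sketched as a small "download once, reuse afterwards" helper. This is only an illustration in base R: the function name cached_file and its fetch argument are hypothetical, and the stand-in fetch step writes a tiny invented file rather than contacting the FEC site.

```r
# Hypothetical helper: call `fetch(path)` only when `path` does not exist yet
cached_file <- function(path, fetch) {
  if (!file.exists(path)) {
    dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
    fetch(path)
  }
  path
}

# Stand-in for a real download step; it writes a tiny invented CSV
fake_fetch <- function(path) {
  writeLines(c("cand_nm,contbr_zip", '"Trump, Donald J.",328011234'), path)
}

local_copy <- file.path(tempdir(), "fec-cache", "2016PresFL.csv")
cached_file(local_copy, fake_fetch)  # first call creates the file
cached_file(local_copy, fake_fetch)  # second call reuses the existing copy
```

In practice, the fetch step could wrap download.file() or simply copy the file from wherever you archived it, and path would point to a permanent folder rather than a temporary one.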

From the maps that we drew in this walkthrough, we see interesting patterns in where money is coming from across the state of Florida, which candidate it is going to and how much is being circulated.

If you have any questions for me, please feel free to reach out. I am available at jennifer.lin16@ncf.edu but if you are reaching out after Fall 2020, you might be better able to reach me at jenniferlin2025@u.northwestern.edu.

That’s about it! Have fun drawing maps, folks!

Acknowledgements

Thank you to Pimp my RMD and epuRate for the tips on Markdown formatting and stylistic inspirations.

 




A work by Jennifer Lin