unvotes package

Citation: Erik Voeten “Data and Analyses of Voting in the UN General Assembly” Routledge Handbook of International Organization, edited by Bob Reinalda (published May 27, 2013)

The unvotes package contains three datasets, each set up as a data frame which is also called a data table. First is the history of each country’s vote. These are represented in the un_votes dataset, with one row for each country/vote pair:

Note that packages( also called “libraries”) must be installed before they are loaded. Usually, RStudio will notify you to install the packages required in a file. We use the library() function to load the libraries after they have been installed. This code below just “loads” the two packages, dplyr and unvotes.

library(dplyr)
library(unvotes)
#use the name of the data set to print out a "tibble" or the first 10 rows of the data set.
un_votes
## # A tibble: 869,937 × 4
##     rcid country            country_code vote 
##    <dbl> <chr>              <chr>        <fct>
##  1     3 United States      US           yes  
##  2     3 Canada             CA           no   
##  3     3 Cuba               CU           yes  
##  4     3 Haiti              HT           yes  
##  5     3 Dominican Republic DO           yes  
##  6     3 Mexico             MX           yes  
##  7     3 Guatemala          GT           yes  
##  8     3 Honduras           HN           yes  
##  9     3 El Salvador        SV           yes  
## 10     3 Nicaragua          NI           yes  
## # … with 869,927 more rows
## # ℹ Use `print(n = ...)` to see more rows

“rcid” stands for roll call id, the number of the roll call vote starting back when the UN was established in 1946. The other columns are self-explanatory with the variable “vote” being yes if the country voted “yes” on that roll call vote.

The unvotes package also contains a second dataset of information about each roll call vote, including the UN Session number and date, the description of the roll call vote, and relevant resolution that was voted on:

un_roll_calls
## # A tibble: 6,202 × 9
##     rcid session importantvote date       unres   amend  para short        descr
##    <int>   <dbl>         <int> <date>     <chr>   <int> <int> <chr>        <chr>
##  1     3       1             0 1946-01-01 R/1/66      1     0 AMENDMENTS,… "TO …
##  2     4       1             0 1946-01-02 R/1/79      0     0 SECURITY CO… "TO …
##  3     5       1             0 1946-01-04 R/1/98      0     0 VOTING PROC… "TO …
##  4     6       1             0 1946-01-04 R/1/107     0     0 DECLARATION… "TO …
##  5     7       1             0 1946-01-02 R/1/295     1     0 GENERAL ASS… "TO …
##  6     8       1             0 1946-01-05 R/1/297     1     0 ECOSOC POWE… "TO …
##  7     9       1             0 1946-02-05 R/1/329     0     0 POST-WAR RE… "TO …
##  8    10       1             0 1946-02-05 R/1/361     1     1 U.N. MEMBER… "TO …
##  9    11       1             0 1946-02-05 R/1/376     0     0 TRUSTEESHIP… "TO …
## 10    12       1             0 1946-02-06 R/1/394     1     1 COUNCIL MEM… "TO …
## # … with 6,192 more rows
## # ℹ Use `print(n = ...)` to see more rows

From inspecting the tibble for this data set we can see there have been over 6200 roll call votes since the beginning of the UN in 1946.

The 3rd data set contains data on just six selected issues for which the UN has had numerous roll call votes. Note that there are many more issues not included in this data “subset” of UN votes.

un_roll_call_issues
## # A tibble: 5,745 × 3
##     rcid short_name issue               
##    <int> <chr>      <fct>               
##  1    77 me         Palestinian conflict
##  2  9001 me         Palestinian conflict
##  3  9002 me         Palestinian conflict
##  4  9003 me         Palestinian conflict
##  5  9004 me         Palestinian conflict
##  6  9005 me         Palestinian conflict
##  7  9006 me         Palestinian conflict
##  8   128 me         Palestinian conflict
##  9   129 me         Palestinian conflict
## 10   130 me         Palestinian conflict
## # … with 5,735 more rows
## # ℹ Use `print(n = ...)` to see more rows

Before we merge the three data sets, let’s take a look at how many roll call votes there have been on these six issues.

library(dplyr)
count(un_roll_call_issues, issue, sort = TRUE)
## # A tibble: 6 × 2
##   issue                                    n
##   <fct>                                <int>
## 1 Arms control and disarmament          1092
## 2 Palestinian conflict                  1061
## 3 Human rights                          1015
## 4 Colonialism                            957
## 5 Nuclear weapons and nuclear material   855
## 6 Economic development                   765

Explanation of the code chunk used:

We use the count function in the dplyr library to first sort the six issues into groups and then count the number of votes on each group. A function in R has the form of function_name( ). In the parentheses, we first list the name of the dataset un_roll_call_issues, the variable of interest [issue], and indicate we do want to sort [sort = TRUE].

We can perform more analyses if we put the data in the three data sets into one data table. We can do this by merging or joining the three using a common variable, rcid.

library(dplyr)
unvotes1 <- un_votes %>%
  inner_join(un_roll_calls, by = "rcid") %>%
  inner_join(un_roll_call_issues, by = "rcid")
#this prints a tibble of unvotes
unvotes1
## # A tibble: 857,878 × 14
##     rcid country      count…¹ vote  session impor…² date       unres amend  para
##    <dbl> <chr>        <chr>   <fct>   <dbl>   <int> <date>     <chr> <int> <int>
##  1     6 United Stat… US      no          1       0 1946-01-04 R/1/…     0     0
##  2     6 Canada       CA      no          1       0 1946-01-04 R/1/…     0     0
##  3     6 Cuba         CU      yes         1       0 1946-01-04 R/1/…     0     0
##  4     6 Dominican R… DO      abst…       1       0 1946-01-04 R/1/…     0     0
##  5     6 Mexico       MX      yes         1       0 1946-01-04 R/1/…     0     0
##  6     6 Guatemala    GT      no          1       0 1946-01-04 R/1/…     0     0
##  7     6 Honduras     HN      yes         1       0 1946-01-04 R/1/…     0     0
##  8     6 El Salvador  SV      abst…       1       0 1946-01-04 R/1/…     0     0
##  9     6 Nicaragua    NI      yes         1       0 1946-01-04 R/1/…     0     0
## 10     6 Panama       PA      abst…       1       0 1946-01-04 R/1/…     0     0
## # … with 857,868 more rows, 4 more variables: short <chr>, descr <chr>,
## #   short_name <chr>, issue <fct>, and abbreviated variable names
## #   ¹​country_code, ²​importantvote
## # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
#this prints out all the column/variable names in the unvotes1 data set
colnames(unvotes1)
##  [1] "rcid"          "country"       "country_code"  "vote"         
##  [5] "session"       "importantvote" "date"          "unres"        
##  [9] "amend"         "para"          "short"         "descr"        
## [13] "short_name"    "issue"

This code chunk begins by creating a new dataframe named “unvotes1” using the assign characters <- to put the data in the original un_votes data set into “unvotes1.”

The next symbol is call the pipe operator because it “pipes” or transfers the result of the assignment to begin the joining process. You can think of the pipe operator %>% as saying “and then.”

For example, assign the data in un_votes to the new data set unvotes1 “and then” join that data set with the data in un_roll_calls using the common variable “rcid” to match up the rows.

Finally, we have another pipe operator so “and then” use the inner_join function to merge the data in the last data set into unvotes1.

One could then count how often each country votes “yes” on a resolution in each year:

library(lubridate)

by_country_year <- unvotes1 %>%
  group_by(year = year(date), country) %>%
  summarize(votes = n(),
            percent_yes = mean(vote == "yes"))

by_country_year
## # A tibble: 10,452 × 4
## # Groups:   year [73]
##     year country     votes percent_yes
##    <dbl> <chr>       <int>       <dbl>
##  1  1946 Afghanistan    12       0.25 
##  2  1946 Argentina      27       0.778
##  3  1946 Australia      27       0.556
##  4  1946 Belarus        27       0.370
##  5  1946 Belgium        27       0.630
##  6  1946 Bolivia        27       0.741
##  7  1946 Brazil         27       0.704
##  8  1946 Canada         26       0.808
##  9  1946 Chile          27       0.741
## 10  1946 Colombia       25       0.28 
## # … with 10,442 more rows
## # ℹ Use `print(n = ...)` to see more rows

After which this can be visualized for one or more countries:

library(ggplot2)
theme_set(theme_bw())

#This next line creates a new data object "countries" consisting of just the names of the four listed countries.

countries <- c("United States", "United Kingdom", "India", "France")

#Next we begin with the "by_country_year" data set 
# and then
# Filter out all but the four countries in the "countries" data object
# and then begin our plot using the ggplot function
by_country_year %>%
  filter(country %in% countries) %>%
  ggplot(aes(year, percent_yes, color = country)) +
  geom_line() +
  ylab("% of votes that are 'Yes'")

Look at how the voting record of the US has changed on each of the six issues:

unvotes1 %>%
  filter(country == "United States") %>%
  group_by(year = year(date), issue) %>%
  summarize(votes = n(),
            percent_yes = mean(vote == "yes")) %>%
  filter(votes > 5) %>%
  ggplot(aes(year, percent_yes)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  facet_wrap(~ issue)

Return to the debrief[https://bus-data-lit-lab.netlify.app/debrief.html]