Introduction to the chorrrds package

Music chords analysis!

Bruna Wundervald (Maynooth University), Julio Trecenti (Curso-R)

Note: This post was updated on 17/03/2020 due to some changes in the new version of the package.


chorrrds is a package to retrieve and analyse music chords data. It scrapes the Cifraclub website to download and organize music chords.

The main reason to create chorrrds was my undergrad thesis. In my work, I did an end-to-end analysis, exploring feature engineering techniques to describe and predict musical genres from music chord representation.

chorrrds can be considered a package for MIR (Music Information Retrieval). MIR is a broad area of computational music which extracts and processes music data, from unstructured forms, such as sound waves, to structured ones, such as sheet music or chords.

In this post we’ll describe chorrrds functions and show some examples. Stay tuned!


You can install chorrrds from your favourite CRAN mirror by simply running:

install.packages("chorrrds")

You can also install the latest version of chorrrds from the R-Music GitHub organization with:

# install.packages("devtools")
devtools::install_github("r-music/chorrrds")


The main function of the package is called get_chords(). It extracts music chords from a specific artist. There are two steps to obtain the data:

  1. Extraction of the song URLs of an artist with get_songs().
  2. Extraction of the music chords from those URLs with get_chords().


library(magrittr)  # Provides the %>% pipe used below

# Step 1: getting the URLs of some Janis Joplin songs
songs <- "janis-joplin" %>% 
  chorrrds::get_songs() %>% 
  dplyr::sample_n(5)        # Just selecting a random sample of 5 songs 

# Step 2: getting the chords for the selected songs
chords <- songs %>% 
  dplyr::pull(url) %>%                     
  purrr::map(chorrrds::get_chords) %>%     # Mapping the function over the 
                                           # selected urls
  purrr::map_dfr(dplyr::mutate_if, is.factor, as.character)   %>% 
  chorrrds::clean(message = FALSE)         # Cleans the dataset, in case stray
                                           # elements, such as song lyrics,
                                           # were scraped along with the chords

chords %>% dplyr::slice(1:10) 
chord  key  song     artist
C#m    E    Miseryn  Janis Joplin
G#m    E    Miseryn  Janis Joplin
B      E    Miseryn  Janis Joplin
C#m    E    Miseryn  Janis Joplin
G#m    E    Miseryn  Janis Joplin
A      E    Miseryn  Janis Joplin
E      E    Miseryn  Janis Joplin
G#7    E    Miseryn  Janis Joplin
C#m    E    Miseryn  Janis Joplin
F#     E    Miseryn  Janis Joplin

The table above shows what the results of the get_chords() function look like. As you can see, the data is in a long format: the chords appear in the order in which they occur in each song, so the same chord can be repeated many times.


Several datasets come built in with the package. They were used in my undergrad thesis and were kept in the package as example data. They contain the music chords of several Brazilian artists. You can list the available datasets from within R.
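One way to list the datasets bundled with a package is base R's data() function. The sketch below is an assumption about how this could be done, not necessarily the code originally shown in this post; the helper name list_package_datasets() is hypothetical.

```r
# Hypothetical helper: returns the names of the datasets shipped with an
# installed package, using utils::data().
list_package_datasets <- function(pkg) {
  found <- utils::data(package = pkg)$results
  unname(found[, "Item"])
}

# Demonstrated with the built-in "datasets" package so that it runs anywhere;
# with chorrrds installed, the call would be list_package_datasets("chorrrds").
head(list_package_datasets("datasets"))
```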


Use case

Returning to the data we collected before, let’s explore it!

The first thing we can look at is the most common chords in each song. Which are the most common chords in the music of Janis Joplin? Are the proportions of these chords similar across songs?

chords %>% 
  dplyr::group_by(song) %>% 
  dplyr::count(chord) %>%
  dplyr::top_n(3, wt = n) %>%                        # Top 3 chords per song (ties are kept)
  dplyr::mutate(prop = scales::percent(n/sum(n)))    # Share among those top chords
song                 chord  n   prop
A Woman Left Lonely  Bb     11  34%
A Woman Left Lonely  D      10  31%
A Woman Left Lonely  F      11  34%
Blind Man            B      8   28%
Blind Man            C      7   24%
Blind Man            D      7   24%
Blind Man            Em     7   24%
Bye Bye Baby         A      4   10%
Bye Bye Baby         C      11  28%
Bye Bye Baby         D      4   10%
Bye Bye Baby         E      4   10%
Bye Bye Baby         G      16  41%
Miseryn              B      5   33.3%
Miseryn              C#m    6   40.0%
Miseryn              G#m    4   26.7%
One Good Man         A      6   33%
One Good Man         B      3   17%
One Good Man         E      9   50%

With the dataset analyzed here, we can already obtain some interesting information. For instance, the proportions of the most common chords vary considerably from song to song, with no clear rule. This suggests that the structure of her songs might not follow a fixed pattern, which can be a sign of how creative the artist was.
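Note that the proportions in the table are computed among the top chords of each song, not among all of its chords. The base R sketch below illustrates the difference on a made-up chord sequence. The function chord_share() and the toy data are hypothetical and not part of chorrrds, and, unlike top_n(), this sketch does not keep ties.

```r
# Hypothetical helper: counts the k most frequent chords of one song and
# reports their share among the top chords and among all chords of the song.
chord_share <- function(chords, k = 3) {
  counts <- table(chords)
  top <- sort(counts, decreasing = TRUE)[seq_len(min(k, length(counts)))]
  data.frame(
    chord = names(top),
    n = as.integer(top),
    share_of_top = as.numeric(top) / sum(top),
    share_of_song = as.numeric(top) / length(chords),
    stringsAsFactors = FALSE
  )
}

# Toy chord sequence (made up for illustration)
example <- c("E", "E", "E", "A", "A", "B", "C#m", "E")
res <- chord_share(example)
res
```

Here E accounts for half of all chords in the toy song, but for a larger share once only the top three chords are considered.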

We can also look at something called “chord bigrams”. This is the task of creating pairs of chords that occur in sequence, by song, and analyzing their frequencies. It is done with the chords_ngram() function:

chords %>%
  split(.$song) %>% 
  purrr::map(chorrrds::chords_ngram, n = 2) %>%   # Builds the bigrams song by song
  dplyr::bind_rows() %>% 
  dplyr::group_by(song) %>% 
  dplyr::count(chords_ngram) %>% 
  dplyr::top_n(2, wt = n)                         # Top 2 bigrams per song (ties are kept)
song                 chords_ngram  n
A Woman Left Lonely  Bb F          6
A Woman Left Lonely  G C           6
Blind Man            C B           7
Blind Man            D C           7
Blind Man            Em D          7
Bye Bye Baby         C G           10
Bye Bye Baby         G C           10
Miseryn              B C#m         3
Miseryn              C#m G#m       4
One Good Man         A E           6
One Good Man         B A           3
One Good Man         E A           3
One Good Man         E B           3

Some bigrams occur many times in a song, while others occur only a few times and are still among the most frequent ones. In the song called “A Woman Left Lonely”, for example, the two most frequent chord sequences (Bb -> F and G -> C) each occur 6 times, which is harmonically interesting to observe.
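To make the idea of a chord bigram concrete, here is a minimal base R sketch that pairs each chord with the one that follows it and counts the pairs. It only illustrates the concept and is not how chords_ngram() is implemented; the function chord_bigrams() and the toy sequence are hypothetical.

```r
# Hypothetical helper: builds bigrams by pasting each chord with its successor.
chord_bigrams <- function(chords) {
  if (length(chords) < 2) return(character(0))
  paste(head(chords, -1), tail(chords, -1))
}

# Toy chord sequence (made up for illustration)
example <- c("C#m", "G#m", "B", "C#m", "G#m", "A")
bigrams <- chord_bigrams(example)

# Frequency of each bigram, most frequent first
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
bigram_counts
```

A sequence of six chords yields five bigrams, and the pair "C#m G#m" is the only one that occurs twice in this toy example.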

We have now explored the data a little. We can make it even more interesting by building a chord diagram. The word “chord” here does not refer to the musical term, but to a graphical element that shows the strength of a connection. The connections will be the chord transitions observed in our selected songs, and their strengths will be the number of times each transition happened. With this configuration, the chord diagram makes the relationships between chords explicit.

# devtools::install_github("mattflor/chorddiag")

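comp <- chords %>% 
  # Pair each chord with the one that follows it. lead() is not grouped by
  # song here, so the last chord of a song is paired with the first chord
  # of the next one.
  dplyr::mutate(seq = dplyr::lead(chord)) %>% 
  dplyr::filter(chord != seq) %>%          # Drops repeated chords and the final NA
  dplyr::group_by(chord, seq) %>%  
  dplyr::summarise(n = dplyr::n(), .groups = "drop")   # Counts each transition

# One row per following chord (seq), one column per starting chord
mat <- tidyr::spread(comp, key = chord, value = n, fill = 0)  
mm <- as.matrix(mat[, -1])                 # Drops the seq column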

# Building the chords diagram
chorddiag::chorddiag(mm, showTicks = FALSE,
                     palette = "Reds")

Now we can clearly see how the transitions behave in the songs we’re using. There are strong connections between the chords A and E, and between A and B, while the other ones are, in general, fragmented.
A cool thing to notice is that the diagram is interactive, so we can see the strength of each transition by hovering over it with the mouse cursor!
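The matrix behind such a diagram is a table of transition counts. The base R sketch below builds one on a made-up sequence, using the same chord labels for rows and columns so that the result is always square. The function transition_matrix() and the toy data are hypothetical, and the sketch stands apart from the dplyr pipeline above.

```r
# Hypothetical helper: counts how often each chord is followed by a different
# chord. Rows are the starting chord, columns are the chord that follows.
transition_matrix <- function(chords) {
  from <- head(chords, -1)
  to <- tail(chords, -1)
  keep <- from != to                      # Ignore a chord followed by itself
  lv <- sort(unique(chords))              # Shared labels keep the matrix square
  unclass(table(
    from = factor(from[keep], levels = lv),
    to = factor(to[keep], levels = lv)
  ))
}

# Toy chord sequence (made up for illustration)
example <- c("A", "E", "A", "E", "B", "A", "A", "E")
tm <- transition_matrix(example)
tm
```

In this toy example the transition A -> E is the strongest connection, while the repeated A -> A is left out, mirroring the filter applied before building the diagram.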

Wrap up

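In this blog post, we introduced the chorrrds package, extracted the chords of a sample of Janis Joplin songs, and explored them through their most common chords, their chord bigrams and a chord diagram.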

chorrrds is a new package, with a lot of potential and many possible applications to be explored. We hope this was useful, and that now you’re starting to be as enchanted by music information retrieval as we are!


For attribution, please cite this work as

Wundervald & Trecenti (2018, Aug. 19). R-Music: Introduction to the chorrrds package. Retrieved from

BibTeX citation

  author = {Wundervald, Bruna and Trecenti, Julio},
  title = {R-Music: Introduction to the chorrrds package},
  url = {},
  year = {2018}