Music chords analysis!
Note: This post was updated on 17/03/2020 due to some changes in the new version of the package.
chorrrds is a package to retrieve and analyse music chords data. It scrapes the Cifraclub website to download and organize music chords.
The main reason to create chorrrds was my undergrad thesis. In that work, I did an end-to-end analysis, exploring feature engineering techniques to describe and predict musical genres from music chord representations.
chorrrds can be considered a package for MIR (Music Information Retrieval). MIR is a broad area of computational music that extracts and processes music data, from unstructured forms, such as sound waves, to structured ones, such as sheet music or chords.
In this post we’ll describe the chorrrds functions and show some examples. Stay tuned!
You can install chorrrds from your favourite CRAN mirror by simply running:
install.packages("chorrrds")
You can also install the latest version of chorrrds from the R-Music GitHub organization with:
# install.packages("devtools")
devtools::install_github("r-music/chorrrds")
The main function of the package is called get_chords(). It extracts music chords for a specific artist. There are two steps to obtain the data:
1. List the artist's songs and their urls with get_songs.
2. Extract the chords for those urls with get_chords.
library(tidyverse)
set.seed(20191)
# Step 1: Getting the songs (and their urls) for Janis Joplin
songs <- "janis-joplin" %>%
  chorrrds::get_songs() %>%
  dplyr::sample_n(5) # Just selecting a random sample of 5 songs
# Step 2: Getting the chords for the selected songs
chords <- songs %>%
  dplyr::pull(url) %>%
  purrr::map(chorrrds::get_chords) %>% # Mapping the function over the
                                       # selected urls
  purrr::map_dfr(dplyr::mutate_if, is.factor, as.character) %>%
  chorrrds::clean(message = FALSE) # Cleans the dataset, in case
                                   # strange elements, such as lyrics,
                                   # are present when they shouldn't be
chords %>% slice(1:10)
chord | key | song | artist |
---|---|---|---|
C#m | E | Miseryn | Janis Joplin |
G#m | E | Miseryn | Janis Joplin |
B | E | Miseryn | Janis Joplin |
C#m | E | Miseryn | Janis Joplin |
G#m | E | Miseryn | Janis Joplin |
A | E | Miseryn | Janis Joplin |
E | E | Miseryn | Janis Joplin |
G#7 | E | Miseryn | Janis Joplin |
C#m | E | Miseryn | Janis Joplin |
F# | E | Miseryn | Janis Joplin |
The table above shows what the results of the get_chords function look like. As you can see, the data is in a long format: the chords appear in the order they occur in each song, so the same chord can be repeated.
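Because each row is one chord occurrence, simple per-song summaries follow directly from the long format. The sketch below uses a small made-up data frame (the toy values and the names toy and summary_by_song are assumptions for illustration, not package output) and base R only, so it runs without scraping anything:

```r
# Toy data mimicking two of the columns returned by get_chords()
# (values are made up for illustration)
toy <- data.frame(
  chord = c("C#m", "G#m", "B", "C#m", "G#m", "A", "E", "E", "A"),
  song  = c(rep("Song A", 6), rep("Song B", 3)),
  stringsAsFactors = FALSE
)

# One row per chord occurrence, so counting rows gives the song length
n_chords   <- tapply(toy$chord, toy$song, length)
n_distinct <- tapply(toy$chord, toy$song, function(x) length(unique(x)))

summary_by_song <- data.frame(
  song       = names(n_chords),
  n_chords   = as.integer(n_chords),
  n_distinct = as.integer(n_distinct)
)
print(summary_by_song)
```

The same idea should carry over to the real chords object by replacing toy with it.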
There are also many datasets built into the package. They were used in my undergrad thesis and were kept in the package to work as example data. These data consist of music chords from several Brazilian artists. You can check the available datasets with the code below:
library(chorrrds)
ls("package:chorrrds")
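Note that ls() on a package lists every exported object, functions included. If you only want the datasets, one possible alternative is base R's data() function. The helper name list_datasets below is hypothetical, and the call is guarded so the snippet still runs when chorrrds is not installed:

```r
# List only the datasets shipped with a package
# (list_datasets is a hypothetical helper, not part of chorrrds)
list_datasets <- function(pkg) {
  # data(package = ) returns an object whose `results` matrix
  # has one row per dataset and an "Item" column with its name
  unname(data(package = pkg)$results[, "Item"])
}

# Guarded: only runs if chorrrds is available locally
if (requireNamespace("chorrrds", quietly = TRUE)) {
  print(list_datasets("chorrrds"))
}
```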
Returning to the data we collected before, let’s explore it!
The first thing we can look at is the most common chords in each song. Which chords are the most common in music made by Janis Joplin? Are the proportions of these chords similar across songs?
chords %>%
  dplyr::group_by(song) %>%
  dplyr::count(chord) %>%
  dplyr::top_n(n, n = 3) %>%
  dplyr::mutate(prop = scales::percent(n/sum(n)))
song | chord | n | prop |
---|---|---|---|
A Woman Left Lonely | Bb | 11 | 34% |
A Woman Left Lonely | D | 10 | 31% |
A Woman Left Lonely | F | 11 | 34% |
Blind Man | B | 8 | 28% |
Blind Man | C | 7 | 24% |
Blind Man | D | 7 | 24% |
Blind Man | Em | 7 | 24% |
Bye Bye Baby | A | 4 | 10% |
Bye Bye Baby | C | 11 | 28% |
Bye Bye Baby | D | 4 | 10% |
Bye Bye Baby | E | 4 | 10% |
Bye Bye Baby | G | 16 | 41% |
Miseryn | B | 5 | 33.3% |
Miseryn | C#m | 6 | 40.0% |
Miseryn | G#m | 4 | 26.7% |
One Good Man | A | 6 | 33% |
One Good Man | B | 3 | 17% |
One Good Man | E | 9 | 50% |
With the dataset analyzed here, we can already obtain some interesting information. For instance, the proportions of the most common chords vary considerably from song to song, with no clear rule. This suggests that the structure of her songs might not follow a fixed pattern, which can be a sign of how creative the artist was.
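One detail worth keeping in mind when reading the table above: prop is computed after top_n(), so it is the share among the retained chords only (for “A Woman Left Lonely”, 11 / (11 + 10 + 11) is about 34%). The sketch below, on a made-up chord sequence and with base R only, contrasts that with the share over every chord in the song. All object names here are assumptions for illustration:

```r
# Made-up chord sequence for one hypothetical song
chords_toy <- c("E", "A", "E", "B", "E", "A", "C#m", "E", "F#m", "A")

counts    <- table(chords_toy)
top_names <- names(sort(counts, decreasing = TRUE))[1:2]  # two most frequent
top       <- as.vector(counts[top_names])
names(top) <- top_names

# Share among the retained top chords (what mutate() after top_n() gives)
share_within_top <- top / sum(top)
# Share relative to every chord played in the song
share_of_song <- top / length(chords_toy)

print(round(rbind(share_within_top, share_of_song), 2))
```

With dplyr, the second version would correspond to computing the proportion before filtering the top chords.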
We can also look at something called “chord bigrams”. This is the task of creating pairs of chords that occur in sequence, by song, and analyzing their frequencies. It’s done with the chords_ngram() function:
chords %>%
  split(.$song) %>%
  purrr::map(chorrrds::chords_ngram, n = 2) %>%
  dplyr::bind_rows() %>%
  dplyr::group_by(song) %>%
  dplyr::count(chords_ngram) %>%
  dplyr::top_n(n, n = 2)
song | chords_ngram | n |
---|---|---|
A Woman Left Lonely | Bb F | 6 |
A Woman Left Lonely | G C | 6 |
Blind Man | C B | 7 |
Blind Man | D C | 7 |
Blind Man | Em D | 7 |
Bye Bye Baby | C G | 10 |
Bye Bye Baby | G C | 10 |
Miseryn | B C#m | 3 |
Miseryn | C#m G#m | 4 |
One Good Man | A E | 6 |
One Good Man | B A | 3 |
One Good Man | E A | 3 |
One Good Man | E B | 3 |
Some bigrams occur many times in a song, while others occur only a few times but are still the most frequent ones. In the song “A Woman Left Lonely”, for example, the two most frequent chord sequences (Bb -> F and G -> C) occur equally often, which is harmonically interesting to observe.
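To make the idea of a bigram concrete, here is a minimal base R sketch that pairs each chord with the one that follows it. This is a plain stand-in to illustrate the concept, not the implementation of chords_ngram(); the function make_bigrams and the chord sequence are made up:

```r
# Pair each chord with the next one in the sequence
# (make_bigrams is a hypothetical helper, not part of chorrrds)
make_bigrams <- function(chords) {
  if (length(chords) < 2) return(character(0))
  paste(head(chords, -1), tail(chords, -1))
}

song    <- c("C", "G", "C", "G", "Am", "C", "G")  # made-up sequence
bigrams <- make_bigrams(song)

# Frequency of each bigram, most common first
print(sort(table(bigrams), decreasing = TRUE))
```

A sequence of k chords always yields k - 1 bigrams, which is why short songs can have low counts even for their most frequent pairs.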
Now that we have explored the data a little, we can make it even more interesting by building a chord diagram. The word “chord” here does not refer to the musical chord, but to a graphic element that shows the strength of a connection. The connections will be the observed chord transitions in our selected songs, and their strengths will be how many times each transition happened. With this configuration, the chord diagram makes the relationships between chords explicit.
# devtools::install_github("mattflor/chorddiag")
library(chorddiag)
comp <- chords %>%
  dplyr::mutate(seq = dplyr::lead(chord)) %>% # Next chord in the sequence
  dplyr::filter(chord != seq) %>% # Drops self-transitions (and the final NA)
  dplyr::group_by(chord, seq) %>%
  dplyr::summarise(n = dplyr::n()) # Counts each observed transition
mat <- tidyr::spread(comp, key = chord, value = n, fill = 0)
mm <- as.matrix(mat[, -1])
# Building the chords diagram
chorddiag::chorddiag(mm, showTicks = FALSE,
                     palette = "Reds")
Now we can clearly see how the transitions behave in the songs we’re using. There are strong connections between the chords A and E, and A and B, while the other ones are, in general, fragmented.
A cool thing to notice is that the diagram is interactive, so we can see the strength of each transition by hovering over it with the mouse cursor!
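The counts behind such a diagram form a transition matrix. As a hedged sketch of one way to build it, the base R code below computes transitions inside each song (so no pair spans two different songs) and uses shared factor levels to keep the matrix square even when a chord never appears as an origin or as a destination. The data and object names are made up for illustration:

```r
# Made-up chord sequences for two hypothetical songs
toy <- data.frame(
  song  = c(rep("Song A", 5), rep("Song B", 4)),
  chord = c("A", "E", "A", "E", "B", "E", "B", "A", "E"),
  stringsAsFactors = FALSE
)

# Build (from, to) pairs within each song separately
pairs <- do.call(rbind, lapply(split(toy$chord, toy$song), function(x) {
  data.frame(from = head(x, -1), to = tail(x, -1), stringsAsFactors = FALSE)
}))

# Shared levels guarantee the same chords on rows and columns
lv <- sort(unique(toy$chord))
trans <- table(from = factor(pairs$from, levels = lv),
               to   = factor(pairs$to,   levels = lv))
print(trans)
```

Rows are the origin chords and columns the destination chords, so trans["A", "E"] is the number of times A was followed by E.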
In this blog post, we introduced the chorrrds package, which extracts chords from the songs of an artist.
chorrrds is a new package, with a lot of potential and many possible applications to be explored. We hope this was useful, and that now you’re starting to be as enchanted by music information retrieval as we are!
For attribution, please cite this work as
Wundervald & Trecenti (2018, Aug. 19). R-Music: Introduction to the chorrrds package. Retrieved from https://r-music.rbind.io/posts/2018-08-19-chords-analysis-with-the-chorrrds-package/
BibTeX citation
@misc{wundervald2018introduction, author = {Wundervald, Bruna and Trecenti, Julio}, title = {R-Music: Introduction to the chorrrds package}, url = {https://r-music.rbind.io/posts/2018-08-19-chords-analysis-with-the-chorrrds-package/}, year = {2018} }