AUTOMATIC DETECTION OF INTERNAL AND IMPERFECT

10th International Society for Music Information Retrieval Conference (ISMIR 2009)

AUTOMATIC DETECTION OF INTERNAL AND IMPERFECT RHYMES

IN RAP LYRICS

Hussein Hirjee Daniel G. Brown

University of Waterloo

Cheriton School of Computer Science

{hahirjee, browndg}@uwaterloo.ca

ABSTRACT

Imperfect and internal rhymes are two important fea-

tures in rap music often ignored in the music information

retrieval community. We develop a method of scoring

potential rhymes using a probabilistic model based on

phoneme frequencies in rap lyrics. We use this scoring

scheme to automatically identify internal and line-ﬁnal

rhymes in song lyrics and demonstrate the performance

of this method compared to rules-based models. Higher

level rhyme features are produced and used to compare

rhyming styles in song lyrics from different genres, and

for different rap artists.

1. INTRODUCTION

Song lyrics have received relatively little attention in mu-

sic information retrieval, but can provide data about song

style or content that is missing from raw audio ﬁles or user-

input tags. Recent work focusing on lyrics [1–3] involves

using lyric text to extract song topic, theme, or mood in-

formation; the pattern and sound of the words themselves

is usually ignored.

These sound features are central to rap music, providing

information about vocal delivery and rhyme scheme. This

data can be characteristic of different rappers, as MCs often

boast of the uniqueness and superiority of their rhyming

style. Lyric rhymes have previously been studied as an

aid in characterizing different musical genres [4], but this

prior work ignores two stylistic features of rap lyrics: im-

perfect rhymes, where syllable end sounds are similar but

not identical, and internal rhyme, which occurs in the mid-

dle of lines.

To study these features, we have developed a system

for automatic detection of rap music rhymes. We train a

probabilistic scoring model of rhymes using a corpus of

rap lyrics known to be rhyming, using ideas derived from

bioinformatics. We then use this model to ﬁnd and catego-

rize various rhymes in different song lyrics, and assess the

model’s success. Finally, we calculate high-level statistical

Permission to make digital or hard copies of all or part of this work for

personal or classroom use is granted without fee provided that copies are

not made or distributed for proﬁt or commercial advantage and that copies

bear this notice and the full citation on the ﬁrst page.

° 2009 International Society for Music Information Retrieval.

rhyme scheme features to attempt to quantitatively model

and compare rhyming styles between artists and genres.

Our work allows the automated study of new features in

rap music, and may be extensible to other genres of song

lyrics or for poetry analysis.

2. BACKGROUND

Hip hop music is characterized by lyrics with intermit-

tent rhymes being rhythmically chanted (rapped) to an ac-

companying beat. In “Old School” rap (dating from the

late 1970s to mid 1980s), lyrics typically followed a sim-

ple pattern and contained a single rhyme falling on the

fourth beat of each bar [5]. Contemporary rap features

more varied delivery and many complex rhyme stylistic el-

ements that are often overlooked. Key among these are

rhymes that are imperfect, extended, or internal. Holtman

[6] provides a good overview of the abundance of imper-

fect rhyme in rap lyrics. A normal rhyme involves two syl-

lables that share the same nucleus (vowel) and coda (end-

ing consonants). Two syllables form an imperfect rhyme if

one of these two parts does not correspond exactly. How-

ever, these types of rhymes are not just composed of vow-

els and consonants being paired randomly; there is a con-

straint to the amount of dissimilarity in these rhymes, de-

termined by the shared articulatory features of matching

phonemes.

In Holtman’s hierarchy, the most similar consonants are

nasals, fricatives, and plosives differing only in place of

articulation, as in the line-ending /m/ and /n/ phonemes in:

Entertain and tear you out of your frame

Leave you in a puddle of blood, then let it rain. [7]

(Rhyming syllables in quoted lyrics are displayed with

the same font style.) Less similar consonant pairs include

those with the same place of articulation, but differing in

voice or continuancy, such as the /k/ and /g/ pair in:

Bring a bullet-proof vest, nothin’ to ricochet

Ready to aim at the brain, now what the trigger say? [7]

Vowels are most similar when differing only in height or

“length” (advanced tongue root), such as the penultimate

vowels in:

I’m the alpha, with no omega

Beginning without the, end so play the. [7]

Holtman’s work is largely taxonomic and describes

known rhymes, rather than discovering them. Hence, we

711

Oral Session 8: Lyrics

used a statistical model of phonetic similarity based on

frequencies in actual rap lyrics. However, the patterns we

automatically discovered largely validate her taxonomy.

Rap music often features triplet or longer rhymes with

unstressed syllables following the initial stressed pair,

which may span multiple words (mosaic rhymes). Longer

rhymes can also include more than one pair of stressed

syllables:

Maybe my sense of h

umor gets

ınto you

But girl, they can make a perf

ume from the sc

ent of you.

[8]

(Here the accents mark the syllables with primary stress.)

Finally, contemporary rap music features dazzlingly com-

plex internal rhyme. Alim [9] analyzes Pharoahe Monch’s

1999 album Internal Affairs [10] as a case study, and iden-

tiﬁes chain rhymes, compound rhymes, and bridge rhymes.

Chain rhymes are consecutive words or phrases in which

each rhymes with the previous, as in:

New York City gritty committee pity the fool that

Act shitty in the midst of the calm the witty, [10]

where “city”, “gritty”, “committee”, and “pity” participate

in a chain. Compound rhymes are formed when two pairs

of line internal rhymes overlap within a single line. A good

example of this is given in “Ofﬁcial”:

Yo, I stick around like hockey, now what the puck

Cooler than fuck, maneuver like Vancouver Canucks,

[10]

where “maneuver” and “Vancouver” are found between

“fuck” and “Canucks.” Bridge rhymes are internal rhymes

spanning two lines:

How I made it you salivated over my calibrated

RAPS that validated my ghetto credibility

Still I be PACKin agilities unseen

Forreal-a my killin abilities unclean facilities. [10]

Here, we call pairs in which both members are internal

(such as “agilities” / “abilities”) bridge rhymes, and those

where the ﬁrst word or phrase is line-ﬁnal (such as “cali-

brated” / “validated”), link rhymes.

3. FINDING RHYMES AUTOMATICALLY: A

PROBABILISTIC MODEL

We modeled our rhyme detection program after local align-

ment protein homology detection algorithms using BLO-

SUM (BLOcks of amino acid SUbstition Matrix) [11]. In

this framework, pairs of proteins are modeled as sequences

of symbols generated either randomly or based on shared

ancestry (homology). Pairs of matched amino acids re-

ceive a log-odds score in the BLOSUM matrix M: a pos-

itive score indicates the pair more likely co-occurs due to

homology, and a negative score indicates the pair is more

likely to co-occur due to chance. Scores are in log-odds:

M[i, j] = log

(Pr[i, j|H]/ Pr[i, j|R]), where H is a model

of related proteins (obtained by counting the frequency

with which we see symbols i and j matched to each other

in proteins known to be homologous) and R is the fre-

quency of the symbols i and j in random proteins (obtained

from frequency counts over all proteins). If a pair of pro-

tein sequences contains regions in which the amino acids

align to give high scores, the pair is considered to be ho-

mologous.

In our work, song lyrics are transcribed into sets of se-

quences of syllables, with each sequence corresponding to

a line of text. Similar to Kawahara’s [12] treatment of con-

sonants in Japanese rap lyrics, probabilistic methods are

used to calculate similarity scores for any given pair of syl-

lables. Phonemes which match with each other in rhyming

phrases more often than expected by chance receive pos-

itive scores, while those which match less often than ex-

pected receive negative scores. Regions with syllables that,

when matched to each other, have total score surpassing a

threshold are identiﬁed as rhymes.

4. RHYMING SYLLABLES

To generate models of rhyming and randomly co-occurring

syllables in rap lyrics, we needed a data set of known

rhymes. Our training corpus includes the lyrics of 31

inﬂuential albums from the “Golden Age” of rap (1984-

1994), chosen because they received the highest rating

from The Source, the top-selling US rap music magazine

of the time, plus nine additional albums by inﬂuential

artists from the time period (Run-DMC, LL Cool J, The

Beastie Boys, Public Enemy, Eric B. and Rakim). We

downloaded lyrics from the Web and manually corrected

them to ﬁx typos and ensure that pairs of consecutive lines

ended with matching rhymes, yielding 27,956 lines of

lyrics (13,978 rhymed pairs), approximately 700 lines per

album.

We ﬁrst transcribe plaintext lyrics into sequences of

phonemes using a wrapper we built around the Carnegie

Mellon University (CMU) Pronouncing Dictionary [13],

which gives phonemes and stress markings for words in

North American English. We augmented the dictionary

with slang terms and common elements of hip-hop vernac-

ular (e.g., the “-in” ending in “runnin’”, or the “-a” ending

in “brotha” or “killa”), and reduced the stress assigned

to common one-syllable words of minor signiﬁcance in

rhyme (“a”, “I”, etc.). To handle words not found in

the augmented dictionary, we added the Naval Research

Laboratory’s text-to-phoneme rules [14].

5. SCORING POTENTIAL RHYMES

To generate a log-odds scoring matrix for rhyming syl-

lables, we need models for random syllables and for

rhymes. For any pair of syllables i and j, the random

model, Pr[i, j|Random], gives the likelihood of i and j

being matched together by chance while the rhyme model,

Pr[i, j|Rhyme], gives the likelihood of i and j being paired

in a true rhyme. As in BLOSUM [11], the log-odds score

is calculated as ln(Pr[i, j|Rhyme]/ Pr[i, j|Random]). To

avoid overﬁtting, we reduce each syllable to its vowel

(nucleus), end consonants (coda), and stress—the relevant

features for determining rhyme. We approximate the coda

by taking the ﬁrst half (rounded up) of the consonants

712

10th International Society for Music Information Retrieval Conference (ISMIR 2009)

between adjacent pairs of vowels. Both models are trained

using the occurrence frequencies of phonemes in the

training data.

In the random model, the likelihood of vowel a match-

ing with vowel b is calculated by taking the product of the

frequencies of a and b. The likelihoods for consonants and

varying stress are calculated in the same manner. For the

rhyming model, the likelihood of vowels a and b being

matched is calculated by taking the number of times a and

b are seen matching in known rhymes, and dividing by the

total number of matched vowel pairs in known rhymes.

Then the log-odds score for the vowels is calculated as

vowelScore(a, b) = ln(Pr[a, b|Rhyme]/ Pr[a, b|Random]).

The likelihood for consonants is more complicated

since we must also consider unmatched consonants when

aligning syllable codas of differing size. We use an iterated

approach to solve these problems. In the ﬁrst pass over

the training data, we produce initial vowel and consonant

scoring matrices by calculating the statistics above. We

consider rhymes in paired lines to be all syllables follow-

ing the ﬁnal primary-stressed syllable, after Holtman [6].

In the second pass, we identify the start of rhymes by

moving backwards from the end of the line while initial

scores for stressed syllables are positive. We perform

global alignment [15] on matched codas to determine

frequencies for consonants pairing with other consonants,

and being unmatched at the start or end of the coda. This

distinction is useful since some consonants (such as /l/ and

/r/) are more likely to be unmatched at the beginning of

clusters, and others (often coronals, such as /d/ and /z/)

are more likely to be unmatched at the ends of clusters. A

simple example of this is found in the repeated occurrences

of “alarmed” rhyming with “bomb” in Public Enemy’s

“Louder Than A Bomb.” [16]

Using these frequency statistics, we produce the

rhyming model and log-odds scores for consonants and

stress in the same way as for vowels. Finally, we normal-

ize the consonant score by dividing by the length of the

coda to avoid the problem of syllables with long codas

having the consonant score dominate. Intuitively, “win”

and “gin” rhyme as well as “splints” and “mints.” Since

all the constituent scores are log-odds, they can be added

together to form a combined probabilistic log score. The

ﬁnal score for two given syllables is the sum of the vowel

score, normalized consonant score, and stress score.

Tables 1 and 2 show the pairwise scoring matrices. The

symbols “ *” and “* ” indicate scores for unmatched con-

sonants at the beginning and end of codas, respectively.

High scores for pairs like (/m/,/n/) and (/k/,/p/) largely val-

idate Holtman’s hierarchy [6].

6. RHYME DETECTION ALGORITHM

With our probabilistic scoring method for matched sylla-

bles in place, we need a procedure to identify internal and

end rhymes. Our technique is a variant on local align-

ment [15]; for each syllable, we identify its closest pre-

ceding rhyming syllable, and longest preceding rhyming

phrase within the current and previous lines. For example,

AA AE AH AO AW AY EH ER EY IH IY OW OY UH UW

AA 2.3 -3.3 -0.8 1.6 -1.7 -2.7 -7.2 -0.6 -3.9 -4.8 -3.9 -1.0 -1.7 -3.3 -3.9

AE 2.1 -1.5 -6.6 -1.9 -3.3 -1.5 -3.4 -1.8 -2.0 -4.3 -4.6 -4.5 -3.7 -6.7

AH 2.2 -1.2 -1.4 -1.4 -0.6 -0.2 -1.7 -0.3 -3.0 -1.0 -0.6 -0.9 -1.5

AO 3.1 -1.0 -3.8 -6.5 -1.1 -3.9 -4.2 -6.3 -0.3 -0.4 1.1 -3.3

AW 3.8 -0.3 -6.0 -4.2 -5.7 -6.0 -5.7 -2.0 -2.9 -4.5 -1.4

AY 2.5 -4.2 -1.1 -7.0 -1.8 -3.2 -4.3 -1.1 -5.7 -6.4

EH 1.9 -1.2 -1.5 0.2 -2.1 -7.0 -4.5 -6.1 -4.3

ER 3.9 -5.6 -1.5 -5.5 -1.6 -2.7 -1.3 -2.6

EY 2.5 -3.4 -2.7 -4.4 -4.3 -5.8 -6.5

IH 2.0 -0.9 -7.1 0.2 -2.2 -3.7

IY 2.4 -4.4 -4.2 -5.8 -6.4

OW 2.8 -4.0 -2.5 -1.5

OY 4.9 0.1 -3.7

UH 2.6 -0.5

UW 3.1

Table 1. Scoring Matrix for Vowels

given the line

Unobtainable to the brain it’s unexplainable what the

verse’ll do [10]

from Pharoahe Monch’s “Right Here,” the middle “ain”

syllables all rhyme, while the whole of “unexplainable”

also rhymes with “unobtainable.”

For every pair of consecutive lines in a set of lyrics,

we ﬁrst construct a two-dimensional matrix of the score

for every pair of syllables. Entries in this matrix (corre-

sponding to pairs of syllables in the lines) are selected as

“anchors” if they have score above a threshold and con-

tain a stressed syllable or are line-ﬁnal. From these anchor

positions, rhymes are extended forward, ensuring that the

length-normalized score is above a syllable threshold. In

addition to the iterative extension, a “jump”-type exten-

sion is also allowed, in which one or two syllables can be

skipped over if the following syllable pair is an anchor type

with score above a higher threshold. This was included

since longer polysyllabic mosaic rhymes often contain one

or two syllables that do not rhyme in the midst of three

or four that do. A good example of this can be found in

Fabolous’ “Can’t Deny It”:

I keep spittin’, them clips copped on those calicos

Keep shittin’, with ziplocks of that Cali ’dro [8]

where the two lines rhyme in their entirety, with the excep-

tion of “them”/“with” and “those”/“that.”

We ﬁltered the set of rhymes to remove one-syllable

rhymes including unstressed syllables, as these tended to

be noise. After a set of rhymes was identiﬁed, we removed

duplicates and consolidated consecutive and overlapping

rhymes together.

7. VALIDATING THE METHOD

Our ﬁrst test veriﬁes that our probabilistic score for sylla-

ble rhyming is better at identifying perfect and imperfect

rhymes than rules-based phonetic similarity measures. We

did a 10-fold cross validation where we chose 36 albums

from the training data, trained a rhyme model for those al-

bums, and used it to score the known rhyming lines from

the other four albums (true positives) as well as randomly

selected lines from those four albums (presumed to be true

713

Oral Session 8: Lyrics

B CH D DH F G JH K L M N NG P R S SH T TH V Z ZH * *

B 4.3 -4.8 1.1 0.4 -5.5 1.9 1.9 -6.9 -0.3 -0.5 -1.6 -5.5 0.1 -0.9 -1.6 -4.6 -1.0 -4.3 2.3 0.3 -2.5 -0.6 -1.5

CH 4.2 -1.6 -4.9 -0.3 0.3 0.4 1.5 -6.8 -6.6 -2.8 -5.5 1.1 -6.7 0.3 0.6 0.9 1.4 -6.1 -2.0 -2.5 -6.0 -2.6

D 2.3 -7.0 -7.6 0.1 0.2 -3.1 -1.7 -2.2 -2.2 -3.0 -1.8 -0.9 -9.0 -2.1 0.2 0.0 -0.2 0.0 -4.6 -0.2 1.2

DH 3.5 -5.6 -5.1 -4.2 -0.4 -0.2 -2.0 -7.5 -5.6 -6.2 -1.4 -7.0 -4.8 -0.3 1.3 2.8 1.1 -2.6 -6.0 -3.4

F 3.4 -1.2 -4.9 -0.3 -1.5 -1.3 -3.5 -1.6 1.1 -2.7 1.1 1.2 -0.9 4.0 0.6 -7.3 -3.2 -1.4 -2.9

G 4.2 1.9 0.0 -0.2 -1.0 -1.9 -5.7 -0.6 -0.8 -2.5 -4.9 -1.1 -4.5 0.3 -0.3 -2.7 -0.9 -2.8

JH 5.2 -6.3 -1.5 0.1 -0.5 -4.8 -0.2 -0.3 -0.6 0.6 -1.1 -3.6 1.4 1.0 4.1 -5.3 0.5

K 2.6 -2.9 -2.1 -2.6 -1.3 1.7 -2.1 -0.7 -0.6 0.9 0.5 -1.8 -3.1 -4.7 -1.0 -1.8

L 2.8 -1.8 -1.8 -2.8 -8.1 -0.5 -2.9 -6.6 -2.9 -6.3 -1.3 -1.6 -4.5 0.4 -1.0

M 2.7 1.8 0.7 -3.2 -1.2 -2.9 -1.1 -2.5 0.4 -0.6 -3.7 -4.2 -0.8 -1.7

N 2.2 1.2 -2.5 -1.0 -2.3 -0.7 -1.5 -0.6 -1.5 -2.1 -5.1 -0.4 -2.3

NG 4.1 -6.8 -2.7 -2.3 -5.3 -3.5 -5.0 -2.1 -2.0 -3.2 0.2 -3.9

P 3.3 -2.0 -1.1 -0.7 1.1 0.9 -0.6 -7.9 -3.8 -0.7 -0.8

R 2.8 -2.3 -0.8 -1.2 -6.1 -2.1 -2.2 -4.3 1.7 -0.7

S 2.6 2.4 -1.0 1.0 -2.4 0.5 0.0 0.6 0.6

SH 5.2 -0.6 -4.1 -1.3 -0.2 3.6 -5.8 -7.7

T 1.7 1.6 -0.9 -9.2 -5.2 0.0 0.7

TH 4.4 0.5 -6.1 -2.0 -5.4 -0.6

V 2.9 -0.4 1.6 -1.2 -1.7

Z 2.6 3.0 -1.3 1.1

ZH 6.8 -3.7 -5.6

Table 2. Scoring Matrix for Consonants

negatives). We developed implementations of the mini-

mal mismatch of articulatory features and Kondrak align-

ment [17] metrics to compare the performance of these

scoring measures, which are based on the physical process

of the human voice. We show receiver operator character-

istic (ROC) curves comparing the true positive rate to false

positive rate when varying the score threshold for each of

the three methods in Figure 1. The probabilistic method

signiﬁcantly outperforms both simpler rules-based meth-

ods.

Figure 1. ROC curves for the three different scoring meth-

ods, comparing percentage of actual rhymes found by algo-

rithm on the y-axis with percentage of unrelated syllables

detected as rhyming on the x-axis

Next, we considered false positives and negatives for

detected end rhymes, using the score threshold of 1.5

(meaning matched syllables are at least e

1.5

times more

likely to rhyme than expected by chance). Out of 1000

pairs of unrelated random lines from our training data,

79 syllables were marked as parts of end rhymes (“false

positives”) by our procedure. Of these, 22 were in fact

true rhymes, with scores higher than 3.0. 30 were near-

rhymes; that is, that they could be found (though less

frequently) as line ﬁnal rhymes in actual lyrics. Usu-

ally scoring above 2.0, they included matches such as

“stiff”/“ﬁt”, “pen”/“thing”, and “cling”/“smothering”,

with more than one articulatory difference or different

stress. 14 matched end syllables (often sufﬁxes), typically

with high scores (greater than 3.0). Examples such as

“weaker”/“drummer” and “tappin’”/“position”, may have

exact matches, but are not relevant rhymes due to their

lack of stress. The remaining 13 moderately high scoring

(between 1.5 and 2.5) pairs featured either high consonant

scores (like “bust”/“test”) or high vowel scores due to

matching rare vowel sounds (“box”/“wrong”).

From a set of 1000 matched pairs of lines, we used the

iterative method (moving backwards from the end of the

line while scores for stressed syllables are positive) to see

which true rhymes would be missed. Pairs with all such

matches scoring less than 1.5 were marked and treated as

false negatives. Out of 132 such syllables, the largest group

(48) were moderately low scoring (between -1.0 and 1.5)

pairs participating in polysyllabic and mosaic rhymes. A

good example of this is “battery”/“battle me” in Eric B.

and Rakim’s “No Omega” [7]; many of these were ﬂanked

by high scoring pairs, and would be included in rhymes us-

ing the jump extension described in the above section. 35

were very low scoring pairs (less than 0.0) which were ei-

ther caused by words having been transcribed improperly

or the lack of a true rhyme in the lyrics. 22 were caused

by the rhyme start being extended too far back and start-

ing with a low positive scoring pair. Again, this would

not cause problems in our actual detection algorithm since,

in that case, rhymes are extended forward from stressed

anchors. 17 were caused by differences between the ac-

tual pronunciation and the dictionary’s pronunciation (“po-

ems” treated as one syllable, or “battles” speciﬁcally being

pronounced to rhyme with “shadows”). Finally, 10 were

caused by deliberate mismatch in syllable stress.

The probabilistic model is quite good at ﬁnding both

perfect and imperfect rhymes. Quite few syllable pairs

714

10th International Society for Music Information Retrieval Conference (ISMIR 2009)

(less than 15 in the 1000 line pairs) scored highly without

being perceivably rhyming, and most low scoring “true”

rhyme pairs take part in complex mosaic and polysyllabic

rhymes.

Finally, we used our model on a set of manually an-

notated rap lyrics, to measure the ability of the program

to ﬁnd both internal and line-ﬁnal rhymes. We used ﬁve

songs of varying style: the Beastie Boys’ “Intergalac-

tic”, a Grammy-winning song in the old-school style;

Pharoahe Monch’s “The Truth” (featuring Common and

Talib Kweli) and “Right Here”, which were annotated by

Alim [9] and feature high rhyme density and a compli-

cated scheme; Jay-Z and Eminem’s “Renegade”, which

features very high rhyme density; and Fabolous’ “Trade

It All (Part 2)”, a song speciﬁcally mentioned by Alim

for its prevalence of long (ﬁve or six syllable) rhymes.

We show the ROC curves for this test set in Figure 2; the

best overall performance is for speciﬁcity and sensitivity

just above 60%. Most “false positive” are rhymes that

were not annotated due to lack of rhythmic importance

or accidental omission. False negatives included several

where the performer created a rhyme from words that do

not appear to rhyme as text, and some longer rhymes that

were cut off prematurely due to too many non-rhyming

syllables within them and lower scoring syllable pairs

surrounding them. Finally, some rhymes were missed due

to intervening rhymes being found between the rhyming

parts, particularly when the threshold for rhymes is set

low. This is especially evident in the ROC curves at lower

cut-off thresholds, where true positive rates peak around

80% and begin to decline as the threshold is lowered.

8. EXPERIMENTS

We used our procedure to examine a variety of features

about the rhymes in several sets of lyrics. We computed

the number of syllables per line, the number of rhymes

per line, the number of rhymes per syllable, average end

rhyme scores, and proportion of rhymes having two, three,

four, or more syllables. We also counted all of the complex

rhyming features (bridge, link, chain and internal rhymes)

per line.

We hypothesized that these features would show dif-

ferences between genres of popular music, and calculated

them for four sets of data: the top 10 songs from Bill-

board Magazine’s 2008 year-end Hot Rap Singles chart;

the top 20 songs from the 2008 year-end Hot Modern Rock

Songs chart; the ﬁrst 400 lines of Milton’s “Paradise Lost”

[18], as a similar-sized sample of non-rhyming verse; and

the top 10 songs from the 1998 year-end Hot Rap Singles

chart. To compare the verses most of all, the song lyrics

were modiﬁed to remove intro/outro text, repeated lines,

and additional choruses. Our results are in Table 3. High

end rhyme scores are indicative of song lyrics in general

(relative to unrhymed verse); rap has higher rhyme density,

internal rhyme, link rhymes, and bridge rhymes. Interest-

ingly, blank verse and rock lyrics have similar amounts of

rhyming per line, but rock lyrics have more rhymes per syl-

lable. The data from 1998 and 2008 rap songs suggest that

in their rhyming pattern, there has not been much shift in

style.

Rap ’08 Rap ’98 Rock Blank

Number of Lines 476 613 502 400

Number of Syllables 4646 6492 4053 4146

Syllables per Line 9.76 10.59 8.07 10.37

Number of Rhymes 794 1118 476 393

Rhymes per Line 1.67 1.82 0.95 0.98

Rhymes per Syllable 0.17 0.17 0.12 0.09

Rhyme Density 0.28 0.27 0.19 0.12

Average End Score 5.28 5.21 4.36 2.49

per Syllable 3.75 3.67 4.01 2.28

Doubles per Rhyme 0.23 0.29 0.15 0.18

Triples per Rhyme 0.08 0.06 0.04 0.03

Quads per Rhyme 0.02 0.03 0.05 0.00

Longs per Rhyme 0.03 0.02 0.04 0.01

Internals per Line 0.62 0.60 0.27 0.28

Links per Line 0.20 0.28 0.13 0.16

Bridges per Line 0.43 0.48 0.28 0.40

Chaining per Line 0.32 0.18 0.15 0.07

Table 3. Rhyme Features for Different Genres

We also hypothesized that features of individual rap-

pers might also be informative, so we produced these

statistics for albums by nine famous MCs from a diverse

range of styles and eras: Run-DMC, Rakim, Notorious

B.I.G., 2Pac, Jay-Z, Fabolous, Eminem, 50 Cent, and Lil’

Wayne. Features were calculated for segments of at least

40 lines to produce means and standard deviations of the

statistics for each album. The results indicate that many

of these features can be characteristic of different artists’

styles. For example, Run-DMC’s (1984) old-school style

has lower rhyme density and less internal rhyme with an

average of 1.5 rhymes per line and only 6% of rhymes

being longer than 2 syllables; while Rakim (1987), known

for his more complex style, is detected as using more

internal rhymes (0.63 per line to Run-DMC’s 0.48) and

more rhymes longer than 2 syllables (12%). Rival rappers

Notorious B.I.G. (1994) and Tupac Shakur (1995) display

fairly similar style characteristics: 28% of their rhymes

are 2 syllables long, 6% are three syllables, and 3% are

longer. However, Biggie’s lines are signiﬁcantly shorter in

length, with, on average, 10.8 syllables to 2Pac’s 11.6.

Artists from the early 2000s like Jay-Z (2001), Eminem

(2000), and especially Fabolous (2001) favour longer

rhymes, with 15%, 17%, and 30% respectively of their

rhymes being longer than 2 syllables. They also have the

most rhyme density overall, with 2.2, 2.3, and 1.9 rhymes

per line respectively. Jay-Z and Eminem tend to use

more internal rhyme as well, having 0.8 internal rhymes

per line–about 25% higher than the average among other

MCs. Although he portrays a “thug” persona, 50 Cent

(2003) uses the most syllables per line (12.1), while Lil’

Wayne (2008) has the fewest (10.2). However, he manages

high rhyme density (0.3 rhymed syllables for each syllable

used) with relatively few (only 1.8) rhymes per line. In

general, we ﬁnd that automatic rhyme detection can yield

characteristic data about performers and genres.

715

Oral Session 8: Lyrics

Figure 2. Rhyme Detection Syllable ROC Curves for Different Songs. The y-axis indicates the percentage of true rhymes

identiﬁed by the algorithm, while the x-axis shows the percentage of automatically identiﬁed rhymes not considered to be

true rhymes.

9. CONCLUSION

Using a probabilistic scoring model, we were able to iden-

tify both perfect and imperfect rhymes with a higher level

of accuracy then simpler rules-based methods. The heuris-

tic rhyme detection methods achieved moderate success at

ﬁnding both internal and line-ﬁnal rhymes in song lyrics.

More importantly, statistical features of these rhymes did

correspond to real world characterizations of rhyme style,

and many of these features are quite consistent within in-

dividual artists’ lyrics and varied between different artists.

This leads to the possibility that automatically calculated

rhyme statistics can be used to make meaningful catego-

rizations and recommendations based on rhyme style.

10. REFERENCES

[1] B. Wei, C. Zhang, M. Ogihara: “Keyword Generation for

Lyrics,” Proceedings of the International Conference on Mu-

sic Information Retrieval, 2007.

[2] F. Kleedorfer, P. Knees, T. Pohle: “Oh Oh Oh Whoah! To-

wards Automatic Topic Detection in Song Lyrics,” Proceed-

ings of the International Conference on Music Information

Retrieval, pp. 287–292, 2008.

[3] H. Fujihara, M. Goto, J. Ogata: “Hyperlinking Lyrics: A

Method for Creating Hyperlinks Between Phrases in Song

Lyrics,” Proceedings of the International Conference on Mu-

sic Information Retrieval, pp. 281–286, 2008.

[4] R. Mayer, R. Neumayer, A. Rauber: “Rhyme and Style Fea-

tures for Musical Genre Classiﬁcation by Song Lyrics,” Pro-

ceedings of the International Conference on Music Informa-

tion Retrieval, pp. 337–342, 2008.

[5] Adam Bradley: Book of Rhymes: The Poetics of Hip Hop,

Basic Civitas Books, 2009.

[6] Astrid Holtman: A Generative Theory of Rhyme: An Opti-

mality Approach, PhD dissertation, Utrecht Institute of Lin-

guistics, 1996.

[7] Eric B. and Rakim: Let the Rhythm Hit ’Em, MCA Records,

1990.

[8] Fabolous: Ghetto Fabolous, Elektra Records, 2001.

[9] H. Samy Alim: “On Some Serious Next Millennium Rap

Ishhh: Pharoahe Monch, Hip Hop Poetics, and the Internal

Rhymes of Internal Affairs,” Journal of English Linguistics,

Vol. 31, No. 1, pp. 60–84, 2003.

[10] Pharoahe Monch: Internal Affairs, Rawkus Records, 1999.

[11] S. Henikoff and J.G. Henikoff “Amino Acid Substitution

Matrices from Protein Blocks” Proceedings of the National

Academy of Sciences of the United States of America, Vol. 89

No. 22 pp. 10915-10919, 1992.

[12] Shigeto Kawahara: “Half rhymes in Japanese rap lyrics and

knowledge of similarity,” Journal of East Asian Linguistics,

Vol. 16, No. 2, pp. 113–144, 2007.

[13] Kevin Lenzo: The CMU Pronouncing Dictionary,

http://www.speech.cs.cmu.edu/cgi-bin/

cmudict,2007.

[14] H.S. Elovitz, R.W. Johnson, A. McHugh, J.E. Shore

“Automatic translation of English text to phonetics

by means of letter-to-sound rules,” Interim Report

Naval Research Lab. Washington, DC., 1976 http:

//www.speech.cs.cmu.edu/comp.speech/

Section5/Synth/text.phoneme.3.html.

[15] R. Durbin, S. Eddy, A. Krogh, G. Mitchison Biological Se-

quence Analysis: Probabilistic Models of Proteins and Nu-

cleic Acids, Cambridge University Press, 1999.

[16] Public Enemy: It Takes a Nation of Millions to Hold Us Back,

Def Jam Recordings, 1988.

[17] Grzegorz Kondrak: “A New Algorithm for the Alignment

of Phonetic Sequences,” Proceedings of the First Meeting of

the North American Chapter of the Association for Compu-

tational Linguistics, pp. 288–295, 2000.

[18] John Milton: Paradise Lost, Samuel Simmons, 1667.

716