Hip-hop is ubiquitous, but many people, including Sophia Grace Brownlee, don't understand the lyrics. Why is this?

Sophia Grace and Rosie were in the news again recently; they were at the Grammys to interview what are becoming familiar faces to them. If you, like my mom, are a year or more behind in your viral video watching (she sent me a link to Antoine Dodson's breakout interview just last week), Sophia Grace, 8, and Rosie, 5, are two English girls who sang Nicki Minaj's “Super Bass” in a video posted on YouTube on September 19, 2011.

It went viral -- the two are pretty adorable. Someone on The Ellen DeGeneres Show watched it and invited them to come on the show. There they performed the song again and met Nicki Minaj. Since then, they've done more Ellen visits and have sung a couple of other songs, but nothing has been quite as captivating as their Super Bass performance.

I find this particular viral phenom fascinating, but perhaps not for the same reasons you might. Every time their names come up, it makes me wonder about the comprehensibility of hip-hop lyrics. Would the whole episode have been so well tolerated, nay, cherished, by so many if Super Bass's risqué subject matter had been placed in a nice, comprehensible ballad (think Taylor Swift, or as I would prefer, Joni Mitchell)? We live in a culture in which even the words gun and bullet have been censored in Foster the People's "Pumped Up Kicks", so my guess is no. That is, pound for profane pound, hip-hop lyrics tend to have relative immunity compared with lyrics in other genres. This is probably not because the general public likes hip-hop more than any other genre -- in fact, a recent NPR piece seems to indicate quite the opposite. Rather, it's probably just because we can't understand what the hell hip-hop artists are saying.

You might think it's ridiculous to say that hip-hop has relative immunity compared to other genres. After all, people are always saying hip-hop songs in particular contain foul language. That's perhaps true, but for all the potentially offensive hip-hop lyrics that people do understand, there are plenty that they don't. The latter fly under the radar -- you can't take offense if you can't understand what someone's saying.

To illustrate, let's get back to Sophia Grace and Super Bass. When Ellen asked Sophia Grace if she knew what the song is about, Sophia said no (starting at about 2:54 in). Okay, Sophia is eight, and most adult listeners would have a vague impression that Super Bass is about love and sex and romance and sweet-ass car stereo systems. But that's probably it: research on comprehension of other genres such as heavy metal and rock/pop indicates that listeners have only a thematic understanding of lyrics. And, although no studies I'm aware of compare the comprehensibility of lyrics across genres (there are, however, plenty that examine heavy metal and hip-hop for an association with anti-social behavior), hip-hop lyrics do seem much trickier to understand than rock and perhaps even heavy metal lyrics. As Paul Devlin aptly puts it in his article "Fact-Check the Rhyme", "Transcription of rap lyrics is excruciatingly difficult, due to speed of delivery, slang, purposeful mispronunciation, and the problem of the beat sometimes momentarily drowning out or obscuring the lyrics. (And unlike rock or pop albums, rap album booklets very rarely include lyrics.)"

In fact, compared to other hip-hop songs, the lyrics of Super Bass are actually fairly understandable -- it's what I call hip-pop, a trend these days toward blending the genres of hip-hop and pop. Still, Nicki includes several features of African-American English, which can impede comprehension for people not familiar with this dialect of English. Nicki also slips in some non-canonical pronunciations, slang, and double entendres characteristic of hip-hop. Sure, the words ho, ni**a, and panties comin' off seem pretty easy to understand. But Nicki's references to cocaine, condoms, and oral sex are much more subtle. Of these, the reference to cocaine is the most straightforward, since the word coke itself is apparently used. But it just doesn't sound too much like coke. This is because in the lines

He ill, he real, he might gotta deal

He pop bottles and he got the right kind of build

He cold, he dope, he might sell coke

the vowels in cold, dope, and coke sound more like the vowels in cow and down, if indeed they sound at all like existing vowels in English. This impression is confirmed if you look at a spectrogram analysis of these words, which is just what I've done. Think of the sounds we produce as lying in a continuous, multi-dimensional space (because that's what speech production is). If you analyze the frequency spectrum of the human voice as it produces speech, you will see that this spectrum has a series of peaks called formants, which reflect the shape of the vocal tract and so characterize the particular sound being made, such as a vowel. Together, the first and second formant values capture much of what makes each vowel distinct. The first formant (F1) has to do with how open your mouth is when you say a particular vowel. When you say the word cop, for example, your mouth is more open than when you say the word coke, so of these, the cop vowel has a higher F1 value. The second formant (F2) has a higher frequency for a front vowel like the one in kit, as opposed to a vowel produced more in the back of the mouth, like the one in coke. And because the situation I just described isn't complicated enough (vowels are notoriously difficult beasts), English also has a distinction between monophthongs and diphthongs. The articulation of a monophthong vowel for the most part stays the same as you say it (think kit, trap, cop). The articulation of diphthong vowels, on the other hand, changes over the course of pronunciation. Just say the words coke, cow, and kind slowly -- preferably when no one's around, or you'll get some interested onlookers -- and you'll notice a change in the vowels that you won't hear or feel when saying words with monophthong vowels.
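
For readers who think better in code, here is a minimal Python sketch of the ideas above: a vowel as a start point and an end point in F1/F2 space, where a diphthong is simply a vowel that travels. Everything in it is an assumption for illustration. The formant numbers are invented round figures (not measurements and not published values), the names Formants and Vowel are hypothetical, and the 150 Hz cutoff is arbitrary; phoneticians usually work on perceptual scales rather than raw Hz.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Formants:
    f1: float  # Hz; a higher F1 goes with a more open mouth
    f2: float  # Hz; a higher F2 goes with a more front vowel


@dataclass(frozen=True)
class Vowel:
    word: str
    start: Formants  # formants near the beginning of the vowel
    end: Formants    # formants near the end of the vowel

    def movement(self) -> float:
        """Straight-line distance travelled in F1/F2 space, in Hz."""
        return hypot(self.end.f1 - self.start.f1, self.end.f2 - self.start.f2)

    def is_diphthong(self, threshold_hz: float = 150.0) -> bool:
        """Crude test: the vowel counts as a diphthong if it moves far enough.

        The default threshold is an arbitrary illustrative choice.
        """
        return self.movement() >= threshold_hz


# Invented placeholder values, chosen only to mirror the description above.
KIT = Vowel("kit", Formants(450, 2400), Formants(460, 2380))
COP = Vowel("cop", Formants(900, 1500), Formants(900, 1500))
COW = Vowel("cow", Formants(900, 1600), Formants(500, 1100))
COKE = Vowel("coke", Formants(550, 1000), Formants(450, 850))

for vowel in (KIT, COP, COW, COKE):
    kind = "diphthong" if vowel.is_diphthong() else "monophthong"
    print(f"{vowel.word}: moves {vowel.movement():.0f} Hz -> {kind}")
```

With these made-up numbers, cop has a higher F1 than coke (more open mouth) and kit has a higher F2 than coke (more front), which is all the sketch is meant to show.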

Now that you've had a crash course in vowels, you're in a position to see just how distinctive Nicki's vowels in the words cold, dope, and coke are. Below I've created a two-dimensional vowel space plot comparing first and second formant values for various vowels, including Nicki's. By the way, these are the kinds of plots that linguists, phoneticians in particular, look at all the time. The words in black are canonical pronunciations of vowels by female speakers of Mainstream American English (also called General American English), using values taken from Hillenbrand et al. (1995), “Acoustic characteristics of American English vowels”. Diphthong vowels like those in the words cow and coke are plotted at the values for the first vowel and have an arrow pointing to the second vowel values. Nicki's pronunciations of the words cold, dope, and coke are in -- what else -- pink. Nicki's pronunciation of coke is strikingly different from the mainstream pronunciation. In fact, this plot shows it's actually slightly closer to the mainstream pronunciation of trap than to the mainstream pronunciation of coke. Where her coke starts, it more closely resembles cow, but it finishes in a completely different spot. This is to some extent consistent with longitudinal findings that back vowels such as that in coke are getting "fronted" among some speakers, as is seen with the higher F2 values for Nicki's coke. But she's exaggerating this feature, and her vowels here still have a unique profile that is remarkably consistent across these three words.

Two-dimensional vowel space plot of female speakers' vowels. Values for Mainstream American English are given in black; Nicki Minaj's vowels in Super Bass are in pink.
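
A claim like "closer to trap than to coke" boils down to a distance comparison in this two-dimensional space. The Python sketch below shows one way to make that comparison, under loud assumptions: the reference points are round-number placeholders rather than the Hillenbrand et al. values, the token is a hypothetical measurement and not one of Nicki's, and plain Euclidean distance in Hz is a simplification.

```python
from math import dist

# Placeholder (F1, F2) points in Hz for a few reference words.
# These are invented round numbers, NOT published values.
REFERENCE = {
    "kit": (450.0, 2400.0),
    "trap": (700.0, 2300.0),
    "cop": (900.0, 1500.0),
    "coke": (550.0, 1000.0),
}


def ranked_by_distance(token, references):
    """Return (word, distance) pairs, nearest reference vowel first."""
    pairs = [(word, dist(token, point)) for word, point in references.items()]
    return sorted(pairs, key=lambda pair: pair[1])


def nearest_vowel(token, references):
    """Return the reference word whose (F1, F2) point is closest to the token."""
    return ranked_by_distance(token, references)[0][0]


# A hypothetical measured token: fairly open (high F1) and fairly front (high F2).
token = (750.0, 1900.0)

for word, distance in ranked_by_distance(token, REFERENCE):
    print(f"{word}: {distance:.0f} Hz away")
```

With these placeholder numbers the hypothetical token lands nearest trap and farthest from coke, which is the same kind of pattern the plot is described as showing.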

In addition to changing her vowels, Nicki chops off the final sounds on the words build, cold, dope, and coke. Technically, linguists call this a lack of a "burst release", which means that her tongue could be ending up where it would need to be in order to articulate a sound such as the final /k/ in coke, but that's where the sound ends. Often there would be a release of air, a follow-through, as it were.

The lack of a burst release is no doubt an aesthetic move, in that not releasing the final consonants of cold, dope, and coke makes them rhyme more closely. But reducing or deleting word-final consonants is also a feature of African-American English, which is the prestige variety of language in hip-hop (see H. Samy Alim’s excellent “Hip Hop Nation language”, part of Language in the USA: Themes for the twenty-first century). As an added benefit, this reduction further disguises the word coke, which, in combination with the preceding vowel Nicki's created, makes the lyric seem pretty benign. Why shouldn't an eight-year-old sing it? In fact, in the initial video uploaded onto YouTube, Sophia Grace closely imitates Nicki's pronunciation of coke, making it sound more like caw than any illegal substance.

Pronunciation differences mixed with tropes like metonymy make for even more difficult lyrical interpretation. This is the case with Nicki's dolo, as in:

He could ball with the crew, he could solo

But I think I like him better when he dolo

Dolo is Nicki's way of saying down low; in dolo, however, the vowels are again more ahh-like than o-like, and the final consonant of down is again dropped, making Nicki's pronunciation quite different from the mainstream pronunciation of down low. But here's where metonymy further masks the meaning: when she says down low, she's of course referring to oral sex. Thus, an eight-year-old English girl saying dow-laow is much cuter than her saying, well, oral sex. In the very next line, the sexual progression builds to penetration with the following metaphor:

And I think I like him better with the fitted cap on

A fitted cap being, of course, a condom; staying on the topic of sex makes sense given the next few lines about her panties comin' off when he gives her "that look". If these meanings flew by you, you're not alone: most people I've queried haven't gotten them either.

Lack of comprehension, it must be said, can always be due to the audience or to the speaker. At least one hip-hop group, Wu-Tang Clan, has expressed the opinion that the former is at play. Their Triumph lyric suggests that most people listen to hip-hop for the beats and not the lyrical content: "The dumb are mostly intrigued by the drum". I have surely fallen into this category before; come on, I listen to Super Bass when, for example, I run, and understanding the lyrics is a less pressing issue than making it up that killer flight of stairs.

But on the flip side, if hip-hop artists are taking such liberties as purposeful mispronunciation; use of auto-tune, slang, and features of African-American English; rhyming quickly; having the beats drown out the lyrics; and not publishing their lyrics, one could make the case that hip-hop artists don't really want the mainstream public (think Sophia Grace) to understand what they're saying.

It seems a less-than-intuitive marketing strategy, to say the least: why would artists interested in mass appeal want to make their songs difficult to understand? The answer largely has to do with creativity and innovation: hip-hop places a premium on these ephemera that is absent in other popular genres. For example, last fall there was a controversy about Justin Bieber being part of a cipher on BET because it was speculated that Ludacris would have written Bieb's rhymes. Outside of hip-hop, who writes the songs artists perform doesn't matter nearly as much. As I discuss in my most recent article on hip-hop, part of this creativity is using cutting-edge, in-group language and vocabulary. If you're part of an in-group seen as "cool", as soon as mainstream culture adopts the innovative language you're using, that language somehow loses its luster. This process is perfectly illustrated in a 30-second MTV cartoon about the life and death of the word bling. This clip shows the diffusion of the word bling, as it starts with Black hip-hop artists and is subsequently adopted by less and less "cool" people, until it finally "dies" upon being uttered to an elderly White woman.

There are some good reads on how and why African-American culture is currently the de facto standard of cool in the U.S.: one is Greg Tate's Everything but the Burden: What White People are Taking from Black Culture; the other, White Kids: Language, Race, and Styles of Youth Identity by Mary Bucholtz, focuses more on language matters. As the current dominant form of African-American music, hip-hop is pretty high on the coolness hierarchy. Knowing this, hip-hop artists have no need to make their lyrics accessible. In some sense, they're more concerned about shrouding what they're saying in enough secrecy to ensure that their verses won't be passé five years down the road.

Comprehension of hip-hop lyrics thus tends to operate on many more levels than rock, pop, and even country or heavy metal lyrics. Hip-hop artists must love this. If you were Nicki Minaj, wouldn't you be congratulating yourself that your song, which includes plenty of references to sex and drugs, is so iconic as to be sung by an 8-year-old British girl on Ellen, and everybody thinks it's the cutest thing ever? In disguising her verses, Nicki is able to mask her message enough that most of her listeners don't know what she's saying, giving her relative immunity to talk about whatever she wants to. But an infectious -- and comprehensible -- hook helps to ensure that we'll still like the song.


Thanks to D. Kyle Danielson for help with the analysis of Nicki's vowels. If you're interested in the spectrogram analysis of Nicki's vowels, drop me a line.


