The Techne of YouTube Performance: Musical Structure, Extended Techniques, and Custom Instruments in Solo Pop Covers

William O’Hara

KEYWORDS: popular music, instrumentality, arrangement, cover songs, YouTube, social media, public music theory

ABSTRACT: They begin with a note, a chord, the tap of a button, or the triggering of a loop: through progressively layered textures, samples, and extended performance techniques, many solo performers on YouTube construct their cover songs piece by piece before the viewer’s eyes and ears. Combining virtuosity and novelty in a package ready-made for viral online popularity, this recent and rapidly growing internet phenomenon draws together traditions old and new, from the “one man band” of the nineteenth century, to the experimental live looping of 1980s performance art, to contemporary electronic music. Building on a number of recent studies that examine the affordances and restrictions of writing and performing music on various instruments, this article argues that these YouTube performers use both music-theoretical and instrumental expertise to convey complex textures through a minimal collection of musical materials. Through their sparse, economical construction, these intricate arrangements are each the end product of a careful analysis of each song, and they have much to teach us about the harmonic, melodic, and rhythmic structures of popular music.

DOI: 10.30535/mto.28.3.9

Received December 2021

Volume 28, Number 3, September 2022
Copyright © 2022 Society for Music Theory

1. Introduction

[1.0.1] They begin with a single note, a chord, the press of a button, the triggering of a loop. Through progressively layered textures, samples, and extended performance techniques, the practitioners of an emerging genre of YouTube performance construct their cover songs piece by piece before the eager eyes and ears of their viewers. In recent years, these solo artists have created increasingly elaborate and innovative versions of their favorite songs. These intricate arrangements and performances combine virtuosity and novelty into a package ready-made for viral online popularity. Searching YouTube (an increasingly important platform for music listening⁽¹⁾) for “live looping cover” or “solo cover full band” will return hundreds of results. Each features a single performer who juggles layer after layer of accompanimental loops; who deftly conveys a complex musical texture through extended or even unconventional performance techniques; or who presents themselves in split-screen, showcasing their multi-instrumental talents.

[1.0.2] This article analyzes the phenomenon of the solo cover on YouTube, including videos that use live loops, multitracked performances, or unaccompanied or self-accompanied arrangements. The rapidly growing genre of the solo cover draws together traditions old and new, from the “one man band” of the nineteenth century, to the experimental live looping of 1980s performance art, to the techniques, equipment, and software now readily available to many amateur musicians.⁽²⁾ I will explore a series of recent, popular YouTube videos in order to study the ways in which amateur musicians craft such arrangements through a combination of creativity and music-analytical understanding that I define in terms of the ancient Greek concept of techne: a type of skilled creation or production informed by, and inseparable from, knowledge or theory. My goal in this analysis is to show how music theory and analysis intersect with creative listening, arrangement, extended performance techniques, and the technological mediations of musical instruments from the acoustic guitar to the MacBook Pro.

[1.0.3] While they feature different performance styles, more or less elaborate production values, and varying degrees of fidelity to their source material, the case studies presented in this article all have several things in common. First, each video vividly showcases both its musical materials and the material form of the instruments on which the music is rendered. These displays are often complementary: in each pairing of musical element with bodily-instrumental affordance (or vice versa), one is always perfectly matched with its complement. Second, each video features a performance that is virtuosic in several ways. These virtuosities constitute a central aspect of each video’s appeal, and they take various forms ranging from the traditional notion of dazzling instrumental technique (as we will see, for example, in the videos by Luca Stricagnoli); to a precise musical craft, manifested in the ability of these performers to command a large array of acoustic instruments and electronic devices simultaneously; to the ability to do all of this while casually projecting YouTube’s prevailing sense of one-on-one intimacy through staging, production techniques, and facial expressions.

[1.0.4] Each video also features, I will argue, what I call “music-analytical virtuosity,” or perhaps a “virtuosity of arrangement.” By these terms, I mean that each musician featured displays a knowledge of song structure and form (both in general, and in the case of their specific repertoire). In some cases, they also display a keen understanding of melody, harmony, and other musical devices, which they call upon in order to craft their arrangements. In characterizing this analytical understanding and creativity of arrangement as “virtuosity,” I am inspired by Jim Samson’s account of the concept, which begins:

Virtuosity brings into sharp focus the relationship between music’s object-status and its event-status. It marks out a relational field in which text, instrument, performer, and audience are all indispensable to defining significance. It draws the performer right into the heart of the work, foregrounding presentational strategies that are hard to illuminate through the familiar, pedigreed methods of music analysis. And it spotlights the instrument, elevating the idiomatic (the figure), a category much less amenable to close analysis than theme, harmony, and form (Samson 2003, 2).⁽³⁾

In Samson’s terms, virtuosity has often been defined by “presentational strategies” and “a focus on the instrument,” in terms of prodigious technique, the mystique or aura of the soloist, and even, as he notes later, “the dramas and discontinuities of [their] bodily activity” (Samson 2003, 78). In this essay, I am concerned with the ways in which contemporary YouTube artists engage their audiences through surprising arrangements that hinge upon technical and technological virtuosity, expressed through both instruments and the tools of recording and production. The theoretical apparatus and case studies in this essay are meant to demonstrate new ways of analyzing the rhetorics of arrangement, production, and performance, and of understanding them as arenas for virtuosic and expressive artistry.

[1.0.5] The final commonality, which follows directly from Samson’s notion of “draw[ing] the performer right into the heart of the work,” is a focus on the musicians’ labor, which each of these videos features prominently. While many are thoroughly rehearsed and expertly executed, they are built around offering the audience a peek behind the curtain (often with the implication that this peek is an unpolished moment of cinéma vérité, even if such glimpses are carefully staged). These revelations of labor demonstrate just how much preparation and work go into crafting a three-minute pop song. By foregrounding labor, these artists use their digital medium to question the boundary between a musical performance proper and the musical, intellectual, and technical labor that goes into preparing it.

[1.0.6] In what follows, I will offer a brief introduction to solo YouTube covers and establish a theoretical background for further analytical work by briefly surveying three relevant areas of inquiry: the study of instrumentality and media, previous approaches to cover songs, and the idea of music theory as techne. Then, in a series of case studies, I will illustrate the diverse ways in which these YouTube performers use theoretical and instrumental expertise—a multidisciplinary, multimodal techne—to convey complex textures through a minimal collection of musical materials. In each of these cases, the instruments themselves are carefully arranged, modified, or even created in order to make these performances possible. In their sparse construction, these intricate arrangements are each the result of careful analysis. In laying bare their own processes of creation, they function as a form of public music theory. As music-analytical statements that both rely upon and embody numerous principles of music theory, they have much to teach us about how vernacular music theories, creative forms of arrangement, and virtuosic performance all come together to create new forms of musical knowledge and experience in the digital age.

1.1 Setting the Scene: Elise Trouw’s Mashup of Radiohead and The Police

Video Example 1. Elise Trouw, “Radiohead Meets the Police,” annotated introduction (0:00–0:59)

Example 1. Transcription of Elise Trouw, “Radiohead Meets the Police,” opening (0:00–0:58)

[1.1.1] Because the topic of this article is a relatively recent and idiosyncratic genre of audiovisual media, it is worthwhile to begin with a representative example. The performance presented in Video Example 1 shows how these videos typically look and sound. In it, musician Elise Trouw uses “live looping” (a technique in which a computer or effects pedal repeats short segments of music numerous times, continuously) to build a mashup of Radiohead’s “Weird Fishes/Arpeggi” (from the 2007 album In Rainbows) and The Police’s “The Bed’s Too Big Without You” (from 1979’s Reggatta de Blanc) in real time, instrument by instrument.⁽⁴⁾ Example 1 transcribes her piecemeal construction of the song. (Here and throughout this article, brackets indicate that a fragment of music is played live and recorded, while smaller notes and the annotation “looping” indicate that the fragment is being looped from that point on.) Trouw begins the video behind a drum kit in a cluttered studio, dramatically backlit by a partially covered window. She plays four measures of the fast rock beat from “Weird Fishes/Arpeggi” (mm. 1–4); this backbeat begins to loop continuously as she puts down the drumsticks and moves purposefully across the room. Picking up an electric bass and waiting for the right moment in the groove, Trouw then plays Sting’s characteristically sparse bassline from “The Bed’s Too Big Without You” (mm. 9–13).

Example 2. Opening chord progression from Radiohead, “Weird Fishes/Arpeggi”

[1.1.2] Next, Trouw exchanges the bass for an electric guitar, on which she plays a series of arpeggios that pair the pace and picking pattern of the Radiohead song (shown in Example 2) with a modified version of the chord progression from The Police (mm. 17–20). While the Am – Bm – Em progression from “Bed’s Too Big” forms the core of these arpeggios, sevenths have been added to each chord to more closely align with the harmonic progression and picking pattern from “Weird Fishes/Arpeggi”: Em7 – F♯7 – A – GM⁷. After a few loops of her guitar introduction, Trouw begins to sing the first verse of “The Bed’s Too Big” (mm. 25–32). Because the chord progression changes for the song’s chorus, the guitar and bass loops drop out momentarily (1:38 in the full video), and Trouw accompanies herself live on the keyboard as she sings. After another verse/chorus pair (during which Trouw adds a few live guitar chords), the texture changes (3:18). She takes up the guitar again, this time standing behind the microphone, and plays the full chord progression from “Weird Fishes/Arpeggi.” She sings a verse from the Radiohead song over this accompaniment; the bass loop, which had been synchronized with the four-measure progression from The Police, drops out, leaving this section of the performance without any musical traces of The Police. That bass line never reappears, and the drums and guitar soon fall silent as well. “Bed’s Too Big” returns for one last chorus, accompanied by a few resonant chords from the keyboard, which had been played (but not yet heard as a loop) during the first chorus.⁽⁵⁾

[1.1.3] Elise Trouw’s mashup video is only one of thousands of solo performance videos available on YouTube, but it is broadly representative of one kind of musical performance that has become ubiquitous on the site. While famous musicians might sometimes partake in solo, in-home performances like these (particularly for promotional purposes, or as a replacement for touring during the COVID-19 pandemic), solo cover videos are most often produced by independent professional musicians (like Trouw) who turn to YouTube in order to build their audience or seek viral fame, or by serious amateurs sharing their hobby projects online. Like many YouTube artists, Trouw posts both original songs and covers, often done in the same loop-based style shown here (though it is worth noting that in this particular video, Trouw seems to have a collaborator recording and triggering the loops from off-camera; the incorporation of such gestures into the performance itself will be explored later).⁽⁶⁾ The format showcases her multi-instrumental virtuosity and, as I will argue throughout this article, demonstrates both a thorough technical understanding of the song(s) being performed, and a creative way of expressing that understanding. In each of the videos analyzed in this article, a given song’s melodic, harmonic, and/or formal attributes are shown clearly to the audience, and their deconstruction and reconstruction becomes an important aspect of the performance itself—perhaps even, in some cases, the defining element. 
Slickly produced videos such as this one are an expansion and refinement of the “do-it-yourself” aesthetic that characterized the early years of YouTube, but they are no longer exceptional: most of the videos I will study in this article demonstrate polished production values alongside their musical creativity.⁽⁷⁾ Finally, the participatory nature of YouTube means that Trouw’s video positions itself in dialogue with a now-established audiovisual genre; her mashup performance did not emerge ex nihilo, but is rather a response to many videos that came before, and it holds the potential to influence other musicians who may follow its example. YouTube and other social platforms are built on users responding to one another, whether implicitly or explicitly; this behavior is one aspect of what Paula Harper (2019, 1) calls viral musicking: “the production, watching, listening to, circulation, or ‘sharing’ of [musical] objects” online.

2. Theoretical Preliminaries

2.1 Techne

[2.1.1] The combination of virtuosic performance technique, creative customization or repurposing of musical instruments, and sophisticated music theoretical/analytical understanding that characterizes solo YouTube covers is best described as a form of techne. Techne, an important concept in ancient Greek philosophy, is often translated as “craft,” or sometimes as “skill” or “art”; due to its multivalence, however, it is also often left untranslated. As a very basic shorthand, techne is often defined in opposition to episteme (knowledge). As Parry (2020) argues, however, this contemporary distinction is an oversimplification: throughout history, thinkers ranging from Plato and Aristotle to Martin Heidegger (1977) have all conceived of techne as practical knowledge or the ability to produce something, which both includes and applies theoretical knowledge.⁽⁹⁾ Serafina Cuomo usefully characterizes techne as being concerned with “knowing-how rather than knowing-that” (2007, 12). As Martha Nussbaum puts it, techne “is a deliberate application of human intelligence to some part of the world, yielding control over tuche” [happenstance, or luck] (2001, 95).⁽¹⁰⁾ Reading from Aristotle and others, she describes four features of techne: universality, teachability, precision, and concern with explanation. Techne “brings precision where before there was fuzziness and vagueness” (Nussbaum 2001, 96). And perhaps most crucially for the topic at hand, techne carries with it a comprehensive knowledge of both how and why techniques work. A physician understands not only what to do in order to treat a patient, but why a symptom occurs and why the remedy works.⁽¹¹⁾

[2.1.2] Tellingly, for instance, a passage from Plato’s dialogue Sophist casts music theory itself as a form of techne. Here, the two interlocutors discuss how objects and ideas of various kinds combine, including musical sounds:

Stranger: Now does everybody know which letters can join with which others? Or does one who is to join them properly have need of art [techne]?
Theaetetus: He has need of techne.
Stranger: What techne?
Theaetetus: The techne of grammar.
Stranger: And is not the same true in connection with high and low sounds? Is not one who has the techne to know the sounds which mingle and those which do not, musical, and one who does not know unmusical?
Theaetetus: Yes.
Stranger: And we shall find similar conditions, then, in all the other techne and processes which are devoid of techne?
Theaetetus: Of course.
(Plato 1921, 399–400 / §252–253; translation modified)

Just as a poet or orator requires knowledge of grammar in order to combine words effectively, a musician’s knowledge of which notes go well together and which do not is a form of techne for Plato—not episteme. The implication of this definition is that the musician is not merely knowledgeable about which sounds go together, but that they will use that knowledge to combine tones and create effective melody or harmony. As Jonathan Sterne puts it, “techne bridges the chasm between possibility and actuality: it indexes what the musician actually does and what she or he might do, or even what she or he is capable of doing or willing to do” (2006, 92). This drawing together of knowledge and practice resonates throughout the history of music theory, the broad trajectory of which abandons the early-medieval distinction between a cantor (a practical singer or musician possessing little theoretical knowledge) and a musicus (a speculative harmonic theorist), in favor of conceptions which progress from theoretical to practical musicking, or begin to efface the distinction altogether.⁽¹²⁾

[2.1.3] My invocation of the word techne is also meant to resonate with recent theoretical studies of musical technology, which apply the words derived from this Greek root in subtle ways. I am inspired, for instance, by Jonathan De Souza’s (2017, 4) description of musical “technics,” which draws on the multifaceted words “Technik” and “la technique” from German and French, respectively, in order to draw together both technology and technique. De Souza’s word highlights the complementarity of various musical technologies (again, broadly defined) and the bodies that use them. In his study of laptop and DJ performance, Mark Butler (2014, 174–175) not only distinguishes between techniques and technologies, but also develops new shades of meaning for the latter term. In the tradition of media theory, technologies grow out of existing bodily, sensory, social, or intellectual techniques.⁽¹³⁾ Building on the work of Michel Foucault, who in his later “ethical” writings described technologies of production, of signification, of power, and of the self, Butler speaks not only of technologies in the sense of tools, devices, and electronics, but in terms of “musical technologies” as well.⁽¹⁴⁾ These technologies (such as looping, cycling, and grooving) are the “mechanisms that afford the skills, activities, and outcomes of improvised performance” (Butler 2014, 174n2). This usage embodies the spirit of techne as I intend it: a mastery of the theoretical concepts, compositional techniques, instrumental and production equipment, and musico-rhetorical strategies that make for a performance that enthralls viewers and compels them to engage by liking, commenting upon, and sharing the videos with their peers.⁽¹⁵⁾

[2.1.4] My own conception of solo YouTube performances as a kind of techne pairs Butler’s analysis of specific musical elements as conceptual technologies and De Souza’s analysis of bodily and technological musicking with the idea that these performances are also built upon careful analysis and arrangement of their source songs, and are characterized by their participation in a larger online ecosystem of viral popularity and iterative innovation. Drawing from Nussbaum, I am interested in characterizing ways of using musical skills to purposefully intervene in the (digital) world. A “techne of YouTube performance,” then, is a form of music-theoretical knowledge that exists at the intersection of analytical detail, virtuosic performance ability, practical instrumental considerations, and an awareness of one’s audience and the communicative tendencies of social media. It is a twenty-first-century form of performance practice that is both dependent upon and expressive of musical, technical, and cultural knowledge: a fusion of knowledge, skill, and production that became all the more crucial in the digitally mediated age of COVID-19.⁽¹⁶⁾ In these videos, the insights of analysis and the conditions of performance and consumption cannot be separated from one another; they come together in the form of catchy, transformatively minimalist arrangements, and the “Share” button that hovers enticingly underneath. And while the dense concentration of skills and actions that I describe here is especially noticeable in this genre and on this platform, it is by no means unique to either; indeed, it is widespread in other performance venues, and I hope that this research will help to render aspects of this often-hidden musical and communicative techne visible in other contexts.

2.2 Instrumentality and Media

[2.2.1] Over the past few decades, music theorists and musicologists have taken a strong interest in theorizing the act of performance. In the United Kingdom, major research projects such as the Centre for the History and Analysis of Recorded Music (CHARM) and the Centre for Musical Performance as Creative Practice (CMPCP) have advanced a performance-based musicology.⁽¹⁷⁾ Many music theorists have also attempted to bridge scholarly and performance perspectives, exploring how one practice can inform the other. Janet Schmalfeldt (2011), Jeffrey Swinkin (2016), Edward Klorman (2018), Daphne Leong (2019), and many others have recently explored the relationship between analysis and performance.⁽¹⁸⁾ In nearly all these accounts, the two acts are seen as complementary rather than distinct. Analysis is not merely preparation for performance, nor is performance simply tasked with expressing analytical insights; rather, each informs and deepens the other.

[2.2.2] Along with a surge of interest in the relationship between performance and analysis, a number of recent studies have specifically sought to analyze the affordances and restrictions of writing and performing music on specific instruments, from piano keyboards, to various stringed instruments, and even laptops, turntables, and electronic samplers. Jonathan De Souza calls this concept “idiomaticity”: the idea that “instruments shape players’ actions, [and] coordinated affordances and habits give rise to distinctive musical dialects made of seemingly prefabricated patterns.”⁽¹⁹⁾ Given the piano’s outsized place in the musicological and especially the music-theoretical imagination, the influence it exerts over various aspects of musical life may well go unnoticed until it is made an intentional object of scrutiny, or subverted through attention to a different interface. Emily Dolan (2012) has explored the regulative influence of the keyboard as an interface for new and experimental instruments in the eighteenth and nineteenth centuries, as well as the default configuration when commercial synthesizers such as the Moog emerged in the 1960s. The keyboard, she notes, makes novel instruments immediately intelligible for many musicians, no matter how outlandish the mechanism of sound production or the resulting tones might be. Departures from the keyboard have often been noted as emancipatory opportunities for combining new sounds in new ways (Dolan 2012, 8–9). 
Seeking such emancipation for music theory and analysis, both James Bungert (2015, 2017) and Roger Moseley (2015, 2018) have sought implicitly to defamiliarize the piano: the former by attending to the details of the performer’s physical relationship to the keyboard through the lens of phenomenology; and the latter by scrutinizing the discrete (or “digital,” with full awareness of the pun) parameters of keyboard design and music notation, and the efforts of various performers to render these systems more analog and continuous.

[2.2.3] Another distinct branch of recent music-theoretical inquiry on instrumentality investigates how non-keyboard instruments map musical space and the effects these spaces have on performance and composition. Anna Gawboy (2009), for example, has demonstrated how the two-handed arrangement of buttons on the concertina affects how its players voice their chords. Joti Rockwell (2009) has proposed a detailed transformational model for banjo technique, which uses various aspects of performance (both right-hand fingerings, and the distinctive rhythmic phenomena that arise through the instrument’s three-fingered plucking technique) not only to analyze the physical movements of banjo playing, but also as a way to understand the instrument’s unique features and their ramifications for bluegrass style (such as picking techniques and the use of the shorter and higher-pitched “drone string”). Timothy Koozin (2011), Jonathan De Souza (2018), and Nicholas Shea (2022) all build on these foundations in their analysis of rock guitar style, chronicling how open strings and fretboard spaces intersect with the shape of the human hand. And De Souza (2017) includes detailed case studies about the harmonica, alternate guitar tunings, and transcriptions and arrangements for orchestra, all exploring how music-making is conditioned by varying relationships between bodies, minds, and musical instruments.

[2.2.4] In these theoretical studies of instrumentality, music is framed as a contingent encounter between a performing body and a specific technology, the affordances of which exert an influence over the music that is made. My analysis builds on these studies by examining how YouTubers use theoretical and instrumental expertise to convey complex textures through a minimal collection of musical materials. In each of the following videos, the instruments themselves are arranged, modified, or even created to enable these performances. In the first case study, simple instruments are crafted from unexpected materials, and tailored to both the melodic and phrase structure of the music being performed. In others, conventional instruments such as the guitar are employed in novel ways: retuned, played with a single hand, flat on a table, or even two-at-a-time. Electronic musical instruments play a constitutive role in several case studies, as traditional instruments are combined with digital audio workstations (DAWs) or other recording and playback devices to supplement the live performance. And all of them are mediated by the technologies of sound recording, video production, and online distribution.

[2.2.5] Samson (2003, 78) describes virtuosity as a “two-way process,” in which the audience’s eager reception is just as important as the performer on the stage. If the twenty-first century techne that I describe can be understood as a form of virtuoso performance—novel, dazzling, multi-instrumental—then the medium of YouTube itself can be thought of as a way of filtering and distributing our view of a twenty-first century virtuoso. In a certain sense, YouTube as a medium has enabled new forms of musicking much like the instrumental affordances cited by the scholars named above. Because it enables new forms and new audiences, Christopher Cayari (2011, 24) writes, YouTube is “a technology which allows listeners to become singers, watchers to become actors, and consumers to become producers, creating new original works and supplementing existing ones.” In a later study, Cayari (2017, 471–72) also notes that some YouTubers even cross the line from amateur to professional, using their YouTube creations to supplement their income or build a career. As Kiri Miller (2012) has shown, YouTube has not only opened up new venues for musical performance; its immediacy has facilitated new forms of musical instruction. These videos join innumerable do-it-yourself instructional videos in other domains, from woodworking to cooking to interior design.⁽²⁰⁾ In much the same way as video music lessons—whether live or recorded—benefit from their one-on-one directness, the perception of intimacy is often cited as a significant factor in the success of a given video, channel, or YouTube community, from bereavement support groups to communities of experimental filmmakers who use the medium to comment on the website itself.⁽²¹⁾ And built into this platform for amateur creation is a robust set of tools and a corresponding set of evolving digital-social norms around the sharing and circulation of content. 
As noted earlier, Paula Harper (2019, 7–9) extends the work of ethnomusicologist Christopher Small (1998) when she uses the term “viral musicking” to emphasize the essential creative role played by those who watch and listen to, respond to, and pass along videos on YouTube and other platforms. All of these factors contribute to a vibrant exchange economy of digital videos, in which content creators constantly respond to one another and to their audiences, and react quickly to emerging trends in their quest for visibility and popularity. In an important sense, then, the medium of YouTube effectively makes possible the genre of the solo cover, which draws upon traditional forms of live solo performance (such as live-looping and the singer-songwriter tradition), but also uses technology to go beyond them.

2.3 Theories of the Cover Song

[2.3.1] Some scholars have positioned cover songs as one recent installment within a long history of musical borrowing. Serge Lacasse (2000, 45–47), for instance, describes covers as the latest instance of musical intertextuality that stretches from parody masses to hip-hop samples through recurring techniques like quotation, allusion, collage, and pastiche. However, most scholarly accounts of rock covers have framed them as an idiomatic rock practice. Gabriel Solis, for example, argues that covers are a “distinctive versioning practice” tied to the cultural contexts of rock’s emergence in the 1950s and 1960s, namely, its appropriative relationship to African-American musical styles like the blues (2010, 315). These forebears structure rock’s early history in a way that is very different from, for instance, orchestras performing the same piece of Classical music, or multiple jazz musicians recording the same standard. Michael Rings takes this perspective as well, arguing that jazz-enculturated listeners “will neither perceive nor appreciate [a given performance of a standard] as a remake of any other specific performance or recording” (2013, 56–57). And Ethan Hein points out that “the strength of the ‘no-covers’ rule in rap is so strong that it is hardly ever spoken,” because of the genre’s commitment to personal creativity and expression (2020, 17–18). Rather than being so commonplace as to escape notice as covers, as in jazz, hip-hop covers are practically nonexistent.

[2.3.2] Solis and Rings both build their arguments on Theodore Gracyk’s (1996) ontology of rock as a medium concerned entirely with recordings. “Covering became a common practice,” Gracyk writes in a later study, “only after the twentieth-century recording industry developed a culture in which recordings became a standard means of access to music, creating the conditions in which large numbers of people associate particular musical works with particular arrangements as interpreted by particular performers” (Gracyk 2012/13, 24). For Gracyk, the later artist’s communicative intention is what truly matters: a cover is meant as a response to another song, while a mere remake, he writes, would be “a new recording of a song that is already known by means of one or more recordings, but where there is either no expectation of, or indifference to, the intended audience’s knowledge of the original recording” (Gracyk 2012/13, 25). Rings, on the other hand, bases his analysis on the perspectives of well-informed listeners, who are familiar with both the original referent and the generic conventions of rock. Cover songs, for Rings, carry an expectation of creative transformation: a cover that exactly replicates the original recording would be considered a failure. In his analysis, Rings focuses mostly on what he calls generic resetting, in which the covering artist transplants the original into a new musical style (he mentions, for instance, the Sex Pistols’ fast and profane cover of “My Way,” from 1979). If we were to expand the scope of Rings’s analysis, we might say that he is primarily focused on what Kurt Mosser (2008, II) calls major interpretations—parodies and ironic covers (such as “My Way”)—and that he would minimize the aesthetic interest found in Mosser’s minor interpretations. 
The aesthetic pleasure a listener takes in hearing a cover version, theorizes Rings, comes from having their expectations subverted: the expressive and stylistic gap between the original and the cover version enables what he calls contrastive appreciation.

[2.3.3] Keeping these statements in mind, it is no surprise that many analyses of cover songs focus on how later performers re-interpret original recordings. Given the tendency toward generic resetting that Rings identifies, transformations of basic musical parameters like tempo, instrumentation, and harmony are to be expected. Music theorists have thus tended to focus their efforts on locating the communicative effectiveness of covers in factors less frequently approached in the discipline at large. The analysis of covers, consequently, is one of the more varied and methodologically diverse corners of the field. Kevin Holm-Hudson (2002), for instance, draws on Peircian semiotics, timbre and instrumentation, and a detailed analysis of studio recording techniques in his examination of two versions of the song “Superstar,” by The Carpenters (1971) and Sonic Youth (1994). A pedagogical account by Victoria Malawey (2010, 203–207) chronicles how she asks students to attend to parameters like texture, timbre, and tempo, alongside more often scrutinized features like pitch and rhythm. Her recent book (2020, 8–9 and passim) takes covers as a central repertoire for comparative analysis, with a specific focus on aspects of vocal performance and production. Some scholars have explored how the identity of the performer themselves can affect the “communicative intention” of a given cover song. Lori Burns and Alyssa Woods (2004) explore how Tori Amos both pays tribute to and updates Billie Holiday’s famous recording of “Strange Fruit” (1939), and subverts the misogynistic violence of Eminem’s “97 Bonnie and Clyde” (1999). Finally, Edward Klorman (2018) highlights how Cyndi Lauper’s famous version of “Girls Just Wanna Have Fun” (originally a punk song with a misogynistic message) relies on Lauper’s upbeat delivery, and indeed her very identity as a woman, for its reclamation of the problematic original.⁽²²⁾

[2.3.4] Gracyk’s arguments about the role of technological (re)production in rock music also help us to more fully understand the status of YouTube covers. “Rock is a tradition of popular music whose creation and dissemination centers on recording technology,” writes Gracyk (1996, 1). “Rock music is both composed and received in light of musical qualities that are subject to mechanical reproduction but not notational specification.” Gracyk’s ontology, of course, need not be restricted to rock music; similar statements could be made about the studio-oriented nature of various genres and subgenres of contemporary popular music, including hip-hop and electronic dance music.⁽²³⁾ But it is productive in our case to substitute “YouTube covers” for “rock” in this argument. Cover songs like those discussed in this article are entirely dependent upon recording technology, both in the sense that their conception, performance, and production take place entirely in a studio environment (however roughly defined), and because their circulation, consumption, and reception occur digitally, on a website devoted to shared, social viewing and listening, and grounded in an ethos of do-it-yourself production. This is one aspect of a “techne of YouTube performance”: such covers constitute an audiovisual genre in which traditional video and sound recording practices have been made widely accessible and incorporated into a unified audiovisual product, which is then reliant upon contemporary modes of digital communication, circulation, and appreciation.

[2.3.5] As shown by these accounts, the significance of cover songs has long been shaped by a number of cultural, musical, and technological parameters. Studying covers on YouTube adds further interpretive layers, including a second communicative medium—the visual—and several important influences from the platform itself, such as a distinct history and performance practice, the website’s sharing and social features, and YouTube’s position within the much larger entertainment ecosystem of the internet. As Carol Vernallis has theorized, YouTube operates on the principle of reiteration: “YouTube genres take up an obsessive pulse,” she writes, and cover videos are no exception (2013, 130–131). Each exemplar seems to build on the last, proposing new variations on previous ideas, musical and otherwise. YouTube covers vary widely in genre and style, but they fall into several distinct subgenres with regard to their performance forces, production quality and style, and their orientation towards their source material. Many artists operate channels filled with numerous performances that explore every facet of their own distinct performance style (as will be seen in nearly all the case studies in this article). Others build their viewership based on their personality or musicianship, and post performances in various genres and styles.

Example 3. A collage of YouTube cover performances

[2.3.6] Example 3 presents stills from several representative genres of YouTube cover. Some covers are straight-ahead, full-band performances, which mostly attempt to replicate the originals. Others are stripped down, solo acoustic renditions, while still others arrange recognizable tunes for unconventional instrumentations (such as Coldplay’s “Viva la Vida” performed as a marimba duet). Some of these covers are serious, while many others are lighthearted and humorous. Many covers adapt and re-interpret their subjects, including “mashups” of two or more songs. While some solo performances unfold in bedrooms or other domestic spaces, other artists embrace the artifice of YouTube by placing multiple video feeds together, or by green-screening or image masking themselves into a power trio, a vocal quintet, or even more.⁽²⁵⁾ Finally, there are artists who take a naturalistic approach, emphasizing their liveness or lack of effects, or perhaps their intricate, well-rehearsed audiovisual constructions, executed in single takes. Regardless of the particular style, it is clear that YouTube has become a significant digital platform for creative musical expression of all kinds, and that cover performances constitute an important aspect of musicking on the site.

3. Four Case Studies

[3.1] In the second half of this paper, I will consider four case studies that outline the diversity of covering and arrangement practices on YouTube, and illustrate how harmonic and melodic structures are expressed or modified in novel ways. All four examples are solo performances, and they demonstrate a variety of instrumentations (from acoustic instruments, to traditional synthesizers, and even an iPhone) and arrangement strategies (including live solo performance, multitracking, and live looping).

3.1 Made from Scratch: Pupsi performs Toto’s “Africa”

Video Example 2. Pupsi, “Africa”

[3.1.1] Instrument maker and musician Toni Patanen opens his cover of Toto’s “Africa” with an extended montage of himself making a series of instruments out of vegetables by hollowing them out, boring holes, and tuning them. Patanen, who goes by “Pupsi” on YouTube, makes his living partly by selling ocarinas online, though those are made much more traditionally, out of clay. Over the first 90 seconds of the video, Patanen crafts two small ocarinas out of sweet potatoes and a larger one from a butternut squash. He carves out the centers, bores holes in the sides, and tunes them with the aid of a keyboard.

Example 4. Gamuts of Patanen’s two sweet potato ocarinas

Example 5. Toni Patanen, “Africa,” first verse and chorus (2:00–2:58)

[3.1.2] The gamuts of Patanen’s paired sweet potato ocarinas are reproduced in Example 4. Patanen uses them in two different pairs, designated “Right/Left Hand #1” and “Right/Left Hand #2.” The ranges of the first pair are dictated by the distinct registral shift that characterizes each line of “Africa’s” verse, while the second pair correspond to the chorus (see Example 5). The first half of each line of the verse explores the upper fourth of the major scale, while the second half lies only in the lower fifth. This division allows Patanen to perform the song’s verse relatively easily on two small sweet potatoes, switching from one to another at a natural break point. The final line of the verse (mm. 12–13) then requires Patanen to switch to the higher pair of instruments, and the video cuts to a full-frame image of him using them (see 2:39 in Video Example 2). When the chorus arrives, the higher left-hand ocarina plays first (mm. 15–18), while the lower right-hand ocarina takes over in m. 19. The note B4 in m. 20, indicated with parentheses, is a departure from the original recording, and is used because the intended note, A4, is a step below the range of RH2.

[3.1.3] In a sense, these homemade ocarinas themselves embody a simple analysis of “Africa’s” melodic construction, which is then reflected directly in Patanen’s arrangement, the first few measures of which are transcribed in Example 5. The design of the instruments is dictated by the melodic structure of the song, its distinct registral breaks, and the pitch spaces it outlines—except in the case of m. 20, where the situation is reversed, and the limitations of the homemade instrument necessitate a small change to the melody. “Africa” generally lacks chromatic pitches, using primarily the notes found in the major scale and thus making possible a limited instrumental gamut, though the chorus’s prominent A natural is reflected in the highest ocarina’s (LH2) departure from the prevailing B major scale.

[3.1.4] To focus on the performance and the arrangement, however, tells only half the story. Patanen’s video begins by showing us how the instruments came about, in a sequence that lasts nearly half the video’s running time, and thus strains the definition of “introduction.” The vegetables are deposited roughly on the table, and then transformed into instruments through a series of fluid movements and a rapidly cut montage. We watch Patanen tune each instrument against his keyboard and demonstrate their gamuts. Finally, one minute and forty seconds into a four-minute video, the song itself begins. From that point onwards, Toto’s “Africa” is played in a recognizable fashion. The screen splits three ways to show the sweet potato’s bifurcated melody, the butternut squash’s bass line, and occasionally percussion and an additional sweet potato. The latter is performed with a pair of hollowed-out potatoes, high and low, held together in one hand like an agogô bell. The song itself is truncated; after a full verse and chorus, we get the point, and without lyrics there is little need to hear every verse of “Africa”—particularly since one reason for the choice of song, along with its simple diatonic gamut and registral break, is almost certainly its ubiquity online. Over the past several years, “Africa” has become something of a running joke on the Internet. 
As many of the articles highlighting the virality of Pupsi’s performance mention, the 1982 hit single enjoyed a resurgence with a 2018 cover by the popular band Weezer—a cover that was itself instigated by an ironic fan campaign on Twitter, spurred by users who chose it precisely because it is so culturally disfavored: Rolling Stone recently called the song “ridiculous by definition” (Sheffield 2018), while others have drawn attention to its cultural insensitivity.⁽²⁶⁾ So, along with its novel format, Patanen’s cover is a re-presentation of a song that has become an internet cliché, or a “meme,” as Paula Harper (2019, 18–21) uses the term: a cultural object that is meant to call to mind other variations on the same theme, and which draws much of its humorous meaning from that relational network (in this case, years of tongue-in-cheek renditions of the song).⁽²⁷⁾ Of the circulation and adaptation of repeated signifiers within such memes, Jean Burgess writes, “after becoming recognizable through this process of repetition, these key signifiers are then available for plugging into other forms, texts, and intertexts—they become part of the available cultural repertoire of vernacular video” (2014, 91).

[3.1.5] Audience reception, in the form of news articles and YouTube comment sections, can give some indication of how videos like these are received. The humor in Patanen’s video comes in equal parts from how far Patanen is willing to carry the joke, and our astonishment at how well he actually pulls off the act. Expert execution is essential to the joke as well—if the vegetable ocarinas didn’t work, or he could not play them well, the video would fall flat.⁽²⁸⁾ As it is, however, Patanen’s performance was a hit: in its first year online, it was viewed nearly seven million times, and as of this writing has reached nearly nine million views.⁽²⁹⁾ As is often the case with popular YouTube videos, the link has been circulated and recirculated by various websites hoping to entertain their readers, or to climb to the top of the search results as users try to find the video itself. Many websites embed the video from YouTube, surrounded by numerous advertisements and a few inches of copy lauding the latest viral sensation.⁽³⁰⁾ “Finnish ocarina maverick plays the Toto classic Africa using the orchestra in his fridge,” goes the headline on the classic rock news website Louder (Lewry 2019). “Toto’s ‘Africa’ played on a sweet potato and squash is beyond mesmerizing,” another website proclaims. “Wait till you hear how it sounds” (Rock Pasta 2019). These web promotions are surely in on the joke as well. “It’s pretty stellar stuff,” writes Marcus Gilmer (2019) on the popular blog Mashable. “And probably delicious, once baked at 350 degrees for 25 minutes with some salt and seasoning.”

Video Example 3. J. Views, “J. Views Playing ‘Teardrop’ with Vegetables”

[3.1.6] In seriousness, however, I have chosen this example not only for its musical and presentational qualities, but for how well it exemplifies the entire genre of the solo YouTube cover. Comparison with a similar video is instructive: Video Example 3 features a performance by Brooklyn-based musician J. Views, who uses a synthesizer kit called a Makey Makey to perform a live-looped cover of Massive Attack’s “Teardrop” on a collection of fruits and vegetables.⁽³¹⁾ The two videos begin in the same way, by setting the performance up with an initial sequence that offers a glimpse behind the scenes and conjures a sense of anticipation. J. Views’s video begins with a mundane scene: he strolls through a Brooklyn Whole Foods, filling a basket with future musical instruments. “Bass drum,” he says to the camera, thumping a resonant eggplant. Carrots for the hi-hat, grapes for bells. The video then cuts to J. Views’s apartment, where he carves a few of the veggies up, and explains what the Makey Makey is as he wires four strawberries together into a Frankensteinian synthesizer.⁽³²⁾ J. Views gives the camera a thumbs up at 0:58, and a brief fade to black lets us know that the prologue is over and the performance is beginning. Patanen, too, fades to black and begins his performance in a new visual format, featuring multiple angles of himself in split-screen. The narrative traced by both videos—everyday objects are selected, modified into a form capable of creating music, and then used to perform a familiar song—is broadly recognizable among many YouTube videos, from educational lessons to tutorials for do-it-yourself projects. In fact, when they are viewed alongside Elise Trouw’s mashup of Radiohead and The Police, a genre begins to coalesce around their shared two-part structure, which juxtaposes a performance with a glimpse of how it was put together. For Trouw, the origin is in individual loops, while for Patanen and J. Views it is in the construction of the instruments themselves. Trouw’s continued accumulation of loops also avoids delineating the boundary between preparation and performance. As we will continue to see in the case studies to come, covers which depend on dramatic musical transformations tend to highlight their own genesis far more often than do performances featuring, for example, a full band recorded live or in studio, or a solo acoustic rendition.

3.2 Luca Stricagnoli Performs Michael Jackson’s “Thriller” and Metallica’s “Fade to Black”

[3.2.1] Italian guitarist Luca Stricagnoli maintains a YouTube channel filled with covers of popular songs, many of which feature him playing two guitars at once. He accomplishes this through a combination of careful musical arrangement, instrument modification, and various extended performance techniques. Two of his arrangements will serve to illustrate how his performance style is made possible through a detailed understanding of the music at hand, which shapes the particular instruments he uses, and the ways that he prepares them.

Video Example 4. Luca Stricagnoli, “Thriller”

[3.2.2] Stricagnoli’s performance of Michael Jackson’s “Thriller” uses a physical arrangement that is common on his YouTube channel. He wears one guitar in a conventional manner, while placing a second horizontally on a table in front of him. As shown in Example 6, he performs the bass line on the guitar that he holds, while performing the melody on the tabletop guitar, which has been retuned in diatonic steps. Example 7 shows a standard six-string guitar tuning for reference, along with the retunings of both guitars in “Thriller.” The first is straightforward—a standard tuning lowered by six semitones (and pictured in bass clef for legibility). The tabletop guitar has had its pitch raised by a capo—a device (visible in the screen capture) that stops the strings of the guitar at a higher fret, effectively raising the pitch of the open strings. A close examination of the tabletop guitar in the video reveals that it has been restrung with lighter-gauge strings in order to accommodate its higher pitch.⁽³³⁾
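
The arithmetic behind the first retuning can be sketched in a few lines of Python. This is only an illustration, not a transcription of Example 7: it assumes the conventional MIDI numbering of standard tuning (E2 through E4), takes the six-semitone figure given above at face value, and spells the results with sharps, a choice that is arbitrary here. The names `STANDARD_TUNING`, `transpose`, `intervals`, and `note_name` are hypothetical.

```python
# Standard six-string guitar tuning, low to high, as MIDI note numbers
# (E2 A2 D3 G3 B3 E4). MIDI numbering is an assumption made for illustration.
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(midi):
    """Spell a MIDI number with sharps, using the convention that 60 is C4."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def transpose(tuning, semitones):
    """Shift every open string by the same number of semitones."""
    return [pitch + semitones for pitch in tuning]


def intervals(tuning):
    """Semitone distances between adjacent strings."""
    return [high - low for low, high in zip(tuning, tuning[1:])]


# The held guitar as described above: standard tuning lowered by six semitones.
lowered = transpose(STANDARD_TUNING, -6)

print([note_name(p) for p in lowered])
print(intervals(STANDARD_TUNING) == intervals(lowered))
```

Because every string moves by the same amount, the intervals between adjacent strings are unchanged, so a uniform transposition of this kind should leave familiar left-hand shapes intact while shifting the whole instrument into a lower register.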

Example 6. Stricagnoli playing “Thriller” on two guitars (screen capture by the author from YouTube)

Example 7. Standard guitar tuning and Stricagnoli’s retunings for “Thriller”

Example 8. Stricagnoli’s rendition of “Thriller,” bass line

Example 9. Transcription of Stricagnoli’s performance of “Thriller,” verse 1 (0:00–0:52), showing “vocal” melody with accompaniment

[3.2.3] There are several things to note about Stricagnoli’s performance. The first is his one-handed rendition of “Thriller’s” memorable bass line, which is transcribed in Example 8. Using a combination of hammer-ons, pull-offs, and left-handed plucking, he is able to perform the bass line with only one hand, on the guitar that he holds traditionally.⁽³⁴⁾ Another important aspect of the performance is the fact that the melody from “Thriller’s” verse contains only six notes (as many as a standard guitar has strings), making possible a performance on the tabletop guitar using only re-tuned open strings. This second guitar also has a piece of plastic attached to it, on which Stricagnoli taps the backbeat with his thumb in between melodic notes. So, while one hand combines a series of techniques that are in the guitarist’s standard skill set, the other uses a performance technique more familiar to harpists than guitarists to render the melody in ringing, open strings (see Example 9). For each chorus and the bridge (both of which have a higher tessitura than the verse), finally, Stricagnoli plays the first guitar with more standard two-handed technique, rendering both melody and harmony clearly in a style most closely associated with classical guitar.

[3.2.4] Stricagnoli’s cover of Metallica’s “Fade to Black” (1984) is even more intricate. It is based on the same arrangement of guitars: one held conventionally, and one on a table. To this familiar pairing, Stricagnoli adds several additional tools. He wears a thumb pick on his right hand and uses a capo on the tabletop guitar, raising its pitch by six half-steps. Finally, at the beginning of the song, he employs an “EBow”—a small electronic device that generates a magnetic field to continuously vibrate a guitar string—to create a sustained resonant tone reminiscent of the original recording’s strings.

Example 10. Tunings for 7-string tabletop guitar (top) and conventionally held guitar (bottom)

[3.2.5] The second guitar of “Fade to Black”—this time a seven-stringed “soprano” instrument—again uses an alternate tuning: as shown in Example 10, its strings (when stopped by a capo) express the top seven notes of a B natural minor scale.⁽³⁵⁾ The seven-string guitar provides many of the notes needed for the song’s melody and its opening solo. Notably, however, it lacks the tonic at the bottom of its gamut; this note will routinely be filled in by the other guitar’s highest open string. Unlike in “Thriller,” that traditionally held guitar is also retuned. Example 10 compares this guitar’s tuning to a conventional tuning. The lower four strings outline a B minor chord, which features prominently in the song’s accompaniment, while the top two strings provide a higher A and B, which tend to play a melodic role within Stricagnoli’s hand-crossing gestures (see mm. 4–5, 9, and 13–15 in Example 11 below).

Example 11. Luca Stricagnoli, “Fade to Black,” Introductory Guitar Solo with accompaniment (0:00–0:45)

Example 12. Metallica, “Fade to Black,” Introductory Guitar Solo (0:15–0:50)

[3.2.6] Example 11 transcribes the first 45 seconds of Stricagnoli’s “Fade to Black” cover; Example 12 provides Metallica’s original for reference. Each stave represents one of the two guitars. In a technique familiar from his “Thriller” performance, Stricagnoli begins by performing the song’s introduction (which, in the original, was already performed on acoustic guitar) with the conventionally held guitar. He uses only one hand, again using hammer-ons and pull-offs to articulate the arpeggiated figure. The division of musical labor that obtained in “Thriller” is mostly continued here—accompaniment on the held guitar, melody on the tabletop—although in “Fade to Black,” the melody occasionally crosses over to the accompanimental guitar: even with the additional string, the tabletop guitar does not provide all the notes necessary to perform the opening guitar solo. As noted above, Stricagnoli several times uses the two highest strings on the conventional guitar—A and B—to fill in the melody’s lower register. The arrows throughout the example trace how the melody passes from one guitar to the other. Sometimes these notes are performed by the left hand, as double stops along with the accompaniment; those cases are indicated with downward stems, and an “open string” notation (°). However, Stricagnoli also reaches up and plucks these open strings with his right hand. These notes are indicated in the lower staff with upward stems and brackets marked “R.H.” The manner in which the two highest strings on the guitar are used sets them apart from the other strings. Tuned only a whole step apart from one another, and separated by a fifth (a larger interval than is ever found in a standard guitar tuning) from the other strings, these high strings complete the tabletop guitar’s B minor scale gamut, an impression only reinforced by the fact that the song’s accompanimental arpeggios only rarely reach the high A, and never the B.
It is surely no coincidence that the two notes, adjacent rather than separated by a more conventional leap in their tuning, are also the roots of the song’s two most frequently used chords, B minor and A major.

[3.2.7] Recent approaches to alternate guitar tunings (or AGTs, after Kaminsky and Lyons 2020) emphasize their affordances for performers and songwriters. Examining the work of jazz guitarist Kurt Rosenwinkel, for instance, Jonathan De Souza (2017, 88–97) notes that AGTs productively force a performer or improvisor into new musical patterns by disrupting the well-learned connections between physical gesture and sonic result. “One twist of a tuning peg can turn you into a beginner in an instant,” writes Christian Rover (2006, quoted in De Souza 2017, 88), referring to Rosenwinkel’s alternate tunings as “voluntary self-sabotage.”⁽³⁶⁾ Following David Lewin (1998) and Steven Rings (2011), De Souza applies transformational theory both to the analysis of harmonic progressions built on alternate tunings and to the alternate tunings themselves in relation to standard guitar tuning. Kaminsky and Lyons (2020) analyze the use of AGTs in the music of Joni Mitchell, who in interviews has discussed retuning her guitar in search of new sounds, and in compensation for her left hand’s reduced range of motion due to a childhood case of polio. Through a chronological corpus study of her oeuvre, Kaminsky and Lyons identify characteristic chord shapes and fretboard gestures that Mitchell has used throughout her career, and apply them in counterpoint with text and vocal melody to produce rich hermeneutic interpretations of her songs.

[3.2.8] In the performances discussed here, however, the retuned open strings of Stricagnoli’s guitar have a different effect than do alternate tunings of a conventionally played guitar. Recall that due to his unusual technique, only the open strings of each tabletop guitar are available—six for “Thriller,” seven for “Fade to Black.” Therefore, rather than creating new affordances or engendering a fresh awareness of musical space due to unexpected inter-string intervals, Stricagnoli’s tabletop guitar retunings effectively turn the instrument into a tool best suited for performing a specific song. In one respect, they do act similarly to the retunings discussed by De Souza, in that they disrupt the performer’s traditional, habitual relationship with their instrument’s “enactive landscape” (2017, 81). But instead of encouraging refreshed engagement in the service of composition or improvisation, they allow—either through an act of memorization, or through deeply practiced re-training—novel ways of executing existing music. The retunings do introduce certain obstacles to performance, however. As shown by the circles in Example 12, some notes and figures from the original solo are difficult or impossible to execute with Stricagnoli’s one-handed technique. The circled Bs in m. 13, for instance, lie below the compass of the tabletop guitar (the lowest string of which is tuned to C♯), and are only available by cheating a hand up to the held guitar. Similarly, the triplet ornamentation in m. 15 would require fretting a note on the tabletop guitar, and would be unwieldy—though admittedly not impossible, given the tuning—to execute with open strings alone.
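
The division of the “Fade to Black” melody between the two instruments can be modeled as a simple lookup. The following Python sketch is a hedged illustration only: it derives the tabletop gamut from the description above (the notes of a B natural minor scale above the tonic, with C♯ as the lowest string) and adds the held guitar’s open A and B a step and a third below it. The absolute octave (B3 as MIDI number 59) and the names `TABLETOP`, `HELD_TOP_STRINGS`, and `route` are assumptions introduced for the example, not details taken from the video or from Example 10.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
NATURAL_MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # whole and half steps, in semitones


def natural_minor_octave(tonic):
    """Return the eight pitches of one octave of natural minor, tonic to tonic."""
    pitches = [tonic]
    for step in NATURAL_MINOR_STEPS:
        pitches.append(pitches[-1] + step)
    return pitches


def pitch_class(midi):
    """Name a MIDI number by pitch class only, spelled with sharps."""
    return NOTE_NAMES[midi % 12]


TONIC_B = 59  # B3 as a MIDI number; the octave placement is an assumption

# Tabletop guitar: the seven scale notes above the tonic (C# up to B).
TABLETOP = natural_minor_octave(TONIC_B)[1:]

# Held guitar: its two highest open strings, A and B, a whole step apart.
HELD_TOP_STRINGS = [TONIC_B - 2, TONIC_B]


def route(pitch):
    """Say which instrument supplies a melody pitch as an open string."""
    if pitch in TABLETOP:
        return "tabletop"
    if pitch in HELD_TOP_STRINGS:
        return "held"
    return "unavailable on open strings"


combined = sorted(set(TABLETOP + HELD_TOP_STRINGS))

print([pitch_class(p) for p in TABLETOP])
print(route(TONIC_B))
print(len(combined))
```

Under these assumptions the lower tonic is routed to the held guitar, which is the situation of the circled Bs in m. 13, and the two instruments together offer nine open-string pitches, seven of them on the tabletop guitar.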

[3.2.9] Even more than the examples considered up to this point, Stricagnoli’s videos exemplify what is best described as a virtuosity of arrangement. By this, I mean that two separate virtuosic activities underlie Stricagnoli’s cover performances. First, there is the combination of instrumental modification and extended performance techniques, such as the unusual second guitar laid out on the table, and the combination of left-hand articulatory techniques used to execute the “Thriller” bass line. Second, there is the fact that those performance techniques are themselves made possible and useful by a careful analysis of each song, which dictates the terms of the performance. In much the same way as Toni Patanen divides “Africa” into two registral halves for each line of the verse, Stricagnoli’s performances are made possible by the realization that he must have had—whether through trial and error, or an intentional exploration of the melodic and harmonic structures found in the verses of each song—that he could re-tune a guitar to play the melody on the open strings, thereby opening the door to the right-handed tabletop performance. Elements that make both “Thriller” and “Fade to Black” suitable for these dual guitar performances include their textures (both feature a clear differentiation between repetitive accompanimental ostinati and melody) and their relatively restricted melodic gamuts (“Thriller” uses only six notes, and “Fade to Black” only nine, seven of which are covered by the tabletop guitar). The notion of a virtuosic arrangement draws together both of these aspects—music-analytical insight and the technique necessary to render those insights in sound—in order to characterize these solo performances as both musical performances and music-analytical demonstrations.

3.3 Kawehi Performs Nirvana’s “Heart-Shaped Box”

Example 13. Kawehi performing keyboard, vocals, and live loops (screenshot by the author from YouTube)

Example 14. Kawehi’s sketch of her performance setup

[3.3.1] Kawehi is an independent singer-songwriter based in Lawrence, Kansas. In addition to touring and recording, Kawehi has been an active YouTuber since 2013. Her videos are often technically and musically creative, chronicling her solo performances of both cover and original songs. Some of Kawehi’s performances take advantage of multitrack recording, allowing her to perform as an entire ensemble (she is the vocal quintet pictured in Example 3, above), while others employ “live-looping” technology. Live-looping (also seen in Elise Trouw’s video at the beginning of this article) is a practice in which short segments of music are recorded and then played back, so that a musician may accompany themselves. Long an analog practice associated with performance artists like Laurie Anderson (and dependent on literal tape loops, before the invention of digital looping devices in the 1990s), live-looping is now popular among indie musicians on YouTube. Contemporary musicians control their loops with either floor pedals or computers, using software such as Ableton Live or MainStage.
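
The basic logic of live-looping, in which a segment is recorded once and later layers are added on top of it while it repeats, can be suggested with a toy model. The Python sketch below is an assumption-laden illustration and does not describe how Ableton Live, MainStage, or any hardware looper is implemented: the class name `Looper`, its methods, and the use of short lists of numbers in place of audio are all hypothetical.

```python
class Looper:
    """Toy live looper: one fixed-length buffer onto which layers are overdubbed."""

    def __init__(self, length):
        self.length = length            # loop length in samples (or beats)
        self.buffer = [0.0] * length    # silence until something is recorded
        self.layers = 0

    def overdub(self, samples):
        """Mix a new layer into the loop; it must match the loop length."""
        if len(samples) != self.length:
            raise ValueError("a layer must be exactly one loop long")
        self.buffer = [old + new for old, new in zip(self.buffer, samples)]
        self.layers += 1

    def play(self, cycles):
        """Return the loop repeated for the given number of cycles."""
        return self.buffer * cycles


# Hypothetical four-step loop: a "kick" on step 1, then a "snare" on step 3.
looper = Looper(length=4)
looper.overdub([1.0, 0.0, 0.0, 0.0])
looper.overdub([0.0, 0.0, 1.0, 0.0])

print(looper.layers)
print(looper.play(cycles=2))
```

Each overdub changes what every subsequent repetition sounds like, which is why performances of this kind tend to unfold as a gradual accumulation of texture.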

[3.3.2] Kawehi’s equipment is integral to both her performances and her online persona. In her self-presentation to her fans, she often posts behind-the-scenes videos of herself preparing to go on tour, or demonstrating her tools.⁽³⁷⁾ Example 13 is a screenshot from a performance of Nirvana’s “Heart-Shaped Box,” posted to YouTube in 2014. Example 14 is an image from Kawehi’s Instagram account that depicts the array of equipment seen and heard in the performance. At its center, from a technological perspective, is a laptop computer running Ableton Live, a digital audio workstation designed for recording, managing, and playing multiple tracks of audio in live performance. Kawehi controls a series of looping tracks in Ableton by means of a Novation “Launchpad” MIDI device, the glowing buttons of which are visible on the right side of Example 13. (She also uses a foot pedal, visible in the diagram but not seen in the video.) A MIDI keyboard is before her on the table and a microphone extends horizontally from an offscreen stand. These are her two central musical tools. The microphone is fed through a pre-amplifier and an effects processor, which will be used throughout the first part of the performance to add harmonies to Kawehi’s backing vocals in real time.

Video Example 5. Kawehi, “Heart-Shaped Box,” annotated introduction (0:00–2:03)

[3.3.3] Kawehi’s performance of “Heart-Shaped Box” (presented as Video Example 5) opens with the tangible sense that we are somehow “before the beginning” of the show. Kawehi looks askance at a second camera, which bobs as if its tripod is still being adjusted. She speaks into the microphone: “Yup, yup, is this thing on?” As the harmonizer splits her voice into cacophony, she demonstrates the tool that will underlie nearly her entire performance: an effects processor that harmonizes with her voice in real time. She sings triumphantly in response to her own question: “Yeah!” Given a pitch to grab onto, the black box’s voices coalesce into a chord, supporting her cry with a deep bass tone an octave below her fundamental, and a piquant minor third above.

Example 15. Transcription of Kawehi, “Heart-Shaped Box,” introduction (0:13–2:03)

[3.3.4] As the video begins, we see and hear nearly two minutes of Kawehi recording and looping musical fragments: keyboard drones, synthesized guitar riffs, backing vocals, and beatboxing. Example 15 presents a transcription of this musical mise en place; as in Example 1, brackets indicate music being performed and sampled live, while small notes (when possible) indicate loops.⁽³⁸⁾ After counting off a tempo, Kawehi first records a simple, two-measure vocal loop: two notes, A to F♯, harmonized automatically with a third above and an octave below (mm. 1–2). Her computer plays it back immediately, and she listens; satisfied, she records two complementary bass notes on her synthesizer, filling in a pair of triads (mm. 5–8).⁽³⁹⁾

[3.3.5] These opening motives form a template that will be reiterated throughout the introduction: each two-measure fragment is performed, and immediately played back as Kawehi prepares the next loop. The order in which musical fragments are recorded is significant: as the performance develops, it will become clear that Kawehi has not begun with the first notes of the song but has instead begun at the end of its form. “Heart-Shaped Box” uses only two different progressions, which happen to be closely related to one another. As Example 15 shows, the verse and the chorus follow the same progression, while a brief postchorus holds its second and third chords for a measure each. This postchorus progression also underpins the brief guitar solo that comes after the song’s second chorus.

Example 16. Common tones in “Heart-Shaped Box,” verse and postchorus

[3.3.6] After these chords, she records vocal percussion, creating the only loop that will persist until the end of the song. After she creates the vocal percussion loop, Kawehi stops the bass line and backing vocals in order to replace them with two new vocal loops that will underpin the verse and chorus (mm. 13–14 and 16–17).⁽⁴⁰⁾ While the postchorus’s vocal dyads (A-C and F♯-A) implied an alternation in harmonies, the verse’s static minor third is more ambiguous, and thus more versatile. This will become an essential feature of the performance: the A-C dyad is effective at knitting together “Heart-Shaped Box” because of the song’s third-related chord progression. As shown in Example 16, A and C form the bottom third of A minor, and the top third of both F major and D⁷. As in the other videos examined thus far, this parsimonious relationship is both a musical feature that makes such a minimal arrangement possible for Kawehi to execute, and a feature that is highlighted, quasi-analytically, by its prominence in that arrangement. The A-C dyad is only silenced when the harmony changes for each brief postchorus segment; its replacement, as noted above, is similarly parsimonious.
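
The common-tone relationship described in this paragraph can also be checked mechanically. What follows is a minimal sketch rather than part of the original analysis: it spells each chord from the root upward, converts note names to pitch classes (C = 0), and confirms that the A-C dyad is contained in all three chords and where it sits in each. The chord spellings follow the text; the names PITCH_CLASS, CHORDS, shares_dyad, and dyad_position are illustrative inventions.

```python
# Pitch classes as integers mod 12 (C = 0). All names here are illustrative.
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "F#": 6, "A": 9}

# Chords named in the text, spelled in thirds from the root upward.
CHORDS = {
    "A minor": ["A", "C", "E"],
    "F major": ["F", "A", "C"],
    "D7": ["D", "F#", "A", "C"],
}

DYAD = ["A", "C"]


def pcs(notes):
    """Convert a list of note names to a set of pitch classes."""
    return {PITCH_CLASS[n] for n in notes}


def shares_dyad(chord_notes, dyad=DYAD):
    """Return True if every note of the dyad is a chord tone."""
    return pcs(dyad) <= pcs(chord_notes)


def dyad_position(chord_notes, dyad=DYAD):
    """Say whether the dyad is the chord's lowest or highest stacked third."""
    if chord_notes[:2] == dyad:
        return "bottom third"
    if chord_notes[-2:] == dyad:
        return "top third"
    return "other"


if __name__ == "__main__":
    for name, notes in CHORDS.items():
        print(name, shares_dyad(notes), dyad_position(notes))
```

Modeling chords as sets keeps the check independent of register and voicing, which is the sense in which the looped dyad can remain in place while the bass notes change beneath it.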

[3.3.7] Two synthesizer loops follow: the first (mm. 21–22) completes the verse/chorus chord progression, while the second breaks the two-measure template for the first time. Applying an electric guitar patch and leaning heavily on her keyboard’s pitch bend wheel, Kawehi performs the song’s chorus (“Hey! Wait! I’ve got a new complaint”) in instrumental form. This new melody is eight measures long: the first four measures complete one iteration of the song’s thrice-repeated chorus, while the last four segue into the postchorus (m. 29). The “guitar” loop is not only longer than any single musical idea heard in the video so far; it is also the first to transcend the boundaries of the song’s formal sections, crossing from the repeated verse/chorus progression to the postchorus. The fact that the accompanimental loops change under the lead guitar riff even as Kawehi is still performing it for the first time reveals why the recording session began with the postchorus: so that these loops could be held in reserve, allowing Kawehi to quickly change the chord progression while recording the second half of this riff.

[3.3.8] After completing the eight-measure guitar riff, Kawehi allows it to loop only halfway through: she cuts it and nearly all of the other loops off in order to create a transitional measure (m. 37), and then begins “Heart-Shaped Box” in earnest. From there, the performance closely follows the form of the original song (including the brief guitar solo over the postchorus progression, which Kawehi performs live on her synthesizer, but omitting the original’s repetition of the first verse). The opening two minutes of introductory material, however, cast the rest of the song in a very different light than a more conventional cover performance would. Videos like Kawehi’s first anatomize the music for the audience, laying out its constituent parts like so many pieces of a complex device on a table. The performance becomes a matter of Kawehi re-assembling those pieces in order, even as she sings and plays lead guitar riffs live. In this video, the transition to the song proper is marked by a measure of near-silence from most of the virtual ensemble, and the entrance of the lead vocals. In some of her live performances, however, she marks the distinction between preparatory recordings (done in full view of the audience yet still preliminary) and the “real” performance more distinctly through the stage set: her synthesizer and one microphone often face sideways and are used for recording loops, while a second microphone faces the audience and is used for lead vocals.⁽⁴¹⁾ Kawehi’s turn towards the audience thus marks the moment when she truly begins performing to them.

[3.3.9] Kawehi’s performances and the juxtaposition they execute between preliminary work and the notion of a self-contained performance call to mind Susan McClary’s (1991) account of Laurie Anderson’s performance art. “[Anderson’s] compositions rely upon precisely those tools of electronic mediation that most performance artists seek to displace,” writes McClary,

In order to put this aspect of her work into perspective, it is important to recall that most modes of mechanical and electronic reproduction strive to render themselves invisible and inaudible, to invite the spectator to believe that what is seen or heard is real. By contrast, in Laurie Anderson’s performances, one actually gets to watch her produce the sounds we hear. But her presence is always already multiply mediated: we hear her voice only as it is layered upon itself by means of sequencers. . . . The closer we get to the source, the more distant becomes the imagined ideal of unmediated presence and authenticity (McClary 1991, 137).

Much of McClary’s account can be applied directly to Kawehi: the subversion of digital technology’s pretense of effortless immediacy; the layering of sounds over and around herself; the foregrounding of the mechanisms of recording and reproduction. To this, in the case of a cover song, we might add elements of virtuosic arrangement that we have already identified: the piecemeal construction that dramatizes the process of arrangement, that analyzes for us in reverse. Like Anderson’s performance art, live-looping covers like Kawehi’s are not only dazzling technical displays; they can be read as critical work in themselves, in this case deconstructing the traditional narratives embodied in the very construction of a pop song, and demonstrating the performer’s ability to rely on or subvert expectations about arrangement and song structure.

[3.3.10] For example, consider the attention paid in Kawehi’s video to the postchorus. As Mark Spicer describes it, a postchorus is “a brief, self-contained passage that can be heard as a departure from the chorus and yet does not serve merely as a transition” (2011, [9]). Alyssa Barna (2020), by contrast, treats the postchorus as a more transitional formal function, arguing that it need not stand as an independent formal section; she styles the word with a hyphen, in order to emphasize its dependence on the chorus.⁽⁴²⁾ In Nirvana’s original recording, the line is a mere turnaround, an extension of the song’s core chord progression as Kurt Cobain repeats the last few words of the chorus (“. . .your advice, your advice. . .”). These four measures wind the chorus down, but do not approach Spicer’s definition of “self-contained,” nor are they fully independent from the chorus.⁽⁴³⁾ Kawehi’s rendition, however, promotes these measures to greater structural prominence by elevating them in three ways. First, the postchorus is set apart by the unique accompanimental loops required: the vocal dyads (mm. 1–2) and synthesizer whole notes (mm. 5–6), which are used for no other section of the song. Second, those postchorus loops are emphasized by their prominent position in the introduction, as the first two loops we hear. Finally, in Kawehi’s version of “Heart-Shaped Box,” her interpretation of the song’s brief guitar solo appears over the postchorus harmonies rather than the verse-chorus progression, as it did in Nirvana’s original recording. Its status as an independent section is thus reinforced by the role it plays in underpinning a distinct formal feature of the song—the guitar solo—in addition to its appearance at the trailing end of each chorus.

[3.3.11] Kawehi’s live-looping arrangement of “Heart-Shaped Box” is thus not only a cover performance, but a tool for analysis, in that it draws attention to a subtle feature of the music, allowing us to hear it in a different way. In Spicer’s formulation, the postchorus is always already an ambiguous formal space: some might argue for other terms (such as “chorus-ending refrain”), he notes. And the example he chooses to illustrate the concept—Lady Gaga’s “Bad Romance” (2009)—is itself subject to ambiguity. In the studio recording, the song begins with the second half of the chorus, followed by the postchorus before a verse has been heard. In live performances, however, Spicer notes that Gaga often begins directly with the postchorus. In his analysis, the segment nonetheless retains its formal identity even in the absence of an initial preceding chorus, thanks to the role it is heard to play later in the song.⁽⁴⁴⁾ Similarly, Kawehi’s arrangement declares the postchorus to be an important formal event by its repeated appearances, and even its substitution beneath the guitar solo, overriding the less prominent role it plays in Nirvana’s original. In both cases, the arrangement itself subtly changes the formal status of part of the song being covered.

[3.3.12] The introductory “recording” phase of “Heart-Shaped Box” also draws attention to the song’s hook—or lack thereof—and uses its absence as a way of structuring the listener’s experience of the song. The notion of a “hook” is central to the composition and criticism of popular music. A hook is a central musical or lyrical idea, brief and memorable, that both identifies and represents a song, and draws the listener in.⁽⁴⁵⁾ Many songs place the hook front-and-center, while others reserve it for the first or last line of the chorus, or perhaps the end of the verse. In the case of “Heart-Shaped Box,” the first line of the chorus serves as the hook. As Theodore Gracyk (2012/13, 25) has written, the communicative potential of a cover song comes from the expectation that the audience is familiar with the original. As the introduction begins, however, Kawehi’s performance relies not on the audience’s recognition of the song, but rather on conjuring a sense of mystery. Beginning with only a simple dyad, the song coalesces out of a cloud of abstract vocalizations, as Kawehi adds first a backbeat, and then a bass line that gives shape to the pivotal minor third. It is not until she plays the memorable guitar riff from the chorus that what we are hearing becomes clear, and anticipation begins to build for the first line of the verse. Adopting a term from music cognition, I will refer to this type of anticipation as veridical expectation: expectation based on a listener’s direct knowledge of a piece of music (see Huron 2006, 222–25). While scholars of cognition tend to use the term to describe direct, local expectations—anticipation of the very next note in a series, for instance—it applies equally well to a listener’s knowledge and expectation of an entire song: Jamshed Bharucha describes it as “explicit prior knowledge of what is to come” (1987, 4).⁽⁴⁶⁾ In Kawehi’s performance of “Heart-Shaped Box,” the song’s chorus—which in the original is both sung and played by the lead guitar—serves two purposes: in addition to its standard role in the chorus, it serves to identify the song as the introduction continues, constituting the first clearly recognizable sign of the song’s identity, before the verse itself arrives.

3.4 iSongs Sequences Europe’s “The Final Countdown”

Video Example 6. iSongs, “Europe – The Final Countdown on iPhone (GarageBand),” annotated version

[3.4.1] If Kawehi’s performance re-purposes the hook of “Heart-Shaped Box” as an isolated introductory gesture, this article’s final case study demonstrates how the hook may also be intentionally delayed or withheld. A performance of Europe’s “The Final Countdown” (1986, shown in Video Example 6) by the YouTube artist known as “iSongs” takes the formal ambiguity and flexibility provided by live looping to its logical conclusion. The channel is run by a single musician who never shows their face, who performs—or perhaps more accurately, constructs—cover songs using the iPhone’s version of the popular music creation app GarageBand. GarageBand is a simple DAW that has been available in various forms on Apple computers and devices since 2004. In their cover of “The Final Countdown,” the performer behind “iSongs” works through the different parts of the song, constructing the drum groove, the bassline, and the keyboard accompaniment note by note. Because they are performing the music on a miniature DAW, however, the viewer sees every step of the process along the way: not only the notes themselves being played or programmed, but the preliminary work necessary to establish the tempo, set up new tracks, define the length of each loop, and so forth. This staging highlights the issue of musical labor by placing all the “behind the scenes” work of audio production—which might ordinarily take place in the privacy of the studio—right alongside the performance itself.⁽⁴⁷⁾ The movements of iSongs’s fingers are a carefully choreographed dance: as the only sign of human action in the video, they are fully “musicalized” and incorporated into the song itself, and their actions provide a narrative structure for the video.

[3.4.2] The first twenty-three seconds of the video are silent. During this time, we see iSongs set the tempo and build a one-measure drum loop by selecting attack points for the kick drum, snare drum, and hi-hat on a glowing grid of 4/4 time. This is one of the only actions in the video that does not unfold precisely in time, as it is that 4/4 beat itself that will soon provide the pulse against which the rest of the video’s events are synchronized. When the measure is complete, iSongs hits record, which activates a four-beat metronome count before the drum loop begins to play. This sequence of actions, or something very similar (depending on the particular instrument involved), will then be repeated for each new track that is recorded.

Example 17. Interface Actions and Musical Actions in iSongs, “The Final Countdown”

[3.4.3] Example 17 presents a running list of the musical events seen in iSongs’s video. As shown by the labels at the top, the unseen performer’s actions alternate between keystrokes within the DAW interface (such as selecting tracks, customizing settings, and hitting record) and more traditionally performative actions (playing melody or accompaniment live, or programming a sequence of drum beats). While it is useful to distinguish between these two categories of actions—musical actions and paramusical actions, so to speak—one of the video’s most important implications is the musicalization of otherwise non-musical activities. iSongs’s performances do not only lay bare the process of building a song layer by layer; they bring that process, which might ordinarily be non-linear and atemporal, into the realm of musical time. The paramusical gestures of the DAW’s interface no longer stand outside the figurative “frame” of the song, but are rather incorporated directly into the musical “canvas,” as it were. The boundary between pre-production and performance that had been made porous but still noticeable by Trouw, Patanen, and Kawehi is now fully deconstructed.

[3.4.4] Perhaps the most representative of these musicalized interface actions is the act of touching the “record” button to add a new loop (italicized each time it happens in Example 17). As noted above, touching “record” triggers a four-beat count, and as such the action must be closely synchronized to the final measure of each four-bar loop. The count imposes a strict time limit on other interface actions; iSongs works very quickly, and their movements are fluidly choreographed.⁽⁴⁸⁾ There are even moments (such as when they attempt to switch back to the bass track just after timestamp 3:04) when their finger seems to slip, trying several times to open a given menu. Were they to miss one of these cues, they would need to wait for the full four-measure loop to come around again (about 8 or 9 seconds) before re-recording the new layer. (Or they would need to edit the video.)
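
The timing constraint described here is simple arithmetic, and a short sketch may make it concrete. The text gives the loop length (four measures of 4/4) and its approximate duration (8 or 9 seconds) but not the tempo, so the tempo range computed below is an inference from those figures, not a value reported for the video. The function names are illustrative.

```python
BEATS_PER_MEASURE = 4  # the video's drum grid is in 4/4


def loop_seconds(bpm, measures=4, beats_per_measure=BEATS_PER_MEASURE):
    """Duration in seconds of a loop of `measures` bars at tempo `bpm`."""
    return measures * beats_per_measure * 60.0 / bpm


def bpm_for_loop(seconds, measures=4, beats_per_measure=BEATS_PER_MEASURE):
    """Tempo (beats per minute) at which such a loop lasts `seconds`."""
    return measures * beats_per_measure * 60.0 / seconds


def count_in_seconds(bpm, beats=4):
    """Length of the four-beat metronome count triggered by 'record'."""
    return beats * 60.0 / bpm


if __name__ == "__main__":
    # A loop of 8 to 9 seconds implies a tempo of roughly 107 to 120 BPM.
    print(round(bpm_for_loop(9), 1), round(bpm_for_loop(8), 1))
    # At the fast end of that range, the count-in lasts two seconds.
    print(count_in_seconds(120))
```

On these figures, the count-in occupies one quarter of each loop cycle, so a missed cue costs a full cycle of some eight or nine seconds before the next opportunity to record.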

Example 18. Smart Chords used in “The Final Countdown” (Screen capture by the author from YouTube - 2:32)

[3.4.5] Following the drum loop, iSongs next sequences three four-measure instrumental loops: bass, electric guitar, and synthesizer. The first two are rhythmically quantized, and have their “velocity sensitivity” (i.e., volume) turned off, in order to smooth out the inevitable imperfections that arise when trying to tap sixteenth notes with a single finger on a tiny screen.⁽⁴⁹⁾ (Here, recalling the varying levels of authenticity and polish displayed in the performances I have analyzed, the contrast between the imperfect real-time tapping and the immediate, quantized repetition is striking; iSongs’s “behind-the-scenes” imperfections feel honest.) The synthesizer, quantized as well, relies on a GarageBand function known as “Smart Chords.” Rather than using a traditional keyboard interface, Smart Chords map triads onto a series of lozenge-like buttons, so they can be played quickly and smoothly. By default, these collections tend to be diatonic, and are dictated by the global key that has been set for the track. As shown in Example 18, however, iSongs customizes the list of chords in order to include those that will be necessary for the initial chord progression (f♯ – D – b – E) and eventually for the second (f♯ – g♯ – A – D – C♯sus – C♯).⁽⁵⁰⁾ Because the “Smart Chords” interface does not allow for individual notes to be altered, the suspension over C♯ must be controlled by its own chord button.⁽⁵¹⁾
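
To illustrate why a diatonic default might need customizing, the following sketch lists which chords of the two progressions lie outside the diatonic triads of a key. Two assumptions are made purely for illustration and are not confirmed by the text: that the track's global key is F♯ natural minor, and that the default chord set consists of exactly the seven diatonic triads of that key. Lowercase chord symbols are read as minor triads, and the suspended chord is labeled simply "sus." Nothing here reproduces GarageBand's actual behavior or settings.

```python
# Sketch only: the key (F# natural minor) and the equation of "default chords"
# with these seven triads are assumptions, not documented GarageBand behavior.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]  # semitones above the tonic

# (third, fifth) in semitones above the root -> triad quality
QUALITIES = {(4, 7): "major", (3, 7): "minor", (3, 6): "diminished"}


def diatonic_triads(tonic, steps=NATURAL_MINOR_STEPS):
    """Return the set of (root, quality) triads built on each scale degree."""
    tonic_pc = NOTE_NAMES.index(tonic)
    scale = [(tonic_pc + step) % 12 for step in steps]
    triads = set()
    for degree, root in enumerate(scale):
        third = (scale[(degree + 2) % 7] - root) % 12
        fifth = (scale[(degree + 4) % 7] - root) % 12
        triads.add((NOTE_NAMES[root], QUALITIES.get((third, fifth), "other")))
    return triads


# Chord progressions as given in the text (lowercase read as minor).
FIRST_PROGRESSION = [("F#", "minor"), ("D", "major"), ("B", "minor"), ("E", "major")]
SECOND_PROGRESSION = [
    ("F#", "minor"), ("G#", "minor"), ("A", "major"),
    ("D", "major"), ("C#", "sus"), ("C#", "major"),
]


def outside_key(progression, tonic="F#"):
    """List the chords of a progression that are not diatonic triads."""
    available = diatonic_triads(tonic)
    return [chord for chord in progression if chord not in available]


if __name__ == "__main__":
    print(outside_key(FIRST_PROGRESSION))
    print(outside_key(SECOND_PROGRESSION))
```

Under these assumptions, every chord that falls outside the diatonic set belongs to the second progression, including the suspended chord that requires its own button.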

Example 19. Transcription of iSongs, “The Final Countdown,” all loops (4:50–5:14)

[3.4.6] The miniature DAW of GarageBand presents a unique image of musical form.⁽⁵²⁾ Within GarageBand, each collection of loops can be stored in memory as a unit, which the app calls a “section.” While these sections are played sequentially and make it possible to organize a composition into smaller units, they do not map onto conventional formal designations like verse and chorus. Rather, they are variable; iSongs uses them at the level of four-measure phrases. After they have sequenced and recorded all the instruments for the opening four measures (Section A), they open the Sections menu and create two copies: Section B and Section C. Leaving Section B untouched, they then quite memorably delete all the music from Section C (2:40), save for the drum loop, which they edit slightly in order to add additional cymbal crashes for emphasis. With the instrumental tracks left intact but now empty, they record new material for the bass, guitar, and keyboards. Finally, they add and configure one more track: the lead synthesizer. After enabling all three formal sections, they record the song’s memorable 12-measure melody—its most recognizable, and perhaps its only recognizable, feature, which has heretofore been withheld for more than four minutes. The full version of iSongs’s recording is transcribed in Example 19. While in this case I have not transcribed each action nor shown the loops taking shape one-by-one, the order in which the tracks are recorded proceeds from the bottom of the score to the top, and the repeat signs mirror the Section A/B/C structure that iSongs uses in GarageBand.
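
The copy, clear, and re-record procedure described in this paragraph can be modeled as a small data structure. The sketch below is a hypothetical model, not a representation of GarageBand's internal format: each section is a dictionary from track name to a placeholder clip label, and the function names and labels are invented for illustration.

```python
import copy

# Hypothetical model: a "section" maps each track name to a clip label.
ACCOMPANIMENT = ["drums", "bass", "guitar", "synth chords"]
MEASURES_PER_SECTION = 4


def record_section(label):
    """Record placeholder clips for every accompaniment track."""
    return {track: f"{track} {label}" for track in ACCOMPANIMENT}


def clear_except(section, keep):
    """Empty every track not in `keep`; the tracks themselves remain."""
    return {t: (clip if t in keep else None) for t, clip in section.items()}


def build_sections():
    a = record_section("A")
    b = copy.deepcopy(a)                           # Section B: untouched copy
    c = clear_except(copy.deepcopy(a), {"drums"})  # Section C: drums only
    c["drums"] += " + extra cymbal crashes"        # the edited drum loop
    for track in ("bass", "guitar", "synth chords"):
        c[track] = f"{track} C"                    # new material, same tracks
    sections = {"A": a, "B": b, "C": c}
    # The lead synthesizer is added last and spans all three sections.
    for i, label in enumerate(sections):
        start = i * MEASURES_PER_SECTION + 1
        end = start + MEASURES_PER_SECTION - 1
        sections[label]["lead synth"] = f"melody mm. {start}-{end}"
    return sections


if __name__ == "__main__":
    for label, tracks in build_sections().items():
        print(label, tracks)
```

The model makes one point of the paragraph explicit: the sections are units of storage and playback (here, four measures each) and carry no labels such as verse or chorus.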

[3.4.7] The video’s audiovisual rhetoric thus depends in large part on the withholding of this recognizable introduction, and on the detailed work that precedes it. Like several of the videos studied in this article, the virtuosity demonstrated in iSongs’s rendition of “The Final Countdown” comes, in large part, from their facility in navigating a highly constrained environment: this time, a feature-constrained mobile version of a DAW rather than a retuned guitar or the contents of the produce aisle. Perhaps most significantly of all, from a formal standpoint, they save the famous synthesizer riff—the song’s most recognizable feature—until very late. Europe’s original recording flies out of the gate with its famous synthesizer riff, which is heard twice through before the lead vocalist enters. In iSongs’s version, however, tension—or at least curiosity—builds for nearly four minutes out of a five-and-a-half-minute song, until the recognizable synthesizer finally breaks through with the song’s famous introduction. As has been the case with many case studies described here, this video is at least as much about the process of production as it is about the finished piece of music. The video’s online reception confirms the importance of the process to the video’s effect. “Guys instead of skipping to 4:50 for the main melody, watch the whole thing, it’s really rewarding to hear the buildup and then the melody,” writes one commenter. “This may be my favorite of the 20-30 or so [iSongs videos] that I’ve watched, simply for that buildup to 4:26. Phenomenal :)” writes another. While many commenters concur with these sentiments, others express gratitude for the hyperlinks, which YouTube automatically adds to any comments that include a timestamp, in order to skip directly to the recognizable melody. “THANKS FOR THE TIMESTAMP BROTHER,” one commenter remarks in reply to the previous comment, unmoved by the recommendation to experience the build-up in full.⁽⁵³⁾

4. Concluding Thoughts: Musical Labor and YouTube as Creative and Analytical Medium

[4.1] The specter of musical labor that is featured so prominently in this iSongs video has surfaced, to some degree or another, in nearly all of our other case studies as well. Pupsi’s cover of “Africa” opens with nearly two minutes of intensive work, the nature of which very neatly mirrors the domestic labor of food preparation. In a familiar gesture from cinema, the rapidly cut montage in which this work is presented is meant to demonstrate the passage of time: the parade of disconnected images implies that preparing the ocarinas takes far longer than two minutes. Kawehi’s behind-the-scenes videos and social media posts emphasize the equipment that makes her performances possible, and the introduction to her video dramatizes how even abstract fragments of music can come together into a recognizable and effective chord progression, when provided with musical context.

[4.2] These narratives of musical labor depend upon their medium as well. Since most of these videos lack any narration (with the sole exception of J. Views, who talks about what he is doing throughout the first minute of the clip), the communication is solely musical—and visual. Due to the unconventional instrumentation of most of these clips, something significant would be lost if we were not able to see what was actually happening in these “before the beginning” segments—segments which are present in all the videos studied, save for Luca Stricagnoli’s renditions of “Thriller” and “Fade to Black.” The videos by Pupsi and J. Views would lose their novelty if the sources of their sounds were unknown (to say nothing of how many would lose their patience through 90 full seconds of vegetable whittling sounds), while an iSongs track would be full of silent key presses, metronome count-offs, or long spans of repeated loops. If these preparatory segments were omitted completely, the sense of virtuosic performance would often be diminished as well: an iSongs performance would merely mean hitting play on a sequencer, while Luca Stricagnoli’s “Thriller” might well sound like a simple duet. YouTube is thus the perfect—and in many ways, the essential—platform for presenting musical innovations like those described in this article.⁽⁵⁴⁾

[4.3] Cover songs are a vital area for investigation, as a way of understanding and untangling many aesthetic issues in music, from originality to creativity, authorship to transformation and transcription. The deconstructive covers that circulate on YouTube and other social media platforms such as TikTok offer the perfect summary of arrangement’s transformative and expressive potential. The pleasure of a cover song lies in juxtaposing familiar memories or well-known lyrics with unfamiliar affect, or with the unexpected intimacy of an acoustic vocal or a beat vocalized a cappella. As the frisson of recognition collides dramatically with an unfamiliar affect, an unexpected and dazzling technique, or a complex web of loops, a cover song forever changes our understanding of the original tune. In Listen: A History of Our Ears, philosopher Peter Szendy evocatively calls this kind of double listening plastic, or even elastic (2008, 35–39). Indeed it is: in these creative covers, we can often hear both the original, and the minimal deconstruction of it, as the song is stretched, squeezed, and molded into something new.

[4.4] The visual nature of these performances allows their artists to deconstruct the very act of arrangement, and to demonstrate their strategies directly to their audiences. In most contexts, pop arrangements are meant to unfold slowly, over three or four minutes, and to do so by increasing in intensity and excitement. The first verse is spartan; harmonies enter at the chorus, if not later. Backup singers, countermelodies, and horns join in. A guitar solo might signal a song’s moment of maximum excitement, while an extended fade-out seems to imply that the jam could go on forever, if not for the limitations of time and tape. These YouTube performances playfully invert that rising action, however. Kawehi’s deft MIDI controller, iSongs’s rapid tapping, Stricagnoli’s ambidextrous guitar playing, J. Views’ harpsichord strawberries and bass-pumping eggplants: each of these performs a kind of “pre-analysis,” telling the viewer what to watch and listen for. Each instrument that is created or customized; each layer that is looped or saved for later; each form-delineating addition or deletion functions simultaneously as an element in a performance and an analytical demonstration. As these performers expose their pre-recording analysis and arrangement, these YouTube covers offer a way of resisting what Korsyn (2003, 22–25) critiques as the “ideology of the abstract” in musical research (which demands that academic insights be tightly packaged, exchangeable, and quantifiable) by instead casting analysis as a procedural and performative activity. As music theorists look for ways to make our research and teaching accessible and relevant for broader publics, we would do well to learn from the case studies described here, each of which offers a masterclass in how to clearly understand, summarize, and convey song structure, and to narrativize—or subvert—significant formal processes like tension and release. YouTube is thus not only a platform for sharing unique cover performances (among many thousands of other kinds of videos), but a creative medium whose affordances constitute not only a new venue for musical performance, circulation, and reception, but a cutting-edge practical vocabulary for performative—and entertaining—analysis.

William O’Hara
Gettysburg College
300 North Washington Street
Gettysburg PA 17325
williamevanohara@gmail.com

Works Cited

Adams, Kyle. 2015. “What Did Danger Mouse Do? The Grey Album and Musical Composition in Configurable Culture.” Music Theory Spectrum 37 (1): 7–24. https://doi.org/10.1093/mts/mtv004.

Angier, Tom. 2012. Techne in Aristotle’s Ethics: Crafting the Moral Life. Continuum.

Arnold, Corwin, and Julianne Grasso. 2022. “Music Theory YouTube.” In The Oxford Handbook of Public Music Theory, ed. J. Daniel Jenkins. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780197551554.013.32.

Auner, Joseph. 2003. “‘Sing It for Me’: Posthuman Ventriloquism in Recent Popular Music.” Journal of the Royal Musical Association 128 (1): 98–122. https://doi.org/10.1093/jrma/fkg004.

Auner, Joseph. 2017. “Reich on Tape: The Performance of Violin Phase.” Twentieth-Century Music 14 (1): 77–92. https://doi.org/10.1017/S147857221700007X.

Barna, Alyssa. 2020. “The Dance Chorus in Recent Top-40 Music.” SMT-V 6 (4). https://doi.org/10.30535/smtv.6.4.

Behrent, Michael. 2013. “Foucault and Technology.” History and Technology 29 (1): 54–104. https://doi.org/10.1080/07341512.2013.780351.

Bell, Adam, Ethan Hein, and Jared Ratcliffe. 2015. “Beyond Skeuomorphism: The Evolution of Music Production Software User Interface Metaphors.” Journal of the Art of Record Production 9. https://www.arpjournal.com/asarpwp/beyond-skeuomorphism-the-evolution-of-music-production-software-user-interface-metaphors-2/.

Blasius, Leslie David. 2002. “Mapping the Terrain.” In The Cambridge History of Western Music Theory, ed. Thomas Christensen, 25–45. Cambridge University Press. https://doi.org/10.1017/CHOL9780521623711.003.

Bharucha, Jamshed. 1987. “Music Cognition and Perceptual Facilitation: A Connectionist Framework.” Music Perception 5 (1): 1–30. https://doi.org/10.2307/40285384.

Boone, Christine. 2013. “Mashing: Toward a Typology of Recycled Music.” Music Theory Online 19 (3). https://doi.org/10.30535/mto.19.3.1.

Boone, Christine. 2018. “Gendered Power Relationships in Mashups.” Music Theory Online 24 (1). https://doi.org/10.30535/mto.24.1.2.

Bungert, James. 2015. “Bach and the Patterns of Transformation.” Music Theory Spectrum 37 (1): 98–119. https://doi.org/10.1093/mts/mtv003.

Bungert, James. 2017. “A Tale of Three Schenkers: Analysis, Piano Pedagogy, and Performance of the Chopin Berceuse op. 57.” Music Theory Online 23 (3). https://doi.org/10.30535/mto.23.3.2.

Burgess, Jean. 2014. “All Your Chocolate Rain Are Belonging to Us? Viral Video, YouTube, and the Dynamics of Participatory Culture.” In Art in the Global Present, ed. Nikos Papastergladis and Victoria Lynn, 86–96. University of Technology Sydney ePress. https://doi.org/10.5130/978-0-9872369-9-9.e.

Burgess, Jean, and Joshua Green. 2009. YouTube: Online Video and Participatory Culture. Wiley.

Burns, Gary. 1987. “A Typology of Hooks in Popular Records.” Popular Music 6 (1): 1–20. https://doi.org/10.1017/S0261143000006577.

Burns, Lori, and Alyssa Woods. 2004. “Authenticity, Appropriation, Signification: Tori Amos on Gender, Race, and Violence in Covers of Billie Holiday and Eminem.” Music Theory Online 10 (2). https://www.mtosmt.org/issues/mto.04.10.2/mto.04.10.2.burns_woods.html.

Butler, Mark J. 2014. Playing with Something That Runs: Technology, Improvisation, and Composition in DJ and Laptop Performance. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195393613.001.0001.

Cayari, Christopher. 2011. “The YouTube Effect: How YouTube has Provided New Ways to Consume, Create, and Share Music.” International Journal of Education and the Arts 12 (6): 1–28.

Cayari, Christopher. 2017. “Music Making on YouTube.” In The Oxford Handbook of Music Making and Leisure, ed. Roger Mantie and Gareth Dylan Smith. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190244705.013.15.

Christian, Aymar Jean. 2011. “Joe Swanberg, Intimacy, and the Digital Aesthetic.” Cinema Journal 50 (4): 117–35. https://doi.org/10.1353/cj.2011.0049.

Collins, Nicholas. 2014. Handmade Electronic Music: The Art of Hardware Hacking. Routledge.

Cook, Nicholas. 2013. Beyond the Score: Music as Performance. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199357406.001.0001.

Cuomo, Serafina. 2007. Technology and Culture in Greek and Roman Antiquity. Cambridge University Press.

Dahlhaus, Carl. 1989. Nineteenth-Century Music. Translated by J. Bradford Robinson. University of California Press.

De Souza, Jonathan. 2017. Music at Hand: Instruments, Bodies, and Cognition. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780190271114.001.0001.

De Souza, Jonathan. 2018. “Fretboard Transformations.” Journal of Music Theory 62 (1): 1–39. https://doi.org/10.1215/00222909-4450624.

Dolan, Emily I. 2012. “Toward a Musicology of Interfaces.” Keyboard Perspectives 5: 1–13.

Furseth, Jessica. 2017. “How ‘Africa’ by Toto Became the Internet’s Favorite Song.” Vice, November 22, 2017. https://www.vice.com/en_au/article/43n95n/how-africa-by-toto-became-the-internets-favorite-song.

Gawboy, Anna. 2009. “The Wheatstone Concertina and Symmetrical Arrangements of Tonal Space.” Journal of Music Theory 53 (2): 163–90. https://doi.org/10.1215/00222909-2010-001.

Gibson, Margaret. 2015. “YouTube and Bereavement Vlogging: Emotional Exchange Between Strangers.” Journal of Sociology 52 (4): 1–15. https://doi.org/10.1177/1440783315573613.

Gilmer, Marcus. 2019. “See Toto’s ‘Africa’ Played on Some Sweet Potatoes and a Squash.” Mashable. https://mashable.com/video/toto-africa-played-with-squash.

Gjerdingen, Robert O. 2009. “The Price of (Perceived) Affordance: Commentary for Huron and Berec.” Empirical Musicology Review 4 (3): 123–25. https://doi.org/10.18061/1811/44533.

Gracyk, Theodore. 1996. Rhythm and Noise: An Aesthetics of Rock. Duke University Press.

Gracyk, Theodore. 2012/13. “Covers and Communicative Intentions.” Journal of Music and Meaning 11: 23–46.

Harper, Paula. 2019. “Unmute This: Circulation, Sociality, and Sound in Viral Media.” PhD diss., Columbia University.

Hayles, N. Katherine. 1999. How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics. University of Chicago Press.

Heidegger, Martin. 1977 [1954]. “The Question Concerning Technology.” In The Question Concerning Technology and Other Essays, 3–35. Translated by William Lovitt. Garland Publishing.

Hein, Ethan. 2020. “Chris Thile, Kendrick Lamar, and the Problem of the White Rap Cover.” Visions of Research in Music Education 35: 1–27.

Hillrichs, Rainer. 2016. “Poetics of Early YouTube: Production, Performance, Success.” PhD diss., Friedrich-Wilhelms-Universität Bonn.

Holm-Hudson, Kevin. 2002. “Your Guitar, it Sounds so Sweet and Clear: Semiosis in Two Versions of ‘Superstar.’” Music Theory Online 8 (4). https://www.mtosmt.org/issues/mto.02.8.4/mto.02.8.4.holm-hudson.html.

Huron, David. 2004. “Music-Engendered Laughter: An Analysis of Humor Devices in PDQ Bach.” In Proceedings of the 8th International Conference on Music Perception and Cognition, Evanston, IL, 2004, ed. Scott D. Lipscomb, Richard Ashley, Robert O. Gjerdingen, and Peter Webster, 700–704. Causal Productions.

Huron, David. 2006. Sweet Anticipation: Music and the Psychology of Expectation. MIT Press. https://doi.org/10.7551/mitpress/6575.001.0001.

Huron, David, and Jonathan Berec. 2009. “Characterizing Idiomatic Organization in Music: A Theory and Case Study of Musical Affordances.” Empirical Musicology Review 4 (3): 103–22. https://doi.org/10.18061/1811/44531.

Jenkins, Julie. 2018. “Blessing the Rains: Fieldwork Meditations on ‘Africa’ by Toto.” Suomen Antropologi 43 (2): 100–103. https://doi.org/10.30676/jfas.v43i2.77705.

Kaminsky, Peter, and Megan Lyons. 2020. “Enactive Soundscapes: Physio-Musical and Formal Process in the Music of Joni Mitchell.” Paper presented to the annual meeting of the Society for Music Theory (online).

Kane, Brian. 2014. Sound Unseen: Acousmatic Sound in Theory and Practice. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199347841.001.0001.

Katz, Mark. 2010. Capturing Sound: How Technology has Changed Music. University of California Press. https://doi.org/10.1525/9780520947351.

Klorman, Edward. 2018. “Performers as Creative Agents; or, Musicians Just Want to Have Fun.” Music Theory Online 24 (3). https://doi.org/10.30535/mto.24.3.10.

Koozin, Timothy. 2011. “Guitar Voicing in Pop-Rock Music: A Performance-Based Analytical Approach.” Music Theory Online 17 (3). https://doi.org/10.30535/mto.17.3.5.

Korsyn, Kevin. 2003. Decentering Music: A Critique of Contemporary Musical Research. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195104547.001.0001.

Lacasse, Serge. 2000. “Intertextuality and Hypertextuality in Recorded Popular Music.” In The Musical Work: Reality or Invention?, ed. Michael Talbot, 35–58. Liverpool University Press. https://doi.org/10.5949/liverpool/9780853238256.003.0003.

Leech-Wilkinson, Daniel. 2012. “Compositions, Scores, Performances, Meanings.” Music Theory Online 18 (1). https://doi.org/10.30535/mto.18.1.4.

Leong, Daphne. 2019. Performing Knowledge: Twentieth-Century Music in Analysis and Performance. Oxford University Press. https://doi.org/10.1093/oso/9780190653545.001.0001.

Lewin, David. 1998. “Some Ideas About Voice-Leading between PCSets.” Journal of Music Theory 42: 15–72. https://doi.org/10.2307/843852.

Lewry, Fraser. 2019. “Man Plays Toto’s ‘Africa’ on a Sweet Potato and a Butternut Squash.” https://www.loudersound.com/news/man-plays-totos-africa-on-a-sweet-potato-and-a-butternut-squash.

Lyall, Sarah. 2003. “Hamburg Journal: For Cooking up Music, Mixed Vegetables Do Just Fine.” The New York Times, March 6, 2003.

Malawey, Victoria. 2010. “An Analytical Model for Examining Cover Songs and Their Sources.” In Pop Culture Pedagogy in the Classroom, ed. Nicole Biamonte, 203–32. Scarecrow Press.

Malawey, Victoria. 2020. A Blaze of Light in Every Word: Analyzing the Popular Singing Voice. Oxford University Press. https://doi.org/10.1093/oso/9780190052201.001.0001.

Marrington, Mark. 2011. “Experiencing Musical Composition in the DAW: The Software Interface as Mediator of the Musical Idea.” Journal of the Art of Record Production 5. https://www.arpjournal.com/asarpwp/experiencing-musical-composition-in-the-daw-the-software-interface-as-mediator-of-the-musical-idea-2/.

Mauss, Marcel. 1979. “Body Techniques.” In Sociology and Psychology: Essays, 95–123. Translated by Ben Brewster. Routledge & Kegan Paul. https://doi.org/10.2307/3032558.

McAlpine, Fiona. 2008. Tonal Consciousness in the Medieval West. Peter Lang.

McClary, Susan. 1991. Feminine Endings: Music, Gender, & Sexuality. University of Minnesota Press.

McClelland, Ryan. 2003. “Performance and Analysis Studies: An Overview and Bibliography.” Indiana Theory Review 24 (Spring/Fall): 95–106. https://www.jstor.org/stable/24046462.

Miller, Kiri. 2012. Playing Along: Digital Games, YouTube, and Virtual Performance. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199753451.001.0001.

Moseley, Roger. 2015. “Digital Analogies: The Keyboard as a Field of Musical Play.” Journal of the American Musicological Society 68 (1): 151–228. https://doi.org/10.1525/jams.2015.68.1.151.

Moseley, Roger. 2018. “Chopin’s Aliases.” 19th-Century Music 42 (1): 3–29. https://doi.org/10.1525/ncm.2018.42.1.3.

Mosser, Kurt. 2008. “‘Cover Songs’: Ambiguity, Multivalence, Polysemy.” Popular Musicology Online 2. http://www.popular-musicology-online.com/issues/02/mosser.html.

Müller, Eggo. 2009. “Where Quality Matters: Discourses on the Art of Making a YouTube Video.” In The YouTube Reader, ed. Pelle Snickars and Patrick Vonderau, 126–139. National Library of Sweden.

Nilson, Herman. 1998. Michel Foucault and the Games of Truth. Translated by Rachel Clark. St. Martin’s Press. https://doi.org/10.1007/978-1-349-26624-1.

Nobile, Drew. 2020. Form as Harmony in Rock Music. Oxford University Press. https://doi.org/10.1093/oso/9780190948351.001.0001.

Nussbaum, Martha. 2001. The Fragility of Goodness: Luck and Ethics in Greek Tragedy and Philosophy. Cambridge University Press. https://doi.org/10.1017/CBO9780511817915.

O’Hara, William. 2018. “Music Theory and the Epistemology of the Internet; or, Analyzing Music Under the New Thinkpiece Regime.” Analitica: Rivista online di studi musicali 10.

Osborn, Brad. 2021. Interpreting Music Video: Popular Music in the Post-MTV Era. Routledge.

Parry, Richard. 2020. “Episteme and Techné.” In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. https://plato.stanford.edu/entries/episteme-techne/.

Plato. 1921. Plato with an English Translation. Translated by H.N. Fowler. G.P. Putnam’s Sons.

Rings, Michael. 2013. “Doing it Their Way: Rock Covers, Genre, and Appreciation.” The Journal of Aesthetics and Art Criticism 71 (1): 55–63. https://doi.org/10.1111/j.1540-6245.2012.01541.x.

Rings, Steven. 2011. Tonality and Transformation. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195384277.001.0001.

Rock Pasta. 2019. “Toto’s ‘Africa’ Played on a Sweet Potato and Squash is Beyond Mesmerizing.” Rock Pasta. https://rockpasta.com/totos-africa-played-on-a-sweet-potato-and-squash-is-beyond-mesmerizing/.

Rockwell, Joti. 2009. “Banjo Transformations and Bluegrass Rhythm.” Journal of Music Theory 53 (1): 137–62. https://doi.org/10.1215/00222909-2009-023.

Rover, Christian. 2006. “Kurt Rosenwinkel: From a Guitarist’s Perspective.” http://www.christianrover.com/Englische%20Seiten/Rosenwinkelengl.html.

Samson, Jim. 2003. Virtuosity and the Musical Work: The Transcendental Studies of Liszt. Cambridge University Press. https://doi.org/10.1017/CBO9780511481963.

Schmalfeldt, Janet. 2011. In the Process of Becoming: Analytic and Philosophical Perspectives on Form in Early Nineteenth-Century Music. Oxford University Press.

Shea, Nicholas. 2022. “The Feel of the Guitar in Popular Music Performance.” SMT-V 8 (3).

Sheffield, Rob. 2018. “How Toto’s ‘Africa’ Became the New ‘Don’t Stop Believin’.’” Rolling Stone, October 31, 2018. https://www.rollingstone.com/music/music-features/toto-africa-the-new-anthem-747262/.

Shepherd, John, David Horn, Dave Laing, Paul Oliver, and Peter Wicke, eds. 2003. The Continuum Encyclopedia of Popular Music of the World, Volume 2: Performance and Production. Continuum. https://doi.org/10.5040/9781501329234.

Shifman, Limor. 2014. Memes in Digital Culture. MIT Press. https://doi.org/10.7551/mitpress/9429.001.0001.

Small, Christopher. 1998. Musicking: The Meanings of Performing and Listening. University Press of New England.

Solis, Gabriel. 2010. “I Did It My Way: Rock and the Logic of Covers.” Popular Music and Society 33 (3): 297–318. https://doi.org/10.1080/03007760903523351.

Spicer, Mark. 2011. “(Per)form in(g) Rock: A Response.” Music Theory Online 17 (3). https://doi.org/10.30535/mto.17.3.9.

Sterne, Jonathan. 2003. The Audible Past: Cultural Origins of Sound Reproduction. Duke University Press. https://doi.org/10.1515/9780822384250.

Sterne, Jonathan. 2006. “Communication as Techné.” In Communication As. . .: Perspectives on Theory, ed. Gregory J. Shepherd, Jeffrey St. John, and Ted Striphas, 91–98. Sage Publications. https://dx.doi.org/10.4135/9781483329055.n11.

Strangelove, Michael. 2010. Watching YouTube: Extraordinary Videos by Ordinary People. University of Toronto Press. https://doi.org/10.3138/9781442687035.

Swinkin, Jeffrey. 2016. Performative Analysis: Reimagining Music Theory for Performance. University of Rochester Press. https://doi.org/10.1017/9781782046998.

Szendy, Peter. 2008. Listen: A History of Our Ears. Translated by Charlotte Mandell. Fordham University Press. https://doi.org/10.2307/j.ctt13x002m.

Thompson, Luke. 2019. “Enjoy Toto’s ‘Africa’ Played on Sweet Potatoes and Squash.” Nerdist, January 4, 2019. https://nerdist.com/article/enjoy-totos-africa-performed-on-carved-sweet-potatoes-and-squash/.

VanderHamm, David. 2018. “Virtuosity/Virtuoso.” Oxford Bibliographies Online. https://doi.org/10.1093/obo/9780199757824-0236.

Vernallis, Carol. 2013. Unruly Media: YouTube, Music Video, and the New Digital Cinema. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199766994.001.0001.

Yunek, Jeffrey Scott, Benjamin K. Wadsworth, and Simon Needle. 2021. “Perceiving the Mosaic: Form in the Mashups of DJ Earworm.” Music Theory Spectrum 43 (1): 19–42. https://doi.org/10.1093/mts/mtaa016.

Media Examples

Beyond the Guitar. 2019. “Avengers: Endgame - Main Theme Classical Guitar Cover.” YouTube video, 00:03:47. May 11, 2019. https://www.youtube.com/watch?v=5cFJMhe5bRU.

iSongs. 2019. “Europe - The Final Countdown on iPhone (GarageBand).” YouTube video, 00:05:46. August 24, 2019. https://www.youtube.com/watch?v=ImOMJBo12Ts.

J. Views. 2013. “J. Views playing Teardrop with Vegetables.” YouTube video, 00:03:51. February 17, 2013. https://www.youtube.com/watch?v=xvmTav3SYsc.

Kawehi. 2013. “Neda.” YouTube video. January 16, 2013. https://www.youtube.com/watch?v=k66S471J3as.

Kawehi. 2014a. “Anthem.” YouTube video, 00:06:56. August 18, 2014. https://www.youtube.com/watch?v=ykIwtNyASVM.

Kawehi. 2014b. “Heart-Shaped Box by Nirvana (Cover by Kawehi).” YouTube video, 00:05:32. May 4, 2014. https://www.youtube.com/watch?v=077UlBtrqWs.

Pomplamoose. 2014. “Pharrell Mashup (Happy Get Lucky).” YouTube video, 00:02:53. February 18, 2014. https://www.youtube.com/watch?v=i7X8ZnmLfM0.

Pomplamoose. 2019. “MMMBop | Hanson | funk cover ft. Lucy Schwartz & Adam Neely.” YouTube video, 00:03:42. July 29, 2019. https://www.youtube.com/watch?v=fiShsfvbFUA.

Pupsi. 2018. “Toto – Africa (Sweet Potato and Squash Cover).” YouTube video, 00:04:02. December 31, 2018. https://www.youtube.com/watch?v=jRLfGwQ7Nsw.

Rockloe. 2016. “Stairway to Heaven Guitar Solo by Chloé.” YouTube video, 00:00:57. September 17, 2016. https://www.youtube.com/watch?v=hBFfVQF-KQ0.

Schiefel, Michael. 2008. “My Animals.” YouTube video, 00:05:03. November 14, 2008. https://www.youtube.com/watch?v=IjNUfonSGiA.

Smooth McGroove. 2013. “Street Fighter 2 – Guile Theme Acapella.” YouTube video, 00:02:24. April 8, 2013. https://www.youtube.com/watch?v=4qwKCQ4M2Nw.

Stricagnoli, Luca. 2019. “Thriller (Michael Jackson) – Luca Stricagnoli.” YouTube video, 00:03:36. January 4, 2019. https://www.youtube.com/watch?v=zJ_pDcjICtw.

Wintergatan. 2016. “Wintergatan - Marble Machine (music instrument using 2000 marbles).” YouTube video, 00:04:32. March 1, 2016. https://www.youtube.com/watch?v=IvUU8joBb1Q.

YUNI Marimba. 2017. “Coldplay - Viva la Vida.” YouTube video, 00:02:07. December 2, 2017. https://www.youtube.com/watch?v=WrvZ7qe9isI.

Footnotes

* Previous versions of this article were read at the 2019 meeting of the Society for Music Theory (Columbus, Ohio) and the Gettysburg College faculty brown bag lunch series. I am grateful to the generous audiences at both of those talks for their thoughtful questions and feedback, and to Alex Rehding for his reading of this essay’s first full draft.

1. As of 2021, 51% of all consumer music streaming worldwide occurs on YouTube, significantly outpacing paid, music-specific platforms such as Apple Music and Spotify. This figure also outpaces all physical media combined; see Osborn (2021, 1).

2. On “one-man bands,” see Shepherd, et al. (2003, 48–49). On tape loops see McClary (1991, 132–147), Collins (2014, 47–52) and Auner (2017). On the democratization and increased affordability of both music listening and music performance, see Katz (2010). Cayari (2017) offers an overview of “pro-am” musicking on YouTube, summarizing a few predominant production styles and analyzing the site as both “a place and a medium” (469) that allows performers to share their music, offers several forms of social and creative interaction among musicians and listeners, and in turn shapes musical performance as both an activity of “serious leisure” and an increasingly professionalized form of entertainment in the twenty-first century.

3. Readers wishing to know more about historical approaches to virtuosity may consult David VanderHamm’s (2018) excellent Oxford Bibliography on the topic, or seek concise introductions by Samson (2003, 1–7 and 66–102) and Dahlhaus (1989, 8–14 and 134–141), who uses virtuosity as one of the terms of analysis within his famous Beethoven/Rossini dichotomy.

4. Although mashups originated as a studio phenomenon, with artists remixing and combining existing recordings, they can also be performed live. Boone (2013, [5.1]) terms these performances “cover mashups.”

5. The keyboard does not appear in Example 1’s transcription, but the looped chords are recorded at 1:40 in the video.

6. Put another way, Trouw’s video is typical in the way it places the process of constructing its own accompaniment front and center, but atypical in the way that it hides the technical means of production which drive that construction. Along with loops being triggered and silenced from off-camera, Trouw is followed from instrument to instrument by a live camera operator: a level of dynamic production not always found in YouTube musical performances, which often use fixed (and presumably unattended) cameras.

7. On the aesthetics of early YouTube, see inter alia Müller (2009) and Hillrichs (2016). Burgess and Green (2009) argue against a reductive view that often presents YouTube as a low fidelity playground for amateurs, or draws a firm division between amateur and professional productions on the platform. Style, aesthetics, and purposes for production have varied widely since the site’s inception, and continue to do so.

8. Techne (τέχνη) is transliterated in several ways, sometimes with an accent or circumflex over the second e; for the sake of simplicity, I have followed Serafina Cuomo’s (2007) practice of italicizing the word but omitting any accent mark.

9. See Nussbaum (2001, 94–99) for a concise summary of techne in Ancient Greek philosophy. For a comprehensive summary of the relationship between techne and episteme throughout the history of philosophy, see Parry (2020), who outlines important discussions of the term in texts such as Xenophon’s Memorabilia, Aristotle’s Metaphysics and Nicomachean Ethics, and numerous dialogues of Plato, such as Phaedrus, Gorgias, and Republic. More specific treatments include Cuomo (2007, on ancient accounts of technology) and Angier (2012, on ethics as techne).

10. Brian Kane’s (2014, 97–133) account of acousmatic sound at Bayreuth and in musique concrète is mostly concerned with this sense of techne, as an opposite term that supplements music’s natural powers.

11. See Parry (2020, 2), which discusses Socrates’ use of various professions (most notably medicine) to illustrate the concept of techne.

12. On the varying organizations of medieval and Renaissance treatises, theory’s encounters with various epistemologies, and the corresponding changes within the discipline, see Blasius (2002). On cantor and musicus, see McAlpine (2008, 32–40).

13. See Sterne (2003, 92): “All the technologies of listening that I discuss emerge out of techniques of listening.” Sterne, in turn, is inspired in this statement by Mauss (1979).

14. Butler (2014, 174n2) is aware of the same linguistic ambiguities noted by De Souza, observing that Foucault uses “technique” and “technology” interchangeably. For more on this issue when translating from European languages and in Foucault’s work in particular, see Behrent (2013, 58–60). On Foucault’s technologies, see Nilson (1998, 97–102).

15. Sterne (2006, 96) offers a similar justification for employing the word techne: while he acknowledges that invoking an esoteric ancient Greek concept is not always strictly necessary, the earlier sense of techne offers resistance to what he characterizes as the “add technology and stir” model that prevails in science and technology studies, by reclaiming the older, richer resonances of the stem “tech-.”

16. While this article was revised and completed during the COVID pandemic, it is concerned with an audiovisual form that predated the outbreak, and is only superficially related to the various new modes of musicking that have arisen as part of worldwide efforts to mitigate the spread of COVID—though I do hope this research helps to establish and advance analytical frameworks that may be useful as scholars turn their attention to those socially distanced ways of rehearsing, recording, performing, and listening. As of this writing, more than 100 scholarly essays have been published about musicking during the pandemic, many of which can be found in Critical Improvisation Studies 14/1–3 (2021), a multi-issue forum entitled “Improvisation, Musical Communities, and the COVID-19 Pandemic,” which collects stories of how individuals, communities, and arts institutions adapted to the conditions of isolation; and within the research topic “Social Convergence in Times of Spatial Distancing: The Role of Music During the COVID-19 Pandemic,” in the journal Frontiers in Psychology.

17. Numerous publications have come from the many scholars involved with these projects; most relevant for the present essay are Nicholas Cook (2013) and Daniel Leech-Wilkinson (2012). See www.cmpcp.ac.uk for more information and a full list of collaborators.

18. These citations represent several studies from recent years. For a comprehensive overview of the relationship between analysis and performance in the twentieth century, see McClelland (2003).

19. See De Souza (2017, 76–82). In formulating his argument about idiomaticity, De Souza draws on both personal performance experience, and cognitive theories and empirical research by David Huron and Jonathon Berec (2009) and Robert Gjerdingen (2009).

20. On the rise of music theory instruction on YouTube, see Arnold and Grasso (2022).

21. On YouTube’s digital communities, see Strangelove (2010, 103–136). On specific communities, see Christian (2011), Burgess and Green (2009, 53–55), and Gibson (2015).

22. In a similar manner, a cover of Katy Perry’s “I Kissed a Girl” by gay singer-songwriter Ivri Lider cleverly inverts both the song’s affect and its original meaning (I am grateful to Alex Rehding for drawing my attention to this example). The performer’s identity, however, can also make a cover song unsuccessful: Ethan Hein (2020) chronicles the negative reception of NPR host and bluegrass musician Chris Thile’s 2016 cover of Kendrick Lamar’s “Alright,” and criticizes its problematic appropriation and inauthenticity.

23. To summarize Gracyk’s argument further: rock is not defined by a specific genre of music, but rather by a studio- and record-oriented outlook that sees recordings of a given song as “primary texts,” that is, definitive performances on which later live performances (whether by the original artist or another) are both based and judged. His ontology of rock is thus concerned not only with the song itself (as might be the case with a Classical “work,” of which many exemplary performances exist from a single score), but with a complete sonic unit that encompasses both a performance and all the audible artifacts that go with it, including incidental noise, artifacts of the recording process, and the effects of the particular medium; see Gracyk (1996, 1–67).

24. For a music-theoretical perspective on mashups, see Boone (2013) and Boone (2018), particularly the comprehensive review of scholarly literature in the latter. For specific analyses, see Adams (2015) and Yunek, Wadsworth, and Needle (2021).

25. On “multitracked” YouTube videos, see Cayari (2017, 473–475).

26. “Africa,” released in the U.S. on October 30, 1982, was a modest hit: it spent 21 weeks on the Billboard “Hot 100” chart, attaining #1 for a single week (February 5, 1983) before falling off the charts completely by the end of March. (See the Billboard archives at https://www.billboard.com/archive/charts/1983/HSI). Jessica Furseth (2017), offering a more positive assessment in her catalog of “Africa’s” recent resurgence, speculates that the song became such an internet favorite because it is both “a well-crafted piece of music, with driving drum loops, layered harmonies, and an anthemic chorus,” and “just dorky enough”: a nostalgic remnant of “a time when earnestness was far more socially acceptable.” On the song’s reductive view of the continent of Africa, see Jenkins (2018, 100).

27. Harper’s definition (which is in dialogue with Limor Shifman [2014]) places “memes” in opposition to a phenomenon that is simply popular, which she classifies as “viral.” A meme is an iterative form in which creators and consumers repeatedly riff on a similar idea, such as the same recognizable image overlaid with new and timely text. In this sense, it resembles Vernallis’ (2013, 130) discussion of genre on YouTube, cited above.

28. The video has attracted nearly 18,000 comments. I will not quote them specifically here, but they tend to fall into two categories: most are jokes about “playing with your food,” while roughly one in ten express amazement at the performer’s skill with unlikely instruments.

29. 8.8 million views between December 2018 and December 2021.

30. On the process of “churnalism” and the re-circulation of viral music content, see O’Hara (2018).

31. Vegetable-based orchestras are not an uncommon sight in the world of experimental music; they offer a similar juxtaposition of unexpected materials, treated seriously. Their reception is much the same as Patanen’s; see, for instance, the blend of bemused detachment and genuine appreciation in Lyall (2003).

32. Just as the limited gamuts of Patanen’s sweet potato ocarinas are precisely calibrated for performing “Africa,” so too are the strawberries configured especially for “Teardrop”: they perfectly cover the four notes of the song’s prominent harpsichord part.

33. Acoustic guitar strings are generally made of steel, and tightly wrapped in a bronze alloy made of copper and zinc. The two or three highest strings are generally left unwrapped in order to keep tuning and tension consistent. Stricagnoli has apparently replaced all but the lowest string (which corresponds, unstopped, with a traditional guitar’s G string) with unwrapped steel strings better suited to his high-pitched tuning.

34. A “hammer-on” is a technique in which the guitarist uses a finger on their left hand to press a string down on the fretboard with enough force to create a sound. A “pull-off” is the reverse, in which the performer plays a note by subtly pulling the string to one side as they release it. In a “left-hand pluck,” the performer plucks an open string with their left hand, over the fretboard.
Return to text

35. Throughout the performance, Stricagnoli moves the capo in order to change the pitch of the open strings. For instance, he moves two frets lower for a transitional section just after the end of Example 8 (0:45), and then much higher for the verse (1:09).

36. See De Souza (2017, 53–63 and 78–82). De Souza characterizes instrumental habits as dense, multisensory mappings between “a lived body, an affordance space, and an enactive landscape” (81). These interactions involve feedback in tactile, visual, and audible forms, and the performer’s interface with the instrument depends upon well-learned correspondences between those three streams—correspondences that will be disrupted or re-arranged by retuning.

37. See, for example, Kawehi, “Gear-splain and Going on Tour!” (September 23, 2017) https://youtu.be/tsZmWXbEj-c. Joseph Auner (2003, 105) has argued that vocal loops and samples in popular music have often functioned as a “posthuman ventriloquism” that represents (pace Hayles 1999) “both destructive and liberating implications,” particularly with regard to its intersections with other aspects of the performer’s identity, such as race and gender. Kawehi embraces cybernetic imagery throughout her oeuvre, presenting herself as half robot on the cover of her 2016 album Evolution, and dramatizing her own posthuman vocality in the effects-laden video “Anthem” (2015), in which she dons a child’s cardboard robot costume and repeatedly pulls off her own singing head, clad each time in a box labelled with a musical function: bass, beatbox, etc.; see https://www.youtube.com/watch?v=ykIwtNyASVM.

38. Because my music notation software is unable to render special noteheads at a smaller size, harmonized pitches and vocal percussion beats are shown at full size.

39. Here, eagle-eyed viewers may notice that Kawehi’s keyboard seems to be transposed down a perfect fifth: she plays C-A in order to produce F-D.

40. The second verse/chorus loop, the “Yeah!” previewed at the beginning of the video, is preceded by what might be a mistake in m. 15. Kawehi sings “yeah!” in the second half of one of the established two-measure cycles, and cuts herself off quickly. It is unclear if this is an error, or a way of warming up her voice for the next measure, knowing that the preliminary “Yeah” would not be recorded.

41. See, for example, her 2017 performance of Nine Inch Nails’ “Closer” in Carrboro, NC (https://www.youtube.com/watch?v=lKf3yrMoCI4).

42. Barna (2020) identifies the “dance chorus” as a distinct formal unit in contemporary pop. One of its important features is that it stands independently within the form, rather than being dependent on what precedes it or serving as a transition, as she argues a post-chorus would be. While Nobile (2020, 118) mostly echoes Spicer’s definition and emphasizes the independence of the postchorus (noting that it often includes text), he also acknowledges that “the distinction between transition and postchorus is not always clear-cut.”

43. It is worth noting that by extending and repeating a truncated version of the chorus’s chord progression, the postchorus in “Heart-Shaped Box” acts in much the same way as another postchorus named by Spicer, found in Stephen Stills’s “Love the One You’re With” (1970).

44. See Spicer (2011, [10n14]). To make one further point via an argument about punctuation: Spicer’s formulation of “postchorus” without any hyphen seems to reinforce the idea that a postchorus plays a specific formal role, and is not simply a module that appears after the chorus (i.e., a “post-chorus,” as many other sources including Barna 2020 style it).

45. Gary Burns (1987) provides both a brief archaeology of the term “hook,” and a typology of the many forms it might take. The permutations of what constitutes a “hook” are too many and varied to rehearse here, but examples run the gamut from a recognizable musical feature such as an opening guitar riff or the first or last line of a chorus, to less obvious characteristics such as a distinctive, repeated accompaniment, the appearance of an unexpected instrument, timbre, or even a sound effect. With the emergence and rise of new genres and subgenres, the forms of the hook have only proliferated in the decades since Burns’s article (which is primarily concerned with classic rock and pop from the 1960s through 1980s), and the term may usefully be applied—or its absence noted—to genres like hip-hop and electronic dance music as well.

46. Veridical expectations have received much less scholarly attention than their opposite term, schematic expectations, which describes the kinds of general tonal, melodic, or rhythmic tasks in which psychologists are often interested: stimuli and responses that illuminate general musical phenomena outside of specific contexts. David Huron (2004, 702) makes use of veridical expectation in some of his analyses, most notably an article on humor in the recordings of PDQ Bach. In it, he identifies an example of misquotation, in which PDQ Bach (the stage name of composer Peter Schickele) presents a truncated version of the wandering theme from the slow movement of Beethoven’s Fifth Symphony. The recomposed theme, brought curtly to a conclusion with a highly conventional cadence after only four measures, is only humorous if the listener is aware of how much longer it should be.

47. In a sense, each iSongs video functions not only as a performance but also as a tutorial for the software; on video-based music lessons see Miller (2012, 155–182).

48. So quickly, in fact, that I have sometimes suspected that parts of the video have been sped up and re-synchronized. Some moments appear almost as if frames have been omitted in order to speed them up. After careful study I am inclined to ascribe these inconsistencies to the difficulties of filming the movement of fingers against the bright background of a phone screen, but some fast-motion editing remains a possibility.

49. Quantization is a tool common to both DAWs and MIDI input for notation software. It enables a computer to smooth out slight performance mistakes, and in so doing to accept user input with the flexibility that a human transcription might allow. Setting the quantization to sixteenth notes, for instance, ensures that the computer’s rendering of an imperfect performance will conform to a clear metric grid, rather than using extremely small note values and rests to capture every nuance of the performer’s microtiming.
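
As a rough illustration of the idea in this note, the following Python sketch snaps a list of onset times to a sixteenth-note grid. It is not drawn from any DAW or notation program; the function name `quantize`, the use of quarter-note beats as the time unit, and the sample onsets are all assumptions made for the example.

```python
from fractions import Fraction


def quantize(onsets, grid=Fraction(1, 4)):
    """Snap each onset (in quarter-note beats) to the nearest grid point.

    The default grid of 1/4 beat corresponds to sixteenth notes when the
    beat is a quarter note. Both the name and the units are hypothetical.
    """
    # Divide by the grid size, round to the nearest whole step, scale back.
    return [round(Fraction(t) / grid) * grid for t in onsets]


# A slightly uneven performance of five successive sixteenth notes.
played = [0.02, 0.27, 0.49, 0.74, 1.01]
print([str(q) for q in quantize(played)])  # ['0', '1/4', '1/2', '3/4', '1']
```

A coarser grid, such as eighth notes (`Fraction(1, 2)`), would pull several of these onsets onto the same position, which suggests why the grid size is a choice the user has to make rather than a neutral default.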

50. It is likely that g♯ in m. 9 is an error, and the chord should actually be E/G♯—an option available in the chord editor, but not used.

51. By default, the app offers sus2 and sus4 chords. Here, sus4 is selected but the interval omitted from the display, most likely because of the limited space on screen (the iPad version of GarageBand displays the full figure). And while individual notes cannot be changed once a chord is added to the palette, various voicings of each chord are available: the vertical subdivisions of each button correspond to the root, fifth, and root in the bass (the gray portion at the bottom of each button) and then five ever-higher voicings in the white portion.

52. On the visual and musical metaphors expressed by most DAWs—which arguably shape the music that is made within them—see Bell, Hein, and Ratcliffe (2015) and Marrington (2011).

53. See comments from users “Trackside Films,” “wexican,” and “Shxdo,” in that order. (YouTube does not currently allow direct links to specific comments, nor does it offer precise times and dates of their appearance.)

54. The same principle applies to live performance. Consider, for example, the German jazz singer Michael Schiefel, whose original compositions are often built upon looped a cappella accompaniments. When he performs live, Schiefel builds these musical textures up piece by piece, often spending a full minute or two scatting into a microphone and tapping on a control panel before he sings any lyrics. On his recordings, however, these “buildup” segments are often missing; his studio versions start with conventional vocal-instrumental introductions rather than extended loop collages. Compare, for instance, a live performance of “My Animals” (https://www.youtube.com/watch?v=IjNUfonSGiA) with the studio version of the same song from his 2006 album Don’t Touch My Animals. (I am grateful to Alex Rehding for introducing me to Schiefel’s music.) Furthermore, as Malawey (2020, 132–133) notes, the contingencies of live performance can intervene in loop-based music; she recalls attending a concert by singer James Blake, in which an increasingly irate Blake was forced to ask the audience to be quiet so he could record the loops necessary to begin a song.

Copyright Statement

[1] Copyrights for individual items published in Music Theory Online (MTO) are held by their authors. Items appearing in MTO may be saved and stored in electronic or paper form, and may be shared among individuals for purposes of scholarly research or discussion, but may not be republished in any form, electronic or print, without prior, written permission from the author(s), and advance notification of the editors of MTO.

[2] Any redistributed form of items published in MTO must include the following information in a form appropriate to the medium in which the items are to appear:

This item appeared in Music Theory Online in [VOLUME #, ISSUE #] on [DAY/MONTH/YEAR]. It was authored by [FULL NAME, EMAIL ADDRESS], with whose written permission it is reprinted here.

[3] Libraries may archive issues of MTO in electronic or paper form for public access so long as each issue is stored in its entirety, and no access fee is charged. Exceptions to these requirements must be approved in writing by the editors of MTO, who will act in accordance with the decisions of the Society for Music Theory.

This document and all portions thereof are protected by U.S. and international copyright laws. Material contained herein may be copied and/or distributed for research purposes only.

Prepared by Andrew Eason, Editorial Assistant
