Pre-print text. Final text published in Rethinking Music through Science and Technology Studies, ed. Antoine Hennion and Christophe Levaux (London: Routledge, 2021).

David Trippett

Human Sounds: the Obscenity of Information

9

In 2017, Alexander Payne’s film Downsizing pursued an old thought experiment: what if humans could be shrunk, their perceptual worlds miniaturized, their bodies made Lilliputian all of an instant? How would they reexperience the space of the environment, of sound and light? In Payne’s film, the lead character, played by Matt Damon, opts in to a government program whereby he is to be shrunk by a factor of 2,744, along with thousands of others, to form an experimental “miniature” community in an attempt to solve the climate crisis. His wife—emerging as a science-fiction skeptic—opts out of the downsizing process at the last minute, leaving Damon’s character to ponder the wisdom of his irreversible choice. With its comedic focus on relationships, the film skirts questions of realism, that is, the unfamiliar, “real” perceptual world such a shrunken “human” might experience.i Miniaturized people, including Damon, retain their deep voices, their perceptual ranges and acuity, and their sensory proprioception; they feel no change of atmospheric pressure, gain no insight into the newly massified world around them, and can still engage with their unminiaturized human interlocutors at will. Within such narrow dimensions, the only “loss” is body mass; a narrative fact rather than an experiential postulate.ii All other parameters of existence remain stable—in keeping with a genealogy of minihuman films from The Incredible Shrinking Man (1957) to Ant-Man (2015)—presenting a shallow fiction that leaves perceptual questions unasked.
In abstract terms, body morphology can be taken as a contested object, an assertion of matter in its relation to identity, one that places into relief the character of the relationship between perception, self-perception, and objects. “My ‘own’ body is material,” Jane Bennett asserts, “and yet this vital materiality is not fully or exclusively human,” for it depends on myriad microscopic bacteria (“swarms of foreigners”) that neither are human nor can be perceived by the naked eye (“a nested set of microbiomes”) (2010: 112–13). In this context, the relation between media and realism is anchored by human perception, which in turn is rendered idiosyncratic by virtue of an individual’s unique sense apparatus, acuity, plasticity, individual history, and training. Historically, the body has been figured as an object of difference in contexts as divergent as Thomas Aquinas’s Dominican theology, where physical things become individualized through “matter signed with quantity” (rather than form; see Funkenstein 1986: 135), to Jean Baudrillard’s poststructuralist critique of media, with its three orders of simulacra—a classical order of “counterfeit,” an industrial order of “indefinite reproducibility,” and a digital order of “simulation”—where this third order denies the possibility of counterfeiting an original body matter, flattening out uniqueness in favor of fractal bodies, or “models from which all forms proceed according to modulated differences” (Baudrillard [1976] 2017: 77). In the context of digital film, duly enmeshed in Baudrillard’s narrative of simulation, modeled perception (simulation of what it is like to hear or see as a different being) brings the argument full circle, where body morphology itself—that whose uniqueness becomes endangered by means of its simulation—demands individuality of matter via the very medium that denies it this, whether as a gnat, a crocodile, or a miniature Matt Damon. 
“Everything began with objects,” Baudrillard once remarked, in a parody of Genesis. “Yet there is no longer a system of objects” (Baudrillard 1988: 11). Understanding of the body both as unique in its perceptual apparatus and as a unique configuration of matter no longer has any meaning in digital representation, he infers, which makes the question of realism in film redundant by definition: “real” is forever referable to a subject position constructed by the audiovisual technology. First published in 1987, this statement about the role of digital information in society formed part of Baudrillard’s submission for the “habilitation” at the Sorbonne. Originally titled L’Autre par lui-même and translated into English as The Ecstasy of Communication, it bore the mildly sarcastic title “Habilitation” and was nearly rejected—perhaps because of what some have described as the misogynistic overtones of his concept of seduction and the argument’s recurrent allusions to pornography and the sexual body, perhaps because of the brevity of its claims (the original French edition is barely 92 small pages in length). For present purposes, what is remarkable about it is the book’s articulation of an enduring confrontation between the human body and digital media at a time when digital screen media was—by today’s standards—in its infancy. Two decades into the 21st century and the replication of voices through neural networks, the concept of the deepfake and of the legal ownership of singing holograms from Hatsune Miku to Maria Callas registers a further shift in the relation of mediatized appearance to the putatively real, a shift whose technological details Baudrillard could hardly have envisaged. Across this divide, the injured concept—the possibility of a unique identity—remains recalcitrant, perhaps because most witnesses to these simulations would still regard themselves as individuals. 
To be sure, it now seems unsurprising for a postmodern philosopher in the 1980s to signal as casualties the principles of a reality beyond the play of appearances, the existence of unique individual subjects, claims for a truth or a metaphysics that persists. But for Baudrillard in 1988 the loss of these concepts is positioned historically in relation to the radical increase of matterless data that accompanied screen media, summarized in the metaphor of the screen’s flat surface. It was superficiality made literal. On the face of it, it was as though these older cultural tropes had somehow been given up recently in response to the proliferation of cathode-ray TVs and the digital audio of Sony’s PCM-1 encoder: “Today the scene and the mirror have given way to the screen and the network. There is no longer any transcendence of depth,” he writes defiantly, “but only the immanent surface of operations unfolding, the smooth and functional surface of communication” (1988: 12). If these were technological affordances, they were thoroughly unwelcome. Trapped on this infinite surface, we learn, Baudrillard’s subject reflexively inhabits an overly transparent world in which aesthetic experience becomes entirely soluble in information streams, a world saturated in digital signs and their instantaneous networks; this environment creates a historically unprecedented identity that rebounds on the subject, who—unable to “produce the limits of his being,” unable to produce her- or himself as a mirror—becomes “a pure screen, a pure absorption and re-absorption surface of the influent networks” (27). More than a transformation, this represents a cold loss of identity, whose familiar grain of tangibility is no longer valid. Dystopian rhetoric aside, Baudrillard’s claim that simulation is built on a world of code has proven influential.iii It asserts that contemporary culture can be coded into ones and zeros, that “digitality is among us. 
It haunts all the messages and signs of our society” (Baudrillard [1976] 2017: 82). Alongside a darkening worldview in which knowledge and the very processes of thought are subsumed within data flows, the social control implied by a coded environment sets up a formidable political adversary that—for Baudrillard—must be resisted: “You can’t fight the code with political economy, nor with ‘revolution’…can we fight DNA?…Perhaps death and death alone, the reversibility of death, belongs to a higher order than the code” (25). Such rhetoric betrays a concrete reality in the late 1990s, one that fed the anxiety of identity loss implied by digital media. While the rhetoric of the deepfake was decades away, genetic cloning was a crisp, new technology. Across multiple essays Baudrillard rails specifically against human cloning, which he posits as code applied to no-longer-unique bodies: “The Father and Mother have disappeared…in the service of a matrix called code” (Baudrillard 2010: 96). The genetic formula inscribed into cells undoes the body’s physical reality by virtue of its potential for infinite replication; hence the simple fact of DNA (“the prosthesis par excellence”) transforms it into a simulation. In this guise, the body becomes an assemblage of virtual quantities, “a stockpile of information and of messages, a fodder for data processing…The individual is no longer anything but a cancerous metastasis of its base formula” (99–100). In its multiple iterations, this verdict passes through sarcasm (“it allows complex beings to achieve the destiny of protozoas”—96) to protest (“without the Other as mirror, as reflecting surface, consciousness of self is threatened with irradiation in the void”—140), turning finally to moral outrage, where the “subtle death” of doubling constitutes an innate self-destructiveness or “the transparency of evil” to which—pace Jean-Paul Sartre—even “the hell of other people would have been preferable” (Baudrillard 1993: 139).
Denuded of aura, the individual is fatally cheapened in the process of simulation. With a nod to Sophocles’s Oedipus, this coding of bodies is “still incest, but without the tragedy” (138). It is the question of identity that forms the red thread in the discussion of digitality or quantification that follows. After a critique of Baudrillard’s historical technologies, this chapter identifies realism as a philosophical proposition that, in the digital age, has become synonymous with the relation of quantity to technology, a relation that has deep historical roots from the microscope in the mid-17th century to chronophotography in the 1890s and “imperceptible” pixels-per-inch in the 2010s. The calculation of perceptual difference, by applying the ratio of different body sizes to frequencies (whether in fiction or in acoustics), offers an explanation as to why frequency resolution has become a central parameter for realism in speech synthesis, technologies that synthesize the spoken voice, in a closing case study that indicates the extent to which identity and voice are no longer uniquely bonded nor primarily referable to physical bodies.

Obscenity, or dots on a line

Before pursuing this critique in the context of speech synthesis and its modes of simulation, two concepts that shape Baudrillard’s understanding of digital media bear some consideration. The first is obscenity; the second, communication. Both are central to his short book Ecstasy of Communication and undergird the anxieties provoked by mediatized information circa 1987. The primary definition of obscenity in the Oxford English Dictionary is that which is “offensive or grossly indecent, lewd.”iv Etymologically, it is often linked to the Latin caenum (“filth”). But the etymology of the term is unclear.
A disputed reading of the Latin grammarian Marcus Terentius Varro’s De lingua Latina from the first century BC led to the notion that scaena (from ob-scaenum) could refer to the stage, where the indecent content in classical plays, content that is offensive to the eyes of the gods, should take place offstage or concealed from public view. This would include the classical acts of moral outrage: Medea’s enraged murder of her sons sired by Jason, after he abandons her for a younger princess, in Euripides’s Medea; Tarquin’s rape of Lucretia in Ovid’s poem; and the aforementioned Oedipus, where at the end of the play the protagonist must gouge out his own eyes. In a literal sense, obscene—in this “folk” etymology—came to mean that which eradicates our gaze. While this etymology is almost certainly wrong,v it is precisely the definition that Baudrillard inverts in his critique of screen media. Obscenity begins where there is no more spectacle, no more stage, no more theatre, no more illusion, when every-thing becomes immediately transparent, visible, exposed in the raw and inexorable light of information and communication. / We no longer partake of the drama of alienation, but are in the ecstasy of communication. And this ecstasy is obscene. Obscene is that which eliminates the gaze, the image and every representation. (Baudrillard 1988: 22) Here, far from concealment, the gesture of the obscene is that of zooming up infinitely close, of seeing so clearly as to be indistinguishable with what is taking place. In erasing the gap between what is taking place and acts of witnessing, it abolishes all schemes of representation, all space for mystery or hermeneutics. Significantly, it can accomplish this only through declarative, technological means: code and pixel density. 
For each time Baudrillard mentions obscenity, he loops back to the agencies of digital “information and communication.” Admittedly, these terms remain undetermined, and with the benefit of hindsight, we might simply read his statements as an undeveloped intellectual position, a way station en route to his more mature work on simulation and the hyperreal. But against such teleology, it also reverts to the fantasy of encoding personal experience and aspects of personal identity—our “private universe” that hitherto had been a secretive matter. It is in this spirit that, riffing on common associations of obscenity, he clarified: Obscenity is not confined to sexuality, because today there is a pornography of information and communication, a pornography of circuits and networks, and objects in their legibility…It is no longer the obscenity of the hidden, the repressed, the obscure, but that of the visible, the all-too-visible, the more-visible-than-visible; it is the obscenity of that which no longer contains a secret and is entirely soluble in information and communication. (Baudrillard 1988: 22) If this metaphor emanates from a technical capacity for visual close-ups that decontextualize for screen viewers explicit views of what is not supposed to be seen, it sits within an established genealogy of technological affordances for sensory perception, including in the electronic manipulation of sound.vi Of course, the visual affect of extreme close-ups is multivalent, and can equally be enlisted to deny the obscenity of a totalizing, data-rich sensation (“more-visible-than-visible”) that Baudrillard has in mind. For as Laura Marks reminds us, the electronic effects of pixelation in close-ups often draw attention to texture rather than realism, creating perceptions of haptic or tactile images through blurring or other manipulation of the underlying bitmap (Marks 2000: 176).
But it takes a moment to remind ourselves that in the 1980s, when these sentences were being written, virtual artifacts—putatively immaterial objects like digital holograms, 3D modeling, and cyberspace—had barely been invented; according to Martin Hilbert and Priscila López, less than one percent of the world’s media storage capacity was digital in 1986; by 2007 it would become 94 percent (2011: 60–65). And the first practical video coding format, discrete cosine transform (DCT)—initially proposed as an image compression technique in 1972—was only adopted for compressing online video in 1988, the year of Baudrillard’s English translation. If we step outside Baudrillard’s philosophical frame, then, the question arises as to what form of data is at issue when a writer such as this refers to “the raw and inexorable light of information.” In the context of realism, one answer is resolution. Here, realism is posed as a quantitative proposition: what density of pixels or bit depth is needed to fully dissolve the secrets of aesthetic experience in “information”? Bits, unlike atoms, have no mass, color, or size and can travel at the speed of light. They are symbols, commonly considered as ones and zeros, to be set and reset as declarative assertions with no capacity for ambiguity, and once famously described by Nicholas Negroponte as “a state of being: on or off, true or false, up or down, in or out, black or white” (1995: 14). For sound, they are synonymous with audio sample rates: a 12-bit sampler outputs 12 bits of data for every sample, and the sonic resolution relates to the number of samples per second. As such, they are units of a symbolic sonic existence—capable of higher and lower “definition”—that challenge the singular authenticity of acoustic sound. Since its first use in 1936, the term high definition has shifted continuously from its origins in the number of lines of an analog TV screen to audiovisual media, from 8K imagery to so-called “lossless” sound. 
It is unnecessary to rehearse this history; suffice it to say that the upper limit of high-resolution screen technology has reached a surface of at least 220 million pixels on supercomputers in San Diego and at least 192,000 samples per second for audio recordings at 24-bit depth.vii While this represents a significant increase in relation to what preceded it in 1987, there is no reason to assume it has reached an absolute limit. For a philosophy of perception, however, it is indicative that the quantitative argument has prevailed: detractors to recent marketing strategies, such as Apple’s “retina” display (a branding tool for screens of putatively higher pixel density than the cellular organization in the eye at given viewing distances), have implicitly accepted the quantitative realism on offer. For Raymond Soneira, President of DisplayMate Technologies, 477 pixels per inch would be needed at a viewing distance of 12 inches (Steve Jobs had asserted circa 300 ppi).viii Technology journalist John Brownlee judged similarly: “Apple’s Retina Displays are only about 33% of the way there” (2012). By protesting against the degree required, both tacitly accepted the notion that ever greater density will ultimately render media screens indistinguishable from vision, what Jonathan Sterne once called ‘the dream of verisimilitude’ (Sterne 2012: 4). But given the limits of perceptual mechanisms in human eyes and ears that were determined in the 19th century by the likes of Thomas Young and Rudolf König based on a wave theory of light and sound, increases of resolution do not lead to a hypothetical endpoint, the fabled hyperreality, where it is impossible to distinguish sound samples from real voices at the level of sensation. A multimodal sensorium combined with nondeterministic cognitive processing cannot be accounted for in a mathematical mapping of physiology, of cell onto pixel, cochlear hair onto audio sample. 
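As a rough illustration of the quantities at issue, the audio figures cited above can be worked through in a few lines. This is only a sketch: the function names are hypothetical, and it assumes uncompressed linear PCM on a single channel, which the text does not specify. It treats bit depth (bits per sample) and sample rate (samples per second) as two separate parameters whose product gives the data rate.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values a sample of this bit depth can take."""
    return 2 ** bit_depth


def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw data rate of uncompressed linear PCM (an assumption of this sketch)."""
    return sample_rate_hz * bit_depth * channels


# A 12-bit sampler: every sample is one of 2**12 possible amplitude values.
levels_12_bit = quantization_levels(12)

# The high-resolution figure cited in the text: 192,000 samples per second at 24-bit depth.
levels_24_bit = quantization_levels(24)
rate_hi_res = pcm_bits_per_second(192_000, 24)

print(f"12-bit sample: {levels_12_bit:,} amplitude levels")
print(f"24-bit sample: {levels_24_bit:,} amplitude levels")
print(f"192 kHz / 24-bit mono: {rate_hi_res:,} bits per second")
```

However large these numbers become, they remain counts of discrete values, which is the point the surrounding argument turns on.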
As early as 1994, Michel Chion argued that the definition of a sound signal, rather than any correspondence to reality, is what creates a ‘hyperreal effect’ for listeners, citing the habit in sound recording of using ‘more treble than would be heard in a real situation’ (Chion 1994: 98-99). And as the editors of a more recent volume on stereo observe, stereophonic listening and fidelity were never “synonymous or even fully coterminous,” reliant on logics and practical needs that construct listener positions with no obligation to what is taken to be quantitatively “true” (Théberge, Devine and Everett 2015: 27). My claim, insofar as this discussion permits one, is that a historical perspective indicates that the sensation of the real, like the unreal or simulation, has no technological correlate. It remains a philosophical proposition. If we cast a glance back in history, the principle is essentially that of dots on a line. As such, it can be explained more clearly in relation to its original formation in the paradoxes of the pre-Alexandrian philosopher Zeno of Elea (c. 490–c. 430 BC), whose argument against pure motion is equally applicable to that against a realism defined by quantity. Recorded in Aristotle’s Physics, four paradoxes ascribed to Zeno concern the relations of time and motion. Zeno’s arrow—the third paradox—characterizes apparently opposed states of existence, both of which are exclusively true yet mutually contradictory: An arrow in flight is always traveling; yet at any given point in time it is stationary. Although it is always in motion, it cannot have time to move unless it is permitted more than one instant—that is, permitted to occupy at least two successive positions. At any given moment, therefore, the arrow is at rest, motionless at each point in its swift course. 
No matter how many individual, static moments accrue, even an explosion of pointillist speckles could never equate to pure motion as perceived by the eye witnessing the arrow in flight. This paradox was co-opted in the early 20th century by another French philosopher enthralled by the implications of the screen to explain the illusion of understanding duration, or any linear process of becoming. For Henri Bergson in 1907, parallel technologies resulting from developments in chronophotography in the 1880s, such as Edison’s kinetoscope and the Lumière brothers’ cinématographe, provided a new visual basis for interpreting the mind and its cognitive processes for relating sense acuity to an environment. Preoccupied above all by the necessities of action, Bergson asserts, “The intellect, like the senses, is limited to taking, at intervals, views that are instantaneous and by that very fact immobile of the becoming of matter” (Bergson 2005: 224). In Bergson’s view, this illusion is embodied in the paradigm of moving-image technology, whose rapidly successive still pictures gave the impression of motion to audiences through the agency of the revolving mechanism. Figure 9.1, taken from Etienne-Jules Marey’s study Cycliste (c. 1894), illustrates the principle of chronophotography that Bergson had in mind. The paradox illuminated by such seemingly mobile images, as Bergson recognized, was that of Zeno: In order that the pictures may be animated, there must be movement somewhere. The movement does indeed exist here; it is in the apparatus. It is because the film of the cinematograph unrolls, bringing in turn the different photographs of the scene to continue each other, that each actor of the scene recovers his mobility…Such is the contrivance of the cinematograph. And such is also that of our knowledge. Instead of attaching ourselves to the inner becoming of things, we place ourselves outside them in order to recompose their becoming artificially. 
We take snapshots, as it were, of the passing reality, and, as these are characteristic of the reality, we have only to string them on a becoming, abstract, uniform and invisible, situated at the back of the apparatus of knowledge, in order to imitate what there is that is characteristic in this becoming itself. Perception, intellection, language proceed so in general. Whether we would think becoming, or express it, or even perceive it, we hardly do anything else than set going a kind of cinematograph inside us. (Bergson 2005: 251–52). This practice of interpreting physical or mental functions in terms of tools or technical apparatus may resonate with more recent narratives of technogenesis (that humans coevolved with tools and technologies, where interior thought relates dynamically to exterior technicity) that we associate with thinkers such as Katherine Hayles (2012) and Bernhard Siegert (2003), particularly in light of recent re-evaluations of Ernst Kapp’s pioneering Elements of a Philosophy of Technology (1877). But historically, interpreting cognitive functions in terms of new mechanical paradigms was widespread in the wake of the Exposition universelle of 1889 and 1900. Bergson’s insight into the cinématographe mechanism was, however, to highlight not our ability but our failure truly to understand motion as a metaphor for becoming or duration. He regarded as foolish attempts to realize pure motion simply by intensifying the artifice, that is, by increasing the speed of cylinder rotation (or “resolution”), thereby making the intervals between states infinitely smaller: “Before the intervening movement you will always experience the disappointment of the child who tries by clapping his hands together to crush the smoke” (Bergson 2005: 254). 
As this quip illustrates, the principle of pure continuity was not simply a matter of quantity, or intensity of “information.” Continuity, or real lived experience, is of a different order to concatenated instants (spatialized as pixelation), and hence simulations. In 1907, just as now, these were not mathematically relatable to human perception. Accepting such “foolishness,” our instinctive trust of sensory feedback ensures that one enduring definition of realism is the degree to which a simulated object resembles its real-world object for sentient perception.ix For digital visual media, this indexical relation pertains for games and computer-generated images.x For sonic media, putatively authentic sound samples are indexical to real-world sounds, but, by implication, to their sound sources also. This implies an interface between sensation and response, whereby listeners become aware of some level of cognitive response beyond cognizing the bare sensory stimulus. That is to say, I am aware of my reaction to the sound of the voice calling me—what the American semiotician John Deely has termed the species expressae: “the cognitive response of the organism to the cognitive experience of a stimulus” (1994: 134).xi Discourses on realism in auditory gaming samples and sonic immersion have become well established since the millennium (Jørgensen 2006; Grimshaw 2008). But the claim that a sound signal can be replicated precisely enough that its output fools the cognitive response of listeners on both of Deely’s levels—as cognizing organism and as raw stimulus—remains untested. We might call it the “perfect speaker” hypothesis: where a speaker not only is indistinguishable from the sound source but also inspires the same emotional reaction.
To be sure, in the context of audiophile culture, speaker manufacturers’ reliance on indexical relationships between sound source and sound sample has been applied to advertising for decades—and, as the British firm Mordaunt-Short’s phono-realist advertisement from 1988 implies, not only for humans. Figure 9.2, from What Hi-Fi? magazine, validates the speaker’s sonic authenticity by a Humboldt penguin’s confusion over the object of its amorous affection. In the absence of a firm definition for high-resolution audio for ears, comparable to the pixels-per-inch for retinas, perception must remain the register of realism.xii

Speeds of existence: multiples of 1,000

A debate coeval with that of cinematographic realism is how to calculate empirically the difference in sense acuity and cognition between the perception of different humans. Just after Bergson, the philosopher Wilhelm Dilthey argued that poetry had always used language to produce “the impression and illusion of reality” ([1910] 1985). To that end, it too becomes an analogue technology that transports its readers’ cognitive sense into alien realities, not unlike a trip to the flicks. The reader, writes Dilthey: finds himself in a world of appearance not subject to the necessities of his actual existence (Existenz). But the [poetic] work heightens the reader’s feeling of his human existence (Daseinsgefühl). For the person confined by the course of his own life, it satisfies the longing to experience possibilities which he himself cannot realize. It opens up to him a view into a higher and more powerful world. (250)xiii Far from a dusty play of signs, the pleasures afforded by such a new experience are sensory, we learn: “pleasure in sound, rhythm, and visual clarity” that contributes to a sequence of psychic processes culminating in “a genuine understanding of an event on the basis of its relations to the whole scope of life” (251).
Since research by figures like Rudolph König and Carl Stumpf proved empirically that auditory worlds existed beyond the range of human hearing, visual worlds beyond human sight, the very concept of a single reality appeared naively anthropocentric, on this empirical basis, perhaps for the first time. One contemporary example of such an attitude was Dilthey’s teacher, the German entomologist Karl Ernst de Baer, who asked the corresponding question: “Which version of nature is the right one?” This was the title of a lecture he gave in May 1860 (to mark the founding of the Russian Entomological Society in St. Petersburg), where he presented listeners with a thought experiment concerning sensory perception. It goes as follows: if a human life span of 80 years consists of 29,200 days, and if this were to pass by one thousand times faster giving a compressed life span of 29 days, which could again be sped up by a factor of one thousand, it would result in a total life span of 41 to 42 minutes (41m 46s), and the corresponding rate of perception would be a million times faster than usual. For such a person, de Baer suggests, the organic world would probably appear disappointingly static, but other experiences currently unavailable to us would be accessible. “All the sounds we hear would certainly be inaudible to such people, if their ears remain morphologically similar to ours; but perhaps they would perceive sounds that we do not hear, indeed perhaps they would even hear light that we see” (de Baer 1862: 30, author’s translation). Returning to the first temporal compression (one-thousandth of a full life), he calculates that the highest sounds we perceive, vibrating at 48,000 times between two pulsations,xiv would vibrate only 48 times between pulsations for people of shortened life span, hence they would sound low (30–31). 
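The arithmetic of the thought experiment as reported above can be retraced in a short sketch. The function names are hypothetical, and the code assumes a 365-day year and the rounding of the first compression to 29 whole days, as the figures quoted in the text imply; exact fractions are used so that the rounding steps stay visible.

```python
from fractions import Fraction

DAYS_PER_YEAR = 365          # assumption: leap days ignored, as 29,200 days implies
MINUTES_PER_DAY = 24 * 60


def compress(quantity, factor=1000):
    """Divide a duration or a vibration count by the compression factor."""
    return Fraction(quantity) / factor


def as_minutes_seconds(days):
    """Convert a duration in days to whole minutes plus rounded seconds."""
    total_minutes = Fraction(days) * MINUTES_PER_DAY
    minutes = int(total_minutes)
    seconds = round((total_minutes - minutes) * 60)
    return minutes, seconds


full_life_days = 80 * DAYS_PER_YEAR              # 29,200 days
first_compression = compress(full_life_days)     # 29.2 days, quoted as 29 days
second_compression = compress(29)                # second step applied to the rounded 29 days
minutes, seconds = as_minutes_seconds(second_compression)

# Highest audible sound: 48,000 vibrations per interval becomes 48 after one compression.
perceived_vibrations = compress(48_000)

print(f"Full life: {full_life_days:,} days")
print(f"First compression: {float(first_compression)} days")
print(f"Second compression: {minutes}m {seconds}s")
print(f"Perceived vibrations: {int(perceived_vibrations)}")
```

Applying the second compression to the unrounded 29.2 days instead gives just over 42 minutes, which is consistent with the range of “41 to 42 minutes” given in the text.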
At the upper end, even the second compression, resulting in a 42-minute life, he argues, would not quite open up our perceptual apparatus to an ether vibrating at “several hundred billion times a second.” But we could take the idea of shortening a real life further, until these vibrations of the ether, which we currently experience as light and colour, actually become audible. And might there yet be in nature quite different vibrations which are too fast for us to experience as sound, and too slow to appear to us as light?…It is not at all preposterous to believe so…The planets, our earth among them, move through the ether with quite considerable speed and must set this speed. Is there not perhaps a sounding of outer space, a harmony of the spheres, that is audible to ears quite different to ours? (31–32) The quasi-scientific postulate of alien auditory realities and the apparently simple manner of calculating their relation to lived experience suggest the degree of fascination that limited perception held for those curious about the biological underpinning of human nature. As an entomologist, de Baer had insects in mind when comparing perceptual realities of beings of compressed and uncompressed lifespan. Fully half a century later, this animal-human underpinning would receive perhaps its most enduring articulation, one whose abandonment of categorical differentiation between sentient organisms has proven attractive for posthumanist discourse in our century. In 1909, the Baltic German biologist Jacob von Uexküll published Umwelt und Innenwelt der Tiere, in which he formalized his theory of Umwelt, whereby each sentient organism creates its unique environment by its capacity to receive only signals that register on its peculiar sense organs. It inhabits a bubble, its individual Umwelt, which is determined by what is perceived sensorially based on sense acuity (Merkwelt) and the uses to which these senses are regularly put, their habits or training (Wirkwelt).
As a result, the world is different, sensorially speaking, even for members of the same species. Uexküll’s theory has been recounted many times; here—following de Baer’s compressions by one thousand—I’ll mention only his example of the tick that, upon smelling the butyric acid of passing prey, must drop down from a tree onto the animal and begin boring for blood. It will feed only once before dying, so it can neither learn nor refine the procedure. According to Uexküll, an experiment at the Zoological Institute in Rostock determined that the tick could survive up to 18 years without nourishment, that is, 18 years on a tree branch before falling onto a passing animal. During this time, Uexküll hypothesizes, the animal goes into a kind of hibernation, unaware of time passing, as the perceptual moment is lengthened far beyond that of human perception:

The tick can wait 18 years; we humans cannot. Our human time consists of a series of moments, i.e. the shortest segments of time in which the world exhibits no changes. For a moment’s duration, the world stands still. A human moment lasts one-eighteenth of a second…The duration of a moment is different in different animals…During its waiting period, the tick is in a state similar to sleep…Time stands still in the tick’s waiting period…and it starts again only when the signal “butyric acid” awakens the tick to renewed activity. (von Uexküll [1934] 2010: 52)

Given the variant speeds of existence contemplated above, his conclusion that “the subject controls the time of its environment” bears a striking relation to the crank-driven technology of cinematography—which, as Inga Pollmann has argued, “played a key role in Uexküll’s development of his theory of Umwelt” (2013: 779). 
How, we might wonder, did he settle on the comparison of 18 years to one-18th of a second, a tick’s perceptual “moment” to a human’s?xv This is perhaps nothing but a multiplication of the most common frame rate for the cinematograph—18 frames per second becomes 18 years; the one, the speed at which human perception experiences “motion picture” from still photograms, the other, the length a tick can suspend consciousness without noticing a perceptual gap. Such a comparison is in keeping with Uexküll’s inclination to draw on contemporary technology (rather than make purely speculative leaps) to overcome the theoretical impasse of accessing other sensory worlds, worlds that remain empirically unknowable to individual humans, or—put differently—to determine externally an experience that is internally subjective and autonomous. If cinematography is here figured as only the most contemporary form of a deus ex machina that achieves what human perception cannot, questions over speeds of cognition, perception of frequencies, and sensory resolution have found expression across motley contexts concerned with the discursive proposition of human realism. Sampling these contexts serves to uncover a behavioral habit whereby recent technological apparatuses, rather than logic or imagination, are co-opted by writers to overcome the above theoretical impasse between external and internal determination of subjective perception. H. G. 
Wells’s short story The New Accelerator (1901) gave expression to the fantasy of time axis manipulation in the realm of fiction; it postulates an elixir that, when taken, speeds up the taker’s cognitive and physiological processes so that the subject feels identical in him- or herself but the external world is radically slowed down: “My heart…was beating a thousand times a second, but that caused me no discomfort at all.” The illustration accompanying a magazine reprint in 1926 (Figure 9.3) depicts the two leading characters casually observing the “statuesque” modern traffic flying past, as though in freeze frame (Wells 1926: 60). Unsurprisingly, perhaps, playful thinking about speeds and resolutions of perception has taken root within the scientific imagination over centuries. During the Great Plague of 1665 the English natural philosopher Robert Hooke first presented the idea of thinking oneself into new perceptual worlds through the new technology of the microscope. The illustration of a blue fly, as Hooke peered at it through Christopher White’s microscope (Figure 9.4), indicates the invisible, miniature world made visible by his device. With an ear for microsonic realities, his text Micrographia relates how the sound of bees’ wings, understood in relation to the vibration of a musical string (“tun’d unison to it”), vibrates “many hundreds, if not some thousands” of times per second, and may be “the quickest vibrating spontaneous motions of any in the world.”xvi This auditory extrapolation, from an optical fascination with fluttering wings, is indicative of how vision became only the first sense modality to be treated to changed acuity. 
“Mechanical inventions” to enhance hearing in comparable ways are “not improbable,” he speculates; they could result in the ability to hear ten furlongs away, we learn, or hold a conversation “through a wall a yard thick” by propagating “auditory” vibrations not through air, but along wire or via light.xvii The quasi-literary imagination behind Hooke’s ideas is set in relief by comparison to later, deliberate borrowings from fiction, such as those of the Leipzig Cantor and theorist Moritz Hauptmann, who in 1863 speculated on the hearing of alternatively sized bodies and the proportional relations between their new sensory realities. With reference to Swift’s Gulliver’s Travels (1726), he explained that Lilliputians, at half a foot tall, are 12 times smaller than Gulliver’s six-foot stature; Brobdingnagians are to Gulliver as he is to the Lilliputians: 12 times larger. Hence, the relations may be characterized as follows (Hauptmann 1863: 25):

Lilliputians    Gulliver    Brobdingnag
1/12          : 1         : 12/1
1             : 12        : 144

On this basis, the longest organ pipe of 32 feet (16Hz) would give the lowest C2 to Gulliver (and us), but, according to Swift’s ratios, the equivalent pipe would be only 2⅔ feet long for the Lilliputians and 384 feet long for the Brobdingnagians. As Figure 9.5 shows, the equivalent low C2 pitch for Lilliputians would therefore be Gulliver’s g (196Hz) below middle C (c1). Likewise, if a Lilliputian oboe tunes the orchestra to A = 440Hz, the 1:12 pitch ratio would result in an e4 for Gulliver, while the lowest pitch of the Lilliputian double bass, the E of a 16-foot organ pipe, would be equivalent to Gulliver’s b1. 
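Hauptmann’s proportional reasoning can be retraced numerically. The following is a minimal sketch, not Hauptmann’s own procedure: the helper names are hypothetical, equal temperament with A = 440Hz is assumed, and pitches are labelled in scientific pitch notation (middle C = C4), which differs from the octave labels used in the text. Multiplying a frequency by 12 raises it by three octaves and a fifth, so a 16Hz C lands near a G, a 440Hz A near a very high E, and the double bass’s low E near a B.

```python
import math

# Assumptions: equal temperament, A4 = 440 Hz, scientific pitch notation.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
SWIFT_RATIO = 12  # Gulliver is 12 times the size of a Lilliputian


def nearest_note(freq_hz, a4=440.0):
    """Name of the equal-tempered pitch closest to freq_hz."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4))
    midi = 69 + semitones_from_a4          # MIDI note 69 is A4
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def lilliputian_to_gulliver(freq_hz):
    """Frequency Gulliver would hear when a Lilliputian sounds 'freq_hz'."""
    return freq_hz * SWIFT_RATIO


def pipe_length_feet(gulliver_feet, relative_size):
    """Length of the equivalent pipe for a body of the given relative size."""
    return gulliver_feet * relative_size


if __name__ == "__main__":
    for label, freq in [("lowest C", 16.0), ("oboe A", 440.0), ("bass E", 41.2)]:
        heard = lilliputian_to_gulliver(freq)
        print(f"{label}: {heard:.1f} Hz, near {nearest_note(heard)}")
    print(pipe_length_feet(32, 1 / 12), pipe_length_feet(32, 12))
```

In this notation G3 is the g below middle C and B4 the b above it; the 192Hz that the multiplication yields is rounded to the nearest tempered pitch at about 196Hz.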
Hauptmann’s illustrations extend to the Brobdingnagian orchestra: “From the double basses, bass trombones, ophicleides, and everything that produces a deep tone, we [and Gulliver] would only see the movements of the players and feel the aerial vibrations.” With poetic infrasound in mind, an oblique reference to George Berkeley’s immaterialism was perhaps inevitable: “Sound, like color, is merely subjective. Neither exists without a listening ear and a seeing eye” (Hauptmann 1863: 26, author’s translation). While morphology of the ear can be assumed to behave according to simple ratios, and a quantitative approach to realism and corporeal difference might appear to have prevailed in this historical context, Hauptmann ultimately cautions that temporality is not so simple:

We cannot claim that a symphony that lasts…60 minutes for us, must last 5 minutes in Lilliput, 12 hours for the Brobdingnags. Other temporal dimensions may certainly be supposed…Since [Lilliputians and Brobdingnagians] inhabit our world, were conceived and warmed by our sun, so their year is the same, their day, their hours are just as long; but their metronome, pendulum, heartbeat, the movement of their accompaniment remain in relation to their body size. In short, conflicts and doubts arise everywhere, which we will soon leave behind, and we will have to be satisfied with the assumption that they are human conditions as they are to us, as befits humans of five to six feet tall. 
(Hauptmann 1863: 27)

Here, the theoretical impasse identified by Uexküll, between external determination of a sensory experience and internal subjective autonomy, remains recalcitrant as the poetic fictions of smaller and larger people are made to inhabit the putatively singular world with its singular mass and speed of rotation around the sun.xviii Different hypothetical life expectancies would further complicate the multiple temporalities, so ultimately Hauptmann’s thought experiment, alongside de Baer’s ratio-adjusted vibrations, already begins to undermine the quantitative approach to realism that technological apparatuses afford, whether microscopes, organ pipes, or cinématographes.

Voice resolution at 1:1,000

If, finally, we time-travel to the present, a more contemporary context for quantitative realism indicates that the discourse’s reliance on emergent technology remains undimmed in the third decade of the 21st century. Until 2010 speech synthesis, from digital assistants like Alexa and Siri to simulations that ventriloquize our own voices, typically functioned by sampling large amounts of recorded speech fragments from one individual so words can be reassembled into an utterance appropriate to the message being conveyed. The component sounds were simply concatenated into theoretically endless chains of human-like utterances, dubbed “concatenative text-to-speech” synthesis. While these remain rooted in phonemic sounds recorded in the real world, cobbled together by algorithm, a more recent approach sees synthetic voices emanate from the generation of raw waveforms, assembled one sample at a time and densely combined. That is, synthetic sound samples are pieced together to form waveforms at high resolution to mimic a real voice. Harking back to Baudrillard’s terms of reference, this constitutes a third-order simulation. 
An example is DeepMind’s WaveNet where, like melodies generated by Markov chains, a predictive distribution for each audio sample is conditioned on all previous ones, rising to at least 16,000 samples per second, a remarkable level of artifice in pursuit of what the WaveNet engineer Aäron van den Oord has called “subjective naturalness.” This artificial approach to natural voices aims to “directly model the raw waveform of the audio signal, one sample at a time” (2016). Given Uexküll’s ratio of 18 years:1/18 of a second, the sample rate is not arbitrary, as we shall see. A similarly synthetic process of voice simulation, at a resolution 1,000 times lower, is the Austrian composer Peter Ablinger’s Deus Cantando (God, singing) (2009). This is only one of the most recent spectral analyses of recorded speech that form the basis of his aptly named “speaking piano,” a computer-controlled player piano that replicates on the instrument’s 88 keys the decomposed sound spectrum of recorded human speech. As Ablinger explains: Using…16 units per second (about the limit of the player piano), the original [sound] source approaches the border of recognition within the reproduction. With practice listening the player piano can even perform structures possible for a listener to…understand as spoken sentences. (Ablinger, n.d.) That is, you can “hear” the piano pronounce words only when you simultaneously see its words or know them in advance. The speaking piano’s sample rate, 1,000 times lower than WaveNet’s simulation, teeters on the brink of comprehensible phonemes (i.e., far removed from a “perfect speaker”), and a visual analog might be the differently pixelated screens that Uexküll uses to imagine the different visual worlds for a human, a fly, and a mollusc, based on the cellular density in their retinas, where visual objects become progressively harder to make out (Uexküll [1934] 2010: 64–65). 
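The 1:1,000 relation between the two resolutions discussed above can be made concrete with a short calculation. This is a sketch for orientation only: the constant and function names are hypothetical, and the two rates are simply the figures cited in the text (16,000 samples per second for WaveNet, 16 units per second for the speaking piano).

```python
# Rates as cited in the text; all names here are illustrative.
WAVENET_RATE = 16_000   # samples per second
PIANO_RATE = 16         # units per second (Ablinger's stated limit)


def unit_duration_ms(rate_per_second):
    """Duration of a single sample or unit, in milliseconds."""
    return 1000 / rate_per_second


def units_in(seconds, rate_per_second):
    """Number of samples or units needed to cover a span of time."""
    return int(seconds * rate_per_second)


if __name__ == "__main__":
    print("resolution ratio:", WAVENET_RATE // PIANO_RATE)
    print("WaveNet sample:", unit_duration_ms(WAVENET_RATE), "ms")
    print("piano unit:    ", unit_duration_ms(PIANO_RATE), "ms")
    print("units in a 2-second phrase:",
          units_in(2, WAVENET_RATE), "vs", units_in(2, PIANO_RATE))
```

A single piano unit thus spans the same stretch of time as a thousand WaveNet samples, which gives a sense of why the piano’s speech hovers at the edge of recognition.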
For Ablinger, comprehensibility is secondary to investigating the liminal space between phonorealism and the innately musical medium of the 19th-century piano—or, as he puts it, “the observation of ‘reality’ via ‘music’” (Ablinger, n.d.). Faced with the question of what the “reality” of the sound of a human voice might be, technological innovation forces quantitative, frequency-based answers of the kind we’ve just sampled. While frequency rates vary, none is any the less artificial. Accepting the split of auditory perception into an infinite plurality, and with a continuing reliance on technological affordance, what we understand to be spoken sounds may become defined more by what can be simulated, rather than what any individual perception may take to be “real.” This would seem the tacit assumption behind more common assertions that a synthetic sound, when heard, becomes real in its own right.

Coda

It is a truism that, phenomenologically, voice and identity become interdependent over time for the subject; the timbre, intonation, and cadence of your vibrating physiology become, in part, your identifying sound. As Steven Connor famously put it: “Voice is not simply an emission of the body; it is also the imaginary production of a secondary body, a body double: a ‘voice-body’” (2000: 35). Beyond this monist coupling, the sensation of hearing oneself talk, the feel of our resonating throat, is characteristic of self-identity in a genetic sense. It is the first sound we hear in the outside world. So it seems unsurprising that it was the early materialists of the late 18th century who would recognize its self-identifying agency as such. 
The poetic preface to Erasmus Darwin’s Zoonomia in 1794, for instance, speaks of the moment a child first perceives sound in the world, before it incrementally becomes less alien: ’Erewhile, emerging from its liquid bed, It lifts in gelid air its nodding head; The light’s first dawn with trembling eyelid hails, With lungs untaught arrests the balmy gales; Tries its new tongue in tones unknown, and hears The strange vibrations with unpractis’d ears. (Darwin [1794] 1809: v) In this historical context, self-recognition also works at the level of the species; hearing a voice in the desert announces to you “a being like yourself,” explained Rousseau in On the Origin of Languages. Vocal signs “are, so to speak, the voice of the soul” (Rousseau [1781] 1986: 63–64). And as Michel Serres reminds us, such sentiments exceed the narrow dualism they imply, for “all real bodies shimmer like watered silk. They are hazy surfaces, mixtures of body and soul” ([1985] 2008: 35, emphasis added). This is perhaps the reason why voices cannot be relinquished in the miniaturized characters in films such as Downsizing, to return to the reference point with which we started. A size-modulated voice would imply a change of underlying identity incommensurate with the film’s narrative continuity. The lack of discursive treatment around realism in the context of filmic miniaturization would seem beside the point, then. The medium of digital cinema embodies this discourse in a cultural technique of quantitative sampling and the history of perception this implies. Historically, the voice’s condition has been perennially technologized, but with high-frequency speech synthesis, voices have seemingly become fractal for the first time, a data set with the potential for infinite replication, and as such are subject to the very critique that Baudrillard leveled at cloned DNA, with all the rhetorical excess and intellectual violence this implies. 
Beyond this paired critique of realism and identity, the ethical quandary implied by “cloned” voices raises a further question: do you own this identifying sound as a composer “owns” a composition, a performer “owns” a recording, and humans own their DNA; or is the simulation autonomous on its own terms? Lyrebird, a voice-cloning company in Montreal, uses generative speech synthesis technology similar to WaveNet, but specializes in drawing on human voice samples to ventriloquize those voices in words and statements they never uttered. The technology is susceptible to “deep fake” media, and Lyrebird has taken a public stand on its ethical responsibilities:

In many use cases, the results [of generative media] are already indistinguishable from real media. This technology has exciting applications…but it also holds the potential for misuse…We are committed to modeling a responsible implementation of these technologies, unlocking the benefits of generative media while safeguarding against malicious use. We believe you should own and control the use of your digital voice. [We use] a process for training speech models that depends on real-time verbal feedback, ensuring that individuals can only create a text-to-speech model of their own voice. Once created, the user is the owner of their voice and has the sole authority to decide when and how it is used.xix

Here, the ethical ground is guaranteed by the participation of the voice-owner (who must offer verbal feedback to generate the simulation), but in other hands the technology could proceed without consent. If simulations of the human voice are already attaining hyperreal heights through 16,000 samples per second, it may be necessary to define a real voice according to its origins in a human body, rather than any a priori sonic principles. 
Adapting Baudrillard: “It is no longer a question of a false representation of [a real voice] but of concealing the fact that the real [voice] is no longer [singularly] real” (Baudrillard 2010: 12–13). In other words, the phenomenon of synthetic speech and artificial generation raises the underlying issue of how we might define the “real” of sound itself in the digital age, whether this in fact has any validity in the absence of a single (perceiving) subject position, or warrants status in our critical thinking. To the extent that sound can be considered an object, and therefore something that can be possessed as a digital quantity, do we have a right to own sounds arising from our congenital biological frame? Or might we consider these an accident or corollary of evolutionary history? Whether taken as an ontological or a historical matter, this topic—arising from perceptual realism—has occupied commentators long before digital speech synthesis, and points to a philosophical instability at the heart of sound studies, namely: the notion of sound itself as a contested object.

References

1. Abbate, Carolyn. 2016. “Sound Object Lessons.” Journal of the American Musicological Society 69: 793–829.
2. Ablinger, Peter. n.d. “Quadraturen.” http://ablinger.mur.at/docu11.html#principles.
3. Bains, Paul. 2006. The Primacy of Semiosis: An Ontology of Relations. Toronto, ON: University of Toronto Press.
4. Barton, Ruth. 2003. “‘Men of Science’: Language, Identity and Professionalization in the Mid-Victorian Scientific Community.” History of Science 41: 73–119.
5. Baudrillard, Jean. (1976) 1993. Symbolic Exchange and Death. Rev. ed., translated by Ian Hamilton Grant. London: SAGE.
6. ———. 1988. The Ecstasy of Communication. Translated by Bernard and Caroline Schutze. New York: Sylvère Lotringer.
7. ———. 1993. The Transparency of Evil: Essays on Extreme Phenomena. Translated by James Benedict. London: Verso.
8. ———. 2010. Simulacra and Simulation. Translated by Sheila Faria Glaser. Ann Arbor, MI: University of Michigan Press.
9. Bennett, Jane. 2010. Vibrant Matter: A Political Ecology of Things. Durham, NC: Duke University Press.
10. Bergson, Henri. 2005. Creative Evolution. Translated by Arthur Mitchell. New York: Barnes & Noble.
11. Brownlee, John. 2012. “Why Retina Isn’t Enough.” CultOfMac, June 15, 2012. https://www.cultofmac.com/173702/why-retina-isnt-enough-feature/.
12. Chion, Michel. 1994. Audio-Vision. Translated by Claudia Gorbman. New York: Columbia University Press.
13. Connor, Steven. 2000. Dumbstruck: A Cultural History of Ventriloquism. Oxford: Oxford University Press.
14. Damböck, Christian, and Hans-Ulrich Lessing, eds. 2016. Dilthey als Wissenschaftsphilosoph. Freiburg: Karl Alber.
15. Darley, Andrew. 2000. Visual Digital Culture: Surface Play and Spectacle in New Media Genres. London: Routledge.
16. Darwin, Erasmus. (1794) 1809. Zoonomia. Boston, MA: Thomas and Andrews.
17. de Baer, Karl Ernst. 1862. Welche Auffassung der lebenden Natur ist die richtige? Berlin: August Hirschwald.
18. Deely, John. 1994. New Beginnings: Early Modern Philosophy and Postmodern Thought. Toronto, ON: University of Toronto Press.
19. Der Derian, James. 2001. Virtuous War. Boulder, CO: Westview.
20. Dilthey, Wilhelm. (1910) 1985. “Poetry and Lived Experience.” In Poetry and Experience, edited by Rudolf A. Makkreel and Frithjof Rodi, 250–53. Princeton, NJ: Princeton University Press.
21. Funkenstein, Amos. 1986. Theology and the Scientific Imagination from the Middle Ages to the Seventeenth Century. 2nd ed. Princeton, NJ: Princeton University Press.
22. Grimshaw, Mark. 2008. The Acoustic Ecology of the First-Person Shooter: The Player Experience of Sound in the First-Person Shooter Computer Game. Saarbrücken: Mueller.
23. Hauptmann, Moritz. 1863. “Klang.” In Jahrbücher für musikalische Wissenschaft, edited by Friedrich Chrysander. Leipzig: Breitkopf & Härtel.
24. Hayles, N. Katherine. 2012. How We Think: Digital Media and Contemporary Technogenesis. Chicago, IL: University of Chicago Press.
25. Hilbert, Martin, and Priscila López. 2011. “The World’s Technological Capacity to Store, Communicate, and Compute Information.” Science 332: 60–65.
26. Hooke, Robert. 1665. Micrographia. London: printed for John Martin.
27. Jørgensen, Kristine. 2006. “On the Functional Aspects of Computer Game Audio.” Proceedings of Audio Mostly Conference, October 11–12, 2006, Piteå, Sweden. http://hdl.handle.net/1956/6734.
28. Kapp, Ernst. 1877. Grundlinien einer Philosophie der Technik [Elements of a philosophy of technology]. Brunswick: Westermann.
29. Locke, John. (1689) 2008. An Essay Concerning Human Understanding. Abridged by Pauline Phemister. Oxford: Oxford University Press.
30. Marks, Laura U. 2000. The Skin of the Film. Durham, NC: Duke University Press.
31. Negroponte, Nicholas. 1995. Being Digital. New York: Knopf.
32. Perry, Nick. 1993. Hyperreality and Global Culture. London: Routledge.
33. Pollmann, Inga. 2013. “Invisible Worlds, Visible: Uexküll’s Umwelt, Film, and Film Theory.” Critical Inquiry 39: 777–816.
34. Roads, Curtis. 2004. Microsound. Cambridge, MA: MIT Press.
35. Rousseau, Jean-Jacques. (1781) 1986. On the Origin of Languages. Translated by John H. Moran and Alexander Gode. Chicago, IL: University of Chicago Press.
36. Serres, Michel. (1985) 2008. The Five Senses: A Philosophy of Mingled Bodies. Translated by Margaret Sankey and Peter Cowley. London: Continuum.
37. Siegert, Bernhard. 2003. Passage des Digitalen. Berlin: Brinkmann & Bose.
38. Sterne, Jonathan. 2012. MP3: The Meaning of a Format. Durham, NC: Duke University Press.
39. Strachan, Robert. 2017. Sonic Technologies: Popular Music, Digital Culture and the Creative Process. New York: Bloomsbury.
40. Théberge, Paul, Kyle Devine, and Tom Everrett, eds. 2015. Living Stereo: Histories and Cultures of Multichannel Sound. New York: Bloomsbury.
41. Trippett, David. 2018. “Music and the Transhuman Ear: Ultrasonics, Material Bodies and the Limits of Sensation.” Musical Quarterly 100: 199–261.
42. van den Oord, Aäron, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. “WaveNet: A Generative Model for Raw Audio.” September 19, 2016. http://deepmind.com/blog/wavenet-generative-model-raw-audio/.
43. Varro, Marcus Terentius. 2006. Varro on the Latin Language [De lingua Latina]. Translated by Roland Kent. Loeb Classical Library. Cambridge, MA: Harvard University Press.
44. von Uexküll, Jacob. (1934) 2010. A Foray into the World of Animals and Humans. Translated by Joseph D. O’Neil. Minneapolis, MN: University of Minnesota Press.
45. Wells, H. G. (1901) 1926. “The New Accelerator.” Amazing Stories 1 (April): 57–61, 96.

Figure 9.1 Etienne-Jules Marey’s chronophotographic study “Cyclist” (c. 1894). Image from “La Collection des appareils,” Iconothèque, Cinémathèque Française, Paris.
Figure 9.2 Mordaunt-Short’s advert for What Hi-fi? (1988), illustrating the desirability of (undefined) perfect fidelity in audio reproduction.
Figure 9.3 Unsigned illustration of H. G. Wells’s “The New Accelerator,” depicting Professor Gibberne and the narrator observing a radically slowed-down environment, for the inaugural issue of Amazing Stories 1 (April 1926), 57.
Figure 9.4 Robert Hooke’s large-scale illustration of a blue fly as seen through magnifying glasses, and reproduced in Hooke’s Micrographia (1665), the first major work on microscopy. British Library Collection 435.e.19.
Figure 9.5 Moritz Hauptmann’s musical examples for the same sonic frequencies heard by differently sized bodies, modulated by Swift’s ratios given above.

i It is perhaps indicative that the deceptive realism of filmic effect has made it a favored vehicle for exploring vicarious perception. 
Small wonder, then, that as early as 1901, Georges Méliès’s The Dwarf and the Giant depicts a single man who splits into two versions of himself, one who grows tall, one who shrinks to an eighth in size. Contributions to this subgenre of sci-fi films on the topic of size alteration would include those from The Incredible Shrinking Man (Jack Arnold, 1957), Darby O’Gill and the Little People (Robert Stevenson, 1959), and Devil-Doll (Lindsay Shonteff, 1964) to Honey, I Shrunk the Kids (Joe Johnston, 1989) and its ensuing franchise with the Walt Disney Company, as well as Ant-Man (Peyton Reed, 2015).
ii The technical challenge of counterpointing tiny and “normal” humans was itself sufficient for the film to be nominated for the American Visual Effects Society’s award for Outstanding Supporting Visual Effects in a Motion Picture. See https://visualeffectssociety.com/portfolio-items/2017-16th-annual-ves-awards/?portfolioCats=29.
iii Two examples of critiques that respond to Baudrillard’s claims would include Perry 1993, which explores cultural contexts where original cannot be distinguished from copy, and Der Derian 2001, which situates the theory of virtuality in warfare.
iv See https://www.oed.com/view/Entry/129823?redirectedFrom=obscene#eid.
v It appears to be a misunderstanding of Varro, who in fact argues just the opposite, that anything shameful is called obscenum because it ought not to be said openly other than on stage. See Varro 2006: VII: 351.
vi A recent summary is given in Strachan 2017.
vii See https://www.sciencedaily.com/releases/2007/08/070823122253.htm.
viii See https://www.pcmag.com/archive/analyst-challenges-apples-iphone-4-retina-display-claims-251638 and https://www.npr.org/sections/alltechconsidered/2010/06/07/127530049/live-blogging-apple-s-developers-conference.
ix For the modern period, the touchstone for placing trust in sensation remains John Locke’s anteriority of sensation to reflection ([1689] 2008).
x This definition of realism has been explored by Darley 2000.
xi A thoughtful critique of Deely’s framework is given in Bains 2006: 49ff.
xii A joint definition for high-resolution audio, agreed between the Recording Industry Association of America, the Consumer Electronics Association, the Digital Entertainment Group, and the Recording Academy Producers & Engineers Wing, remains technologically open, and rooted in intentionality: “lossless audio capable of reproducing the full spectrum of sound from recordings which have been mastered from better than CD quality (48 kHz/20-bit or higher) music sources which represent what the artists, producers and engineers originally intended.” See “High Resolution Audio Initiative Gets Major Boost with New ‘Hi-Res MUSIC’ Logo and Branding Materials for Digital Retailers,” The Recording Industry Association of America (RIAA), June 23, 2015, https://www.riaa.com/high-resolution-audio-initiative-gets-major-boost-with-new-hi-res-music-logo-and-branding-materials-for-digital-retailers/.
xiii Reading perceptual mechanisms into words would seem more than just another term for literary realism. Its boldness may be taken as indicative of the multidisciplinary outlook afforded by that generation of 19th-century scientists who lived through the professionalization of different branches of the sciences, human and natural, within universities. See Damböck and Lessing 2016 and Barton 2003.
xiv It was not uncommon during the second half of the 19th century for scientists to propose a higher upper limit for the aerial frequencies human ears could hear, now commonly accepted to be 20,000Hz. See Trippett 2018: 202–7.
xv More recent theorists posit the smallest unit, or “grain,” of audible sound at between a thousandth and a tenth of a second. See Roads 2004: 86–97.
xvi Robert Hooke, Micrographia [1665], “Observation 38 ‘on the structure and motion of the wings of flies’.”
xvii Hooke, Micrographia, “Preface.” On the origin of the microphone, see Abbate 2016.
xviii It is precisely the singular world of Classical biology that Uexküll would reject. See Trippett 2018: 208ff.
xix See https://www.descript.com/ethics.