
Data Sonification for Beginners

Data Sonification

What if you could make data more engaging? Imagine a data presentation that could elicit an emotional response from your audience. Data that can talk to you. Even sing to you. This is the world of data sonification.

We are all familiar with data visualization, the realm of techniques that translate data into visual images. These images allow us to grasp data patterns quickly and easily. We learn to produce and consume simple visualizations – pie charts, bar charts, line graphs – as early as elementary school. These are so ubiquitous, we rarely notice them.

Data sonification is analogous to data visualization, but instead of perceptualizing data in the visual realm, it perceptualizes data in the sonic realm. Sonification has a reputation as a cutting-edge, experimental practice, and in many ways it is just that. But it has also been around longer than many of us realize. David Worrall, in his 2019 book Sonification Design, describes how Egyptian pharaohs audited granary accounts by having independently prepared ledgers read aloud before them and listening for discrepancies. (In fact, the very word “audit” comes from the Latin verb meaning “to hear.”)

A more recent, but still retro, manifestation of data sonification should be familiar from Cold War-era science fiction movies, or maybe old episodes of Mission: Impossible: the sound of a Geiger counter, an electronic instrument that measures ionizing radiation. Hand-held Geiger counters characteristically produce audible clicks in response to ionizing events, so they remain useful when the user’s attention must be directed somewhere other than a visual meter.

Modern attempts at computer-assisted data sonification began to gather speed in the early 1990s. A typical study is Scaletti and Craig’s 1991 paper, “Using Sound to Extract Meaning from Complex Data,” which explored the possibilities of parameter-mapped sonification using technology available at the time. The International Community for Auditory Display (ICAD) was founded in 1992 and has held a conference most years since then. The Sound and Music Computing Conference and the Interactive Sonification Workshop both started in 2004. Sonification research is now regularly published in engineering journals, psychology journals, music journals, and a small handful of specialty interdisciplinary publications like the Journal of Multimodal User Interfaces.

Most data sonification projects fall into one of three categories: audification, parameter-mapped sonification, or model-based sonification. Of these, audification is the simplest; it involves shifting a data stream into the audible realm by using it to produce sound directly, often dramatically speeding it up or slowing it down in the process. This has often been applied in seismology to allow researchers to listen to earthquakes, such as this sonification of the 2011 Tohoku Earthquake in Japan. It has also been applied to astronomical data, notably by NASA, and also in the work of Wanda Diaz Merced.
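If you want to try audification yourself, the core of the technique fits in a few lines of Python. The sketch below is only an illustration, and the array, file name, and sample rate are arbitrary assumptions: a one-dimensional data series is normalized and written straight to a WAV file, so that playing it back at an audio sample rate “speeds it up” into the audible range.

    import numpy as np
    from scipy.io import wavfile

    def audify(samples, out_path="audification.wav", rate=44100):
        """Treat each data value as one audio sample and write the result to a WAV file."""
        x = np.asarray(samples, dtype=float)
        x = x - x.mean()                         # remove any DC offset
        peak = np.max(np.abs(x))
        if peak > 0:
            x = x / peak                         # normalize to the range [-1, 1]
        wavfile.write(out_path, rate, (x * 32767).astype(np.int16))

    # Stand-in for a long, slowly sampled data stream (e.g., seismometer output):
    # played back at 44,100 samples per second, it is compressed into a few seconds of sound.
    audify(np.cumsum(np.random.randn(44100 * 5)))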

Model-based sonification is a much more subtle process. Here the technique is to take a basic sound and modify particular aspects of it according to data values. The sound must first be represented by a mathematical model, as is done routinely with musical instruments for computer music applications. Then, different parts of the model are made to interact with values representing different data variables. The resulting transformations of the model yield a new sound, which reflects the influence of the data values. Think of the sound of a bell being rung. The sound of the bell depends on various qualities: its size, its thickness, the ratio of length to width, how much it flares, what kind of metal it is made of. Vary any one of these, and the sound is altered. This is how model-based sonification works, except that a mathematical model of a bell is subjected to variation, rather than an actual bell. (No bells were harmed in the course of this research!) These four sounds use this kind of process to sonify distributions of neurons in artificial networks: id=1 cluster, id=3 cluster, id=5 cluster, id=6 cluster. (The examples are from Thomas Hermann’s Chapter 16 of The Sonification Handbook, and can be found along with others here.)
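As a toy illustration of the idea (and emphatically not the model behind the Handbook examples), the Python sketch below stands in for a bell with a handful of decaying sine partials; the hypothetical data values perturb the model’s tuning and decay before the bell is “struck,” so different data produce audibly different bells.

    import numpy as np
    from scipy.io import wavfile

    RATE = 44100

    def strike_bell(data, seconds=2.0, base_freq=220.0):
        """Render one bell-like strike whose partial tuning and decay depend on `data`."""
        d = np.asarray(data, dtype=float)
        d = (d - d.min()) / (np.ptp(d) or 1.0)        # normalize the data to [0, 1]
        t = np.linspace(0.0, seconds, int(RATE * seconds), endpoint=False)
        ratios = 1.0 + np.arange(len(d)) * (1.8 + d)  # data detunes the partials
        decays = 1.0 + 4.0 * d                        # data shortens or lengthens the ring
        sound = sum(np.exp(-decays[i] * t) * np.sin(2 * np.pi * base_freq * ratios[i] * t)
                    for i in range(len(d)))
        sound = sound / np.max(np.abs(sound))
        return (sound * 32767).astype(np.int16)

    # Two different "datasets" yield two recognizably different bells.
    wavfile.write("bell_a.wav", RATE, strike_bell([0.2, 0.9, 0.4, 0.7]))
    wavfile.write("bell_b.wav", RATE, strike_bell([0.8, 0.1, 0.6, 0.3]))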

Parameter-mapped sonification lies between these two approaches on the continuum of sophistication. Individual sound parameters such as pitch, loudness, duration, or timbre are mapped to values in a dataset. For most people this is the most accessible approach to sonification: the easiest to grasp intuitively and the easiest to experiment with. It works particularly well for single-variable, time-series data.
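A minimal Python sketch of parameter mapping might look like the following; the pitch range, note length, and sample values are arbitrary choices made for illustration. Each data point becomes a short tone, with higher values mapped to higher pitches.

    import numpy as np
    from scipy.io import wavfile

    RATE = 44100

    def sonify(values, low=220.0, high=880.0, note_len=0.25):
        """Map each value to a pitch between `low` and `high` Hz and render one tone per value."""
        v = np.asarray(values, dtype=float)
        v = (v - v.min()) / (np.ptp(v) or 1.0)        # normalize to [0, 1]
        freqs = low * (high / low) ** v               # logarithmic pitch mapping
        t = np.linspace(0.0, note_len, int(RATE * note_len), endpoint=False)
        tones = [np.sin(2 * np.pi * f * t) * np.hanning(t.size) for f in freqs]
        audio = np.concatenate(tones)
        wavfile.write("parameter_mapped.wav", RATE, (audio * 32767).astype(np.int16))

    sonify([210, 190, 250, 230, 180, 60, 40])         # e.g., one week of gate counts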

Low Barrier to Entry

A number of easy-to-use tools have been developed by sonification researchers to allow you to develop parameter-mapped sonifications on your own. One of these is TwoTone, developed by Sonify, Inc., in partnership with Google. TwoTone is available as a free web app with an intuitive user interface. It comes with a library of existing datasets for you to play around with, or you can upload a spreadsheet file of your own to sonify. TwoTone will map your data onto MIDI pitches, according to ranges and constraints you can specify. In addition to the sonification, it shows a real-time animated graph indicating what part of the data you are listening to at any particular moment, making it a multi-modal tool for experiencing data. You can download your sonification as an MP3 file, but to capture the visualization you need to use a screen recorder.
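Conceptually, the pitch mapping TwoTone performs amounts to scaling each value into a chosen note range. Here is a hypothetical Python sketch of that kind of mapping; the function name and the default MIDI range are my own assumptions, not TwoTone’s.

    def to_midi_notes(values, low_note=48, high_note=84):
        """Scale each value into the chosen MIDI note range (low_note..high_note)."""
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1
        return [round(low_note + (v - lo) / span * (high_note - low_note)) for v in values]

    print(to_midi_notes([120, 340, 95, 410]))   # -> [51, 76, 48, 84]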

Another free web app for data sonification is offered by Music Algorithms, developed by Jonathan N. Middleton of Eastern Washington University. Music Algorithms does not offer a visualization to go along with its sonification, but it does offer duration as a parameter for sonification, which TwoTone does not. Where TwoTone comes pre-loaded with sample datasets to play with, Music Algorithms offers mathematical series such as Pi, Fibonacci, or a DNA sequence, in addition to a “custom” function that allows you to input your own data. You can download your finished sonification as a MIDI file.
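To get a feel for what a pitch-plus-duration mapping of a mathematical series might involve, here is a hedged Python sketch using the third-party mido library. The specific pitch and duration formulas are arbitrary illustrations and are not how Music Algorithms does it; the result, like Music Algorithms’ output, is a MIDI file.

    from mido import Message, MidiFile, MidiTrack

    fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

    mid, track = MidiFile(), MidiTrack()
    mid.tracks.append(track)

    for i, value in enumerate(fib):
        note = 48 + (value % 36)          # fold the values into a three-octave pitch range
        ticks = 240 * (1 + i % 4)         # durations cycle through four lengths
        track.append(Message('note_on', note=note, velocity=80, time=0))
        track.append(Message('note_off', note=note, velocity=80, time=ticks))

    mid.save('fibonacci.mid')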

Much of the leading work in data sonification happens in the Sonification Lab at Georgia Tech. Their Sonification Sandbox was one of the first tools to allow public users to create their own sonifications. First released publicly in 2007, it is a free program you can download and install on your own computer. However, the program is written in Java, and the creators have not kept it current with Java version updates. The last version, still available, is from 2014 and includes modifications to support Java 7. The most recent Java version is Java 19, released in September 2022, and Sonification Sandbox works poorly with it. To get the best results with Sonification Sandbox, use a dedicated system (or a virtual installation within another system) running Java 7.

That doesn’t mean Georgia Tech has been sitting on its hands. Highcharts Sonification Studio, released in 2021, is a fully updated, web-based sonification platform, developed in partnership between Georgia Tech and the data visualization software developer Highcharts. Users can upload a CSV file, choose data and sonification parameters, and produce a MIDI-based sonic rendering of their data.

Medium Barrier to Entry

Anyone who has spent much time around electronic composition is probably familiar with Max, a visual, object-oriented programming environment originally developed by Miller Puckette at IRCAM in the 1980s. Although Max was not developed with data sonification in mind, sonification is well within its capabilities. Max offers great flexibility, but it comes with a correspondingly steep learning curve. Fortunately, it is known for its great documentation, tutorials, and a user community not shy about posting instructional videos. If you are interested in using Max for sonification, tutorial 18 is the one to aim for. Start at the beginning and take the tutorials one by one; when you get to tutorial 18, you will learn how to use Max to convert spreadsheet data to sound and animated graphing.

Max is a little pricey, at least compared to a free web app; you can expect to pay around $400 for a license, or $250 with an academic discount. For the more adventurous, there is a free, open-source alternative called Pure Data. Pure Data (or PD), also developed by Puckette, is a completely separate and independent tool, but it is designed to do the things Max does, using an interface similar to Max’s. The big difference is in the documentation: PD’s documentation is mostly community-developed, so it isn’t always as beginner-friendly as Max’s. However, if you are patient, you can learn to do the same things in PD that you can do in Max. Besides being free, PD has the added advantage of being available for Linux as well as macOS and Windows. (Max is available for Windows and Mac only.)

Sonification for Librarians

So what might you do once you get your hands on these tools? Good question! Here are a few sonifications I have created using the humblest data at my disposal: the log of the gate counter in my music library at the University of Denver. Using TwoTone, I created a sonification (and recorded the animated graph) of patron visits to the library in FY 2018. Play the video, and watch for the small orange line moving from the left through the rows of blue lines. You will notice that higher pitch corresponds to higher values in the sonified data.

The top row is a repeating sequence of seven values indicating weeks in the year; it is placed in the lowest range. The next row is the mornings, with the noon hour included; it is in a higher range. It begins with a period of lower values representing reduced library traffic in the middle and late summer, then jumps dramatically when school begins in the fall. The next row is the afternoon/evening reading; it is in an even higher range. You may notice that after school begins, the gaps between groups of values that appear in this row during the summer disappear. This is because during school breaks we do not open on the weekend, while during school terms we have weekend hours in the afternoon. (But not in the morning – note the contrast with the row above.) The University of Denver is on the quarter system; you can easily identify Fall, Winter, and Spring Quarters, separated by a long Winter Break and a much shorter Spring Break. The last row is the night shift, and it is placed in the highest range of all; I have also further distinguished it by representing it in two-note arpeggios. You can see that we are open nights only during school terms.

The same data was treated differently for this sonification. Here, the daily totals were used instead of individual shifts, and Fall, Winter, and Spring Quarters were mapped against each other as a comparison.

Here is another sonification of the quarter-against-quarter data, this one created using Music Algorithms. Data values are again mapped to pitch. The time values in this sonification are manipulated so that they grow longer in a repeating cycle of seven values, with Monday as the shortest and Sunday as the longest. This allows us to identify the weekend data as the two longest time values at the end of each series.
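If you want to reproduce that kind of scheme yourself, here is a tiny, hypothetical Python sketch of the weekly duration cycle described above; the time unit and the sample values are made up for illustration.

    def weekly_durations(values, unit=0.1):
        """Pair each value with a duration that grows through the week: day i lasts (i % 7 + 1) * unit seconds."""
        return [(v, (i % 7 + 1) * unit) for i, v in enumerate(values)]

    # One week of daily totals: Monday is the shortest note, Sunday the longest.
    for value, seconds in weekly_durations([210, 190, 250, 230, 180, 60, 40]):
        print(value, seconds)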

A sonification of the same data using Sonification Sandbox is here. Like the other tools, this one has its own look and sound. The recording is a little slow; it takes 60 seconds to hear all of it. You might want to try speeding it up on YouTube; alternatively, this recording completes the process in ten seconds.

Here is a sonification created with Max. In this case, three years’ gate-count data were superimposed on each other, with each displayed in a different-colored graph, and all three sonified simultaneously.

Listening to these sonifications, you are likely to experience them aesthetically, or even emotionally, in a way that rarely happens when we inspect a visual chart or graph. This is possibly the most unusual aspect of data sonification: the immediacy and urgency of sound make vision seem cool and analytical by comparison. We cannot “close” our ears the way we can close our eyes, nor can we “listen away” from one stimulus as we can look away from one thing in order to focus on another. This difference in quality between aural and visual perception helps explain why so many sonification tools are designed for multimodal presentation – the auditory and visual displays are designed to reinforce each other.

Try some of these tools and see what you can create with data sonification. If you come up with something interesting, post a comment here!

ETSC TechHub

ETSC TechHub is a drop-in session held at MLA conferences that includes a variety of technology-related discussion groups. By attending TechHub, MLA members can get quick informal tutorials on various digital tools or ideas. The first four posts on our blog will be dedicated to videos and resources produced for MLA TechHub 2022.

Music Library Association Emerging Technologies and Services Committee

Welcome to the Music Library Association (MLA) Emerging Technologies and Services Committee (ETSC) blog! This committee works to identify and evaluate current trends, tools, services, and developments relating to technologies used by libraries and librarians, with special attention to their handling of music materials. It coordinates and facilitates the exchange of this information to the MLA membership. Our blog will be used to share information about emerging technologies and services identified by committee members and guests.