The Sounds of New York City (SONYC) project launches its first citizen science initiative to help NYU researchers train machine listening models

The Sounds of New York City (SONYC)—a first-of-its-kind research project addressing urban noise pollution—has launched a citizen science initiative to help artificial intelligence (AI) technology understand exactly which sounds are contributing to unhealthy levels of noise in New York City.

SONYC—a National Science Foundation-funded collaboration between New York University (NYU), the City of New York, and Ohio State University—is in its third year of a five-year research agenda and leverages machine listening technology, big data analysis, and citizen science to more effectively monitor, analyze, and mitigate urban noise pollution.

(For an overview of the SONYC project, please see this recent contributed article and video from Communications of the ACM and NYU’s press release from 2016.)

The citizen science initiative, recently launched in the Zooniverse citizen science web portal, enlists the help of volunteers to identify and label individual sound sources—such as a jackhammer or an ice cream truck—in 10-second anonymized urban sound recordings transmitted from acoustic sensors positioned in high-noise environments in the city.

With the help of citizen scientists, machine listening models learn to recognize these sounds on their own, assisting researchers in categorizing the 30 years’ worth of sound data collected by the sensors over the past two years and facilitating big data analysis which, in part, will provide city enforcement officials with a more accurate picture of noise—and its causes—over time. Ultimately, the SONYC team aims to empower city agencies to implement targeted, data-driven noise mitigation interventions.

“It’s impossible for us to sift through this data on our own, but we’ve learned through extensive research how to seamlessly integrate citizen scientists into our systems and, subsequently, advance our understanding of how humans can effectively train machine learning models,” said Juan Pablo Bello, lead investigator; director of the Music and Audio Research Lab (MARL) at the NYU Steinhardt School of Culture, Education, and Human Development; and director of NYU’s Center for Urban Science and Progress (CUSP).

“Artificial intelligence needs humans to guide it—much like how a child learns by observing its parents—and we can see this training model having widespread applications in other fields. We’re incredibly grateful for the help of our volunteers,” continued Bello.  

“Training machines to accurately recognize sounds is a major challenge that can put citizen-researchers at the forefront of machine learning research. This is an opportunity for New York residents—and anyone interested in how sound affects our lives—to contribute to a scientific project that will help improve our sonic environments,” said Oded Nov, an associate professor of technology management and innovation at NYU Tandon.


HOW CITIZEN SCIENTISTS CAN HELP QUIET CONSTRUCTION SITES
While SONYC aims to better understand urban noise pollution at scale, after-hours construction noise was identified as a priority based on the New York City noise code and feedback from the NYC Department of Environmental Protection (DEP).

Monitoring after-hours construction noise in New York City is challenging given the transient nature of noise and the high number of noise-related complaints. The DEP is collaborating with SONYC in the hope of improving the accuracy and timeliness of its response to noise violations; the smart sensor technology will save DEP investigators time in locating the specific source of a sound, as the sensors will automatically categorize and classify the source of noise violations through the machine listening algorithm.

Existing technologies are unable to isolate offending sources, especially in urban environments flooded with multiple sounds. But as citizen scientists annotate more sound data, helping the sensors get smarter, SONYC researchers will be able to build better tools for real-time monitoring of source-specific noise violations (such as a jackhammer at two a.m.).

[Image: A microphone attached to a silver box overlooks a yellow crane on a New York City street]


HOW TO GET INVOLVED
To participate in this project, citizen science volunteers view a list of possible sound sources and click the ones that match the noises represented in a visualization of a 10-second audio recording. Supporting resources are available, including tutorials, field guides, and an option to communicate directly with researchers through the platform.

[Image: A list of possible sound sources that includes small-sounding engine, car horn, siren, and ice cream truck]

To ensure privacy, the audio clips on the platform are non-sequential in time and location. Incidentally recorded speech is unrecognizable as conversation, as determined by independent acoustical consultants, and no more than one recording is pushed out on the platform from the same location within an hour. (For further information on privacy measures employed by the SONYC project, please see the FAQ page.)

The goal for the citizen science campaign is 1,000 identifications for each sound source, which, in conjunction with other models and techniques, will be sufficient to train the sensor algorithm.
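Because several volunteers may label the same clip, identifications must be combined before they can serve as training data. As a rough illustration only (the function name, vote threshold, and labels below are hypothetical, not SONYC’s actual pipeline), a simple vote-based aggregation might look like:

```python
from collections import Counter

def aggregate_labels(annotations, min_votes=2):
    """Keep a sound source only if at least `min_votes` volunteers
    tagged it in the same clip. Illustrative sketch, not SONYC code."""
    counts = Counter(
        label
        for volunteer_tags in annotations
        for label in set(volunteer_tags)  # count each volunteer once per label
    )
    return sorted(label for label, n in counts.items() if n >= min_votes)

# Three volunteers annotate the same 10-second clip
clip_annotations = [
    ["jackhammer", "car horn"],
    ["jackhammer"],
    ["jackhammer", "siren"],
]
print(aggregate_labels(clip_annotations))  # ['jackhammer']
```

Agreement across volunteers is what turns noisy individual tags into labels reliable enough for model training.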

More than 700 people have already provided valuable input through the citizen science platform, but researchers estimate they will need the help of approximately 3,500 volunteers to meet their goal of at least 50,000 annotated recordings.


WHY ARTIFICIAL INTELLIGENCE NEEDS HUMAN INPUT
Large volumes of rich data that are annotated or labeled are essential for training and evaluating novel machine listening models to automatically detect sound sources such as idling engines, car horns, or police sirens. However, this environmental data is currently unavailable to researchers.  

By contrast, the public is constantly sharing images and tagging them, providing reams of categorized data that inform deep learning models (similar to how Facebook can recognize friends’ faces in your uploaded photos). Speech recognition systems like Amazon’s Alexa are trained on thousands of hours of transcribed speech data, but in general, Mark Cartwright, postdoctoral researcher at NYU’s Music and Audio Research Lab, says machine listening has yet to see the same success.

“If you’re training machine listening algorithms using popular techniques from the past decade, you need a lot of data. But we don’t have that audio data because noise has a temporal aspect and volunteers can’t quickly label recordings in a list like they can with images. There are also many sounds competing for attention in any given recording. It’s still an understudied area but we’re evaluating how to design tasks for non-experts so we can get the best quality annotations at higher speeds,” said Cartwright.

The SONYC team’s research into best practices for audio annotation—outlined in a recent paper accepted to the ACM CHI Conference on Human Factors in Computing Systems—advances existing scholarship and could enable powerful applications in other fields including bioacoustics monitoring, electric vehicle sensing, and assistive technologies.

For example, the researchers found that spectrogram visualizations enabled citizen scientists to identify noise patterns more quickly than other visual aids did, and that asking volunteers to annotate multiple sounds in a single recording yields results of similar quality to asking them whether a single sound is present.

[Image: An audio player showing a spectrogram visualization of a 10-second audio clip from the Zooniverse platform]
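A spectrogram is simply a grid of sound energy over frequency and time, which is why visual patterns like a siren’s sweep stand out. As a minimal sketch of how such a visualization is computed (the 16 kHz sample rate, window sizes, and synthetic tone below are assumptions for illustration, not SONYC’s actual settings), one could write:

```python
import numpy as np
from scipy import signal

# Hypothetical parameters: a 10-second clip sampled at 16 kHz
sr = 16000
duration = 10
t = np.linspace(0, duration, sr * duration, endpoint=False)

# Synthetic stand-in for an urban recording: a 440 Hz tone plus background noise
rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.normal(size=t.size)

# Short-time Fourier analysis turns the waveform into a frequency-by-time energy grid
freqs, times, sxx = signal.spectrogram(clip, fs=sr, nperseg=1024, noverlap=512)

# The grid (513 frequency bins x 311 time frames here) is what gets rendered as the image
print(sxx.shape)
```

Plotting `sxx` on a log scale (e.g. with matplotlib’s `pcolormesh`) produces the kind of image volunteers see on the platform, with the 440 Hz tone appearing as a bright horizontal band.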

Graham Dove, a postdoctoral associate at NYU Tandon, is leading the citizen science initiative alongside Bello, Nov, and Cartwright.

SONYC is a collaboration between NYU CUSP, NYU Tandon, NYU Steinhardt, and The Ohio State University and is supported by a $4.6 million grant from the National Science Foundation.   


About the NYU Steinhardt School of Culture, Education, and Human Development
Located in the heart of New York City’s Greenwich Village, NYU’s Steinhardt School of Culture, Education and Human Development prepares students for careers in the arts, education, health, media and psychology. Since its founding in 1890, the Steinhardt School's mission has been to expand human capacity through public service, global collaboration, research, scholarship, and practice. To learn more about NYU Steinhardt, visit steinhardt.nyu.edu.

About the NYU Tandon School of Engineering
The NYU Tandon School of Engineering dates to 1854, the founding date for both the New York University School of Civil Engineering and Architecture and the Brooklyn Collegiate and Polytechnic Institute (widely known as Brooklyn Poly). A January 2014 merger created a comprehensive school of education and research in engineering and applied sciences, rooted in a tradition of invention and entrepreneurship and dedicated to furthering technology in service to society. In addition to its main location in Brooklyn, NYU Tandon collaborates with other schools within NYU, one of the country’s foremost private research universities, and is closely connected to engineering programs at NYU Abu Dhabi and NYU Shanghai. It operates Future Labs focused on start-up businesses in downtown Manhattan and Brooklyn and an award-winning online graduate program. For more information, visit engineering.nyu.edu.

About NYU’s Center for Urban Science and Progress
CUSP is a university-wide center whose research and education programs are focused on urban informatics. Using NYC as its lab, and building from its home in the NYU Tandon School of Engineering, it integrates and applies NYU strengths in the natural, data, and social sciences to understand and improve cities throughout the world. CUSP offers a one-year MS degree in Applied Urban Science & Informatics. For more news and information on CUSP, please visit http://cusp.nyu.edu.