Digital Humanities + Data

By Keith Allison | November 9, 2018


NYU’s DH+Data Day Celebrates Data Science and Visualization in the Service of the Humanities

In the 1800s, Colden’s Liquid Beef Tonic was sold as a cure for alcoholism. Its primary ingredient? Alcohol. This may not be the sort of thing one expects to learn at an event dedicated to discussing data visualization and research, but that’s why Data Services recently changed the name of their longstanding GIS Day. DH+Data Day (the DH stands for digital humanities) is a testament to the expansions that have occurred in who uses data, what they use it for, and how they go about accessing and analyzing it. Increasingly, data science plays a crucial role in almost every academic discipline. The 2018 DH+Data Day featured presentations by researchers in disciplines as diverse as journalism, food science, archiving, art, history, and sociology.

“We have celebrated GIS Day since 2012 as a medium for building and nurturing NYU’s GIS community,” said Himanshu Mistry, manager of Data Services. “Our goals were to bring awareness about the value of using GIS tools in research and teaching and learning; showcase geospatial resources at NYU; feature cutting-edge tools and services offered by Data Services; and highlight some exciting geospatial and applied research being done here at NYU.”

Mistry continued: “Over time, we started observing cross-disciplinary research trends across the University. To fuel this phenomenon, we expanded GIS Day to Data Services Research Day in 2017. All the components of GIS Day were included, plus all of the data-driven research and instruction were folded in as well. Our speaker series ranged from GIS to humanities and technology and more, and our competition included a fabulous array of mapping and data visualization submissions.”

A common theme fueled the transformation of this year’s DH+Data Day: as data services become more advanced, they have the power to refine and revise the picture of history by integrating sources that are either entirely new or were previously cumbersome to access. Projects of this type create new sources for research, putting records that might have been lost to time at a researcher’s fingertips.

Mistry explained, “Over the last few years, we have seen a good number of digital humanities projects intersecting with various tools and services provided by Data Services and Digital Scholarship Services. Coincidentally, there was a proposal for a co-hosted event issued by the Digital Humanities Advisory Committee and Digital Scholarship Services team. We took the opportunity to establish a theme-based event with an extended goal of showcasing amazing digital humanities and data-driven research projects created by the NYU community.”

Gathering and Presenting Data

Colden’s Liquid Beef Tonic was mentioned during a presentation by Dana Karwas, an Industry Assistant Professor of Integrated Digital Media at NYU Tandon. Her session, “The City Record Historical Project,” discussed the effort to digitize and make searchable online the City Record, a periodical published by the city of New York since 1873. The newspaper covers “public hearings and meetings, public auctions and sales, solicitations and awards, and official rules proposed and adopted by city agencies.”1 Once the issues were scanned, the team was able to observe how researchers used the data. In her example, Karwas traced a path of inquiry from causes of death in New York to specific articles in the City Record pertaining to health and hygiene, which eventually surfaced the dubious beef tonic cure-all.

During NYU adjunct instructor Scott Barton’s presentation, “Digital Humanities and Contemporary Food Studies Issues,” he discussed the power of the tool Storymaps to change the way students present what they’ve learned. For his class, “Foodways and Contemporary Issues,” rather than being restricted to the “test and term paper” model, students used Storymaps to create projects that integrated hyperlinks, photos, video, audio, maps, and other resources. In Barton’s opinion, pushing the presentations beyond papers resulted in more engaging, more impactful learning.

This opinion was echoed by Tom Augst, an Associate Professor of English at NYU, who spoke on the subject of “Humanities Data Curation and Project-Based Learning.” For Augst, who also uses Storymaps in his courses, these new tools enable research and learning to expand beyond the concept of “the canon,” that set of texts long agreed upon as the core of an education. The canon can provide a useful foundation, but it is also restrictive: slow to change as new works are introduced and often culturally narrow, having been passed down from previous generations when scholarship was dominated by white, male Western Europeans. Storymaps empowered his students to explore alternative texts and offered them a different way to present what they had learned.

Both Barton and Augst highlighted ways in which new data tools enabled them to move their instruction toward a more experiential methodology. Students in Barton’s food studies course relied on personal experience in conjunction with textual research to explore issues involving food and human consumption. Augst’s class studied Edgar Allan Poe by visiting sites around the city related to the author and searched literature for location-related scents, resulting in a “smell atlas” of old New York. Such an atlas contributes to the study and understanding of urban centers.

Organizing and Archiving Data

Initiatives like the City Record Historical Project yield a massive amount of data that is useless if it is not organized in a way that is searchable and manageable. The methodology for cataloging printed periodicals is well developed, but what happens when the source is more complicated and crosses multiple media of expression? This was the question explored in “An Introduction to the NYU Artist Archive Initiative,” presented by Deena Engel, Clinical Professor at the Courant Institute of Mathematical Sciences, and Glenn Wharton, Clinical Professor at the Graduate School of Arts & Science Program in Museum Studies.

Contemporary art can change over time and can be iterative, multimedia, or both. Professors Engel and Wharton seek to devise a system that can archive non-static works that pull from multiple places. The NYU Artist Archive Initiative makes use of MediaWiki. Refinable searches, tags, keywords, and the ability to integrate various sources of text and media have enabled them to build a catalog that captures the complexity of art while remaining adaptable and inexpensive.

A similar problem was addressed by Katy Boss, Librarian for Journalism, Media, Culture, & Communication at Bobst Library, and Meredith Broussard, Assistant Professor at NYU’s School of Journalism. “Saving Data Journalism: Building an Archive Tool for Dynamic Websites” discussed the difficulty of archiving “the news.” When, how, and what “the news” is has become so variable that an accurate, dependable archive must capture various forms of data as well as emulate legacy technologies, file formats, and plug-ins that may no longer be functional. The solution their team currently employs is ReproZip, an archiving tool capable of harvesting posts and pages as well as underlying media, plug-ins, and other structures that enable the content to function.

Jonathan Stray, a computational journalist at Columbia University’s School of Journalism, used his session to address another question: with data science becoming an increasingly integral part of journalism and scholarship, how do you make it available to journalists and researchers without also demanding that they become experienced coders? In the same way Tom Augst hopes to open humanities learning to, for example, scientists, Stray wants to open data science to journalists by developing tools that lessen the need to become adept at coding just to complete their research. In “Workbench: Reproducible Data Analysis without Coding,” Stray showcased Workbench, a tool that simplifies the process of performing complex searches through massive amounts of data (the entirety of Twitter, for example, or public records).

The Heart of Digital Humanities

The how, what, and even the why of digital humanities were discussed at DH+Data Day. Two final sessions asked: but what else? What can be done differently, or more effectively? Leo Douglas, a Clinical Assistant Professor at Liberal Studies and Lecturer at Columbia University, leveraged the social media hashtag #reallifescientists to investigate why boys in his native Jamaica were so unlikely to pursue scientific careers. In his research, the data only provided part of the story.

Douglas used the hashtag to present photos to young students, determine the most popular ones, and then collect the reactions of the children to each photo. He then analyzed the language of those reactions to create a set of keywords. What he found was that, in addition to the resource and economic barriers that exist, there was a cultural aspect to the lack of young, black males pursuing science careers. They identified scientific work as feminine (because photos showed biologists caring for animals), not formal, not manly, or too much like being a farmer. The results of Douglas’ inquiry expose a cultural bias against men performing certain jobs. Armed with that information, it becomes easier to challenge the perception.

For Douglas, being able to leverage another dimension of data science provided insight that might otherwise have been missed. This additional dimension is what presenter Cy Andrews calls “experiential” data visualization. Andrews focused on how representations of data can move beyond graphs and into a realm with a more immediate, even physical, impact. She calls this an “emotional performance index.” She discussed her project, Trigger Warning, an immersive experience in which statistics about sexual assault are translated into a tactile experience. A person steps into a dark booth, illuminated by a pulsing red light and filled with the sound of a heartbeat. The beat simulates the heart rate of a person during a sexual assault, during which time the body will “shut down” as a defense and coping mechanism. The person is in the booth for two minutes, after which they are informed that, according to the data, someone was sexually assaulted as they stepped into the booth, and someone else was assaulted as they stepped out.

To read a statistic is one thing. To be immersed in it is quite another. This, Andrews thinks, is the direction in which data science and visualization can expand. She hopes that, as the tool set continues to develop, data scientists, journalists, and artists will also explore the impact of an emotional dimension to data visualization.

A primary role of data visualization is to render complex data in a way that can be more easily understood. As the field continues to expand, science and the humanities have an opportunity to learn from one another. Can data about climate change be rendered in an experiential fashion? Or human migration patterns? What about archaeology or risk assessment? As the software and hardware that form the foundation of data science become more powerful and easier to use, the potential applications expand significantly, as does the thinking around how they can be used.

About NYU Data Services

Data Services is a joint service of New York University’s Division of Libraries and Information Technology to support quantitative, qualitative, and geographical research at NYU. Data Services offers access to specialty software packages for statistical analysis, geographic information systems (GIS), and qualitative data analysis. They provide training and support, as well as consulting expertise, for many aspects of the research data life cycle including access, analysis, collection development, data management, and data preservation.
