Data Management

Bridging Traditional Digital Humanities and Archives through Computational Archival Science

As libraries and archives across the world plan for an increasingly digital records future, there is a critical need to strengthen digital and computational literacy and training for future librarians and archivists. Acute “skills and management gaps in libraries” have been recognised, highlighting the need for greater automation in library work, the facilitation of computational research, and the need for library managers to understand and value the benefits of in-house data science skills.

Data Discretion: Screen-Level Bureaucrats and Municipal Decision-Making

Public servants tasked with implementing rules or policies at the street level often make discretionary decisions based on local context. Lipsky has labeled them street-level bureaucrats. During the COVID-19 pandemic, as most face-to-face interactions facilitated by local government moved online, many street-level decisions were moved to screens, representing the actions of those whom Bovens and Zouridis refer to as screen-level bureaucrats. Discretionary decision-making among public servants continued, but much of it centered on the collection, analysis, and use of data.

Examining the Shift in Political Inclination of Korean Middle School Language Textbooks between the Independence and the Korean War (1945–1953): A Network Modeling Approach

This study uses network modeling to examine the distribution of political inclinations in Korean middle school language textbooks from 1945 to 1953. The study covers 82 textbooks: 37 from the U.S. military government period (1945–1948), 30 from the South Korean government establishment period (1948–1950), and 15 from the wartime period (1950–1953).

An Evaluation of GPT-4V for Transcribing the Urban Renewal Hand-Written Collection

In November 2023, OpenAI released GPT-4V(ision), which includes Optical Character Recognition (OCR) capabilities. Given that much of the data curation, processing, and cleaning can be managed through user-friendly prompts (i.e., chat), we aim to conduct an initial assessment of GPT-4V’s effectiveness in transcribing hand-written documents from the urban renewal collection. If GPT-4V can accurately digitize hand-written documents through carefully crafted prompts, it could become a valuable tool for non-experts transcribing historical documents on a large scale.

Crowdsourcing Behavior in Reporting Civic Issues: The Case of Boston's 311 Systems

Many cities in the United States use civic technologies like 311 systems as part of their public service systems for monitoring non-emergency civic issues. These systems have enhanced cities' monitoring capabilities by diversifying communication channels. However, the data created through these systems is often biased because of differences in people's use of technology (i.e., the digital divide) and individuals' behavioral patterns in the types of information they provide to the systems.

A Visualization Tool and Assessment Framework for Civic Technology Use in the DMV Area: The Case of 311 Systems During the COVID-19 Outbreak

The 311 systems that city officials currently deploy can efficiently detect non-emergency civic issues such as potholes and trash. From a socio-technical perspective, residents can re-appropriate the technology for their own purposes, adding new capacities and affordances not initially intended. For example, when Hurricane Irma hit Miami in 2017, residents used 311 systems to report disaster-related issues, which led city officials to adapt the system by creating a new category.

Multi-Generational Stories of Urban Renewal: Preliminary Interviews for Map-based Storytelling

Urban renewal was a project of the American government that aimed to reconstruct poor urban neighborhoods. Community-level data showing the underlying mechanisms of urban renewal has not been systematically curated, owing to the complexity and volume of the relevant archival collections. We therefore aim to digitally curate property acquisition documents from the urban renewal projects that affected the Southside neighborhood of Asheville, North Carolina, in the form of a map-based, interactive web application. This paper reports early findings from interviews.

Toward Understanding Civic Data Bias in 311 Systems: An Information Deserts Perspective

While civic technologies for public issues and services, such as 311 systems, are widely adopted in many U.S. cities, the impact of these emerging technologies and their data-level dynamics is unclear. Because the patterns in which civic issues are reported to technological systems differ across neighborhoods and populations, it is difficult for city officials to understand whether the provided data itself reflects civic issues. Also, disparities in the information provided to civic technologies in different neighborhoods may exacerbate existing inequalities.

Local Information Landscapes: Theory, Measures, and Evidence

To understand issues of information accessibility within communities, research studies have examined human, social, and technical factors by taking a sociotechnical view. While this view provides a profound understanding of how people seek, use, and access information, it tends to overlook the impact of the larger structures of information landscapes that constantly shape people's access to information.

Digital Curation of a World War II Japanese-American Incarceration Camp Collection: Implications for Sociotechnical Archival Systems

We describe computational treatments of archival collections through a case study of World War II Japanese-American Incarceration Camps. Camp staff and police officers compiled so-called "internal security" reports relating to alleged cases of "disorderly conduct, assault, theft, loss of property, and accidents" in the camps, along with an index to these reports comprising over 25,000 index cards. The sheer size of these collections is pushing archivists and researchers to consider new forms of processing for collections at scale.
