The HathiTrust Research Center: An Architecture for Humanities Computing
Session Type: Presentation/Panel
Session Description:
The HathiTrust Research Center (HTRC) gives nonprofit and educational users
computational access to published works in the public domain stored
within the HathiTrust Digital Library, an extensive collaborative digital
library of nearly 10 million volumes and 2 billion pages of archived material
maintained by major research institutions and libraries worldwide. The HTRC was
founded as a joint venture among Indiana University, the University of
Illinois Urbana-Champaign, and the HathiTrust. It aims to solve the
difficult challenges of increasing computational access to both the public
domain and copyrighted materials in the HathiTrust Digital Library.
The technical goals and milestones of the first phase of the project
include:
• Develop bridge and caching strategies between the HathiTrust Repository and
indices and the HTRC data store. Work on a versioning database.
• Develop a prototype system for non-consumptive research.
• Develop web and portal capabilities.
• Develop distributed access capabilities and improve data quality.
• Perform a security risk analysis and begin developing the security
infrastructure and procedures.
The HTRC has developed a technical architecture to meet these requirements. This presentation will discuss each requirement, the challenges it presents, and the technical solution adopted for it. The presentation will also discuss the future goals and challenges of the HTRC.
Session Leader:
Stacy Kowalczyk, Data to Insight Center, Indiana University
Session Notes:
Contribute to the community reporting Google doc for this session!