The World Wide Web is enormous and in constant flux, with more web content lost to time than is currently accessible via the live Web. The growing body of archived web material available to researchers is immensely valuable as a record of important aspects of modern society, but there is little, if any, supporting infrastructure, and few established processes or trusted methods, to facilitate domain-specific Internet research. Humanities researchers are expected to individually assemble the research data and e-Research tools needed for analysis, which can be prohibitive in both resources and time.
This project aims to begin addressing this gap by establishing a framework for e-Humanities (also called Digital Humanities) research. The framework uses available open source tools and technologies, together with archived web content, to create novel research interfaces to the first of many scholarly e-Humanities web collections.
Within the context of this project, the term 'web collections' describes collections of archived websites. Both the Internet Archive and Hanzo have extensive experience in web archiving and are prominent international players in the creation of web collections, including the largest of all web collections, the Internet Archive’s Web collection, accessible via the Wayback Machine.