DHOxSS 2011 Sessions

All materials are licensed Creative Commons Attribution, unless otherwise specified.

* Please note that the links to materials are no longer working. *

Monday 25 July 2011

09:30 - 12:30

  • Title: Creating digital texts in XML using the TEI (Part 1)
  • Tutors: Sebastian Rahtz (OUCS) and James Cummings (OUCS)
  • Time: Monday 25 July 2011, 09:30 - 12:30 continued Tuesday 26 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Materials: tei-01-intro.pdf, tei-02-customising.pdf, tei-03-metadata.pdf, tei-04-names.pdf, tei-materials.zip
  • Description:

    In part 1 we will discuss principles of encoding digital text in XML, and introduce the Text Encoding Initiative (TEI) scheme. Practical exercises will use the oXygen XML editor and Roma schema modeller to produce our first texts.

    In part 2 we will look at the TEI in more detail, especially the features for metadata headers, and look at the commonly-used modules for components of a typical digital text.
     

  • Title: Working with audio files
  • Tutor: Stephen Eyre (OUCS)
  • Time: Monday 25 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Description:

    This session will demystify the whole process of recording audio, starting from a conceptual outlook, but moving to hands-on experience for most of the session. There will be a demonstration of equipment, and differences between specific microphones will be shown. We will use Audacity to build an audio project that will give everyone a sense of confidence that they can create a wide variety of audio files to suit their teaching and research needs.

14:30 - 15:30

  • Title: Modelling with RDF
  • Tutor: John Pybus (OeRC)
  • Time: Monday 25 July 2011, 14:30 - 15:30
  • Location: Wolfson College
  • Description:

    In this session we describe the RDF data model, provide examples of data modelled as RDF, and demonstrate how to query RDF data. We take a brief look at vocabularies that might be of particular interest in the digital humanities, such as the CIDOC-CRM. In the context of linked open data we discuss copyright and licensing.
     

  • Title: “...and now what?” Some approaches to project modelling.
  • Tutor: Pip Willcox (Bodleian Libraries)
  • Time: Monday 25 July 2011, 14:30 - 15:30
  • Location: Wolfson College
  • Materials: Willcox-ProjectModelling.ppt, Willcox-ProjectModelling.pdf
  • Description:

    The subject matter is chosen; the funding is in place. How do you set about turning the eye-catching project outline into reality?

    This talk takes an empirical look at some answers to this question, drawing on several TEI-encoded editing projects. Time will be allowed for discussion.

16:00 - 17:00

  • Title: Practical RDF modelling and conversion
  • Tutor: Alexander Dutton (OUCS)
  • Time: Monday 25 July 2011, 16:00 - 17:00
  • Location: Wolfson College
  • Description:

    We explore practical considerations in representing information as RDF. Topics covered include linked and linkable data, choosing and creating vocabularies, controlled vocabularies, serialization options, and tools for conversion. We provide pointers to existing modelling communities, and ways to publish the RDF you create.
     

Plenary Lecture: 17:00 - 18:00

  • Title: Providing documents on the web: experience from legislation.gov.uk
  • Speaker: Jeni Tennison (UK eGov guru)
  • Time: Monday 25 July 2011, 17:00 - 18:00
  • Location: Wolfson College
  • Description:

    The legislation.gov.uk website provides access to legislation from 1267 to the present day, in HTML, PDF, XML and RDF formats and with separate URLs for each version of every sub-section. This talk describes its architecture and use of open standards as a pattern for publishing complex documents on the web.

Tuesday 26 July 2011

09:30 - 12:30

  • Title: Creating digital texts in XML using the TEI (Part 2)
  • Continued from Part 1.

  • Title: RDF querying and visualization (Part 1)
  • Tutors: John Pybus (OeRC) and Alexander Dutton (OUCS)
  • Time: Tuesday 26 July 2011, 09:30 - 12:30 continued Wednesday 27 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Description:

    In Part One we will convert some sample data to RDF and load it into a triple store. We will then explore SPARQL, a query language over RDF. We will experiment with off-the-shelf tools for data storage and presentation.

    In Part Two we will move on to more complicated querying, and see how we can use query results in a number of environments. Finally we will create JavaScript-based visualisations of RDF data.

14:30 - 15:30

  • Title: Best practice for language corpora
  • Tutor: Martin Wynne
  • Time: Tuesday 26 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: Wynne-CorpusResources.pdf
  • Description:

    How should we go about designing and constructing a corpus of language data? We visit the thorny issues around sampling and representativeness, and the many dimensions of modality, medium, genre and text type. We will take an overview of the available options for the representation of metadata, textual features and linguistic annotations. And from the point of view of the user, we ask how to find and evaluate linguistic corpora. How do we assess their availability for different types of use? And in the online, data-rich research environments of today, should we ask if we still need language corpora, when there is so much digital language out there at our fingertips? What is the case for spending time crafting the perfect corpus, when we could be concentrating on acquiring and filtering the mass of data on the web?
     

  • Title: Managing Digital Humanities Projects
  • Tutor: Ruth Kirkham (OeRC)
  • Time: Tuesday 26 July 2011, 14:30 - 15:30
  • Location: Wolfson College
  • Materials: Kirkham-Managing.ppt, Kirkham-Managing.pdf
  • Description:

    This talk will look at the motivational issues involved in managing a project with diverse stakeholders and will focus on the experiences of a number of Digital Humanities projects based at the Oxford e-Research Centre. These projects have demonstrated that a user-centred approach allows stakeholders at all levels to get a good understanding of the progress and potential of the project. However, we’ve learnt that it is not enough to assume that interested parties will continue to seek out project involvement. To ensure this, a Project Manager needs to continually focus on communicating the progress of the project, and clear attention should be given to the differing and changing needs of project stakeholders.
     

16:00 - 17:00

  • Title: Best practice for language corpora (Continued)
  • Continued.

     

  • Title: Getting funding: quality, impact, sustainability.
  • Tutor: David Robey
  • Time: Tuesday 26 July 2011, 16:00 - 17:00
  • Location: Wolfson College
  • Materials: Robey-Funding.pptx, Robey-Funding.pdf
  • Description:

    David Robey was formerly Director of the UK Arts and Humanities Research Council's ICT in Arts and Humanities Research Programme. He will review the issues that applicants for project funding in the digital humanities should attempt to deal with in order to maximize their chances of success, focussing on the particular needs of projects planning digital outputs, but with some more general suggestions about writing grant applications.

Plenary Lecture: 17:00 - 18:00

  • Title: "The Uneasy Pursuit of the Future of the Book: A Discussion of the Implementing New Knowledge Environments Project, After Shakedown" and "Building and Maintaining a Team Approach in a Rapidly-Advancing Area of Research and Development"
  • Speakers: Ray Siemens (University of Victoria) and Lynne Siemens (University of Victoria)
  • Time: Tuesday 26 July 2011, 17:00 - 18:00
  • Location: Wolfson College
  • Materials: Siemens-TeamApproach.pdf
  • Description:

    Lynne Siemens: "Building and Maintaining a Team Approach in a Rapidly-Advancing Area of Research and Development"
    Teamwork is becoming an important part of academic life, especially within Digital Humanities. These projects often bring together team members from a variety of content and skill areas. While these collaborations have many advantages, several challenges exist, including coordination and tensions between various professional subcultures. Academic teams must understand the nature of collaboration and develop governance models that will allow them to achieve their research objectives within a rapidly changing area of research. This paper will examine several examples of policies, procedures, and skills that can facilitate and enhance collaboration within DH teams.

    Ray Siemens: "The Uneasy Pursuit of the Future of the Book"
    The advent of the e-book has made the book, itself, visible to us as an object of study in new ways that have, in turn, metaphorically and analogically fertilized and fomented our understanding of new forms of e-reader book-ishness and e-reading. As a powerful metaphor for textual forms of communication, the notion of the book as knowledge environment spurs development of e-readers in the direction of emerging universal electronic libraries; from perspectives of its physical artifactual nature, and its formal components, book elements and features are mimicked, augmented, and enhanced as they are prototyped and deployed in electronic reading environments. This paper discusses the work of the Implementing New Knowledge Environments (INKE) research team in this context: noting just how much we have yet to understand about 'reading' in the new context of its electronic correlative acts and, perhaps, in pre-electronic times; urging that the dizzyingly rapid cycle of development, deployment, and adoption of e-reading devices has the positive effect of providing a technological disruption with the potential to benefit our understanding of the core, essential activities that our reading devices have always facilitated, electronic and otherwise; and arguing that it is an understanding of these activities that will allow us best to anticipate the long-term developmental trajectories of our reading devices in future.

  • Note: This evening plenary will be followed by a drinks reception.

Wednesday 27 July 2011

09:30 - 12:30

  • Title: Publishing XML files using XSLT (Part 1)
  • Tutor: Sebastian Rahtz (OUCS)
  • Time: Wednesday 27 July 2011, 09:30 - 12:30 continued Thursday 28 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Materials: xsl-01.pdf, xsl-02.pdf, xsl-03.pdf, xsl-04.pdf
  • Description:

    In part 1 we will learn to write XSLT transformations to turn TEI XML files to HTML, and build up a rich web page from a digital text.

    In part 2 we will move on from simple HTML, and will look at other uses of XSLT to manipulate TEI XML files, including making ebooks and converting to and from word-processor formats.

     

  • Title: RDF querying and visualization (Part 2)
  • Continued from Part 1.

14:30 - 15:30

  • Title: TEI for linking text and facsimiles
  • Tutors: Sebastian Rahtz (OUCS), James Cummings (OUCS), and Adam Obeng (OUCS)
  • Time: Wednesday 27 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: tei-05-linking.pdf, http://adamobeng.com/dhox/slides/
  • Description:

    Scholarly editions these days are not considered complete without the inclusion of digital images of the original text. This session will expand on the earlier TEI sessions by introducing the methods by which you can link text to facsimile images. A number of different approaches, through case studies of projects we have worked on, will be considered for their benefits and drawbacks. Although not a practical session, students will be given a demonstration of both simple and complex linking. There will also be a demonstration of work on multimedia ePubs. There will be time for discussion and exploration of students' concerns for their own work.

     

  • Title: Tools for analyzing linguistic Corpora
  • Tutor: Martin Wynne
  • Time: Wednesday 27 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: Wynne-CorpusTools.pdf
  • Description:

    Once you've captured your corpus, how do you analyse it? There is currently a bewildering array of options, but it is still sometimes difficult to find the right solution for your particular purpose. How can we classify the different options? There are tools with varying suites of functions; tools for specific corpora, specific formats, specific languages; tools for online and offline work; server-based, desktop, and even mobile applications; tools for comparing corpora, or comparing texts with corpora; tools for annotating; tools for workflows; tools for sharing and collaborating. We will take a look at the various options available now, and a peek towards the horizon to see what might be coming in the next generation of applications and services.

     

16:00 - 17:00

  • Title: TEI for linking text and facsimiles (Continued)
  • Continued.
     

  • Title: Tools for analyzing linguistic Corpora (Continued)
  • Continued.
     

Plenary Lecture: 17:00 - 18:00

  • Title: Linking transcriptions to spoken audio
  • Speakers: John Coleman (Phonetics) and Sergio Grau Puerto (Phonetics)
  • Time: Wednesday 27 July 2011, 17:00 - 18:00
  • Location: Wolfson College
  • Materials: ColemanGrau-LinkingSpokenAudio.ppt, ColemanGrau-LinkingSpokenAudio.pdf, ColemanGrau-KBW_gronnies.wav
  • Description:

    As spoken audio corpora become ever larger, new challenges arise of how to navigate around them or find material of interest in them. In this talk, we explain how we've been getting to grips with full time-aligned annotation of the British National Corpus Spoken Audio edition, more than two months of continuous audio. We focus on forced alignment, XML encoding of detailed timing information, and streaming audio fragments.

Thursday 28 July 2011

09:30 - 12:30

  • Title: Publishing XML files using XSLT (Part 2)
  • Continued from Part 1.
     

  • Title: Digital Images: sourcing, adapting, and safe keeping
  • Tutor: David Baker (OUCS)
  • Time: Thursday 28 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Materials: Baker-DigitalImages.pdf, Baker-DigitalImagesWorkbook.pdf (These materials are licensed CC BY-NC-ND.)
  • Description:

    Digital images are a valuable part of your research, sometimes critically so. This three-hour session will introduce you to some of the key issues that you need to be aware of when sourcing, adapting and using digital images. Although the focus is the use of images in an academic context, the ideas covered are equally relevant to your personal image collections.
     

14:30 - 15:30

  • Title: Textual editing and digital scholarly editions
  • Tutor: Lou Burnard
  • Time: Thursday 28 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: tei-editing.pdf
  • Description:

    The TEI Guidelines are extensively used in the production of new scholarly editions of primary source materials. This session provides a critical overview of the features provided by the Guidelines for encoding textual or documentary editions, ranging from the traditional critical apparatus, via detailed documentary transcription, to the latest set of proposals intended to support an encoding of the writing process itself, according to the French school of l'édition génétique, or the German school of historisch-kritische Ausgabe.
     

  • Title: The A,B,C of crowdsourcing a community collection
  • Tutor: Kate Lindsay
  • Time: Thursday 28 July 2011, 14:30 - 15:30
  • Location: Wolfson College
  • Description:

    Have you thought about engaging the general public in your digital project to create or enhance content? Do you trust the wisdom of the crowds? This presentation explores a model of crowdsourcing developed by the RunCoCo team at the University of Oxford to build academic-community collection archives. Put into practice through a series of successful digitisation projects that used social participation and digital technologies, lessons learned are synthesised into an A,B,C of advice for other projects and groups who aim to ‘crowd-source’ with sustainable success.
     

16:00 - 17:00

  • Title: Genetic editing and digital scholarly editions
  • Continued.
     

  • Title: Building queryable document-based websites
  • Tutor: Joseph Wicentowski (US State Dept.)
  • Time: Thursday 28 July 2011, 16:00 - 17:00
  • Location: Wolfson College
  • Description:

    TEI is a platform-neutral format for encoding texts, but how is TEI published and consumed on one very important platform -- the web? This session surveys a number of websites based on TEI texts that do this and explores their underlying technologies and methodologies, and in so doing introduces important concepts in text search and data mining. This session will provide new TEI learners with the vocabulary and concepts needed to evaluate solutions for building queryable TEI document-based websites for digital humanities research. (Keywords: TEI, search, databases, web servers)
     

Plenary Lecture: 17:00 - 18:00

  • Title: Visualization in Flatland
  • Speaker: Min Chen (OeRC)
  • Time: Thursday 28 July 2011, 17:00 - 18:00
  • Location: Wolfson College
  • Description:

    A large number of challenging problems in visualization involve three- or higher-dimensional data, while the majority of visualization results have been, and will continue to be, shown on two-dimensional computer displays and paper media. "Flatland: A Romance of Many Dimensions", written by a headmaster and Shakespearean scholar in 1884, enlightened us about the fundamental difficulty and hindrance in visualizing such data. The speaker will draw from his experience in several areas of visualization, and discuss challenges in visualization from the perspective of "Flatland", highlighting the essence of dimension reduction in many successful visualization techniques.
     

  • Note: This evening plenary will be followed by a drinks reception.
     

Friday 29 July 2011

09:30 - 12:30

  • Title: An introduction to XML databases
  • Tutor: Joseph Wicentowski (US State Department)
  • Time: Friday 29 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Description:

    An XML database can turn your TEI documents (and other XML documents) into a powerful database for your digital humanities research. This workshop introduces eXist, a free, open-source XML database and application server that is used widely in the TEI community. Building on the sample data used in the introductory TEI workshops, the workshop will give you the tools to display, search, analyze, and transform your TEI documents. Learn from the creator of history.state.gov, a TEI-rich website entirely driven by eXist. Familiarity with HTML is helpful but not required. (Keywords: TEI, XML databases, eXist, XQuery.)

  • Materials: Wicentowski-XMLDatabases.pdf, Wicentowski-XMLDatabases-materials.zip

     

  • Title: Visualizing data sets for the web (with Google's Visualization API)
  • Tutor: Arno Mittelbach (Technische Universität Darmstadt)
  • Time: Friday 29 July 2011, 09:30 - 12:30
  • Location: OUCS
  • Materials: Mittelbach-Javascript.key, Mittelbach-Javascript.pdf, Mittelbach-Presentations.zip, Mittelbach-exercises.zip
  • Description:

    A good visualization can be a powerful helper for communicating or interpreting data. Creating the right visualization can, on the other hand, be a very tedious and difficult task. In this workshop we are going to explore how we can visualize datasets in a web browser. We are going to give a brief introduction to JavaScript, today's main scripting language for browsers, and then take on the task of creating various visualizations (including charts, maps, timelines etc.) with Google's Visualization API, which allows you to create good-looking, interactive visualizations with a minimum amount of work.
     

14:30 - 15:30

  • Title: Impact as a process: considering the reach of resources from the start (Part 1)
  • Tutors: Eric T. Meyer (OII) and Kathryn Eccles (OII)
  • Time: Friday 29 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: http://www.slideshare.net/etmeyer/tidsrdhox
  • Description:

    Digital resources are part of a growing trend towards a world where cultural and educational materials are digitally produced and reproduced, and are available to download at the click of a button. But how do you go about defining value and impact? At what point should you start thinking about your audience and how you will know whether you have reached them? When is a digital resource a well-used resource? How can niche resources that will never see high-volume traffic demonstrate impact using qualitative and quantitative measures? How can large and heavily-used resources see past the numbers to better understand the context of their users to enhance impact? In this session, Dr Meyer and Dr Eccles will showcase results from their research in this area, and will also present their Toolkit for the Impact of Digitised Scholarly Resources, in which they present a set of best practices in this area, in the form of ‘how to’ guides, tools and resources.
     

  • Title: Digital library technologies and best practice (Part 1)
  • Tutors: Neil Jefferies (Bodleian Libraries) and Christine Madsen (OII)
  • Time: Friday 29 July 2011, 14:30 - 15:30 continued 16:00 - 17:00
  • Location: Wolfson College
  • Materials: Jefferies-DigitalLibrary.pdf, Madsen-DigitalLibrary.pptx, Madsen-DigitalLibrary.pdf
  • Description:

    This session will draw together many of the topics covered in other sessions. We will discuss the direction that digital libraries are going in and how the various technologies will take us there. In particular, it is unlikely that any single technical approach will be sufficient in the future, so an integrated view is required. We will explore how scholarly discourse and communication is changing and how this affects the types of material that a library will need to hold, and how practice, in terms of curation, preservation and discovery/re-use, will need to evolve accordingly. Implicit in this is a shift from library-centric standards towards Internet standards that are more broadly-based and better supported. We will also deal with the practicalities of producing data management, curation and preservation plans, which are increasingly required by funding bodies as a condition of grant.
     

16:00 - 17:00

  • Title: Impact as a process: considering the reach of resources from the start (Part 2)
  • Continued from Part 1.
     

  • Title: Digital library technologies and best practice (Part 2)
  • Continued from Part 1.
     

Plenary Lecture: 17:00 - 18:00

  • Title: Towards Web-Scale Analysis of Musical Structure
  • Speaker: David De Roure (OeRC)
  • Time: Friday 29 July 2011, 17:00 - 18:00
  • Location: Wolfson College
  • Description:

    This plenary session will close the Digital.Humanities@Oxford Summer School.