A Prince’s Papers: Session 1

Read the introduction page for background on this workshop series.

https://www.youtube.com/embed/E_y1TjgZRJI

Prince Albert of Saxe-Coburg and Gotha was born in 1819. He married Queen Victoria, his cousin, in 1840, and the couple had nine children.

He was a man of many interests, passions, even, in the intellectual and cultural sense. He took a keen interest in law and politics, was a competent draftsman, was a gifted amateur musician and composer, was avidly interested in the arts and in technological and societal progress, and was, to all intents and purposes, a workaholic. His papers and collections give the impression of a man whose presence and influence were felt in many spheres, who was, sometimes himself, and sometimes through his small body of highly trusted advisors, at the centre of a large social network, and a man under whose eye almost every detail of numerous significant events, from the Great Exhibition to the Treasures of Art Exhibition to the Crimean War to tensions with America, would pass. As such, the collection crosses just as many fields of interest, from musicology, to art history, to royal history.

He worked hard and died young, at the end of 1861. Much ink has been spilled theorising about both the cause of his death and what would have happened had he not died so young, but we will not concern ourselves with those issues in this workshop. Rather, we will look at some of the activities in which he was involved: his speeches, which indicate the range of his interests, and the causes that he took up and often generously supported; his art historical and photographic interests; the Great Exhibition; and, finally, an opportunity for you to go digging and find what interests you.

The scope for this last session, pursuing your own interests, is extensive. Prince Albert's paper trail is enormous, varied and fascinating. It is often written in a good hand, whether by him or by those writing to or for him, which makes it very apt to this kind of workshop, and it gives an insight into fragments of cultural history that can easily be forgotten.

I hope that, through this course, you will come to be as fascinated by these papers as I am, and to see Albert and his 'works' - for which he is so well memorialised - in a new light: the light of communicative efforts, structural contexts, social networks, and, above all, surprising detail.

https://www.youtube.com/embed/UuhcCzSbF-8

Let us begin with a few brief thoughts about why we might want to transcribe manuscripts. Grab your pencil and paper, and jot down the ones you think of.

When I say transcription, in this context, I mean more than simply making a copy for our own reference - this is, in many respects, a digital extension of a cyclical note-taking manuscript culture. Rather, I am thinking about transcription for publication more broadly.

Until recently, it was much easier to print text than image, and it is still the case that facsimile reproductions of manuscripts are expensive to produce. But why, in an age during which digital transmission of images is so prevalent, when we can access images so easily through a network, is there still such value placed upon the transcription of manuscript into what is still, on the surface at least, a printed object?

I think there are three, perhaps four, main answers:

the first is legibility. It is often far easier, and far quicker, to read a printed text than a manuscript, particularly given the variation between different hands (handwritings).
the second is editorial control. Transcription and editing give the opportunity to clarify the original text and, in some cases, to make comparisons across different versions of a text, to unify them or to point out conflicts between them. They also allow for an element of explanation.
the third is searchability. Computers have been able to handle the searching of text for many years, and it is still far more straightforward to work with digital text than it is to work with digital images, particularly for searching, sorting, and other forms of processing.

The fourth, if there is a fourth, is, perhaps, a little more philosophically critical, and relates to the value assigned to the printed word in an academic context.

Have you thought of any further reasons? Keep a note of them and share them at the drop-in session.

This seems like a very good point to turn to some transcription, but before we do, I promised I would say a little about Brackets. I am recommending it because it will allow you to see more easily where you have added bits of code, and I shall show you how that works through a very short video, in which we'll adjust one or two of the settings to make it a bit more user friendly. If you haven't done any 'coding' before, you are about to have your very first taste of it, but it will be very straightforward! If you find this entirely befuddling, that's ok. For now, just use the standard text editor on your machine. For a Mac, that will be TextEdit, and for a Windows machine, it will be Notepad. Do make sure that your editor is set to 'plain text'.

If you run into trouble with this, try searching for the problem on Google, and if that doesn't help, do feel free to get in touch with me.

NB: If your installation of Brackets does not look quite the same as in the video when you begin to install it - there may be a welcome page, or some example documents open - don't panic! In particular, in the section in which we set the preferences, don't worry if you have a different number of code lines in the right hand pane - just follow the instructions, putting the comma at the end of the last text line above the curly bracket (brace) of the right hand pane (it will make sense when you watch the video!). Follow the instructions through to the end, even if things look a little different, and see if things work out. If they do not, do feel free to use a standard text editor for now, or to get in touch!

https://www.youtube.com/embed/VWO4Lu4btfs

https://www.youtube.com/embed/424vnlokGFM

Now that you have your text editor ready (hopefully!), let's take a look at a document together. This is always a good start to any kind of digital - and many other kinds of - project. It lets us see what the materials we are working with look like, what kind of things we are likely to encounter before we begin, and sets off a chain of thought about how we should approach our work.

Navigate to albert.rct.uk -> Collections -> Royal Archives -> Prince Albert's Official Papers -> Various Files -> Speeches by the Prince Consort -> The [Royal] Literary Fund -> detailed object viewer.

This is one of many speeches that Prince Albert gave in his lifetime, often enough in support of good causes. You will notice that the second session of this workshop is also dedicated to Albert's speeches, and I think you will be able to see why from this document.

It is written, as you might have noticed from the metadata when you were navigating here, in Prince Albert's own hand, which is quite neat and legible, and it is a relatively short document. Metadata, by the by, is simply 'data about data' - in this context, the letter is our data, and the metadata is how it is described as an archival resource. We'll talk more about metadata later.

Let's start by reading through it and seeing what there is to see. Ignoring, for a moment, the content of the document, let's have a look at a few features that are worth noting.

The document is very narrow: there are only a few words to each line, and quite a few of these are broken across two lines. We will want to reflect this in our transcription, as counter-intuitive as that might seem, so that we can give an impression of what the document looks like. We will also wish to make a note of the page breaks. It mentions, also, one or two people: The Queen (Victoria), and Lord Lansdowne.

Now that we have established all of that, let's make a first-pass transcription. If there are any issues with the handwriting, for now, note them with an '[[illegible]]' and see if you can get to the end reasonably quickly. This will give you a chance to get used to the hand, and to begin to get a feel for how letters and words are formed. When you have reached the end of the document, you can go back, and see if you can figure out the words and letter forms of any difficult-to-read passages contextually. Look for similar shapes in words you have been able to transcribe; look at the word from different angles; imagine that it is a little like a crossword; try tracing out the word on paper for yourself, to see if feeling the shape clarifies matters - the uses of pencil and paper! And don't forget that we can discuss awkward palaeographical matters as a group in our contact time.

You should now have something that looks and reads approximately like a full document. Let's save the document as '[Your Name] Literary Fund Speech Blank' and then set it aside for a moment.

TEI, or 'the TEI', stands for 'the Text Encoding Initiative'. It is a system of marking up the semantic and structural elements of text in a way that makes them more navigable by a computer. The TEI has been around for a good while, with work on it beginning in the late 1980s, and it is used in a wide range of scholarly projects around the world.

Why does TEI exist?

Computers, by nature, have no semantic or contextual understanding of text outside of their vocabulary of commands - it is all, quite literally, just meaningless data. It is possible to programme computers to explore that data in different ways, and it is possible to search through it against particular parameters; however, if you feed a novel into a computer and then ask it to find the chapter titles, how can it do so when all it has to work with is undifferentiated text?

It is at this point that a system of encoding such as the TEI becomes useful. By marking up elements of the text and describing the whole text in semantic blocks, we make it possible for the computer, through programmatic software, to differentiate between its different elements. This makes it more searchable, and more processible - more possible to manipulate and extract data from.
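To make the chapter-title question concrete, here is a minimal sketch of how the start of a marked-up novel might look. The element names follow common TEI practice (a 'div' for each chapter and a 'head' for its title), and the bracketed text is placeholder content rather than a real novel. Do not worry about the syntax for now; it is introduced in the section that follows.

```xml
<!-- Each chapter sits in its own block, and its title is labelled as a heading. -->
<div type="chapter">
  <head>[Title of the first chapter]</head>
  <p>[Text of the first chapter]</p>
</div>
<div type="chapter">
  <head>[Title of the second chapter]</head>
  <p>[Text of the second chapter]</p>
</div>
```

With the text marked up in this way, a program no longer has to guess: it can find the chapter titles by looking for every 'head' that sits inside a 'div' of type 'chapter'.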

XML at a glance

The TEI language is a form of XML, or eXtensible Markup Language. XML is one of a number of ways to express and structure data to make it more malleable for digital use and re-use. XML is a nested, hierarchical language in which data is 'wrapped' in element tags. It is a strict language, which means that errors in the structure cause problems, but it is also relatively straightforward.

When we mark up a document in TEI, we are using a specially developed form of XML to describe elements of that document in computer-readable ways. With this in mind, let's take a little look at some general examples of XML structure to get a sense of how it operates. Do not get put off if this does not make immediate sense. For the purposes of the workshops, we will be working with a limited range of element tags and guidance for using them.

Data (information) in XML is described by element tags. These tags wrap around each piece of data, containing it. A 'tag', or 'element' in XML, looks a little like this.

<element>Information</element>

The opening tag describes to the computer what kind of information or data will be contained in the element. The closing tag, which begins with '</', indicates to the computer that the data is complete and that the element is now finished - it is very much like putting information into boxes! The tags must match, and so an opening paragraph tag (<p>) must be closed with its matching partner (</p>).

Let's try describing some simple data in XML.

I am a person. My name is Andrew Cusworth. I work in Digital Humanities.

I have given three, or more, types of information here, some of which are more important than others, and some of which are more categories than specifics. Categories make good tags, or elements, so let us decide that 'person,' 'name' and 'work' are categories.

'Name' and 'work' are very simple to express in XML:

<name>Andrew Cusworth</name>
<work>Digital Humanities</work>

Here, we have made it visible to the computer that a concept called 'name' has the value of 'Andrew Cusworth' and that a concept called 'work' has the value 'Digital Humanities'. However, we have not told the computer what those concepts belong to - a 'person'.

This 'person' concept encompasses both 'name' and 'work'.

<person>
<name>Andrew Cusworth</name>
<work>Digital Humanities</work>
</person>

Here, we have stated that a concept called 'person' has a name ('Andrew Cusworth') and work ('Digital Humanities'). Notice that the concept of 'person' is wrapped around the elements that belong to it - they are 'contained' or 'nested'.

We might extend this example further, with more than one person contained inside a concept called 'people'.

<people>

 <person>
  <name>Andrew Cusworth</name>
  <work>Digital Humanities</work>
 </person>

 <person>
  <name>A.N. Other</name>
  <work>Super Hero</work>
 </person>

</people>

Here, 'people' contains two 'person's, each with their own name and work. I have laid it out slightly differently to make how the elements are nested (arranged in boxes) a little more obvious.

I work through this example again in the following video.

https://www.youtube.com/embed/J9A6MNGyp_E

Grab your pencil and paper, and have a go at encoding yourself in XML. Write a simple paragraph about yourself and see if you can make a data model for it.

Do not worry if this slightly abstract exercise feels a little confusing at this point; it will gradually begin to feel familiar, and it is intended only to get you thinking about how data is structured and expressed in computer-friendly terms.
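If you would like something to compare your attempt against, here is one possible sketch that uses the facts about Prince Albert given at the start of this session. The element names ('born', 'marriage', 'spouse', 'year', 'numberOfChildren') are invented purely for this illustration and are not TEI elements; there is no single right answer, and your own model may well group things differently.

```xml
<person>
  <name>Prince Albert of Saxe-Coburg and Gotha</name>
  <born>1819</born>
  <!-- The marriage is its own 'box', holding the details that belong to it. -->
  <marriage>
    <spouse>Queen Victoria</spouse>
    <year>1840</year>
  </marriage>
  <numberOfChildren>9</numberOfChildren>
</person>
```

Notice that 'spouse' and 'year' are nested inside 'marriage', just as 'name' and 'work' were nested inside 'person' in the earlier example.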

For this first exercise, we are going to limit ourselves very strictly in how much we are going to mark up the document. This will not be a comprehensive introduction to TEI and how to use it, and there are quite a few elements that we will be leaving out for now; rather, the intention is to give you a flavour of what this kind of work feels and looks like, and we will deepen that as we go along.

For this first bit of transcription and mark-up, we will use only the following tags:

<text></text>
This tag encompasses the body of the document we are transcribing and lets the computer know that it is a text.

<p></p>
This tag encompasses and denotes a paragraph.

<pb/>
This is one of two 'short tags' that we will be using, and denotes a page-break. Note that it doesn't have a separate 'closing tag', and is instead abbreviated.

<lb/>
This is the second of our 'short tags', and indicates a line-break.

<name></name>
This, unsurprisingly, encompasses a name. As you might have guessed, this is a little non-specific, so in this instance, we will make it more so by using <name type='person'></name>, which will tell us that we are seeing a person's name.
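As a small illustration of what the 'type' attribute adds, compare the two versions below, using a name that appears in our speech. Single or double quotation marks around the attribute value are both acceptable in XML, as long as they match.

```xml
<!-- Without an attribute: we know this is a name, but not what kind. -->
<name>The Queen</name>

<!-- With a type attribute: we know it is the name of a person. -->
<name type="person">The Queen</name>
```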

Remember that tags (except short tags) must both be opened and closed to be correct, and that they must not overlap - just as a box inside a box cannot overlap without causing some very significant issues with the fabric of space-time.

For example, <text><p></text></p> is very definitely wrong, whereas <text><p></p></text> is very much right.

Returning to our currently blank transcription, let's start by saving it as a new document so that we don't lose our original. Name it something like '[Your Name] Literary fund speech - TEI transcription'.

Now, to the editing! Begin by wrapping the whole document in a 'text' tag. We will put an opening '<text>' tag at the top and a closing '</text>' tag at the bottom.

Next, let's mark in the page breaks using the '<pb/>' short tag. It is possible to give a page number to each of these, which allows them to be identified sequentially. If you would like to do this, it looks like

<pb n="1"/>, where the number corresponds to the page number.

Now that we have marked out the page breaks, let us add in the paragraphs using a '<p>' opening tag at the beginning and a '</p>' closing tag at the end of each. You might be wondering, particularly after my comment about needing to not have overlapping tags (boxes and space-time), what happens if a paragraph straddles a page break. In this instance, it is helpful to remember that a page break ('<pb/>') is opened and closed in the same tag, and therefore does not break the rules!

The final layout tag we should add is the line break '<lb/>' short tag. There are quite a few of these, and so it is a little tedious. I suggest that you type it out once, copy it, and simply paste it in at the end of each line. Remember that it is not needed at the end of a paragraph or the end of the document.
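A paragraph that runs across a page break might look something like the sketch below. The bracketed text is placeholder content, not the wording of the speech, and the page numbers are only examples.

```xml
<pb n="1"/>
<p>[opening lines of the paragraph, at the foot of page one]
<pb n="2"/>
[remaining lines of the paragraph, at the top of page two]</p>
```

The page break sits wholly inside the paragraph: it opens and closes in a single tag, so nothing overlaps.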

We will now add our last semantic tags, the '<name>' tags, around 'The Queen' and 'Marquis of Lansdowne'. In this last instance, you will again notice that our opening and closing tags surround an '<lb/>', but, do not worry, the same rules apply here as in the case of paragraphs and page breaks.
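Putting all of the steps together, the overall shape of your file should resemble the sketch below. The bracketed text stands in for your own transcription, and the point at which the line break falls inside the second name is an assumption made for the sake of illustration; follow what you see in the manuscript.

```xml
<text>
<pb n="1"/>
<p>[first line of the speech]<lb/>
[words before] <name type="person">The Queen</name> [words after]<lb/>
[words before] <name type="person">Marquis of<lb/>
Lansdowne</name> [words after]<lb/>
[last line of the paragraph]</p>
</text>
```

Every opening tag has its matching closing tag, each short tag is complete in itself, and no element overlaps another.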

Save the document, and, there you have it. You have completed your first bit of TEI mark-up! It is not, for reasons that I will explain in a later session, a complete TEI document, but I hope that it gives you a sense of the work involved in creating this sort of digital transcription and edition. A lot of thought goes into how we mark up documents, and how detailed each scheme of work should be, which is something we will also talk about a little more in the future.

https://www.youtube.com/embed/NhNgghbK8uw

In the next session, we will look at a number of Albert's speeches, with each participant being assigned a speech to transcribe. The process of transcribing and marking up will be broadly similar to this, and there will be room for discretion about the level of mark-up detail you want to pursue.

We will also, in that session, produce a 'digital print' edition, with a walk-through of some of the considerations that go into that work.

The first contact session will largely be about introductions and troubleshooting from today's session. In the second, you will have an opportunity to share your self-transcribed letter, and discuss what you think is interesting (or not) about it.


Who was Prince Albert and why are we transcribing his papers?

Why transcribe?

Installing Brackets

First-pass transcription

What is TEI?

Marking up Prince Albert's Speech to the Literary Fund

Looking forward to next time
