Data: Challenges in dealing with very large collections of speech

Data - Lunchtime talks for researchers - IT Services

John Coleman will talk about how human languages are extraordinarily large, complex and variable, and how, until recently, our ability to analyse or model them has been profoundly challenged by a problem of scale. For it to become feasible to analyse their many dimensions of variation, far bigger data sources will be needed. Aggregating existing speech corpora could be a first step towards a resource large enough to begin getting to grips with some of the main sources of variation.

The ‘Data’ series of lunchtime talks is aimed at researchers from all disciplines who generate, gather, rearrange, or re-purpose data. The talks will deal with software, services, and techniques to help you make the most of that data. Covering data visualisation tools and projects, and aspects of data management from planning to re-use, these talks are intended to inspire, whilst also considering the practical requirements of research funders and issues surrounding data sharing. Join us to hear how others are doing things with data, both in Oxford and outside, and come along to tell us about what you are doing.

Event Link: http://courses.it.ox.ac.uk/detail/TPJB6