The use of content mining is said to be lower in Europe than in some countries in the Americas and Asia. The EU is therefore funding the FutureTDM project, which seeks to improve uptake of text and data mining (TDM) in Europe. FutureTDM actively engages with stakeholders such as researchers, developers, publishers and SMEs, and looks in depth at the TDM landscape in the EU to help pinpoint why uptake is lower, to raise awareness of TDM and to develop solutions.

Location: Marlborough Arms
Nearest tube stations: Goodge St, Warren St, Euston Sq.
Open access; open data; open source; open knowledge networking... Today these are not just aspirations or a fairy-tale Aladdin's Cave - they are the everyday tools and techniques we can exploit to open doors for ourselves and our users. Join us in this Meetup at the Marlborough Arms to consider the open opportunities for text and data mining, see the techniques in action and fire yourself up for the Open Season!
Programme
17:30 | Registration | |
18:00 | Welcome and introductions | Chair: Stella Dextre Clarke |
18:05 | FutureTDM project - Progress and prospects | Freyja van den Boom, FutureTDM |
18:20 | Copyright - the Implications for Text Miners | Ben White, British Library |
18:40 | Quick Refreshment Break | |
19:00 | Content Mining in Action | Peter Murray-Rust, Cambridge University |
19:30 | Panel session and Q/A - what's the knowledge organizing perspective? | |
20:00 | Networking | |
Speakers and abstracts
-
Freyja van den Boom
FutureTDM project - progress and prospects
Talk
-
Ben White
Copyright - the Implications for Text Miners
Talk
Ben White will explain how the Copyright exception came about and the role of the British Library in finding a solution.
-
Peter Murray-Rust
Content Mining in Action
Talk
Every day about 15,000 journal articles and over 500 theses are published. We must use machines and software if we are to avoid drowning in them! Peter Murray-Rust is a scientist-turned-activist who, with funding from The Shuttleworth Foundation, has created a community of excited young enthusiasts who are building the tools, resources and practices needed. Given these tools, content mining can be done by anyone, and six young Fellows have been appointed to carry out research projects in biomedical science.
Starting from an authority list or thesaurus of terms used in your discipline (which you can create if there isn't one) you'll be surprised how many documents you can find using TDM tools. The next step is to extract information and aggregate, and that is also possible. In this talk, Prof Murray-Rust showed the various stages - crawl, download, normalize, search, index, and understand - everything Open Source and re-usable. His plant science slides at http://www.slideshare.net/petermurrayrust/high-throughput-mining-of-the-plantscience-literature illustrate the possibilities.
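To make the idea of starting from a thesaurus more concrete, here is a minimal sketch of the normalize, search and index stages in Python. It is not the software shown in the talk: the function names, the three-term thesaurus and the sample documents are all invented for illustration, and a real pipeline would first need the crawl and download stages to obtain the documents.

```python
import re
from collections import Counter, defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so matching ignores layout."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def build_pattern(terms):
    """Compile one regex that matches any thesaurus term as a whole phrase."""
    # Longest terms first, so multi-word terms win over shorter overlaps.
    ordered = sorted((normalize(t) for t in terms), key=len, reverse=True)
    alternatives = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b")


def mine(documents, terms):
    """Return (term -> set of document ids, term -> total occurrences)."""
    pattern = build_pattern(terms)
    index = defaultdict(set)
    counts = Counter()
    for doc_id, text in documents.items():
        for match in pattern.findall(normalize(text)):
            index[match].add(doc_id)
            counts[match] += 1
    return index, counts


# Hypothetical thesaurus and documents, invented for this sketch.
THESAURUS = ["Arabidopsis thaliana", "drought tolerance", "chlorophyll"]
DOCUMENTS = {
    "doc1": "Drought  tolerance in Arabidopsis\nthaliana seedlings.",
    "doc2": "Chlorophyll content and drought tolerance were measured.",
    "doc3": "A survey of library catalogues.",
}

index, counts = mine(DOCUMENTS, THESAURUS)
for term in sorted(index):
    print(term, sorted(index[term]), counts[term])
```

The resulting index maps each thesaurus term to the documents that mention it, which is the raw material for the aggregation step. Exact phrase matching is the simplest possible choice here; handling synonyms, plurals or spelling variants would need a richer thesaurus or fuzzier matching.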