Text mining is an interdisciplinary field combining techniques from linguistics, computer science and statistics to build tools that can efficiently retrieve and extract information from digital text (see PLOS blog: Announcing the PLOS Text Mining Collection, April 17, 2013). For instance, it uses powerful computers to find links between drugs and side effects, or genes and diseases, that are hidden within the vast scientific literature. These are discoveries that a person scouring through papers one by one might never make. Interest in text and data mining of scholarly content is increasing. For those who want to learn more, a recording of the CrossRef Text and Data Mining webinar (June 3, 2014) may be of interest.

Mining for Insights

Siân Harris investigates the role of text and data mining in research – and what the publishing industry is doing, and could do, to help. 

Text and data mining is a hot topic. It has been debated extensively in copyright and open-access circles and is mentioned in many recent policies in these areas. But is there a fundamental disconnect between what researchers want to do and what information providers think they need?

