The digital age has given us unprecedented access to information. Researchers can now obtain far more research into their subject areas than ever before. But how much is too much? And could AI hold the key to tackling information overload in research? In the first of two blogs, we look at the role of publishers in this issue and the development of machine-generated books.
There are now millions of academic articles published every year. Even within niche subject areas, the sheer volume of papers, pre-prints, and data published is far too great for an individual researcher to stay abreast of. The first wave of the Covid-19 pandemic in 2020 is particularly illustrative of this problem. During the first six months of 2020, the number of articles published about Covid-19 grew from zero to 28,000. In mid-May, nearly 3,000 papers were published in a single week. It would be impossible for a researcher to read this many papers – and still manage to do their own research. So how can they make sure they’re reading the seminal papers and most important findings? And is there more that publishers can do to support them?
We first looked at this subject in a webinar with Dr. Stephanie Preuss, Senior Editor at Springer Nature, and Markus Kaindl, Springer Nature’s Group Product Manager for Research Intelligence. Here, we review what they covered and the further developments that have taken place since.
Stephanie explained that while publishers can be part of the problem, they can also be part of the solution. This is why Springer Nature is investing in technology that uses artificial or ‘augmented’ intelligence to address the ‘information overload’ challenge.
“As a large publisher, we have a lot of different brands, and, of course, those brands publish a lot of research,” said Dr. Stephanie Preuss, speaking at the webinar. “We’re very proud of that research, but it also means we are part of the problem of information overload.”
So what exactly can publishers do? Stephanie laid out some of the key development areas:
"We think that there are some important questions around the role of artificial intelligence and publishing,” said Stephanie. We think that artificial intelligence will shape the future of our industry."
Stephanie went on to explain the development and release of the first-ever machine-generated academic book. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, was published in April 2019 following a collaboration between Springer Nature, the Applied Computational Linguistics lab of Goethe University Frankfurt, and Digital Science.
This innovative book prototype provided a compelling machine-generated overview of the latest research on lithium-ion batteries, automatically compiled by an algorithm dubbed “Beta Writer”. The launch of the book generated significant media attention – with ge.com naming it one of their “coolest things on Earth this week”.
In the three years leading up to the book’s publication, more than 53,000 papers and articles were published about research being conducted in the lithium-ion battery field. Staying on top of all that research would be nearly impossible for any one researcher.
As Andrew Liszewski wrote for Gizmodo, “It’s a firehose of data that Springer Nature has turned into a manageable trickle through this machine-generated publication.”
The algorithm uses machine learning to first analyze thousands of publications to ensure that only those relevant are selected for the book. It then parses, condenses, and organizes those pre-approved, peer-reviewed publications from Springer Nature’s online database into coherent chapters and sections that each focus on a different aspect of battery research.
The algorithm produces no new results – it’s not new research output – but it accurately provides an unbiased summary of all known facts on a subject to provide a new perspective.
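For readers who like to see the idea in code, the select-then-organize steps described above can be pictured with a small Python sketch. This is not Beta Writer's actual implementation, which this post does not detail. The `Paper` type, the keyword matching, the topic lists, and the "first sentence as summary" shortcut are all hypothetical stand-ins for the machine learning the real system uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record for one publication (illustration only)."""
    title: str
    abstract: str


def is_relevant(paper, keywords, min_hits=1):
    """Crude relevance check: count how many keywords appear in the text."""
    text = f"{paper.title} {paper.abstract}".lower()
    hits = sum(1 for kw in keywords if kw in text)
    return hits >= min_hits


def assign_topic(paper, topics):
    """Pick the topic whose terms occur most often; 'other' if none match."""
    text = f"{paper.title} {paper.abstract}".lower()
    scores = {
        name: sum(text.count(term) for term in terms)
        for name, terms in topics.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def first_sentence(text):
    """Stand-in for condensing: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


def build_overview(papers, keywords, topics):
    """Select relevant papers, then group short summaries into chapters."""
    chapters = defaultdict(list)
    for paper in papers:
        if not is_relevant(paper, keywords):
            continue  # step 1: drop anything off-topic
        summary = f"{paper.title}: {first_sentence(paper.abstract)}"
        chapters[assign_topic(paper, topics)].append(summary)  # steps 2-3
    return dict(chapters)
```

Calling `build_overview` with a list of papers, a keyword list such as `["lithium-ion"]`, and a topic map such as `{"anodes": ["anode"], "cathodes": ["cathode"]}` returns a dictionary of chapter names, each holding one-line summaries of the papers that were kept. Like the process described above, it creates no new findings; it only filters and rearranges what has already been published.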
The book published in 2019 was only the start of our work looking at machine-generated texts. In 2021, we published over 500 machine-generated literature overviews, and offered a new book format – AI-based literature overviews.
The new product is a mixture of human-written text and machine-generated literature overviews. An author puts these machine-generated reviews, created from a large set of previously published articles in Springer Nature journals, into book chapters to provide a scientific perspective.
"This is an exciting step in our innovation journey that started with the first machine-generated book, as this is effectively a new type of book format that resembles a kind of dialogue between the author (now editor) and the machine."
Climate, Planetary and Evolutionary Sciences: A Machine-Generated Literature Overview, edited by Guido Visconti, is the first publication of this kind. Professor Visconti devised a series of questions and keywords related to different aspects of climate studies, examining their most recent developments and their most practical applications. The machine then queried, discovered, collated, and structured the relevant literature using AI clustering, and the results were presented in a series of book chapters for Professor Visconti to put into scientific context. The same model was used in 2022 to publish CRISPR: A Machine-Generated Literature Overview, edited by Ziheng Zhang, Ping Wang, and Ji-Long Liu.
"We are looking forward to seeing how this joint journey of authors, publishers, and machines helps advance science and show authors surprising new opportunities for future research. We hope others will be inspired and invite the submission of new ideas to produce similar publications in other research areas."
Look out for our second blog on this topic, where we’ll consider how AI can help support the research community during times of crisis, from ‘TLDR’ abstracts to automating scientific content generation.