The latest milestone for IBM researchers in their development of the Watson platform is the creation of a "state-of-the-art" system for automatically abstracting documents.
Using a deep-learning strategy, the research team tasked with improving Watson's question-answering algorithms generated short summaries of millions of English newswire reports. "In this work, we focus on the task of text summarization, which can also be naturally thought of as mapping an input sequence of words in a source document to a target sequence of words called summary," the researchers note.
The deep learning-based sequence-to-sequence approach they employed is more commonly used for machine translation. The researchers point out that summarization differs significantly from translation: the summary is usually short, its length does not depend heavily on the length of the document, and it is acceptable to omit all but the core concepts in the source material.
The researchers report that their attentional encoder-decoder recurrent neural network outperforms a recent cutting-edge model used by Facebook to generate summaries. "They are surprisingly good and would easily pass muster for a human-generated summary in most cases," the team says of the system's output.
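For readers unfamiliar with the term, the "attentional" part of an encoder-decoder network can be illustrated with a toy example. The sketch below is not the researchers' model: it assumes simple dot-product scoring, and the names `attend`, `encoder_states`, and `decoder_state` are hypothetical, chosen only for illustration. It shows how a decoder state is compared against each encoded source word to produce weights, which are then used to blend the source states into a single context vector.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    # Score each source position by its dot product with the decoder state
    # (an assumed scoring function; real systems often learn this).
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the encoder states.
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy inputs: three source positions, each encoded as a 2-dimensional vector.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the first and third source positions align equally well with the decoder state, so they receive equal weight and the second position receives the least. In a summarizer, this lets each output word draw mostly on the most relevant parts of the source document.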
The ability of machines to summarize text in a way that captures its key meaning is important if computers are to attain a human-like understanding of language.
Abstracts Copyright © 2016 Information Inc., Bethesda, Maryland, USA