VIETNAMESE MULTI-DOCUMENT SUMMARIZATION BASED ON UNSUPERVISED LEARNING METHODS
Abstract
Recently, English summarization has achieved impressive results, while Vietnamese summarization remains at an early stage with limited results. This paper proposes a solution for summarizing Vietnamese text using unsupervised learning.
The article reports the results of applying unsupervised learning methods to summarize a document. To this end, the authors compare unsupervised summarization methods with supervised ones, including CNN and LSTM. The comparison serves to demonstrate the effectiveness of unsupervised learning methods for summarization.
Unsupervised learning methods give promising empirical results for two reasons. First, because they are based on ranking mechanisms, they pick high-scoring sentences, which ensures that important sentences are selected. Second, selecting sentences with low correlation means that the summary text does not overlap with the remaining sentences, which are not included in the summary.
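The two mechanisms named above, ranking and low-correlation selection, can be illustrated with a minimal sketch. It is not the authors' method: the scoring function (mean cosine similarity to the other sentences), the whitespace tokenizer, the threshold `max_sim`, and the function names `tokenize`, `cosine`, and `summarize` are all assumptions made for illustration. In particular, "low correlation" is read here as low similarity to the sentences already selected, and real Vietnamese text would normally need a word segmenter rather than a whitespace split.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization; Vietnamese usually needs word segmentation.
    return sentence.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency vectors (Counter objects).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences, k=2, max_sim=0.5):
    # Hypothetical helper: rank sentences, then greedily keep low-overlap ones.
    vectors = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)

    # Ranking step: score each sentence by its mean similarity to the others.
    scores = []
    for i in range(n):
        sims = [cosine(vectors[i], vectors[j]) for j in range(n) if j != i]
        scores.append(sum(sims) / len(sims) if sims else 0.0)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)

    # Selection step: skip a candidate that is too similar to a chosen sentence.
    chosen = []
    for i in ranked:
        if len(chosen) == k:
            break
        if all(cosine(vectors[i], vectors[j]) <= max_sim for j in chosen):
            chosen.append(i)

    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns at most three sentences. Lowering `max_sim` makes the selection stricter about overlap, at the cost of possibly returning fewer than `k` sentences.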