Towards Better Single Document Summarization using Multi-Document Summarization Approach
Sandhya Singh, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, Seema Verma
Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-5, Page no. 695-703, May-2019
CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i5.695703
Online published on May 31, 2019
Copyright © Sandhya Singh, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, Seema Verma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Sandhya Singh, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, Seema Verma, “Towards Better Single Document Summarization using Multi-Document Summarization Approach,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, pp.695-703, 2019.
MLA Style Citation: Sandhya Singh, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, Seema Verma "Towards Better Single Document Summarization using Multi-Document Summarization Approach." International Journal of Computer Sciences and Engineering 7.5 (2019): 695-703.
APA Style Citation: Sandhya Singh, Kevin Patel, Krishnanjan Bhattacharjee, Hemant Darbari, Seema Verma, (2019). Towards Better Single Document Summarization using Multi-Document Summarization Approach. International Journal of Computer Sciences and Engineering, 7(5), 695-703.
BibTex Style Citation:
@article{Singh_2019,
author = {Sandhya Singh and Kevin Patel and Krishnanjan Bhattacharjee and Hemant Darbari and Seema Verma},
title = {Towards Better Single Document Summarization using Multi-Document Summarization Approach},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {7},
Issue = {5},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {695-703},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4302},
doi = {https://doi.org/10.26438/ijcse/v7i5.695703},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v7i5.695703
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=4302
TI - Towards Better Single Document Summarization using Multi-Document Summarization Approach
T2 - International Journal of Computer Sciences and Engineering
AU - Sandhya Singh
AU - Kevin Patel
AU - Krishnanjan Bhattacharjee
AU - Hemant Darbari
AU - Seema Verma
PY - 2019
DA - 2019/05/31
PB - IJCSE, Indore, INDIA
SP - 695
EP - 703
IS - 5
VL - 7
SN - 2347-2693
ER -
Abstract
Extractive Single Document Summarization (SDS) is the task of summarizing a single document by extracting important sentences verbatim and arranging them in a cohesive manner. It differs from Multi-Document Summarization (MDS), in which multiple source documents are processed to generate a single summary. This paper proposes a two-stage mechanism that performs single document summarization via a multi-document summarization technique. In the first stage, popular extractive summarization algorithms each generate a summary of the document; in the second stage, these summaries are processed together as a multi-document summarization instance. The MDS approach used is based on word-graph-based sentence fusion followed by a concept-based Integer Linear Programming (ILP) method that maximizes coverage in sentence selection. The proposed system outperforms each of the single document summarizers by at least 2.6 percentage points in ROUGE score, indicating that performing single document summarization via multi-document summarization is a promising avenue for further research in summarization.
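The coverage-maximizing selection step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that concepts are word bigrams, that a concept's weight is the number of candidate sentences containing it, and that the length limit is a word budget. It also replaces the ILP solver with exhaustive search, which gives the same optimum but is only practical for a handful of candidates. The names bigrams and select_sentences are hypothetical.

```python
from itertools import combinations


def bigrams(sentence):
    """Return the set of lowercase word bigrams, used here as 'concepts' (assumption)."""
    words = sentence.lower().split()
    return set(zip(words, words[1:]))


def select_sentences(candidates, budget):
    """Choose the subset of candidates that maximizes the total weight of
    distinct concepts covered, subject to a word budget.

    Exhaustive search stands in for an ILP solver here; it is exponential
    in the number of candidates and meant only as an illustration.
    """
    concepts = [bigrams(s) for s in candidates]
    lengths = [len(s.split()) for s in candidates]

    # Assumed weighting: a concept counts once per candidate sentence containing it.
    weight = {}
    for sentence_concepts in concepts:
        for concept in sentence_concepts:
            weight[concept] = weight.get(concept, 0) + 1

    best_score, best_subset = 0, ()
    for size in range(1, len(candidates) + 1):
        for subset in combinations(range(len(candidates)), size):
            if sum(lengths[i] for i in subset) > budget:
                continue  # violates the length constraint
            # Each covered concept is counted once, so redundancy earns nothing.
            covered = set().union(*(concepts[i] for i in subset))
            score = sum(weight[c] for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return [candidates[i] for i in best_subset], best_score
```

Because a covered concept contributes its weight only once, two near-duplicate sentences produced by different first-stage summarizers add little beyond what one of them already covers, so the objective favors a shorter sentence that brings new concepts.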
Keywords / Index Terms
Text Summarization, Single Document Summarization (SDS), Multi-Document Summarization (MDS), Extractive Summarization, Integer Linear Programming (ILP)