NLP: Summarizing a document, Python, NLTK, Google Docs ver. 3

Last Updated on November 7, 2021 by shibatau

Learning NLP

There are two approaches to summarizing texts in Natural Language Processing: extraction and abstraction.
In extraction-based summarization, a subset of words or sentences that represent the most important points is extracted from the text and combined to make a summary. In abstraction-based summarization, the summary is written in new words rather than copied from the text.
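
To make the idea of extraction concrete, here is a minimal sketch using only the Python standard library. It is not the code from the posts below: the tiny stop-word list, the naive sentence splitter, and the function names `split_sentences` and `summarize` are my own assumptions for illustration. It scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption; NLTK ships a fuller one).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and",
              "it", "that", "this", "for", "on", "with", "as", "by"}


def content_words(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))  # word frequencies over the whole text

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


text = ("Cats sleep a lot. Cats and dogs are popular pets. "
        "The weather was fine yesterday.")
print(summarize(text, n_sentences=2))
```

Every sentence in the output is copied unchanged from the input, which is what distinguishes extraction from abstraction.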

Here we will learn extraction-based summarization from the following posts.

1. Building a text summarizer in Python using NLTK and scikit-learn class TfidfVectorizer

2. A Gentle Introduction to Text Summarization in Machine Learning

1. Extraction-based summarization 1

You can show the difference between the original document and the summarized one using Google Docs.

You can also compare texts online:

https://countwordsfree.com/comparetexts
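
If you would rather compare the two texts in Python, the standard library module `difflib` can do something similar. This is only a sketch of an alternative to the tools above; the helper name `removed_sentences` and the sample sentences are made up for illustration, and it assumes both texts have already been split into lists of sentences.

```python
import difflib


def removed_sentences(original_sentences, summary_sentences):
    """Return the sentences that are in the original but not in the summary."""
    diff = difflib.ndiff(original_sentences, summary_sentences)
    # ndiff prefixes lines that exist only in the first sequence with "- ".
    return [line[2:] for line in diff if line.startswith("- ")]


original = [
    "Cats sleep a lot.",
    "Cats and dogs are popular pets.",
    "The weather was fine yesterday.",
]
summary = [
    "Cats sleep a lot.",
    "Cats and dogs are popular pets.",
]

# Print a line-by-line diff: unchanged sentences start with a space,
# sentences dropped by the summarizer start with "-".
for line in difflib.unified_diff(original, summary,
                                 fromfile="original", tofile="summary",
                                 lineterm=""):
    print(line)

print(removed_sentences(original, summary))
```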

Scripts

I have run the scripts in the above post on Google Colaboratory. You need to upload two files called “ai.txt” beforehand.

https://colab.research.google.com/drive/1rB_PssCJ21DCWKJhcyDVz9sfBv9XVNF6?usp=sharing
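
The first post listed above is built around scikit-learn's TfidfVectorizer. The sketch below is not the notebook's code; it is a hand-rolled, standard-library illustration of the same idea, assuming each sentence is treated as one "document". The names `tfidf_sentence_scores` and `top_sentences` are hypothetical. The inverse document frequency uses the smoothed form ln((1 + n) / (1 + df)) + 1, but the sketch skips the vector normalization that TfidfVectorizer applies by default, so the numbers will not match scikit-learn's output.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def tfidf_sentence_scores(sentences):
    """Score each sentence by the mean TF-IDF weight of its words."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences does each word occur?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(count * (math.log((1 + n) / (1 + df[word])) + 1)
                    for word, count in tf.items())
        scores.append(total / len(tokens))
    return scores


def top_sentences(sentences, k=2):
    """Return the k best-scoring sentences in their original order."""
    scores = tfidf_sentence_scores(sentences)
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


sentences = [
    "the cat sat",
    "the cat ran",
    "quantum entanglement puzzles physicists",
]
print(top_sentences(sentences, k=1))
```

Because rare words get a higher weight, this scoring favors sentences whose vocabulary is unusual within the document, which is a different bias from plain word-frequency scoring.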

2. Extraction-based summarization 2

I have run the scripts on Google Colaboratory:

https://colab.research.google.com/drive/1aUKCgCvodKS-FYCIMzinQI3SyOKykfdy?usp=sharing

About shibatau

I was born and raised in Kyoto. I studied Western philosophy at university and specialized in analytic philosophy, especially Ludwig Wittgenstein, in graduate school. I'm interested in new technology, especially machine learning; I have been learning R for two years and began learning Python last summer. Listening to Paramore, Sia, Amazarashi and Miyuki Nakajima. Favorite movies I've recently seen: "FREEHELD". Favorite actors and actresses: Anthony Hopkins, Denzel Washington, Ellen Page, Meryl Streep, Mia Wasikowska and Robert DeNiro. Favorite books: Fyodor Mikhailovich Dostoyevsky, "The Karamazov Brothers", Shinran, "Lamentations of Divergences". Favorite phrase: Salvation by Faith. Twitter: @shibatau
