Sentiment Analysis of Text Using NLTK Components

Code introduction


This function uses several components from the nltk library to analyze the sentiment of a piece of text. It first splits the text into words with word_tokenize, removes common English stopwords using the stopwords corpus, lemmatizes the remaining words with WordNetLemmatizer, and finally scores each word with the VADER sentiment analyzer, averaging the compound scores to produce an overall sentiment score for the text.


Technology Stack : nltk (Natural Language Toolkit), word_tokenize (word tokenization), stopwords (stopwords corpus), WordNetLemmatizer (lemmatization), sentiment.vader (VADER sentiment analyzer)

Code Type : Function

Code Difficulty : Intermediate
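
Before the function can run, NLTK's data packages must be available locally. A one-time setup sketch, assuming the standard resource names for these components (exact requirements can vary by NLTK version):

import nltk

# One-time downloads for tokenization, stopwords, lemmatization, and VADER
nltk.download('punkt')          # tokenizer models used by word_tokenize
nltk.download('stopwords')      # stopword lists, including English
nltk.download('wordnet')        # lexical database used by WordNetLemmatizer
nltk.download('vader_lexicon')  # lexicon for SentimentIntensityAnalyzer
# Note: newer NLTK releases may also require nltk.download('punkt_tab')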


import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.sentiment.vader import SentimentIntensityAnalyzer

def analyze_text_sentiment(text):
    # Tokenize the text into words
    words = word_tokenize(text)

    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_words = [w for w in words if w.lower() not in stop_words]

    # Lemmatize the words
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = [lemmatizer.lemmatize(w) for w in filtered_words]

    # Score each word with VADER; create the analyzer once, not per word
    analyzer = SentimentIntensityAnalyzer()
    sentiment_scores = [analyzer.polarity_scores(w)['compound'] for w in lemmatized_words]

    # Average the compound scores; guard against empty input
    if not sentiment_scores:
        return 0.0
    return sum(sentiment_scores) / len(sentiment_scores)
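
A minimal usage sketch; the sample sentence and output formatting here are illustrative, not part of the original function:

if __name__ == '__main__':
    sample = "The plot was wonderful, but the pacing felt terribly slow."
    score = analyze_text_sentiment(sample)
    # Compound scores range from -1 (most negative) to +1 (most positive)
    print(f"Overall sentiment score: {score:.3f}")

Note that scoring words one at a time discards context such as negation and intensifiers; VADER's polarity_scores can also be applied to the full text in a single call, which is usually more accurate for sentence-level sentiment.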