Extracting POS Frequency from Text with spaCy

Code introduction


This function takes a text string as input, runs it through a spaCy pipeline to tag each token with its part of speech, and returns a dictionary that maps each coarse-grained part-of-speech tag (such as NOUN or VERB) to the number of tokens carrying that tag.


Technology Stack : spaCy

Code Type : Function

Code Difficulty : Intermediate


                
                    
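import spacy

# Load the small English pipeline (tokenizer, tagger, parser, NER) once,
# at import time; reloading it on every call is slow.
# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_pos_frequency(text):
    """Return a dict mapping each coarse POS tag to its token count."""
    # Run the pipeline over the whole text
    doc = nlp(text)

    # Count the coarse-grained POS tag of every token.
    # Punctuation and whitespace tokens are counted too (PUNCT, SPACE).
    pos_counts = {}
    for token in doc:
        pos = token.pos_
        pos_counts[pos] = pos_counts.get(pos, 0) + 1

    return pos_counts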
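
The counting step can also be written with collections.Counter, and the pipeline can be passed in as an argument so that the caller decides when it is loaded. The following is a minimal sketch of that variant, not part of the original snippet: the names count_pos and pos_frequency and the nlp parameter are hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_pos(tokens):
    """Count coarse POS tags over any iterable of objects exposing `.pos_`.

    Hypothetical helper: works on a spaCy Doc, but only relies on each
    item having a `pos_` attribute.
    """
    return dict(Counter(token.pos_ for token in tokens))


def pos_frequency(text, nlp=None):
    """Return {POS tag: count} for `text`.

    `nlp` is any callable that turns a string into an iterable of tokens
    (for example a loaded spaCy pipeline). If it is omitted, the small
    English model is loaded here, which assumes spaCy and
    en_core_web_sm are installed.
    """
    if nlp is None:
        import spacy  # imported lazily so the helper above has no hard dependency

        nlp = spacy.load("en_core_web_sm")
    return count_pos(nlp(text))
```

With spaCy installed, a call might look like pos_frequency("The quick brown fox jumps.", nlp=spacy.load("en_core_web_sm")). Loading the pipeline once and passing it in avoids reloading the model when many texts are processed.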
