Lemmatization with spaCy

Code introduction


This function loads a spaCy language pipeline, runs it over the input text, and returns the lemma (base form) of every token as a list of strings. Punctuation tokens are included in the result.


Technology Stack : spaCy

Code Type : Function

Code Difficulty : Intermediate


                
                    
import spacy


def lemmatize_text(text, model='en_core_web_sm'):
    """
    Lemmatize the input text using a spaCy pipeline.

    Args:
        text (str): The text to be lemmatized.
        model (str): Name of an installed spaCy pipeline package.
            Defaults to 'en_core_web_sm'. The bare 'en' shortcut
            is not accepted by spaCy 3.x.

    Returns:
        list: A list of lemmatized tokens.
    """
    # Load the pipeline; install it first with
    # `python -m spacy download en_core_web_sm`
    nlp = spacy.load(model)

    # Process the text
    doc = nlp(text)

    # Collect the lemma of every token (punctuation included)
    lemmatized_tokens = [token.lemma_ for token in doc]

    return lemmatized_tokens
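
The function above loads the pipeline on every call and keeps punctuation in its output. As a rough sketch of one way to avoid both, the block below caches the loaded pipeline and filters tokens by spaCy's `is_punct` and `is_space` flags. The names `get_pipeline` and `content_lemmas` are hypothetical helpers introduced here for illustration, and the default model name `en_core_web_sm` is an assumption about which pipeline is installed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy pipeline once and reuse it on later calls.

    Hypothetical helper: model_name is assumed to be an installed pipeline.
    """
    import spacy  # imported lazily, only when a pipeline is requested

    return spacy.load(model_name)


def content_lemmas(doc):
    """Return lowercased lemmas, skipping punctuation and whitespace tokens."""
    return [
        token.lemma_.lower()
        for token in doc
        if not (token.is_punct or token.is_space)
    ]


# Usage (assumes en_core_web_sm is installed):
#   nlp = get_pipeline()
#   print(content_lemmas(nlp("The cats were sitting on the mats.")))
```

Lowercasing the lemmas is a design choice that suits counting or matching tasks; drop the `.lower()` call if the original casing matters.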
              