Language Identification with the Polyglot Library

Code introduction


This function uses the Polyglot library to identify the language of the input text. It first runs Polyglot's Detector on the text, then checks whether the detected language code is in the caller-supplied list of languages. If it is, the function creates a Text object, hinted with that code, and returns the language code the Text object reports. Otherwise it returns a message saying the language is not in the list.


Technology Stack : Polyglot, Detector, Text

Code Type : Function

Code Difficulty : Intermediate


                
                    
def random_word_language_identification(text, languages):
    from polyglot.detect import Detector
    from polyglot.text import Text

    # Detector runs detection in its constructor, so the text is passed in directly
    detector = Detector(text)

    # The most likely language is exposed as a Language object
    detected_language = detector.language

    # Check if the detected language code is in the list of languages to identify
    if detected_language.code in languages:
        # Create a Text object from the input text, hinting the detected language
        text_object = Text(text, hint_language_code=detected_language.code)

        # Get the most likely language code for the text
        identified_language = text_object.language.code

        return identified_language
    else:
        return "Language not in the list of specified languages"
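
The function above returns a message string when the language is not in the list, which a caller could mistake for a language code. A minimal sketch of an alternative is shown below, assuming it is acceptable to return None instead. The name identify_language and its detect parameter are hypothetical and not part of Polyglot. The detect hook only exists so the filtering step can be tried without Polyglot or its language data installed.

```python
def identify_language(text, allowed_codes, detect=None):
    """Return the detected language code if it is in allowed_codes, else None.

    `detect` is a hypothetical hook: any callable that maps text to a
    language code. When omitted, Polyglot's Detector is used.
    """
    if detect is None:
        # Imported lazily so the filtering logic works without Polyglot installed.
        from polyglot.detect import Detector

        def detect(value):
            # quiet=True asks Detector not to raise on text it cannot
            # classify reliably; the result may then be a low-confidence guess.
            return Detector(value, quiet=True).language.code

    code = detect(text)
    return code if code in allowed_codes else None


# Usage with a stand-in detector (an assumption for illustration; it always
# reports French, so no Polyglot call is made here).
result = identify_language("Bonjour tout le monde", ["en", "fr"], detect=lambda _: "fr")
```

With Polyglot installed, calling identify_language(text, ["en", "fr"]) without the detect argument falls back to Detector, and the caller can test the result with a plain `is None` check.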