Entity Recognition Using AllenNLP with a BERT Model

Code introduction


This function uses the Predictor class from the AllenNLP library to predict entities in the input text. It first loads a pre-trained model. The input text is then tokenized and converted into an AllenNLP Instance, and the predictor is run on that instance to predict the entities.


Technology Stack : AllenNLP

Code Type : Function

Code Difficulty : Intermediate


                
                    
def random_entity_recognition(text, model_path):
    from allennlp.predictors.predictor import Predictor

    # Load the model. model_path must point to a trained AllenNLP archive
    # (a local path or URL to a model.tar.gz). A bare transformer name such
    # as "bert-base-cased" is not an archive and cannot be loaded this way.
    # Archives published with allennlp-models need that package installed.
    predictor = Predictor.from_path(model_path)

    # Tokenize the input text, create an Instance and predict the entities.
    # The predictor performs all three steps using the dataset reader stored
    # in the archive. The "sentence" key is the one sentence-tagger style
    # predictors read; other predictor types may expect a different key.
    predictions = predictor.predict_json({"sentence": text})

    # The result is already a plain JSON-style dict, so no device move is needed
    return predictions
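
The predictions returned above are per-token rather than per-entity. As a rough sketch, assuming the result contains parallel "words" and "tags" lists and that the tags follow a BIO or BIOUL scheme (typical for AllenNLP taggers, but worth checking against your own model), a small helper can group them into entity spans. The name extract_entities is hypothetical and not part of AllenNLP.

```python
from typing import Dict, List, Optional, Tuple


def extract_entities(prediction: Dict[str, list]) -> List[Tuple[str, str]]:
    """Group per-token tags into (entity_text, label) pairs.

    Assumption: `prediction` has parallel "words" and "tags" lists, and the
    tags look like "B-PER", "I-PER", "L-PER", "U-LOC" or "O".
    """
    entities: List[Tuple[str, str]] = []
    current_words: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the entity collected so far, if there is one
        nonlocal current_words, current_label
        if current_words:
            entities.append((" ".join(current_words), current_label))
        current_words, current_label = [], None

    for word, tag in zip(prediction["words"], prediction["tags"]):
        prefix, _, label = tag.partition("-")
        if prefix == "O" or not label:
            # Outside any entity: close whatever was open
            flush()
            continue
        # B/U always start a new entity; I/L start one if the label changes
        if prefix in ("B", "U") or label != current_label:
            flush()
        current_words.append(word)
        current_label = label
        # L/U mark the last token of an entity in the BIOUL scheme
        if prefix in ("L", "U"):
            flush()

    flush()  # an entity may run to the end of the sentence (plain BIO)
    return entities


if __name__ == "__main__":
    # Hand-written sample in the assumed output shape, not real model output
    sample = {
        "words": ["Barack", "Obama", "visited", "Paris", "."],
        "tags": ["B-PER", "L-PER", "O", "U-LOC", "O"],
    }
    print(extract_entities(sample))
```

In practice the dict returned by random_entity_recognition would be passed in place of the hand-written sample, provided its keys match the assumption above.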
              