Keyword Extraction from Text

Code introduction


This function returns the n most frequent words in the given text as keywords. Matching is case-sensitive, and common words such as "the" are not filtered out.


Technology Stack : re, collections.Counter

Code Type : Function

Code Difficulty : Intermediate


                
                    
import re
from collections import Counter


def extract_keywords(text, n=5):
    """
    Extract keywords from the text.
    """
    # Match every word with a regular expression
    words = re.findall(r'\b\w+\b', text)
    # Count word frequencies with Counter
    word_counts = Counter(words)
    # Take the n most common words as keywords
    keywords = word_counts.most_common(n)
    return [keyword[0] for keyword in keywords]
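

Because the function above counts "The" and "the" as different words and keeps very common words, its output on ordinary prose can be dominated by articles and conjunctions. The following is a minimal sketch of one possible variant, not part of the original snippet: the name extract_keywords_normalized, the stop_words parameter, and the small STOP_WORDS set are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; a real one would be longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def extract_keywords_normalized(text, n=5, stop_words=STOP_WORDS):
    """Variant of extract_keywords that lowercases words and skips stop words."""
    # Lowercase first so that "The" and "the" are counted as the same word
    words = re.findall(r"\b\w+\b", text.lower())
    # Count only the words that are not in the stop-word set
    counts = Counter(w for w in words if w not in stop_words)
    # most_common returns (word, count) pairs; keep only the words
    return [word for word, _ in counts.most_common(n)]


sample = "The cat chased the mouse. The mouse escaped, and the cat slept."
result = extract_keywords_normalized(sample, n=2)
print(result)  # ['cat', 'mouse']
```

Words with equal counts come back in the order they first appear in the text, which is why "cat" precedes "mouse" here.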