This function takes HTML content as input, parses it with the BeautifulSoup library, extracts all heading tags (<h1> to <h6>), and returns a dictionary mapping each heading tag name to that heading's text. Because the dictionary is keyed by tag name, a tag name that appears more than once keeps only the text of its last occurrence.
Technology Stack : Beautiful Soup
Code Type : Function
Code Difficulty : Intermediate
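from bs4 import BeautifulSoup, SoupStrainer

HEADING_TAGS = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']

def extract_headings(html_content):
    # Use SoupStrainer to parse only the <h1> to <h6> tags; the tag names
    # must be passed as a single list, not as separate positional arguments
    heading_strainer = SoupStrainer(HEADING_TAGS)
    soup = BeautifulSoup(html_content, 'html.parser', parse_only=heading_strainer)
    # Extract headings and their text; filter by name so tags nested inside
    # a heading (such as <a>) are skipped. A repeated tag name keeps only
    # the text of its last occurrence.
    headings = {tag.name: tag.get_text() for tag in soup.find_all(HEADING_TAGS)}
    return headings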
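
A dictionary keyed by tag name can hold only one entry per heading level, so a page with several <h2> headings loses all but one of them. The sketch below is one possible variant that returns every heading as a (tag name, text) pair in document order. The function name extract_all_headings and the sample HTML are hypothetical and not part of the original snippet.

```python
from bs4 import BeautifulSoup, SoupStrainer

HEADING_TAGS = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']


def extract_all_headings(html_content):
    """Return every heading as a (tag_name, text) pair, in document order.

    Hypothetical variant for illustration; not part of the original snippet.
    """
    # A list of names makes the strainer match any of the six heading tags.
    strainer = SoupStrainer(HEADING_TAGS)
    soup = BeautifulSoup(html_content, 'html.parser', parse_only=strainer)
    # Filter by name again so tags nested inside a heading are not reported.
    # The ' ' separator keeps a space between the text of nested elements.
    return [(tag.name, tag.get_text(' ', strip=True))
            for tag in soup.find_all(HEADING_TAGS)]


# Made-up sample input with two <h2> headings, one containing a nested link.
sample_html = """
<html><body>
  <h1>Guide</h1>
  <p>Intro paragraph.</p>
  <h2>Install</h2>
  <h2>Usage <a href="#usage">link</a></h2>
</body></html>
"""

for name, text in extract_all_headings(sample_html):
    print(name, text)
```

Returning a list of pairs keeps both <h2> headings and preserves their order, at the cost of no longer allowing direct lookup by tag name.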