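This function extracts the text content of every element with a given tag name in an HTML string and returns the results as a list of whitespace-stripped strings, one per matching element. Text inside nested child elements is included. By default, it extracts text from `<p>` elements.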
Technology Stack : lxml
Code Type : Function
Code Difficulty : Intermediate
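from lxml import etree


def extract_text_from_html(html, tag='p'):
    """Return the stripped text of every <tag> element in an HTML string."""

    def extract_text(element):
        # itertext() yields the element's own text and that of all descendants.
        return ''.join(element.itertext()).strip()

    tree = etree.HTML(html)
    if tree is None:  # etree.HTML can return None when there is nothing to parse
        return []
    # tag is interpolated into the XPath as-is, so pass a plain tag name.
    return [extract_text(el) for el in tree.xpath(f"//{tag}")]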
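
A short usage sketch may help show what the function returns. The `sample_html` string and the variable names below are made up for illustration, and the function body is repeated in compact form only so the example runs on its own; it assumes lxml is installed.

```python
from lxml import etree


def extract_text_from_html(html, tag='p'):
    # Compact restatement of the function above so this sketch is self-contained.
    tree = etree.HTML(html)
    if tree is None:
        return []
    return [''.join(el.itertext()).strip() for el in tree.xpath(f"//{tag}")]


# Hypothetical sample document, used only for illustration.
sample_html = """
<html><body>
  <h1>Title</h1>
  <p>First <b>bold</b> paragraph.</p>
  <p>  Second paragraph.  </p>
</body></html>
"""

# Default tag: every <p> element, with nested <b> text included.
paragraphs = extract_text_from_html(sample_html)
print(paragraphs)  # ['First bold paragraph.', 'Second paragraph.']

# A different tag name selects other elements.
headings = extract_text_from_html(sample_html, tag='h1')
print(headings)  # ['Title']
```

Because the tag name is placed directly into the XPath expression `//{tag}`, it is probably safest to pass only plain, trusted tag names rather than user-supplied input.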