Extracting Hrefs from Specific HTML Tags

2024-12-16 12:09:42 3 Views

Code introduction

The function sends an HTTP request to obtain the web page content, then parses the HTML using BeautifulSoup, and extracts all href attributes from the specified tags.

Technology Stack : BeautifulSoup, requests

Code Type : The type of code

Code Difficulty : Intermediate

                
                    
def find_hrefs_by_tag(url, tag_name):
    """
    This function takes a URL and a tag name as input, and returns all href attributes from the specified tag in the HTML content.
    """
    from bs4 import BeautifulSoup
    import requests

    # Send a GET request to the URL
    response = requests.get(url)
    
    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    
    # Find all tags with the specified tag name
    tags = soup.find_all(tag_name)
    
    # Extract and return all href attributes from these tags
    hrefs = [tag.get('href') for tag in tags if tag.get('href') is not None]
    return hrefs

# JSON representation of the code