Luigi Task for Fetching and Saving URL Content

Code introduction


This code defines a Luigi task that fetches content from a specified URL and saves it to a file.


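Technology Stack : Luigi, requests

Code Type : Luigi Task

Code Difficulty : Intermediate


import luigi
import requests


def fetch_random_url(url):
    """
    Fetches the given URL and returns the response body as text.
    """
    # The 30-second timeout is an arbitrary choice; it keeps the task
    # from hanging on an unresponsive server.
    response = requests.get(url, timeout=30)
    # Raise requests.HTTPError on 4xx/5xx responses so the task fails
    # instead of saving an error page.
    response.raise_for_status()
    return response.text


class FetchRandomURLTask(luigi.Task):
    # URL to download.
    url = luigi.Parameter()
    # Local file the content is saved to; the default name is an arbitrary choice.
    output_path = luigi.Parameter(default='url_content.txt')

    def run(self):
        content = fetch_random_url(self.url)
        # Write through the target so the file only appears once it is complete.
        with self.output().open('w') as output_file:
            output_file.write(content)

    def output(self):
        # Luigi treats the task as complete once this file exists.
        return luigi.LocalTarget(self.output_path)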
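
Why write through the target

A Luigi task counts as complete as soon as its output exists, so a half-written file left behind by a failed download would make the task look finished. Luigi's local file target, luigi.LocalTarget, avoids this by writing to a temporary file and moving it into place when the file is closed. The sketch below illustrates that idea using only the standard library. It is not Luigi's implementation, and the helper name atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(path, text):
    """Write text to path so that the file appears only when complete."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the destination directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            tmp_file.write(text)
        # os.replace swaps the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial file and leave path untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails partway through, the destination path is never created, so a completeness check based on the file's existence would still report the work as not done.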