Extract Emails from Text Files in Directory


Code introduction


This function reads every file with a given extension (default .txt) in a directory, extracts the email addresses found in each one, and returns them as a single list. Duplicates are kept, and subdirectories are not searched. It uses the os module to list the directory's contents, the re module to match email patterns, and the time module, through a run_time decorator, to print how long the call took.


Technology Stack : os, re, time

Code Type : Function

Code Difficulty :


                
                    
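import functools
import os
import re
import time

def read_file(file_path):
    # Read the whole file as text. UTF-8 is assumed here; bytes that cannot
    # be decoded are replaced instead of raising UnicodeDecodeError.
    with open(file_path, 'r', encoding='utf-8', errors='replace') as file:
        return file.read()

def find_files(directory, extension):
    # List only the top level of the directory (no recursion) and keep
    # regular files whose names end with the requested extension.
    paths = (os.path.join(directory, name) for name in os.listdir(directory))
    return [path for path in paths if path.endswith(extension) and os.path.isfile(path)]

def extract_emails(text):
    # The final character class is [A-Za-z]; writing [A-Z|a-z] would also
    # accept a literal '|' in the top-level domain.
    return re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)

def run_time(func):
    # Decorator that prints the wall-clock duration of each call.
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()  # monotonic, high-resolution timer
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        print(f"Function {func.__name__} took {end_time - start_time:.4f} seconds to run.")
        return result
    return wrapper

@run_time
def process_directory(directory, extension='.txt'):
    # Collect every address from every matching file; duplicates are kept.
    files = find_files(directory, extension)
    emails = []
    for file in files:
        content = read_file(file)
        emails.extend(extract_emails(content))
    return emails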
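
To illustrate how the extraction behaves on real files, the sketch below builds a throwaway directory with tempfile, writes three sample files, and runs a condensed version of the same pipeline. The helper names collect_emails and demo, the sample file names, and the example.com addresses are all made up for this illustration, and the de-duplication step at the end is an optional extra rather than part of the function described above.

```python
import os
import re
import tempfile

# Email pattern; the final class is [A-Za-z] so a literal '|' is not accepted.
EMAIL_RE = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

def collect_emails(directory, extension='.txt'):
    """Hypothetical condensed helper: scan the top level of `directory`."""
    emails = []
    # sorted() makes the result order deterministic across platforms.
    for name in sorted(os.listdir(directory)):
        if not name.endswith(extension):
            continue
        path = os.path.join(directory, name)
        with open(path, 'r', encoding='utf-8', errors='replace') as fh:
            emails.extend(EMAIL_RE.findall(fh.read()))
    return emails

def demo():
    """Write sample files (invented content) and extract addresses from them."""
    samples = {
        'a.txt': 'Contact alice@example.com or bob@example.org.',
        'b.txt': 'Again: alice@example.com',
        'c.log': 'ignored@example.net',  # wrong extension, so it is skipped
    }
    with tempfile.TemporaryDirectory() as tmp:
        for name, text in samples.items():
            with open(os.path.join(tmp, name), 'w', encoding='utf-8') as fh:
                fh.write(text)
        found = collect_emails(tmp)
    # Optional extra step: drop duplicates and sort for stable output.
    unique = sorted(set(found))
    return found, unique

if __name__ == '__main__':
    found, unique = demo()
    print(found)
    print(unique)
```

The raw list keeps the repeated address from b.txt, which is why a set-based pass is worth adding when only distinct addresses are needed. The .log file never reaches the regular expression because the extension filter runs first.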
              