Random File URL Extraction and JSON Conversion

Code introduction


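The code below provides two functions: `extract_random_file_name` picks a random file name from a specified directory, and `xxx` reads a named file from that directory, uses a regular expression to find all URLs in its content, and returns them as a JSON-formatted string.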


Technology Stack : os, re, json, random

Code Type : Function

Code Difficulty : Intermediate


                
                    
import os
import re
import json
import random

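def extract_random_file_name(directory):
    # Consider only regular files; os.listdir also returns subdirectories
    files = [name for name in os.listdir(directory)
             if os.path.isfile(os.path.join(directory, name))]
    if not files:
        return None
    return random.choice(files)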

def xxx(directory, file_name):
    # Build the full path and check that the file exists in the specified directory
    file_path = os.path.join(directory, file_name)
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File {file_name} does not exist in {directory}")

    # Read the file content
    with open(file_path, 'r') as file:
        content = file.read()

    # Use a regular expression to find all URLs
    urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', content)

    # Convert the URLs found to a JSON-formatted string
    result = json.dumps(urls, indent=4)

    return result
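
The two functions above are meant to be used together: pick a file name, then extract its URLs. The following is a minimal, self-contained sketch of that flow, not the original author's usage. The wrapper name `urls_from_random_file`, the sample file `notes.txt`, the UTF-8 encoding, and the choice to return an empty JSON list for an empty directory are all assumptions made for illustration; the URL pattern is copied unchanged from the code above.

```python
import json
import os
import random
import re
import tempfile

# Same URL pattern as in the function above, compiled once for reuse.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


def urls_from_random_file(directory):
    """Hypothetical wrapper: pick a random regular file and return its URLs as JSON."""
    files = [name for name in os.listdir(directory)
             if os.path.isfile(os.path.join(directory, name))]
    if not files:
        # Assumption: an empty directory yields an empty JSON list
        return json.dumps([])
    file_path = os.path.join(directory, random.choice(files))
    # Assumption: files are text and decodable as UTF-8 (bad bytes are replaced)
    with open(file_path, 'r', encoding='utf-8', errors='replace') as fh:
        content = fh.read()
    return json.dumps(URL_PATTERN.findall(content), indent=4)


# Demo on a temporary directory that holds a single sample file
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'notes.txt'), 'w', encoding='utf-8') as fh:
        fh.write('Docs: https://example.com/a?b=1 and http://example.org/path\n')
    result = urls_from_random_file(tmp)
    print(result)
```

Note that the pattern is permissive: it stops at whitespace, but it will include trailing punctuation such as a comma or period that directly follows a URL, so results may need trimming depending on the input.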