Generating Normally Distributed Samples from Pandas Series

Code introduction


This function draws a normally distributed sample based on a pandas Series. The caller chooses the mean of the distribution and the number of values to draw; the standard deviation is taken from the original data (its sample standard deviation, as returned by Series.std()).


Technology Stack: pandas, numpy, scipy.stats

Code Type: Data generation and processing

Code Difficulty: Intermediate


                
                    
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp

def sample_normal_distribution(data, mean, size=100):
    """
    Generate normally distributed samples centered on a given mean.

    Only the spread of ``data`` is used: the standard deviation of the
    normal distribution is ``data.std()``.

    Args:
        data (pandas.Series): Original data; supplies the standard deviation.
        mean (float): Mean of the normal distribution to sample from.
        size (int): Number of samples to generate (default 100).

    Returns:
        pandas.Series: The generated samples, with a default integer index.
    """
    # Sample standard deviation of the original data
    # (Series.std() uses ddof=1 and skips NaN values by default)
    std_dev = data.std()
    
    # Generate a normally distributed sample
    samples = np.random.normal(mean, std_dev, size)
    
    # Convert the numpy array to a pandas Series
    sample_series = pd.Series(samples)
    
    return sample_series
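
A minimal usage sketch follows. The input values, the target mean of 50.0, and the seed are made up for illustration and are not part of the original snippet. The function is repeated in condensed form so the block runs on its own. Since the snippet imports ttest_1samp without using it, one plausible use is shown here: a one-sample t-test of the generated values against the requested mean.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp


def sample_normal_distribution(data, mean, size=100):
    # Same logic as the function above, condensed so this block is standalone.
    std_dev = data.std()
    samples = np.random.normal(mean, std_dev, size)
    return pd.Series(samples)


# Hypothetical input data, invented for this example.
original = pd.Series([12.1, 9.8, 10.5, 11.3, 10.9, 9.4, 10.2, 11.7])

# Seed NumPy's legacy global generator so repeated runs give the same sample.
np.random.seed(0)
sample = sample_normal_distribution(original, mean=50.0, size=1000)

# The sample mean should be close to 50.0 and the sample standard
# deviation close to original.std().
print(len(sample))
print(sample.mean(), sample.std(), original.std())

# One-sample t-test of the generated values against the requested mean.
result = ttest_1samp(sample, popmean=50.0)
print(result.statistic, result.pvalue)
```

The function relies on np.random.normal, which draws from NumPy's global random state, so seeding with np.random.seed is what makes the output repeatable here. Passing a Series with fewer than two non-missing values would make data.std() return NaN, and the generated values would then be NaN as well.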