Creating a Scatter Plot with Overlaid Histograms for Two Datasets

Code introduction


This function draws two side-by-side panels. The left panel is a scatter plot of the two datasets. The right panel overlays two histograms on the same axes: a standard histogram of the x data and a cumulative histogram of the y data, in which each bar counts all y values up to and including that bin.


Technology Stack : Matplotlib, NumPy

Code Type : Function

import numpy as np
import matplotlib.pyplot as plt

def plot_scatter_with_histogram(x, y):
    """
    Draw a scatter plot of x against y in the left panel and, in the right
    panel, overlay a histogram of x with a cumulative histogram of y.
    """
    fig, ax = plt.subplots(1, 2, figsize=(12, 6))
    
    # Scatter plot
    ax[0].scatter(x, y)
    ax[0].set_title('Scatter Plot')
    ax[0].set_xlabel('X Data')
    ax[0].set_ylabel('Y Data')
    
    # Overlaid histograms: plain counts for x, cumulative counts for y
    ax[1].hist(x, bins=30, alpha=0.5, label='X Data')
    ax[1].hist(y, bins=30, alpha=0.5, label='Y Data (cumulative)', cumulative=True)

    # Set the title and axis labels once; repeated calls overwrite each other
    ax[1].set_title('Histograms of X and Y Data')
    ax[1].set_xlabel('Value')
    ax[1].set_ylabel('Frequency')
    ax[1].legend()  # display the labels passed to hist()
    
    plt.tight_layout()
    plt.show()

# Example usage:
# x = np.random.randn(100)
# y = np.random.randn(100)
# plot_scatter_with_histogram(x, y)
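
To make the cumulative=True option used for the y histogram more concrete, here is a minimal sketch that reproduces the same bar heights with NumPy alone. The helper name cumulative_counts and the fixed random seed are assumptions introduced for illustration; they are not part of the original function.

```python
import numpy as np

def cumulative_counts(data, bins=30):
    """Return (counts, cumulative, edges) for `data` (hypothetical helper)."""
    # Plain per-bin counts, as drawn by a standard histogram
    counts, edges = np.histogram(data, bins=bins)
    # Running total of the counts, as drawn when cumulative=True
    cumulative = np.cumsum(counts)
    return counts, cumulative, edges

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
y = rng.standard_normal(100)
counts, cumulative, edges = cumulative_counts(y)

# The last cumulative bar accounts for every sample in y
print(cumulative[-1] == len(y))
```

Because the cumulative bars rise to the full sample size while the plain bars stay much lower, the two histograms sit on very different vertical scales when drawn on the same axes, which is worth keeping in mind when reading the right-hand panel.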