Shuffling and Sampling Dask Arrays

Code introduction


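This function accepts a Dask array (data) and a number of samples (num_samples). It randomly permutes the array along its first axis and then returns the first num_samples entries of the result, which amounts to a random sample drawn without replacement. The result is a lazy Dask array; nothing is computed until it is evaluated, for example with .compute().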


Technology Stack : Dask, NumPy

Code Type : Function

Code Difficulty : Intermediate


                
                    
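import dask.array as da


def shuffle_and_sample(data, num_samples):
    """Return num_samples entries drawn at random, without replacement, from a Dask array."""
    # Randomly permute the array along its first axis.
    # da.random.permutation takes only the array, so no axis argument is passed.
    shuffled_data = da.random.permutation(data)

    # Keep the first num_samples entries of the shuffled array (still lazy)
    sampled_data = shuffled_data[:num_samples]

    return sampled_data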
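
Usage example

The following is a minimal sketch of how the function might be called. The input array, its shape, and its chunk sizes are assumptions chosen for illustration and do not come from the original snippet. The function is repeated in compact form so that the block runs on its own.

```python
import numpy as np
import dask.array as da


def shuffle_and_sample(data, num_samples):
    # Compact copy of the function: permute along the first axis,
    # then keep the first num_samples entries
    shuffled_data = da.random.permutation(data)
    return shuffled_data[:num_samples]


# Hypothetical input: 100 rows of 3 columns, stored in chunks of 25 rows.
# Row k holds the values [3k, 3k + 1, 3k + 2].
data = da.from_array(np.arange(300).reshape(100, 3), chunks=(25, 3))

# Build the lazy sample; no data has been shuffled or read yet
sample = shuffle_and_sample(data, 10)

# Trigger the computation and get a NumPy array back
result = sample.compute()
print(result.shape)  # (10, 3)
```

Because the permutation acts only on the first axis, each sampled row keeps its original column values together. The selected rows differ from run to run, since no random seed is fixed here.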
              