sihnpy.sliding_window

Module Contents

Functions

bins(data, var, w_size, s_size[, collapse])

Sliding-window function estimating the number of bins to compute.

build_windows(data, var, w_size, s_size, n_bin)

Function deriving the participants in each window. Returns a pandas.DataFrame with only an index.

data_by_window(w_store, data)

This function separates the data into age windows.

sum_by_window(w_data, var)

This function outputs summary measures for the variable used to define the sliding windows.

export_data(w_data, w_summary, var, path, name)

Function exporting sliding window information.

sihnpy.sliding_window.bins(data, var, w_size, s_size, collapse=False)[source]

Sliding-window function estimating the number of bins to compute.

Parameters
  • data (pandas.DataFrame) – Data of the sample containing the variable var to use for sorting and sliding.

  • var (str) – Name (string) of the column to use for sorting

  • w_size (int) – Integer representing the window size (i.e., number of participants per window)

  • s_size (int) – Integer representing the step size (i.e., number of non-overlapping participants per window)

  • collapse (bool, optional) – Switch determining whether the last window has a larger or a smaller number of participants than the other windows. Defaults to False.

Returns

Returns an integer representing the number of windows to use based on the data and parameters provided.

Return type

int
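As a rough illustration of the arithmetic involved, the sketch below counts how many full windows of w_size participants fit when the window advances by s_size participants each step, and how many participants are left over after the last full window. The function name count_full_windows and the formula are assumptions for illustration only; this page does not state how bins computes its result or how collapse treats the leftover participants.

```python
def count_full_windows(n_participants: int, w_size: int, s_size: int) -> tuple[int, int]:
    """Return (number of full windows, participants left after the last full window).

    Hypothetical helper: not part of sihnpy, shown only to illustrate the arithmetic.
    """
    if w_size <= 0 or s_size <= 0:
        raise ValueError("w_size and s_size must be positive integers")
    if n_participants < w_size:
        # Not enough participants for even one full window.
        return 0, n_participants
    # The first window covers w_size participants; each further window needs s_size more.
    n_full = (n_participants - w_size) // s_size + 1
    # 0-indexed position just past the last participant of the last full window.
    last_end = (n_full - 1) * s_size + w_size
    return n_full, n_participants - last_end


n_full, leftover = count_full_windows(105, w_size=20, s_size=10)
print(n_full, leftover)
```

With 105 participants, windows of 20 and steps of 10, this gives 9 full windows and 5 leftover participants. Presumably the collapse switch decides what happens to such a remainder, but that behaviour is not documented here.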

sihnpy.sliding_window.build_windows(data, var, w_size, s_size, n_bin)[source]

Function deriving the participants in each window. Returns a pandas.DataFrame with only an index.

Note: In the original R script, the code creating “bin_list” has an extra +1 because R is 1-indexed. Python is 0-indexed, so the indexing here needs to start at 0.

Parameters
  • data (pandas.DataFrame) – Data of the sample containing the variable var to use for sorting and sliding.

  • var (str) – Name (string) of the column to use for sorting

  • w_size (int) – Integer representing the window size (i.e., number of participants per window)

  • s_size (int) – Integer representing the step size (i.e., number of non-overlapping participants per window)

  • n_bin (int) – Number of windows to derive

Returns

Returns a dictionary where the keys are the names of the windows and the values are the IDs of the participants in each window.

Return type

dict
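The windowing idea can be sketched without the package: sort participants by the sliding variable, then take 0-indexed slices of w_size participants, advancing by s_size each time. The helper sketch_windows, the "id" key, and the window labels ("w_0", "w_1", ...) are hypothetical. The sketch uses a list of plain dictionaries in place of a pandas.DataFrame, and the labels and value types that build_windows actually returns may differ.

```python
def sketch_windows(records, var, w_size, s_size, n_bin):
    """Map hypothetical window labels to the participant IDs in each window."""
    # Sort participants by the sliding variable (e.g., age).
    ordered = sorted(records, key=lambda rec: rec[var])
    windows = {}
    for i in range(n_bin):
        # 0-indexed start position: no extra +1, unlike 1-indexed R code.
        start = i * s_size
        chunk = ordered[start:start + w_size]
        windows[f"w_{i}"] = [rec["id"] for rec in chunk]
    return windows


records = [
    {"id": "sub-03", "age": 71},
    {"id": "sub-01", "age": 64},
    {"id": "sub-04", "age": 75},
    {"id": "sub-02", "age": 68},
    {"id": "sub-05", "age": 80},
]
windows = sketch_windows(records, "age", w_size=3, s_size=1, n_bin=3)
for label, ids in windows.items():
    print(label, ids)
```

Consecutive windows here share w_size - s_size participants, which is what makes the window "slide" rather than partition the sample.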

sihnpy.sliding_window.data_by_window(w_store, data)[source]

This function separates the data into age windows.

Parameters
  • w_store (dict) – Dictionary containing the window labels and the IDs for each window.

  • data (pandas.DataFrame) – DataFrame containing the data to split into windows.

Returns

Dictionary where the keys are the labels of the windows and the values are the dataframes split for each window.

Return type

dict
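A minimal sketch of the splitting step, assuming the data can be looked up by participant ID: for each window label in w_store, keep only the rows whose IDs belong to that window. The helper split_by_window is hypothetical, and a plain dictionary keyed by participant ID stands in for the pandas.DataFrame described above.

```python
def split_by_window(w_store, data):
    """Return {window label: {participant ID: row}} for the IDs listed in w_store."""
    return {
        label: {pid: data[pid] for pid in ids}
        for label, ids in w_store.items()
    }


# Hypothetical inputs: window membership and per-participant data.
w_store = {
    "w_0": ["sub-01", "sub-02"],
    "w_1": ["sub-02", "sub-03"],
}
data = {
    "sub-01": {"age": 64, "score": 0.41},
    "sub-02": {"age": 68, "score": 0.37},
    "sub-03": {"age": 71, "score": 0.52},
}
w_data = split_by_window(w_store, data)
print(sorted(w_data["w_1"]))
```

Because windows overlap, a participant such as "sub-02" appears in more than one of the per-window subsets.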

sihnpy.sliding_window.sum_by_window(w_data, var)[source]

This function outputs summary measures for the variable used to define the sliding windows. It can also be used on other variables in the data, as long as they are continuous.

Parameters
  • w_data (dict) – Dictionary containing the data for each window.

  • var (str) – String representing the name of the variable to generate stats for.

Returns

Summary measures of var for each window.

Return type

pandas.DataFrame
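To give a sense of what per-window summaries of a continuous variable could look like, the sketch below computes the count, mean, minimum and maximum of one variable in each window. The helper summarise_windows and the choice of statistics are assumptions; this page does not list which measures sum_by_window reports. Plain dictionaries stand in for the per-window dataframes.

```python
import statistics


def summarise_windows(w_data, var):
    """Return {window label: summary statistics of `var` in that window}."""
    summary = {}
    for label, rows in w_data.items():
        values = [row[var] for row in rows.values()]
        summary[label] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical per-window data: {window label: {participant ID: row}}.
w_data_example = {
    "w_0": {"sub-01": {"age": 60}, "sub-02": {"age": 62}},
    "w_1": {"sub-02": {"age": 62}, "sub-03": {"age": 66}},
}
age_summary = summarise_windows(w_data_example, "age")
for label, stats in age_summary.items():
    print(label, stats)
```

Summaries like these help check that consecutive windows actually progress along the sliding variable, for example that the mean age increases from one window to the next.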

sihnpy.sliding_window.export_data(w_data, w_summary, var, path, name)[source]

Function exporting sliding window information.