`sihnpy.imbalance_mapping`

Imbalance mapping module

Module Contents

Functions

`odregression_single`(index, x, y)	Function computing orthogonal regression. Was developed as an adaptation of the Pracma
`_pre_mapping`(data)	Create the 3D array necessary to store the orthogonal distances.
`imbalance_mapping`(data[, type])	Imbalance mapping function. For each column in the original data, compute covariance
`_by_person`(residual_array)	Hidden function computing the mean orthogonal distance by participant. The 3rd dimension
`_by_region`(residual_array)	Hidden function computing the mean orthogonal distance by region across all participants.
`_by_person_by_region`(residual_array)	Hidden function computing the mean orthogonal distance by region for each participant.
`imbalance_stats`(data, residual_array)	Function computing the imbalance measures described in Nadig et al. (2021). We additionally
`export`(data, residual_array, output_path, ...[, all])	Function to export the results of the imbalance mapping to files. If requested by the user

sihnpy.imbalance_mapping.odregression_single(index, x, y)[source]

Function computing orthogonal regression. It was developed as an adaptation of the Pracma library in R. Results were tested against Pracma and SciPy’s ODR implementation.

The ODR implementation uses Singular Value Decomposition (SVD) to find the model values. Note that sihnpy’s imbalance mapping really only requires the orthogonal distances from the models. As such, there isn’t a lot of focus on the other model measures in this implementation.

Also note that the function, as it is a direct translation from Pracma, can accept multiple independent variables. However, this was not tested, as the primary goal of the implementation was to use ODR for the covariance between 2 brain regions. Use at your own risk!

For more customization options, I recommend using SciPy’s version which includes a lot more options: https://docs.scipy.org/doc/scipy/reference/odr.html#module-scipy.odr

I’m far from a mathematician; I have adapted the code exactly, but I don’t always fully understand what Pracma’s developer did in some instances. As such, some steps are not always clear in terms of what they do. For more info, refer to Pracma’s docs: https://rdrr.io/cran/pracma/man/odregress.html

Parameters

index (pandas.DataFrame.index) – Expects the index of the dataframe containing the IDs to output with the data. A numpy.ndarray or list of index values can also be accepted, but the final column will not have a title.
x (numpy.ndarray) – Expects a numpy array containing the values of the predictor. I recommend passing a pandas.Series through its .values attribute for this argument.
y (numpy.ndarray) – Expects a numpy array containing the values of the outcome. I recommend passing a pandas.Series through its .values attribute for this argument.

Returns

Returns a dataframe with individual-level model-computed measures and a dataframe with model variables (slope, intercept, total sum of squares).

Return type

pandas.DataFrame
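
As a rough illustration of the SVD approach described above, here is a minimal sketch for a single predictor. It is not sihnpy's actual code: the name `orthogonal_fit`, the returned tuple, and the choice to orient the normal so that positive distances lie above the line are all assumptions made for this example.

```python
import numpy as np


def orthogonal_fit(x, y):
    """Hypothetical helper: fit y = slope * x + intercept by orthogonal regression."""
    Z = np.column_stack([x, y]).astype(float)
    centred = Z - Z.mean(axis=0)
    # Rows of vt are the right singular vectors. The last one belongs to the
    # smallest singular value and is the unit normal of the fitted line.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    normal = vt[-1]
    # The sign of a singular vector is arbitrary. Assumption for this sketch:
    # orient the normal so that points above the line get a positive distance.
    if normal[1] < 0:
        normal = -normal
    slope = -normal[0] / normal[1]
    # The fitted line passes through the centroid of the data.
    intercept = Z[:, 1].mean() - slope * Z[:, 0].mean()
    # Signed orthogonal distance of each observation to the line.
    signed_distance = centred @ normal
    return slope, intercept, signed_distance


x = np.array([0.0, 1.0, 2.0, 3.0])
slope, intercept, dist = orthogonal_fit(x, 2 * x + 1)
```

Taking `np.abs(signed_distance)` gives the unsigned distances. The sketch does not handle a perfectly vertical fit, where the second component of the normal is zero.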

sihnpy.imbalance_mapping._pre_mapping(data)[source]

Create the 3D array necessary to store the orthogonal distances.

Parameters: data (pandas.DataFrame) – Dataframe containing the data to input to imbalance mapping.
Returns: A 3D numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants.
Return type: numpy.ndarray
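
A minimal sketch of how such an array could be allocated from a participants-by-regions dataframe. The name `make_empty_array` is hypothetical, and filling with zeros is an assumption; the documentation above does not say what the initial values are.

```python
import numpy as np
import pandas as pd


def make_empty_array(data):
    """Hypothetical helper: allocate an A x A x P array for a P x A dataframe."""
    n_participants, n_regions = data.shape
    # Assumption: zeros as placeholder values until the distances are filled in.
    return np.zeros((n_regions, n_regions, n_participants))


example = pd.DataFrame(np.zeros((4, 3)), columns=["r1", "r2", "r3"])
empty = make_empty_array(example)  # shape (3, 3, 4)
```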

sihnpy.imbalance_mapping.imbalance_mapping(data, type='sign')[source]

Imbalance mapping function. For each column in the original data, compute covariance (i.e., regression) using orthogonal distance regression. Orthogonal distance is computed for each participant individually for each regression. We store the orthogonal distance for each participant, for each pair of brain regions, in a 3D symmetric matrix.

The script gives the option of using either the “absolute” or the “signed” distances. The sign adds the distinction of showing whether a participant is below or above the regression line.

Parameters

data (pandas.DataFrame) – Dataframe where the index is the participant IDs and the columns are the brain data.
type (str, optional) – Type of orthogonal distance to compute (signed or absolute), by default ‘sign’.

Returns

Returns a numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.

Return type

numpy.ndarray
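
The procedure described above (one orthogonal regression per pair of regions, one distance per participant) can be sketched as follows. This is an illustration under assumptions, not sihnpy's implementation: the function names, the `signed` flag, the sign convention, and mirroring each pair into both halves of the array are choices made for this example.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def orthogonal_distances(x, y):
    """Hypothetical helper: signed orthogonal distance of each point to the fitted line."""
    centred = np.column_stack([x, y]).astype(float)
    centred -= centred.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    normal = vt[-1]
    # Assumption: positive distances are above the regression line.
    if normal[1] < 0:
        normal = -normal
    return centred @ normal


def pairwise_distance_array(data, signed=True):
    """Hypothetical helper: fill an A x A x P array for a P x A dataframe."""
    n_participants, n_regions = data.shape
    out = np.zeros((n_regions, n_regions, n_participants))
    values = data.to_numpy(dtype=float)
    for i, j in combinations(range(n_regions), 2):
        dist = orthogonal_distances(values[:, i], values[:, j])
        if not signed:
            dist = np.abs(dist)
        # Assumption: store the same values in both halves so that each
        # participant's A x A matrix is symmetric.
        out[i, j, :] = dist
        out[j, i, :] = dist
    return out


rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(5, 3)), columns=["r1", "r2", "r3"])
signed_array = pairwise_distance_array(demo, signed=True)
absolute_array = pairwise_distance_array(demo, signed=False)
```

In this sketch the diagonal stays at zero because a region is never regressed on itself.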

sihnpy.imbalance_mapping._by_person(residual_array)[source]

Hidden function computing the mean orthogonal distance by participant. The 3rd dimension of the residual_array numpy.ndarray indexes the participants, so each slice along it is the matrix of one participant. We extract the upper triangle of each matrix and compute the mean. This is equivalent to Figure 1D in Nadig et al. (2021).

Parameters: residual_array (numpy.ndarray) – numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.
Returns: Returns a list of size P, where P is the number of participants. Each list element is the mean imbalance in 1 individual.
Return type: list
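
The upper-triangle average described above can be sketched with numpy as below. The name `mean_by_person` is hypothetical, and excluding the diagonal from the upper triangle (`k=1`) is an assumption of this sketch.

```python
import numpy as np


def mean_by_person(residual_array):
    """Hypothetical helper: mean of the upper triangle of each participant's matrix."""
    n_regions = residual_array.shape[0]
    # Indices of the upper triangle, diagonal excluded (assumption).
    rows, cols = np.triu_indices(n_regions, k=1)
    # Indexing yields an array of shape (n_pairs, P); average over the pairs.
    return residual_array[rows, cols, :].mean(axis=0).tolist()


arr = np.zeros((3, 3, 2))
arr[:, :, 1] = 4.0
arr[0, 1, 0], arr[0, 2, 0], arr[1, 2, 0] = 1.0, 2.0, 3.0
per_person = mean_by_person(arr)
```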

sihnpy.imbalance_mapping._by_region(residual_array)[source]

Hidden function computing the mean orthogonal distance by region across all participants. In numpy terms, this is equivalent to averaging over the first and third dimensions of the array. The diagonals of the matrices along the third dimension are perfect correlations (i.e., correlation of region R to region R), so we remove them first. The result is equivalent to Figure 1E in Nadig et al. (2021).

Parameters: residual_array (numpy.ndarray) – numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.
Returns: Returns a numpy array of size Ax1, where A is the number of regions.
Return type: np.ndarray
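
One way to express this averaging in numpy is to blank out the diagonal with NaN and then take a NaN-aware mean. This is a sketch under assumptions: the name `mean_by_region` is hypothetical, and the NaN masking is one possible way of removing the diagonal, not necessarily the one sihnpy uses.

```python
import numpy as np


def mean_by_region(residual_array):
    """Hypothetical helper: mean distance per region across partners and participants."""
    n_regions = residual_array.shape[0]
    masked = residual_array.astype(float)  # astype returns a copy
    # Drop the region-with-itself entries for every participant.
    masked[np.eye(n_regions, dtype=bool)] = np.nan
    # Average over the first and third dimensions, ignoring the NaN diagonal.
    means = np.nanmean(masked, axis=(0, 2))
    # Reshape to A x 1 to match the documented size.
    return means.reshape(-1, 1)


arr = np.full((3, 3, 2), 99.0)
off_diagonal = ~np.eye(3, dtype=bool)
arr[off_diagonal, 0] = 1.0
arr[off_diagonal, 1] = 3.0
per_region = mean_by_region(arr)
```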

sihnpy.imbalance_mapping._by_person_by_region(residual_array)[source]

Hidden function computing the mean orthogonal distance by region for each participant. In numpy terms, this is equivalent to averaging over the first dimension of the array. The diagonals of the matrices along the third dimension are perfect correlations (i.e., correlation of region R to region R), so we remove them first.

Parameters: residual_array (numpy.ndarray) – numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.
Returns: Returns a numpy array of size PxA, where P is the number of participants and A is the number of regions.
Return type: np.ndarray
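
A comparable sketch for the per-participant, per-region average. The name `mean_by_person_by_region` is hypothetical, and the NaN masking of the diagonal is again an assumption of this example.

```python
import numpy as np


def mean_by_person_by_region(residual_array):
    """Hypothetical helper: mean distance of each region, separately for each participant."""
    n_regions = residual_array.shape[0]
    masked = residual_array.astype(float)  # astype returns a copy
    masked[np.eye(n_regions, dtype=bool)] = np.nan
    # Averaging over the first dimension leaves an A x P array;
    # transpose it to the documented P x A layout.
    return np.nanmean(masked, axis=0).T


arr = np.full((3, 3, 2), 99.0)
off_diagonal = ~np.eye(3, dtype=bool)
arr[off_diagonal, 0] = 1.0
arr[off_diagonal, 1] = 3.0
per_person_region = mean_by_person_by_region(arr)
```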

sihnpy.imbalance_mapping.imbalance_stats(data, residual_array)[source]

Function computing the imbalance measures described in Nadig et al. (2021). We additionally compute an average imbalance by region by person.

The function outputs three dataframes.

Parameters

data (pandas.DataFrame) – Dataframe containing the data used for computing the imbalance. We use it to extract the index and use that for the new dataframes.
residual_array (np.ndarray) – numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.

Returns

Returns three dataframes containing the measures computed for the imbalance analysis.

Return type

pandas.DataFrame
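
To show how the three measures could be labelled with the participant IDs and region names from the original data, here is a minimal sketch. The function name `summarise`, the column name `avg_imbalance`, and the order of the returned dataframes are assumptions; they are not taken from sihnpy.

```python
import numpy as np
import pandas as pd


def summarise(data, residual_array):
    """Hypothetical helper: wrap the three imbalance averages in labelled dataframes."""
    n_regions = residual_array.shape[0]
    off_diag = residual_array.astype(float)  # astype returns a copy
    off_diag[np.eye(n_regions, dtype=bool)] = np.nan
    rows, cols = np.triu_indices(n_regions, k=1)

    # One value per participant: mean of the upper triangle.
    by_person = pd.DataFrame(
        {"avg_imbalance": residual_array[rows, cols, :].mean(axis=0)},
        index=data.index,
    )
    # One value per region: mean over partner regions and participants.
    by_region = pd.DataFrame(
        {"avg_imbalance": np.nanmean(off_diag, axis=(0, 2))},
        index=data.columns,
    )
    # One value per participant and region.
    by_person_by_region = pd.DataFrame(
        np.nanmean(off_diag, axis=0).T, index=data.index, columns=data.columns
    )
    return by_region, by_person, by_person_by_region


data = pd.DataFrame(np.ones((2, 3)), index=["p1", "p2"], columns=["r1", "r2", "r3"])
by_region, by_person, by_person_by_region = summarise(data, np.ones((3, 3, 2)))
```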

sihnpy.imbalance_mapping.export(data, residual_array, output_path, avg_imb_by_region, avg_imb_by_person, avg_imb_by_pers_by_region, name, all=False)[source]

Function to export the results of the imbalance mapping to files. If requested by the user, sihnpy will also output the individual orthogonal distance matrices.

Parameters

data (pandas.DataFrame) – Original dataframe containing the data to use with the imbalance mapping.
residual_array (numpy.ndarray) – numpy.ndarray of size AxAxP, where A is the number of regions and P is the number of participants. The array is filled with the orthogonal distances resulting from the orthogonal regression.
output_path (str) – Local path where the data should be output.
avg_imb_by_region (pandas.DataFrame) – Dataframe containing the average imbalance by region
avg_imb_by_person (pandas.DataFrame) – Dataframe containing the average imbalance by person
avg_imb_by_pers_by_region (pandas.DataFrame) – Dataframe containing the average imbalance by region, by person
name (str) – String to add as a suffix for all the output (user’s choice)
all (bool, optional) – Whether the user wants to output all individual participants’ matrices, by default False
