Config

Configuration parameters and synthetic dataset creation.

generate_synthetic_data

 generate_synthetic_data (n:int=1000)

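Generates a sample DataFrame of participant-level data: a research-stage date, age, sex, and two value columns.

Args: n: The number of rows in the generated DataFrame.

Returns: A pandas DataFrame indexed by participant_id, with columns date_of_research_stage, age_at_research_stage, sex, val1, and val2.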
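To make the shape of the returned data concrete, here is a minimal sketch of a generator that produces a frame with the same layout. The function name make_synthetic_data_sketch, the seed parameter, the date window, and every distribution are assumptions made for illustration; the actual generate_synthetic_data may be implemented quite differently.

```python
import numpy as np
import pandas as pd


def make_synthetic_data_sketch(n: int = 1000, seed: int = 0) -> pd.DataFrame:
    """Hypothetical stand-in for generate_synthetic_data (all distributions are guesses)."""
    rng = np.random.default_rng(seed)
    # Random dates inside an assumed three-year window starting 2020-01-01.
    offsets = pd.to_timedelta(rng.integers(0, 3 * 365, size=n), unit="D")
    dates = pd.Timestamp("2020-01-01") + offsets
    age = rng.uniform(35, 75, size=n)
    sex = rng.integers(0, 2, size=n)  # coded 0/1
    # Assumed relationships: val1 varies with age and sex, val2 tracks val1.
    val1 = 100 + 0.3 * age + 10 * sex + rng.normal(0, 20, size=n)
    val2 = 30 + 0.3 * val1 + rng.normal(0, 3, size=n)
    return pd.DataFrame(
        {
            "date_of_research_stage": dates,
            "age_at_research_stage": age,
            "sex": sex,
            "val1": val1,
            "val2": val2,
        },
        index=pd.RangeIndex(n, name="participant_id"),
    )
```

Calling make_synthetic_data_sketch(n=5) returns five rows indexed by participant_id; passing the same seed reproduces the same rows.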


generate_synthetic_data_like

 generate_synthetic_data_like (df:pandas.core.frame.DataFrame, n:int=1000,
                               random_seed:int=42)

Generates a sample DataFrame containing the same columns as df, but with random data.

Args:

df: The DataFrame whose columns should be used.
n: The number of rows in the generated DataFrame.
random_seed: The seed for the random number generator (default 42).

Returns: A pandas DataFrame with the same columns as df.
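One plausible way to build such a frame is to resample each column of df independently, which keeps the column names and dtypes while breaking the row-wise links between columns. The sketch below does that; the name synthetic_like_sketch and the choice to draw with replacement (so that n may exceed len(df)) are assumptions, not a description of how generate_synthetic_data_like actually works.

```python
import numpy as np
import pandas as pd


def synthetic_like_sketch(
    df: pd.DataFrame, n: int = 1000, random_seed: int = 42
) -> pd.DataFrame:
    """Hypothetical sketch: resample every column of df independently."""
    rng = np.random.default_rng(random_seed)
    columns = {}
    for name in df.columns:
        # Pick n row positions with replacement, separately for each column.
        positions = rng.integers(0, len(df), size=n)
        columns[name] = df[name].to_numpy()[positions]
    # Fresh 0..n-1 index that keeps the name of the original index.
    return pd.DataFrame(columns, index=pd.RangeIndex(n, name=df.index.name))
```

Because the generator is seeded, two calls with the same df, n, and random_seed return identical frames.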

data = generate_synthetic_data()
data.head()

                date_of_research_stage  age_at_research_stage  sex        val1       val2
participant_id
0                           2021-07-06              69.788555    1  162.270810  82.273910
1                           2021-01-07              36.289947    1  125.240476  71.164810
2                           2022-02-21              61.501970    1  116.044417  68.405992
3                           2020-06-27              46.299262    0   84.440308  56.924759
4                           2021-01-08              70.127055    1  120.693921  69.800843

generate_synthetic_data_like(data.head(), n=5)

                date_of_research_stage  age_at_research_stage  sex        val1       val2
participant_id
0                           2022-02-21              46.299262    1   84.440308  82.273910
1                           2020-06-27              36.289947    1  162.270810  69.800843
2                           2021-01-08              61.501970    1  120.693921  71.164810
3                           2021-01-07              69.788555    1  116.044417  68.405992
4                           2021-07-06              70.127055    0  125.240476  56.924759
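
The output above has the same column names and index name as the input. A small check along those lines might look like the following; same_schema is a hypothetical helper written for this example and is not part of the documented module.

```python
import pandas as pd


def same_schema(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Return True when both frames share column names and order, dtypes, and index name."""
    return (
        list(a.columns) == list(b.columns)
        and list(a.dtypes) == list(b.dtypes)
        and a.index.name == b.index.name
    )
```

With the frames from the example, same_schema(data, generate_synthetic_data_like(data.head(), n=5)) would be expected to hold if the generated frame preserves dtypes, which the displayed output suggests but does not prove.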