Evaluating distribution fit for the ldl variable
In this exercise, you'll focus on one variable of the diabetes dataset dia: the ldl blood serum measurement. You'll determine whether the normal distribution is still a good choice for ldl based on the additional information provided by a Kolmogorov-Smirnov test.
The dia DataFrame has been loaded for you. The following libraries have also been imported: pandas as pd, numpy as np, and scipy.stats as st.
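To make the test itself more concrete, here is a minimal sketch of what the Kolmogorov-Smirnov statistic measures: the largest vertical gap between the empirical CDF of a sample and the CDF of a reference distribution. The sample below is synthetic and the names sample, manual_stat, and result are illustrative assumptions, not part of the exercise.

```python
import numpy as np
import scipy.stats as st

# Synthetic sample (assumption: stands in for any one-dimensional data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=200)

# Sort the sample and evaluate the reference CDF at each point
x = np.sort(sample)
n = len(x)
cdf = st.norm.cdf(x)

# Largest gap above and below the reference CDF across the ECDF steps
d_plus = np.max(np.arange(1, n + 1) / n - cdf)
d_minus = np.max(cdf - np.arange(0, n) / n)
manual_stat = max(d_plus, d_minus)

# The two-sided statistic from scipy should agree with the manual value
result = st.kstest(sample, "norm")
print(manual_stat, result.statistic)
```

A smaller statistic means the empirical CDF stays closer to the reference CDF, which is why it is useful for comparing candidate distributions.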
This exercise is part of the course
Monte Carlo Simulations in Python
Instructions
- Define a list called list_of_dists containing your candidate distributions: Laplace, normal, and exponential (in that order); use the correct names from scipy.stats.
- Inside the loop, fit the data with the corresponding probability distribution, saving as param.
- Perform a Kolmogorov–Smirnov test to evaluate goodness-of-fit, saving the results as result.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# List candidate distributions to evaluate
list_of_dists = [____]
for i in list_of_dists:
    dist = getattr(st, i)
    # Fit the data to the probability distribution
    param = dist.____
    # Perform the ks test to evaluate goodness-of-fit
    result = ____
    print(result)
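The fit-then-test pattern in the template can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: ldl_like is a made-up normal sample standing in for dia["ldl"] (its mean and spread are arbitrary), and candidate_names, results, and best_name are hypothetical names, not the exercise's required ones. The calls used are dist.fit, which returns the fitted parameters as a tuple, and st.kstest, which accepts a distribution name plus those parameters through args.

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-in for dia["ldl"] (assumption: arbitrary mean and spread)
rng = np.random.default_rng(42)
ldl_like = rng.normal(loc=115.0, scale=30.0, size=5000)

# scipy.stats names for the Laplace, normal, and exponential distributions
candidate_names = ["laplace", "norm", "expon"]

results = {}
for name in candidate_names:
    dist = getattr(st, name)
    # Maximum likelihood estimates of the distribution parameters
    params = dist.fit(ldl_like)
    # KS test of the data against the fitted distribution
    results[name] = st.kstest(ldl_like, name, args=params)
    print(name, results[name].statistic, results[name].pvalue)

# The candidate with the smallest KS statistic fits this sample most closely
best_name = min(results, key=lambda k: results[k].statistic)
print("Smallest KS statistic:", best_name)
```

Because the parameters are estimated from the same data being tested, the reported p-values tend to be optimistic, so comparing the statistics across candidates is usually the safer reading.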