Visualizing goodness of fit
The chi-square goodness of fit test compares proportions of each level of a categorical variable to hypothesized values. Before running such a test, it can be helpful to visually compare the distribution in the sample to the hypothesized distribution.
Recall the vendor incoterms in the late_shipments dataset. You hypothesize that the four values occur with these frequencies in the population of shipments:
CIP: 0.05
DDP: 0.1
EXW: 0.75
FCA: 0.1
These frequencies are stored in the hypothesized DataFrame.
The incoterm_counts DataFrame stores the .value_counts() of the vendor_inco_term column.
late_shipments is available; pandas and matplotlib.pyplot are loaded with their standard aliases.
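To make the setup concrete, here is a minimal sketch of how two such DataFrames might be built. The toy late_shipments rows are made up for illustration, and the column names n and prop are assumptions, since the exercise text does not name the columns.

```python
import pandas as pd

# Made-up stand-in for late_shipments; the real dataset is larger
# and has many more columns.
late_shipments = pd.DataFrame(
    {"vendor_inco_term": ["EXW"] * 14 + ["FCA"] * 3 + ["DDP"] * 2 + ["CIP"] * 1}
)

# Observed counts per incoterm as a tidy two-column DataFrame.
# The column name "n" is an assumption, not taken from the exercise text.
incoterm_counts = (
    late_shipments["vendor_inco_term"]
    .value_counts()
    .rename_axis("vendor_inco_term")
    .reset_index(name="n")
    .sort_values("vendor_inco_term")
    .reset_index(drop=True)
)

# Hypothesized proportions as stated in the exercise.
# The column name "prop" is likewise an assumption.
hypothesized = pd.DataFrame(
    {
        "vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"],
        "prop": [0.05, 0.1, 0.75, 0.1],
    }
)

print(incoterm_counts)
print(hypothesized)
```

Sorting both DataFrames by incoterm keeps the categories in the same order, which makes a later side-by-side comparison easier to read.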
This exercise is part of the course Hypothesis Testing in Python.
Have a go at this exercise by completing this sample code.
# Find the number of rows in late_shipments
n_total = ____
# Print n_total
print(n_total)
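
The total row count matters because the hypothesized proportions have to be scaled to counts before they can be drawn on the same axis as the observed counts. The sketch below shows one possible way to do that comparison. It uses made-up counts, assumes the column names n and prop, and picks its own colors and labels; with the real data, n_total would be the number of rows in late_shipments (for example, len(late_shipments)).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up observed counts; column names "n" and "prop" are assumptions.
incoterm_counts = pd.DataFrame(
    {"vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"], "n": [1, 2, 14, 3]}
)
hypothesized = pd.DataFrame(
    {"vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"], "prop": [0.05, 0.1, 0.75, 0.1]}
)

# Stand-in for the row count of late_shipments in this toy example.
n_total = int(incoterm_counts["n"].sum())

# Scale hypothesized proportions to expected counts for this sample size.
hypothesized["n"] = hypothesized["prop"] * n_total

# Overlay observed and hypothesized counts as semi-transparent bars.
fig, ax = plt.subplots()
ax.bar(incoterm_counts["vendor_inco_term"], incoterm_counts["n"],
       color="red", label="Observed")
ax.bar(hypothesized["vendor_inco_term"], hypothesized["n"],
       color="blue", alpha=0.5, label="Hypothesized")
ax.set_xlabel("vendor_inco_term")
ax.set_ylabel("Number of shipments")
ax.legend()
# In an interactive session, plt.show() would display the figure.
```

Bars that overlap almost completely suggest the sample is close to the hypothesized distribution; large visible gaps hint that a chi-square goodness of fit test may reject it.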