Multivariate logistic regression
Generally, you won't use only loan_int_rate to predict the probability of default. You will want to use all the data you have to make predictions.
With this in mind, try training a new model on different columns, called features, from the cr_loan_clean data. Will this model differ from the first one? To find out, you can check the .intercept_ of the logistic regression. Remember that this is the y-intercept of the function: the log-odds of the positive class when every feature equals zero. Assuming loan_status is coded 1 for default, that is the baseline log-odds of default.
The cr_loan_clean data has been loaded in the workspace along with the previous model clf_logistic_single.
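Because the intercept lives on the log-odds scale, it can be easier to read after converting it to a probability with the logistic function. The snippet below is a minimal sketch of that conversion; the intercept value and the helper name log_odds_to_probability are hypothetical and chosen only for illustration, not taken from either model.

```python
import math


def log_odds_to_probability(log_odds: float) -> float:
    """Apply the logistic (sigmoid) function to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical intercept, NOT the value produced by clf_logistic_single.
example_intercept = -2.0

# Probability implied by the intercept alone (all features equal to zero).
baseline_probability = log_odds_to_probability(example_intercept)
print(round(baseline_probability, 4))  # 0.1192
```

A log-odds of 0 corresponds to a probability of 0.5, so a negative intercept implies a baseline probability below one half.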
This exercise is part of the course
Credit Risk Modeling in Python
Exercise instructions
- Create a new X data set with loan_int_rate and person_emp_length. Store it as X_multi.
- Create a y data set with just loan_status.
- Create and .fit() a LogisticRegression() model on the new X data. Store it as clf_logistic_multi.
- Print the .intercept_ value of the model.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create X data for the model
X_multi = ____[[____,____]]
# Create a set of y data for training
y = ____[[____]]
# Create and train a new logistic regression
clf_logistic_multi = ____(solver='lbfgs').____(____, np.ravel(____))
# Print the intercept of the model
print(____.____)
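For readers without access to the course workspace, the same workflow can be sketched on synthetic data. Everything about the toy_loans frame below is an assumption: it only mimics the three column names used in the exercise, the rule that generates loan_status is made up, and max_iter=1000 is added only to give the solver room on unscaled features. The printed numbers will therefore not match those from cr_loan_clean.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for cr_loan_clean (assumption: same column names only).
rng = np.random.default_rng(0)
n_rows = 500
toy_loans = pd.DataFrame({
    "loan_int_rate": rng.uniform(5.0, 20.0, n_rows),
    "person_emp_length": rng.integers(0, 30, n_rows).astype(float),
})

# Made-up rule: higher rates raise the odds of default,
# longer employment lowers them.
true_log_odds = (
    -4.0
    + 0.25 * toy_loans["loan_int_rate"].to_numpy()
    - 0.05 * toy_loans["person_emp_length"].to_numpy()
)
default_probability = 1.0 / (1.0 + np.exp(-true_log_odds))
toy_loans["loan_status"] = rng.binomial(1, default_probability)

# One-feature and two-feature design matrices, plus the target.
X_single = toy_loans[["loan_int_rate"]]
X_multi = toy_loans[["loan_int_rate", "person_emp_length"]]
y = toy_loans[["loan_status"]]

# np.ravel flattens the one-column y frame into the 1-D array fit() expects.
clf_single = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X_single, np.ravel(y))
clf_multi = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X_multi, np.ravel(y))

# Compare the intercepts of the two models, then inspect the new coefficients.
print(clf_single.intercept_)
print(clf_multi.intercept_)
print(clf_multi.coef_)
```

Adding a feature generally shifts the intercept, because the baseline now describes a loan where both loan_int_rate and person_emp_length are zero rather than loan_int_rate alone.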