
Creating the TF-IDF DataFrame

Now that you have generated your TF-IDF features, you need to get them into a format you can use to make recommendations. You will once again use pandas for this, wrapping the array in a DataFrame. Because you will filter the data by movie title, assign the titles to the DataFrame's index.

The df_plots DataFrame has once again been loaded for you. It contains movies' names in the Title column and their plots in the Plot column.
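To illustrate why a title index is convenient, here is a minimal sketch that uses pandas only. The score matrix, the term names, and the movie titles are all made up for illustration and merely stand in for real TF-IDF output.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 score matrix standing in for TF-IDF output
# (rows are movies, columns are terms; the values are invented)
scores = np.array([
    [0.0, 0.7, 0.7, 0.0],
    [0.6, 0.0, 0.0, 0.8],
    [0.5, 0.5, 0.5, 0.5],
])
terms = ["alien", "heist", "robbery", "space"]  # hypothetical feature names
titles = ["Movie A", "Movie B", "Movie C"]      # hypothetical titles

# Wrap the array in a DataFrame, labeling columns by term and rows by title
toy_df = pd.DataFrame(scores, columns=terms, index=titles)

# With titles on the index, one movie's feature vector can be selected by name
movie_b = toy_df.loc["Movie B"]
print(movie_b)
```

Selecting rows with .loc and a title avoids having to track each movie's row position separately.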

This exercise is part of the course

Building Recommendation Engines in Python


Exercise instructions

  • Create a TfidfVectorizer and fit and transform it as you did in the previous exercise.
  • Wrap the generated vectorized_data in a DataFrame. Use the names of the features generated during the fit and transform phase as its column names and assign your new DataFrame to tfidf_df.
  • Assign the original movie titles to the index of the newly created tfidf_df DataFrame.

Hands-on interactive exercise

Try this exercise by completing the sample code.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Instantiate the vectorizer object and transform the plot column
vectorizer = ____(max_df=0.7, min_df=2)
vectorized_data = vectorizer.____(df_plots['Plot']) 

# Create a DataFrame from the TF-IDF array
tfidf_df = pd.____(____.toarray(), columns=vectorizer.____())

# Assign the movie titles to the index and inspect
tfidf_df.____ = ____['Title']
print(tfidf_df.head())