- Movie recommender system
- Recommender System in Python 101
- Recommender System using Amazon Reviews
- Matrix factorization using the surprise library
- Links to real-time training + prediction models
[1] Note that TfidfVectorizer in sklearn.feature_extraction.text produces L2-normalized vectors by default (norm='l2'), in which case cosine_similarity gives the same result as linear_kernel, only more slowly.
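from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(X)
# On L2-normalized tf-idf rows, the dot product is the cosine similarity.
cosine_similarities = linear_kernel(tfidf_matrix[target_idx], tfidf_matrix).flatten()
related_docs_indices = cosine_similarities.argsort()[:-5:-1] # Reverse the ascending sort and keep the 4 highest scores (the slice stops before index -5); the first is normally the target document itself.
cosine_similarities[related_docs_indices]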
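The snippet above leaves X and target_idx undefined, so here is a minimal end-to-end sketch of the same idea. The toy corpus, the top_n value, and the step that drops the target document from its own results are assumptions added for illustration; they are not part of the original snippet.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "a cat and a dog sat together",
    "stock markets fell sharply today",
    "the dog chased the cat around the mat",
    "investors worry as markets fall",
]
target_idx = 0  # document to find neighbours for (assumed)
top_n = 3       # number of related documents to return (assumed)

# TfidfVectorizer L2-normalizes each row by default (norm='l2').
tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(docs)

# Slicing (rather than integer indexing) keeps the target row 2-D.
target_row = tfidf_matrix[target_idx:target_idx + 1]

# On normalized rows the plain dot product is the cosine similarity.
scores = linear_kernel(target_row, tfidf_matrix).flatten()
same_as_cosine = np.allclose(
    scores, cosine_similarity(target_row, tfidf_matrix).flatten()
)

# Sort descending, then drop the target itself (it scores about 1.0 against itself).
ranked = scores.argsort()[::-1]
related = [int(i) for i in ranked if i != target_idx][:top_n]

for i in related:
    print(f"{scores[i]:.3f}  {docs[i]}")
```

Filtering out target_idx explicitly avoids relying on the target always sorting first, which can fail when the corpus contains exact duplicates of it.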