Recommendation in E-commerce

TODO

Datasets

TFIDF

[1] Note that TfidfVectorizer in sklearn.feature_extraction.text produces L2-normalized vectors by default (norm='l2'), in which case cosine_similarity is equivalent to linear_kernel, only slower.
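A quick way to check this equivalence is the sketch below. The three-document corpus is made up for illustration; it assumes the default `norm='l2'` setting, under which each row has unit length and the plain dot product already equals the cosine.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy corpus, not taken from any dataset in this repository.
docs = [
    "red cotton t-shirt for men",
    "blue cotton t-shirt for women",
    "stainless steel kitchen knife",
]

# TfidfVectorizer defaults to norm='l2', so every row has unit Euclidean length.
matrix = TfidfVectorizer().fit_transform(docs)
row_norms = np.asarray(np.sqrt(matrix.multiply(matrix).sum(axis=1))).ravel()

# With unit-length rows, the dot product (linear_kernel) equals cosine similarity.
same = np.allclose(linear_kernel(matrix, matrix), cosine_similarity(matrix, matrix))
print(row_norms, same)
```

If the vectorizer is built with `norm=None`, the two functions no longer agree and `cosine_similarity` is the one to use.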

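from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# X is an iterable of raw text documents; target_idx is the row of the query document.
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(X)

# Rows are L2-normalized by default, so the dot product is the cosine similarity.
cosine_similarities = linear_kernel(tfidf_matrix[target_idx], tfidf_matrix).flatten()
# Indices of the 4 highest scores, best first (use [:-6:-1] for 5); the first is normally target_idx itself.
related_docs_indices = cosine_similarities.argsort()[:-5:-1]
cosine_similarities[related_docs_indices]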
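The lookup above can be wrapped in a small helper that also leaves the query document out of its own results. This is only a sketch: `top_k_similar` and the product titles are hypothetical names and data chosen for illustration, and it assumes the TF-IDF rows are L2-normalized (the `TfidfVectorizer` default).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel


def top_k_similar(tfidf_matrix, target_idx, k=5):
    """Return up to k (index, score) pairs most similar to row target_idx."""
    scores = linear_kernel(tfidf_matrix[target_idx], tfidf_matrix).flatten()
    # Exclude the query document, which would otherwise rank first.
    scores[target_idx] = -np.inf
    k = min(k, len(scores) - 1)
    top = np.argsort(scores)[::-1][:k]  # descending by score
    return [(int(i), float(scores[i])) for i in top]


# Hypothetical product titles standing in for a real catalogue.
products = [
    "red cotton t-shirt for men",
    "blue cotton t-shirt for women",
    "stainless steel kitchen knife",
    "ceramic kitchen knife set",
]
tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(products)

results = top_k_similar(tfidf_matrix, target_idx=0, k=2)
for idx, score in results:
    print(idx, products[idx], round(score, 3))
```

Masking the query row with `-inf` before sorting keeps the slice logic simple and avoids off-by-one mistakes when counting how many neighbours are returned.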