Merge branch 'master' of https://github.com/PreferredAI/cornac into next-item
PreferredAI · Dec 18, 2023 · a4bccf3
2 parents 5b45e82 + f0e36cd
commit a4bccf3
Showing 28 changed files with 697 additions and 88 deletions.
diff --git a/.github/draft-config.yml b/.github/draft-config.yml
@@ -0,0 +1,62 @@
+name-template: 'Cornac $RESOLVED_VERSION'
+tag-template: 'v$RESOLVED_VERSION'
+autolabeler:
+  - label: 'docs'
+    files:
+      - '*.md'
+    branch:
+      - '/docs{0,1}\/.+/'
+  - label: 'models'
+    files:
+      - '/cornac/models/*.py'
+      - '/cornac/models/**/*.py'
+  - label: 'datasets'
+    files:
+      - '/cornac/datasets/*.py'
+
+template: |
+  # What's Changed
+
+  $CHANGES
+
+  **Full Changelog**: https://github.com/$OWNER/$REPOSITORY/compare/$PREVIOUS_TAG...v$RESOLVED_VERSION
+
+categories:
+  - title: 'Breaking'
+    label: 'type: breaking'
+  - title: 'Models'
+    label: 'type: models'
+  - title: 'Datasets'
+    label: 'type: datasets'
+  - title: 'New'
+    label: 'type: feature'
+  - title: 'Bug Fixes'
+    label: 'type: bug'
+  - title: 'Maintenance'
+    label: 'type: maintenance'
+  - title: 'Documentation'
+    label: 'type: docs'
+  - title: 'Other changes'
+  - title: 'Dependency Updates'
+    label: 'type: dependencies'
+    collapse-after: 5
+
+version-resolver:
+  major:
+    labels:
+      - 'type: breaking'
+  minor:
+    labels:
+      - 'type: feature'
+  patch:
+    labels:
+      - 'type: bug'
+      - 'type: maintenance'
+      - 'type: docs'
+      - 'type: dependencies'
+      - 'type: security'
+      - 'type: models'
+      - 'type: datasets'
+
+exclude-labels:
+  - 'skip-changelog'
diff --git a/.github/workflows/release-drafter.yml b/.github/workflows/release-drafter.yml
@@ -2,18 +2,25 @@ name: Release Drafter
 
 on:
   push:
-    tags:        
-      - '*'
+    branches:
+      - master
+  pull_request:
+    types: [opened, reopened, synchronize]
+
+permissions:
+  contents: read
 
 jobs:
   update_release_draft:
+    permissions:
+      contents: write
+      pull-requests: write
     runs-on: ubuntu-latest
     steps:
       - name: Draft release
         uses: release-drafter/[email protected]
         id: release_drafter
         with:
-          config-name: workflows/release-drafter.yml
-          disable-autolabeler: true
+          config-name: draft-config.yml
         env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@
 ### Quick Links
 
 [Website](https://cornac.preferred.ai/) |
-[Documentation](https://cornac.readthedocs.io/en/latest/index.html) |
+[Documentation](https://cornac.readthedocs.io/en/stable/index.html) |
 [Tutorials](tutorials#tutorials) |
 [Examples](https://github.com/PreferredAI/cornac/tree/master/examples#cornac-examples-directory) |
 [Models](#models) |
@@ -19,7 +19,7 @@
 [![CircleCI](https://img.shields.io/circleci/project/github/PreferredAI/cornac/master.svg?logo=circleci)](https://circleci.com/gh/PreferredAI/cornac)
 [![AppVeyor](https://ci.appveyor.com/api/projects/status/0yq4td1xg4kkhdwu?svg=true)](https://ci.appveyor.com/project/tqtg/cornac)
 [![Codecov](https://img.shields.io/codecov/c/github/PreferredAI/cornac/master.svg?logo=codecov)](https://codecov.io/gh/PreferredAI/cornac)
-[![Docs](https://img.shields.io/readthedocs/cornac/latest.svg)](https://cornac.readthedocs.io/en/latest)
+[![Docs](https://img.shields.io/readthedocs/cornac/latest.svg)](https://cornac.readthedocs.io/en/stable)
 <br />
 [![Release](https://img.shields.io/github/release-pre/PreferredAI/cornac.svg)](https://github.com/PreferredAI/cornac/releases)
 [![PyPI](https://img.shields.io/pypi/v/cornac.svg)](https://pypi.org/project/cornac/)
@@ -126,18 +126,18 @@ $ curl -X GET "http://localhost:8080/recommend?uid=63&k=5&remove_seen=false"
 
 # Response: {"recommendations": ["50", "181", "100", "258", "286"], "query": {"uid": "63", "k": 5, "remove_seen": false}}
 ```
-If we want to remove seen items during training, we need to provide `TRAIN_SET` which has been saved with the model earlier, when starting the serving app. We can also leverage [WSGI](https://flask.palletsprojects.com/en/3.0.x/deploying/) server for model deployment in production. Please refer to [this](https://cornac.readthedocs.io/en/latest/user/iamadeveloper.html#running-an-api-service) guide for more details.
+If we want to remove seen items during training, we need to provide `TRAIN_SET` which has been saved with the model earlier, when starting the serving app. We can also leverage [WSGI](https://flask.palletsprojects.com/en/3.0.x/deploying/) server for model deployment in production. Please refer to [this](https://cornac.readthedocs.io/en/stable/user/iamadeveloper.html#running-an-api-service) guide for more details.
 
 ## Efficient retrieval with ANN search
 
 One important aspect of deploying recommender model is efficient retrieval via Approximate Nearest Neighor (ANN) search in vector space. Cornac integrates several vector similarity search frameworks for the ease of deployment. [This example](tutorials/ann_hnswlib.ipynb) demonstrates how ANN search will work seamlessly with any recommender models supporting it (e.g., MF).
 
 | Supported framework | Cornac wrapper | Examples |
 | :---: | :---: | :---: |
-| [spotify/annoy](https://github.com/spotify/annoy) | [AnnoyANN](cornac/models/ann/recom_ann_annoy.py) | [ann_all.ipynb](examples/ann_all.ipynb)
-| [meta/faiss](https://github.com/facebookresearch/faiss) | [FaissANN](cornac/models/ann/recom_ann_faiss.py) | [ann_all.ipynb](examples/ann_all.ipynb)
-| [nmslib/hnswlib](https://github.com/nmslib/hnswlib) | [HNSWLibANN](cornac/models/ann/recom_ann_hnswlib.py) | [ann_hnswlib.ipynb](tutorials/ann_hnswlib.ipynb), [ann_all.ipynb](examples/ann_all.ipynb)
-| [google/scann](https://github.com/google-research/google-research/tree/master/scann) | [ScaNNANN](cornac/models/ann/recom_ann_scann.py) | [ann_all.ipynb](examples/ann_all.ipynb)
+| [spotify/annoy](https://github.com/spotify/annoy) | [AnnoyANN](cornac/models/ann/recom_ann_annoy.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
+| [meta/faiss](https://github.com/facebookresearch/faiss) | [FaissANN](cornac/models/ann/recom_ann_faiss.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
+| [nmslib/hnswlib](https://github.com/nmslib/hnswlib) | [HNSWLibANN](cornac/models/ann/recom_ann_hnswlib.py) | [ann_example.py](examples/ann_example.py), [ann_hnswlib.ipynb](tutorials/ann_hnswlib.ipynb), [ann_all.ipynb](examples/ann_all.ipynb)
+| [google/scann](https://github.com/google-research/google-research/tree/master/scann) | [ScaNNANN](cornac/models/ann/recom_ann_scann.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
 
 
 ## Models
@@ -152,6 +152,7 @@ The recommender models supported by Cornac are listed below. Why don't you join
 | 2020 | [Adversarial Training Towards Robust Multimedia Recommender System (AMR)](cornac/models/amr), [paper](https://ieeexplore.ieee.org/document/8618394) | [requirements.txt](cornac/models/amr/requirements.txt) | [amr_clothing.py](examples/amr_clothing.py)
 |      | [Hybrid neural recommendation with joint deep representation learning of ratings and reviews (HRDR)](cornac/models/hrdr), [paper](https://www.sciencedirect.com/science/article/abs/pii/S0925231219313207) | [requirements.txt](cornac/models/hrdr/requirements.txt) | [hrdr_example.py](examples/hrdr_example.py)
 |      | [LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation](cornac/models/lightgcn), [paper](https://arxiv.org/pdf/2002.02126.pdf) | [requirements.txt](cornac/models/lightgcn/requirements.txt) | [lightgcn_example.py](examples/lightgcn_example.py)
+|      | [Temporal-Item-Frequency-based User-KNN (TIFUKNN)](cornac/models/tifuknn), [paper](https://arxiv.org/pdf/2006.00556.pdf) | N/A | [tifuknn_tafeng.py](examples/tifuknn_tafeng.py)
 | 2019 | [Embarrassingly Shallow Autoencoders for Sparse Data (EASEᴿ)](cornac/models/ease), [paper](https://arxiv.org/pdf/1905.03375.pdf) | N/A | [ease_movielens.py](examples/ease_movielens.py)
 |      | [Neural Graph Collaborative Filtering (NGCF)](cornac/models/ngcf), [paper](https://arxiv.org/pdf/1905.08108.pdf) | [requirements.txt](cornac/models/ngcf/requirements.txt) | [ngcf_example.py](examples/ngcf_example.py)
 | 2018 | [Collaborative Context Poisson Factorization (C2PF)](cornac/models/c2pf), [paper](https://www.ijcai.org/proceedings/2018/0370.pdf) | N/A | [c2pf_exp.py](examples/c2pf_example.py)
@@ -202,7 +203,7 @@ The recommender models supported by Cornac are listed below. Why don't you join
 
 ## Contributing
 
-This project welcomes contributions and suggestions. Before contributing, please see our [contribution guidelines](https://cornac.readthedocs.io/en/latest/developer/index.html).
+This project welcomes contributions and suggestions. Before contributing, please see our [contribution guidelines](https://cornac.readthedocs.io/en/stable/developer/index.html).
 
 ## Citation
 

diff --git a/cornac/__init__.py b/cornac/__init__.py
@@ -23,4 +23,4 @@
 # Also importable from root
 from .experiment import Experiment
 
-__version__ = '1.17'
+__version__ = '1.18'
diff --git a/cornac/data/dataset.py b/cornac/data/dataset.py
@@ -633,7 +633,7 @@ class BasketDataset(Dataset):
     uir_tuple: tuple, required
         Tuple of 3 numpy arrays (user_indices, item_indices, rating_values).
 
-    basket_ids: numpy.array, required
+    basket_indices: numpy.array, required
         Array of basket indices corresponding to observation in `uir_tuple`.
 
     timestamps: numpy.array, optional, default: None
@@ -665,7 +665,7 @@ def __init__(
         bid_map,
         iid_map,
         uir_tuple,
-        basket_ids=None,
+        basket_indices=None,
         timestamps=None,
         extra_data=None,
         seed=None,
@@ -681,23 +681,31 @@ def __init__(
         )
         self.num_baskets = num_baskets
         self.bid_map = bid_map
-        self.basket_ids = basket_ids
+        self.basket_indices = basket_indices
         self.extra_data = extra_data
-        basket_sizes = list(Counter(basket_ids).values())
+        basket_sizes = list(Counter(basket_indices).values())
         self.max_basket_size = np.max(basket_sizes)
         self.min_basket_size = np.min(basket_sizes)
         self.avg_basket_size = np.mean(basket_sizes)
 
         self.__baskets = None
+        self.__basket_ids = None
         self.__user_basket_data = None
         self.__chrono_user_basket_data = None
 
+    @property
+    def basket_ids(self):
+        """Return the list of raw basket ids"""
+        if self.__basket_ids is None:
+            self.__basket_ids = list(self.bid_map.keys())
+        return self.__basket_ids
+
     @property
     def baskets(self):
         """A dictionary to store indices where basket ID appears in the data."""
         if self.__baskets is None:
             self.__baskets = defaultdict(list)
-            for idx, bid in enumerate(self.basket_ids):
+            for idx, bid in enumerate(self.basket_indices):
                 self.__baskets[bid].append(idx)
         return self.__baskets
 
@@ -822,7 +830,7 @@ def build(
             np.ones(len(u_indices), dtype="float"),
         )
 
-        basket_ids = np.asarray(b_indices, dtype="int")
+        basket_indices = np.asarray(b_indices, dtype="int")
 
         timestamps = (
             np.fromiter((int(data[i][3]) for i in valid_idx), dtype="int") if fmt in ["UBIT", "UBITJson"] else None
@@ -838,7 +846,7 @@ def build(
             bid_map=global_bid_map,
             iid_map=global_iid_map,
             uir_tuple=uir_tuple,
-            basket_ids=basket_ids,
+            basket_indices=basket_indices,
             timestamps=timestamps,
             extra_data=extra_data,
             seed=seed,
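
Note on the rename above: `basket_indices` now holds one internal basket index per observation, while `basket_ids` becomes a lazily cached list of the raw ids taken from `bid_map`. The following is a minimal sketch of that split, not Cornac's actual class. `BasketIndexSketch`, the sample ids, and the single-underscore cache attributes are assumptions made for illustration.

```python
from collections import OrderedDict, defaultdict


class BasketIndexSketch:
    """Hypothetical stand-in for the dataset class; not Cornac's API."""

    def __init__(self, bid_map, basket_indices):
        self.bid_map = bid_map                # raw basket id -> internal index
        self.basket_indices = basket_indices  # one internal index per observation
        self._basket_ids = None               # cache, filled on first access
        self._baskets = None                  # cache, filled on first access

    @property
    def basket_ids(self):
        """Raw basket ids, computed once from the id map."""
        if self._basket_ids is None:
            self._basket_ids = list(self.bid_map.keys())
        return self._basket_ids

    @property
    def baskets(self):
        """Internal basket index -> positions of its observations."""
        if self._baskets is None:
            self._baskets = defaultdict(list)
            for idx, bid in enumerate(self.basket_indices):
                self._baskets[bid].append(idx)
        return self._baskets


# Made-up data: two baskets, three observations.
sketch = BasketIndexSketch(
    bid_map=OrderedDict([("b-100", 0), ("b-200", 1)]),
    basket_indices=[0, 0, 1],
)
print(sketch.basket_ids)
print(dict(sketch.baskets))
```

Callers that previously read `basket_ids` to get per-observation indices would need to switch to `basket_indices`, as the `next_basket_evaluation.py` hunk does.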

diff --git a/cornac/eval_methods/base_method.py b/cornac/eval_methods/base_method.py
@@ -157,6 +157,8 @@ def ranking_eval(
     if len(metrics) == 0:
         return [], []
 
+    max_k = max(m.k for m in metrics)
+
     avg_results = []
     user_results = [{} for _ in enumerate(metrics)]
 
@@ -203,7 +205,9 @@ def pos_items(csr_row):
         u_gt_pos_items = np.nonzero(u_gt_pos_mask)[0]
         u_gt_neg_items = np.nonzero(u_gt_neg_mask)[0]
 
-        item_rank, item_scores = model.rank(user_idx, item_indices)
+        item_rank, item_scores = model.rank(
+            user_idx=user_idx, item_indices=item_indices, k=max_k
+        )
 
         for i, mt in enumerate(metrics):
             mt_score = mt.compute(
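
Note on the `max_k` change above: the ranking call now receives the largest cutoff among the metrics, so a single ranked list can serve every metric. A rough sketch of the idea follows, assuming each metric exposes a positive integer `k`. `rank_top_k` and the `SimpleNamespace` metric objects are hypothetical helpers, not Cornac's `model.rank` or metric classes.

```python
from types import SimpleNamespace

import numpy as np


def rank_top_k(scores, k=-1):
    """Hypothetical ranker: item indices by descending score, cut to k if k > 0."""
    order = np.argsort(-scores, kind="stable")
    return order[:k] if k > 0 else order


# Stand-ins for ranking metrics; only the `k` attribute matters here.
metrics = [
    SimpleNamespace(name="Recall@2", k=2),
    SimpleNamespace(name="NDCG@3", k=3),
]
max_k = max(m.k for m in metrics)

scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2])
item_rank = rank_top_k(scores, k=max_k)

# Each metric reads its own prefix of the shared ranked list.
top_per_metric = {m.name: item_rank[: m.k].tolist() for m in metrics}
print(max_k, top_per_metric)
```

The benefit is that a model able to truncate its ranking (for example, one backed by ANN search) only has to retrieve `max_k` items per user instead of sorting the full catalogue.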

diff --git a/cornac/eval_methods/next_basket_evaluation.py b/cornac/eval_methods/next_basket_evaluation.py
@@ -128,10 +128,10 @@ def get_gt_items(train_set, test_set, test_pos_items, exclude_unknowns):
             user_idx,
             item_indices,
             history_baskets=history_baskets,
-            history_basket_ids=bids[:-1],
+            history_bids=bids[:-1],
             uir_tuple=test_set.uir_tuple,
             baskets=test_set.baskets,
-            basket_ids=test_set.basket_ids,
+            basket_indices=test_set.basket_indices,
             extra_data=test_set.extra_data,
         )
 

diff --git a/cornac/exception.py b/cornac/exception.py
@@ -13,17 +13,14 @@
 # limitations under the License.
 # ============================================================================
 
-class CornacException(Exception):
-    """Exception base class to extend from
 
-    """
+class CornacException(Exception):
+    """Exception base class to extend from"""
 
     pass
 
 
 class ScoreException(CornacException):
-    """Exception raised in score function when facing unknowns
+    """Exception raised in score function when facing unknowns"""
 
-    """
-
-    pass
+    pass
diff --git a/cornac/experiment/experiment.py b/cornac/experiment/experiment.py
@@ -150,7 +150,7 @@ def run(self):
             if self.val_result is not None:
                 self.val_result.append(val_result)
 
-            if not isinstance(self.result, CVExperimentResult):
+            if self.save_dir and (not isinstance(self.result, CVExperimentResult)):
                 model.save(self.save_dir)
 
         output = ""

diff --git a/cornac/models/__init__.py b/cornac/models/__init__.py
@@ -69,6 +69,7 @@
 from .sorec import SoRec
 from .spop import SPop
 from .svd import SVD
+from .tifuknn import TIFUKNN
 from .trirank import TriRank
 from .vaecf import VAECF
 from .vbpr import VBPR

diff --git a/cornac/models/ann/recom_ann_annoy.py b/cornac/models/ann/recom_ann_annoy.py
@@ -69,7 +69,6 @@ def __init__(
     ):
         super().__init__(model=model, name=name, verbose=verbose)
 
-        self.model = model
         self.n_trees = n_trees
         self.search_k = search_k
         self.num_threads = num_threads
@@ -85,14 +84,18 @@ def __init__(
 
     def build_index(self):
         """Building index from the base recommender model."""
+        super().build_index()
+
         from annoy import AnnoyIndex
 
         assert self.measure in SUPPORTED_MEASURES
 
         self.index = AnnoyIndex(
             self.item_vectors.shape[1], SUPPORTED_MEASURES[self.measure]
         )
-        self.index.set_seed(self.seed)
+
+        if self.seed is not None:
+            self.index.set_seed(self.seed)
 
         for i, v in enumerate(self.item_vectors):
             self.index.add_item(i, v)
@@ -115,6 +118,11 @@ def knn_query(self, query, k):
         ]
         neighbors = np.array([r[0] for r in result], dtype="int")
         distances = np.array([r[1] for r in result], dtype="float32")
+
+        # make sure distances respect the notion of nearest neighbors (smaller is better)
+        if self.higher_is_better:
+            distances = 1.0 - distances
+
         return neighbors, distances
 
     def save(self, save_dir=None):
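
Note on the `knn_query` change above: when `higher_is_better` is set, the returned values are flipped with `1.0 - distances` so that smaller always means nearer. The snippet below is a minimal NumPy-only sketch of that flip. `to_smaller_is_better` and the sample values are assumptions for illustration and do not call Annoy.

```python
import numpy as np


def to_smaller_is_better(values, higher_is_better):
    """Flip similarity-like values so that a smaller number means a closer neighbor."""
    values = np.asarray(values, dtype="float32")
    return 1.0 - values if higher_is_better else values


# Made-up similarity-like values, best neighbor first.
sims = np.array([0.9, 0.5, 0.1], dtype="float32")
dists = to_smaller_is_better(sims, higher_is_better=True)

# After the flip, sorting in ascending order keeps the best neighbor first.
print(dists, np.argsort(dists))
```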