Merge branch 'master' of https://github.com/PreferredAI/cornac into next-item
lthoang committed Dec 18, 2023
2 parents 5b45e82 + f0e36cd commit a4bccf3
Showing 28 changed files with 697 additions and 88 deletions.
62 changes: 62 additions & 0 deletions .github/draft-config.yml
@@ -0,0 +1,62 @@
name-template: 'Cornac $RESOLVED_VERSION'
tag-template: 'v$RESOLVED_VERSION'
autolabeler:
- label: 'docs'
files:
- '*.md'
branch:
- '/docs{0,1}\/.+/'
- label: 'models'
files:
- '/cornac/models/*.py'
- '/cornac/models/**/*.py'
- label: 'datasets'
files:
- '/cornac/datasets/*.py'

template: |
# What's Changed
$CHANGES
**Full Changelog**: https://github.com/$OWNER/$REPOSITORY/compare/$PREVIOUS_TAG...v$RESOLVED_VERSION
categories:
- title: 'Breaking'
label: 'type: breaking'
- title: 'Models'
label: 'type: models'
- title: 'Datasets'
label: 'type: datasets'
- title: 'New'
label: 'type: feature'
- title: 'Bug Fixes'
label: 'type: bug'
- title: 'Maintenance'
label: 'type: maintenance'
- title: 'Documentation'
label: 'type: docs'
- title: 'Other changes'
- title: 'Dependency Updates'
label: 'type: dependencies'
collapse-after: 5

version-resolver:
major:
labels:
- 'type: breaking'
minor:
labels:
- 'type: feature'
patch:
labels:
- 'type: bug'
- 'type: maintenance'
- 'type: docs'
- 'type: dependencies'
- 'type: security'
- 'type: models'
- 'type: datasets'

exclude-labels:
- 'skip-changelog'
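
The `version-resolver` block above maps pull-request labels to a semantic-version bump. A minimal Python sketch of that mapping follows. The helper names `resolve_bump` and `next_version` are hypothetical, and the fallback to a patch bump when no label matches is an assumption about Release Drafter's default, not something this config states.

```python
# Hypothetical helpers that mirror the version-resolver section of draft-config.yml.
MAJOR_LABELS = {"type: breaking"}
MINOR_LABELS = {"type: feature"}


def resolve_bump(labels):
    """Return the highest-priority bump implied by a collection of PR labels."""
    labels = set(labels)
    if labels & MAJOR_LABELS:
        return "major"
    if labels & MINOR_LABELS:
        return "minor"
    # Assumption: everything else (bug, docs, models, ...) resolves to a patch bump.
    return "patch"


def next_version(previous, bump):
    """Apply a bump to a dotted version string, padding missing parts with 0."""
    parts = [int(p) for p in previous.split(".")]
    parts += [0] * (3 - len(parts))
    major, minor, patch = parts[:3]
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


bump = resolve_bump(["type: bug", "type: feature"])
resolved = next_version("1.17", bump)
```

With one `type: feature` label among the merged pull requests, the sketch resolves a minor bump even if other pull requests carry patch-level labels.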
17 changes: 12 additions & 5 deletions .github/workflows/release-drafter.yml
@@ -2,18 +2,25 @@ name: Release Drafter

on:
push:
tags:
- '*'
branches:
- master
pull_request:
types: [opened, reopened, synchronize]

permissions:
contents: read

jobs:
update_release_draft:
permissions:
contents: write
pull-requests: write
runs-on: ubuntu-latest
steps:
- name: Draft release
uses: release-drafter/[email protected]
id: release_drafter
with:
config-name: workflows/release-drafter.yml
disable-autolabeler: true
config-name: draft-config.yml
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
17 changes: 9 additions & 8 deletions README.md
@@ -7,7 +7,7 @@
### Quick Links

[Website](https://cornac.preferred.ai/) |
[Documentation](https://cornac.readthedocs.io/en/latest/index.html) |
[Documentation](https://cornac.readthedocs.io/en/stable/index.html) |
[Tutorials](tutorials#tutorials) |
[Examples](https://github.com/PreferredAI/cornac/tree/master/examples#cornac-examples-directory) |
[Models](#models) |
@@ -19,7 +19,7 @@
[![CircleCI](https://img.shields.io/circleci/project/github/PreferredAI/cornac/master.svg?logo=circleci)](https://circleci.com/gh/PreferredAI/cornac)
[![AppVeyor](https://ci.appveyor.com/api/projects/status/0yq4td1xg4kkhdwu?svg=true)](https://ci.appveyor.com/project/tqtg/cornac)
[![Codecov](https://img.shields.io/codecov/c/github/PreferredAI/cornac/master.svg?logo=codecov)](https://codecov.io/gh/PreferredAI/cornac)
[![Docs](https://img.shields.io/readthedocs/cornac/latest.svg)](https://cornac.readthedocs.io/en/latest)
[![Docs](https://img.shields.io/readthedocs/cornac/latest.svg)](https://cornac.readthedocs.io/en/stable)
<br />
[![Release](https://img.shields.io/github/release-pre/PreferredAI/cornac.svg)](https://github.com/PreferredAI/cornac/releases)
[![PyPI](https://img.shields.io/pypi/v/cornac.svg)](https://pypi.org/project/cornac/)
@@ -126,18 +126,18 @@ $ curl -X GET "http://localhost:8080/recommend?uid=63&k=5&remove_seen=false"

# Response: {"recommendations": ["50", "181", "100", "258", "286"], "query": {"uid": "63", "k": 5, "remove_seen": false}}
```
If we want to remove items seen during training, we need to provide the `TRAIN_SET`, which was saved with the model earlier, when starting the serving app. We can also use a [WSGI](https://flask.palletsprojects.com/en/3.0.x/deploying/) server for model deployment in production. Please refer to [this](https://cornac.readthedocs.io/en/latest/user/iamadeveloper.html#running-an-api-service) guide for more details.
If we want to remove items seen during training, we need to provide the `TRAIN_SET`, which was saved with the model earlier, when starting the serving app. We can also use a [WSGI](https://flask.palletsprojects.com/en/3.0.x/deploying/) server for model deployment in production. Please refer to [this](https://cornac.readthedocs.io/en/stable/user/iamadeveloper.html#running-an-api-service) guide for more details.
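
The `curl` call above can also be issued from Python. The sketch below only builds the request URL and parses the sample response quoted in the README, so it runs without a live server. The helper name `build_recommend_url` is hypothetical; the endpoint and the `uid`, `k`, and `remove_seen` parameters come from the example above.

```python
import json
from urllib.parse import urlencode


def build_recommend_url(uid, k=5, remove_seen=False, base="http://localhost:8080"):
    """Build the /recommend query URL used in the curl example (hypothetical helper)."""
    query = urlencode({"uid": uid, "k": k, "remove_seen": str(remove_seen).lower()})
    return f"{base}/recommend?{query}"


url = build_recommend_url("63", k=5, remove_seen=False)

# Sample response copied from the README example above.
sample_response = (
    '{"recommendations": ["50", "181", "100", "258", "286"], '
    '"query": {"uid": "63", "k": 5, "remove_seen": false}}'
)
recommended_items = json.loads(sample_response)["recommendations"]
```

Passing `url` to `urllib.request.urlopen` would send the request once the serving app is running.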

## Efficient retrieval with ANN search

One important aspect of deploying a recommender model is efficient retrieval via Approximate Nearest Neighbor (ANN) search in vector space. Cornac integrates several vector similarity search frameworks for ease of deployment. [This example](tutorials/ann_hnswlib.ipynb) demonstrates how ANN search works seamlessly with any recommender model that supports it (e.g., MF).

| Supported framework | Cornac wrapper | Examples |
| :---: | :---: | :---: |
| [spotify/annoy](https://github.com/spotify/annoy) | [AnnoyANN](cornac/models/ann/recom_ann_annoy.py) | [ann_all.ipynb](examples/ann_all.ipynb)
| [meta/faiss](https://github.com/facebookresearch/faiss) | [FaissANN](cornac/models/ann/recom_ann_faiss.py) | [ann_all.ipynb](examples/ann_all.ipynb)
| [nmslib/hnswlib](https://github.com/nmslib/hnswlib) | [HNSWLibANN](cornac/models/ann/recom_ann_hnswlib.py) | [ann_hnswlib.ipynb](tutorials/ann_hnswlib.ipynb), [ann_all.ipynb](examples/ann_all.ipynb)
| [google/scann](https://github.com/google-research/google-research/tree/master/scann) | [ScaNNANN](cornac/models/ann/recom_ann_scann.py) | [ann_all.ipynb](examples/ann_all.ipynb)
| [spotify/annoy](https://github.com/spotify/annoy) | [AnnoyANN](cornac/models/ann/recom_ann_annoy.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
| [meta/faiss](https://github.com/facebookresearch/faiss) | [FaissANN](cornac/models/ann/recom_ann_faiss.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
| [nmslib/hnswlib](https://github.com/nmslib/hnswlib) | [HNSWLibANN](cornac/models/ann/recom_ann_hnswlib.py) | [ann_example.py](examples/ann_example.py), [ann_hnswlib.ipynb](tutorials/ann_hnswlib.ipynb), [ann_all.ipynb](examples/ann_all.ipynb)
| [google/scann](https://github.com/google-research/google-research/tree/master/scann) | [ScaNNANN](cornac/models/ann/recom_ann_scann.py) | [ann_example.py](examples/ann_example.py), [ann_all.ipynb](examples/ann_all.ipynb)
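
For context, ANN search approximates an exact top-k search over item vectors. The sketch below shows that exact baseline in plain Python using inner-product scores. It is an illustration only and does not use any Cornac wrapper or ANN library; `exact_top_k` and the toy vectors are hypothetical.

```python
import heapq


def exact_top_k(query, item_vectors, k):
    """Brute-force top-k items by inner product; an ANN index approximates this result."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in item_vectors]
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return top, [scores[i] for i in top]


# Toy item embeddings and one user (query) vector.
item_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
neighbors, neighbor_scores = exact_top_k([1.0, 0.2], item_vectors, k=2)
```

The brute-force scan costs time linear in the number of items per query. The frameworks in the table above trade a small loss in accuracy for much faster lookups.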


## Models
@@ -152,6 +152,7 @@ The recommender models supported by Cornac are listed below. Why don't you join
| 2020 | [Adversarial Training Towards Robust Multimedia Recommender System (AMR)](cornac/models/amr), [paper](https://ieeexplore.ieee.org/document/8618394) | [requirements.txt](cornac/models/amr/requirements.txt) | [amr_clothing.py](examples/amr_clothing.py)
| | [Hybrid neural recommendation with joint deep representation learning of ratings and reviews (HRDR)](cornac/models/hrdr), [paper](https://www.sciencedirect.com/science/article/abs/pii/S0925231219313207) | [requirements.txt](cornac/models/hrdr/requirements.txt) | [hrdr_example.py](examples/hrdr_example.py)
| | [LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation](cornac/models/lightgcn), [paper](https://arxiv.org/pdf/2002.02126.pdf) | [requirements.txt](cornac/models/lightgcn/requirements.txt) | [lightgcn_example.py](examples/lightgcn_example.py)
| | [Temporal-Item-Frequency-based User-KNN (TIFUKNN)](cornac/models/tifuknn), [paper](https://arxiv.org/pdf/2006.00556.pdf) | N/A | [tifuknn_tafeng.py](examples/tifuknn_tafeng.py)
| 2019 | [Embarrassingly Shallow Autoencoders for Sparse Data (EASEᴿ)](cornac/models/ease), [paper](https://arxiv.org/pdf/1905.03375.pdf) | N/A | [ease_movielens.py](examples/ease_movielens.py)
| | [Neural Graph Collaborative Filtering (NGCF)](cornac/models/ngcf), [paper](https://arxiv.org/pdf/1905.08108.pdf) | [requirements.txt](cornac/models/ngcf/requirements.txt) | [ngcf_example.py](examples/ngcf_example.py)
| 2018 | [Collaborative Context Poisson Factorization (C2PF)](cornac/models/c2pf), [paper](https://www.ijcai.org/proceedings/2018/0370.pdf) | N/A | [c2pf_exp.py](examples/c2pf_example.py)
@@ -202,7 +203,7 @@ The recommender models supported by Cornac are listed below. Why don't you join

## Contributing

This project welcomes contributions and suggestions. Before contributing, please see our [contribution guidelines](https://cornac.readthedocs.io/en/latest/developer/index.html).
This project welcomes contributions and suggestions. Before contributing, please see our [contribution guidelines](https://cornac.readthedocs.io/en/stable/developer/index.html).

## Citation

2 changes: 1 addition & 1 deletion cornac/__init__.py
@@ -23,4 +23,4 @@
# Also importable from root
from .experiment import Experiment

__version__ = '1.17'
__version__ = '1.18'
22 changes: 15 additions & 7 deletions cornac/data/dataset.py
@@ -633,7 +633,7 @@ class BasketDataset(Dataset):
uir_tuple: tuple, required
Tuple of 3 numpy arrays (user_indices, item_indices, rating_values).
basket_ids: numpy.array, required
basket_indices: numpy.array, required
Array of basket indices corresponding to observation in `uir_tuple`.
timestamps: numpy.array, optional, default: None
@@ -665,7 +665,7 @@ def __init__(
bid_map,
iid_map,
uir_tuple,
basket_ids=None,
basket_indices=None,
timestamps=None,
extra_data=None,
seed=None,
@@ -681,23 +681,31 @@ def __init__(
)
self.num_baskets = num_baskets
self.bid_map = bid_map
self.basket_ids = basket_ids
self.basket_indices = basket_indices
self.extra_data = extra_data
basket_sizes = list(Counter(basket_ids).values())
basket_sizes = list(Counter(basket_indices).values())
self.max_basket_size = np.max(basket_sizes)
self.min_basket_size = np.min(basket_sizes)
self.avg_basket_size = np.mean(basket_sizes)

self.__baskets = None
self.__basket_ids = None
self.__user_basket_data = None
self.__chrono_user_basket_data = None

@property
def basket_ids(self):
"""Return the list of raw basket ids"""
if self.__basket_ids is None:
self.__basket_ids = list(self.bid_map.keys())
return self.__basket_ids

@property
def baskets(self):
"""A dictionary to store indices where basket ID appears in the data."""
if self.__baskets is None:
self.__baskets = defaultdict(list)
for idx, bid in enumerate(self.basket_ids):
for idx, bid in enumerate(self.basket_indices):
self.__baskets[bid].append(idx)
return self.__baskets

@@ -822,7 +830,7 @@ def build(
np.ones(len(u_indices), dtype="float"),
)

basket_ids = np.asarray(b_indices, dtype="int")
basket_indices = np.asarray(b_indices, dtype="int")

timestamps = (
np.fromiter((int(data[i][3]) for i in valid_idx), dtype="int") if fmt in ["UBIT", "UBITJson"] else None
@@ -838,7 +846,7 @@ def build(
bid_map=global_bid_map,
iid_map=global_iid_map,
uir_tuple=uir_tuple,
basket_ids=basket_ids,
basket_indices=basket_indices,
timestamps=timestamps,
extra_data=extra_data,
seed=seed,
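
The rename in this file separates two notions: `basket_indices` holds one mapped basket index per observation, while the new `basket_ids` property returns the raw basket ids taken from the keys of `bid_map`. A standalone sketch of that distinction, using toy data rather than the real `BasketDataset` class:

```python
from collections import Counter, defaultdict

# Toy data (hypothetical): raw basket id -> mapped basket index.
bid_map = {"b_first": 0, "b_second": 1}
# One mapped basket index per observation in uir_tuple.
basket_indices = [0, 0, 1, 1, 1]

# Raw basket ids, as the new `basket_ids` property returns them.
basket_ids = list(bid_map.keys())

# Observation positions grouped by basket index, as in the `baskets` property.
baskets = defaultdict(list)
for idx, bid in enumerate(basket_indices):
    baskets[bid].append(idx)

# Basket size statistics computed in __init__.
basket_sizes = list(Counter(basket_indices).values())
max_basket_size = max(basket_sizes)
min_basket_size = min(basket_sizes)
```

Callers that previously passed `basket_ids=` with index arrays would need to switch to `basket_indices=`, as the `next_basket_evaluation.py` change in this commit does.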
6 changes: 5 additions & 1 deletion cornac/eval_methods/base_method.py
@@ -157,6 +157,8 @@ def ranking_eval(
if len(metrics) == 0:
return [], []

max_k = max(m.k for m in metrics)

avg_results = []
user_results = [{} for _ in enumerate(metrics)]

@@ -203,7 +205,9 @@ def pos_items(csr_row):
u_gt_pos_items = np.nonzero(u_gt_pos_mask)[0]
u_gt_neg_items = np.nonzero(u_gt_neg_mask)[0]

item_rank, item_scores = model.rank(user_idx, item_indices)
item_rank, item_scores = model.rank(
user_idx=user_idx, item_indices=item_indices, k=max_k
)

for i, mt in enumerate(metrics):
mt_score = mt.compute(
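
The change above computes the largest cutoff once and asks the model for only that many ranked items, which every metric can then truncate further. A simplified sketch of the idea, assuming every metric has a positive `k`. The `Metric` tuple, `rank_top_k`, and `hits_at_k` are hypothetical stand-ins, not Cornac's classes:

```python
from collections import namedtuple

# Hypothetical stand-in for a ranking metric with a cutoff k.
Metric = namedtuple("Metric", ["name", "k"])


def rank_top_k(scores, k):
    """Return the indices of the k highest-scoring items, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def hits_at_k(ranked, positives, k):
    """Count the positive items among the first k ranked items."""
    return len(set(ranked[:k]) & set(positives))


metrics = [Metric("Recall@2", 2), Metric("Recall@3", 3)]
max_k = max(m.k for m in metrics)

scores = [0.1, 0.9, 0.4, 0.8, 0.2]
ranked = rank_top_k(scores, max_k)  # one ranking call shared by all metrics
hits = {m.name: hits_at_k(ranked, {2, 3}, m.k) for m in metrics}
```

Ranking once at `max_k` is enough here because each metric only reads a prefix of the ranked list.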
4 changes: 2 additions & 2 deletions cornac/eval_methods/next_basket_evaluation.py
@@ -128,10 +128,10 @@ def get_gt_items(train_set, test_set, test_pos_items, exclude_unknowns):
user_idx,
item_indices,
history_baskets=history_baskets,
history_basket_ids=bids[:-1],
history_bids=bids[:-1],
uir_tuple=test_set.uir_tuple,
baskets=test_set.baskets,
basket_ids=test_set.basket_ids,
basket_indices=test_set.basket_indices,
extra_data=test_set.extra_data,
)

11 changes: 4 additions & 7 deletions cornac/exception.py
@@ -13,17 +13,14 @@
# limitations under the License.
# ============================================================================

class CornacException(Exception):
"""Exception base class to extend from

"""
class CornacException(Exception):
"""Exception base class to extend from"""

pass


class ScoreException(CornacException):
"""Exception raised in score function when facing unknowns
"""Exception raised in score function when facing unknowns"""

"""

pass
pass
2 changes: 1 addition & 1 deletion cornac/experiment/experiment.py
@@ -150,7 +150,7 @@ def run(self):
if self.val_result is not None:
self.val_result.append(val_result)

if not isinstance(self.result, CVExperimentResult):
if self.save_dir and (not isinstance(self.result, CVExperimentResult)):
model.save(self.save_dir)

output = ""
1 change: 1 addition & 0 deletions cornac/models/__init__.py
@@ -69,6 +69,7 @@
from .sorec import SoRec
from .spop import SPop
from .svd import SVD
from .tifuknn import TIFUKNN
from .trirank import TriRank
from .vaecf import VAECF
from .vbpr import VBPR
12 changes: 10 additions & 2 deletions cornac/models/ann/recom_ann_annoy.py
@@ -69,7 +69,6 @@ def __init__(
):
super().__init__(model=model, name=name, verbose=verbose)

self.model = model
self.n_trees = n_trees
self.search_k = search_k
self.num_threads = num_threads
@@ -85,14 +84,18 @@ def __init__(

def build_index(self):
"""Building index from the base recommender model."""
super().build_index()

from annoy import AnnoyIndex

assert self.measure in SUPPORTED_MEASURES

self.index = AnnoyIndex(
self.item_vectors.shape[1], SUPPORTED_MEASURES[self.measure]
)
self.index.set_seed(self.seed)

if self.seed is not None:
self.index.set_seed(self.seed)

for i, v in enumerate(self.item_vectors):
self.index.add_item(i, v)
@@ -115,6 +118,11 @@ def knn_query(self, query, k):
]
neighbors = np.array([r[0] for r in result], dtype="int")
distances = np.array([r[1] for r in result], dtype="float32")

# make sure distances respect the notion of nearest neighbors (smaller is better)
if self.higher_is_better:
distances = 1.0 - distances

return neighbors, distances
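
The conversion added to `knn_query` flips similarity-style scores so that smaller values mean closer neighbors. A plain-Python sketch of the same transformation, where `to_distances` is a hypothetical helper and not part of `AnnoyANN`:

```python
def to_distances(scores, higher_is_better):
    """Convert scores to distances so that smaller always means nearer."""
    if higher_is_better:
        # Mirrors `distances = 1.0 - distances` in knn_query.
        return [1.0 - s for s in scores]
    return list(scores)


# Similarity scores returned best first: after the flip they are in ascending order.
flipped = to_distances([0.5, 0.25], higher_is_better=True)
unchanged = to_distances([0.5, 0.25], higher_is_better=False)
```

The flip preserves the neighbor ordering, since subtracting from a constant reverses the sort direction.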

def save(self, save_dir=None):