Simple Recommendation System

Welcome to the Simple Recommendation System! This project is designed to build and evaluate various recommendation algorithms using both simple and advanced approaches. Whether you're a beginner looking to understand the fundamentals or an experienced developer aiming to implement a structured and modular recommendation system, this repository offers comprehensive resources to meet your needs.

Introduction

The Recommendation System Project aims to develop and evaluate different recommendation algorithms using a dataset of titles and user interactions. The project is divided into two main approaches:

Simple Approach: Utilizes Jupyter Notebooks for Exploratory Data Analysis (EDA), Modeling, and Evaluation.
Advanced Approach: Employs structured Python scripts for a more organized and scalable workflow, incorporating multiple algorithms and a modular codebase.

Approaches

Simple Approach

The Simple Approach is ideal for quick experimentation and understanding the basics of recommendation systems. It leverages Jupyter Notebooks to perform data analysis, build models, and evaluate their performance interactively.

Notebooks Included:
- EDA.ipynb: Conducts Exploratory Data Analysis to understand the dataset, visualize trends, and preprocess data.
- Modeling.ipynb: Implements various recommendation algorithms such as Content-Based Filtering, Collaborative Filtering (SVD), and Hybrid Models.

Features

Interactive Environment: Ideal for experimenting with data and models interactively.
Visualization: Easily generate plots and charts to visualize data distributions and model performance.
Step-by-Step Implementation: Simplifies the learning process for beginners.

Advanced Approach

The Advanced Approach is tailored for production-ready environments where scalability, maintainability, and automation are crucial. It utilizes a structured and modular codebase with Python scripts organized into various components.

Scripts Included:
- app/logger.py: Sets up logging for the application.
- app/main.py: The entry point that orchestrates the workflow.
- app/parser.py: Parses command-line arguments for hyperparameters and model options.
- app/utils.py: Contains utility functions for setting seeds, saving/loading models, etc.
- src/config.py: Configuration file defining paths and parameters.
- src/data_processing/load_data.py: Functions to load raw and processed data.
- src/data_processing/preprocess.py: Data cleaning and preprocessing functions.
- src/feature_engineering/feature_engineer.py: Functions for feature engineering like TF-IDF vectorization.
- src/models/: Contains various recommender models:
  - collaborative_filtering.py
  - content_based.py
  - hybrid_model.py
  - recommender.py
- src/evaluation/metrics.py: Implements custom evaluation metrics.
- src/workflow.py: Manages the entire workflow from data loading to evaluation.

Features

Modular Design: Separation of concerns enhances code readability and maintainability.
Multiple Algorithms: Incorporates a variety of recommendation algorithms for comprehensive analysis.
Logging: Detailed logging for monitoring and debugging.
Automation: Scripts can be integrated into automated pipelines for continuous evaluation and deployment.

Differences Between Approaches

Feature	Simple Approach (Notebooks)	Advanced Approach (Scripts)
Ease of Use	High, suitable for beginners	Moderate, requires familiarity with scripts
Interactivity	Interactive and visual through notebooks	Script-based, less interactive
Scalability	Limited by notebook environment	Not highly scalable but more structured
Maintainability	Less maintainable for large projects	More maintainable with modular code
Automation	Manual execution through notebooks	Can be automated using scripts
Flexibility	Easy to tweak and experiment	Structured for robust development
Performance	Suitable for small to medium datasets	Better performance with multiple algorithms
Logging & Monitoring	Basic or none	Comprehensive logging and monitoring

Getting Started

Prerequisites

Ensure you have the following installed on your system:

Python 3.7+
pip (Python package installer)
Git (optional, for cloning the repository)
Virtual Environment Tool (optional but recommended, e.g., venv or conda)

Installation

Clone the Repository

git clone https://github.com/yourusername/recommendation_system.git
cd recommendation_system

Create a Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install Dependencies
```
pip install -r requirements.txt
```
**Make sure you download the titles and title_interactions to /notebooks folder for Simple Approach, /data/raw folder for Advanced Approach **
```
pip install -r requirements.txt
```

Advanced Approach

Navigate to the Project Root Directory
```
cd simple_recommendation_system
```
Run the Main Script The main.py script orchestrates the entire workflow, including data loading, preprocessing, feature engineering, model training, evaluation, and saving results.
```
python app/main.py --max_features 5000 --sample_percentage 5 --algorithms SVD --alpha 0.5 --top_k 10
```

Command-Line Arguments::
- --max_features: Maximum number of features for TF-IDF Vectorizer (default: 10000).
- --sample_percentage: Percentage of training data to sample (default: 100.0).
- --algorithms: Comma-separated list of collaborative filtering algorithms to use (default: SVD,SVDpp,NMF,KNNBasic,KNNBaseline,KNNWithMeans).
- --alpha: Weighting factor for the hybrid recommender (default: 0.5).
- --top_k: Number of top recommendations to evaluate (default: 10).

Monitor Outputs

--Logs: Check the logs/recommender.log file for detailed logs.
--Evaluation: The evaluation results are saved in outputs/recommendation_evaluation_results.csv.
--Saved Models: Trained models are saved in the models_saved/ directory.

Unit Tests

Unit tests are implemented to ensure the correctness of the evaluation metrics and the functionality of the recommendation models.

Navigate to the Project Root Directory
```
cd simple_recommendation_system
```
Navigate to the Project Root Directory
```
python -m unittest discover -s tests
```

Contact

For any questions or suggestions, please contact [email protected] .

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Simple Recommendation System

Table of Contents

Introduction

Approaches

Simple Approach

Features

Advanced Approach

Features

Differences Between Approaches

Getting Started

Prerequisites

Installation

Advanced Approach

Unit Tests

Contact

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
app		app
data		data
notebooks		notebooks
src		src
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

bilalimamoglu/simple_recommendation_system

Folders and files

Latest commit

History

Repository files navigation

Simple Recommendation System

Table of Contents

Introduction

Approaches

Simple Approach

Features

Advanced Approach

Features

Differences Between Approaches

Getting Started

Prerequisites

Installation

Advanced Approach

Unit Tests

Contact

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages