Welcome to the Machine Learning Projects Repository! This collection features a variety of machine learning projects designed to address real-world problems, showcasing diverse tools, algorithms, and insights.
- Programming & Libraries: Python, Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn
- Algorithms Used: Linear Regression, Logistic Regression, Random Forest, Decision Tree, Gradient Boosting, Support Vector Machine (SVM), K-Means Clustering, PCA
- Objective: Predict house prices using features such as location, size, and amenities.
- Models: Linear Regression, Random Forest, Gradient Boosting
- Key Insight: Random Forest gave the most accurate price predictions of the three models, and location and size features mattered most.
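
A comparison like the one above can be sketched with cross-validation. This is a minimal illustration, not the project's actual code: the helper name `compare_regressors`, the choice of RMSE as the metric, and the synthetic data from `make_regression` (standing in for the house price dataset) are all assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def compare_regressors(X, y, cv=5, random_state=0):
    """Return the mean cross-validated RMSE for each candidate model (hypothetical helper)."""
    models = {
        "Linear Regression": LinearRegression(),
        "Random Forest": RandomForestRegressor(n_estimators=50, random_state=random_state),
        "Gradient Boosting": GradientBoostingRegressor(random_state=random_state),
    }
    results = {}
    for name, model in models.items():
        # scikit-learn reports RMSE as a negative score, so flip the sign.
        scores = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        results[name] = float(-scores.mean())
    return results


# Synthetic stand-in data; replace with the real feature matrix and prices.
X_demo, y_demo = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
rmse_by_model = compare_regressors(X_demo, y_demo)
for name, rmse in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {rmse:.2f}")
```

Lower RMSE is better. Which model wins depends on the data, so the ranking on this synthetic set says nothing about the ranking on real house prices.
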
- Objective: Predict passenger survival on the Titanic using demographic and class-related features.
- Models: Logistic Regression, Random Forest, SVM
- Key Insight: Handling class imbalance improved accuracy, with Random Forest reaching 81% accuracy.
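
One common way to handle class imbalance in scikit-learn is a stratified split combined with `class_weight="balanced"`. The sketch below assumes that approach; the README does not say which technique the project used. The helper name `fit_balanced_forest` and the synthetic data are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split


def fit_balanced_forest(X, y, random_state=0):
    """Train a class-weighted Random Forest and return test metrics (hypothetical helper)."""
    # stratify=y keeps the class ratio the same in the train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    # class_weight="balanced" weights each class inversely to its frequency.
    model = RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=random_state
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }


# Synthetic stand-in data with roughly an 80/20 class split.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)
metrics = fit_balanced_forest(X_demo, y_demo)
print(metrics)
```

Reporting F1 next to accuracy is useful here, because accuracy alone can look good on imbalanced data even when the minority class is predicted poorly.
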
- Objective: Predict restaurant ratings based on location, cuisine, and cost.
- Models: Random Forest, Ridge Regression
- Key Insight: Random Forest outperformed Ridge Regression, with location and cuisine type as the main contributors.
- Objective: Predict whether a user generates revenue (a classification task) based on user behavior and shopping patterns.
- Models: Logistic Regression, Gradient Boosting
- Key Insight: Gradient Boosting outperformed Logistic Regression, achieving an F1 score of 0.91.
- Objective: Estimate health insurance costs using personal and medical data.
- Models: Linear Regression, Gradient Boosting
- Key Insight: Gradient Boosting predicted costs more accurately than Linear Regression, with BMI and age as the most important features.
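
Feature rankings of this kind can be read from a fitted model's `feature_importances_` attribute. The following is a rough sketch under stated assumptions: the helper `rank_feature_importances`, the column names, and the synthetic target formula are all made up for illustration and do not come from the insurance dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor


def rank_feature_importances(X, y, random_state=0):
    """Fit Gradient Boosting and return importances, highest first (hypothetical helper)."""
    model = GradientBoostingRegressor(random_state=random_state)
    model.fit(X, y)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False)


# Synthetic stand-in data: the target depends on age and bmi only.
rng = np.random.default_rng(0)
n_rows = 500
X_demo = pd.DataFrame(
    {
        "age": rng.integers(18, 65, size=n_rows),
        "bmi": rng.uniform(18.0, 40.0, size=n_rows),
        "unrelated": rng.normal(size=n_rows),  # carries no signal by construction
    }
)
y_demo = (
    200.0 * X_demo["age"]
    + 300.0 * X_demo["bmi"]
    + rng.normal(scale=500.0, size=n_rows)
)

ranked = rank_feature_importances(X_demo, y_demo)
print(ranked)
```

Impurity-based importances sum to 1 and can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a reasonable cross-check on real data.
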
- Objective: Detect fraudulent transactions from financial datasets.
- Models: Logistic Regression, Decision Tree
- Key Insight: Decision Tree achieved high precision by effectively splitting categorical features.
- Objective: Determine loan eligibility using financial and employment data.
- Models: Logistic Regression, Random Forest
- Key Insight: Logistic Regression was the more interpretable of the two models while still performing well.
- Objective: Predict trip durations using location, distance, and time data.
- Models: Linear Regression, Random Forest
- Key Insight: Random Forest predicted trip durations more accurately than Linear Regression, making it the stronger candidate for real-time applications.
- Objective: Cluster music tracks by audio features so that the resulting groups approximate genres.
- Models: K-Means Clustering, PCA
- Key Insight: PCA improved clustering efficiency, delivering meaningful groupings.
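
The PCA-then-K-Means workflow can be sketched as follows. This is an assumption-laden illustration rather than the project's code: the helper `cluster_with_pca`, the number of components and clusters, and the synthetic `make_blobs` data (standing in for audio features) are all hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_with_pca(X, n_components=2, n_clusters=3, random_state=0):
    """Scale, reduce with PCA, then cluster with K-Means (hypothetical helper)."""
    # PCA is sensitive to feature scale, so standardize first.
    X_scaled = StandardScaler().fit_transform(X)
    X_reduced = PCA(n_components=n_components, random_state=random_state).fit_transform(X_scaled)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(X_reduced)
    # Silhouette score ranges from -1 to 1; higher means better-separated clusters.
    score = silhouette_score(X_reduced, labels)
    return labels, score


# Synthetic stand-in data; replace with the real audio feature matrix.
X_demo, _ = make_blobs(n_samples=200, n_features=8, centers=3, random_state=0)
cluster_labels, silhouette = cluster_with_pca(X_demo)
print(f"clusters found: {len(set(cluster_labels))}, silhouette: {silhouette:.2f}")
```

K-Means assigns cluster IDs, not genre names. Mapping clusters to genres would need labeled tracks or manual inspection.
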
- Objective: Predict air quality levels using environmental data.
- Models: Random Forest, Gradient Boosting
- Key Insight: Random Forest outperformed Gradient Boosting, with pollutant levels as a significant predictor.
- Objective: Detect malware using system and application behavior data.
- Models: Logistic Regression, SVM, Decision Tree
- Key Insight: Logistic Regression and Decision Tree achieved high precision, making them reliable for detection systems.
This repository reflects my experience applying machine learning to a range of problems, with a focus on data preprocessing, feature engineering, model optimization, and actionable insights. Each project applies these steps to a practical, real-world problem.
Explore the repository to dive deeper into the projects, and feel free to connect for feedback or collaboration opportunities!