This repository contains a Jupyter Notebook for predicting diabetes using a machine learning model. The dataset used for this project is from Kaggle: Diabetes Prediction Dataset.
Diabetes is a chronic disease that affects millions of people worldwide. Early detection is crucial for managing the condition and preventing complications. This project aims to predict the likelihood of diabetes in a patient based on various medical attributes using machine learning techniques.
The dataset used in this project is publicly available on Kaggle and contains several medical predictors such as age, BMI, blood pressure, insulin level, and more.
- Dataset Link: Diabetes Prediction Dataset
To run this project, you need to have Python and Jupyter Notebook installed on your system. Additionally, install the required Python packages by running:
pip install -r requirements.txt
- Clone the repository:
git clone https://github.com/your-username/diabetes-prediction.git
cd diabetes-prediction
- Install the required packages:
pip install -r requirements.txt
- Open the Jupyter Notebook:
jupyter notebook Diabetes_prediction_model.ipynb
- Follow the steps in the notebook to preprocess the data, train the model, and evaluate its performance.
The notebook covers the following steps:
- Data Exploration: Understanding the dataset and visualizing the features.
- Data Preprocessing: Cleaning the data and preparing it for the model.
- Model Training: Training various machine learning models such as Logistic Regression, Decision Tree, Random Forest, and others.
- Model Evaluation: Evaluating the performance of the models using metrics such as accuracy, precision, recall, and F1 score.
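The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for the Kaggle data, and the feature count, split ratio, scaler, and hyperparameters are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption: 8 numeric features, binary target).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)

# Hold out 20% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessing: fit the scaler on the training split only, then apply to both.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Candidate models named in the list above (hyperparameters are illustrative).
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Train each model and record the four evaluation metrics on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    y_pred = model.predict(X_test_s)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

To try this on the real data, the `make_classification` call would be replaced by loading the Kaggle CSV (for example with pandas) and separating the target column from the features.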
Each model's performance is evaluated on these metrics, and the best-performing model is selected accordingly. Details of the results can be found in the notebook.
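As a rough illustration of how such a selection could work, the sketch below computes accuracy, precision, recall, and F1 from scratch and picks the candidate with the highest F1. The choice of F1 as the deciding metric, the helper names `classification_metrics` and `select_best`, and the toy labels and predictions are all assumptions made for this example; they are not taken from the notebook.

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics, treating label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def select_best(predictions, y_true, metric="f1"):
    """Score every candidate and return the name with the highest metric value."""
    scores = {
        name: classification_metrics(y_true, y_pred)
        for name, y_pred in predictions.items()
    }
    best = max(scores, key=lambda name: scores[name][metric])
    return best, scores


# Toy labels and predictions, made up purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 0, 1, 0, 1, 1, 0],
    "model_b": [1, 0, 1, 1, 0, 0, 0, 0],
}

best_name, all_scores = select_best(predictions, y_true)
print(best_name, all_scores[best_name])
```

For a screening task like diabetes prediction, recall may matter more than F1, since a missed positive case is usually costlier than a false alarm; passing `metric="recall"` to `select_best` switches the criterion.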
Contributions are welcome! Please fork the repository and submit a pull request for any enhancements or bug fixes.