Network Intrusion Detection Kaggle Competition Predictive Modeling and F1-Score Optimization

As part of a Kaggle competition on network intrusion detection, I built a predictive model using the provided training dataset. Submissions were CSV files with two columns (ID, Class), where ID matches the row identifier in the test data and Class holds the predicted label. The competition's evaluation metric was the F1-score. This work was part of my 2023 master's program at the University of Ottawa, specializing in AI for Cyber Security.
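Since the leaderboard metric is the F1-score, a small worked example of how it is computed may help. This is a minimal sketch that assumes the score is taken on the positive class (1); the competition's exact averaging mode is not stated here, and the function name `f1_score_binary` is only illustrative.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Compute the F1-score for the positive class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up for illustration): 2 TP, 1 FP, 1 FN
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

With these toy labels, precision and recall are both 2/3, so the F1-score is also 2/3.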

Required libraries: scikit-learn, pandas, matplotlib, catboost.
Run the cells in a Jupyter Notebook environment.
The notebook has been run successfully in Google Colab.

Binary Classification Problem

The task is to classify whether a connection is intrusive (1) or not (0).

Independent Variables:

These include features such as duration, protocol type, service, flags, and numerical attributes related to network activity. Together they describe the network behavior used for intrusion detection.

Target variable:

'Class': whether the connection is intrusive (1) or not (0).

Key Tasks Undertaken

Data Loading and Exploration:
- Loaded and explored the train and test datasets.
- Checked data information, null values, duplicates, and unique values.
Data Cleaning:
- Handled missing values and duplicates.
- Dropped unnecessary columns ("ID", "duration").
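The cleaning steps above could look roughly like the following dependency-free sketch. How missing values were actually handled is not stated, so discarding incomplete rows is an assumption; the helper `clean_rows` and the sample column `protocol_type` are hypothetical.

```python
def clean_rows(rows, drop_columns=("ID", "duration")):
    """Drop incomplete rows and exact duplicates, then remove unused columns."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumption: rows with any missing value are discarded
        if any(value is None or value == "" for value in row.values()):
            continue
        # Exact-duplicate check on the full row (column names are unique)
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({k: v for k, v in row.items() if k not in drop_columns})
    return cleaned


# Toy rows (made up for illustration)
rows = [
    {"ID": 1, "duration": 0, "protocol_type": "tcp", "Class": 0},
    {"ID": 1, "duration": 0, "protocol_type": "tcp", "Class": 0},   # duplicate
    {"ID": 2, "duration": 5, "protocol_type": None, "Class": 1},    # missing value
    {"ID": 3, "duration": 2, "protocol_type": "udp", "Class": 1},
]
cleaned = clean_rows(rows)
```

Note that the test set's ID values still need to be kept aside before dropping the column, since the submission file requires them.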
Data Preprocessing:
- Separated features (X_train) and target variable (y_train).
- Applied one-hot encoding to categorical variables.
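To illustrate the one-hot encoding step, here is a small sketch in plain Python. It learns the category values from the training rows only and reuses them for the test rows, so both end up with the same columns. The column names `protocol_type` and `src_bytes` and both helper functions are assumptions for illustration; in a pandas workflow, `pandas.get_dummies` would typically do this job.

```python
def fit_categories(rows, columns):
    """Collect the sorted set of values seen in each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}


def one_hot(row, categories):
    """Replace each categorical column with 0/1 indicator columns."""
    encoded = {k: v for k, v in row.items() if k not in categories}
    for col, values in categories.items():
        for value in values:
            encoded[f"{col}_{value}"] = 1 if row[col] == value else 0
    return encoded


# Toy rows (made up for illustration)
train_rows = [
    {"protocol_type": "tcp", "src_bytes": 181},
    {"protocol_type": "udp", "src_bytes": 105},
]
test_rows = [{"protocol_type": "icmp", "src_bytes": 30}]

# Categories come from the training data only
categories = fit_categories(train_rows, ["protocol_type"])
encoded_train = [one_hot(r, categories) for r in train_rows]
encoded_test = [one_hot(r, categories) for r in test_rows]
```

A category that appears only in the test data (here `icmp`) gets all-zero indicators rather than a new column, which keeps the feature layout identical between training and prediction.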
Model Training:
- Selected CatBoostClassifier, with hyperparameter tuning, after comparing different classifiers.
- Applied class weights for imbalanced classes.
- Employed soft voting with a threshold for ensemble predictions.
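The class-weighting and soft-voting ideas above can be sketched without any ML library. This is not the notebook's code: the weights use the common "balanced" heuristic, n_samples / (n_classes * class_count), which is an assumption about how the weights were chosen, and the probabilities and the 0.5 threshold are made-up values.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def soft_vote(probabilities_per_model, threshold=0.5):
    """Average positive-class probabilities across models, then threshold."""
    n_models = len(probabilities_per_model)
    averaged = [sum(ps) / n_models for ps in zip(*probabilities_per_model)]
    return [1 if p >= threshold else 0 for p in averaged]


# The rarer class (1) receives the larger weight
weights = balanced_class_weights([0, 0, 0, 1])

# Hypothetical P(Class = 1) from three models for four connections
model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.3, 0.5],
    [0.7, 0.3, 0.3, 0.9],
]
predictions = soft_vote(model_probs, threshold=0.5)
```

Lowering the threshold trades precision for recall on the intrusive class, so it is usually tuned on validation data against the F1-score.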
Model Evaluation and Prediction:
- Evaluated the model using the F1-score metric.
- Generated predictions for the test data.
Submission File Creation:
- Formatted the predictions into a CSV file with columns (ID, Class).
- Saved the submission file as "Result of CatBoostClassifier model.csv".
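The submission format described above (a header row followed by one ID, Class pair per test row) can be produced with the standard `csv` module. This is a minimal sketch; the helper `write_submission` and the sample IDs are hypothetical.

```python
import csv
import io


def write_submission(ids, predictions, handle):
    """Write (ID, Class) rows with a header to an open text handle."""
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["ID", "Class"])
    writer.writerows(zip(ids, predictions))


# Write to an in-memory buffer here; a real run would pass a file handle
buffer = io.StringIO()
write_submission([101, 102, 103], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
```

To save to disk, the same function can be given a handle opened with `open("Result of CatBoostClassifier model.csv", "w", newline="")`.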
