This project analyzes NASA's Near-Earth Object (NEO) data to explore trends, patterns, and risks associated with objects approaching Earth's orbit. It combines exploratory data analysis, clustering (K-Means, HDBSCAN), classification models (KNN, XGBoost), and interactive Plotly visualizations to study NEO characteristics and potential threats.
- Data Exploration
  - Explore detailed NEO metrics such as size, velocity, and approach distance.
  - Analyze time-series data to observe close approaches over time.
- Visualization
  - Scatter plots, histograms, and heatmaps for understanding data distributions.
  - Interactive 3D scatter plots for clustering insights.
- Clustering and Risk Analysis
  - Use K-Means and HDBSCAN for clustering NEOs based on orbital and size parameters.
  - Identify high-risk NEOs based on proximity and size.
- Predictive Modeling
  - Build and evaluate machine learning models (e.g., KNN and XGBoost).
  - Classify NEOs for potential hazards.
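As a rough illustration of the proximity-and-size screening listed above, the sketch below flags objects in a small hand-made table. The column names (`miss_distance_km`, `est_diameter_max_km`), the sample rows, and both thresholds are assumptions made for this example, not values taken from the notebook. The cutoffs are loosely modeled on the 0.05 au and 140 m figures commonly associated with potentially hazardous asteroids.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's column names may differ.
neos = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "est_diameter_max_km": [0.05, 0.30, 0.90, 0.20],
        "miss_distance_km": [2.0e6, 5.0e6, 4.0e7, 9.0e6],
    }
)

# Illustrative thresholds (assumptions, not taken from the notebook).
MAX_MISS_DISTANCE_KM = 7.5e6  # roughly 0.05 au
MIN_DIAMETER_KM = 0.14        # roughly 140 m


def flag_high_risk(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a boolean high_risk column."""
    out = df.copy()
    # An object is flagged only if it is both close enough and large enough.
    out["high_risk"] = (out["miss_distance_km"] <= MAX_MISS_DISTANCE_KM) & (
        out["est_diameter_max_km"] >= MIN_DIAMETER_KM
    )
    return out


flagged = flag_high_risk(neos)
print(flagged[["name", "high_risk"]])
```

Requiring both conditions keeps small objects that pass nearby, and large objects that stay far away, out of the high-risk set.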
The dataset contains detailed attributes of near-Earth objects, including:
- Object diameter
- Approach distance
- Velocity
- Orbital parameters
- Classification of hazard levels
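A first pass over attributes like these might look like the sketch below, which counts missing values and fills numeric gaps with the column median before summarizing. The column names and sample values are placeholders invented for the example, and median imputation is only one possible choice; the notebook may handle missing data differently.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; column names are placeholders for the real dataset's fields.
raw = pd.DataFrame(
    {
        "est_diameter_max_km": [0.12, np.nan, 0.45, 0.08],
        "relative_velocity_kmh": [48000.0, 61000.0, np.nan, 35000.0],
        "miss_distance_km": [3.2e7, 1.1e7, 5.4e7, 7.0e6],
        "hazardous": [False, True, False, False],
    }
)

# Count missing values per column before deciding how to handle them.
missing_counts = raw.isna().sum()

# One simple option: fill numeric gaps with the column median.
numeric_cols = raw.select_dtypes(include="number").columns
clean = raw.copy()
clean[numeric_cols] = clean[numeric_cols].fillna(clean[numeric_cols].median())

print(missing_counts)
print(clean.describe())
```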
- Introduction and Objective: Overview of the analysis goals.
- Dataset Overview: Description of the NASA NEO dataset.
- Exploratory Data Analysis (EDA):
  - Visualizations: Scatter plots, histograms, and heatmaps.
  - Time-series trends of close approaches.
- Clustering:
  - Techniques: K-Means, HDBSCAN.
  - 3D visualizations of clusters.
  - Silhouette scoring for evaluation.
- Predictive Modeling:
  - Models: KNN, XGBoost.
  - Risk analysis and classification.
- Interactive Visualizations:
  - Use Plotly for engaging visual outputs.
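The clustering step in the outline above, K-Means evaluated with silhouette scores, can be sketched as follows. This is a minimal example on synthetic data, assuming scikit-learn is installed; the two generated features merely stand in for columns such as diameter and miss distance, and the range of cluster counts is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for two NEO features (e.g. diameter and miss distance);
# a real analysis would use columns from the dataset instead.
group_a = rng.normal(loc=[0.1, 1.0e7], scale=[0.02, 1.0e6], size=(50, 2))
group_b = rng.normal(loc=[0.8, 5.0e7], scale=[0.05, 2.0e6], size=(50, 2))
features = np.vstack([group_a, group_b])

# Scale first so the large-valued distance column does not dominate.
scaled = StandardScaler().fit_transform(features)

# Compare a few cluster counts by silhouette score (higher is better).
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    scores[k] = silhouette_score(scaled, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Scaling matters here because K-Means relies on Euclidean distance, so a feature measured in tens of millions of kilometers would otherwise swamp one measured in fractions of a kilometer.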
- Core libraries for data manipulation and visualization: `pandas`, `numpy`, `matplotlib`, `seaborn`.
- Advanced clustering methods such as `HDBSCAN` and K-Means.
- Descriptive statistics and missing value handling.
- 3D scatter plotting for cluster visualization using `matplotlib` and `plotly`.
- Predictive modeling with hyperparameter tuning using `XGBoost` and `KNN`.
- Clone this repository:

      git clone https://github.com/your-username/NASA-Near-Earth-Object-Analysis.git
      cd NASA-Near-Earth-Object-Analysis

- Install dependencies:

      pip install -r requirements.txt

- Launch the Jupyter Notebook:

      jupyter notebook

- Open `NASA_Near_Earth_Object_Analysis.ipynb` to explore the analysis.
- Data Loading: Load and preprocess the dataset.
- EDA: Visualize and interpret data distributions.
- Clustering: Group NEOs and analyze profiles.
- Modeling: Train and evaluate machine learning models for classification.
- Interactive Visualizations: Generate dynamic plots with Plotly.
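The modeling step in the workflow above could be sketched with a KNN classifier tuned by grid search, as below. The features, the hazard rule used to generate labels, and the parameter grid are all invented for illustration, and the example assumes scikit-learn is available; it does not reproduce the notebook's actual models or results.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic features and a made-up hazard rule, used only to get labeled data.
n = 300
diameter_km = rng.uniform(0.01, 1.0, size=n)
miss_distance_km = rng.uniform(1.0e6, 7.0e7, size=n)
X = np.column_stack([diameter_km, miss_distance_km])
y = ((diameter_km > 0.3) & (miss_distance_km < 3.0e7)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training split.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())
search = GridSearchCV(
    pipeline,
    param_grid={"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]},
    cv=5,
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline avoids leaking statistics from the validation folds into training, which would otherwise make the cross-validated scores look better than they are.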
- Time Series Analysis: Observe trends in close approaches over time.
- Cluster Visualizations: 3D scatter plots to explore relationships between clusters.
- Risk Profiles: Bar charts and tables summarizing cluster characteristics.
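A time-series summary like the one listed above might be produced by grouping close approaches by year. The sketch below uses a handful of made-up records, and the column names `close_approach_date` and `miss_distance_km` are assumptions for the example rather than the dataset's actual field names.

```python
import pandas as pd

# Hypothetical close-approach records; column names are assumptions.
approaches = pd.DataFrame(
    {
        "close_approach_date": pd.to_datetime(
            [
                "2019-03-01",
                "2019-11-15",
                "2020-01-20",
                "2020-06-05",
                "2020-12-31",
                "2021-07-04",
            ]
        ),
        "miss_distance_km": [4.1e7, 2.3e7, 6.0e6, 3.5e7, 1.2e7, 5.5e7],
    }
)

# Group by calendar year to count approaches and find the closest pass per year.
by_year = approaches.groupby(approaches["close_approach_date"].dt.year).agg(
    approaches=("miss_distance_km", "count"),
    closest_km=("miss_distance_km", "min"),
)
by_year.index.name = "year"
print(by_year)
```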
- Incorporate live NASA API data for real-time analysis.
- Enhance risk profiling with additional data features.
- Experiment with deep learning models for predictive tasks.