This repository contains code for a machine learning model built to classify celestial objects such as galaxies, stars, and quasars based on spectral characteristics. The model utilizes data from the Sloan Digital Sky Survey (SDSS), a comprehensive survey of space.
The dataset consists of 100,000 observations of celestial objects, each described by 18 feature columns. These features include:
- obj_ID: Object Identifier, unique value identifying the object in the image catalog used by the CAS.
- alpha: Right Ascension angle (at J2000 epoch).
- delta: Declination angle (at J2000 epoch).
- u, g, r, i, z: Photometric filter measurements in the ultraviolet, green, red, near-infrared, and infrared spectrum, respectively.
- run_ID, rereun_ID, cam_col, field_ID: Identifiers for the specific scan, rerun, camera column, and field number.
- spec_obj_ID: Unique ID for optical spectroscopic objects. Different observations with the same spec_obj_ID share the output class.
- class: Object class (galaxy, star, or quasar object).
- redshift: Redshift value based on the increase in wavelength.
- plate, MJD, fiber_ID: Plate ID, Modified Julian Date, and fiber ID used for each observation.
The machine learning model is implemented in a Jupyter Notebook using Python's popular data science libraries, including:
- scikit-learn for building and evaluating the model.
- pandas for data manipulation and analysis.
- matplotlib and seaborn for data visualization.