ishani2202/README.md

Hi there! 👋 I'm Ishani Arya

Welcome to my GitHub profile! I’m a Data Scientist focused on machine learning, with a strong interest in finance and healthcare research. Here, you’ll find my work at the intersection of AI, machine learning, and analytics.


Website | LinkedIn | Behance


🎓 Education

  • Johns Hopkins University
    Master of Science in Data Science (Expected: 2026)
    Relevant Courses: Advanced Data Science, Machine Translation, ML for Healthcare

  • LM Thapar School of Management
    MBA in Finance (2024)
    Relevant Courses: Financial Derivatives, Options Pricing, Portfolio Management

  • Thapar University
    Bachelor’s in Computer Science (2023)
    Relevant Courses: Deep Learning, Probability and Statistics, Algorithm Design


💼 Experience

Research Assistant: Financial Data Science | Johns Hopkins University

November 2024 – Present | Baltimore, USA
Engineered an ETL pipeline that processes unstructured SEC filings (10-K and 10-Q) of business development companies (BDCs) to support financial forecasting.

  • Streamlined Data Analysis: Designed Python and Stata workflows for data wrangling, feature engineering, and regex classification to convert financial data into machine-readable formats.
  • Predictive Modeling: Prepared datasets for financial time-series forecasting, leveraging ARIMA models for trend prediction and actionable insights.

Data Science Trainee | LG Electronics

February 2024 – July 2024 | Noida, India
Designed and implemented customer insight pipelines to enhance product development strategies.

  • Web Scraping Expertise: Built web scraping solutions using Selenium and Beautiful Soup to extract Voice of Customer data from platforms like YouTube and Flipkart.
  • NLP Innovations: Fine-tuned BERT for sentiment analysis and utilized Latent Dirichlet Allocation (LDA) for topic modeling, transforming customer feedback into actionable insights.

Big Data Intern | Reliance Jio

June 2023 – August 2023 | Mumbai, India
Harnessed big data technologies to optimize marketing strategies for India’s leading telecom provider.

  • ETL Pipeline Development: Designed a scalable pipeline using Apache Spark to process over 1M tweets, integrating MongoDB for efficient storage.
  • Predictive Analytics: Applied Naive Bayes for sentiment classification and SVM for engagement prediction, visualizing results in Tableau to boost campaign effectiveness by 15%.

Machine Learning Researcher | Thapar University

June 2022 – May 2023 | Patiala, India
Pioneered a deep learning-based waste classification system, contributing to environmental sustainability.

  • Innovative Edge Processing: Developed a waste classification boat using ResNet and TensorFlow, achieving 93.07% accuracy in real-time classification of biodegradable vs. non-biodegradable waste.
  • Deployment Ready: Deployed the system on Raspberry Pi for efficient edge processing with live camera input.

🔬 Research Projects

Autonomous Waste Segregation Boat

An AI-powered solution for environmental management, featuring a waste classification system with a published patent.

  • Smart Waste Management: Leveraged TensorFlow and ResNet on Raspberry Pi to accurately distinguish between biodegradable and non-biodegradable waste, achieving 93.07% classification accuracy.
  • Impact in Action: This prototype advances environmental sustainability by improving waste segregation processes in water bodies.

MelanoViT | GitHub

A robust melanoma classification framework leveraging state-of-the-art Vision Transformers (ViT) and DINOv2.

  • Breakthrough Performance: Achieved 99.03% accuracy with metadata integration, surpassing traditional CNN-based models.
  • Advanced Techniques: Addressed class imbalance with weighted loss functions and enhanced generalization through data augmentation (hair removal, geometric transformations).
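
The weighted-loss idea mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the MelanoViT code: the inverse-frequency weighting scheme, the 9:1 class ratio, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count for more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the sum of weights.

    `probs` is a list of dicts mapping class name -> predicted probability.
    """
    total = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Hypothetical, heavily imbalanced label set (illustrative only).
labels = ["benign"] * 9 + ["melanoma"]
weights = inverse_frequency_weights(labels)

# A model that always predicts "benign" with 90% confidence.
lazy_probs = [{"benign": 0.9, "melanoma": 0.1} for _ in labels]
unweighted = weighted_cross_entropy(lazy_probs, labels, {c: 1.0 for c in weights})
weighted = weighted_cross_entropy(lazy_probs, labels, weights)
print(weights, unweighted, weighted)
```

With these made-up numbers the weighted loss is noticeably higher than the unweighted one, because the single misclassified minority sample carries as much total weight as the nine majority samples combined.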

Pandemic Tracker | GitHub | Interactive Website

A comprehensive COVID-19 global dashboard designed for dynamic data exploration.

  • Time-Series Analysis: Utilized Python, Pandas, and Plotly for rolling averages and trend visualizations.
  • Deployed Application: Interactive Streamlit app featuring choropleths, bar graphs, line plots, and filters to enable detailed insights into pandemic trends.
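
As a rough illustration of the rolling-average step, here is a small standard-library sketch of a trailing moving average. The dashboard itself uses Pandas; the function name `rolling_mean` and the sample numbers below are hypothetical, not taken from the project.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Trailing moving average; None until a full window is available."""
    buf, total, out = deque(), 0.0, []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            # Drop the oldest value so the buffer holds exactly `window` items.
            total -= buf.popleft()
        out.append(total / window if len(buf) == window else None)
    return out


# Made-up daily case counts, smoothed with a 3-day window.
daily_cases = [10, 20, 30, 40, 50]
smoothed = rolling_mean(daily_cases, window=3)
print(smoothed)  # [None, None, 20.0, 30.0, 40.0]
```

In Pandas, `Series.rolling(window).mean()` behaves similarly, leaving the first `window - 1` entries as NaN by default.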

EchoTranslate | GitHub

A Transformer-based pipeline for ASR and multilingual translation targeting medical applications.

  • Domain-Specific Excellence: Preprocessed and fine-tuned on a medical corpus, achieving a 5.8% BLEU score improvement and 12.5% WER reduction.
  • Advanced Features: Incorporated Torchaudio and Librosa for spectrogram extraction, signal denoising, and preprocessing in doctor-patient conversations.
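
For readers unfamiliar with the WER metric cited above, the following is a minimal sketch of how word error rate is commonly computed: word-level edit distance divided by the number of reference words. It is not EchoTranslate's evaluation code, and the function name and example sentences are assumptions for illustration.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one word ("a") is dropped by the recognizer.
wer = word_error_rate("the patient has a fever", "the patient has fever")
print(wer)  # 0.2
```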

🛠️ Technical Skills

  • Programming Languages: Python, SQL (MySQL), R, C++, Java, JavaScript
  • Technical Skills and Tools:
    • Data Filtration: NumPy, Pandas, OS, Scikit-Learn, SciPy, Datasets, LIWC, NLTK
    • Web Scraping: Selenium, BeautifulSoup
    • Model Building and Training: PyTorch, Transformer, WandB, SageMaker, SpaCy, Flair, TensorFlow, OpenCV, Amazon Lex, XGBoost
    • Data Visualization: Matplotlib, Seaborn, Tableau
  • Software & Frameworks: Power BI, Apache Spark, Hadoop, AWS, Azure, Docker, Microsoft Excel, Figma
  • Competencies: Machine Learning, Generative AI, Feature Engineering, Deep Learning, Data Analysis, Financial Derivatives

🏆 Extracurriculars & Honors

  • Millennium Fellowship 2021 by United Nations Academic Impact
  • Head of Administration for IAESTE, TIET, India Chapter
  • Top 5 Finalist in Microsoft Learn Student Chapter Hackathon at Thapar University

📫 Let’s Connect!

Whether you want to discuss a project, explore collaboration opportunities, or simply chat about data science, feel free to reach out on LinkedIn or GitHub.


Popular repositories

  1. Image-Classification-for-Automatic-Waste-Segregation-Patent-Published (Public, Jupyter Notebook)
  2. Covid19_Dashboard (Public, Python)
  3. Image_Caption (Public, Jupyter Notebook)
  4. EchoTranslate (Public, Jupyter Notebook)
  5. VisionClassify (Public, Jupyter Notebook; forked from Khushangz/VisionClassify)
  6. ishani2202 (Public)