foyie.py
print("Hello, World!")

I'm Chandrima Das

AI · ML · Data Science

Passionate about building intelligent systems and leveraging machine learning to solve real-world problems. Currently pursuing a Master's in Data Science at UC San Diego, with expertise in Generative AI, Computer Vision, Deep Learning, and MLOps.


About Me

My journey into AI began with purpose, not just passion. In 2020, amid COVID-19, I asked: How can I build something that matters right now? That question led me to develop machine learning pipelines for identifying drug targets. What started as a crisis response evolved into a career engineering AI systems that improve lives, from predicting power outages for 3.6 million SDG&E customers to protecting endangered wildlife through computer vision at San Diego Zoo.

I build AI that ships. My expertise spans deep learning, Generative AI, computer vision, and MLOps, but more importantly, I know how to take models from research to production. I've built multi-agent GenAI platforms with RAG architectures, deployed neural networks on AWS at scale, and optimized inference pipelines to run faster. I thrive in the messy middle ground where research meets engineering, where you need to understand both gradient descent and Docker containers.

But my impact extends beyond the technical. I'm deeply committed to making tech an inclusive space for women. I'm seeking teams that share my vision: AI that's ethical, impactful, scalable, and built by people who represent the world we're building for. I bring production-ready ML skills (PyTorch, LangChain, AWS, RAG systems), published research, real-world deployment experience, and an unwavering commitment to using AI as a force for good.

Ready to discuss how cutting-edge AI research meets production engineering? Find me at GHC 2025!

Chandrima Das

Education

Master of Science in Data Science
University of California, San Diego
Sep 2024 – Present

GPA: 3.82 | San Diego, CA, USA

Bachelor of Technology in Computer Science
Institute of Engineering and Management, Kolkata
Sep 2020 – Jun 2024

GPA: 3.99 | Rank: 2nd | Kolkata, India

Skills & Expertise

Building intelligent systems with cutting-edge technologies

Additional Technologies

Java
C/C++
R
MATLAB
Vector Databases
YOLO
XGBoost
Librosa
Flask
Django
REST APIs
CI/CD
MLflow
Keras
Seaborn
NLTK
spaCy
PostgreSQL
MongoDB
Redis
Airflow
Jupyter
A/B Testing
Model Deployment

Experience

Machine Learning Intern
San Diego Supercomputer Center
San Diego, CA, USA
Jul 2025 – Present

Machine Learning Intern
San Diego Zoo Wildlife Alliance
San Diego, CA, USA
Aug 2025 – Present

Machine Learning Engineering Intern
Moccet
Boston, MA, USA
May 2025 – Jul 2025

Machine Learning Researcher
MixLab, University of California San Diego
San Diego, CA, USA
Mar 2025 – Jun 2025

Data Science Intern
Innovation and Entrepreneurship Development Cell (CSE)
Kolkata, India
Jan 2023 – May 2024

Machine Learning Intern
Indian Institute of Technology Indore
Indore, India
Jun 2023 – Nov 2023

Projects

Multi-Agent AI Supply Chain Optimization

An autonomous AI platform that processes 10,000+ supply chain events per second using four ML agents. Built with microservices architecture to simulate and handle real-time disruptions.

MLOps, MLflow, PostgreSQL, Kafka, Docker, Kubernetes

A resilient, autonomous AI platform for supply chain management that processes over 10,000 events per second using four specialized machine learning agents. Built with FastAPI, PostgreSQL, and Kafka for high-throughput data flow, with MLflow and Weights & Biases for comprehensive monitoring. Deployed using microservices architecture with Docker and Kubernetes to simulate and handle supply chain disruptions at scale.


Personal AI Nutrition & Fitness Coach

An AI health assistant that analyzes food images to calculate nutrition and provides real-time coaching. Reduced API costs by 70% through RAG-powered FAISS search.

GPT4-Vision, RAG, FAISS, FastAPI, Computer Vision

A personal health platform combining computer vision and conversational AI. Uses GPT4-Vision for food image analysis to automatically calculate nutritional values, making meal logging effortless. Integrated GPT-4 and Qwen-VL with a RAG system using FAISS vector search for real-time nutrition coaching, achieving a 70% reduction in API costs. Built on FastAPI with SQLite caching and an asyncio pipeline for smooth, personalized experiences.


BirdID: Identifying Birds by Sound

A scalable audio classifier trained on 50,000+ bird call samples. Engineered 206 parallel XGBoost classifiers to handle class imbalance, achieving 0.911 AUC-ROC. Accepted to CLEF 2025.

PyTorch, XGBoost, Librosa, Signal Processing, Audio Classification

A deep dive into audio classification using PyTorch and XGBoost across 50,000+ audio samples. Engineered log-mel and frequency features from raw audio and designed 206 parallel XGBoost binary classifiers to handle severe class imbalance. Achieved significant F1-score improvements for rare species through targeted data augmentation and an overall 0.911 AUC-ROC score. This work was accepted as a paper at CLEF 2025.


Human Drug Target Identification

A machine learning pipeline analyzing 20,000+ genes to identify potential drug targets. Reduced dimensionality by 60% using PCA and Node2Vec, achieving 89.4% recall with Random Forest.

Scikit-learn, NetworkX, PCA, Node2Vec, Bioinformatics

My first research venture and introduction to data science's real-world impact. Developed a machine learning pipeline to identify potential drug targets by analyzing 20,000+ genes, RNA-seq data, and protein networks. Used feature engineering techniques like PCA and Node2Vec to reduce dimensionality by 60%. The final Random Forest classifier achieved 89.4% recall and 75.4% accuracy, outperforming baseline classifiers by 7%.


PlaySense: Game Rule Retrieval System

A question-answering system that instantly resolves board game rule disputes. Built with RAG architecture to query 20+ game manuals with 91% precision.

RAG, LangChain, Ollama, ChromaDB, NLP

Ever been mid-game, arguing about a rule? As a board game enthusiast, I built PlaySense to solve exactly that. It's a RAG-powered question-answering system using LangChain that lets you query game manuals instantly. The system parses 20+ structured and unstructured manuals stored in ChromaDB, with automated embedding of 500+ pages using Ollama. Through optimization and a custom validation pipeline, I improved retrieval latency by 28% and achieved 91% QA precision.


RavNet: Artery-Vein Segmentation

A novel cascaded U-Net architecture for medical imaging that segments arteries and veins from retinal scans to aid diabetes detection. Achieved 0.74 sensitivity with a 2% improvement over baselines.

PyTorch, Deep Learning, Computer Vision, Medical Imaging, U-Net

A deep dive into medical imaging where I developed RavNet, a novel architecture with two cascaded U-Nets and three decoders in the second network. Built in PyTorch for artery-vein segmentation from retinal scans, helping identify conditions like diabetes. I curated and preprocessed 400+ retinal scans with advanced augmentations to enhance generalization and reduce overfitting. The model achieved 0.74 sensitivity and 0.98 specificity, outperforming U-Net baselines by 2%.
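
For readers less familiar with the two metrics quoted above, here is a minimal sketch of how sensitivity and specificity are typically computed for a binary segmentation mask. It is not the project's evaluation code. The function name and the toy masks are assumptions made for illustration, and the masks are made-up values rather than project results.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length binary masks.

    Both arguments are flat sequences of 0/1 pixel labels (an assumption;
    a real mask would be flattened first).
    """
    if len(predicted) != len(actual):
        raise ValueError("masks must have the same length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # vessel found
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # vessel missed
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # background kept
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false vessel

    # Sensitivity: share of true vessel pixels that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of background pixels correctly left unmarked.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy masks, invented for illustration only.
actual_mask = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_mask = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(predicted_mask, actual_mask)
```

In retinal images most pixels are background, so specificity tends to be high even for weak models; sensitivity is usually the harder number to move.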


PayLens: Salary Prediction & Equity Analysis

A data-driven platform revealing compensation insights from 50,000+ salary records. Identified a 12% gender pay gap through statistical analysis and built interactive Tableau dashboards.

Tableau, SQL, ETL, Regression, Statistical Testing

Ever wondered if you're being paid fairly? Built PayLens to provide data-driven compensation insights. Created an automated ETL pipeline to scrape 50,000+ salary records from Levels.fyi with 99%+ accuracy. Developed a regression model identifying the five primary drivers of compensation. Statistical analysis (t-tests, ANOVA) revealed a 12% gender pay gap after controlling for variables. Presented findings in an interactive Tableau dashboard to help people make informed career decisions.


SuicideWatch: Mental Health Visualization

An interactive dashboard exploring global suicide data to destigmatize mental health discussions. Built during COVID-19 to visualize temporal and demographic patterns for public health insights.

Streamlit, Pandas, Matplotlib, Seaborn, Data Viz

I developed this during the COVID-19 pandemic, when mental health became critically important. Coming from India, where the topic is often taboo, I wanted to build a visualization that conveys how crucial these conversations are. I curated and cleaned global suicide datasets from WHO and the World Bank, performed feature engineering and outlier handling, then built an interactive Streamlit dashboard that allows exploration by country, age, and gender. I also conducted statistical analysis to identify temporal and demographic patterns for actionable public health insights.


AnimeMate: Recommendation System

A content-based recommendation engine for anime built when existing platforms fell short. Uses TF-IDF and cosine similarity across 19,000 anime entries for precise recommendations.

TF-IDF, Cosine Similarity, BeautifulSoup, NLP, Web Scraping

As a huge anime fan unable to find good movie recommendation platforms, I created my own. Built a scalable content-based recommendation engine using TF-IDF vectorization and cosine similarity for precise anime recommendations. Scraped 19,000 anime entries using BeautifulSoup and tokenized synopses to power the model.
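
To make the TF-IDF and cosine similarity idea concrete, here is a minimal standard-library sketch of a content-based recommender. It is an illustration of the general technique, not AnimeMate's code. The titles, synopses, function names, and the smoothed IDF formula are all assumptions chosen for the example.

```python
import math
from collections import Counter

# Made-up synopses for illustration only (not the scraped dataset).
SYNOPSES = {
    "Title A": "a young pirate sails the sea searching for treasure",
    "Title B": "a pirate crew hunts treasure across the dangerous sea",
    "Title C": "high school students form a band and play music",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return {title: {term: weight}} using term frequency * smoothed IDF."""
    tokens = {title: tokenize(text) for title, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many synopses does each term appear?
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for title, words in tokens.items():
        counts = Counter(words)
        vectors[title] = {
            term: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(title, docs, top_k=1):
    """Rank every other title by similarity to the given one."""
    vectors = tfidf_vectors(docs)
    scores = [
        (other, cosine(vectors[title], vectors[other]))
        for other in docs
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


top_match = recommend("Title A", SYNOPSES)[0][0]
```

With these toy synopses, the two pirate stories share several weighted terms, so they rank closest to each other. A dataset of around 19,000 entries would more likely use a sparse-matrix library than plain dictionaries.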


PotterVerse: Character Network Analysis

A network visualization of Harry Potter character relationships. Used spaCy NER to identify and map character connections in a project where hobby and passion intersected.

NetworkX, NLP, spaCy, Network Viz, NER

As a lifelong Potterhead, I wanted to analyze the complex character relationships throughout the books. Used NetworkX to map character relationships from Harry Potter data. Scraped datasets and applied NLP with spaCy for Named Entity Recognition to identify and connect characters. This project represents where my hobby and passion for data intersected.
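
One common way to turn recognized character names into a network is to count how often two characters appear in the same sentence. The sketch below shows that idea using only the standard library. It is not the project's code: a fixed, hypothetical name list stands in for spaCy's NER output, a Counter stands in for a NetworkX graph, and the sentences are invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical name list standing in for entities an NER model would return.
CHARACTERS = ["Harry", "Ron", "Hermione", "Snape"]

# Invented sentences for illustration only.
SENTENCES = [
    "Harry and Ron ran to the common room.",
    "Hermione helped Harry and Ron with the essay.",
    "Snape glared at Harry.",
]


def cooccurrence_edges(sentences, characters):
    """Count how often each pair of characters shares a sentence."""
    edges = Counter()
    for sentence in sentences:
        # Strip basic punctuation so "Harry." still matches "Harry".
        words = set(sentence.replace(".", " ").replace(",", " ").split())
        present = sorted(name for name in characters if name in words)
        # Every unordered pair in the sentence becomes (or strengthens) an edge.
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


edges = cooccurrence_edges(SENTENCES, CHARACTERS)
```

Each `(pair, count)` entry can then be added to a graph as a weighted edge, so that frequently co-occurring characters are drawn with stronger connections.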


Publications

One Detector per Bird: A Scalable Binary Classification Approach for BirdCLEF+ 2025.

Suthraye, S., Das, C., Gaikwad, A., Senthilnathan, K., & Sawant, S. P. (2025). University of California, San Diego.

CLEF 2025, LNCS, Springer, Singapore.

RavNet: Conditioning Retinal Vessel Identification using a cascaded multi objective U-Net

Das, C., Swarnendu, G. (2024)

COMSYS 2024, LNNS, Springer, Singapore.

COVID-19 Drug Target Identification using Localization, Gene Expression & Node2vec

Das, C., Saha, S. (2024)

COMSYS 2023, LNNS, vol. 974, Springer, Singapore.

NCSML-HDTD: Network Centrality and Sequence-Based ML for COVID-19 Drug Target Discovery

Jha, S., Das, C., Saha, S. (2023)

COMSYS 2022, LNNS, vol. 690, Springer, Singapore.

Leadership & Achievements

Silver Medalist in the Computer Science Department
Recipient of the Director's Award for Outstanding Contribution
Best Paper Award at the Third International Conference on Frontiers in Computing and Systems (COMSYS 2022)
Presented research papers at 3+ international conferences
Organized 5+ ML seminars and conducted coding workshops for 40+ students to increase female participation in tech

Beyond AI

Some things that define me!

Art

Some unfinished artworks, as a painting is never complete

Life in Frames

People who matter

Things I Love

Dark Coffee

Think of coffee as my gradient descent optimizer. It minimizes my grogginess loss function and helps me converge toward productivity faster than any Adam optimizer ever could.

🥻

Sarees

Coming from India, nothing makes me feel more gorgeous than being in a saree. It's complex, beautiful, and somehow makes everything work elegantly together.

Feminism

Passionate about debugging society's biased algorithms, because equality isn't a feature, it's the foundation every system should be built on.

Get In Touch

Send a Message