Hi there, I'm

Rupanshu Kapoor

A passionate data scientist specializing in Machine Learning and Artificial Intelligence. With extensive experience in Python, SQL, and deep learning, I thrive on solving complex problems and turning data into actionable insights.


I have a robust background in data science, honed through hands-on projects and active participation in data science platforms like Kaggle and Hugging Face. My journey has led me to explore various domains, from automating ML workflows and task-specific object detection to creating RAG chatbots. I am always eager to tackle new challenges and contribute to the data science community.

Resume

Experience


Data Science Intern, Imarticus

May 24 - Present

Developed an advanced resume parsing tool using NLP for text extraction and LLMs for context-aware information extraction, enhancing candidate profile accuracy by 30%.

Implemented grammar and spelling error detection with intelligent suggestions, improving the quality and professionalism of resumes by 40%.


Data Engineer, Pratham Software

July 19 - Dec 20

Designed and developed a CPQ (Configure Price Quote) tool tailored to specific customer requirements, enabling efficient and customized pricing strategies.

Created an automated ETL pipeline using PySpark on Azure Data Factory (ADF), reducing delivery times by 30%.

My Skill Set

Latest Projects

SnapText: AI Image Chatbot

Designed a RAG AI chatbot with secure user authentication via Firebase and text extraction from handwritten images and PDFs using Google Document AI, achieving 95% text recognition accuracy. Embeddings of the extracted text are stored in ChromaDB, improving search efficiency and reducing query response time by 40%.

Face Mask Detection

Developed a face mask detection application for public safety systems during pandemics like COVID-19, using a custom CNN model trained on a dataset of 12,000 images to achieve 99% accuracy. Enhanced the model's performance and robustness with data augmentation, and deployed it with a user-friendly interface via Streamlit.

CVGuru: AI Resume Review Assistant

CVGuru is an AI resume review assistant that parses resumes using natural language processing and the Gemini LLM, then provides recommendations based on the extracted data.

Decision Tree Visualizer

An interactive Decision Tree Visualizer app that lets you customize the parameters of a Decision Tree model and observe its performance on the Iris dataset by visualizing the decision boundary and the decision tree graph.

DataFlow Pro

DataFlow Pro is a Python application that automates building, tuning, and evaluating machine learning models based on JSON provided in RTF/JSON/TXT file format. The application follows a structured flow: read the JSON file, extract dataset information, transform features, split data, build and tune models, and evaluate their performance.

IPL Dashboard

Groups customers by purchasing behavior using K-means clustering to better understand customer preferences.

SQL Case Studies

An interactive Tableau dashboard that analyzes sales, profits, and total units sold, giving an in-depth view of the sales performance of Adidas products in the US.

COVID-19 Dashboard

This dashboard visualizes the latest data on confirmed cases, recovery rate, and number of deaths. Users can explore the data by filtering by province, time range, and other parameters.

New York City Taxi Fare

A Streamlit web app for stroke prediction, where users can enter patient data such as age, gender, blood sugar levels, body mass index (BMI), and other risk factors to obtain a stroke risk prediction.

Certifications


Stanford and DeepLearning.AI: Advanced Learning Algorithms


Stanford and DeepLearning.AI: Supervised Machine Learning


IBM Supervised Machine Learning: Regression