Joan Of Arc Xavier

Data Scientist | Machine Learning | Deep Learning

About Me:

I completed my PhD in Electrical Engineering, during which I focused on predictive modelling of complex nonlinear systems from many messy datasets. I also completed coursework in Machine Learning and Statistics for Data Science, and conducted research that involved Exploratory Data Analysis (EDA) and Machine Learning algorithms. This combination of academic training and hands-on research sparked my interest in transitioning from academia to the data science industry. I found the process of working with raw data, uncovering patterns, and transforming numbers into meaningful insights and actionable business recommendations fascinating. I began working intensively with Python, SQL, Excel, and Tableau, and this journey naturally shifted my career trajectory toward becoming a data scientist, where I can apply my skills in data analysis, machine learning, and statistical modeling to solve real-world problems.

Skills & Tools

Recognitions & Certifications

Technical Expertise

Notable Research in Data Science (Published in High-Impact-Factor Journals):

Time Series Forecasting of a Real-Time pH Neutralization Process:

Collected around 10,705 samples of real-time data (pH, base flow rate, acid flow rate) from a complex nonlinear pH neutralization process. The nonlinear dynamics of the pH titration curve are predicted using two deep learning models: Temporal Convolutional Networks (TCN) and LSTM networks.
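Sequence models such as TCN and LSTM are usually trained on fixed-length windows of past samples paired with a future target. The block below is a minimal sketch of that windowing step, not the code used in the research: the function name `make_windows`, the window length, and the sample pH values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int, horizon: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Split a 1-D series into (input window, future target) pairs."""
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs, targets = [], []
    # The last usable window must leave `horizon` samples after it.
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Hypothetical pH readings, for illustration only (not the research data).
ph_readings = [3.1, 3.4, 4.0, 5.2, 7.0, 8.9, 9.6]
X, y = make_windows(ph_readings, window=3)
```

Each row of `X` holds three consecutive readings and the matching entry of `y` is the reading that follows them, which is the shape a one-step-ahead forecaster typically expects.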

Key Highlights:

Regression Analysis using kSINDYc - An ML Approach:

a. ExpiryGenie – Smart Food Expiry Tracker (Deployed on AWS with HTTPS)

Households and small food businesses often face avoidable food waste due to forgotten expiry dates. There was a need for a smart, accessible solution to track food shelf life, provide expiry alerts, and help users reduce waste and save money.

Why This Project Matters

• Built an intuitive multi-tab dashboard to add food items via manual entry, text, voice, and image/receipt scanning using Gemini AI and OCR.
• Integrated AWS S3 to store food records and user credentials securely.
• Provisioned and configured AWS EC2 Ubuntu instance with Python 3.12.
• Used Elastic IP for stable access and Streamlit with nohup to run persistently in the background.

Business Insights

• Food waste reduction: Enables early expiry alerts based on AI-predicted shelf life or receipt scan input.
• Cost savings: Tracks items used on time, displaying money saved per user.
• User behavior: Data on frequently wasted categories can guide personalized reminders or donation prompts.
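The early-alert idea can be sketched as a simple date comparison. This is an illustrative sketch only, not ExpiryGenie's actual code: the function name `items_expiring_soon`, the three-day threshold, and the sample pantry are assumptions.

```python
from datetime import date
from typing import Dict, List


def items_expiring_soon(
    expiry_dates: Dict[str, date], today: date, alert_days: int = 3
) -> List[str]:
    """Return item names expiring within `alert_days`, soonest first.

    Items that have already expired (negative days left) are excluded.
    """
    soon = [
        (expiry, name)
        for name, expiry in expiry_dates.items()
        if 0 <= (expiry - today).days <= alert_days
    ]
    return [name for _, name in sorted(soon)]


# Hypothetical pantry, for illustration only.
pantry = {
    "milk": date(2025, 3, 12),
    "yogurt": date(2025, 3, 11),
    "rice": date(2025, 9, 1),
}
alerts = items_expiring_soon(pantry, today=date(2025, 3, 10))
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the check deterministic and easy to test.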

b. Real-World Stock Forecasting Dashboard with Streamlit Cloud

An interactive data science web app for stock price analysis and prediction using real-time data from Yahoo Finance. Built with Python, Streamlit, and popular machine learning and deep learning libraries, this project helps users analyze market trends, explore financial statements, and forecast future prices using models like ARIMA, SARIMA, and LSTM. GitHub

Why This Project Matters

While many stock prediction projects focus on just one model, this project uniquely combines multiple forecasting models, classification algorithms, EDA, feature engineering, time series models and deep learning in a single interactive app.

c. Bitcoin Price Prediction using Time-Series Forecasting

Tools Used: NumPy, Pandas, datetime, Matplotlib, Seaborn, Statsmodels and SciPy

The primary objective of this project is to compare the accuracy of two models, a Long Short-Term Memory (LSTM) network and an ARIMA model, in predicting the Bitcoin price in USD from time-series data. The dataset (Sep 2014 - March 2025) was collected from Yahoo Finance. Here are the questions I was interested in answering:

Key Highlights:

d. Predictive Analytics for Employee Turnover Reduction with ML

Key Highlights:

e. EDA and Hypothesis Testing on Marketing Campaign Dataset

Key Highlights:

f. SQL Project using Paintings & Museum Dataset:

Analyzed museum inventory data with SQL to identify unexhibited paintings and underutilized museum spaces. Found that 15% of artworks were not displayed, highlighting opportunities to improve visitor engagement, and provided insights to optimize art rotations and enhance museum profitability.
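The kind of query behind this analysis can be sketched with Python's built-in `sqlite3` module. The schema below is an assumption made for illustration (hypothetical `museum` and `work` tables, with a NULL `museum_id` meaning the painting is not displayed), and the rows are made-up sample data, not the project's dataset or its results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE museum (museum_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE work (
        work_id INTEGER PRIMARY KEY,
        name TEXT,
        museum_id INTEGER REFERENCES museum(museum_id)  -- NULL = not displayed
    );
    INSERT INTO museum VALUES (1, 'Museum A'), (2, 'Museum B');
    INSERT INTO work VALUES
        (10, 'Painting W', 1),
        (11, 'Painting X', NULL),
        (12, 'Painting Y', 1),
        (13, 'Painting Z', NULL);
    """
)

# Paintings that are not assigned to any museum.
unexhibited = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM work WHERE museum_id IS NULL ORDER BY name"
    )
]

# Museums that currently hold no paintings (candidate underutilized spaces).
empty_museums = [
    row[0]
    for row in conn.execute(
        """
        SELECT m.name
        FROM museum AS m
        LEFT JOIN work AS w ON w.museum_id = m.museum_id
        WHERE w.work_id IS NULL
        ORDER BY m.name
        """
    )
]

# Percentage of all paintings that are not displayed.
share_not_displayed = conn.execute(
    "SELECT 100.0 * SUM(museum_id IS NULL) / COUNT(*) FROM work"
).fetchone()[0]
```

The `LEFT JOIN ... WHERE w.work_id IS NULL` pattern is a common way to find parent rows without any matching child rows.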

Key Highlights:

Project/Course Certifications:

Honors & Awards

Ph.D. in Electrical Engineering | ML & Deep Learning | Data Science Research

In my PhD research at Anna University (2018-2023), I worked on a challenging multi-disciplinary topic, ‘Nonlinear system Identification, nonlinearity quantification and control of Nonlinear Systems’, where I integrated concepts of nonlinear system identification from Electrical Engineering with time series forecasting (ARIMA models) and analysis using machine learning and deep learning algorithms, and applied them to control engineering problems. This experience shaped my skills in analysing large raw datasets, applying rigorous optimization solvers, tuning hyperparameters, and developing new approaches to complex real-world problems. It also deepened my interest in machine learning and deep learning algorithms and motivated my transition into Data Science.

Research Publications

I have published around 10 peer-reviewed articles in Science Citation Indexed journals. Of these, three are first-author articles on machine learning based system identification, published in journals with a high impact factor. My publications have over 60 total citations and an h-index of 5. You can view my publications on my Google Scholar page.

Collaborative Projects and Scientific Writing

As a full-time researcher at Anna University, I designed and executed multiple projects, collaborated with faculty at Central Research labs, and mentored master's students on advanced topics in control systems, machine learning and deep learning, which resulted in publications in high-impact-factor journals.

GitHub Projects:

Data Science Projects | ML Projects | For Python Projects