Projects

Machine Learning on IoT Data: Predicting Battery Health Status and User Type

• Built a predictive model that flags early glitches and anomalies in battery health, enabling early detection and prevention

• Applied supervised and unsupervised methods, such as K-Nearest Neighbors and K-Means clustering, to label the health status of batteries

• Developed Tableau reports and dashboards from MongoDB to showcase Exploratory Data Analysis (EDA) and anomaly detection

• Performed predictive analytics to identify global and contextual anomalies in faulty batteries, supporting early detection and prevention
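
A minimal sketch of the K-Nearest Neighbors labeling step described above. The feature choice (voltage, temperature), the sample readings, and the label names are hypothetical, and the training labels are assumed to come from an earlier clustering step; this is not the project's actual pipeline.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (voltage, temperature) readings; real features should be scaled first.
readings = [(4.1, 25.0), (4.0, 26.0), (4.2, 24.0),
            (3.2, 45.0), (3.1, 47.0), (3.3, 44.0)]
# Hypothetical labels, assumed to come from a prior K-Means clustering step.
labels = ["healthy", "healthy", "healthy", "degraded", "degraded", "degraded"]

prediction = knn_predict(readings, labels, (3.25, 46.0), k=3)
```

Here a new reading is assigned the status shared by most of its nearest labeled neighbors, which is one plausible way the unsupervised and supervised steps could be combined.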

Intelligent Transportation - Machine Learning in Taxi Industry

• Analyzed over 2.4 million taxi records to predict demand for 25 zones over a temporal resolution of 24 hours

• Explored a range of supervised and unsupervised learning models, including Regression, Random Forest, XGBoost, and Multi-Layer Perceptron (MLP), using the Keras and TensorFlow libraries to select the best-performing model
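
A minimal sketch of how raw taxi records could be turned into a per-zone demand table before modeling. It assumes each record is a (zone index, ISO timestamp) pair and that demand is counted per hour of day; the record layout, the sample values, and the `hourly_demand` helper are hypothetical.

```python
from collections import Counter
from datetime import datetime

def hourly_demand(records, num_zones=25):
    """Count pickups per (zone, hour of day); returns a num_zones x 24 matrix."""
    counts = Counter(
        (zone, datetime.fromisoformat(timestamp).hour)
        for zone, timestamp in records
    )
    return [[counts[(zone, hour)] for hour in range(24)]
            for zone in range(num_zones)]

# Hypothetical sample records: (zone index, pickup timestamp).
records = [
    (0, "2000-01-01T08:15:00"),
    (0, "2000-01-01T08:45:00"),
    (3, "2000-01-01T17:05:00"),
]
demand = hourly_demand(records)
```

Each row of `demand` is one zone and each column one hour, which is a convenient target shape for regression or MLP models.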

Real-Time Billing System for 4G/5G Network

• End-to-end billing solution implemented to support 4G/5G networks and adaptable to other domains

• Built on a microservices architecture using technologies such as Kafka, Spark, HDFS, MongoDB, and Tableau

• Developed Tableau-based dashboards from MongoDB and Apache Cassandra databases for real-time analysis of events

Machine Learning Techniques for Stock Market Prediction

• Built robust machine learning models using Long Short-Term Memory (LSTM) networks, Polynomial Regression, and Sentiment Analysis

• Defined reliable financial datasets and performed Exploratory Data Analysis, Data Wrangling, and Feature Engineering for data mining

• Implemented sentiment analysis and NLP techniques using the NLTK package to identify and explain market factors affecting stock value

• Performed predictive analysis of stock market data to forecast the market value of stocks and generated interactive visualizations

• Designed a profitability analysis algorithm and simulated profitability for all models using the Matplotlib and Seaborn libraries

Clustering and Segmenting Neighborhoods in Major Cities

• Performed analysis on the Toronto and New York City datasets

• Extracted the list of Toronto and New York City neighborhoods by scraping the web

• Used the Geocoder library to fetch latitude and longitude coordinates for each neighborhood, which made it possible to map the neighborhoods using the folium API

• Used the Foursquare location API to extract the list of venues surrounding each of the neighborhoods

• Clustered neighborhoods with K-means, based mainly on the most common venues in each neighborhood (stores, Italian restaurants, gyms, parks, ...)

• Grouped neighborhoods into clusters according to their similarity in most common venues
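
A minimal sketch of the clustering step described above: each neighborhood is represented by the relative frequency of its venue categories, then grouped with a plain K-means loop. The neighborhood names, venue lists, and helper functions are hypothetical, and a library implementation would normally replace the hand-written loop.

```python
from collections import Counter

def venue_frequencies(venues, categories):
    """Relative frequency of each category among a neighborhood's venues."""
    counts = Counter(venues)
    total = len(venues) or 1
    return [counts[category] / total for category in categories]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [list(point) for point in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(point, centroids[j]))
                  for point in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(column) / len(members) for column in zip(*members)]
    return labels

# Hypothetical venue categories per neighborhood.
neighborhood_venues = {
    "A": ["Park", "Park", "Gym"],
    "B": ["Italian Restaurant", "Store", "Italian Restaurant"],
    "C": ["Park", "Gym", "Gym"],
    "D": ["Store", "Italian Restaurant", "Store"],
}
categories = sorted({c for venues in neighborhood_venues.values() for c in venues})
names = list(neighborhood_venues)
vectors = [venue_frequencies(neighborhood_venues[name], categories) for name in names]
clusters = dict(zip(names, kmeans(vectors, k=2)))
```

In this toy data the park-and-gym neighborhoods end up in one cluster and the store-and-restaurant neighborhoods in the other.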