Machine Learning Engineer with two years of experience at an AI consultancy startup. Expertise includes building statistical models such as random forests, LightGBM, ARIMA, KNN, and K-means, as well as deep learning models such as RNNs, CNNs, LSTMs, GANs, and reinforcement learning agents. Skilled in machine learning, statistics, problem solving, and programming.
Visvesvaraya Technological University, Computer Science & Engineering — Bachelor of Engineering (2018)
Presented free webinars on diverse machine learning topics. Consulted for a company on customer churn prediction. Designed a Data Science curriculum for beginners, '11 Projects to Data Science', on GitHub.
Served in various roles across multiple projects, including Data Scientist, Machine Learning Engineer, DevOps Engineer, Technical Architect, Project Lead, and Frontend Developer. Built customized statistical and deep learning models for clients. Architected and documented processes and code through UML diagrams.
Interned under Prof. Sameer Mathur, IIM Lucknow. Solved 3 Harvard Business case studies during the internship.
| Category | Tools |
|---|---|
| Programming Languages | Python, C, C++ |
| Data Storage Platforms and Database Management Systems | MySQL, PostgreSQL, MongoDB |
| Big Data Tools | PySpark, Kafka |
| Editors, Notebooks, and Visualization Tools | Vim, Emacs, Atom, VS Code, Jupyter Notebook |
| Cloud Platforms | GCP, AWS |
| Resource Management Tools | Docker, Kubernetes |
| Machine Learning and Deep Learning Frameworks | Scikit-learn, TensorFlow, Keras, PyTorch |
| CI/CD | Git, CircleCI |
| Web Deployment and APIs | Flask, Django, TensorFlow Serving |
Bayesian Machine Learning, KBAI, Graph Neural Networks, Graph Databases (Neo4j), Probabilistic Graphical Models
Built an unsupervised time-series model for stock prediction on the S&P 500 ETF. Used a deep Q-learning technique to adjust the weights of baskets of stocks, which yielded 18% growth in a year. Built API endpoints around the application for the frontend to access. Implemented a redundant three-tier architecture and hosted the model on GCP.
Tech Stack: TensorFlow, Flask, GCP
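The project above used a deep Q-network; as a minimal sketch of the underlying idea, the tabular Q-learning update below adjusts action values for portfolio-weight moves. The states, actions, and reward are hypothetical stand-ins, not the project's actual formulation.

```python
# Toy tabular Q-learning sketch of the portfolio-weighting idea.
# States, actions, and rewards here are illustrative placeholders.

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q

# Hypothetical coarse market regimes; actions shift weight toward a basket.
q = {s: {"increase": 0.0, "decrease": 0.0, "hold": 0.0}
     for s in ("bull", "bear", "flat")}
q = q_update(q, "bull", "increase", reward=0.02, next_state="flat")
```

In the deep variant, the Q-table is replaced by a network that maps market features to action values, but the update rule shown is the same.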
Engineered the client's big data, which contained duplicates, reducing its size 5x by applying database normal forms. Converted their R codebase to production-ready Python code, built a recommendation engine using LightGBM, and wrote regression tests for it. Hosted the software on the client's cloud.
Tech Stack: Python, R, Scikit-learn
Automated the process of reading values from a digital copy of a semiconductor design architecture and populating them in Excel. Built intelligence into the testing process, which had previously taken two weeks per design when done manually. Built a simple UI in the Django web framework. Used Docker for OS-level virtualization to deliver the software packages in containers.
Tech Stack: OpenCV, Tesseract, Dask, Numba, Pandas, NumPy, Django, Docker
Collected, cleaned, analyzed, and interpreted extensive unstructured data on the company's various products. Derived insights from the data through visualization. Built various forecasting models, including simple linear regression, ARIMA, and CNN.
Tech Stack: Plotly, Matplotlib Basemap, Scikit-learn, pyramid-ARIMA, TensorFlow
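Of the forecasting models listed above, the simplest baseline is a linear trend fit. A stdlib-only sketch of ordinary least squares on a univariate series (the data below is illustrative, not project data):

```python
# Fit y = a + b*t by ordinary least squares on a univariate time series,
# then produce a one-step-ahead forecast. Demo data is illustrative.

def fit_linear_trend(series):
    n = len(series)
    t = range(n)
    t_mean = sum(t) / n
    y_mean = sum(series) / n
    b = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, series)) \
        / sum((ti - t_mean) ** 2 for ti in t)
    a = y_mean - b * t_mean
    return a, b

series = [10.0, 12.0, 14.0, 16.0, 18.0]   # perfectly linear demo data
a, b = fit_linear_trend(series)
forecast = a + b * len(series)            # one-step-ahead forecast
```

ARIMA and CNN models extend this baseline to handle autocorrelation and nonlinear structure that a straight trend line cannot capture.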
Generated data based on the schema provided and built XGBoost and WTTE-RNN models for churn prediction. Wrote connectors for Amazon Redshift to train the model. Hosted the model on AWS.
Tech Stack: Scikit-learn, TensorFlow, PostgreSQL, AWS, Airtable
Built an MTCNN model for face recognition for a client use case. Integrated and deployed the model with their existing platform for hassle-free operation.
Tech Stack: Keras, Flask
The use case was to monitor milk storage units using data collected from IoT sensors. During data analysis, we found that peaks in the temperature data corresponded to the milk being filled. Wrote a peak-detection script to monitor the storage unit.
Tech Stack: Plotly, SciPy
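The project used SciPy for peak detection (`scipy.signal.find_peaks` provides this with many options); a minimal stdlib sketch of the same idea, flagging strict local maxima above a threshold, with illustrative sample data:

```python
# Threshold-based peak detection on a temperature series: an index is a
# peak if it is a strict local maximum at or above `height`.
# The sample readings below are illustrative, not sensor data.

def find_peaks(values, height):
    """Return indices of strict local maxima at or above `height`."""
    return [i for i in range(1, len(values) - 1)
            if values[i] >= height
            and values[i] > values[i - 1]
            and values[i] > values[i + 1]]

temps = [4.0, 4.1, 9.5, 4.2, 4.0, 10.1, 4.3]  # spikes when milk is filled
peaks = find_peaks(temps, height=8.0)
```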
Built a deep LSTM Siamese network for text similarity. Used pre-trained word embeddings to capture semantic similarity and Levenshtein distance to measure string distance.
Tech Stack: NumPy, TensorFlow, Gensim, NLTK, Scikit-learn
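The string-distance component mentioned above is the classic Levenshtein edit distance; a standard dynamic-programming implementation (this is the textbook algorithm, not the project's code):

```python
# Levenshtein distance via dynamic programming: minimum number of
# single-character insertions, deletions, and substitutions to turn a into b.

def levenshtein(a, b):
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (ca != cb))) # substitution (free if equal)
        prev = curr
    return prev[-1]

d = levenshtein("kitten", "sitting")   # classic textbook example
```

In the similarity pipeline, this surface-level distance complements the embedding-based semantic score.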
Used the Google Cloud Vision API to extract data from invoice PDFs. Retrieved only the required fields using regular expressions.
Tech Stack: Google Cloud Vision API, Regex
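A sketch of the regex post-processing step: once OCR text comes back from the Vision API, pull out only the fields of interest. The field labels and sample text below are hypothetical, not the client's invoice format.

```python
# Extract specific fields from OCR'd invoice text with regular expressions.
# The invoice layout and field names here are hypothetical examples.
import re

ocr_text = """Invoice No: INV-2043
Date: 12/08/2020
Total Due: $1,284.50"""

invoice_no = re.search(r"Invoice No:\s*(\S+)", ocr_text).group(1)
total = re.search(r"Total Due:\s*\$([\d,]+\.\d{2})", ocr_text).group(1)
```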
Vishwakarma is an open-source, pip-installable package for visualizing high-quality, journal-standard images of Probability Density Functions, Probability Mass Functions, and Probabilistic Graphical Models. Architected and spearheaded the entire server-side development process. The output can be downloaded in LaTeX, PDF, or PNG format.
Presented a free webinar in association with Nowalabs on August 15th, 2020. New York City Taxi Fare Prediction is one of the popular beginner-level problems on Kaggle. The goal of the challenge is to predict the fare of a taxi trip given the pickup and drop-off locations, the pickup datetime, and the number of passengers traveling. The webinar included -
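A standard engineered feature for this fare-prediction problem is the great-circle (haversine) distance between pickup and drop-off coordinates; a sketch, with illustrative NYC coordinates rather than competition data:

```python
# Haversine great-circle distance between two latitude/longitude points,
# a common engineered feature for the taxi-fare problem.
# The coordinates below are rough illustrative NYC points.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, r=6371.0):
    """Distance in km on a sphere of radius r (Earth's mean radius by default)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * r * asin(sqrt(h))

# e.g. roughly Midtown Manhattan to JFK Airport
dist = haversine_km(40.7580, -73.9855, 40.6413, -73.7781)
```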
Presented a free webinar in association with Nowalabs on September 5th, 2020. Rock-Paper-Scissors is a hand game usually played between two people, in which each player simultaneously forms one of three shapes with an outstretched hand. Built an RPS game from scratch, live, in under 90 minutes using OpenCV and MobileNetV2. The webinar included -