Rishi
I'm Rishi!

I do Code & Chill 🍿

Passionate about the magic of machine learning: crafting predictive models, optimizing data processes, and exploring the boundaries of deep learning. Fascinated by technology that makes life easier and more efficient.

ABOUT ME

I enjoy crafting predictive models, crunching statistical numbers, and playing around with patterns and techniques as I jump into the exciting world of data and algorithms. Isn't it amazing to watch data and algorithms come together to create something exciting?

My ride on this rollercoaster involves tweaking data engineering processes and ETL pipelines, and taking on the twists and turns of deep learning. With a vow to keep on learning, I'm on a quest through the ever-changing landscapes of machine learning, turning ideas into real-deal answers and pushing the limits of what we can pull off! 🚀✨

EXPERIENCE

3BiGS / Machine Learning Engineer

JUL 2023 - PRESENT, HYDERABAD

➥ Developed a language model to provide insightful responses, implementing the end-to-end process from data curation, model training, and instruction fine-tuning through to evaluation. Achieved 79% accuracy in the healthcare oral domain. For data curation, developed an ETL pipeline, used REST APIs, and conducted statistical analysis. For model development, used PyTorch to implement a Transformer architecture with RoPE embeddings, bitsandbytes for model quantization, and Docker for containerization, then deployed on SageMaker.

➥ Implemented semantic search using Retrieval Augmented Generation (RAG) to retrieve relevant research papers accurately and efficiently, speeding up research. Generated embeddings with SentenceTransformer from the HuggingFace API, stored the vectors in a persistent ChromaDB index, retrieved documents using advanced metadata filtering, deployed on an EC2 instance, and integrated S3 storage to access the vector store.

➥ Performed cluster modeling for a recommendation system that streamlines the research discovery process. Ran statistical analysis on dense regions in the data and visualized them with Matplotlib. To analyze and improve performance, reduced dimensionality with Principal Component Analysis (PCA) and tuned the attributes and hyperparameters while optimizing memory usage. Used the DBSCAN estimator from scikit-learn to obtain the best estimator. The model suggests relevant papers to users based on dense regions in the data, making it a valuable tool for medical research.

Neural Networks, Optimization, Natural Language Processing, Scikit-learn, Retrieval Augmented Generation, Docker, Transformers, SageMaker, GitHub

Hiper / Data Engineer

AUG 2022 - DEC 2022, BANGALORE

➥ Engineered a 17% increase in data accuracy.

➥ Conducted multivariate analysis and hypothesis tests to optimize vehicle fuel consumption through predictive modeling.

➥ Streamlined data analysis by developing user-friendly fuel optimization dashboards using Streamlit and ETL tools, slashing hours of work and increasing productivity.

Multivariate Analysis, Predictive Modeling, Pandas, Matplotlib, Selenium, Streamlit, AWS, GitHub, REST APIs

Hevodata / CXE

AUG 2021 - AUG 2022, BANGALORE

➥ Enhanced user experience by optimizing no-code ETL pipelines through real-time and batch processing from databases and cloud services, largely for data lakes and data warehouses.

➥ Implemented on-demand, customised data cleaning pipelines that ensured data quality and contributed to a 12-18% reduction in data errors.

➥ Collaborated with cross-functional teams on bug fixes and enhancements using SaaS tools such as JIRA and Confluence.

Python, Data Warehouse, Data Lake, ETL Pipelines, Selenium, SQL, NoSQL, JIRA, Confluence

PROJECTS

BLOGS

Read their stories

GitHub, Medium, LinkedIn