Projects

Machine learning, deep learning, and full-stack projects emphasizing LLMs, RAG systems, predictive modeling, and data pipelines for real-world applications.

LegaiAI: Knowledge Retrieval and RAG with Pretraining and Instruction Fine-Tuning

Sentence Transformers · Qwen · AWS (S3) · FAISS · data streaming · RAG · LangChain · Retrieval

Implemented an embedding and retrieval system using Sentence Transformers and FAISS with scalable vector search, checkpointing, and AWS S3 data pipelines, supporting a RAG architecture on 40 GB datasets.
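The retrieval step can be illustrated with a minimal, dependency-free sketch. This is not the project's code: it assumes the embeddings already exist as plain lists of floats and uses a brute-force cosine-similarity scan, where the real system would get vectors from Sentence Transformers and delegate the search to a FAISS index. The names `normalize` and `top_k` and the toy vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def top_k(query, corpus, k=2):
    """Return the k most similar corpus entries as (index, score) pairs."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(corpus):
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, idx))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy 2-dimensional "embeddings" (assumed data, for illustration only).
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k([1.0, 0.1], corpus, k=2)
```

A linear scan like this costs time proportional to the corpus size per query, which is why a dedicated vector index is the usual choice at the 40 GB scale described above.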

Created an LLM pipeline including pretraining, instruction fine-tuning, and agentic orchestration workflows, enabling AI agents capable of reasoning and multi-step task execution using LangChain.

Designed a modular data lifecycle system (ingestion, retrieval, optimization, serving) with a fault-tolerant, scalable architecture, improving system efficiency and retrieval performance for ML workflows.

Instruction-Tuned GPT-2 for Task-Oriented Text Generation

Python · PyTorch · LLaMA 3 · Hugging Face Transformers · GPT2Tokenizer · custom training loop · AdamW optimizer · Pandas · JSON · tqdm · Scikit-learn

Developed an end-to-end instruction fine-tuning pipeline using GPT-2 Medium to generate task-specific responses in an Alpaca-style format. The project involved preprocessing a structured instruction dataset, converting it into prompt-response format, and training a causal language model to follow natural language instructions.
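The conversion to prompt-response format might look roughly like the sketch below. It follows the commonly used Alpaca layout (`### Instruction:`, optional `### Input:`, `### Response:`) with a single simplified preamble. The function name `format_alpaca`, the dictionary keys, and the sample entry are assumptions, not taken from the project.

```python
def format_alpaca(entry):
    """Build an Alpaca-style prompt from a dict with 'instruction' and optional 'input'."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    # The input section is only added when the entry actually carries one.
    if entry.get("input"):
        prompt += f"\n\n### Input:\n{entry['input']}"
    return prompt + "\n\n### Response:\n"


# Hypothetical dataset entry, for illustration only.
example = {
    "instruction": "Convert the word to uppercase.",
    "input": "hello",
    "output": "HELLO",
}
prompt = format_alpaca(example)
# During training, the target response is appended to the prompt.
full_text = prompt + example["output"]
```

At inference time only `prompt` is fed to the model, and the generated continuation is read as the response.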

Implemented custom PyTorch datasets and dynamic batching with a tailored collate function to handle variable-length sequences efficiently. Fine-tuned the model using supervised learning and evaluated its performance on a held-out test set using both loss metrics and LLM-based evaluation.
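The idea behind dynamic batching is to pad each batch only to the length of its longest sequence rather than to a global maximum. Below is a minimal sketch in plain Python lists; a real PyTorch collate function would return tensors instead. The function name `collate`, the pad id of 0, and the dictionary layout are assumptions. The value -100 matches the default `ignore_index` of PyTorch's cross-entropy loss, so padded positions would not contribute to the loss.

```python
def collate(batch, pad_id=0, ignore_index=-100):
    """Pad a batch of token-id lists to the longest sequence in that batch."""
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask, labels = [], [], []
    for seq in batch:
        pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * pad)
        # 1 marks real tokens, 0 marks padding.
        attention_mask.append([1] * len(seq) + [0] * pad)
        # Labels mirror the inputs; padded positions are masked out of the loss.
        labels.append(list(seq) + [ignore_index] * pad)
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


# Two sequences of different lengths (toy token ids).
out = collate([[5, 6, 7], [8, 9]])
```

The one-token shift between inputs and targets needed for next-token prediction is left to the training loop in this sketch.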

Integrated external evaluation pipelines using LLaMA (via Hugging Face and Ollama) to score model outputs on correctness, relevance, and completeness. Automated generation, storage, and evaluation of responses, creating a scalable workflow for benchmarking instruction-following models.

Ethereum Price Prediction & RL-Based Automated Trading Bot

Python · PyTorch · NumPy · Pandas · Scikit-learn · Reinforcement Learning · Coinbase API · Django · JavaScript · CUDA

Developed an end-to-end system combining LSTM + ANN models for Ethereum (ETH) price prediction with a Reinforcement Learning (RL)-based trading agent. The pipeline includes real-time data collection, preprocessing, feature engineering, sequence modeling, and environment simulation for training trading strategies.

Extended the project into a fully automated trading bot that executes buy/sell decisions based on user-defined strategies. Integrated with the Coinbase API for real-time market data, order execution, and secure wallet interactions, enabling live trading with actual funds.

Deployed a Django-based dashboard to visualize price trends, model predictions, RL-driven trading actions, and portfolio performance, making the system accessible and interpretable for users.

Ethereum Price Prediction with LSTM + ANN & Hyperparameter Optimization

Python · PyTorch · NumPy · Pandas · Scikit-learn · CoinGecko API · Django · JavaScript · CUDA

Built a hybrid LSTM + ANN forecasting system to predict Ethereum (ETH) price movements using real-time market data from the CoinGecko API. The pipeline handles data collection, preprocessing, feature engineering, and sequence modeling.
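The preprocessing and sequence-modeling steps typically reduce to scaling the price series and slicing it into fixed-length windows, each paired with the next value as its target. The sketch below shows that pattern under stated assumptions: the function name `make_windows`, the window length, and the price values are illustrative, and min-max scaling is one common choice rather than a confirmed detail of the project.

```python
def make_windows(prices, window=3):
    """Min-max scale a price series and cut it into (window, next value) pairs."""
    lo, hi = min(prices), max(prices)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat series
    scaled = [(p - lo) / span for p in prices]
    inputs, targets = [], []
    for i in range(len(scaled) - window):
        inputs.append(scaled[i:i + window])   # model input sequence
        targets.append(scaled[i + window])    # value to predict
    return inputs, targets


# Made-up prices, for illustration only.
prices = [100.0, 110.0, 120.0, 130.0, 140.0]
X, y = make_windows(prices, window=3)
```

Scaling over the whole series, as done here for brevity, leaks information from the future into the training data; a production pipeline would fit the scaler on the training split only.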

Deployed a Django-based dashboard that visualizes price history, model predictions, and prediction-driven buy/sell indicators, making the system accessible to non-technical users.

Sentiment Analysis & Predictive Modeling on Twitter Data

Python · TensorFlow · Keras · NumPy · Pandas · Matplotlib · Scikit-learn

Developed a CNN model using the Keras Functional API to predict tweet sentiment scores, combining text embeddings with engagement features. Designed an end-to-end pipeline from raw data ingestion to cleaned, vectorized inputs.

Aggregated sentiment by candidate and time window to forecast public opinion trends and demonstrate how NLP results can support decision-making in campaigns and analysis.
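Aggregating by candidate and time window can be sketched as a simple group-by. The example assumes each tweet is a dict with `candidate`, `timestamp` (ISO 8601), and `score` keys, and it uses one calendar day as the window. The field names, the window size, and the sample records are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def aggregate(tweets):
    """Average sentiment score per (candidate, day) bucket."""
    buckets = defaultdict(list)
    for tweet in tweets:
        day = datetime.fromisoformat(tweet["timestamp"]).date().isoformat()
        buckets[(tweet["candidate"], day)].append(tweet["score"])
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}


# Made-up records, for illustration only.
tweets = [
    {"candidate": "A", "timestamp": "2024-01-01T09:00:00", "score": 0.5},
    {"candidate": "A", "timestamp": "2024-01-01T18:30:00", "score": 1.0},
    {"candidate": "B", "timestamp": "2024-01-01T12:00:00", "score": -0.5},
    {"candidate": "A", "timestamp": "2024-01-02T08:00:00", "score": 0.0},
]
daily = aggregate(tweets)
```

The resulting per-day averages form one time series per candidate, which is the shape a trend forecast would consume.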

VGG16-based Lung Cancer Classification

Python · PyTorch · torchvision · NumPy · Scikit-learn

Fine-tuned a VGG16 model on CT scan images to classify multiple lung cancer types, using advanced augmentation and normalization strategies to improve generalization.

Achieved 99.89% training accuracy and 97.27% test accuracy with a macro AUC of 99.96%, and evaluated performance using confusion matrices, precision–recall curves, and F1-based analysis. Used ARIMA on epoch-wise accuracy to forecast future model performance.

LSTM-based Word Prediction

Python · PyTorch · NumPy · Jupyter Notebook

Created an LSTM language model to predict the next word in a sentence from raw course content. Implemented a full preprocessing pipeline including tokenization, numerical encoding, and padding.
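A preprocessing pipeline of this kind can be sketched in a few functions: split text into tokens, map tokens to integer ids, then turn every prefix of the sequence into a left-padded training example whose target is the following word. The helper names, the `<pad>` and `<unk>` tokens, and the sample sentence are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(tokens):
    """Assign each distinct token an integer id, reserving 0 and 1."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def make_examples(tokens, vocab):
    """Create (left-padded prefix, next-word id) pairs for next-word prediction."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens]
    max_len = len(ids) - 1
    examples = []
    for end in range(1, len(ids)):
        prefix = ids[:end]
        padded = [vocab["<pad>"]] * (max_len - len(prefix)) + prefix
        examples.append((padded, ids[end]))
    return examples


tokens = tokenize("The model predicts the next word")
vocab = build_vocab(tokens)
examples = make_examples(tokens, vocab)
```

Padding on the left keeps the most recent words at the end of every input, next to the position where the prediction is made.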

Designed an Embedding → LSTM → Linear architecture and a recursive generation loop to produce coherent multi-word sequences from seed text.

RNN-Based Question Answering System (NLP Project)

Python · PyTorch · RNN · NLP · Pandas · NumPy · Deep Learning · Neural Networks · DataLoader API

Developed a question answering system using a simple recurrent neural network (RNN) in PyTorch.

Implemented custom text preprocessing including tokenization, vocabulary creation, and numerical encoding.

Built a custom Dataset and DataLoader pipeline for training.

Used embedding layers and an RNN architecture to learn question-answer mappings and generate predictions on unseen queries.

Blockchain-Based Voting System

Solidity · Python · Django · Web3.py · MetaMask · Ganache

Architected a secure, decentralized voting platform using Ethereum smart contracts and a Django frontend. The system supports admin-configured elections, candidate registration, and voter onboarding.

Integrated MetaMask for transaction signing and on-chain vote casting, and provided a transparent results page for post-election analysis.

Analytical Visualization using Dash Framework

Python · NumPy · Pandas · Dash · Plotly

Built a Python Dash application for interactive data visualization with Plotly.

The app runs on the Dash development server and lets users explore datasets through dynamic charts and dashboards. It is intended as a foundation for building fully interactive analytics apps.