Rithvika T | Portfolio

Time Period	Company	Role	Experience
Jan, 2025 - Present	Tata Consultancy Services	AI/ML Intern	Developed a Fine-Grained Image Similarity model using CNNs, Vision Transformers, and Triplet Loss. Built a scalable FAISS-based retrieval system for high-precision SKU matching and stock differentiation.
Nov, 2023 - Jan, 2024	Iota Analytics Pvt Ltd	Summer Intern (AI/ML/NLP)	Built a privacy-preserving NLP pipeline using Hugging Face and custom regex for PII redaction. Implemented a RAG system with FAISS and LLMs for domain-specific Q&A, served through a Flask API.
Jun, 2023 - Aug, 2023	Chakralaya Analytics	Intern	Contributed to a real-time Business Intelligence system for procurement and strategic planning. Created dashboards and KPIs for actionable insights used by CEOs and buyers.

Ship Detection using SAR Imagery for Maritime Vigilance

Built a ResNet50-based model to detect ships in noisy SAR images, optimized with FPN and PSO. Presented at ICDSAAI 2025 for its improved accuracy in cluttered maritime scenes.

TensorFlow OpenCV CNNs

Conference Publication: ICDSAAI 2025

DOI: 10.1109/ICDSAAI65575.2025.11011861

Authors: Rithvika T, Monish P, Dr. Poonkodi M

Affiliation: Vellore Institute of Technology, Chennai

Overview

This research project focuses on the development of a deep learning-based solution for accurate ship detection in Synthetic Aperture Radar (SAR) images, addressing challenges such as high noise, cluttered backgrounds, and small target detection. The work was conducted under the guidance of Dr. Poonkodi M and was presented at the International Conference on Data Science, Agents, and Artificial Intelligence (ICDSAAI) 2025.

The proposed framework leverages a custom-optimized architecture built on ResNet50 and Feature Pyramid Networks (FPN), designed to handle the grayscale characteristics and spatial noise inherent in SAR data. Key enhancements include region of interest (ROI) detection, corner-aware filtering, Swish and Tanh activation refinements, and dynamic hyperparameter tuning using Particle Swarm Optimization (PSO). The architecture ensures robust detection even in low-visibility or occluded scenarios.

Key Features

Multi-scale feature fusion using ResNet50 + FPN
ROI-based corner detection to isolate potential ship regions
Activation optimization with Swish and Tanh functions
Weight mapping and entropy-based filtering to reduce noise impact
Hyperparameter optimization via Particle Swarm Optimization (PSO)
Performance comparison with Faster R-CNN, Mask R-CNN, YOLOv2, and FCOS models
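
The PSO-based hyperparameter search listed above can be illustrated with a minimal sketch. It assumes a toy objective standing in for validation loss; the search space (log10 learning rate and dropout), the swarm settings, and the names `toy_validation_loss` and `pso_minimize` are hypothetical and not taken from the project.

```python
import random

def toy_validation_loss(position):
    """Hypothetical stand-in for validation loss; minimum at log_lr=-3, dropout=0.3."""
    log_lr, dropout = position
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best PSO with positions clamped to the given bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to own best, pull to swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i], personal_best_val[i] = positions[i][:], value
                if value < global_best_val:
                    global_best, global_best_val = positions[i][:], value
    return global_best, global_best_val

best, best_val = pso_minimize(toy_validation_loss, bounds=[(-5.0, -1.0), (0.0, 0.6)])
```

In a real run the objective would train and validate the detector for each candidate, which is far more expensive than this toy function, so swarm size and iteration count would need to be chosen accordingly.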

Dataset and Methodology

The dataset consisted of high-resolution (768×768) SAR images labeled with run-length encoded segmentation masks. Preprocessing steps included rotation, zoom, flipping, noise reduction with median filters, and edge enhancement using Sobel filters. The final pipeline incorporates Euclidean distance-based filtering to eliminate invalid regions and focuses on structure-aware classification of ship types and sizes.
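
As an illustration of the run-length encoded masks mentioned above, the sketch below decodes an RLE string into a binary mask. It assumes the common "start length" pair format with 1-indexed starts and column-major pixel order; the dataset's actual convention is not stated here, and `rle_decode` is a hypothetical helper name.

```python
def rle_decode(rle, height, width):
    """Decode 'start length start length ...' into a height x width 0/1 mask.

    Assumes 1-indexed starts and column-major ordering (down each column,
    then across columns); adjust if the dataset uses another convention.
    """
    mask = [[0] * width for _ in range(height)]
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        for flat_index in range(start - 1, start - 1 + length):
            row, col = flat_index % height, flat_index // height
            mask[row][col] = 1
    return mask

# Tiny 3x3 example: a run of 2 pixels from position 1 and 1 pixel at position 6.
mask = rle_decode("1 2 6 1", height=3, width=3)
```

For the 768×768 images described above, the same function would be called with `height=768, width=768`.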

Technologies Used

Deep Learning: TensorFlow, Keras
Image Processing: OpenCV, scikit-image
Optimization: Particle Swarm Optimization (PSO)
Visualization: Matplotlib, Seaborn

Outcome

The model demonstrated superior detection accuracy for small and partially occluded ships and outperformed several baseline architectures in both precision and robustness to SAR-specific noise. The framework is suitable for deployment in maritime surveillance systems requiring reliable ship detection under challenging environmental conditions.

Acknowledgment

This project was carried out with the academic support of Dr. Poonkodi M and the infrastructure provided by VIT Chennai. The research is intended for academic and non-commercial use.

Lung Disease Classification using Ensemble Transfer Learning and Grad-CAM

Developed an ensemble-based AI model for lung disease detection from chest X-rays, combining five pre-trained CNNs. Integrated Grad-CAM for explainability and deployed the system via Flask for real-time clinical use.

TensorFlow Grad-CAM Flask

Type: Research Project

Guide: Dr. Poonkodi M

Deployment: Flask-based Web Application

Overview

This project presents a deep learning framework for pneumonia and lung disease detection using chest X-rays. The system integrates five pre-trained convolutional neural networks (CNNs) through ensemble learning and utilizes Grad-CAM for explainability. The solution is designed to offer real-time, clinically relevant predictions with visual interpretability, addressing diagnostic challenges in under-resourced healthcare environments.

Problem Statement

Diagnosis from chest X-rays is often hindered by low contrast, overlapping features with other lung diseases, and radiologist fatigue. Traditional AI models either focus on binary classification or lack interpretability. This project aims to create a robust, multi-class, and explainable system that enhances clinical trust and utility.

Research Objective

To build an accurate, interpretable deep learning model for multi-class lung disease classification using ensemble CNNs and Grad-CAM visualizations.

Key Features

Utilizes five pre-trained models: InceptionResNetV2, EfficientNetB2, DenseNet121, MobileNet, InceptionV3
Combines outputs using ensemble averaging to improve generalization
Employs Grad-CAM to generate heatmaps for model decision explainability
Web interface built with Flask for real-time image upload and prediction

Model Architecture

The architecture includes preprocessing, parallel CNN inference, ensemble averaging, and Grad-CAM visualization. The interface allows clinicians to upload chest X-rays and receive both predictions and interpretable visual outputs.
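
The Grad-CAM step can be sketched in a few lines of NumPy. This is a simplified illustration, not the project's code: it assumes the last convolutional feature maps and the gradients of the class score with respect to them are already available as arrays (in TensorFlow these would typically be obtained with `tf.GradientTape`), and `grad_cam` is a hypothetical helper name.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Build a heatmap from feature maps (H, W, K) and their gradients (H, W, K)."""
    # Channel importance: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input: channel 0 supports the class, channel 1 opposes it.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 1.0
acts[1, 1, 1] = 1.0
grads = np.zeros((2, 2, 2))
grads[..., 0] = 1.0
grads[..., 1] = -1.0
heatmap = grad_cam(acts, grads)
```

The resulting low-resolution heatmap is usually resized to the input image size and overlaid on the X-ray for display.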

Modules

Preprocessing: Resize, normalize, and augment input images
Model Training: Transfer learning using TensorFlow and Keras
Prediction: Output averaging from five models
Explainability: Grad-CAM heatmap overlay for feature importance
Deployment: Flask app for web-based access
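
The prediction module's output averaging can be illustrated as follows. The probabilities are made-up values for three models and three classes, and `ensemble_average` is a hypothetical helper; the project averages five models in the same spirit.

```python
def ensemble_average(per_model_probs):
    """Average class-probability vectors from several models for one image."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]

# Illustrative (made-up) probabilities from three models for one X-ray.
probs = [
    [0.70, 0.20, 0.10],
    [0.50, 0.40, 0.10],
    [0.60, 0.30, 0.10],
]
averaged = ensemble_average(probs)
predicted_class = max(range(len(averaged)), key=averaged.__getitem__)
```

Averaging probabilities rather than hard votes lets a confident model outweigh uncertain ones, which is one common motivation for this choice.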

Technologies Used

Deep Learning: TensorFlow, Keras
Web Development: Flask
Explainability: Grad-CAM
Image Processing: OpenCV

Dataset

NIH ChestX-ray14 (from NIH and Kaggle): A large-scale public dataset consisting of labeled frontal chest X-ray images used for model training and evaluation.

Evaluation

Individual Model Performance:

InceptionResNetV2: 90.12% Accuracy, 77.80% AUC-ROC
EfficientNetB2: 89.73% Accuracy, 77.11% AUC-ROC
MobileNet: 89.65% Accuracy, 76.60% AUC-ROC
InceptionV3: 89.08% Accuracy, 76.18% AUC-ROC
DenseNet121: 88.62% Accuracy, 73.77% AUC-ROC

Ensemble Results: 89.69% Accuracy, 79.10% AUC-ROC

Key Findings

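The ensemble achieved the highest AUC-ROC (79.10%, versus 77.80% for the best single model), while its accuracy (89.69%) was marginally below InceptionResNetV2 (90.12%) and EfficientNetB2 (89.73%).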
Grad-CAM visualizations added transparency and improved clinical interpretability.
InceptionResNetV2 was the best-performing single model, while DenseNet121 showed signs of overfitting.

Conclusion

The system demonstrates that combining ensemble learning with explainability techniques like Grad-CAM can improve the discriminative performance (AUC-ROC) and trustworthiness of AI-assisted medical diagnostics. The final model is deployable as a real-time web application, aiding rapid pneumonia diagnosis and assisting radiologists in low-resource clinical environments.

Fine-Grained Image Similarity for Apparel

A deep learning-based system for identifying subtle visual differences between similar apparel items using attention mechanisms, attention erasure, and ResNet50 embeddings for retail inventory optimization.

Image Similarity Attention Mechanisms Attention Erasure ResNet50 OpenCV

Type: Research & Industry Project

Deployment: AI module for integration into inventory systems

Overview

This project proposes a fine-grained image similarity system designed for retail inventory and catalog management. It tackles the challenge of distinguishing visually similar apparel items—such as duplicates from different suppliers or restocked versions—using a deep learning framework built on ResNet50 and enhanced with attention mechanisms and attention erasure techniques. The system is optimized for subtle visual differentiation, making it highly applicable for real-world fashion retail use cases.

Problem Statement

In retail environments, particularly large-scale apparel chains, identifying duplicate or near-duplicate products is difficult due to minor manufacturing variations. Traditional image matching techniques often fail when textures, patterns, or colors appear subtly different. A fine-grained AI-based solution is needed to enhance visual comparison and prevent catalog mismanagement or redundant purchasing.

Research Objective

To develop a deep learning-based similarity detection system capable of capturing nuanced visual differences in apparel items, using attention-based embeddings and training strategies that improve robustness and generalization.

Key Features

Uses ResNet50 as the backbone for feature extraction with frozen base layers
Incorporates Channel and Spatial Attention to focus on clothing-specific visual cues
Introduces Attention Erasure during training to reduce overfitting on dominant regions
Employs a custom embedding head with dropout and normalization for compact feature representation
Compares embeddings using Triplet Loss with semi-hard mining and Cosine Similarity
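
The triplet objective and the semi-hard condition named above can be sketched as follows. This is a minimal illustration assuming cosine distance and a margin of 0.2; both are assumptions, as are the function names, and the embeddings are toy 2-D vectors.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Penalize triplets where the negative is not at least `margin` farther."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

def is_semi_hard(anchor, positive, negative, margin=0.2):
    """Semi-hard: negative is farther than the positive, but within the margin."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return d_pos < d_neg < d_pos + margin

anchor, positive = [1.0, 0.0], [1.0, 0.0]
semi_hard_negative = [1.0, 0.5]   # close to the anchor, still informative
easy_negative = [0.0, 1.0]        # already far away, contributes zero loss

loss_semi_hard = triplet_loss(anchor, positive, semi_hard_negative)
loss_easy = triplet_loss(anchor, positive, easy_negative)
```

Semi-hard mining keeps only triplets like the first one, since easy negatives give no gradient and the hardest negatives can destabilize training.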

Model Architecture

The system integrates OpenCV-based preprocessing, ResNet50 for hierarchical features, attention and attention-erasure modules for visual focus diversification, and a dense embedding head. Similarity is evaluated using a Triplet Loss framework that optimizes inter-class separability while preserving intra-class cohesion.

Modules

Preprocessing: BGR to RGB conversion, Gaussian blur, normalization
Feature Extraction: ResNet50 with attention modules
Attention Erasure: Dynamic masking of salient areas to promote secondary feature learning
Embedding Generation: Fully connected layers for feature representation
Similarity Matching: Cosine similarity and KNN-based scoring
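
The similarity-matching module can be pictured as a nearest-neighbour lookup over stored embeddings. The sketch below uses brute-force cosine similarity on toy 3-D vectors; the SKU names, the vectors, and `top_k_similar` are hypothetical, and a production system would use learned embeddings and likely an approximate index.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def top_k_similar(query, catalog, k=2):
    """Return the k catalog entries most similar to the query embedding."""
    scored = [(sku, cosine_similarity(query, emb)) for sku, emb in catalog.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Hypothetical catalog of item embeddings.
catalog = {
    "sku_a": [1.0, 0.0, 0.0],
    "sku_b": [0.9, 0.1, 0.0],
    "sku_c": [0.0, 1.0, 0.0],
}
matches = top_k_similar([1.0, 0.02, 0.0], catalog, k=2)
```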

Technologies Used

Category	Tools & Frameworks
Deep Learning	TensorFlow, Keras
Attention Modules	Custom attention and erasure layers
Image Processing	OpenCV
Loss Function	Triplet Loss with semi-hard mining
Visualization	Matplotlib, Seaborn

Dataset

Source: In-house and scraped e-commerce datasets
Image Type: RGB images of upper/lower wear, dresses, etc.
Labels: Binary (similar / dissimilar)
Resolution: 224×224
Sample Size: Balanced pairs across multiple categories

Evaluation

Accuracy: High matching precision across fine-grained apparel categories
Robustness: Handles changes in lighting, background clutter, and camera angles
Generalization: Attention erasure improves detection of secondary design features

Key Findings

Attention-based architectures outperform traditional CNNs in nuanced apparel matching
Erasure modules help reduce overfitting to primary design features
The model is lightweight enough for deployment in real-time retail environments

Conclusion

The project demonstrates a powerful combination of deep learning, attention mechanisms, and contrastive learning to solve the real-world challenge of fine-grained apparel similarity. The system is suitable for deployment in inventory management tools, improving product traceability, catalog accuracy, and cost optimization.

RAG with AES Encryption

Developed a secure Retrieval-Augmented Generation (RAG) system using Hugging Face LLMs for document-based question answering. Integrated AES encryption to protect sensitive outputs and ensure privacy in local PDF/DOCX processing.

RAG LLMs AES Encryption

Type: Research Project

Technologies: Hugging Face Transformers, LlamaIndex, Quantized LLMs, AES (Fernet) Encryption

Overview

This project introduces a secure Retrieval-Augmented Generation (RAG) framework for privacy-critical applications. It combines local document-based question answering using quantized large language models (LLMs) with AES encryption, enabling encrypted, explainable, and efficient responses on sensitive `.pdf` and `.docx` files.

Objectives

Enable contextual question-answering over user-uploaded documents
Ensure end-to-end output security through AES encryption
Optimize inference efficiency using 4-bit quantization for LLMs

Key Features

Document-aware QA: Accepts `.pdf` and `.docx` files for semantic retrieval
Quantized Inference: Efficient 4-bit loading via BitsAndBytes for consumer-grade hardware
Secure Output: AES encryption (Fernet) of all model outputs with controlled decryption
Modular Pipeline: Easily integrable into web or API-based systems

System Workflow

Documents are loaded and embedded using LlamaIndex
User enters a natural language query
Relevant document chunks are retrieved and passed to a quantized transformer model
Generated responses are encrypted using AES (Fernet)
Authorized users can decrypt responses locally
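
The last two workflow steps can be sketched with the `cryptography` package's Fernet recipe, which uses AES in CBC mode with an HMAC for integrity. This is a minimal illustration: the response string and helper names are hypothetical, and key storage and access control are outside its scope.

```python
from cryptography.fernet import Fernet

def encrypt_response(cipher, text):
    """Encrypt a generated answer; returns a URL-safe token (bytes)."""
    return cipher.encrypt(text.encode("utf-8"))

def decrypt_response(cipher, token):
    """Decrypt a token; raises InvalidToken if the key or token is wrong."""
    return cipher.decrypt(token).decode("utf-8")

key = Fernet.generate_key()      # must be kept secret and shared only with authorized users
cipher = Fernet(key)

response = "hypothetical model answer"
token = encrypt_response(cipher, response)
decrypted = decrypt_response(cipher, token)
```

Anyone holding `key` can decrypt, so in practice the key would live in a secrets store rather than alongside the encrypted outputs.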

Tech Stack

LLMs: Hugging Face Transformers (quantized via BitsAndBytes)
Retrieval: LlamaIndex, VectorStoreIndex
Document Processing: docx2txt, pypdf
Encryption: Python cryptography module (AES via Fernet)
Deployment Ready: Jupyter Notebook / API-compatible

License

This project is released under the Apache License 2.0. For commercial use, please contact the authors.

Arrhythmia Detection using ECG Signals

Developed a 1D CNN model to classify cardiac arrhythmias from ECG data using the MIT-BIH dataset, with wavelet-based denoising and class balancing via SMOTE. Achieved accurate classification across heartbeat types, forming a strong foundation for real-time health monitoring systems.

1D CNN ECG Signal Processing SMOTE

Type: Research Project

Technologies: 1D CNN, SMOTE, Wavelet Transform, TensorFlow, MIT-BIH Dataset

Project Overview

This project aims to detect and classify cardiac arrhythmias from ECG (Electrocardiogram) signals using a 1D Convolutional Neural Network. Trained on annotated heartbeat segments from the MIT-BIH Arrhythmia Database, the model is designed for integration into automated, real-time health monitoring systems.

Objective

To build a deep learning-based classification pipeline that accurately detects multiple types of heartbeats, including abnormal patterns, for early and efficient cardiac diagnostics.

Dataset

Source: MIT-BIH Arrhythmia Database
Sampling Rate: 360 Hz
Classes Considered:
- N: Normal
- L: Left Bundle Branch Block Beat
- R: Right Bundle Branch Block Beat
- A: Atrial Premature Beat
- V: Premature Ventricular Contraction

Workflow Summary

1. Data Extraction & Labeling

ECG signals and annotations are parsed. Heartbeats are segmented and labeled.
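
Heartbeat segmentation can be sketched as cutting a fixed window around each annotated beat. The window of 360 samples (one second at 360 Hz) centered on the annotation index is an assumption, as is the helper name `segment_beats`; the signal here is a dummy ramp rather than real ECG data.

```python
def segment_beats(signal, beat_indices, labels, window=360):
    """Cut a fixed-length window centered on each annotated beat."""
    half = window // 2
    segments, kept_labels = [], []
    for index, label in zip(beat_indices, labels):
        start, end = index - half, index + half
        if start < 0 or end > len(signal):
            continue  # skip beats too close to the record edges
        segments.append(signal[start:end])
        kept_labels.append(label)
    return segments, kept_labels

# Dummy signal with three annotated beats; only the middle one has a full window.
signal = list(range(1000))
segments, kept = segment_beats(signal, beat_indices=[100, 400, 900], labels=["N", "V", "A"])
```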

2. Preprocessing

Denoising via wavelet transforms
Normalization and resampling (360 data points)
Label encoding + one-hot encoding
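
The normalization and encoding steps can be illustrated as below. Z-score normalization per segment is an assumption (the exact scheme is not stated above), and the helper names are hypothetical; the class order follows the dataset list.

```python
import math

CLASSES = ["N", "L", "R", "A", "V"]

def z_normalize(segment):
    """Scale a segment to zero mean and unit variance (flat segments become zeros)."""
    mean = sum(segment) / len(segment)
    std = math.sqrt(sum((x - mean) ** 2 for x in segment) / len(segment))
    return [(x - mean) / std if std > 0 else 0.0 for x in segment]

def one_hot(label, classes=CLASSES):
    """Map a beat label to a one-hot vector in the order given by `classes`."""
    index = classes.index(label)
    return [1 if i == index else 0 for i in range(len(classes))]

normalized = z_normalize([1.0, 2.0, 3.0])
encoded = one_hot("A")
```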

3. Class Imbalance Handling

SMOTE is used to balance underrepresented classes like A and V.
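
The core idea behind SMOTE is to create synthetic minority samples by interpolating between a sample and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea on toy 2-D points, not the imbalanced-learn implementation used in the project; `smote_like_oversample` and its parameters are hypothetical.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating toward near neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sq_dist(base, s),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=5)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic beats do not leak into evaluation data.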

4. Training & Evaluation

The model is compiled with the Adam optimizer and categorical cross-entropy loss. Accuracy and loss curves, together with a confusion matrix, are used to evaluate model performance.

Results & Insights

High classification accuracy across majority and minority classes after SMOTE
Improved detection of rare arrhythmias
Visualizations provided transparency and interpretability
Strong baseline for future integration into real-time systems

Tools & Libraries

Python, NumPy, Pandas, Matplotlib, SciPy, PyWavelets, Scikit-learn, Keras, TensorFlow, Imbalanced-learn (SMOTE)

Future Scope

Deploy on IoT or wearable devices for live ECG monitoring
Extend to multi-lead ECG analysis
Implement on edge devices for real-time inference

Chatbot using PyTorch and NLP

Developed a chatbot using PyTorch and NLP techniques to provide natural language responses to user queries. The chatbot uses a pre-trained model to generate responses and can be trained on new data to improve its performance.

PyTorch Seq2Seq Modeling

Technologies: PyTorch, NLTK, seq2seq, Named Entity Recognition, Intent Classification

Project Overview

This project involves building an intelligent chatbot capable of understanding and responding to human language. The chatbot is powered by PyTorch and utilizes core Natural Language Processing (NLP) tasks like intent recognition, named entity recognition (NER), and sequence-to-sequence modeling. Designed to handle diverse user queries, the system combines rule-based logic with deep learning models to generate accurate, human-like responses.

Objectives

Build an end-to-end NLP chatbot that can learn from conversations.
Implement intent detection and response generation using deep learning models.
Deploy the chatbot for real-time interaction across different platforms.

Dataset

Conversational datasets, including question-answer pairs or synthetic dialogues, are used for training. The model can also be fine-tuned on real-time interaction data gathered during deployment.

Model Development

Seq2Seq model with encoder-decoder architecture for contextual response generation.
Rule-based fallback systems for predefined queries.
Trained using backpropagation with gradient descent to improve accuracy and fluency.

Preprocessing

Data is cleaned and tokenized using tools like NLTK or spaCy. Steps include: lowercasing, punctuation removal, stopword filtering, and word tokenization.
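
The cleaning steps above can be sketched with the standard library alone. The stopword set here is a small hypothetical subset used for illustration; the project relies on NLTK or spaCy, which provide fuller stopword lists and tokenizers.

```python
import string

# Hypothetical mini stopword list; a real pipeline would load one from NLTK or spaCy.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "on"}

def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in stopwords]

tokens = preprocess("Hello! What is the status of my order?")
```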

Training & Evaluation

Trained on labeled conversational data.
Metrics used: Accuracy, BLEU score (response quality), F1 score (NER), Intent Accuracy.
Training loss and validation metrics tracked to avoid overfitting.
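
As an example of the metrics above, entity-level F1 for NER can be computed from predicted and gold entity sets. The entities below are invented for illustration, and `f1_score` is a hypothetical helper that counts an entity as correct only if both text and type match exactly.

```python
def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over exact-match entities."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)

# Invented example: two of three entities are predicted correctly.
gold = {("Paris", "LOC"), ("Monday", "DATE"), ("Alice", "PER")}
predicted = {("Paris", "LOC"), ("Monday", "DATE"), ("Bob", "PER")}
ner_f1 = f1_score(predicted, gold)
```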

Deployment

The chatbot can be deployed in multiple environments such as web apps or messaging platforms. A simple Flask-based API or command-line interface allows real-time user interaction.

Dependencies

Python 3.7+
PyTorch
NLTK / spaCy
Flask (for deployment)

Future Enhancements

Integrate transformer-based models for better context handling.
Add voice support using Speech-to-Text APIs.
Enable multilingual support with translation layers.

Hey, I'm Rithvika Tiruveedhula

AI/ML Developer | GenAI Enthusiast
