Final-year B.Tech Computer Science and Engineering student at Vellore Institute of Technology (VIT), Chennai, with hands-on experience in AI/ML, deep learning, and real-world application development.
Hello! I'm Rithvika T, a passionate AI/ML developer and final-year Computer Science student at VIT Chennai. I specialize in deep learning, computer vision, and building intelligent, real-world AI systems.
With hands-on experience in healthcare diagnostics, fine-grained image similarity, and generative AI models, I enjoy turning data into impactful solutions. My work bridges academic research and industry-driven development, with a strong focus on model explainability, accuracy, and scalability.
I believe in using AI to solve meaningful problems, from improving medical diagnostics to optimizing retail analytics, and I thrive in fast-paced, collaborative environments where innovation meets execution.
Time Period | Company | Role | Experience |
---|---|---|---|
Jan, 2025 - Present | Tata Consultancy Services | AI/ML Intern | |
Nov, 2023 - Jan, 2024 | Iota Analytics Pvt Ltd | Summer Intern (AI/ML/NLP) | |
Jun, 2023 - Aug, 2023 | Chakralaya Analytics | Intern | |
Built a ResNet50-based model to detect ships in noisy SAR images, optimized with FPN and PSO. The work, presented at ICDSAAI 2025, improves detection accuracy in cluttered maritime scenes.
Conference Publication: ICDSAAI 2025
DOI: 10.1109/ICDSAAI65575.2025.11011861
Authors: Rithvika T, Monish P, Dr. Poonkodi M
Affiliation: Vellore Institute of Technology, Chennai
This research project focuses on the development of a deep learning-based solution for accurate ship detection in Synthetic Aperture Radar (SAR) images, addressing challenges such as high noise, cluttered backgrounds, and small target detection. The work was conducted under the guidance of Dr. Poonkodi M and was presented at the International Conference on Data Science, Agents, and Artificial Intelligence (ICDSAAI) 2025.
The proposed framework leverages a custom-optimized architecture built on ResNet50 and Feature Pyramid Networks (FPN), designed to handle the grayscale characteristics and spatial noise inherent in SAR data. Key enhancements include region of interest (ROI) detection, corner-aware filtering, Swish and Tanh activation refinements, and dynamic hyperparameter tuning using Particle Swarm Optimization (PSO). The architecture ensures robust detection even in low-visibility or occluded scenarios.
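As a rough illustration of PSO-based hyperparameter tuning, the sketch below runs a basic global-best particle swarm over two hyperparameters. The objective `toy_validation_loss`, the search bounds, and the swarm coefficients are hypothetical stand-ins. In the project the objective would presumably involve training and validating the detector, which is not reproduced here.

```python
import numpy as np


def pso_minimize(objective, bounds, n_particles=20, n_iters=50,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize `objective` inside a box using a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    dim = low.size

    # Random starting positions inside the box, zero starting velocity.
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = np.zeros_like(pos)

    # Each particle remembers its own best position; the swarm shares one global best.
    best_pos = pos.copy()
    best_val = np.array([objective(p) for p in pos])
    g_idx = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])

    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity mixes momentum, pull toward personal best, and pull toward global best.
        vel = (inertia * vel
               + c_cognitive * r1 * (best_pos - pos)
               + c_social * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, low, high)

        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]

        g_idx = int(np.argmin(best_val))
        if best_val[g_idx] < g_val:
            g_pos, g_val = best_pos[g_idx].copy(), float(best_val[g_idx])
    return g_pos, g_val


def toy_validation_loss(params):
    # Hypothetical surrogate for "train the model and return validation loss".
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Search over log10(learning rate) in [-5, -1] and dropout in [0, 0.6] (assumed ranges).
best_params, best_loss = pso_minimize(
    toy_validation_loss, bounds=[(-5.0, -1.0), (0.0, 0.6)])
```

Each objective evaluation stands for a full training run, so in practice the swarm size and iteration count would likely be kept small.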
The dataset consisted of high-resolution (768×768) SAR images labeled with run-length encoded segmentation masks. Preprocessing steps included rotation, zoom, flipping, noise reduction with median filters, and edge enhancement using Sobel filters. The final pipeline incorporates Euclidean distance-based filtering to eliminate invalid regions and focuses on structure-aware classification of ship types and sizes.
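Run-length encoded masks can be expanded into binary arrays before training. The sketch below assumes the common Kaggle-style convention of space-separated `start length` pairs with 1-indexed starts and column-major pixel order. The source does not state which convention the dataset uses, so treat that as an assumption. The function names are illustrative.

```python
import numpy as np


def rle_decode(rle, shape=(768, 768)):
    """Decode 'start length start length ...' into a binary (height, width) mask.

    Assumes 1-indexed starts and column-major pixel order.
    """
    height, width = shape
    flat = np.zeros(height * width, dtype=np.uint8)
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        flat[start - 1:start - 1 + length] = 1
    # order="F" fills the mask column by column, matching the assumed encoding.
    return flat.reshape((height, width), order="F")


def rle_encode(mask):
    """Inverse of rle_decode, using the same convention."""
    flat = mask.flatten(order="F")
    padded = np.concatenate([[0], flat, [0]])
    # Positions where the value flips mark the start and end of each run (1-indexed).
    changes = np.flatnonzero(padded[1:] != padded[:-1]) + 1
    starts, ends = changes[0::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))


# Tiny 4x3 example: pixels 2-3 and pixel 7 in column-major order.
small_mask = rle_decode("2 2 7 1", shape=(4, 3))
```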
The model demonstrated superior detection accuracy for small and partially occluded ships and outperformed several baseline architectures in both precision and robustness to SAR-specific noise. The framework is suitable for deployment in maritime surveillance systems requiring reliable ship detection under challenging environmental conditions.
This project was carried out with the academic support of Dr. Poonkodi M and the infrastructure provided by VIT Chennai. The research is intended for academic and non-commercial use.
Developed an ensemble-based AI model for lung disease detection from chest X-rays, combining five pre-trained CNNs. Integrated Grad-CAM for explainability and deployed the system via Flask for real-time clinical use.
Type: Research Project
Guide: Dr. Poonkodi M
Deployment: Flask-based Web Application
This project presents a deep learning framework for pneumonia and lung disease detection using chest X-rays. The system integrates five pre-trained convolutional neural networks (CNNs) through ensemble learning and utilizes Grad-CAM for explainability. The solution is designed to offer real-time, clinically relevant predictions with visual interpretability, addressing diagnostic challenges in under-resourced healthcare environments.
Diagnosis from chest X-rays is often hindered by low contrast, overlapping features with other lung diseases, and radiologist fatigue. Traditional AI models either focus on binary classification or lack interpretability. This project aims to create a robust, multi-class, and explainable system that enhances clinical trust and utility.
To build an accurate, interpretable deep learning model for multi-class lung disease classification using ensemble CNNs and Grad-CAM visualizations.
The architecture includes preprocessing, parallel CNN inference, ensemble averaging, and Grad-CAM visualization. The interface allows clinicians to upload chest X-rays and receive both predictions and interpretable visual outputs.
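The ensemble-averaging step can be sketched as follows: each model's logits are converted to probabilities, and the probabilities are averaged across models before taking the arg-max. The class names and the toy logits are hypothetical. The real system would feed in the outputs of its five CNNs.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_predict(per_model_logits, class_names):
    """Average class probabilities across models (soft voting)."""
    probs = np.stack([softmax(np.asarray(l, dtype=float)) for l in per_model_logits])
    mean_probs = probs.mean(axis=0)  # shape: (batch, classes)
    labels = [class_names[i] for i in mean_probs.argmax(axis=1)]
    return mean_probs, labels


# Hypothetical class names and logits for one image from five models.
class_names = ["Normal", "Pneumonia", "Other"]
logits = [np.array([[0.0, 2.0, 0.0]])] * 4 + [np.array([[2.0, 0.0, 0.0]])]
mean_probs, labels = ensemble_predict(logits, class_names)
```

Averaging probabilities rather than hard votes lets a confident model outweigh several uncertain ones, which is one common reason to prefer soft voting.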
NIH ChestX-ray14 (from NIH and Kaggle): A large-scale public dataset consisting of labeled frontal chest X-ray images used for model training and evaluation.
Individual Model Performance:
Ensemble Results: 89.69% Accuracy, 79.10% AUC-ROC
The system demonstrates that combining ensemble learning with explainability techniques like Grad-CAM can significantly enhance the accuracy and trustworthiness of AI-assisted medical diagnostics. The final model is deployable as a real-time web application, aiding rapid pneumonia diagnosis and assisting radiologists in low-resource clinical environments.
A deep learning-based system for identifying subtle visual differences between similar apparel items using attention mechanisms, attention erasure, and ResNet50 embeddings for retail inventory optimization.
Type: Research & Industry Project
Deployment: AI module for integration into inventory systems
This project proposes a fine-grained image similarity system designed for retail inventory and catalog management. It tackles the challenge of distinguishing visually similar apparel items—such as duplicates from different suppliers or restocked versions—using a deep learning framework built on ResNet50 and enhanced with attention mechanisms and attention erasure techniques. The system is optimized for subtle visual differentiation, making it highly applicable for real-world fashion retail use cases.
In retail environments, particularly large-scale apparel chains, identifying duplicate or near-duplicate products is difficult due to minor manufacturing variations. Traditional image matching techniques often fail when textures, patterns, or colors appear subtly different. A fine-grained AI-based solution is needed to enhance visual comparison and prevent catalog mismanagement or redundant purchasing.
To develop a deep learning-based similarity detection system capable of capturing nuanced visual differences in apparel items, using attention-based embeddings and training strategies that improve robustness and generalization.
The system integrates OpenCV-based preprocessing, ResNet50 for hierarchical features, attention and attention-erasure modules for visual focus diversification, and a dense embedding head. Similarity is evaluated using a Triplet Loss framework that optimizes inter-class separability while preserving intra-class cohesion.
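To make the Triplet Loss objective concrete, here is a minimal NumPy sketch of the per-triplet loss and a semi-hard check. It assumes squared Euclidean distance and a margin of 0.2, neither of which is stated in the source. The embeddings are toy values, not outputs of the actual model.

```python
import numpy as np


def squared_distance(a, b):
    return np.sum((a - b) ** 2, axis=-1)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(a, p) - d(a, n) + margin, 0) for each triplet in the batch."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return np.maximum(d_pos - d_neg + margin, 0.0)


def is_semi_hard(anchor, positive, negative, margin=0.2):
    """Semi-hard negatives are farther than the positive but still inside the margin."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return (d_neg > d_pos) & (d_neg < d_pos + margin)


# Toy 2-D embeddings: an easy, a semi-hard, and a hard negative for the same anchor.
anchors = np.zeros((3, 2))
positives = np.tile([1.0, 0.0], (3, 1))
negatives = np.array([[2.0, 0.0], [1.05, 0.0], [0.5, 0.0]])

losses = triplet_loss(anchors, positives, negatives)
semi_hard = is_semi_hard(anchors, positives, negatives)
```

Only the second triplet counts as semi-hard here: its negative is slightly farther than the positive, so it still yields a small, non-zero loss.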
Category | Tools & Frameworks |
---|---|
Deep Learning | TensorFlow, Keras |
Attention Modules | Custom attention and erasure layers |
Image Processing | OpenCV |
Loss Function | Triplet Loss with semi-hard mining |
Visualization | Matplotlib, Seaborn |
The project demonstrates a powerful combination of deep learning, attention mechanisms, and contrastive learning to solve the real-world challenge of fine-grained apparel similarity. The system is suitable for deployment in inventory management tools, improving product traceability, catalog accuracy, and cost optimization.
Developed a secure Retrieval-Augmented Generation (RAG) system using Hugging Face LLMs for document-based question answering. Integrated AES encryption to protect sensitive outputs and ensure privacy in local PDF/DOCX processing.
Type: Research Project
Technologies: Hugging Face Transformers, LlamaIndex, Quantized LLMs, AES (Fernet) Encryption
This project introduces a secure Retrieval-Augmented Generation (RAG) framework for privacy-critical applications. It combines local document-based question answering using quantized large language models (LLMs) with AES encryption, enabling encrypted, explainable, and efficient responses on sensitive `.pdf` and `.docx` files.
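The encryption step might look like the sketch below, which assumes the Fernet implementation from the `cryptography` package (AES in CBC mode with an HMAC for integrity). The helper names and the sample answer are hypothetical, and key storage is deliberately left out.

```python
from cryptography.fernet import Fernet, InvalidToken


def encrypt_answer(answer: str, key: bytes) -> bytes:
    """Encrypt a generated answer so it can be stored or transmitted safely."""
    return Fernet(key).encrypt(answer.encode("utf-8"))


def decrypt_answer(token: bytes, key: bytes) -> str:
    """Decrypt a token; raises InvalidToken if the key is wrong or the data was altered."""
    return Fernet(key).decrypt(token).decode("utf-8")


# In a real deployment the key would come from a secure store, not be generated inline.
key = Fernet.generate_key()
answer = "Hypothetical answer produced by the local LLM."
token = encrypt_answer(answer, key)
```

Because Fernet tokens are authenticated, decrypting with the wrong key raises `InvalidToken` instead of returning garbage.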
This project is released under the Apache License 2.0. For commercial use, please contact the authors.
Developed a 1D CNN model to classify cardiac arrhythmias from ECG data using the MIT-BIH dataset, with wavelet-based denoising and class balancing via SMOTE. Achieved accurate classification across heartbeat types, forming a strong foundation for real-time health monitoring systems.
Type: Research Project
Technologies: 1D CNN, SMOTE, Wavelet Transform, TensorFlow, MIT-BIH Dataset
This project aims to detect and classify cardiac arrhythmias from ECG (Electrocardiogram) signals using a 1D Convolutional Neural Network. Trained on annotated heartbeat segments from the MIT-BIH Arrhythmia Database, the model is designed for integration into automated, real-time health monitoring systems.
To build a deep learning-based classification pipeline that accurately detects multiple types of heartbeats, including abnormal patterns, for early and efficient cardiac diagnostics.
ECG signals and annotations are parsed. Heartbeats are segmented and labeled.
SMOTE is used to balance underrepresented classes such as A (atrial premature beats) and V (premature ventricular contractions).
The model is compiled with the Adam optimizer and categorical cross-entropy loss. Accuracy/loss curves and a confusion matrix are used to evaluate model performance.
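The segmentation and balancing steps above can be sketched in NumPy as follows. The window sizes (`before`, `after`) and the neighbour count are assumptions. `smote_like_oversample` is a simplified stand-in that only illustrates the interpolation idea behind SMOTE, not the imbalanced-learn implementation listed in the tech stack.

```python
import numpy as np


def segment_beats(signal, r_peaks, labels, before=90, after=110):
    """Cut a fixed-length window around each annotated R-peak.

    Beats whose window would run past either end of the record are dropped.
    """
    beats, kept = [], []
    for peak, label in zip(r_peaks, labels):
        start, end = peak - before, peak + after
        if start >= 0 and end <= len(signal):
            beats.append(signal[start:end])
            kept.append(label)
    return np.array(beats), np.array(kept)


def smote_like_oversample(X, n_new, k=3, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise distances between minority samples; ignore self-matches.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, len(X), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X[base] + gap * (X[picked] - X[base])


# Toy record: a ramp signal with three annotated peaks, two of them too close to the edges.
signal = np.arange(1000, dtype=float)
beats, beat_labels = segment_beats(signal, r_peaks=[50, 300, 950], labels=["N", "V", "A"])

# Toy minority-class feature vectors, oversampled to add four synthetic samples.
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
synthetic = smote_like_oversample(minority, n_new=4)
```

Oversampling would normally be applied to the training split only, so that synthetic beats do not leak into evaluation.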
Python, NumPy, Pandas, Matplotlib, SciPy, PyWavelets, Scikit-learn, Keras, TensorFlow, Imbalanced-learn (SMOTE)
Developed a chatbot using PyTorch and NLP techniques to provide natural language responses to user queries. The chatbot uses a pre-trained model to generate responses and can be trained on new data to improve its performance.
Technologies: PyTorch, NLTK, seq2seq, Named Entity Recognition, Intent Classification
This project involves building an intelligent chatbot capable of understanding and responding to human language. The chatbot is powered by PyTorch and utilizes core Natural Language Processing (NLP) tasks like intent recognition, named entity recognition (NER), and sequence-to-sequence modeling. Designed to handle diverse user queries, the system combines rule-based logic with deep learning models to generate accurate, human-like responses.
Conversational datasets, including question-answer pairs or synthetic dialogues, are used for training. The model is also capable of being fine-tuned on real-time interaction data gathered during deployment.
Data is cleaned and tokenized using tools like NLTK or spaCy. Steps include: lowercasing, punctuation removal, stopword filtering, and word tokenization.
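The cleaning steps listed above can be approximated with the standard library alone. The sketch below uses a tiny hypothetical stopword set and whitespace tokenization in place of the NLTK or spaCy resources the project mentions.

```python
import string

# Hypothetical, deliberately small stopword list; NLTK and spaCy ship much larger ones.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
             "in", "on", "for", "what", "my"}


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, and drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = no_punct.split()
    return [token for token in tokens if token not in STOPWORDS]


tokens = preprocess("What is the status of my order, please?")
```

Stripping punctuation this way also merges contractions ("don't" becomes "dont"), which a library tokenizer would usually handle more carefully.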
The chatbot can be deployed in multiple environments such as web apps or messaging platforms. A simple Flask-based API or command-line interface allows real-time user interaction.