MS Applied AI · Stevens Institute of Technology · Dec 2026

RAM KASURU

Building SLM ensembles where open-source models (Mistral, Gemma, DeepSeek) vote on domain-specific reasoning, fine-tuned with LoRA on epidemiology datasets and benchmarked on NVIDIA L40s GPUs.

98% Applied ML Grade
2+ Published Papers
3x Best Delegate MUN

Applied AI Engineer

I'm Ram Kasuru, an MS Applied AI student at Stevens Institute of Technology, graduating December 2026. My research focuses on parameter-efficient fine-tuning, SLM ensemble architectures, and efficient GPU-based benchmarking on real-world domain data.

I build systems where multiple open-source small language models collaboratively reason and vote on outputs, combining the efficiency of SLMs with ensemble-level robustness. I'm also actively exploring PhD opportunities at the intersection of efficient NLP and applied machine learning.

LoRA / PEFT Fine-Tuning
SLM Ensemble Architectures
PyTorch & HuggingFace
CUDA / SLURM / HPC
Computer Vision (YOLO, CNN)
NLP & Multimodal Systems

Research & Papers

2+ published · 1 in progress

// 001 · Conference Paper · ISCMCTR-2024

VIOLA: Video Integration of Object Detection, Language Insights and Accessibility

An end-to-end multimodal pipeline fusing YOLOv8 for real-time object detection, OpenAI Whisper for automatic speech recognition, and GPT-4 for contextual language insight generation — achieving 90% object detection and 98% ASR accuracy on live video streams.

YOLOv8 · Whisper ASR · GPT-4 · Multimodal · CV
// 002 · Research Report · Dayananda Sagar University

Prediction of Brain Stroke Using Tensor Factorization & Deep Learning

Tensor-based MRI stroke diagnosis using Tucker decomposition combined with EfficientNetB0 transfer learning. Achieves 98% classification accuracy across four pathological tumor categories on a multi-class dataset of 3,000+ MRI scans.

EfficientNetB0 · Tucker Decomp. · Medical AI · MRI · Transfer Learning
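For readers unfamiliar with the factorization named above, the standard Tucker model for a third-order tensor can be written as follows. This is the textbook form only; the tensor shape, the ranks, and how the factors feed EfficientNetB0 are not stated in this entry, so every symbol here is illustrative.

```latex
% Tucker decomposition of a 3-way tensor X (e.g. a stack of MRI slices)
% Symbols are illustrative; ranks r_1, r_2, r_3 are assumptions, not reported values.
\mathcal{X} \;\approx\; \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)},
\qquad
\mathcal{G} \in \mathbb{R}^{r_1 \times r_2 \times r_3},\;
U^{(n)} \in \mathbb{R}^{I_n \times r_n}

% Element-wise form
x_{ijk} \;\approx\; \sum_{p=1}^{r_1} \sum_{q=1}^{r_2} \sum_{s=1}^{r_3}
g_{pqs}\, u^{(1)}_{ip}\, u^{(2)}_{jq}\, u^{(3)}_{ks}
```

Choosing each rank r_n smaller than the corresponding dimension I_n compresses the data into the core tensor and factor matrices, which is the usual reason to pair the decomposition with a downstream classifier.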
// 003 · Master's Thesis · Stevens Institute of Technology · In Progress

SLM Ensemble via LoRA Fine-Tuning: Voting-Based Aggregation of Open-Source Language Models on Epidemiology Data

A systematic study building an ensemble of LoRA fine-tuned open-source Small Language Models — Mistral 3B, Gemma, and DeepSeek — each trained on a domain-specific epidemiology tutoring dataset. Models cast votes on final outputs, aggregating domain-specific reasoning across architectures. Demonstrates 30%+ improvement in factual recall while reducing GPU memory overhead by 60%+ compared to monolithic LLM approaches.

SLM Ensemble · LoRA · Mistral 3B · Gemma · DeepSeek · ROUGE-L · L40s GPU · Voting Aggregation
Manuscript in Preparation ◌
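The voting step described in this entry can be pictured with a minimal sketch. It assumes simple majority voting over normalized answer strings, with ties going to the model listed first; the thesis may use a different aggregation rule, and the function names and model keys below are hypothetical.

```python
from collections import Counter
from typing import Dict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def majority_vote(answers: Dict[str, str]) -> str:
    """Return the answer backed by the most models.

    `answers` maps a (hypothetical) model name to that model's answer.
    Ties are broken in favor of the model that appears first in the dict.
    """
    if not answers:
        raise ValueError("need at least one model answer")
    votes = Counter(normalize(a) for a in answers.values())
    top = max(votes.values())
    for answer in answers.values():
        if votes[normalize(answer)] == top:
            return answer  # original text of the first top-voted answer
    raise AssertionError("unreachable: some answer always holds the top count")


if __name__ == "__main__":
    # Illustrative outputs only; these are not results from the thesis.
    sample = {
        "mistral": "Cohort study",
        "gemma": "cohort  study",
        "deepseek": "Case-control study",
    }
    print(majority_vote(sample))
```

Exact string matching is the simplest possible agreement test; free-form generations would likely need a softer comparison (for example an overlap metric) before votes are counted.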

Selected Projects

01

VIOLA Multimodal Assistant

Real-time live video analysis fusing YOLOv8, Whisper ASR, and GPT-4 into a unified accessibility and insight platform. Presented at ISCMCTR-2024.

YOLOv8 · Whisper · GPT-4
02

SLM Ensemble + LoRA Epidemiology Tutor

Multi-model ensemble (Mistral 3B, Gemma, DeepSeek) fine-tuned with LoRA on a synthetic epidemiology dataset. Voting-based aggregation on NVIDIA L40s via SLURM.

LoRA · Mistral · Gemma · SLURM
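As background on the LoRA fine-tuning mentioned in this project, the standard low-rank update is shown below. The rank r, the scaling factor, and which weight matrices are adapted are not given on this page, so the symbols are generic rather than the project's actual settings.

```latex
% LoRA: the pretrained weight W_0 stays frozen; only A and B are trained.
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad
W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)

% Trainable parameters per adapted matrix
r\,(d + k) \quad \text{instead of} \quad d\,k
```

Because only A and B receive gradients, each model in an ensemble can keep its own small adapter on top of a frozen base model.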
03

Brain Stroke & Tumor Prediction

CNN + Tucker tensor factorization pipeline for MRI-based brain tumor classification. 98% accuracy with EfficientNetB0 and interactive Jupyter widget inference.

TensorFlow · EfficientNet · Tucker Decomp.
04

LLM Benchmarking & Prompt Engineering

At Outlier AI, improved LLM evaluation accuracy by 12% across 10k+ prompts via reinforcement-based tuning and reduced evaluation latency by 18% across 5+ LLM families.

Prompt Tuning · RLHF · Benchmarking
05

Shopify × Facebook ML Automation

Scalable ML-driven API automation for 200+ clients at Ziberr Communications. Improved user engagement by 40% and operational pipeline efficiency by 28%.

ML Ops · Shopify API · Python

Work Experience

Current
Graduate Student Grader
Stevens Institute of Technology
Course: AAI-627 Data Acquisition, Modeling and Analysis: Big Data Analytics. Assisting Prof. Yu-Dong Yao with grading, homework evaluation, and lab supervision for coursework covering AI, ML, DL, and big data systems.
AI / ML / DL · Big Data Analytics · PySpark · TensorFlow · CUDA · Python · Feature Engineering
2024
ML Evaluation Engineer
Outlier AI
Improved LLM evaluation accuracy by 12% across 10k+ prompts via reinforcement-based tuning. Automated evaluation pipelines reducing latency by 18% across 5+ LLM families. Benchmarked model reasoning on complex domain-specific tasks.
LLM Evaluation · RLHF · Prompt Engineering · Benchmarking
2023
ML Automation Engineer
Ziberr Communications
Designed scalable ML-driven API automation for 200+ e-commerce clients across Shopify and Facebook ecosystems. Improved user engagement by 40% and operational pipeline efficiency by 28%. Deployed models to production serving high-volume transaction data.
MLOps · Shopify API · Facebook API · Python · Production ML

Beyond Research

What keeps me curious outside the GPU cluster.

🌏
Model United Nations

3× Best Delegate at international MUN conferences. Competed and chaired across simulations spanning climate policy, cybersecurity governance, and global health, honing structured argumentation and diplomacy under pressure.

📚
Deep Reading

Heavy reader across AI research papers, cognitive science, and philosophy of mind. Regular on arXiv, particularly tracking efficient inference, mechanistic interpretability, and emergent reasoning in LLMs.

🏋️
Muay Thai

Just starting out in Muay Thai, drawn to its blend of technical precision and full-body conditioning. Treating it the same way I approach a new model architecture: fundamentals first, iterate from there.

🎮
Gaming

Competitive FPS (COD Warzone, CS:GO, and Valorant) for fast reflexes and team coordination. Also learning chess every day and finding that studying openings and endgames maps surprisingly well onto search and planning in AI.

✈️
Traveling & Cultures

Lived and worked across South Asia before moving to New Jersey. Passionate about cross-cultural exchange, shaped by years of international MUN and NGO operations across the region.

🎵
Music & Ambient Sound

Lo-fi, ambient, and film scores as the backdrop for deep work. I believe the right soundscape is a genuine productivity tool and treat it with the same intentionality as a well-crafted prompt.

Positions of Leadership

// 2018–2020

General Secretary & Head of Operations, South Asia

Lakshaya NGO for Model United Nations
South Asia Region

MUN · NGO · South Asia · Diplomacy
// Present

General Secretary

S.py Graduate AI Club
Stevens Institute of Technology

AI / ML · Graduate Club · Community · Stevens

Professional Recommendations

"

His knowledge spans diverse domains including Recommendation Systems, Generative Adversarial Networks, and Computer Vision, showcasing expertise in Deep Learning. His mastery of technical subjects is complemented by his ability to cultivate critical thinking and problem-solving.

Arjun Krishnamurthy
Assistant Professor, AI & ML
Dayananda Sagar University
"

In the VIOLA major project, he demonstrated exceptional research skills and effectively applied acquired knowledge to a cutting-edge, next-generation project. Frequent experimentation with new ideas underscores his dedication to pushing the boundaries of knowledge.

Pradeep Kumar K
Assistant Professor, CSE
Dayananda Sagar University
"

His unique blend of technical expertise, continuous learning, and outstanding soft skills make him invaluable. His passion for leveraging data analytics and AI for real-world applications positions him as a valuable asset with the potential to drive innovation in Data Science.

Pradeep Kumar K
Assistant Professor, CSE
Dayananda Sagar University
🏆
Best AI for Social Impact · Tech Spark 2024
Dayananda Sagar University, School of Engineering · Dept. of Computer Science & Engineering (AI & ML) · February 24, 2024
DSU · 2024
📄
Paper Presented · ISCMCTR-2024
VIOLA presented at the 2nd International Student Conference on Multidisciplinary and Current Technical Research, MITS Gwalior · April 20–21, 2024 · Paper ID: 42
International · 2024

Let's Connect

Open to AI/ML engineering roles, research collaborations, and PhD opportunities. If you're working on efficient language models, multimodal systems, or applied NLP, let's talk.

Based in Jersey City, NJ · Available for on-site, hybrid, or remote positions · F-1 OPT eligible upon graduation (Dec 2026)

// hidden: world_model.py
WORLD MODEL
ŝ(t+1) = f_θ( s(t), a(t) )
π*(a|s) = argmax_a [ r + γ · V(ŝ) ]
L = 𝔼[ ‖ s(t+1) − ŝ(t+1) ‖² ]
"A model that predicts its own future
then acts to make the best one real."
the dream behind every architecture I build.