Research & Projects — Md A Rahman

SAEGuardBench — Do SAE Features Help Detect Jailbreaks?

Benchmark comparing 8 detection methods across 4 paradigms on 6 datasets and 4 LLMs (2B-70B parameters). SAE features consistently hurt jailbreak detection compared to simple linear probes on raw activations. The Detection Gap is negative on every model tested.
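As a rough illustration of the comparison described above, the sketch below computes AUROC for two detectors and takes the difference. It assumes the Detection Gap is defined as SAE-probe AUROC minus raw-activation-probe AUROC, which this page does not spell out. The scores are made-up toy values, not benchmark results, and the function names are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison: the probability that a random
    positive (jailbreak) example scores higher than a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def detection_gap(sae_auroc, raw_auroc):
    """Assumed definition: negative means SAE features do worse."""
    return sae_auroc - raw_auroc


# Toy data only: 1 = jailbreak prompt, 0 = benign prompt.
labels = [1, 1, 1, 0, 0, 0]
raw_probe_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
sae_probe_scores = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7]

raw_auc = auroc(raw_probe_scores, labels)
sae_auc = auroc(sae_probe_scores, labels)
gap = detection_gap(sae_auc, raw_auc)
print(f"raw={raw_auc:.3f} sae={sae_auc:.3f} gap={gap:+.3f}")
```

The pairwise loop is quadratic in the number of examples, so it only suits small illustrations. For real evaluations a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.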

Technologies: Python, PyTorch, TransformerLens, SAELens, HuggingFace, Gradio

GitHub | Paper

CompToolBench — LLM Tool-Use Evaluation

Evaluation framework testing where 18 LLMs fail across four complexity levels of tool use. The Selection Gap: accuracy on single-action selection is 13.2pp lower than on multi-step composition, and the gap appears in 17 of 18 models.
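A minimal sketch of how a gap like this could be tallied per model, assuming the Selection Gap is composition accuracy minus selection accuracy, expressed in percentage points. The model names and accuracies below are hypothetical placeholders, not CompToolBench results.

```python
def selection_gap_pp(selection_acc, composition_acc):
    """Gap in percentage points; positive means selection is the weaker skill."""
    return (composition_acc - selection_acc) * 100.0


def summarize(results):
    """results maps model name -> (selection accuracy, composition accuracy)."""
    gaps = {
        model: selection_gap_pp(sel, comp)
        for model, (sel, comp) in results.items()
    }
    models_with_gap = sum(1 for g in gaps.values() if g > 0)
    mean_gap = sum(gaps.values()) / len(gaps)
    return gaps, models_with_gap, mean_gap


# Hypothetical accuracies in [0, 1], for illustration only.
toy_results = {
    "model_a": (0.60, 0.75),
    "model_b": (0.50, 0.60),
    "model_c": (0.80, 0.78),
}

gaps, models_with_gap, mean_gap = summarize(toy_results)
print(f"{models_with_gap}/{len(gaps)} models show a gap; mean = {mean_gap:.1f}pp")
```

Reporting both the mean gap and the count of models that show it mirrors the two figures quoted above (a size in pp and a 17/18 tally).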

Technologies: Python, LiteLLM, pytest, Gradio, HuggingFace

GitHub | Paper

MergeSafe — How Backdoors Survive Model Merging

Testing whether model merging dilutes backdoors. It does not. LoBAM brings attack success back to 83-99%. Built a three-signal pre-merge scanner with 100% recall and 0% false positives across 18 configurations.

GitHub | Paper

TrafficLM — A 12-Year Concept Drift Study in Website Fingerprinting

Measuring how website fingerprinting models decay over time. Training 7 classifiers on 2014 Tor traffic and testing on a fresh 2026 crawl. Targeting PoPETs 2027.

GitHub | Paper

Traffic Fingerprinting — Encrypted Network Traffic Classification

94.1% accuracy classifying encrypted website traffic using only packet-size features across 5,001 samples from 50 websites.

GitHub

RADAR-Rowhammer — EM Side-Channel Attack Detection

EM side-channel rowhammer detection across 5 architectures, 7 attack patterns, and 3 DRAM platforms. 92.7% accuracy.

GitHub

IELTSLab — Adaptive Learning Platform with Speech ML

Full-stack monorepo with 5 containerized microservices. Python ML sidecar with faster-whisper speech recognition and adaptive testing.

GitHub

Crown and Caddie — Golf Social Network

Golf social networking mobile app with 50+ screens, real-time messaging, and P2P marketplace with Stripe payments.

PolyHope — Hope Speech and Sarcasm Detection

Dual-task NLP framework using RoBERTa, achieving an 80% F1 score on hope speech detection at RANLP 2025.

GitHub

Portfolio

Projects & Research

Research and engineering projects across AI safety, LLM evaluation, LiDAR processing, NLP, and mobile development

SAEGuardBench
Featured

SAE Jailbreak Detection Benchmark

SAE features consistently hurt jailbreak detection. Raw activation probes achieve 0.949 AUROC vs 0.712 for SAE-based methods.

Technologies: Python, PyTorch, TransformerLens, +3 more

CompToolBench
Featured

LLM Tool-Use Evaluation

Evaluation framework testing where 18 LLMs fail across four complexity levels. Found the Selection Gap: 13.2pp lower accuracy on single-tool selection than multi-step composition.

Technologies: Python, LiteLLM, pytest, +3 more

MergeSafe
Featured

Backdoor Survival in Model Merging

Backdoors survive model merging. LoBAM amplification brings attack success to 83-99%. Built a pre-merge scanner with 100% recall.

Technologies: Python, PyTorch, HuggingFace, +2 more

TrafficLM
Featured

12-Year Concept Drift Study

Measuring how website fingerprinting models decay over 12 years. Deep Fingerprint CNN achieves 94.4% but drops under concept drift.

Technologies: Python, PyTorch, scikit-learn, +2 more

Traffic Fingerprinting

Encrypted Traffic Classification

94.1% accuracy classifying encrypted website traffic using only packet-size features across 5,001 samples from 50 websites.

Technologies: Python, scikit-learn, Streamlit, +2 more

RADAR-Rowhammer

EM Side-Channel Detection

EM side-channel rowhammer detection across 5 architectures, 7 attack patterns, and 3 DRAM platforms. 92.7% accuracy.

Technologies: Python, PyTorch, torchaudio, +2 more

IELTSLab

Adaptive Learning Platform

21K+ LOC monorepo with 5 containerized microservices. ML sidecar with faster-whisper speech recognition and adaptive testing.

Technologies: Next.js, React Native, FastAPI, +2 more

AnthroCircle

Bilingual Academic Publishing

Bilingual academic publishing platform serving 120+ articles and 5,000+ daily readers. Migrated from WordPress to Next.js 15.

Technologies: Next.js, PostgreSQL, Prisma, +2 more

PolyHope

Dual-task NLP Framework

80% F1 score using RoBERTa for hope speech detection.

Technologies: PyTorch, RoBERTa, Transformers, +1 more

TTU Parking Intelligence

RaiderPark

Parking intelligence app for Texas Tech with on-device ML (TF Lite), LightGBM + Temporal Fusion Transformer ensemble, and PostGIS geospatial queries.

Technologies: React Native, TensorFlow Lite, LightGBM, +4 more

Crown & Caddie

Golf Social Network

Golf social networking mobile app with 50+ screens, event coordination, player profiles, handicap tracking, real-time messaging, and full P2P marketplace with Stripe payments.

Technologies: React Native, Expo, Supabase, +2 more

LiDAR Traffic Safety

Python Framework

Real-time 3D point cloud processing with DBSCAN clustering.
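To show what DBSCAN does on 3D points, here is a small pure-Python sketch of the standard algorithm. It is an illustration only: the project itself lists scikit-learn, whose `sklearn.cluster.DBSCAN` would be the practical choice, and the `eps` and `min_pts` values and the toy points below are assumptions, not settings from the project.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
    return labels


# Toy (x, y, z) points: two tight groups and one isolated return.
toy_points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
    (5.0, 5.0, 1.0), (5.1, 5.0, 1.0), (5.0, 5.1, 1.0),
    (20.0, 20.0, 20.0),
]
toy_labels = dbscan(toy_points, eps=0.3, min_pts=3)
print(toy_labels)  # -> [0, 0, 0, 0, 1, 1, 1, -1]
```

The brute-force neighbor search here is quadratic, so real-time use on full point clouds would need a spatial index such as a k-d tree or voxel grid.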

Technologies: Python, scikit-learn, matplotlib, +1 more

LiDAR Pipeline

TxDOT Project

10TB+ real-time data processing for infrastructure safety.

Technologies: Python, CUDA, Big Data, +1 more