AI Researcher & Engineer

Avishek
Shrabon

BSc in Computer Science & Engineering. Researching federated learning systems and LLM knowledge elicitation. Seeking a fully-funded Masters or PhD abroad to push the boundaries of AI.

3.74
CGPA BSc CSE — American International University Bangladesh
2
Research Papers Federated Learning & LLM Knowledge Elicitation — in preparation
75%
Time Saved Real-world software reduced event reconciliation from 20 min → 5 min

Building towards
meaningful AI research

I'm a CS graduate from American International University Bangladesh, where I completed my BSc in Computer Science and Engineering with a CGPA of 3.74, finishing the degree in 3.5 years. My academic journey has been defined by a deep curiosity about how intelligent systems learn, communicate, and scale.

My research sits at the intersection of distributed machine learning and large language model behavior, two areas I believe will define the next decade of AI development. I'm driven by problems that are both theoretically rich and practically meaningful.

Beyond research, I build software that solves real problems. I believe that good engineering and good science reinforce each other: the discipline of shipping code makes you a more rigorous thinker.

Since mid-2022 I have worked as a private tutor, teaching Physics, Chemistry, Biology, Mathematics, Higher Mathematics, and ICT to students from Grade 8 through 12. Teaching 3–4 students at a time, five days a week, throughout my degree taught me how to communicate complex ideas clearly, manage my time under pressure, and stay financially independent while pursuing research.

Degree BSc Computer Science & Engineering
Institution American International University Bangladesh
CGPA 3.74 / 4.00
Graduated 2025 — completed in 3.5 years
Location Dhaka, Bangladesh
Status Seeking Fully-Funded Masters / PhD — 2025
Experience Private Tutor — Sciences & Mathematics (2022 — Present)

How I got
here

The honest story of how someone with no ML background taught themselves federated learning and produced original research in six weeks, mostly alone.

Key insight

Self-directed learning under pressure is the most important skill a researcher can have. I found that out the hard way, and I'm grateful for it.

01 The Starting Point

No ML background. A thesis to write.

In early 2025, I was assigned a federated learning thesis with three teammates, two of whom had completed Machine Learning and CVPR courses. I had taken neither. My only formal AI exposure was a survey course covering search algorithms. I didn't know what a forward pass was, what backpropagation meant, or how Knowledge Distillation worked.

02 Six Weeks

From zero to research-ready.

I spent roughly six weeks building my foundation from scratch, working through Google, YouTube, research papers, and the Flower federated learning framework to understand the field. I mapped out the terminology, the architecture patterns, and the math behind gradient flow and distillation. By the end I could read papers, understand the tradeoffs, and think about the problem independently.

03 The Reality

Leading the technical development.

My two most technically experienced teammates were on internships at the same time and couldn't dedicate time to the thesis. I was left as the sole technical contributor. Rather than wait, I moved forward, designing the entire pipeline, making architectural decisions, and writing most of the codebase independently.

04 The Methodology

A novel approach, proposed from scratch.

I proposed combining gradient-based client clustering with a Knowledge Distillation setup to handle heterogeneous data and client distributions in federated environments. I presented this to our supervisor, who approved the direction. The approach became the core contribution of the paper.

05 The Outcome

A+ and a paper approaching publication.

The thesis was graded A+. The paper is currently being refined for conference submission. The experience taught me that self-sufficiency is not just a skill; it's a research necessity. A PhD environment will have no shortage of moments where you have to figure things out alone. I already know I can.

Active work in AI systems

In Progress — Active Experimentation

HSBN-FL: Hierarchical Stochastic Bottleneck Networks for Heterogeneous Federated Learning

Federated averaging assumes architectural homogeneity — clients must share compatible tensor shapes for weight aggregation to be meaningful. In practice this assumption breaks: clients differ in hardware, compute budget, and naturally adopt different architectures. Rather than patching compatibility onto an aggregation-based protocol, this work removes weight exchange entirely. Clients communicate compressed latent representations through per-client stochastic bottlenecks into a fixed-dimensional common space, and the server reasons over those representations rather than averaging parameters. The server is structured as two levels: a Transformer adapter (Z1) that attends over all participating clients simultaneously — giving each client a globally-informed refined representation — and an MLP apex (Z2) that compresses the pooled output into the most abstract global state. A top-down feedback signal then flows back to each client carrying semantic guidance derived from the global representation, used as a soft alignment target alongside the local task loss. The result is a fully heterogeneous system: clients may use any architecture with any output dimensionality, no model is ever shared, and cross-client collaboration emerges implicitly through the attention mechanism at Z1.
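
As a rough illustration of the data flow described above, and not the project's actual code, the sketch below pushes three clients with different feature sizes through per-client noisy projections into a common space, then lets each client attend over all the others. Everything named here is an assumption: `COMMON_DIM`, the random projections standing in for learned encoders, the identity query/key/value maps, the noise level, and mean pooling in place of the Z2 MLP. Plain Python is used instead of PyTorch so the example runs without dependencies.

```python
import math
import random

COMMON_DIM = 4  # hypothetical size of the shared latent space


def make_projection(in_dim, out_dim, rng):
    """Random linear map (out_dim x in_dim) standing in for a learned bottleneck encoder."""
    return [[rng.gauss(0.0, 1.0 / math.sqrt(in_dim)) for _ in range(in_dim)]
            for _ in range(out_dim)]


def stochastic_bottleneck(features, projection, noise_std, rng):
    """Project client features into the common space and add Gaussian noise."""
    return [sum(w * x for w, x in zip(row, features)) + rng.gauss(0.0, noise_std)
            for row in projection]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_clients(latents):
    """Single-head dot-product self-attention with identity projections (a simplification)."""
    scale = math.sqrt(len(latents[0]))
    refined = []
    for query in latents:
        weights = softmax([sum(q * k for q, k in zip(query, key)) / scale
                           for key in latents])
        refined.append([sum(w * value[d] for w, value in zip(weights, latents))
                        for d in range(len(query))])
    return refined


rng = random.Random(0)
client_dims = [8, 16, 32]  # three clients with different feature sizes
latents = []
for dim in client_dims:
    features = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    projection = make_projection(dim, COMMON_DIM, rng)
    latents.append(stochastic_bottleneck(features, projection, noise_std=0.1, rng=rng))

# Each client receives a representation informed by every other client.
refined = attend_over_clients(latents)

# Mean pooling is a placeholder for the pooled input to the Z2 apex.
global_state = [sum(vec[d] for vec in refined) / len(refined) for d in range(COMMON_DIM)]
```

No weights cross the client boundary in this sketch; only the `COMMON_DIM`-sized vectors do, which is what allows the client feature sizes to differ.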

Currently in active experimentation on Kaggle. Smoke test passed; full 200-round run underway with ablations across Dirichlet heterogeneity levels queued. No submission timeline set.
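
The Dirichlet heterogeneity levels mentioned above usually refer to a common way of simulating non-IID clients: for each class, the share given to each client is drawn from a Dirichlet distribution with concentration `alpha`, and smaller `alpha` gives more skewed splits. The sketch below is a generic version under that assumption. The function names, the seed, and the `alpha` values are illustrative and are not taken from the experiments.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw one symmetric Dirichlet sample by normalising Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients, class by class, with Dirichlet-distributed shares."""
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        shares = sample_dirichlet(alpha, num_clients, rng)
        start = 0
        cumulative = 0.0
        for client_id, share in enumerate(shares):
            cumulative += share
            # The last client takes the remainder so no index is dropped by rounding.
            if client_id == num_clients - 1:
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Toy labels: 1000 samples spread evenly over 10 classes.
labels = [i % 10 for i in range(1000)]
skewed = dirichlet_partition(labels, num_clients=5, alpha=0.1)
balanced = dirichlet_partition(labels, num_clients=5, alpha=100.0)
```

With a small `alpha`, most clients end up dominated by a few classes. With a large `alpha`, each client's class mix approaches the global one.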

Federated Learning, Representation Learning, Hierarchical Learning, Transformer, Information Bottleneck
Manuscript Under External Review

Hierarchical Stochastic Bottleneck Networks: Conditions for Representational Abstraction in Deep Hierarchies

Deep networks gain representational power from depth, yet standard training under a single global objective does not organize what is learned at each layer — successive layers tend to encode redundant information rather than forming a genuine hierarchy of abstractions. This paper investigates Hierarchical Stochastic Bottleneck Networks (HSBNs), architectures combining three well-studied mechanisms: per-level objectives providing independent gradient signal at each level, bandwidth-limited stochastic channels constraining inter-level information flow via a KL-divergence penalty, and bidirectional message passing enabling top-down modulation of lower-level representations. We analyze theoretically under what conditions this combination induces a strict abstraction hierarchy — showing each level learns a strictly more compressed representation than the one below — and verify this empirically on CIFAR-10 and CIFAR-100 using Centered Kernel Alignment and intrinsic dimensionality. Controlled ablations reveal two qualitatively distinct failure modes: removing per-level objectives collapses coarse-level accuracy to near-random (3.9% on CIFAR-100) while fine-level performance remains intact, indicating abstraction does not self-organize under a global objective; removing bandwidth constraints increases raw accuracy while destroying representational differentiation (CKA = 1.0). These findings characterize the minimum conditions under which hierarchical abstraction emerges and quantify the accuracy cost of enforcing it. Code and checkpoints are released in full.
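
To make the KL-divergence penalty mentioned above concrete, here is a minimal sketch that assumes each stochastic channel emits a diagonal Gaussian and is penalised against a standard normal prior. The abstract does not state the prior, the parameterisation, or the weighting, so the function names and the `beta` weight are hypothetical.

```python
import math


def gaussian_kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mean, log_var))


def level_objective(task_loss, mean, log_var, beta=0.01):
    """Per-level loss: the level's own task loss plus a weighted bandwidth penalty."""
    return task_loss + beta * gaussian_kl_to_standard_normal(mean, log_var)
```

A larger `beta` tightens the channel: the penalty is zero only when the channel output matches the prior, at which point it carries no information about its input.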

Manuscript complete and submitted to an external research lab for review and refinement. Targeting an IEEE conference; venue to be confirmed.
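
For readers unfamiliar with the Centered Kernel Alignment score reported in the abstract, the sketch below computes the linear variant on two small activation matrices (rows are examples, columns are features). Whether the paper uses the linear or a kernel variant is not stated, so treat this as an assumption; the helper names are illustrative.

```python
import math


def center_columns(matrix):
    """Subtract each column's mean so every feature is zero-centred."""
    n = len(matrix)
    means = [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]
    return [[value - means[j] for j, value in enumerate(row)] for row in matrix]


def cross_gram_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices with equal row counts."""
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            entry = sum(row_a[i] * row_b[j] for row_a, row_b in zip(a, b))
            total += entry * entry
    return total


def linear_cka(x, y):
    """Linear CKA between two representations of the same examples.

    Undefined (division by zero) if either representation is constant.
    """
    x_c, y_c = center_columns(x), center_columns(y)
    return cross_gram_sq_norm(x_c, y_c) / math.sqrt(
        cross_gram_sq_norm(x_c, x_c) * cross_gram_sq_norm(y_c, y_c))
```

A score of 1.0 means the two levels encode the same structure up to rotation and scale, which is why the abstract treats CKA = 1.0 as a loss of differentiation between levels.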

Deep Learning, Representation Learning, Information Bottleneck, Hierarchical Models, CIFAR
Archived — AIUB Academic Repository

Adaptive Federated Learning with Heterogeneous Data and Client Distributions

Addresses one of federated learning's most persistent challenges: real-world deployments where clients hold statistically heterogeneous data and differ in computational capacity. The approach combines gradient-based client clustering with a Knowledge Distillation framework, enabling models to generalize across non-IID distributions without sacrificing client privacy or convergence stability.
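
The two ingredients named above can be sketched generically. This is not the thesis code: it assumes clients are grouped greedily by cosine similarity of their flattened gradients, and that distillation uses the standard temperature-softened loss. The `threshold`, `temperature`, and `alpha` values and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two gradient vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_clients(gradients, threshold=0.8):
    """Greedy clustering: a client joins the first cluster whose representative is similar."""
    clusters = []  # each cluster is a list of client ids; the first member represents it
    for client_id, grad in gradients.items():
        for cluster in clusters:
            if cosine_similarity(grad, gradients[cluster[0]]) >= threshold:
                cluster.append(client_id)
                break
        else:
            clusters.append([client_id])
    return clusters


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Weighted sum of softened teacher-student KL and hard-label cross-entropy."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(soft_teacher, soft_student) if p > 0.0)
    hard = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard


# Toy gradients: clients "a"/"b" point one way, "c"/"d" another.
gradients = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-0.1, 1.0]}
clusters = cluster_clients(gradients)
```

Only gradient directions and logits are compared in this sketch, so no raw client data needs to leave a device.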

Originally completed as part of the BSc CSE programme (graded A+). Archived in the AIUB academic repository; not currently targeting external publication.

Federated Learning, Knowledge Distillation, Non-IID Data, Client Clustering, Distributed ML

Software and hardware that ships and works

Software — Real World Deployment

Church Event Ledger & Ticket Reconciliation System

2025

A practical ledger application built in three days to manage ticket sales and financial reconciliation for a church revival meeting. Before this tool, end-of-night accounting required manual tallying across multiple counters, averaging 15–20 minutes per session. The software centralised all transactions in real time, reducing close-out time to under 5 minutes across five consecutive events.
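
The headline time saving follows from simple arithmetic on the figures above. A small check, with a hypothetical helper name:

```python
def percent_reduction(before_minutes, after_minutes):
    """Relative time saved, as a percentage of the original duration."""
    return (before_minutes - after_minutes) / before_minutes * 100.0


upper = percent_reduction(20, 5)  # 75.0, measured against the 20-minute end of the range
lower = percent_reduction(15, 5)  # about 66.7, measured against the 15-minute end
```

The 75% figure corresponds to the 20-minute baseline. Against a 15-minute baseline, the same 5-minute close-out is roughly a two-thirds reduction.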

Built under time pressure, used in production, by real people. Currently being refined with additional features.

Full-Stack, Financial Ledger, PostgreSQL, Real-World Deployment
75% time reduction per session
Hardware — Embedded Systems

Proximity Sensor Aid for the Visually Impaired

2023

A wearable proximity detection system designed to assist blind users in navigating their environment. Built around an Arduino Nano with an HC-SR04 ultrasonic sensor for mid-range detection and an experimental Time-of-Flight sensor for close-range precision. The device provided haptic and audio feedback on obstacle proximity.
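
The HC-SR04 reports distance as the duration of an echo pulse, so ranging comes down to one conversion: sound travels to the obstacle and back, so the one-way distance is half the round trip. The sketch below shows that conversion plus a hypothetical mapping from distance to feedback intensity. The original firmware was written in Embedded C; Python is used here only for illustration, and the `near_cm` and `far_cm` thresholds are assumptions, not the device's settings.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at roughly 20 degrees Celsius


def echo_to_distance_cm(echo_duration_us):
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    return echo_duration_us * SPEED_OF_SOUND_CM_PER_US / 2.0


def feedback_level(distance_cm, near_cm=30.0, far_cm=200.0):
    """Map distance to a 0.0-1.0 feedback intensity (thresholds are hypothetical)."""
    if distance_cm <= near_cm:
        return 1.0  # strongest feedback when an obstacle is very close
    if distance_cm >= far_cm:
        return 0.0  # no feedback beyond the far threshold
    return (far_cm - distance_cm) / (far_cm - near_cm)


distance = echo_to_distance_cm(1000)  # a 1 ms echo is about 17 cm away
```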

Identified key limitations honestly: the prototype was too bulky for practical daily use, and AI-based object recognition, the intended next step, proved cost-prohibitive at the time. A clear improvement path exists.

Arduino Nano, HC-SR04, Time-of-Flight, Embedded C, Assistive Tech
ToF experimental sensor integration
Hardware — Electronics

Darkness-Triggered Street Light Automation

2023

An automatic street lighting system using a light-dependent resistor to detect ambient darkness and trigger relay-controlled lighting. Designed as an energy-efficient alternative to timer-based systems, activating only when environmental conditions require it.

Coursework project demonstrating practical application of sensor-based automation and circuit design.

LDR Sensor, Relay Circuit, Automation, Electronics
Software — Coursework

Hotel Management System

2023

A full-featured hotel management system built as coursework, handling room booking, guest records, billing, and occupancy tracking. Emphasis was on logic-heavy system design, the primary goal of the course.

Group project. Source code lost to a system reset, a lesson in version control that led directly to consistent Git usage afterward.

Management System, Database Design, Backend Logic

Tools of the trade

Languages
Python, JavaScript, TypeScript, SQL, C++ ∗, C# ∗
AI / ML
PyTorch, scikit-learn, Hugging Face, NumPy, Pandas, Seaborn, Matplotlib
Web & Frameworks
React, Next.js, Nest.js, Node.js, Vite, HTML, CSS / SCSS
Databases
PostgreSQL, MongoDB, SQLite
Tools & Platforms
Git, GitHub, VS Code, Google Colab, Kaggle, Linux
Research Areas
Federated Learning, Knowledge Distillation, LLMs, NLP

∗ Surface-level familiarity

"I want to do research that matters and build the skills to keep doing it."

I'm actively seeking a fully-funded Masters, PhD, or joint programme abroad in AI or Machine Learning. My goal is to join a research group where I can contribute meaningfully from day one, not just as a student, but as a collaborator.

My research background in federated learning and LLM knowledge elicitation gives me a foundation to build on. I'm particularly interested in labs working on efficient and distributed learning systems, LLM interpretability, and the alignment between model capabilities and knowledge representation.

I come from Bangladesh, where access to computational resources and research mentorship is limited, but that constraint has taught me to be resourceful, rigorous, and deeply motivated. I taught myself federated learning from scratch, carried a thesis team alone, and produced original research without the prerequisite coursework. I'm not looking for a comfortable path. I'm looking for the right environment to grow into the researcher I know I can be.

If you're a professor or researcher whose work intersects with mine, I'd genuinely love to connect.

Research Interests
Distributed ML, LLM Interpretability, Federated Systems, Knowledge Representation, ML Efficiency, AI Alignment, NLP, Model Probing

Let's connect

Whether you're a professor, researcher, collaborator, or just someone who found my work interesting — I'd love to hear from you.

Send a message

Have a research opportunity or just want to talk? Drop a note below.