CS & Computer Systems Engineering @ RPI

Tanay Anand

Building at the intersection of ML systems, embedded firmware, and distributed infrastructure. Currently researching kernel optimization for neural network workloads.

Where I've Worked

Software Developer & Instructor
TheCoderSchool — Remote
2022 — Present
  • Developed a PyTorch-based notes assistant that processed 5K+ documents, improving model accuracy by 20% through data preprocessing and validation workflows.
  • Built testing utilities to verify ML data pipelines, catch data handling issues, and improve reproducibility of model inference behavior.
  • Investigated runtime and memory-related failures in model execution, resolving edge cases in data loading and preprocessing.
  • Taught Python, Java, and C to 50+ students, with emphasis on software design, algorithmic thinking, and systematic debugging.
Teaching Assistant — Embedded Control
Rensselaer Polytechnic Institute — Troy, NY
Spring 2025
  • TA for the undergraduate Embedded Control course covering microcontroller programming, interrupt-driven I/O, real-time scheduling, and hardware-software co-design.
  • Held office hours and lab sessions guiding 60+ students through hands-on firmware development on ARM-based platforms using C, GPIO, timers, and finite state machines.
  • Designed and graded lab assignments involving sensor integration, motor control, and real-time event handling on resource-constrained hardware.
  • Debugged student hardware setups and firmware, troubleshooting issues across serial communication, interrupt priority conflicts, and timing-sensitive control loops.

Academic Background

Rensselaer Polytechnic Institute

B.S. Computer Science & Computer Systems Engineering
GPA: 3.91 / 4.00 Aug 2023 — Dec 2026 (Expected)

Machine Learning • Operating Systems • Distributed Systems • Computer Networks • Data Structures & Algorithms • Parallel Computing • Embedded Control • Computer Architecture

Carnegie Mellon Pre-College

National High School Game Development Academy
Jun — Aug 2021

Programmer (Unity C#), Developer & Audio Engineer. Built game prototypes in collaborative team environments using industry-standard tools.


Tech Stack

Languages
Python • C++ • C • Rust • Java • SQL
ML & Numerical Computing
PyTorch • Model Evaluation • Preprocessing • Inference • Validation • Tensor Ops • Benchmarking
Systems & Software
Data Structures • Algorithms • OOP • Testing • Profiling • Linux • Git • Docker • PostgreSQL
Embedded & Hardware
Microcontrollers • GPIO • Interrupts • FSMs • Real-Time Control • HW/SW Integration

Things I've Built

Low-Level / Trading

Mini-Matching-Engine

A simplified electronic trading engine written in pure C for Linux. It accepts buy and sell limit orders from stdin or a file, maintains an in-memory order book, and matches orders using price-time priority.

Makefile • C
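
Price-time priority can be illustrated with a minimal sketch. This is written in Python for brevity and is not the engine's C code; the `Order` and `OrderBook` names, the integer tick prices, and the trade tuple layout are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Order:
    order_id: int
    side: str   # "buy" or "sell"
    price: int  # limit price in integer ticks (hypothetical unit)
    qty: int


class OrderBook:
    """Toy limit order book: best price first, earliest arrival breaks ties."""

    def __init__(self):
        self._bids = []      # heap keyed on -price, so the highest bid is on top
        self._asks = []      # heap keyed on price, so the lowest ask is on top
        self._seq = count()  # arrival counter: the "time" in price-time priority

    def submit(self, order):
        """Match an incoming limit order and rest any remainder.

        Returns trades as (resting_id, incoming_id, price, qty) tuples.
        """
        trades = []
        is_buy = order.side == "buy"
        own, opposite = (self._bids, self._asks) if is_buy else (self._asks, self._bids)

        while order.qty > 0 and opposite:
            resting = opposite[0][2]
            # A buy crosses when the best ask is at or below its limit;
            # a sell crosses when the best bid is at or above its limit.
            crosses = resting.price <= order.price if is_buy else resting.price >= order.price
            if not crosses:
                break
            fill = min(order.qty, resting.qty)
            trades.append((resting.order_id, order.order_id, resting.price, fill))
            order.qty -= fill
            resting.qty -= fill
            if resting.qty == 0:
                heapq.heappop(opposite)

        if order.qty > 0:
            key = -order.price if is_buy else order.price
            heapq.heappush(own, (key, next(self._seq), order))
        return trades
```

Each heap entry is `(price_key, arrival_seq, order)`, so two resting orders at the same price are served in arrival order. Trades execute at the resting order's price, which is a common convention and an assumption here.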
ML / Visualization

Transformer Scaling Explainer

Interactive tool modeling Transformer inference scaling. Adjust layers, sequence length, precision, and hardware to visualize latency, attention cost, and KV cache behavior in real time.

React • Python • WebGL • Vercel
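
The scaling relationships behind a tool like this can be sketched in a few lines. The formulas below are the standard back-of-the-envelope estimates for KV cache size and attention FLOPs; the function names and the example model shape are hypothetical and are not taken from the tool itself.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    Each layer stores one key and one value vector (hence the factor 2)
    per KV head, per token, per sequence in the batch.
    bytes_per_elem=2 corresponds to 16-bit precision.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


def attention_flops(n_layers, d_model, seq_len):
    """Estimate FLOPs for the attention matmuls over a full sequence.

    Per layer, Q @ K^T costs about 2 * seq_len^2 * d_model FLOPs and the
    weighted sum over V costs the same, giving 4 * seq_len^2 * d_model.
    Projections and MLP layers are deliberately left out.
    """
    return 4 * n_layers * seq_len ** 2 * d_model


# Hypothetical model shape: 32 layers, 32 KV heads of dimension 128.
cache = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(f"KV cache: {cache / 1024**3:.1f} GiB")
```

The cache grows linearly with sequence length, while the attention matmul cost grows quadratically.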
ML / Visualization

Neural Networks & LLM Visualizer

Explore tokenization, attention patterns, and internal representations of large language models with real-time visual feedback.

React • TypeScript • D3.js • Vercel
Claude / ASIC

ASIC AI Agent - Claude

Built an AI agent for RTL verification that reads Verilog, generates testbenches, runs simulations, parses results, and reports pass/fail automatically.

Claude Code • Verilog • Python • Automation
Undergraduate Research

Kernel Optimization Study

Benchmarked matmul and convolution kernels across CPU/GPU. Achieved 35% training speedup via memory access optimization and tensor-level tuning.

CUDA • C++ • Python • PyTorch
Embedded Systems

Robotics Control System

Embedded C firmware with FSM control logic, timers, interrupts, GPIO, and event-driven state management for real-time sensor-driven behavior.

C • ARM Cortex • GPIO • RTOS
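
Event-driven FSM control of this kind is often written as a transition table. The sketch below shows the pattern in Python rather than the firmware's C; the states and events are hypothetical placeholders, not the ones used in the project.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    DRIVING = auto()
    AVOIDING = auto()


class Event(Enum):
    START = auto()
    OBSTACLE = auto()  # e.g. raised from a sensor interrupt
    CLEAR = auto()
    STOP = auto()


# (current state, event) -> next state. Pairs not listed leave the state unchanged.
TRANSITIONS = {
    (State.IDLE, Event.START): State.DRIVING,
    (State.DRIVING, Event.OBSTACLE): State.AVOIDING,
    (State.AVOIDING, Event.CLEAR): State.DRIVING,
    (State.DRIVING, Event.STOP): State.IDLE,
    (State.AVOIDING, Event.STOP): State.IDLE,
}


def step(state, event):
    """Return the next state for one event, ignoring events with no transition."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.IDLE):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping the transitions in a table separates the control policy from the code that produces events, which in firmware would typically be interrupt handlers or timer callbacks.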
Systems / Tooling

Telemetry Validation Toolkit

Python toolkit to parse, validate, and analyze structured system logs. Modular architecture for reusable validation logic across datasets.

Python • Pandas • CLI • JSON
Distributed Systems

Distributed Key-Value Store

Raft consensus-based KV store with linearizable reads/writes, leader election, log replication, and snapshotting for fault tolerance.

Rust • gRPC • Raft • Tokio
Networking

Packet Sniffer & Analyzer

Low-level packet capture decoding Ethernet, IP, TCP/UDP headers with real-time traffic visualization and filtering by protocol, port, and address.

C • libpcap • TCP/IP • ncurses
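
Header decoding for a captured frame can be sketched as follows. This is a Python illustration using the standard `struct` module, not the project's C/libpcap code; the `decode_frame` name and the returned dictionary layout are assumptions. It handles only IPv4 over Ethernet and reads ports for TCP and UDP.

```python
import ipaddress
import struct

ETHERTYPE_IPV4 = 0x0800
PROTO_NAMES = {6: "TCP", 17: "UDP"}


def decode_frame(frame: bytes):
    """Decode Ethernet, IPv4, and TCP/UDP port fields from a raw frame.

    Returns a dict of decoded fields, or None if the frame is too short
    or is not IPv4.
    """
    if len(frame) < 14:
        return None
    # Ethernet II header: destination MAC, source MAC, EtherType (big-endian).
    _dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_IPV4:
        return None

    ip = frame[14:]
    if len(ip) < 20 or ip[0] >> 4 != 4:
        return None
    # IHL is the low nibble of the first byte, in 32-bit words.
    header_len = (ip[0] & 0x0F) * 4
    if header_len < 20 or len(ip) < header_len:
        return None

    protocol = ip[9]
    info = {
        "protocol": PROTO_NAMES.get(protocol, str(protocol)),
        "src": str(ipaddress.IPv4Address(ip[12:16])),
        "dst": str(ipaddress.IPv4Address(ip[16:20])),
    }

    # TCP and UDP both start with 16-bit source and destination ports.
    payload = ip[header_len:]
    if protocol in PROTO_NAMES and len(payload) >= 4:
        info["src_port"], info["dst_port"] = struct.unpack("!HH", payload[:4])
    return info
```

Filtering by protocol, port, or address then reduces to comparing fields of the returned dictionary.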
Operating Systems

Custom Memory Allocator

User-space allocator with first-fit, best-fit, and buddy strategies. Benchmarked fragmentation and throughput against glibc malloc.

C • Linux • mmap • Valgrind
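
The difference between the placement strategies can be shown with a small simulation. This is a Python sketch over a list of `(offset, length)` free blocks, not the allocator's C code; the function names and the 16-byte minimum buddy block are assumptions for the example.

```python
def first_fit(free_blocks, size):
    """Return the index of the first free block large enough, or None."""
    for i, (_offset, length) in enumerate(free_blocks):
        if length >= size:
            return i
    return None


def best_fit(free_blocks, size):
    """Return the index of the smallest free block large enough, or None."""
    best = None
    for i, (_offset, length) in enumerate(free_blocks):
        if length >= size and (best is None or length < free_blocks[best][1]):
            best = i
    return best


def allocate(free_blocks, size, choose):
    """Carve `size` bytes out of the block picked by `choose`.

    Mutates free_blocks in place and returns the allocated offset,
    or None if no block is large enough.
    """
    i = choose(free_blocks, size)
    if i is None:
        return None
    offset, length = free_blocks[i]
    if length == size:
        del free_blocks[i]  # exact fit: the block is consumed entirely
    else:
        free_blocks[i] = (offset + size, length - size)  # keep the tail free
    return offset


def buddy_block_size(size, min_block=16):
    """Round a request up to the power-of-two block a buddy allocator would use."""
    block = min_block
    while block < size:
        block *= 2
    return block
```

With free blocks `[(0, 64), (100, 16), (200, 32)]` and a 16-byte request, first-fit splits the 64-byte block while best-fit consumes the exact 16-byte block, which is the kind of behavior that fragmentation benchmarks compare.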