Hi, I'm Danny Wang
15-year-old student and developer with over eight years of experience building meaningful software.
I'm obsessed with everything from CUDA transformer compression to PyTorch deep learning research to full-stack React apps that actually ship.
About
I started programming at the age of seven by making simple text-based games with Python. Over time, I've explored full-stack development, competitive programming, and machine learning. Today, I write C++ for algorithmic contests and high-speed AI libraries, Python for ML experiments and other scripts, and HTML/CSS/JavaScript for web apps, though I have plenty of other technologies under my belt too. I learn fastest by building things, whether it be fast-paced hackathons, huge projects, or deep theoretical research papers. Outside of software engineering, you'll find me at the piano working through Chopin nocturnes, topping honour rolls in math contests, training for badminton and competing at provincial tournaments, or reading about Lagrangian mechanics just because it's interesting.
Experience
Key Achievements
Mathematics
- AIME Qualifier twice (top ~5%)
- 1st Place in Grade 10 for multiple Canadian & Alberta contests
- 65+ contest awards and honour roll placements since 2021
- 6x Waterloo contests school champion
Competitive Programming
- 1842 Codeforces Rating (top ~4% worldwide)
- Gold USACO Division
- Group 2 CCC Senior (top 1%) and 3 Certificates of Distinction
- 2000+ problems solved across Codeforces, LeetCode, DMOJ, and more
- 8+ years of coding (since 2017)
Other Honours & Awards
- Structure & Design Award from Calgary Youth Science Fair (CYSF) with $250 prize money
- Honours with Distinction in all my RCM Piano exams
- 3 Gold Medals from school badminton
- Honours with Distinction awarded for maintaining an average above 90% in high school
Projects
Tensorbit Core - A C++/CUDA Transformer Pruning Library
Year Built: 2026
Results: Production-ready pruning library for structured N:M sparsity on transformer models. Custom .tbm container format, CUDA-accelerated EHAP + CORING, BlockOBS greedy pruning. Prunes Mistral 7B in ~80 minutes with support for models of all sizes.
A C++/CUDA pruning library built as part of Tensorbit Labs's state-of-the-art pruning -> distillation -> quantization approach to compressing and optimizing large language models and vision transformers. Tensorbit Core, the first stage of the pipeline, applies structured N:M sparsity to models using Efficient Hessian Aware Pruning (EHAP) and coarse-to-fine mask generation (CORING). It also features a custom binary container format (.tbm) for zero-copy GPU loading, multi-shard safetensors support, and an inference engine with 13 CUDA/C compute kernels for autoregressive transformer decoding on pruned models.
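To make the N:M sparsity pattern concrete, here is a minimal sketch in plain Python. It keeps the N largest-magnitude weights in every group of M consecutive weights and zeroes the rest. Magnitude scoring is a stand-in chosen for illustration only; it is not EHAP, CORING, or BlockOBS, and the function name `prune_n_m` is hypothetical rather than part of Tensorbit Core's API.

```python
def prune_n_m(weights, n=2, m=4):
    """Return (pruned_weights, mask), keeping n entries per group of m.

    Assumption: importance is plain weight magnitude, a simple stand-in
    for the Hessian-aware scores described above.
    """
    if len(weights) % m != 0:
        raise ValueError("row length must be a multiple of m")
    pruned, mask = [], []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest-magnitude entries in this group.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        for i, w in enumerate(group):
            kept = i in keep
            mask.append(1 if kept else 0)
            pruned.append(w if kept else 0.0)
    return pruned, mask


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
pruned_row, mask = prune_n_m(row, n=2, m=4)
print(pruned_row)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
print(mask)        # [1, 0, 0, 1, 0, 1, 1, 0]
```

With 2:4 sparsity, every group of four weights keeps exactly two entries. That regular layout is what makes the pattern "structured".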
Read the Writeup
View on GitHub
SpinFilter - Bias Detection and Content Analysis Platform
Year Built: 2026
Results: 3rd Place, Calgary Hacks 2026 (Tier 2) - $50 prize.
A project built in 24 hours for the Calgary Hacks 2026 hackathon by a team of four high school students. It integrates a Python neural network and an LLM into a user-friendly, React-powered frontend that gives readers accessible insight into media bias and language. Built completely from scratch, with features including NLP sentiment analysis, LLM bias removal, media outlet bias statistics, article web-scraping from a URL, and per-paragraph bias detection and analysis.
View on GitHub
Cosine Similarity Distillation - Novel Neural Network Teacher-Free Distillation Technique
Year Built: 2026
Results: 67x storage reduction vs KD while maintaining a competitive accuracy gain over the baseline student. Uses only a single teacher forward pass in the entire distillation process. Per-class fingerprints as small as 50 KB.
A proposed approach to knowledge distillation in neural networks: a teacher-free method that uses Gaussian random projection teacher fingerprints for storage-efficient, computationally light knowledge transfer to the student. Compared experimentally against current KD and FitNet distillation methods, and fully implemented in Python using a ResNet model from PyTorch's vision library on the CIFAR-100 dataset. The research paper is currently a work in progress.
View on GitHub
Calgary Housing Neural Network - Multilayer Perceptron Built from Scratch
Year Built: 2026
Results: R² = 0.87 in dollar space (0.99 in log10 space), MAE ~$66k, median APE ~6% on held-out test set. Deployed with a Flask web interface.
An 86→256→128→64→1 MLP (three hidden layers) that predicts Calgary residential property assessments from the City's open assessment data, built entirely in NumPy and Pandas with no high-level ML libraries. Implements backpropagation, Adam optimizer with learning rate decay, Huber loss in log10 space, and He weight initialization (Kaiming initialization) all from scratch. Features a custom training loop with batch processing and validation splitting, demonstrating a complete understanding of how neural networks work under the hood rather than relying on framework abstractions.
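Two of the pieces named above, He initialization and Huber loss in log10 space, can be sketched in a few lines. This is an illustration in plain standard-library Python, not the project's NumPy code. The function names, the `delta` default of 1.0, and the seed are assumptions made for the example.

```python
import math
import random


def he_init(fan_in, fan_out, rng):
    """He (Kaiming) normal init: zero-mean Gaussian with std sqrt(2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


def huber(residual, delta=1.0):
    """Huber loss for one residual: quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def huber_log10(predicted_dollars, actual_dollars, delta=1.0):
    """Mean Huber loss on log10 of the dollar values (inputs must be positive)."""
    losses = [
        huber(math.log10(p) - math.log10(a), delta)
        for p, a in zip(predicted_dollars, actual_dollars)
    ]
    return sum(losses) / len(losses)


# First layer of an 86 -> 256 network, seeded for reproducibility.
first_layer = he_init(86, 256, random.Random(0))
print(len(first_layer), len(first_layer[0]))  # 86 256
```

Working in log10 space means the loss penalizes relative error: a prediction that is off by a factor of two costs the same for a $200k home as for a $2M home.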
View on GitHub
This Portfolio Website
Year Built: 2025-2026
Results: Responsive, print-optimized, and aesthetically clean site with no external dependencies or backend requirement. Hosted on dannywang.dev.
Built from scratch in vanilla HTML/CSS/JS, with no frameworks, no build step, and no external dependencies. Features a responsive fixed-sidebar layout, animated carousel, touch-swipe support, interactive skill tags, toggleable dark mode, and a print-optimized stylesheet. Designed with a minimalist aesthetic and system fonts to ensure fast load times and a professional presentation.
View on GitHub
CCC Solutions Archive
Year Built: 2025-2026
Results: Platform with full editorials for CCC problems, hosted at thepeeps191.github.io/ccc.
Comprehensive solution editorials for every past Canadian Computing Competition (CCC) problem, written entirely by me, with code implementations in either Python or C++. Each problem includes a detailed breakdown of the algorithmic approach, time complexity analysis, and alternative solution strategies. The archive is a work in progress, with new editorials added regularly as I write solutions for the remaining problems across both the Junior and Senior divisions.
View Project
Pokémon Wishstone - WIP Custom Pokémon ROM Hack
Year Built: 2024-2026
Results: Custom region, maps, scripts, and trainer battles - long-term WIP exploring GBA decompilation.
A custom Pokémon ROM hack for the Game Boy Advance, built using the decompilation framework of Pokémon Emerald (pokeemerald-expansion). Features a new region, custom map layouts, modified Pokémon spawns and trainer battles, original scripts and dialogue, and quality-of-life gameplay improvements. A long-term project exploring game design, systems programming, and large-scale C codebases.
View on GitHub
Skibidi Toilet: Attack on G-Man
Year Built: 2024-2025
Results: Built for school game development course and achieved a grade of 100%.
A 3D action game built in Unity with my friend Daniel Bayanati for a school course on game development. Features custom shaders, enemy AI, player mechanics, and level design. Developed in C#, with ShaderLab and HLSL for visual effects and Blender for 3D modelling. Contains multiple gameplay stages, interactive environments, and polished game feel. Originally designed for virtual reality, but ultimately implemented as a regular (non-VR) game.
View on GitHub
YOLO Dog vs Cat Classifier
Year Built: 2024
Results: Custom annotated dataset, end-to-end training and inference pipeline using YOLOv8.
A computer vision model trained to classify and detect dogs and cats using YOLOv8. Built with a custom annotated dataset, the pipeline covers data preprocessing, model training, validation, and inference on new images. A foundational computer vision project exploring object detection workflows end-to-end.
View on GitHub
Arduino Nano Gesture-Controlled Car
Year Built: 2024
Results: Two Arduino Nanos communicating wirelessly via IR, with accelerometer- and gyroscope-based gesture control of a car.
A gesture-controlled car built with two Arduino Nanos, using accelerometer and gyroscope sensor input to interpret hand movements and control motor direction and speed wirelessly via IR communication across both Arduinos. Written in C++ for the Arduino framework, combining embedded systems programming with real-time sensor processing, circuit prototyping, and motor control logic.
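The step from hand tilt to motor commands can be sketched as follows. This is a hypothetical mapping written in Python for readability; the project itself is C++ on Arduino, and the dead zone, maximum tilt, PWM range, and function names here are all assumptions rather than values from the build.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def tilt_to_motor_speeds(pitch_deg, roll_deg, max_tilt_deg=45.0,
                         dead_zone_deg=5.0, max_pwm=255):
    """Map hand tilt to signed (left, right) motor speeds.

    Pitch drives both wheels forward or backward; roll adds a
    differential so the car turns. Small tilts inside the dead zone
    are ignored so a resting hand does not move the car.
    """
    def normalize(angle_deg):
        if abs(angle_deg) < dead_zone_deg:
            return 0.0
        return clamp(angle_deg / max_tilt_deg, -1.0, 1.0)

    throttle = normalize(pitch_deg)
    steer = normalize(roll_deg)
    left = clamp(throttle + steer, -1.0, 1.0)
    right = clamp(throttle - steer, -1.0, 1.0)
    return round(left * max_pwm), round(right * max_pwm)


print(tilt_to_motor_speeds(45.0, 0.0))   # (255, 255)
print(tilt_to_motor_speeds(45.0, 45.0))  # (255, 0)
print(tilt_to_motor_speeds(3.0, 0.0))    # (0, 0)
```

A negative speed here stands for reverse. On real hardware that sign would be translated into a direction pin plus a PWM magnitude.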
View on GitHub
Cheapest Pokémon Card - eBay Scraper With a Flask Interface
Year Built: 2024
Results: Real-time eBay scraping with automated cheapest-listing detection via Flask interface.
A web application that scrapes eBay listings to find the cheapest listing for a specific Pokémon card, presented through a clean Flask interface. Handles real-time search queries, parses and filters listing data, and automatically opens up the listing it finds. Combines web scraping, backend API development, and frontend rendering in a single lightweight tool.
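The parse, filter, and pick-cheapest step can be sketched like this. The listing dictionaries, their keys, and the example URLs are hypothetical stand-ins for whatever the scraper actually extracts; no eBay API or page structure is assumed.

```python
import re

# Matches the first number in a price string, e.g. "$12.99" or "C $1,204.50".
PRICE_PATTERN = re.compile(r"(\d[\d,]*(?:\.\d+)?)")


def parse_price(text):
    """Return the first number in a price string as a float, or None."""
    match = PRICE_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def cheapest_listing(listings, keyword):
    """Return the lowest-priced listing whose title contains keyword."""
    candidates = []
    for listing in listings:
        price = parse_price(listing["price"])
        if price is None:
            continue  # skip listings without a parseable price
        if keyword.lower() not in listing["title"].lower():
            continue  # skip unrelated results
        candidates.append((price, listing))
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


# Hypothetical scraped data for illustration.
sample = [
    {"title": "Charizard Base Set Holo", "price": "C $1,204.50", "url": "https://example.com/a"},
    {"title": "Charizard Base Set (played)", "price": "$310.00", "url": "https://example.com/b"},
    {"title": "Pikachu Promo", "price": "$4.99", "url": "https://example.com/c"},
]
best = cheapest_listing(sample, "charizard")
print(best["url"])  # https://example.com/b
```

Opening the winning listing automatically could then be a single call to the standard library's `webbrowser.open(best["url"])`.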
View on GitHub
Learnmonkey - Programming Education Platform
Year Built: 2022-2024
Results: 2000+ commits, open-source collaborative educational platform with lessons across multiple programming languages and frameworks.
An interactive programming education website built as a collaborative open-source project. Features lessons in languages ranging from Python to Go, C++, and Java. Designed to make foundational programming accessible through detailed tutorials. Built entirely from scratch with HTML, CSS, and JavaScript.
View on GitHub
Monkeytyper - Python Bot to Automate Monkeytype Records
Year Built: 2023
Results: Achieves 400 WPM on 10-word English benchmark using PyAutoGUI + Tesseract OCR.
A Python bot to automate Monkeytype records, with built-in text detection and autotyping for an average of 400 WPM on the 10-word English benchmark. Uses PyAutoGUI for automation and Tesseract OCR for on-screen text detection and recognition.
View on GitHub
Dicemazing - 2022 GMTK Game Jam Submission
Year Built: 2022
Results: Built in 48 hours for GMTK Game Jam 2022 — custom shaders, dice mechanics, and ShaderLab/HLSL effects.
A Unity game built by Danny Wang & Jonas Huang in 48 hours for the 2022 GMTK Game Jam under the theme "Roll of the Dice." Features dice-based gameplay mechanics with custom shaders and visual effects. Built primarily in ShaderLab and HLSL with C# game logic.
View on GitHub
Arduino Science Fair Car - Calgary Youth Science Fair Gold Award
Year Built: 2021-2022
Results: Gold Award + Structure & Design Award at Calgary Youth Science Fair (CYSF).
An Arduino-based autonomous car project that won the Gold Award and the Structure & Design Award at the Calgary Youth Science Fair. Features sensor-based navigation, motor control logic, and embedded C++ programming on the Arduino platform. Integrates multiple hardware components, including ultrasonic sensors and motor drivers, for autonomous obstacle avoidance.
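Ultrasonic obstacle avoidance rests on one piece of arithmetic: the echo time covers the trip to the obstacle and back, so the distance is half the time multiplied by the speed of sound. A minimal sketch in Python (the project itself is embedded C++); the 20 cm threshold and the function names are assumptions for illustration.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at room temperature


def echo_to_distance_cm(echo_duration_us):
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    return echo_duration_us * SPEED_OF_SOUND_CM_PER_US / 2.0


def choose_action(distance_cm, stop_threshold_cm=20.0):
    """Pick a driving action from one distance reading (threshold is assumed)."""
    if distance_cm < stop_threshold_cm:
        return "turn"
    return "forward"


reading_cm = echo_to_distance_cm(1000)  # a 1000 microsecond echo is roughly 17 cm
print(choose_action(reading_cm))        # turn
```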
View on GitHub
Bobville Economy Bot - Discord Economy & Game Bot
Year Built: 2020-2021
Results: Full economy system with currency, item shops, and mini-games — served a school community server.
A full-featured Discord economy bot with virtual currency systems, mini-games, item shops, and user progression. Built using discord.py with a modular command structure, persistent database storage, and interactive gameplay commands that served a school community server.
View on GitHub
Research
Cosine Similarity Distillation (CSD) - Teacher-Free Knowledge Transfer via Random Projection Fingerprints
2025-present, Independent Research
A novel distillation method that eliminates the need for live teacher model forwarding during student training. Instead of loading the teacher at every batch, CSD precomputes compact "fingerprints", which are the cosine similarities between the teacher's normalized intermediate features and the columns of a frozen random reference matrix. During student training, the student then regresses these fingerprints alongside the classification loss, achieving competitive accuracy with 67x smaller storage and a single teacher forward pass compared to traditional KD or FitNet distillation methods. Built in PyTorch on CIFAR-100 with ResNet architectures, featuring augmentation-aware fingerprint generation, cosine similarity loss, and lambda warmup. Demonstrated per-class fingerprints as small as 50 KB. Paper is currently a work in progress.
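The fingerprint idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not the actual CSD code, which is written in PyTorch. The function names, dimensions, and seed are hypothetical. The loss shown (one minus the cosine similarity between the student and teacher fingerprints) is only one plausible reading of the "cosine similarity loss" mentioned above. The sketch also assumes student and teacher features share a dimensionality so they can use the same reference matrix.

```python
import math
import random


def make_reference(dim, k, seed=0):
    """Frozen Gaussian reference matrix, stored as k columns of length dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def cosine(u, v):
    """Cosine similarity between two nonzero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fingerprint(features, reference):
    """Cosine similarity between one feature vector and each reference column."""
    return [cosine(features, column) for column in reference]


def fingerprint_loss(student_fp, teacher_fp):
    """Assumed loss: 1 - cosine similarity between the two fingerprints."""
    return 1.0 - cosine(student_fp, teacher_fp)


reference = make_reference(dim=8, k=4, seed=0)
teacher_features = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0]
teacher_fp = fingerprint(teacher_features, reference)  # computed once, then stored

student_features = [0.4, -0.9, 1.8, 0.1, 1.2, -0.6, 0.3, 0.9]
student_fp = fingerprint(student_features, reference)
print(len(teacher_fp), fingerprint_loss(student_fp, teacher_fp))
```

Because only the k-entry fingerprint is stored instead of the full feature vector, storage shrinks as k gets smaller relative to the feature dimension, and the teacher never needs to be loaded during student training.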
Technical Skills
Get in Touch
If you're building something interesting, let's connect.