Projects

Below are descriptions of selected projects from my undergraduate career. They showcase my experience building distributed systems, designing and implementing new programming languages, proving theorems with advanced mathematics, and building robust machine learning models.

A sharded, linearizable key-value storage system with dynamic load balancing
Practicum Project | CS 5414: Distributed Computing Principles | Collaborator: William Long

We designed a distributed key-value storage system in Java capable of handling network partitions and redistributing shards across replica groups while maintaining linearizability. To make each replica group fault tolerant, we implemented a version of multi-Paxos, as described in Robbert van Renesse's paper Paxos Made Moderately Complex. (Fun fact: Prof. van Renesse was my operating systems professor in undergrad.) We optimized our Paxos implementation with garbage collection and the election of a distinguished proposer. The project was built on DSLabs, a framework created by a research group at the University of Washington, which provides correctness and performance tests for our distributed key-value store and includes a model checker. Since the framework is used by students at other universities, I am unable to publicly post our code for academic integrity reasons. However, I am more than happy to talk about the project on an individual basis, so feel free to reach out if you would like to hear more about it. Check out the GitHub repo for the DSLabs framework below.

DSLabs GitHub Repo
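To give a flavor of the shard-redistribution problem, here is a minimal sketch in OCaml (the real system was written in Java, and this round-robin policy is an illustrative assumption, not our actual load balancer): it spreads a fixed number of shards as evenly as possible across the available replica groups.

```ocaml
(* Illustrative sketch only: assign [num_shards] shards to [groups]
   round-robin, so no group holds more than one shard above any other.
   Returns a list of (shard, group) pairs. *)
let rebalance ~num_shards ~groups =
  let n = List.length groups in
  List.init num_shards (fun shard -> (shard, List.nth groups (shard mod n)))
```

When a replica group joins or leaves, recomputing this assignment tells each group which shards it must hand off; the real system additionally has to transfer the shard data itself without violating linearizability.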

ChipotLang: an interpreted functional language for concurrent programming
Final Project | CS 4110: Programming Languages and Logics | Collaborator: William Long

We designed ChipotLang, a functional programming language for concurrent programming. We envisioned it as an educational tool to help bridge the gap between functional and systems programming. The interpreter, built in OCaml, translates functional expressions into continuation-passing style (CPS) and then evaluates them using the semantics of the lambda calculus.
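As a minimal sketch of the CPS idea (this is a textbook call-by-value translation for the pure lambda calculus, not the ChipotLang interpreter itself), each function is rewritten to take an extra continuation argument, and each application passes its result to the current continuation:

```ocaml
(* AST for the pure lambda calculus. *)
type expr =
  | Var of string
  | Lam of string * expr          (* fun x -> body *)
  | App of expr * expr

(* Fresh-name generator for continuation and value variables. *)
let gensym =
  let c = ref 0 in
  fun prefix -> incr c; Printf.sprintf "%s%d" prefix !c

(* [cps e k] translates [e] so that its result is passed to continuation [k]. *)
let rec cps e k =
  match e with
  | Var x -> App (k, Var x)
  | Lam (x, body) ->
      (* a CPS'd function takes its argument and a continuation *)
      let kv = gensym "k" in
      App (k, Lam (x, Lam (kv, cps body (Var kv))))
  | App (f, a) ->
      (* evaluate the function, then the argument, then call with [k] *)
      let fv = gensym "f" in
      let av = gensym "v" in
      cps f (Lam (fv, cps a (Lam (av, App (App (Var fv, Var av), k)))))
```

After this pass, every call is a tail call, which is what makes CPS a convenient intermediate form for modeling concurrency and control flow.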

GitHub Repo

OCamulator: a domain-specific language for linear algebra, probability, and statistics
Final Project | CS 3110: Data Structures and Functional Programming | Collaborators: Matthew Frucht, Stephen Tse

We created an interpreted domain-specific language (DSL) in OCaml for mathematical computation. We built a read-eval-print loop (REPL) where users can write OCamulator code and obtain results, much like MATLAB's shell. The language supports linear-algebra operations such as row reduction, PLU matrix factorization, solving linear systems, inverting square matrices, computing determinants of square matrices, and matrix-vector arithmetic. Users can also perform basic arithmetic, solve algebraic expressions symbolically, compute statistical measures of datasets, perform linear regression, and evaluate PDFs and CDFs of common probability distributions.
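As an example of the kind of routine involved (a hypothetical sketch, not the OCamulator source), here is a determinant computed by Gaussian elimination with partial pivoting, one of the operations the DSL exposes:

```ocaml
(* Determinant of a square matrix (float array array) via Gaussian
   elimination with partial pivoting. The determinant is the product of
   the pivots, with the sign flipped on each row swap. *)
let det m =
  let n = Array.length m in
  let a = Array.map Array.copy m in   (* work on a copy *)
  let d = ref 1.0 in
  (try
     for k = 0 to n - 1 do
       (* partial pivot: pick the row with the largest entry in column k *)
       let p = ref k in
       for i = k + 1 to n - 1 do
         if abs_float a.(i).(k) > abs_float a.(!p).(k) then p := i
       done;
       if a.(!p).(k) = 0.0 then (d := 0.0; raise Exit);
       if !p <> k then begin
         let t = a.(k) in
         a.(k) <- a.(!p);
         a.(!p) <- t;
         d := -. !d                    (* a row swap negates the determinant *)
       end;
       d := !d *. a.(k).(k);
       for i = k + 1 to n - 1 do
         let f = a.(i).(k) /. a.(k).(k) in
         for j = k to n - 1 do
           a.(i).(j) <- a.(i).(j) -. f *. a.(k).(j)
         done
       done
     done
   with Exit -> ());
  !d
```

Partial pivoting keeps the elimination numerically stable, which matters once the DSL is used on poorly conditioned matrices.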

GitHub Repo

Proving the t-distribution arises from sampling normally distributed populations
Final Project | ENGRD 2700: Engineering Probability and Statistics

I rigorously proved that the t-distribution arises when sampling from a normal population whose variance is unknown and must be estimated from the sample itself. The proof draws on probability theory, statistics, linear algebra, and multivariate calculus.
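The statement being proved can be summarized as follows (the notation here is mine, not necessarily the writeup's):

```latex
\[
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2
\]
With $X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} N(\mu,\sigma^2)$, the sample
mean and sample variance are independent, $\bar{X} \sim N(\mu,\sigma^2/n)$,
and $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$. Therefore the studentized mean
\[
T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1},
\]
i.e.\ it follows a t-distribution with $n-1$ degrees of freedom.
```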

Proof

Predicting radio signal modulation scheme with an ensemble of convolutional neural networks
Midterm Project | ECE 4200: Fundamentals of Machine Learning

I built a deep learning model that predicts the modulation scheme used to encode a radio signal from time series data, specifically the in-phase and quadrature (I/Q) components of the signal sampled at a periodic rate. After experimenting with a variety of machine learning algorithms and models, I chose an ensemble of convolutional neural networks as the final model. With ten possible modulation schemes in the dataset and a low signal-to-noise ratio, the model predicted which scheme was used to encode a given signal with 61.820% accuracy on the test set.
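The ensemble combines its members by soft voting. A minimal OCaml sketch of that final step (the shapes and the averaging rule are illustrative assumptions; the project's networks were not written in OCaml): each member outputs a probability vector over the ten modulation classes, the vectors are averaged, and the ensemble predicts the argmax.

```ocaml
(* Soft-voting ensemble: average each class's probability across the
   member models' outputs, then return the index of the largest average. *)
let ensemble_predict (member_probs : float array list) : int =
  let n_classes = Array.length (List.hd member_probs) in
  let k = float_of_int (List.length member_probs) in
  let avg =
    Array.init n_classes (fun c ->
        List.fold_left (fun acc p -> acc +. p.(c)) 0.0 member_probs /. k)
  in
  (* argmax over the averaged probabilities *)
  let best = ref 0 in
  Array.iteri (fun i p -> if p > avg.(!best) then best := i) avg;
  !best
```

Averaging probabilities rather than taking a hard majority vote lets a confident member outweigh several uncertain ones, which tends to help at low signal-to-noise ratios.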

GitHub Repo
Report