Projects

Large Language Models & Natural Language Processing

IntervalTree.rs

Implemented an efficient interval tree in Rust and exposed it to Python through PyO3 bindings. GitHub repo.
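For readers unfamiliar with the data structure, here is a minimal Python sketch of the idea behind an interval tree: a binary search tree keyed on interval start, where each node also stores the largest end point in its subtree so that overlap queries can prune whole branches. It is an illustration only and does not reflect the repo's Rust implementation or its Python API. The names `Node`, `insert`, and `overlaps` are hypothetical, and the sketch does no rebalancing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    lo: int
    hi: int
    max_hi: int  # largest `hi` anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], lo: int, hi: int) -> Node:
    """Insert the closed interval [lo, hi], keyed on `lo` (no rebalancing)."""
    if root is None:
        return Node(lo, hi, hi)
    if lo < root.lo:
        root.left = insert(root.left, lo, hi)
    else:
        root.right = insert(root.right, lo, hi)
    root.max_hi = max(root.max_hi, hi)
    return root


def overlaps(root: Optional[Node], lo: int, hi: int) -> List[Tuple[int, int]]:
    """Return every stored interval that overlaps the closed interval [lo, hi]."""
    out: List[Tuple[int, int]] = []

    def visit(node: Optional[Node]) -> None:
        # Prune: every interval in this subtree ends before the query starts.
        if node is None or node.max_hi < lo:
            return
        visit(node.left)
        if node.lo <= hi and lo <= node.hi:
            out.append((node.lo, node.hi))
        # Right-subtree keys are >= node.lo, so skip them if node.lo > hi.
        if node.lo <= hi:
            visit(node.right)

    visit(root)
    return out


root = None
for lo, hi in [(5, 10), (1, 3), (8, 12), (15, 20), (2, 6)]:
    root = insert(root, lo, hi)
print(overlaps(root, 4, 9))
```

A production tree would also keep itself balanced so that queries stay logarithmic in the number of stored intervals plus the number of matches.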

SEC Filings Data

An MCP server that provides end-to-end workflows for SEC filings and earnings call transcripts: ticker resolution, document retrieval, OCR, embedding, on-disk resource discovery, and semantic search. The tools are exposed via MCP and powered by olmOCR and embedding models served through vLLM backends. GitHub repo.

Teaching Distributed Data Parallelism for LLMs

A video playlist on training large models with distributed data parallelism, going from theory to practice. Watch the playlist.

MoE-ReFT

Parameter-efficient finetuning with ReFT (representation finetuning) on OLMoE-7B-A1B, applying interventions before and after MoE layers. GSM8K test loss: base 0.82, pre-MoE 0.91, post-MoE 10.8, so neither placement beat the base model and the post-MoE intervention was far worse. GitHub repo.
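As a rough illustration of what a ReFT intervention does to a hidden state, the sketch below implements the low-rank LoReFT edit from the ReFT paper, h + Rᵀ(Wh + b − Rh), in plain Python. Whether this project uses LoReFT or another ReFT variant is an assumption here, as are the helper names `matvec` and `loreft`. The sketch shows only the edit itself, not where it is hooked in relative to the MoE layers.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix `m` (list of rows) by vector `v`."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def loreft(h: Vector, R: Matrix, W: Matrix, b: Vector) -> Vector:
    """LoReFT edit: h + R^T (W h + b - R h), with R and W of shape (r, d)."""
    Rh = matvec(R, h)
    Wh = matvec(W, h)
    # Difference between the learned target and the current projection (length r).
    delta = [wh + bi - rh for wh, bi, rh in zip(Wh, b, Rh)]
    # R^T delta maps the r-dimensional edit back into the d-dimensional space.
    edit = [sum(R[k][j] * delta[k] for k in range(len(R))) for j in range(len(h))]
    return [hj + ej for hj, ej in zip(h, edit)]


h = [1.0, 2.0, 3.0]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # orthonormal rows, rank r = 2
W = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
b = [0.1, -0.2]
print(loreft(h, R, W, b))
```

When the rows of R are orthonormal, the edited state projects onto exactly Wh + b in the subspace spanned by R, while components outside that subspace are left unchanged.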

Enhancement to Grouped Query Attention

Developed a weight-based aggregation of key-value heads to improve T5-small summarisation performance by 2.75% over the grouped query attention baseline. Explore the GitHub repo and the Weights & Biases report.
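To make the idea concrete, the sketch below contrasts plain mean pooling of key-value heads within a group (the usual way to convert a multi-head checkpoint to grouped query attention) with a weighted sum in which every head has its own scalar. It is a simplified illustration under stated assumptions: each head is represented as a flat list of floats, the per-head scalars are taken to be learnable, and the function name `aggregate_kv_heads` is hypothetical. The repo's actual parameterisation may differ.

```python
from typing import List, Optional

Head = List[float]  # one head's projection weights, flattened


def aggregate_kv_heads(
    heads: List[Head],
    num_groups: int,
    weights: Optional[List[float]] = None,
) -> List[Head]:
    """Collapse `heads` into `num_groups` shared key or value heads.

    weights=None gives plain mean pooling. Otherwise each head is scaled by
    its own scalar (assumed learnable) before summing within its group.
    """
    if len(heads) % num_groups != 0:
        raise ValueError("number of heads must be divisible by num_groups")
    size = len(heads) // num_groups
    if weights is None:
        weights = [1.0 / size] * len(heads)
    dim = len(heads[0])
    grouped = []
    for g in range(num_groups):
        members = range(g * size, (g + 1) * size)
        grouped.append(
            [sum(weights[i] * heads[i][j] for i in members) for j in range(dim)]
        )
    return grouped


heads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(aggregate_kv_heads(heads, num_groups=2))
print(aggregate_kv_heads(heads, num_groups=2, weights=[1.0, 0.0, 0.25, 0.75]))
```

With uniform weights the two schemes coincide, so mean pooling is a natural initialisation for the scalars.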

Sequence Length Balancing in Rust

Built a Rust project for sequence length balancing with a scheduler backed by a ZMQ server using a ROUTER-DEALER architecture. See the GitHub repo.

Figure: Sequence length balancing benchmark.
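Sequence length balancing aims to give every worker a similar number of tokens per step. One common heuristic, sketched below in Python, is greedy longest-first assignment: sort sequences by length and hand each one to the currently least-loaded worker. This is not necessarily the algorithm used in the Rust project, and the function name `balance` is hypothetical. The sketch covers only the balancing step and leaves out the ZMQ ROUTER-DEALER transport.

```python
import heapq
from typing import List


def balance(lengths: List[int], num_workers: int) -> List[List[int]]:
    """Assign sequence indices to workers: longest first, least-loaded worker next."""
    heap = [(0, w) for w in range(num_workers)]  # (total tokens, worker id)
    heapq.heapify(heap)
    buckets: List[List[int]] = [[] for _ in range(num_workers)]
    for idx in sorted(range(len(lengths)), key=lambda i: -lengths[i]):
        load, w = heapq.heappop(heap)  # worker with the fewest tokens so far
        buckets[w].append(idx)
        heapq.heappush(heap, (load + lengths[idx], w))
    return buckets


lengths = [512, 128, 256, 64, 384, 32]
buckets = balance(lengths, num_workers=2)
loads = [sum(lengths[i] for i in bucket) for bucket in buckets]
print(buckets, loads)
```

In this example the two workers end up with 704 and 672 tokens, compared with 896 and 480 if the sequences were split in their original order.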

SEC Filings Question Answering Agent

Built an end-to-end system that parses 10-Q and 10-K filings to answer investor questions about company health. Explore the original project and the revamped finance data LLM repo.

Figure: Dashboard from the SEC QA agent, showing the SEC filings question answering workflow.

Movie Reviews Question Answering Agent

Built a question answering system over movie reviews, backed by MongoDB Atlas. Browse the code and watch the demo.

Figure: Movie QA agent interface, showing the movie reviews QA prototype for Parasite.

Old days of being a finance bro

Reinforcement Learning

Algorithmic Trading with Google Trends

Used web search data from Google Trends as the state space of an RL trading agent to improve its trading performance. GitHub · Medium

Black-Litterman Portfolio Optimisation

Work in progress on applying RL to portfolio construction. GitHub

FinRL Optimisation Contributions

Authored explainers and tutorials for hyperparameter optimisation workflows in FinRL. Article series

Machine Learning