Portfolio

I enjoy exploring, learning, and creating. Here's some of my work that I'm proud of.

Data-driven Instruction Augmentation for Language-conditioned Control

Preprint

We use VLMs to perform instruction augmentation on offline datasets, getting more mileage out of existing data.

Data-driven Instruction Augmentation for Language-conditioned Control

Preprint

DIAL consists of three stages: 1) fine-tune pretrained CLIP with robotics data, 2) relabel existing datasets, and 3) train an instruction-following agent with imitation learning.

We apply DIAL to a challenging robotics domain with 80,000 demonstrations, of which only 3.5% contain crowd-sourced language labels.

Inner Monologue

CoRL 2022

Embodied LLM planners incorporate textual feedback from the environment to perform closed-loop planning.

Inner Monologue

CoRL 2022

We formulate an LLM inner monologue by continually adding information from various sources of feedback into the language model prompts during robotic planning.

LLMs are able to act as reactive and robust planners by incorporating textual feedback sources from success detectors, scene descriptors, visual question-answering, and human feedback.
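
As a rough illustration of the idea above, the sketch below appends textual feedback from different sources to a running prompt. It is a minimal sketch, not the paper's implementation: the function names, the `Source: text` line format, and the example strings are all assumptions made for illustration, and no actual language model is called.

```python
from typing import List


def add_entry(history: List[str], source: str, text: str) -> None:
    """Append one line of text to the running monologue, tagged by its source."""
    history.append(f"{source}: {text}")


def build_prompt(instruction: str, history: List[str]) -> str:
    """Join the instruction and all entries so far into one planner prompt.

    The trailing "Robot:" line is where a language model would be asked
    to continue with the next action (hypothetical format).
    """
    lines = [f"Human: {instruction}"] + history + ["Robot:"]
    return "\n".join(lines)


# Hypothetical episode: one action followed by two feedback sources.
history: List[str] = []
add_entry(history, "Robot", "pick up the coke can")
add_entry(history, "Success", "False")
add_entry(history, "Scene", "coke can, sponge")
prompt = build_prompt("bring me a drink", history)
print(prompt)
```

Because the feedback lives in the prompt itself, the planner sees every failure or scene change before choosing its next step, which is what makes the loop closed.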

Value Function Spaces

ICLR 2022

Value functions provide skill-centric representations that help long-horizon and hierarchical RL

Value Function Spaces

ICLR 2022

Value Function Spaces is a simple approach that produces useful representations by using the value functions corresponding to lower-level skills.

These value functions capture the affordances of the scene, thus forming a representation that compactly abstracts task-relevant information and robustly ignores distractors.
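
A minimal sketch of this representation, assuming each lower-level skill exposes a value function that maps a state to a scalar. The two toy value functions and the state layout below are hypothetical stand-ins, not learned models.

```python
from typing import Callable, List, Sequence

State = Sequence[float]
ValueFn = Callable[[State], float]


def vfs_representation(state: State, skill_values: List[ValueFn]) -> List[float]:
    """Stack the value of every skill at this state into one vector."""
    return [value(state) for value in skill_values]


# Hypothetical skills. The state is [distance_to_object, drawer_openness, distractor].
def reach_value(state: State) -> float:
    """Higher when the gripper is closer to the object."""
    return 1.0 - min(abs(state[0]), 1.0)


def open_drawer_value(state: State) -> float:
    """1.0 once the drawer is mostly open, else 0.0."""
    return 1.0 if state[1] > 0.5 else 0.0


skills = [reach_value, open_drawer_value]
z = vfs_representation([0.25, 0.9, 123.0], skills)
z_distracted = vfs_representation([0.25, 0.9, -7.0], skills)
print(z, z_distracted)
```

In this toy setup the third state entry never affects any skill's value, so the two representations are identical even though the raw states differ.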

Thinking While Moving

ICLR 2020

Deep reinforcement learning in concurrent action environments, enabling significantly faster and smoother robot policies.

Thinking While Moving

ICLR 2020

This project analyzes deep continuous-time reinforcement learning in the context of concurrent action environments.

We introduce a new class of concurrent action deep reinforcement learning algorithms with theoretical learning guarantees.

Evaluations on simulated benchmark tasks and large-scale robotic grasping tasks show that our methods are smoother and faster by over 37%.

Presented at the International Conference on Learning Representations, 2020.

Quadrotor Control Methods

Literature Review 2016

A comprehensive review of quadrotor control methods, focussing on safety-constrained real-time control scenarios.

Quadrotor Control Methods

Literature Review 2016

This literature review surveys existing work in quadrotor control at a broad level, and then examines two specific approaches in real-time control scenarios: learning-based model predictive control (LBMPC) and reachability-based safe learning.

Real-time safety-constrained scenarios encapsulate challenges that extend to many other research areas.

Machine Learning at Berkeley

Founder and President 2015-2017

A student-run organization focusing on workshops, industry projects, and research.

Machine Learning at Berkeley

Founder and President 2015-2017

I founded a student-run non-profit machine learning organization focusing on workshops, industry projects, and research.

I grew the organization to more than 80 undergraduate and graduate students in 18 months.

1200+

Workshop Attendees

8

Research Projects

16

Industry Clients

Voyager Consulting

Vice President

A student-run organization that provides strategy consulting services to growth-stage and Fortune 500 companies.

Voyager Consulting

Vice President

I served as Vice President in Fall 2015. I worked with clients, determined strategic project directions, and led consultants.

Projects I worked on: Valve, National Basketball Association, Twitch.tv, Warby Parker

4

Clients

20

Consultants

35

Projects

TweetScript

1st Place Hackathon Winner

A programming language composed entirely of Tweets.

TweetScript

1st Place Hackathon Winner, BearHack 2013

In less than 36 hours, we built a parser and tokenizer to create a programming language out of Tweets.

2013

November

1st

Out of 372 Hackers

5

Team Members

DiversaTech

Technology Advisor

DiversaTech is a UC Berkeley student organization that seeks to enable success in the technology sector through diversity.

DiversaTech

Technology Advisor

As a Technology Advisor, I was part of the founding team and saw the organization grow and mature. I designed DiversaTech's website, which has been in use since Spring 2016.

2016

Spring

HTML/CSS

Bootstrap

1000+

Hits

Kaffeine

Contributor

Kaffeine is an application that pings Heroku sites to prevent them from idling.

Kaffeine

Contributor

Kaffeine is an open-source application that pings Heroku sites to keep them from idling.

I added features to update the application to comply with Heroku's policy changes in 2015.

2015

Summer

Ruby on Rails

9000+

Users

Token Turing Machines

Preprint

A Transformer model with external memory for real-world sequential visual understanding.

Token Turing Machines

Preprint

Our model is inspired by the seminal Neural Turing Machine, and has an external memory consisting of a set of tokens which summarise the previous history (i.e., frames). This memory is efficiently addressed, read and written using a Transformer as the processing unit/controller at each step.

TTM outperforms other alternatives, such as other Transformer models designed for long sequences and recurrent neural networks, on two real-world sequential visual understanding tasks.

SayCan: Grounding Language in Robotic Affordances

CoRL 2022

LLM planning that is grounded through affordances enables long-horizon, complex reasoning

SayCan: Grounding Language in Robotic Affordances

CoRL 2022 (oral)

SayCan combines the reasoning and planning capabilities of large language models with grounding through affordance estimation.

The robot can act as the language model’s “hands and eyes,” while the language model supplies high-level semantic knowledge about the task.
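
One way to picture this combination is shown below: each candidate skill gets a language-model score (how useful it is for the instruction) and an affordance score (how likely it is to succeed from the current state), and the skill with the highest product is chosen. This is a sketch under the assumption that the two scores are simply multiplied; the function name, skill strings, and numbers are made up for illustration.

```python
from typing import Dict


def select_skill(llm_scores: Dict[str, float], affordances: Dict[str, float]) -> str:
    """Pick the skill that maximizes LLM score times affordance score.

    Skills with no affordance estimate are treated as infeasible (score 0.0).
    """
    combined = {
        skill: llm_scores[skill] * affordances.get(skill, 0.0)
        for skill in llm_scores
    }
    return max(combined, key=combined.get)


# Hypothetical scores: the language model prefers the sponge,
# but the sponge is not reachable from the current state.
llm_scores = {
    "pick up the sponge": 0.6,
    "pick up the apple": 0.3,
    "go to the table": 0.1,
}
affordances = {
    "pick up the sponge": 0.1,
    "pick up the apple": 0.9,
    "go to the table": 0.8,
}
best = select_skill(llm_scores, affordances)
print(best)
```

Here the affordance term overrides the language model's first choice, which is the grounding effect described above.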

AW-Opt

CoRL 2021

AW-Opt combines advantage-weighted regression and QT-Opt for a continuous control IL+RL method.

AW-Opt

CoRL 2021

In this paper, our aim is to test the scalability of prior IL + RL algorithms and devise a system based on detailed empirical experimentation that combines existing components in the most effective and scalable way.

Our complete method, which we call AW-Opt, combines elements of advantage-weighted regression and QT-Opt, providing a unified approach for integrating demonstrations and offline data for robotic manipulation.

Presented at the Conference on Robot Learning, 2021.

Learning Latent Plans from Play

CoRL 2019

Self-supervised learning of visual manipulation tasks from unlabeled play data.

Learning Latent Plans from Play

CoRL 2019

This project proposes learning on top of unlabeled teleoperated human play data as a way to scale up multi-task skill learning.

We introduce Play-LMP, a self-supervised method that learns to organize play behaviors in a latent space and then reuse them at test time to achieve specific goals.

Presented at the Conference on Robot Learning, 2019.

GANs for Model-based Reinforcement Learning with Tree Search

Term Paper 2017

A project using GANs to learn dynamics models for Tree Search and Deep Value Networks.

GANs for Model-based Reinforcement Learning with Tree Search

Term Paper 2017

Term Paper for CS294-112 Deep Reinforcement Learning, Spring 2017

This project uses Generative Adversarial Networks to learn dynamics models used for Tree Search methods with Deep Value Networks.

4

Models

TensorFlow, Keras

3

Authors

Baseline Power Prediction

Term Paper 2016

A project predicting power usage at Sutardja Dai Hall at UC Berkeley, using statistical learning methods.

Baseline Power Prediction

Term Paper 2016

Term Paper for CS281A Statistical Learning Theory, Fall 2016

We leverage expectation-maximization, graphical methods, and Recurrent Neural Networks to achieve state-of-the-art results in building baseline power prediction.

5

Models

TensorFlow, Keras, SciPy

2

Authors

Human-Robot Interaction Experiment

Motion Planning Feedback Controller

Experiment on human behavior in a two-robot system for collaborative tasks

HRI Experiment

Motion Planning Feedback Controller

I implemented the motion planning module for a Pioneer robot for an experiment investigating human-friendly task planning algorithms.

2

Robots

Robot Operating System

10+

Participants

adViz

IBM Challenge Winner

MVP for a safety and navigation application; winner of multiple pitch competitions

adViz

IBM Challenge Winner, Team Lead

I created an MVP for a safety and navigation application.

We won multiple pitch competitions, culminating in presenting at IBM Interconnect 2016 in Las Vegas.

5

Months

$5000

Prize

4

Team Members

CalGuessr

Personal Project

A guessing game where you locate where pictures were taken. 2k+ hits within the first week.

CalGuessr

Personal Project

A guessing game inspired by GeoGuessr, personalized for UC Berkeley.

It was the top post on the UC Berkeley subreddit and saw 8GB of traffic in 48 hours.

2015

Summer

Ruby on Rails

1200+

Users

Asian American Association

Web Developer

UC Berkeley's Asian American Association is one of UC Berkeley's largest student organizations

Asian American Association

Web Developer

I designed the Asian American Association's website, which was used from Spring 2015 to Spring 2016.

2015

Spring

HTML/CSS

Bootstrap

2000+

Hits

Jump-Start Reinforcement Learning

Preprint

Jump-Start Reinforcement Learning (JSRL) enables any pre-existing policy to form a curriculum for an on-policy RL algorithm

Jump-Start Reinforcement Learning

Preprint

JSRL is a meta-algorithm that uses a pre-existing guide-policy along with a learned exploration policy to quickly explore and solve new tasks.

By using the guide-policy to form a curriculum of starting states for the exploration-policy, we are able to efficiently improve performance on a set of simulated robotic tasks.
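
The roll-in idea can be sketched as follows: the guide-policy acts for the first few steps of an episode, the exploration-policy takes over from whatever state that reaches, and the number of guide steps shrinks over the curriculum. This is a simplified sketch, not the paper's algorithm: the fixed linear schedule, the toy integer environment, and all names are assumptions, and no learning update is shown.

```python
from typing import Callable, List, Tuple

Policy = Callable[[int], int]          # maps a state to an action
StepFn = Callable[[int, int], int]     # maps (state, action) to the next state


def jsrl_rollout(
    guide: Policy,
    explore: Policy,
    step_env: StepFn,
    start_state: int,
    guide_steps: int,
    horizon: int,
) -> Tuple[int, List[str]]:
    """Run one episode: guide-policy first, exploration-policy afterwards."""
    state = start_state
    actors: List[str] = []
    for t in range(horizon):
        use_guide = t < guide_steps
        policy = guide if use_guide else explore
        state = step_env(state, policy(state))
        actors.append("guide" if use_guide else "explore")
    return state, actors


def curriculum(horizon: int, stages: int) -> List[int]:
    """Guide-step counts from full guidance down to none (needs stages >= 2)."""
    return [horizon * (stages - 1 - i) // (stages - 1) for i in range(stages)]


# Toy setup: the state is a counter, the guide always advances it,
# and the untrained exploration policy stands still.
schedule = curriculum(horizon=10, stages=3)
final_state, actors = jsrl_rollout(
    guide=lambda s: 1,
    explore=lambda s: 0,
    step_env=lambda s, a: s + a,
    start_state=0,
    guide_steps=schedule[1],
    horizon=10,
)
print(schedule, final_state, actors)
```

In practice the switch to a shorter guide roll-in would likely depend on how well the exploration-policy is doing, rather than on a fixed schedule as assumed here.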

Predictive Information QT-Opt

CoRL 2022

A QT-Opt agent augmented with an auxiliary loss that learns representations of the predictive information solves 297 tasks in simulation and the real world.

Predictive Information QT-Opt

CoRL 2022

Predictive Information QT-Opt (PI-QT-Opt) uses a predictive information auxiliary loss to outperform QT-Opt on 297 simulated and real robot manipulation tasks.

Value functions learned by PI-QT-Opt acted as the affordances in SayCan, one of the most important and challenging components.

Actionable Models

ICML 2021

Actionable Models combines goal-conditioned Q-learning with CQL to learn from offline datasets.

Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills

ICML 2021

We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.

We employ goal-conditioned Q-learning with hindsight relabeling and develop several techniques that enable training in a particularly challenging offline setting.

We also show that our method can learn to reach long-horizon goals across multiple episodes through goal chaining, and learn rich representations that can help with downstream tasks through pre-training or auxiliary objectives.

Adversarial Machine Learning

Textbook Chapter 2018

Chapter 17 in "Artificial Intelligence Safety and Security"

Adversarial Machine Learning

Textbook Chapter 2018

This textbook is used for several university AI Safety courses, and includes contributions from field leaders such as Ian Goodfellow and Max Tegmark.

The Adversarial Machine Learning chapter provides a technical tutorial on adversarial attacks and defense for deep neural networks.

aDOBO

CDC 2017

Dynamics Optimization with Bayesian Optimization. Learning in data-efficient regimes, demonstrated on a quadrotor system.

aDOBO

CDC 2017

This project introduces aDOBO, a framework for learning dynamics in few iterations to maximize controller performance.

The method works in data-efficient regimes, demonstrated on a quadrotor system. The paper was presented at the 56th IEEE Conference on Decision and Control.

This work was also extended as my Master's Thesis.

Deep Frame Rate Upscaling

Term Paper 2016

A project using CNNs, GANs, and VAEs to upscale framerate by interpolating frames.

Deep Frame Rate Upscaling

Term Paper 2016

Term Paper for CS294-129 Deep Neural Networks, Fall 2016

This project attempted to use various architectures to interpolate frames in videos.

4

Models

TensorFlow, Keras

3

Authors

Pursuit-Evasion Reachability

Data Visualization

Visualizing sets gathered from calculating reachability in a game theory application.

Pursuit-Evasion Reachability

Data Visualization

Finding and visualizing reachable sets in a flexible manner. Based on results of a CDC paper by Jaime Fisac.

An intuitive and reusable framework for visualizing large datasets from reachability computations. Used at CDC 2015 in Tokyo.

2015

Spring

D3.js, MATLAB

200,000+

Unique Sets

Salonniere

Intelligent Event Planner

Winner of the AI Chatbot competition at the Sutardja Center for Entrepreneurship and Technology.

Salonniere

Chatbot Competition Winner, Team Lead

I created a Messenger Chatbot to help plan events. It suggests ideas, sends invitations, and interacts with guests.

We won the AI Chatbot Collider in Fall 2016 at UC Berkeley.

3

Months

$2500

Prize

3

Team Members

DeCal Platform

Platform Re-make

A web platform for UC Berkeley's DeCal education program, serving 4000 students a semester.

DeCal Platform

Platform Re-make

As part of an Agile team of 6, I worked with the DeCal Education Board to re-make the DeCal web platform from scratch.

I focussed on authentication with OAuth, file uploads with CarrierWave/S3, and behavior tests with Cucumber.

2015

Spring

Ruby on Rails

4000+

Users Per Semester