EfficientBERT: Effectively trading off model size and accuracy during model compression
In this project I explore two questions: Can we better understand the effects of compression and architecture decisions on model performance? And do architectural decisions (including model size) or distillation properties dominate these trade-offs?
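To make "distillation properties" concrete, here is a minimal sketch of the standard knowledge-distillation objective (soft teacher targets blended with hard labels, in the style of Hinton et al.). This is a generic illustration, not the exact loss used in this project; the temperature and alpha values below are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    # Soft loss: KL(teacher || student) on temperature-softened
    # distributions, scaled by T^2 to keep gradients comparable.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    soft_loss = temperature ** 2 * sum(
        p * math.log(p / q) for p, q in zip(t, s)
    )
    # Hard loss: ordinary cross-entropy on the ground-truth label.
    hard_probs = softmax(student_logits)
    hard_loss = -math.log(hard_probs[true_label])
    # alpha trades off imitating the teacher vs. fitting the labels.
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

Varying the temperature and the soft/hard mixing weight is one axis of the trade-off study; model size and architecture form the other.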
My work on BERT has been accepted at and presented to leading industry conferences, including RayConnect, RaySummit, MLConf, and the Bay Area NLP Meetup.
Watch the talk
Resources
- Get the trained model checkpoints here
- To play around with the SigOpt dashboard and analyze the results for yourself, take a look at the experiment
- EfficientBERT summary
- EfficientBERT complete paper on Nvidia's devblog
- Slides from MLConf