EfficientBERT: Effectively trading off model size and accuracy during model compression

In this project I explore the following questions: can we better understand how compression and architecture decisions affect model performance? And which dominates these trade-offs: architectural decisions (including model size) or distillation properties?

This work has been accepted and presented at leading industry conferences and meetups, including RayConnect, RaySummit, MLConf, and the Bay Area NLP Meetup.

Watch the talk

Summary of EfficientBERT

Full methodology of EfficientBERT

Resources

    • To recreate or repurpose this work, use this repo.

    • The AWS AMI used to run this code can be found here.

    • Get the trained model checkpoints here.

    • To explore the SigOpt dashboard and analyze the results yourself, take a look at the experiment.
