Optimizing architecture and learning strategy for End-to-End memory networks
In this project, I explore how optimization strategies, learning strategies, and changes to model architecture affect a memory network’s performance. This work served as a precursor to my work on EfficientBert.
This work has been published on Nvidia’s dev blog.
Resources
- To recreate or repurpose this work, please use this repo
- Read the full work on Nvidia’s dev blog