Optimizing architecture and learning strategy for End-to-End memory networks

In this project, I explore how different optimization strategies, learning strategies, and changes to model architecture affect a memory network’s performance. This work was a precursor to my work on EfficientBert.

This work has been published on Nvidia’s dev blog.

Resources

    • To recreate or repurpose this work, please use this repo.

