Weight Distillation: Transferring the Knowledge in Neural Network Parameters. ACL 2021
Release time: 2024-01-09
- Authors:
- Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu
- Translation or Not:
- no
