Weight Distillation: Transferring the Knowledge in Neural Network Parameters. ACL 2021
Release time: 2024-01-09
- Authors:
- Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu
- Translation or Not:
- no
