microsoft/deberta-xlarge-mnli
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa improves on the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on the majority of NLU tasks with 80GB of training data.
Please check the official repository for more details and updates.
This is the DeBERTa xlarge model (750M parameters) fine-tuned on the MNLI task.
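Because the checkpoint is fine-tuned on MNLI, it can score a premise/hypothesis pair directly. The following is a minimal sketch, not an official usage recipe. It assumes the `transformers` and `torch` packages are installed and that the checkpoint is available on the Hugging Face Hub under the name in this card's title. The helper names `softmax`, `top_label`, and `classify` are hypothetical and introduced only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(premise, hypothesis, model_name="microsoft/deberta-xlarge-mnli"):
    """Score one premise/hypothesis pair; downloads the checkpoint on first use."""
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Passing two strings encodes them as a single sentence-pair input.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names are read from the checkpoint's config rather than hard-coded.
    return top_label(logits, model.config.id2label)
```

Calling `classify("A man is playing a guitar on stage.", "Someone is making music.")` returns a `(label, probability)` pair. Reading the label names from `model.config.id2label` avoids assuming a particular class order.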
Fine-tuning on NLU tasks
We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
Model | SQuAD 1.1 | SQuAD 2.0 | MNLI-m/mm | SST-2 | QNLI | CoLA | RTE | MRPC | QQP | STS-B |
---|---|---|---|---|---|---|---|---|---|---|
Metric | F1/EM | F1/EM | Acc | Acc | Acc | MCC | Acc | Acc/F1 | Acc/F1 | P/S |
BERT-Large | 90.9/84.1 | 81.8/79.0 | 86.6/- | 93.2 | 92.3 | 60.6 | 70.4 | 88.0/- | 91.3/- | 90.0/- |
RoBERTa-Large | 94.6/88.9 | 89.4/86.5 | 90.2/- | 96.4 | 93.9 | 68.0 | 86.6 | 90.9/- | 92.2/- | 92.4/- |
XLNet-Large | 95.1/89.7 | 90.6/87.9 | 90.8/- | 97.0 | 94.9 | 69.0 | 85.9 | 90.8/- | 92.3/- | 92.5/- |
DeBERTa-Large<sup>1</sup> | 95.5/90.1 | 90.7/88.0 | 91.3/91.1 | 96.5 | 95.3 | 69.5 | 91.0 | 92.6/94.6 | 92.3/- | 92.8/92.5 |
DeBERTa-XLarge<sup>1</sup> | -/- | -/- | 91.5/91.2 | 97.0 | - | - | 93.1 | 92.1/94.3 | - | 92.9/92.7 |
DeBERTa-V2-XLarge<sup>1</sup> | 95.8/90.8 | 91.4/88.9 | 91.7/91.6 | 97.5 | 95.8 | 71.1 | 93.9 | 92.0/94.2 | 92.3/89.8 | 92.9/92.9 |
DeBERTa-V2-XXLarge<sup>1,2</sup> | 96.1/91.4 | 92.2/89.7 | 91.7/91.9 | 97.2 | 96.0 | 72.0 | 93.5 | 93.1/94.9 | 92.7/90.3 | 93.2/93.1 |