T5v1.1

Overview

T5v1.1 was released by Colin Raffel et al. in the google-research/text-to-text-transfer-transformer repository. It is an improved version of the original T5 model. This model was contributed by patrickvonplaten; the original code can be found in the same repository.

Usage tips

One can directly plug the weights of T5v1.1 into a T5 model, like so:

>>> from transformers import T5ForConditionalGeneration

>>> model = T5ForConditionalGeneration.from_pretrained("google/t5-v1_1-base")

T5 Version 1.1 includes the following improvements compared to the original T5 model:

- GELU activation in the feed-forward hidden layer, rather than ReLU.
- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.
- Pre-trained on C4 only, without mixing in the downstream tasks.
- No parameter sharing between the embedding and classifier layer.
- "xl" and "xxl" replace "3B" and "11B". The model shapes are a bit different: larger d_model and smaller num_heads and d_ff.

Note: T5 Version 1.1 was only pre-trained on C4, excluding any supervised training. Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model. Since T5v1.1 was pre-trained without supervision, there is no real advantage to using a task prefix during single-task fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
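To make the prefix guidance concrete, here is a minimal sketch of preparing inputs for multi-task fine-tuning. The helper function and the specific prefix strings are illustrative assumptions, not fixed by the model; any consistent set of prefixes works, and for single-task fine-tuning the prefix can simply be omitted.

```python
def with_task_prefix(task: str, text: str) -> str:
    """Prepend a task prefix to an input string for multi-task fine-tuning.

    For single-task fine-tuning of T5v1.1 the prefix can be omitted,
    since the model saw no supervised, prefixed data during pre-training.
    """
    return f"{task}: {text}"


# Hypothetical multi-task batch: each example is tagged with its task
# before being passed to the tokenizer.
examples = [
    with_task_prefix("summarize", "The tower is 324 metres tall."),
    with_task_prefix("translate English to German", "Hello, how are you?"),
]
```

The prefixed strings would then be tokenized and fed to the model as usual; the prefix is plain text, so no special tokens or configuration changes are involved.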

Google has released the following variants:

- google/t5-v1_1-small
- google/t5-v1_1-base
- google/t5-v1_1-large
- google/t5-v1_1-xl
- google/t5-v1_1-xxl

Refer to T5’s documentation page for the API reference, tips, code examples, and notebooks.