
Auxiliary Tuning and its Application to Conditional Text Generation

May 31, 2024

Auxiliary Tuning is an efficient method for adapting a pre-trained language model to a novel task, such as conditional text generation.

What we did: We designed a simple and efficient method, called Auxiliary Tuning, for adapting a pre-trained language model (LM) to a novel task; we demonstrate the approach on the task of conditional text generation. Our approach supplements the original pre-trained model with an auxiliary model that shifts the output distribution according to the target task.

Why it matters: Achieving state-of-the-art fluency in language tasks such as text generation entails costly training of large LMs [1]. Auxiliary Tuning allows practitioners to amortize this cost across target tasks by leveraging existing pre-trained LMs. This is done without modifying the pre-trained weights, avoiding the risks of rigidity and catastrophic forgetting, and allowing natural scaling to multiple target tasks.

How it works: The auxiliary model is trained by adding its logits to the pre-trained model's logits and maximizing the likelihood of the target task output. Our method imposes no constraints on the auxiliary architecture. In particular, the auxiliary model can ingest additional input relevant to the target task, independently from the pre-trained model's input. Furthermore, mixing the models at the logits level provides a natural probabilistic interpretation of the method.
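To make the logits mixing concrete, here is a minimal sketch assuming a PyTorch / Hugging Face Transformers setup. The model name ("gpt2"), the learning rate, and the scheme of prepending keywords to the auxiliary model's input are illustrative assumptions rather than details taken from the paper; the essential points are that the pre-trained LM stays frozen, the two models' logits are summed at every position, and only the auxiliary model receives gradient updates.

```python
# Illustrative sketch of Auxiliary Tuning; names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
pretrained = AutoModelForCausalLM.from_pretrained("gpt2")  # frozen, general-purpose LM
auxiliary = AutoModelForCausalLM.from_pretrained("gpt2")   # trainable auxiliary model

# Only the auxiliary model is updated; the pre-trained weights are left untouched.
for p in pretrained.parameters():
    p.requires_grad = False
pretrained.eval()

optimizer = torch.optim.Adam(auxiliary.parameters(), lr=1e-4)

def training_step(keyword_ids, target_ids):
    """One step: sum the two models' logits per position and maximize the
    likelihood of the target-task text."""
    with torch.no_grad():
        base_logits = pretrained(target_ids).logits  # (batch, seq, vocab)

    # The auxiliary model may ingest extra task input (here: keywords prepended
    # to the target text); keep only the positions over the target so the two
    # logit tensors align.
    aux_input = torch.cat([keyword_ids, target_ids], dim=1)
    aux_logits = auxiliary(aux_input).logits[:, keyword_ids.size(1):, :]

    combined = base_logits + aux_logits  # mix at the logits level

    # Standard next-token objective: position t predicts token t + 1.
    loss = F.cross_entropy(
        combined[:, :-1, :].reshape(-1, combined.size(-1)),
        target_ids[:, 1:].reshape(-1),
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Hypothetical usage for keyword-conditioned generation:
# keywords = tokenizer("solar energy storage", return_tensors="pt").input_ids
# target = tokenizer("Researchers unveiled a new battery ...", return_tensors="pt").input_ids
# training_step(keywords, target)
```

Because softmax(base_logits + aux_logits) is proportional to the product of the two per-token distributions, the auxiliary model acts as a learned, task-conditioned reweighting of the frozen LM's predictions, which is the probabilistic interpretation mentioned above.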

Results: Our method achieved similar results to training from scratch for a number of different tasks, while using significantly less compute for training; we share a specific example of text generation conditioned on keywords.

Read the paper here.
