HAIM: A Modest Step Towards Controllable Text Generation

We present a language model that inserts synthetic text between a human-written beginning and ending.
The Age of Powerful Text Generators

In the past six months, we've seen great advances in the field of Natural Language Generation (NLG), enabled by applying huge amounts of computing power to train large language models on massive datasets.

This new age of NLG was heralded by OpenAI's GPT-2 model, which demonstrated the ability to complete a human-written text with synthetic continuations of unprecedented quality. When primed with a few sentences of input text, the model extends them into plausible-looking passages describing fictitious events, such as the discovery of unicorns in the Andes or John F. Kennedy rising from his grave.


AI2's Grover model followed soon after GPT-2 and was trained on an even larger dataset focused specifically on news articles. A novel feature introduced in Grover is the ability to prime the model with metadata such as a headline, a publication domain, or author names. For example, given the headline "New Study: Pet Ownership Increases Life Expectancy" and the domain nytimes.com, Grover can generate an article styled like the New York Times, complete with quotes from nonexistent scientists studying made-up health benefits of pets.

Why Controllability Matters

We believe that automated (or semi-automated) text generation holds great promise for society by helping people write better and more productively. To unlock this potential, however, text generators must evolve to become much more controllable.

Impressive as it is, the text generated by GPT-2 and Grover is far from perfect. In particular, the models’ output tends to diverge from the human-written input as the generation progresses. Sooner or later, the generators go off-topic, lose coherence, or contradict the preceding text. Such behaviors are a major problem for a user trying to convey a message or express an idea.


There is no natural way for a user to restrict GPT-2 and Grover’s tendency to diverge. This is inherent to their left-to-right, extrapolating method of operation. Metaphorically speaking, the user can give these models a starting point and a vague sense of direction, but not a final destination, let alone a route to follow.

Introducing an Interpolating Language Model

As a step towards controllability, we created HAIM, an interpolating language model: given a human-written beginning (prefix) and a human-written ending (suffix), HAIM generates synthetic text (body) of a desired length that fits between them. HAIM thus introduces two new “knobs” for tuning its output: the suffix, for keeping the generated text on topic, and the length, for controlling the amount of text inserted between the prefix and the suffix.
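To make these knobs concrete, here is a minimal sketch of what such an interpolation interface could look like. HAIM is currently exposed only as a demo, so the function name and parameters below are hypothetical illustrations, not an actual API.

```python
# A minimal, hypothetical sketch of the interpolation interface described
# above. `interpolate` and its parameters are illustrative assumptions,
# not a real API; they just make the two new "knobs" (the suffix and the
# body length) concrete.

def interpolate(prefix: str, suffix: str, body_words: int) -> str:
    """Return a synthetic body of roughly `body_words` words that reads
    naturally between the human-written `prefix` and `suffix`."""
    raise NotImplementedError("stand-in for the model; try the demo instead")

# The suffix knob keeps the generation on topic; the length knob bounds
# how much text is inserted between the two endpoints.
story_body = interpolate(
    prefix="John had an exam that he had worked for very hard and had "
           "prepared for many months in advance. But when the day arrived "
           "he felt that he wasn't ready.",
    suffix="He was very pleased when his teacher informed him that he "
           "got an A+.",
    body_words=10,
)
```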

Comparing Controllability

GPT-2 (first 63 words):

[prefix] John had an exam that he had worked for very hard and had prepared for many months in advance. But when the day arrived he felt that he wasn’t ready.

[generated continuation] And it was because that exam left a bad taste for so much of his life that he had to sit out this summer. He never thought he could be in this position again. On his last day, he walked up to the door in front of me and said to me, "I got a job. I am going out, working for somebody." [...]

HAIM (50 words):

[prefix] John had an exam that he had worked for very hard and had prepared for many months in advance. But when the day arrived he felt that he wasn’t ready.

[generated body] He was extremely tired from sitting in the exam hall. So, he just wanted to make sure that he got a satisfactory grade. His teacher was very attentive and he always kept an eye on him. He also asked him many questions and kept an eye out for any problems that might develop.

[suffix] He was very pleased when his teacher informed him that he got an A+.

HAIM (10 words):

[prefix] John had an exam that he had worked for very hard and had prepared for many months in advance. But when the day arrived he felt that he wasn’t ready.

[generated body] The exam actually went well and he was actually a good candidate.

[suffix] He was very pleased when his teacher informed him that he got an A+.

We used the same Transformer-based architecture as GPT-2 and Grover. We trained the model from scratch on OpenWebText, a freely available clone of OpenAI’s WebText dataset. To train the model to generate text conditioned on both a prefix and a suffix, we manipulated the order of the text in every training example and employed other tricks to control the output length. A complete description of our implementation appears in the Appendix below.
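To give a flavor of the idea, here is a hypothetical sketch of how a training example might be rearranged so that a left-to-right model learns to generate the body conditioned on both endpoints. The special tokens and the length-bucketing scheme below are our illustrative assumptions, not HAIM's exact recipe.

```python
# Hypothetical sketch of one way to rearrange a training example so that a
# standard left-to-right language model learns to interpolate. The special
# tokens (<PREFIX>, <SUFFIX>, <LEN_*>, <BODY>, <END>) and the coarse length
# bucketing are illustrative assumptions, not HAIM's exact recipe.

def make_interpolation_example(prefix: str, body: str, suffix: str) -> str:
    """Reorder (prefix, body, suffix) so the body comes last.

    Because the model reads left to right, placing the suffix before the
    body lets it condition on both endpoints when generating the middle.
    A length token hints at how many words the body should contain.
    """
    n_words = len(body.split())
    length_token = f"<LEN_{min(n_words // 10, 9)}>"  # bucket length by tens of words
    return f"<PREFIX> {prefix} <SUFFIX> {suffix} {length_token} <BODY> {body} <END>"

print(make_interpolation_example(
    prefix="But when the day arrived he felt that he wasn't ready.",
    body="The exam actually went well and he was actually a good candidate.",
    suffix="He was very pleased when his teacher informed him that he got an A+.",
))
```

At inference time, the model would be fed everything up to and including the `<BODY>` token and asked to continue until `<END>`, yielding a body that is conditioned on the prefix, the suffix, and the requested length.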

What We’re Sharing

At this time, we're releasing only a demo of HAIM-Large, a variant of the model with 345M parameters, equivalent in size to the publicly released versions of GPT-2 and Grover.

You can try it out here.

Much of the discussion around these systems has focused on the risks that powerful text generators might pose in the wrong hands, specifically their potential for generating fake news in volume. We applaud the caution: fake news is a real issue that the community should take, and is taking, seriously. Still, in our view the "fake news risk" is overemphasized relative to the benefits of facilitating better and more efficient writing. Furthermore, insofar as one is concerned with fake news, we're not sure that suppressing text generation technology is either feasible or the most relevant factor. All that said, we have much respect for our colleagues who have thought hard about these issues, and we are following their lead regarding release policy for now.

Why HAIM?

We could pretend that “HAIM” stands for Halfway Acceptable Interpolating Machine, but the truth is that we simply asked the model to name itself. 

We did so by feeding it the prefix “The team needed a name. The best suggestion” and the suffix “Everybody agreed it was a great name for a state-of-the-art natural language generator.”

It interpolated by inserting “was Haim”. 

“HAIM” seems like a strange name, but who are we to argue with Haim?

Appendix: Technical Details

We used syntok for sentence segmentation.

© 2019 AI21 Labs, LTD. All rights reserved.