Grounding Language Models In-Context: Improving Text Generation and Attribution for Off-the-Shelf LMs

Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham
May 31, 2024

Text generation via language models (LMs) is dramatically improving, but LMs do not attribute their generated text to its sources and can often make mistakes. We propose the simple framework of In-Context Retrieval-Augmented Language Models, which allows for grounding any off-the-shelf LM in knowledge from external sources, and attributing the text it generates to its sources.


Recent advances in language modeling have dramatically increased the usefulness of machine-generated text across a wide range of use cases and domains. An outstanding Achilles' heel of LM-generated text is that it is not attributed to a specific source and often includes factual inaccuracies or errors. This problem is present in any LM generation scenario, and is exacerbated when generation occurs in uncommon domains or involves up-to-date information that the LM has not seen during training. A promising approach for addressing this challenge is Retrieval-Augmented Language Modeling (RALM), which grounds the LM during generation by conditioning it on relevant documents retrieved from an external knowledge source.

Leading RALM systems introduced in recent years tend to focus on altering the language model architecture, and the need for architectural changes and dedicated retraining has hindered the wide adoption of such models. Thus, while the RALM approach has the potential to alleviate factual inaccuracies and to provide direct sources for the generated text, in practice it is not deployed alongside leading LMs.

In our paper, we present In-Context RALM: a simple yet powerful RALM method that can endow any off-the-shelf LM with access to external knowledge sources. In-Context RALM simply inserts the retrieved documents into a regular LM's input, rendering it applicable even to LMs accessible only behind an API. While existing works choose which documents to show the LM via standard general-purpose retrieval approaches, we propose several novel document-selection methods oriented toward grounded generation.
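To make the insertion mechanism concrete, here is a minimal sketch, not the paper's exact implementation: it assumes a Hugging Face causal LM, and `retrieve` is a hypothetical stand-in for any retriever (e.g. BM25 over a document index).

```python
# Minimal sketch of In-Context RALM's insertion mechanism (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def retrieve(query: str) -> str:
    """Hypothetical retriever: return the document most relevant to `query`."""
    raise NotImplementedError

def in_context_ralm_generate(prefix: str, max_new_tokens: int = 32) -> str:
    # Retrieve a document relevant to the current prefix and simply
    # prepend it to the LM's input; the LM itself is left unchanged.
    document = retrieve(prefix)
    inputs = tokenizer(document + "\n\n" + prefix, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Return only the newly generated continuation.
    return tokenizer.decode(output_ids[0, inputs["input_ids"].shape[1]:])
```

For long generations, the retrieved document can be refreshed periodically as the prefix grows, so that the grounding stays relevant to what is currently being generated.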


Our simple and easily deployable setup improves the language modeling abilities of off-the-shelf LMs by an amount equivalent to increasing the LM's number of parameters by 4X, across a diverse evaluation set of five text corpora. We believe that further gains can be achieved by developing the generation-oriented retrieval mechanism, while retaining the straightforward document-insertion mechanism of In-Context RALM.
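As one illustration of what generation-oriented document selection can look like (a sketch of a plausible zero-shot approach, not necessarily the paper's exact method), the LM itself can rerank candidate documents: each candidate is scored by the probability the LM assigns to the prefix when that candidate is prepended. This reuses `model` and `tokenizer` from the sketch above.

```python
# Sketch of LM-based reranking for document selection: score each candidate
# by the log-probability the LM assigns to the prefix when the candidate is
# prepended, and keep the highest-scoring one. Tokenizing the prefix in
# isolation only approximates its tokenization in context.
def rerank_by_lm_score(prefix: str, candidates: list[str]) -> str:
    prefix_len = tokenizer(prefix, return_tensors="pt")["input_ids"].shape[1]
    best_doc, best_score = candidates[0], float("-inf")
    for doc in candidates:
        ids = tokenizer(doc + "\n\n" + prefix, return_tensors="pt")["input_ids"]
        with torch.no_grad():
            logits = model(ids).logits
        # Token-level log-probabilities, shifted so position t predicts token t+1.
        log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
        targets = ids[0, 1:]
        token_logprobs = log_probs[torch.arange(targets.shape[0]), targets]
        # Sum over (approximately) the prefix tokens only.
        score = token_logprobs[-prefix_len:].sum().item()
        if score > best_score:
            best_doc, best_score = doc, score
    return best_doc
```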

To help others both deploy and build upon our work, our paper is accompanied by an online release of all our code, datasets, trained models, and indexes for our standardized suite of corpora.
