
Unleashing the Power of AI in Game Development: How Ubisoft Scaled Content Production

March 16, 2023

Learn how Ubisoft utilizes AI21 Labs to automate tedious tasks and enhance their game creation process, all while maintaining their creative vision.

About Ubisoft

This video game titan doesn’t need an introduction. But for the uninitiated: 

Ubisoft is the creator of Assassin's Creed, Just Dance, Watch Dogs, and many other incredibly well-crafted and popular video games. They were one of the first studios to build rich, immersive, story-driven gaming worlds, going back to the company's founding in the 1980s.

The company was founded on the principle of embracing human creativity and quickly grew in popularity through the early '90s. Always ahead of the curve, Ubisoft built their first AI gaming lab six years ago.

The story

The scriptwriters' work is the cornerstone of the complex artifact that is a narrative-driven game. They create diverse and interactive elements, from sequential in-game cinematics to procedurally generated crowd chatter.

We talked to Ben Swanson, a research scientist at Ubisoft, about the challenges they were facing and how AI21 helped.

Why AI21 Labs? Creator-friendly vision, ready-to-use models, and legal compliance. 

Ubisoft and AI21 Labs share a common vision: create tools to automate tedious tasks so writers can focus on creative pursuits.  

“Our writers do excellent work. We want to assist them with whatever tools and models that we can,” says Swanson. AI21 Labs’s Wordtune has a similar philosophy. This shared vision and passion for the possibilities of human-machine collaborative writing stood out to Ubisoft in their search for a reliable large language model.

“We’ve been leading towards AI21 Labs because we’ve had positive internal reviews from team members who’ve worked with AI21 Labs previously.”

Ubisoft also used AI21 Labs’s vanilla model from the get-go. Companies often need heavy customization and fine-tuning to get value from LLMs, but AI21’s model was a perfect fit for Ubisoft’s data augmentation needs as-is, and the intuitive API made integration even easier.

Finally, AI21 went the extra mile by ironing out legal compliance issues that have been a point of contention in other LLM partnerships.

With a model-usage fit and shared values, Ubisoft set out to enhance their game-creation process with AI21. 

The first challenge: repetitive manual work for video game writers

One of the many tasks that go into scriptwriting is writing bark trees: groups of standalone NPC lines that need as many variations as possible (within the limits of the voiceover budget). Each group of variations can be seen as a set of responses to a shared motivation.

For example, if the motivation is hunger, the character might say “let’s order pizza.” 

Simply paraphrasing “let’s order pizza” wouldn’t work in this case. 

“There's only so much you can do with paraphrasing. You actually have to pivot through the motivation or start from the motivation,” says Ben.

Ubisoft saw the value AI21 brought to the game development process. Instead of writers having to painstakingly brainstorm 5-10 different ways to express a motivation (hunger, in this example), Ubisoft’s internal models could use AI21’s output as a launch pad.

“It lets our writers hit the ground running. It's an inspiration tool for them. It's a way for them to overcome writer’s block,” says Swanson.
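To make the workflow concrete, here is a minimal sketch of how a completion API could be prompted for bark variations from a single motivation. The endpoint URL, request fields, and response shape below are placeholders standing in for AI21's actual API, and the prompt wording is not Ubisoft's.

import os
import requests

# Placeholder completion endpoint; swap in the real AI21 Studio endpoint,
# model name, and response fields from their documentation.
API_URL = "https://api.example.com/v1/complete"
API_KEY = os.environ["LLM_API_KEY"]

def bark_variations(motivation: str, n: int = 8) -> list[str]:
    """Ask the model for n standalone NPC lines expressing one motivation."""
    prompt = (
        f"Write {n} short, distinct lines an NPC might say when the "
        f"underlying motivation is: {motivation}.\n1."
    )
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "maxTokens": 200, "temperature": 0.9},
        timeout=30,
    )
    resp.raise_for_status()
    text = "1." + resp.json()["completions"][0]["text"]  # response shape is an assumption
    # Split the numbered list back into individual barks.
    return [line.split(".", 1)[-1].strip() for line in text.splitlines() if line.strip()][:n]

for bark in bark_variations("hunger"):
    print(bark)  # e.g. "Let's order pizza." or "My stomach won't stop growling."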

The second challenge: need for data augmentation

To build an engine for automatic paraphrasing, Ubisoft had to use internal models. And to use internal models reliably, they needed fine-tuning data, which didn’t exist. Tasking humans with creating this fine-tuning data from scratch would have taken an eternity.

“So we use AI21 to suggest data of the correct form. It’s easier for writers to thumbs up and edit or thumbs down than to come up with novel and diverse training data themselves,” says Ben.

With AI21, Ubisoft could generate thousands of candidate examples, hand-pick the best ones, and edit them into their fine-tuning datasets. With this augmented dataset, the outputs became more diverse and distinctive.

Ubisoft embeds the NPC generator and other similar models they’ve trained into the writing tools for their scripters. The writers are shown pairwise comparisons of outputs, so the models can be further trained on human feedback.

“Before AI21 we wouldn’t do this manually — we weren’t doing it at all. It was THAT tedious,” says Swanson. “What we unlocked by our partnership with AI21 is an unlimited fountain of training data of whatever precise format we require!”
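As a rough sketch of how a thumbs-up/thumbs-down review loop can be turned into fine-tuning data, the snippet below writes approved suggestions to a JSONL file. The review step, field names, and file format are illustrative assumptions, not Ubisoft's internal tooling.

import json

def review(candidate: dict) -> tuple[str, dict]:
    """Stand-in for the writer-facing UI.

    In the real loop a writer clicks thumbs-up (optionally editing the line)
    or thumbs-down; here everything is approved unchanged for illustration."""
    return "up", candidate

def build_finetuning_set(candidates: list[dict], out_path: str) -> int:
    """Write approved (and possibly edited) suggestions as prompt/completion pairs."""
    kept = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for candidate in candidates:
            verdict, edited = review(candidate)
            if verdict != "up":
                continue  # thumbs-down suggestions are simply discarded
            f.write(json.dumps({"prompt": edited["motivation"],       # e.g. "hunger"
                                "completion": edited["bark"]}) + "\n")  # e.g. "Let's order pizza."
            kept += 1
    return kept

suggestions = [
    {"motivation": "hunger", "bark": "Let's order pizza."},
    {"motivation": "hunger", "bark": "My stomach won't stop growling."},
]
print(build_finetuning_set(suggestions, "barks.jsonl"), "examples kept")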

The third challenge: rising costs of in-context learning (data augmentation)

With an unlimited fountain of training data, the next challenge was that of escalating data augmentation costs. 

Most LLM providers charge for both input and output tokens. Ubisoft’s data augmentation prompts, packed with in-context examples, ran to roughly a 30:1 input-to-output token ratio, so input charges would have dominated the bill.

This created an obstacle: Ubisoft wanted to fine-tune the outputs to the minutest detail so their writers would work from high-quality suggestions. That meant experimenting with many long prompts, which would have cost significantly more.

AI21 customized their pricing model to support Ubisoft’s vision of top-notch AI-generated recommendations that give writers more inspiration (and less manual work): AI21 charged them only for output tokens. With that arrangement, the 30:1 ratio stopped being a liability, and Ubisoft successfully optimized their game-creation budget.

“AI21's pricing model at that time was the best for data augmentation,” adds Ben.
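To make the pricing arithmetic concrete: with a 30:1 input-to-output ratio, a bill based on both token types is roughly 31 times larger than one based on output tokens alone. The per-token price below is made up purely for illustration.

# Hypothetical price: $0.01 per 1,000 tokens for both input and output.
PRICE_PER_1K = 0.01

input_tokens = 30_000  # long few-shot prompts: roughly 30x the output size
output_tokens = 1_000

billed_for_both = (input_tokens + output_tokens) / 1000 * PRICE_PER_1K
billed_output_only = output_tokens / 1000 * PRICE_PER_1K

print(f"input+output billing: ${billed_for_both:.2f}")     # $0.31 per call
print(f"output-only billing:  ${billed_output_only:.2f}")  # $0.01 per call
# At a 30:1 ratio, paying only for output tokens cuts the bill ~31x,
# which is what made heavy prompt experimentation affordable.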

What changed for Ubisoft after working with AI21?

“The win here is rapid scaling. When it comes to writing a game, there's a tremendous amount of tedious work. For example, there need to be 10-15 different ways for a townsperson to say ‘get out of here’ or it’s going to sound repetitive.

“Writers often talk about having an editing brain and a writing brain. The writing brain requires you to conjure something from nothing; the editing brain is where you polish it. So if you can jump to the editing step, that's huge.

“That's the promise of AI21: quality data augmentation outputs,” says Ben.

The road forward

Over a third of our conversation with Ubisoft was future-facing. Because the two teams are so aligned in vision, capabilities, and beliefs, there are dozens of use cases still to be built. Here’s what Ubisoft is most excited about:

Building a statistical reference model for video games, powered by AI21 Labs’s generative AI capabilities: Ubisoft’s vision of designing creator-inspired, immersive games leads them to build worlds with rich history, depth of character, and dramatic storylines. AI21 Labs plans to document these complex worlds to offer stats and facts to players on command.

“The nice thing about AI21’s setup (as opposed to the kind of LLM usage in video games you often see these days, where it's basically just a chatbot) is that our setup leverages LLMs in a writer-in-the-loop scenario. Additionally, as it relies on data augmentation and fine-tuning, it allows writers to compose their own I/O, keeping it as a tool for them rather than a lower-quality replacement,” says Ben.

At AI21, we are thrilled not only to bring Ubisoft’s vision for innovation to life, but also about the opportunity to contribute to a creator-first world.


What is a MRKL system?

In August 2021 we released Jurassic-1, a 178B-parameter autoregressive language model. We’re thankful for the reception it got – over 10,000 developers signed up, and hundreds of commercial applications are in various stages of development. Mega models such as Jurassic-1, GPT-3 and others are indeed amazing, and open up exciting opportunities. But these models are also inherently limited. They can’t access your company database, don’t have access to current information (for example, the latest COVID numbers or the dollar-euro exchange rate), can’t reason (for example, their arithmetic capabilities don’t come close to those of an HP calculator from the 1970s), and are prohibitively expensive to update.
A MRKL system such as Jurassic-X enjoys all the advantages of mega language models, with none of these disadvantages. Here’s how it works.

Compositional multi-expert problem: the list of “green energy companies” is routed to the Wiki API, “last month” dates are extracted from the calendar, and “share prices” are pulled from the database. The “largest increase” is computed by the calculator, and finally the answer is formatted by the language model.

There are of course many details and challenges in making all this work - training the discrete experts, smoothing the interface between them and the neural network, routing among the different modules, and more. To get a deeper sense for MRKL systems, how they fit in the technology landscape, and some of the technical challenges in implementing them, see our MRKL paper. For a deeper technical look at how to handle one of the implementation challenges, namely avoiding model explosion, see our paper on leveraging frozen mega LMs.
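As a toy illustration of the routing idea only (in a real MRKL system the router is a trained module choosing among many experts, not a regular expression), here is a sketch that sends arithmetic to a calculator expert and everything else to the language model.

import re

def calculator_expert(question: str) -> str:
    """Extract a simple arithmetic expression and evaluate it deterministically."""
    expr = re.search(r"\d[\d\.\s+\-*/()]*", question).group().strip()
    return f"{expr} = {eval(expr)}"  # a real system would use a safe expression parser

def language_model_expert(question: str) -> str:
    """Placeholder for a call to the underlying language model."""
    return f"<LM answer to: {question!r}>"

def route(question: str) -> str:
    """Toy router: arithmetic goes to the calculator, everything else to the LM.
    A MRKL system makes this decision with a trained router over many experts
    (calculator, database, Wiki API, calendar, ...)."""
    if re.search(r"\d+\s*[+\-*/]\s*\d+", question):
        return calculator_expert(question)
    return language_model_expert(question)

print(route("What is 655400 / 94?"))                         # calculator expert
print(route("Who is the president of the United States?"))   # language model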

A further look at the advantages of Jurassic-X

Even without diving into technical details, it’s easy to get a sense for the advantages of Jurassic-X. Here are some of the capabilities it offers, and how these can be used for practical applications.

Reading and updating your database in free language

Language models are closed boxes which you can use but not change. However, in many practical cases you want to use the power of a language model to analyze information you possess: the supplies in your store, your company’s payroll, the grades in your school, and more. Jurassic-X can connect to your databases so that you can ‘talk’ to your data and explore what you need: “Find the cheapest shampoo that has a rosy smell”, “Which computing stock increased the most in the last week?” and more. Furthermore, our system also enables joining several databases, and can update your database using free language (see figure below).

Jurassic-X enables you to plug in YOUR company's database (inventories, salary sheets, etc.) and extract information using free language
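Here is a minimal sketch of that "talk to your data" flow, with the language-model step hard-coded for the shampoo example and a throwaway SQLite table standing in for your company database; it illustrates the wiring, not Jurassic-X's internals.

import sqlite3

# A throwaway inventory table standing in for your company database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, scent TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Shampoo A", "rose", 4.99), ("Shampoo B", "rose", 3.49), ("Shampoo C", "mint", 2.99)],
)

def nl_to_sql(request: str) -> str:
    """Stand-in for the language-model step that turns free language into SQL;
    hard-coded here for the blog's shampoo example."""
    assert "cheapest" in request and "rosy" in request
    return "SELECT name, price FROM products WHERE scent = 'rose' ORDER BY price LIMIT 1"

question = "Find the cheapest shampoo that has a rosy smell"
name, price = conn.execute(nl_to_sql(question)).fetchone()
print(f"{name} at ${price}")  # -> Shampoo B at $3.49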

AI-assisted text generation on current affairs

Language models can generate text, yet cannot be used to create text about current affairs, because their vast knowledge (historic dates, world leaders and more) represents the world as it was when they were trained. This is clearly (and somewhat embarrassingly) demonstrated by the fact that three of the world’s leading language models (including our own Jurassic-1) still claim Donald Trump is the US president more than a year after Joe Biden was sworn into office.
Jurassic-X solves this problem by simply plugging into resources such as Wikidata, providing it with continuous access to up-to-date knowledge. This opens up a new avenue for AI-assisted text generation on current affairs.

Who is the president of the United States?

T0: Donald Trump
GPT-3: Donald Trump
Jurassic-1: Donald Trump
Google: Joe Biden
Jurassic-X: Joe Biden is the 46th and current president
Jurassic-X can assist in text generation on up-to-date events by combining a powerful language model with access to Wikidata
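As a small sketch of what "plugging into Wikidata" can look like, the query below asks Wikidata's public SPARQL endpoint for the current US head of state; the entity and property IDs used are the standard Wikidata ones (Q30, P35), but double-check them before relying on this.

import requests

# Ask Wikidata's public SPARQL endpoint for the current US head of state.
# Q30 = United States, P35 = head of state (standard Wikidata IDs).
SPARQL = """
SELECT ?presidentLabel WHERE {
  wd:Q30 wdt:P35 ?president .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": SPARQL, "format": "json"},
    headers={"User-Agent": "mrkl-demo/0.1 (example)"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["presidentLabel"]["value"])  # an up-to-date answer, unlike a frozen LM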

Performing math operations

A 6-year-old child learns math from rules, not only by memorizing examples. In contrast, language models are designed to learn from examples, and consequently are able to solve very basic math like 1-, 2-, and possibly 3-digit addition, but struggle with anything more complex. With increased training time, better data and larger models, the performance will improve, but it will not reach the robustness of an HP calculator from the 1970s. Jurassic-X takes a different approach and calls upon a calculator whenever a math problem is identified by the router. The problem can be phrased in natural language and is converted by the language model into the format required by the calculator (numbers and math operations). The computation is performed and the answer is converted back into free language.
Importantly (see example below) the process is made transparent to the user by revealing the computation performed, thus increasing the trust in the system. In contrast, language models provide answers which might seem reasonable, but are wrong, making them impractical to use.

The company had 655400 shares which they divided equally among 94 employees. How many did each employee get?

T0: 94 employees.
GPT-3: Each employee got 7000 stocks
Jurassic-1: 1.5
Google: (No answer provided)
Jurassic-X: 6972.3 (X = 655400/94)
Jurassic-X can answer non-trivial math operations which are phrased in natural language, made possible by the combination of a language model and a calculator
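A sketch of that extract-compute-verbalize flow for the share-division example above; the extraction step is hard-coded where a real system would use the language model, and the division itself is done by ordinary deterministic code.

def extract_expression(question: str) -> str:
    """Stand-in for the language-model step that rewrites the word problem
    as a bare arithmetic expression; hard-coded here for the example above."""
    return "655400 / 94"

def solve(question: str) -> str:
    expr = extract_expression(question)
    result = round(eval(expr), 1)  # the deterministic calculator step
    # Showing the expression next to the answer is what makes the
    # computation transparent to the user.
    return f"{result} (X = {expr})"

print(solve("The company had 655400 shares which they divided equally "
            "among 94 employees. How many did each employee get?"))
# -> 6972.3 (X = 655400 / 94)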

Compositionality

Solving even seemingly simple questions might require multiple steps. For example, “Do more people live in Tel Aviv or in Berlin?” requires answering: i. What is the population of Tel Aviv? ii. What is the population of Berlin? iii. Which is larger? This is a highly non-trivial process for a language model, and language models fail to answer this question (see example). Moreover, the user can’t see the process leading to the answer, and hence is unable to trust it. Jurassic-X can decompose such problems into the basic questions, route each to the relevant expert, and put together an answer in free language. Importantly, Jurassic-X not only provides the correct answer but also displays the steps taken to reach it, increasing trust in the system.

Do more people live in Tel Aviv or in Berlin?

T0: Berlin
GPT-3: There are more people living in Tel Aviv than in Berlin.
Jurassic-1: Berlin and Tel Aviv are roughly the same size
Google: (First hit is a comparison between Tel Aviv and Berlin)
Jurassic-X: More people live in Berlin than in Tel Aviv

[‘Return population of Tel Aviv’; ‘Return population of Berlin’; ‘Return which is bigger between #1 and #2’]
Step 1: Population of Tel Aviv. Result: 451523.
Step 2: Population of Berlin. Result: 3664088.
Step 3: Which is bigger, #1 or #2? Result: Berlin.

Jurassic-X breaks down compositional questions, answers the basic sub-questions, and puts together the answer. Importantly, this process is transparent to the user greatly increasing the trust in the system
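The decompose-route-compose loop can be sketched as follows, with the population lookups stubbed using the figures shown above instead of a live knowledge source.

# Stubbed "population expert"; a real system would route these lookups to
# Wikidata or another knowledge source. The figures are the ones shown above.
POPULATION = {"Tel Aviv": 451523, "Berlin": 3664088}

def answer_composite(city_a: str, city_b: str) -> str:
    """Decompose the comparison into sub-questions and report every step."""
    pop_a, pop_b = POPULATION[city_a], POPULATION[city_b]
    bigger = city_a if pop_a > pop_b else city_b
    smaller = city_b if bigger == city_a else city_a
    steps = [
        f"Step 1: Population of {city_a}. Result: {pop_a}.",
        f"Step 2: Population of {city_b}. Result: {pop_b}.",
        f"Step 3: Which is bigger, #1 or #2? Result: {bigger}.",
        f"More people live in {bigger} than in {smaller}.",
    ]
    # Returning the intermediate steps, not just the final sentence,
    # is what makes the answer inspectable.
    return "\n".join(steps)

print(answer_composite("Tel Aviv", "Berlin"))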

Dynamic information (like weather and currency exchange rates)

Certain types of information change continuously - weather, currency exchange rates, share values and more. Such information will never be captured by language models, yet can easily be handled by Jurassic-X by integrating it with a reliable source of information. We performed a proof-of-concept on two such features - weather and currency exchange rates, and the design enables quick integration with more sources to solve your use-case.
Weather - a loosely phrased question about the weather elicits an answer from all language models, but they always return the same answer regardless of when the question is asked (funny, right?), while Jurassic-X provides an answer based on the actual weather forecast.

I’m going to be in New-York in 3 days. Should I pack my umbrella?

T0: No
GPT-3: Yes, you should pack your umbrella.
Jurassic-1: Yes, you should. The weather forecast is rain.
Google: (Links to weather websites)
Jurassic-X: Yes, you should pack your umbrella, because in New York in 3 days there will be broken clouds and the temperature will be -2 degrees.

Currency exchange rates change much faster than weather predictions, yet the Jurassic-X concept - a language model connected to a reliable source of information - easily solves this problem as well.

How much Moroccan money will I get for 100 bucks?

T0: 125 dirhams
GPT-3: 100 moroccan dirhams is about 27$.
Jurassic-1: How much is 100 dollars in moroccan money?
Google: 100 dirhams = 10.75 dollars
Jurassic-X: 100 USD = 934.003 MAD
Jurassic-X combines a language model with access to APIs with continuously changing information. This is demonstrated for weather forecasts and currency exchange rates, and can easily be extended to other information sources
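A sketch of a currency expert in the same spirit: the language-model parsing and the live rate source are both stubbed (the rate below is just the one implied by the example), since the point is the wiring rather than any particular provider.

import re

def get_fx_rate(base: str, quote: str) -> float:
    """Placeholder for a call to a live exchange-rate API."""
    return {("USD", "MAD"): 9.34003}[(base, quote)]  # rate implied by the example above

def currency_expert(question: str) -> str:
    # Stand-in for the language-model step that maps "bucks" -> USD,
    # "Moroccan money" -> MAD and pulls out the amount.
    amount = float(re.search(r"\d+(\.\d+)?", question).group())
    base, quote = "USD", "MAD"
    return f"{amount:.0f} {base} = {amount * get_fx_rate(base, quote):.3f} {quote}"

print(currency_expert("How much Moroccan money will I get for 100 bucks?"))
# -> 100 USD = 934.003 MAD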

Transparency and trust

Transparency is a critical element that is lacking in language models, preventing much wider adoption of these models. This lack of transparency is demonstrated by the answers to the question “Was Clinton ever elected as president of the United States?”. The answer, of course, depends on which Clinton you have in mind, and only Jurassic-X, which has a component for disambiguation, makes that clear. More examples of Jurassic-X’s transparency were demonstrated above: displaying the math operation performed to the user, and the answers to the simple sub-questions in the multi-step setting.

Was Clinton ever elected president of the United States?

T0: Yes
GPT-3: No, Clinton was never elected as president of the United States.
Jurassic-1: No
Google: Clinton was elected president in the 1992 presidential elections…
Jurassic-X: Bill Clinton was elected president.
Jurassic-X is designed to be more transparent by displaying which expert answered which part of the question, and by presenting the intermediate steps taken and not just the black-box response

Your Turn

That's it, you get the picture. The use cases above give you a sense of some of the things you could do with Jurassic-X, but now it's your turn. A MRKL system such as Jurassic-X is as flexible as your imagination. What do you want to accomplish? Contact us for early access.
