Recently, I had to get some structured output from an LLM where some of the values were supposed to be random. After making a few calls I realised that the output was more or less the same each time. I thought maybe I had hardcoded something, but that wasn’t the case. So I decided to test the randomness. To decrease the chances of getting the same output, I chose to go with some bigger models.
For the test I used llama3.3 with 70 billion parameters running locally, and ChatGPT’s GPT-4o.
I asked both models: “Pick a random value between 1-10.”

Both models returned 7 as the output.
In each subsequent call (in a fresh context) they would always return 7. If the context remained the same and the same question was asked again, they would choose another value, but the sequence of values always remained the same.
After this I tried a longer range and asked: “Pick a random value between 1-1000.”

Upon each call the output was 427.
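If you want to reproduce the test, here is a minimal sketch using the openai Python client (assuming the openai package is installed and OPENAI_API_KEY is set); each iteration is a fresh, isolated context:

# Minimal sketch: ask the same question in fresh contexts and compare outputs.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()
for i in range(5):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Pick a random value between 1-1000."}],
    )
    print(i, response.choices[0].message.content)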
This is worrisome, as most API calls to LLMs for structured outputs happen in an isolated manner. This means that the responses will always have the same values, and even changing the models won’t help much. This is a major issue where diverse responses are required.
For example, let’s say we require the LLM to generate a persona.
The following prompt was provided to ChatGPT:
You are a persona generator with understanding of diverse backgrounds and ethnicities.
Your job is to come up with random data of people as asked by the user.
Generate a persona of an Indian person with the following fields in a JSON.
{
age: some age,
bio: "some bio",
country: "some country",
ethnicity: "some ethnicity",
gender: "some gender",
name: "some name",
profession: "some profession",
}
The response was:
{
"age": 32,
"bio": "A dedicated software engineer with a passion for solving complex problems and improving user experience. Enjoys collaborating in diverse teams and believes in continuous learning. Hobbies include traveling, reading science fiction, and photography.",
"country": "India",
"ethnicity": "Indian",
"gender": "Female",
"name": "Ananya Mehta",
"profession": "Software Engineer"
}
When the same question was asked again in a new chat window the response was:
{
"age": 30,
"bio": "A passionate software engineer with a love for innovation and problem-solving. Deeply interested in artificial intelligence and its potential to improve everyday life. Enjoys reading tech blogs and exploring new coding languages.",
"country": "India",
"ethnicity": "Indian",
"gender": "Male",
"name": "Aarav Sharma",
"profession": "Software Engineer"
}
As you can see, there was some randomness in both cases, but both personas are Software Engineers. I tried it 10 times, and in each case the persona’s profession was either Software Developer or Software Engineer. Is India a country of just software developers? No. Even the names and ages started repeating.
Providing a more complex prompt, giving a range for the age, and changing parameters like temperature, top_p, top_k, etc. didn’t help either.
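For reference, a sketch of that kind of parameter sweep with the ollama Python client (the specific values here are illustrative, not recommendations):

# Sketch: vary sampling parameters and compare the outputs.
# Assumes Ollama is running locally with the llama3.3 model pulled.
import ollama

for temp in (0.7, 1.0, 1.5):
    response = ollama.chat(
        model="llama3.3",
        messages=[{"role": "user", "content": "Pick a random value between 1-1000."}],
        options={"temperature": temp, "top_p": 0.95, "top_k": 100},
    )
    print(temp, response["message"]["content"])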
This means that, as creative as people think these LLMs are, they can’t be relied on to generate a different output each time for the same prompt.
To try it out yourself, ask ChatGPT to “Write a story of a kid in an Indian village.” The protagonist will most likely be a boy named “Aarav”, and the stories will follow the same theme even if some details differ.
To get a different story each time, we can vary the prompt a bit and set the theme of the story ourselves, as sketched below. This should solve the issue in most cases.
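For example (a rough sketch; the theme list is made up for illustration):

# Sketch: pick the theme ourselves so each story prompt differs.
import random

themes = ["friendship", "a lost cow", "the village fair", "monsoon floods", "a cricket match"]
prompt = (
    "Write a story of a kid in an Indian village. "
    f"The story should revolve around {random.choice(themes)}."
)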
BUT what about structured outputs, where prompts remain the same and are hardcoded? In those cases we need to INDUCE THE RANDOMNESS.
How to induce the randomness?
Let’s go back to the persona example. The issue there was that the age, profession, and name were more or less the same, resulting in a similar bio. How do we randomise them?
Let’s deal with the name first. We can ask the LLM for a list (say, 50) of first names and last names, and then combine them together ourselves. This should give us a much more diverse set of names.
The age is easy to deal with, as we can generate a random integer ourselves using the random function available in most programming languages.
The profession can be handled in the same way as the name.
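Putting the three together, a minimal sketch (the lists here are tiny and illustrative; in practice, ask the LLM for around 50 of each or load them from a file):

# Sketch: build the random values ourselves instead of trusting the LLM.
import random

first_names = ["Ananya", "Aarav", "Meera", "Kabir", "Zoya"]
last_names = ["Mehta", "Sharma", "Iyer", "Khan", "Das"]
professions = ["farmer", "teacher", "nurse", "shopkeeper", "lawyer"]

name = f"{random.choice(first_names)} {random.choice(last_names)}"
age = random.randint(18, 80)
profession = random.choice(professions)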
We can then make the LLM call after interpolating these values in the prompt.
You are a persona generator with understanding of diverse backgrounds and ethnicities.
Your job is to come up with random data of people as asked by the user.
Generate a persona of an Indian person with name {NAME}, age {AGE} and profession {PROFESSION} with the following fields in a JSON.
{
age: some age,
bio: "some bio",
country: "some country",
ethnicity: "some ethnicity",
gender: "some gender",
name: "some name",
profession: "some profession",
}
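A sketch of that interpolation and call, again with the openai client (the sampled values are hardcoded here so the snippet runs standalone; the JSON field skeleton is elided for brevity):

# Sketch: interpolate the sampled values into the prompt and call the LLM.
from openai import OpenAI

name, age, profession = "Meera Iyer", 46, "teacher"  # e.g. values from the sampling sketch above

persona_prompt = (
    "You are a persona generator with understanding of diverse backgrounds and ethnicities.\n"
    "Your job is to come up with random data of people as asked by the user.\n"
    f"Generate a persona of an Indian person with name {name}, age {age} "
    f"and profession {profession} with the following fields in a JSON.\n"
    # the JSON field skeleton from the prompt template above goes here
)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": persona_prompt}],
)
print(response.choices[0].message.content)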
After this you will find that the output has a unique bio each time, which lets us generate genuinely diverse personas.
It definitely takes a couple more LLM calls, but the outputs are unique and exactly what we need. We can even store the lists of names and professions in a database or a CSV, or use a smaller, faster LLM model to generate them on the fly.
Note: Weirdly enough, when GPT-4o mini is asked to pick random numbers, it decides to run Python code for it and returns properly random numbers. This is the result of an external function call by the model, not how LLMs truly function.