Rahul Kumar

Why Do Large Language Models Hallucinate?

Updated: Aug 8, 2023



We will continue our discussion of the what, why, and how of minimizing hallucination.

What Is Hallucination?

Hallucinations are outputs of a Large Language Model that deviate from facts or contextual logic. They can range from minor inconsistencies to completely fabricated or contradictory statements.

We can categorize hallucinations at different levels of granularity:

Sentence Contradiction:

This is where the LLM generates a sentence that contradicts a sentence it generated earlier. Example: The color of the blackboard is black. The color of the blackboard is white.

Prompt Contradiction:

In this scenario, the generated sentence contradicts the prompt that was used to generate it. Example: Prompt: Write a positive review of a restaurant. Response from LLM: The food was terrible and the staff were rude.

Factual Contradiction:

These are outputs that contradict known facts. Example: London is the capital of India.

Nonsensical Hallucination:

This is completely irrelevant information put together in a way that doesn't make sense.

Why Do LLMs Hallucinate?

Data Quality:

LLMs are trained on large corpora of text that inherently contain noise, errors, biases, and inconsistencies. Even if the training data were completely accurate, it would not cover all the possible domains and topics an LLM is expected to generate content about, so the model may generalize from its data without being able to verify accuracy or relevance.

As LLM reasoning capability improves, hallucination decreases.

Generation Method:

LLMs use various decoding methods and objectives to generate text, such as beam search, sampling, and maximum likelihood estimation. These methods can introduce biases and trade-offs between fluency and diversity, coherence and creativity, or accuracy and novelty.
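To make this concrete, here is a minimal sketch (using the Hugging Face transformers library, with GPT-2 purely as an illustrative model and the generation settings chosen as assumptions, not recommendations) that completes the same prompt with greedy decoding, beam search, and sampling:

```python
# Minimal sketch: comparing decoding strategies with Hugging Face transformers.
# The model (gpt2) and settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of India is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding: always pick the single most likely next token.
greedy = model.generate(**inputs, max_new_tokens=20, do_sample=False)

# Beam search: keep several candidate continuations, return the best-scoring one.
beam = model.generate(**inputs, max_new_tokens=20, num_beams=5, do_sample=False)

# Sampling: draw tokens from the probability distribution (more diverse, less predictable).
sampled = model.generate(**inputs, max_new_tokens=20, do_sample=True, temperature=0.9)

for name, out in [("greedy", greedy), ("beam", beam), ("sampled", sampled)]:
    print(name, "->", tokenizer.decode(out[0], skip_special_tokens=True))
```

Each strategy trades off differently: greedy decoding is the most conservative, beam search optimizes overall sequence likelihood, and sampling favors diversity at the cost of predictability.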

Input Context:

This is the information given to the LLM as the input prompt. It can help guide the model toward relevant and accurate output, but it can also mislead the model if it is unspecific, unclear, or contradictory. In-context learning, or prompt engineering, is a field in itself devoted to crafting and selecting useful prompts.

Input context is the one factor the user has direct control over.
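For example (the context passage and question below are invented for illustration), supplying relevant context in the prompt constrains the model to answer from that context rather than from its memory:

```python
# Illustrative sketch: grounding a question in supplied context.
# The context passage and question are made up for the example.
context = (
    "New Delhi is the capital of India. It became the seat of the "
    "Government of India in 1931."
)
question = "What is the capital of India?"

grounded_prompt = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say 'I don't know.'\n\n"
    f"Context: {context}\n\nQuestion: {question}\nAnswer:"
)
print(grounded_prompt)
```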

How Can We Minimize Hallucination?

Clear and Precise Prompt:

We should be very clear and precise in order to steer the model toward the required objective.
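For example (an invented illustration), compare a vague prompt with a more precise one:

```python
# Illustrative only: a vague prompt invites the model to fill gaps on its own,
# while a precise prompt constrains scope, length, and behavior.
vague_prompt = "Tell me about transformers."

precise_prompt = (
    "In 3-4 sentences, explain what the transformer architecture in deep learning is, "
    "focusing on self-attention. If you are unsure about a detail, say so instead of guessing."
)
```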

Active Mitigation Strategy:

This means adjusting the settings the LLM uses during generation, for example temperature. Temperature controls the randomness of the output: a lower temperature produces more conservative and focused responses, while a higher temperature produces more creative and diverse responses. The higher the temperature, the greater the chance the model will hallucinate.

Besides temperature, other settings include top-k, top-p, stop sequences, and frequency and presence penalties.
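To see what these settings actually do, here is a small self-contained sketch (plain NumPy, with made-up logits over a toy vocabulary) of how temperature, top-k, and top-p reshape the distribution the next token is sampled from:

```python
# Sketch of how temperature, top-k, and top-p affect next-token sampling.
# The logits below are invented for illustration.
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float)

    # Temperature: <1 sharpens the distribution (more focused), >1 flattens it (more random).
    logits = logits / max(temperature, 1e-8)

    # Top-k: keep only the k highest-scoring tokens.
    if top_k is not None:
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)

    # Softmax over the remaining tokens.
    probs = np.exp(logits - np.max(logits))
    probs = probs / probs.sum()

    # Top-p (nucleus): keep the smallest set of tokens whose cumulative probability >= p.
    if top_p is not None:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask / mask.sum()

    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.5, -1.0, -3.0]                        # toy vocabulary of 5 tokens
print(sample_next_token(logits, temperature=0.3))            # conservative: almost always token 0
print(sample_next_token(logits, temperature=1.5, top_k=3))   # more diverse, limited to the top 3 tokens
```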

Multi-Shot Prompting:

This provides the model with multiple examples of the desired input/output format, steering it toward a clearer understanding of what is required.
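For example, a multi-shot sentiment-classification prompt might look like this (the reviews and format below are invented for illustration):

```python
# Sketch of a multi-shot (few-shot) prompt: a few worked examples precede the real query
# so the model can infer the expected format. The examples here are made up.
few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: The pasta was delicious and the service was quick.
Sentiment: Positive

Review: We waited an hour and the soup arrived cold.
Sentiment: Negative

Review: The staff were friendly and the dessert was amazing.
Sentiment:"""

# This string would be sent to the LLM, which should complete it with "Positive".
print(few_shot_prompt)
```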

Thanks
