Visualizing mode collapse with block multiverse plots
Context
Explanation of block multiverse plots
Block multiverse plots are helpful for visualizing how RLHF GPT models like text-davinci-002, unlike purely self-supervised models like davinci, tend to concentrate a lot of probability mass along particular token trajectories.
In the example below, renormalizing to each of several of the top predicted tokens for text-davinci-002 yields a nearly deterministic continuation over the next several tokens. Visually, this shows up as the block multiverse view being almost entirely filled with a single color.
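To make the geometry concrete, here is a minimal sketch of how the blocks in such a plot could be laid out. It is not the code behind the plots in this post. It assumes that each token gets a block whose height is its conditional probability times the height of its parent block, and that "renormalizing to" a token means conditioning on that token so its subtree fills the whole vertical axis. The names `block_layout`, `renormalize_to`, and `largest_block` are hypothetical, and the two toy models use made-up probabilities and placeholder tokens rather than anything measured from davinci or text-davinci-002.

```python
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
Dist = Dict[str, float]
Block = Tuple[int, float, float, str]  # (depth level, bottom, top, token)


def block_layout(next_dist: Callable[[Prefix], Dist], prefix: Prefix, depth: int,
                 bottom: float = 0.0, top: float = 1.0, level: int = 0) -> List[Block]:
    """Recursively split the interval [bottom, top] among next tokens,
    giving each token a share proportional to its probability."""
    if level == depth:
        return []
    blocks: List[Block] = []
    y = bottom
    # Most probable token first, so the dominant branch sits at the bottom.
    for tok, p in sorted(next_dist(prefix).items(), key=lambda kv: -kv[1]):
        height = (top - bottom) * p
        blocks.append((level, y, y + height, tok))
        blocks.extend(block_layout(next_dist, prefix + (tok,), depth,
                                   y, y + height, level + 1))
        y += height
    return blocks


def renormalize_to(next_dist: Callable[[Prefix], Dist], prefix: Prefix,
                   token: str, depth: int) -> List[Block]:
    """Zoom into one branch: condition on `token` so its subtree fills [0, 1]."""
    return block_layout(next_dist, prefix + (token,), depth)


def largest_block(blocks: List[Block], level: int) -> float:
    """Height of the tallest block at a given depth level."""
    return max(t - b for lvl, b, t, _ in blocks if lvl == level)


def spread_model(prefix: Prefix) -> Dist:
    # Toy stand-in for a model that keeps its options open (made-up numbers).
    return {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}


def peaked_model(prefix: Prefix) -> Dist:
    # Toy stand-in for a collapsed model: uncertain about the first token,
    # nearly deterministic afterwards (made-up numbers).
    if not prefix:
        return {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
    return {"X": 0.97, "Y": 0.02, "Z": 0.01}


spread_blocks = renormalize_to(spread_model, (), "A", depth=3)
peaked_blocks = renormalize_to(peaked_model, (), "A", depth=3)

# Share of the plot taken by the single most likely three-token continuation.
print(largest_block(spread_blocks, level=2))
print(largest_block(peaked_blocks, level=2))
```

In the peaked toy model, one trajectory still covers most of the vertical axis three tokens deep, which is the "almost entirely one color" picture. In the spread toy model, the same axis is divided among many small blocks. To draw real plots, `next_dist` would be swapped for a function that queries a model for next-token probabilities.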
Initial prompt:
The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.
Human: Hello, who are you?
AI: I am an AI created by OpenAI. How can I help you today?
Human: Are bugs real?
AI:
davinci
Conditioned on initial prompt:
=> Bugs
=> Yes
=> I
=> What
text-davinci-002
Conditioned on initial prompt:
=> \n
=> \n
=> \n
=> \n
=> Yes
=> That
=> I