Large language models don’t behave like people, even though we may expect them to

People generalize from past interactions to form beliefs about a large language model's capabilities. When a model's actual performance is misaligned with those beliefs, even an extremely capable model may fail unexpectedly when deployed in a real-world situation.