Waiting for a delayed flight or a late package is a common experience. Uncertainty about when an event will actually occur is a frequent source of consumer anxiety. Customers regularly turn to service representatives to ask when their wait will end.
A recent investigation addresses this scenario, shedding light on how people react when they receive time-based predictions. The study, published in the Journal of Retailing and Consumer Services, sought to determine whether customers respond differently depending on the identity of the service agent.
Specifically, the study tested whether people evaluate answers differently when an artificial intelligence, or AI, chatbot provides the information compared to a human employee. The central finding showed that consumers evaluate predictions more positively when they come from a human.
Identifying the Knowledge Gap
Companies increasingly rely on automated software to handle customer questions. Existing research shows that people often react differently to human workers than they do to software programs. However, researchers identified a lack of data regarding how customers react when non-human programs attempt to predict future events.
Magnus Söderlund, a researcher at the Stockholm School of Economics, designed a study to investigate this specific interaction. To understand this investigation, it helps to understand a psychological concept called “theory of mind.” This term refers to the ability to recognize that other people have their own individual thoughts, goals, and limitations.
Predicting when a complex event will happen, like a flight departure, requires an understanding of the various human decisions involved in that process. Söderlund wanted to know if people believe a computer program possesses enough theory of mind to accurately forecast events driven by human actions. People generally know that algorithms rely on historical data patterns. They may assume these programs cannot fully grasp the unpredictable nature of human choices.
Testing Agent Identity in Two Experiments
Söderlund conducted two separate experiments using online participants. In the first experiment, 380 participants read a scenario about waiting for a delayed connecting flight at an airport. In the scenario, they asked a service agent named Emma when the flight would depart.
Half of the participants read that Emma was a 37-year-old human employee with 15 years of experience. The other half read that Emma was an AI chatbot trained on a massive volume of airline data. The participants then read Emma’s prediction about the departure time.
Söderlund measured how the participants rated Emma’s theory of mind, her prediction skills, and their overall evaluation of her answer. The data were collected through questionnaires that used ten-point rating scales.
The analysis revealed a specific chain of judgments. First, participants attributed a higher level of theory of mind to the human agent than to the chatbot. This initial belief led participants to rate the human as having better prediction skills. Finally, these higher perceived skills resulted in a more positive overall evaluation of the human’s prediction.
The first experiment also tested whether different wording, such as saying a flight “will” leave versus is “likely” to leave, changed the results. The data showed that the specific wording did not alter the participants’ reactions. Interestingly, while the chatbot scored lower than the human, participants still attributed a moderate amount of understanding to the software. They did not view the computer program as entirely mindless.
The second experiment involved 400 new participants and followed a similar flight delay scenario. This time, the researcher introduced a new variable to test what happens when the prediction is either accurate or inaccurate. Participants were told the actual departure time, revealing whether Emma’s prediction was right or wrong.
The analysis showed that accurate predictions naturally received better evaluations across the board. It also revealed that when the AI chatbot made an inaccurate prediction, participants lowered their rating of its theory of mind much more severely than they did for a human who made the exact same mistake.
Actionable Business Insights
For businesspeople, this research offers specific insights into managing customer service interactions. When customers ask for predictions about future events, companies may experience better customer responses if they route these questions to human employees. A human touch appears to lend credibility to estimates involving complex human processes.
If a company must use chatbots for these interactions, the study outlines potential ways to manage the conversation. For example, a chatbot could warn the customer that predicting the exact timing of complex events is difficult before providing an estimate. This sets expectations and may prevent a harsh reaction if the prediction turns out to be wrong.
Another option is programming the chatbot to admit it cannot predict the future. While this displays honesty, Söderlund notes that previous studies show “I do not know” answers can independently lower a customer’s perception of service quality.
There are a few caveats to consider regarding this research. The study focused specifically on a negative situation, which was a flight delay. It is not entirely clear if customers would react the exact same way if they were asking a chatbot about a positive event, such as the arrival time of a highly anticipated new product. Additionally, the participants were reading a hypothetical scenario rather than experiencing a real-life delay in an actual airport.