The Psychology of Conversational AI Blog

Learn more about how psychology and dialog design work together. In this blog, I write about observations from my professional experience over the last few years and about recent studies on the topic.

Please note that the articles reflect my professional opinion and are not a conclusive scientific treatment of these issues. As in any other field of psychological research, there are always studies that reach different conclusions, depending in part on the variables measured, how they are operationalized, and how they are weighted when interpreting the results.

Should a digital assistant be humorous?
Humor is an important conversational tool and a highly valued social trait.

👉 The majority of people prefer to surround themselves with humorous people rather than with others – this applies to friends and also to romantic partners.
👉 Humor is a social lubricant. It helps build and maintain relationships and facilitates interactions. It is also used in groups to strengthen social bonds and counteract stress, for example.

We can also see how positively humor affects us by the extent to which it is used in advertising. It is estimated that 30-42% of all advertisements contain humor.

Results of many studies show that people develop a positive attitude towards humorous advertising content and towards the brand names advertised in this way.

It is also interesting to note that the decision to buy a product can be positively influenced by the mere association with humor. In one study, for example, participants were more likely to want to buy products advertised near funny cartoons than those advertised near neutral cartoons.

❓ What does all this mean for the design of conversational AI?

We know that a digital assistant is evaluated more positively when it is designed to be human-like. However, this effect turns negative when the assistant acts too human-like (uncanny valley effect).

The ability to develop and understand humor is a special feature of the human species. Therefore, we shouldn’t overdo it with the use of humor when designing a digital assistant. However, humor has indeed been found to have a positive effect on how users perceive digital assistants:

👉 In computer systems in which tasks might be long and boring, humor can be used to maintain long-term interactions and alleviate boredom (McTear et al., 2016).
👉 A humorous chatbot might encourage more positive involvement and make it more likely that people perceive it as emotionally intelligent (Dybala et al., 2009).
👉 In one study, participants found emojis to be a great way to add humor and emotion to the conversation. They also stated that the choice of emojis, and how frequently they were used, helped communicate the chatbot's personality.

Dybala, P., Ptaszynski, M., Rzepka, R., & Araki, K. (2009, May). Humoroids: conversational agents that induce positive emotions with humor. In AAMAS’09 Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems (Vol. 2, pp. 1171-1172). ACM.
McTear, M., Callejas, Z., & Griol, D. (2016). The conversational interface: Talking to smart devices. Springer.
Röhner, J. & Schütz, A. (2020). Psychologie der Kommunikation. Wiesbaden: Springer VS.

: ) or ^_^ ?
Which of these emoticon styles would you rather choose to express joy?

If you prefer the horizontal style (like :O for surprise), then chances are you are from a Western culture.
👉 In Western cultures, where people are used to reading horizontally from left to right, the horizontal style dominates: the eyes are shown first and then the mouth, in sequence from left to right.
👉 In Eastern cultures, where people are used to reading from top to bottom, the vertical style dominates: the eyes are shown above the mouth (Park et al., 2013).

Sadness in the
👉 horizontal style : (
👉 and in the vertical style T_T

The part of the emoticon that carries the meaning differs as well. In the vertical style, the meaning is indicated by the shape of the eyes; in the horizontal style, it is indicated by the shape of the mouth.
It is also interesting in this context that the choice of style depends less on where a person lives geographically than on which language he or she predominantly uses. In Indonesia, for example, where English is widely used online, people tend to use the horizontal emoticon style accordingly.

Why is this interesting for conversational AI?

Over the last few years, I have conducted many studies in different markets to validate dialog progressions and prompts, and to understand which kinds of information should be given how much weight within a dialog in each market.
👉 The influence of different cultures on the perception of speech interactions should not be underestimated, but it is often not really considered when designing conversational AI.

Park, J., Barash, V., Fink, C., & Cha, M. (2013). Emoticon style: Interpreting differences in emoticons across cultures. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 7, No. 1, pp. 466-475).

How do our questions influence response behavior?

Depending on the type of question we use (for example, open, closed, or leading questions), we receive responses of varying detail.

For the design of a conversational AI, closed questions are primarily suitable:
✓ yes and no questions
✓ selection questions, where there are two or more answer alternatives
✓ identification questions, e.g., “Where do you live?”

Closed-ended questions, unlike open-ended questions, have limited answer alternatives, which makes them conducive to short answers and quick responses. This makes them particularly well suited for the design of conversational AI. They take the user by the hand and make good dialog guidance possible.
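To make the idea of limited answer alternatives a bit more concrete, here is a minimal Python sketch of a selection question. The class name ClosedQuestion, its fields, and the example prompt are purely hypothetical and only serve as an illustration; a real assistant would match answers far more flexibly than by exact comparison.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ClosedQuestion:
    """Hypothetical prompt with a fixed set of accepted answers."""
    prompt: str
    options: Tuple[str, ...]

    def match(self, user_reply: str) -> Optional[str]:
        """Return the option the reply matches, or None if it fits none."""
        reply = user_reply.strip().lower()
        for option in self.options:
            if option.lower() == reply:
                return option
        return None


# Assumed example: a selection question with two answer alternatives.
question = ClosedQuestion(
    prompt="Would you like to pay by card or by invoice?",
    options=("card", "invoice"),
)

print(question.match(" Card "))  # reply fits one of the alternatives
print(question.match("maybe"))   # no fit, so the assistant could re-prompt
```

Because the set of expected answers is small and known in advance, the assistant can tell right away whether a reply fits and guide the user with a follow-up prompt if it does not.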

But besides the mere type of question, the wording of the question also plays an important role in response behavior.

Did you know, for example, that the mere length of the question has a decisive influence on the response behavior?
Wilson (1990) found a positive correlation between the length of the question and the length of the answer.

That is, the longer the question, the longer the answer.

So if you want to get a short answer, you should not formulate long questions.

Röhner, J. & Schütz, A. (2020). Psychologie der Kommunikation. Wiesbaden: Springer VS.

A voice output in conversational AI must be able to do many things:

✓ It must provide transparency that what the user said was understood.
✓ It must contain relevant information, queries, and options for action, and be formulated in such a way that the user can clearly assess how to respond at each step of the dialog in order to reach his or her goal quickly.
✓ It must be equally suitable for people with different levels of prior knowledge and must function within a consistent overall character of the assistant.

This is not always easy!

So how long may a voice output be, and what is the maximum amount of information it should contain?

Many designers work here with rules of thumb that apply to different components of speech output. For example, “you should list a maximum of three options or categories”. And this makes perfect sense!
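As a rough illustration of such a rule of thumb, the following Python sketch builds a spoken prompt that lists at most three options and offers the rest on request. The function name build_option_prompt, the wording of the prompt, and the "more" command are assumptions made for this example only.

```python
def build_option_prompt(options, max_options=3):
    """Build a spoken prompt that names at most max_options choices."""
    if not options:
        raise ValueError("At least one option is required.")
    spoken = options[:max_options]
    remaining = len(options) - len(spoken)
    if len(spoken) == 1:
        listing = spoken[0]
    else:
        # Join all but the last option with commas, then add "or".
        listing = ", ".join(spoken[:-1]) + " or " + spoken[-1]
    prompt = f"You can choose {listing}."
    if remaining > 0:
        # Hypothetical way of offering the options that were left out.
        prompt += " Say 'more' to hear further options."
    return prompt


print(build_option_prompt(["pizza", "pasta", "salad", "soup"]))
```

The limit is a parameter rather than a fixed value, so it can be adjusted per use case and user group.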

Nevertheless, the answer depends more on the specific use case and the user than many designers realize.
For example, how much information can be processed simultaneously in working memory depends strongly on the user's personal prior knowledge.

This should especially be kept in mind when validating concepts in studies with colleagues who are already familiar with the product or topic!
They process speech output differently from a person with no prior knowledge. This is because units of information take up less space in working memory if they form a meaningful unit for the person.

Here is an example: If I dictate the following series of letters to you (S, O, L, U, I, B, M) and ask you to remember them in this order, these letters will occupy 7 places in your working memory. But for all of you for whom the letter combination IBM represents a meaningful unit, only 5 places are occupied (S, O, L, U, IBM).
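The letter example can be replayed in a few lines of Python. This is only a toy sketch: the function count_chunks and the idea of passing a person's "known units" as a list are assumptions for illustration, not a model of how working memory actually operates.

```python
def count_chunks(letters, known_units):
    """Group a dictated letter sequence into chunks, scanning left to right."""
    sequence = "".join(letters)
    # Try longer units first so the largest meaningful unit wins.
    units = sorted(known_units, key=len, reverse=True)
    chunks = []
    i = 0
    while i < len(sequence):
        for unit in units:
            if unit and sequence.startswith(unit, i):
                chunks.append(unit)
                i += len(unit)
                break
        else:
            # No known unit starts here: the single letter takes its own place.
            chunks.append(sequence[i])
            i += 1
    return chunks


letters = ["S", "O", "L", "U", "I", "B", "M"]
print(len(count_chunks(letters, known_units=[])))       # 7 places
print(len(count_chunks(letters, known_units=["IBM"])))  # 5 places
```

The same dictated sequence takes up fewer places as soon as the listener brings a matching meaningful unit along, which is why test participants with prior knowledge can cope with longer prompts.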

Of course, many more variables play into good voice output design, too!

New articles

Can AI be compared to human intelligence – and does it pose a threat to us because it could become “superintelligent”?

Many articles and books discuss the danger of artificial intelligence. They postulate that an artificial intelligence comparable to human intelligence will be achieved in the foreseeable future. They go so far as to claim that this AI will then create ever more intelligent versions of itself, until we will be confronted with a “superintelligence”. Some…

To design or not to design a character for your conversational AI?

Should we define a character for a digital instance? Or are we thereby promoting the uncanny valley effect? Or, in the worst case, does this encourage deception of users, who might then interpret a consciousness into conversational AI? Only recently, a Google employee again clearly demonstrated to us that people tend to interpret human characteristics…
