AI's influence on human behaviour

Exploring the dynamic interplay between AI technology and human behaviour offers critical insights, especially in candidate assessment during recruitment. By focusing on this intersection, organisations can refine their evaluation methods to more accurately identify and appreciate the genuine attributes of potential candidates.

In this article I will go through my own experience with a fully AI-powered interviewing tool, then present recent research on the matter, and end by arguing for an intersection between human data processing and generative data processing, taking the best from both worlds while staying aware of the consequences of each approach.

 

My experience: AI assessment tools are easy to manipulate

An example of an AI assessment tool in recruitment is the one-way interview, where a Generative AI (GenAI) tool conducts the interview and assesses the answers the candidate gives. The GenAI tool then evaluates the answers and decides whether the candidate is rejected, accepted, or placed in the maybe-pile.

One of the problems with these tools is that candidates can manipulate results by following very simple tips:

  1. Focus on audio – Good sound when responding is key to getting good results from a GenAI interviewing tool. The candidate should eliminate background noise and speak very precisely. The GenAI interviewing tool transcribes the answers before analysing them, so if there is speech around the candidate during the interview, the transcriptions will be chaotic and the scores can be affected negatively. Candidates who have the privilege of great audio equipment and calm surroundings will therefore gain a clear advantage over less privileged candidates.

  2. Give short answers – Normally, the language people use at an interview feels natural, meaning non-fluent, with pauses and filler words like ehmm and hmmm. This does not fit well with a GenAI tool.

    An example is a GenAI interviewing tool interviewing for a scaffolding worker. The question from the GenAI tool could be “Are you afraid of heights?” Here the correct and expected answer (by the GenAI tool) is a prompt “No”.

    But in real life it is more likely to be “Well, not really. But, ehmmm, sometimes, well you see, ehmmm. I once was on a ladder. Right? It was at a small construction site. [Pause] The one on Baker Street. Well, [pause] someone had forgotten to put on the safety belt. Hmmm, and you know what that means.” In a real-life conversation this makes sense to a human, and if the other person needs more information, they will ask a relevant follow-up question. With the GenAI tool, the moment the first pause comes, it simply continues with the interview, disregarding the rest of the answer. The candidate is cut off, and when this happens the candidate very quickly learns how not to talk to the GenAI tool, which can bias their responses for the rest of the interview (the first sketch after this list shows this cut-off logic). Candidates who are able to adapt their behaviour to fit an interview run by a GenAI model will clearly benefit compared to others.

  3. Know the criteria – The answers given by the candidate are scored by the GenAI tool depending on how well they align with a specific set of criteria. These criteria are fed to the GenAI model prior to the interview. The GenAI interviewing tool recognises the criteria as specific words or phrases and searches for overlap between the candidates’ answers and the predefined words (the second sketch after this list illustrates this kind of scoring). Candidates therefore gain an advantage from repeating the exact words from the job advert. If, for example, the advert mentions adaptability and flexibility, then mentioning these exact words in the interview can earn higher scores from the GenAI interviewing tool than using other words would. This positively affects the candidates who understand this, or who by chance choose these words instead of alternative words with the same meaning.
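
To make the cut-off mechanic in tip 2 concrete, here is a minimal sketch in Python of silence-based turn detection, the kind of logic a voice-driven interviewing tool could use to decide that a candidate has finished speaking. The threshold values and function name are my own illustrative assumptions, not taken from any specific vendor’s product.

    # Minimal sketch of silence-based turn detection (illustrative assumptions only).
    # A real tool works on a live audio stream; here we simulate per-frame loudness.
    SILENCE_THRESHOLD = 0.05  # assumed loudness below which a frame counts as silence
    MAX_SILENT_FRAMES = 100   # ~2 seconds at 20 ms per frame, then the turn ends

    def end_of_turn_frame(frame_loudness: list[float]) -> int:
        """Return the frame index where the tool would cut the candidate off,
        or -1 if the answer is never cut off."""
        silent_run = 0
        for i, loudness in enumerate(frame_loudness):
            if loudness < SILENCE_THRESHOLD:
                silent_run += 1
                if silent_run >= MAX_SILENT_FRAMES:
                    return i  # the tool moves on; anything said later is discarded
            else:
                silent_run = 0
        return -1

Any thoughtful pause longer than the silent-frame budget ends the answer mid-thought, which is exactly the cut-off behaviour described in tip 2.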
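
Similarly, here is a minimal sketch of the keyword-overlap scoring described in tip 3, under the assumption that the tool compares the transcribed answer against the predefined criteria words. Real products are surely more sophisticated, but the incentive this mechanic creates, namely repeating the advert’s exact words, is the same. The criteria set and the example answers are invented for illustration.

    # Minimal sketch of keyword-overlap scoring (an assumption about how such
    # tools behave, not any vendor's actual algorithm).
    CRITERIA = {"adaptability", "flexibility", "teamwork"}  # fed to the model beforehand

    def score_answer(transcript: str) -> float:
        """Fraction of predefined criteria words appearing verbatim in the answer."""
        words = {w.strip(".,!?").lower() for w in transcript.split()}
        return len(CRITERIA & words) / len(CRITERIA)

    # A candidate echoing the job advert outscores one using synonyms:
    print(score_answer("I value adaptability and flexibility in teamwork."))  # 1.0
    print(score_answer("I adjust quickly and collaborate well with others."))  # 0.0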

Now, if it is that easy for candidates to gain an advantage and manipulate the result, that tells us something about the reliability of the tool. If a candidate can get a different result every time they take the test (in this case, do an interview), what can we ever use that inconsistent information for in recruitment?

 

Curious about ChatGPT’s limitations in answering cognitive tests? Link to article

 

Research: AI assessment tools introduce behavioural bias

Recent research has shown that if a person knows they are being interviewed and assessed by a GenAI tool, they tend to unconsciously alter their behaviour toward what they believe the GenAI is evaluating. This phenomenon, often referred to as the "AI assessment effect," has been observed in various studies focusing on candidate behaviour during recruitment processes. Specifically, candidates tend to emphasize analytical traits while neglecting to display empathy and innovative aspects of their personality. This behavioural shift could fundamentally alter who gets selected for positions, potentially undermining the validity of assessment processes.

The underlying reason for this shift is the common belief that GenAI systems prioritize analytical characteristics over emotional and intuitive ones (Goergen, de Bellis, and Klesse, 2025). This perception drives candidates to adapt their responses and behaviours to align with what they presume the GenAI values most highly. Consequently, the authenticity of candidate responses may be compromised, leading to selections that do not truly reflect the best fit for the role.

In the quest to remove human bias from hiring, some organizations have turned to AI-driven systems. However, this shift may introduce a new form of behavioural bias. While GenAI systems are designed to be objective, the "AI assessment effect" suggests that they might inadvertently encourage candidates to present a skewed version of themselves, focusing excessively on analytical skills at the expense of other important traits like empathy and creativity.

Legislation, such as the European Union's AI Act, mandates that organisations be transparent about the use of AI in assessments. This transparency is intended to inform candidates about the AI's role and capabilities, potentially influencing their behaviour during the assessment process. By being aware of AI's involvement, candidates might further adjust their responses to meet perceived AI preferences, highlighting the need for clear communication about AI's capabilities and limitations to mitigate potential biases introduced by GenAI assessment tools.

A study by Fan et al. (2023) examined how well a chatbot can determine a candidate's personality based on a “dialogue” it has with a human. The research evaluated the reliability and validity of AI-inferred personality scores. The findings indicated that GenAI-inferred personality scores demonstrated acceptable reliability at both the domain and facet levels of the Five-Factor Model, meaning that the tool can produce stable and dependable results. However, the discriminant validity (how well the scores differentiate between different personality traits) was relatively poor compared to scores from psychometric personality tests such as OPTO. This implies that while GenAI tools can consistently capture certain elements of personality, they have difficulty distinguishing accurately between various traits.
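
To make the difference between reliability and discriminant validity concrete, here is a toy illustration with synthetic numbers; the figures are invented and are not data from Fan et al. (2023). When scores for supposedly distinct traits correlate highly, the tool cannot tell the traits apart, however stable each individual score may be.

    # Toy illustration of poor discriminant validity, using synthetic scores
    # (invented numbers; not data from Fan et al., 2023).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200

    # Simulate a tool whose "introversion" and "openness" scores share most of
    # their variance: one underlying signal plus a little trait-specific noise.
    signal = rng.normal(size=n)
    introversion = signal + 0.3 * rng.normal(size=n)
    openness = signal + 0.3 * rng.normal(size=n)

    r = np.corrcoef(introversion, openness)[0, 1]
    print(f"cross-trait correlation: {r:.2f}")  # around 0.9: the traits blur together

    # A test with good discriminant validity keeps distinct traits weakly correlated:
    openness_distinct = rng.normal(size=n)
    print(f"{np.corrcoef(introversion, openness_distinct)[0, 1]:.2f}")  # near 0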

Fan et al. (2023) highlight the potential, but even more so the many limitations, of using GenAI chatbots or GenAI interviewing tools to assess candidate personalities in recruitment processes. This has a direct influence on the predictive validity of the tools, as a GenAI interviewing tool might struggle to accurately distinguish between different personality traits. This could lead to misclassification of candidates, where important nuances in personality are overlooked. A candidate might be introverted but highly creative and innovative; if the GenAI tool cannot accurately differentiate between introversion and other traits like openness to experience, the candidate might be incorrectly deemed unsuitable for a role that requires innovation. And because the tool is reliable, the results will most likely be stable over time, reproducing the rejection of candidates who might in fact be suitable for the position.

This clearly shows the importance of improving GenAI tools if they are ever to provide a comprehensive and accurate assessment of a candidate's personality. Users of GenAI tools in recruitment need to be aware of these limitations and consider them when integrating GenAI assessments into their hiring processes.

 

Conclusion: A middle ground

The debate surrounding the use of GenAI assessment tools versus traditional manual human processes often presents these two approaches as opposites. However, there exists a middle ground that can leverage the strengths of both methods while mitigating their respective limitations.

Figure 1: Pros and cons of human data processing, algorithmic data processing, and generative AI data processing. Developed by Master International A/S.

 

Years of experience in data analysis, machine learning, and algorithm development have given organisations like Master International a humble approach to the rapidly evolving trends of GenAI. This experience allows for a focus on algorithms and psychometric models that have been proven reliable and valid for decades.

Psychometric assessments designed and accredited using established standards, such as the European Federation of Psychologists' Associations' (EFPA) Test Review Model (Evers et al., 2013), provide a robust framework for evaluating candidates. These assessments are developed to be secure, time-efficient, and reliable, offering a structured and scientifically validated approach to candidate evaluation.

Implementing psychometric assessments that adhere to recognised standards ensures that companies can assess candidates in a manner that is both secure and efficient. This approach minimises the risk of bias and errors that can arise from purely manual processes or current GenAI tools.

At Master, we are confident that using psychometric assessments designed and accredited using the EFPA Model offers companies a secure, time-efficient, and reliable way of assessing candidates. At the same time, our curiosity and innovative mindset allow us to explore how future integration of GenAI tools into existing processes, as supportive measures rather than as decision-making tools, can elevate customers’ use of our tools.

 

Solution: Approved, Accredited, and Audited psychometric tools

Personally, I am concerned about using GenAI tools for decision-making in recruitment. Different GenAI tools can assist in recruitment but should never have the final say. Recruitment tools should be designed to be fair, unbiased, and respectful of candidate privacy.

Master’s psychometric tools are designed to capture authentic behaviour without inducing the behavioural bias introduced by GenAI assessment tools, ensuring more genuine candidate responses. The tools have been, and continue to be, tested across various psychometric properties; their robust reliability and validity provide consistent and accurate assessments, reducing the risk of distorted decision-making.

Another advantage of Master’s tools is that they comply with transparency requirements and legislation, providing clear information about the assessment process and about GenAI capabilities.

By embracing this middle ground, organisations can benefit from both human insight and AI efficiency. This balanced approach not only enhances the reliability and validity of assessments but also ensures that the process remains fair, transparent, and aligned with both technological advancements and human values.

As AI technology continues to advance, ongoing research and adaptation will be crucial. This includes refining GenAI models to better understand and predict human behaviour, improving the transparency and explainability of GenAI decisions, and continuously validating these tools against established psychometric standards.

 

Want to know more about Master’s solutions? Book a demo

 

References

  • Evers, A., Muñiz, J., Hagemeister, C., Høstmælingen, A., Lindley, P., & Sjöberg, A. (2013). EFPA review model for the description and evaluation of psychological and educational tests (Version 4.2.6).
  • Fan, J., Sun, T., Liu, J., Zhao, T., Zhang, B., Chen, Z., Glorioso, M., & Hack, E. (2023). How well can an AI chatbot infer personality? Examining psychometric properties of machine-inferred personality scores. Journal of Applied Psychology, 108(8), 1277–1299. https://doi.org/10.1037/apl0001082
  • Goergen, J., de Bellis, E., & Klesse, A. (2025). AI assessment changes human behavior. Proceedings of the National Academy of Sciences, 122(25), e2425439122. https://doi.org/10.1073/pnas.2425439122
Category: Recruitment, Data Driven
Tags: AI, GenAI, Behaviour, data, datadriven

Date: 25.07.2025