Evaluating Critical Thinking in Developers - 2 minute read.

I was recently pondering the use of AI in the workplace, particularly in software development, and that started a train of thought about how to recruit developers best suited to this brave new world.

It occurred to me that maybe we can use AI to help evaluate how well a developer uses AI. After all, we are no longer interested in their ability to actually type code; what we want is insight into their critical thinking and their ability to induce the AI to produce code that follows a clean architecture.

I'm suggesting that we put together a scenario from existing code in your organisation's favourite technology, frozen at a point in time where a feature was about to be added or a particularly nasty bug was about to be fixed, and sit the candidate down with Claude Code and an IDE and ask them to add the feature or fix the bug. During that session, we record their conversation with the AI.

The sample code can be kept on the shelf and rolled out fresh for each candidate. At the end of the session, we'll have the completed feature or bug fix, and also a complete history of the candidate's conversation with the AI. The latter is the key to gaining insight into their critical thinking and their ability to supervise an AI.

We could evaluate things like:

Prompt quality signals

  • Were prompts clear and specific?
  • Did the candidate break complex tasks into logical steps?
  • Did they provide useful context?
  • How well did they iterate when the first result wasn’t right?

Problem-solving approach

  • Did they tackle the right things in a sensible order?
  • Did they recognize when generated code had issues and course-correct?

Technical understanding

  • Did their prompts demonstrate they understood the underlying code, or were they just guessing?
  • Could they review and refine outputs effectively?

Efficiency

  • How many back-and-forth iterations did it take to reach a working solution?
  • Did they waste cycles on unproductive paths?

The AI can then be used to evaluate the session transcript against these criteria and, together with an analysis of the actual code change, provide an objective measurement of how well suited the candidate is to working with AI in the role.
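Before handing the transcript to an AI for the qualitative judgement, some of the criteria above (notably the Efficiency ones) can be measured directly. Here is a minimal sketch in Python; the `Turn` structure, role names, and metric choices are my own assumptions about how a recorded session might be represented, not a prescribed format.

```python
# Illustrative sketch: extracting simple, objective signals from an
# interview-session transcript. The transcript format (a list of turns
# with "candidate"/"assistant" roles) is an assumption for this example.
from dataclasses import dataclass

@dataclass
class Turn:
    role: str  # "candidate" or "assistant"
    text: str

def session_signals(transcript: list[Turn]) -> dict:
    """Rough metrics for the 'Efficiency' criteria: how many
    back-and-forth iterations, and how substantial each prompt was."""
    prompts = [t for t in transcript if t.role == "candidate"]
    iterations = len(prompts)
    avg_prompt_words = (
        sum(len(p.text.split()) for p in prompts) / iterations
        if iterations else 0.0
    )
    return {
        "iterations": iterations,
        "avg_prompt_words": round(avg_prompt_words, 1),
    }

demo = [
    Turn("candidate", "Add a null check to parse_order and cover it with a unit test."),
    Turn("assistant", "Done - added the check and a test."),
    Turn("candidate", "The test misses the empty-string case; extend it."),
    Turn("assistant", "Extended."),
]
print(session_signals(demo))  # → {'iterations': 2, 'avg_prompt_words': 10.5}
```

Counts like these are only supporting evidence; the judgement calls (prompt clarity, course-correction, technical understanding) would still come from the AI's review of the full transcript.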

Further to this, I realised that such a test would give valuable insight into the way a developer approaches and solves a problem, even if your organisation isn't ready to let everyone loose on AI just yet.

More articles by Rob Lewis
