Retaining human oversight of AI marking ‘complex’, trial finds

Universities piloting the use of AI tools in assessment identify ‘tension’ in determining correct role for academics to play

Published on May 21, 2026
Last updated May 21, 2026

Ensuring humans retain oversight of marking done by artificial intelligence is “more complex than one might initially think” and may not save academics any time in the long run, according to the initial results of a major UK trial.

Select universities and colleges in the UK have been experimenting with ways to introduce AI into their assessment processes, with the support of sector body Jisc.

Findings published on 21 May identified a “tension” between using AI marking tools and maintaining academic oversight.

While the trial did not enforce many restrictions on universities’ participation, they were required to ensure that “wherever AI was used to mark and give feedback on students’ work, a human should retain oversight of the process”.

But the trial found that “keeping the human in the loop is more complex than one might initially think” and may not deliver time savings to the marking process. 

It raised concerns that academics could be “influenced by AI feedback, rather than reviewing it critically”, and that academic oversight of AI marking could be “reduced to simply scanning one’s eyes over the student’s work and the AI feedback before clicking ‘next’”.

A blog post on the findings says: “At first glance, this problem seems to be driven by a lack of diligence, but further insights revealed a more interesting logic. As per the objectives of the pilot, the motivation for using AI in this context is to reduce workload.

“As such, there is an expectation that the AI’s outputs can be trusted and delegated to some extent. If the human in the loop is putting significant effort into reviewing the student’s work and the AI’s feedback, then what is the purpose of using AI at all?”

Tom Moule, senior AI specialist and product lead for artificial intelligence at Jisc, told Times Higher Education that rather than offering time savings, AI may allow academics to offer “richer”, “more detailed” and “more timely” feedback.

Viewed through this lens, AI allows academics to “do more with the same”, he said, offering more useful feedback with no extra effort.

“The way to resolve that tension is to have a clear separation of what the role of the human and the AI is. In doing that you can verify that it is the human taking the dominant role and the role that requires the highest level of expertise, and the AI doing the role that requires perhaps more routine, but time-consuming tasks,” he said.

For example, academics could review and make notes on the student’s work before seeing any AI feedback, putting them in a “stronger position to review the AI’s first draft of feedback discerningly”. 

“In terms of time savings, the key here is for the human input to be focused on the overall effectiveness of the student’s work and some of its standout aspects. The AI, meanwhile, plays the role of analysing fulfilment of the marking criteria and writing up the feedback in rich detail,” the blog says.

This should be built into the foundations of any marking tools, it argues. “Ultimately, the pilot is reinforcing an important lesson: human oversight works best when it is designed intentionally into the workflow. When educators are given a clear role in shaping, verifying and refining AI-generated feedback, the technology becomes more of a collaborator rather than a substitute.”

The blog also stresses that AI feedback is most effective in formative assessment, and that such tools are best suited to “delivering rich, actionable feedback to students in a timely manner”, rather than to summative assessments, where “the mark is the primary concern”.

Some institutions raised concerns that the tools could “produce slightly different outputs” if the same work were put through multiple times, but the blog argues that “variation is not unique to AI; different educators can also arrive at different judgements”.

“In many institutions, existing moderation processes already exist to smooth out such differences,” it says.

juliette.rowsell@timeshighereducation.com

Readers' comments (3)

University administrators especially are determined to show that AI can make higher education "richer" (notice how many times the JISC people use this word). "Rich" can, of course, simply mean "more" of something that the technology can provide: more spelling and grammar corrections, to begin with. In my 40+ years of teaching I found that students resist reading comments when they believe there are too many (and that has gotten worse over the years) about too many different aspects of their work. On the first assessment of the semester I may be flying blind, but by the second I have an idea what any particular student needs to work on most urgently, and if you try to make them work on everything at once they don't bother. AI can't make those judgments (actually, AI can't make any judgments at all--since I am a philosopher I'd be happy to talk about that). It is no surprise that it is likely to slow grading/assessment and make it more ineffective. But I am sure that all the para-academic staff that writes rules and rubrics will be delighted....

Well yes, exactly

Given the quality of some of the students in Uni's these days I doubt some of them can actually read the AI feedback.