Objective: To analyse and compare human and machine assessment and grading of formative essays.
Study design: Quasi-experimental, qualitative cross-sectional study.
Place and Duration of the Study: Department of Science of Dental Materials, Hamdard College of Medicine & Dentistry, Hamdard University, Karachi, from February to April 2023.
Methodology: Ten short formative essays written by final-year dental students were manually assessed and graded. The same essays were then graded using ChatGPT (version 3.5). The prompts and chatbot responses were recorded and matched with the manually graded essays, and a qualitative analysis of the chatbot's responses was performed.
Results: Four different prompts were given to the artificial intelligence (AI)-driven platform ChatGPT to grade the formative essays. These elicited the chatbot's initial response without grading, its response when asked to grade against the criteria, its response to criterion-wise grading, and its response when questioned about differences in grading. Based on the results, four innovative ways of using AI and machine learning (ML) are proposed for medical educators: automated grading, content analysis, plagiarism detection, and formative assessment. Unlike manual grading, ChatGPT provided a comprehensive report with feedback on writing skills.
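A minimal sketch of the proposed automated grading use-case is given below. This is illustrative only: the study interacted with ChatGPT 3.5 through its web interface, and the model name, rubric, and grade_essay function shown here are assumptions for illustration, not the study's protocol.

    # Illustrative sketch only; not the study's method. Assumes the OpenAI
    # Python client (openai>=1.0) and an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical rubric; the study's actual grading criteria are not shown here.
    RUBRIC = (
        "Grade the essay out of 10 against these criteria: content accuracy, "
        "organisation, clarity of writing, and use of examples. Return a mark "
        "per criterion, a total, and brief feedback on writing skills."
    )

    def grade_essay(essay_text: str) -> str:
        """Send one formative essay with the rubric and return the model's report."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a dental school examiner."},
                {"role": "user", "content": f"{RUBRIC}\n\nEssay:\n{essay_text}"},
            ],
            temperature=0,  # keep grading as reproducible as possible
        )
        return response.choices[0].message.content

Such a script could, for example, loop over the ten essays and store each criterion-wise report alongside the manual grade for comparison, mirroring the matching step described in the methodology.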
Conclusion: The chatbot's responses were fascinating and thought-provoking. AI and ML technologies could supplement human grading in the assessment of essays. Medical educators need to embrace AI and ML technologies to enhance the standards and quality of medical education, particularly when assessing long and short essay-type questions. Further empirical research and evaluation are needed to confirm their effectiveness.
Key words: Machine learning, Artificial intelligence, Essays, ChatGPT, Formative assessment.