Cross-Sectional Descriptive Study of Comparative Accuracy of ChatGPT, Google Gemini, And Microsoft Copilot in Solving NEET PG Medical Entrance Test
DOI:
https://doi.org/10.5195/ijms.2025.3989
Keywords:
Artificial Intelligence, NEET PG, Multiple choice questions, Graduate medical education, Machine Learning
Abstract
Background: Artificial Intelligence (AI) is increasingly applied in healthcare and medical education, with tools capable of assisting in diagnosis, treatment planning, and exam preparation. The NEET-PG is India’s national entrance examination for postgraduate medical training, with case vignettes forming a major component of assessment. AI chatbots therefore hold potential as aids in exam preparation. Previous studies have reported variable accuracy of AI tools in medical licensing exams, but head-to-head comparisons across question types, subjects, and platforms are scarce. Given their rapidly growing use by students and educators, establishing the reliability of these tools is critical. This study directly compares three leading AI chatbots.
The objective was to assess and compare the accuracy of ChatGPT-4, Google Gemini, and Microsoft Copilot in solving the NEET-PG 2023 examination and evaluate their performance across different question types and medical subjects.
Methods: This cross-sectional descriptive study evaluated the performance of three AI chatbots using a validated set of 200 NEET-PG 2023 questions sourced from PrepLadder and verified against standard textbooks. These questions were presented verbatim to ChatGPT-4, Google Gemini, and Microsoft Copilot. Each chatbot received the questions independently in separate sessions to minimize memory bias. Responses were recorded as correct or incorrect using the validated answer key, and accuracy was expressed as the percentage of correct responses. Comparative analysis was performed for overall accuracy, subject distribution, and question type (recall, analytical, image-based, case-based). Differences were assessed using the chi-square test, with p < 0.05 considered statistically significant.
Results: Microsoft Copilot achieved the highest overall accuracy with 165/200 correct responses (82.5%), followed by ChatGPT-4 with 161/200 (80.5%) and Google Gemini with 155/200 (77.5%). The difference in overall performance was not statistically significant (χ² = 1.6, p = 0.4). All three chatbots achieved 100% accuracy in Microbiology, Anesthesia, and Psychiatry, whereas lower accuracy occurred in Community Medicine, Forensic Medicine, Internal Medicine, and Radiology. No significant variation was found across subjects (χ² = 2.7, p = 0.9). By question type, recall-based items showed the highest accuracy (85.5%), followed by case-based (82.4%) and analytical (77.3%), while image-based questions were the most challenging (mean accuracy 71.0%). Although Copilot performed slightly better on recall and image-based items, the differences across the three chatbots for question type were not statistically significant (χ² = 0.35, p = 0.9). These findings highlight variability by subject and question format but no significant difference among the three tools.
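The overall comparison can be checked from the counts reported above. The following is a minimal sketch, not the authors' analysis code: it assumes a Pearson chi-square test of independence on a 3 x 2 table (chatbot by correct/incorrect) without continuity correction, and the names `accuracy_percent` and `chi_square_independence` are illustrative helpers introduced here.

```python
import math

# Correct-answer counts out of 200 questions, as reported in the Results.
TOTAL_QUESTIONS = 200
correct = {"Microsoft Copilot": 165, "ChatGPT-4": 161, "Google Gemini": 155}


def accuracy_percent(n_correct, n_total):
    """Accuracy as the percentage of correct responses."""
    return 100.0 * n_correct / n_total


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of equal accuracy.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# One row per chatbot: [correct, incorrect].
table = [[c, TOTAL_QUESTIONS - c] for c in correct.values()]
chi2, dof = chi_square_independence(table)

# For exactly 2 degrees of freedom the chi-square survival function
# has the closed form exp(-x / 2), so no statistics library is needed.
assert dof == 2
p_value = math.exp(-chi2 / 2)

for name, n in correct.items():
    print(f"{name}: {accuracy_percent(n, TOTAL_QUESTIONS):.1f}%")
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.2f}")
```

Under these assumptions the statistic comes out close to 1.6 with a p-value a little above 0.4, which is consistent with the reported overall result after rounding. The subject-wise and question-type comparisons cannot be reproduced the same way because the abstract does not give their per-cell counts.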
Conclusion: All three AI chatbots demonstrated good accuracy in solving NEET-PG questions, performing better on recall-based items and less well on image-based items, reflecting current limitations in multimodal applications. They can complement exam preparation by serving as an accessible and interactive platform, offering an affordable alternative to expensive coaching. In healthcare, AI chatbots hold potential for assisting with diagnosis, treatment planning, triage, and referral, particularly in resource-limited settings. However, concerns regarding data privacy, patient confidentiality, lack of empathy, and erosion of clinical decision-making limit their broader adoption. Future research should evaluate evolving versions of these models, larger exam datasets, and integration into structured educational frameworks.
License
Copyright (c) 2025 Manik Bhise, Kale Sachin Sadanand, Patil Anuradha Vishwanath, Jali Nandita Vivekanand

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- The Author retains copyright in the Work, where the term “Work” shall include all digital objects that may result in subsequent electronic publication or distribution.
- Upon acceptance of the Work, the author shall grant to the Publisher the right of first publication of the Work.
- The Author shall grant to the Publisher and its agents the nonexclusive perpetual right and license to publish, archive, and make accessible the Work in whole or in part in all forms of media now or hereafter known under a Creative Commons Attribution 4.0 International License or its equivalent, which, for the avoidance of doubt, allows others to copy, distribute, and transmit the Work under the following conditions:
  - Attribution: other users must attribute the Work in the manner specified by the author as indicated on the journal Web site; with the understanding that the above condition can be waived with permission from the Author and that where the Work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
- The Author is able to enter into separate, additional contractual arrangements for the nonexclusive distribution of the journal's published version of the Work (e.g., post it to an institutional repository or publish it in a book), as long as there is provided in the document an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post online a prepublication manuscript (but not the Publisher’s final formatted PDF version of the Work) in institutional repositories or on their Websites prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work. Any such posting made before acceptance and publication of the Work shall be updated upon publication to include a reference to the Publisher-assigned DOI (Digital Object Identifier) and a link to the online abstract for the final published Work in the Journal.
- Upon Publisher’s request, the Author agrees to furnish promptly to Publisher, at the Author’s own expense, written evidence of the permissions, licenses, and consents for use of third-party material included within the Work, except as determined by Publisher to be covered by the principles of Fair Use.
- The Author represents and warrants that:
  - the Work is the Author’s original work;
  - the Author has not transferred, and will not transfer, exclusive rights in the Work to any third party;
  - the Work is not pending review or under consideration by another publisher;
  - the Work has not previously been published;
  - the Work contains no misrepresentation or infringement of the Work or property of other authors or third parties; and
  - the Work contains no libel, invasion of privacy, or other unlawful matter.
- The Author agrees to indemnify and hold Publisher harmless from the Author’s breach of the representations and warranties contained in Paragraph 6 above, as well as any claim or proceeding relating to Publisher’s use and publication of any content contained in the Work, including third-party content.
Enforcement of copyright
The IJMS takes the protection of copyright very seriously.
If the IJMS discovers that you have used its copyright materials in contravention of the license above, the IJMS may bring legal proceedings against you seeking reparation and an injunction to stop you using those materials. You could also be ordered to pay legal costs.
If you become aware of any use of the IJMS' copyright materials that contravenes or may contravene the license above, please report this by email to contact@ijms.org
Infringing material
If you become aware of any material on the website that you believe infringes your or any other person's copyright, please report this by email to contact@ijms.org


