Validation of multiple true or false (MTF) questions and their usefulness in assessment in an undergraduate medical programme

Objective: To determine the difficulty index (P) and discrimination index (D) of the Multiple True or False (MTF) questions set for the examinations conducted by the constituent departments of a medical college, namely anatomy, physiology, biochemistry, pathology, pharmacology and microbiology. Methods: Scores obtained by first year MBBS students (March 2010 batch) in anatomy, physiology and biochemistry and by second year students (March 2009 batch) in pathology, pharmacology and microbiology were taken. The MTF component of the examinations of blocks 1, 2, 3 and 4 was considered. The difficulty index (P) and discrimination index (D) of the MTF component of blocks 1, 2, 3 and 4 were analyzed using Microsoft Excel. Correlations between the MTF and essay scores of the examinations were also computed. Results: On average, easy questions (P ≥ 75%) made up 48%, 65% and 51% and difficult questions (P ≤ 25%) made up 3%, 1% and 2% of the questions in anatomy, physiology and biochemistry respectively. Anatomy, physiology and biochemistry had 71%, 56% and 67% of questions with discrimination index D ≥ 0.2. On average, easy questions made up 64%, 51% and 46% and difficult questions made up 3%, 3% and 4% of the questions in pathology, pharmacology and microbiology respectively. Pathology, pharmacology and microbiology had 40%, 51% and 50% of questions with discrimination index D ≥ 0.2. The essay and MTF scores of each subject showed strong and significant correlations. Conclusion: The difficulty index obtained from the analysis shows that the questions were easy for the students. However, the study shows that about 50% of the questions were capable of discriminating students with higher ability from those with lower ability (D ≥ 0.2).


Introduction
Assessment is an integral part of any education system. The main goal of conducting an assessment is to measure the competency of students in the respective domains. One of the widely accepted ways of assessing students is the written test, in which either objective or essay type questions are asked and students have to answer them within a pre-determined period of time. In such a method, the quality of the questions used to measure the knowledge of students is very important in determining the success of the assessment. A technique called item analysis is used to analyze the quality and utility of questions, or items, in differentiating the performance of students. It is based on examinees' responses to each item (Kim, 1997). Furthermore, item analysis is very important for assessing the effects of educational programs (Hetzel, 1997). It can also be used to obtain information such as the difficulty and discriminatory power of items (Kim, 1999).
Difficulty Index (P) is the percentage of correct responses to a test question. It is analyzed to understand how easy or difficult the question is for the students (Singh et al., 2000). The higher the value of the difficulty index, the lower the difficulty of the question (Backoff et al., 2000; Hetzel, 1997). If the index is above 75% the question is considered easy, while an index below 25% marks a difficult question (Singh et al., 2000; Sim & Rasiah, 2006). For example, if the difficulty index of a question is 75%, it means that 75% of students have answered the question correctly and hence it is considered an easy question. On the other hand, if the difficulty index of the question is 25%, only 25% of students have answered the question correctly and the question is considered difficult. Expressed as a proportion rather than a percentage, the difficulty index ranges from zero to one (Zurawski, 1998). It is zero when no examinee answers correctly and one when all examinees answer correctly.
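As a minimal sketch of the calculation described above, the following Python snippet computes the difficulty index as a percentage and labels an item using the 75% and 25% cut-offs. The function names, the "moderate" label for the middle band and the sample counts are hypothetical and chosen only for illustration. Treating the cut-offs as inclusive (P ≥ 75%, P ≤ 25%) is also an assumption.

```python
def difficulty_index(correct: int, total: int) -> float:
    """Return the difficulty index P as the percentage of correct responses."""
    if total <= 0:
        raise ValueError("total must be a positive number of examinees")
    return 100.0 * correct / total


def classify_difficulty(p: float) -> str:
    """Label an item using the 75% / 25% cut-offs (assumed inclusive)."""
    if p >= 75.0:
        return "easy"
    if p <= 25.0:
        return "difficult"
    # "moderate" is an illustrative label for everything in between.
    return "moderate"


# Hypothetical item: 90 of 120 examinees answered correctly.
p = difficulty_index(90, 120)
print(p, classify_difficulty(p))
```

Dividing the result by 100 gives the same index on the zero-to-one scale.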
Discrimination Index (D): The discrimination index is calculated to find out the ability of a question to discriminate between students based on their understanding of the subject matter. The higher the discrimination index, the better the question discriminates. If a particular question is doing a good job of discriminating between those who score high and those who score low, more students in the top scoring group will have answered the question correctly (Hetzel, 1997). The discrimination index ranges from -1 to 1. An item that everybody answered correctly or that everybody answered incorrectly has a discrimination index of zero (Zurawski, 1998). If more students in the lower group than in the higher group answer correctly, the discrimination index is negative. If more students in the higher group answer correctly, the discrimination index is positive.
Ebel (1972) gives the following guidelines for the discrimination index:
• D ≥ 0.4: very good
• D: 0.3 to 0.39: reasonably good, possibly subject to improvement
• D: 0.2 to 0.29: marginal, needs some revision
• D < 0.19: poor, needs major revision or should be eliminated
Brown (1983) and Algina (1986) found that test questions with a discrimination index equal to or above 0.2 are acceptable and that these questions differentiate the upper and lower groups.
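The guideline bands above can be expressed as a small lookup, sketched here in Python. The function names are hypothetical. The sketch assumes that D is reported to two decimal places, so that the bands are contiguous and any value below 0.2 falls into the "poor" band.

```python
def ebel_category(d: float) -> str:
    """Map a discrimination index to the guideline band it falls into."""
    if d >= 0.4:
        return "very good"
    if d >= 0.3:
        return "reasonably good"
    if d >= 0.2:
        return "marginal"
    # Assumption: everything below 0.2, including negative values, is "poor".
    return "poor"


def is_acceptable(d: float) -> bool:
    """Apply the D >= 0.2 acceptability criterion."""
    return d >= 0.2


# Hypothetical discrimination indices for four items.
for d in (0.45, 0.33, 0.21, 0.05):
    print(d, ebel_category(d), is_acceptable(d))
```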
In recent years, the MBBS curriculum at our institute has undergone many changes. In these changed circumstances, it becomes necessary to explore the students' level of learning and their level of competency. An item analysis was carried out to determine the reliability of the assessment tools that would enable us to meet these two requirements.
The objective of the present study was to find out the difficulty index and discrimination index of the Multiple True or False (MTF) questions set for the examinations conducted by the constituent departments of our institution, namely anatomy, physiology, biochemistry, pathology, pharmacology and microbiology.

Methods
Our institution offers an MBBS course of five years' duration. Physiology, anatomy and biochemistry are taught in the first year, while pathology, pharmacology and microbiology are taught in the second year. The curriculum is taught in 4 blocks (teaching units). Block 1 includes basic concepts, skin, muscle, bones, joints and blood. Block 2 includes the cardiovascular system, respiratory system, GIT, nutrition and hepatobiliary system. Block 3 includes endocrine, reproduction, kidney and electrolytes. Block 4 includes the central nervous system, special senses and molecular biology. Every block lasts 10 weeks. At the end of every block, an examination is conducted that consists of both essay type (60 marks) and MTF type (120 marks) questions. For the purpose of the present study, scores obtained by first year MBBS students (March 2010 batch) in anatomy, physiology and biochemistry and by second year students (March 2009 batch) in pathology, pharmacology and microbiology were taken. Only the MTF component of the examinations of blocks 1, 2, 3 and 4 was considered, and the analysis was done using Microsoft Excel. Two parameters, namely the difficulty index (P) and the discrimination index (D), were analyzed.
The formula used for evaluation of the difficulty index was:

$P = \frac{R}{T} \times 100$

where P was the difficulty index, R was the total number of correct responses and T was the total number of students who appeared for the examination.

South East Asian Journal of Medical Education
Vol. 10 no. 1, 2016

To analyze the discrimination index, papers were arranged in rank order with the students scoring the highest marks positioned at the top. The ranked list was then divided into three equal groups: the Higher Ability Group (HAG), which was the top 1/3rd, and the Lower Ability Group (LAG), which formed the bottom 1/3rd of the group (Chandratilake et al., 2010).

Discrimination index was calculated as

$D = P_{HAG} - P_{LAG}$ (Sim & Rasiah, 2006; Chandratilake et al., 2010)

that is, the difficulty index of the higher ability group minus the difficulty index of the lower ability group.

Correlation between MTF and essay scores was analyzed using Pearson correlation. This was performed to understand the reliability of the MTF scores in categorizing students as HAG and LAG in our settings.
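The ranking and grouping procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the spreadsheet procedure used in the study: the function name and the sample data are hypothetical, group sizes are rounded down when the number of examinees is not divisible by three, ties in total score keep their original order, and the group difficulty indices are expressed as proportions so that D lies between -1 and 1.

```python
def discrimination_index(total_scores, item_correct):
    """Return D = P_HAG - P_LAG for one item.

    total_scores: total examination score of each examinee.
    item_correct: 1 if that examinee answered the item correctly, else 0.
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("both lists must describe the same examinees")
    n = len(total_scores)
    group_size = n // 3  # assumption: round down to get equal thirds
    if group_size == 0:
        raise ValueError("need at least three examinees")

    # Rank examinee indices from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    hag = ranked[:group_size]   # higher ability group, top third
    lag = ranked[-group_size:]  # lower ability group, bottom third

    p_hag = sum(item_correct[i] for i in hag) / group_size
    p_lag = sum(item_correct[i] for i in lag) / group_size
    return p_hag - p_lag


# Hypothetical data for nine examinees and a single item.
scores = [95, 90, 85, 70, 65, 60, 40, 35, 30]
answers = [1, 1, 1, 1, 0, 1, 1, 0, 0]
print(round(discrimination_index(scores, answers), 2))
```

In this made-up example all three HAG examinees and one of the three LAG examinees answer correctly, so D is 1 - 1/3, or about 0.67.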
Results
Table 1 shows the difficulty and discrimination index of questions in anatomy, physiology and biochemistry in the different blocks. On average, easy questions (P ≥ 75%) made up 48%, 65% and 51% and difficult questions (P ≤ 25%) made up 3%, 1% and 2% of the questions in anatomy, physiology and biochemistry respectively. Anatomy, physiology and biochemistry had 71%, 56% and 67% of questions with discrimination index D ≥ 0.2, as shown in table 2.
As shown in table 3, in the second year easy questions made up on average 64%, 51% and 46% and difficult questions made up 3%, 3% and 4% of the questions in pathology, pharmacology and microbiology respectively. Pathology, pharmacology and microbiology had 40%, 51% and 50% of questions with discrimination index D ≥ 0.2, as shown in table 4.
Tables 5 and 6 show the correlation between students' scores in the essay-type questions and the MTF questions. The correlation coefficients (r) were above 0.6 in all subjects, indicating a strong correlation, and they were also significant (p < 0.001).
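For readers who wish to reproduce this kind of comparison, a Pearson correlation coefficient can be computed as in the Python sketch below. The function name and the paired scores are hypothetical and are not taken from the study data. The sketch returns r only and does not compute the significance level.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical MTF (out of 120) and essay (out of 60) scores for six students.
mtf = [102, 96, 88, 75, 70, 61]
essay = [48, 50, 41, 39, 30, 33]
print(round(pearson_r(mtf, essay), 2))
```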

Discussion
The study shows that a consistent level of difficulty and discrimination indices was maintained from subject to subject. The students' scores in the MTF component correlate strongly with those in the essay component in all subjects throughout the year. Such consistency can be attributed to the teaching-learning and assessment process followed in our medical school. The question papers set by the faculty members for examinations were scrutinized at multiple levels. Question papers were reviewed by a team of experts involved in that particular block and were then thoroughly scrutinized by the head of the department. This ensured that the questions were confined to the objectives defined for the course and also maintained the standard of the question paper.
It can be seen from the analysis that physiology in the first year and pathology in the second year had more easy questions than the other subjects. [...] disorders and paste it in the journal book; they also have to solve renal problems. This might have led the students to learn more, which may explain why more students answered correctly and hence the lower discrimination. A study of a preclinical question paper by Mitra et al. (2009) found that 40% of the total test questions had a difficulty index above 80%, and another study of year two examinations of a medical school, reported by Si Mui Sim et al. (2006), also found that about 40% of the MCQ items had a difficulty index above 75%. Compared with these studies, our papers had more easy questions. If a question is easy for the students, it also means that they have learnt the topic (Zurawski, 1998). This is substantiated by the good correlation between students' essay and MTF scores. Thus, a higher competency level of the learners may be one reason for the outcome. The teaching methodology adopted at our institution involved active learning components such as Self Directed Learning (SDL) and Problem Based Learning (PBL) in addition to didactic lectures. This required students to do additional self-learning, which invariably involved referring to various learning materials related to the subject. This might have improved the competency, knowledge and retention level of students (Dolmans & Schmidt, 1996).
Our institution provided very good academic support for students: weak students were identified by faculty members and were given proper directions to improve their competency level. Faculty members made additional efforts to see that students learnt as per the objectives set in each subject. Printed materials clearly defining the objectives set in each subject were issued to students well in advance, to keep them informed of what they were expected to learn as per the curriculum. This might have helped students to prepare themselves better to meet expectations and to obtain a wide understanding of the subjects, which in turn increased the probability of answering questions correctly. These could be the other reasons for the large percentage of students answering the questions correctly. Carroll (1993) states that "The item difficulty and discrimination are often reciprocally related". This may be the reason for the low discrimination index.
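One way to see why very easy items tend to discriminate poorly is to look at the largest discrimination index that is arithmetically possible for a given overall difficulty index. The Python sketch below is an illustrative bound and not part of the original analysis. It assumes three exactly equal groups, an overall difficulty index p expressed as a proportion, and D computed as the HAG proportion minus the LAG proportion. Under those assumptions the three group proportions average to p, so D cannot exceed min(1, 3p, 3(1 - p)). The function name is hypothetical.

```python
def max_discrimination(p: float) -> float:
    """Upper bound on D for an item with overall proportion correct p.

    Assumes equal-sized top, middle and bottom thirds, so the three group
    proportions average to p and each lies between 0 and 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return min(1.0, 3.0 * p, 3.0 * (1.0 - p))


# Illustrative overall difficulty indices, as proportions.
for p in (0.5, 0.75, 0.9, 0.95):
    print(p, round(max_discrimination(p), 2))
```

For example, an item answered correctly by 90% of examinees cannot reach a D above about 0.3 under these assumptions, because even if every incorrect answer came from the bottom third, 70% of that group would still have answered correctly.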

Conclusion
The difficulty index obtained from the analysis shows that the questions were easy for the students. Although the discrimination index tends to decrease as the difficulty index increases, the study shows that about 50% of the questions were capable of discriminating students with higher ability from those with lower ability (D ≥ 0.2). If students' success in examinations is to be considered an indicator of an effective teaching-learning process, then the assessment should be a reliable indicator of student achievement of the intended learning outcomes of the course.