Clinical reasoning of Indonesian medical students as measured by diagnostic thinking inventory

Introduction: Clinical reasoning is one of the most important skills for a good physician. A number of instruments have been developed to measure this skill, including the Diagnostic Thinking Inventory (DTI). Several studies have examined its reliability and validity; however, evidence of its construct validity is still limited. This study aims to explore the construct validity of the DTI and to measure the clinical reasoning skills of Indonesian medical students. Method: The subjects were 1135 medical students and 60 general practitioners. They were asked to complete the Indonesian version of the DTI. Results: Overall reliability of the DTI was .74; it was .50 for the flexibility in thinking subscale and .70 for the evidence of knowledge structure subscale. A one-way unrelated ANOVA showed significant differences in the overall DTI score (F = 7.097, p < .001), the flexibility in thinking subscale (F = 6.111, p < .001), and the evidence of knowledge structure subscale (F = 5.306, p < .001), with scores increasing over the period of medical training and practical experience. The largest proportion of subjects in every group reached the highest level (level 5, scores of 171-246). Conclusion: This study has shown the construct validity of the DTI in a different linguistic context. It has also shown that the level of clinical reasoning skills of Indonesian medical students varied with year of study.


Introduction
Clinical reasoning is considered one of the most important skills needed to be a good physician. Earlier paradigms treated clinical reasoning as a set of generic skills comprising relevant data collection, hypothesis generation, data interpretation and hypothesis evaluation. However, the work of Elstein et al. (1978) fostered the idea that the way medical knowledge is structured in the minds of students and physicians is critical to the quality of medical diagnosis. Interest in the way medical diagnostic knowledge is structured, and in the role of knowledge structure in medical diagnosis, has steadily increased, yielding several clinical reasoning models (Bordage & Lemieux, 1986, 1987; Grant & Marsden, 1987, 1988; Schmidt et al., 1990; Higgs & Jones, 1995). Some research results have also shed light on the clinical reasoning of the novice and the expert. Novice reasoning is characterized by a reliance on biomedical and applied science knowledge. This knowledge base, however, is poorly organized. Expert reasoning is usually characterized by good knowledge structure.
In line with this understanding of the nature of clinical reasoning, a number of methods to assess the process have been developed, e.g. the Diagnostic Thinking Inventory (Bordage et al., 1990), the Script Concordance Test (Charlin et al., 1998, 2000), and Clinical Reasoning Problems (CRP) (Groves, 2002).
The Diagnostic Thinking Inventory (DTI) is a self-report inventory based mainly on research into medical diagnosis and clinical reasoning. The DTI measures two cognitive constructs that emerged from clinical reasoning research (Lemieux & Bordage, 1986; Bordage & Lemieux, 1987; Grant & Marsden, 1987, 1988), namely flexibility in thinking and evidence for structure in memory. Flexibility in thinking refers to the use of a variety of thinking styles that can be applied during the diagnostic process. Structure in memory refers to the availability of knowledge, stored in memory, during the diagnostic process. It is assumed that availability is a direct consequence of adequate knowledge organization. Bordage et al. (1990) showed that this inventory has acceptable reliability (alpha coefficient = 0.83) and can discriminate between expert and novice diagnosticians. Groves et al. (2003) suggested that it has the advantage of being independent of knowledge, and that it is not only applicable to the assessment of clinical reasoning at all levels of expertise but is also able to provide direct insight into the nature of the subject's clinical reasoning process. This understanding could be used as the baseline for developing strategies that support the acquisition of clinical reasoning competency. Sobral (1995, 2002) related the development of clinical reasoning, as assessed by the DTI, to learning characteristics, knowledge score and type of curriculum. Others have used it to measure the effect of a specific intervention (Round, 1999), to measure its reliability and validity when used with physiotherapists (Jones, 1997), to assess radiologists' clinical reasoning competency (Peterson, 1999), to measure the concurrent validity of a new clinical reasoning assessment method (Groves, 2002) and to study the clinical reasoning characteristics of diagnostic experts (Groves, 2003). However, the above-mentioned studies do not provide evidence on the construct validity of this instrument for use with medical students or physicians.
The present study aims (1) to explore the stability of the construct validity of the DTI in a different linguistic context, and (2) to measure the clinical reasoning skills of medical students in the Faculty of Medicine, Gadjah Mada University, Yogyakarta, Indonesia.

Context of Study
The Faculty of Medicine of Gadjah Mada University (GMU-FM) belongs to one of the largest universities in Indonesia. Each year GMU-FM accepts approximately 280 new medical students.
From 1990 to 2002 GMU-FM used a hybrid Problem-Based Learning (PBL) curriculum, but since September 2003 it has used a full PBL curriculum. There are two programmes within GMU-FM: the regular programme, which is delivered in the Indonesian language, and an international programme, in which the students come from overseas and the curriculum is delivered in English. The present study was conducted in the regular programme.

Subjects
All medical students at GMU-FM, from the first to the final (6th) year, were included in this study (n = 1135). Sixty general practitioners of varying experience were also recruited.

Measures
The researcher translated the DTI into Bahasa Indonesia (the Indonesian national language). Five students who had TOEFL test scores of more than 580 were asked to complete both the Indonesian and the English versions of the DTI.
Nunnally & Bernstein (1994) have suggested that parallel forms should be administered at least two weeks apart. One argument for a significant time span between sittings is that it minimizes memory effects and the tendency to respond to an item by recalling the earlier response. In this study, the Indonesian version was administered two weeks after the English version.
Each conflicting answer between the Indonesian and English versions was identified, and all the questions that yielded different answers were reviewed. New translations for those questions were based on the feedback. The DTI was then distributed to all subjects.
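The comparison of the two language versions described above can be sketched in code. This is a minimal illustration only, not the procedure actually used in the study: it assumes that a "conflicting answer" means any difference between a student's responses to the same item in the two versions, and the function name and sample responses are hypothetical.

```python
from collections import Counter


def conflicting_items(english, indonesian):
    """Count, per item, how many respondents answered differently
    in the two language versions.

    english, indonesian: dicts mapping respondent id -> list of item responses.
    Returns a Counter mapping 1-based item number -> number of mismatches.
    """
    mismatches = Counter()
    for respondent, en_answers in english.items():
        id_answers = indonesian[respondent]
        if len(en_answers) != len(id_answers):
            raise ValueError(f"response lengths differ for {respondent}")
        pairs = zip(en_answers, id_answers)
        for item_no, (en, idn) in enumerate(pairs, start=1):
            if en != idn:
                mismatches[item_no] += 1
    return mismatches


# Hypothetical responses of two students to a four-item excerpt.
english = {"s1": [5, 2, 6, 3], "s2": [4, 2, 5, 3]}
indonesian = {"s1": [5, 4, 6, 3], "s2": [4, 5, 5, 2]}

flagged = conflicting_items(english, indonesian)
for item_no in sorted(flagged):
    print(f"item {item_no}: {flagged[item_no]} conflicting answer(s)")
```

Items with the most mismatches would be the first candidates for a revised translation. A stricter or looser definition of conflict (for example, a difference of more than one scale point) would only change the comparison inside the loop.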

a. Response rate
The response rate for all subject groups is presented in Table 1. The overall response rate was 80.75%. The response rate for first-year students was the highest because they were easier to contact than the other groups. The relatively low rate for second-year students can be explained by the fact that they rarely attend large-group lectures, and reaching each of them individually was difficult in a medical faculty as large as the Gadjah Mada School of Medicine.

Table 1: The response rate from each subject group

b. Mean scores of DTI and its two subscales
Table 2 shows the means of the overall DTI score and its two subscales for each subject group. A one-way unrelated ANOVA showed significant differences in the overall DTI score (F = 7.097, p < .001), the flexibility in thinking subscale (F = 6.111, p < .001), and the evidence of knowledge structure subscale (F = 5.306, p < .001), with scores increasing over the period of medical training and practical experience.

c. Total score according to descriptors of attainment
In relation to the DTI score, Bordage identified descriptors for five different levels of attainment:
Level 1 = Poor flexibility and little evidence of structure.
Level 2 = Some evidence of developing structure and flexibility.
Level 3 = Evidence of overall developing flexibility and structure.
Level 4 = Good flexibility of thinking and evidence of structure.
Level 5 = Excellent flexibility and evidence of structure.
The percentage of subjects in each group at each attainment level is shown in Table 3. It is surprising that more than one-third of year 1 and year 2 medical students scored between 171 and 246 (level 5). It is also of concern that around one-fifth of clinical students were at the lowest level (poor flexibility of thinking and little evidence of knowledge structure). Table 3 clearly shows a "ceiling effect": the largest proportion of subjects in every group reached the highest level (level 5, scores of 171-246). The two GP groups scored higher than the others.

d. Reliability of the DTI
Overall reliability of the DTI was .74 (α coefficient for internal consistency); for the flexibility in thinking and evidence of knowledge structure subscales it was .50 and .70, respectively. Table 4 shows comparisons with several previous studies.

Discussion and Conclusion
In terms of clinical reasoning skills, theory indicates that the more experienced the subjects, the better their clinical reasoning skills. Because scores increased with training and practical experience, the analysis of the DTI scores of the eight groups supports the construct validity of the DTI that has been established in previous studies (Bordage, 1990; Groves et al., 2002).
The overall reliability of the DTI was .74. This coefficient is comparable to the developers' original finding of .83 (Bordage et al., 1990) and to the values reported in previous studies, which ranged from .66 to .87 (Groves et al., 2002; Jones, 1997).
The reliability of the evidence of knowledge structure subscale was .70, which is also comparable to previous studies (Bordage, 1990). However, the reliability of the flexibility in thinking subscale was only moderate (.50). Further analysis showed that the alpha coefficient would increase if item 2 (asking about prioritization), item 3 (asking about early interpretation), item 11 (asking about clarification before further data acquisition), and item 16 (asking about data interpretation over acquisition) were deleted.
The first- and second-year Indonesian medical students' mean total DTI scores were higher than those reported by Bordage et al. (1990) and Groves et al. (2002). The third-year students' mean score was lower than that of similar cohorts in previous studies (158.3, Bordage et al., 1990; 168.59, Sobral, 1995; 168.89, Sobral, 2000; 171.1, Groves et al., 2002). No previous studies involving fourth-year medical students were identified in the literature. The fifth- and sixth-year medical students' mean scores were also lower than in previous studies, while the GPs' score was higher.
The fact that approximately one-third of the first- and second-year medical students achieved level 5 was surprising. As explained by Bordage (1990), the DTI can be used in two modes, either in relation to a particular case or in relation to one's general mode of diagnostic thinking. In this study the DTI was used in relation to subjects' general modes of diagnostic thinking. The instructions used by the original developers (Bordage et al., 1990) asked the subjects to respond as spontaneously as possible by indicating how they actually diagnose and not how they think they should diagnose (even for those with little clinical experience).
Clear written instructions were provided, and the researcher also emphasised them orally. However, first- and second-year students, who have not encountered many patient problems, may find it difficult to indicate how they actually diagnose. Consequently, they probably reported what they thought they should do (trying to find the right answer) rather than what they would actually do in practice. This may have contributed to their high scores. In a formative setting this problem could be handled by asking students to explain their choices.
The fact that approximately one-fifth of clinical students scored below 150 (poor flexibility and little evidence of structure) is a concern. The learning experiences they have undergone over more than four years should have prepared them sufficiently to score better.
In conclusion, this study has shown the construct validity of the DTI in a different linguistic context. It has also shown the level of diagnostic skills of Indonesian medical students.
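As an illustration of the reliability analysis reported in this paper, the sketch below computes an internal-consistency coefficient and the corresponding "alpha if item deleted" values. It assumes that the reported α is Cronbach's alpha, α = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function names and the small data set are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) and of the total score per respondent.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Map each 1-based item number to alpha recomputed without that item."""
    k = len(rows[0])
    return {
        i + 1: cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
        for i in range(k)
    }


# Hypothetical scores: items 1 and 2 agree perfectly, item 3 does not.
scores = [
    [1, 1, 3],
    [2, 2, 1],
    [3, 3, 2],
]

overall = cronbach_alpha(scores)
without_item = alpha_if_item_deleted(scores)
print(f"alpha = {overall:.2f}")
for item_no, value in without_item.items():
    print(f"alpha if item {item_no} deleted = {value:.2f}")
```

In this toy data set, removing the third item raises alpha because the two remaining items vary together. An item whose removal raises alpha in this way is one that does not covary well with the rest of its scale.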