Our exam software encourages students to study STEM material in an organized, meaningful way. We will demonstrate the tool and present data from several studies showing what we have learned about best practices, student attitudes, reliability and validity of scores, as well as future directions.
Although a variety of methods have been developed for evaluating conceptual knowledge in STEM courses, none appears close to ideal, particularly for online and blended courses. Multiple choice exams are fast and easy to grade with modern technology, are viewed by students as objective, and can cover a very wide range of concepts and facts in a relatively short time (e.g., McKeachie and Svinicki, 2010). However, many STEM faculty are extremely busy, and not all have learned best practices in multiple choice test construction. Thus, in practice, most multiple choice exams simply verify that students have read required material. Such exams do not measure how well students are organizing information, rarely measure higher level thinking, and provide no information about which students are going beyond the basic information presented in the course. Alternatively, teachers of STEM courses could use essay exams. Essay exams are easier to write than multiple choice exams; can be used to evaluate whether students have developed elaborate, organized conceptual understanding; can provide clues about students' creativity, synthesis, evaluation, and other higher order thinking skills (e.g., McKeachie and Svinicki, 2010); and are open ended. However, essay exams take a great deal of time to grade, their scores are less reliable than those of multiple choice exams, students find them very subjective, and graders often show halo effects driven by writing style rather than content knowledge. What is needed is a reliable, valid evaluation method that does not have these disadvantages.
Our research concerns an approach designed with these goals in mind. Our testing approach is based on the following principles:
- Kintsch (e.g., 1988), whose reading comprehension model is the foundation of most comprehension models, argues that the meaning of text is based on propositions, not on individual words. Thus our approach emphasizes propositions but, to avoid halo effects, encourages the use of simple propositions.
- Early in the 20th century, many philosophers argued that knowledge is verbal. However, later in the 20th century, researchers such as Paivio (e.g., 1969) provided strong evidence that there is an imagery system that is more visual, concrete, and analog. Thus we provide students a means to report aspects of their knowledge stored in each of these systems.
- The tool should be web-based and easy to use so that it is accessible and quickly learned by students in online and blended courses. The experimental testing tool we have created is called Easymap. Faculty create a test by defining a prompt or question, optionally identifying fundamental concepts students should know, and determining when the test is available and to whom. Students log in and enter simple propositions in the form "concept A" "is related to" "concept B." The propositions are listed on the screen and are editable. The software also displays all the propositions in concept map format. In our current version, students cannot directly interact with the map, but they can view it to see their organization and assess what is missing.
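Easymap's internals are not published, so the following is only a minimal sketch of how propositions of the form "concept A" "is related to" "concept B" might be stored and assembled into a concept-map view; all names and sample propositions are invented for illustration.

```python
from collections import defaultdict

# Each student entry is a simple proposition: (concept_a, relation, concept_b).
# These example propositions are hypothetical, not from the actual system.
propositions = [
    ("rods", "are more sensitive than", "cones"),
    ("cones", "support", "color vision"),
    ("rods", "are concentrated in", "the periphery"),
]

# Build a concept-map view: each concept maps to its outgoing labeled links,
# which is enough information to draw nodes and labeled edges.
concept_map = defaultdict(list)
for concept_a, relation, concept_b in propositions:
    concept_map[concept_a].append((relation, concept_b))
```

A display layer could then list each proposition for editing and render `concept_map` graphically, letting a student see at a glance which concepts are isolated or under-connected.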
Method

Our research over the last two years has been aimed at the following questions:
-What is best practice with regard to the prompts or questions?
-How can students be taught to use the system quickly?
-What restrictions should the software have to encourage students to enter well-formed propositions and create well-formed maps?
-What scoring approaches provide good reliability and validity?
At the conference, we will give more details of various experiments, but here is an example of one of the most recent. Each of 33 students in an upper-division perception course was given a number of exams during a semester. For each topic, we attempted to have students complete a pretest, take a post-test two weeks later, and then spend several days at home improving a copy of their post-test. Students were randomly assigned to topics. All three time periods were completed for 4 exam topics; the pretest plus take-home were completed for 6. Students were motivated to do well: these exam grades formed the basis of the entire course grade (although the pretest was graded only as completed or not).

Results

Maps from all three time periods for each exam prompt were pooled, and each was assigned a random label. A group of experts worked together to rank all of the exams, with extensive discussion to build consensus on the ranking. Several experts also rated the individual propositions as correct, irrelevant, or incorrect, yielding a total-correct-propositions score for each student on each exam. Agreement among the raters was .90. The correlation between global ratings and total correct scores was large and significant (e.g., Exam 1, r(40) = .88, p < .0001). The total correct scores for the 4 exams completed at all three time periods were analyzed with ANOVA. The only significant effect was when the exam was taken (pre, post, take-home; F(2, 174) = 94.97, p < .0001). Similar results were obtained for global ratings. Student attitudes were in general quite positive. All but two students indicated that the approach to testing was outstanding, and they strongly encouraged us to continue the research and enhance the software. One student wrote, "This is what testing in college should be like, but usually isn't!" Several other students wrote similar comments.
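The scoring scheme described above, rating each proposition as correct, irrelevant, or incorrect, summing the correct ones, and correlating those totals with global expert ratings, can be sketched as follows. The student data and ratings below are invented for illustration and are not the study's actual data.

```python
def total_correct(ratings):
    """Count the propositions a rater labeled 'correct'."""
    return sum(1 for r in ratings if r == "correct")

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Four hypothetical students: per-proposition ratings from one rater,
# plus hypothetical global expert ratings of the same four exams.
ratings = [
    ["correct", "correct", "irrelevant"],
    ["correct", "incorrect"],
    ["correct", "correct", "correct", "correct"],
    ["irrelevant", "incorrect"],
]
global_ratings = [3, 2, 4, 1]

totals = [total_correct(r) for r in ratings]  # [2, 1, 4, 0]
r = pearson(global_ratings, totals)
```

With these made-up numbers the two measures track each other closely, illustrating the kind of ratings-versus-totals correlation reported for the actual exams.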
Discussion

This new system allows exams to be set up very quickly, in about the same amount of time as writing an essay exam. The exam format encourages students to study STEM material in an organized, meaningful way. The system allows students to emphasize different aspects of the material and to go beyond basic course material in a wide variety of ways. Initial grading takes some time, but proposition scores can be stored and reused to partially grade other students' exams. We will discuss scoring and other issues with the audience.
He received his bachelor's degree in Psychology from the University of California and his PhD in Cognitive Mathematical Psychology from Indiana University. He is currently a professor at Ball State University. He has published in a number of areas, including perception, cognition, human factors, education, and modern technology. He is a past president of the Society for Computers in Psychology; has been a consulting editor for several journals and grant agencies; and is a recipient of BSU's Outstanding Faculty award, honoring his contributions to research, teaching, and service. For the past nine years, he has also served as President of nHarmony, Inc., a software company providing custom web-based solutions.