OR/MS Today - October 2006

Issues in Education

Testing Ourselves

By Joel Sokol

Consider the following title and abstract of a research paper:

Title: A new heuristic for solving Problem X.

Abstract: Problem X is a well-known and well-studied problem. In this paper, we present a new heuristic algorithm that gives solutions to Problem X. Our method belongs to a class of algorithms that are thought to be good by other researchers. Computational results on one test instance suggest that our algorithm generates a good solution.

As OR/MS professionals, we know what's missing from this paper: a comparison of this heuristic's results with those of other methods for solving Problem X. Any method for solving a well-studied problem should be compared with other solution methods for the same problem.

Unfortunately, though, there is one area of OR/MS research where the requirement to compare one's methods to those already existing is often absent: OR/MS education. For some reason, we authors, referees and editors in this area seem to be much less stringent about making comparisons to known methods when Problem X is "how to teach topic Y to students of type Z."

There are several reasons why this might be the case, but the biggest is probably that it's much harder to run serious methodological comparisons in OR/MS education. In other areas of OR/MS, a researcher with a new solution idea can easily find large repositories of widely recognized benchmark test instances (at Netlib, for example) and/or quickly generate as many random test instances as necessary. In educational research, our test instances (i.e., courses we teach) appear at a rate of about one or two per semester; moreover, these tests are easy to spoil accidentally (for example, if the researcher overlooks an important factor and fails to control for it, it's impossible to go back and re-run the experiment on the same set of students).
This all makes for a much slower rate of computational testing than in other areas of OR/MS.

While impatience with the rate of computational testing is certainly understandable, it leads us into the oldest trap in the OR/MS book: making a decision because it "should make sense" rather than because data indicates that it works. On the one hand, we spend countless hours teaching our students the importance of using quantitative methods to analyze alternatives, but on the other hand we publish educational ideas that simply look or sound reasonable, without data to show that our methods work better than any other approach. (Among education researchers we're certainly not unique in this respect; consider, e.g., the phonics vs. "whole language" debate. But as OR/MS professionals, we should know better.)

In spite of the difficulties, I believe it is important that we practice what we preach: new (and old) teaching methodologies and tools should be evaluated and compared quantitatively, with positive results being a requirement for publication. I might have spent a few dozen hours coding my Excel or Java educational tool, but if the data shows that it doesn't help student learning (or that it doesn't help students learn any better than standard methods), it doesn't deserve to be published in our peer-reviewed OR/MS education journals. In fact, my publication would be a disservice to my colleagues, who might spend a few dozen hours of their own making similar, and similarly ineffective, educational tools. That's the whole point of computational testing, whether in education or any other "Problem X" we tackle in OR/MS.

On the other hand, slowing the publication process down to a crawl due to testing isn't an ideal outcome either. So, I suggest that we help each other out. Rather than testing over several semesters serially, several faculty can test in parallel, with each one using the new idea for some students and using their other students as a control group.
(For some ideas, such as new lecture modules, faculty teaching two separate sections would be required.)

What would be the benefit of participating in this type of "academic clinical trial"? Other than being able to help a colleague and the chance to get an "advance copy" of some innovation, participants also could hope that the colleague they help would in turn help them evaluate a future idea. (In that sense, the system would be similar to our standard process for refereeing papers.)

Of course, finding colleagues willing to take part might not be so easy, so I am hereby volunteering to administer the system for anyone who wants to take part. If you're willing to try (and measure) a new idea in your classroom, let me know what course(s) you usually teach and what type(s) of students are in those courses. If you have an idea you need to test, let me know that too, and I'll try to match you up with appropriate colleagues.

And most importantly, when we write a paper, referee it or handle it as an editor, we need to hold our papers to the same standard of proof. Whether we're proposing a new method for solving something in OR/MS or a new method for teaching something in OR/MS, we need to show evidence that the new method is worth using. We don't publish unsubstantiated claims in peer-reviewed journals in any other area of OR/MS; there's no reason we should in education either.

OR/MS Today copyright © 2006 by the Institute for Operations Research and the Management Sciences. All rights reserved.