February 1998

Forecasting that Fits

There are many forecasting products from which to choose. How can you pick the ones that are right for your needs?

By Jack Yurkiewicz

Our last forecast survey appeared in the December 1996 issue of OR/MS Today [1]. Based on readers' comments and interests, and because forecasting as a discipline continues to grow rapidly, an update is in order. The American Statistical Association's annually sponsored meetings (MSMESB, or "Making Statistics More Effective in Schools of Business") continue to stress forecasting coverage in business statistics courses. Operations research and management science courses are including forecasting as a basic topic. Two new texts on forecasting (DeLurgio [2] and Diebold [3]) have appeared. All current editions of business forecasting texts as well as operations research texts have expanded or included chapters on forecasting [4]. Business practitioners have made both time-series analysis and causal forecasting essential factors of their analysis of the firm, and financial analysts routinely use forecasting techniques.

Because of these developments, there are many forecasting programs on the market. Traditionally, if you wanted a "top-notch" forecasting program, you would buy a dedicated, stand-alone product. However, general statistical programs have expanded and enhanced their forecasting modules, and some of these rival their stand-alone counterparts. Since most forecasters do statistical analysis, the additional cost of adding the forecasting module to the statistics software could be lower than purchasing a dedicated forecasting program, and the learning curve is nil. However, if you want some of the more "exotic" or sophisticated forecasting methodologies, such as spectral analysis or state-space models (and others beyond exponential smoothing and Box-Jenkins procedures), then you must buy a dedicated forecasting package that offers these procedures.
While most vendors from the 1996 survey are in the current one, many of their products have been updated and enhanced since then. For the most part, the improvements have come in the "ease-of-learning" and "ease-of-use" categories. Entering data, or importing it from a larger variety of formats, is now easier and more intuitive. Graphical capabilities have improved as well.

As we have done in past surveys, we categorized forecasting software into three groups. The first category is called "automatic" software. In such software, the user enters or imports data and asks the program to "analyze" it. Considering various diagnostic tests, the software responds with a "recommended" methodology (exponential smoothing, Box-Jenkins, etc.) that should give the "best" forecasts. If the user concurs with the recommendation, the program will then proceed to find the optimal parameters for the proposed procedure, produce the forecasts and corresponding statistics (mean-square-error, Ljung-Box statistics, mean-absolute-deviation, etc.) and make forecast plots. The user can manually "override" the recommended procedure and choose a preferred methodology. The software will then find the optimal parameters for that model and produce the associated output.

In the past, I found that I could do "better" than the method recommended by most of these automatic programs, especially for erratic time-series data. The method I manually chose would give me a lower criterion value (MSE, Schwarz' Bayesian Information Criterion [BIC], Akaike's Information Criterion [AIC], etc.) than that found by the recommended procedure. However, that may have changed with the current crop of software. Forecast Pro (version 3.0), perhaps the archetype of automatic forecasting software, always recommended for my test data a procedure that indeed turned out to be the "best." It is still prudent to warn users to be leery of using these automatic programs as "black boxes."
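To make the idea of an automatic recommendation concrete, here is a minimal sketch of what such a program might do internally: fit a few candidate methods, score each with an information criterion, and recommend the lowest score. Everything in it is an assumption made for illustration. The three candidates, the form of BIC, the parameter counts, the function names and the data are all hypothetical, and this is not how Forecast Pro or any surveyed product actually chooses a method.

```python
import math


def errors_naive(series):
    """One-step-ahead errors when each forecast is the previous observation."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def errors_running_mean(series):
    """One-step-ahead errors when each forecast is the mean of all earlier points."""
    errors, total = [], series[0]
    for t in range(1, len(series)):
        errors.append(series[t] - total / t)
        total += series[t]
    return errors


def errors_ses(series, alpha):
    """One-step-ahead errors of simple exponential smoothing.

    Assumption: the level starts at the first observation.
    """
    level, errors = series[0], []
    for y in series[1:]:
        errors.append(y - level)
        level = alpha * y + (1 - alpha) * level
    return errors


def sse(errors):
    """Sum of squared errors."""
    return sum(e * e for e in errors)


def bic(errors, n_params):
    """One common form of BIC; real packages may define it differently."""
    n = len(errors)
    return n * math.log(sse(errors) / n) + n_params * math.log(n)


def recommend(series):
    """Score every candidate and return (recommended name, all scores)."""
    alphas = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    best_alpha = min(alphas, key=lambda a: sse(errors_ses(series, a)))
    candidates = {
        "naive": (errors_naive(series), 0),
        "running mean": (errors_running_mean(series), 0),
        "simple exponential smoothing": (errors_ses(series, best_alpha), 1),
    }
    scores = {name: bic(errs, k) for name, (errs, k) in candidates.items()}
    return min(scores, key=scores.get), scores


# Made-up monthly figures, for illustration only.
sales = [52, 48, 55, 61, 50, 47, 58, 63, 54, 49, 57, 60]
recommended, scores = recommend(sales)
print("Recommended:", recommended)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"  {name}: BIC = {score:.2f}")
```

Printing every score, not just the winner, is what lets a user "override" the recommendation with open eyes: a runner-up that scores almost as well may be the more sensible choice for other reasons.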
Perhaps your data set may result in a recommendation that, with some tinkering, you could improve on using another technique. My informal testing showed that (at least for Forecast Pro) great strides have been made in the accuracy of its recommendations.

I call the second software category "semiautomatic." You enter the data but the program does not recommend a procedure. You must choose the appropriate model from a list. The program will then find the optimal parameters for the model chosen, make forecasts and produce the appropriate statistics and plots. Of course, you can specify the parameters manually as well. The user of such software must obviously have a solid knowledge of forecasting and the various associated techniques. Many of the programs fell into this category, and all of the general statistics programs with forecasting modules are in this group.

In this semiautomatic group, the programs differ in how they operate. For example, Sibyl/Runner will find the optimal smoothing parameters for Winters' seasonal forecasting method using a nonlinear optimization approach, thus getting the results in a single run. Trends (part of SPSS) asks the user to specify ranges for these parameters because it uses a grid search to locate the best ones. The program recommends that you use larger step sizes first and then "fine tune" the search with smaller ones. Thus, it may take several "passes" before you get these parameters with three-decimal accuracy. MINITAB will automatically find the optimal parameters for Brown's simple exponential smoothing and for Holt's method, but not for Winters' model. For Winters' model, the user must find them manually using a tedious trial-and-error process.

The third software category can best be called "manual." Here the user must specify both the method and the parameters. Thus, the user must execute many "runs" for a time series, each time noting the corresponding output statistics.
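The coarse-to-fine search that Trends asks its users to run by hand (and that manual software leaves entirely to the user) can be sketched in a few lines. This is a minimal illustration, not any vendor's routine. It assumes simple exponential smoothing with a single smoothing constant, a level initialized to the first observation, and MSE as the criterion; the function names and the data are hypothetical.

```python
def ses_mse(series, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing.

    Assumption: the level is initialized to the first observation.
    """
    level = series[0]
    sq_err = 0.0
    for y in series[1:]:
        sq_err += (y - level) ** 2
        level = alpha * y + (1 - alpha) * level
    return sq_err / (len(series) - 1)


def grid_search(series, lo, hi, step):
    """Return the alpha on an evenly spaced grid over [lo, hi] with the lowest MSE."""
    n = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n + 1)]
    return min(candidates, key=lambda a: ses_mse(series, a))


def coarse_to_fine(series, passes=3):
    """Run successive passes with steps 0.1, 0.01, 0.001 around the best value so far."""
    lo, hi, step = 0.0, 1.0, 0.1
    best = None
    for _ in range(passes):
        best = grid_search(series, lo, hi, step)
        # Narrow the range to one old step on either side of the best value.
        lo, hi = max(0.0, best - step), min(1.0, best + step)
        step /= 10
    return round(best, 3)


# Made-up demand figures, for illustration only.
demand = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
alpha = coarse_to_fine(demand)
print("alpha =", alpha, "MSE =", ses_mse(demand, alpha))
```

Three passes evaluate a few dozen candidate values instead of the thousand needed for a single pass at a step of 0.001. The price is that a coarse first pass can settle on the wrong region if the criterion has more than one local minimum.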
The forecasting session ends when the user finds the "best-to-date" parameters, or dinner is ready, whichever comes first. I would not recommend that you purchase manual software, and few products fall into this group.

Choosing Your Software

How do you choose forecasting software? Besides choosing one of the three categories described, you should also determine the overall capabilities of the program. That is, what methodologies are available? A program with a long list of techniques is not necessarily better than one that offers fewer procedures, so long as the latter program has the models you traditionally use. On the other hand, if your company (or the government) primarily uses the Census X-11 decomposition procedure, then your product choices are more limited.

Even if the program does have the methods you require, the capabilities of these methods vary. For example, you can use an ordinary spreadsheet such as Excel to do multiple regression analysis, but Excel (and many forecasting programs) will not find a prediction interval for a new observation. If the program can find a prediction interval, does it allow you to specify any confidence level in addition to the standard 95 percent? Some programs limit the number of observations or data points, and if your data sets are routinely large, such programs become useless for your needs. Perhaps a certain program can perform Winters' method, but will it permit damped or nonlinear trend in addition to the standard linear trend? Can it find smoothing parameters outside the standard zero-one interval?

I found that the graphics output of these programs can vary. All will give you a time plot of the data and most can give a plot of the fitted results and the data. However, some plots show only forecasts, while others show forecasts and confidence intervals as well.
Many programs, on seeing a single time-series column of data, will make a time plot automatically, and you can embellish this by specifying an appropriate title, legends and tick marks for the horizontal axis. Others require the data to have the time intervals as a separate column of numbers or words, thus making the time plot more tedious to produce. Some can give a simple plot of the autocorrelation and partial autocorrelation functions, while others also show the confidence limits for the correlations. Many programs allowed me to save the plot in a separate standard file format (such as a Windows bitmap or metafile), making it easy to bring the graph into my graphics program to embellish it. Others forced me to copy the plot to the clipboard and then paste it into my word processor, making enhancements more difficult.

These and other issues can be easily resolved, as most vendors will readily supply you with information on the software's capabilities. However, it may be harder to resolve some other questions. Just how easy is it to learn the program, and how easy is it to use, especially for the infrequent user? The documentation quality may also vary. Some manuals give little more than what is in the on-disk help system, while others give good tutorials on how to use the package. Others show you not only how to use the software, but also give advice on the art of forecasting itself and actually teach the methodology. Does the vendor supply help, and what is the level of support? Is it just a Web site with a series of generic FAQs, or can you actually speak with someone in technical support? Is there a charge for live technical support, and what are its time limitations, if any?

The Survey

We attempted to identify as many forecasting vendors as possible, and then we mailed a simple questionnaire to each. If a vendor failed to respond, we attempted to follow up with a telephone call.
The resulting list is almost surely not comprehensive, and we will always get readers and vendors who later complain that we did not include a certain product. To these we can only apologize and promise to include them in the next roundup. However, we only considered products from commercial vendors, and we identified these products from the publication's database, advertising, word-of-mouth and displays at professional conferences. We did not consider software bundled with textbooks, or software written by a professor and used at his or her university but not elsewhere.

The Results

As with the previous surveys, the reader should understand that the results given are just summaries of the information supplied by the vendors of the software. We did not attempt to verify the information supplied.

I am still frustrated to find that the answers I got from these products did not agree for identical time series. After I specified a particular procedure (e.g., Winters' method), the programs invariably came up with widely differing parameters. Forecasts differed, but not dramatically. Probably the main reason for this is how each program chooses the initial conditions to start the fitting process. None of my programs permits me to choose these initial conditions, and worse, very few even described in the documentation what those initial conditions were. Different optimization criteria were another reason for the variation in output parameters. Some programs try to minimize MSE, others use BIC and others employ MAD (mean absolute deviation). A few programs let you choose the optimization criterion, but most do not.

Another problem I found was the importation of data. While many programs claim to read Excel files, few can read current Excel 7 files (from Office 97, which has been out for more than a year). For example, SPSS version 8.0, released in January 1998, can only read Excel files earlier than version 5.0.
Some programs may not read the spreadsheet data "as is" but require additional header information. Forecast Pro, for example, assumes the Excel XLS format is the default mode of data entry, yet the software requires special handling before it will read the spreadsheet. Even if the Excel spreadsheet has just one time-series column of numbers, Forecast Pro's documentation states that the "word VERTICAL must be placed in cell A1 to indicate the spreadsheet is in column format. The other cells in row 1 and column A are ignored. Each data record consists of six header items in row 2 through 7, followed by the historic data in the remainder of the column, beginning in row 8." Rows 2 through 7 must contain the variable name, a variable description, a starting year, a starting period, the number of periods per year, and the number of periods per seasonal cycle. Infrequent users might want to keep the manual nearby for guidance.

Sibyl/Runner, while a Windows product, reads only ASCII files, and its first importation choice is the Lotus PRN format, a DOS standard from 15 years ago. Sibyl also requires that the first element of the column be a variable name. The user supplies the other information (number of periods per seasonal cycle, etc.) via a "setup" dialog box.

Obviously, the results of a one-page questionnaire cannot tell everything you want to know about the software. If the vendor has a student version available, try it before you purchase the "professional" model. The student version is typically limited in that it handles only smaller data sets or omits certain features. At the very least, ask the vendor any questions you may have.

If your forecasting methods are limited to the class of exponential smoothing methods (Brown, Holt, Winters), consider using Excel as the forecasting software.
Using Excel's Solver (the problem is a nonlinear program, and we do just that in our operations management course!), it is not too difficult to find smoothing constants that will minimize MSE.

View the online version of the 1998 Forecasting Software Survey

References

1. Yurkiewicz, J., "Forecasting Software Survey: Guide to a Fast Growing Discipline," OR/MS Today, December 1996, pp. 70-75.
2. DeLurgio, Stephen A., "Forecasting Principles and Applications," Irwin-McGraw Hill, New York, 1998.
3. Diebold, Francis X., "Elements of Forecasting," South-Western College Publishing, Cincinnati, Ohio, 1998.
4. Three good examples of the latter (and there are many others) are: Winston, Wayne and S. Christian Albright, "Practical Management Science, Spreadsheet Modeling and Applications," Duxbury Press, Belmont, Calif., 1997; Ragsdale, Cliff T., "Spreadsheet Modeling and Decision Analysis," Second Edition, South-Western College Publishing, Cincinnati, Ohio, 1998; Hesse, Rick, "Managerial Spreadsheet Modeling and Analysis," Irwin-McGraw Hill, New York, 1997.

Jack Yurkiewicz is a professor of Management Science at the Lubin School of Business at Pace University in New York. He can be contacted via e-mail at yurk@pace.edu

OR/MS Today copyright © 1998 by the Institute for Operations Research and the Management Sciences. All rights reserved.