CONNECT, SPRING 1996: STATISTICS AND THE SOCIAL SCIENCES


SAS or SPSS? Windows or Unix?

by Robert A. Yaffee

[Ed: Links to web pages and/or e-mail addresses which have become inactive since the publication of this article have been enclosed in curly brackets { }. Replacement links have been provided where possible.]

Often a professor or student is confronted with the decision of which general-purpose statistical package to use, and on what platform. If the user is a beginner or has small datasets, SPSS (Statistical Package for the Social Sciences) is often the best choice. For such a person, the MS Windows version (version 6.1.2 is available at the ACF) is easier to use than the Unix one. SPSS for Unix (version 5) can run on the RS6000/C-20 in batch mode, but for graphics, the user must have access to an X-Windows terminal or to an X-Windows emulator running on a desktop computer. One such emulator, Micro X-Windows, is available on the Gateway PCs in the ACF computer labs in Tisch Hall (room LC-8), at 14 Washington Place, and at Third Avenue North. If the user has to enter data, SPSS is much easier to use than SAS (the Statistical Analysis System), and the MS Windows version of SPSS is much faster than the Unix version running in X-Windows.

SPSS running under Micro X-Windows may be too slow for many people. For researchers with large datasets and more complex statistical analyses, SAS may be the better package. Running under either MS Windows or Unix, SAS is currently more powerful than SPSS, as well as more complicated. On both systems, SAS now has better graphing capabilities.

For general data management, SAS possesses certain advantages over SPSS. With SAS, it is easier to merge and to concatenate datasets. It is also easier to pass the contents of one dataset into another, and to take the output of one statistical procedure and feed it into the input of another. SAS is more flexible in its data management than SPSS, which is more modular. SPSS has some advantages of its own: its value labels are easier to create than SAS variable formats, and for data entry, SPSS for Windows allows for easier input.
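As a rough illustration of the data-management operations mentioned above, here is a minimal sketch in plain Python rather than in SAS or SPSS syntax. The dataset names, variables, and values are invented for the example, and the helper functions (concatenate, merge, mean, center) are hypothetical stand-ins, not procedures from either package.

```python
# Hypothetical survey records; all names and values are invented for illustration.
wave1 = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
wave2 = [{"id": 3, "age": 28}]
income = [{"id": 1, "income": 42000}, {"id": 3, "income": 38000}]


def concatenate(*datasets):
    """Stack datasets end to end: same variables, more cases."""
    return [dict(row) for data in datasets for row in data]


def merge(left, right, key):
    """Match-merge by key: same cases, more variables.

    Cases in `left` with no match in `right` are kept unchanged.
    """
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def mean(data, variable):
    """A toy 'procedure' whose output is itself a one-row dataset."""
    values = [row[variable] for row in data if variable in row]
    return [{"variable": variable, "mean": sum(values) / len(values)}]


def center(data, variable, summary):
    """A second 'procedure' that takes the first one's output as input."""
    m = summary[0]["mean"]
    return [{**row, variable + "_c": row[variable] - m} for row in data]


people = concatenate(wave1, wave2)        # three cases, one variable each
combined = merge(people, income, "id")    # income added where ids match
centered = center(people, "age", mean(people, "age"))  # output fed to input
```

Concatenation adds cases, merging adds variables, and the last line shows the output of one step serving directly as the input to the next, which is the kind of chaining the paragraph above describes.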

The number, power, and flexibility of SAS statistical procedures are generally greater than those of SPSS. For categorical data analysis, SAS offers more tests than does SPSS. SAS has a repertoire of significance tests for differences between stratified crosstabulations. SAS also contains a wider variety of regression and ANOVA procedures than does SPSS. SAS Graph far exceeds the current capabilities of SPSS Chart.

Nonetheless, SPSS offers some unique options not now available in SAS. Besides allowing for easier data entry, SPSS has neat syntax and analysis for hierarchical regression, a CHAID module for market segmentation, a good reliability program, an Exact test module with a wide variety of tests, and a neural-network module, along with very pleasant output presentation. At this time, preparation for proper usage of the SAS system, with its greater variety of options, involves much more homework than for proper usage of SPSS. For the beginning student, SPSS is more user-friendly, but for the advanced user or statistician, SAS may be more powerful than SPSS. SPSS will be coming out with a new General Linear Model program allowing for post-hoc tests in version 7 for Windows. SPSS continues to develop new and interesting statistical procedures, and is probably the most widely used statistical package in universities today, in both North America and Europe.

In addition to a number of specialized statistical packages, the ACF Statistics and Social Science Group supports both SPSS and SAS. Persons interested in discussing the relative advantages of one package or platform over the other should contact either Robert Yaffee or Frank Lopresti at 998-3058.

For more about SPSS, see "New Statistical Modules for Marketing Research and Time-Series Analysis."


Dr. Yaffee was a statistical consultant at the ACF at the time of this article's publication.
{yaffee@nyu.edu}

Posted 15 February 1996. Revised 24 May 2004.