Intercooled Stata Version 7.0


By Frank LoPresti
frank.lopresti@nyu.edu




Figure 1. The Stata startup window.

Stata, by Stata Corporation, is one of the top statistical programming packages. Research institutions often rank it second, behind SAS. Like SAS and S-PLUS, Stata is highly programmable, which is a necessity for advanced users. Stata Corporation tests and archives useful user-written programs, and its commitment to moderating the user groups has led to a large repository of programs addressing advanced statistical needs.

Why Use Stata?

While SPSS has the gentlest learning curve for the novice user, Stata runs a close second and is well ahead of SAS in this regard. If you're an advanced statistical user, you probably have chosen your programming tools based on those used by your co-researchers or those used in your graduate work. At present, Stata is ahead of most other packages in dealing with complex samples, time series cross-sectional regression, and the now very popular ARCH and GARCH modeling techniques. If this is the type of work you are doing, you already know about Stata. If, on the other hand, you're doing elementary statistical work, you should consider adding this respected tool to your kit.

To give you a feeling for Stata, let's walk through a simple first assignment.

"Stats 101" Assignment

First, on the "c:\" drive of your PC, make a folder for the assignment named "c:\assignment_1". Once you have started Stata and pointed it at this folder, the folder will house the data and output files pertaining to the assignment. This keeps your work together--a nice feature of Stata.

Next, start up Stata. Figure 1 shows what you will see. Note that there is no data spreadsheet and there are only a few pull-down menus. Unlike SPSS, Stata is run mostly by issuing Stata syntax commands. Use the "Stata Command" window to enter commands. Below, I'll use bold text to indicate commands to enter in this command window.


Figure 2. Entering data in a Stata spreadsheet.

Enter the command cd c:\assignment_1 (cd stands for change directory). From now on, any time you save data or output, it will go to the folder "assignment_1". Enter log using assignment_1. This command copies everything that appears in your Stata Results window into a file named "assignment_1.smcl". Later, if you wish, you can copy from this file and paste your work into a word processor file.
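
The two setup commands can be sketched as follows. This is a minimal sketch that assumes the usual `log using` form of the log command and that the folder already exists; the closing `log close` is not part of the walkthrough and is added here only to show how a log is normally ended.

```stata
* Work in the assignment folder so saved data and logs land there
cd c:\assignment_1

* Start recording the Results window to assignment_1.smcl
log using assignment_1

* ... the commands for the assignment go here ...

* Stop recording when the session is finished (not shown in the article)
log close
```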

Use the button labeled "Data Editor" (see Figure 1) or enter the command edit to open a new editor window. Now you will see your spreadsheet.

Let's put in three columns of data, which we will later name ID, SEX and SALARY (Stata won't let you name the variables until you enter some data). Now, start entering numbers into the spreadsheet. Rows are cases (e.g., one person's answers to a questionnaire); columns are variables. Use the TAB and ENTER keys to enter the data. TAB moves you to the right, so that you can enter a case across a row. ENTER moves you down, so that you can enter all the values for a particular variable. Don't use the cursor keys or the mouse to navigate the spreadsheet while entering data.

First, enter some data into a column without a name (see Figure 2). After you have entered some data, double click on a column--anywhere in the column--and a Variable Information window will open. Use this window (see Figure 3) to name and label all three variables. You can give the variable SALARY the label "Salary in Thousands of Dollars" so that later, when we run some simple procedures, this label will appear in the output to remind us that salary is in thousands of dollars.
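
The same naming and labeling can also be done from the Stata Command window once the editor is closed. The sketch below assumes the editor gave the unnamed columns the default names var1, var2 and var3; check the names shown in your own Variables window before running it. It uses lowercase names because Stata variable names are case-sensitive and the commands later in this article refer to sex and salary in lowercase.

```stata
* Assumption: the three unnamed columns are currently called var1, var2, var3
rename var1 id
rename var2 sex
rename var3 salary

* Attach the descriptive label that will appear in later output
label variable salary "Salary in Thousands of Dollars"
```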

Notice that the Stata Command window is not visible. Before you can do anything other than edit your data, you must end your edit session. Click on the "X" in the upper right corner of the spreadsheet. That ends the data editing session. (Personally, I find that scary, but it will not delete your spreadsheet.) Now you should save your data. Enter save mydata in the Stata Command window. This saves your data in the assignment_1 folder on your "c:\" drive into a file called "mydata.dta".
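
As a rough sketch of the save step and its counterpart, the commands below save the data and then reload it in a later session. Only `save mydata` comes from the walkthrough; the `replace` and `use` lines are additions to show what typically follows.

```stata
* Save the data in memory to mydata.dta in the current folder
save mydata

* After later edits, overwrite the existing file explicitly
save mydata, replace

* In a later session, return to the folder and reload the saved data
cd c:\assignment_1
use mydata, clear
```

Stata refuses to overwrite an existing file unless `replace` is given, so a second plain `save mydata` would stop with an error instead of silently replacing your work.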

Figure 3. After entering data, double click on a column to get this Variable Information window.

Now, let's add value labels for sex so that your output will be labeled to reflect the fact that "0" is male and "1" is female. Stata treats a value label as an object separate from any particular variable. If you create a value label named "sexes" to use on the variable SEX, you can reuse the value label "sexes" later with another variable. For example, you could use the "sexes" label with a variable that records the gender of each respondent's first child.

Enter the command label define sexes 0 "male" 1 "female". You have now created the value label "sexes". You can apply this label to the variable SEX by entering the command: label values sex sexes.
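
Put together, the value-label steps look like the sketch below. The variable name child_sex is hypothetical and appears only in a comment to illustrate reuse; the `tabulate` line is an added check, not part of the walkthrough.

```stata
* Define a reusable value label called sexes
label define sexes 0 "male" 1 "female"

* Attach it to the variable sex
label values sex sexes

* The same label could be attached to another 0/1 variable, for example
* a hypothetical variable child_sex:
*   label values child_sex sexes

* Check the result: the table shows male and female instead of 0 and 1
tabulate sex
```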

Next, enter describe. This command describes your active data set, "mydata.dta" (see Figure 4). Enter graph sex to get a bar chart of sex. Notice that the chart opens in its own window. If you want to save or print that graph, you must do it now: unlike the Results window and the log file, the Graph window does not accumulate graphs; it keeps only the most recent one. Now enter ttest salary, by(sex) to compare mean salary between the two sexes. This output is appended to the bottom of the Results window, which may be printed from the pull-down menu under "File".
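
The analysis steps can be sketched as below, assuming the comparison intended is the two-sample t test (`ttest`). The `graph sex` form is the Stata 7 syntax used in this article; later releases changed the graph commands, so treat the commented alternative as an assumption to check against your own version's help.

```stata
* Summarize the variables, labels and storage types in the active data set
describe

* Stata 7 syntax: chart the distribution of sex in the Graph window
graph sex
* In later Stata releases the equivalent is typically:
*   histogram sex, discrete

* Two-sample t test: does mean salary differ between the two sexes?
ttest salary, by(sex)
```

Note that `by(sex)` requires the grouping variable to take exactly two values, which holds here because sex is coded 0 and 1.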

Conclusion

Since Stata commands are typed in by the user, you must know something about the syntax before starting, so the new Stata user will have to read a bit more than the new SPSS user. The UCLA website at www.ats.ucla.edu/stat/stata/ has a great deal of information for you to read, including a lengthy STATA CLASS NOTES web tutorial for the beginner. Please note that Stata doesn't import SAS, SPSS, or Excel files, but the ITS Stats/Mapping Group has a program, DBMS/Copy, which will convert data files from most packages.

Figure 4. The Stata Results window.

People at the Federal Reserve use SAS. The Harvard School of Public Health is partial to Stata. If you are going to become a "heavy" user, you will need to use the tools that are used in your field. Stata, like SAS, is a tool for the rest of your research life. And it will impress your friends. At a conference, no stat bully will kick sand in your face.

Intercooled Stata 7.0 is available to the NYU community at the Tisch and Third Avenue North ITS computer labs. Earlier versions are also available at Tisch. Stata is supported by the Social Sciences, Statistics and Mapping Group of ITS Academic Computing Services, whose offices are at the Third Avenue North Dorm, 75 Third Ave., Level C3. Contact Frank LoPresti (frank.lopresti@nyu.edu, 998-3398) or Bob Yaffee (bob.yaffee@nyu.edu, 998-3402) for more information.

Please see the informational box below for more news from ACS and the Social Sciences, Statistics and Mapping Group.

NEWS FROM THE ITS STATISTICS AND MAPPING GROUP

New Versions of Statistical and Mapping Packages available through NYU/ITS (Selected list).

  • SPSS version 10.1 (modules such as TextSmart and Answer Tree) for PC (1,2,3,4)
  • SPSS version 10 for Mac (1,3)
  • SAS version 8.2 for PC (1,2,3)
  • SAS version 6.13 for Mac (1)
  • Stata version 7 for PC (1,2)
  • LISREL 8.5 for PC (2)
  • HLM 5 for PC (2)
  • ARCVIEW version 8 for PC (1,2)
  • ARCINFO version 8 for PC (1,2)

  1. Available at ITS labs (see www.nyu.edu/its/students/labs/ for lab hours and locations).
  2. Available at the ITS Academic Computing Statistics and Mapping Lab at the Third Avenue North Lab. Contact Frank LoPresti (frank.lopresti@nyu.edu) for more information.
  3. Site license which allows resale of software to NYU researchers. Visit www.nyu.edu/its/faculty/software/ or contact Eduardo DeLeon (eduardo.deleon@nyu.edu) for more information.
  4. Available at the NYU Computer Bookstore.

The Statistics and Mapping Group has started an announcements list within NYU Public Forums.

An announcement list, as the name implies, will only send you announcements. This list is not a discussion list. Announcements will contain information about software and hardware at NYU, and other information the Statistics and Mapping Group moderators want to present to our users. Researchers at NYU should feel free to submit material for publication in the list. Send e-mail to ListName@forums.nyu.edu.

To join, send a blank e-mail to join-statistics@forums.nyu.edu. To read published announcements without joining the list, go to http://forums.nyu.edu/cgi-bin/nyu.pl?enter=statistics.




Frank LoPresti heads the Social Sciences, Statistics and Mapping Group of ITS Academic Computing Services.

 
