1. The data set
2. A SAS program
3. Running the SAS program
4. SAS output
5. Debugging
1. The data set
SAS is used to analyze data such as the data set
'student data' (~sasclass/data/student.data), part of which is
shown here:
ALFRED M 14 69.0 112.5
ALICE F 13 56.5 84.0
BARBARA F 13 65.3 98.0
CAROL F 14 62.8 102.5
It may help to briefly review the jargon used to refer to a data set:
observations, variables and values. Using the student data set:
each line is an observation: ALFRED M 14 69.0 112.5
each column represents a variable. The first 8 columns are the
variable NAME; the values of NAME are the entries such as
ALFRED, ALICE...
Numeric variables have values which are numbers only (the digits 0-9,
with an optional sign and decimal point).
Numeric variables above are: age, height and weight
Character variables may contain both numbers and letters.
Character variables above are: name and sex
2. A SAS program
A sas program is a file of sas commands or sas language which will
perform tasks on a data set. A sas program can be created using
any text editor such as (under unix) pico, ted, vi, or emacs:
mead% pico sample.sas
A sas program consists of sas keywords and data set and variable names
which you choose, combined into sas statements.
sas rules:
- data set names and variable names must
    begin with a letter or _ (underscore)
    contain 8 or fewer characters (the limit in SAS release 6;
    later releases allow longer names)
- sas statements end in a semi-colon
- position and case are not important
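The rules above can be illustrated with a small sketch. The data set name 'kids' is made up for this example, and the two data lines are taken from the student data in section 1; the data are placed in the program itself (after a cards statement) so that the sketch does not depend on any file. The layout and capitalization are deliberately irregular to show that sas does not care about either:

```sas
/* Sketch only: 'kids' is a hypothetical data set name. */
DATA kids;  INPUT name $ age;    /* two statements on one line, upper case */
cards;
ALFRED 14
ALICE 13
;
Proc Print
     data=KIDS;                  /* one statement over two lines, mixed case */
run;
```

Each statement still ends in a semi-colon, and both names used (kids, name, age) begin with a letter and have 8 or fewer characters.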
A sample sas program 'sample.sas' looks like this:
/* a comment: the program name: sample.sas */
options noovp nocenter ls=80;
filename in '~sasclass/data/student.data';
data students;
infile in;
input name $ sex $ age height weight;
proc print;
An English translation of the sas statements:
/* a comment...*/
A comment is not executed and is used to place notes in your program.
A comment begins with slash star /* and ends with star slash */.
options noovp nocenter ls=80;
The options statement sets the environment for your sas run, in this
case noovp is 'no overprint', making error messages easier to read on
screen; 'nocenter' tells sas not to center the output and 'ls=80'
tells sas to limit output to 80 columns.
filename in '~sasclass/data/student.data';
The filename statement creates a nickname of your choice (sas calls
it a fileref), in this case 'in', to represent the real data file
(~sasclass/data/student.data). The nickname is used later in the
program (in the infile statement) to refer to the data file.
data students;
The data statement names the sas data set to be created, 'students',
and starts the data step. The sas data step is used to read and
modify data.
infile in;
The infile statement is part of the sas data step and tells sas
which data file to read, here the file represented by 'in'.
input name $ sex $ age height weight;
The input statement is part of the sas data step. It describes
the format of the data file to be read and names the variables.
A dollar sign after a variable name (name and sex) specifies that
the variable is 'character' and may contain letters as well as digits.
proc print;
The proc print; statement ends the data step and calls the print
procedure to print the sas data set 'students'. SAS procedures
use the most recently created data set unless otherwise specified.
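As a sketch of how the data set can be "otherwise specified", the version below names the data set with the data= option of proc print and ends each step with an explicit run; statement. Neither is used in sample.sas, and the program should behave the same way without them; they are shown here only as a common, more explicit way of writing the same thing:

```sas
/* A more explicit variant of sample.sas (sketch, same file and names) */
options noovp nocenter ls=80;
filename in '~sasclass/data/student.data';
data students;
infile in;
input name $ sex $ age height weight;
run;                         /* explicitly ends the data step */
proc print data=students;    /* names the data set to print   */
run;                         /* explicitly ends the proc step */
```

Naming the data set becomes useful once a program creates more than one data set, because the most recently created one may not be the one you want printed.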
3. Running the SAS program
To run a sas program type 'sas' at the system prompt followed by
the name of your sas program (sample.sas in this example):
mead% sas sample.sas
4. SAS output
Running your sas program will create a sas log named after
your program e.g. "sample.log". The sas log contains notes
(e.g. how many observations processed), warnings (e.g. missing
values, divide by zero etc.) and errors (e.g. variable name too long
or syntax errors).
Running your sas program will create a 'lst' file also named
after your program e.g. "sample.lst" if you requested output in
the form of a procedure (e.g. proc print) and if the program runs
successfully. The 'lst' file will contain the output from procs.
You can look for the files after the sas run using commands to list
files (e.g. 'ls' under unix):
mead% ls sample.*
and you can view the contents of the files using an editor or pager:
mead% pico sample.log
or
mead% less sample.log
After your program runs you will want to look at the log to make
sure there are no syntax errors or typos. If the log is ok then the
next step is to look at the listing file 'lst' to make sure the
program did what you want. If errors are discovered, edit the
program, make the changes and rerun the program, repeating the cycle.
5. Debugging
It usually takes a few runs to get the program to run correctly even
after years of practice. Typos or missing semi-colons are the usual
problem. The sas log (sample.log) will contain information to help
debug your sas program. Examine the sas log after a run using an editor
(e.g. pico) or pager (e.g. less):
mead% less sample.log
As an example, the program 'sample.sas' was copied to another filename
'sample_with_errors.sas' and an error introduced by omitting the
semi-colon from the end of the filename statement:
filename in '~sasclass/data/student.data'
The log of the run, 'sample_with_errors.log' is included below.
The log shows date, time and other header information then
lists the program as it was executed and underlines statements
which sas did not understand.
sample_with_errors.log:
1 The SAS System 12:15 Monday, January 17, 1994
NOTE: Copyright(c) 1989 by SAS Institute Inc., Cary, NC USA.
NOTE: SAS (r) Proprietary Software Release 6.07 TS104
Licensed to UNIVERSITY OF WASHINGTON, Site 0006158005.
NOTE: Running on IBM Model RS/6000 Serial Number 000015086600.
Welcome to SAS on Mead! For help with problems, please
call 543-7054 or send electronic mail to help@cac.
NOTE: AUTOEXEC processing beginning; file is /usr/local/sas/autoexec.sas.
NOTE: SAS initialization used:
real time 0.402 seconds
cpu time 0.250 seconds
NOTE: AUTOEXEC processing completed.
1 /* a comment: the program name: sample.sas */
2 options noovp nocenter ls=80;
3 filename in '~sasclass/data/student.data'
4 data students;
____
23
ERROR 23-2: Invalid option name DATA.
4 data students;
________
23
ERROR: Error in the LIBNAME or FILENAME statement.
5 infile in;
______
180
ERROR 23-2: Invalid option name STUDENTS.
ERROR 180-322: Statement is not valid or it is used out of proper order.
6 input name $ sex $ age height weight;
_____
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
7 proc print;
ERROR: There is not a default input data set (_LAST_ is _NULL_).
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used:
real time 0.022 seconds
cpu time 0.020 seconds
ERROR: Errors printed on page 1.
2 The SAS System 12:15 Monday, January 17, 1994
NOTE: The SAS System used:
real time 0.477 seconds
cpu time 0.300 seconds
NOTE: SAS Institute Inc., SAS Circle, PO Box 8000, Cary, NC 27512-8000
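From this sas log it may not be evident what the problem is. The best
strategy is to look at the statement just above the first error. Here
the first error is flagged on line 4 (data students;), and the
statement above it, the filename statement on line 3, is missing its
semi-colon. Without the semi-colon sas reads 'data students' as part
of the filename statement (hence "Invalid option name DATA"), so the
data step never starts and every later statement fails as a result.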
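The fix can be sketched as a before and after of the two lines involved. The comment shows how sas presumably reads the broken version, as one long statement; the lines below it are the corrected version from sample.sas:

```sas
/* Broken: with no semi-colon, lines 3 and 4 run together into one statement:
      filename in '~sasclass/data/student.data' data students;
   Fixed: the semi-colon ends the filename statement, so the data
   statement is recognized and starts the data step.                       */
filename in '~sasclass/data/student.data';
data students;
```

Because one missing semi-colon produced every error in the log, it is usually worth fixing the first problem and rerunning the program before trying to interpret the later messages.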