
Basic SAS programming Claudia von Brömssen

Dept. of Economics,
Unit of applied statistics and mathematics
Reading and writing data, data manipulation

Writing SAS programs and finding errors

SAS programs are written in the SAS editor. To run a program, mark the code (right mouse-click, mark
all code) and click Submit or press F8. Pressing F8 without marking anything runs the entire program,
but usually you will work on small parts at a time and mark only the part you want to run.
Whenever program code is submitted you can check that everything is ok by studying the Log window.
There you will find errors, warnings and notes (read more about this in Errors, Warnings, and Notes (Oh My),
see link at the end of this document).
If the program is correct and you have run a procedure step (PROC), you will get the output in the Output
window. If you run a data step (DATA), there will be no output, but you can check that the data step did
what it was supposed to do by printing the resulting dataset (PROC PRINT data=; run; with the dataset name
inserted after data=).

When you write code it is useful to comment the different parts of the program. To insert a comment, write *
at the beginning of the line and end the comment with a semicolon (;). If you want to write a long comment
or a comment containing ;, write /* to start the comment and */ to end it.
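
The two comment styles can be sketched as follows. This is a minimal illustration; the dataset name example and the variable x are made up for the purpose.

```sas
* Create a small example dataset (this comment ends at the semicolon);
data example;
x = 1;
run;

/* Print the dataset; a semicolon inside this kind of comment
   does no harm, since the comment only ends here */
proc print data=example;
run;
```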

Read data from an external file

Reading an external file can be done in 3 ways:
Use the SAS function IMPORT
Write the data to the SAS editor and read it from there
Read the file using a DATA step and the INFILE statement.


The SAS-function IMPORT

For this, click File -> Import Data to open the import dialog.



Choose the type of file you want to import (in our case Excel), click Next and choose the file by browsing your
folders (observe that the extension of the file is xlsx).
In the next step you choose where the file is saved as a SAS dataset (Choose the SAS destination). You can
choose between:
SASUSER: the file is saved permanently in the SAS library SASUSER
and
WORK: the file is saved temporarily in the SAS library WORK and is deleted when the session is closed.

Usually it is best to choose WORK and to save the dataset permanently once we have seen that the
import was successful.

Furthermore you need to specify the name for the file (MEMBER). Click Next and Finish and your data is
imported. Check the Log window to see if the import was done successfully.
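
The same kind of import can also be written as program code with PROC IMPORT. The following is only a sketch: it assumes the data have been saved as a csv file with the variable names in the first row, the path is a placeholder to be replaced by your own, and the output name work.indata is chosen here for illustration.

```sas
/* Placeholder path: replace it with the location of your own file */
proc import datafile='Z:\SAS\data\observed_values.csv'
    out=work.indata      /* hypothetical name of the resulting SAS dataset */
    dbms=csv             /* file type: comma separated values */
    replace;             /* overwrite work.indata if it already exists */
    getnames=yes;        /* take the variable names from the first row */
run;
```

As with the menu-based import, check the Log window afterwards to see how the variables were read.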











Read data directly from the SAS editor

The data can also be inserted in the SAS editor directly. This is convenient if the data set is not already saved
in a file. Now we have already read our data into SAS but try the following statements anyway (copy and
paste to your editor).

data indata2;
input date : ddmmyy10. type $ concent;
cards;
19/08/2006 A 16
19/08/2006 A 14
19/08/2006 A 19
19/08/2006 A 16
19/08/2006 A 14
19/08/2006 A 15.5
19/08/2006 A 20
19/08/2006 A 21
19/08/2006 A 14
19/08/2006 A 13
19/08/2006 A 17
19/08/2006 B 21
19/08/2006 B 18
19/08/2006 B 17
19/08/2006 B 19.5
19/08/2006 B 23
19/08/2006 B 16
19/08/2006 B 17
19/08/2006 B 12
19/08/2006 B 19
19/08/2006 B 16
19/08/2006 B 22
19/08/2006 B 21.5
26/08/2006 A 18
26/08/2006 A 17.5
26/08/2006 A 22
26/08/2006 A 16
26/08/2006 A 19
26/08/2006 A 21
26/08/2006 A 15
26/08/2006 A 17
26/08/2006 A 21
26/08/2006 A 18
26/08/2006 A 16.5
26/08/2006 A 17
26/08/2006 A 21
26/08/2006 A 19.5
26/08/2006 B 22
26/08/2006 B 25
26/08/2006 B 14
26/08/2006 B 21
26/08/2006 B 20.5
26/08/2006 B 18
26/08/2006 B 17
26/08/2006 B 19
26/08/2006 B 21
26/08/2006 B 25
26/08/2006 B 21
;
run;

In this exercise I call the data set indata2. It is saved in the library WORK. If you want to save it permanently
write
data sasuser.indata2;
and the dataset will be saved in SASUSER and can be used in another session.

The input statement defines which variables are read:
date : ddmmyy10.   reads a date variable that is of type ddmmyy (day, month, year in this order) and consists
                   of 10 characters in total (8 digits and two delimiters)
type $             reads the second column, $ is necessary to indicate that this is a text variable
concent            reads the third column. Since it is a numerical variable no specification is needed

cards; (or datalines;) is used to indicate the beginning of the data input.

The program is ended with run; Mark the entire program from data to run; with the left mouse button
and press F8 (or click Submit) to run the program.
Check the Log window for warnings or errors.

Note: Importing Excel files
If you have problems importing Excel files, convert them to csv files in Excel and try again.

Note: Write data to an external file - EXPORT
Writing a SAS data set to an external file (e.g. Excel, ASCII) is done with SAS EXPORT in a similar way
as with SAS IMPORT. You can also write external files using programming code, but we will not take
this up in this course.


Read data from an external file using the infile statement

We can also use a program to read an external file, but this is usually more complicated even if the structure
is quite similar to reading from the editor. Again it is often easier to read csv files than xls files. (Replace the
path 'Z:\SAS\data\observed_values.csv' with the one that is correct for your file.)

data indata3;
infile 'Z:\SAS\data\observed_values.csv' dlm=',' firstobs=2;
input date : anydtdte20. type $ concent;
run;


To make this program run we need to adjust a few things: We use dlm=',' to indicate that we have
comma (,) as delimiter (between the different variables). firstobs=2 makes the program start reading at the
second row, since the first row contains the headers and cannot be read at the same time as the data.
With input date : anydtdte20. type $ concent; we give all the variable names (date, type and concent) and
indicate the variable types: date is a date value, type a character variable and concent a numerical variable.
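
If the csv file contains empty cells, a possible variant of the program is to add the infile options dsd and truncover. This is only a sketch under that assumption: dsd lets two consecutive delimiters count as a missing value, and truncover stops SAS from moving on to the next row when a row ends early. The dataset name indata3_dsd is made up here, and the path is the same placeholder as above.

```sas
data indata3_dsd;   /* hypothetical dataset name */
/* dsd: consecutive commas are read as a missing value       */
/* truncover: a short row does not spill over to the next row */
infile 'Z:\SAS\data\observed_values.csv' dsd dlm=',' firstobs=2 truncover;
input date : anydtdte20. type $ concent;
run;
```

After running it, check the Log window and print the dataset to verify that the missing values ended up in the right places.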













Note: Missing values in SAS are always coded as a dot (.). If you use the infile statement you will have to replace
all empty cells in Excel with dots. If you read data from the editor, missing observations should be coded as
a dot. PROC IMPORT should be able to handle missing values in external files even if they are not coded as a dot.

Note: If you run Excel with Swedish settings you will get a semicolon-delimited file instead of a comma-
delimited file. You then need to change to dlm=';'. If you have numbers with decimals you will also have to
change the decimal delimiter, which must be a dot (.). This is easiest to do in Excel with the replace function.

Saving your SAS dataset permanently

Often you will want to save your SAS dataset permanently in order to use it again another day (so that you
do not have to import it again). You can do this by

data sasuser.indata2;
set indata2;
run;

The file is then saved in the library SASUSER. This might however be a disadvantage if you have many SAS
files. You would rather save them together with the corresponding project in another folder.
You can then specify your own folder using the libname statement.

libname project 'z:/projects/SAS/';

(project is the name I chose to call the library, z:/projects/SAS/ is the folder.) We then
save the dataset permanently in the folder z:/projects/SAS/ by:

data project.indata2;
set indata2;
run;


Date informats and formats

Write the data from above to the output window:
proc print data=indata2;
run;

or

proc print data=project.indata2;
run;

When you look at the data in the Output window you will see that the date variable is shown as a number
rather than in date format. (Date=0 is the 1st of January 1960 and we count days from there.) To express the
date in date format you can insert the statement:

format date ddmmyy10.;

right after the input statement in the program that reads the data (or in any other data step) and the date
format will be used in all outputs you make after that. Do this for the program where you read indata2 and
run this program again. Print the data to see the difference.

If you do not want to use this format all the time you can define the date format in a procedure step, like
this:
proc print data=indata2;
format date yymmdd10.;
run;

The format is used when the data is printed to the output window, but not after that.

Here are some other date formats:

DDMMYYw.    day, month, year (w. gives the number of characters, above we use 10, try 8 and check the
            difference)
YYMMDDw.    year, month, day (often used in Sweden)
MMDDYYw.    month, day, year
DATEw.      day, month abbreviation, year
DAYw.       day of month
DOWNAMEw.   name of the day of the week
JULDAYw.    day of the year (e.g. 32 for the 1st of February)
ANYDTDTEw.  (informat) we used this to read data from a file. SAS tries to find a suitable date format by itself.

Note: Since SAS internally always handles dates as numeric you can easily compute differences between
dates and get the result as the number of days between the two dates.
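
Because dates are stored as numbers, the number of days between two dates is a plain subtraction. A minimal sketch using the two sampling dates from the example data; the dataset and variable names are made up for illustration.

```sas
data datediff;                    /* hypothetical dataset name */
first_date  = '19AUG2006'd;       /* date literals: 'ddMONyyyy'd */
second_date = '26AUG2006'd;
days = second_date - first_date;  /* number of days between the dates: 7 */
format first_date second_date ddmmyy10.;
run;

proc print data=datediff;
run;
```

The variable days is left without a date format, since it is a count of days and not a date.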


Data manipulations and new variables

We can compute new variables and manipulate our data in SAS. Many mathematical, logical and date
functions are available. We always have to do this in a data step, like this:

data concentration_data1;
set concentration_data;
(data manipulations);
run;

The following operators might be useful:

Arithmetic Operators in SAS
*     multiplication
+     addition
/     division
-     subtraction
**    exponentiation

Mathematical functions:
log      natural logarithm
log10    logarithm to the base 10
sqrt     square root
round    round the argument to a specified level

Comparison Operators:
=  or eq    equal to
^= or ne    not equal to
>  or gt    greater than
>= or ge    greater than or equal to
<  or lt    less than
<= or le    less than or equal to

Boolean Operators:
&  or and    and
|  or or     or
^  or not    negation


Functions that work together with a date variable
month()      extracts month
day()        extracts day
year()       extracts year
weekday()    extracts day of week
Here are some examples of data manipulations:

logconcent=log(concent);
    creates a new variable logconcent that is the natural logarithm of the variable concent

if concent>16 then variableB="Yes";
    creates a new variable variableB that is set to Yes if concent is greater than 16
    (if concent is not greater than 16, variableB will be missing)

if concent>16 then variableB="Yes";
else variableB="No";
    same as above, but if concent is not greater than 16 the new variable is set to No
    instead of being missing

if concent>=13 then output;
    only observations that have concent 13 or higher will be kept in the dataset
    (then output can be omitted)

if concent<13 then delete;
    same as above

if concent<13 & type='B' then delete;
    all observations that have concent less than 13 and treatment type B are deleted

year=year(date);
    creates a new variable called year that contains the year of the date
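
Several of these statements can be combined in one data step. The sketch below uses four rows taken from the example data above. The dataset names example and example_new and the variable name high are made up for illustration.

```sas
/* A few rows in the style of indata2 */
data example;
input date : ddmmyy10. type $ concent;
format date ddmmyy10.;
datalines;
19/08/2006 A 16
19/08/2006 B 12
26/08/2006 A 17.5
26/08/2006 B 22
;
run;

data example_new;
set example;
logconcent = log(concent);                 /* natural logarithm */
if concent > 16 then high = "Yes";         /* high: hypothetical new variable */
else high = "No";
year = year(date);                         /* extract the year from the date */
if concent < 13 & type = 'B' then delete;  /* drops the row with type B and concent 12 */
run;

proc print data=example_new;
run;
```

The printed dataset should contain three of the four rows, with the new variables logconcent, high and year added.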


Set and Merge

If you have two datasets containing the same variables and you want to concatenate them use the set
statement in a data step.

data concatdatasets;
set dataset1 dataset2;
run;

If you instead have two datasets with different variables, but one of the variables is in common (e.g. date or
id), you can combine the two datasets by merging them according to the common variable.

data datasetcombined;
merge dataset1 dataset3;
by id;
run;
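
A merge with a by statement expects both datasets to be sorted by the common variable, so it is usual to run PROC SORT first. A minimal sketch with two small made-up datasets (the values are for illustration only):

```sas
/* Two small hypothetical datasets that share the variable id */
data dataset1;
input id concent;
datalines;
2 14
1 16
;
run;

data dataset3;
input id type $;
datalines;
1 A
2 B
;
run;

/* Sort both datasets by the common variable before merging */
proc sort data=dataset1; by id; run;
proc sort data=dataset3; by id; run;

data datasetcombined;
merge dataset1 dataset3;
by id;
run;

proc print data=datasetcombined;
run;
```

Each row of the combined dataset then holds id, concent and type for one id.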




Literature:
Fifty Ways to Lose Your Data (and How to Avoid Them)
http://support.sas.com/resources/papers/proceedings09/134-2009.pdf

Errors, Warnings, and Notes (Oh My): A Practical Guide to Debugging SAS Programs
http://susanslaughter.files.wordpress.com/2009/02/debug1.pdf
