
MAKE SURE YOU RESET THE DATE OF YOUR PC BEFORE

STARTING SAS IF EXPIRED

Tips and Tricks of SAS

1) Add personal comments:

/* Testing the example */
or
* Testing the example;

2) Add title to a program:


Add the word TITLE, e.g.

TITLE ' Repeated measure analysis example';

Also add a footnote:

FOOTNOTE 'this is the footnote';

Cancel all titles with an empty TITLE statement:

TITLE;

To label variables use

LABEL short-variable-name = 'anything you want';

e.g.,

LABEL tree = 'Tree Name'
      cult = 'Carambola Cultivar Name';

3) To read from a file, add an INFILE statement after naming your data set:

DATA name;
INFILE 'file.dat';
INPUT list-of-variables;

If typing the data directly into SAS, use

DATA name;
INPUT list-of-variables;
CARDS;
Type all data here, in the order of the variables, followed by a semicolon (;) on its own line.
4) Character variables are followed by $
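
As a minimal sketch of tips 3 and 4 together, the step below reads one character and one numeric variable from in-line data. The data set name fruit, the variable names, and the values are made up for illustration.

```sas
* Made-up data for illustration only;
DATA fruit;
   INPUT cult $ weight;   * cult is character, so it is followed by $;
   CARDS;
arkin 120
golden 95
;
RUN;

PROC PRINT DATA = fruit;
RUN;
```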

5) You can have the data read in columns, rather than rows: P. 36-37.

6) Use @@ at the end of the INPUT variable list to make SAS hold the current data
line and keep reading observations from it, instead of starting a new line for each
observation (P. 46).

INPUT variables @@;
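
A small sketch of how @@ behaves, assuming made-up variable names id and score. Each data line holds several observations, and this step should create five observations from two lines.

```sas
* Made-up data: several observations per line;
DATA scores;
   INPUT id score @@;     * @@ holds the line so SAS keeps reading pairs from it;
   CARDS;
1 20 2 23 3 19
4 32 5 30
;
RUN;

PROC PRINT DATA = scores;
RUN;
```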

7) To ask SAS to start from the 3rd line of your data (or any line) use
FIRSTOBS=:

INFILE 'name.dat' FIRSTOBS = 3;

To stop early use the option OBS = n, where n is the last line to read.

INFILE 'name.dat' FIRSTOBS = 3 OBS = 23; (reads lines 3 through 23, i.e. 21 lines)

MISSOVER tells SAS not to go to the next line for more values when a line runs out
(the remaining variables are set to missing), PAD pads short lines with blanks for
column or formatted input, and DLM= names the delimiter, e.g. DLM = ',' for
comma-separated data. P. 50-51
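
The INFILE options can be combined on one statement. The sketch below assumes a hypothetical comma-separated file trees.csv with a header line, and made-up variable names; adjust both to your own file.

```sas
* trees.csv and the variable names are hypothetical;
DATA trees;
   INFILE 'trees.csv' DLM = ',' FIRSTOBS = 2 MISSOVER;
   INPUT tree $ cult $ height;   * FIRSTOBS = 2 skips the header line;
RUN;
```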

8) List content of a data set that is ready to be analysed, just type:

PROC CONTENTS DATA = name;

P. 52

9) Create a LIBNAME for a library of data sets: use,

LIBNAME libraryname 'c:\sas\rashid';    (the path in quotes is the directory name)
DATA libraryname.another-name-for-data-set;
INFILE 'dataset-name.dat';

The resulting data set is named libraryname.another-name-for-data-set (e.g.
TREE.CARAMBOLE).
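
Put together, a permanent data set could be created as sketched below. The raw file name carambole.dat and the variables cult and weight are assumptions for illustration; the directory is the one used above.

```sas
* The raw file and variable names are hypothetical;
LIBNAME tree 'c:\sas\rashid';

DATA tree.carambole;            * stored in the TREE library as TREE.CARAMBOLE;
   INFILE 'carambole.dat';
   INPUT cult $ weight;
RUN;
```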

P. 56-57

10) IF-THEN use:

IF condition THEN action;

Use IN to compare the variable value to a list of values.

IF cultivar IN ('arkin', 'golden star') THEN tree = 'carambola';

SAS sets tree to 'carambola' for every observation whose cultivar is 'arkin' or
'golden star'.

You can have multiple actions by using DO and END:

IF condition THEN DO;
   action-1;    (e.g. tree = 'carambola';)
   action-2;    (e.g. fruit = 'sweet';)
   action-n;
END;

You can also combine conditions with AND or OR, e.g.:

IF condition AND condition THEN action;

P. 72-73
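
A runnable sketch of IN with a DO group, using made-up data. The cultivar names beyond 'arkin' and 'golden star', the brix variable, and the LENGTH values are assumptions; cultivar is read from columns 1-12 so that names containing a space stay in one piece.

```sas
* Made-up cultivar names and values for illustration only;
DATA orchard;
   LENGTH cultivar $ 12 tree $ 9 fruit $ 5;
   INPUT cultivar $ 1-12 brix;
   IF cultivar IN ('arkin', 'golden star') THEN DO;
      tree = 'carambola';
      fruit = 'sweet';
   END;
   CARDS;
arkin       9.5
golden star 8.7
tommy       14.0
;
RUN;

PROC PRINT DATA = orchard;
RUN;
```

Observations that do not match the list keep blank values for tree and fruit.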

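Group the data into categories using:

IF condition THEN action;
ELSE IF condition THEN action;
ELSE action;

SAS tests the conditions in order and stops at the first one that is true, so list
them from the most restrictive to the least, e.g.

LENGTH age $ 6;
IF tree >= 100 THEN age = 'old';
ELSE IF tree >= 50 THEN age = 'medium';
ELSE age = 'young';

(The LENGTH statement keeps SAS from truncating 'medium' to the 3 characters of 'old',
the first value assigned.)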

P. 74-75

Also use IF to subset data, e.g.:

IF expression;    for example IF sex = 'f'; (this keeps only the female data)

Another way to do this is

IF expression THEN DELETE;    for example IF sex = 'm' THEN DELETE;
(this drops the male data)
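
As a sketch, a subsetting IF is usually placed in a DATA step that reads an existing data set with SET. The data set people and its values are made up; the new data set females should contain two observations.

```sas
* Made-up data for illustration only;
DATA people;
   INPUT name $ sex $;
   CARDS;
ana f
ben m
cat f
;
RUN;

DATA females;
   SET people;
   IF sex = 'f';          * subsetting IF keeps only the matching observations;
RUN;
```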

p. 76

11) A SAS date is the number of days since January 1, 1960.
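
A quick way to see this is to write a few date constants to the log. In this sketch the variable names d0 and d1 are arbitrary; the first PUT should show d0=0 and d1=365, and the DATE9. format turns the number back into a readable date.

```sas
DATA _NULL_;
   d0 = '01JAN1960'D;     * day zero of the SAS calendar;
   d1 = '31DEC1960'D;     * 365 days later, because 1960 was a leap year;
   PUT d0= d1=;
   PUT d1= DATE9.;
RUN;
```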

P. 78

12) Use shortcuts:

- Numbered range lists: INPUT cat1 - cat5; (use 1 dash)

This is for variables with the same base name that are numbered in sequence.

- Name range lists: PUT y -- b; (use 2 dashes)

This covers every variable from y to b in the order they were defined, so use it only
if you know that order, e.g. after INPUT y a c h b;

P. 86

13) Subset data in a PROC with WHERE:

WHERE condition;

This way SAS uses only the observations that meet the condition,

e.g. WHERE rain >= 50; (only consider rain equal to or more than 50 inches)
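
A minimal sketch with made-up rainfall data: only the sites with at least 50 inches of rain (north and east here) are printed, and the data set itself is left unchanged.

```sas
* Made-up data for illustration only;
DATA weather;
   INPUT site $ rain;
   CARDS;
north 62
south 41
east 50
;
RUN;

PROC PRINT DATA = weather;
   WHERE rain >= 50;
RUN;
```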

P. 92

14) Sort data with PROC SORT:

PROC SORT;
BY variable-1 … variable-n;

e.g.,

PROC SORT;
BY tree cultivar;

SAS will sort the data by tree value, then by cultivar value within each tree, etc.

You can write the sorted data to a new data set with the OUT= option, e.g.

PROC SORT DATA = name-of-data-set OUT = new-data-set-name;
BY variables;

If you leave out DATA=, SAS sorts the most recently created data set:

PROC SORT OUT = new-data-set-name;
BY variables;

You can add options to the PROC SORT line, such as NODUPKEY (drop observations with
duplicate BY values), or to the BY line, such as DESCENDING before a variable name
(reverse order).
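
The options can be combined as in the sketch below, which uses made-up data. The duplicate mango/keitt row should be dropped by NODUPKEY, and the original data set orchard stays as it was because the result goes to sorted.

```sas
* Made-up data for illustration only;
DATA orchard;
   INPUT tree $ cultivar $ yield;
   CARDS;
mango keitt 40
mango keitt 42
lychee mauritius 35
;
RUN;

PROC SORT DATA = orchard OUT = sorted NODUPKEY;
   BY tree DESCENDING cultivar;
RUN;

PROC PRINT DATA = sorted;
RUN;
```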

P. 94

15) PROC PRINT to print the data

To print all variables with their labels:

PROC PRINT LABEL;

Also use:
BY variable-list; to print a separate section for each BY group (the data must already
be sorted by these variables).
SUM variable-list; to get the sum of values

P. 96

16) DATA summary with PROC MEANS:

For summary info use:

PROC MEANS options;

List of options page 106

17) Examine data with PROC FREQ:

Cross-tabulate variables against each other with

PROC FREQ;
TABLES variable-combinations; (e.g. TABLES sex * age;)

P. 110

18) Plot data:

PROC PLOT;
PLOT vertical-variable * horizontal-variable; (plot Y * X)

e.g. (where '#' and 'x' are the plotting symbols, used instead of the letters SAS picks)

PLOT sex * length = '#' sex * age = 'x'; (2 plots)

Also, to tell SAS to use the 1st letter of the values of a variable as the plotting
symbol (e.g. a character variable named DATA with values banana, carambola, etc.):

PLOT sex * length = DATA;

You can also add the option / OVERLAY at the end of the PLOT statement to put
2 plots in one.

P. 112

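19) Combine 2 data sets into one using the SET statement:

Create a data set called both from set1 and set2:

DATA both;
SET set1 set2;

You can add an IF statement to subset the new data set, or a BY statement to interleave
the observations in sorted order (set1 and set2 must already be sorted by the BY variables).
P. 118-120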

20) Merge 2 data sets that share a common variable:

(first create both data sets and sort each by the common variable) then

DATA both;
MERGE set1 set2;
BY age;
PROC PRINT;
RUN;
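
A complete sketch of the merge, with made-up values. The variables height and weight are assumptions; the key point is that both data sets are sorted by age before the MERGE, so each row of both ends up with age, height, and weight.

```sas
* Made-up data for illustration only;
DATA set1;
   INPUT age height;
   CARDS;
12 150
10 138
;
RUN;

DATA set2;
   INPUT age weight;
   CARDS;
10 32
12 41
;
RUN;

PROC SORT DATA = set1; BY age; RUN;
PROC SORT DATA = set2; BY age; RUN;

DATA both;
   MERGE set1 set2;
   BY age;                * match observations that share the same age;
RUN;

PROC PRINT DATA = both;
RUN;
```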

P. 122

21) PROC UNIVARIATE for all data basic stats:

PROC UNIVARIATE;
VAR variable-list;

You can have other options like normality test and plot, e.g.:
PROC UNIVARIATE PLOT NORMAL;

p. 146

22) Correlations with PROC CORR:

PROC CORR; (this correlates all numeric variables, or choose with)
VAR variable-list;
WITH variable-list;

e.g.,

PROC CORR;
VAR sex;
WITH age;
p. 148

23) Linear regression with PROC REG:

PROC REG;
MODEL dependent = independent;
(also add)
PLOT y-variable * x-variable = 'symbol';

P. 152

24) ONE Way ANOVA:

requires CLASS and MODEL statements:

PROC ANOVA;
CLASS variable-list; (this comes 1st)
MODEL dependent = effects; (effect is usually the class variable)
Options like
MEANS effects / options;

e.g.,
CLASS team;
MODEL height = team;
MEANS team/ DUNCAN; (for Duncan separation).

p. 154

Example of a SAS program to analyze some data using ANOVA with
treatment mean separation by LSD.

TITLE 'Mango cuttings';


DATA mango;
INPUT tree $ wood $ week1 week2;
CARDS;
y s 20 22
y s 23 25
y s 19 20
y h 32 35
y h 30 34
y h 31 33
m s 22 21
m s 23 22
m s 25 22
m h 32 35
m h 35 36
m h 32 38
;
PROC print;
PROC GLM;
CLASS tree wood;
MODEL week1 week2 = tree wood tree*wood;
MEANS tree wood / LSD;
LSMEANS tree*wood / PDIFF STDERR;
PROC SORT; BY tree;
PROC MEANS MEAN STDDEV STDERR; by tree;
PROC SORT; BY wood;
PROC MEANS MEAN STDDEV STDERR; by wood;
RUN;

Copy the above code and paste it in SAS, then click RUN.

So what this code means:

TITLE 'Mango cuttings';
The title of this program is Mango cuttings.
DATA mango;
The name of this data set is mango (keep it to 8 characters or fewer).
INPUT tree $ wood $ week1 week2;
INPUT lists the variables you have. A variable that holds letters needs $ after its name, e.g.
wood $; the weeks are numbers, so they need no $. The list must be in the same order as the
columns of your data.
CARDS;
CARDS tells SAS that the data are listed next.
y s 20 22

m h 32 38
;
The data must end with a semicolon on its own line, and every statement ends with a semicolon
too. If a value is missing, replace it with a dot (.)
PROC PRINT;
PROC stands for procedure. PROC PRINT prints the above data set so you can check that the
numbers are correct.
PROC GLM;
GLM is the General Linear Models procedure, used here for the ANOVA.
CLASS tree wood;
For ANOVA, PROC GLM must be followed by CLASS, which lists the controlled variables, in this
case tree and wood type.
MODEL week1 week2 = tree wood tree*wood;
GLM also needs a MODEL statement that lists the measured variables = controlled variables. In
this code the effects are tree, wood, and the interaction of tree with wood (tree*wood).
MEANS tree wood / LSD;
MEANS calculates the means, sorts them from high to low, and assigns letters based on the mean
separation analysis. Here the means of tree and wood are separated using LSD. Means that share
the same letter in the output are not significantly different.
LSMEANS tree*wood / PDIFF STDERR;
LSMEANS is used to see whether differences between the interaction means are significant.
PROC SORT; BY tree;
This procedure sorts the data by tree.
PROC MEANS MEAN STDDEV STDERR; BY tree;
This calculates the mean, standard deviation, and standard error of each week separately for
each tree.
PROC SORT; BY wood;
This sorts the data by wood type.
PROC MEANS MEAN STDDEV STDERR; BY wood;
This calculates the mean, standard deviation, and standard error of each week separately for
each wood type.
RUN;
RUN is the final statement in each SAS program; you must have it for the program to run. Make
sure you add a semicolon (;) at the end of each statement.

SAS OUTPUT

RESULT OF PROC PRINT (PRINTED DATA SET)


Mango cuttings 00:09 Friday, December 12, 1998 12

Obs tree wood week1 week2

1 y s 20 22
2 y s 23 25
3 y s 19 20
4 y h 32 35
5 y h 30 34
6 y h 31 33
7 m s 22 21
8 m s 23 22
9 m s 25 22
10 m h 32 35
11 m h 35 36
12 m h 32 38

THIS SHOWS THE CLASS VARIABLES, THEIR LEVELS, AND THE NUMBER OF OBSERVATIONS

The GLM Procedure

Class Level Information

Class Levels Values

tree 2 m y

wood 2 h s

Number of observations 12

This is ANOVA for week 1 BY PROC GLM FOR TREE, WOOD, &
Interaction. NOTICE tree and wood are significantly different
(Pr > F less than 0.05); the interaction is not.

The GLM Procedure

Dependent Variable: week1

                              Sum of
Source             DF        Squares    Mean Square    F Value    Pr > F

Model               3    316.6666667    105.5555556      39.58    <.0001
Error               8     21.3333333      2.6666667
Corrected Total    11    338.0000000

R-Square    Coeff Var    Root MSE    week1 Mean

0.936884     6.048123    1.632993      27.00000

Source             DF      Type I SS    Mean Square    F Value    Pr > F

tree                1     16.3333333     16.3333333       6.12    0.0384
wood                1    300.0000000    300.0000000     112.50    <.0001
tree*wood           1      0.3333333      0.3333333       0.12    0.7328

Source             DF    Type III SS    Mean Square    F Value    Pr > F

tree                1     16.3333333     16.3333333       6.12    0.0384
wood                1    300.0000000    300.0000000     112.50    <.0001
tree*wood           1      0.3333333      0.3333333       0.12    0.7328
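
The summary numbers in this table can be checked by hand from the sums of squares. The relations below are the standard ANOVA definitions, written out with the week1 values printed above; the symbols MS and SS are shorthand for the Mean Square and Sum of Squares columns.

```latex
\begin{align*}
F_{\text{Model}} &= \frac{\text{MS}_{\text{Model}}}{\text{MS}_{\text{Error}}}
                  = \frac{105.5555556}{2.6666667} \approx 39.58 \\
R^2 &= \frac{\text{SS}_{\text{Model}}}{\text{SS}_{\text{Corrected Total}}}
     = \frac{316.6666667}{338.0000000} \approx 0.936884 \\
\text{Coeff Var} &= 100 \times \frac{\text{Root MSE}}{\text{week1 Mean}}
                  = 100 \times \frac{1.632993}{27.00000} \approx 6.048
\end{align*}
```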

This is ANOVA for week 2 BY PROC GLM FOR TREE, WOOD, &
Interaction. NOTICE only wood is significantly different
(Pr > F less than 0.05).

The GLM Procedure

Dependent Variable: week2

                              Sum of
Source             DF        Squares    Mean Square    F Value    Pr > F

Model               3    528.9166667    176.3055556      70.52    <.0001
Error               8     20.0000000      2.5000000
Corrected Total    11    548.9166667

R-Square    Coeff Var    Root MSE    week2 Mean

0.963565     5.531681    1.581139      28.58333

Source             DF      Type I SS    Mean Square    F Value    Pr > F

tree                1      2.0833333      2.0833333       0.83    0.3880
wood                1    520.0833333    520.0833333     208.03    <.0001
tree*wood           1      6.7500000      6.7500000       2.70    0.1390

Source             DF    Type III SS    Mean Square    F Value    Pr > F

tree                1      2.0833333      2.0833333       0.83    0.3880
wood                1    520.0833333    520.0833333     208.03    <.0001
tree*wood           1      6.7500000      6.7500000       2.70    0.1390

This is MEANS SEPARATION OF TREE BY LSD FOR WEEK 1

The GLM Procedure

t Tests (LSD) for week1

NOTE: This test controls the type I comparisonwise error rate not the experimentwise error rate.

Alpha 0.05
Error Degrees of Freedom 8
Error Mean Square 2.666667
Critical Value of t 2.30600
Least Significant Difference 2.1741

Means with the same letter are not significantly different.

t Grouping Mean N tree

A 28.1667 6 m

B 25.8333 6 y
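
The Least Significant Difference reported above follows from the other numbers in the listing. As a worked check, using the usual LSD formula for two means that each have n = 6 observations (the N column), with MSE standing for the Error Mean Square:

```latex
\begin{align*}
\text{LSD} &= t_{0.05,\,8} \times \sqrt{\frac{2\,\text{MSE}}{n}}
            = 2.30600 \times \sqrt{\frac{2 \times 2.666667}{6}} \\
           &\approx 2.30600 \times 0.942809 \approx 2.1741
\end{align*}
```

The two tree means differ by 28.1667 - 25.8333 = 2.3334, which is larger than 2.1741, so they receive different letters.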

This is MEANS SEPARATION OF TREE BY LSD FOR WEEK 2

The GLM Procedure

t Tests (LSD) for week2

NOTE: This test controls the type I comparisonwise error rate not the experimentwise error rate.

Alpha 0.05
Error Degrees of Freedom 8
Error Mean Square 2.5
Critical Value of t 2.30600
Least Significant Difference 2.1051

Means with the same letter are not significantly different.

t Grouping Mean N tree

A 29.0000 6 m
A
A 28.1667 6 y

This is MEANS SEPARATION OF WOOD BY LSD FOR WEEK 1

The GLM Procedure

t Tests (LSD) for week1

NOTE: This test controls the type I comparisonwise error rate not the experimentwise error rate.

Alpha 0.05
Error Degrees of Freedom 8
Error Mean Square 2.666667
Critical Value of t 2.30600
Least Significant Difference 2.1741

Means with the same letter are not significantly different.

t Grouping Mean N wood

A 32.0000 6 h

B 22.0000 6 s

This is MEANS SEPARATION OF WOOD BY LSD FOR WEEK 2

The GLM Procedure

t Tests (LSD) for week2

NOTE: This test controls the type I comparisonwise error rate not the experimentwise error rate.

Alpha 0.05
Error Degrees of Freedom 8
Error Mean Square 2.5
Critical Value of t 2.30600
Least Significant Difference 2.1051

Means with the same letter are not significantly different.

t Grouping Mean N wood

A 35.1667 6 h

B 22.0000 6 s

This is LSMEANS SEPARATION OF INTERACTION FOR WEEK 1 & 2

The GLM Procedure


Least Squares Means

                               Standard                LSMEAN
tree  wood    week1 LSMEAN        Error    Pr > |t|    Number

m     h         33.0000000    0.9428090      <.0001         1
m     s         23.3333333    0.9428090      <.0001         2
y     h         31.0000000    0.9428090      <.0001         3
y     s         20.6666667    0.9428090      <.0001         4

Least Squares Means for effect tree*wood
Pr > |t| for H0: LSMean(i)=LSMean(j)

Dependent Variable: week1

i/j          1          2          3          4

1                  <.0001     0.1720     <.0001
2       <.0001                0.0004     0.0805
3       0.1720     0.0004                <.0001
4       <.0001     0.0805     <.0001

                               Standard                LSMEAN
tree  wood    week2 LSMEAN        Error    Pr > |t|    Number

m     h         36.3333333    0.9128709      <.0001         1
m     s         21.6666667    0.9128709      <.0001         2
y     h         34.0000000    0.9128709      <.0001         3
y     s         22.3333333    0.9128709      <.0001         4

Least Squares Means for effect tree*wood
Pr > |t| for H0: LSMean(i)=LSMean(j)

Dependent Variable: week2

i/j          1          2          3          4

1                  <.0001     0.1083     <.0001
2       <.0001                <.0001     0.6195
3       0.1083     <.0001                <.0001
4       <.0001     0.6195     <.0001

NOTE: To ensure overall protection level, only probabilities associated with pre-planned
comparisons should be used.

This is PROC MEANS OUTPUT TO CALCULATE MEAN, STANDARD DEVIATION
AND STANDARD ERROR FOR TREE

--------------------------------------------- tree=m ---------------------------------------------

The MEANS Procedure

Variable Mean Std Dev Std Error


--------------------------------------------------------
week1 28.1666667 5.4924190 2.2422707
week2 29.0000000 8.0993827 3.3065591
--------------------------------------------------------

--------------------------------------------- tree=y ---------------------------------------------

Variable Mean Std Dev Std Error


--------------------------------------------------------
week1 25.8333333 5.8452260 2.3863035
week2 28.1666667 6.6156380 2.7008229
--------------------------------------------------------
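
The Std Error column is the standard deviation divided by the square root of the number of observations in the group. As a worked check for week1 of tree m, assuming n = 6 observations per tree as in the data listing:

```latex
\begin{align*}
\text{Std Error} = \frac{\text{Std Dev}}{\sqrt{n}}
                 = \frac{5.4924190}{\sqrt{6}} \approx 2.2423
\end{align*}
```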

This is PROC MEANS OUTPUT TO CALCULATE MEAN, STANDARD DEVIATION
AND STANDARD ERROR FOR WOOD

--------------------------------------------- wood=h ---------------------------------------------

The MEANS Procedure

Variable Mean Std Dev Std Error


--------------------------------------------------------
week1 32.0000000 1.6733201 0.6831301
week2 35.1666667 1.7224014 0.7031674
--------------------------------------------------------

--------------------------------------------- wood=s ---------------------------------------------

Variable Mean Std Dev Std Error


--------------------------------------------------------
week1 22.0000000 2.1908902 0.8944272
week2 22.0000000 1.6733201 0.6831301
--------------------------------------------------------
