SAS Basic

SAS Training: Basic
Agenda
Introduction to SAS Software Program
Data preparation & Tabulation
Test of Difference: t-Test and ANOVA
Test of Association: Correlation & Regression Analysis
SAS
From traditional statistical analysis of variance
and predictive modeling to exact methods and
statistical visualization techniques, SAS/STAT
software is designed for both specialized and
enterprise-wide analytical needs. SAS/STAT
software provides a complete, comprehensive set
of tools that can meet the data analysis needs of
the entire organization.
SAS Components
SAS Enterprise Guide: Graphical user interface application for some common basic data analysis tasks.
SAS 9.2: Command-based application for a wide variety of data analysis tasks.
SAS Enterprise Guide

To open the statistical software package SAS, go to the Start Menu >> All Programs >> SAS >> SAS Enterprise Guide 4.3.
SAS 9.2

To open the statistical software package SAS, go to the Start Menu >> All Programs >> SAS >> SAS 9.2 (English).
What Is SAS Enterprise Guide?

SAS Enterprise Guide is an easy-to-use Windows client application that provides these features:
access to much of the functionality of SAS
an intuitive, visual, customizable interface
transparent access to data
ready-to-use tasks for analysis and reporting
easy ways to export data and results to other applications
scripting and automation
a program editor with syntax completion and built-in function help
Explore the Main Windows

Create a Project for This Tutorial

If SAS Enterprise Guide is not open, start it now. In the
Welcome window, select New Project.
If SAS Enterprise Guide is already open, select File >>
New Project. If you already had a project open in SAS
Enterprise Guide, you might be prompted to save the
project. Select the appropriate response.
The new project opens with an empty Process Flow
window.
1. The Project Tree

You can use the Project Tree window to manage
the objects in your project. You can delete,
rename, and reorder the items in the project.
You can also run a process flow or schedule a
process flow to run at a particular time.
2. Workspace and Process Flow Windows

You can have one or more
process flows in your project.
When you create a new project,
an empty Process Flow window
opens. As you add data, run
tasks, and generate output, an
icon for each object is added to
the process flow.
The process flow displays the
objects in a project, any
relationships that exist between
the objects, and the order in
which the objects will run when
you run the process flow.
3. The Task List

You can use tasks to do
everything from manipulating
data, to running specific
analytical procedures, to
creating reports.
Many tasks are also available as
wizards, which contain a
limited number of options and
can provide a quick and easy
way to use some of the tasks.
Add SAS Data to the Project

You can add SAS data files
and other types of files,
including OLAP cubes,
information maps, ODBC-compliant data, and files
that are created by other
software packages, such as
Microsoft Word or
Microsoft Excel.
SAS Enterprise Guide requires all data that it

accesses to be in table format. A table is a
rectangular arrangement of rows (also called
observations) and columns (also called
variables).
Name     Gender  Age  Weight
Jones    M       48   128.6
Laverne  M       58   158.3
Jaffe    F       .    115.5
Wilson   M       28   170.1
A column's type is important because it affects how the column can be used in a SAS Enterprise Guide task. A column's type can be either character or numeric.
Character variables, such as Name and Gender in
the preceding data set, can contain any values.
Missing character values are represented by a blank.
Numeric variables, such as Age and Weight in the
preceding data set, can contain only numeric values.
Currency, date, and time data is stored as numeric
variables. Missing numeric values are represented
by a period.
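As a rough illustration of the table structure and the missing-value convention described above, the sample data can be sketched in Python. This is only an analogy, not how SAS stores data; the helper name `parse_numeric` is hypothetical.

```python
# Hypothetical sketch: the sample table as rows, with "." read as a
# missing numeric value (the convention described in the text).
raw_rows = [
    ("Jones", "M", "48", "128.6"),
    ("Laverne", "M", "58", "158.3"),
    ("Jaffe", "F", ".", "115.5"),
    ("Wilson", "M", "28", "170.1"),
]


def parse_numeric(text):
    """Return None for a missing value (a period), else the number."""
    text = text.strip()
    return None if text == "." else float(text)


# Each row is an observation; each key is a column (variable).
table = [
    {"Name": name, "Gender": gender,
     "Age": parse_numeric(age), "Weight": parse_numeric(weight)}
    for name, gender, age, weight in raw_rows
]

# List the observations whose Age is missing.
missing_age = [row["Name"] for row in table if row["Age"] is None]
print(missing_age)
```

Name and Gender stay as text (character columns), while Age and Weight become numbers or a missing marker (numeric columns).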
Local and Remote Data

When you open data in SAS Enterprise Guide,
you must select whether you want to look for the
data on your local computer, a SAS server, or in
a SAS folder.
Local and Remote Data (Cont)

If you click My Computer, you can browse the
directory structure of your computer. You can
open any type of data file that SAS Enterprise
Guide can read.
If you click Servers, you can look for your data
on a server. A server can either be a local server
if SAS software is installed on your own
computer, or it can be a remote server if SAS
software is installed on a different computer.
Open Data from Server

Within each server there are icons that you can select for
Libraries and Files. Libraries are shortcut names for
directory locations that SAS knows about. Some libraries
are defined by SAS, and some are defined by SAS
Enterprise Guide. Libraries contain only SAS data sets.
The Files folder on a server enables you to access data
files in the directory structure on the computer where
the SAS server is running. For example, if you wanted to
open a Microsoft Excel file on a server that is defined in
your repository, you would use the Files node to locate
and open the file.
Open Data from SAS Folders

If you click SAS Folders, you can browse the list
of SAS folders that you can access. SAS folders
are defined in the SAS Metadata Server and can
be used to provide a central location for your
stored processes, information maps, and projects
so that they can be shared with other SAS
applications. SAS folders can also contain content
that is not in the SAS Metadata Server, such as
data files.
Add SAS Data from Your Local Computer

Select File >> Open >> Data. In the Open Data
window, select My Computer.
Open the SAS Enterprise Guide samples directory
and double-click Data. By default, the sample
programs, projects, and data are located in
C:\Program Files\SAS\EnterpriseGuide\4.3\Sample.
By default, all file types are displayed in the
window. Files with the
icon are SAS data sets.
Press CTRL and select Orders.sd2 and
Products.sas7bdat, and then click Open.
Add SAS Data from Your Local Computer (Cont)

Shortcuts to
the Products and Orders
tables are added to the
project, and the data sets
open in data grids.
By default, the tables open in
read-only mode. In this
mode, you can browse, resize
column widths, hide and
hold columns and rows, and
copy columns and rows to a
new table.
You cannot edit the data in
the table unless you change
to edit mode. Select Edit >>
Remove Protect Data
View the Properties of a Data Set

In the project tree, right-click Products and
select Properties from the pop-up menu. The
Properties for Products window opens. You can see
information about general properties such as the
physical location of the data and the date it was last
modified.
View the Properties of a Data Set (Cont)

In the selection pane, click Columns. Here you
can view a list of columns in your data and the
column attributes.
Add Data from a SAS Library

Select File >> Open >> Data.
In the Open Data window,
select Servers.
Double-click Libraries, and
then double-click SASHELP.
As you can see, only SAS data
sets are stored in libraries.
Scroll in the window and
double-click
the PRDSALE data set. A
shortcut to the data is added to
the project and the data opens
in the data grid.
Save the Project

Select File >> Save
Project As.
The Save window opens and
prompts you to choose
whether to save the project
on your computer or on a
server. Select My
Computer.
In the Save window, select a
location for the project. In
the File name box, type
your file name. Project
files are saved with the
extension .egp.
Click Save.
Data Input
There are two main ways to input data:
Manually Input Data
Import from an External File
Manually Input Data

1. Create a SAS Library
2. Create a SAS Data Set
3. Input data
What is a SAS Data Library?

A SAS data library is a collection of one or more
SAS files that are recognized by SAS and can be
referenced and stored as a unit. Each file is a
member of the library. SAS data libraries help to
organize your work. For example, if a SAS
program uses more than one SAS file, then you
can keep all the files in the same library.
Organizing files in libraries makes it easier to
locate the files and reference them in a program.
Telling SAS Where the SAS Data Library Is Located

You can tell SAS where a SAS data library is located in either of two ways:
directly specify the operating environment's physical name for the location of the SAS data library
assign a SAS libref (library reference), which is a SAS name that is temporarily associated with the physical location name of the SAS data library
Using Librefs for Temporary and Permanent Libraries

When you start a SAS session, SAS automatically
assigns the libref WORK to a special SAS data
library. Normally, the files in the WORK library
are temporary files.
Files that are stored in any SAS data library
other than the WORK library are usually
permanent files; that is, they endure from one
SAS session to the next. Store SAS files in a
permanent library if you plan to use them in
multiple SAS sessions.
Create a SAS Library

Tools >> Assign Project Library
Create a SAS Library Step 1

Specify name and server for the library
Create a SAS Library Step 2

Specify the engine for the library
Create a SAS Library Step 3

Specify options for the library
Create a SAS Library Step 4

Click Test Library to check that the library can be created.
Click Finish to create the library.
Create a SAS Library

Check the created library in the
Server List.
When a libref is assigned to
a SAS data library, you can
use the libref throughout
the SAS session to access
the SAS files that are stored
in that library or to create
new files.
Create SAS Data Set

File >> New >> Data
Create SAS Data Set Step 1

Specify name TEST and location DEMO
Create SAS Data Set Step 2

Create columns and specify their properties
Name     Gender  Age  Weight
Jones    M       48   128.6
Laverne  M       58   158.3
Jaffe    F       .    115.5
Wilson   M       28   170.1
Input Data
Import from an External File

The Import Data wizard enables you to create
SAS data sets from text, HTML, or PC-based
database files (including Microsoft Excel,
Microsoft Access, and other popular formats).
When you use the Import Data wizard, you can
specify import options for each file that you
import.
Import Data
File >> Import Data
Import Data (Cont)

Desktop >> SAS Training >> Data Advising Survey.xls
Import Data (Cont)

Specify Data
Import Data (Cont)

Select Data Source
Import Data (Cont)

Define Field Attributes
Import Data (Cont)

Advanced Options
Import Data Result
Import SPSS file
Import SPSS file Step 1

Select an SPSS file to import
Import SPSS file Step 2

Specify a name for the imported table
Import SPSS file Result
Create Format
Tasks >> Data >> Create Format
Create Format (Cont)

Set Format Name GENDER
Select Library - SASUSER
Select Format Type Character
Define Formats
Click New Label and type the label.
Click New Range, select the type of values, and type the value that corresponds to the label.
Repeat these steps for each label.
Click Run.
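Conceptually, a user-defined format is a lookup from stored values to display labels. The following Python sketch illustrates the idea only; the labels "Male" and "Female" are assumptions (the slides name the format GENDER but do not list its labels), and `apply_format` is a hypothetical helper.

```python
# Hypothetical labels: the source names the format GENDER but does not
# state which labels it defines.
GENDER_FORMAT = {"M": "Male", "F": "Female"}


def apply_format(value, fmt):
    """Return the label for a stored value.

    In this sketch, values with no matching range are shown unchanged.
    """
    return fmt.get(value, value)


stored = ["M", "M", "F", "M"]          # values as stored in the table
displayed = [apply_format(v, GENDER_FORMAT) for v in stored]
print(displayed)
```

The stored values are not modified; only the displayed labels change, which is why the same format can be reused across tables and tasks.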
Applying User-Defined Formats

Open a SAS Data Set
Unprotect Data: Edit >> Unprotect Data
Applying User-Defined Formats (Cont)

Right-click the column
Select Properties
Applying User-Defined Formats (Cont)

In the left pane, select Formats.
In the Categories box, select User Defined.
In the Formats box, select the desired format.
Applying Formats in Tasks

Custom formats can be applied in the same
places that formats defined in SAS can be used.
SAS Tasks
After you have data in your project, you can
create reports and run analyses on the data.
To do this, you select a SAS task from the Task
List or from the Tasks menu. Some tasks have
wizards to guide you through the decisions that
you need to make. Wizards are available from
menus or from a link next to the related task in
the Task List.
Using Tasks in SAS Enterprise Guide

The icon next to each variable represents the variable's type. Country is a character variable. Year is a numeric variable. Month is a numeric variable in date-and-time format. Actual and Predict are numeric variables in currency format.
One-Way Frequencies Task

We should create One-Way Frequencies (tables
and graphs) to check our data set one last time
before we intensively analyze the data.
One-Way Frequencies
Under Data, select Q1-Q19, Gender, Nation,
Year, and Major for Analysis variables.
One-Way Frequencies
Under Plots, check Vertical for Bar chart.
One-Way Frequencies
Check the frequency tables and/or bar charts for any
errors (e.g., typos). Make any necessary corrections.
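To see why a one-way frequency table helps catch data-entry errors, consider this small Python sketch. The responses are made up for illustration, and the lowercase "f" stands in for a typo.

```python
from collections import Counter

# Hypothetical responses; the lowercase "f" stands in for a data-entry typo.
gender = ["M", "F", "F", "M", "f", "M"]

# One-way frequency table: count of each distinct value.
freq = Counter(gender)

for value, count in sorted(freq.items()):
    percent = 100 * count / len(gender)
    print(value, count, f"{percent:.1f}%")
```

An unexpected category with a very small count, such as "f" here, is a typical sign of an entry error that should be corrected before analysis.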
Filter and Sort

Use Tasks >> Data >> Filter and Sort... or Sort data...
to help you find the error(s).
Summary Statistics Task

The Summary Statistics task can be used to
calculate summary statistics based on groups
within the data. You can produce reports,
graphs, and data sets as output.
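What "summary statistics based on groups" means can be sketched in Python using the Weight and Gender columns from the sample table shown earlier. This is a minimal illustration of the idea, not the task's implementation.

```python
from collections import defaultdict
from statistics import mean

# (Name, Gender, Weight) from the sample table used earlier in the slides.
rows = [
    ("Jones", "M", 128.6),
    ("Laverne", "M", 158.3),
    ("Jaffe", "F", 115.5),
    ("Wilson", "M", 170.1),
]

# Collect the analysis variable (Weight) within each group (Gender).
groups = defaultdict(list)
for _name, gender, weight in rows:
    groups[gender].append(weight)

# Compute per-group statistics.
summary = {
    g: {"n": len(v), "mean": mean(v), "min": min(v), "max": max(v)}
    for g, v in groups.items()
}

for g in sorted(summary):
    print(g, summary[g])
```

Here Weight plays the role of the analysis variable and Gender the role of the classification variable.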
Summary Statistics Task

The Summary Statistics task has both a wizard
and the standard task dialog box that can be
used to set up the results.
Summary Statistics: Task Roles

Use the wizard to assign variables to roles.
Compute statistics for each numeric variable in the list.
Specify variables whose values define subgroups.
Summary Statistics: Statistics and Results

Choose statistics and results to include, including
a report, graphics, and an output data set.
Summary Statistics: Advanced View

Opening the task in Advanced View enables
additional options to further modify the output.
Summary Tables
The Summary Tables wizard or task can be used to
generate a tabular summary report.
Summary Tables Wizard

The Summary Tables wizard enables you to select analysis
variable(s) and statistics, assign classification variables
to define rows and columns, and specify totals.
One-Sample t-Test
Tasks >> ANOVA >> t Test
Select One Sample.
Under Data, assign Q19 to the Analysis variables task role and Gender to the Group analysis by task role.
Under Analysis, specify the null hypothesis value H0 = 3.
T-Test Output
Since the p-value is less than 0.05, it can be concluded that, on average, female students consider themselves well prepared for advising appointments (significantly higher than 3).
Since the p-value is less than 0.05, it can be concluded that, on average, male students also consider themselves well prepared for advising appointments.
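The statistic behind a one-sample t test can be sketched in a few lines of Python. The ratings below are invented for illustration (they are not the survey data), and the function name `one_sample_t` is hypothetical. The sketch computes only the t statistic and degrees of freedom; the p-value reported by SAS comes from the t distribution with those degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)   # sample SD / sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical ratings on a 1-5 scale, tested against H0 = 3.
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
t_stat, df = one_sample_t(ratings, 3)
print(round(t_stat, 3), df)
```

A large positive t value means the sample mean lies well above the hypothesized value relative to its standard error.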
Two-Sample t-Test
Tasks >> ANOVA >> t Test
Select Two Sample.
Under Data, assign Q6 to the Analysis variables task role and Gender to the Classification variable task role.
Under Plots, check Summary plot, Confidence interval plot, and Normal quantile-quantile (Q-Q) plot.
T-Test Output
Equal variances are assumed, so the Pooled method is used. Since the p-value is greater than 0.05, it cannot be concluded that there is a significant difference in Advisor Satisfaction between male and female students.
For the test of equal variances, the probability is greater than 0.05, so there is no evidence that the variances for the two groups, female students and male students, are different.
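The Pooled method combines the two group variances into a single estimate before computing the t statistic. A minimal Python sketch follows; the two groups are hypothetical data, and `pooled_t` is an illustrative name, not a SAS function.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical satisfaction scores for two groups.
group_a = [4, 5, 4, 3, 4]
group_b = [3, 4, 3, 2, 3]
t_stat, df = pooled_t(group_a, group_b)
print(round(t_stat, 3), df)
```

Pooling is only appropriate when the equal-variance assumption is reasonable, which is why the output also reports a test of equal variances.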
One-Way ANOVA
Tasks >> ANOVA >> One-Way ANOVA
Under Data, assign Q6 and Year to the task roles of Dependent variable and Independent variable, respectively.
Under Tests, select Levene's test.
Under Means Comparison, check Bonferroni t test, Duncan's multiple-range test, and Scheffe's multiple comparison procedure for Post Hoc tests.
Under Plots, check Means for Plot Types. Then, click Run.
One-Way ANOVA results
Since the p-value is greater than 0.05, it can be concluded that there is no significant difference in average Advisor Satisfaction among years of study. Therefore, there is no need to check the Post Hoc tests.
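The F statistic in a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below uses invented data for three groups; `one_way_anova_f` is a hypothetical helper, and the p-value (which SAS derives from the F distribution) is not computed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations versus their group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three years of study.
scores_by_year = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(scores_by_year)
print(f_stat, df_b, df_w)
```

When F is small relative to the F distribution with these degrees of freedom, the p-value is large and the group means are not shown to differ.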
Post Hoc Test: Bonferroni t Tests
Post Hoc Test: Scheffe's Tests
ANOVA: Means Plot of Q6 by Year
Data Exploration, Correlations, and Scatter Plots
Tasks >> Multivariate >> Correlations
With Data selected at the left, assign Q1, Q2, Q3, Q4,
and Q5 to the task role of Analysis variables and Q6
to the role of Correlate with.
Correlation Types
In Results, check the box for Create a scatter plot for each correlation pair. Also, check the box at the right for Show correlations in decreasing order of magnitude and uncheck the box for Show statistics for each variable.
Correlation Analysis
Since the p-values are less than 0.05, there are
significant (positive) relationships between Q6
(overall satisfaction with the advisor) and each of Q1, Q2,
Q3, Q4, and Q5.
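The Pearson correlation coefficient that underlies this kind of analysis can be sketched as follows. The two lists are hypothetical item scores, not the survey data, and `pearson_r` is an illustrative helper.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for one survey item and the overall satisfaction item.
item = [1, 2, 3, 4, 5]
overall = [2, 4, 5, 4, 5]
r = pearson_r(item, overall)
print(round(r, 3))
```

The sign of r gives the direction of the relationship, which is why the slide describes the significant relationships as positive.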
Linear Regression
Tasks >> Regression >> Linear Regression
Drag Q6 to the Dependent variable task role and Q1, Q2, Q3, Q4, and Q5 to the Explanatory variables task role.
Regression: Model
Model Selection Method: Full model fitted (by
default)
Regression: Statistics
Under Details on estimates, check Standardized
regression coefficients
Perform some Diagnostics
Regression Diagnostics
Unusual and Influential data (Outliers/Leverage)
Tests on Normality of Residuals
Tests on Nonconstant Error of Variance
(Heteroscedasticity)
Tests on Correlations among Predictors
(Multicollinearity)
Tests on Nonlinearity
Tests on Dependence of Residuals
(Autocorrelation)
Model Specification
Diagnostics: Collinearity Analysis

This option requests a detailed analysis of
collinearity among the regressors. This includes
eigenvalues, condition indices, and
decomposition of the variances of the estimates
with respect to each eigenvalue.
Diagnostics: Collinearity Analysis

Check Tolerance (1/VIF) or Variance Inflation (VIF).
Some researchers use the more lenient cutoff of VIF greater than 5.0 or even 10.0 to signal when multicollinearity is a problem. The researcher may wish to drop the variable with the highest VIF if multicollinearity is indicated and theory warrants.
The condition indices are the square roots of the ratio of the largest eigenvalue to each individual eigenvalue. The largest condition index is the condition number of the scaled X matrix. Belsley, Kuh, and Welsch (1980) suggest that, when this number is around 10, weak dependencies might be starting to affect the regression estimates. When this number is larger than 100, the estimates might have a fair amount of numerical error (although the statistical standard error almost always is much greater than the numerical error).
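The two quantities above can be illustrated with a small Python sketch under simplifying assumptions. With only two predictors, the VIF of either one reduces to 1 / (1 - r^2), where r is their correlation; with more predictors, r^2 is replaced by the R-Square from regressing that predictor on all the others. The data, the eigenvalues, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Return (VIF, tolerance) for the special case of two predictors."""
    tolerance = 1 - pearson_r(x1, x2) ** 2
    return 1 / tolerance, tolerance


def condition_indices(eigenvalues):
    """Square root of (largest eigenvalue / each eigenvalue)."""
    largest = max(eigenvalues)
    return [sqrt(largest / ev) for ev in eigenvalues]


# Hypothetical predictor values and hypothetical eigenvalues.
q1 = [1, 2, 3, 4, 5]
q2 = [2, 4, 5, 4, 5]
vif, tolerance = vif_two_predictors(q1, q2)
indices = condition_indices([4.0, 1.0, 0.25])
print(round(vif, 3), round(tolerance, 3), indices)
```

The largest value in `indices` corresponds to the condition number discussed above.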
Diagnostics: Heteroscedasticity Test

This option tests that the first and second moments of the model are correctly specified.
Asymptotic covariance matrix: This option displays the estimated asymptotic covariance matrix of the estimates under the hypothesis of heteroscedasticity.
Diagnostics: Durbin-Watson Statistic

The Durbin-Watson statistic shows whether or not the
errors have first-order autocorrelation. (This test is
appropriate only for time series data.) The sample
autocorrelation of the residuals is also produced.
The value of d ranges from 0 to 4. Values close to 0 indicate extreme positive autocorrelation; values close to 4 indicate extreme negative autocorrelation; and values close to 2 indicate no serial autocorrelation. As a rule of thumb, d should be between 1.5 and 2.5 to indicate independence of observations. Positive autocorrelation means standard errors of the b coefficients are too small. Negative autocorrelation means standard errors are too large.
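The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch, using made-up residual sequences to show the two extremes:

```python
def durbin_watson(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that alternate in sign (negative autocorrelation).
alternating = [1, -1, 1, -1]
# Hypothetical residuals that never change (extreme positive autocorrelation).
constant = [1, 1, 1, 1]

d_alternating = durbin_watson(alternating)
d_constant = durbin_watson(constant)
print(d_alternating, d_constant)
```

Residuals that flip sign at every step push d toward 4, while residuals that barely change from one observation to the next push d toward 0.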
Under Plots, select Custom list of plots under Show plots for regression analysis. In the menu that appears, uncheck the box for Diagnostic plots and check the boxes for Histogram plot of the residual, Normal quantile plot of the residual, and Residual plots.
Regression Analysis
These are the F Value and p-value, respectively, testing the null hypothesis that the Model does not explain the variance of the response variable.
R-Square defines the proportion of the total variance explained by the Model.
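How R-Square and the F Value relate to the sums of squares can be sketched in Python. For brevity this sketch fits a single explanatory variable, whereas the model in the slides has five; the data and the function name `simple_regression_fit` are hypothetical.

```python
def simple_regression_fit(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (r_square, f_value) for the one-predictor model.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x

    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_error = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_model = ss_total - ss_error

    r_square = ss_model / ss_total             # share of variance explained
    model_df, error_df = 1, n - 2              # one predictor plus intercept
    f_value = (ss_model / model_df) / (ss_error / error_df)
    return r_square, f_value


# Hypothetical predictor and response values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r_square, f_value = simple_regression_fit(x, y)
print(round(r_square, 3), round(f_value, 3))
```

With more predictors the same ratios apply, with the model degrees of freedom equal to the number of predictors.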
Regression Analysis
These are the t Value and p-value, respectively, testing the null hypothesis that each coefficient is equal to 0.
Regression: Diagnostics
Might suggest a violation of the normality of residuals assumption.
Q&A
