HISTORY OF SAS
SAS (pronounced “sass”) once stood for “Statistical Analysis System,”
but now the name is simply SAS.
SAS began at North Carolina State University as a project to analyze agricultural research.
The company was founded in 1976 to serve all sorts of customers.
SAS is both a software product and a company.
It is one of the world’s largest privately held software companies.
SAS operates in various sectors, such as:
Automotive
Communications
Education
Banking/Financial Services
Government
Health Insurance
Health Care Providers
Hospitality & Entertainment
Insurance
Life Sciences
Manufacturing
Media
Oil & Gas
Retail
Hotels
Utilities
It offers solution lines such as:
Analytics
Business Intelligence
Customer Intelligence
Data Integration & ETL
Financial Intelligence
Foundation Tools
Fraud Management
Governance, Risk & Compliance
High-Performance Computing
Human Capital Intelligence
IT Management
On-Demand Solutions
Performance Management
Risk Management
Supply Chain Intelligence
Sustainability Management
In 1966, there was no SAS. However, there was a need for a computerized
statistics program to analyze the vast amounts of agricultural data collected through
United States Department of Agriculture (USDA) grants.
The work began with the University Statisticians of the Southern Experiment Stations,
a group of eight land-grant universities that received the majority of their research funding from the USDA.
These schools came together under a grant from the
National Institutes of Health (NIH) to develop a general-purpose statistical
software package to analyze all the agricultural data they were generating.
North Carolina State University (NCSU), located in the capital city of Raleigh,
North Carolina, became the leader of the consortium.
NCSU faculty members Jim Goodnight and Jim Barr emerged as the project leaders:
Barr created the architecture and Goodnight implemented the features.
In 1972 the NIH stopped funding the team, and the consortium members agreed to
chip in $5,000 apiece each year to allow NCSU to continue developing and
maintaining the system and supporting their statistical analysis needs.
During the coming years, SAS software was licensed by pharmaceutical companies,
insurance companies, and banks, as well as by the academic community
that had given birth to the project.
Jane Helwig, another Statistics Department employee at NCSU, joined
the project as a documentation writer.
John Sall, a graduate student and programmer, rounded out the core team.
Incorporation
In 1976 Goodnight, Barr, Helwig, and Sall left NCSU and formed
SAS Institute Inc. – a private company “devoted to the maintenance and
further development of SAS.” They opened offices at 2806 Hillsborough Street,
across from the university.
By 1980, the growing company had outgrown the Hillsborough Street building,
and it moved to the site of its present headquarters just outside Raleigh in Cary,
North Carolina. At that time it had 20 employees.
While SAS was growing, the entire computer hardware and software industry was changing,
with new operating systems and platforms placing new demands on software developers.
One of the first steps for SAS was to adapt the software to run on IBM’s Disk Operating System (DOS).
Today it runs on many operating systems, including Windows, DOS, z/OS, and various UNIX flavors.
During the 1990s, the company grew to a workforce of about 7,000.
SAS celebrated its 25th anniversary in 2001,
having come through the turn of the millennium and the Y2K frenzy.
It also introduced the new logo and the tagline still in use today:
THE POWER TO KNOW
SAS has been named one of FORTUNE magazine’s “100 Best Companies to Work For” every year since 1998,
and it was ranked No. 1 in 2010.
SAS IS A MULTI-VENDOR ARCHITECTURE
Multi-Vendor Architecture (MVA) is the foundation of the cross-platform portability and
interoperability of the SAS System. It allows programs to be written once and run anywhere,
regardless of hardware or operating system. This architecture provides customers
with hardware independence and flexible implementation.
SAS IS A MULTI-DATABASE ARCHITECTURE
SAS can connect to many kinds of data sources to read data, which is why it is called a multi-database architecture.
Data sources include databases (such as Oracle, SQL Server, DB2, Sybase, Teradata, Informix, and MS Access)
and files (such as Excel, CSV, and plain text files).
THE PURPOSE OF SAS
SAS is a flexible and extensible fourth-generation programming language designed for data access,
transformation, and reporting. It covers:
-> Data access and Data Management
-> User Interfaces
-> Application Development
-> Business Solutions
-> ETL
-> Analytics
-> Report & Graphics
SAS-FUNCTIONALITY
The functionality of the SAS System is built around four data-driven tasks:
1. Data access
2. Data management
3. Data analysis
4. Data presentation.
Data access:
Addresses the data required by the application,
that is, reading raw data from its source into the SAS application.
Topics include the INFILE statement, PROC IMPORT, SQL pass-through, LIBNAME, PROC ACCESS, and PROC DBLOAD.
Data management:
Shapes data into the form required by the application.
Topics include the SET, MERGE, UPDATE, FORMAT, and INFORMAT statements, and functions.
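As a rough illustration of data management, the sketch below combines two small datasets with MERGE and applies a format. The dataset names emp, pay, and emp_pay and their values are made up for this example.

```sas
/* Two small hypothetical datasets that share the key variable id */
data emp;
   input id name $;
   datalines;
1 ABC
2 DEF
3 MNO
;
run;

data pay;
   input id sal;
   datalines;
1 50000
2 45000
3 70000
;
run;

/* MERGE with BY expects both inputs sorted by the BY variable;
   these rows are already in id order. */
data emp_pay;
   merge emp pay;
   by id;
   format sal comma8.;   /* display sal with thousands separators */
run;

proc print data=emp_pay noobs;
run;
```

If the inputs were not already in id order, a PROC SORT step on each dataset would be needed before the merge.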
Data analysis:
Analyzes data with various procedures to compute sums, means, standard deviations, frequencies, and other
statistics, transforming raw data into meaningful and useful information.
Topics include statistical procedures for sums, means, frequencies, UNIVARIATE, ANOVA,
chi-square, CMH, GLM, regression, correlation, and standard deviation, and reporting procedures such as
PROC PRINT, PROC REPORT, PROC TABULATE, and _NULL_ reports.
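A minimal sketch of two common analysis procedures, run against the SASHELP.CLASS sample dataset that ships with SAS. The choice of variables here is only for illustration.

```sas
/* Descriptive statistics for two numeric variables */
proc means data=sashelp.class n mean std sum;
   var height weight;
run;

/* Frequency counts for a character variable */
proc freq data=sashelp.class;
   tables sex;
run;
```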
Data presentation:
Covers how the output is presented to the end user.
Topics include ODS and related output delivery concepts.
Turning Data into Information
Example:-
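/* Data access: read the raw records */
Data ds1;
Infile datalines;
Input id name$ sex$ age sal;
Datalines;
001 ABC M 23 50000
002 DEF F 27 45000
003 MNO F 21 70000
004 JKL F 23 44000
005 XYZ M 25 58000
;
Run;
/* Data management: drop a variable and apply a format */
Data ds2 (drop=age);
Set ds1;
Format sal comma6.;
Run;
Proc sort data=ds2;
By sex;
Run;
/* Data presentation: send the report to a PDF file */
ODS pdf file="E:\sas\outputs\sample.pdf";
/* Data analysis: list the rows and total sal within each sex group */
Proc print data=ds2;
Var id name sex sal;
By sex;
Sumby sex;
Sum sal;
Run;
ODS pdf close;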
Rules of the SAS program
1) Every statement ends with a semicolon.
2) Every step (both DATA step and PROC step) ends with a RUN statement (a few procedures, such as PROC SQL, end with QUIT instead).
3) SAS programs are not case sensitive.
4) A single statement can span multiple lines, and
multiple statements can be written on a single line.
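The rules above can be seen in a short sketch. The dataset name demo and its variables are made up for this example.

```sas
/* Keywords and names are written in mixed case on purpose:
   SAS treats DATA/data and demo/DEMO as the same. */
DATA demo;
   x = 1; y = 2;          /* two statements on one line */
   total = x
         + y;             /* one statement across two lines */
RUN;

proc PRINT data=DEMO;
run;
```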
SAS Names & Rules to write SAS Names:
The Enhanced Editor color-codes a SAS program. By default it shows
Keywords – in dark blue
Statements – in light blue
Data – on a yellow background (data locations appear in pink)
SAS Names – in black
1) Names must be 32 characters or fewer.
2) SAS names must start with a letter or an underscore (_).
3) SAS names may contain letters, digits, and underscores (_),
but no other special characters.
4) SAS names are not case sensitive, but data values are; character values written in a program
must be enclosed in quotes.
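A small sketch of the naming rules, using made-up names. Names such as 2emp (starts with a digit) or emp-id (contains a hyphen) would be rejected.

```sas
data _emp_2024;                       /* starts with an underscore: valid */
   length emp_name $ 10;
   Emp_Name = 'Abc';                  /* character value, written in quotes */
   emp_id   = 101;                    /* numeric value, no quotes */
   /* Names ignore case, data does not: 'Abc' is not equal to 'ABC' */
   same_value = (emp_name = 'ABC');   /* comparison is false, so same_value is 0 */
run;

proc print data=_emp_2024;
run;
```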
Data types:
Character Data type:
Variable values contain letters (Alphabetic) and/or special characters.
Numeric Data type:
Variable values contain numbers (dates are also stored as numbers).
A character variable can hold both character and numeric data,
but a numeric variable can hold only numeric data.
In a dataset, character data is left aligned and numeric data is right aligned.
How SAS reads the data into variables
With list input, SAS reads each value up to the next blank. A character variable has a default length of 8,
so a longer value is cut off after 8 characters unless a longer length is declared.
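The effect of the default length can be sketched as follows. The dataset and variable names are made up; the LENGTH statement is one way to make room for longer values.

```sas
data names;
   length full $ 12;        /* declared before INPUT: room for 12 characters */
   input short $ full $;    /* short gets the default length of 8 */
   datalines;
Christopher Christopher
;
run;

/* short holds Christop, full holds Christopher */
proc print data=names;
run;
```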
Terminology:
Tables are called datasets
Columns are called variables
Rows are called observations
ID  NAME  SEX  AGE  SAL
1   ABC   M    23   50000
2   DEF   F    27   45000
3   MNO   F    21   70000
4   XYZ   M    25   58000
Missing Data:
The values of particular variables may be missing for some observations. In that case,
missing character data is represented by a blank and
missing numeric data is represented by a period (.).
Example:-
ID  NAME  SEX  AGE  SAL
1   ABC   M    23   50000
2         F    27   .
3   MNO   F    .    70000
4   XYZ        25   58000
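A sketch of how such missing values can be read and detected. With list input, a single period in the raw data stands for a missing value for both character and numeric variables. The dataset names and the flag variables are made up for this example.

```sas
data miss;
   input id name $ sex $ age sal;
   datalines;
1 ABC M 23 50000
2 . F 27 .
3 MNO F . 70000
4 XYZ . 25 58000
;
run;

data miss_flag;
   set miss;
   no_name       = missing(name);    /* 1 when the character value is blank */
   n_num_missing = nmiss(age, sal);  /* how many of the numeric values are missing */
run;

proc print data=miss_flag;
run;
```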
Size of SAS Dataset: Before version 9.1, a dataset could hold a maximum of 32,767
variables (columns); later versions allow as many as available memory permits.
We can invoke SAS in the following ways:
-> Interactive windowing mode (SAS windowing environment)
-> Interactive menu-driven mode (SAS Enterprise Guide,
SAS/ASSIST, SAS/AF, or SAS/EIS software etc…)
-> Batch mode
-> Noninteractive mode.
SAS Windowing Environment (Interactive windowing mode)
The SAS windowing environment has 5 basic windows:
(1) Editor (2) Log (3) Output (4) Explorer (5) Results
Editor Window:-
-> To write the program
-> To modify the program
-> To submit the program for execution
On Windows, the default editor is the Enhanced Editor. The Enhanced Editor is syntax sensitive
and color-codes your programs, making it easier to read them and find mistakes.
The Enhanced Editor also allows you to collapse and expand the various steps in your program.
For other operating environments, the default editor is the Program Editor.
Log Window:-
The Log window contains information about the program submitted from the Editor window.
It shows notes, warnings, and errors,
including how many observations and variables each dataset has and the library in which it is stored.
Output Window
If your program generates any printable results, they appear in the Output window.
Explorer Window
The Explorer window gives you easy access to your SAS libraries and files.
Results Window
The Results window is like a table of contents for your Output window.
The results tree lists each part of your results in an outline form.
Command Bar:-
The command bar is a place where you can type SAS commands.
Most of the commands that you can type in the command bar are also accessible
through the pull-down menus or the toolbar.
Example:-
X Notepad
X Time
X Date
X "E:\SAS\krishna.xls"
X "E:\SAS\IMPORT.SAS"
Tool Bar:-
Gives you quick access to commands that are already accessible through the pull-down menus.
(New) –
Opens a new window.
Keyboard shortcut: Ctrl N
(Open) –
Opens a program saved on the server or PC.
Keyboard shortcut: Ctrl O
(Save) –
Saves the program, or the contents of the Log or Output window, to a
server or PC location.
Keyboard shortcut: Ctrl S
(Print) –
Prints the program or the contents of the Log or Output window.
Keyboard shortcut: Ctrl P
(Print Preview) –
Previews the content before printing.
(Cut) –
Cuts the selected program lines in the Editor window.
Keyboard shortcut: Ctrl X
(Copy) –
Copies the selected program lines.
Keyboard shortcut: Ctrl C
(Paste) –
Pastes the cut or copied program lines.
Keyboard shortcut: Ctrl V
(Undo) –
Reverses the last change, for example restoring lines that were cut.
Keyboard shortcut: Ctrl Z
(New Library) –
Creates a new library for storing datasets. Click this icon,
specify the new library name,
leave the engine as Default,
select Enable at startup,
browse to the location where the datasets should be stored,
and click OK.
Keyboard shortcut: Ctrl B
(SAS Explorer) –
Opens the SAS Explorer window.
(Submit) –
Submits the program for execution.
This can be done in several ways:
-> Click this icon to execute the entire program.
-> Select some program lines and click this icon; only the selected lines
are submitted for execution.
-> Select some program lines, right-click the selection, and click
Submit Selection to execute the selected lines, or click Submit All to execute the entire
program.
-> In the pull-down menus, click Run and then Submit.
-> Press F3 on the keyboard.
(Clear All) –
Clears only the Editor window.
Other ways to clear windows
To clear the Editor window
-> Press Ctrl E
-> Or click the Clear All icon
-> Or right-click anywhere in the Editor window and click Clear All
-> Or in the pull-down menu bar click Edit, then Clear All
-> Or execute the program below (it targets the Program Editor window)
DM 'PGM; CLEAR;';
To clear the Log window
-> Right-click anywhere in the Log window, click Edit,
and click Clear All
-> Or in the pull-down menu bar click Edit, then Clear All
-> Or execute the program below
DM 'LOG; CLEAR;';
To clear the Output window
-> Right-click anywhere in the Output window, click
Edit, and click Clear All
-> Or in the pull-down menu bar click Edit, then Clear All
-> Or execute the program below
DM 'OUTPUT; CLEAR;';
(Break)-
Stops the executing program.
Click this icon and select Cancel Submitted Statements to stop the execution,
or select Terminate the SAS System to close the session.
(Help) –
Opens the documentation and sample programs that help you learn SAS.
Menu Bar:-
The menu bar at the top of the window contains these pull-down menus:
File:-
-> To open a new Editor window
-> To open an existing program
-> To save a program
-> For print preview and
-> For importing and exporting data
Edit:-
-> For undo, redo, cut, copy, paste, clear all, select all,
collapse all, expand all, find, and replace
View:-
-> For reopening any window that has been closed, such as the
Enhanced Editor, Program Editor, Log, Explorer, and Output windows
Or write a program like
DM 'EDITOR';
DM 'LOG';
DM 'OUTPUT';
Tools:-
-> For creating a new library, changing the font type and font size,
and enabling listing output and HTML output.
Run:-
-> For submitting a SAS program
and recalling the last submitted program.
Solutions:-
For analysis and reporting
Window:-
For checking which windows are open and arranging the windows.
Help:-
To get help from the SAS documentation
SHORT CUT KEYS
Action                                        Key
Help                                          F1
Execute                                       F3
Recall                                        F4
Log                                           F6
Output                                        F7
Zoom off                                      F8
Shortcut keys                                 F9
Underline first letter of menus in menu bar   F10
Command focus                                 F11
Subtop                                        Shift F1
Horizontal zoom                               Shift F3
Vertical zoom                                 Shift F4
Zoom one on another                           Shift F5
Left                                          Shift F7
Right                                         Shift F8
Wpopup (bring up word tip)                    Shift F10
Hide the current word tip                     ESC
Libname                                       Ctrl B
Copy                                          Ctrl C
Directory                                     Ctrl D
Clear                                         Ctrl E
Find                                          Ctrl F
Go to line number                             Ctrl G
Replace                                       Ctrl H
SAS System Options                            Ctrl I
Log                                           Ctrl L
File name                                     Ctrl Q
RFind                                         Ctrl R
Title                                         Ctrl T
Paste                                         Ctrl V
Cut                                           Ctrl X
Redo                                          Ctrl Y
Undo                                          Ctrl Z
Open Explorer                                 Ctrl W
Execute the last recorded macro               Ctrl F1
Move cursor to next case change               ALT Right
Move cursor to previous case change           ALT Left
Commenting                                    Ctrl /
Uncommenting                                  Ctrl Shift /
Convert the selected text to lowercase        Ctrl Shift L
Convert the selected text to uppercase        Ctrl Shift U
Note: Press F9 to open the Keys window, which lists all the shortcut key definitions.
SAS PROGRAM
A SAS program is a sequence of statements executed in order.
A SAS program has 3 components:
• Data step
• Proc step
• Global options & Global statements
DATA steps are typically used to retrieve the data and create SAS data sets.
PROC steps are typically used to process SAS data sets
(That is, generate reports and graphs, edit data, and sort data).
GLOBAL OPTIONS are used to change default settings.
Example:-
Options nodate nonumber; /* global options: suppress the date and page number in the output */
Data ds1;
Infile datalines;
Input id name$ sex$ age sal;
Datalines;
001 ABC M 23 50000
002 DEF F 27 45000
003 MNO F 21 70000
004 XYZ M 25 58000
;
Run;
Proc print data=ds1;
Run;
WAYS WE CAN READ DATA INTO SAS
Instream data can be entered in the SAS program itself, following the DATALINES statement.
Data ds1;
Infile datalines;
Input id name$ sex$ age sal;
Datalines;
001 ABC M 23 50000
002 DEF F 27 45000
003 MNO F 21 70000
004 XYZ M 25 58000
;
Run;
Data can be read from external files (flat files such as text, Excel, and CSV) into SAS.
Data ds;
Infile "C:\Documents and Settings\Administrator\Desktop\SAMPLE.txt";
Input id name$ sex$ age sal;
Run;
Proc import datafile="C:\Documents and Settings\Administrator\Desktop\SAMPLE.csv"
Out=work.ds dbms=csv replace;
Run;
Proc import datafile="C:\Documents and Settings\Administrator\Desktop\SAMPLE.xls"
Out=work.ds dbms=excel replace;
Run;
Proc import table=demo out=work.ds dbms=access replace;
Database="C:\Documents and Settings\Administrator\Desktop\Sample.mdb";
Run;
Data can be read from existing SAS datasets (within SAS) into new SAS datasets.
Data work.ds;
Set sashelp.class;
Run;
Data can be read into SAS from different databases, such as Oracle, DB2, and Sybase.
Proc sql;
Connect to oracle (user=Scott password=tiger);
Create table ds3 as
Select * from connection to oracle
(
Select * from EMP
);
Disconnect from oracle;
Quit;
SAS datasets stored on a server or PC (outside the SAS session) can be read through a library.
Libname Krishna "E:\sas"; /* PC location */
Libname Krishna "/sas/mis/data"; /* Server location */
Libname Krishna oracle user=scott password=tiger; /* Database */
HOW THE SAS SYSTEM WORKS WITH DATA
When a SAS session starts in any mode,
SAS creates a Work library, a temporary directory (the default library) where datasets are stored for the session.
Datasets created in the session without a library name are referenced with the Work. prefix.
When the session closes, all datasets in the Work library are lost.
To keep datasets permanently, create your own library and store the datasets there.
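One way to see where the Work library lives, and to confirm that one-level names go there, is sketched below. The dataset names temp_ds and temp_ds2 are made up for this example.

```sas
/* Write the physical path of the WORK library to the log */
%put WORK library path: %sysfunc(pathname(work));

data temp_ds;            /* one-level name: stored as WORK.TEMP_DS */
   x = 1;
run;

data work.temp_ds2;      /* explicit WORK. prefix, same library */
   x = 2;
run;
```

The path printed in the log is a session-specific temporary folder, so it differs from one session to the next.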
How to create Library:
(Programming method)
LIBNAME Statement
Creates a library.
It associates a libref with a SAS library and lists file attributes for a SAS library.
Syntax: – LIBNAME libref 'SAS-library';
LIBNAME MY_SAS "E:\SAS_CLAS";
Rules for naming a libref:
It must be 8 characters or fewer.
It must begin with a letter or underscore.
The remaining characters must be letters, numbers, or underscores.
If the specified location already contains SAS datasets, they appear in the new library.
Note: – If you receive raw data as SAS datasets (rather than text, Excel, CSV, or Oracle data),
you can read those datasets into SAS using the LIBNAME method.
Libname My_SAS "E:/SAS/SASDATA";
The program above creates a library called My_SAS
that points to the specified location,
so whatever SAS datasets exist in that location appear in the My_SAS library.
Or
Prefix the name of any dataset you create with your
libref so that the dataset is stored in your library at the specified location.
Data My_SAS.ds1;
Infile datalines;
Input id name$ sex$ age sal;
Datalines;
001 abc m 23 45000
002 def f 34 67000
003 mno m 21 36000
004 xyz f 27 45000
;
Run;
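To check what a library holds after a step like this, one option is PROC CONTENTS with the _ALL_ keyword. This sketch assumes the My_SAS libref from the LIBNAME statement above is still assigned.

```sas
/* List the members of the My_SAS library.
   NODS suppresses the per-dataset detail and prints only the directory. */
proc contents data=My_SAS._all_ nods;
run;
```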
(GUI method)
In the toolbar, click the New Library icon,
specify the library name, and select Enable at startup (to make the library available in every session).
Browse to the location where the datasets will be stored
and click OK.
DATA STEP PROCESS
When a DATA step is submitted for execution,
SAS first checks its syntax; if no errors are found, the DATA step is compiled and executed.
When executing a DATA step for instream data, SAS creates the following three items.
INPUT BUFFER:-
An area of memory into which each raw record of data is read when an INPUT statement executes.
PROGRAM DATA VECTOR:-
An area of memory in which SAS builds the dataset one observation at a time as the program executes.
Values are read from the input buffer or created by programming statements and assigned to the corresponding variables in the PDV.
They are then written to the SAS dataset as a single observation.
Along with all the variables, the PDV holds 2 automatic variables:
_N_ and _ERROR_
_N_: indicates how many times the DATA step has iterated.
It starts at 1 and increases by 1 with each iteration, so it can be used to find out how many observations were processed.
_ERROR_: its default value is 0; when an error is encountered during an iteration it is set to 1.
Even if there are 100 errors in that iteration, _ERROR_ is still just 1.
_ERROR_=1 indicates a logical (data) error, not a syntax error.
Syntax errors do not set _ERROR_.
They appear in the log in red, with the position of the error underlined.
Syntax errors are program errors, and logical errors are data errors.
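The PDV can be inspected directly with PUT _ALL_, which writes every variable, including _N_ and _ERROR_, to the log on each iteration. In this sketch the dataset name pdv_demo and the data values are made up, and the second record deliberately contains a non-numeric age.

```sas
data pdv_demo;
   input id name $ age;
   put _all_;             /* dump the PDV to the log for this iteration */
   datalines;
1 abc 23
2 def xx
3 mno 28
;
run;
```

In the log, _N_ counts 1, 2, 3 across the iterations. _ERROR_ is 1 only for the second record, where age cannot be read as a number, and it is back to 0 for the third because SAS resets it at the start of each iteration.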
DESCRIPTOR INFORMATION:-
For each SAS dataset, SAS creates and maintains information about the dataset and variable attributes such as length, label,
format, informat, and data type. To see this information, use PROC CONTENTS.
Proc contents Data=Dataset;
Run;
Example:-
Data ds;
Infile datalines;
Input id name age sex$ sal;
Datalines;
001 abc 23 m 5000
002 def 25 f 5600
003 mno 28 f 8000
004 xyz 21 m 6000
;
Run;
(Run the program above and check the log. Because name has no $, it is read as numeric and each row reports invalid data; look at the _N_ and _ERROR_ values.)
Interview Questions
Q1) What is SAS? What is the purpose of SAS?
Q2) Is SAS a multi-vendor architecture? How? Explain.
Q3) Is SAS a multi-database architecture? How? Explain.
Q4) Describe SAS functionality.
Q5) Describe the SAS programming rules.
Q6) Describe the data types in SAS.
Q7) Explain SAS terminology.
Q8) How are missing values treated in SAS?
Missing character data is represented by _________________
and
missing numeric data is represented by ___________________
Q9) Explain the SAS windowing environment.
Q10) Explain the command bar and its commands.
Q11) Explain the toolbar and describe each tool in it.
Q12) Where is the menu bar in SAS? List its menus.
Q13) What is the purpose of the shortcut keys below?
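F1 ____________
F3 ____________
F4 ____________
F6 ____________
F7 ____________
F8 ____________
F9 ____________
F10 ____________
Shift F3 ____________
Shift F4 ____________
Shift F5 ____________
Shift F7 ____________
Shift F8 ____________
Ctrl B ____________
Ctrl C ____________
Ctrl D ____________
Ctrl E ____________
Ctrl F ____________
Ctrl G ____________
Ctrl H ____________
Ctrl I ____________
Ctrl L ____________
Ctrl Q ____________
Ctrl V ____________
Ctrl X ____________
Ctrl Y ____________
Ctrl Z ____________
Ctrl W ____________
Ctrl / ____________
Ctrl Shift / ____________
Ctrl Shift L ____________
Ctrl Shift U ____________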
Q14) Explain commenting.
Q15) What is a SAS program?
Q16) What is a DATA step? What is its purpose?
Q17) What is a PROC step? What is its purpose?
Q18) What are global options? What is their purpose?
Q19) In how many ways can we read data into SAS? Explain each with an example.
Q20) How does the SAS System work with data?
Q21) How can you create a library?
Q22) Write a program to create a library named My_SAS.
Q23) I have some SAS datasets on my computer at
E:/SAS/Source_Data. How can you bring that data into SAS? Write the code.
Q24) How can we identify the path of the SAS Work library? By default all datasets
are saved in the Work library, so how do we find the library path?
Q25) Explain the backend process of the DATA step.
Q26) What is a syntax error?
Q27) What error messages can you see in the log? Explain them.
Q28) What is the input buffer?
Q29) What is the PDV?
Q30) What is descriptor information? How can we see it in the output?
Q31) Explain the automatic variables in the backend process.
Q32) What is _N_?
Q33) What is _ERROR_?
Q34) What is a logical error?
Q35) I have 5 errors in my program. What is the value of _ERROR_?
Q36) Does _ERROR_ indicate a logical error or a syntax error?
Q37) How can I change the font style and font size of my program?
Q38) What is the difference between the Program Editor and the Enhanced Editor?
Q39) Is SAS a technology, a company, or both? Explain.
Q40) Where is the SAS company located?
Q41) How was SAS born?
Q42) Who founded SAS?