Thursday, 24 July 2014

SAS Basics Part-1


Let us start with inserting data in sas dataset
  1. How will the output data look like -
Title 'Survey';
DATA survey; 
   INPUT id sex  age inc r1 r2 r3 ; 
   DATALINES; 
 1  F  35 17  7 2 2 
17  M  50 14  5 5 3 
33  F  45  6  7 2 7 
49  M  24 14  7 5 7
;


RUN;

After running above code in SAS output will be 
                                                                Survey


Obsidsexageincr1r2r3
11.3517722
217.5014553
333.456727
449.2414757

As you can Sex column is having a missing value though it is present in our insert statement.This had happened because SAS is considering it to be numeric value since $ sign is not present in front of variable name.
So we need to modify our insert statement to take character value by adding $ sign in front of the variable name like 'sex $'.

Title 'Survey';
DATA survey; 
   INPUT id sex $ age inc r1 r2 r3 ; 
   DATALINES; 
 1  F  35 17  7 2 2 
17  M  50 14  5 5 3 
33  F  45  6  7 2 7 
49  M  24 14  7 5 7
;


RUN;

Output
                                                              Survey


Obsidsexageincr1r2r3
11F3517722
217M5014553
333F456727
449M2414757



Q2. Data cats;
infile '/folders/myfolders/Cat.txt';
Input ID $ 1-4 Age 6-7 Sex $ 8-9;
run;

Write a program, similar to the above example that will create a temporary dataset called Students. Assume that
i)        temporary dataset (called cats, work.cats, or 'cats.sas7bdat'(in some folder) already exists.
ii)      using the raw data file directly.
What changes should be made so that the DATA step reads only the first 15 observations? 
Ans:

Lets create cat text file with 20 observations





i) data Students;
set cat;
run;

ii)Using raw file as input, the code require to read only 15 observations(rows) from file is as follows
data cat;
 infile '/folders/myfolders/cat.txt' firstobs=1 obs=15;
 input ID $ 1-4 AGE 6-7 SEX $ 8-9;
run;
proc print data=cat;
run;

In the above the option firstobs and obs are used to read first 15 rows from file.

Output:
                                                                            Students


ObsIDAGESEX
1000121M
2000221M
3000321M
4000421M
5000521M
6000621M
7000721M
8000821M
9000921M
10001021M
11001121M
12001221M
13001321M
14001421M
15001521M

Q 3.      Copy the below data in a .txt and then import into SAS
Joseph,76,"Red Racers, Washington",AAA
Mitchel,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,74,"Green Gazelles, Atlanta",AA


Ans:

i) with Infile option:
Data Team;
infile '/folders/myfolders/Team.txt' dlm=',' dsd ;
Input Name: $20. Age Address: $50. Remarks: $10.;
run;

proc print data=Team; run;


In the code we have used dsd (delimeter sensitive data) in order to make sas understand not to consider comma of 'address' variable as delimeter.


ii)Proc Import
Title'Team';
Proc import datafile='/folders/myfolders/Team.txt' out=TeamImport dbms=dlm replace;
delimiter=',';
getnames=yes;
run;
proc print data=TeamImport; run;


output:
                                                                       Team



ObsNameAgeAddressRemarks
1Joseph76Red Racers, WashingtonAAA
2Mitchel82Blue Bunnies, RichmondAAA
3Sue Ellen74Green Gazelles, AtlantaAA

Q 4.      The following data gives the high and low trading prices in Canadian dollars for various stocks on a given day on the Toronto Stock Exchange. Enter these data into a SAS data set with three variables, Stock, Hi, and Low and 10 observations. Print the data set in the Output window to check that you have successfully entered it. Save the data set as a permanent SAS data set giving it the name stocks.
Stock Hi Low
ACR 7.95 7.80
MGI 4.75 4.00
BLD 112.25 109.75
CFP 9.65 9.25
MAL 8.25 8.10
CM 45.90 45.30
AZC 1.99 1.93
CMW 20.00 19.00
AMZ 2.70 2.30
GAC 52.00 50.25

Ans:

libname assign '/folders/myfolders/';


Data assign.Stocks;

input Stock $ Hi Low;
datalines;
ACR 7.95 7.80
MGI 4.75 4.00
BLD 112.25 109.75
CFP 9.65 9.25
MAL 8.25 8.10
CM 45.90 45.30
AZC 1.99 1.93
CMW 20.00 19.00
AMZ 2.70 2.30
GAC 52.00 50.25
;

run;

Proc Print data=assign.Stocks;
run;

Output:

                                                                     Stocks

ObsStockHiLow
1ACR7.957.80
2MGI4.754.00
3BLD112.25109.75
4CFP9.659.25
5MAL8.258.10
6CM45.9045.30
7AZC1.991.93
8CMW20.0019.00
9AMZ2.702.30
10GAC52.0050.25
5.       Save the above data in a text file (.txt) and import the file into SAS.
Ans:
Proc import datafile='/folders/myfolders/StocksImport.txt' out=Assign.StockImport dbms=dlm replace;
getnames=yes;
run;

Output:
                                                                 Stock Import

ObsStockHiLow
1ACR7.957.8
2MGI4.754
3BLD112.25109.75
4CFP9.659.25
5MAL8.258.1
6CM45.945.3
7AZC1.991.93
8CMW2019
9AMZ2.72.3
10GAC5250.25


6.       Calculate the average of the Hi and Low prices for all the stocks and save it in a variable called 
average. 
Ans:

data assign.Stocks;
set assign.stocks;
Average=mean(hi,low);
run;
proc print data=assign.Stocks;

run;

Output:                                                        Stocks
ObsStockHiLowAverage
1ACR7.957.807.875
2MGI4.754.004.375
3BLD112.25109.75111.000
4CFP9.659.259.450
5MAL8.258.108.175
6CM45.9045.3045.600
7AZC1.991.931.960
8CMW20.0019.0019.500
9AMZ2.702.302.500
10GAC52.0050.2551.125
                                                                                       

7.       Using the SAS data set created in Exercise 4, add the following stocks to the permanent data set.
Stock Hi Low
CLV 1.85 1.78
SIL 34.00 34.00
AC 14.45 14.05

Remove the variable average from the data set and save the data set.

Ans: In this we will be first creating the temp dataset and then merge it to the existing dataset.


Data StocksTemp;

input Stock $ Hi Low;
cards;
CLV 1.85 1.78
SIL 34.00 34.00
AC 14.45 14.05
;
run; 

Data Assign.stocks;

Set assign.stocks(drop=Average) Stockstemp;
run;

Proc Print data=assign.stocks;run;

Output:

                                                                                          Stocks

ObsStockHiLow
1ACR7.957.80
2MGI4.754.00
3BLD112.25109.75
4CFP9.659.25
5MAL8.258.10
6CM45.9045.30
7AZC1.991.93
8CMW20.0019.00
9AMZ2.702.30
10GAC52.0050.25
11CLV1.851.78
12SIL34.0034.00
13AC14.4514.05

Note: Rows hilighted in red are newly added rows


Q 8. Save the following data in a .txt file and import into SAS.

Name, Score, Team, Division
(data generated from the recent Athletics Meet)
Joseph,76,"Red Racers, Washington",AAA
Mitchel,82,"Blue Bunnies, Richmond",AAA

Sue Ellen,74,"Green Gazelles, Atlanta",AA
Ans:


Title'Personal Info';

Proc import datafile='/folders/myfolders/UPLDwithImport.txt' out=PersonalInfo dbms=dlm replace;
delimiter=',';
getnames=yes;
run;
proc print data=personalinfo; run;

Output:


                                                                                   Personal Info


ObsName_Score_Team_Division
1Joseph76Red Racers, WashingtonAAA
2Mitchel82Blue Bunnies, RichmondAAA
3Sue Ellen74Green Gazelles, AtlantaAA