Datasets: Cases by Variables
| Variable 1 | Variable 2 | Variable 3 | |
(one row per case)
In SAS (or SPSS) the basic data unit is the case.
(This is a leftover from the era of computer cards.)
Data is stored in cases-by-variables form.
Next we will review basic SAS procedures
by example.
1. From a file
OPTIONS PS=55 LS=80;
FILENAME myf 'crime.dat';      /* fileref for the raw data file */
DATA CRIME;
   INFILE myf;                 /* equivalently: INFILE 'crime.dat'; */
   INPUT MURDER RAPE ROBBERY
         ASSAULT BURGLARY LARCENY AUTOTHFT REGION $;
RUN;
2. Dataset inside the program file:
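As a minimal sketch of this second option, the data lines can follow a CARDS statement inside the DATA step. The variable names are those of the CRIME example; the two data lines and the region labels are made-up values for illustration only, not real crime figures.

```sas
/* Sketch: the data lines sit in the program itself, after CARDS. */
/* The two data lines below are hypothetical illustration values. */
DATA CRIME;
   INPUT MURDER RAPE ROBBERY
         ASSAULT BURGLARY LARCENY AUTOTHFT REGION $;
   CARDS;
1.0 10.0 20.0 100.0 500.0 1500.0 200.0 NORTH
2.0 15.0 40.0 150.0 700.0 2000.0 300.0 SOUTH
;
RUN;

/* List the cases to check that they were read as intended. */
PROC PRINT DATA=CRIME;
RUN;
```

CARDS must be the last statement of the DATA step, and a line containing only a semicolon marks the end of the data.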
3. From a dBASE file:
filename product 'ULTIMVN4.DBF';
proc dbf db4=product out=new;
run;
data temp;
   set new;
   /* A subsetting IF keeps a case only when the condition holds; */
   /* here cases 6, 7, 77, 78, 101 and 143 are dropped.           */
   if _N_ not in (6, 7, 77, 78, 101, 143);
   lage = log(age18t24);       /* natural log of age18t24 */
run;
4. Reading a file that is compressed:
This is from a SAS program that reads the 5% sample from
the US Census.
infile in lrecl=193;
input RECTYPE $ 1 @;           /* trailing @ holds the record for the next INPUT */
if RECTYPE = 'H' then
   do;
      input
         SAMPLE $ 2  DIVISION $ 3  STATE $ 4-5
         COGRP $ 6-8  AREATYPE $ 9  SMSA $ 10-13
         .............................
         ARENT1 $ 162  FILLER $ 163-193;
   end;
else if RECTYPE = 'P' then
   do;
      input
         RELAT1 $ 2-3  RELAT2 $ 4  SUBFAM1 $ 5
         SUBFAM2 $ 6  SEX $ 7  AGE 8-9  QTRBIR $ 10
         ...............................
         AINCOME6 $ 192  AINCOME7 $ 193;
   end;
run;
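The excerpt above does not show how the fileref `in` is defined. One common way to read a compressed file on a Unix system is to let SAS read the output of a decompression command through a pipe. The sketch below assumes a Unix host, a hypothetical file name `census.dat.Z`, and the `zcat` command; none of these come from the original program.

```sas
/* Sketch, assuming Unix SAS: the PIPE device runs the command and */
/* lets INFILE read its standard output as if it were a file.      */
/* 'census.dat.Z' and zcat are assumptions for illustration.       */
filename in pipe 'zcat census.dat.Z';

data rectypes;
   infile in lrecl=193;
   input RECTYPE $ 1;          /* read only the record-type column */
run;

/* Count how many records of each type were read. */
proc freq data=rectypes;
   tables RECTYPE;
run;
```

With this approach the uncompressed file never has to be written to disk.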
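title 'Generate a random sample of size 10';
options ls=80 ps=65;
data records;
   infile 'records.txt';
   input name $;
   id = _N_;                   /* original position in the file */
   x = uniform(0);             /* random sort key, uniform on (0,1) */
run;
proc print data=records;
run;
proc sort data=records;        /* put the cases in random order */
   by x;
run;
data sample;
   set records;
   if _N_ <= 10;               /* keep the first 10 shuffled cases */
run;
proc print data=sample;
run;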
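As an alternative to sorting on a random key, a simple random sample can also be drawn with PROC SURVEYSELECT, which is part of SAS/STAT. This is a different technique from the program above, shown only as a sketch. The stand-in dataset of 100 cases, the output name `sample2`, and the seed value are assumptions chosen for illustration.

```sas
/* Stand-in data: 100 cases numbered 1 to 100 (hypothetical). */
data records;
   do id = 1 to 100;
      output;
   end;
run;

/* Simple random sample of 10 cases without replacement.      */
/* The seed is arbitrary; fixing it makes the draw repeatable. */
proc surveyselect data=records out=sample2
                  method=srs sampsize=10 seed=2024;
run;

proc print data=sample2;
run;
```

Unlike the sort-based approach, this leaves the order of the original dataset unchanged.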
The standard setup in many statistical problems is a dataset that
contains several predictors and one or more responses.
libname mylib spss 'Reg.por';  /* SPSS portable file read through the SPSS engine */
options linesize=70 pagesize=55;
data a;
   format IN80BO IN80C IN90C INBBID
          INBP90 INCENP INCODE INCR80 INCR90
          INEMP INMAIN INMAIN2 INMAIN3 INNV INO
          INREST INT79 INT80 INT90 INV90
          INVAL SEV79 SEV80 SEVR80 SFAMV VAPT79
          VAPT80 VCOM79 VCOM80 VFRM79
          VFRM80 VIND79 VIND80 VRES79 VRES80 VTOT79
          VTOT80 VVAL79 VVAL80 comma8.
          NAMEV1 $CHAR200.;
   set mylib._first_;          /* first dataset in the SPSS file */
   if _N_ < 100;               /* keep the first 99 cases */
run;
proc plot;
   plot ravlchm*(roadacc roadcap transacc transcap);
run;
proc reg data=a;
   model ravlchm = hhinc black hslds
                   setr vac indexr pstu
                   roadacc roadcap transacc transcap / P R COLLIN;
run;
We will look at regression analysis in the next chapter.
PROC GLM;
   CLASS REGION;
   MODEL MURDER RAPE ROBBERY ASSAULT BURGLARY = REGION;
   MEANS REGION / BON DUNCAN SCHEFFE;
   TITLE 'CRIME INCIDENCE IN THE UNITED STATES BY REGION';
run;