To remove all non-printable characters from your string i.e all ASCII's below 32 and above 126, you can do by providing "KW" (keep writable) as THIRD argument to compress function.
Example
data eject_a_page;
name = "Vikas" || '12'x || " Gaddu";
newname = compress(name,,"kw");
run;
Wednesday, November 18, 2009
do leave loop and continue !!!
There are two statements in SAS, leave and continue. Difference between two is seldom not understood.
data leave;
do i = 1 to 5;
do j = 1 to 5;
if j = 2 or j = 4 then leave;
output;
end;
end;
run;
Here SAS will leave inner do loop completely once the first condition j=2 gets satisfied. So in output you will never have j = 2 or 3 or 4 or 5.
OUTPUT
i j
1 1
2 1
3 1
4 1
5 1
Same program with continue statement.
data continue;
do i = 1 to 5;
do j = 1 to 5;
if j = 2 or j = 4 then continue;
output;
end;
end;
run;
Here SAS will skip the inner loop for both conditions j = 2 and j=4, but it will not completely leave the loop. So ouput for this will look like
i j
1 1
1 3
1 5
2 1
2 3
2 5
. .
. .
. .
5 3
5 5
data leave;
do i = 1 to 5;
do j = 1 to 5;
if j = 2 or j = 4 then leave;
output;
end;
end;
run;
Here SAS will leave inner do loop completely once the first condition j=2 gets satisfied. So in output you will never have j = 2 or 3 or 4 or 5.
OUTPUT
i j
1 1
2 1
3 1
4 1
5 1
Same program with continue statement.
data continue;
do i = 1 to 5;
do j = 1 to 5;
if j = 2 or j = 4 then continue;
output;
end;
end;
run;
Here SAS will skip the inner loop for both conditions j = 2 and j=4, but it will not completely leave the loop. So ouput for this will look like
i j
1 1
1 3
1 5
2 1
2 3
2 5
. .
. .
. .
5 3
5 5
Thursday, June 25, 2009
SUBSTR function can be used on both side of assignment statment
data name;
name = "vikus";
substr(name,4,1) = "a";
run;
proc print ;
run;
This will change u to a in name variable.
data name;
name = "Vikas Gaddu";
firstname = substr(name,1,5);
run;
proc print;
run;
Firstname variable will contain Vikas.
name = "vikus";
substr(name,4,1) = "a";
run;
proc print ;
run;
This will change u to a in name variable.
data name;
name = "Vikas Gaddu";
firstname = substr(name,1,5);
run;
proc print;
run;
Firstname variable will contain Vikas.
Wednesday, June 24, 2009
Creating Special Missing Values
data student_score;
student = 1; score=50; output;
student = 2; score =.a; output;
student = 3; socre=.n; output;
run;
proc print;
run;
Here special numeric value a means that student was absent and n means incomplete test. So single missing values can have multiple meaning in SAS.
student = 1; score=50; output;
student = 2; score =.a; output;
student = 3; socre=.n; output;
run;
proc print;
run;
Here special numeric value a means that student was absent and n means incomplete test. So single missing values can have multiple meaning in SAS.
Tuesday, June 16, 2009
Name statements that function at both compile and execution time.
DATA statement at compile time will tell the compiler name of the dataset to be filled in dataset descriptor portion. At run time it will tell compiler when to re-initialize user created variable.
Identify statements whose placement in the DATA step is critical.
1> if restriction_criteria; statement
2> output; statement
3> assignment statments eg
if minage < 20 ;
minage = min(age); /* This statment should be above IF statment*/
2> output; statement
3> assignment statments eg
if minage < 20 ;
minage = min(age); /* This statment should be above IF statment*/
Name statements that are recognized at compile time only?
KEEP, DROP, LENGHT, ATTRIB, Syntax errors, BY, RETAIN, WHERE, FORMAT, INFORMAT, ARRAY
At compile time when a SAS data set is read, what items are created?
PDV is intialized, INPUT buffer size, descriptor portion of dataset is set like name of dataset, engine, host operating system etc.
Does SAS ‘Translate’ (compile) or does it ‘Interpret’? Explain.
SAS is a compiler. SAS does not read and execute one statement at a time. It will either wait for a RUN, QUIT, next DATA or PROC statement before executing its statment. This process is similar to a compiler than a interpreter.
What is the Program Data Vector (PDV)? What are its functions?
PDV is internal temparary memory which SAS uses to store one observation at one time from input dataset. It also has some automatic system flag variables like _ERROR_ or _N_ etc.
How would you code a merge that will keep only the observations that have matches from both sets.
data c;
merge a(in=ina) b(in=inb);
by commonvar;
if ina and inb;
run;
merge a(in=ina) b(in=inb);
by commonvar;
if ina and inb;
run;
Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.
proc sort data = mydata;
by state district county _numeric_;
run;
by state district county _numeric_;
run;
When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?
SCAN will scan for word.
INDEX will search for the pattern so will be the best.
INDEXC will search for the character, so not that useful in this case.
INDEX will search for the pattern so will be the best.
INDEXC will search for the character, so not that useful in this case.
How would you include common or reuse code to be processed along with your statements?
Create a autocall macro with reuse code to be used anywhere.
What is the one statement to set the criteria of data that can be coded in any step?
%LET condition = where age > 70;
data age70;
set demog;
&condition;
run;
%LET statement criteria can be used in any step!!!
data age70;
set demog;
&condition;
run;
%LET statement criteria can be used in any step!!!
If you’re not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set?
DATA _NULL_; statment will do that.
If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?
Not Sure. Maybe we can use _INFILE_ automatic variable.
Under what circumstances would you code a SELECT construct instead of IF statements?
If I am comparing numeric variable then it is faster to use SELECT than IF statment.
What is the purpose of the trailing @? The @@? How would you use them?
@ to input multiple lines in rawdata to single observation in a dataset
@@ to input single line in rawdata to multiple observations in a dataset
@@ to input single line in rawdata to multiple observations in a dataset
Name and describe three SAS functions that you have used, if any?
PROPCASE : Converts first alphabet of every our upcase.
SCAN: searches for a word in a string.
SMALLEST: Find smallest non-missing variable from an array or list of variables.
SCAN: searches for a word in a string.
SMALLEST: Find smallest non-missing variable from an array or list of variables.
What is the difference between an informat and a format? Name three informats or formats
INFORMAT is how you want to read a data and FORMAT is how you want to write a data.
User created FORMATS and INFORMATS can be created using PROC FORMAT. Some defaults INFORMATS are date9., dollar8.2, comma8.2 and these can also be used as FORMATS.
User created FORMATS and INFORMATS can be created using PROC FORMAT. Some defaults INFORMATS are date9., dollar8.2, comma8.2 and these can also be used as FORMATS.
If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn’t have a value?
You need to use MISSOVER option in INFILE statment.
Are you familiar with special input delimiters? How are they used?
On INFILE statement you can use DLM option to specify any special input delimiters.
Thursday, April 30, 2009
CALL SYMPUT and CALL SYMPUTX
CALL SYMPUT will convert data step variable value into marco variable value and it will store value of in nearest non-missing symbol table. In case below nearest non-missing symbol table is local symbol table of marco name. When we try to access name macro variable outside the macro SAS issues a warning.
%macro name(string=);
data _null_;
name = "&string";
call symput("name",name);
run;
%put &name;
%mend;
%name (string= Vikas);
%put &name;
/*WARNING: Apparent symbolic reference NAME not resolved.*/
CALL SYMPUTX new in SAS v9 can have an extra third argument where you can specify which symbol table you want SAS to store value of this macro variable. Code below is same as above except we are using call symputx and third argument saying we want macro variable to be saved in global symbol table. So this time around we don't get the Warning message.
%macro name(string=);
data _null_;
name = "&string";
call symputx("name",name,'G');
run;
%put &name;
%mend;
%name (string= Vikas);
%put &name;
SAS to keep only valid characters in a String
/* These are the only valid characters */
%let validchars = 'a' 'b' '1' '2' '_' '$';
data validchars;
string = "abc123_*";
str_len = length(string );
do while( str_len > 0);
if substr(string, str_len,1) not in (&validchars )then substr(string,str_len,1) = " ";
str_len = str_len -1;
end;
string = compbl(string);
/*Removing multiple blanks */
run;
proc print;
run;
%let validchars = 'a' 'b' '1' '2' '_' '$';
data validchars;
string = "abc123_*";
str_len = length(string );
do while( str_len > 0);
if substr(string, str_len,1) not in (&validchars )then substr(string,str_len,1) = " ";
str_len = str_len -1;
end;
string = compbl(string);
/*Removing multiple blanks */
run;
proc print;
run;
Assigning multiple formats to one variable depending on its value using PROC FORMAT
proc format;
value mydates
'1Jan2009'd - '31jan2009'd = [mmddyy8.]
'1feb2009'd - '28feb2009'd = [date7.]
;
run;
data dates;
first = '15jan2009'd ; output;
first = '15feb2009'd; output;
format first mydates.;
run;
proc print;
run;
Subscribe to:
Posts (Atom)