Tuesday, August 5, 2025

🔄 Mastering IF-THEN-ELSE and DO Loops in SAS – Complete Guide with Examples

🔰 Introduction

Conditional logic and looping are core concepts in any programming language. In SAS, IF-THEN-ELSE statements and DO loops control the flow of your DATA step code. This guide explains how to use them effectively, with worked examples.



🧩 1. IF-THEN-ELSE in SAS

The IF-THEN-ELSE statement allows you to execute specific code based on conditions.

✅ Syntax:

IF condition THEN action;
ELSE IF condition THEN action;
ELSE action;

📌 Example 1: Simple IF-THEN-ELSE

data class_flag;
    set sashelp.class;
    if age < 13 then group = 'Child';
    else if age < 18 then group = 'Teen';
    else group = 'Adult';
run;

Explanation:
Classifies students into Child, Teen, or Adult groups based on age.


🧪 More IF-THEN-ELSE Examples in SAS


📌 Example 1: Assign Grades Based on Scores

data grades;
    input name $ score;
    if score >= 90 then grade = 'A';
    else if score >= 80 then grade = 'B';
    else if score >= 70 then grade = 'C';
    else if score >= 60 then grade = 'D';
    else grade = 'F';
    datalines;
John 85
Sara 92
Alex 67
Nina 74
Bob 58
;
run;

📌 Example 2: Handle Missing Values in Conditions

data test_missing;
    input id age;
    if age = . then status = 'Missing';
    else if age < 18 then status = 'Minor';
    else status = 'Adult';
    datalines;
1 25
2 .
3 17
;
run;

📌 Example 3: Create Flags for Categorical Variables

data product_flag;
    /* List input defaults to 8 characters; 'Electronics' needs 11 */
    length product $ 12 category $ 12;
    input product $ category $;
    if category = 'Electronics' then flag = 1;
    else flag = 0;
    datalines;
Laptop Electronics
Shoes Apparel
Phone Electronics
Watch Accessories
;
run;

📌 Example 4: Nested IF-THEN-ELSE Logic

data nested_logic;
    /* Declare the length so 'Check city' (10 characters) is not truncated */
    length warning $ 10;
    input city $ temp;
    if city = 'Delhi' then do;
        if temp > 40 then warning = 'Heatwave';
        else warning = 'Normal';
    end;
    else warning = 'Check city';
    datalines;
Delhi 45
Delhi 30
Mumbai 32
;
run;

📌 Example 5: Use with Multiple Variables

data risk_check;
    /* Declare the length so 'Medium' is not truncated to 'Medi' */
    length risk $ 6;
    input age income;
    if age < 25 and income < 30000 then risk = 'High';
    else if age >= 25 and income < 30000 then risk = 'Medium';
    else risk = 'Low';
    datalines;
22 25000
28 28000
35 60000
;
run;

📌 Example 6: Case-Insensitive Character Comparison

data department;
    input empname $ dept $;
    if upcase(dept) = 'HR' then team = 'Human Resources';
    else team = 'Other';
    datalines;
John HR
Alex finance
Sara hr
;
run;

📌 Example 7: Assign Labels to Numeric Ranges

data salary_bracket;
    /* Declare the length so 'Medium' and 'High' are not cut to 3 characters */
    length bracket $ 6;
    input empid salary;
    if salary < 30000 then bracket = 'Low';
    else if 30000 <= salary < 60000 then bracket = 'Medium';
    else bracket = 'High';
    datalines;
101 25000
102 32000
103 60000
104 58000
;
run;
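
One thing to watch in label-assignment examples like these: SAS fixes a character variable's length the first time the variable appears in the step, so a first assignment of 'Low' gives a 3-character variable and longer labels are silently truncated. The sketch below is illustrative only; the dataset and variable names (length_demo, short_label, full_label) are made up for the comparison.

```sas
/* Sketch: the first assignment decides a character variable's length. */
data length_demo;
    length full_label $ 6;        /* declared up front: room for 'Medium' */
    input salary;

    /* short_label gets length 3 from 'Low', so 'Medium' is stored as 'Med' */
    if salary < 30000 then short_label = 'Low';
    else if salary < 60000 then short_label = 'Medium';
    else short_label = 'High';

    /* full_label was declared as $6, so every label fits */
    if salary < 30000 then full_label = 'Low';
    else if salary < 60000 then full_label = 'Medium';
    else full_label = 'High';
    datalines;
25000
32000
60000
;
run;

proc print data=length_demo;
run;
```

Comparing the two columns in the printed output shows why a LENGTH statement before the first assignment is the safer habit.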

📌 Example 8: IF Without ELSE (Slower)

data no_else;
    set sashelp.class;
    if age < 12 then group = 'Preteen';
    if age >= 12 then group = 'Teen';
run;

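💡 Best Practices:

  • Use IF-THEN/ELSE instead of multiple IF statements for mutually exclusive conditions. SAS stops evaluating the chain at the first true condition, whereas separate IF statements are all evaluated for every row.
  • When checking for missing numeric values, use IF var = . (or the MISSING function), and test for missing before any < comparison, because a missing value compares lower than every number.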
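
As a small illustration of the missing-value advice, the sketch below uses the MISSING function, which works for both numeric and character variables. The dataset name missing_demo and its columns are invented for this example.

```sas
/* Sketch: testing for missing values before range comparisons. */
data missing_demo;
    length status $ 7;
    input id age name $;

    /* MISSING() returns 1 for a numeric missing (.) or a blank character value */
    age_missing  = missing(age);
    name_missing = missing(name);

    /* Check for missing first: "age < 18" alone is also true for a missing age */
    if missing(age) then status = 'Missing';
    else if age < 18 then status = 'Minor';
    else status = 'Adult';
    datalines;
1 25 Ann
2 . Raj
3 17 .
;
run;
```

If the first two conditions were swapped, the row with the missing age would be labelled 'Minor', which is rarely what is intended.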


🔄 2. DO Loops in SAS

DO loops are used to repeat code a specified number of times or while a condition is true.


✅ 2.1 DO Loop Syntax

do index = start to end;
    /* repeated statements */
end;

📌 Example 2: Basic DO Loop

data loop_example;
    do i = 1 to 5;
        square = i**2;
        output;
    end;
run;

Explanation:
Generates a dataset with numbers 1 to 5 and their squares.


✅ 2.2 DO WHILE Loop

data do_while;
    x = 1;
    do while (x < 5);
        square = x**2;
        output;
        x + 1;    /* sum statement: adds 1 to x */
    end;
run;

✅ 2.3 DO UNTIL Loop

data do_until;
    x = 1;
    do until (x > 5);
        cube = x**3;
        output;
        x + 1;    /* sum statement: adds 1 to x */
    end;
run;
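
The practical difference between the two forms is when the condition is checked: DO WHILE tests it at the top of each iteration, while DO UNTIL tests it at the bottom, so a DO UNTIL body always runs at least once. A minimal sketch of that difference, using a made-up dataset name (while_vs_until) and a starting value chosen so that both conditions are already decided:

```sas
/* Sketch: DO WHILE may run zero times; DO UNTIL runs at least once. */
data while_vs_until;
    /* Condition is checked first and is already false, so nothing is output */
    x = 10;
    do while (x < 5);
        loop = 'WHILE';
        output;
        x + 1;
    end;

    /* Body runs once, then x > 5 is checked and the loop ends */
    x = 10;
    do until (x > 5);
        loop = 'UNTIL';
        output;
        x + 1;
    end;
run;
```

The resulting dataset has a single row, written by the DO UNTIL loop.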

🧠 Combining IF and DO Loops

data even_numbers;
    do i = 1 to 10;
        if mod(i, 2) = 0 then output;
    end;
run;

Explanation:
Generates only even numbers between 1 and 10 using both DO loop and IF condition.


⚠️ Common Mistakes to Avoid

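| Mistake | Fix |
|---|---|
| Missing OUTPUT in DO loop | Add output; inside the loop when each iteration should write a row |
| Forgetting semicolons | End each SAS statement with ; |
| Using multiple IF instead of IF-THEN-ELSE | Chain mutually exclusive conditions with ELSE IF; separate IFs are all evaluated and can slow performance |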

🔚 Conclusion

Mastering IF-THEN-ELSE and DO loops in SAS empowers you to create dynamic, flexible, and readable data processing routines. Whether you're classifying data, creating new variables, or iterating through logic, these tools are fundamental to writing efficient SAS programs.


Monday, August 4, 2025

📊 PROC RANK in SAS – Rank, Percentile, and Group Your Data Easily

Introduction

In data analysis, ranking values is essential for identifying top performers, segmenting data, and calculating percentiles. PROC RANK in SAS makes this process easy by assigning ranks, percentiles, or group numbers to numeric variables.



🔧 Syntax of PROC RANK

PROC RANK DATA=input_dataset OUT=output_dataset
          <TIES=LOW|HIGH|MEAN|DENSE> <GROUPS=n>;
    BY group_variable;
    VAR variable_to_rank;
    RANKS rank_variable;
RUN;

Key Options Explained:

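| Option / Statement | Description |
|---|---|
| DATA= | Input dataset |
| OUT= | Output dataset with the new rank variable |
| RANKS | Statement naming the new variable that stores the rank |
| TIES= | Specifies how tied values are handled (default is MEAN) |
| BY | Statement that performs ranking within each BY-group |
| VAR | Statement naming the variable to rank |
| GROUPS= | Divides data into n approximately equal-sized groups (such as quartiles or deciles), numbered 0 to n-1 |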

📌 Example 1: Basic Ranking

proc rank data=sashelp.class out=ranked_class;
    var height;
    ranks height_rank;
run;

Explanation:
Ranks students in sashelp.class by their height, storing the result in height_rank.


📌 Example 2: Ranking within Groups

proc sort data=sashelp.class out=sorted_class;
    by sex;
run;

proc rank data=sorted_class out=ranked_sex;
    by sex;
    var weight;
    ranks weight_rank;
run;

Explanation:
Ranks weight within each sex group.


📌 Example 3: Create Percentile or Quantile Groups

proc rank data=sashelp.class out=grouped_class groups=4;
    var age;
    ranks age_quartile;
run;

Explanation:
Divides age into 4 quartile groups (0 to 3).


📌 TIES= Option in Action

proc rank data=sashelp.class out=ranked_ties ties=low;
    var height;
    ranks height_rank;
run;

TIES= Options:

  • LOW – Lowest rank for all ties
  • HIGH – Highest rank for all ties
  • MEAN – Average rank (default)
  • DENSE – No gaps between ranks
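
To see the four settings side by side, here is a small sketch on a made-up dataset (tie_demo) with one tied pair. The dataset and rank variable names are illustrative, and the ranks in the comments follow from the definitions in the list above.

```sas
/* Sketch: the same four scores ranked under each TIES= setting. */
data tie_demo;
    input score @@;
    datalines;
10 20 20 30
;
run;

/* LOW: tied values share the lowest rank -> 1 2 2 4 */
proc rank data=tie_demo out=r_low ties=low;
    var score;
    ranks rank_low;
run;

/* HIGH: tied values share the highest rank -> 1 3 3 4 */
proc rank data=tie_demo out=r_high ties=high;
    var score;
    ranks rank_high;
run;

/* MEAN: tied values share the average rank -> 1 2.5 2.5 4 */
proc rank data=tie_demo out=r_mean ties=mean;
    var score;
    ranks rank_mean;
run;

/* DENSE: ties share a rank and no rank is skipped -> 1 2 2 3 */
proc rank data=tie_demo out=r_dense ties=dense;
    var score;
    ranks rank_dense;
run;
```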


✅ When to Use PROC RANK

  • Ranking top N values
  • Creating quantile-based bins (e.g., deciles, quartiles)
  • Calculating percentiles
  • Segmenting customers or products
  • Normalizing scorecards


🧠 Tips for Using PROC RANK

  • Always sort the dataset before using BY.
  • Use GROUPS= for percentiles or bucketing.
  • For multiple variables, list them in one VAR statement and give the matching rank names, in the same order, in the RANKS statement.
  • Combine with PROC SQL or PROC PRINT for better reporting.
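
A short sketch of ranking several variables in one step and then reporting the top values. It uses the DESCENDING option, which does not appear in the examples above, so that rank 1 goes to the largest value; the output dataset and rank variable names are made up for illustration.

```sas
/* Sketch: rank two variables at once, largest value first. */
proc rank data=sashelp.class out=ranked_multi descending ties=low;
    var height weight;                 /* variables to rank */
    ranks height_rank weight_rank;     /* rank names, in the same order as VAR */
run;

/* Report the three tallest students (more rows may appear if there are ties) */
proc print data=ranked_multi;
    where height_rank <= 3;
    var name height height_rank weight weight_rank;
run;
```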


📎 Final Thoughts

PROC RANK is a powerful yet simple procedure in SAS that enables effective data ranking and segmentation. It's especially useful in scoring, customer segmentation, and exploratory data analysis.
