Saturday, July 12, 2025

πŸ”„ Mastering PROC TRANSPOSE in SAS: Convert Rows to Columns and Vice Versa

PROC TRANSPOSE is a powerful SAS procedure used to reshape data by converting rows into columns or columns into rows. Whether you're preparing datasets for reporting or statistical analysis, PROC TRANSPOSE can simplify your task with just a few lines of code.

Proc Transpose in SAS - Datahark

In this blog, you'll learn:

  • What is PROC TRANSPOSE?
  • When and why to use it
  • Syntax and options
  • Multiple real-world examples
  • Tips and tricks for efficient use


πŸ” What is PROC TRANSPOSE in SAS?

PROC TRANSPOSE is used to pivot dataβ€”turning variables (columns) into observations (rows), or vice versa. It's especially useful for:

  • Summarizing repeated measures
  • Restructuring long or wide datasets
  • Preparing data for visualizations or modeling


πŸ“š Basic Syntax of PROC TRANSPOSE

proc transpose data=input_data out=output_data <options>;
by variable(s); * Optional: groups data; id variable; * Optional: names for new columns; var variable(s); * Variables to transpose; run;

πŸ“Œ Key Options Explained

OptionDescription
BYGroups data before transposing
VARSpecifies variables to transpose
IDUses values of a variable as new column names
NAME=Renames the default _NAME_ column
LABEL=Renames the default _LABEL_ column

βœ… Example 1: Transposing Without BY or ID

πŸ”Ή Input Data

data sales;
input Quarter $ Sales; datalines; Q1 100 Q2 120 Q3 140 Q4 160 ;

πŸ”Ή Transpose Code

proc transpose data=sales out=sales_transposed;
var Sales; run;

πŸ”Ή Output

NAMECOL1COL2COL3COL4
Sales100120140160

βœ… Example 2: Transposing with ID to Use Column Names

proc transpose data=sales out=sales_wide;
id Quarter; var Sales; run;

πŸ”Ή Output

NAMEQ1Q2Q3Q4
Sales100120140160

βœ… Example 3: Transpose with BY Grouping

πŸ”Ή Input Data

data student_scores;
input Student $ Subject $ Score; datalines; John Math 85 John English 78 John Science 92 Anna Math 88 Anna English 91 Anna Science 84 ;

πŸ”Ή Transpose Code

proc sort data=student_scores;
by Student; run; proc transpose data=student_scores out=scores_wide; by Student; id Subject; var Score; run;

πŸ”Ή Output

StudentEnglishMathScience
John788592
Anna918884

βœ… Example 4: Transposing Multiple Variables

data patient_data;
input ID $ Visit $ Height Weight; datalines; P1 Visit1 170 65 P1 Visit2 171 66 P2 Visit1 160 60 P2 Visit2 161 61 ;

πŸ”Ή Code

proc sort data=patient_data;
by ID; run; proc transpose data=patient_data out=trans_height prefix=Height_; by ID; id Visit; var Height; run; proc transpose data=patient_data out=trans_weight prefix=Weight_; by ID; id Visit; var Weight; run; data final_transposed; merge trans_height trans_weight; by ID; run;

πŸ”Ή Output

IDHeight_Visit1Height_Visit2Weight_Visit1Weight_Visit2
P11701716566
P21601616061

🧠 Tips for Using PROC TRANSPOSE

  • Always SORT your data before using BY.
  • Use the PREFIX= option to create meaningful column names.
  • Combine multiple transpositions for complex reshaping.
  • Use NAME= and LABEL= to rename the default variables _NAME_ and _LABEL_.


πŸ”Ž When to Use PROC TRANSPOSE

Use CasePROC TRANSPOSE?
Convert long to wide formatβœ… Yes
Convert wide to long formatβœ… Yes (reverse)
Reshape repeated measuresβœ… Yes
Change actual data values❌ No
Merge multiple reshaped tablesβœ… Yes

πŸ“ˆ Conclusion

PROC TRANSPOSE is an essential tool in any SAS programmer’s toolkit. It simplifies the process of reshaping data for reporting, analysis, and modeling. With a good understanding of BY, ID, and VAR options, you can handle almost any data transformation challenge.

Labels: , , , , , , , , , ,

Friday, July 11, 2025

How to Create Data Groups Using PROC FORMAT in SAS: Real-World Examples

Problem/Requirement -

Lets Suppose, We have Data in Numbers which could be range from 100 to 10000
now we want to create Groups of Each 100 Values, like below - 

Groups Example in SAS



Lets See with the Example | The Power of the FORMAT Procedure - 


/*This code will Generate some Numbers between 100 and 10000 to Get the Practice Dataset */

Data A;
Num=100;
Do while(Num<10000);
output;
Num=Num+54;
end;
Run;
Proc Print Data=A;
Run;

Creating Groups in SAS


Now we want to Group it in as per given in Requirement 

/* This Code will Genrate FMTName (Format Name), Start(Starting Number of Group), End(Ending Number of Group), Label(Group name for Numbers between Start and End Values)
*/
Data B;
Retain FMTName "Group";
Start=100;
i=1;
Do until(End>=10000);
End=Start+99;
Label=compbl("Group"||i);
Output;
Start=End+1;
i=i+1;
end;
drop i;
run;

Proc Print Data=B;
Run;

Grouping Data in SAS

Now if we need to create a User Defined Format -

/* This wil create a format for Above created B Dataset with Format Name "Group", PROC FORMAT with the CNTLIN= option is to build the formats from the imported control data set */

Proc Format CNTLIN=B;
Run; 

/*Lets see how it will print A Dataset if we apply Newly create Group format to Variable Num */

Proc Print Data=A;
Format Num Group.;
run; 

USE of Proc Format in SAS
/* If we want Actual Number and the Group name both the things then we can do it like this */

Data C;
Set A;
GroupName=Put(Num,Group.);
run;
/*Here we created a new Variable Group name and Copy value of Num with Group format, For this we used PUT Function */
Proc Print Data=C;
Run;
USE of Proc Format in SAS


I Hope you like this Article, please comment if you have any doubt or question.
Thank you

Labels: , , , , , , , , , , ,

Sunday, July 6, 2025

πŸ… Ultimate Guide to SAS Certifications: Base, Advanced, Data Scientist & Clinical Trials Programmer

Are you planning to build a career in SAS Programming, Analytics, or Clinical Research? Getting SAS certified is a powerful way to validate your skills and enhance job prospects across industries like healthcare, finance, retail, and pharma.

In this detailed guide, we’ll cover everything you need to know about the top SAS certifications:

SAS Certification

Let’s dive in!


πŸ§‘β€πŸ’» Base SAS Programming Specialist Certification

βœ… Who Should Take This?

Beginners or professionals who want to start their journey in SAS programming. It’s a foundational certification focusing on data step programming, importing/exporting data, basic reporting, and debugging.

πŸ“š Key Topics Covered:

  • SAS programming basics (DATA step, PROC step)
  • Reading and writing data
  • Combining datasets
  • Conditional logic
  • Basic statistical procedures
  • Data formatting and functions
πŸ“ Exam Details:

  • Exam Code: A00-231
  • Format: 40-45 multiple-choice and short-answer questions
  • Duration: 135 minutes
  • Passing Score: 725/1000
  • Fee: $180 (may vary by location)

πŸ“Œ Recommended Preparation:

  • Official SAS e-learning: SAS Base Programming Learning Path
  • Books: The Little SAS Book by Delwiche & Slaughter
  • Practice exams and sample questions available on the SAS portal.


🧠 Advanced Programming Professional Certification

βœ… Who Should Take This?

SAS users who already understand base programming and want to demonstrate expertise in macros, advanced functions, and optimizing code.

πŸ“š Key Topics Covered:

  • Creating and using SAS macros
  • Advanced DATA step programming
  • Error handling and debugging
  • PROC SQL and advanced querying
  • Creating reusable code

πŸ“ Exam Details:

  • Exam Code: A00-232
  • Format: 10–15 coding questions
  • Duration: 2 hours
  • Passing Score: 725
  • Fee: $180

πŸ“Œ Recommended Preparation:

  • SAS Advanced Programming Learning Path:
  • SAS Advanced Certification Prep
  • Practice with real-world datasets
  • Use SAS Studio or SAS OnDemand for Academics (free tool)

πŸ§ͺ Clinical Trials Programmer Professional Certification

βœ… Who Should Take This?

Designed for professionals working in clinical research and pharmaceutical domains. You’ll need knowledge of both SAS programming and CDISC standards (SDTM, ADaM).

πŸ“š Key Topics Covered:

  • Clinical trials data lifecycle
  • Generating TLFs (Tables, Listings, and Figures)
  • SDTM and ADaM data models
  • FDA submission standards
  • Clinical data cleaning and validation

πŸ“ Exam Details:

  • Exam Code: A00-282
  • Format: 50–60 multiple-choice and short-answer questions
  • Duration: 2 hours
  • Passing Score: 68%
  • Fee: $180

πŸ“Œ Recommended Preparation:


🧬 SAS Certified Data Scientist

βœ… Who Should Take This?

Mid to senior-level analysts or data professionals who want to master data wrangling, advanced analytics, machine learning, and model deployment using SAS.

🧩 It’s a bundled certification path that includes:

  • SAS Certified Advanced Analytics Professional
  • SAS Certified AI & Machine Learning Professional

πŸ“š Key Topics Covered:

  • Data exploration, preparation, and visualization
  • Predictive modeling and machine learning
  • Forecasting and optimization
  • Neural networks and natural language processing
  • Model deployment and model monitoring

πŸ“ Exam Details:

  • Total Exams: 5
  • Learning path includes hands-on case studies and labs
  • Duration: ~6-9 months with regular practice
  • Each exam: 2 hours, scenario-based

πŸ’‘ Learn More:


🎯 Which SAS Certification Should You Choose?

Career GoalRecommended Certification
SAS ProgrammerBase or Advanced SAS
Data AnalystAdvanced SAS or Data Scientist
Clinical ResearchClinical Trials Programmer
AI & Machine LearningSAS Certified Data Scientist
Entry-Level ProfessionalBase SAS Certification

πŸ’Ό Job Opportunities After SAS Certification

Earning a SAS certification opens doors to roles like:

  • SAS Programmer / Analyst
  • Clinical SAS Programmer
  • Data Scientist
  • Business Intelligence Analyst
  • Biostatistician

Top companies hiring SAS professionals include IQVIA, Accenture, Pfizer, Novartis, Deloitte, and Infosys.


πŸŽ“ Free Tools to Learn SAS

  1. SAS OnDemand for Academics – Free online SAS environment
  2. SAS Skill Builder for Students
  3. Free E-Learning Resources


πŸ” Conclusion

Whether you’re beginning your SAS journey or aiming to become a data science expert, SAS Certifications are a globally recognized badge of your expertise. Choose the one aligned with your career goals and get started with SAS’s official resources and training paths.

✨ Stay consistent, practice with real datasets, and join communities like SAS Support or Reddit's r/SAS to boost your confidence.


πŸ”— Quick Certification Links Recap:


Labels :

  • Base SAS Programming Specialist – Beginner level
  • Advanced SAS Programming Professional – Intermediate level
  • Clinical Trials Programmer – Intermediate to Advanced (industry-specific)
  • SAS Certified Data Scientist – Advanced level (data science & AI)

Labels: , , , , , ,

Friday, July 4, 2025

Getting Started with PROC FREQ in SAS: Your Guide to Frequency Analysis

When analyzing categorical data in SAS, one of the most powerful and commonly used procedures is PROC FREQ. Whether you're exploring survey data, customer demographics, or clinical trial results, PROC FREQ helps you understand the distribution and relationships of categorical variables with simplicity and precision.


In this post, we’ll walk through the fundamentals of PROC FREQ, its key options, and practical examples to help you get started.


What is PROC FREQ?

PROC FREQ stands for Procedure Frequency. It's used to:

  • Generate frequency tables
  • Calculate percentages
  • Compute cumulative frequencies
  • Create cross-tabulations (contingency tables)
  • Run chi-square tests and other statistical analyses

This procedure is most effective when working with categorical or ordinal variables.


Basic Syntax

PROC FREQ DATA=your_dataset;
TABLES variable1 variable2; RUN;

Example:

PROC FREQ DATA=sashelp.class;
TABLES sex; RUN;

This code will show how many males and females are in the sashelp.class dataset, along with percentages.


Key Features and Options

1. One-Way Frequency Table

Simple count of each unique value in a variable.

PROC FREQ DATA=sashelp.class;
TABLES age; RUN;

2. Two-Way Tables (Cross-tabulation)

Examine the relationship between two variables.

PROC FREQ DATA=sashelp.class;
TABLES sex*age; RUN;

This creates a two-way table showing the distribution of age within each gender.

3. NLEVELS Option

Gives the number of distinct levels (values) for each variable.

PROC FREQ DATA=sashelp.class NLEVELS;
TABLES sex age; RUN;


4.Β Table Statement Options

Option

Description

NOCUM

Suppresses cumulative frequency and cumulative percent columns.

NOPERCENT

Suppresses the percent and cumulative percent columns.

NOROW

Suppresses row percentages in a two-way table.

NOCOL

Suppresses column percentages in a two-way table.

LIST

Displays two-way table in list format instead of tabular format.

CROSSLIST

Outputs one row per combination of values (like list), but with statistics.

CHISQ

Performs chi-square tests of association for two-way tables.

FISHER

Performs Fisher's exact test (good for small sample sizes).

EXACT

Requests exact p-values for tests (useful when assumptions of chi-square aren't met).

TREND

Performs Cochran-Armitage trend test (for ordered categories).

EXPECTED

Displays expected cell frequencies under the null hypothesis.

CELLCHI2

Shows chi-square contribution of each cell.

ALL

Displays all available statistics.

MISSING

Includes missing values in frequency calculations.


5. Adding Percentages and Cumulative Stats

By default, PROC FREQ includes:

  • Frequency count
  • Percent
  • Cumulative frequency
  • Cumulative percent

These can be useful to quickly gauge distributions.

Examples

Basic one-way table without cumulative stats:

proc freq data=sashelp.class;
tables sex / nocum; run;

Two-way table with Chi-Square and Expected Values:

proc freq data=sashelp.class;
tables sex*age / chisq expected; run;

Frequency with weights:

proc freq data=mydata;
tables category; weight freq_count;
run;


Export frequency output to dataset:

proc freq data=sashelp.class;
tables age / out=age_freq; run; proc print data=age_freq; run;

Handling Missing Values:

PROC FREQ DATA=your_data; TABLES var1 / MISSING; RUN;

Click here to Read more Β»

Labels: , , , , ,

Monday, January 11, 2021

πŸ“ŠHow to Read Data in Fixed Columns Using Formatted Input Method | Real time SAS Problems

πŸ“Œ How to Read Data in Fixed Columns Using SAS

Working with fixed-width (or fixed-column) text files is a common task in SAS programming, especially when importing data from legacy systems or mainframes. Unlike delimited files (like CSV), fixed-column files store data in predefined character positions. In this guide, you'll learn how to read such data files effectively using SAS.

πŸ“‚ What is a Fixed-Width File?

A fixed-width file is a plain-text file where each data field occupies a specific number of characters. There are no delimiters like commas or tabs; instead, the position and length of each field are predefined.

Example:

101John 24Male
102Alice 28Female

  • ID: 1–3
  • Name: 4–11
  • Age: 12–13
  • Gender: 14–19


πŸ“₯ Reading Fixed Column Data Using INFILE and INPUT

SAS uses the combination of INFILE and INPUT statements to read fixed-column data.

βœ… Syntax:

DATA dataset_name;
INFILE 'file-path.txt'; INPUT variable1 $ start1-end1 variable2 $ start2-end2 ...; RUN;

  • $ indicates a character variable
  • start-end indicates the column positions


πŸ§ͺ Example: Reading Employee Data

Let’s say you have a file named employees.txt with the following structure:

101John 24Male
102Alice 28Female 103Steve 30Male

To read this in SAS:

DATA employees;
INFILE 'C:\SASData\employees.txt'; INPUT ID 1-3 Name $ 4-11 Age 12-13 Gender $ 14-19; RUN;

πŸ” Explanation:

  • ID is numeric (columns 1 to 3)
  • Name is character (columns 4 to 11)
  • Age is numeric (columns 12 to 13)
  • Gender is character (columns 14 to 19)


πŸ“Ž Tips for Handling Fixed-Width Files

  • Ensure each line in the text file has a uniform width.
  • Use $ in the INPUT statement to define character fields.
  • Always check column positions carefullyβ€”especially if fields are padded with spaces.
  • Use LRECL (logical record length) in INFILE if the line length is long:


INFILE 'C:\SASData\employees.txt' LRECL=100;

πŸ” Alternative: Using COLUMN POINTERS (Formatted Input Method)

You can also use column pointers (@) to specify exact starting positions for each field.

DATA employees;
INFILE 'C:\SASData\employees.txt'; INPUT @1 ID 3. @4 Name $ 8. @12 Age 2. @14 Gender $ 6.; RUN;

  • @n tells SAS where to begin reading.
  • The numeric formats like 3. or 8. specify field lengths.

More on Reads data from fixed columns

  • Reads character and numeric data including standard and nonstandard numerical values, such as numbers with dollar signs and commas 
  • Reads dates in different formats, not like column input which can only read dates as character values
  • It is the most common and powerful of all the input methods
  • Any time you have nonstandard data in fixed columns, you should consider using formatted input to read the file.

Let's see More Example  -

Suppose we have this raw data stored in any flat file -

1234567890123456789012 ---- Columns indicator, not part of the data

Tim  M14510/21/1978

Sara  13009/20/1964

Mike M18011/23/1965

LauraF13011/06/1980

Sean M16704/07/2000


The variables information is listed as below. 

There are 4 variables in the data - 

Name: occupies total five columns, column 1-5

Gender: occupies total one column, column 6

Weight: occupies three columns, column 7-9

DOB: occupies ten columns, column 10 - 19


data Form_input;
infile "/folders/myfolders/DATA_column.txt";
input 
@1 Name $ 5.
@6 Gender   $  1.
@7 Weight 3.
@10 DOB     mmddyy10.;
Run;

proc print data=sdata_column;
run;

Reading Raw Data
Explanation - 

1. We Used @ to specify the starting position of every variable 

2. Also used SAS Informats

Informats are built-in instructions that tell SAS how to read a data value. The choice of which informat to use is determined by the data.

For standard numeric and character values, two basic informats are w.d and $w.

The w.d Informat: reads standard numeric values. The w tells SAS how many columns to read. The optional d tells SAS that there is a decimal point in the value, the decimal point counts as one of the columns to be read. For example, the variable Weight containing 3 digits values (such as 145 lbs) has an informat of 3. ; Another example, value 2.1 takes up 3 column spaces, 2 digits + 1 decimal point, so it has an informat of 3.1

 

For date: the MMDDYY10. informat tells SAS that the date you are reading is in the mm/dd/yyyy form, 10 means this form takes up 10 columns including 8 digits and 2 slash signs. For example, the variable DOB has values like 10/21/1978. SAS reads the date and converts the value into a SAS date. SAS stores dates as numeric values equal to the number of days from January 1, 1960. So, if you read the value 01/01/1960 with the MMDDYY10. informat, SAS stores a date value of 0. The date 01/02/1960 read with the same informat would result in a date value of 1, and so forth. 

 

That’s why if you look at the output of above program than DOB displays as some numbers instead of dates, TO Display DOB as Date Value we need to use the Format.

Example - 

proc print data=Form_input;
format  DOB  mmddyy10.;
run;

Reading raw data by datahark


πŸ“Š Output Dataset

After reading the data, you can view the dataset using:

PROC PRINT DATA=employees;
RUN;

βœ… Summary

TaskSAS Statement
Specify file locationINFILE
Read fixed columnsINPUT with positions
Read character dataAdd $ after variable
Use column positionsUse ranges like 1-3 or pointers like @1
View datasetPROC PRINT

πŸ“Œ Conclusion

Reading fixed-column data in SAS is straightforward once you understand the layout of the file. By using INFILE, INPUT, and positional referencing, you can efficiently convert raw text files into structured SAS datasets. This skill is especially important for data analysts working with legacy systems or industry-standard text data formats.


Search Tags:

SAS Fixed Column Input
Read Fixed Width Files in SAS
SAS INFILE Statement
SAS INPUT Statement
Column Pointer in SAS
SAS Text File Import
SAS Programming Basics
Importing Data in SAS
Legacy Data Processing SAS
SAS Tutorial for Beginners

Labels: , , , , , , , ,