Tuesday, January 5, 2021

APPEND DATASETS USING SET | MULTIPLE SET | PROC APPEND | PROC SQL UNION

Here we are going to Learn how to Combine/Stack/Append Datasets in SAS, See below Picture to understand- What is Append


Append Datasets In SAS Datahark

SAS has multiple Methods to append datasets, these includes - 

  • Using Set in Datastep
  • Proc Append
  • Proc SQL
  • Merge

Example 1 - Using Set in Datastep 

Data Male;
Input Name $ Age;
Datalines;
Mark 45
Peter 60
Jack 20
;
run;

Data Female;
Input Name $ Age;
Datalines;
Lisa 45
Jane 60
Julie 20
Ruma 42
;
run;

Data Combined;
set Male Female;
run;

Append with Set

Lets See what happen if we write multiple SET statements - 

Data Combined;
set Male;
set Female;
run;

Multiple Set Statements

Here First it copied Male Dataset into Combined (All Three rows) and then Override all three rows of combined dataset with Female Dataset's First Three rows

Example 2 - Using PROC APPEND

Data Male;
Input Name $ Age;
Datalines;
Mark 45
Peter 60
Jack 20
;
run;

Data Female;
Input Name $ Age;
Datalines;
Lisa 45
Jane 60
Julie 20
Ruma 42
;
run;

Proc Append Base= Male Data=Female;
run;

Proc Print Data=Male;
run;

Proc Print Data=Female;

Data Male;
Input Name $ Age;
Datalines;
Mark 45
Peter 60
Jack 20
;
run; 

Data Female;
Input Name $ Age;
Datalines;
Lisa 45
Jane 60
Julie 20
Ruma 42
;
run;

Proc Append Base= Male Data=Female;
run;
Proc Print Data=Male;
run;
Proc Print Data=Female;
run; 

PROC Append

The above proc appended rows of Female Dataset into Male Dataset. It is faster than SET statement as here the second dataset is just appended to the first one and the SAS do not have to process the first dataset in the PDV, Essentially what happens with set or merge is that you bring both the datasets into PDV and then setting is done, the append does not do that it just add the second dataset to base one with out bringing the base dataset to PDV.

Few points for Proc Append - 

  • If PROC APPEND cannot find the BASE= data set, SAS creates it
  • If a DATA= data set is not specified, SAS uses the most recently created SAS data set
  • If appending data sets with different variables or attributes, use the FORCE option. This option must be used to prevent a syntax error.

Example 3 - Using PROC SQL - UNION

Data Male;
Input Name $ Age;
Datalines;
Mark 45
Peter 60
Jack 20
;
run;

Data Female;
Input Name $ Age;
Datalines;
Lisa 45
Jane 60
Julie 20
Ruma 42
;
run;

PROC SQL;
CREATE TABLE COMB_MF AS 
SELECT * FROM MALE
UNION CORR
SELECT * FROM FEMALE;
QUIT;

It will give same output as Example 1 

Here UNION Keyword is used to combined dataset and CORR keyword used to tell SAS that Append datasets based on the Column Name, If we dont use this Keyword PROC SQL will append based on the column sequence.

Labels: , , , , , , ,

Sunday, December 27, 2020

Creating Dataset using Proc SQL - WHY ? | HOW ?

With PROC SQL we can not only report data in Result window but we can also create New dataset for the result we get out of Select Query

WHY? - This is needed sometime because we want to reutilize the result of Select Query in some other places for analysis, so we required to store result somewhere in dataset.

HOW? - To perform this task we just need to write Create Table <Library.dataset Name> as Before any Select Statement 

Syntax - 

PROC SQL;
CREATE TABLE <Library.dataset name> AS
SELECT variable(s)
FROM Table Name
<WHERE Conditional statement
GROUP BY Variable(s)
ORDER BY variable(s)>;
QUIT; 

See Example - 

/* Creating Dataset using proc SQL */
Proc SQL;
Create Table WORK.AUDI as Select * From SASHELP.CARS
where Make = 'Audi';
Run;

Creating Dataset using PROC SQL



 

Use of INOBS option - Fix the number of rows in newly created dataset.

If you want to limit the number of observation(rows) in dataset produced by Proc SQL then INOBS option can be used - 

See Example - 

/* Create a dataset Contains top 10 Cheapest Audi */
Proc SQL INOBS=10;
Create Table Audi as Select * From SASHELP.CARS
where Make = 'Audi'
ORDER BY MSRP;
Run;

Top 10 Cheapest Audi

 

Labels: , , , , ,

Thursday, December 24, 2020

PROC SQL - More Basic Examples

Go to Previous page  - PROC SQL FOR BEGINNERS - HOW TO USE SQL IN SAS

More Examples of Basic PROC SQL Programs

7. Getting Specific Number of Records From Dataset - Use of OUTOBS option

proc sql outobs=10; 
select *  From SASHELP.CARS;
Quit;

proc sql example 7 Datahark

This Code is displaying First 10 records from SASHELP.CARS

8. Giving Label(New Display Name) to a Variable/Column in Dataset.

proc sql; 
select Make as Company, MSRP as Price From SASHELP.CARS 
where Make ='Audi';
QUIT;
Label in PROC SQL

This Code is changing Display name of Make to Company and MSRP to Price and Display all records for Audi

9. Creating New Variable in PROC SQL

Proc sql;
select Make, MSRP, (MSRP*0.10) AS DISCOUNT From SASHELP.CARS Where 
Make='Acura';
QUIT;
Variable in PROC SQL

IT will Add New Variable in Display List named as DISCOUNT which is having 10% of MSRP.

10. If we want to Use Newly Created Variable in Same Select Statement then we need to use CALCULATED Keyword

proc sql;
select Make, MSRP, (MSRP*0.10) AS DISCOUNT, (MSRP-CALCULATED DISCOUNT) as FINAL_PRICE 
From SASHELP.CARS 
WHERE Make='BMW';
QUIT;

Calculated Keyword in PROC SQL

11. Removing Duplicates Using DISTINCT Keyword

proc sql;
select DISTINCT Make
From SASHELP.CARS ;
QUIT;

Distinct in PROC SQL

12. Multiple SQL Statements in One PROC SQL

proc sql;
select DISTINCT Cylinders From SASHELP.CARS ;
select Make,Cylinders From SASHELP.CARS where Make IN ('BMW');
QUIT;
Multiple SQL in PROC SQL

13. Creating a New Dataset/Table using PROC SQL We can create New Dataset using CREATE TABLE statement in PROC SQL -

PROC SQL;
Create Table BMW as select Make,Cylinders From SASHELP.CARS where Make IN ('BMW');QUIT;

 

  •  Creating New Dataset in SAS using PROC SQL

Labels: , , , , ,

Wednesday, December 23, 2020

PROC SQL FOR BEGINNERS - HOW TO USE SQL IN SAS

PROC SQL SAS

SAS offers support to relational databases by using SQL queries within SAS Program using Proc SQL. Procedure PROC SQL is used to evaluate and run the SQL statements and also for creating New Dataset/table out of written query, Also it can create new Variables/Columns in Dataset(s).

SQL is one of the Top 5 Programming Languages for Data Analysis, The advantage of PROC SQL over SQL is - we can use SQL along with Base SAS programming logic which can help us to solve more complex data analysis problem very easily.

Lets see how we can utilize PROC SQL to write SQL in SAS - 

Syntax for Proc SQL  - 

PROC SQL;
SELECT Columns
FROM TABLE
<WHERE Condition
GROUP BY Columns
Having Condition
Order by Columns;>
QUIT;

Click here to Read more »

Labels: , , , , ,