SAS SCAN, SUBSTR and COMPRESS functions with examples for Clinical SAS programming and character string manipulation.

SCAN, SUBSTR, and COMPRESS Functions in SAS with Examples

Introduction

Learn how SCAN, SUBSTR, and COMPRESS functions are used in SAS to extract, clean, and manipulate character data with practical examples.

When working with Clinical SAS, programmers frequently need to manipulate text and character variables. SAS provides several powerful character functions that help extract, clean, and transform data efficiently.

Among the most commonly used character functions are:

  • SCAN Function
  • SUBSTR Function
  • COMPRESS Function

These functions are widely used in Clinical SAS, Banking, Healthcare, Retail, and Telecom projects for processing patient IDs, visit names, drug information, account numbers, and other text-based data.

In this article, we will explore the SCAN, SUBSTR, and COMPRESS functions in SAS with practical examples. Additional information about SAS programming can be found in the official SAS documentation.

What is the SCAN Function in SAS?

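The SCAN function returns the nth word from a string, where words are separated by one or more delimiter characters. A positive word number counts from the left, and a negative word number counts from the right.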

Syntax

SCAN(string, word-number, delimiter)

Parameters

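  • string → Source text
  • word-number → Position of the word to extract (negative values count from the end of the string)
  • delimiter → Optional. One or more characters, each treated as a separator. If omitted, SAS uses a default set that includes the blank and common punctuation such as commas and periods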
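
The parameters above can be exercised in a short sketch. The data set name scan_demo and the variable names are hypothetical, chosen only for illustration. The sketch assumes a string with two different separators and uses COUNTW to find how many words there are before looping over them with SCAN.

```sas
/* Hypothetical demo data set; all names are illustrative only */
data scan_demo;
    length word $20;
    text = "alpha,beta;gamma";
    /* Each character in the delimiter list ',;' is a separate delimiter */
    n_words = countw(text, ',;');
    /* Extract every word in turn and write one row per word */
    do i = 1 to n_words;
        word = scan(text, i, ',;');
        output;
    end;
    keep i word;
run;

proc print data=scan_demo noobs;
run;
```

With this input, the loop should write three rows containing alpha, beta, and gamma. Passing the same delimiter list to COUNTW and SCAN keeps the word count and the extraction consistent.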

Example 1: Extract First, Middle, and Last Name

data example1;
    name = "John Michael Smith";
    first_name = scan(name, 1, ' ');
    middle_name = scan(name, 2, ' ');
    /* A negative word number counts from the right, so -1 is the last word */
    last_name = scan(name, -1, ' ');
run;

Output

Name               | First Name | Middle Name | Last Name
John Michael Smith | John       | Michael     | Smith

Example 2: Extract Domain from Email

data example2;
    email = "user123@gmail.com";
    /* The second word after splitting on '@' is the domain */
    domain = scan(email, 2, '@');
run;

Output

gmail.com

Example 3: Clinical SAS Drug Information

data clinical;
    drug = "Paracetamol 500mg Tablet";
    /* No delimiter argument: the default delimiters, which include the blank, are used */
    drug_name = scan(drug, 1);
    strength = scan(drug, 2);
run;

Output

Drug Name   | Strength
Paracetamol | 500mg

Clinical trial data standards are commonly used in pharmaceutical research studies registered through ClinicalTrials.gov.

Example 4: Extract Visit Number

data visit;
    visitname = "VISIT_12_WEEK";
    /* The second underscore-separated word is the visit number (as character text) */
    visit_no = scan(visitname, 2, '_');
run;

Output

12
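
SCAN always returns character text, so the 12 above is a string rather than a number. If a numeric visit number is needed for sorting or arithmetic, one common approach is to convert it with INPUT. The sketch below assumes the same visit name as Example 4; the data set name visit_num and the variables visit_c and visit_n are hypothetical.

```sas
/* Hypothetical follow-up to Example 4; names are illustrative only */
data visit_num;
    visitname = "VISIT_12_WEEK";
    /* SCAN returns character text, so visit_c holds "12" as a string */
    visit_c = scan(visitname, 2, '_');
    /* INPUT with a numeric informat converts the text to a numeric value */
    visit_n = input(visit_c, 8.);
run;
```

Here visit_n should hold the numeric value 12, which sorts and compares as a number rather than as text.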

What is the SUBSTR Function in SAS?

The SUBSTR function extracts a portion of a character string by position: it starts at a given character and returns a given number of characters.

Syntax

SUBSTR(string,start-position,length)

Parameters

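  • string → Source text
  • start-position → Position of the first character to extract (the first character of the string is position 1)
  • length → Optional. Number of characters to extract. If omitted, SUBSTR returns everything from the start position to the end of the string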
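
A brief sketch of the two behaviours that the parameter list implies: omitting the length, and placing SUBSTR on the left side of an assignment to overwrite characters in place. The data set substr_demo and its variables are hypothetical names used only for this illustration.

```sas
/* Hypothetical demo data set; all names are illustrative only */
data substr_demo;
    code = "AB-1234";
    /* Length omitted: returns everything from position 4 onward */
    tail = substr(code, 4);
    /* SUBSTR on the left of the assignment replaces characters in place */
    masked = code;
    substr(masked, 4, 4) = "XXXX";
run;
```

In this sketch tail should contain 1234 and masked should contain AB-XXXX, while code itself is left unchanged.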

Example 5: Extract Year, Month, and Day

data date_ex;
    dateval = "20240315";
    year = substr(dateval, 1, 4);    /* characters 1-4 */
    month = substr(dateval, 5, 2);   /* characters 5-6 */
    day = substr(dateval, 7, 2);     /* characters 7-8 */
run;

Output

Year | Month | Day
2024 | 03    | 15
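
The pieces extracted above are still character values. When an actual SAS date is needed, the string can be read with a date informat, or the SUBSTR pieces can be converted and passed to MDY. The sketch below assumes the same dateval as Example 5; the data set date_num and the variables sas_date and sas_date2 are hypothetical.

```sas
/* Hypothetical follow-up to Example 5; names are illustrative only */
data date_num;
    dateval = "20240315";
    /* Option 1: read the whole string with the YYMMDD8. informat */
    sas_date = input(dateval, yymmdd8.);
    /* Option 2: assemble the date from the SUBSTR pieces with MDY */
    sas_date2 = mdy(input(substr(dateval, 5, 2), 2.),
                    input(substr(dateval, 7, 2), 2.),
                    input(substr(dateval, 1, 4), 4.));
    format sas_date sas_date2 date9.;
run;
```

Both variables should display as 15MAR2024. Reading the full string with an informat is usually shorter, while the MDY form is handy when the parts arrive in separate variables.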

Example 6: Extract Country Code

data fixed;
    id = "USA12345";
    country = substr(id, 1, 3);
    /* No length given: take everything from position 4 to the end */
    id_num = substr(id, 4);
run;

Output

Country | ID Number
USA     | 12345

Example 7: Clinical Trial Timestamp

data time_ex;
    ts = "2024-03-15 14:32:10";
    year = substr(ts, 1, 4);
    month = substr(ts, 6, 2);
    hour = substr(ts, 12, 2);
run;

Output

Year = 2024

Month = 03

Hour = 14

Clinical SAS programmers frequently work with industry standards developed by CDISC for data formatting and reporting.

Example 8: Extract Last 4 Digits

data telecom;
    mobile = "+91-98765-43210";
    /* SCAN takes the last hyphen-separated group (43210); SUBSTR then drops its first digit */
    last4 = substr(scan(mobile, -1, '-'), 2);
run;

Output

3210
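
Example 8 relies on the last group having exactly five digits. An alternative sketch that does not depend on the hyphen layout is to count back from the end of the value with LENGTH. The data set last4_demo is a hypothetical name, and the approach assumes the value is at least four characters long.

```sas
/* Hypothetical alternative to Example 8; names are illustrative only */
data last4_demo;
    mobile = "+91-98765-43210";
    /* LENGTH ignores trailing blanks, so length(mobile) - 3 is the
       position of the fourth character from the end */
    last4 = substr(mobile, length(mobile) - 3, 4);
run;
```

For this value, length(mobile) is 15, so SUBSTR starts at position 12 and returns 3210.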

What is the COMPRESS Function in SAS?

The COMPRESS function returns a copy of a string with specified characters removed. When called with only the string, it removes blanks.

Clinical programmers frequently use COMPRESS to clean patient identifiers, drug codes, and laboratory values before analysis.

It is commonly used to remove:

  • Spaces
  • Numbers
  • Special Characters
  • Alphabets
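
Besides listing the characters to remove, COMPRESS accepts an optional third argument of modifiers that name whole character classes. The sketch below is a minimal illustration using the d (digits), a (alphabetic), p (punctuation), and k (keep instead of remove) modifiers; the data set compress_demo and its variables are hypothetical.

```sas
/* Hypothetical demo data set; all names are illustrative only */
data compress_demo;
    value = "Ab-12 cD/34";
    /* The second argument is left empty; the modifiers select the characters */
    digits_only  = compress(value, , 'kd');   /* keep only digits */
    no_digits    = compress(value, , 'd');    /* remove digits */
    letters_only = compress(value, , 'ka');   /* keep only letters, either case */
    no_punct     = compress(value, , 'p');    /* remove punctuation marks */
run;
```

In this sketch digits_only should contain 1234 and letters_only should contain AbcD. Modifiers avoid typing out long character lists and cover both uppercase and lowercase letters.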

Example 9: Remove Spaces

data ex1;
    name = "Clinical SAS Training";
    /* With no second argument, COMPRESS removes blanks */
    result = compress(name);
run;

Output

ClinicalSASTraining

Example 10: Remove Hyphens

data ex2;
    phone = "98765-43210";
    mobile = compress(phone, '-');
run;

Output

9876543210

Example 11: Remove Numbers

data ex3;
    value = "ABC123XYZ";
    /* Every character listed in the second argument is removed */
    letters = compress(value, '0123456789');
run;

Output

ABCXYZ

Example 12: Remove Alphabets

data ex4;
    value = "ABC123XYZ";
    /* Only uppercase letters are listed, so lowercase letters would be kept */
    numbers = compress(value, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ');
run;

Output

123

Example 13: Clean Patient ID

data patient;
    patient_id = "PAT-001-2024";
    clean_id = compress(patient_id, '-');
run;

Output

PAT0012024

SCAN and SUBSTR Combined Example

Example 14: Clinical Trial Visit Extraction

data visit;
    visitname = "VISIT_12_WEEK";
    visit_word = scan(visitname, 2, '_');
    /* Starting at position 1 with no length returns the whole word, so visit_no is also 12 */
    visit_no = substr(visit_word, 1);
run;

Output

12

Example 15: Product Information Parsing

data retail;
    product = "SAMSUNG-GALAXY-A53-BLACK";
    brand = scan(product, 1, '-');
    model = scan(product, 3, '-');
    /* -1 selects the last hyphen-separated word */
    color = scan(product, -1, '-');
run;

Output

  • Brand = SAMSUNG
  • Model = A53
  • Color = BLACK
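
Real product codes do not always have the same number of parts. SCAN returns a blank value when the requested word does not exist, so it can help to check the word count or test for a missing result. The sketch below assumes a hypothetical fifth part (a storage variant) that only some records carry; the data set retail_check and the variables n_parts, variant, and has_variant are illustrative names.

```sas
/* Hypothetical demo data set; all names are illustrative only */
data retail_check;
    length product $30;
    input product $;
    /* COUNTW reports how many hyphen-separated parts are present */
    n_parts = countw(product, '-');
    /* SCAN returns a blank value when the requested word does not exist */
    variant = scan(product, 5, '-');
    /* 1 when a fifth part was found, 0 otherwise */
    has_variant = not missing(variant);
    datalines;
SAMSUNG-GALAXY-A53-BLACK
SAMSUNG-GALAXY-A53-BLACK-128GB
;
run;
```

The first record has four parts, so variant is blank and has_variant is 0. The second record has five parts, so variant is 128GB and has_variant is 1.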

Conclusion

The SCAN, SUBSTR, and COMPRESS functions are essential SAS character functions used for extracting, cleaning, and transforming text data. These functions are widely used in Clinical SAS programming to process patient information, visit names, drug details, timestamps, and clinical trial datasets.

Understanding these functions will help Clinical SAS programmers write efficient code and perform data manipulation tasks more effectively in real-world projects.

Students preparing for Clinical SAS Jobs for Freshers in India should practice these functions regularly because they are commonly used in programming assessments and technical interviews.

Clinical SAS professionals often work on studies submitted to regulatory agencies such as the U.S. Food and Drug Administration (FDA).
