SAS Macros are the Cure for Quality Control Pains


Presentation Transcript

Slide 1

SAS Macros are the Cure for Quality Control Pains Gary McQuown Data and Analytic Solutions

Slide 2

Rants and Raves of a SAS Programmer

Slide 3

Purpose
I. Quality Control
II. SAS Macros for Quality Control
III. Sources of SAS Macros and QC Code

Slide 4

I. Quality Control
A continuous effort for validation, improvement and support of the data-related process to ensure that data meets the business needs.

Slide 5

Quality Control
"Quality control means you can have what you need, how you need it, when you need it." E. Deming

Slide 6

Why Practice QC?
It Saves Time
It Saves Money
It Makes Money
Ignorance is not Bliss

Slide 7

How Data Goes Bad
"Bad Genes" ... Poor design and collection
"Adoption" ... Someone Else's Design
"Child Abuse" ... Poorly Nurtured
"Terrible Teens" ... Growing Pains

Slide 8

The QC Process
Define Requirements
Identify Data Issues
Analyze Options
Improve Data Quality
Document each step and repeat

Slide 9

Define Requirements
What do you need?
Requires an understanding of the business process, the data, the operating system and the users.
Documentation, business specs and "experts".

Slide 10

Devil's Advocate
What is correct for one task/group may be incorrect for another.
What is correct now may be incorrect later.
What is correct now ... may not be repeatable.

Slide 11

Identify Data Issues Accuracy Completeness Consistency Timeliness Uniqueness Validity

Slide 12

G = Good F = Fair B = Bad

Slide 13

Analyze Options
What do you need?
What do you have?
What changes should be made?
Will you break anything along the way?

Slide 14

Improve Data Quality
Selective Processing
Clean Existing Values
Correct Existing Values
Delete "bad" data
Add additional data
Document original and new values.

Slide 15

Documentation
Design Process ... business specs
"As You Go" ... in the code, log, email
Input and Output files (Freqs & Means)
Modifications ... "as per xxx", email
Exceptions (Errors and Issues)
User's Manual
Elizabeth Axelrod ... Big "D"
"Just Shoot Them"

Slide 16

General Suggestions
"Drive Out Fear"
Early Intervention
Obtain "Buy In" from all parties
Keep it "Simple" ... use macros
Be consistent ... use macros
Monitor results
Document everything, every time

Slide 17

II. SAS Macros
SAS Macros allow you to use, re-use and share "object oriented" code.
QC is highly redundant ... the same or similar process is performed on every data set, every variable and every process.

Slide 18

Reality People are: Ignorant Forgetful Busy Lazy Don't Care

Slide 19

Why Macros Minimal Effort Parameters Available ( FREE )

Slide 20

FREQOUT
Produces frequencies for multiple variables
%FREQOUT (data= /* input dataset name */,
out= freqout /* output data set name */,
vars= /* list of variables */,
by = /* list of by variables */,
fmtassign = /* var fmt var fmt */,
debugging = NO /* YES or NO */)
Author: Ian Whitlock
Location: www.lexjansen.com and sconsig.com

Slide 21

EAP_RPT
%EAP_RPT (DSN=, LIBIN= , LIBOUT=, _VARS= , _FMTS=);
DSN = Name of input SAS data set
LIBIN= SAS library of input data set
LIBOUT= SAS library of output data set
_VARS= list of character variables to review ... paired with _FMTS
_FMTS= list of formats to apply ... paired with _VARS
Example: %EAP_RPT( _VARS = AGE INCOME EDUCATION , _FMTS = AGE INC EDU , LIBIN = PROJ_IN , LIBOUT = PROJ_OUT , DSN = STUDY_1);

Slide 22

DATA CLEANING
TIP00128a - Cleansing Macro, Data Scrubbing routine (see tip 00128 for more)
%cleanse(schlib=work, schema=, strlen=50, var=, target=target, replace=replace, case=nocase);
Author: Charles Patridge
Version: 2.1 (sug. by Ian Whitlock)
Location: www.sconsig.com

Slide 23

REMOVE OUTLIERS
%outlier ( data = _SAS_dataset_name_, out = _SAS_output_dataset_name, var = _variable_to_screen, pass = _number_of_passes, except = _exception_report_data_set_, mult = _multiplier_of_standard_deviations_)
The %OUTLIER macro performs outlier screens based on statistical measurements of a numeric variable in a SAS data set. It is set up to remove any outlier records that are outside a given number of standard deviations from the mean, and will run that screen a given number of times. For example, a "3-Pass-2" outlier screen will remove any values outside 3 standard deviations from the mean, and will run that outlier screen twice. The given numbers can be any integer.
Author: Unknown
Location: www.spikeware.com
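A multi-pass standard deviation screen of this kind can be sketched as follows. This is only an illustration and not the %OUTLIER source from spikeware.com: the macro name %outlier_sketch, the work tables _stats and _cut, and the _pass column are all assumptions made for the example. Each pass recomputes the mean and standard deviation on the records that survived the previous pass.

```sas
/* Hypothetical sketch of a multi-pass outlier screen (not the original %OUTLIER). */
%macro outlier_sketch(data=, out=, var=, pass=1, except=, mult=3);
  %local i;

  /* Start from a copy of the input so the source is never modified */
  data &out;
    set &data;
  run;

  /* Empty exception table with the input structure plus a pass counter */
  data &except;
    set &data(obs=0);
    length _pass 8;
  run;

  %do i = 1 %to &pass;
    /* Mean and standard deviation of the records still retained */
    proc means data=&out noprint;
      var &var;
      output out=_stats(keep=_mean _std) mean=_mean std=_std;
    run;

    /* Route records beyond mult standard deviations to the cut table */
    data &out(drop=_mean _std _pass) _cut(drop=_mean _std);
      if _n_ = 1 then set _stats;
      set &out;
      _pass = &i;
      if _std > 0 and not missing(&var)
         and abs(&var - _mean) > &mult * _std then output _cut;
      else output &out;
    run;

    /* Accumulate removed records for the exception report */
    proc append base=&except data=_cut force;
    run;
  %end;

  proc datasets library=work nolist;
    delete _stats _cut;
  quit;
%mend outlier_sketch;

/* Example call on a data set shipped with SAS: 3 standard deviations, 2 passes */
%outlier_sketch(data=sashelp.class, out=work.class_clean, var=weight,
                pass=2, except=work.class_outliers, mult=3);
```

Keeping the removed records in a separate exception table follows the earlier advice to document original and new values rather than silently deleting data.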

Slide 24

CONT_COMPARE
Compares two data sets, lists all variables and reports potential issues:
Fields in Both
Type
Length
%cont_compare (dsn1, dsn2)
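The comparison idea can be sketched with PROC CONTENTS output. This is an assumption about how such a check might be built, not the CONT_COMPARE source; the macro name %cont_compare_sketch and the work tables _c1, _c2 and _compare are hypothetical.

```sas
/* Hypothetical sketch: compare variable names, types and lengths of two data sets. */
%macro cont_compare_sketch(dsn1, dsn2);
  /* In PROC CONTENTS output, TYPE is 1 for numeric and 2 for character */
  proc contents data=&dsn1 noprint
       out=_c1(keep=name type length rename=(type=type1 length=length1));
  run;
  proc contents data=&dsn2 noprint
       out=_c2(keep=name type length rename=(type=type2 length=length2));
  run;

  /* Upcase the names so the match does not depend on case */
  data _c1; set _c1; name = upcase(name); run;
  data _c2; set _c2; name = upcase(name); run;

  proc sort data=_c1; by name; run;
  proc sort data=_c2; by name; run;

  data _compare;
    merge _c1(in=in1) _c2(in=in2);
    by name;
    length issue $40;
    if in1 and not in2 then issue = 'Only in first data set';
    else if in2 and not in1 then issue = 'Only in second data set';
    else if type1 ne type2 then issue = 'Type differs';
    else if length1 ne length2 then issue = 'Length differs';
    else issue = ' ';
  run;

  proc print data=_compare noobs;
    title "Variable comparison: &dsn1 vs &dsn2";
  run;
  title;
%mend cont_compare_sketch;

/* Example call on two data sets shipped with SAS */
%cont_compare_sketch(sashelp.class, sashelp.classfit);
```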

Slide 25

KEEPDBLS : Documents Duplicates
TIP000367-KeepDbls
%MACRO KeepDbls (SourceDs =_LAST_, TargetDs =, Overwrit =N, IdList =, Where =);
Moves duplicate observations to another file.
Author: Jim Groeneveld
Location: www.sconsig.com
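For comparison, base SAS can also route duplicates to a separate data set with the DUPOUT= option of PROC SORT. The sketch below is not the KeepDbls macro and may differ from it in which copies are moved: with NODUPKEY, the first record of each key stays in the output and only the later copies go to the DUPOUT= data set. The data set names and the key variable id are made up for the example.

```sas
/* Small hypothetical input with one duplicated key */
data work.source;
  input id $ amount;
  datalines;
A 10
B 20
A 30
;
run;

/* Keep the first record per id in work.unique; write the extra copies to work.dups */
proc sort data=work.source out=work.unique dupout=work.dups nodupkey;
  by id;
run;

proc print data=work.dups noobs;
  title 'Duplicate records removed by key';
run;
title;
```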

Slide 26

CK_MISSING
Evaluates variables as to missing and non-missing status.
Default = _numeric_ missing. _character_ $missing.
Parms:
DSN = libname and name of data set. Default is the last read/created.
PATH = path to directory where QC data is stored.
VAR = list of variables to be evaluated.
FMT = format statement.
%ck_missing( dsn=mylib.recentfile, var=UPB FICO1 FICO2 FICO3 CHANNEL, fmt=UPB upb. FICO1 FICO2 FICO3 fico. CHANNEL $chnl. );
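The missing versus non-missing evaluation can be sketched with a pair of formats and PROC FREQ. This is a common base SAS technique and only an assumption about what CK_MISSING does internally; the format names missfmt and $missfmt are chosen for the example.

```sas
/* Collapse every value into 'Missing' or 'Present' */
proc format;
  value  missfmt  .  = 'Missing'  other = 'Present';
  value $missfmt ' ' = 'Missing'  other = 'Present';
run;

/* One-way frequencies of missing status for every variable */
proc freq data=sashelp.heart;
  tables _all_ / missing;
  format _numeric_ missfmt. _character_ $missfmt.;
  title 'Missing vs. non-missing counts';
run;
title;
```

Note that the numeric format above treats special missing values (.A to .Z and ._) as present; add them to the 'Missing' range if the data uses them.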

Slide 27

LOG FILTER : Examines and Reports on SAS Log
Log Filter checks your log for errors, warnings, and other "interesting" messages. It then shows what it finds in its summary window. Double-click on a row and it'll reposition the log window to show the message in context (if it's an external log file, it'll open it in a viewer window and position it for you).
Author: Ratcliffe
Location: http://ratcliffe.co.uk/rest_logfilt.htm

Slide 28

MK_FORMATS
Creates a format from a SAS data set.
Parms:
DSN = SAS data set
START = Unique key value, e.g. SSN
LABEL = Value to be associated with START, e.g. Full Name with SSN
FMTNAME = Name of Format (without ".")
TYPE = C or N for Character or Numeric
LIBRARY = Libname of Format Library (default = work)
OTHER = Value to supply for missing (default = OTHER)
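Building a format from a data set is usually done with the CNTLIN= option of PROC FORMAT. The sketch below shows that technique and is not the MK_FORMATS source; the format name NAMEAGE and the control table work.cntlin are made up for the example. The variable names FMTNAME, TYPE, START, LABEL and HLO are the ones PROC FORMAT expects in a control data set.

```sas
/* Build a control data set: one row per key, plus an OTHER row */
data work.cntlin;
  set sashelp.class(keep=name age) end=last;
  retain fmtname 'NAMEAGE' type 'C';   /* character format $NAMEAGE. */
  length start $32 label $40;
  start = name;                        /* unique key value */
  label = strip(put(age, 3.));         /* value associated with the key */
  output;
  if last then do;
    hlo   = 'O';                       /* marks the OTHER range */
    start = ' ';
    label = 'OTHER';
    output;
  end;
  keep fmtname type start label hlo;
run;

/* Create the format in the WORK library from the control data set */
proc format library=work cntlin=work.cntlin;
run;

/* Example lookup: known keys return the age, unknown keys return OTHER */
data _null_;
  known   = put('Alfred', $nameage.);
  unknown = put('Nobody', $nameage.);
  put known= unknown=;
run;
```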

Slide 29

III. Sources of SAS Macros and QC Code
www.sas.com (samples)
www.lexjansen.com (proceedings)
www.sconsig.com
www.ratcliffe.co.uk
www.statetechservices.com
www.spikeware.com

Slide 30

More Sources
www.mcw.edu/pcor/rsparapa/sasmacro.html
www.math.yorku.ca/scs/friendly.html
www.stat.ncsu.edu/sas/tests/index.html
www.dasconsultants.com
SAS-L
Books By Users: Ron Cody's Data Cleaning
Numerous books on Macros ... "By Example"

Slide 31

Questions ? Gary McQuown mcquown@DASconsultants.com www.DASconsultants.com
