Subtitle: None

Author:

Classification number:

ISBN:9781599946597


Description

Summary:

Publisher Summary 1
Now a retired professor from the Robert Wood Johnson Medical School, Cody is a private consultant and national instructor for SAS, and the author or coauthor of numerous books on SAS. He offers novice and experienced SAS programmers a practical guide to detecting and correcting data errors while learning to apply DATA step programming techniques and SAS procedures. The material has been updated to cover the many new functions in SAS, and includes a new chapter on integrity constraints and audit trails, several macros to make data cleaning tasks easier, and a short description of a SAS product called DataFlux for performing advanced data cleaning techniques such as address standardization and fuzzy matching. Annotation ©2008 Book News, Inc., Portland, OR (booknews.com)

Publisher Summary 2
Thoroughly updated for SAS 9, this second edition addresses a task that nearly every SAS programmer needs to do, that is, making sure that data errors are located and corrected. Written in Ron Cody's signature informal, tutorial style, this book develops and demonstrates data cleaning programs and macros that you can use as written or modify for your own special data cleaning needs. Each topic is developed through specific examples, and every program and macro is explained in detail.

You'll learn how to
- find and correct errors in character and numeric values
- develop programming techniques related to dates and missing values
- use SQL approaches to data cleaning
- develop techniques for correcting your data errors
- use integrity constraints and audit trails to prevent errors from being added to a clean data set

Novice and experienced SAS users will discover ways to detect and correct data errors while learning how to apply DATA step programming techniques and SAS procedures.

SAS Products and Releases:
Base SAS: 9.2, 9.1.3, 9.1.2, 9.1, 9.0
SAS/STAT: 9.2, 9.1.3, 9.1.2, 9.1, 9.0
Operating Systems: All

Table of Contents

Copyright
Preface to the Second Edition
Preface to the First Edition
Acknowledgments
Chapter 1. Checking Values of Character Variables
Section 1.1. Introduction
Section 1.2. Using PROC FREQ to List Values
Section 1.3. Description of the Raw Data File PATIENTS.TXT
Section 1.4. Using a DATA Step to Check for Invalid Values
Section 1.5. Describing the VERIFY, TRIM, MISSING, and NOTDIGIT Functions
Section 1.6. Using PROC PRINT with a WHERE Statement to List Invalid Values
Section 1.7. Using Formats to Check for Invalid Values
Section 1.8. Using Informats to Remove Invalid Values
Chapter 2. Checking Values of Numeric Variables
Section 2.1. Introduction
Section 2.2. Using PROC MEANS, PROC TABULATE, and PROC UNIVARIATE to Look for Outliers
Section 2.3. Using an ODS SELECT Statement to List Extreme Values
Section 2.4. Using PROC UNIVARIATE Options to List More Extreme Observations
Section 2.5. Using PROC UNIVARIATE to Look for Highest and Lowest Values by Percentage
Section 2.6. Using PROC RANK to Look for Highest and Lowest Values by Percentage
Section 2.7. Presenting a Program to List the Highest and Lowest Ten Values
Section 2.8. Presenting a Macro to List the Highest and Lowest "n" Values
Section 2.9. Using PROC PRINT with a WHERE Statement to List Invalid Data Values
Section 2.10. Using a DATA Step to Check for Out-of-Range Values
Section 2.11. Identifying Invalid Values versus Missing Values
Section 2.12. Listing Invalid (Character) Values in the Error Report
Section 2.13. Creating a Macro for Range Checking
Section 2.14. Checking Ranges for Several Variables
Section 2.15. Using Formats to Check for Invalid Values
Section 2.16. Using Informats to Filter Invalid Values
Section 2.17. Checking a Range Using an Algorithm Based on Standard Deviation
Section 2.18. Detecting Outliers Based on a Trimmed Mean and Standard Deviation
Section 2.19. Presenting a Macro Based on Trimmed Statistics
Section 2.20. Using the TRIM Option of PROC UNIVARIATE and ODS to Compute Trimmed Statistics
Section 2.21. Checking a Range Based on the Interquartile Range
Section 2.22. Summary
Chapter 3. Checking for Missing Values
Section 3.1. Introduction
Section 3.2. Inspecting the SAS Log
Section 3.3. Using PROC MEANS and PROC FREQ to Count Missing Values
Section 3.4. Using DATA Step Approaches to Identify and Count Missing Values
Section 3.5. Searching for a Specific Numeric Value
Section 3.6. Creating a Macro to Search for Specific Numeric Values
Chapter 4. Working with Dates
Section 4.1. Introduction
Section 4.2. Checking Ranges for Dates (Using a DATA Step)
Section 4.3. Checking Ranges for Dates (Using PROC PRINT)
Section 4.4. Checking for Invalid Dates
Section 4.5. Working with Dates in Nonstandard Form
Section 4.6. Creating a SAS Date When the Day of the Month Is Missing
Section 4.7. Suspending Error Checking for Known Invalid Dates
Chapter 5. Looking for Duplicates and "n" Observations per Subject
Section 5.1. Introduction
Section 5.2. Eliminating Duplicates by Using PROC SORT
Section 5.3. Detecting Duplicates by Using DATA Step Approaches
Section 5.4. Using PROC FREQ to Detect Duplicate ID's
Section 5.5. Selecting Patients with Duplicate Observations by Using a Macro List and SQL
Section 5.6. Identifying Subjects with "n" Observations Each (DATA Step Approach)
Section 5.7. Identifying Subjects with "n" Observations Each (Using PROC FREQ)
Chapter 6. Working with Multiple Files
Section 6.1. Introduction
Section 6.2. Checking for an ID in Each of Two Files
Section 6.3. Checking for an ID in Each of "n" Files
Section 6.4. A Macro for ID Checking
Section 6.5. More Complicated Multi-File Rules
Section 6.6. Checking That the Dates Are in the Proper Order
Chapter 7. Double Entry and Verification (PROC COMPARE)
Section 7.1. Introduction
Section 7.2. Conducting a Simple Comparison of Two Data Sets
Section 7.3. Using PROC COMPARE with Two Data Sets That Have an Unequal Number of Observations
Section 7.4. Comparing Two Data Sets When Some Variables Are Not in Both Data Sets
Chapter 8. Some PROC SQL Solutions to Data Cleaning
Section 8.1. Introduction
Section 8.2. A Quick Review of PROC SQL
Section 8.3. Checking for Invalid Character Values
Section 8.4. Checking for Outliers
Section 8.5. Checking a Range Using an Algorithm Based on the Standard Deviation
Section 8.6. Checking for Missing Values
Section 8.7. Range Checking for Dates
Section 8.8. Checking for Duplicates
Section 8.9. Identifying Subjects with "n" Observations Each
Section 8.10. Checking for an ID in Each of Two Files
Section 8.11. More Complicated Multi-File Rules
Chapter 9. Correcting Errors
Section 9.1. Introduction
Section 9.2. Hardcoding Corrections
Section 9.3. Describing Named Input
Section 9.4. Reviewing the UPDATE Statement
Chapter 10. Creating Integrity Constraints and Audit Trails
Section 10.1. Introducing SAS Integrity Constraints
Section 10.2. Demonstrating General Integrity Constraints
Section 10.3. Deleting an Integrity Constraint Using PROC DATASETS
Section 10.4. Creating an Audit Trail Data Set
Section 10.5. Demonstrating an Integrity Constraint Involving More than One Variable
Section 10.6. Demonstrating a Referential Constraint
Section 10.7. Attempting to Delete a Primary Key When a Foreign Key Still Exists
Section 10.8. Attempting to Add a Name to the Child Data Set
Section 10.9. Demonstrating the Cascade Feature of a Referential Constraint
Section 10.10. Demonstrating the SET NULL Feature of a Referential Constraint
Section 10.11. Demonstrating How to Delete a Referential Constraint
Chapter 11. DataFlux and dfPower Studio
Section 11.1. Introduction
Section 11.2. Examples
Appendix A. Listing of Raw Data Files and SAS Programs
Section A.1. Programs and Raw Data Files Used in This Book
Section A.2. Description of the Raw Data File PATIENTS.TXT
Section A.3. Layout for the Data File PATIENTS.TXT
Section A.4. Listing of Raw Data File PATIENTS.TXT
Section A.5. Program to Create the SAS Data Set PATIENTS
Section A.6. Listing of Raw Data File PATIENTS2.TXT
Section A.7. Program to Create the SAS Data Set PATIENTS2
Section A.8. Program to Create the SAS Data Set AE (Adverse Events)
Section A.9. Program to Create the SAS Data Set LAB_TEST
Section A.10. Listing of the Data Cleaning Macros Used in This Book
Section A.11. Creating a Macro to List the Highest and Lowest "n" Percent of the Data Using PROC UNIVARIATE
Section A.12. Creating a Macro to List the Highest and Lowest "n" Percent of the Data Using PROC RANK
Section A.13. Creating a Macro to List the Highest and Lowest "n" Values
Section A.14. Creating a Macro to List Out-of-Range Data Values
Section A.15. Writing a Program to Summarize Data Errors on Several Variables
Section A.16. Creating a Macro to Detect Outliers Based on Trimmed Statistics
Section A.17. Creating a Macro to List Outliers of Several Variables Based on Trimmed Statistics (Using PROC UNIVARIATE)
Section A.18. Detecting Outliers Based on the Interquartile Range
Section A.19. Creating a Macro to Search for Specific Numeric Values
Section A.20. Creating a Macro to Check for ID's Across Multiple Data Sets
