
Identifying Unusual Patterns That Might Indicate Data Integrity Issues



See how to:

  • Identify duplicate values
    • Find Most Duplicated Values - values that appear most frequently within a column
    • Find Longest Runs - values that repeat in consecutive rows within a column
    • Find Longest Duplicated Sequences - sequences of values that repeat within a column
    • Find Duplicates Across Columns - sequences of values that appear in the same rows across multiple columns
    • Use Rarity Score to interpret duplications
  • Identify unusual values
    • Locate Formatted Width within cells - both overall and decimals
    • Locate suspicious Fraction Lengths
    • Locate suspicious Leading Digits that are too uniform
  • Identify unexpected linear relationships, where one column has an exact linear relationship with another column across a group of consecutive rows (default group size is 10)
  • Identify specification limit anomalies for columns with spec limit properties
    • Locate Spec Limit Matches where cell values exactly match the LSL or USL
    • Compare Spec Limits Distribution to compare out-of-spec values to expected out-of-spec values
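
As a rough illustration of the first two duplicate checks in the list above, here is a minimal Python sketch. It is not the implementation shown in the video; the function names `most_duplicated` and `longest_run` and the sample column are hypothetical, chosen only to show the idea.

```python
from collections import Counter
from itertools import groupby


def most_duplicated(values, top=3):
    """Return the `top` most frequent values as (value, count) pairs."""
    return Counter(values).most_common(top)


def longest_run(values):
    """Return (value, length, start_index) of the longest run of
    identical values in consecutive rows; (None, 0, -1) if empty."""
    best = (None, 0, -1)
    start = 0
    for value, group in groupby(values):
        length = sum(1 for _ in group)
        if length > best[1]:
            best = (value, length, start)
        start += length
    return best


# Hypothetical column of measurements (made-up data).
column = [4.1, 3.7, 3.7, 3.7, 3.7, 5.2, 4.1, 3.7, 6.0]

print(most_duplicated(column, top=2))  # [(3.7, 5), (4.1, 2)]
print(longest_run(column))             # (3.7, 4, 1)
```

A value that dominates the frequency table, or a long run in a column that should vary from row to row, is only a prompt for closer inspection; whether it signals a real integrity problem depends on how the data were collected.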

Note: Q&A is included at times 25:56, 26:47, 27:36, 28:40, 52:24 and 54:08.
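
The linear-relationship check in the list above can be sketched in Python as follows. This is an assumption-laden illustration, not the method used in the video: the function name `exact_linear_windows`, the tolerance `tol`, and the sample data are all hypothetical. Only the default window of 10 consecutive rows comes from the description.

```python
import math


def exact_linear_windows(x, y, window=10, tol=1e-9):
    """Return the start indices of every block of `window` consecutive
    rows in which y is an exact linear function of x (within `tol`)."""
    hits = []
    for start in range(len(x) - window + 1):
        xs = x[start:start + window]
        ys = y[start:start + window]
        # Find a second point with a different x value to define the line.
        ref = next((i for i in range(1, window) if xs[i] != xs[0]), None)
        if ref is None:
            continue  # x is constant in this window, so no line is defined
        dx, dy = xs[ref] - xs[0], ys[ref] - ys[0]
        # Collinearity test via cross products, which avoids dividing by dx.
        if all(
            math.isclose((xs[i] - xs[0]) * dy, (ys[i] - ys[0]) * dx, abs_tol=tol)
            for i in range(window)
        ):
            hits.append(start)
    return hits


# Made-up data: rows 5 through 16 follow y = 2x + 1 exactly.
x = list(range(20))
y = [3, 8, 1, 9, 4] + [2 * i + 1 for i in range(5, 17)] + [0, 5, 2]

print(exact_linear_windows(x, y))  # [5, 6, 7]
```

Note that this sketch also flags windows where y is constant while x varies (a line with slope zero), so in practice it would be worth pairing it with the duplicate-run check.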