Advanced Techniques for Working with Big Data
Published on 11-07-2024 03:28 PM | Updated on 11-07-2024 05:38 PM
See how to:
- Understand how volume, velocity, variety, veracity, processing speed, memory, time and user patience impact approaches for handling tall and wide data
- Understand the importance of approaches for handling big data: data filtering, column switching, predictor/response screening, collecting summary tables into graphs, automating variable reduction, sampling, finding missing values, screening outliers and using formula columns
- Identify the format of the World Development Indicator case study data: 305 million data points, 1402 variables, 15.5 GB of data (11:30)
- Save space using Virtual Join with the Link ID and Link Reference column properties (13:00)
- Expose missing data using Tables Summary (20:38)
- Restructure data with Split, using Group to identify row types, Split By to identify the columns you will need and Split Column to identify the values you want in the new columns (24:50)
- Visualize data interactively using Graph Builder Column Switcher to flip through variables visually and Local Data Filter to see data based on desired range of values (28:30)
- Find the factors most predictive of the outcome of interest using Response Screening (32:00)
- Put graphs for selected relationships into a JMP table expression column using the Make a table of graphs like this option (34:32)
- Identify and visualize the most prevalent terms and phrases using Text Explorer and Word Cloud (39:56)
- Use the subsetting proxy method to analyze tall data (45:41)
- Build and validate a model, then generate and save score code for predicting future results (49:04)
Note: Q&A included at times 36:40, 37:28, 37:43 and 38:25.
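For readers who want to see the Split roles from the 24:50 segment in concrete terms, here is a minimal sketch in plain Python. It is not JMP or JSL and does not reproduce anything shown in the session; the function names, column names and sample rows are all hypothetical. It assumes tall data with one row per country, year and indicator, pivots it so each indicator becomes its own column, and then counts missing cells per new column, which is roughly the kind of gap a summary table would expose.

```python
from collections import defaultdict

# Hypothetical tall records: one row per country, year and indicator.
tall_rows = [
    {"country": "A", "year": 2000, "indicator": "GDP", "value": 1.0},
    {"country": "A", "year": 2000, "indicator": "POP", "value": 10.0},
    {"country": "A", "year": 2001, "indicator": "GDP", "value": 1.5},
    {"country": "B", "year": 2000, "indicator": "POP", "value": 20.0},
]


def split_tall_to_wide(rows, group, split_by, split_column):
    """Pivot tall rows into wide rows (illustrative stand-in for a Split).

    group        -- columns that identify one output row
    split_by     -- column whose values become the new column names
    split_column -- column whose values fill the new columns
    """
    cells = defaultdict(dict)
    for row in rows:
        key = tuple(row[g] for g in group)
        cells[key][row[split_by]] = row[split_column]

    new_columns = sorted({row[split_by] for row in rows})
    table = []
    for key in sorted(cells):
        record = dict(zip(group, key))
        for col in new_columns:
            # None marks a combination that never appeared in the tall data.
            record[col] = cells[key].get(col)
        table.append(record)
    return table, new_columns


def count_missing(table, columns):
    """Count missing (None) cells in each of the given columns."""
    return {col: sum(1 for r in table if r[col] is None) for col in columns}


wide_table, indicator_columns = split_tall_to_wide(
    tall_rows, group=["country", "year"], split_by="indicator", split_column="value"
)
missing = count_missing(wide_table, indicator_columns)
```

In this sketch the group columns play the role the list assigns to Group, the indicator name plays Split By, and the value plays Split Column; any group that lacks an indicator ends up with an explicit missing cell rather than being dropped.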
Start: Wed, Jul 8, 2020 02:00 PM EDT
End: Wed, Jul 8, 2020 03:00 PM EDT