Getting More Out of Data Competition Results With Pareto Fronts (2019-US-30MP-161)
Sep 12, 2019 6:53 AM | Last Modified: Nov 4, 2019 8:03 AM
Christine Anderson-Cook, Scientist, Los Alamos National Laboratory; Lu Lu, Assistant Professor, University of South Florida; Sarah Burke, Statistician, The Perduco Group
Data competitions have attracted considerable attention from the world’s community of data and analytics scientists, as well as from discipline-specific subject matter experts. Their broad involvement gives business and government a crowdsourcing model for solving tough, high-impact problems cost-effectively. Typically, winners are determined by a leaderboard formula that must remain static throughout the competition, with fixed rewards and penalties for patterns of correct and incorrect responses on different aspects of the solution. However, the relative importance of these aspects can vary with how the solution will be used. The existing capability in JMP for constructing flexible high-dimensional Pareto fronts makes it possible to explore the submitted solutions and identify their individual strengths and weaknesses. Pareto fronts allow the user to identify all of the objectively superior solutions across all possible weightings of the different elements of the solution and to discard non-competitive solutions. The approach of using multiple Pareto fronts to highlight different "best" solutions will be demonstrated with a recently completed data competition focused on detecting, identifying and locating radioactive sources in an urban environment (https://www.topcoder.com/lp/detect-radiation).
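To make the idea of a Pareto front concrete, here is a minimal Python sketch. It is not the JMP implementation and uses none of the competition's data: the three criteria (detection, identification, location), the scores, and the function names `dominates`, `pareto_front`, and `best_for_weights` are all hypothetical, and larger scores are assumed to be better. An entry is kept on the front if no other entry is at least as good on every criterion and strictly better on at least one.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumes larger values are better on every criterion.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of entries that no other entry dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def best_for_weights(scores: List[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the index of the entry with the highest weighted-sum score."""
    totals = [sum(w * x for w, x in zip(weights, s)) for s in scores]
    return max(range(len(scores)), key=lambda i: totals[i])


# Hypothetical leaderboard entries: (detection, identification, location).
entries = [
    (0.90, 0.60, 0.70),
    (0.80, 0.80, 0.60),
    (0.70, 0.50, 0.50),  # worse than the first entry on every criterion
    (0.60, 0.90, 0.90),
    (0.80, 0.80, 0.50),  # ties the second entry twice, loses on location
]

front = pareto_front(entries)

# Two hypothetical weightings: one favors detection, one favors the other two.
winner_detection = best_for_weights(entries, (0.6, 0.2, 0.2))
winner_locating = best_for_weights(entries, (0.2, 0.4, 0.4))
```

In this made-up example the front keeps three of the five entries, and the two weightings pick different winners, both of which lie on the front. That is the property the abstract relies on: whatever positive weights a particular use of the solution calls for, the weighted-sum winner is among the non-dominated entries, so the dominated ones can be discarded before any weighting is chosen.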