To expand on the excellent response from @Byron_JMP: a Decision Tree is a very simple yet powerful algorithm. It has several advantages that make it an effective choice for data mining: it is easy to understand and interpret, handles categorical and numerical inputs and outputs, can handle missing values, is robust to outliers, can detect non-linear relationships, and enables feature/predictor selection and ranking.
It doesn't rely on the traditional parametric assumptions, but there are a few points you should pay attention to:
- Check multicollinearity and correlations among your predictors: at each split, the Decision Tree evaluates all predictors and picks the one offering the best split of your Y values. Correlated predictors can therefore be a problem: because they share common information, one of them may be chosen for the splits while the others are left out and never selected, so their apparent importance can be understated.
- Beware of overfitting: Decision Trees are prone to overfitting, since a tree can be grown very deep until it perfectly classifies the training data (or perfectly fits it, for a numerical response). It is recommended to have a validation strategy (K-fold cross-validation, Leave-One-Out, or a validation set, as the data quantity doesn't seem to be an issue in your case) and to ensure the model is not too complex for the task at hand: setting a maximum depth (number of splits) and a minimum split size reduces the risk of overfitting.
- Training data sample and representativeness: as Decision Trees are greedy algorithms (at each step they choose the split that maximizes information gain, i.e. most reduces entropy), they are very sensitive to the training data used. Small changes in the data can change the rules produced by the tree. Data representativeness and noise should be addressed to avoid overfitting and to ensure the results generalize.
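To make the first two points concrete outside of JMP, here is a minimal sketch in Python with scikit-learn. It is only an illustration under stated assumptions: the data are synthetic (generated with `make_classification`), the 0.8 correlation threshold is an arbitrary choice, and `max_depth=3` / `min_samples_split=20` are example settings, not recommendations. It screens for highly correlated predictors, then compares an unconstrained tree with a constrained one using 5-fold cross-validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the real table (assumption for illustration):
# 8 predictors, 2 of which are linear combinations of the informative ones.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=2,
    flip_y=0.1, random_state=0,
)

# 1) Screen for strongly correlated predictors (0.8 is an arbitrary threshold).
corr = np.corrcoef(X, rowvar=False)
high_pairs = [
    (i, j, round(float(corr[i, j]), 2))
    for i in range(corr.shape[0])
    for j in range(i + 1, corr.shape[1])
    if abs(corr[i, j]) > 0.8
]
print("Highly correlated predictor pairs:", high_pairs)

# 2) Compare an unconstrained tree with a depth- and size-limited one.
deep_tree = DecisionTreeClassifier(random_state=0)
small_tree = DecisionTreeClassifier(
    max_depth=3, min_samples_split=20, random_state=0
)

for name, model in [("unconstrained", deep_tree), ("constrained", small_tree)]:
    # cross_val_score fits clones of the model on each of the 5 folds.
    cv_acc = cross_val_score(model, X, y, cv=5).mean()
    # Refit on all rows to see how well the tree reproduces its training data.
    train_acc = model.fit(X, y).score(X, y)
    print(f"{name}: train accuracy = {train_acc:.3f}, CV accuracy = {cv_acc:.3f}")
```

A large gap between training accuracy and cross-validated accuracy for the unconstrained tree is the usual symptom of overfitting; the constrained tree typically shows a smaller gap.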
As suggested by @Byron_JMP, it is often worth using a Random Forest (available through the Bootstrap Forest platform in JMP Pro or through Predictor Screening in JMP), as this technique reduces some drawbacks of Decision Trees through several mechanisms. First, many independent trees are built in parallel, each trained on a bootstrapped sample, which reduces sensitivity to the training set, improves generalization, and enables validation through the "Out-Of-Bag" sample results. In addition, each split considers only a random subset of the predictors, which gives every predictor, even one correlated with others, a comparable chance of being selected for a split; this produces more diverse trees and an ensemble with good generalization performance.
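For readers who want to see these mechanisms outside of JMP, here is a small sketch using scikit-learn's `RandomForestClassifier` as a stand-in (it is not the Bootstrap Forest platform itself, and the synthetic data and parameter values are assumptions for illustration). It shows bootstrapped trees, a random subset of predictors at each split, the Out-Of-Bag estimate, and a predictor ranking.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data standing in for the real table (assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=2,
    flip_y=0.1, random_state=0,
)

forest = RandomForestClassifier(
    n_estimators=200,     # number of trees in the ensemble (example value)
    max_features="sqrt",  # each split considers a random subset of predictors
    bootstrap=True,       # each tree is trained on a bootstrapped sample
    oob_score=True,       # score each row using only trees that did not see it
    random_state=0,
)
forest.fit(X, y)

print(f"Out-Of-Bag accuracy: {forest.oob_score_:.3f}")

# Rank predictors by impurity-based importance (values sum to 1).
ranking = sorted(
    enumerate(forest.feature_importances_), key=lambda p: p[1], reverse=True
)
for idx, importance in ranking:
    print(f"predictor {idx}: importance = {importance:.3f}")
```

The Out-Of-Bag accuracy acts as a built-in validation estimate, since each row is scored only by trees whose bootstrap sample did not include it.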
I hope this complementary answer helps,
Victor GUILLER
L'Oréal Data & Analytics
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)