data preprocessing
Hello.
My question is: when we use a nonparametric model in software, such as a neural network, SVM, or Naive Bayes, does the software scale our data by default? Or should we scale our numeric data before using these models?
Accepted Solutions
Re: data preprocessing
OK, but in my (very long) answer, I did mention tree-based methods and Naive Bayes:
Tree-based models and probability-based algorithms like Naive Bayes may not require scaling.
Tree-based methods don't require scaling because they are not distance-based algorithms. Splits depend only on the order of the values and on the information gained by splitting at a given threshold, so rescaling a feature changes the threshold values but not how the rows are partitioned.
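To make the point about splits concrete, here is a minimal sketch in plain Python of a single Gini-based split (a decision stump). The data, the helper names `gini` and `best_split`, and the rescaling constants are made up for illustration; this is not how JMP implements its tree platforms, only a small instance of the same idea.

```python
import math


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(x, y):
    """Return (left-hand row indices, weighted impurity) of the best threshold split on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_score, best_left = float("inf"), None
    for k in range(1, len(order)):
        # Only consider cut points between two distinct values.
        if x[order[k - 1]] == x[order[k]]:
            continue
        left, right = order[:k], order[k:]
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right])) / len(x)
        if score < best_score:
            best_score, best_left = score, sorted(left)
    return best_left, best_score


# Toy data (hypothetical): one numeric feature and a binary label.
x = [1.2, 3.4, 0.5, 7.8, 6.1, 2.2, 9.0, 5.5]
y = [0, 0, 0, 1, 1, 0, 1, 1]

raw_left, raw_score = best_split(x, y)

# Linear rescaling (shift, then divide by a positive constant).
scaled_left, scaled_score = best_split([(v - 4.0) / 3.0 for v in x], y)

# Any order-preserving transform behaves the same way, e.g. a log.
log_left, log_score = best_split([math.log(v) for v in x], y)

print(raw_left == scaled_left == log_left)
```

The threshold value itself differs between the raw and transformed feature, but the rows that end up on each side of the split are the same, which is why the fitted tree makes the same predictions.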
Naive Bayes is a probability-based algorithm: it calculates probabilities from the distribution of each feature, and is therefore invariant to the scale of the data.
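The same kind of check can be sketched for Naive Bayes. The example below assumes Gaussian class-conditional densities for a single numeric feature, which is an assumption of this sketch and not necessarily what your software uses; the helper names `fit_gaussian_nb` and `posterior` and the toy data are hypothetical.

```python
import math


def fit_gaussian_nb(x, y):
    """Estimate the prior, mean and variance of one numeric feature per class."""
    params = {}
    for c in sorted(set(y)):
        vals = [v for v, label in zip(x, y) if label == c]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        params[c] = (len(vals) / len(x), mean, var)
    return params


def posterior(params, value):
    """Posterior class probabilities for a single feature value."""
    log_joint = {}
    for c, (prior, mean, var) in params.items():
        log_pdf = (-0.5 * math.log(2 * math.pi * var)
                   - (value - mean) ** 2 / (2 * var))
        log_joint[c] = math.log(prior) + log_pdf
    # Normalize in a numerically stable way.
    top = max(log_joint.values())
    weights = {c: math.exp(v - top) for c, v in log_joint.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def rescale(v):
    """Hypothetical linear rescaling applied to the feature."""
    return (v - 4.0) / 3.0


# Toy data (hypothetical): one numeric feature and a binary label.
x = [1.2, 3.4, 0.5, 7.8, 6.1, 2.2, 9.0, 5.5]
y = [0, 0, 0, 1, 1, 0, 1, 1]
query = 4.5

p_raw = posterior(fit_gaussian_nb(x, y), query)
p_scaled = posterior(fit_gaussian_nb([rescale(v) for v in x], y), rescale(query))

# The two posteriors agree up to floating-point rounding.
print(all(math.isclose(p_raw[c], p_scaled[c], rel_tol=1e-9) for c in p_raw))
```

Rescaling multiplies every class's density by the same constant, which cancels when the posteriors are normalized. An implementation that adds a fixed smoothing term to the variances may not be exactly invariant, so treat this as a property of the basic model rather than of every software package.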
Some further resources:
https://www.dataschool.io/comparing-supervised-learning-algorithms/
https://forecastegy.com/posts/do-decision-trees-need-feature-scaling-or-normalization/
Does this complementary response answer your question?
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Re: data preprocessing
Did you read my reply to one of your similar questions: https://community.jmp.com/t5/Discussions/data-preprocessing/m-p/761840/highlight/true#M93976 ?
Re: data preprocessing
Yes, I read it, but you mentioned SVM, KNN, neural networks, and (linear and logistic) regression.
I actually want to know about other models:
boosted tree
bootstrap forest
decision tree
Naive Bayes
Thanks
Re: data preprocessing
Thanks a lot