I'll take a stab at this, although I think you probably need to be more specific to get better advice. I'm not sure how you are distinguishing "cleaning" data from "preprocessing", but I would advise against the steps you list. I never recommend removing outliers - at least not until fairly well into an analysis, when you have some understanding of why the outliers are, in fact, outliers. I also would not recommend normalizing or rescaling data as a general practice (though it may be useful in particular contexts) - many JMP analysis platforms automatically normalize data where it is helpful. Similarly, imputing missing data yourself is unnecessary in most JMP platforms, as it is done automatically (usually by just checking a box to include missing values). There are times you will want to impute missing values more carefully, perhaps using your own methodology (e.g., building a regression model to impute missing data), but this again will depend on the context. So, I don't recommend doing any of the things you list as an automatic step. Instead, I'd begin by graphically examining your data to make sure you understand what is being measured, what types of relationships seem to exist and are potentially important, and to ascertain whether some "preprocessing" is a good idea.
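To make the "build a regression model to impute missing data" idea concrete, here is a minimal sketch in plain Python (not JSL, and not how any JMP platform does it internally). The function names and the toy data are made up for illustration, and it assumes a single fully observed predictor and a roughly linear relationship.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on paired, fully observed data."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def impute_with_regression(xs, ys):
    """Return a copy of ys with None entries replaced by predictions from xs."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [a + b * x if y is None else y for x, y in zip(xs, ys)]

# Hypothetical data: y is exactly 2*x + 1 wherever it is observed.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, None, 9.0, 11.0]
imputed = impute_with_regression(x, y)
print(imputed)
```

Whether a model-based fill like this is appropriate still depends on why the values are missing, which is the context point made above.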
Indeed, it is important to examine your data graphically before preprocessing them.
I want to compare the predictive performance of several machine learning models (distance-based models such as KNN, nonlinear models such as Random Forest, etc.) on my data. Distance-based models treat features as if they were on the same scale, but physiological variables such as pH and heart rate have values on very different scales. Therefore I want to scale them by z-score normalization. Moreover, the clustering algorithm (K-means clustering) does not impute missing values automatically in JMP Pro. That's why I wanted to preprocess my data first.
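For reference, the z-score normalization mentioned here can be sketched in a few lines of plain Python. The pH and heart rate values below are invented for illustration, and the sketch assumes the sample standard deviation (n - 1 denominator); a given tool may use a different convention.

```python
from statistics import mean, stdev

def z_scores(values):
    """Center on the mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption of this sketch
    return [(v - m) / s for v in values]

# Hypothetical measurements on very different scales.
ph = [7.35, 7.40, 7.45, 7.30, 7.50]
heart_rate = [60.0, 72.0, 85.0, 95.0, 110.0]

ph_z = z_scores(ph)
hr_z = z_scores(heart_rate)
print(ph_z)
print(hr_z)
```

After the transformation both columns have mean 0 and standard deviation 1, so neither dominates a distance calculation purely because of its units.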
I believe standardization is done for clustering. The JMP documentation says that for KNN, each variable is scaled by its standard deviation - precisely to avoid the type of problem you are alluding to. So, I still think no additional preprocessing is required for you to do what you want - compare a number of different predictive models.
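To illustrate why scaling each variable by its standard deviation matters for KNN, here is a small plain-Python sketch. It is not JMP's implementation, only the general idea; the `nearest` helper and the (pH, heart rate) rows are hypothetical.

```python
from math import sqrt
from statistics import stdev

def nearest(rows, query, scales=None):
    """Index of the row closest to query in Euclidean distance.

    Each column difference is divided by the matching entry of scales
    (all 1.0 if scales is None, i.e. raw units).
    """
    if scales is None:
        scales = [1.0] * len(query)

    def dist(row):
        return sqrt(sum(((r - q) / s) ** 2 for r, q, s in zip(row, query, scales)))

    return min(range(len(rows)), key=lambda i: dist(rows[i]))

# Hypothetical (pH, heart rate) observations and a query point.
rows = [(7.10, 71.0), (7.40, 78.0), (7.45, 110.0), (7.30, 95.0)]
query = (7.40, 70.0)

sds = [stdev(col) for col in zip(*rows)]  # per-column sample standard deviations

raw_nn = nearest(rows, query)          # heart rate dominates in raw units
scaled_nn = nearest(rows, query, sds)  # both variables contribute comparably
print(raw_nn, scaled_nn)
```

In raw units the nearest row is the one with the closest heart rate even though its pH is far off; once each column is divided by its standard deviation, the row with the matching pH wins. If the platform already applies this scaling, doing it again by hand should not change the neighbors.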