Hi,
Could you suggest an analysis method for data like the following?
X1 X2 X3 X4 X5 Y
21 33 2 66 81 90
17 39 10 17 90 120
5 55 6 51 95 106
...
I am trying to figure out which X is related to the Y response.
If any X is related, what would the equation be?
Thank you for helping.
Hello @MikeKim.
Unfortunately, I think the short answer to your question is that more data is needed to determine which Xs are related to Y.
Do you have a reason to believe that only one X will be related to the response, or might several Xs have an influence on Y?
A scatterplot of the data provides a few clues as to what might be going on, but I would avoid drawing specific conclusions based on only three runs.
Firstly, X1 seems to correlate with X2. X3 also seems to correlate with X4. Is this fundamental to the way the system operates? If these correlations are expected and/or can be confirmed by further data, this might allow a predictive model to be built with less data than would be required for five independent Xs. A model could be built with only one of each correlating pair, or a Principal Component approach could be used.
There is also an indication that Y might correlate with X3 and X4. This could be investigated if more data were available.
The best path forward would depend on what type of system is being studied. Will only unstructured data be available or can designed experiments be conducted to answer the question?
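For anyone who wants to reproduce the pairwise correlations outside JMP, here is a minimal sketch in plain Python using only the three rows posted above. The `pearson` helper and the variable names are my own, not anything from JMP, and with three runs the numbers should be read as hints only.

```python
from itertools import combinations
from math import sqrt

# The three rows posted in the question (columns: X1..X5, Y).
columns = ["X1", "X2", "X3", "X4", "X5", "Y"]
rows = [
    [21, 33, 2, 66, 81, 90],
    [17, 39, 10, 17, 90, 120],
    [5, 55, 6, 51, 95, 106],
]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / sqrt(s_aa * s_bb)

# Turn the rows into one list per column, then correlate every pair.
data = {name: [row[i] for row in rows] for i, name in enumerate(columns)}
corr = {(a, b): pearson(data[a], data[b]) for a, b in combinations(columns, 2)}

# Print the pairs from strongest to weakest absolute correlation.
for (a, b), r in sorted(corr.items(), key=lambda kv: -abs(kv[1])):
    print(f"{a:>2} vs {b:<2}  r = {r:+.2f}")
```

With only three points, very large absolute correlations appear easily by chance, which is why more runs are needed before trusting any of these pairs.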
I apologize for the late reply.
Actually, the data is mock data.
What I was asking is which analysis would be applicable to data of that type.
I have concluded that I should start with the multivariate correlation, as you showed me.
Thank you for your support, and let me know if there is any other screening method.
My current strategy is:
1. Multivariate correlation
2. Regression analysis on each X in turn (if a correlation exists)
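The two steps above can be sketched roughly as follows in plain Python. Everything here is an assumption for illustration: the data is synthetic (Y is deliberately built from X3 plus noise), the `THRESHOLD` cut-off of 0.5 is arbitrary, and the helper names are hypothetical. It is not what JMP does internally.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the synthetic example is repeatable

# Synthetic stand-in data (NOT the poster's data): Y depends on X3 only.
n = 40
data = {f"X{i}": [random.uniform(0, 100) for _ in range(n)] for i in range(1, 6)}
data["Y"] = [50 + 0.8 * x3 + random.gauss(0, 5) for x3 in data["X3"]]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    m = len(a)
    mean_a, mean_b = sum(a) / m, sum(b) / m
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / sqrt(s_aa * s_bb)

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - slope * mean_x, slope

THRESHOLD = 0.5  # arbitrary cut-off chosen for this sketch

# Step 1: correlation of each X with Y.
screen = {name: pearson(col, data["Y"]) for name, col in data.items() if name != "Y"}

# Step 2: simple regression only for the Xs that pass the screen.
fits = {name: fit_line(data[name], data["Y"])
        for name, r in screen.items() if abs(r) >= THRESHOLD}

for name, (intercept, slope) in fits.items():
    print(f"Y = {intercept:.1f} + {slope:.2f} * {name}   (r = {screen[name]:+.2f})")
```

One caveat with this strategy: fitting each X separately ignores correlations among the Xs themselves, so a multiple regression or a designed experiment may still be needed to separate their effects.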
Two very useful screening methods are Response Screening and Predictor Screening. Response Screening is most useful for large data sets where you are looking only for linear relationships, and Predictor Screening is most useful when you suspect interactions or curvature.
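To show why the linear-versus-curvature distinction matters, here is a small plain-Python sketch. It does not reproduce either JMP platform. It compares a Pearson correlation with a binned correlation ratio (eta squared) on synthetic, U-shaped data; the data, the five-bin choice, and the helper names are all assumptions made for illustration.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the synthetic example is repeatable

# Synthetic data: Y is a U-shaped (quadratic) function of X plus noise.
x = [random.uniform(0, 100) for _ in range(200)]
y = [(xi - 50) ** 2 + random.gauss(0, 50) for xi in x]

def pearson(a, b):
    """Pearson correlation coefficient (captures linear association only)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    return s_ab / sqrt(s_aa * s_bb)

def correlation_ratio(x, y, bins=5):
    """Eta squared: share of Y's variance explained by equal-width bins of X."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins
    groups = [[] for _ in range(bins)]
    for xi, yi in zip(x, y):
        idx = min(int((xi - lo) / width), bins - 1)  # keep max(x) in the last bin
        groups[idx].append(yi)
    grand_mean = sum(y) / len(y)
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups if g)
    total = sum((yi - grand_mean) ** 2 for yi in y)
    return between / total

r_linear = pearson(x, y)
eta2 = correlation_ratio(x, y)
print(f"Pearson r = {r_linear:+.2f}, eta squared = {eta2:.2f}")
```

A linear-only screen would rank this X as unimportant even though Y depends on it strongly, which is the kind of case where a method that can pick up curvature is the safer choice.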
I appreciate your answer.
However, I have no idea how to analyze the results of Response Screening further.
The results are quite difficult to interpret, and I have no background knowledge of this platform.
Sorry about this.
But thank you!