Thierry_S
Super User

JMP > Dimension Reduction > Collapse Highly Correlated Variables (Total N = 7300)?

Hi JMP Community,

I have an extensive biomarker data set (7,300 variables) that contains multiple subsets of highly correlated variables (r > 0.95) that I want to collapse into representative variables (i.e., one aggregated variable for each group of highly correlated variables). While I can easily identify the highly correlated pairs of variables, I am struggling with identifying all the members of each group. I have experimented with Clustering, but I cannot get to a definitive answer.
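Editor's sketch, not part of the original post: one generic way to go from a list of highly correlated pairs to full group membership is to treat each pair as an edge in a graph and take the connected components. The Python below assumes the pairs have already been filtered at the chosen cutoff (for example |r| > 0.95); the function name `group_correlated` and the biomarker names are hypothetical. This is not the JMP platform's own algorithm.

```python
from collections import defaultdict


def group_correlated(pairs):
    """Group variables linked by any chain of highly correlated pairs.

    `pairs` is an iterable of (name_a, name_b) tuples that already passed
    the correlation cutoff. Returns a sorted list of sorted groups.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for name in list(parent):
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical variable names, for illustration only.
pairs = [("IL6", "CRP"), ("CRP", "SAA1"), ("TNF", "IL1B")]
groups = group_correlated(pairs)
print(groups)
```

Note that membership here is transitive: if A pairs with B and B pairs with C, all three land in one group even when the A to C correlation itself is below the cutoff. Whether that chaining is acceptable depends on the data.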

Is there a method under Multivariate Methods that would allow me to collapse this data set?

Of note, I cannot share the data set because of confidentiality.

Thank you for your help.

Best,

TS

Thierry R. Sornasse
2 Accepted Solutions
Thierry_S
Super User

Re: JMP > Dimension Reduction > Collapse Highly Correlated Variables (Total N = 7300)?

Hi JMP Community,

It seems that I tend to find part of the answer soon after posting. Step 1: Use the Variable Clustering platform under the Cluster menu = completed.

Now that I have the variables clustered, with the Most Representative and Cluster Membership results available, what is the best way to apply the output to my table? (I assume that a JSL script will be involved.)

Thanks for your help.

Best,

TS 

Thierry R. Sornasse


Thierry_S
Super User

Re: JMP > Dimension Reduction > Collapse Highly Correlated Variables (Total N = 7300)?

Hi JMP Community,

Well, I solved my problem. After clustering the variables (the subset of variables with intercorrelation > 0.95), I matched the Cluster Membership and Most Representative results to the original stacked (tall and narrow) table. I then selected all rows with no cluster association, plus those whose variable name matches the Most Representative of its cluster, subsetted them, and split by variable name. This took me from 7,300 variables to 5,700 (i.e., a 22% reduction).
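Editor's sketch, not part of the original post: the match, select, subset, and split steps described above can be illustrated outside JMP. The Python below uses only the standard library and a tiny made-up table; the column names (`sample`, `variable`, `value`) and the layout of `cluster_info` are assumptions, not the actual output format of the Variable Clustering platform.

```python
# Stacked (tall and narrow) table: one row per sample/variable pair.
# All names and values are made up for illustration.
stacked = [
    {"sample": "S1", "variable": "A", "value": 1.0},
    {"sample": "S1", "variable": "B", "value": 1.1},
    {"sample": "S1", "variable": "C", "value": 5.0},
    {"sample": "S2", "variable": "A", "value": 2.0},
    {"sample": "S2", "variable": "B", "value": 2.1},
    {"sample": "S2", "variable": "C", "value": 4.0},
]

# Assumed clustering output: only clustered variables appear here.
# A and B form cluster 1, and A is its most representative member.
cluster_info = {
    "A": {"cluster": 1, "most_representative": "A"},
    "B": {"cluster": 1, "most_representative": "A"},
}


def keep(row):
    """Keep unclustered variables and each cluster's representative."""
    info = cluster_info.get(row["variable"])
    return info is None or info["most_representative"] == row["variable"]


# Select and subset the rows to retain.
subset = [row for row in stacked if keep(row)]

# Split by variable name: back to one row per sample, one column per variable.
wide = {}
for row in subset:
    wide.setdefault(row["sample"], {})[row["variable"]] = row["value"]

print(wide)
```

In this toy case B is dropped because A represents its cluster, while C survives because it belongs to no cluster.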

Best,

TS

Thierry R. Sornasse


3 Replies
P_Bartell
Level VIII

Re: JMP > Dimension Reduction > Collapse Highly Correlated Variables (Total N = 7300)?

One other thought for you, besides clustering, is principal components analysis. It is tailor-made for dimensionality reduction.
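Editor's sketch, not part of the original reply: to show what PCA does with highly correlated variables, the standard-library Python below finds the first principal component of a tiny made-up data set by power iteration on the covariance matrix. The function name `first_principal_component` is hypothetical, and the fixed starting vector is an assumption that fails if it happens to be orthogonal to the leading component. For real data, JMP's Principal Components platform or a numerical library would be the practical choice.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (loadings, explained_fraction) for the first component.

    `rows` is a list of samples, each a list of numeric values.
    Works on the covariance matrix (variables are not standardized).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

    # Assumption: this start vector is not orthogonal to the leading component.
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            raise ValueError("power iteration collapsed; try another start vector")
        vec = [x / norm for x in nxt]

    # Rayleigh quotient gives the variance captured along `vec`.
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return vec, eigenvalue / total_variance


# Two made-up variables where the second is exactly twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, explained = first_principal_component(rows)
print(loadings, explained)
```

With two perfectly correlated variables, a single component carries essentially all of the variance, which is the sense in which PCA collapses redundant columns. Unlike picking a most representative variable, though, the resulting component is a weighted blend of the originals.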
