lala
Level VII

How to get non-duplicate column names from the "leaf label formula" of decision tree model?

For example, the following JSL saves the Leaf Label Formula column. How can I extract the unique column names from the formula values in the first 10 rows?

Thanks!

d0 = Open( "$SAMPLE_DATA/Equity.jmp" );
p=Partition(
	Y( :BAD ),
	X(
		:LOAN, :MORTDUE, :VALUE, :REASON, :JOB, :YOJ, :DEROG, :DELINQ, :CLAGE, :NINQ,
		:CLNO, :DEBTINC
	),
	Validation Portion( 0.3 )
);
p << go;
Wait( 2 );
p << save leaf label formula;
lala
Level VII

Re: How to get non-duplicate column names from the "leaf label formula" of decision tree model?

[Screenshot: 2023-10-07_11-56-51.png]

lala
Level VII

Re: How to get non-duplicate column names from the "leaf label formula" of decision tree model?

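Current Data Table( d0 );
// Join the leaf labels of the first 10 rows, using "aaa" as a row separator
txt = "";
For( i = 1, i <= 10, i++,
	If( i == 1,
		a = "",
		a = "aaa"
	);
	nn = d0[i, "Leaf Label Formula"];
	txt = txt || a || nn;
);
// Put each condition on its own line under a header "ccc" and drop the missing-value wording
tx = "ccc\!n" || Substitute( txt, "&", "\!n", "aaa", "\!n", " or Missing", "", " not Missing", "", " Is Missing", "" );
// Insert a tab before each operator so the column name becomes the first field
tx = Substitute( tx, ">", "\!t>", "<", "\!t<", "(", "\!t(" );
// Open the text as a table, then summarize by ccc to get one row per distinct name
d1 = Open( Char To Blob( tx ), "text" );
d2 = d1 << Summary( Group( ccc ), Freq( 0 ), Weight( 0 ), Link to original data table( 0 ) );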
jthi
Super User

Re: How to get non-duplicate column names from the "leaf label formula" of decision tree model?

You could loop over all the possible column names and, if a match is found in a leaf label, add that name to an associative array (and remove it from the column list if you want to, so it is not checked again).

Names Default To Here(1);

input_cols = {"LOAN", "MORTDUE", "VALUE", "REASON", "JOB", "YOJ", "DEROG", "DELINQ", "CLAGE", "NINQ", "CLNO", "DEBTINC"};

dt = Open("$SAMPLE_DATA/Equity.jmp");
part = dt << Partition(
	Y(:BAD),
	X(Eval(input_cols)),
	Validation Portion(0.3)
);
part << go;
part << save leaf label formula;
part << close window;

aa_cols = Associative Array();

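For Each({row_val}, dt[1::10, "Leaf Label Formula"],
	// Work on a copy so the list being iterated is not modified mid-loop
	input_cols_temp = input_cols;
	For Each({input_col}, input_cols,
		If(Contains(row_val, input_col),
			aa_cols[input_col] = 1;
			// Look up the position in the copy: an index into input_cols goes stale after the first removal
			Remove From(input_cols_temp, Contains(input_cols_temp, input_col));
		);
	);
	input_cols = input_cols_temp;
);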
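
Once the loop has finished, the keys of the associative array are the distinct column names. Here is a minimal, self-contained sketch of reading them back. The label strings, the shortened candidate list, and the names `labels`, `candidates` and `unique_cols` are made up for illustration and are not taken from the Equity table.

```jsl
Names Default To Here(1);

// Hypothetical leaf labels standing in for rows of the "Leaf Label Formula" column
labels = {"DEBTINC<45&DELINQ<2", "DEBTINC<45&DELINQ>=2", "DEBTINC>=45 or Missing"};
// Hypothetical candidate list; LOAN does not occur in any label above
candidates = {"LOAN", "DEBTINC", "DELINQ"};

aa = Associative Array();
For Each({label}, labels,
	For Each({col}, candidates,
		// Contains() returns the match position, or 0 when the name is absent
		If(Contains(label, col) > 0,
			aa[col] = 1
		)
	)
);

// Each key is stored once, so the key list is already de-duplicated
unique_cols = aa << Get Keys;
Show(unique_cols);
```

As far as I know, Get Keys returns the keys in sorted order, so the list will not necessarily follow the order in which the names first appear in the labels.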

-Jarmo