Subscribe Bookmark RSS Feed

Formula Column from variable list using jsl parse, eval insert, expr and substitute

tony_cooper1

Staff

Joined:

Sep 29, 2014

I use 2 methods to create formula columns depending on whether I know how many variables the formula needs

  1. Expr and substitute are easy to read. Section 1 in the code creates several  formula columns for each variable in a list for mean(var, byvar)
  2. Parse and eval and possibly eval insert create a single formula column that requires a number of variables, the number and name being set by the variable list.
    1. Section 2a and 2b work and are functionally equivalent
    2. Section 2c and 2d do not run, but seem a path to an elegant solution

I would like ideas to create better practices and would appreciate recommendations

Names Default To Here( 1 );
n=ntable();
dtnms={};
If(
n==0, dtc=1,
n==1, If(current data table() << get name == "Big Class", dtc=0, dtc=1) , // else
for(ii=1, ii <= n, ii++, insert into(dtnms, data table(ii) << getname));
if(contains(dtnms, "Big Class") > 0, dtc=1, dtc=0);
);
if (dtc,dt = Open( "$SAMPLE_DATA/Big Class.jmp" ), dt = data table("Big Class"));

// Section 1: Add several columns of formula entering variables dynamically.
Form1=expr(
form1col=dt << new column(Name, formula(col mean(As Column(YVar), As Column(ByVar))))
);
VarList={"sex", "height", "weight"};
For(ii=2, ii <= N Items(VarList), ii++,
Eval(Substitute(Name expr(form1),
  expr(Name), VarList[ii]||" Averages",
  expr(YVar), VarList[ii],
  expr(ByVar), VarList[1],
));
);

// Section 2: Average the columns in the list of variables
// Method 2a Method works but is hard to debug
VarList={"height", "weight", "age"};
Formtxt1 = "formcol = dt << new column(\!"Method a: Mean\!", formula(mean( ";
Formtxt2 = "As Column(\!""||VarList[1]||"\!")";
If(
N Items(VarList)==2, Formtxt2 ||= ", As Column(\!""||VarList[2]||"\!")",
N Items(VarList)>2, For(ii=2, ii <= N Items(VarList), ii++,
   Formtxt2 ||= ", As Column(\!""||VarList[ii]||"\!")"
  )
);
Formtxt=Formtxt1||Formtxt2||");";
eval(parse(Formtxt));

// Method 2b: Somewhat easier to see
col4=expr(
  NewCol1=dt << New Column("Method b: Mean of "||_names_);
);
form4txt="NewCol1 <<formula(Mean( ^_Vars_^  )))";

VarList={"height", "weight", "age"};
VarRef = eval list(dt << get Column Reference(VarList));
_Vars_ = "As Column( \!"" || ConCat Items(VarList, "\!"), As Column( \!"") || "\!")";
form4txta = Eval Insert(form4txt);

Eval(Substitute(Name expr(col4),
expr(_names_), concat Items(VarList,", ",)
));
Eval(Parse(form4txta)); //Parse makes the text an expr which is evaluated

// *************************************** //
// ************* Not working ************* //

//Method 2c: I think this would be the best, but I cannot make it correct - Still have a list
form3=expr(
  NewCol=dt << New Column("Mean of "||_names_, formula(Mean(_Vars_ )));
);
VarList={"height", "weight", "age"};
VarRef = eval list(dt << get Column Reference(VarList));
Eval(Substitute(Name expr(form3),
expr(_names_), concat Items(VarList,", ",),
expr(_Vars_), VarRef
));

// Method 2d: Using a loop, but still incorrect
form4=expr(
  NewCol=dt << New Column("Mean of "||_names_, formula(Mean(_Vars_ )));
);

VarList={"height", "weight", "age"};
VarRef = eval list(dt << get Column Reference(VarList));
Form4a =Substitute(Name expr(form4),
expr(_names_), concat Items(VarList,", ",)
);

For(ii=1, ii <= N Items(VarList), ii++,
form4a=Substitute(Name expr(form4a),
  expr(_Vars_), expr(VarRef[ii], _Vars_)
);
);
Eval(Substitute(Name expr(form4a),
expr(_Vars_), ""
));


// *************************************** //
// ************* Not working ************* //

2 REPLIES
ms

Super User

Joined:

Jun 23, 2011

Nice collection of different methods.

Lately I have come to prefer working with expressions rather than strings in cases like this. For setting a formula by JSL I first build the expression using EvalExpr(), which I then insert with NameExpr() to avoid evaluation at runtime.

Below are two examples based on either a list of column names or a list of strings, the latter is a little more challenging.

dt = Open("$SAMPLE_DATA/Big Class.jmp");

// 1. List of column names

L1 = {age, height, weight}; // i.e dt<<get column names(numeric);

form1 = Eval Expr(Mean(Expr(L1)));

form1col = dt << New Column(Concat Items(Words(Char(L1), "{},")) || " Average", formula(Name Expr(form1)));

// 2. List of strings

L2 = {"age", "height", "weight"};// i.e dt<<get column names(numeric, string);

form2 = Eval Expr(

    Mean(

        Expr(

            Repeat(

                i = 0;

                {i++ ; Eval Expr(As Column(Expr(L2[i])))};,

                N Items(L2)

            )

        )

    )

);

form2col = dt << New Column(Concat Items(L2) || " Average", formula(Name Expr(form2)));