mat-ski
Level III

Default Local vs Explicit declaration of locals

Hi, I'm trying to understand how best to limit my risk with regard to local scope in functions. As I understand it, my options for limiting the potential to pollute the Here scope from within a Function execution are to either declare my local variables explicitly or use Default Local. However, each of these has some problems:

 

Default Local:

  • Referencing values may be context dependent, since names that haven't been defined locally fall back to Here, then possibly (if Names Default To Here is off) to Global, then to the current table. As noted in the docs, this can make behavior less predictable because values depend on the context.

 

Explicit Declaration:

  • Referencing values may be context dependent, since any name that is not declared in the local names list still uses this fallback behavior.
  • Assignments can accidentally leak into the Here scope. That is, if a name that is intended to be local is omitted from the local name declarations then it will pollute the surrounding Here scope
    • Generally this outward scope pollution is hard to track down.
    • For example, if you call a function in a For where you use "i" as an iterator and that function also has a For that uses i and fails to declare it as local, this will affect your outer For loop. Granted this is obvious programmer error, but it is an easy one to make and is difficult to debug.
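
The loop-counter case described above can be sketched as follows. This is a minimal illustration, not code from the original post: the function name `inner` and its body are hypothetical. Because `i` is missing from the locals list, the inner For writes to the caller's `i`, so the outer loop would be expected to stop after its first pass.

```jsl
Names Default To Here( 1 );

// Hypothetical helper: `total` is declared local, but `i` is not,
// so the For loop below assigns to the `i` in the surrounding Here scope.
inner = Function( {n}, {total},
	total = 0;
	For( i = 1, i <= n, i++, total += i );
	total
);

// The outer loop uses the same name. After the first call to inner(),
// the shared `i` has already been advanced past the outer loop's limit.
For( i = 1, i <= 3, i++,
	Write( "\!nouter i = ", i, ", inner = ", inner( 5 ) )
);
```

Adding `i` to the locals list of `inner` (that is, `{total, i}`) should restore the intended three passes.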

 

In my mind, the ideal option would be to be able to declare something like "Strict Local", which would always treat unqualified names (both when assigning and when referencing values) as local, so that surrounding scopes would only be accessible through qualified names. This would keep me from polluting surrounding scopes and would force me to declare when I am depending on those scopes, which also limits local pollution from the surrounding scopes.
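
As far as I know there is no such "Strict Local" directive, but a coding convention can approximate part of it: use Default Local so unqualified assignments stay local, and write every intended dependency on the surrounding scope with an explicit `Here:` qualifier. The sketch below is only an illustration of that convention; `countAbove` and `threshold` are hypothetical names, and nothing here prevents an unqualified read of an undeclared name from falling back to Here.

```jsl
Names Default To Here( 1 );

threshold = 10; // lives in the Here scope of this script

// Hypothetical function: unqualified names (count, k) are kept local by
// Default Local, and the dependency on the surrounding scope is spelled
// out as Here:threshold so it is visible to a reader.
countAbove = Function( {values}, {Default Local},
	count = 0;
	For( k = 1, k <= N Items( values ), k++,
		If( values[k] > Here:threshold, count++ )
	);
	count
);

Show( countAbove( {4, 12, 30} ) );
```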

1 ACCEPTED SOLUTION

mat-ski
Level III

Re: Default Local vs Explicit declaration of locals

For any future readers, I eventually wrote a test helper that I use to enforce that all locals are explicitly declared

 

/* 
 * Test that `invocation` does not add any variables to the Here namespace. It is important that
 * this test be run first, else whatever variables may leak will leak before running this test and 
 * so they will already be present in `hereVariablesBefore`.
 *
 * Example:
 * someFunction1 = Function( {}, {}, leakedVar = 1);
 * someFunction2 = Function( {}, {safeVar}, safeVar = 1);
 * assertNoLeakedVariables( Function( {}, {}, someFunction1() ) ); // Fails
 * assertNoLeakedVariables( Function( {}, {}, someFunction2() ) ); // Passes
 */ 
assertNoLeakedVariables = Function( {invocation},
	{hereVariablesBefore, hereVariablesAfter}, 

	Try(
		// Declare both of these in advance, because otherwise the key `hereVariablesBefore` will not be
		// present in `hereVariablesBefore` and it makes the assertion confusing.
		hereVariablesBefore = .;
		hereVariablesAfter = .;

		hereVariablesBefore = Namespace( "here" ) << Get Keys;
		invocation();
		hereVariablesAfter = Namespace( "here" ) << Get Keys;

		UT Assert( Expr( Length( hereVariablesBefore ) ), Length( hereVariablesAfter ) );

		If( Length( hereVariablesBefore ) != Length( hereVariablesAfter ),
			showLeakedVariableNames( hereVariablesBefore, hereVariablesAfter )
		);
	, 

		UT Assert( exception_msg, "" )
	)
);
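
The helper above calls `showLeakedVariableNames`, which is not defined in the post. One possible shape for it, offered only as a hypothetical sketch and not the author's implementation, is a set difference on the two key lists using associative arrays:

```jsl
// Hypothetical sketch of the undefined helper. It treats both key lists
// as sets and reports the keys present after the call but not before.
showLeakedVariableNames = Function( {keysBefore, keysAfter},
	{leaked},
	leaked = Associative Array( keysAfter );            // set of keys after the call
	leaked << Remove( Associative Array( keysBefore ) ); // drop keys that already existed
	Write( "\!nLeaked into Here: ", Char( leaked << Get Keys ) );
);
```

Defining this helper before running any assertion keeps its own name out of the before/after comparison.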


8 REPLIES 8
Craige_Hales
Super User

Re: Default Local vs Explicit declaration of locals

I like the explicit local declaration list. When I miss one, it lands in the global namespace and Show Globals() identifies it. Lather, rinse, repeat until the global space remains clean.

Craige
mat-ski
Level III

Re: Default Local vs Explicit declaration of locals

Can I ask what your workflow is for checking this? Do you add "Show Globals()" to the end of your functions during development and manually check that you've not added anything unintentional to the global namespace? Or is this something you are able to enforce with tests?

 

Side question: is it Show Globals or Show Symbols that I should be using? I am not seeing any output from Show Globals even when I intentionally omit local variables from my functions, but when I use Show Symbols I get so much output that I wouldn't notice any new listings anyway.

 

If it helps I have "Names Default To Here" mode on in all my script files.

Craige_Hales
Super User

Re: Default Local vs Explicit declaration of locals

My workflow in the past was to not use Names Default To Here and just keep the global namespace clear. Show Globals() probably never sees anything if you are always using Names Default To Here. Show Symbols() is noisy because it reports information up through the call stack if you use it in a deeply nested function. Here's my workflow example, reworked to use Names Default To Here and Show Symbols. Run it as is, and it does not leak. Comment out line 17 (g3) and it will throw an error because g3 is put into the 'here' namespace. The example's use of g2 does highlight an ugly problem; Default Local solves that problem but adds the issues you noted. I might try switching back to Default Local in a future project. I think it might be a safer choice for me; I'm more likely to leave a variable out of the explicit locals than read a variable before writing it.

I think you are trying to do a finer-grained test than I am; I only test (globals or here) at the end. You might be able to use some of these ideas if you need to test for namespace leakage in every function.


// **** you will probably need to restart JMP to really understand each change you make. ****

xyzzy = "before names default to here"; // this goes into globals. If you run again it goes into 'here'.

Names Default To Here( 1 ); // takes effect when executed

New Namespace( // don't put the functions in the here namespace, keep them in appDemo.
	"appDemo" // this keeps the here namespace free of voluminous function definitions.
);

appDemo:originalLeakState = Log Capture( Show Symbols() ); // remember the original state

appDemo:f1 = Function( {p1}, // parameter list
	{
		//g2, // <<<< comment this out to use f2's local
		g3, // <<<< comment this out to make a leak
		g1}, // local list
	Write( "\!ng2=", g2 );
	g1 = 1; // very local to this function
	g2 = 2; // not local, in this example it is local to f2, a bug if unintentional
	g3 = 3; // not local, lands in the 'here'
);
appDemo:f2 = Function( {p2},
	{g2},
	g2 = 42; // this will be overwritten by f1
	appDemo:f1( p2 );
);
appDemo:f2( "x" );

// if you call f1 from here, the here:g2 gets written
// g2 = 142;
// appDemo:f1( "x" );
 
// a test might look like this
If( Log Capture( Show Symbols() ) != appDemo:originalLeakState,
	Write( "\!n=============leak report==============" );
	Show Symbols();
	Throw( "variable namespace leaks, see log!" );
,
	Write( "\!n========= no change to symbols =======" );
	If( Length( appDemo:originalLeakState ),
		Write(
			"\!n========= original leak state: =======",
			appDemo:originalLeakState
		)
	);
);
Craige
mat-ski
Level III

Re: Default Local vs Explicit declaration of locals

Thanks this is very helpful! I think that I should be able to use some of what you've written here in order to write some automated checks that ensure nothing leaks.

vince_faller
Super User (Alumni)

Re: Default Local vs Explicit declaration of locals

Also, there's something weird about DEFAULT LOCAL. It actually reads the script and creates the local assignments based on its interpretation of what variables are being created. There are more than a few ways to screw this up. The one I probably see most often is when people try to dynamically create a variable name, which I would say is not great to do, and you couldn't explicitly declare it either, so it's sort of same same.

 

Names default to here(1);
f = function({}, 
	{DEFAULT LOCAL}, 
	x = 14;
	Eval(Parse("x"||char(x)||"=28"));//not that anyone should ever do this BUT ... 
	return(x);
);
f();
namespace("here"); // notice how x14 is in the here namespace
/*
New Namespace(
	{
		f = Function( {},
			{Default Local},
			x = 14;
			Eval( Parse( "x" || Char( x ) || "=28" ) );
			Return( x );
		),
		x14 = 28
	}
)
*/
Vince Faller - Predictum
ErraticAttack
Level VI

Re: Default Local vs Explicit declaration of locals

This has come up for me quite often -- it can be very frustrating:

 

This script clobbers the names (names escape the function's local scope):

Names Default to Here( 1 );
some expr = Expr(
	a = 4;
	b = 5;
	g = 6;
);

f = Function( {input expr},
	{Default Local},
	Print( "running expr" );
	input expr
);

f( Name Expr( some expr ) );

show( a, b, g )

Here is a solution, but the problem with it is that it is technically two function calls (it counts as 2 stack items, and thus will reach the 225 stack limit faster):

Names Default to Here( 1 );
some expr = Expr(
	a = 4;
	b = 5;
	g = 6;
	Show( Eval List( {a, b, g} ) );
);

f = Function( {input expr},
	{Default Local},
	Print( "running expr" );
	Eval( Eval Expr(
		Local( {Default Local},
			Expr( Name Expr( input expr ) )
		)
	) );
	1
);

f( Name Expr( some expr ) );

show( a, b, g )

It would be nice if the "Default Local" directive wasn't a parse time directive but a run time flag, but perhaps what is happening is that name-resolution is being performed at parse time, thus making run-time name lookup rules moot.

Jordan
hogi
Level XII

Re: Default Local vs Explicit declaration of locals

Concerning "Workflow" - A little helper like this #NamespaceInspector will help a lot to keep track of variables.
Do they show up in a Here Namespace - or in a local namespace?
Once debugging is finished, I will upload it as an AddIn ...

 


