cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
dale_lehman
Level VII

How can I use a lag function applied by groups?

I have data where I'd like to compute the lag from one row to the next (it is a time series), but I have data for many groups.  I want the lag created within each group separately.  For many statistical functions (e.g., Col Mean) there is the option to add one or more By Group variables.  There does not appear to be the same functionality for the Lag or Dif functions and I can't find anything in the scripting documentation to indicate how that might be done.  Is there a way to easily compute the lags for each group separately?  The only way I have currently found is to first sort the data by date and by group, and then use an IF formula that depends on the Group equaling the lag (Group), such as IF (Group) - lag (Group), then lag (column of interest), else missing.

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: How can I use a lag function applied by groups?

Here is a formula that appears to work, even when the rows are not sorted by the 2 levels. (i.e. sorted by date)  The lagList in the formula, is there to handle the issue that the Col Moving Average() function returns the current row, even on the first occurrence.

If( Row() == 1, lagList = {} );
If( Contains( lagList, :administrative_area_level_1 || :administrative_area_level_2 ),
	Col Moving Average(
		:deaths,
		weighting = 1,
		before = 1,
		:administrative_area_level_1,
		:administrative_area_level_2
	) * 2 - :deaths,
	Insert Into( lagList, :administrative_area_level_1 || :administrative_area_level_2 )
);

txnelson_0-1654440546090.png

Here is an inefficient formula that also works

curLevel1 = :administrative_area_level_1;
curLevel2 = :administrative_area_level_2;
curRow = Row();
Try(
	:deaths[Max(
		Current Data Table() << get rows where(
			:administrative_area_level_1 == curLevel1 & (:administrative_area_level_2 == curLevel2 & Row() < curRow)
		)
	)]
);
Jim

View solution in original post

3 REPLIES 3
dale_lehman
Level VII

Re: How can I use a lag function applied by groups?

To make this more concrete, I've attached a simple example.  The lagged deaths column is what I want - but I'd like to be able to do it more simply or elegantly.

txnelson
Super User

Re: How can I use a lag function applied by groups?

Here is a formula that appears to work, even when the rows are not sorted by the 2 levels. (i.e. sorted by date)  The lagList in the formula, is there to handle the issue that the Col Moving Average() function returns the current row, even on the first occurrence.

If( Row() == 1, lagList = {} );
If( Contains( lagList, :administrative_area_level_1 || :administrative_area_level_2 ),
	Col Moving Average(
		:deaths,
		weighting = 1,
		before = 1,
		:administrative_area_level_1,
		:administrative_area_level_2
	) * 2 - :deaths,
	Insert Into( lagList, :administrative_area_level_1 || :administrative_area_level_2 )
);

txnelson_0-1654440546090.png

Here is an inefficient formula that also works

curLevel1 = :administrative_area_level_1;
curLevel2 = :administrative_area_level_2;
curRow = Row();
Try(
	:deaths[Max(
		Current Data Table() << get rows where(
			:administrative_area_level_1 == curLevel1 & (:administrative_area_level_2 == curLevel2 & Row() < curRow)
		)
	)]
);
Jim
dale_lehman
Level VII

Re: How can I use a lag function applied by groups?

I might have guessed that txnelson would have a solution!  It isn't really more elegant or simpler than what I had, but it avoids the need to sort the data - which makes it considerably more flexible.  Thanks.