- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
How can I use a lag function applied by groups?
I have data where I'd like to compute the lag from one row to the next (it is a time series), but I have data for many groups. I want the lag created within each group separately. For many statistical functions (e.g., Col Mean) there is the option to add one or more By Group variables. There does not appear to be the same functionality for the Lag or Dif functions and I can't find anything in the scripting documentation to indicate how that might be done. Is there a way to easily compute the lags for each group separately? The only way I have currently found is to first sort the data by date and by group, and then use an IF formula that depends on the Group equaling the lag (Group), such as IF (Group) - lag (Group), then lag (column of interest), else missing.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I use a lag function applied by groups?
Here is a formula that appears to work, even when the rows are not sorted by the 2 levels. (i.e. sorted by date) The lagList in the formula, is there to handle the issue that the Col Moving Average() function returns the current row, even on the first occurrence.
If( Row() == 1, lagList = {} );
If( Contains( lagList, :administrative_area_level_1 || :administrative_area_level_2 ),
Col Moving Average(
:deaths,
weighting = 1,
before = 1,
:administrative_area_level_1,
:administrative_area_level_2
) * 2 - :deaths,
Insert Into( lagList, :administrative_area_level_1 || :administrative_area_level_2 )
);
Here is an inefficient formula that also works
curLevel1 = :administrative_area_level_1;
curLevel2 = :administrative_area_level_2;
curRow = Row();
Try(
:deaths[Max(
Current Data Table() << get rows where(
:administrative_area_level_1 == curLevel1 & (:administrative_area_level_2 == curLevel2 & Row() < curRow)
)
)]
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I use a lag function applied by groups?
To make this more concrete, I've attached a simple example. The lagged deaths column is what I want - but I'd like to be able to do it more simply or elegantly.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I use a lag function applied by groups?
Here is a formula that appears to work, even when the rows are not sorted by the 2 levels. (i.e. sorted by date) The lagList in the formula, is there to handle the issue that the Col Moving Average() function returns the current row, even on the first occurrence.
If( Row() == 1, lagList = {} );
If( Contains( lagList, :administrative_area_level_1 || :administrative_area_level_2 ),
Col Moving Average(
:deaths,
weighting = 1,
before = 1,
:administrative_area_level_1,
:administrative_area_level_2
) * 2 - :deaths,
Insert Into( lagList, :administrative_area_level_1 || :administrative_area_level_2 )
);
Here is an inefficient formula that also works
curLevel1 = :administrative_area_level_1;
curLevel2 = :administrative_area_level_2;
curRow = Row();
Try(
:deaths[Max(
Current Data Table() << get rows where(
:administrative_area_level_1 == curLevel1 & (:administrative_area_level_2 == curLevel2 & Row() < curRow)
)
)]
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I use a lag function applied by groups?
I might have guessed that txnelson would have a solution! It isn't really more elegant or simpler than what I had, but it avoids the need to sort the data - which makes it considerably more flexible. Thanks.