cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
The Discovery Summit 2025 Call for Content is open! Submit an abstract today to present at our premier analytics conference.
See how to use to use Text Explorer to glean valuable information from text data at April 25 webinar.
Choose Language Hide Translation Bar
View Original Published Thread

Duration

hcarr01
Level VI

 

Hello everyone,
 
I have database like below,
 
undefined

 


In the comic, the green cells are the results that we would like to obtain.

 
I would like to create a script to calculate durations.
I have "start date" columns and then I have a "number" column which indicates the order of execution.
The goal would be to calculate "start date y - start date x" based on the numbers.
 
I explain myself on an example with the comic strip:
 
1st line : We notice that the column "720" has the number 12 in the cell, so we are looking among all the "number" columns for the first one greater than 12 (here column "740" with the value 21)
So duration 720 would be duration 720 = start date 740 - start date 720.
 
Then for duration 740 it would be duration 740 = start date 9801 - start date 740 with the same principle
 
Here is the script I started writing:
 
dt = Current Data Table();

dur_cols = Filter Each( {col_name}, dt << Get Column Names( "String", Continuous ),
 Starts With( col_name, "Date début" )  
);


num_colnr = Transform Each( {col_name}, dur_cols, Word( -1, col_name ) );

char_cols = dt << Get Column Names ("string", Character); 



wait(1);
N_taches = N Items(dur_cols);
for (i = 1, i <= N_taches, i++, 
 Eval(
  Eval Expr(
   col = dt << New Column ( "durée" || num_colnr[i], Numeric, "Continuous", Format(":jou:hh:m",8));
   col << Formula(
    colidx = Loc(dt[row(), char_cols] > Eval(num_colnr[i]))[1];
    If ( N Items(colidx) > 0,
     val = dur_cols[colidx] - Eval(dur_cols[i]);
     If(IsMissing(val),
      0
     ,
      val
   );
  ,
   0
  );
 );
)));
Thanks for your help !
 
 

This post originally written in French and has been translated for your convenience. When you reply, it will also be translated back to French .

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User


Re: Durée

A bit lazy solution, but it might work

Names Default To Here(1);

dt = Open("$DOWNLOADS/AAAA.jmp");

// would get only numeric columns, but the provided example table is broken...
all_cols = dt << Get Column Names(/*Numeric, */"String"); 
// so first we have to fix the datatable
Remove From(all_cols, 1, 2);
For Each({col_name}, all_cols,
	Column(dt, col_name) << Set Data Type("Numeric");
	Column(dt, col_name) << Set Modeling Type("Continuous");
);

// create list of duration columns
// same amount of columns AND in same order for "number" and duration
date_cols = {};
some_cols = {};

For Each({col_name}, all_cols,
	If(Starts With(lowercase(col_name), "date d"),
		Insert Into(date_cols, col_name);
	, !IsMissing(Regex(col_name, "^\d+$")),
		Insert Into(some_cols, col_name);
	,
		show(col_name);
	);
);

If(N Items(date_cols) != N Items(some_cols),
	Throw("Column mismatch");
);

// create new columns
dur_cols = {};
For Each({some_col}, some_cols,
	new_col = dt << New Column("duration " || some_col, Numeric, Continuous);
	Insert Into(dur_cols, new_col << get name);
);

// calculation
For Each Row(
	For Each({date_col, idx}, date_cols,
		datesomething = dt[Row(), some_cols[idx]];
		datesomethingvals = dt[Row(), some_cols];
		// to make comparison easier
		datesomethingvals[idx] = .; 
		datesomethingvals[Loc(datesomethingvals < datesomething)] = .; // lazy handling
				
		idx_of_interest = Loc Min(Abs(datesomethingvals - datesomething));
		If(idx_of_interest == 0,
			continue();
		);
		val_of_interest = dt[Row(), date_cols[idx_of_interest]];

		dt[Row(), dur_cols[idx]] = Abs(val_of_interest - dt[Row(), date_col]); // not sure if Abs is correct here
	);
);

// durations for 900 and 4500 for some reason missing from demo data
-Jarmo

View solution in original post

5 REPLIES 5
pauldeen
Level VI


Re: Durée

Maybe something like this:

For(k=1, k<=6, k++,
	StartValue = Column(k+2)[1];
	For(i=1, i<=6, i++,
		If(num(Column(i+8)[1])>StartValue,break())
	);
	If(i<7, Write(Column("Date début "||(Column(i+8)<<get name()))[1]-StartValue));
);
hcarr01
Level VI

Re: Duration

thank you for your reply @pauldeen but this is a test database, this script should work for more than 700 columns.

This post originally written in French and has been translated for your convenience. When you reply, it will also be translated back to French .

jthi
Super User


Re: Durée

A bit lazy solution, but it might work

Names Default To Here(1);

dt = Open("$DOWNLOADS/AAAA.jmp");

// would get only numeric columns, but the provided example table is broken...
all_cols = dt << Get Column Names(/*Numeric, */"String"); 
// so first we have to fix the datatable
Remove From(all_cols, 1, 2);
For Each({col_name}, all_cols,
	Column(dt, col_name) << Set Data Type("Numeric");
	Column(dt, col_name) << Set Modeling Type("Continuous");
);

// create list of duration columns
// same amount of columns AND in same order for "number" and duration
date_cols = {};
some_cols = {};

For Each({col_name}, all_cols,
	If(Starts With(lowercase(col_name), "date d"),
		Insert Into(date_cols, col_name);
	, !IsMissing(Regex(col_name, "^\d+$")),
		Insert Into(some_cols, col_name);
	,
		show(col_name);
	);
);

If(N Items(date_cols) != N Items(some_cols),
	Throw("Column mismatch");
);

// create new columns
dur_cols = {};
For Each({some_col}, some_cols,
	new_col = dt << New Column("duration " || some_col, Numeric, Continuous);
	Insert Into(dur_cols, new_col << get name);
);

// calculation
For Each Row(
	For Each({date_col, idx}, date_cols,
		datesomething = dt[Row(), some_cols[idx]];
		datesomethingvals = dt[Row(), some_cols];
		// to make comparison easier
		datesomethingvals[idx] = .; 
		datesomethingvals[Loc(datesomethingvals < datesomething)] = .; // lazy handling
				
		idx_of_interest = Loc Min(Abs(datesomethingvals - datesomething));
		If(idx_of_interest == 0,
			continue();
		);
		val_of_interest = dt[Row(), date_cols[idx_of_interest]];

		dt[Row(), dur_cols[idx]] = Abs(val_of_interest - dt[Row(), date_col]); // not sure if Abs is correct here
	);
);

// durations for 900 and 4500 for some reason missing from demo data
-Jarmo
hcarr01
Level VI

Re: Duration

thank you for your reply @jthi !

This post originally written in French and has been translated for your convenience. When you reply, it will also be translated back to French .

pauldeen
Level VI


Re: Durée

So? Just up the For loop count number. Or make it dynamic by including N cols() as the upper bound.

If you have allready solved this would you please mark this topic as solved?