cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
AlexR846
Level III

How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Dears,

Started with JMP few weeks ago !! It's a naive question.

I am trying to convert a column Data type from Character to Numeric, however i am losing the all the Numeric values from the column while converting the data type to Numeric.

Please find the attached screenshots for the same.

 

How to retain the numeric values from a column when i change it's data type from Character into Numeric??

Tried by replacing the characters using following commands,

 

values1 = dt:RequiredColumn<<Get Values;

Regex( :RequiredColumn, "[^0-9]", "", GLOBALREPLACE );

 

with this all the numeric values are also getting replaced.

 

Simply i would like to retain the Numeric values from the column (originally in Character data type). Tried to search for this problem in our community couldn't achieve any relevant information about it.

 

Picture01 - OriginalColumnFormat  - Screenshot of Column's original data type

Picture02 - ColumnafterChangingDatatype- Screenshot of Column's after changing the data type to Numeric.

 

Thanks in Advance...

 

ColumnafterChangingDatatype.JPG OriginalColumnFormat.JPG

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Converting the table you have is "easy", but in general this sort of conversion isn't. There are so many different values you will have to take into account. Then there are is an issue/feature with how JMP's Num function works. Below is one idea, but this will still leave out many different "numeric" formats, like durations, dates, times, percentages, thousand separators and so on. Also it only removes some of the possible characters which are not related to numbers

Names Default To Here(1);

dt = Current Data Table();

//remove alphabetical characters, should also remove some special characters
regex_one = "[a-zA-Z\s]";
dt << New Column("no_alph", Character, Nominal, << Set Each Value(
	Regex(:MeasuredValues, regex_one, "", GLOBALREPLACE);	
));

regex_two = "^-?[0-9]+\.?[0-9]*$";
dt << New Column("nums", Numeric, Nominal, << Set Each Value(
	Num(Regex(:no_alph, regex_two))
));

 

 

-Jarmo

View solution in original post

10 REPLIES 10
statman
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

There are likely multiple ways to accomplish what you want and I'll bet someone (e.g., Jim) will have a script to make it more efficient, but here is a potential work flow:

BTW, your jpg's don't allow us to see what character columns you want to change look like.  If there are any characters that are not numeric, these need to be dealt with before you change data type.

1. Right click on the column and use Recode.  Remove all non-numeric values.

2. Then, right click and choose Column info. (like the jpg's). change Data Type from character to numeric and modeling type to continuous.

"All models are wrong, some are useful" G.E.P. Box
AlexR846
Level III

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Dear statman,

 

Thank you for your timely reply.

I did try your suggestion by removing non-numeric values using Recode option from the column and tried to convert the columns data type from character to Numeric. However still all the values are disappearing as before when the data type is changed to Numeric.

 

Please find the attached screenshots for the same.

In recoding option, JMP was able to differentiate Numeric values and string characters.

Will post the result of Mr. JIM's suggestion soon.

 

Thank you..

txnelson
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

JMP should convert all values that are proper representations of numeric values.  Therefore,

1.6.1, V1.6 should be converted to a missing value, since they are not valid numeric values.

However, -1.00, 5.00 and -15.000 should be converted.

 

Would you please run the following script to see if there are hidden characters in your data that might be causing an issue.

show(:MeasuredValues[6],char to hex(:MeasuredValues[6]);

The script will display the hex values for the value in the 6th row of your data table.  If the -1.00 isn't in the 6th row, then change the 6 to the row number for one of the rows where the character representation appears to be a valid numeric value, but the cell is being changed to a missing value.

It would also be very useful if you could attach a sample data table that contains just the MeasuredValues column.

Jim
AlexR846
Level III

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Dear Jim.

 

i tried to run the script for the 6th row number before converting it to Numeric, the result is

 

:MeasuredValues[6] = "-1.00";
Char To Hex(:MeasuredValues[6]) = "2D312E3030";

 

After converting the Column's data type to Numeric:

:MeasuredValues[6] = .;
Char To Hex(:MeasuredValues[6]) = "FFFFD10000000000";

 

i ran the script for the second row which is being converted to a Numeric value (changes to different value after conversion), the result is

Before conversion:

:MeasuredValues[2] = "1.6.1";
Char To Hex(:MeasuredValues[2]) = "312E362E31";

 

After conversion:

:MeasuredValues[2] = 3061584000;
Char To Hex(:MeasuredValues[2]) = "41E6CF8210000000";

 

I have attached the table with the MeasuredValues data.

Thank you !!

jthi
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Converting the table you have is "easy", but in general this sort of conversion isn't. There are so many different values you will have to take into account. Then there are is an issue/feature with how JMP's Num function works. Below is one idea, but this will still leave out many different "numeric" formats, like durations, dates, times, percentages, thousand separators and so on. Also it only removes some of the possible characters which are not related to numbers

Names Default To Here(1);

dt = Current Data Table();

//remove alphabetical characters, should also remove some special characters
regex_one = "[a-zA-Z\s]";
dt << New Column("no_alph", Character, Nominal, << Set Each Value(
	Regex(:MeasuredValues, regex_one, "", GLOBALREPLACE);	
));

regex_two = "^-?[0-9]+\.?[0-9]*$";
dt << New Column("nums", Numeric, Nominal, << Set Each Value(
	Num(Regex(:no_alph, regex_two))
));

 

 

-Jarmo
AlexR846
Level III

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Thanks jthi !!

That did the job...

txnelson
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

It appears that JMP has a bug in the Num() function.  Character strings of the pattern

     "n.n.n"

should always return a JMP missing value, however in many cases, it returns a valid numeric value

Num( "9.4.333" )

returns

-49554633600

I can not find any usage note on this issue in the JMP Documentation(a non exhaustive search).  I am putting in a bug report to the JMP support team.

Jim
AlexR846
Level III

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Thank you for your valid time in investigating this issue Jim !!

jthi
Super User

Re: How to convert Column Data type from Character into Numeric wihtout losing Numeric values in the column

Could it be that JMP is interpreting that as a date and because that date is before 01.01.1904 (which is the start time of JMP's datetime) it is negative.

Names Default To Here(1);

a = Num("9.4.333");
b = Num("31.12.1903");
c = Num("1.12.1903");

Show(a, b, c);

After this discussion in community on how Num() behaves I have used my own validators for checking numeric values, this is one example:

Names Default To Here(1);

num_validator = Function({str}, {Default Local},
	regex_validator_string = "^[-\+]{0,1}[0-9\.\,]+$";
	If(Is Missing(Regex(str, regex_validator_string)),
		retval = .; //use missing as non match due to num returning . in case of non-numeric
	, //else
		retval = Num(str)
	);
	return(retval);
);
-Jarmo