BookmarkSubscribe
Choose Language Hide Translation Bar
ms
Super User ms
Super User

Re: Why can't I select certain numerical rows?

From the example below it appears that precision is improved if letting the software calculate the decimals, in comparison to defining them explicitly.

Clear Globals();

IntegerBased = Repeat( Index( 0, 60 ) / 10, 5 );

DecimalBased = Repeat( Index( 0, 6, 0.1 ), 5 );

aggTable = New Table( "B",

  New Column( "I", numeric, continuous, values( IntegerBased ) ),

  New Column( "I2", numeric, continuous, values( DecimalBased ) )

);

aggTable << Select Where( :I == 1.0 ); // Works!

aggTable << Select Where( :I2 == 1.0 ); // Works not!

0 Kudos

Re: Why can't I select certain numerical rows?

Do you really mean that changing between `numeric` and `fixed decimal` doesn't change the way it's stored, only displayed? 

Or maybe you mean that changing between `fixed decimal(x,y)` and `fixed decimal(p,q)` doesn't change the way it's stored?

Is such unique behavior documented anywhere by JMP?

0 Kudos
ms
Super User ms
Super User

Re: Why can't I select certain numerical rows?

The data type is "numeric" whatever format you apply. The format, e.g. fixed decimal, is not supposed to change the data. The default "Best" format suggest a presentation format depending on the data, for example trailing zeroes or very small fractions are hidden from the user.

It is not unique for JMP to keep data and formats separated (or suffer from the shortcomings of binary representation of decimals). Below's some examples from R:

a<-0.1; a; formatC(a, digits=25 )

[1] 0.1

[1] "0.1000000000000000055511151"

If you sum such imprecise decimals, like in your for-loop, the error may increase to the point were 1 does not equal 1. The software finds the discrepancy is small enough to not show it to the user but still too large to be ignored in comparisons:

> one=0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1; one; one == 1;

[1] 1

[1] FALSE

Compare with the below where the unprecise decimal is used only once in the calculation:

> one=10*0.1; one; one == 1;

[1] 1

[1] TRUE

>

0 Kudos

Re: Why can't I select certain numerical rows?

Right, but there is no labeling of anything as "numeric" and "fixed" there.  I'm not aware of any examples from any other languages, DBs, etc. where use of the keywords in a script does not imply what is going on in the computer's memory (or perhaps effectively going on).  And I don't think R even has a fixed precision "type" out of the box.  In JMP, apparently "fixed precision" is something that only exists on the screen but doesn't matter in a script.  That seems fairly unhelpful to me and a rather severe violation of the principle of least surprise.

See for example this from SQL Server:

http://msdn.microsoft.com/en-us/library/ms187746.aspx

Or this from Ruby:

Class: Float (Ruby 1.9.3)

Class: BigDecimal (Ruby 1.9.3)

Or this from Mathematica:

https://reference.wolfram.com/mathematica/tutorial/ArbitraryPrecisionNumbers.html

0 Kudos
pmroz
Super User

Re: Why can't I select certain numerical rows?

Curiously MS's code didn't work on my PC (Win7 64-bit running JMP 10.0.1).

If I'm comparing numbers I always use an epsilon parameter to account for round-off error:

Clear Globals();

IntegerBased = Repeat( Index( 0, 60 ) / 10, 5 );

DecimalBased = Repeat( Index( 0, 6, 0.1 ), 5 );

aggTable = New Table( "B",

    New Column( "I", numeric, continuous, values( IntegerBased ) ),

    New Column( "I2", numeric, continuous, values( DecimalBased ) )

);

aggTable << Select Where( I == 1.0 ); // Works! (not for me)

aggTable << Select Where( I2 == 1.0 ); // Works not!

epsilon = .0000001// Adjust for the magnitude of your numbers

aggtable << select where( Abs( :I - 1.0 ) < epsilon )// Works!

0 Kudos
ms
Super User ms
Super User

Re: Why can't I select certain numerical rows?

Strange. Try with a scoping colon before the column names.

Heres the 25 digit versions of 1.0 om my machine (JMP 11, Mac 64 bit):

II2
1,00000000000000000000000000,9999999999999998889776975
0 Kudos
pmroz
Super User

Re: Why can't I select certain numerical rows?

Here's what I get on mine:

II2
1.00000000000000000000000000.9999999999999998900000000

I set the format to Fixed Dec, Width 27, Dec 25.

Interesting though - when I try your code in JMP 11.1 the part that's supposed to work does work!

0 Kudos
XanGregg
Staff

Re: Why can't I select certain numerical rows?

The first Ruby link you mention is most relevant here. JMP, like most every other piece of software, uses the CPU's native base 2 floating point representation of numbers. As the Ruby link begins:

Float objects represent inexact real numbers using the native architecture's double-precision floating point representation.

Floating point has a different arithmetic and is a inexact number. So you should know its esoteric system.

The first reference it provides is a classic, What Every Computer Scientist Should Know About Floating-Point Arithmetic.

Getting back to JMP, you can see

0.1 + 0.1 + 0.1 == 0.3;// false
3/10 == 0.3;// true

Which mirrors MS's results showing straight division being more accurate than repeated addition. However, neither method is perfectly accurate because 0.3 (and 0.1) cannot be represented perfectly in base 2, no matter how many digits you. There are systems, especially those targeting finance, which store numbers in base 10, but they sacrifice speed for all numbers and precision for non-base 10 numbers.

So what are you to do?

If you're expecting base 10 numbers, you can use Round() with some agreed precision,

    Round(0.1 + 0.1 + 0.1, 5) == Round(0.3, 5); // true (to 5 digits)

The general solution is to use some sort of "fuzzy" comparison function which compares the relative difference against some semi-arbitrary small epsilon value,

fuzzy equals = Function({a, b},

  { diff = Abs(a - b), mag = Max(Abs(a), Abs(b))},

  If (diff == mag,

  diff < 1e-12, // comparing to 0

  diff / mag < 1e-10 // relative compare

  )

);

fuzzy equals(0.1 + 0.1 + 0.1, 0.3); // true

BTW, you can also see the same thing in JavaScript by entering

    javascript:alert(0.1 + 0.1 + 0.1 == 0.3)

in your browser's address bar.

0 Kudos

Re: Why can't I select certain numerical rows?

Yes, but in Ruby, this gives the expected result:


require 'bigdecimal'



a = BigDecimal.new("0.3")


b = BigDecimal.new("0.3")



p (a + b == BigDecimal.new("0.6")) ? "exact" : "not exact"


Whereas in JMP, this does not:


Current Data Table() << New Column( "a",


  Format( "Fixed Dec", 5, 2 ),


  Continuous,


  Set Values([0.3])


  );



Current Data Table() << New Column( "b",


  Format( "Fixed Dec", 5, 2 ),


  Continuous,


  Set Values([0.3])


  );



Current Data Table() << New Column( "c",


  Format( "Fixed Dec", 5, 2 ),


  Continuous,


  Formula( :a + :b ),


  );



Select Where( :c == 0.6 )


Maybe there is some way to cast the constant in the comparison to fixed decimal and that will fix it?  My point is that "Fixed Dec" appears to be just an output format specifier, not a data type.  I don't see this documented anywhere.  And it surely is not what most users are expecting when they use "Fixed Dec".

0 Kudos
Jeff_Perkinson
Community Manager Community Manager

Re: Why can't I select certain numerical rows?

Hi Jordan,

I'm sorry for the confusion.

Format is, indeed, just that: an output format for displaying numbers.

We use this term historically as SAS does and it is different from a data type.

JMP has only a few standard(*) Data Types as seen in the Column Info window which determine how values are stored:

  • Numeric
  • Character
  • Row State

The Format option in the Column Info window specifies how values are displayed. For numeric data there are a large number of possible formats, including currency formats and datetime formats.

Clicking the Help button in the Column Info window will display the documentation for the dialog where the Format option is documented.

As you have discovered, using JSL to write JMP programs can rely on an understanding of how JMP works interactively. In this case, Format is shown side-by-side with Data Type in the Column Info window which can help explain the difference when using them in JSL.

In your New Column() you could also define the data type, which would, perhaps, alleviate some of the confusion.


Current Data Table() << New Column( "a",


  Numeric,


  Format( "Fixed Dec", 5, 2 ),


  Continuous,


  Set Values([0.3])


);



I hope this helps.

-Jeff

(*)There are three more optional data types for storing integer values efficiently in 1, 2 or 4 bytes. These can be enabled in File > Preferences and click Tables.

-Jeff
0 Kudos