Rational and VFP Numbers

From NARS2000
Revision as of 01:34, 5 September 2011 by WikiSysop (talk | contribs)

Introduction

Rational numbers complement the 64-bit integer datatype to provide infinite precision (or WS FULL) at the cost of some performance. Similarly, Variable-precision Floating Point (VFP) numbers complement the 64-bit floating point datatype to provide more precision (user-controlled) with a similar cost in performance.

There is no separate datatype for infinite precision Integers. Instead, they are represented as a special case of rational numbers. In the discussion below, the phrase rational integer means a rational number whose denominator is one.

Rationale

The reason is simple: precision. If the 53 bits of precision in the floating point result of 2÷3 are not enough, you now have two more choices: an exact rational result, or a VFP result with as much precision as you care to specify.

Constants

Rational constants may be entered as follows:

  • 123x for the constant 123 (the suffix x is but a shorthand for r1)
  • 123r4567 for the constant 123÷4567

VFP constants may be entered as follows:

  • 123v for the constant 123
  • 123v4567 for the constant 123.4567
  • 123v4567e3 for the constant 123.4567e3

The above formats for constants (except for the suffix x) may be used in other constants such as 1r4p2 to generate a shorter and more accurate value for π²/4 than, say, ((○1)*2)÷4.
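
As a brief illustration of the rational constant forms, the following session sketch adds two rational constants and then multiplies one by an integer. It assumes default system settings; the displayed results follow the formatting rules described under Formatting below.

```apl
      1r3+1r6      ⍝ exact sum of one third and one sixth
1r2
      2r3×3        ⍝ the result is a rational integer, shown with no suffix
2
```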

Precision

Rational numbers have infinite precision. They are stored as a separate numerator and denominator, both of which are exact integers whose size (and hence precision) is limited only by the available workspace.

VFP numbers have user-controlled variable precision, and each number may have a different precision. The default precision at startup is controlled by the value of the system variable ⎕FPC. The system default is 128, measured in bits of precision of the mantissa, not counting the exponent (which is of fixed size). The current precision may be changed as needed by assigning a new value to the system variable. All newly created VFP numbers will have the new precision; the precision of VFP numbers already present in the workspace does not change.

Generally, precision is set once for a particular application and unchanged thereafter. Although not recommended, it is possible to mix VFP numbers of different precisions in a single array – presumably you really know what you are doing. The system function ⎕DR can be used to display an array's precision(s).
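
The following sketch shows one way to observe that a change to ⎕FPC affects only newly created VFP numbers. The names A and B are illustrative only. No output is shown; each ⎕DR report is expected to end with the precision in effect when the number was created (FPC64 for A and FPC128 for B).

```apl
      ⎕FPC←64
      A←3v           ⍝ created while ⎕FPC is 64
      ⎕FPC←128
      B←3v           ⍝ created while ⎕FPC is 128
      0 ⎕DR A        ⍝ A should still report its original precision
      0 ⎕DR B
```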

In the same vein, VFP numbers in function definitions can be tricky, too. For example, a cursory look at the function

    ∇ foo;⎕FPC;A
[1]   ⎕FPC←256
[2]   A←1v23456789…23456789
[3]   …
    ∇

might lead you to think that the VFP constant saved into A has a precision of 256 bits. Actually, the precision of that constant is set to the value of ⎕FPC in effect during tokenization, when the function is saved in the workspace, not when the function is executed. If you must set the precision of constants within a user-defined function/operator to a specific value, use Execute (⍎) on the character form of the VFP constant, as the following function demonstrates:

    ∇ foo;⎕FPC
[1]   ⎕FPC←2×⎕FPC
[2]   0 ⎕dr   1v234567890123456789
[3]   0 ⎕dr ⍎'1v234567890123456789'
    ∇
      foo
Variable Floating Point (3221): variable precision mantissa, 32-bit exponent -- FPC128
Variable Floating Point (3221): variable precision mantissa, 32-bit exponent -- FPC256

Datatype Propagation

Generally, the datatype of constants propagates through a calculation. For example, ⍳1000 generates the first thousand integers as an integer (actually an Arithmetic Progression Array) datatype, and ⍳1000x and ⍳1000v generate the same values as rational and VFP numbers, respectively.

Continuing the above example, *⍨⍳1000x generates the first thousand instances of N*N (N raised to the power N) as rational numbers, +/*⍨⍳1000x sums them into a single 3001-digit rational integer, and finally ¯10↑⍕+/*⍨⍳1000x extracts the low-order ten digits – 9110846700 (ProjectEuler.net #48) – all in a small number of milliseconds.

Note how the obvious expression ¯10↑⍕+/*⍨⍳1000 at first seems to solve the problem except that it quickly runs afoul of the limited precision of 64-bit integer and floating point numbers. Suffixing the constant 1000 with an x converts it to a rational number which then propagates through the calculation with infinite precision to yield the correct result.
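
A smaller sketch of the same propagation, assuming the default index origin (⎕IO←1): the x suffix on the argument to ⍳ makes every intermediate value rational, so both the sum and the product are computed exactly.

```apl
      +/⍳10x        ⍝ sum of the rational integers 1 through 10
55
      ×/⍳25x        ⍝ product of 1 through 25 (25 factorial), computed exactly
15511210043330985984000000
```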

Formatting

Rational integers are formatted as an integer with no special adornment; rational non-integers are formatted as a numerator and denominator separated by an r as in 34r9. As with the integer datatype, the numerator and denominator of a rational number are formatted exactly, unaffected by ⎕PP.


      2*100
1.2676506002282294E30
      2*100x
1267650600228229401496703205376

VFP numbers are formatted as decimal numbers to the precision inherent in the number or ⎕PP, whichever is smaller. For example,

      ⎕PP←100
      ⎕FPC←64
      ○1r2
1.57079632679489661926
      ⎕FPC←128
      ○1r2
1.570796326794896619231321691639751442098

where both of the above displays were limited by the precision of the number, not ⎕PP.

However, the following displays are limited by ⎕PP:

      ⎕FPC←128
      ⎕PP←20
      !40v
815915283247897734350000000000000000000000000000
      ⎕PP←80
      !40v
815915283247897734345611269596115894272000000000

In other words, you might be fooled into thinking that a VFP number has more trailing zeros than it actually does simply because ⎕PP is too small.

► In general, the VFP datatype is better suited for representing fractional numbers than large integers.

Datatype Promotion

For the most part, rational numbers beget rational numbers and VFP numbers beget VFP numbers. However, when irrational, transcendental, and certain other functions are used, rational numbers beget VFP numbers. For example,

      ○1
3.1415926535897931
      ○1x
3.141592653589793238462643383279502884195

where the datatypes of the two results are floating point and VFP, respectively. That is, in a manner similar to how some primitive functions with integer arguments may return floating point results, when a rational number is used as an argument to a primitive function that can't return a result with infinite precision, it is promoted to VFP.

The reason irrational, transcendental, and certain other functions on rational numbers don't return rational numbers is that, by definition, the result of such a function is, in general, not representable as a rational number; instead, VFP numbers are better suited to represent irrational results where the end user may control exactly how much precision is desired.

Two special functions are the prime decomposition (πR)/number theoretic (LπR) functions. In order to return accurate results over a wide range of precision, all numeric arguments are converted to rational integers and the result is returned as a rational integer.

The list of functions that produce VFP numbers given rational numbers is as follows:

  • Exponentiation: *R and L*R (except when L is rational and R is a 32-bit integer, in which case the result is a rational integer)
  • Logarithm: ⍟R and L⍟R
  • Pi Times and Circle functions: ○R and L○R
  • Root: √R and L√R
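
To check which datatype a particular expression produces, ⎕DR can be applied to the result, as in the earlier ⎕FPC example. The sketch below is illustrative only and shows no output; the comments state what the list above leads one to expect.

```apl
      0 ⎕DR 2x*10     ⍝ rational base, small integer exponent: expected to stay rational
      0 ⎕DR 2x*1r2    ⍝ fractional exponent: expected to be promoted to VFP
      0 ⎕DR ⍟2x       ⍝ logarithm of a rational: expected to be promoted to VFP
```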

The list of functions that don't produce a rational or VFP result given those argument(s) is as follows:

  • Depth: ≡ (Integer)
  • Dyadic Comparison: =≠<≤≥>≡≢ (Boolean)
  • Nand and Nor: ⍲⍱ (Boolean)
  • Grade Up/Down: ⍋⍒ (Integer)
  • Index Of: ⍳ (Integer)
  • Member Of: ∊ (Boolean)
  • Find: ⍷ (Boolean)
  • Subset and Superset: ⊆⊇ (Boolean)
  • Prime Decomposition and Number Theoretic: π (Rational)

Otherwise, rational argument(s) produce rational result(s) and VFP argument(s) produce VFP result(s).

Datatype Demotion

It is common in APL implementations to demote datatypes where appropriate. For example, the constant 1.0 might actually be represented as an integer or even Boolean datatype. The idea is there is no loss of precision and the storage is typically smaller, so why not?

With rational and VFP numbers those reasons no longer apply. While the constant 1x might have the same precision as the constant 1.0, the difference in latent precision between the two is vast. In fact, in order for datatype propagation of rational and VFP numbers to work at all, we must be careful not to demote them automatically to a smaller datatype. Otherwise, it would require an intolerable degree of analysis on the part of the programmer to ensure that the desired datatype (rational or VFP) remains in effect throughout a calculation.

Conversions

To convert manually from one datatype to another,

  • Integer to rational, add 0x
  • Integer, floating point, or rational number to VFP, add 0v
  • Rational integer to a 64-bit integer (possibly losing some precision), use ⍎⍕
  • Rational non-integer to a floating point number (possibly losing some precision), use ⍎⍕0v+
  • VFP number to a floating point number (possibly losing some precision), use ⍎⍕
  • VFP number to rational — at the moment, no APL method exists although the underlying libraries do have such a function
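
The conversions listed above can be sketched as follows. The variable names are illustrative only, and no output is shown because the displayed digits depend on ⎕PP and ⎕FPC.

```apl
      R←7+0x         ⍝ integer 7 to rational
      V←1r3+0v       ⍝ rational 1r3 to VFP
      I←⍎⍕R          ⍝ rational integer back to a 64-bit integer
      F←⍎⍕V          ⍝ VFP back to floating point; digits kept depend on ⎕PP
```

Because ⍎⍕ goes through the character representation, the round trip can keep no more digits than ⍕ displays.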

Comparisons

Comparisons between two rational numbers, or between a rational number and any integer, are exact, just as they are between integers.

Comparisons between a VFP number and any other number are sensitive to ⎕CT, just as they are between floats.

Integer Tolerance

Both rational and VFP numbers may be used where the system ordinarily requires an integer (such as axis coordinates, indexing, left argument to structural primitives, etc.) just as the system tolerates floating point numbers in those contexts if they are sufficiently near an integer. In all cases, the system attempts to convert the non-integer to an integer using the fixed system comparison tolerance (at the moment, 3E¯15).

Infinities

Support for ±∞ has been extended to rational and VFP numbers in the same manner as it applies to 64-bit integers and 64-bit floats. That is, the same cases covered by the system variable ⎕IC (Indeterminate Control) also apply to infinite rational and VFP numbers. Moreover, infinite numeric constants may be entered, for example, as

  • ∞x
  • ∞r1
  • ∞v
  • ∞v0

Also, constants such as 2r∞ resolve to 0x.

New And/Or Different Behavior

  • Both roll (?R) and deal (L?R) on rational integers use a built-in random number generator so as to use the entire range of rational integers – this algorithm uses its own internal seeds that are much more complicated than the simple integer seed that is ⎕RL.

    For example, if you need really large random numbers

          ?10*60x
    370857192605742854709703007683731949504799559659692534573173

  • At the moment, matrix inverse (⌹R) and matrix division (L⌹R) on rational or VFP arguments each have two requirements beyond normal conformability:

    • a square right argument must be non-singular, and

    • for an overdetermined (>/⍴R) right argument, the symmetric matrix (⍉R)+.×R must be non-singular.

    These limitations are due to the FLINT library used to implement Matrix Inverse/Divide on rational and VFP numbers.

    Integer and floating point arguments are not subject to these limitations because they use a more general algorithm (Singular Value Decomposition) that produces a unique result even for singular arguments (e.g., ⌹5 3⍴0).
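
A minimal sketch of matrix inverse on a rational argument that satisfies the first requirement (square and non-singular). The name M is illustrative, and the matrix is converted to rational by adding 0x as described under Conversions. No output is shown.

```apl
      M←0x+2 2⍴2 1 1 1   ⍝ square, non-singular matrix converted to rational
      ⌹M                 ⍝ inverse computed on rational numbers
      M+.×⌹M             ⍝ expected to reproduce the identity matrix exactly
```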

Acknowledgments

The designers of J are thanked for having the foresight to include rational numbers as a separate datatype.

The following GPL libraries have been used to provide support for these datatypes:

  • MPIR (Multiple Precision Integers and Rationals) at mpir.org.
  • MPFR (Multiple Precision Floating-Point Reliable Library) at mpfr.org.
  • FLINT (Fast Library for Number Theory) at flintlib.org.