Datatypes and Storage
Datatypes
Immediate Data
Data that is a simple scalar (Character, Boolean, Integer, or Floating Point) is stored directly in the Symbol Table. It is represented there by the following structures, which may be found in symtab.h:
// Symbol table flags
typedef struct tagSTFLAGS
{
    UINT Imm:1,             // 00000001:  The data in .stData is Immediate simple numeric or character scalar
         ImmType:5,         // 0000003E:  ... Immediate Boolean, Integer, Character, or Float (see IMM_TYPES)
         Inuse:1,           // 00000040:  Inuse entry
         Value:1,           // 00000080:  Entry has a value
         ObjName:3,         // 00000700:  The data in .stData is NULL if .stNameType is NAMETYPE_UNK; value, address, or HGLOBAL otherwise
                            //            (see OBJ_NAMES)
         stNameType:4,      // 00007800:  The data in .stdata is value (if .Imm), address (if .FcnDir), or HGLOBAL (otherwise)
                            //            (see NAME_TYPES)
         SysVarValid:5,     // 000F8000:  Index to validation routine for System Vars (see SYS_VARS)
         UsrDfn:1,          // 00100000:  User-defined function/operator
         DfnLabel:1,        // 00200000:  User-defined function/operator label        (valid only if .Value is set)
         DfnSysLabel:1,     // 00400000:  User-defined function/operator system label (valid only if .Value is set)
         DfnAxis:1,         // 00800000:  User-defined function/operator accepts axis value
         FcnDir:1,          // 01000000:  Direct function/operator (stNameFcn is valid)
         StdSysName:1,      // 02000000:  Is a standard System Name
         bIsAlpha:1,        // 04000000:  Is Alpha
         bIsOmega:1,        // 08000000:  Is Omega
         :4;                // F0000000:  Available bits
} STFLAGS, *LPSTFLAGS;

// N.B.:  Whenever changing the above struct (STFLAGS),
//   be sure to make a corresponding change to
//   <astFlagNames> in <dispdbg.c>.

// .Inuse and .PrinHash are valid for all entries.
// .Inuse = 0 implies that all but .PrinHash are zero.
// .Imm implies one and only one of the IMMTYPE_***s
// .Imm = 1 implies that one and only one of aplBoolean, aplInteger, aplChar, or aplFloat is valid.
// .Imm = 0 implies that stGlbData is valid.
// .Value is valid for NAMETYPE_VAR only, however .stNameType EQ NAMETYPE_VAR
//    should never be without a value.
// .UsrDfn is set when the function is user-defined.
// .FcnDir may be set for any function/operator; it is a
//    direct pointer to the code.
// htGlbName in HSHENTRY is set when .Imm and .FcnDir are clear.
// Immediate data or a handle to global data
typedef union tagSYMTAB_DATA
{
    APLBOOL    stBoolean;       // 00:  A number (Boolean)
    APLINT     stInteger;       // 00:  A number (Integer)
    APLFLOAT   stFloat;         // 00:  A floating point number
    APLCHAR    stChar;          // 00:  A character
    HGLOBAL    stGlbData;       // 00:  Handle of the entry's data
    LPVOID     stVoid;          // 00:  An arbitrary ptr
    LPPRIMFNS  stNameFcn;       // 00:  Ptr to a named function
    APLLONGEST stLongest;       // 00:  Longest datatype (so we can copy the entire data)
                                // 08:  Length
} SYMTAB_DATA, *LPSYMTAB_DATA;

#define SYM_HEADER_SIGNATURE    'EMYS'

// Symbol table entry
typedef struct tagSYMENTRY
{
    STFLAGS          stFlags;       // 00:  Flags
    SYMTAB_DATA      stData;        // 04:  For immediates, the data value;
                                    //        for others, the HGLOBAL (8 bytes)
    LPHSHENTRY       stHshEntry;    // 0C:  Ptr to the matching HSHENTRY
    struct tagSYMENTRY
                    *stPrvEntry,    // 10:  Ptr to previous (shadowed) STE (NULL = none)
                    *stSymLink;     // 14:  Ptr to next entry in linked list of
                                    //        similarly grouped entries (NULL = none)
    UINT             stSILevel;     // 18:  State Indicator Level for this STE
    HEADER_SIGNATURE Sig;           // 1C:  STE header signature
                                    // 20:  Length
} SYMENTRY, *LPSYMENTRY;
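The mask values in the STFLAGS comments show how the immediate flag and the immediate type can be read from a raw 32-bit flags word. The following is a minimal sketch, not code from the interpreter: the macro and function names are hypothetical, and it assumes the compiler allocates bit-fields starting from the low-order bit, which is what the mask comments imply.

```c
#include <stdint.h>

/* Masks copied from the STFLAGS comments; the macro names are hypothetical. */
#define STF_IMM_MASK       0x00000001u   /* .Imm     */
#define STF_IMMTYPE_MASK   0x0000003Eu   /* .ImmType */
#define STF_IMMTYPE_SHIFT  1             /* .ImmType starts at bit 1 */

/* Return nonzero if the flags word marks the entry as an immediate value. */
static int stf_is_immediate(uint32_t flags)
{
    return (flags & STF_IMM_MASK) != 0;
}

/* Extract the 5-bit immediate type field (a member of IMM_TYPES). */
static unsigned stf_imm_type(uint32_t flags)
{
    return (flags & STF_IMMTYPE_MASK) >> STF_IMMTYPE_SHIFT;
}
```

When stf_is_immediate returns zero, the comments above state that stGlbData is the valid member of the SYMTAB_DATA union.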
Global Data
Data of each global datatype is stored in a block of global memory allocated by MyGlobalAlloc (GPTR | GHND, # Bytes), which returns a global memory handle (or NULL if an error occurs). The # Bytes is calculated by the function CalcArraySize (ARRAY_TYPE, APLNELM, APLRANK). The array type is a member of the ARRAY_TYPES enum, which is defined in datatype.h as follows:
// Array types -- used to identify array storage type in memory
typedef enum tagARRAY_TYPES
{
    ARRAY_BOOL = 0  ,           // 00:  Boolean
    ARRAY_INT       ,           // 01:  Integer
    ARRAY_FLOAT     ,           // 02:  Floating point
    ARRAY_CHAR      ,           // 03:  Character
    ARRAY_HETERO    ,           // 04:  Simple heterogeneous (mixed numeric and character scalars)
    ARRAY_NESTED    ,           // 05:  Nested
    ARRAY_LIST      ,           // 06:  List
    ARRAY_APA       ,           // 07:  Arithmetic Progression Array
    ARRAY_RAT       ,           // 08:  Multiprecision Rational Number
    ARRAY_VFP       ,           // 09:  Variable-precision Float
    ARRAY_HC2I      ,           // 0A:  Complex    INT coefficients
    ARRAY_HC2F      ,           // 0B:  ...        FLT ...
    ARRAY_HC2R      ,           // 0C:  ...        RAT ...
    ARRAY_HC2V      ,           // 0D:  ...        VFP ...
    ARRAY_HC4I      ,           // 0E:  Quaternion INT coefficients
    ARRAY_HC4F      ,           // 0F:  ...        FLT ...
    ARRAY_HC4R      ,           // 10:  ...        RAT ...
    ARRAY_HC4V      ,           // 11:  ...        VFP ...
    ARRAY_HC8I      ,           // 12:  Octonion   INT coefficients
    ARRAY_HC8F      ,           // 13:  ...        FLT ...
    ARRAY_HC8R      ,           // 14:  ...        RAT ...
    ARRAY_HC8V      ,           // 15:  ...        VFP ...

    ARRAY_LENGTH    ,           // 16:  # elements in this enum
                                //      *MUST* be the last non-error entry
                                // 17-1F:  Available entries (5 bits)
    ARRAY_INIT  = ARRAY_LENGTH  ,
    ARRAY_ERROR = (APLSTYPE) -1 ,
    ARRAY_NONCE = (APLSTYPE) -2 ,
    ARRAY_REALONLY = (APLSTYPE) -3 ,

    ARRAY_HC1I = ARRAY_INT      ,   // To simplify common macros
    ARRAY_HC1F = ARRAY_FLOAT    ,   // ...
    ARRAY_HC1R = ARRAY_RAT      ,   // ...
    ARRAY_HC1V = ARRAY_VFP      ,   // ...
} ARRAY_TYPES;
The APLNELM typedef is defined in types.h as follows (where ULONGLONG is defined as an unsigned 64-bit integer):
typedef ULONGLONG APLNELM; // The type of the # elements in an array
The APLRANK typedef is defined in types.h as follows:
typedef ULONGLONG APLRANK; // The type of the rank element in an array
Headers
Each global array is preceded by a header, which is defined in datatype.h as follows:
typedef struct tagHEADER_SIGNATURE
{
    UINT             nature;            // 00:  Array header signature (common to all types of arrays)
                                        // 04:  Length
} HEADER_SIGNATURE, *LPHEADER_SIGNATURE;

// Variable array header
#define VARARRAY_HEADER_SIGNATURE   'SRAV'

typedef struct tagVARARRAY_HEADER
{
    HEADER_SIGNATURE Sig;               // 00:  Array header signature
    UINT             ArrType:5,         // 04:  0000001F:  The type of the array (see ARRAY_TYPES)
                     PermNdx:5,         //      000003E0:  Permanent array index (e.g., PERMNDX_ZILDE for ⍬)
                     SysVar:1,          //      00000400:  Izit for a Sysvar (***DEBUG*** only)?
                     PV0:1,             //      00000800:  Permutation Vector in origin-0
                     PV1:1,             //      00001000:  ...                          1
                     bSelSpec:1,        //      00002000:  Select Specification array
                     All2s:1,           //      00004000:  Values are all 2s
#ifdef DEBUG
                     bMFOvar:1,         //      00008000:  Magic Function/Operator var -- do not display
                     :16;               //      FFFF0000:  Available bits
#else
                     :17;               //      FFFF8000:  Available bits
#endif
    UINT             RefCnt;            // 08:  Reference count
    APLNELM          NELM;              // 0C:  # elements in the array (8 bytes)
    APLRANK          Rank;              // 14:  The rank of the array (8 bytes)
                                        //      followed by the dimensions
                                        // 1C:  Length
} VARARRAY_HEADER, *LPVARARRAY_HEADER;
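Given this layout, the storage needed for a simple array can be estimated as the header, plus one entry per dimension, plus the data. The sketch below is an illustration only and is not the actual CalcArraySize: the function and macro names are hypothetical, each dimension is assumed to occupy 8 bytes (the listing above does not state the size of a dimension), and overflow and any alignment padding are ignored.

```c
#include <stdint.h>

#define HEADER_BYTES 0x1Cu  /* "1C: Length" from the VARARRAY_HEADER comments  */
#define DIM_BYTES    8u     /* assumption: each dimension is a 64-bit value    */

/* Estimate the bytes needed for an array of `nelm` elements of
   `elem_bytes` bytes each, with `rank` dimensions after the header. */
static uint64_t sketch_array_bytes(uint64_t elem_bytes, uint64_t nelm, uint64_t rank)
{
    return HEADER_BYTES + rank * DIM_BYTES + nelm * elem_bytes;
}
```

Under these assumptions a 2-by-3 matrix of 8-byte integers needs 28 + 16 + 48 = 92 bytes, and an integer scalar (rank 0) needs 28 + 0 + 8 = 36 bytes.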
Characters
There is only one character type (ARRAY_CHAR), which is stored as 16-bit WORDs in UCS-2 format. This format is a subset of UTF-16LE in that it does not attempt to handle characters beyond the BMP (Basic Multilingual Plane), that is, beyond 16 bits.
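To make the BMP restriction concrete, the following sketch tests whether a Unicode code point can be held in a single 16-bit unit. The function name is hypothetical, and the rejection of the surrogate range reflects the Unicode definition of UTF-16 rather than any documented behavior of the interpreter.

```c
#include <stdint.h>

/* Return nonzero if the code point fits in one 16-bit UCS-2 unit. */
static int fits_in_ucs2(uint32_t code_point)
{
    /* Beyond the BMP: UTF-16 would need a surrogate pair. */
    if (code_point > 0xFFFFu)
        return 0;
    /* D800..DFFF is reserved for surrogates and is not a character by itself. */
    if (code_point >= 0xD800u && code_point <= 0xDFFFu)
        return 0;
    return 1;
}
```

For example, U+2374 (⍴) fits in one unit, whereas U+1F600 lies outside the BMP and does not.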
Numbers
There are numerous numeric datatypes from Boolean to Octonions. All of the numeric datatypes in this section use the common header above. The data portion of the array immediately follows the header.
Booleans are stored in the usual way, one element per bit, in Little-Endian format.
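A bit-packed Boolean vector can be read and written with shifts and masks. This is a minimal sketch with hypothetical helper names; it assumes that element i lives in byte i / 8 at bit position i % 8, counted from the low-order bit, which is one common reading of "Little-Endian" for bit arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes needed to hold `nelm` one-bit elements. */
static size_t bool_bytes(size_t nelm)
{
    return (nelm + 7) / 8;
}

/* Read element i (0 or 1); assumes low-order-bit-first packing. */
static unsigned bool_get(const uint8_t *data, size_t i)
{
    return (data[i >> 3] >> (i & 7)) & 1u;
}

/* Write element i; any nonzero `value` stores a 1. */
static void bool_set(uint8_t *data, size_t i, unsigned value)
{
    uint8_t mask = (uint8_t)(1u << (i & 7));
    if (value)
        data[i >> 3] |= mask;
    else
        data[i >> 3] &= (uint8_t)~mask;
}
```

With this packing, a 9-element Boolean vector occupies 2 bytes of data, and element 9 of a longer vector is bit 1 of the second byte.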
The rest of the numeric datatypes in this section can be described by the "outer product" of their dimension (1, 2, 4, 8) and their Basic Type: 8-byte Integer, 8-byte Floating Point, __mpq_struct (24- or 32-byte) Multiple-precision Integer/Rational, and __mpfr_struct (32- or 40-byte) Multiple-precision Floating Point. Because a Multiple-precision number contains a pointer to its data, its byte count depends upon the size of a pointer (32- or 64-bit). The dimensions (1, 2, 4, 8) correspond to the Real, Complex, Quaternion, and Octonion numbers, respectively. A scalar number in a specific dimension has as many coefficients (each of the Basic Type) as the dimension. The Multiple-precision types __mpq_struct and __mpfr_struct are defined in mpir.h and mpfr.h, respectively. An __mpq_struct consists of two __mpz_structs, one for the numerator and one for the denominator, where __mpz_struct represents a Multiple-precision Integer and is also defined in mpir.h.
For example, the data portion of
- A Real Integer array has one 64-bit integer for each element.
- A Complex Multiple-precision Integer/Rational array has two __mpq_struct Basic Types per element.
- A Quaternion Multiple-precision Floating Point array has four __mpfr_struct Basic Types per element.
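The element sizes implied by this scheme can be tabulated by multiplying the dimension by the size of one coefficient. The sketch below uses the byte counts quoted above; the enum and function names are hypothetical and do not appear in the interpreter's headers.

```c
#include <stddef.h>

/* Hypothetical names for the four Basic Types described above. */
typedef enum { BASIC_INT, BASIC_FLT, BASIC_RAT, BASIC_VFP } basic_type;

/* Size of one coefficient, using the byte counts quoted in the text. */
static size_t basic_bytes(basic_type t, int is_64bit)
{
    switch (t) {
    case BASIC_INT: return 8;                    /* 8-byte Integer        */
    case BASIC_FLT: return 8;                    /* 8-byte Floating Point */
    case BASIC_RAT: return is_64bit ? 32 : 24;   /* __mpq_struct          */
    case BASIC_VFP: return is_64bit ? 40 : 32;   /* __mpfr_struct         */
    }
    return 0;
}

/* Bytes per array element: dimension (1, 2, 4, or 8) times coefficient size. */
static size_t element_bytes(unsigned dimension, basic_type t, int is_64bit)
{
    return (size_t)dimension * basic_bytes(t, is_64bit);
}
```

For instance, a Complex Rational element takes 2 × 24 = 48 bytes in a 32-bit build and 2 × 32 = 64 bytes in a 64-bit build, not counting the separately allocated digits that each Multiple-precision struct points to.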
Nested Arrays
Nested Arrays (ARRAY_NESTED) use the common header above. The data portion of a Nested Array consists of a series of pointers, each 32 or 64 bits wide depending upon the ABI (Application Binary Interface) of the program. As each pointer refers to an address on at least a 32-bit boundary, the low-order bit (normally 0) is used to distinguish STEs (Symbol Table Entries) from Global pointers: in an STE pointer the low-order bit is 0, and in a Global pointer it is 1. An STE pointer is then an index into the current Symbol Table, and a Global pointer (with the low-order bit cleared) is a global memory handle which may be locked and unlocked using MyGlobalLock and MyGlobalUnlock.
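The low-order tag bit can be tested and cleared with simple mask operations. The following sketch uses hypothetical names throughout and models the stored item as a uintptr_t; it is not the interpreter's own set of macros.

```c
#include <stdint.h>

#define PTR_TAG_MASK ((uintptr_t)1)   /* hypothetical name for the low-order tag bit */

/* Nonzero if the item is a Global pointer (tag bit 1); zero for an STE (tag bit 0). */
static int ptr_is_global(uintptr_t tagged)
{
    return (tagged & PTR_TAG_MASK) != 0;
}

/* Mark an aligned global memory handle as a Global pointer. */
static uintptr_t ptr_make_global(uintptr_t handle)
{
    return handle | PTR_TAG_MASK;
}

/* Clear the tag bit to recover the value that may be locked. */
static uintptr_t ptr_clear_tag(uintptr_t tagged)
{
    return tagged & ~PTR_TAG_MASK;
}
```

Code walking a Nested Array would test each item with ptr_is_global and, for Global pointers, pass the result of ptr_clear_tag to MyGlobalLock.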
Heterogeneous Arrays
Heterogeneous arrays (ARRAY_HETERO) are a subset of Nested Arrays in that all of the pointers are to Symbol Table Entries; that is, the low-order bit of every pointer is 0.
Arithmetic Progression Arrays
APAs (ARRAY_APA) are a superset of APVs (Arithmetic Progression Vectors) in that they may be of any Shape and Rank. The header portion is as above. The data portion is defined in datatype.h as follows:
// Define APA structure
typedef struct tagAPLAPA        // Offset + Multiplier × ⍳ NELM (origin-0)
{
    APLINT  Off,                // 00:  Offset
            Mul;                // 08:  Multiplier
                                // 10:  Length
} APLAPA, * LPAPLAPA;
The Multiplier may be 0, as happens, for example, when the Reshape function is applied to a simple integer scalar.
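Because an APA stores only an Offset and a Multiplier, each element is computed on demand from its origin-0 index as Offset + Multiplier × index. A minimal sketch, assuming APLINT is a 64-bit signed integer and using a hypothetical stand-in struct and function name:

```c
#include <stdint.h>

/* Stand-in for APLAPA; assumes APLINT is a 64-bit signed integer. */
typedef struct {
    int64_t Off;   /* Offset     */
    int64_t Mul;   /* Multiplier */
} apa_sketch;

/* Value of the element at origin-0 position `index` in ravel order. */
static int64_t apa_element(const apa_sketch *apa, uint64_t index)
{
    return apa->Off + apa->Mul * (int64_t)index;
}
```

With Off = 10 and Mul = 3 the elements are 10, 13, 16, and so on; with Mul = 0 every element equals Off, which matches the Reshape case described above.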
Type Demotion
All arrays are subject to Type Demotion: a pointer to a token containing the array is passed to the TypeDemote (LPTOKEN, UBOOL) function, which then changes the Token. Tokens are defined in tokens.h.