7.2 Field Declarations

Fields of a sequential record (record field declarations) are declared as follows:

<field_type> <field_name> : <field options> ;

Primitive Field Types

The following primitive types are supported:

Primitive Field Type | Description

ascii

ASCII encoded string. This type may also be used for other types of string encodings with the character_encoding option. The default encoding used is ISO8859-1.

asn_length

Special type used to encode a BER encoded size specification. It decodes to the BER length specification as well as the length of the length specification itself. The type makes it possible to decode some special cases of BER encoded data without using ASN.1 format specifications. The special option content_only could also be used to only get the BER length specification.

bcd

Array of digits encoded in BCD. Nibble order can be specified as bcd(msn_fd) or bcd(lsn_fd). The msn_fd means that the most significant nibble is the first digit, while lsn_fd uses the least significant nibble as first digit. The msn_fd is default.

bytearray

Byte array.

ebcdic

A special case of ascii type, equivalent to ascii: character_encoding("Cp1047"). All options available for ascii fields are also available for ebcdic fields.

float types (float, double)

Binary encoded float value. This type supports IEEE754 standard 32-bit and 64-bit data encodings. The only difference between float and double is that the field is automatically mapped to its corresponding internal type.

integer types (byte, short, int, long, bigint)

Binary coded integer value. By default, the first byte is the most significant (that is, big endian order). The field can be used with the field options unsigned or signed to specify whether the data is decoded as unsigned (which is the default) or as two's complement signed. The byte order can also be explicitly specified by, for instance, using int(little_endian) or int(big_endian) as the type name.

The only difference between the types, when using an automatic in_map, is that the field is automatically mapped to its corresponding internal type.

Note!

Since the integer types are handled internally as fixed-length signed integers (except for bigint), there can be overflows in both decoding and encoding. If this occurs the integer values are truncated.

msp_length

Special type used to decode the length field in a Siemens MSP billing event. The length field specifies the event length excluding the length field itself.

external MSP_BILLING_EVENT 
 { 
 msp_length l : external_only; 
 ascii v : dynamic_size(l); 
 };

list

Type that can be used to decode a list of elements.

external MySubUDR { 
  int  dataLength : static_size(1); 
  ascii secretData : dynamic_size(dataLength); 
 }; 
 external MyUDR { 
  int elementCount  : terminated_by(":"); 
  list<MySubUDR> myList : element_count(elementCount); 
  int userType  : terminated_by(0xA); 
 };

The list can have any of the following field size options: static_size, dynamic_size, terminated_by, or element_count. For further information, see the section below, Field Size Specifications.
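To make the bcd nibble orders and the asn_length result more concrete, the following Python sketch models both. It is an illustration only, not Ultra's implementation: the function names are hypothetical, filler nibbles are not handled, and the length decoding follows the general BER short and long forms.

```python
def decode_bcd(data: bytes, nibble_order: str = "msn_fd") -> list[int]:
    """Return the BCD digits in data, two digits per byte."""
    digits = []
    for byte in data:
        high, low = byte >> 4, byte & 0x0F
        # msn_fd: the most significant nibble is the first digit.
        # lsn_fd: the least significant nibble is the first digit.
        digits.extend((high, low) if nibble_order == "msn_fd" else (low, high))
    return digits


def decode_ber_length(data: bytes) -> tuple[int, int]:
    """Return (content length, number of bytes used by the length itself)."""
    first = data[0]
    if first < 0x80:
        # Short form: the byte itself is the length.
        return first, 1
    count = first & 0x7F  # long form: low 7 bits give the number of length bytes
    if count == 0:
        raise ValueError("indefinite length form is not handled in this sketch")
    return int.from_bytes(data[1:1 + count], "big"), 1 + count
```

For example, the bytes 0x12 0x34 give the digits 1, 2, 3, 4 with msn_fd, and the same digits are stored as 0x21 0x43 with lsn_fd.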

Primitive Field Options 

Primitive Field Option | Description

character_encoding(<encoding_name>)

Specifies that the field is encoded with the encoding named <encoding_name>, using the standard Java encoding support.

encode_value(<expr>)

Specifies that the encoded value of the field always is <expr>, regardless of whether decoded or set value is chosen during the processing. This is used for encoding only.

float, double

Informs the decoder that the value is actually a float value specified as a string and that the automatic mapping of the field is the internal type float or double. Only applicable for ascii fields.

int(base10), int(base16)

Informs the decoder that the value is actually an integer of decimal or hexadecimal base and that the automatic mapping of the field is of the internal type int. The other integer types (byte, short, long, bigint) can be used instead of int, in which case automatic mapping is done to the specified type. Only applicable for ascii and bcd fields.

lsb(<int_constant>)

The least significant bit of the field is <int_constant>. Value range is zero (0) to seven (7) (both inclusive). Only applicable for size one (1) integer type fields. This option is typically used together with the msb option.

msb(<int_constant>)

The most significant bit of the field is <int_constant>. Value range is zero (0) to seven (7) (both inclusive). Only applicable for size one (1) integer type fields. This option is typically used together with the lsb option.

native_size(<expr>)

Specifies the number of BCD digits for a bcd declared field. This size does not cover field size calculation and dynamic_size must generally also be specified.

byte_alignment(<int_constant>)

Specifies that a field begins at the next multiple of the given alignment size (in bytes). The value must be a power of 2 (for example 1, 2, 4, or 8). This field option can also be used in a bit_block or repeat_block. For an example of how this option can be used, see the section below, Bit Blocks.

Note!

The byte_alignment field option is used for decoding only, and counts from the start of the UDR.
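The msb, lsb, and byte_alignment options can be pictured with a small Python sketch. It is only a model under stated assumptions: bit 7 is taken to be the most significant bit of the byte, an offset that is already aligned is assumed to stay where it is, and the function names are hypothetical.

```python
def extract_bits(byte: int, msb: int, lsb: int) -> int:
    """Return the value stored in bits msb..lsb (inclusive) of a single byte."""
    if not 0 <= lsb <= msb <= 7:
        raise ValueError("msb and lsb must satisfy 0 <= lsb <= msb <= 7")
    width = msb - lsb + 1
    return (byte >> lsb) & ((1 << width) - 1)


def align_offset(offset: int, alignment: int) -> int:
    """Round a byte offset (counted from the start of the UDR) up to the alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of 2")
    return (offset + alignment - 1) // alignment * alignment
```

With the byte 0xA5, extract_bits(0xA5, 7, 4) yields the upper nibble 0xA and extract_bits(0xA5, 3, 0) yields the lower nibble 0x5. A field that would otherwise start at byte offset 1 starts at offset 4 when byte_alignment(4) is modeled with align_offset(1, 4).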


Field Size Specifications


Field Size Specification | Description

static_size(<constant>)

Used to specify a static size of a field (in bytes)

dynamic_size(<expr>)

Used to specify a dynamic size of a field (in bytes)

element_count(<expr>)

Used to specify the size of a list field (in number of elements)

terminated_by(<constant>)

Used to specify a dynamic field terminated by a specific constant

bit_size(<expr>)

Used to specify a size of a field (in bits). This size specification can only be used inside a bit_block.

padded_with(<constant>)

Used to specify padding character

align(left)

Specifies that the field is left aligned, default

align(right)

Specifies that the field is right aligned

When decoding a field, the size calculation is done in two steps. First, the occupying size is calculated. This is the space the field requires in the record. After that, the core size and offset are calculated, which identify the part of the field that is actually decoded into the internal field.

The occupying size is calculated as follows:

  1. If static_size is specified, then this one is used.
     
  2. If dynamic_size is specified, then this one is used.
     
  3. If element_count is specified, then this one is used.
     
  4. If terminated_by is specified, then this one is used. The field size includes the termination character but will never take up more than the total remaining size in the UDR. (This is not treated as a decoding error, in order to support the trailing_optional field option.)
     
  5. Otherwise (if the field type supports it) the field size will be deduced directly from the type. This is supported by constructed types (sub-records) and the asn_length primitive type. 

The core field data always has the full occupying size for constructed fields (record fields). For primitive fields the size is specified as follows:


  1. For a BCD field, with native_size specified, this along with the alignment specification is used.
     

  2. If terminated_by is used to find the occupying size, this terminator char (or nibble for BCD) is removed.
     

  3. Any padding is removed (while considering the alignment specification). The padding is specified either with padded_with or with terminated_by, provided that the occupying size is not calculated using the terminator (this case exists for historical reasons; in current versions padded_with should be used instead). If the field is an ASCII field, space is used as default padding.
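The two-step calculation above can be sketched in Python as follows. This is a simplified model, not Ultra's implementation: element_count, native_size, and type-deduced sizes are left out, the helper names are hypothetical, and it is assumed that a left-aligned field carries its padding on the right and a right-aligned field carries it on the left.

```python
def occupying_size(data: bytes, offset: int, static_size=None,
                   dynamic_size=None, terminated_by=None) -> int:
    """Apply the size rules in order of precedence and return the size in bytes."""
    remaining = len(data) - offset
    if static_size is not None:
        return static_size
    if dynamic_size is not None:
        return dynamic_size
    if terminated_by is not None:
        end = data.find(terminated_by, offset)
        if end < 0:
            # No terminator found: never exceed the remaining size of the UDR.
            return remaining
        return end - offset + len(terminated_by)  # the terminator is included
    raise ValueError("size would have to be deduced from the field type")


def core_value(raw: bytes, terminator=None, padding: bytes = b" ",
               align: str = "left") -> bytes:
    """Strip the terminator and the padding from the occupied bytes."""
    if terminator and raw.endswith(terminator):
        raw = raw[:-len(terminator)]
    # Assumption: left-aligned data is padded on the right, and vice versa.
    return raw.rstrip(padding) if align == "left" else raw.lstrip(padding)
```

For instance, a field terminated by ":" in the data "abc:def" occupies four bytes, of which the core value is "abc". A right-aligned field "007" padded with "0" has the core value "7".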
     

Field Options for Optional Fields

The following field options are used to specify when a field is present.


Field Option | Description

present if(<condition>)

The field is present if the <condition> evaluates to true.

trailing_optional

The field is present unless the end of the UDR data has been reached. This is a convenience option equivalent to present if(remaining_size > 0).

Other Field Options

Field Option | Description

external_only

The field will not be automatically created in the target_internal when performing automatic mapping. This is useful for fields that contain “decoding logic” and provide no useful information after decoding. Typical examples are recordLength and recordType fields.

udr_size and remaining_size

Fields may need to use the size of the containing record in expressions. This is done by using the udr_size keyword.


Example - udr_size

external SimpleSequential {
     int recordType : static_size(1);
     ascii secretData : dynamic_size(udr_size-1);
};

In the previous example, the size of SimpleSequential is unknown at declaration time. However, when a size is provided (specified in a parent record type), the secretData field will occupy this entire space minus one byte (which is used by the recordType field in this example).

Note!

If the size is not supplied by a parent record, the record size calculation rules will result in an undefined size since the udr_size value is unavailable before the size has been calculated. This would cause a decoding error.


The other special value that depends on the record size is remaining_size, which is the size remaining until the end of the record. The previous example could have been written using remaining_size instead of udr_size, as shown in the following example.

Example - remaining_size

external SimpleSequential {
    int recordType : static_size(1);
    ascii secretData : dynamic_size(remaining_size);
};
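The equivalence of the two SimpleSequential declarations can be illustrated with a short Python model. It assumes that the parent record supplies exactly the bytes passed in (so udr_size is the length of the input) and that the ascii field uses the default ISO8859-1 encoding; the function name is hypothetical.

```python
def decode_simple_sequential(data: bytes) -> dict:
    """Decode a 1-byte recordType followed by secretData filling the rest."""
    udr_size = len(data)       # size supplied by the parent record (assumption)
    record_type = data[0]      # int recordType : static_size(1)
    offset = 1
    # After the first field, remaining_size equals udr_size - 1.
    remaining_size = udr_size - offset
    secret_data = data[offset:offset + remaining_size].decode("iso-8859-1")
    return {"recordType": record_type, "secretData": secret_data}
```

Decoding the six bytes 0x07 followed by "hello" gives recordType 7 and secretData "hello", whichever of the two size expressions is used.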

Bit Blocks

Bit blocks are used when the data record contains fields that are not byte aligned. When declaring fields in bit blocks there are two ways to specify which bits to use for the field content. When using a bit_block of a single byte, it is possible to specify the most and least significant bit of the field using msb and lsb, as previously described. The alternative is to use the bit_size option to specify the number of bits spanned by the field.

You can also use the byte_alignment field option if you need to specify from which byte a field begins. This field option can only be used for decoding. For further information on byte_alignment, see the section above, Primitive Field Options.

The general syntax of bit blocks is as follows:

bit_block : <size specification> 
  [, present if(<cond>) ] { 
    <bit_block contents> 
};

Example - bit_block with msb and lsb

bit_block : static_size(1) { 
       int LACLength  : msb(7), lsb(4); 
       int OwnerIDLength : msb(3), lsb(0); 
 };

Example - bit_block with bit_size

bit_block : <size specification> {
       int hour : bit_size(5); 
       int minute: bit_size(6); 
       int second: bit_size(6); 
       int eventId: bit_size(3); 
 };
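How consecutive bit_size fields share the bytes of a bit_block can be sketched in Python. The sketch assumes that fields are taken from the most significant bit onwards and that the fields of 5, 6, 6, and 3 bits from the example above sit in a three-byte block with four unused trailing bits; both points are assumptions made for illustration, and the function name is hypothetical.

```python
def read_bit_fields(data: bytes, widths: list[int]) -> list[int]:
    """Read consecutive fields of the given bit widths, most significant bit first."""
    available = len(data) * 8
    if sum(widths) > available:
        raise ValueError("fields do not fit in the bit block")
    total = int.from_bytes(data, "big")
    values, consumed = [], 0
    for width in widths:
        shift = available - consumed - width  # bits to the right of this field
        values.append((total >> shift) & ((1 << width) - 1))
        consumed += width
    return values
```

Called with widths [5, 6, 6, 3], the function returns the hour, minute, second, and eventId values in that order.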

Example - bit_block with byte_alignment

This example shows how the byte_alignment field option can be used in a bit_block, in which the secondBit field begins in the last byte of a bit_block of five bytes:

external BitBlock_ByteAlignment { 
     bit_block : static_size(5) { 
         byte    firstBit:    bit_size(1); 
         byte    secondBit:  bit_size(1), byte_alignment(4); 
     }; 
 }; 

In addition to simple fields, a bit_block can contain repeat_block constructs in the contents part. For a description of repeat_block, see the section below, Repeat Blocks.

Repeat Blocks

A repeat_block can be used to specify that a group of fields is to be repeated a specified number of times. Currently, this construct can only be used inside a bit_block structure or inside another repeat_block structure, and nesting is restricted to a maximum of two levels of repeat_block. See the example below.

You can also use the byte_alignment field option if you need to specify from which byte a field begins. This field option can only be used for decoding. For further information on byte_alignment, see the section above, Primitive Field Options.

The general syntax of repeat blocks is as follows:

repeat_block ( <repeat count> ) { 
    <repeat_block fields> 
};

Example - repeat_block

external BitBlockTest {
   bit_block : dynamic_size(remaining_size){
    int string_count: bit_size(8); 
    repeat_block(string_count) {
       int string_length: bit_size(8);
       repeat_block(string_length) {
         int character: bit_size(8);
       };
     };
   };
};

[Figure: the structure of the UDR type for this example, as displayed in the UDR Internal Format Browser.]

Note!

It is not possible to encode to a structure containing a repeat_block.
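The nested repeat_block in the BitBlockTest example describes a count-prefixed list of length-prefixed strings. Because every field there is eight bits wide, a byte-oriented Python model is enough to show the decoding order. The function name and the ISO8859-1 conversion of the characters are assumptions made for illustration.

```python
def decode_bit_block_test(data: bytes) -> list[str]:
    """Decode string_count, then for each string its length and its characters."""
    offset = 0
    string_count = data[offset]            # int string_count: bit_size(8)
    offset += 1
    strings = []
    for _ in range(string_count):          # outer repeat_block(string_count)
        string_length = data[offset]       # int string_length: bit_size(8)
        offset += 1
        chars = data[offset:offset + string_length]  # inner repeat_block
        if len(chars) < string_length:
            raise ValueError("data ends inside a string")
        strings.append(chars.decode("iso-8859-1"))
        offset += string_length
    return strings
```

For the input bytes 0x02, 0x02, "hi", 0x03, "abc", the model returns the two strings "hi" and "abc".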

Constructed Types

A sequential field can be a type that is an instance of another external format.


Example - Constructed types

external MyParentFormat {
  int field1 : static_size(4);
  MyEnclosedFormat field2;
};

Here MyEnclosedFormat can be any external format.

set Construct

The set construct is used for decoding formats containing optional blocks of additional data. The following example shows how a set construct is declared:

Example - set Construct

external MyFormat: 
     dynamic_size(recordSize) { 
   int recordSize: static_size(4); 
   set : dynamic_size( remaining_size ) { 
     MyPackage1 package1: optional; 
     MyPackage2 package2: optional; 
     list<MyPackage3> package3; 
   }; 
 };

All the formats, MyPackage1-3, must be declared with the identified_by option. The optional packages may appear in any order in the input file; however, it is verified that they do not appear more than once. Currently, all non-list fields in a set construct must be declared optional.

If the field type in the set is a list type, the set may contain multiple records of the list element type. The list type fields are not optional. Instead, when no matching records are found, the list is empty.

If a size is not specified on the set level, Ultra cannot validate that all the data in the UDR has been decoded. It is therefore recommended to specify the size, unless the set size is not known in advance (for instance, if the record is terminated by a terminator package or the set size calculation is needed for the record size calculation). The dynamic_size(remaining_size) specification used in the previous example is often correct.
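The rules for optional and list fields in a set can be modeled with a short Python sketch. It assumes that each package has already been recognized (through its identified_by condition) and arrives as a (field name, payload) pair; that input shape and the function name are hypothetical.

```python
def assemble_set(packages, optional_fields, list_fields) -> dict:
    """Collect identified packages into the fields of a set construct."""
    result = {name: None for name in optional_fields}   # absent until seen
    result.update({name: [] for name in list_fields})   # empty list if none match
    for name, payload in packages:                      # packages come in any order
        if name in list_fields:
            result[name].append(payload)
        elif name in optional_fields:
            if result[name] is not None:
                raise ValueError(f"{name} appears more than once")
            result[name] = payload
        else:
            raise ValueError(f"unknown package {name}")
    return result
```

With the fields of the MyFormat example, a stream containing two package3 records and one package1 record yields a value for package1, nothing for package2, and a two-element list for package3.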

switched_set Construct

The switched_set construct can often be used instead of the set construct. It has advantages (in performance and in ease of use), especially when the separate sub-packages are simple. The syntax is, however, more complex than that of the basic set construct. The syntax of the switched_set construct is declared as follows:

switched_set( <switch field> )
  [: <size specification> ] {
    <prefix fields>
    <switch cases>
    [<default case>]
}; 

The size specification may contain the normal size options. The other parts of the declaration are the prefix fields, which are decoded for each package in the set, and the switch cases. All prefix fields must have static sizes, and the switch field must be one of the prefix fields.

The syntax of the switch case is declared as follows:

case( <case value> ) [: include_prefix] {
        <case fields>
};

Bit Blocks are supported in the prefix of a switched_set. The fields inside the bit_block can be used as any other field in the prefix. However, there are some conditions for using the bit_block implementation.

The mz.ultra.bitfield.codec property must be set to true for the platform as well as the desktop:

  • mzsh topo set topo://client:desktop/val:config.properties.mz.ultra.bitfield.codec true
  • mzsh topo set --allow-disconnected  -l pico:platform/val:config.properties.mz.ultra.bitfield.codec true

The following limitations also apply:

  • For the bit_block itself, only static_size is supported.
  • For fields:
    • byte, short, or int is supported as the type
    • bit_size, static_size, or msb and lsb are supported
    • signed, little_endian, and big_endian are supported

Example - switched_set with a bit_block in the prefix

external Simple: static_size(5) {
  switched_set(tagId) {
    bit_block: static_size(2) {
      int tagId: bit_size(4);
      int mine: bit_size(5);
    };
    int hej: static_size(2);
    case(0) {
      int toshi: static_size(1);
    };
    case(1) {
      int naka: static_size(1);
    };
  };
};


The case fields are normal field specifications with the additional possibility of declaring list fields for the case where a package can be present repeatedly. If include_prefix is specified, then the case body will be decoded including the prefix fields. The syntax of the default case is declared as follows:

default [: include_prefix] {
        <case fields>
};

The decoding of a switched_set is performed according to the following steps:

  1. Decode the prefix fields.

  2. Decode the case matching the value of the switch field. If no case matches, decode the default case. If there is no default case, end the switched_set decoding.

  3. Repeat steps 1-2 until the switched_set size (or the end of the UDR) has been reached.
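The decoding steps above can be sketched as a loop in Python. The prefix layout used here (a one-byte switch field followed by a one-byte body length) and the handler callables are hypothetical and serve only to show the control flow; they are not part of Ultra.

```python
def decode_switched_set(data: bytes, cases: dict, default=None) -> list:
    """Decode packages until the end of the data or until no case applies."""
    offset, decoded = 0, []
    while offset < len(data):                      # step 3: repeat until the end
        if len(data) - offset < 2:
            raise ValueError("incomplete prefix")
        # Step 1: decode the prefix fields (static sizes).
        package_id, body_length = data[offset], data[offset + 1]
        body = data[offset + 2:offset + 2 + body_length]
        # Step 2: pick the matching case, else the default case.
        handler = cases.get(package_id, default)
        if handler is None:
            break                                  # no default: decoding ends
        decoded.append(handler(body))
        offset += 2 + body_length
    return decoded
```

A caller supplies one handler per case value, for example {1: decode_text, 2: decode_number}, and optionally a default handler for unknown package identifiers.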

Example - Format with a switched_set:

external SwitchedSetExample: terminated_by(0xA) { 
   // Size is remaining_size -1 (minus the terminator linefeed) 
    switched_set( packageId ): 
        dynamic_size( remaining_size - 1 ) { 
      ascii packageId: int(base10), static_size(1); 
      ascii packageLength: static_size(1), int(base10), 
         encode_value( case_size - 2 ); 
      case(1) { 
         list<ascii> list1: 
            dynamic_size( packageLength ); 
      }; 
      case(2): include_prefix { 
         ascii packageId_3: int(base10), static_size(1), 
               encode_value(3), external_only; 
         ascii packageLength_3: int(base10), static_size(1), 
               encode_value(case_size - 2), external_only; 
         ascii body_3: dynamic_size( packageLength_3 ); 
      }; 
      default: include_prefix { 
       list<ascii> defaultContent: 
             dynamic_size( packageLength + 2 ); 
      }; 
   }; 
 };

Encoding Specifications and Expressions

To support encoding to binary formats, it is often necessary to explicitly specify which value is to be encoded in the external fields. Normally the value is taken from the corresponding internal field; however, there are cases when this is not desirable, for instance if there is no mapped internal field (because the external_only option has been used), or if the value must be calculated from information about the encoding (for instance, udr_size). This is done through the encode_value option (see the section above, Primitive Field Options), and there are several special constructs that may be used in the value expression:

  1. udr_size - evaluates to the encoded size of the UDR. That means this is not necessarily the same value as during decoding.
     

  2. field_size(fieldName) - evaluates to the encoded size of the named field.
     

  3. field_present(fieldName) - evaluates to true if the named field is present in the encoding. It is always true for non-optional external fields.
     

  4. case_size - this is only usable within switched_set blocks and evaluates to the encoded size of the current case (including prefix fields).

If size expressions are used, the field encoding has to be postponed until the size is known. To be able to do this, Ultra requires that any such field is declared with static_size. An example of these concepts is presented next.

Example - Encoding specifications and expressions

external Ext: dynamic_size( udrSize ) { 
   ascii udrSize: int(base10), static_size(3), 
         align(right), padded_with("0"), 
                 encode_value( udr_size ); 
   ascii fieldSize: int(base10), static_size(3), 
         align(right), padded_with("0"), 
                 encode_value( field_present( strField ) ? 
                               field_size( strField ):0 ); 
   ascii strField: dynamic_size( fieldSize ), 
      present if( fieldSize > 0 ); 
 };
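The postponed encoding in the Ext example can be modeled in Python: the variable-size field is encoded first, and the two static-size fields are filled in once the sizes are known. This is a sketch of the idea only; the function names are hypothetical, and strField is treated as absent when it is empty or not set, mirroring present if( fieldSize > 0 ).

```python
def encode_number(value: int) -> bytes:
    """ascii, int(base10), static_size(3), align(right), padded_with("0")."""
    return str(value).rjust(3, "0").encode("ascii")


def encode_ext(str_field=None) -> bytes:
    """Encode udrSize, fieldSize and the optional strField."""
    body = str_field.encode("iso-8859-1") if str_field else b""
    field_size = len(body)            # field_present(...) ? field_size(...) : 0
    udr_size = 3 + 3 + len(body)      # both size fields have static_size(3)
    return encode_number(udr_size) + encode_number(field_size) + body
```

Encoding the string "abc" produces the nine bytes "009003abc", and encoding without a strField produces "006000".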

When processing an encode_value instruction, Ultra automatically decides how to convert the value depending on the result type of the expression. When deciding this, Ultra starts with the default internal type of the external field. If this type is called defaultType and the expression type is called encodeType, the encoding rules are:


  1. If the defaultType is assignable from encodeType, use the default mapping.
     

  2. If the defaultType is string or bytearray and the encodeType is numeric, encode it as a single byte with that numeric value (the character with that ASCII code).
     

  3. If the defaultType is bytearray and the encodeType is string, do standard encoding (ISO-8859-1) of the string.
     

  4. If the encodeType is string and the external base type is ascii (for example using int(base10)), use direct string encoding.

If none of these rules is applicable, the format will not compile. To understand what this means, consider the following field definitions.


Example - Field definitions

ascii strField1: static_size(1), encode_value("10");
ascii strField2: static_size(1), encode_value(10);
ascii intField1: static_size(1), int(base10),
                 encode_value("10");
ascii intField2: static_size(1), int(base10),
                 encode_value(10);

Expected encoded results for these fields:


Field | Expected encoded result

strField1

Both defaultType and encodeType are string. The normal encoding will be used to get the result "1" (since static_size(1) is used, the result "10" is truncated to one byte).

strField2

defaultType is string and encodeType is byte (numeric). This means that the second rule is applied, and the result is ascii 10 (newline).

intField1

defaultType is int and encodeType is string. There is no mapping between these types; however, since the external base type is ascii, the string is mapped out as for strField1, and the result is "1".

intField2

defaultType is int and encodeType is byte. Since encodeType is directly assignable to defaultType, it is mapped out as normal, and the output is again "1".
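The four rules and the expected results above can be checked against a small Python model. It is a simplified illustration rather than Ultra's implementation: types are represented by plain strings and Python values, only the int(base10) mapping over ascii is modeled for rule 1 with an int defaultType, and the function name is hypothetical.

```python
def encode_field(default_type: str, base_type: str, value, static_size: int) -> bytes:
    """Pick an encoding rule for value and truncate the result to static_size."""
    if default_type == "string" and isinstance(value, str):
        raw = value.encode("iso-8859-1")   # rule 1: default mapping
    elif default_type == "int" and isinstance(value, int):
        raw = str(value).encode("ascii")   # rule 1: int(base10) over ascii
    elif default_type in ("string", "bytearray") and isinstance(value, int):
        raw = bytes([value])               # rule 2: one byte with that value
    elif default_type == "bytearray" and isinstance(value, str):
        raw = value.encode("iso-8859-1")   # rule 3: standard string encoding
    elif isinstance(value, str) and base_type == "ascii":
        raw = value.encode("iso-8859-1")   # rule 4: direct string encoding
    else:
        raise TypeError("no encoding rule applies; the format would not compile")
    return raw[:static_size]
```

Applied to the four fields, the model gives "1" for strField1, the newline byte for strField2, and "1" for both intField1 and intField2.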