A Constructed Decoder Example

In some cases, a so-called constructed decoder is useful. Its main advantage is that it introduces validation logic, making it possible to check the order of the arriving records. For instance, suppose each incoming file must contain one header at the start and one trailer at the end in order to be accepted. In between, data records may or may not be present. The data records can be of two types. Headers and trailers are considered to be records as well, so there are actually four record types in this format definition.

The source file for which a decoder is exemplified in this appendix:

An example of the source files discussed in this section

This section includes a detailed description of all the code block parts that a constructed decoder might contain:

  • external (which includes Headers and Trailers, and Record Types)
  • internal
  • in_map
  • decoder
  • out_map
  • encoder

external

Headers and trailers, as well as record types, should be included in the external code block.

Headers and Trailers

Since headers and trailers are treated as records, the same syntax as for external records applies. APL syntax can also be used within UFDL code.

external FileHeader {
   ascii header : terminated_by(0xA);
};

The first line is always a header.

Note!

No identified_by is needed, since the decoder does not evaluate the input stream to see whether the next record is a header or not. This contrasts with deciding whether to read a TypeA or a TypeB record, where an identification test is called for.

external FileTrailer : identified_by( strStartsWith( trailer, "Date") ){
   ascii trailer : terminated_by(0xA);
};

The identified_by for the trailer is not crucial. However, it provides additional validation during decoding, since it verifies that the trailer really starts with "Date".

Record Types

The definitions of TypeA and TypeB are fairly straightforward. No encode_value for RecordType is set, since this is evaluated from the internal UDR during encoding (see the section below, internal).

external TypeA : identified_by( RecordType == "A" ),
    terminated_by(0xA) {
        ascii RecordType      : static_size(2), 
                                terminated_by(":");
        ascii A_number        : terminated_by(":");
        ascii B_number        : terminated_by(":");
        ascii SequenceNumber  : terminated_by(":");
        ascii Duration        : terminated_by(0xA);
};


external TypeB : identified_by( RecordType == "B" ),
    terminated_by(0xA) {
        ascii RecordType      : static_size(2),
                                terminated_by(",");
        ascii CallingCountry  : terminated_by(",");
        ascii SequenceNumber  : terminated_by(",");
        ascii LocalAreaCode   : terminated_by(",");
        ascii A_number        : terminated_by(",");
        ascii B_number        : terminated_by(",");
        ascii CauseForOutput  : terminated_by(",");
        ascii CalledCountry   : terminated_by(0xA);
}; 


internal

Suppose you want to output a single record type as a replacement for the incoming types A and B. The simplest way is to create a common internal.

internal MyInternal {

  // These are common fields to TypeA and TypeB

  string RecordType; 
  string SequenceNumber; 
  string A_number; 
  string B_number; 

  // These may or may not be present depending on 
  // record type

  string CallingCountry: optional; 
  string LocalAreaCode: optional; 
  string Duration: optional; 
  string CauseForOutput: optional; 
  string CalledCountry: optional; 
};

Both TypeA and TypeB records are mapped to MyInternal. The common fields are always set. The others are defined as optional, so their presence depends on the record type. The RecordType in the internal type is required for encoding, since the encoder needs to evaluate the record type to decide whether to encode as TypeA or TypeB.

in_map

Both record types A and B are mapped to the same internal. This approach is useful for simplifying the APL syntax within processing (many of the if-statements otherwise needed to determine the record type can be eliminated), or when a single output type is to be produced.

TypeA and TypeB are both mapped to MyInternal (see the section above, internal).

in_map TypeA_in : external( TypeA ), internal( MyInternal ) {
        automatic;
};


in_map TypeB_in : external( TypeB ), internal( MyInternal ) {
        automatic;
};
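The effect of mapping both external types to one internal can be pictured with a small Python sketch. This is only an illustration of the idea, not how the platform implements in_map: the function map_in, the use of the first character as the identification test, and the sample lines in the usage note are all assumptions made for this example. Only the field names, their order, and the separators are taken from the definitions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MyInternal:
    # Fields common to TypeA and TypeB: always set.
    RecordType: str
    SequenceNumber: str
    A_number: str
    B_number: str
    # Fields present for only one of the record types.
    CallingCountry: Optional[str] = None
    LocalAreaCode: Optional[str] = None
    Duration: Optional[str] = None
    CauseForOutput: Optional[str] = None
    CalledCountry: Optional[str] = None


# Field order and separators follow the external definitions of TypeA and TypeB.
TYPE_A_FIELDS = ["RecordType", "A_number", "B_number", "SequenceNumber", "Duration"]
TYPE_B_FIELDS = ["RecordType", "CallingCountry", "SequenceNumber", "LocalAreaCode",
                 "A_number", "B_number", "CauseForOutput", "CalledCountry"]


def map_in(line: str) -> MyInternal:
    """Hypothetical stand-in for TypeA_in/TypeB_in with 'automatic' mapping."""
    # Assumption: the record type is recognised by the first character.
    if line.startswith("A"):
        names, separator = TYPE_A_FIELDS, ":"
    elif line.startswith("B"):
        names, separator = TYPE_B_FIELDS, ","
    else:
        raise ValueError(f"unknown record type: {line!r}")
    values = line.rstrip("\n").split(separator)
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} fields, got {len(values)}")
    # Fields with the same name are copied; the rest keep their default (None).
    return MyInternal(**dict(zip(names, values)))
```

With made-up input, map_in("A:111:222:1:60\n") yields a MyInternal with Duration set and CallingCountry left unset, while map_in("B,SE,2,08,111,222,0,NO\n") sets the country fields and leaves Duration unset. Downstream logic can then work on one type and test the optional fields only where needed.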


The headers are not wanted in processing, so discard_output is set. However, the target_internal is still useful, since it enables you to produce headers for encoding.

in_map Header_in : external( FileHeader ), 
   target_internal( Header ), discard_output { 
        automatic; 
};

// The trailer gets a special record type.

in_map Trailer_in : external( FileTrailer ), 
   target_internal( Trailer ) { 
        automatic; 
};


decoder

The following constructed decoder definition expects all batches to start with a header, end with a trailer, and have zero, one, or several A and B records in between. If not, the decoder aborts.

// The sub-decoders.

decoder Records : in_map( TypeA_in ), in_map( TypeB_in ); 
decoder Header : in_map( Header_in ); 
decoder Trailer : in_map( Trailer_in );

// The total (file) decoder. 
// '*' means zero/one/several records are expected between 
// one header and one trailer for each file collected.

decoder Total { 
   decoder Header; 
   decoder Records *; 
   decoder Trailer; 
};
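The order check that the Total decoder performs can be sketched in Python as follows. This is a simplified model under stated assumptions, not the platform's implementation: the function names are hypothetical, the input is assumed to be already split into lines, data records are assumed to be recognisable by their first character, and treating data after the trailer as an error is an assumption about what "end with a trailer" means.

```python
def is_data_record(line: str) -> bool:
    # Stand-in for identified_by( RecordType == "A" ) / ( RecordType == "B" ).
    return line[:1] in ("A", "B")


def is_trailer(line: str) -> bool:
    # Mirrors identified_by( strStartsWith( trailer, "Date") ).
    return line.startswith("Date")


def decode_total(lines):
    """Return (header, records, trailer), or raise ValueError on a bad sequence."""
    if not lines:
        raise ValueError("header expected, but the batch is empty")

    # decoder Header: the first line is taken as the header without any test.
    header = lines[0]

    # decoder Records *: zero, one, or several A and B records.
    records = []
    pos = 1
    while pos < len(lines) and is_data_record(lines[pos]):
        records.append(lines[pos])
        pos += 1

    # decoder Trailer: exactly one trailer must follow the records.
    if pos >= len(lines) or not is_trailer(lines[pos]):
        raise ValueError(f"trailer expected at line {pos + 1}")
    if pos != len(lines) - 1:
        raise ValueError("unexpected data after the trailer")
    return header, records, lines[pos]
```

For example, a batch consisting of only a header line and a line starting with "Date" is accepted with an empty record list, whereas a batch whose last line is an A record raises an error, which corresponds to the decoder aborting.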


out_map

Suppose you are required to encode back to the original format.

out_map TypeA_out: external(TypeA), internal( MyInternal ) { 
    automatic; 
};

out_map TypeB_out: external(TypeB), internal( MyInternal ) { 
    automatic; 
};

out_map Trailer_out: external(FileTrailer), internal(Trailer) { 
   automatic; 
};

out_map Header_out: external(FileHeader), internal(Header) { 
   automatic; 
};

The out-maps and encoder are simple.

Note!

TypeA and TypeB are both encoded from MyInternal. Which type to use depends on the value of the RecordType field.

encoder

A constructed encoder cannot be created. Hence, the following encoder definition does not check the order of the arriving records, nor does it require that all record types are present in the output file.

encoder Total: out_map( TypeA_out ),
               out_map( TypeB_out ), 
               out_map( Header_out ), 
               out_map( Trailer_out );
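How the value of RecordType decides between TypeA_out and TypeB_out can be illustrated with a short Python sketch. It is a model of the selection only, under assumptions: the function encode is hypothetical, the internal record is represented as a plain dictionary, and only the field order and separators are taken from the external definitions of TypeA and TypeB.

```python
# Field order and separators follow the external definitions of TypeA and TypeB.
TYPE_A_FIELDS = ["RecordType", "A_number", "B_number", "SequenceNumber", "Duration"]
TYPE_B_FIELDS = ["RecordType", "CallingCountry", "SequenceNumber", "LocalAreaCode",
                 "A_number", "B_number", "CauseForOutput", "CalledCountry"]


def encode(record: dict) -> str:
    """Hypothetical stand-in for choosing TypeA_out or TypeB_out."""
    record_type = record["RecordType"]
    if record_type == "A":
        names, separator = TYPE_A_FIELDS, ":"
    elif record_type == "B":
        names, separator = TYPE_B_FIELDS, ","
    else:
        raise ValueError(f"no out_map for record type {record_type!r}")
    # Every record is written on its own line (terminated_by(0xA)).
    return separator.join(record[name] for name in names) + "\n"
```

Because each record is encoded independently of the others, nothing in this model checks that a header comes first or that a trailer comes last, which is the limitation described above.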