19. Google Protocol Buffer Support

This chapter describes the GPB (Google Protocol Buffers) addition to the Ultra Format Definition Language (UFDL). This addition enables you to compile GPB definitions, and to decode the GPB input data as well as encode data into the GPB format.

Both the proto2 and proto3 versions of the Google protocol buffers language are supported.

Overview

You manage GPB parsing in UFDL with the gpb_block construct. The syntax differs depending on whether you use proto2 or proto3.

Syntax for proto2

gpb_block {
    <GPB message elements>
};


Syntax for proto3

gpb_block {
    syntax = "proto3"; 
    <GPB message elements>
};

The full description of the GPB language can be found at https://developers.google.com/protocol-buffers/docs/proto for proto2 and at https://developers.google.com/protocol-buffers/docs/proto3 for proto3.

The GPB Field Rules

You specify that message elements are formatted according to one of the following rules:

For proto2 only:

  • required
  • optional

For proto2 and proto3:

  • repeated


The GPB Scalar Value Types 

The GPB message elements can be defined with any of the following types:

Type        Notes
double      8 bytes, signed.
float       4 bytes, signed.
int32       Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint32 instead.
int64       Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint64 instead.
uint32      Uses variable-length encoding.
uint64      Uses variable-length encoding.
sint32      Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int32.
sint64      Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int64.
fixed32     Always four bytes. More efficient than uint32 if values are often greater than 2^28.
fixed64     Always eight bytes. More efficient than uint64 if values are often greater than 2^56.
sfixed32    Always four bytes.
sfixed64    Always eight bytes.
bool
string
bytes       May contain any arbitrary sequence of bytes.
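
The size notes for the integer types follow from how variable-length integers (varints) are laid out on the wire. The Python sketch below is not part of UFDL or APL, and the function names are made up for illustration. It assumes the standard GPB base-128 varint and ZigZag encodings, and shows why a negative int32 is expensive, why sint32 is cheaper for the same value, and where the fixed-width types start to win.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: a negative value is sign-extended to 64 bits before encoding."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(value: int) -> bytes:
    """sint32: ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... before encoding."""
    zigzag = ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF
    return encode_varint(zigzag)


if __name__ == "__main__":
    print(len(encode_int32(-1)))         # 10 bytes on the wire
    print(len(encode_sint32(-1)))        # 1 byte on the wire
    print(len(encode_varint(2**28 - 1))) # 4 bytes, same as fixed32
    print(len(encode_varint(2**28)))     # 5 bytes, fixed32 is now smaller
```

The same reasoning gives the 2^56 threshold for fixed64: a varint needs nine bytes from that value upward, while fixed64 always uses eight.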


Limitations

The following limitations apply for the GPB support for proto2:

  • Default specifiers are not supported.
  • Groups are not supported.
  • The packed option is not supported.
  • Import statements in the gpb_block have no effect.
  • Nested types are not fully supported, since their names become part of the global scope. However, you can avoid this problem by renaming one of the subtypes.
  • The extensions specifier is not supported.
  • Options are not supported.
  • Packages are not supported.
  • Import public specifiers are not supported.
  • Definitions of services are not supported, only messages.

The following limitations apply for the GPB support for proto3:

  • Only the following options are supported: allow_alias in enums and packed for fields.
  • Importing definitions in the gpb_block is not supported.
  • Import public specifiers are not supported.
  • The parent message type is not supported.
  • The Any type is not supported.
  • Packages are not supported.
  • Definitions of services are not supported, only messages.


Note!

The GPB message format is not self-delimiting. Take this into account when decoding a stream of messages or a file containing several messages.
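
One common way to handle this outside of GPB itself is to write each message's length in front of it. The Python sketch below is an illustration only and is not part of UFDL or APL. It assumes a varint length prefix, which is one widespread convention and not something this chapter prescribes, and the function names are hypothetical. The payloads are treated as opaque, already encoded GPB messages.

```python
import io
from typing import BinaryIO, Iterator


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one message preceded by its length as a base-128 varint."""
    length = len(payload)
    while length >= 0x80:
        stream.write(bytes([(length & 0x7F) | 0x80]))
        length >>= 7
    stream.write(bytes([length]))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed message until the stream is exhausted."""
    while True:
        length, shift = 0, 0
        while True:
            raw = stream.read(1)
            if not raw:
                if shift == 0:
                    return  # clean end of stream between two messages
                raise EOFError("stream ended inside a length prefix")
            length |= (raw[0] & 0x7F) << shift
            shift += 7
            if not raw[0] & 0x80:
                break
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside a message")
        yield payload


if __name__ == "__main__":
    buffer = io.BytesIO()
    for message in (b"\x0a\x03abc", b"\x0a\x02hi"):
        write_delimited(buffer, message)
    buffer.seek(0)
    for message in read_delimited(buffer):
        print(message)
```

Without the prefix, the two messages would simply be concatenated and a reader could not tell where the first one ends.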

Nested Types in Proto3

Nested types are supported in proto3. For further information on the specification for nested types, see https://developers.google.com/protocol-buffers/docs/proto3. When you refer to a nested type by its qualified name, a dot is used as the delimiter, for example M1.M2 or .M1.M2. Note, however, that although the GPB specification uses a dot to indicate nested types, you must use an underscore instead when mapping to Ultra and APL. See the example provided below.

Example - Nested types in proto3

In this example, M2 is nested inside M1. Depending on where you are inside the gpb_block, you can refer to a message by its relative name M2 or its qualified name .M1.M2.

message M1 {
  message M2 {
    string f1 = 1;
  }
  M2 f2 = 2;
  .M1.M2 f3 = 3;
}

The following shows how to map M1 to an internal configuration:

in_map M1_in: external(M1),  target_internal(M1) {
  automatic: use_external_names;
}

In APL, you can refer to M1 and M2 by their qualified names M1 and M1_M2, where an underscore is used instead of a dot.

udrCreate(M1);
udrCreate(M1_M2);
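
The naming rule can be summarized as a small conversion. The Python helper below is hypothetical and exists only to illustrate the rule described above; it is not a function provided by Ultra or APL, and it only handles qualified names, not relative ones.

```python
def ultra_type_name(gpb_qualified_name: str) -> str:
    """Convert a GPB qualified name such as '.M1.M2' to the form 'M1_M2'."""
    # Drop the optional leading dot, then replace the remaining dots.
    return gpb_qualified_name.lstrip(".").replace(".", "_")


if __name__ == "__main__":
    print(ultra_type_name(".M1.M2"))  # M1_M2
    print(ultra_type_name("M1"))      # M1
```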

A GPB Format Example

To decode a GPB data file, include the GPB message definitions in a gpb_block in the Ultra format definition.


Example - GPB format using proto2

gpb_block {

    message MyData {
        required string   myName = 1;
        required string   myText = 2;
        required string   extraName = 3;

        required int32    myPriority = 4;
        required uint32   myId = 5;
        required uint32   equipmentId = 6;
        required MyParam  myParams = 7;
    }

    message MyParam {
        repeated string someField = 1;
    }

    message MyAdditional {
        required uint32   action = 1;
        required string   alias = 2;
        required int64    content = 3;
        optional int32    newId = 4;
        optional int32    newType = 5;
        optional uint64   myKey = 6;
    }

    message FlashEx {
        required string someField = 1;
    }

    message MyExtras {
        repeated FlashEx ex = 1;
    }

    message MyList {
        repeated string list = 1;
    }

    message SysmanData {
        required string someField = 1;
    }
};

external Wrapper {
    int dataSize: static_size(4), encode_value(udr_size-4);
    SysmanData data : dynamic_size(dataSize);
};

in_map Wrapper_inMap: external(Wrapper),
target_internal(Wrapper_int) {
    automatic;
};

out_map Wrapper_outMap: external(Wrapper),
internal(Wrapper_int) {
    automatic;
};

decoder Wrapper_decoder: in_map(Wrapper_inMap);

encoder Wrapper_encoder: out_map(Wrapper_outMap);

Example - GPB format using proto3

gpb_block {
    syntax = "proto3";

    message MyData {
        string   myName = 1;
        string   myText = 2;
        string   extraName = 3;

        int32    myPriority = 4;
        uint32   myId = 5;
        uint32   equipmentId = 6;
        MyParam  myParams = 7;
        map<string, MyParam> myMap = 8;
        AnEnum anEnum = 9;
        AnotherEnum anotherEnum = 10;
    }

    enum AnEnum {
        V0 = 0;
        V1 = 1;
    }

    enum AnotherEnum {
        option allow_alias = true;
        V0 = 0;
        V1 = 0;
    }

    message MyParam {
        repeated string someField = 1;
    }

    message MyAdditional {
        uint32   action = 1;
        string   alias = 2;
        int64    content = 3;
        int32    newId = 4;
        int32    newType = 5;
        uint64   myKey = 6;
    }

    message FlashEx {
        string someField = 1;
    }

    message MyExtras {
        repeated FlashEx ex = 1;
    }

    message MyList {
        repeated string list = 1;
    }

    message SysmanData {
        string someField = 1;
    }
};

external Wrapper {
    int dataSize: static_size(4), encode_value(udr_size-4);
    SysmanData data : dynamic_size(dataSize);
};

in_map Wrapper_inMap: external(Wrapper),
target_internal(Wrapper_int) {
    automatic;
};

out_map Wrapper_outMap: external(Wrapper),
internal(Wrapper_int) {
    automatic;
};

decoder Wrapper_decoder: in_map(Wrapper_inMap);

encoder Wrapper_encoder: out_map(Wrapper_outMap);

Note!

Since a GPB message does not carry its own size, the size has to be specified externally. This is why int dataSize is included in the external Wrapper, unless the size is known in advance.
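
To make the layout concrete, the Python sketch below builds and splits the byte sequence that the Wrapper format describes: a 4-byte size followed by that many bytes of GPB data. It is an illustration only, not product code. The function names are hypothetical, the big-endian byte order of dataSize is an assumption that this chapter does not state, and the field encoder covers only the single string field of SysmanData.

```python
import struct
from typing import Tuple


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one GPB string field as tag, length, and UTF-8 bytes."""
    data = text.encode("utf-8")
    if not 1 <= field_number <= 15 or len(data) > 127:
        raise ValueError("this sketch only handles one-byte tags and lengths")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return bytes([tag, len(data)]) + data


def wrap(payload: bytes) -> bytes:
    """Prefix the payload with its size in 4 bytes (big-endian assumed)."""
    return struct.pack(">I", len(payload)) + payload


def unwrap(buffer: bytes) -> Tuple[bytes, bytes]:
    """Return the first payload and the bytes that follow it."""
    (size,) = struct.unpack_from(">I", buffer, 0)
    end = 4 + size
    if len(buffer) < end:
        raise ValueError("buffer is shorter than the declared dataSize")
    return buffer[4:end], buffer[end:]


if __name__ == "__main__":
    sysman_data = encode_string_field(1, "abc")  # someField = 1
    frame = wrap(sysman_data)
    print(frame.hex())  # 000000050a03616263
    payload, rest = unwrap(frame + frame)
    print(payload == sysman_data, rest == frame)
```

The value written in the prefix is the total record size minus the four prefix bytes, which corresponds to encode_value(udr_size-4) in the external definition.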