Google Protocol Buffer Support

This chapter describes the GPB (Google Protocol Buffers) addition to the Ultra Format Definition Language (UFDL). This addition enables you to compile GPB definitions, decode GPB input data, and encode data into the GPB format.

Both the proto2 and proto3 versions of the Google Protocol Buffers language are supported.

Overview

You manage GPB parsing in UFDL by applying the gpb_block construct. The syntax differs depending on whether you are using proto2 or proto3.

Syntax for proto2

gpb_block { <GPB message elements> };


Syntax for proto3

gpb_block { syntax = "proto3"; <GPB message elements> };

The full description of the GPB language is available at https://developers.google.com/protocol-buffers/docs/proto for proto2 and at https://developers.google.com/protocol-buffers/docs/proto3 for proto3.

The GPB Field Rules

You specify that message elements are formatted according to one of the following rules:

For proto2 only:

  • required

  • optional

For proto2 and proto3:

  • repeated



The GPB Scalar Value Types 

The GPB message elements can be defined with any of the following types:

Type        Notes

double      8 bytes, signed.
float       4 bytes, signed.
int32       Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead.
int64       Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead.
uint32      Uses variable-length encoding.
uint64      Uses variable-length encoding.
sint32      Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int32.
sint64      Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int64.
fixed32     Always four bytes. More efficient than uint32 if values are often greater than 2^28.
fixed64     Always eight bytes. More efficient than uint64 if values are often greater than 2^56.
sfixed32    Always four bytes.
sfixed64    Always eight bytes.
bool        Use the bool type to define the GPB message elements.
string      Use the string type to define the GPB message elements.
bytes       May contain any arbitrary sequence of bytes.
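
The efficiency notes in the table follow from how GPB encodes integers on the wire. The following Python sketch is not part of UFDL and is included only to illustrate the variable-length (varint) and ZigZag encodings described in the public GPB encoding documentation. The function names are hypothetical and chosen for this illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_int64(value: int) -> bytes:
    """int32/int64 style: a negative value is treated as a 64-bit two's complement number."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def encode_sint64(value: int) -> bytes:
    """sint32/sint64 style: ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... before encoding."""
    zigzag = (value << 1) ^ (value >> 63)
    return encode_varint(zigzag & 0xFFFFFFFFFFFFFFFF)
```

With this sketch, encode_int64(-1) produces ten bytes while encode_sint64(-1) produces one byte, which is why sint32 and sint64 are recommended for fields that often hold negative values. Similarly, encode_varint(2**28) needs five bytes, one more than the four bytes that fixed32 always uses.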



Limitations

The following limitations apply to the GPB support for proto2:

  • Default specifiers are not supported.

  • Groups are not supported.

  • The packed option is not supported.

  • Import statements within the gpb_block have no effect.

  • Nested types are not fully supported, since their names become part of the global scope. However, you can avoid name conflicts by renaming one of the subtypes.

  • The extensions specifier is not supported.

  • Options are not supported.

  • Packages are not supported.

  • Import public specifiers are not supported.

  • Definitions of services are not supported, only messages.

The following limitations apply to the GPB support for proto3:

  • The only options that are supported are allow_alias in enums and packed for fields.

  • Importing definitions within the gpb_block is not supported.

  • Import public specifiers are not supported.

  • The parent message type is not supported.

  • The any type is not supported.

  • Packages are not supported.

  • Definitions of services are not supported, only messages.



Note!

The GPB message format is not self-delimiting. Take this into account when decoding a stream of messages, or a file that contains several messages.
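
One common way to handle this, sketched below in Python, is to prefix each encoded message with its length as a varint so that a reader can find the message boundaries. This is a general convention and an assumption here, not a description of how UFDL or any particular data source frames its messages; the function names write_delimited and read_delimited are hypothetical.

```python
import io
from typing import BinaryIO, Iterator


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one already-encoded message, preceded by its length as a varint."""
    length = len(payload)
    while length >= 0x80:
        stream.write(bytes([(length & 0x7F) | 0x80]))  # more length bytes follow
        length >>= 7
    stream.write(bytes([length]))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed message from the stream until it is exhausted."""
    while True:
        first = stream.read(1)
        if not first:
            return  # clean end of stream between two messages
        length, shift, byte = 0, 0, first[0]
        while True:
            length |= (byte & 0x7F) << shift
            if not byte & 0x80:
                break  # last byte of the length prefix
            shift += 7
            nxt = stream.read(1)
            if not nxt:
                raise EOFError("stream ended inside a length prefix")
            byte = nxt[0]
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside a message")
        yield payload
```

Each payload returned by read_delimited is the raw bytes of exactly one message, which can then be passed to a decoder one at a time.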

Nested Types in Proto3

Nested types are supported in proto3. For further information on the specification for nested types, see https://developers.google.com/protocol-buffers/docs/proto3. When you refer to a nested type by its qualified name, a dot is used as the delimiter, for example M1.M2 or .M1.M2. However, while nested types are indicated with a dot in the GPB specification, you must use an underscore instead when mapping to Ultra and APL. See the example provided below.

Example - Nested types in proto3

In this example, M2 is nested inside M1. Depending on where you are inside the gpb_block, you can refer to a message by its relative name M2 or by its qualified name .M1.M2.

message M1 {
  message M2 {
    string f1 = 1;
  }
  M2 f2 = 2;
  .M1.M2 f3 = 3;
}

The following shows how to map M1 to an internal configuration:

In APL you can refer to M1 and M2 by their qualified names M1 and M1_M2, where an underscore is used instead of a point. 
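
The naming rule can be illustrated with a small Python sketch. The helper gpb_to_apl_name is hypothetical and not part of UFDL or APL; it only shows the dot-to-underscore substitution described above, including the optional leading dot of a fully qualified GPB name.

```python
def gpb_to_apl_name(gpb_name: str) -> str:
    """Hypothetical helper: turn a GPB type name into the name used in Ultra and APL."""
    # Drop the leading dot of a fully qualified name, then replace dots with underscores.
    return gpb_name.lstrip(".").replace(".", "_")
```

For the example above, both M1.M2 and .M1.M2 map to M1_M2, while M1 is unchanged.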



A GPB Format Example

To decode a GPB data file, include the GPB format definition in a gpb_block in the Ultra format.



Example - GPB format using proto2