Google Protocol Buffer Support
This chapter describes the GPB (Google Protocol Buffers) addition to the Ultra Format Definition Language (UFDL). This addition enables you to compile GPB definitions, decode GPB input data, and encode data into the GPB format.
Both the proto2 and proto3 versions of the Google Protocol Buffers language are supported.
Overview
You manage GPB parsing in UFDL by applying the gpb_block construct. The syntax differs depending on whether you are using proto2 or proto3.
Syntax for proto2
gpb_block {
    <GPB message elements>
};
Syntax for proto3
gpb_block {
    syntax = "proto3";
    <GPB message elements>
};
The full description of the GPB language can be found at https://developers.google.com/protocol-buffers/docs/proto for proto2 and at https://developers.google.com/protocol-buffers/docs/proto3 for proto3.
The GPB Field Rules
You specify that message elements are formatted according to one of the following rules:
For proto2 only:
required
optional
For proto2 and proto3:
repeated
The GPB Scalar Value Types
The GPB message elements can be defined with any of the following types:
| Type | Notes |
|---|---|
| double | 8 bytes signed. |
| float | 4 bytes signed. |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. |
| uint32 | Uses variable-length encoding. |
| uint64 | Uses variable-length encoding. |
| sint32 | Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int32. |
| sint64 | Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int64. |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. |
| sfixed32 | Always four bytes. |
| sfixed64 | Always eight bytes. |
| bool | Use the bool type to define the GPB message elements. |
| string | Use the string type to define the GPB message elements. |
| bytes | May contain any arbitrary sequence of bytes. |
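The size notes in the table follow from how variable-length encoding works on the wire. The following Python sketch is for illustration only and is not part of Ultra or UFDL; the function names are hypothetical. It implements the standard protobuf base-128 varint and ZigZag encodings, and shows why a negative int32 is expensive, why sint32 is cheaper for negative values, and where the fixed32 threshold of 2^28 comes from.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: negative values are sign-extended to 64 bits before encoding."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value: int) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0,-1,1,-2 -> 0,1,2,3."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value: int) -> bytes:
    """sint32: ZigZag-map the value, then varint-encode it."""
    return encode_varint(zigzag32(value))


print(encode_varint(300).hex())        # ac02
print(len(encode_int32(-1)))           # 10 bytes for a negative int32
print(len(encode_sint32(-1)))          # 1 byte for the same value as sint32
print(len(encode_varint(2**28 - 1)))   # 4 bytes, same size as fixed32
print(len(encode_varint(2**28)))       # 5 bytes, one more than fixed32
```

Each varint byte carries 7 bits of payload, so a value of 2^28 or more needs five bytes as uint32 but only four as fixed32. The same reasoning gives the 2^56 threshold for fixed64.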
Limitations
The following limitations apply to the GPB support for proto2:
Default specifiers are not supported.
Groups are not supported.
The packed option is not supported.
Import statements in the gpb_block have no effect.
Nested types are not fully supported, since their names become part of the global scope. However, you can avoid this problem by renaming one of the subtypes.
The extensions specifier is not supported.
Options are not supported.
Packages are not supported.
Import public specifiers are not supported.
Definitions of services are not supported, only messages.
The following limitations apply to the GPB support for proto3:
The only supported options are allow_alias in enums and packed for fields.
Importing definitions in the gpb_block is not supported.
Import public specifiers are not supported.
The parent message type is not supported.
The any type is not supported.
Packages are not supported.
Definitions of services are not supported, only messages.
Note!
The GPB message format is not self-delimiting: an encoded message does not record where it ends. Take this into account when decoding a stream of messages, or a file containing several messages.
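One common way to handle this is to prefix each encoded message with its length as a varint, which is the convention used by the writeDelimitedTo method in the protobuf Java library. The following Python sketch illustrates that framing. Whether your data source uses this convention is an assumption you need to verify, and the function names are hypothetical; they are not part of Ultra.

```python
from typing import List, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def frame(messages: List[bytes]) -> bytes:
    """Concatenate encoded messages, each preceded by its varint length."""
    return b"".join(encode_varint(len(m)) + m for m in messages)


def unframe(stream: bytes) -> List[bytes]:
    """Split a length-prefixed stream back into individual encoded messages."""
    messages = []
    pos = 0
    while pos < len(stream):
        length, pos = decode_varint(stream, pos)
        end = pos + length
        if end > len(stream):
            raise ValueError("truncated message")
        messages.append(stream[pos:end])
        pos = end
    return messages


# Two already-encoded messages: field 1 (string) set to "abc" and to "de".
first = b"\x0a\x03abc"
second = b"\x0a\x02de"
stream = frame([first, second])
print(unframe(stream) == [first, second])  # True
```

Without such a prefix (or some other external record boundary), the two messages above would simply be concatenated, and a decoder would read them as a single message in which field 1 occurs twice.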
Nested Types in Proto3
Nested types are supported in proto3. For further information on the specification for nested types, see https://developers.google.com/protocol-buffers/docs/proto3. When referring to a nested type by its qualified name, a point is used as the delimiter, for example M1.M2 or .M1.M2. However, note that while nested types are indicated with a point in the GPB specification, you must use an underscore instead when mapping to Ultra and APL. See the example provided below.
Example - Nested types in proto3
In this example, M2 is nested inside M1. Depending on where you are inside the gpb_block, you can refer to a message by its relative name M2 or by its qualified name .M1.M2.
message M1 {
    message M2 {
        string f1 = 1;
    }
    M2 f2 = 2;
    .M1.M2 f3 = 3;
}
The following shows how to map M1 to an internal configuration:
In APL you can refer to M1 and M2 by their qualified names M1 and M1_M2, where an underscore is used instead of a point.
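The naming rule can be summarized with a small Python sketch. The helper apl_type_name is hypothetical and exists only to illustrate the mapping described above; it is not an Ultra or APL function.

```python
def apl_type_name(gpb_name: str) -> str:
    """Convert a GPB type name (M1.M2 or .M1.M2) to its underscore form."""
    # Drop the optional leading point, then replace each delimiter point.
    return gpb_name.lstrip(".").replace(".", "_")


print(apl_type_name("M1"))      # M1
print(apl_type_name("M1.M2"))   # M1_M2
print(apl_type_name(".M1.M2"))  # M1_M2
```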
A GPB Format Example
To decode a GPB data file, include the GPB format definition in a gpb_block in the Ultra format.