GBKF Patterns
Introduction
To understand the two main patterns you can use to structure your data with the GBKF format, you first need to understand its core design. GBKF stores data as keyed entries, each made of a key, an instance (or group) ID, and an array of typed values.
It allows you to:
- Define multiple keys.
- Repeat keys with no restrictions (same Instance ID, same Values Type, etc.).
- Define an Instance ID for each entry.
So, in the binary file, we can imagine the following representation [1]:
KEY-A | Instance ID 0 | Values Type 1 | Value A.1.1, Value A.1.2, Value A.1.3, Value A.1.n, ...
KEY-A | Instance ID 1 | Values Type 2 | Value A.2.1, Value A.2.2, Value A.2.n, ...
KEY-B | Instance ID 0 | Values Type 1 | Value B.1.1, Value B.1.2, Value B.1.3, Value B.1.n, ...
etc...
And in code we can imagine the following structure [2]:
{
  "KEY-A": [
    (0, [Value A.1.1, Value A.1.2, Value A.1.3, Value A.1.n, ...]),
    (1, [Value A.2.1, Value A.2.2, Value A.2.n, ...]),
  ],
  "KEY-B": [
    (0, [Value B.1.1, Value B.1.2, Value B.1.3, Value B.1.n, ...]),
  ],
}
This design makes it simple to structure data and associate it with instances. Because of its flexibility, it can be used with two main high-level data model patterns.
It is important to understand that the two patterns presented below only describe how to structure the data inside the GBKF format. They do not change the content of the payload, and both patterns can be used. Depending on the use case, one will be better optimized than the other.
[1] The binary format is much more complex; for the real representation, see the specification.
[2] Code implementations will probably use a dedicated object or structure to hold the data of each entry.
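As a rough illustration of the structure described above, the sketch below models the decoded content of a file as a mapping from key to a list of entries. The `Entry` class and the `add_entry` helper are hypothetical names introduced for this example; they are not part of any GBKF library, and the real binary layout is defined by the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical container for one keyed entry (not a GBKF library type)."""
    instance_id: int
    values: list = field(default_factory=list)


def add_entry(store: dict, key: str, instance_id: int, values: list) -> None:
    """Append an entry under `key`; repeated keys and instance IDs are allowed."""
    store.setdefault(key, []).append(Entry(instance_id, list(values)))


store: dict = {}
add_entry(store, "KEY-A", 0, [1, 2, 3])
add_entry(store, "KEY-A", 1, [1.5, 2.5])
add_entry(store, "KEY-B", 0, [10, 20, 30])
# The same key and Instance ID may appear more than once.
add_entry(store, "KEY-A", 0, [4, 5])
```

Because nothing prevents a key and Instance ID pair from repeating, the sketch keeps a list of entries per key rather than a nested mapping keyed by Instance ID.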
GBKF Single-Instance pattern
This pattern does not use the Instance ID field to identify instances, but rather to group similar objects and identify their properties.
The main benefit of this pattern is that it reduces the number of entries (key, instance, values) when there is a large number of instances.
Simple Model Example
Let's imagine we have software that stores the following data about dogs:
- An ID (Integer)
- A name (String)
- A birth date (Integer)
Such data may be stored in a database like the following:
DOG TABLE
CID | Name  | B. Date
-------------------------
0   | Roko  | 1576108800
1   | Trump | -743212800
2   | Alfie | 1701475200
3   | Milo  | 1609632000

With this pattern, there are mainly two solutions, presented below.
Solution 1: Use the Key to differentiate the properties
{
  "DOG-I": [
    (0, [ID 1, ID 2, ID 3, ID 4, ...]),
  ],
  "DOG-N": [
    (0, [Name 1, Name 2, Name 3, ...]),
  ],
  "DOG-B": [
    (0, [Date 1, Date 2, Date 3, ...]),
  ],
}
Note that the key combines the animal and the member, and the Instance ID is not used. The Instance ID can still be used to create sub-groups (e.g. to chunk the data).
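A minimal sketch of Solution 1, assuming the rows of the DOG TABLE example are available as Python tuples. The function name `to_keyed_columns` is hypothetical; the key names follow the example above.

```python
# Rows from the DOG TABLE example: (id, name, birth date).
dogs = [
    (0, "Roko", 1576108800),
    (1, "Trump", -743212800),
    (2, "Alfie", 1701475200),
    (3, "Milo", 1609632000),
]


def to_keyed_columns(rows):
    """Solution 1 sketch: one key per property, Instance ID left at 0.

    Assumes `rows` contains at least one (id, name, birth date) tuple.
    """
    ids, names, dates = (list(column) for column in zip(*rows))
    return {
        "DOG-I": [(0, ids)],
        "DOG-N": [(0, names)],
        "DOG-B": [(0, dates)],
    }


layout = to_keyed_columns(dogs)
```

Each property becomes a single entry, so the number of entries depends on the number of properties and not on the number of dogs.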
Solution 2: Use the Instance ID to differentiate the properties
{
  "DOG": [
    (0, [ID 1, ID 2, ID 3, ID 4, ...]),
    (1, [Name 1, Name 2, Name 3, ...]),
    (2, [Date 1, Date 2, Date 3, ...]),
  ],
}
Note that the Key is the same for all the members, and the differentiation is done with the Instance ID: value 0 for the ID, value 1 for the Name, value 2 for the Date, etc.
This solution is the preferred one:
- It cleanly separates the object type from its properties, so the key does not need to be parsed.
- It is more data-efficient: storing a key consumes many more bytes than storing an Instance ID.
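Solution 2 could be sketched as follows, assuming the same dog rows as Python tuples. The constants `DOG_ID`, `DOG_NAME`, `DOG_BIRTH_DATE` and both function names are hypothetical; only the key "DOG" and the Instance ID values 0, 1 and 2 come from the example above.

```python
# Hypothetical constants naming the Instance ID used for each property.
DOG_ID, DOG_NAME, DOG_BIRTH_DATE = 0, 1, 2

dogs = [
    (0, "Roko", 1576108800),
    (1, "Trump", -743212800),
    (2, "Alfie", 1701475200),
    (3, "Milo", 1609632000),
]


def to_instance_columns(rows):
    """Solution 2 sketch: a single key, one Instance ID per property."""
    ids, names, dates = (list(column) for column in zip(*rows))
    return {"DOG": [(DOG_ID, ids), (DOG_NAME, names), (DOG_BIRTH_DATE, dates)]}


def to_rows(layout):
    """Rebuild the rows by looking up each property by its Instance ID."""
    columns = dict(layout["DOG"])
    return list(zip(columns[DOG_ID], columns[DOG_NAME], columns[DOG_BIRTH_DATE]))


layout = to_instance_columns(dogs)
restored = to_rows(layout)
```

Reading the data back only needs an integer lookup on the Instance ID, which is the "no key parsing" advantage mentioned above.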
Nullable fields
In low-level languages there are no "null" values: an Integer must always be an Integer, a String can be empty but must always be a String, etc.
This means that if your data model or database accepts null values, you must create a separate, dedicated entry to store that information.
The most efficient option is a boolean entry, since it only uses 1 bit per value. We can adapt the previous example:
{
  "DOG": [
    (0, [ID 1, ID 2, ID 3, ID 4, ...]),
    (1, [Name 1, Name 2, Name 3, ...]),
    (2, [false, true, false, ...]),
  ],
}
Notice that here Instance ID = 1 is used to store the name, and Instance ID = 2 to store whether the name is null or not.
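The idea can be sketched as below: a nullable column is split into a column of plain values plus a column of boolean flags. The helper names `split_nullable` and `pack_bits` are hypothetical, as is the use of an empty string as placeholder. The bit order in `pack_bits` is only an assumption to illustrate how one bit per value is possible; the actual boolean encoding is defined by the GBKF specification.

```python
def split_nullable(values, placeholder=""):
    """Return (filled values, null flags) for a column that may contain None."""
    filled = [placeholder if v is None else v for v in values]
    is_null = [v is None for v in values]
    return filled, is_null


def pack_bits(flags):
    """Pack booleans 8 per byte, first flag in the least significant bit.

    The bit order is an assumption made for this sketch only.
    """
    packed = bytearray((len(flags) + 7) // 8)
    for index, flag in enumerate(flags):
        if flag:
            packed[index // 8] |= 1 << (index % 8)
    return bytes(packed)


names = ["Roko", None, "Alfie", "Milo"]
filled, is_null = split_nullable(names)

# Instance ID 1 holds the names, Instance ID 2 holds the "is null" flags.
layout = {"DOG": [(0, [0, 1, 2, 3]), (1, filled), (2, is_null)]}
```

With this layout, a reader checks the flag at the same index before trusting the value in the name column.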
GBKF Multi-Instance pattern
...