Set storage format
The output storage format, see formats , depends on the sources. For example, if the incoming data is stored as Json, then the output will be Json as well. The same applies, when Avro is involved.
When using custom storage format, the output will be Json.
At times, it is required to control the resulting Key and/or Value storage. If the input is Json, for example, the output for the streaming computation can be set to Avro.
Another scenario involves Avro source(-s), and a result which projects the Key as a primitive type. Rather than using the Avro storage format to store the primitive, it might be required to use the actual primitive format.
Controlling the storage format can be done using the following syntax:
INSERT INTO <target> STORE KEY AS <format> VALUE AS <format> ...
There is no requirement to always set both the Key and the Value. Maybe only the Key or maybe only the Value needs to be changed. For example:
INSERT INTO <target> STORE KEY AS <format> ... // or INSERT INTO <target> STORE VALUE AS <format> ...
Considering a scenario where the input data is stored as Avro, and there is an aggregation on a field which yields an INT, to use the primitive INT storage and not the Avro INT storage set the Key format to INT:
INSERT INTO <target> STORE KEY AS INT SELECT TABLE SUM (amount) AS total FROM < source > GROUP BY CAST (merchantId AS int)
Here is an example for the scenario of having Json input source(-s), but an Avro stored output:
INSERT INTO <target> STORE KEY AS AVRO VALUE AS AVRO SELECT STREAM _key.cId AS _key.cId , CONCAT(_key.name, "!") AS _key.name , pId , CONCAT("!", name) AS name , surname , age FROM < source >
Changing the storage format is guarded by a set of rules. Following table describes how storage formats can be converted for the output.
|From \ To||INT||LONG||STRING||JSON||AVRO||XML||Custom/Protobuf|
|JSON||If the Json storage contains integer only||If the Json storage contains integer or long only||yes||=||yes||no||no|
|AVRO||If Avro storage contains integer only||If the Avro storage contains integer or long only||yes||yes||=||no||no|
|Custom (includes Protobuf)||no||no||no||yes||yes||no||no|
Time windowed formats follow similar rules to the ones described above with the additional constraint that Session Windows(SW) cannot be converted into Time Windows (TW) nor vice-versa.
|From \ To||SW[B]||TW[B]|
|SW[A]||yes if format A is compatible with format B||no|
|TW[A]||no||yes if format A is compatible with format B|
Example: Changing the storage format from TWAvro to TWJson is possible since they’re both TW formats and they Avro can be converted to Json.
Example: Changing the storage format from TWString to TWJson is not possible since, even thought they’re both TW formats, String formats can’t be written as JSON.