Combining data in as few bytes as possible

michel.de.meester · July 27, 2023, 8:43am

Hello,

Sorry for writing in “bad” English, I live in the Flemish-speaking part of Belgium.

I’m looking for an algorithm to compress data into as few bytes as possible to send them by LoRaWAN.
Of course, I also have to reconstruct the original data.

The data I have to send consists of:

3 temperatures ranging from -25.0 °C to 99.0 °C (with .1 decimal precision)
1 temperature ranging from -25 °C to 999 °C (with no decimals)
4 voltages ranging from 0.0 to 25.0 volt (with .1 decimal precision)
8 switch states (can be combined in 1 byte)

The code to retrieve these values is already written and uses the following variables:

float temp1
float temp2
float temp3
int16_t temp4

float volt1
float volt2
float volt3
float volt4

int8_t switch

What is the best practice to send these values in as few bytes as possible? (I think the LoRaWAN payload is max 12 bytes.)

I was thinking:

multiplying the voltage floats by 10 to get rid of the decimal part, so they fit in an int8_t
adding 100 to the temp floats to get rid of the negative sign, then multiplying by 100 to fit in an int16_t (2 bytes)

Does anyone have better ideas?

Thanks in advance,
Michel

maxgerhardt · July 27, 2023, 9:45am

No, the LoRaWAN max user payload size depends on the Spreading Factor (SF). E.g., even in the worst possible SF 12 in EU868, the max payload size is still 51 bytes (source).

int8_t has a range of -128 to +127. If you have a temperature range of -25.0 to 99.0 with .1 decimal precision, multiplying by 10 would give the integers -250 to 990 (representing -25.0 and 99.0 respectively). Neither -250 nor 990 fits in an int8_t anymore. In fact, for that temperature range and precision, you have 1 + (99 - (-25)) * 10 = 1 + 1240 = 1241 possible values (+1 for the representation of 0). The number of bits needed to store that many possible values is ceil(log_2(num_values)), and ceil(log_2(1241)) = ceil(10.28) = 11 (bits). That is not a standard size (the next one would be 16 bits), but you can still write an algorithm that converts the given temperature to an 11-bit bitstring and writes it into a continuous series of bits. Meaning there’s something like
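The bit-count calculation above can also be done without floating-point math. This is only a minimal sketch; the function name bitsNeeded is made up for illustration:

```cpp
#include <cstdint>

// Hypothetical helper: number of bits needed to distinguish `numValues`
// different values, i.e. ceil(log2(numValues)), using integer math only.
unsigned bitsNeeded(uint32_t numValues) {
    unsigned bits = 0;
    // Widen the field until 2^bits covers every value.
    while ((1ULL << bits) < numValues) {
        ++bits;
    }
    return bits;
}
```

For the 1241 temperature values this gives 11, and for the 251 voltage values (0.0 to 25.0 in 0.1 steps) it gives 8.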

  payload byte 1       payload byte 2                      payload byte 3
[t1 (high 8 bits)] [t1 (low 3 bits) t2 (high 5 bits)] [t2 (low 6 bits) t3 (high 2 bits)]

Where t1, t2, t3 are the 11-bit encoded temperatures, with 0 encoding “-25.0”, 1 encoding “-24.9”, 1240 encoding “99.0” et cetera. Basically: EncodedValue = (InputTemperature + 25) * 10, InputTemperature ∈ [-25.0, 99.0], DecodedTemperature = (EncodedValue / 10.0) - 25.
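As a rough sketch of that offset-and-scale mapping in C++ (the function names encodeTemp and decodeTemp are made up for illustration, and the clamping and rounding are my own additions, not something the formula requires):

```cpp
#include <cmath>
#include <cstdint>

// Maps -25.0 .. 99.0 °C in 0.1 °C steps to the integers 0 .. 1240.
uint16_t encodeTemp(float celsius) {
    // Assumption: clamp out-of-range readings so they cannot overflow 11 bits.
    if (celsius < -25.0f) celsius = -25.0f;
    if (celsius > 99.0f) celsius = 99.0f;
    // Round to the nearest step; a plain cast would truncate and could
    // lose one step to floating-point error.
    return static_cast<uint16_t>(std::lround((celsius + 25.0f) * 10.0f));
}

// Inverse mapping, to be run on the receiving side.
float decodeTemp(uint16_t encoded) {
    return encoded / 10.0f - 25.0f;
}
```

On a microcontroller toolchain without <cmath>, math.h and lroundf should do the same job.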

I’ll assume that’s the most efficient constant-width bit-packing mechanism.

So, the given data could be encoded as:

3 x 11 bits (encoding above)
1 x 11 bits (encoding similar to the above; -25 to 999 is 1025 values, one too many for 10 bits)
4 x 8 bits (easy: multiply the voltage value by 10, cast to uint8_t)
1 x 8 bits (switch)

Total: 11 bytes needed. (10.5 bytes actually, but you can only work in full bytes in the payload)
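To make the bit-packing idea concrete, here is a minimal sketch of a writer and reader that place fields MSB-first into an 11-byte buffer. All names (BitWriter, BitReader, PackedReadings, packPayload, unpackPayload) are hypothetical, and the inputs are assumed to be already offset and scaled to unsigned integers. Note that the -25 to 999 °C range has 1025 distinct values, so this sketch gives the fourth temperature 11 bits as well (84 bits in total, with the last 4 bits left as padding).

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kPayloadSize = 11;

// Already-encoded values: temperatures as offset integers, voltages * 10.
struct PackedReadings {
    uint16_t t[3];     // 0 .. 1240, 11 bits each
    uint16_t t4;       // 0 .. 1024, 11 bits
    uint8_t v[4];      // 0 .. 250, 8 bits each
    uint8_t switches;  // 8 switch states, 1 bit each
};

// Appends values MSB-first; the buffer must be zeroed beforehand.
struct BitWriter {
    uint8_t* buf;
    size_t bitPos;
    void write(uint32_t value, unsigned width) {
        for (unsigned i = width; i-- > 0;) {
            if ((value >> i) & 1u) {
                buf[bitPos / 8] |= static_cast<uint8_t>(0x80u >> (bitPos % 8));
            }
            ++bitPos;
        }
    }
};

// Reads values back in the same MSB-first order.
struct BitReader {
    const uint8_t* buf;
    size_t bitPos;
    uint32_t read(unsigned width) {
        uint32_t value = 0;
        for (unsigned i = 0; i < width; ++i) {
            value = (value << 1) | ((buf[bitPos / 8] >> (7 - bitPos % 8)) & 1u);
            ++bitPos;
        }
        return value;
    }
};

void packPayload(const PackedReadings& r, uint8_t out[kPayloadSize]) {
    for (size_t i = 0; i < kPayloadSize; ++i) out[i] = 0;
    BitWriter w{out, 0};
    for (int i = 0; i < 3; ++i) w.write(r.t[i], 11);
    w.write(r.t4, 11);
    for (int i = 0; i < 4; ++i) w.write(r.v[i], 8);
    w.write(r.switches, 8);
}

PackedReadings unpackPayload(const uint8_t in[kPayloadSize]) {
    BitReader rd{in, 0};
    PackedReadings r;
    for (int i = 0; i < 3; ++i) r.t[i] = static_cast<uint16_t>(rd.read(11));
    r.t4 = static_cast<uint16_t>(rd.read(11));
    for (int i = 0; i < 4; ++i) r.v[i] = static_cast<uint8_t>(rd.read(8));
    r.switches = static_cast<uint8_t>(rd.read(8));
    return r;
}
```

The bit-by-bit loop is slow but easy to verify; on the receiving side the same field order and widths have to be used to get the values back.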

But, since your initial assumption on LoRaWAN payload sizes is wrong, you would also have no problem storing them even as full 32-bit floats:

3 x 32 bits
1 x 32 bits
4 x 32 bits
1 x 8 bits

Total: 33 bytes

And if you decided to use 16-bit integers (just leaving the unused uppermost bits as 0) for the temperatures that need more than 8 but fewer than 16 bits:

3 x 16 bits
1 x 16 bits
4 x 8 bits
1 x 8 bits

Total: 13 bytes.

  payload byte 1    payload byte 2   payload byte 3     payload byte 4
[t1 (high 8 bits)] [t1(low 8 bits)] [t2 (high 8 bits)] [t2 (low 8 bits)]

Each of these encodings easily fits in the worst-case payload size for LoRaWAN.
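For the byte-aligned 13-byte variant, a sketch could look like the following. The names put16, get16 and packSimple are hypothetical, and the high-byte-first order is just the one drawn in the layout above; the temperatures are again assumed to be already converted to unsigned 16-bit integers and the voltages to value * 10.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kSimplePayloadSize = 13;

// Writes a 16-bit value high byte first and returns the next free index.
size_t put16(uint8_t* out, size_t pos, uint16_t value) {
    out[pos] = static_cast<uint8_t>(value >> 8);
    out[pos + 1] = static_cast<uint8_t>(value & 0xFF);
    return pos + 2;
}

// Reads a 16-bit value stored high byte first.
uint16_t get16(const uint8_t* in, size_t pos) {
    return static_cast<uint16_t>((in[pos] << 8) | in[pos + 1]);
}

// Layout: 4 temperatures (2 bytes each), 4 voltages (1 byte each), 1 switch byte.
void packSimple(const uint16_t temps[4], const uint8_t volts[4],
                uint8_t switches, uint8_t out[kSimplePayloadSize]) {
    size_t pos = 0;
    for (int i = 0; i < 4; ++i) pos = put16(out, pos, temps[i]);
    for (int i = 0; i < 4; ++i) out[pos++] = volts[i];
    out[pos] = switches;
}
```

This costs 2 bytes more than the bit-packed version but needs no bit shifting across byte boundaries, which makes the decoder on the server side trivial.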

Sidenote: There are also variable-length encodings of data, see e.g., COBS, CBOR.