Clarification for Embedding Binary Data

Currently I have a project with a bunch of HTML files that need to be compressed into a gzip byte array and used by the code (to serve as a web response).

As of now I have a pre-build script which runs a PowerShell script that does the actual gzip compression, generates the byte array in a C++-compliant format and updates a header file.

I have now come across Embedding Binary Data in the documentation. I have a few questions:

  1. The example in the documentation declares two array variables for each file, _start and _end. How do I use them? The example code linked there uses only _start. What is the second array for, and what does it contain?
extern const uint8_t aws_root_ca_pem_start[] asm("_binary_src_aws_root_ca_pem_start");
extern const uint8_t aws_root_ca_pem_end[] asm("_binary_src_aws_root_ca_pem_end");
  2. Reading the build script (/builder/frameworks/_embed_files.py), I can’t make out how the actual embedding happens. Does it offer any optimization over my current approach of embedding the binary data as a byte array variable in a header file?

After thinking about it some more, I think _start and _end are better understood as pointers (an array name decays to a pointer to its first element), and to find the length of the binary blob we need both _start and _end.

Correct. The symbols mark the addresses of the start and the end of the embedded data, as pointers. As such, the second _end array contains nothing per se; it’s a marker for the end of the data.
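To make the length calculation concrete, here is a minimal sketch. The helper name blob_length and the demo_* stand-in data are hypothetical and only exist so the snippet builds on a PC; on the target you would pass the two linker-provided symbols instead.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// On the real target the two markers come from the linker, as in the question:
//   extern const uint8_t aws_root_ca_pem_start[] asm("_binary_src_aws_root_ca_pem_start");
//   extern const uint8_t aws_root_ca_pem_end[]   asm("_binary_src_aws_root_ca_pem_end");

// Hypothetical helper: number of bytes between the start and end markers.
inline size_t blob_length(const uint8_t *start, const uint8_t *end) {
    assert(end >= start);  // the end marker must not precede the start marker
    return static_cast<size_t>(end - start);
}

// Stand-in blob (assumption) so the sketch compiles without the linker symbols.
static const uint8_t demo_blob[] = {0x1f, 0x8b, 0x08, 0x00};
static const uint8_t *const demo_start = demo_blob;
static const uint8_t *const demo_end = demo_blob + sizeof(demo_blob);
```

With the real symbols the call would be blob_length(aws_root_ca_pem_start, aws_root_ca_pem_end).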

xtensa-esp32-elf-objcopy is called on the to-be-embedded file, which produces the given symbols and a .o file that is then linked into the final executable.

No, there is no optimization: it’s a binary copy of the input file, placed in a read-only section of flash. If you do the same in your header file with something like a const uint8_t[], the effect will be exactly the same: a copy of the binary content and a pointer to the start of it.
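For comparison, a rough sketch of what the header-file approach looks like. The names index_html_gz, index_html_gz_len and looks_like_gzip are hypothetical, and the bytes are just the opening bytes of a gzip stream rather than a real file. The one practical difference is that sizeof gives the length at compile time, so no end marker is needed.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical generated header content: the opening bytes of a gzip stream
// (magic 0x1f 0x8b, then 0x08 for the deflate method). A real file is longer.
static const uint8_t index_html_gz[] = {0x1f, 0x8b, 0x08, 0x00, 0x00};

// The size is known at compile time, so no _end symbol is required.
static const size_t index_html_gz_len = sizeof(index_html_gz);

// Hypothetical sanity check a handler might run before sending the blob
// with "Content-Encoding: gzip": does it start with the gzip magic bytes?
inline bool looks_like_gzip(const uint8_t *data, size_t len) {
    assert(data != nullptr || len == 0);  // a null pointer is only valid for an empty blob
    return len >= 2 && data[0] == 0x1f && data[1] == 0x8b;
}
```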


Thank you @maxgerhardt. Your answer makes the mechanism easier to understand. I think my current approach of generating the byte array in a pre-build step and adding it to a header file makes the static content easier to manage.