Distributed build (with distcc)

Four years later, xkcd: Wisdom of the Ancients comes to mind. :slight_smile:

I’m struggling with terrible PlatformIO build times and have another machine with eight idle M1 cores.

The project I’m working on, http://nightdriverled.com, currently builds 39 different combinations for ESP32. The .bin files are about 1.5-1.8MB each, with only one hitting 2MB. So while “our” source code is about 1.2MB (src/, include/), there are about a dozen Arduino-ish libraries in lib_deps. I know our build is kind of goofy in including too much: instead of each of the 39 targets pulling in exactly the combination of Adafruit this and U8g2 that which it needs, it sometimes builds libraries and throws them away. But slicing it in high resolution in platformio.ini is awful, too.

So we wait for 39 copies of ArduinoJson (times a dozen for all the lib_deps) to be fetched from the network, then installed (which takes longer than the fetch), and then built. It takes my M1 about an hour and, of course, with 39 copies of everything, it’s not like the build cache is exactly helpful except when you’re rebuilding. There’s just not much you can do with 62GB that’s fast:

 du -hs .
 62G	.
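
To see how much of that is the same library unpacked once per environment, here is a rough sketch. It assumes PlatformIO's default `.pio/libdeps/<env>/<library>` layout; the function names are made up for illustration, and it only measures the duplication, it does not remove anything.

```python
import os
from collections import defaultdict
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total bytes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = Path(root) / name
            if not fp.is_symlink():
                total += fp.stat().st_size
    return total


def duplicated_libs(libdeps: Path) -> dict:
    """Map library name -> (number of envs holding a copy, total bytes of all copies).

    Assumes the layout <libdeps>/<env>/<library>/, which is an assumption
    about the default PlatformIO project tree.
    """
    if not libdeps.is_dir():
        return {}
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for env_dir in libdeps.iterdir():
        if not env_dir.is_dir():
            continue
        for lib_dir in env_dir.iterdir():
            if lib_dir.is_dir():
                counts[lib_dir.name] += 1
                sizes[lib_dir.name] += dir_size(lib_dir)
    return {name: (counts[name], sizes[name]) for name in counts}


if __name__ == "__main__":
    report = duplicated_libs(Path(".pio/libdeps"))
    # Largest total footprint first.
    for name, (copies, size) in sorted(report.items(), key=lambda kv: -kv[1][1]):
        print(f"{size / 1e6:10.1f} MB  {copies:3d} copies  {name}")
```

Run from the project root, it prints one line per library with the number of copies and their combined size, which should show how much of the tree is duplicated lib_deps as opposed to build output.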

Because of PlatformIO/SCons slowness in the dependency checking, and the slow “retrieving from cache” taking almost as long as compiling some files, even a “do nothing” build is 20+ seconds.

 $ time pio run -e mesmerizer
[ ... ] 
Environment    Status    Duration
-------------  --------  ------------
mesmerizer     SUCCESS   00:00:20.851
========================= 1 succeeded in 00:00:20.851 =========================
pio run -e mesmerizer  17.34s user 2.46s system 92% cpu 21.443 total

I know that distcc won’t help with that awful incremental build time but for that hour it spends fetching and generating that 62GB, enlisting that other computer would be awesome.

I also know I can probably find easier ways to speed it up once it’s all been fetched (and why do I need 39 copies of the JSON code anyway?), like generating 39 compile_commands.json files and feeding them to something that’s actually fast, like Ninja or CMake. But others have to be struggling with this awkward build system, and distcc would be some nice pain relief, as the individual builds should be extremely parallelizable.
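
As a sketch of the compile_commands.json idea: PlatformIO can emit a compilation database with `pio run -t compiledb -e <env>`, and the entries can be turned into a Ninja file mechanically. The helper names below are hypothetical, and the approach has real limits: it covers only the compile steps (no link, no esptool), tracks no header dependencies, and assumes each entry either has an `output` field or a separate `-o <file>` argument.

```python
import json
import shlex
from pathlib import Path


def ninja_escape(text: str) -> str:
    """Escape '$' for use inside a Ninja variable value."""
    return text.replace("$", "$$")


def ninja_path(text: str) -> str:
    """Escape a path for use on a Ninja 'build' line."""
    return text.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def output_of(entry: dict) -> str:
    """Object file for an entry: the 'output' field, else the argument after -o."""
    if "output" in entry:
        return entry["output"]
    args = entry.get("arguments") or shlex.split(entry["command"])
    return args[args.index("-o") + 1]  # raises ValueError if there is no bare -o


def to_ninja(entries: list) -> str:
    """Render compilation-database entries as the text of a build.ninja file."""
    lines = [
        "rule cc",
        "  command = cd $dir && $cmd",
        "  description = CC $out",
        "",
    ]
    for e in entries:
        cmd = e.get("command") or shlex.join(e["arguments"])
        out = str(Path(e["directory"]) / output_of(e))
        src = str(Path(e["directory"]) / e["file"])
        lines += [
            f"build {ninja_path(out)}: cc {ninja_path(src)}",
            f"  dir = {ninja_escape(shlex.quote(e['directory']))}",
            f"  cmd = {ninja_escape(cmd)}",
            "",
        ]
    return "\n".join(lines)


def convert(db_path: str, ninja_file: str = "build.ninja") -> None:
    """Read a compile_commands.json and write the Ninja file next to it."""
    entries = json.loads(Path(db_path).read_text())
    Path(ninja_file).write_text(to_ninja(entries))
```

Calling `convert("compile_commands.json")` and then running `ninja` would rebuild only the object files. Whether that is actually faster than SCons for this tree is something to measure, not something this sketch proves.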

I know our platformio.ini could handle these multiple envs better, but my attempts to restructure the library dependencies just really don’t scale out.

I know that even if distcc could cut the time spent waiting on compiles by 80% (realistic once everything is fetched and cached over the local network), our build would still be painful, but distcc would still be helpful chowing down on those 401 .o’s.
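
To put rough numbers on that: only the compile portion of the hour shrinks, so the payoff depends on how much of the hour is compiling versus fetching, installing and linking. That split has not been measured here, so the shares below are purely hypothetical placeholders; the 60 minutes and the 80% are the figures from above.

```python
def build_minutes(total_min: float, compile_fraction: float, compile_cut: float) -> float:
    """Estimated wall time when only the compile portion gets faster.

    total_min:        current full build time in minutes
    compile_fraction: share of that time spent compiling (0..1)
    compile_cut:      fraction of compile time removed by distributing (0..1)
    """
    if not (0 <= compile_fraction <= 1 and 0 <= compile_cut <= 1):
        raise ValueError("fractions must be between 0 and 1")
    compiling = total_min * compile_fraction
    everything_else = total_min - compiling  # fetch, install, link: unchanged
    return everything_else + compiling * (1 - compile_cut)


if __name__ == "__main__":
    for share in (0.25, 0.50, 0.75):  # hypothetical compile shares, not measured
        print(f"compile share {share:.0%}: {build_minutes(60, share, 0.80):.1f} min")
```

Timing one clean build with the libraries already fetched would give the real compile share to plug in.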

I could probably work it out in a Makefile, but between PlatformIO, SCons, ESP-IDF, and Arduino, there’s a lot going on that I don’t understand well before xtensa-blah-g++ finally gets called, and that’s the part we could distcc up.
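
One place to hook in without a Makefile might be a PlatformIO extra script, which runs inside SCons and can rewrite the compiler variables. The sketch below is untested against this project and rests on several assumptions: that it is loaded as a `post:` script (e.g. `extra_scripts = post:distcc_wrap.py`, a hypothetical filename) so the platform has already set `CC` and `CXX`, that distcc is installed and configured with its hosts, and that the remote machine has the same xtensa cross compiler on its PATH under the same name.

```python
def with_distcc(tool: str, launcher: str = "distcc") -> str:
    """Prefix a compiler command with the launcher, unless it already is."""
    if tool.split()[:1] == [launcher]:
        return tool
    return f"{launcher} {tool}"


try:
    # 'Import' is injected by SCons when PlatformIO loads this as an extra script.
    Import("env")  # noqa: F821
except NameError:
    env = None  # running as plain Python, outside PlatformIO

if env is not None:
    # Resolve the current compiler names (e.g. xtensa-...-gcc / xtensa-...-g++)
    # and route them through distcc. Assumption: SCons splits the resulting
    # string into a launcher plus compiler when it builds the command line.
    env.Replace(
        CC=with_distcc(env.subst("$CC")),
        CXX=with_distcc(env.subst("$CXX")),
    )
```

This would also need a higher job count (`pio run -j N`) to keep remote cores busy. If the link step is invoked through the same variables it would pass through distcc too, so that part deserves a check before trusting the result.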

So, has anyone successfully worked out distcc with PlatformIO? Barring that, is anyone able to pull that tree and help us make it play nicer with PlatformIO? Speeding up our “do almost nothing” builds would also be welcome.

 rm ./.pio/build/mesmerizer/src/main.cpp.o
➜  nightdriverstrip git:(fixincs) ✗ time pio run -e mesmerizer
[ ... ] 
Retrieved `.pio/build/mesmerizer/src/main.cpp.o' from cache
[ ... ] 
Environment    Status    Duration
-------------  --------  ------------
mesmerizer     SUCCESS   00:00:22.142
========================= 1 succeeded in 00:00:22.142 =========================
pio run -e mesmerizer  18.40s user 2.50s system 91% cpu 22.854 total

That’s 22 seconds to find the dependencies, copy a file from a cache to the workspace (why not a symlink?), link, then run esptool to glom the ELF to a bin. So distcc couldn’t help that case at all, but that’s still about 20 seconds slower than it should be, IMO.

How are the rest of you getting on with small, but not trivial, programs like this? We’re only about 20KLOC plus some libraries so we shouldn’t exactly be a stressful combination.