Using PlatformIO with a custom board based on STM32F427VGT6 MCU

There may be some optimizations going on. If the pointer pRCC_CR etc. is not declared as volatile (i.e. “this value can change without notice, it should always be re-read”), then the compiler might optimize this while loop to

  1. Read the CR register once, check if the PLL ready bit is set
  2. If not, hang forever
  3. If it is, keep going

It does so because it assumes that the value, once read, does not change, so there’s no need to recheck it. This is obviously not true when reading from a hardware register that the hardware can change at any time.
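As a minimal sketch of the difference (not the project's code): the helper name wait_for_bits and its max_polls parameter are made up for illustration. Because the pointer parameter points to volatile data, the compiler has to emit a fresh read on every pass, and the bounded poll count means a bit that never gets set produces an error return instead of a hang.

```c
#include <stdint.h>

#define RCC_CR_PLLRDY (1u << 25) /* PLL ready flag in RCC_CR on STM32F4 */

/* Hypothetical helper: poll *reg until all bits in mask are set.
 * Returns 1 on success, 0 if max_polls reads passed without success.
 * The volatile qualifier forces one real load per loop iteration. */
static int wait_for_bits(volatile uint32_t *reg, uint32_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*reg & mask) == mask) {
            return 1;
        }
    }
    return 0;
}
```

On the target, reg would be the RCC_CR address; on a PC any uint32_t variable can stand in for it, which makes the helper easy to try out.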

Can you add this to your platformio.ini:

debug_build_flags = -O0 -g3 -ggdb3 

(docs) The -O0 part disables all optimizations.

Of course, in the debugging sidebar, there should also be a “Peripherals” view. You can expand that and check in the “RCC” peripheral (reset and clock control) yourself what value the CR register is set to, and whether the PLL ready bit really never gets set.

I made that change and that produced significant differences. The app gets past the startup configuration and into main(). The main loop is executing.

Now I can spend some time analyzing deeper functionality of the app with the board connected to various external hardware.

I switched back and forth with that change to debug_build_flags (just to ensure I hadn’t made a mistake like I did earlier with the debug menu setting), and the app behavior changes as expected.

What an interesting discovery this is. I wonder if the original programmer had ever tested the app using the internal oscillator? I’ll have to discuss this with him.

Do you have any more thoughts regarding what the problem with trying to use an external oscillator might be?

Once again, thank you! I’m learning a lot through this process. :books: :man_student:

If your board has a crystal oscillator fitted (usually a shiny aluminium oval thing with e.g. “8.000” (MHz) written on it), then you can probably reset back to the original sInternalOsc = 0 to use that proper oscillator. The quartz will be a more frequency-stable clock source, and the PLL can also boost it to a higher frequency than with the internal oscillator. The problem seems to have been all the

etc variables which should have been volatile U32 *const instead. (A constant memory address at which a volatile 32-bit unsigned integer can be found).
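To make that declaration concrete, here is a small sketch under stated assumptions: U32 is assumed to be a typedef for a 32-bit unsigned integer, enable_hse is a hypothetical function name, and a plain variable stands in for the register so the snippet also runs on a PC. The target address in the comment is taken from the STM32F4 memory map as I remember it, so check it against the reference manual.

```c
#include <stdint.h>

typedef uint32_t U32; /* assumed to match the project's U32 typedef */

#define RCC_CR_HSEON (1u << 16) /* HSE oscillator enable bit in RCC_CR */

/* On the target this would point at the real register, e.g.
 *   static volatile U32 *const pRCC_CR = (volatile U32 *)0x40023800u;
 * Here a plain variable stands in for it (assumption for illustration). */
static U32 fake_rcc_cr;

/* Read right to left: a constant pointer to a volatile U32.
 * The address never changes; the value behind it may change at any time. */
static volatile U32 *const pRCC_CR = &fake_rcc_cr;

static void enable_hse(void)
{
    *pRCC_CR |= RCC_CR_HSEON; /* load, OR and store are all really emitted */
}
```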

You should also note that debug_build_flags is only applied when debugging. If you want to always build with -O0 instead of the PlatformIO default of -Os (optimize for size), then you have to instead write (docs, docs)

; Remove default -Os, compile with -O0 instead
build_flags = -O0
build_unflags = -Os

It is of course concerning that the firmware code only works when compiler optimizations are disabled. Correctly written code should always work, regardless of compiler optimization level. The code you’re working with seems very buggy, or to have only ever been tested in a project that had -O0.

I’ve also seen that this very often happens when people port their projects from e.g. the IAR compiler to the GCC compiler. The IAR compiler recognizes these peripheral write / wait loops and does not optimize them away. Thus, although the programmer has written code that does not conform to what is actually wanted, the program works, and they never notice any problems.

Also be aware of this for delay loops like

for (int i = 0; i < 9999999; i++) {
  ; // do nothing
}

IAR will see this and faithfully do 9999999 iterations of doing nothing, while GCC under any optimization level greater than 0 will optimize this entire loop away since it has no effect. Thus, the loop creates no delay, and then things can break very subtly if a certain delay is needed at some points (e.g., waiting after sending a character, or waiting a few microseconds before doing something else).

Again, the correct way would be

for (volatile int i = 0; i < 9999999; i++) {
  ; // do nothing
}

to disable this optimization, forcing the compiler into actually executing this loop (and incrementing i). Finely tuned delay loops like this might also need an adjustment of their upper value (9999999 in this case), since IAR and GCC might translate the code slightly differently, causing a different number of instructions to be executed.

See:

Note: Even the copy loops are missing volatile in their pointers, so the compiler might optimize them away. Important to know if you want to make this code fully compliant for higher optimization levels.

I believe these issues are all due to the fact that I’m porting the project from VisualGDB. I’ve worked with the original programmer for many years and his work has always been exemplary.

Thanks for these tips; I’m going to scour the code, looking for all of the things you’ve pointed out here.

:+1: :star::star::star::star::star:

@maxgerhardt, I thought you might be interested in a few observations that I made while analyzing the optimization issue in a bit more detail.

I used the original setting: debug_build_flags = -Os -ggdb2 -g2

At the first breakpoint the value in RCC is 0x5D83 (23939) (how can I make the debugger display hex?)

Then after a single step to the next line the value in RCC has changed to 0x35D83 (220547). Clearly two bits have been set, namely the bits defined by:

However, at this point if I push the RUN button, the behavior is that of an infinite loop.

Curious about this, I made the following change and repeated the same test:

This time the code on line 231 never executed, and the infinite loop was gone. Apparently the crystal oscillator circuit was enabled before the while expression was evaluated (as it must have been previously since the indicator bit was set then also). But this time the while expression WAS evaluated, whereas it must not have been during the previous test.

And so this is exactly the kind of thing you advised me to be on the lookout for. What’s most disturbing to me is that the while expression would not be evaluated but the loop would be – that seems like a broken optimization to me.

You would need to look at the generated assembly to see whether the compiler emitted a fresh read instruction after it bitwise-ORed the value with RCC_CR_HSEON, or skipped that read entirely.

Since *pRCC_CR |= RCC_CR_HSEON; involves

  1. loading the current value of *pRCC_CR from the pointer
  2. ORing a constant onto it
  3. writing back the new value to the pointer

the next check, !(*pRCC_CR & RCC_CR_HSERDY), technically does not need to re-read the value from the pointer: We just read it and know its value because we just wrote it out again. The compiler can very aggressively optimize it out, and legally so, due to the way the pointer is declared.

This is what happens when the pointer is not marked with volatile.
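The three steps above can be sketched as follows. This is only an illustration with made-up function names (set_hseon, hse_ready) and a caller-supplied stand-in for the register, not the project's code. It writes the |= out by hand so that the locally known value and the later fresh read are visible as separate things.

```c
#include <stdint.h>

#define RCC_CR_HSEON  (1u << 16) /* HSE oscillator enable */
#define RCC_CR_HSERDY (1u << 17) /* HSE oscillator ready (set by hardware) */

/* Hypothetical: what `*reg |= RCC_CR_HSEON;` amounts to, step by step. */
static uint32_t set_hseon(volatile uint32_t *reg)
{
    uint32_t value = *reg;  /* 1. load the current register value */
    value |= RCC_CR_HSEON;  /* 2. OR the constant onto it */
    *reg = value;           /* 3. write the new value back */
    return value;           /* the value the compiler "already knows" */
}

/* Hypothetical: because reg points to volatile data, this is a new load
 * from the register, not a reuse of the value written a moment ago. */
static int hse_ready(volatile uint32_t *reg)
{
    return (*reg & RCC_CR_HSERDY) != 0u;
}
```

Without volatile, the compiler is allowed to answer the ready check from the value returned in step 3, where the ready bit is still clear, which matches the endless loop described earlier in the thread.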

But really, I would not expend much time and energy looking at how the compiler translated broken code on different optimization levels. It needs to be fixed.

Technically, VisualGDB is the IDE, the only thing that matters is what compiler that IDE invoked to compile the code. And reading https://visualgdb.com/, it supports GCC, Keil and IAR.

If VisualGDB was using GCC for the project, as does PlatformIO, then both these IDEs can be made to compile the firmware code in the exact same way, by massaging the compiler settings (macros, optimization levels, etc.) to match to the other exactly (build_flags to set additional flags, build_unflags to unset wrong defaults).

That’s why it’s important to do a “verbose build” in the original IDE, noting every compiler invocation with their exact settings / flags. That can then be compared to the compiler invocations done by PlatformIO (project tasks → Advanced → Verbose Build), and build_flags/unflags can be used to massage PlatformIO into using the same flags as the original IDE. Then, if the compiler versions etc. are also the same of course, the exact same build should fall out.

It really only gets complicated when the project is being ported from one compiler to the other.