STM32 - infinite loop

Hello

  1. I have a PlatformIO (PIO) project that builds firmware based on the Arduino framework for the ATmega328P (Arduino Nano). Everything works as expected.

platformio.ini
[env:uno]
platform = atmelavr
board = uno
framework = arduino

  2. I also included a configuration for the Nucleo-32 L432KC.

platformio.ini
[env:nucleo_l432kc]
platform = ststm32
board = nucleo_l432kc
framework = arduino
board_build.core = arduino board
upload_protocol = jlink
debug_tool = jlink

When I want to debug the software, PIO builds and uploads it successfully with the embedded J-Link debugger. But the software does not run as expected. When I hit “pause” I see that the program hangs in “Infinite_Loop” in startup_stm32l432xx.s.
Call stack:
image

When I reset the controller with the onboard reset button the call stack looks like this:
image

The first strange thing: when I remove two functions from my software (which are not called when the controller starts up), debugging works. :sweat_smile: → memory problem?

  3. When I build the software for the STM32 without debugging capabilities, it also works as expected. :thinking:

I don’t have the deep knowledge needed to solve this issue, but it would be nice to understand what is going on.

Thanks in advance
Franz

The fact that __libc_init_array() is in the callstack heavily suggests that it’s crashing in the constructor code of an object you create (or have written yourself).

So what is the code for these two functions? Must both be removed for the startup to work?
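
For context: __libc_init_array() is the start-up routine that runs the constructors of global and static objects before main() is entered. A minimal sketch of that ordering follows; the Probe class, the flag, and the helper are made-up names for illustration, not code from the project.

```cpp
// Hypothetical flag, set by the constructor below.
static bool g_probe_constructed = false;

// Hypothetical class standing in for any global object (e.g. a sensor driver).
struct Probe {
    Probe() {
        // On a microcontroller this body runs from __libc_init_array(),
        // i.e. before main() / setup(). A fault in here shows up in the
        // start-up call stack rather than in application code.
        g_probe_constructed = true;
    }
};

// Global instance: constructed during start-up, not when it is first used.
static Probe g_probe;

// Reports whether the constructor has already run.
bool probe_was_constructed() {
    return g_probe_constructed;
}
```

By the time any code in main() calls probe_was_constructed(), the constructor has already executed, which is why a crash inside a global object's constructor appears before the application even starts.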

Ok. :thinking: But why does the software work when I only upload (without a debugging session)?

The two pieces of code I commented out are statements inside member functions of two classes. These member functions are not called during the start-up phase of the controller.
And yes, I must comment out both of them.

uint8_t OekofenHighTemp::GetTemperatureOekoFenHighTemperature(uint8_t IOPin, float *temperature)
{
    int adcValue = GetADCValue(IOPin);

    if (adcValue < 100)
    {
        return STATUS_SHORT_CIRCUIT;
    }
    else if (adcValue > 900)
    {
        return STATUS_NO_SENSOR_CONNECTED;
    }

    // Characteristic curve T = f(ADC), with 1800 Ohm series resistor (Ökofen)

    // T = -0.000000000001572*x^5 + 0.000000004214060*x^4 - 0.000004474485785*x^3 + 0.002375373617837*x^2 - 0.744827936068140*x + 174.465441288606000

    double a1 = 174.465441288606000;
    double a2 = -0.744827936068140;
    double a3 = 0.002375373617837;
    double a4 = -0.000004474485785;
    double a5 = 0.000000004214060;
    double a6 = -0.000000000001572;

    // I commented out this line and a second, similar line in another class; then debugging works
    //*temperature = ((float)a6 * pow(adcValue, 5) + a5 * pow(adcValue, 4) + a4 * pow(adcValue, 3) + a3 * pow(adcValue, 2) + a2 * pow(adcValue, 1) + a1);

    return STATUS_OK;
}
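
As a side note, the characteristic curve in the comment can also be evaluated without any pow() calls by using Horner's scheme. The sketch below is only an illustration of the formula: CurveHorner and CurvePow are hypothetical helper names, everything is computed in double (the commented-out line mixes in a float cast), and nothing in this thread establishes that this changes the fault.

```cpp
#include <cmath>

// Coefficients copied from the function above:
// index 0 is the constant term (a1), index 5 is the x^5 term (a6).
static const double kCoeff[6] = {
    174.465441288606000,
    -0.744827936068140,
    0.002375373617837,
    -0.000004474485785,
    0.000000004214060,
    -0.000000000001572,
};

// Hypothetical helper: evaluates the polynomial with Horner's scheme,
// ((((a6*x + a5)*x + a4)*x + a3)*x + a2)*x + a1, using only multiply/add.
double CurveHorner(int adcValue) {
    const double x = static_cast<double>(adcValue);
    double result = kCoeff[5];
    for (int i = 4; i >= 0; --i) {
        result = result * x + kCoeff[i];
    }
    return result;
}

// Reference version: the same polynomial written with pow(), term by term.
double CurvePow(int adcValue) {
    const double x = static_cast<double>(adcValue);
    double result = kCoeff[0];
    for (int i = 1; i <= 5; ++i) {
        result += kCoeff[i] * std::pow(x, i);
    }
    return result;
}
```

Both helpers compute the same curve; they differ only in rounding at a level far below the resolution of the ADC reading.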

Possible compiler optimization? Add

debug_build_flags = -Os -g -ggdb3

to the platformio.ini and debug again. Does it then run without crashing, as it does when you upload it normally?

Can you confirm this by setting a breakpoint in this function and starting debugging again? What’s the call stack for the first call of the function?

With your debug flags, I can debug the software :crazy_face:

With the debug flags, I can also confirm that these two functions are not called while in the start-up phase of the controller.

Background: this software reads values (temperature, …) from different pins only when I send a specific Modbus command.

The call stack for the build in which the two code statements are not commented out:

When I upload the software “normally” (with the compiler optimization), it has issues and doesn’t work 100%. I have an alive LED that blinks → main loop “OK”. But when I send a Modbus command, the alive LED stops.

Okay, two things:

Please add

extern "C" void HardFault_Handler() {
  while(1) ;
}

in some .cpp file, e.g., src/main.cpp, so that when a specific crash happens and you pause the software, it at least lands in this function as an indicator that a hardfault, and not some other exception, has happened.

Secondly, can you reproduce the crash where, when you halt it, the call stack looks like this:

And in that stacktrace, can you click on Serial2?
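
As a complement to the HardFault_Handler above: when a Cortex-M core takes an exception, the hardware pushes eight registers (R0 to R3, R12, LR, PC, xPSR) onto the active stack, and the stacked PC shows which instruction was executing. A minimal sketch of reading that frame, assuming you already have the stack pointer value from exception entry (getting it inside the handler needs a small assembly shim that is not shown here; StackedFrame and ReadStackedFrame are hypothetical names):

```cpp
#include <cstdint>

// Registers the Cortex-M hardware pushes on exception entry,
// listed in the order they appear in memory (lowest address first).
struct StackedFrame {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    // return address of the interrupted function
    uint32_t pc;    // address of the instruction that was interrupted
    uint32_t xpsr;  // program status register at the time of the exception
};

// Hypothetical helper: copies the eight stacked words starting at 'sp'.
// 'sp' must be the stack pointer value as it was on exception entry.
StackedFrame ReadStackedFrame(const uint32_t *sp) {
    StackedFrame frame;
    frame.r0 = sp[0];
    frame.r1 = sp[1];
    frame.r2 = sp[2];
    frame.r3 = sp[3];
    frame.r12 = sp[4];
    frame.lr = sp[5];
    frame.pc = sp[6];
    frame.xpsr = sp[7];
    return frame;
}
```

The same eight words can be inspected by hand from the GDB console by dumping memory at the stack pointer.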

  1. I commented out your debug flags and added your code snippet.

I started debugging (hitting F5); then the call stack looks like this:

It is not possible to get additional information when I click on “Serial2@…”; the entry is grayed out.

When I continue the program and press the reset button on the board, the call stack looks like this:

Personally, I would expect starting a program from a debug session to behave the same as a hard reset?

  2. With the debug flags

→ The breakpoint in HardFault_Handler is never hit

Just so that we’re clear, “with compiler optimization” is “debug_build_flags = -Os -g -ggdb3 line is active”?

This stacktrace is really interesting, because the “Serial@0x200…” is an address in RAM, and with that in the backtrace it looks like code was executed from there – although all code belonging to the hardware serial class should be in flash, which is starting at 0x08000000. I’m not sure how code execution could fall in there, especially since there is no element before “Serial2@…”. And _end is the end of allocated memory where the stack should start. It’s not making a lot of sense to me.
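
To make the address reasoning above concrete, here is a small sketch that sorts an address into the regions of the default ARMv7-M (Cortex-M3/M4) memory map. It uses only the architectural region boundaries, not the actual flash and RAM sizes of the L432KC, and Region and ClassifyAddress are hypothetical names.

```cpp
#include <cstdint>

// Regions of the default ARMv7-M memory map (each boundary is architectural).
enum class Region {
    Code,            // 0x00000000 - 0x1FFFFFFF (STM32 flash at 0x08000000 lives here)
    Sram,            // 0x20000000 - 0x3FFFFFFF (on-chip RAM)
    Peripheral,      // 0x40000000 - 0x5FFFFFFF
    ExternalRam,     // 0x60000000 - 0x9FFFFFFF
    ExternalDevice,  // 0xA0000000 - 0xDFFFFFFF
    System           // 0xE0000000 - 0xFFFFFFFF (includes the fault registers)
};

// Hypothetical helper: returns the architectural region an address falls in.
Region ClassifyAddress(uint32_t addr) {
    if (addr < 0x20000000u) return Region::Code;
    if (addr < 0x40000000u) return Region::Sram;
    if (addr < 0x60000000u) return Region::Peripheral;
    if (addr < 0xA0000000u) return Region::ExternalRam;
    if (addr < 0xE0000000u) return Region::ExternalDevice;
    return Region::System;
}
```

With this, a frame address starting with 0x200… classifies as Sram, while the HardFault_Handler address 0x0800510C classifies as Code, which is where function addresses of a flash-resident program are expected to be.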

One thing you can do when you are stopped in the hardfault handler above is to examine the Cortex-M4’s fault registers. The “Debug Console” tab in VSCode gives you access to the GDB console. Following the Arm Developer documentation, please execute the commands

print/x *(uint32_t *) 0xE000ED28
print/x *(uint32_t *) 0xE000ED2C
print/x *(uint32_t *) 0xE000ED30
print/x *(uint32_t *) 0xE000ED34
print/x *(uint32_t *) 0xE000ED38
print/x *(uint32_t *) 0xE000ED3C

What values does it show?

Correct.
Test (1) - “debug_build_flags = -Os -g -ggdb3” active
Test (2) - “debug_build_flags = -Os -g -ggdb3” inactive

Regarding your second question:

..Breakpoint reached @ address 0x0800510C
Reading all registers
Removing breakpoint @ address 0x0800510C, Size = 2
Removing breakpoint @ address 0x08006F68, Size = 2
Read 4 bytes @ address 0x0800510C (Data = 0x0000E7FE)

Breakpoint 2, HardFault_Handler () at src\main.cpp:79
79	  while(1) ;
Read 4 bytes @ address 0x2000FFEC (Data = 0x81000000)
Reading 64 bytes @ address 0x2000FFC0
WARNING: Failed to read memory @ address 0xFFFFFFFE
Read 4 bytes @ address 0x08008FFC (Data = 0xE7F23601)
Read 4 bytes @ address 0x08006FC6 (Data = 0xFFCEF7FF)
Reading 64 bytes @ address 0x08006F80
Read 4 bytes @ address 0x08006FCC (Data = 0x20010000)
Reading 64 bytes @ address 0x08008FC0
Reading 64 bytes @ address 0x08009000
print/x *(uint32_t *) 0xE000ED28
Read 4 bytes @ address 0xE000ED28 (Data = 0x00000001)
$8 = 0x1
{"token":42,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}
print/x *(uint32_t *) 0xE000ED2C
Read 4 bytes @ address 0xE000ED2C (Data = 0x40000000)
$9 = 0x40000000
{"token":44,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}
print/x *(uint32_t *) 0xE000ED30
Read 4 bytes @ address 0xE000ED30 (Data = 0x00000002)
$10 = 0x2
{"token":46,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}
print/x *(uint32_t *) 0xE000ED34
Read 4 bytes @ address 0xE000ED34 (Data = 0xE000EDF8)
$11 = 0xe000edf8
{"token":48,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}
print/x *(uint32_t *) 0xE000ED38
Read 4 bytes @ address 0xE000ED38 (Data = 0xE000EDF8)
$12 = 0xe000edf8
{"token":50,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}
print/x *(uint32_t *) 0xE000ED3C
Read 4 bytes @ address 0xE000ED3C (Data = 0x00000000)
$13 = 0x0
{"token":52,"outOfBandRecord":[],"resultRecords":{"resultClass":"done","results":[]}}

I see. Sadly, the Cortex-M4 system control block registers don’t give a lot of helpful information when decoded here.

CFSR = 0x1
     = IACCVIOL ("MPU or Execute Never (XN) default memory map access violation on an instruction fetch has occurred. The fault is signaled only if the instruction is issued.")
	 
HFSR = 0x40000000
     = FORCED ("Processor has escalated a configurable-priority exception to HardFault.")
	 
DFSR = 0x2
     = BKPT ("Indicates a debug event generated by BKPT instruction execution or a breakpoint match in FPB")
	
MMFAR = 0xE000EDF8 (read from 0xE000ED34)
BFAR = 0xE000EDF8 (read from 0xE000ED38)

All it’s saying is that it has an instruction access violation. (The memory management fault address register, MMFAR, is not valid because the MMARVALID bit in the CFSR is not set.)
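
The manual decoding above can be sketched as a small helper that turns raw CFSR and HFSR values into flag names. The bit positions are the ones from the Cortex-M4 documentation; the function and type names (DecodeCfsr, DecodeHfsr, FlagName) are made up for this example.

```cpp
#include <cstdint>
#include <string>

// One status bit and its documented name.
struct FlagName {
    uint32_t mask;
    const char *name;
};

// Returns the names of all bits set in 'value', separated by spaces.
static std::string JoinSetFlags(uint32_t value, const FlagName *table, int count) {
    std::string out;
    for (int i = 0; i < count; ++i) {
        if (value & table[i].mask) {
            if (!out.empty()) {
                out += " ";
            }
            out += table[i].name;
        }
    }
    return out;
}

// Configurable Fault Status Register (0xE000ED28):
// MemManage bits 0-7, BusFault bits 8-15, UsageFault bits 16-31.
std::string DecodeCfsr(uint32_t cfsr) {
    static const FlagName kFlags[] = {
        {1u << 0, "IACCVIOL"},   {1u << 1, "DACCVIOL"},    {1u << 3, "MUNSTKERR"},
        {1u << 4, "MSTKERR"},    {1u << 5, "MLSPERR"},     {1u << 7, "MMARVALID"},
        {1u << 8, "IBUSERR"},    {1u << 9, "PRECISERR"},   {1u << 10, "IMPRECISERR"},
        {1u << 11, "UNSTKERR"},  {1u << 12, "STKERR"},     {1u << 13, "LSPERR"},
        {1u << 15, "BFARVALID"}, {1u << 16, "UNDEFINSTR"}, {1u << 17, "INVSTATE"},
        {1u << 18, "INVPC"},     {1u << 19, "NOCP"},       {1u << 24, "UNALIGNED"},
        {1u << 25, "DIVBYZERO"},
    };
    return JoinSetFlags(cfsr, kFlags, sizeof(kFlags) / sizeof(kFlags[0]));
}

// HardFault Status Register (0xE000ED2C).
std::string DecodeHfsr(uint32_t hfsr) {
    static const FlagName kFlags[] = {
        {1u << 1, "VECTTBL"},
        {1u << 30, "FORCED"},
        {1u << 31, "DEBUGEVT"},
    };
    return JoinSetFlags(hfsr, kFlags, sizeof(kFlags) / sizeof(kFlags[0]));
}
```

Feeding in the values from the log, DecodeCfsr(0x1) yields the IACCVIOL flag and DecodeHfsr(0x40000000) yields FORCED, matching the decoding by hand.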

Do I understand correctly that the hardfault occurs only when the *temperature = ... line is active?

→ Correct.

Here is a screenshot of a build without your debug option and with the two “*temperature =” lines deactivated (the other one is not visible in the screenshot):

  • debugging works

I think you have a lot of experience and have seen a lot of issues. What is your feeling? Is my issue caused by the project source code or by something else?