ESP32 WDT Timeout on Core 0

Hello,

I have a project which utilizes both the BLE and WiFi stacks, where the WiFi operations are restricted solely to the occasional use of AP mode for large data transfers to a mobile application. I have configured the project to pin both stacks to core 0, leaving my application code to execute on core 1.

The mobile app typically communicates with our device via BLE, but can request the activation of WiFi AP mode when a larger amount of data (~250 KB) needs to be transferred via the web-server. We are utilizing UART2 to receive the data from a second micro-controller in the system at a baud rate of 500 kbps. We transfer data to the mobile app (web-server client) in chunks of 3300 bytes. Although the secondary micro sends at 500 kbps, the transfer is not continuous; the average throughput is closer to 12K bytes per second.

On most occasions, the operations work fine. However, there are instances where we encounter a WDT Timeout on Core 0. Since this core is reserved for the “black box” code, it would appear there is an issue with how the stacks (and/or the interrupt handler for the UART) are managing their time.

Given the spotty occurrence of the problem, is there any advice for how we may be able to further diagnose the root cause of this issue?

Are there any modifications I can make to the Core 0 configuration which may help mitigate the problem?

Thanks in advance!
Mark

Seems like you have to yield to the watchdog task (see "wdt - Page 2" on the ESP32 Forum) or disable the watchdog (see "ESP32: a better way than vTaskDelay to get around watchdog crash?" in Programming Questions on the Arduino Forum). I’m sure the people at espressif/arduino-esp32 (the Arduino core for the ESP32 on GitHub) or ESP-IDF know the intrinsic technicalities regarding the watchdog in their frameworks :)

Hi, maxgerhardt,

Yes, since this is occurring on Core 0, I would presume something in the Espressif code is not yielding properly. I would prefer not to disable the WDT, as that could hide problems in poorly behaved code. I was not sure how much of the low-level code is actually Espressif's and how much wrapper code may have been developed by PlatformIO for the ESP32 package. I posted here first in an effort to glean some more insight and to find out whether the knowledgeable PlatformIO team had encountered issues like this.

Thanks for your comments; they do answer one question for me. I’ll have to pursue this on the Espressif forum!

Best Regards,
Mark

Just wanted to provide an update for anyone following this thread…

Following communications with Espressif, I learned this “Interrupt WDT timeout on Core 0” was specifically related to spending too much time in an interrupt handler. In my case, it turned out to be related to the UART handler.

Further diagnostics showed that one of the API calls for the Espressif web-server interface was blocking my code for up to 250 ms. During that time, my task was unable to service the software buffer for the UART at all. Once my 2 KB circular software buffer had reached its storage limit, subsequent calls to the interrupt handler could not empty the hardware FIFO, since there was nowhere left to store the data. Although the interrupt request was being cleared, the FIFO-full status persisted, so the interrupt fired again as soon as the handler returned and the handler ran back-to-back.
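For anyone sizing a similar buffer, the numbers above can be sanity-checked with a rough calculation. This is only a sketch: the function names are made up for illustration, and the 10 bits per byte figure assumes 8N1 framing, which the post does not state. At the average rate of about 12,000 bytes per second, a 250 ms stall accumulates roughly 3,000 bytes, which already exceeds a 2,048-byte buffer. At the full line rate (500,000 baud / 10 bits = 50,000 bytes per second) the same stall could accumulate up to 12,500 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Peak byte rate of a UART link. bits_per_frame is 10 for 8N1
 * (1 start + 8 data + 1 stop); that framing is an assumption here. */
uint32_t uart_bytes_per_second(uint32_t baud, uint32_t bits_per_frame)
{
    assert(bits_per_frame > 0u);
    return baud / bits_per_frame;
}

/* Bytes that pile up in the software buffer while the consumer
 * task is blocked for blocked_ms milliseconds (rounded up). */
uint32_t backlog_bytes(uint32_t bytes_per_second, uint32_t blocked_ms)
{
    uint64_t total = (uint64_t)bytes_per_second * blocked_ms;
    return (uint32_t)((total + 999u) / 1000u);
}
```

For example, backlog_bytes(12000, 250) gives 3000 and backlog_bytes(uart_bytes_per_second(500000, 10), 250) gives 12500, so the buffer needs to be sized somewhere between those figures plus headroom, depending on how bursty the sender really is.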

I had to modify my code to make the circular buffer large enough to ride through these long periods of blocking activity. Additionally, I modified the UART interrupt handler to flush the hardware FIFO and set a buffer overflow flag if a buffer overrun ever occurs again. Once these changes were made, the problems disappeared.
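A minimal sketch of what such a change might look like follows. It is not the actual code from this project: all names (ring_t, sim_fifo_t, uart_rx_isr_body and so on) and the 8 KB size are hypothetical, and the hardware FIFO is replaced by a plain struct so the logic can be compiled and tried on a PC. On a real ESP32 the loop would read from the UART peripheral instead, and the handler would need the usual ISR precautions. The key point is that the handler always empties the FIFO, even when the software buffer is full, so the interrupt condition cannot persist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8192u          /* hypothetical; must be a power of two */

typedef struct {
    uint8_t data[RING_SIZE];
    volatile size_t head;        /* advanced only by the ISR */
    volatile size_t tail;        /* advanced only by the consumer task */
    volatile bool overflow;      /* set by the ISR when bytes are dropped */
} ring_t;

/* Stand-in for the UART hardware RX FIFO (simulation only). */
typedef struct {
    uint8_t bytes[128];
    size_t count;
} sim_fifo_t;

void ring_init(ring_t *r)
{
    assert((RING_SIZE & (RING_SIZE - 1u)) == 0u);  /* index mask relies on this */
    r->head = 0;
    r->tail = 0;
    r->overflow = false;
}

size_t ring_used(const ring_t *r)
{
    return (r->head - r->tail) & (RING_SIZE - 1u);
}

/* Body of the RX interrupt handler: drain the FIFO completely.
 * If the ring is full, the byte is discarded and the overflow flag
 * is set, so the handler never returns with data left in the FIFO. */
void uart_rx_isr_body(ring_t *r, sim_fifo_t *fifo)
{
    for (size_t i = 0; i < fifo->count; i++) {
        if (ring_used(r) < RING_SIZE - 1u) {       /* one slot stays empty */
            r->data[r->head] = fifo->bytes[i];
            r->head = (r->head + 1u) & (RING_SIZE - 1u);
        } else {
            r->overflow = true;                    /* drop byte, keep draining */
        }
    }
    fifo->count = 0;                               /* FIFO is now empty */
}

/* Called from the application task: copy out up to max_len bytes. */
size_t ring_read(ring_t *r, uint8_t *dst, size_t max_len)
{
    size_t n = 0;
    while (n < max_len && r->tail != r->head) {
        dst[n++] = r->data[r->tail];
        r->tail = (r->tail + 1u) & (RING_SIZE - 1u);
    }
    return n;
}

/* Returns true once after an overrun, then clears the flag. */
bool ring_take_overflow(ring_t *r)
{
    bool was_set = r->overflow;
    r->overflow = false;
    return was_set;
}
```

The application task would call ring_read() whenever it is free and check ring_take_overflow() afterwards, so that it can discard or re-request a transfer that lost data rather than silently passing on a corrupted one.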

I hope this detail helps someone else encountering these issues…

Thanks for your help!

Thanks for this great information, I seem to have run into a similar problem. Is there any chance you could share at the code level the modifications you made?