MK20DX64VLH7 Custom Board

atestani · February 4, 2022, 2:43pm

I have a strange problem with a custom board based on the MK20DX64VLH7, which is in the same MCU family as the part on a Teensy 3.2. The Teensy 3.2 MCU (MK20DX256VLH7) has 256K flash/64K RAM, and this one has 64K flash/16K RAM. Everything else is identical.

The problem is that the same test code will run fine on a 256K MCU but not on a 64K part. The issue is that if I have some String and integer statements in setup(), the 64K case “crashes” where the 256K part runs fine. If I comment out the String statement, it runs on the 64K MCU as well.

I have the custom board set up with a JTAG/SWD connector and am using a JLink programmer/debugger. I have loaded code from within PIO as well as directly with the JLink and get the same results. I have tried to use the debugger but have not been able to find the problem.

I suspect the problem is in my board json file and/or linker script but I have researched this a lot and can’t see what is wrong. These files are shown below. (Sorry… I can’t upload the files or figure out how to show them as code)

Any advice on what could be wrong would be greatly appreciated.

teensy3x_64.json:

{
  "build": {
    "core": "teensy3", 
    "cpu": "cortex-m4", 
    "extra_flags": "-D__MK20DX256__ -DTEENSY31", 
    "f_cpu": "72000000L", 
    "ldscript": "mk20dx64V.ld", 
    "mcu": "mk20dx64"
  }, 
  "debug": {
    "jlink_device": "MK20DX64xxx7"
  },
  "frameworks": [
    "arduino" 
  ], 
  "name": "teensy3x_64", 
  "upload": {
    "maximum_ram_size": 16384, 
    "maximum_size": 65536,
  "protocols": [
    "jlink"
    ],
   "protocol": "jlink"
  }, 
  "url": "", 
  "vendor": ""
}

mk20dx64V.ld: (standard linker script from teensy core with the following lines changed)

MEMORY
{
FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 64K
RAM (rwx) : ORIGIN = 0x1FFFE000, LENGTH = 16K <— is this correct?
}

maxgerhardt · February 4, 2022, 2:56pm

The reference manual https://www.pjrc.com/teensy/K20P64M72SF1RM.pdf is also valid for your MK20DX64VLH7.

In there it says, on page 90 and 91,

So it does a really weird split in the middle. If you say your MCU has 16KRAM, the calculation of the start address is

0x2000_0000 - (SRAM_size/2)
= 0x2000_0000 - (8*1024)
= 0x1fffe000

So that agrees with your calculation.
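The same calculation can be written as a small host-side check. This is only a sketch: the names `SramRange`, `kSramSplit`, and `sram_range` are made up for illustration, and it assumes (as in the calculation above) that half of the SRAM sits below 0x2000_0000 and the other half starts at 0x2000_0000.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: first and last (inclusive) address of the SRAM.
struct SramRange {
    std::uint32_t start;
    std::uint32_t end;
};

// Address around which the two SRAM halves are arranged.
const std::uint32_t kSramSplit = 0x20000000u;

// Computes the address range for a given total SRAM size in bytes.
inline SramRange sram_range(std::uint32_t sram_size) {
    SramRange r;
    r.start = kSramSplit - sram_size / 2;      // lower half ends just below the split
    r.end = kSramSplit + sram_size / 2 - 1;    // upper half starts at the split
    return r;
}
```

For a 16K part, `sram_range(16 * 1024).start` evaluates to 0x1FFFE000, which is the ORIGIN used in the linker script above.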

Well, if the 256K flash device also has 4 times the RAM, the smaller device inherently can’t run all sketches. In Arduino and the String class, for example, the data is stored on the heap, so modifying a large string will create a temporary copy of it in RAM, significantly increasing memory usage.

Does the firmware start at all, or does it just crash when you request an operation on a large string? What code are you using for testing?

atestani · February 4, 2022, 6:13pm

Good questions, thanks.

I should have stated there is a lot of RAM space available, i.e. building in release mode, PIO reports
RAM: [=== ] 30.9% (used 5064 bytes from 16384 bytes)
Flash: [======= ] 66.4% (used 43540 bytes from 65536 bytes)

built in debug mode:
RAM: [=== ] 30.8% (used 5052 bytes from 16384 bytes)
Flash: [======= ] 71.8% (used 47072 bytes from 65536 bytes)

Also, the section of code I was referring to is this:
String xsn = 12;
int xserialNum = xsn.toInt();

and commenting out the second line allows it to “work”. When it fails, the code starts and hangs while executing some library code. I tried it again using the debugger and clicked pause to see where it was hung. It is in the teensy core file mk20dx128.c in the function fault_isr(), in this section of code:

while (1) {
	// keep polling some communication while in fault
	// mode, so we don't completely die.
	if (SIM_SCGC4 & SIM_SCGC4_USBOTG) usb_isr();
	if (SIM_SCGC4 & SIM_SCGC4_UART0) uart0_status_isr();
	if (SIM_SCGC4 & SIM_SCGC4_UART1) uart1_status_isr();
	if (SIM_SCGC4 & SIM_SCGC4_UART2) uart2_status_isr();
}

Interestingly, I do have a USB cable connected for a serial monitor function but there is no code using it in this test. I unplugged the cable and got the same result… The isr that is being repeatedly called is usb_isr.

One other piece of information is that when I said it runs with a 256K MCU, that was on a Teensy 3.2 not with my custom board with a 256K MCU. I have used this custom board design with 128K and 256K MK20’s and have not seen this problem. However, I am going to build another version of this exact board with a 256K part and see if I have the same problem.

maxgerhardt · February 4, 2022, 10:25pm

And what does the backtrace look like in the debugger when it hits that function? What’s the function before it?

atestani · February 4, 2022, 11:55pm

signal handler called @0xfffffff9 Unknown source.0

Isn’t that in RAM??? The previous call was to a function in the library I am using (NMEA2000) which I have used in multiple projects.

maxgerhardt · February 5, 2022, 12:23am

Well the cause looks like the tN2kGroupFunctionHandler before it, it does something that causes a signal handler to be called, which is the fault_isr.

I don’t see how this is related to the String and int statements in setup(); it seems to crash somewhere completely different.

Can you click on the function beneath the currently selected one and see in which line it crashed?

atestani · February 5, 2022, 2:16am

I had done that previously and did it again now and this is where the fault apparently originated.

I agree it doesn’t seem to have anything to do with the string and int statements in setup(). This is why this has been driving me crazy for 3+ days now! I appreciate your help in resolving this.

As a test, I went back to a previous version of the library I had used in another project with almost the same hardware just to be sure there wasn’t a regression in the updates. The problem was still there. Interestingly, the function where the fault originated was a completely different function.

maxgerhardt · February 5, 2022, 2:27am

Hm, I think I’m starting to see the bigger picture. The Open() function that is in the stacktrace does a number of dynamic allocations with new, and the constructed object’s constructor is called. The constructed classes all inherit from the tN2kGroupFunctionHandler handler, whose base constructor is called. The code seems to crash when trying to assign something to a member variable. This might indicate that the allocated memory for the new-ed object is invalid (e.g. there’s a problem with the heap allocation / sbrk() routines in the case of the 16KB SRAM MCU), or that it has already run out of memory, new returned a nullptr, and it went and tried to call the constructor code with this = 0x0.

The line tN2kGroupFunctionHandlerForPGN126464 indicates it has gotten past the first allocation (tN2kGroupFunctionHandlerForPGN60928()) and is stuck in that next allocation (new tN2kGroupFunctionHandlerForPGN126464(this)). But it may already have been just luck that it has gotten through the first one if the heap is broken.
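One way to make this failure mode visible rather than fatal is to check the pointer before using it. The sketch below is meant for a desktop compiler and is not taken from the NMEA2000 library: `Handler`, `g_pool`, `g_used`, and `try_add_handler` are hypothetical stand-ins, and the tiny fixed pool only imitates a heap that runs out. It relies on the C++ rule that when an allocation function declared noexcept returns nullptr, the new-expression yields nullptr and the constructor is skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace {
// Hypothetical pool with room for exactly two Handler objects.
alignas(std::max_align_t) unsigned char g_pool[2 * sizeof(int)];
std::size_t g_used = 0;
}  // namespace

// Stand-in for a handler class that is created with new.
struct Handler {
    int pgn;
    explicit Handler(int p) : pgn(p) {}

    // noexcept allocation function: returning nullptr signals "out of memory".
    static void* operator new(std::size_t size) noexcept {
        if (g_used + size > sizeof(g_pool)) return nullptr;  // pool exhausted
        void* p = g_pool + g_used;
        g_used += size;
        return p;
    }
    // The pool is never reused in this sketch, so delete does nothing.
    static void operator delete(void*) noexcept {}
};

// Returns true only if the handler was actually allocated and constructed.
bool try_add_handler(int pgn) {
    Handler* h = new Handler(pgn);
    if (h == nullptr) return false;  // report the failure instead of touching *h
    return h->pgn == pgn;
}
```

With this pool size, the first two calls to `try_add_handler` succeed and the third reports failure instead of writing through a null `this`.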

The debugger on the left side should show the values of the variables in the context. Can you post the full VSCode screenshot for each of the 3 functions in the call stack below the <signal handler called>?

atestani · February 5, 2022, 2:54am

Sure… below are the screenshots.

Would it help if you connected to my machine with AnyDesk or TeamViewer and watched this happening live? I am more than happy to do that if you are willing.

maxgerhardt · February 5, 2022, 2:56am


Yeah that’s already game over. this = 0x0 should not happen.

I have to look at the memory allocation routines and possibly the linker script on why that might happen.

Can you write a minimal firmware that just does allocations in a loop with malloc() of let’s say, 512 bytes, and prints out the returned pointer and breaks out when it’s 0? After what number of allocations does that happen? I’m trying to distinguish whether you’re experiencing an out-of-memory because the sketch (or rather, this nice library you’re using) allocates a bazillion bytes on the heap or whether the heap is outright broken.

atestani · February 5, 2022, 3:02am

Yes I will do that and post my results as soon as I have them.

atestani · February 5, 2022, 4:13am

I’m not very good with pointers so I apologize if I don’t have this right, but here is what I got with the sketch below. I expected the first address to be something close to 0x1fffe000, like 0x1fffe414, to account for the globals and the first allocation.

Edit: I commented out the char x[512] and did malloc(512) directly and got the same result.

Address= 0x1ffff538
Address= 0x1ffff740
Address= 0x1ffff948
Address= 0x1ffffb50
Address= 0x1ffffd58
Address= 0x0
pointer is NULL

Code:
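#include <Arduino.h>
#include <stdlib.h>  // malloc
#include <stdio.h>   // snprintf

int* ptrValue;
char buffer[32];  // the formatted line plus terminator does not fit in 16 bytes
char x[512];      // only used for sizeof(x), the size of each allocation

//===============================================
void setup()
//===============================================
{
  Serial.begin(115200);
  delay(1000);
  // First allocation; its address is not printed.
  ptrValue = (int*)malloc(sizeof(x));
}


//===============================================
void loop()
//===============================================
{
  if (ptrValue != NULL)
  {
    // Allocate another block without freeing the previous one,
    // so the heap eventually runs out.
    ptrValue = (int*)malloc(sizeof(x));
    // snprintf never writes past the end of buffer.
    snprintf(buffer, sizeof(buffer), "Address=  %p\n", ptrValue);
    Serial.print(buffer);
  }
  else
  {
    Serial.println("pointer is NULL");
    while(1){}
  }
  delay(100);
}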

maxgerhardt · February 5, 2022, 10:04am

Well, so after 5 printed allocations (6 counting the unprinted one in setup()) we already can’t allocate the next 512 bytes, so in this example with very minimal static RAM usage you’re only getting about 3 KB from the heap before it says ‘nope’. The library states

With default settings library requires about 23 kB rom and 3.3 kB RAM in normal operation

So the heap gives you less than the stated RAM memory requirements of the library. That’s bad. Sketches that allocate more static memory (e.g. in global, statically constructed objects) will have an even smaller amount of heap memory available.
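A quick look at the printed addresses shows what each request really costs. The following host-side sketch (the names `kAddresses` and `strides` are made up for illustration) takes the five addresses from the test output and computes the distance between consecutive blocks. Each block is 0x208 (520) bytes from the previous one, so every 512-byte request takes 8 extra bytes, presumably allocator bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Addresses printed by the test sketch earlier in the thread.
const std::vector<std::uint32_t> kAddresses = {
    0x1ffff538u, 0x1ffff740u, 0x1ffff948u, 0x1ffffb50u, 0x1ffffd58u};

// Distance in bytes between each pair of consecutive allocations.
std::vector<std::uint32_t> strides(const std::vector<std::uint32_t>& addrs) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 1; i < addrs.size(); ++i) {
        out.push_back(addrs[i] - addrs[i - 1]);
    }
    return out;
}
```

Calling `strides(kAddresses)` gives four values, all equal to 520.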

The -D__MK20DX256__ is not good, it tells the core that it’s the wrong type of chip. Based on this macro, it makes certain decisions. Most interestingly, although the Teensy 3.2 is a MK20DX256VLH7, it still has provisions for the 128K device?

https://github.com/PaulStoffregen/cores/blob/master/teensy3/mk20dx128.c#L1166-L1197

char *__brkval = (char *)&_ebss;

#ifndef STACK_MARGIN
#if defined(__MKL26Z64__)
#define STACK_MARGIN  512
#elif defined(__MK20DX128__)
#define STACK_MARGIN  1024
#elif defined(__MK20DX256__)
#define STACK_MARGIN  4096
#elif defined(__MK64FX512__) || defined(__MK66FX1M0__)
#define STACK_MARGIN  8192
#endif
#endif

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"

__attribute__((weak))
void * _sbrk(int incr)
{

(The excerpt is truncated here.)

You see that for the currently active macro, __MK20DX256__, it uses a STACK_MARGIN of 4096 bytes. This STACK_MARGIN is the minimum gap that must stay free between the top of the heap, which starts at _ebss (end of the BSS section, see linker script) and grows upward, and the current stack pointer (evaluated dynamically). Probably because the __MK20DX256__ has 4 times the RAM, they chose the safety margin between heap and stack (so that they don’t collide) to be 4 times as big compared to the __MK20DX128__ case.
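To illustrate the effect of the margin, here is a simplified host-side model of such a limit check. It is a sketch under stated assumptions, not the core's actual _sbrk() (the excerpt above is truncated before the function body): `HeapModel`, `grow`, and `count_blocks` are hypothetical names, the start address, stack pointer, and 520-byte block cost are made-up example values, and the real allocator may request memory from _sbrk() in larger chunks, so the counts will not match the hardware results.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a heap that grows upward toward a downward-growing stack.
struct HeapModel {
    std::uint32_t brk;           // current end of the heap (starts at end of .bss)
    std::uint32_t stack_ptr;     // current stack pointer
    std::uint32_t stack_margin;  // gap that must stay free below the stack

    // Tries to extend the heap by incr bytes; returns false if the new end
    // would reach into the protected margin below the stack pointer.
    bool grow(std::uint32_t incr) {
        if (brk + incr >= stack_ptr - stack_margin) return false;
        brk += incr;
        return true;
    }
};

// Counts how many fixed-size blocks fit for a given margin.
// All addresses and the block size are example values (assumptions).
int count_blocks(std::uint32_t margin) {
    HeapModel heap = {0x1FFFF000u, 0x20001F00u, margin};
    int n = 0;
    while (heap.grow(520u)) ++n;
    return n;
}
```

In this model, `count_blocks(1024)` is larger than `count_blocks(4096)` because 3072 more bytes below the stack become available to the heap. On a 16K part a 4096-byte margin reserves a quarter of all RAM.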

I would recommend that in your custom board definition file you exchange the -D__MK20DX256__ for -D__MK20DX128__. Then at least the core should act like you have 16kBytes of RAM and not 64kBytes. How many allocations can you do then with the same sketch as before?

atestani · February 5, 2022, 5:00pm

Thanks!! Now it is all making sense! In a past project, I used a 128K flash MK20 with the -D__MK20DX256__ flag and it worked. From that I incorrectly (stupidly, I guess) assumed that flag didn’t matter for the general MK20 family.

Anyway, I tried using the MK20DX128 flag and for some reason I lost the serial monitor but I added an LED to “count” allocations and found it went from 6 to 15 which is getting about 5K back as expected.

I didn’t follow up to find out why the serial port went away. Instead, I went back to the 256K flag, edited mk20dx128.c, and changed the stack margin to 1024 for that flag. Look what I got… it is now using the other SRAM bank!

Address= 0x1ffff540
Address= 0x1ffff748
Address= 0x1ffff950
Address= 0x1ffffb58
Address= 0x1ffffd60
Address= 0x1fffff68
Address= 0x20000170
Address= 0x20000378
Address= 0x20000580
Address= 0x20000788
Address= 0x20000990
Address= 0x20000b98
Address= 0x20000da0
Address= 0x0
pointer is NULL

I went back to my test program and with the change to mk20dx128.c, that works now. The real application now works as well.

So now the question is how do I use this “special” version of mk20dx128.c ?? I don’t want to modify the platform. I changed mk20dx128.c back to original and put the edited version in the project src folder. That seems to work. Is doing it that way legitimate?

maxgerhardt · February 5, 2022, 10:10pm

Well, since the code only defines STACK_MARGIN when it is not already defined (#ifndef STACK_MARGIN), you can also just set a specific value with build_flags, i.e. build_flags = -DSTACK_MARGIN=1024. You can also put it in the extra_flags of your board definition if you always want to use that with the board. Then no framework files have to be modified.
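The mechanism can be seen in a few lines. In this sketch the first #define stands in for -DSTACK_MARGIN=1024 on the compiler command line, the guard has the same shape as the excerpt from mk20dx128.c quoted earlier, and `kStackMargin` is a made-up name used only to inspect the result.

```cpp
#include <cassert>

// Stand-in for passing -DSTACK_MARGIN=1024 on the compiler command line.
#define STACK_MARGIN 1024

// Guarded default: only takes effect if nothing defined the macro earlier.
#ifndef STACK_MARGIN
#define STACK_MARGIN 4096
#endif

// Hypothetical constant that captures whichever value won.
const int kStackMargin = STACK_MARGIN;

// The externally supplied value is used; the default is skipped.
static_assert(STACK_MARGIN == 1024, "command-line value should win");
```

Because the value is supplied before the core's source is compiled, the chip-specific default is never reached and the framework files stay untouched.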

atestani · February 6, 2022, 12:26am

I did this in the board definition file:
"extra_flags": "-D__MK20DX256__ -DTEENSY31 -DSTACK_MARGIN=1024",

… works great!

Thanks for all your help on this. I really appreciate it!