SAM51 flash reads after writes

I’m making a bootloader for a SAMD51P20A, based on a previous one I made for a SAMD21. I have flash block erase and page writes working, and the application & bootloader use a few breadcrumbs in the last flash page to setup various bootloader operations. It all works well, I can erase the application code space, write application code to flash, then jump to the flash, and it runs.
The problem comes if I write a page of flash after erasing it, then try to read the value back from the newly written flash page. The value read back from a location in flash doesn’t reflect the newly written value, though the write does appear to have worked, as after the next reset, the written value is read correctly. I can step through in the debugger, and verify that the flash block erase and page writes work, by reading the flash directly, but in code, reading the same flash location shows a different value.

The minimal code to reproduce this is

ERROR_CODE_T FlashProgTestRead(uint32_t * bootcount) 
{
    // disable Flash read caches
    NVMCTRL->CTRLA.bit.CACHEDIS0 = 1;
    NVMCTRL->CTRLA.bit.CACHEDIS1 = 1;

    /***************************************************************************/
    /* ERASE BLOCK                                                             */
    /***************************************************************************/
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY );
    // Execute "ER" Erase Row ( by passing in any page in the row )
    NVMCTRL->ADDR.reg = BOOTCOUNT_ADDRESS;
    // erase row
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_EB;
    // wait for erase to complete
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // // check for any errors
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        uint32_t xxx = NVMCTRL->ECCERR.reg;
        return ERR_FLASH_ERASE;
    }

    /***************************************************************************/
    /* WRITE NEW VALUE                                                         */
    /***************************************************************************/
    // Disable automatic page write
    NVMCTRL->CTRLA.bit.WMODE = 0;
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "PBC" Page Buffer Clear
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_PBC;
    // Clear the DONE Flag (NVMCTRL.INTFLAG)
    while (NVMCTRL->INTFLAG.bit.DONE == 0) { }

    // 6. Write data to page buffer with 32-bit accesses at the Flash page address
    uint32_t * p = (uint32_t *)FlashProg_GetPageAddress(BOOTCOUNT_ADDRESS);
    for (int i = 0; i < (PAGE_SIZE / 4); i++) {
            if (i != 126) {
                *p = 0xffffffff;
            } else {
                // write special value to the bootcount address
                *p = 0x12345678;
            }
            p++;
    }
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "WP" Write Page
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_WP;
    // Wait til write is done
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // check for any errors during write
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        uint32_t xxx = NVMCTRL->ECCERR.reg;
        return ERROR_FLASH_WRITE;
    }

    /***************************************************************************/
    /* WRITE SUCCEEDED, TRY TO READ FLASH                                      */
    /***************************************************************************/
    value = *((uint32_t *) 0xffff8/*BOOTCOUNT_ADDRESS*/);
    *bootcount = value;
    return ERR_OK;
}

Here’s a screen capture taken while running this in the debugger, with the correct flash value of 0x12345678 at location 0xffff8 (right pane), and the variable “value” in the locals pane on the left, just after it’s been read from the same flash location and showing the wrong, previous programmed value of 0xffffffff

The datasheet talks about two banks of flash, and being able to write to one bank and then switch in place but I’m not doing that. I feel like maybe it has to do with the read caching, but honestly I’ve been trying everything I can think of for more hours (read days) than I care to admit. I’ve looked at a couple other SAMD51 bootloaders, and I’m doing basically the same thing, and as I said, the bootloader actually works well, and in fact DOES correctly read the reset vector from the newly programmed application code in order to jump to the application.

Is there some dumb thing that I’m missing here to flush caches or force reads to work?

thanks kindly, Chris

Let’s make sure the compiler isn’t trying to be smart by executing the read from 0xffff8 before you wrote the new value to it: Disable all compiler optimizations.

Add to the platformio.ini

build_type = debug
debug_build_flags = -O0 -g3 -ggdb3 

then debug again.

(E.g., line 193 is also very likely to not actually be executed because the result is unused. You’d need to use volatile and cast the result away to void to prevent that)

Thank you Max, the volatile xxx is a good catch.
I’ve been experimenting - initially disabling the optimizations seemed to help somewhat, I was not able to read the flash immediately after a write, but later reads did work. Then I discovered the later reads used the memcpy() function instead of the value = *((uint32_t *) 0xffff8); construction, so I switched to using that, turned optimizations on again, and everything worked. Until I started stripping out all the test code, now I’m back to not being able to read flash again after a write. This is the pattern I’ve been following for days now; find something that works, implement it everywhere, then find it doesn’t work any more.
I will keep on this, it’s so close to working…

When you want to force a value being read fresh without compiler optimizations, it really needs to be a value = *(volatile uint32_t*) 0xffff8;, or memcpy(). But keeping compiler optimizations completely off per my instructions above will for sure eliminate all these type of errors so you can focus on others first.

OK, it just gets weirder. I have modified my test function to do two consecutive erase-then-write operations on the same block and page, writing a different value to the same location on the 2nd write. After the first write, I am able to read back the written value correctly, but after the second write, I’m doing two consecutive memcpy operations to two different variables, as I wondered if two reads might get around any caching issues. After the second write, two memcpy operations from the same flash location incorrectly read the same value as was written during the first write operation. I’m sure both memcpy() calls are being made because I can see the values change from those that I’ve preset. Later on in the code, I make yet another read operation of the flash, and that time it works… I really am confused now…

ERROR_CODE_T FlashProgTestRead(uint32_t * bootcount, uint32_t * bootcount2) 
{

    // disable Flash read caches
    NVMCTRL->CTRLA.bit.CACHEDIS0 = 1;
    NVMCTRL->CTRLA.bit.CACHEDIS1 = 1;

    //volatile uint32_t value = *((uint32_t *) BOOTCOUNT_ADDRESS); // fixme this makes flash read later not work!
    //volatile uint32_t value = 0x99; // works
    //volatile uint32_t value; // works
    uint32_t value; // works

    /***************************************************************************/
    /* ERASE BLOCK                                                             */
    /***************************************************************************/
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY );
    // Execute "ER" Erase Row ( by passing in any page in the row )
    NVMCTRL->ADDR.reg = BOOTCOUNT_ADDRESS;
    // erase row
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_EB;
    // wait for erase to complete
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // // check for any errors
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        saved_ECCER = NVMCTRL->ECCERR.reg;
        return ERR_FLASH_ERASE;
    }

    /***************************************************************************/
    /* WRITE NEW VALUE                                                         */
    /***************************************************************************/
    // Disable automatic page write
    NVMCTRL->CTRLA.bit.WMODE = 0;
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "PBC" Page Buffer Clear
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_PBC;
    // Clear the DONE Flag (NVMCTRL.INTFLAG)
    while (NVMCTRL->INTFLAG.bit.DONE == 0) { }

    // 6. Write data to page buffer with 32-bit accesses at the Flash page address
    uint32_t * p = (uint32_t *)FlashProg_GetPageAddress(BOOTCOUNT_ADDRESS);
    for (int i = 0; i < (PAGE_SIZE / 4); i++) {
            if (i != 126) {
                *p = 0xffffffff;
            } else {
                // write special value to the bootcount address
                *p = 0x12345678;
            }
            p++;
    }
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "WP" Write Page
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_WP;
    // Wait til write is done
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // check for any errors during write
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        saved_ECCER = NVMCTRL->ECCERR.reg;
        return ERROR_FLASH_WRITE;
    }

    /***************************************************************************/
    /* WRITE SUCCEEDED, TRY TO READ FLASH                                      */
    /***************************************************************************/
    //p = (uint32_t *)BOOTCOUNT_ADDRESS;
    // value = *p; // doesn't read flash in debugger with optimizations off

    //value = *((uint32_t *) BOOTCOUNT_ADDRESS); // doesn't read. LATER: works as long as you don't declare & read value earlier, it seems to get optimized out????
    value = *(volatile uint32_t*)BOOTCOUNT_ADDRESS;
    // memcpy((void*)&value, (const void *)BOOTCOUNT_ADDRESS, sizeof(uint32_t)); // works with optimizations off
    *bootcount = value;

    // fixme cwp: try another erase & write 
    /***************************************************************************/
    /* ERASE BLOCK                                                             */
    /***************************************************************************/
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY );
    // Execute "ER" Erase Row ( by passing in any page in the row )
    NVMCTRL->ADDR.reg = BOOTCOUNT_ADDRESS;
    // erase row
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_EB;
    // wait for erase to complete
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // // check for any errors
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        saved_ECCER = NVMCTRL->ECCERR.reg;
        return ERR_FLASH_ERASE;
    }

    /***************************************************************************/
    /* WRITE NEW VALUE                                                         */
    /***************************************************************************/
    // Disable automatic page write
    NVMCTRL->CTRLA.bit.WMODE = 0;
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "PBC" Page Buffer Clear
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_PBC;
    // Clear the DONE Flag (NVMCTRL.INTFLAG)
    while (NVMCTRL->INTFLAG.bit.DONE == 0) { }

    // 6. Write data to page buffer with 32-bit accesses at the Flash page address
    p = (uint32_t *)FlashProg_GetPageAddress(BOOTCOUNT_ADDRESS);
    for (int i = 0; i < (PAGE_SIZE / 4); i++) {
            if (i != 126) {
                *p = 0xffffffff;
            } else {
                // write special value to the bootcount address
                *p = 0x44;
            }
            p++;
    }
    // Make sure the NVM is ready to accept a new command (NVMCTRL.STATUS).
    while (NVMCTRL->STATUS.bit.READY != NVMCTRL_STATUS_READY ) { } 
    // Execute "WP" Write Page
    NVMCTRL->CTRLB.reg = NVMCTRL_CTRLB_CMDEX_KEY | NVMCTRL_CTRLB_CMD_WP;
    // Wait til write is done
    while (NVMCTRL->INTFLAG.bit.DONE == 0);
    // check for any errors during write
    if (NVMCTRL->INTFLAG.bit.ADDRE || NVMCTRL->INTFLAG.bit.PROGE || NVMCTRL->INTFLAG.bit.LOCKE 
        || NVMCTRL->INTFLAG.bit.ECCSE || NVMCTRL->INTFLAG.bit.ECCDE || NVMCTRL->INTFLAG.bit.NVME) {
        // read ECCERR to clear whatever error flag
        saved_ECCER = NVMCTRL->ECCERR.reg;
        return ERROR_FLASH_WRITE;
    }

    /***************************************************************************/
    /* WRITE SUCCEEDED, TRY TO READ FLASH                                      */
    /***************************************************************************/
    *bootcount = 0x99;  // preset  bootcount variable so I can see it change if memcpy is not optimized out
    memcpy((void*)bootcount, (const void *)BOOTCOUNT_ADDRESS, sizeof(uint32_t)); // works with optimizations off
    memcpy((void*)bootcount2, (const void *)BOOTCOUNT_ADDRESS, sizeof(uint32_t)); // works with optimizations off

    return ERR_OK;
}

I have finally got to the bottom of this problem - it stems from my implementation of a bootloader which stores a counter in high flash memory to keep track of failed bootload attempts. The counter
gets “incremented” by writing new values which only clear more flash bits, and never sets them.
This is done without an erase operation so I don’t lose the count if the processor should reset
between the erase and write operations.
This approach worked very well on the same bootloader on a SAMD21, but it appears that the SAMD51
does not like it. What I see is that if you write to a flash page twice, the second write operation
apparently succeeds, in that it doesn’t show any error interrupt flags in the NVMCTRL INTFLAG
register, but subsequent attempts to read that location result in ECC errors which persist even
across power cycles, and occur as many times as you try to read the flash location.

Interestingly, there is a bit of text in the SAMD21 datasheet, in section 37.12, Electrical
Characteristics, that says “Note that on this flash technology, a max number of 8 consecutive write
is allowed per row. Once this number is reached, a row erase is mandatory.”
By chance, the bootloader that I wrote for the SAMD21 only increments its counter 7 times before
declaring an end to its retries. There doesn’t seem to be any similar warning in the SAMD51
datasheet though.

Looking back over my previous posts, I was being misled by checking for ECC errors after writes or erase operations, when really they were occurring on previous READ operations, making me think that the writes & erases were failing.