Some problems with stm32duino serial port receive?

Maybe this is not the right place to ask, but I don’t know where else to discuss the problem.
I use an stm32l476rg with five serial ports in my project.
UART4 is connected to a device that uses a private protocol. When I send
HEX(0x)[7E 7E 01 …], it should return
HEX(0x)[7E 7E 00 30 00 08 27 01 62 98 32 80 08 02 02 53 21 03 19 12 08 47 04 87 22 ], which is the expected response.

But what I actually receive is shown in the picture:

The first two bytes are wrong. So I used an RS232 debugger and connected my protocol device’s rx pin directly to the debugger’s rx pin, and I can see that the device sends the right response, as shown in the picture.

Here is my unoptimized program for receiving the data:

      size_t Mod_ZJ_Water::readDownloadData(uint8_t *buf, size_t size)
      {
          uint8_t recCount = 0;

          delay(2000);    // Wait for the data

          while (this->_serial->available() > 0)
          {
              buf[recCount] = this->_serial->read();

              recCount = recCount + 1;
          }

          return recCount;
      }

I don’t know what could cause this. Could you share a good program skeleton for
receiving serial data with a timeout check?

Thank you very much!

Hm, maybe the delay is so long that the internal UART buffer overflows because the input is not read out fast enough.

You can replace it with an “at most, wait 2 seconds for input data” loop, like

   unsigned long now = millis(); 
   while(this->_serial->available() == 0 && (millis() - now) <= 2000UL) { }
   /* read */

This breaks out of the loop and starts reading as soon as 1 byte is available, or exits the loop once 2 seconds are over. You might want to change the == 0 condition if you know the length of the UART frame you will receive. If it’s e.g. 10 bytes, you can write it as < 10, so the loop keeps waiting until at least 10 bytes have arrived. Or just put a smaller delay before the readout (at a point where you know that at least 1 byte has been received) that covers the duration of the UART transmission, and then read the full frame.
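A minimal sketch of such a receive loop with a timeout check, in case it helps. It is not the stm32duino API: readFrame, its parameters, and the 50 ms inter-byte gap used in the note below are hypothetical names and values chosen for illustration. It only assumes a serial object with Arduino-style available() and read(), and a millisecond clock such as millis(). Reading stops when the buffer is full, when the overall timeout expires, or when the line has been quiet for gapMs after at least one byte.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reads up to `size` bytes into `buf`.
// Stops when `size` bytes were read, when `totalMs` have passed since the
// call started, or when no new byte arrived for `gapMs` after the first one.
template <typename SerialT, typename ClockFn>
size_t readFrame(SerialT &serial, ClockFn nowMs, uint8_t *buf, size_t size,
                 uint32_t totalMs, uint32_t gapMs)
{
    size_t count = 0;
    const uint32_t start = nowMs();
    uint32_t lastByte = start;

    while (count < size) {
        const uint32_t now = nowMs();
        if (serial.available() > 0) {
            // Drain bytes as they arrive so the driver's RX buffer cannot fill up.
            buf[count++] = static_cast<uint8_t>(serial.read());
            lastByte = now;
        } else if (now - start >= totalMs) {
            break;  // overall timeout
        } else if (count > 0 && now - lastByte >= gapMs) {
            break;  // the frame has started and the line went quiet
        }
    }
    return count;
}
```

Inside readDownloadData this could be called as readFrame(*_serial, millis, buf, size, 2000, 50). Unlike the delay(2000) version, it never writes more than size bytes into buf.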

I now use

this->_serial->setTimeout(4000); 
return (_serial->readBytes(buf, size));

instead. I think it is equivalent to your solution, but the problem still exists:
the first 3 bytes are always 0x00, while everything after them is correct.

Thank you

Have you tried writing a firmware that only reads the UART frame and re-prints it over serial, without all the rest of the firmware? It might isolate the problem.

I am going to change serial 3 to serial 4 and will report back later.

This is very strange. I use the code below to get the data:

bool FML_Socket::zWaterDataUpload(uint8_t upBuf[], size_t upSize, uint8_t downBuf[], size_t *downSize)
{
	logPrintln("Print the SL651 Upload Frame:");
	for (uint8_t i = 0; i < upSize; i++)
	{
		logPrint("%02X ", upBuf[i]);
	}
	logPrintln();

	if (waterMod.writeUploadData(upBuf, upSize) == true)
	{
		*downSize = waterMod.readDownloadData(&downBuf[0], 128);
		if (*downSize == 0)
		{
			logDebug("Water-Module: No response!");
			return false;
		}
		else
		{
			logPrintln("Print the SL651 Response Frame:");
			for (uint8_t i = 0; i < *downSize; i++)
			{
				logPrint("%02X ", downBuf[i]);
			}
			logPrintln();

			return true;
		}
	}
	else
	{
		return false;
	}
}

The printed result shows that the first 3 bytes of the response have turned into 0x00, but at the lowest level the data is correct.
Here is that code:

size_t Mod_ZJ_Water::readDownloadData(uint8_t *buf, size_t size)
{
	this->_serial->setTimeout(4000);

	uint8_t byteRecCount = _serial->readBytes(buf, size);

	Serial.println("Print the SL651 Response Frame--------------:");
	for (uint8_t i = 0; i < byteRecCount; i++)
	{

		Serial.printf("%02X ", buf[i]);
	}
	Serial.println();

	return byteRecCount;
}

Here is the SecureCRT screenshot:

But I don’t know why.

fmlSocket.zWaterDataUpload(fmlProtocol.waterCloudData.upDataArray, fmlProtocol.waterCloudData.upSize,
									fmlProtocol.waterCloudData.downDataArray, ((size_t *)&fmlProtocol.waterCloudData.downSize));

This is how the top layer calls it. Can you figure out the reason?

Maybe it has to do with the ((size_t *)&fmlProtocol.waterCloudData.downSize) cast?

Thank you very much!

Hmm very interesting. There’s some sort of corruption ongoing in those first 3 bytes.

Are the arrays big enough to hold the data? Maybe there’s a stack overflow. What is the definition of the fmlProtocol structure?

If fmlProtocol.waterCloudData.downSize is a field of e.g. type uint8_t and not size_t, that could be the problem. If you are casting its address to a size_t*, the underlying type must be size_t. Otherwise there is memory corruption, and it happens when something writes through the pointer.

You are brilliant. As you suspected, here is the definition:

typedef struct
{
	uint16_t upSerialNo;
	uint8_t  upSize;
	uint8_t  upDataArray[256];

	uint16_t downSerialNo;
	uint8_t  downSize;
	uint8_t  downDataArray[128];
} WaterCloudDataT;

It seems the problem is the cast from uint8_t to size_t, but I do not understand in detail why it causes this.
And if I want to avoid it and still get the size, converting the size_t to uint8_t, what can I do?

Aahh yes, that makes total sense.

You see, you are obtaining a pointer to the downSize variable. The variable is 1 byte wide, but the pointer you create

(size_t *)&fmlProtocol.waterCloudData.downSize

is a pointer to a size_t, which is 4 bytes wide on this MCU (unsigned int or unsigned long).

So if we write a value using that pointer, it writes 4 bytes, starting at the address of the downSize variable. However, in your structure the member is only 1 byte big, and the bytes right after it are the buffer where your received data is stored (downDataArray is the next member after downSize). Thus, writing through the size_t pointer writes 4 bytes: the first byte overwrites the actual downSize member, and the next 3 bytes overwrite the start of your data. That’s where the data is corrupted.

The CPU of your device also seems to be little endian. For example, the value “1” in hex as a 4-byte value (int) would be stored as

01 00 00 00

(least significant byte first).

In big endian it would be

00 00 00 01

(most significant byte first).

This also explains why the first 3 bytes are 0 in the corruption: the received data length is 25. The value 25 is stored through the size_t* by the line

*downSize = waterMod.readDownloadData(&downBuf[0], 128);

which is written in hex as

19 00 00 00

at the memory address where downSize starts.
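To make the overwrite visible, here is a small host-side sketch. It is an illustration, not the firmware code: storeSizeAs32Bit is a hypothetical helper that emulates the 4-byte store through the cast pointer by copying 4 bytes to the address of downSize. It uses memcpy so that the demo itself stays well-defined, and uint32_t stands in for the 4-byte size_t of the STM32.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same member order as the WaterCloudDataT posted above.
struct WaterCloudDataT {
    uint16_t upSerialNo;
    uint8_t  upSize;
    uint8_t  upDataArray[256];

    uint16_t downSerialNo;
    uint8_t  downSize;
    uint8_t  downDataArray[128];
};

// The buffer starts directly behind the 1-byte size field, with no padding.
static_assert(offsetof(WaterCloudDataT, downDataArray) ==
                  offsetof(WaterCloudDataT, downSize) + 1,
              "downDataArray is expected to follow downSize immediately");

// Hypothetical helper: emulates "*(size_t *)&d.downSize = value" on a
// 32-bit target by copying 4 bytes to the address of downSize.
inline void storeSizeAs32Bit(WaterCloudDataT &d, uint32_t value)
{
    unsigned char *raw = reinterpret_cast<unsigned char *>(&d);
    std::memcpy(raw + offsetof(WaterCloudDataT, downSize), &value, sizeof value);
}
```

On a little-endian machine, filling downDataArray with 7E 7E 00 30 … and then calling storeSizeAs32Bit(d, 25) leaves downSize at 0x19 and turns downDataArray[0..2] into 00 00 00, while downDataArray[3] and everything after it stay untouched. That matches the symptom in the log.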

To fix it, you can either change the type of this member to size_t, or adapt the function to accept a uint8_t* so that a value of the correct size is written (i.e. change size_t *downSize to uint8_t *downSize and remove the cast in the caller accordingly).
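A sketch of the second option, assuming the struct stays as posted. The names fakeReadDownloadData and receiveFrame are hypothetical stand-ins for readDownloadData and zWaterDataUpload, written so the example compiles on its own. The point is only that the out-parameter has the same type as the member, and that the size_t result is narrowed explicitly after it has been received into a local variable.

```cpp
#include <cstddef>
#include <cstdint>

// Same member order as the WaterCloudDataT posted above.
struct WaterCloudDataT {
    uint16_t upSerialNo;
    uint8_t  upSize;
    uint8_t  upDataArray[256];

    uint16_t downSerialNo;
    uint8_t  downSize;
    uint8_t  downDataArray[128];
};

// Hypothetical stand-in for readDownloadData(): copies a canned frame
// and returns the number of bytes written as a size_t.
inline size_t fakeReadDownloadData(uint8_t *buf, size_t size)
{
    static const uint8_t canned[] = {0x7E, 0x7E, 0x00, 0x30, 0x00};
    const size_t n = sizeof canned < size ? sizeof canned : size;
    for (size_t i = 0; i < n; ++i) {
        buf[i] = canned[i];
    }
    return n;
}

// The out-parameter is a uint8_t*, matching the struct member, so the
// caller passes &data.downSize without any cast.
inline bool receiveFrame(uint8_t downBuf[], uint8_t *downSize)
{
    const size_t received = fakeReadDownloadData(downBuf, 128);
    // At most 128 bytes are requested, so the count fits into a uint8_t.
    *downSize = static_cast<uint8_t>(received);
    return received != 0;
}
```

The caller then becomes receiveFrame(data.downDataArray, &data.downSize). If the compiler rejects a pointer argument without a cast, that is usually a sign that the types do not match and the cast would hide a bug.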

Thanks for your very detailed answer. As you said, the STM32 is a little-endian MCU.

Thank you very much!