Intermittent serial problems AVR4809

I have been working on this project for quite a while and have tested it extensively. I left the test rig set up and, returning to it after a month or so, found the serious problem described below.

There are two FTDI chips connected to the 40-pin variant of the 4809 (a DIP on a breadboard, in a ZIF socket). One FTDI is used to power the 4809. The hardware ports used are Serial and Serial2 (Arduino framework). The baud rate is 19200.

The problem is that when I power up the USB hub which hosts the FTDIs, sometimes the system works fine and sometimes it doesn’t. When it doesn’t work, Serial is unresponsive and Serial2 produces garbled output, as if the baud rate were incorrect. If I cycle the power and try again, it works as expected.

Just to make this much more interesting ( :( ), I have another 4809 in the same project box which also has two FTDIs connected, and it is exhibiting the same problem.

I have used serial a lot over the years on Arduino boards, but I am stumped with this. As I mentioned, I tested it to my satisfaction over a period of a week, cycling the power and testing the functionality according to a test plan. It all worked well, so I am at a loss to know what might be the cause of this.

The last change to the code before that testing was to introduce an SPI connection between the two 4809s. It was my first attempt at SPI, but it seemed to work well.

Thanks for any thoughts on this sporadic behaviour. If you need any more info, just let me know.

platformio.ini below:

; PlatformIO Project Configuration File
;
;   Build options: build flags, source filter
;   Upload options: custom upload port, speed and extra flags
;   Library options: dependencies, extra library storages
;   Advanced options: extra scripting
;
; Please visit documentation for the other options and examples
; https://docs.platformio.org/page/projectconf.html

[env:ATmega4809]
board = ATmega4809
platform = atmelmegaavr
framework = arduino
board_build.variant = 40pin-standard
board_build.f_cpu = 20000000L
monitor_flags = 
	--echo
upload_protocol = custom
upload_speed = 250000
upload_port = COM3
upload_flags = 
	-d
	atmega4809
	-c
	$UPLOAD_PORT
	-b
	$UPLOAD_SPEED
upload_command = pyupdi $UPLOAD_FLAGS -f $SOURCE
lib_extra_dirs = 
	C:\Users\Paul\Documents\PlatformIO\Projects\Development\
monitor_port = COM3
monitor_speed = 19200
lib_deps = 
	waspinator/AccelStepper@^1.61
	qub1750ul/SoftwareReset@^3.0.0

[platformio]
description = The integrated Stepper Driver


Does the problem occur in a simpler "print something forever in loop()" firmware?

If not, does it occur when you add your SPI code and occasionally execute it?

Does it also occur in the Arduino IDE? If so, it’s likely a problem in the Arduino core implementation being used, and you should report an issue there.

I’ll try a simple example on all four ports without the SPI and see if that is reliable.
If it is, I’ll add the SPI code back in and see if it remains reliable.

Thanks for the help, much appreciated.

I spent most of the evening on this and slimmed the code down to just serial transmissions. Interestingly, I have an LED pin on each chip which is activated in the setup routine. If I connect all four USB cables to the four FTDIs, only one of the 4809s flashes its LED. If I remove the two USB cables that go to that 4809, leaving only the cables to the one that did not flash, it then flashes on power-up. So it looks as if, with all four cables connected, only one chip is powered up and running code.

So there is a bit more investigation to do. At present it looks like some kind of power problem; I’ll check that with a DVM tomorrow.

That in a way would be a bit of a relief, as I tested the code pretty thoroughly. At present I have an 80% failure rate and I’m sure I’d have spotted that during testing.

I’ll report back.

So after much exploration, I think I have found the problem. It is not power supply related. I measured the supply voltage at 5.01 V, which according to the Microchip datasheet (p. 462) is within the safe operating area for the CPU at 20 MHz.

In a nutshell, the problem is SPI related. I found that if the SPI setup code on the master 4809 completes before the SPI setup code on the slave 4809, the slave locks up and its code does not execute. I haven’t done a detailed analysis of where the code stops, but the LED that setup() on the slave should flash five times at 1-second intervals turns on once and stays on. The expected serial comms in loop() do not happen.

To prove the SPI association, I branched my code for the two projects and in one branch stripped out all references to SPI. With this code loaded, all twenty tests (cycling the power and watching the LEDs flash) passed.

On the code branch with SPI included, 15 out of 20 tests failed with the slave 4809 ‘locked up’.

Just by trial and error, I found that repositioning the SPI initialisation code so that the slave initialises before the master cured the problem.

It may just be that I’m such a numpty that I didn’t realise the slave had to init before the master. It does seem sensible.
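For anyone who wants to enforce that ordering rather than rely on where the init code sits, here is a rough sketch of one possible approach. It is not what I actually did (I only moved the init code). It assumes a spare wire between the two chips: the master holds it high with its internal pull-up, and the slave pulls it low once its own SPI setup has finished. `READY_PIN`, `waitForPeer` and the 2-second timeout are hypothetical names and values made up for the example.

```cpp
#include <stdint.h>

// Hypothetical helper: polls isReady() until it returns true, or gives up
// once timeoutMs milliseconds (as reported by nowMs()) have elapsed.
// Unsigned subtraction keeps the comparison valid across millis() rollover.
template <typename ReadyFn, typename ClockFn>
bool waitForPeer(ReadyFn isReady, ClockFn nowMs, uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (!isReady()) {
        if (static_cast<uint32_t>(nowMs() - start) >= timeoutMs) {
            return false;  // peer never signalled ready
        }
    }
    return true;
}

#if defined(ARDUINO)
#include <Arduino.h>
#include <SPI.h>

const uint8_t READY_PIN = 10;  // assumption: any free pin wired to the slave

// Master side only. The slave would finish its own SPI setup and then
// drive its end of the wire LOW (pinMode OUTPUT, digitalWrite LOW).
void setup() {
    pinMode(READY_PIN, INPUT_PULLUP);

    bool slaveReady = waitForPeer(
        [] { return digitalRead(READY_PIN) == LOW; },
        [] { return static_cast<uint32_t>(millis()); },
        2000);  // assumed timeout

    if (slaveReady) {
        SPI.begin();  // master SPI starts only after the slave is ready
    }
}

void loop() {}
#endif
```

The polling logic is kept separate from the pin and timer calls so it can be checked on a PC with fake inputs before it goes anywhere near the hardware.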