RCM6700 system mode violation when sharing serB and writeUserBlock with semaphore

I’m using an RCM6700 and I need to share serB and writeUserBlock.
My IDE is Dynamic C 10.72D.
I’ve implemented semaphores with SPIgetSemaphore and SPIfreeSemaphore, and everything seems to run correctly. But when I run a stress test that sends and receives data on serB and then calls writeUserBlock every 5 seconds, a runtime error pops up after a random number of cycles (between 3 and 15): “A system mode violation interrupt occurred…” at different addresses.

I’ve simplified the test so that it only opens the serial port (with semaphores), without sending or receiving data: same result (runtime error after some cycles).

If I run the test without opening the serial ports it runs without problems.

Can anyone suggest how to find the problem? (I can provide a code snippet if it is useful.)

Thank you
Massimo

My guess is that your program is crashing in some way and running random bytes as if they were code. I’m not familiar with the user/system modes myself. They were used with RabbitSys on the Rabbit 3000.

Because the Rabbit 6000 runs code out of RAM, it’s possible that due to a bug you’re overwriting code with data. You can integrate Samples/RemoteProgramUpdate/verify_firmware.c into your program to periodically verify that the running firmware image is still valid. If not, you can use compare_firmware.c to identify which area mismatches with the boot firmware stored in flash.
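To illustrate the comparison idea only (this is not the code from compare_firmware.c, and the function name is my own), a minimal sketch in plain C that reports where a running image first differs from a reference copy could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any Digi library.
   Returns the offset of the first byte that differs between the running
   image and the reference copy, or -1 if the two regions match. */
long first_mismatch(const unsigned char *running,
                    const unsigned char *reference, size_t len)
{
    size_t i;

    assert(running != NULL && reference != NULL);

    for (i = 0; i < len; ++i) {
        if (running[i] != reference[i]) {
            return (long)i;   /* first corrupted offset */
        }
    }
    return -1;                /* images are identical */
}
```

Knowing the first mismatching offset lets you look up in the map file which function or data area was overwritten.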

Another option would be to change “__nodebug” to “__debug” in IDBLOCK_API.LIB and set a breakpoint on writeUserBlock() so you can single-step through it and identify where the failure occurs.

I’m assuming that a stress test that doesn’t make use of writeUserBlock() is successful.

You might also be running into problems from mixing async serial on port B with clocked serial (SPI). The design was to allow multiple devices to share the SPI bus, and reconfiguring the I/O lines may have an impact. I believe I’ve seen that in the past. From looking at _SPIgetSemaphore(), I think it could be related to assuming SPI devices all use the TAT5R register and not SBDHR/SBDLR. It might also mean that it needs to save/restore more registers if switching between SPI and UART on the same serial port.

If at all possible, try to modify your design so it doesn’t require sharing that port.

Or, submit a sample program to Digi’s Tech Support that can run on a standard development board that demonstrates the failure. This could be a bug that we can fix in the semaphore code.

Thank you for your clear and detailed answer!
I’ve done further investigations and discovered some of the problems that you pointed out.

One big issue is that TAT5R is not restored properly by the semaphore code because, as you reported, I’m mixing clocked SPI and asynchronous serial. As a workaround, I open the serial port at 19200 baud for my communication, then close it and reopen it at maximum speed (460800) before releasing the semaphore. This seems to have a positive impact on the user block write (but it does not solve the problem completely).
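For clarity, here is the order of operations in my workaround, written as a sketch in plain C that runs on a PC. All the demo_* functions are hypothetical stand-ins that only log their names; on the target they correspond to the semaphore calls and the serB open/close calls, whose exact signatures I am not reproducing here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Call log so the sequence can be inspected on a PC. */
static char call_log[128];

static void log_call(const char *name)
{
    assert(name != NULL);
    strncat(call_log, name, sizeof call_log - strlen(call_log) - 1);
    strncat(call_log, ";", sizeof call_log - strlen(call_log) - 1);
}

/* Hypothetical stand-ins for the real semaphore and serial port calls. */
static int  demo_get_semaphore(void)  { log_call("get");  return 0; }
static void demo_free_semaphore(void) { log_call("free"); }
static void demo_port_close(void)     { log_call("close"); }
static void demo_talk(void)           { log_call("talk"); }

static void demo_port_open(long baud)
{
    char entry[32];
    snprintf(entry, sizeof entry, "open%ld", baud);
    log_call(entry);
}

/* One transaction on the shared port, in the order of the workaround. */
int shared_port_transaction(void)
{
    if (demo_get_semaphore() != 0) {
        return -1;                /* port busy, try again later */
    }
    demo_port_open(19200L);       /* baud rate used by my peripheral */
    demo_talk();                  /* send/receive on the port */
    demo_port_close();
    demo_port_open(460800L);      /* reopen at maximum speed */
    demo_free_semaphore();        /* release only after the reopen */
    return 0;
}

const char *get_call_log(void)
{
    return call_log;
}
```

The important detail is that the reopen at 460800 happens before the semaphore is released, not after.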

Unfortunately I cannot modify the hardware design, but I will try your other suggestions.

One last note about the possibility of a bug crashing the code: that was my first thought, so I reduced the firmware (a very complex one) to a minimal set of functions: no costates, just a hardware init and an infinite loop that alternates a serial write to my peripheral with a writeUserBlock, always with the same data. A bug may still be present, but the odds are low.

Thank you again for your help!

I want to add an update for anyone who may experience the same issue (it may also be of interest to Digi’s Tech Support).

My program boils down to (I’m working with USE_FAR_LIB_API and USE_FAR_STRING_LIB):

// ----------------------------------
HttpState hstate;
char *ptr;
char cfgBuffer[500];
_ConfigStruct Config; // _ConfigStruct is a 500-byte structure with several fields
void *save_data[3];
unsigned int save_lens[3];
// ----------------------------------

while(1) {

// Use a fixed string for configuration data
sprintf(cfgBuffer,"var1=value1&var2=value2&…");

…[here I get the ptr to point inside the cfgBuffer]…

_f_strncpy(hstate.buffer,ptr,150); // <— THIS is the important line

…[here Config struct is updated according to hstate.buffer]…

save_data[0]=&marker; // Marker is a constant 3 char data marker
save_lens[0]=3;
save_data[1]=&Config;
save_lens[1]=sizeof(Config);
writeUserBlockArray(0,(const void * const *) save_data,save_lens,2);

msDelay(3000);
}

This program runs fine (with exactly the same fixed data) for 3 to 15 loops, then hangs with a system mode violation error.

If I modify ONLY the _f_strncpy call to use a plain char buffer instead of the buffer inside a struct (as in HttpState), everything runs OK forever:

char tmpbuffer[500];

[…]
_f_strncpy(tmpbuffer,ptr,150); // <— THIS is the important line
[…]

My guess is that the pointer arithmetic inside the FAR function fails in some cases. I haven’t investigated further inside the libs.

Sorry for the long post, I hope it can be useful for someone.
Thank you again for your help!

I think the issue is that you shouldn’t be making use of the buffer pointer in HttpState. It’s not clear to me why you are using that datatype in this code.

That’s for use by the HTTP server, and may even be a NULL pointer if it corresponds to a socket that isn’t connected. If you update your code to print the value of hstate.buffer, you’ll probably find that it’s 0. Even if it isn’t NULL, you don’t know what it’s pointing to (could be less than 150 bytes) or if the contents are otherwise in use.
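As a rough sketch of that check in plain C (DemoState and checked_copy are hypothetical names for illustration, not the real HttpState layout or a library function; on the Rabbit you would use the far string call in place of strncpy):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a library-owned state struct: only the
   buffer pointer matters here. This is NOT the real HttpState layout. */
typedef struct {
    char *buffer;
} DemoState;

/* Copy at most dst_size-1 bytes of src into dst and always terminate.
   Returns 0 on success, -1 if either pointer is NULL or dst_size is 0. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;                    /* refuse to write through NULL */
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    return 0;
}

/* Print the pointer value before using it. */
void report_buffer(const DemoState *st)
{
    assert(st != NULL);
    printf("state buffer = %p\n", (void *)st->buffer);
}
```

Copying into a buffer that your own code declares, with a size you know, avoids both the NULL case and the unknown-length case.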

Also, this seems completely unrelated to the semaphore issue you originally posted about. With this change in place, do your semaphores now work correctly?

You’re right, but I simplified the firmware a lot. In reality the config string comes from a GPRS connection that uses the HTTP server (hence the HttpState structure).

About the semaphores: as I wrote previously, the switching works correctly only if I close serB at 19200 and reopen it at 460800 before I release the semaphore.

Thank you again for your time!