Bad cluster in mod_FAT?

Using DC 9.50 on an RCM3315 with mod_FAT 2.12, I have an intermittent problem that looks like a bad cluster. I'm posting here in the hope that someone can help me understand the symptom and tell me how Rabbit recommends handling it.

My code saves its machine settings as text, in key=val format, in a file on the FAT partition called STARTUP.FIL. The code is deployed worldwide on hundreds of RCM3315 boards. It “almost always” works great, but in the past year there have been around 10 cases of systems losing their settings. The problem is usually intermittent on affected systems, but today I received a 3315 that shows the problem on every reset.

It turns out that STARTUP.FIL is in cluster 0, and every time I try to write to it, fat_Write returns -EINVAL. This happens in FAT_SHELL.C as well as in my application code.


A> ls
Listing '' (dir length 16384)
          pj rhsvDa len=0       clust=2
          pf rhsvDa len=0       clust=3
 startup.fil rhsvda len=0       clust=0
prot_mem.fil rhsvdA len=22      clust=5
A> wr startup.fil 128
! Last write terminated with rc -22
File '/startup.fil' written with 0 bytes out of 128


Eventually I deleted STARTUP.FIL in FAT_SHELL.C and was able to write (and overwrite) a new copy of that file in a different cluster. Going back to my real application, everything started working fine.

The mod_FAT documentation says that fat_Write returns -EINVAL if “file, buf, or len contain invalid values”, but I think it’s clear from the above that it also returns -EINVAL if it runs into something “bad” on the Flash device.

So, is it an undocumented responsibility of application code using mod_FAT to monitor fat_Write calls for the -EINVAL response and do something to remap the failing file to another cluster?
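For illustration, here is a minimal sketch of what "monitoring fat_Write" could look like on the application side. It is plain C, not tested under Dynamic C, and it assumes that fat_Write returns the number of bytes written on success and a negative error code on failure, which matches the rc -22 in the shell output above. The function classify_write_result and the WRITE_* constants are hypothetical names made up for this example; they are not part of mod_FAT.

```c
#include <errno.h>

/* Hypothetical result codes for this sketch (not mod_FAT names). */
#define WRITE_OK      0   /* every requested byte was written          */
#define WRITE_SHORT   1   /* no error code, but fewer bytes than asked */
#define WRITE_EINVAL  2   /* the -EINVAL case discussed in this post   */
#define WRITE_ERROR   3   /* any other negative return code            */

/*
 * Classify the value returned by a write call.
 * rc        - return value of the write (assumed: byte count or -errno)
 * requested - number of bytes the caller asked to write
 */
int classify_write_result(int rc, int requested)
{
    if (rc == -EINVAL) {
        return WRITE_EINVAL;
    }
    if (rc < 0) {
        return WRITE_ERROR;
    }
    if (rc < requested) {
        return WRITE_SHORT;
    }
    return WRITE_OK;
}
```

The application would pass in the return value of fat_Write(file, buf, len) together with len, and on WRITE_EINVAL fall back to whatever recovery it chooses, for example deleting and re-creating the file as I did by hand in FAT_SHELL.C. Whether that is the recommended recovery is exactly what I am asking.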

As I said above, I hope someone with fairly deep knowledge of mod_FAT can help me understand this problem and the recommended mitigation for it.

Thanks,
Larry

To answer my own post: per Rabbit tech support, no file should ever occupy cluster 0 or 1. I wrote a wrapper for file_Open that checks the allocated cluster, but I have never seen it fire. Rabbit tech support offered to help diagnose the problem if I could define a repeatable test, but the problem is too intermittent for that.
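The check itself is simple. Below is a rough sketch of the idea in plain C, not my actual wrapper and not tested under Dynamic C. It relies on the general FAT convention that cluster numbers 0 and 1 are reserved and data clusters are numbered from 2 upward. The function name and the data_cluster_count parameter are hypothetical; how the first cluster of a file is obtained from mod_FAT is left out, since that depends on the library's own structures.

```c
/* In FAT, cluster numbers 0 and 1 are reserved; data clusters start at 2. */
#define FIRST_DATA_CLUSTER 2UL

/*
 * Return nonzero if 'cluster' lies inside the data area of a volume
 * that has 'data_cluster_count' data clusters, zero otherwise.
 * Valid data clusters are 2 .. data_cluster_count + 1.
 * Both names are hypothetical and exist only for this sketch.
 */
int cluster_in_data_area(unsigned long cluster,
                         unsigned long data_cluster_count)
{
    if (cluster < FIRST_DATA_CLUSTER) {
        return 0;   /* reserved cluster 0 or 1: the case seen here */
    }
    return (cluster - FIRST_DATA_CLUSTER) < data_cluster_count;
}
```

With this, the startup.fil entry shown earlier (clust=0) would be flagged, while pj, pf and prot_mem.fil (clust=2, 3 and 5) would pass on any volume with at least four data clusters.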

It is also possible that the FAT got corrupted after allocation. Time will tell.

Larry