Modules some kind of semi-dead?

Hello,
we are using the Digi Connect ME C-Type with our own software as the Ethernet interface for our measurement devices.

I now have several devices returned from customers that seem to be dead but are not really.

Our own application is obviously not running.

When I try to discover the Digi with “netosprog /discover”, it is not found. But when I watch this procedure with Wireshark, I see that ARP is still working (with the correct MAC address): the Digi asks for the MAC of the host, but nothing more happens.

Other test:
When I switch on the device and watch this with Wireshark, I see the Digi’s usual “V2 membership report” IGMP message directed to IP 224.0.5.128 (a multicast address?). What does this “V2…” mean anyway?

Therefore I assume that the Digi’s hardware is OK, but that something bad has happened to some of its flash contents.

We have already had some Digis behaving like that, and because they are not that cheap, I’d like to stop it.
Does anyone have an idea:

  • what can cause such problems?
  • how can they be recovered? (I assume that if I could re-flash them, they would be OK?)

Thank you very much for every bit of information,
AD

Which NET+OS version are you using? A V2 Membership Report is an IGMP version 2 message: the device is telling other hosts and routers on the network that it will accept multicast data sent to 224.0.5.128. The fact that you see this report and that the module answers ARPs means the TCP/IP stack has started and is running. That in turn indicates you have very likely reached applicationStart (see netosStartup in bsproot.c, where the root thread starts the TCP/IP stack and then calls applicationStart). If the module is not responding as you would expect, something in your code is probably causing the root thread to hang or crash.

In terms of recovery: if the image in flash were corrupt, the boot loader would attempt to obtain a DHCP address. If the image is intact but crashes, you would need to have implemented a backdoor feature in the boot loader to recover the module.
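To make the IGMP part concrete, here is a minimal sketch in C of how a capture like the one described could be interpreted. It is not NET+OS code; the function names are made up for illustration, and the message type values are the ones defined in the IGMP RFCs (RFC 1112, RFC 2236, RFC 3376). It only looks at the type byte and does not verify the IGMP checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* IGMP message type values as defined in RFC 1112, RFC 2236 and RFC 3376. */
enum {
    IGMP_MEMBERSHIP_QUERY = 0x11,
    IGMP_V1_REPORT        = 0x12,
    IGMP_V2_REPORT        = 0x16,
    IGMP_V2_LEAVE         = 0x17,
    IGMP_V3_REPORT        = 0x22
};

/* Hypothetical helper: build a host-order IPv4 address from four octets. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Returns 1 if the host-order address lies in 224.0.0.0/4 (IPv4 multicast). */
int is_multicast_ipv4(uint32_t addr)
{
    return (addr & 0xF0000000u) == 0xE0000000u;
}

/* Names the IGMP message that starts at msg (the IGMP header, not the IP header). */
const char *igmp_type_name(const uint8_t *msg, size_t len)
{
    if (msg == NULL || len < 8)   /* an IGMPv1/v2 message is 8 bytes long */
        return "truncated";

    switch (msg[0]) {
    case IGMP_MEMBERSHIP_QUERY: return "membership query";
    case IGMP_V1_REPORT:        return "IGMPv1 membership report";
    case IGMP_V2_REPORT:        return "IGMPv2 membership report";
    case IGMP_V2_LEAVE:         return "IGMPv2 leave group";
    case IGMP_V3_REPORT:        return "IGMPv3 membership report";
    default:                    return "unknown";
    }
}

/* Group address of an IGMPv1/v2 message: bytes 4..7, network byte order. */
uint32_t igmp_group(const uint8_t *msg)
{
    return ipv4(msg[4], msg[5], msg[6], msg[7]);
}
```

Fed the IGMP bytes of the report seen at power-up, igmp_type_name() should classify the message as an IGMPv2 membership report, and is_multicast_ipv4() should confirm that 224.0.5.128 is a multicast group address.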

Hello Charlie,
thanks a lot for your quick answer.

We are using NET+OS Version 6.3 (Build 6.3.20.0).
The device does not try to obtain a DHCP address; I’ve tried that.

Our own software thread is not running, but if the TCP/IP stack is running, why is the device not found by netosprog /discover? Is there any possibility of flashing the device in this state when I only know the MAC address?

Another thought:
I have no idea what our customers did to the devices, but the worst thing I can think of is that they switched off the device while it was storing a new IP address with the function customizeWriteDevBoardParams(). (We are using static addressing and need to store the address in the Digi Connect. I was told that this is the only way to do so.)
I have tried to provoke that error several times before, but the worst that happened was that the board parameters fell back to the defaults of appconf.h. This has not happened to these devices, because the MAC address is still OK.
Do you think switching off the device during a flash write with customizeWriteDevBoardParams() can cause such behavior?

Thank you for your answers,
AD

I hate to say it, but it looks like you have a new paperweight. Without a JTAG, there’s really no way to flash a module whose firmware runs (i.e. passes the boot loader’s CRC check) but never gets to a point where you can update the firmware image through your own means.

I’ve attached a boot loader modification you can do to have a ‘back door’ for future recovery. Basically all you need to do is short pin 20 to ground during boot up and talk to the module through the serial port at 9600/8/n/1 no flow (see the readme for details).
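The idea behind such a back door can be sketched as a small decision function. This is only an illustration of the logic discussed in this thread, not the attached modification: the type and function names are hypothetical, and the two inputs (state of pin 20, result of the image CRC check) are assumed to be sampled by the boot loader before it is called.

```c
/* Hypothetical boot decision, illustrating the back-door idea only. */
typedef enum {
    BOOT_APPLICATION,      /* start the application image from flash      */
    BOOT_SERIAL_RECOVERY,  /* back door: wait for commands on the serial  */
    BOOT_NETWORK_RECOVERY  /* corrupt image: fall back to network recovery */
} boot_action;

/*
 * backdoor_pin_low: non-zero if the back-door pin is shorted to ground at boot.
 * image_crc_ok:     non-zero if the application image passed its CRC check.
 */
boot_action choose_boot_action(int backdoor_pin_low, int image_crc_ok)
{
    /* The back door is checked first, so it still works when the image
     * is intact but crashes or hangs after start-up. */
    if (backdoor_pin_low)
        return BOOT_SERIAL_RECOVERY;

    /* A corrupt image is caught by the CRC check. */
    if (!image_crc_ok)
        return BOOT_NETWORK_RECOVERY;

    return BOOT_APPLICATION;
}
```

Checking the pin before the CRC result is the point of the design: a module whose image is valid but misbehaves would otherwise always end up in BOOT_APPLICATION.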

If the device can respond to ARPs, then the TCP/IP stack is up. If it can’t be discovered, it could be that the default root password was changed, the ADDP cookie was changed, or the ADDP thread itself crashed (or was never started).

I wouldn’t expect a defaulted NVRAM to keep the module from booting. Also, if it had been defaulted, you would see the MAC change (as you pointed out). If the image flashed into the module had become corrupt, it would fail the boot loader’s CRC check.
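As an illustration of what such a CRC check can and cannot catch, here is a minimal sketch in C. It assumes the common CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320) and a stored checksum passed in separately; the thread does not say which CRC variant or image layout the NET+OS boot loader actually uses, so both are assumptions and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3). Assumed variant, not taken from NET+OS. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}

/* Returns 1 if the image bytes match the checksum stored alongside them. */
int image_is_valid(const uint8_t *image, size_t len, uint32_t stored_crc)
{
    return crc32_ieee(image, len) == stored_crc;
}
```

A check of this kind detects changed bytes in the image, for example from an interrupted write. It says nothing about whether the code behaves correctly once it runs, which is why an intact image that hangs still passes.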

I’d suggest adding the backdoor in case this happens again, and continuing to work with your customer to see if you can replicate the problem on your JTAG’d module, as it sounds like that’s what you’ll have to do if you want to figure out why it’s behaving poorly.

I have written the code to get a dynamic IP address. It works well.
    /* Switch the Ethernet interface between DHCP and static addressing
       and store the change in NVRAM. */
    void writeNetworkPara(int value)
    {
        devBoardParamsType nvParams;
        aceConfigInterfaceInfo *eth_config;
        int changed = 0;

        if (value != DHCP_OFF && value != DHCP_ON)
            return;

        /* Read the current board parameters and locate the Ethernet entry. */
        customizeReadDevBoardParams(&nvParams);
        eth_config = customizeAceFindInterfaceConfig(BP_ETH_INTERFACE, &(nvParams.aceConfig));
        if (eth_config == NULL)
            return;

        if (value == DHCP_ON && eth_config->dhcp_config.isEnabled != 1)
        {
            eth_config->static_config.isEnabled = 0;
            eth_config->static_config.auto_assign = 0;
            eth_config->dhcp_config.isEnabled = 1;
            eth_config->dhcp_config.suggested_ip_address = 0;
            eth_config->dhcp_config.gateway = 0;
            changed = 1;
        }
        else if (value == DHCP_OFF && eth_config->static_config.isEnabled != 1)
        {
            eth_config->static_config.isEnabled = 1;
            eth_config->static_config.auto_assign = 1;
            eth_config->dhcp_config.isEnabled = 0;
            eth_config->dhcp_config.suggested_ip_address = 0;
            eth_config->dhcp_config.gateway = 0;
            changed = 1;
        }

        /* Write to flash only if something actually changed. */
        if (NAGetAppUseNvram() && changed == 1)
        {
            customizeWriteDevBoardParams(&nvParams);
        }
    }