Ports stop responding


Our customers using Digiboards have been reporting problems, and our testers were able to reproduce them, but it wasn’t until today that I captured a failure while running my own tracing build of our software.

The basic symptom is that, eventually, a port on the multimodem card stops responding (we use an AccelePort RAS 8 in-house). Ultimately, all of the ports reach this state. The only fix we’ve found is to power cycle the computer. A warm reboot does not help, nor does sending ATZ or any other string we can think of.

Our software calls out to special devices that we manufacture, communicates with them using DTMF, and then, if the DTMF negotiation succeeds, switches to modem communication to do the important work (such as reconfiguring the devices). Because of the DTMF, we require modems that support AT+FCLASS=8.

We start each call by sending ATE1V1, then various ATIs to identify the modem, then AT+FCLASS=8 and AT+FCLASS=0 to ensure we can switch modes, then an initial dial string which lacks the final digit but ends with a semicolon, so the modem returns to command mode after dialing. Then we switch to AT+FCLASS=8, set a few parameters, and dial the final digit while in voice mode, so that we don’t lose any DTMF or other detection (which can happen with some modems if we aren’t in voice mode when the call connects).
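To make the ordering concrete, here is a minimal Python sketch of that sequence (our real code is .NET; this is only an illustration). The function name, the ATI query, and the phone number are made up, the voice-mode parameters are left for the caller to supply, and dialing the final digit with a bare ATD is an assumption.

```python
def build_call_setup(number, ati_queries=(), voice_params=()):
    """Return the AT commands for one call, in the order described above.

    number       -- full number to dial (hypothetical example below)
    ati_queries  -- whichever ATI commands are used to identify the modem
    voice_params -- whichever parameters are set after entering voice mode
    """
    if len(number) < 2:
        raise ValueError("need at least two digits to split the dial string")
    head, last = number[:-1], number[-1]

    commands = ["ATE1V1"]                        # echo on, verbose result codes
    commands += list(ati_queries)                # identify the modem
    commands += ["AT+FCLASS=8", "AT+FCLASS=0"]   # check that we can switch modes
    commands.append("ATD" + head + ";")          # all but the last digit; ';' keeps command mode
    commands.append("AT+FCLASS=8")               # enter voice mode before the call connects
    commands += list(voice_params)
    commands.append("ATD" + last)                # final digit, dialed in voice mode (assumed form)
    return commands


if __name__ == "__main__":
    # Placeholder number and ATI query, for illustration only.
    for cmd in build_call_setup("5551234", ati_queries=("ATI3",)):
        print(cmd)
```

Each command would be followed by a wait for the modem's result code before the next one is sent.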

At this point, we begin sending specific DTMF digits until we get a response from our device (or time out). Our device responds with several DTMF digits, and from them, we compute a hash based on a known password and the digits we received, then transmit it to the device. If the device likes our hash, it responds with a DTMF #B, and then it puts its modem into answer mode.
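The shape of that challenge-response step can be sketched in Python as follows. This is not our actual algorithm: SHA-256, the way the password and challenge are combined, the response length, and the function name are all stand-ins chosen for illustration.

```python
import hashlib

DTMF_DIGITS = "0123456789"  # restrict the reply to digits for simplicity


def response_digits(password, challenge, length=4):
    """Derive a DTMF-sendable reply from a shared password and the digits
    the device sent us. Hypothetical scheme, for illustration only."""
    material = (password + ":" + challenge).encode("ascii")
    digest = hashlib.sha256(material).digest()
    # Map the first `length` digest bytes onto DTMF digits.
    return "".join(DTMF_DIGITS[b % 10] for b in digest[:length])


if __name__ == "__main__":
    # Placeholder password and challenge digits.
    print(response_digits("example-password", "4071"))
```

The point is only that both sides can compute the same reply from the password and the challenge digits, so the password itself never goes over the line.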

At this point, we switch back into AT+FCLASS=0, and then send ATD to go into dial mode. If all goes well, the modems sync and then we can begin reprogramming the unit or downloading logs from it or whatever.

The problem is that, at some point, that final ATD goes unanswered. The modem no longer sends any data back to my program, and we eventually time out. Subsequent attempts to send any commands to that port fail with no response from the modem at all. (Since our initial command for the “next call” is ATE1V1, it isn’t that we’ve disabled responses.) There simply is no longer any incoming data from that port at all.
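A check for this "completely silent" state could look like the Python sketch below. It assumes a port object with a `write(bytes)` method and a non-blocking `read(n)` that returns `b""` when nothing is waiting; that interface, the function name, and the timeout values are assumptions for illustration, not our actual code.

```python
import time


def probe_port(port, command=b"ATE1V1\r", timeout=2.0, poll=0.05):
    """Send a command and collect whatever comes back within `timeout` seconds.

    Returns the bytes received. An empty result means the port is silent:
    with echo enabled, even the echo of the command should have arrived.
    """
    port.write(command)
    deadline = time.monotonic() + timeout
    received = b""
    while time.monotonic() < deadline:
        chunk = port.read(64)          # assumed non-blocking
        if chunk:
            received += chunk
            if b"OK" in received or b"ERROR" in received:
                break                  # got a final result code
        else:
            time.sleep(poll)           # nothing yet; wait briefly and retry
    return received
```

A port that returns nothing here, not even the echoed command, matches the failure described above, and logging the time of the first empty probe per port would show which port went first and after how many calls.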

The problem appears only after hundreds, possibly thousands, of calls have been made. Which port goes first is random.

My application is a .NET 2.0 app (or 3.5 if I feel like switching it), and I am currently using the .NET SerialPort class. We also experienced this problem when we were using straight Win32 calls: we were initially a .NET 1.1 app, which didn’t have the SerialPort class, so we were on Win32 until just recently, when I rearchitected this to resolve some other issues.

What do you need from me in order to help us resolve this issue? Is there some way to trace what your modems are doing, see what state they’re in, determine if they’re still functioning, etc., that we could run while we’re doing our tests to collect information for you to resolve this?

Thank you,

John Cullison