SNTP Error recovery

steved1 · June 3, 2008, 9:13am

Has anyone established the default error handling process when none of the configured SNTP time servers are available?

At the moment I’ve implemented RDeabill’s suggestion of restarting the SNTP process on server timeout. It does work, but can clog up the network a bit - on my LAN it restarts five times/second.

I think it’s usually the case that ‘server not available’ indicates a network problem (in my case, a cable fallen out), so we don’t really want to try to reach the server too frequently. Equally, we don’t (necessarily) want to wait a whole synchronisation interval before retrying. For my application, I would go for a retry interval in the range 1-10 minutes.
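To make the 1-10 minute range concrete, here is a minimal sketch of one way to pick the delay: start at one minute and double after each consecutive failure, capping at ten minutes. The function name and the two constants are hypothetical and are not part of NET+OS; the doubling policy is just one reasonable choice, not something the SNTP library provides.

```c
#include <stdint.h>

/* Hypothetical bounds, taken from the 1-10 minute range discussed above. */
#define RETRY_MIN_SECONDS 60u
#define RETRY_MAX_SECONDS 600u

/*
 * Return the delay (in seconds) to wait before the next retry, given how
 * many consecutive attempts have failed so far. The delay starts at the
 * minimum, doubles with each further failure, and saturates at the maximum.
 * Hypothetical helper: not a NET+OS or ThreadX function.
 */
uint32_t sntp_retry_delay_seconds(uint32_t consecutive_failures)
{
    uint32_t delay = RETRY_MIN_SECONDS;
    uint32_t i;

    /* The first failure uses the minimum; each later failure doubles it. */
    for (i = 1; i < consecutive_failures && delay < RETRY_MAX_SECONDS; i++) {
        delay *= 2u;
    }

    /* Clamp in case the last doubling overshot the ceiling. */
    return (delay > RETRY_MAX_SECONDS) ? RETRY_MAX_SECONDS : delay;
}
```

The caller would keep a failure counter, reset it to zero on a successful sync, and multiply the result by the system tick rate before handing it to a timer or sleep call.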

Any thoughts on how to control the retry process a bit?

(Using Net+OS 7.1 ATM; moving to 7.3/7.4 shortly)

rdeabill · June 3, 2008, 10:33am

Hi,

I have not really looked at what happens when the restart does not work, but I would think a solution here is to use a ThreadX timer.

When initialising the module with the SNTP routines, create the timer but do not start it. Set up a simple timer function to do the actual restart. In the callback routine, when you get the timeout, start the timer rather than restarting SNTP directly. When the callback routine indicates success, disable the timer.

I have not checked this but it’s one way you may be able to do it.
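The decision logic described above can be sketched as a small state machine that is independent of the RTOS. This is only an illustration: the event, action, and type names are hypothetical and are not NET+OS or ThreadX identifiers, and it says nothing about whether the SNTP restart call itself will succeed.

```c
#include <stdbool.h>

/* Hypothetical events: what the SNTP callback or the timer reports. */
typedef enum { EV_SYNC_OK, EV_SERVER_TIMEOUT, EV_TIMER_EXPIRED } retry_event_t;

/* Hypothetical actions: what the caller should do in response. */
typedef enum {
    ACT_NONE,
    ACT_START_TIMER,
    ACT_STOP_TIMER,
    ACT_RESTART_SNTP
} retry_action_t;

typedef struct {
    bool timer_running; /* true while a deferred restart is pending */
} retry_state_t;

/* Map one event to one action and update the pending-restart flag. */
retry_action_t retry_step(retry_state_t *st, retry_event_t ev)
{
    switch (ev) {
    case EV_SERVER_TIMEOUT:
        if (st->timer_running) {
            return ACT_NONE;          /* a restart is already scheduled */
        }
        st->timer_running = true;
        return ACT_START_TIMER;       /* defer the restart */
    case EV_TIMER_EXPIRED:
        st->timer_running = false;
        return ACT_RESTART_SNTP;      /* do the restart from the timer */
    case EV_SYNC_OK:
        if (!st->timer_running) {
            return ACT_NONE;
        }
        st->timer_running = false;
        return ACT_STOP_TIMER;        /* server is back, cancel the retry */
    }
    return ACT_NONE;
}
```

In a real build the actions would presumably map onto the ThreadX timer services (activating and deactivating a timer created at initialisation) and the SNTP restart call; that wiring is an assumption and is untested here.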

steved1 · February 4, 2009, 5:36pm

Just had another go using Net+O/S 7.4 - no change from my previous post.

Seems to be no way to prevent SNTP either flooding the network connection, or locking up completely, in the event of a network or time server error.

charliek · February 4, 2009, 6:49pm

If you want to delay the retry process, just put a tx_thread_sleep() in your callback before you restart the SNTP process. I.e. something like:

if (status == NASNTP_SERVER_TIMEOUT) /* compare with ==, not assign with = */
{
    tx_thread_sleep(NABspTicksPerSecond * 60 * 10); /* ten minute delay */
    NArestartSntpServer(priAddr, secAddr);
}
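The delay in the snippet above is a tick count, so the arithmetic depends on the system tick rate. As a minimal sketch, the conversion can be wrapped in a helper that saturates instead of overflowing if someone later plugs in a very long interval. The helper name is hypothetical, and it assumes the sleep call takes an unsigned long tick count.

```c
#include <limits.h>

/*
 * Convert a delay in minutes to system ticks, saturating at ULONG_MAX
 * rather than wrapping around on overflow.
 * Hypothetical helper: not a NET+OS or ThreadX function.
 */
unsigned long minutes_to_ticks(unsigned long minutes,
                               unsigned long ticks_per_second)
{
    unsigned long seconds;

    /* Guard the minutes-to-seconds multiplication. */
    if (minutes > ULONG_MAX / 60ul) {
        return ULONG_MAX;
    }
    seconds = minutes * 60ul;

    /* Guard the seconds-to-ticks multiplication (and avoid dividing by 0). */
    if (ticks_per_second != 0ul && seconds > ULONG_MAX / ticks_per_second) {
        return ULONG_MAX;
    }
    return seconds * ticks_per_second;
}
```

With a helper like this, the ten minute delay would be written as a call with 10 minutes and the tick rate, which keeps the unit explicit at the call site.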

steved1 · July 18, 2008, 11:13am

Finally got to play with this, and initial results not looking too promising.

Testing by simply unplugging the network cable. The ME duly recognises that it’s lost the network connection.
Next SNTP sync request fails, and I start a timer in the callback routine.

When the timer expires I try to restart the SNTP task (using same time servers) - it returns with error -1 (NASNTP_INVALID_STATE).

Everything then locks up (possibly not helped by the absence of retries in my code). Even when the network connection is restored, SNTP doesn’t appear to restart.
I’ve been using quite fast times for testing - 2-minute sync interval and 1-minute retry - and nothing happens even after an hour or two.

Anyone else tried something similar?

sparkys_dad · July 18, 2008, 12:55pm

I believe the correct way to handle this is as follows:

In the callback routine, return success (even though the status sent to the callback routine told you that no sntp server is available). This tells the sntp thread to keep trying to find a server.

Also, there have been remedial changes made in this area. Make sure you have the latest patches from Digi’s web site.

steved1 · July 30, 2008, 2:18pm

Been playing with this some more, and not getting very far!

If you return NASNTP_SUCCESS from the callback when the server’s not available, the SNTP handler immediately retries - exactly the condition I’m trying to avoid!

I also tried returning a negative value from the callback, which the docs say stops SNTP. I also triggered a timer, and on expiry attempted to start SNTP (‘start’ seeming logical, since it should have stopped). This gives a NASNTP_SYSTEM_FAILURE error. If I try to restart the SNTP server after the timeout, I get a NASNTP_INVALID_STATE error instead.

Still using NET+O/S 7.1 (with latest patches) ATM - anyone got something like this to work yet?

steved1 · February 5, 2009, 9:27am

Works nicely, thank you.

(I tend to think of a ‘callback’ as something that should happen instantly, and not 10 minutes later!)
