XBee3 "OSError: [Errno 7107] ENOTCONN" while connected

esimioni · August 10, 2023, 5:53pm

The 7107 ENOTCONN occurs between 1 minute and 1 day after powering on and the only way to recover is by power cycling the module, which is not a solution for my needs.

Odd facts

Upon starting there is a loop checking if xbee.atcmd('AI') == 0 before trying to send any message.
Immediately after confirming it is connected a first message is sent. This happens a few milliseconds after the check, and in many cases the first message already fails with ENOTCONN.
When it does get past the first message, the code runs in a loop and catches any exception (BaseException). A repr(exception) string is sent via Zigbee to the coordinator.
Although rare, sometimes OSError: [Errno 7107] ENOTCONN is caught and the code still succeeds in sending the message with the error over the network, just a few milliseconds after it was raised.
I have a watchdog set for 30 seconds, so when the problem happens, the XBee is soft rebooted 30 s later and what is described in item 1 happens again, only with “always” in place of “in many cases”.
In this state it stays in a loop of being rebooted by the WDT and is never able to send a message. It is irrecoverable unless power cycled.
If I connect another XBee to XCTU and scan the network, the problematic XBee shows up there, even when it is in the irrecoverable loop described above.
Power cycling always solves the problem, at least until it fails again
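
For reference, the flow described in the list above can be sketched roughly as follows. This is a minimal sketch, not the original script: the helper names wait_until_joined and send_with_report, and the max_polls and poll_s parameters, are made up for illustration. The AT command and transmit functions are passed in as arguments so the logic can also be exercised off-device.

```python
import time

try:
    import xbee  # only available in MicroPython on the XBee 3 itself
except ImportError:
    xbee = None  # lets the sketch be read and exercised off-device

ENOTCONN = 7107  # errno value taken from the error message in this thread


def wait_until_joined(atcmd, max_polls=120, poll_s=0.5, sleep=time.sleep):
    """Poll AI until it reads 0 (joined); give up after max_polls tries."""
    for _ in range(max_polls):
        if atcmd('AI') == 0:
            return True
        sleep(poll_s)
    return False


def send_with_report(transmit, dest, payload):
    """Send payload; on any exception, try once to send repr(exception).

    Returns None on success, otherwise the exception that was caught.
    """
    try:
        transmit(dest, payload)
        return None
    except BaseException as exc:
        try:
            transmit(dest, repr(exc))
        except BaseException:
            pass  # the report itself may fail with the same error
        return exc
```

On the module this would be called as wait_until_joined(xbee.atcmd) followed by send_with_report(xbee.transmit, xbee.ADDR_COORDINATOR, 'hello'). As the thread shows, AI reading 0 did not guarantee that the following transmit would succeed on the affected firmware.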

This is an XB3-24 in non-sleeping (Router) mode
I can put it close to the coordinator or any other stable node and still the problem happens
My mesh is moderate in size, 65 nodes, 61 being routers
My mesh is stable, I don’t have problems with disconnecting devices (except the XBee)
The power source to the XBee is stable
RSSI doesn’t seem to be related. I’m reporting it every 5 minutes: sometimes it is close to -90 dBm and there is no problem, while other times the last reported RSSI is close to -70 dBm and the problem happens

What are the options that I have to deal with this problem?

mvut · August 11, 2023, 1:24pm

Is this on a Router or end device?

Have you tried issuing a Node Discovery to see if you can find the remote node?

Have you tried issuing a local Network Reset instead of rebooting?

esimioni · August 11, 2023, 1:51pm

What is the proper way of doing a Network Reset?
I’ve seen a command for that, but it says it resets all network parameters, so not what I need.
I’ve also read about the Network Watchdog, but I’m not sure if it would work together with the Watchdog Timer.

mvut · August 11, 2023, 2:13pm

It would be issuing an NR0.

esimioni · August 11, 2023, 2:38pm

Right, the documentation reads:

“Resets network layer parameters on one or more modules within a PAN. Responds immediately with an OK then causes a network restart. The device loses all network configuration and routing information.”

If the device loses all network configuration, it seems to me that it wouldn’t be able to rejoin the network without adjusting the parameters again and re-pairing it.

mvut · August 11, 2023, 2:51pm

That would depend on what functions you are using. If your Joining is enabled and open, then I would not expect it to be an issue.

If you prefer, you can issue an ATFR command. This command triggers the module to reset. This is the same thing as triggering the reset line, but from an AT command interface instead.
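
From MicroPython, the two options discussed here (NR0 and FR) could be issued roughly like this. It is a sketch under assumptions: send_or_reset and is_enotconn are hypothetical helper names, and it assumes xbee.atcmd accepts 'NR' with an integer value and 'FR' with no value, as it does for other AT commands.

```python
ENOTCONN = 7107  # errno value taken from the error message in this thread


def is_enotconn(exc):
    """True if exc is an OSError whose first argument is 7107."""
    return isinstance(exc, OSError) and bool(exc.args) and exc.args[0] == ENOTCONN


def send_or_reset(transmit, atcmd, dest, payload, use_network_reset=False):
    """Try to send; on ENOTCONN issue NR0 or FR instead of a soft reboot.

    Returns 'sent', 'network_reset' or 'module_reset'. Other errors are
    re-raised unchanged.
    """
    try:
        transmit(dest, payload)
        return 'sent'
    except OSError as exc:
        if not is_enotconn(exc):
            raise
    if use_network_reset:
        atcmd('NR', 0)  # ATNR0: network reset on the local node only
        return 'network_reset'
    atcmd('FR')  # ATFR: module reset; on the device the code may not get past this call
    return 'module_reset'
```

On the module the call would look like send_or_reset(xbee.transmit, xbee.atcmd, xbee.ADDR_COORDINATOR, 'hello'). Whether NR0 is safe depends on the join settings of the network, as noted above.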

esimioni · August 16, 2023, 12:03am

The ATFR solved the problem of soft rebooting and immediately getting the same ENOTCONN error.
But I was still not happy. These devices should not have such problems, and a reboot at least once a day is not a solution for my needs; I need them to be stable.

Then I decided to downgrade the firmware from 1012 to 100D, and voila! It has been running for more than 24h without errors!
The firmware version 1012 seems to have introduced a bug.

I guess I should open a support case, right @mvut?

Edit: 100D is not the previous version, testing now with 1010. But it does work properly with 100D and not with 1012.

esimioni · August 28, 2023, 10:15pm

Answer from Digi’s tech support:

Thank you for reporting this bug. At this time, it has been reported by others as well and the only current option for resolution is to downgrade from 1012 to 100D or 1010.

Our firmware engineering group is working on a fix that should be included in the next firmware release. We do not have an estimate at this time for when that release will happen.

esimioni · April 10, 2024, 2:32pm

For anyone wondering, this problem has not been fixed on firmware version 1013. Still waiting for a solution that doesn’t involve using a very old firmware.
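
A script could warn at startup when it is running on one of the versions mentioned in this thread. This is only a sketch: firmware_status is a hypothetical helper, the version lists reflect only what was reported here, and it assumes xbee.atcmd('VR') returns the firmware version as an integer (for example 0x1012).

```python
# Versions as reported in this thread only; not an official list.
AFFECTED = (0x1012, 0x1013)
REPORTED_OK = (0x100D, 0x1010)


def firmware_status(vr):
    """Classify a VR value against the versions named in this thread."""
    if vr in AFFECTED:
        return 'affected'
    if vr in REPORTED_OK:
        return 'reported-ok'
    return 'unknown'
```

On the module this would be called as firmware_status(xbee.atcmd('VR')), and the result could be logged or sent to the coordinator along with the other status messages.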
