UMQTT is not working after some arbitrary amount of time, while using Deep Sleep - Xbee3 NB-IoT

I am trying to test deep sleep mode (pin disturb) using MicroPython code (x.sleep_now() and x.wake_reason()).
As using the Development kit triggers the SLEEP_REQ pin (pin #9) somehow, I am just applying 3.5V to pin 1 from a power supply.

While doing that, to check whether the code is running correctly, I am publishing all the print statements
through MQTT. That is, I am sending all relevant variables to an MQTT server.

I noticed that after some arbitrary amount of time, my code stops sending anything to the server.

I just ran a 17-hour test in which the module wakes up at 10-minute intervals and sends some data to the server.

The module did not miss any data transmission or reboot in between. But it stopped connecting (or stopped sending to the server) after 17 hours.

Even though I have programmed the module to reboot after 30 unsuccessful connection attempts, it seems it does not connect to the network/internet at all, even after rebooting.

Why would the connection drop all of a sudden after working well for 17 hours?

It worked just fine again as soon as I unplugged it and plugged it back in.

As I cannot connect the module to the PC to see print statements on the terminal, I do not get to see the error message.

If I use the normal sleep method (time.sleep()) instead of the deep sleep method (x.sleep_now()), the code runs as many times as I want it to. I haven't tested that for 17 hours, though, as the project would not need it.

My SM is set to 0 (Normal).

I have tried using SM = 0 (Normal) and SM = 5 (Cyclic Sleep Pin Wake).

I am on AP = 4 (MicroPython).

How can I resolve this?

Are you sure you are handling any MicroPython exceptions? If the MicroPython application stops because of any exception it will not automatically restart.
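A minimal sketch of that idea, assuming you can wrap your whole application in one function: catch anything that escapes it and restart instead of letting the script die silently. The names `run_guarded`, `app`, `reset` and `log` are made up for this example. On the module you would probably pass something like `machine.reset` as `reset`, but check that against the MicroPython documentation for your firmware.

```python
def run_guarded(app, reset, log=print):
    """Run app() once; if it raises, log the error and call reset().

    app   -- hypothetical function holding the wake/publish/sleep logic
    reset -- callable that restarts the device (assumption: e.g. machine.reset)
    log   -- callable used to report the error (print by default)

    Returns True if app() finished normally, False if it raised.
    """
    try:
        app()
    except Exception as exc:
        # Without this handler the script would stop here and stay stopped.
        log("unhandled error: %r" % (exc,))
        reset()
        return False
    return True
```

Typical use would be a last line such as `run_guarded(main, machine.reset)` at the bottom of the script, so a DNS or socket error ends in a restart rather than a halted application.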

Also, when the module stops connecting, are you able to inspect the state in any way? For example, you could toggle some of the DIO pins and inspect their state with a multimeter or LED. You can also inspect the ON_SLEEP (pin 13) line.

It could also be that the carrier is preventing you from registering. You shouldn’t go to sleep and wake up more than 5 times an hour (maybe less). When you get into this state, what does ATAI read? If it is 0x25, then the network has blacklisted your device for registering and de-registering too often.
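To put numbers on this: a 10-minute wake interval is 6 wakes per hour, which is already above the 5-per-hour rule of thumb. The sketch below checks that arithmetic and reads AI from code. It assumes the AT command is passed in as a callable (on the module that would presumably be `xbee.atcmd`, returning an integer for AI). Only the two values discussed here are decoded, 0x00 (connected) and 0x25 (registration denied); the function names are made up for the example, and the 5-per-hour limit is the rule of thumb above, not a carrier specification.

```python
MAX_WAKES_PER_HOUR = 5  # rule of thumb from this thread, not a carrier spec


def wakes_per_hour(interval_s):
    """Number of wake-ups per hour for a given wake interval in seconds."""
    return 3600 / interval_s


def registration_state(atcmd):
    """Read AI through the supplied atcmd callable and classify it.

    atcmd -- callable taking the command name (assumption: xbee.atcmd)
    """
    ai = atcmd("AI")
    if ai == 0x00:
        return "connected"
    if ai == 0x25:
        return "registration denied"
    # Any other value: report it raw so it can be looked up in the user guide.
    return "other (0x%02X)" % ai
```

With these, `wakes_per_hour(600)` gives 6, and keeping to 5 wakes per hour would mean an interval of at least 720 seconds (12 minutes). Publishing the `registration_state` string once you are back online, or showing it on a pin or LED while stuck, would tell you whether the carrier is the cause.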

I have made a PCB that does not disturb the deep sleep,

which allows me to see the terminal while the module is sleeping.
It turns out I was getting a DNS error, and I now handle it.

But sometimes, no matter how many times the code restarts, it just will not connect. Powering it off and on again clears the problem.

I thought of this as well. But powering it off and on again solves the problem.

But as it is an IoT device that will just sit in some remote area, we cannot solve the problem this way.

If nothing else works to resolve the connection issue, you could try the AT!R command to reset the cellular component. The documentation says “CAUTION! This command is for advanced users, and you should only use it if the cellular component becomes completely stuck while in Bypass mode. Normal users should never need to run this command. See the FR (Force Reset) command instead.” and so you should not be required to do it, but it is something you could try.

When you are stuck, it would be good to know the ATAI values being reported.

Also, version 0x11413 of the firmware was just released. https://www.digi.com/support/productdetail?pid=5635&type=documentation

Just a quick note, you can turn the cellular modem off and perform soft resets and hard resets from the code. I was getting hangs, so I programmed in some power cycles of the modem, soft resets, and hard resets, usually in that order. Another note is to turn off the modem (put it in airplane mode) before performing a hard reset. You want to make sure you disconnect from the cell network. I observed that disconnecting first greatly improved the time to re-connect after the hard reset. It may have something to do with the carrier as another user mentioned below.
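Roughly how I order those steps, as a sketch only: the function picks the next recovery step from the number of failed connection attempts. The step names, `next_recovery_step` and the `per_level` threshold are invented for this example. Because a soft or hard reset restarts the script, the failure count would have to be kept somewhere that survives a reset (a file, for instance), which is not shown here. Mapping each name to the real calls on the module is up to you.

```python
def next_recovery_step(failed_attempts, per_level=10):
    """Choose the next recovery step from the count of failed attempts.

    Escalates every `per_level` failures:
      retry -> cycle_modem -> soft_reset -> airplane_then_hard_reset
    """
    level = failed_attempts // per_level
    if level == 0:
        return "retry"  # just try to connect again
    if level == 1:
        return "cycle_modem"  # turn the cellular modem off and back on
    if level == 2:
        return "soft_reset"  # restart the MicroPython application
    # Last resort: airplane mode first, so the module leaves the cell
    # network cleanly, then a hard reset.
    return "airplane_then_hard_reset"
```

With the default threshold, attempts 0 to 9 just retry, 10 to 19 cycle the modem, 20 to 29 soft reset, and anything from 30 on goes to airplane mode followed by a hard reset.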

One last thing that might help. In the simple MQTT library, there is no timeout set. I recommend setting one. This way, if something sticks during a publish, the connection will time out and throw an exception. Once thrown, you can try to republish the data.
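Something along these lines, as a sketch rather than a drop-in: it assumes a umqtt.simple-style client that exposes `connect()`, `publish()`, `disconnect()` and a `sock` attribute holding the socket after `connect()`, and that the socket supports `settimeout()` and raises `OSError` when it times out. `publish_with_timeout` and its parameters are names made up for this example. Note that the timeout is only set after `connect()` returns, so it does not cover the connect itself.

```python
def publish_with_timeout(client, topic, payload, timeout_s=30, attempts=3):
    """Publish one message, retrying on socket errors or timeouts.

    client -- umqtt.simple-style MQTT client (assumption: has a .sock
              attribute that is a socket once connect() has returned)
    Returns True once a publish succeeds, False after `attempts` failures.
    """
    for _ in range(attempts):
        try:
            client.connect()
            # Without a timeout a stuck publish can block forever.
            client.sock.settimeout(timeout_s)
            client.publish(topic, payload)
            client.disconnect()
            return True
        except OSError:
            # Timeout or network error: drop the socket and try again.
            try:
                client.sock.close()
            except Exception:
                pass  # socket may be missing or already closed
    return False
```

If this returns False, that is the point to count a failed attempt and decide whether to go back to sleep or escalate to a modem or module reset.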