I am having an urgent problem with the socket API.
I am calling sendto() periodically on a blocking UDP socket. After a while, sendto() fails with errno 128 (Socket is not connected). Does anybody know what causes this error?
My function basically has the following structure:
I use the socket() function to create a UDP socket
I use the setsockopt() function for SO_SNDBUF, SO_RCVBUF and SO_BIO
I use the bind() function to bind the socket to a specific port
I use the sendto() for sending the data.
I use the receive function for receiving the response
At the end of the function, or in case of a failure, I close the socket
If I keep calling the function while error 128 is active, the error changes from 128 to errno 14 (bad address), even though I am still using the same IP address.
Does anybody have a solution or an explanation for this problem? I am working with NetOS 5.0.
NetOS 5.0? You do know that there have been at least 5 versions since then?
Anyway, from what I remember NetOS 5.0 had serious problems with blocking sockets, and setsockopt() only worked part of the time. You might want to check some of the really old messages in this forum.
As a workaround, here are some ideas:
Double check the code to make sure that you are calling closesocket(). Not doing so will destabilize things rapidly, exactly as you describe.
Take a look at using select(). Remember that it is NOT thread safe.
If you are not doing an ARP, RARP, or TFTP call, switch to TCP; it is much more stable.
Do not use TCP/UDP port 0.
I would highly recommend that you upgrade to at least 6.0b (where the TCP/IP stack was made stable). It is the last version that works with the original ME modules. All versions after that are for the new -C and -S modules.
You could also attach some of your code here so we could take a look.
Thanks a lot for your help. I have found the problem: I did not initialize the last parameter of the sendto() function, which gives the size of the struct sockaddr_in. This was responsible for the strange behaviour.
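For anyone who hits the same error, here is a minimal sketch of the send/receive sequence described in this thread, with the address-length argument of sendto() set explicitly. It is written against the POSIX/BSD socket names, which is an assumption: the thread indicates NetOS uses closesocket() rather than close(), and its headers may differ. The helper name udp_request and its parameters are hypothetical, and the setsockopt() calls from the original post are left out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send one datagram, wait (blocking) for one reply.
 * Returns the number of reply bytes, or -1 on any error. */
ssize_t udp_request(const char *ip, uint16_t dst_port, uint16_t local_port,
                    const void *req, size_t req_len,
                    void *resp, size_t resp_cap)
{
    struct sockaddr_in local, dest, from;
    socklen_t from_len = sizeof from; /* value-result argument: must be initialized */
    ssize_t result = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    /* Bind to the chosen local port on any interface. */
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(local_port);
    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0)
        goto out;

    /* Zero the destination first so no field is left uninitialized. */
    memset(&dest, 0, sizeof dest);
    dest.sin_family = AF_INET;
    dest.sin_port = htons(dst_port);
    if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1)
        goto out;

    /* The last argument is the size of the destination address structure. */
    if (sendto(fd, req, req_len, 0, (struct sockaddr *)&dest, sizeof dest) < 0)
        goto out;

    result = recvfrom(fd, resp, resp_cap, 0, (struct sockaddr *)&from, &from_len);

out:
    close(fd); /* single exit path, so the socket is closed on failure too */
    return result;
}
```

Passing sizeof dest directly, instead of a separate length variable, leaves no variable that can be forgotten. The single exit path also covers the earlier advice about always closing the socket.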
Nevertheless, I will change the blocking sockets to non-blocking sockets and use the select() function. You mention that select() is not thread safe. Where did you find this information? Unfortunately, I am not able to port my software to the newer version at the moment.
It seems that you have a lot of experience with NetOS. Maybe you have an idea about another problem of ours. Our software runs in a lot of devices all over the globe. At only one location do we have the problem that the devices partly crash (Ethernet communication) after a while. After adding some debug output over the serial interface, I found out that the run counter of the timer task thread no longer changes. Have you ever heard about this problem? As far as I know, the timer task thread is used by the TCP/IP stack.
I found out that if you call select() from two separate threads at the same time, the calls may never return (about 5% of the time). I posted a patch for this about 2 years ago. I will cut and paste the one from my app and attach it if that helps.
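The patch itself is not reproduced in this thread. As a purely illustrative sketch of one possible workaround (not the patch mentioned above), every select() call can go through a wrapper that holds a global lock, so that two threads are never inside select() at the same time. The example uses POSIX threads as a stand-in; on NetOS the equivalent would presumably be a ThreadX mutex, which is an assumption. The name locked_select is hypothetical.

```c
#include <pthread.h>
#include <sys/select.h>
#include <sys/time.h>

/* One global lock shared by every caller of locked_select(). */
static pthread_mutex_t select_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: same arguments and return value as select(),
 * but only one thread at a time is allowed inside the call. */
int locked_select(int nfds, fd_set *readfds, fd_set *writefds,
                  fd_set *exceptfds, struct timeval *timeout)
{
    int rc;

    pthread_mutex_lock(&select_lock);
    rc = select(nfds, readfds, writefds, exceptfds, timeout);
    pthread_mutex_unlock(&select_lock);
    return rc;
}
```

The trade-off is that a second thread waits for as long as the first one sits in select(), so this only makes sense with short timeouts.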
As for the thread dying, I have seen this but haven’t really come up with a solution. When I got this error, I got rid of the internal pipes from thread to thread using port which got rid of the issue, but I don’t think that that will solve your problem. An upgrade to NetOS 6 might.
You mention that upgrading to NetOS 6 is not an option. Is this because of the cost or the fear that things will not be compatible?
Thanks again for your help. I will implement your patch in our current software. Does the select() problem still exist in NetOS 6?
Concerning the thread dying, I don’t understand what you mean by internal pipes and using port. For thread-to-thread communication we use the TX event flag groups or message queues.
Porting the software to NetOS 6.3 is more of a resource problem. What was your experience with the porting? As far as I know, it is a big step from 5 to 6: a new file structure, another bootloader, etc. The bootloader concerns me the most. Is it possible to start the new OS with the old bootloader?
You do NOT want to port it to 6.3. 6.3 only works with the version 2 ME modules. The highest support for the Version 1 modules is 6.0c (which still has the select() problem, by the way).
It looks like you are smarter than I am with the thread communication. I was using two methods, FIFOs and pipes, and it turns out that the ME can’t handle pipes. It took me a month to figure that out - I had ported over OpenSSH and internally it uses tons of pipes. A hack to switch to FIFOs and it worked beautifully.
As for the new BSP, (the bootloader is the same) it is a lot more configurable, so unless you have made a lot of changes it shouldn’t be too hard. I had it reconfigured in under an hour. The really nice thing about 6.0 is that the API is much larger and more UNIXish. The bad thing is that ALL the documentation is in chm and pdf files. No books.
If you are looking at using the new Version 2 modules, there is NetOS 6.3 or 7.0. Since we are still building new boards here, I had to port my code from 6.0c to 6.3. The biggest problem is getting the -S modules from Digi; even if you order them you always get the -C modules which do NOT have the backdoor interface in them like the version 1 modules. Since I was using that interface for upgrades, I had to rewrite the manual. The nice thing is that they just released a new ME with twice the FLASH memory.
We don’t use Digi modules. Instead, we are using our own board with a NET+50 CPU. For that reason, the porting is probably a little more work intensive. As far as I know, NetOS 6.3 can be used with the NET+50.
Thanks again for your help.
Yes, it would be a month of hacking to convert a NetOS 5.0 program to 6.3. Probably isn’t worth it. I hadn’t heard about these mini-boards you mention, they were not sold here in the U.S. (or were not advertised, one of the two). I probably would have used one if I had known about it. Sure would have been easier than dealing with the horrible pin connections on the ME!