Applications stop receiving device events at times - reboot fixes it

I have already asked support, but without much luck so far. Trying to see if anyone has ideas here.

My hub "occasionally" (it may happen several times per day, like yesterday, or only once, like today) enters a state in which my applications are still running (I see them logging time-based events), but are not receiving device events (switches turned on/off, movement, power meter readings, and so on). Rebooting the hub restores all functionality.
This has been happening more frequently since I added more devices and applications (my network now consists of about 30 physical devices and a dozen or so applications). My Z-Wave information panel is "clean", with all devices properly initialized and active.
I do have one fairly "chatty" application: it pushes most device events to an InfluxDB database running on my PC, which means it is subscribed to a fair number of device events (with filterEvents: false) and performs plenty of httpPost-based I/O.
I don't see any errors in the log viewer when the problem happens. I have the impression that support can access hub logs which are beyond my reach, but I don't know whether those would help identify the issue and its cause. Any suggestions on how I can troubleshoot this, short of deactivating the applications (and eventually the devices) one by one until the problem disappears (as that would cause major disruptions in my house...)?

Thanks.

We recently identified an issue that is likely causing this; the fix will be in the next release.

It's likely your InfluxDB app that's poking this bear.

OK, thanks; I'll report back after I've had a chance to test the next release.

Is this part and parcel of the Z-Wave slowdown issue, or something separate? Thanks.

Separate. Among other nastiness, this issue can kill the scheduler under certain circumstances.

For about the past week, I have noticed that scheduled events (e.g. Rule Machine) just stop after a few days. The information page for the rule shows no upcoming scheduled events, where there should be one every day to turn on lights, for example. I have been going into the rule and clicking Done to kickstart the schedule again, or rebooting the hub in a controlled manner. Whatever this bug is, it is pretty nasty. My wife is wondering if I switched back over to ST! ;)


Very curious. I've not seen any issues with schedules. I'm running Homebridge and Google Assistant Relay, so it would seem these are not causing issues.

Love it!!

J

Is there a way to reproduce this bug? It's happened to me in the past, and I have several apps that perform httpPosts, but they've been fine for the past few weeks. Is it the requests on their own or a combination of things?

It's related to scheduled HTTP methods that time out for whatever reason.

I too have seen this sporadically. Glad to hear it's not just me. I figured it was, since I still had ST running both networks as well.

I must say I haven't seen any httpPost timeout errors logged at times that correspond to the start of the hub problems; but of course I don't know whether some of those operations did time out and were simply not reported in the log.
One thing I did notice is that this problem happens more regularly at night, which is when my server running InfluxDB undergoes routine maintenance/upgrade operations that may well affect its ability to respond to HTTP requests.
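
For anyone forwarding events to an external server in a similar way, here is a minimal sketch of the general idea of bounding each POST with an explicit timeout and treating a timeout as an ordinary, logged failure. It is plain Python, not Hubitat app code, and it is not the fix mentioned above. The names `post_event`, `timeout_s` and `opener`, and the URL in the usage note, are made up for illustration.

```python
import socket
import urllib.error
import urllib.request


def post_event(url, body, timeout_s=5.0, opener=urllib.request.urlopen):
    """POST one payload; return True on a 2xx reply, False on timeout or error.

    `opener` is a hypothetical injection point so the failure paths can be
    exercised without a real server.
    """
    request = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    try:
        # The explicit timeout bounds how long a stalled server can hold up the caller.
        with opener(request, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (socket.timeout, TimeoutError) as exc:
        # A slow or busy server is reported and skipped rather than waited on.
        print(f"timeout after {timeout_s}s posting to {url}: {exc}")
        return False
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, HTTP 4xx/5xx and similar errors.
        print(f"post to {url} failed: {exc}")
        return False
```

A call such as `post_event("http://my-pc:8086/write?db=home", "switch,device=lamp value=1")` (hypothetical host and database) returns False instead of hanging if the server is down for maintenance, so the caller can drop or retry the event later.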

Promising! 1.1.1.119 has been running since yesterday; the hub "survived the night" for the first time in several days! And this time I did see a few occasional timeout errors reported in the log on httpPost operations.

