I have a Hubitat Elevation hub in a remote house that will form the basis of my zwave/zigbee network in that house. At the moment, I am in the very early stages of this project, with just two devices - a ST zigbee button, and a Hank z-wave four-button scene controller. These two devices were set up as placeholders for an eventual much larger number of devices. The eventual plan is for up to four hubs at the property across 3 buildings.
I am encountering a repeating issue with the hub apparently crashing during periods of inactivity. So far I have not seen the behavior when I am actively using the devices associated with the hub.
By crashing, I mean the hub is completely unresponsive: the GUI will not open, the attached devices are non-functional, the hub does not respond to pings, and network instrumentation shows no activity for multiple days. This happens repeatedly following several days of inactivity - 2, 3, or 4. The time taken for the crash is variable. There is nothing in the logs that indicates anything out of the ordinary - just routine activity (e.g. temperature reports) from the associated devices until that simply stops. No errors were reported. I updated the hub to the latest firmware last weekend (whichever version that was; since I am at a different location, I cannot reboot it to check). This is a pretty serious issue since I have 40 or so planned devices, and I need to know if I should abandon Hubitat as a solution and start looking for alternatives before I invest any more time. I do not honestly care either way, but I need to know if this can be solved to make decisions about my prototyping. My interim solution at present will be to resort to a nightly forced power cycle.
I came across various posts from a few years ago that referenced an identical crashing behavior, but in those cases, a firmware update fixed the problem.
What info can I collect to help with resolving this?
More generally you may also be interested in the Hub Information driver, a popular way to keep tabs on the state of the hub, particularly if you have an external data collection option, like InfluxDB or something similar.
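For a sense of what "external data collection" involves, here is a minimal Python sketch that formats a few hub readings as a point in InfluxDB line protocol (measurement, tags, fields, timestamp). The measurement name, tag, and field names are made up for illustration and are not taken from the Hub Information driver; the sketch handles numeric fields only and does no escaping.

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Format one point in InfluxDB line protocol (numeric fields only, no escaping)."""
    # Tags follow the measurement name, each prefixed by a comma.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    # Fields are comma-separated; every value is written as a float for simplicity.
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    # The timestamp is in nanoseconds since the Unix epoch.
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


# Hypothetical readings; real attribute names depend on the driver.
point = to_line_protocol(
    "hub", {"name": "remote"}, {"uptime": 3600, "temperature": 41.5},
    1700000000000000000,
)
```

A line like this is what a collector would post to the database on each polling interval.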
Thank you for that.. I'm a fan of monitoring. One question I do have as a complete novice - how would I actually get access to the data from the driver? I understand the process of installing it (and I just did), but I have been unable to find any conceptual information on the hub architecture and where these kinds of drivers fit into it. I am just utterly lacking any context on how the various pieces fit together - any pointers to doc?
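One common way to read a driver's attributes from outside the hub is the built-in Maker API app, which exposes selected devices over HTTP as JSON. Below is a minimal Python sketch, assuming Maker API is installed and the Hub Information device is selected in it. The hub address, app ID, device ID, and access token are placeholders, and the exact shape of the JSON (an "attributes" list with "name" and "currentValue" keys) is an assumption that should be checked against a real response from your hub.

```python
import json
import urllib.request


def maker_api_url(hub_ip, app_id, device_id, token):
    """Build the Maker API URL for one device (all four arguments are placeholders)."""
    return f"http://{hub_ip}/apps/api/{app_id}/devices/{device_id}?access_token={token}"


def attributes_from_payload(payload):
    """Flatten the 'attributes' list of a device payload into a name -> value dict."""
    return {a["name"]: a.get("currentValue") for a in payload.get("attributes", [])}


def fetch_attributes(url, timeout=5.0):
    """Fetch one device and return its attributes; raises OSError if the hub is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return attributes_from_payload(json.load(resp))
```

Calling `fetch_attributes(maker_api_url("192.168.1.50", 7, 12, "TOKEN"))` (all values hypothetical) from another machine on the same network would return the driver's current readings as a dictionary, which can then be logged or forwarded elsewhere.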
Yep. Go to Settings >> Backup and download a new backup (that will clean up the database), then go to yourhubip:8081 and do a soft reset. On reboot, when prompted to restore, use the database you downloaded to your PC.
Thank you for the PM. May I ask how you determine that the hub has stopped working, and when the last such event occurred? I took a look at your hub's engineering logs and I do not see any traces of your hub going missing, or "crashing" of any kind. I see that the hub was disconnected from the network 10 days ago, but no problems more recently. You do not need to perform a soft reset or take any further actions. Instead, let me know as soon as you find the hub unreachable so we can address the issue when it happens.
To elaborate further: I define "Hub Stops working" in one of two ways:
1) I discover I cannot connect to the GUI from my remote location; coincident with this, I can see zero network activity on the hub from network instrumentation.
2) When I arrive at the property after my typical 5-day absence, the two devices I have (both buttons) are nonfunctional. This usually leads to me discovering that 1) is the case.
The only recovery appears to be power cycling.
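To pin down exactly when the hub drops off the network, a small watchdog running on another always-on machine at the property could record reachability with timestamps. This is a rough Python sketch, not a Hubitat tool: it assumes the hub's web interface listens on TCP port 80, and the host address and log file name are placeholders.

```python
import socket
import time


def hub_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def log_line(host, ok, now=None):
    """Format one timestamped (UTC) log entry for a reachability check."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return f"{stamp} {host} {'up' if ok else 'DOWN'}"


def watch(host, interval=60, logfile="hub_watch.log"):
    """Check the hub every `interval` seconds and append the result to a log file."""
    while True:
        with open(logfile, "a") as f:
            f.write(log_line(host, hub_reachable(host)) + "\n")
        time.sleep(interval)
```

Running `watch("192.168.1.50")` (a hypothetical address) leaves a log whose first DOWN entry gives a time to compare against the hub's own logs after the next power cycle.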
I believe that the two items are unrelated. I do not see any evidence of the first happening, as the routine tasks are running uninterrupted. The second may be related to the mesh. When first powered, the radio has boosted power, so devices at the edge of your mesh may work at first and then stop working. Adding a repeater to support the battery-powered devices may help alleviate the second item.
This is where having an external logging system comes into play. There's a number of options. I think most folks go with influxdb and Grafana: Linux and Windows guides. I personally use Splunk and have a dashboard setup for the info from the Hub info driver. Here's a snippet:
The two devices are located within 5 feet of the hub, so it seems extremely unlikely to be a mesh or battery problem. These are the only zwave/zigbee devices within several miles (this is in the middle of nowhere), so not likely interference.
I am unclear on what you mean by "I do not see any evidence of the first happening". When this happens, the device is not contactable on the network - no response to either attempts to login or to ping it.
The engineering logs hold a lot of data and nothing was indicating lockups. What color is the LED on the hub when this happens? Is it connected into a switch? What make and model? Or directly into a router? (Make/model?). Have you tried switching ports on what it's plugged into?
It is connected to a Ubiquiti Switch 24, which connects via fiber to a UDMP, via cat6 to an ARRIS SB8200.
I switched the network ports this weekend when I moved to using PoE power via an adapter to allow me to power cycle remotely, so we will see if that makes any difference.
Concerning the network connection, the hub exhibited the same behavior when I first set it up last year. At that time, the network was completely different - different switch, different router, and different address space (in the same physical location). The behavior was the same, but I simply had no spare time to troubleshoot since I would not be needing to integrate it until this year.
Now I am at the point where I am integration testing various network infrastructure components and ran into the problem again.
There have been rare instances where some hubs stop working due to local network issues. When that happens, those events are clearly recorded in the engineering log, and the hub stops performing its routine tasks. That is not the case with your hub, which is why I asked you to let us know the next time it freezes, so we can address the issue when it happens.