Mesh analyzer solving a lot of issues

Just want to jump in and up-vote a better way to debug Z-Wave.

I have 65+ devices and although I followed all the rules (build out, mains powered first, 10 at a time, repair between each set, settle for a day before next, only Z-Wave Plus), I still have issues with intermittently lost messages and floods of repeated messages.

That's something I miss from Home Assistant. I wish I had gone down the ZigBee route but it's too late now. How about an unsupported websocket with the low-level Z-Wave events? Just give us access to the logs you use when troubleshooting setups remotely. I know it's there, just open it up with no promises or backwards compatibility guarantees. I'm at the point of thinking of buying a commercial Zniffer tool... Please give us the Z-Wave logs?

1 Like

I think that's a little naive to be honest, no offense. Sometimes things just go wrong even when you do everything right, especially with a lot of moving parts (devices).

We have a lot of geeks who love to tinker, just like mechanics who tune their cars over the weekend. Yes, a lot of customers don't care, but giving us some low-level tools/access to debug this would help a lot. Especially since we can't back up the Z-Wave mesh, and rebuilding devices and apps from scratch is just too time consuming (something I also miss from other hubs).

Help us help ourselves?

1 Like

You now know what to look for when you get the tool, since you learned the basics. :+1:t2: Many do not and, even once they understand the basics, that's all they care to know. Putting development efforts and hub resources toward that is unnecessary since the add on tool better serves that role. That is the consensus I'm reading, and I agree.

1 Like

Again, thank you all for replying. It's obvious that even the average Joe has to learn something; I did too when starting with HE. The documentation about how to build a mesh is very good and clear. To give an example, my Zigbee mesh was solid and reliable, but after 4 months devices suddenly started falling off the mesh every night. After rediscovery they work for a couple of days. I have no idea how or why, and I can't find any indication. The only diagnostic tool I have is my own wifi analyzer, to see if a neighbour started some stupid new wifi AP. But no. So I'm stuck. So @SmartHomePrimer, I can't believe that you can just follow the documentation and have nothing to worry about. And yes, I used to think that too.

@zarthan I'm not talking about recording tons of data here. And even if it were recording, I still think the hub would be more than capable of doing that. If a simple Raspberry Pi can, the hub should be able to as well. I also don't understand why you are convinced it isn't capable; only HE really knows what it can and cannot do.

If I understand correctly you don't want a tool like this. That's fine, you don't have to use it. But I do want something like this and I think more people can benefit from it.

3 Likes

@haas
great stuff, where can we find the guide and scripts to generate this?

1 Like

It's all on this thread..

But it's not as simple as that. If the developers work on something like this, that takes time away from work on bug fixes, platform updates, adding support for additional devices, literally everything else they could spend their time doing.

@bcopeland said it well above:

Count me among the group that thinks this would be nice to have in a perfect world, but would be of questionable utility to the majority of hubitat users in reality.

5 Likes

It isn't that I don't want a tool like this. I don't need a tool like this. If I want to track down a problem, it is easy to use an external tool. They are available and inexpensive. Effort, in both device resources and staff attention, is more effectively directed elsewhere.

Do you not look at the ZigBee logs page occasionally?

[bathroom_light](http://10.x.x.x/hub/zigbeeLogs#zigbeeRx2921) 2020-03-21 08:43:06.582 profileId:0x104, clusterId:0x19, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-57
[backdoor_light](http://10.x.x.x/hub/zigbeeLogs#zigbeeRx61443) 2020-03-21 08:42:59.976 profileId:0x104, clusterId:0x19, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-54
[office_switch](http://10.x.x.x/hub/zigbeeLogs#zigbeeRx35938) 2020-03-21 08:42:33.705 profileId:0x104, clusterId:0x6, sourceEndpoint:1, destinationEndpoint:255 , groupId:0, lastHopLqi:255, lastHopRssi:-58
[Thermostat](http://10.x.x.x/hub/zigbeeLogs#zigbeeRx64230) 2020-03-21 08:42:21.824 profileId:0x104, clusterId:0x201, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-58
4 Likes

An oversimplified take on the whole picture. Tools are useless in the hands of those that don't know how to use them and don't care to learn. I'm not arguing that tools are bad. I don't think anyone here is. They are available to those that want them. You want them for free, and I disagree. I personally would prefer the team push forward with new developments on new IoT hardware as it becomes available, and new home automation features. @marktheknife just explained the point I failed to get across, so I'll leave it at that.

4 Likes

You mean this?

That has never done anything since I've had the hub.

That should spew logs every time a Zigbee device is operated. Leave that tab open for a while and go trigger some of your devices. Assuming you have mains powered devices like outlets, you should be able to turn them on and off and see something like the log fragment I posted.
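For anyone who wants to go beyond eyeballing that page, here is a minimal sketch of pulling the last-hop signal numbers out of lines like the fragment posted above. It assumes the lines have been copied out as plain text in the form "device name, timestamp, comma-separated key:value fields" and that device names contain no spaces; `parse_line` and `worst_rssi_by_device` are made-up helper names, not anything the hub provides.

```python
import re

# Sample lines copied from the log fragment in this thread, link markup removed.
SAMPLE = """\
bathroom_light 2020-03-21 08:43:06.582 profileId:0x104, clusterId:0x19, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-57
backdoor_light 2020-03-21 08:42:59.976 profileId:0x104, clusterId:0x19, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-54
office_switch 2020-03-21 08:42:33.705 profileId:0x104, clusterId:0x6, sourceEndpoint:1, destinationEndpoint:255 , groupId:0, lastHopLqi:255, lastHopRssi:-58
Thermostat 2020-03-21 08:42:21.824 profileId:0x104, clusterId:0x201, sourceEndpoint:1, destinationEndpoint:1 , groupId:0, lastHopLqi:255, lastHopRssi:-58
"""

# Assumed layout: <device> <YYYY-MM-DD HH:MM:SS.mmm> <key:value, key:value, ...>
LINE_RE = re.compile(
    r"^(?P<device>\S+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<fields>.*)$"
)


def parse_line(line):
    """Return a dict with device, timestamp, lqi and rssi, or None if the line does not fit."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = {}
    for part in m.group("fields").split(","):
        key, sep, value = part.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    if "lastHopLqi" not in fields or "lastHopRssi" not in fields:
        return None
    return {
        "device": m.group("device"),
        "timestamp": m.group("ts"),
        "lqi": int(fields["lastHopLqi"]),
        "rssi": int(fields["lastHopRssi"]),
    }


def worst_rssi_by_device(records):
    """Keep the lowest (weakest) last-hop RSSI seen for each device."""
    worst = {}
    for rec in records:
        dev = rec["device"]
        worst[dev] = min(rec["rssi"], worst.get(dev, rec["rssi"]))
    return worst


records = [r for r in (parse_line(l) for l in SAMPLE.splitlines()) if r]
for device, rssi in sorted(worst_rssi_by_device(records).items(), key=lambda kv: kv[1]):
    print(f"{device}: weakest last-hop RSSI {rssi} dBm")
```

Collected over a longer period, the same summary would show which devices drift toward weak last-hop values before they start dropping off.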

3 Likes

Nice solid looking "mesh", man

Great points and food for thought but...
If you sell a HUB that is able to talk to a wide range of Zigbee devices, not necessarily out of the box but with additional help from Community drivers, you're bound to hit issues here and there.
Bear in mind that a lot, not all, but a lot of certified devices come with a hefty price, and therefore users opt for a cheaper solution.
This unfortunately brings with it uncertified devices, and 9 times out of 10 the user has no idea that they're not 100% compliant.
After continuous Hub slowdowns, I've started from scratch and have made many changes after reading endless forum posts (appreciated, by the way). Here are just a few of them...

  1. Changed wifi channel
  2. Removed troublesome Zigbee devices
  3. Updated HE Drivers & Apps as new options become available
  4. Moved devices to a 2nd HUB
  5. Moved Apps to a different HE HUB
  6. Tried to spread the load across 3 HE Hubs
  7. Analyzed xBee info
  8. Reintroduced 3 x Ikea Repeaters previously uninstalled

Trial & error over months has shown what does & doesn't work, but it would be awesome to see utilities included with HE for diagnosing faults.
Just saying IMHO.

3 Likes

Just ran across this thread and thought I'd add my own feature suggestion in this vein, posted not long ago.

Looking around the forum there are numerous times this has come up and for good reason. It would seem to me that not addressing it is reminiscent of past lost opportunities in the HA space.

That is where frustration and the threshold to reach optimization was just high enough that some portion of the potential market avoided, gave up, or otherwise limited their use because they feared the reliability of what they couldn't readily verify.

I think this area of development is likely a higher priority than many seasoned folk give it credit for. After all, we are talking about verifying foundational network reliability and resilience... all the whizz-bang layered on top is meaningless if something is broken / not quite right / not optimized and you don't have the tools onboard to understand where/why.

I don't buy the suggestion that the average user wouldn't benefit from Hubitat doing more in this area. Granted the issues can be multifaceted but there is a reasonable base of expectation folks bring from their WiFi experience that isn't even present here to look at.

2 Likes

I think the original feature request is a really good request.

In Hubitat there is a lack of observability when it comes to the mesh. It's naive to argue that best practices are a substitute for observability. No one who builds any meaningful software for a living would live without logs and analysis, and at scale, automating and filtering the data is key to keeping the human informed.

I have more stability issues on Hubitat than I had on SmartThings, which leads to requiring more serviceability, something Hubitat encourages in its interactions with the community. But serviceability in the mesh sorely needs a way to reason about the mesh and its current state, and there just isn't great tooling for that on the platform.

The point about best practices is a bit wonky, since that only works well in a static world where every interaction has been validated by the best practice and code doesn't change. But what works like that in the real world? I add devices and change my setup as I make alterations to my house, with the time of year, and with my lifestyle. I don't have a fixed mesh: when the Christmas lights come out they are automated ad hoc, and that gets removed when the season is over. When summer is here I run the automation for my pool house far more frequently than at any other time of the year. Interactions in the network aren't static; they change. I buy another lamp, let's put a smart plug on that. You get my point.

I think this is a request that is worth taking seriously.

From this thread it's clear to me that large networks lead people to create homegrown observability systems to debug and triage errors. My network, at 30 or so devices, is approaching the point where this is getting important.

I should also state the obvious: I much prefer self-help to calling support. If I can solve something myself faster than support can, I will always be biased towards that! :slight_smile:

1 Like

There's maybe not the tools you're expecting. Not sure I even understand what it is you're looking for, however there are tools.

For Zigbee:
http://[YOUR HUB IP]/hub/zigbee/getChildAndRouteInfo

For Z-Wave, look in the Z-Wave Details. You can see the route and last transmission rate. There's also a tool to view the Z-Wave topology.
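If you want to keep snapshots of the Zigbee page over time instead of refreshing it in a browser, something like the sketch below could work. It only assumes the URL from the post and that the hub answers a plain HTTP GET without a login; the function names are made up, and the response is returned as raw text because its layout is not described in this thread.

```python
from urllib.request import urlopen


def child_and_route_url(hub_ip):
    """Build the Zigbee child/route info URL quoted in the post for a given hub address."""
    return f"http://{hub_ip}/hub/zigbee/getChildAndRouteInfo"


def fetch_child_and_route_info(hub_ip, timeout=10):
    """Fetch the page as text. No parsing is attempted here."""
    with urlopen(child_and_route_url(hub_ip), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_child_and_route_info("192.168.1.50")` (a placeholder address) on a schedule and saving each result to a dated file would let you compare routes from before and after devices drop off.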

1 Like

If none of that helps there's also the hub stats to see what devices and apps are chewing up your hub.

http://hubitat.local/hub/enableStats (Run this to start stat collection and let it collect for about 5 minutes)
http://hubitat.local/hub/stats (Run this after the 5 minutes has elapsed to view the stats)
http://hubitat.local/hub/disableStats (Run this to stop stat collection)
For Windows PCs, Shift + Windows Key + S is an easy way to take a cropped screenshot.

You can use these links to track down the apps and devices by substituting the IDs from the stats:
http://hubitat.local/installedapp/configure/APPIDHERE
http://hubitat.local/device/edit/DEVICEIDHERE
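The enable / wait / read / disable sequence above can also be scripted. This is a rough sketch that assumes the three URLs respond to a plain GET and that no hub login is required; `collect_stats` and its parameters are invented for illustration.

```python
import time
from urllib.request import urlopen

HUB = "http://hubitat.local"  # hostname used in the post; swap in your hub's IP if needed


def stats_urls(base=HUB):
    """Return the three stats URLs from the post, keyed by their last path segment."""
    return {name: f"{base}/hub/{name}" for name in ("enableStats", "stats", "disableStats")}


def _http_get(url, timeout=10):
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def collect_stats(base=HUB, wait_seconds=300, fetch=None, sleep=time.sleep):
    """Enable stats, wait (5 minutes by default), read the report, then disable stats.

    `fetch` and `sleep` can be replaced, which makes a dry run possible without a hub.
    """
    fetch = fetch or _http_get
    urls = stats_urls(base)
    fetch(urls["enableStats"])
    try:
        sleep(wait_seconds)
        return fetch(urls["stats"])
    finally:
        # Always switch collection off again, even if reading the report failed.
        fetch(urls["disableStats"])
```

Running `print(collect_stats())` would block for five minutes and then print the same kind of report you would otherwise copy out of the browser.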

8 Likes

cool. trying it now.

This is my report, about a 10 minute sample.
Devices 524-529 are my Sonos. Even though I don't have anything playing on Sonos, I noticed in the log that they wake up frequently; I think they poll the Sonos for track and volume information.

Device 535 is a garage door tilt sensor; I think it reported in its battery level.
Device 193, which ran 4 times, is a motion sensor, and there was motion there.
The rest are motion sensors.

Now over to the apps. App 236 is the Amazon Echo skill. What unit is the run time in? It can't be seconds, but I could guess that a call to Amazon could take 250 milliseconds to complete. And spending a CPU second on that every 10 minutes doesn't seem like a high cost.

335 is the Sonos integration app. Nothing was running, so I'm surprised it ran.

I assume it's normal for integration skills to run even though the service was inactive and not called from the GUI or from the Alexa or Sonos endpoints?

device Stats enabled: false
Device stats start time: 1606869365737
Device stats total run time: 711661
device id 259 runcount 1 total runtime 13 average run time 13
device id 262 runcount 5 total runtime 144 average run time 28.8
device id 524 runcount 7 total runtime 518 average run time 74
device id 525 runcount 7 total runtime 212 average run time 30.2857142857
device id 526 runcount 7 total runtime 220 average run time 31.4285714286
device id 527 runcount 7 total runtime 195 average run time 27.8571428571
device id 528 runcount 7 total runtime 244 average run time 34.8571428571
device id 529 runcount 7 total runtime 328 average run time 46.8571428571
device id 535 runcount 1 total runtime 692 average run time 692
device id 193 runcount 4 total runtime 103 average run time 25.75
device id 428 runcount 1 total runtime 17 average run time 17
device id 227 runcount 1 total runtime 13 average run time 13
device id 263 runcount 1 total runtime 87 average run time 87
App Stats enabled: false
App stats start time: 1606869365740
App stats total run time: 711663
app id 236 runcount 3 total runtime 731 average run time 243.6666666667
app id 335 runcount 1 total runtime 1805 average run time 1805
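A report like this is easy to rank with a few lines of script. The sketch below parses a handful of the lines above and sorts them by total runtime. It assumes the runtimes are in milliseconds, which nothing in this thread confirms; under that assumption the total of 711661 comes to about 11.9 minutes, which would fit the roughly 10 minute sample. `parse_stats` and `rank_by_total` are made-up names.

```python
import re

# A few lines copied from the report above.
REPORT = """\
device id 524 runcount 7 total runtime 518 average run time 74
device id 535 runcount 1 total runtime 692 average run time 692
device id 193 runcount 4 total runtime 103 average run time 25.75
app id 236 runcount 3 total runtime 731 average run time 243.6666666667
app id 335 runcount 1 total runtime 1805 average run time 1805
"""

STAT_RE = re.compile(
    r"^(?P<kind>device|app) id (?P<id>\d+) runcount (?P<runs>\d+) "
    r"total runtime (?P<total>\d+) average run time (?P<avg>[\d.]+)$"
)


def parse_stats(text):
    """Turn 'device id ...' / 'app id ...' lines into dicts; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = STAT_RE.match(line.strip())
        if m:
            rows.append({
                "kind": m.group("kind"),
                "id": int(m.group("id")),
                "runs": int(m.group("runs")),
                "total": int(m.group("total")),
                "avg": float(m.group("avg")),
            })
    return rows


def rank_by_total(rows, window):
    """Sort by total runtime, adding each entry's share of the collection window in percent."""
    ranked = sorted(rows, key=lambda r: r["total"], reverse=True)
    return [(r["kind"], r["id"], r["total"], round(100 * r["total"] / window, 3)) for r in ranked]


WINDOW = 711661  # "Device stats total run time" from the report; unit assumed to be ms
for kind, ident, total, share in rank_by_total(parse_stats(REPORT), WINDOW):
    print(f"{kind} {ident}: total {total}, {share}% of the window")
```

Seen this way, even the busiest entry (app 335) takes up only a fraction of a percent of the collection window, which puts the individual numbers in perspective.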

The only thing that seems weird is the garage tilt sensor. That should be around 25 regardless of what it's doing.

You might want to check out the community Z-Wave mesh analyzer app: Hubitat Z-Wave Mesh Details

2 Likes