Performance issue

I wouldn't say your hub is idle at all. On the contrary, it is trying to process a lot of events. As the logs you shared show, you have devices generating consecutive events every few milliseconds. Are you asking the Teensy to check on your devices? If so, how often? You may want to enable debug logging on Maker API to see how often it is triggered.

There's something bizarre about your setup. I have a similar setup with two Hubitats, and the majority of my automations are done using Node-RED via MakerAPI. I have ~100 devices (70 physical) on one, and about ~120 devices (40 physical) on the other. I've never seen the errors you have.

Are you polling from the Teensy? If so, there is no need to, because MakerAPI can push device events as updates using HTTP POST.

I agree, it looks like excessive polling or refreshing is possibly being done. I don't think the errors have anything to do with the code in the drivers.

The reason the switch was getting hammered there was because my wife went into the master bathroom, and tried to turn on the lights. Which weren't working, so of course she tried pressing the switch paddle several times. Then I tried the bedroom lights a couple of times: dev 394, and of course those didn't work either.

Normally, those lights aren't generating any events at all, so I did a deep dive and found where it started: it's the Thermostat. It reports whenever the temperature or the humidity changes, and this shows where the problem started.

I was working the various lights about 40 minutes earlier, because I was transitioning from the "test" version of my control software running on my desktop to the production version running on the Teensy, so the IP address in all of the RM rules to make HTTP calls needed to be changed, and I wanted to verify everything was working. Even so, that's no more than twenty or thirty button presses over a couple of minutes, finishing as you can see in the Master Toilet at 18:15H. And then at 19:03H the Thermostat starts erroring out, after which the entire Z-Wave system stopped producing events.

You can stop reading here unless you're curious enough to want to know the exact setup.

Still here? OK. Right now the HE is nothing other than a Z-Wave and Zigbee gateway to interface those two systems to the brains of the system running on the Teensy. When a Z-Wave switch is pressed, it triggers RM, which makes an HTTP call out to the Teensy, which then does all the heavy lifting of turning lights on and off. Most of the bulbs are Kasa TP-Link, so they're WiFi and the Teensy can control them directly.
However Zigbee devices, of which I have a few, are controlled via the exact reverse path: the Teensy makes an HTTP call into Maker API which in turn controls the appropriate Zigbee switch.
And then there's the Thermostat. All that happens there is that every five minutes or so, the Teensy calls Maker API to get thermostat status: operating mode, temperature etc. etc. etc. And on those rare events (maybe four times a day at maximum) when Teensy needs to change the thermostat mode, it'll again hit the Maker API to switch from (e.g.) heat mode with a setpoint of 68F to cool mode with a setpoint of 76.
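
For anyone following along, the two kinds of Maker API calls described above (a periodic status read and an occasional mode or setpoint change) can be sketched roughly as below. This is written in Python for readability and is not the Teensy code. The hub IP, app ID, access token, and device ID are made-up placeholders, and the command names assume the thermostat driver exposes the standard `setThermostatMode` and `setCoolingSetpoint` commands.

```python
from urllib.parse import quote, urlencode


def maker_api_url(hub_ip, app_id, token, device_id, command=None, value=None):
    """Build a Maker API URL for a status read or a device command."""
    path = f"/apps/api/{app_id}/devices/{device_id}"
    if command is not None:
        # A command is appended to the device path, optionally with one value.
        path += f"/{quote(str(command), safe='')}"
        if value is not None:
            path += f"/{quote(str(value), safe='')}"
    return f"http://{hub_ip}{path}?{urlencode({'access_token': token})}"


# Hypothetical values for illustration only (not taken from this thread).
HUB, APP, TOKEN, THERMOSTAT = "192.168.1.50", 12, "example-token", 101

# The five-minute status read: one GET for the whole device.
status_url = maker_api_url(HUB, APP, TOKEN, THERMOSTAT)

# The rare mode change: switch to cool, then set a 76F cooling setpoint.
mode_url = maker_api_url(HUB, APP, TOKEN, THERMOSTAT, "setThermostatMode", "cool")
setpoint_url = maker_api_url(HUB, APP, TOKEN, THERMOSTAT, "setCoolingSetpoint", 76)

print(status_url)
print(mode_url)
print(setpoint_url)
```

Enabling debug logging on the Maker API app should show each of these requests as it arrives, which is an easy way to confirm the real call rate.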

Maybe post a screenshot of your Z-wave details page, if this is a C7 or C8.

I think the problem is there is clearly more going on than what we are hearing. The hub can handle well over 100 devices but can also be overwhelmed by poor code or a bad implementation.

I think one thing I would try is to let Maker API send updates to the Teensy as well as receive commands from it, and stop all polling. Maker API can easily do that. Preferably you want the hub to react to events, so if at all possible let it send out commands when needed and receive them when needed as well. Scheduled repetitive stuff slowly adds up. It can also be very detrimental to have commands come in to the hub too often.

Is that error message the only indication of a load issue? Do you see any alerts on the devices in the HE UI that indicate heavy load?

Are those errors still occurring? How busy do the logs look now? If there is as little on this hub as you mention, you should be able to turn logging on for everything so we can all see how busy everything is. It would be nice to see everything that is happening.

The only slight surprise is that Carriage Light is routing through the Master Toilet, since Carriage Light is almost twice as far from Master Toilet as it is from the hub itself. But whatever, it all works.

In terms of geography, Master Toilet, Master Bathroom and Bedroom switches (including Maralen and David) are all in relative close proximity to one another, Hall Bathroom and Carriage Lights are elsewhere in the house, and the Thermostat is just across the main hall from the HE. Maximum distance to any of them is no more than 30 ft, although there are a few walls getting in the way.

You have the firmware update button on that page. First thing to do is run that. It will help with Z-Wave issues a lot.

That is 7 devices, which should be nothing for the hub to handle. In the upper-right corner you have an alert. Is that for hub load?

How often is this HE rebooted? Is it on the latest version of firmware?
Do you have any other devices configured on the hub?

Can you click on the Live Logging menu and then also cut and paste the screens for device and app stats?

Apps:

Devices:

That's it. Trust me when I say there is nothing going on. When you say "stop all polling" what are you referring to? The only "polling" that I'm aware of is the Teensy pinging the HE every five minutes for Thermostat status. Other than that, everything should be event driven, in response to either a Z-Wave event from a device, or an HTTP call being made to Maker API.

How exactly are you anticipating I should remove RM and use Maker API? A typical RM rule uses a button press on a switch as a trigger, and then when that happens it makes an outbound HTTP call from the HE to the Teensy. I was under the impression that Maker API could only work with an inbound HTTP call from some other source on the LAN to the HE. I'm not seeing how this could be used to relay a switch button press from the HE to the Teensy.

The system had been complaining of heavy load for several months when I had all the brains on the HE; that's part of the reason I offloaded it all to the Teensy. There were several apps I'd written to do what is now being done by the Teensy: light control, turning the Carriage Lights on and off at sunset and sunrise, and so on. As far as I know they were well behaved, in that they only responded to "input", e.g. a switch press or a variable change via a variable connector, did their work to set lights as appropriate, and then went back to sleep. But in any case, all those apps are now gone.

Alert is a firmware update, I can do that. Is there a major Z-Wave overhaul between 2.3.4.117 and the current 2.3.5.152?

How often do I reboot? Very rarely. I've been assuming that the HE software is well written and is capable of running for months at a time without needing any intervention.

No other devices, what you see on the device screen is all that's there.

App stats:

Device stats:

I would definitely do that Z-wave firmware update on the Z-wave page. It is different than the hub firmware. There were significant improvements to Z-wave radios with that update.

How many events are you storing for that thermostat? I would dial those way back; there really isn't a reason to store hundreds upon hundreds of thermostat events. In fact, I think it is good practice in general to limit events unless you really need them for some odd reason. Nobody in their right mind analyzes how many times you turned on a light in a month, so these events are mostly clutter.

And yes there are LOTS of updates in the latest firmware. There are even LOTS of updates in the 2.3.4.xxx series after the 2.3.4.117 update you are on. I would advise updating.

Edit: in particular...

  • Possibly fixed elevated/severe hub load experienced on previous 2.3.4 builds.
  • Fixed high CPU usage/memory leak under certain conditions.

And those are just a couple in the 2.3.4 series.

Here is the thing. I believe you believe there is nothing going on. You probably have good reason to believe so. But something is happening that is causing load.

My talk about polling is simply that any type of repetitive scheduled activity can have an impact if it happens frequently enough. I am just trying to zero in on anything that has even the slightest chance of causing an issue. Sometimes things we think shouldn't have an impact unexpectedly do.

Maker API is one of those things that makes it very easy to create issues. I have brought my dev hub to its knees multiple times with very simple tasks I didn't think would be impactful.

Maker API has an option to post events to an external system. In this case that external system would be your Teensy. In the screenshot below it is the field for "URL to Post Device Events to". By populating that with your Teensy's address, the hub will post device events to it as they happen.

This is how many of us get events into Node-RED, and is exactly what this setting is for.
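
To make that concrete, here is a rough sketch of what the receiving side could do with a posted event, in place of an RM rule making the outbound HTTP call. It is in Python for readability rather than Teensy code. The payload shape (a JSON object with a top-level `content` holding `deviceId`, `name`, and `value`) is my understanding of what Maker API posts, so treat it as an assumption and check it against a real request. The `pushed` attribute, the sample body, and the handler table are hypothetical.

```python
import json


def parse_maker_event(body):
    """Extract (device_id, attribute, value) from a posted event body.

    Assumes the event arrives as JSON wrapped in a top-level "content" object.
    """
    content = json.loads(body)["content"]
    return str(content["deviceId"]), content["name"], content["value"]


def dispatch(body, handlers):
    """Run the handler registered for this device/attribute pair, if any."""
    device_id, attribute, value = parse_maker_event(body)
    handler = handlers.get((device_id, attribute))
    return handler(value) if handler else None


# Hypothetical routing table: device 394 is the bedroom switch from this thread.
HANDLERS = {
    ("394", "pushed"): lambda button: f"bedroom lights: button {button} pushed",
}

# Hypothetical sample of what the hub might POST on a paddle press.
sample_body = json.dumps({
    "content": {
        "name": "pushed",
        "value": "1",
        "deviceId": "394",
        "displayName": "Bedroom",
        "descriptionText": "Bedroom button 1 was pushed",
    }
})

print(dispatch(sample_body, HANDLERS))
```

Events for devices or attributes without a registered handler are simply ignored, so the hub can post everything and the receiver decides what matters.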

Maybe this is a reading thing on my part, but this is new information. I thought this just started.

The Z-Wave firmware is separate from the hub firmware, and yes, the last few Z-Wave firmware updates are significant for the C7.

If the alert is for a firmware update, then what is this: a performance problem, a Z-Wave response problem, or something else?

Regardless of how good the code is on HE, it will need to be restarted occasionally. It runs in a JVM, and those just tend to need to be restarted occasionally for memory reasons and such. If it has been up for months, then it may not be a bad idea to restart it.

Update has been done.

I went through all the devices, and found that they all had an event history of either 11 or 20, and a state history of 30. I set both of those on all devices to 5. So I wasn't storing a lot to begin with, but as you very correctly point out, that's just not something I need a lot of for normal operation. If I really cared (which I don't :slight_smile: ) I'd just turn on the debug flag for the device or app on the Teensy, and keep it all logged there.

Both the Z-wave and the hub firmware?

Also, is the problem still occurring? Can you help provide clarity as to what the issue is? Is it heavy load, or those messages that were logged, or something else?

Missed that - thanks for the heads up. Updating the Z-Wave radio now.

So far I haven't seen it since the reboot. As for what the issue is, you know about as much as I do. Something broke in the Z-Wave stack during an event from the thermostat, that caused all events generated by Z-Wave devices to generate an exception, and the event was then lost. Thus the reason that the lights stopped working.

Again, I can't see this being heavy load, not when you see how lightly loaded the HE is, as shown by the App and Device lists upthread.

It is very likely that the Z-Wave firmware update and the hub firmware update got this into a much better state. There have been some very big changes since the version you were on.

I think what threw me off was terming it a performance issue. In most cases that means someone is calling out heavy load warnings and such, and that was kind of the verbiage from the log errors.

One last housekeeping step you could do, to make sure it is in as good a state as it can be, would be to shut down the hub from the menu, then pull the power from the wall and leave it unplugged for about a minute. This just ensures that the Z-Wave radio fully restarts. It doesn't restart with a simple hub reboot, unfortunately. This is probably a dot-your-i's-and-cross-your-t's step, but probably not a bad one given what you have dealt with.

I think this is an important step after the Z-Wave firmware update. I would second doing this step.

Did not see this mentioned anywhere so I will put this out there.
For the near future I would monitor app / device stats at Logs > Device Stats / App Stats
Turn on all the checkboxes.
Sort by Total ms high to low (should be default sorting).
Look at the % of total

Most apps / devices will be below 1.0, anything above 1.0 I would take a deeper look at.

Also, sort by Hub Actions
High Hub Actions can be an indication of problems stirring, but not always. For some apps/devices it is normal.
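
If you would rather check the numbers outside the UI, the 1.0 rule of thumb above can be sketched like this. I am assuming "% of total" means an entry's total ms as a share of hub uptime, which is how I read that column. The uptime and per-device figures below are invented for illustration and are not from this hub.

```python
def pct_of_total(total_ms, uptime_ms):
    """Share of hub uptime spent in one app or device, as a percentage."""
    return 100.0 * total_ms / uptime_ms


def flag_heavy(stats, uptime_ms, threshold_pct=1.0):
    """Return (name, pct) pairs above the threshold, busiest first."""
    rows = [(name, pct_of_total(ms, uptime_ms)) for name, ms in stats.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(name, round(pct, 2)) for name, pct in rows if pct > threshold_pct]


# Hypothetical numbers: 24 hours of uptime and total ms per entry.
UPTIME_MS = 24 * 60 * 60 * 1000
STATS = {
    "Thermostat": 1_296_000,
    "Maker API": 432_000,
    "Master Toilet": 86_400,
}

print(flag_heavy(STATS, UPTIME_MS))
```

With these made-up figures only the thermostat crosses the threshold, which is the kind of entry worth a deeper look.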
