Here we go again & again

It's actually a little bit backwards. What you see that I don't, in the error log, are the debug lines from apps and devices, which can overwrite the errors within minutes. The size of the files is the same; the error log is just stripped down to only the errors that hit the platform. On rare occasions, if we find platform errors that are not posted in Logs, engineers add them immediately. In the past few months, I have not seen any errors that were not also posted in Logs. It is important to turn off non-critical debug lines, like rule actions, when not actively troubleshooting, so that errors are more visible.
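To illustrate why the debug lines matter, here is a rough sketch of pulling only the error lines out of a noisy log. The log layout ("date time level source: message"), the sample lines, and the function name `only_errors` are all made up for the example; this is not Hubitat's actual log format or any hub API.

```python
# Invented sample log. The "date time level source: message" layout is an
# assumption for illustration only, not the hub's real log format.
SAMPLE_LOG = """\
2019-06-01 05:00:01 debug app:12: rule action ran
2019-06-01 05:00:02 error dev:34: java.sql.SQLException
2019-06-01 05:00:03 info dev:56: motion active
2019-06-01 05:00:04 warn app:12: slow response
"""


def only_errors(log_text, levels=("error",)):
    """Return the log lines whose level field (third token) is in `levels`."""
    kept = []
    for line in log_text.splitlines():
        # Split into: date, time, level, rest-of-line
        parts = line.split(maxsplit=3)
        if len(parts) >= 3 and parts[2].lower() in levels:
            kept.append(line)
    return kept


if __name__ == "__main__":
    for line in only_errors(SAMPLE_LOG):
        print(line)
```

With one error buried among debug and info lines, the filter keeps just that one line, which is roughly the "stripped down" view described above.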

6 Likes

Agree. Sadly IKEA don't do much IoT gear where I am :sob: Actually I have 2 Sonos speakers I can use but they aren't located where I want most of my TTS.

That would be a great addition to HE, if we could see the same log with just the bad stuff. Would open a lot of eyes to the problem some may be having.

2 Likes

And life goes on...

Error 500

HTTP ERROR 500
Problem accessing /hub/reboot. Reason:

Server Error

Caused by:
java.sql.SQLException: An SQLException was provoked by the following failure: com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:118)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:77)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:74)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:694)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at groovy.sql.Sql$36.run(Sql.java:4186)
at groovy.sql.Sql$36.run(Sql.java:4184)
at java.security.AccessController.doPrivileged(Native Method)
at groovy.sql.Sql.createConnection(Sql.java:4184)
at groovy.sql.Sql$AbstractQueryCommand.execute(Sql.java:4593)
at groovy.sql.Sql.rows(Sql.java:1715)
at groovy.sql.Sql.rows(Sql.java:1633)
at groovy.sql.Sql.firstRow(Sql.java:2140)
at groovy.sql.Sql$firstRow$20.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at com.hubitat.hub.dao.InstalledAppDao.getDashboardMenu(InstalledAppDao.groovy:805)
at com.hubitat.hub.dao.InstalledAppDao$getDashboardMenu$14.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at com.hubitat.hub.route.BaseRoute.render(BaseRoute.groovy:47)
at com.hubitat.hub.route.BaseRoute$render.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
at com.hubitat.hub.route.JettyErrorHandler.writeErrorPage(JettyErrorHandler.groovy:17)
at org.eclipse.jetty.server.handler.ErrorHandler.handleErrorPage(ErrorHandler.java:260)
at org.eclipse.jetty.server.handler.ErrorHandler.generateAcceptableResponse(ErrorHandler.java:250)
at org.eclipse.jetty.server.handler.ErrorHandler.generateAcceptableResponse(ErrorHandler.java:170)
at org.eclipse.jetty.server.handler.ErrorHandler.doError(ErrorHandler.java:142)
at org.eclipse.jetty.server.handler.ErrorHandler.handle(ErrorHandler.java:78)
at org.eclipse.jetty.server.Response.sendError(Response.java:655)
at org.eclipse.jetty.server.Response.sendError(Response.java:590)
at org.eclipse.jetty.servlet.ServletHandler$Default404Servlet.doGet(ServletHandler.java:1791)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
at org.eclipse.jetty.websocket.server.WebSocketUpgradeFilter.doFilter(WebSocketUpgradeFilter.java:206)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:564)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1467)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
... 60 more
Caused by:
com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1467)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:644)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:554)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:758)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:685)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at groovy.sql.Sql$36.run(Sql.java:4186)
at groovy.sql.Sql$36.run(Sql.java:4184)
at java.security.AccessController.doPrivileged(Native Method)
at groovy.sql.Sql.createConnection(Sql.java:4184)
at groovy.sql.Sql$AbstractQueryCommand.execute(Sql.java:4593)
at groovy.sql.Sql.rows(Sql.java:1715)
at groovy.sql.Sql.rows(Sql.java:1633)
at groovy.sql.Sql.firstRow(Sql.java:2140)
at groovy.sql.Sql$firstRow$20.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at com.hubitat.hub.dao.InstalledAppDao.getDashboardMenu(InstalledAppDao.groovy:805)
at com.hubitat.hub.dao.InstalledAppDao$getDashboardMenu$14.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at com.hubitat.hub.route.BaseRoute.render(BaseRoute.groovy:47)
at com.hubitat.hub.route.BaseRoute$render.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
at com.hubitat.hub.route.JettyErrorHandler.writeErrorPage(JettyErrorHandler.groovy:17)
at org.eclipse.jetty.server.handler.ErrorHandler.handleErrorPage(ErrorHandler.java:260)
at org.eclipse.jetty.server.handler.ErrorHandler.generateAcceptableResponse(ErrorHandler.java:250)
at org.eclipse.jetty.server.handler.ErrorHandler.generateAcceptableResponse(ErrorHandler.java:170)
at org.eclipse.jetty.server.handler.ErrorHandler.doError(ErrorHandler.java:142)
at org.eclipse.jetty.server.handler.ErrorHandler.handle(ErrorHandler.java:78)
at org.eclipse.jetty.server.Response.sendError(Response.java:655)
at org.eclipse.jetty.server.Response.sendError(Response.java:590)
at org.eclipse.jetty.servlet.ServletHandler$Default404Servlet.doGet(ServletHandler.java:1791)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
at org.eclipse.jetty.websocket.server.WebSocketUpgradeFilter.doFilter(WebSocketUpgradeFilter.java:206)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:564)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
at java.lang.Thread.run(Thread.java:748)
Powered by Jetty:// 9.4.z-SNAPSHOT

Seemingly caused by a database issue arising with a Xiaomi motion sensor. That's new.

And another error from the device used to track wifi connection of my phone...

And the RM flagged error I've seen before...

Did you update that iphone wifi driver?

Yup, about 2 weeks ago.

1 Like

@jwetzel1492 I was asking if the OP had updated his driver.. I knew you did your part

LOL I need more coffee. :smiley:

2 Likes

Yes, I saw the post about this a while back and updated it immediately.

Not to pile on, but there appears to be something wrong with your setup. Slowdowns, and even hard crashes, are not normal. Are there a few that have issues? Yes. But you seem to be having lots of unusual database issues.

I might kindly suggest disabling (not removing) all apps and devices, then re-enabling them one at a time. See if any one thing triggers the failure you are having. Giant pain in the ■■■, but this will eliminate your particular devices and automations from the picture. Yes, I realize it is inconvenient, but how are you going to prove to support that you did your due diligence?
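The one-at-a-time approach can be sketched as a simple loop. This is only an illustration: `find_culprit` and the `hub_is_healthy` callback are hypothetical names, and in real life the "check" is you watching the hub and its logs for a while after each re-enable, not a function call.

```python
def find_culprit(items, hub_is_healthy):
    """Re-enable items one at a time and return the first one that breaks the hub.

    items: names of apps/devices, all assumed to be disabled at the start.
    hub_is_healthy: hypothetical callback that receives the list of currently
        enabled items and returns True if the hub still behaves.
    Returns None if the hub stays healthy with everything re-enabled.
    """
    enabled = []
    for item in items:
        enabled.append(item)  # re-enable the next app or device
        if not hub_is_healthy(enabled):
            return item  # failure first appeared after enabling this one
    return None


if __name__ == "__main__":
    # Simulated hub: pretend "device B" is the one that causes trouble.
    def simulated_check(enabled):
        return "device B" not in enabled

    print(find_culprit(["app A", "device B", "app C"], simulated_check))
```

Note that this only finds the first item after which the failure shows up. If the problem comes from two things interacting, the item it names is just the last piece of the combination, so it is a starting point rather than proof.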

From there, I would think that you have either a failing hub or a failing Z-stick. I read a post quite a while back (that I can no longer find) where someone else was having similar issues; the Z-stick was failing, and Hubitat sent a new one, which resolved the issue.

3 Likes

Your last screenshots show some SQL exceptions and failures because the "factory or source appears to be down". These types of messages often happen when an app is trying to do something while the hub is being shut down, either forcibly through the shutdown URL or from the UI.
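For anyone who finds a wall of stack trace hard to read, here is a small sketch that boils one down to its "Caused by" messages plus the first frame from the hub's own code. The function name `summarize_trace` and the `com.hubitat.` prefix filter are my own choices for the example; the sample lines are copied from the trace posted earlier in this thread, shortened.

```python
# A few lines copied from the trace posted above (heavily shortened).
SAMPLE_TRACE = """\
Caused by:
java.sql.SQLException: An SQLException was provoked by the following failure: com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:118)
at groovy.sql.Sql.firstRow(Sql.java:2140)
at com.hubitat.hub.dao.InstalledAppDao.getDashboardMenu(InstalledAppDao.groovy:805)
at com.hubitat.hub.route.BaseRoute.render(BaseRoute.groovy:47)
Caused by: com.mchange.v2.resourcepool.ResourcePoolException: A ResourcePool cannot acquire a new resource -- the factory or source appears to be down.
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1467)
"""


def summarize_trace(trace, app_prefix="com.hubitat."):
    """Return (causes, first_app_frame) for a Java stack trace given as text."""
    causes = []
    first_app_frame = None
    lines = [ln.strip() for ln in trace.splitlines()]
    for i, line in enumerate(lines):
        if line.startswith("Caused by:"):
            msg = line[len("Caused by:"):].strip()
            # The Jetty error page puts the exception on the following line.
            if not msg and i + 1 < len(lines):
                msg = lines[i + 1]
            if msg and msg not in causes:
                causes.append(msg)
        elif first_app_frame is None and line.startswith("at " + app_prefix):
            first_app_frame = line[len("at "):]
    return causes, first_app_frame


if __name__ == "__main__":
    causes, frame = summarize_trace(SAMPLE_TRACE)
    for cause in causes:
        print("cause:", cause.split(":")[0])
    print("first hub frame:", frame)
```

Read this way, the trace says the connection pool could not hand out a database connection while the hub was rendering the error page for /hub/reboot, which fits the "hub is going down" explanation.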

When you say the hub is rebooted nightly, are you forcing it to reboot on a schedule? If so, why? If other things are happening at the time of the reboot, you are going to have problems in the log around the time of those reboots. If so, when are you rebooting? If so, what is happening on the hub when you are rebooting?

Did you take the time to make the reboot happen away from maintenance? Did you make sure nothing else on the hub would be happening when you rebooted? Did you try to solve the problem without rebooting? Rebooting should be a last resort. I tried to use it as a solution for a while and it didn't help, but I didn't use it on a schedule. I used an app that I contributed to that would essentially detect when the hub was going down and reboot then. You wouldn't force your car off without putting it into park or neutral. You wouldn't force your phone or computer off unless you had to; you would do it through the UI. Why are you doing it with your hub?

And just to add to the general sentiment that I've seen expressed here and in other threads...

  • The HE platform is under development. There aren't a lot of preventative measures in place to stop users from configuring themselves into a corner or tools to help users understand that they have set themselves up for failure. That's not a great scenario because some users (especially those coming from SmartThings) are not going to understand that local processing has limits e.g. the hub hardware is limited and resources are actually finite. And on top of that it is still being optimized and vetted every day.
  • From what I've seen you are more than willing to place the blame on the platform, devices, developer code, anything except your configuration or choices.
  • From what I've seen you are not willing to troubleshoot how you might be compounding a problem by disabling or uninstalling apps/devices to isolate the issue.

With a topic name like "Here we go again & again" you are not really calling out for help. You are taking aim at the platform. That's fine. You are entitled to do what you want to do. But I don't think you understand a few key factors that are preventing you from succeeding here.

Most of the local processing hubs are FOSS platforms. They are ultimately much less user friendly than HE. I'm a developer and openHAB made me uncomfortable. HASS is okay but it's still too much work even for me. And the rest really just don't have the flexibility to do everything yet. If you want local processing, to be able to do almost anything and a lot of user control then you are stuck here.

There will be frustrations but when considering the alternatives I still choose to be here.

9 Likes

Thanks for the detailed note @neonturbo

I'm working through this at the moment. Agree it is a royal PITA but without additional analysis tools on the hub there is no alternative after the other things I've tried.

The hub is failing. I don't have a Z-wave stick because it's a C5 so it's not a faulty external stick at least.

Thanks for the support. I will keep at it.

Interesting post @codahq, thanks for your time in writing it. Let me address some of your points and assumptions.

Yes. It was a last resort to reboot on a nightly basis given the poor performance of the hub (slow-downs). Reading various other posts and looking at various available apps it seemed like a sensible approach and is adopted by many users as I'm sure you know.

The hub is rebooted nightly at 5 am. There shouldn't be much going on at that time, but certainly there may be some activity (wifi connection check, lamp polling to Hue, motion in bedroom sensors etc). I'm assuming this would be the case with any reboot at any time of day using any method. It's a valid point. However, as explained above, it was a solution that of course only cures the symptom, while I've been trying to understand and fix the root cause(s).

Yes, I "took time" to schedule it away from 2 am.

Up to you to make whatever assumptions you like. The reality is I've been working through various steps based on what I've read here. Since you took time to write a lengthy response on this thread I will write them down so that you don't have to assume what I am or am not willing to try:

I've updated apps.
I've changed the LAN cable and monitored for changes.
I've executed soft resets.
I've installed a UPS and monitored for changes.
I've checked and rechecked rules to try to ensure they are not out of control or looping.
I've carefully inspected the logs and tried to understand each error.
I've reconnected Google Home when I saw errors related to that.
I've contacted support (multiple emails).
I've disabled Chromecast beta (as instructed).
I've made sure that disconnected devices are powered down (so that the nightly maintenance run can remove them as ghosts once converted).
I've got Device Watchdog etc. installed to try to monitor the situation.
I've started working through disabling other apps/drivers.

If there's anything else you or others know about, feel free to suggest it.

Yes I agree, it is fine to call out a real-world case of a problematic hub. To be honest I don't really much care if you or others don't like it or don't like the approach. I'm certainly willing to try things as indicated above, within my knowledge and ability. The title of the thread is absolutely my feeling every time the hub has an issue, slow down or crash. It also reflects my personal (British) sense of humour over the whole situation which is keeping me sane in a situation where multiple times I (or more likely my wife) could willingly throw it from an upstairs window. I am listening to constructive input and trying things in the hope that the core of my several thousand dollar investment in IoT will one day work reliably.

OK, but then what are they, specifically (without assuming anything and with the clarifications provided above)?

2 Likes

You've done a lot more than I gave you credit for so I apologize for making incorrect assumptions. Keep trucking on the action above. Also, I don't think I would reboot on a schedule.

If that doesn't reveal the problem then I would say start targeting individual devices and how they are configured/paired. I found I had a motion sensor (Zooz ZSE02) that was the worst repeater in the entire universe so I trashed it. It caused Z-Wave commands to back up and devices to panic. I also had to exclude some Hank RGB bulbs and re-include them. For some reason repair was not fixing an issue with them as neighbors. (I still have a nagging issue during heavy Z-Wave traffic and non-plus Z-Wave devices but that is a story for another time. It is 19 out of 20 times temporary and self-corrects.)

I've had to learn a lot (for example I zniff now) but I don't feel like it was the platform's fault. I feel like I would have had to learn a lot of what I did on the other FOSS platforms, and I wouldn't have cared to learn it on the retail platforms because there wouldn't have been enough control on those platforms to do anything about the problems.

Enough users are having a good experience that it's fair to assume the platform is capable of good experiences. I've had my share of problems and I'm more than happy to commiserate with you over yours. This platform is not problem free but it's at least always improving and the awesome thing is you can see that in the HE culture/developers and the community that surrounds them. I really want to see this platform continue to improve and succeed long-term. Z-Wave is just too fricking annoying on openHAB and hass.io and I'm neck-deep (actually way over my head) in Z-Wave devices.

A lot of people around the community feel it is necessary to have a second or even third hub, a sentiment I don't agree with. (I guess I make exception if you aren't willing to walk away from non-standard devices that are proven not to play well with others on any hub and have to run them in a sandbox.) It has taken a lot of honing but (outside of the occasional self-inflicted wound from beta testing or experimenting) I am stable on one hub with 186 devices (around 100 Z-Wave, 30 Zigbee, mixed LAN devices, and a whole bunch of virtual devices from a Ring integration) with full time HTTP, telnet, websocket and UDP clients running. This includes the OOB Google Assistant integration app and the ST community Cast-Web project. I guess all of that is a really long-winded way of saying it's possible to find something you can live with and appreciate with a lot of work.

Every platform I have tried is a lot of work. Some of them have a lot of "won't fix" problems. I don't feel like we have that problem here.

5 Likes

Thanks for the reply. Agree absolutely. This platform is amazing when it works and I also don't believe there is anywhere better to go. I also hope to be able to run my relatively modest system on the 1 hub at each property. I've got stacks of rules that simply blow me away with what this system is capable of achieving. Then I get hang-ups or slowdowns and it's very frustrating. I'm spending hours on this forum trying to find new ideas. It's my hobby too. So I'm basically an advocate. Go to Amazon and see the questions I've answered about HE with positive responses to try to help build the user base. Anyone saying I'm negative is missing the point or has Hubitat too far up their ***. At the end of the day, we all want it to improve and be really awesome.

Regarding your recommendation not to reboot too often, I listen to that and so have reduced the reboot frequency. I would use Rootin' Tootin' but I had a bad experience with it getting into a loop that we never resolved so I prefer a scheduled reboot. I've reduced the frequency and maybe as you say that will actually help find the issues more clearly.

I will keep trying different things and anticipate, in parallel with that, the improvements coming from the HE team.

5 Likes

I agree you don’t NEED multiple hubs except in rare conditions.. Ex: I have a second production hub that runs just a Lightify mesh, as we all know those things are horrible repeaters for anything but other lights..

I also have a dedicated development hub as I have messed up my production z-wave mesh many times during dev projects..

But yes all my automations run on and 98% of my devices connect to my primary hub.. My second production hub has no automations and is just a zigbee slave hub for my lightify mesh which is controlled by my primary hub..

3 Likes

I remember this. I thought @adamkempenich fixed that bug?

I haven’t put much time into Rootin’ Tootin’—but @codahq has done some solid work, and it should be fixed. My hubs have been incredibly stable, so I’ve actually been able to uninstall RT on my main two.

@Angus_M, I’m sorry to hear about your troubles :slightly_frowning_face: If you feel like giving RTSR another try, I’ll be standing by!

2 Likes

That would be ME. :smiley:

I have 6 Hubitat Hubs. So apparently I'm an addict. :smiley:

7 Likes