Hubitat Needs Better Logging for Easier Troubleshooting

I've been using Hubitat for a few years and, in general, it works fine. But every time something goes wrong (which is inevitable with any home automation system), it's very difficult to troubleshoot because the logging is... well... not awesome. I'm an operations engineer in tech, so I deal with logs every single day. The difference between good logging and bad logging is huge when troubleshooting an issue.

I'd say, at a very high level, there are two things we generally care about when something goes wrong:

  1. Events
  2. Devices

Every other home automation system I've used follows these general concepts for logging and they can be found in the same place.

9 times out of 10, the typical troubleshooting flow goes something like:

  • X was supposed to happen but didn't
  • Check the log to see if the trigger occurred
  • Troubleshoot from there

Now I could be wrong, but from what I can tell, triggers, events and other things that happen aren't logged in the logging area.

A simple example of this would be a rule that turns a light on at sunset. Normally, you would expect the home automation log to show 3 things:

  1. Sunset occurred
  2. The rule was run
  3. The device was turned on

I could be wrong here, but I only see item #3 in the log.
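To make the sunset example concrete, here's a rough sketch of the kind of single timeline I mean. It's plain Python, not anything Hubitat actually provides: the three lists, the rule name, the timestamps, and the `merged_timeline` helper are all made up for illustration.

```python
import heapq
from datetime import datetime

# Hypothetical entries as (timestamp, source, message).
# None of this is real Hubitat output; the values are invented for the example.
location_events = [
    (datetime(2022, 3, 25, 19, 12, 0), "location", "sunset"),
]
rule_events = [
    (datetime(2022, 3, 25, 19, 12, 0, 150000), "rule", "Porch light at sunset: triggered"),
]
device_events = [
    (datetime(2022, 3, 25, 19, 12, 1), "device", "Porch light: switch is on"),
]

def merged_timeline(*streams):
    """Merge streams that are each already sorted by time into one sorted list."""
    return list(heapq.merge(*streams, key=lambda entry: entry[0]))

timeline = merged_timeline(location_events, rule_events, device_events)
for ts, source, message in timeline:
    print(f"{ts.isoformat(timespec='milliseconds')} [{source}] {message}")
```

With everything in one list, items 1, 2, and 3 show up next to each other in time order, which is all I'm really asking for.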

Events for rules can be found in the rule object itself... but only if you manually turn on logging for that particular rule. Why? Why wouldn't logging for everything that happens just be enabled by default, even if only at a basic level? I understand if verbose logging is not always enabled, but why wouldn't basic event logging be enabled for each rule and sent to the main Hubitat log? Why is the Hubitat log for device events only?

Correct me if I'm wrong on any of this but dang, this system is hard to troubleshoot.

If logs were on by default for every event, trigger, and action, your useful logs would very quickly be pushed so far down that they would either drop off entirely or take lots of digging or filtering to find. Even basic logs on every device and every rule would very quickly fill the logs with thousands upon thousands of entries.

I also suspect there would be some impact on hub memory or other hub resources if you kept more and more logs and just kept filling all the available storage with them.

I turn off every log I can so I have very few other devices filling up logs when I do want to log. You can turn on logging of various levels on a device, or for rules as needed. So when I troubleshoot, I just turn on the 2-3 things I want to see.

So maybe I don't understand your question, but why would having more logging ON by default be better or easier than turning things on only as needed? It just is needless clutter in my opinion.

I'm not sure I really agree with the clutter argument. That's what filtering is for. Computers store enormous amounts of data - the things that happen throughout the day in my home are inconsequential. Like I say, every other home automation system I've used logs events and devices in the same place without the requirement to go turning individual things on... they are on by default.

It's a total pain to have something not trigger and then miss the chance to troubleshoot it because logging wasn't turned on. So I have to enable logging for the thing that didn't happen, then wait for it to happen again. And as Murphy's Law would have it, that specific failure won't happen again for a few weeks, but some other thing will fail to run, and then I'll have to enable logging for that and wait for it to fail again. It's like whack-a-mole.

You'll see #1, sunset time, under "Logs" (in recent firmware versions), but it's under "Location Events" rather than current or past logs. I'm not aware of any way to make this appear as a log entry per se without doing something that actually logs it yourself, but it's there.

You'll see #2 as @neonturbo indicated if you have logging enabled for that rule--"trigger" or "event" logging in particular, though either will only say that the rule was triggered (the former) or what specific event triggered it (the latter). To see what the rule is actually doing, you'll also want "action" logging enabled. Different apps have slightly different logging options, but this is how it works for Rule Machine, which I assume you're talking about.

By default, you will, as you note, also see #3. This doesn't really tell you whether a command was sent to the device, just that the device reported back its current state (which could have been in response to a command, but it could have been a physical action too--or an app besides the one you think, which is why action logging can be helpful for rules). This is part of the "Enable descriptionText" logging option in the driver. Some also log commands sent to the device with debug logging enabled, but not all (especially older drivers) do, and it should also be noted that descriptionText logging (and most app logging, though this is less standardized) stays enabled indefinitely but debug logging auto-disables after 30 minutes by convention.

Past logs are limited by size, so the benefit mentioned in the other post is more than just "clutter" visually--it can help things still be available when you need them. I normally keep app logs off when I'm not interested in troubleshooting and let the debug logging self-disable, but I do keep info/descriptionText logging on as it remains by default since it can be helpful to see device events without having to dig into each one.
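As a rough illustration of why a size limit matters, here is a small Python sketch of a bounded log where the oldest entries fall off as new ones arrive. It is only a model: the hub's limit is by size rather than entry count, and `LOG_CAPACITY`, the `simulate` function, and the entry counts are all invented numbers, not anything measured on a hub.

```python
from collections import deque

LOG_CAPACITY = 1000  # hypothetical capacity; the real limit is not stated here

def simulate(chatty_entries_per_hour, hours, capacity=LOG_CAPACITY):
    """Return True if one interesting entry logged at hour 0 is still
    in the bounded log after the given number of hours of routine logging."""
    log = deque(maxlen=capacity)  # oldest entries are dropped automatically
    log.append("interesting failure")
    for hour in range(hours):
        for i in range(chatty_entries_per_hour):
            log.append(f"hour {hour} routine entry {i}")
    return "interesting failure" in log

print(simulate(chatty_entries_per_hour=100, hours=5))  # quiet setup
print(simulate(chatty_entries_per_hour=500, hours=5))  # everything logging
```

In this toy model the quiet setup still has the interesting entry after five hours, while the chatty setup has already pushed it out.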

I have had the situation of logs being on by default when first setting up the hub, and the logs drop off within a few hours. There aren't days upon days of storage room. This isn't exactly a computer with terabytes of storage; the small onboard flash memory is limited.

I might be wrong about this, but I think this flash memory is also limited in how many write cycles it can take, so you can wear it out by writing to it excessively.

You should just turn on all logging options if you really want that level of logs. You can do it when setting up your rules or when joining devices. If you want to go back and turn on all device logging, you can use the built-in Preference Manager to set all logging on. I don't think there is a similar way for rules, but it shouldn't take too long to run through those and turn on logs.

(There might be more; this was just a nice grouping of log options in Preference Manager.)

They most certainly are if logging is turned on for the relevant apps and/or devices. Logging is off by default for apps. In general, the apps log everything of relevance for exactly the sort of troubleshooting you describe, once logging is turned on.

I had a similar question and started the thread below. Essentially, I'd like troubleshooting at a device history/event level, as in SmartThings. It's the only thing I miss from my migration. It was definitely a lot easier with ST...

My biggest issue with the logging is that a device doesn't seem to know what acted upon it. For instance, a while back I was having an issue with a hall light seemingly coming on at random in the middle of the night and then just staying on all night and not turning back off. I had a really hard time tracking down what was causing that because all the logging for the device would tell me is that the light turned on. So to figure out what was turning it on, I had to already know what was turning it on.... so by figuring that out, I didn't even need the logging. But just having the log entry for the light say "Hall light was turned on by X" would have made it so much easier.

Exactly. Or if the logging were all in the same place, including all the triggers, other events, devices, etc., then you'd be able to see that one second before your light turned on, X event happened (e.g. sunrise, a door opened, a rule was triggered) and you'd be able to correlate.

But as it is now, you gotta run around looking in different places for different things, and it's impossible to correlate or figure out how they are connected.
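Here's roughly what I mean by correlating, sketched in Python. It assumes you already have everything in one time-sorted list, which is exactly the part that's missing today. The `preceding_events` helper, the device names, and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta

def preceding_events(timeline, target_index, window_seconds=5):
    """Return entries logged within window_seconds before timeline[target_index].

    Assumes timeline is a list of (timestamp, source, message) sorted by time.
    """
    target_ts = timeline[target_index][0]
    start = target_ts - timedelta(seconds=window_seconds)
    return [e for e in timeline[:target_index] if start <= e[0] <= target_ts]

# Made-up unified log for the hall light problem.
unified_log = [
    (datetime(2022, 3, 25, 1, 0, 0), "device", "Front door: closed"),
    (datetime(2022, 3, 25, 2, 14, 6), "device", "Hall motion: active"),
    (datetime(2022, 3, 25, 2, 14, 6, 500000), "rule", "Hall motion lighting: triggered"),
    (datetime(2022, 3, 25, 2, 14, 7), "device", "Hall light: switch is on"),
]

# What happened in the 5 seconds before the hall light turned on?
for ts, source, message in preceding_events(unified_log, target_index=3):
    print(ts.time(), source, message)
```

In this invented data, the lookup returns the motion event and the rule trigger and skips the door event from an hour earlier, so the culprit is obvious without already knowing it.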

That's really funny, because a hall light coming on and not turning off is exactly why I started this thread, although the way logging works in HE has been driving me nuts since the beginning. Like I said, no other tool I use has logging so spread out like this or disabled by default. This is very unusual and difficult to use.

Everything you need will show up in the one Logs section of the hub, if you enable the logging on the devices and apps you're trying to troubleshoot.

LOL - IIRC, I believe SmartThings has lost this functionality with the "New" SmartThings App and their new and improved backend cloud infrastructure.

But, I know exactly what you mean. It would be nice to know which app caused a device to change, without having to look through the logs.

Agree with above but, by way of refining "better" logging, would add another reason to go through logs -- troubleshooting hub/app/device health. I'm still new to HE and was poking around in the logs and happened to see this entry:
2022-03-25 07:35:33.667 am Error making Call to Alexa message gateway...

Answering the question, "When did this start?" is quite tedious. The Amazon Echo Skill event log was empty and under Logs there was no way to see all errors for that app. For HE to facilitate this use case, two things are missing:

  1. A filter for logs by level (info, warn, error)
  2. A "Show all..." button in the logging page footer alongside "Show more..."
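To show what I'm after with the first item, here is a small Python sketch of filtering by level and answering "When did this start?". It works on a made-up list of entries, not on anything the hub exports: the dict layout, `LEVELS`, `filter_logs`, and `first_seen` are my own assumptions, and every sample entry except the 07:35:33 error above is invented.

```python
from datetime import datetime

# Hypothetical severity ordering for the levels mentioned above.
LEVELS = {"debug": 10, "info": 20, "warn": 30, "error": 40}

def filter_logs(entries, min_level="error", app=None):
    """Keep entries at or above min_level, optionally for a single app."""
    threshold = LEVELS[min_level]
    return [
        e for e in entries
        if LEVELS[e["level"]] >= threshold and (app is None or e["app"] == app)
    ]

def first_seen(entries, min_level="error", app=None):
    """Return the earliest matching entry, or None if nothing matches."""
    matches = filter_logs(entries, min_level, app)
    return min(matches, key=lambda e: e["time"]) if matches else None

# Sample data: only the 2022-03-25 07:35:33.667 error comes from my logs;
# the other entries are invented to make the example work.
entries = [
    {"time": datetime(2022, 3, 24, 22, 10, 5), "level": "info",
     "app": "Amazon Echo Skill", "message": "routine sync (made up)"},
    {"time": datetime(2022, 3, 25, 7, 35, 33, 667000), "level": "error",
     "app": "Amazon Echo Skill", "message": "Error making Call to Alexa message gateway..."},
    {"time": datetime(2022, 3, 25, 7, 40, 0), "level": "warn",
     "app": "Rule Machine", "message": "slow action (made up)"},
    {"time": datetime(2022, 3, 23, 18, 2, 11), "level": "error",
     "app": "Amazon Echo Skill", "message": "Error making Call to Alexa message gateway..."},
]

start = first_seen(entries, min_level="error", app="Amazon Echo Skill")
print(start["time"] if start else "no errors found")
```

The second item matters for the same reason: `first_seen` only gives a real answer if all of the history is loaded, not just the most recent page.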

One other thought. If you're in the troubleshooting mode @wheeler.dan described, turning on debugging for a device/app is fine but does require touching the HE. I had a 32GB USB drive connected to my Vera+ so never had to touch it for this reason. Adding a USB port is expensive. Adding functionality to map a network drive is much easier.

I am jumping on this topic because I agree that troubleshooting is a little trial and error.

I had a C4 hub which I migrated to the new C8. It was all working fine until one day, out of the blue, none of my smart bulbs or switches were working.
Essentially, anything Zigbee was just broken.
I tried a reset and repair of one device, and it would not rejoin. I tried looking for some clue as to what had occurred, but found nothing.

Note: I am pretty handy with Ethernet and Wi-Fi troubleshooting, and plenty of tools exist there for seeing interference and so on, but they don't work with Zigbee.

After a few days of on-and-off tinkering and getting nowhere, I bit the bullet and did a full reset of the hub.

This worked. It did mean I lost all of my setup, but for me that was an opportunity to re-implement some rules and automations.

Things I can't be clear on:

  1. Did migrating from C4 to C8 cause some sort of DB corruption that could not be solved by a restore? If so, then this is a one-off, and I take that as part of the migration challenges.

  2. My hub was actually seeing the devices, so my network was fine: I would see the 'Physical' events for switches and so on in the logs, but the Zigbee triggers were all broken. Chromecast devices and so on were operating normally.

  3. Not being able to see the network, the devices attached, and how they route is annoying. It seems to be very easy to do with networked Wi-Fi or Ethernet devices, so why can't you just add a Zigbee dongle to a PC or phone and inspect the network? In the end it may or may not help, but the process of troubleshooting is to start at a point, work around that, and keep going until you find the root cause.

  4. My Internet router has plenty of logs about the devices and the traffic going through it, yet with Hubitat it does not seem like this is possible or exposed.

  5. I still have no clue what caused the symptom I had, or how I could have fixed it without a full reset. (Bear in mind I did a soft reset and restored the DB, which did not work, and I reset the Zigbee radio, which did not work either.) To me, the Zigbee messages from the switches were getting back to the hub, so the 'network' was fine, but the ability to trigger any events from the hub was broken, and I could not see why.

Maybe I missed a guide somewhere, but this topic showed up when searching for 'troubleshooting'.