No, those logs aren't centralized. We have to reach into the hub and fetch them individually whenever there's a support ticket. Hub access is limited by design.
Let's start with dumping those watchdog warnings into the regular log, then. I'm going to call it the "app/driver naming and shaming program".
Some people file tickets stating that their hub is unresponsive or slow, and it takes several emails back and forth to even convince them to look at the logs. Most of these are logged today. Based on this experience, I would say that the default should be to prevent a method from running, with an "advanced" option to turn that off. That would improve customer experience and decrease the number of tickets immediately.
Great. I'm sure that my Ecobee Suite is going to be one of the first members of the shaming club...
FWIW, I found (the hard way) that Hubitat's atomicState memory management has/had memory leaks - after several days of running, the entire hub would slow down, often to the point that you had to use the back-door to reboot the hub. This drove me to revamp how I maintain state for all of the thermostats' attributes, and to explicitly release atomicState data maps when I was through with them. The end result has been a significant performance improvement and overhead reduction.
And yes, I changed to asyncHttp requests back in the SmartThings-only days, because it gave me a new thread in which to process the returned data (and a new 20-40 seconds in which to do the processing).
I would recommend a much lower default - if my code takes 5 minutes to do ANYTHING, then something is wrong. I would prefer that you set the default to 60 seconds (longer than SmartThings' limit of 20/40 seconds). If I start getting run-time execution errors at 60 seconds, THEN I'll extend it, but only after checking my code to be sure that 60 seconds wasn't enough time.
I'd recommend the same for synchronous HTTP calls as well - let me extend the timeout if I need it, and I'll also extend the run-time limit at the same time.
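To make the "low default, extend if needed" idea concrete, here is a minimal Java sketch of an elapsed-time limit around a handler. It is not how the hub implements its watchdog; `RunLimit`, `runWithLimit`, `RunLimitExceeded`, and the 60-second constant are all hypothetical names and values chosen for illustration, and only standard `java.util.concurrent` calls are used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical unchecked exception raised when a handler exceeds its limit. */
final class RunLimitExceeded extends RuntimeException {
    RunLimitExceeded(long limitMs) {
        super("handler exceeded " + limitMs + " ms of elapsed time");
    }
}

final class RunLimit {
    // Assumed default, taken from the 60-second suggestion in the post above.
    static final long DEFAULT_LIMIT_MS = 60_000;

    /** Runs the handler and returns its result, or throws RunLimitExceeded after limitMs. */
    static <T> T runWithLimit(Callable<T> handler, long limitMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(handler);
        try {
            // Waits on wall-clock (elapsed) time, not CPU time.
            return future.get(limitMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption; code that never checks for it keeps running.
            future.cancel(true);
            throw new RunLimitExceeded(limitMs);
        } catch (ExecutionException e) {
            // The handler itself failed; surface the original cause.
            throw new IllegalStateException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A caller that really needs more time would pass a larger `limitMs` explicitly instead of relying on `DEFAULT_LIMIT_MS`, which mirrors the "extend it only after checking my code" workflow.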
Definitely elapsed time - collecting CPU stats has serious overhead. Just running VisualVM against hub software nearly halves the little box's performance.
While I understand that, this is where I get scared. If it's just elapsed time, use of something like synchronized could trigger this even though I might be using exactly zero CPU because every other thread got the lock before me.
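The concern is easy to reproduce on a plain JVM. The sketch below (hypothetical class and method names, standard `java.lang.management` API only) has one thread wait on a contended `synchronized` block and reports both its elapsed time and its CPU time. It says nothing about how the hub measures anything; it only shows that the two numbers can diverge widely.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class BlockedTimeDemo {
    private static final Object LOCK = new Object();

    /**
     * Returns {elapsedNanos, cpuNanos} for a thread that spends roughly holdMs
     * blocked on LOCK. cpuNanos is -1 if the JVM cannot measure thread CPU time.
     */
    static long[] measureBlockedThread(long holdMs) {
        final ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        final boolean cpuSupported = mx.isCurrentThreadCpuTimeSupported();
        final long[] result = new long[2];

        Thread waiter = new Thread(() -> {
            long wallStart = System.nanoTime();
            long cpuStart = cpuSupported ? mx.getCurrentThreadCpuTime() : 0;
            synchronized (LOCK) {
                // Nothing to do; the cost is entirely in waiting for the lock.
            }
            result[0] = System.nanoTime() - wallStart;
            result[1] = cpuSupported ? mx.getCurrentThreadCpuTime() - cpuStart : -1;
        });

        try {
            synchronized (LOCK) {
                waiter.start();       // waiter blocks because this thread holds LOCK
                Thread.sleep(holdMs); // hold the lock while the waiter is stuck
            }
            waiter.join();            // join makes the waiter's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] t = measureBlockedThread(500);
        System.out.println("elapsed ms: " + t[0] / 1_000_000);
        System.out.println("cpu ms:     " + (t[1] < 0 ? "unsupported" : String.valueOf(t[1] / 1_000_000)));
    }
}
```

A limit based purely on elapsed time would charge the waiting thread for the whole hold period, even though it did almost no work.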
No, the side effect of making it interruptible is significantly reduced performance.
There are tools for finding busy devices on the Logs page in the current version, though. Basic profiling turned out to be cheap, resource-wise, and it usually provides enough information to spot the troublemaker app/device.
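For readers wondering why this kind of profiling is cheap, here is a rough Java sketch of the general idea: add up elapsed time per app or device name and list the largest totals. This is an illustration under my own assumptions, not the hub's actual implementation; `BusyProfiler` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class BusyProfiler {
    // Total elapsed nanoseconds per app/device name.
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    /** Runs the handler and charges its elapsed time to the given name. */
    void time(String name, Runnable handler) {
        long start = System.nanoTime();
        try {
            handler.run();
        } finally {
            record(name, System.nanoTime() - start);
        }
    }

    /** Adds an already-measured duration to the given name's total. */
    void record(String name, long nanos) {
        totalNanos.computeIfAbsent(name, k -> new LongAdder()).add(nanos);
    }

    /** Returns up to n names, ordered from most to least total elapsed time. */
    List<String> busiest(int n) {
        List<Map.Entry<String, LongAdder>> entries = new ArrayList<>(totalNanos.entrySet());
        entries.sort(Comparator
                .comparingLong((Map.Entry<String, LongAdder> e) -> e.getValue().sum())
                .reversed());
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, LongAdder> e : entries.subList(0, Math.min(n, entries.size()))) {
            names.add(e.getKey());
        }
        return names;
    }
}
```

The per-call cost is two `System.nanoTime()` reads and one counter update, which is far lighter than attaching a full profiler to the process.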