InfluxDB Logger Support

Right. That's an issue which can normally be avoided (e.g. by avoiding get-and-set patterns outside of the concurrent data structure). I found one problematic pattern in writeQueuedDataToInfluxDb():

        writeData = myLoggerQueue.toArray().join('\n')
        myLoggerQueue.clear()

Looping over the queue and extracting elements with calls to poll() would not require a mutex, but might impact performance.
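For illustration, here is a minimal sketch of that drain-with-poll() approach in plain Java (the same JVM collections a Hubitat Groovy app uses). The queue and method names are hypothetical stand-ins, not the app's actual fields:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainExample {
    // Hypothetical stand-in for the app's logger queue
    static final ConcurrentLinkedQueue<String> loggerQueue = new ConcurrentLinkedQueue<>();

    // Drain the queue one element at a time with poll(). Unlike
    // toArray() followed by clear(), this cannot drop entries that are
    // enqueued between the two calls: poll() atomically removes exactly
    // the elements it returns, and anything added concurrently simply
    // stays queued for the next write cycle.
    static String drainToPayload() {
        StringBuilder sb = new StringBuilder();
        String record;
        while ((record = loggerQueue.poll()) != null) {
            if (sb.length() > 0) sb.append('\n');
            sb.append(record);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        loggerQueue.add("temperature value=21.5");
        loggerQueue.add("humidity value=40");
        System.out.println(drainToPayload());
    }
}
```

The per-element cost of poll() is the performance concern mentioned above, but there is no window in which concurrently enqueued records can be silently cleared.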

@mavrrick58, in answer to your question: the average time represents both, plus anything else Hubitat views as part of the application. I don't know everything it encompasses.

It's Hubitat's view of how much time it spends when it invokes any function in the app. Hubitat's view across the board.

Adding explicit elapsed-time checks to the functions indicates that, with either approach, softPoll() averages 275-280ms and writeQueuedDataToInfluxDb() averages 9-10ms.

The average of these two functions would be 142-145ms. For the overall average to come down to Hubitat's reported 110-120ms, there must be some other, faster function(s) that Hubitat counts as part of the mix.

Thanks for explaining.

Honestly, after taking a look this morning, I think some of my earlier concerns were a knee-jerk reaction. I think the majority of what made the big improvements during our previous upgrade to add 2.x support is still there. After reviewing my logs this morning, the performance seems to be similar even using the State value. I am waiting to see what happens with the App Stats to see if they get back to a similar CPU Time, but I am pretty confident they will. 8% over very little is next to nothing.

FWIW, I think most of the performance improvement came from removing the second copy of the pending array, as well as the SmartThings leftovers.

I've pushed version 2.2.0.

Changes include:

  • Support for multiple instances of the application. Previously only one instance of the application was supported. Additional instances could be installed, but would not function correctly.
  • An option to control whether updates received from a device are published to InfluxDB even if the actual value hasn't changed. Previously, updates were only published if the value changed.
  • A dramatic improvement to event timestamping. Previously, timestamps were not assigned to the time of the actual event, but were assigned to the time the event was published to InfluxDB. This offset could be 1-15 minutes based on the "How often to write to db" configuration parameter.
  • A fix to properly decode valve events.

Many thanks to @hubitrep who drove most of these changes.

Those are some great changes @dennypage and @hubitrep, and some of the other reasons (that I forgot were issues until now) why I did not like the way this app saved the data.

From the sounds of it, with soft polling disabled this app would now save events to InfluxDB the same way I am doing it with NR, basically mirroring the device's event log into InfluxDB. I may have to fire this thing up and give it a try again.

What would be the best way to get app/driver stats hooked into this? e.g. where having a time series would help track down performance issues with specific apps/drivers.

What kind of app data are you thinking of? I don't know how you would even start that without either developers writing their stuff to integrate with InfluxDB Logger, or Hubitat building in something low level to dump the data.

If it is just syslog data that is possible though.

I’m thinking about the data you can find in the app or device stats tabs of the hub logs UI. If the same info were available through device events (a la Hub Info Driver by @thebearmay), it would be super helpful.

Or maybe it’s available some other way?

Yea, I am not a fan of the current soft polling implementation either. I would like to change it so that it will only create an artificial event if the device hasn't already sent an event within the last interval.

The concept that the device sends an event, and the soft poll interval happens to hit a few seconds later and creates another one doesn't seem very useful.

I'm not sure what "soft poll" initially meant, but I took it as meaning "polls the state of the DeviceWrapper attribute, but does not poll the physical device or ask the DeviceWrapper to poll it".

Putting aside the ST origins of this app for a moment: what we have are events and state snapshots. Again, the distinction matters because, among other reasons, one can reconstruct the state of a device by playing back the events (e.g. to figure out what happened/debug something) but in practice one can't guarantee reconstruction of the events with state polling.

Sampling theory tells us that one would have to poll at at least twice the frequency of change, i.e. at intervals no longer than half of the minimum time span between two device events for a given device/attribute. For some devices/use cases, that's simply impractical.

Agreed, and that's one example of why I believe it's necessary to distinguish event and state streams when reporting to InfluxDB, or at least offer the option to do so. Users would then have full flexibility to mix them (or not) at the app level or at the InfluxDB level.

I feel like this may be one of those cases of the wrong tool for the job, though. I have a syslog database loaded in InfluxDB from Node-Red using the websocket for the logs output. I have never wanted to use the InfluxDB Logger data to do the troubleshooting you describe here; that is what I use the syslog dashboard for. It allows me to filter and get the picture very quickly, like what you describe above and better. I can filter on both the devices and the apps involved to see what did what and how the devices were impacted.

I do see the concern about a poll also having an actual report come in right after or before it. That is somewhat of a waste. It is also a waste that all polling is one massive bucket. That is why I was experimenting with child apps to allow different soft polling intervals for different devices, or even simply decoupling soft polling from which devices are monitored.

FWIW, I was originally thinking of getting rid of soft poll support altogether, but @mavrrick58 talked me out of it. He is using it to work around a limitation of InfluxDB: InfluxDB has no ability to fill with the value from before the queried time range, so if there are no data points for a series that fall within the actual time range of the query, an empty set is returned. Using Grafana, this results in "No data". This core problem is something that should be addressed in InfluxDB itself, but the request has been outstanding for 6+ years.

Setting that issue aside, I can easily get my head around a failsafe publish, especially if it originates from the device itself. I use that myself to protect against any failures in the sensor->Hubitat->InfluxDB chain.

If the behavior is changed so that an artificial event is created only when a device is stale, i.e. when the device has not already sent an event within the interval, the massive-bucket-of-updates issue would be greatly reduced or eliminated.

I meant troubleshooting actual automations, or what they are used for, as in "why is this light on" or "how much water is in my well" type of stuff. That's what I'm using this for, not to troubleshoot the hub or my code. Logs are yet another subject, and I would agree I would not use this for that (is there a Hubitat-ELK integration? :clown_face: )

This is exactly what I mean. I can select the devices and the automations and such and see the complete flow of what happens. This also means I can see the failure.

Almost all of my use of it has been for this kind of thing: basically to tell me why something did or did not happen. Debugging code, to me, seems more of an interactive, live thing.

This is one of those items that is perfect for visualizing with graphs, though.

@dennypage I like the idea of not doing the poll if an attribute update wasn't received in a given time, but I am just overly cautious about anything that potentially adds overhead.

To accomplish this, the app would either need to internally track each time an attribute update is received, or at poll time pull all events for a given device and check whether any relate to the given attribute. I wonder what kind of overhead that would cause.

You can only select this stuff from your syslog data, I assume, if you are actually logging the events to the "Logs" of the hub? Or is it capturing the event logs? I do not have logging enabled for all my devices.

A simpler way: set a per-subscription flag when an event occurs. At every soft poll interval, loop through all the subscriptions... if the flag is not set for a subscription, create a fake event... clear the flag regardless.
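A rough sketch of that flag-per-subscription idea, written in plain Java rather than the app's Groovy. All names here (eventSeen, softPoll, etc.) are illustrative assumptions, not the app's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SoftPollFlags {
    // Hypothetical map: subscription key -> "saw a real event this interval"
    static final Map<String, Boolean> eventSeen = new ConcurrentHashMap<>();

    // Stand-in for publishing an artificial event; collects keys for inspection
    static final List<String> published = new ArrayList<>();

    // Register a subscription with its flag initially cleared
    static void subscribe(String key) {
        eventSeen.put(key, Boolean.FALSE);
    }

    // Event handler hook: O(1) per event, no event-history queries needed
    static void onDeviceEvent(String key) {
        eventSeen.put(key, Boolean.TRUE);
    }

    // Run once per soft poll interval: fake-publish only the quiet
    // subscriptions, then clear every flag for the next interval
    static void softPoll() {
        for (String key : eventSeen.keySet()) {
            if (!eventSeen.get(key)) {
                published.add(key); // device was quiet for the whole interval
            }
            eventSeen.put(key, Boolean.FALSE);
        }
    }
}
```

The overhead is one flag write per real event and one pass over the subscriptions per interval, which avoids pulling a device's event history at poll time.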

Yes, the data has to be written to the logs.

A lot of information is written to the logs about events regardless of logging setting.

Not really... if I go to a device and turn off all the logging options, typically NOTHING will be logged to the logs that I can see from the hub UI. Events would still be posted to the device's events, but not printed into the actual logs. I assume you know this though, since you have coded drivers.