InfluxDB Logger Support

Right. That's an issue which can normally be avoided (e.g. by avoiding get-and-set patterns outside of the concurrent data structure). I found one problematic pattern in writeQueuedDataToInfluxDb():

        writeData = myLoggerQueue.toArray().join('\n')
        myLoggerQueue.clear()

Looping over the queue and extracting elements with calls to poll() would not require a mutex, but might impact performance.
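For illustration, here is a minimal sketch of that drain-with-poll() approach in plain Java (the same JVM collections a Hubitat Groovy app uses). The queue and method names are hypothetical stand-ins, not the app's actual fields:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainExample {
    // Hypothetical stand-in for the app's logger queue
    static final ConcurrentLinkedQueue<String> loggerQueue = new ConcurrentLinkedQueue<>();

    // Drain the queue one element at a time with poll(). Unlike
    // toArray() followed by clear(), this cannot drop entries that are
    // enqueued between the two calls: poll() atomically removes exactly
    // the elements it returns, and anything added concurrently simply
    // stays queued for the next write cycle.
    static String drainToPayload() {
        StringBuilder sb = new StringBuilder();
        String record;
        while ((record = loggerQueue.poll()) != null) {
            if (sb.length() > 0) sb.append('\n');
            sb.append(record);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        loggerQueue.add("temperature value=21.5");
        loggerQueue.add("humidity value=40");
        System.out.println(drainToPayload());
    }
}
```

The per-element cost of poll() is the performance concern mentioned above, but there is no window in which concurrently enqueued records can be silently cleared.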

@mavrrick58, in answer to your question: the average time represents both, plus anything else Hubitat views as part of the application. I don't know everything it encompasses.

It's Hubitat's view of how much time it spends when it invokes any function in the app. Hubitat's view across the board.

Adding explicit elapsed-time checks to the functions indicates that, with either approach, softPoll() averages 275-280ms and writeQueuedDataToInfluxDb() averages 9-10ms.

The average of these two functions would be 142-145ms. For the overall average to come down to Hubitat's reported 110-120ms, there must be some other, faster function(s) that Hubitat counts as part of the mix.

Thanks for explaining.

Honestly, after taking a look this morning, I think some of my earlier concerns were a knee-jerk reaction. I think the majority of what made the big improvements during our previous upgrade to add 2.x support is still there. After reviewing my logs this morning, the performance seems to be similar even using the State value. I am waiting to see what happens with the App Stats to see if they get back to a similar CPU Time, but I am pretty confident they will. 8% over very little is next to nothing.

FWIW, I think most of the performance improvement came from removing the second copy of the pending array, as well as the SmartThings leftovers.

I've pushed version 2.2.0.

Changes include:

  • Support for multiple instances of the application. Previously only one instance of the application was supported. Additional instances could be installed, but would not function correctly.
  • An option to control whether updates received from a device are published to InfluxDB even if the actual value hasn't changed. Previously, updates were only published if the value changed.
  • A dramatic improvement to event timestamping. Previously, timestamps were not assigned to the time of the actual event, but were assigned to the time the event was published to InfluxDB. This offset could be 1-15 minutes based on the "How often to write to db" configuration parameter.
  • A fix to properly decode valve events.

Many thanks to @hubitrep who drove most of these changes.

Those are some great changes @dennypage and @hubitrep, and some of the other reasons (that I forgot were issues until now) why I did not like the way this app saved the data.

From the sounds of it, with soft polling disabled this app would now save events to InfluxDB the same way I am doing it with NR, basically mirroring the device's event log into InfluxDB. I may have to fire this thing up and give it a try again.

What would be the best way to get app/driver stats hooked into this? e.g. where having a time series would help track down performance issues with specific apps/drivers.

What kind of app data are you thinking of? I don't know how you would even start that without either developers writing their stuff to integrate with InfluxDB Logger, or Hubitat building in something low level to dump the data.

If it is just syslog data that is possible though.

I’m thinking about the data you can find in the app or device stats tabs of the hub logs UI. If the same info were available through device events (a la Hub Info Driver by @thebearmay), it would be super helpful.

Or maybe it’s available some other way?

Yea, I am not a fan of the current soft polling implementation either. I would like to change it so that it will only create an artificial event if the device hasn't already sent an event within the last interval.

The concept that the device sends an event, and the soft poll interval happens to hit a few seconds later and creates another one doesn't seem very useful.

I'm not sure what "soft poll" initially meant, but I took it as meaning "polls the state of the DeviceWrapper attribute, but does not poll the physical device or ask the DeviceWrapper to poll it".

Putting aside the ST origins of this app for a moment: what we have are events and state snapshots. Again, the distinction matters because, among other reasons, one can reconstruct the state of a device by playing back the events (e.g. to figure out what happened/debug something) but in practice one can't guarantee reconstruction of the events with state polling.

Sampling theory tells us that one would have to poll at at least twice the frequency of change, i.e. at intervals no longer than half of the minimum time span between two device events for a given device/attribute. For some devices/use cases, that's simply impractical.

Agreed, and that's one example of why I believe it's necessary to distinguish event and state streams when reporting to InfluxDB, or at least offer the option to do so. Users would then have full flexibility to mix them (or not) at the app level or at the InfluxDB level.

I feel like this may be one of those cases of the wrong tool for the job, though. I have a syslog database loaded in InfluxDB from Node-Red using the websocket for the logs output. I have never wanted to use the InfluxDB Logger data to do the troubleshooting you describe here; that is what I use the syslog dashboard for. It allows me to filter and get the picture very quickly, like what you describe above and better. I can filter on both the devices and the apps involved to see what did what and how the devices were impacted.

I do see the concern about a poll also having an actual report come in right after or before it. That is somewhat of a waste. It is also a waste that all polling is one massive bucket. That is why I was experimenting with child apps to allow different soft polling intervals for different devices, or even simply decoupling soft polling from which devices are monitored.

FWIW, I was originally thinking of getting rid of soft poll support altogether, but @mavrrick58 talked me out of it. He is using it to work around a limitation of InfluxDB: InfluxDB has no ability to fill with the value from before the queried time range, so if there are no data points for a series that fall within the actual time range of the query, an empty set is returned. Using Grafana, this results in "No data". This core problem is something that should be addressed in InfluxDB itself, but the request has been outstanding for 6+ years.

Setting that issue aside, I can easily get my head around a failsafe publish, especially if it originates from the device itself. I use that myself to protect against any failures in the sensor->Hubitat->InfluxDB chain.

If the behavior is changed so that an artificial event is created only when a device is stale, i.e. when the device has not already sent an event within the interval, the massive-bucket-of-updates issue would be greatly reduced or eliminated.

I meant troubleshooting actual automations, or what they are used for, as in "why is this light on" or "how much water is in my well" type of stuff. That's what I'm using this for, not to troubleshoot the hub or my code. Logs are yet another subject, and I would agree I would not use this for that (is there a Hubitat-ELK integration? :clown_face: )

This is exactly what I mean. I can select the devices and the automations and such and see the complete flow of what happens. This also means I can see the failure.

Almost all of my use of it has been for this kind of thing: basically to tell me why something did or did not happen. Debugging code, to me, seems more of an interactive, live thing.

This is one of those items that is perfect for visualizing with graphs, though.

@dennypage I like the idea of not doing the poll if an attribute update wasn't received in a given time, but I am just overly cautious about anything that potentially adds overhead.

To accomplish this, the app would either need to internally track each time an attribute update is received, or at poll time pull all events for a given device and check whether any relate to the given attribute. I wonder what kind of overhead that would cause.

You can only select this stuff from your syslog data, I assume, if you are actually logging the events to the "Logs" of the hub? Or is it capturing the event logs? I do not have logging enabled for all my devices.

A simpler way: set a per-subscription flag when an event occurs. At every soft poll interval, loop through all the subscriptions... if the flag is not set for a subscription, create a fake event... clear the flag regardless.
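A rough sketch of that flag-per-subscription idea, written in plain Java rather than the app's Groovy. All names here (eventSeen, softPoll, etc.) are illustrative assumptions, not the app's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SoftPollFlags {
    // Hypothetical map: subscription key -> "saw a real event this interval"
    static final Map<String, Boolean> eventSeen = new ConcurrentHashMap<>();

    // Stand-in for publishing an artificial event; collects keys for inspection
    static final List<String> published = new ArrayList<>();

    // Register a subscription with its flag initially cleared
    static void subscribe(String key) {
        eventSeen.put(key, Boolean.FALSE);
    }

    // Event handler hook: O(1) per event, no event-history queries needed
    static void onDeviceEvent(String key) {
        eventSeen.put(key, Boolean.TRUE);
    }

    // Run once per soft poll interval: fake-publish only the quiet
    // subscriptions, then clear every flag for the next interval
    static void softPoll() {
        for (String key : eventSeen.keySet()) {
            if (!eventSeen.get(key)) {
                published.add(key); // device was quiet for the whole interval
            }
            eventSeen.put(key, Boolean.FALSE);
        }
    }
}
```

The overhead is one flag write per real event and one pass over the subscriptions per interval, which avoids pulling a device's event history at poll time.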

Yes, the data has to be written to the logs.

A lot of information is written to the logs about events regardless of logging setting.

Not really... if I go to a device and turn off all the logging options, typically NOTHING will be logged to the logs that I can see from the hub UI. Events would still be posted to the device's events, but not printed into the actual logs. I assume you know this though, since you have coded drivers.