Alert - Your database is growing

kkossev · May 14, 2021, 6:17pm

A follow-up a few days later...

My system is stable now, no noticeable delays, HUB Watchdog values are not so bad:

BUT:
My C7 hub database size grows very quickly, from 13MB after the reboot up to 90-100MB in less than an hour. Then it stabilizes and varies between 80 and 90MB:

( dB size is the cyan colored line)

So something strange happens, filling up the database extremely quickly in my opinion:
7:19 - 28 MB
7:24 - 37 MB = 1.8 MB/min = 30 KB/sec dB size increase
7:29 - 45 MB = 1.6 MB/min = 26 KB/sec dB size increase
7:34 - 52 MB = 1.4 MB/min = 23 KB/sec dB size increase
7:39 - 56 MB = 0.8 MB/min = 13 KB/sec dB size increase
7:44 - 63 MB = 1.4 MB/min = 23 KB/sec dB size increase
7:49 - 66 MB = 0.6 MB/min = 10 KB/sec dB size increase
7:54 - 89 MB = 4.6 MB/min = 76 KB/sec dB size increase

Database growth of 10..70 kilobytes per second is very unlikely to be due to event or state recording... or logs. It must be something else that I don't understand.
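The rates in the list above can be re-derived with a short script. This is only a sketch of the arithmetic: the readings are the ones from the post, the helper name `growth_rates` is made up for illustration, and it assumes 1 MB = 1000 KB. It rounds the KB/sec figure, while the post appears to truncate, so a value such as 26.7 prints as 27 rather than 26.

```python
from datetime import datetime

# Readings copied from the post above: (time, database size in MB).
samples = [
    ("7:19", 28), ("7:24", 37), ("7:29", 45), ("7:34", 52),
    ("7:39", 56), ("7:44", 63), ("7:49", 66), ("7:54", 89),
]

def growth_rates(samples):
    """Return (time, MB/min, KB/sec) for each consecutive pair of readings."""
    rates = []
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        elapsed = datetime.strptime(t1, "%H:%M") - datetime.strptime(t0, "%H:%M")
        minutes = elapsed.total_seconds() / 60
        mb_per_min = (s1 - s0) / minutes
        kb_per_sec = mb_per_min * 1000 / 60  # assumption: 1 MB = 1000 KB
        rates.append((t1, mb_per_min, kb_per_sec))
    return rates

for t, mb_min, kb_s in growth_rates(samples):
    print(f"{t}  {mb_min:.1f} MB/min  {kb_s:.0f} KB/sec")
```

Over the whole 35-minute window the database went from 28 MB to 89 MB, which averages out to roughly 1.7 MB/min.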

I have spent too much time in the past days trying to narrow down the possible reasons. I went down the wrong path disabling 'chatty' devices such as my NEO Coolcam power plugs, disabling/deleting applications that were at the top of the App stats State size (such as hubigraphs, weather tiles, etc.), disabling community/custom integrations, etc. All this was a waste of time, as it is obvious that no driver or app can generate nearly 2 megabytes of event/state data per minute!

I did a Soft Reset. I restored different database backups from the last week, and I ran a lot of ..IP/hub/cleanupDatabase procedures. Currently, I have Event History Size = 1 and State History Size = 5. I am sure that even if I increase the sizes to 50 or 100, the database will not grow by much compared to the 90-100MB that I have now. The database seems to be filling up with something else...

The only other thing that I am going to try is to exclude two Z-wave devices that are not powered on, but exist as nodes in the network.

As I said, I have no performance issues in the last 2 days. But what remains is the doubt that something is not OK.

nh.schottfam · May 14, 2021, 7:29pm

I see similar numbers to you.

You likely want to set the minimum states/events to 11, as anything less than that causes the db to try to clean up each time (wastes time).

I do agree and see in my observations the db growing quickly. So something is changing it a lot, but I don't see it in any of my devices....I'm guessing there is a problem on something internal pounding the db that needs to be fixed.

For me I observed this with the 2.2.6 releases.

As an FYI, I have hub mesh turned off (or at least I do in settings).

gopher.ny · May 14, 2021, 8:39pm

Don't get hung up on database size too much. It gets compacted/defragmented on alternate shutdowns, grows quickly after a boot, and stabilizes at some point. The database engine handles it, and while it's not entirely a black box, the number of buttons we can touch is limited.

I'll probably increase the alert threshold, though.

djw1191 · May 17, 2021, 1:33am

Well this is wild, I guess I'm jumping on this train. My database typically hovers around 75mb, I log in today and see 1.6GB... About 20 minutes ago, it was 1.55GB (1550mb). So it's grown 57mb in about 20 minutes. In the 5 or 10 minutes I've been typing this, it's grown another 10mb.

The app stats and device stats pages don't show anything crazy and I don't actually have any devices that generate a lot of unique frequently updating attributes.

The only change I've had in the past couple of days is the addition of a Zwave (S0) lock.

@gopher.ny I know you said not to be hung up on it (and I typically do subscribe to the "if nothing is broken, don't worry about it" approach), but I'm curious if there is any sort of backend DB logging/activity around zwave devices that might not show up in the device/app stats. My zwave mesh was having some issues after the inclusion of this device, so I'm wondering if maybe it's somehow pounding away at the DB and causing it to grow quickly.

Ken_Fraleigh · May 17, 2021, 9:27am

Glad I saw this. I had mine set to 1 and things became quicker after setting them to 11. I’m wondering if this was causing my Zigbee lockups. I’ve been using Apple’s Adaptive Lighting, which sends quite a few Zigbee commands all at once. Having the hub constantly cleaning this up was probably less than ideal.

gopher.ny · May 17, 2021, 10:32am

The "clean up immediately" logic is going away in the 2.2.8. It's caused nothing but trouble.

Edit: to be clear, it gets triggered only for devices' max state/event counts of 10 or less. Setting these values to 11 or more means hub will do regular hourly cleanup. Default values do not trigger this logic.
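The rule described in that edit can be sketched as a small check. This is not the hub's code: the function name `cleanup_mode` and the constant are hypothetical, and it assumes that either limit (event or state) being 10 or less is enough to trigger the immediate cleanup, which the post does not spell out.

```python
# Per the post: max state/event counts of 10 or less trigger immediate cleanup.
IMMEDIATE_CLEANUP_MAX = 10

def cleanup_mode(event_history_size, state_history_size):
    """Hypothetical helper: classify a device's history limits.

    Assumption: one limit at or below the threshold is enough to
    trigger the "clean up immediately" logic.
    """
    if min(event_history_size, state_history_size) <= IMMEDIATE_CLEANUP_MAX:
        return "immediate"
    return "hourly"

# Settings mentioned earlier in the thread.
print(cleanup_mode(1, 5))    # event = 1, state = 5
print(cleanup_mode(11, 11))  # the suggested minimum of 11 for both
```

Under that assumption, the 1/5 settings from the first post fall into the immediate-cleanup case, while 11/11 leaves the device on the regular hourly cleanup.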

Ken_Fraleigh · May 17, 2021, 10:50am

Did that start with 2.2.6? That’s when I first saw a performance hit.

gopher.ny · May 17, 2021, 10:53am

Yes. The idea was to use it on devices with huge states/event data, like cameras with images.

There is no direct connection, but I'm not ruling it out entirely.

gopher.ny · May 17, 2021, 11:14am

And it's about 2G right now. This is not normal. Give it a reboot and if the database is still huge afterwards, please do a soft reset.

djw1191 · May 17, 2021, 1:35pm

Rebooting now, but interestingly enough, at 9:28AM EST it started to report a negative number for size.. -2141

gopher.ny · May 17, 2021, 1:45pm

Integer overflow for 2G+... I got to account for that in the future versions!
Yeah, this looks like a soft reset.
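For anyone wondering how a size can turn negative: a minimal sketch of signed 32-bit wraparound, assuming the size is held as a byte count in a 32-bit signed integer and then converted to MB. How the hub actually computes the figure is not stated in this thread, and the byte count below is an illustrative value chosen only because it reproduces the -2141 reading.

```python
def to_int32(value):
    """Wrap an integer into the signed 32-bit range (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

# Hypothetical database size just over 2 GiB (2**31 bytes).
size_bytes = 2_153_967_296

wrapped = to_int32(size_bytes)          # what a 32-bit signed int would hold
reported_mb = int(wrapped / 1_000_000)  # assumption: MB = bytes / 1,000,000

print(wrapped, reported_mb)
```

Anything up to 2,147,483,647 bytes fits and reports correctly; one byte more wraps to a large negative number. Holding the size in a 64-bit integer avoids the problem.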

djw1191 · May 17, 2021, 2:05pm

Yeah post reboot, still seeing the overflow. Will give it a soft reset. For reference, things seem to be working, but I imagine I'll run out of disk space soon enough, given the growth.

gopher.ny · May 17, 2021, 2:07pm

Yeah, and then main process will crash. It's a recoverable situation, since diagnostic tool will still be functional, but a proactive approach is best.

djw1191 · May 17, 2021, 2:17pm

I've reset and restored. Down to a nice 9mb.. I'll keep an eye on it, to see if the condition returns. I've started graphing this data, so much easier to track now.

AverageJoe90 · May 17, 2021, 2:17pm

Oh! I have followed @gopher.ny advice as well

mark.cockcroft · May 20, 2021, 7:42am

Could we have these permanently available in the UI in future releases?
Maybe with a note.

I have been setting these to 5 and 10 thinking it's better, and had noticed the hub CPU5min going up, but I didn't know that changing the value would trigger instant cleanups.

gopher.ny · May 20, 2021, 11:36am

Yes, it seems useful enough. Alerts page already has a UI for both, but it's only visible when database is large.

mark.cockcroft · May 20, 2021, 11:40am

Is it still true that any setting under 10 triggers a cleanup every time?
Edit: I just want to know which settings give hourly-only cleaning.

Ken_Fraleigh · May 20, 2021, 1:24pm

Yes. This started with 2.2.6, which explains why I started seeing performance issues with the first 2.2.6 release. I was having cpu usage spikes of near 100% with mine both set to 1. Since changing both to 11, no problems.
@gopher.ny I haven't had any more cpu usage spikes or zigbee lockups since changing these to 11. It seems likely that continuous CT changes to 26 of my zigbee bulbs via Apple's Adaptive Lighting were hammering the processor due to the continuous cleanups that ensued. I haven't changed my use of Adaptive Lighting, or anything else except setting the event and state histories to 11 from 1. I wish this had been elaborated on in the release notes.

brad5 · May 20, 2021, 10:43pm

I’m seeing really similar stuff. My database seems to grow by leaps and bounds every time I check it. I have used the commands to trim events to 11 and prune the database. Here’s what I’m seeing...

I will also try the backup / soft reset / restore.

Is this related to the new 2.2.7 release and should I consider downgrading?