My Zigbee Hell - The Remix

Hey Bobby, thanks for responding. I don't think so. It happened last week while my wife and I were watching TV, and neither of us noticed anything flickering. I can't be 100% certain, but I think I had rebooted the C8P a few minutes earlier, though those reboots don't typically cause any issues. Funny you mention a power outage, because I had one at 5:20 AM today and all the lights came on again. My C8P is on a pipower battery backup, which I verified can output a sustained 4 amps. Also, if I pull the air gap on my light switches and then repower, the lights do come on, but within 30 seconds they turn off by themselves. This morning they did not. Maybe the flicker was too fast (1-2 secs)?

Not as an actual "switch". Instead, the feature is implemented as a fallback, used if all other pre-configured keys fail to join properly.

Does this work only when the hub is in pairing mode, or does it also work for device rejoin requests? The Secure Mode switch appears to be about the latter (fixed my post above to clarify).

I had issues w/Sengled bulbs going back to my C7, that's in fact where the issues started for me, to the extent that I finally removed them all from my hub (while my other Zigbee devices were running fine). Others were reporting issues back then as well (though some folks had no problems w/them at all). So I would not tie Sengled issues to any particular hub. :slight_smile:

Channel 4 is generally not a great channel to be on... I'd go back to 1 (or 6).

Both

Switched back to 1.

Morning update. My "old" lock that I replaced has been sitting on the table with the hub and has not lost connection since I paired it yesterday.

My new lock (that likely didn't need to be replaced) continues its bad behavior. I can get it paired by moving the hub close to it, or I can choose a specific repeater and get it to pair with the hub in its normal location. But a few hours later I might be able to lock/unlock while the status won't change to locked/unlocked, and within a few more hours I cannot lock or unlock it at all.

Guys, I need some help here please. I tried to take the garage light (closest repeater) out of the picture by opening the air gap, but it seems that doesn't turn off the radio. I suppose I can kill power to the garage and test, but that takes a bunch of other things down while power is off.

It would be nice to be able to identify which device is breaking my network without having to rip them out of the walls one by one.

Why does this thread keep getting marked solved? It is NOT solved.

My guess would be someone hit the little “check” button accidentally when they meant to select the “heart” button to like a post.

Interesting. I am surprised (and curious) given that would seem to make Zigbee 3.0 security (TCLK exchanges) pointless, but that's irrelevant to this topic. Thanks for confirming.

Can you try hooking up a good-quality, known-good Zigbee repeater outlet to an extension cord and setting it up as close as you can to the garage door lock? Pair through that repeater using the Zigbee Map advanced pairing tool and see if the lock stays paired/responsive. That should give you a clue whether the issue is environmental/RF related.

In your place I would also experiment with power level 4 on the hub.

Something cheap to try is to capture the zigbee logs for a few hours and dump them in an AI chat with a short description of your issue and suspicion of a defective repeater device. You can use the Simple Websocket Chrome extension or a shell script (ChatGPT can build one for you).

I started here. Then...

This raised questions for me. I have two ZB plugs (one is Centralite, the other is Innr) and they both play repeater to multiple devices each. I always thought the in-wall switches/outlets were better suited to play repeater. I also had a spare Innr, and I tested using it.

I was able to join using advanced join and the lock worked for a short while. But shortly afterward the lock switched from the plug hanging around it to the garage light switch - and it started flaking again.

The common denominator kept appearing to be the garage light switch - so I took it out. I was then able to rejoin the lock using the Innr plug and it remained stable. I found an older GE/Jasco ZB dimmer I had sitting around and installed and joined it. Seems to be fine.

I then unplugged the Innr and waited to see what the lock did - interestingly it connected through the master toilet fan switch - maybe 15 feet away through 2 walls - but it has remained working.

The garage light has multiple neighbors - but no end devices connected to it - maybe a good thing.

Now I just need to add the new lock to apps - yes, I am aware of device swap and normally swap out to a virtual device (and did this a few times in this nightmare), but I had observed the lock code manager was doing 'strange' things when I was factory resetting the first lock - so I decided to go virgin and add the new one in manually.

So it appears that light switch was the cause of all the problems (losing connections to other devices, the lock, and not being able to join devices). And I learned about the advanced Zigbee app/join, which will be very helpful going forward.

Thanks to everyone with all the input!

Get the logs - got it. Then copy/paste them into the AI of choice and ask it what?

You really lost me here - can you provide specific instruction on this?

I just tried this a bit ago, and I asked Gemini to "Analyze these Zigbee logs. I am not having any particular issues, but would like to know the health of my Zigbee mesh."

I should have pasted the analysis, but didn't think of it before I closed the browser tab. It was quite enlightening what Gemini produced. In my case it said everything was healthy, and showed the rationale. It gave stats about signal strength, routing, and a bunch more.

You can also tell the AI that you are having trouble, maybe don't be too specific and see if it can see what is going on. "I have been having trouble with my Zigbee mesh, could you please look at these logs and see if you can pinpoint any obvious issues".

The HE hub provides a means to view/retrieve the complete stream of incoming Zigbee messages (as seen by the hub). You can see these directly on the hub admin interface at http://<hub ip>/hub/zigbeeLogs. It's possible to leave that browser tab open (and visible) and let it soak up logs for a while and copy/paste that into ChatGPT/Claude/Gemini. Ask a simple question as @neonturbo suggested above, with a summary description of the issue you are observing.

However what is rendered on that page is not as detailed as what the hub actually outputs on the websocket. To capture the full details, you need special tools, as a websocket connection is different from a web page. It is a little bit more technically involved; probably the easiest way is to use the developer console (F12) on the http://<hub ip>/hub/zigbeeLogs page, reload, and let it soak for a while.

Here is an example using Chrome:

The red arrows show where to navigate to. The green arrow points to the "export" button which lets you save the full capture to a ".har" file. You can upload that file directly to Claude/ChatGPT/Gemini.

Each log line has details about the Zigbee message received, including the full payload. AI tools are good at parsing these logs and figuring out what devices are doing, and can even identify problems (e.g. excessive OTA requests from ThirdReality devices).
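
If you would rather pull the Zigbee messages out of the ".har" file yourself before handing them to an AI tool, here is a minimal Python sketch. It assumes the export came from Chrome, which stores WebSocket frames under a non-standard `_webSocketMessages` key on the entry, and that each received frame is a JSON text message. The function name and the sample field names (`name`, `lastHopLqi`) are only illustrative, so check them against your own capture.

```python
import json


def extract_ws_messages(har: dict) -> list:
    """Return received WebSocket frames from a HAR export, parsed as JSON objects."""
    messages = []
    for entry in har.get("log", {}).get("entries", []):
        # Assumption: Chrome-style HAR, frames live under "_webSocketMessages".
        for frame in entry.get("_webSocketMessages", []):
            if frame.get("type") != "receive":
                continue  # ignore frames sent by the browser
            try:
                payload = json.loads(frame.get("data", ""))
            except (TypeError, ValueError):
                continue  # not JSON text, skip it
            if isinstance(payload, dict):
                messages.append(payload)
    return messages


# Tiny made-up HAR fragment for illustration; a real export is much larger.
sample_har = {
    "log": {
        "entries": [
            {
                "_webSocketMessages": [
                    {"type": "receive", "data": '{"name": "Moes ES1", "lastHopLqi": 255}'},
                    {"type": "send", "data": "ping"},
                    {"type": "receive", "data": "not json"},
                ]
            },
            {"request": {"method": "GET"}},  # ordinary HTTP entry, no frames
        ]
    }
}
messages = extract_ws_messages(sample_har)
```

For a real capture, load the file first with `json.load(open("capture.har", encoding="utf-8"))` (the file name is a placeholder) and pass the result to `extract_ws_messages`.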

I uploaded the above capture to Gemini with the simple prompt "tell me what is going on with my Zigbee network". This is what it came up with:

Gemini analysis of zigbee websocket capture

Based on the live log WebSocket capture (zigbeeLogsocket.har), your Zigbee network is active and running, but it reveals a very clear reason why you might be noticing latency, lag, or poor automation responsiveness.

Here is the analysis of what is happening under the hood:

1. The Core Issue: Severe Log Spamming (Traffic Congestion)

The single biggest takeaway from your log is that your Moes ES1 presence sensor is flooding the network with rapid, consecutive messages.

Specifically, look at this chunk of data:

21:40:02.103: Moes ES1 sends a time cluster (000A) message.

21:40:02.108: Moes ES1 sends the exact same message again (5ms later).

21:40:02.111: Moes ES1 sends it again (3ms later).

21:40:02.114: Moes ES1 sends it again (3ms later).

21:40:02.117: Moes ES1 sends it again (3ms later).

It is firing off 5 to 6 redundant updates in a span of 14 milliseconds. Outside of these bursts, it is continuously reporting illuminance, distance, and custom engineering attributes every 2 seconds. This level of hyper-chattering can saturate a Zigbee channel, causing a massive backlog on your hub (Hubitat, judging by the path structure) and leading to delayed or dropped events from your other devices.

2. Device Overview & Health Check

  • Moes ES1 (Device ID: 50):
    • Signal Strength: Perfect (LQI: 255, RSSI: -35). This indicates it has an excellent physical connection or is positioned right next to a router/hub.
    • Behavior: Highly problematic due to the repeated data bursts noted above.
  • TR Presence Multi-Sensor R3 (Device ID: 213):
    • Signal Strength: Perfect (LQI: 255, RSSI: -35).
    • Behavior: Healthy. It reports fine-grained radar/presence metrics steadily and updates binary motion attributes (0006) perfectly fine without hammering the network.
  • Aqara FP300 (Device ID: 120):
    • Signal Strength: Good (LQI: 192 to 200, RSSI: -52).
    • Behavior: Normal. It checks in appropriately every minute or so with routine telemetry.

Recommended Next Steps to Fix the Network

To stop the network chatter and clear up bandwidth for your other devices, you should adjust the settings on the Moes ES1 sensor:

Increase the Reporting Interval / Minimum Change: Go into your hub's device details page for the Moes ES1. Look for driver configuration variables related to parameters like "Illuminance reporting threshold" or "Minimum reporting interval". Increasing these limits prevents the sensor from trying to broadcast minuscule lux changes instantly.

Switch/Update the Driver: The rapid-fire bursts on the Time Cluster (000A) point to either a bug in the device's firmware or an unoptimized custom device driver handling the device profile. Check to see if there is an alternative community driver available, or hit "Configure" in your hub UI to force it to rewrite its reporting table rules to the device.

Power Cycle: If configuration changes don't take, physically power cycle the Moes ES1 to clear its internal hardware buffer.
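
The burst pattern Gemini describes above (the same device sending again a few milliseconds later) is easy to check for yourself. Here is a rough Python sketch, assuming the log has already been reduced to (time, device) pairs with times in HH:MM:SS.mmm form as in the analysis. The 10 ms gap and the minimum of 3 messages are arbitrary thresholds chosen for illustration, not anything the hub defines, and `find_bursts` is a made-up name.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_bursts(events, max_gap_ms=10, min_count=3):
    """Return (device, first_time, message_count) for each rapid-fire run.

    A run is a sequence of messages from one device where every message
    arrives within max_gap_ms of the previous one.
    """
    max_gap = timedelta(milliseconds=max_gap_ms)
    by_device = defaultdict(list)
    for time_str, device in events:
        by_device[device].append(datetime.strptime(time_str, "%H:%M:%S.%f"))

    bursts = []
    for device, times in by_device.items():
        times.sort()
        run_start, run_len = times[0], 1
        for prev, cur in zip(times, times[1:]):
            if cur - prev <= max_gap:
                run_len += 1
                continue
            if run_len >= min_count:
                bursts.append((device, run_start.strftime("%H:%M:%S.%f")[:-3], run_len))
            run_start, run_len = cur, 1
        if run_len >= min_count:  # close the final run
            bursts.append((device, run_start.strftime("%H:%M:%S.%f")[:-3], run_len))
    return bursts


# Timestamps taken from the analysis above; the last three rows are invented.
events = [
    ("21:40:02.103", "Moes ES1"),
    ("21:40:02.108", "Moes ES1"),
    ("21:40:02.111", "Moes ES1"),
    ("21:40:02.114", "Moes ES1"),
    ("21:40:02.117", "Moes ES1"),
    ("21:40:04.200", "Moes ES1"),
    ("21:40:10.000", "Aqara FP300"),
    ("21:41:10.000", "Aqara FP300"),
]
bursts = find_bursts(events)
```

Grouping on gaps between consecutive messages, rather than on fixed time buckets, keeps a burst from being split in two just because it straddles a bucket boundary.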

Amazing response! I use Edge, and on Mac, but close enough. I have logs capturing now and will upload them to CoPilot in the morning. I can’t wait to see what it finds.

FWIW - I have a few ThirdReality devices too - interested to see if they are doing the OTA thingy.

Well, it appears the only "issue" with my ZB network is the fact some edge devices have weak signal - not surprised and I do have solid mains-powered repeaters positioned to support those devices.

Curious what my network looked like before I replaced that garage light switch!?

Summary

Here’s a clear, evidence-based breakdown of your Zigbee network based on the HAR/log file you uploaded.


:white_check_mark: Executive Summary

  • Overall status: :white_check_mark: Healthy and stable
  • Critical errors: :cross_mark: None detected
  • Devices communicating: :white_check_mark: Continuous activity across all devices
  • Main concern: :warning: Some devices have weak signal (likely mesh coverage gaps)

:magnifying_glass_tilted_left: What your Zigbee network is doing

1) Network health (hub-level)

  • Zigbee radio is online and functioning
  • No hub stress indicators:
    • zigbeeOffline=false
    • hubLoadElevated=false
    • hubLowMemory=false
  • WebSocket logging active → continuous real-time device traffic
  • No failures in requests or system services

:backhand_index_pointing_right: Interpretation:
Your coordinator + hub are working normally. No systemic failure.


2) Device communication (mesh activity)

  • Logs show constant zigbeeRx events from:
    • Motion sensors
    • Door/tilt sensors
    • Locks
    • Plugs
    • Lights
  • Messages are:
    • Regular
    • Sequential (no major gaps)
    • Successfully received

:backhand_index_pointing_right: Interpretation:
Your network is:

  • Active
  • Routing traffic correctly
  • Not dropping devices

3) Signal quality (this is the real story)

Strong devices (good mesh)

  • LQI: 128–255
  • RSSI: -35 to -65 dBm

Examples:

  • Office sensors and plugs → very strong, stable

Weak / borderline devices :warning:

Several devices consistently show:

  • LQI < 100
  • RSSI ≤ -75 dBm

Examples:

  • Outside Shed Door Sensor → LQI ~96–100, RSSI -75 to -76
  • Garage / exterior sensors
  • Some locks and bathroom devices
  • Occasional readings down to:
    • LQI: ~72–84
    • RSSI: ~-80 to -83

:backhand_index_pointing_right: Interpretation:

  • These are edge-of-mesh devices
  • Not failing, but:
    • More prone to latency
    • More retries under load
    • Future reliability risk
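
The bands used in this part of the report can be applied directly to your own readings. A minimal Python sketch, assuming you have (device, LQI, RSSI) tuples from the log: "weak" is LQI < 100 or RSSI ≤ -75 dBm and "strong" is LQI ≥ 128 with RSSI ≥ -65 dBm, as quoted above. The report does not name the range between the two, so the "in between" label is my own assumption, as are the function names.

```python
def classify_link(lqi: int, rssi: int) -> str:
    """Classify one reading using the bands quoted in the report above."""
    if lqi < 100 or rssi <= -75:
        return "weak"
    if lqi >= 128 and rssi >= -65:
        return "strong"
    return "in between"  # assumption: the report leaves this range unnamed


def weak_devices(readings) -> list:
    """Return sorted names of devices with at least one weak reading."""
    return sorted({name for name, lqi, rssi in readings
                   if classify_link(lqi, rssi) == "weak"})


# Illustrative readings; the shed sensor values come from the report,
# the office plug values are made up within the "strong" band.
readings = [
    ("Office Plug", 255, -35),
    ("Outside Shed Door Sensor", 96, -76),
    ("Outside Shed Door Sensor", 100, -75),
]
flagged = weak_devices(readings)
```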

4) Error / failure indicators

  • :white_check_mark: No explicit error codes
  • :white_check_mark: No “device offline” events
  • :white_check_mark: No packet loss spikes
  • :white_check_mark: No join/leave instability

Minor anomalies observed:

  • Occasional repeated payloads
  • Slight LQI/RSSI fluctuation

:backhand_index_pointing_right: These are normal behavior in Zigbee RF, not failures.


5) Traffic & behavior patterns

  • High-traffic devices:

    • Kitchen sensors
    • Garage sensors
    • Zigbee plugs
  • Reporting intervals:

    • ~15–60 seconds for sensors
    • Normal for Zigbee networks

:backhand_index_pointing_right: No congestion or saturation detected.


:warning: What’s actually “going on”

:white_check_mark: NOT happening

  • No outage
  • No routing collapse
  • No coordinator issues
  • No failing devices (yet)

:warning: What IS happening (important)

Your network shows classic mesh edge symptoms:

  • Outdoor + garage devices = weaker signal paths
  • Some devices operating at marginal link quality
  • Likely causes:
    • Distance from hub
    • Walls/metal/garage construction
    • Limited repeater coverage in those areas

:hammer_and_wrench: Actionable next steps (focused + practical)

1) Strengthen the mesh (highest value fix)

Add repeaters between hub and weak devices:

  • Plug-in Zigbee outlets (best option)
  • Dedicated repeaters (IKEA, Sonoff, etc.)

Focus on:

  • Garage
  • Exterior doors
  • Shed area

2) Check channel interference

If you haven’t already:

  • Ensure Zigbee channel doesn’t overlap Wi-Fi:
    • Good Zigbee channels: 15, 20, 25
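
As a rough sanity check on that channel advice, the overlap can be worked out from the standard 2.4 GHz channel plans: Zigbee channel n (11 to 26) is centered at 2405 + 5 × (n − 11) MHz, and Wi-Fi channel m (1 to 13) at 2407 + 5 × m MHz. The sketch below is Python and assumes a 22 MHz wide Wi-Fi channel and a 2 MHz wide Zigbee channel. Real radios leak outside those nominal widths, so treat "no overlap" as "less likely to interfere", not a guarantee. The function names are made up.

```python
WIFI_WIDTH_MHZ = 22    # assumption: classic 22 MHz Wi-Fi channel width
ZIGBEE_WIDTH_MHZ = 2   # assumption: nominal 2 MHz Zigbee channel width


def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Wi-Fi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1-13 only")
    return 2407 + 5 * channel


def overlapping_wifi_channels(zigbee_channel: int, wifi_channels=(1, 6, 11)) -> list:
    """Return the Wi-Fi channels whose nominal band overlaps the Zigbee channel."""
    z = zigbee_center_mhz(zigbee_channel)
    z_lo, z_hi = z - ZIGBEE_WIDTH_MHZ / 2, z + ZIGBEE_WIDTH_MHZ / 2
    hits = []
    for w in wifi_channels:
        c = wifi_center_mhz(w)
        w_lo, w_hi = c - WIFI_WIDTH_MHZ / 2, c + WIFI_WIDTH_MHZ / 2
        if z_lo < w_hi and w_lo < z_hi:  # strict: touching edges do not count
            hits.append(w)
    return hits
```

Under these assumptions, Zigbee 15, 20 and 25 fall in the gaps around Wi-Fi 1, 6 and 11, while a channel such as Zigbee 11 sits inside Wi-Fi channel 1.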

3) Rebuild routes (non-disruptive)

  • Power cycle repeaters (not end devices)
  • Let mesh heal naturally over a few hours

4) Watch specific problem candidates

Keep an eye on:

  • Shed sensors
  • Garage tilt sensors
  • Exterior locks

If they degrade further:

  • Add closer repeaters
  • Reposition slightly if possible

:bar_chart: Bottom line

Your Zigbee network is:

:white_check_mark: Functioning correctly
:white_check_mark: Stable and active
:white_check_mark: No failures or outages

…but…

:warning: Mesh coverage is slightly weak at the edges
:right_arrow: Not broken today, but worth optimizing before it becomes intermittent.


:white_check_mark: If you want, I can map your devices into a recommended repeater placement plan (very helpful given your layout).

This prompted me to finally unplug the mmWave sensor from the network. LQIs for other devices immediately shot up. The sensor was taking up 90% of network messages and spamming, similar to yours.
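
A number like that 90% is quick to get from a capture without any AI involved. A minimal Python sketch, assuming you already have one device name per received message (for example, pulled from a log export); `traffic_share` and the device names in the sample are hypothetical.

```python
from collections import Counter


def traffic_share(device_names) -> list:
    """Return (device, percent_of_messages) pairs, busiest device first."""
    counts = Counter(device_names)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(name, round(100 * n / total, 1)) for name, n in counts.most_common()]


# Made-up sample: one chatty sensor and one quiet plug.
sample = ["mmWave sensor"] * 9 + ["Hallway plug"]
shares = traffic_share(sample)
```

A single device near the top with a share far above everything else is a good candidate to unplug for a test, as described above.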

Thank you for the instructions.

Can you mention the specifics of your light switch?
I have horrible issues myself, and they come and go - it's been going on for years and through more than a couple of hubs (and 3 houses). I wonder if I don't have a bad mains repeater somewhere that takes a crap and my world goes sideways. I've always wondered about ZB 1.2 and 3.0 and whether I've built a stable mesh - I have a lot of the ZB Jasco switches - they seem okay - but who knows. I trust nothing in the house TBQH.

So the switch I ruled in as causing at least one issue (I think there were more issues not discussed in this thread) was an Enbrighten (Jasco/GE) ZB3002, ZB 3.0. Which makes me wonder if the problems I was having mirrored many other folks saying C8pro and ZB 3.0 "have issues".

The switch I replaced it with is a Jasco/GE 45857 - ZB HA 1.2 - and it is working fine. Again, is this a 3.0 issue? IDK

Now I'm wondering how I can export my devices to identify every device model, so I can see if any others are ZB 3.0 and watch them for issues. Short of looking at all 56 devices manually, I don't know a way to export.