Matter/Thread Network Discussion

This is rapidly getting over my head….

Anyway, what you are saying is that the battery-operated devices are not doing their thing because they are battery devices? I can accept that, but it kinda defeats the purpose of Thread devices. If they were mains powered, I’m not sure why they couldn’t just be Matter.

At this point I really don’t have a need for a mains powered device. We don’t have an IKEA store here and everything has to be ordered with a $35 minimum. So can’t really justify ordering something I don’t need.

To clarify something I said: I bought some IKEA devices because they were cheap and I thought they would be a good way to experiment with Thread. I didn’t really have much use for them, but I have since made use of a couple. When I had so many problems with them, I bit the bullet and bought the Eve sensor to verify whether it was an IKEA issue or something else.

From what someone said above, it seems HE may have to do initializations on a regular basis? Not sure I’m understanding that correctly. Apple appears to have solved that issue. Again, I’m just guessing based on my feeble brain.

Gonna take a break now. My wife is taking me out for BBQ for lunch. She doesn’t know it yet, but she is. :slight_smile:


They should. They are, to an extent, since everything looks normal to you in Apple Home. Them being battery devices, they sleep to conserve battery, and can’t be reached when they are sleeping - that’s the biggest difference. They need to get everything done when they wake up. If they participate in multiple Matter fabrics (e.g. Apple Home + HE) they have to manage multiple subscriptions. The hypothesis is that something goes sideways with that.

I think so too.


Here’s an interesting experiment you could try, @j715. Armed with the knowledge that “it’s all IPv6”, I went to the hub’s Matter Settings page, grabbed the IPv6 address of a TIMMERFLOTTE Matter-over-Thread battery-powered sensor, and tried to ping it from a macOS computer on the same LAN:

ping6 fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34
PING6(56=40+8+8 bytes) fd0d:2946:b6e6:48a3:14eb:b5b6:1616:3339 --> fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=2 hlim=62 time=8144.017 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=3 hlim=62 time=7155.259 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=4 hlim=62 time=6183.477 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=5 hlim=62 time=5223.405 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=6 hlim=62 time=4244.339 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=7 hlim=62 time=3261.404 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=8 hlim=62 time=2273.575 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=9 hlim=62 time=1294.208 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=10 hlim=62 time=310.596 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=11 hlim=62 time=14309.218 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=12 hlim=62 time=13308.837 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=13 hlim=62 time=12329.846 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=14 hlim=62 time=11347.587 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=15 hlim=62 time=10434.153 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=16 hlim=62 time=9424.820 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=17 hlim=62 time=8414.703 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=18 hlim=62 time=7420.759 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=19 hlim=62 time=6440.885 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=20 hlim=62 time=5463.298 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=21 hlim=62 time=4491.168 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=22 hlim=62 time=3506.724 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=23 hlim=62 time=2531.249 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=24 hlim=62 time=1564.109 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=25 hlim=62 time=576.020 ms

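So clearly the Thread/LAN routing is working. What’s also interesting is the latency of the first response and the rhythm of the replies: each round-trip time is about one second shorter than the previous one, then it jumps back up (probably due to the device going back to sleep).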
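To make that rhythm easier to see, here is a minimal Python sketch that turns the ping output into reply arrival times. It assumes ping6 sent one request per second (the usual default, but I did not verify it for this run), so a reply's arrival time is roughly its sequence number plus its round-trip time. The function names are mine, and only six of the reply lines are included to keep it short.

```python
import re

# Six reply lines copied from the ping6 output above: the first, the last
# and the boundary replies of the two visible cycles.
PING_OUTPUT = """\
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=2 hlim=62 time=8144.017 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=9 hlim=62 time=1294.208 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=10 hlim=62 time=310.596 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=11 hlim=62 time=14309.218 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=24 hlim=62 time=1564.109 ms
16 bytes from fd9e:58c:cb6f:0:a91d:f36f:5be7:5b34, icmp_seq=25 hlim=62 time=576.020 ms
"""

LINE_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def reply_arrivals(text, interval_s=1.0):
    """Return (seq, arrival_seconds) pairs.

    Assumption: request number `seq` was sent at seq * interval_s seconds,
    so its reply arrived at that time plus the round-trip time.
    """
    arrivals = []
    for match in LINE_RE.finditer(text):
        seq = int(match.group(1))
        rtt_ms = float(match.group(2))
        arrivals.append((seq, seq * interval_s + rtt_ms / 1000.0))
    return arrivals


def group_bursts(arrivals, gap_s=2.0):
    """Group replies that arrive within gap_s seconds of the previous reply."""
    bursts = []
    for seq, arrival in sorted(arrivals, key=lambda pair: pair[1]):
        if bursts and arrival - bursts[-1][-1][1] <= gap_s:
            bursts[-1].append((seq, arrival))
        else:
            bursts.append([(seq, arrival)])
    return bursts


bursts = group_bursts(reply_arrivals(PING_OUTPUT))
for burst in bursts:
    seqs = [seq for seq, _ in burst]
    print(f"replies {seqs} arrived around t={burst[0][1]:.1f}s")
```

On these lines the replies fall into two bursts roughly 15 seconds apart, which is what you would expect if the queued pings are all delivered when the device wakes up. That is only consistent with the sleepy-device guess; it does not prove it.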


I have one IKEA contact sensor still paired with HE. I just keep it on my desk to play with. It didn’t work this morning. The Matter details page showed no IP address, although there was no red X. I did an initialize and it picked back up. I then did the ping and saw similar results to yours. Later I got to thinking that I wish I could have found the IP some other way before doing the initialize, to see what would happen.

This morning I also had 3 Tapo Matter plugs that didn’t want to work from HE. One is controlled by a Zigbee button; after repeated tries it finally started. For another one I had to physically push the button on the plug to get it going again. All okay now, but weird.

Busy day today so no more testing till probably tomorrow.

Perhaps the Matter logs could provide some clues into what is going on?

Next time you’re having a troubleshooting session, open the Matter logs window (from Matter settings) and keep it open while you do these manipulations, then use the download button. Feed the resulting file to your favourite AI chatbot and see if it has something interesting to say about what’s in the file.

The logs are continually logging stuff. I managed to copy the few lines that seemed to be repeating over and over. Google says they are basically a successful ack of communication with a device. Don’t know which device.

The contact quit working again. The logs don’t appear to show anything about that. When I open/close the contact there is no response in HE, and the Matter logs don’t seem to show any error. That probably means nothing is coming in, which is to be expected.

For what it’s worth here are the three lines of logging that seem to repeat over and over.

2026-06-11 6:41:26.657 am  EM  [1166:1268] Rxd Ack; Removing MessageCounter:54481984 from Retrans Table on exchange 35604r
2026-06-11 6:41:26.657 am  EM  [1166:1268] Found matching exchange: 35604r, Delegate: (nil)
2026-06-11 6:41:26.657 am  EM  [1166:1268] >>> [E:35604r S:52740 M:32549968 (Ack:54481984)] (S) Msg RX from 1:0000000000000C09 [A31D] to 000000000001B669 --- Type 0000:10 (SecureChannel:StandaloneAck) (B:34)

All of a sudden there was a whole string of these. It seemed to be for all the Matter devices. The logs go by so fast it’s hard to catch things.

2026-06-11 6:52:37.295 am  CTL  [1166:5361] Failed to get device address.
2026-06-11 6:52:37.293 am  RMI  [1166:72] RMI call: getIpAddress(deviceId=3177)

There is no need to catch anything: use the download button. The more context we have for LLMs, the better.


I fed those to Claude, together with a link to this topic and about 2 hours worth of Matter logs from my hub as a reference. Here is what Claude is suggesting (seems reasonable to me):

Thanks for grabbing those logs — that actually helps a lot, and it points away from Thread being the culprit.

Here's the short version of what those three lines tell us. The only message type repeating is a SecureChannel:StandaloneAck — that's just an empty "got it" acknowledgement — and it shows Delegate: (nil). In plain terms: your device (…0C09) is clearly reachable — it's acking the hub, so frames are crossing the Thread mesh, the border router, and your LAN just fine — but there's no live subscription actually pulling data from it. The hub and device are trading keep-alive acks with nothing useful flowing on top. That's the log version of exactly what you said earlier: the data gets to the TBR but never makes it to HE, or HE bobbles it. It's bobbling it.

So this looks like a Matter subscription problem on the Hubitat side, not a Thread/network problem. Which fits everything you've seen: perfect in Apple Home (Apple's own subscription to the device is healthy), dropping in HE (HE's separate subscription to the same device is the one failing). It also fits that your mains-powered Matter devices are fine — they never sleep, so they never stress this part of the stack. The battery Thread sensors do.

Two things would pin it down, if you're up for a little more log-grabbing:

  1. Grab ~30–60 seconds of log around a drop, not just the repeating lines. What we're looking for is whether words like Subscription established, MoveToState, CASESession, or FindOperational are cycling — that would mean HE is stuck in a loop trying to re-establish the subscription. Or, if you see a Msg Retransmission … scheduled that never gets a matching "Removing … from Retrans Table," that points to messages going out to the device and never being answered. You don't have to interpret it — just paste it and we'll have a look.

  2. Check the timing. When that ack loop is running, is Apple Home still showing the device fine at the same moment? If yes, that's the clean proof: same device, healthy on Apple, stuck on HE — which puts the problem squarely in HE's Matter layer.

(If those logs come back ambiguous, there's a definitive experiment we can run to separate "sleepy device" from "Thread itself" — but let's see what the logs say first before going there.)

For what it's worth, I ran the same logging on my own hub and my sleepy sensor's subscription is stable. The hub even widens its retry timing to ~16 seconds for the battery device, which shows it correctly recognizing the device as a slow-polling sleepy one. So HE can do this right, which makes me think your drops are about something specific to your devices' firmware or the poll-delivery conditions in your setup, rather than HE just being unable to keep sleepy subscriptions alive in general.

Net: I don't think a Thread radio in the hub would fix this, and I don't think it's your mesh — the device is reachable the whole time. It's the subscription layer above Thread. The logs above will tell us which flavour.
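For anyone who would rather not eyeball the file, the two checks suggested in step 1 can be roughed out in a few lines of Python. This is only a sketch under assumptions: the regular expressions are my guess at the line shapes, based on lines as they appear in the hub's downloaded Matter log, and the function name is made up. It counts the subscription-related keywords and lists every scheduled retransmission whose message counter never shows up in a matching "Removing ... from Retrans Table" line.

```python
import re

# Patterns guessed from the hub's Matter log format (assumption, not an official grammar).
SCHEDULED_RE = re.compile(
    r"\[E:\w+ S:\d+ M:(\d+)\].*Msg Retransmission to (\S+) scheduled"
)
ACKED_RE = re.compile(r"Rxd Ack; Removing MessageCounter:(\d+) from Retrans Table")
MARKERS = ("Subscription established", "MoveToState", "CASESession", "FindOperational")


def scan(lines):
    """Return (marker_counts, pending).

    pending maps message counter -> peer for every retransmission that was
    scheduled but never acknowledged within the given lines.
    """
    marker_counts = {marker: 0 for marker in MARKERS}
    pending = {}
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                marker_counts[marker] += 1
        match = SCHEDULED_RE.search(line)
        if match:
            pending[match.group(1)] = match.group(2)
            continue
        match = ACKED_RE.search(line)
        if match:
            pending.pop(match.group(1), None)
    return marker_counts, pending


# Three sample lines in the hub's log format.
SAMPLE = [
    "[1166:1268] ??1 [E:62311r S:52782 M:121399876] (S) Msg Retransmission to 1:0000000000000C09 scheduled for 558ms from now [State:Active II:5300 AI:500 AT:300]",
    "[1166:1268] Rxd Ack; Removing MessageCounter:121399876 from Retrans Table on exchange 62311r",
    "[1166:1268] ??1 [E:4193r S:52824 M:57292096] (S) Msg Retransmission to 1:0000000000000C6A scheduled for 3100ms from now [State:Active II:17000 AI:2500 AT:1000]",
]

counts, pending = scan(SAMPLE)
print("marker counts:", counts)
print("never acknowledged:", pending)
```

To run it on a real capture, replace SAMPLE with `open("matter.log").read().splitlines()` (the file name is whatever you saved the download as). Keep in mind that in a short excerpt an unmatched retransmission may simply mean the ack landed outside the window.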


Not sure what I am looking at here. The IKEA contact is offline in HE but working in Apple. I cleared the Matter logs, then cycled the contact several times with it not responding in HE, then downloaded the logs. Gonna try to upload the log file so maybe someone else can see what I should be looking for. Well, here is part of it anyway.

[1166:1268] Found matching exchange: 62307r, Delegate: (nil)
[1166:1268] Rxd Ack; Removing MessageCounter:121399875 from Retrans Table on exchange 62307r
[1166:1268] >>> [E:62311r S:52782 M:165187649] (S) Msg RX from 1:0000000000000C09 [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:41)
[1166:1268] Handling via exchange: 62311r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] {
[1166:1268] SubscriptionId = 0xf900007f,
[1166:1268] InteractionModelRevision = 1
[1166:1268] }
[1166:1268] Refresh LivenessCheckTime for 22082 milliseconds with SubscriptionId = 0xf900007f Peer = 01:0000000000000C09
[1166:1268] <<< [E:62311r S:52782 M:121399876 (Ack:165187649)] (S) Msg TX from 000000000001B669 to 1:0000000000000C09 [A31D] [UDP:[fe80::2abb:edff:fe88:1c53%eth0]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:62311r S:52782 M:121399876] (S) Msg Retransmission to 1:0000000000000C09 scheduled for 558ms from now [State:Active II:5300 AI:500 AT:300]
[1166:1268] >>> [E:62311r S:52782 M:165187650 (Ack:121399876)] (S) Msg RX from 1:0000000000000C09 [A31D] to 000000000001B669 --- Type 0000:10 (SecureChannel:StandaloneAck) (B:34)
[1166:1268] Found matching exchange: 62311r, Delegate: (nil)
[1166:1268] Rxd Ack; Removing MessageCounter:121399876 from Retrans Table on exchange 62311r
[1166:1268] >>> [E:62315r S:52782 M:165187651] (S) Msg RX from 1:0000000000000C09 [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:41)
[1166:1268] Handling via exchange: 62315r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] {
[1166:1268] SubscriptionId = 0xf900007f,
[1166:1268] InteractionModelRevision = 1
[1166:1268] }
[1166:1268] Refresh LivenessCheckTime for 22082 milliseconds with SubscriptionId = 0xf900007f Peer = 01:0000000000000C09
[1166:1268] <<< [E:62315r S:52782 M:121399877 (Ack:165187651)] (S) Msg TX from 000000000001B669 to 1:0000000000000C09 [A31D] [UDP:[fe80::2abb:edff:fe88:1c53%eth0]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:62315r S:52782 M:121399877] (S) Msg Retransmission to 1:0000000000000C09 scheduled for 565ms from now [State:Active II:5300 AI:500 AT:300]
[1166:1268] >>> [E:62315r S:52782 M:165187652 (Ack:121399877)] (S) Msg RX from 1:0000000000000C09 [A31D] to 000000000001B669 --- Type 0000:10 (SecureChannel:StandaloneAck) (B:34)
[1166:1268] Found matching exchange: 62315r, Delegate: (nil)
[1166:1268] Rxd Ack; Removing MessageCounter:121399877 from Retrans Table on exchange 62315r
[1166:1268] >>> [E:4193r S:52824 M:108585420] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Handling via exchange: 4193r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] SubscriptionId = 0xf998ed04,
[1166:1268] AttributeReportIBs =
[1166:1268] [
[1166:1268] DataVersion = 0x30d9c851,
[1166:1268] Endpoint = 0x0,
[1166:1268] Cluster = 0x2f,
[1166:1268] Attribute = 0x0000_000C,
[1166:1268] Data = 184 (unsigned),
[1166:1268] AttributeReportIB =
[1166:1268] AttributeDataIB =
[1166:1268] DataVersion = 0x9c4c980b,
[1166:1268] AttributePathIB =
[1166:1268] {
[1166:1268] Endpoint = 0x1,
[1166:1268] Cluster = 0x45,
[1166:1268] Attribute = 0x0000_0000,
[1166:1268] }
[1166:1268]
[1166:1268] Data = true,
[1166:1268] },
[1166:1268] },
[1166:1268] ],
[1166:1268]
[1166:1268] InteractionModelRevision = 12
[1781218279.191] [1166:1268] [DMG] }
put : 12, status Success, clusterStatus None
put : 0, status Success, clusterStatus None
[1166:1268] Refresh LivenessCheckTime for 1844277 milliseconds with SubscriptionId = 0xf998ed04 Peer = 01:0000000000000C6A
[1166:1268] <<< [E:4193r S:52824 M:57292096 (Ack:108585420)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:4193r S:52824 M:57292096] (S) Msg Retransmission to 1:0000000000000C6A scheduled for 3100ms from now [State:Active II:17000 AI:2500 AT:1000]
[1166:1268] >>> [E:4195r S:52824 M:108585421] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Handling via exchange: 4195r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] SubscriptionId = 0x8ec8c2c2,
[1166:1268] AttributeReportIBs =
[1166:1268] [
[1166:1268] DataVersion = 0x9c4c980b,
[1166:1268] Endpoint = 0x1,
[1166:1268] Cluster = 0x45,
[1166:1268] Attribute = 0x0000_0000,
[1166:1268] Data = true,
[1166:1268] AttributeReportIB =
[1166:1268] AttributeDataIB =
[1166:1268] DataVersion = 0x30d9c851,
[1166:1268] AttributePathIB =
[1166:1268] {
[1166:1268] Endpoint = 0x0,
[1166:1268] Cluster = 0x2f,
[1166:1268] Attribute = 0x0000_000C,
[1166:1268] Data = 184 (unsigned),
[1166:1268] },
[1166:1268] ],
[1166:1268]
[1166:1268] InteractionModelRevision = 12
[1166:1268] }
[1166:1268] Received report with invalid subscriptionId 2395521730
[1166:1268] <<< [E:4195r S:52824 M:57292097 (Ack:108585421)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:4195r S:52824 M:57292097] (S) Msg Retransmission to 1:0000000000000C6A scheduled for 2793ms from now [State:Active II:17000 AI:2500 AT:1000]
[1166:1268] >>> [E:4196r S:52824 M:108585422] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Handling via exchange: 4196r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] SubscriptionId = 0x9b60999b,
[1166:1268] AttributeReportIBs =
[1166:1268] [
[1166:1268] DataVersion = 0x9c4c980b,
[1166:1268] Endpoint = 0x1,
[1166:1268] Cluster = 0x45,
[1166:1268] Attribute = 0x0000_0000,
[1166:1268] Data = true,
[1166:1268] AttributeReportIB =
[1166:1268] AttributeDataIB =
[1166:1268] DataVersion = 0x30d9c851,
[1166:1268] AttributePathIB =
[1166:1268] {
[1166:1268] Endpoint = 0x0,
[1166:1268] Cluster = 0x2f,
[1166:1268] Attribute = 0x0000_000C,
[1166:1268] }
[1166:1268] Data = 184 (unsigned),
[1166:1268] },
[1166:1268] ],
[1166:1268]
[1166:1268] InteractionModelRevision = 12
[1166:1268] }
[1166:1268] Received report with invalid subscriptionId 2606799259
[1166:1268] <<< [E:4196r S:52824 M:57292098 (Ack:108585422)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:4196r S:52824 M:57292098] (S) Msg Retransmission to 1:0000000000000C6A scheduled for 2990ms from now [State:Active II:17000 AI:2500 AT:1000]
[1166:1268] >>> [E:4197r S:52824 M:108585423] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Handling via exchange: 4197r, Delegate: 0x7f737f8fe0
[1166:1268] ReportDataMessage =
[1166:1268] {
[1166:1268] SubscriptionId = 0xfa9706d,
[1166:1268] AttributeReportIBs =
[1166:1268] [
[1166:1268] DataVersion = 0x9c4c980b,
[1166:1268] Endpoint = 0x1,
[1166:1268] Cluster = 0x45,
[1166:1268] Attribute = 0x0000_0000,
[1166:1268] Data = true,
[1166:1268] AttributeReportIB =
[1166:1268] AttributeDataIB =
[1166:1268] DataVersion = 0x30d9c851,
[1166:1268] AttributePathIB =
[1166:1268] {
[1166:1268] Endpoint = 0x0,
[1166:1268] Cluster = 0x2f,
[1166:1268] Attribute = 0x0000_000C,
[1166:1268] Data = 184 (unsigned),
[1166:1268] },
[1166:1268] ],
[1166:1268]
[1166:1268] InteractionModelRevision = 12
[1166:1268] }
[1166:1268] Received report with invalid subscriptionId 262762605
[1166:1268] <<< [E:4197r S:52824 M:57292099 (Ack:108585423)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0001:01 (IM:StatusResponse) (B:42)
[1166:1268] ??1 [E:4197r S:52824 M:57292099] (S) Msg Retransmission to 1:0000000000000C6A scheduled for 2947ms from now [State:Active II:17000 AI:2500 AT:1000]
[1166:1268] Received a duplicate message with MessageCounter:108585420 on exchange 4193r
[1166:1268] >>> [E:4193r S:52824 M:108585420] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Found matching exchange: 4193r, Delegate: (nil)
[1166:1268] Forcing tx of solitary ack for duplicate MessageCounter:108585420 on exchange 4193r
[1166:1268] <<< [E:4193r S:52824 M:57292100 (Ack:108585420)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck) (B:34)
[1166:1268] Received a duplicate message with MessageCounter:108585422 on exchange 4196r
[1166:1268] >>> [E:4196r S:52824 M:108585422] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Found matching exchange: 4196r, Delegate: (nil)
[1166:1268] Forcing tx of solitary ack for duplicate MessageCounter:108585422 on exchange 4196r
[1166:1268] <<< [E:4196r S:52824 M:57292101 (Ack:108585422)] (S) Msg TX from 000000000001B669 to 1:0000000000000C6A [A31D] [UDP:[fdcd:57ad:6f64:0:b888:c708:f06a:40a5]:5540] --- Type 0000:10 (SecureChannel:StandaloneAck) (B:34)
[1166:1268] Received a duplicate message with MessageCounter:108585421 on exchange 4195r
[1166:1268] >>> [E:4195r S:52824 M:108585421] (S) Msg RX from 1:0000000000000C6A [A31D] to 000000000001B669 --- Type 0001:05 (IM:ReportData) (B:95)
[1166:1268] Found matching exchange: 4195r, Delegate: (nil)
[1166:1268] Forcing tx of solitary ack for duplicate MessageCounter:108585421 on exchange 4195r

I fed it to the same Claude session as before, and Claude is pretty happy with itself. See below.

@bcopeland you might want to have a look; maybe there is an interesting nugget in here somewhere?

This is a much better capture — it actually shows the failure, and it's more specific than "the subscription drops." Two of your devices are in here telling slightly different versions of the same story.

Look at device …0C6A. HE logs this several times:

Received report with invalid subscriptionId 2395521730
Received report with invalid subscriptionId 2606799259

That's the key line. The device is wide awake and firing off reports — but it's tagging them with subscription IDs that HE doesn't recognize as valid. In plain terms: the device thinks it has live subscriptions that HE has already thrown away. The two ends disagree about what subscriptions exist. So the reports arrive fine (the device is reachable, the data crosses the mesh/TBR/LAN without trouble), but HE rejects them as belonging to a subscription it no longer knows about — and then the low-level acking machinery still has to acknowledge the messages, which is the endless StandaloneAck / Delegate: (nil) churn you pasted earlier. Those three repeating lines weren't the problem itself; they were the exhaust from this subscription-ID mismatch.
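One detail that is easy to miss: the hub prints the rejected IDs in decimal, while the report bodies print SubscriptionId in hex. Here is a small Python sketch that lines them up, using lines copied from the capture above. The patterns and the function name are my own assumptions about the log format, not anything taken from the SDK.

```python
import re

# Patterns guessed from the pasted log (assumption).
ACCEPTED_RE = re.compile(r"Refresh LivenessCheckTime .* SubscriptionId = (0x[0-9a-fA-F]+)")
REPORTED_RE = re.compile(r"SubscriptionId = (0x[0-9a-fA-F]+)")
REJECTED_RE = re.compile(r"invalid subscriptionId (\d+)")


def classify_subscription_ids(lines):
    """Return (accepted, reported, rejected) subscription IDs as integers.

    accepted: IDs the hub refreshed a liveness timer for.
    reported: IDs seen inside ReportDataMessage bodies.
    rejected: IDs the hub logged as invalid, in log order.
    """
    accepted, reported, rejected = set(), set(), []
    for line in lines:
        match = ACCEPTED_RE.search(line)
        if match:
            accepted.add(int(match.group(1), 16))
            continue
        match = REPORTED_RE.search(line)
        if match:
            reported.add(int(match.group(1), 16))
            continue
        match = REJECTED_RE.search(line)
        if match:
            rejected.append(int(match.group(1)))
    return accepted, reported, rejected


# Lines for peer ...0C6A, copied from the capture above.
SAMPLE = [
    "[1166:1268] SubscriptionId = 0xf998ed04,",
    "[1166:1268] Refresh LivenessCheckTime for 1844277 milliseconds with SubscriptionId = 0xf998ed04 Peer = 01:0000000000000C6A",
    "[1166:1268] SubscriptionId = 0x8ec8c2c2,",
    "[1166:1268] Received report with invalid subscriptionId 2395521730",
    "[1166:1268] SubscriptionId = 0x9b60999b,",
    "[1166:1268] Received report with invalid subscriptionId 2606799259",
    "[1166:1268] SubscriptionId = 0xfa9706d,",
    "[1166:1268] Received report with invalid subscriptionId 262762605",
]

accepted, reported, rejected = classify_subscription_ids(SAMPLE)
print("accepted:", sorted(hex(i) for i in accepted))
print("rejected:", [hex(i) for i in rejected])
```

In this capture the hub accepts 0xf998ed04 from …0C6A and rejects three other IDs, each of which matches the hex SubscriptionId in the report just before it. So the rejected numbers are not garbage; they are IDs the device is really using.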

Your offline contact …0C09 shows a related symptom: its reports are actually being received and handled (you can see HE refresh the liveness timer and reply), yet it still shows offline. So the data is getting in but isn't turning into an "online" device on HE's side.

What this points to: it's not that HE can't reach your sleepy devices, and it's not Thread or your mesh — the device is reachable and chatty the whole time. It's that HE and the device get out of sync about which subscriptions are active, most likely after the device sleeps through a re-subscription cycle. The device keeps reporting under the old subscription; HE has moved on and rejects it. That's a subscription-state bug, and it's exactly the "subscription/session management kinks" we figured HE would need to iron out.

Practically, that means:

  • This is almost certainly not something you can fix by changing your network, adding a Thread radio, or buying different devices. It's in HE's Matter implementation.
  • The "initialize" button works because it forces HE and the device to tear down and rebuild subscriptions from scratch, which re-syncs them, until they drift apart again.
  • This is worth getting in front of Hubitat staff. The Received report with invalid subscriptionId line is concrete and reproducible; that's the kind of thing a developer can act on, much more than "my sensors drop."

If you can, upload the full downloaded log file (not just the pasted lines) and tag a staff member — the invalid subscriptionId sequence plus the surrounding subscription setup is what they'll need. You've actually done the hard part here; this is a genuinely useful capture.


Is there a way to attach a file? If I copy the entire file it says too many characters.

Not that I’m aware of. The simplest way is probably to provide a Google Drive or other cloud service shareable link to your file.

Gotcha. I can do that in the morning.

Here is a link, I hope, to the complete Matter log file.

https://www.icloud.com/iclouddrive/033IHIz4-GX7eNImNTDNJAyUQ#matter2

The IKEA contact was still offline this morning. I did the latest hub update and it came back online after the reboot, which goes along with what is said above.


I fed this new log to Claude and it confidently asserted something different. Par for the course.

So instead I spent a few tokens on Claude Code to build a log analysis tool that is grounded in the specification documents, instead of relying on the LLM to parse the log directly.

FWIW here it goes:

Forensic look at the matter2.log capture — what it actually shows

I ran j715's matter2.log (linked in post #39) through a little analyser I've been working on. The picture lines up well with what people in this thread have been describing. Sharing in case it's useful to others hitting the same pattern — and to @bcopeland if you get a chance to look.

TL;DR

The bug isn't your network and isn't the devices. It's a coordination failure between the hub and certain devices — they lose track of each other's "subscription channel," and the recovery path that's supposed to fix it doesn't reliably fire.

What the log shows

Five Matter peers visible in a ~7-minute window. Verdicts the analyser came up with:

Peer | State
…0C6A | :red_circle: Broken: three different stale subscription IDs, CASE session went kDefunct and never recovered
…0C09 | :yellow_circle: Earlier stage of the same bug: keep-alives flow, but 96 of 98 reports carry no actual data
…0C66 | :yellow_circle: Self-healed late: went kDefunct, then re-established a fresh subscription ~16 minutes later
…0C15 | :white_check_mark: Self-healed cleanly: kDefunct → kActive worked the way it's supposed to
…0C2E | :white_check_mark: Healthy throughout

What this means for users

  • Initialize is the right workaround. It forces both sides to start over with a fresh subscription. That's literally what the analyser would also recommend for …0C6A.
  • The hub can self-heal: …0C15 and …0C66 did it during this capture. It's just slow (16+ minutes for C66) and unreliable (C6A never did, in 7 minutes).
  • If your device looks fine in Apple Home but offline in Hubitat, it's almost certainly this bug — the device is talking, the hub is just refusing what it says because it's lost the subscription context.
  • "Initialize fixes it, then it drifts again" isn't your imagination. The protocol doesn't have a corrective path once both sides are misaligned. Initialize is a hard reset; nothing prevents the drift from re-occurring.

What this might mean for Hubitat

Three signals from the log, in what I'd guess is order of importance:

  1. 181 mDNS / SRP address-lookup failures in 7 minutes (~26/min). The SDK's OnNodeAddressResolutionFailed is firing constantly. This explains why Initialize works (forces a fresh resolve) and why the problem recurs (next cache miss restarts the cycle). On a Thread fabric this implicates either the Border Router's SRP advertising proxy, the hub's mDNS cache, or a churning OMR prefix.

  2. The CASE recovery path runs for some peers but not others. …0C15 shows the expected pattern: Found an existing secure session to [1:…0C15], then kDefunct → kActive. For …0C6A there's no such line anywhere in the rest of the capture; the recovery path simply doesn't fire. Why does the same code path work for one peer and not another in the same window?

  3. …0C6A shows three different invalid subscription IDs, not the same one replayed. The device is creating fresh subscriptions; the hub isn't accepting any of them. End-to-end subscription-state-machine issue.

The IM-layer detection (CHIP_ERROR_INVALID_SUBSCRIPTION) and the MRP-layer cleanup (MarkAsDefunct) are both working as designed. The piece that's not firing is the recovery — and the underlying constant address-lookup failure might well be what's blocking it.
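For anyone who wants to reproduce the per-minute figure from signal 1 on their own capture, the arithmetic is just the event count divided by the capture length in minutes (181 / 7 is about 26). Below is a rough Python sketch of that count. It assumes each line of the downloaded file starts with an epoch timestamp in square brackets, like the one stray `[1781218279.191]` line in the paste earlier in the thread, and that "Failed to get device address" is the line to count. Both are guesses on my part, and the sample lines are synthetic, made up only to exercise the function.

```python
import re

# Assumption: downloaded log lines begin with "[<epoch seconds>.<millis>]".
EPOCH_RE = re.compile(r"^\[(\d+\.\d+)\]")


def events_per_minute(lines, needle):
    """Return (hits, rate_per_minute) for timestamped lines containing needle.

    The rate is None when the capture is too short to measure a time span.
    """
    stamps = []
    hits = 0
    for line in lines:
        match = EPOCH_RE.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamps.append(float(match.group(1)))
        if needle in line:
            hits += 1
    if len(stamps) < 2 or max(stamps) == min(stamps):
        return hits, None
    minutes = (max(stamps) - min(stamps)) / 60.0
    return hits, hits / minutes


# Synthetic sample lines (not from the real capture): two failures in two minutes.
SAMPLE = [
    "[1781218279.191] [1166:5361] [CTL] Failed to get device address.",
    "[1781218339.191] [1166:1268] [EM] Found matching exchange: 35604r, Delegate: (nil)",
    "[1781218399.191] [1166:5361] [CTL] Failed to get device address.",
]

hits, rate = events_per_minute(SAMPLE, "Failed to get device address")
print(f"{hits} failures, {rate:.1f} per minute")
```

If the real file uses a different timestamp format or a different failure message, adjust EPOCH_RE and the needle accordingly.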

Caveats

This is one ~7-minute snapshot of one hub. Battery vs mains-powered Matter-over-Thread devices behave very differently under stress, and the recording can't see device-side state — only what Hubitat sees. A longer capture spanning a full works → breaks → re-Initialize cycle would let someone correlate the address-lookup failures with the exact moments devices go offline. That's probably the next forensic step.


Methodology: parsed the log against the CHIP SDK's known event taxonomy, cross-referenced against the Matter 1.4 Core Spec (MRP §4.12, IM §8.5) and the ICD Management Cluster spec. Happy to share the tooling if anyone wants to run it on their own captures — it's a small Python CLI, no extra dependencies.


Wow! Totally lost now :face_with_spiral_eyes:. But the gist of it is that there’s really nothing I can do except wait for fixes, which is fine. For the critical devices I use virtual switches triggered by Apple automations. That works great, so I’m good for now.

Should I try to get this?