Zigbee issues

Tony,

You worked for IBM in the S/360 days? Did you happen to know Dave Freeman, my first boss at Rutgers U. in the early 1970s and a dear friend, who was also known as "Mr. DOS?"

Jeff

I got in at the end of the 370 program, but the internal forums (mainframe-hosted on VNET; I'm not sure that any external forums existed back then, as this pre-dated BITNET/USENET) were great sources of folklore. Much discussion back then focused on PC/DOS (many of us mainframe guys became hobbyist nerds when the $10K 8085-based Datamaster and its BASIC interpreter appeared just before the IBM PC); DOS/360 was in the rearview mirror. That said, the name does sound familiar, but I didn't know him personally.

2 Likes

Tony,

Ah. By the time System 370 was introduced, Dave Freeman had already left IBM. His nickname, "Mr. DOS," goes back to the early System 360 days in the early sixties. Development of OS/360, the operating system which eventually became z/OS, began in 1957 but, by the time the first System 360 models became available, OS/360, the largest software project ever attempted, still hadn't produced code which would run on the earliest S/360 models (360/20, 360/30), so Dave was tapped to write a compact but functional stop-gap operating system which could be delivered to the first customers in 1963. He and his team of four programmers had already written the S/360 Assembler, so the same team wrote TOS/360, with the system loaded from a tape drive, which shuttled the tape back and forth to load features as needed. In short order, they converted this to DOS/360, and he got the nickname. The idea was that this DOS, which has nothing to do with the system for the PC that Bill Gates sold to IBM in the early eighties, would eventually be phased out, but in its entire run, the 360/30, which was always popular, never had enough main memory (RAM) to run OS/360, so DOS remained and, as far as I know, is still in use today in its newest incarnation. In fact, Dave left IBM because they wouldn't assign him another project, as they deemed him irreplaceable, so he left to run the Rutgers data center, C.C.I.S., where, needless to say, we never ran DOS/360.

1 Like

Well, that's interesting. I worked in the Pok. development lab (hardware side) where the 360 was designed and built. I had the good fortune to end my days there in system design with a small cadre of the original fathers of the 360. They were revered even then.

Interesting to reflect on this as I browse pcpartpicker.com, deciding on 16 or 32GB for my PC build (I recall $2.5 million quoted as the cost of 1MB of magnetic core in the 360 era, though any machine that size was a unicorn). I remember being awed when I heard that a 'full boat' 16MB S/370 (using frames of boards/cards/chips by then) was powered up for the first time on the test floor. It was the max that the non-XA models could directly address. Breakers tripped and it crashed... we wound up having to screen memory cards by power draw so a configuration that large could be part of the catalog. Fun times!

Tony,

My first system at Rutgers U. was a 360/67 with 512KB of main in two 2365 core boxes, each holding two 128KB potted core arrays. We owned the system, but we couldn't afford the core, so we leased those two boxes and had a third in a storage room, which IBM just left there and rolled out to the floor twice a year at final exam time, charging us only for the two months a year we used it. Before we sold the 360/67 (I'm an actual felon as I snatched the aluminum topper attached to the CPU box when we sold it to RPI), we bought 2MB of hierarchy II (slow) memory from a brand-x vendor, the name of which I can't remember. When we sold the 360/67, we leased a 370/158 (not to mention a DEC KI-10 attached to the ARPANet as node 76) and, a few months later, a 370/168, with 4MB and 8MB respectively of chip memory main.

I was lead Systems Programmer and jack of all trades, and the only SysProg on the staff who wasn't actively anti-social, so I was the one who conducted machine room tours. One day, we had a troupe of Cub Scouts. I opened the door to one of the 2365 core boxes and was explaining that each little cube visible held an array of little magnetic donuts, which were threaded by hand with wire thinner than a human hair, mostly by little old ladies using magnifying glasses, when one of the kids, who seemed amazed, reached out and touched the pins of the lower block's wire-wrapped back plane, at which point the system crashed, complete with the huge alarm bell behind the display of lights on the front of the CPU, knocking off 35 RJE stations and about 400 terminal users at TTYs and 2741 Selectrics. Needless to say, that was the end of our machine room tour program. Here's one of the 128KB potted core modules,

… and here's the topper, perched over my 1938, 20 quart Hobart stand mixer.

Fun times, indeed!
Jeff

3 Likes

Robert,

The Inovelli Red Gen2 (no neutral required) in-wall dimmers are now in stock at Amazon, so I ordered two of them to replace my Lutron Casetas. They should arrive on Wednesday and we'll see how that goes. Thanks for the heads-up.

Jeff

1 Like

The model 67 was the time-sharing superstar... it might have been the first virtual memory model in the lineup. And there truly were rows of ladies who worked 'up the river' in the manufacturing plant sewing core planes. They had the skills that machines didn't. Even in the era of monolithic memory, core planes continued to be used for years by the Federal Systems division in aerospace; the Space Shuttle's first processors used them, where the radiation hardness and nonvolatility were key attributes (I guess the microsecond access times were on par for the slower pace of that era). Ha... great story about the machine room tour.

So how 'bout that Zigbee (making this an on-topic post lest I get chastened as a thread creeper)?

2 Likes

@azz710 & @Tony, nice weekend detour, guys. Interesting hearing stories like that, thx 😊👍

4 Likes

Tony,

I'm busily replacing my Quirky/GE lamps, trying every brand I can find. The other day I replaced a number of them with Sengled Element Classics I had "in stock" and ordered another dozen of those. I replaced another four with Linkind lamps, just to try that brand (which seems to be fine, and they appear to repeat well, based upon the child-route report). I also ordered a single Samsung lamp, just to try it, and two candelabra-base Ikea lamps to replace two I leave on all the time, as that circuit was never switched; the original 1900 sconces had their own switches.

It's not thread creep as all of these off-topic hardware discussions got started when people started jumping on me, implying that I'm a Luddite! The 360/67 was fabulous, indeed. It was a 360/65 with a DAT (Dynamic Address Translation) box, a section added between the CPU and the memory bus structure on the back, from which the 2365 main memory modules sprouted. When I first started, we ran CP/67, the predecessor to VM, during off hours, and TSS a couple of hours on Sunday, just to play with it, as they were the two operating systems (actually, CP/67 was a monitor) which took advantage of the DAT box, the first commercial virtual storage system in the world. We also tried Stanford's WYLBUR, but all three were so sluggish that, for the most part, we just stuck with OS/360 MVT, starting with release 16 and ending with release 19.6, which was so stable it was hard to crash it even intentionally. As for the timing, the 2365 core boxes delivered a doubleword (64 bits plus parity) in 1.2 microseconds, which seemed awfully fast at the time.

Jeff

Just FYI, that page isn't the end of the story. The issue isn't that a "bad" smart bulb will flat-out fail to repeat; it's that it tries (and is likely to show up as a route for other devices in that list) but will sometimes eat a message instead of routing it, causing hard-to-troubleshoot problems at seemingly random times. Mike (on staff) has seen this with a few different brands of smart bulbs by looking at traffic with a Zigbee sniffer. I know you've read a lot about this issue by now but wanted to make sure that it is clear what people have actually seen, and the diagnostic page you referenced isn't really informative in that regard. That being said, I've never heard of this particular brand (Linkind?), so I'm not sure if anyone has even anecdotal experience regarding their behavior.

Hopefully you continue to work with Support to dig into the issues. Sending my continued wishes for good luck!

4 Likes

Well, you're not alone in the struggle. This stuff can get the better of anyone. Just when I thought I was the master of my Zigbee network, I decided to add a couple of Zigbee plug-in dimmers (GE 45852) to replace a couple of problematic Z-Wave units that periodically stop responding to polls until they are power cycled (being stuck behind furniture, that gets tiring quickly).

So I decide to pair the first dimmer to my HE hub. Not discovered, after multiple attempts, even including a hub restart (though I can conceive of no reason why that should help other than to mitigate some type of resource depletion on the HE side). Hmm, must be a dud. Unbox the second one and try; not discovered. Okay, two duds in a row is unlikely. So I try using my much maligned but still kicking SmartThings hub... instant success, with both joining and operating just fine. I can't think of any plausible reason for the failure on HE.

I haven't had any unexplainable Zigbee issues since I started digging into how the protocol works (from the start I took the usual pains to choose a non-interfering channel; though I live in an RF void anyway and interference aside from my own wifi has never been an issue). I've only come across one other device that would not join my Channel 18 network, and that problem was resolved when I tried another sample.

Because I couldn't think of anything else to try, and suspecting maybe the dimmers' radios (both of them?) didn't operate well on 18 for some reason, I changed my HE's Zigbee channel to 13 (as good but no better than 18 given where my wifi channels reside). Both dimmers joined instantly. The last time I did a channel change (back in Feb. '18 when I first installed HE) I grew tired of waiting for a couple of stray end-devices (some of which are mounted outdoors) to catch up and had to rejoin them; I didn't feel like doing that. So just for kicks I immediately (in less than a couple of minutes) changed HE's channel back to its original 18... and lo and behold, within ten minutes or so both GE dimmers also successfully followed to channel 18 and have been working fine ever since. They automatically rescanned, found, and rejoined their parent network just as they should have, and continue to operate fine on channel 18, but wouldn't initially join HE on 18. I still can't think of any reason (aside from flaky hardware) why. That's why I keep a diary/change log for undocumented workarounds like these.

2 Likes

Very sound advice that I’ve not followed but will do so, now.

2 Likes

Robert,

Absolutely, and I do get that and believe it. I now have my own Zigbee sniffer, which will help me diagnose these issues. The Child/Route report, however, does tell me when a particular device is being used as a repeater and, to some extent, how well it's doing.

As for Linkind, I certainly don't have a great deal of definitive evidence, but all four of them started repeating immediately. When I disconnect my most reliable repeating plugs downstairs, the upstairs devices which then rely on the Linkinds remain entirely responsive once the mesh settles down, and there have been no problems whatsoever, so I'm hopeful we'll find that they are a decent, low-cost lamp with a fast enough CPU to keep up with the message stream.

Meanwhile, regarding the problematic Quirky/GE Link lamps: while packet drops are almost certainly part of the problem, I think there's something else involved as well, as simply adding a bad one, before it has a chance to be used as a repeater, can immediately orphan a block of lamps of any brand elsewhere in the house. I have very, very limited anecdotal evidence, but one hypothesis is that the original lamps, sealed in their boxes (of which I still have a few), do not tend to do this, whilst those to which Wink, after Flextronics bought the company, pushed new firmware are more likely, at least, to cause the problem. The original firmware from the time of the product launch in 2014 apparently caused a single lamp to drop out of the mesh entirely on occasion. The new firmware, pushed during the Flextronics Wink era, made this much less likely. I have an unsubstantiated suspicion that matching Wink hub firmware was required to work with the new lamp firmware and that this is the source of the problem with HE and, perhaps, SmartThings as well. I hope to prove this using my sniffer, once I get the filtering in the TI software set up to limit the vast quantity of unneeded data captured. As for what people have seen, I guess that depends upon which people and how rigorous their experimentation, for some of the stuff I've read is trivially refutable, at least when generally applied.

But the bottom line for me is that all of the Quirky/GE Link lamps have to go, which is why I have been purchasing many lamps of various brands so I can judge for myself which are the most reliable. My inaccessible chandelier is still an issue, so those nine Link lamps will be the last to go, especially as all have the latest Wink firmware, pushed some time last year. A dozen more Sengled Element Classics will arrive tomorrow, and ten of those will replace the last Link lamps that I can easily access. While I do reject the notion that the Links are, and always were, garbage, as has so often been stated, I wholeheartedly embrace the notion that they cause our Zigbee meshes intractable problems despite how well they work with the system for which they were designed. This, in fact, is the reason I'm attempting to avoid non-generic solutions, even those which turn out, like the Links, to be de facto non-generic, and why I'm replacing my Lutron Casetas with Inovelli trickle current-tapping (no neutral) dimmer switches rather than using a Lutron bridge. They'll arrive on Wednesday, and I'm really looking forward to installing them.

Thanks again, as always,
Jeff

Tony,

I'm now on my third Zigbee channel, after forcing my mesh WiFi router to keep itself to the lower end of the 2.4GHz band. I'm not exactly in a void, but I never see a strong WiFi signal from any of my neighbors, which is good enough. Your diary is an excellent idea, and I will now do the same.

Thanks,
Jeff

2 Likes

I might have missed this in the thread, but for the chandelier if the Link bulb replacement looks to be inevitable why not use a smart wall dimmer (there are Zigbee and Z-Wave devices) with some filament style LED bulbs? There seem to be a lot of nice looking options available. In the worst case you will have one temperamental device to deal with rather than nine. Edit: I suspect neutral wire (or lack of) is the reason; Inovelli now has a Z-Wave model that doesn't require one.

2 Likes

Yep, no neutral. All of my light switches are in boxes, originally knob-and-tube, with just series loops (two wires). I could do that now, but if I'm going to risk my life to change those nine bulbs again, I might as well just replace them with one of the generic Zigbee brands. As for the Inovelli, I have two Gen 2 dimmers arriving on Wednesday to replace my two Lutron Casetas, for which I bought nice, brass switch plates. The switch for the chandelier is in a two-gang box, and I just don't want to replace the other switch and get yet another switch plate. The look of the bulbs doesn't matter, as there's a Lalique glass shade over each which, from below, looks like a rosebud.

I have an old "server" sitting here powered off that still has a 3.5" stiffy drive installed. However, I haven't turned it on in a few years as it's just a counterbalance to the new server also sitting on top of my rack UPS.

Now if you want to talk about 8" floppies, or even punch cards or paper tape, or 9 track reel tape, let's go...

1 Like

FWIW, my two APs use 2.4GHz channels 1 and 11, and my Zigbee is using 20. Everything works great.

I have a lot of neighbor traffic on both sides in the channel 6 range, so this works for me. I may, at some point, move the Zigbee channel down, but it's reliable for now.
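
For anyone who wants to sanity-check a channel plan like the one above, here is a rough Python sketch. It uses the standard center-frequency formulas (2.4GHz Wi-Fi channel n sits at 2407 + 5n MHz; Zigbee channel k, 11 through 26, sits at 2405 + 5(k - 11) MHz). The 22 MHz Wi-Fi width and 2 MHz Zigbee width are simplifying assumptions, and the function names are made up for illustration. Real-world interference also depends on signal strength and distance, so treat the result as a first guess, not a verdict.

```python
# Sketch: which Zigbee channels avoid the nominal footprint of given Wi-Fi channels?
# Assumptions (not from the thread): Wi-Fi treated as 22 MHz wide, Zigbee as 2 MHz wide.

WIFI_WIDTH_MHZ = 22.0
ZIGBEE_WIDTH_MHZ = 2.0


def wifi_center_mhz(channel: int) -> float:
    """Center frequency of a 2.4 GHz Wi-Fi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a 2.4 GHz Wi-Fi channel from 1 to 13")
    return 2407.0 + 5.0 * channel


def zigbee_center_mhz(channel: int) -> float:
    """Center frequency of a Zigbee (IEEE 802.15.4) channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("expected a Zigbee channel from 11 to 26")
    return 2405.0 + 5.0 * (channel - 11)


def overlaps(wifi_channel: int, zigbee_channel: int) -> bool:
    """True if the two nominal bands share spectrum; bands that only touch do not count."""
    gap = abs(wifi_center_mhz(wifi_channel) - zigbee_center_mhz(zigbee_channel))
    return gap < (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels whose nominal band overlaps none of the given Wi-Fi channels."""
    return [z for z in range(11, 27)
            if not any(overlaps(w, z) for w in wifi_channels)]


if __name__ == "__main__":
    print(clear_zigbee_channels([1, 11]))
    print(clear_zigbee_channels([1, 6, 11]))
```

Under those assumptions, Wi-Fi on 1 and 11 leaves Zigbee 15 through 20 plus 25 and 26 clear, and adding neighbors on Wi-Fi 6 narrows that to 15, 20, 25 and 26. Zigbee 20 only just clears Wi-Fi 11 (the nominal band edges touch at 2451 MHz), so a wider Wi-Fi width assumption would flag it.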

1 Like

Spent about 15 years supporting a typesetting operation that used 8-level paper tape. I used to be able to read it as quickly as a newspaper. We used optical readers (DEC) and a couple of DEC's PDP-8/e minis to drive the phototypesetters. Yes, we did all the offline storage on 9-track tape. Great memories.

1 Like

Have you had any issues having a second Z-Wave stick paired to HE and using zensys tools simultaneously or do you unpair it before using it to kill or flash things?

I picked up the Zooz S2 stick to flash their stuff (as recommended by Zooz) but I've been advised to:

  1. not make it primary
  2. not pair it because it's S2