I get the frustration of those affected by the issue.
We (engineers) do not have direct access to the hubs. That is by design. We can get to the engineering logs, which are sometimes helpful and sometimes not. Often, as in this case, we can't reproduce the issue internally. So we make a guess about what the issue is, add some logging in the next build, and make changes to address the suspected root cause. Sometimes we get it on the first try, but not this time. We're not happy about you having those issues, and we do our best to address them within the constraints that the lack of access imposes.
If you're really curious, we narrowed down the source of the severe loads to the use of the redirectOutput and redirectErrorStream methods of Java's ProcessBuilder. They break down on a few hubs but not on others, even though the software is identical. We don't know why it happens, but we know not to use these methods going forward.
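
For anyone wondering what avoiding those two methods can look like in general, here is a minimal sketch. It is not the hub's actual code: the ProcessRunner class, its Result type, and the drain helper are hypothetical names made up for illustration. It assumes the goal is simply to capture a child process's output, and it leaves ProcessBuilder at its defaults, reading stdout and stderr through Process.getInputStream() and Process.getErrorStream() on two separate threads.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the hub software.
class ProcessRunner {

    /** Exit code and captured output of a finished child process. */
    static final class Result {
        public final int exitCode;
        public final String stdout;
        public final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs a command without calling redirectOutput or redirectErrorStream. */
    static Result run(List<String> command) throws IOException, InterruptedException {
        // Default settings: stdin, stdout, and stderr are all pipes to this JVM.
        Process process = new ProcessBuilder(command).start();

        // We send no input, so close stdin to let the child see end-of-file.
        process.getOutputStream().close();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();

        // Drain both pipes concurrently so the child cannot block on a full pipe.
        Thread outThread = drain(process.getInputStream(), out);
        Thread errThread = drain(process.getErrorStream(), err);

        int exitCode = process.waitFor();
        outThread.join();
        errThread.join();

        return new Result(
                exitCode,
                new String(out.toByteArray(), StandardCharsets.UTF_8),
                new String(err.toByteArray(), StandardCharsets.UTF_8));
    }

    /** Starts a daemon thread that copies everything from in to sink. */
    private static Thread drain(InputStream in, ByteArrayOutputStream sink) {
        Thread t = new Thread(() -> {
            byte[] buffer = new byte[4096];
            try (InputStream src = in) {
                int n;
                while ((n = src.read(buffer)) != -1) {
                    sink.write(buffer, 0, n);
                }
            } catch (IOException e) {
                // The pipe closed early; keep whatever was read so far.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Calling ProcessRunner.run with a command list returns the exit code along with both captured streams as strings. Whether this pattern would avoid the load problem on the affected hubs is not something this sketch can show, since the underlying cause is unknown.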