=> Orgs all around the world, runs towards the dawn.
Some days ago, we had an issue with the automation responsible for creating our software releases. Suddenly, and without any specific change on our side, we weren't able to create, sign and distribute new versions of the software that the team I work with and I develop.
Funny enough, we didn't notice that the whole thing had been broken for days. After all, that specific automation mainly handles our overnight builds and installations on the test systems and, given that nothing changed in the codebase during that time, we didn't have the chance to trigger any specific problem. It also helped that, in our big and slow organization, the next release of our program is planned for the beginning of next year, so it wasn't a real emergency, per se.
Unfortunately, that specific process was so well oiled and well tested that our human testers had become accustomed to it: for them, not receiving the daily notification about what changed in the test environments, telling them what needs to be tested, is almost a shock, to say the least.
In the end, even if the issue turned out not to be that much of a pain to fix, it's the kind of small thing that frustrates you the most because it's not your fault, or your team's fault. When issues of this kind accumulate, you get really frustrated, not to say angry, because the annoyance destroys your workflow for no good reason.
All for a software upgrade.
What happened, exactly, is that one of the central "IT teams" decided it was high time to upgrade a piece of software that the whole organization requires to work on all the codebases that the place I consult at runs, and there are many, many projects running the show.
We all agree here that it's always better to keep your tools up to date, even to the point of sacrificing a bit of stability, or causing a small burden to others, in order to have an updated stack. However, when this is done at the level of an organization with more than 100000 employees, with many of them working on some software, things change. You cannot simply pretend to play "agile" and break everybody's workflows because you cannot be bothered to find a solution to either migrate automatically, or keep backward compatibility with the existing setups.
Software upgrades are a way of life in IT. They are normal, and I'm the first proponent of keeping a regular upgrade schedule for all the production software a company depends on, be it to patch security issues or to receive performance improvements. However, this time things went wa-wa for the following reasons:
Of course, we were caught by surprise. Considering the amount of internal spam everybody here receives, expecting people to read an unimportant-looking e-mail, with nothing distinguishing it from the myriad of others, is a bit of a stretch. That's why, internally, the really important infrastructure changes, and the ones that require additional work downstream, are always reported through what we call "IT bulletins". That did not happen in this case, and that's why things were broken not only for us, but also for many other teams in the organization that were caught with their pants down this time.
I could have justified this kind of breakage if it had been an honest mistake, like: "sorry, this detail in the changelog slipped through", but this was not the case. It was a clear, undeniable and careless breakage of everybody's flow for no good reason. There wasn't a security problem to fix, or a performance issue that the upgrade would have solved. The worst thing is that the fallout from the issue was happily left to all the software development teams to manage and fix. When those responsible were questioned about that, the answer was, to put it shortly: "well, deal with it". It was mind-blowing, to me, that something like this is now acceptable.
Everybody will now tell me that, in big corporations, there are no alternatives to this kind of approach if you want to standardize the whole company's infrastructure and procedures. In theory, anything else is much more expensive, and that's why organizations try to centralize and simplify as much as possible. However, in the days of agility, speed and adaptability: are we sure that all this centralization, with the breakages that come with it, makes a company more agile, flexible and, in the end, efficient? Isn't it likely that more centralization leads to an increase in the level of bureaucracy required to get anything done? Are efficiency and agility, as concepts, even compatible in a huge organization?
To answer those questions, we need to understand some specific things.
First: all of the big corporations I have ever worked at, of course, only pretend to be agile and efficient. It's my personal opinion that flexibility and agility are values completely incompatible with what is considered "efficiency" at such organizations. Sure, a company can cut costs while doing the same thing as before. Somebody pointed out to me that it's perfectly fine to run a business at minimal expense forever but, in the end, that same company loses the ability to adapt to changes quickly, to reinvent itself and, ultimately, to innovate in an ever-changing market. It's true in any sector and it's even more true in the IT world.
Just as cutting costs and centralizing things in general causes a loss of flexibility and agility, centralizing the company tooling too much causes teams to become much less effective in their work. As Fred Brooks puts it: "there is no silver bullet" and, in this instance, pushing the hundreds, if not thousands, of teams in an organization to follow a "one true way" is the best recipe for disaster, for example when it's time to do a software upgrade.
Instead of spreading the risk, spending some "money", in the form of team freedom, to gain some diversification and thus reduce the potential for single points of failure, the organization chose to micro-control the teams' working environments. As usual, everything of this sort is justified with terms like "improved security", so that this way of proceeding does not look like what it is in reality: a way to point the fan downstream for when the shit, in the end, hits it. "Responding to change" they call it but, for me, it's just a form of blame shifting towards your users.
An organization playing agile whack-a-mole while, on the other hand, pushing hard for more and more centralized management of processes and tools is, in itself, a composition of various groups of people, or teams, trying to protect their small and precious bureaucratic garden. This is apparent in organizations pushing for the adoption of specific processes and tools from the top, even when such mandates don't make any sense for one or more specific teams.
Ever wondered why many, many organizations hire so-called "agile coaches", or why they are always full of unproductive people called "architects", busy deciding what the best git branching strategy is? Ever wondered why such braindead architects make decisions about things regarding your team, even when they explicitly are not part of the team, or when they don't even collaborate with your team?
This kind of behavior, in a company, affects everything: from the choice of the VCS to use, to the kind of development process and workflows to adopt (corporate Scrum anyone?), everything is mandated by a closed group of people at the top, who must justify, one way or another, why they are being paid handsomely by the company. Their work revolves around looking busy and imposing totally useless rules on other teams, in such a way that they can claim deniability for any consequence coming from their "suggestions".
If we focus on the classic corporate Scrum, resulting from those countless "agile transformation" projects sold by countless consulting companies, the end result makes the classic waterfall-ish process the majority of companies used to follow, or perhaps the canonical iterative development, look like the most efficient ways to produce software ever invented. When, in the wake of becoming more "agile" and "modern", you have a one-month procurement time for a development virtual machine in a cloud, it's clear that something went wrong somewhere (or everywhere, perhaps).
Is there a way to do things differently, in order to not frustrate everyone trying to get some work done? Can an established and huge company reduce the need for those centralized single points of failure that everybody in the company dislikes? In my opinion, there are ways, but those require a so-called "distributed approach" to doing things, and this is somewhat more expensive compared to what big corporations usually do.
The rule of thumb, albeit simplified, should be, in my opinion, something like the following:
No agile coaches unless specifically asked for, no ceremonies, no "agile maturity reflections" where, more often than not, the questions are "not applicable". Different projects have different requirements and needs, some more complex, others dead simple; an internal website is hardly comparable to a complete payment platform. It's stupid and idiotic to demand that a team look at some "acceptance levels" for something the team hasn't decided itself.
The only thing you will get, once again, is a bunch of frustrated people lamenting that some bureaucracy, decided by somebody else, is getting in the way of their work. Let a team breathe and autonomously figure out what's better for them. A team working on a specific project, with specific requirements, is better suited to judge what will work best for the project itself than some higher-ups without any clue.
If a team is well oiled and efficient exchanging patches by e-mail, so be it. If a piece of software does not need a continually running GitLab pipeline to be released and to work properly, that should be fine, too. I do not see a problem, honestly, if some diversification is pursued in a big organization. Diversification means resilience and resilience, in the context of a big company, is a good thing to have.
It's my opinion that the "central" part of a big corporation should limit itself to providing the glue infrastructure between teams and, in the end, some light guidelines about what is needed to release software in production (as an example, signing your packages with PGP).
Some standardization is to be expected, of course. Things like the operating systems the software should run on are a good target for standardization as, in the end, those things are not to be considered "core business". The main chat application is expected to be the same across the organization or, better, to use a standard protocol, leaving the choice of client to the individual users. Those are things that are not expected to change often and, as already said, are not what's important for your business, in the end. Everything else should be left to a team to decide. It's their way to find the best way to work and, while it will be a bit more expensive compared to a more "efficient" solution, it's the best course of action to build resilience and diversification in a corporation.
Until the next bean counter complaining about "cutting costs".