This week’s work fun: confirmed that Google Cloud’s default K8s Ingress setup pre-1.17 basically guarantees HTTP 502 responses for up to 10 minutes during regular app rollouts. To trigger the issue you just need to have fewer replicas of the app than there are K8s nodes/VMs. When a replica that lives on a node is deleted during a regular rolling update and there isn’t another to replace it, Google’s Load Balancers happily continue to send traffic there. You can repro this with a basic setup from their official tutorials. Switching to their new Container-Native load balancers seems to help. It’s wild, though. 1.17 is fairly new in GKE and existing clusters aren’t auto-upgraded to Container-Native. Google has basically been selling a broken load balancer setup to GKE customers for years.
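One way to watch for the 502s while a rollout is in progress is to poll the Ingress address and tally the status codes that come back. This is only a minimal sketch, not the exact method I used: the function names (`fetch_status`, `poll`) and the example URL are made up for illustration, and it relies on nothing beyond Python’s standard library.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET to `url`, or 0 on a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including 502) are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts, and so on.
        return 0


def poll(
    url: str,
    attempts: int,
    interval: float = 1.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> Counter:
    """Request `url` `attempts` times and count how often each status code appears."""
    counts: Counter = Counter()
    for _ in range(attempts):
        counts[fetch(url)] += 1
        sleep(interval)
    return counts
```

Kick off the rolling update, then run something like `poll("http://<your-ingress-address>/", attempts=600)` (placeholder address) and look at how many entries land under `502`. The `fetch` and `sleep` parameters are there so the loop can be exercised without a network.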
Enjoying Amazon’s Utopia remake. But seriously, how hard is it to tag torture scenes and let people opt in to auto-skip? They already “X-Ray” actors in the scene on the pause screen. They have the metadata.
In re-reading The Wrong Abstraction recently, I’ve realized that while we often talk about code being an artifact of production, it often functions more as a decision log.
In her post Sandi writes about how the Sunk Cost Fallacy plays into the hesitation we as developers feel when encountering perplexing abstractions. Part of the recommendation is to consider that
“It may have been right to begin with, but that day has passed.”
I think this is often how we think about organizational decisions. We make new ones all the time, and often they alter or completely reverse those that came before, even if the people who made the initial choices aren’t available for consultation anymore.
I suspect it’s easier for us to undo organizational decisions than code decisions because the former are made in a more visible and social environment. It’s easy to skip fixing the wrong abstraction when that choice may only be seen by one reviewer as part of an unrelated change. Perhaps this is where pair or mob programming can help. More eyes on the wrong abstraction at the right time could be all that’s needed to address it.
Thinking about code as a decision log could also help. Removing the abstraction is just another event. Events are bound to a specific moment in time, and maybe today is the right moment for an event that reverses some of those previous decisions.
Another goodie from the InfoQ Engineering Culture podcast: CA Agile Leaders on Using Data and Creating a Safe Environment to Drive Strategy. Some quick notes + insights:
- Biweekly status reports don’t really mean anything. Progress is shown with real data: stories/tasks moving to Done (based on the agreed Definition of Done). This data is critical. Executives need access to it in order to make informed decisions. Q: How do you break down the work in advance effectively? If you’ve got a clear target, but no clear way to get to it, how do you convert that into stories?
The natural human reaction to try and do more when we have a history of not delivering, thus overloading the system even further
- I didn’t realize how natural this really was until I heard it. If your team’s batting .200, just increase pitching frequency! Then you’ll definitely end up with more home runs, right? Not sure how far this analogy will go, but I’ll try: what you end up with is non-stop bunting, because there’s no time to wind up or recover.
- What worked for CA was Management asking everyone in the org to limit their WIP. That’s true leadership courage right there.
- Organizational temporal myopia: people’s inability to see themselves in the future, applied to organizations. Organizational focus is often on the long-term vision, so individuals end up inflating the amount of work that can get done to get to that vision sooner. I think the implication in the podcast is to focus on and commit to the short-term (sprint?), because you can imagine what the organization will be like a week from now versus a year from now.
Everybody gets mad if you’re not working on their stuff
- Nothing gets done, but at least it looks like you’re working on everyone’s thing. Addressing this may be a matter of shining a light on what is vs. isn’t being accomplished. Circles back to safety. If you feel you can’t say that you’re gonna miss, it will look like everything’s going great with 67 projects in progress until the last second when all of them go red.
Not one person can save the ship from sinking, and not one person can bring the ship to shore.
- There’s a culture of heroism, which militates against this collective thinking. Leaders are incentivized to work against each other by optimizing their portion at the cost of other parts of the org. There are individual performance reviews too. Individuals get praised, which then causes people to create situations in which they have an opportunity to save the day. CA tries to show that collective work is more effective than individual work, and works with organizations to change incentive structures away from individuals and toward teams.
- Act “as-if”. Start showing and encouraging behaviours before they’re made “official”.
- Withholding information is punishment. It’s actually painful. People will do this to each other at work. A culture of transparency helps work against these bad behaviours.
Lately, I’ve been listening to podcasts on my commute and while at the gym. The InfoQ Engineering Culture podcast is probably my favourite one so far.
The last episode I heard was Diana Larsen on Organisation Design for Team Effectiveness and Having the Best Possible Work-Life. The show notes are fantastic, but I thought I’d jot down the points that particularly resonated with me.
Every team needs a purpose
The boundaries of the work are important. Not feature scope, but “what kind of work are we doing?”, and “how is it unique?” The agile teams I’ve been on had a purpose, but the boundaries were only loosely defined. This resulted in unexpected work landing on our plate because, of all the teams, ours was where it seemed to not belong the least, which was as awkward as it sounds. This has the effect of diluting the team’s focus, making it possible for every member to push in a different direction.
A group of people need to build some mutual history to become a high performing team
Shared experiences bring people closer together. It makes a lot of sense, which is why it’s so surprising that teams are often expected to rush through the Forming, Storming, and Norming phases without being able to first, you know, ship something together.
This history can be built quickly when responding to a crisis
This makes me wonder if the most effective time to spin up a team is when there’s an urgent business problem requiring attention. Is there an equally-effective way to start a team with longer-term or less-visible deliverables?
Software is learning work – typing is not where the work gets done
Right, so why do I feel guilty about spending office time learning? There’s a constant sense of urgency in a business (especially a small one, I think) to ship, and pausing to learn seems to work against that goal. I don’t want to read every tech book that’s relevant to my work before writing any code, but I also want to be able to spend time digesting information before rushing to apply it.