I learned about difftastic today, which aims to show differences between files while being aware of the underlying programming language used in said files (if any).

Structured diff output with difftastic on a change in Mailhog

It’s basically magic when it works!

I generally like the built-in diff in the JetBrains suite of IDEs. The one I use these days is GoLand, but I believe they all support adding an external diff tool. Since difftastic is a console app, here’s what I had to do on my Mac:

  1. brew install difftastic # install the tool
  2. Install ttab using their brew instructions. This allows GoLand to launch a new tab in iTerm and run the difft command there. Otherwise, using the External Diff Tool in GoLand would appear to do absolutely nothing, since the tool’s output isn’t displayed natively.
  3. Configure the external diff tool using the instructions for GoLand.
    1. Program path: ttab
    2. Tool name: Difftastic (but can be anything you like)
    3. Argument pattern: -a iTerm2 difft %1 %2 %3
      1. The “-a iTerm2” is to ensure that iTerm is used instead of the default Terminal app.
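
To double-check the plumbing before relying on the GoLand button, the same thing can be run by hand from a terminal (the file paths below are just placeholders):

# difftastic compares two files directly
difft old/main.go new/main.go

# ttab opens a new iTerm2 tab and runs the given command there, which is what
# GoLand ends up doing with the argument pattern above (GoLand fills in the
# %N placeholders with file paths)
ttab -a iTerm2 difft old/main.go new/main.go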

Now you can click this little button in the standard GoLand diff view to open up the structural diff if needed:

Screenshot of GoLand's diff viewer with a highlight around the external diff tool button

Ideally the diff would be integrated into GoLand, but I don’t mind it being an extra click away, since difftastic doesn’t work reliably in many situations (particularly large additions or refactorings).

Prometheus has this line in its docs for recording rules:

Recording and alerting rules exist in a rule group. Rules within a group are run sequentially at a regular interval, with the same evaluation time.

Recording Rules

I read that a while ago, but at the time it wasn’t clear why it mattered. It seemed that groups were mostly intended to give a collection of recording rules a name. It became clear recently when I tried to set up a recording rule in one group that was using a metric produced by a recording rule in another group.

The expression for the first recording rule was something like this:

(
  sum(rate(http:requests[5m]))
  -
  sum(rate(http:low_latency[5m]))
)
/
(
  sum(rate(http:requests[5m]))
)

The result:

Using a recording rule from another group

It’s showing a ratio of “slow” requests as a value from 0 to 1. Compare that graph to one that’s based on the raw metric, and not the pre-calculated one:

Using the raw metric

The expression is:

(
  sum(rate(http_requests_seconds_count{somelabel="filter"}[5m]))
  -
  sum(rate(http_requests_seconds_bucket{somelabel="filter", le="1"}[5m]))
)
/
(
  sum(rate(http_requests_seconds_count{somelabel="filter"}[5m]))
)

The metrics used here correspond to the pre-calculated ones above. That is, http:requests is http_requests_seconds_count{somelabel="filter"}, and http:low_latency is http_requests_seconds_bucket{somelabel="filter", le="1"}. The graphs are similar, but the one using raw metrics doesn’t have the strange sharp spikes and drops.

I’m not sure what’s going on here exactly, but based on the explanation from the docs it’s probably a race between the evaluations of the two groups, resulting in an inconsistent number of samples for http:requests and http:low_latency. Maybe one has one fewer sample than the other at the time the first group’s expression is evaluated, which I think could show up as spikes.

Whatever the cause, the solution is simple: if one recording rule uses metrics produced by another, make sure they’re in the same group.
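
For illustration, here’s a sketch of what a combined group could look like in the rules file. The group name and the http:slow_request_ratio rule name are made up; the expressions mirror the ones above:

groups:
  - name: http_latency   # hypothetical group name
    rules:
      - record: http:requests
        expr: http_requests_seconds_count{somelabel="filter"}
      - record: http:low_latency
        expr: http_requests_seconds_bucket{somelabel="filter", le="1"}
      # Rules in a group run sequentially with the same evaluation time, so
      # http:requests and http:low_latency get samples at the same timestamps
      # before the ratio below is calculated.
      - record: http:slow_request_ratio
        expr: |
          (
            sum(rate(http:requests[5m]))
            -
            sum(rate(http:low_latency[5m]))
          )
          /
          sum(rate(http:requests[5m]))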

I was trying to add this site to Indieweb ring last night and found that it couldn’t validate the presence of the previous/next links, even though they were clearly in the footer of every page. I cleared the WordPress and Cloudflare caches without success.

Since Indieweb ring runs on Glitch, which is a large public service, I suspected that Cloudflare might be blocking their traffic. Sure enough, checking my HTTP access logs, I couldn’t find any requests from Glitch, and switching nameservers to my web host resulted in a successful check:

Indieweb ring's status checker log showing two failed checks and one successful one

This was happening even with the “Bot Fight” option turned off, “Security Level” set to “Essentially Off”, and the “Browser Integrity Check” option disabled.

[side note] I love that my autogenerated site identifier for Indieweb ring is a person worried about taking pictures and writing. Accurate.

I got an M1 Mac at work recently and hit a strange issue trying to run the GoLand debugger:

Error running '<my test>':
Debugging programs compiled with go version go1.17.8 darwin/amd64 is not supported. Use go sdk for darwin/arm64.

Wait, amd64? I had installed the arm64 Go package after the switch to M1, so why did it think I was running on amd64?

Turns out I had used Homebrew on my old Intel Mac to install and use a newer version of my shell (bash). Because amd64 Homebrew installs amd64 binaries, the shell itself was running under amd64, and any app it launched would think the CPU architecture was amd64 as well. Running arch in the shell confirmed my suspicion.

The solution was to switch to the Mac’s built-in bash shell, reinstall Homebrew, reinstall bash, and set iTerm to use /opt/homebrew/bin/bash instead of the default /bin/bash. I followed these instructions for switching to arm64 Homebrew and kept the old Intel Homebrew around aliased to oldbrew, as suggested, to still have access to amd64-only apps.
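
For reference, a couple of commands that show which architecture a shell (and anything it launches) runs under. This assumes the standard Homebrew prefixes and that bash is installed under both:

# prints arm64 in a native shell; under Rosetta it reports an Intel architecture instead
arch

# the two Homebrew installs live under different prefixes
file /opt/homebrew/bin/bash   # the arm64 build
file /usr/local/bin/bash      # the old Intel (x86_64) build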

Re-reading The Wrong Abstraction recently, I realized that while we talk about code as an artifact of production, it often functions more as a decision log.

In her post Sandi writes about how the Sunk Cost Fallacy plays into the hesitation we as developers feel when encountering perplexing abstractions. Part of the recommendation is to consider that

“It may have been right to begin with, but that day has passed.”

I think this is often how we think about organizational decisions. We make new ones all the time, and they often alter or completely reverse those that came before, even if the people who made the initial choices aren’t available for consultation anymore.

I suspect it’s easier for us to undo organizational decisions than code decisions because the former are made in a more visible and social environment. It’s easy to skip fixing the wrong abstraction when that choice may only be seen by one reviewer as part of an unrelated change. Perhaps this is where pair or mob programming can help. More eyes on the wrong abstraction at the right time could be all that’s needed to address it.

Thinking about code as a decision log could also help. Removing the abstraction is just another event. Events are bound to a specific moment in time, and maybe today is the right moment for an event that reverses some of those previous decisions.

Another goodie from the InfoQ Engineering Culture podcast: CA Agile Leaders on Using Data and Creating a Safe Environment to Drive Strategy. Some quick notes + insights:

  • Biweekly status reports don’t really mean anything. Progress is shown with real data: stories/tasks moving to Done (based on the agreed Definition of Done). This data is critical. Executives need access to it in order to make informed decisions. Q: How do you break down the work in advance effectively? If you’ve got a clear target, but no clear way to get to it, how do you convert that into stories?

The natural human reaction to try and do more when we have a history of not delivering, thus overloading the system even further

  • I didn’t realize how natural this really was until I heard it. If your team’s batting .200, just increase the pitching frequency! Then you’ll definitely end up with more home runs, right? Not sure how far this analogy will go, but I’ll try: what you end up with is non-stop bunting, because there’s no time to wind up or recover.
  • What worked for CA was Management asking everyone in the org to limit their WIP. That’s true leadership courage right there.
  • Organizational temporal myopia: people’s inability to see themselves in the future, applied to organizations. Organizational focus is often on the long-term vision, so individuals end up inflating the amount of work that can get done to get to that vision sooner. I think the implication in the podcast is to focus on and commit to the short-term (sprint?), because you can imagine what the organization will be like a week from now versus a year from now.

Everybody gets mad if you’re not working on their stuff

  • Nothing gets done, but at least it looks like you’re working on everyone’s thing. Addressing this may be a matter of shining a light on what is vs. isn’t being accomplished. Circles back to safety. If you feel you can’t say that you’re gonna miss, it will look like everything’s going great with 67 projects in progress until the last second when all of them go red.

Not one person can save the ship from sinking, and not one person can bring the ship to shore.

  • There’s a culture of heroism, which works against this collective thinking. Leaders are incentivized to work against each other by optimizing their portion at the cost of other parts of the org. There are individual performance reviews too. Individuals get praised, which then causes people to create situations in which they have an opportunity to save the day. CA tries to show that collective work is more effective than individual work, and works with organizations to change incentive structures away from individuals and toward teams.
  • Act “as-if”. Start showing and encouraging behaviours before they’re made “official”.
  • Withholding information is punishment. It’s actually painful. People will do this to each other at work. Culture of transparency helps work against these bad behaviours.

Lately, I’ve been listening to podcasts on my commute and while at the gym. The InfoQ Engineering Culture podcast is probably my favourite one so far.

The last episode I heard was Diana Larsen on Organisation Design for Team Effectiveness and Having the Best Possible Work-Life. The show notes are fantastic, but I thought I’d jot down the points that particularly resonated with me.

Every team needs a purpose

The boundaries of the work are important. Not feature scope, but “what kind of work are we doing?” and “how is it unique?” The agile teams I’ve been on had a purpose, but the boundaries were only loosely defined. This resulted in unexpected work landing on our plate because, of all the teams, it seemed to not belong with us the least, which was as awkward as it sounds. This has the effect of diluting the team’s focus, making it possible for every member to push in different directions.

A group of people need to build some mutual history to become a high performing team

Shared experiences bring people closer together. It makes a lot of sense, which is why it’s so surprising that teams are often expected to rush through the Forming, Storming, and Norming phases without being able to first, you know, ship something together.

This history can be built quickly when responding to a crisis

This makes me wonder if the most effective time to spin up a team is when there’s an urgent business problem requiring attention. Is there an equally-effective way to start a team with longer-term or less-visible deliverables?

Software is learning work – typing is not where the work gets done

Right, so why do I feel guilty about spending office time learning? There’s a constant sense of urgency in a business (especially a small one, I think) to ship, and pausing to learn seems to work against that goal. I don’t want to read every tech book that’s relevant to my work before writing any code, but I also want to be able to spend time digesting information before rushing to apply it.