Virtual Reflections on KubeCon NA 2023

[Reposted from Medium company blog]

Introduction

I feel like a little bit of a fraud writing about this, since I only managed to attend KubeCon virtually. But I watched enough of it and read enough about it that it gave me some thoughts.

OpenTelemetry (OTel)

Those of us who only relatively recently came across OpenTelemetry (shyly raises hand) will have been astonished by the momentum on display for its development, and in particular its adoption. Indeed, apparently OTel is the second highest velocity project in the CNCF (and retained that title between January and October) and we know from our conversations with customers that OTel implementations of various kinds are widespread.

To me, the interesting thing is that such exceptional industry investment isn’t (despite these economically pinched times) mostly about money. As far as I know, OTel doesn’t actually do anything, at least not directly, to reduce costs. (I don’t know technically if it has more efficient on-the-wire encoding; please tell me if you know.)

Instead, it does something even more important: it changes the effective ownership of the data. I suppose in theory it makes it easier to switch upstream observability providers, though perhaps that’s not quite as true as the switchers might like: the providers are working hard enough on differentiation that you’ll find switching costs are more about loss of features, operator unfamiliarity, and other helpful additions to inertia than anything else. In a certain sense, the observability market resembles the cloud market, where raw graphing services are analogous to vanilla VMs, but most of the providers are frantically trying to bundle value-added services to encourage their customers in precisely the opposite direction.

Either way, OTel is clearly instrumental for end organisations and something almost everyone is going to have to support sooner or later.

Platform Engineering

Also notable was the growth in interest and presentations about Platform Engineering. I’m a veteran of a few previous industry transitions/hype waves, so I have both the burned fingers of the chastened ex-zealot — who still remembers what the zealotry felt like — and also deep curiosity about different ways of viewing the world.

I’m enjoying learning more as the movement grows but, long story short, the DevOps/SRE/Platform Engineering discussion is uninteresting to me if it is primarily a tribal disagreement about how best to organise work. It is interesting to me as an evolving conversation about what matters in terms of service provision within a (sufficiently large/suitable/handwave) organisation. I will say that as I loosely understand the term right now, there’s nothing there I’ve not seen SRE teams do before. This is not a criticism or snark: again, the linguistic meat-grinder of the tech industry squelches everything introduced to its ignorant maw. But one key difference I see is that the focus on developer productivity typically plays out differently in Platform Eng than in SRE: in the SRE world, this is mostly done by either sharing responsibilities with devs or taking them away; conversely, in PlatEng, I see the central motivation being to supply self-serve platforms that the devs can use, in hopefully scalable ways. Both are valid ways to see the world, but they tend to play out very differently organisationally.

Misc

For those of us with a toe in the AI/ML world, this KubeCon showed folks clearly grappling with the emerging future, but (again) IMHO not particularly clearly. There was a good-sized nod to the problems of how to host serious AI/ML workflows in two keynotes, for example (“inference is the new web app”), but those of us convinced AI/ML will itself affect Kubernetes as well as the rest of the industry were clearly over at the OpenAI DevDay and not at KubeCon. Conversely, as someone with more than a toe in the traffic management world, it was good to see Envoy trucking along with a bunch of new features (including OTel). Additionally, Datadog took their massive outage transparency roadshow to the day 2 keynote stage, and as someone who did a podcast on this, I believe it’s a great demonstration of DD’s organizational character to talk about it as openly as they do. Kudos to everyone involved.

Summary

Ultimately, KubeCon felt, even at a remove, very familiar to me: it was clearly the systems or systems-thinking crowd, as opposed to the folks gathered in the overlapping GitHub Universe or OpenAI DevDay conferences. You’d perhaps expect that to result in a more sparsely attended event, but AIUI it instead intensified the agenda rather than shrinking it, and consolidated its natural role as a repository for vendor-neutral discussions about infrastructure operations. Others might disagree, and you might well yourself, but IMHO you can see this in the huge focus on Observability, the prominence of Platform Engineering, and the effective long-fingering of AI/LLM concerns.

Afternote

There was a portion of the conference dedicated to remembering those who’ve passed, a gesture I think way more conferences could perform. Accordingly, this post is dedicated to Kris Nova, Gal Navon, Sefi Genis, Ido Hubara, Carolyn Van Slyck, and Roee Negri.