Notes

Make Firefox look OK

When I open a freshly installed Firefox browser, it does not look good.

Client-side pagination in Go (range-over function edition)

In light of the Go 1.22 experiment that allows iterating over a function, I wanted to revisit my 2022 note on client-side pagination with a generic iterator, and get an idea of how range-over functions might help with the task. I will not delve into the details of the changes that Go brings; refer to the “Rangefunc Experiment” page on the Go wiki for those. The only thing to mention is that the executable must be built with the GOEXPERIMENT=rangefunc environment variable set, to enable the experiment.

Go

Twitter graffiti

A post on Twitter is graffiti on a public wall. A post in a personal blog is a sketch in your own notebook. Both can be raw or well-planned, be dumb or deep. But one is graffiti on a public wall, and another is a sketch in your own notebook.

Compile-time safety for enumerations in Go

Last month, a colleague of mine asked how to reason about enumerated values (“enums”) in Go. They wanted to benefit from Go’s type-safety and to prevent users from misusing the package.

For an imaginary example that enumerates possible colours, the snippet below shows how we typically implement this:

package color

type Color string

const (
    Red   Color = "red"
    Green Color = "green"
    Blue  Color = "blue"
)

But, just as Bill Kennedy illustrated in a blog post on a similar topic, such an implementation allows a user to pass any “untyped string constant” in places where the Color type is expected. Also, a user can declare their own value of type Color, going beyond what was defined by the initial enumeration set.

package main

import (
    "fmt"

    "example.com/color" // placeholder import path for the package color
)

func main() {
    PrintColor(color.Red)

    PrintColor("RAINBOW")

    var rainbow color.Color = "🌈"
    PrintColor(rainbow)
}

func PrintColor(c color.Color) {
    fmt.Printf("%v\n", c)
}

// Outputs:
red
RAINBOW
🌈

Note how, in the snippet above, the user both called PrintColor("RAINBOW") and defined their own variable of type color.Color, which holds an arbitrary string.

Update: a number of people pointed at my choice of words in the paragraph below, where I introduce a way to solve the problems. Indeed, “elegant” isn’t the most accurate word to describe this solution :) Also, refer to #19412 for the discussion about adding sum types to Go.


Go’s type system allows preventing both issues in a rather elegant way. Let’s declare the Color interface, with an unexported method, while implementing this interface in an unexported type color:

package color

type Color interface {
    xxxProtected()
}

type color string

func (c color) xxxProtected() {}

const (
	Red   color = "red"
	Green color = "green"
	Blue  color = "blue"
)

Because no type outside of our package color can implement the unexported method color.Color.xxxProtected(), we limit the possible implementations of the Color interface to the values defined in the package color.

package main

import (
    "fmt"

    "example.com/color" // placeholder import path for the package color
)

func main() {
    PrintColor(color.Red)

    //PrintColor("RAINBOW") // cannot use (constant of type string) as color.Color
}

func PrintColor(c color.Color) {
    fmt.Printf("%v\n", c)
}

If a user tries to pass an untyped string constant or declare their own variant of the colour, the code won’t compile, resulting in an error:

package main

type mycolor string

func (c mycolor) xxxProtected() {}

func main() {
    black := mycolor("BLACK")
    PrintColor(black)
}

OUTPUTS:
./prog.go:9:13: cannot use black (variable of type mycolor) as color.Color value in argument to PrintColor:
mycolor does not implement color.Color (missing method xxxProtected)
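One consequence of this design is that users can no longer turn an arbitrary string into a Color themselves, so the package probably needs to offer a way to do it. The sketch below shows one possible shape for that. Parse is a hypothetical helper that is not part of the original example, and everything is squeezed into a single package main only to keep the snippet self-contained; in a real layout the declarations would live in the package color.

```go
package main

import "fmt"

// Color can only be implemented by types that have the unexported method.
type Color interface {
	xxxProtected()
}

type color string

func (c color) xxxProtected() {}

const (
	Red   color = "red"
	Green color = "green"
	Blue  color = "blue"
)

// Parse is a hypothetical helper: it converts a string into one of the
// predefined colours and rejects everything else.
func Parse(s string) (Color, error) {
	switch c := color(s); c {
	case Red, Green, Blue:
		return c, nil
	default:
		return nil, fmt.Errorf("unknown color %q", s)
	}
}

func main() {
	c, err := Parse("green")
	fmt.Println(c, err)

	_, err = Parse("RAINBOW")
	fmt.Println(err)
}
```

This keeps the validation in one place: a value of type Color that came out of Parse is guaranteed to be one of the declared constants.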

Have you stumbled upon a scenario where such strict compile-time checks for enumerated values were needed? Share your thoughts with me on Hacker News, Twitter or Bluesky.

Env variables, you will (likely) find set in my Kubernetes deployments

Kubernetes allows us to pass values declared in a Pod’s manifest to its containers via environment variables (docs). A typical situation where I find this handy is when I run a Go application in a Pod.

As discussed in the previous note, out of the box, the Go runtime isn’t aware that it runs inside a container. This can lead to confusing situations, where the runtime adjusts its behaviour based on the resources (CPU and memory) available on the cluster’s node, instead of the resources that a developer or an operator restricted the deployment to.

Go runtime vs CFS quota

As of today, the Go runtime isn’t aware that it runs inside a container under resource constraints (CPU or memory). The runtime sees the resources available to the container’s underlying host OS, e.g. the VM where the container runs, and tries to optimize its behaviour based on what it sees. For container runtimes on Linux, which implement CPU restrictions via CFS (“Completely Fair Scheduler”), a mismatch between what the application thinks it has and what the OS allows it to use can lead to poor performance of the application due to unexpected throttling.

For example, a Go application that runs in a container constrained to 0.5 CPU, on a host with 2 CPUs, will observe 2 available CPU cores. That is, the application’s calls to runtime.NumCPU() and runtime.GOMAXPROCS(0) will both return 2. Because the Go runtime is optimized for maximum utilization of the available compute under a concurrent workload, the goroutines it spawns are distributed across an internal thread pool created with the assumption of two available CPU cores. This causes the application to be throttled as soon as the total time it has spent on the CPU cores within a CFS period equals the container’s quota. With the default CFS period of 100ms, a CFS quota of 0.5 CPU (50ms of CPU time per period), and two threads running on different CPU cores, the application is throttled after 25ms of every 100ms.

Bookmarks (issue 10)

The 22 BEST Basslines of 2022 (Patrick Hunter).

Building a custom code search index in Go for searchcode.com (Ben Boyter).

Kubernetes resources under the hood. This year was rich in deep tech articles and talks that explain how CPU requests and limits work in Kubernetes. This three-part series is no exception.

John Carmack on resigning from Meta. A post that spawned many interesting opinions on the internet: “You can’t have top people in X (performance, security, whatever) work for you and only half-care about X at the same time. They will move”.

How "go test" runs tests

When I run go test ./foo, the Go toolchain performs several tricks under the hood.

First, skipping the intermediate preparation steps (like parsing the command line flags and checking for cached results), the toolchain generates a package main, which will run all TestXxx functions of the package under test. Then it compiles a test binary that includes the generated code, the code of the package, and its _test.go files.

The steps above are roughly equivalent to running the following command:

> go test -c -o foo.test ./foo

# The resulting "foo.test" is indeed an executable
> file foo.test
foo.test: Mach-O 64-bit executable x86_64

This binary includes a generated function “main” that eventually calls the package’s TestMain(m *testing.M) (if one exists) and executes all the TestXxx(t *testing.T) functions that the toolchain found during code generation.

For the details of what the generated “main” looks like, refer to the source code of the internal package “load”.

Next, the toolchain executes the built binary, setting the current working directory to the path of the original package foo, which is equivalent to running the following:

> cd ./foo
> ./../foo.test

The important (and sometimes not obvious) part is that the Go toolchain builds a dedicated binary for every package in the testing scope. That is, in the example above, the test binary contains only the code of the package “foo”, its dependencies, and the code in its _test.go files. If we run the tests for several packages (go test ./pkg/...), the toolchain will generate, build and execute an individual binary for every package below pkg/. This makes the testing of each package fully isolated.

There are lots of other things happening underneath, including code coverage, benchmarking, etc. Have a look through the documentation under go help test and go help testflag, for some more details.

Go

Bookmarks (issue 9)

Kubernetes removals, deprecations, and major changes in 1.26.

Performance evaluation of autoscaling strategies in Kubernetes (Kewyn Akshlley). tl;dr: after comparing the performance of horizontal and vertical autoscaling using synthetic load, horizontal autoscaling seems more efficient, reacts faster to load variation, and results in a lower impact on the application’s response time.

How Pinterest delivers software at scale (Go Time, podcast). A very refreshing discussion about real-world technical challenges large organizations face.

Adam Dymitruk on Event Modeling (Software Engineering Radio, podcast). Event Modeling: what is it?

Bookmarks (issue 8)

Taking Postgres serverless (Changelog). Neon: Serverless PostgreSQL (Heikki Linnakangas @ Carnegie Mellon University).

Simple simulations for system builders (Marc Brooker). System designers care about questions like “How will the system behave under overload?” or “How sensitive is the design to latency?”. By “writing small simulators that simulate the behaviour of simple models”, Marc shows an approach to explore and reason about the possible answers to such questions.

The HTTP crash course nobody asked for.

Design docs at Google. When not to write a design doc: a clear indicator that a doc might not be necessary is when the design doc is an implementation manual that doesn’t go into trade-offs, alternatives, and explanation of the decision-making. In that case, write the actual application instead.

Bookmarks (issue 7)

The complete history and strategy of Amazon.com and The complete history and strategy of AWS (Acquired podcast). In a world where the growth of storage is massively outpacing the improvements in the speed of the Internet, a database is the stickiest technology. That’s how AWS locks enterprises in: it’s impossible to migrate an enterprise-scale volume of data out of AWS (“It took thirty years for Amazon to migrate off Oracle to AWS”).

CMU Intro to Database systems / Fall 2022 (Carnegie Mellon University); PostgreSQL B-Tree index explained, pt. 1 (Qwertee).

Build a CQRS event store with Amazon DynamoDB (AWS).

Working on a new thing

A number of developers will tell you they prefer building “a new thing” rather than maintaining existing “legacy” projects.

While working on “green field”, “day zero” projects can definitely be fun, they rarely feel real to me. The “real” project lives in production. It has users. It has load. It demands improvement.

You can’t improve “a new thing”.

Bookmarks (issue 6)

Kubernetes antipatterns: CPU Limits. Always define CPU requests; never define CPU limits.

Explaining the unexplainable: buffers in PostgreSQL. Shared buffers are those that are “shared” between several DB sessions, e.g. data pages, indices, etc.; local buffers are “local” to a session, e.g. for temporary tables; temp buffers are for intermediate objects, e.g. when the DBMS does hashing and sorting.

Rust Iterator pattern with iter(), into_iter() and iter_mut() methods.

Standard iterator interface in Go (Ian Lance Taylor via GitHub Discussions).

Best practices

Before jumping on the “best practices of software engineering” train after reading the classics, or someone’s re-post on social media, ask what the context was in which these “best practices” were established. Some of them made sense in an environment where the product’s release cycle was measured in months. But that doesn’t mean they make sense when we ship/fix/ship/new-requirements/ship/update/ship the product several times every week.

Bookmarks (issue 5)

Scott’s Bass Lessons and BassBuzz on YouTube. I’m learning how to play bass guitar 🎸 now. So far, these two channels were the most helpful with both the practice lessons, and the inspiration to move forward.

Event-driven architecture done right (Tim Berglund, Devoxx Poland 2021). Try to learn what you’re trying to do before you elaborate the architecture. If uncertainty is very high and you don’t know exactly what the business is asking for, just start with something (architecturally) simple.

Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service (whitepaper). The analysis of the paper (Marc Brooker).

Acquired’s lessons learned from 200 company stories: optimism always wins (Sony); nothing can stop the will to survive (Nvidia); it’s never too late (TSMC); focus on what makes your beer taste better (Amazon and all utility companies); don’t be talent, own the business (Oprah, Taylor Swift); you’ll get the partners you ask for (Amazon’s “If you’re not on my bus, get off”), and more.

Bookmarks (issue 4)

Go 1.19beta1. As usual, lots of good improvements in the language’s runtime and the compiler, with one particularly interesting addition being the new “knob” runtime/debug.SetMemoryLimit.

How to use gender-neutral language at work and in life (Grammarly). “Luckily, the English language is relatively gender-neutral in many respects” [at least, when compared to Russian and German languages].

Meet passkeys (Apple) and Everything you want to know about WebAuthn (OktaDev). As you can guess, I’m very excited about Apple stepping onto the path to a passwordless future, while betting on the WebAuthn standard.

Replace CAPTCHAs with Private Access Tokens (Apple), Private Access Tokens: stepping into the privacy-respecting, CAPTCHA-less future we were promised (Fastly), Private Access Tokens: eliminating CAPTCHAs on iPhones and Macs with open standards (Cloudflare).

Bookmarks (issue 3)

Rust tracks on Exercism. More than 100 coding exercises to learn Rust through practice.

PostgreSQL Anonymizer 1.0. A PostgreSQL extension for declarative data masking (docs and examples).

Queries in PostgreSQL: 4. Index scan (Postgres Pro). An in-depth overview of how PostgreSQL decides if it will use an index. One particular thing I had no idea about before I read the article was that “The Index Scan cost is highly dependent on the correlation between the physical order of the tuples on disk and the order in which the access method returns the IDs”. That explains several cases from my own experience, where postgres kept using “unexpected” sequential scans, after we added “another index” to the database.

Monarch: Google’s planet-scale in-memory time series database (Micah Lerner). A review of the paper (PDF), which describes the latest iteration of Google’s in-house metrics system.

Google’s API Improvement Proposals (AIP). A collection of design documents that summarize Google’s API design decisions.

A real life use-case for generics in Go: API for client-side pagination

Let’s say we have a RESTful API for a general ledger, with endpoints that return a paginated collection of resources:

  1. GET /accounts, retrieves a list of accounts, filtered and sorted by some query parameters;
  2. GET /accounts/:uuid/transactions, retrieves a list of transactions for an account;
  3. GET /postings, retrieves a list of postings stored in the ledger.

No, this one isn't about data-structures...

Go

Bookmarks (issue 2)

Digital object identifier (DOI) (Wikipedia). A DOI is a persistent identifier or handle used to identify various objects uniquely. It aims to be “resolvable”, usually to some form of access to the information object to which the DOI refers. This is achieved by binding the DOI to metadata about the object, such as a URL. Thus, by being actionable and interoperable, a DOI differs from identifiers such as ISBNs, which aim only to identify their referents uniquely.

Operations principles: securely deploying the graph to production at scale (Principled GraphQL). Lots of the things listed there apply to any sort of API, not only to GraphQL.

Songs your English teacher will NEVER teach! (Learn English with Papa Teach Me). Vocabulary from “Savage”, “WAP”, and “34+35”. This video is definitely not for kids!

8 phrases to spring-clean from your emails (Grammarly).

Platforms and Power (Acquired). “7 Powers” author Hamilton Helmer and Chenyi Shi (Strategy Capital), joined Acquired Podcast to discuss platform businesses, and how the “Power” framework applies to them.

Halfthings (Mat Ryer). Building something for the users to play with, to touch, to feel, to break, makes all the difference and moves the conversations away from the meta. Doing “one thing” or “building an MVP” can easily pull you into a “too much” for a validation phase. Build a “halfthing” instead.

Bookmarks (issue 1)

Generics can make your Go code slower (PlanetScale):

  1. boxing vs monomorphization vs partial monomorphization (“GCShape stenciling with Dictionaries”)
  2. interface inlining doesn’t work well with the 1.18’s compiler
  3. generics work well for byte sequences (string | []byte)
  4. in simple cases, generics can be useful for function callbacks.

How Meta enables de-identified authentication at scale. The rationale, the use-cases, and a high-level architecture of Meta’s Anonymous Credential Service (ACS).

Hidden dangers of duplicate key violations in PostgreSQL (AWS). INSERT … ON CONFLICT has additional benefits, compared to relying on PostgreSQL’s “duplicate key violation” error:

  1. no additional space needed for dead tuples
  2. less autovacuum required
  3. transaction IDs aren’t wasted, which prevents (postpones) a potential transaction ID wraparound.

Diving into AWS IAM Roles for (Kubernetes) Service Accounts (IRSA).