Notes

Go runtime vs CFS quota

As of today, the Go runtime isn’t aware of whether it runs inside a container under resource constraints (CPU or memory). The runtime sees the resources available to the container’s underlying host OS, e.g. the VM where the container runs, and tries to optimize its behaviour based on what it sees. For container runtimes on Linux, which implement CPU restrictions via CFS (“Completely Fair Scheduler”), a mismatch between what the application thinks it has and what the OS allows it to use can lead to poor application performance caused by unexpected throttling.

For example, a Go application that runs in a container constrained to 0.5 CPU, on a host with 2 CPUs, will observe 2 available CPU cores. That is, the application’s calls to runtime.NumCPU() and runtime.GOMAXPROCS(0) will both return 2. Because the Go runtime is optimized for maximum utilization of the available compute under concurrent workload, the goroutines it spawns are distributed across an internal thread pool sized on the assumption of two available CPU cores. This causes the application to be throttled once the total time it spends on the CPU cores within a CFS period equals the container’s quota. With the default CFS period of 100ms, a CFS quota of 0.5 CPU (50ms of CPU time per period), and two threads running on different CPU cores, the application is throttled after 25ms of every 100ms.
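The arithmetic behind the 25ms figure can be sketched in a few lines. This is a simplified model, not how the kernel accounts CPU time: the helper throttledAfter and its parameters are hypothetical, and it assumes every thread stays busy on its own core for the whole period. The values printed for runtime.NumCPU() and runtime.GOMAXPROCS(0) depend on the host and the Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// throttledAfter estimates the wall-clock time within one CFS period after
// which the quota is used up, assuming `threads` threads each keep one core
// busy. A result greater than or equal to `period` means the quota is never
// exhausted in this model. Hypothetical helper, for illustration only.
func throttledAfter(cpuLimit float64, period time.Duration, threads int) time.Duration {
	if threads < 1 {
		threads = 1
	}
	// CPU time the container may consume per period, e.g. 0.5 * 100ms = 50ms.
	quota := time.Duration(cpuLimit * float64(period))
	// The quota is consumed `threads` times faster than wall-clock time.
	return quota / time.Duration(threads)
}

func main() {
	// What the runtime believes it can use.
	fmt.Println("NumCPU:", runtime.NumCPU())
	// GOMAXPROCS(0) only reads the current setting, it doesn't change it.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

	period := 100 * time.Millisecond
	fmt.Println(throttledAfter(0.5, period, 2)) // 25ms
	fmt.Println(throttledAfter(0.5, period, 1)) // 50ms
}
```

With a single running thread the same quota lasts 50ms, which is why matching the thread count to the CPU limit reduces the time spent throttled.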

Keep reading…

Bookmarks (issue 10)

The 22 BEST Basslines of 2022 (Patrick Hunter).

Building a custom code search index in Go for searchcode.com (Ben Boyter).

Kubernetes resources under the hood. This year was rich in deep tech articles and talks that explain how CPU requests and limits work in Kubernetes. This three-part series is no exception.

John Carmack on resigning from Meta. A post that spawned many interesting opinions on the internet: “You can’t have top people in X (performance, security, whatever) work for you and only half-care about X at the same time. They will move”.

How "go test" runs tests

When I run go test ./foo, the Go toolchain performs several tricks under the hood.

First (skipping the intermediate preparation steps, like parsing the command-line flags and checking for cached results), the toolchain generates a package main, which will run all TestXxx functions of the package under test. Then it compiles a test binary that includes the generated code, the code of the package, and its _test.go files.

The steps above are roughly equivalent to running the following command:

> go test -c -o foo.test ./foo

# The resulting "foo.test" is indeed an executable
> file foo.test
foo.test: Mach-O 64-bit executable x86_64

This binary includes a generated function “main” that eventually calls the package’s TestMain(m *testing.M) (if one exists) and executes all the TestXxx(t *testing.T) functions that the toolchain found during code generation.

For details of what the generated “main” looks like, refer to the source code of the internal package “load”.
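As a rough mental model only, the generated “main” can be pictured as a table of test names and functions plus a loop that runs them. The sketch below is not the real generated code, which hands the collected tests to the testing package (via testing.MainStart) and works with *testing.T. The names internalTest, runTests, passes, and fails are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// internalTest is a hypothetical stand-in for the (name, function) pairs the
// toolchain collects while scanning the package's _test.go files.
type internalTest struct {
	name string
	fn   func() error // a nil error means the test passed
}

// runTests executes every registered test in order and returns the number
// of failures.
func runTests(tests []internalTest) int {
	failed := 0
	for _, tc := range tests {
		if err := tc.fn(); err != nil {
			fmt.Printf("--- FAIL: %s (%v)\n", tc.name, err)
			failed++
			continue
		}
		fmt.Printf("--- PASS: %s\n", tc.name)
	}
	return failed
}

// passes and fails are toy test bodies used for illustration.
func passes() error { return nil }
func fails() error  { return errors.New("boom") }

func main() {
	// In the real binary this table is produced by code generation.
	tests := []internalTest{
		{name: "TestPasses", fn: passes},
	}
	if runTests(tests) > 0 {
		os.Exit(1) // a failing test binary exits with a non-zero status
	}
}
```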

Next, the toolchain executes the built binary, setting the current working directory to the path of the original package foo, which is equivalent to running the following:

> cd ./foo
> ./../foo.test

The important (and sometimes not obvious) part is that the Go toolchain builds a dedicated binary for every package in the testing scope. That is, in the example above, the test binary contains only the code of the package “foo”, its dependencies, and the code in its _test.go files. If we run the tests for several packages (go test ./pkg/...), the toolchain generates, builds, and executes an individual binary for every package below pkg/. This makes the testing of each package fully isolated.

There are lots of other things happening underneath, including code coverage, benchmarking, etc. Have a look through the documentation under go help test and go help testflag, for some more details.

Go

Bookmarks (issue 9)

Kubernetes removals, deprecations, and major changes in 1.26.

Performance evaluation of autoscaling strategies in Kubernetes (Kewyn Akshlley). tl;dr: After comparing the performance of horizontal and vertical autoscaling using synthetic load, horizontal autoscaling seems more efficient, reacts faster to load variation, and results in a lower impact on the application’s response time.

How Pinterest delivers software at scale (Go Time, podcast). A very refreshing discussion about real-world technical challenges large organizations face.

Adam Dymitruk on Event Modeling (Software Engineering Radio, podcast). Event Modeling: what is it?

Bookmarks (issue 8)

Taking Postgres serverless (Changelog). Neon: Serverless PostgreSQL (Heikki Linnakangas @ Carnegie Mellon University).

Simple simulations for system builders (Marc Brooker). System designers care about questions like “How will the system behave under overload?” or “How sensitive is the design to latency?”. By “writing small simulators that simulate the behaviour of simple models”, Marc shows an approach to explore and reason about the possible answers to such questions.

The HTTP crash course nobody asked for.

Design docs at Google. When not to write a design doc: a clear indicator that a doc might not be necessary is when the design doc is an implementation manual that doesn’t go into trade-offs, alternatives, and explanation of the decision-making. Write the actual application instead.

Bookmarks (issue 7)

The complete history and strategy of Amazon.com and The complete history and strategy of AWS (Acquired podcast). In a world where the growth of stored data massively outpaces improvements in the speed of the Internet, a database is the stickiest technology. That’s how AWS locks enterprises in: it’s impossible to migrate enterprise-scale volumes of data out of AWS (“It took thirty years for Amazon to migrate off Oracle to AWS”).

CMU Intro to Database systems / Fall 2022 (Carnegie Mellon University); PostgreSQL B-Tree index explained, pt. 1 (Qwertee).

Build a CQRS event store with Amazon DynamoDB (AWS).

Working on a new thing

A number of developers will tell you they prefer building “a new thing” rather than maintaining existing “legacy” projects.

While working on “green field”, “day zero” projects can definitely be fun, they rarely feel real to me. The “real” project lives in production. It has users. It has load. It demands improvement.

You can’t improve “a new thing”.

Bookmarks (issue 6)

Kubernetes antipatterns: CPU Limits. Always define CPU requests; never define CPU limits.

Explaining the unexplainable: buffers in PostgreSQL. Shared buffers are those that are “shared” between several DB sessions, e.g. data pages, indices, etc.; local buffers are “local” to a session, e.g. for temporary tables; temp buffers are for intermediate objects, e.g. when the DBMS does hashing and sorting.

Rust Iterator pattern with iter(), into_iter() and iter_mut() methods.

Standard iterator interface in Go (Ian Lance Taylor via GitHub Discussions).

Best practices

Before jumping on the “best practices of software engineering” train after reading the classics, or someone’s re-post on social media, ask what the context was in which these “best practices” were established. Some of them made sense in an environment where the product’s release cycle was measured in months. But that doesn’t mean they make sense when we ship/fix/ship/new-requirements/ship/update/ship the product several times every week.

Bookmarks (issue 5)

Scott’s Bass Lessons and BassBuzz on YouTube. I’m learning how to play bass guitar 🎸 now. So far, these two channels were the most helpful with both the practice lessons, and the inspiration to move forward.

Event-driven architecture done right (Tim Berglund, Devoxx Poland 2021). If uncertainty is very high and you don’t know exactly what the business is asking for, try to learn what you’re trying to do before you elaborate the architecture. Just start with something (architecturally) simple.

Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service (whitepaper). The analysis of the paper (Marc Brooker).

Acquired’s lessons learned from 200 company stories: optimism always wins (Sony); nothing can stop the will to survive (Nvidia); it’s never too late (TSMC); focus on what makes your beer taste better (Amazon and all utility companies); don’t be talent, own the business (Oprah, Taylor Swift); you’ll get the partners you ask for (Amazon’s “If you’re not on my bus, get off”), and more.

Bookmarks (issue 4)

Go 1.19beta1. As usual, lots of good improvements in the language’s runtime and the compiler, with one particularly interesting addition being the new “knob” runtime/debug.SetMemoryLimit.

How to use gender-neutral language at work and in life (Grammarly). “Luckily, the English language is relatively gender-neutral in many respects” [at least, when compared to Russian and German languages].

Meet passkeys (Apple) and Everything you want to know about WebAuthn (OktaDev). As you can guess, I’m very excited about Apple stepping onto the path to a passwordless future, while betting on the WebAuthn standard.

Replace CAPTCHAs with Private Access Tokens (Apple), Private Access Tokens: stepping into the privacy-respecting, CAPTCHA-less future we were promised (Fastly), Private Access Tokens: eliminating CAPTCHAs on iPhones and Macs with open standards (Cloudflare).

Bookmarks (issue 3)

Rust tracks on Exercism. More than 100 coding exercises to learn Rust through practice.

PostgreSQL Anonymizer 1.0. A PostgreSQL extension for declarative data masking (docs and examples).

Queries in PostgreSQL: 4. Index scan (Postgres Pro). An in-depth overview of how PostgreSQL decides if it will use an index. One particular thing I had no idea about before I read the article was that “The Index Scan cost is highly dependent on the correlation between the physical order of the tuples on disk and the order in which the access method returns the IDs”. That explains several cases from my own experience, where postgres kept using “unexpected” sequential scans, after we added “another index” to the database.

Monarch: Google’s planet-scale in-memory time series database (Micah Lerner). A review of the paper (PDF), which describes the latest iteration of Google’s in-house metrics system.

Google’s API Improvement Proposals (AIP). A collection of design documents that summarize Google’s API design decisions.

A real life use-case for generics in Go: API for client-side pagination

Let’s say we have a RESTful API for a general ledger, with endpoints that return paginated collections of resources:

  1. GET /accounts, retrieves a list of accounts, filtered and sorted by some query parameters;
  2. GET /accounts/:uuid/transactions, retrieves a list of transactions for an account;
  3. GET /postings, retrieves a list of postings stored in the ledger.
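One way generics can help with endpoints like these is a single pagination helper shared by all resource types. The sketch below is an assumption about the shape such an API might take, not the post’s actual code: Page, CollectAll, the cursor field Next, the Account type, and the fake fetcher are all hypothetical names.

```go
package main

import "fmt"

// Page is a hypothetical envelope for one page of any resource type.
type Page[T any] struct {
	Items []T
	Next  string // opaque cursor of the next page; empty on the last page
}

// CollectAll walks the pages returned by fetch, starting from the empty
// cursor, and gathers all items into one slice.
func CollectAll[T any](fetch func(cursor string) (Page[T], error)) ([]T, error) {
	var all []T
	cursor := ""
	for {
		page, err := fetch(cursor)
		if err != nil {
			return nil, fmt.Errorf("fetch page (cursor %q): %w", cursor, err)
		}
		all = append(all, page.Items...)
		if page.Next == "" {
			return all, nil
		}
		cursor = page.Next
	}
}

// Account is a hypothetical resource returned by GET /accounts.
type Account struct {
	UUID string
}

// fakeAccounts stands in for an HTTP call and serves two fixed pages.
func fakeAccounts(cursor string) (Page[Account], error) {
	switch cursor {
	case "":
		return Page[Account]{Items: []Account{{UUID: "a1"}, {UUID: "a2"}}, Next: "p2"}, nil
	case "p2":
		return Page[Account]{Items: []Account{{UUID: "a3"}}}, nil
	}
	return Page[Account]{}, fmt.Errorf("unknown cursor %q", cursor)
}

func main() {
	// T is inferred as Account from the signature of fakeAccounts.
	accounts, err := CollectAll(fakeAccounts)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(accounts)) // 3
}
```

The same CollectAll could be reused with fetchers for transactions or postings, without a separate loop per resource type.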

No, this one isn't about data-structures...

Go

Bookmarks (issue 2)

Digital object identifier (DOI) (Wikipedia). A DOI is a persistent identifier or handle used to identify various objects uniquely. It aims to be “resolvable”, usually to some form of access to the information object to which the DOI refers. This is achieved by binding the DOI to metadata about the object, such as a URL. Thus, by being actionable and interoperable, a DOI differs from identifiers such as ISBNs, which aim only to identify their referents uniquely.

Operations principles: securely deploying the graph to production at scale (Principled GraphQL). Lots of things listed there apply to any sort of APIs — not only to GraphQL.

Songs your English teacher will NEVER teach! (Learn English with Papa Teach Me). Vocabulary from “Savage”, “WAP”, and “34+35”. This video is definitely not for kids!

8 phrases to spring-clean from your emails (Grammarly).

Platforms and Power (Acquired). “7 Powers” author Hamilton Helmer and Chenyi Shi (Strategy Capital), joined Acquired Podcast to discuss platform businesses, and how the “Power” framework applies to them.

Halfthings (Mat Ryer). Building something for the users to play with, to touch, to feel, to break, makes all the difference and moves the conversations away from the meta. Doing “one thing” or “building an MVP” can easily pull you into a “too much” for a validation phase. Build a “halfthing” instead.

Bookmarks (issue 1)

Generics can make your Go code slower (PlanetScale):

  1. boxing vs monomorphization vs partial monomorphization (“GCShape stenciling with Dictionaries”)
  2. interface inlining doesn’t work well with the 1.18’s compiler
  3. generics work well for byte sequences (string | []byte)
  4. in simple cases, generics can be useful for function callbacks.

How Meta enables de-identified authentication at scale. The rationale, the use-cases, and a high-level architecture of Meta’s Anonymous Credential Service (ACS).

Hidden dangers of duplicate key violations in PostgreSQL (AWS). INSERT … ON CONFLICT has additional benefits, if compared to relying on PostgreSQL’s “duplicate key violation” error:

  1. no additional space needed for dead tuples
  2. less autovacuum required
  3. transaction IDs aren’t wasted, preventing (postponing) the potential transaction ID wraparound.

Diving into AWS IAM Roles for (Kubernetes) Service Accounts (IRSA).

Go talks I keep coming back to

I have a personal list of “top conference talks” that I keep referring back to, even after years of working with Go:

Keeping the list here, in public, should help my future self when I’m stuck with a mental block and need to quickly pull a piece of community wisdom from the back of my memory. The list isn’t meant to be complete, and I expect to add more links here, moving forward.

Did I miss any? Share your suggestions with me on Twitter.

Love. Hate. Material Design

If Chrome had a notch

Another witty picture under the cut…

I've got COVID. What do I do next?

Go to https://www.berlin.de/corona/massnahmen/abstands-und-hygieneregeln/ for the information (in German) about the latest regulations in Berlin. For the up-to-date information in your region, consult with your local authorities.

Everything I list below are the steps I found relevant after I self-tested positive for COVID in Berlin at the end of January 2022.

Friday, 28th January 2022

I woke up with some mild cold symptoms. On the day before I had worked from home, and only had my usual hour-long walk around Prenzlauer Berg and Mitte after the workday. The rapid antigen test (Schnelltest) was negative.

While still working from home, I took a “quiet Friday” at work, to simply focus on some mundane routines.

I went for a short walk in the afternoon, bought some groceries, and headed home.

Saturday, 29th January

Didn’t feel any better or worse: mostly had a runny nose and a dry throat. I didn’t want to wait in line for a quick test, so I just walked around the city for an hour and went home.

Sunday, 30th January

It felt the same as it’d been the day before, although the dry throat started to feel a bit more intense. Walked around for an hour, came home, made some coffee ;)

A couple of hours later I started to cough. I took a self-test (we still had some at home) and,

Here we are! Welcome to the “two-stripes” club.

Keep reading…

Error messages in Go

When Go code propagates an error, the following pattern is very popular:

fmt.Errorf("failed to find a parking slot: %w", err)
// Or
fmt.Errorf("could not call mom: %w", err)

These “could not”, “failed to”, “unable to” make sense when my mind is in the local context of the function, method or package. But, in most of the cases I have to deal with, they overload the resulting log message with noise:

unable to ask about the cat: failed to call mom: failed to do request: Get https://: context canceled

While discussing this issue with a colleague, we came up with the following “better” strategy:

  1. only the logger should express its attitude to the facts, using words like “error”, “failed”, etc
  2. the business code must operate only with facts, e.g. “call mom”.

err := CallMom(number)
if err != nil {
    return fmt.Errorf("call mom (tel %s): %w", number, err)
}

This renders as follows in the application logs:

error: ask about cat: call mom (tel 123): make request: Get https://: context canceled

For a long stack of errors, this makes the full error message denser, showing more useful information per line.
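A minimal, runnable sketch of the whole chain follows. The function names makeRequest, callMom, and askAboutCat are hypothetical, and the failing request is simulated with context.Canceled rather than a real HTTP call, so the message lacks the “Get https://” part shown above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// makeRequest stands in for an HTTP call that was cancelled (hypothetical).
func makeRequest() error {
	return fmt.Errorf("make request: %w", context.Canceled)
}

// callMom states only the fact ("call mom") and adds local context.
func callMom(number string) error {
	if err := makeRequest(); err != nil {
		return fmt.Errorf("call mom (tel %s): %w", number, err)
	}
	return nil
}

// askAboutCat wraps the error the same way, one level up.
func askAboutCat(number string) error {
	if err := callMom(number); err != nil {
		return fmt.Errorf("ask about cat: %w", err)
	}
	return nil
}

func main() {
	err := askAboutCat("123")

	// Only the "logger" adds the attitude word.
	fmt.Printf("error: %v\n", err)
	// error: ask about cat: call mom (tel 123): make request: context canceled

	// Wrapping with %w keeps the root cause inspectable.
	fmt.Println(errors.Is(err, context.Canceled)) // true
}
```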

Go

Good coffee places in Berlin

My definition of “good coffee places” includes, but is by no means limited to, offering a “not too fruity” (sometimes “chocolaty”) filter coffee or a decent Americano.

Keep reading…