A thought experiment on Apple M1

With Apple’s new M1 Macs showing (reportedly) huge performance improvements compared to the “old” Intel-based Macs, I wonder what would hold Qualcomm (Snapdragon CPU) and others back from doing “the same” and moving into laptop/desktop territory?

Microsoft already has Surface Pro X, an ARM-based Windows computer. They also have a version of Win10 for ARM that one can even run on a Raspberry Pi (still beta quality, I believe). Could 2021 become the year of ARM on the desktop?

Even more interesting, Amazon’s Graviton is showing (again, reportedly) excellent performance while staying at a reasonably low cost. What would stop Google/Microsoft from moving into ARM-based CPU territory for their clouds?

Alternative font-variant in VS Code

I’m not a big fan of the aesthetics of VS Code, at least not on macOS. In particular, I don’t like the way its code editor renders fonts. But today I learned!

To make the fonts look better (I’m on macOS), set an alternative font variant, e.g. “Comic Code Ligatures Medium” instead of “Regular”, in the editor’s settings. To specify the variant, remove the spaces from the font’s name and append the variant after a hyphen. That is, instead of 'Comic Code Ligatures', set the following:

"editor.fontFamily": "'ComicCodeLigatures-Medium', monospace"

Full disclosure, my IDE of choice is GoLand, and it has been so since it was only a plugin for IntelliJ IDEA. Even though I can comfortably use VS Code or Vim when I need to make a small change or look something up in the code, I need the IDE to effectively work on a large codebase.

The backstory for this note is that I’m working on a small TypeScript/React application in my spare time this month. For something that small, VS Code works fine. In fact, I’m writing this particular note in VS Code too ;)


One year in production

It’s one year since I posted the very first note here. Unbelievable! Despite my own concerns, I did publish random posts over the course of the previous twelve months.

For the next round I came up with some personal goals:

  1. Keep posting.
  2. Work on your grammar.
  3. Add “Archive” and “Tags” sections.
  4. Find a better approach for managing drafts.
  5. Bring back the dark theme but figure out what to do with the illustrations.
  6. Keep posting ;)

Does profefe prefer "push" over "pull"?

The main component of profefe, a system for continuous profiling, is profefe-collector: a service that receives profiling data taken from an application and persists it in the collector’s storage (the design document describes it in more detail). Receiving data from an external source (for example, profefe-agent) indicates that profefe, as an observability system, prefers the “push” model. Why not “pull” the profiles directly from the application?

Both push and pull models have their benefits and drawbacks.

A collector that pulls profiling data from running applications could simplify integration into existing infrastructure, because there would be no need to change applications that already expose a pprof HTTP endpoint. Making sure that every application has integrated and configured profefe-agent would be a challenging job in a large organisation.

On the other hand, the pull model requires pprof servers to be exposed and reachable by the collector, so that it can fetch (pull) the profiling data. That can be challenging in deployments where applications are colocated on bare-metal machines: every application (every application instance) would have to communicate a unique TCP port for its pprof server.

To work as a pull system, the collector must be able to discover the pprof servers, so it requires a mechanism for service discovery (SD) to be usable at scale. Unfortunately, there isn’t a universal SD protocol or provider that an observability system could be built upon.

Prometheus, the best example of an open-source system that uses the pull model for data collection, has to support several different SD systems in its code base. At some point they ended up introducing their own general protocol, which expects a “middle-man service” that translates the data from an SD system into a list of Prometheus targets (Update: this comment from u/bbrazil does a better job explaining the state of SD in Prometheus). There is no clear way for an open-source system to be flexible without ending up as a pile of “plugins” that no one is willing to maintain or break.

From the start of the profefe project, several years ago, I had the idea that translating push into pull would be easier for an end-user. That is, if a small deployment already exposes a pprof server, writing an external job that pulls the profiles from the applications, annotates them with meta-data, and pushes the data into the collector can be as easy as spawning a cronjob in a sidecar. kube-profefe solves that nicely for deployments running in Kubernetes. At some point, I hoped to come up with something similar for Nomad+Consul if the experiments ended successfully.
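The external job described above (pull a profile from the application, annotate it, push it to the collector) could look roughly like the following. This is a minimal sketch, not profefe’s actual client: the collector address, the /api/0/profiles path, the service, type and labels query parameters, and the helper names pushURL and pullAndPush are all assumptions made for illustration. It also assumes the application serves net/http/pprof on localhost:6060.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"sort"
	"strings"
	"time"
)

// pushURL builds the collector URL a profile is pushed to.
// The path and the query parameters are assumptions of this sketch.
func pushURL(collector, service, profileType string, labels map[string]string) string {
	pairs := make([]string, 0, len(labels))
	for k, v := range labels {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs) // keep the output stable regardless of map order

	q := url.Values{}
	q.Set("service", service)
	q.Set("type", profileType)
	if len(pairs) > 0 {
		q.Set("labels", strings.Join(pairs, ","))
	}
	return strings.TrimRight(collector, "/") + "/api/0/profiles?" + q.Encode()
}

// pullAndPush fetches a profile from the application's pprof endpoint
// and streams it to the collector without buffering it in memory.
func pullAndPush(ctx context.Context, client *http.Client, pprofURL, dst string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, pprofURL, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("pull: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("pull: unexpected status %s", resp.Status)
	}

	preq, err := http.NewRequestWithContext(ctx, http.MethodPost, dst, resp.Body)
	if err != nil {
		return err
	}
	preq.Header.Set("Content-Type", "application/octet-stream")
	presp, err := client.Do(preq)
	if err != nil {
		return fmt.Errorf("push: %w", err)
	}
	defer presp.Body.Close()
	io.Copy(io.Discard, presp.Body) // drain the body so the connection can be reused
	if presp.StatusCode >= 300 {
		return fmt.Errorf("push: unexpected status %s", presp.Status)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// hypothetical addresses and labels
	dst := pushURL("http://localhost:10100", "my-service", "heap", map[string]string{"host": "pi-1"})
	if err := pullAndPush(ctx, http.DefaultClient, "http://localhost:6060/debug/pprof/heap", dst); err != nil {
		fmt.Println("pull-and-push failed:", err)
	}
}
```

Run from a cronjob or a sidecar, a job like this keeps the collector free of any service discovery: whoever schedules the job already knows where the application lives.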

Translating pull into push is similarly possible, but because profefe didn’t have to support any SD mechanisms from the start, the overall code base stayed simpler, which allowed us to focus on the collector and the API for querying profiles.

profefe-collector does use the push model. But one can deploy profefe so that it reflects the use cases your organisation has.

Do you use continuous profiling? Let me know about your experience. Share your thoughts on Twitter or discuss on r/golang.


How to design a good API

A good API is designed around the use-case. A poorly designed one, around the API’s implementation details.

What's in your main-dot-go? (aka New Go Project Boilerplate)

Sometimes I write small services in Go from scratch. And every time main.go ends up looking almost the same:

package main

import (
	"context"
	"flag"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	sigs := make(chan os.Signal, 2)
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)

	go func() {
		// cancel the main context on the first received signal
		<-sigs
		cancel()
	}()

	if err := run(ctx, os.Args[1:]); err != nil {
		log.Fatal(err)
	}
}

type Config struct {
	HTTPAddr            string
	HTTPShutdownTimeout time.Duration
}

func run(ctx context.Context, args []string) error {
	flags := flag.NewFlagSet("", flag.ExitOnError)

	var conf Config
	flags.StringVar(&conf.HTTPAddr, "http-addr", "", "address to listen on")
	flags.DurationVar(&conf.HTTPShutdownTimeout, "http-shutdown-timeout", 5*time.Second, "server shutdown timeout")

	if err := flags.Parse(args); err != nil {
		return err
	}

	// TODO: define the handler, the routing, and wire the dependencies with the main context
	mux := http.NewServeMux()
	server := &http.Server{
		Addr:    conf.HTTPAddr,
		Handler: mux,
	}

	errs := make(chan error, 1)
	go func() {
		log.Printf("starting: addr %s", server.Addr)
		errs <- server.ListenAndServe()
	}()

	select {
	case <-ctx.Done():
	case err := <-errs:
		return err
	}

	// create new context because top-most one is already canceled
	ctx, cancel := context.WithTimeout(context.Background(), conf.HTTPShutdownTimeout)
	defer cancel()

	return server.Shutdown(ctx)
}

Of course, not every service requires an HTTP server, but the general idea stands.

Once Go introduces signal.NotifyContext(), the signal handling in the main() function will become much smaller (we’re at go1.15rc1 as I’m writing this, and the change hasn’t landed in a release yet).

I love how transparent the flow is here and how everything is scoped inside the run() function. This structure forces you to eliminate global state, making unit or integration testing almost trivial; at least, in theory ;)

It might feel like too much boilerplate code for a “small” service. In practice, though, I don’t recall a time when this caused me any real trouble. The beauty of Go is in its explicitness.


Waveshare ESP8266 Driver Board Pins Mapping

I’ve been playing with an e-paper ESP8266 Driver Board and a 2.7" E-Ink display from Waveshare. Arduino C++ looks manageable. One strange thing, though: both the board’s documentation and the GxEPD2 library’s examples say the display is connected to the pins as BUSY → GPIO16, RST → GPIO5, DC → GPIO4, CS → GPIO15. This mapping seems wrong.

After digging through the code examples from Waveshare’s Wiki, I found that the correct mapping is the following: BUSY → GPIO5, RST → GPIO2, DC → GPIO4, CS → GPIO15.

This is what the initialisation of the main GxEPD2 class for my 2.7" display looks like now:

#define ENABLE_GxEPD2_GFX 0
#include <GxEPD2_BW.h>

// mapping of Waveshare e-Paper ESP8266 Driver Board
GxEPD2_BW<GxEPD2_270, GxEPD2_270::HEIGHT> display(GxEPD2_270(/*CS=15*/ SS, /*DC=4*/ 4, /*RST=2*/ 2, /*BUSY=5*/ 5));

Sticky headers. Please don't

Sticky (or “fixed”) headers are everywhere. It feels like every web designer’s first attempt at a site’s navigation starts with a sticky header. I hate it.

Interestingly, even Apple uses sticky headers on their website. On the main page, the navigation is the same dumb bar that hangs at the top all the time and eats a good chunk of the screen.

[Screenshot: Apple’s main page]

But let’s go one page down:

[Screenshot: an Apple product page]

The navigation bar still floats around but it blends with the page. It’s contextual. It doesn’t distract you from looking at the product.

When the purpose of the page is to provide you with reading material, and it doesn’t require any action from you, the bar just disappears from the design:

[Screenshot: an Apple text page]

Now, it’s only you and the text.

For contrast, this is how Google does it:

[Screenshot: somewhere on a Google site]

Yes, these are three sticky bars; one under another. No, I can’t believe there is a need for them to be on the screen all the time.

Or, this is the current design of another Google site, whose goal is to become the default documentation portal for Go modules and packages:

I’m sure these two bars at the page’s top are very important but I doubt they are more important than the content of this page.


Owner of Logging Context

There’s the late-night dilemma…

Who should be in charge of logging context: a component’s owner or the component itself?

type Logger interface {
	With(kvpairs ...interface{}) Logger
}

type Storage struct {
	logger Logger
}

// OPTION 1: component's owner defines the context of component's logger
func main() {
	_ = NewStorage(logger.With("component", "storage"))
}

// OPTION 2: component itself is in charge of its logging context
func NewStorage(logger Logger) (st *Storage) {
	return &Storage{
		logger: logger.With("component", "storage"),
	}
}

Fun fact: a couple of months back, we ruined the team’s Friday by debating a similar topic in the context of (Graphite) metrics namespaces. It has become even more intricate since then :/

Update (2020-04-15)

Many people on Twitter suggest that Option 1 is an obvious choice because only the application knows how to name the components. I totally agree with that.

As I wrote later, the real dilemma is not about “application and component” but about the “owner of the component”. Function main, in the example above, was a silly example that tried (and failed) to illustrate the question in code.

Let’s try another (silly) example:

// there are a bunch of different handlers (maybe ten) in this application
type Handler1 struct{ logger Logger }

func (h *Handler1) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// OPTION 1
	req := NewRequest(h.logger.With("component", "request"), r)
	_ = req // the rest of the handler is omitted
}

type Handler2 struct{ logger Logger }

func (h *Handler2) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// OPTION 1, still
	req := NewRequest(h.logger.With("component", "request"), r)
	_ = req // the rest of the handler is omitted
}

type Request struct {
	logger Logger
}

func NewRequest(logger Logger, r *http.Request) *Request {
	return &Request{
		// OPTION 2
		logger: logger.With("component", "request"),
	}
}

We want to have a consistent nomenclature across the application’s logs.

Is the choice still obvious? ;)
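To make the trade-off concrete, here is a toy sketch of what happens when the owner follows Option 1 while the component follows Option 2 at the same time. The ctxLogger type and the Line method are hypothetical helpers invented for this illustration; they are not part of any real logging library.

```go
package main

import (
	"fmt"
	"strings"
)

// Logger is the interface from the note, extended with a hypothetical
// Line method that returns the formatted log line instead of writing it.
type Logger interface {
	With(kvpairs ...interface{}) Logger
	Line(msg string) string
}

// ctxLogger is a toy Logger that keeps its context as a flat list of key-value pairs.
type ctxLogger struct{ kvs []interface{} }

// With returns a new logger whose context is the old one plus kvpairs.
func (l ctxLogger) With(kvpairs ...interface{}) Logger {
	kvs := make([]interface{}, 0, len(l.kvs)+len(kvpairs))
	kvs = append(kvs, l.kvs...)
	kvs = append(kvs, kvpairs...)
	return ctxLogger{kvs: kvs}
}

// Line formats the message followed by every key=value pair of the context.
func (l ctxLogger) Line(msg string) string {
	var b strings.Builder
	b.WriteString(msg)
	for i := 0; i+1 < len(l.kvs); i += 2 {
		fmt.Fprintf(&b, " %v=%v", l.kvs[i], l.kvs[i+1])
	}
	return b.String()
}

type Storage struct{ logger Logger }

// NewStorage follows OPTION 2: the component names itself.
func NewStorage(logger Logger) *Storage {
	return &Storage{logger: logger.With("component", "storage")}
}

func main() {
	var root Logger = ctxLogger{}

	// The owner follows OPTION 1 while the component follows OPTION 2,
	// so the "component" key ends up in the context twice.
	st := NewStorage(root.With("component", "storage"))
	fmt.Println(st.logger.Line("opened"))
	// prints "opened component=storage component=storage"
}
```

Whichever side owns the context, mixing the two conventions is what produces the inconsistent (here, duplicated) nomenclature.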

Do you have an opinion? Share it with me on Twitter.

Retrieve Location of macOS Device from Go

Participating in self-isolation is more fun when you have toys to play with. As a fun weekend project, I wanted to look at how one accesses macOS Location Services and gets the geographic location of the device from Go.

To obtain the geographic location of a device on macOS, we use Apple’s Core Location framework. The framework is part of the OS, but it requires writing Objective-C (or Swift). Thanks to Go’s cgo, and because Objective-C is from the family of C languages, we can write a bridge between Objective-C and Go.

Full disclosure: I haven’t written a line of C for more than 15 years. On top of that, it’s the first time I actually wrote something in Objective-C. The code is for illustrative purposes only. If you can help make it better, please file an issue or a PR to the example repo, or write to me on Twitter.

The code might illustrate the example better than my English ;) Feel free to jump right into the repo on GitHub.

Part 1. Go

Let’s start with the Go code:

// File location_manager_darwin.go

/* #import "location_manager_darwin.h" */
import "C"

// CurrentLocation retrieves location of the device and returns it to the caller.
func CurrentLocation() (Location, error) {
    var cloc C.Location
    if ret := C.get_current_location(&cloc); ret != 0 {
        return Location{}, fmt.Errorf("failed to get location, code %d", ret)
    }

    loc := Location{
        // convert C Location to Go Location...
    }
    return loc, nil
}

Function CurrentLocation calls C function get_current_location, expecting it to either populate the C structure C.Location or return a non-zero error code.

get_current_location is a plain C function that acts as a glue layer between Objective-C and Go. The function and the C.Location type are declared in the header file, which we import with #import "location_manager_darwin.h" in the cgo preamble. We will implement the function later in this note. For now, here is the declaration of the function from the header:

// File location_manager_darwin.h

// Location struct represents the location data
// from Apple's CLLocation object.
typedef struct _Location {
    CLLocationCoordinate2D coordinate;
    double altitude;
    double horizontalAccuracy;
    double verticalAccuracy;
} Location;

int get_current_location(Location *loc);

To compile the code we need two more cgo directives:

// File location_manager_darwin.go

/*
#cgo CFLAGS: -x objective-c -mmacosx-version-min=10.14
#cgo LDFLAGS: -framework CoreLocation -mmacosx-version-min=10.14

#import "location_manager_darwin.h"
*/
import "C"

#cgo CFLAGS and #cgo LDFLAGS pass flags to the C compiler and the linker, respectively. Refer to Go’s documentation about cgo to learn more.

Part 2. Objective-C

As I said earlier, getting the user’s location requires writing some Objective-C.

To get the actual location we need an instance of Apple’s CLLocationManager class, whose delegate (CLLocationManagerDelegate) will receive the location events. To some degree, delegates in Objective-C are similar to interfaces in Go. That is, a delegate is a class that must implement some methods, which will be invoked by the delegate’s owner.
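For readers coming from Go, the analogy can be sketched in plain Go. Everything in this snippet (LocationDelegate, Manager, recorder, errDenied, and the made-up coordinates) is hypothetical and only mimics the shape of the Objective-C API; it doesn’t talk to Core Location.

```go
package main

import (
	"errors"
	"fmt"
)

// LocationDelegate plays the role of CLLocationManagerDelegate:
// a set of methods the owner calls when something happens.
type LocationDelegate interface {
	DidUpdateLocations(locations []string)
	DidFailWithError(err error)
}

// Manager plays the role of CLLocationManager, the delegate's owner.
type Manager struct {
	Delegate LocationDelegate
}

// deliver simulates the OS reporting a result to the manager,
// which forwards it to its delegate.
func (m *Manager) deliver(locations []string, err error) {
	if err != nil {
		m.Delegate.DidFailWithError(err)
		return
	}
	m.Delegate.DidUpdateLocations(locations)
}

// errDenied is a made-up error used by the demo below.
var errDenied = errors.New("location access denied")

// recorder implements LocationDelegate and remembers the last event it received.
type recorder struct {
	last string
	err  error
}

func (r *recorder) DidUpdateLocations(locations []string) {
	if len(locations) > 0 {
		r.last = locations[len(locations)-1]
	}
}

func (r *recorder) DidFailWithError(err error) { r.err = err }

func main() {
	r := &recorder{}
	m := &Manager{Delegate: r}

	m.deliver([]string{"52.0,13.0"}, nil) // made-up coordinates
	fmt.Println("last location:", r.last)

	m.deliver(nil, errDenied)
	fmt.Println("last error:", r.err)
}
```

The main difference from the Objective-C version is that the Go compiler checks that recorder implements every method of the interface.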

To keep things simple, I implemented a class, LocationManager, that’s responsible for instantiating CLLocationManager and, at the same time, is the implementation of the delegate protocol:

// File location_manager_darwin.m

@interface LocationManager : NSObject <CLLocationManagerDelegate> {
    CLLocationManager *manager;
}
@end

@implementation LocationManager

- (id)init {
    self = [super init];

    // create an instance of CLLocationManager
    manager = [[CLLocationManager alloc] init];
    // assign ourself as the delegate of the created instance
    manager.delegate = self;

    return self;
}

- (void)locationManager:(CLLocationManager *)manager didUpdateLocations:(NSArray<CLLocation *> *)locations {
    // implement CLLocationManagerDelegate protocol
}

- (void)locationManager:(CLLocationManager *)manager didFailWithError:(NSError *)error {
    // implement CLLocationManagerDelegate protocol
}

@end

CLLocationManager provides the requestLocation method that, as the name suggests, requests the user’s geolocation. The method is asynchronous; on the first call, the OS’s location service shows a macOS modal dialogue, asking the user whether they allow the application to access their location.

Below, the getCurrentLocation method requests the geolocation and blocks, waiting for the events from the OS’s location services to reach the delegate. The method then returns the location object (CLLocation) to the caller:

// File location_manager_darwin.m

@implementation LocationManager

- (id)init { ··· }

- (CLLocation *)getCurrentLocation {
    // request user’s current location
    [manager requestLocation];

    // start the run loop, waiting for the results from the delegate
    CFRunLoopRun();

    if (_errorCode != 0) {
        return nil;
    }
    // return the most recently retrieved user location
    return manager.location;
}

- (void)locationManager:(CLLocationManager *)manager didUpdateLocations:(NSArray<CLLocation *> *)locations {
    // we got the location: stop the run loop to give back the control to getCurrentLocation
    CFRunLoopStop(CFRunLoopGetCurrent());
}

- (void)locationManager:(CLLocationManager *)manager didFailWithError:(NSError *)error {
    // we failed to get the location: store the error code and
    // stop the run loop to give back the control to getCurrentLocation
    _errorCode = error.code;
    CFRunLoopStop(CFRunLoopGetCurrent());
}

@end


In the comments to the code above, I mentioned the “run loop”. This is a fairly big topic, and I’m definitely not enough of an expert to discuss it. Refer to Apple’s documentation about the CFRunLoopRun function and related topics to get an idea of asynchronous programming with Apple’s Core Foundation framework.

Part 3. C

Because we can’t instantiate and use Objective-C classes from cgo, we need a plain C function to be a bridge between Objective-C and Go. It’s time to look at the implementation of the get_current_location function we discussed in Part 1:

// File location_manager_darwin.m

int get_current_location(Location *loc) {
    // create an instance of our LocationManager class
    LocationManager *lm = [[LocationManager alloc] init];
    // obtain user’s location; the call blocks the thread
    CLLocation *clloc = [lm getCurrentLocation];

    if (lm.errorCode != 0) {
        return (int)lm.errorCode;
    }

    // populate the resulting Location struct from Objective-C object
    loc->coordinate = clloc.coordinate;
    loc->altitude = clloc.altitude;

    return 0;
}

After days of browsing Apple’s documentation, SO and GitHub, I got the results I wanted. This is how we use it in Go code:

package main

import "fmt"

func main() {
    loc, err := CurrentLocation()
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Printf("got location: latitude %f, longitude %f\n", loc.Coordinate.Latitude, loc.Coordinate.Longitude)
}

When running the program in the terminal, macOS shows a popup window asking for permission to access the device’s location. The program then prints the obtained coordinates to the console (I redacted the real coordinates below with “xxxxxx”):

$ go run ./
got location: latitude 52.xxxxxx, longitude 13.xxxxxx

As I mentioned in the beginning, the code example is on GitHub.

If you have any comments or suggestions, reach out to me on Twitter.

Have fun and stay safe.


Building Multi-Platform Docker Images with Travis CI and BuildKit

This is a lengthy note. If you don’t feel like reading and only need the working example, go directly to the Travis CI build file.

The more I delve into the world of Raspberry Pi, the more I notice that “regular and boring” things on ARM are harder than I expected.

People build and distribute software exclusively for amd64. You read another “Kubernetes something” tutorial that went viral on Twitter and fancy trying it out. Still, all the helm charts, or whatever the author preferred, use Docker images built exclusively for amd64.

The Docker toolchain added support for building multi-platform images in 19.x. However, it’s available only in the “experimental” mode. The topic of building multi-platform Docker images still feels underrepresented.

But first, what are multi-platform Docker images?

When a client, e.g. the Docker client, tries to pull an image, it must negotiate the details about what exactly to pull with the registry. The registry provides a manifest that describes the digest of the requested image, the layers the image consists of, the platform this image can run on, etc. Optionally, the registry can provide a manifests list, which, as the name suggests, is a list of several manifests bundled into one. With the manifests list in hand, the client can figure out the particular digest of the image it needs to pull.

So multi-platform Docker images are just several images whose manifests are bundled into a manifests list.

Imagine we want to pull the image golang:1.13.6-alpine3.10. The Docker client will get the manifests list from Docker Hub. This list includes the digests of several images, each built for a particular platform. If we’re on a Raspberry Pi running the current Raspbian Linux, which is arm/v7, the client will pick the corresponding image’s digest. Alternatively, we could choose to pull the image arm32v7/golang:1.13.6-alpine3.10 instead, and we would end up with the same image, with the digest d72fa60fb5b9. Of course, using a single universal image name, i.e. golang, on every platform is way more convenient.
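The selection step can be sketched as follows. This is a simplified illustration, not the Docker client’s code: the Manifest and Platform types keep only the digest and platform fields of a manifests list entry, the digests are placeholders, and real clients apply more forgiving matching rules than the exact comparison used here.

```go
package main

import (
	"fmt"
	"strings"
)

// Platform mirrors the "platform" object of an entry in a manifests list.
type Platform struct {
	OS           string
	Architecture string
	Variant      string // e.g. "v7" for 32-bit ARM; empty for amd64
}

// Manifest is a trimmed-down entry of a manifests list.
type Manifest struct {
	Digest   string
	Platform Platform
}

// parsePlatform splits a string of the form "os/arch[/variant]".
func parsePlatform(s string) (Platform, error) {
	parts := strings.Split(s, "/")
	if len(parts) < 2 || len(parts) > 3 {
		return Platform{}, fmt.Errorf("invalid platform %q", s)
	}
	p := Platform{OS: parts[0], Architecture: parts[1]}
	if len(parts) == 3 {
		p.Variant = parts[2]
	}
	return p, nil
}

// pickDigest returns the digest of the first manifest built for the wanted platform.
func pickDigest(list []Manifest, want Platform) (string, bool) {
	for _, m := range list {
		if m.Platform == want {
			return m.Digest, true
		}
	}
	return "", false
}

func main() {
	// the digests below are placeholders, not real image digests
	list := []Manifest{
		{Digest: "sha256:placeholder-amd64", Platform: Platform{OS: "linux", Architecture: "amd64"}},
		{Digest: "sha256:placeholder-armv7", Platform: Platform{OS: "linux", Architecture: "arm", Variant: "v7"}},
	}

	want, err := parsePlatform("linux/arm/v7")
	if err != nil {
		fmt.Println(err)
		return
	}
	if digest, ok := pickDigest(list, want); ok {
		fmt.Println("pull", digest)
	}
}
```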

You can read more about manifests in Docker registry documentation.

Does it mean I need to build different Docker images for each platform I want to support?

Well, yes. This is how official images are built.

For every platform, the image is built and pushed to the registry under the name <platform>/<image>:<tag>, e.g. amd64/golang:1-alpine. And next, a manifests list that combines all those platform-specific images is built and pushed under the simple name <image>:<tag>.

Docker’s BuildKit provides a toolkit that, among other nice things, allows building multi-platform images on a single host. BuildKit is used inside Docker’s buildx project, which is part of recent Docker versions.

One can use buildx, but, for this post, I wanted to try out what it would look like to use BuildKit directly. For profefe, the system for continuous profiling of Go services, I set up a Travis CI job that builds multi-platform Docker images and pushes them to Docker Hub.

profefe is written in Go. That simplifies things, because, thanks to Go compiler, I don’t have to think about how to compile code for different platforms. The same Dockerfile will work fine on every platform.

Here’s what the “deploy” stage of the build job looks like (see travis.yml on profefe’s GitHub).

dist: bionic

language: go

go:
  - 1.x

jobs:
  include:
    - stage: deploy docker
      services: docker
      env:
        - PLATFORMS="linux/amd64,linux/arm64,linux/arm/v7"
      install:
        - docker container run --rm --privileged multiarch/qemu-user-static --reset -p yes
        - docker container run -d --rm --name buildkitd --privileged moby/buildkit:latest
        - sudo docker container cp buildkitd:/usr/bin/buildctl /usr/local/bin/
        - export BUILDKIT_HOST="docker-container://buildkitd"
      script: skip
      deploy:
        - provider: script
          script: |
            buildctl build \
              --progress=plain \
              --frontend=dockerfile.v0 \
              --local context=. --local dockerfile=. \
              --opt filename=contrib/docker/Dockerfile \
              --opt platform=$PLATFORMS \
              --opt build-arg:VERSION=\"master\" \
              --opt build-arg:GITSHA=\"$TRAVIS_COMMIT\" \
              --output type=image,\"name=profefe/profefe:git-master\",push=true
          on:
            repo: profefe/profefe
            branch: master
      before_deploy:
        - echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin
        - buildctl debug workers ls
        - docker container logs buildkitd

There’s a lot happening here, but I’ll describe the most critical parts.

Let’s start with dist: bionic.

We run the builds under Ubuntu 18.04 (Bionic Beaver). To be able to build multi-platform images on a single amd64 host, BuildKit uses QEMU to emulate other platforms. That requires Linux kernel 4.8 or newer, so even Ubuntu 16.04 (Xenial Xerus) should work.

The top-level details on how the emulation works are very well described in

In short, we tell a component of the kernel (binfmt_misc) to use QEMU when the system executes binaries built for a different platform. The following call in the “install” step does that:

- docker container run --rm --privileged multiarch/qemu-user-static --reset -p yes

Under the hood, the container runs a shell script from the QEMU project that registers the emulator as an executor of binaries built for foreign platforms.

If you think that running a Docker container to manipulate the host’s OS looks weird, well… I can’t agree more. Probably, a better approach would be to install the qemu-user-static package, which would do the proper setup. Unfortunately, the current version of the package for Ubuntu Bionic doesn’t do the registration the way we need it, i.e. its post-install script doesn’t add the "F" flag (“fix binaries”), which is crucial for our goal. Let’s just agree that docker-run will do OK for demonstration purposes.

- docker container run -d --rm --name buildkitd --privileged moby/buildkit:latest
- sudo docker container cp buildkitd:/usr/bin/buildctl /usr/local/bin/
- export BUILDKIT_HOST="docker-container://buildkitd"

This is another “docker-run’ism”. We start BuildKit’s buildkitd daemon inside a container, attaching it to the Docker daemon that runs on the host (“privileged” mode). Next, we copy the buildctl binary from the container to the host system and set the BUILDKIT_HOST environment variable, so buildctl knows where its daemon runs.

Alternatively, we could install BuildKit from GitHub and run the daemon directly on the build host. YOLO.

  - echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin

To be able to push the images to the registry, we need to log in, providing Docker credentials to the host’s Docker daemon. The credentials are set as Travis CI’s encrypted environment variables (refer to the Travis CI docs).

buildctl build \
  --progress=plain \
  --frontend=dockerfile.v0 \
  --local context=. --local dockerfile=. \
  --opt filename=contrib/docker/Dockerfile \
  --opt platform=$PLATFORMS \
  --opt build-arg:VERSION=\"master\" \
  --opt build-arg:GITSHA=\"$TRAVIS_COMMIT\" \
  --output type=image,\"name=profefe/profefe:git-master\",push=true

This is the black box where everything happens. Magically!

We run buildctl stating that it must use the specified Dockerfile; it must build the images for defined platforms (I specified linux/amd64,linux/arm64,linux/arm/v7), create a manifests list tagged as the desired image (profefe/profefe:<version>), and push all the images to the registry.

buildctl debug workers ls shows which platforms BuildKit on this host supports. I listed only those I’m currently interested in.

And that’s all. This setup automatically builds and pushes multi-platform Docker images for profefe on every commit to the project’s “master” branch on GitHub.

As I hope you’ve seen, support for multi-platform is getting easier and things that were hard a year ago are only mildly annoying now :)

If you have any comments or suggestions, reach out to me on Twitter or discuss this note on r/docker Reddit.


k3s with Ubuntu Server (arm64) on Raspberry Pi 4

As I’ve tweeted recently, I’m updating one of my Raspberry Pis to Ubuntu Server 19.10 (arm64).

“One of Raspberry Pis”?

My home cluster is four Raspberry Pi 4s (2GB), all connected to my internet router through ethernet and powered by a 60W charger with six USB ports. Together, the Pis form a small Kubernetes cluster that runs k3s.

All but one of the Pis run Raspbian Buster Lite, and this setup had been working pretty well until I found out that Aerospike, a database I needed to run for a testing lab, only works on a 64-bit OS.

Luckily, Ubuntu Server has an arm64 version built for Raspberry Pi. Thus, my working plan is to switch one Pi to Ubuntu, compile and run a single-instance Aerospike server (and any other components that require a 64-bit OS) on this Pi, and provide a Kubernetes service in front of the DB, so other components in the cluster can access it as if it were fully managed by Kubernetes.

The Setup

Setting up Ubuntu Server on a Pi was smooth. All I did was flash the 19.10 OS image to an SD card, as described in the Ubuntu wiki. The headless setup worked out of the box: after I inserted the card into the Pi and connected it to the router, I managed to SSH into the system:

$ ssh ubuntu@

The default password for ubuntu user is ubuntu. The system asks to change the password on the first login.

The first thing to do after installing the system:

$ sudo apt-get update
$ sudo apt-get upgrade -y

Disable the “message of the day” (motd) to speed up SSH login. For that, I commented out the following lines in /etc/pam.d/login and /etc/pam.d/sshd:

#session    optional     pam_motd.so  motd=/run/motd.dynamic
#session    optional     pam_motd.so  noupdate

Reduce the GPU memory split. I truly don’t know if that even makes sense, tbh; read about the memory split on the Raspberry Pi config-txt wiki. I added the following to /boot/firmware/usercfg.txt:


To run Kubernetes or Docker, the kernel needs some cgroup options. On Ubuntu Server, the configuration is in /boot/firmware/nobtcmd.txt (refer to cmdline=nobtcmd.txt in /boot/firmware/nobtcfg.txt). Add the following to the end of the file:

cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1
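Once the Pi has been rebooted with these options, one way to double-check that the kernel picked them up is to look at /proc/cgroups, which lists every cgroup controller together with an “enabled” column. The small program below is a sketch of that check; the enabledControllers helper is a name made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// enabledControllers parses the content of /proc/cgroups, whose columns are
// subsys_name, hierarchy, num_cgroups and enabled, and reports
// for every controller whether it is enabled.
func enabledControllers(procCgroups string) map[string]bool {
	enabled := make(map[string]bool)
	for _, line := range strings.Split(procCgroups, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and the header
		}
		fields := strings.Fields(line)
		if len(fields) < 4 {
			continue
		}
		enabled[fields[0]] = fields[3] == "1"
	}
	return enabled
}

func main() {
	data, err := os.ReadFile("/proc/cgroups")
	if err != nil {
		fmt.Println("cannot read /proc/cgroups:", err)
		return
	}
	controllers := enabledControllers(string(data))
	for _, name := range []string{"cpuset", "memory"} {
		fmt.Printf("%s enabled: %v\n", name, controllers[name])
	}
}
```

If memory is reported as disabled, the options most likely didn’t end up on the kernel command line; cat /proc/cmdline shows what the kernel actually booted with.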

Reboot the Pi, re-login, and all is ready to install k3s-agent:

$ curl -sfL  | K3S_URL="https://<k3s-master-pi>:6443" K3S_TOKEN="<k3s-token>" sh -

After the agent is installed and running, check that the Pi was added to the Kubernetes cluster:

pi@pi-1:~ $ sudo kubectl get node -o wide
NAME   ROLES    AGE   VERSION         OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
pi-1   master   30d   v1.17.0+k3s.1   Raspbian GNU/Linux 10 (buster)   4.19.75-v7l+        containerd://1.3.0-k3s.5
pi-2   <none>   47h   v1.17.0+k3s.1   Raspbian GNU/Linux 10 (buster)   4.19.75-v7l+        containerd://1.3.0-k3s.5
pi-3   <none>   47h   v1.17.0+k3s.1   Raspbian GNU/Linux 10 (buster)   4.19.75-v7l+        containerd://1.3.0-k3s.5
pi-4   <none>   10h   v1.17.0+k3s.1   Ubuntu 19.10                     5.3.0-1014-raspi2   containerd://1.3.0-k3s.5

That’s it for today. The next step is to figure out how to build Aerospike on arm64, but this is a story for another day.

Update (2020-01-15)

I’ve managed to build and run Aerospike server for arm64! See make-arm64v8 branch in my fork of aerospike-server and the gist with my systemd services and configs.

Update (2020-02-13)

A week ago I tried to install Ubuntu Server 18.04.3 on Pi 4 and didn’t even get to the login shell in the headless mode. Now Ubuntu Server 18.04.4 LTS is out and it works exactly as described in this note:

$ kubectl get node -o wide
NAME   ROLES    AGE     VERSION        OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
pi-1   master   63d     v1.17.2+k3s1   Raspbian GNU/Linux 10 (buster)   4.19.93-v7l+        containerd://1.3.3-k3s1
pi-2   <none>   35d     v1.17.2+k3s1   Raspbian GNU/Linux 10 (buster)   4.19.93-v7l+        containerd://1.3.3-k3s1
pi-3   <none>   14m     v1.17.2+k3s1   Ubuntu 18.04.4 LTS               5.3.0-1017-raspi2   containerd://1.3.3-k3s1
pi-4   <none>   4d23h   v1.17.2+k3s1   Ubuntu 19.10                     5.3.0-1017-raspi2   containerd://1.3.3-k3s1

Update (2020-06-01)

Ubuntu Server 20.04 is the recommended version of Ubuntu for Pi 4. It works perfectly fine:

$ kubectl get node -o wide
NAME   ROLES    AGE     VERSION        OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
pi-1   master   172d    v1.18.2+k3s1   Raspbian GNU/Linux 10 (buster)   4.19.118-v7l+      containerd://1.3.3-k3s2
pi-2   <none>   57d     v1.17.4+k3s1   Raspbian GNU/Linux 10 (buster)   4.19.118-v7l+      containerd://1.3.3-k3s2
pi-3   <none>   38d     v1.17.4+k3s1   Ubuntu 20.04 LTS                 5.4.0-1011-raspi   containerd://1.3.3-k3s2
pi-4   <none>   3d19h   v1.18.2+k3s1   Ubuntu 20.04 LTS                 5.4.0-1011-raspi   containerd://1.3.3-k3s2

Also Raspberry Pi OS (ex-Raspbian) for arm64 is currently in beta.


The Fireside Edition

After listening to the “Fireside edition” of Go Time FM, I asked myself how I would answer the questions the hosts discussed.

Because no one has asked, you are very welcome:

1. If you had two weeks to spend on a personal Go project, what would you work on?

I really want to invest more time in profefe, specifically in implementing an analyser of stored profiles: the thing that would help to make sense of the data, showing how the performance of an instance, a node, or a cluster has changed over a period of time, and how different parts of the codebase have influenced the performance of the application.

Recently Amazon announced the CodeGuru profiler (currently Java-only). From the description, it feels like exactly what I pictured in my head when I started the project.

Another topic I would like to invest more time in is understanding the ecosystem around and inside Kubernetes. During the past two years, I slowed down my consumption of DevOps/SRE topics, mostly due to the specific state of the infrastructure in our company. But “k8s is the new linux”, regardless of one’s opinion on that. Even profefe has recently ended up with kube-profefe (a bridge between profefe and Kubernetes), contributed and maintained by other people.

2. What annoys you about Go of 2019?

The same small things that annoyed me in Go 1.4: var, new, make and “naked return”. Sure, I understand that they all ended up in the language for a reason. But I simply don’t like the “magic” of make, which works only with particular types; the two ways of defining a variable (var or :=), or a pointer to an instance of a type (new or &T{}).
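To make the complaint concrete, here is a small sketch that puts the overlapping forms side by side. The point type and the sum function are made up for illustration only.

```go
package main

import "fmt"

// point is a hypothetical type used only for this illustration.
type point struct{ x, y int }

// sum uses a "naked return": the named result total is returned implicitly.
func sum(nums ...int) (total int) {
	for _, n := range nums {
		total += n
	}
	return
}

func main() {
	// Two ways of defining a variable.
	var a int
	b := 0

	// Two ways of getting a pointer to a zero value of a type.
	p1 := new(point)
	p2 := &point{}

	// make works only with slices, maps and channels.
	s := make([]int, 0, 4)
	m := make(map[string]int)

	fmt.Println(a == b, *p1 == *p2, len(s), len(m), sum(1, 2, 3))
}
```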

One new thing, though: Go modules' semver imports. But I can’t say anything new about that. Probably, I just need to embrace them. Go 1.14 looks like a version where I might completely switch to modules, thanks to better handling of vendored dependencies.

3. What’s your ideal working environment?

That always surprises me. Lots of people keep saying that working from home is their ideal environment, or even a factor that influences their choice of job offers. I don’t like working at home. The only time I feel productive at home is at night. A cafe or a co-working space works sometimes. But I like working in a big office space. I don’t know why.

Of course, open-plan offices can be very different. Yandex’s “Red Rose” is still the best space I have ever worked in. I heard they do excursions around the Moscow office now.

3.1. Something on pair-programming?

Since I wrote about Yandex.

Some people think pair-programming is a sort of super-power. It is, and it isn’t. You can’t just put yourself in an environment where someone is watching how you write code while you try to hold a conversation about the code architecture. Pair-programming is a skill to master. But it pays off.

The pair (trio, actually) programming sessions we did at Yandex, when we worked on bem-core, were the most significant skill boost I had during my five+ years there.

Of course, the positive experience comes from your peers. In my case, they were people with a huge baggage of knowledge and practice of working, talking, debating with each other. Like, out of nowhere, you get the understanding of what types of questions you must ask; when it is important to spend more time on thinking and when you can make a small hack.

4. Your advice to your junior-developer self?

Don’t overthink, and don’t be afraid of starting anything. Trying something by making a raw, dirty, barely-working prototype will give you way more knowledge than thinking about how to do that.

go, go time fm, askme

[]byte to string conversion

Go has an old wiki page, titled “Compiler And Runtime Optimizations”.

The part I like most there is the list of cases where the compiler doesn’t allocate memory for []byte to string conversions:

For a map m of type map[string]T and []byte b, m[string(b)] doesn’t allocate (the temporary string copy of the byte slice isn’t made)

It turned out that, since this wiki page was written, more similar optimisations have been added to the compiler.

As of Go 1.12+, the following cases are also listed in runtime/string.go:

For the case "<" + string(b) + ">", where b is []byte no extra copying of b is needed.

if string(b) == "foo" { ··· }

In the code above, b []byte also won’t be copied.

There are still cases where the compiler can’t optimise the code for us. In some of those cases it’s fine to do the string to bytes conversion using a so-called “unsafe trick” (accessing the string’s underlying data directly, without copying the data from string to bytes and vice versa). One can find several ways of performing the trick, but none of them seems to be “the one that must be used”.

After years of episodic discussions, a colleague of mine assembled a list of the different concerns about the proper way of doing it (see the “unsafe conversion between string <-> []byte” topic on the golang-nuts forum). Thanks to replies from the Go team, our most valid way of doing it is the following:

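// Refer to

import "unsafe"

// stringHeader mirrors the runtime layout of a string.
type stringHeader struct {
	data      unsafe.Pointer
	stringLen int
}

// sliceHeader mirrors the runtime layout of a slice.
type sliceHeader struct {
	data     unsafe.Pointer
	sliceLen int
	sliceCap int
}

// StringToBytes returns a slice that shares memory with s.
// The result must not be modified.
func StringToBytes(s string) (b []byte) {
	strHdr := (*stringHeader)(unsafe.Pointer(&s))
	sliceHdr := (*sliceHeader)(unsafe.Pointer(&b))
	sliceHdr.data = strHdr.data
	sliceHdr.sliceLen = len(s)
	sliceHdr.sliceCap = len(s)
	return b
}

// BytesToString returns a string that shares memory with b.
// b must not be modified while the result is in use.
func BytesToString(b []byte) (s string) {
	sliceHdr := (*sliceHeader)(unsafe.Pointer(&b))
	strHdr := (*stringHeader)(unsafe.Pointer(&s))
	strHdr.data = sliceHdr.data
	strHdr.stringLen = len(b)
	return s
}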
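For readers on Go 1.20 or newer (released after this note was written), the unsafe package gained helpers that express the same idea without hand-written header structs. A minimal sketch, assuming Go 1.20+; the function names bytesToString and stringToBytes are mine, and this is not the code from the golang-nuts thread.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// The caller must not modify b while the string is in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes reinterprets s as a byte slice without copying.
// The returned slice must never be written to.
func stringToBytes(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := []byte("hello")
	s := bytesToString(b)
	fmt.Println(s, len(stringToBytes(s)))
}
```

The same caveats apply as with the header-based version: both values share one block of memory, so writing through the slice breaks the immutability that the rest of the program assumes for strings.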

Github Actions and GOPATH

The other day I received my beta access to GitHub Actions. To try them out I picked an existing pet project and created a workflow using a Go project template provided by GitHub. As of September 2019, their template defines the following sequence of steps: set up Go, check out code, get dependencies, build. This is not exactly how I used to do it.

My project is a classic Go service ;) meaning: it uses vendoring and doesn’t use Go modules. So there is no need for the “get dependencies” step. It also needs to live inside the GOPATH. With that, the provided workflow needed some adjustment.

After some trial and error, I’ve managed to make the checkout step clone the repo into the correct destination inside the GOPATH. Here is the final workflow:

name: Run Go test
on: [pull_request]
jobs:
  test: # job id is a placeholder
    strategy:
      matrix:
        go-version: [1.12.9]

    runs-on: ubuntu-latest

    steps:
      - uses: actions/setup-go@v1
        with:
          go-version: ${{ matrix.go-version }}

      - uses: actions/checkout@v1
        with:
          path: ./src/${{ github.repository }}
          fetch-depth: 5

      - run: make test
        env:
          GOPATH: ${{ runner.workspace }}

Note how actions/checkout@v1 above uses the custom path input parameter. I set the path to ./src/${{ github.repository }}, so the project is checked out to the src directory in the runner’s workspace, which I later pass as the value of GOPATH to the “make test” step. The leading dot in ./src seems very important (I spent the majority of the time trying to figure out that part); refer to this issue.

See the workflow in action.

To learn more about those ${{ ··· }} “macros” I suggest looking at the Actions' “Contexts and expression syntax” documentation.

go, github actions

Go's net/http.Header

One probably knows that net/http.Header is no more than map[string][]string with some extra methods. A usual way to initialise and populate such a data structure from an external representation is something like this:

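type Header map[string][]string

// Add appends val to the values stored under key, skipping empty values.
func (h Header) Add(key, val string) {
    if val == "" {
        return
    }
    h[key] = append(h[key], val)
}

func main() {
    h := make(Header)
    // the header values below are placeholders
    h.Add("Host", "example.com")
    h.Add("Via", "a.example.com")
    h.Add("Via", "b.example.com")
}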

From the code above, one can notice that we allocate a new slice of strings for every unique key that we add to the headers. For things like HTTP headers, which are automatically parsed for every incoming request, this bunch of tiny allocations is something we’d like to avoid.

I was curious to know if Go’s standard library cares about that.

Looking at the implementation of net/textproto.Reader.ReadMIMEHeader(), which is used in the standard HTTP server, or Go 1.13’s new net/http.Header.Clone(), it turned out they solve the problem quite elegantly.

We know that in the majority of cases, HTTP headers are immutable key-value pairs, where most of the keys have a single value. Instead of allocating a separate slice for every unique key, Go pre-allocates one contiguous slice for the values and gives each key a sub-slice of it.

Knowing that, we can refactor the initial Header.Add as follows:

type Header map[string][]string

// add stores val under key and returns the unused tail of vv,
// the shared backing slice for single-value headers.
func (h Header) add(vv []string, key, val string) []string {
    if val == "" { ··· }

    // fast path for KV pair of a single value
    if h[key] == nil {
        vv = append(vv, val)
        h[key] = vv[:1:1]
        return vv[1:]
    }

    // slow path, when KV pair has two or more values
    h[key] = append(h[key], val)
    return vv
}

func main() {
    h := make(Header)
    // net/textproto pre-counts total number of request's headers
    // to allocate the slice of known capacity
    vv := make([]string, 0, 2) // two headers are added below

    // the header values below are placeholders
    vv = h.add(vv, "Host", "example.com")
    vv = h.add(vv, "Via", "a.example.com")
}

Note that we use vv[:1:1] to create a subslice of the fixed capacity (length 1, capacity 1).
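Why the capacity limit matters is easy to miss, so here is a tiny sketch of what happens with and without it. The function secondAfterAppend is a made-up helper: it appends to a one-element subslice and reports what ends up in the neighbouring element of the backing slice.

```go
package main

import "fmt"

// secondAfterAppend takes the first element of a two-element backing slice
// as a subslice, appends "x" to that subslice, and returns backing[1].
func secondAfterAppend(limitCap bool) string {
	backing := []string{"a", "b"}

	var first []string
	if limitCap {
		first = backing[:1:1] // length 1, capacity 1
	} else {
		first = backing[:1] // length 1, capacity 2
	}

	// With capacity 1, append must allocate a new array.
	// With capacity 2, append writes into backing[1].
	first = append(first, "x")

	return backing[1]
}

func main() {
	fmt.Println(secondAfterAppend(true))  // neighbour is untouched: b
	fmt.Println(secondAfterAppend(false)) // neighbour is overwritten: x
}
```

In the header case the neighbour is the value of another key, so without the third index a second “Via” value could silently overwrite some other header.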

If there is a KV pair that has several values, e.g. the “Via” header, add will allocate a separate slice for that key, doubling its capacity.


Hello World

Let’s create a blog. But let’s call them “notes”.

Because sometimes there are thoughts I want to share with you. Some of them might even be larger than a tweet.