July 2020 Status Update

July is the hottest month in Montreal every year. The heat wave that hits you when you walk outside makes you wonder if you are on some tropical island; it sure doesn't feel like living at 45 degrees north latitude. Last month was a rapid leap towards implementing wayland objects, and now I have just hit the wall of the xdg-shell protocol. Today I'd like to talk about what it is like to implement a wayland protocol.

There are really two parts to the story: the client and the compositor. Here I will only elaborate on the compositor side. Protocols are usually easier to use on the client side (and they should be designed that way), so the compositor takes most of the workload. That is fine, since there are far more clients than compositors.

A wayland protocol is exposed as a wl_global, so the first step is mostly the same for every protocol: create the global and give it a bind function.

wl_global_create(display, interface, version, data, binding_functions).

void
binding_functions(struct wl_client *client, void *data, uint32_t version, 
                  uint32_t id)
{...}

From this point on, you create a wl_resource object and work with its interface, which is a list of function callbacks, one per client request. Then you will probably run into one of two scenarios.

1. You find out the compositor already has most of the required functions, so implementing the protocol is a natural extension.
2. The compositor does not have the functions, so you have to implement the protocol interface first and hook it up to the compositor later.

As for my case with taiwins, most of the time I ran into scenario 2. Simple protocols like wp_viewporter are fine; they usually work like getter and setter functions, so you know you probably will not make many mistakes. Depending on the number of client requests, the workload could be one to a few hours. Complex protocols like xdg_shell, on the other hand, are usually compound: they may contain a few sub-protocols that interact with each other. This is when things get a bit tricky. You can implement the interface as far as you can go, but you are never sure how correct the implementation is. Take xdg_surface in xdg_shell, for instance. It can turn into either an xdg_popup or an xdg_toplevel. Then an xdg_toplevel can generate an xdg_popup, the xdg_popup can later start a grab, and you have to maintain the popup states. The implementation is easily vulnerable to bugs.

One particularly tricky thing is the object dependencies between wl_resources. The destruction of objects does not follow the order of their creation, so be careful: you can easily end up with a dangling pointer. The example here is xdg_surface and wl_surface. You need a wl_surface to create an xdg_surface, but the wl_surface could be destroyed before the xdg_surface. I've been bitten a few times already, and it is surely an annoying thing.
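One common way to guard against this kind of dangling pointer is to let the dependent object listen for the destruction of the object it depends on; in libwayland that is what wl_resource_add_destroy_listener is for. The sketch below is a minimal, self-contained model of that idea in plain C. It is not taiwins code and it does not use the libwayland API: struct surface, struct shell_surface, struct destroy_listener and all function names are hypothetical stand-ins for wl_surface, xdg_surface and wl_listener.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for wl_listener: a callback node that an object
 * invokes when it is destroyed. */
struct destroy_listener {
	void (*notify)(struct destroy_listener *listener);
	struct destroy_listener *next;
};

/* Hypothetical stand-in for a wl_surface resource. */
struct surface {
	struct destroy_listener *listeners; /* singly linked list */
};

/* Hypothetical stand-in for an xdg_surface built on top of a surface. */
struct shell_surface {
	struct surface *surface; /* NULL once the surface is gone */
	struct destroy_listener on_surface_destroy;
};

static struct surface *
surface_create(void)
{
	return calloc(1, sizeof(struct surface));
}

/* Called by the surface while it is being destroyed. */
static void
handle_surface_destroy(struct destroy_listener *listener)
{
	/* Recover the enclosing shell_surface from the embedded listener,
	 * the same trick wl_container_of() performs in libwayland. */
	struct shell_surface *shell = (struct shell_surface *)
		((char *)listener -
		 offsetof(struct shell_surface, on_surface_destroy));
	shell->surface = NULL;
}

static void
shell_surface_init(struct shell_surface *shell, struct surface *surface)
{
	assert(surface != NULL);
	shell->surface = surface;
	shell->on_surface_destroy.notify = handle_surface_destroy;
	/* Push our listener on the front of the surface's list. */
	shell->on_surface_destroy.next = surface->listeners;
	surface->listeners = &shell->on_surface_destroy;
}

/* The usual order: the shell surface goes away first, so it must unlink
 * its listener or the surface would later call into freed memory. */
static void
shell_surface_release(struct shell_surface *shell)
{
	struct destroy_listener **p;

	if (!shell->surface)
		return; /* surface already destroyed, nothing to unlink */
	p = &shell->surface->listeners;
	while (*p && *p != &shell->on_surface_destroy)
		p = &(*p)->next;
	if (*p)
		*p = shell->on_surface_destroy.next;
	shell->surface = NULL;
}

/* The unusual order: the surface goes away first and tells everyone. */
static void
surface_destroy(struct surface *surface)
{
	struct destroy_listener *l = surface->listeners;

	while (l) {
		struct destroy_listener *next = l->next;
		l->notify(l);
		l = next;
	}
	free(surface);
}

/* Every request handler checks the pointer before using it.
 * Returns 0 on success, -1 if the underlying surface is gone. */
static int
shell_surface_commit(struct shell_surface *shell)
{
	if (!shell->surface)
		return -1;
	return 0;
}
```

With this in place, calling surface_destroy() on a surface that still has a shell_surface attached clears shell->surface, and a later shell_surface_commit() returns -1 instead of touching freed memory. The cost is that every handler has to tolerate a NULL surface, which is the bookkeeping the paragraph above complains about.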

For xdg_shell, it took me a day or two to roughly fill in the request callbacks; it was like walking on thin air. Then it took another day to actually implement the compositor functions needed for testing. The day after, I could finally grab a few real-life applications for debugging, and it crashed… That is how it goes with complex protocols: the stack of implementation is really long before actual tests can happen.

I sincerely hope next time I can complain less.