Dev Diary 12: Koto August Progress Report

September 6, 2021

In this development diary, I provide an update on progress made on Koto in the month of August 2021 and an extensive debrief on my experience with GTK4. In Dev Diary 11, I listed the following goals for Koto development over the course of August:
  1. Finish up our first pass on the audiobook user experience. This includes presenting playback state info and clickable genre buttons in the library view for filtering, alongside that "all" button shown in the mockup.
  2. Implement our 10-band equalizer and playback speed controls.
  3. Switch our indexer logic from using the readdir Linux syscall to using GLib / GIO functionality, and fix weird threading issues by introducing mutex locks on our Cartographer HashTable. This should improve the reliability of our indexing during initial startup and help us down the road.
Murphy's Law was at it again when it came to development this month, so not all the items above were addressed. However, work on various "big ticket" items made considerable progress (or, in the case of others, reached completion), so overall I am fairly happy with the work done so far. While the updates on Koto will be much shorter than usual (I am on vacation at the moment and trying to enjoy the last few days of it), do not let the brevity fool you: tons of work was involved, tons of bugs were crushed, and I did die inside towards the end as I dealt with even more issues around GTK4. Nevertheless, let us get started!

#Playback Speed Controls

The first item I wanted to work on was the playback speed controls. While less useful when listening to music, there may be cases where you will want to speed up an audiobook or a podcast. To accomplish this, most applications (such as podcast clients like Pocket Casts) implement an input control that allows you to type in the speed you want. Want it to go half speed? 0.5. Want it to be a tad faster but not make folks sound like chipmunks? 2.5. Though 1.10 to 1.25 are typically the sweet spots.

Our input control allows a range from 0.5 all the way up to 2.5, with any values outside those bounds being rejected and reset back to 1.0. Changing the speed creates a new playback seek event that does not actually change your position but solely changes the speed (which is a parameter of that seek event).

Once this work was complete, I began implementing functionality for jumping backwards and forwards by N seconds in the current track. This is incredibly useful for skipping all those Dollar Shave Club, Squarespace, Ridge Wallet, yada yada yada ads that basically every podcast has these days. You can also use it to skip the otherwise boring dialog in audiobooks. The backwards and forwards skip increments are configurable, with both having a minimum of 5 seconds and a maximum of 1 minute 30 seconds. Not sure why you would need to go that far ahead, but it costs nothing to change literally two numbers, so I figured I would keep those two extremes just in case. By default, the backwards skip increment is 10 seconds and the forwards skip increment is 30 seconds. At the moment, this is configurable solely through the config.toml file, however once work begins on the settings UX, it will get added there.
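To make the bounds described above concrete, here is a minimal sketch in plain C of the validation involved. It is not Koto's actual code: the function names are hypothetical, and two behaviours are my assumptions rather than anything stated above, namely that an out-of-range skip increment falls back to a supplied default and that a skip target is clamped to the start or end of the track. The sanitized rate is the value that would then be handed to the seek event.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds and defaults as described in the text. */
#define RATE_MIN 0.5
#define RATE_MAX 2.5
#define RATE_DEFAULT 1.0
#define SKIP_MIN_SECONDS 5
#define SKIP_MAX_SECONDS 90 /* 1 minute 30 seconds */
#define SKIP_BACK_DEFAULT_SECONDS 10
#define SKIP_FORWARD_DEFAULT_SECONDS 30

/* Hypothetical helper: rates outside 0.5..2.5 (or NaN) are rejected and
 * reset to 1.0. The result would be used as the rate of the seek event. */
double sanitize_playback_rate(double requested) {
    if (!(requested >= RATE_MIN && requested <= RATE_MAX)) {
        return RATE_DEFAULT;
    }
    return requested;
}

/* Hypothetical helper: increments outside 5..90 seconds fall back to the
 * given default. The fallback behaviour is an assumption. */
int sanitize_skip_increment(int requested, int fallback) {
    if (requested < SKIP_MIN_SECONDS || requested > SKIP_MAX_SECONDS) {
        return fallback;
    }
    return requested;
}

/* Hypothetical helper: compute the position after a skip. A negative
 * offset skips backwards. Clamping to [0, duration] is an assumption. */
int64_t skip_target_seconds(int64_t position, int64_t duration, int offset_seconds) {
    assert(duration >= 0);
    int64_t target = position + offset_seconds;
    if (target < 0) {
        return 0;
    }
    if (target > duration) {
        return duration;
    }
    return target;
}
```

With these helpers, a requested speed of 3.0 would come back as 1.0, and skipping back 10 seconds from the 5 second mark would land at the start of the track rather than at a negative position.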

#Equalizer

One of the biggest things I wanted to add to Koto was a proper 10-band equalizer, with profiles and the ability to set a preferred default. In the future, I want to expand this to content-type-specific equalizer defaults. For example, you may listen to a bunch of EDM and prefer an equalizer profile with a bunch of bass, but want to automatically switch to a profile that boosts the specific frequencies of the human voice for podcasts and audiobooks. Providing that level of granularity is something I want to accommodate.

This functionality uses the equalizer-10band plugin, with each of our sliders communicating through our pipeline to one of the ten band properties exposed by the plugin, the first being band0 and the last band9. These properties trivialize profile switching, since we can just load our band values from a fixed-length list, changing the decibel values where necessary.

To keep our gstreamer-related codebase simple, rather than removing the equalizer from the pipeline, our functionality just sets all the band properties to 0 (as in 0 dB), normalizing all the frequencies. Going this route avoids the otherwise complex code that would be needed to track specific state changes related to a gstreamer "bin" to know when we have flushed the entire pipeline of the track content, dynamically remove or add the gstreamer "pad" for the equalizer pipeline, and change state again. I may go down that route in the future, however right now it is overkill, especially since the only other plugin I expect to leverage before the final release of Koto is the "removesilence" plugin, which has a specific property for enabling and disabling it. The equalizer-10band plugin would be the only plugin needing all the complex logic.
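As a rough illustration of why the band0 through band9 properties make profile switching simple, here is a sketch in plain C. It is not Koto's code: the type and function names are hypothetical, and the setter callback is a stand-in for whatever actually sets the property on the equalizer element (in a GStreamer application that would presumably be a g_object_set call on the element). The recorder at the end exists only so the sketch can be exercised without GStreamer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define EQ_BAND_COUNT 10

/* A profile is a fixed-length list of gains in dB, one per band. */
typedef struct {
    const char *name;
    double gains_db[EQ_BAND_COUNT];
} EqProfile;

/* Stand-in for the call that sets one band property on the equalizer.
 * A real setter might do: g_object_set(equalizer, property, gain_db, NULL); */
typedef void (*BandSetter)(const char *property, double gain_db, void *user_data);

/* Write "band0".."band9" into out. Returns 0 on success, -1 if the band
 * index is out of range or the buffer is too small. */
int eq_band_property_name(int band, char *out, size_t out_len) {
    assert(out != NULL);
    if (band < 0 || band >= EQ_BAND_COUNT) {
        return -1;
    }
    int written = snprintf(out, out_len, "band%d", band);
    return (written > 0 && (size_t) written < out_len) ? 0 : -1;
}

/* Switching profiles is just walking the list and setting each band. */
void eq_apply_profile(const EqProfile *profile, BandSetter set_band, void *user_data) {
    assert(profile != NULL && set_band != NULL);
    char property[8];
    for (int band = 0; band < EQ_BAND_COUNT; band++) {
        if (eq_band_property_name(band, property, sizeof property) == 0) {
            set_band(property, profile->gains_db[band], user_data);
        }
    }
}

/* "Disabling" the equalizer: every band back to 0 dB, element stays in place. */
void eq_flatten(BandSetter set_band, void *user_data) {
    const EqProfile flat = { "flat", { 0 } };
    eq_apply_profile(&flat, set_band, user_data);
}

/* Example setter that only records what it was asked to set. */
typedef struct {
    double gains_db[EQ_BAND_COUNT];
    int calls;
} EqRecorder;

void eq_record_band(const char *property, double gain_db, void *user_data) {
    EqRecorder *recorder = user_data;
    int band = property[4] - '0'; /* "bandN" with a single digit N */
    recorder->gains_db[band] = gain_db;
    recorder->calls++;
}
```

The point of the design is that the equalizer never leaves the pipeline: a bass-heavy profile and the flat 0 dB state go through exactly the same code path, just with different values in the list.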

#GTK4 Woes, Disappointments, and Alternatives

If you have caught any of my streams of Koto development, which have historically happened Tuesdays and Thursdays on my Twitch channel, you would know that I have had to deal with my fair share of GTK4 bugs. I have elaborated on some of them in the past, however months of dealing with them with no headway from upstream has led me down a path of constant frustration with the toolkit and upstream itself. To provide some concrete pain points, I will break down some of the issues I have had.

#The Now

#Sub-classing

Previously, you were able to "sub-class" various Gtk widgets such as the GtkHeaderBar. Sub-classing allows you to more-or-less implement your own widget / logic on top of an existing one, leveraging its properties and signals to simplify your own widget or application logic. While this is less useful in the case of a GtkHeaderBar, GtkListBox was one of the widgets I wanted to sub-class for my KotoTrackList as well as the Koto artist listing. Unfortunately, in GTK4 these widgets were marked as "final", meaning they cannot be sub-classed, and there have been no changes upstream in GTK4 to allow it again. Instead, you need your own class that sub-classes GObject + GObjectClass, holds a pointer to a GtkHeaderBar or GtkListBox, and makes changes to the widget referenced by the pointer. So still doable, but unnecessarily cumbersome. I have been doing this for various classes in Koto so far.

#Deprecated X11 APIs

One of the biggest reasons I haven't moved Budgie Desktop View over to GTK4 is the deprecation of numerous X11 APIs. These removals were done due to GNOME's belief that the window manager should be the only piece of software controlling the positioning of the window, and that these APIs would for that reason not be Wayland friendly. Since GNOME's focus has been promoting and pushing Wayland support, they have done so at the cost of support not just for X11 but also for macOS and Windows. I will get into more of the "cost" of the lack of focus on X11 support in a bit.

Budgie Desktop View uses the monitors-changed and size-changed signals in GdkScreen and GdkX11Screen respectively to know when the number of our monitors has changed, the primary monitor has changed, or the size of a monitor has changed.
This allows us to perform calculations for our GtkFlowBox layout to eliminate the possibility of undesired item overflow, hiding items where necessary, and ensuring Budgie Desktop View is always positioned starting at 0,0 (top left) of the primary monitor. For Koto, the desire would be to use these APIs to know when to adjust our default optimal size for Koto in a windowed state (neither maximized nor minimized), as well as APIs for centering Koto on the primary monitor. Since none of those APIs exist anymore, it simply is not possible.

Applications lose these capabilities for the simple reason that GNOME no longer wants to provide them, not for any real technical reason. They could have put all of those APIs behind an X11 namespace or OS-specific ones. They opted not to. Linux is not the only thing affected. All platforms are. Instead of having simple GTK functions to set the positioning, now everyone has to write their own APIs that interface directly with X11 APIs, or Windows APIs like SetWindowPos as another example, in order to achieve this. This makes the lives of developers that entrusted Gtk to be a solid cross-platform toolkit unnecessarily difficult, results in more duplicate code everywhere, and increases the maintenance burden where there should not be one. This is already on top of all the changes they need to make moving from GTK3 to GTK4.

#More Major Issues

A couple of major issues that I have had to deal with just in the development of Koto, not to even get into the issues other application developers have run into, have been related to GtkListView and GtkPopover. I have talked about this in the past, however I have been dealing with issues related to both of these widgets for so long (as they are both fundamental to Koto's UX) that my frustration with their magnitude and with the lack of prioritization by upstream is basically at a boiling point.
When it comes to the GtkListView, scrolling has been broken for over a year now, where the scroll position will suddenly jump to the end of the list. It happens whenever the GtkListView reaches the point where items start getting "recycled", and it completely breaks scrolling in even semi-long lists (like a list of artists). Allegedly another GNOME developer has this on their "to-do"...for months now.

Another more significant issue, at least in my case, causes your current workflow to completely grind to a halt and nowadays can even result in a session loss. This is the issue with GtkPopovers in GTK4 under X11 taking keyboard and mouse input and not giving them up even when the window is not in focus. This has been an issue for 8 months now and one I have run into a fair bit when streaming Koto development, particularly in sections of the codebase like the Koto Equalizer that reside inside a GtkPopover. If you have a popover open and switch to another window, that popover will still be in focus and you cannot interact with any part of the window you are intending to. This has been a pretty common annoyance when jumping between Koto and writing code in Visual Studio Code during my stream, but it can conceivably happen with any GTK4-based application that uses a popover. To work around the issue, you have to explicitly close the popover and then switch back to the application you wanted to interact with originally.

In older releases of GTK4, the Koto window would effectively "disappear" when segfaulting, with the popover remaining in the foreground and still taking keyboard and mouse input. To get around this and allow me to continue streaming development without having to always jump into TTY, kill the process, and disrupt my session, I started running Koto via gdb with a custom runner in Visual Studio Code.
This was enough in those specific scenarios, but now I am back at the point where, regardless, the popover remains. Not only do I have to swap to TTY to kill the process, but killing it and swapping back results in the Xorg server segfaulting. In my last Koto stream, this happened three times, to the point that I simply gave up and started messing with iced+Rust.

Honestly, I am just so tired of GTK4 and these sorts of issues, and the constant attempts by GTK developers to wash their hands of responsibility for these issues and try to push them off to Mutter, Mesa, or X11. By their own admission, the "client-side popover implementation for X needs some improvements" and yet 8 months after the release of GTK4 as a "stable" new version of the toolkit, none of those improvements have come down the pipeline. Instead the focus has been on libadwaita and preparing to strip out even more from GTK with GTK5.

#The Future with GTK

As many of you know, I have been a strong proponent of building desktop-focused Linux applications that:
  1. Do not leverage technologies that risk negatively impacting the desktop experience.
  2. Provide a set of sane defaults out-of-the-box (like Solus does) while providing you the customization / flexibility you want and expect.
  3. Integrate well into your desktop environment of choice, or at the very least attempt to do so.
This view started to come into conflict when GNOME applications began adopting libhandy, a library designed to "help with developing UI for mobile devices using GTK/GNOME" (per their own explanation). The introduction of libhandy into various GNOME applications resulted in regressions across a wide range of their applications or those under their GitLab, from Geary to GNOME Control Center. This was all in the name of making these applications scale from mobile to desktop. While a noble goal, the pursuit of one always means compromising the experience of the other. Make the application too simple and you do not leverage the screen real estate of desktops / laptops. Make it too complex and it becomes a labyrinth of buttons, menus, sub-menus, etc. for mobile users. Personally, if I were to build an application designed to be run on mobile, it would only run on mobile, with a completely different view for the desktop. Not a shared UX.

However, make no mistake, libhandy offered a lot of promising things on top of GTK3, such as their HdyPreferencesGroup/Page/Row/Window, HdySqueezer and HdyViewSwitcher widgets. I elaborated on this in my first blog post about Koto: the improvements made via these widgets would have made perfect sense in GTK4, since they would only build on the existing widgets and expand their capabilities. Instead, in the name of reducing ABI breakage (which was already considerable to begin with), the GNOME developers opted to just start working on their own widget library on top of GTK4 called libadwaita. Except unlike libhandy, which you could still theme using stylesheets, GNOME has taken a completely different direction with libadwaita that puts it in direct conflict not just with Solus and our belief in shipping a consistent user experience across all desktop applications and the Budgie environment, but with others in the Linux space as well.
Recently, a System76 engineer began testing libadwaita applications only to discover they do not respect the GTK theme at all, but rather force libadwaita and do not even respect the dark theme choice. This System76 engineer reasonably expressed their frustration, as it impacts their desktop experience with Pop!_OS, and has been met with only hostility and getting shot down by GNOME developers. This is a conscious decision by GNOME developers that makes it so that if you want to adopt libadwaita for an application that honors the GNOME Human Interface Guidelines, you have to use the Adwaita theme. Yes, I recognize it is in the name. However, if you want to support and adopt the GNOME HIG via libadwaita but still offer third parties the flexibility to make the app integrate well in their ecosystem, you cannot.

You will not be able to use a separate stylesheet. Various GNOME developers have consistently said that this is a "hack", and instead of allowing system-wide theming and improving the APIs around it, they are going to only provide a recoloring API that allows specific aspects of the app to be recolored. This is application specific, does not apply to the entire desktop, and GNOME developers do not agree with the notion of it applying to the entire desktop.

This introduces a significant regression in the desktop Linux space. The intent going forward in the GTK space is to require everybody to basically implement their own version of a libadwaita. elementary would have their own styling via Granite, GNOME developers would have libadwaita, and the likes of System76 and Solus are expected to implement their own. This means you will have a bunch of applications that look different, some using dark themes and some not. Your desktop theme would no longer apply consistently and applications will stand out like a sore thumb, all because a few "Don't Theme My App" developers are getting their way.
For GTK5, the current proposals are to leave theme selection to platform libraries and remove the prefer-dark setting so dark theme preferences are done via platform libraries as well. If you are a desktop Linux user and want to have a consistent experience across all your apps, good luck with that once GNOME apps start adopting libadwaita and GTK5 becomes a thing. If you are building a desktop Linux application at this point and you want to provide flexibility for your users or allow operating systems to ship your app, with users being happy with how well it integrates with the rest of their Linux experience, using GTK4 and beyond is going to be shooting yourself in the foot. Do not do it.

#The Grand Experiment

When I first started Koto development, I did so by experimenting with both EFL (Enlightenment Foundation Libraries) and GTK in C. My decision to go with GTK over EFL boiled down to two items:
  1. The styling and positioning of widgets in EFL is done with their own declarative syntax rather than CSS, which would get compiled into an EDJ file and used in the application or a specific component.
  2. I wanted to follow the GNOME HIG more closely, such as popovers and headerbars. This obviously is easier when using GTK rather than EFL.
However, as I have worked on Koto and dealt with issue after issue with GTK+C, I have become increasingly frustrated with the state of GTK, the prioritization of GNOME, and the outright anti-user behavior GNOME has exhibited. I cannot in good conscience support this behavior by GNOME or their vision of how their software ecosystem and platform should function going into the future. I do not want to build applications using GTK anymore, and building the next generation of Budgie and Solus applications in it is simply suicidal. Experimenting (or re-experimenting) with alternatives has been paramount for me. My grand experiment with GTK4 is over.

On August 1st, I started a deep dive into the Rust handbook over a couple of streams to obtain an up-to-date opinion on Rust. I had experimented with the language years prior and was not the biggest fan, however I was genuinely surprised when returning to it that I actually enjoyed many of the features it offered and its comprehensive standard library. It was further on my radar because I have been using a graphical application called Ajour for managing World of Warcraft add-ons. This application is built using Rust and a toolkit called iced. While iced shows a lot of promise, at this moment in its early days it is cumbersome to implement your own custom widgets, since you have to deal with the rendering / most of the drawing yourself. Additionally, you cannot overlap widgets due to its lack of Layers. This means "basic" functionality like a popup volume slider, equalizers, etc. that we enjoy in Koto now would not be feasible. This is obviously disheartening, however I am really excited to see where iced goes from here, and will be following it closely.

When it comes to Qt, I simply am not a fan of C++. I absolutely recognize that there are many language bindings, however the reality is that the bindings for the languages I would likely write in, Rust or Go, are not active.
Expanding on this, the history between Qt, with its commercial licensing, and the open source community plus KDE has made me hesitant to adopt it for an application even if the bindings were actively developed. Obviously using Electron is out of the question, even if that would open the door to using whatever web frameworks I want. I can't stand Node.js and most modern JavaScript tooling. This basically leaves two options in my eyes:
  1. EFL
  2. A brand new toolkit
When it comes to EFL, I need to do a deep dive into using it, as well as investigating how much work it would be to develop Rust bindings. Hell, maybe I could even convince some System76 developers to get involved too. The existing Rust bindings haven't been touched in 7 years, are for an incredibly old version of EFL, and are not reflective of the current state of the "unified EFL APIs" or even the "legacy EFL APIs". While we would not be using Rust for Solus things, I do want it for personal projects, as I prefer the memory safety and language features Rust provides.

When it comes to a "brand new toolkit", this is something both Beatrice (Technical Lead of Solus) and I have been quite clear we are open to pursuing. Obviously it is not as easy as "just" writing a toolkit, however it would maximize our flexibility and we could work on ensuring that we are providing a framework that other Linux application developers could use with confidence. In my opinion (not speaking on behalf of Solus), this should even be separate from Solus itself, and instead be in a separate working group or neutral organization that is responsible for it. We have seen how much of a mistake it is for GNOME to be the main leads for GTK. We have seen how much of a headache Qt can be with its commercial licensing and corporate backing through various companies over the years.

I started work on my "Modern Desktop Initiative" a long time ago to pursue technologies focused on desktop Linux computing and foster its growth. Unfortunately, due to time constraints I have not been able to put as much time into it as I would like, however it may end up being a sensible neutral ground for various parties to engage in the future. Either way, in my opinion the future of desktop Linux application and environment development is not with GNOME and GTK. I know that opens the door to a lot of questions, questions that will be answered in time.
I was extremely hopeful about GTK4 when it was first released back in December of 2020. Unfortunately, it has not panned out the way I had hoped. GNOME's actions have not given me any confidence in it. Going forward, the development of the GTK4-based Koto is officially on hold. Instead my focus will be on:
  1. Experimenting with EFL+C.
  2. Exploring the viability of up-to-date Rust bindings for EFL.
  3. Continuing to tinker with Rust+iced in the hopes of making headway on various aspects of application development with it, and engaging with some Rust developers on ways to make headway on current pain points with iced development.
This will allow me to pursue all of the most sensible options for myself, explore options on behalf of Solus, and hopefully improve the development experience for others along the way. Beatrice and I have spoken about having an intermediate format and tooling to generate the EDJ format before handing it off to their processor, so this may be a good opportunity for me to start working on that. The experience will also help inform us (Solus) on the next steps for Budgie and various software developed by Solus. I am unsure how much of this will be streamed development, however I am committed to continuing to actively write about these adventures. Whenever I stream, it will be announced on my Twitter and Fosstodon, so I encourage you to follow me there. Hopefully we will see a Koto in another toolkit sooner rather than later!