Today's Dev #8 | Joshua Strobl

I will get it right out of the way that in terms of development, I didn't really do much today. Today's topic was more about prospective future development around AI / LLMs. With Slush 2024 currently ongoing, I wanted to take the opportunity after a long day of work to go visit the fine folks over at Jolla and Venho. As someone who has been a fan of Jolla and Sailfish OS for a long time, originally owning the Jolla One and now the Jolla C2 (which I talked about my plans for developing on in my Dev Diary 13: Koto - The Resurrection, and which I intend on switching to once 4G is sorted), I have been very interested in seeing first hand their line of thinking behind an AI-first, local-first, privacy-first computer.

If none of that sounds like your thing, that's perfectly fine. You can check out of this post and come back tomorrow. For me, however, local-first LLM is something I have tinkered with: using ollama, leveraging that in the Zed editor (instead of relying on Copilot), and starting to look into solutions for local agentic RAG like LangGraph. The AMD hardware in my MiniPC isn't quite powerful enough to run a local LLM at any sort of reasonable speed, so my goal was to shift that over to ipex-llm with my Intel Arc GPU connected over OCuLink for GPU-accelerated processing.

Well, instead of having to build all of that myself, maintain it, and worry about a kernel update possibly borking my Intel GPU detection for some reason (cough, a recent Fedora update seems to have done that, cough), what if I could just have a box that sits in my living room and does all the agentic things for me without a whole lot of fussing on my part? What if my data could sit on device instead of being collected and farmed by Google or Microsoft? Well, that sounds pretty exciting to me. This is what Jolla (as the hardware provider) and Venho (software) have set out to accomplish with their Jolla Mind2, which is currently up for pre-order.
I wasn't going to just preorder a Jolla Mind2 without actually checking it out myself. First I wanted to hear more concrete plans and key objectives for the device and, more importantly, the software. So after having left work late, I trudged through the frankly miserable weather (snow, slush, rain, wind) and suffered the drunk Finn sitting next to me on the tram to go check it out.

I started out being given a guided tour of the current design for the home view (I don't know what they call it, it's just what I'm calling it), followed by their Messaging center. The intent behind the home view seems to be to give you quick access to specific contexts and collections of content. Some examples from their Messaging center and the discussions I had with the team were:

Upon connecting your email account (IMAP / SMTP, other methods later), it would be able to pull your emails down and process them locally. If there are emails you need to reply to, it'll provide summaries and write responses. They made it very clear that it will not send the email on your behalf; you still need to review it before it sends. This isn't the movie Her, we know how that one ends.
If there are documents that need signing, their "signing agent" can do that for you. That is enticing as someone that always has to find their damn scanned image of their signature on the rare occasion something needs to be signed! I just hope they integrate Docusign somehow.
The assistant would be capable, via their Calendar agent, of determining whether suggested dates & times (e.g. from the Message and Email agents) would work for you, suggesting additions to your calendar, proposing reschedules, etc.
You could store various media on device to help facilitate multi-modal scenarios. Right now their software is focused on text-driven modalities (with an 8B model that I suspect is probably Llama), so time will tell how they evolve their software to support other modalities like image and video.
You could maintain separate contexts, like having one for personal and another for work.
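
To make the "it drafts, you review, then it sends" point from the email example concrete, here is a minimal sketch of what such a review gate could look like. To be clear, this is my own illustration and not Venho's code or their Agent SDK: Draft, Outbox, propose_reply and approve are all hypothetical names, the "model call" is a placeholder string, and the outbox only records what would be sent instead of talking to a real SMTP server.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """A reply proposed by a (hypothetical) email agent."""
    to: str
    subject: str
    body: str
    approved: bool = False  # only flipped by an explicit human action


class NotApprovedError(RuntimeError):
    """Raised when something tries to send a draft nobody has reviewed."""


@dataclass
class Outbox:
    """Stand-in for an SMTP client; it only records what would be sent."""
    sent: list = field(default_factory=list)

    def send(self, draft: Draft) -> None:
        # The gate: an unreviewed draft never leaves the device.
        if not draft.approved:
            raise NotApprovedError(f"draft to {draft.to} has not been reviewed")
        self.sent.append(draft)


def propose_reply(sender: str, subject: str, summary: str) -> Draft:
    """Placeholder for the local model call; always returns an unapproved draft."""
    body = f"Hi,\n\nThanks for your message about: {summary}\n"
    return Draft(to=sender, subject=f"Re: {subject}", body=body)


def approve(draft: Draft) -> Draft:
    """The human-in-the-loop step: the user has read the draft and signs off."""
    draft.approved = True
    return draft


if __name__ == "__main__":
    outbox = Outbox()
    draft = propose_reply("friend@example.com", "Lunch", "lunch on Friday")
    try:
        outbox.send(draft)  # refused, nobody has looked at it yet
    except NotApprovedError as err:
        print(f"blocked: {err}")
    outbox.send(approve(draft))  # goes through after review
    print(f"sent {len(outbox.sent)} message(s)")
```

The design choice worth noting is that the check lives in the send path itself and not in the agent, so an agent (including a community-written one) could not skip the review step by accident.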

On the development side, I asked whether this was something the community could build on or whether it was going to be a more closed ecosystem like Apple's. They mentioned that they will have an Agent SDK. Open source plans weren't something I raised while I was there, so I did some research in their Discord server. It is stated there that the API clients will be open source and that you could even completely replace the frontend if you like. At least in August, Venho stated that:

Most of the platform will be FOSS, but we will have some closed parts (mainly related to Venho paid for services) for instance and UX. Though this is still something we're discussing internally, our FOSS strategy and timeline for that.

In principle, I like the idea of it being open source, but I can understand how some aspects, like any Venho services (for example if they were to offer hosted large models), wouldn't necessarily be. I think time will tell on this one, but if Jolla is any indication of the direction they might go, I think a hybrid approach is fine. It seems to have served the Jolla community well in many regards: for example, a range of devices have been ported to Sailfish OS with just about as much success as you see with ports of random Android ROMs. Regardless, stepping away from everything going to the cloud and instead being able to write my own open source agents for my own content, on a local device that runs Sailfish OS under the hood and is built largely from open source components, is absolutely a step in the right direction. Perfect is the enemy of good and I have always been pragmatic when it comes to open source.

If I could integrate it into my desktop and enable near real-time agentic behavior between the two, that'd be awesome. Do the same on my Sailfish OS device? Yes please. For now though, it's all still a lot of intention. Now all they have to do is execute on all of it, or make sure that whatever they can't immediately deliver has mechanisms (like a community plugin center or "marketplace" like Obsidian has) so the community can build out integrations. Otherwise my concern would be a box getting launched that is effectively a fancy piece of kit for downloading and reading email.

That's it for today folks. No grand conclusions. No statement on whether or not I'll preorder. Just some cool shit I really wanted to take a look at and share with the wider world, and some cogs in my head related to developing agentic RAGs that I wanted to start turning. Since I imagine some of the writing might've come off as sales pitchy, I want to be clear at the end of this that none of this is sponsored (I am not above sponsorships though), my tram ticket wasn't paid for, and I wasn't even directly invited (registration was open). I did leave with a Jolla beanie and a beer though. Can't pass up swag or beer.