- PydanticAI
- Felix' Blog - A Review of Helix after 1.5 Years
- HYTRADBOI 2025
- Hot Tub Monitoring with Home Assistant and ESPHome · Jon Seager
- Initial launch: Nuanced call graph context layer for AI coding tools | Nuanced.dev
- March 20, 2025
-
🔗 News Minimalist Dark energy might be weakening + 1 more story rss
Today ChatGPT read 18296 top news stories. After removing previously covered events, there are 2 articles with a significance score over 5.9.
[6.3] Largest 3D map of the universe hints dark energy is becoming weaker —abc.net.au
Scientists using the Dark Energy Spectroscopic Instrument (DESI) have developed the largest 3D map of the universe. This map suggests that dark energy, responsible for the universe's accelerating expansion, may not be constant and could be weakening.
The DESI project collected data from over 14 million galaxies using the Mayall Telescope in Arizona. These findings challenge the existing Lambda-CDM model, which assumed constant dark energy. Researchers are comparing this data with other astronomical surveys to verify results.
Despite these findings, scientists are cautious. Current measurements are at "4.2 sigma," with 5 sigma needed for confirmation. Researchers plan to gather more data from DESI's ongoing observations to further analyze dark energy's role in the universe.
[6.1] Research shows modern humans evolved from two ancient populations —cam.ac.uk
Researchers at the University of Cambridge have discovered that modern humans result from a genetic mix between two ancient populations that diverged around 1.5 million years ago. This challenges the previous belief that Homo sapiens emerged from a single lineage in Africa.
The study, published in Nature Genetics , highlights that one population contributed about 80% of modern human genes, while the other contributed 20%. This earlier mixing event, occurring around 300,000 years ago, was substantial compared to the later interbreeding with Neanderthals and Denisovans.
The research utilized a new computational model to analyze modern human DNA from the 1000 Genomes Project. The findings suggest that human evolution involved more complex interactions, rather than clear, distinct lineages. The team plans further studies to refine their model and explore relationships with fossil evidence.
Highly covered news with significance over 5.5
[5.8] EU allocates €800 billion for independent defense strategy (politico.eu + 35)
[5.8] Experimental drug shows promise in delaying Alzheimer's symptoms (apnews.com + 14)
[5.6] Bird flu outbreak reaches unprecedented levels globally (globalnews.ca + 18)
[5.8] Euclid telescope releases first data on 26 million galaxies (rte.ie + 15)
Thanks for reading!
You can set up and personalize your own newsletter like this with premium.
— Vadim
-
🔗 sacha chua :: living an awesome life Old-school blogger rss
Text from sketch: Old-school blogger
[timeline showing different strands braided together]
I started blogging in 2001 (really, more like 2002), as a university student who had started playing around with Emacs and enjoyed learning out loud. Both blogging and Emacs continued through:
- teaching computer science
- going on a technical internship in Japan
- taking up graduate studies
- working at IBM
- experimenting with consulting and semi-retirement
- parenting
directly related to blogging: grad studies, working, experimenting
It's wonderful having such a long archive. I can trace my growth. I've changed a lot over the past 24 years. I miss being so optimistic and energetic, but who I am now and who I'm becoming are also okay.
[drawing of the butterfly life cycle]
- caterpillar
- chrysalis: We're in this messy stage where I digest myself and move my insides around
- butterfly: maybe someday
Learning out loud by blogging:
- Springboard: Writing as I learn means I can use my notes to pick up from where I left off.
- Sometimes my notes help other people.
- Sometimes people share what they've been learning.
- Writing helps me gather my tribe.
Questions to explore:
- What do I want to learn? How?
- What's nearby?
- What might be useful
- to my future self
- to others
Looking forward - I want to…
- draw more. It's fun.
- deepen my reflections.
- learn more.
- prepare so I can keep doing this.
How can I improve workflows for capturing/thinking/sharing/finding?
What can I do so I can keep learning and writing all my life? How can I get even better at it?
sach.ac/2025-03-16-01
Dave Winer's looking for old school bloggers (also this) so that nudged me to think about how and why I blog.
Still writing
From How to Take Smart Notes (Sönke Ahrens):
If you want to learn something for the long run, you have to write it down. If you want to really understand something, you have to translate it into your own words.
Writing and sharing are part of how I learn. Taking notes helps me learn things that are bigger than my working memory or my uninterrupted time segments. Sharing my notes helps me find them again later on, since I can search the Internet from my phone. Also, if I share my notes, sometimes I get to learn from other people too, and sometimes my notes help people figure out stuff and then they can build on that.
It makes sense to me to share these notes on a blog on my own domain, with a chronological view and an RSS feed that makes it easier for other people to check for updates if they want. Well, some other people. I suppose RSS readers are still a fairly technical sort of thing, and I don't particularly like posting on platforms like Facebook or LinkedIn. Anyway, I'll just keep writing here, and maybe people will come across posts via search engines or figure out how to get updates however they want to.
Summarizing posts:

```elisp
(let* ((annotations '((2001 "university") (2003 "graduated, teaching")
                      (2004 "internship in Japan") (2005 "grad school")
                      (2007 "working at IBM") (2008 "drawing")
                      (2012 "experiment with semi-retirement") (2016 "A+ was born")
                      (2019 "EmacsConf, COVID-19") (2022 "SuperNote A5X")
                      (2023 "even more EmacsConf automation") (2024 "cargo bike")
                      (2025 "added iPad to the mix")))
       (json-array-type 'list)
       (json-object-type 'alist)
       (posts-by-year
        (mapcar (lambda (o) (cons (car o) (length (cdr o))))
                (seq-group-by (lambda (o) (substring (alist-get 'date o) 0 4))
                              (json-read-file "~/proj/static-blog/_site/blog/all/index.json")))))
  (append '(("Year" "Posts" "Note") hline)
          (cl-loop for i from 2001 to 2025
                   collect (list (format "[[https://sachachua.com/blog/%d][%d]]" i i)
                                 (alist-get (number-to-string i) posts-by-year nil nil #'string=)
                                 (or (car (alist-get i annotations)) "")))))
```
| Year | Posts | Note |
|------|-------|------|
| 2001 | 3 | university |
| 2002 | 31 | |
| 2003 | 869 | graduated, teaching |
| 2004 | 971 | internship in Japan |
| 2005 | 678 | grad school |
| 2006 | 877 | |
| 2007 | 510 | working at IBM |
| 2008 | 421 | drawing |
| 2009 | 452 | |
| 2010 | 399 | Quantified Self |
| 2011 | 397 | |
| 2012 | 361 | experiment with semi-retirement |
| 2013 | 359 | |
| 2014 | 339 | |
| 2015 | 251 | |
| 2016 | 141 | A+ was born |
| 2017 | 145 | |
| 2018 | 176 | |
| 2019 | 121 | EmacsConf, COVID-19 |
| 2020 | 94 | |
| 2021 | 132 | |
| 2022 | 78 | SuperNote A5X |
| 2023 | 122 | even more EmacsConf automation |
| 2024 | 148 | cargo bike |
| 2025 | 49 | added iPad to the mix |

I don't see myself giving up these tools until I really can't use them any more. I'm keeping an eye out for assistive technology that might help me work around my limitations and the likely cognitive/physical decline I'll eventually run into. I'm encouraged by the fact that quite a few people manage to keep learning and writing even into their 80s and 90s.
Some weeks, Emacs News is all I can squeeze in: a long categorized list of links. When I have more time, I add little bits of code, drawings, reflections.
I love writing about little tweaks. Mostly that's about Emacs. I love the way I can shape it into something that fits me.
I like to summarize books and ideas as sketchnotes so that I have a chance of remembering what I want to learn from them. Also, the drawings are handy for sharing with others, and they're a way of giving back.
I'm slowly learning to write about life in a way that helps me learn more while respecting people's privacy. I like doing little experiments. Even tinier than the ones described in Tiny Experiments. Not "I will write 100 blog posts over the next 100 days," but rather, "What if I postpone fretting about A+'s homework until Saturday? What happens then?"
Writing workflow
After I get the kiddo through the morning routine and ready for virtual school, I usually play piano for about an hour or so. Then it's recess and some more hugs, and then I settle down for some writing or drawing. The weather is getting better, so I'm looking forward to moving some of that outside. Maybe I'll dust off those baby monitor apps so I can hear if A+ needs any help.
I mostly write on my laptop using Org Mode in Emacs. Org Mode is great for literate programming. I can mix my notes and my code however I like.
I don't write in a straightforward way. I jump around. I go on tangents and down rabbit-holes. It helps a little if I've sketched my thoughts beforehand, like for this post, or if I've done some audio braindumping to help me figure out where the interesting thoughts are. Sometimes I capture little thoughts on my phone and then move them to the post I'm working on. I'm trying to figure out how to chunk my thoughts better.
I have a lot of Emacs tweaks to make it easier to link to blog posts, bookmarks, sketches, sites from search results. I like including the text of sketches, too.
I use the 11ty static site generator to make my blog. I switched to it a few years ago because I didn't want to worry about keeping Wordpress secure. I don't have room for many programming languages in my brain at the moment, so I like the fact that 11ty uses JavaScript. It takes me about five minutes to compile my blog.
Reading workflow
From Dan Cullum: The more I read:
There is a strong correlation between the amount I’m reading, and the ideas I have for this blog. When I’m reading a lot, I feel like I have ideas coming out my eyes.
Reading makes me want to write, too.
I love the Toronto Public Library enough to transplant myself from the tropics and learn how to deal with winter. I've been reading more e-books lately. It's easier to highlight e-books compared to paper books. I can pick them up and put them down easily, and keep the pages open when I'm taking notes. I don't have to worry about misplacing them, either. I have some code to grab my highlights as a JSON, and then I can do things with them: include them in blog posts, add them to my personal notes, etc.
Not everything is available as an e-book, though, and sometimes the e-books have long hold times. Paper books are still handy enough.
I like reading blogs. They're much shorter than books are, and much less fluffy. Sometimes I feel like mainstream printed books have a lot of padding because of the considerations of the publishing industry: the book must be a certain size so it doesn't get lost on the bookstore shelf; the book must have a certain weight and thickness so people feel that it's worth $25. Blog posts can just get to the core of the idea instead of belabouring the point. I like the fractal density of hyperlinked text, too, and the conversational possibilities of it. It's a lot easier to bounce an idea back and forth to develop it when you can post in a day instead of waiting for a year for a book to be published.
I like reading on the new iPad. It's smaller than my laptop and bigger than my phone. It's easy to browse through blogs on it, unlike on my Supernote. I'm starting to develop a workflow for reading and writing smaller snippets: (toot)
- Read in NetNewsWire.
- Open interesting posts in Chrome on the iPad.
- Highlight the text.
- Use "Copy Link with Highlight".
- Tap on the selection again. Use "Share" to send it to Ice Cubes, a Mastodon client that can post to my GoToSocial instance and let me use my full post limit (5,000 characters, mwahahahaha).
- Paste the link into the toot, add my own thoughts, and post it.
I like linking to text fragments. Sharing from a webpage on my Android phone does this automatically. "Copy Highlight as Link" works from Chrome on the iPad. It saves people that little bit of scrolling or finding, although I suppose it would be helpful for people to go through the context before that selection. Alternatively, I could share directly from NetNewsWire and just link to the blog post instead of the text.
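(A text fragment link looks something like `https://example.com/post#:~:text=the%20selected%20words`; browsers that support the spec scroll to that text and highlight it. The URL here is just my illustration.)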
I like making visual book notes. They help me read a book well, and turning the sketch into a blog post gives me more opportunities to revisit it: when I write the post, and if someone comments or shares it.
Eventually I want to dust off my code for collecting Mastodon posts into a blog post, and maybe also re-establish a weekly review process.
Tangent: Check out Reading more blogs; Emacs Lisp: Listing blogs based on an OPML file for a table of the blogs I'm reading, along with the code I used to make a table of blogs, their latest post (as of the time I wrote my post, of course), and the post date.
Keeping an eye on the future
As the kiddo becomes more independent ("Mom, I'm 9, you don't have to fret about my jacket"), I'll have more time for myself. This is a good time to go bike and walk and explore outside, and to go deep and wide into our interests as a family. I do about 2-4 hours of consulting a week, just the stuff I'm interested in. (TODO: There's a tangent I want to write about interest-based nervous systems, which I notice in both A+ and myself, and probably building on this 2014 reflection on having a buffet of goals.) The rest is life time, divided among the things we want to learn/do/share and the things we do to take care of ourselves.
Even though I have increasing autonomy when it comes to time, and an increasing amount of focused time, I still haven't gotten to the bottom of my idea list or my to-write list. I don't think I'll ever get to the bottom of those lists, actually. I come up with ideas faster than I can do them. That's a good problem to have.
It makes sense to prepare for a couple of changes that will likely come up:
- Age-related farsightedness: It'll probably get harder to read small text, and I might eventually need to juggle my regular glasses as well as reading glasses. (W- already does this occasionally. He prefers having different pairs of glasses instead of bifocals or progressives, and his reasons seem sound. I don't want to have to adopt different postures to see out of different zones of glasses.) Developing good workflows for reading will probably help here. Also, the cargo vests I wear will probably help me with the "Where are my glasses?" problem.
- Menopause will probably rewire my brain a lot. I hear brain fog and tip-of-the-tongue can be challenging (see also Brain fog in menopause).
- My mom is 79 and running into issues with cognitive and physical decline. She has a hard time typing, speaking, remembering, deciding, or feeling good. On the other hand, there are examples of people who have stayed sharp for decades. There are lots of factors that are beyond my control. Still, it would be nice to see if I can stack the deck a little. So yes to:
- walks, bike rides, exercise, and maybe I can figure out a fun way to improve strength;
- lots of learning and sharing and connecting
- and experiments with technological and cognitive aids, like speech recognition to work around typing, text-to-speech interfaces to work around vision, notes to work around working memory, and maybe large language models to work around issues with recall.
- … and I might as well learn Morse code or explore accessibility tools, just in case I'm limited to twitching cheek muscles or something like that.
The life expectancy at birth for the Philippines for women born in 1983 is ~65 years; in Canada, about ~80 years. I want to keep learning and writing and sharing for as many of those years as I can.
-
🔗 @binaryninja@infosec.exchange When it comes to medical devices, security matters. We are proud to announce mastodon
When it comes to medical devices, security matters. We are proud to announce that Binary Ninja, partnering with STR and Aarno Labs as part of an @ARPA_H program helped identify critical flaws in hospital patient monitors that prompted FDA and CISA advisories. See our latest blog for more details:
https://binary.ninja/2025/03/20/uncovering-medical-device-vulnerabilities.html
-
🔗 sacha chua :: living an awesome life Playing with chunk size when writing rss
How long is a blog post? Some people write short posts with one clear thought. Others write longer essays.
I tend to start out writing a short post and then get distracted by all the rabbit-holes I want to go down. Drafting my thoughts on blogging leads to adding lots of blogs to my reader, writing some code that takes an OPML and makes a table of blogs and their most recent posts, fixing the org-html-themes setup for my Emacs configuration, breaking out this chunk as its own post, drawing a bunch of mindmaps, doing a braindump, tweaking my workflow for processing braindumps to use faster-whisper and whisper-ctranslate2 instead of WhisperX because of this issue, so that I can try the whisper-large-v3-turbo model, experimenting with workflows for reviewing the PDF on the iPad… Definitely lots of yak-shaving (wiktionary definition). I still want to write that post. I already have the sketch I want to include in it. It's like Chilli in the Bluey episode Sticky Gecko (script): "The door: It is right here. All we need to do is walk out of it: it's so easy!" The thought! It's right there! Just get to it, brain! But I wander because I wonder. I suppose that's all right.
It might be fun to play around with the sizes of things I share: shorter when my attention is fragmented or squirrely, longer when I can think about something over several days or years. Here are some ways to tinker with that.
Breaking thoughts down into smaller chunks so I can get them out the door:
- When I notice that something is a big blog post (like this reflection I've been working on about blogging), I can break out parts of it into their own blog posts and then replace that section with links.
- I can post interesting quotes and snippets to Mastodon and then round them up periodically or refer to them in blog posts. TODO: It might be good to have a shortcut for an accessible link to a toot using a speech bubble or similar icon.
Taming my tangents and ideas: I'm sometimes envious of blogs with neat side notes, but really, I should just accept that the tangents that my mind wants to go on can take paragraphs and are more suited to, say, collapsible details or a different blog post. Something I can experiment with: instead of gallivanting off on that tangent (soo hard to resist when there's an idea for an Emacs tweak!), I can add a TODO and leave it for my future self. Maybe even two TODOs: one inline, where it makes sense in the text; and one in my Org Mode, with a link to the blog post so that I can go back and update it when (if!) I get around to it. Who knows, maybe someone might comment with something that already exists.
Saving scraps: It's easier to cut out half-fleshed-out ideas if I tell myself I'm just saving them somewhere. Right now I capture/refile them to a scraps heading, but there's probably a better way to handle this. Maybe I can post some thoughts to Mastodon and then save the toot URL. Maybe I can experiment with using Denote to manage private notes.
Connecting thoughts and building them up:
- I tend to write in small chunks. (TODO: I could probably do some kind of word-count analysis, might be neat.) Sketchnotes and hyperlinks might help me chunk thoughts so I can think about bigger things. I can link to paragraphs and text fragments, so I can connect thoughts with other parts of thoughts instead of trying to get the granularity right the first time around. The shortcuts I made for linking to blog posts and searching the Web or my notes are starting to help.
- I sporadically work on topic maps or indices. Maybe I'll gradually flesh them out into a digital garden / personal wiki.
- Sometimes I don't remember the exact words I used. Probabilistic search or vector search might help here, too. I don't need an AI-generated summary, I just want a list of related posts.
- I can figure out how to add backlinks to my blog, or simplify the workflow for adding links to previous posts. Maybe something based on this guide for 11ty or binyamin/eleventy-plugin-backlinks. I might need to write something custom anyway so that I can ignore the links coming from monthly/weekly review posts.
Connecting to other people's thoughts: For the purposes of conversation, it'll probably be good to let people know if I write something about their blog post. Doesn't happen automatically. Pingbacks and referrer logs got too swamped by spam a long time ago, so I don't think anyone really uses them. Idea: It might be neat to have something that quickly lists all the external links in a post, and maybe a way to save the e-mail addresses or Mastodon handles for people after I look them up so that I can make that even smoother, and some kind of quick template. I can send email and toot from within Emacs, so that's totally doable… (No, I am not going to write it right now, I'm going to add it to my to-do list.)
(Also, there's another thought here about books and The Great Conversation, and blogs and smaller-scale conversations, and William Thurston and mathematicians and understanding, and cafes…)
Hmm. I think that getting my brain to make smaller chunks and get them out the door will be a good thing to focus on. Synthesizing can come later.
Related:
-
🔗 Console.dev newsletter Konva rss
Description: JS 2D Canvas.
What we like: Framework for building animations and graphics on a 2D HTML5 canvas. Works well with React, Vue, Svelte. Dynamic animations, tweens, pre-built filters, & node management built in. Canvas can be exported in high quality to data URLs or images.
What we dislike: React integration doesn’t support React Native, but Konva itself works across platforms.
-
🔗 Console.dev newsletter Goravel rss
Description: Go web application framework.
What we like: Designed to be consistent with Laravel to make migration easy. Lots of built in modules e.g. AuthN and AuthZ, routing, middleware, gRPC, session, queues, validation, logging, etc. Includes an ORM natively integrated with seeding and migrations. Plugins for extending functionality.
What we dislike: There were breaking changes between minor version releases (v1.13 to v1.14 and v1.14 to v1.15).
-
- March 19, 2025
-
🔗 Jeremy Fielding (YouTube) A Critical Piece of Machinery has Failed. rss
-
🔗 sacha chua :: living an awesome life Reading more blogs; Emacs Lisp: Listing blogs based on an OPML file rss
Nudged by Dave Winer's post about old-school bloggers and my now-nicely-synchronizing setup of NetNewsWire (iOS) and FreshRSS (web), I gave Claude AI this prompt to list bloggers (with the addition of "Please include URLs and short bios.") and had fun going through the list it produced. A number of people were no longer blogging (unreachable sites or inactive blogs), but I found a few that I wanted to add to my feed reader.
Here is my people.opml at the moment (slightly redacted, as I read my husband's blog as well). This list has some non-old-school bloggers as well and some sketchnoters, but that's fine. It's a very tiny slice of the awesomeness of the Internet out there, definitely not exhaustive, just a start. I've been adding more by trawling through indieblog.page and the occasional interesting post on news.ycombinator.com.
It makes sense to make an HTML version to make it easier for people to explore, like those old-fashioned blog rolls. Ooh, maybe some kind of table like indieblog.page, listing a recent item from each blog. (I am totally not surprised about my tendency to self-nerd-snipe with some kind of Emacs thing.) This uses my-opml-table and my-rss-get-entries, which I have just added to my Emacs configuration.
my-opml-table:

```elisp
(defun my-opml-table (xml)
  (sort
   (mapcar
    (lambda (o)
      (let ((latest (car (condition-case nil
                             (my-rss-get-entries (dom-attr o 'xmlUrl))
                           (error nil)))))
        (list
         (if latest
             (format-time-string "%Y-%m-%d" (plist-get latest :date))
           "")
         (org-link-make-string
          (or (dom-attr o 'htmlUrl) (dom-attr o 'xmlUrl))
          (replace-regexp-in-string " *|" "" (dom-attr o 'text)))
         (if latest
             (org-link-make-string
              (plist-get latest :url)
              (or (plist-get latest :title) "(untitled)"))
           ""))))
    (dom-search
     xml
     (lambda (o)
       (and (eq (dom-tag o) 'outline)
            (dom-attr o 'xmlUrl)
            (dom-attr o 'text)))))
   :key #'car :reverse t))
```
my-rss-get-entries:

```elisp
(defun my-rss-get-entries (url)
  "Return a list of the form ((:title ... :url ... :date ...) ...)."
  (with-current-buffer (url-retrieve-synchronously url)
    (set-buffer-multibyte t)
    (goto-char (point-min))
    (when (re-search-forward "<\\?xml\\|<rss" nil t)
      (goto-char (match-beginning 0))
      (sort
       (let* ((feed (xml-parse-region (point) (point-max)))
              ;; `entry' children mean an Atom feed; otherwise treat it as RSS.
              (is-atom (> (length (xml-get-children (car feed) 'entry)) 0)))
         (if is-atom
             (mapcar
              (lambda (entry)
                (list :url (or (xml-get-attribute
                                (car (or (seq-filter
                                          (lambda (x)
                                            (string= (xml-get-attribute x 'rel) "alternate"))
                                          (xml-get-children entry 'link))
                                         (xml-get-children entry 'link)))
                                'href)
                               (dom-text (dom-by-tag entry 'guid)))
                      :title (elt (car (xml-get-children entry 'title)) 2)
                      :date (date-to-time (elt (car (xml-get-children entry 'updated)) 2))))
              (xml-get-children (car feed) 'entry))
           (mapcar
            (lambda (entry)
              (list :url (or (caddr (car (xml-get-children entry 'link)))
                             (dom-text (dom-by-tag entry 'guid)))
                    :title (caddr (car (xml-get-children entry 'title)))
                    :date (date-to-time (elt (car (xml-get-children entry 'pubDate)) 2))))
            (xml-get-children (car (xml-get-children (car feed) 'channel)) 'item))))
       :key (lambda (o) (plist-get o :date))
       :lessp #'time-less-p
       :reverse t))))
```
```elisp
(my-opml-table (xml-parse-file "~/Downloads/people.opml"))
```
I'm rebuilding my feed list from scratch. I want to read more. I read the aggregated feeds at planet.emacslife.com every week as part of preparing Emacs News. Maybe I'll go over the list of blogs I aggregate there, widen it to include all posts instead of just Emacs-specific ones, and see what resonates. Emacs people tend to be interesting. Here is an incomplete list based on people who've posted in the past two years or so, based on this work-in-progress planetemacslife-expanded.opml. (I haven't tweaked all the URLs yet. I stopped at around 2023 and made the rest of the elements `xoutline` instead of `outline` so that my code would skip them.)

```elisp
(my-opml-table (xml-parse-file "~/Downloads/planetemacslife-expanded.opml"))
```
Making this table was fun. It's nice to see a lot of people also writing and learning out loud. This reminded me a little of EmacsConf - 2020 - talks - Sharing blogs (and more) with org-webring. TODO: Could be fun to have a blogroll page again.
I notice I tend to like:
- posts about adapting technology to personal interests, more than posts about the industry or generalizations
- detailed posts about things I'm currently interested in (Emacs, personal knowledge management, some Javascript), more than detailed tech posts about things I've decided not to get into at the moment
- "I" posts more than "You" posts: personal reflections rather than didactic advice
- curiosity, fun, experimentation
Looking forward to discovering more!
Related:
-
🔗 matklad Comptime Zig ORM rss
This post can be considered an advanced Zig tutorial. I will be covering some of the more unique aspects of the language, but won't be explaining the easy parts. If you haven't read the Zig Language Reference, you might want to start there. Additionally, we will also learn the foundational trick for implementing the relational model.
You will learn a sizable chunk of Zig after this post, but this isn’t going to be an easy read, so prepare your favorite beverage and get comfy!
[On Learning](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#On-Learning)
One of the most ridiculously effective ways of learning for me is building toy versions of programs. This is slightly more specific than "to learn to code, code": I claim that you can learn more by spending a week building your own very bad version of an application from scratch than you'd learn from working full-time for a year on a production-ready codebase. Case in point: although the code in this post is lifted from TigerBeetle, and I've been working with it for a couple of years, I learned a bunch of new things myself in the evening of hacking on the code for the post.
The hard part about the toy-problem approach is finding the right toy! I remember, early in my career, spending about a year pestering everyone with the "what is your favorite model problem?" question, and not getting a real answer. Until one day @zmacter asked "have you tried a raytracer?" and that became my model problem for learning programming languages. Seriously, if you want to learn Zig, go write yourself a raytracer; I have some notes for that here.
[The Database](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Database)
In this post, we’ll work on a solution of a different model problem, which I think is an especially good fit for showcasing Zig’s comptime capabilities. This problem is a simplified version of the LSM Forest code from TigerBeetle.
Specifically, we will be implementing an in-memory relational database, whose schema is set at compile time. Before diving into the implementation, let’s sketch the interface we want.
First, we define objects which we will be storing in our database, accounts and transfers:
```zig
const Account = struct {
    id: ID = @enumFromInt(0),
    balance: u128,

    pub const ID = enum(u64) { _ };
};

const Transfer = struct {
    id: ID = @enumFromInt(0),
    amount: u128,
    debit_account: Account.ID,
    credit_account: Account.ID,

    pub const ID = enum(u64) { _ };
};
```
In Zig, `struct` is an expression that yields an anonymous struct type, which needs to be explicitly bound to an identifier:

```zig
const Account = struct { ... };
```
Structs contain fields and declarations. Fields can have default values. This curious pattern

```zig
pub const ID = enum(u64) { _ };
```

is a Zig idiom for creating a newtype over an integer. `ID` is an enumeration whose backing type is `u64`. This enumeration doesn't have any explicitly named variants, but it is open (`_`) — any `u64` numeric value is considered to be a member. This is exactly what we want for an id — it's an opaque number with a unique type, whose "numberness" is not exposed (you can't add two ids together). In the transfer struct, we refer to the account id:

```zig
debit_account: Account.ID,
```
Note that although `Account.ID` and `Transfer.ID` have exactly the same definition, they are distinct types. Let this sink in — Zig's type system is nominal, but all types are anonymous!

Ids will be assigned by our database automatically, using an auto-incrementing counter, and we will use the zero id to signify a new object without an id assigned yet:

```zig
id: ID = @enumFromInt(0),
```
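To make the nominal-typing point concrete, here is a tiny standalone sketch of my own (the `AccountID`/`TransferID` names are hypothetical, not from the post):

```zig
const AccountID = enum(u64) { _ };
const TransferID = enum(u64) { _ }; // same definition, distinct type

fn lookup(id: AccountID) void {
    _ = id;
}

test "newtypes do not interchange" {
    const t: TransferID = @enumFromInt(1);
    _ = t;
    // lookup(t); // error: expected type 'AccountID', found 'TransferID'
    lookup(@enumFromInt(1)); // fine: the parameter's result type flows into the builtin
}
```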
`@enumFromInt` and `@intFromEnum` are built-in functions for casting between an enum and its backing integer. It could have been cleaner to instead write:

```zig
const Account = struct {
    id: ID = .unassigned,
    balance: u128,

    pub const ID = enum(u64) {
        unassigned = 0,
        _,
    };
};
```

That is, to add an explicitly named variant for zero.
A word on built-in functions. In Zig, all compiler builtins use special syntax with `@`. This is somewhat unusual — traditionally, builtins are magic functions inside the standard library, specially marked. Zig boldly uses dedicated syntax for builtins, but, in exchange, the Zig standard library is not privileged at all. It is just a normal library that happens to be distributed with the compiler; it doesn't have special powers.
A word on type inference. Zig hits a sweet spot:
- explicit types are rarely needed in expressions,
- where they are needed a human reader usually wouldn’t be able to tell the type on the spot anyway,
- function signatures are always explicit,
- and the inference algorithm is direct and simple.
Specifically, Zig doesn't do Hindley-Milner (separate phases of constraint gathering and solving), and instead directly infers the type of an expression from the types of its subexpressions, with a small twist. If the type of the result is already known through other means, it is propagated down. Certain syntaxes and builtins take advantage of this known result type. Consider both flavors of defaulting the id to zero:
```zig
id: ID = .unassigned,
id: ID = @enumFromInt(0),
```

Here, the result type is known; it is `ID`. So, when Zig evaluates `.unassigned`, it knows that the result type must be `ID`, and "desugars" the shorthand to `ID.unassigned` (the type needn't be namable). Similarly, for the `@enumFromInt` case, the type of the enum to convert to is taken from the result. If the type is not otherwise known from the context, the result is a compilation error, which can be fixed with an `@as` type ascription builtin:

```zig
// Compilation error: _which_ enum?
const mystery = @enumFromInt(0);

const mystery = @as(ID, @enumFromInt(0));
```
Note how Zig doesn't need type ascription syntax, and just uses a builtin function.
So, yeah, we have accounts and transfers, they both have ids assigned by the database, an account has a balance, a transfer has an amount and refers to two accounts:
```zig
const Account = struct {
    id: ID = @enumFromInt(0),
    balance: u128,

    pub const ID = enum(u64) { _ };
};

const Transfer = struct {
    id: ID = @enumFromInt(0),
    amount: u128,
    debit_account: Account.ID,
    credit_account: Account.ID,

    pub const ID = enum(u64) { _ };
};
```
Now, let’s define our database from the schema:
```zig
const DBType = @import("./db.zig").DBType;

const DB = DBType(.{
    .tables = .{
        .account = Account,
        .transfer = Transfer,
    },
    .indexes = .{
        .transfer = .{
            .debit_account,
            .credit_account,
        },
    },
});
```
A lot is going on here. First, we use the `@import` builtin function to import (look, no need for syntax again!) the `DBType` function. `DBType` is a type constructor — it takes a DB schema, and returns a database type. For the schema, we ask for two tables, accounts and transfers, and also ask to include indexes on transfers' foreign keys.

The implementation of `DBType` is the meat of this post, but, for now, let's see how we use it. Let's write a function to add a transfer to the database:

```zig
fn create_transfer(
    db: *DB,
    gpa: std.mem.Allocator,
    debit_account: Account.ID,
    credit_account: Account.ID,
    amount: u128,
) !?Transfer.ID {
    ...
}
```
Zig doesn't have a global allocator, so anything that needs to allocate takes an allocator argument. `std.mem.Allocator` is dynamically dispatched: inside it are a type-erased pointer to a particular allocator's state, and a pointer to a vtable:

```zig
const Allocator = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    pub const VTable = struct {
        alloc: *const fn (
            *anyopaque,
            len: usize,
            alignment: Alignment,
            ret_addr: usize,
        ) ?[*]u8,
        ...
    };
};
```
This is a trait object, coded manually. `gpa` stands for general purpose allocator, which behaves more or less like a global allocator would, as far as the code is concerned. You often see `arena: Allocator`, signifying that memory doesn't need to be freed on a per-object basis, or `scratch: Allocator`, signifying that the memory can be used for short-lived allocations inside the function, but can't outlive it (see the arena sketch below).

Inserting a new object into our in-memory database could allocate, so we need an allocator argument, and, conversely, we need to signal the possibility of an allocation failure in our result type, which is what the bang (`!`) is for.

Another reason why the operation might fail is that the transfer itself might be invalid (e.g., insufficient balance). For simplicity, I choose to model this by returning `null` instead of a `Transfer.ID`, hence the question mark (`?`).
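About those `arena`/`scratch` flavors: here's a minimal sketch of how an arena is typically layered on top of a gpa with `std.heap.ArenaAllocator`. This is my own illustration, not code from the post, and `frobnicate` is a hypothetical name:

```zig
const std = @import("std");

fn frobnicate(gpa: std.mem.Allocator) !void {
    // The arena owns everything allocated through it; one deinit frees it all.
    var arena_instance = std.heap.ArenaAllocator.init(gpa);
    defer arena_instance.deinit();
    const arena = arena_instance.allocator();

    // Short-lived scratch memory: no per-object free needed.
    const scratch = try arena.alloc(u8, 1024);
    _ = scratch;
}
```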
In Zig, types are always specified in prefix notation, without exception. For example, `[3]?struct { r: u8, g: u8, b: u8 }` is an array of three optional colors.

Let's see the implementation of `create_transfer`:

```zig
if (debit_account == credit_account) return null;

const dr = db.account.get(debit_account) orelse return null;
const cr = db.account.get(credit_account) orelse return null;

if (dr.balance < amount) return null;
if (cr.balance > std.math.maxInt(u128) - amount) return null;

db.account.update(.{
    .id = debit_account,
    .balance = dr.balance - amount,
});
db.account.update(.{
    .id = credit_account,
    .balance = cr.balance + amount,
});

return try db.transfer.create(gpa, .{
    .debit_account = debit_account,
    .credit_account = credit_account,
    .amount = amount,
});
```
Zig doesn't require braces around an if's body, which makes for concise "guard" ifs. It comes with an autoformatter out of the box, so there's very little possibility for indentation confusion.
`db.account.get(credit_account)` looks up an account by its id. The account might or might not exist; the return type of this function is `?Account`. Zig's `orelse` unpacks optionals. The type of a `return` expression is `noreturn` (`!` from Rust), so the type of `cr` and `dr` is just `Account`, without a question mark. Instead of `return`ing, we could have `orelse`d some default `Account`.

This line is the only place where we need to help type inference by spelling a type explicitly:

```zig
if (cr.balance > std.math.maxInt(u128) - amount) return null;
```
We pass the `u128` type to the `maxInt` function. This is a case where a sophisticated type inference algorithm could look at the surrounding context and infer the type, but Zig deliberately requires the user to spell it out in situations like this.

Having done the balance checks, we ask our database to update the two balances, and to persist the new transfer object. Only the `transfer.create` call gets an allocator, so it is immediately obvious that only this part of the function can allocate.

[A Usage Example](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#A-Usage-Example)
Now, let's write a test program:

```zig
pub fn main() !void {
    ...
}
```
We are going to allocate, so we might fail, so we return `!void`. So, we'll need an allocator:

```zig
var gpa_instance: std.heap.DebugAllocator(.{}) = .{};
const gpa = gpa_instance.allocator();
```
`gpa_instance` is a concrete allocator, of type `std.heap.DebugAllocator(.{})`. It is initialized with default values for all the fields. When evaluating `= .{}`, Zig knows the result type, so it knows which fields are defaulted.

`gpa` is our trait object. Internally, it contains a pointer to `gpa_instance`. The state of `gpa_instance` will be mutated, so it needs to be declared as a `var`. The `gpa`, though, is just a pair of pointers, and those pointers won't be mutated themselves, so we can declare it `const`, similarly to how in Rust you'd write:

```rust
let mut x = 92;
let r = &mut x; // no mut on r!
```
Because Zig doesn’t track aliasing in the type system, figuring out what can get mutated when is generally harder in Zig than in Rust.
For the usage example, we'll need some random numbers, which follow a similar pattern — a concrete PRNG and a dynamically dispatched trait object / vtable:

```zig
var random_instance = std.Random.DefaultPrng.init(92);
const random = random_instance.random();
```
Finally, we create an instance of our database:

```zig
var db: DB = .{};
// defer db.deinit(gpa);
```
`DB` will allocate memory, so we absolutely do need a deinit function to free it, but I am excluding it from the tutorial, as it requires some not particularly illuminating legwork.

For starters, just create two accounts and a transfer:

```zig
const alice: Account.ID = try db.account.create(gpa, .{ .balance = 100 });
const bob: Account.ID = try db.account.create(gpa, .{ .balance = 200 });

const transfer: ?Transfer.ID = try create_transfer(&db, gpa, alice, bob, 100);
assert(transfer != null);
```
So far, this feels like a hash map with more steps. We aren't doing anything relational here. We will, soon, but we'll need some fake data. To keep things a touch more realistic, we won't be distributing transfers equally between accounts; instead, we'll ensure that the 20% hottest accounts are responsible for 80% of transfers:

```zig
fn pareto_index(random: std.Random, count: usize) usize {
    assert(count > 0);
    const hot = @divFloor(count * 2, 10);
    if (hot == 0) return random.uintLessThan(usize, count);
    if (random.uintLessThan(u32, 10) < 8) {
        return pareto_index(random, hot);
    }
    return hot + random.uintLessThan(usize, count - hot);
}
```
Nothing new here — `@divFloor` is another builtin (an intention-bearing name for `/`), and we need to pass the type we want to get out of `random` explicitly, instead of having type inference (and the human reader) figure it out.

The loop to populate the database is slightly more interesting:

```zig
var accounts: std.ArrayListUnmanaged(Account.ID) = .empty;
defer accounts.deinit(gpa);

const account_count = 100;
try accounts.ensureTotalCapacity(gpa, account_count);
accounts.appendAssumeCapacity(alice);
accounts.appendAssumeCapacity(bob);
while (accounts.items.len < account_count) {
    const account = try db.account.create(gpa, .{ .balance = 1000 });
    accounts.appendAssumeCapacity(account);
}

const transfer_count = 100;
for (0..transfer_count) |_| {
    const debit = pareto_index(random, account_count);
    const credit = pareto_index(random, account_count);
    const amount = random.uintLessThan(u128, 10);
    _ = try create_transfer(
        &db,
        gpa,
        accounts.items[debit],
        accounts.items[credit],
        amount,
    );
}
```
We'll need to store the generated account ids somewhere, so we use an `ArrayList`. Zig strongly pushes you towards batching your allocations, so we preallocate space for a hundred accounts at once, and then append without passing a `gpa` in. For simplicity, we don't implement a reservation API for our database, so we do need a `gpa` when creating an account or a transfer.

In Zig, an unused return value is a compilation error, so we need `_ =` to ignore the result of the transfer.

Finally, we come to the relational part of the tutorial; we'll do a non-trivial lookup. First, we'll ask for all transfers from `alice` to anyone, and then for transfers between `alice` and `bob` specifically:

```zig
var transfers_buffer: [10]Transfer = undefined;

const alice_transfers = db.transfer.filter(
    .{ .debit_account = alice },
    &transfers_buffer,
);
for (alice_transfers) |t| {
    std.debug.print("alice: from={} to={} amount={}\n", .{
        t.debit_account,
        t.credit_account,
        t.amount,
    });
}

const alice_to_bob_transfers = db.transfer.filter(
    .{ .debit_account = alice, .credit_account = bob },
    &transfers_buffer,
);
for (alice_to_bob_transfers) |t| {
    std.debug.print("alice to bob: from={} to={} amount={}\n", .{
        t.debit_account,
        t.credit_account,
        t.amount,
    });
}
```
The interesting parts are the `filter` calls. We can ask the database to filter transfer objects to only those with matching attributes. You could do this with a brute-force loop over all transfers. But if you are serious about your relational model, you obviously want to be faster than that! I wonder if this has something to do with the indexes we added when declaring `DB`?

For Zig specifics: I don't want to allocate the result, and I don't want to bother with iterators, so I pass a stack-allocated out buffer in.
[A Call To Action](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#A-Call-To-Action)
Here's what we've got so far, the interface for the so-far mysterious `db.zig`:

```zig
const std = @import("std");
const assert = std.debug.assert;

const DBType = @import("./db.zig").DBType;

const Account = struct {
    id: ID = @enumFromInt(0),
    balance: u128,

    pub const ID = enum(u64) {
        unassigned,
        _,
    };
};

const Transfer = struct {
    id: ID = @enumFromInt(0),
    amount: u128,
    debit_account: Account.ID,
    credit_account: Account.ID,

    pub const ID = enum(u64) { _ };
};

const DB = DBType(.{
    .tables = .{
        .account = Account,
        .transfer = Transfer,
    },
    .indexes = .{
        .transfer = .{
            .debit_account,
            .credit_account,
        },
    },
});

fn create_transfer(
    db: *DB,
    gpa: std.mem.Allocator,
    debit_account: Account.ID,
    credit_account: Account.ID,
    amount: u128,
) !?Transfer.ID {
    if (debit_account == credit_account) return null;
    const dr = db.account.get(debit_account) orelse return null;
    const cr = db.account.get(credit_account) orelse return null;
    if (dr.balance < amount) return null;
    if (cr.balance > std.math.maxInt(u128) - amount) return null;
    db.account.update(.{
        .id = debit_account,
        .balance = dr.balance - amount,
    });
    db.account.update(.{
        .id = credit_account,
        .balance = cr.balance + amount,
    });
    return try db.transfer.create(gpa, .{
        .debit_account = debit_account,
        .credit_account = credit_account,
        .amount = amount,
    });
}

pub fn main() !void {
    var gpa_instance: std.heap.DebugAllocator(.{}) = .{};
    const gpa = gpa_instance.allocator();

    var random_instance = std.Random.DefaultPrng.init(92);
    const random = random_instance.random();

    var db: DB = .{};
    // defer db.deinit(gpa);

    const alice: Account.ID = try db.account.create(gpa, .{ .balance = 100 });
    const bob: Account.ID = try db.account.create(gpa, .{ .balance = 200 });

    const transfer = try create_transfer(&db, gpa, alice, bob, 100);
    assert(transfer != null);

    var accounts: std.ArrayListUnmanaged(Account.ID) = .empty;
    defer accounts.deinit(gpa);

    const account_count = 100;
    try accounts.ensureTotalCapacity(gpa, account_count);
    accounts.appendAssumeCapacity(alice);
    accounts.appendAssumeCapacity(bob);
    while (accounts.items.len < account_count) {
        const account = try db.account.create(gpa, .{ .balance = 1000 });
        accounts.appendAssumeCapacity(account);
    }

    const transfer_count = 100;
    for (0..transfer_count) |_| {
        const debit = pareto_index(random, account_count);
        const credit = pareto_index(random, account_count);
        const amount = random.uintLessThan(u128, 10);
        _ = try create_transfer(
            &db,
            gpa,
            accounts.items[debit],
            accounts.items[credit],
            amount,
        );
    }

    var transfers_buffer: [10]Transfer = undefined;
    const alice_transfers = db.transfer.filter(
        .{ .debit_account = alice },
        &transfers_buffer,
    );
    for (alice_transfers) |t| {
        std.debug.print("alice: from={} to={} amount={}\n", .{
            t.debit_account,
            t.credit_account,
            t.amount,
        });
    }

    std.debug.print("\n\n", .{});

    const alice_to_bob_transfers = db.transfer.filter(
        .{ .debit_account = alice, .credit_account = bob },
        &transfers_buffer,
    );
    for (alice_to_bob_transfers) |t| {
        std.debug.print("alice to bob: from={} to={} amount={}\n", .{
            t.debit_account,
            t.credit_account,
            t.amount,
        });
    }
}

fn pareto_index(random: std.Random, count: usize) usize {
    assert(count > 0);
    const hot = @divFloor(count * 2, 10);
    if (hot == 0) return random.uintLessThan(usize, count);
    if (random.uintLessThan(u32, 10) < 8) return pareto_index(random, hot);
    return hot + random.uintLessThan(usize, count - hot);
}
```
If you want to get 90% out of this post, I strongly recommend that you not read any further; instead, copy the above code into your own `main.zig` and try to write `db.zig` yourself. I do think this is a most excellent exercise that can teach you more effectively than any blog post. It'll take more time, of course, but you'll get more knowledge per minute out of it.

If you'll settle for the 10%, read on! And if you want 100%, then do your implementation first and then come back here!
[The Table](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Table)
We will be building `db.zig` from the ground up. It all will make sense! At the end.

Our fundamental data structure is a sorted list of values. Here, I kindly ask the reader to engage suspension of disbelief: in a real database we would be using a data structure with efficient lookups and modifications, such as a B-tree or an LSM tree. For the purposes of this tutorial, we will use a simple sorted array, and just close our eyes to the O(N) insertions and removals.
Values are going to be sorted by a particular field. For example, we sort transfers by their ids. So, when creating a "Table" of transfers, we'll need to pass the type of the key, the type of the value, and functions for extracting and comparing keys:

```zig
const TransfersTable = TableType(Transfer.ID, Transfer, struct {
    pub fn key_fn(value: Transfer) Transfer.ID {
        return value.id;
    }

    pub fn key_cmp(lhs: Transfer.ID, rhs: Transfer.ID) std.math.Order {
        return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs));
    }
});
```
Here's the corresponding declaration:

```zig
fn TableType(
    comptime KeyType: type,
    comptime ValueType: type,
    comptime Functions: type,
) type {
    const key_fn = Functions.key_fn;
    const key_cmp = Functions.key_cmp;

    return struct {
        ...
    };
}
```
This is a type constructor function, which takes a bunch of types as arguments and returns a new type. Such functions can only be called at compile time; Zig doesn't have the ability to create new types at runtime, unlike something like the JVM.
Passing the table of functions as a `Functions` type is a weird idiom of Zig. It would be more natural to use the following signature:

```zig
fn TableType(
    comptime KeyType: type,
    comptime ValueType: type,
    comptime key_fn: fn (value: ValueType) KeyType,
    comptime key_cmp: fn (lhs: KeyType, rhs: KeyType) std.math.Order,
) type
```

But this version is more painful to use at the call site. While `struct` is an expression in Zig, and you can declare one inline, `fn` is not an expression; you can't declare a function inline unless you employ another Zig idiom:

```zig
const my_function = struct {
    fn double(x: u32) u32 {
        return x * 2;
    }
}.double;
```
Here's the implementation of the table:

```zig
struct {
    values: std.ArrayListUnmanaged(Value) = .empty,

    pub const Key = KeyType;
    pub const Value = ValueType;

    const Table = @This();

    pub fn search(
        table: *const Table,
        key: Key,
        start_index: usize,
    ) usize {
        return start_index + std.sort.lowerBound(
            Value,
            table.values.items[start_index..],
            key,
            compare_fn,
        );
    }

    fn compare_fn(key: Key, value: Value) std.math.Order {
        return key_cmp(key, key_fn(value));
    }

    pub fn get(table: *const Table, key: Key) ?Value {
        const index = table.search(key, 0);
        if (index >= table.values.items.len) return null;
        const value = table.values.items[index];
        if (key_cmp(key, key_fn(value)) != .eq) return null;
        return value;
    }

    pub fn reserve(
        table: *Table,
        gpa: std.mem.Allocator,
        extra: usize,
    ) !void {
        try table.values.ensureUnusedCapacity(gpa, extra);
    }

    pub fn insert(table: *Table, value: Value) void {
        assert(table.values.unusedCapacitySlice().len > 0);
        const index = table.search(key_fn(value), 0);
        table.values.insertAssumeCapacity(index, value);
    }

    pub fn remove(table: *Table, value: Value) void {
        const index = table.search(key_fn(value), 0);
        const removed = table.values.orderedRemove(index);
        assert(std.meta.eql(value, removed));
    }
};
```
The `search` function binary searches for the index corresponding to the given `key` in the list of values. For the convenience of a call site we are yet to see, we also pass in the starting index for the search. It is useful for, e.g., a pagination-style API where you use the index as a cursor.

The `get` function then uses `search` to find the index, and furthermore checks that we have an exact match.

For the `insert` function, we do implement the reservation pattern: memory allocation and data structure modification are split into two functions. Modification proper is infallible, and has a reservation as a precondition.

As promised, we do a naive linear memcpy for `insert`/`remove`, but we pretend that it is actually logarithmic.

Another unrealistic simplification is that our API is scalar — we insert or remove a single item at a time. Both Zig and the relational model strongly encourage operating on a batch of objects at a time, pushing the `for`s down:

```zig
pub fn insert(table: *Table, values: []const Value) void
```

Even with a naive array list, the batched version runs in O(N + K log K), which is much faster than the O(N K) of the scalar version repeated K times. But we leave batching as an exercise for the reader.
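As a quick sanity check of this API, here's a small test sketch of my own (not from the post), instantiating the table with plain `u32` keys and values and assuming the `TableType` above plus `std.testing`:

```zig
const U32Table = TableType(u32, u32, struct {
    pub fn key_fn(value: u32) u32 {
        return value;
    }
    pub fn key_cmp(lhs: u32, rhs: u32) std.math.Order {
        return std.math.order(lhs, rhs);
    }
});

test "table insert keeps values sorted and findable" {
    const gpa = std.testing.allocator;
    var table: U32Table = .{};
    defer table.values.deinit(gpa);

    // Reserve first; insert itself is infallible.
    try table.reserve(gpa, 3);
    table.insert(30);
    table.insert(10);
    table.insert(20);

    try std.testing.expectEqual(@as(?u32, 20), table.get(20));
    try std.testing.expectEqual(@as(?u32, null), table.get(25));
}
```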
[The Indexes](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Indexes)
Now comes the relational-model part of the tutorial. If we have

```zig
const TransfersTable = TableType(Transfer.ID, Transfer, struct {
    pub fn key_fn(value: Transfer) Transfer.ID {
        return value.id;
    }

    pub fn key_cmp(lhs: Transfer.ID, rhs: Transfer.ID) std.math.Order {
        return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs));
    }
});
```
we can efficiently filter transfers by their ids. How can we add an ability to filter transfers by, say, debit account id?
The idea is to add a second sorted list, which stores `(Account.ID, Transfer.ID)` pairs. With this setup, if you are interested in all transfers from alice, you can binary search for alice's account id in the second list, fetch the corresponding transfer ids, and then look up the transfers in the first list:

```
Transfers:
  id=1 debit_account=alice   credit_account=bob
  id=2 debit_account=charley credit_account=bob
  id=3 debit_account=alice   credit_account=charley

Index:
  debit_account=alice   id=1
  debit_account=alice   id=3
  debit_account=charley id=2
```
What makes this work is that we can maintain two lists in sync. When creating a transfer, you insert it in the Transfers table, but also insert the corresponding pair to the index table. Removal works similarly.
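For instance, a creation path that keeps both lists in sync might look roughly like this — a sketch of my own under assumed names (it uses the `TransferDebitAccountIndex` type defined in the next section; the post's real implementation lives in the bundle later on):

```zig
fn create_with_index(
    gpa: std.mem.Allocator,
    transfers: *TransfersTable,
    by_debit: *TransferDebitAccountIndex,
    transfer: Transfer,
) !void {
    // Reserve in both tables first, so that the two inserts are infallible:
    // either both tables see the new row, or neither does.
    try transfers.reserve(gpa, 1);
    try by_debit.reserve(gpa, 1);
    transfers.insert(transfer);
    by_debit.insert(.{ .field = transfer.debit_account, .id = transfer.id });
}
```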
[Implementing the Index](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Implementing-the-Index)
Let's implement an index table. Here, we start using `comptime` for real. In particular, we will parametrize our index table by the type (such as `Transfer`) and the name of the field to build an index over (such as `.debit_account`):

```zig
const TransferDebitAccountIndex = IndexTableType(Transfer, .debit_account);
```
This is the signature:

```zig
fn IndexTableType(
    comptime Value: type,
    comptime field: std.meta.FieldEnum(Value),
) type {
    ...
}
```
This is a type constructor, which takes a type and returns a type. The `std.meta.FieldEnum(Value)` call returns an enum whose variants are the fields of `Value`. E.g., our `Transfer` is

```zig
const Transfer = struct {
    id: ID,
    amount: u128,
    debit_account: Account.ID,
    credit_account: Account.ID,
};
```
so the corresponding `FieldEnum` would look like this:

```zig
const TransferFieldEnum = enum {
    id,
    amount,
    debit_account,
    credit_account,
};
```
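A tiny test sketch of my own (not from the post, assuming `std` is in scope) showing how `FieldEnum`, `@tagName`, and `@field` fit together:

```zig
test "FieldEnum round-trip" {
    const Point = struct { x: i32, y: i32 };
    const F = std.meta.FieldEnum(Point); // enum { x, y }

    const p = Point{ .x = 1, .y = 2 };
    const f = F.y;
    // @tagName turns `.y` into "y"; @field then reads that field by name.
    try std.testing.expectEqual(@as(i32, 2), @field(p, @tagName(f)));
}
```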
Now, the body:

```zig
fn IndexTableType(
    comptime Value: type,
    comptime field: std.meta.FieldEnum(Value),
) type {
    const FieldType = @FieldType(Value, @tagName(field));

    const Pair = struct {
        field: FieldType,
        id: Value.ID,
    };

    return TableType(Pair, Pair, struct {
        pub fn key_fn(value: Pair) Pair {
            return value;
        }

        pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order {
            return order_by(Pair, lhs, rhs, &.{ .field, .id });
        }
    });
}
```
Ultimately, we want to delegate to the existing `TableType`, as that already implements the logic for storing a sorted list of items. The item for us is a (field value, id) pair. One note about the

```zig
const FieldType = @FieldType(Value, @tagName(field));
```

incantation used. `FieldEnum` is a library-level abstraction. Metaprogramming builtins, like `@FieldType` or `@field`, work with string names of fields (at compile time, of course). `@tagName` converts from `.debit_account` to `"debit_account"`. We don't have to use `FieldEnum`, and could have used strings throughout, but `FieldEnum` gives us two advantages:

- Earlier type errors: calling `IndexTableType` with a field that doesn't exist will error out at the call site, rather than at the definition site.
- Greppability: in Zig, field access is always spelled as `.debit_account` syntactically, so it is advantageous to stick to the same convention during metaprogramming, to make sure it also shows up in textual searches.
The `Key` for our table is the entire `Pair`. That is, we want to sort not only on the field value, but on the id as well, to make sure that, when we look up all ids for a particular field value, we get back a sorted list. That's why our `key_fn` is an identity function:

```zig
pub fn key_fn(value: Pair) Pair {
    return value;
}
```
In `key_cmp`, we want to compare first by `field`, and then by `id`. We can do it manually, but it's more fun to do some metaprogramming here as well:

```zig
pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order {
    return order_by(Pair, lhs, rhs, &.{ .field, .id });
}

fn order_by(
    comptime T: type,
    lhs: T,
    rhs: T,
    comptime fields: []const std.meta.FieldEnum(T),
) std.math.Order {
    ...
}
```
`order_by` is our first mixed-mode function. Some arguments are `comptime`, but some are runtime. This function should compare a pair of `T` by sequentially comparing the values of the corresponding fields, returning as soon as two unequal fields are found. Here we use an `inline for`:

```zig
fn order_by(
    comptime T: type,
    lhs: T,
    rhs: T,
    comptime fields: []const std.meta.FieldEnum(T),
) std.math.Order {
    inline for (fields) |field| {
        const order = order_enums(
            @field(lhs, @tagName(field)),
            @field(rhs, @tagName(field)),
        );
        if (order != .eq) return order;
    }
    return .eq;
}
```
Because the list of fields is known at compile time, the loop is fully unrolled, and the generated code ends up looking like a sequence of direct comparisons.
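For our `Pair`, the unrolled shape is roughly the following (a sketch of what the compiler effectively generates, not literal output; the function name is mine, and it assumes the `Pair` type and `order_enums` helper from the surrounding code):

```zig
// order_by(Pair, lhs, rhs, &.{ .field, .id }) unrolls into straight-line
// comparisons: first by field value, then by id.
fn pair_order_unrolled(lhs: Pair, rhs: Pair) std.math.Order {
    const by_field = order_enums(lhs.field, rhs.field);
    if (by_field != .eq) return by_field;
    return order_enums(lhs.id, rhs.id);
}
```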
`@field` fetches a field from a value given a `comptime` field name. `order_enums` is a little helper which allows comparing either numbers or enums:

```zig
fn order_enums(lhs: anytype, rhs: @TypeOf(lhs)) std.math.Order {
    return switch (@typeInfo(@TypeOf(lhs))) {
        .int => std.math.order(lhs, rhs),
        .@"enum" => std.math.order(
            @intFromEnum(lhs),
            @intFromEnum(rhs),
        ),
        else => comptime unreachable,
    };
}
```
`@typeInfo` is a builtin that allows reflecting on the structure of types. In particular, it classifies types as structs, unions, enums, integers, etc., which is exactly what we need here. One wrinkle is that `enum` is a keyword, so the enum variant of `@typeInfo` is spelled `.@"enum"`. The `@""` syntax allows using any string as a Zig identifier; it's an escape hatch for keywords.

And that's basically it for indexes! Now we have our main table (the object table) and a number of index tables. The next task is to bundle them together, so that we can enforce consistency between the tables.
[The Bundle](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Bundle)
The next thing we will build is the `Bundle`. It takes a type and a list of fields to build indexes over, and provides an API for creating and looking up values while maintaining consistency of the indexes:

```zig
pub fn BundleType(
    comptime Value: type,
    comptime indexed_fields: []const std.meta.FieldEnum(Value),
) type
```
With the bundle, we can finally explain what the database we started with actually is:

```zig
const DB = struct {
    account: BundleType(Account, &.{}),
    transfer: BundleType(Transfer, &.{
        .debit_account,
        .credit_account,
    }),
};
```
And this is the API we used, and which we'll now implement:

```zig
var db: DB = ...;

try db.account.create(gpa, .{ .balance = 100 });
db.account.update(.{
    .id = debit_account,
    .balance = dr.balance - amount,
});
db.account.get(debit_account);

try db.transfer.create(gpa, .{
    .debit_account = debit_account,
    .credit_account = credit_account,
    .amount = amount,
});
db.transfer.filter(
    .{ .debit_account = alice, .credit_account = bob },
    &transfers_buffer,
);
```
Let's start:

```zig
pub fn BundleType(
    comptime Value: type,
    comptime indexed_fields: []const std.meta.FieldEnum(Value),
) type {
    return struct {
        id_counter: u64 = 0,

        objects: TableType(Value.ID, Value, struct {
            pub fn key_fn(value: Value) Value.ID {
                return value.id;
            }
            pub fn key_cmp(
                lhs: Value.ID,
                rhs: Value.ID,
            ) std.math.Order {
                return std.math.order(
                    @intFromEnum(lhs),
                    @intFromEnum(rhs),
                );
            }
        }) = .{},

        indexes: ???,

        const Bundle = @This();

        pub fn get(bundle: *Bundle, id: Value.ID) ?Value {
            return bundle.objects.get(id);
        }

        ...
    };
}
```
The basic structure is clear: we have an `id_counter` for assigning new ids when creating values, the object table which stores sorted values and which directly powers the `get` method, and then we have the indexes. The indexes are tricky. For transfers, where we index `debit_account` and `credit_account`, we want the `indexes` to look like this:

```zig
indexes: struct {
    debit_account: IndexTableType(Transfer, .debit_account),
    credit_account: IndexTableType(Transfer, .credit_account),
},
```
So we need to iterate over the passed-in `indexed_fields` and create a struct with a field for each index. And that's exactly what we'll do:

```zig
indexes: blk: {
    var fields: [indexed_fields.len]std.builtin.Type.StructField = undefined;
    for (indexed_fields, 0..) |indexed, i| {
        const Type = IndexTableType(Value, indexed);
        fields[i] = .{
            .name = @tagName(indexed),
            .type = Type,
            .default_value_ptr = &(Type{}),
            .is_comptime = false,
            .alignment = @alignOf(Type),
        };
    }
    break :blk @Type(.{ .@"struct" = .{
        .layout = .auto,
        .is_tuple = false,
        .decls = &.{},
        .fields = &fields,
    } });
} = .{},
```
This makes much more sense if you read it backwards:

- `break :blk` "returns" a value from the block labeled `blk:`; this is Zig's more imperative take on "everything is an expression".
- We return a `@Type`. `@Type` is the inverse of `@typeInfo`, in the sense that, hand-waving a bit, `@Type(@typeInfo(T)) == T`. That is, we pass a description of the type, and get a type back. What we want to get is a struct with fields, so we pass `.@"struct"` and an array of fields.
- Each element of `fields` is an `std.builtin.Type.StructField`, a description of a field, that is, its type, name, and default.
- The type is `const Type = IndexTableType(Value, indexed);`, that is, the index table for the `indexed` field of `Value`.
- The name matches the name of the indexed field.
- And the default is just the default for the `Type`.
- Finally, we need `indexed_fields.len` fields.
Now that we have all the tables, `create` and `update` are relatively straightforward. For `create`, we need to make sure to insert the appropriate values into the objects table and into all of the indexes:

```zig
pub fn create(
    bundle: *Bundle,
    gpa: std.mem.Allocator,
    value: Value,
) !Value.ID {
    assert(@intFromEnum(value.id) == 0);

    try bundle.objects.reserve(gpa, 1);
    inline for (indexed_fields) |field| {
        try @field(bundle.indexes, @tagName(field))
            .reserve(gpa, 1);
    }
    errdefer comptime unreachable;

    bundle.id_counter += 1;
    const id: Value.ID = @enumFromInt(bundle.id_counter);

    var value_with_id = value;
    value_with_id.id = id;

    bundle.objects.insert(value_with_id);
    inline for (indexed_fields) |indexed_field| {
        const field = @tagName(indexed_field);
        @field(bundle.indexes, field)
            .insert(.{ .field = @field(value, field), .id = id });
    }

    return id;
}
```
We start by asserting that the id is 0. It's our job to assign the id! But before we do that, we reserve space for one more entry in all the tables. This is the only place in the function where we allocate, and hence the only place where we can fail. The cryptic `errdefer comptime unreachable` is a Zig tongue twister saying that no errors can happen after this point in the function. Separating memory reservation from the actual modification helps make sure that the data structure remains consistent even in the face of a memory error.

Had we not split out the fallible `reserve` from the infallible `insert`, and kept the allocation inside `insert`, we could have ended up in a situation where a value is inserted into only some of the indexes.

I must admit that I am deeply skeptical that it is possible to consistently handle these kinds of issues on memory allocation error correctly; I am in the "abort on OOM" camp personally. As a quick quiz: have you noticed where we didn't handle this issue correctly in the code we've already seen?
With that throat-clearing done, the actual logic is straightforward:

- assign the id,
- insert the value into the objects table,
- and then, for each of the indexed fields, insert the `(field, id)` pair into the corresponding index table. The `inline for` loop is guaranteed to be fully unrolled at compile time.
As usual, we use `@field` to get a field by name (c.f. JavaScript `obj.foo` vs `obj[foo]`).

The `update` is even simpler, as we don't need to allocate new memory. So we just remove the old values and insert new ones:

```zig
pub fn update(bundle: *Bundle, value_new: Value) void {
    const id = value_new.id;
    assert(@intFromEnum(id) != 0);

    const value_old = bundle.get(value_new.id).?;
    assert(value_old.id == id);

    bundle.objects.remove(value_old);
    bundle.objects.insert(value_new);

    inline for (indexed_fields) |indexed_field| {
        const field = @tagName(indexed_field);
        @field(bundle.indexes, field).remove(.{
            .field = @field(value_old, field),
            .id = id,
        });
        @field(bundle.indexes, field).insert(.{
            .field = @field(value_new, field),
            .id = id,
        });
    }
}
```
Although simple, this is the trick that makes the whole relational model work: we keep the indexes consistent by looking up the old value and removing the corresponding old pairs from the indexes.
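For instance (a hypothetical usage sketch; `transfer_1` and `charley` stand in for values from the running example):

```zig
// Re-point a transfer's debit side from alice to charley. update()
// removes the stale (alice, id) pair from the debit_account index and
// inserts (charley, id), so filter() keeps seeing consistent data.
var t = db.transfer.get(transfer_1).?;
t.debit_account = charley;
db.transfer.update(t);
```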
If that was too easy, don't worry: we'll do `filter` next, and that's the toughest one in the entire exercise :)

[Merge Sort Join](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Merge-Sort-Join)
Let's revisit our original example:

```zig
var transfers_buffer: [10]Transfer = undefined;
const alice_to_bob_transfers = db.transfer.filter(
    .{ .debit_account = alice, .credit_account = bob },
    &transfers_buffer,
);
for (alice_to_bob_transfers) |t| {
    std.debug.print("alice to bob: from={} to={} amount={}\n", .{
        t.debit_account,
        t.credit_account,
        t.amount,
    });
}
```
The filter takes a filtering condition: equality conditions on a subset of the indexed fields. It then fills the output buffer with as many matching objects as fit. How can we do that efficiently?
Recall that our index tables are sorted by the field's value first, and then by the object's ID. So, to find all transfers with `.debit_account = alice`, we binary search the debit account index for pairs starting with `alice`. Similarly, we can find all transfers with `.credit_account = bob` by binary searching in the other index table.

In both indexes, we get some slice of pairs whose first components are `alice` and `bob` respectively. What's more, the second components of the pairs are sorted transfer ids! So, if we want to find ids which match both conditions, we merge two sorted sequences of ids!
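Before generalizing, here's a self-contained sketch of the two-slice special case (my own illustration, not code from the post), using bare `u64` ids in place of the `ID` enums:

```zig
const std = @import("std");

// Two fingers walk two sorted id slices; an id that appears in both
// slices satisfies both conditions. This is the k = 2 special case of
// the merge loop implemented below.
fn intersect_ids(a: []const u64, b: []const u64, out: []u64) []u64 {
    var i: usize = 0;
    var j: usize = 0;
    var n: usize = 0;
    while (i < a.len and j < b.len and n < out.len) {
        if (a[i] == b[j]) {
            out[n] = a[i];
            n += 1;
            i += 1;
            j += 1;
        } else if (a[i] < b[j]) {
            i += 1; // advance the finger at the smaller id
        } else {
            j += 1;
        }
    }
    return out[0..n];
}

test "intersect sorted ids" {
    var buf: [4]u64 = undefined;
    const got = intersect_ids(&.{ 1, 3, 5 }, &.{ 2, 3, 5, 7 }, &buf);
    try std.testing.expectEqualSlices(u64, &.{ 3, 5 }, got);
}
```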
Now let's do this for an arbitrary number of indexes:

```zig
pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value {
    ...
}
```
We can't really type the `query` parameter, so we use Zig's `anytype` here. This is not dynamic typing; rather, it's monomorphization maximus: every distinct type of `query` will generate a fresh copy of the `filter` function.

First, we reflect on `query` to figure out how many index tables we need to intersect:

```zig
const fields = comptime std.meta.fieldNames(@TypeOf(query));
```
Then, we set up a bunch of indexes (the `ijk` kind, not the database-index kind). We need an index into the output buffer, an index into the object table, and an index for each index table:

```zig
var out_index: usize = 0;
var object_index: usize = 0;
var indexes: [fields.len]usize = @splat(0);
```
Then, for each index table, we'd want to get a slice of pairs that match the query. Ideally, we'd have `fields.len` local variables, one variable per query field, but Zig doesn't allow creating local variables via reflection.

What we can have, though, is a single variable which is a tuple of slices. To create such a thing, we first need to create its type:

```zig
const TupleOfSlices = comptime blk: {
    var components: [fields.len]type = undefined;
    for (0..fields.len) |i| {
        const IndexTable = @FieldType(
            @TypeOf(bundle.indexes),
            fields[i],
        );
        const Pair = IndexTable.Value;
        components[i] = []Pair;
    }
    break :blk std.meta.Tuple(&components);
};

var slices: TupleOfSlices = undefined;
```
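Concretely (an illustrative expansion of my own, not code from the post), for the two-index transfer query the computed type works out to a tuple of two slice-of-pair types; `IndexTable.Value` is the `(field, id)` `Pair` of each index table:

```zig
// Hypothetical name: what TupleOfSlices evaluates to for a
// (debit_account, credit_account) query against Transfer.
const TransferQuerySlices = std.meta.Tuple(&.{
    []IndexTableType(Transfer, .debit_account).Value,
    []IndexTableType(Transfer, .credit_account).Value,
});
```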
Then, we can use the `search` function to find the actual slices:

```zig
inline for (fields, 0..) |field, i| {
    const index = @field(bundle.indexes, field).search(.{
        .field = @field(query, field),
        .id = @enumFromInt(0),
    }, 0);
    slices[i] = @field(bundle.indexes, field)
        .values.items[index..];
}
```
Now, the merge algorithm proper. We point a finger at each of the slices and, on each step, advance the fingers that point at the smallest ID. If at some point all our fingers point at the same id, we fetch the corresponding `Value` and add it to the output.

Ideally, we'd use a k-way merge here (`comptime`-specialized to a particular k!), but, for simplicity, we'll use linear search to find the lowest id. We iterate until either we run out of space in the output buffer, or we run off the end of one of our slices:

```zig
outer: while (true) {
    if (out_index == out.len) break :outer;
    inline for (0..fields.len) |i| {
        if (indexes[i] == slices[i].len) {
            break :outer;
        }
        if (slices[i][indexes[i]].field != @field(query, fields[i])) {
            break :outer;
        }
    }

    ...
}
```
Then, we find the minimum ID:

```zig
var id_min = slices[0][indexes[0]].id;
inline for (1..slices.len) |i| {
    const id_next = slices[i][indexes[i]].id;
    if (@intFromEnum(id_next) < @intFromEnum(id_min)) {
        id_min = id_next;
    }
}
```
Then, we advance all slices with the minimal ID, counting them:

```zig
var advanced_count: u32 = 0;
inline for (0..slices.len) |i| {
    if (slices[i][indexes[i]].id == id_min) {
        indexes[i] += 1;
        advanced_count += 1;
    }
}
```
If we advanced all slices (that is, all ids at this position are the same), we look up the corresponding transfer in the object table and add it to the output:

```zig
if (advanced_count == slices.len) {
    object_index = bundle.objects.search(id_min, object_index);
    assert(object_index < bundle.objects.values.items.len);
    const value = bundle.objects.values.items[object_index];
    inline for (fields) |field| {
        assert(@field(value, field) == @field(query, field));
    }
    out[out_index] = value;
    out_index += 1;
    object_index += 1;
}
```
This is where we pass a non-zero start index to `search`!

Altogether:

```zig
pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value {
    const fields = comptime std.meta.fieldNames(@TypeOf(query));

    var indexes: [fields.len]usize = @splat(0);
    var out_index: usize = 0;
    var object_index: usize = 0;

    const TupleOfSlices = comptime blk: {
        var components: [fields.len]type = undefined;
        for (0..fields.len) |i| {
            const IndexTable = @FieldType(
                @TypeOf(bundle.indexes),
                fields[i],
            );
            const Pair = IndexTable.Value;
            components[i] = []Pair;
        }
        break :blk std.meta.Tuple(&components);
    };

    var slices: TupleOfSlices = undefined;
    inline for (fields, 0..) |field, i| {
        const index = @field(bundle.indexes, field).search(.{
            .field = @field(query, field),
            .id = @enumFromInt(0),
        }, 0);
        slices[i] = @field(bundle.indexes, field)
            .values.items[index..];
    }

    outer: while (true) {
        if (out_index == out.len) break :outer;
        inline for (0..fields.len) |i| {
            if (indexes[i] == slices[i].len) {
                break :outer;
            }
            if (slices[i][indexes[i]].field != @field(query, fields[i])) {
                break :outer;
            }
        }

        var id_min = slices[0][indexes[0]].id;
        inline for (1..slices.len) |i| {
            const id_next = slices[i][indexes[i]].id;
            if (@intFromEnum(id_next) < @intFromEnum(id_min)) {
                id_min = id_next;
            }
        }

        var advanced_count: u32 = 0;
        inline for (0..slices.len) |i| {
            if (slices[i][indexes[i]].id == id_min) {
                indexes[i] += 1;
                advanced_count += 1;
            }
        }

        if (advanced_count == slices.len) {
            object_index = bundle.objects.search(id_min, object_index);
            assert(object_index < bundle.objects.values.items.len);
            const value = bundle.objects.values.items[object_index];
            inline for (fields) |field| {
                assert(@field(value, field) == @field(query, field));
            }
            out[out_index] = value;
            out_index += 1;
            object_index += 1;
        }
    }

    return out[0..out_index];
}
```
Congratulations! We've finished `BundleType`, which means we are past the worst of it, and are almost at the end.
We only need code to turn

```zig
const DB = DBType(.{
    .tables = .{
        .account = Account,
        .transfer = Transfer,
    },
    .indexes = .{
        .transfer = .{
            .debit_account,
            .credit_account,
        },
    },
});
```

into a bunch of bundles with indexes, but at this point this should be trivial:

```zig
pub fn DBType(comptime schema: anytype) type {
    const bundle_names = std.meta.fieldNames(@TypeOf(schema.tables));
    var bundles: [bundle_names.len]std.builtin.Type.StructField = undefined;
    for (bundle_names, 0..) |name, i| {
        const Value = @field(schema.tables, name);
        const indexes =
            if (@hasField(@TypeOf(schema.indexes), name))
                @field(schema.indexes, name)
            else
                .{};
        const Bundle = BundleType(Value, &indexes);
        bundles[i] = .{
            .name = name,
            .type = Bundle,
            .default_value_ptr = &Bundle{},
            .is_comptime = false,
            .alignment = @alignOf(Bundle),
        };
    }
    return @Type(.{ .@"struct" = .{
        .layout = .auto,
        .is_tuple = false,
        .decls = &.{},
        .fields = &bundles,
    } });
}
```
Here's the entire thing, `db.zig`, just under 300 lines:

```zig
const std = @import("std");
const assert = std.debug.assert;

pub fn DBType(comptime schema: anytype) type {
    const bundle_names = std.meta.fieldNames(@TypeOf(schema.tables));
    var bundles: [bundle_names.len]std.builtin.Type.StructField = undefined;
    for (bundle_names, 0..) |name, i| {
        const Value = @field(schema.tables, name);
        const indexes =
            if (@hasField(@TypeOf(schema.indexes), name))
                @field(schema.indexes, name)
            else
                .{};
        const Bundle = BundleType(Value, &indexes);
        bundles[i] = .{
            .name = name,
            .type = Bundle,
            .default_value_ptr = &Bundle{},
            .is_comptime = false,
            .alignment = @alignOf(Bundle),
        };
    }
    return @Type(.{ .@"struct" = .{
        .layout = .auto,
        .is_tuple = false,
        .decls = &.{},
        .fields = &bundles,
    } });
}

pub fn BundleType(
    comptime Value: type,
    comptime indexed_fields: []const std.meta.FieldEnum(Value),
) type {
    return struct {
        id_counter: u64 = 0,

        objects: TableType(Value.ID, Value, struct {
            pub fn key_fn(value: Value) Value.ID {
                return value.id;
            }
            pub fn key_cmp(lhs: Value.ID, rhs: Value.ID) std.math.Order {
                return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs));
            }
        }) = .{},

        indexes: blk: {
            var fields: [indexed_fields.len]std.builtin.Type.StructField = undefined;
            for (indexed_fields, 0..) |indexed, i| {
                const Type = IndexTableType(Value, indexed);
                fields[i] = .{
                    .name = @tagName(indexed),
                    .type = Type,
                    .default_value_ptr = &(Type{}),
                    .is_comptime = false,
                    .alignment = @alignOf(Type),
                };
            }
            break :blk @Type(.{ .@"struct" = .{
                .layout = .auto,
                .is_tuple = false,
                .decls = &.{},
                .fields = &fields,
            } });
        } = .{},

        const Bundle = @This();

        pub fn get(bundle: *Bundle, id: Value.ID) ?Value {
            return bundle.objects.get(id);
        }

        pub fn create(
            bundle: *Bundle,
            gpa: std.mem.Allocator,
            value: Value,
        ) !Value.ID {
            assert(@intFromEnum(value.id) == 0);

            try bundle.objects.reserve(gpa, 1);
            inline for (indexed_fields) |field| {
                try @field(bundle.indexes, @tagName(field))
                    .reserve(gpa, 1);
            }
            errdefer comptime unreachable;

            bundle.id_counter += 1;
            const id: Value.ID = @enumFromInt(bundle.id_counter);

            var value_with_id = value;
            value_with_id.id = id;

            bundle.objects.insert(value_with_id);
            inline for (indexed_fields) |indexed_field| {
                const field = @tagName(indexed_field);
                @field(bundle.indexes, field)
                    .insert(.{ .field = @field(value, field), .id = id });
            }

            return id;
        }

        pub fn update(bundle: *Bundle, value_new: Value) void {
            const id = value_new.id;
            assert(@intFromEnum(id) != 0);

            const value_old = bundle.get(value_new.id).?;
            assert(value_old.id == id);

            bundle.objects.remove(value_old);
            bundle.objects.insert(value_new);

            inline for (indexed_fields) |indexed_field| {
                const field = @tagName(indexed_field);
                @field(bundle.indexes, field)
                    .remove(.{ .field = @field(value_old, field), .id = id });
                @field(bundle.indexes, field)
                    .insert(.{ .field = @field(value_new, field), .id = id });
            }
        }

        pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value {
            const fields = comptime std.meta.fieldNames(@TypeOf(query));

            var indexes: [fields.len]usize = @splat(0);
            var out_index: usize = 0;
            var object_index: usize = 0;

            const TupleOfSlices = comptime blk: {
                var components: [fields.len]type = undefined;
                for (0..fields.len) |i| {
                    const IndexTable = @FieldType(
                        @TypeOf(bundle.indexes),
                        fields[i],
                    );
                    const Pair = IndexTable.Value;
                    components[i] = []Pair;
                }
                break :blk std.meta.Tuple(&components);
            };

            var slices: TupleOfSlices = undefined;
            inline for (fields, 0..) |field, i| {
                const index = @field(bundle.indexes, field).search(.{
                    .field = @field(query, field),
                    .id = @enumFromInt(0),
                }, 0);
                slices[i] = @field(bundle.indexes, field)
                    .values.items[index..];
            }

            outer: while (true) {
                if (out_index == out.len) break :outer;
                inline for (0..fields.len) |i| {
                    if (indexes[i] == slices[i].len) break :outer;
                    if (slices[i][indexes[i]].field != @field(query, fields[i]))
                        break :outer;
                }

                var id_min = slices[0][indexes[0]].id;
                inline for (1..slices.len) |i| {
                    if (@intFromEnum(slices[i][indexes[i]].id) < @intFromEnum(id_min)) {
                        id_min = slices[i][indexes[i]].id;
                    }
                }

                var advanced_count: u32 = 0;
                inline for (0..slices.len) |i| {
                    if (slices[i][indexes[i]].id == id_min) {
                        indexes[i] += 1;
                        advanced_count += 1;
                    }
                }

                if (advanced_count == slices.len) {
                    object_index = bundle.objects.search(id_min, object_index);
                    assert(object_index < bundle.objects.values.items.len);
                    const value = bundle.objects.values.items[object_index];
                    inline for (fields) |field| {
                        assert(@field(value, field) == @field(query, field));
                    }
                    out[out_index] = value;
                    out_index += 1;
                    object_index += 1;
                }
            }

            return out[0..out_index];
        }
    };
}

fn IndexTableType(
    comptime Value: type,
    comptime field: std.meta.FieldEnum(Value),
) type {
    const FieldType = @FieldType(Value, @tagName(field));
    const Pair = struct {
        field: FieldType,
        id: Value.ID,
    };
    return TableType(Pair, Pair, struct {
        pub fn key_fn(value: Pair) Pair {
            return value;
        }
        pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order {
            return order_by(Pair, lhs, rhs, &.{ .field, .id });
        }
    });
}

fn order_by(
    comptime T: type,
    lhs: T,
    rhs: T,
    comptime fields: []const std.meta.FieldEnum(T),
) std.math.Order {
    inline for (fields) |field| {
        const order = order_enums(
            @field(lhs, @tagName(field)),
            @field(rhs, @tagName(field)),
        );
        if (order != .eq) return order;
    }
    return .eq;
}

fn order_enums(lhs: anytype, rhs: @TypeOf(lhs)) std.math.Order {
    return switch (@typeInfo(@TypeOf(lhs))) {
        .int => std.math.order(lhs, rhs),
        .@"enum" => std.math.order(
            @intFromEnum(lhs),
            @intFromEnum(rhs),
        ),
        else => comptime unreachable,
    };
}

fn TableType(
    comptime KeyType: type,
    comptime ValueType: type,
    comptime Functions: type,
) type {
    const key_fn = Functions.key_fn;
    const key_cmp = Functions.key_cmp;
    return struct {
        values: std.ArrayListUnmanaged(Value) = .empty,

        pub const Key = KeyType;
        pub const Value = ValueType;
        const Table = @This();

        pub fn get(table: *const Table, key: Key) ?Value {
            const index = table.search(key, 0);
            if (index >= table.values.items.len) return null;
            const value = table.values.items[index];
            if (key_cmp(key, key_fn(value)) != .eq) return null;
            return value;
        }

        pub fn reserve(table: *Table, gpa: std.mem.Allocator, extra: usize) !void {
            try table.values.ensureUnusedCapacity(gpa, extra);
        }

        pub fn insert(table: *Table, value: Value) void {
            assert(table.values.unusedCapacitySlice().len > 0);
            const index = table.search(key_fn(value), 0);
            table.values.insertAssumeCapacity(index, value);
        }

        pub fn remove(table: *Table, value: Value) void {
            const index = table.search(key_fn(value), 0);
            const removed = table.values.orderedRemove(index);
            assert(std.meta.eql(value, removed));
        }

        pub fn search(table: *const Table, key: Key, start_index: usize) usize {
            return start_index + std.sort.lowerBound(
                Value,
                table.values.items[start_index..],
                key,
                compare_fn,
            );
        }

        fn compare_fn(key: Key, value: Value) std.math.Order {
            return key_cmp(key, key_fn(value));
        }
    };
}
```
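To close the loop, here's a quick smoke test of my own (not from the post). It assumes the `Account`/`Transfer` structs and the `DB` schema from the beginning of the article, with ids defaulting to the zero sentinel:

```zig
test "create, get, filter" {
    // An arena keeps the sketch short: the tables never free individually.
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    const gpa = arena.allocator();

    var db: DB = .{};
    const alice = try db.account.create(gpa, .{ .balance = 100 });
    const bob = try db.account.create(gpa, .{ .balance = 0 });
    _ = try db.transfer.create(gpa, .{
        .debit_account = alice,
        .credit_account = bob,
        .amount = 100,
    });

    var buffer: [10]Transfer = undefined;
    const matches = db.transfer.filter(
        .{ .debit_account = alice, .credit_account = bob },
        &buffer,
    );
    try std.testing.expectEqual(@as(usize, 1), matches.len);
}
```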
[Post Script](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Post-Script)
Note that, while the exercise is useful, it deliberately focuses narrowly on a single aspect of Zig: comptime reflection. You should avoid this feature if possible; hopefully, I have successfully convinced you that it can lead to somewhat mind-bending code. The topic of Zig in general is much larger, and I highly recommend the following resources, in this order:
-
- March 18, 2025
-
🔗 MetaBrainz Schema change release: May 19, 2025 rss
MusicBrainz is announcing a new database schema change release set for May 19, 2025. Like most of our recent schema changes, it should have little or no impact on downstream users.
There is one change to a major replicated table worth mentioning upfront: the `medium` table will have a new `gid` column added. If you're running custom SQL queries against the database that join the `medium` table at all, there is a small chance you could run into errors like `ERROR: column reference "gid" is ambiguous` if you're not properly qualifying the columns being selected.

We're also altering some columns on the `artist_release` and `artist_release_group` tables (see below for more details). These are materialized tables used by our website on the back-end to speed up certain pages; you should normally not be accessing them directly, but it's worth mentioning just in case. These tables do exist on mirrors, but are only populated with data if you've run `admin/BuildMaterializedTables` before.

Besides introducing some new tables for storing medium attributes and replacing some functions/triggers, you generally shouldn't have to worry about any other breaking changes in this release.
Finally, here is the complete list of scheduled tickets:
Database schema
The following tickets change the database schema in some way.
- MBS-9253: List EP release groups above singles on artist pages. A small change to the `get_artist_release_group_rows` function is required in order to be able to change the sorting of release groups to prioritize EPs over singles. The function will be changed to depend on the type's `child_order` (which can be safely changed at any time) rather than its `id` for sorting. While this function exists on mirrors, the function change shouldn't have any impact on them directly (but a change of the `child_order` of the types will affect the sorting for display on mirrors as well). We'll be adding new triggers to the `release_group_primary_type` and `release_group_secondary_type` tables to run the function when the tables change; these triggers will also exist on mirrors.
- MBS-13322: Race condition when removing unused URLs. A rare internal error can occur in one of our trigger functions that cleans up unused URLs. We'll replace that function, `delete_unused_url`, updating it to avoid a "race condition" whereby a URL can become used again the moment before it's deleted. This will have no impact on mirrors, as `delete_unused_url` is only invoked by triggers that don't exist on mirrors.
- MBS-13464: Inconsistent sorting of artist release/release group titles. In the May 2021 schema change, we added some new materialized tables to significantly speed up the loading of artists' release and release group listings: the not-so-surprisingly named `artist_release` and `artist_release_group` tables. These work by efficiently indexing an artist's releases and release groups by date and other attributes, and then finally by their titles. However, for efficiency reasons, we originally decided to store only the first character of the titles for sorting. That predictably leads to incorrect sorting in certain cases, like with undated live bootlegs, as shown in MBS-13464. After measuring the actual size impact, we've decided to update the `artist_release` and `artist_release_group` tables to replace their `sort_character` columns with `name` columns that store the complete titles.
- MBS-13768: Add MBIDs to mediums. Adds a `gid` column to the `medium` table, and a new `medium_gid_redirect` table. It generates MBIDs for existing mediums, which will be replicated to mirrors.
- MBS-13832: Also support PDF files in CAA / EAA `index_listing` (for `is_front` purposes). PDF files are never treated as `front` for Cover Art Archive purposes, probably because they originally did not have PNG thumbnails generated by the Internet Archive. That changed quite a while ago, though, and there seems to be no reason to single them out anymore. We will just replace the `index_listing` views for `cover_art_archive` and `event_art_archive` with ones amended to not filter out PDF files.
- MBS-13964: Some recordings are missing a first release date. A bug was discovered that causes recordings to sometimes have incorrect first-release-date values if any of the releases they're attached to are merged with the "append" strategy. We'll be adding a new trigger to the `medium` table that updates `recording_first_release_date` properly when such merges occur. Note that since `recording_first_release_date` is a materialized table, this trigger will also run on mirrors; that way it's kept up to date even after running `admin/BuildMaterializedTables` initially.
- MBS-13965: Extend entity attribute schema to mediums. We will add the same tables for mediums as we already have for other entities that can potentially support entity attributes: `medium_attribute_type`, `medium_attribute_type_allowed_value` and `medium_attribute`. This will eventually allow us to support medium-level attributes such as per-medium catalog numbers and barcodes, colors for vinyl, etc. We will not be implementing the feature fully yet (at least not before the release editor has been migrated to React); this is just the schema change required to be able to implement it at a point of our choosing in the future. As such, the new tables will be added on mirrors but will be empty for quite a while.
Data corrections
- MBS-13966: Release group first release dates need to be recalculated. Another (unrelated) issue with "first release date" information, but this time with release groups rather than recordings. We've found that a small percentage of release groups' first release dates (as stored in the `release_group_meta` table and returned in the web service) are wrong. We won't be making any schema changes to address this, but will run a script to rebuild the incorrect data.
Search indexes
Data corrections to the `recording_first_release_date` and `release_group_meta` tables do affect indexed recording and release group data respectively. If you have live search indexing enabled, those changes should be propagated to the search indexes automatically. Otherwise, you will have to perform a full reindex of those entities' search indexes.

We'll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below or on the relevant above-linked tickets.
-
🔗 astral-sh/uv 0.6.8 release
Release Notes
Enhancements
- Add support for enabling all groups by default with `default-groups = "all"` (#12289)
- Add simpler `--managed-python` and `--no-managed-python` flags for toggling Python preferences (#12246)
Performance
- Avoid allocations for default cache keys (#12063)
Bug fixes
- Allow local version mismatches when validating lockfile (#12285)
- Allow owned string when deserializing `requires-python` (#12278)
- Make cache errors non-fatal in `Planner::build` (#12281)
uv 0.6.8
Install uv 0.6.8
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.8/uv-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.8/uv-installer.ps1 | iex"
Download uv 0.6.8
File | Platform | Checksum
---|---|---
uv-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum

uv-build 0.6.8
Install uv-build 0.6.8
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.8/uv-build-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.8/uv-build-installer.ps1 | iex"
Download uv-build 0.6.8
File | Platform | Checksum
---|---|---
uv-build-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-build-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-build-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-build-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-build-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-build-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-build-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-build-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-build-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-build-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-build-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-build-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-build-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-build-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-build-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-build-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-build-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum - Add support for enabling all groups by default with
-
🔗 The Pragmatic Engineer Survey: What’s in your tech stack? rss
We want to capture an accurate snapshot of software engineering, today - and need your help! Tell us about your tech stack and get early access to the final report, plus extra analysis
We'd like to know what tools, languages, frameworks and platforms you are using today. Which tools/frameworks/languages are popular and why? Which ones do engineers love and dislike the most at this moment in time?
With more than 950,000 tech professionals subscribed to this newsletter, we have a unique opportunity to take the industry's pulse by finding out which tech stacks are typical - and which ones are less common.
So, we want to build a realistic picture of this - and share the findings in a special edition devoted to this big topic. But it's only possible with input from you.
We're asking for your help to answer the question: what's in your tech stack? To help, please fill out this survey all about it. Doing so should take only 5-15 minutes, covering the platform(s) you work on, the tooling you use, the custom tools you have built, and related topics.
The results will be published in a future edition of The Pragmatic Engineer. If you take part and fill out the survey, you will receive the full results early, plus some extra, exclusive analysis from myself and Elin.
This is the most ambitious survey we've run so far - and we very much appreciate your help. Previous research we did included a reality check on AI tooling and what GenZ software engineers really think. The results should reveal people's typical and atypical tooling choices across the tech industry. You may even get inspiration for new and different tools, languages, and approaches to try out.
We plan to publish the findings in May.
-
🔗 News Minimalist Israel strikes Gaza, ending ceasefire + 3 more stories rss
Today ChatGPT read 18446 top news stories. After removing previously covered events, there are 4 articles with a significance score over 5.9.
[6.4] Israel resumes attacks in Gaza, ending ceasefire —sandiegouniontribune.com
Israeli airstrikes across the Gaza Strip killed at least 404 Palestinians, breaking a ceasefire in place since January. This escalation threatens to fully reignite the war that has been ongoing for 17 months, with Prime Minister Netanyahu stating the military operation is open-ended.
Civilian casualties included women and children, and the attacks prompted evacuations in eastern Gaza. The Israeli military plans to expand operations beyond airstrikes, as tensions rose amid stalled negotiations for a second phase of the ceasefire aimed at releasing hostages.
The strikes were met with mass protests in Israel, criticizing Netanyahu's leadership during the hostage crisis. Local health officials report over 48,000 Palestinians have died since the conflict began, making this one of the deadliest days of the war.
[6.1] Webb telescope directly observes CO2 on exoplanets for the first time —dawn.com
The James Webb Space Telescope has directly observed carbon dioxide in planets outside our solar system for the first time. This was achieved in the HR 8799 system, which is 130 light years away and only 30 million years old.
Researchers used Webb’s coronagraph instruments to view the planets, marking a departure from the usual method of detecting exoplanets when they cross in front of their host star. This new approach allowed scientists to see the light emitted directly from the planets, providing new insights into their atmospheres and formation processes.
[6.1] Germany plans nearly one trillion euros in new debt —dw.com
Germany plans to vote on a bill that would allow it to take on nearly one trillion euros in new debt for military and infrastructure investments. This requires a constitutional change and is unprecedented in the Bundestag's history.
The proposed legislation would ease the country's strict debt limits, allowing both the federal government and states to borrow more. It includes provisions for military spending, infrastructure upgrades, and climate protection, with a total of €500 billion allocated over the next twelve years.
Critics, including the far-right Alternative for Germany and the Left Party, oppose the debt package. Economists warn that this could significantly increase Germany's national debt and impact financial stability in Europe, particularly for already indebted countries.
[6.0] Global trade reaches a record $33 trillion, driven by services —unctad.org
Global trade reached a record $33 trillion in 2024, increasing by 3.7% or $1.2 trillion, according to UNCTAD. Growth was primarily driven by services, which rose 9%.
Developing economies outperformed developed nations, with trade rising 4% overall. East and South Asia led this growth, while trade in Russia, South Africa, and Brazil remained sluggish. Developed economies saw flat trade for the year.
Though trade started stable in early 2025, increasing geoeconomic tensions and policy shifts suggest potential disruptions. Shipping indexes showing reduced demand indicate businesses are adjusting to the changing landscape.
Highly covered news with significance over 5.5
[5.5] US withdraws from Ukraine war crimes investigation center
(theguardian.com + 5)[5.5] US intensifies airstrikes against Yemen's Houthi rebels
(apnews.com + 151)[5.5] Syria attends Brussels donor conference for the first time
(news.yahoo.com + 12)Thanks for reading!
Get access to 2x more stories in the high-significance range (5+) with News Minimalist Premium.
— Vadim
-
🔗 astral-sh/uv 0.6.7 release
Release Notes
If encountering inconsistent wheel version errors, see #12254.
Python
- Add CPython 3.14.0a6
- Fix regression where extension modules would use the wrong `CXX` compiler on Linux
- Enable FTS3 enhanced query syntax for SQLite
See the `python-build-standalone` release notes for more details.

Enhancements
- Add support for `-c` constraints in `uv add` (#12209)
- Add support for `--global` default version in `uv python pin` (#12115)
- Always reinstall local source trees passed to `uv pip install` (#12176)
- Render token claims on publish permission error (#12135)
- Add pip-compatible `--group` flag to `uv pip install` and `uv pip compile` (#11686)
Preview features
- Avoid creating duplicate directory entries in built wheels (#12206)
- Allow overriding module names for editable builds (#12137)
Performance
- Avoid replicating core-metadata field on `File` struct (#12159)
Bug fixes
- Add `src` to default cache keys (#12062)
- Discard insufficient fork markers (#10682)
- Ensure `python pin --global` creates parent directories if missing (#12180)
- Fix GraalPy abi tag parsing and discovery (#12154)
- Remove extraneous script packages in `uv sync --script` (#12158)
- Remove redundant `activate.bat` output (#12160)
- Avoid subsequent index hint when no versions are available on the first index (#9332)
- Error on lockfiles with incoherent wheel versions (#12235)
Rust API
- Update `BaseClientBuild` to accept custom proxies (#12232)
Documentation
- Make testpypi index explicit in example snippet (#12148)
- Reverse and format the archived changelogs (#12099)
- Use consistent commas around i.e. and e.g. (#12157)
- Fix typos in MRE docs (#12198)
- Fix double space typo (#12171)
uv 0.6.7
Install uv 0.6.7
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.7/uv-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.7/uv-installer.ps1 | iex"
Download uv 0.6.7
File | Platform | Checksum
---|---|---
uv-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum

uv-build 0.6.7
Install uv-build 0.6.7
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.7/uv-build-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.7/uv-build-installer.ps1 | iex"
Download uv-build 0.6.7
File | Platform | Checksum
---|---|---
uv-build-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-build-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-build-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-build-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-build-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-build-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-build-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-build-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-build-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-build-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-build-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-build-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-build-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-build-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-build-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-build-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-build-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum -
🔗 Aider-AI/aider v0.77.2.dev release
set version to 0.77.2.dev
-
🔗 Aider-AI/aider v0.77.1 release
version bump to 0.77.1
-
🔗 Rust Blog Announcing Rust 1.85.1 rss
The Rust team has published a new point release of Rust, 1.85.1. Rust is a programming language that is empowering everyone to build reliable and efficient software.
If you have a previous version of Rust installed via rustup, getting Rust 1.85.1 is as easy as:
rustup update stable
If you don't have it already, you can get `rustup` from the appropriate page on our website.

What's in 1.85.1
Fixed combined doctest compilation
Due to a bug in the implementation, combined doctests did not work as intended in the stable 2024 Edition. Internal errors with feature stability caused rustdoc to automatically use its "unmerged" fallback method instead, like in previous editions.
Those errors are now fixed in 1.85.1, realizing the performance improvement of combined doctest compilation as intended! See the backport issue for more details, including the risk analysis of making this behavioral change in a point release.
Other fixes
1.85.1 also resolves a few regressions introduced in 1.85.0:
- Relax some `target_feature` checks when generating docs.
- Fix errors in `std::fs::rename` on Windows 1607.
- Downgrade bootstrap `cc` to fix custom targets.
- Skip submodule updates when building Rust from a source tarball.
Contributors to 1.85.1
Many people came together to create Rust 1.85.1. We couldn't have done it without all of you. Thanks!
-
🔗 Baby Steps Rust in 2025: Language interop and the extensible compiler rss
For many years, C has effectively been the "lingua franca" of the computing world. It's pretty hard to combine code from two different programming languages in the same process-unless one of them is C. The same could theoretically be true for Rust, but in practice there are a number of obstacles that make that harder than it needs to be. Building out silky smooth language interop should be a core goal of helping Rust to target foundational applications. I think the right way to do this is not by extending rustc with knowledge of other programming languages but rather by building on Rust's core premise of being an extensible language. By investing in building out an " extensible compiler" we can allow crate authors to create a plethora of ergonomic, efficient bridges between Rust and other languages.
We'll know we've succeeded when…
When it comes to interop…
- It is easy to create a Rust crate that can be invoked from other languages and across multiple environments (desktop, Android, iOS, etc). Rust tooling covers the full story from writing the code to publishing your library.
- It is easy1 to carve out parts of an existing codebase and replace them with Rust. It is particularly easy to integrate Rust into C/C++ codebases.
When it comes to extensibility…
- Rust is host to a wide variety of extensions, ranging from custom lints and diagnostics ("clippy as a regular library") to integration and interop (ORMs, languages) to static analysis and automated reasoning.
Lang interop: the least common denominator use case
In my head, I divide language interop into two core use cases. The first is what I call Least Common Denominator (LCD), where people would like to write one piece of code and then use it in a wide variety of environments. This might mean authoring a core SDK that can be invoked from many languages but it also covers writing a codebase that can be used from both Kotlin (Android) and Swift (iOS) or having a single piece of code usable for everything from servers to embedded systems. It might also be creating WebAssembly components for use in browsers or on edge providers.
What distinguishes the LCD use-case is two things. First, it is primarily unidirectional--calls mostly go from the other language to Rust. Second, you don't have to handle all of Rust. You really want to expose an API that is "simple enough" that it can be expressed reasonably idiomatically from many other languages. Examples of libraries supporting this use case today are uniffi and diplomat. This problem is not new, it's the same basic use case that WebAssembly components are targeting as well as old school things like COM and CORBA (in my view, though, each of those solutions is a bit too narrow for what we need).
When you dig in, the requirements for LCD get a bit more complicated. You want to start with simple types, yes, but quickly get people asking for the ability to make the generated wrapper from a given language more idiomatic. And you want to focus on calls into Rust, but you also need to support callbacks. In fact, to really integrate with other systems, you need generic facilities for things like logs, metrics, and I/O that can be mapped in different ways. For example, in a mobile environment, you don't necessarily want to use tokio to do an outgoing networking request. It is better to use the system libraries since they have special cases to account for the quirks of radio-based communication.
To really crack the LCD problem, you also have to solve a few other problems too:
- It needs to be easy to package up Rust code and upload it into the appropriate package managers for other languages. Think of a tool like maturin, which lets you bundle up Rust binaries as Python packages.
- For some use cases, download size is a very important constraint. Optimizing for size is hard right now, for a start. What's worse, your binary has to include code from the standard library, since we can't expect to find it on the device--and even if we could, we couldn't be sure it was ABI compatible with the one you built your code with.
Needed: the "serde" of language interop
Obviously, there's enough here to keep us going for a long time. I think the place to start is building out something akin to the "serde" of language interop: the serde package itself just defines the core trait for serialization and a derive. All of the format-specific details are factored out into other crates defined by a variety of people.
I'd like to see a universal set of conventions for defining the "generic API" that your Rust code follows and then a tool that extracts these conventions and hands them off to a backend to do the actual language-specific work. It's not essential, but I think this core dispatching tool should live in the rust-lang org. All the language-specific details, on the other hand, would live in crates.io as crates that can be created by anyone.
Lang interop: the "deep interop" use case
The second use case is what I call the deep interop problem. For this use case, people want to be able to go deep in a particular language. Often this is because their Rust program needs to invoke APIs implemented in that other language, but it can also be that they want to stub out some part of that other program and replace it with Rust. One common example that requires deep interop is embedded developers looking to invoke gnarly C/C++ header files supplied by vendors. Deep interop also arises when you have an older codebase, such as the Rust for Linux project attempting to integrate Rust into their kernel or companies looking to integrate Rust into their existing codebases, most commonly C++ or Java.
Some of the existing deep interop crates focus specifically on the use case of invoking APIs from the other language (e.g., bindgen and duchess) but most wind up supporting bidirectional interaction (e.g., pyo3, napi-rs, and neon). One interesting example is cxx, which supports bidirectional Rust-C++ interop, but does so in a rather opinionated way, encouraging you to make use of a subset of C++'s features that can be readily mapped (in this way, it's a bit of a hybrid of LCD and deep interop).
Interop with all languages is important. C and C++ are just more so.
I want to see smooth interop with all languages, but C and C++ are particularly important. This is because they have historically been the language of choice for foundational applications, and hence there is a lot of code that we need to integrate with. Integration with C today in Rust is, in my view, "ok" - most of what you need is there, but it's not as nicely integrated into the compiler or as accessible as it should be. Integration with C++ is a huge problem. I'm happy to see the Foundation's Rust-C++ Interoperability Initiative as well a projects like Google's crubit and of course the venerable cxx.
Needed: "the extensible compiler"
The traditional way to enable seamless interop with another language is to "bake it in" i.e., Kotlin has very smooth support for invoking Java code and Swift/Zig can natively build C and C++. I would prefer for Rust to take a different path, one I call the extensible compiler. The idea is to enable interop via, effectively, supercharged procedural macros that can integrate with the compiler to supply type information, generate shims and glue code, and generally manage the details of making Rust "play nicely" with another language.
In some sense, this is the same thing we do today. All the crates I mentioned above leverage procedural macros and custom derives to do their job. But procedural macros today are the "simplest thing that could possibly work": tokens in, tokens out. Considering how simplistic they are, they've gotten us remarkably far, but they also have distinct limitations. Error messages generated by the compiler are not expressed in terms of the macro input but rather the Rust code that gets generated, which can be really confusing; macros are not able to access type information or communicate information between macro invocations; macros cannot generate code on demand, as it is needed, which means that we spend time compiling code we might not need but also that we cannot integrate with monomorphization. And so forth.
I think we should integrate procedural macros more deeply into the compiler.2 I'd like macros that can inspect types, that can generate code in response to monomorphization, that can influence diagnostics3 and lints, and maybe even customize things like method dispatch rules. That will allow all people to author crates that provide awesome interop with all those languages, but it will also help people write crates for all kinds of other things. To get a sense for what I'm talking about, check out F#'s type providers and what they can do.
The challenge here will be figuring out how to keep the stabilization surface area as small as possible. Whenever possible I would look for ways to have macros communicate by generating ordinary Rust code, perhaps with some small tweaks. Imagine macros that generate things like a "virtual function", that has an ordinary Rust signature but where the body for a particular instance is constructed by a callback into the procedural macro during monomorphization. And what format should that body take? Ideally, it'd just be Rust code, so as to avoid introducing any new surface area.
Not needed: the Rust Evangelism Task Force
So, it turns out I'm a big fan of Rust. And, I ain't gonna lie, when I see a prominent project pick some other language, at least in a scenario where Rust would've done equally well, it makes me sad. And yet I also know that if every project were written in Rust, that would be so sad. I mean, who would we steal good ideas from?
I really like the idea of focusing our attention on making Rust work well with other languages , not on convincing people Rust is better 4. The easier it is to add Rust to a project, the more people will try it - and if Rust is truly a better fit for them, they'll use it more and more.
Conclusion: next steps
This post pitched out a north star where
- a single Rust library can be easily used across many languages and environments;
- Rust code can easily call and be called by functions in other languages;
- this is all implemented atop a rich procedural macro mechanism that lets plugins inspect type information, generate code on demand, and so forth.
How do we get there? I think there's some concrete next steps:
- Build out, adopt, or extend an easy system for producing "least common denominator" components that can be embedded in many contexts.
- Support the C++ interop initiatives at the Foundation and elsewhere. The wheels are turning: tmandry is the point-of-contact for project goal for that, and we recently held our first lang-team design meeting on the topic (this document is a great read, highly recommended!).
- Look for ways to extend proc macro capabilities and explore what it would take to invoke them from other phases of the compiler besides just the very beginning.
- An aside: I also think we should extend rustc to support compiling proc macros to web-assembly and use that by default. That would allow for strong sandboxing and deterministic execution and also easier caching to support faster build times.
1. Well, as easy as it can be. ↩︎
2. Rust's incremental compilation system is pretty well suited to this vision. It works by executing an arbitrary function and then recording what bits of the program state that function looks at. The next time we run the compiler, we can see if those bits of state have changed to avoid re-running the function. The interesting thing is that this function could just as well be part of a procedural macro; it doesn't have to be built into the compiler. ↩︎
3. Stuff like the `diagnostics` tool attribute namespace is super cool! More of this! ↩︎
4. I've always been fond of this article on Rust vs Go, "Why they're better together". ↩︎
-
- March 17, 2025
-
🔗 sacha chua :: living an awesome life Org Mode: Merge top-level items in an item list rss
I usually summarize Mastodon links, move them to my Emacs News Org file, and then categorize them. Today I accidentally categorized the links while they were still in my Mastodon buffer, so I had two lists with categories. I wanted to write some Emacs Lisp to merge sublists based on the top-level items. I could sort the list alphabetically with `C-c ^` (org-sort) and then delete the redundant top-level item lines, but it's fun to tinker with Emacs Lisp.

Example input:
- Topic A:
  - Item 1
  - Item 2
    - Item 2.1
- Topic B:
  - Item 3
- Topic A:
  - Item 4
    - Item 4.1
Example output:
- Topic B:
  - Item 3
- Topic A:
  - Item 1
  - Item 2
    - Item 2.1
  - Item 4
    - Item 4.1
The sorting doesn't particularly matter to me, but I want the things under Topic A to be combined. Someday it might be nice to recursively merge subitems too (ex: combining sublists if another "Topic A:" also has an "Item 2" with a subitem like "Item 2.2"), but I don't need that yet.
Anyway, we can parse the list with `org-list-to-lisp` (which can even delete the original list) and recreate it with `org-list-to-org`, so then it's a matter of transforming the data structure.

```elisp
(defun my-org-merge-list-entries-at-point ()
  "Merge entries in a nested Org Mode list at point that have the same top-level item text."
  (interactive)
  (save-excursion
    (let* ((list-indentation (save-excursion
                               (goto-char (caar (org-list-struct)))
                               (current-indentation)))
           (list-struct (org-list-to-lisp t))
           (merged-list (my-org-merge-list-entries list-struct)))
      (insert (org-ascii--indent-string (org-list-to-org merged-list) list-indentation)
              "\n"))))

(defun my-org-merge-list-entries (list-struct)
  "Merge an Org list based on its top-level headings."
  (cons (car list-struct)
        (mapcar (lambda (g)
                  (list (car g)
                        (let ((list-type (car (car (cdr (car (cdr g))))))
                              (entries (seq-mapcat #'cdar (mapcar #'cdr (cdr g)))))
                          (apply #'append (list list-type) entries nil))))
                (seq-group-by #'car (cdr list-struct)))))
```
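By the way, here's a quick sketch of that round trip on a tiny list, assuming point is on the first item in an Org buffer (the quoted structure below is the same shape the test cases use):

```elisp
;; A sketch, not from the post: with point on the first item of
;;   - Topic A:
;;     - Item 1
;; parse the list and render it back out.
(let ((struct (org-list-to-lisp)))
  ;; struct is shaped like (unordered ("Topic A:" (unordered ("Item 1"))))
  (org-list-to-org struct))
;; => roughly "- Topic A:\n  - Item 1"
```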
A couple of test cases:
```elisp
(ert-deftest my-org-merge-list-entries ()
  (should (equal (my-org-merge-list-entries
                  '(unordered ("Topic B:" (unordered ("Item 3")))))
                 '(unordered ("Topic B:" (unordered ("Item 3"))))))
  (should (equal (my-org-merge-list-entries
                  '(unordered
                    ("Topic B:" (unordered ("Item 3")))
                    ("Topic A:" (unordered ("Item 1")
                                           ("Item 2" (unordered ("Item 2.1")))))
                    ("Topic A:" (unordered ("Item 4" (unordered ("Item 4.1")))))))
                 '(unordered
                   ("Topic B:" (unordered ("Item 3")))
                   ("Topic A:" (unordered ("Item 1")
                                          ("Item 2" (unordered ("Item 2.1")))
                                          ("Item 4" (unordered ("Item 4.1")))))))))
```
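To use this, put point on the list and M-x my-org-merge-list-entries-at-point. If you want a key for it, here's a hypothetical binding (not from my actual config):

```elisp
;; Hypothetical convenience binding; C-c <letter> is reserved for users.
;; keymap-set needs Emacs 29+; use define-key on older Emacsen.
(with-eval-after-load 'org
  (keymap-set org-mode-map "C-c m" #'my-org-merge-list-entries-at-point))
```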
Updating my custom links to also export to Org
Because `org-list-to-org` uses the Org conversion process, I need to make sure that my custom link functions also export to Org as a format. For example, in Emacs News, I use a package: link to make it easy to link to packages in both Emacs and in exported HTML. When I first ran my code, the links got replaced with their URLs, which isn't what I wanted. Turned out that I needed to add a case handling exporting to `org` format, like this:

```elisp
(defun my-org-package-export (link description format &optional arg)
  (let* ((package-info (car (assoc-default (intern link) package-archive-contents)))
         (package-source (and package-info (package-desc-archive package-info)))
         (path (format (cond
                        ((null package-source) link)
                        ((string= package-source "gnu") "https://elpa.gnu.org/packages/%s.html")
                        ((string= package-source "melpa") "https://melpa.org/#/%s")
                        ((string= package-source "nongnu") "https://elpa.nongnu.org/nongnu/%s.html")
                        (t (error "Unknown package source")))
                       link))
         (desc (or description link)))
    (if package-source
        (cond
         ((eq format '11ty) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'html) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'wp) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'latex) (format "\\href{%s}{%s}" path desc))
         ((eq format 'texinfo) (format "@uref{%s,%s}" path desc))
         ((eq format 'ascii) (format "%s <%s>" desc path))
         ((eq format 'org) (org-link-make-string (concat "package:" link) description)) ;; added this line
         (t path))
      desc)))
```
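For context, an export function like this gets wired in with `org-link-set-parameters`. Here's a minimal sketch of the registration; the `:follow` handler below is just an illustrative stand-in, not the one from my real setup:

```elisp
;; Sketch: register the package: link type. The :follow handler
;; (jumping to the package's description buffer) is an illustrative
;; stand-in; only the :export handler is the function shown above.
(org-link-set-parameters
 "package"
 :follow (lambda (name _arg) (describe-package (intern name)))
 :export #'my-org-package-export)
```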
-
🔗 sacha chua :: living an awesome life 2025-03-17 Emacs news rss
- Upcoming events (iCal file, Org):
- M-x Research: TBA https://m-x-research.github.io/ Wed Mar 19 0900 America/Vancouver - 1100 America/Chicago - 1200 America/Toronto - 1600 Etc/GMT - 1700 Europe/Berlin - 2130 Asia/Kolkata – Thu Mar 20 0000 Asia/Singapore
- Emacs APAC: Emacs APAC meetup (virtual) https://emacs-apac.gitlab.io/announcements/ Sat Mar 22 0130 America/Vancouver - 0330 America/Chicago - 0430 America/Toronto - 0830 Etc/GMT - 0930 Europe/Berlin - 1400 Asia/Kolkata - 1630 Asia/Singapore
- EmacsSF (in person): coffee.el in SF https://www.meetup.com/emacs-sf/events/306610734/ Sat Mar 22 1100 America/Los_Angeles
- Emacs Berlin (hybrid, in English) https://emacs-berlin.org/ Wed Mar 26 1030 America/Vancouver - 1230 America/Chicago - 1330 America/Toronto - 1730 Etc/GMT - 1830 Europe/Berlin - 2300 Asia/Kolkata – Thu Mar 27 0130 Asia/Singapore
- Emacs 30:
- Beginner:
- Editing Files with Emacs (11:29)
- GNU Emacs is 40 years old (Tomáš Čech, in Czech) (23:23)
- What is Emacs - Chapter One - Sakhil (in Tamil) (01:00:55)
- Emacs configuration:
- Emacs Lisp:
- New package raq.el: HTTP Library Adapter for Emacs, supports url.el and plz.el, and can be extended. (Reddit)
- compile-angel - Ensure all Elisp files are both Byte and Native-Compiled (Alternative to: auto-compile) - Release 1.0.6 (Reddit)
- tip about get-mru-window (most recently used)
- Some problems of modernizing Emacs (incomplete - slides 0 to 6 only) (20:39)
- Appearance:
- Tip about using consult-theme from the consult package to preview themes quickly
- Announcing Calle 24 (Reddit, Irreal) - replaces the default tool bar icons with icons from SF Symbols, a library of images provided by Apple
- tomorrow-night-deepblue-theme (Release 1.2.1): A deep blue Emacs theme, inspired by the Tomorrow Night theme
- My Unique Emacs Theme Pack – Now Available for Download! (Reddit) - screenshots are in the Details heading
- Navigation:
- Noether: global minor mode managing user-defined posframe Views (Reddit)
- easysession.el 1.1.3: Persist and restore Emacs sessions including frames, tab-bar, buffers, indirect buffers, Dired, and window splits (Reddit)
- The Emacs recursive narrow package #coding #programming (02:41)
- A tutorial on 2 Columns in Emacs #emacs #coding #programming (11:32)
- Writing:
- Hugo-Heagren/clever-cite: format quoted text as you kill/yank between buffers (@HugoHeagren@scholar.social)
- quick-sdcv.el (Release 1.0.1): Turn Emacs into an offline dictionary with sdcv (Reddit)
- Editing Overleaf Documents with Emacs | Valentin Boettcher (Reddit) - nice! live editing, might be useful for collaboration too
- Chung-hong Chan: Miscellaneous talk #45 Quarto guy, Overleaf’s markdown mode?, Flymake, Mailing lists, CRAN - ESS also
- Wherein I Explain Why Emacs Is The Best Tool For WordPress (Reddit)
- Meet Harper | A Grammarly Alternative for Neovim | Emacs, Obsidian, Zed, VScode and Helix (deez) (21:16)
- Org Mode:
- Using Emacs Org mode to manage my appointments
- Calendar.org (Reddit)
- Org-anizing My Fragrance Collection with Emacs – literatelisp.eu (@theesm@social.tchncs.de)
- Sacha Chua: Remove open Org Mode clock entries
- a template for writing OWL ontologies as Org Mode documents, with supporting functions and scripts
- org-expose-emphasis-markers: A new package used to automatically show hidden emphasis markers at point in org mode when `org-hide-emphasis-markers` is on. (Reddit)
- Seam: personal wiki system based on Org mode (@spnw@gts.plexwave.org)
- Using an Org-mode README on SourceHut (@tiang@mastodon.social)
- Denote:
- Coding:
- using hideshow minor mode in all programming modes so you can toggle code sections (@dotemacs@mastodon.xyz)
- Announcing Casual Make (Reddit, Irreal)
- eglot-inactive-regions - eglot extension to dim inactive preprocessor branches - release: 0.6.4 (Reddit)
- I found an easy way to make code comments appear in other mode's syntax
- A story of mystery involving LS, clangd and a very easy fix
- flymake-bashate.el (1.0.2) - A Flymake backend for bashate: Real-time style checking for Bash shell scripts
- Bozhidar Batsov: neocaml: a new Emacs package for OCaml programming
- swyddfa/esbonio.el: Integrating the esbonio language server into Emacs (@alcarney@fosstodon.org)
- Mail, news, and chat:
- AI:
- gptel 0.9.8 released (tool-use, support for "reasoning" output, dry-run options and more) (Reddit)
- gptel-aibo update: new complete at point
- GitHub - rajp152k/fabric-gpt.el: Fabric Prompts for emacs gpt.el
- Aidermacs v1.0 Released. Available Now on Melpa and Non-GNU Elpa! (Reddit) (also 0.5.0 discussion)
- James Dyer: Ollama-Buddy 0.8.0 - Added System Prompts, Model Info and simpler menu model assignment
- James Dyer: Ollama-Buddy 0.7.1 - Org-mode Chat, Parameter Control and JSON Debugging
- claude-code.el (Reddit)
- lizqwerscott/mcp.el: An MCP client inside Emacs (Reddit) - Model Context Protocol servers
- Introducing forge-llm: Generate PR descriptions automatically with LLMs in Emacs Forge
- Community:
- Other:
- Emacs development:
- emacs-devel:
- Improving the Tools menu
- Re: Semantic: update or remove?
- Re: Semantic: update or remove? - more thoughts on tree-sitter and the Emacs ecosystem
- Re: Markers in a gap array - sorted array with gap
- Make marking conflicted files as resolved upon saving opt-out
- ; Add NEWS entry for java-ts-mode-method-chaining-indent-offset
- ; etc/NEWS (remember-prefix-map): Suggest a key reserved to users.
- New project-save-some-buffers command
- Add a new command `speedbar-window'.
- dired-copy-filename-as-kill: Support project-relative names
- Make turn-on-flyspell/turn-off-flyspell obsolete
- Improve tramp-*-with-sudo commands
- ; * etc/NEWS: Announce the larger number of sub-processes on w32.
- New configure option --with-systemduserunitdir
- Allow control of indicating empty rectangular selections
- Turn 'remember-mode' into a minor mode
- diff-apply-buffer now considers the region and can reverse-apply.
- Fix capitalization ELisp -> Elisp
- Automatically document when setopt is needed
- New user option follow-mode-prefix-key
- Remove variable aliases obsolete since Emacs 23.2
- New user variable `exchange-point-and-mark-highlight-region`
- New packages:
- aider: Interact with Aider: AI pair programming made simple (MELPA)
- company-forge: Company backend for assignees and topics from forge (MELPA)
- denote-journal: Convenience functions for daily journaling with Denote (GNU ELPA)
- denote-markdown: Extensions that better integrate Denote with Markdown (GNU ELPA)
- denote-org: Denote extensions for Org mode (GNU ELPA)
- denote-sequence: Sequence notes or Folgezettel with Denote (GNU ELPA)
- denote-silo: Convenience functions for using Denote in multiple silos (GNU ELPA)
- indexed: Cache metadata on all Org files (MELPA)
- jira: Emacs Interface to Jira (MELPA)
- ob-aider: Org Babel functions for Aider.el integration (MELPA)
- org-expose-emphasis-markers: Automatically show hidden org emphasis markers (MELPA)
Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Mastodon #emacs, Bluesky #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!
-