

  1. A Gentle Introduction to Graph Neural Networks
  2. Marco Zocca - Reconstructing a music recommendation model
  3. Kubernetes Spec v1.32: Reference Guide and Documentation
  4. A deep dive into modern Windows Structured Exception Handler (SEH) ⚠️ | Elma
  5. Flightle

  1. December 30, 2024
    1. 🔗 ghostty-org/ghostty Ghostty Tip ("Nightly") release

      This tip release is automatically built and generated on every commit to main that passes tests.

      Warning

      This is a nightly build, not a tagged release. We recommend using tagged releases.
      You can find tagged releases on the Ghostty website.

      macOS Notes

      • Download Ghostty.dmg for everyday use. Every other build is for debugging.
      • The builds are all universal binaries that work on both Apple Silicon and Intel.
      • The -debug-slow build has safety checks and symbols to make debugging sometimes easier.
      • The -debug-fast build disables the slowest safety checks so that the debug build is more usable but is still around 15x slower than the release builds. This is meant for use when a debug build is too slow to reproduce an issue.
    2. 🔗 PyO3/maturin v1.8.1 release

      What's Changed

      • Update minimal manylinux version for riscv64 by @messense in #2415
      • Make invalid version info in pyproject.toml less fatal by @mhils in #2417
      • Make maturin develop fail if version info is missing in pyproject.toml by @mhils in #2418

      Full Changelog: v1.8.0...v1.8.1

    3. 🔗 matklad What Is a dependency? rss

      What Is a dependency? Dec 30, 2024

      For whatever reason, I've been thinking about dependencies lately:

      Today, I managed to capture crisply the principal components of a "dependency". A dependency is:

      • A checksum
      • A location
      • A name
      • A version

      [Checksum](https://matklad.github.io/2024/12/30/what-is-dependency.html#Checksum)

      Checksum is a cryptographic hash of the contents of a dependency. In many systems, checksums are treated as optional addons -- if you want to, you can additionally verify downloaded dependencies against pre-computed checksums. I think it is more useful to start with a checksum as a primary identity of a dependency, and build the rest of the system on top, as this gives the system many powerful properties. In particular, checksums force you to actually declare all dependencies, and they make it irrelevant where the dependency comes from.

      The checksum should be computed over a specific file tree, rather than over a compressed archive, to make sure that details of some random archive format do not leak into the definition of the hash. Curiously, it seems like it is possible to avoid hashing file system metadata, like permissions. The trick, as seen in Zig, is to set the executable bit based on the contents of the file (ELFs and hash-bangs get +x).
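      As a rough illustration of hashing a file tree rather than an archive, here is a minimal sketch. It is not Zig's actual algorithm: the field layout, the use of SHA-256, and the helper names (`looks_executable`, `tree_checksum`, `make_tree`) are assumptions made for this example. The executable flag is derived from file contents, so on-disk permissions never enter the hash.

```python
import hashlib
import os
import tempfile


def looks_executable(data: bytes) -> bool:
    """Derive the executable bit from content: ELF magic or a hash-bang."""
    return data.startswith(b"\x7fELF") or data.startswith(b"#!")


def tree_checksum(root: str) -> str:
    """Hash relative paths and file contents in sorted order, ignoring fs metadata."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            entries.append((rel, full))

    digest = hashlib.sha256()
    for rel, full in sorted(entries):
        with open(full, "rb") as f:
            data = f.read()
        flag = b"x" if looks_executable(data) else b"-"
        # Length-prefix every field so adjacent fields cannot blur together.
        for field in (rel.encode(), flag, data):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()


def make_tree(files: dict) -> str:
    """Hypothetical helper: materialize {relative path: bytes} in a temp directory."""
    root = tempfile.mkdtemp()
    for rel, data in files.items():
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    return root
```

      Two trees with the same paths and contents hash identically regardless of creation order or `chmod` state, while any change to a file's bytes changes the checksum.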

      If (direct) dependencies are specified via checksums, there's no need for lockfiles. Or, rather, the lock-file becomes a hash-tree structure, where the content hash of the root transitively pins down the rest of the hashes.
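      The hash-tree idea can be sketched in a few lines. This is an illustrative model only; the `node_hash` function and the byte strings standing in for package contents are assumptions of this example, not part of any real package manager.

```python
import hashlib


def node_hash(content: bytes, dep_hashes: list) -> str:
    """Hash a package's own content together with the hashes of its direct dependencies."""
    h = hashlib.sha256()
    h.update(len(content).to_bytes(8, "big"))
    h.update(content)
    for dep in sorted(dep_hashes):
        h.update(bytes.fromhex(dep))
    return h.hexdigest()


# A three-level chain: application -> http client -> compression library.
leaf = node_hash(b"compression library sources", [])
middle = node_hash(b"http client sources", [leaf])
root = node_hash(b"application sources", [middle])

# Changing only the leaf changes every hash above it, including the root.
patched_leaf = node_hash(b"compression library sources (patched)", [])
patched_middle = node_hash(b"http client sources", [patched_leaf])
patched_root = node_hash(b"application sources", [patched_middle])
```

      Because each hash covers the hashes beneath it, knowing `root` is enough to detect a change anywhere in the transitive closure, which is the role a lockfile normally plays.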

      [Location](https://matklad.github.io/2024/12/30/what-is-dependency.html#Location)

      Location is the suggested way to acquire a dependency. It is something that tells you how to get a file tree that matches the checksum you already know. Typically, a location is just a URL.

      If a dependency is identified via a checksum, then there might be several locations. In fact, common scenarios would have at least three or four:

      • The canonical URL from which the dependency can be downloaded, and which is considered the source of truth.
      • Global distributed content-addressable cache, which stores redundant copies of all dependencies in the ecosystem to provide availability.
      • Local on-disk cache of dependencies that are actually used on the machine.
      • Project local cache of dependencies that is a part of project's repository, to guarantee that dependencies are as available as the project itself.

      It doesn't matter where you got the dependency from, as long as the checksum matches. But it certainly helps to have at least one suggested place to search in.
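      A minimal sketch of "any location will do as long as the checksum matches" follows. For brevity the dependency is a single byte string rather than a file tree, and the three fetchers are stand-ins invented for this example (no real cache or network access is involved).

```python
import hashlib
from typing import Callable, Iterable, Optional

Fetcher = Callable[[], Optional[bytes]]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fetch_verified(checksum: str, locations: Iterable[Fetcher]) -> bytes:
    """Try each location in order; accept the first payload matching the checksum."""
    for fetch in locations:
        data = fetch()
        if data is not None and sha256_hex(data) == checksum:
            return data
    raise LookupError("no location produced content matching " + checksum)


payload = b"contents of the dependency"
wanted = sha256_hex(payload)

project_cache = lambda: None              # cache miss
local_cache = lambda: b"corrupted copy"   # present, but the bytes are wrong
canonical_url = lambda: payload           # stands in for a network download

result = fetch_verified(wanted, [project_cache, local_cache, canonical_url])
```

      The corrupted local copy is skipped silently because it fails verification, so the order of locations affects only speed and availability, never which bytes are accepted.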

      [Name & Version](https://matklad.github.io/2024/12/30/what-is-dependency.html#Name-Version)

      Name is a part of a dependency and is covered by the dependency's checksum. Name tells when two different dependencies (two different checksums) correspond to different versions of the same thing. If you have two dependencies called foo, you might want to look into deduplicating them. That is, keeping only one hash, and replacing all the references to the other one.

      Version is a specific rule about dependency substitutability. SemVer is a good option here: 1.2.0 can be substituted for 1.1.2, but 1.2.0 and 2.1.0 are not interchangeable.
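      The substitutability rule in the example above can be written down as a small predicate. This is a simplified sketch: it assumes plain `MAJOR.MINOR.PATCH` strings with a major version of at least 1, and it ignores pre-release tags and the special treatment of `0.x` versions. The function names are chosen for this example.

```python
def parse(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def can_substitute(candidate: str, required: str) -> bool:
    """True if `candidate` may stand in for `required`: same major, not older."""
    c, r = parse(candidate), parse(required)
    return c[0] == r[0] and c >= r
```

      With this rule, `can_substitute("1.2.0", "1.1.2")` holds, while neither `1.2.0` nor `2.1.0` can replace the other.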

  2. December 29, 2024
    1. 🔗 Confessions of a Code Addict Linux Context Switching Internals: Part 1 - Process State and Memory rss

      Before we dive into the article, I'd like to share something with you.

      A Note from the Author

      I started writing an article on the internals of context switching implementation in Linux and ended up writing almost 50 pages before I realized this is almost a book's worth of work.

      I am launching this as an early-release PDF book and a series of articles on this Substack. This first article is free, while the future ones will be for paid subscribers. The complete book may have 4-5 total chapters, totalling about 80-90 pages.

      The PDF book is in early-release with this 1st chapter, available at a 30% discount at the link below:

      Get E-book at 30% discount

      Discounts for Paid Subscribers

      As a thank you for supporting my work, I am offering you discounts:

      • Existing annual subscribers: 100% discount

      • Existing monthly subscribers (6+ months): 100% discount

      • New monthly subscribers: 50% discount

      • New annual subscribers: 100% discount

      I will send out a separate email to existing paid subscribers with the discount code. New paid subscribers, please reach out to me after upgrade.

      And, with that out of the way, let's get to the article!



      Introduction

      Context switching is necessary for a high-throughput and responsive system where all processes make progress despite limited execution resources. But, as we discussed in the previous article, it also has performance costs which cascade through various indirect mechanisms, such as cache and TLB eviction.

      When building performance-critical systems or debugging performance issues due to context switching, it becomes important to understand the internal implementation details to be able to reason through the performance issues and possibly mitigate them. Not only that, it leads you to learn many low-level details about the hardware architecture, and makes you realize why the kernel is so special.

      At first glance, context switching seems straightforward--save the current process's registers, switch page tables and stacks, and restore the new process's registers.

      However, the reality is much more complex, involving multiple data structures, hardware state management, and memory organization. To fully grasp context switching, we need to understand a few key foundational concepts about the Linux kernel and the X86-64 architecture. We'll explore these through a series of articles:

      1. Process Management Fundamentals (this article)

        1. Process representation through task_struct

        2. Virtual memory and address space management

        3. Key data structures for context switching

      2. User to Kernel Mode Transition

        1. System call and interrupt handling mechanics

        2. Hardware state changes

        3. Security considerations

      3. Timer Interrupt Handling

        1. What timer interrupts are

        2. How the Linux kernel uses timer interrupts to do process accounting

        3. Conditions for triggering a context switch

      4. Context Switching Implementation

        1. CPU register management

        2. Memory context switches

        3. Cache and TLB considerations

        4. Hardware vulnerability mitigations

      In this first article, we'll focus on two foundational aspects:

      1. How the kernel represents processes using task_struct and related structures

      2. How process address spaces are organized and managed

      Understanding these fundamentals is important because context switching is essentially the manipulation of these structures to safely transfer CPU control between processes. Let's begin!

      Cover art: Linux Context Switching Internals


      Processes in the Linux Kernel

      Before diving into the details, let's look at how the Linux kernel organizes process information. At its core, the kernel splits process management into two main concerns: execution state and memory state, managed through two key structures:

      Figure-1: The Linux kernel logically splits the representation of a process into two central data structures: task_struct which holds the execution state, and mm_struct which holds the memory state.

      As shown in the diagram, process management revolves around two main structures: task_struct for execution state and mm_struct for memory management. We will start by discussing the definition of task_struct and its role in representing the execution state of the process.

      Process State Management in Linux

      A process is a running instance of a program. While a program is simply a binary file containing executable code and data stored on disk, it becomes a process when the kernel executes it. Each execution creates a distinct process with its own execution state--consisting of memory contents and current instruction position. This distinction is important because as a process executes, its state continuously changes, making each instance unique even when running the same program.

      The Linux kernel's representation of processes reflects this dynamic nature. It defines a struct called task_struct which contains various fields to track the process state. Because task_struct is huge with dozens of fields, we'll focus on the ones most relevant to scheduling and context switching. The following figure shows a truncated definition with these essential fields.

      Figure-2: A partial definition of task_struct highlighting some of the key fields for storing a process's execution and scheduling related state. It also includes the X86-64 specific definition of thread_struct, which is used for storing the hardware specific CPU state of the process during context switches.

      Let's go over these fields one by one.

      Thread Information and State Flags

      The thread_info struct (shown below in figure-3) contains flag fields that track low-level state information about the process. While those flags track many different states, one particularly important flag for context switching is TIF_NEED_RESCHED.

      Figure-3: The definition of thread_info struct

      The kernel sets this flag in thread_info when it determines that a process has exhausted its CPU time quota and other runnable processes are waiting. Usually, this happens while handling the timer interrupt. The kernel's scheduler code defines the function shown in figure-4 to set this flag.

      Figure-4: The TIF_NEED_RESCHED flag is set by the kernel in thread_info when it decides it is time to context switch. The flag is set by calling the set_tsk_need_resched function defined in include/linux/sched.h

      This flag serves as a trigger--when the kernel is returning to user mode (after a system call or interrupt) and notices this flag is set, it initiates a context switch. The code snippet in figure-5 from kernel/entry/common.c shows how the kernel checks this flag and calls the scheduler's schedule function to do the context switch just before exiting to user mode.

      Figure-5: The kernel checks the TIF_NEED_RESCHED flag while returning to user mode and, if it is set, calls the scheduler's schedule() function to trigger a context switch
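      The set-then-check pattern described above can be modelled in a few lines of Python. This is a toy model, not kernel code: the bit position, the `ThreadInfo` class, and `exit_to_user_mode` are assumptions for illustration, and only the names TIF_NEED_RESCHED and set_tsk_need_resched come from the article.

```python
TIF_NEED_RESCHED = 1 << 3  # bit position chosen for illustration only


class ThreadInfo:
    """Toy stand-in for thread_info: just a flags word."""

    def __init__(self):
        self.flags = 0


def set_tsk_need_resched(ti: ThreadInfo) -> None:
    """What the timer path does once the task has used up its slice."""
    ti.flags |= TIF_NEED_RESCHED


def exit_to_user_mode(ti: ThreadInfo, schedule) -> None:
    """Check the flag just before returning to user mode; reschedule if it is set."""
    while ti.flags & TIF_NEED_RESCHED:
        ti.flags &= ~TIF_NEED_RESCHED  # in this model the flag is cleared before switching
        schedule()
```

      Setting the flag is cheap and can happen in interrupt context, while the expensive switch is deferred to a well-defined point on the way back to user mode.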

      With the low-level state tracking handled by thread_info, let's examine how the kernel tracks the broader execution state of a process.

      Process Execution States

      The __state field represents the execution state of a process in the kernel. At any given time, a process must be in one of these five states:

      1. Running/Runnable : The process is either actively executing on a CPU or is ready to run, waiting in the scheduler queue.

      2. Interruptible Sleep : The process has voluntarily entered a sleep state, typically waiting for some condition or event. In this state, it can be woken up by either the awaited condition or by signals. For example, a process waiting for a lock enters this state but can still respond to signals.

      3. Uninterruptible Sleep : Similar to interruptible sleep, but the process cannot be disturbed by signals. This state is typically used for critical I/O operations. For instance, when a process initiates a disk read, it enters this state until the data arrives.

      4. Stopped : The process enters this state when it receives certain signals (like SIGSTOP). This is commonly used in debugging scenarios or job control in shells.

      5. Zombie : This is a terminal state where the process has completed execution but its exit status and resources are yet to be collected by its parent process.

      These states are fundamental to the kernel's process management and directly influence scheduling decisions during context switches. For instance, if the currently executing process has exhausted its CPU time slice, but there are no other runnable processes, then the kernel will not do a context switch.
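      To make the last example concrete, here is a small model of the five states and the "no other runnable process, no switch" rule. The enum values and the `should_context_switch` function are invented for this sketch and do not mirror the kernel's actual constants or scheduler logic.

```python
from enum import Enum


class TaskState(Enum):
    RUNNING = "running/runnable"
    INTERRUPTIBLE = "interruptible sleep"
    UNINTERRUPTIBLE = "uninterruptible sleep"
    STOPPED = "stopped"
    ZOMBIE = "zombie"


def should_context_switch(slice_expired: bool, other_tasks: list) -> bool:
    """Switch only if the slice is used up and some other task is runnable."""
    if not slice_expired:
        return False
    return any(state is TaskState.RUNNING for state in other_tasks)
```

      For instance, `should_context_switch(True, [TaskState.INTERRUPTIBLE, TaskState.ZOMBIE])` is false: the current process keeps the CPU because nothing else can use it.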

      Process Kernel Stack

      The stack field in task_struct is a pointer to the base address of the kernel stack. The stack serves two fundamental purposes in the execution of code on the CPU:

      1. Function Local Variable Management : The stack automatically handles the lifetime of function-local variables. When a function is called, the stack grows to accommodate its local variables. Upon function return, the stack shrinks, automatically deallocating those variables.

      2. Register State Preservation : The stack provides a mechanism for saving and restoring register values. For instance, the X86 SysV ABI mandates that before making a function call, the caller must preserve certain register values on the stack. This allows the called function to freely use these registers, and the caller can restore their original values after the call returns.

      Every process maintains two distinct stacks:

      • User Mode Stack : Used during normal process execution when the CPU is running in process code. This stack holds the call chain of currently executing user-space functions.

      • Kernel Mode Stack : Used when the CPU executes kernel code in the process context, such as during system calls or interrupt handling. When transitioning from user to kernel mode, the CPU saves the user mode register values on this kernel stack (we'll cover this in detail when discussing mode transitions).

      The user mode stack is tracked inside the mm field which we will discuss in the next section.

      Memory Management State of the Process

      The mm field points to an mm_struct object--the kernel's representation of process virtual memory (or address space). This structure is central to memory management as it contains:

      • The address of the process page table

      • Mappings for various memory segments (stack, text, data, etc.)

      • Memory management metadata

      This data structure is one of the centerpieces of context switching as it is directly involved in the switching of address spaces. Because of this central role, we will discuss it in detail in the next section.

      CPU Time Tracking

      The task_struct tracks CPU usage through two fields: utime and stime. These fields record the total amount of time the process has spent executing on the CPU in user mode and kernel mode, respectively, since its creation.

      A process is in user mode when the CPU is executing the process code itself, whereas it is in kernel mode when the CPU is executing some kernel code on behalf of the process, which usually happens when the process executes a system call.
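      The user/kernel split is visible from user space too. The sketch below uses Python's standard `os.times()`, which reports the user and system CPU time accounted to the calling process; the workload (a compute loop followed by many small writes to the null device) is an arbitrary choice for this example.

```python
import os


def burn_user_cpu(n: int) -> int:
    """Pure computation: runs in user mode."""
    total = 0
    for i in range(n):
        total += i * i
    return total


before = os.times()

burn_user_cpu(200_000)

# Each flush issues a write system call, so this part runs kernel code on our behalf.
with open(os.devnull, "wb") as sink:
    for _ in range(1000):
        sink.write(b"x" * 4096)
        sink.flush()

after = os.times()

user_delta = after.user - before.user        # time spent in user mode
system_delta = after.system - before.system  # time spent in kernel mode
```

      The exact numbers depend on the machine and on timer granularity, so very short workloads may show a delta of zero.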

      Note that these fields are not what the scheduler uses to decide when to perform context switching. The scheduler-specific time tracking information is maintained inside the se field, which we discuss next.

      Scheduler Entity and Runtime Management

      Scheduling information is managed through the se field of task_struct, which contains a sched_entity structure. While most fields in this struct are deeply tied to the scheduler implementation, we'll focus on two critical fields that directly impact context switching:

      Figure-6: The definition of sched_entity struct which tracks the process's scheduler related state

      • vruntime : It tracks the process's virtual runtime during its current CPU slice. This is the amount of time the process has executed on the CPU, weighted by its priority. The weighting mechanism ensures high-priority processes accumulate vruntime more slowly, thereby receiving more CPU time. For example:

        • A high-priority process might have its runtime weighted by 0.5

        • A low-priority process might have its runtime weighted by 2.0

        • This means that for equal actual CPU time, the high-priority process accumulates half the vruntime

      • deadline : This field defines the virtual runtime limit for the current execution slice. When a process's vruntime exceeds this deadline, it becomes a candidate for context switching.

      The relationship between vruntime and deadline forms the core mechanism for CPU time allocation and context switch decisions.
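      The arithmetic behind the weighting example can be sketched as follows. The factors 0.5 and 2.0 are the illustrative values used above; the function names and the deadline value are assumptions of this example, and the real scheduler uses integer weights rather than these floating-point factors.

```python
def advance_vruntime(vruntime: float, delta_exec_ns: int, weight_factor: float) -> float:
    """Add weighted execution time: a smaller factor means slower vruntime growth."""
    return vruntime + delta_exec_ns * weight_factor


def exceeded_deadline(vruntime: float, deadline: float) -> bool:
    """A task whose vruntime passes its deadline becomes a candidate for switching."""
    return vruntime > deadline


ran_for_ns = 1_000_000  # both tasks actually run for 1 ms

high_prio = advance_vruntime(0.0, ran_for_ns, 0.5)  # 500_000.0
low_prio = advance_vruntime(0.0, ran_for_ns, 2.0)   # 2_000_000.0

deadline = 1_500_000.0  # hypothetical virtual deadline for this slice
```

      After the same 1 ms of real CPU time, only the low-priority task has crossed the deadline, so it is the one that becomes eligible to be switched out.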

      CPU State and Register Management

      Context switching requires saving the hardware state of the process, and different hardware architectures have different registers and other architectural details. As a result, task_struct includes the thread_struct field to manage any architecture specific state of the process. The X86-64 definition of thread_struct is shown in figure-7.

      Figure-7: The definition of thread_struct for X86-64

      Let's understand the role of the fields in thread_struct:

      Stack Pointer (sp) :

      • The sp field is used to save the value of the stack pointer register (RSP on x86-64) during context switching.

      • It comes in handy during the stack switching step of the context switch process. To switch stacks, the kernel saves the current process's RSP register value in its sp field and then loads the sp field value of the next process into RSP. (Don't worry if this doesn't make sense yet--we'll discuss this in detail in the final article of the series.)
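      The save-and-load step for the stack pointer can be modelled with plain objects. This is a conceptual sketch only: the `CPU`, `Task`, and `ThreadStruct` classes and the addresses are invented for illustration, and the real switch is done in assembly.

```python
class ThreadStruct:
    def __init__(self, sp: int = 0):
        self.sp = sp  # saved kernel stack pointer while the task is off the CPU


class Task:
    def __init__(self, name: str, sp: int = 0):
        self.name = name
        self.thread = ThreadStruct(sp)


class CPU:
    def __init__(self, rsp: int):
        self.rsp = rsp  # models the RSP register


def switch_stack(cpu: CPU, prev: Task, nxt: Task) -> None:
    """Save the outgoing task's stack pointer, then load the incoming task's."""
    prev.thread.sp = cpu.rsp
    cpu.rsp = nxt.thread.sp


cpu = CPU(rsp=0x1000)            # currently running task A on its kernel stack
task_a = Task("A")
task_b = Task("B", sp=0x2000)    # B was switched out earlier with this saved sp

switch_stack(cpu, task_a, task_b)
```

      After the call, the CPU is on B's kernel stack and A's position is remembered in its own thread_struct, ready to be restored the next time A is scheduled.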

      Segment Registers (es, ds, fsbase, gsbase) :

      • The remaining fields in the thread_struct object are to save the kernel mode values of the segment registers (es, ds, fs, and gs) during context switching. Even though, in X86-64, segmented memory is no longer in use, these registers are still there and the kernel needs to save them for compatibility.

      • The fs and gs registers are important to understand here. They are used for implementing thread-local storage (TLS) in user space code. This is achieved by storing a base address in the fs (or gs) register. All thread-local objects are stored at an offset from this base address. These objects are accessed by reading the base address, and adding the object's offset to find the final virtual memory address.

      • The kernel also uses these registers in a similar fashion to implement percpu variables, which allow the kernel to track the state of processes executing simultaneously on different processors. For instance, every processor has a percpu pointer variable pointing to the currently executing process on that CPU. On whichever CPU the kernel executes, it can safely manipulate that process's state.

      • For these reasons, it is important to save and restore these segment registers during context switches.


      Key Points About Process Representation

      Before moving on to process memory management, let's summarize the key aspects of how Linux represents processes:

      1. Process State Management

        1. Each process is represented by a task_struct

        2. Low-level state flags are tracked in thread_info

        3. Process execution states (running, sleeping, etc.) are managed via __state

      2. Execution Context

        1. Process maintains separate user and kernel stacks

        2. CPU state is preserved in thread_struct, which has an architecture specific definition

      3. Scheduling Information

        1. Virtual runtime (vruntime) tracks weighted CPU usage

        2. Deadline determines when context switches occur

      These components work together to enable the kernel to track process execution state, make scheduling decisions, and perform context switches. However, a process needs more than just CPU state to execute--it needs memory for its code, data, and runtime information. This brings us to the second major aspect of process management: the address space.


      The Address Space of a Process

      Before diving into mm_struct, let's understand how operating systems organize process memory. They implement memory virtualization to support concurrent process execution. Each process operates with the illusion of having access to the entire processor memory, while in reality, the kernel allocates only a portion of physical memory to it.

      This is made possible using paging. The virtual memory of a process consists of pages (typically 4 KiB in size). Any address that the process tries to read or write is a virtual address which is contained in one of these pages.

      However, in order to perform those reads and writes, the CPU needs to know the physical addresses in the main memory where the data is actually stored. For that, the hardware translates the virtual address into a physical address. It does this via the page table, which maps the virtual memory pages to physical pages.

      The following diagram shows an example of a 2-level page table, but note that for X86-64, the Linux kernel uses a 4 (or sometimes 5) level page table to be able to address large amounts of physical memory.

      Figure-8: A two-level page table showing how virtual pages are mapped to physical pages in the main memory
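      A two-level walk like the one in the figure can be sketched with nested dictionaries. The 10/10/12 bit split (a classic 32-bit layout), the dictionary representation, and the frame numbers are assumptions for this example; real X86-64 page tables have more levels and are walked by hardware.

```python
PAGE_SHIFT = 12                       # 4 KiB pages: low 12 bits are the offset
INDEX_BITS = 10                       # 10 bits per level in this toy layout
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def translate(page_directory: dict, vaddr: int) -> int:
    """Walk directory -> page table -> frame and return the physical address."""
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    offset = vaddr & OFFSET_MASK

    page_table = page_directory.get(dir_index)
    if page_table is None or table_index not in page_table:
        raise LookupError("page fault: no mapping for 0x%x" % vaddr)

    frame = page_table[table_index]   # physical frame number
    return (frame << PAGE_SHIFT) | offset


# Directory entry 1 -> page table whose entry 2 maps to physical frame 0x1234.
page_directory = {1: {2: 0x1234}}
vaddr = (1 << 22) | (2 << 12) | 0xABC

paddr = translate(page_directory, vaddr)  # 0x1234ABC
```

      Only the page number is translated; the offset within the page is carried over unchanged, which is why pages and frames must be the same size.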

      User and Kernel Page Table

      Now, it turns out that a process may have two page tables, one for user mode execution and another for kernel mode. This is similar to how it has a user mode stack and a kernel stack. During a transition from user to kernel mode, both the stack and the page tables need to be switched to their kernel counterparts.

      The separate kernel page table was introduced as a security measure following the Spectre and Meltdown vulnerabilities. These attacks showed that malicious user-space processes could potentially read kernel memory through speculative execution side channels. By maintaining separate page tables, the kernel memory becomes completely isolated from the user space, effectively mitigating such vulnerabilities.

      Segments of a Process's Memory

      Beyond page tables, a process needs its code and data mapped into its address space before it can execute. This mapping follows a specific layout where different types of data occupy distinct memory regions called segments. The following diagram shows a detailed view of the memory layout of a process.

      Figure-9: The memory layout of a process showcasing various segments.

      Each segment is created by mapping a set of pages for it, and then loading the corresponding piece of data from the program's executable file. Let's go over some of the key segments and understand their role in the process's execution.

      The Stack Segment

      This is the user mode stack of the process, used for the automatic memory management of function local variables, and saving/restoring registers when needed.

      Unlike the kernel mode stack, the user space stack does not have a fixed size and can grow up to a limit. Growing the stack involves mapping more pages for it.

      While the stack manages function execution context, processes also need memory for dynamic allocation, for which the heap segment is used.

      Dynamic Memory: Heap

      The heap area is used for dynamic memory allocation. Unlike most other segments in the process's memory, which are loaded with some kind of process data, the heap starts empty. But during its execution, the process may dynamically allocate memory using malloc or similar functions, and the allocated memory may come from this region.

      Beyond dynamic memory, processes need space for their static data, which leads us to the data segment.

      Static Data Segments

      Any program consists of two kinds of data that are part of the compiled binary: global variables and constants initialized to a non-zero value, and other data that is either initialized to zero or left uninitialized (and therefore defaults to zero).

      The non-zero initialized data is mapped into the data segment of the process, while the zero initialized (or uninitialized) data goes into the BSS segment.

      Code Segment (Text)

      Finally, the code segment is where the executable code of the program is loaded. The protection mode of the pages backing this segment is set to read-only and execute so that no one is able to modify the executable code once it is loaded into memory, as this can be a potential security risk.

      When multiple processes are executing the same code, typically, they share their text segment pages. It saves memory and is more efficient.


      With an understanding of these conceptual components, let's examine how the Linux kernel implements this memory organization through its mm_struct structure.

      Memory Management Implementation in Linux

      The kernel encapsulates all this memory management information in the mm_struct structure. This is also a very large struct, so we'll focus on the fields crucial for context switching. Figure-10 shows a truncated definition of mm_struct with only the fields we are interested in.

      Figure-10: A truncated definition of mm_struct focusing on the fields which are primarily involved in managing the state of the process's memory. Also shown is the definition of the mm_context_t struct which is an architecture specific struct. The shown definition of mm_context_t is for X86-64

      Let's discuss these fields, and see how they map to what we discussed about virtual memory.

      Page Table Directory (PGD)

      The pgd field in mm_struct points to the page table directory--the root of the process's page table hierarchy. The kernel keeps this pointer as a kernel virtual address and converts it to a physical address before handing it to the hardware, because the MMU walks the page tables using physical addresses during address translation.

      During context switches, this field is crucial as it enables the kernel to switch the CPU's active page table, effectively changing the virtual memory space from one process to another. The kernel loads this address into the CR3 register, which immediately changes the address translation context for all subsequent memory accesses.

      Memory Segment Boundaries

      We saw how the memory of a process is organized in the form of segments. The mm_struct object tracks the boundaries of these segments by maintaining their beginning and ending addresses.

      Stack Segment:

      start_stack is the base address of the user mode stack. Note that there is no field to track the end of the stack because unlike other segments, the stack can dynamically grow during the execution of the process. Also, the CPU tracks the address of the top of the stack using the RSP register, so it always knows the end of the stack.

      Code Segment:

      • start_code and end_code: These fields provide the bounds of the code (text) segment where the executable instructions of the program are loaded.

      Data Segment:

      • start_data and end_data: These fields form the bounds of the data segment.

      Architecture-Specific Memory State

      The context field (mm_context_t) handles architecture-specific memory state. For X86-64, it contains two key fields which are critical during context switches, as they help the kernel decide if a flush of the translation lookaside buffer (TLB) is needed or not.

      TLB is a small cache in the CPU core which caches the physical addresses of recently translated virtual addresses. The cache is crucial for performance because a page table walk for doing address translation is very expensive.

      The two fields in mm_context_t definition for X86-64 are:

      ctx_id

      Every process's address space is assigned two identifiers. One is a unique id stored in the ctx_id, and the other is a pseudo-unique id called address space identifier (ASID), which is a 12-bit id.

      The hardware uses the ASID to tag the entries in the TLB which allows storing entries for multiple processes without requiring a flush. The ASID value ensures that the hardware will not let one process access another process's physical memory. In the absence of the ASID mechanism, a TLB flush is mandatory across context switches.

      But the ASID is only 12 bits wide, so it can take at most 4096 (2^12) distinct values. As a result, the kernel has to recycle ASIDs when it runs out of available values. This means that while a process is switched out of the CPU, its ASID may get recycled and given to another process.

      When a process is being resumed as part of a context switch, the kernel uses its ctx_id to find the ASID value that was assigned to the process during the previous slice of execution. If that ASID has not been recycled, a TLB flush is not needed. But if the process has to use a new ASID, a TLB flush needs to be done.
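      The recycle-and-check logic can be modelled with a small table that remembers which ctx_id last owned each ASID. This is a simplified sketch: the `AsidAllocator` class, the round-robin recycling policy, and the tiny table size are assumptions chosen to make recycling easy to observe, not a description of the kernel's actual allocator.

```python
class AsidAllocator:
    """Toy per-CPU table: slot index is the ASID, value is the owning ctx_id."""

    def __init__(self, num_asids: int):
        self.slots = [None] * num_asids
        self.next_victim = 0  # next ASID to recycle, round-robin

    def switch_in(self, ctx_id: int):
        """Return (asid, need_flush) for the address space being resumed."""
        for asid, owner in enumerate(self.slots):
            if owner == ctx_id:
                return asid, False  # ASID still ours: cached TLB entries are valid

        # Not found: take over a slot, so entries tagged with that ASID must go.
        asid = self.next_victim
        self.next_victim = (self.next_victim + 1) % len(self.slots)
        self.slots[asid] = ctx_id
        return asid, True


allocator = AsidAllocator(num_asids=2)
first = allocator.switch_in(10)     # (0, True)  new address space
second = allocator.switch_in(11)    # (1, True)  new address space
resumed = allocator.switch_in(10)   # (0, False) ASID 0 was not recycled
third = allocator.switch_in(12)     # (0, True)  recycles ASID 0 from ctx 10
evicted = allocator.switch_in(10)   # (1, True)  ctx 10 lost its ASID, must flush
```

      The ctx_id never repeats, so comparing it against the table is a reliable way to tell whether an ASID still belongs to the address space being resumed.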

      tlb_gen

      The tlb_gen field is a generation counter used to track page table modifications. Each CPU maintains a per-CPU variable recording the tlb_gen value for its current/last used mm_struct.

      When the kernel modifies page table mappings (e.g., unmapping pages or changing permissions), it increments tlb_gen in the mm_struct. This indicates that any CPU's cached TLB entries for this address space might be stale.

      When a CPU switches to running a process, it compares its stored tlb_gen (if it has one for this mm_struct) with the mm_struct's current tlb_gen. If they differ, or if this CPU hasn't recently run this process, it flushes its TLB to ensure it doesn't use stale translations.
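
      The generation comparison can be sketched in the same toy style. The class and function names below (MmStruct, Cpu, switch_to, modify_page_tables) are hypothetical stand-ins rather than kernel identifiers, and a dictionary stands in for the per-CPU variable.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MmStruct:
    ctx_id: int
    tlb_gen: int = 1  # bumped whenever the page tables change


@dataclass
class Cpu:
    # Per-CPU record: ctx_id -> tlb_gen this CPU last caught up to.
    seen_gen: Dict[int, int] = field(default_factory=dict)

    def switch_to(self, mm: MmStruct) -> bool:
        """Return True if this CPU must flush before running mm."""
        need_flush = self.seen_gen.get(mm.ctx_id) != mm.tlb_gen
        self.seen_gen[mm.ctx_id] = mm.tlb_gen  # the CPU is now up to date
        return need_flush


def modify_page_tables(mm: MmStruct) -> None:
    mm.tlb_gen += 1  # e.g. after unmapping pages or changing permissions


mm = MmStruct(ctx_id=7)
cpu = Cpu()
first = cpu.switch_to(mm)   # this CPU has never run the process
second = cpu.switch_to(mm)  # generations match
modify_page_tables(mm)
third = cpu.switch_to(mm)   # cached translations may be stale
print(first, second, third)
```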

      _The explanation provided for these fields might not make a lot of sense yet. But don't worry, we will revisit it and discuss these in more detail in the last part of the series._


      This completes our discussion of how the Linux kernel represents process memory state. We've seen how mm_struct orchestrates virtual memory, from page tables to memory segments, and how it handles architecture-specific requirements for efficient memory access.

      Having examined both task_struct and mm_struct in detail, let's summarize what we've learned and look at what comes next.

      Putting It All Together

      We've covered the two fundamental data structures that the Linux kernel uses to represent processes and their execution state:

      1. task_struct : The process descriptor that contains:

        1. Execution state (running, sleeping, etc.)

        2. CPU time tracking (user and system time)

        3. Scheduling information (sched_entity)

        4. Architecture-specific thread state

        5. Kernel stack location

      2. mm_struct : The memory descriptor that manages:

        1. Page table information

        2. Memory segment locations

        3. TLB and cache management metadata

      These structures work together during context switches. For example:

      • The scheduler uses task_struct's sched_entity field to track the process's virtual runtime

      • The thread_info flag indicates when switching is needed

      • The thread_struct object contains the CPU specific state that needs saving

      • The mm_struct's context determines if TLB flushes are required

      Looking Ahead

      Understanding these data structures is crucial because they're at the heart of context switching operations. In our next article, we'll explore how the CPU transitions between user and kernel mode, specifically:

      • How system calls and interrupts trigger mode switches

      • The role of interrupt handlers

      • How the kernel saves user state

      • CPU protection mechanisms during transitions

      • Performance implications of mode switching

      We'll see how the fields we discussed in task_struct and mm_struct are actually used during these transitions, and how they enable the kernel to safely switch between different execution contexts.


      Additional Resources

      For readers interested in diving deeper:

      In our next article, we'll build on this foundation to understand the mechanics of user-to-kernel mode transitions.




      Support Confessions of a Code Addict

      If you find my work interesting and valuable, you can support me by opting for a paid subscription (it's $6 monthly/$60 annual). As a bonus you get access to monthly live sessions, and all the past recordings.

      Many people report failed payments, or don't want a recurring subscription. For that I also have a buymeacoffee page, where you can buy me coffees or become a member. I will upgrade you to a paid subscription for the equivalent duration here.

      Buy me a coffee

      I also have a GitHub Sponsor page. You will get a sponsorship badge, and also a complimentary paid subscription here.

      Sponsor me on GitHub

    2. 🔗 sacha chua :: living an awesome life Monthly review: November 2024 rss

      This month, I experimented with doing my daily drawings on a calendar grid. Since that meant I had a nice neat one-page summary of the month right there, I figured I might as well resume writing these monthly reviews.

      Most of my discretionary time was taken up by preparations for EmacsConf, which was a lot of fun. The main things were adding to our organizers notebook and figuring out our own BigBlueButton installation.

      We still had plenty of time to get outside to the playground, go for walks and bike adventures, go skating, and make wontons. On indoor days, we mostly played Minecraft, Ni No Kuni, and Supermarket Together. Now that her usual playgroup's shifting mostly indoors (and they tend to be pretty cough-y when they're outdoors), we've been going to the ice rink instead, and have even had a couple of playdates with new friends.

      A+ definitely craves more stimulation during virtual school. The teacher suggested micro:bit programming, and we've been having fun making simple programs to run on actual hardware. A+ has also been learning turtle graphics via Python programming, and she's quite proud of programming by typing instead of using blocks. She passed the first stage of the gifted identification process in the public school board, so we had a couple of meetings and I scrambled to do some research. I don't think it'll change much. The Toronto District School Board doesn't offer virtual placements for gifted students even if they do identify an exceptionality, and there probably isn't anything in their budget for extra resources for the self-contained virtual school they're setting up next year. Ah well. We're planning to take a very chill, non-tiger-parenting approach to the whole thing, and we'll just have to see how things work out.

      We set up a new desk for A+ near the window, which let her enjoy more sunlight during the day. In return, I got to have her old desk setup, so now I can sometimes get computer time with an extra monitor (at least when I'm not helping her stave off boredom).

      It was the last month before W- retired, so we squeezed in a few dental and medical appointments to take advantage of the remaining coverage. Now we get to figure out what our days could be like!

      Blog posts

      Sketches

      Time

      Category                    Previous month %  This month %  Diff %  h/wk  Diff h/wk
      A+                                      30.3          39.3     8.9  63.9       15.0
      Personal                                 8.3          10.9     2.6  17.8        4.3
      Discretionary - Play                     0.2           0.0    -0.2   0.0       -0.3
      Unpaid work                              3.5           2.9    -0.5   4.8       -0.9
      Discretionary - Family                   1.0           0.2    -0.8   0.3       -1.3
      Sleep                                   35.8          33.5    -2.3  54.6       -3.8
      Business                                 4.1           0.3    -3.7   0.5       -6.3
      Discretionary - Productive              16.9          12.9    -4.0  21.0       -6.7
    3. 🔗 3Blue1Brown (YouTube) Monge's Theorem rss

      Full video: https://youtu.be/piJkuavhV50

    4. 🔗 sacha chua :: living an awesome life Linking to Org Babel source in a comment, and making that always use file links rss

      I've been experimenting with these default header args for Org Babel source blocks.

      (setq org-babel-default-header-args
            '((:session . "none")
              (:results . "drawer replace")
              (:comments . "link")  ;; add a link to the original source
              (:exports . "both")
              (:cache . "no")
              (:eval . "never-export") ;; explicitly evaluate blocks instead of evaluating them during export
              (:hlines . "no")
              (:tangle . "no"))) ;; I have to explicitly set up blocks for tangling
      

      In particular, :comments link adds a comment before each source block with a link to the file it came from. This allows me to quickly jump to the actual definition. It also lets me use org-babel-detangle to copy changes back to my Org file.

      I also have a custom link type to make it easier to link to sections of my configuration file (Links to my config). Org Mode prompts for the link type to use when more than one function returns a link for storing, so that was interrupting my tangling with lots of interactive prompts. The following piece of advice ignores all the custom link types when tangling the link reference. That way, the link reference always uses the file: link instead of offering my custom link types.

      (advice-add #'org-babel-tangle--unbracketed-link
                  :around (lambda (old-fun &rest args)
                            (let (org-link-parameters)
                              (apply old-fun args))))
      
      This is part of my Emacs configuration.
    5. 🔗 navidrome/navidrome v0.54.3 release

      Changelog

      Bug fixes

      Build process updates

      • 0bebd39: build(ci): use the head commit sha in PR versions (@deluan)

      Other work

      Full Changelog : v0.54.2...v0.54.3

      Helping out

      This release is only possible thanks to the support of some awesome people!

      Want to be one of them?
      You can sponsor, pay me a Ko-fi, or contribute with code.

      Where to go next?

  3. December 28, 2024
    1. 🔗 News Minimalist New AI tool will identify diabetes risk 13 years early + 2 more stories rss

      The holidays have been remarkably quiet. In the last 10 days, there were only a few stories that AI found significant. Here they are:


      Today ChatGPT read 14156 top news stories. After removing previously covered events, there are 3 articles with a significance score over 5.7.

      [5.8] NHS to begin world-first trial of AI tool to identify type 2 diabetes risk (theguardian.com)

      The NHS in England is set to launch a trial of an AI tool designed to predict the risk of type 2 diabetes up to 13 years before it develops. This trial will take place in 2025 at two London hospital trusts.

      The AI tool, named Aire-DM, analyzes electrocardiogram (ECG) readings to detect subtle changes that indicate future diabetes risk. It has shown about 70% accuracy in predicting risk across diverse populations.

      Developed using data from 1.2 million ECGs, the tool aims to enable early interventions, potentially helping individuals avoid developing type 2 diabetes through lifestyle changes.

      [5.8] Italian energy company Eni launches €100 million supercomputer to boost oil and gas exploration (ft.com [$])

      Italian energy company Eni has launched HPC6, the world's most powerful supercomputer outside the US.

      The €100 million machine features nearly 14,000 AMD graphics processing units and ranks fifth among the world's fastest computers.

      It will analyze data to locate new oil and gas reservoirs and support clean energy research.

      [5.7] UN Security Council approves new peacekeeping mission in Somalia (reuters.com)

      The UN Security Council has approved a new peacekeeping mission in Somalia, called AUSSOM, set to begin on January 1, 2025. This mission will replace a larger African Union anti-terrorism operation.

      The change comes as Somalia's security has relied on foreign support since 2006, following Ethiopia's invasion. The European Union and the United States, major funders of AU forces, sought to reduce troop numbers due to financial concerns.

      The U.S. abstained from the vote due to these funding issues, while the other 14 council members supported the resolution. Negotiations for the new mission were reportedly complex.

      Highly covered news with significance over 5.4

      [5.4] OpenAI plans shift to public benefit corporation to attract investment
      (reuters.com + 22)

      [5.4] Mexico tests cellphone app allowing migrants to send alert if they are about to be detained in US
      (apnews.com + 8)

      [5.4] US Congress moves to ban new sales of Chinese-made drones over security concerns
      (apnews.com + 10)

      Thanks for reading!

      You can create your own personal feed like this with News Minimalist premium.

      Vadim



    2. 🔗 sacha chua :: living an awesome life EmacsConf 2024 notes rss

      [2024-12-28 Sat]: Added talk and Q&A count, added note about BBB max simultaneous users, added note about BBB, added thanks

      The videos have been uploaded, thank-you notes have been sent, and the kiddo has decided to play a little Minecraft on her own, so now I get to write some quick notes on EmacsConf 2024.

      Stats

      Talks 31
      Hours 10.7
      Q&A web conferences 21
      Hours 7.8
      • Saturday:
        • gen: 177 peak + 14 peak lowres
        • dev: 226 peak + 79 peak lowres
      • Sunday:
        • gen: 89 peak + 10 peak lowres

      Server configuration:

      meet 16GB 8core dedicated peak 409% CPU (100% is 1 CPU), average 69.4%
      front 32GB 8core shared peak 70.66% CPU (100% is 1 CPU)
      live 64GB 16core shared peak 552% CPU (100% is 1 CPU) average 144%
      res 46GB 12core peak 81.54% total CPU (100% is 12 CPUs; each OBS ~250%), mem 7GB used
      media 3GB 1core  

      YouTube livestream stats:

      Shift Peak Avg
      Gen Sat AM 46 28
      Gen Sat PM 24 16
      Dev Sat AM 15 7
      Dev Sat PM 20 12
      Gen Sun AM 28 17
      Gen Sun PM 26 18

      Timeline

      Call for proposals [2024-06-30 Sun]
      CFP deadline [2024-09-20 Fri]
      Speaker notifications [2024-09-27 Fri]
      Publish schedule [2024-10-25 Fri]
      Video target date [2024-11-08 Fri]
      EmacsConf [2024-12-07 Sat]-[2024-12-08 Sun]

      We did early acceptances again this year. That was nice. I wasn't sure about committing longer periods of time early in the scheduling process, so I usually tried to nudge people to plan a 20-minute video with the option of possibly doing more, and I okayed longer talks once we figured out what the schedule looked like.

      There were 82 days between the call for proposals and the CFP deadline, another 49 days from that to the video target date, and 29 days between the video target date and EmacsConf. It felt like there was a good amount of time for proposals and videos. Six videos came in before or on the target date. The rest trickled in afterwards, which was fine because we wanted to keep things low-pressure for the speakers. We had enough capacity to process and caption the videos as they came in.
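
      The day counts above can be checked against the timeline dates with a few lines of Python. The variable names are only for illustration; the dates are the ones listed in the timeline.

```python
from datetime import date

cfp_opens = date(2024, 6, 30)
cfp_deadline = date(2024, 9, 20)
video_target = date(2024, 11, 8)
conference_start = date(2024, 12, 7)

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
proposal_days = (cfp_deadline - cfp_opens).days
video_days = (video_target - cfp_deadline).days
final_days = (conference_start - video_target).days
print(proposal_days, video_days, final_days)  # 82 49 29
```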

      Data

      We continued to use an Org file to store the talk information. It would be great to add some validation functions:

      • Check permissions and ownership for files
      • Check case sensitivity for Q&A type detection
      • Check BBB redirect pages to make sure they exist
      • Check transcripts for ` because that messes up formatting; consider escaping for the wiki
      • Check files are public and readable
      • Check captioned by comment vs caption status vs captioner

      Speakers uploaded their files via PsiTransfer again. I didn't get around to setting up the FTP server. I should probably rename ftp-upload.emacsconf.org to upload.emacsconf.org so that people don't get confused.

      Communication

      As usual, we announced the EmacsConf call for proposals on emacs-tangents, Emacs News, emacsconf-discuss, emacsconf-org, and https://reddit.com/r/emacs. System Crafters, Irreal, and Emacs APAC mentioned it, and people also posted about EmacsConf on Mastodon, X, BlueSky, and Facebook. @len@toot.si suggested submitting EmacsConf to https://foss.events, so I did. There were some other EmacsConf-related discussions in r/emacs. 200ok and Ardeo organized an in-person meetup in Switzerland, and emacs.si got together in Ljubljana.

      For communicating with speakers and volunteers, I used lots of mail merge (emacsconf-mail.el). Most of the templates only needed a little tweaking from last year's code. I added a function to help me double-check delivery, since the batches that I tried to send via async sometimes ran into errors.

      Next time, I think it could be interesting to add more blog posts and Mastodon toots.

      Also, maybe it would be good to get in touch with podcasts like

      to give a heads up on EmacsConf before it happens and also let them know when videos are available.

      We continued to use Mumble for backstage coordination. It worked out well.

      Schedule

      The schedule worked out to two days of talks, with two tracks on the first day, and about 15-20 minutes between each talk. We were able to adapt to late submissions, last-minute cancellations, and last-minute switches from Q&A to live.

      We added an open mic session on Sunday to fill in the time from a last-minute cancellation. That worked out nicely and it might be a good idea to schedule in that time next year. It was also good to move some of the usual closing remarks earlier. We were able to wrap up in a timely manner, which was great for some hosts and participants because they didn't have to stay up so late.

      Sunday was single-track, so it was nice and relaxed. I was a little worried that people might get bored if the current talk wasn't relevant to their interests, but everyone managed just fine. I probably should have remembered that Emacs people are good at turning extra time into more configuration tweaks.

      Most of the scheduling was determined by people's time constraints, so I didn't worry too much about making the talks flow logically. I accidentally forgot to note down one speaker's time constraints, but he caught it when we e-mailed the draft schedule and I was able to move things around for a better time for him.

      There was a tiny bit of technical confusion because the automated schedule publishing on res had case-sensitive matching (case-fold-search was set to nil), so if a talk was set to "Live" Q&A, it didn't announce it as a live talk because it was looking for live. Whoops. I've added that configuration setting to my emacsconf-stream-config.el, so the ansible scripts should get it next time.
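
      The same class of bug can be avoided by normalizing case before comparing. A minimal Python sketch, with a hypothetical is_live_qa helper standing in for the Emacs Lisp matching code:

```python
def is_live_qa(qa_type: str) -> bool:
    """Treat "live", "Live", and "LIVE" the same way."""
    # casefold() is a more aggressive lower() intended for caseless matching.
    return qa_type.strip().casefold() == "live"


results = {value: is_live_qa(value) for value in ["live", "Live", "LIVE", "irc", "pad"]}
print(results)
```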

      I asked Leo and Corwin if they wanted to manually control the talks this year. They opted to leave it automatically managed by crontab so that they wouldn't have to worry as much about timekeeping. It worked reliably. Hooray for automation! The only scheduling hiccup was because I turned off the crontab so that we could do Saturday closing remarks when we wanted to and I forgot to reenable autopilot the next day. We noticed when the opening remarks didn't start right on the dot, and I got everything back on track.

      Like last year, I scheduled the dev track to start a little later than the gen track. That made for a less frantic morning. Also, this year we scheduled Sunday morning to start with more IRC Q&A instead of live Q&A. We didn't notice any bandwidth issues on Sunday morning this time.

      It would be nice to have Javascript countdowns in some kind of web interface to make it easier for hosts, especially if we can update it with the actual time the current video will end in MPV.

      I can also update the emacsconf-stream.el code to make it easier to automatically count down to the next talk or to a specific talk.

      We have Javascript showing local time on the individual talk pages, but it would be nice to localize the times on all the schedule/watch pages too.

      Most of my stuff (scheduling, publishing, etc.) is handled by automation with just a little bit of manual nudging every so often, so it might be possible to organize an event that's more friendly to Europe/APAC timezones.

      Recorded videos

      As usual, we strongly encouraged speakers to record videos to lower everyone's stress levels and allow for captioning by volunteers, so that's what most speakers did. We were able to handle a few last-minute submissions as well as a live talk. Getting videos also meant we could publish them as each talk went live, including automatically putting the videos and transcripts on the wiki.

      We didn't have obvious video encoding cut-offs, so re-encoding in a screen was a reliable way to avoid interruptions this year. Also, no one complained about tiny text or low resolution, so the talk preparation instructions seem to be working out.

      Automatically normalizing the audio with ffmpeg-normalize didn't work out, so Leo Vivier did a last-minute scramble to normalize the audio the day before the conference. Maybe that's something that volunteers can help with during the lead-up to the conference, or maybe I can finally figure out how to fit that into my process. I don't have much time or patience to listen to things, but it would be nice to get that sorted out early.

      Next year we can try remixing the audio to mono. One of the talks had some audio moving around, which was a little distracting. Also, some people listen to the talks in one ear, so it would be good to drop things down to mono for them.

      We think 60fps videos stressed the res server a bit, resulting in dropped frames. Next year, we can downsample those to 30fps and add a note to the talk preparation instructions. The hosts also suggested looking into setting up streaming from each host's computer instead of using our shared VNC sessions.
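
      For the mono and 30fps ideas, a re-encoding step could be built around ffmpeg's fps video filter and its -ac (audio channels) option. The sketch below only constructs the command line. The function name and file names are hypothetical, and codec and quality options are left at ffmpeg's defaults, which a real pipeline would probably want to set explicitly.

```python
import shlex
from typing import List


def reencode_command(source: str, target: str, fps: int = 30, mono: bool = True) -> List[str]:
    """Build an ffmpeg command that resamples video to fps and optionally downmixes to mono."""
    command = ["ffmpeg", "-i", source, "-vf", f"fps={fps}"]
    if mono:
        command += ["-ac", "1"]  # one audio channel
    command.append(target)
    return command


command = reencode_command("talk-original.webm", "talk-30fps-mono.webm")
print(shlex.join(command))
```

      Passing the list to subprocess.run(command, check=True) would run it; returning a list rather than a string avoids shell-quoting problems with unusual file names.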

      There was some colour smearing and weirdness when we played some videos with mpv on res. Upgrading MPV to v0.38 fixed it.

      Some people requested dark mode (light text on dark background), so maybe we can experiment with recommending that next year.

      I did a last-minute change to the shell scripts to load resources from the cache directory instead of the assets/stream directory, but I didn't get all of the file references, so sometimes the test videos played or the introductions didn't have captions. On the plus side, I learned how to use j in MPV to reload a subtitle file.

      Sometimes we needed to play the videos manually. If we get the hang of starting MPV in a screen or tmux session, it might be easier for hosts to check how much time is left, or to restart a video at a specific point if needed. Leo said he'll work on figuring out the configuration and the Lua scripts.

      I uploaded all the videos to YouTube and scheduled them. That was nice because then I didn't have to keep updating things during the conference. It turns out that Toobnix also has a way to schedule uploads. I just need to upload a video as unlisted first, and then choose Scheduled from the visibility options. I wonder if peertube-cli can be extended to schedule things. Anyway, since I didn't know about that during the conference, I just used the emacsconf-publish-upload-talk function to upload videos.

      It was fun playing Interview with an Emacs Enthusiast in 2023 [Colorized] - YouTube at lunch. I put together some captions for it after the conference, so maybe we can play it with captions next year.

      Recorded introductions

      We record introductions so that hosts don't have to worry about how to say things on air. I should probably send the intro check e-mail earlier, maybe on the original video target date, even if speakers haven't submitted their videos yet. This will reduce the last-minute scramble to correct intros.

      When I switched the shell scripts to use the cache directory, I forgot to get it to do the intros from that directory as well, so some of the uncorrected intros were played.

      I forgot to copy the intro VTTs to the cache directory. This should be handled by the subed-record process for creating intros, so it'll be all sorted out next year.

      Captioning

      We used WhisperX for speech-to-text this year. It did a great job at preparing the first drafts of captions that our wonderful army of volunteer captioners could then edit. WhisperX's built-in voice activity detection cut down a lot on the hallucinations that OpenAI Whisper had during periods of silence in last year's captions, and there was only one instance of WhisperX missing a chunk of text from a speaker that I needed to manually fill in. I upgraded to a Lenovo P52 with 64GB RAM, so I was able to handle last-minute caption processing on my computer. It might be handy to have a smaller model ready for those last-minute requests, or have something ready to go for the commercial APIs.

      The timestamps were a little bit off. It was really helpful that speakers and volunteers used the backstage area to check video quality. I used Aeneas to re-align the text, but Aeneas was also confused by silences. I've added some code to subed so that I can realign regions of subtitles using Aeneas or WhisperX timestamps, and I also wrote some code to skim timestamps for easy verification.

      Anush V experimented with using machine learning for subtitle segmentation, so that might be something to explore going forward.

      BigBlueButton web conference

      This year we set up a new BigBlueButton web conferencing server. The server with our previous BigBlueButton instance had been donated by a defunct nonprofit, so it finally got removed on October 27. After investigating whether Jitsi or Galene might be a good fit for EmacsConf, we decided to continue with BigBlueButton. There were some concerns about non-free Mongo for BBB versions >= 2.3 and < 3, so I installed BBB 3.0. This was hard to get working in Docker on the existing res server. We decided it was worth spinning up an additional Linode virtual private server. It turned out that BBB refused to run on anything smaller than 8GB/4core, so I scaled up to that during testing, scaled back down to 1GB/1core in between, and scaled up to 16GB/8core dedicated during the conference.

      I'm still not 100% sure I set everything up correctly or that everything was stable. Maybe next year BBB 3.0 will be better-tested, someone more sysad-y can doublecheck the setup, or we can try Galene.

      One of the benefits of upgrading to BBB 3.0 was that we could use the smart layout feature to drag the webcam thumbnails to the side of the shared screen. This made shared screens much easier to read. I haven't automated this yet, but it was easy enough for us to do via the shared VNC session.

      On the plus side, it was pretty straightforward to use the Rails console to create all the rooms. We used moderator access codes to give all the speakers moderator access. Mysteriously, superadmins didn't automatically have moderator access to all the rooms even if they were logged in, so we needed to add host access by hand so that they could start the recordings.

      Since we self-hosted and were budgeting more for the full-scale node, I didn't feel comfortable scaling it up to production size until a few days before the conference. I sent the access codes with the check-in e-mails to give speakers time to try things out.

      Compared to last year's stats:

                                           2023  2024
      Max number of simultaneous users       62   107
      Max number of simultaneous meetings     6     7
      Max number of people in one meeting    27    25
      Total unique people                    84   102
      Total unique talking                   36    40

      (Max number of simultaneous users wasn't deduplicated, since we need that number for server load planning)

      Tech checks and hosting

      FlowyCoder did a great job getting everyone checked in, especially once I figured out the right checklist to use. We used people's emergency contact information a couple of times.

      Corwin and Leo were able to jump in and out of the different streams for hosting. Sometimes they were both in the same Q&A session, which made it more conversational especially when they were covering for technical issues. We had a couple of crashes even though the tech checks went fine, so that was weird. Maybe something's up with BBB 3.0 or how I set it up.

      Next time, we can consider asking speakers what kind of facilitation style they like. A chatty host? Someone who focuses on reading the questions and then gets out of the way? Speakers reading their own questions and the host focusing on timekeeping/troubleshooting?

      Streaming

      I experimented with setting up the live0 streaming node as a 64GB 32core dedicated CPU server, but that was overkill, so we went back down to 64GB 16core and it still didn't approach the CPU limits.

      The 480p stream seemed stable, hooray! I had set it up last year to automatically kick in as soon as I started streaming to Icecast, and that worked out. I think I changed a loop to be while true instead of making it try 5 times, so that probably helped.
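
      The difference between a bounded retry loop and a "while true" loop can be sketched in Python. Everything here is hypothetical (the function names and the simulated source that only becomes available on the seventh attempt), and the actual restreaming script may be structured differently. A real loop would also pause between attempts.

```python
import itertools
from typing import Callable, Optional


def keep_trying(start_stream: Callable[[], bool], max_attempts: Optional[int] = None) -> int:
    """Call start_stream until it succeeds and return the number of attempts.

    max_attempts=None behaves like "while true"; an integer gives up after that many tries.
    """
    attempts = itertools.count(1) if max_attempts is None else range(1, max_attempts + 1)
    for attempt in attempts:
        if start_stream():
            return attempt
    raise RuntimeError(f"gave up after {max_attempts} attempts")


def make_flaky_source(ready_after: int) -> Callable[[], bool]:
    """Simulate a source that only becomes available on the Nth attempt."""
    calls = {"n": 0}

    def start_stream() -> bool:
        calls["n"] += 1
        return calls["n"] >= ready_after

    return start_stream


unbounded = keep_trying(make_flaky_source(ready_after=7))
try:
    keep_trying(make_flaky_source(ready_after=7), max_attempts=5)
    bounded_gave_up = False
except RuntimeError:
    bounded_gave_up = True
print(unbounded, bounded_gave_up)
```

      With five tries the bounded version gives up before the source is ready, while the unbounded version connects on the seventh attempt.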

      I couldn't get Toobnix livestreaming to work this year. On the plus side, that meant that I could use OBS to directly stream to YouTube instead of trying to set up multicasting. I set up one YouTube livestreaming event for each shift and added the RTMP keys to our shift checklists so that I could update the settings before starting the stream. That was pretty straightforward.

      This year, I wrote a little randomizer function to display things on the countdown screen. At first I just dumped in https://www.gnu.org/fun/jokes/gnuemacs.acro.exp.en.html, but some of those were not quite what I was looking for. (… Probably should've read them all first!) Then I added random packages from GNU ELPA and NonGNU ELPA, and that was more fun. I might add MELPA next time too. The code for dumping random packages is probably worth putting into a different blog post, since it's the sort of thing people might like to add to their dashboards or screensavers.

      I ran into some C-s annoyances in screen even with flow control turned off, so it might be a good idea to switch to tmux instead of screen.

      Next year, I think it might be a good idea to make intro images for each talk. Then we can use that as the opening slide in BigBlueButton (unless they're already sharing something else) as well as a video thumbnail.

      Publishing

      The automated process for publishing talks and transcripts to the wiki occasionally needed nudging when someone else had committed a change to the wiki. I thought I had a git pull in there somewhere, but maybe I need to look at it some more.

      I forgot to switch the conference publishing phase and enable the inclusion of Etherpads, but fortunately Ihor noticed. I did some last-minute hacking to add them in, and then I remembered the variables I needed to set. Just need to add it to our process documentation.

      Etherpad

      We used Etherpad 1.9.7 to collect Q&A again this year. I didn't upgrade to Etherpad v2.x because I couldn't figure out how to get it running within the time I set aside for it, but maybe that's something for next year.

      I wrote some Elisp to copy the current ERC line (unwrapped) for easier pasting into Etherpad. That worked out really well, and it let me keep up with copying questions from IRC to the pad in between other bits of running around. (emacsconf-erc-copy in emacsconf-erc.el)

      Next year, I'll add pronouns and pronunciations to the Etherpad template so that hosts can refer to them easily.

      If I rejig the template to move the next/previous links so that notes can be added to the end, I might be able to use the Etherpad API to add text from IRC.

      IRC

      We remembered to give the libera.chat people a heads-up before the conference, so we didn't run into usage limits for https://chat.emacsconf.org. Yay!

      Aside from writing emacsconf-erc-copy (emacsconf-erc.el) to make it easier to add text from IRC to the Etherpad, I didn't tinker much with the IRC setup for this year. It continued to be a solid platform for discussion.

      I think a keyboard shortcut for inserting a talk's URL could be handy and should be pretty easy to add to my Embark keymap.

      Extracting the Q&A

      We sometimes forgot to start the recording for the Q&A until a few minutes into the talk. I considered extracting the Q&A recordings from the Icecast dump or YouTube stream recordings in order to get those first few minutes, but decided it wasn't worth it since people could generally figure out the answers.

      Getting the recordings off BigBlueButton was easier this year because I configured it with video as an additional processing format, so we could grab one file per session instead of combining the different streams with ffmpeg.

      I did a quick pass of the Q&A transcripts and chat logs to see if people mentioned anything that they might want to take out. I also copied IRC messages and the pads, and I copied over the answers from the transcripts using the new emacsconf-extract-subed-copy-section-text function.

      Audio mixing was uneven. It might be nice to figure out separate audio recordings just in case (#12302, bigbluebutton-dev). We ended up not tinkering with the audio for the Q&A, so next time, I can probably upload them without waiting to see if anyone wants to fiddle with the audio.

      Trimming the Q&A was pretty straightforward. I added a subed-crop-media-file function to subed so that I can trim files easily.
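
      For readers who want to do a similar trim outside Emacs, here is a rough sketch of the ffmpeg invocation involved. It is not the subed-crop-media-file implementation; the function names, the file names, and the choice of stream copy are assumptions made for illustration. It builds the command from start and end times in milliseconds and leaves running it to the caller.

```python
def ms_to_timestamp(ms: int) -> str:
    """Format milliseconds as HH:MM:SS.mmm, a time format ffmpeg accepts."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_trim_command(src: str, dest: str, start_ms: int, end_ms: int) -> list[str]:
    """Return an ffmpeg command that keeps only start_ms..end_ms of src.

    "-c copy" avoids re-encoding, so the cut lands on nearby keyframes
    rather than at the exact millisecond.
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be after start_ms")
    return [
        "ffmpeg", "-i", src,
        "-ss", ms_to_timestamp(start_ms),
        "-to", ms_to_timestamp(end_ms),
        "-c", "copy",
        dest,
    ]


# Hypothetical file names and times.
command = build_trim_command("qa-raw.webm", "qa-trimmed.webm", 95_500, 1_830_000)
print(" ".join(command))
```

      Passing the list to subprocess.run(command, check=True) would run the trim; dropping "-c copy" trades speed for frame-accurate cuts.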

      Thanks to my completion functions for adding section headings based on comments, it was easy to index the Q&A this year. I didn't even put it up backstage for people to work on.

      Nudged by @ctietze, I'm experimenting with adding sticky videos if JavaScript is enabled so that it's easier to navigate using the transcript. There's still a bit of tinkering to do, but it's a start.

      I added some conference-related variables to a .dir-locals.el file so that I can more easily update things even for past conferences. This is mostly related to publishing the captions on the wiki pages, which I do with Emacs Lisp.

      Budget and donations

      The total hosting cost for the conference was USD 42.92 + tax and the BBB testing in the lead-up to the conference was USD 3.11 + tax, so a total of USD 46.03 + tax. The web node and the livestreaming node are kept as 1GB nanodes the rest of the year (USD 5 per month x 2 servers + tax, so USD 110). Very manageable.

      The Free Software Foundation also provided media.emacsconf.org for serving media files. Ry P provided res.emacsconf.org for OBS streaming over VNC sessions.

      Amin Bandali was away during the conference weekend and no one else knew how to get the list of donors and current donation stats from the FSF Working Together program on short notice. Next time, we can get that sorted out beforehand so that we can thank donors properly.

      Documentation and time

      I think my biggest challenge was having less time to prepare for EmacsConf this year because the kiddo wanted more of my attention. In many ways, the automation that I'd been gradually building up paid off. We were able to pull together EmacsConf even though I had limited focus time.

      Here's my Emacs-related time data (including Emacs News and tweaking my config):

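      Year | Jan | Feb | March | April | May | June | July | Aug | Sept | Oct | Nov | Dec | Total
      ---|---|---|---|---|---|---|---|---|---|---|---|---|---
      2023 | 23.4 | 15.9 | 16.2 | 11.2 | 4.4 | 11.5 | 6.5 | 13.3 | 36.6 | 86.6 | 93.2 | 113.0 | 432
      2024 | 71.2 | 12.0 | 5.6 | 6.6 | 3.3 | 9.6 | 11.0 | 4.7 | 36.0 | 40.3 | 52.3 | 67.7 | 320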

      (and here's a longer-term analysis going back to 2012.)

      I spent 92.6 hours total in October and November 2024 doing Emacs-related things, compared to 179.8 hours the previous year – so, around half the time. Part of the 2023 total was related to preparing my presentation for EmacsConf, so I was much more familiar with my scripts then. Apparently, there was still a lot more that I needed to document. As I scrambled to get EmacsConf sorted out, I captured quick tasks/notes for the things I need to add to our organizers notebook. Now I get to go through all those notes in my inbox. Maybe next year will be even smoother.
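
      The October and November comparison can be checked directly against the monthly figures. The snippet below is just a worked version of that arithmetic using the numbers from the table; the variable and function names are made up for the example.

```python
MONTHS = ["Jan", "Feb", "March", "April", "May", "June",
          "July", "Aug", "Sept", "Oct", "Nov", "Dec"]

# Monthly Emacs-related hours, copied from the table.
HOURS = {
    2023: [23.4, 15.9, 16.2, 11.2, 4.4, 11.5, 6.5, 13.3, 36.6, 86.6, 93.2, 113.0],
    2024: [71.2, 12.0, 5.6, 6.6, 3.3, 9.6, 11.0, 4.7, 36.0, 40.3, 52.3, 67.7],
}


def hours_for(year: int, months: list[str]) -> float:
    """Sum the hours for the given months of one year, rounded to 0.1 h."""
    row = dict(zip(MONTHS, HOURS[year]))
    return round(sum(row[m] for m in months), 1)


prep_2023 = hours_for(2023, ["Oct", "Nov"])
prep_2024 = hours_for(2024, ["Oct", "Nov"])
ratio = prep_2024 / prep_2023

print(prep_2023, prep_2024, f"{ratio:.0%}")  # 179.8 92.6 52%
```

      The same helper reproduces the yearly totals when given all twelve months.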

      On the plus side, all the process-related improvements meant that the other volunteers could jump in pretty much whenever they wanted, including during the conference itself. I didn't want to impose firm commitments on people or bug them too much by e-mail, so we kept things very chill in terms of scheduling and planning. If people were available, we had stuff people could help with. If people were busy, that was fine, we could manage. This was nice, especially when I applied the same sort of chill approach to myself.

      I'd like to eventually get to the point of being able to mostly follow my checklists and notes from the start of the conference planning process to the end. I've been moving notes from year-specific organizer notebooks to the main organizers' notebook. I plan to keep that one as the main file for notes and processes, and then to have specific dates and notes in the yearly ones.

      Thanks

      • Thank you to all the speakers, volunteers, and participants, and to all those other people in our lives who make it possible through time and support.
      • Thanks to Leo Vivier and Corwin Brust for hosting the sessions, and to FlowyCoder for checking people in.
      • Thanks to our proposal review volunteers James Howell, JC Helary, and others for helping with the early acceptance process.
      • Thanks to our captioning volunteers: Mark Lewin, Rodrigo Morales, Anush, annona, and James Howell, and some speakers who captioned their own talks.
      • Thanks to Leo Vivier for fiddling with the audio to get things nicely synced.
      • Thanks to volunteers who kept the mailing lists free from spam.
      • Thanks to Bhavin Gandhi, Christopher Howard, Joseph Turner, and screwlisp for quality-checking.
      • Thanks to shoshin for the music.
      • Thanks to Amin Bandali for help with infrastructure and communication.
      • Thanks to Ry P for the server that we're using for OBS streaming and for processing videos.
      • Thanks to the Free Software Foundation for Emacs itself, the mailing lists, the media.emacsconf.org server, and handling donations on our behalf through the FSF Working Together program. https://www.fsf.org/working-together/fund
      • Thanks to the many users and contributors and project teams that create all the awesome free software we use, especially: BigBlueButton, Etherpad, Icecast, OBS, TheLounge, libera.chat, ffmpeg, OpenAI Whisper, WhisperX, the aeneas forced alignment tool, PsiTransfer, subed, and many, many other tools and services we used to prepare and host this year's conference.
      • Thanks to everyone!

      Overall

      Good experience. Lots of fun. I'd love to do it again next year. EmacsConf feels like a nice, cozy get-together where people share the cool things they've been working on and thinking about. People had fun! They said:

      • "emacsconf is absolutely knocking it out of the park when it comes to conference logistics"
      • "I think this conference has defined the terms for a successful online conference."
      • "EmacsConf is one of the big highlights of my year every year. Thank you a ton for running this 😊"

      It's one of the highlights of my year too. =) Looking forward to the next one!

      In the meantime, y'all can stay connected via Emacs News, meetups (online and in person), Planet Emacslife, and now emacs.tv. Enjoy!

      p.s. I'd love to learn from other people's conference blog posts, EmacsConf or otherwise. I'm particularly interested in virtual conferences and how we can tinker with them to make them even better. I'm having a hard time finding posts; please feel free to send me links to ones you've liked or written!

  4. December 27, 2024
    1. 🔗 astral-sh/uv 0.5.13 release

      Release Notes

      Bug fixes

      • Avoid enforcing URL check on initial publish (#10182)
      • Fix incorrect mismatched constraints reference (#10184)
      • Revert "Update reqwest (#10178)" (#10187)

      Install uv 0.5.13

      Install prebuilt binaries via shell script

      curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.5.13/uv-installer.sh | sh
      

      Install prebuilt binaries via powershell script

      powershell -ExecutionPolicy ByPass -c "irm https://github.com/astral-sh/uv/releases/download/0.5.13/uv-installer.ps1 | iex"
      

      Download uv 0.5.13

      File | Platform | Checksum
      ---|---|---
      uv-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
      uv-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
      uv-i686-pc-windows-msvc.zip | x86 Windows | checksum
      uv-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
      uv-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
      uv-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
      uv-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
      uv-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
      uv-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
      uv-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
      uv-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
      uv-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
      uv-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
      uv-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
      uv-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
      uv-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum
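
      As a rough sketch of how a downloaded archive might be checked against its published checksum: the example below assumes the checksums are SHA-256 hex digests, and the function names are illustrative rather than part of uv. The demo hashes an in-memory stream so that it runs without downloading anything.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-256 with the expected hex digest (case-insensitive)."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle) == expected_hex.strip().lower()


# Demo on in-memory bytes instead of a real archive.
demo_digest = sha256_of_stream(io.BytesIO(b"hello"))
print(demo_digest)
```

      For a real download, file_matches_checksum would be called with the archive path and the digest copied from the matching checksum link.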

    2. 🔗 @malcat@infosec.exchange You'll soon be able to export mastodon

      You'll soon be able to export #malcat's view to files:
      โ— Summary report as HTML+ SVG
      โ— Proximity & call graph views as SVG or PNG
      โ— Struct/hex/disasm views as HTML
      โ— Strings, symbols, intel, kesakode and other views as CSV

    3. 🔗 sacha chua :: living an awesome life emacs.tv rss

      [2024-12-28 Sat]: I got emacstv-queue-random to fill the playlist with shuffled URLs, so it's all good now! =)

      I came across Ruby Video on Hacker News and thought it was a good idea, particularly the topic view. I mentioned it in a toot and that seemed to strike a chord in the #emacs community there, so I exported some of the metadata for EmacsConf videos into an Org Mode file. @xenodium whipped up a quick web prototype at emacs.tv. I added a bunch of videos from Emacs News and wrote some code for playing the videos from Emacs, and then grabbed more videos from YouTube playlists and Vimeo search results. (Gotta find a good way to monitor PeerTube…) As of this writing, there are 2785 videos with a combined playtime of more than 1000 hours.

      I am, in fact, listening to emacstv-background-mode as I write this. I was listening to it earlier while I played Minecraft with the kiddo. I'll probably shift some of my doomscrolling to shuffling through the emacs.tv web interface on my phone. I love hearing people's enthusiasm, and I occasionally pick up interesting tips along the way. (Gotta steal prot/window-single-toggle…)

      It's easy to use little crumbs of time to add more tags to the videos.org file. Sometimes I use org-agenda with buffer restriction (<) and search (s) to mark/unmark (m, u) so that I can bulk-tag (B +). To make this even more convenient, I've added emacstv-agenda-search, emacstv-org-ql-search, and emacstv-org-ql-search-untagged so that I can do that bulk tagging from anywhere.

      It would be nice to have mpv reuse the window. I wonder if I can queue up a number of videos instead of doing it one at a time, and if that would do the trick…
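
      One way the queueing idea could work, sketched outside Emacs: shuffle the URLs, write them to a playlist file with one URL per line, and start a single mpv process with its --playlist option, so that one player (and one window) handles the whole queue. This is not the emacstv code; the function name, playlist file name, and example URLs are made up for illustration.

```python
import random


def build_mpv_queue(urls, seed=None, playlist_path="emacstv-queue.m3u"):
    """Return (playlist_text, command) for playing shuffled URLs in one mpv process.

    The caller is expected to write playlist_text to playlist_path
    before running the command.
    """
    shuffled = list(urls)
    random.Random(seed).shuffle(shuffled)  # seed only makes the order repeatable
    playlist_text = "\n".join(shuffled) + "\n"
    command = ["mpv", f"--playlist={playlist_path}"]
    return playlist_text, command


# Hypothetical URLs standing in for entries from videos.org.
videos = [
    "https://example.org/video-1",
    "https://example.org/video-2",
    "https://example.org/video-3",
]
playlist_text, command = build_mpv_queue(videos, seed=42)
print(playlist_text, end="")
print(" ".join(command))
```

      Writing playlist_text to the file and passing command to subprocess.run would start playback.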

      Anyway, the web interface is at https://emacs.tv and the Elisp code and data are at https://github.com/emacstv/emacstv.github.io . Enjoy!