Tux Machines

LWN Articles About Linux (Kernel)

Posted by Roy Schestowitz on Jul 27, 2023

=> Security Leftovers | Android Leftovers

A Q&A about the realtime patches

In a session at the 2023 Real Time Linux Summit, Thomas Gleixner answered questions about the realtime feature of the kernel, its status, and the Real-Time Linux project's plans for the future. The talk was billed as a "Q&A about PREEMPT_RT" with a caveat: "anything except printk() and documentation". As might be guessed, the first two questions were on just those topics, but there were plenty of other questions (and answers) too. The summit was held in conjunction with the inaugural Embedded Open Source Summit in Prague, Czechia at the end of June.

Debian looks forward to 2038

=> ↺ Debian looks forward to 2038

The time_t rollover is still over 14 years away, so solving it might not appear to be an urgent problem. The fact that it does not affect most 64-bit systems may also encourage complacency. But there are affected 32-bit systems on the market now, and some of them are likely to still be operating in 2038; that is especially true of embedded systems, which might prove harder to fix as the date approaches. It would be far better if these systems were prepared for 2038 at deployment time; that means solving the problem as quickly as possible.

Work has been underway in the kernel for some years, and it is mostly free of year-2038 problems at this point. The GNU C Library (glibc) has also seen a considerable amount of work to allow it to handle both 32-bit and 64-bit time_t values, depending on how a program has been compiled. So the core of a Linux system is ready, but there is a lot more than that to the problem.

The proper time to split struct page

=> ↺ The proper time to split struct page

The page structure sits at the core of the kernel's memory-management subsystem; one such structure exists for every page of installed RAM. This structure is increasingly seen as a problem, though, and phasing it out is one of the many side projects associated with the folio conversion. One step in that direction is currently meeting some pushback from memory-management developers, though, who think that some of these changes are coming too soon.

The purpose of struct page is to allow the kernel to keep track of the status of each page — how it is being used, its position in a least-recently-used list, how many references to it exist, and more. The information needed varies considerably depending on how a given page is being used; a page of user-space anonymous memory is managed differently from, say, memory used for a kernel-space DMA buffer. Since page structures must be kept as small as possible — there are millions of them in a modern system, so every byte hurts — data must be stored as efficiently as possible. As a result, struct page is declared as a maze of nested unions, allowing the data fields for each usage type to be overlaid.

Stabilizing per-VMA locking

=> ↺ Stabilizing per-VMA locking

The kernel-development process routinely absorbs large changes to fundamental subsystems and still produces stable releases every nine or ten weeks. On occasion, though, the development community's luck runs out. The per-VMA locking work that went into the 6.4 release is a case in point; it looked like a well-tested change that improved page-fault scalability. There turned out to be a few demons hiding in that code, though, that made life difficult for early adopters of the 6.4 kernel.

The mmap_lock controls access to a process's address space. If a process has a lot of things going on at the same time — a large number of threads, for example — contention for the mmap_lock can become a serious performance problem. Handling page faults, which can involve changing page-table entries or expanding a virtual memory area (VMA), has traditionally required the mmap_lock, meaning that many threads generating page faults concurrently can slow a process down.

Developers have been working for years to address this problem, often pursuing one form or another of speculative page-fault handling, where the work is done without the protection of the lock. If contention occurred during the handling of a fault, the kernel would drop the work it had done before making it visible and start over. This work never got to the point of being merged, though. Eventually, Suren Baghdasaryan's per-VMA locking work was merged instead. Since a page fault happens within a specific VMA, handling it should really only require serialized access to that one VMA rather than to the entire address space. Per-VMA locking is, thus, a finer-grained version of mmap_lock that allows for more parallelism in the execution of specific tasks.

Extensible scheduler class rejected

=> ↺ Extensible scheduler class rejected

The extensible scheduler class enables the creation of CPU schedulers in BPF. After the fourth version of this series was greeted with relative silence, Tejun Heo asked about the status of this work...

=> gemini.tuxmachines.org

Proxy Information

Original URL: gemini://gemini.tuxmachines.org/n/2023/07/27/LWN_Articles_About_Linux_Kernel.gmi
Status Code: Success (20)
Meta: text/gemini;lang=en-GB
Capsule Response Time: 139.377678 milliseconds
Gemini-to-HTML Time: 0.853297 milliseconds

This content has been proxied by September (ba2dc).