“Wrap marker” Thunderbird Extension

Yay, time for a new Thunderbird extension. Wrap Marker.

The code is up on GitHub.

This Thunderbird extension adds a word wrap marker (also called “ruler”, depending on what editor you’re using) to the text area in the compose window when you’re editing plain text emails. In effect, a vertical line indicating that you’re close to the 72/76/80-character mark. (You can change the position in about:config. The default is 76.)

It works by changing the entire editor’s (think “iframe”) designMode from “on” to “off”, and adding a div with contenteditable=”true” instead. If this changes how your compose text area behaves, I’d consider that a bug, so please let me know.

At the time of this writing (February 26, 2018), this extension is still kind of beta and not exactly “thoroughly tested”. It will be submitted to Thunderbird’s extension page once it’s been tested some more and maybe once it’s gotten some of the known bugs fixed. These include:

  • Quoted text in a reply isn’t blue.
  • Your cursor position preference isn’t honored. The cursor will always be in the upper left corner when you start a new reply.
  • This feature is disabled for HTML emails. I don’t think it’ll ever work for HTML emails.
  • You get scrollbars all the time (This is probably fixable. Forgot to fix.)

Backporting security fixes to old versions of the Linux kernel (Meltdown to 2.6.18) (Part 1)

In this post, I’ll give a quick overview over what it takes to backport a large patch (the KAISER patch to protect against Meltdown) to the Linux kernel to a version of the Linux kernel from around ten years ago. Note that this post only covers the main technique and the assembly portion of the patch.

First of all, one should think hard about whether this necessary. Couldn’t you just run a newer kernel with older user space? The answer is, in most cases, yes, you could. As evidenced by our ability to run old Docker images with 10-year-old userland on modern kernels (perhaps adding vsyscall=emulate to the kernel command line), things often work just fine. However, you may run into problems if you’re running on bare metal. I’ve heard of people running a maintained 3.10 kernel on 10-year-old userland without much fuss. I’ve personally run a 64-bit kernel with 100% 32-bit userland (same kernel version, without X11).

However, some people may not be able to afford to re-test their whole setup with different kernel versions all the time, and that is why distributions usually backport pure security fixes from newer kernels to older kernels. The Linux kernel is constantly improved, and over time, the code base of the kernel version included in a specific stable version of a distribution, which may only get security fixes, tends to look pretty different from the current Linux kernel.

Now let’s pretend we have to backport a fix for the Meltdown vulnerability to Linux 2.6.18. First of all, we try very hard to come up with alternative ways to thwart this vulnerability. For 2.6.18, we come up empty-handed, but for earlier kernels, we may find the so-called 4G/4G patch.

This 4G/4G patch unfortunately never made it into the mainline kernel, but was adopted by Red Hat for inclusion in Red Hat Enterprise Linux up to version 4. So we could get our hands on a version of this patch for Linux 2.6.9, and perhaps forward-port this to 2.6.18. The patch at http://people.redhat.com/mingo/4g-patches/4g-2.6.6-B7 weighs in at around 4500 lines, and our foremost priority should be to find a patch with as few lines as possible.

The patch referenced in the original Meltdown paper weighs in at only 1000 lines, and is almost guaranteed to be very barebones. I’d say it would therefore make sense to attempt to backport this patch, and if we manage to do that, perhaps look at what the various distributors decided to do differently from what’s in this patch.

Before we start, it would probably make sense to find a couple sentences that describe what the patch is supposed to do. It’s more than likely that we came across various descriptions of the patch when we were looking for a barebones patch to base our work off from. LWN has a good introduction.

Preparations

We need the source tree of the target kernel version and the source kernel version extracted somewhere. The source kernel version can be had by doing:

$ git clone https://github.com/torvalds/linux.git
$ # cd / mv / etc.
$ git checkout v4.10-rc6

The target version in our case is over here: http://vault.centos.org/5.11/updates/SRPMS/kernel-2.6.18-419.el5.src.rpm. We need to extract this and apply all of the existing patches. I use a current version of Debian, and rpmbuild operates in ~/rpmbuild. So create this directory, and the directories, SRPMS, RPMS, SPECS, SOURCES, BUILD, and BUILDROOT below it. Move the .src.rpm into the SOURCES directory, and issue the following commands.

$ cd ~/rpmbuild/SOURCES
$ rpm2cpio * | cpio -idmv
$ mv kernel.spec ../SPEC
$ cd ../SPEC
$ rpmbuild --nodeps -bp kernel.spec

Make sure you didn’t get any errors in the last step. Your patched kernel, ready to build from, is now inside ~/rpmbuild/BUILD/.

We’ll be making a lot of use of grep and git blame to backport patches. I usually use less to browse code quickly, or open it in an editor (usually kate and/or sublime) when I think I’ll need the file for a longer time. I have two monitors, but having more would help. I also have a bunch of paper to scribble stuff on. When you have a lot of terminal windows open just for the grepping, compiling and other things, you’ll probably find that giving the editor

You’ll find that you’ll have to read up on four-level page tables while creating the patch. Depending on the way you work, you might as well do that before you dig in.

Here are a few more less tips:

  • You likely already know that you can search files by hitting ‘/’
    • You can use the arrow keys to browse through your search history
    • You can disable regex search by hitting Ctrl-R
    • You can type -N followed by return to display line numbers

For debugging, I use the venerable Bochs.

Digging in

arch/x86/entry/entry_64.S and arch/x86/entry/entry_64_compat.S

We have something in arch/x86/entry/entry_64.S and arch/x86/entry/entry_64_compat.S. Okay, we’re adding a few macros (SWITCH_KERNEL_CR3_NO_STACK, SWITCH_USER_CR3, SWITCH_KERNEL_CR3). These macros all seem to be close to a macro called SWAPGS or SWAPGS_UNSAFE_STACK. The presence of “UNSAFE_STACK” also dictates which SWITCH_CR3 macro we’re using. Though nothing may make sense yet, these are all important observations.

On the old kernel, this path doesn’t exist at all, but we have a promising-sounding arch/x86_64/ path.

~/src/kernel/el5/linux-2.6.18.4$ find arch/x86_64/ -name *entry*
arch/x86_64/kernel/entry.S
arch/x86_64/ia32/ia32entry.S

Opening arch/x86_64/kernel/entry.S, we see code that looks similar on the whole. SWAPGS doesn’t exist, but swapgs (as a pure assembly instruction) does. So let’s figure out what SWAPGS is about:

~/src/kernel/git$ grep -rn SWAPGS
...
arch/x86/include/asm/irqflags.h:122:#define SWAPGS      swapgs
...
arch/x86/include/asm/paravirt.h:908:#define SWAPGS                                                              \
        PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE,     \
                  call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs)          \
                 )
...

At this point, we might have a hunch that SWAPGS was introduced with the intention to make the same entry code work for both real hardware/real virtualization and paravirtualization, and this is sufficiently confirmed when we git blame the file a bit:

$ git blame arch/x86/entry/entry_64.S
...
72fe485854429 arch/x86/kernel/entry_64.S (Glauber de Oliveira Costa 2008-01-30 13:32:08 +0100  143)     SWAPGS_UNSAFE_STACK
...
$ git show 72fe485854429
commit 72fe4858544292ad64600765cb78bc02298c6b1c
Author: Glauber de Oliveira Costa <gcosta@redhat.com>
Date:   Wed Jan 30 13:32:08 2008 +0100

    x86: replace privileged instructions with paravirt macros
    
    The assembly code in entry_64.S issues a bunch of privileged instructions,
    like cli, sti, swapgs, and others. Paravirt guests are forbidden to do so,
    and we then replace them with macros that will do the right thing.
...

When looking at the above git blame, there are a lot of lines affecting SWAPGS with different commit hashes, but this one is the oldest. We should be able to transfer the macro calls to the lines adjacent to the swapgs instructions. Fortunately, the number of swapgs instructions and the number of SWAPGS macro calls are almost the same in both kernels. With just the names (SWITCH_KERNEL_CR3) of the macros we don’t really know if this switches the kernel CR3 to the user CR3 or the other way round, and when you look at code that was accepted upstream or in distributions, you might see that the macro names have become easier to understand. So let’s dig into the macros, which are declared in the newly #included asm/kaiser.h.

asm/kaiser.h

asm/kaiser.h consists of assembly code (#ifdef __ASSEMBLY__) and C code (#else).  Assembly code in the Linux kernel uses AT&T syntax, which means that the first operands are the sources and the second operands the destinations. The macros look pretty clean (i.e., they are mostly pure assembly code), except for the use of something called PER_CPU_VAR. Modern processors have more than one core, and these cores operate independently. One core might be executing user land, and another core might be in the kernel or about to do the entry into the kernel.

Unfortunately, when we grep for PER_CPU_VAR in the old kernel code, we come up empty-handed:

src/kernel/el5/linux-2.6.18.4$ grep -r PER_CPU_VAR .
src/kernel/el5/linux-2.6.18.4$

Note that a case-insensitive grep comes up with ia64-specific (as in Itanium) code. grepping for PER_CPU, on the other hand, yields a lot of results. Even the KAISER patch itself contains DECLARE_PER_CPU and DEFINE_PER_CPU statements. However, the older kernel doesn’t have DECLARE_PER_CPU_SECTION or DEFINE_PER_CPU_SECTION.

~/src/kernel/git$ grep -r PER_CPU_SECTION . | grep define
./include/linux/percpu-defs.h:#define DECLARE_PER_CPU_SECTION(type, name, sec)                  \
... (More matches in the same file)

Now, we do a chain of git blames until we find something that we consider useful:

git blame include/linux/percpu-defs.h
git show 7c756e6e19e71
git blame 7c756e6e19e71^ -- include/linux/percpu-defs.h # start blaming from one before 7c756e6e19e71; don't forget the '--'
git show 5028eaa97dd1d
# Looks like 5028eaa97dd1d creates the file for the first time, and the definitions used to be in include/asm-generic/percpu.h
git blame 5028eaa97dd1d^ -- include/asm-generic/percpu.h
git show 9b8de7479d0db
git blame 9b8de7479d0db^ -- include/linux/percpu.h
git show 0bd74fa8e29dc

At this point, we finally found the commit that first introduced DEFINE_PER_CPU_SECTION, but this still depends on DEFINE_PER_CPU_PAGE_ALIGNED, which isn’t available yet in 2.6.18. So the search continues:

git blame 0bd74fa8e29dc^ -- include/linux/percpu.h
git show 63cc8c7515646

This commit indicates that DEFINE_PER_CPU_PAGE_ALIGNED was introduced to avoid wasting memory. I don’t believe we really need to care about this. Let’s trace PER_CPU_VAR next:

grep -r PER_CPU_VAR . | grep define
git blame ./arch/x86/include/asm/percpu.h
git show dd17c8f72993f
git blame dd17c8f72993f^ -- arch/x86/include/asm/percpu.h
git show 3334052a321ac

This commit unifies the percpu_32.h and percpu_64.h files into a single header file, and indicates that PER_CPU_VAR only existed in the 32-bit code paths. Instead, the 64-bit code had this, which we grep straight away:

DECLARE_PER_CPU(struct x8664_pda, pda);

~/src/kernel/el5/linux-2.6.18.4$ grep -r x8664_pda
...
include/asm-x86_64/pda.h:11:struct x8664_pda {
...
~/src/kernel/el5/linux-2.6.18.4$ less -N include/asm-x86_64/pda.h
...
     10 /* Per processor datastructure. %gs points to it while the kernel runs */ 
     11 struct x8664_pda {
     12         struct task_struct *pcurrent;   /* Current process */
     13         unsigned long data_offset;      /* Per cpu data offset from linker address */
     14         unsigned long kernelstack;  /* top of kernel stack for current */ 
     15         unsigned long oldrsp;       /* user rsp for system call */
     16 #if DEBUG_STKSZ > EXCEPTION_STKSZ
     17         unsigned long debugstack;   /* #DB/#BP stack. */
     18 #endif
     19         int irqcount;               /* Irq nesting counter. Starts with -1 */   
     20         int cpunumber;              /* Logical CPU number */
     21         char *irqstackptr;      /* top of irqstack */
     22         int nodenumber;             /* number of current node */
     23         unsigned int __softirq_pending;
     24         unsigned int __nmi_count;       /* number of NMI on this CPUs */
     25         int mmu_state;     
     26         struct mm_struct *active_mm;
     27         unsigned apic_timer_irqs;
     28 } ____cacheline_aligned_in_smp;
...

Interesting, this is a per-processor data structure? pda.h doesn’t exist in modern kernels anymore, but some additional googling confirms that, yes, we should be able to use this. I ended up adding unsafe_stack_register_backup to this struct. Through some additional code searching we can find out how to access members of the PDA structure (for assembly, there’s a hint at the top: %gs points to the structure when we’re in kernel space).

The rest of asm/kaiser.h consists entirely of C function prototypes, which we can just copy over. At this point, we have successfully backported about 37% of the entire patch. I used this git blame technique to backport the entire patch. It’s a lot of work, and if you do not include the time it takes to read through the Meltdown papers and the news to get a good overview of what needs to be done, it took me about two to three weeks to get a still-broken patch that causes the system to panic around PID number 370, which is still long before you get to log in to the console. It still took well over a dozen rebuilds to get there.

KDE: Windows freeze or flicker but application doesn’t crash

I’m running KDE on two different systems, and one of them exhibits the following problem very often, and the other just did for the first time:

Windows stop updating their content, and perhaps flicker a bit. Switching to a different window and back causes the window contents to be updated, but still frozen. Which means that the application itself is not crashed.

The following command fixes this:

kwin --replace

You can run this from the run command prompt (Alt+F2) (also called Plasma search or krunner), or you could run it in a terminal. (You’d have to make sure the process doesn’t exit when you close the terminal though.)

If everything appears to be frozen and you can’t get to the run command prompt, you could still switch to a console, log in, and try running the following:

DISPLAY=:0 kwin --replace

Both systems have internal Intel graphics (quite different chipsets though) and KDE5.

The above commands will fix the problem for that time. Your open applications should not be affected by the change. I haven’t looked much into permanent fixes, but changing the rendering backend (System Settings → Display and Monitor → Compositor) may change the frequency the problem is triggered or maybe even get rid of it altogether. (I felt that OpenGL 2.0 probably triggered the problem fewer times than OpenGL 3.1.)

I’ve noticed a fair amount of traffic to my KDE-related posts. If you run into any weird KDE problems that you don’t know how to fix, feel free to leave a comment and ask.