How to find out if an executable uses (e.g.) SIMD instructions (includes jq mini-tutorial!)

“Embarrassingly parallel” algorithms can often make use of SIMD instructions like those that came with the SSE and AVX extensions. In the Python world, numpy is a very popular package to work with arrays. One of the first things I wondered when I started using numpy was, “How optimized is numpy?” Some quick investigation shows that it’s multi-threaded, and some googling shows that it uses SIMD instructions: https://stackoverflow.com/questions/17109410/how-can-i-check-if-my-installed-numpy-is-compiled-with-sse-sse2-instruction-set

Now, it’s a bit tedious to grep for strings like VADDPD in the disassembly, so this post develops a nicer method.
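For reference, the tedious way looks something like this (counting lines containing one AVX mnemonic; numpy’s path as used later in this post):

objdump -d -M intel /usr/lib/python2.7/dist-packages/numpy/core/*.so | grep -ci 'vaddpd'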

For the impatient, here’s an unorthodox dirty one-liner (it creates a temporary file) that does this for you. It requires jq and internet access to download a database.

tempfile=`mktemp`; curl https://raw.githubusercontent.com/asmjit/asmdb/488b6d986964627f0b130b5265722dde8d93f11d/x86data.js | cpp | sed -n '/^{/,/^}/ { p }' | jq '[ .instructions | .[] | { (.[0]): .[4] } ] | add' > $tempfile; objdump --no-show-raw-insn -M intel -d /usr/lib/python2.7/dist-packages/numpy/core/*.so | awk '{print $2}' | grep -v : | sort | uniq | while read line; do echo -n "$line  "; output=$(jq "with_entries(select(.key | match(\"(^$line\\/|\\/$line\$|$line\\/|^$line\$)\"))) | to_entries | .[] | .value" $tempfile); if [ -z "$output" ]; then echo; else echo $output; fi; done > output_test; rm $tempfile

Note that it is not able to distinguish between e.g. AVX and AVX512: because the merge below keeps only the last database entry for each mnemonic, it always prints out the most advanced extension possible, so it will print AVX512 even if only the AVX form of an instruction is used. If you want something better, check out the Node.js version at the bottom of this post.

And around this point we start the explanation for the less impatient readers: first of all, we need a database of CPU instructions, and a simple Google query brings up this: https://github.com/asmjit/asmdb (The following discussion is based on commit 488b6d986964627f0b130b5265722dde8d93f11d.)

This project is in JavaScript, and the data file isn’t quite in JSON, so let’s do some minor preprocessing first to make our database easier to use:

cpp x86data.js | sed -n '/^{/,/^}/ { p }' > json

cpp is the C preprocessor; we use it to remove comments (the actual data contains comments, even multi-line ones). The sed bit prints everything from the first line starting with a { up to and including the next line starting with a }, i.e., the whole data block.
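To check that the result actually parses as JSON, we can use jq itself (jq empty prints nothing and only sets the exit code):

jq empty json && echo "valid JSON"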

Next, we need to get a disassembly. Here’s an example for numpy’s .so files:

objdump --no-show-raw-insn -M intel -d /usr/lib/python2.7/dist-packages/numpy/core/*.so | grep -P "^ +[0-9a-z]+:" | awk '{print $2}' | sort | uniq > numpy_instructions

This will get us all instruction mnemonics used. We get a file like this:

adc
add
addpd
addps
addsd
addss
and
andnpd
andnps

Let’s go back to our data. Today, we’ll use jq as our main tool to get the job done (though it will be many times slower than if we wrote a simple script that loads the hash once and re-uses it for every input instruction). If we just want the instructions block, we could do this:

jq '.instructions' json > instructions

However, this tool is a real Swiss army knife. We can use the familiar concept of piping, and we can wrap things in arrays or hashes just by enclosing expressions in [] or {}. Here’s an entire command to get an array of hashes containing only the instruction and the corresponding extension from the json file:

jq '[ .instructions | .[] | {instruction: .[0], extension: .[4] } ]' json

.[] iterates over the array inside the instructions key. Every item in the array is piped to a bit of jq code that creates a hash with an instruction and an extension key, which correspond to array elements 0 and 4 in the input data. So we get output like this:

[
  {
    "instruction": "aaa",
    "extension": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W"
  },
  {
    "instruction": "aas",
    "extension": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W"
  },
  .
  .
  .
]

Now we’re going to do something slightly naughty. The extension field isn’t the same for all instructions with the same mnemonic, as different opcodes with the same mnemonics have been added to the instruction set over time. However, we don’t need to be that precise IMO, so we’re just going to merge everything into an object like {"mnemonic": "extension info"}. First, let’s get an array of hashes:

jq '[ .instructions | .[] | { (.[0]): .[4] } ]' json | head
[
  {
    "aaa": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W"
  },
  {
    "aas": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W"
  },
  {
    "aad": "X86 Deprecated   OF=U SF=W ZF=W AF=U PF=W CF=U"
  },
  .
  .
  .
]

Now we just need to pipe this into the add filter to merge this array of hashes/objects into a single hash/object:

jq '[ .instructions | .[] | { (.[0]): .[4] } ] | add' json > mnem2ext.json

And the result is:

{
  "aaa": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W",
  "aas": "X86 Deprecated   OF=U SF=U ZF=U AF=W PF=U CF=W",
  "aad": "X86 Deprecated   OF=U SF=W ZF=W AF=U PF=W CF=U",
  "aam": "X86 Deprecated   OF=U SF=W ZF=W AF=U PF=W CF=U",
  "adc": "X64              OF=W SF=W ZF=W AF=W PF=W CF=X",
  "add": "X64              OF=W SF=W ZF=W AF=W PF=W CF=W",
  "and": "X64              OF=0 SF=W ZF=W AF=U PF=W CF=0",
  "arpl": "X86 ZF=W",
  "bndcl": "MPX X64",
  ...
}

Wee! But how do we access the information in this file? Well, with jq of course (not efficient though):

while read line; do echo -n "$line  "; jq ".$line" mnem2ext.json; done < numpy_instructions

Here’s an extract from the output:

cvttpd2dq  "SSE2"
cvttps2dq  "SSE2"
cvttsd2si  "SSE2 X64"
cvttss2si  "SSE X64"
cwde  "ANY"
div  "X64              OF=U SF=U ZF=U AF=U PF=U CF=U"
divpd  "SSE2"
divps  "SSE"
divsd  "SSE2"
divss  "SSE"
fabs  "FPU              C0=U C1=0 C2=U C3=U"
fadd  "FPU              C0=U C1=W C2=U C3=U"

Such a nice mix of instructions. <3 We have a few problems though. Here are some instructions that couldn’t be resolved:

cmova  null
cmpneqss  null
ja  null
rep  null
seta  null
vcmplepd

A closer look at our database reveals that some keys contain slashes, like "cmova/cmovnbe". These are aliases, so we should be able to detect them as well. jq does allow searching for keys using a regex, though the syntax isn’t easy, and the bash escaping makes things a bit worse:

while read line; do echo -n "$line  "; jq "with_entries(select(.key | match(\"(^$line\\/|\\/$line\$|$line\\/|^$line\$)\")))" mnem2ext.json; done < numpy_instructions > output

Things have gotten a bit slower again, and the rest of our output looks a bit different too:

xor  {
  "xor": "X64              OF=0 SF=W ZF=W AF=U PF=W CF=0"
}
xorpd  {
  "xorpd": "SSE2"
}
xorps  {
  "xorps": "SSE"
}

We can’t get rid of the echo; otherwise we’d have no way of telling whether jq found the mnemonic or not. So we’ll use jq to fix the format instead. Here’s an easy example:

echo '{ "b": "c" }' | jq 'to_entries[]'
[
  {
    "key": "b",
    "value": "c"
  }
]
echo '{ "b": "c" }' | jq 'to_entries | .[] | .value'
"c"

Here, we’re converting the hash into an array of key/value entries (with_entries above uses the same representation internally), and then selecting only the .value fields. We can just pipe this within jq:

while read line; do echo -n "$line  "; jq "with_entries(select(.key | match(\"(^$line\\/|\\/$line\$|$line\\/|^$line\$)\"))) | to_entries | .[] | .value" mnem2ext.json; done < numpy_instructions > output

However, we don’t get a newline when we didn’t find an instruction, so we work around this in bash:

while read line; do echo -n "$line  "; output=$(jq "with_entries(select(.key | match(\"(^$line\\/|\\/$line\$|$line\\/|^$line\$)\"))) | to_entries | .[] | .value" mnem2ext.json); if [ -z "$output" ]; then echo; else echo $output; fi; done < numpy_instructions > output

That leaves mostly pseudo-instructions. The following pseudo-instructions are not included in this database but would indicate SSE2: CMPEQPD, CMPLTPD, CMPLEPD, CMPUNORDPD, CMPNEQPD, CMPNLTPD, CMPNLEPD, CMPORDPD. These all belong to the CMPPD instruction introduced in SSE2, as far as I can tell (https://www.felixcloutier.com/x86/CMPPD.html#tbl-3-2). It would make sense to add them to the database, but I think I’ll leave well enough alone for now.
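If you did want to handle them, one way would be to merge the pseudo-instruction mnemonics into the generated database yourself. A minimal sketch, mapping all of the PD variants listed above to SSE2 (my assumption, based on the CMPPD reasoning):

jq '. + { "cmpeqpd": "SSE2", "cmpltpd": "SSE2", "cmplepd": "SSE2", "cmpunordpd": "SSE2", "cmpneqpd": "SSE2", "cmpnltpd": "SSE2", "cmpnlepd": "SSE2", "cmpordpd": "SSE2" }' mnem2ext.json > mnem2ext_patched.json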

Anyway, doing something like awk '{print $2}' output | sed 's/"//g' | sort | uniq shows that my numpy version may use instructions from the following sets:

ANY
AVX
AVX2
AVX512_BW
AVX512_DQ
AVX512_F
CMOV
FPU
FPU_POP
FPU_PUSH
I486
MMX2
SSE
SSE2
SSE4_1
X64

Well, that’s great. Let’s package this up into a shell script so it’s a bit easier to use. Just stick it in a directory that contains cpu_extensions.min.json (our mnem2ext.json from above, renamed) and it’ll work.

#!/bin/bash

json_file=$(dirname "$0")/cpu_extensions.min.json
objdump --no-show-raw-insn -M intel -d "$@" | grep -P "^ +[0-9a-z]+:" | awk '{print $2}' | sort | uniq | while read line; do
    echo -n "$line  "
    output=$(jq "with_entries(select(.key | match(\"(^$line\\/|\\/$line\$|$line\\/|^$line\$)\"))) | to_entries | .[] | .value" "$json_file")
    if [ -z "$output" ]; then
        echo
    else
        echo $output | sed -e 's/"//g' -e 's/ .*//g'
    fi
done

Also, here’s a more efficient (O(n)) implementation in Node.js. It gets away with much less pre-processing; all you have to do is:

sed -n '/^{/,/^}/ { p }' x86data.js > cpu_extensions.json

However, it doesn’t execute objdump for you, so you have to call it like this:

show_cpu_extensions.js <(objdump --no-show-raw-insn -M intel -d /usr/lib/python2.7/dist-packages/numpy/core/*.so | grep -P "^ +[0-9a-z]+:" | awk '{print $2}' | sort | uniq)

I’ve also made it display all possible extensions.

#!/usr/bin/nodejs

var database_file;
var disassembly_file;

if (process.argv.length == 3) {
    // Use default database
    database_file = __dirname + "/cpu_extensions.json";
    disassembly_file = process.argv[2];
} else if (process.argv.length == 4) {
    database_file = process.argv[2];
    disassembly_file = process.argv[3];
} else {
    console.log("Usage: " + process.argv[1] + " [database] disassembly");
    console.log(process.argv);
    process.exit(1);
}

var fs = require("fs");
var readline = require("readline"); 
var mnem2ext = {};

var obj = JSON.parse(fs.readFileSync(database_file, "utf8"));
obj["instructions"].map(function(v, i) {
    var ext = v[4].replace(/ +[A-Z]+=.*/, "").replace(/  +.*/, "");

    if (v[0].match(/\//)) {
        v[0].split("/").forEach(function(v, i) {
            if (!mnem2ext[v]) {
                mnem2ext[v] = {};
            }
            mnem2ext[v][ext] = true;
        });
    } else {
        if (!mnem2ext[v[0]]) {
            mnem2ext[v[0]] = {};
        }
        mnem2ext[v[0]][ext] = true;
    }
});

var lineReader = readline.createInterface({input: fs.createReadStream(disassembly_file)});
lineReader.on("line", function(line) {
    console.log(line + ": " + (mnem2ext[line] ? Object.keys(mnem2ext[line]).join(", ") : undefined));
});

“Wrap marker” Thunderbird Extension

Yay, time for a new Thunderbird extension. Wrap Marker.

The code is up on GitHub.

This Thunderbird extension adds a word wrap marker (also called a “ruler”, depending on what editor you’re using) to the text area in the compose window when you’re editing plain text emails; in effect, a vertical line indicating that you’re close to the 72/76/80-character mark. (You can change the position in about:config. The default is 76.)

It works by changing the entire editor’s (think “iframe”) designMode from "on" to "off", and adding a div with contenteditable="true" instead. If this changes how your compose text area behaves, I’d consider that a bug, so please let me know.

At the time of this writing (February 26, 2018), this extension is still kind of beta and not exactly “thoroughly tested”. It will be submitted to Thunderbird’s extension page once it’s been tested some more and maybe once it’s gotten some of the known bugs fixed. These include:

  • Quoted text in a reply isn’t blue.
  • Your cursor position preference isn’t honored. The cursor will always be in the upper left corner when you start a new reply.
  • This feature is disabled for HTML emails. I don’t think it’ll ever work for HTML emails.
  • You get scrollbars all the time. (This is probably fixable; I just forgot to fix it.)

Backporting security fixes to old versions of the Linux kernel (Meltdown to 2.6.18) (Part 1)

In this post, I’ll give a quick overview of what it takes to backport a large patch (the KAISER patch to protect against Meltdown) to a version of the Linux kernel from around ten years ago. Note that this post only covers the main technique and the assembly portion of the patch.

First of all, one should think hard about whether this is necessary. Couldn’t you just run a newer kernel with older user space? The answer is, in most cases, yes, you could. As evidenced by our ability to run old Docker images with 10-year-old userland on modern kernels (perhaps adding vsyscall=emulate to the kernel command line), things often work just fine. However, you may run into problems if you’re running on bare metal. I’ve heard of people running a maintained 3.10 kernel on 10-year-old userland without much fuss. I’ve personally run a 64-bit kernel with 100% 32-bit userland (same kernel version, without X11).

However, some people may not be able to afford to re-test their whole setup with different kernel versions all the time, and that is why distributions usually backport pure security fixes from newer kernels to older kernels. The Linux kernel is constantly improved, and over time, the code base of the kernel version included in a specific stable version of a distribution, which may only get security fixes, tends to look pretty different from the current Linux kernel.

Now let’s pretend we have to backport a fix for the Meltdown vulnerability to Linux 2.6.18. First of all, we try very hard to come up with alternative ways to thwart this vulnerability. For 2.6.18, we come up empty-handed, but for earlier kernels, we may find the so-called 4G/4G patch.

This 4G/4G patch unfortunately never made it into the mainline kernel, but was adopted by Red Hat for inclusion in Red Hat Enterprise Linux up to version 4. So we could get our hands on a version of this patch for Linux 2.6.9, and perhaps forward-port this to 2.6.18. The patch at http://people.redhat.com/mingo/4g-patches/4g-2.6.6-B7 weighs in at around 4500 lines, and our foremost priority should be to find a patch with as few lines as possible.

The patch referenced in the original Meltdown paper weighs in at only 1000 lines, and is almost guaranteed to be very barebones. I’d say it would therefore make sense to attempt to backport this patch, and if we manage to do that, perhaps look at what the various distributors decided to do differently from what’s in this patch.

Before we start, it would probably make sense to find a couple of sentences that describe what the patch is supposed to do. It’s more than likely that we came across various descriptions of the patch when we were looking for a barebones patch to base our work on. LWN has a good introduction.

Preparations

We need the source trees of both the target kernel version and the source kernel version (the one the patch was written for) extracted somewhere. The source kernel version can be had by doing:

$ git clone https://github.com/torvalds/linux.git
$ # cd / mv / etc.
$ git checkout v4.10-rc6

The target version in our case is over here: http://vault.centos.org/5.11/updates/SRPMS/kernel-2.6.18-419.el5.src.rpm. We need to extract this and apply all of the existing patches. I use a current version of Debian, and rpmbuild operates in ~/rpmbuild. So create this directory and, below it, the directories SRPMS, RPMS, SPECS, SOURCES, BUILD, and BUILDROOT. Then move the .src.rpm into the SOURCES directory and issue the commands that follow.
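For example, one mkdir creates the whole layout (assuming rpmbuild’s default working directory):

$ mkdir -p ~/rpmbuild/{SRPMS,RPMS,SPECS,SOURCES,BUILD,BUILDROOT}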

$ cd ~/rpmbuild/SOURCES
$ rpm2cpio * | cpio -idmv
$ mv kernel.spec ../SPECS
$ cd ../SPECS
$ rpmbuild --nodeps -bp kernel.spec

Make sure you didn’t get any errors in the last step. Your patched kernel, ready to build from, is now inside ~/rpmbuild/BUILD/.

We’ll be making a lot of use of grep and git blame to backport patches. I usually use less to browse code quickly, or open files in an editor (usually kate and/or sublime) when I think I’ll need them for a longer time. I have two monitors, but having more would help. I also have a bunch of paper to scribble stuff on. When you have a lot of terminal windows open just for the grepping, compiling and other things, you’ll probably find that giving the editor a monitor of its own helps.

You’ll find that you’ll have to read up on four-level page tables while creating the patch. Depending on the way you work, you might as well do that before you dig in.

Here are a few more tips for using less:

  • You likely already know that you can search files by hitting ‘/’
    • You can use the arrow keys to browse through your search history
    • You can disable regex search by hitting Ctrl-R
    • You can type -N followed by return to display line numbers

For debugging, I use the venerable Bochs.

Digging in

arch/x86/entry/entry_64.S and arch/x86/entry/entry_64_compat.S

We have something in arch/x86/entry/entry_64.S and arch/x86/entry/entry_64_compat.S. Okay, we’re adding a few macros (SWITCH_KERNEL_CR3_NO_STACK, SWITCH_USER_CR3, SWITCH_KERNEL_CR3). These macros all seem to be close to a macro called SWAPGS or SWAPGS_UNSAFE_STACK. The presence of “UNSAFE_STACK” also dictates which SWITCH_CR3 macro we’re using. Though nothing may make sense yet, these are all important observations.

On the old kernel, this path doesn’t exist at all, but we have a promising-sounding arch/x86_64/ path.

~/src/kernel/el5/linux-2.6.18.4$ find arch/x86_64/ -name '*entry*'
arch/x86_64/kernel/entry.S
arch/x86_64/ia32/ia32entry.S

Opening arch/x86_64/kernel/entry.S, we see code that looks similar on the whole. SWAPGS doesn’t exist, but swapgs (as a pure assembly instruction) does. So let’s figure out what SWAPGS is about:

~/src/kernel/git$ grep -rn SWAPGS
...
arch/x86/include/asm/irqflags.h:122:#define SWAPGS      swapgs
...
arch/x86/include/asm/paravirt.h:908:#define SWAPGS                                                              \
        PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE,     \
                  call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs)          \
                 )
...

At this point, we might have a hunch that SWAPGS was introduced with the intention to make the same entry code work for both real hardware/real virtualization and paravirtualization, and this is sufficiently confirmed when we git blame the file a bit:

$ git blame arch/x86/entry/entry_64.S
...
72fe485854429 arch/x86/kernel/entry_64.S (Glauber de Oliveira Costa 2008-01-30 13:32:08 +0100  143)     SWAPGS_UNSAFE_STACK
...
$ git show 72fe485854429
commit 72fe4858544292ad64600765cb78bc02298c6b1c
Author: Glauber de Oliveira Costa <gcosta@redhat.com>
Date:   Wed Jan 30 13:32:08 2008 +0100

    x86: replace privileged instructions with paravirt macros
    
    The assembly code in entry_64.S issues a bunch of privileged instructions,
    like cli, sti, swapgs, and others. Paravirt guests are forbidden to do so,
    and we then replace them with macros that will do the right thing.
...

When looking at the above git blame, there are a lot of lines affecting SWAPGS with different commit hashes, but this one is the oldest. We should be able to transfer the macro calls to the lines adjacent to the swapgs instructions. With just the names of the macros (SWITCH_KERNEL_CR3), we don’t really know whether they switch from the kernel CR3 to the user CR3 or the other way round; if you look at the code that was later accepted upstream or into distributions, you’ll see that the macro names have since become easier to understand. We’ll dig into the macros, which are declared in the newly #included asm/kaiser.h, in the next section. Fortunately, the number of swapgs instructions and the number of SWAPGS macro calls are almost the same in both kernels.
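A rough way to compare those counts (run from the respective source trees; SWAPGS also matches SWAPGS_UNSAFE_STACK, which is fine, since both are macro calls):

grep -c swapgs arch/x86_64/kernel/entry.S   # old tree
grep -c SWAPGS arch/x86/entry/entry_64.S    # new tree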

asm/kaiser.h

asm/kaiser.h consists of assembly code (#ifdef __ASSEMBLY__) and C code (#else). Assembly code in the Linux kernel uses AT&T syntax, which means that the first operand is the source and the second operand the destination. The macros look pretty clean (i.e., they are mostly pure assembly code), except for the use of something called PER_CPU_VAR. The name hints at why it exists: modern processors have more than one core, and these cores operate independently. One core might be executing user land, and another core might be in the kernel or about to do the entry into the kernel, so scratch data used during the entry needs one copy per CPU.

Unfortunately, when we grep for PER_CPU_VAR in the old kernel code, we come up empty-handed:

src/kernel/el5/linux-2.6.18.4$ grep -r PER_CPU_VAR .
src/kernel/el5/linux-2.6.18.4$

Note that a case-insensitive grep comes up with ia64-specific (as in Itanium) code. grepping for PER_CPU, on the other hand, yields a lot of results. Even the KAISER patch itself contains DECLARE_PER_CPU and DEFINE_PER_CPU statements. However, the older kernel doesn’t have DECLARE_PER_CPU_SECTION or DEFINE_PER_CPU_SECTION.

~/src/kernel/git$ grep -r PER_CPU_SECTION . | grep define
./include/linux/percpu-defs.h:#define DECLARE_PER_CPU_SECTION(type, name, sec)                  \
... (More matches in the same file)

Now, we do a chain of git blames until we find something that we consider useful:

git blame include/linux/percpu-defs.h
git show 7c756e6e19e71
git blame 7c756e6e19e71^ -- include/linux/percpu-defs.h # start blaming from one before 7c756e6e19e71; don't forget the '--'
git show 5028eaa97dd1d
# Looks like 5028eaa97dd1d creates the file for the first time, and the definitions used to be in include/asm-generic/percpu.h
git blame 5028eaa97dd1d^ -- include/asm-generic/percpu.h
git show 9b8de7479d0db
git blame 9b8de7479d0db^ -- include/linux/percpu.h
git show 0bd74fa8e29dc

At this point, we finally found the commit that first introduced DEFINE_PER_CPU_SECTION, but this still depends on DEFINE_PER_CPU_PAGE_ALIGNED, which isn’t available yet in 2.6.18. So the search continues:

git blame 0bd74fa8e29dc^ -- include/linux/percpu.h
git show 63cc8c7515646

This commit indicates that DEFINE_PER_CPU_PAGE_ALIGNED was introduced to avoid wasting memory. I don’t believe we really need to care about this. Let’s trace PER_CPU_VAR next:

grep -r PER_CPU_VAR . | grep define
git blame ./arch/x86/include/asm/percpu.h
git show dd17c8f72993f
git blame dd17c8f72993f^ -- arch/x86/include/asm/percpu.h
git show 3334052a321ac

This commit unifies the percpu_32.h and percpu_64.h files into a single header file, and indicates that PER_CPU_VAR only existed in the 32-bit code paths. Instead, the 64-bit code had this, which we grep straight away:

DECLARE_PER_CPU(struct x8664_pda, pda);

~/src/kernel/el5/linux-2.6.18.4$ grep -r x8664_pda
...
include/asm-x86_64/pda.h:11:struct x8664_pda {
...
~/src/kernel/el5/linux-2.6.18.4$ less -N include/asm-x86_64/pda.h
...
     10 /* Per processor datastructure. %gs points to it while the kernel runs */ 
     11 struct x8664_pda {
     12         struct task_struct *pcurrent;   /* Current process */
     13         unsigned long data_offset;      /* Per cpu data offset from linker address */
     14         unsigned long kernelstack;  /* top of kernel stack for current */ 
     15         unsigned long oldrsp;       /* user rsp for system call */
     16 #if DEBUG_STKSZ > EXCEPTION_STKSZ
     17         unsigned long debugstack;   /* #DB/#BP stack. */
     18 #endif
     19         int irqcount;               /* Irq nesting counter. Starts with -1 */   
     20         int cpunumber;              /* Logical CPU number */
     21         char *irqstackptr;      /* top of irqstack */
     22         int nodenumber;             /* number of current node */
     23         unsigned int __softirq_pending;
     24         unsigned int __nmi_count;       /* number of NMI on this CPUs */
     25         int mmu_state;     
     26         struct mm_struct *active_mm;
     27         unsigned apic_timer_irqs;
     28 } ____cacheline_aligned_in_smp;
...

Interesting, this is a per-processor data structure? pda.h doesn’t exist in modern kernels anymore, but some additional googling confirms that, yes, we should be able to use this. I ended up adding unsafe_stack_register_backup to this struct. Through some additional code searching we can find out how to access members of the PDA structure (for assembly, there’s a hint at the top: %gs points to the structure when we’re in kernel space).
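One way to do that code searching (run inside the 2.6.18 tree; the pattern is simply a guess at what %gs-relative accesses look like):

grep -rn '%gs:' arch/x86_64/ include/asm-x86_64/ | head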

The rest of asm/kaiser.h consists entirely of C function prototypes, which we can just copy over. At this point, we have successfully backported about 37% of the entire patch. I used this git blame technique to backport the entire patch. It’s a lot of work: not counting the time it takes to read through the Meltdown papers and the news to get a good overview of what needs to be done, it took me about two to three weeks, and well over a dozen rebuilds, to get to a still-broken patch that panics the system around PID 370, which is still long before you get to log in on the console.

The Boring Game

I wrote a game! In 2.5 hours, even. It’s a console (as in Linux terminal) game, written in Perl (5). I’ll call it “The Boring Game”.

You’re the pilot of a sophisticated airplane that does not crash into mountains, but bores tunnels through them. Flying costs money (because you use up fuel). Operating the boring machine attached to your airplane is extremely energy-intensive, and costs a fortune. Boring horizontally is expensive, but wait till you see how much you have to pay to bore up. However, (completed) tunnels are very useful infrastructure, so you get a nice reward every time you make it through the mountain.

The game’s settings are global constants at the top of the source file:

my $USLEEP = 80000;
my @STATE_CHANGE_PROBABILITY = (0.1, 0.4, 0.1);
my $MAX_ALTITUDE = 120; # (1 == one column). Need a few more columns to display the current funds
my $MOUNTAIN_CHAR = '.';
my $MOUNTAIN_OUTLINE_CHAR = '|';
my $PLAYER_CHAR = '@';
my $UP_CMD = 'k';
my $DOWN_CMD = 'j';
my $QUIT_CMD = 'q';


You can get the source code at https://github.com/qiqitori/theboringgame.
To play the game, put boring_game.pl in any directory you want, and issue the following command:

perl boring_game.pl

CVE (Description) Generator

https://blog.qiqitori.com/cve_generator/ ← Newest CVE Generator version
https://github.com/qiqitori/cve-generator ← GitHub

I’ve been thinking of creating a small tool capable of generating CVE descriptions. The benefits of having such a tool would be:

  • Generating perfect descriptions in other languages without translating manually
  • Predictable (==theoretically, parseable) descriptions
  • High-quality output for people submitting a vulnerability description for the first time

CVE descriptions usually look like this:

Heap-based buffer overflow in the jpc_dec_decodepkt function in jpc_t2dec.c in JasPer 2.0.10 allows remote attackers to have unspecified impact via a crafted image.

This has the following pieces of information:

  • Locality (function and file name) (jpc_t2dec.c, jpc_dec_decodepkt())
  • Software name (JasPer)
  • Software version (2.0.10)
  • Attacker type (remote)
  • Impact (unspecified)
  • Using what? (specially crafted image)

Most CVE descriptions appear to contain no more and no less information than this.

One picture is worth a thousand words, so here’s a screenshot to give you an idea of how this could work:

[screenshot_v1]

The whole thing works entirely in JavaScript and doesn’t send any data anywhere. The code is currently pretty easy to grok, and probably anything but over-engineered.

To add a language, one would copy one of the existing .js files to create a base. The file name scheme is: cve_generator_VERSION_LANGUAGECODE.js. In these files, you have a large dictionary to translate option values to actual text, which looks like this:

 var tl = {
     "generic_vulnerability": "脆弱性",
     "generic_vulnerabilities": "複数の脆弱性",
     "memory_leak": "メモリリーク",
...

Then you have a couple of functions that are each responsible for creating a small sentence fragment, and one function that adds all these fragments together. These functions differ a bit depending on the grammar of the language in question.

Anyway, this thing probably lacks a lot of features. If you need anything, feel free to leave a comment here or on GitHub, or even send a pull request.

(License: GPLv3, but feel free to copy and paste the base and/or any minor bits for use in entirely unrelated projects, without any restrictions and under any license of your choosing.)

 

Below is the same content once more, originally in Japanese.

https://blog.qiqitori.com/cve_generator/ ← Newest version
https://github.com/qiqitori/cve-generator ← GitHub
The screenshot is shown in the English text above.

I’d been wanting something like a tool that “generates” CVE descriptions, so I went ahead and built one without thinking about it too much.
The benefits of making this a tool:

  • The English and Japanese versions can be generated in one go; other languages too, of course (not implemented yet)
  • Since there are no subtle variations between descriptions, they should in theory be parseable as well
  • It helps people writing a CVE description for the first time

Now, CVE descriptions generally look like this:

Heap-based buffer overflow in the jpc_dec_decodepkt function in jpc_t2dec.c in JasPer 2.0.10 allows remote attackers to have unspecified impact via a crafted image.

The information contained in this sentence is as follows:

  • Software name (JasPer)
  • Software version (2.0.10)
  • Attacker type (remote)
  • Impact (unspecified)
  • Input method etc. (a specially crafted image file)

The tool’s code is written in JavaScript, it runs in the browser, and no external network access occurs. I’d say the code is still far from over-engineered. :)

At the moment, adding a new language means copying an existing .js file wholesale and editing the parts that need editing. File names follow the scheme cve_generator_VERSION_LANGUAGECODE.js. Inside these files, translations go into an object like the following:

var tl = {
    "generic_vulnerability": "脆弱性",
    "generic_vulnerabilities": "複数の脆弱性",
    "memory_leak": "メモリリーク",
...

Besides that, there are short functions that each return a small sentence fragment, and a function that joins those fragments into a proper sentence. What needs to happen differs from language to language, and the structure of the functions differs subtly too, so if this balloons too much it could become hard to maintain; on the other hand, development only took a few hours. :)

In any case, there aren’t many features yet. If there’s any logic you’d like to see, feel free to contact me.

The license is nominally GPLv3, but if anything here looks useful for building completely unrelated software, please feel free to take it regardless of the GPLv3 and treat it as if it weren’t copyrighted at all.

Blog (and other Qiqitori sites) now accessible via HTTPS

Thanks to Let’s Encrypt, this blog and other sites under the Qiqitori domain are now accessible via HTTPS.

I used to have HTTPS a couple of years ago, but had to open up port 443 for other purposes (circumventing a work firewall). I’ve long since left that workplace, and since Let’s Encrypt SSL certificates are free, things are back in place now. I’m about one year late jumping on the Let’s Encrypt bandwagon, but that’s mostly because I try to avoid being an early adopter.

Getting this to work was a whole lot easier than I had assumed:

nano /etc/apt/sources.list
# insert:
deb http://ftp.debian.org/debian jessie-backports main
# save and exit editor
apt-get update
apt-get install python-certbot-apache -t jessie-backports

# easy option; probably doesn't require manual config editing if your config is straightforward:
certbot --apache
 
# or below command is for people who are familiar with the process (perhaps after having added the first two subdomains):
certbot --apache certonly --domains subdomain.qiqitori.com # requires manual config editing

Don’t worry: the only thing (as far as I can tell) that certbot does to your config is change the paths to the SSL certificate files. You’ll also be asked which file to edit. So maybe just back up your config file, try the automatic command first, and then inspect the result.
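For example, to see exactly what certbot changed (Debian Apache paths assumed):

cp -a /etc/apache2 /root/apache2.bak
certbot --apache
diff -ru /root/apache2.bak /etc/apache2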

One more thing: this blog runs on WordPress, and image tags (with their src attribute) are stored as absolute URLs in the database. I don’t have a lot of articles with images, so I thought I’d just fix them manually:

select id from wp_posts where post_status='publish' and post_content like '%src="http://blog.%';

This yielded only four IDs, which I then fixed in the normal post editor (change from the “Visual” tab to the “Text” tab) in the admin interface. If you have internal links:

select id from wp_posts where post_status='publish' and post_content like '%href="http://blog.%';

Rather than changing 'http://' to 'https://', you might want to use '//', which is protocol-relative and uses whatever protocol the current page was loaded over.
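If there are too many posts to fix by hand, a bulk rewrite along these lines should work. Back up your database first; wp_posts and the blog. prefix come from the queries above, while USER and DATABASE are placeholders for your own setup:

mysql -u USER -p DATABASE -e "update wp_posts set post_content = replace(post_content, 'href=\"http://blog.', 'href=\"//blog.') where post_status='publish';"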

Using Rails Routes Right After Startup

In a new Rails application I’m developing at the moment, I have a background job that runs every few minutes and may need to send emails to users. This background job is kicked off in config/initializers/start_something.rb. I had multiple problems with this, but the main one is described in the title of this blog post.

First of all, I originally used FooMailer.foo_email(foo, bar).deliver_later. This would just silently do nothing. Mails just didn’t work; nothing in /var/mail/maillog either. Drop the _later, and you get a stack trace and finally know why emails aren’t being sent: there is a problem rendering the template. In my case, link_to and url_for weren’t working.

The second problem is the main problem. You get a long stack trace like this:

from /home/.../.rvm/gems/ruby-2.3.0/gems/actionpack-5.0.0.1/lib/action_dispatch/routing/route_set.rb:629:in `generate'
from /home/.../.rvm/gems/ruby-2.3.0/gems/actionpack-5.0.0.1/lib/action_dispatch/routing/route_set.rb:660:in `generate'
from /home/.../.rvm/gems/ruby-2.3.0/gems/actionpack-5.0.0.1/lib/action_dispatch/routing/route_set.rb:707:in `url_for'
from /home/.../.rvm/gems/ruby-2.3.0/gems/actionpack-5.0.0.1/lib/action_dispatch/routing/url_for.rb:172:in `url_for'
from /home/.../.rvm/gems/ruby-2.3.0/gems/actionview-5.0.0.1/lib/action_view/routing_url_for.rb:90:in `url_for'
from /home/.../.rvm/gems/ruby-2.3.0/gems/actionview-5.0.0.1/lib/action_view/helpers/url_helper.rb:196:in `link_to'
from /home/.../kifu-kun/kifukun/app/views/..._mailer/..._email.html.erb:16:in `_app_views_..._mailer_..._html_erb__3955141667319229348_25724800'

And if you place <% byebug %> right before that line 16 in the template, and copy and paste the link_to line into the debugger, you get something like:

*** ActionController::UrlGenerationError Exception: No route matches {:action=>"...", :controller=>"...", :id=>...}

What? After you double and triple-checked the syntax and names of everything, you maybe decide to check the output of Rails.application.routes.routes:

#<ActionDispatch::Journey::Routes:0x00000004a5d940 @routes=[], @ast=nil, @anchored_routes=[], @custom_routes=[], @simulator=nil>

Um, that looks very empty! No routes? (Normally you get a couple of screenfuls of stuff.) As stated earlier, we’re using a config/initializers/start….rb file, and I suspected that the routes just aren’t available yet at this point. Here’s what the initializer looked like:

Rails.application.config.after_initialize do
  if defined?(Rails::Server) # don't perform job when running rails c
    FooJob.perform_now
  end
end

Sorry, tangent: this job is running every two minutes, so it performs itself later at the end of the perform method:

FooJob.set(wait: 2.minutes).perform_later # why does self. not work?

Yeah, self.set(…).perform_later doesn’t seem to work, so just use the full class name. (There are cron gems around, but I opted to skip those to cut down on dependencies. And that’s what got me into this mess. :p)

And we’re back to our after_initialize thing. I found this page titled “Rails initialization and configuration order” and thought stuff run here would be able to take advantage of most or all of Rails’ capabilities. Well, it turns out that routes are special in that regard. Here’s something I found after searching for a while: “Rails initializer that runs *after* routes are loaded?” So the answer to my problem is:

Rails.application.config.after_initialize do
  if defined?(Rails::Server) # don't perform job when running rails c
    Rails.application.reload_routes!
    FooJob.perform_now
  end
end

The third problem is really simple. This is the message:

*** ArgumentError Exception: Missing host to link to! Please provide the :host parameter, set default_url_options[:host], or set :only_path to true

That’s a pretty clear message. In other words, you just have to add (e.g.) host: 'example.com' (or something from the config) to the (perhaps implicit) options hash ({controller: '…', action: '…'}) and you’re set.

“Reply As Original Recipient” Thunderbird Extension

GitHub repository: https://github.com/qiqitori/reply_as_original_recipient

URL: https://addons.mozilla.org/en-US/thunderbird/addon/reply-as-original-recipient/ 

This Thunderbird extension automatically changes the From: field in replies to whatever the original sender’s email had in To:, but only if there is a + in the email address (and there is only one address in To:).

URL: https://addons.mozilla.org/ja/thunderbird/addon/reply-as-original-recipient/ 

I made a Thunderbird add-on that, when you reply, sets the reply’s sender (From:) to the email address that was in the received email’s To: field.
(Note that, by design, it only works if the To: email address contains a “+”. It also doesn’t work if more than one address is in To:.)

URL: https://addons.mozilla.org/de-DE/thunderbird/addon/reply-as-original-recipient/ 

With this Thunderbird add-on, for replies to received emails whose To: field contains an email address with a “+” in it, the From: field is automatically set to that To: address. However, this only works for emails with a single To: recipient.

2017-01-29 edit: 1.1 beta version: reply_as_original_recipient-1.1-tb.xpi
This version adds an option in the config editor that allows the extension to work even if there is no plus character in the To: address. The option is at “extensions.replyasoriginalrecipient.use_plus”. The default is true, meaning that the address has to contain a plus character.

“Reply To All Reminder” Thunderbird Extension

URL: https://addons.mozilla.org/en-US/thunderbird/addon/reply-to-all-reminder/

This Thunderbird extension asks you to confirm if you really want to reply to the person in the From: field only when you hit “Reply” and there are multiple people in the To: field or there is a CC field. Often you’ll want to hit “Reply to All” instead.

URL: https://addons.mozilla.org/ja/thunderbird/addon/reply-to-all-reminder/

With this add-on installed, a confirmation message is shown when you hit “Reply” instead of “Reply to All” on an email with multiple recipients or CC entries.

URL: https://addons.mozilla.org/de/thunderbird/addon/reply-to-all-reminder/

With this add-on, Thunderbird asks whether you really only want to reply to the sender when you choose “Reply” instead of “Reply to All” on an email with multiple recipients.

Sakura Cloud’s Pricing System

Sakura Cloud’s pricing system has three rates: monthly, daily, and hourly. When changing plans, it may be worth keeping in mind that the monthly rate is a better deal than the daily rate.
Let’s look at an example where you lose money:
プラン/2Core-2GB   30 days + 0 hours   3,240 JPY ← the normal monthly price at the higher spec
プラン/2Core-2GB   17 days + 14 hours   2,916 JPY ← 17 days at the higher spec
プラン/1Core-1GB   12 days + 9 hours   982 JPY ← 12 days at the lower spec
Even though the specs were lower for 12 days, the total comes to 3,898 JPY, 658 JPY more than the normal monthly price at the higher spec.
Watch out for this, everyone.

Just here to document a peculiar feature of Sakura Cloud’s pricing system.
Let’s say you’ve been running on high-spec’d servers and want to reduce these specs a bit. For example, you would like to go from プラン/2Core-2GB down to プラン/1Core-1GB.
Depending on the day you make the change, you may end up paying more than necessary. Here’s an example:
プラン/2Core-2GB    30 days + 0 hours    3,240 JPY ← This is the normal monthly price for the 2 core/2 GB plan
プラン/2Core-2GB    17 days + 14 hours    2,916 JPY ← Let’s say you reduced the specs on the 17th; you’ll pay almost a month’s worth of server fees for the 2 core/2 GB plan
プラン/1Core-1GB    12 days + 9 hours    982 JPY ← And the remaining days for the 1 core/1 GB plan
So that month you’ll pay 2,916 + 982 = 3,898 JPY, 658 JPY more than the flat monthly price of 3,240 JPY, even though you were running on lower specs.