“Reply As Original Recipient” Thunderbird Extension

GitHub repository: https://github.com/qiqitori/reply_as_original_recipient

URL: https://addons.mozilla.org/en-US/thunderbird/addon/reply-as-original-recipient/ 

This Thunderbird extension automatically changes the From: field in replies to whatever the original sender’s email had in To:, but only if there is a + in the email address (and there is only one address in To:).

URL: https://addons.mozilla.org/ja/thunderbird/addon/reply-as-original-recipient/ 

返信の時、受信したメールの「宛先」に入っているメールアドレスを返信の送信元に設定してくれるThunderbirdアドオンを作ってみました。
(ただし、仕様上、宛先のメールアドレスに「+」が入っていないと動作しません。メールアドレスが二つ以上入っている場合も動作しません。)

URL: https://addons.mozilla.org/de-DE/thunderbird/addon/reply-as-original-recipient/ 

Mit diesem Thunderbird-Addon wird bei Antworten auf angekommene Emails, die im “An:”-Feld eine Email-Adresse mit einem “+” enthalten, das “Von:”-Feld automatisch auf jene “An:”-Email-Adresse gesetzt. Allerdings funktioniert dies nur bei Emails, die nur einen “An:”-Empfänger haben.

2017-01-29 edit: 1.1 beta version: reply_as_original_recipient-1.1-tb.xpi
This version adds an option in the config editor that allows the extension to work even if there is no plus character in the To: address. The option is at “extensions.replyasoriginalrecipient.use_plus”. The default is true, meaning that the address has to contain a plus character.

“Reply To All Reminder” Thunderbird Extension

URL: https://addons.mozilla.org/en-US/thunderbird/addon/reply-to-all-reminder/

This Thunderbird extension asks you to confirm if you really want to reply to the person in the From: field only when you hit “Reply” and there are multiple people in the To: field or there is a CC field. Often you’ll want to hit “Reply to All” instead.

URL: https://addons.mozilla.org/ja/thunderbird/addon/reply-to-all-reminder/

このアドオンをインストールしていただくと、複数の受取人・CCが入っているメールに対して「全員に返信」ではなく「返信」を押した場合、確認メッセージが表示されます。

URL: https://addons.mozilla.org/de/thunderbird/addon/reply-to-all-reminder/

Mit diesem Add-on fragt Thunderbird, ob man wirklich nur dem Sender antworten möchte, wenn man bei einer Email mit mehreren Empfängern anstelle von “Allen antworten”, “Antworten” ausgewählt hat.

さくらクラウドの料金システムについて / Sakura’s Cloud Pricing System

さくらクラウドの料金システムには、月額、日額、時間額と、3つの価格があります。月額が日額などよりもお得であることは、プラン変更の際に頭に入れておいた方が良いかもしれません。
損をする例を見てみましょう:
プラン/2Core-2GB   30 + 0時間   3,240円 ← 高いスペックで通常の月額
プラン/2Core-2GB   17 + 14時間   2,916円 ← 高いスペックで17日間分
プラン/1Core-1GB   12 + 9時間   982円 ← 安いスペックで12日間分
12日間スペックが低かったのに、3898円と、高いスペックの通常の月額より658円高いです。
皆さん、気をつけてください。

Just here to document a peculiar feature of Sakura Cloud’s pricing system.
Let’s say you’ve been running on high-spec’d servers and want to reduce these specs a bit. For example, you would like to go from プラン/2Core-2GB down to プラン/1Core-1GB.
Depending on the day you do the change, you may end up paying more than necessary. Here’s an example:
プラン/2Core-2GB    30 + 0時間    3,240円 ← This is the normal monthly price for 2 core/2 GB plan
プラン/2Core-2GB    17 + 14時間    2,916円 ← Let’s say you reduced the specs on the 17th – you’ll pay almost a month’s worth of server fees for the 2 core/2 GB plan
プラン/1Core-1GB    12 + 9時間    982円 ← And the remaining days for the 1 core/1 GB plan
So that month you’ll pay 3898 JPY, even though you were running on lower specs.

Mixiの(モバイルでの)署名付きリクエスト

この度、ガラケー向けのMixiアプリを作ることになりました。こういうアプリのリクエストは、実機 → Mixiが運営しているサーバー → アプリ用サーバー、という流れで送られるようです。(もちろんサーバーから実機への回答は逆の流れです。)Mixiのサーバーは、各リクエストに、ユーザーのMixi IDみたいな文字列を付けてくれて、そして本当にMixiのサーバーから来たリクエストかどうかを検証できるように、署名も付けてくれます。

Mixiの公式ドキュメンテーションはhttp://developer.mixi.co.jp/appli/spec/mob/validate-oauth-signature/にありますが、残念ながら、現在のところ、多少足りない部分がありまして…

まずは、下記のGETリクエストを例として上げているのだが、

http://example.com/foo/?opensocial_app_id=123&opensocial_owner_id=xxxxxxxx

「Ky/6LlDHpHX1EZMRi5mfUl9vxqY=」という署名になるには、「opensocial_owner_id」を「xxxxxxxx」ではなく、「456」に設定しないといけません。
なんで知ってるの!?
archive.orgのおかげです: https://web.archive.org/web/20100912034001/http://developer.mixi.co.jp/appli/spec/mob/validate-oauth-signature

続いて、「特殊な文字(漢字など)を含んでいるパラメーターはどうすればいいですか?」という質問に対する明らかな答えもなく、困っていました。
いろいろ試したところ、二重エンコーディングで対応できました。
つまり、「example.com/foo?test=テスト」みたいなリクエストは、ユーザー側のブラウザーから「example.com/foo?test=%E3%83%86%E3%82%B9%E3%83%88」のように送られます。それを「%25E3%2583%2586%25E3%2582%25B9%25E3%2583%2588」にすると検証が通ります。

test[foo]=barみたいな、角括弧が入っているパラメーターも同じように二重エンコーディングします。

Ruby on Railsだと、下記のようなコードで検証できます。

def mixi_signed_request_mobile?
    require 'cgi'
    oauth_header = request.headers["HTTP_AUTHORIZATION"].split(/\s*,\s*/)
    oauth_header_hash = Hash[oauth_header.map { |keqv| keqv.gsub('"', '').split('=') }]

    base_string_array = [
      "oauth_consumer_key=" + @client_id,
      "oauth_nonce=" + oauth_header_hash["oauth_nonce"],
      "oauth_signature_method=" + "HMAC-SHA1",
      "oauth_timestamp=" + oauth_header_hash["oauth_timestamp"],
      "oauth_version=" + "1.0"
    ]
    if request.get?
      uri = URI.parse(request.url)
      query_array = URI.decode_www_form(uri.query)
      base_string_array += query_array.map{|param| "#{CGI.escape(param[0])}=#{CGI.escape(param[1])}"} # takes care of opensocial_app_id and opensocial_owner_id as well
    else
      # don't need the actual parameters for post requests (mixi-specific awkwardness)
      base_string_array += ["opensocial_app_id=#{params["opensocial_app_id"]}", "opensocial_owner_id=#{params["opensocial_owner_id"]}"]
    end
    base_string_array.sort!
    url = request.original_url.gsub(/\?.*/, '') # can't use request.base_url + request.path because the trailing slash is sometimes(?) cut off
    baseString = request.method + "&" + CGI.escape(url) + "&" + CGI.escape(base_string_array.join("&")) # [] and utf-8 are doubly percent-encoded!! e.g. "テスト" becomes "%25E3%2583%2586%25E3%2582%25B9%25E3%2583%2588"!!!!!

    sha1 = OpenSSL::HMAC::digest(OpenSSL::Digest::SHA1.new, @client_secret + "&", baseString)
    base64 = Base64.strict_encode64(sha1)

    return CGI.escape(base64) == oauth_header_hash["oauth_signature"]
  end

(The above explanation and code show how to verify Mixi’s OAuth 1.0 signatures, as used with Japanese feature phone apps.)

Excel character counter / エクセル文字カウンター

At work, somebody sometimes needs to count the length of selected text in Excel cells. Doing it in one’s head is understandably unpleasant, so why not see if there’s software out there that can count the number of characters in selected text? Well, I got pretty close at http://www.donationcoder.com/Software/Skrommel/index.html#DragKing. Since this is written in AutoHotkey, I’d just have to modify it a bit to count while the mouse cursor is being dragged. (Watch the video below to see it in action.) Unfortunately, pressing Ctrl+V while selecting text doesn’t work, so we instead send the WM_COPY message, like this:

SendMessage 0x301,0,0,%Control%, %WinTitle%  ; WM_COPY

Also it’s not cool if the script gets rid of your clipboard contents, so we restore that too. Get the script here: moji_counter_v3.ahk
Note that it’s partially localized to Japanese.

会社に、エクセルセル内の文字を数えたり区切ったりする作業をしている人がいるのですが、頭の中で数えるのはあまり楽しくないので、この度、どうにかならないかという相談を受けました。早速ネットで調べてみると、http://www.donationcoder.com/Software/Skrommel/index.html#DragKingでほぼ同仕様のソフトがありました!しかもAutoHotkeyなので、すぐにうちの用途に合せられます。

DragKingではマウスボタンを外すとテキストをコピーして文字列の長さをカウントしているのですが、文字列を選択している間にカウントできたらもっと便利なんじゃないかなと思って…しかし、Ctrl+Vを押しても文字がコピーされないので、

SendMessage 0x301,0,0,%Control%, %WinTitle%  ; WM_COPY

変わりましてWM_COPYというメッセージを送ってコピーさせます。後、DragKingの仕様と違って、コピー・カウントの後、クリップボードの内容を復活させます。
AutoHotkeyスクリプトを下記のリンクでダウンロードできます。
moji_counter_v3.ahk

ブラック心理テスト

I recently started a job at a Japanese app development company that is mostly making fortune telling apps and the occasional personality test. I’m getting to work on some interesting stuff, including app development, web development, one-off utilities, and server administration. My first assignment was a personality test, and since it’s released now, I think I’m just going to link to it below.

最近転職し、占いアプリなどを取り扱っている日本の会社で活躍することになりました。「ブラック心理テスト」というアプリの開発を頼まれたのですが、ようやくリリースできましたのでリンクを貼らせていただきます〜

ブラック心理テスト@Play Store (Android)
ブラック心理テスト@App Store (iPhone)

まぁ、とても簡単なアプリですが、友だちが「楽しいかも!」と言ってくれました^^

Male and female kanji (And how to process ENAMDICT’s XML in Perl)

The other day, someone told me that the kanji 漢 (the 漢 in 漢字) usually indicates a male name. This made me wonder what other kanji there are that might indicate a female or male name. So I downloaded JMnedict.xml and processed it a little bit.

Side note: Both XML::LibXML and MySQL’s import from XML thingy turned out to be uselessly slow, so I did this using XML::Twig. Here’s some lazy code to extract all the <keb> elements and their associated name_type:

#!/usr/bin/perl -w

use XML::Twig;
use strict;

binmode(STDOUT, ":utf8");

my $t = XML::Twig->new(twig_handlers => { entry => \&entry });

$t->parsefile($ARGV[0]);

sub entry {
    my ($t, $entry) = @_;
    my ($keb, $name_type);

    eval {
        $keb = $entry->first_child("k_ele")->first_child("keb")->text;
        $name_type = $entry->first_child("trans")->first_child("name_type")->text;
        $entry->purge;
    };
    print "$keb\t$name_type\n" if (!$@);
}

Now we have a file that looks like this:

ゝ泉    given name or forename, gender not specified
〆      female given name or forename
...

Great names, eh? Then we just do:

grep female all_names.csv > female_names.csv
grep "[^e]male" all_names.csv > male_names.csv
perl -C -n -e 'while ($_ =~ s/(\p{Block=CJK_Unified_Ideographs})//) { print "$1\n" }' female_names.csv | sort -n | uniq -c | sort -n > female_kanji.csv # Perl's -C option enables unicode everywhere. Unfortunately, this option doesn't work on the #! line.
perl -C -n -e 'while ($_ =~ s/(\p{Block=CJK_Unified_Ideographs})//) { print "$1\n" }' male_names.csv | sort -n | uniq -c | sort -n > male_kanji.csv
sed -i "s/^\s*//" female_kanji.csv # fix leading whitespace chars from uniq -c
sed -i "s/^\s*//" male_kanji.csv

Then we put the two files into two different sheets of the same file in a spreadsheet program. (I used LibreOffice, but Excel is better. Seriously.) Call one sheet “Female”, the other “Male”, and on a third sheet, concatenate the two lists of kanji and filter out the duplicates (in column A), and use the following formulas for columns B to E, respectively:

=IF(ISERROR(VLOOKUP(A2, Female.$A$1:$B$1601, 2, 0)), 0, VLOOKUP(A2, Female.$A$1:$B$1601, 2, 0))
=IF(ISERROR(VLOOKUP(A2, Male.$A$1:$B$1601, 2, 0)), 0, VLOOKUP(A2, Male.$A$1:$B$1601, 2, 0))
=IF(ISERROR(B2/C2), 1000000, B2/C2)
=B2+C2

(In LibreOffice Calc, you use a period instead of an exclamation mark to reference cells on a different sheet in formulas. LibreOffice Calc doesn’t support the IFERROR() function. I know that 1,000,000 is not the answer to n/0, but we’d like a high number for sorting purposes.) Copy the formulas down and perhaps add the following headers in the top row: Kanji, Female, Male, Ratio, Count. Copy the whole thing without formulas to a new sheet, sort by ratio, and then by count. Perhaps filter out all the kanji that have a count < 10. Here’s a link to my files: kanji_usage_in_names.ods (OpenDocument) and kanji_usage_in_names.xlsx (Excel 2007+).

So it turns out that there are hundreds of kanji that are strong indicators for the gender of a name. By the way, JMnedict.xml’s data isn’t very good: for example, even names like 大介 aren’t gender-classified yet. We’ve got only 1,719 unique kanji for all the gender-classified names, 1,601 unique kanji for female names, and 873 unique kanji for male names. Pretty low and weird numbers. So don’t expect too much accuracy.

Automatically resizing images with Word VBA

For not-too-interested readers: Use the third code snippet.

It’s time for a new post on VBA! Somebody at work is creating a whole lot of training material in Word. Great. Fast forward a couple months, and all of a sudden it’s my job to fix these up. They have all kinds of formatting inconsistencies, and many of the screenshots were resized without preserving their aspect ratios.

I thought, great, I’ll just write some VBA that fixes the screenshots. I quickly came up with something like the following:

Sub resize_all_images_to_page_width()
    For Each inline_shape In ThisDocument.InlineShapes
        inline_shape.LockAspectRatio = msoFalse
        inline_shape.ScaleWidth = 100
        inline_shape.ScaleHeight = 100
        percent = doc.PageSetup.TextColumns.Width / inline_shape.Width
        inline_shape.ScaleWidth = percent * 100
        inline_shape.ScaleHeight = percent * 100
    Next
End Sub

This would in theory scale every image to fit to the page’s width.

And it worked. For almost all of the images. For some, it did something incomprehensible (Word 2010) and turned images into a very long and extremely thin strip of pixels. I tried various things (including .Reset) and did some googling, but couldn’t find a better way to fix the images. What I really wanted was to find out an image’s original width and height, just the way it’s displayed in the Format Picture dialog in the Size tab, but there’s no way to access those values.
So to get an image’s original size, I decided to create a new temporary document, copy and paste the image there, set the .ScaleWidth and .ScaleHeight options to 100, and then look at its .Width and .Height properties. Yay, there we go!
I also decided I don’t want to scale images beyond 100% and added some extra logic to handle that. Here’s the result:

Sub resize_all_images_up_to_page_width()
    Set current_doc = ThisDocument
    Set new_doc = Documents.Add(DocumentType:=wdNewBlankDocument)

    current_doc.Activate
    For Each ishape In current_doc.InlineShapes
        ' ishape.Copy ' doesn't work
        ishape.Select ' <work-around>
        Selection.Copy ' </work-around>
        new_doc.Content.PasteAndFormat (wdPasteDefault)
        Set new_ishape = new_doc.InlineShapes(1)
        new_ishape.LockAspectRatio = msoFalse
        new_ishape.ScaleWidth = 100
        new_ishape.ScaleHeight = 100
        ishape.LockAspectRatio = msoFalse
        If (new_ishape.Width > current_doc.PageSetup.TextColumns.Width) Then
            ishape.Width = current_doc.PageSetup.TextColumns.Width
            ishape.Height = (current_doc.PageSetup.TextColumns.Width / new_ishape.Width) * new_ishape.Height
        Else
            ishape.Width = new_ishape.Width
            ishape.Height = new_ishape.Height
        End If
        new_ishape.Delete
        ishape.LockAspectRatio = msoTrue
    Next
End Sub

And finally, we’d like to make sure images don’t get longer than the length of a page:

Sub resize_all_images_up_to_page_width()
    Set current_doc = ThisDocument
    Set new_doc = Documents.Add(DocumentType:=wdNewBlankDocument)
    page_width = current_doc.PageSetup.TextColumns.Width
    page_height = current_doc.PageSetup.PageHeight - current_doc.PageSetup.TopMargin - current_doc.PageSetup.BottomMargin

    current_doc.Activate
    For Each ishape In current_doc.InlineShapes
        ' ishape.Copy ' doesn't work
        ishape.Select ' <work-around>
        Selection.Copy ' </work-around>
        new_doc.Content.PasteAndFormat (wdPasteDefault)
        Set new_ishape = new_doc.InlineShapes(1)
        new_ishape.LockAspectRatio = msoFalse
        new_ishape.ScaleWidth = 100
        new_ishape.ScaleHeight = 100
        ishape.LockAspectRatio = msoFalse
        If (new_ishape.Width > page_width) Then
            If ((page_width / new_ishape.Width) * new_ishape.Height > page_height) Then
                ishape.Width = page_height / new_ishape.Height * page_width
                ishape.Height = page_height
            Else
                ishape.Width = page_width
                ishape.Height = (page_width / new_ishape.Width) * new_ishape.Height
            End If
        ElseIf (new_ishape.Height > page_height) Then ' going to be shrinking both height and width, and width is okay already, so it'll be even okayer
            ishape.Width = page_height / new_ishape.Height * new_ishape.Width
            ishape.Height = page_height
        Else
            ishape.Width = new_ishape.Width
            ishape.Height = new_ishape.Height
        End If
        new_ishape.Delete
        ishape.LockAspectRatio = msoTrue
    Next
End Sub

Hope this helps! If you have any  questions, feel free to leave a comment.

How to Create Word Documents from Excel using VBA and Fit Strings into Single Lines

On my day job, I was tasked to add a button to an Excel spreadsheet that converts that spreadsheet into a Word file, for improved formatting and aesthetics. This is a pretty straightforward task:

  1. Add a reference to the Microsoft Word Object Library (Tools → References) to get access to the wd-constants
  2. It’s easiest to access things in table cells, so create a Word template that uses tables wherever you want your macro to copy something over from the Excel file.
  3. Write your code around the following skeleton code:
Set objWord = CreateObject("Word.Application") ' Create Word object
Set objDoc = objWord.Documents.Add(Template:=ActiveWorkbook.Path & "\template.dotx") ' Open new document from given template
Set objTable = objDoc.Tables(1) ' Get reference to first table in template
objTable.Cell(1, 1).Range.Text = Cells(1, 1).Text ' Copy cell at row 1, column 1 in current Excel sheet to row 1, column 1 in first table in Word file
' Save file (we don't care about the .docx and just want a .pdf)
FileName = Application.GetSaveAsFilename(fileFilter:="PDF files (*.pdf), *.pdf") ' Note that this is Excel's dialog, but that doesn't matter, as we only need the user to enter a file name/select a file
objDoc.SaveAs2 save_name, wdFormatPDF
objWord.Quit wdDoNotSaveChanges
ActiveWorkbook.FollowHyperlink save_name

In case you need backwards compatibility with older versions of Office, add checks like the following before specifying a template or saving as a PDF file:

If (Application.Version > 12) Then
    ' ...
Else
    ' ...
End If

For older versions of Office, you will need a PDF printer to create PDFs. You can display the print dialog box by doing:

objWord.Dialogs(wdDialogFilePrint).Show

If, unlike me, you do want to display the Word document, get rid of the objWord.Quit line (and perhaps the lines related to saving to PDF) and do the following:

objWord.Visible = True

That’s that, and now we come to the actually interesting part. The part I’ve alluded to in the title of this post. Say you have a table in your Word template that gets populated with a person’s name and email address. You’ve carefully set your column widths to sane values, and don’t want them to change, so you’ve got the “Automatically resize to fit contents” option (Table Properties → Table → Options) turned off. What happens when you get a long name or email address? The name or email address gets split somewhere, and now you’ve got two lines. Ugly! You’ve tried the “Fit text” option (Table Properties → Cell → Options) and felt sorely disappointed. What do we do? Perhaps you’ve tried to access the row’s Height property, and quickly realized that it’s not very useful, as you either get a value of 9999,  or 12 if you’ve changed the height policy, along with a table row that doesn’t look like before. I am sure there are other ways to solve this problem, but this is what I did:

n_lines_before = objDoc.ComputeStatistics(wdStatisticLines) ' Get total number of lines in document
.Cell(3, 4).Range.Text = Cells(26, "AJ").Text
n_lines_after = objDoc.ComputeStatistics(wdStatisticLines) ' Get total number of lines in document, again
Do While (n_lines_after > n_lines_before) ' If the data didn't fit in our cell, the total number of lines will have increased
    .Cell(3, 4).Range.Font.Size = .Cell(3, 4).Range.Font.Size - 1 ' Make the font size a little smaller until the total number of lines returns to being the same as before again
    n_lines_after = objDoc.ComputeStatistics(wdStatisticLines)
Loop

This post is not exactly related to Qiqitori, but I hope it’s of some use to someone, somewhere! It certainly serves as documentation for the code I’ve written, if anything. If you have any questions or better ways of doing things, feel free to leave a comment.

How to Calculate Distance Tables Using OpenStreetMaps

For an assignment on my job, I had to compute distances from a couple dozen locations in Japan (which I will call “source locations” in this article) to all postal codes that exist in Japan. Japan has approximately 123,401 postal codes, and we don’t want the straight line distance, but actual routes. So what do we do now? There’s no clear answer, but you may find an idea or two after reading this article.

Geocoding

“Geocoding” refers to the conversion of addresses to latitude/longitude pairs. In our case, (disregarding the couple dozen source locations,) addresses consist of only a postal code (and the country, which is the same for all addresses). Unfortunately, there is no publicly accessible database that matches postal codes with latitude/longitude pairs. (For Japan, at least. Maybe you’ll find that such a database exists for your country.) You could, however, cross-reference data from this database from the Ministry of Land, Infrastructure, Transport and Tourism and this one from the Japan Post website, but realize that this would be very hard because of differences in the spellings of names (e.g., 旭ヶ丘 vs. 旭ケ丘 and オシツクシ vs. 白糠町). It’s best to buy a database, especially if you’re doing this for a company. GeoPostcodes is one company that sells such databases, and at the time of this writing, the database for Japan costs €69.95.

You could also scrape the Google Maps API, but that would violate Google’s terms of service in most cases. You’d also be limited to (currently) 3,000 queries per IP per day. (You could also get access to the enterprise version for $10,000 per year. You’d still be limited to 100,000 requests per day. Hmm. I wonder if you’re allowed to resell your access.) At the time of this writing, you probably won’t find a geocoding API that will let you do this both comfortably and without violating its terms of service.

Calculating routes

With its 3,000 queries per IP per day limit, you won’t be able to use the Google Maps API or any other API (e.g., Bing Maps’ or Nokia’s) to calculate a couple million distances. (Probably not even mapquest open, which is free and merely requests that you ask for permission before firing off thousands of queries.) This is where OpenStreetMap comes in. You will be calculating routes on your own computer. You will find that there is a lot of open-source software to calculate routes using OpenStreetMap data. I’ve tested the following programs (in the following order):

  • Gosmore: Gave up after waiting over 24 hours for the conversion process from OpenStreetMap data (which is XML) to Gosmore’s data format to complete
  • Routino: Slow for long distances (think 60 seconds or more for routes that are longer than 1000 km), interface is extremely easy to use programmatically
  • Navit: Slightly buggy, reasonably fast, interface is either graphical or dbus-based and hard to use programmatically
  • OSRM: Extremely fast, interface is reasonably easy to use programmatically

OSRM is the clear winner here. OSRM manages to calculate even long routes within a few milliseconds. However, you’ll need some dead-serious hardware to convert OpenStreetMap data to a format OSRM can use. The conversion tool ended up using about 30 GB of memory (if I remember correctly) to convert OpenStreetMap data for Japan.

OSRM will start a multithreaded web server (on any port you wish). Within a perl script, you could perhaps perform queries like this:

$response = `curl -s \"http://127.0.0.1:5000/viaroute?loc=$lat1,$long1&loc=$lat2,$long2\"`;

OSRM by default returns JSON, and will by default return alternate routes in addition to the route it deems fastest. By the way, calculating all routes took less than half a day (using ~24 not-too-modern Opteron cores).

Occasionally, OSRM will fail to find a route between two points. In our case, this happened 5,078 times, and it happens for the following reasons:

  • There is no road nearby. (A single postal code can cover a large area.)
  • Isolated road network
    Isolated road network

    The road that is closest to the specified coordinates is not connected to the wider road network. You might be on an island somewhere. (E.g., Hokkaido! Note that routes in Hokkaido were calculated completely separately from the other routes, and the numbers and statistics in this article may disregard Hokkaido.) Or there might be an error in the OpenStreetMap data. (By the way, it would be great if you could correct some of these errors! Anybody can edit OpenStreetMap data.)

We can fix a large number of these broken routes by calculating routes to points in the vicinity of the latitude/longitude pair in question. I chose to check the points 200 m to the north, south, west, and east, and if that failed, incremented the distance to 400, 600, 800, and finally 1,000 m. This reduced the number of broken routes to 1,411.

So what do we do now? We get rid of all postal codes that point to an island. That will eliminate a couple hundred postal codes, and it’s fun, because you’ll get to click around a lot, just like in a game! We’re going to add map markers on a Google Map for all broken postal codes and then click away and make a list of the ones we don’t need. How do you add markers on a Google Map? The answer is only one Google search away. We take the code from the JSFiddle demo and modify it a little bit to make markers go away when we click them and add the relevant latitude/longitude pair to a text box. That leaves only 336 postal codes. Here’s a link to the modified code: display_lat_long_pairs_on_map_bad_zips.html

So what do you do with the remaining 336 postal codes? That’s up to you to decide. I suggest trying mapquest open.

Other facts about postal codes

  • There are buildings in Japan that have multiple postal codes, e.g., one for each floor.
  • The term “ZIP code” only applies to postal codes in the US. “ZIP” is short for “Zone Improvement Plan”. You can read all about it in the Wikipedia article.
  • The number of Japanese postal codes changes (very slightly) every month. (You can find a CSV file that matches postal codes with addresses on the Japan Post website. There you’ll also find small files that contain the updates to this database.) Codes get added, deleted, and re-assigned.

If you have any questions, feel that I’ve left out something important, know of any reasonable alternatives to using OSRM for this, or just found this article helpful, please leave a comment! And if you need distance tables and don’t feel like calculating them yourself, feel free to ask me. I’ll probably manage to get them calculated for you very quickly (for a very modest fee, which mostly depends on if I need to buy external databases).