Using Emacs and org-mode with Japanese Text

Using Emacs, Japanese text and Org-mode

This document summarises the steps required to use GNU Emacs to edit Japanese text in a UTF-8 encoded file, and to subsequently publish this file as a web page (HTML).

Check that your system can display Japanese text

Before you proceed, check that your system is capable of displaying Japanese text. The built-in command view-hello-file, invoked with C-h h, displays a text file containing greetings in many languages. You should see "Konnichiwa" written in Hiragana characters:

hello1.png

Further down the file you will see more Japanese text:

japanese1.png

If your Emacs displays are similar to these screen captures, then you can proceed with this tutorial.

Creating a new file and enabling multibyte

Japanese text cannot be saved in a single-byte ASCII file. Instead, a multibyte encoding is used. I chose UTF-8, a variable-width encoding in which Hiragana, Katakana and most Kanji occupy three bytes each.
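To make the byte counts concrete, here is a minimal Python sketch. Python is used purely for illustration and is not part of the Emacs workflow; the sample string and the helper name can_encode are my own choices.

```python
# Illustration only: how Japanese text behaves under ASCII and UTF-8.
text = "こんにちは"  # "Konnichiwa" in Hiragana (sample string chosen for this sketch)


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ascii_ok = can_encode(text, "ascii")          # ASCII has no Japanese characters
utf8_bytes = text.encode("utf-8")             # UTF-8 can represent all of them
bytes_per_char = [len(ch.encode("utf-8")) for ch in text]

print(ascii_ok)                    # False
print(len(text), len(utf8_bytes))  # 5 15
print(bytes_per_char)              # [3, 3, 3, 3, 3]
```

Five characters become fifteen bytes, which is why the file has to be treated as multibyte rather than as one byte per character.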

  • Step 1. Create a new buffer

    The word RET means to press the Enter key.

    C-x C-f FILENAME RET

  • Step 2. Enable multibyte characters

    M-x toggle-enable-multibyte-characters RET

    This long command name can be entered quickly using completion: type tog TAB en TAB, then RET.

  • Step 3. Set the Language Environment

    C-x RET l Japanese RET

    This is a shorter version of M-x set-language-environment RET Japanese RET.

    Learn more about the Japanese language environment with the command:

    C-h L Japanese RET

    japanese_help.png

    Click on the first link labelled Japanese for more information about the input method.

  • Step 4. Enter Japanese text

    Both Japanese and English text can be entered in the buffer. Enter the command C-\ to toggle between English and Japanese. The mode line will indicate Japanese mode when the Hiragana letter for A is displayed:

    hiragana_mode.png

  • Step 5. Save the file in UTF-8 encoding

    Force the file to be saved as UTF-8 using the command:

    C-x RET c utf-8 RET C-x C-s

  • Step 6. Opening an existing file

    The file created in the previous steps will be correctly recognised as a UTF-8 file the next time you open it.

    After you run C-x C-f FILENAME RET, you can switch immediately to Japanese input.
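If you want to confirm outside Emacs that a saved file really is UTF-8, a short check such as the following can help. This is only a sketch in Python: the helper is_valid_utf8, the sample text and the temporary file are all assumptions made for the example, standing in for the file you saved.

```python
import os
import tempfile


def is_valid_utf8(path):
    """Return True if the file at path decodes cleanly as UTF-8."""
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        return True
    except UnicodeDecodeError:
        return False


# Write a small mixed English/Japanese file to a temporary location.
sample = "English and 日本語\n"
fd, path = tempfile.mkstemp(suffix=".org")
os.close(fd)
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# Read it back the same way an editor would when reopening the file.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

valid = is_valid_utf8(path)
os.remove(path)

print(valid, round_trip == sample)
```

To check your own file, call is_valid_utf8 with its path instead of the temporary one.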

Using the Japanese Input Method

Toggle the input method with C-\ and ensure the mode line shows Japanese mode (similar to the previous screen capture).

I find it necessary to increase the font size for greater clarity. This can be done by holding Shift, clicking the left mouse button, and choosing a larger font size from the dialogue.

font_box.png

Start typing phonetically. For example, to type the word "Watashi" (which means "I"), type the letters watashi. In Japanese, this is represented by three hiragana symbols, wa-ta-shi. You can see these displayed as you complete each syllable:

watashi.png
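The syllable-by-syllable conversion can be sketched in a few lines of Python. This is a toy model, not the code Emacs uses: the table ROMAJI_TO_HIRAGANA is hypothetical and covers only a handful of syllables, and the greedy longest-match rule is an assumption made for the illustration.

```python
# Toy romaji-to-Hiragana table; it covers only the syllables used in this tutorial.
ROMAJI_TO_HIRAGANA = {
    "wa": "わ", "ta": "た", "shi": "し",
    "ko": "こ", "ka": "か", "ra": "ら",
}


def romaji_to_hiragana(romaji):
    """Greedy longest-match conversion, trying 3-, 2-, then 1-letter syllables."""
    out = []
    i = 0
    while i < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            # Letters with no table entry pass through unchanged.
            out.append(romaji[i])
            i += 1
    return "".join(out)


print(romaji_to_hiragana("watashi"))  # わたし
```

Each match corresponds to the moment a completed syllable appears on screen while you type.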

The underline means this is an active text entry. At this point you can type a space to replace the characters with Kanji. This lookup is based on a phonetic dictionary which is part of Emacs.

kanji.png

If there is only one corresponding Kanji, it is substituted immediately; otherwise, keep pressing the space bar to select from the available matches. To revert to Hiragana, type H.
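As a rough mental model of the candidate selection, consider the Python sketch below. The miniature dictionary CANDIDATES and the function convert are hypothetical, and wrapping around after the last candidate is an assumption of the sketch rather than documented Emacs behaviour.

```python
# Hypothetical miniature dictionary: Hiragana reading -> candidate spellings.
CANDIDATES = {
    "わたし": ["私", "渡し"],
}


def convert(reading, presses):
    """Return the candidate shown after pressing space `presses` times (>= 1).

    A reading with no dictionary entry is returned unchanged.
    Wrapping around past the last candidate is an assumption of this sketch.
    """
    options = CANDIDATES.get(reading)
    if not options:
        return reading
    return options[(presses - 1) % len(options)]


print(convert("わたし", 1))  # first candidate
print(convert("わたし", 2))  # second candidate
```

The real dictionary shipped with Emacs is far larger and orders candidates differently; only the lookup-then-select idea carries over.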

To enter Katakana, type a capital K before pressing Return. For example, to type the word Coca Cola, type kokakora then K. You need to know how a foreign word (such as "Coca Cola") is pronounced in Japanese syllables in order to type it. The result is:

coke.png

Regions of existing text can also be converted between Hiragana and Katakana, and between half-width and full-width forms.

First, mark the beginning of the text with C-SPC and move to the end of the region. Then type M-x followed by one of these commands:

japanese-katakana-region
    Convert Hiragana to Katakana
japanese-hiragana-region
    Convert Katakana to Hiragana
japanese-hankaku-region
    Convert to half-width (hankaku) characters, such as half-width Katakana
japanese-zenkaku-region
    Convert to full-width (zenkaku) characters, such as full-width Katakana
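The idea behind these conversions can be illustrated with a short Python sketch. It is not how the Emacs commands are implemented: the function names are my own, the Hiragana/Katakana conversion relies on the fixed offset of 0x60 between the two Unicode blocks, and the width conversion borrows Unicode NFKC normalisation, which also changes other compatibility characters (for example, it narrows full-width Latin letters).

```python
import unicodedata

KANA_OFFSET = 0x60  # distance between the Hiragana and Katakana Unicode blocks


def to_katakana(text):
    """Shift Hiragana (U+3041..U+3096) up to the matching Katakana."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def to_hiragana(text):
    """Shift Katakana (U+30A1..U+30F6) down to the matching Hiragana."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def halfwidth_kana_to_fullwidth(text):
    """Widen half-width Katakana using NFKC compatibility normalisation."""
    return unicodedata.normalize("NFKC", text)


print(to_katakana("わたし"))                  # ワタシ
print(to_hiragana("コカコーラ"))              # こかこーら
print(halfwidth_kana_to_fullwidth("ｺｶｺｰﾗ"))  # コカコーラ
```

The long-vowel mark ー lies outside the shifted ranges, so it passes through unchanged in both directions.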

The command names can be entered quickly by typing M-x jap TAB, which completes the common prefix. Press TAB again for a list of commands, or type the first two letters of the remainder of the command name and then TAB.

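The input method selected by the Japanese language environment is named japanese; C-h I japanese RET (describe-input-method) displays its description. The related utf-translate-cjk-mode in Emacs 22 controls whether the UTF-based coding systems encode and decode CJK characters. Refer to that command for a description.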

Exporting from org-mode

Insert an org-mode export template with the command C-c C-e t. Change the LANGUAGE line to use ja, the ISO 639-1 code for Japanese:

 #+LANGUAGE:  ja

Now you can export the page and view it in a browser.

A test file

The attached text file japanesetest.org is converted to japanesetest, which you can open in your browser. It should look like the following:

japanese_output.png

More information

Author: Charles Cave <charlesweb@optusnet.com.au>

Date: 2009-01-14 Wed

HTML generated by org-mode 6.17c in emacs 22