Welcome to “Sleeping Mountains”

The past becomes present through us. Past narratives guide present actions. Past relationships inform those in the present. Past thought provides stimuli for present theory. Past descriptions of states of mind serve as reassurance that we in the present are not alone but instead have a wealth of advice to fall back on. And yet, the past remains in the shadows, unseen, asleep in the present.

This blog is entitled Sleeping Mountains in allusion to a poem by Yosano Akiko that I first encountered in a course by Dr. Kimberly Kono at Smith College. It was originally the title of a blog about my life working and studying in Japan from 2005 to 2012.

Text Mining

Installing MeCab and RMeCab on a Mac

Since my last post, you may have played around with WebChamame a bit. That site is a powerful tool for seeing what is possible in natural language processing, a method used in linguistics and other social sciences to create statistics from texts. Although WebChamame lets you adjust a number of settings and export your data as downloadable files, the site has its limitations. Most prominent among them: it takes a long time to find the bits of information you might be interested in.

Statisticians created the computer language R and the editor RStudio to overcome that problem because they need to wrangle data quickly to produce useful information. It is also possible to tokenize premodern Japanese texts in R and RStudio, but it requires installing a program for morphological analysis on your own computer. As you might have noticed in the last post, WebChamame uses MeCab for its morphological analyses.

This is an illustration from Tsutsumi and Ogiso (2015) of WebChamame’s workflow.

Text Mining

Breaking Digital Premodern Japanese Texts Down into Words using WebChamame

If you want to analyze Japanese texts digitally, the first problem you might run up against is that Japanese does not use spaces between words. A computer needs those spaces to know when one word ends and the next begins. So, you first need to “tokenize” the text, that is, determine where its words begin and end. Deciding what is a word and what is not is difficult. Is a verb ending a word or a part of a word? Linguists discuss these kinds of questions for us literary scholars and have created the necessary tools so that we don’t have to insert spaces manually into a text. Imagine how much work that would be!
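
To make the idea of tokenizing concrete, here is a minimal sketch in Python. It is not how WebChamame or MeCab work internally (they rely on a large dictionary and a statistical model); it only illustrates the simplest possible rule, namely always taking the longest dictionary entry that matches at the current position. The function name `tokenize` and the tiny word lists are made up for this illustration.

```python
def tokenize(text, lexicon):
    """Split text into tokens by greedy longest match against a word list.

    Illustrative only: real analyzers such as MeCab choose among many
    possible segmentations instead of committing to the longest match.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in lexicon:
                break
        else:
            # Nothing in the word list matches here: emit a single character.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


# Toy word lists (hypothetical, far smaller than any real dictionary).
print(tokenize("春はあけぼの", {"春", "は", "あけぼの"}))
# → ['春', 'は', 'あけぼの']

# Whether "勉強する" (to study) is one word or two depends on the word list.
print(tokenize("勉強する", {"勉強", "する"}))
# → ['勉強', 'する']
print(tokenize("勉強する", {"勉強", "する", "勉強する"}))
# → ['勉強する']
```

The last two calls show why the question of what counts as a word matters: the same string yields one token or two depending on which entries the dictionary contains.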

This is an image from Den et al. (2007), illustrating the different levels at which words can be distinguished in modern Japanese, taking into account complex compound nouns and verbs, in this case verbs that combine a noun with the irregular verb “to do” (suru).

One way to see how computers can tokenize words (without installing anything on your own computer) is to use WebChamame. This site was built by researchers at the National Institute for Japanese Language and Linguistics. To get started, type a Japanese sentence into the left window on the WebChamame site.

This is a screengrab of the WebChamame website on June 28, 2022. Type your text into the box under 「テキストを入力」 to see what this site can do.

Happy Year of the Tiger

Storage houses at Tōshōgū in Nikkō on January 1, 2022. Photo by H. McGaughey.

Kashiwa, Chiba: Over a year ago, my partner and I both got fellowships from the Japan Society for the Promotion of Science (JSPS). It took a long time to get to Japan, and we were able to take advantage of our privilege as researchers at universities. Students, professionals, and their families have had a far harder time of it, and many have not been able to come at all, keeping couples and families apart for months, even years, now. May this year be good to us all!

This year, I finally made it to the shrine at Nikkō, and our New Year break was the opportunity. We rented a small home near the shrine, called Tōshōgū, where Tokugawa Ieyasu is entombed.

Class Material

Lektüre klassischer Texte (Reading Classical Texts)

Trier, Germany: This is a part of a lesson I developed for a German advanced reading class in classical Japanese (kobun) for the Japanologie department at the University of Trier in the winter semester of 2020/2021. Students were asked to test their skills by reading a broad selection of poetry. I found published English translations of all of these poems. Please get in touch if you are interested.