
Publication – Text Mining JRPGs

The article I have been working on for many years has now been published in the Journal of Gaming and Virtual Worlds. The paper grew out of an experimental project that I conducted to complete the field papers requirement that the MLCS department at the University of Alberta had at the time as part of the PhD degree. I essentially text mined hundreds of JRPG reviews in order to find meaningful discourses that would help scholars understand the formation of the JRPG genre as a discursive phenomenon. While not all the results introduced groundbreaking elements to the history of the circulation of JRPGs in the anglophone Western world, I believe that the paper provides enough new elements to serve as a basis for renewed inquiries into the genre, and that it reaffirms previous claims with statistical evidence.

Perhaps the element that surprised me the most was that, from 2009 onward, JRPGs started to be written about in a much more negative fashion than in previous years. The contrast between the two generated topics covering positive and negative language offered some paths worth investigating:

Negative connotations attached to the genre clearly outnumber positive ones with JRPGs becoming objects of harsh critique at a time of an important industry-changing technological shift in game production and marketing. Although JRPGs did have a generally positive reputation in the first few years of the twenty-first century, their image was tarnished by the end of the decade as their higher presence in the media exposed them to conflictual reinterpretations through a phenomenon comparable to Appadurai’s ‘tournament of values’.

Overall, I think this is just the beginning between text mining and me. Having gone through all the steps to publish this sort of research (both to conduct and explain the project), I am now in a much stronger position to tackle more ambitious projects on gaming culture and digital humanities methods using my own tools. If you have any game-related text data sets that beg to be explored, don’t hesitate to reach out!


History of Games – Montréal

I will be in Montréal this week for the annual History of Games symposium. I will present my database project, which brings together my collection of Japanese arcade video game flyers, and demonstrate how useful this kind of material is for deepening our knowledge of arcade games and, above all, of the space of the Japanese game center.

The conference itself focuses on questions surrounding video game preservation.

It is free and open to the public.


Arcade and Game Center Chirashi Database 1.1

I just finished upgrading the Arcade and Game Center Chirashi Database with the latest documents I gathered during my research trip earlier this year. I also made a handful of little improvements, just enough to warrant a version change. Here is the full list:

  • Added documents acquired from January to April 2016
  • Added database values for tags, researchers, timestamp and venue
  • Added new type of query in relation to the Playing in Public project: Game Center
  • Multiple code enhancements

I have not yet implemented the tag system in the search query modules, as I do not yet know how to do so efficiently. I also need to upgrade the viewer page to make it more readable. Both will be part of a future update.
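One possible way to make tag search efficient is an inverted index that maps each tag to the set of documents carrying it, so that a multi-tag query becomes a set intersection. The following is a minimal sketch in Python, not the database's actual code: the document ids, the tag names, and the functions `build_tag_index` and `search_by_tags` are all hypothetical.

```python
from collections import defaultdict


def build_tag_index(documents):
    """Map each tag to the set of document ids that carry it.

    `documents` is a dict of {document_id: iterable of tags}; this shape is an
    assumption made for the example, not the database's real schema.
    """
    index = defaultdict(set)
    for doc_id, tags in documents.items():
        for tag in tags:
            index[tag.lower()].add(doc_id)
    return index


def search_by_tags(index, tags):
    """Return the sorted ids of documents that carry every requested tag."""
    wanted = [tag.lower() for tag in tags]
    if not wanted:
        return []
    # Intersect the per-tag sets; an unknown tag yields an empty result.
    result = set(index.get(wanted[0], set()))
    for tag in wanted[1:]:
        result &= index.get(tag, set())
    return sorted(result)


# Hypothetical sample data for illustration only.
chirashi = {
    "doc-001": ["shooter", "1980s"],
    "doc-002": ["fighting", "1990s"],
    "doc-003": ["shooter", "1990s"],
}
tag_index = build_tag_index(chirashi)
print(search_by_tags(tag_index, ["shooter", "1990s"]))
```

The index is built once and each query only touches the sets for the requested tags, which avoids scanning every document. In a relational database, the equivalent would likely be a separate document-to-tag table with an index on the tag column.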



Day of DH 2016

On a whim, I decided to join the Day of DH movement yesterday. I created a small blog that you can access here. The time difference with Japan made it a little challenging to figure out the logistics, but I believe that the experiment was worthwhile.

Day of DH is a global event where DH scholars create a blog associated with the official Day of DH homepage and share whatever they do on that specific day with the community. This year, Day of DH was on April 8th.

As a DH scholar, I often get asked what DH is all about. This event is an attempt to answer that question by tapping into the large diversity of DH scholars to provide an answer that is both crowdsourced and open to further analysis with text mining techniques. I expect some word clouds and other interesting things to be generated from the results of this year’s event; I am very much looking forward to looking at all of it.
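To give a sense of what sits underneath a word cloud, here is a minimal sketch in Python of the word-frequency count that such a visualization is typically built on. It is not the method used by the Day of DH organizers: the sample posts, the short stopword list, and the function `word_frequencies` are made up for illustration.

```python
import re
from collections import Counter

# A tiny, assumed stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "my", "on", "i"}


def word_frequencies(posts, top_n=5):
    """Count non-stopword tokens across all posts and return the top_n."""
    counts = Counter()
    for post in posts:
        # Lowercase the text and keep runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", post.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical blog snippets for illustration only.
sample_posts = [
    "Encoding a text in TEI and cleaning OCR output.",
    "Cleaning data, then text mining the data.",
    "Teaching a class on text encoding.",
]
print(word_frequencies(sample_posts, top_n=3))
```

A word cloud then simply scales each word's display size by its count.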

A Text Mining Week at Texas A&M (2)

I am now back from my Texan adventures in humanities computing at Texas A&M, but I still wish to mention some of the later projects to which I was introduced during my stay.

One of the major differences between DH at Texas A&M and the UoA is that researchers at the former institution focus on an older corpus of texts that is both difficult to access and challenging to digitize on a large scale. While we work with tweets and other born-digital documents, they work with books from the 18th century. The difficulty is that, even once digitized, these books remain hard to transform into a machine-readable format because of problems such as the absence of typesetting standards and the noise that ink produces when a page is read by a machine. The EEBO and ECCO corpora are fraught with these problems.


Considering these problems, the Initiative for Digital Humanities, Media & Culture worked on making these texts more accessible to the broader academic community with the 18th Connect portal. This search engine is linked to various other online collections and repositories, and allows users to search libraries and collections for specific texts published in the 18th century.


Feel like contributing? The 18th Connect portal also hosts TypeWright, an online tool that allows the public to improve the OCR results of certain digitized texts by typing lines of text directly from the scanned document. Just create an account and start typing!


Last but not least, I wish to spread the word about the online class Programming for Humanists at TAMU, which has been offered since 2014. The program offers different registration options (with or without an official certificate) and covers a lot of important topics for DH students. This is a neat online program for students who are interested in the fundamentals of digital humanities but do not have access to a DH introduction class at their home institutions. Take a look if that is your case!

A Text Mining Week at Texas A&M (1)

Blogging live from the extremely sunny campus of Texas A&M, College Station, Texas. Quite a contrast from snowy Edmonton.


I have been fortunate enough to be invited to spend a week at Texas A&M (TAMU) to visit some of the scholars with whom I collaborate on the Novel TM project. Project co-investigator Doctor Laura Mandell and PhD student Nigel Lepianka were nice enough to show me around the campus (unable to drive, I find myself relying on Nigel most of the time).

So far, I have presented some of my work on text mining JRPG video game reviews and was introduced to other text mining techniques using R (specifically, Nigel’s method for directed topic modelling). I was also introduced to some of the projects that the team here is working on as part of their Initiative for Digital Humanities, Media & Culture.

The first one is an extensive web portal that brings resources for the study of the Syriac language to the web. While some of its contents remain to be published, the Gazetteer showcases how the platform can serve as a geographical reference index.


I was also introduced to the BigDIVA viewer today. This is a promising interface that could revolutionize how universities display library search results. I am particularly interested in its potential to help rethink queries with space in mind, presenting results in a less hierarchical manner that would allow marginal files and documents to surface. This is radically different from the regular Google search algorithm, which relies more on the popularity of results amongst millions of users (a form of crowdsourcing) who may be looking for the same specific website. An interesting tool, and one that triggers reflections about what it means to read (and play) space.