ACTIV-ES: a comparable, cross-dialect corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain

The first release of the ACTIV-ES Spanish dialect corpus based on TV/film transcripts is now available here: https://github.com/francojc/activ-es

It contains 3,460,172 tokens in total (Argentina: 1,103,039; Mexico: 976,192; Spain: 1,380,941) and comes in two formats: running text and word lists (1- to 5-grams). Each format is available in both a plain-text and a part-of-speech-tagged version.
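To give a rough idea of what a 1- to 5-gram word list holds, here is a minimal Python sketch that counts n-grams in a short token sequence. It is only an illustration: the function name `ngram_counts`, the whitespace tokenization, and the sample sentence are my own assumptions, not the procedure or file layout actually used to build the ACTIV-ES word lists.

```python
from collections import Counter


def ngram_counts(tokens, max_n=5):
    """Count every n-gram of length 1..max_n in a list of tokens.

    Hypothetical helper for illustration; not part of the ACTIV-ES release.
    """
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[n][" ".join(tokens[i:i + n])] += 1
    return counts


# Assumption: naive whitespace tokenization of an invented sample line.
sample = "no sé qué te pasa no sé".split()
counts = ngram_counts(sample)

print(counts[1].most_common(2))  # most frequent single words
print(counts[2].most_common(1))  # most frequent two-word sequence
```

A real word list would be built from properly tokenized text, and the tagged version would pair each token with its part-of-speech tag.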

For more information about the development and evaluation of this resource, you can download our paper, presented at the Ninth Annual Language Resources and Evaluation Conference (LREC 2014), here: https://www.academia.edu/6962707/ACTIV-ES_a_comparable_cross-dialect_corpus_of_everyday_Spanish_from_Argentina_Mexico_and_Spain
[Figure: plot_country-year-genre]

Disappearing Spaces in Lion

I just realized that if you have your Dock on the left edge of the screen, you cannot create a Space in Mission Control: the option disappears. Move the Dock to the bottom or right and you will be fine.

Weird.

Update, May 2, 2012: the Space tab appears on the opposite side from the Dock, so don’t go looking for it on the left side if your Dock is on the left. :)

A special lecture by: Dr. Adam Ussishkin at WFU, Thursday March 1st @ 4pm in Greene Hall 162

Our WFU Interdisciplinary Linguistics Minor
announces a special lecture by

Dr. Adam Ussishkin
University of Arizona

Assoc. Professor of Linguistics & Cognitive Science


Psycholinguistics of under-studied languages: the case of subliminal speech priming in Maltese


Early and automatic processing of linguistic stimuli is fairly well-studied for resource-heavy languages such as English (cf. work on visual masked priming by Forster and Davis 1984, Forster et al. 2003, among many others), whereas psycholinguistic studies on languages with few resources are much rarer. In this talk, I first describe the creation of the first online language corpus of Maltese, a Semitic language for which few electronic resources exist. Next, I discuss the application of the corpus to a psycholinguistic question and investigate the psycholinguistic reality of the consonantal root, a building block of Semitic languages. This investigation is carried out using the relatively novel subliminal speech priming technique.

Thursday March 1st @ 4pm in Greene Hall 162

Overcoming an IMDbPY installation issue on Ubuntu 11.04

IMDbPY is a Python module that enables backend search and retrieval of information from IMDb. To install IMDbPY on Ubuntu, you’ll need to download the module here. Then extract the module and, from the extracted directory, run the installer as root:

$ sudo python setup.py install

You may get an error complaining about a ‘gcc’ compiler. I did, even though a quick:

$ which gcc

returns a live ‘gcc’ compiler on my box. The trick, which I found here, is to install ‘python-dev’ through your Ubuntu package manager:

$ sudo apt-get install python-dev

Then you should be able to run the earlier module installation without errors. Fire up Python and import the module to make sure:

$ python
>>> import imdb

Things should be fine!
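If you would rather script the check, here is a small Python sketch. It rests on two assumptions of mine: that the build fails because the Python C header `Python.h` is missing (that header is what the ‘python-dev’ package provides, but the error I saw only mentioned gcc), and that you run it with a Python 3 interpreter, since `importlib.util.find_spec` does not exist in the Python 2 that shipped with Ubuntu 11.04. The function names are made up for this example.

```python
import importlib.util
import os
import sysconfig


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


def python_headers_present():
    """Return True if Python.h exists in this interpreter's include directory."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


if __name__ == "__main__":
    # False here suggests the development headers are not installed.
    print("Python.h found:", python_headers_present())
    # True here means `import imdb` should succeed.
    print("imdb importable:", module_available("imdb"))
```

If the first line prints False, installing the development package as described above is the likely fix. If the second prints False, the module installation has not completed.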