Resources

Language data, corpora & wordlists
Brown corpus: tagged with TreeTagger, simple tagged format, parsed
Bank of English corpus wordlist (reference corpus)
Frown corpus wordlist
Flob corpus wordlist
English stop word list
Spanish stop word list
Ancora (Spanish)
Ancora (POS & lemma)
SUBTLEX-XX (subtitle corpora, various languages)
Project Gutenberg (literature in various languages)
Tools
TreeTagger
Hunpos (HMM tagger)
AntConc
Experimental software
DMDX
DMDX remote
Ibex (webspr)
Ibex Farm
Text/statistical Programming
ActivePerl (Windows)
R-project (RStudio is a great IDE)
Python
IPython (via academic Enthought)
Manuals and documentation
Penn tag set
TreeTagger tag set
Regular expressions cheat sheet

Article 1, Article 2
