I was first introduced to corpus analysis when taking the Spanish-English module at the UAB as part of my degree. It proved very useful, especially for medical texts, since up to 500 texts can be batch downloaded from Medline in txt format. I don’t want to go into more detail right now, but if you want to see how corpus analysis works, you can read an article of mine on the subject.
At the time we used WordSmith Tools, and at the cheap price of £50 (approx. €75) I quickly bought myself a copy. My copy was version 4.0, while at university we were using version 3.0. The new version had the advantage of not being limited to DOS file names (8 characters only), and it generally had a better layout and a few interesting new functions, in particular the WebGetter. Unfortunately it was also less stable, and certain functions stopped working, such as searching for words in context and bilingual text alignment.
This summer, while attending Mediterranean Editors and Translators, I came across a poster for a similar tool called AntConc. Since this is open-source software, I downloaded it and quickly tried it out. Now that I’ve been given a lecturing job, I’m particularly interested in AntConc, since students are much more likely to use a tool if it’s free. I have not looked into it much, but it does seem to be more stable than WordSmith Tools 4.0. However, it is possibly not quite so user-friendly, as it does not calculate everything (concordance, collocates, etc.) in one go; instead, you have to run a separate search for each tool. So for the moment I’m sticking with WordSmith, but I’ll definitely show my students AntConc as well, and encourage them to download it at home.
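For anyone wondering what a concordance and a collocate list actually are, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how WordSmith Tools or AntConc work internally: the function names, the crude tokenizer, the window sizes and the sample sentences are all made up for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Crude illustrative tokenizer: lowercase runs of word characters.
    return re.findall(r"\w+", text.lower())


def concordance(tokens, keyword, width=3):
    # Keyword-in-context lines: up to `width` tokens on either side of each hit.
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocates(tokens, keyword, span=2):
    # Count the words that occur within `span` tokens of the keyword.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            counts.update(tokens[max(0, i - span):i])
            counts.update(tokens[i + 1:i + 1 + span])
    return counts


# Invented sample text standing in for a real corpus of downloaded abstracts.
sample = (
    "The patient received treatment for chronic pain. "
    "Chronic pain treatment often fails. "
    "The treatment reduced pain in most patients."
)
tokens = tokenize(sample)
kwic = concordance(tokens, "pain")
colls = collocates(tokens, "pain")

for line in kwic:
    print(line)
print(colls.most_common(3))
```

Both functions scan the same token list, so a single script could produce both outputs in one run. A real tool would also need proper tokenization, sorting of the context lines and statistical measures for the collocates.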
I have to say that this is an area that has not been exploited to its full potential. A program like this can’t be particularly hard to make, and if only someone could come up with a really excellent corpus analysis program, I’m sure it would be really successful. Or has someone already come up with one, and I’ve just not found it yet? Please let me know if you know a better tool.