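When discussing corpora earlier, we skipped over one point: a corpus is only a collection of language data, and searching, analyzing, and processing that data requires corpus tools. Faced with a corpus of massive size, it is hard to imagine how much time and effort manual work would take. Modern corpus work therefore depends on computer software. One could even say that corpus tools are essential to searching, analyzing, and processing a corpus, and that without them corpus work can hardly proceed at all. So the two basic prerequisites for corpus work are the corpus itself (the collection of texts) and computer-based corpus tools.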
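So what corpus tools are there? In this post I want to introduce three basic functions of corpus tools: concordance, word list (wordlist), and collocation lookup (collocate). The software used here is AntConc 3.21, a free program written by Laurence Anthony, which can be downloaded from Anthony's homepage. This post also uses screenshots from the AntConc online help on Laurence Anthony's website; for a complete guide to the software, see the AntConc help system.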

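Concordance (concordance)
A concordance originally meant an alphabetical listing of the words or terms in a text, used to determine where each one occurs and how many times (hits). In corpus analysis, a concordance is produced by a concordancer: it displays each occurrence of the search word (the node) together with its context, within a specified span (measured in characters or in words), with the search word centered on the line. This display is also known as KWIC (key word in context). An example is shown in the figure below.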
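The KWIC display described above can be sketched in a few lines of Python. This is only a minimal illustration, not how AntConc itself is implemented: the function names `kwic` and `format_kwic`, the simple regular-expression tokenizer, and the sample sentence are all assumptions made for the example, and the span is counted in words.

```python
import re


def kwic(text, node, span=4):
    """Collect (left context, node, right context) for every hit of `node`.

    `span` is the number of words kept on each side of the node word.
    """
    # Naive tokenizer (an assumption): lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    target = node.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            rows.append((left, tok, right))
    return rows


def format_kwic(rows, width=30):
    """Right-align the left context so the node word lines up in one column."""
    return [f"{left:>{width}}  {word}  {right:<{width}}" for left, word, right in rows]


if __name__ == "__main__":
    sample = "We make a contribution. They make an appearance and make allowances."
    for line in format_kwic(kwic(sample, "make", span=3)):
        print(line)
```

The number of rows returned is the hit count, and padding the left context to a fixed width is what keeps the node word centered in a single column.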


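Word list (wordlist)
The word list function reports the number of word types (distinct word forms) and word tokens (running words) in a text, and lists the types ranked by frequency from high to low. An example is shown in the figure below.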
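As a rough sketch of what a word list function computes, the following Python counts tokens and types and ranks the types by frequency. The function name `wordlist` and the simple tokenizer are assumptions for illustration; real tools let you configure what counts as a word.

```python
import re
from collections import Counter


def wordlist(text):
    """Return (token count, type count, types ranked by frequency)."""
    # Naive tokenizer (an assumption): lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    freq = Counter(tokens)
    # Highest frequency first; ties are broken alphabetically.
    ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
    return len(tokens), len(freq), ranked


if __name__ == "__main__":
    n_tokens, n_types, ranked = wordlist("the cat and the dog and the bird")
    print(f"tokens: {n_tokens}, types: {n_types}")
    for rank, (word, count) in enumerate(ranked, start=1):
        print(rank, count, word)
```

Every running word adds one to the token count, while each distinct form is counted only once as a type.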


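Collocation statistics (collocate)
Collocation is the habitual co-occurrence of words. This co-occurrence follows certain regularities, and those regularities are themselves probabilistic, that is, they show up as statistical tendencies. Many corpus tools offer a collocation statistics function. A KWIC concordance also reveals a word's collocates, but the picture it gives tends to be fragmentary and unsystematic, whereas a collocation function can rank a word's collocates by a statistic in descending or ascending order, giving the researcher or learner an intuitive impression.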
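The idea of ranking collocates can be sketched as follows. This minimal Python example counts only raw co-occurrence frequency within a word span around the node; the function name `collocates`, the tokenizer, and the sample text are assumptions, and it does not compute any association measure, which real tools may also offer.

```python
import re
from collections import Counter


def collocates(text, node, span=2):
    """Count words occurring within `span` words of `node`, most frequent first."""
    # Naive tokenizer (an assumption): lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    target = node.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Words to the left and right of this hit, excluding the node itself.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts.most_common()


if __name__ == "__main__":
    sample = "make a contribution make a concession make no allowance"
    for word, count in collocates(sample, "make", span=1):
        print(word, count)
```

Reversing the returned list gives the ascending ordering mentioned above.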
For example, looking up make in Just the Word returns the following collocation information:

'make' is V
V* obj N , e.g. make contribution (1273)
N subj V* , e.g. company make (345)
ADV V* , e.g. also make (1354)
V* ADV , e.g. make up (6195)
V* PREP , e.g. make of (4371)
V and V* , e.g. be and make (1690)
V or V* , e.g. be or make (128)
V* and V , e.g. make and be (1039)
V* or V , e.g. make or break (61)

'make' is N
V obj N* , e.g. do make (14)
ADJ N* , e.g. different make (19)
N* PREP , e.g. make of (119)
N* N , e.g. make sense (4)
PREP N* , e.g. on make (37)
N PREP N* , e.g. sort of make (21)
N* and N , e.g. make and model (14)
N* or N , e.g. make or decision (6)
article N* , e.g. .make (124)

cluster 1
make allowance (284)
make allowances (68)
make .allowance (65)
make no allowance (18)
make appearance (480)
make public appearance (21)
make rare appearance (11)
make appearances (33)
make brief appearance (15)
make an appearance (152)
make her appearance (14)
make his appearance (99)
make their appearance (33)
make its appearance (57)
make award (131)
make concession (260)
make concessions (108)
make a concession (26)
make any concessions (12)
make no concessions (21)
make some concessions (13)
make contribution (1273)
make financial contribution (13)
make great contribution (25)
make important contribution (71)
make large contribution (22)
make major contribution (75)
make outstanding contribution (11)
make positive contribution (43)
make significant contribution (110)
make small contribution (11)
make substantial contribution (41)
make useful contribution (25)
make valuable contribution (34)
make contributions (179)
make .contribution (20)
make real contribution (13)
make a contribution (745)
make his contribution (25)
make the contribution (31)
make their contribution (30)
make our contribution (11)
make own contribution (24)
make any contribution (21)
make their contributions (15)
make no contribution (16)
make some contribution (27)

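From this partial look at the Just the Word collocation results for make, we get a rough idea of what a collocation statistics function provides. Mastering collocations is an important part of learning a foreign language. Everyday reading and collocation dictionaries can both help us learn a word's collocations. By comparison, though, thanks to the inherent advantages of corpora, a corpus should be a tool for looking up collocation information comprehensively, accurately, and quickly.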