In the past 20 years, the application of quantitative methods in historical linguistics has received a lot of attention. Traditional historical linguistics relies on the comparative method in order to determine the genealogical related-ness of languages. More recent quantitative approaches attempt to automate this process, either by developing computational tools that complement the comparative method (Steiner et al. 2010) or by applying fully automatized methods that take into account very limited or no linguistic knowledge, e.g. the Levenshtein approach. The Levenshtein method has been extensively used in dialectometry to measure the distances between various dialects (Kessler 1995; Heeringa 2004; Nerbonne 1996). It has also been frequently used to analyze the relatedness between languages, such as Indo-European (Serva and Petroni 2008; Blanchard et al. 2010), Austronesian (Petroni and Serva 2008), and a very large sample of 3002 languages (Holman 2010). In this paper we will examine the performance of the Levenshtein distance against n-gram models and a zipping approach by applying these methods to the same set of language data.
如果您有任何问题，请随时与我们联系，谢谢。Chapter Black box approaches to genealogical classification and their shortcomings 内容均搜集于网络，本身不存储任何资源，如侵犯到您的权益，请提交反馈，我们将配合您第一时间删除。