http://code.google.com/p/pychseg/
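A Python implementation of the MMSEG Chinese word segmentation algorithm: forward maximum matching plus several additional rules.
It requires psyco, which is a bit of a hassle to install. Usage is as follows: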
# -*- coding: utf-8 -*-
from pychseg.mmseg.algorithms import SimpleAlgorithm
from pychseg.mmseg import algorithms

testdata = 'hello world'
a = SimpleAlgorithm(testdata)
words = a.segment()
ww = [w for w in words]
print ww, len(ww)
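To make "forward maximum matching" concrete, here is a minimal sketch of the basic greedy step. It is not pychseg's code: the toy dictionary `TOY_DICT` and the function name `forward_max_match` are made up for illustration, and a real segmenter would load a full lexicon instead.

```python
# -*- coding: utf-8 -*-
# Hypothetical toy dictionary (an assumption for this sketch only).
TOY_DICT = {u"研究", u"研究生", u"生命", u"起源"}
MAX_WORD_LEN = max(len(w) for w in TOY_DICT)


def forward_max_match(text, dictionary=TOY_DICT, max_len=MAX_WORD_LEN):
    """Scan left to right, taking the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


result = forward_max_match(u"研究生命起源")
print(u" / ".join(result))
```

With this toy dictionary the greedy pass yields 研究生 / 命 / 起源 rather than the intended 研究 / 生命 / 起源. Ambiguities of this kind are what the extra rules in MMSEG are meant to resolve, by comparing several candidate segmentations instead of committing to the first longest match.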
http://code.google.com/p/pymmseg-cpp/
https://github.com/pluskid/pymmseg-cpp/
Source: Two Word Segmentation Tools for Python



