write down,forget

Simplified/Traditional Chinese plugin updated, now supports ES 2.0


STConvert, the ES analysis plugin for Simplified/Traditional Chinese conversion, has been updated

Address: https://github.com/medcl/elasticsearch-analysis-stconvert

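1. Supports the latest ES 2.0.

2. Several Analyzers, Tokenizers, and TokenFilters are now built in, so they can be used directly without being configured in elasticsearch.yml beforehand, which is more convenient. Custom parameter configuration is still supported, and the approach is compatible with earlier versions.

3. Two new CharFilters have been added. They are preregistered by default and can be used directly. What are they for? Suppose your text mixes Simplified and Traditional characters, as in 『北京國際電視檯』. Your dictionary certainly does not contain such a word, so the Tokenizer will split it into 【北京】【國】【際】【電】【視】【檯】. If a CharFilter preprocesses the text first, for example by converting everything to Simplified 【北京国际电视台】, tokenization then yields the correct result: 【北京国际电视台】.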
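
The effect described in point 3 can be reproduced with a small toy model. The sketch below is only an illustration under stated assumptions: the `T2S` mapping, the `DICTIONARY` set, and the greedy longest-match `tokenize` function are hypothetical stand-ins written for this example, not the plugin's conversion table or any real ES tokenizer.

```python
# Toy illustration only: a tiny hand-written mapping and dictionary (assumptions),
# not the plugin's real conversion table or tokenizer.
T2S = {"國": "国", "際": "际", "電": "电", "視": "视", "檯": "台"}
DICTIONARY = {"北京", "北京国际电视台"}


def t2s_char_filter(text):
    """Map each Traditional character to its Simplified form; leave others unchanged."""
    return "".join(T2S.get(ch, ch) for ch in text)


def tokenize(text, dictionary=DICTIONARY):
    """Greedy forward longest-match; unknown characters become single-character tokens."""
    tokens, i = [], 0
    max_len = max(len(word) for word in dictionary)
    while i < len(text):
        # Try the longest candidate first and fall back to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


mixed = "北京國際電視檯"
print(tokenize(mixed))                   # tokenizing the raw mixed text
print(tokenize(t2s_char_filter(mixed)))  # char filter first, then tokenize
```

In this toy setup the first call falls back to single characters after 【北京】, while the second finds the whole dictionary entry, because the character-level conversion runs before the dictionary lookup.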

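Example:

A case where mixed Simplified and Traditional text goes wrong:

Tokenization result after processing with the Simplified/Traditional CharFilter: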

While testing, I also found a bug in the analyze API: char_filters submitted via POST do not take effect:
https://github.com/elastic/elasticsearch/issues/15657
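
For anyone trying to reproduce the issue, it may help to see the two request shapes side by side. The sketch below only builds the requests and sends nothing. The parameter names (`tokenizer`, `char_filters`, `text`) are given as I recall them from the ES 2.x analyze API and should be checked against the documentation for your version; the host, the `keyword` tokenizer, and the built-in `html_strip` char filter are assumptions used as stand-ins, to be swapped for the plugin's own char filter name. Whether the query-string form is unaffected by the bug is not something this sketch verifies.

```python
import json
from urllib.parse import urlencode

BASE = "http://localhost:9200/_analyze"  # assumed local node
CHAR_FILTER = "html_strip"               # built-in stand-in; replace with the plugin's char filter


def query_string_request(text):
    """Build an analyze URL that passes char_filters as a query-string parameter."""
    params = {"tokenizer": "keyword", "char_filters": CHAR_FILTER, "text": text}
    return BASE + "?" + urlencode(params)


def json_body_request(text):
    """Build the JSON body form, which is the shape submitted via POST."""
    body = {"tokenizer": "keyword", "char_filters": [CHAR_FILTER], "text": text}
    return json.dumps(body, ensure_ascii=False)


print(query_string_request("北京國際電視檯"))
print(json_body_request("北京國際電視檯"))
```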
