æ, ø and å are the last letters of the Norwegian alphabet:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Æ Ø Å
When we sort results with Hibernate Search (Lucene), these letters are grouped with their ASCII look-alikes, which is wrong. For example:
Å is grouped with A
Ø is grouped with O
Æ is grouped with A
Current result:
Aaalu,Åaalu,Baalu,Zaalu,
Expected result:
Aaalu,Baalu,Zaalu,Åaalu,
Here is the code currently in use:
@AnalyzerDef(name = "myOwnAnalyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
            @Parameter(name = "replacement", value = " "),
            @Parameter(name = "replace", value = "all")
        }),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
            @Parameter(name = "replacement", value = ""),
            @Parameter(name = "replace", value = "all")
        }),
        @TokenFilterDef(factory = TrimFilterFactory.class)
    }
)
public class KikaPaya implements Serializable {

    @Fields({
        @Field(index = Index.YES, store = Store.YES),
        @Field(name = "KikaPayaName_for_sort", index = Index.YES,
               analyzer = @Analyzer(definition = "myOwnAnalyzer"))
    })
    @Column(name = "NAME", length = 100)
    private String name;
Main:
FullTextEntityManager ftem = Search.getFullTextEntityManager(factory.createEntityManager());
QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity(KikaPaya.class).get();
org.apache.lucene.search.Query query = qb.all().getQuery();
FullTextQuery fullTextQuery = ftem.createFullTextQuery(query, KikaPaya.class);

// Sort on the dedicated sort field; the third argument requests reverse order
fullTextQuery.setSort(new Sort(new SortField("KikaPayaName_for_sort", SortField.STRING, true)));
fullTextQuery.setFirstResult(0).setMaxResults(150);

int size = fullTextQuery.getResultSize();
List<KikaPaya> result = fullTextQuery.getResultList();
for (KikaPaya user : result) {
    logger.info("KikaPaya Name:" + user.getName());
}
Here are the Lucene and Hibernate versions in use (I cannot change them):
<hibernate.version>4.2.8.Final</hibernate.version>
<hibernate.search.version>4.3.0.Final</hibernate.search.version>

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-entitymanager</artifactId>
    <version>4.2.8.Final</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>3.6.2</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers</artifactId>
    <version>3.6.2</version>
</dependency>
Can anyone suggest a way to get the correct result?
You can use the org.apache.lucene.collation.CollationKeyFilter class with Hibernate Search 4.3.0.Final. Create your own collation filter factory:
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.collation.CollationKeyFilter;
import org.apache.solr.analysis.BaseTokenFilterFactory;

import java.text.Collator;
import java.util.Locale;

public final class NorwegianCollationFactory extends BaseTokenFilterFactory {

    @Override
    public TokenStream create(TokenStream input) {
        // Replace each token with its collation key under Norwegian ordering rules
        Collator norwegianCollator = Collator.getInstance(new Locale("no", "NO"));
        return new CollationKeyFilter(input, norwegianCollator);
    }
}
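To see what the collator contributes independently of Lucene, here is a minimal sketch in plain Java. It assumes the running JDK ships Norwegian collation rules for the "no"/"NO" locale; the class name NorwegianCollatorDemo and the sample names are made up for illustration and are not part of the question's data.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class NorwegianCollatorDemo {

    // Returns a copy of the names sorted with the JDK's Norwegian collator
    // (the same Collator instance type the factory hands to CollationKeyFilter).
    static List<String> sortNorwegian(List<String> names) {
        Collator collator = Collator.getInstance(new Locale("no", "NO"));
        List<String> sorted = new ArrayList<String>(names);
        Collections.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00C5se" is A-with-ring followed by "se"; sample data is hypothetical
        List<String> names = Arrays.asList("\u00C5se", "Bo", "Ida", "Zoe");
        System.out.println(sortNorwegian(names));
    }
}
```

Under those assumptions the name starting with Å should end up after the one starting with Z, which is the ordering the question asks for.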
Then use this collation factory in the AnalyzerDef:
@AnalyzerDef(name = "myOwnAnalyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
            @Parameter(name = "replacement", value = " "),
            @Parameter(name = "replace", value = "all")
        }),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
            @Parameter(name = "replacement", value = ""),
            @Parameter(name = "replace", value = "all")
        }),
        @TokenFilterDef(factory = TrimFilterFactory.class),
        @TokenFilterDef(factory = NorwegianCollationFactory.class)
    }
)
public class KikaPaya implements Serializable {
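One thing worth checking in this filter chain: ASCIIFoldingFilter runs before the collation filter, and it folds Å, Ø and Æ to plain ASCII, so the collator may never see the Norwegian letters. The sketch below illustrates the effect in plain Java. The foldNorwegian helper is a hypothetical stand-in that covers only these three letters, not the real Lucene filter, and the example again assumes the JDK provides Norwegian collation rules.

```java
import java.text.Collator;
import java.util.Locale;

public class FoldingBeforeCollationDemo {

    // Hypothetical stand-in for ASCIIFoldingFilter, limited to the three
    // Norwegian letters discussed here (upper and lower case).
    static String foldNorwegian(String s) {
        return s.replace("\u00C5", "A").replace("\u00E5", "a")   // A-ring
                .replace("\u00D8", "O").replace("\u00F8", "o")   // O-stroke
                .replace("\u00C6", "AE").replace("\u00E6", "ae"); // AE ligature
    }

    // True when the collator orders left strictly before right.
    static boolean sortsBefore(Collator collator, String left, String right) {
        return collator.compare(left, right) < 0;
    }

    public static void main(String[] args) {
        Collator norwegian = Collator.getInstance(new Locale("no", "NO"));
        String name = "\u00C5se"; // hypothetical sample value
        System.out.println("unfolded sorts before Zoe: "
                + sortsBefore(norwegian, name, "Zoe"));
        System.out.println("folded sorts before Zoe:   "
                + sortsBefore(norwegian, foldNorwegian(name), "Zoe"));
    }
}
```

If the sorted results still place Å next to A after adding the collation factory, removing ASCIIFoldingFilterFactory from the sort analyzer (or moving the folding to a separate search field) is a reasonable thing to try.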