I have been struggling with this for two days now: I simply cannot delete a document with indexWriter.deleteDocuments(term).
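Below is the code I am testing with, in the hope that someone can point out what I am doing wrong. Things I have already tried: upgrading Lucene from 2.x to 5.x; using indexWriter.deleteDocuments() instead of indexReader.deleteDocuments(); and setting indexOption to NONE or to DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.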
Here is the code:
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Paths;

public class TestSearch {
    static SimpleAnalyzer analyzer = new SimpleAnalyzer();

    public static void main(String[] argvs) throws IOException, ParseException {
        generateIndex("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
        delete("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
    }

    public static void generateIndex(String id) throws IOException {
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, config);
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);
        fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        Field idField = new Field("_id", id, fieldType);
        Document doc = new Document();
        doc.add(idField);
        iwriter.addDocument(doc);
        iwriter.close();
    }

    public static void query(String id) throws ParseException, IOException {
        Query query = new QueryParser("_id", analyzer).parse(id);
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
        for (ScoreDoc scdoc : scoreDoc) {
            Document doc = isearcher.doc(scdoc.doc);
            System.out.println(doc.get("_id"));
        }
    }

    public static void delete(String id) {
        try {
            Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
            IndexWriterConfig config = new IndexWriterConfig(analyzer);
            IndexWriter iwriter = new IndexWriter(directory, config);
            Term term = new Term("_id", id);
            iwriter.deleteDocuments(term);
            iwriter.commit();
            iwriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
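First, generateIndex() creates an index in /tmp/test/lucene, and query() shows that the id can be found. Then delete() is supposed to remove the document, but the second query() finds it again, which shows that the delete failed.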
Here are the pom dependencies, in case someone wants to test this:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>5.5.4</version>
    <type>jar</type>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>5.5.4</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-queryparser</artifactId>
    <version>5.5.4</version>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-smartcn</artifactId>
    <version>5.5.4</version>
</dependency>
Looking forward to an answer.
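Your problem is in the analyzer. SimpleAnalyzer defines tokens as maximal strings of letters (StandardAnalyzer or even WhitespaceAnalyzer are the more typical choices), so the value you are indexing gets split into the tokens "b", "a", "b", "d", "f". The delete method you've defined does not go through the analyzer, though; it just creates a raw term, and that raw term matches none of the indexed tokens. You can see this in action if you replace the body of your main with this: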
generateIndex("5836962b0293a47b09d345f1");
query("5836962b0293a47b09d345f1");
delete("b");
query("5836962b0293a47b09d345f1");
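To make the token split concrete, here is a plain-Java approximation of that "maximal runs of letters, lowercased" rule. It does not call Lucene at all; the class name LetterTokens and the tokenize helper are made up for illustration, and the real analyzer may differ in details such as Unicode handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: mimics splitting on non-letters.
class LetterTokens {
    // Every maximal run of letters becomes one lowercased token;
    // digits and any other characters only act as separators.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String id = "5836962b0293a47b09d345f1";
        List<String> tokens = tokenize(id);
        System.out.println(tokens);               // [b, a, b, d, f]
        System.out.println(tokens.contains(id));  // false: the full id is not a token
        System.out.println(tokens.contains("b")); // true: why delete("b") matches
    }
}
```

The full id never appears as a token, so a raw term built from it has nothing to match.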
In general, queries, terms and the like do not analyze their input, whereas QueryParser does.
For what looks like an identifier field, you probably don't want to analyze the field at all. In that case, add this to the FieldType:
fieldType.setTokenized(false);
You will then have to change your query as well (again, QueryParser analyzes) and use a TermQuery instead:
Query query = new TermQuery(new Term("_id", id));
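As a rough mental model of why the untokenized field fixes the delete, the sketch below uses a toy in-memory inverted index written in plain Java. ToyIndex, add, deleteByTerm and letterTokens are all hypothetical names for illustration. Nothing here is Lucene API, and real Lucene deletes work differently internally. The only point is that an exact-term delete matches a document only if that exact term was indexed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: an inverted index mapping terms to document ids.
class ToyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> liveDocs = new HashSet<>();
    private int nextDocId = 0;

    // Index a value either as letter-run tokens or as one single term.
    void add(String value, boolean tokenized) {
        int docId = nextDocId++;
        liveDocs.add(docId);
        List<String> terms = tokenized
                ? letterTokens(value)
                : Collections.singletonList(value);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
        }
    }

    // Delete every live document containing exactly this term;
    // returns how many documents were removed.
    int deleteByTerm(String term) {
        int deleted = 0;
        for (int docId : postings.getOrDefault(term, Collections.emptySet())) {
            if (liveDocs.remove(docId)) {
                deleted++;
            }
        }
        return deleted;
    }

    // Approximation of "maximal runs of letters, lowercased".
    private static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String id = "5836962b0293a47b09d345f1";

        ToyIndex analyzed = new ToyIndex();
        analyzed.add(id, true);
        System.out.println(analyzed.deleteByTerm(id)); // 0: no such term indexed

        ToyIndex raw = new ToyIndex();
        raw.add(id, false);
        System.out.println(raw.deleteByTerm(id));      // 1: exact term matches
    }
}
```

The same reasoning applies to the lookup side: once the field is stored as a single term, both the delete and the query have to use that exact term rather than analyzed text.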