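I'm trying to use the Duke fast deduplication engine to search for duplicate records in the database at the company where I work.
I run it from the command line like this: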
java -cp "C:\utils\duke-0.6\duke-0.6.jar;C:\utils\duke-0.6\lucene-core-3.6.1.jar" no.priv.garshol.duke.Duke --showmatches --verbose .\config.xml
But I get an error:
Exception in thread "main" java.lang.UnsupportedOperationException: Operation not yet supported
    at sun.jdbc.odbc.JdbcOdbcResultSet.isClosed(Unknown Source)
    at no.priv.garshol.duke.datasources.JDBCDataSource$JDBCIterator.close(JDBCDataSource.java:115)
    at no.priv.garshol.duke.Processor.deduplicate(Processor.java:152)
    at no.priv.garshol.duke.Duke.main_(Duke.java:135)
    at no.priv.garshol.duke.Duke.main(Duke.java:38)
My config file looks like this:
<duke>
  <schema>
    <threshold>0.82</threshold>
    <maybe-threshold>0.80</maybe-threshold>
    <path>test</path>

    <property type="id">
      <name>ID</name>
    </property>
    <property>
      <name>LNAME</name>
      <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
      <low>0.6</low>
      <high>0.8</high>
    </property>
    <property>
      <name>FNAME</name>
      <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
      <low>0.6</low>
      <high>0.8</high>
    </property>
    <property>
      <name>MNAME</name>
      <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
      <low>0.3</low>
      <high>0.5</high>
    </property>
    <property>
      <name>SSN</name>
      <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
      <low>0.0</low>
      <high>1.0</high>
    </property>
  </schema>

  <jdbc>
    <param name="driver-class" value="sun.jdbc.odbc.JdbcOdbcDriver" />
    <param name="connection-string" value="jdbc:odbc:VT_DeDupe" />
    <param name="user-name" value="aleer" />
    <param name="password" value="**" />
    <param name="query" value="select SocialSecurityNumber, LastName, FirstName, MiddleName, empssn from T_Employees" />
    <column name="SocialSecurityNumber" property="ID" />
    <column name="LastName" property="LNAME" />
    <column name="FirstName" property="FNAME" />
    <column name="MiddleName" property="MNAME" />
    <column name="empssn" property="SSN" />
  </jdbc>
</duke>
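It doesn't really tell me which feature isn't supported... I'm just experimenting, so I have no particular requirements for the configuration.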
As mbonaci says, the problem is that the JDBC driver does not implement the isClosed() method. Implementing it also takes more than simply writing "return close".
I've added an ugly workaround for this problem. Please do an "hg pull" and try again.
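For anyone curious what a workaround of this kind can look like, here is a minimal sketch. It is not Duke's actual code: the class `SafeClose`, the helper `closeIfOpen`, and the fake `legacyResultSet` are hypothetical names made up for illustration. The idea is to treat an unsupported isClosed() as "assume the result set is still open" and call close() anyway. The fake result set is a dynamic proxy that behaves like the driver in the stack trace, so the example runs without a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicBoolean;

public class SafeClose {

    // Hypothetical helper (not Duke's actual code): closes the result set,
    // treating "isClosed() is unsupported" as "assume it is still open".
    // Returns true if close() was called.
    static boolean closeIfOpen(ResultSet rs) throws SQLException {
        boolean alreadyClosed;
        try {
            alreadyClosed = rs.isClosed();
        } catch (UnsupportedOperationException | AbstractMethodError e) {
            // Older drivers may not implement the JDBC 4.0 isClosed() method.
            alreadyClosed = false;
        }
        if (alreadyClosed) {
            return false;
        }
        rs.close();
        return true;
    }

    // Stand-in for a legacy driver's ResultSet, for demonstration only:
    // isClosed() throws, close() records that it was called.
    static ResultSet legacyResultSet(AtomicBoolean closeCalled) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "isClosed":
                    throw new UnsupportedOperationException("Operation not yet supported");
                case "close":
                    closeCalled.set(true);
                    return null;
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                SafeClose.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                handler);
    }

    // Runs the demonstration; returns true if close() was reached.
    static boolean demo() {
        AtomicBoolean closeCalled = new AtomicBoolean(false);
        try {
            closeIfOpen(legacyResultSet(closeCalled));
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return closeCalled.get();
    }

    public static void main(String[] args) {
        System.out.println("close() called: " + demo());
    }
}
```

Catching the exception is ugly because it hides a missing driver feature, but it keeps cleanup working on drivers that never implemented isClosed(). Whether Duke's own fix takes this shape is an assumption on my part.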