I can't believe I'm asking this, but...
How do you escape a SQL query string in Spark SQL using Scala?
I have tried everything and searched everywhere. I thought the Apache Commons library would do it, but no luck:
import org.apache.commons.lang.StringEscapeUtils

var sql = StringEscapeUtils.escapeSql("'Ulmus_minor_'Toledo'")
df.filter("topic = '" + sql + "'").map(_.getValuesMap[Any](...)).collect().foreach(println)
which returns the following:
topic = '''Ulmus_minor_''Toledo'''
        ^
	at scala.sys.package$.error(package.scala:27)
	at org.apache.spark.sql.catalyst.SqlParser.parseExpression(SqlParser.scala:45)
	at org.apache.spark.sql.DataFrame.filter(DataFrame.scala:651)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
	at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
	at $iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
	at $iwC$$iwC$$iwC.<init>(<console>:44)
	at $iwC$$iwC.<init>(<console>:46)
	at $iwC.<init>(<console>:48)
	at <init>(<console>:50)
	at .<init>(<console>:54)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
	at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
	at org.apache.spark.repl.Main$.main(Main.scala:31)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
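As far as I can tell, escapeSql only doubles the single quotes, so the expression handed to Spark's parser ends up as the mess shown above. A quick check (just my scratch code, the printed values are what I believe commons-lang 2.x produces):

import org.apache.commons.lang.StringEscapeUtils

val escaped = StringEscapeUtils.escapeSql("'Ulmus_minor_'Toledo'")
println(escaped)                     // ''Ulmus_minor_''Toledo''  -- every single quote doubled
println("topic = '" + escaped + "'") // topic = '''Ulmus_minor_''Toledo'''  -- the expression the parser rejects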
Any help would be great.
J
Perhaps surprisingly, but:
var sql = "'Ulmus_minor_'Toledo'"
df.filter(s"""topic = "$sql"""")
should work just fine, although it would be cleaner to use:
df.filter($"topic" <=> sql)
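For completeness, here is a minimal sketch of both variants as I would run them in the Spark 1.x spark-shell (where sqlContext is predefined); the toy DataFrame and values are purely illustrative, and import sqlContext.implicits._ is what enables the $"..." column syntax:

// Assumes the Spark 1.x spark-shell, where sqlContext is already in scope.
import sqlContext.implicits._

// Toy stand-in for the real `df` from the question.
val df = Seq(Tuple1("'Ulmus_minor_'Toledo'"), Tuple1("Quercus_robur")).toDF("topic")

val sql = "'Ulmus_minor_'Toledo'"

// 1) SQL-expression string: the double quotes delimit the literal,
//    so the embedded single quotes need no escaping at all.
df.filter(s"""topic = "$sql"""").show()

// 2) Column DSL: no string parsing involved; <=> is null-safe equality.
df.filter($"topic" <=> sql).show()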