小编典典

Special characters in an XPath query

c#

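I use the following XPath query to list the objects under a site: ListObject[@Title='SomeValue']. SomeValue is dynamic. The query works as long as SomeValue does not contain an apostrophe ('). I also tried using an escape sequence, but that did not work.

What am I doing wrong?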


2020-05-19

1 answer

小编典典

This is surprisingly difficult to do.

Take a look at the XPath Recommendation, and you'll see that it defines a literal as:

Literal ::=   '"' [^"]* '"' 
            | "'" [^']* "'"

That is to say, a string literal in an XPath expression can contain apostrophes or double quotes, but not both.

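You can't use escaping to get around this. XPath has no escape mechanism inside a literal, so an entity reference is taken character for character. A literal like this:

'Some&apos;Value'

will match this XML text:

Some&amp;apos;Value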
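
To see that an XML entity reference is not treated as an escape inside an XPath literal, here is a minimal sketch using System.Xml. The class name, method name, element names and the attribute name t are made up for illustration and are not from the question.

```csharp
using System;
using System.Xml;

public class EscapeDemo
{
    // Returns the name of the element whose attribute matches the literal.
    public static string MatchedElementName()
    {
        var doc = new XmlDocument();
        // The XML parser turns &apos; into a real apostrophe (element a)
        // and &amp;apos; into the six characters "&apos;" (element b).
        doc.LoadXml("<root><a t=\"Some&apos;Value\"/><b t=\"Some&amp;apos;Value\"/></root>");

        // XPath does not expand entity references inside a string literal,
        // so this literal is the text Some&apos;Value, character for character.
        XmlNode hit = doc.SelectSingleNode("/root/*[@t='Some&apos;Value']");
        return hit == null ? null : hit.Name;
    }

    public static void Main()
    {
        Console.WriteLine(MatchedElementName()); // prints "b"
    }
}
```

The query selects b, whose value contains a literal ampersand, and not a, whose value contains the apostrophe.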

This does mean that there can be XML text for which no matching XPath literal can be written, for example:

<elm att="&quot;&apos;"/>

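But that doesn't mean it's impossible to match that text with XPath; it's just tricky. Whenever the value you want to match contains both single and double quotes, you can build an expression that uses concat to produce the text to match: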

elm[@att=concat('"', "'")]
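
The concat expression above can be checked with a short sketch. It assumes a one-element document built just for the test; the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public class ConcatDemo
{
    // True if the concat() expression finds the attribute holding both quote kinds.
    public static bool Matches()
    {
        var doc = new XmlDocument();
        // att holds two characters: a double quote followed by an apostrophe.
        doc.LoadXml("<root><elm att=\"&quot;&apos;\"/></root>");

        // Each quote character sits in a literal delimited by the other kind.
        string xpath = "/root/elm[@att=concat('\"', \"'\")]";
        return doc.SelectSingleNode(xpath) != null;
    }

    public static void Main()
    {
        Console.WriteLine(Matches()); // prints "True"
    }
}
```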

So that leads us to this, which is a lot more complicated than I'd like it to be:

// Requires "using System.Text;" for StringBuilder.
/// <summary>
/// Produce an XPath literal equal to the value if possible; if not, produce
/// an XPath expression that will match the value.
/// 
/// Note that this function will produce very long XPath expressions if a value
/// contains a long run of double quotes.
/// </summary>
/// <param name="value">The value to match.</param>
/// <returns>If the value does not contain both single and double quotes, an
/// XPath literal equal to the value.  If it contains both, an XPath expression,
/// using concat(), that evaluates to the value.</returns>
static string XPathLiteral(string value)
{
    // if the value lacks double quotes or lacks single quotes, wrap it in
    // the other kind of quote to form an XPath literal
    if (!value.Contains("\""))
    {
        return "\"" + value + "\"";
    }
    if (!value.Contains("'"))
    {
        return "'" + value + "'";
    }

    // if the value contains both single and double quotes, construct an
    // expression that concatenates all non-double-quote substrings with
    // the quotes, e.g.:
    //
    //    concat("foo", '"', "bar")
    StringBuilder sb = new StringBuilder();
    sb.Append("concat(");
    string[] substrings = value.Split('\"');
    for (int i = 0; i < substrings.Length; i++ )
    {
        bool needComma = (i>0);
        if (substrings[i] != "")
        {
            if (i > 0)
            {
                sb.Append(", ");
            }
            sb.Append("\"");
            sb.Append(substrings[i]);
            sb.Append("\"");
            needComma = true;
        }
        if (i < substrings.Length - 1)
        {
            if (needComma)
            {
                sb.Append(", ");                    
            }
            sb.Append("'\"'");
        }

    }
    sb.Append(")");
    return sb.ToString();
}

Yes, I tested it with all the edge cases. That's why the logic is so stupid:

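static void Main(string[] args)
{
    // Requires "using System;" and "using System.Xml;".
    // The document needs a root element named "root" for the queries below.
    XmlDocument d = new XmlDocument();
    d.LoadXml("<root/>");

    foreach (string s in new[]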
    {
        "foo",              // no quotes
        "\"foo",            // double quotes only
        "'foo",             // single quotes only
        "'foo\"bar",        // both; double quotes in mid-string
        "'foo\"bar\"baz",   // multiple double quotes in mid-string
        "'foo\"",           // string ends with double quotes
        "'foo\"\"",         // string ends with run of double quotes
        "\"'foo",           // string begins with double quotes
        "\"\"'foo",         // string begins with run of double quotes
        "'foo\"\"bar"       // run of double quotes in mid-string
    })
    {
        Console.Write(s);
        Console.Write(" = ");
        Console.WriteLine(XPathLiteral(s));
        XmlElement elm = d.CreateElement("test");
        d.DocumentElement.AppendChild(elm);
        elm.SetAttribute("value", s);

        string xpath = "/root/test[@value = " + XPathLiteral(s) + "]";
        if (d.SelectSingleNode(xpath) == elm)
        {
            Console.WriteLine("OK");
        }
        else
        {
            Console.WriteLine("Should have found a match for {0}, and didn't.", s);
        }
    }
    Console.ReadKey();
}
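
For the query in the question, a shorter variant may be enough. The following is a sketch, not the function above: QuoteForXPath is a hypothetical helper that always emits every substring, including empty ones as "", which gives slightly longer expressions in exchange for simpler logic. The Site element and the sample title are assumptions made up for the demo.

```csharp
using System;
using System.Linq;
using System.Xml;

public class CompactQuoting
{
    // Hypothetical helper: quote a value for use in an XPath expression.
    public static string QuoteForXPath(string value)
    {
        if (!value.Contains("\"")) return "\"" + value + "\"";
        if (!value.Contains("'")) return "'" + value + "'";
        // Both quote kinds present: split on double quotes, wrap every piece
        // in double quotes, and join the pieces with the literal '"'.
        var parts = value.Split('"').Select(p => "\"" + p + "\"");
        return "concat(" + string.Join(", '\"', ", parts) + ")";
    }

    // Builds a tiny document and looks the title up again through XPath.
    public static bool FindsTitle(string title)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Site><ListObject/></Site>"); // assumed document shape
        ((XmlElement)doc.DocumentElement.FirstChild).SetAttribute("Title", title);
        string xpath = "/Site/ListObject[@Title=" + QuoteForXPath(title) + "]";
        return doc.SelectSingleNode(xpath) != null;
    }

    public static void Main()
    {
        string title = "Bob's \"big\" list";
        Console.WriteLine(QuoteForXPath(title));
        Console.WriteLine(FindsTitle(title));
    }
}
```

Empty string literals are valid arguments to concat, so leading, trailing, and repeated double quotes need no special handling in this variant.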
2020-05-19