在我的雪花数据库中,一个表具有非utf8字符。我如何在它上面创建仅包含utf8字符的视图;通过排除非utf8字符的行还是替换它们?谢谢
应该可以使用以下测试来检查非utf:
MY_STRING IS NOT NULL AND TRY_HEX_DECODE_STRING(HEX_ENCODE(MY_STRING)) IS NULL
但是那时我没有数据可以测试。 要将字符串重新编码为utf-8,可以使用JavaScript函数:
CREATE OR REPLACE FUNCTION TO_UTF8(BINARY_TEXT BINARY) RETURNS TEXT LANGUAGE JAVASCRIPT STRICT IMMUTABLE AS ' var win1252 = [ /* C1 controls */ 8364, 129, 8218, 402, 8222, 8230, 8224, 8225, 710, 8240, 352, 8249, 338, 141, 381, 143, 144, 8216, 8217, 8220, 8221, 8226, 8211, 8212, 732, 8482, 353, 8250, 339, 157, 382, 376 ]; return String.fromCharCode( ...Array.from(BINARY_TEXT).map(x => (x < 128 || x > 159) ? x : (win1252[x - 128])) ); /* .map(...) can be removed if no conversion from win1252 needed */ '; SELECT NVL(TRY_HEX_DECODE_STRING(HEX_ENCODE(MY_STRING)), TO_UTF8(HEX_ENCODE(MY_STRING)::BINARY));