I use the code below to detect search operators in queries that also contain non-English characters.
/**
 * Prepares a unicode aware RegEx pattern for operators
 *
 * \b (word boundary - wb) can be written as (?:(?<=^)(?=\w)|(?<=\w)(?=$)|(?<=\W)(?=\w)|(?<=\w)(?=\W))
 * \B (non-word boundary - nwb) can be written as (?:(?<=^)(?=\W)|(?<=\W)(?=$)|(?<=\W)(?=\W)|(?<=\w)(?=\w))
 * Unicode-aware \w pattern is [\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}]
 *
 */
const w = String.raw`[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}]`;
const nw = String.raw`[^\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}]`;
const uwb = String.raw`(?:(?<=^)(?=${w})|(?<=${w})(?=$)|(?<=${nw})(?=${w})|(?<=${w})(?=${nw}))`;
const unwb = String.raw`(?:(?<=^)(?=${nw})|(?<=${nw})(?=$)|(?<=${nw})(?=${nw})|(?<=${w})(?=${w}))`;
const OPERATOR_REGEX = new RegExp(
  String.raw`(?!${unwb}"[^"“”]*)${uwb}(and|or|not|exclude)(?=.*\s)${uwb}(?![^"“”]*"${unwb})`,
  'giu'
);
const query1 = '(Java or "化粧" or 化粧品)';
const query2 = '(Java or 化粧 or 化粧品)';
console.log(query1.split(OPERATOR_REGEX));
console.log(query2.split(OPERATOR_REGEX));
This is an impressive approach (answered here), but Safari does not support lookbehind in regular expressions (see "Lookbehind in JS regular expressions").
Is there a good way to make this logic work in Safari?
1 Answer
Lookbehind shipped in Firefox 78, and it looks like it is being implemented right now in WebKit.
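Until every target browser supports it, one option is to detect lookbehind support at runtime and only build the lookbehind-based pattern when it compiles. The following is a minimal sketch; `supportsLookbehind` is a hypothetical helper name, not part of any library. The pattern is built with `new RegExp(...)` rather than a regex literal so that an unsupported engine throws a catchable error instead of failing to parse the whole script.

```javascript
// Hypothetical helper: returns true when the current engine can compile
// a lookbehind assertion, false when the constructor throws a SyntaxError.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}
```

The caller can then choose between `OPERATOR_REGEX` and a fallback that avoids lookbehind.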
See also this similar question, where the responses indicate that lookbehind cannot be polyfilled.
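Since a polyfill is not possible, the remaining route is to restructure the logic so that it needs no lookbehind at all. The sketch below is one possible way, not a drop-in equivalent of `OPERATOR_REGEX`: it scans the query for quoted phrases and runs of Unicode word characters, and treats a run as an operator only when it is not quoted. It assumes straight double quotes only, drops the original `(?=.*\s)` condition, and assumes `String.prototype.matchAll` and Unicode property escapes are available. The names `W`, `TOKEN`, `OPERATORS`, and `splitOnOperators` are made up for illustration.

```javascript
// Same Unicode-aware "word character" class as in the question.
const W = String.raw`[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}]`;

// A token is either a quoted phrase or a maximal run of word characters.
// Because runs are maximal, word boundaries are implicit: no lookbehind needed.
const TOKEN = new RegExp(String.raw`"[^"]*"|${W}+`, 'gu');

const OPERATORS = new Set(['and', 'or', 'not', 'exclude']);

// Splits the query around unquoted operators, keeping the operators in the
// result (like String.prototype.split with a capturing group).
function splitOnOperators(query) {
  const parts = [];
  let last = 0;
  for (const m of query.matchAll(TOKEN)) {
    const token = m[0];
    const isQuoted = token[0] === '"';
    if (!isQuoted && OPERATORS.has(token.toLowerCase())) {
      parts.push(query.slice(last, m.index), token);
      last = m.index + token.length;
    }
  }
  parts.push(query.slice(last));
  return parts;
}

console.log(splitOnOperators('(Java or 化粧 or 化粧品)'));
// → [ '(Java ', 'or', ' 化粧 ', 'or', ' 化粧品)' ]
```

An operator inside a quoted phrase, as in `a "x or y" b`, is consumed as part of the quoted token and therefore never split on.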