iOS: Converting escaped Unicode to Unicode in Objective-C

Posted by tquggr8v on 2023-05-23 in iOS

So, a seemingly simple problem has me stumped. I have two statements:

NSLog(@"%@", @"\U0001f1ee\U0001f1f9");

NSLog(@"%@", @"\\U0001f1ee\\U0001f1f9");

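The first prints the correct emoji (a flag). The second prints the escaped string. What conversion do I need to apply to the second string so that it prints the flag as well?
In other words: I have some escaped Unicode strings, and I want to print them as the proper emoji. How do I do that?
I tried converting to NSData with NSUTF8StringEncoding and then back to NSString, and I tried NSNonLossyASCIIStringEncoding, with no luck. I must be using them wrong.
Thanks for any help!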


Answer #1 (cvxl0en2)

Simple. Use -stringByRemovingPercentEncoding.

NSString * string = @"\\U0001f1ee\\U0001f1f9" ;
NSLog( @"%@", [string stringByRemovingPercentEncoding]);

Answer #2 (8xiog9wr)

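This took longer than I expected.
My final approach was to convert the Unicode escape sequences into their equivalent UTF-8 bytes, encode those bytes as percent-escape sequences, and then use [NSString stringByRemovingPercentEncoding] to produce the actual Unicode characters.
You have to do the bit shifting by hand to complete the conversion, but after that the rest is trivial. For reference, you can look at this example gist, which performs the conversion using the decimal code point of a Unicode monospace character, after resolving that value from a given ASCII-equivalent character in a larger string. The logic is written in JXA-ObjectiveC, but it ports cleanly to Objective-C.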
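The bit shifting described above can be sketched in plain C, which compiles unchanged inside an Objective-C file. This is a minimal sketch rather than the answer's own code: the function name `percent_encode_code_point` is hypothetical, and it assumes the code point lies in the 4-byte UTF-8 range (U+10000 through U+10FFFF), which is where emoji such as the flag letters live.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the answer): writes the percent-encoded
 * UTF-8 form of a code point in U+10000..U+10FFFF into `out`, which must
 * hold at least 13 bytes (four "%XX" groups plus the terminating NUL).
 * Returns 0 on success, -1 if the code point is outside the 4-byte range. */
int percent_encode_code_point(uint32_t cp, char *out) {
    assert(out != NULL);
    if (cp < 0x10000 || cp > 0x10FFFF) {
        return -1;
    }

    /* 4-byte UTF-8 layout: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    unsigned char bytes[4];
    bytes[0] = (unsigned char)(0xF0 | (cp >> 18));          /* top 3 bits   */
    bytes[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F)); /* next 6 bits  */
    bytes[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));  /* next 6 bits  */
    bytes[3] = (unsigned char)(0x80 | (cp & 0x3F));         /* low 6 bits   */

    /* "%%" emits a literal percent sign before each hex-encoded byte. */
    snprintf(out, 13, "%%%02X%%%02X%%%02X%%%02X",
             bytes[0], bytes[1], bytes[2], bytes[3]);
    assert(strlen(out) == 12);
    return 0;
}
```

For U+1D670 this yields the string %F0%9D%99%B0, which could then be wrapped in an NSString and passed to -stringByRemovingPercentEncoding.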

Update

Here is what a native Objective-C implementation looks like:

#import <Foundation/Foundation.h>

NSString* escapedUnicharToString(NSString* escapedUnicharString) {
    /// Expect exactly `\UXXXXXXXX`: a backslash, `U`, and 8 hex digits.
    if ([escapedUnicharString length] != 10) {
        return nil;
    }
    const char * escapedUnicharCString = [escapedUnicharString UTF8String];

    /// Marshal the 8 hex chars into a NUL-terminated buffer, since
    /// `strtoul` reads until it hits a non-hex char or the terminator.
    char unicharBytes[9] = {'\0'};
    memcpy(unicharBytes, escapedUnicharCString + 2, 8);
    
    /// Convert the marshaled bytes to their unichar-equivalent
    /// (`unsigned long`) value.
    unsigned long unicharIndex = strtoul(unicharBytes, NULL, 16);
    
    /// Convert the `unsigned long` to a binary string.
    NSMutableString *unicharBinaryString = [NSMutableString new];
    while (unicharIndex > 0) {
        unsigned long remainder = unicharIndex % 2;
        [unicharBinaryString appendFormat:@"%lu", remainder];
        unicharIndex /= 2;
    }
    
    /// Use the conversion mask for the *last* (4-byte) range of unicode
    /// chars, U+10000 through U+10FFFF.
    ///
    /// **Note**
    /// This must change if you're converting in a different range of
    /// unicode characters.
    ///
    /// See https://stackoverflow.com/a/6240184/12770455 for a list of
    /// alternate conversion masks.
    ///
    NSString *conversionMask = @"11110xxx10xxxxxx10xxxxxx10xxxxxx";
    
    NSMutableString *utf8BinaryString = [NSMutableString new];
    unsigned long utf8Offset = 0;

    /// Use the conversion mask to drop bits from the unichar binary
    /// string — beginning at the rightmost position and moving
    /// leftwards.
    ///
    /// The unichar binary string is reversed, so we'll fill-in
    /// each char using index 0.
    for (unsigned long i = [conversionMask length]; i > 0; i--) {
        unichar bit = [conversionMask characterAtIndex: (i - 1)];
        
        /// Fill-in the "x" characters from the unichar binary string.
        if (bit == 'x') {
            /// Fill with "0" when no chars remain.
            if (utf8Offset == [unicharBinaryString length]) {
                [utf8BinaryString insertString:@"0" atIndex:0];
                continue;
            }
            /// Fill with the current unichar binary offset char.
            [utf8BinaryString insertString:[NSString stringWithFormat:@"%c",
                                            [unicharBinaryString characterAtIndex: utf8Offset]] atIndex:0];
            utf8Offset++;
            continue;
        }
        
        /// Fill with the conversion mask's char.
        [utf8BinaryString insertString:[NSString stringWithFormat:@"%c", bit] atIndex:0];
    }
    
    /// Convert the UTF8-equivalent binary into its numeric value.
    /// This must be unsigned: the 4-byte sequence starts with 0xF0,
    /// so the 32-bit result does not fit in a signed `int`.
    unsigned int decimal = 0;
    for (NSUInteger i = 0; i < [utf8BinaryString length]; i++) {
        unichar character = [utf8BinaryString characterAtIndex:i];
        unsigned int bit = character - '0';
        decimal = (decimal * 2) + bit;
    }
    
    /// Convert the UTF8-equivalent value to its hex-string equivalent.
    NSMutableString *percentEncoded = [NSMutableString stringWithFormat:@"%08X", decimal];
    
    /// Insert percent chars before each hex-represented byte.
    [percentEncoded insertString:@"%" atIndex:0];
    [percentEncoded insertString:@"%" atIndex:3];
    [percentEncoded insertString:@"%" atIndex:6];
    [percentEncoded insertString:@"%" atIndex:9];

    /// Decode the percent-encoded UTF8 char, and return to caller.
    return [percentEncoded stringByRemovingPercentEncoding];
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        NSLog(@"%@", escapedUnicharToString(@"\\U0001D670"));
        
    }
    return 0;
}
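
The implementation above handles a single escape in the 4-byte range. As a rough sketch of how it might be generalized to the question's input, which contains two escapes back to back, the following plain C (usable as-is from Objective-C) walks a string and replaces every `\UXXXXXXXX` escape with its UTF-8 bytes, choosing the mask by code point range. The names `utf8_encode` and `unescape_unicode` are hypothetical, and leaving malformed escapes untouched is an assumption about the desired behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: encodes `cp` as UTF-8 into `out` (at least 4 bytes).
 * Returns the number of bytes written, or 0 if `cp` cannot be encoded
 * (U+0000, a surrogate, or a value above U+10FFFF). */
static size_t utf8_encode(uint32_t cp, unsigned char *out) {
    if (cp == 0) return 0;                 /* would truncate a C string */
    if (cp <= 0x7F) {                      /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogates */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: copies `in` to `out`, replacing each `\UXXXXXXXX`
 * escape with its UTF-8 bytes. Escapes that are malformed or not encodable
 * are copied through unchanged. `out` must hold at least strlen(in) + 1
 * bytes; that is always enough because a 10-char escape becomes at most
 * 4 bytes. */
void unescape_unicode(const char *in, char *out) {
    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        if (in[0] == '\\' && in[1] == 'U') {
            /* Require exactly 8 hex digits; the scan stops at the NUL
             * terminator, so it never reads past the end of `in`. */
            int ok = 1;
            for (int i = 2; i < 10; i++) {
                if (!isxdigit((unsigned char)in[i])) { ok = 0; break; }
            }
            if (ok) {
                char hex[9];
                memcpy(hex, in + 2, 8);
                hex[8] = '\0';
                uint32_t cp = (uint32_t)strtoul(hex, NULL, 16);
                size_t n = utf8_encode(cp, (unsigned char *)out);
                if (n > 0) {
                    out += n;
                    in += 10;
                    continue;
                }
            }
        }
        *out++ = *in++;   /* ordinary char, or start of a bad escape */
    }
    *out = '\0';
}
```

From Objective-C, the filled buffer could then be handed to +[NSString stringWithUTF8String:] to obtain the final string, which skips the percent-encoding round trip entirely.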
