Java: how to retrieve the original bytes (rather than replacement characters) from Google's Gimap

Posted by oprakyz7 on 2023-01-29 in Java

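I am writing an IMAP client for Google's Gimap service (IMAP4rev1 plus some extensions).
For a small fraction of my mail, the FETCH command returns the UTF-8 replacement character, while the same character displays correctly in the Gmail web UI.
By way of introduction, here is the protocol exchange using openssl s_client: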

**openssl s_client -crlf -quiet -connect imap.gmail.com:993**
    * OK Gimap ready for requests from 80.229.146.237 q15mb3103507wmo
    **1 LOGIN user@gmail.com app-password**
    * CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH UTF8=ACCEPT LIST-EXTENDED LIST-STATUS LITERAL- SPECIAL-USE APPENDLIMIT=35651584
    1 OK user@gmail.com authenticated (Success)
    **2 EXAMINE "[Gmail]/All Mail"**
    * FLAGS (\Answered \Flagged \Draft \Deleted \Seen $Forwarded $Junk $MailFlagBit0 $NotJunk $NotPhishing $Phishing Forwarded Junk NotJunk)
    * OK [PERMANENTFLAGS ()] Flags permitted.
    * OK [UIDVALIDITY 1] UIDs valid.
    * 95696 EXISTS
    * 0 RECENT
    * OK [UIDNEXT 1110737] Predicted next UID.
    * OK [HIGHESTMODSEQ 11952911]
    2 OK [READ-ONLY] [Gmail]/All Mail selected. (Success)
    **3 UID FETCH 936238 (BODY[HEADER.FIELDS (Subject)])**
    * 44300 FETCH (UID 936238 BODY[HEADER.FIELDS (Subject)] {64}
    Subject: Luc�a Cxxxxxxxxxxxxxxx posted a discussion on ASW
    
    )
    3 OK Success
    **4 LOGOUT**
    * BYE LOGOUT Requested
    LOGOUT OK 73 good day (Success)
    read:errno=0

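The terminal shows the UTF-8 replacement character, but that is not necessarily an indicator of a problem. The real finding is that when I read the bytes in Java code, Gimap sends back the three UTF-8 bytes that encode the Unicode replacement character. I have underlined them below: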

[ 53 , 75 , 62 , 6A , 65 , 63 , 74 , 3A , 20 , 4C , 75 , 63 , EF , BF , BD , 61 , 20 ,
                                                              ^^^  ^^^  ^^^
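
To check for this mechanically, the fetched literal can be scanned for the byte sequence EF BF BD before any decoding takes place. The following is a minimal sketch; the class and method names are illustrative and not part of the test case below.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original test case.
final class ReplacementCharScan {

    /** Returns the index of the first EF BF BD sequence, or -1 if there is none. */
    static int indexOfReplacementChar(byte[] bytes) {
        for (int i = 0; i + 2 < bytes.length; i++) {
            if ((bytes[i] & 0xFF) == 0xEF
                    && (bytes[i + 1] & 0xFF) == 0xBF
                    && (bytes[i + 2] & 0xFF) == 0xBD) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The first bytes of the Subject header, copied from the dump above.
        byte[] header = {
            0x53, 0x75, 0x62, 0x6A, 0x65, 0x63, 0x74, 0x3A, 0x20,
            0x4C, 0x75, 0x63, (byte) 0xEF, (byte) 0xBF, (byte) 0xBD, 0x61, 0x20
        };
        int at = indexOfReplacementChar(header);
        System.out.println("U+FFFD found at byte offset " + at);
        // Decoding just those three bytes as UTF-8 gives one code point.
        String decoded = new String(header, at, 3, StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(decoded.codePointAt(0)));
    }
}
```

For the bytes shown, the sequence starts at offset 12 and decodes to the single code point U+FFFD, which confirms that the replacement character is already present in the data on the wire.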

Meanwhile, the email displays correctly in Gmail. Here is Google's "Show original"...

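According to RFC 2822, all of these header keys and values should be 7-bit ASCII, and this email clearly is not. But how does that explain that Google "knows" what the correct character is, yet does not give me the same character through its API?
I am wondering whether I am simply not using IMAP4rev1 correctly. I noticed that Gimap advertises ENABLE and UTF8=ACCEPT, but adding ENABLE UTF8=ACCEPT to the command sequence made no difference in behavior.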
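
One way such a replacement character can arise is a lossy UTF-8 decode of a header that contains a raw 8-bit byte. The sketch below only illustrates that mechanism. It assumes the sender wrote í as the single ISO-8859-1 byte 0xED, which is a guess, since the original bytes are not shown here, and it makes no claim about what Gimap does internally.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: the 0xED byte is an assumption about the original message.
final class LossyDecodeDemo {

    /** Decodes the bytes as UTF-8 and encodes them again, replacing malformed input. */
    static byte[] roundTripUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    /** True if every byte is 7-bit ASCII, as RFC 2822 expects of header fields. */
    static boolean isSevenBit(byte[] raw) {
        for (byte b : raw) {
            if (b < 0) {
                return false; // high bit set
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "Lucía" with í as one ISO-8859-1 byte (assumed, not taken from the real message).
        byte[] raw = { 0x4C, 0x75, 0x63, (byte) 0xED, 0x61 };
        System.out.println("7-bit clean: " + isSevenBit(raw));
        StringBuilder hex = new StringBuilder();
        for (byte b : roundTripUtf8(raw)) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Under that assumption the round trip turns the single byte ED into EF BF BD, the same three bytes that appear in the FETCH response.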

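**Summary:**

• **Observed:** the original email sent to Google was invalid, and Google found a way to store the correct character the sender intended.
• **Expected:** when I request this field, Google gives me the bytes for the character it displays.
• **Actual:** Google gives me no bytes that describe í; instead, its API returns the bytes of the replacement character.
• **Discussion:** maybe this is a bug in Google's implementation, but I am posting here to check whether I am in fact handling the protocol interaction incorrectly.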

Below is my minimal test case: Java code that I can use to access and print the bytes.

package stackoverflow;

import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import javax.net.ssl.SSLSocketFactory;

public class StackOverflowTestCase {

    /**
     * Connect to Gimap server, download a single email's Subject header.
     */
    public static void main( String[] argzes ) throws IOException {

        SimpleIMAPClient g = new SimpleIMAPClient();
        try {
            g.changeToLoggedInState().changeToSelectedState();
            String input = "";
            Pattern expectedResponse = Pattern.compile( "\\* [1234567890]+ FETCH \\(UID 936238 .*" );
            g.sendCommand( "4 UID FETCH 936238 (BODY[HEADER.FIELDS (Subject)])" );
            while (!input.startsWith( "4 OK " )) {
                input = g.readLine();
                if ( expectedResponse.matcher( input ).matches() ) {
                    // find the {999} construct in the response and parse the number.
                    int bytesToRead = Integer.parseInt( input.split( "\\{|}" )[1] );
                    byte[] bytes = g.response.readNBytes( bytesToRead );
                    log( bytes );
                }
            }
        }
        finally {
            g.logout();
        }
    }

    /**
     * Pretty-print a byte array.
     */
    private static void log( byte[] bytes ) {
        StringBuffer out = new StringBuffer();
        out.append( '[' );
        String prefix = " ";
        for (byte b : bytes) {
            out.append( prefix + String.format( "%02X ", b ) );
            prefix = ", ";
        }
        out.append( ']' );
        System.out.println( out.toString() );
    }

    /**
     * Class that implements a subset of the IMAP4rev1 protocol (RFC 3501).
     * Only enough function is available to reproduce the test case for downloading one email header.
     */
    private static class SimpleIMAPClient {

        /**
         * When true, the line exchanges between IMAP client and IMAP server
         * are printed to stdout and stderr respectively.
         * */
        static final boolean debug = true;

        /**
         * IMAP connection state, per RFC3501#3
         * */
        public static class State {
            public static final State NOT_AUTHENTICATED = new State();
            public static final State AUTHENTICATED = new State();
            public static final State SELECTED = new State();
            public static final State LOGGED_OUT = new State();

            private State () {
            }
        }

        /**
         * Outbound stream to the server
         * */
        public final PrintWriter request;
        
        /**
         * Inbound stream from the server
         */
        public final InputStream response;
        
        private final Socket imap;
        private State state;

        /**
         * Constructs a connection to the Gmail Gimap server using the default ciphers available in the JDK.
         * If established successfully, the object enters NOT_AUTHENTICATED state.
         * @throws ExceptionInInitializerError if the connection cannot be established.
         * */
        public SimpleIMAPClient () {
            try {
                imap = SSLSocketFactory.getDefault().createSocket();
                imap.connect( new InetSocketAddress( "imap.gmail.com", 993 ) );
                System.out.println( "Connection established to " + imap.getRemoteSocketAddress() + " from local port "
                        + imap.getLocalPort() );
                request = new PrintWriter( imap.getOutputStream(), true );
                response = imap.getInputStream();
            }
            catch ( IOException e ) {
                throw new ExceptionInInitializerError( e );
            }
            this.state = State.NOT_AUTHENTICATED;
        }

        /**
         * Moves the state to LOGGED_OUT. The class can no longer be used after calling this method.
         * */
        public void logout() throws IOException {
            sendCommand( "LOGOUT LOGOUT" );
            String input = readLine();
            while (!input.startsWith( "LOGOUT OK" )) {
                input = readLine();
            }
            response.close();
            request.close();
            imap.close();
            System.out.println( "Disconnected." );
            this.state = State.LOGGED_OUT;
        }

        /**
         * Change the state to SELECTED by opening the "All Mail" mailbox in
         * readonly mode. Returns a reference to this object.
         * @throws IllegalStateException if the beginning state is not AUTHENTICATED.
         */
        public SimpleIMAPClient changeToSelectedState() throws IOException {

            if ( !state.equals( State.AUTHENTICATED ) )
                throw new IllegalStateException();

            sendCommand( "3 EXAMINE \"[Gmail]/All Mail\"" );
            String input = readLine();
            while (!input.startsWith( "3 OK " )) {
                input = readLine();
            }
            this.state = State.SELECTED;
            return this;
        }

        /**
         * Change the state to AUTHENTICATED; and then returns a reference to
         * this object.
         * TODO Note, the account and app-password are hard-coded into this method implementation.
         * @throws IllegalStateException if the beginning state is not NOT_AUTHENTICATED.
         */
        public SimpleIMAPClient changeToLoggedInState() throws IOException {

            if ( !state.equals( State.NOT_AUTHENTICATED ) )
                throw new IllegalStateException();

            String input = readLine();
            while (!input.startsWith( "* OK " )) {
                input = readLine();
            }
            sendCommand( "1 LOGIN user@gmail.com ***********" );
            while (!input.startsWith( "1 OK " )) {
                input = readLine();
            }

            this.state = State.AUTHENTICATED;
            return this;
        }

        /**
         * Send the command to the IMAP server.
         */
        public void sendCommand( String message ) {

            request.print( message );
            if ( !message.endsWith( "\r\n" ) )
                request.print( "\r\n" );
            request.flush();
            System.out.println( message );
        }

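        /**
         * Reads one line from the server. Each byte is widened to one char; no charset
         * decoding is applied, so bytes above 0x7F are not interpreted as UTF-8 here.
         * A \n that is not preceded by \r raises IOException, because only the \r\n
         * sequence is allowed per RFC 3501#2.2.
         *
         * @return The line, without its \r\n terminator.
         */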
        public String readLine() throws IOException {

            StringBuffer line = new StringBuffer();
            int ch = response.read();
            char prev = '\0';
            while (ch != -1) {
                if ( line.length() == 0 ) {
                    if ( ( char ) ch == '\n' ) {
                        throw new IOException( "Line begins with \\n" );
                    }
                } else {
                    if ( prev == '\r' && ( char ) ch == '\n' ) {
                        line.setLength( line.length() - 1 ); // chop off \r
                        System.err.println( line.toString() );
                        return line.toString();
                    } else if ( ( char ) ch == '\n' ) {
                        throw new IOException( "Encountered \\n without \\r." );
                    }
                }
                prev = ( char ) ch;
                line.append( prev );
                ch = response.read();
            }
            throw new IOException( "end of stream." );
        }
    }
}

Answer 1, by bvjxkvbb

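In my past experience with the Gimap server, it tends to eat illegal characters whenever it has to parse the headers to satisfy your request. You can work around this by fetching the whole message in one go, in which case the server may give you the raw, unfiltered data.
I suggest trying to fetch RFC822 or the entire BODY[] section.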
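
If the full message does come back unfiltered, the Subject header then has to be located on the client side. The sketch below assumes the message was fetched with something like `UID FETCH 936238 (BODY.PEEK[])` and that the literal was read into a byte array the same way the test case reads the header literal. The class and method names are illustrative, and whether Gimap really returns the raw bytes this way still has to be verified against the server.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper; the names are illustrative and not from the original post.
final class RawHeaderExtractor {

    /**
     * Returns the raw bytes of the first header field with the given name,
     * including folded continuation lines but excluding the final CRLF,
     * or null if the header section does not contain it.
     */
    static byte[] extractHeader(byte[] message, String name) {
        // ISO-8859-1 maps every byte to exactly one char, so String indices
        // are also valid indices into the byte array.
        String text = new String(message, StandardCharsets.ISO_8859_1);
        String prefix = name + ":";
        int pos = 0;
        while (pos < text.length()) {
            int eol = text.indexOf("\r\n", pos);
            if (eol == -1) {
                eol = text.length();
            }
            if (eol == pos) {
                return null; // blank line: end of the header section
            }
            if (text.regionMatches(true, pos, prefix, 0, prefix.length())) {
                int end = eol;
                // A following line that starts with space or tab continues this field.
                while (end + 2 < text.length()
                        && (text.charAt(end + 2) == ' ' || text.charAt(end + 2) == '\t')) {
                    int next = text.indexOf("\r\n", end + 2);
                    end = (next == -1) ? text.length() : next;
                }
                return Arrays.copyOfRange(message, pos, end);
            }
            pos = eol + 2;
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for a fetched message; the 0xED byte is an assumed example.
        byte[] message = ("From: a@example.com\r\n"
                + "Subject: Luc\u00EDa posted\r\n"
                + "\r\n"
                + "body\r\n").getBytes(StandardCharsets.ISO_8859_1);
        byte[] subject = extractHeader(message, "Subject");
        StringBuilder hex = new StringBuilder();
        for (byte b : subject) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Searching through an ISO-8859-1 view keeps every original byte intact, so any 8-bit byte the server passes through is still there to inspect or to decode with a charset of your choosing.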
