Remove the first line of an XML file in Java if it is empty

gojuced7  posted on 2021-07-09  in  Java

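I receive a file from a server containing my university timetable and am trying to extract data from it. Some files (for some departments) have an empty line at the top, as the very first line of the file, so I get: [Fatal Error] lesson:2:6: The processing instruction target matching "[xX][mM][lL]" is not allowed. How can I check for the blank line and remove it from the same file in Java? I can't work with strings and lines, because XML files usually have no \n at the end of lines.
UPD: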

//it appeared on knt/151 file, so empty lines in the beginning of the file that caused fatal error
private void checkForEmptyLines(File f) {
    try {
        RandomAccessFile raf = new RandomAccessFile(f,"rw");
        while (raf.getFilePointer()!=raf.length()){
           //What should be here?
           Byte b = raf.readByte();
           if (b!=10)
               raf.write(b);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

}

UPD: XML file processing:

public String[][] parse(String path)  {
    String[][] table = new String[8][6];

    File data = new File(path);
   // checkForEmptyLines(data);

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder  = null;
    Document doc = null;

    try {
        dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(data);
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");

    int rowIndex = 0;
    int columnIndex = 0;

    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) { 
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " "); 
            line = line.replace("\n", " ");

            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }

            table[rowIndex][columnIndex++] = line;
        }
    }

    return table;
}

XML file example

dxpyg8gm #1

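My colleague added this code and it seems to work. It not only checks for an empty string at the start of the file, but also removes it and writes the correct data to a new file.
This solution seems slow; if any improvements can be made, please let me know.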

private static File skipFirstLine(File inputFile) {
    File outputFile = new File("skipped_" + inputFile.getName());

    try (BufferedReader reader = new BufferedReader(new FileReader(inputFile));
         BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {

        String line;
        int count = 0;
        while ((line = reader.readLine()) != null) {
            if (count == 0 && line.equals("")) {
                ++count;
                continue;
            }

            writer.write(line);
            writer.write("\n");
            ++count;
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    return outputFile;
}
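
On the question of speed, one possible variation is to copy raw bytes instead of decoding and re-encoding lines. The sketch below drops only the whitespace bytes before the first significant byte and copies everything else unchanged, so the file's declared encoding and its line endings are left alone. The class name `LeadingWhitespaceStripper` and the method `stripLeadingWhitespace` are hypothetical and not part of the code above, and the sketch assumes an ASCII-compatible encoding such as UTF-8.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class LeadingWhitespaceStripper {

    // Hypothetical helper: copies input to output, dropping any whitespace
    // bytes (space, tab, CR, LF) that precede the first other byte.
    // Assumes an ASCII-compatible encoding such as UTF-8.
    public static void stripLeadingWhitespace(Path input, Path output) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(input));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(output))) {
            int b;
            // Skip whitespace until the first significant byte or end of file
            while ((b = in.read()) != -1
                    && (b == ' ' || b == '\t' || b == '\r' || b == '\n')) {
                // nothing to do: the byte is simply dropped
            }
            if (b != -1) {
                out.write(b); // the first significant byte
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n); // copy the rest unchanged
                }
            }
        }
    }
}
```

Unlike the `FileReader`/`FileWriter` version, this does not pass the content through the platform default charset, which may matter when the XML declares an encoding different from the system default.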

nimxete2 #2

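There is no quick and easy answer to this question, but suffice it to say that you should learn to treat the input as a stream. I have updated your "check for empty lines" method so that, in essence, it advances the stream to the first '<' character, then resets the stream and stops processing.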

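// Seen on the knt/151 file: empty lines at the start of the file caused the fatal error.
// Advances the stream to just before the first '<' so the parser starts at the XML declaration.
private void checkForEmptyLines(BufferedInputStream fs) throws IOException {
    // Set mark and allow for up to 1024 bytes to be read before this mark becomes invalid
    fs.mark(1024);
    int ch;
    while (-1 != (ch = fs.read())) {
        if ('<' == ch) {
            // Rewind to the mark, which sits immediately before the '<'
            fs.reset();
            break;
        } else {
            // Not a '<': move the mark past the byte just consumed
            fs.mark(1024);
        }
    }
}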

public String[][] parse(String path)  {
    String[][] table = new String[8][6];

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    Document doc;

    // try-with-resources closes the stream; opening it and skipping can both throw IOException
    try (BufferedInputStream bufferedDataStream =
                 new BufferedInputStream(new FileInputStream(path), 1024)) {
        // Skip anything before the first '<' so the parser sees the XML declaration first
        checkForEmptyLines(bufferedDataStream);

        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(bufferedDataStream);
    } catch (SAXException | IOException | ParserConfigurationException e) {
        e.printStackTrace();
        return table; // parsing failed: return the empty table rather than dereference a missing document
    }

    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");

    int rowIndex = 0;
    int columnIndex = 0;

    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) { 
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " "); 
            line = line.replace("\n", " ");

            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }

            table[rowIndex][columnIndex++] = line;
        }
    }

    return table;
}
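
The mark/reset technique used in `checkForEmptyLines` can be tried in isolation on an in-memory stream. The following is a minimal sketch, not part of the original code: the class `SkipToFirstTagDemo` and the helpers `skipToFirstTag` and `rootName` are hypothetical names, and the sample document with a `lesson` root element is made up for illustration. It parses the same bytes twice, once as-is and once after skipping to the first '<'.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class SkipToFirstTagDemo {

    // Same idea as checkForEmptyLines: leave the stream positioned at the first '<'.
    // Only one byte is ever read between a mark and a reset, so a read limit of 1 suffices.
    static void skipToFirstTag(BufferedInputStream in) throws IOException {
        in.mark(1);
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == '<') {
                in.reset(); // rewind one byte so the parser reads the '<' itself
                return;
            }
            in.mark(1); // move the mark past the byte just consumed
        }
    }

    // Hypothetical helper: parses the bytes, optionally skipping leading
    // content first, and returns the name of the root element.
    static String rootName(byte[] xml, boolean skip) throws Exception {
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(xml));
        if (skip) {
            skipToFirstTag(in);
        }
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: a blank first line followed by an XML declaration
        byte[] xml = "\n<?xml version=\"1.0\" encoding=\"UTF-8\"?><lesson/>"
                .getBytes(StandardCharsets.UTF_8);
        try {
            rootName(xml, false);
        } catch (SAXException e) {
            System.out.println("Without skipping: " + e.getMessage());
        }
        System.out.println("With skipping: root = " + rootName(xml, true));
    }
}
```

The first attempt fails because an XML declaration is only allowed at the very start of the document; after the skip, the parser sees `<?xml` first and the root element is reported as `lesson`.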
