Java: Loading a Lucene index that was previously written to HDFS into a RAMDirectory

Posted by czq61nw1 on 2021-06-03 in Hadoop

Here is the error message:

Exception in thread "main" org.apache.lucene.index.IndexNotFoundException: no segments* file found in RAMDirectory@1cff1d4a lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@2ddf0c3: files: [/prod/hdfs/LUCENE/index/140601/_0.cfe, /prod/hdfs/LUCENE/index/140601/segments_2, /prod/hdfs/LUCENE/index/140601/_0.si, /prod/hdfs/LUCENE/index/140601/segments.gen, /prod/hdfs/LUCENE/index/140601/_0.cfs]
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:801)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)

I have committed and closed the index writer properly.
Here is the searcher code:

public class SearchFiles {

    private SearchFiles() {}

    public static void main(String[] args) throws Exception {

        String filenm = "";
        // Creating FileSystem object, to be able to work with HDFS
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://127.0.0.1:9000/");
        config.addResource(new Path("/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/core-site.xml"));
        FileSystem dfs = FileSystem.get(config);
        FileStatus[] status = dfs.listStatus(new Path("/prod/hdfs/LUCENE/index/140601"));

        // Creating a RAMDirectory (memory) object, to be able to create index in memory.
        RAMDirectory rdir = new RAMDirectory();

        // Getting the list of index files present in the directory into an array.
        FSDataInputStream filereader = null;

        for (int i = 0; i < status.length; i++) {

            // Reading data from index files on HDFS directory into filereader object.
            filereader = dfs.open(status[i].getPath());
            int size = filereader.available();

            // Reading data from file into a byte array.
            byte[] bytarr = new byte[size];
            filereader.read(bytarr, 0, size);

            // Creating file in RAM directory with names same as that of
            // index files present in HDFS directory.
            filenm = new String(status[i].getPath().toString());
            String sSplitValue = filenm.substring(21, filenm.length());
            System.out.println(sSplitValue);

            IndexOutput indxout = rdir.createOutput((sSplitValue), null);

            // Writing data from byte array to the file in RAM directory
            indxout.writeBytes(bytarr, bytarr.length);
            indxout.flush();
            indxout.close();
        }
        filereader.close();
        // IndexReader indexReader = IndexReader.open(rdir);

        IndexReader indexReader = DirectoryReader.open(rdir);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
        QueryParser parser = new QueryParser(Version.LUCENE_47, "FUNDG_SRCE_CD", analyzer);
        Query query = parser.parse("D");
        TopDocs results = searcher.search(query, 1000);

        int numTotalHits = results.totalHits;
        TopDocs topDocs = searcher.search(query, 1000);
        ScoreDoc[] hits = topDocs.scoreDocs;

        // Printing the number of documents or entries that match the search query.
        System.out.println("Total Hits = " + numTotalHits);
        for (int j = 0; j < hits.length; j++) {
            int docId = hits[j].doc;

            Document d = searcher.doc(docId);

            System.out.println(d.get("FUNDG_SRCE_CD") + " " + d.get("ACCT_NUM"));
        }
    }
}

Java hadoop apache lucene

Source: https://stackoverflow.com/questions/24636212/loading-a-lucene-index-that-was-previously-written-to-hdfs-into-a-ramdirectory

1 Answer

6za6bjd01#

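I don't think you should be passing null as the IOContext argument to createOutput. Try using IOContext.DEFAULT instead. I don't really know whether that will make this work, but it may be a step in the right direction.
Why not make it simpler, though? You can use the appropriate RAMDirectory constructor to copy the index: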

public static void main(String[] args) throws Exception {
    // In Lucene 4.x, FSDirectory.open takes a java.io.File
    Directory oldDirectory = FSDirectory.open(new File("/prod/hdfs/LUCENE/index/140601"));
    // Copies every file from oldDirectory into memory
    Directory rdir = new RAMDirectory(oldDirectory, IOContext.DEFAULT);
    IndexReader indexReader = DirectoryReader.open(rdir);
    //etc.
}
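
One more thing worth checking, going by the file list in the exception: every entry in the RAMDirectory is named with its full HDFS path (for example /prod/hdfs/LUCENE/index/140601/segments_2), whereas Lucene looks for a bare segments_N name. substring(21, ...) only strips the "hdfs://127.0.0.1:9000" prefix. Separately, available() followed by a single read() is not guaranteed to return the whole file. The following is a minimal JDK-only sketch of both points, not a tested fix for your setup; the class IndexCopyHelpers and its methods are hypothetical names made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration; not part of Lucene or Hadoop. */
final class IndexCopyHelpers {

    /** Returns the last path component, i.e. everything after the final '/'. */
    static String baseName(String fullPath) {
        int slash = fullPath.lastIndexOf('/');
        return slash < 0 ? fullPath : fullPath.substring(slash + 1);
    }

    /** Reads exactly length bytes, or throws EOFException if the stream ends early. */
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        new DataInputStream(in).readFully(buffer);
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        String hdfsPath = "hdfs://127.0.0.1:9000/prod/hdfs/LUCENE/index/140601/segments_2";

        // What the question's code stores as the file name:
        System.out.println(hdfsPath.substring(21)); // /prod/hdfs/LUCENE/index/140601/segments_2

        // The bare name Lucene expects to find in the directory:
        System.out.println(baseName(hdfsPath));     // segments_2

        // Reading a known number of bytes without relying on available():
        byte[] data = readFully(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}), 4);
        System.out.println(data.length);            // 4
    }
}
```

In the original loop, the equivalent would presumably be status[i].getPath().getName() for the file name, status[i].getLen() for the size, and filereader.readFully(bytarr), since FSDataInputStream is a DataInputStream. Note also that FSDirectory reads from the local file system, so the constructor approach above only applies if the index directory is reachable as a local path.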

Answered on 2021-06-04
