Sending a matrix to a UDF (Pig Latin)

Posted by p4tfgftt on 2021-06-03 in Hadoop

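I have a problem with Pig Latin UDFs. I am trying to implement a system that has to verify whether there is a "mapping" between a matrix stored locally and a set of matrices stored in a Hadoop repository. By mapping I mean that there exists a permutation of the rows and columns of a matrix stored in Hadoop that transforms it into a matrix equal to the one stored locally. Because the matrices can have hundreds of elements, I want to run the mapping algorithm on Hadoop to take advantage of parallelism. I have been looking at Pig Latin UDFs, but I don't understand how to "send" the local matrix to the UDF.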

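import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class Mapping extends EvalFunc<String> {
  private int[][] matrixToMap; // the local matrix I want to map

  @Override
  public String exec(Tuple input) throws IOException { // each tuple is a matrix stored in Hadoop
    if (input == null || input.size() == 0)
      return null;
    try {
      // HERE THE CODE FOR THE MAPPING
      return null; // placeholder: replace with the mapping result
    } catch (Exception e) {
      throw new IOException("Failed to process input tuple", e);
    }
  }
}
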
Given that I will use the following code, my question is how to initialize the attribute matrixToMap:

REGISTER /Users/myudfs.jar;
-- SOME CODE TO INITIALIZE ATTRIBUTE matrixToMap
records = LOAD 'Sample7.txt'; -- the matrices stored in Hadoop
B = FOREACH records GENERATE myudfs.Mapping(records);

Assume the Pig script is invoked from a Java program and the local matrix is held in a Java array. The Java program would look like this:

int [][] localMatrix;
pigServer.registerJar("/Users/myudfs.jar");
//Some code to make Mapping.matrixToMap = localMatrix
pigServer.registerQuery("records = LOAD 'Sample7.txt';");
pigServer.registerQuery("B = FOREACH records GENERATE myudfs.Mapping(formula);");

Any ideas? Thanks.

hadoop user-defined-functions Matrix apache-pig

Source: https://stackoverflow.com/questions/18983072/sending-a-matrix-to-udf-pig-latin

1 Answer

ojsjcaue1#

You can initialize a class variable in the constructor of your UDF, like so:

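import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class Mapping extends EvalFunc<String> {
  private int[][] matrixToMap; // the local matrix I want to map

  public Mapping(String filename) {
    // Code to populate matrixToMap from the data in filename
  }

  @Override
  public String exec(Tuple input) throws IOException { // each tuple is a matrix stored in Hadoop
    if (input == null || input.size() == 0)
      return null;
    try {
      // HERE THE CODE FOR THE MAPPING
      return null; // placeholder: replace with the mapping result
    } catch (Exception e) {
      throw new IOException("Failed to process input tuple", e);
    }
  }
}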
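
The constructor above leaves the loading step as a comment. As a rough sketch of what that step might look like, the helper below parses a matrix from text, one row per line. The class name MatrixParser, the method names, and the file format (integers separated by commas or whitespace) are assumptions made for illustration, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses an integer matrix, one row per line.
public class MatrixParser {

    // Reads rows until end of stream; values are split on commas and/or whitespace (assumed format).
    public static int[][] parse(BufferedReader reader) throws IOException {
        List<int[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split("[,\\s]+");
            int[] row = new int[cells.length];
            for (int i = 0; i < cells.length; i++) {
                row[i] = Integer.parseInt(cells[i]);
            }
            // Reject rows whose length differs from the first row.
            if (!rows.isEmpty() && rows.get(0).length != row.length) {
                throw new IOException("Ragged matrix at row " + rows.size());
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    // Convenience wrapper for in-memory text; rethrows IOException unchecked.
    public static int[][] parseText(String text) {
        try {
            return parse(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Inside the constructor, the reader would typically wrap a stream opened through Hadoop's FileSystem API, for example FileSystem.get(new Configuration()).open(new Path(filename)). Pig may also instantiate the UDF on the client while planning the job, so deferring the load to the first exec call is worth considering if the file is only reachable from the cluster.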

In your script, use the following line:

DEFINE Mapping myudfs.Mapping('/path/to/matrix/on/HDFS');

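With this approach, your matrix has to be stored on HDFS so that the mapper or reducer that instantiates the UDF and calls its constructor can access the data.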
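
If the matrix is small and already lives in the Java program that drives Pig, one possible alternative (not part of the answer above) is to encode it as a string and pass that string as the constructor argument, since arguments in a DEFINE statement are string constants. The sketch below assumes a hypothetical MatrixCodec class and a made-up format: rows separated by ';' and values by ','. The UDF constructor would then call decode on its argument instead of treating it as a file name.

```java
// Hypothetical helper: converts an int matrix to and from a compact string.
public class MatrixCodec {

    // Encodes {{1,2},{3,4}} as "1,2;3,4" (assumed format).
    public static String encode(int[][] matrix) {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < matrix.length; r++) {
            if (r > 0) {
                sb.append(';');
            }
            for (int c = 0; c < matrix[r].length; c++) {
                if (c > 0) {
                    sb.append(',');
                }
                sb.append(matrix[r][c]);
            }
        }
        return sb.toString();
    }

    // Reverses encode(); an empty string yields an empty matrix.
    public static int[][] decode(String encoded) {
        if (encoded.isEmpty()) {
            return new int[0][];
        }
        String[] rows = encoded.split(";");
        int[][] matrix = new int[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            String[] cells = rows[r].split(",");
            matrix[r] = new int[cells.length];
            for (int c = 0; c < cells.length; c++) {
                matrix[r][c] = Integer.parseInt(cells[c].trim());
            }
        }
        return matrix;
    }

    // Builds a DEFINE statement that could be passed to pigServer.registerQuery(...).
    public static String defineStatement(int[][] matrix) {
        return "DEFINE Mapping myudfs.Mapping('" + encode(matrix) + "');";
    }
}
```

In the Java driver this would be used as pigServer.registerQuery(MatrixCodec.defineStatement(localMatrix)); before the FOREACH statement. For matrices with many thousands of entries the statement becomes unwieldy, and storing the matrix on HDFS as described above is probably the better fit.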
