Hadoop raw comparator

zd287kbt  posted on 2021-06-03  in Hadoop

I am trying to implement the following in a raw comparator, but I am not sure how to write it.
The timeStamp field here is a LongWritable.

if (this.getNaturalKey().compareTo(o.getNaturalKey()) != 0) {
    return this.getNaturalKey().compareTo(o.getNaturalKey());
} else if (this.timeStamp != o.timeStamp) {
    return timeStamp.compareTo(o.timeStamp);
} else {
    return 0;
}

I found a hint here, but I am not sure how to handle the LongWritable type: http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/serialization/id3548156
Thanks for your help.

zpjtge22  #1

Are you asking how to compare the LongWritable type that Hadoop provides? If so, the answer is to use the compare() method. For details, scroll down here.

zdwk9cvp  #2

The best way to implement a RawComparator correctly is to extend WritableComparator and override its compare() method. The WritableComparator source is very well written, so it is easy to follow.

owfi6suc  #3

From what I found in the LongWritable class:

/** A Comparator optimized for LongWritable. */
public static class Comparator extends WritableComparator {
    public Comparator() {
        super(LongWritable.class);
    }

    public int compare(byte[] b1, int s1, int l1,
                       byte[] b2, int s2, int l2) {
        long thisValue = readLong(b1, s1);
        long thatValue = readLong(b2, s2);
        return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
    }
}

The byte-array compare() shown here is the override of the method declared by RawComparator.
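To make the byte-level step concrete, here is a minimal sketch that uses only the Java standard library, with no Hadoop dependency. The class RawLongCompareSketch and its helper methods are hypothetical names introduced for illustration. It assumes the value was serialized as an 8-byte big-endian long, which is what DataOutput.writeLong produces.

```java
import java.nio.ByteBuffer;

// Hypothetical, Hadoop-free illustration of comparing two serialized longs.
final class RawLongCompareSketch {

    // Decode 8 bytes starting at 'start' as a big-endian signed long.
    static long readLong(byte[] bytes, int start) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (bytes[start + i] & 0xFF);
        }
        return value;
    }

    // Same parameter shape as RawComparator.compare: buffer, start offset, length.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Long.compare(readLong(b1, s1), readLong(b2, s2));
    }

    // Helper for the demo: ByteBuffer writes big-endian by default.
    static byte[] serialize(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        byte[] a = serialize(5L);
        byte[] b = serialize(-3L);
        // Positive result, because 5 > -3.
        System.out.println(compare(a, 0, a.length, b, 0, b.length));
    }
}
```

Decoding before comparing matters for negative values: a negative long has its high bit set, so a plain unsigned byte-by-byte comparison would sort it after every positive value.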

px9o7tmv  #4

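Suppose I have a CompositeKey that represents a pair (String stockSymbol, long timestamp). We can do a primary grouping pass on the stockSymbol field to bring all the data of one type together. Then, during the shuffle phase, our "secondary sort" uses the long timestamp member to order the time-series points, so that they arrive at the reducer partitioned and sorted.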

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class CompositeKey implements WritableComparable<CompositeKey> {
    // natural key is (stockSymbol)
    // composite key is a pair (stockSymbol, timestamp)
    private String stockSymbol;
    private long timestamp;

    // ... getters and setters omitted for clarity here

    @Override
    public void readFields(DataInput in) throws IOException {
        this.stockSymbol = in.readUTF();
        this.timestamp = in.readLong();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(this.stockSymbol);
        out.writeLong(this.timestamp);
    }

    @Override
    public int compareTo(CompositeKey other) {
        // Order by stock symbol first, then by timestamp.
        int comparison = this.stockSymbol.compareTo(other.stockSymbol);
        if (comparison != 0) {
            return comparison;
        }
        return Long.compare(this.timestamp, other.timestamp);
    }
}

Now the CompositeKey comparator would be:

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

public class CompositeKeyComparator extends WritableComparator {

    protected CompositeKeyComparator() {
        // true: let WritableComparator create key instances for deserialization
        super(CompositeKey.class, true);
    }

    @Override
    public int compare(WritableComparable wc1, WritableComparable wc2) {
        CompositeKey ck1 = (CompositeKey) wc1;
        CompositeKey ck2 = (CompositeKey) wc2;

        int comparison = ck1.getStockSymbol().compareTo(ck2.getStockSymbol());
        if (comparison != 0) {
            return comparison;
        }
        // stock symbols are equal here, so fall back to the timestamp
        return Long.compare(ck1.getTimestamp(), ck2.getTimestamp());
    }
}
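The comparator above works on deserialized CompositeKey objects. For the raw comparison the question asks about, the same ordering can be computed directly on the serialized bytes. The following is a sketch only, using the standard library so it runs without Hadoop. The class CompositeKeyRawCompareSketch and its helpers are hypothetical names. It assumes the byte layout produced by the write() method shown earlier (writeUTF followed by writeLong) and ASCII-only stock symbols, so that byte order matches String.compareTo order.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, Hadoop-free sketch of a raw (byte-level) composite key comparison.
final class CompositeKeyRawCompareSketch {

    // Serialize in the same order as CompositeKey.write(): writeUTF, then writeLong.
    static byte[] serialize(String stockSymbol, long timestamp) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(stockSymbol);
            out.writeLong(timestamp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // writeUTF prefixes the encoded string with its byte length as an unsigned short.
    static int readUnsignedShort(byte[] bytes, int start) {
        return ((bytes[start] & 0xFF) << 8) | (bytes[start + 1] & 0xFF);
    }

    // Decode 8 bytes starting at 'start' as a big-endian signed long.
    static long readLong(byte[] bytes, int start) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (bytes[start + i] & 0xFF);
        }
        return value;
    }

    // Same parameter shape as RawComparator.compare.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int len1 = readUnsignedShort(b1, s1);
        int len2 = readUnsignedShort(b2, s2);
        int p1 = s1 + 2; // first byte of the symbol in b1
        int p2 = s2 + 2; // first byte of the symbol in b2

        // Compare the symbol bytes (assumes ASCII symbols).
        int common = Math.min(len1, len2);
        for (int i = 0; i < common; i++) {
            int diff = (b1[p1 + i] & 0xFF) - (b2[p2 + i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        if (len1 != len2) {
            return len1 - len2; // the shorter symbol sorts first
        }

        // Symbols are equal: the timestamp follows immediately after the symbol.
        return Long.compare(readLong(b1, p1 + len1), readLong(b2, p2 + len2));
    }

    public static void main(String[] args) {
        byte[] early = serialize("AAPL", 2L);
        byte[] late = serialize("AAPL", 10L);
        // Negative result, because the symbols match and 2 < 10.
        System.out.println(compare(early, 0, early.length, late, 0, late.length));
    }
}
```

In a real job this logic would go in the byte-array compare() override of a WritableComparator subclass, and the offsets would have to match whatever the key's write() method actually emits.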
