Hive UDF - converting a StringObjectInspector to a String

pdtvr36n  posted on 2021-05-31  in Hadoop

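I am writing a generic UDF. It works when I call the UDF directly, but when I combine it with other functions (distinct, max, min), the evaluate function is not even called.
I wanted to see what was going on, so I tried logging the values. To do that, I need to know how to convert a StringObjectInspector to a String.
Code: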

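import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;
// Assumption: log4j, since logger.info(Object) is called below.
import org.apache.log4j.Logger;
// AES is the asker's own helper class; import it from wherever it lives.

@Description(name = "Decrypt", value = "Decrypt the Given Column", extended = "SELECT Decrypt('Hello World!');")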
public class Decrypt extends GenericUDF {
    Logger logger = Logger.getLogger(getClass().getName());

    PrimitiveObjectInspector col;
    StringObjectInspector databaseName;
    StringObjectInspector schemaName;
    StringObjectInspector tableName;
    StringObjectInspector colName;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        System.out.println("******************************    initialize called  ******************************");
        logger.info("******************************    initialize called  ******************************");
        if (arguments.length != 5) {
            throw new UDFArgumentLengthException("Decrypt takes exactly 5 arguments: T, String, String, String, String");
        }

        ObjectInspector colObject = arguments[0];
        ObjectInspector databaseNameObject = arguments[1];
        ObjectInspector schemaNameObject = arguments[2];
        ObjectInspector tableNameObject = arguments[3];
        ObjectInspector colNameNameObject = arguments[4];

        if (    !(databaseNameObject instanceof StringObjectInspector) ||
                !(schemaNameObject instanceof StringObjectInspector) ||
                !(tableNameObject instanceof StringObjectInspector) ||
                !(colNameNameObject instanceof StringObjectInspector)
        ) {
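            throw new UDFArgumentException("Error: databaseName, schemaName, tableName and colName should be String");
        }
        // Guard the cast below: the first argument must be a primitive column.
        if (!(colObject instanceof PrimitiveObjectInspector)) {
            throw new UDFArgumentException("Error: the first argument should be a primitive type");
        }

        this.col = (PrimitiveObjectInspector) colObject;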
        this.databaseName = (StringObjectInspector) databaseNameObject;
        this.tableName = (StringObjectInspector) tableNameObject;
        this.schemaName = (StringObjectInspector) schemaNameObject;
        this.colName = (StringObjectInspector) colNameNameObject;

        logger.info("******************************    initialize end  ******************************");
        logger.info(col.toString());
        logger.info(col);
        logger.info(databaseNameObject.toString());
        logger.info(databaseNameObject);
        logger.info(colName.toString());
        logger.info(colName);
        logger.info(colNameNameObject);
        logger.info(colNameNameObject.toString());
        // evaluate() returns a Text, so declare the matching writable inspector.
        return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
        System.out.println("********************Decrypt********************");
        logger.info("********************Decrypt********************");
        // Fetch the deferred column value once; a null column yields a null result.
        Object value = col.getPrimitiveJavaObject(deferredObjects[0].get());
        if (value == null) {
            return null;
        }
        String stringToDecrypt = value.toString();
        String database = databaseName.getPrimitiveJavaObject(deferredObjects[1].get());
        String schema = schemaName.getPrimitiveJavaObject(deferredObjects[2].get());
        String table = tableName.getPrimitiveJavaObject(deferredObjects[3].get());
        // Named "column" so the local does not shadow the "col" inspector field.
        String column = colName.getPrimitiveJavaObject(deferredObjects[4].get());

        return new Text(AES.decrypt(stringToDecrypt, database, schema, table, column));
    }

    @Override
    public String getDisplayString(String[] strings) {
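        // Shown in EXPLAIN output and error messages, so it should not be null.
        return "Decrypt(" + String.join(", ", strings) + ")";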
    }

}

hsvhsicv  #1

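Use the getPrimitiveJavaObject method instead of toString. A StringObjectInspector only describes how to read a value, so calling toString on it prints the inspector, not the argument. The value itself is available only inside evaluate, where you pass deferredObjects[i].get() to the inspector.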
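To make the difference concrete, here is a minimal stand-alone sketch. It does not use the Hive classes: StringInspector, Deferred, JAVA_STRING and describeArgument are hypothetical stand-ins that only mimic the shape of StringObjectInspector and DeferredObject, so the snippet runs with a plain JDK. It assumes, as with Hive's inspectors, that null data comes back as null.

```java
// Stand-alone sketch: the two interfaces are hypothetical stand-ins that only
// mimic the shape of Hive's StringObjectInspector and DeferredObject.
class InspectorDemo {

    /** Stand-in for an inspector: knows how to read a value, but stores none. */
    interface StringInspector {
        String getTypeName();
        String getPrimitiveJavaObject(Object data);
    }

    /** Stand-in for a deferred argument: the value is produced only on get(). */
    interface Deferred {
        Object get();
    }

    /** Inspector for plain Java strings; null data stays null. */
    static final StringInspector JAVA_STRING = new StringInspector() {
        @Override
        public String getTypeName() {
            return "string";
        }

        @Override
        public String getPrimitiveJavaObject(Object data) {
            return data == null ? null : data.toString();
        }
    };

    /** Builds a log line from the inspector's type name and the extracted value. */
    static String describeArgument(StringInspector inspector, Deferred argument) {
        Object raw = argument.get();                          // resolve the deferred value
        String value = inspector.getPrimitiveJavaObject(raw); // inspector + data -> String
        return inspector.getTypeName() + " -> " + value;
    }

    public static void main(String[] args) {
        Deferred tableArg = () -> "my_table";
        // Printing the inspector itself shows the object, not any argument value.
        System.out.println(JAVA_STRING);
        // Passing the data through the inspector yields the actual string.
        System.out.println(describeArgument(JAVA_STRING, tableArg)); // string -> my_table
    }
}
```

In the real UDF the equivalent call is the one already used in evaluate, for example colName.getPrimitiveJavaObject(deferredObjects[4].get()), so that is the place to log the values rather than initialize.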
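Another thought on your problem: take a look at the optimization flags.
Vectorization: hive.vectorized.execution , hive.vectorized.execution.enabled , hive.vectorized.execution.reduce.groupby.enabled
Cost-based optimization: hive.cbo.enable
Predicate pushdown: hive.optimize.ppd
To check whether one of these flags is enabled or disabled, type set <option> , for example set hive.optimize.ppd; , and then try toggling the values.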
