I'm using the SHC Spark connector by Hortonworks to read an HBase table:
https://github.com/hortonworks-spark/shc
Some of my tasks take a very long time to complete. I suspect the cause is region size skew, and I would like to confirm it by logging which region and region server each task is reading.
I tried turning on debug logs by doing the following in the driver:
Logger.getLogger("org").setLevel(Level.DEBUG);
Logger.getLogger("akka").setLevel(Level.DEBUG);
But it didn't seem to have any effect.
Is it possible to log the above somehow?
1 Answer
"it didn't seem to have any effect."
Unfortunately, SHC itself does not log the region or region server name anywhere during execution, which is why enabling DEBUG logging does not help.
"Is it possible to log the above somehow?"
Yes, but only by customizing SHC's source code. You would need to insert your own log statements, then rebuild, test, package, and ship the modified connector with your application.
What to add depends on your goal. For example, you might call logDebug() or logInfo() with the region name information during a table-scan task. The relevant source code is in HBaseTableScan, and the build, test, and ship details are in SHC's repo doc.
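As a rough illustration of the kind of message such a log call could carry, here is a minimal, self-contained Java sketch. Everything in it is hypothetical: RegionLogSketch, printableKey, and describe are made-up names, not part of SHC or HBase, and the partition id, server name, and keys are placeholder values. In a patched SHC you would pass a string like this to logInfo() or logDebug() from inside the scan code, using whatever region details are available at that point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical helper: none of these names come from SHC or HBase.
final class RegionLogSketch {
    private static final Logger LOG = Logger.getLogger(RegionLogSketch.class.getName());

    // Render a row key for logging. A null or empty key is shown as "<open>",
    // on the assumption that it marks an unbounded start or end of a region.
    static String printableKey(byte[] key) {
        if (key == null || key.length == 0) {
            return "<open>";
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            int v = b & 0xff;
            if (v >= 0x20 && v < 0x7f && v != '\\') {
                sb.append((char) v);                               // printable ASCII as-is
            } else {
                sb.append(String.format(Locale.ROOT, "\\x%02X", v)); // escape everything else
            }
        }
        return sb.toString();
    }

    // Build one log line identifying which region a task is reading.
    static String describe(int partitionId, String server, byte[] startKey, byte[] endKey) {
        return String.format(Locale.ROOT,
                "partition=%d server=%s startKey=%s endKey=%s",
                partitionId, server, printableKey(startKey), printableKey(endKey));
    }

    public static void main(String[] args) {
        // Placeholder values; in a real patch these would come from the scan's region info.
        byte[] start = "row-1000".getBytes(StandardCharsets.UTF_8);
        LOG.info(describe(3, "rs-07.example.com", start, new byte[0]));
    }
}
```

Logging one such line at the start of each task, together with the task's duration from the Spark UI, should make it possible to see whether the slow tasks line up with particular regions or region servers.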