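I am trying to access data from Cassandra through the DataStax Cassandra connector. The code below works for me. After the join, I sum the value column from the RDD with the one from Cassandra.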
tm(
  a.joinWithCassandraTable("ks", "tbl")
    .on(SomeColumns("key", "key2", "key3", "key4", "key5", "key6", "key7", "key8", "key9", "key10",
      "key11", "key12", "key13", "key14", "key15", "column1", "column2", "column3", "column4", "column5"))
    .select("value1")
    .map { case (ip, row) =>
      IP(ip.key, ip.key2, ip.key3, ip.key4, ip.key5, ip.key6, ip.key7, ip.key8, ip.key9, ip.key10,
        ip.key11, ip.key12, ip.key13, ip.key14, ip.key15, ip.column1, ip.column2, ip.column3, ip.column4, ip.column5,
        ip.value1 + row.getLong("value1"))
    }
    .saveToCassandra("ks", "tbl")
)
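However, when I try to do a left join, I get "cannot resolve symbol getLong". I think this is because a left join cannot guarantee a value, since it may be null, but I could not work out how to write this in Scala.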
tm(
  a.leftJoinWithCassandraTable("ks", "tbl")
    .on(SomeColumns("key", "key2", "key3", "key4", "key5", "key6", "key7", "key8", "key9", "key10",
      "key11", "key12", "key13", "key14", "key15", "column1", "column2", "column3", "column4", "column5"))
    .select("value1")
    .map { case (ip, row) =>
      IP(ip.key, ip.key2, ip.key3, ip.key4, ip.key5, ip.key6, ip.key7, ip.key8, ip.key9, ip.key10,
        ip.key11, ip.key12, ip.key13, ip.key14, ip.key15, ip.column1, ip.column2, ip.column3, ip.column4, ip.column5,
        ip.value1 + row.getLong("value1"))
    }
    .saveToCassandra("ks", "tbl")
)
Thanks for your help. If any more information is needed, let me know and I will try to add it.
1 Answer
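Because a matching row may be missing in Cassandra, the left join pairs each element with an Option[Row] rather than a Row object, which is why getLong cannot be resolved on it directly. Instead of
.map { case (ip, row) => ...}
you can pattern match on the Option inside the map: when there is no data (None), return the IP object itself, and when there is data (Some), construct the new IP object.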
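
The None/Some handling described above can be sketched in plain Scala, without a Spark cluster. This is only an illustration under stated assumptions: IP is cut down to two fields, Row is a hypothetical stand-in for the connector's row type, and joined imitates the (left element, Option[Row]) pairs that a left join produces.

```scala
// Simplified stand-ins (assumptions): the IP in the question has 20 key/column
// fields plus value1, and the real row type comes from the connector.
final case class IP(key: String, value1: Long)

final case class Row(values: Map[String, Long]) {
  def getLong(name: String): Long = values(name)
}

object LeftJoinSketch {
  // Imitates the output shape of a left join: each left element is paired
  // with Some(row) when a match exists and with None otherwise.
  val joined: Seq[(IP, Option[Row])] = Seq(
    (IP("a", 1L), Some(Row(Map("value1" -> 10L)))), // matching row found
    (IP("b", 2L), None)                              // no matching row
  )

  // Match on the Option instead of calling getLong on it directly.
  val merged: Seq[IP] = joined.map {
    case (ip, Some(row)) => ip.copy(value1 = ip.value1 + row.getLong("value1"))
    case (ip, None)      => ip // nothing stored yet, keep the incoming value
  }
}
```

In the job from the question, the same two cases would go inside the .map that follows leftJoinWithCassandraTable(...).on(...).select("value1"), building the full IP in the Some branch. A shorter alternative with the same effect is to add row.map(_.getLong("value1")).getOrElse(0L) to ip.value1.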