Spark 1.5 and datastax-ddc-3.2.1 Cassandra: which dependency jars?

lyr7nygr  posted on 2021-07-13  in  Java

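I am using Spark 1.5 and Cassandra 3.2.1. Can anyone tell me which jars are needed on the build path to connect to Cassandra, query it, and insert data into it?
Right now I am using the following jars:
spark-cassandra-connector_2.10-1.5.0-m3.jar
apache-cassandra-clientutil-3.2.1.jar
cassandra-driver-core-3.0.0-beta1-bb1bce4-snapshot-shaded.jar
spark-assembly-1.5.1-hadoop2.0.0-mr1-cdh4.2.0.jar
guava-18.0.jar
netty-all-4.0.23.final.jar
With the jars above I can connect to Cassandra. I can truncate tables and drop tables. But I cannot insert any data, not even with a simple insert query.
Here is the code: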

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import com.datastax.driver.core.Session;
import com.datastax.spark.connector.cql.CassandraConnector;

public class Test {

    public static void main(String[] args) {

        // Spark configuration: standalone master, Cassandra contact point, Kryo serialization
        SparkConf conf = new SparkConf()
                .setMaster("spark://blr-lt-203:7077")
                .setAppName("testinsert")
                .set("spark.cassandra.connection.host", "blr-lt-203")
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .set("spark.kryoserializer.buffer.max", "1024mb");
        JavaSparkContext ctx = new JavaSparkContext(conf);

        // Open a Cassandra session through the connector, reusing the Spark configuration
        CassandraConnector connector = CassandraConnector.apply(ctx.getConf());
        Session session = connector.openSession();

        // This is the statement that hangs and eventually times out
        session.execute("insert into test.table1 (name) values ('abcd')");

        session.close();
        ctx.stop();
    }
}

Here is the log:

16/03/28 21:24:52 INFO BlockManagerMaster: Trying to register BlockManager
16/03/28 21:24:52 INFO BlockManagerMasterEndpoint: Registering block manager localhost:50238 with 944.7 MB RAM, BlockManagerId(driver, localhost, 50238)
16/03/28 21:24:52 INFO BlockManagerMaster: Registered BlockManager
16/03/28 21:24:53 INFO NettyUtil: Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
16/03/28 21:24:53 INFO Cluster: New Cassandra host localhost/127.0.0.1:9042 added
16/03/28 21:24:53 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster

It just sits here for a while and then times out with the following exception:

Exception in thread "main" com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency LOCAL_QUORUM (2 required but only 1 alive)

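What am I doing wrong?
Please let me know which jars are required, or whether there is a version compatibility problem.
What is the most stable combination of Spark (1.5) and Cassandra (?) versions?
Thanks in advance.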

lp0sw83n1#

This problem is caused by a conflict between versions of Google's Guava library.
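One way to see which Guava copy actually wins on the classpath is to ask the JVM where a Guava class was loaded from. The following is only a minimal diagnostic sketch, not part of the original setup: the class name `GuavaLocator` and the helper `locationOf` are made up for illustration, and `com.google.common.base.Preconditions` is used simply as a representative Guava class. It relies on the standard library only.

```java
import java.security.CodeSource;

public class GuavaLocator {

    // Hypothetical helper: returns the jar or directory a class was loaded from,
    // or a short marker string when that cannot be determined.
    static String locationOf(String className) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> cls = Class.forName(className, false, GuavaLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            return source == null ? "<bootstrap or unknown>" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // Any Guava class name works here; Preconditions is just a representative choice.
        System.out.println(locationOf("com.google.common.base.Preconditions"));
    }
}
```

Run with the same classpath as the failing job, this prints the location of the jar that supplied the class. If that turns out to be the Spark assembly rather than guava-18.0.jar, that would be consistent with the conflict described above.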
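The solution is to shade the Guava library that comes in with the spark-cassandra-connector dependency. You can do this with the Maven Shade plugin. Here is my pom.xml, which shades the Guava library.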

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.pc.test</groupId>
    <artifactId>casparktest</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>casparktest</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.5.0</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector_2.10</artifactId>
            <version>1.5.0</version>
        </dependency>
        <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-core</artifactId>
            <version>3.0.0-beta1</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <!-- Drop signature files so the merged jar is not rejected as tampered -->
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <!-- Move com.google classes into a private package to avoid the Guava clash -->
                            <relocations>
                                <relocation>
                                    <pattern>com.google</pattern>
                                    <shadedPattern>com.pointcross.shaded.google</shadedPattern>
                                </relocation>
                            </relocations>
                            <minimizeJar>false</minimizeJar>
                            <shadedArtifactAttached>true</shadedArtifactAttached>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

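After this, run a Maven build. It produces a jar that contains all the dependencies listed in pom.xml, with the Guava classes relocated (shaded), and you can submit your Spark job with that jar.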
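To confirm that the relocation took effect before submitting the job, the built jar can be inspected for the relocated package. This is a rough sketch using only `java.util.jar` from the standard library. The class name `ShadedJarCheck`, the helper `countEntries`, and the default jar path are assumptions for illustration; in particular the path assumes the Shade plugin's default `shaded` classifier for an attached artifact, so adjust it to whatever your build actually writes to `target/`.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class ShadedJarCheck {

    // Hypothetical helper: counts the non-directory entries in a jar
    // whose path starts with the given prefix.
    static long countEntries(String jarPath, String prefix) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                      .filter(e -> !e.isDirectory() && e.getName().startsWith(prefix))
                      .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed output path; pass the real jar path as the first argument if it differs.
        String jarPath = args.length > 0 ? args[0] : "target/casparktest-0.0.1-SNAPSHOT-shaded.jar";

        // Prefixes mirror the <pattern> and <shadedPattern> values from the pom.xml above.
        System.out.println("relocated entries:   " + countEntries(jarPath, "com/pointcross/shaded/google/"));
        System.out.println("unrelocated entries: " + countEntries(jarPath, "com/google/"));
    }
}
```

If shading worked, one would expect the first count to be greater than zero and the second to be zero. A non-zero second count suggests some `com.google` classes were not relocated.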
