PySpark on Google Colab

tmb3ates · asked 2023-02-18 · Spark

I am trying to use PySpark on Google Colab. Every tutorial follows a similar approach:

!pip install pyspark

# Import SparkSession
from pyspark.sql import SparkSession
# Create a Spark session
spark = SparkSession.builder.master("local[*]").getOrCreate()
# Check Spark session information
spark
# Import a Spark function from the library
from pyspark.sql.functions import col

But I get an error when I run this.
I tried installing Java with something like this:

# Download Java Virtual Machine (JVM)
!apt-get install openjdk-8-jdk-headless -qq > /dev/null

as the tutorials suggest, but none of it seems to work.


1yjd4xko (Answer 1)

This worked for me, so I'm posting it in case anyone needs it.

!pip install pyspark
!pip install -U -q PyDrive
!apt install openjdk-8-jdk-headless -qq

import os
# Point JAVA_HOME at the JDK that was just installed
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"

import pyspark
import pyspark.sql as pyspark_sql
import pyspark.sql.types as pyspark_types
import pyspark.sql.functions as pyspark_functions
from pyspark import SparkContext, SparkConf

# Configure Spark (expose the Spark UI on port 4050)
conf = SparkConf().set("spark.ui.port", "4050")

# Create the context, then get the session
sc = pyspark.SparkContext(conf=conf)
spark = pyspark_sql.SparkSession.builder.getOrCreate()
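
As a quick sanity check, here is a minimal sketch that assumes the session above was created successfully; the sample rows and the column names (name, age) are arbitrary placeholders:

# Verify the session works end to end: build a small DataFrame and filter it
# (sample data and column names are arbitrary placeholders)
sample = [("alice", 34), ("bob", 45)]
df = spark.createDataFrame(sample, ["name", "age"])
df.filter(pyspark_functions.col("age") > 40).show()
print(spark.version)

If Java is missing or JAVA_HOME points at the wrong path, the failure already shows up when the SparkContext is created, so this check mainly confirms the session is usable. The spark.ui.port setting is optional for running jobs; it usually only matters when paired with a tunnel (for example ngrok) to reach the Spark UI from outside Colab.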
