Unable to load data from an MS SQL database in a Microsoft Azure Databricks notebook

Posted by smdncfj3 on 2021-05-27 in Spark


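I want to retrieve data from an MS SQL database (not hosted on Azure) into a Microsoft Azure Databricks notebook. Here are the steps I followed:
Go to the Azure portal and create a resource group.
Create an Azure Databricks service, without using the "Deploy Azure Databricks workspace in your own Virtual Network (VNet)" option (maybe I should have).
Once the Azure Databricks service is ready, launch it and create a cluster with no specific configuration.
Create a notebook (running on that cluster) with this script: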

msSqlServer = "jdbc:sqlserver://xxx.xxx.xxx.xxx:1433;ApplicationIntent=readonly;databaseName=" + msSqlDatabase
query = """(select * from mytable)foo"""

df = (
  spark.read.format("jdbc")
  .option("url", msSqlServer)
  .option("dbtable", query)
  .option("user", msSqlUser)
  .option("password", msSqlPassword)
  .load()
)

I get this error:

com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host xxx.xxx.xxx.xxx, port 1433 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".

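Before asking on Stack Overflow, I contacted my company's network and DBA teams. The DBA said the connection is established but then drops immediately.
For reference, I followed this tutorial: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/sql-databases
Maybe something needs to be configured, but I am not a networking person at all (I am just a junior data scientist who wants to play with notebooks on Azure Databricks and access his company's databases). For example, how can I "make sure that TCP connections to the port are not blocked by a firewall"?
If you have any ideas or have already run into this issue, feel free to share. :)
If you need more information, let me know.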

apache-spark sql-server azure-databricks

Source: https://stackoverflow.com/questions/63725850/unable-to-load-the-data-from-a-ms-sql-database-from-microsoft-azure-databricks-n

1 Answer

zzoitvuj1#

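If the Azure SQL database is configured to listen for TCP/IP traffic on port 1433, the failure is most likely due to one of these three causes:
The JDBC connection string was entered incorrectly.
A firewall is blocking incoming connections.
The Azure SQL database is not running.
Get the JDBC connection string for the Azure SQL database from the Azure portal.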
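
To tell a network problem (firewall, routing, server not listening) apart from a wrong connection string, a plain TCP probe run in a notebook cell can help, because the cell executes on the cluster driver and therefore connects from the same network as the JDBC driver. The following is a minimal sketch using only the Python standard library; the helper name `can_reach` and the 5-second timeout are illustrative assumptions, and the host in the commented call is a placeholder.

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False

# Placeholder address: replace with the SQL Server host used in the JDBC URL.
# print(can_reach("xxx.xxx.xxx.xxx", 1433))
```

If the probe returns False from the cluster while the same host and port are reachable from inside the company network, the Spark code is probably not at fault and the network path between the Databricks workspace and the database server is the thing to raise with the network team.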

SQL database using JDBC with Python:

jdbcHostname = "chepra.database.windows.net"
jdbcDatabase = "chepra"
jdbcPort = "1433"
username = "chepra"
password = "XXXXXXXXXX"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
  "user" : username,
  "password" : password,
  "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
pushdown_query = "(Select * from customers where CustomerID = 2) CustomerID"
df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)
display(df)
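
Since a mistyped connection string is one of the causes listed above, it may help to assemble the URL in one place and validate its parts before handing it to Spark. This is only a sketch: `build_jdbc_url` is a hypothetical helper, not part of Spark or Databricks, and it assumes the Microsoft JDBC driver's `loginTimeout` connection property (in seconds), with 30 as an arbitrary value.

```python
def build_jdbc_url(host: str, port: int, database: str, **props) -> str:
    """Assemble a SQL Server JDBC URL: jdbc:sqlserver://host:port;database=...;key=value"""
    if not host or not database:
        raise ValueError("host and database must be non-empty")
    if not 0 < int(port) < 65536:
        raise ValueError(f"invalid port: {port}")
    parts = [f"jdbc:sqlserver://{host}:{int(port)}", f"database={database}"]
    # Extra driver properties are appended as key=value pairs.
    parts.extend(f"{key}={value}" for key, value in props.items())
    return ";".join(parts)

# loginTimeout is in seconds; 30 is an illustrative value, not a recommendation.
jdbcUrl = build_jdbc_url("chepra.database.windows.net", 1433, "chepra", loginTimeout=30)
print(jdbcUrl)
# prints: jdbc:sqlserver://chepra.database.windows.net:1433;database=chepra;loginTimeout=30
```

The resulting string can be passed as `url` to `spark.read.jdbc` in place of the hand-formatted one.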

SQL database using JDBC with Scala:

val jdbcHostname = "chepra.database.windows.net"
val jdbcPort = 1433
val jdbcDatabase = "chepra"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"

// Create a Properties() object to hold the parameters.
import java.util.Properties
val connectionProperties = new Properties()

connectionProperties.put("user", "chepra")
connectionProperties.put("password", "XXXXXXXXXX")

val employees_table = spark.read.jdbc(jdbcUrl, "customers", connectionProperties)
employees_table.show()

Answered 2021-05-27
