I want to use Databricks Auto Loader to read a stream of files. The data volume is high, so I want to use file notification mode (latency was poor when I used directory listing mode). However, it seems this mode requires storage queues, which are not available in Azure Premium storage. When I ran the following code, I got this error: UnknownHostException: .queue.core.windows.net
import com.databricks.sql.CloudFilesAzureResourceManager

// Build a resource manager from the storage account and service principal credentials.
val manager = CloudFilesAzureResourceManager
  .newManager
  .option("cloudFiles.connectionString", "XXX")
  .option("cloudFiles.resourceGroup", "XXX")
  .option("cloudFiles.subscriptionId", "XXX")
  .option("cloudFiles.tenantId", "XXX")
  .option("cloudFiles.clientId", "XXX")
  .option("cloudFiles.clientSecret", "XXX")
  .option("path", "abfss://XXX@ZZZ.dfs.core.windows.net/test") // required only for setUpNotificationServices
  .create()

// Set up a queue and a topic subscribed to the path provided in the manager.
manager.setUpNotificationServices("XXX")
https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/file-notification-mode#permissions-azure
Is there a way to use file notification mode with Azure Premium storage?
1 Answer
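Use file notifications to scale Auto Loader to ingest millions of files. The cloudFiles.useNotifications option lets you choose between file notification mode and directory listing mode for detecting new files.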
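You must grant the permissions required to create the cloud resources. If cloudFiles.useNotifications is set to true, configure the cloudFiles options accordingly. For details on configuring Auto Loader in Databricks, see this link, which describes reading and writing streaming data with Auto Loader.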
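As a rough illustration of the configuration described above, the cloudFiles options for file notification mode could be collected in one place. This is only a sketch: notificationOptions is a hypothetical helper name, the "XXX" values are the placeholders from the question, and "json" is an assumed file format. It does not by itself resolve the missing queue endpoint on Premium storage.

```scala
// Hypothetical helper: gathers the Auto Loader options for file notification mode on Azure.
// All "XXX" values are placeholders carried over from the question.
def notificationOptions(format: String): Map[String, String] = Map(
  "cloudFiles.format"           -> format, // format of the incoming files (assumed here)
  "cloudFiles.useNotifications" -> "true", // "false" selects directory listing mode
  "cloudFiles.connectionString" -> "XXX",
  "cloudFiles.resourceGroup"    -> "XXX",
  "cloudFiles.subscriptionId"   -> "XXX",
  "cloudFiles.tenantId"         -> "XXX",
  "cloudFiles.clientId"         -> "XXX",
  "cloudFiles.clientSecret"     -> "XXX"
)

// On a Databricks cluster, where `spark` is predefined, the map could be applied like this.
// A schema (or cloudFiles.schemaLocation for schema inference) is also needed before load:
// val df = spark.readStream
//   .format("cloudFiles")
//   .options(notificationOptions("json"))
//   .load("abfss://XXX@ZZZ.dfs.core.windows.net/test")
```

Keeping the options in a plain Map makes it easy to switch between the two modes by changing a single value.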