ADF does not correctly recognize a JSON column when pushing to Cosmos

Posted by w8f9ii69 on 2023-07-01 in Other

The source of my ADF pipeline is a query, something along these lines:

SELECT
   FirstName,
   LastName,
   (
        SELECT Phonenumber FROM Phones p WHERE p.PhoneID = a.PhoneID
        FOR JSON PATH
    ) as PhoneNumbers
FROM Accounts a
FOR JSON PATH

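The output all looks like valid JSON, but no matter what I do, I cannot get ADF to recognize it as JSON. ADF treats it as a string, and when it is sent to Parquet or Cosmos, ADF invalidates the JSON by adding escape characters, like [{\\"FirstName\\":\\"TheDude\\"...
I need to be able to read data that contains JSON and write it to Cosmos while keeping the JSON intact. Any help would be appreciated.
I tried outputting the PhoneNumbers column to Cosmos and expected it to look like regular JSON output, but it has escape characters around the quotes.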

lnvxswe2 #1

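The query above does produce JSON, but ADF reads the result as a row value and therefore treats it as a string. That is why it copies the row to the target as-is, escape characters included.
From the question: "I need to be able to read data that contains JSON and write it to Cosmos while keeping the JSON intact."
To achieve this, first store the generated JSON in a file, then copy that JSON file to Cosmos DB. This requires two Copy data activities.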
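The difference between a JSON string and parsed JSON can be reproduced outside ADF. The following is a minimal Python sketch, not ADF code; the phone number and the row contents are hypothetical values chosen to resemble the query output in the question.

```python
import json

# Hypothetical text as returned by the inner FOR JSON PATH subquery.
phones_json = '[{"Phonenumber":"555-0100"}]'

# Case 1: the column stays a string, so serializing the row escapes its quotes.
as_string = json.dumps({"FirstName": "TheDude", "PhoneNumbers": phones_json})

# Case 2: the text is parsed first, so it is written as a nested array.
as_json = json.dumps({"FirstName": "TheDude", "PhoneNumbers": json.loads(phones_json)})

print(as_string)  # PhoneNumbers is a quoted string containing \" sequences
print(as_json)    # PhoneNumbers is a real JSON array
```

Staging the output as a file and reading it back through a JSON dataset plays the role of the `json.loads` step here: the second copy activity sees structured data rather than a string.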

Copy data activity that saves the JSON to a blob:

Provide your query in the source of the copy activity. Here I used a sample query.

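To store the JSON string as a JSON file, use a delimited text dataset as the sink of this copy activity, give the file a .json extension, and apply the following configuration.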

Leave the mapping of the copy activity unchanged.
After the first copy activity runs, you get a JSON file containing the query output.
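Before running the second activity, it can help to confirm that the staged file really is structured JSON. The following is a hedged Python sketch of such a check; the helper name `nested_columns_are_json` and the sample contents are hypothetical and not part of the original answer.

```python
import json

def nested_columns_are_json(text, columns):
    """Return True if text is a JSON array of objects and none of the
    listed columns holds a plain string (i.e. JSON kept as escaped text)."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        return False
    for row in rows:
        if not isinstance(row, dict):
            return False
        for col in columns:
            if isinstance(row.get(col), str):
                return False
    return True

# Hypothetical file contents: nested array (good) versus escaped string (bad).
good = '[{"FirstName":"TheDude","PhoneNumbers":[{"Phonenumber":"555-0100"}]}]'
bad = '[{"FirstName":"TheDude","PhoneNumbers":"[{\\"Phonenumber\\":\\"555-0100\\"}]"}]'

print(nested_columns_are_json(good, ["PhoneNumbers"]))
print(nested_columns_are_json(bad, ["PhoneNumbers"]))
```

If the check fails for the staged file, the nested column was written as escaped text and the second copy activity would carry that string through to Cosmos DB.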

Copy data activity from the JSON file to Cosmos DB:

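Now create a JSON dataset in ADF for the file above and use it as the source of the second copy activity. Use your Cosmos DB dataset as the sink.
Map FirstName, LastName, and PhoneNumbers from source to sink, as in the translator section of the pipeline JSON.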

After running the pipeline, the documents in Cosmos DB contain the query output as proper JSON.

My pipeline JSON for reference:

{
    "name": "pipeline3",
    "properties": {
        "activities": [
            {
                "name": "Copy data from SQL to JSON",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "AzureSqlSource",
                        "sqlReaderQuery": "SELECT\n   FirstName,\n   LastName,\n   (\n        SELECT Phonenumber FROM s1\n        FOR JSON PATH\n    ) as PhoneNumbers\nFROM s1\nFOR JSON PATH;",
                        "queryTimeout": "02:00:00",
                        "partitionOption": "None"
                    },
                    "sink": {
                        "type": "DelimitedTextSink",
                        "storeSettings": {
                            "type": "AzureBlobFSWriteSettings"
                        },
                        "formatSettings": {
                            "type": "DelimitedTextWriteSettings",
                            "quoteAllText": true,
                            "fileExtension": ".txt"
                        }
                    },
                    "enableStaging": false,
                    "translator": {
                        "type": "TabularTranslator",
                        "typeConversion": true,
                        "typeConversionSettings": {
                            "allowDataTruncation": true,
                            "treatBooleanAsNumber": false
                        }
                    }
                },
                "inputs": [
                    {
                        "referenceName": "AzureSqlTable1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "jsoncsv",
                        "type": "DatasetReference"
                    }
                ]
            },
            {
                "name": "Copy data from JSON to cosmos",
                "type": "Copy",
                "dependsOn": [
                    {
                        "activity": "Copy data from SQL to JSON",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "JsonSource",
                        "storeSettings": {
                            "type": "AzureBlobFSReadSettings",
                            "recursive": true,
                            "enablePartitionDiscovery": false
                        },
                        "formatSettings": {
                            "type": "JsonReadSettings"
                        }
                    },
                    "sink": {
                        "type": "CosmosDbSqlApiSink",
                        "writeBehavior": "insert",
                        "disableMetricsCollection": false
                    },
                    "enableStaging": false,
                    "translator": {
                        "type": "TabularTranslator",
                        "mappings": [
                            {
                                "source": {
                                    "path": "$['FirstName']"
                                },
                                "sink": {
                                    "path": "FirstName"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['LastName']"
                                },
                                "sink": {
                                    "path": "LastName"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['PhoneNumbers']"
                                },
                                "sink": {
                                    "path": "PhoneNumbers"
                                }
                            }
                        ],
                        "collectionReference": ""
                    }
                },
                "inputs": [
                    {
                        "referenceName": "Json1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "CosmosDbNoSqlContainer1",
                        "type": "DatasetReference"
                    }
                ]
            }
        ],
        "annotations": []
    }
}
