I have a table with 25,000 rows:
Table Audit (Id int identity(1,1), AdditionalInfo xml)
The sample data in the AdditionalInfo column for a row looks like this:
<Audit version="1">
  <Context name="Event">
    <Action name="OrganizationEventReceived">
      <Input>
        <Source type="SourceOrganizationId">77d2678b-ea4a-43ad-816b-c63edf206b08</Source>
        <Target type="TargetOrganizationId">b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273</Target>
      </Input>
    </Action>
  </Context>
</Audit>
I'd like to shred the XML and collect the data into an output dataset with the following query:
SELECT Id,
       p.value('(@name)[1]', 'nvarchar(100)') AS TargetAction,
       p.value('(Input/Source/text())[1]', 'nvarchar(500)') AS Source,
       p.value('(Input/Target/text())[1]', 'nvarchar(max)') AS Target
FROM dbo.Audit
CROSS APPLY AdditionalInfo.nodes('/Audit/Context/Action') AS AdditionalInfo(p)
The performance of the query is bad: it takes 15 seconds to return the result set for just 25,000 rows. Is there a better way of doing this? I even tried putting primary and secondary XML indexes on the AdditionalInfo column. Please help and let me know which SQL Server XQuery techniques would do this better.
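For reference, the XML indexes were created roughly along these lines (a minimal sketch; it assumes Id is the table's clustered primary key, which a primary XML index requires, and shows a PATH secondary index as just one of the possible kinds):

-- Primary XML index (requires a clustered primary key on the table; Id is assumed here)
CREATE PRIMARY XML INDEX PXML_Audit_AdditionalInfo
    ON dbo.Audit (AdditionalInfo);

-- Secondary XML index; FOR PATH shown as one example (VALUE and PROPERTY are the other kinds)
CREATE XML INDEX SXML_Audit_AdditionalInfo_Path
    ON dbo.Audit (AdditionalInfo)
    USING XML INDEX PXML_Audit_AdditionalInfo
    FOR PATH;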
Thanks,
1 Answer
vsaztqbk1#
Great question.
My recent task required parsing about 35,000 XML documents, a valid document being ~20 kB. More and larger XML documents tend to fill memory exponentially, so try to distribute your work:
- Target stores unstructured data, which eats most of the computing power because of its data type and the varying length of its values.
- CROSS APPLY matters: avoid a three-step path in nodes(); consider two steps plus recursion (see the splitting advice below).
- Split the work into batches, e.g. WHERE id IN (1,2,3), or work one document at a time inside a loop: DECLARE @xml_doc XML; SET @xml_doc = (SELECT xmldata FROM xmlsource WHERE id = 1); (a sketch of this batching idea follows the list).
- Number the shredded parts with ROW_NUMBER(), then LEFT JOIN all parts back to the list of XML documents using some identifier, such as xml_id.
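A minimal sketch of the batching and rejoining idea, under assumptions not spelled out above: rows are assigned to batches of 1,000 with ROW_NUMBER(), each batch is shredded separately into a temporary table, and the parts are LEFT JOINed back to dbo.Audit by Id (playing the role of xml_id). The batch size, temp table names, and variables are illustrative.

-- Assign each row to a batch of 1,000 documents using ROW_NUMBER()
IF OBJECT_ID('tempdb..#batches') IS NOT NULL DROP TABLE #batches;
SELECT Id,
       (ROW_NUMBER() OVER (ORDER BY Id) - 1) / 1000 AS BatchNo
INTO #batches
FROM dbo.Audit;

IF OBJECT_ID('tempdb..#shredded') IS NOT NULL DROP TABLE #shredded;
CREATE TABLE #shredded
(
    Id           int           NOT NULL,
    TargetAction nvarchar(100) NULL,
    Source       nvarchar(500) NULL,
    Target       nvarchar(max) NULL
);

DECLARE @batch    int = 0,
        @maxBatch int = (SELECT MAX(BatchNo) FROM #batches);

-- Shred one batch at a time instead of the whole table in a single statement
WHILE @batch <= @maxBatch
BEGIN
    INSERT INTO #shredded (Id, TargetAction, Source, Target)
    SELECT a.Id,
           p.value('(@name)[1]',               'nvarchar(100)') AS TargetAction,
           p.value('(Input/Source/text())[1]', 'nvarchar(500)') AS Source,
           p.value('(Input/Target/text())[1]', 'nvarchar(max)') AS Target
    FROM dbo.Audit AS a
    JOIN #batches  AS b ON b.Id = a.Id AND b.BatchNo = @batch
    CROSS APPLY a.AdditionalInfo.nodes('/Audit/Context/Action') AS x(p);

    SET @batch += 1;
END;

-- Rejoin the shredded parts to the source rows by identifier (Id plays the role of xml_id here)
SELECT a.Id, s.TargetAction, s.Source, s.Target
FROM dbo.Audit AS a
LEFT JOIN #shredded AS s ON s.Id = a.Id;

Smaller batches keep the memory footprint of each statement bounded; the right batch size depends on document size and available memory, so treat 1,000 as a starting point to tune.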