SQL Server MS SQL PATINDEX and Regex

gxwragnw  于 12个月前  发布在  其他
关注(0)|答案(1)|浏览(96)

Been banging my head at this for a few trying to come up with a clever way to regex (or if someone has a better idea) a size value from a very random string. I've setup the below test to show what I'm working with, I'm trying to get the Final output in this statement to be like these in Green, and not these in Red by using the @pattern in the query to pull out the...

Number x Number

Hoping there's a regex or string guru out there that might have an idea for me :)

DECLARE @pattern AS VARCHAR(100)
SET @pattern = '%[0-9][0-9. x][0-9.x ][0-9. x]%'

BEGIN

    WITH cte AS (
        SELECT 'Italy Terrazzo Sacra Nero 24x24 Honed' [Name]
        UNION
        SELECT 'Nero Marquina 1x3 Herringbone' [Name]
        UNION
        SELECT 'Myorka Blue 2x8' [Name]
        UNION
        SELECT 'Chrysler Driftwood Hickory Sea 12mil Wear Layer Glue Down 6.3x48.4' [Name]
        UNION
        SELECT 'Myorka Blue 2x8' [Name]
        UNION
        SELECT 'Nero Marquina 1x3 Herringbone' [Name]
        UNION
        SELECT 'Broadway Lvt Ash 9x72 Rigid Core Click- 6.0mm/28mil Wear Layer' [Name]
        UNION
        SELECT 'Minetta Concreto Pearl 2.5mm/28mil Wear Layer Glue Down 18x36' [Name]
        UNION
        SELECT 'Speak Leather Black 24 x 24' [Name]
        UNION
        SELECT 'Accent Montana White 12 x 36 Glossy' [Name]
    )

    SELECT 
         --FULL NAME--
          [Name]
         
         --HELPERS TO SEE WHAT I'M TESTING--
         ,PATINDEX('%[0-9]%x%', [Name]) AS [START]
         ,SUBSTRING([Name], PATINDEX('%[0-9]%x%', [Name]), LEN([Name]))
         ,REVERSE([Name]) AS [REVERSE]
         ,LEN([Name]) AS [STRLenght]
         ,PATINDEX(@pattern, REVERSE([Name])) + 1
         ,LEN([Name]) - PATINDEX('%[0-9][^A-z]x%', REVERSE([Name])) + 1 AS [END]

         --FULL CALCULATION FOR FINAL OUTPUT--
         ,CASE WHEN [Name] LIKE '%[0-9] x [0-9]%'
               THEN SUBSTRING([Name], PATINDEX('%[0-9]%x%', [Name]), (LEN([Name]) - PATINDEX('%[0-9]%x%', REVERSE([Name])) + 1) - PATINDEX('%[0-9]%x%', [Name]) + 1)
               WHEN [Name] LIKE '%[0-9]x[0-9]%'
               THEN REPLACE(SUBSTRING([Name], PATINDEX(@pattern, [Name]), (LEN([Name]) - PATINDEX(@pattern, REVERSE([Name])) + 1) - PATINDEX(@pattern, [Name]) + 1), 'x', ' x ')
               ELSE NULL
            END AS [Final]
          FROM cte
END
mwkjh3gx

mwkjh3gx1#

Please try the following solution based on tokenization.

It is leveraging SQL Server built-in XML and XQuery functionality.

Notable points:

  • CROSS APPLY is tokenizing input string as XML.
  • XQuery FLWOR expression traversing XML and filtering out any token that is not 'x' via the XPath predicate [lower-case(text()[1])="x"] .
  • /root/r[$pos - 1] and /root/r[$pos + 1] XPath predicates get preceding and following tokens.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, tokens VARCHAR(256));
INSERT @tbl (tokens) VALUES
('Italy Terrazzo Sacra Nero 24x24 Honed'),
('Nero Marquina 1x3 Herringbone'),
('Myorka Blue 2x8'),
('Chrysler Driftwood Hickory Sea 12mil Wear Layer Glue Down 6.3x48.4'),
('Nero Marquina 1x3 Herringbone'),
('Broadway Lvt Ash 9x72 Rigid Core Click- 6.0mm/28mil Wear Layer'),
('Minetta Concreto Pearl 2.5mm/28mil Wear Layer Glue Down 18x36'),
('Speak Leather Black 24 x 24'),
('Accent Montana White 12 x 36 Glossy'),
('Baroque Crackled 1/2 X 6 Pencil Capri');
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = SPACE(1);

SELECT *
    , c.query('
        for $x in /root/r[lower-case(text()[1])="x"]
        let $pos := count(root/r[. << $x]) + 1
        return data((/root/r[$pos - 1], $x, /root/r[$pos + 1]))
    ').value('text()[1]', 'VARCHAR(20)') AS result
FROM @tbl AS t
    CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
        REPLACE(REPLACE(REPLACE(tokens,'x', ' x '),SPACE(2),SPACE(1)), @separator, ']]></r><r><![CDATA[') + 
        ']]></r></root>' AS XML)) AS t1(c)
ORDER BY id;

Output

idtokensresult
1Italy Terrazzo Sacra Nero 24x24 Honed24 x 24
2Nero Marquina 1x3 Herringbone1 x 3
3Myorka Blue 2x82 x 8
4Chrysler Driftwood Hickory Sea 12mil Wear Layer Glue Down 6.3x48.46.3 x 48.4
5Nero Marquina 1x3 Herringbone1 x 3
6Broadway Lvt Ash 9x72 Rigid Core Click- 6.0mm/28mil Wear Layer9 x 72
7Minetta Concreto Pearl 2.5mm/28mil Wear Layer Glue Down 18x3618 x 36
8Speak Leather Black 24 x 2424 x 24
9Accent Montana White 12 x 36 Glossy12 x 36
10Baroque Crackled 1/2 X 6 Pencil Capri1/2 x 6

相关问题