Extract String for specific characters SQL Server

ql3eal8s  于 2023-10-15  发布在  SQL Server
关注(0)|答案(3)|浏览(133)

Trying to extract specific string from a varchar column in SQL Server. And then from that string I need the last 4 digits only.

I tried substring & patindex but can't figure out how to get the last 4 digits.

This was one attempt but it only returns the first part of the text:

SUBSTRING([column], (PATINDEX('%ID0%-[0-9][0-9][0-9][0-9][0-9]%',[column])),9)

Here's some sample data:

Brian Larry, Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX
ID007366 - Dave Jones - XX - Option Code 0121 9999
ID00 7120  Brian Smith XX  Branded company
ID07113 Gary Barnes  - LLL 0123 9111 AAA LLL
ID00 7120  Charles Old XX   Recall operation
ID0007439 - Kerry Hill - Maidstone XX - Option Code 0124 234 BBB XX
ID006817 Paul George Jackson 1234 8464 AAA Recall operation

I need to extract the ID00000 number, but then only return the last 4 digits. So for ID007370 , I only want to return 7370 . The ID numbers can vary in length.

ruarlubt

ruarlubt1#

Something like this looks to be working:

SELECT  RIGHT(SUBSTRING(string_nonspace, y.start, PATINDEX('%[^0-9]%', STUFF(string_nonspace, 1, start + 1, '')) + 1), 4)
FROM    (
    VALUES  (N'Brian Larry, Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX')
    ,   (N'ID007366 - Dave Jones - XX - Option Code 0121 9999')
    ,   (N'ID00 7120  Brian Smith XX  Branded company')
    ,   (N'ID07113 Gary Barnes  - LLL 0123 9111 AAA LLL')
    ,   (N'ID00 7120  Charles Old XX   Recall operation')
    ,   (N'ID0007439 - Kerry Hill - Maidstone XX - Option Code 0124 234 BBB XX')
    ,   (N'ID006817 Paul George Jackson 1234 8464 AAA Recall operation')
) t (col1)
CROSS APPLY (
        SELECT  replace(col1, ' ', '') AS string_nonspace
    ) x
CROSS APPLY (
        SELECT  PATINDEX('%ID[0-9]%', string_nonspace) AS start
    ) y

Output:

IDstring_nonspace
7370BrianLarry,HaysHillXX,ID007370,Optioncode-01234567,AAA-XX
7366ID007366-DaveJones-XX-OptionCode01219999
7120ID007120BrianSmithXXBrandedcompany
7113ID07113GaryBarnes-LLL01239111AAALLL
7120ID007120CharlesOldXXRecalloperation
7439ID0007439-KerryHill-MaidstoneXX-OptionCode0124234BBBXX
6817ID006817PaulGeorgeJackson12348464AAARecalloperation

I remove spaces and then find starting position by PATINDEX('%ID[0-9]%' . Then i clip the string so ID-part is removed, and then search for first non-number value. When that done, you can do a SUBSTRING between start and the first non-number value. Finally, RIGHT(..., 4) gets the last four characters.

Of course this code will fail on many malformed strings, but them the breaks, the in-data kinda "blowers"

qv7cva1a

qv7cva1a2#

Although SQL is not that friendly when it comes to string extractions, you can Try this :

SELECT SUBSTRING([column], LEN([column]) - 4, 4)
FROM [table];

This query will first find the length of the string in the [column] field using the LEN() function. Then, it will use the SUBSTRING() function to extract the last 4 characters of the string, starting at the length of the string minus 4. You can also use the following query to extract the last 4 digits of the ID00000 number, even if the ID number varies in length:

SELECT SUBSTRING([column], PATINDEX('%ID0%-%[0-9][0-9][0-9][0-9]%', [column]) + 6, 4)
FROM [table];

This query will first use the PATINDEX() function to find the position of the first occurrence of the string ID0%- in the [column] field. Then, it will use the SUBSTRING() function to extract the last 4 characters of the string, starting at the position of the ID0%- string plus 6.

nom7f22z

nom7f22z3#

There is no solution that guaranties you will have a correct result because you violate the 1FN which is the most evil thing to do...

Just have a look if you have a new row like this one :

insert into Sample (SampleString)
values ('Brian Larry, LUCID007700880099 Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX')

Or this one :

insert into Sample (SampleString)
values ('Brian LUCID0AZ, Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX')

相关问题