I'm trying to generate the same MD5 codes from Python 3 and SQL but I get always a different result.
With this Unique Identifier f033b004-eb80-412d-9773-f4f06bb994c1
SELECT HASHBYTES('MD5', 'f033b004-eb80-412d-9773-f4f06bb994c1')
Result: 0x9BCE8D23CAC76AF4F61C04673CDD0081
And then I do that calculation
SELECT ABS(HASHBYTES('MD5', 'f033b004-eb80-412d-9773-f4f06bb994c1') % 10)
Result: 5
Now with python I use the hashlib
library
import hashlib
m = hashlib.md5('f033b004-eb80-412d-9773-f4f06bb994c1'.encode() )
m.digest(), m.hexdigest()
Result: b'\x9b\xce\x8d#\xca\xc7j\xf4\xf6\x1c\x04g<\xdd\x00\x81', '9bce8d23cac76af4f61c04673cdd0081'
And then I do the same calculations
int.from_bytes( b'\x9b\xce\x8d#\xca\xc7j\xf4\xf6\x1c\x04g<\xdd\x00\x81',
byteorder='big',
signed=False )%10
Result: 9
Does anyone knows how can I get from the Python code the same results as the SQL Server ?
3条答案
按热度按时间7vhp5slm1#
This is because SQL server returns the MD5 hash in UPPERCASE and Python in lowercase. The UPPER and lower case bytes have, of course, different byte values.
wfauudbj2#
Just ran into this... the problem is that when SQL server converts the hash into integer to perform the modulo, it will not give you the correct integer because the hexadecimal is too big (check out the top answer in here: Conversion of long Hex string to Integer not working as expected (works in SQL) )
On the other hand, python can do the conversion correctly because the max integer size is as much as memory allows
iswrvxsc3#
This is true for BigQuery, not sure it it applies to SQL Server too, but the reason for this discrepancy is that the BigQuery functions for MD5, SHA-1, SHA-256, and SHA-512 return base64-encoded representations of the hash values, while the Python hashlib library returns hexadecimal representations.
This code will return the same value as SQL: