The CHECKSUM(), BINARY_CHECKSUM() and CHECKSUM_AGG() functions are provided by SQL Server to build a hash index based on an expression or a column list.
This can help determine whether a row has changed, so the mechanism can be used to detect records that have been updated.
I have found lots of examples of collisions, where the same hash value is generated for different input values. How can we identify the collision conditions for these functions?
Does anyone have a clue as to which algorithm or technique can be used to generate/compute a hash without hash collisions?
2 answers
6ljaweal1#
According to the "CHECKSUM weakness explained" article on SQLTeam:
The built-in CHECKSUM function in SQL Server is built on a series of 4-bit left rotational XOR operations.
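To make the quoted description concrete, here is a minimal Python sketch of a "rotate left 4 bits, then XOR" checksum over bytes. It is only a model built from that one sentence: the function names `rotl32` and `rotate_xor_checksum` are made up for illustration, and nothing here is confirmed to match SQL Server's actual output for every data type. It does show why such a scheme collides easily: in a 32-bit accumulator, 16 identical bytes cancel each other out.

```python
def rotl32(value: int, bits: int) -> int:
    """Rotate a 32-bit unsigned integer left by `bits` positions."""
    value &= 0xFFFFFFFF
    return ((value << bits) | (value >> (32 - bits))) & 0xFFFFFFFF


def rotate_xor_checksum(data: bytes) -> int:
    """Hypothetical model: rotate the accumulator left 4 bits, XOR in each byte."""
    acc = 0
    for byte in data:
        acc = rotl32(acc, 4) ^ byte
    # Reinterpret as a signed 32-bit value, since SQL Server's INT is signed.
    return acc - 0x100000000 if acc & 0x80000000 else acc


# In this model, 16 repeated bytes XOR to zero, so 17 x 'A' collides with 'A'.
print(rotate_xor_checksum(b"A"))       # 65
print(rotate_xor_checksum(b"A" * 17))  # 65
```

Under this model a collision does not require unusual data: any run of 16 equal bytes contributes nothing to the result. For change detection where collisions matter, a cryptographic hash such as HASHBYTES is the usual alternative.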
A forum post from 2006 by Peter Larsson (also linked in the article) includes SQL user-defined functions that compute the checksum. The author of the post claims 100% compatibility with SQL Server's built-in function (I haven't tested it myself).
In case the link goes dead, here is a copy of the relevant part:
With text/varchar/image data, call with
SELECT BINARY_CHECKSUM('abcdefghijklmnop')
,dbo.fnPesoBinaryChecksum('abcdefghijklmnop')
With integer data, call with
SELECT BINARY_CHECKSUM(123)
,dbo.fnPesoBinaryChecksum(CAST(123 AS VARBINARY))
I haven't figured out how to calculate checksum for integers greater than 255 yet.
Actually this is an improvement of MS function, since it accepts TEXT and IMAGE data.
Another good read is "Exploring Hash Functions In SQL Server" by Thomas Kejser, where the author checks the built-in hash functions in SQL Server for speed and quality.
zbsbpyhn2#
PHP implementation of BINARY_CHECKSUM: