I've been on a painful journey over the last couple of weeks. But I reached my destination. In the hope that I'm able to help others reach this point faster, and in a less stressful way, I'll share my results.
The Background Story
We're currently rebuilding our platform in Python and Django. The existing platform is based on .NET and related MS technologies. We're soon preparing to launch, and part of that is also migrating user data from the old to the new platform.
In the old system, user passwords are hashed and verified using the built-in PWDENCRYPT
and PWDCOMPARE
functions of Microsoft SQL Server.
As we want the transition to be as frictionless as possible for our users, we of course also want them to not have to reset their password after we launch. So I got the seemingly simple task to create a custom Django password hasher that is able to verify user entered passwords againts the hashes from PWDENCRYPT
.
First I set out to find some documentation on what those MS SQL functions actually do. It seems to officially not be very well documented what happens behind the scenes (if at all), and according to the docs you're not really supposed to use it:
PWDENCRYPT is an older function and might not be supported in a future release of SQL Server. Use HASHBYTES instead. HASHBYTES provides more hashing algorithms.
Fortunately, a couple ofStack Overflow questions were able to guide me in the right direction, although not all the way, due to errors in their explanations and examples, which sent me on several detours.
After spending way too much time on it, I gave up. But then a colleague gave it a try and found the missing key: The passwords must be encoded using UTF-16LE
. A detail I missed, but might be obvious to someone who is used to work in a Windows environment.
1条答案
按热度按时间brccelvz1#
The Solution
So, let's go through it step by step.
PWDENCRYPT
Let's try generating a password hash in SQL Server and look at the result:
The result consists of 4 parts:
0x
an indication that this is a hexadecimal value.0200
tells us the version of the hashing algorithm used. In this case, it tells us that it's hashed usingSHA-512
(orSHA2_512
in the vocabulary of SQL Server). If the header is0100
, it is instead hashed usingSHA1
.8D6818FF
is the salt for the hash.B31BAAA6F849583A0C1319E1E73A260525CFEC22D4BC9FE2FBCC535C682342634D5B87B2ED1A1DA2C63B7FED3CCDFC82DE31DD5670CCAC9BA7BA5866EB893843
is the actual salted and hashed password.PWDCOMPARE
This function is used to compare a password against a salted hash. Let's try it out:
1
is a boolean value, indicating that the password matched the hashed password. Using another hash, the result will be0
:I changed the last 2 digits of the hash to be
00
and got a0
back: The password did not match the hashed password. So far so good.Reimplementing in SQL Using
HASHBYTES
Now, in order to understand what's going on, I sort of reimplemented this with
HASHBYTES
based on the information found in the previously mentioned Stack Overflow questions:Yay! I get the same hash. All good.
Reimplementing in Python
And then the grand finale: Doing the same in Python. This is where I ended up spending days in agony, but I will save you from it and show you what to do instead.
There you go! Adding
.upper()
to the result allows us to compare the hashes from SQL Server and Python: