Spark Scala: javax.crypto.AEADBadTagException: Tag mismatch! when decrypting an AES-GCM binary file

aij0ehis  posted on 2021-07-13  in  Spark

Hi, I am trying to decrypt an AES-GCM encrypted file that I read with sc.binaryFiles (Scala version 2.11.12, Spark version 2.3).
This is my code:

val input_bin = spark.sparkContext.binaryFiles("file.csv.gz.enc")
val input_utf = input_bin.map{f => f._2}.collect()(0).open().readUTF()
val input_base64 =  Base64.decodeBase64(input_utf)

val GCM_NONCE_LENGTH = 12
val GCM_TAG_LENGTH = 16
val key = keyToSpec("<key>", 32, "AES", "UTF-8", Some("SHA-256"))

val cipher = Cipher.getInstance("AES/GCM/NoPadding")
val nonce = input_base64.slice(0, GCM_NONCE_LENGTH)

val spec = new GCMParameterSpec(128, nonce)
cipher.init(Cipher.DECRYPT_MODE, key, spec)
cipher.doFinal(input_base64)

def keyToSpec(key: String, keyLengthByte: Int, encryptAlgorithm: String,
                keyEncode:String = "UTF-8", digestMethod: Option[String] = Some("SHA-1")): SecretKeySpec = {
    //prepare key for encrypt/decrypt
    var keyBytes: Array[Byte] = key.getBytes(keyEncode)

    if (digestMethod.isDefined) {
      val sha: MessageDigest = MessageDigest.getInstance(digestMethod.get)
      keyBytes = sha.digest(keyBytes)
      keyBytes = util.Arrays.copyOf(keyBytes, keyLengthByte)
    }

    new SecretKeySpec(keyBytes, encryptAlgorithm)
  }

I get this error:

javax.crypto.AEADBadTagException: Tag mismatch!
  at com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:571)
  at com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1046)
  at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:983)
  at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:845)
  at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446)
  at javax.crypto.Cipher.doFinal(Cipher.java:2165)
  ... 50 elided

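The input file is fairly large (about 2 GB).
I am not sure whether I am reading the file incorrectly or whether something is wrong with the decryption.
I also have a Python version, and that code works fine for me: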

real_key = key
hash_key = SHA256.new()
hash_key.update(real_key)

for filename in os.listdir(kwargs['enc_path']):
    if (re.search("\d{12}.csv.gz.enc$", filename)) and (dec_date in filename):
        with open(os.path.join(kwargs['enc_path'], filename)) as f:
            content = f.read()
            f.close()
            ct = base64.b64decode(content)
            nonce, tag = ct[:12], ct[-16:]
            cipher = AES.new(hash_key.digest(), AES.MODE_GCM, nonce)
            dec = cipher.decrypt_and_verify(ct[12:-16], tag)
            decrypted_data = gzip.decompress(dec).decode('utf-8')

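Any suggestions?
Thank you.
Update #1
I solved the problem by switching from Spark's file reading to a local file read (scala.io), adding the AAD for decryption, and applying the answers from @topaco and @blackbishop: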

cipher.init(Cipher.DECRYPT_MODE, key, spec)
cipher.updateAAD(Array[Byte]()) // the AAD may be any value, but it must exactly match the value used when the file was encrypted
cipher.doFinal(input_base64.drop(12)) // drop the nonce before decrypting
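
To illustrate the AAD requirement, here is a minimal sketch using only the JDK crypto classes. The object name `AadDemo`, the fixed demo key, and the `"header"` / `"other"` AAD values are made up for illustration and are not taken from the original file format. It encrypts once and then decrypts twice, once with the same AAD and once with a different one.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.SecureRandom
import javax.crypto.{AEADBadTagException, Cipher}
import javax.crypto.spec.{GCMParameterSpec, SecretKeySpec}
import scala.util.Try

object AadDemo {
  private val TagBits = 128 // 16-byte tag, expressed in bits for GCMParameterSpec

  // Runs AES-GCM in the given mode; `aad` is authenticated but not encrypted.
  def gcm(mode: Int, key: SecretKeySpec, nonce: Array[Byte],
          aad: Array[Byte], data: Array[Byte]): Array[Byte] = {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(mode, key, new GCMParameterSpec(TagBits, nonce))
    cipher.updateAAD(aad)
    cipher.doFinal(data)
  }

  // Hypothetical fixed key for the demo only; never hard-code a real key.
  val key = new SecretKeySpec(Array.fill[Byte](32)(7), "AES")
  val nonce: Array[Byte] = {
    val n = new Array[Byte](12)
    new SecureRandom().nextBytes(n)
    n
  }
  val aad: Array[Byte] = "header".getBytes(UTF_8)

  // In encrypt mode doFinal returns ciphertext followed by the tag.
  val sealedBytes: Array[Byte] =
    gcm(Cipher.ENCRYPT_MODE, key, nonce, aad, "hello".getBytes(UTF_8))

  // Same AAD: decryption succeeds.
  val ok: String =
    new String(gcm(Cipher.DECRYPT_MODE, key, nonce, aad, sealedBytes), UTF_8)

  // Different AAD: the tag check fails with AEADBadTagException.
  val rejected: Boolean =
    Try(gcm(Cipher.DECRYPT_MODE, key, nonce, "other".getBytes(UTF_8), sealedBytes))
      .failed.toOption.exists(_.isInstanceOf[AEADBadTagException])
}
```

An empty AAD, as in the update above, is treated the same as supplying no AAD at all, so it only matters when the encrypting side used a non-empty value.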

I am still wondering why the Spark version does not work.
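
A possible explanation for the Spark failure, offered as an assumption rather than a confirmed diagnosis: `open()` on a `PortableDataStream` returns a `java.io.DataInputStream`, and `readUTF()` only understands data written by `writeUTF()`, that is, a two-byte length followed by at most 65535 bytes of modified UTF-8. On a plain Base64 text file it would consume the first two characters as a length and return only a fragment of the content, so the tag could never match. The sketch below uses only the JDK (no Spark) and a made-up eight-character payload; the object and value names are illustrative.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import java.nio.charset.StandardCharsets.US_ASCII

object ReadUtfDemo {
  // Stand-in for a file that contains plain Base64 text with no length prefix.
  val plainFile: Array[Byte] = "QUJDRA==".getBytes(US_ASCII)

  // readUTF starts by reading an unsigned 16-bit length; here that is
  // the characters 'Q' (0x51) and 'U' (0x55) misread as a number.
  val impliedLength: Int =
    new DataInputStream(new ByteArrayInputStream(plainFile)).readUnsignedShort()

  // readUTF only round-trips data that was produced by writeUTF,
  // which prepends the two-byte length itself.
  val written: Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    new DataOutputStream(buf).writeUTF("QUJDRA==")
    buf.toByteArray
  }
  val readBack: String =
    new DataInputStream(new ByteArrayInputStream(written)).readUTF()
}
```

With Spark, `PortableDataStream.toArray()` returns the raw bytes instead. A file of about 2 GB is close to the maximum size of a JVM array, though, so that may be a separate obstacle.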

0x6upsns  #1

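Your decryption differs from the Python version. In the Scala code, the call to doFinal is given the entire decoded input, nonce included, whereas the Python code strips the nonce first.
Try the following changes: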

// specify the tag length when creating GCMParameterSpec
// (the constructor expects the length in bits, so 16 bytes becomes 128)
val spec = new GCMParameterSpec(GCM_TAG_LENGTH * 8, nonce)

// remove the nonce part from the cipher before calling doFinal
val dec = cipher.doFinal(input_base64.drop(GCM_NONCE_LENGTH))
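
The changes above can be tried end to end with a small round trip. This is a sketch under assumptions: the layout Base64(nonce || ciphertext || tag) is inferred from the Python code in the question, `java.util.Base64` stands in for the commons-codec `Base64` used there, no AAD is used, and the names `GcmLayoutDemo`, `keyFrom`, `encrypt`, `decrypt`, and `dropNonce` are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.{MessageDigest, SecureRandom}
import java.util.Base64
import javax.crypto.Cipher
import javax.crypto.spec.{GCMParameterSpec, SecretKeySpec}

object GcmLayoutDemo {
  val NonceLen = 12 // bytes
  val TagLen = 16   // bytes; GCMParameterSpec wants bits, hence TagLen * 8 below

  // SHA-256 of the passphrase gives a 32-byte AES key, as in the Python code.
  def keyFrom(passphrase: String): SecretKeySpec = {
    val digest = MessageDigest.getInstance("SHA-256").digest(passphrase.getBytes(UTF_8))
    new SecretKeySpec(digest, "AES")
  }

  // Produces Base64(nonce || ciphertext || tag), the layout assumed here.
  def encrypt(key: SecretKeySpec, plain: Array[Byte]): String = {
    val nonce = new Array[Byte](NonceLen)
    new SecureRandom().nextBytes(nonce)
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TagLen * 8, nonce))
    // In encrypt mode doFinal already appends the tag to the ciphertext.
    Base64.getEncoder.encodeToString(nonce ++ cipher.doFinal(plain))
  }

  // dropNonce = false reproduces the mistake from the question.
  def decrypt(key: SecretKeySpec, encoded: String, dropNonce: Boolean = true): Array[Byte] = {
    val blob = Base64.getDecoder.decode(encoded)
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TagLen * 8, blob.take(NonceLen)))
    // The JCE expects ciphertext and tag together, so only the nonce is removed.
    cipher.doFinal(if (dropNonce) blob.drop(NonceLen) else blob)
  }
}
```

Calling `decrypt` with `dropNonce = false` fails with the same `AEADBadTagException` as in the question, while the default call returns the original bytes. Unlike the Python API, the tag is not split off separately because `doFinal` reads it from the end of its input.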
