How to fix "TypeError: Unicode-objects must be encoded before hashing"?

小开

The error already tells you what to do. MD5 operates on bytes, so you have to encode the Unicode string into bytes, for example with line.encode('utf-8').
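
To make the failure and the fix concrete, here is a minimal sketch. The sample value of `line` is made up for illustration; in the question it would come from the wordlist file.

```python
import hashlib

line = "hello"  # hypothetical sample text; a str (Unicode) in Python 3

# Passing a str straight to hashlib raises TypeError.
try:
    hashlib.md5(line)
    raised = False
except TypeError:
    raised = True

# Encoding the str to bytes first makes the call succeed.
digest = hashlib.md5(line.encode("utf-8")).hexdigest()
print(raised, digest)
```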

小开

Accepted answer

It is probably looking for a character encoding from wordlistfile.

wordlistfile = open(wordlist,"r",encoding='utf-8')

Or, if you are working line by line:

line.encode('utf-8')

Edit

Per the comments below and this answer.

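My answer above assumes that the desired output is a str from the wordlist file. If you are comfortable working in bytes, you are better off using open(wordlist, "rb"). But keep in mind that your hash file should not be opened with rb if you are comparing it to the output of hexdigest(): hashlib.md5(value).hexdigest() returns a str, which cannot be compared directly with a bytes object ('abc' != b'abc'). (There is a lot more to this topic, but I don't have the time right now.)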
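
A small sketch of why the mode of the hash file matters. The two `control_*` values stand in for what you would get from reading the same hash in text mode and in "rb" mode; they are illustrative, not taken from the question's files.

```python
import hashlib

word = b"abc"  # a word as read from a wordlist opened with "rb"

hex_digest = hashlib.md5(word).hexdigest()  # str of 32 hex characters
raw_digest = hashlib.md5(word).digest()     # bytes, 16 raw bytes

# The same stored hash, as it would look when read in text vs. binary mode.
control_from_text = "900150983cd24fb0d6963f7d28e17f72"
control_from_binary = b"900150983cd24fb0d6963f7d28e17f72"

matches_text = hex_digest == control_from_text      # str vs. str
matches_binary = hex_digest == control_from_binary  # str vs. bytes, never equal
print(matches_text, matches_binary)
```

If the hash file has to be read in binary mode for some reason, decoding the line (for example with .decode('ascii')) before the comparison is one way around this.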

It should also be noted that this line:

line.replace("\n", "")

should probably be

line.strip()

That works for both bytes and str. But if you decide to simply convert to bytes, you can change the line to:

line.replace(b"\n", b"")
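
A quick sketch of the difference, using made-up sample lines. Note that strip() also removes leading and trailing spaces and tabs, so it is not exactly the same as removing only the newline.

```python
text_line = "password\n"     # hypothetical line read in text mode
byte_line = b"password\r\n"  # hypothetical line read in binary mode

# strip() with no arguments works on both types.
clean_text = text_line.strip()
clean_bytes = byte_line.strip()

# replace() needs arguments of the same type as the object it is called on.
try:
    byte_line.replace("\n", "")
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(clean_text, clean_bytes, mixed_ok)
```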

小开

Please have a look at that answer first.

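Now, the error message is clear: you can only use bytes, not Python strings (what used to be unicode in Python < 3), so you have to encode the string with an encoding of your choice: utf-32, utf-16, utf-8, or even one of the restricted 8-bit encodings (what some might call code pages).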
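
The choice of encoding is not cosmetic: the hash is computed over the bytes, so the same text gives a different digest under each encoding. A minimal sketch with an arbitrary sample string:

```python
import hashlib

text = "café"  # hypothetical sample containing a non-ASCII character

# Hash the same text under three different encodings.
digests = {
    enc: hashlib.md5(text.encode(enc)).hexdigest()
    for enc in ("utf-8", "utf-16", "latin-1")
}

for enc, digest in digests.items():
    print(enc, digest)
```

This is why the encoding used when hashing has to match the one used when the stored hashes were produced.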

The bytes in your wordlist file are automatically decoded to Unicode by Python 3 as you read from the file. I suggest you do:

m.update(line.encode(wordlistfile.encoding))

so that the data pushed to the md5 algorithm is encoded exactly like the underlying file.
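
A sketch of that round trip, using a temporary file written in latin-1 as a stand-in for the wordlist (the file content and the encoding are assumptions for the demo). It only holds when the file decodes cleanly and the line endings are not translated on read.

```python
import hashlib
import os
import tempfile

raw = "café\n".encode("latin-1")  # bytes exactly as stored on disk

# Write the raw bytes to a temporary stand-in for the wordlist file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    with open(path, "r", encoding="latin-1") as wordlistfile:
        line = wordlistfile.readline()
        # Re-encode with the encoding the file object was opened with.
        digest_from_text = hashlib.md5(
            line.encode(wordlistfile.encoding)
        ).hexdigest()
finally:
    os.remove(path)

digest_from_bytes = hashlib.md5(raw).hexdigest()
print(digest_from_text == digest_from_bytes)
```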

小开

You have to specify an encoding format such as utf-8. Try this simple approach.

This example hashes a random number with the SHA-256 algorithm:

>>> import hashlib
>>> import random
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'

小开

You can open the file in binary mode:

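import hashlib


# hash_file and wordlist are assumed to be defined earlier, as in the question
with open(hash_file) as file:
    control_hash = file.readline().rstrip("\n")


wordlistfile = open(wordlist, "rb")
# ...
for line in wordlistfile:
    # line is bytes here, so it can be hashed without encoding
    if hashlib.md5(line.rstrip(b'\n\r')).hexdigest() == control_hash:
        # collision
        print(line.rstrip(b'\n\r'))
        break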
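
For something that runs without the question's files, the same loop can be sketched as a small function. The name `find_word` and the in-memory wordlist are hypothetical; `io.BytesIO` stands in for a file opened with "rb".

```python
import hashlib
import io


def find_word(wordlist_lines, control_hash):
    """Return the first word (bytes) whose MD5 hex digest equals control_hash."""
    for line in wordlist_lines:
        word = line.rstrip(b"\r\n")  # drop the line ending, keep the bytes
        if hashlib.md5(word).hexdigest() == control_hash:
            return word
    return None


# In-memory stand-in for a wordlist opened in binary mode.
wordlist = io.BytesIO(b"apple\r\nhello\nzebra\n")
found = find_word(wordlist, "5d41402abc4b2a76b9719d911017c592")
print(found)
```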

小开

Storing a password (PY3):

import hashlib, os
password_salt = os.urandom(32).hex()
password = '12345'


hash = hashlib.sha512()
hash.update(('%s%s' % (password_salt, password)).encode('utf-8'))
password_hash = hash.hexdigest()
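
Checking a login attempt against the stored values could then look like the sketch below. The helper names `hash_password` and `verify_password` are made up for illustration; the hashing scheme is the same salt-plus-password SHA-512 as above.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Hash salt + password with SHA-512, as in the snippet above."""
    return hashlib.sha512((salt + password).encode("utf-8")).hexdigest()


def verify_password(candidate, salt, stored_hash):
    """Recompute the hash for a candidate and compare in constant time."""
    return hmac.compare_digest(hash_password(candidate, salt), stored_hash)


password_salt = os.urandom(32).hex()
password_hash = hash_password("12345", password_salt)

print(verify_password("12345", password_salt, password_hash))
print(verify_password("wrong", password_salt, password_hash))
```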

小开

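This program is a corrected and enhanced version of the MD5 cracker above. It reads a file containing a list of hashed passwords and checks each one against the hashed words from an English dictionary word list. Hope it helps.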

I downloaded the English dictionary from this link: https://github.com/dwyl/english-words

# md5cracker.py
# English Dictionary https://github.com/dwyl/english-words


import hashlib, sys


# Raw strings keep the backslashes in these Windows paths literal
hash_file = r'exercise\hashed.txt'
wordlist = r'data_sets\english_dictionary\words.txt'


try:
    hashdocument = open(hash_file,'r')
except IOError:
    print('Invalid file.')
    sys.exit()
else:
    count = 0
    for hash in hashdocument:
        hash = hash.rstrip('\n')
        print(hash)
        i = 0
        # The wordlist is re-read from the start for every hash
        with open(wordlist,'r') as wordlistfile:
            for word in wordlistfile:
                m = hashlib.md5()
                word = word.rstrip('\n')
                # md5 needs bytes, so encode the str first
                m.update(word.encode('utf-8'))
                word_hash = m.hexdigest()
                if word_hash==hash:
                    print('The word, hash combination is ' + word + ',' + hash)
                    count += 1
                    break
                i += 1
        print('Iteration is ' + str(i))
    if count == 0:
        print('The hash given does not correspond to any supplied word in the wordlist.')
    else:
        print('Total passwords identified is: ' + str(count))
sys.exit()
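
The program above re-reads the wordlist once per hash. As a possible variation (not the author's code), the hashes can be collected in a set so the wordlist is scanned only once. The function name `crack` and the sample inputs are hypothetical; in practice the two iterables would be the stripped lines of the hash file and the wordlist.

```python
import hashlib


def crack(hashes, words):
    """Map each target MD5 hex digest to the first word that produces it."""
    targets = set(hashes)
    found = {}
    for word in words:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        if digest in targets and digest not in found:
            found[digest] = word
    return found


# Made-up inputs: one hash that is in the wordlist and one that is not.
results = crack(
    ["5d41402abc4b2a76b9719d911017c592", "0" * 32],
    ["apple", "hello", "zebra"],
)
print("Total passwords identified is:", len(results))
```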

小开

import hashlib
string_to_hash = '123'
hash_object = hashlib.sha256(str(string_to_hash).encode('utf-8'))
print('Hash', hash_object.hexdigest())

小开

Encoding this line fixed the problem for me.

m.update(line.encode('utf-8'))

小开

If it is a single-line string literal, prefix it with b or B.

variable = b"This is a variable"

or

variable2 = B"This is also a variable"
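
A short sketch of how this ties back to hashing. The b prefix only works on literals written in the source code and only with ASCII characters; for text held in a variable, or text with non-ASCII characters, .encode() is still needed. The sample strings are made up.

```python
import hashlib

literal = b"This is a variable"                   # bytes literal
from_str = "This is a variable".encode("ascii")   # same bytes via encode()

same = literal == from_str
digest = hashlib.md5(literal).hexdigest()  # no encode step needed for bytes

# Non-ASCII text cannot be written as a bytes literal; encode it instead.
text = "naïve"
encoded = text.encode("utf-8")
print(same, digest, encoded)
```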