snappy / redis-py cluster

s4n0splo posted on 2021-06-09 in Redis

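I am writing a cron script in Python for a Redis cluster, using redis-py-cluster to read data (read-only) from the prod server. A separate Java application writes to the Redis cluster using Snappy compression and the Java string codec with UTF-8.
I can read the data, but I cannot decode it.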

from rediscluster import RedisCluster
import snappy

host, port ="127.0.0.1", "30001"
startup_nodes = [{"host": host, "port": port}]
print("Trying connecting to redis cluster host=" + host +  ", port=" + str(port))

rc = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)
print("Connected",  rc)

print("Reading all keys, value ...\n\n")
for key in rc.scan_iter("uidx:*"):
   value = rc.get(key)
   #uncompress = snappy.uncompress(value, decoding="utf-8")
   print(key, value)
   print('\n')

print("Done. exit()")
exit()
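With `decode_responses=False` this works fine as long as the snappy line stays commented out. Switching to `decode_responses=True`, however, throws an error. My guess is that it cannot find the right decoder.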

Traceback (most recent call last):
  File "splooks_cron.py", line 22, in <module>
    print(key, rc.get(key))
  File "/Library/Python/2.7/site-packages/redis/client.py", line 1207, in get
    return self.execute_command('GET', name)
  File "/Library/Python/2.7/site-packages/rediscluster/utils.py", line 101, in inner
    return func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/rediscluster/client.py", line 410, in execute_command
    return self.parse_response(r, command, **kwargs)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 768, in parse_response
    response = connection.read_response()
  File "/Library/Python/2.7/site-packages/redis/connection.py", line 636, in read_response
    raise e
UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 0: invalid start byte
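The message itself can be reproduced without Redis. The byte 0x82 has the bit pattern 10xxxxxx, which UTF-8 reserves for continuation bytes, so it can never start a character. The following is a minimal sketch; `try_utf8` is a hypothetical helper, and the sample bytes are made up apart from the leading 0x82 taken from the traceback.

```python
def try_utf8(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Made-up binary value; only the leading 0x82 comes from the traceback.
binary_value = b"\x82\x01\x02"
# A plain text value, encoded the way a UTF-8 string codec would store it.
text_value = u"uidx:1".encode("utf-8")

# 0x82 & 0xC0 == 0x80 marks a UTF-8 continuation byte, invalid in position 0.
print(hex(0x82 & 0xC0))
print(try_utf8(binary_value))
print(try_utf8(text_value))
```

This suggests why `decode_responses=True` fails here: the client tries to decode every reply as text, while a compressed value is binary. With `decode_responses=False` the raw bytes are returned unchanged.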

PS: uncommenting the line `uncompress = snappy.uncompress(value, decoding="utf-8")` breaks with this error:

Traceback (most recent call last):
  File "splooks_cron.py", line 27, in <module>
    uncompress = snappy.uncompress(value, decoding="utf-8")
  File "/Library/Python/2.7/site-packages/snappy/snappy.py", line 91, in uncompress
    return _uncompress(data).decode(decoding)
snappy.UncompressError: Error while decompressing: invalid input

cig3rfwq #1

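After a few hours of debugging, I finally solved this.
The Java code that writes to the Redis cluster uses the xerial/snappy-java compressor. The interesting part is that, during compression, SnappyOutputStream adds some offset bytes at the beginning of the compressed data. In my case it looked like this: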

"\x82SNAPPY\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x01\xb6\x8b\x06\\******actual data here*****

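Because of this, the decompressor could not make sense of the value. I changed the code as follows to strip the offset from the value, and now it works fine.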

for key in rc.scan_iter("uidx:*"):
   value = rc.get(key)
   # In my case the offset was 20 bytes, so skip them before decompressing.
   # decompress() takes an optional decoding argument; see
   # https://github.com/andrix/python-snappy/blob/master/snappy/snappy.py
   uncompress_value = snappy.decompress(value[20:])
   print(key, uncompress_value)
   print('\n')
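The fixed slice `value[20:]` works when a value holds a single compressed block. As a hedged sketch, the prefix can instead be parsed. This assumes the stream layout I understand snappy-java's SnappyOutputStream to use (not stated in the post): an 8-byte magic, two 4-byte big-endian version fields, then repeated blocks, each preceded by a 4-byte big-endian length. That would account for the 20 bytes above: 16 bytes of header plus the first block length. The function names are hypothetical, and the decompressor is passed in as a parameter so the parsing stays independent of any particular snappy binding.

```python
import struct

# Assumed snappy-java stream layout (an assumption, not from the post):
# [8-byte magic][4-byte version][4-byte compatible version]
# followed by repeated [4-byte big-endian length][compressed block].
MAGIC = b"\x82SNAPPY\x00"
HEADER_SIZE = len(MAGIC) + 8


def iter_snappy_java_blocks(data):
    """Yield each raw compressed block found after the stream header."""
    if not data.startswith(MAGIC):
        raise ValueError("missing snappy-java stream header")
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated stream header")
    pos = HEADER_SIZE
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated block length")
        (size,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if size < 0 or pos + size > len(data):
            raise ValueError("invalid block length")
        yield data[pos:pos + size]
        pos += size


def decompress_snappy_java(data, decompress):
    """Decompress every block with the given callable and join the results."""
    return b"".join(decompress(block) for block in iter_snappy_java_blocks(data))
```

In the loop above this would be called as `decompress_snappy_java(value, snappy.decompress)`, with the client created using `decode_responses=False` so that `value` is raw bytes. The sketch does not handle concatenated streams or other variations of the format.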
