HBase, MongoDB, or Cassandra

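I want to build a distributed (cross-continent), fault-tolerant, and fast store for images and files. A REST endpoint will sit in front of the store and serve the images and/or files.
Images and files are stored/inserted from a central location, but served by locally installed intranet servers, which authenticate and authorize users.
An object can have multiple sizes of the same image, and possibly files related to it. Using one of the stores above would let me select a column family and/or column qualifier to fetch the requested entity.
I did consider a file system, but to retrieve the requested entity I would need to look up the correct path in a database, or the paths would have to be designed intelligently. It would also mean creating folders at the start of each new year.
An entity can have different sizes (thumbnail, grid, preview, etc.) in different years.
A request to get an image would look like: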

entityId  123
year      2017 
size      thumbnail

A request to get all available images of a given entity for one year would look like:

entityId  123
year      2017

I am open to any other storage solution as long as the above is achievable. Thanks for your help and suggestions.

You can build a file-system table as you suggested, like this:

cqlsh> use keyspace1;
cqlsh:keyspace1> create table filesystem(
             ...   entitiyId int,
             ...   year int,
             ...   size text,
             ...   payload blob,
             ...   primary key (entitiyId, year, size));
cqlsh:keyspace1> insert into filesystem (entitiyId, year, size, payload) values (1,2017,'small',textAsBlob('payload'));
cqlsh:keyspace1> insert into filesystem (entitiyId, year, size, payload) values (1,2017,'big',textAsBlob('payload'));
cqlsh:keyspace1> insert into filesystem (entitiyId, year, size, payload) values (1,2016,'small',textAsBlob('payload'));
cqlsh:keyspace1> insert into filesystem (entitiyId, year, size, payload) values (1,2016,'big',textAsBlob('payload'));
cqlsh:keyspace1> insert into filesystem (entitiyId, year, size, payload) values (2,2016,'small',textAsBlob('payload'));
cqlsh:keyspace1>
cqlsh:keyspace1>
cqlsh:keyspace1> select * from filesystem where entitiyId=1 and year=2016;

 entitiyid | year | size  | payload
-----------+------+-------+------------------
         1 | 2016 |   big | 0x7061796c6f6164
         1 | 2016 | small | 0x7061796c6f6164

(2 rows)
cqlsh:keyspace1>

and

cqlsh:keyspace1> select * from filesystem where entitiyId=1 and year=2016 and size='small';

 entitiyid | year | size  | payload
-----------+------+-------+------------------
         1 | 2016 | small | 0x7061796c6f6164

(1 rows)
cqlsh:keyspace1>

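What you cannot do with this approach is select images with a particular size and id without specifying the year, because year comes before size in the clustering order of the primary key.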
For the related files, you could keep a list of foreign entity ids, or use a separate grouping table to hold them together.
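Note that the Cassandra blob type has a theoretical limit of 2 GB, but if you need performance the practical limit is about 1 MB, in rare cases a few MB (performance degrades in many ways as blobs get larger). If that is fine for you, go for it.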
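If some files turn out to be larger than that, one common workaround is to split each payload into pieces and store every piece as its own row. The sketch below only shows the splitting and reassembly; the names `CHUNK_SIZE`, `split_payload`, and `join_chunks` are hypothetical, and a table that adds a chunk index to the key is an assumption, not something from the schema above.

```python
from typing import List, Tuple

# Assumed target size per blob, taken from the ~1 MB guidance above.
CHUNK_SIZE = 1024 * 1024


def split_payload(payload: bytes, chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, bytes]]:
    """Return (chunk_index, chunk_bytes) pairs that together cover the payload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (index, payload[offset:offset + chunk_size])
        for index, offset in enumerate(range(0, len(payload), chunk_size))
    ]


def join_chunks(chunks: List[Tuple[int, bytes]]) -> bytes:
    """Reassemble the payload, ordering the pieces by chunk_index."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(data for _, data in ordered)
```

Each pair could then be written as one row, with the chunk index as an extra clustering column, so that a single partition read returns the pieces in order.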
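Another idea is to use something like AWS S3 to store the actual data, with cross-region replication enabled, and use Cassandra for the metadata. If you go with AWS, they also have EFS with cross-region replication.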
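With an object store, the lookup by entity, year, and size can be encoded directly in the object key. This is a minimal sketch of one possible key layout; the `entities/` prefix and the function names are hypothetical, not part of any AWS API.

```python
def object_key(entity_id: int, year: int, size: str) -> str:
    """Key for one image variant, e.g. a thumbnail of entity 123 in 2017."""
    return f"entities/{entity_id}/{year}/{size}"


def year_prefix(entity_id: int, year: int) -> str:
    """Common prefix of every variant stored for one entity and year.

    The trailing slash keeps entity 123 from matching entity 1234.
    """
    return f"entities/{entity_id}/{year}/"
```

The first function covers the single-image request, and the second could be passed as the prefix of a list-objects call to cover the "all images for a year" request.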
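MongoDB can also be deployed easily with cross-region replication (https://docs.mongodb.com/manual/tutorial/deploy-geographically-distributed-replica-set/). In MongoDB you can keep all the data in one document and query only the relevant parts of it. In my opinion, MongoDB requires more housekeeping than Cassandra (more configuration and planning).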
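As a rough illustration of the single-document idea, the sketch below assumes a hypothetical document shape with images nested by year and then by size, and builds the filter and projection that select only the requested part. The field names and the helper `image_query` are assumptions. Keep in mind that a MongoDB document is limited to 16 MB, so this only suits small images.

```python
# Hypothetical document shape: one document per entity,
# images nested by year, then by size name.
example_document = {
    "_id": 123,
    "images": {
        "2017": {"thumbnail": b"<bytes>", "preview": b"<bytes>"},
        "2016": {"thumbnail": b"<bytes>"},
    },
}


def image_query(entity_id, year, size=None):
    """Build (filter, projection) for one size, or for all sizes of a year."""
    path = f"images.{year}" if size is None else f"images.{year}.{size}"
    return {"_id": entity_id}, {path: 1}
```

With pymongo, the two dictionaries would be passed as `collection.find_one(filter, projection)`, so only the requested year or size is returned rather than the whole document.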
