That depends a lot on what you optimize for, how cost is allocated, and whether you really account for all of the costs.
A centralized master (aka cloud storage, aka LucidLink) has advantages in simplicity of deployment: critical, high-performance infrastructure sits in one place where it is highly leveraged and highly utilized, and its cost can be spread among many users. Assuming proper deployment and redundancy, uptime can meet high standards. The required infrastructure is procured and managed at industrial scale and volume pricing rather than at consumer/small-business rates.
If it weren’t for the cost, I don’t think a lot of people would complain about these solutions in terms of features and usability. One of the key pain points of cloud storage isn’t even the actual storage cost but the egress fees.
I’m curious how much of the egress fees covers real infrastructure cost and how much is markup that sticks because egress is non-optional in most solutions. Egress fees are also not universal to all cloud apps.
AWS reported an operating margin of 38% in the last quarter, very different from the razor-thin numbers in retail or hospitality. That’s an indication there’s plenty of milking they get away with.
A P2P model on the surface has a cost advantage over cloud, primarily because most of us don’t use metered connections and are playing the averages when it comes to our actual bandwidth usage.
It works decently with a small number of nodes, and is very good when there are just two. But the more nodes you have, the more data has to be exchanged between them, and a node may be taxed with up to n times the data volume rather than just one transfer to the cloud. Individual nodes also often have less powerful infrastructure, which can create bottlenecks.
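To put rough numbers on that fan-out, here is a minimal sketch. It assumes the worst case where the node that made a change uploads it directly to every other peer, with no relaying or swarming between peers. The function name and parameters are made up for illustration and are not taken from any actual tool.

```python
def upload_volume_gb(change_gb: float, nodes: int) -> dict:
    """Upload volume the originating node pays to publish one change.

    Assumption (hypothetical worst case): in P2P the originator sends the
    change straight to each of the other peers; in the cloud model it
    uploads once and everyone else pulls from the central store.
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to share anything")
    peers = nodes - 1  # everyone except the originator
    return {
        "cloud": change_gb,                   # one transfer to the cloud
        "p2p_worst_case": change_gb * peers,  # one transfer per peer
    }


if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, upload_volume_gb(10.0, n))
```

With two nodes both models cost the originator the same upload, which matches the point that P2P is very good for just two nodes. Real P2P tools can spread the load by letting peers forward to each other, so treat this as an upper bound.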
Also, in a P2P environment the nodes have to overlap in uptime to facilitate transfers, whereas a central cloud can function asynchronously. In a large network, individual nodes may have to upgrade their specs to meet performance demands, yet that incremental hardware investment has a pretty low load factor and is thus not a good investment.
It’s also harder and more complicated to enforce locking for conflict avoidance in a P2P graph than in a centralized repository, which is why Resilio and other P2P sync tools fall short.
While P2P node graphs have inherent redundancy, it comes with significant complexity in managing coherence and recovery.
More specific to our use models: with LucidLink each client only maintains a fraction of the data, just the files you cached or pinned. Only the central repository maintains a complete set with the required redundancy. Doing the same is harder in a P2P network. Either every node maintains the full set, which explodes storage capacity and cost, or each node only maintains its required working set, in which case a lot of complexity goes into finding which node has which part of the data and, even worse, into making sure at least 2 or 3 nodes maintain a copy of any given file for redundancy. And if one node is low on storage space, it can’t just purge its cache unless another node already has a copy. It quickly gets out of hand. And if not all nodes are up 24/7, a file that exists in the network may not be available when you need it.
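A back-of-the-envelope sketch of those two trade-offs, storage versus availability. It assumes every file gets the same number of copies and that node uptimes are independent of each other, which is optimistic since most people work similar hours. All names here are hypothetical.

```python
from typing import Optional


def total_storage_tb(dataset_tb: float, nodes: int,
                     replicas: Optional[int] = None) -> float:
    """Total storage across the whole P2P network.

    replicas=None means every node keeps the full set;
    otherwise each file is kept on exactly `replicas` nodes.
    """
    copies = nodes if replicas is None else replicas
    if copies < 1 or copies > nodes:
        raise ValueError("replicas must be between 1 and the node count")
    return dataset_tb * copies


def file_availability(node_uptime: float, replicas: int) -> float:
    """Chance that at least one node holding the file is online.

    Assumes independent uptimes (an idealization): the file is
    unavailable only if all replica holders are down at once.
    """
    return 1.0 - (1.0 - node_uptime) ** replicas


if __name__ == "__main__":
    print(total_storage_tb(20, 10))      # every node holds everything
    print(total_storage_tb(20, 10, 3))   # three copies of each file
    print(file_availability(8 / 24, 3))  # nodes up 8 hours a day
```

Under these assumptions, three copies on nodes that are each up a third of the day leave a file reachable only about 70% of the time, so cutting storage with a working-set model pushes you toward either more replicas or nodes that stay up around the clock.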
So if you added up all the costs of both options, with all infrastructure actually priced at cost (no beneficial averaging and no milky markups), and added the power consumption of all the nodes that have to remain up beyond work hours, I’m not sure P2P would come out ahead. That’s before considering the internal IT deployment and support cost - even if it’s your free weekend time.
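The power part of that sum is easy to estimate. A minimal sketch, assuming each node draws a constant wattage and would otherwise be switched off outside work hours. The function and its inputs are hypothetical placeholders, so plug in your own numbers.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def extra_always_on_kwh(nodes: int, watts_per_node: float,
                        work_hours_per_year: float) -> float:
    """Extra energy per year from keeping nodes up outside work hours.

    Assumption: constant power draw per node, and the node would be
    powered down for every hour that is not a work hour.
    """
    if not 0 <= work_hours_per_year <= HOURS_PER_YEAR:
        raise ValueError("work hours must fit within one year")
    idle_hours = HOURS_PER_YEAR - work_hours_per_year
    return nodes * watts_per_node * idle_hours / 1000.0


if __name__ == "__main__":
    # Hypothetical example: 5 workstations at 100 W, 2000 work hours a year.
    print(extra_always_on_kwh(5, 100, 2000))
```

Multiply the result by your local electricity rate to get a yearly figure you can set against cloud storage and egress charges.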
Add to that the complexity of deployment, and that there is no service provider who maintains the system or can provide support for users.
So P2P may be attractive for the Finns and Phils of the world, depending on circumstances. But for most everyone else a central cloud tool is the better choice, as evidenced by the rapid adoption of LucidLink and similar tools. Now we just need them to be good partners instead of greedy startups with ambitions.
I think P2P has a deceptive allure because we’re attracted to solutions that keep us in charge of our destiny and allow us to compensate with elbow grease for hard costs in a volatile market. We generally suck at valuing and accounting for our time properly, because we love what we do and have always lived on the leading edge of what can be done.
All these considerations assume that you work with high-value data that needs to be maintained and available at all times, and where the same file may be modified by more than one node, as opposed to a pool where individual nodes just contribute entries that become read-only once they exist. It’s a different story in opportunistic file sharing like the early Napster days, or other resource channels. There you benefit when it works, but don’t fail if data becomes unavailable.