Hypothetical: What would happen if every user in the fediverse hosted their own server?

Lucien@beehaw.org · 2 years ago

Hypothetical: What would happen if every user in the fediverse hosted their own server?

shortwavesurfer@monero.house · 2 years ago

I don't think so. Take the !technology@beehaw.org community as an example. It can have, say, 1000 subscribers from lemmy.ml but only needs to send content to lemmy.ml once as it comes in. All 1000 subscribers see the cached copy from lemmy.ml, and a message is only sent back to beehaw.org for comments, votes, etc. With everyone having their own instance, beehaw.org would have to send updates to each one instead of sending an update to one instance and 1000 users seeing it. A good level to strive for is many small communities of, say, a few thousand users (1-5 thousand or so). That way no single server gets too massive, but federation requests aren't overwhelming instances either.
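To make the fan-out difference concrete, here is a minimal Python sketch. It assumes one delivery per remote instance that has at least one subscriber; the function name and the `userN.example` hostnames are made up for illustration and are not taken from Lemmy's code.

```python
def deliveries_per_activity(subscribers_by_instance):
    """Count outgoing deliveries: one per instance with at least one subscriber."""
    return sum(1 for count in subscribers_by_instance.values() if count > 0)

# 1000 subscribers who all live on lemmy.ml
shared = {"lemmy.ml": 1000}

# The same 1000 subscribers, each on a hypothetical single-user instance
solo = {f"user{i}.example": 1 for i in range(1000)}

print(deliveries_per_activity(shared))  # 1
print(deliveries_per_activity(solo))    # 1000
```

Same audience, but the community's server does a thousand times the sending work per post, comment, or vote.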

Lucien@beehaw.org · 2 years ago

So let’s say we want to scale up to several million users - what would that look like?

shortwavesurfer@monero.house · 2 years ago

Well, if we wanted say 50 million users at 5000 users per instance we would have 10000 instances. If we wanted 1 billion users we would have 200 thousand instances
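The arithmetic is easy to check with a couple of lines of Python. This is only a sketch; `instances_needed` is a made-up helper, and it rounds up so a partly filled instance still counts.

```python
def instances_needed(total_users, users_per_instance):
    """Ceiling division: a partly filled instance still counts as one."""
    return -(-total_users // users_per_instance)

print(instances_needed(50_000_000, 5000))     # 10000
print(instances_needed(1_000_000_000, 5000))  # 200000
```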

learning2Draw@lemmy.ca · 2 years ago

Possibly failure, because setup isn’t just a simple out-of-the-box plop. And I can’t see how pings from 5000 microservers are better than 5000 users looking to register. But that’s more of a question than an informed opinion.

manitcor@lemmy.intai.tech · 2 years ago

That Ansible playbook works great; it's just a bash script away from regular-user DIY.

I’ve watched people who never used a computer install blockchain nodes and miners (including the networks). If someone wants to do it, they WILL figure it out.

learning2Draw@lemmy.ca · 2 years ago

Sure, I’m not saying they won’t. I’m saying there aren’t that many people who ‘want’ to go beyond the effort of clicking install.

manitcor@lemmy.intai.tech · 2 years ago

My point is mainly that we are that close already. The Ansible setup already boils it down to the bare minimum. It's down to platform testing and building an installer.

Lucien@beehaw.org · 2 years ago

Next step is to spin up a cloud service which does all that for you, leaving you to just input a credit card and configure DNS correctly.

SSUPII@sopuli.xyz · 2 years ago

You also have to account for another type of “ping” if a user lives in a cave 300 meters below sea level.

Lucien@beehaw.org · 2 years ago

Maybe I should clarify with “each user successfully spun up…” I’m mostly curious if the 5000 microservers trying to federate is a more sustainable access pattern than 5000 users hitting the website.

Since federation is an async process, it can be optimized on both ends in a way that user browser requests cannot.

At the same time, federation would overall result in more bandwidth being used because not every user wants to view every post in the frontend.

rysiek@szmer.info · 2 years ago

Maybe I should clarify with “each user successfully spun up…” I’m mostly curious if the 5000 microservers trying to federate is a more sustainable access pattern than 5000 users hitting the website.

Sustainable in what sense?

It’s way more sustainable in the sense of “one website is not controlling the entirety of the experience of a given type of service for 5000 users”, for example. I think it’s important to talk about specific kinds of sustainability, and specific threats to it.

Things to consider (apart from bandwidth-related considerations):

technical knowledge necessary to safely and securely run and maintain a service
space, time, and resources (including financial) to do so
ability, willingness, and energy to moderate a service (this is where Big Tech platforms are falling flat on their faces, for example, and where smaller fedi communities work pretty damn well)

NightAuthor@beehaw.org · 2 years ago

But instance federation is an async process that is happening constantly. A user on your instance may be a realtime load, but it's only sporadic (on a per-user basis). Basically, me spinning up an instance is a constant burden on the network, but me browsing is just a temporary load on a single server.

My understanding is that the best situation is a good number of powerful machines running instances, with users evenly distributed amongst them.

rektifier@sh.itjust.works · 2 years ago

What you’re describing is no longer federation but full P2P. From a purely technical point of view, it may work, but the biggest problem will be abuse (spam, excessive resource use, illegal content). When a new instance shows up, how do you know if it’s a spammer or not? And if an instance is blocked by another instance, whose side should you be on?

Lucien@beehaw.org · 2 years ago

It wouldn't really be full P2P: I’d expect moderated communities to act as a funnel through which everyone interacts with each other. I wasn’t really considering the hypothetical micro instances to be like a normal server, since even when federated it's unlikely that they would consume as much federation bandwidth as a large instance. Most people wouldn’t run a community, simply because they don’t want to moderate it.

Realistically, the abuse problems you mention can already currently happen if someone wants to. It’s easier to make an account on an existing server with a fresh email, spam a bit, and get banned than it is to register a new domain ($) and federate before doing the same. I think social networks would have a lot less spam if every time you wanted to send an abusive message, you had to spend $10 to burn a domain name.

Most of the content would still live on larger servers, so you end up moderating in the same place. Not much difference between banning an abusive user on your instance and banning an abusive single-user instance.

CanadaPlus@lemmy.sdf.org · 2 years ago

I hadn’t even thought of the moderating yet.

bdonvr@lemmy.rogers-net.com · 2 years ago

The way ActivityPub works is that each community has a list of every server that has at least one subscriber to that community.

Every time someone does something in that community, the community sends all those servers a message that tells them what just happened.

So instead of informing a few hundred servers of your one upvote on a post, it would have to inform basically every user (every user’s server).

It would be bad, it’s not designed to do that.

Lucien@beehaw.org · 2 years ago

So you’re saying that there’s a sweet spot between the number of servers being federated and the number of users per server. I wonder what the optimal network distribution would look like.

manitcor@lemmy.intai.tech · 2 years ago

Not a great range, but I'm going to guess between 1,000 and 10,000 users per node.

This is usually the point where midrange servers can be used successfully and the operation is manageable by normal people. This also groups people enough that they aren’t spamming the network with more requests than necessary to sync with their friends.

pe1uca · 2 years ago

I’m running my personal instance, I haven’t had any issue interacting.
AFAIK it would help spread the load since my instance just asks/receives the activity once from other instances and then aggregates everything locally.

So each time I access a post I need to ask: How many upvotes does it have? How many comments? Which ones are new? How many upvotes does each of those comments have? Which ones are replies to others? Also, get me the pfp of each user.
If I change the sorting, either on the main feed or on the comments in a post, I need to ask in what order they should be displayed.

All of these queries are done only in my own instance with my instance’s DB.

In this case beehaw.org just sends “Hey this post got an upvote”, and my instance figures out how it would affect the rest of the posts in my feed.
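A toy Python sketch of that flow, under the assumption that an upvote arrives as an ActivityPub-style `Like` activity. The in-memory `Counter` stands in for the instance's database, and the function names are invented for illustration; this is not Lemmy's actual code.

```python
from collections import Counter

local_scores = Counter()  # stands in for the instance's own database

def receive(activity):
    """Apply an incoming activity to local state; only 'Like' is handled here."""
    if activity["type"] == "Like":
        local_scores[activity["object"]] += 1

def feed_by_score():
    """Sorting is answered from local data, with no request to the origin."""
    return [post for post, _ in local_scores.most_common()]

# The remote community pushes each upvote once...
for post in ["post/1", "post/2", "post/2"]:
    receive({"type": "Like", "object": post})

# ...and every later read or re-sort is purely local.
print(feed_by_score())  # ['post/2', 'post/1']
```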

Also, right now lemmy.ml is struggling with all the new users: it takes a while to refresh the page and get any update. With my instance I can keep scrolling and reading the data it already got from lemmy.ml or any other instance.

mrmanager@lemmy.today · edit-2 2 years ago

I’m also running my own instance with very few users, and everything is really fast because I’m not on the same instance as all those users.

I guess with time, people will understand what I’m talking about. There are no downsides to using an instance with a low amount of users when you have federated technology.

I would love to get more users actually, just to help spread the load and to provide a good experience.

nii236@lemmy.jtmn.dev · 2 years ago

Same, but the communities will always congregate around one or two big instances because there’s no point building a tiny community with only a few people on your own instance.

What ends up happening is these micro deployments end up just being an identity server and caching layer.

towerful@beehaw.org · 2 years ago

I think there is a tipping point somewhere.

I think the connection calc is n * (n - 1) / 2 (at least, that’s what it is for mesh networks), so 1000 single-user servers would mean ~500k connections across the network, with each server talking to 999 peers.
That would be for 1000 users.
Those connections might be more lightweight, but there are significantly more of them (might even run into OS issues with that many open connections).

If each server was handling 50 users, those 1000 users would fit on 20 servers and the mesh connections would then be 190.
50 users should be a blip wrt server load, and 190 mesh connections is far more manageable.
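As a rough check of that arithmetic, here is a small Python sketch. It assumes a full mesh where every server links to every other one, and the function names are made up for illustration. Note that n * (n - 1) / 2 counts links across the whole network, while any single server only has n - 1 peers.

```python
def mesh_links(servers):
    """Total links in a full mesh of `servers` nodes."""
    return servers * (servers - 1) // 2

def servers_for(users, users_per_server):
    """Servers needed to hold `users`, rounding up."""
    return -(-users // users_per_server)

# 1000 users, one server each
print(mesh_links(1000))          # 499500 links in total, 999 peers per server

# 1000 users, 50 per server
n = servers_for(1000, 50)
print(n, mesh_links(n))          # 20 190

# For comparison, a mesh of 50 servers
print(mesh_links(50))            # 1225
```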

Lucien@beehaw.org · 2 years ago

At the same time, those graph connections don’t need to be persistent network connections. You could easily cycle through connected nodes and batch update events without issue, and in that case, the primary constraint is bandwidth to the connected graph, not network connections.
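A minimal sketch of what that batching could look like, in Python. This only illustrates the idea and is not how Lemmy actually delivers activities; `batch_by_peer` and the event strings are hypothetical.

```python
from collections import defaultdict

def batch_by_peer(pending):
    """Group queued (peer, event) pairs so each peer gets one combined request."""
    batches = defaultdict(list)
    for peer, event in pending:
        batches[peer].append(event)
    return dict(batches)

pending = [
    ("lemmy.ml", "upvote on post 1"),
    ("sopuli.xyz", "comment on post 1"),
    ("lemmy.ml", "comment on post 2"),
]

for peer, events in batch_by_peer(pending).items():
    # A real sender would issue one network request here; printing stands in for it.
    print(f"{peer}: {len(events)} event(s) in one batch")
```

With this shape, the number of requests per cycle is bounded by the number of peers rather than the number of events, which is why bandwidth becomes the limit instead of connection count.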

andrew@radiation.party · 2 years ago

It’s a similar concept to email, so I would imagine there will always be big players who will have a reputation of trustworthiness/reliability.

The whole concept here seems to favor spinning up your own “cache” instance between you and the content you want (similar to how old email clients worked, downloading emails from the mail server and never live-fetching them), which is fabulous for distributing the load. Discovery takes a back seat when doing that, but it’s still pretty doable.

Lucien@beehaw.org · edit-2 2 years ago

I think the main difference between fediverse and email WRT cache instances is that if you create a cache instance for email, you’re only caching your personal emails. If you create a cache instance for a lemmy community, you’re caching every event on the community.

My intuition says there’s probably a breakpoint in community size where the cost of federating all events to the users who subscribe to them becomes greater than the cost of individually serving API requests to them on demand. Primarily because you’ll be caching a far greater amount of content than you actually consume, unlike with email.

Edit: That said, scaling out async work queues is a heck of a lot easier than scaling out web servers and databases. That fact alone might skew the breakpoint far enough that only communities with millions of subscribers see a flip in the cost equation…
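One way to play with that breakpoint is a toy cost model in Python. Everything here is an assumption for illustration: a single subscriber on their own instance, a per-event delivery cost for federation, a higher per-request cost for serving the API on demand, and a cap on how many items the user actually views per day. The numbers are invented, not measured.

```python
def push_cost(events_per_day, cost_per_delivery):
    """Federation: every community event is delivered, viewed or not."""
    return events_per_day * cost_per_delivery

def pull_cost(events_per_day, views_per_day, cost_per_request):
    """On demand: only what the user actually views is served."""
    return min(events_per_day, views_per_day) * cost_per_request

def breakeven_events(views_per_day, cost_per_delivery, cost_per_request):
    """Event volume above which pushing everything costs more than serving views."""
    return views_per_day * cost_per_request / cost_per_delivery

# Hypothetical: the user views 200 items a day, and an API request
# costs 10x what an async delivery does.
print(breakeven_events(200, 1.0, 10.0))                       # 2000.0
print(push_cost(1000, 1.0) < pull_cost(1000, 200, 10.0))      # True: small community, push wins
print(push_cost(5000, 1.0) < pull_cost(5000, 200, 10.0))      # False: busy community, pull wins
```

The cheaper async delivery is relative to a web request, the further out the breakpoint moves, which matches the intuition in the edit above.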

Edo78@feddit.it · 2 years ago

Maybe I’m wrong (I’ve only been on Lemmy since yesterday morning), but if you host your own instance you’re only caching the communities you are interested in. If you never cared about a community or interacted with an instance, then its data will never reach your instance. Federated doesn’t imply full redundancy.

nii236@lemmy.jtmn.dev · 2 years ago

This is correct, and it’s also worth noting that the remote comments are not “backfilled”, so you don’t get to read all the old stuff

CanadaPlus@lemmy.sdf.org · 2 years ago

I would be shocked if it worked well, seeing as it wasn’t designed for that.

Even if it did though, where would we be having this conversation? It would work more like a texting app than any kind of community.