Thoughts on (semi-)automated tool to import content from /r/rust to this community?

rglullis@communick.news · 1 year ago

snaggen@programming.dev · 1 year ago

No, just look at the rust community on Lemmy that imports stuff like this. It is flooded with a lot of content, but that makes it impossible to follow and interact with. Also, if you know it is a bot that posted, you don’t have any reason to interact with that post. Automatic imports tend to feel like spam, so please don’t do this…

I’d rather see that people keep an eye open for suitable news, or ask genuine questions and write other interesting posts by hand. It may be a bit slow early on like it is now, but that is somewhat in proportion to the engagement so it all fits together.

snaggen@programming.dev · edit-2 1 year ago

Also, this community is coming along nicely. We are in the top three communities on programming.dev if you look at the list of communities. We are the highest ranked programming language community, ahead of Python. So, I don’t see any need for artificially inflating this community.

Edit: Link to the community page https://programming.dev/communities

rglullis@communick.news · edit-2 1 year ago

It is not my plan to “flood” any community, I hope to make that very clear.

Also:

if you know it is a bot that posted, you don’t have any reason to interact with that post.

But there is even less reason for people to interact if there is no comment at all. Plus if you look at the roadmap, my plan is to make a notification system that will send a DM to the reddit user with a link to the responses they get on Lemmy.

snaggen@programming.dev · 1 year ago

Well, bringing over comments by a bot feels totally wrong. I’m not sure we want to have reddit comments here, since they sometimes differ quite a lot in culture. Bringing over posts only, could be done by bot if it is determined by a human, but then on the other hand I don’t see the point in involving the bot. Then you just look at the list in the bot instead of in /r/rust, and it is not that hard to just manually post if you find something there that would fit here. I like what you are trying to achieve, but I’m not a big fan of bots… it is so easy to get them wrong and then they can cause a lot of harm.

rglullis@communick.news · 1 year ago

Bringing over posts only, could be done by bot if it is determined by a human.

Yeah, this whole experiment started exactly like that. I wanted to have an easier way to bring content from /r/emacs to !emacs@communick.news, but I realized that I was missing out on a lot of the “self” posts and these are kind of meaningless without comments/responses.

I’m not a big fan of bots

To be quite honest, me neither. This is why my main goal is to make a system that can let the reddit users take control over the bot account quickly, even if it’s just to claim it and say “please stop mirroring my comments”. But we have to start somewhere.

Mac@mander.xyz · 1 year ago

Go look at other communities with those bots and see how well they perform and if people actually discuss the content.

rglullis@communick.news · 1 year ago

Are there any communities with this type of bot? The only similar project I’ve seen is lemmit.online, and even that is a lot more limited in scope. They have only one bot pulling posts, no comments, and all posts are made on their instance.

SavvyWolf@pawb.social · 1 year ago

I’m against this for a number of reasons. To the point that honestly I think you should have a think about if this is something you want to actually pursue.

Firstly, we are not Reddit, we are Lemmy. We don’t really need to just mindlessly copy content over just for the sake of content. If this community were just a mirror of Reddit, then what’s the point of being on Lemmy anyways?

Lemmy and Reddit are link aggregators anyway. If there’s something that’s good enough to crosspost, then someone who browses both can just manually copy it over. There’s not really much OC on Reddit, especially for something like a programming language.

And if someone really wants to look at Reddit posts, they’d follow the lemmit mirror. Or just use Reddit.

Anyway, that out of the way, I really don’t think you should be trying to force/convince people to join Lemmy. Especially using automated tools.

Your roadmap shows a bot that would pm Reddit users any responses to posts from other platforms. I know if I got a message like that, then I’d just mark it as spam. It reads like a scam and would not make me want to join the platform.

I actually have a random bot email me saying I have unread messages on a git repo that was cloned by a random Chinese site. I mark it as spam and ignore it.

Anyway, let’s imagine people take you up on the offer and convert their accounts, will they have access to the whole fediverse? The big instance ( lemmy.world ) iirc blocks lemmit, and I can imagine would block your instance as well. Defederation is a pain point for people wanting to join the fediverse, and this will exacerbate things.

But let’s assume you get all that sorted, and you manage to recruit, let’s say a few tens of thousands of people. Have you a plan on how you are going to approach moderation and instance politics? Not saying you haven’t, but something to consider if you’ve not.

Anyway, those are my thoughts. Sorry if I’m a bit negative. I think the way for Lemmy to grow is organically by providing a good value alternative to Reddit. Not by laser focusing on getting as many people and as much content as possible.

sugar_in_your_tea@sh.itjust.works · edit-2 1 year ago

I agree with most of what you said, except this:

There’s not really much OC on Reddit

Most of the subreddits I used were largely OC, such as:

/r/buyitforlife
/r/scifiwriting
/r/books

I did like the link aggregation, which initially attracted me (mostly wanted something better than Google News), but I stayed for the OC.

So if there was a tool that would aggregate the OC from Reddit, I’d love it! Most of the content I still use from Reddit is older OC, like headphone advice (just bought another last week) and other stuff I use Reddit for as part of the purchasing process.

However, most of the value from Reddit is the searchability, I just can’t find anything on Lemmy, so having that content here wouldn’t solve anything. So if OP wants to solve the problems I have with Lemmy, work on full text search on lemmy itself, and if that’s good, maybe pulling in content from Reddit would be useful.

I don’t want users for the sake of users, and they will come once the platform solves their problems. I’m here because I’m ticked at Reddit and Lemmy is just good enough, but I’m sure others have their pain points as well.

rglullis@communick.news · 1 year ago

I firmly believe that the main thing to overcome is not technical, but the network effects. I can not think of any single social media network that has failed because of technical issues and too many users.

Orkut was a huge success in Brazil and India despite the constant outages. The fail whale being shown repeatedly in 2010 was not a problem for Twitter. I also lost count of how many times I saw the “You broke reddit” banner or cursed at how bad their search results are, but that didn’t stop me from coming back.

So, yes, I am also very interested in improving things and I am even trying to get involved with Lemmy development directly, but if we want the Fediverse to succeed we need users and we need them fast. The numbers are not looking good and there are even claims that Reddit has already won.

sugar_in_your_tea@sh.itjust.works · 1 year ago

I disagree, I think Lemmy has a number of technical issues that limit adoption, which limits the network effect. For example:

poor discoverability - even if someone gives me a community name, I still need to know the instance
no mod tools - popular communities can be poorly moderated
instance does matter in some cases, and that’s a cognitive load

So that means it’s hard to get started, disappointing to keep using, and unappealing to keep using. If you try to dump a bunch of users on that, they’ll mostly bounce off.

I personally think a lot of this is architectural. Mod tools can be added later, but the other two are just how the system works.

I’d prefer to build it in a decentralized manner, which means:

you keep a single namespace, just like Reddit
remove need for separate instances, so no differences between which instance you pick (and it’s okay if one or more go down)
search would work with the same mechanism as loading content, just with different parameters, so you’d contend with it early in the process

Basically, have it look more like Matrix and less like Mastodon. I think it would be pretty easy to phase it in too, since you could build in a lemmy-compatible server to access this new network, so that way it would look like another instance to other users but it would be a separate thing entirely to native users.

I think lemmy is great, but I don’t think it’ll overtake Reddit, even if everyone switched today from Reddit to lemmy. It just has too many technical issues that people will bounce off.

SavvyWolf@pawb.social · 1 year ago

That’s fair; I was probably thinking a bit too much about links to blog posts and the like.

Honestly, I think that might be better served by a nice searchable web interface to the Reddit data that the data hoarders managed to collect.

rglullis@communick.news · 1 year ago

I’m all for constructive criticism and (I think) I understand your point of view, but I hope you understand mine: simply put, I think that it is unethical to support Reddit and I don’t think it is a matter of “choice” between two different platforms. To me it is less about “being on Lemmy” and more about “not being on Reddit”, if that makes any sense.

I will spare you from a long essay about all the issues with the current landscape in social media platforms and how bad Surveillance Capitalism is. Instead, I will just say that I am putting a lot of effort into building a sustainable alternative to the current platforms owned by Big Tech.

With this in mind, it makes no sense to me when someone says “people can just use Reddit”. It is a goal for me to get a sizeable chunk of the population out of these platforms and to help grow the intolerant minority that will flat out refuse to participate in the Big Tech platforms. To me this is the only way to disarm them and to stop them from doing all the damage they are doing to society at large.

Have you a plan on how you are going to approach moderation and instance politics?

I believe that the current issues with “instance politics” are solvable with the next generation of fediverse software, which will allow us to separate the user identities from the server that hosts them - which is already possible with systems like takahe. But anyway, first things first. Let’s not get overly anxious about fixable problems and focus on getting people out of the hands of this devil first.

SavvyWolf@pawb.social · 1 year ago

I broadly agree with you, and think that big tech companies are a problem and the fediverse is how social media should be run. However, it’s not me or anyone here you want to convince.

Most people just don’t feel this strongly about megacorps owning social media. I’ve been thinking about this a lot, and I think the thing holding back Mastodon and Lemmy isn’t the instance selection or lack of marketing or whatever. It’s the fact that people don’t agree with the values, philosophies and ideology of the project. Maybe that’s through lack of knowledge, maybe that’s through conscious choice, but the fediverse requires a level of “buy in” to the project’s ideology, at least at this point.

If you send unsolicited DMs to people, I expect you’ll get one of three responses:

Indifference: “Oh, another spam email. I don’t really care”.
Interest: “Oh, there’s that Lemmy thing I’ve seen mentioned. I should look at it”.
Anger: “Really? They took my post and reposted it there? I didn’t agree to that, Lemmy is a terrible platform!”

I expect that an “anger” response will probably be more likely than any other response, which will harm adoption. I agree with the sentiment about “getting people off Reddit”, but this feels like it’s pushing too hard and Redditors might not be as receptive.

I guess it’s like going to people on the street and saying that they should stop eating a certain type of food. You just make your group look very pushy and presumptuous, even if you have very good reasons.

IMO, work should be focused on spreading awareness in a non-assertive way about why moving from Reddit to Lemmy is the “correct” choice. Or, we can work on making Lemmy an attractive place (by fixing bugs and papercuts) so that people naturally head here the next time Reddit does an oopsie.

Although, that’s all moot with the main practical problem to this approach: How are you going to convince Reddit to actually let you implement this system?

I’m not sure the issue with “instance politics” will be solved by having dedicated identity servers; you’re just moving the problem of moderation from one person (running the instance) to another (running the identity servers).

Takahe doesn’t really solve this problem (unless I misunderstand what they are doing; their features page is really unclear). You can have multiple identities per user, but that doesn’t protect the user account itself from the reputation of the instance owner.

rglullis@communick.news · 1 year ago

I expect that an “anger” response will probably be more likely than any other response, which will harm adoption.

Only if you assume that the majority of people are on reddit because they have a strong connection to the platform instead of the network, which I really believe to be false.

And also, not what I have experienced with the emacs community. The number of people that responded favorably to an invite was a lot higher than the number of people who showed lack of interest and non-respondents combined.

IMO, work should be focused on spreading awareness in a non-assertive way about why moving from Reddit to Lemmy is the “correct” choice.

Content is king, there is no way around it. Social networks can survive “fail whales” just fine. Bugs on Lemmy can be fixed. What can not be fixed without a major effort is the fact that Lemmy is losing active users and that people on reddit are already adapting to the “new normal” of crappy mobile apps, puppet mods and Surveillance Capitalism.

lightsecond@programming.dev · 1 year ago

Have you checked out https://lemmit.online?

I don’t know how i feel about a bot posting content from Reddit. Your project legit looks cool, but I personally block lemmit because these posts give me the feeling of abandoned cities. I was on reddit for the discussions. Same for lemmy. Posts without comments are boring.

rglullis@communick.news · 1 year ago

My tool brings the comments as well.

lightsecond@programming.dev · 1 year ago

Oh! I missed that. This sounds much nicer. Probably not for /c/rust though. Like someone else said, this community already has good engagement. I think you should target large non-technical subreddits like AITA. Those will take time to pick up on Lemmy.

Anders429@lemmy.world · edit-2 1 year ago

I’ve seen similar things done on other communities on Lemmy, and it always drives me nuts. Every single post on c/Technology is like this, making the whole community feel soulless and inactive.

Also, the amount of low quality questions or posts thinking they’re in r/PlayRust that would be posted would drive me up the wall. I’ve been glad to be away from that.

rglullis@communick.news · 1 year ago

Let me repeat: the idea is not to have a fully automated system that would mirror every single post from there to here, but mostly to have a set of tools that streamline the process that I’ve been doing already in plenty of places, e.g., https://communick.news/c/emacs, https://soccer.forum/c/main, https://programming.dev/c/elixir, and yes, even here.

The main practical difference of this is that it would require less time because I won’t have to be copy-pasting stuff around, not that I would be pushing more stuff.

Ok, there is the aspect of the comments, but these are mostly thought of as a way to nudge the people that participate on reddit and get them to look into the alternatives.

CommunityLinkFixerBot@lemmings.world · 1 year ago

Hi there! Looks like you linked to a Lemmy community using a URL instead of its name, which doesn’t work well for people on different instances. Try fixing it like this: !emacs@communick.news, !main@soccer.forum, !elixir@programming.dev

Ategon@programming.dev · edit-2 1 year ago

Reposting from the python post you did just to make sure you see it

Just wanted to let you know that we don’t allow reposting of comments made on other platforms in this instance (both for spam reasons and because it tends to be a mess with GDPR). Link posts with no text are fine as long as you get approval from a moderator of a community or from one of the admins for site-wide usage (and then allow moderators to opt out), and follow our bot guidelines that you can find in the sidebar on our site. I’d be willing to greenlight the bot for site-wide usage as long as you remove the comments, make posts links only (links to the content, not the reddit post) and limit it to a maximum of 10 posts a day across the instance (max 2 in a specific community) (and also make sure you don’t repost stuff already posted in the community). We have another bot called reddit x-poster for which this has been working fairly well, though it only hangs out in a couple of communities like haskell.

In terms of this community itself, it doesn’t really need help with posting. It’s got more than 200 monthly active users already and around 4 posts every day.
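
As a rough illustration of the limits described above (10 posts a day across the instance, 2 per community, no reposting of links already in a community), a bot could gate each submission with a check along these lines. This is only a sketch: the `DailyQuota` class and its method names are hypothetical, not part of any existing bot or of the Lemmy API, and resetting the counters each day is left out.

```python
from collections import Counter
from dataclasses import dataclass, field

MAX_PER_DAY_INSTANCE = 10   # "maximum of 10 posts a day across the instance"
MAX_PER_DAY_COMMUNITY = 2   # "max 2 in a specific community"


@dataclass
class DailyQuota:
    """Hypothetical tracker for what a bot has posted today."""
    per_community: Counter = field(default_factory=Counter)
    posted: set = field(default_factory=set)  # (community, url) pairs

    def may_post(self, community: str, url: str) -> bool:
        # Never repost a link that is already in this community.
        if (community, url) in self.posted:
            return False
        # Instance-wide daily limit.
        if sum(self.per_community.values()) >= MAX_PER_DAY_INSTANCE:
            return False
        # Per-community daily limit.
        return self.per_community[community] < MAX_PER_DAY_COMMUNITY

    def record(self, community: str, url: str) -> None:
        self.per_community[community] += 1
        self.posted.add((community, url))
```

A caller would run `may_post` before each submission and `record` after a successful one; links posted by humans would also need to be added to `posted` for the duplicate check to be meaningful.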

rglullis@communick.news · 1 year ago

Sorry, I saw your message but only now I had the time to stop to try to answer it. First of all, I should apologize for the bot running on !elixir@programming.dev, that was my dev instance and should not be even connected to the public internet when I ran those test jobs. I’d actually kindly ask you to please defederate from fediverser.communick.dev to ensure that this doesn’t happen again.

I just want to make it clear that it is not my plan to flood any community and that the idea is to only submit content when it seems to be aligned with the overall posts in the community, and I hope that I can do everything according to the guidelines for each community. I just want to make my case for what I’d consider to be a reasonable approach and check if these guidelines can be revised/approved:

For a lot of the programming communities, the value is not in the link sharing but in the “self posts”: questions and discussions around some topic or event that has happened. Just copy-pasting a question feels worse than a bot, because my tool is built with the idea to get the original poster here as well. Perhaps this rule could be relaxed for “self posts”, and the tool could bring the post along with the top N threads (N=3). This way, the people already here would see a post with some of the content (which could be an incentive for them to join in a conversation) and my tool could also pick up the replies from here and notify the users on reddit.
The tool is not meant to be a simple cross-poster, but “to build an off-ramp from reddit into the fediverse”. In this sense, it would be better if the communities already had some organic activity, so as to dilute the feeling that only bots are participating. IOW, a community where we have 4 “organic” posts and 2 bot posts seems better than one that is dominated by one bot posting only one message every day. Would you consider revising the rule so that the bot could post at most 25% of the total of “organic posts” from the previous day?
If you think in terms of “how well this community is doing in relation to reddit” instead of “how the communities here are doing in relation to each other”, the numbers are pitiful. If we go to /r/rust right now, it shows about 250k subscribers and 1000 active visitors, and this is even after the mods from /r/rust explicitly expressed their wishes to move to Lemmy. Even if it sounds a bit elitist, I agree with the people who do not want to bring all of reddit to Lemmy, but I think that a tool like mine could be used to bring the top 1% and focus on the best content creators.

Ategon@programming.dev · edit-2 1 year ago

Will respond to both of your messages here

Starting with the one at the top. GDPR encompasses more than just PII. The messages themselves are mostly not covered but usernames are (as a username is a unique identifier that distinguishes one person from another (and if you want to go in terms of PII it can be used to easily identify a person as well)) as well as misc information that might be in the messages (e.g. if someone says they work for X company or says their actual name)

If you can make it so none of the messages you post are affected by that (every message posted is done by the same user, messages are filtered to remove any with personal data) then I would be more inclined to accept it (but up to the discretion of the mods of the community, not a site-wide approval). Sure, I can revise it to the 25% rule; I’ll make it 25% with a max

rglullis@communick.news · edit-2 1 year ago

Sorry, I will challenge you on the notion of usernames as personal identifiers. Last time I had to go through legal advice for a system we were building, the final advice was that usernames are only treated as such if they can be correlated with other online data, such as cookies or authentication tokens. To give an absurd example, if you sign in to a website and claim the username “rglullis” you will not be in any way connected to me and therefore the username can not be used as an “online identifier”.

Ategon@programming.dev · edit-2 1 year ago

It can still point you towards someone 99% of the time (or severely limit the options so you can track them down from that), assuming it’s not a common one, and it’s still something that uniquely separates people even without that. (And from the username here people can look up the user on reddit and then look up their post history for info on them as well.)

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person; (GDPR Art. 4)

an online identifier

Not a lawyer so I can’t fully speak on it, but I would rather err on the side of caution when dealing with this sort of thing until I get actual legal confirmation otherwise, specific to our site and this

Platforms like discord (and reddit) need to anonymize the messages when someone deletes their account by assigning them to a generic deleted user that’s used for everyone

rglullis@communick.news · 1 year ago

Oh, and not really a contention point, but a pet peeve of mine… “GDPR” is often thrown around but it should not be used as an excuse to prevent bots. Speaking as someone who had to deal with a lot of privacy-sensitive businesses and as a European: GDPR is mostly about PII, and posts on the internet along with usernames do not qualify as such. A bot taking content from reddit and posting anywhere else does not fall under any provision of personal data processing and no special consent is required from the users to collect it.

erlend_sh@lemmy.world · 1 year ago

I think something like this can work, if you bring humans fully into the loop. Posts should be made by people, so that someone’s responsible for the thread that gets made.

What about a ‘repost queue’ of Reddit that Lemmy users can sign up for? Having signed up to this queue, e.g. for /r/rust, I’d be presented with a list of the posts on /r/rust that do not yet exist on .dev/c/rust. Every hour or so I could opt to do a repost to Lemmy, from my own account.

In other words you’re just facilitating a manual action that’s already taking place.
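
The core of such a queue is a set difference: Reddit posts whose link is not yet present in the Lemmy community. A minimal sketch, assuming both sides have already been fetched into plain Python data (the fetching itself, the `repost_queue` and `normalize` names, and the dictionary shape are all hypothetical), and assuming that comparing normalized link URLs is a good enough duplicate test:

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivially different links compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def repost_queue(reddit_posts, lemmy_urls):
    """Return the Reddit posts whose link does not yet exist on the Lemmy side.

    reddit_posts: list of dicts with at least a "url" key (assumed shape).
    lemmy_urls:   iterable of link URLs already posted in the community.
    """
    already_posted = {normalize(u) for u in lemmy_urls}
    return [p for p in reddit_posts if normalize(p["url"]) not in already_posted]
```

The result would only be shown to the signed-up user; the actual repost stays a manual action from their own account. Self posts have no external link, so they would need a different matching key.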

rglullis@communick.news · 1 year ago

That could work well for posts with links, but what about the self posts? The people that I managed to bring over from /r/emacs to !emacs@communick.news have mentioned that the main “problem” is that just posting the links makes the community feel like a simple “planet emacs” aggregator and that they wished to have the self posts with questions as well.

erlend_sh@lemmy.world · 1 year ago

One solution at a time I guess :)

But if your emacs community is in favor of the comments replication approach then that’d be a good testbed that might lead to even better approaches deemed acceptable elsewhere.

rglullis@communick.news · 1 year ago

Ok. I will set up a fediverser instance and make the changes needed.

robinm@programming.dev · 1 year ago

It seems to be a lot of work but could also be a good idea.

Something that I would like would also be a statement on the Rust blog to say that lemmy instance X is the main Rust lemmy instance and discussion should mostly be done here, so that the migration path is clear for reddit users.

rglullis@communick.news · 1 year ago

It is a bit of work (not a lot), but that is fine by me because I am really invested in making the fediverse something accessible and more than just a niche.

SavvyWolf@pawb.social · edit-2 1 year ago

So, I’ve posted a negative comment below. But since I’ve slept on this and thought about it a bit more, I am starting to think something like this could work… But with some caveats.

The idea is to create a fediverse “client” of sorts that mirrors content to and from Reddit. A “bridge” like in Matrix, I guess.

Some challenges and prerequisites:

On the Reddit end, this needs to be in an automod autopost and/or pinned post, saying there’s an experimental bot crossposting to Lemmy (And you need to convince the mods on the subreddit end to allow you to do this).
It should be a separate community. The rust community here is growing organically with a distinct culture, and it’d be a shame to destroy all that culture with a firehose of comments. The Lemmy community could also have unique content, and some users could also just prefer a small, cosy community.
That separate community should have an obvious name so that people don’t confuse it with the real Lemmy community (instance names don’t count). Maybe like bridge_r_rust.
You need to convince Reddit itself to let you do this. I don’t know how you’d do that.
I don’t think it’s a good idea to pm responses; just have the bot respond to messages directly on Reddit. It gives more visibility anyway. Making it public removes a lot of my concerns with harassing people.
Honestly, I feel giving people the ability to convert these bot accounts to real accounts feels like scope creep, and brings up a lot of technical challenges. Maybe just leave that be for now.
I think you’d need to speak to the people that run the big instances to see if you can avoid getting defederated. Perhaps the bridge community could be hidden by default in the all feed.
There needs to be a way to get people off the bridged community and into the “real” Lemmy community… It’d be a shame to do all this work and just end up with an alternate front-end for Reddit that ends up killing Lemmy.

I think bridged communities could actually work if you can solve the challenges above and provide a way for people to get from them to Lemmy itself.

rglullis@communick.news · 1 year ago

Thank you for not dismissing my work right away and giving some more time and thought into it, I really appreciate it. I think that there is a lot in your feedback that makes a lot of sense:

Having an automod post certainly would help. It would give a strong signal that the community on reddit is trying to migrate away from it. I think this could be achieved on the communities that were more vocal during the protests, but on the other hand I already had the experience on /r/emacs and there was a strong rejection from (some of) their moderators.
I was initially against the idea of making a two-way bridge to post Lemmy comments there, but I’m starting to accept it. I’m still not sure how that would work in terms of API usage and if this would incur costs for those running the fediverser servers (on top of the costs of running Lemmy) but I’m willing to try it out.
Speaking with the people in the big instances is something I’m already doing (or at least I’m trying to by posting about the tool in different communities before unleashing it) :)
The idea is not to automate everything and create a “firehose of reddit bots” here. I was talking with the admin of programming.dev yesterday and we seem to agree to cap “fediversed” posts to a maximum of 25% of “organic” posts that the community had in the previous day. We can also agree that link posts should not have the comments, and that “self posts” with questions or topical discussions can bring some of the comment threads. This means that the very small communities would be seeing only one post a day, and the ones that are growing or more established would never be suddenly taken over by the bot army.
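
To make the 25% cap concrete, it could be computed roughly as follows. This is a sketch under assumptions: the function name is hypothetical, rounding down is a guess, and the minimum of one post a day is inferred from the remark that very small communities would see only one post a day.

```python
import math


def daily_bot_cap(organic_posts_yesterday: int,
                  share: float = 0.25,
                  minimum: int = 1) -> int:
    """Maximum number of bot posts allowed today in one community.

    share:   fraction of yesterday's organic posts the bot may add (25%).
    minimum: assumed floor so that quiet communities still get one post.
    """
    return max(minimum, math.floor(organic_posts_yesterday * share))
```

With these assumptions, a community that had 4 organic posts yesterday gets at most 1 bot post today, and one that had 8 gets at most 2.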

Some of the other things though, I think will be harder to change or compromise, and if the admins or mods reject the proposal I will flat out not use the tool there:

I do not see the point in creating a separate community. I am fully aware that the bots and their automated posts should never become a sizable part of the community, but I feel that if we keep them separate it ends up being as toothless as lemmit.online.
I do not want to ask permission from Reddit to do this. They’ve already been quite hostile to the third-party devs that were willing to work together, so I can only imagine that they would never be welcoming to someone who is clearly aiming at getting the most valuable individuals in their userbase.
The idea to let reddit users register and take over their bot accounts is fundamental to this project. I want to make it very clear that this whole thing is a strategy to get people into the fediverse and put strong focus on the content creators of small-to-medium communities. I am trying to bootstrap a business around it and this is my attempt at increasing the TAM. The more people on the fediverse/threadiverse, the more the SMB segment will look into establishing their media presence on the fediverse as well, and then I can start to actually have a sustainable operation.

SavvyWolf@pawb.social · 1 year ago

Thanks for listening to my feedback; I know I can be a bit forceful and dismissive at times.

If you are going to go through with this, please get permission from the subreddits you are trying to bridge beforehand. If some drama breaks out (which it will, because people online love drama), you’d rather them be on your side than against you. If someone takes issue with anything, then that’s something the subreddit mods can deal with, not you. They can also manage announcements of what is happening, and give you an air of legitimacy.

I do not want to ask permission from Reddit to do this.

You need Reddit’s permission. They have complete control over their platform and have the final say on what people do on it. They can ban anyone at any time for any reason. If you try to do something without their permission, they’ll just start banning your bots and send you takedown requests. I know it sucks, but they hold all the power here, and have made it clear that they want to stomp out competition.

rglullis@communick.news · 1 year ago

I completely agree with talking with the mods in the subreddits, but I can not possibly see how Reddit Inc will ever greenlight something like this. In a way, I’m actually hoping they will try to ban it because it would create some type of Streisand Effect.

They can try to ban the first or the second fediverser API key used by the fediverser app that I bring online, but if tens/hundreds of people start doing it, this would mean effectively that we will grow an army of independent crawlers and evangelizers for Lemmy and the fediverse in general.