I’ve noticed that the current sorting algorithms prioritize posts based on votes, which can sometimes lead to posts with high votes but few comments dominating the feed. This may not accurately reflect user engagement. On the other hand, sorting by “Most Comments” disregards votes entirely. I believe Lemmy should consider taking into account multiple user engagement metrics in their algorithms like comments, votes, time spent on a post, etc. What are your thoughts on this? Would you prefer a new sorting algorithm that combines various metrics, adjustments to existing algorithms to include more metrics, or do you like the current sorting algorithms available the way they are?
A good post doesn’t necessarily need a lot of comments and engagement. It can have that, but it’s optional in my eyes. Meanwhile a bad post, with wrong or controversial information, can attract a lot of heated comments and give a false impression of engagement. Votes handle that much better. Sorting by how heated the discussion is is exactly what I dislike about modern social media, where heat is promoted over real value (people voting because the main content is good), which fuels rage-bait. Vote-based sorting isn’t perfect and rage-bait still happens, but it’s the lesser evil.
I think it would be cool to be able to configure that as a user. You could experiment with different settings and use the one that works best for you. It’ll be difficult to find a one size fits all solution.
Building on this, I wonder if you could add a setting to customize your own algorithms. You could weight a variety of different metrics.
This is not possible because sorting is done in the database, so adding a new sort option requires a database migration with new indexes, columns and updated queries. Not something that can be done with a simple plugin.
@[email protected] in https://github.com/LemmyNet/lemmy/issues/3936#issuecomment-1738847763
An alternative approach could involve an API endpoint that provides metadata for recent posts, allowing users to implement custom sorting logic on the client side using JavaScript. This API endpoint is currently accessible only to moderators and administrators.
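A rough sketch of what that client-side sorting could look like, written in Python rather than JavaScript for brevity. Everything here is an assumption for illustration: the `PostMeta` fields are hypothetical and not the shape of any actual Lemmy API response, and the formula (weighted votes plus comments, divided by an age penalty) is just one plausible way to combine metrics, not Lemmy's ranking.

```python
from dataclasses import dataclass


@dataclass
class PostMeta:
    # Hypothetical metadata fields; not an actual Lemmy API response.
    post_id: int
    upvotes: int
    downvotes: int
    comments: int
    age_hours: float


def engagement_score(post, vote_weight=1.0, comment_weight=2.0, gravity=1.5):
    """Combine net votes and comment count, then decay the result by age."""
    raw = vote_weight * (post.upvotes - post.downvotes) + comment_weight * post.comments
    # The +2 keeps brand-new posts from dividing by a near-zero age.
    return raw / (post.age_hours + 2) ** gravity


def sort_feed(posts, **weights):
    """Return posts ordered from highest to lowest engagement score."""
    return sorted(posts, key=lambda p: engagement_score(p, **weights), reverse=True)


votes_only = PostMeta(post_id=1, upvotes=100, downvotes=0, comments=0, age_hours=5)
discussed = PostMeta(post_id=2, upvotes=40, downvotes=0, comments=50, age_hours=5)

# With the default weights the discussed post ranks first.
default_order = [p.post_id for p in sort_feed([votes_only, discussed])]
# Setting comment_weight=0 falls back to a votes-only ordering.
votes_order = [p.post_id for p in sort_feed([votes_only, discussed], comment_weight=0)]
```

Since the weights are plain arguments, each user could keep their own values locally without the server needing a new sort option.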
There is already such an API endpoint which is available for mods and admins.
Why does sorting need to be in the db?
Even so, we can probably achieve decent customisation with partially customisable sorts.
Say we only care about upvotes, downvotes, comments and a time factor.
Say we break it down so you can set a custom weighting for each of these, e.g. 0, .25, .5, .1, then make a sort for all combinations. That’s only 4^4 = 256 combinations. It’s a lot, but it seems within the realm of possibility, and we still avoid needing a custom sort for each user.
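To make the counting concrete, here is a small sketch that enumerates every weight preset for the four metrics named above. The weight steps are copied from the comment as written; the metric keys and the scoring formula (downvotes subtracted, everything else added) are assumptions for illustration only.

```python
from itertools import product

METRICS = ("upvotes", "downvotes", "comments", "time_factor")
WEIGHT_STEPS = (0.0, 0.25, 0.5, 0.1)  # the four steps as written in the comment

# One preset per combination: 4 steps for each of 4 metrics gives 4**4 presets.
presets = [
    dict(zip(METRICS, combo))
    for combo in product(WEIGHT_STEPS, repeat=len(METRICS))
]


def weighted_score(post, weights):
    """Assumed formula: downvotes count against a post, the rest count for it."""
    return (
        weights["upvotes"] * post["upvotes"]
        - weights["downvotes"] * post["downvotes"]
        + weights["comments"] * post["comments"]
        + weights["time_factor"] * post["time_factor"]
    )


example_post = {"upvotes": 10, "downvotes": 4, "comments": 6, "time_factor": 2}
example_weights = {"upvotes": 0.5, "downvotes": 0.25, "comments": 0.25, "time_factor": 0.0}
example_score = weighted_score(example_post, example_weights)
```

Each preset would correspond to one precomputed sort on the server, and a user would pick one preset instead of getting a personal sort.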
The number of sorting algorithms needs to be much more limited than that; otherwise, it puts too much load on the server calculating all those combinations. It’s important to strike a balance between customization and system performance to ensure smooth operation and optimal user experience.
It’s only increasing the sorting cost by a linear factor; it won’t have an effect on the big O notation.
Caching. As an example, an all feed and its associated indexes get cached in the DB’s memory, allowing it to keep pace with thousands of users. And with millions of posts on a server, sorting has to be done in the DB, or the volume of metadata in memory would be astounding.
As an option it would be nice, but the current system as a default is fine.
No, just show most recent first. Stop trying to be like Facebook controlling people with algorithms.
I don’t understand platforms like Mastodon that mimic Twitter without incorporating the features that contribute to its popularity. If I were looking for a most recent sorting algorithm I would use a chat.
I’m on various old fashioned forums that are strictly newest first. They are just as usable and enjoyable as Lemmy. I don’t know anything about mastodon and I hate twitter.
I’d understand using new activity sorting for small communities but for large communities you can’t keep up with it.
Do you have an example?
There needs to be a choice. The fact that Reddit/Lemmy allow you to build and control what content you see is the best bit about them.
If you just show by new, then communities with lots of posts drown out smaller ones; low effort posts drown out ones that took a while to create. It also discourages engagement as the posts old enough to have good conversations will be a long way down the page.
What is the goal? Is it to drive more comment engagement? To ensure that posts with comments outrank those without, if vote metrics are comparable? (Look at me, gathering requirements like a little business analyst, aww)
If so then I would say incorporating some sort of comment metrics (even simply comment count from unique users) seems like a good way to achieve the goal.
I admit I get a bit disappointed seeing posts with no comments. Sometimes the post is cool and I’m glad I saw it. But most of the time I don’t have anything to comment on, either. I am kinda here for the conversation more than anything.
Exactly what algorithm would work best requires some trial and error I suppose. And I guess we would have to refine what behavior we want. Would it be better to have a few posts with boat tons of comments, or many posts with a small number of comments? I vote for the latter. In which case one could maybe boost posts with some middle range of comments, suppress those with either “too many” or “too few”.
Idk I’m just brainstorming here
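In the same brainstorming spirit, the "boost the middle range" idea could be sketched as a multiplier applied on top of whatever vote-based score already exists. The thresholds and multiplier values below are made-up placeholders, not tuned numbers, and a real version would probably want a smoother curve than this hard step.

```python
def comment_multiplier(comment_count, low=5, high=50, boost=1.5, damp=0.75):
    """Boost posts whose comment count falls in [low, high]; damp the rest.

    All four parameters are hypothetical placeholders for illustration.
    """
    if low <= comment_count <= high:
        return boost
    return damp


def adjusted_score(vote_score, comment_count):
    """Scale an existing vote-based score by the comment-range multiplier."""
    return vote_score * comment_multiplier(comment_count)


quiet = adjusted_score(100, 1)      # too few comments: damped
lively = adjusted_score(100, 20)    # middle range: boosted
swamped = adjusted_score(100, 400)  # too many comments: damped
```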
Would it be feasible to expose the metadata for posts in such a way that search queries could be customized to sort a front page any way a user wants to see it?
For example average reading time, total upvotes, total number of comments, and other bits and pieces of data could be used to help people tailor their own experience. Perhaps even a sentiment analysis would be interesting to see: serious discussion, jokes and memes discussion, informative posters, political conversation left or right, etc.
Would it be feasible to expose the metadata for posts in such a way that search queries could be customized to sort a front page any way a user wants to see it?
There is already such an API endpoint which is available for mods and admins.
@[email protected] in https://discuss.online/comment/6718715
Yeah, it would definitely be feasible to expose post metadata for customized search queries. Currently, the data is restricted to admins and mods, but having an API endpoint for users could enhance the sorting options without significant strain on the server. It could lead to more tailored and engaging user experiences on the platform.
https://discuss.online/comment/6718201
Perhaps even a sentiment analysis would be interesting to see: serious discussion, jokes and memes discussion, informative posters, political conversation left or right, etc.
This reminds me of Slashdot moderation and Media Bias Fact Check Integration
Slashdot moderation
This was something I loved about Slashdot moderation. When voting, people had to specify the reason for the vote: +1 funny, +1 insightful, +1 informative, -1 troll, -1 misleading, etc.
That way you can, for example, set in your user preferences to ignore positive votes for comedy, and put extra value on informative votes.
Then, to keep people from spamming up/down votes and to encourage them to think about their choices, they only gave out a limited number of moderation points to readers. So you’d have to choose which comments to spend your 5 points on.
Then finally, they had ‘meta moderation’ where you’d be shown a comment, and asked “would a vote of insightful be appropriate for this comment” to catch people who down-voted out of disagreement or personal vendetta. Any users who regularly mis-voted would stop receiving the ability to vote.
I don’t think this is directly applicable to a federated system, but I do think it’s one of the best-thought-out voting systems ever created for a discussion board.
edit: a couple other points i liked about it:
Comments were capped at (iirc) +5 and -1. Further votes wouldn’t change the comment’s score.
User karma wasn’t shown. The user page would just say Karma: good. Or Excellent, or poor, or some other vague term.
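A minimal sketch of how reason-tagged votes, per-user reason weights, and the +5/-1 cap described above might fit together. This is not Slashdot's actual implementation: the data layout is invented for illustration, and applying the reader's weights before clamping is an assumption.

```python
def comment_score(votes, reason_weights=None, floor=-1, cap=5):
    """Sum reason-tagged votes using a reader's weights, then clamp the total.

    votes: list of (reason, value) pairs, where value is +1 or -1.
    reason_weights: per-reader multipliers; reasons not listed count as 1.0.
    """
    reason_weights = reason_weights or {}
    total = sum(reason_weights.get(reason, 1.0) * value for reason, value in votes)
    return max(floor, min(cap, total))


votes = [("funny", 1)] * 4 + [("informative", 1)] * 2 + [("troll", -1)]

# A reader who ignores comedy and puts extra value on informative votes.
serious_reader = {"funny": 0.0, "informative": 2.0}

default_view = comment_score(votes)                  # 4 + 2 - 1 = 5
serious_view = comment_score(votes, serious_reader)  # 0 + 4 - 1 = 3
capped = comment_score([("insightful", 1)] * 10)     # clamped to the +5 cap
floored = comment_score([("troll", -1)] * 3)         # clamped to the -1 floor
```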
I like “hot” for votes only. Content does not necessarily need to provoke a lot of discussion to be worth seeing.
Equally, I like “active” for sorting according to what’s worth discussing in the comment section.
If you’re going to combine the two, it should be a new sort option. Going any further than that, and it’s beginning to smell like the type of code that sniffs out all the wrong buttons to press in people.
The problem with allowing third party apps is that it becomes very hard to implement things like time on post. All the app developers would have to implement it for it to become useful and they would have to do it in a consistent way. It would also be (IMO) a step towards the level of spying that seems to be standard in social media.
Well, that would only be implemented if it were offered by the API; otherwise, just use what is available right now, which are votes and the number of comments. I find it more invasive that other users can see the post history in my profile than admins being able to see the amount of time I spend reading each post. Revealing my feed feels akin to exposing my browsing history.