Lemmy requires active users to manually search for communities and discover content. Instances can choose to defederate from other instances, but I want instances that block as few other users as possible so I can decide for myself what content I see.
I want to add a column to this script to analyze Lemmy instances and identify communities that have high user activity but low blocking of users.
Initially I was thinking of adding a column that calculates the ratio of:
(active users) / (total blocked users)
However, this runs into a divide by zero error if there are no blocked users.
I’ve thought of a few ways to handle the ZeroDivisionError case, but there could be a better metric entirely that avoids this issue or gives a good measure of high activity + low blocking.
Does anyone have ideas for a better metric or ratio to use here?
Some context on what the data looks like:
- “active users” = number of active users in the past month
- “total blocked users” = sum of active users from all instances blocking or being blocked by this instance
Let me know if you have any suggestions! I’m open to different formulas or metrics beyond a simple ratio.
Appreciate any help!
Just add 1 to the denominator.
Simple is best.
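A minimal sketch of this suggestion in Python, assuming the script has both counts available as plain integers. The function name `activity_score` and the sample numbers are made up for illustration and are not taken from the script.

```python
def activity_score(active_users: int, total_blocked: int) -> float:
    """Active users per blocked user, with 1 added to the denominator
    so an instance with zero blocked users never divides by zero."""
    return active_users / (total_blocked + 1)


# Hypothetical sample values
print(activity_score(500, 0))  # 500.0
print(activity_score(500, 4))  # 100.0
```

Higher scores mean more activity relative to blocking, and an instance with no blocks simply scores its active user count.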
The max(1,total_blocked) method will make instances with 1 blocked and 0 blocked appear to be equal.
Also, if you don’t want to significantly change the proportions, add 1 to both the numerator and the denominator. That removes the divide-by-zero error and won’t significantly alter the ratios as long as the counts are reasonably large. It’s often used in data science to avoid this problem.
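To make the difference between the two approaches concrete, here is a rough Python comparison. The function names and the sample counts are hypothetical, chosen only to show the behavior described above.

```python
def score_max_floor(active_users: int, total_blocked: int) -> float:
    """Clamp the denominator to at least 1."""
    return active_users / max(1, total_blocked)


def score_add_one_both(active_users: int, total_blocked: int) -> float:
    """Add 1 to both the numerator and the denominator."""
    return (active_users + 1) / (total_blocked + 1)


# Hypothetical instance with 1000 active users and varying blocked counts
for blocked in (0, 1, 500):
    print(blocked,
          score_max_floor(1000, blocked),
          score_add_one_both(1000, blocked))
```

With 0 and 1 blocked users the `max` version gives the same score (1000.0 both times), while the add-one version gives 1001.0 and 500.5, so the two instances stay distinguishable. At 500 blocked users the add-one score is still very close to the plain ratio of 2.0.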