Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automod for #294 #478

Closed
wants to merge 4 commits into from
Closed

Automod for #294 #478

wants to merge 4 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jun 27, 2020

Started adding automod functionality much like dyno. #294 (comment)

Things that still need to be added:

Anti fast messages (may need tweaking)
Auto mute after 3 infractions
Create a banned word list (and decide where it goes)
Add logging

HackerJacker added 4 commits June 28, 2020 00:26
This still needs some optimisation but I think the basis is there.
@github-actions github-actions bot added admin Changes to the admin cog module Changes to the bot module labels Jun 28, 2020
@Cog.listener()
async def on_message(self, message):
def check(m):
return m.author == message.author and (datetime.datetime.utcnow() - m.created_at).seconds < 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will make 1000 calls to datetime.utcnow() for every message sent. This is a bad thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time, 4 seconds, should probably also be configurable


if not message.author.bot:
# Checks if message contains banned word.
if any([word in message.content for word in BANNED_WORDS]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will block snigger. Consider using a regex of \Wword\W.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surrounding the generator in [] is redundant, and is marginally slower than just leaving it as a generator; it defeats any's early-exit handler. Likewise in other places any([...]) has been used.

await message.delete()
await message.channel.send(f"{message.author.mention} Watch your language!", delete_after=5)
# Checks if message contains banned domain.
if any([word in message.content for word in BANNED_DOMAINS]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elif should be used, not if. We may have already deleted this message. Likewise, there's the same issue of matching half-matches.

await message.delete()
await message.channel.send(f"{message.author.mention} Don't spam mentions!", delete_after=5)
# Checks if message was sent too quickly.
if len(list(filter(lambda m: check(m), self.client.cached_messages))) >= 4:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.client.cached_messages is an array of 1000 messages. Ignoring the fact that the lambda is redundant, this will call check 1000 times for every message sent to the bot. Rather than checking every message retroactively, it would be better to track timestamps of sent messages as a dictionary of dequeues, each limited to 5 items, indexed by user ID. Each dequeue would contain the timestamp of messages as the bot receives them. Checking for spam is then a case of pulling up the corresponding dequeue for a given user, and checking if the oldest item in the dequeue is younger than n seconds. At this moment, spamming is more likely to DoS the bot than be deleted.

await message.channel.send(f"{message.author.mention} Don't spam mentions!", delete_after=5)
# Checks if message was sent too quickly.
if len(list(filter(lambda m: check(m), self.client.cached_messages))) >= 4:
await message.channel.purge(limit=4, check=who)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic number 4 has just been used twice here. For starters, it should be defined in a config somewhere. Secondly, because you're checking if it's >= 4, rather than == 4, we need to handle the case where it's > 4 as well. Additionally, messages will come into the bot faster than the deletion event for the purge. This means that if I spam 5 messages, the first 4 will be caught by the check, and deleted, but then messages 2 though 5 will also match the check, as the deletion response for 1 though 4 has not yet arrived. This causes the bot to double-warn and attempt to delete messages 2 though 4 twice.

Furthermore, limit is not the number of deleted messages. As the docs say:

limit (Optional[int]) – The number of messages to search through. This is not the number of messages that will be deleted, though it can be.

This means this filter can be trivially bypassed by having two or more members spam in unison. The bot will attempt to delete the most recent 4 messages, however if the check fails for all 4, it will delete none and will not continue searching.

# Checks if message contains mass mention.
if len(message.mentions) > 8:
await message.delete()
await message.channel.send(f"{message.author.mention} Don't spam mentions!", delete_after=5)
Copy link
Contributor

@Bottersnike Bottersnike Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I'm getting a sense of DRY. It may be better to encapsulate the checks, then have a single call to message.delete() and to message.channel.send() after the encapsulated checks.

def check(m):
return m.author == message.author and (datetime.datetime.utcnow() - m.created_at).seconds < 4

def who(m):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really a very suitable function name. is_original_author or just is_author would be a far more descriptive name.

@@ -89,6 +92,32 @@ async def on_member_join(self, member: Member):
# assign placeholder nickname
await member.edit(nick=PLACEHOLDER_NICKNAME)

@Cog.listener()
async def on_message(self, message):
def check(m):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check? Check for what? Give this function a name that tells me what it actually is; it's performing a very specific task, so name it like it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstrings, typehints, etc.

# Checks if message contains banned word.
if any([word in message.content for word in BANNED_WORDS]):
await message.delete()
await message.channel.send(f"{message.author.mention} Watch your language!", delete_after=5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete_after=5 is a bit of a magic number here and is going to be a pain to replace all occurrences off if it needs changed.

@@ -143,6 +144,7 @@ class Exchange:
}

# Admin Constants
BANNED_WORDS = []
Copy link
Contributor

@Bottersnike Bottersnike Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a banned words constant defined here:

NICKNAME_PATTERNS = [
r"(discord\.gg/)", # invite links
r"(nigg|ligma|fag|nazi|hitler|\bpaki\b)", # banned words
r"(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)", # hyperlinks
]

It would probably be better to pull this out into its own constant and use it, rather than having two copies of banned words.

@thebeanogamer
Copy link
Member

I'd like to repeat most of @Bottersnike's points but from an infra perspective, and point out how hilariously inefficient this will be from a computational perspective, how slow it will be, and how much it will murder the instance the bot runs on (which until now has been fine as a 1 core 512mb container)..

@ghost
Copy link
Author

ghost commented Jul 9, 2020

Thanks for the suggestions @Bottersnike. I will implement them soon.

@lightspeedana
Copy link
Member

Taking a brief look through @H4ckerJ4cker I understand you've tried hard to make this solution, but I suggest scrapping a lot of it and leaving it up to someone else to try for now. I'd like to think this works as a fairly heuristic approach, but it is not efficient in many ways and this is less of a side project and more of an actual job that needs to be done with no room for error. I'm happy to ensure someone in the mod team takes this role on and fulfils it, and I'm sure you'll be able to learn well from the final solution which will include more efficient solutions to the problems you've tried to solve.

Thanks so much for the help, and we'll be taking some of these ideas forward, but I think from now it's best that comdev/sudo/root take this on 😄

@ghost
Copy link
Author

ghost commented Jul 9, 2020

@lightspeedana this was never meant to be even thought about getting pushed yet. It is a draft.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
admin Changes to the admin cog module Changes to the bot module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants