The reason for it:
There are often times when you want to display content on your page that was submitted by another user, such as a list of recent posts on your homepage. The problem is that you don’t want offensive material on such a prominent page. Without live human moderation, the best we can do is strip out things we know are offensive (to most people, anyway), such as bad words. Here’s a profanity filter for Django that I wrote using code mostly lifted from django.core.validators.
And here she is:
Here’s what the filter looks like. (If you don’t know how to write a custom filter in Django, read the documentation.)
@register.filter("replace_bad_words")
def replace_bad_words(value):
    """ Replaces profanities in strings with safe words
    For instance, "shit" becomes "s--t"
    """
    words_seen = [w for w in settings.PROFANITIES_LIST if w in value]
    if words_seen:
        for word in words_seen:
            value = value.replace(word, "%s%s%s" % (word[0], '-'*(len(word)-2), word[-1]))
    return value
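If you want to poke at the masking logic without a Django project, here’s a rough standalone sketch of the same idea. BAD_WORDS and the mask helper are names I made up for illustration; BAD_WORDS is only a stand-in for settings.PROFANITIES_LIST, not the list that ships with Django.

```python
# Standalone sketch of the masking step, no Django required.
# BAD_WORDS is a hypothetical stand-in for settings.PROFANITIES_LIST.
BAD_WORDS = ["shit", "fuck"]


def mask(word):
    """Keep the first and last letters and replace the middle with dashes."""
    return "%s%s%s" % (word[0], "-" * (len(word) - 2), word[-1])


def replace_bad_words(value, bad_words=BAD_WORDS):
    """Mask every listed word found anywhere in value.

    This is a plain substring match, so a listed word is also masked
    when it appears inside a longer word.
    """
    for word in bad_words:
        if word in value:
            value = value.replace(word, mask(word))
    return value


if __name__ == "__main__":
    print(replace_bad_words("holy shit, that was close"))
```

Because this is a substring match, it is case-sensitive and will also mangle innocent words that happen to contain a listed word.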
Some other things:
Just apply that to a Django template variable and it will replace words like “shit” with “s--t”. It won’t change words like “ass” and “dick” since they are technically not bad words… but if you think they are, you can do something like this:
...
extra_bad_words = ['ass', 'dick']
# extend() returns None and changes the list in place, so build a new list instead
bad_words = list(settings.PROFANITIES_LIST) + extra_bad_words
words_seen = [w for w in bad_words if w in value]
...
Pretty useful huh?
Oh, one more thing: this filter depends on the profanities list that is included in Django. To get it, make sure you import settings:
from django.conf import settings
That’s it.
I know there are a lot of details missing. I apologize. If you make a comment here, I’ll be happy to help you out.
6:08 pm, August 1, 2008, g_hunter1
Nice stuff but does this also handle masked words?
10:40 pm, August 1, 2008, proc
What do you mean by masked words?
3:05 am, August 2, 2008, ss0
I believe he meant when people use a different word, like fsck for fuck. However, since this is what your regex is basically doing, I don’t really see the point of the question.
2:47 pm, August 2, 2008, g_hunter1
Yeah, I kinda figured, but that’s a gazillion words to add to the list. But if you have to do it like that, I guess you have to.
3:42 am, August 7, 2008, fftb
f—ing thursdays. just testing.
2:35 pm, August 8, 2008, the daniel
I did it this way once:
BADWORDS = [(x.word, ''.join(['*' for y in x.word])) for x in BadWord.objects.all()]
def badword_filter(text):
    for word in BADWORDS:
        text = text.replace(word[0], word[1])
    return text
7:09 pm, February 20, 2010, paul
Amazed no one pointed out that you should have an indent below the for loop statement.
10:49 am, September 15, 2011, proc
@paul
The syntax highlighter plugin I was using was terrible… sorry about that. Thanks for pointing it out.
10:42 am, September 15, 2011, yeago
This doesn’t handle caps/lowercase variants, e.g. fUck.
10:53 am, September 15, 2011, proc
@yeago
Yeah, you’re right. There are two ways to solve this: lowercase everything before you do the analysis, or use regular expressions. In any case, it’s just meant as a quick filter to remove the most common bad words and is definitely not meant to be a one-stop shop. Check out http://en.wikipedia.org/wiki/Scunthorpe_problem
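For what it’s worth, here’s a rough sketch of the regular-expression route. It is not the filter from the post, just one way it could look: BAD_WORDS, PATTERN and replace_bad_words_ci are names I made up, and BAD_WORDS is only a stand-in for settings.PROFANITIES_LIST.

```python
import re

# Hypothetical stand-in for settings.PROFANITIES_LIST.
BAD_WORDS = ["shit", "fuck"]

# Match any listed word as a whole word, ignoring case.
# re.escape guards against regex metacharacters in the list entries.
PATTERN = re.compile(
    r"\b(?:%s)\b" % "|".join(re.escape(w) for w in BAD_WORDS),
    re.IGNORECASE,
)


def _mask(match):
    """Keep the first and last letters as typed and dash out the middle."""
    word = match.group(0)
    return word[0] + "-" * (len(word) - 2) + word[-1]


def replace_bad_words_ci(value):
    """Case-insensitive, whole-word version of the masking filter."""
    return PATTERN.sub(_mask, value)


if __name__ == "__main__":
    print(replace_bad_words_ci("fUck thursdays"))
```

The \b word boundaries keep the pattern from firing inside longer words, which sidesteps the Scunthorpe problem, but it also means variants like “fucking” slip through unless you add them to the list. Pick whichever trade-off suits your site.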