Django: A Profanity Filter

The reason for it:

There are often times when you want to display content on your page that was submitted by another user, such as a list of recent posts on your homepage.  The problem is that you don’t want to post any offensive material on such a prominent page.  Without live human moderation, the best we can do is strip out things we know are offensive (to most people, anyway), such as bad words.  Here’s a profanity filter for Django that I wrote using code mostly lifted from django.core.validators.

And here she is:

Here’s what the filter looks like. (If you don’t know how to make a filter in Django, read the documentation on custom template tags and filters.)

from django import template
from django.conf import settings

register = template.Library()

@register.filter("replace_bad_words")
def replace_bad_words(value):
    """ Replaces profanities in strings with safe words
    For instance, "shit" becomes "s--t"
    """
    # Collect the listed words that appear in the text, then mask each one
    words_seen = [w for w in settings.PROFANITIES_LIST if w in value]
    for word in words_seen:
        value = value.replace(word, "%s%s%s" % (word[0], '-'*(len(word)-2), word[-1]))
    return value
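
To try the masking step outside of Django, here is a minimal sketch of the same substring replacement as a plain function. SAMPLE_PROFANITIES and mask_word are hypothetical names made up for this example; the sample list is a stand-in and is not Django's actual PROFANITIES_LIST.

```python
# Stand-in for settings.PROFANITIES_LIST (hypothetical sample, not Django's list)
SAMPLE_PROFANITIES = ["shit", "damn"]


def mask_word(word):
    """Keep the first and last letters and dash out the middle."""
    return "%s%s%s" % (word[0], "-" * (len(word) - 2), word[-1])


def replace_bad_words(value, bad_words=SAMPLE_PROFANITIES):
    """Same substring replacement as the filter, minus the Django parts."""
    for word in bad_words:
        if word in value:
            value = value.replace(word, mask_word(word))
    return value


print(replace_bad_words("well shit, that broke"))  # prints: well s--t, that broke
```

In a template, the real filter would be applied as {{ some_variable|replace_bad_words }} after loading the template tag module it lives in.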

Some other things:

Just throw that on a Django template variable and it will replace words like “shit” with “s--t.” It won’t change words like “ass” and “dick” since they are technically not bad words… but if you think they are, you can do something like this:

...
extra_bad_words = ['ass', 'dick']

# list.extend() returns None, so build a new combined list instead
bad_words = list(settings.PROFANITIES_LIST) + extra_bad_words

words_seen = [w for w in bad_words if w in value]
...

Pretty useful, huh?

Oh, one more thing to add — this filter depends on the profanities list that is included in Django.  To get this, make sure you import settings:


from django.conf import settings

That’s it.

I know there are a lot of details missing.  I apologize.  If you make a comment here, I’ll be happy to help you out.

Comments (10)

  1. 6:08 pm, August 1, 2008, g_hunter1

    Nice stuff but does this also handle masked words?

  2. 10:40 pm, August 1, 2008, proc

    What do you mean by masked words?

  3. 3:05 am, August 2, 2008, ss0

    I believe he meant when people use a different word like fsck for fuck. However since this is what your regex is basically doing, I don’t really see the point of the question.

  4. 2:47 pm, August 2, 2008, g_hunter1

    Yeah I kinda figured, but that’s a gazillion words to add to the list. But if you have to do it like that I guess you have to.

  5. 3:42 am, August 7, 2008, fftb

    f—ing thursdays. just testing.

  6. 2:35 pm, August 8, 2008, the daniel

    I did it this way once:

    BADWORDS = [(x.word, ''.join(['*' for y in x.word])) for x in BadWord.objects.all()]

    def badword_filter(text):
        for word in BADWORDS:
            text = text.replace(word[0], word[1])
        return text

  7. 7:09 pm, February 20, 2010, paul

    Amazed no one pointed out that you should have an indent below the for loop statement

    • 10:49 am, September 15, 2011, proc

      @paul
      The syntax highlighter plugin I was using was terrible…sorry about that. thanks for pointing it out.

  8. 10:42 am, September 15, 2011, yeago

    This doesn’t handle caps/lowercase variants. AKA fUck

    • 10:53 am, September 15, 2011, proc

      @yeago
      Yeah, you’re right — 2 ways to solve this — lowercase everything before you do the analysis or use regular expressions. In any case, it’s just meant as a quick filter to remove the most common bad words and is definitely not meant to be a one stop shop. Check out http://en.wikipedia.org/wiki/Scunthorpe_problem
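
The regular-expression route mentioned in that last reply could look roughly like the sketch below. It is only one way to do it, not the code from the post: BAD_WORDS, BAD_WORDS_RE and replace_bad_words_ci are hypothetical names, and the word list is a stand-in for settings.PROFANITIES_LIST. It matches case-insensitively and only on whole words, which avoids Scunthorpe-style false positives but also means longer variants of a listed word are left alone unless they are added to the list.

```python
import re

# Hypothetical word list standing in for settings.PROFANITIES_LIST
BAD_WORDS = ["shit", "damn"]

# \b restricts matches to whole words; re.IGNORECASE catches mixed case.
BAD_WORDS_RE = re.compile(
    r"\b(%s)\b" % "|".join(re.escape(w) for w in BAD_WORDS),
    re.IGNORECASE,
)


def _mask(match):
    word = match.group(0)
    # Keep the first and last characters as typed, dash out the middle
    return word[0] + "-" * (len(word) - 2) + word[-1]


def replace_bad_words_ci(value):
    """Mask whole-word, case-insensitive matches of the listed words."""
    return BAD_WORDS_RE.sub(_mask, value)


print(replace_bad_words_ci("sHiT happens"))  # prints: s--T happens
```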
