How to Remove Punctuation from a String in Python
In this short Python tutorial, you will learn how to remove all punctuation from a string in Python. Note that in a more recent post I cover how to remove punctuation from a Pandas DataFrame.
Removing Punctuation in Python
To this end, you will use Python strings, loops, and if-else statements. You will also be introduced to regular expressions in Python. In the first example, however, we will remove punctuation without the re module (the regular expression module).
Now, we will start by answering the question of what punctuation is:
What is punctuation and examples?
Briefly explained, “punctuation” is the system of symbols that you use to separate written sentences and parts of sentences, as well as to make their meaning clear. These symbols are known as “punctuation marks”. See the image for examples.
How do I remove punctuation marks in Python?
Now, the easiest way to delete the punctuation in Python is to use regular expressions.
Remove Punctuation in Python in a For Loop
In this section, you are going to learn how to remove punctuation with Python in a for loop. First, you are going to create two string variables: one holding the punctuation marks and one holding the text to clean. Second, you are going to create an empty string, loop through each character of the text, and keep the character only if it is not in the punctuation string.
Step 1: Create the Punctuation String
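First, you create the punctuation string with the marks that you want to remove, along with the text that you want to clean:
# Raw string, so the backslash is kept literally without an invalid-escape warning
punctuation = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''
# Despite its name, this variable holds the text to strip punctuation from
punctuation_to_remove = "Python daddy is the best!!! -#. blog!!! ever:::"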
Step 2: Loop Through Each Character
Second, you loop through each character of the text and add it to the new, empty string only if it is not a punctuation mark. This way, you get a new string with all punctuation removed!
no_punctuation = ""
for char in punctuation_to_remove:
    # Keep the character only if it is not one of the marks to remove
    if char not in punctuation:
        no_punctuation = no_punctuation + char
print(no_punctuation)
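If you need the loop in more than one place, it can be wrapped in a small function. The following is only a sketch: the name remove_punctuation_loop and the PUNCTUATION constant are made up for illustration, and the characters are collected in a list and joined once at the end instead of being concatenated one by one.

```python
# Hypothetical helper; the names below are illustrative, not from the post.
PUNCTUATION = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''


def remove_punctuation_loop(text, marks=PUNCTUATION):
    """Return text with every character found in marks left out."""
    kept = []
    for char in text:
        if char not in marks:
            kept.append(char)
    # Joining once at the end avoids building many intermediate strings.
    return "".join(kept)


print(remove_punctuation_loop("Python daddy is the best!!! -#. blog!!! ever:::"))
```

Note that the spaces around the removed marks are kept, so you may end up with double spaces in the result.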
Now, there are, of course, better methods to remove punctuation in Python. In the next section, you will learn how to use the string module for removing punctuation with Python.
How to Remove Punctuation in Python with the string Module
In this section, you will learn how much easier it is to remove punctuation with Python using the string module.
import string
# Map every ASCII punctuation character to None, which means "delete it"
table = str.maketrans(dict.fromkeys(string.punctuation))
no_punctuation = punctuation_to_remove.translate(table)
print(no_punctuation)
In the remove punctuation example above, you imported the string module, created a translation table from the punctuation characters (the ones you want to remove, that is), and then used the translate method to remove the punctuation marks.
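To see what the translation table is built from, it can help to print string.punctuation itself. The sketch below also uses the three-argument form of str.maketrans, which is an alternative to the dict.fromkeys approach; the sample text is made up for illustration.

```python
import string

# string.punctuation is a plain str constant of ASCII punctuation marks.
print(string.punctuation)

text = "Hello, world! (Is this: a test?)"  # made-up sample text

# With three arguments, every character in the third string is mapped
# to None, which tells translate() to delete it.
table = str.maketrans("", "", string.punctuation)
cleaned = text.translate(table)
print(cleaned)
```

Because string.punctuation only covers ASCII marks, characters such as curly quotes are left untouched by this approach.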
Remove Punctuation in Python with Regular Expressions
In this final example of how to remove punctuation in Python, you will learn how to remove the marks with the help of regular expressions. This is also quite simple; you just have to import the re module:
import re
no_punctuation = re.sub(r'[^\w\s]','', punctuation_to_remove)
print(no_punctuation)
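Here, you used the re.sub function to replace every character that is neither a word character (\w) nor whitespace (\s) with “nothing” (an empty string). Note that \w also matches the underscore, so underscores are kept.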
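The sketch below shows how this pattern treats the underscore and one possible alternative: building the character class from string.punctuation with re.escape. The sample text and variable names are made up for illustration.

```python
import re
import string

text = "snake_case, kebab-case & more!"  # made-up sample text

# \w matches letters, digits, and the underscore, so "_" survives here.
keeps_underscore = re.sub(r"[^\w\s]", "", text)
print(keeps_underscore)

# Building the character class from string.punctuation removes "_" too.
# re.escape() backslash-escapes characters that are special in a pattern.
pattern = re.compile("[" + re.escape(string.punctuation) + "]")
drops_underscore = pattern.sub("", text)
print(drops_underscore)
```

Which of the two you want depends on whether underscores count as punctuation in your data.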
Fastest Way to Remove Punctuation in Python?
Now, which one of the three methods for removing punctuation in Python is the fastest? I used timeit, and it seems like using regular expressions is the way to go. This is how I set this up in a JupyterLab notebook:
%alias_magic t timeit
Timing of the Remove Punctuation Methods:
Here’s how I timed the loop:
%%t
no_punctuation = ""
for char in punctuation_to_remove:
    if char not in punctuation:
        no_punctuation = no_punctuation + char
print(no_punctuation)
I did the same for the other code chunks (see above) and collected the timings in a Pandas DataFrame.
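If you are not working in a notebook, a comparable comparison can be sketched with the timeit module from the standard library. This is not the exact setup used above: the function names and the number of repetitions are assumptions made for illustration, and the timings will differ between machines and Python versions.

```python
import re
import string
import timeit

PUNCTUATION = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''
TEXT = "Python daddy is the best!!! -#. blog!!! ever:::"
TABLE = str.maketrans("", "", string.punctuation)


def with_loop(text=TEXT):
    result = ""
    for char in text:
        if char not in PUNCTUATION:
            result = result + char
    return result


def with_translate(text=TEXT):
    return text.translate(TABLE)


def with_regex(text=TEXT):
    return re.sub(r"[^\w\s]", "", text)


for func in (with_loop, with_translate, with_regex):
    # timeit.timeit calls func `number` times and returns the total seconds.
    seconds = timeit.timeit(func, number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```

The functions return the result instead of printing it, so the cost of print does not end up in the measurement.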
It is also possible to remove punctuation from user input (recorded from the keyboard) with Python.
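A minimal sketch of how that could look, assuming the text comes from the built-in input function. The function name strip_punctuation_from_input and its read parameter are hypothetical; the parameter only exists so that the function can be tried out without typing anything.

```python
import string


def strip_punctuation_from_input(read=input):
    """Read one line via read() and return it without ASCII punctuation."""
    text = read("Enter some text: ")
    return text.translate(str.maketrans("", "", string.punctuation))


# In an interactive session you would call it like this:
# print(strip_punctuation_from_input())
```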
Conclusion
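To summarize, in this post you have learned three methods to remove punctuation in Python: a for loop, the string module, and regular expressions. In my timings, regular expressions seemed to be the fastest of the three, but the difference may depend on your input and Python version, so it is worth timing the methods on your own data.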