Try converting the stopwords to a set. With a list, your approach is O(n*m), where n is the number of words in the text and m is the number of stopwords; with a set it is O(n + m), because building the set costs O(m) and each lookup is O(1) on average. Let's compare the two approaches, list vs set:
import timeit

from nltk.corpus import stopwords


def list_clean(text):
    # Membership test against a list: O(m) per word.
    stop_words = stopwords.words('english')
    return [w for w in text if w.lower() not in stop_words]


def set_clean(text):
    # Membership test against a set: O(1) on average per word.
    set_stop_words = set(stopwords.words('english'))
    return [w for w in text if w.lower() not in set_stop_words]


text = ['the', 'cat', 'is', 'on', 'the', 'table', 'that', 'is', 'in', 'some', 'room'] * 100000

if __name__ == "__main__":
    print(timeit.timeit('list_clean(text)', 'from __main__ import text,list_clean', number=5))
    print(timeit.timeit('set_clean(text)', 'from __main__ import text,set_clean', number=5))
Output
7.6629380420199595
0.8327891009976156
In the code above, list_clean removes stopwords using a list and set_clean removes them using a set. The first time corresponds to list_clean and the second to set_clean. For this example, set_clean is almost 10 times faster.
UPDATE
O(n*m) and O(n + m) are examples of big O notation, a theoretical way of measuring the efficiency of algorithms. Roughly speaking, the faster the expression grows, the less efficient the algorithm; here O(n*m) grows faster than O(n + m), so the list_clean method is theoretically less efficient than the set_clean method. These bounds come from the fact that searching a list is O(n), since elements are checked one by one, while searching a set (which is backed by a hash table) takes, on average, a constant amount of time, often referred to as O(1).
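To see the lookup cost in isolation, here is a minimal sketch that times a single membership test against a list versus a set (the 1000-element range and the repeat count are arbitrary choices for illustration, not taken from the code above):

import timeit

setup = "items_list = list(range(1000)); items_set = set(items_list)"

# List membership scans elements one by one: O(n) per lookup.
print(timeit.timeit("999 in items_list", setup, number=100000))

# Set membership is a hash lookup: O(1) on average per lookup.
print(timeit.timeit("999 in items_set", setup, number=100000))

On a typical machine the set lookup should come out orders of magnitude faster, which is exactly the difference that separates list_clean from set_clean once it is repeated for every word in the text.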