For example, I have a list
my_list = ['image101.jpg', 'image2.jpg', 'image1.jpg']
and
my_list.sort()
gives me
['image1.jpg', 'image101.jpg', 'image2.jpg']
but I need the names ordered by their number:
['image1.jpg', 'image2.jpg', 'image101.jpg']
How can this be done?
list.sort accepts an optional key function. Each item is passed to that function, and the function's return value is used to compare items instead of the original values.
>>> my_list= ['image101.jpg', 'image2.jpg', 'image1.jpg']
>>> my_list.sort(key=lambda x: int(''.join(filter(str.isdigit, x))))
>>> my_list
['image1.jpg', 'image2.jpg', 'image101.jpg']
filter and str.isdigit are used to extract the digits (all digits in the name are joined, so this assumes each name contains a single number):
>>> ''.join(filter(str.isdigit, 'image101.jpg'))
'101'
>>> int(''.join(filter(str.isdigit, 'image101.jpg')))
101
''.join(..) is not required in Python 2.x, where filter applied to a string already returns a string.
Use a regex to pull the number from the string and cast it to int:
import re
r = re.compile(r"\d+")  # raw string, so \d is not treated as a string escape
l = my_list = ['image101.jpg', 'image2.jpg', 'image1.jpg']
l.sort(key=lambda x: int(r.search(x).group()))
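Or use a more specific regex that also requires the dot after the number:
import re
r = re.compile(r"(\d+)\.")
l = my_list = ['image101.jpg', 'image2.jpg', 'image1.jpg']
# group(1) is the captured number; group() would include the trailing dot and break int()
l.sort(key=lambda x: int(r.search(x).group(1)))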
Both give the same output for your example input:
['image1.jpg', 'image2.jpg', 'image101.jpg']
If you are sure of the extension you can use a very specific regex:
r = re.compile(r"(\d+)\.jpg$")
l.sort(key=lambda x: int(r.search(x).group(1)))
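The regex keys above raise AttributeError when a name contains no digits, because search returns None in that case. Here is a minimal sketch of a more defensive key, assuming you want such names sorted ahead of the numbered ones. numeric_key is a hypothetical helper name and 'cover.jpg' is made-up input, neither comes from the question:

```python
import re

_DIGITS = re.compile(r"\d+")  # first run of digits in the name

def numeric_key(name):
    """Sort key: names without digits first, then by the first number found."""
    match = _DIGITS.search(name)
    if match is None:
        # Assumption: un-numbered names go first, ordered alphabetically.
        return (0, 0, name)
    # Ties on the number fall back to the full name.
    return (1, int(match.group()), name)

files = ['image101.jpg', 'cover.jpg', 'image2.jpg', 'image1.jpg']
result = sorted(files, key=numeric_key)
print(result)
# ['cover.jpg', 'image1.jpg', 'image2.jpg', 'image101.jpg']
```

Returning a tuple keeps the comparison well defined: the first element separates the two groups, so a string is never compared with an int.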
If you want to do this in the general case, I would try a natural sorting package like natsort.
from natsort import natsorted
my_list = ['image101.jpg', 'image2.jpg', 'image1.jpg']
natsorted(my_list)
Returns:
['image1.jpg', 'image2.jpg', 'image101.jpg']
You can install it with pip: pip install natsort
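If you would rather not add a dependency, a rough standard-library approximation of natural sorting is possible. This is only a sketch, not how natsort itself works: natural_key is a hypothetical helper, the 'photo...' names are made-up input, and it handles plain non-negative integers only (no signs, decimals or locale rules):

```python
import re

def natural_key(text):
    """Split text into alternating non-digit / digit chunks; digit chunks become ints."""
    # With a capturing group, re.split keeps the separators, so odd indexes
    # are always the digit runs and even indexes are the text between them.
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]

files = ['image101.jpg', 'photo10_v10.jpg', 'image2.jpg', 'photo10_v2.jpg', 'image1.jpg']
result = sorted(files, key=natural_key)
print(result)
# ['image1.jpg', 'image2.jpg', 'image101.jpg', 'photo10_v2.jpg', 'photo10_v10.jpg']
```

Because the chunks alternate strictly, position i holds the same type in every key, which avoids the str versus int comparison errors that Python 3 would otherwise raise.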
Actually, you don't need a regex pattern at all. If every name has the form imageN.jpg, you can slice the number out directly:
>>> 'image101.jpg'[5:-4]
'101'
Solution:
>>> sorted(my_list, key=lambda x: int(x[5:-4]))
['image1.jpg', 'image2.jpg', 'image101.jpg']
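The fixed slice [5:-4] only works while the prefix is exactly five characters and the extension exactly four. A slightly more tolerant sketch, assuming every name still starts with the same known prefix, strips the extension with os.path.splitext instead. PREFIX and index_from_name are hypothetical names, and the mixed extensions are made-up input:

```python
import os

PREFIX = 'image'  # assumption: every file name starts with this prefix

def index_from_name(name):
    """Return the integer between PREFIX and the file extension."""
    stem, _ext = os.path.splitext(name)  # 'image101.jpeg' -> ('image101', '.jpeg')
    return int(stem[len(PREFIX):])

files = ['image101.jpg', 'image2.png', 'image1.jpeg']
result = sorted(files, key=index_from_name)
print(result)
# ['image1.jpeg', 'image2.png', 'image101.jpg']
```

Like the slice, this still raises ValueError if a name does not follow the prefix-plus-number pattern.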