Why is single-value tuple plus list is more efficient than appending a number?

Question

Assume we need to append single number 1 to array a.

In Python we have 5 obvious ways:

a.append(1)
a += [1]
a += 1,
a.extend((1,))
a.extend([1])

Let's measure it with timeit :

from timeit import timeit

print(timeit("a.append(1)", "a = []", number=10_000_000))
print(timeit("a += [1]", "a = []", number=10_000_000))
print(timeit("a += 1,", "a = []", number=10_000_000))
print(timeit("a.extend((1,))", "a = []", number=10_000_000))
print(timeit("a.extend([1])", "a = []", number=10_000_000))

Here is the console's output:

5.05412472199896
5.869792026000141
3.1280645619990537
4.988895307998973
8.05588494499898

Why is third one more efficient than others?

take a look at https://stackoverflow.com/questions/2022031/python-append-vs-operator-on-lists-why-do-these-give-different-results — , Feb 18 '22 at 17:08
@SembeiNorimaki He's not getting different results, just different performance. — Barmar, Feb 18 '22 at 17:09
Probably because tuples are simpler than lists, since they're immutable. So creating the tuple is faster. — Barmar, Feb 18 '22 at 17:10
But why creating tuple and plusing it to list is more efficient than just appending value, it is not that simple — Gleb, Feb 18 '22 at 17:11
Your tests aren't just timing adding to the list, they're also timing creating the tuple or list that you extend with. — Barmar, Feb 18 '22 at 17:11
I suggest you look at the generated bytecode to see the difference. — Barmar, Feb 18 '22 at 17:11
Well the first thing I would say is that you can't add an `int` to a `list`. So the `a += 1` actually crashes. — TeddyBearSuicide, Feb 18 '22 at 17:12
Yes, and this is the part of the question, creating tuple is not free, but it is still better — Gleb, Feb 18 '22 at 17:12
@TeddyBearSuicide it's `a += 1,` (note the trailing comma that makes it a tuple). I missed it on the first reading too. :) — Samwise, Feb 18 '22 at 17:12
@Barmar it's actually cached by the compiler, a tuple literal that only consists of immutable literals becomes a constant — juanpa.arrivillaga, Feb 18 '22 at 17:14
Note `+=` with an array does not `append()` nor is it the same implementation as `+` — JonSG, Feb 18 '22 at 17:17

juanpa.arrivillaga · Accepted Answer · 2022-02-18T17:25:30.960

The creation of a tuple (1,) is optimized away by the compiler. On the other hand, the list is always created. Look at dis.dis

>>> import dis
>>> dis.dis('a.extend((1,))')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis('a.extend([1])')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 (1)
              6 BUILD_LIST               1
              8 CALL_METHOD              1
             10 RETURN_VALUE

Notice, it takes less byte-code instructions, and merely does a LOAD_CONST on (1,). On the other hand, for the list, BUILD_LIST is called (with a LOAD_CONST for 1).

Note, you can access these constants on the code object:

>>> code = compile('a.extend((1,))', '', 'eval')
>>> code
<code object <module> at 0x10e91e0e0, file "", line 1>
>>> code.co_consts
((1,),)

Finally, as to why += is faster than .extend, well, again if you look at the bytecode:

>>> dis.dis('a += b')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> dis.dis('a.extend(b)')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_NAME                2 (b)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You'll notice for .extend, it that requires first resolving the method (which takes extra time). Using the operator on the other hand has it's own bytecode: INPLACE_ADD so everything is pushed down into that C layer (plus, magic methods skip instance namespaces and a bunch of hooplah and are looked up directly on the class).

But what about `a.append(1)`? Why isn't that faster, since it doesn't have to create a tuple or list? — Barmar, Feb 18 '22 at 17:15
Note that you should be calling `extend` in these examples, not `append`, although it doesn't make a difference for the point you're making. — Barmar, Feb 18 '22 at 17:16
@Barmar look at the edit. Using operators is faster than using methods. Also, yeah you are right about extend — juanpa.arrivillaga, Feb 18 '22 at 17:17
Doesn't `INPLACE_ADD` have to look up and call the `__iadd__` method? — Barmar, Feb 18 '22 at 17:19
@Barmar it does,but without hopping back to the bytecode evaluation loop, done entirely in C, and it skips the instance namespace since magic-method lookups are special-cased — juanpa.arrivillaga, Feb 18 '22 at 17:20
Yeah, I was about to conjecture that there are optimizations for the built-in types. — Barmar, Feb 18 '22 at 17:21

score 1 · Answer 2 · answered Feb 18 '22 at 17:41

Ok, to sum up, what @juanpa.arrivillaga, @Samwise and @Barmar mentioned:

a += (1, ) is equiualent to a.__iadd__((1, )) but without loading methods. If we look at dis:

>>> dis.dis("a.__iadd__((1,))")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (__iadd__)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE

>>> dis.dis("a += (1, )")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_CONST               0 ((1,))
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

>>> dis.dis("a.append(1)")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (append)
              4 LOAD_CONST               0 (1)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You can see that in first and third cases we needed to use LOAD_METHOD before call and this is most resourse-mean part, whilst += have direct disassembler

Btw, first case is more worse than previous five and got 8.292503296999712 on timeit

Samwise · Answer 3 · 2022-02-18T17:13:22.993

0

The difference between the tuple and list is entirely due to the difference in how much time it takes to create each. This is easy to verify by timing the list/tuple creation on its own, and by timing the += operation independently of the creation of the objects being added:

>>> timeit("b = [1]", number=10_000_000)
0.447727799997665
>>> timeit("b = (1,)", number=10_000_000)
0.17419059993699193
>>> timeit("a += b", "a, b = [], [1]", number=10_000_000)
0.5244328000117093
>>> timeit("a += b", "a, b = [], (1,)", number=10_000_000)
0.5320590999908745

edited Feb 18 '22 at 17:13

answered Feb 18 '22 at 17:12

Samwise

68,105
3
30
44

I would be mostly curious why `a.append(1)` is not faster than both. – Mark Feb 18 '22 at 17:13
That's harder to answer without going into the guts of CPython, but I'd imagine that `+=` is able to optimize away an attribute lookup or otherwise bypass some piece of Python interpreter logic that makes `.append` take longer. – Samwise Feb 18 '22 at 17:14
Another possibility is that mutating a small list takes longer than creating an entirely new one and rebinding a variable... – Samwise Feb 18 '22 at 17:16
I think @juanpa.arrivillaga in the other answer has a pretty good hypothesis. – Mark Feb 18 '22 at 17:19
1

ah yes, that's the "attribute lookup" I was referring to. My first hunch was exactly right. :) – Samwise Feb 18 '22 at 17:19

Why is single-value tuple plus list is more efficient than appending a number?

3 Answers3