Introduction
Before we dive into the details of peephole optimization, let us try to understand how Python code is executed. As the Python program runs, source code gets compiled into a set of instructions called bytecode
. This intermediate bytecode
is stored with .pyc
extension under __pycache__
folder which is then executed by the Python interpreter. This bytecode
or .pyc
files contain a “faster” or “optimized” version of the source code.
The compiled code object is accessible via the attribute __code__
on the function and carries a few important attributes. Here we are interested in co_consts. We will understand these when we go thru the examples below.
co_consts
returns a tuple of any literals that occur in the function bodyco_varnames
returns a tuple containing the names of any local variables used in the function bodyco_names
returns a tuple of any non-local names referenced in the function body
The code object contains bytecode
and also some other information necessary for the CPython to run the bytecode
. When we define a function in Python, it creates a code object for it and we can access it using the __code__
attribute.
Peephole optimization
It is another type of optimization technique in Python. As the Python source code goes through the compilation step, certain things such as numeric expressions, strings & tuples get optimized and stored in bytecode
instruction. Let’s get into details.
Constant Expressions
Numeric Calculations
The constant expressions, for example, a = 30 * 8 * 70,
get optimized during the compilation.
Assume that your application has to refer to this variable multiple times during the execution of the program. In such a case, Python decides to optimize this expression by calculating the value 16800
and storing it in the variable a
. You might think that we could have directly assigned 16800
to the variable a
. But that comes at the cost of losing the readability, because this might indicate 30 days, 8 hours, and $70/per her. It makes sense to write it like this instead of using 16800
directly.
String length ≤ 4096 & tuple length ≤ 256
Any string of length ≤ 4096 and tuple of length ≤ 256 gets optimized similar to numeric constants as explained above. Anything outside this range will not be optimized by Python.
In this example, a, b and c are optimized as their length is less than the threshold mentioned above.
a = "TDS" * 5
b = [1, 2] * 7
c = (10, 20, 30) * 3
Membership Tests
First, let us try what are membership operators. Membership operators are used for testing the membership in sequences such as lists, tuples, or strings. There are 2 membership operators in Python — in
and not in
. Refer to the sample example below for better understanding.
2 in [1,2,3]
>> True
5 in [1,2,3]
>> False
5 not in [1,2,3]
>> True
Now that we have understood what are membership operators, let us understand the optimization related to the membership test.
Whenever Python encounters membership test for mutable objects such as lists & sets, they will be replaced by their immutable counterpart during the compilation. That means lists are converted to tuples
and sets are converted to frozen sets
during the compilation.
When Python encounters the list [10, 20, 30], it gets replaced with tuple i.e. (10, 20, 30) during the compilation. The same way sets
are replaced with frozen sets
.
Example 1:
30 * 8 * 70
is a constant expression. So it gets evaluated by the compiler to16800.
“TDS” * 3
is also a constant expression and length is less 4096. So it is also get evaluated by the compiler toTDSTDSTDS.
“T” * 4097
is a sequence of length ≥ 4096 hence it will not be evaluated by the compiler.(1, 2) * 5
is sequence of length 10 which is less than the threshold for tuple i.e. 256 so it gets evaluated and stored as(1,2,1,2,1,2,1,2,1,2).
(10,)
is a sequence of length 257 which is above the threshold for tuple so it won’t get evaluated by the compiler.[101, 102] * 2
is a list. Since lists are mutable objects it will not be evaluated by the compiler.
Example 2:
In this example, we will look at peephole optimization concerning the membership test. As we can see from the below code, list and sets are replaced with tuple and frozen set respectively as expected.
Tip: We should prefer set membership operation over list and tuple as set membership operation is much faster.
Conclusion
In this article, you have understood the concept of peephole optimization in Python. If you are interested in reading another technique called interning, you can read it here.
Originally published at Medium on 22-Aug-2020