Python Simplified

Optimization in Python — Peephole

Peephole Optimization resized

Introduction

Before we dive into the details of peephole optimization, let us try to understand how Python code is executed. As the Python program runs, source code gets compiled into a set of instructions called bytecode. This intermediate bytecode is stored with .pyc extension under __pycache__ folder which is then executed by the Python interpreter. This bytecode or .pyc files contain a “faster” or “optimized” version of the source code.

peephole1

The compiled code object is accessible via the attribute __code__ on the function and carries a few important attributes. Here we are interested in co_consts. We will understand these when we go thru the examples below.

  • co_consts returns a tuple of any literals that occur in the function body
  • co_varnames returns a tuple containing the names of any local variables used in the function body
  • co_names returns a tuple of any non-local names referenced in the function body

The code object contains bytecode and also some other information necessary for the CPython to run the bytecode. When we define a function in Python, it creates a code object for it and we can access it using the __code__ attribute.

Peephole optimization

It is another type of optimization technique in Python. As the Python source code goes through the compilation step, certain things such as numeric expressions, strings & tuples get optimized and stored in bytecode instruction. Let’s get into details.

Constant Expressions

Numeric Calculations

The constant expressions, for example, a = 30 * 8 * 70, get optimized during the compilation.

Assume that your application has to refer to this variable multiple times during the execution of the program. In such a case, Python decides to optimize this expression by calculating the value 16800 and storing it in the variable a. You might think that we could have directly assigned 16800 to the variable a. But that comes at the cost of losing the readability, because this might indicate 30 days, 8 hours, and $70/per her. It makes sense to write it like this instead of using 16800 directly.

String length ≤ 4096 & tuple length ≤ 256

Any string of length ≤ 4096 and tuple of length ≤ 256 gets optimized similar to numeric constants as explained above. Anything outside this range will not be optimized by Python.

In this example, a, b and c are optimized as their length is less than the threshold mentioned above.

				
					a = "TDS" * 5
b = [1, 2] * 7
c = (10, 20, 30) * 3
				
			

Membership Tests

First, let us try what are membership operators. Membership operators are used for testing the membership in sequences such as lists, tuples, or strings. There are 2 membership operators in Python — in and not in. Refer to the sample example below for better understanding.

				
					2 in [1,2,3]
>> True

5 in [1,2,3]
>> False

5 not in [1,2,3]
>> True
				
			

Now that we have understood what are membership operators, let us understand the optimization related to the membership test.

 

Whenever Python encounters membership test for mutable objects such as lists & sets, they will be replaced by their immutable counterpart during the compilation. That means lists are converted to tuples and sets are converted to frozen sets during the compilation.

When Python encounters the list [10, 20, 30], it gets replaced with tuple i.e. (10, 20, 30) during the compilation. The same way sets are replaced with frozen sets.

Example 1:

peephole2
  • 30 * 8 * 70 is a constant expression. So it gets evaluated by the compiler to 16800.
  • “TDS” * 3 is also a constant expression and length is less 4096. So it is also get evaluated by the compiler to TDSTDSTDS.
  • “T” * 4097 is a sequence of length ≥ 4096 hence it will not be evaluated by the compiler.
  • (1, 2) * 5 is sequence of length 10 which is less than the threshold for tuple i.e. 256 so it gets evaluated and stored as (1,2,1,2,1,2,1,2,1,2).
  • (10,) is a sequence of length 257 which is above the threshold for tuple so it won’t get evaluated by the compiler.
  • [101, 102] * 2 is a list. Since lists are mutable objects it will not be evaluated by the compiler.

Example 2:

In this example, we will look at peephole optimization concerning the membership test. As we can see from the below code, list and sets are replaced with tuple and frozen set respectively as expected.

peephole3

Tip: We should prefer set membership operation over list and tuple as set membership operation is much faster.

Conclusion

In this article, you have understood the concept of peephole optimization in Python. If you are interested in reading another technique called interning, you can read it here.

Originally published at Medium on 22-Aug-2020

Share on facebook
Share on twitter
Share on linkedin
Share on whatsapp
Share on email
Chetan Ambi

Chetan Ambi

A Software Engineer & Team Lead with over 10+ years of IT experience, a Technical Blogger with a passion for cutting edge technology. Currently working in the field of Python, Machine Learning & Data Science. Chetan Ambi holds a Bachelor of Engineering Degree in Computer Science.
Scroll to Top