Introduction
When you need to filter data based on a condition, you can easily do so by using Python’s built-in filter function. That’s it! So simple!
If you want to pair up the elements of two lists (or any other iterables), you can use Python’s built-in zip function. Not clear? No problem. It is an interesting feature of Python and of great use to you! You will understand all about it in this article.
In the previous article, we covered the map function in great detail. In this post, we will look at the zip and filter functions: their syntax, examples, and more.
filter function
As you can see from the syntax below, the filter function takes just two arguments: a function and an iterable to filter.
- function / None: first, you need to write a function that contains the condition to filter the data. When this function is used with filter, filter returns those elements of the iterable for which the function returns true.
- iterable: input data on which you need to perform the filter operation. It can be any iterable like sets, lists, tuples, dictionaries, etc.
Syntax
filter(function or None, iterable)
Result: The filter function returns an iterator object that yields only the filtered data as per your conditions (as written in the function/lambda).
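To make the "iterator object" part concrete, here is a small sketch. The stock dictionary is hypothetical sample data invented for this illustration. It shows that filter accepts a dictionary (iterating over its keys) and that the result produces items on demand rather than building a list up front.

```python
# Hypothetical sample data, used only for illustration.
stock = {"apples": 12, "pears": 0, "plums": 7}

# Iterating over a dict yields its keys, so filter() tests each key.
in_stock = filter(lambda name: stock[name] > 0, stock)

# The result is a lazy iterator, not a list: items are produced on demand.
first = next(in_stock)        # consumes the first matching key
remaining = list(in_stock)    # collects whatever is left

print(first)       # apples
print(remaining)   # ['plums']
```

If you need all the results at once, wrap the call in list(), as several of the examples in this article do.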
Examples
(1) Given students’ marks, you can pick out the first-class marks (60 or above) using the filter function. This example shows how to use a lambda with filter.
>>> marks = [35, 85, 90, 30, 55, 99, 75, 66, 45]
>>> first_class = filter(lambda x: x >= 60, marks)
>>> print("\nFirst Class Student Marks")
>>> print("--"*15)
>>> for mark in first_class:
...     print(mark)
Output:
First Class Student Marks
------------------------------
85
90
99
75
66
(2) Given a list of book titles, filter the books on the Python programming language. This example shows how you can use a regular Python function for filtering.
>>> library_books = ['Python Basics', 'Java Advanced', 'C# Fundamentals', 'SQL for Beginners', 'Advanced Python', 'Machine Learning with Python', 'Golang Basics']
>>> def is_python_book(book):
...     # Case-insensitive check for the word "python" in the title
...     return "python" in book.lower()
>>> python_books = filter(is_python_book, library_books)
>>> for b in python_books:
...     print(b)
Output:
Python Basics
Advanced Python
Machine Learning with Python
(3) Given a list of chemical compounds, you can separate the compounds that contain the oxygen element from those that do not.
>>> chemical_compounds = ['H2SO4', 'KCL', 'O2', 'CO2', 'H2', 'NH3']
>>> has_oxygen_element = filter(lambda x: x.strip().upper().find('O') != -1, chemical_compounds)
>>> no_oxygen_element = filter(lambda x: x.strip().upper().find('O') == -1, chemical_compounds)
>>> print("\nChemical Compounds Containing Oxygen Element")
>>> print("_"*50)
>>> for o_compound in has_oxygen_element:
...     print(o_compound)
>>> print("\nChemical Compounds Without Oxygen Element")
>>> print("_"*50)
>>> for compound in no_oxygen_element:
...     print(compound)
Output:
Chemical Compounds Containing Oxygen Element
__________________________________________________
H2SO4
O2
CO2
Chemical Compounds Without Oxygen Element
__________________________________________________
KCL
H2
NH3
Interesting fact # 1: If you pass None instead of a function as the first argument to filter, then the filter function returns all the objects in the given iterable that are truthy, that is, objects for which bool(obj) is True.
>>> list(filter(None,[0, 1, 0.0, 'Python', '', [], None, True]))
[1, 'Python', True]
Interesting fact # 2: Conversely, if you want the elements of the iterable for which the function returns false, you can look at itertools.filterfalse().
>>> mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(filter(lambda x: x%2==0, mylist))
[2, 4, 6, 8, 10]
>>> from itertools import filterfalse
>>> mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> list(filterfalse(lambda x: x%2==0, mylist))
[1, 3, 5, 7, 9]
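As a further illustration, filter and filterfalse can share a single predicate when you want both halves of a split. The sketch below reuses the chemical compounds list from example (3). The helper name contains_oxygen is made up for this example, and the check is a naive letter test like the one in the article, not real chemistry.

```python
from itertools import filterfalse

chemical_compounds = ['H2SO4', 'KCL', 'O2', 'CO2', 'H2', 'NH3']

def contains_oxygen(formula):
    """Naive check (illustration only): does the formula contain the letter 'O'?"""
    return 'O' in formula.upper()

# Same predicate, opposite selections.
with_oxygen = list(filter(contains_oxygen, chemical_compounds))
without_oxygen = list(filterfalse(contains_oxygen, chemical_compounds))

print(with_oxygen)     # ['H2SO4', 'O2', 'CO2']
print(without_oxygen)  # ['KCL', 'H2', 'NH3']
```

Writing the condition once avoids keeping two nearly identical lambdas in sync.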
Interesting fact # 3: If the function is not None, then the filter function is equivalent to the following generator expression:
(item for item in iterable if function(item))
If the function is None, then the filter function is equivalent to the following generator expression:
(item for item in iterable if item)
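You can check both equivalences yourself with a quick sketch like the one below. The is_string predicate is a hypothetical helper written just for this comparison.

```python
values = [0, 1, 0.0, 'Python', '', [], None, True]

def is_string(item):
    """Hypothetical predicate, used only for this comparison."""
    return isinstance(item, str)

# Case 1: a function is given.
with_function = list(filter(is_string, values))
as_genexp = list(item for item in values if is_string(item))
print(with_function == as_genexp)   # True

# Case 2: None is given, so each item's own truth value decides.
with_none = list(filter(None, values))
as_genexp_none = list(item for item in values if item)
print(with_none == as_genexp_none)  # True
```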
zip function
The zip function takes zero or more iterables as arguments, zips together the elements from each iterable based on their relative position, and yields tuples until the shortest input iterable is exhausted. The output of the zip function is an iterator.
Syntax
zip(*iterables)
where,
iterables: zero or more iterables such as sets, lists, tuples, dictionaries, etc.
Examples
(1) zip function with no iterable: When no iterables are passed to the zip function, it returns an empty iterator, so converting it to a list gives an empty list.
>>> list(zip())
[]
(2) zip function with one iterable: You will probably never use the zip function with only one iterable, but it is good to know what happens. As the code below shows, it yields tuples that each contain a single element.
>>> list(zip([1, 2, 3]))
[(1,), (2,), (3,)]
(3) zip function with iterables of the same length: In the example below, the zip function zips together the characters of the string list1 and the elements of list2 and returns an iterator.
>>> list1 = 'abcde'
>>> list2 = [1, 4, 9, 16, 25]
>>> list( zip(list1, list2) )
[('a', 1), ('b', 4), ('c', 9), ('d', 16), ('e', 25)]
(4) zip function with iterables of different lengths: In the example below, list1, list2, and list3 are of different lengths. The length of the output iterator depends on the size of the iterable with the fewest elements; in this example, that is list3.
>>> list1 = [1, 2, 3, 4, 5]
>>> list2 = [1, 4, 9, 16, 25]
>>> list3 = [1, 8, 27]
>>> list(zip(list1, list2, list3))
[(1, 1, 1), (2, 4, 8), (3, 9, 27)]
When dealing with iterables of different lengths, the zip function stops when the shortest iterable is exhausted. This behavior can be altered using the zip_longest function from the itertools module.
zip_longest(): With the zip_longest function, iteration continues until the longest iterable is exhausted. The missing values are filled with None by default. This can be changed by passing a different value to the fillvalue parameter.
>>> from itertools import zip_longest
>>> list(zip_longest(list1, list2, list3))
[(1, 1, 1), (2, 4, 8), (3, 9, 27), (4, 16, None), (5, 25, None)]
>>> list(zip_longest(list1, list2, list3, fillvalue='-'))
[(1, 1, 1), (2, 4, 8), (3, 9, 27), (4, 16, '-'), (5, 25, '-')]
(5) zip function for unpacking iterables: Did you know that the zip function can also be used to unzip a zipped object by using the asterisk (*) operator? Here is an example. As you can see, the zipped object is unpacked into two tuples.
>>> zipped = [('a', 1), ('b', 4), ('c', 9), ('d', 16), ('e', 25)]
>>> unzipped = zip(*zipped)
>>> list(unzipped)
[('a', 'b', 'c', 'd', 'e'), (1, 4, 9, 16, 25)]
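To round off the examples, here is a short sketch of two everyday uses of zip: looping over two lists in parallel and building a dictionary from keys and values. The names and marks are hypothetical sample data.

```python
# Hypothetical sample data, used only for illustration.
names = ['Asha', 'Ben', 'Chen']
marks = [85, 55, 99]

# Loop over both lists in parallel, one (name, mark) pair at a time.
report = [f"{name}: {mark}" for name, mark in zip(names, marks)]
print(report)          # ['Asha: 85', 'Ben: 55', 'Chen: 99']

# Build a dictionary by pairing each key with its value.
marks_by_name = dict(zip(names, marks))
print(marks_by_name)   # {'Asha': 85, 'Ben': 55, 'Chen': 99}

# Unzip the (key, value) pairs back into two tuples with the * operator.
unzipped_names, unzipped_marks = zip(*marks_by_name.items())
print(unzipped_names)  # ('Asha', 'Ben', 'Chen')
print(unzipped_marks)  # (85, 55, 99)
```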
List comprehension Vs. filter
Filter and list comprehensions each have their own pros and cons. In the sample speed comparison below, filter is faster than a list comprehension that calls the same function, but slower than a list comprehension with the condition written inline.
For the speed comparison, we repeat the operation of identifying even numbers in range(100), that is 0 to 99, 100,000 times. The results are below.
Case 1: filter function with a function — 1.97 seconds
Case 2: list comprehension with function — 2.71 seconds
Case 3: list comprehension without function — 1.41 seconds
>>> from timeit import timeit
>>> def my_func(x):
...     if x%2 == 0:
...         return True
...     else:
...         return False
>>> print("filter :", timeit("list(filter(my_func, range(100)))", globals=globals(), number=100_000))
# filter : 1.9725104999999998
>>> print("listcomp 1 :", timeit("[my_func(x) for x in range(100)]", globals=globals(), number=100_000))
# listcomp 1 : 2.7164394
>>> print("listcomp 2 :", timeit("[x for x in range(100) if x%2==0]", globals=globals(), number=100_000))
# listcomp 2 : 1.4104230000000006
As seen from the results, the list comprehension without function runs faster than filter and list comprehension with function.
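One caveat when reading such numbers: a comprehension of the form [my_func(x) for x in ...] collects the function's return values rather than the filtered items, so it is not doing quite the same job as filter. If you want to rerun the comparison on your own machine, the sketch below may be a fairer starting point. It is only a sketch: the function names are made up for this example, each variant is wrapped in a small function (which adds the same call overhead to all three), and no timings are quoted because they depend on your hardware and Python version.

```python
from timeit import timeit

def is_even(x):
    return x % 2 == 0

def with_filter():
    return list(filter(is_even, range(100)))

def listcomp_with_function():
    # A filtering comprehension: keeps x only when is_even(x) is true.
    return [x for x in range(100) if is_even(x)]

def listcomp_inline():
    return [x for x in range(100) if x % 2 == 0]

variants = [with_filter, listcomp_with_function, listcomp_inline]

# Sanity check: all three variants must build the same list before we time them.
expected = list(range(0, 100, 2))
assert all(variant() == expected for variant in variants)

# timeit accepts a callable; a smaller repeat count keeps the run short.
timings = {variant.__name__: timeit(variant, number=10_000) for variant in variants}
for name, seconds in timings.items():
    print(f"{name:24s} {seconds:.4f} s")
```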
So, which one should you use: filter or a list comprehension? If you need to load all of the data into memory and/or iterate over the result multiple times, consider a list comprehension. Otherwise, you can consider using the filter function. Also, list comprehensions are generally considered more Pythonic than the filter and map functions.
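The "iterate multiple times" point is easy to miss, so here is a minimal sketch of what it means in practice: a filter object can be consumed only once, while a list can be reused as often as you like.

```python
numbers = range(10)

# A filter object is a one-shot iterator.
evens_iter = filter(lambda x: x % 2 == 0, numbers)
first_pass = list(evens_iter)
second_pass = list(evens_iter)   # nothing left: the iterator is exhausted
print(first_pass)    # [0, 2, 4, 6, 8]
print(second_pass)   # []

# A list comprehension builds a list that can be traversed repeatedly.
evens_list = [x for x in numbers if x % 2 == 0]
first_total = sum(evens_list)
second_total = sum(evens_list)
print(first_total, second_total)   # 20 20
```

On the other hand, the filter object never holds all the results in memory at once, which is why it can be the better fit for a single pass over large data.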
Summary
- The filter function takes a function and an iterable, applies the function to each item in the iterable, and returns an iterator over the items for which the function returns true.
- You can also pass None instead of a function as the first argument to filter; the filter function then returns all the objects in the given iterable that are truthy.
- If you want the elements of an iterable for which the function returns false, you can look at itertools.filterfalse().
- The zip function takes zero or more iterables as arguments, zips together the elements from each iterable based on their relative position, and yields tuples until the shortest input iterable is exhausted.
- In the sample speed comparison, filter was faster than a list comprehension that calls the same function, but slower than a list comprehension with the condition written inline.
- If you need to load all of the data into memory and/or iterate over the result multiple times, consider a list comprehension. Otherwise, you can consider using the filter function.