Intermediate Python Knowledge
List Comprehension and Beyond — Understand 4 Key Related Techniques in Python
They’re easier than you may think
Apr 20 · 7 min read
When we’re learning Python, list comprehension is a tricky technique that can take some time to fully understand. Once we’ve learned it, we like to use it because it’s a neat way to showcase our coding expertise in Python. In particular, when beginners read our code, they’re often amazed that such a concise way of creating lists exists. What they probably don’t know is that understanding the syntax of list comprehension also helps with understanding a few other critical Python techniques. In this article, let’s explore them together.
1. List Comprehension
Let’s first review what list comprehension is. It has the following basic syntax: [expression for item in iterable]. In essence, it goes over an iterable, evaluates the expression for each item, and returns a list consisting of the results. Consider the example below.
>>> # create a list of words
>>> words = ["quixotic", "loquacious", "epistemology", "liminal"]
>>> # create a list of numbers counting the letters
>>> letter_counts = [len(x) for x in words]
>>> letter_counts
[8, 10, 12, 7]
In the above code, we create a list of numbers called letter_counts, with each number being the letter count of the corresponding word in the words list. Pretty straightforward, right?
Let’s make something more interesting. In the code below, we create a list of uppercased words, keeping only the longer words from the words list by adding an if clause to the comprehension.
>>> # create a list of uppercased words with letter count > 8
>>> uppercased = [x.upper() for x in words if len(x) > 8]
>>> uppercased
['LOQUACIOUS', 'EPISTEMOLOGY']
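If the filtered comprehension feels dense, it may help to see it unrolled into an ordinary loop. The sketch below is only an illustration; the name uppercased_loop is made up for this comparison and isn't part of the original example.

```python
words = ["quixotic", "loquacious", "epistemology", "liminal"]

# The comprehension from the example above
uppercased = [x.upper() for x in words if len(x) > 8]

# The same logic written as an explicit loop
uppercased_loop = []
for x in words:                            # the iteration part
    if len(x) > 8:                         # the trailing if clause (the filter)
        uppercased_loop.append(x.upper())  # the expression

print(uppercased_loop == uppercased)
```

Reading the comprehension as "expression, then loop, then filter" maps directly onto the three commented lines of the loop.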
2. Dictionary Comprehension
Besides list comprehension, Python has a similar technique for creating dictionaries, termed dictionary comprehension. It has the following basic syntax: {exp_key: exp_value for item in iterable}. As you can see, the expression is similar to list comprehension in that both have the iteration part (i.e., for item in iterable).
There are two differences. First, we use curly brackets for dictionary comprehension instead of square brackets for list comprehension. Second, dictionary comprehension has two expressions with one for the key and the other for the value, as opposed to one expression in the list comprehension.
Let’s see the example below. We have a list of tuples with each holding the student’s name and score. Next, we use the dictionary comprehension technique to create a dictionary with the names being the keys and the scores being the values.
>>> # create a list of tuples having student names and scores
>>> scores = [("John", 88), ("David", 95), ("Aaron", 94)]
>>> # create a dictionary using name as key and score as value
>>> dict_scores = {x[0]: x[1] for x in scores}
>>> dict_scores
{'John': 88, 'David': 95, 'Aaron': 94}
To make our example more interesting (and more instructive), we can incorporate a conditional expression into the dictionary comprehension (this works in list comprehension too). Consider the example below, which still uses the scores list.
>>> # create the dictionary using name as key and grade as value
>>> dict_grades = {x[0]: 'Pass' if x[1] >= 90 else 'Fail' for x in scores}
>>> dict_grades
{'John': 'Fail', 'David': 'Pass', 'Aaron': 'Pass'}
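A small variation that some readers may find easier to follow is to unpack each tuple in the for clause instead of indexing with x[0] and x[1]. The sketch below also contrasts a conditional expression (which chooses a value) with a trailing if clause (which drops items). The names name, score, and dict_passed are chosen just for this illustration.

```python
scores = [("John", 88), ("David", 95), ("Aaron", 94)]

# Conditional expression: every student is kept, the value depends on the score
dict_grades = {name: "Pass" if score >= 90 else "Fail" for name, score in scores}

# Trailing if clause: students scoring below 90 are left out entirely
dict_passed = {name: score for name, score in scores if score >= 90}

print(dict_grades)
print(dict_passed)
```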
3. Set Comprehension
We all know that there are three major built-in collection data structures in Python: lists, dictionaries, and sets. Since there are list and dictionary comprehensions, it’s not surprising that there is also set comprehension.
The set comprehension has the following syntax: {expression for item in iterable}. The syntax is almost identical to list comprehension except that it uses curly brackets instead of square brackets. Let’s see how it works with the following example.
>>> # create a list of words of random letters
>>> nonsenses = ["adcss", "cehhe", "DesLs", "dddd"]
>>> # create a set of words of unique letters for each word
>>> unique_letters = {"".join(set(x)) for x in nonsenses}
>>> unique_letters
{'d', 'cdas', 'eLsD', 'ceh'}
In the above code, we create a list of nonsense words with random letters called nonsenses. We then create a set called unique_letters, in which each item consists of the unique letters of one word. Because sets are unordered, the order of the letters and of the items may differ on your machine.
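To get the same strings on every run, one option is to sort each word's unique letters before joining them. This is a sketch under the assumption that a reproducible result is wanted; the name unique_letters_sorted is hypothetical and not part of the original example.

```python
nonsenses = ["adcss", "cehhe", "DesLs", "dddd"]

# Sort each word's unique letters so the joined string is deterministic
# (uppercase letters sort before lowercase ones)
unique_letters_sorted = {"".join(sorted(set(x))) for x in nonsenses}

# The set itself is still unordered, so sort it for display
print(sorted(unique_letters_sorted))
```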
One thing to note is that in Python, sets can’t have items with duplicate values, and thus set comprehension will remove duplicate items automatically for us. Please see the code below for this feature.
>>> # create a list of tuples of numbers
>>> numbers = [(12, 20, 15), (11, 9, 15), (11, 13, 22)]
>>> # create a set of unique numbers
>>> unique_numbers = {x for triple in numbers for x in triple}
>>> unique_numbers
{9, 11, 12, 13, 15, 20, 22}
In the above code, we create a set called unique_numbers from the list numbers, whose items are tuples. As you can see, the duplicate numbers (e.g., 11) in the list have only one copy in the set.
One new thing here is that we use a nested comprehension, which has the following syntax: expression for items in iterable for item in items. This technique is useful in cases where the iterable contains other collections (e.g., a list in a list or a tuple in a list). Notably, we can use this nested form in list, dictionary, and set comprehensions.
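The order of the two for clauses can be confusing at first, so here is a sketch that unrolls the nested comprehension into explicit loops and reuses the same nesting in a list comprehension. The names unique_loop and flattened are made up for this illustration.

```python
numbers = [(12, 20, 15), (11, 9, 15), (11, 13, 22)]

# Nested set comprehension: the outer for comes first, the inner for second
unique_numbers = {x for triple in numbers for x in triple}

# Equivalent explicit loops, written in the same order
unique_loop = set()
for triple in numbers:
    for x in triple:
        unique_loop.add(x)

# The same nesting in a list comprehension keeps duplicates and order
flattened = [x for triple in numbers for x in triple]

print(unique_loop == unique_numbers)
print(flattened)
```

The for clauses read left to right in the same order as the loops read top to bottom.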
4. Generator Expression
We learned that we use curly brackets for set comprehension and square brackets for list comprehension. What if we use parentheses, like (expression for item in iterable)? Good question. It leads us to the generator expression, which some people also call generator comprehension.
In other words, when we use parentheses, we’re actually declaring a generator expression, which creates a generator. Generators are “lazy” iterators in Python. This means that a generator can be used wherever an iterator is needed, but it doesn’t produce an item until that item is requested (this is why it’s called “lazy”, a piece of programming jargon). Let’s see the example below.
>>> # create the generator and get the item
>>> squares_gen = (x*x for x in range(3))
>>> squares_gen.__next__()
0
>>> squares_gen.__next__()
1
>>> squares_gen.__next__()
4
>>> squares_gen.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
In the above code, we create a generator called squares_gen using the generator expression. Calling its __next__ method (which is what the built-in next() function does for us), we’re able to retrieve the next item from the generator. However, when the generator runs out of items, it raises a StopIteration exception, indicating that all items have been used up.
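In everyday code, generators are usually consumed with the built-in next() function or a for loop rather than by calling __next__ directly. The sketch below shows both, along with the optional default argument of next(), which avoids the exception. The names first, rest, and after_end are made up for this illustration.

```python
squares_gen = (x * x for x in range(3))

# next() is the built-in counterpart of calling __next__() directly
first = next(squares_gen)

# A for loop requests items one by one and stops quietly on StopIteration
rest = []
for value in squares_gen:
    rest.append(value)

# Passing a default returns it instead of raising once the generator is exhausted
after_end = next(squares_gen, None)

print(first, rest, after_end)
```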
Because of their lazy evaluation, generators are a memory-efficient way to iterate over an enormous number of items without building the whole collection in memory first. For example, suppose we work with an enormously large file; reading the entire file into memory may exhaust the computer’s RAM, causing it to become non-responsive. Instead, we can use a generator expression, like (row for row in open(filename)), which allows us to read the file row by row to minimize memory usage.
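One caveat with the one-liner is that the file is never explicitly closed. A minimal sketch of the same idea inside a with block is shown below; the temporary file and its three rows are stand-ins invented so the example can run anywhere, and the names stripped_rows and longest are hypothetical.

```python
import os
import tempfile

# Create a small throwaway file so the sketch can run anywhere
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    filename = tmp.name

# The with block closes the file even if an error occurs
with open(filename) as f:
    # Generator expression: each row is read and stripped only when requested
    stripped_rows = (row.rstrip("\n") for row in f)
    # The generator must be consumed while the file is still open
    longest = max(len(row) for row in stripped_rows)

os.remove(filename)
print(longest)
```

Only one row is held in memory at a time, so the same pattern should scale to files far larger than this toy one.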
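To illustrate how the generator expression works, let’s consider a simplified example. In the code below, we create a list and a generator of 100 million numbers, each being a square. Clearly, the generator uses much less memory than the list when we check their sizes (note that __sizeof__ counts only the container itself, not the integer objects it references, so the list’s true footprint is larger still).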
>>> # a list of 100 million numbers
>>> numbers_list = [x*x for x in range(100000000)]
>>> numbers_list.__sizeof__()
859724448
>>> # a generator of 100 million numbers
>>> numbers_gen = (x*x for x in range(100000000))
>>> numbers_gen.__sizeof__()
96
If our goal is to calculate the sum of these numbers, both options produce the same result. Importantly, though, after calculating the sum, the generator is exhausted and will not yield any more items, just as we saw with the StopIteration exception earlier. If you do need to use an iterable multiple times, you can either create a list or create a new generator every time you need it, with the latter being the more memory-efficient way.
>>> # calculate the sum
>>> sum(numbers_list)
333333328333333350000000
>>> sum(numbers_gen)
333333328333333350000000
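The one-shot nature of generators is easy to check. The sketch below uses a much smaller range than the example above so that it runs instantly, and the helper make_squares is a hypothetical name introduced only to show the "create a new generator every time" approach.

```python
# A smaller range than in the example above so the sketch runs instantly
numbers_gen = (x * x for x in range(1000))

first_total = sum(numbers_gen)    # consumes every item
second_total = sum(numbers_gen)   # nothing is left, so the sum of no items is 0

# Wrapping the expression in a function gives a fresh generator on each call
def make_squares(n):
    return (x * x for x in range(n))

third_total = sum(make_squares(1000))

print(first_total, second_total, third_total)
```

Note that summing an exhausted generator raises no error; it silently returns 0, which can hide bugs.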
Conclusions
In this article, we studied four important techniques in Python, all of which share an identical component in their syntax. Here’s a quick recap of these techniques and highlights of their use cases.
- List comprehension: [expression for item in iterable], a concise way to create lists
- Dictionary comprehension: {exp_key: exp_value for item in iterable}, a concise way to create dictionaries
- Set comprehension: {expression for item in iterable}, a concise way to create sets (no duplicate items)
- Generator expression: (expression for item in iterable), a concise way to create generators (memory efficient)
About the Author
I write blogs about Python and data processing and analysis. In case you missed some of my previous posts, here are the links to some articles that are relevant to the current one.
30 Simple Tricks to Level Up Your Python Coding