Explanation of how nested list comprehension works?

I have no problem understanding this:

a = [1,2,3,4]
b = [x for x in a]

I thought that was all, but then I found this snippet:

a = [[1,2],[3,4],[5,6]]
b = [x for xs in a for x in xs]

Which makes b = [1,2,3,4,5,6]. The problem is that I'm having trouble understanding the syntax in [x for xs in a for x in xs]. Could anyone explain how it works?


It can be written like this

result = []
for xs in a:
    for x in xs:
        result.append(x)

You can read more about it here

b = [x for xs in a for x in xs] is equivalent to the following nested loop:

b = []
for xs in a:
    for x in xs:
        b.append(x)
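
A quick way to convince yourself that the two forms agree is to run both and compare the results. This is only a sketch; the name b_loop is introduced here for the comparison and does not appear in the original answers.

```python
a = [[1, 2], [3, 4], [5, 6]]

# Comprehension form from the question.
b = [x for xs in a for x in xs]

# Equivalent nested loop; b_loop is a hypothetical name used only for comparison.
b_loop = []
for xs in a:        # outer loop: each sublist
    for x in xs:    # inner loop: each item of that sublist
        b_loop.append(x)

print(b)            # [1, 2, 3, 4, 5, 6]
print(b == b_loop)  # True
```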

Effectively:

...for xs in a...]

is iterating over your main (outer) list and returning each of your sublists in turn.

...for x in xs]

is then iterating over each of these sub lists.

This can be re-written as:

b = []
for xs in a:
    for x in xs:
        b.append(x)
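
To see the two levels of iteration at work, it can help to record which sublist each item came from. The snippet below is a small illustration, assuming the same a as in the question; the name trace is made up for this example.

```python
a = [[1, 2], [3, 4], [5, 6]]

# Pair every item x with the sublist xs it was taken from.
# "trace" is an illustrative name, not part of the original snippet.
trace = [(xs, x) for xs in a for x in xs]

for xs, x in trace:
    print(xs, "->", x)
# [1, 2] -> 1
# [1, 2] -> 2
# [3, 4] -> 3
# ... and so on for the remaining sublists
```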

Ah, the incomprehensible "nested" comprehensions. Loops unroll in the same order as in the comprehension.

[leaf for branch in tree for leaf in branch]

It helps to think of it like this.

for branch in tree:
    for leaf in branch:
        yield leaf
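
Since yield only works inside a function, a runnable version of this idea needs a generator function around the loops. The following is a minimal sketch; iter_leaves and the sample tree are hypothetical and only serve to compare the generator with the comprehension.

```python
def iter_leaves(tree):
    """Yield every leaf, branch by branch (hypothetical helper)."""
    for branch in tree:
        for leaf in branch:
            yield leaf

tree = [["a", "b"], ["c"], ["d", "e"]]  # made-up sample data

from_generator = list(iter_leaves(tree))
from_comprehension = [leaf for branch in tree for leaf in branch]

print(from_generator)                        # ['a', 'b', 'c', 'd', 'e']
print(from_generator == from_comprehension)  # True
```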

PEP 202 asserts that this syntax, with "the last index varying fastest", is "the Right One", notably without explaining why.
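
What "the last index varying fastest" means in practice can be seen with two small ranges. This is an illustrative sketch; the names pairs, i and j are not from the original answer.

```python
# The rightmost for clause is the innermost loop, so j changes on every step
# while i changes only after j has run through all of its values.
pairs = [(i, j) for i in range(2) for j in range(3)]
print(pairs)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```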

If a = [[1,2],[3,4],[5,6]] and we unroll that list comprehension, we get:

      +----------------a------------------+
      | +--xs---+ , +--xs---+ , +--xs---+ | for xs in a
      | | x , x |   | x , x |   | x , x | | for x in xs
a  =  [ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ]
b  =  [ x for xs in a for x in xs ] == [1,2,3,4,5,6] #a list of just the "x"s

This is an example of a nested comprehension. Think of a = [[1,2],[3,4],[5,6]] as a 3 by 2 matrix (matrix= [[1,2],[3,4],[5,6]]).

      +---+---+
row 1 | 1 | 2 |
      +---+---+
row 2 | 3 | 4 |
      +---+---+
row 3 | 5 | 6 |
      +---+---+

The list comprehension you see is another way to get all the elements from this matrix into a list.

I will try to explain this using different variables which will hopefully make more sense.

b = [element for row in matrix for element in row]

The first for loop iterates over the rows of the matrix, i.e. [1,2], [3,4] and [5,6]. The second for loop iterates over the elements of each row, which here is a list of 2 elements.
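
One consequence of this left-to-right order is that the two clauses cannot be swapped: the loop over rows has to come first, because the loop over elements uses its variable. The sketch below assumes Python 3 and that no variable named row already exists in the surrounding scope; the function names and swapped_failed are made up for the example.

```python
matrix = [[1, 2], [3, 4], [5, 6]]

def flatten_correct(m):
    # Outer loop (rows) first, inner loop (elements) second.
    return [element for row in m for element in row]

def flatten_swapped(m):
    # Clauses in the wrong order: "row" is used before any loop defines it.
    return [element for element in row for row in m]

print(flatten_correct(matrix))  # [1, 2, 3, 4, 5, 6]

try:
    flatten_swapped(matrix)
    swapped_failed = False
except NameError as exc:
    swapped_failed = True
    print("NameError:", exc)
```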

I have written a small article on List Comprehension on my website http://programmathics.com/programming/python/python-list-comprehension-tutorial/ which actually covered a very similar scenario to this question. I also give some other examples and explanations of python list comprehension.

Disclaimer: I am the creator of that website.

Yes, you can nest for loops INSIDE of a list comprehension. You can even nest if statements in there.

dice_rolls = []
for roll1 in range(1,7):
    for roll2 in range(1,7):
        for roll3 in range(1,7):
            dice_rolls.append((roll1, roll2, roll3))


# becomes


dice_rolls = [(roll1, roll2, roll3) for roll1 in range(1, 7) for roll2 in range(1, 7)
              for roll3 in range(1, 7)]
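
As a sanity check, the comprehension should produce 6 * 6 * 6 = 216 tuples, with roll3 changing fastest. The sketch below also compares it with itertools.product from the standard library, which is an alternative not mentioned in the original answer.

```python
from itertools import product

# Comprehension from the answer above: the three for clauses nest left to right.
dice_rolls = [(roll1, roll2, roll3)
              for roll1 in range(1, 7)
              for roll2 in range(1, 7)
              for roll3 in range(1, 7)]

print(len(dice_rolls))   # 216, i.e. 6 * 6 * 6
print(dice_rolls[:3])    # [(1, 1, 1), (1, 1, 2), (1, 1, 3)]

# Alternative (not in the original answer): itertools.product yields the same
# tuples in the same order.
print(dice_rolls == list(product(range(1, 7), repeat=3)))  # True
```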

I wrote a short article on medium explaining list comprehensions and some other cool things you can do with python, you should have a look if you're interested : )

Here is how I best remember it: (pseudocode, but has this type of pattern)

[(x,y,z) (loop 1) (loop 2) (loop 3)]

where the right most loop (loop 3) is the inner most loop.

[(x,y,z)    for x in range(3)    for y in range(3)    for z in range(3)]

has the same structure as:

for x in range(3):
    for y in range(3):
        for z in range(3):
            print((x,y,z))

Edit: I wanted to add another pattern:

[(result) (loop 1) (loop 2) (loop 3) (condition)]

Ex:

[(x,y,z)    for x in range(3)    for y in range(3)    for z in range(3)    if x == y == z]

Has this type of structure:

for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                print((x,y,z))
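
The loop above prints, whereas the comprehension builds a list. A version that can be compared directly might look like the following sketch; triples, triples_loop and pairs are names invented for this example.

```python
# Comprehension with a trailing condition.
triples = [(x, y, z)
           for x in range(3)
           for y in range(3)
           for z in range(3)
           if x == y == z]

# Loop version that collects into a list instead of printing.
triples_loop = []
for x in range(3):
    for y in range(3):
        for z in range(3):
            if x == y == z:
                triples_loop.append((x, y, z))

print(triples)                  # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(triples == triples_loop)  # True

# An if may also sit between two for clauses; it then filters the outer loop
# before the inner loop runs.
pairs = [(x, y) for x in range(3) if x != 1 for y in range(2)]
print(pairs)  # [(0, 0), (0, 1), (2, 0), (2, 1)]
```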






You are asking about nested lists.

Let me try to answer this question step by step, covering these topics:

  • For loops
  • list comprehensions
  • both nested for loops and list comprehensions

For Loop

You have this list: lst = [0,1,2,3,4,5,6,7,8] and you want to iterate the list one item at a time and add them to a new list. You do a simple for loop:

lst = [0,1,2,3,4,5,6,7,8]
new_list = []


for lst_item in lst:
    new_list.append(lst_item)

You can do exactly the same thing with a list comprehension (it's more pythonic).

List Comprehension

List comprehensions are a (*sometimes) simpler and more elegant way to create lists.

new_list = [lst_item for lst_item in lst]

You read it this way: for every lst_item in lst, add lst_item to new_list

Nested Lists

What are nested lists? A simple definition: it's a list which contains sublists. You have lists within another list.

*Depending on who you talk with, nested lists are one of those cases where list comprehensions can be more difficult to read than regular for loops.

Let's say you have this nested list: nested_list = [[0,1,2], [3,4,5], [6,7,8]], and you want to transform it into a flattened list like this one: flattened_list = [0,1,2,3,4,5,6,7,8].

If you use the same single for loop as before, you won't get it.

flattened_list = []
for list_item in nested_list:
    flattened_list.append(list_item)

Why? Because each list_item is actually one of the sublists. In the first iteration you get [0,1,2], then [3,4,5] and finally [6,7,8].

You can check it like this:

nested_list[0] == [0, 1, 2]
nested_list[1] == [3, 4, 5]
nested_list[2] == [6, 7, 8]
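
Running the single-loop attempt makes the problem visible: the result is just a copy of the outer list, with the sublists still intact. A small sketch using the same names as above:

```python
nested_list = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# One loop only reaches the first layer, so it appends whole sublists.
flattened_list = []
for list_item in nested_list:
    flattened_list.append(list_item)

print(flattened_list)                 # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(flattened_list == nested_list)  # True: nothing was flattened
```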

You need a way to go into the sublists and add each sublist item to the flattened list.

How? You add an extra layer of iteration. Actually, you add one for each layer of sublists.

In the example above you have two layers.

The for loop solution.

nested_list = [[0,1,2], [3,4,5], [6,7,8]]


flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)

Let's read this code out loud.

for sublist in nested_list: each sublist is [0,1,2], [3,4,5], [6,7,8]. In the first iteration of the first loop we go inside [0,1,2].

for item in sublist: the first item of [0,1,2] is 0, which is appended to flattened_list. Then comes 1 and finally 2.

Up until this point flattened_list is [0,1,2].

We finish the last iteration of the second loop, so we go to the next iteration of the first loop. We go inside [3,4,5].

Then we go to each item of this sublist and append it to flattened_list. And then we go to the next iteration, and so on.

How can you do it with List Comprehensions?

The List Comprehension solution.

flattened_list = [item for sublist in nested_list for item in sublist]

You read it like this: add each item from each sublist from nested_list.

It's more concise, but if you have many layers it could become more difficult to read.

Let's see both together

#for loop
nested_list = [[0,1,2], [3,4,5], [6,7,8]]


flattened_list = []
for sublist in nested_list:
    for item in sublist:
        flattened_list.append(item)


----------------------------------------------------------------------


#list comprehension
flattened_list = [item for sublist in nested_list for item in sublist]

The more layers of nesting you have, the more for x in y clauses you need to add.
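
For instance, a list nested three levels deep needs three for clauses, written outermost first. The sample data and the names cube, block, row and flat below are made up for illustration.

```python
# Three layers of nesting; "cube" is a made-up sample, not from the original.
cube = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
]

# One for clause per layer, outermost first.
flat = [item for block in cube for row in block for item in row]
print(flat)  # [0, 1, 2, 3, 4, 5, 6, 7]
```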


EDIT April 2021.

You can flatten a nested list with Numpy. Technically speaking, in Numpy the term would be 'array'.

For a small list it's overkill, but if you're crunching millions of numbers in a list you may need Numpy.

From Numpy's documentation: arrays have an attribute called flat.

import numpy as np

b = np.array(
    [
        [ 0,  1,  2,  3],
        [10, 11, 12, 13],
        [20, 21, 22, 23],
        [30, 31, 32, 33],
        [40, 41, 42, 43]
    ]
)


for element in b.flat:
    print(element)


0
1
2
...
41
42
43

English grammar:

b = "a list of 'specific items' taken from 'what loop?' "
b = [x for xs in a for x in xs]

x is the specific item
for xs in a for x in xs is the loop

The whole confusion in this syntax arises from the leading variable and from poor naming conventions.

[door for room in house for door in room]

Here 'door' is what causes the confusion.

Imagine there were no 'door' variable at the start:

[for room in house for door in room]

This way it is easier to follow.

It becomes even more confusing with variables like [x, xs, y], so variable naming is also key.

You can also do something with the loop variable like:

doors = [door for room in house for door in str(room)]

which is equivalent to:

doors = []
for room in house:
    for door in str(room):
        doors.append(door)
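
To make this concrete, here is a small runnable sketch. It assumes house is a list of integers, which the original does not specify, so that str(room) has something to convert.

```python
house = [12, 345]  # made-up sample: each "room" is an int

# str(room) turns each int into a string, so the inner loop walks its characters.
doors = [door for room in house for door in str(room)]
print(doors)  # ['1', '2', '3', '4', '5']
```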