When doing data science, you might find yourself wanting to read lists of lists, filter column names, remove vowels from a list, or flatten a matrix. You can easily use a lambda function or a for loop; as you well know, there are multiple ways to go about this. One other way to do this is by using list comprehensions.
This tutorial will go over the following topics in list comprehension:
- You’ll first get a short recap of what Python lists are and how they compare to other Python data structures;
- Next, you’ll dive into Python list comprehensions: you’ll learn more about the mathematics behind Python lists, how you can construct list comprehensions, and how you can rewrite them as for loops or lambda functions.
- When you’ve got the basics down, it’s also time to fine-tune your list comprehensions by adding conditional statements to them: you’ll learn how you can include conditions in list comprehensions and how you can handle multiple if conditions and if-else statements.
- Lastly, you’ll dive into nested list comprehensions to iterate multiple times over lists.
Are you also interested in tackling list comprehensions together with iterators and generators? Check out DataCamp’s Python Data Science Toolbox course!
By now, you will have probably played around with values that had several data types. You have saved each value in a separate variable: each variable represents a single value. However, in data science, you’ll often work with many data points, which will make it hard to keep on storing every value in a separate variable. Instead, you store all of these values in a Python list.
Python Lists
Lists are one of the four built-in data structures in Python. Other data structures that you might know are tuples, dictionaries, and sets. A list in Python is different from, for example, int or bool, in the sense that it’s a compound data type: you can group values in lists. These values don’t need to be of the same type: they can be a combination of boolean, string, integer, and float values.
A list literal is a collection of values surrounded by square brackets, with the elements separated by commas. A list can hold values of various data types, unlike arrays, which typically hold a single type.
For example, if you want to build a list of courses, you could write:
courses = ['statistics', 'python', 'linear algebra']
Note that lists are ordered collections of items or objects. This makes lists in Python “sequence types”, as they behave like a sequence. This means that they can be iterated over using for loops. Other examples of sequences are strings and tuples; sets can be iterated over too, but they are unordered and therefore not sequences.
Lists are similar in spirit to strings: you can use the len() function to get the number of elements and square brackets [ ] to access the data, with the first element indexed at 0.
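As a small sketch of this similarity, the snippet below applies len() and indexing to both a list and a string. The word variable is an illustrative name that does not appear elsewhere in this tutorial; the last lines show one difference, namely that list elements can be reassigned.

```python
# `courses` is the list from the example above; `word` is an illustrative string
courses = ['statistics', 'python', 'linear algebra']
word = 'python'

# len() works the same way on both types
print(len(courses))  # 3
print(len(word))     # 6

# Indexing starts at 0 for both
print(courses[0])    # statistics
print(word[0])       # p

# Unlike strings, lists are mutable: an element can be replaced in place
courses[1] = 'r'
print(courses)       # ['statistics', 'r', 'linear algebra']
```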
Tip: if you’d like to know more, test, or practice your knowledge of Python lists, you can do so by going through the most common questions on Python lists here.
Now, on a practical note: you build up a list with two square brackets (start bracket and end bracket). Inside these brackets, you’ll use commas to separate your values. You can then assign your list to a variable. The values that you put in a Python list can be of any data type, even lists!
Take a look at the following example of a list:
# Assign integer values to `a` and `b`
a = 4
b = 9
# Create a list with the variables `a` and `b`
count_list = [1,2,3,a,5,6,7,8,b,10]
count_list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Note that the values of a and b have been substituted into the list count_list.
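A short sketch of what this means in practice: the list stores the values that a and b had when the list was created, so rebinding the variables afterwards leaves the list untouched.

```python
a = 4
b = 9
count_list = [1, 2, 3, a, 5, 6, 7, 8, b, 10]

# Rebinding `a` after the list was built does not change the list
a = 40
print(a)              # 40
print(count_list[3])  # 4
```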
Creation of List
A list is created by placing all the items inside square brackets [], separated by commas. It can have any number of elements of various data types, like string, integer, float, etc.
list1 = [1,2,3,4,5,6] #with same data type
list1
[1, 2, 3, 4, 5, 6]
list2 = [1, 'Aditya', 2.5] #with mixed data type
list2
[1, 'Aditya', 2.5]
Accessing Elements from a List
Elements from the list can be accessed in various ways:
- Index based: You can use the index operator to access an element from a list. In Python, indexing starts from 0 and ends at n-1, where n is the number of elements in the list. Indexing can be further categorized into positive and negative indexing.
index = [1,2,3,4,5]
index[0], index[4] #positive indexing
(1, 5)
index[5] #this will give an error since there is no element at index 5 in the list
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-39-fd6e7e3edbf2> in <module>
----> 1 index[5] #this will give an error since there is no element at index 5 in the list
IndexError: list index out of range
index[-1], index[-2] #negative indexing, here -1 selects the last element and -2 the second-to-last
(5, 4)
- Slice based: This is helpful when you want to access a sequence/range of elements from the list. The colon (:) is used as a slice operator to access the elements.
index[:3], index[2:]
([1, 2, 3], [3, 4, 5])
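A few more slicing variations, sketched on the same index list. The general form is list[start:stop:step], where stop is exclusive and any of the three parts can be left out.

```python
index = [1, 2, 3, 4, 5]

print(index[1:4])   # [2, 3, 4]        start at position 1, stop before position 4
print(index[-2:])   # [4, 5]           the last two elements
print(index[::2])   # [1, 3, 5]        every second element
print(index[::-1])  # [5, 4, 3, 2, 1]  a reversed copy
print(index[3:10])  # [4, 5]           out-of-range bounds are clipped, no IndexError
```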
List Methods
Some of the most commonly used list methods are append(), extend(), insert(), remove(), pop(), index(), count(), sort(), and reverse().
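As a rough illustration, the sketch below runs a handful of built-in list methods on a small list. The data and the variable names last and position are made up for this example.

```python
courses = ['statistics', 'python']

courses.append('linear algebra')    # add one item at the end
courses.extend(['sql', 'r'])        # add every item of another iterable
courses.insert(0, 'calculus')       # insert an item at a given position
courses.remove('sql')               # remove the first item equal to 'sql'
last = courses.pop()                # remove and return the last item
position = courses.index('python')  # position of the first match
courses.sort()                      # sort the list in place

print(last)      # r
print(position)  # 2
print(courses)   # ['calculus', 'linear algebra', 'python', 'statistics']
```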
Python List Comprehension
With the recap of the Python lists fresh in mind, you can easily see that defining and creating lists in Python can be a tiresome job. Typing in all of the values separately can take quite some time, and you can easily make mistakes.
A list comprehension in Python is also surrounded by brackets, but instead of a list of data inside it, you enter an expression followed by a for clause and, optionally, if conditions.
The most basic form of a list comprehension in Python is constructed as follows:
list_variable = [expression for item in collection]
The expression generates the elements of the list; it is followed by a for loop over some collection of data, which evaluates the expression for every item in the collection.
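To make the three parts concrete, here is a minimal sketch that reuses the courses list from the recap. The names upper_courses and lengths are illustrative.

```python
courses = ['statistics', 'python', 'linear algebra']

# expression: course.upper() | item: course | collection: courses
upper_courses = [course.upper() for course in courses]
print(upper_courses)  # ['STATISTICS', 'PYTHON', 'LINEAR ALGEBRA']

# The expression can be any valid Python expression that uses the item
lengths = [len(course) for course in courses]
print(lengths)        # [10, 6, 14]
```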
But how do you get to this formula-like way of building and using these constructs in Python? Let’s dig a little bit deeper.
The Mathematics
Luckily, Python has a solution for you: list comprehensions offer a way to implement a mathematical notation for building lists.
Remember that in maths, the common ways to describe lists (or sets, or tuples, or vectors) are:
S = {x² : x in {0 … 9}}
V = (1, 2, 4, 8, …, 2¹²)
M = {x | x in S and x even}
In other words, you’ll find that the above definitions tell you the following:
- S is a sequence that contains the values from 0 to 9 inclusive, each raised to the power of two.
- The sequence V, on the other hand, contains the value 2 raised to a certain power x. The power x starts from 0 and goes up to 12.
- Lastly, the sequence M contains only the even elements from the sequence S.
If the above definitions look a little cryptic to you, then take a look at the actual lists that these definitions would produce:
S = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81}
V = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096}
M = {0, 4, 16, 36, 64}
You see the result of each list and the operations that were described in them!
Now that you’ve understood some of the maths behind lists, you can translate or implement the mathematical notation of constructing lists in Python using list comprehensions! Take a look at the following lines of code:
S = [x**2 for x in range(10)]
V = [2**i for i in range(13)]
S
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
V
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]
If you only want to evaluate the expression for a specific subset of the collection, you can add an if clause after the for loop. The expression will be added to the list only if the if clause is True. You can even have more than one if clause, and the expression will be in the list only if all the if clauses evaluate to True.
Similarly, to select only the even numbers from the collection S, you add an if clause that checks whether the given element is even or not. If it is even, that element is added to the list.
M = [x for x in S if x % 2 == 0]
M
[0, 4, 16, 36, 64]
This all looks very similar to the mathematical definitions that you just saw, right?
No worries if you’re a bit lost at this point; even if you’re not a math genius, these list comprehensions are quite easy if you take your time to study them. Take a second, closer look at the Python code that you see in the code chunks above.
You’ll see that the code tells you that:
- The list S is built up with the square brackets that you read about in the first section. In those brackets, you see that there is an element x, which is raised to the power of 2. Now, you just need to know which values of x you need to raise to the power of 2. This is determined by range(10). Considering all of this, you can derive that you’ll raise all numbers, going from 0 to 9, to the power of 2.
- The list V contains the base value 2, which is raised to a certain power. Just like before, you now need to know which power, or i, is going to be used. You see that i, in this case, is part of range(13), which means that you start from 0 and go up to 12. All of this means that your list is going to have 13 values: 2 raised to the power 0, 1, 2, … up to 12.
- Lastly, the list M contains elements that are part of S if, and only if, they can be divided by 2 without a remainder. The modulo needs to be 0. In other words, the list M is built up from the even values that are stored in list S.
Now that you see this all written out, it makes a lot more sense, right?
Recap And Practice
In short, you see that there are a couple of elements coming back in all these lines of code:
- The square brackets, which are a signature of Python lists;
- The for keyword, followed by a variable that symbolizes a list item; and
- The in keyword, followed by a sequence (which can be a list!).
And this results in the piece of code which you saw at the beginning of this section:
list_variable = [x for x in iterable]
Now it’s your turn to go ahead and get started with list comprehensions in Python! Let’s stick close to the mathematical lists that you have seen before:
Q = {x³ : x in {0 … 10}}
Try to code the above definition in your Python shell; the output should look like:
[0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
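One possible solution to this exercise is sketched below. Note that range(11) is needed so that the value 10 is included.

```python
# Cube every integer from 0 up to and including 10
Q = [x**3 for x in range(11)]
print(Q)  # [0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
```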
List Comprehension as an Alternative
List comprehensions can often replace for loops and lambda functions, as well as the functions map(), filter(), and reduce(). What’s more, for some people, list comprehensions can even be easier to understand and use in practice! You’ll read more about this in the next section!
However, if you’d like to know more about functions and lambda functions in Python, check out our Python Functions Tutorial.
For Loops
As you might already know, you use for loops to repeat a block of code a fixed number of times. List comprehensions are good alternatives to for loops, as they are more compact. Consider the following example that starts with the variable numbers, defined as a range from 0 up to and including 9.
Remember that the number that you pass to the range() function is the number of integers that you want to generate, starting from zero, of course. This means that range(10) will produce the integers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
# Initialize `numbers`
numbers = range(10)
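A quick check of what numbers actually holds, assuming Python 3: range(10) is a lazy range object rather than a list, so you need list() if you want to see the values. The name as_list is illustrative.

```python
numbers = range(10)

print(numbers)       # range(0, 10)
print(len(numbers))  # 10

# Convert to a list to see the generated integers
as_list = list(numbers)
print(as_list)       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A range object can be iterated over as often as you like, which is why the same numbers variable can feed several loops and comprehensions.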
If you now want to operate on every element in numbers, you can do this with a for loop, just like the one below:
# Initialize `new_list`
new_list = []
# Add values to `new_list`
for n in numbers:
    if n%2==0: #check if the element is even
        new_list.append(n**2) #raise that element to the power of 2 and append to the list
# Print `new_list`
print(new_list)
[0, 4, 16, 36, 64]
This is all nice and well, but now consider the following example of a list comprehension, where you do the same with a more compact notation:
# Create `new_list`
new_list = [n**2 for n in numbers if n%2==0] #expression followed by for loop followed by the conditional clause
# Print `new_list`
print(new_list)
[0, 4, 16, 36, 64]
Tip: Check out DataCamp’s Loops in Python tutorial for more information on loops in Python.
Lambda Functions with map(), filter() and reduce()
Lambda functions are also called “anonymous functions” or “functions without a name”. That means that you typically use these functions only at the place where they are created. Lambda functions borrow their name from the lambda keyword in Python, which is used to declare these functions instead of the standard def keyword.
You usually use these functions together with the map(), filter(), and reduce() functions.
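As a minimal sketch of the difference between def and lambda, the two functions below do the same thing. The names square and square_lambda are illustrative.

```python
# A regular, named function
def square(x):
    return x ** 2

# The same logic as a lambda expression, bound to a name here only for comparison
square_lambda = lambda x: x ** 2

print(square(4), square_lambda(4))  # 16 16

# More typically, a lambda is written inline where it is needed
squares = list(map(lambda x: x ** 2, [1, 2, 3]))
print(squares)  # [1, 4, 9]
```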
How to Replace map() in Combination with Lambda Functions
Consider the combination of map() and a lambda function in the example below:
# Initialize the `kilometer` list
kilometer = [39.2, 36.5, 37.3, 37.8]
# Construct `feet` with `map()`
feet = map(lambda x: float(3280.8399)*x, kilometer)
# Print `feet` as a list
print(list(feet))
[128608.92408000001, 119750.65635, 122375.32826999998, 124015.74822]
Now, you can easily replace this combination of functions that define the feet variable with a list comprehension, taking into account the components that you have read about in the previous section:
- Start with the square brackets.
- Then add the body of the lambda function, or the expression you want to calculate, in those square brackets: float(3280.8399)*x.
- Next, add the for keyword and make sure to repeat the sequence element, or the item x, that you referenced in the expression part.
- Finally, add the in keyword followed by the collection from which you are going to fetch x. In this case, you will transform the elements of the kilometer list.
If you do all of this, you’ll get the following result, which should match the output of the map() and lambda version:
# Convert `kilometer` to `feet`
feet = [float(3280.8399)*x for x in kilometer]
# Print `feet`
print(feet)
[128608.92408000001, 119750.65635, 122375.32826999998, 124015.74822]
filter() and Lambda Functions to List Comprehensions
Now that you have seen how easily you can convert the map() function in combination with a lambda function, you can also tackle code that contains the Python filter() function with lambda functions and rewrite that as well.
In the following example, you will filter out even numbers and only keep the odd numbers which are not divisible by 2:
# Map the values of `feet` to integers
feet = list(map(int, feet))
# Filter `feet` to only include uneven distances
uneven = filter(lambda x: x%2, feet)
# Check the type of `uneven`
type(uneven)
# Print `uneven` as a list
print(list(uneven))
[122375, 124015]
To rewrite the lines of code in the above example, you can use two list comprehensions:
- One to convert the values of feet to integers;
- A second one to filter out even values from the feet list.
First, you rewrite the map() function, which you use to convert the elements of the feet list to integers. Then, you tackle the filter() function: you take the body of the lambda function and use the for and in keywords to logically connect x and feet:
# Constructing `feet`
feet = [int(x) for x in feet]
# Print `feet`
print(feet)
# Get all uneven distances
uneven = [x for x in feet if x%2 != 0]
# Print `uneven`
print(uneven)
[128608, 119750, 122375, 124015]
[122375, 124015]
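The two steps can also be folded into a single list comprehension that converts and filters in one pass. This is a sketch rather than a recommendation: it calls int() twice per element, so the two-step version above may be easier to read.

```python
kilometer = [39.2, 36.5, 37.3, 37.8]
feet = [float(3280.8399) * x for x in kilometer]

# Convert to int and keep only the uneven values in one comprehension
uneven = [int(x) for x in feet if int(x) % 2 != 0]
print(uneven)  # [122375, 124015]
```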
reduce() and Lambda Functions in Python
Lastly, you can also rewrite lambda functions that are used with the reduce() function to more compact lines of code. Take a look at the following example:
# Import `reduce` from `functools`
from functools import reduce
# Reduce `feet` to `reduced_feet`
reduced_feet = reduce(lambda x,y: x+y, feet)
# Print `reduced_feet`
print(reduced_feet)
494748
Note that in Python 3, the reduce() function has been moved to the functools module. You’ll, therefore, need to import it from there to use it, just like in the code example above.
The chunk of code above is quite lengthy, isn’t it?
Let’s rewrite this piece of code!
Be careful! You need to take into account that you can’t use y. A list comprehension works on one element at a time, such as the x that you have seen throughout the many examples of this tutorial; it does not carry an accumulated value from one element to the next.
How are you going to solve this?
Well, in cases like these, aggregating functions such as sum() might come in handy. The sum() function just adds up all the elements of the list, and the result is assigned to reduced_feet:
# Construct `reduced_feet`
reduced_feet = sum([x for x in feet])
# Print `reduced_feet`
print(reduced_feet)
494748
Another way of aggregating the elements is by using the sum() function directly on the list feet. Hence, you don’t need a list comprehension in this case, as shown below:
sum(feet)
494748
Note that when you think about it, the use of aggregating functions when rewriting the reduce() function in combination with a lambda function makes sense. It’s very similar to what you do in SQL when you use aggregating functions to limit the number of records that you get back after running your query. In this case, you use the sum() function to aggregate the elements in feet to only get back one definitive value!
Note that even though this approach might not be as performant in SQL, this is the way to go when you’re working in Python!
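The same idea extends to other reductions. The sketch below compares reduce() with built-in aggregators for a sum and a product; math.prod() is assumed to be available, which requires Python 3.8 or later, and the small lists are illustrative data.

```python
from functools import reduce
import math

feet = [128608, 119750, 122375, 124015]

# Summation: reduce() versus the built-in sum()
total_reduce = reduce(lambda x, y: x + y, feet)
total_sum = sum(feet)
print(total_reduce, total_sum)  # 494748 494748

# sum() also accepts a generator expression, so no intermediate list is built
total_squares = sum(x ** 2 for x in [1, 2, 3])
print(total_squares)            # 14

# Product: reduce() versus math.prod() (Python 3.8+)
product_reduce = reduce(lambda x, y: x * y, [1, 2, 3, 4])
product_builtin = math.prod([1, 2, 3, 4])
print(product_reduce, product_builtin)  # 24 24
```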
List Comprehensions with Conditionals
Now that you have understood the basics of list comprehensions in Python, it’s time to adjust your comprehension control flow with the help of conditionals.
Though you have already seen a few examples where you made use of conditional statements like the if clause, you will now delve more deeply into them.
Let’s write a list comprehension that makes use of the if statement. To construct it:
- First, you define the expression, which divides each element by 2;
- Then, you write a for loop that iterates over the collection feet;
- Finally, you write the if statement that checks whether the number is even or odd.
# Define `uneven`
uneven = [x/2 for x in feet if x%2==0]
# Print `uneven`
print(uneven)
[64304.0, 59875.0]
Note that you can easily rewrite the above code chunk with a Python for loop; however, the code will be more verbose!
# Initialize an empty list `uneven`
uneven = []
# Add values to `uneven`
for x in feet:
    if x % 2 == 0:
        x = x / 2
        uneven.append(x)
# Print `uneven`
print(uneven)
[64304.0, 59875.0]
Multiple If Conditions
Now that you have understood how you can add conditions, it’s time to convert the following for loop to a list comprehension with conditionals.
divided = []
for x in range(100):
    if x%2 == 0 :
        if x%6 == 0:
            divided.append(x)
Be careful: the for loop above contains two conditions! To translate this, all you need to do is add two if conditions, one after the other. Only when both conditions are satisfied will the expression be added to the list. In the following example, x will be added only if it is a multiple of 6 (every multiple of 6 is also even).
divided = [x for x in range(100) if x % 2 == 0 if x % 6 == 0]
print(divided)
[0, 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96]
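Two consecutive if clauses behave like a single condition joined with and. The sketch below checks that both spellings give the same list; the variable names are illustrative.

```python
# Two if clauses, one after the other
divided_two_ifs = [x for x in range(100) if x % 2 == 0 if x % 6 == 0]

# One if clause that combines both conditions with `and`
divided_and = [x for x in range(100) if x % 2 == 0 and x % 6 == 0]

# Every multiple of 6 is even, so the first condition is redundant here
divided_six = [x for x in range(100) if x % 6 == 0]

print(divided_two_ifs == divided_and == divided_six)  # True
```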
If-Else Conditions
Of course, it’s much more common to work with conditionals that involve more than one branch. That’s right, you’ll more often see if in combination with elif and else. Now, how do you deal with that if you plan to rewrite your code?
Take a look at the following example of a more complex conditional list comprehension:
[x+1 if x >= 120000 else x+5 for x in feet]
[128609, 119755, 122376, 124016]
In the above line of code, you have two expressions:
- The first expression, x+1, is used when the if condition is True;
- The second expression, x+5, is used otherwise, in the else branch.
Everything else in the above example is pretty self-explanatory.
Now look at the following code chunk, which is a rewrite of the above piece of code:
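# Collect the results in a new list (`result` is an illustrative name)
result = []
for x in feet:
    if x >= 120000:
        result.append(x + 1)
    else:
        result.append(x + 5)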
You see that this is the same logic, but restructured: the for x in feet that came last in the list comprehension now initializes the for loop. After that, you add the condition if x >= 120000 and the expression that you want to evaluate if this condition is True: x + 1. If the condition is False instead, the else expression from your list comprehension is evaluated: x+5.
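A list comprehension has no elif keyword, but an if / elif / else chain can be expressed by chaining conditional expressions. The sketch below is one way to do it; the thresholds and the labels 'long', 'medium', and 'short' are made up for illustration.

```python
feet = [128608, 119750, 122375, 124015]

# for loop with if / elif / else
labels_loop = []
for x in feet:
    if x >= 125000:
        labels_loop.append('long')
    elif x >= 120000:
        labels_loop.append('medium')
    else:
        labels_loop.append('short')

# The same logic as a list comprehension with a chained conditional expression
labels = ['long' if x >= 125000 else 'medium' if x >= 120000 else 'short'
          for x in feet]

print(labels)  # ['long', 'short', 'medium', 'medium']
```

Chains longer than this quickly become hard to read, in which case the for loop is usually the better choice.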
Nested List Comprehensions
Apart from conditionals, you can also adjust your list comprehensions by nesting them within other list comprehensions. This is handy when you want to work with lists of lists: generating lists of lists, transposing lists of lists, or flattening lists of lists to regular lists, for example, becomes extremely easy with nested list comprehensions.
Take a look at the following example:
list_of_list = [[1,2,3],[4,5,6],[7,8]]
# Flatten `list_of_list`
[y for x in list_of_list for y in x]
[1, 2, 3, 4, 5, 6, 7, 8]
You assign a rather simple list of lists to a variable list_of_list. In the next line, you execute a list comprehension that returns a normal list. What actually happens is that you take the list elements (y) of the nested lists (x) in list_of_list and return a list of those list elements y that are contained in x.
You see that most of the keywords and elements used in the example of the nested list comprehension are similar to those you used in the simple list comprehension examples:
- Square brackets;
- Two for keywords, followed by a variable that symbolizes an item of the list of lists (x) and a list item of a nested list (y); and
- Two in keywords, followed by a list of lists (list_of_list) and a list item (x).
Most of the components are just used twice, and you go one level higher (or deeper, depending on how you look at it!).
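Written out as a nested for loop, the flattening example looks as follows. Note that the for clauses of the comprehension appear in the same order as the nested loops, from outermost to innermost. The names flat and flat_comprehension are illustrative.

```python
list_of_list = [[1, 2, 3], [4, 5, 6], [7, 8]]

# Nested for loop: outer loop over the sublists, inner loop over their items
flat = []
for x in list_of_list:
    for y in x:
        flat.append(y)

# Equivalent list comprehension, with the for clauses in the same order
flat_comprehension = [y for x in list_of_list for y in x]

print(flat)                        # [1, 2, 3, 4, 5, 6, 7, 8]
print(flat == flat_comprehension)  # True
```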
It takes some time to get used to, but it’s rather simple, huh?
Let’s now consider another example, where you see that you can also use two pairs of square brackets to change the logic of your nested list comprehension:
matrix = [[1,2,3],[4,5,6],[7,8,9]]
[[row[i] for row in matrix] for i in range(3)]
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Now practice: rewrite the code chunk above as a nested for loop. If you need some pointers on how to tackle this exercise, go back to one of the previous sections of this tutorial. One possible solution looks like this:
transposed = []
for i in range(3):
    transposed_row = []
    for row in matrix:
        transposed_row.append(row[i])
    transposed.append(transposed_row)
transposed
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
You can also use nested list comprehensions when you need to create a list of lists that is a matrix. Check out the following example:
matrix = [[0 for col in range(4)] for row in range(3)]
matrix
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
Tip: practice your loop skills in Python and rewrite the above code chunk to a nested for loop!
You can find the solution below.
# Start from an empty matrix
matrix = []
for x in range(3):
    nested = []
    matrix.append(nested)
    for row in range(4):
        nested.append(0)
If you want to get some extra work done, work on translating this for loop to a while loop. You can find the solution below:
x = 0
matrix = []
while x < 3:
    nested = []
    y = 0
    matrix.append(nested)
    x = x+1
    while y < 4:
        nested.append(0)
        y = y+1
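One reason to build such a matrix with a nested list comprehension rather than with list multiplication is that the comprehension creates a separate inner list for every row. The sketch below shows the difference; the names shared and independent are illustrative.

```python
# List multiplication repeats a reference to the SAME inner list three times
shared = [[0] * 4] * 3
shared[0][0] = 1
print(shared)       # [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]

# The nested comprehension builds a new inner list on every iteration
independent = [[0 for col in range(4)] for row in range(3)]
independent[0][0] = 1
print(independent)  # [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```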
Lastly, it’s good to know that you can also use functions such as int() inside nested list comprehensions, for example to convert the entries in your feet list to integers. By encapsulating [int(x) for x in feet] within another list comprehension, you construct a matrix, or list of lists, pretty easily:
[[int(x) for x in feet] for _ in feet]
[[128608, 119750, 122375, 124015],
[128608, 119750, 122375, 124015],
[128608, 119750, 122375, 124015],
[128608, 119750, 122375, 124015]]
Conclusion
Congratulations, you have made it to the end of this tutorial!
In this tutorial, you tackled list comprehensions, a mechanism that’s frequently used in Python for data science. Now that you understand the workings of this mechanism, you’re ready to also tackle dictionary and set comprehensions!
Don’t forget that you can practice your Python skills on a daily basis with DataCamp’s daily practice mode! You can find it right on your dashboard. If you don’t know the daily practice mode yet, read up here!
Though list comprehensions can make your code more succinct, it is important to ensure that your final code is as readable as possible, so very long single lines of code should be avoided to keep your code user friendly.
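As a small sketch of this advice, a longer comprehension can be split over several lines inside its brackets without changing what it does. The expression and threshold below are made up for illustration.

```python
feet = [128608, 119750, 122375, 124015]

# Everything on one line: correct, but harder to scan
result = [x / 2 if x % 2 == 0 else x * 2 for x in feet if x >= 120000]

# The same comprehension, one clause per line
result_multiline = [
    x / 2 if x % 2 == 0 else x * 2  # expression
    for x in feet                   # loop
    if x >= 120000                  # filter
]

print(result)                      # [64304.0, 244750, 248030]
print(result == result_multiline)  # True
```

If a comprehension still feels crowded after splitting it up, a regular for loop is often the more readable option.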
This article has been published from the source link without modifications to the text. Only the headline has been changed.