4 Lists and Tuples
4.1 List
One more topic you’ll need to understand before you can begin writing programs in earnest is the list data type and its cousin, the tuple. Lists and tuples can contain multiple values, which makes it easier to write programs that handle large amounts of data. These data types are called containers, meaning they are objects that “contain” other objects. Each has its own distinguishing properties and its own set of methods. List and tuple are sequence data types, which means they represent ordered collections of items. They share this characteristic with strings and with the range object returned by the range() function. Many of the capabilities shown in this chapter apply to all sequence types.
See https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range for more details.
In a string, the values are characters; in a list, they can be any type. The values in a list are called elements or sometimes items. Items are separated with commas.
There are several ways to create a new list; the simplest is to enclose the elements in square brackets (“[” and “]”). A list that contains no elements is called an empty list; you can create one with empty brackets.
type([10, 20, 30, 40]), type(['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra'])
(list, list)
The first example is a list of four integers and the second is a list of four strings.
4.1.1 Getting Individual Values in a List with Indexes
You can reference a list item by writing the list’s name followed by the element’s index (that is, its position number) enclosed in square brackets ([], known as the subscription operator or bracket operator). Remember that indices start at 0:
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
print(subjects[0])
print(subjects[3])
calculus
linear algebra
Note that the first index is 0 and the last index is one less than the size of the list; a list of four items has 3 as its last index.
Python raises an IndexError if you use an index that is outside the valid range of the list, for example subjects[4] for a list of four items.
The elements of a list don’t have to be the same type. The following list contains a string, a float, an integer, and another list:
The values in these lists of lists can be accessed using multiple indexes:
The first index dictates which item in the outer list to use, and the second indicates the value within the inner list. If you use only one index, like spam[3], the program will print the entire list value at that index.
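A minimal sketch of such a nested list and of access with one or two indexes. The name spam comes from the text; the values stored in it are assumptions chosen for illustration.

```python
# Hypothetical mixed-type list: a string, a float, an integer, and another list.
spam = ['spam', 2.0, 5, [10, 20]]

print(spam[3])     # one index returns the whole inner list: [10, 20]
print(spam[3][0])  # the second index selects inside the inner list: 10
print(spam[3][1])  # 20
```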
4.1.1.1 Negative Indexes
While indexes start at 0 and go up, you can also use negative integers for the index. The integer value -1 refers to the last index in a list, the value -2 refers to the second-to-last index in a list, and so on.
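As a small illustration of negative indexes, reusing the subjects list from earlier in this section:

```python
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']

print(subjects[-1])  # last item: linear algebra
print(subjects[-2])  # second-to-last item: computer programming
print(subjects[-4])  # same item as subjects[0]: calculus
```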
4.1.1.2 Getting a List’s Length with the len() Function
The len() function returns the number of values in a list, just as it counts the number of characters in a string.
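A short sketch of len() applied to a list, a string, and an empty list, reusing the subjects list from earlier:

```python
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']

print(len(subjects))    # 4 items in the list
print(len('calculus'))  # 8 characters in the string
print(len([]))          # an empty list has length 0
```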
4.1.1.3 Getting a Sublist from Another List with Slices
Just as an index can get a single value from a list, a slice can get several values from a list, in the form of a new list. A slice is typed between square brackets, like an index, but has two integers separated by a colon.
- subjects[2] is a list with an index.
- subjects[1:3] is a list with a slice.
In a slice, the first integer is the index where the slice starts. The second integer is the index where the slice ends. A slice goes up to, but will not include, the value at the second index. A slice evaluates to a new list.
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
print(subjects[0:3])
print(subjects[1:-1])
['calculus', 'introduction to mathematics', 'computer programming']
['introduction to mathematics', 'computer programming']
As a shortcut, you can leave out one or both of the indexes on either side of the colon in the slice. Leaving out the first index is the same as using 0, or the beginning of the list. Leaving out the second index is the same as using the length of the list, which will slice to the end of the list.
print(subjects[:3]) # same as subjects[0:3]
print(subjects[1:]) # same as subjects[1:len(subjects)]
print(subjects[:]) # same as subjects[0:len(subjects)]
['calculus', 'introduction to mathematics', 'computer programming']
['introduction to mathematics', 'computer programming', 'linear algebra']
['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
Just like range(), slicing has an optional third index that can be used to specify the step.
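The following sketch shows the step on the subjects list used earlier. The particular slices are chosen only for illustration.

```python
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']

print(subjects[::2])   # every second item: ['calculus', 'computer programming']
print(subjects[1::2])  # ['introduction to mathematics', 'linear algebra']
print(subjects[::-1])  # a negative step walks backwards and gives a reversed copy
```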
4.1.1.4 Changing Values in a List with Indexes
Unlike strings, lists are mutable: you can reassign an item in a list. When the bracket operator appears on the left side of an assignment, it identifies the element of the list that will be assigned.
The first element of numbers, which used to be 123, is now 5.
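The assignment described above can be sketched as follows. The name numbers and the values 123 and 5 come from the text; the second element is an assumption added so the list has more than one item.

```python
numbers = [123, 42]  # 42 is a hypothetical second element
numbers[0] = 5       # the bracket operator on the left selects the element to assign

print(numbers)  # [5, 42]
```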
All in all, you can think of a list as a relationship between indices and elements. This relationship is called a mapping; each index “maps to” one of the elements.
4.1.2 List Concatenation and List Replication
Lists can be concatenated and replicated just like strings. The + operator combines two lists to create a new list, and the * operator can be used with a list and an integer value to replicate the list.
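A brief sketch of both operators. The variable names and values are assumptions for illustration.

```python
numbers = [1, 2, 3]
letters = ['A', 'B', 'C']

combined = numbers + letters  # concatenation builds a new list
print(combined)               # [1, 2, 3, 'A', 'B', 'C']

repeated = ['X', 'Y'] * 3     # replication repeats the elements
print(repeated)               # ['X', 'Y', 'X', 'Y', 'X', 'Y']

print(numbers)                # the operands are unchanged: [1, 2, 3]
```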
4.1.2.1 Removing Values from Lists with del Statements
The del statement deletes the value at an index in a list. All values in the list after the deleted value move up one position, so their indexes decrease by one.
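A minimal sketch of del on a list. The list contents are assumptions for illustration.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

del spam[2]     # removes 'rat'
print(spam)     # ['cat', 'bat', 'elephant']
print(spam[2])  # 'elephant' has moved from index 3 to index 2
```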
4.1.3 List traversal
In Chapter 2, you learned about using for loops to execute a block of code a certain number of times. Technically, a for loop repeats the code block once for each item in a sequence.
For example, a loop that begins with for i in range(4): runs its block four times, because the return value from range(4) is a sequence that Python considers similar to [0, 1, 2, 3]. A for loop can traverse a list in the same way, visiting each element in turn:
for subject in subjects: # subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
    print(subject)
calculus
introduction to mathematics
computer programming
linear algebra
This works well if you only need to read the elements of the list. But if you want to write or update the elements, you need the indices. A common Python technique is to combine the functions range() and len(), using range(len(someList)) with a for loop to iterate over the indexes of a list:
for i in range(len(numbers)): # numbers = [17, 5, 42, 7]
    print(i, numbers[i])
    numbers[i] = numbers[i]**2
print(numbers)
0 17
1 5
2 42
3 7
[289, 25, 1764, 49]
This loop traverses the list, prints each index together with its element, and then replaces the element with its square. len() returns the number of elements in the list. range() returns a sequence of indices from 0 to n − 1, where n is the length of the list. Each time through the loop, i gets the index of the next element. This is handy because the loop iterates through all the indexes, no matter how many items the list contains.
4.1.3.1 The in and not in Operators
You can determine whether an object is or isn’t in a list with the in and not in operators. These expressions evaluate to a Boolean value.
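A short sketch of both operators. The list contents are assumptions for illustration.

```python
spam = ['hello', 'hi', 'howdy', 'heyas']

print('howdy' in spam)    # True
print('cat' in spam)      # False
print('cat' not in spam)  # True
```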
4.1.3.2 Using the enumerate() Function with Lists
Instead of using the range(len(someList)) technique with a for loop to obtain the integer index of the items in the list, you can call the enumerate() function. On each iteration of the loop, enumerate() returns two values: the index of the item and the item itself.
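A minimal sketch of enumerate() on the subjects list used earlier. The loop variable names are assumptions.

```python
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']

# Each iteration yields an (index, item) pair, unpacked into two variables.
for index, subject in enumerate(subjects):
    print(index, subject)
```

The loop prints the index and the subject on each line, starting from index 0.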
4.1.3.3 Looping over Multiple Lists with zip()
The built-in function zip() enables you to iterate over multiple sequences of data at the same time. The function receives any number of sequences as arguments and returns an iterator that produces tuples containing the elements at the same index in each.
names = ['Bob', 'Sue', 'Amanda']
grade_point_averages = [3.5, 4.0, 3.75]
for name, gpa in zip(names, grade_point_averages):
    print('Name=', name, 'GPA=', gpa)
Name= Bob GPA= 3.5
Name= Sue GPA= 4.0
Name= Amanda GPA= 3.75
The above snippet calls zip() to produce the tuples ('Bob', 3.5), ('Sue', 4.0), and ('Amanda', 3.75), consisting of the elements at index 0, 1, and 2 of each list, respectively. Note that we unpack each tuple into name and gpa (unpacking is explained later in this chapter) and display them.
4.1.4 Methods of the list
A method, introduced in Chapter 1, is the same as a function, except it is “called on” an object. For example, if a list object were stored in spam, you would call the index() list method on that list like so: spam.index('hello'). The method part comes after the object, separated by a period.
Each data type has its own set of methods. The list data type, for example, has several useful methods for finding, adding, removing, and otherwise manipulating values in a list.
4.1.4.1 Adding elements to Lists with the append() and insert() Methods
append() adds a new element to the end of a list:
An append() method call adds its argument to the end of the list. The insert() method can insert an element at any index in the list. The first argument to insert() is the index for the new value, and the second argument is the new value to be inserted.
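A minimal sketch of both methods. The calls t.append('d') and t.insert(1, 'e') are the ones discussed in the text; the starting contents of t are an assumption.

```python
t = ['a', 'b', 'c']  # hypothetical starting list

t.append('d')        # adds 'd' at the end
print(t)             # ['a', 'b', 'c', 'd']

t.insert(1, 'e')     # inserts 'e' at index 1; later items shift right
print(t)             # ['a', 'e', 'b', 'c', 'd']
```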
Notice that the code is t.append('d') and t.insert(1, 'e'), not t = t.append('d') and t = t.insert(1, 'e'). In fact, the return value of append() and insert() is None, so you definitely wouldn’t want to store it as the new variable value. Rather, the list is modified in place.
Methods belong to a single data type. The append() and insert() methods are list methods and can be called only on list objects, not on other objects such as strings or integers.
4.1.4.2 Adding all the elements of a List to the end of a List with the extend() Method
Use the list method extend() to add all the elements of another sequence to the end of a list:
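A short sketch of extend(). The names t1 and t2 and their contents are assumptions for illustration.

```python
t1 = ['a', 'b', 'c']
t2 = ['d', 'e']

t1.extend(t2)    # appends every element of t2 to t1, in place
print(t1)        # ['a', 'b', 'c', 'd', 'e']
print(t2)        # t2 is unchanged: ['d', 'e']

t1.extend('fg')  # any sequence works; a string contributes its characters
print(t1)        # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```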
4.1.4.3 Removing elements from Lists with the remove() Method
The remove() method is passed the value to be removed from the list it is called on:
The del statement is good to use when you know the index of the element you want to remove from the list. The remove() method is useful when you know the value of the element you want to remove from the list.
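A minimal sketch of remove(). The list contents are assumptions for illustration; note that only the first matching value is removed, and a missing value raises a ValueError.

```python
spam = ['cat', 'bat', 'rat', 'cat']

spam.remove('cat')  # removes the first 'cat' only
print(spam)         # ['bat', 'rat', 'cat']

try:
    spam.remove('dog')  # 'dog' is not in the list
except ValueError as error:
    print('ValueError:', error)
```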
4.1.4.4 Sorting the elements in a List with the sort() Method
Lists of numbers or lists of strings can be sorted with the sort() method:
spam = [2, 5, 3.14, 1, -7]
spam.sort() # The default behavior is sorting in ascending order
print(spam)
spam = ['ants', 'cats', 'dogs', 'badgers', 'Elephants']
spam.sort()
print(spam)
[-7, 1, 2, 3.14, 5]
['Elephants', 'ants', 'badgers', 'cats', 'dogs']
Note that sort() uses “ASCII order” (more precisely, the order of the characters’ Unicode code points) rather than true alphabetical order for sorting strings. This means uppercase letters come before lowercase letters, so the lowercase a comes after the uppercase Z.
You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order.
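A short sketch of the reverse keyword argument, reusing the numeric list from the example above:

```python
spam = [2, 5, 3.14, 1, -7]

spam.sort(reverse=True)  # sorts in place, largest value first
print(spam)              # [5, 3.14, 2, 1, -7]
```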
4.1.4.5 Searching an element in a List with the index() Method
List objects have an index() method that accepts an argument; if that argument exists in the list, its index is returned. If the argument isn’t in the list, Python raises a ValueError.
When the list contains duplicates of the value, the index of its first appearance is returned.
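A minimal sketch of index(), including a duplicate and a missing value. The list contents are assumptions for illustration.

```python
spam = ['hello', 'hi', 'howdy', 'hi']

print(spam.index('hello'))  # 0
print(spam.index('hi'))     # 1, the first of the two appearances

try:
    spam.index('howdy howdy')  # not in the list
except ValueError as error:
    print('ValueError:', error)
```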
4.1.5 Numerical functions for list
There are a number of built-in functions that can be used on lists to quickly look through a list without writing your own loops:
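The original example is not shown here, so the following is only a sketch under the assumption that the intended functions include len(), sum(), min(), and max(). The list reuses the numbers from the traversal example.

```python
numbers = [17, 5, 42, 7]

print(len(numbers))  # 4
print(sum(numbers))  # 71
print(min(numbers))  # 5
print(max(numbers))  # 42
```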
Check out https://docs.python.org/3/tutorial/datastructures.html#more-on-lists for more methods!
4.1.6 List Comprehensions
Consider how you might make a list of the first 10 square numbers (that is, the square of each integer from 1 through 10).
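One way to do this, sketched here with an ordinary for loop and append(). The names squares and value follow the ones used later in this section.

```python
squares = []
for value in range(1, 11):     # the integers 1 through 10
    squares.append(value**2)   # add each square to the end of the list

print(squares)
```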
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
But a list comprehension allows you to generate this same list in just one line of code. A list comprehension combines the for loop and the creation of new elements into one line, and automatically appends each new element!
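Using the names squares, value, and range(1, 11) that the explanation in this section refers to, the comprehension can be sketched as:

```python
squares = [value**2 for value in range(1, 11)]

print(squares)
```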
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
To use this syntax:
- Begin with a descriptive name for the list, such as squares.
- Next, open a set of square brackets and define the expression for the values you want to store in the new list. In this example, the expression is value**2.
- Then, write a for loop to generate the numbers you want to feed into the expression, and close the square brackets. In this example, the for loop iterates value in range(1, 11), which feeds the values 1 through 10 into the expression value**2.
Note that no colon is used at the end of the for statement.
The syntax of a list comprehension is similar to set-builder notation. For instance, the previous example is similar to \(\{x^2 \mid x \in \{1, 2, \ldots, 10\}\}\).
Another common operation is filtering elements to select only those that match a condition. This typically produces a list with fewer elements than the data being filtered. To do this in a list comprehension, use the if clause. The following includes in list1 only the even values produced by the for clause:
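A minimal sketch of such a filter. The name list1 comes from the text; the loop variable item and the range 1 through 10 are assumptions.

```python
# Keep only the values for which the if condition is true.
list1 = [item for item in range(1, 11) if item % 2 == 0]

print(list1)  # [2, 4, 6, 8, 10]
```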
4.1.7 Exercise 1: Bulls and Cows (or 1A2B) is a code-breaking game. The numerical version of the game is usually played with four digits. On a sheet of paper, the players each write a 4-digit secret number. The digits must all be different. Then, in turn, the players try to guess their opponent’s number, and the opponent reports the number of matches. If the matching digits are in their right positions, they are “bulls” (A). If they are in different positions, they are “cows” (B). For example, if the secret number is 4271 and our guess is 1234, then we should get 1 bull and 2 cows (the bull is “2”; the cows are “4” and “1”). Please complete the following game design: the computer generates a 4-digit number, and we must write a function that checks the user’s 4-digit guess against the secret number. Finally, return the message in the form XAXB to the user.
%%writefile 1A2B.py
import random
# Generate a random four-digit number
def generate_number():
    digits = list(range(10))
    random.shuffle(digits) # randomly shuffle the list!
    return digits[:4]
# Check the user's guess against the secret number
def check_guess(guess, secret):
    # Note that both guess and secret are lists!
    a = 0 # number of correct digits in the correct position
    b = 0 # number of correct digits in the wrong position
    for i,j in zip(__,__): # Iterate over two lists
        if i == j:
            a += 1
        elif __________: # Use operator to determine whether the digit is in secret number or not
            b += 1
    return a, b
# Play the game
print("Welcome to 1A2B!")
print("I'm thinking of a four-digit number. Can you guess it?")
secret = generate_number()
guesses = 0
while True:
    guess = input("Enter your guess, enter 'quit' to give up: ")
    if guess == 'quit':
        print("The secret number is", secret)
        break
    elif len(guess) != 4 or not guess.isdigit():
        print("Invalid guess. Please enter a four-digit number.")
        continue
    guess = _______ # Use list comprehension to get the 4-digit guess list
    guesses += 1
    result = check_guess(guess, secret)
    print(result[0],'A', result[1], 'B', sep="")
    if result[0] == 4:
        print("Congratulations, you guessed the number in", guesses, "guesses!")
        break
Overwriting 1A2B.py
4.1.8 Sequence Data Types
Lists aren’t the only data types that represent ordered sequences of values. For example, strings and lists are similar if you consider a string to be a “list” of single text characters. The Python sequence data types include lists, strings, range objects returned by range(), and tuples. Many of the things you can do with lists can also be done with strings and other values of sequence types: indexing; slicing; and using them with for loops, with len(), and with the in and not in operators.
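As a sketch, the same operations applied to a string, a tuple, and a range object. All names and values here are assumptions for illustration.

```python
name = 'Zophie'       # a string
grades = (90, 85, 77) # a tuple
evens = range(0, 10, 2)

print(name[0], name[-2])  # indexing: Z i
print(name[0:4])          # slicing: Zoph
print(grades[1:])         # slicing a tuple gives a tuple: (85, 77)
print(evens[2])           # indexing a range: 4
print(len(name), len(grades), len(evens))  # 6 3 5
print('Zo' in name, 100 in grades, 4 in evens)  # True False True

for character in name:    # a for loop visits each character
    print(character)
```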
4.1.9 Mutable and Immutable Data Types
But lists and strings are different in an important way. A list object is a mutable data type: it can have elements added, removed, or changed. However, a string is immutable: it cannot be changed. Trying to reassign a single character in a string results in a TypeError:
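A minimal sketch of the error and of the usual workaround, which builds a new string from slices. The string contents are assumptions for illustration.

```python
name = 'Zophie a cat'

try:
    name[7] = 't'  # strings do not support item assignment
except TypeError as error:
    print('TypeError:', error)

# Build a new string instead of changing the existing one.
new_name = name[0:7] + 'the' + name[8:12]
print(new_name)  # Zophie the cat
print(name)      # the original is unchanged: Zophie a cat
```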
4.2 Tuples
A tuple is a sequence of values much like a list. The values stored in a tuple can be any type, and they are indexed by integers. The important difference is that tuples are immutable.
A tuple is similar to the tuple you encounter in mathematics.
Although it is not necessary, it is common to enclose tuples in parentheses to help us quickly identify tuples when we look at Python code:
To create a tuple with a single element, you have to include the final comma or use the tuple() function:
If the argument of tuple() is a sequence (string, list, or tuple), the result is a tuple with the elements of the sequence:
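The three points above (optional parentheses, the trailing comma for a single element, and tuple() applied to a sequence) can be sketched as follows. All names and values are assumptions for illustration.

```python
t1 = 'a', 'b', 'c'    # parentheses are optional
t2 = ('a', 'b', 'c')  # but they make the tuple easier to spot
print(t1 == t2)       # True

single = ('a',)       # the trailing comma makes this a tuple
not_a_tuple = ('a')   # without the comma this is just a string
print(type(single), type(not_a_tuple))  # <class 'tuple'> <class 'str'>

print(tuple('lupins'))   # ('l', 'u', 'p', 'i', 'n', 's')
print(tuple([1, 2, 3]))  # (1, 2, 3)
```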
Most list operators also work on tuples. The bracket operator indexes an element:
But if you try to modify one of the elements of the tuple, you get an error:
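A short sketch of indexing, slicing, and the error raised on assignment. The tuple contents are assumptions for illustration.

```python
t = ('a', 'b', 'c', 'd', 'e')

print(t[0])    # a
print(t[1:3])  # slicing a tuple gives a tuple: ('b', 'c')

try:
    t[0] = 'A'  # tuples do not support item assignment
except TypeError as error:
    print('TypeError:', error)
```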
You can use tuples to convey to anyone reading your code that you don’t intend for that sequence of values to change. Use a tuple if you need an ordered sequence of values that never changes.
4.2.1 Unpacking Sequences
We saw the multiple assignment trick in the previous chapter (which is actually tuple unpacking). In fact, you can unpack any sequence’s elements by assigning the sequence to a comma-separated list of variables.
Unpacking is widely used to return multiple values from a function:
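A minimal sketch of unpacking a string, swapping two variables, and unpacking a function’s return value. The function min_max and all variable names are hypothetical, introduced only for this illustration.

```python
first, second, third = 'abc'  # a string unpacks into its characters
print(first, second, third)   # a b c

x, y = [10, 20]               # a list unpacks the same way
x, y = y, x                   # swapping uses a temporary tuple
print(x, y)                   # 20 10

def min_max(values):
    # Returns one tuple that holds two results.
    return min(values), max(values)

smallest, largest = min_max([17, 5, 42, 7])
print(smallest, largest)      # 5 42
```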
4.3 References
Technically, in Python, variables store references to the computer memory locations where the values are stored.
spam = 42
cheese = spam
print(id(cheese), id(spam))
spam = 100
print(id(cheese), id(spam))
spam, cheese
140731993439296 140731993439296
140731993439296 140731993441152
(100, 42)
When you assign 42 to the spam variable, you are actually creating the 42 value in the computer’s memory and storing a reference (address) to it in the spam variable. When you copy the value in spam and assign it to the variable cheese, you are actually copying the reference. Both the spam and cheese variables refer to the 42 value in the computer’s memory. When you later change the value in spam to 100, you’re creating a new 100 value and storing a reference to it in spam. This doesn’t affect the value in cheese. Integers are immutable values that don’t change; changing the spam variable is actually making it refer to a completely different value in memory.
You can use the id() function to verify this behavior. In CPython (the most widely used implementation of Python), the identifier returned by id() is actually the memory address of the object, represented as a Python integer. Every value in Python has an identity that is unique for as long as the object exists, and it can be obtained with the id() function.
But lists don’t work this way, because lists are mutable:
spam = [0, 1, 2, 3, 4, 5]
cheese = spam # The reference is being copied, not the list.
print(id(cheese), id(spam))
cheese[1] = 'Hello!' # This changes the list value.
print(id(cheese), id(spam))
spam, cheese
2790646437440 2790646437440
2790646437440 2790646437440
([0, 'Hello!', 2, 3, 4, 5], [0, 'Hello!', 2, 3, 4, 5])
Using boxes as a metaphor for variables, the following shows what happens when a list is assigned to the spam variable.
Then, the reference in spam is copied to cheese. Only a new reference was created and stored in cheese, not a new list. Note how both references refer to the same list.
When you alter the list that cheese refers to, the list that spam refers to is also changed, because both cheese and spam refer to the same list.
You may be wondering why the weird behavior with mutable lists shown above doesn’t happen with immutable values like integers or strings. Let us elaborate on this topic.
Like an integer, the string 'Hello' is immutable and cannot be changed. If you “change” the string in a variable, a new string object is made at a different place in memory, and the variable refers to this new string.
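A minimal sketch of this, assuming a variable named bacon (the name and the appended text are hypothetical). The two printed identities differ, and the actual numbers vary from run to run:

```python
bacon = 'Hello'
print(id(bacon))

bacon += ' world!'  # builds a new string object and rebinds bacon to it
print(id(bacon))
```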
2790646417200
2790646416176
However, lists can be modified because they are mutable objects. The append() method doesn’t create a new list object; it changes the existing list object. We call this “modifying the object in-place.”
eggs = ['Hello'] # This creates a new list.
print(id(eggs))
eggs.append('World') # append() modifies the list "in place".
print(id(eggs)) # eggs still refers to the same list as before.
2790646499648
2790646499648
If two variables refer to the same list (like spam and cheese in the previous section) and the list itself changes, both variables are affected because they both refer to the same list. The append(), remove(), sort(), reverse(), and other list methods modify their lists in place.
Python’s automatic garbage collector deletes any values not being referred to by any variables to free up memory. You don’t need to worry about how the garbage collector works, which is a good thing: manual memory management in other programming languages is a common source of bugs.
4.3.1 Passing References
References are particularly important for understanding how arguments get passed to functions. When a function is called, the values of the arguments are copied to the parameter variables. For lists (and dictionaries, which we will describe in the next chapter), this means a copy of the reference is used for the parameter.
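A minimal sketch of this behavior, using the names eggs, someParameter, and spam that the explanation in this section refers to. The starting contents of spam are an assumption.

```python
def eggs(someParameter):
    # someParameter refers to the same list object as the caller's variable.
    someParameter.append('Hello')

spam = [1, 2, 3]
eggs(spam)
print(spam)
```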
[1, 2, 3, 'Hello']
Notice that when eggs() is called, a return value is not used to assign a new value to spam. Instead, the function modifies the list in place. Even though spam and someParameter contain separate references, they both refer to the same list. This is why the append('Hello') method call inside the function affects the list even after the function call has returned.
For immutable types such as strings and integers, “modifying” someParameter inside the function creates a new object and rebinds only the local name. Therefore, the original value is unchanged after the function call.
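The immutable case can be sketched as follows. The function add_one and the variable names count and result are hypothetical, introduced only for this illustration.

```python
def add_one(someParameter):
    someParameter += 1  # creates a new integer and rebinds the local name only
    return someParameter

count = 42
result = add_one(count)
print(count, result)    # the caller's variable is unchanged: 42 43
```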
4.3.2 The copy Module’s copy() and deepcopy() Functions
Python provides a module named copy that provides both the copy() and deepcopy() functions. copy() can be used to make a duplicate copy of a mutable value like a list or dictionary, not just a copy of a reference.
import copy
spam = ['A', 'B', 'C', 'D']
print(id(spam))
cheese = copy.copy(spam)
print(id(cheese)) # cheese is a different list with different identity.
cheese[1] = 42
spam, cheese
2790645442048
2790645869632
(['A', 'B', 'C', 'D'], ['A', 42, 'C', 'D'])
Now the spam and cheese variables refer to separate lists, which is why only the list in cheese is modified when you assign 42 at index 1.
If the list you need to copy contains another list, then use the copy.deepcopy() function instead of copy.copy(). The deepcopy() function will copy these inner lists as well.
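A short sketch of the difference between the two functions on a list that contains lists. The names shallow and deep and the list contents are assumptions for illustration.

```python
import copy

spam = [[1, 2], [3, 4]]     # hypothetical list that contains other lists

shallow = copy.copy(spam)   # new outer list, but the inner lists are shared
deep = copy.deepcopy(spam)  # new outer list and new inner lists

spam[0][0] = 'changed'

print(shallow[0])  # the shared inner list changed too: ['changed', 2]
print(deep[0])     # the deep copy is unaffected: [1, 2]
```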
4.3.3 Exercise 2: Here, we will simulate the process of a simple card game. The game is played with a standard deck of 52 cards, and we will randomly select 40 cards and divide them evenly between two players. Each player gets a hand of 20 cards. The goal of the game is to collect pairs of cards with the same rank (e.g., two aces, two kings, etc.). The player with the most pairs at the end of the game wins.
import random
# Write a function create_deck that creates a list of tuples representing a standard deck of 52 cards.
# Each tuple should contain two elements: the rank (e.g., "ace", "king", etc.)
# and the suit (e.g., "hearts", "spades", etc.).
def create_deck():
    ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
    suits = ['♣', '♦', '♥', '♠']
    deck = [(rank, suit) ______] # Use list comprehension to create the deck.
    return deck
# A function that takes the deck as a parameter and returns two lists, each containing 20 randomly-selected
# cards from the deck. Use list slicing and the random module to implement this function.
def deal_cards(deck):
    deck = deck[:] # Copy the deck so the caller's list is not shuffled
    random.shuffle(deck)
    deck = deck[:40] # Keep 40 randomly selected cards
    hand1 = _____ # Split it into 20 cards in each using slice
    hand2 = _____
    return hand1, hand2
# Write a function find_pairs that takes a list of cards as a parameter and returns a list of tuples
# representing the pairs of cards in the list. A pair is defined as two cards with the same rank.
def find_pairs(cards):
    pairs = []
    for i, card1 in enumerate(cards):
        for j, card2 in enumerate(cards):
            if i != j and card1[0] == card2[0] and card1 not in [pair[0] for pair in pairs]\
               and card1 not in [pair[1] for pair in pairs] and card2 not in [pair[0] for pair in pairs]\
               and card2 not in [pair[1] for pair in pairs]:
                pairs._____((card1, card2)) # Use a method from the list to add it into the pairs
    return pairs
deck = create_deck()
hand1, hand2 = deal_cards(deck)
pairs1 = find_pairs(hand1)
pairs2 = find_pairs(hand2)
print(pairs1)
print(pairs2)
if ___________: # Compare the length of the two lists
    print("Player 1 wins!")
elif _____________:
    print("Player 2 wins!")
else:
    print("It's a tie!")
Lists are useful data types since they allow you to write code that works on a modifiable number of values in a single variable. Later on, you will see programs using lists to do things that would be difficult or impossible to do without them.
Lists are a sequence data type that is mutable, meaning that their contents can change. Tuples and strings, though also sequence data types, are immutable and cannot be changed. A variable that contains a tuple or string value can be overwritten with a new tuple or string value, but this is not the same thing as modifying the existing value in place, as the append() or remove() methods do on lists. Because tuples are immutable, they don’t provide methods like sort() and reverse(), which modify existing lists. However, Python provides the built-in functions sorted() and reversed(), which take any sequence as an argument and leave it unchanged: sorted() returns a new sorted list, and reversed() returns an iterator over the elements in reverse order.
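As a sketch of sorted() and reversed() applied to a tuple. The name point and its values are assumptions for illustration.

```python
point = (3, 1, 2)

print(sorted(point))          # sorted() returns a new list: [1, 2, 3]
print(list(reversed(point)))  # reversed() returns an iterator: [2, 1, 3]
print(tuple(sorted(point)))   # convert back if a tuple is needed: (1, 2, 3)
print(point)                  # the original tuple is unchanged: (3, 1, 2)
```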
Variables do not store list objects directly; they store references to lists. This is an important distinction when you are copying variables or passing lists as arguments in function calls. Because the value that is being copied is the list reference, be aware that any changes you make to the list might impact another variable in your program. You can use copy() or deepcopy() if you want to make changes to a list in one variable without modifying the original list. Note that slicing also creates a new list object.