5  Other collections - Dictionaries and Sets

Author

phonchi

Published

March 17, 2023

Open In Colab


5.1 Dictionaries

We’ve discussed three built-in sequence collections — strings, lists and tuples. Now, we consider the built-in non-sequence collections — dictionaries and sets. A dictionary is an unordered collection which stores key–value pairs that map immutable keys to values, just as a conventional dictionary maps words to definitions. A set is an unordered collection of unique immutable elements.

Like a list, a dictionary is a mutable collection of many values, but more general. In a list, the index positions have to be integers; in a dictionary, the indices can be any immutable data type. You can think of a dictionary as a mapping between a set of indices (which are called keys) and a set of values. Each key maps to a value. The association of a key and a value is called a key-value pair or sometimes an item.

A dictionary’s keys must be immutable (such as strings, integers or tuples) and unique (that is, no duplicates). However, multiple keys can have the same value.

source: https://pynative.com/python-dictionaries/

It is noted that as of Python 3.7, dictionary items maintain the order in which they are inserted into the dictionary. However, dictionaries are considered unordered collections and do not write code that depends on the order of the key–value pairs.

As an example, we’ll build a dictionary that maps from subjects to grades, so the keys are string while the values are integers. The function dict creates a new dictionary with no items.

type({}) # {} also treated as dict in Python
dict
grade = dict()
type(grade), grade
(dict, {})

To add/update items to the dictionary, you can again use subscript operator (square brackets):

grade['calculus'] = 85 # Key:'calculus', value: 85
print(grade) # Note that key and value are separate by colon
{'calculus': 85}

You can create a dictionary that contains multiple items by enclosing in curly braces, {}, a comma-separated list of key–value pairs, each of the form key:value.

grade = {'calculus':85, 'introduction to mathematics':80, 'computer programming':90, 'linear algebra':95}
grade
{'calculus': 85,
 'introduction to mathematics': 80,
 'computer programming': 90,
 'linear algebra': 95}

You can store them using separate lists for subjects and scores, but the following update and maintenance will become tedious:

subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
score = [85, 80, 90, 95]

You can now use the keys to look up the corresponding values:

print(grade['computer programming'])
90

You can obtain the number of items using len()

len(grade)
4

Note that you can’t access items in them using integer indexes like grade[0] because it is unordered collections. (Consider the case when you use 4, 2, 1, 0 as the keys). Therefore, you can’t use slice syntax for dictionaries.

Trying to access a key that does not exist in a dictionary will result in a KeyError error message, much like a list’s “out-of-range” IndexError error message.

grade['English']
KeyError: 'English'

To add or delete an entry, it is similar to list

grade['English'] = 100
grade
{'calculus': 85,
 'introduction to mathematics': 80,
 'computer programming': 90,
 'linear algebra': 95,
 'English': 100}

You can delete a key–value pair from a dictionary with the del statement:

del grade['English']
grade
{'calculus': 85,
 'introduction to mathematics': 80,
 'computer programming': 90,
 'linear algebra': 95}

5.1.1 The keys(), values(), and items() Methods

There are three dictionary methods that will return list-like values of the dictionary’s keys, values, or both keys and values: keys(), values(), and items(). The values returned by these methods are not true lists, but these data types (dict_keys, dict_values, and dict_items, respectively) can be used in for loops (Just like range object)!

If you want a true list from one of these methods, pass its list-like return value to the list() function

subject = list(grade.keys())
score = list(grade.values())
print(subject)
print(score)
['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
[85, 80, 90, 95]
for v in grade.values():
    print(v)
85
80
90
95

Here, a for loop iterates over each of the values in the grade dictionary. A for loop can also iterate over the keys:

for k in grade.keys():
    print(k)
calculus
introduction to mathematics
computer programming
linear algebra
for k in grade:
    print(k)
calculus
introduction to mathematics
computer programming
linear algebra

Note that by default, it will traverse over the keys!

Dictionaries have a method called items() that returns a list of tuples, where each tuple is a key-value pair:

list(grade.items())
[('calculus', 85),
 ('introduction to mathematics', 80),
 ('computer programming', 90),
 ('linear algebra', 95)]

Combining items(), multiple assignment, and for, you can see a nice code pattern for traversing the keys and values of a dictionary in a single loop:

for key, val in grade.items():
    print(key,val)
calculus 85
introduction to mathematics 80
computer programming 90
linear algebra 95

5.1.1.1 Checking Whether a Key or Value Exists in a Dictionary

Recall from the previous chapter that the in and not in operators can check whether a value exists in a list. You can also use these operators to see whether a certain key or value exists in a dictionary

'calculus' in grade, 'English' in grade.keys(), 85 in grade.values() 
(True, False, True)

Again, it will check keys by default. Therefore, in the previous example, 'calculus' in grade is essentially a shorter version of writing 'calculus' in grade.keys(). This is always the case: if you ever want to check whether a value is (or isn’t) a key in the dictionary, you can simply use the in (or not in) keyword with the dictionary itself.

5.1.1.2 Retrieve value uisng get() Method

if 'English' in grade:
    e_score= grade['English']

It’s tedious to check whether a key exists in a dictionary before accessing that key’s value. Fortunately, dictionaries have a get() method that takes two arguments: the key of the value to retrieve and a fallback value to return if that key does not exist.

picnicItems = {'apples': 5, 'cups': 2}
print('I am bringing ' + str(picnicItems.get('cups', 0)) + ' cups.') 
print('I am bringing ' + str(picnicItems.get('eggs', 0)) + ' eggs.')
I am bringing 2 cups.
I am bringing 0 eggs.

Because there is no ‘eggs’ key in the picnicItems dictionary, the default value 0 is returned by the get() method. Without using get(), the code would have caused a KeyError message

picnicItems = {'apples': 5, 'cups': 2}
'I am bringing ' + str(picnicItems['eggs']) + ' eggs.'
KeyError: 'eggs'

5.1.1.3 Update value using setdefault() Method

You’ll often have to set a value in a dictionary for a certain key only if that key does not already have a value. The code looks something like this:

spam = {'name': 'Pooka', 'age': 5}
if 'color' not in spam:
    spam['color'] = 'black'

The setdefault() method offers a way to do this in one line of code. The first argument passed to the method is the key to check for, and the second argument is the value to set at that key if the key does not exist.

The setdefault() method is a nice shortcut to ensure that a key exists. Here is a short program that counts the number of occurrences of each letter in a string.

message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}

for character in message:
    if character not in count:
        count[character] = 0
    count[character] = count[character] + 1

print(count)    
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

print(count)    
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}

You can view the execution of this program at https://autbor.com/setdefault. The program loops over each character in the message variable’s string, counting how often each character appears. The setdefault() method ensures that the key is in the count dictionary (with a default value of 0) so the program doesn’t throw a KeyError error when count[character] = count[character] + 1 is executed!

From the output, you can see that the lowercase letter c appears 3 times, the space character appears 13 times, and the uppercase letter A appears 1 time.

5.1.2 Pretty Printing

If you import the pprint module into your programs, you’ll have access to the pprint() function that will “pretty print” a dictionary’s values. This is helpful when you want a cleaner display of the items in a dictionary than what print() provides.

import pprint

message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

pprint.pprint(count)
{' ': 13,
 ',': 1,
 '.': 1,
 'A': 1,
 'I': 1,
 'a': 4,
 'b': 1,
 'c': 3,
 'd': 3,
 'e': 5,
 'g': 2,
 'h': 3,
 'i': 6,
 'k': 2,
 'l': 3,
 'n': 4,
 'o': 2,
 'p': 1,
 'r': 5,
 's': 3,
 't': 6,
 'w': 2,
 'y': 1}

You can view the execution of this program at https://autbor.com/pprint/. This time, when the program is run, the output looks much cleaner, with the keys sorted. The pprint.pprint() function is especially helpful when the dictionary itself contains nested lists or dictionaries.

5.1.2.1 Dictionary Comprehensions

Dictionary comprehensions provide a convenient notation for quickly generating dictionaries, often by mapping one dictionary to another. For example, in a dictionary with unique values, you can reverse the key–value pairs:

months = {'January': 1, 'February': 2, 'March': 3}
months2 = {number:name for name, number in months.items()}
months2
{1: 'January', 2: 'February', 3: 'March'}

Curly braces delimit a dictionary comprehension, and the expression to the left of the for specifies a key–value pair of the form key:value. The comprehension iterates through months.items(), unpacking each key–value pair tuple into the variables name and number. The expression number:name reverses the key and value, so the new dictionary maps the month numbers to the month names.

A dictionary comprehension also can map a dictionary’s values to new values. The following comprehension converts a dictionary of names and lists of grades into a dictionary of names and grade-point averages. The variables k and v commonly mean key and value:

grades = {'Sue': [98, 87, 94], 'Bob': [84, 95, 91]}
{'Bob': [84, 95, 91], 'Sue': [98, 87, 94]}
grades2 = {k:sum(v)/len(v) for k, v in grades.items()}
grades2
{'Sue': 93.0, 'Bob': 90.0}

Note the above is nested structure, with a list in a dictionary!

5.1.3 Using Data Structures to Model Real-World Things

A tic-tac-toe board looks like a large hash symbol (#) with nine slots that can each contain an X, an O, or a blank. To represent the board with a dictionary, you can assign each slot a key, as shown in below:

1 2 3
4 5 6
7 8 9

You can use string values to represent what’s in each slot on the board: ‘X’, ‘O’, or ’ ’ (a space). Thus, you’ll need to store nine strings. You can use a dictionary of values for this. The string value with the key ‘3’ can represent the top-right corner, the string value with the key ‘7’ can represent the bottom-left corner, the string value with the key ‘5’ can represent the middle, and so on. This dictionary is a data structure that represents a tic-tac-toe board. Store this board-as-a-dictionary in a variable named board.

board = {'1': ' ', '2': ' ', '3': ' ',
         '4': ' ', '5': ' ', '6': ' ',
         '7': ' ', '8': ' ', '9': ' '}

A board where player O has won by placing Os across the top might look like this:

board = {'1': 'O', '2': 'O', '3': 'O',
         '4': 'X', '5': 'X', '6': ' ',
         '7': ' ', '8': ' ', '9': 'X'}
print(board)
{'1': 'O', '2': 'O', '3': 'O', '4': 'X', '5': 'X', '6': ' ', '7': ' ', '8': ' ', '9': 'X'}
source: https://automatetheboringstuff.com/2e/chapter5/

Of course, the player should see only what is printed to the screen, not the contents of variables. Let’s create a function to print the board dictionary onto the screen:

def printBoard(board):
    """
    The function that prints out the board in a square shape
    """
    print(board['1'] + '|' + board['2'] + '|' + board['3'])
    print('-+-+-')
    print(board['4'] + '|' + board['5'] + '|' + board['6'])
    print('-+-+-')
    print(board['7'] + '|' + board['8'] + '|' + board['9'])

printBoard(board)
O|O|O
-+-+-
X|X| 
-+-+-
 | |X

The printBoard() function can handle any tic-tac-toe data structure you pass it!

Now let’s add code that allows the players to enter their moves.

board = {'1': ' ', '2': ' ', '3': ' ',
         '4': ' ', '5': ' ', '6': ' ',
         '7': ' ', '8': ' ', '9': ' '}

board['1'] = 'O'  # Update the board
printBoard(board)
board['5'] = 'X' 
printBoard(board)
O| | 
-+-+-
 | | 
-+-+-
 | | 
O| | 
-+-+-
 |X| 
-+-+-
 | | 

We can also check whether player has won the game or not using the following code:

p = 'O' # Check whether 'O' has won the game or not

((board['1'] == board['2'] == board['3'] == p) or  # Across top
(board['4'] == board['5'] == board['6'] == p) or  # Across middle
(board['7'] == board['8'] == board['9'] == p) or  # Across boardottom
(board['1'] == board['4'] == board['7'] == p) or  # Down left
(board['2'] == board['5'] == board['8'] == p) or  # Down middle
(board['3'] == board['6'] == board['9'] == p) or  # Down right
(board['3'] == board['5'] == board['7'] == p) or  # Diagonal
(board['1'] == board['5'] == board['9'] == p))  # Diagonal
False

Note that by enclosing the condiations with (), we do not have to add \ for multipline commands.

5.1.4 Exercise 1: Tic-tac-toe is a classic pencil-and-paper game played on a 3 × 3 grid. Players take turns placing their ‘X’ or ‘O’ marks, trying to get three in a row. Try to complete the following game design by complete three functions getBlankBoard(), isValidSpace() and isBoardFull().

%%writefile ttt.py

def printBoard(board):
    """
    The function that prints out the board in a square shape
    """
    print(board['1'] + '|' + board['2'] + '|' + board['3'])
    print('-+-+-')
    print(board['4'] + '|' + board['5'] + '|' + board['6'])
    print('-+-+-')
    print(board['7'] + '|' + board['8'] + '|' + board['9'])


def updateBoard(board, space, mark):
    """Sets the space on the board to mark."""
    board[space] = mark

def isWinner(board, player):
    """Return True if player is a winner on this TTTBoard."""
    # Shorter variable names used here for readablility:
    b, p = board, player
    # Check for 3 marks across 3 rows, 3 columns, and 2 diagonals.
    return ((b['1'] == b['2'] == b['3'] == p) or  # Across top
            (b['4'] == b['5'] == b['6'] == p) or  # Across middle
            (b['7'] == b['8'] == b['9'] == p) or  # Across bottom
            (b['1'] == b['4'] == b['7'] == p) or  # Down left
            (b['2'] == b['5'] == b['8'] == p) or  # Down middle
            (b['3'] == b['6'] == b['9'] == p) or  # Down right
            (b['3'] == b['5'] == b['7'] == p) or  # Diagonal
            (b['1'] == b['5'] == b['9'] == p))    # Diagonal

def getBlankBoard():
    """Create a new, blank tic-tac-toe board."""
    # Map of space numbers: 1|2|3
    #                       -+-+-
    #                       4|5|6
    #                       -+-+-
    #                       7|8|9
    # Keys are '1' through '9', the values are 'X', 'O', or ' ':
    # Initialize all spaces of the board as blank string ' ' using loop or dictionary complehention
    ALL_SPACES = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
    ______________________________
    return board

def isValidSpace(board, space):
    """Returns True if the space on the board is a valid space number (1-9)
    and the space is blank."""
    ALL_SPACES = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
    return ____________ and ________________

def isBoardFull(board):
    """Return True if every space on the board has been taken."""
    for v in ________:  # Traverse over the board to see if there is blank space
        if v == ' ':
            return False  # If any space is blank, return False.
    return True  # No spaces are blank, so return True



# The logic of the game
print('Welcome to Tic-Tac-Toe!')
gameBoard = getBlankBoard()  # Create a TTT board dictionary.
currentPlayer, nextPlayer = 'X', 'O'  # X goes first, O goes next.

while True:  # Main game loop.
    # 1. Display the board on the screen:
    printBoard(gameBoard)

    # 2. Keep asking the player until they enter a number 1-9:
    move = None
    while not isValidSpace(gameBoard, move):
        print('What is ' + currentPlayer + '\'s move? (1-9)')
        move = input('> ')
    updateBoard(gameBoard, move, currentPlayer)  # Make the move.

    # 3. Check if the game is over:
    if isWinner(gameBoard, currentPlayer):  # Check for a winner.
        printBoard(gameBoard)
        print(currentPlayer + ' has won the game!')
        break
    elif isBoardFull(gameBoard):  # Check for a tie.
        print(printBoard(gameBoard))
        print('The game is a tie!')
        break
    # 4. Switch turns to the next player using multiple assignment:
    currentPlayer, nextPlayer = nextPlayer, currentPlayer
print('Thanks for playing!')
Overwriting ttt.py

5.1.5 Nested Dictionaries and Lists

5.1.5.1 A List of Dictionaries

Consider a game featuring aliens that can have different colors and point values. This simple dictionary stores information about a particular alien:

alien_0 = {'color': 'green', 'points': 5}

The alien_0 dictionary contains a variety of information about one alien, but it has no room to store information about a second alien, much less a screen full of aliens. How can you manage a fleet of aliens? One way is to make a list of aliens in which each alien is a dictionary of information about that alien.

aliens = []
# Make 30 green aliens.
for alien_number in range(30):
    new_alien = {'color': 'green', 'points': 5, 'speed': 'slow'}
    aliens.append(new_alien)
# Show the first 5 aliens.
for alien in aliens[:5]:
    print(alien)
print("...")
# Show how many aliens have been created.
print(f"Total number of aliens: {len(aliens)}")
{'color': 'green', 'points': 5, 'speed': 'slow'}
{'color': 'green', 'points': 5, 'speed': 'slow'}
{'color': 'green', 'points': 5, 'speed': 'slow'}
{'color': 'green', 'points': 5, 'speed': 'slow'}
{'color': 'green', 'points': 5, 'speed': 'slow'}
...
Total number of aliens: 30

These aliens all have the same characteristics, but Python considers each one a separate object, which allows us to modify each alien individually. How might you work with a group of aliens like this? Imagine that one aspect of a game has some aliens changing color and moving faster as the game progresses. When it’s time to change colors, we can use a for loop and an if statement to change the color of the aliens. For example, to change the first three aliens to yellow, medium-speed aliens worth 10 points each, we could do this:

for alien in aliens[:3]:
    if alien['color'] == 'green':
        alien['color'] = 'yellow'
        alien['speed'] = 'medium'
        alien['points'] = 10

aliens[:10]
[{'color': 'yellow', 'points': 10, 'speed': 'medium'},
 {'color': 'yellow', 'points': 10, 'speed': 'medium'},
 {'color': 'yellow', 'points': 10, 'speed': 'medium'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'},
 {'color': 'green', 'points': 5, 'speed': 'slow'}]

5.2 Sets

A set is an unordered collection of unique values. Sets may contain only immutable objects, like strings, ints, floats and tuples that contain only immutable elements.

The following code creates a set of strings named colors:

colors = {'red', 'orange', 'yellow', 'green', 'red', 'blue'} # Similar to set in math
colors
{'blue', 'green', 'orange', 'red', 'yellow'}

Notice that the duplicate string 'red' was ignored (without causing an error). An important use of sets is duplicate elimination, which is automatic when creating a set. Also, the resulting set’s values may not be displayed in the same order as they were listed! Though the color names are displayed in sorted order, sets are unordered. You should not write code that depends on the order of their elements!

Note that we also use curly bracket to create a set!

Though sets are iterable, they are not sequences and do not support indexing and slicing with square brackets, [].

colors[0]
TypeError: 'set' object is not subscriptable

You can determine the number of items in a set with the built-in len() function:

len(colors)
5

You can check whether a set contains a particular value using the in and not in operators:

'red' in colors, 'purple' not in colors
(True, True)

Sets are iterable, so you can process each set element with a for loop:

for color in colors: # {'blue', 'green', 'orange', 'red', 'yellow'}
    print(color, end=' ')
green blue red yellow orange 

Sets are unordered, so there’s no significance to the iteration order!

5.2.1 Creating a Set with the Built-In set() Function

You can create a set from another collection of values by using the built-in set() function — here we create a list that contains several duplicate integer values and use that list as set’s argument:

numbers = list(range(10)) + list(range(5))
numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
set(numbers)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

If you need to create an empty set, you must use the set() function with empty parentheses, rather than empty braces, {}, which represent an empty dictionary:

set()
set()

Python displays an empty set as set() to avoid confusion with Python’s string representation of an empty dictionary ({}).

5.2.2 Set Operators and Methods

Sets are mutable — you can add and remove elements, but set elements must be immutable. Therefore, a set cannot have other sets as elements.

{6,5,'a'}
{5, 6, 'a'}
{7,3,{3,5,7}}
TypeError: unhashable type: 'set'
{7,3,[3,5,7]}
TypeError: unhashable type: 'list'

5.2.2.1 Methods for Adding and Removing Elements

Here we first discuss operators and methods that modify an existing set.

Set method update() performs a union operation modifying the set in-place — the argument can be any iterable:

numbers = {1, 3, 5}     
numbers.update(range(10)) 
numbers
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Set method add() inserts its argument if the argument is not already in the set; otherwise, the set remains unchanged:

numbers.add(17)
numbers.add(3)
numbers
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 17}

Set method remove() removes its argument from the set — a KeyError occurs if the value is not in the set:

numbers.remove(3)
numbers
{0, 1, 2, 4, 5, 6, 7, 8, 9, 17}
numbers.remove(11)
KeyError: 11

Method discard() also removes its argument from the set but does not cause an exception if the value is not in the set.

numbers.discard(1)
numbers
{2, 4, 5, 6, 7, 8, 9, 17}
numbers.discard(0)

5.2.3 Mathematical Set Operations

The operators and methods presented in this section each result in a new set!

5.2.3.1 Union

The union of two sets is a set consisting of all the unique elements from both sets. You can calculate the union with the | operator or with the set union() method:

{1, 3, 5} | {2, 3, 4}
{1, 2, 3, 4, 5}
{1, 3, 5}.union([20, 20, 3, 40, 40])
{1, 3, 5, 20, 40}

The operands of the binary set operators, like |, must both be sets. The corresponding set methods may receive any iterable object as an argument — we passed a list. When a mathematical set method receives a non-set iterable argument, it first converts the iterable to a set, then applies the mathematical operation.

5.2.3.2 Intersection

The intersection of two sets is a set consisting of all the unique elements that the two sets have in common. You can calculate the intersection with the & operator or with the set intersection() method:

{1, 3, 5} & {2, 3, 4}
{3}
{1, 3, 5}.intersection([1, 2, 2, 3, 3, 4, 4])
{1, 3}

5.2.3.3 Difference

The difference between two sets is a set consisting of the elements in the left operand that are not in the right operand. You can calculate the difference with the - operator or with the set difference() method:

{1, 3, 5} - {2, 3, 4}
{1, 5}
{1, 3, 5, 7}.difference([2, 2, 3, 3, 4, 4])
{1, 5, 7}

5.2.3.4 Symmetric Difference

The symmetric difference between two sets is a set consisting of the elements of both sets that are not in common with one another. You can calculate the symmetric difference with the ^ operator or with the set symmetric_difference method:

{1, 3, 5} ^ {2, 3, 4}
{1, 2, 4, 5}
{1, 3, 5, 7}.symmetric_difference([2, 2, 3, 3, 4, 4])
{1, 2, 4, 5, 7}

5.2.3.5 Disjoint

Two sets are disjoint if they do not have any common elements. You can determine this with the set isdisjoint() method:

{1, 3, 5}.isdisjoint({2, 4, 6})
True
{1, 3, 5}.isdisjoint({4, 6, 1})
False

5.2.4 Set Comprehensions

Like dictionary comprehensions, you define set comprehensions in curly braces. Let’s create a new set containing only the unique even values in the list numbers:

numbers = [1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 10]
evens = {item for item in numbers if item % 2 == 0}

evens
{2, 4, 6, 8, 10}

5.2.5 Sorting the set and dictionary

As we mentioned last week, data types like tuples don’t provide methods like sort(). However Python provides the built-in function sorted(), which takes any sequence as a parameter and returns a new container with the same elements in a different order. You can also apply sorted to the set, but the returning container will be list.

help(sorted)
Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

def rank_key(card):
    return RANKS.index(card)

ori_set = {"A", "2", "7", "4", "Q"}
print(sorted(ori_set))
sorted_list = sorted(ori_set, key=rank_key) 
# Each element will be replaced by the output of rank_key() and sorts!
print(sorted_list)
['2', '4', '7', 'A', 'Q']
['A', '2', '4', '7', 'Q']

Note that we have changed the behavior of the sorted() function by providing the custom key that allows us to sort the data in a specific order using the predefined list and the index() function.

If you would like to sort the dictionary, you need to use the items() method (Otherwise, it will only return keys). The returning container will again be a list:

grade = {'calculus':85, 'introduction to mathematics':80, 'introduction to computer science':90, 'linear algebra':95}
sorted_list = sorted(grade.items())
print(sorted_list)
[('calculus', 85), ('introduction to computer science', 90), ('introduction to mathematics', 80), ('linear algebra', 95)]

If you would like to sort by the value, use the following code:

def value_key(x):
    return x[1]

grade = {'calculus':85, 'introduction to mathematics':80, 'computer programming':90, 'linear algebra':95}
sorted_dict = sorted(grade.items(), key=value_key)
print(sorted_dict)
[('introduction to mathematics', 80), ('calculus', 85), ('computer programming', 90), ('linear algebra', 95)]

5.2.6 Exercise 2: Try to design a program that counts the number of unique characters in a string. Be sure to exclude the punctuation and white space and the character with upper and lower cases are treated as different characters.

source: https://realpython.com/python-encodings-guide/
import string

# You can obtain all the punctuations using the following attribute
string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
# Sample text
text = "The quick brown fox jumped over the lazy dog. The dog didn't seem to mind!"

# Create a set of punctuation characters
punctuation = set(string.punctuation)
    
# Create a set of all characters included in the text
all_chars = _____________
    
# Find the difference between the two sets to get the non-punctuation characters
non_punctuation_chars = ____________________________
# Also remove the whitespace
_____________________________
    
# Count the number of non-punctuation characters
num_unique_chars = len(non_punctuation_chars)

print("Number of unique characters:", num_unique_chars)

# Sanity check
assert num_unique_chars == 27
Number of unique characters: 27

In this chapter, we discussed Python’s dictionary and set collections. They are both unorder, mutable and do not allow duplicates.

We said what a dictionary is and presented several examples. We showed the syntax of key–value pairs and showed how to use them to create dictionaries with comma-separated lists of key–value pairs in curly braces, {}. You also created dictionaries with dictionary comprehensions. You used square brackets, [], to retrieve the value corresponding to a key, and to insert and update key–value pairs. You also used the dictionary method update to change a key’s associated value. You iterated through a dictionary’s keys, values and items.

You created sets of unique immutable values. You combined sets with set operators and methods, changed sets’ values with the mutable set operations and created sets with set comprehensions. You saw that sets are mutable.

A short comparison of the containers is shown below:

Feature List Tuple Dictionary Set
Mutable (Can be modified in place) Yes No Yes (keys are immutable) Yes
Iterable (Can be use in for loop) Yes Yes Yes Yes
Ordered (Can access by index, slicing) Yes Yes No No
Duplicate Values Allowed Allowed Not in keys Not allowed