CIS 1051 - Temple Rome Spring 2023

Intro to Problem Solving and Programming in Python

Dictionaries

Prof. Andrea Gallegati

( tuj81353@temple.edu )

A Dictionary Is a Mapping

A dictionary is like a list, but more general.

In a list, the indices have to be integers; in a dictionary they can be (almost) any type.

A dictionary contains a collection of indices (called keys), each associated with a single value.

Each association of a key and a value is called a key-value pair, or sometimes an item.

In mathematical language, a dictionary represents a mapping from keys to values.

As an example, we will build a dictionary that maps English words to Spanish words.

The function dict creates a new dictionary with no items. Because dict is the name of a built-in function, avoid using it as a variable name!

In [6]:
eng2sp = dict()
eng2sp
Out[6]:
{}
  • The curly brackets {} are an empty dictionary.
  • Use square brackets [] to add items to the dictionary.
In [8]:
eng2sp['one'] = 'uno'

This item maps the key 'one' to the value 'uno'.

When we print the dictionary, a colon separates the key from the value in each key-value pair.

In [9]:
eng2sp
Out[9]:
{'one': 'uno'}

This output format is also an input format. For example, we can create a new dictionary with three items:

In [10]:
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

Printing it again, we might be surprised:

In [11]:
eng2sp
Out[11]:
{'one': 'uno', 'three': 'tres', 'two': 'dos'}

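The order of the key-value pairs shown is not the order in which we wrote them.

In Python versions before 3.7, the order of the items was unpredictable and could differ from one computer to another. Since Python 3.7, a dictionary remembers the order in which its items were inserted, although some display tools, such as the pprint module, sort the keys by default.

Either way, this is not an issue: the elements of a dictionary are never indexed with integer indices.

Instead, we use the keys to look up the corresponding values: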

In [12]:
eng2sp['two']
Out[12]:
'dos'

The key 'two' always maps to the value 'dos', no matter what the order of the items is.

If the key is not in the dictionary, we get an exception:

In [13]:
eng2sp['four']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-13-ab2e10594e13> in <module>()
----> 1 eng2sp['four']

KeyError: 'four'
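When a missing key is an expected situation rather than a bug, one option is to catch the exception. The following is a minimal sketch, assuming the eng2sp dictionary from above; the helper name translate and the placeholder string '(unknown)' are illustrative assumptions, not part of the lecture.

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

def translate(word):
    """Return the Spanish word for `word`, or a placeholder if it is unknown.

    Hypothetical helper, used only for illustration.
    """
    try:
        return eng2sp[word]
    except KeyError:
        # The key is not in the dictionary: fall back to a placeholder.
        return '(unknown)'

print(translate('two'))
print(translate('four'))
```

The lookup of 'two' succeeds as before, while the lookup of 'four' no longer stops the program.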

The len function returns the number of key-value pairs in a dictionary:

In [14]:
len(eng2sp)
Out[14]:
3

The in operator tells whether something appears as a key (not as a value) in the dictionary:

In [15]:
'one' in eng2sp
Out[15]:
True
In [16]:
'uno' in eng2sp
Out[16]:
False

To see whether something appears as a value in a dictionary, combine:

  • the values method, which returns a collection of the values
  • the in operator
In [17]:
vals = eng2sp.values()
'uno' in vals
Out[17]:
True

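The in operator uses different algorithms for lists and dictionaries.

  • Lists: Python traverses the elements in order, so the search time grows with the length of the list.
  • Dictionaries: Python uses a hashtable, so the search time is about the same no matter how many items the dictionary has.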
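To see why the search time for a list varies, here is a sketch of the kind of search the in operator performs on a list, written as a hypothetical function linear_search_steps that also counts how many elements it examines. It is only an illustration under that assumption, not Python's actual implementation.

```python
def linear_search_steps(t, target):
    """Search list t for target; return (found, number of elements examined)."""
    steps = 0
    for element in t:
        steps += 1
        if element == target:
            return True, steps
    return False, steps

words = ['one', 'two', 'three', 'four', 'five']
print(linear_search_steps(words, 'one'))    # found at the first position
print(linear_search_steps(words, 'five'))   # found at the last position
print(linear_search_steps(words, 'six'))    # not found: every element examined
```

The number of steps depends on where the target is, and on whether it is there at all. A dictionary avoids this scan by using the hash value of the key to find where the item is stored.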

Dictionary as a Collection of Counters

Suppose we are given a string and want to count how many times each letter appears. There are several ways to do it:

  • create 26 variables, one counter for each letter, and increment the matching one
  • create a list with 26 elements, map each letter to an index, and increment the corresponding element
  • create a dictionary with letters as keys and counters as values, and increment the value for each letter we meet

Same computation, but different implementations.

Some implementations are better than others.
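For comparison, the list-based option might look like the following sketch. The function name histogram_list is hypothetical, and the sketch assumes that only the lowercase letters 'a' to 'z' are counted; it uses the built-in function ord to map each letter to an index.

```python
def histogram_list(s):
    """Count lowercase letters 'a'..'z' using a list of 26 counters."""
    counts = [0] * 26
    for c in s:
        if 'a' <= c <= 'z':
            # 'a' maps to index 0, 'b' to index 1, and so on.
            index = ord(c) - ord('a')
            counts[index] += 1
    return counts

counts = histogram_list('brontosaurus')
print(counts[ord('o') - ord('a')])   # counter for the letter 'o'
print(counts[ord('z') - ord('a')])   # counter for the letter 'z'
```

Note that the list always holds 26 counters, including those for letters that never appear in the string.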

An advantage of the dictionary implementation is that we do not need to know ahead of time which letters appear in the string: we only make room for the letters that do appear:

In [18]:
def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

The name of the function is histogram, a statistical term for a collection of counters (or frequencies).

For each character c in the string:

  • if c is not in the dictionary, create a new key-value pair with the key c and the initial value 1
  • if c is already in the dictionary, increment d[c]
In [29]:
h = histogram('brontosaurus')
h
Out[29]:
{'a': 1, 'b': 1, 'n': 1, 'o': 2, 'r': 2, 's': 2, 't': 1, 'u': 2}

The get method takes

  • a key
  • a default value

If the key appears in the dictionary, get returns the corresponding value:

In [21]:
h = histogram('a')
h.get('a', 0)
Out[21]:
1

Otherwise, it returns the default value:

In [22]:
h.get('b', 0)
Out[22]:
0

Using get, we can write histogram more concisely:

In [28]:
def histogram(s): 
    d = dict() 
    for c in s: 
        d[c] = d.get(c, 0) + 1 
    return d

Looping and Dictionaries

A for loop over a dictionary traverses its keys. For example, print_hist prints each key and the corresponding value:

In [30]:
def print_hist(h):
    for c in h:
        print(c, h[c])
In [31]:
h = histogram('parrot')
print_hist(h)
t 1
p 1
a 1
r 2
o 1

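Again, the keys are in no particular order in this output (since Python 3.7 they appear in the order in which they were inserted).

To traverse the keys in sorted order, we can use the built-in function sorted:

In [32]:
def print_hist(h):
    for key in sorted(h):
        print(key, h[key])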

In [33]:
print_hist(h)
a 1
o 1
p 1
r 2
t 1

Reverse Lookup

Given a dictionary d and a key k, it is easy to find the corresponding value (this operation is called a lookup):

v = d[k]

What about a reverse lookup, where we have the value v and want to find a key k that maps to it?

  • first, more than one key might map to that value
  • second, there is no simple syntax for a reverse lookup: we have to search

This function returns the first key that maps into a given value:

In [34]:
def reverse_lookup(d, v):
    for k in d:
        if d[k] == v:
            return k
    raise LookupError()

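The raise statement causes an exception; here it raises LookupError, a built-in exception used to indicate that a lookup operation failed.

If we reach the end of the loop, v does not appear in the dictionary as a value, so we raise the exception.

Here is a successful reverse lookup: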

In [35]:
h = histogram('parrot')
k = reverse_lookup(h, 2)
k
Out[35]:
'r'

... and an unsuccessful one:

In [36]:
k = reverse_lookup(h, 3)
---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
<ipython-input-36-8e509d336cbb> in <module>()
----> 1 k = reverse_lookup(h, 3)

<ipython-input-34-7ad83190a369> in reverse_lookup(d, v)
      3         if d[k] == v:
      4             return k
----> 5     raise LookupError()

LookupError: 

When we raise an exception ourselves, the effect is the same as when Python raises one: it prints

  • a traceback
  • an error message

The exception we raise can take a detailed error message as an optional argument:

In [37]:
raise LookupError('value does not appear in the dictionary')
---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
<ipython-input-37-126aa63df058> in <module>()
----> 1 raise LookupError('value does not appear in the dictionary')

LookupError: value does not appear in the dictionary
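A caller that expects the lookup to fail sometimes can catch the exception and read its message. The sketch below is an assumption about how reverse_lookup might be combined with a detailed message and a try statement; the dictionary h is written out literally (it holds the letter counts of 'parrot') so that the example is self-contained.

```python
def reverse_lookup(d, v):
    """Return the first key in d that maps to v, or raise LookupError."""
    for k in d:
        if d[k] == v:
            return k
    raise LookupError('value does not appear in the dictionary')

h = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}

try:
    k = reverse_lookup(h, 3)
except LookupError as err:
    # str(err) is the message passed to LookupError above.
    message = str(err)
    print('lookup failed:', message)
```

Here the program keeps running after the failed lookup, and the message tells us what went wrong.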

A reverse lookup is much slower than a forward lookup, because it has to search through the items one by one.

If we have to do it often, or if the dictionary gets big, the performance of the program will suffer.
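Because more than one key might map to the same value, a variant of reverse_lookup could return all the matching keys instead of only the first one. This is a minimal sketch: the name reverse_lookup_all is hypothetical, it returns an empty list instead of raising an exception, and the dictionary h is written out literally (it holds the letter counts of 'parrot').

```python
def reverse_lookup_all(d, v):
    """Return a list of every key in d that maps to the value v."""
    keys = []
    for k in d:
        if d[k] == v:
            keys.append(k)
    return keys

h = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
print(reverse_lookup_all(h, 2))
print(sorted(reverse_lookup_all(h, 1)))
print(reverse_lookup_all(h, 3))
```

Like reverse_lookup, this variant has to visit every item, so it is just as slow on a big dictionary.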

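Dictionaries and Lists

Lists can appear as values in a dictionary.

For example, given a dictionary that maps letters to frequencies, we might want to invert it, that is, to build a dictionary that maps frequencies to letters:

  • the frequencies become the keys
  • the letters become the values

Since several letters might have the same frequency, each value in the inverted dictionary must be a list of letters.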

This function inverts a dictionary:

In [40]:
def invert_dict(d):
    inverse = dict()
    for key in d:
        val = d[key]
        if val not in inverse:
            inverse[val] = [key]
        else:
            inverse[val].append(key)
    return inverse

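Each time through the loop:

  • key gets a key from d
  • val gets the corresponding value
  • if val is not in inverse, we have not seen it before: we create a new item and initialize it with a singleton (a list that contains a single element)
  • otherwise, we have seen this value before: we append the corresponding key to the list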
In [38]:
hist = histogram('parrot')
hist
Out[38]:
{'a': 1, 'o': 1, 'p': 1, 'r': 2, 't': 1}
In [41]:
inverse = invert_dict(hist)
inverse
Out[41]:
{1: ['t', 'p', 'a', 'o'], 2: ['r']}
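The if/else inside invert_dict can also be written with the dictionary method setdefault, which returns the value stored under a key and, if the key is missing, first stores a default there. The following is a sketch of that alternative, not the version used in the lecture; the name invert_dict_setdefault is hypothetical and hist is written out literally so that the example is self-contained.

```python
def invert_dict_setdefault(d):
    """Invert d, mapping each value to the list of keys that had it."""
    inverse = dict()
    for key in d:
        # setdefault returns inverse[d[key]] if it exists; otherwise it
        # first stores an empty list under that key and returns it.
        inverse.setdefault(d[key], []).append(key)
    return inverse

hist = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
inverse = invert_dict_setdefault(hist)
print(inverse)
```

Both versions build the same mapping from frequencies to lists of letters; which one reads better is a matter of taste.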

The state diagrams of hist and inverse (not reproduced here) draw the lists outside the box of the dictionary, to keep the diagram simple.

Lists can be values in a dictionary, but they cannot be keys.

In [42]:
t = [1, 2, 3]
d = dict()
d[t] = 'oops'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-487ac70bbf34> in <module>()
      1 t = [1, 2, 3]
      2 d = dict()
----> 3 d[t] = 'oops'

TypeError: unhashable type: 'list'

Dictionaries are implemented using hashtables, so the keys have to be hashable.

A hash is a function:

  • taking a value (of any kind)
  • returning an integer.

Dictionaries use these integers, called hash values, to store and look up key-value pairs.
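Python exposes this function as the built-in hash. The short sketch below only illustrates its behavior; the actual integers are not shown because, for strings, they can change from one run of Python to the next.

```python
# hash returns an integer for hashable values such as strings and tuples.
print(isinstance(hash('one'), int))

# Equal values always have equal hash values.
same_hash = hash((1, 2, 3)) == hash((1, 2, 3))
print(same_hash)

# A list is mutable, so Python refuses to hash it.
try:
    hash([1, 2, 3])
    list_error = None
except TypeError as err:
    list_error = str(err)
print(list_error)
```

The last message is the same one we get when we try to use a list as a key.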

This system works fine if the keys are immutable.

If the keys were mutable, like lists, bad things would happen:

  • Python hashes the key to decide where to store the key-value pair.
  • If we then modify the key and hash it again, it goes to a different location.
  • The dictionary would not work correctly: we might end up with two entries for the same key, or be unable to find a key.

That’s why mutable types like lists aren’t hashable, while tuples are (as long as all their elements are hashable).

Since dictionaries are mutable, they can’t be used as keys, but they can be used as values!
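These rules can be checked with a small experiment. The names grid and languages and their contents are made up for illustration.

```python
# A tuple of hashable elements is hashable, so it can be a key.
grid = dict()
grid[(0, 0)] = 'origin'
grid[(1, 2)] = 'target'
print(grid[(1, 2)])

# A dictionary can be a value inside another dictionary...
languages = {'spanish': {'one': 'uno', 'two': 'dos'}}
print(languages['spanish']['two'])

# ...but it cannot be a key, and neither can a tuple that contains a list.
errors = []
for bad_key in ({'one': 'uno'}, ([1, 2], 3)):
    try:
        grid[bad_key] = 'oops'
    except TypeError as err:
        errors.append(str(err))
print(errors)
```

Both failed assignments raise a TypeError, and grid still contains only the two items with tuple keys.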