Sets and Maps



1. Introduction

In this set of notes, we’ll talk about two new abstract data types as well as their Java implementations: Sets and Maps (also called dictionaries). You’ve seen both of these before, as they exist in Python.

2. Sets

2.1. Overview

A set is a collection of elements typically used in situations where you want to quickly determine if something is present in the set.

A set has two, main properties:

  1. It is non-sequenced. This means that, unlike an array, the elements in the set do not have indexes. You can’t access the n-th item in a set, because the set items are not sequenced.

  2. No duplicates are allowed. An item can only appear once in the set.

2.2. Sets in Java

https://docs.oracle.com/javase/8/docs/api/java/util/Set.html

Set in Java is an interface. That interface is also a Collection, so it has the following:

There are two main classes available that implement Set: TreeSet and HashSet. The names imply how they are built.

HashSet

https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html

A HashSet implements a Set using a hash table. Items are hashed into buckets, and if needed the table is periodically enlarged to maintain performance. On average, operations on a HashSet are all \(O(1)\). For the most part, if you want a set, this is the implementation you should use.

TreeSet

https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html

A TreeSet implements a Set using a tree data structure. This means that add, contains, and remove are all \(O(\textrm{log }N)\). Why would you ever use this instead of a HashSet? With a TreeSet, the items are stored in order. That means that you can iterate through them (using the Iterator) and they will be in order. (If you don’t need an order, then just use a HashSet.)

3. Maps

3.1. Overview

A map, also called a dictionary, is used to map keys to values.

A map has some important properties:

  1. It is a collection of key-value pairs. (Each key is associated with one value.)

  2. Keys must be unique. (You can’t insert two things with the same key.)

At first glance, these properties are limiting: What if I want to associate more than one thing with a key? For example, imagine I’m building a music manager and want to use a map to link the playlist name with the songs in the list. The key is the playlist name, but it can only map to one thing. So how do I associate it with more than one song? Easy, I put those songs in a list (or a set), and associate the list with the key.

3.2. Maps in Java

https://docs.oracle.com/javase/8/docs/api/java/util/Map.html

Map is a Java interface. It has the following methods:

Much like sets, two main classes implement Map: HashMap and TreeMap.

HashMap

https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html

A HashMap implements a Map using a hash table. The key is hashed and the key-value pair is stored in the appropriate bucket and if needed the table is periodically enlarged to maintain performance.

The main operations average \(O(1)\) performance due to the hashtable. If you need a map, you probably want to use a HashMap.

TreeMap

https://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html

You are already intimately familiar with a TreeMap: You’ve built one. The KeyValueBinarySearchTree you constructed for homework 6 is very similar to a TreeMap. There are a few differences, however:

  • A TreeMap doesn’t use a binary search tree, instead, it uses something called a red-black tree. (As a side note, TreeSet also uses a red-black tree.)

  • Due to the red-black tree, the tree always stays balanced, providing guaranteed complexities.

The main operations are \(O(\textrm{log }N)\) due to the tree. Why would you use a TreeMap instead of HashMap? Much like sets, it is about order. You can iterate through the keys of a TreeMap in order. The keys of a HashMap are not usefully ordered.