Algorithms Illuminated (Part 2): Graph Algorithms and Data Structures
Tim Roughgarden
Algorithms are the heart and soul of computer science. Their applications range from network routing and computational genomics to public-key cryptography and machine learning. Studying algorithms can make you a better programmer, a clearer thinker, and a master of technical interviews. Algorithms Illuminated is an accessible introduction to the subject for anyone with at least a little programming experience. The exposition emphasizes the big picture and conceptual understanding over low-level implementation and mathematical details, like a transcript of what an expert algorithms tutor would say over a series of one-on-one lessons. Part 2 covers graph search and applications, shortest paths, and the usage and implementation of several data structures (heaps, search trees, hash tables, and bloom filters).
Year: 2018
Publisher: Soundlikeyourself Publishing, LLC
Language: English
Pages: 222 / 221
ISBN 10: 0999282921
ISBN 13: 9780999282922
Algorithms Illuminated
Part 2: Graph Algorithms and Data Structures
Tim Roughgarden

© 2018 by Tim Roughgarden
All rights reserved. No portion of this book may be reproduced in any form without permission from the publisher, except as permitted by U.S. copyright law.
First Edition
Cover image: Untitled, by Nick Terry
ISBN: 9780999282922 (Paperback)
ISBN: 9780999282939 (ebook)
Library of Congress Control Number: 2017914282
Soundlikeyourself Publishing, LLC
San Francisco, CA
soundlikeyourselfpublishing@gmail.com
www.algorithmsilluminated.org

In memory of James Wesley Shean (1921–2010)

Contents

Preface  vii

7  Graphs: The Basics  1
   7.1  Some Vocabulary  1
   7.2  A Few Applications  2
   7.3  Measuring the Size of a Graph  3
   7.4  Representing a Graph  7
   Problems  13

8  Graph Search and Its Applications  15
   8.1  Overview  15
   8.2  Breadth-First Search and Shortest Paths  25
   8.3  Computing Connected Components  34
   8.4  Depth-First Search  40
   8.5  Topological Sort  45
   *8.6  Computing Strongly Connected Components  54
   8.7  The Structure of the Web  66
   Problems  71

9  Dijkstra's Shortest-Path Algorithm  76
   9.1  The Single-Source Shortest Path Problem  76
   9.2  Dijkstra's Algorithm  80
   *9.3  Why Is Dijkstra's Algorithm Correct?  83
   9.4  Implementation and Running Time  89
   Problems  91

10  The Heap Data Structure  95
   10.1  Data Structures: An Overview  95
   10.2  Supported Operations  98
   10.3  Applications  101
   10.4  Speeding Up Dijkstra's Algorithm  106
   *10.5  Implementation Details  112
   Problems  123

11  Search Trees  126
   11.1  Sorted Arrays  126
   11.2  Search Trees: Supported Operations  129
   *11.3  Implementation Details  131
   *11.4  Balanced Search Trees  145
   Problems  149

12  Hash Tables and Bloom Filters  151
   12.1  Supported Operations  151
   12.2  Applications  154
   *12.3  Implementation: High-Level Ideas  159
   *12.4  Further Implementation Details  173
   12.5  Bloom Filters: The Basics  178
   *12.6  Bloom Filters: Heuristic Analysis  184
   Problems  190

C  Quick Review of Asymptotic Notation  193
   C.1  The Gist  193
   C.2  Big-O Notation  194
   C.3  Examples  195
   C.4  Big-Omega and Big-Theta Notation  197

Solutions to Selected Problems  200

Index  203

Preface

This book is the second of a four-part series based on my online algorithms courses that have been running regularly since 2012, which in turn are based on an undergraduate course that I've taught many times at Stanford University. The first part of the series is not a prerequisite for this one, and this book should be accessible to any reader who has the background described in the "Who Are You?" section below and is familiar with asymptotic notation (which is reviewed in Appendix C).

What We'll Cover

Algorithms Illuminated, Part 2 provides an introduction to and basic literacy in the following three topics.

Graph search and applications. Graphs model many different types of networks, including road networks, communication networks, social networks, and networks of dependencies between tasks. Graphs can get complex, but there are several blazingly fast primitives for reasoning about graph structure. We begin with linear-time algorithms for searching a graph, with applications ranging from network analysis to task sequencing.

Shortest paths.
In the shortest-path problem, the goal is to compute the best route in a network from point A to point B. The problem has obvious applications, like computing driving directions, and also shows up in disguise in many more general planning problems. We'll generalize one of our graph search algorithms and arrive at Dijkstra's famous shortest-path algorithm.

Data structures. This book will make you an educated client of several different data structures for maintaining an evolving set of objects with keys. The primary goal is to develop your intuition about which data structure is the right one for your application. The optional advanced sections provide guidance in how to implement these data structures from scratch. We first discuss heaps, which can quickly identify the stored object with the smallest key and are useful for sorting, implementing a priority queue, and implementing Dijkstra's algorithm in near-linear time. Search trees maintain a total ordering over the keys of the stored objects and support an even richer array of operations. Hash tables are optimized for super-fast lookups and are ubiquitous in modern programs. We'll also cover the bloom filter, a close cousin of the hash table that uses less space at the expense of occasional errors.

For a more detailed look into the book's contents, check out the "Upshot" sections that conclude each chapter and highlight the most important points. The starred sections of the book are the most advanced ones. The time-constrained reader can skip these on a first reading without loss of continuity.

Topics covered in the other three parts. Algorithms Illuminated, Part 1 covers asymptotic notation (big-O notation and its close cousins), divide-and-conquer algorithms and the master method, randomized QuickSort and its analysis, and linear-time selection algorithms.
Part 3 focuses on greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, shortest paths, optimal search trees). Part 4 is all about NP-completeness, what it means for the algorithm designer, and strategies for coping with computationally intractable problems, including the analysis of heuristics and local search.

Skills You'll Learn

Mastering algorithms takes time and effort. Why bother?

Become a better programmer. You'll learn several blazingly fast subroutines for processing data as well as several useful data structures for organizing data that you can deploy directly in your own programs. Implementing and using these algorithms will stretch and improve your programming skills. You'll also learn general algorithm design paradigms that are relevant for many different problems across different domains, as well as tools for predicting the performance of such algorithms. These "algorithmic design patterns" can help you come up with new algorithms for problems that arise in your own work.

Sharpen your analytical skills. You'll get lots of practice describing and reasoning about algorithms. Through mathematical analysis, you'll gain a deep understanding of the specific algorithms and data structures covered in these books. You'll acquire facility with several mathematical techniques that are broadly useful for analyzing algorithms.

Think algorithmically. After you learn about algorithms, it's hard to not see them everywhere, whether you're riding an elevator, watching a flock of birds, managing your investment portfolio, or even watching an infant learn. Algorithmic thinking is increasingly useful and prevalent in disciplines outside of computer science, including biology, statistics, and economics.

Literacy with computer science's greatest hits. Studying algorithms can feel like watching a highlight reel of many of the greatest hits from the last sixty years of computer science.
No longer will you feel excluded at that computer science cocktail party when someone cracks a joke about Dijkstra's algorithm. After reading these books, you'll know exactly what they mean.

Ace your technical interviews. Over the years, countless students have regaled me with stories about how mastering the concepts in these books enabled them to ace every technical interview question they were ever asked.

How These Books Are Different

This series of books has only one goal: to teach the basics of algorithms in the most accessible way possible. Think of them as a transcript of what an expert algorithms tutor would say to you over a series of one-on-one lessons. There are a number of excellent more traditional and encyclopedic textbooks on algorithms, any of which usefully complement this book series with additional details, problems, and topics. I encourage you to explore and find your own favorites. There are also several books that, unlike these books, cater to programmers looking for ready-made algorithm implementations in a specific programming language. Many such implementations are freely available on the Web as well.

Who Are You?

The whole point of these books and the online courses upon which they are based is to be as widely and easily accessible as possible. People of all ages, backgrounds, and walks of life are well represented in my online courses, and there are large numbers of students (high-school, college, etc.), software engineers (both current and aspiring), scientists, and professionals hailing from all corners of the world. This book is not an introduction to programming, and ideally you've acquired basic programming skills in a standard language (like Java, Python, C, Scala, Haskell, etc.). For a litmus test, check out Section 8.2—if it makes sense, you'll be fine for the rest of the book. If you need to beef up your programming skills, there are several outstanding free online courses that teach basic programming.
We also use mathematical analysis as needed to understand how and why algorithms really work. The freely available book Mathematics for Computer Science, by Eric Lehman, F. Thomson Leighton, and Albert R. Meyer is an excellent and entertaining refresher on mathematical notation (like Σ and ∀), the basics of proofs (induction, contradiction, etc.), discrete probability, and much more.

Additional Resources

These books are based on online courses that are currently running on the Coursera and Stanford Lagunita platforms. I've made several resources available to help you replicate as much of the online course experience as you like.

Videos. If you're more in the mood to watch and listen than to read, check out the YouTube video playlists available from www.algorithmsilluminated.org. These videos cover all of the topics of this book series, as well as additional advanced topics. I hope they exude a contagious enthusiasm for algorithms that, alas, is impossible to replicate fully on the printed page.

Quizzes. How can you know if you're truly absorbing the concepts in this book? Quizzes with solutions and explanations are scattered throughout the text; when you encounter one, I encourage you to pause and think about the answer before reading on.

End-of-chapter problems. At the end of each chapter you'll find several relatively straightforward questions for testing your understanding, followed by harder and more open-ended challenge problems. Solutions to problems that are marked with an "(S)" appear at the end of the book. Readers can interact with me and each other about the remaining end-of-chapter problems through the book's discussion forum (see below).

Programming problems. Most of the chapters conclude with a suggested programming project, whose goal is to help you develop a detailed understanding of an algorithm by creating your own working implementation of it.
Data sets, along with test cases and their solutions, can be found at www.algorithmsilluminated.org.

Discussion forums. A big reason for the success of online courses is the opportunities they provide for participants to help each other understand the course material and debug programs through discussion forums. Readers of these books have the same opportunity, via the forums available at www.algorithmsilluminated.org.

Acknowledgments

These books would not exist without the passion and hunger supplied by the hundreds of thousands of participants in my algorithms courses over the years, both on campus at Stanford and on online platforms. I am particularly grateful to those who supplied detailed feedback on an earlier draft of this book: Tonya Blust, Yuan Cao, Jim Humelsine, Vladimir Kokshenev, Bayram Kuliyev, Patrick Monkelban, and Daniel Zingaro.

I always appreciate suggestions and corrections from readers. These are best communicated through the discussion forums mentioned above.

Tim Roughgarden
London, United Kingdom
July 2018

Chapter 7
Graphs: The Basics

This short chapter explains what graphs are, what they are good for, and the most common ways to represent them in a computer program. The next two chapters cover a number of famous and useful algorithms for reasoning about graphs.

7.1 Some Vocabulary

When you hear the word "graph," you probably think about an x-axis, a y-axis, and so on (Figure 7.1(a)). To an algorithms person, a graph can also mean a representation of the relationships between pairs of objects (Figure 7.1(b)).

Figure 7.1 [(a) a plot of f(n) = n and f(n) = log n; (b) a diagram of objects connected by links]: A graph (to most of the world) versus a graph (in algorithms). In algorithms, a graph is a representation of a set of objects (such as people) and the pairwise relationships between them (such as friendships).

The second type of graph has two ingredients—the objects being represented, and their pairwise relationships.
The former are called the vertices (singular: vertex) or the nodes of the graph. (Footnote 1: Having two names for the same thing can be annoying, but both terms are in widespread use and you should be familiar with them. For the most part, we'll stick with "vertices" throughout this book series.) The pairwise relationships translate to the edges of the graph. We usually denote the vertex and edge sets of a graph by V and E, respectively, and sometimes write G = (V, E) to mean the graph G with vertices V and edges E.

There are two flavors of graphs, directed and undirected. Both types are important and ubiquitous in applications, so you should know about both of them. In an undirected graph, each edge corresponds to an unordered pair {v, w} of vertices, which are called the endpoints of the edge (Figure 7.2(a)). In an undirected graph, there is no difference between an edge (v, w) and an edge (w, v). In a directed graph, each edge (v, w) is an ordered pair, with the edge traveling from the first vertex v (called the tail) to the second w (the head); see Figure 7.2(b). (Footnote 2: Directed edges are sometimes called arcs, but we won't use this terminology in this book series.)

Figure 7.2 [(a) an undirected graph and (b) a directed graph, each on the vertices s, v, w, t]: Graphs with four vertices and five edges. The edges of undirected and directed graphs are unordered and ordered vertex pairs, respectively.

7.2 A Few Applications

Graphs are a fundamental concept, and they show up all the time in computer science, biology, sociology, economics, and so on. Here are a few of the countless examples.

Road networks. When your smartphone's software computes driving directions, it searches through a graph that represents the road network, with vertices corresponding to intersections and edges corresponding to individual road segments.

The World Wide Web.
The Web can be modeled as a directed graph, with the vertices corresponding to individual Web pages, and the edges corresponding to hyperlinks, directed from the page containing the hyperlink to the destination page.

Social networks. A social network can be represented as a graph whose vertices correspond to individuals and edges to some type of relationship. For example, an edge could indicate a friendship between its endpoints, or that one of its endpoints is a follower of the other. Among the currently popular social networks, which ones are most naturally modeled as an undirected graph, and which ones as a directed graph? (There are interesting examples of both.)

Precedence constraints. Graphs are also useful in problems that lack an obvious network structure. For example, imagine that you have to complete a bunch of tasks, subject to precedence constraints—perhaps you're a first-year university student, planning which courses to take and in which order. One way to tackle this problem is to apply the topological sorting algorithm described in Section 8.5 to the following directed graph: there is one vertex for each course that your major requires, with an edge directed from course A to course B whenever A is a prerequisite for B.

7.3 Measuring the Size of a Graph

In this book, like in Part 1, we'll analyze the running time of different algorithms as a function of the input size. When the input is a single array, as for a sorting algorithm, there is an obvious way to define the "input size," as the array's length. When the input involves a graph, we must specify exactly how the graph is represented and what we mean by its "size."

7.3.1 The Number of Edges in a Graph

Two parameters control a graph's size—the number of vertices and the number of edges. Here is the most common notation for these quantities.

Notation for Graphs

For a graph G = (V, E) with vertex set V and edge set E:

• n = |V| denotes the number of vertices.
• m = |E| denotes the number of edges. (Footnote 3: For a finite set S, |S| denotes the number of elements in S.)

The next quiz asks you to think about how the number m of edges in an undirected graph can depend on the number n of vertices. For this question, we'll assume that there's at most one undirected edge between each pair of vertices—no "parallel edges" are allowed. We'll also assume that the graph is "connected." We'll define this concept formally in Section 8.3; intuitively, it means that the graph is "in one piece," with no way to break it into two parts without any edges crossing between the parts. The graphs in Figures 7.1(b) and 7.2(a) are connected, while the graph in Figure 7.3 is not.

Figure 7.3: An undirected graph that is not connected.

Quiz 7.1

Consider an undirected graph with n vertices and no parallel edges. Assume that the graph is connected, meaning "in one piece." What are the minimum and maximum numbers of edges, respectively, that the graph could have?
Informally, a graph is sparse if the number of edges is relatively close to linear in the number of vertices, and dense if this number is closer to quadratic in the number of vertices. For example, graphs with n vertices and O(n log n) edges are usually considered sparse, while those with ⌦(n2 / log n) edges are considered dense. “Partially dense” graphs, like those with ⇡ n3/2 edges, may be considered either sparse or dense, depending on the specific application. 7.3.3 Solution to Quiz 7.1 Correct answer: (a). In a connected undirected graph with n vertices and no parallel edges, the number m of edges is at least n 1 4 See Appendix C for a review of bigO, bigOmega, and bigTheta notation. If the graph need not be connected, there could be as few as zero edges. 6 If parallel edges are allowed, a graph with at least two vertices can have an arbitrarily large number of edges. 5 6 Graphs: The Basics and at most n(n 1)/2. To see why the lower bound is correct, consider a graph G = (V, E). As a thought experiment, imagine building up G one edge at a time, starting from the graph with vertices V and no edges. Initially, before any edges are added, each of the n vertices is completely isolated, so the graph trivially has n distinct “pieces.” Adding an edge (v, w) has the eﬀect of fusing the piece containing v with the piece containing w (Figure 7.4). Thus, each edge addition decreases the number of pieces by at most 1.7 To get down to a single piece from n pieces, you need to add at least n 1 edges. There are plenty of connected graphs that have n vertices and only n 1 edges—these are called trees (Figure 7.5). newly added edge Figure 7.4: Adding a new edge fuses the pieces containing its endpoints into a single piece. In this example, the number of diﬀerent pieces drops from three to two. (a) A path on four vertices (b) A star on four vertices Figure 7.5: Two connected undirected graphs with four vertices and three edges. 
The maximum number of edges in a graph with no parallel edges is achieved by the complete graph, with every possible edge present. (Footnote 7: If both endpoints of the edge are already in the same piece, the number of pieces doesn't decrease at all.) Because there are (n choose 2) = n(n − 1)/2 pairs of vertices in an n-vertex graph, this is also the maximum number of edges. (Footnote 8: This quantity is pronounced "n choose 2," and is also sometimes referred to as a "binomial coefficient." To see why the number of ways to choose an unordered pair of distinct objects from a set of n objects is n(n − 1)/2, think about choosing the first object (from the n options) and then a second, distinct object (from the n − 1 remaining options). The n(n − 1) resulting outcomes produce each pair (x, y) of objects twice (once with x first and y second, once with y first and x second), so there must be n(n − 1)/2 pairs in all.) For example, when n = 4, the maximum number of edges is (4 choose 2) = 6 (Figure 7.6).

Figure 7.6: The complete graph on four vertices has (4 choose 2) = 6 edges.

7.4 Representing a Graph

There is more than one way to encode a graph for use in an algorithm. In this book series, we'll work primarily with the "adjacency list" representation of a graph (Section 7.4.1), but you should also be aware of the "adjacency matrix" representation (Section 7.4.2).

7.4.1 Adjacency Lists

The adjacency list representation of graphs is the dominant one that we'll use in this book series.

Ingredients for Adjacency Lists

1. An array containing the graph's vertices.
2. An array containing the graph's edges.
3. For each edge, a pointer to each of its two endpoints.
4. For each vertex, a pointer to each of the incident edges.

The adjacency list representation boils down to two arrays (or linked lists, if you prefer): one for keeping track of the vertices, and one for the edges.
These two arrays cross-reference each other in the natural way, with each edge associated with pointers to its endpoints and each vertex with pointers to the edges for which it is an endpoint. For a directed graph, each edge keeps track of which endpoint is the tail and which endpoint is the head. Each vertex v maintains two arrays of pointers, one for the outgoing edges (for which v is the tail) and one for the incoming edges (for which v is the head). What are the memory requirements of the adjacency list representation?

Quiz 7.2

How much space does the adjacency list representation of a graph require, as a function of the number n of vertices and the number m of edges?

a) Θ(n)

b) Θ(m)

c) Θ(m + n)

d) Θ(n^2)

(See Section 7.4.4 for the solution and discussion.)

7.4.2 The Adjacency Matrix

Consider an undirected graph G = (V, E) with n vertices and no parallel edges, and label its vertices 1, 2, 3, ..., n. The adjacency matrix representation of G is a square n × n matrix A—equivalently, a two-dimensional array—with only zeroes and ones as entries. Each entry A_ij is defined as:

A_ij = 1 if edge (i, j) belongs to E, and A_ij = 0 otherwise.

Thus, an adjacency matrix maintains one bit for each pair of vertices, which keeps track of whether or not the edge is present (Figure 7.7).

Figure 7.7 [(a) a graph on vertices 1–4 and (b) its 4 × 4 adjacency matrix]: The adjacency matrix of a graph maintains one bit for each vertex pair, indicating whether or not there is an edge connecting the two vertices.

It's easy to add bells and whistles to the adjacency matrix representation of a graph:

• Parallel edges. If a graph can have multiple edges with the same pair of endpoints, then A_ij can be defined as the number of edges with endpoints i and j.

• Weighted graphs. Similarly, if each edge (i, j) has a weight w_ij—perhaps representing a cost or a distance—then each entry A_ij stores w_ij.

• Directed graphs.
For a directed graph G, each entry A_ij of the adjacency matrix is defined as:

A_ij = 1 if edge (i, j) belongs to E, and A_ij = 0 otherwise,

where "edge (i, j)" now refers to the edge directed from i to j. Every undirected graph has a symmetric adjacency matrix, while a directed graph usually has an asymmetric adjacency matrix.

What are the memory requirements of an adjacency matrix?

Quiz 7.3

How much space does the adjacency matrix of a graph require, as a function of the number n of vertices and the number m of edges?

a) Θ(n)

b) Θ(m)

c) Θ(m + n)

d) Θ(n^2)

(See Section 7.4.4 for the solution and discussion.)

7.4.3 Comparing the Representations

Confronted with two different ways to represent a graph, you're probably wondering: Which is better? The answer, as it so often is with such questions, is "it depends." First, it depends on the density of your graph—on how the number m of edges compares to the number n of vertices. The moral of Quizzes 7.2 and 7.3 is that the adjacency matrix is an efficient way to encode a dense graph but is wasteful for a sparse graph. Second, it depends on which operations you want to support. On both counts, adjacency lists make more sense for the algorithms and applications described in this book series. Most of our graph algorithms will involve exploring a graph. Adjacency lists are perfect for graph exploration—you arrive at a vertex, and the adjacency list immediately indicates your options for the next step. (Footnote 9: If you had access to only the adjacency matrix of a graph, how long would it take you to figure out which edges are incident to a given vertex?) Adjacency matrices do have their applications, but we won't see them in this book series.[10] Much of the modern-day interest in fast graph primitives is motivated by massive sparse networks. Consider, for example, the Web graph (Section 7.2), where vertices correspond to Web pages and directed edges to hyperlinks. It's hard to get an exact measurement of
the size of this graph, but a conservative lower bound on the number of vertices is 10 billion, or 10^10. (Footnote 10: For example, you can count the number of common neighbors of each pair of vertices in one fell swoop by squaring the graph's adjacency matrix.) Storing and reading through an array of this length already requires significant computational resources, but it is well within the limits of what modern computers can do. The size of the adjacency matrix of this graph, however, is proportional to 100 quintillion (10^20). This is way too big to store or process with today's technology. But the Web graph is sparse—the average number of outgoing edges from a vertex is well under 100. The memory requirements of the adjacency list representation of the Web graph are therefore proportional to 10^12 (a trillion). This may be too big for your laptop, but it's within the capabilities of state-of-the-art data-processing systems. (Footnote 11: For example, the essence of Google's original PageRank algorithm for measuring Web page importance relied on efficient search in the Web graph.)

7.4.4 Solutions to Quizzes 7.2–7.3

Solution to Quiz 7.2

Correct answer: (c). The adjacency list representation requires space linear in the size of the graph (meaning the number of vertices plus the number of edges), which is ideal. (Footnote 12: Caveat: The leading constant factor here is larger than that for the adjacency matrix by an order of magnitude.) Seeing this is a little tricky. Let's step through the four ingredients one by one. The vertex and edge arrays have lengths n and m, respectively, and so require Θ(n) and Θ(m) space. The third ingredient associates two pointers with each edge (one for each endpoint). These 2m pointers contribute an additional Θ(m) to the space requirement. The fourth ingredient might make you nervous. After all, each of the n vertices can participate in as many as n − 1 edges—one per other vertex—seemingly leading to a bound of Θ(n^2). This quadratic bound would be accurate in a very dense graph, but is overkill in sparser graphs. The key insight is: For every vertex-to-edge pointer in the fourth ingredient, there is a corresponding edge-to-vertex pointer in the third ingredient.
If the edge e is incident to the vertex v, then e has a pointer to its endpoint v, and, conversely, v has a pointer to the incident edge e. We conclude that the pointers in the third and fourth ingredients are in one-to-one correspondence, and so they require exactly the same amount of space, namely Θ(m). The final scorecard is:

vertex array: Θ(n)
edge array: Θ(m)
pointers from edges to endpoints: Θ(m)
pointers from vertices to incident edges: Θ(m)
total: Θ(m + n)

The bound of Θ(m + n) applies whether or not the graph is connected, and whether or not it has parallel edges. (Footnote 13: If the graph is connected, then m ≥ n − 1 (by Quiz 7.1), and we could write Θ(m) in place of Θ(m + n).)

Solution to Quiz 7.3

Correct answer: (d). The straightforward way to store an adjacency matrix is as an n × n two-dimensional array of bits. This uses Θ(n^2) space, albeit with a small hidden constant. For a dense graph, in which the number of edges is itself close to quadratic in n, the adjacency matrix requires space close to linear in the size of the graph. For sparse graphs, however, in which the number of edges is closer to linear in n, the adjacency matrix representation is highly wasteful.[14]

The Upshot

• A graph is a representation of the pairwise relationships between objects, such as friendships in a social network, hyperlinks between Web pages, or dependencies between tasks.

• A graph comprises a set of vertices and a set of edges. Edges are unordered in undirected graphs and ordered in directed graphs.

• A graph is sparse if the number of edges m is close to linear in the number of vertices n, and dense if m is close to quadratic in n.
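The four ingredients and this scorecard can be made concrete in code. Below is a minimal Python sketch; the `Graph` class and the 0-based vertex labels are illustrative assumptions, not the book's own implementation. Counting the vertex-to-edge pointers confirms the one-to-one correspondence with the 2m edge-to-vertex pointers:

```python
class Graph:
    """Adjacency-list representation built from the four ingredients."""

    def __init__(self, n, edge_list):
        self.vertices = list(range(n))   # ingredient 1: vertex array
        self.edges = list(edge_list)     # ingredient 2: edge array
        # Ingredient 3: each edge (v, w) already records its two endpoints.
        # Ingredient 4: for each vertex, the indices of its incident edges.
        self.incident = [[] for _ in range(n)]
        for i, (v, w) in enumerate(self.edges):
            self.incident[v].append(i)
            self.incident[w].append(i)

    def pointer_count(self):
        # Total vertex-to-edge pointers; always equals 2m, matching the
        # edge-to-vertex pointers from ingredient 3.
        return sum(len(lst) for lst in self.incident)

g = Graph(4, [(0, 1), (1, 2), (1, 3), (2, 3), (0, 2)])  # n = 4, m = 5
assert g.pointer_count() == 2 * len(g.edges)
```

Storage is one list entry per vertex, per edge, and per pointer, so Θ(m + n) overall, as the scorecard says.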
14 This waste can be reduced by using tricks for storing and manipulating sparse matrices, meaning matrices with lots of zeroes. For instance, Matlab and Python's SciPy package both support sparse matrix representations.

• The adjacency list representation of a graph maintains vertex and edge arrays, cross-referencing each other in the natural way, and requires space linear in the total number of vertices and edges.

• The adjacency matrix representation of a graph maintains one bit per pair of vertices to keep track of which edges are present, and requires space quadratic in the number of vertices.

• The adjacency list representation is the preferred one for sparse graphs, and for applications that involve graph exploration.

Test Your Understanding

Problem 7.1 (S) Let G = (V, E) be an undirected graph. By the degree of a vertex v ∈ V, we mean the number of edges in E that are incident to v (i.e., that have v as an endpoint).15 For each of the following conditions on the graph G, is the condition satisfied only by dense graphs, only by sparse graphs, or by both some sparse and some dense graphs? As usual, n = |V| denotes the number of vertices. Assume that n is large (say, at least 10,000).

a) At least one vertex of G has degree at most 10.

b) Every vertex of G has degree at most 10.

c) At least one vertex of G has degree n − 1.

d) Every vertex of G has degree n − 1.

Problem 7.2 (S) Consider an undirected graph G = (V, E) that is represented as an adjacency matrix. Given a vertex v ∈ V, how many operations are required to identify the edges incident to v? (Let k denote the number of such edges. As usual, n and m denote the number of vertices and edges, respectively.)

15 The abbreviation "i.e." stands for id est, and means "that is."

a) Θ(1)

b) Θ(k)

c) Θ(n)

d) Θ(m)

Problem 7.3 Consider a directed graph G = (V, E) represented with adjacency lists, with each vertex storing an array of its outgoing edges (but not its incoming edges).
Given a vertex v ∈ V, how many operations are required to identify the incoming edges of v? (Let k denote the number of such edges. As usual, n and m denote the number of vertices and edges, respectively.)

a) Θ(1)

b) Θ(k)

c) Θ(n)

d) Θ(m)

Chapter 8

Graph Search and Its Applications

This chapter is all about fundamental primitives for graph search and their applications. One very cool aspect of this material is that all the algorithms that we'll cover are blazingly fast (linear time with small constants), and it can be quite tricky to understand why they work! The culmination of this chapter—computing the strongly connected components of a directed graph with only two passes of depth-first search (Section 8.6)—vividly illustrates how fast algorithms often require deep insight into the problem structure. We begin with an overview section (Section 8.1), which covers some reasons why you should care about graph search, a general strategy for searching a graph without doing any redundant work, and a high-level introduction to the two most important search strategies, breadth-first search (BFS) and depth-first search (DFS). Sections 8.2 and 8.3 describe BFS in more detail, including applications to computing shortest paths and the connected components of an undirected graph. Sections 8.4 and 8.5 drill down on DFS and how to use it to compute a topological ordering of a directed acyclic graph (equivalently, to sequence tasks while respecting precedence constraints). Section 8.6 uses DFS to compute the strongly connected components of a directed graph in linear time. Section 8.7 explains how this fast graph primitive can be used to explore the structure of the Web.

8.1 Overview

This section provides a bird's-eye view of algorithms for graph search and their applications.

8.1.1 Some Applications

Why would we want to search a graph, or to figure out if a graph contains a path from point A to point B?
Here are a few of the many, many reasons.

Checking connectivity. In a physical network, such as a road network or a network of computers, an important sanity check is that you can get anywhere from anywhere else. That is, for every choice of a point A and a point B, there should be a path in the network from the former to the latter. Connectivity can also be important in abstract (non-physical) graphs that represent pairwise relationships between objects. One network that's fun to play with is the movie network, where vertices correspond to movie actors, and two actors are connected by an undirected edge whenever they appeared in the same movie.1 For example, how many "degrees of separation" are there between different actors? The most famous statistic of this type is the Bacon number, which is the minimum number of hops through the movie network needed to reach the fairly ubiquitous actor Kevin Bacon.2 So, Kevin Bacon himself has a Bacon number of 0, every actor who has appeared in a movie with Kevin Bacon has a Bacon number of 1, every actor who has appeared with an actor whose Bacon number is 1 but who is not Kevin Bacon himself has a Bacon number of 2, and so on. For example, Jon Hamm—perhaps best known as Don Draper from the cable television series Mad Men—has a Bacon number of 2. Hamm never appeared in a movie with Bacon, but he did have a bit part in the Colin Firth vehicle A Single Man, and Firth and Bacon co-starred in Atom Egoyan's Where the Truth Lies (Figure 8.1).3

Figure 8.1: A snippet of the movie network, showing that Jon Hamm's Bacon number is at most 2.
1 https://oracleofbacon.org/
2 The Bacon number is a riff on the older concept of the Erdös number, named after the famous mathematician Paul Erdös, which measures the number of degrees of separation from Erdös in the co-authorship graph (where vertices are researchers, and there is an edge between each pair of researchers who have co-authored a paper).
3 There are also lots of other two-hop paths between Bacon and Hamm.

Shortest paths. The Bacon number concerns the shortest path between two vertices of the movie network, meaning the path using the fewest number of edges. We'll see in Section 8.2 that a graph search strategy known as breadth-first search naturally computes shortest paths. Plenty of other problems boil down to a shortest-path computation, where the definition of "short" depends on the application (minimizing time for driving directions, or money for airline tickets, and so on). Dijkstra's shortest-path algorithm, the subject of Chapter 9, builds on breadth-first search to solve more general shortest-path problems.

Planning. A path in a graph need not represent a physical path through a physical network. More abstractly, a path is a sequence of decisions taking you from one state to another. Graph search algorithms can be applied to such abstract graphs to compute a plan for reaching a goal state from an initial state. For example, imagine you want to use an algorithm to solve a Sudoku puzzle. Think of the graph where vertices correspond to partially completed Sudoku puzzles (with some of the 81 squares blank, but no rules of Sudoku violated), and directed edges correspond to filling in one new entry of the puzzle (subject to the rules of Sudoku).
The problem of computing a solution to the puzzle is exactly the problem of computing a directed path from the vertex corresponding to the initial state of the puzzle to the vertex corresponding to the completed puzzle.4 For another example, using a robotic hand to grasp a coffee mug is essentially a planning problem. In the associated graph, vertices correspond to the possible configurations of the hand, and edges correspond to small and realizable changes in the configuration.

Connected components. We'll also see algorithms that build on graph search to compute the connected components (the "pieces") of a graph. Defining and computing the connected components of an undirected graph is relatively easy (Section 8.3). For directed graphs, even defining what a "connected component" should mean is a little subtle. Section 8.6 defines them and shows how to use depth-first search (Section 8.4) to compute them efficiently. We'll also see applications of depth-first search to sequencing tasks (Section 8.5) and to understanding the structure of the Web graph (Section 8.7).

4 Because this graph is too big to write down explicitly, practical Sudoku solvers incorporate some additional ideas.

8.1.2 For-Free Graph Primitives

The examples in Section 8.1.1 demonstrate that graph search is a fundamental and widely applicable primitive.
I'm happy to report that, in this chapter, all our algorithms will be blazingly fast, running in just O(m + n) time, where m and n denote the number of edges and vertices of the graph.5 That's just a constant factor larger than the amount of time required to read the input!6 We conclude that these algorithms are "for-free primitives"—whenever you have graph data, you should feel free to apply any of these primitives to glean information about what it looks like.7

For-Free Primitives

We can think of an algorithm with linear or near-linear running time as a primitive that we can use essentially "for free" because the amount of computation used is barely more than the amount required just to read the input. When you have a primitive relevant to your problem that is so blazingly fast, why not use it? For example, you can always compute the connected components of your graph data in a preprocessing step, even if you're not quite sure how it will help later. One of the goals of this book series is to stock your algorithmic toolbox with as many for-free primitives as possible, ready to be applied at will.

8.1.3 Generic Graph Search

The point of a graph search algorithm is to solve the following problem.

5 Also, the constants hidden in the big-O notation are reasonably small.
6 In graph search and connectivity problems, there is no reason to expect that the input graph is connected. In the disconnected case, where m might be much smaller than n, the size of a graph is Θ(m + n) but not necessarily Θ(m).
7 Can we do better? No, up to the hidden constant factor: every correct algorithm must at least read the entire input in some cases.

Problem: Graph Search

Input: An undirected or directed graph G = (V, E), and a starting vertex s ∈ V.

Goal: Identify the vertices of V reachable from s in G.

By a vertex v being "reachable," we mean that there is a sequence of edges in G that travels from s to v.
If G is a directed graph, all the path's edges should be traversed in the forward (outgoing) direction. For example, in Figure 8.2(a), the set of reachable vertices (from s) is {s, u, v, w}. In the directed version of the graph in Figure 8.2(b), there is no directed path from s to w, and only the vertices s, u, and v are reachable from s via a directed path.8

Figure 8.2: (a) An undirected graph; (b) a directed version. In (a), the set of vertices reachable from s is {s, u, v, w}. In (b), it is {s, u, v}.

The two graph search strategies that we'll focus on—breadth-first search and depth-first search—are different ways of instantiating a generic graph search algorithm. The generic algorithm systematically finds all the reachable vertices, taking care to avoid exploring anything twice. It maintains an extra variable with each vertex that keeps track of whether or not it has already been explored, planting a flag the first time that vertex is reached. The main loop's responsibility is to reach a new unexplored vertex in each iteration.

8 In general, most of the algorithms and arguments in this chapter apply equally well to undirected and directed graphs. The big exception is computing connected components, which is a trickier problem in directed graphs than in undirected graphs.

GenericSearch

Input: graph G = (V, E) and a vertex s ∈ V.
Postcondition: a vertex is reachable from s if and only if it is marked as "explored."

    mark s as explored, all other vertices as unexplored
    while there is an edge (v, w) ∈ E with v explored and w unexplored do
        choose some such edge (v, w)    // underspecified
        mark w as explored

The algorithm is essentially the same for both directed and undirected graphs. In the directed case, the edge (v, w) chosen in an iteration of the while loop should be directed from an explored vertex v to an unexplored vertex w.
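Since the book leaves implementations to the reader, here is one way the GenericSearch pseudocode above might look in Python. This is a sketch under my own assumptions (not the book's code): the graph is a dict mapping each vertex to a list of its neighbors (outgoing neighbors, in the directed case), and `generic_search` is a hypothetical helper name.

```python
def generic_search(graph, s):
    """Mark every vertex reachable from s as explored, following the
    GenericSearch pseudocode: repeatedly pick some edge (v, w) with
    v explored and w unexplored, and mark w as explored."""
    explored = {s}
    while True:
        # Collect the eligible frontier edges: v explored, w unexplored.
        eligible = [(v, w) for v in explored
                    for w in graph[v] if w not in explored]
        if not eligible:
            break  # no eligible edge left: all reachable vertices are marked
        v, w = eligible[0]  # "choose some such edge" -- underspecified
        explored.add(w)
    return explored
```

On the graph of Figure 8.2(a), encoded as `{'s': ['u', 'v'], 'u': ['s', 'w'], 'v': ['s', 'w'], 'w': ['u', 'v'], 'x': ['y'], 'y': ['x']}`, the call `generic_search(g, 's')` returns `{'s', 'u', 'v', 'w'}`, matching the text. Note that rescanning the frontier from scratch in each iteration is not linear time; BFS and DFS (Sections 8.2 and 8.4) achieve the O(m + n) bound by choosing the next eligible edge more cleverly.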
On Pseudocode

This book series explains algorithms using a mixture of high-level pseudocode and English (as above). I'm assuming that you have the skills to translate such high-level descriptions into working code in your favorite programming language. Several other books and resources on the Web offer concrete implementations of various algorithms in specific programming languages. The first benefit of emphasizing high-level descriptions over language-specific implementations is flexibility. While I assume familiarity with some programming language, I don't care which one. Second, this approach promotes the understanding of algorithms at a deep and conceptual level, unencumbered by low-level details. Seasoned programmers and computer scientists generally think and communicate about algorithms at a similarly high level. Still, there is no substitute for the detailed understanding of an algorithm that comes from providing your own working implementation of it. I strongly encourage you to implement as many of the algorithms in this book as you have time for. (It's also a great excuse to pick up a new programming language!) For guidance, see the end-of-chapter Programming Problems and supporting test cases.

For example, in the graph in Figure 8.2(a), initially only our home base s is marked as explored. In the first iteration of the while loop, two edges meet the loop condition: (s, u) and (s, v). The GenericSearch algorithm chooses one of these edges—(s, u), say—and marks u as explored. In the second iteration of the loop, there are again two choices: (s, v) and (u, w). The algorithm might choose (u, w), in which case w is marked as explored. With one more iteration (after choosing either (s, v) or (w, v)), v is marked as explored. At this point, the edge (x, y) has two unexplored endpoints and the other edges have two explored endpoints, and the algorithm halts.
As one would hope, the vertices marked as explored—s, u, v, and w—are precisely the vertices reachable from s. This generic graph search algorithm is underspecified, as multiple edges (v, w) can be eligible for selection in an iteration of the while loop. Breadth-first search and depth-first search correspond to two specific decisions about which edge to explore next. No matter how this choice is made, the GenericSearch algorithm is guaranteed to be correct (in both undirected and directed graphs).

Proposition 8.1 (Correctness of Generic Graph Search) At the conclusion of the GenericSearch algorithm, a vertex v ∈ V is marked as explored if and only if there is a path from s to v in G.

Section 8.1.5 provides a formal proof of Proposition 8.1; feel free to skip it if the proposition seems intuitively obvious.

On Lemmas, Theorems, and the Like

In mathematical writing, the most important technical statements are labeled theorems. A lemma is a technical statement that assists with the proof of a theorem (much as a subroutine assists with the implementation of a larger program). A corollary is a statement that follows immediately from an already-proved result, such as a special case of a theorem. We use the term proposition for stand-alone technical statements that are not particularly important in their own right.

What about the running time of the GenericSearch algorithm? The algorithm explores each edge at most once—after an edge (v, w) has been explored for the first time, both v and w are marked as explored and the edge will not be considered again. This suggests that it should be possible to implement the algorithm in linear time, as long as we can quickly identify an eligible edge (v, w) in each iteration of the while loop. We'll see how this works in detail for breadth-first search and depth-first search in Sections 8.2 and 8.4, respectively.
8.1.4 Breadth-First and Depth-First Search

Every iteration of the GenericSearch algorithm chooses an edge that is "on the frontier" of the explored part of the graph, with one endpoint explored and the other unexplored (Figure 8.3). There can be many such edges, and to specify the algorithm fully we need a method for choosing one of them. We'll focus on the two most important strategies: breadth-first search and depth-first search. Both are excellent ways to explore a graph, and each has its own set of applications.

Figure 8.3: Every iteration of the GenericSearch algorithm chooses an edge "on the frontier," with one endpoint explored and the other unexplored.

Breadth-first search (BFS). The high-level idea of breadth-first search—or BFS to its friends—is to explore the vertices of a graph cautiously, in "layers." Layer 0 consists only of the starting vertex s. Layer 1 contains the vertices that neighbor s, meaning the vertices v such that (s, v) is an edge of the graph (directed from s to v, in the case that G is directed). Layer 2 comprises the neighbors of layer-1 vertices that do not already belong to layer 0 or 1, and so on. In Sections 8.2 and 8.3, we'll see:

• how to implement BFS in linear time using a queue (first-in first-out) data structure;

• how to use BFS to compute (in linear time) the length of a shortest path between one vertex and all other vertices, with the layer-i vertices being precisely the vertices at distance i from s;

• how to use BFS to compute (in linear time) the connected components of an undirected graph.

Depth-first search (DFS). Depth-first search—DFS to its friends—is perhaps even more important. DFS employs a more aggressive strategy for exploring a graph, very much in the spirit of how you might explore a maze, going as deeply as you can and backtracking only when absolutely necessary.
In Sections 8.4–8.7, we'll see:

• how to implement DFS in linear time using either recursion or an explicit stack (last-in first-out) data structure;

• how to use DFS to compute (in linear time) a topological ordering of the vertices of a directed acyclic graph, a useful primitive for task sequencing problems;

• how to use DFS to compute (in linear time) the "strongly connected components" of a directed graph, with applications to understanding the structure of the Web.

8.1.5 Correctness of the GenericSearch Algorithm

We now prove Proposition 8.1, which states that at the conclusion of the GenericSearch algorithm with input graph G = (V, E) and starting vertex s ∈ V, a vertex v ∈ V is marked as explored if and only if there is a path from s to v in G. As usual, if G is a directed graph, the s ⇝ v path should also be directed, with all edges traversed in the forward direction. The "only if" direction of the proposition should be intuitively clear: The only way that the GenericSearch algorithm discovers new vertices is by following paths from s.9 The "if" direction asserts the less obvious fact that the GenericSearch algorithm doesn't miss anything—it finds every vertex that it could conceivably discover. For this direction, we'll use a proof by contradiction. Recall that in this type of proof, you assume the opposite of what you want to prove, and then build on this assumption with a sequence of logically correct steps that culminates in a patently false statement. Such a contradiction implies that the assumption can't be true, which proves the desired statement. So, assume that there is a path from s to v in the graph G, but the GenericSearch algorithm somehow misses it and concludes with the vertex v marked as unexplored. Let S ⊆ V denote the vertices of G marked as explored by the algorithm. The vertex s belongs to S (by the first line of the algorithm), and the vertex v does not (by assumption).
Because the s ⇝ v path travels from a vertex inside S to one outside S, at least one edge e of the path has one endpoint u in S and the other endpoint w outside S (with e directed from u to w in the case that G is directed); see Figure 8.4. But this, my friends, is impossible: The edge e would be eligible for selection in the while loop of the GenericSearch algorithm, and the algorithm would have explored at least one more vertex, rather than giving up! There's no way that the GenericSearch algorithm could have halted at this point, so we've reached a contradiction. This contradiction concludes the proof of Proposition 8.1. QED 10

9 If we wanted to be pedantic about it, we'd prove this direction by induction on the number of loop iterations.
10 "Q.e.d." is an abbreviation for quod erat demonstrandum, and means "that which was to be demonstrated." In mathematical writing, it is used at the end of a proof to mark its completion.

Figure 8.4: Proof of Proposition 8.1. As long as the GenericSearch algorithm has not yet discovered all the reachable vertices, there is an eligible edge along which it can explore further.

8.2 Breadth-First Search and Shortest Paths

Let's drill down on our first specific graph search strategy, breadth-first search.

8.2.1 High-Level Idea

Breadth-first search explores the vertices of a graph in layers, in order of increasing distance from the starting vertex. Layer 0 contains the starting vertex s and nothing else. Layer 1 is the set of vertices that are one hop away from s—that is, s's neighbors. These are the vertices that are explored immediately after s in breadth-first search. For example, in the graph in Figure 8.5, a and b are the neighbors of s and constitute layer 1. In general, the vertices in layer i are those that neighbor a vertex in layer i − 1 and that do not already belong to one of the layers 0, 1, 2, . . . , i − 1.
Breadth-first search explores all of the layer-i vertices immediately after completing its exploration of the layer-(i − 1) vertices. (Vertices not reachable from s do not belong to any layer.) For example, in Figure 8.5, the layer-2 vertices are c and d, as they neighbor layer-1 vertices but do not themselves belong to layer 0 or 1. (The vertex s is also a neighbor of a layer-1 vertex, but it already belongs to layer 0.) The last layer of the graph in Figure 8.5 comprises only the vertex e.

Figure 8.5: Breadth-first search discovers vertices in layers. The layer-i vertices are the neighbors of the layer-(i − 1) vertices that do not appear in any earlier layer.

Quiz 8.1

Consider an undirected graph with n ≥ 2 vertices. What are the minimum and maximum number of different layers that the graph could have, respectively?

a) 1 and n − 1

b) 2 and n − 1

c) 1 and n

d) 2 and n

(See Section 8.2.6 for the solution and discussion.)

8.2.2 Pseudocode for BFS

Implementing breadth-first search in linear time requires a simple "first-in first-out" data structure known as a queue. BFS uses a queue to keep track of which vertices to explore next. If you're unfamiliar with queues, now is a good time to read up on them in your favorite introductory programming book (or on Wikipedia). The gist is that a queue is a data structure for maintaining a list of objects, and you can remove stuff from the front or add stuff to the back in constant time.11

BFS

Input: graph G = (V, E) in adjacency-list representation, and a vertex s ∈ V.
Postcondition: a vertex is reachable from s if and only if it is marked as "explored."

    1  mark s as explored, all other vertices as unexplored
    2  Q := a queue data structure, initialized with s
    3  while Q is not empty do
    4      remove the vertex from the front of Q, call it v
    5      for each edge (v, w) in v's adjacency list do
    6          if w is unexplored then
    7              mark w as explored
    8              add w to the end of Q

Each iteration of the while loop explores one new vertex. In line 5, BFS iterates through all the edges incident to the vertex v (if G is undirected) or through all the outgoing edges from v (if G is directed).12 Unexplored neighbors of v are added to the end of the queue and are marked as explored; they will eventually be processed in later iterations of the algorithm.

8.2.3 An Example

Let's see how our pseudocode works for the graph in Figure 8.5, numbering the vertices in order of insertion into the queue (equivalently, in order of exploration). The starting vertex s is always the first to be explored. The first iteration of the while loop extracts s from the queue Q and the subsequent for loop examines the edges (s, a) and (s, b), in whatever order these edges appear in s's adjacency list. Because neither a nor b is marked as explored, both get inserted into the queue. Let's say that edge (s, a) came first and so a is inserted before b.

11 You may never need to implement a queue from scratch, as they are built in to most modern programming languages. If you do, you can use a doubly linked list. Or, if you have advance knowledge of the maximum number of objects that you might have to store (which is |V|, in the case of BFS), you can get away with a fixed-length array and a couple of indices (which keep track of the front and back of the queue).
12 This is the step where it's so convenient to have the input graph represented via adjacency lists.
The current state of the graph and the queue is now:

    [Queue Q, front to back: a, b (s already removed)]

The next iteration of the while loop extracts the vertex a from the front of the queue, and considers its incident edges (s, a) and (a, c). It skips over the former after double-checking that s is already marked as explored, and adds the (previously unexplored) vertex c to the end of the queue. The third iteration extracts the vertex b from the front of the queue and adds vertex d to the end (because s and c are already marked as explored, they are skipped over). The new picture is:

    [Queue Q, front to back: c, d (s, a, b already removed)]

In the fourth iteration, the vertex c is removed from the front of the queue. Of its neighbors, the vertex e is the only one not encountered before, and it is added to the end of the queue. The final two iterations extract d and then e from the queue, and verify that all of their neighbors have already been explored. The queue is then empty, and the algorithm halts. The vertices are explored in order of the layers, with the layer-i vertices explored immediately after the layer-(i − 1) vertices (Figure 8.6).

Figure 8.6: In breadth-first search, the layer-i vertices are explored immediately after the layer-(i − 1) vertices. (a) Order of exploration; (b) layers.

8.2.4 Correctness and Running Time

Breadth-first search discovers all the vertices reachable from the starting vertex, and it runs in linear time. The more refined running time bound in Theorem 8.2(c) below will come in handy for our linear-time algorithm for computing connected components (described in Section 8.3).
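Before the formal statement of BFS's properties, here is a direct Python rendering of the eight-line pseudocode above. It is a sketch, not the book's own code: I assume the graph is a dict of adjacency lists, and `collections.deque` plays the role of the queue (constant-time operations at both ends).

```python
from collections import deque

def bfs(graph, s):
    explored = {s}                  # line 1: mark s as explored
    q = deque([s])                  # line 2: queue Q, initialized with s
    while q:                        # line 3
        v = q.popleft()             # line 4: remove from the front of Q
        for w in graph[v]:          # line 5: scan v's adjacency list
            if w not in explored:   # line 6
                explored.add(w)     # line 7
                q.append(w)         # line 8: add w to the end of Q
    return explored
```

On the Figure 8.5 graph, `bfs` discovers all six vertices, layer by layer, exactly as in the walkthrough above.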
Theorem 8.2 (Properties of BFS) For every undirected or directed graph G = (V, E) in adjacency-list representation and for every starting vertex s ∈ V:

(a) At the conclusion of BFS, a vertex v ∈ V is marked as explored if and only if there is a path from s to v in G.

(b) The running time of BFS is O(m + n), where m = |E| and n = |V|.

(c) The running time of lines 2–8 of BFS is O(m_s + n_s), where m_s and n_s denote the number of edges and vertices, respectively, reachable from s in G.

Proof: Part (a) follows from the guarantee in Proposition 8.1 for the generic graph search algorithm GenericSearch, of which BFS is a special case.13 Part (b) follows from part (c), as the overall running time of BFS is just the running time of lines 2–8 plus the O(n) time needed for the initialization in line 1. We can prove part (c) by inspecting the pseudocode. The initialization in line 2 takes O(1) time. In the main while loop, the algorithm only ever encounters the n_s vertices that are reachable from s. Because no vertex is explored twice, each such vertex is added to the end of the queue and removed from the front of the queue exactly once. Each of these operations takes O(1) time—this is the whole point of the first-in first-out queue data structure—and so the total amount of time spent in lines 3–4 and 7–8 is O(n_s). Each of the m_s edges (v, w) reachable from s is processed in line 5 at most twice—once when v is explored, and once when w is explored.14 Thus the total amount of time spent in lines 5–6 is O(m_s), and the overall running time for lines 2–8 is O(m_s + n_s). QED

8.2.5 Shortest Paths

The properties in Theorem 8.2 are not unique to breadth-first search—for example, they also hold for depth-first search. What is unique about BFS is that, with just a couple extra lines of code, it efficiently computes shortest-path distances.
13 Formally, BFS is equivalent to the version of GenericSearch where, in every iteration of the latter's while loop, the algorithm chooses the eligible edge (v, w) for which v was discovered the earliest, breaking ties among v's eligible edges according to their order in v's adjacency list. If that sounds too complicated, you can alternatively check that the proof of Proposition 8.1 holds verbatim also for breadth-first search. Intuitively, breadth-first search discovers vertices only by exploring paths from s; as long as it hasn't explored every vertex on a path, the "next vertex" on the path is still in the queue awaiting future exploration.
14 If G is a directed graph, each edge is processed at most once, when its tail vertex is explored.

Problem Definition

In a graph G, we use the notation dist(v, w) for the fewest number of edges in a path from v to w (or +∞, if G contains no path from v to w).15

Problem: Shortest Paths (Unit Edge Lengths)

Input: An undirected or directed graph G = (V, E), and a starting vertex s ∈ V.

Output: dist(s, v) for every vertex v ∈ V.16

For example, if G is the movie network and s is the vertex corresponding to Kevin Bacon, the problem of computing shortest paths is precisely the problem of computing everyone's Bacon number (Section 8.1.1). The basic graph search problem (Section 8.1.3) corresponds to the special case of identifying all the vertices v with dist(s, v) ≠ +∞.

Pseudocode

To compute shortest paths, we add two lines to the basic BFS algorithm (lines 2 and 9 below); these increase the algorithm's running time by a small constant factor. The first one initializes preliminary estimates of vertices' shortest-path distances—0 for s, and +∞ for the other vertices, which might not even be reachable from s.
The second one executes whenever a vertex w is discovered for the first time, and computes w's final shortest-path distance as one more than that of the vertex v that triggered w's discovery.

15 As usual, if G is directed, all the edges of the path should be traversed in the forward direction.
16 The phrase "unit edge lengths" in the problem statement refers to the assumption that each edge of G contributes 1 to the length of a path. Chapter 9 generalizes BFS to compute shortest paths in graphs in which each edge has its own nonnegative length.

AugmentedBFS

Input: graph G = (V, E) in adjacency-list representation, and a vertex s ∈ V.
Postcondition: for every vertex v ∈ V, the value l(v) equals the true shortest-path distance dist(s, v).

    1   mark s as explored, all other vertices as unexplored
    2   l(s) := 0, l(v) := +∞ for every v ≠ s
    3   Q := a queue data structure, initialized with s
    4   while Q is not empty do
    5       remove the vertex from the front of Q, call it v
    6       for each edge (v, w) in v's adjacency list do
    7           if w is unexplored then
    8               mark w as explored
    9               l(w) := l(v) + 1
    10              add w to the end of Q

Example and Analysis

In our running example (Figure 8.6), the first iteration of the while loop discovers the vertices a and b. Because s triggered their discovery and l(s) = 0, the algorithm reassigns l(a) and l(b) from +∞ to 1:

    [Queue Q, front to back: a, b (s already removed); l(s) = 0, l(a) = l(b) = 1, l(c) = l(d) = l(e) = +∞]

The second iteration of the while loop processes the vertex a, leading to c's discovery. The algorithm reassigns l(c) from +∞ to l(a) + 1, which is 2.
Similarly, in the third iteration, ℓ(d) is set to ℓ(b) + 1, which is also 2:

[Figure: ℓ(s) = 0; ℓ(a) = ℓ(b) = 1; ℓ(c) = ℓ(d) = 2 (the frontier); ℓ(e) = +∞. The queue Q contains c at the front, then d, with s, a, and b already removed.]

The fourth iteration discovers the final vertex e via the vertex c, and sets ℓ(e) to ℓ(c) + 1, which is 3. At this point, for every vertex v, ℓ(v) equals the true shortest-path distance dist(s, v), which also equals the number of the layer that contains v (Figure 8.6). These properties hold in general, and not just for this example.

Theorem 8.3 (Properties of AugmentedBFS) For every undirected or directed graph G = (V, E) in adjacency-list representation and for every starting vertex s ∈ V:

(a) At the conclusion of AugmentedBFS, for every vertex v ∈ V, the value of ℓ(v) equals the length dist(s, v) of a shortest path from s to v in G (or +∞, if no such path exists).

(b) The running time of AugmentedBFS is O(m + n), where m = |E| and n = |V|.

Because the asymptotic running time of the AugmentedBFS algorithm is the same as that of BFS, part (b) of Theorem 8.3 follows from the latter's running time guarantee (Theorem 8.2(b)). Part (a) follows from two observations. First, the vertices v with dist(s, v) = i are precisely the vertices in the ith layer of the graph—this is why we defined layers the way we did. Second, for every layer-i vertex w, AugmentedBFS eventually sets ℓ(w) = i (since w is discovered via a layer-(i−1) vertex v with ℓ(v) = i−1). For vertices not in any layer—that is, not reachable from s—both dist(s, v) and ℓ(v) are +∞.17

17 If you're hungry for a more rigorous proof, then proceed—in the privacy of your own home—by induction on the number of while loop iterations performed by the AugmentedBFS algorithm. Alternatively, Theorem 8.3(a) is a special case of the correctness of Dijkstra's shortest-path algorithm, as proved in Section 9.3.
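The AugmentedBFS pseudocode translates almost line for line into code. Below is a minimal Python sketch, assuming the graph is given as a dictionary of adjacency lists; the adjacency lists for the running example are reconstructed from the walkthrough above, so treat them as illustrative rather than authoritative.

```python
from collections import deque

def augmented_bfs(graph, s):
    """BFS with shortest-path distances (unit edge lengths).

    graph: dict mapping each vertex to its adjacency list.
    Returns dist with dist[v] = fewest edges on an s-v path,
    or float('inf') if v is unreachable from s.
    """
    dist = {v: float('inf') for v in graph}  # line 2: l(v) := +infinity
    dist[s] = 0                              # line 2: l(s) := 0
    explored = {s}                           # line 1
    q = deque([s])                           # line 3
    while q:                                 # line 4
        v = q.popleft()                      # line 5: front of the queue
        for w in graph[v]:                   # line 6
            if w not in explored:            # line 7
                explored.add(w)              # line 8
                dist[w] = dist[v] + 1        # line 9: one more hop than v
                q.append(w)                  # line 10: end of the queue
    return dist

# The running example (adjacency lists inferred from the walkthrough):
example = {
    's': ['a', 'b'], 'a': ['s', 'c'], 'b': ['s', 'c', 'd'],
    'c': ['a', 'b', 'd', 'e'], 'd': ['b', 'c', 'e'], 'e': ['c', 'd'],
}
```

Running `augmented_bfs(example, 's')` reproduces the ℓ-values from the walkthrough: 0 for s, 1 for a and b, 2 for c and d, and 3 for e.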
8.2.6 Solution to Quiz 8.1

Correct answer: (d). An undirected graph with n ≥ 2 vertices has at least two layers and at most n layers. When n ≥ 2, there cannot be fewer than two layers because s is the only vertex in layer 0. Complete graphs have only two layers (Figure 8.7(a)). There cannot be more than n layers, as layers are disjoint and contain at least one vertex each. Path graphs have n layers (Figure 8.7(b)).

[Figure 8.7: An n-vertex graph can have anywhere from two to n different layers. (a) A complete graph, with s in layer 0 and every other vertex in layer 1. (b) A path graph, with one vertex in each of the layers 0 through 3.]

8.3 Computing Connected Components

In this section, G = (V, E) will always denote an undirected graph. We postpone the more difficult connectivity problems in directed graphs until Section 8.6.

8.3.1 Connected Components

An undirected graph G = (V, E) naturally falls into "pieces," which are called connected components (Figure 8.8). More formally, a connected component is a maximal subset S ⊆ V of vertices such that there is a path from any vertex in S to any other vertex in S.18 For example, the connected components of the graph in Figure 8.8 are {1, 3, 5, 7, 9}, {2, 4}, and {6, 8, 10}.

[Figure 8.8: A graph with vertex set {1, 2, 3, . . . , 10} and three connected components.]

18 Still more formally, the connected components of a graph can be defined as the equivalence classes of a suitable equivalence relation. Equivalence relations are usually covered in a first course on proofs or on discrete mathematics. A relation on a set X of objects specifies, for each pair x, y ∈ X of objects, whether or not x and y are related. (If so, we write x ∼ y.) For connected components, the relevant relation (on the set V) is "v ∼G w if and only if there is a path between v and w in G." An equivalence relation satisfies three properties. First, it is reflexive, meaning that x ∼ x for every x ∈ X. (Satisfied by ∼G, as the empty path connects a vertex with itself.) Second, it is symmetric, with x ∼ y if and only if y ∼ x. (Satisfied by ∼G, as G is undirected.) Finally, it is transitive, meaning that x ∼ y and y ∼ z implies that x ∼ z. (Satisfied by ∼G, as you can paste together a path between vertices u and v with a path between vertices v and w to get a path between u and w.) An equivalence relation partitions the set of objects into equivalence classes, with each object related to all the objects in its class, and only to these. The equivalence classes of the relation ∼G are the connected components of G.

The goal of this section is to use breadth-first search to compute the connected components of a graph in linear time.19

19 Other graph search algorithms, including depth-first search, can be used to compute connected components in exactly the same way.

Problem: Undirected Connected Components

Input: An undirected graph G = (V, E).

Goal: Identify the connected components of G.

Next, let's double-check your understanding of the definition of connected components.

Quiz 8.2

Consider an undirected graph with n vertices and m edges. What are the minimum and maximum number of connected components that the graph could have, respectively?

a) 1 and n − 1

b) 1 and n

c) 1 and max{m, n}

d) 2 and max{m, n}

(See Section 8.3.6 for the solution and discussion.)

8.3.2 Applications

There are several reasons why you might be interested in the connected components of a graph.

Detecting network failures. One obvious application is checking whether or not a network, such as a road or communication network, has become disconnected.

Data visualization. Another application is in graph visualization—if you're trying to draw or otherwise visualize a graph, presumably you want to display the different components separately.

Clustering.
Suppose you have a collection of objects that you care about, with each pair annotated as either "similar" or "dissimilar." For example, the objects could be documents (like crawled Web pages or news stories), with similar objects corresponding to near-duplicate documents (perhaps differing only in a timestamp or a headline). Or the objects could be genomes, with two genomes deemed similar if a small number of mutations can transform one into the other. Now form an undirected graph G = (V, E), with vertices corresponding to objects and edges corresponding to pairs of similar objects. Intuitively, each connected component of this graph represents a set of objects that share much in common. For example, if the objects are crawled news stories, one might expect the vertices of a connected component to be variations on the same story reported on different Web sites. If the objects are genomes, a connected component might correspond to different individuals belonging to the same species.

8.3.3 The UCC Algorithm

Computing the connected components of an undirected graph easily reduces to breadth-first search (or other graph search algorithms, such as depth-first search). The idea is to use an outer loop to make a single pass over the vertices, invoking BFS as a subroutine whenever the algorithm encounters a vertex that it has never seen before. This outer loop ensures that the algorithm looks at every vertex at least once. Vertices are initialized as unexplored before the outer loop, and not inside a call to BFS. The algorithm also maintains a field cc(v) for each vertex v, to remember which connected component contains it. By identifying each vertex of V with its position in the vertex array, we can assume that V = {1, 2, 3, . . . , n}.

UCC

Input: undirected graph G = (V, E) in adjacency-list representation, with V = {1, 2, 3, . . . , n}.
Postcondition: for every u, v ∈ V, cc(u) = cc(v) if and only if u, v are in the same connected component.

mark all vertices as unexplored
numCC := 0
for i := 1 to n do                    // try all vertices
    if i is unexplored then           // avoid redundancy
        numCC := numCC + 1            // new component
        // call BFS starting at i (lines 2–8)
        Q := a queue data structure, initialized with i
        while Q is not empty do
            remove the vertex from the front of Q, call it v
            cc(v) := numCC
            for each (v, w) in v's adjacency list do
                if w is unexplored then
                    mark w as explored
                    add w to the end of Q

8.3.4 An Example

Let's trace the UCC algorithm's execution on the graph in Figure 8.8. The algorithm marks all vertices as unexplored and starts the outer for loop with vertex 1. This vertex has not been seen before, so the algorithm invokes BFS from it. Because BFS finds everything reachable from its starting vertex (Theorem 8.2(a)), it discovers all the vertices in {1, 3, 5, 7, 9}, and sets their cc-values to 1. One possible order of exploration is:

[Figure: vertex 1 explored first (#1), then 3 (#2), 5 (#3), 7 (#4), and 9 (#5), forming connected component #1; vertices 2, 4, 6, 8, and 10 remain unexplored.]

Once this call to BFS completes, the algorithm's outer for loop marches on and considers vertex 2. This vertex was not discovered by the first call to BFS, so BFS is invoked again, this time with vertex 2 as the starting vertex. After discovering vertices 2 and 4 (and setting their cc-values to 2), this call to BFS completes and the UCC algorithm resumes its outer for loop. Has the algorithm seen vertex 3 before? Yup, in the first BFS call. What about vertex 4? Yes again, this time in the second BFS call. Vertex 5? Been there, done that in the first BFS call. But what about vertex 6? Neither of the previous BFS calls discovered this vertex, so BFS is called again with vertex 6 as the starting vertex.
This third call to BFS discovers the vertices in {6, 8, 10}, and sets their cc-values to 3:

[Figure: the final order of exploration — 1 (#1), 3 (#2), 5 (#3), 7 (#4), 9 (#5) in connected component #1; 2 (#6), 4 (#7) in connected component #2; 6 (#8), 8 (#9), 10 (#10) in connected component #3.]

Finally, the algorithm verifies that the remaining vertices (7, 8, 9, and 10) have already been explored and halts.

8.3.5 Correctness and Running Time

The UCC algorithm correctly computes the connected components of an undirected graph, and does so in linear time.

Theorem 8.4 (Properties of UCC) For every undirected graph G = (V, E) in adjacency-list representation:

(a) At the conclusion of UCC, for every pair u, v of vertices, cc(u) = cc(v) if and only if u and v belong to the same connected component of G.

(b) The running time of UCC is O(m + n), where m = |E| and n = |V|.

Proof: For correctness, the first property of breadth-first search (Theorem 8.2(a)) implies that each call to BFS with a starting vertex i will discover the vertices in i's connected component and nothing more. The UCC algorithm gives these vertices a common cc-value. Because no vertex is explored twice, each call to BFS identifies a new connected component, with each component having a different cc-value. The outer for loop ensures that every vertex is visited at least once, so the algorithm will discover every connected component.

The running time bound follows from our refined running time analysis of BFS (Theorem 8.2(c)). Each call to BFS from a vertex i runs in O(mᵢ + nᵢ) time, where mᵢ and nᵢ denote the number of edges and vertices, respectively, in i's connected component. As BFS is called only once for each connected component, and each vertex or edge of G participates in exactly one component, the combined running time of all the BFS calls is O(Σᵢ mᵢ + Σᵢ nᵢ) = O(m + n). The initialization and additional bookkeeping performed by the algorithm requires only O(n) time, so the final running time is O(m + n).
Q.E.D.

8.3.6 Solution to Quiz 8.2

Correct answer: (b). A graph with one connected component is one in which you can get from anywhere to anywhere else. Path graphs and complete graphs (Figure 8.7) are two examples. At the other extreme, in a graph with no edges, each vertex is in its own connected component, for a total of n. There cannot be more than n connected components, as they are disjoint and each contains at least one vertex.

8.4 Depth-First Search

Why do we need another graph search strategy? After all, breadth-first search seems pretty awesome—it finds all the vertices reachable from the starting vertex in linear time, and can even compute shortest-path distances along the way. There's another linear-time graph search strategy, depth-first search (DFS), which comes with its own impressive catalog of applications (not already covered by BFS). For example, we'll see how to use DFS to compute in linear time a topological ordering of the vertices of a directed acyclic graph, as well as the connected components (appropriately defined) of a directed graph.

8.4.1 An Example

If breadth-first search is the cautious and tentative exploration strategy, depth-first search is its more aggressive cousin, always exploring from the most recently discovered vertex and backtracking only when necessary (like exploring a maze). Before we describe the full pseudocode for DFS, let's illustrate how it works on the same running example used in Section 8.2 (Figure 8.9).

[Figure 8.9: Running example for depth-first search.]

Like BFS, DFS marks a vertex as explored the first time it discovers it. Because it begins its exploration at the starting vertex s, for the graph in Figure 8.9, the first iteration of DFS examines the edges (s, a) and (s, b), in whatever order these edges appear in s's adjacency list. Let's say (s, a) comes first, leading DFS to discover the vertex a and mark it as explored.
The second iteration of DFS is where it diverges from BFS—rather than considering next s's other layer-1 neighbor b, DFS immediately proceeds to exploring the neighbors of a. (It will eventually get back to exploring (s, b).) Perhaps from a it checks s first (which is already marked as explored) and then discovers the vertex c, which is where it travels next:

[Figure: s (#1), a (#2), and c (#3) explored so far, with c on the frontier.]

Then DFS examines in some order the neighbors of c, the most recently discovered vertex. To keep things interesting, let's say that DFS discovers d next, followed by e:

[Figure: s (#1), a (#2), c (#3), d (#4), and e (#5) explored; DFS needs to backtrack from e.]

From e, DFS has nowhere to go—both of e's neighbors are already marked as explored. DFS is forced to retreat to the previous vertex, namely d, and resume exploring the rest of its neighbors. From d, DFS will discover the final vertex b (perhaps after checking c and finding it marked as explored). Once at b, the dominoes fall quickly. DFS discovers that all of b's neighbors have already been explored, and must backtrack to the previously visited vertex, which is d. Similarly, because all of d's remaining neighbors are already marked as explored, DFS must rewind further, to c. DFS then retreats further to a (after checking that all of c's remaining neighbors are marked as explored), then to s. It finally stops once it checks s's remaining neighbor (which is b) and finds it marked as explored.

8.4.2 Pseudocode for DFS

Iterative Implementation

One way to think about and implement DFS is to start from the code for BFS and make two changes: (i) swap in a stack data structure (which is last-in first-out) for the queue (which is first-in first-out); and (ii) postpone checking whether a vertex has already been explored until after removing it from the data structure.20,21

DFS (Iterative Version)

Input: graph G = (V, E) in adjacency-list representation, and a vertex s ∈ V.
Postcondition: a vertex is reachable from s if and only if it is marked as "explored."

mark all vertices as unexplored
S := a stack data structure, initialized with s
while S is not empty do
    remove ("pop") the vertex v from the front of S
    if v is unexplored then
        mark v as explored
        for each edge (v, w) in v's adjacency list do
            add ("push") w to the front of S

20 A stack is a "last-in first-out" data structure—like those stacks of upside-down trays at a cafeteria—that is typically studied in a first programming course (along with queues, see footnote 11). A stack maintains a list of objects, and you can add an object to the beginning of the list (a "push") or remove one from the beginning of the list (a "pop") in constant time.

21 Would the algorithm behave the same if we made only the first change?

As usual, the edges processed in the for loop are the edges incident to v (if G is an undirected graph) or the edges outgoing from v (if G is a directed graph). For example, in the graph in Figure 8.9, the first iteration of DFS's while loop pops the vertex s and pushes its two neighbors onto the stack in some order, say, with b first and a second. Because a was the last to be pushed, it is the first to be popped, in the second iteration of the while loop. This causes s and c to be pushed onto the stack, let's say with c first. The vertex s is popped in the next iteration; since it has already been marked as explored, the algorithm skips it. Then c is popped, and all of its neighbors (a, b, d, and e) are pushed onto the stack, joining the first occurrence of b. If d is pushed last, and also b is pushed before e when d is popped in the next iteration, then we recover the order of exploration from Section 8.4.1 (as you should check).

Recursive Implementation

Depth-first search also has an elegant recursive implementation.22

DFS (Recursive Version)

Input: graph G = (V, E) in adjacency-list representation, and a vertex s ∈ V.
Postcondition: a vertex is reachable from s if and only if it is marked as "explored."

// all vertices unexplored before outer call
mark s as explored
for each edge (s, v) in s's adjacency list do
    if v is unexplored then
        DFS(G, v)

22 I'm assuming you've heard of recursion as part of your programming background. A recursive procedure is one that invokes itself as a subroutine.

In this implementation, all recursive calls to DFS have access to the same set of global variables which track the vertices that have been marked as explored (with all vertices initially unexplored). The aggressive nature of DFS is perhaps more obvious in this implementation—the algorithm immediately recurses on the first unexplored neighbor that it finds, before considering the remaining neighbors.23 In effect, the explicit stack data structure in the iterative implementation of DFS is being simulated by the program stack of recursive calls in the recursive implementation.24

8.4.3 Correctness and Running Time

Depth-first search is just as correct and just as blazingly fast as breadth-first search, for the same reasons (cf. Theorem 8.2).25

Theorem 8.5 (Properties of DFS) For every undirected or directed graph G = (V, E) in adjacency-list representation and for every starting vertex s ∈ V:

(a) At the conclusion of DFS, a vertex v ∈ V is marked as explored if and only if there is a path from s to v in G.

(b) The running time of DFS is O(m + n), where m = |E| and n = |V|.

Part (a) holds because depth-first search is a special case of the generic graph search algorithm GenericSearch (see Proposition 8.1).26 Part (b) holds because DFS examines each edge at most twice (once from each endpoint) and, because the stack supports pushes and pops in O(1) time, performs a constant number of operations per edge examination (for O(m) total).
The initialization requires O(n) time.27

23 As stated, the two versions of DFS explore the edges in a vertex's adjacency list in opposite orders. (Do you see why?) If one of the versions is modified to iterate backward through a vertex's adjacency list, then the iterative and recursive implementations explore the vertices in the same order.

24 Pro tip: If your computer runs out of memory while executing the recursive version of DFS on a big graph, you should either switch to the iterative version or increase the program stack size in your programming environment.

25 The abbreviation "cf." stands for confer and means "compare to."

26 Formally, DFS is equivalent to the version of GenericSearch in which, in every iteration of the latter's while loop, the algorithm chooses the eligible edge (v, w) for which v was discovered most recently. Ties among v's eligible edges are broken according to their order (for the recursive version) or their reverse order (for the iterative version) in v's adjacency list.

27 The refined bound in Theorem 8.2(c) also holds for DFS (for the same reasons), which means DFS can substitute for BFS in the linear-time UCC algorithm for computing connected components in Section 8.3.

8.5 Topological Sort

Depth-first search is perfectly suited for computing a topological ordering of a directed acyclic graph. "What's that and who cares," you say?

8.5.1 Topological Orderings

Imagine that you have a bunch of tasks to complete, and there are precedence constraints, meaning that you cannot start some of the tasks until you have completed others. Think, for example, about the courses in a university degree program, some of which are prerequisites for others. One application of topological orderings is to sequencing tasks so that all precedence constraints are respected.

Topological Orderings

Let G = (V, E) be a directed graph.
A topological ordering of G is an assignment f(v) of every vertex v ∈ V to a different number such that: for every (v, w) ∈ E, f(v) < f(w).

The function f effectively orders the vertices, from the vertex with the smallest f-value to the one with the largest. The condition asserts that all of G's (directed) edges should travel forward in the ordering, with the label of the tail of an edge smaller than that of its head.

Quiz 8.3

How many different topological orderings does the following graph have? Use only the labels {1, 2, 3, 4}.

[Figure: a four-vertex directed graph with vertices s, v, w, and t.]

a) 0

b) 1

c) 2

d) 3

(See Section 8.5.7 for the solution and discussion.)

You can visualize a topological ordering by plotting the vertices in order of their f-values. In a topological ordering, all edges of the graph are directed from left to right. Figure 8.10 plots the topological orderings identified in the solution to Quiz 8.3.

[Figure 8.10: A topological ordering effectively plots the vertices of a graph on a line, with all edges going from left to right. (a) One topological ordering (s, v, w, t, at positions 1 through 4)... (b) ...and another one (s, w, v, t).]

When the vertices of a graph represent tasks and the directed edges represent precedence constraints, topological orderings correspond exactly to the different ways to sequence the tasks while respecting the precedence constraints.

8.5.2 When Does a Topological Ordering Exist?

Does every graph have a topological ordering? No way. Think about a graph consisting solely of a directed cycle (Figure 8.11(a)). No matter what vertex ordering you choose, traversing the edges of the cycle takes you back to the starting point, which is possible only if some edges go backward in the ordering (Figure 8.11(b)). More generally, it is impossible to topologically order the vertices of a graph that contains a directed cycle. Equivalently, it is impossible to sequence a set of tasks when their dependencies are circular.
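The defining condition is easy to test directly. Here is a small Python sketch (a hypothetical helper, not from the book) that checks whether an assignment f is a topological ordering; the example edge list assumes the quiz graph's edges are (s, v), (s, w), (v, t), and (w, t), consistent with the two orderings shown in Figure 8.10.

```python
def is_topological_ordering(edges, f):
    """Check the definition: f assigns distinct numbers to vertices,
    and every directed edge (v, w) satisfies f(v) < f(w).

    edges: iterable of (tail, head) pairs; f: dict vertex -> number.
    """
    distinct = len(set(f.values())) == len(f)  # different number per vertex
    return distinct and all(f[v] < f[w] for v, w in edges)

# Assumed edges of the Quiz 8.3 graph:
quiz_edges = [('s', 'v'), ('s', 'w'), ('v', 't'), ('w', 't')]
```

Both orderings from Figure 8.10 pass this check, while any assignment that sends an edge backward (say, with f(t) smaller than f(v)) fails.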
Happily, directed cycles are the only obstruction to topological orderings. A directed graph without any directed cycles is called—wait for it—a directed acyclic graph, or simply a DAG. For example, the graph in Figure 8.10 is directed acyclic; the graph in Figure 8.11 is not.

[Figure 8.11: Only a graph without directed cycles can have a topological ordering. (a) A directed cycle on the vertices u, v, w, x, y, z. (b) A non-topological ordering of its vertices.]

Theorem 8.6 (Every DAG Has a Topological Ordering) Every directed acyclic graph has at least one topological ordering.

To prove this theorem, we'll need the following lemma about source vertices. A source vertex of a directed graph is a vertex with no incoming edges. (Analogously, a sink vertex is one with no outgoing edges.) For example, s is the unique source vertex in the graph in Figure 8.10; the directed cycle in Figure 8.11 does not have any source vertices.

Lemma 8.7 (Every DAG Has a Source) Every directed acyclic graph has at least one source vertex.

Lemma 8.7 is true because if you keep following incoming edges backward out of an arbitrary vertex of a directed acyclic graph, you're bound to eventually reach a source vertex. (Otherwise, you would produce a cycle, which is impossible.) See also Figure 8.12.28

28 More formally, pick a vertex v₀ of a directed acyclic graph G; if it's a source vertex, we're done. If not, it has at least one incoming edge (v₁, v₀). If v₁ is a source vertex, we're done. Otherwise, there is an incoming edge of the form (v₂, v₁) and we can iterate again. After iterating up to n times, where n is the number of vertices, we either find a source vertex or produce a sequence of n edges (vₙ, vₙ₋₁), (vₙ₋₁, vₙ₋₂), . . . , (v₁, v₀). Because there are only n vertices, there's at least one repeat vertex in the sequence vₙ, vₙ₋₁, . . . , v₀. But if vⱼ = vᵢ with j > i, then the edges (vⱼ, vⱼ₋₁), . . . , (vᵢ₊₁, vᵢ) form a directed cycle, contradicting the assumption that G is directed acyclic. (In Figure 8.12, i = 2 and j = 8.)

[Figure 8.12: Tracing incoming edges back from a vertex fails to find a source vertex only if the graph contains a directed cycle.]

We can prove Theorem 8.6 by populating a topological ordering from left to right with successively extracted source vertices.29

29 Alternatively, following outgoing edges rather than incoming edges in the proof of Lemma 8.7 shows that every DAG has at least one sink vertex, and we can populate a topological ordering from right to left with successively extracted sink vertices.

Proof of Theorem 8.6: Let G be a directed acyclic graph with n vertices. The plan is to assign f-values to vertices in increasing order, from 1 to n. Which vertex has earned the right to wear 1 as its f-value? It had better be a source vertex—if a vertex with an incoming edge was assigned the first position, the incoming edge would go backward in the ordering. So, let v₁ be a source vertex of G—one exists by Lemma 8.7—and assign f(v₁) = 1. If there are multiple source vertices, pick one arbitrarily. Next, obtain the graph G′ from G by removing v₁ and all its edges. Because G is directed acyclic, so is G′—deleting stuff can't create new cycles. We can therefore recursively compute a topological ordering of G′, using the labels {2, 3, 4, . . . , n}, with every edge in G′ traveling forward in the ordering. (Since each recursive call is on a smaller graph, the recursion eventually stops.) The only edges in G that are not also in G′ are the (outgoing) edges of v₁; as f(v₁) = 1, these also travel forward in the ordering.30 Q.E.D.

30 If you prefer a formal proof of correctness, proceed in the privacy of your own home by induction on the number of vertices.

8.5.3 Computing a Topological Ordering

Theorem 8.6 implies that it makes sense to ask for a topological ordering of a directed graph if and only if the graph is directed acyclic.

Problem: Topological Sort

Input: A directed acyclic graph G = (V, E).

Output: A topological ordering of the vertices of G.

The proofs of Lemma 8.7 and Theorem 8.6 naturally lead to an algorithm. For an n-vertex directed acyclic graph in adjacency-list representation, the former proof gives an O(n)-time subroutine for finding a source vertex. The latter proof computes a topological ordering with n invocations of this subroutine, plucking off a new source vertex in each iteration.31 The running time of this algorithm is O(n²), which is linear time for the densest graphs (with m = Θ(n²) edges) but not for sparser graphs (where n² could be way bigger than m). Next up: a slicker solution via depth-first search, resulting in a linear-time (O(m + n)) algorithm.32

31 For the graph in Figure 8.10, this algorithm might compute either of the two topological orderings, depending on which of v, w is chosen as the source vertex in the second iteration, after s has been removed.

32 With some cleverness, the algorithm implicit in the proofs of Lemma 8.7 and Theorem 8.6 can also be implemented in linear time—do you see how to do it?

8.5.4 Topological Sort via DFS

The slick way to compute a topological ordering is to augment depth-first search in two small ways. For simplicity, we'll start from the recursive implementation of DFS in Section 8.4. The first addition is an outer loop that makes a single pass over the vertices, invoking DFS as a subroutine whenever a previously unexplored vertex is discovered. This ensures that every vertex is eventually discovered and assigned a label. The global variable curLabel keeps track of where we are in the topological ordering. Our algorithm will compute an ordering in reverse order (from right to left), so curLabel counts down from the number of vertices to 1.

TopoSort

Input: directed acyclic graph G = (V, E) in adjacency-list representation.
Postcondition: the f-values of vertices constitute a topological ordering of G.

mark all vertices as unexplored
curLabel := |V|                  // keeps track of ordering
for every v ∈ V do
    if v is unexplored then      // in a prior DFS
        DFSTopo(G, v)

Second, we must add a line of code to DFS that assigns an f-value to a vertex. The right time to do this is immediately upon completion of the DFS call initiated at v.

DFSTopo

Input: graph G = (V, E) in adjacency-list representation, and a vertex s ∈ V.

Postcondition: every vertex reachable from s is marked as "explored" and has an assigned f-value.

mark s as explored
for each edge (s, v) in s's outgoing adjacency list do
    if v is unexplored then
        DFSTopo(G, v)
f(s) := curLabel                 // s's position in ordering
curLabel := curLabel − 1         // work right-to-left

8.5.5 An Example

Suppose the input graph is the graph in Quiz 8.3. The TopoSort algorithm initializes the global variable curLabel to the number of vertices, which is 4. The outer loop in TopoSort iterates through the vertices in an arbitrary order; let's assume this order is v, t, s, w. In the first iteration, because v is not marked as explored, the algorithm invokes the DFSTopo subroutine with starting vertex v. The only outgoing edge from v is (v, t), and the next step is to recursively call DFSTopo with starting vertex t. This call returns immediately (as t has no outgoing edges), at which point f(t) is set to 4 and curLabel is decremented from 4 to 3. Next, the DFSTopo call at v completes (as v has no other outgoing edges), at which point f(v) is set to 3 and curLabel is decremented from 3 to 2. At this point, the TopoSort algorithm resumes its linear scan of the vertices in its outer loop. The next vertex is t; because t has already been marked as explored in the first call to DFSTopo, the TopoSort algorithm skips it. Because the next vertex (which is s) has not yet been explored, the algorithm invokes DFSTopo from s.
From s, DFSTopo skips v (which is already marked as explored) and recursively calls DFSTopo at the newly discovered vertex w. The call at w completes immediately (the only outgoing edge is to the previously explored vertex t), at which point f(w) is set to 2 and curLabel is decremented from 2 to 1. Finally, the DFSTopo call at vertex s completes, and f(s) is set to 1. The resulting topological ordering is the same as that in Figure 8.10(b).

Quiz 8.4

What happens when the TopoSort algorithm is run on a graph with a directed cycle?

a) The algorithm might or might not loop forever.

b) The algorithm always loops forever.

c) The algorithm always halts, and may or may not successfully compute a topological ordering.

d) The algorithm always halts, and never successfully computes a topological ordering.

(See Section 8.5.7 for the solution and discussion.)

8.5.6 Correctness and Running Time

The TopoSort algorithm correctly computes a topological ordering of a directed acyclic graph, and does so in linear time.

Theorem 8.8 (Properties of TopoSort) For every directed acyclic graph G = (V, E) in adjacency-list representation:

(a) At the conclusion of TopoSort, every vertex v has been assigned an f-value, and these f-values constitute a topological ordering of G.

(b) The running time of TopoSort is O(m + n), where m = |E| and n = |V|.

Proof: The TopoSort algorithm runs in linear time for the usual reasons. It explores each edge only once (from its tail), and therefore performs only a constant number of operations for each vertex or edge. This implies an overall running time of O(m + n).

For correctness, first note that DFSTopo will be called from each vertex v ∈ V exactly once, when v is encountered for the first time, and that v is assigned a label when this call completes.
Thus, every vertex receives a label, and by decrementing the curLabel variable with every label assignment, the algorithm ensures that each vertex v gets a distinct label f(v) from the set {1, 2, . . . , |V|}.

To see why these labels constitute a topological ordering, consider an arbitrary edge (v, w); we must argue that f(v) < f(w). There are two cases, depending on which of v, w the algorithm discovers first.33 If v is discovered before w, then DFS-Topo is invoked with starting vertex v before w has been marked as explored. As w is reachable from v (via the edge (v, w)), this call to DFS-Topo eventually discovers w and recursively calls DFS-Topo at w. By the last-in first-out nature of recursive calls, the call to DFS-Topo at w completes before that at v. Because labels are assigned in decreasing order, w is assigned a larger f-value than v, as required.

Second, suppose w is discovered by the TopoSort algorithm before v. Because G is a directed acyclic graph, there is no path from w back to v; otherwise, combining such a path with the edge (v, w) would produce a directed cycle (Figure 8.13). Thus, the call to DFS-Topo starting at w cannot discover v and completes with v still unexplored. Once again, the DFS-Topo call at w completes before that at v, and hence f(v) < f(w). QED

33 Both cases are possible, as we saw in Section 8.5.5.

[Figure 8.13: A directed acyclic graph cannot contain both an edge (v, w) and a path from w back to v.]

8.5.7 Solution to Quizzes 8.3–8.4

Solution to Quiz 8.3

Correct answer: (c). Figure 8.14 shows two different topological orderings of the graph—you should check that these are the only ones.

[Figure 8.14: Two topological orderings of the graph in Quiz 8.3. (a) One topological ordering (f(s) = 1, f(v) = 2, f(w) = 3, f(t) = 4). . . (b) . . . and another one (f(s) = 1, f(w) = 2, f(v) = 3, f(t) = 4).]

Solution to Quiz 8.4

Correct answer: (d).
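This answer can also be checked empirically. The sketch below (self-contained; the dict-of-lists graph encoding and names are assumptions for illustration, not the book's) runs the TopoSort logic on a three-vertex directed cycle: it halts, but at least one edge ends up violating the ordering condition f(v) < f(w).

```python
def topo_sort(graph):
    """The TopoSort/DFS-Topo logic from Section 8.5; halts on any input graph."""
    explored, f = set(), {}
    cur_label = len(graph)

    def dfs_topo(s):
        nonlocal cur_label
        explored.add(s)
        for v in graph[s]:
            if v not in explored:
                dfs_topo(v)
        f[s] = cur_label
        cur_label -= 1

    for v in graph:
        if v not in explored:
            dfs_topo(v)
    return f

# A directed cycle a -> b -> c -> a: the algorithm halts...
cycle = {'a': ['b'], 'b': ['c'], 'c': ['a']}
f = topo_sort(cycle)
# ...but some edge (v, w) must have f(v) > f(w), so the labels
# cannot be a topological ordering.
violations = [(v, w) for v in cycle for w in cycle[v] if f[v] > f[w]]
print(f, violations)  # the edge ('c', 'a') is violated
```

Here DFS from a assigns f(c) = 3, f(b) = 2, f(a) = 1, and the back edge (c, a) has f(c) > f(a), as the impossibility argument predicts.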
The algorithm always halts: there are only |V| iterations of the outer loop, and each iteration either does nothing or invokes depth-first search (with minor additional bookkeeping). Depth-first search always halts, whether or not the input graph is directed acyclic (Theorem 8.5), and so TopoSort does as well. Any chance it halts with a topological ordering? No way—it is impossible to topologically sort the vertices of any graph with a directed cycle (recall Section 8.5.2).

*8.6 Computing Strongly Connected Components

Next we'll learn an even more interesting application of depth-first search: computing the strongly connected components of a directed graph.34 Our algorithm will be just as blazingly fast as in the undirected case (Section 8.3), although less straightforward. Computing strongly connected components is a more challenging problem than topological sorting, and one pass of depth-first search won't be enough. So, we'll use two!35

8.6.1 Defining Strongly Connected Components

What do we even mean by a "connected component" of a directed graph? For example, how many connected components does the graph in Figure 8.15 have?

[Figure 8.15: How many connected components?]

It's tempting to say that this graph has one connected component—if it were a physical object, with the edges corresponding to strings tying the vertices together, we could pick it up and it would hang together in one piece. But remember how we defined connected components in the undirected case (Section 8.3), as maximal regions within which you can get from anywhere to anywhere else. There is no way to "move to the left" in the graph in Figure 8.15, so it's not the case that you can get from anywhere to anywhere else.

34 Starred sections like this one are the more difficult sections; they can be skipped on a first reading.
35 Actually, there is a somewhat tricky way to compute the strongly connected components of a directed graph with only one pass of depth-first search; see the paper "Depth-First Search and Linear Graph Algorithms," by Robert E. Tarjan (SIAM Journal on Computing, 1972).

A strongly connected component or SCC of a directed graph is a maximal subset S ⊆ V of vertices such that there is a directed path from any vertex in S to any other vertex in S.36 For example, the strongly connected components of the graph in Figure 8.16 are {1, 3, 5}, {11}, {2, 4, 7, 9}, and {6, 8, 10}. Within each component, it's possible to get from anywhere to anywhere else (as you should check). Each component is maximal subject to this property, as there's no way to "move to the left" from one SCC to another.

[Figure 8.16: A graph with vertex set {1, 2, 3, . . . , 11} and four strongly connected components.]

The relationships between the four SCCs of the graph in Figure 8.16 mirror those between the four vertices in the graph in Figure 8.15. More generally, if you squint, every directed graph can be viewed as a directed acyclic graph built up from its SCCs.

Proposition 8.9 (The SCC Meta-Graph Is Directed Acyclic) Let G = (V, E) be a directed graph. Define the corresponding meta-graph H = (X, F) with one meta-vertex x ∈ X per SCC of G and a

36 As with connected components in undirected graphs (footnote 18), the strongly connected components of a directed g