Classic Computer Science Problems in Python
David Kopec
Classic Computer Science Problems in Python deepens your knowledge of problem-solving techniques from the realm of computer science by challenging you with time-tested scenarios, exercises, and algorithms. As you work through examples in search, clustering, graphs, and more, you'll remember important things you've forgotten and discover classic solutions to your "new" problems!
About the Technology
Computer science problems that seem new or unique are often rooted in classic algorithms, coding techniques, and engineering principles. And classic approaches are still the best way to solve them! Understanding these techniques in Python expands your potential for success in web development, data munging, machine learning, and more.
About the Book
Classic Computer Science Problems in Python sharpens your CS problem-solving skills with time-tested scenarios, exercises, and algorithms, using Python. You'll tackle dozens of coding challenges, ranging from simple tasks like binary search algorithms to clustering data using k-means. You'll especially enjoy the feeling of satisfaction as you crack problems that connect computer science to the real-world concerns of apps, data, performance, and even nailing your next job interview!
What's Inside
• Search algorithms
• Common techniques for graphs
• Neural networks
• Genetic algorithms
• Adversarial search
• Uses type hints throughout
• Covers Python 3.7
About the Reader
For intermediate Python programmers.
Year: 2019
Edition: 1
Publisher: Manning Publications
Language: English
Pages: 224
ISBN-10: 1617295981
ISBN-13: 9781617295980
For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department, Manning Publications Co., 20 Baldwin Road, PO Box 761, Shelter Island, NY 11964. Email: orders@manning.com

© 2019 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Development editor: Jennifer Stout
Technical development editor: Frances Buontempo
Review editor: Aleksandar Dragosavljević
Production editor: Deirdre Hiam
Copy editor: Andy Carroll
Proofreader: Katie Tennant
Technical proofreader: Juan Rufes
Typesetter: Dottie Marsico
Cover designer: Marija Tudor

ISBN 9781617295980
Printed in the United States of America

Dedicated to my grandmother, Erminia Antos, a lifelong teacher and learner.
contents

acknowledgments
about this book
about the author
about the cover illustration

Introduction
0.1 Why Python?
0.2 What is a classic computer science problem?
0.3 What kinds of problems are in this book?
0.4 Who is this book for?
0.5 Python versioning, source code repository, and type hints
0.6 No graphics, no UI code, just the standard library
0.7 Part of a series

1 Small problems
1.1 The Fibonacci sequence
A first recursive attempt ■ Utilizing base cases ■ Memoization to the rescue ■ Automatic memoization ■ Keep it simple, Fibonacci ■ Generating Fibonacci numbers with a generator
1.2 Trivial compression
1.3 Unbreakable encryption
Getting the data in order ■ Encrypting and decrypting
1.4 Calculating pi
1.5 The Towers of Hanoi
Modeling the towers ■ Solving The Towers of Hanoi
1.6 Real-world applications
1.7 Exercises

2 Search problems
2.1 DNA search
Storing DNA ■ Linear search ■ Binary search ■ A generic example
2.2 Maze solving
Generating a random maze ■ Miscellaneous maze minutiae ■ Depth-first search ■ Breadth-first search ■ A* search
2.3 Missionaries and cannibals
Representing the problem ■ Solving
2.4 Real-world applications
2.5 Exercises

3 Constraint-satisfaction problems
3.1 Building a constraint-satisfaction problem framework
3.2 The Australian map-coloring problem
3.3 The eight queens problem
3.4 Word search
3.5 SEND+MORE=MONEY
3.6 Circuit board layout
3.7 Real-world applications
3.8 Exercises

4 Graph problems
4.1 A map as a graph
4.2 Building a graph framework
Working with Edge and Graph
4.3 Finding the shortest path
Revisiting breadth-first search (BFS)
4.4 Minimizing the cost of building the network
Working with weights ■ Finding the minimum spanning tree
4.5 Finding shortest paths in a weighted graph
Dijkstra’s algorithm
4.6 Real-world applications
4.7 Exercises

5 Genetic algorithms
5.1 Biological background
5.2 A generic genetic algorithm
5.3 A naive test
5.4 SEND+MORE=MONEY revisited
5.5 Optimizing list compression
5.6 Challenges for genetic algorithms
5.7 Real-world applications
5.8 Exercises

6 K-means clustering
6.1 Preliminaries
6.2 The k-means clustering algorithm
6.3 Clustering governors by age and longitude
6.4 Clustering Michael Jackson albums by length
6.5 K-means clustering problems and extensions
6.6 Real-world applications
6.7 Exercises

7 Fairly simple neural networks
7.1 Biological basis?
7.2 Artificial neural networks
Neurons ■ Layers ■ Backpropagation ■ The big picture
7.3 Preliminaries
Dot product ■ The activation function
7.4 Building the network
Implementing neurons ■ Implementing layers ■ Implementing the network
7.5 Classification problems
Normalizing data ■ The classic iris data set ■ Classifying wine
7.6 Speeding up neural networks
7.7 Neural network problems and extensions
7.8 Real-world applications
7.9 Exercises

8 Adversarial search
8.1 Basic board game components
8.2 Tic-tac-toe
Managing tic-tac-toe state ■ Minimax ■ Testing minimax with tic-tac-toe ■ Developing a tic-tac-toe AI
8.3 Connect Four
Connect Four game machinery ■ A Connect Four AI ■ Improving minimax with alpha-beta pruning
8.4 Minimax improvements beyond alpha-beta pruning
8.5 Real-world applications
8.6 Exercises

9 Miscellaneous problems
9.1 The knapsack problem
9.2 The Traveling Salesman Problem
The naive approach ■ Taking it to the next level
9.3 Phone number mnemonics
9.4 Real-world applications
9.5 Exercises

appendix A Glossary
appendix B More resources
appendix C A brief introduction to type hints
index

acknowledgments

Thank you, everyone at Manning who helped in
the production of this book: Cheryl Weisman, Deirdre Hiam, Katie Tennant, Dottie Marsico, Janet Vail, Barbara Mirecki, Aleksandar Dragosavljević, Mary Piergies, and Marija Tudor.

I thank acquisitions editor Brian Sawyer, who wisely steered us toward attacking Python after I finished Swift. Thank you, development editor Jennifer Stout, for always having a positive attitude. Thanks go to technical editor Frances Buontempo, who provided careful consideration of each chapter and gave detailed, useful feedback at every turn. I thank copy editor Andy Carroll, whose superb attention to detail on both the Swift book and this one caught several of my mistakes, and also my technical proofreader, Juan Rufes.

The following people also reviewed the book: Al Krinker, Al Pezewski, Alan Bogusiewicz, Brian Canada, Craig Henderson, Daniel Kenney-Jung, Edmond Sesay, Ewa Baranowska, Gary Barnhart, Geoff Clark, James Watson, Jeffrey Lim, Jens Christian Bredahl Madsen, Juan Jimenez, Juan Rufes, Matt Lemke, Mayur Patil, Michael Bright, Roberto Casadei, Sam Zaydel, Thorsten Weber, Tom Jeffries, and Will Lopez. Thanks go to all who provided constructive and specific criticism during the book’s development. Your feedback was incorporated.

I thank my family, friends, and colleagues who encouraged me to take on this book project immediately following the publication of Classic Computer Science Problems in Swift. I thank my online friends on Twitter and elsewhere who have provided encouraging words and helped promote the book in ways small and large. And I thank my wife, Rebecca Kopec, and my mom, Sylvia Kopec, who are always supportive of my projects.

We developed this book in a fairly short period of time. The vast majority of the manuscript was written over the summer of 2018, based on the earlier Swift version. I appreciate that Manning was willing to compress its (usually much longer) process to enable me to work during a schedule that was convenient to me.
I know this put pressure on the entire team as we went through three rounds of reviews at multiple different levels amongst many different people in just a few months. Most readers would be amazed at how many different kinds of reviews a technical book by a traditional publisher goes through and how many people have their part in critiquing and revising it. From the technical proofer to the copy editor, the review editor, all of the official reviewers, and everyone in between, I thank you!

Finally, most importantly, I thank my readers for purchasing this book. In a world full of half-hearted online tutorials, I think it is important to support the development of books that provide the same author's voice throughout an extended volume. Online tutorials can be superb resources, but your purchase enables full-length, vetted, and carefully developed books to still have a place in computer science education.

about this book

Trademarks

Trademarked names appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, the names are only used in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. "Python" is a registered trademark of the Python Software Foundation. "Connect Four" is a trademark of Hasbro, Inc.

Book forum

Purchase of Classic Computer Science Problems in Python includes free access to a private web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum, go to https://www.manning.com/books/classiccomputerscienceproblemsinpython. You can also learn more about Manning's forums and the rules of conduct at https://forums.manning.com/forums/about. Manning's commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place.
It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking him some challenging questions lest his interest stray! The forum and the archives of previous discussions will be accessible from the publisher's website as long as the book is in print.

about the author

David Kopec is an assistant professor of Computer Science & Innovation at Champlain College in Burlington, Vermont. He is an experienced software developer and the author of Classic Computer Science Problems in Swift (Manning, 2018) and Dart for Absolute Beginners (Apress, 2014). David holds a bachelor's degree in economics and a master's in computer science, both from Dartmouth College. You can reach David on Twitter @davekopec.

about the cover illustration

The figure on the cover of Classic Computer Science Problems in Python is captioned "Habit of a Bonza or Priest in China." The illustration is taken from Thomas Jefferys' A Collection of the Dresses of Different Nations, Ancient and Modern (four volumes), London, published between 1757 and 1772. The title page states that these are hand-colored copperplate engravings, heightened with gum arabic.

Thomas Jefferys (1719–1771) was called "Geographer to King George III." He was an English cartographer who was the leading map supplier of his day. He engraved and printed maps for government and other official bodies and produced a wide range of commercial maps and atlases, especially of North America. His work as a map maker sparked an interest in local dress customs of the lands he surveyed and mapped, which are brilliantly displayed in this collection. Fascination with faraway lands and travel for pleasure were relatively new phenomena in the late eighteenth century, and collections such as this one were popular, introducing both the tourist as well as the armchair traveler to the inhabitants of other countries.
The diversity of the drawings in Jefferys' volumes speaks vividly of the uniqueness and individuality of the world's nations some 200 years ago. Dress codes have changed since then, and the diversity by region and country, so rich at the time, has faded away. It's now often hard to tell the inhabitants of one continent from another. Perhaps, trying to view it optimistically, we've traded a cultural and visual diversity for a more varied personal life—or a more varied and interesting intellectual and technical life.

At a time when it's difficult to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Jefferys' pictures.

Introduction

Thank you for purchasing Classic Computer Science Problems in Python. Python is one of the most popular programming languages in the world, and people become Python programmers from a variety of backgrounds. Some have a formal computer science education. Others learn Python as a hobby. Still others use Python in a professional setting, but their primary job is not to be a software developer. The problems in this intermediate book will help seasoned programmers refresh themselves on ideas from their CS education while learning some advanced features of the language. Self-taught programmers will accelerate their CS education by learning classic problems in the language of their choice: Python. This book covers such a diversity of problem-solving techniques that there is truly something for everyone.

This book is not an introduction to Python. There are numerous excellent books from Manning and other publishers in that vein.[1] Instead, this book assumes that you are already an intermediate or advanced Python programmer. Although this book requires Python 3.7, mastery of every facet of the latest version of Python is not assumed.
In fact, the book's content was created with the assumption that it would serve as learning material to help readers achieve such mastery. On the other hand, this book is not appropriate for readers completely new to Python.

[1] If you are just starting your Python journey, you may want to first check out The Quick Python Book, 3rd edition, by Naomi Ceder (Manning, 2018) before beginning this book.

Why Python?

Python is used in pursuits as diverse as data science, filmmaking, computer science education, IT management, and much more. There really is no computing field that Python has not touched (except maybe kernel development). Python is loved for its flexibility, beautiful and succinct syntax, object-oriented purity, and bustling community. The strong community is important because it means Python is welcoming to newcomers and has a large ecosystem of available libraries for developers to build upon.

For the preceding reasons, Python is sometimes thought of as a beginner-friendly language, and that characterization is probably true. Most people would agree that Python is easier to learn than C++, for example, and its community is almost certainly friendlier to newcomers. As a result, many people learn Python because it is approachable, and they start writing the programs they want to write fairly quickly. But they may never have received an education in computer science that teaches them all of the powerful problem-solving techniques available to them. If you are one of those programmers who knows Python but does not know CS, this book is for you.

Other people learn Python as a second, third, fourth, or fifth language after a long time working in software development. For them, seeing old problems they've already seen in another language will help them accelerate their learning of Python.
For them, this book may be a good refresher before a job interview, or it might expose them to some problem-solving techniques they had not previously thought of exploiting in their work. I would encourage them to skim the table of contents to see if there are topics in this book that excite them.

What is a classic computer science problem?

Some say that computers are to computer science as telescopes are to astronomy. If that's the case, then perhaps a programming language is like a telescope lens. In any event, the term "classic computer science problems" is used here to mean "programming problems typically taught in an undergraduate computer science curriculum."

There are certain programming problems that are given to new programmers to solve and that have become commonplace enough to be deemed classic, whether in a classroom setting during the pursuit of a bachelor's degree (in computer science, software engineering, and the like) or within the confines of an intermediate programming textbook (for example, a first book on artificial intelligence or algorithms). A selection of such problems is what you will find in this book. The problems range from the trivial, which can be solved in a few lines of code, to the complex, which require the buildup of systems over multiple chapters. Some problems touch on artificial intelligence, and others simply require common sense. Some problems are practical, and other problems are fanciful.

What kinds of problems are in this book?

Chapter 1 introduces problem-solving techniques that will likely look familiar to most readers. Things like recursion, memoization, and bit manipulation are essential building blocks of other techniques explored in later chapters.

This gentle introduction is followed by chapter 2, which focuses on search problems. Search is such a large topic that you could arguably place most problems in the book under its banner.
Chapter 2 introduces the most essential search algorithms, including binary search, depth-first search, breadth-first search, and A*. These algorithms are reused throughout the rest of the book.

In chapter 3, you will build a framework for solving a broad range of problems that can be abstractly defined by variables of limited domains that have constraints between them. This includes such classics as the eight queens problem, the Australian map-coloring problem, and the cryptarithmetic SEND+MORE=MONEY.

Chapter 4 explores the world of graph algorithms, which to the uninitiated are surprisingly broad in their applicability. In this chapter, you will build a graph data structure and then use it to solve several classic optimization problems.

Chapter 5 explores genetic algorithms, a technique that is less deterministic than most covered in the book but that sometimes can solve problems traditional algorithms cannot solve in a reasonable amount of time.

Chapter 6 covers k-means clustering and is perhaps the most algorithmically specific chapter in the book. This clustering technique is simple to implement, easy to understand, and broadly applicable.

Chapter 7 aims to explain what a neural network is and to give the reader a taste of what a very simple neural network looks like. It does not aim to provide comprehensive coverage of this exciting and evolving field. In this chapter, you will build a neural network from first principles, using no external libraries, so you can really see how a neural network works.

Chapter 8 is on adversarial search in two-player perfect information games. You will learn a search algorithm known as minimax, which can be used to develop an artificial opponent that can play games like chess, checkers, and Connect Four well.

Finally, chapter 9 covers interesting (and fun) problems that did not quite fit anywhere else in the book.

Who is this book for?

This book is for both intermediate and experienced programmers.
Experienced programmers who want to deepen their knowledge of Python will find comfortably familiar problems from their computer science or programming education. Intermediate programmers will be introduced to these classic problems in the language of their choice: Python. Developers getting ready for coding interviews will likely find this book to be valuable preparation material.

In addition to professional programmers, students enrolled in undergraduate computer science programs who have an interest in Python will likely find this book helpful. It makes no attempt to be a rigorous introduction to data structures and algorithms. This is not a data structures and algorithms textbook. You will not find proofs or extensive use of big-O notation within its pages. Instead, it is positioned as an approachable, hands-on tutorial to the problem-solving techniques that should be the end product of taking data structure, algorithm, and artificial intelligence classes.

Once again, knowledge of Python's syntax and semantics is assumed. A reader with zero programming experience will get little out of this book, and a programmer with zero Python experience will almost certainly struggle. In other words, Classic Computer Science Problems in Python is a book for working Python programmers and computer science students.

Python versioning, source code repository, and type hints

The source code in this book was written to adhere to version 3.7 of the Python language. It utilizes features of Python that only became available in Python 3.7, so some of the code will not run on earlier versions of Python. Instead of struggling and trying to make the examples run in an earlier version, please just download the latest version of Python before starting the book.
This book only makes use of the Python standard library (with a slight exception in chapter 2, where the typing_extensions module is installed), so all of the code in this book should run on any platform where Python is supported (macOS, Windows, GNU/Linux, and so on). The code in this book was only tested against CPython (the main Python interpreter available from python.org), although it is likely that most of it will run in a Python 3.7–compatible version of another Python interpreter. This book does not explain how to use Python tools like editors, IDEs, debuggers, and the Python REPL.

The book's source code is available online from the GitHub repository: https://github.com/davecom/ClassicComputerScienceProblemsInPython. The source code is organized into folders by chapter. As you read each chapter, you will see the name of a source file in the header of each code listing. You can find that source file in its respective folder in the repository. You should be able to run the problem by just entering python3 filename.py or python filename.py, depending on your computer's setup with regard to the name of the Python 3 interpreter.

Every code listing in this book makes use of Python type hints, also known as type annotations. These annotations are a relatively new feature for the Python language, and they may look intimidating to Python programmers who have never seen them before. They are used for three reasons:

1. They provide clarity about the types of variables, function parameters, and function returns.
2. They self-document the code in a sense, as a result of reason 1. Instead of having to search through a comment or docstring to find the return type of a function, you can just look at its signature.
3. They allow the code to be type-checked for correctness. One popular Python type checker is mypy.

Not everyone is a fan of type hints, and choosing to use them throughout the book was frankly a gamble. I hope they will be a help instead of a hindrance.
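The three reasons above can be made concrete with a small sketch. This example is not from the book; the function name `mean` is purely illustrative:

```python
from typing import List

def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# Reason 2 in action: the signature alone documents that mean()
# takes a list of floats and returns a float -- no docstring
# search required.
result: float = mean([1.0, 2.0, 3.0])

# Reason 3: a type checker such as mypy would flag a call like
# mean("hello") as an error, even though the hints themselves are
# ignored at runtime.
```

Running `mypy` over a file like this checks every annotated call site without executing the program.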
It takes a little more time to write Python with type hints, but it provides more clarity when read back. An interesting note is that type hints have no effect on the actual running of the code in the Python interpreter. You can remove the type hints from any of the code in this book, and it should still run. If you have never seen type hints before and feel you need a more comprehensive introduction to them before diving into the book, please see appendix C, which provides a crash course in type hints.

No graphics, no UI code, just the standard library

There are no examples in this book that produce graphical output or that make use of a graphical user interface (GUI). Why? The goal is to solve the posed problems with solutions that are as concise and readable as possible. Often, doing graphics gets in the way or makes solutions significantly more complex than they need to be to illustrate the technique or algorithm in question.

Further, by not making use of any GUI framework, all of the code in the book is eminently portable. It can as easily run on an embedded distribution of Python running on Linux as it can on a desktop running Windows. Also, a conscious decision was made to only use packages from the Python standard library instead of any external libraries, as most advanced Python books do. Why? The goal is to teach problem-solving techniques from first principles, not to "pip install a solution." By having to work through every problem from scratch, you will hopefully gain an understanding about how popular libraries work behind the scenes. At a minimum, only using the standard library makes the code in this book more portable and easier to run.

This is not to say that graphical solutions are not sometimes more illustrative of an algorithm than text-based solutions. It simply was not the focus of this book. It would add another layer of unnecessary complexity.
Part of a series

This is the second book in a series titled Classic Computer Science Problems published by Manning. The first book was Classic Computer Science Problems in Swift, published in 2018. In each book in the series, we aim to provide language-specific insight while teaching through the lens of the same (mostly) computer science problems.

If you enjoy this book and plan to learn another language covered by the series, you may find going from one book to another an easy way to improve your mastery of that language. For now, the series covers just Swift and Python. I wrote the first two books myself, because I have significant experience in both of those languages, but we are already discussing plans for future books in the series coauthored by people who are experts in other languages. I encourage you to look out for them if you enjoy this book. For more information about the series, visit https://classicproblems.com/.

1 Small problems

To get started, we will explore some simple problems that can be solved with no more than a few relatively short functions. Although these problems are small, they will still allow us to explore some interesting problem-solving techniques. Think of them as a good warmup.

1.1 The Fibonacci sequence

The Fibonacci sequence is a sequence of numbers such that any number, except for the first and second, is the sum of the previous two:

0, 1, 1, 2, 3, 5, 8, 13, 21...

The value of the first Fibonacci number in the sequence is 0. The value of the fourth Fibonacci number is 2. It follows that to get the value of any Fibonacci number, n, in the sequence, one can use the formula

fib(n) = fib(n - 1) + fib(n - 2)

1.1.1 A first recursive attempt

The preceding formula for computing a number in the Fibonacci sequence (illustrated in figure 1.1) is a form of pseudocode that can be trivially translated into a recursive Python function. (A recursive function is a function that calls itself.) This mechanical translation will serve as our first attempt at writing a function to return a given value of the Fibonacci sequence.

Listing 1.1 fib1.py

def fib1(n: int) -> int:
    return fib1(n - 1) + fib1(n - 2)
This mechanical translation will serve as our first attempt at writing a function to return a given value of the Fibonacci sequence. Listing 1.1 fib1.py def fib1(n: int) > int: return fib1(n  1) + fib1(n  2) 6 The Fibonacci sequence 7 + I’m as tall as the previous two stickmen added together. 1 1 2 = Me too! 3 5 8 Figure 1.1 The height of each stickman is the previous two stickmen’s heights added together. Let’s try to run this function by calling it with a value. Listing 1.2 fib1.py continued if __name__ == "__main__": print(fib1(5)) Uhoh! If we try to run fib1.py, we generate an error: RecursionError: maximum recursion depth exceeded The issue is that fib1() will run forever without returning a final result. Every call to fib1() results in another two calls of fib1() with no end in sight. We call such a circumstance infinite recursion (see figure 1.2), and it is analogous to an infinite loop. In recursion, we go around and around... n–2 fib(n) n–1 Figure 1.2 The recursive function fib(n) calls itself with the arguments n2 and n1. 8 1.1.2 CHAPTER 1 Small problems Utilizing base cases Notice that until you run fib1(), there is no indication from your Python environment that there is anything wrong with it. It is the duty of the programmer to avoid infinite recursion, not the compiler or the interpreter. The reason for the infinite recursion is that we never specified a base case. In a recursive function, a base case serves as a stopping point. In the case of the Fibonacci function, we have natural base cases in the form of the special first two sequence values, 0 and 1. Neither 0 nor 1 is the sum of the previous two numbers in the sequence. Instead, they are the special first two values. Let’s try specifying them as base cases. 
Listing 1.3 fib2.py

def fib2(n: int) -> int:
    if n < 2:  # base case
        return n
    return fib2(n - 2) + fib2(n - 1)  # recursive case

The fib2() version of the Fibonacci function returns 0 as the zeroth number (fib2(0)), rather than the first number, as in our original proposition. In a programming context, this kind of makes sense because we are used to sequences starting with a zeroth element.

NOTE fib2() can be called successfully and will return correct results. Try calling it with some small values.

Listing 1.4 fib2.py continued

if __name__ == "__main__":
    print(fib2(5))
    print(fib2(10))

Do not try calling fib2(50). It will never finish executing! Why? Every call to fib2() results in two more calls to fib2() by way of the recursive calls fib2(n - 1) and fib2(n - 2) (see figure 1.3). In other words, the call tree grows exponentially. For example, a call of fib2(4) results in this entire set of calls:

fib2(4) -> fib2(3), fib2(2)
fib2(3) -> fib2(2), fib2(1)
fib2(2) -> fib2(1), fib2(0)
fib2(2) -> fib2(1), fib2(0)
fib2(1) -> 1
fib2(1) -> 1
fib2(1) -> 1
fib2(0) -> 0
fib2(0) -> 0

If you count them (and as you will see if you add some print calls), there are 9 calls to fib2() just to compute the 4th element! It gets worse. There are 15 calls required to compute element 5, 177 calls to compute element 10, and 21,891 calls to compute element 20. We can do better.

Figure 1.3 Every non-base-case call of fib2() results in two more calls of fib2().

1.1.3 Memoization to the rescue

Memoization is a technique in which you store the results of computational tasks when they are completed so that when you need them again, you can look them up instead of needing to compute them a second (or millionth) time (see figure 1.4).¹
Figure 1.4 The human memoization machine

Let's create a new version of the Fibonacci function that utilizes a Python dictionary for memoization purposes.

¹ Donald Michie, a famous British computer scientist, coined the term memoization. Donald Michie, "Memo functions: a language feature with 'rote-learning' properties" (Edinburgh University, Department of Machine Intelligence and Perception, 1967).

Listing 1.5 fib3.py

from typing import Dict

memo: Dict[int, int] = {0: 0, 1: 1}  # our base cases

def fib3(n: int) -> int:
    if n not in memo:
        memo[n] = fib3(n - 1) + fib3(n - 2)  # memoization
    return memo[n]

You can now safely call fib3(50).

Listing 1.6 fib3.py continued

if __name__ == "__main__":
    print(fib3(5))
    print(fib3(50))

A call to fib3(20) will result in just 39 calls of fib3() as opposed to the 21,891 calls of fib2() resulting from the call fib2(20). memo is prefilled with the earlier base cases of 0 and 1, saving fib3() from the complexity of another if statement.

1.1.4 Automatic memoization

fib3() can be further simplified. Python has a built-in decorator for memoizing any function automagically. In fib4(), the decorator @functools.lru_cache() is used with the same exact code as we used in fib2(). Each time fib4() is executed with a novel argument, the decorator causes the return value to be cached. Upon future calls of fib4() with the same argument, the previous return value of fib4() for that argument is retrieved from the cache and returned.

Listing 1.7 fib4.py

from functools import lru_cache

@lru_cache(maxsize=None)
def fib4(n: int) -> int:  # same definition as fib2()
    if n < 2:  # base case
        return n
    return fib4(n - 2) + fib4(n - 1)  # recursive case

if __name__ == "__main__":
    print(fib4(5))
    print(fib4(50))

Note that we are able to calculate fib4(50) instantly, even though the body of the Fibonacci function is the same as that in fib2().
@lru_cache's maxsize property indicates how many of the most recent calls of the function it is decorating should be cached. Setting it to None indicates that there is no limit.

1.1.5 Keep it simple, Fibonacci

There is an even more performant option. We can solve Fibonacci with an old-fashioned iterative approach.

Listing 1.8 fib5.py

def fib5(n: int) -> int:
    if n == 0: return n  # special case
    last: int = 0  # initially set to fib(0)
    next: int = 1  # initially set to fib(1)
    for _ in range(1, n):
        last, next = next, last + next
    return next

if __name__ == "__main__":
    print(fib5(5))
    print(fib5(50))

WARNING The body of the for loop in fib5() uses tuple unpacking in perhaps a bit of an overly clever way. Some may feel that it sacrifices readability for conciseness. Others may find the conciseness in and of itself more readable. The gist is, last is being set to the previous value of next, and next is being set to the previous value of last plus the previous value of next. This avoids the creation of a temporary variable to hold the old value of next after last is updated but before next is updated. Using tuple unpacking in this fashion for some kind of variable swap is common in Python.

With this approach, the body of the for loop will run a maximum of n - 1 times. In other words, this is the most efficient version yet. Compare 19 runs of the for loop body to 21,891 recursive calls of fib2() for the 20th Fibonacci number. That could make a serious difference in a real-world application!

In the recursive solutions, we worked backward. In this iterative solution, we work forward. Sometimes recursion is the most intuitive way to solve a problem. For example, the meat of fib1() and fib2() is pretty much a mechanical translation of the original Fibonacci formula. However, naive recursive solutions can also come with significant performance costs. Remember, any problem that can be solved recursively can also be solved iteratively.
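The call counts quoted for fib2() can be checked empirically with a small counting wrapper. The counted helper below is our own illustration, not part of the book's code:

```python
from functools import wraps

def counted(fn):
    # Hypothetical helper: wraps fn and records how many times it runs.
    @wraps(fn)
    def wrapper(n):
        wrapper.calls += 1
        return fn(n)
    wrapper.calls = 0
    return wrapper

@counted
def fib2(n: int) -> int:  # same naive recursion as listing 1.3
    if n < 2:  # base case
        return n
    return fib2(n - 2) + fib2(n - 1)  # recursive case

fib2(4)
print(fib2.calls)  # 9 calls, matching the count given in the text
```

Because the recursive calls resolve the name fib2 at call time, they go through the wrapper too, so every invocation is counted. Resetting fib2.calls and calling fib2(10) should show 177 calls, again matching the text.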
1.1.6 Generating Fibonacci numbers with a generator

So far, we have written functions that output a single value in the Fibonacci sequence. What if we want to output the entire sequence up to some value instead? It is easy to convert fib5() into a Python generator using the yield statement. When the generator is iterated, each iteration will spew a value from the Fibonacci sequence using a yield statement.

Listing 1.9 fib6.py

from typing import Generator

def fib6(n: int) -> Generator[int, None, None]:
    yield 0  # special case
    if n > 0: yield 1  # special case
    last: int = 0  # initially set to fib(0)
    next: int = 1  # initially set to fib(1)
    for _ in range(1, n):
        last, next = next, last + next
        yield next  # main generation step

if __name__ == "__main__":
    for i in fib6(50):
        print(i)

If you run fib6.py, you will see 51 numbers in the Fibonacci sequence printed. For each iteration of the for loop for i in fib6(50):, fib6() runs through to a yield statement. If the end of the function is reached and there are no more yield statements, the loop finishes iterating.

1.2 Trivial compression

Saving space (virtual or real) is often important. It is more efficient to use less space, and it can save money. If you are renting an apartment that is bigger than you need for your things and family, you could "downsize" to a smaller place that is less expensive. If you are paying by the byte to store your data on a server, you may want to compress it so that its storage costs you less. Compression is the act of taking data and encoding it (changing its form) in such a way that it takes up less space. Decompression is reversing the process, returning the data to its original form.

If it is more storage-efficient to compress data, then why is all data not compressed? There is a tradeoff between time and space. It takes time to compress a piece of data and to decompress it back into its original form.
Therefore, data compression only makes sense in situations where small size is prioritized over fast execution. Think of large files being transmitted over the internet. Compressing them makes sense because it will take longer to transfer the files than it will to decompress them once received. Further, the time taken to compress the files for their storage on the original server only needs to be accounted for once.

The easiest data compression wins come about when you realize that data storage types use more bits than are strictly required for their contents. For instance, thinking low-level, if an unsigned integer that will never exceed 65,535 is being stored as a 64-bit unsigned integer in memory, it is being stored inefficiently. It could instead be stored as a 16-bit unsigned integer. This would reduce the space consumption for the actual number by 75% (16 bits instead of 64 bits). If millions of such numbers are being stored inefficiently, it can add up to megabytes of wasted space.

In Python, sometimes for the sake of simplicity (which is a legitimate goal, of course), the developer is shielded from thinking in bits. There is no 64-bit unsigned integer type, and there is no 16-bit unsigned integer type. There is just a single int type that can store numbers of arbitrary precision. The function sys.getsizeof() can help you find out how many bytes of memory your Python objects are consuming. But due to the inherent overhead of the Python object system, there is no way to create an int that takes up less than 28 bytes (224 bits) in Python 3.7. A single int can be extended one bit at a time (as we will do in this example), but it consumes a minimum of 28 bytes.

NOTE If you are a little rusty regarding binary, recall that a bit is a single value that is either a 1 or a 0. A sequence of 1s and 0s is read in base 2 to represent a number.
For the purposes of this section, you do not need to do any math in base 2, but you do need to understand that the number of bits that a type stores determines how many different values it can represent. For example, 1 bit can represent 2 values (0 or 1), 2 bits can represent 4 values (00, 01, 10, 11), 3 bits can represent 8 values, and so on.

If the number of possible different values that a type is meant to represent is less than the number of values that the bits being used to store it can represent, it can likely be more efficiently stored. Consider the nucleotides that form a gene in DNA.² Each nucleotide can only be one of four values: A, C, G, or T. (There will be more about this in chapter 2.) Yet if the gene is stored as a str, which can be thought of as a collection of Unicode characters, each nucleotide will be represented by a character, which generally requires 8 bits of storage. In binary, just 2 bits are needed to store a type with four possible values: 00, 01, 10, and 11 are the four different values that can be represented by 2 bits. If A is assigned 00, C is assigned 01, G is assigned 10, and T is assigned 11, the storage required for a string of nucleotides can be reduced by 75% (from 8 bits to 2 bits per nucleotide).

Instead of storing our nucleotides as a str, they can be stored as a bit string (see figure 1.5). A bit string is exactly what it sounds like: an arbitrary-length sequence of 1s and 0s. For example, the str "ATG" occupies 24 bits (8 bits per character), whereas the equivalent bit string 001110 occupies just 6 bits (2 bits per nucleotide). Unfortunately, the Python standard library contains no off-the-shelf construct for working with bit strings of arbitrary length.

Figure 1.5 Compressing a str representing a gene into a 2-bit-per-nucleotide bit string

² This example is inspired by Algorithms, 4th Edition, by Robert Sedgewick and Kevin Wayne (Addison-Wesley Professional, 2011), page 819.
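The core trick used throughout this section can be seen in just a few lines before we wrap it in a class: a plain int serves as a growable bit string, where a left shift makes room and a bitwise OR writes the new bits. The nucleotide-to-bits mapping here follows the assignment described above.

```python
# Sketch: a Python int as an arbitrary-length bit string.
bit_string = 1           # sentinel bit marking the start of the data
bit_string <<= 2         # shift left to make room for two bits: 0b100
bit_string |= 0b10       # OR in the bits 10 (e.g., "G"): 0b110
bit_string <<= 2         # make room again: 0b11000
bit_string |= 0b11       # OR in the bits 11 (e.g., "T"): 0b11011
print(bin(bit_string))   # 0b11011
```

Reading the bits back out is the mirror image: shift right and mask with 0b11, which is exactly what the decompress() method later in this section does.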
The following code converts a str composed of As, Cs, Gs, and Ts into a string of bits and back again. The string of bits is stored within an int. Because the int type in Python can be of any length, it can be used as a bit string of any length. To convert back into a str, we will implement the Python __str__() special method.

Listing 1.10 trivial_compression.py

class CompressedGene:
    def __init__(self, gene: str) -> None:
        self._compress(gene)

A CompressedGene is provided a str of characters representing the nucleotides in a gene, and it internally stores the sequence of nucleotides as a bit string. The __init__() method's main responsibility is to initialize the bit-string construct with the appropriate data. __init__() calls _compress() to do the dirty work of actually converting the provided str of nucleotides into a bit string.

Note that _compress() starts with an underscore. Python has no concept of truly private methods or variables. (All variables and methods can be accessed through reflection; there's no strict enforcement of privacy.) A leading underscore is used as a convention to indicate that the implementation of a method should not be relied on by actors outside of the class. (It is subject to change and should be treated as private.)

TIP If you start a method or instance variable name in a class with two leading underscores, Python will "name mangle" it, changing its implementation name with a salt and not making it easily discoverable by other classes. We use one underscore in this book to indicate a "private" variable or method, but you may wish to use two if you really want to emphasize that something is private. For more on naming in Python, check out the section "Descriptive Naming Styles" from PEP 8: http://mng.bz/NA52.

Next, let's look at how we can actually perform the compression.
Listing 1.11 trivial_compression.py continued

    def _compress(self, gene: str) -> None:
        self.bit_string: int = 1  # start with sentinel
        for nucleotide in gene.upper():
            self.bit_string <<= 2  # shift left two bits
            if nucleotide == "A":  # change last two bits to 00
                self.bit_string |= 0b00
            elif nucleotide == "C":  # change last two bits to 01
                self.bit_string |= 0b01
            elif nucleotide == "G":  # change last two bits to 10
                self.bit_string |= 0b10
            elif nucleotide == "T":  # change last two bits to 11
                self.bit_string |= 0b11
            else:
                raise ValueError("Invalid Nucleotide:{}".format(nucleotide))

The _compress() method looks at each character in the str of nucleotides sequentially. When it sees an A, it adds 00 to the bit string. When it sees a C, it adds 01, and so on. Remember that two bits are needed for each nucleotide. As a result, before we add each new nucleotide, we shift the bit string two bits to the left (self.bit_string <<= 2).

Every nucleotide is added using an "or" operation (|). After the left shift, two 0s are added to the right side of the bit string. In bitwise operations, "ORing" (for example, self.bit_string |= 0b10) 0s with any other value results in the other value replacing the 0s. In other words, we continually add two new bits to the right side of the bit string. The two bits that are added are determined by the type of the nucleotide.

Finally, we will implement decompression and the special __str__() method that uses it.
Listing 1.12 trivial_compression.py continued

    def decompress(self) -> str:
        gene: str = ""
        for i in range(0, self.bit_string.bit_length() - 1, 2):  # - 1 to exclude sentinel
            bits: int = self.bit_string >> i & 0b11  # get just 2 relevant bits
            if bits == 0b00:  # A
                gene += "A"
            elif bits == 0b01:  # C
                gene += "C"
            elif bits == 0b10:  # G
                gene += "G"
            elif bits == 0b11:  # T
                gene += "T"
            else:
                raise ValueError("Invalid bits:{}".format(bits))
        return gene[::-1]  # [::-1] reverses string by slicing backward

    def __str__(self) -> str:  # string representation for pretty printing
        return self.decompress()

decompress() reads two bits from the bit string at a time, and it uses those two bits to determine which character to add to the end of the str representation of the gene. Because the bits are being read backward, compared to the order they were compressed in (right to left instead of left to right), the str representation is ultimately reversed (using the slicing notation for reversal [::-1]). Finally, note how the convenient int method bit_length() aided in the development of decompress(). Let's test it out.

Listing 1.13 trivial_compression.py continued

if __name__ == "__main__":
    from sys import getsizeof
    original: str = "TAGGGATTAACCGTTATATATATATAGCCATGGATCGATTATATAGGGATTAACCGTTATATATATATAGCCATGGATCGATTATA" * 100
    print("original is {} bytes".format(getsizeof(original)))
    compressed: CompressedGene = CompressedGene(original)  # compress
    print("compressed is {} bytes".format(getsizeof(compressed.bit_string)))
    print(compressed)  # decompress
    print("original and decompressed are the same: {}".format(original == compressed.decompress()))

Using the sys.getsizeof() function, we can indicate in the output whether we did indeed save almost 75% of the memory cost of storing the gene through this compression scheme.
Listing 1.14 trivial_compression.py output

original is 8649 bytes
compressed is 2320 bytes
TAGGGATTAACC…
original and decompressed are the same: True

NOTE In the CompressedGene class, we used if statements extensively to decide between a series of cases in both the compression and the decompression methods. Because Python has no switch statement, this is somewhat typical. What you will also see in Python sometimes is a high reliance on dictionaries in place of extensive if statements to deal with a set of cases. Imagine, for instance, a dictionary in which we could look up each nucleotide's respective bits. This can sometimes be more readable, but it can come with a performance cost. Even though a dictionary lookup is technically O(1), the cost of running a hash function will sometimes mean a dictionary is less performant than a series of ifs. Whether this holds will depend on what a particular program's if statements must evaluate to make their decision. You may want to run performance tests on both methods if you need to make a decision between ifs and dictionary lookup in a critical section of code.

1.3 Unbreakable encryption

A one-time pad is a way of encrypting a piece of data by combining it with meaningless random dummy data in such a way that the original cannot be reconstituted without access to both the product and the dummy data. In essence, this leaves the encrypter with a key pair. One key is the product, and the other is the random dummy data. One key on its own is useless; only the combination of both keys can unlock the original data. When performed correctly, a one-time pad is a form of unbreakable encryption. Figure 1.6 shows the process.

1.3.1 Getting the data in order

In this example, we will encrypt a str using a one-time pad. One way of thinking about a Python 3 str is as a sequence of UTF-8 bytes (with UTF-8 being a Unicode character encoding).
A str can be converted into a sequence of UTF-8 bytes (represented as the bytes type) through the encode() method. Likewise, a sequence of UTF-8 bytes can be converted back into a str using the decode() method on the bytes type.

Figure 1.6 A one-time pad results in two keys that can be separated and then recombined to recreate the original data.

There are three criteria that the dummy data used in a one-time pad encryption operation must meet for the resulting product to be unbreakable. The dummy data must be the same length as the original data, truly random, and completely secret. The first and third criteria are common sense. If the dummy data repeats because it is too short, there could be an observed pattern. If one of the keys is not truly secret (perhaps it is reused elsewhere or partially revealed), then an attacker has a clue. The second criterion poses a question all its own: can we produce truly random data? The answer for most computers is no.

In this example we will use the pseudo-random data generating function token_bytes() from the secrets module (first included in the standard library in Python 3.6). Our data will not be truly random, in the sense that the secrets package still is using a pseudo-random number generator behind the scenes, but it will be close enough for our purposes. Let's generate a random key for use as dummy data.

Listing 1.15 unbreakable_encryption.py

from secrets import token_bytes
from typing import Tuple

def random_key(length: int) -> int:
    # generate length random bytes
    tb: bytes = token_bytes(length)
    # convert those bytes into a bit string and return it
    return int.from_bytes(tb, "big")

This function creates an int filled with length random bytes. The method int.from_bytes() is used to convert from bytes to int.
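The bytes-to-int conversion can be seen in isolation with a tiny example (the two-byte value here is an arbitrary stand-in):

```python
# Sketch: how int.from_bytes() packs a bytes sequence into one integer.
tb = bytes([0x01, 0x02])           # two bytes: 00000001 00000010
n = int.from_bytes(tb, "big")      # big-endian: 0x01 * 256 + 0x02
print(n)                           # 258
print(n.to_bytes(2, "big") == tb)  # True: to_bytes() reverses the conversion
```

The second argument, "big", is the endianness; it matters only in that the same ordering must be used in both directions, a point the book returns to shortly.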
How can multiple bytes be converted to a single integer? The answer lies in section 1.2. In that section, you learned that the int type can be of arbitrary size, and you saw how it can be used as a generic bit string. int is being used in the same way here. For example, the from_bytes() method will take 7 bytes (7 bytes * 8 bits = 56 bits) and convert them into a 56-bit integer. Why is this useful? Bitwise operations can be executed more easily and performantly on a single int (read "long bit string") than on many individual bytes in a sequence. And we are about to use the bitwise operation XOR.

1.3.2 Encrypting and decrypting

How will the dummy data be combined with the original data that we want to encrypt? The XOR operation will serve this purpose. XOR is a logical bitwise (operates at the bit level) operation that returns true when one of its operands is true but returns false when both are true or neither is true. As you may have guessed, XOR stands for exclusive or.

In Python, the XOR operator is ^. In the context of the bits of binary numbers, XOR returns 1 for 0 ^ 1 and 1 ^ 0, but 0 for 0 ^ 0 and 1 ^ 1. If the bits of two numbers are combined using XOR, a helpful property is that the product can be recombined with either of the operands to produce the other operand:

A ^ B = C
C ^ B = A
C ^ A = B

This key insight forms the basis of one-time pad encryption. To form our product, we will simply XOR an int representing the bytes in our original str with a randomly generated int of the same bit length (as produced by random_key()). Our returned key pair will be the dummy data and the product.
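The XOR round-trip identities are easy to check directly; the values below are arbitrary stand-ins for the original data and the dummy data:

```python
a = 0b10110010       # stand-in for the "original" key material
b = 0b01101011       # stand-in for the random "dummy" data
c = a ^ b            # the "product"
print(c ^ b == a)    # True: product XOR dummy recovers the original
print(c ^ a == b)    # True: product XOR original recovers the dummy
```

Because XOR is its own inverse (x ^ y ^ y == x for any x and y), either key combined with the product yields the other key, which is exactly the property encrypt() and decrypt() rely on.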
Listing 1.16 unbreakable_encryption.py continued

def encrypt(original: str) -> Tuple[int, int]:
    original_bytes: bytes = original.encode()
    dummy: int = random_key(len(original_bytes))
    original_key: int = int.from_bytes(original_bytes, "big")
    encrypted: int = original_key ^ dummy  # XOR
    return dummy, encrypted

NOTE int.from_bytes() is being passed two arguments. The first is the bytes that we want to convert into an int. The second is the endianness of those bytes ("big"). Endianness refers to the byte ordering used to store data. Does the most significant byte come first, or does the least significant byte come first? In our case, it does not matter as long as we use the same ordering both when we encrypt and decrypt, because we are actually only manipulating the data at the individual bit level. In other situations, when you are not controlling both ends of the encoding process, the ordering can absolutely matter, so be careful!

Decryption is simply a matter of recombining the key pair we generated with encrypt(). This is achieved once again by doing an XOR operation between each and every bit in the two keys. The ultimate output must be converted back to a str. First, the int is converted to bytes using int.to_bytes(). This method requires the number of bytes to be converted from the int. To get this number, we divide the bit length by eight (the number of bits in a byte). Finally, the bytes method decode() gives us back a str.

Listing 1.17 unbreakable_encryption.py continued

def decrypt(key1: int, key2: int) -> str:
    decrypted: int = key1 ^ key2  # XOR
    temp: bytes = decrypted.to_bytes((decrypted.bit_length() + 7) // 8, "big")
    return temp.decode()

It was necessary to add 7 to the length of the decrypted data before using integer division (//) to divide by 8 to ensure that we "round up," to avoid an off-by-one error. If our one-time pad encryption truly works, we should be able to encrypt and decrypt the same Unicode string without issue.
Listing 1.18 unbreakable_encryption.py continued

if __name__ == "__main__":
    key1, key2 = encrypt("One Time Pad!")
    result: str = decrypt(key1, key2)
    print(result)

If your console outputs One Time Pad! then everything worked.

1.4 Calculating pi

The mathematically significant number pi (π or 3.14159…) can be derived using many formulas. One of the simplest is the Leibniz formula. It posits that the convergence of the following infinite series is equal to pi:

π = 4/1 - 4/3 + 4/5 - 4/7 + 4/9 - 4/11...

You will notice that the infinite series' numerator remains 4 while the denominator increases by 2, and the operation on the terms alternates between addition and subtraction.

We can model the series in a straightforward way by translating pieces of the formula into variables in a function. The numerator can be a constant 4. The denominator can be a variable that begins at 1 and is incremented by 2. The operation can be represented as either 1 or -1 based on whether we are adding or subtracting. Finally, the variable pi is used in listing 1.19 to collect the sum of the series as the for loop proceeds.

Listing 1.19 calculating_pi.py

def calculate_pi(n_terms: int) -> float:
    numerator: float = 4.0
    denominator: float = 1.0
    operation: float = 1.0
    pi: float = 0.0
    for _ in range(n_terms):
        pi += operation * (numerator / denominator)
        denominator += 2.0
        operation *= -1.0
    return pi

if __name__ == "__main__":
    print(calculate_pi(1000000))

TIP On most platforms, Python floats are 64-bit floating-point numbers (or double in C).

This function is an example of how rote conversion between formula and programmatic code can be both simple and effective in modeling or simulating an interesting concept. Rote conversion is a useful tool, but we must keep in mind that it is not necessarily the most efficient solution. Certainly, the Leibniz formula for pi can be implemented with more efficient or compact code.
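For instance, one compact variant collapses the loop into a single generator expression, using the observation that term k of the series is (-1)^k * 4/(2k + 1). This is our own sketch (the name leibniz_pi is hypothetical), not a listing from the book:

```python
def leibniz_pi(n_terms: int) -> float:
    # term k of the Leibniz series is (-1)**k * 4 / (2k + 1)
    return sum((-1) ** k * 4 / (2 * k + 1) for k in range(n_terms))

print(leibniz_pi(1000000))  # close to pi, matching calculate_pi(1000000)
```

It computes exactly the same partial sum as calculate_pi(); whether it is more readable than the explicit loop is a matter of taste.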
NOTE The more terms in the infinite series (the higher the value of n_terms when calculate_pi() is called), the more accurate the ultimate calculation of pi will be.

1.5 The Towers of Hanoi

Three vertical pegs (henceforth "towers") stand tall. We will label them A, B, and C. Doughnut-shaped discs are around tower A. The widest disc is at the bottom, and we will call it disc 1. The rest of the discs above disc 1 are labeled with increasing numerals and get progressively narrower. For instance, if we were to work with three discs, the widest disc, the one on the bottom, would be 1. The next widest disc, disc 2, would sit on top of disc 1. And finally, the narrowest disc, disc 3, would sit on top of disc 2. Our goal is to move all of the discs from tower A to tower C given the following constraints:

Only one disc can be moved at a time.
The topmost disc of any tower is the only one available for moving.
A wider disc can never be atop a narrower disc.

Figure 1.7 summarizes the problem.

Figure 1.7 The challenge is to move the three discs, one at a time, from tower A to tower C. A larger disc may never be on top of a smaller disc.

1.5.1 Modeling the towers

A stack is a data structure that is modeled on the concept of Last-In-First-Out (LIFO). The last thing put into it is the first thing that comes out of it. The two most basic operations on a stack are push and pop. A push puts a new item into a stack, whereas a pop removes and returns the last item put in. We can easily model a stack in Python using a list as a backing store.
Listing 1.20 hanoi.py

from typing import TypeVar, Generic, List

T = TypeVar('T')

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._container: List[T] = []

    def push(self, item: T) -> None:
        self._container.append(item)

    def pop(self) -> T:
        return self._container.pop()

    def __repr__(self) -> str:
        return repr(self._container)

NOTE This Stack class implements __repr__() so that we can easily explore the contents of a tower. __repr__() is what will be output when print() is applied to a Stack.

NOTE As was described in the introduction, this book utilizes type hints throughout. The import of Generic from the typing module enables Stack to be generic over a particular type in type hints. The arbitrary type T is defined in T = TypeVar('T'). T can be any type. When a type hint is later used for a Stack to solve the Hanoi problem, it is type-hinted as type Stack[int], which means T is filled in with type int. In other words, the stack is a stack of integers. If you are struggling with type hints, take a look at appendix C.

Stacks are perfect stand-ins for the towers in The Towers of Hanoi. When we want to put a disc onto a tower, we can just push it. When we want to move a disc from one tower to another, we can pop it from the first and push it onto the second. Let's define our towers as Stacks and fill the first tower with discs.

Listing 1.21 hanoi.py continued

num_discs: int = 3
tower_a: Stack[int] = Stack()
tower_b: Stack[int] = Stack()
tower_c: Stack[int] = Stack()
for i in range(1, num_discs + 1):
    tower_a.push(i)

1.5.2 Solving The Towers of Hanoi

How can The Towers of Hanoi be solved? Imagine we were only trying to move 1 disc. We would know how to do that, right? In fact, moving one disc is our base case for a recursive solution to The Towers of Hanoi. The recursive case is moving more than one disc.
Therefore, the key insight is that we essentially have two scenarios we need to codify: moving one disc (the base case) and moving more than one disc (the recursive case).

Let's look at a specific example to understand the recursive case. Say we have three discs (top, middle, and bottom) on tower A that we want to move to tower C. (It may help to sketch out the problem as you follow along.) We could first move the top disc to tower C. Then we could move the middle disc to tower B. Then we could move the top disc from tower C to tower B. Now we have the bottom disc still on tower A and the upper two discs on tower B. Essentially, we have now successfully moved two discs from one tower (A) to another tower (B). Moving the bottom disc from A to C is our base case (moving a single disc). Now we can move the two upper discs from B to C in the same procedure that we did from A to B. We move the top disc to A, the middle disc to C, and finally the top disc from A to C.

TIP In a computer science classroom, it is not uncommon to see a little model of the towers built using dowels and plastic doughnuts. You can build your own model using three pencils and three pieces of paper. It may help you visualize the solution.

In our three-disc example, we had a simple base case of moving a single disc and a recursive case of moving all of the other discs (two in this case), using the third tower temporarily. We could break the recursive case into three steps:

1. Move the upper n-1 discs from tower A to B (the temporary tower), using C as the in-between.
2. Move the single lowest disc from A to C.
3. Move the n-1 discs from tower B to C, using A as the in-between.

The amazing thing is that this recursive algorithm works not only for three discs, but for any number of discs. We will codify it as a function called hanoi() that is responsible for moving discs from one tower to another, given a third temporary tower.
Listing 1.22 hanoi.py continued

def hanoi(begin: Stack[int], end: Stack[int], temp: Stack[int], n: int) -> None:
    if n == 1:
        end.push(begin.pop())
    else:
        hanoi(begin, temp, end, n - 1)
        hanoi(begin, end, temp, 1)
        hanoi(temp, end, begin, n - 1)

After calling hanoi(), you should examine towers A, B, and C to verify that the discs were moved successfully.

Listing 1.23 hanoi.py continued

if __name__ == "__main__":
    hanoi(tower_a, tower_c, tower_b, num_discs)
    print(tower_a)
    print(tower_b)
    print(tower_c)

You will find that they were. In codifying the solution to The Towers of Hanoi, we did not necessarily need to understand every step required to move multiple discs from tower A to tower C. But we came to understand the general recursive algorithm for moving any number of discs, and we codified it, letting the computer do the rest. This is the power of formulating recursive solutions to problems: we often can think of solutions in an abstract manner without the drudgery of negotiating every individual action in our minds.

Incidentally, the hanoi() function will execute an exponential number of times as a function of the number of discs, which makes solving the problem for even 64 discs untenable. You can try it with various other numbers of discs by changing the num_discs variable. The exponentially increasing number of steps required as the number of discs increases is where the legend of The Towers of Hanoi comes from; you can read more about it in any number of sources. You may also be interested in reading more about the mathematics behind its recursive solution; see Carl Burch's explanation in "About the Towers of Hanoi," http://mng.bz/c1i2.

1.6 Real-world applications

The various techniques presented in this chapter (recursion, memoization, compression, and manipulation at the bit level) are so common in modern software development that it is impossible to imagine the world of computing without them.
Although problems can be solved without them, it is often more logical or performant to solve them with these techniques. Recursion, in particular, is at the heart of not just many algorithms but even whole programming languages. In some functional programming languages, like Scheme and Haskell, recursion takes the place of the loops used in imperative languages. It is worth remembering, though, that anything accomplishable with a recursive technique is also accomplishable with an iterative technique.

Memoization has been applied successfully to speed up the work of parsers (programs that interpret languages). It is useful in all problems where the result of a recent calculation will likely be asked for again. Another application of memoization is in language runtimes. Some language runtimes (versions of Prolog, for instance) will store the results of function calls automatically (auto-memoization), so that the function need not execute the next time the same call is made. This is similar to how the @lru_cache() decorator in fib6() worked.

Compression has made an internet-connected world constrained by bandwidth more tolerable. The bit-string technique examined in section 1.2 is usable for real-world simple data types that have a limited number of possible values, for which even a byte is overkill. The majority of compression algorithms, however, operate by finding patterns or structures within a data set that allow for repeated information to be eliminated. They are significantly more complicated than what is covered in section 1.2.

One-time pads are not practical for general encryption. They require both the encrypter and the decrypter to have possession of one of the keys (the dummy data in our example) for the original data to be reconstructed, which is cumbersome and defeats the goal of most encryption schemes (keeping keys secret).
But you may be interested to know that the name "one-time pad" comes from spies using real paper pads with dummy data on them to create encrypted communications during the Cold War. These techniques are programmatic building blocks that other algorithms are built on top of. In future chapters you will see them applied liberally.

1.7 Exercises

1 Write yet another function that solves for element n of the Fibonacci sequence, using a technique of your own design. Write unit tests that evaluate its correctness and performance relative to the other versions in this chapter.
2 You saw how the simple int type in Python can be used to represent a bit string. Write an ergonomic wrapper around int that can be used generically as a sequence of bits (make it iterable and implement __getitem__()). Reimplement CompressedGene, using the wrapper.
3 Write a solver for The Towers of Hanoi that works for any number of towers.
4 Use a one-time pad to encrypt and decrypt images.

Search problems

"Search" is such a broad term that this entire book could be called Classic Search Problems in Python. This chapter is about core search algorithms that every programmer should know. It does not claim to be comprehensive, despite the declaratory title.

2.1 DNA search

Genes are commonly represented in computer software as a sequence of the characters A, C, G, and T. Each letter represents a nucleotide, and the combination of three nucleotides is called a codon. This is illustrated in figure 2.1. A codon codes for a specific amino acid that together with other amino acids can form a protein. A classic task in bioinformatics software is to find a particular codon within a gene.

2.1.1 Storing DNA

We can represent a nucleotide as a simple IntEnum with four cases.
Listing 2.1 dna_search.py

from enum import IntEnum
from typing import Tuple, List

Nucleotide: IntEnum = IntEnum('Nucleotide', ('A', 'C', 'G', 'T'))

Nucleotide is of type IntEnum instead of just Enum, because IntEnum gives us comparison operators (<, >=, and so on) "for free." Having these operators in a data type is required in order for the search algorithms we are going to implement to be able to operate on it. Tuple and List are imported from the typing package to assist with type hints.

Figure 2.1 A nucleotide is represented by one of the letters A, C, G, and T. A codon is composed of three nucleotides, and a gene is composed of multiple codons.

Codons can be defined as a tuple of three Nucleotides. A gene may be defined as a list of Codons.

Listing 2.2 dna_search.py continued

Codon = Tuple[Nucleotide, Nucleotide, Nucleotide]  # type alias for codons
Gene = List[Codon]  # type alias for genes

NOTE Although we will later need to compare one Codon to another, we do not need to define a custom class with the < operator explicitly implemented for Codon. This is because Python has built-in support for comparisons between tuples that are composed of types that are also comparable.

Typically, genes on the internet will be in a file format that contains a giant string representing all of the nucleotides in the gene's sequence. We will define such a string for an imaginary gene and call it gene_str.

Listing 2.3 dna_search.py continued

gene_str: str = "ACGTGGCTCTCTAACGTACGTACGTACGGGGTTTATATATACCCTAGGACTCCCTTT"

We will also need a utility function to convert a str into a Gene.

Listing 2.4 dna_search.py continued

def string_to_gene(s: str) -> Gene:
    gene: Gene = []
    for i in range(0, len(s), 3):
        if (i + 2) >= len(s):  # don't run off end!
            return gene
        # initialize codon out of three nucleotides
        codon: Codon = (Nucleotide[s[i]], Nucleotide[s[i + 1]], Nucleotide[s[i + 2]])
        gene.append(codon)  # add codon to gene
    return gene

string_to_gene() continually goes through the provided str and converts its next three characters into Codons that it adds to the end of a new Gene. If it finds that there is no Nucleotide two places into the future of the current place in s that it is examining (see the if statement within the loop), then it knows it has reached the end of an incomplete gene, and it skips over those last one or two nucleotides.

string_to_gene() can be used to convert the str gene_str into a Gene.

Listing 2.5 dna_search.py continued

my_gene: Gene = string_to_gene(gene_str)

2.1.2 Linear search

One basic operation we may want to perform on a gene is to search it for a particular codon. The goal is simply to find out whether the codon exists within the gene or not. A linear search goes through every element in a search space, in the order of the original data structure, until what is sought is found or the end of the data structure is reached. In effect, a linear search is the simplest, most natural, and most obvious way to search for something. In the worst case, a linear search will require going through every element in a data structure, so it is of O(n) complexity, where n is the number of elements in the structure. This is illustrated in figure 2.2.

Figure 2.2 In the worst case of a linear search, you'll sequentially look through every element of the array.

It is trivial to define a function that performs a linear search. It simply must go through every element in a data structure and check for its equivalence to the item being sought. The following code defines such a function for a Gene and a Codon and then tries it out for my_gene and Codons called acg and gat.
Listing 2.6 dna_search.py continued

def linear_contains(gene: Gene, key_codon: Codon) -> bool:
    for codon in gene:
        if codon == key_codon:
            return True
    return False

acg: Codon = (Nucleotide.A, Nucleotide.C, Nucleotide.G)
gat: Codon = (Nucleotide.G, Nucleotide.A, Nucleotide.T)
print(linear_contains(my_gene, acg))  # True
print(linear_contains(my_gene, gat))  # False

NOTE This function is for illustrative purposes only. The Python built-in sequence types (list, tuple, range) all implement the __contains__() method, which allows us to search for a specific item in them by simply using the in operator. In fact, the in operator can be used with any type that implements __contains__(). For instance, we could search my_gene for acg and print out the result by writing print(acg in my_gene).

2.1.3 Binary search

There is a faster way to search than looking at every element, but it requires us to know something about the order of the data structure ahead of time. If we know that the structure is sorted, and we can instantly access any item within it by its index, we can perform a binary search. Based on these criteria, a sorted Python list is a perfect candidate for a binary search.

A binary search works by looking at the middle element in a sorted range of elements, comparing it to the element sought, reducing the range by half based on that comparison, and starting the process over again. Let's look at a concrete example.
Suppose we have a list of alphabetically sorted words like ["cat", "dog", "kangaroo", "llama", "rabbit", "rat", "zebra"] and we are searching for the word "rat":

1 We could determine that the middle element in this seven-word list is "llama."
2 We could determine that "rat" comes after "llama" alphabetically, so it must be in the (approximately) half of the list that comes after "llama." (If we had found "rat" in this step, we could have returned its location; if we had found that our word came before the middle word we were checking, we could be assured that it was in the half of the list before "llama.")
3 We could rerun steps 1 and 2 for the half of the list that we know "rat" is still possibly in. In effect, this half becomes our new base list. These steps continually run until "rat" is found or the range we are looking in no longer contains any elements to search, meaning that "rat" does not exist within the word list.

Figure 2.3 illustrates a binary search. Notice that it does not involve searching every element, unlike a linear search.

Figure 2.3 In the worst case of a binary search, you'll look through just lg(n) elements of the list.

A binary search continually reduces the search space by half, so it has a worst-case runtime of O(lg n). There is a sort-of catch, though. Unlike a linear search, a binary search requires a sorted data structure to search through, and sorting takes time. In fact, sorting takes O(n lg n) time for the best sorting algorithms. If we are only going to run our search once, and our original data structure is unsorted, it probably makes sense to just do a linear search. But if the search is going to be performed many times, the time cost of doing the sort is worth it, to reap the benefit of the greatly reduced time cost of each individual search.
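The three steps above can be sketched as a short function. This is my own illustration, not one of the book's listings; it adds a print() at each step so you can watch the range shrink.

```python
# A minimal sketch (not from the book) tracing the binary search steps
# described above on the seven-word list, printing each midpoint checked.
from typing import List


def binary_search_trace(words: List[str], key: str) -> bool:
    low, high = 0, len(words) - 1
    while low <= high:
        mid = (low + high) // 2
        print(f"checking '{words[mid]}'")
        if words[mid] < key:
            low = mid + 1   # key must be in the right half
        elif words[mid] > key:
            high = mid - 1  # key must be in the left half
        else:
            return True
    return False


words = ["cat", "dog", "kangaroo", "llama", "rabbit", "rat", "zebra"]
print(binary_search_trace(words, "rat"))  # checks "llama", then "rat" -> True
```

Searching for "rat" checks only two of the seven words, consistent with the O(lg n) bound discussed above.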
Writing a binary search function for a gene and a codon is not unlike writing one for any other type of data, because the Codon type can be compared to others of its type, and the Gene type is just a list.

Listing 2.7 dna_search.py continued

def binary_contains(gene: Gene, key_codon: Codon) -> bool:
    low: int = 0
    high: int = len(gene) - 1
    while low <= high:  # while there is still a search space
        mid: int = (low + high) // 2
        if gene[mid] < key_codon:
            low = mid + 1
        elif gene[mid] > key_codon:
            high = mid - 1
        else:
            return True
    return False

Let's walk through this function line by line.

low: int = 0
high: int = len(gene) - 1

We start by looking at a range that encompasses the entire list (gene).

while low <= high:

We keep searching as long as there is still a range to search within. When low is greater than high, it means that there are no longer any slots to look at within the list.

mid: int = (low + high) // 2

We calculate the middle, mid, by using integer division and the simple mean formula you learned in grade school.

if gene[mid] < key_codon:
    low = mid + 1

If the element we are looking for is after the middle element of the range we are looking at, we modify the range that we will look at during the next iteration of the loop by moving low to be one past the current middle element. This is where we halve the range for the next iteration.

elif gene[mid] > key_codon:
    high = mid - 1

Similarly, we halve in the other direction when the element we are looking for is less than the middle element.

else:
    return True

If the element in question is not less than or greater than the middle element, that means we found it! And, of course, if the loop runs out of iterations, we return False (not reproduced here), indicating that it was never found.

We can try running our function with the same gene and codon, but we must remember to sort first.
Listing 2.8 dna_search.py continued

my_sorted_gene: Gene = sorted(my_gene)
print(binary_contains(my_sorted_gene, acg))  # True
print(binary_contains(my_sorted_gene, gat))  # False

TIP You can build a performant binary search using the Python standard library's bisect module: https://docs.python.org/3/library/bisect.html.

2.1.4 A generic example

The functions linear_contains() and binary_contains() can be generalized to work with almost any Python sequence. The following generalized versions are nearly identical to the versions you saw before, with only some names and type hints changed.

NOTE There are many imported types in the following code listing. We will be reusing the file generic_search.py for many further generic search algorithms in this chapter, and this gets the imports out of the way.

NOTE Before proceeding with the book, you will need to install the typing_extensions module via either pip install typing_extensions or pip3 install typing_extensions, depending on how your Python interpreter is configured. You will need this module to access the Protocol type, which will be in the standard library in a future version of Python (as specified by PEP 544). Therefore, in a future version of Python, importing the typing_extensions module should be unnecessary, and you will be able to use from typing import Protocol instead of from typing_extensions import Protocol.

Listing 2.9 generic_search.py

from __future__ import annotations
from typing import TypeVar, Iterable, Sequence, Generic, List, Callable, Set, Deque, Dict, Any, Optional
from typing_extensions import Protocol
from heapq import heappush, heappop

T = TypeVar('T')

def linear_contains(iterable: Iterable[T], key: T) -> bool:
    for item in iterable:
        if item == key:
            return True
    return False

C = TypeVar("C", bound="Comparable")

class Comparable(Protocol):
    def __eq__(self, other: Any) -> bool:
        ...

    def __lt__(self: C, other: C) -> bool:
        ...
    def __gt__(self: C, other: C) -> bool:
        return (not self < other) and self != other

    def __le__(self: C, other: C) -> bool:
        return self < other or self == other

    def __ge__(self: C, other: C) -> bool:
        return not self < other

def binary_contains(sequence: Sequence[C], key: C) -> bool:
    low: int = 0
    high: int = len(sequence) - 1
    while low <= high:  # while there is still a search space
        mid: int = (low + high) // 2
        if sequence[mid] < key:
            low = mid + 1
        elif sequence[mid] > key:
            high = mid - 1
        else:
            return True
    return False

if __name__ == "__main__":
    print(linear_contains([1, 5, 15, 15, 15, 15, 20], 5))  # True
    print(binary_contains(["a", "d", "e", "f", "z"], "f"))  # True
    print(binary_contains(["john", "mark", "ronald", "sarah"], "sheila"))  # False

Now you can try doing searches on other types of data. These functions can be reused for almost any Python collection. That is the power of writing code generically. The only unfortunate element of this example is the convoluted hoops that had to be jumped through for Python's type hints, in the form of the Comparable class. A Comparable type is a type that implements the comparison operators (<, >=, and so on). There should be a more succinct way in future versions of Python to create a type hint for types that implement these common operators.

2.2 Maze solving

Finding a path through a maze is analogous to many common search problems in computer science. Why not literally find a path through a maze, then, to illustrate the breadth-first search, depth-first search, and A* algorithms?

Our maze will be a two-dimensional grid of Cells. A Cell is an enum with str values where " " will represent an empty space and "X" will represent a blocked space. There are also other cases for illustrative purposes when printing a maze.
Listing 2.10 maze.py

from enum import Enum
from typing import List, NamedTuple, Callable, Optional
import random
from math import sqrt
from generic_search import dfs, bfs, node_to_path, astar, Node

class Cell(str, Enum):
    EMPTY = " "
    BLOCKED = "X"
    START = "S"
    GOAL = "G"
    PATH = "*"

Once again, we are getting a large number of imports out of the way. Note that the last import (from generic_search) is of symbols we have not yet defined. It is included here for convenience, but you may want to comment it out until you need it.

We will need a way to refer to an individual location in the maze. This will simply be a NamedTuple with properties representing the row and column of the location in question.

Listing 2.11 maze.py continued

class MazeLocation(NamedTuple):
    row: int
    column: int

2.2.1 Generating a random maze

Our Maze class will internally keep track of a grid (a list of lists) representing its state. It will also have instance variables for the number of rows, number of columns, start location, and goal location. Its grid will be randomly filled with blocked cells.

The maze that is generated should be fairly sparse so that there is almost always a path from a given starting location to a given goal location. (This is for testing our algorithms, after all.) We'll let the caller of a new maze decide on the exact sparseness, but we will provide a default value of 20% blocked. When a random number beats the threshold of the sparseness parameter in question, we will simply replace an empty space with a wall. If we do this for every possible place in the maze, statistically, the sparseness of the maze as a whole will approximate the sparseness parameter supplied.
Listing 2.12 maze.py continued

class Maze:
    def __init__(self, rows: int = 10, columns: int = 10, sparseness: float = 0.2, start: MazeLocation = MazeLocation(0, 0), goal: MazeLocation = MazeLocation(9, 9)) -> None:
        # initialize basic instance variables
        self._rows: int = rows
        self._columns: int = columns
        self.start: MazeLocation = start
        self.goal: MazeLocation = goal
        # fill the grid with empty cells
        self._grid: List[List[Cell]] = [[Cell.EMPTY for c in range(columns)] for r in range(rows)]
        # populate the grid with blocked cells
        self._randomly_fill(rows, columns, sparseness)
        # fill the start and goal locations in
        self._grid[start.row][start.column] = Cell.START
        self._grid[goal.row][goal.column] = Cell.GOAL

    def _randomly_fill(self, rows: int, columns: int, sparseness: float):
        for row in range(rows):
            for column in range(columns):
                if random.uniform(0, 1.0) < sparseness:
                    self._grid[row][column] = Cell.BLOCKED

Now that we have a maze, we also want a way to print it succinctly to the console. We want its characters to be close together so it looks like a real maze.

Listing 2.13 maze.py continued

    # return a nicely formatted version of the maze for printing
    def __str__(self) -> str:
        output: str = ""
        for row in self._grid:
            output += "".join([c.value for c in row]) + "\n"
        return output

Go ahead and test these maze functions.

maze: Maze = Maze()
print(maze)

2.2.2 Miscellaneous maze minutiae

It will be handy later to have a function that checks whether we have reached our goal during the search. In other words, we want to check whether a particular MazeLocation that the search has reached is the goal. We can add a method to Maze.

Listing 2.14 maze.py continued

    def goal_test(self, ml: MazeLocation) -> bool:
        return ml == self.goal

How can we move within our mazes? Let's say that we can move horizontally and vertically one space at a time from a given space in the maze.
Using these criteria, a successors() function can find the possible next locations from a given MazeLocation. However, the successors() function will differ for every Maze, because every Maze has a different size and set of walls. Therefore, we will define it as a method on Maze.

Listing 2.15 maze.py continued

    def successors(self, ml: MazeLocation) -> List[MazeLocation]:
        locations: List[MazeLocation] = []
        if ml.row + 1 < self._rows and self._grid[ml.row + 1][ml.column] != Cell.BLOCKED:
            locations.append(MazeLocation(ml.row + 1, ml.column))
        if ml.row - 1 >= 0 and self._grid[ml.row - 1][ml.column] != Cell.BLOCKED:
            locations.append(MazeLocation(ml.row - 1, ml.column))
        if ml.column + 1 < self._columns and self._grid[ml.row][ml.column + 1] != Cell.BLOCKED:
            locations.append(MazeLocation(ml.row, ml.column + 1))
        if ml.column - 1 >= 0 and self._grid[ml.row][ml.column - 1] != Cell.BLOCKED:
            locations.append(MazeLocation(ml.row, ml.column - 1))
        return locations

successors() simply checks above, below, to the right, and to the left of a MazeLocation in a Maze to see if it can find empty spaces that can be gone to from that location. It also avoids checking locations beyond the edges of the Maze. It puts every possible MazeLocation that it finds into a list that it ultimately returns to the caller.

2.2.3 Depth-first search

A depth-first search (DFS) is what its name suggests: a search that goes as deeply as it can before backtracking to its last decision point if it reaches a dead end. We'll implement a generic depth-first search that can solve our maze problem. It will also be reusable for other problems. Figure 2.4 illustrates an in-progress depth-first search of a maze.

STACKS

The depth-first search algorithm relies on a data structure known as a stack. (If you read about stacks in chapter 1, feel free to skip this section.) A stack is a data structure that operates under the Last-In-First-Out (LIFO) principle. Imagine a stack of papers.
The last paper placed on top of the stack is the first paper pulled off the stack. It is common for a stack to be implemented on top of a more primitive data structure like a list. We will implement our stack on top of Python's list type.

Stacks generally have at least two operations:

• push()—Places an item on top of the stack
• pop()—Removes the item from the top of the stack and returns it

We will implement both of these, as well as an empty property to check if the stack has any more items in it. We will add the code for the stack to the generic_search.py file that we were working with earlier in the chapter. We already have completed all of the necessary imports.

Figure 2.4 In depth-first search, the search proceeds along a continuously deeper path until it hits a barrier and must backtrack to the last decision point.

Listing 2.16 generic_search.py continued

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._container: List[T] = []

    @property
    def empty(self) -> bool:
        return not self._container  # not is true for empty container

    def push(self, item: T) -> None:
        self._container.append(item)

    def pop(self) -> T:
        return self._container.pop()  # LIFO

    def __repr__(self) -> str:
        return repr(self._container)

Note that implementing a stack using a Python list is as simple as always appending items onto its right end and always removing items from its extreme right end. The pop() method on list will fail if there are no longer any items in the list, so pop() will fail on a Stack if it is empty as well.

THE DFS ALGORITHM

We will need one more little tidbit before we can get to implementing DFS. We need a Node class that we will use to keep track of how we got from one state to another state (or from one place to another place) as we search. You can think of a Node as a wrapper around a state. In the case of our maze-solving problem, those states are of type MazeLocation.
We'll call the Node that a state came from its parent. We will also define our Node class as having cost and heuristic properties and with __lt__() implemented, so we can reuse it later in the A* algorithm.

Listing 2.17 generic_search.py continued

class Node(Generic[T]):
    def __init__(self, state: T, parent: Optional[Node], cost: float = 0.0, heuristic: float = 0.0) -> None:
        self.state: T = state
        self.parent: Optional[Node] = parent
        self.cost: float = cost
        self.heuristic: float = heuristic

    def __lt__(self, other: Node) -> bool:
        return (self.cost + self.heuristic) < (other.cost + other.heuristic)

TIP The Optional type indicates that a value of a parameterized type may be referenced by the variable, or the variable may reference None.

TIP At the top of the file, the from __future__ import annotations allows Node to reference itself in the type hints of its methods. Without it, we would need to put the type hint in quotes as a string (for example, 'Node'). In future versions of Python, importing annotations will be unnecessary. See PEP 563, "Postponed Evaluation of Annotations," for more information: http://mng.bz/pgzR.

An in-progress depth-first search needs to keep track of two data structures: the stack of states (or "places") that we are considering searching, which we will call the frontier; and the set of states that we have already searched, which we will call explored. As long as there are more states to visit in the frontier, DFS will keep checking whether they are the goal (if a state is the goal, DFS will stop and return it) and adding their successors to the frontier. It will also mark each state that has already been searched as explored, so that the search does not get caught in a circle, reaching states that have previously visited states as successors. If the frontier is empty, it means there is nowhere left to search.
Listing 2.18 generic_search.py continued

def dfs(initial: T, goal_test: Callable[[T], bool], successors: Callable[[T], List[T]]) -> Optional[Node[T]]:
    # frontier is where we've yet to go
    frontier: Stack[Node[T]] = Stack()
    frontier.push(Node(initial, None))
    # explored is where we've been
    explored: Set[T] = {initial}
    # keep going while there is more to explore
    while not frontier.empty:
        current_node: Node[T] = frontier.pop()
        current_state: T = current_node.state
        # if we found the goal, we're done
        if goal_test(current_state):
            return current_node
        # check where we can go next and haven't explored
        for child in successors(current_state):
            if child in explored:  # skip children we already explored
                continue
            explored.add(child)
            frontier.push(Node(child, current_node))
    return None  # went through everything and never found goal

If dfs() is successful, it returns the Node encapsulating the goal state. The path from the start to the goal can be reconstructed by working backward from this Node and its priors using the parent property.

Listing 2.19 generic_search.py continued

def node_to_path(node: Node[T]) -> List[T]:
    path: List[T] = [node.state]
    # work backwards from end to front
    while node.parent is not None:
        node = node.parent
        path.append(node.state)
    path.reverse()
    return path

For display purposes, it will be useful to mark up the maze with the successful path, the start state, and the goal state. It will also be useful to be able to remove a path so that we can try different search algorithms on the same maze. The following two methods should be added to the Maze class in maze.py.
Listing 2.20 maze.py continued

    def mark(self, path: List[MazeLocation]):
        for maze_location in path:
            self._grid[maze_location.row][maze_location.column] = Cell.PATH
        self._grid[self.start.row][self.start.column] = Cell.START
        self._grid[self.goal.row][self.goal.column] = Cell.GOAL

    def clear(self, path: List[MazeLocation]):
        for maze_location in path:
            self._grid[maze_location.row][maze_location.column] = Cell.EMPTY
        self._grid[self.start.row][self.start.column] = Cell.START
        self._grid[self.goal.row][self.goal.column] = Cell.GOAL

It has been a long journey, but we are finally ready to solve the maze.

Listing 2.21 maze.py continued

if __name__ == "__main__":
    # Test DFS
    m: Maze = Maze()
    print(m)
    solution1: Optional[Node[MazeLocation]] = dfs(m.start, m.goal_test, m.successors)
    if solution1 is None:
        print("No solution found using depth-first search!")
    else:
        path1: List[MazeLocation] = node_to_path(solution1)
        m.mark(path1)
        print(m)
        m.clear(path1)

A successful solution will look something like this (your output will differ, because each maze is randomly generated):

S****X X X ***** X* XX******X X* X**X X ***** * X *X *G

The asterisks represent the path that our depth-first search function found from the start to the goal. Remember, because each maze is randomly generated, not every maze has a solution.

2.2.4 Breadth-first search

You may notice that the solution paths to the mazes found by depth-first traversal seem unnatural. They are usually not the shortest paths. Breadth-first search (BFS) always finds the shortest path by systematically looking one layer of nodes farther away from the start state in each iteration of the search. There are particular problems in which a depth-first search is likely to find a solution more quickly than a breadth-first search, and vice versa. Therefore, choosing between the two is sometimes a trade-off between the possibility of finding a solution quickly and the certainty of finding the shortest path to the goal (if one exists).
Figure 2.5 illustrates an in-progress breadth-first search of a maze.

Figure 2.5 In a breadth-first search, the closest elements to the starting location are searched first.

To understand why a depth-first search sometimes returns a result faster than a breadth-first search, imagine looking for a marking on a particular layer of an onion. A searcher using a depth-first strategy may plunge a knife into the center of the onion and haphazardly examine the chunks cut out. If the marked layer happens to be near the chunk cut out, there is a chance that the searcher will find it more quickly than another searcher using a breadth-first strategy, who painstakingly peels the onion one layer at a time.

To get a better picture of why breadth-first search always finds the shortest solution path where one exists, consider trying to find the path with the fewest number of stops between Boston and New York by train. If you keep going in the same direction and backtracking when you hit a dead end (as in depth-first search), you may first find a route all the way to Seattle before it connects back to New York. However, in a breadth-first search, you will first check all of the stations one stop away from Boston. Then you will check all of the stations two stops away from Boston. Then you will check all of the stations three stops away from Boston. This will keep going until you find New York. Therefore, when you do find New York, you will know you have found the route with the fewest stops, because you already checked all of the stations that are fewer stops away from Boston, and none of them was New York.

QUEUES

To implement BFS, a data structure known as a queue is required. Whereas a stack is LIFO, a queue is FIFO (First-In-First-Out). A queue is like a line to use a restroom. The first person who got in line goes to the restroom first.
At a minimum, a queue has the same push() and pop() methods as a stack. In fact, our implementation for Queue (backed by a Python deque) is almost identical to our implementation of Stack, with the only changes being the removal of elements from the left end of the _container instead of the right end and the switch from a list to a deque. (I use the word "left" here to mean the beginning of the backing store.) The elements on the left end are the oldest elements still in the deque (in terms of arrival time), so they are the first elements popped.

Listing 2.22 generic_search.py continued

class Queue(Generic[T]):
    def __init__(self) -> None:
        self._container: Deque[T] = Deque()

    @property
    def empty(self) -> bool:
        return not self._container  # not is true for empty container

    def push(self, item: T) -> None:
        self._container.append(item)

    def pop(self) -> T:
        return self._container.popleft()  # FIFO

    def __repr__(self) -> str:
        return repr(self._container)

TIP Why did the implementation of Queue use a deque as its backing store, whereas the implementation of Stack used a list as its backing store? It has to do with where we pop. In a stack, we push to the right and pop from the right. In a queue, we push to the right as well, but we pop from the left. The Python list data structure has efficient pops from the right but not from the left. A deque can efficiently pop from either side. As a result, there is a built-in method on deque called popleft() but no equivalent method on list. You could certainly find other ways to use a list as the backing store for a queue, but they would be less efficient. Popping from the left on a deque is an O(1) operation, whereas it is an O(n) operation on a list. In the case of the list, after popping from the left, every subsequent element must be moved one position to the left, making it inefficient.
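The difference described in the tip can be seen directly at the interactive prompt. This small demonstration is mine, not one of the book's listings:

```python
# A small demonstration (not from the book) of why deque suits a queue:
# deque supports popleft(), while a list must use pop(0), which shifts
# every remaining element one position to the left.
from collections import deque

d = deque([1, 2, 3])
lst = [1, 2, 3]

print(d.popleft())              # 1 -- O(1): no elements are shifted
print(lst.pop(0))               # 1 -- O(n): remaining elements shift left
print(hasattr(lst, "popleft"))  # False -- list has no popleft() method
```

Both containers end up as [2, 3], but only the deque removed its front element without touching the rest.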
THE BFS ALGORITHM

Amazingly, the algorithm for a breadth-first search is identical to the algorithm for a depth-first search, with the frontier changed from a stack to a queue. Changing the frontier from a stack to a queue changes the order in which states are searched and ensures that the states closest to the start state are searched first.

Listing 2.23 generic_search.py continued

def bfs(initial: T, goal_test: Callable[[T], bool],
        successors: Callable[[T], List[T]]) -> Optional[Node[T]]:
    # frontier is where we've yet to go
    frontier: Queue[Node[T]] = Queue()
    frontier.push(Node(initial, None))
    # explored is where we've been
    explored: Set[T] = {initial}

    # keep going while there is more to explore
    while not frontier.empty:
        current_node: Node[T] = frontier.pop()
        current_state: T = current_node.state
        # if we found the goal, we're done
        if goal_test(current_state):
            return current_node
        # check where we can go next and haven't explored
        for child in successors(current_state):
            if child in explored:  # skip children we already explored
                continue
            explored.add(child)
            frontier.push(Node(child, current_node))
    return None  # went through everything and never found goal

If you try running bfs(), you will see that it always finds the shortest solution to the maze in question. The following trial is added just after the previous one in the if __name__ == "__main__": section of the file, so results can be compared on the same maze.

Listing 2.24 maze.py continued

# Test BFS
solution2: Optional[Node[MazeLocation]] = bfs(m.start, m.goal_test, m.successors)
if solution2 is None:
    print("No solution found using breadth-first search!")
else:
    path2: List[MazeLocation] = node_to_path(solution2)
    m.mark(path2)
    print(m)
    m.clear(path2)

It is amazing that you can keep an algorithm the same and just change the data structure that it accesses and get radically different results. The following is the result of calling bfs() on the same maze that we earlier called dfs() on.
[The maze output printed here is a grid in which S marks the start, G the goal, X the barriers, and asterisks the solution path; the exact layout was lost in extraction.]

Notice how the path marked by the asterisks is more direct from start to goal than in the prior example.

2.2.5 A* search

It can be very time-consuming to peel back an onion, layer by layer, as a breadth-first search does. Like a BFS, an A* search aims to find the shortest path from start state to goal state. Unlike the preceding BFS implementation, an A* search uses a combination of a cost function and a heuristic function to focus its search on pathways most likely to get to the goal quickly.

The cost function, g(n), examines the cost to get to a particular state. In the case of our maze, this would be how many previous steps we had to go through to get to the state in question. The heuristic function, h(n), gives an estimate of the cost to get from the state in question to the goal state. It can be proved that if h(n) is an admissible heuristic, then the final path found will be optimal. An admissible heuristic is one that never overestimates the cost to reach the goal. On a two-dimensional plane, one example is a straight-line distance heuristic, because a straight line is always the shortest path.1

The total cost for any state being considered is f(n), which is simply the combination of g(n) and h(n). In fact, f(n) = g(n) + h(n). When choosing the next state to explore from the frontier, an A* search picks the one with the lowest f(n). This is how it distinguishes itself from BFS and DFS.

PRIORITY QUEUES

To pick the state on the frontier with the lowest f(n), an A* search uses a priority queue as the data structure for its frontier. A priority queue keeps its elements in an internal order, such that the first element popped out is always the highest-priority element. (In our case, the highest-priority item is the one with the lowest f(n).) Usually this means the internal use of a binary heap, which results in O(lg n) pushes and O(lg n) pops.
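The way a binary heap orders elements by f(n) can be sketched with Python's heapq module. The SimpleNode class below is a simplified, hypothetical stand-in for the book's Node, keeping only the cost and heuristic properties:

```python
from heapq import heappush, heappop
from typing import List

class SimpleNode:
    # Simplified stand-in for the book's Node: only cost and heuristic
    def __init__(self, name: str, cost: float, heuristic: float) -> None:
        self.name = name
        self.cost = cost
        self.heuristic = heuristic

    def __lt__(self, other: "SimpleNode") -> bool:
        # Lower f(n) = cost + heuristic means higher priority
        return (self.cost + self.heuristic) < (other.cost + other.heuristic)

heap: List[SimpleNode] = []
heappush(heap, SimpleNode("a", 3, 4))  # f(a) = 7
heappush(heap, SimpleNode("b", 1, 2))  # f(b) = 3
heappush(heap, SimpleNode("c", 2, 3))  # f(c) = 5
print(heappop(heap).name)  # b -- the lowest f(n) comes out first
```

Repeated pops would return the remaining nodes in ascending f(n) order (c, then a), which is exactly the order in which an A* search would want to explore them.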
Python’s standard library contains heappush() and heappop() functions that will take a list and maintain it as a binary heap. We can implement a priority queue by building a thin wrapper around these standard library functions. Our PriorityQueue class will be similar to our Stack and Queue classes, with the push() and pop() methods modified to use heappush() and heappop().

1 For more information on heuristics, see Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition (Pearson, 2010), page 94.

Listing 2.25 generic_search.py continued

class PriorityQueue(Generic[T]):
    def __init__(self) -> None:
        self._container: List[T] = []

    @property
    def empty(self) -> bool:
        return not self._container  # not is true for empty container

    def push(self, item: T) -> None:
        heappush(self._container, item)  # in by priority

    def pop(self) -> T:
        return heappop(self._container)  # out by priority

    def __repr__(self) -> str:
        return repr(self._container)

To determine the priority of a particular element versus another of its kind, heappush() and heappop() compare them by using the < operator. This is why we needed to implement __lt__() on Node earlier. One Node is compared to another by looking at its respective f(n), which is simply the sum of the properties cost and heuristic.

HEURISTICS

A heuristic is an intuition about the way to solve a problem.2 In the case of maze solving, a heuristic aims to choose the best maze location to search next, in the quest to get to the goal. In other words, it is an educated guess about which nodes on the frontier are closest to the goal. As was mentioned previously, if a heuristic used with an A* search produces an accurate relative result and is admissible (never overestimates the distance), then A* will deliver the shortest path. Heuristics that calculate smaller values end up leading to a search through