Forging Python
Best practices and life lessons
developing Python.
Miki Tebeka
This book is for sale at
This version was published on 2019-01-14

This is a Leanpub book. Leanpub empowers authors and
publishers with the Lean Publishing process. Lean
Publishing is the act of publishing an in-progress ebook
using lightweight tools and many iterations to get reader
feedback, pivot until you have the right book and build
traction once you do.

This work is licensed under a Creative Commons
Attribution-NonCommercial 4.0 International License

Foreword
Dedication
Introduction
Writing Good Code
Which Python?
IDEs and Editors
Project Structure
Managing Dependencies
Storage
Testing
Configuration
Debugging
Deployment
Monitoring & Alerting
Security
Going Faster
Process
Time Management
Asking Questions
Thanks
Contributors

Foreword
There is a saying: “I’ve learned a lot from my teachers, more
from my peers and most from my students.”
I’ve been learning from teachers and peers for a long time, and
love the informal round table talks as a form of education.
I try to implement this method of teaching at my company
353solutions¹ and people really like it. As a bonus I’m learning
so much from my students.
Terry Pratchett said “Writing is the most fun you can have by
yourself.” I did have fun writing this book, not only from the
writing process itself but also from the discussions with the
people who helped. I am grateful to anyone who contributed
and taught me along the way.
I hope this book will inspire you to come out and talk to people
as a way of learning. You will get many different perspectives
on the problems you’re facing, and as Alan Kay said: “A change
in perspective is worth 80 IQ points.”
This book is open source, feel free to head over to https:
// and submit bugs, offer
ideas and ask questions. I will do my best to improve this book
according to your suggestions.
Happy Hacking,
Miki Tebeka, April 2018

To Adi & Shira.
Who are young and their stars shine ever so brightly.

Being honest may not get you many friends but it’ll
always get you the right ones.
- John Lennon
Youngstar: Hey Graybeard, tell our readers a bit about yourself.
Graybeard: Why don’t you introduce me and I’ll introduce you.
Youngstar: Great idea. Let’s see… You’ve been around the IT
industry since punch cards, you’re also the best proof that you
can teach an old dog new tricks. I have no idea how you find
time to learn all the cool stuff you know. You’re smart and
usually pretty quiet until you start talking about technology.
You’re currently not doing much work but still manage to
earn a lot. Oh - and you have a quirky sense of humor.
Graybeard: Cute, I’m not that old though. About you…
You’re somewhat new to IT, finished college a few years back.
You’re very bright and motivated and you sold your company
not long ago for way too much money. You also like to learn
and are one of the few people who get my humor. Oh - and you’re
a great example that a woman can make it in high tech.
Youngstar: Thanks. And yeah - I’m good at pretending to like
your humor.
Graybeard: At least you try, my wife doesn’t even bother.



Youngstar: Oh, and we’re fictional characters.
Graybeard: We are?
Youngstar: Don’t pretend you don’t know. How does that
make you feel?
Graybeard: Really? This is not that kind of book.
Youngstar: Can you recall how we met?
Graybeard: I think it was just when I was leaving that
company to start freelancing. And you’d just arrived, still wet
behind the ears.
Youngstar: Yeah, I think we had about a month together
before you left. Man… those were big shoes to fill!
Graybeard: I hope the smell wasn’t that bad.
Youngstar: It was OK, I killed most of the fungus. Can you
tell the readers about this book?
Graybeard: After I left, we decided to meet about once a week
in “The Forge”.
Youngstar: “The Forge” is a great pub just down the road.
Graybeard: Thanks for the closed captioning. And yes - it’s a
great pub. We were geeking out regularly and I was kinda
mentoring you when you started that startup doing that
online thingie.
Youngstar: That was both great help and a lot of fun.
Graybeard: Yeah, and we keep meeting about once a week.
But it has been less fun since you made all that money selling
your company and became a snob.
Youngstar: I truly hope you’re joking. Also you got some of
that money, if you recall you got some equity for all the advice
you gave.



Graybeard: I’m joking. Money didn’t spoil you, and once
you’re out of this big company we might hack together on
a new one.
Youngstar: Anything else our readers need to know?
Graybeard: The meetings we had were around Python. But
I think most of the things we talked about apply to other
technologies as well.
Youngstar: I agree. Well, that’s about all the time we have
for the introduction. The attention span of the average reader
nowadays is pretty short. We hope you’ll have as much fun
reading the book as we had in those meetings.
Graybeard: Cheers!

Writing Good Code
Always code as if the guy who ends up maintaining
your code will be a violent psychopath who knows
where you live.
- Bill Mitchell
Youngstar: Your code is always easy to read and maintain.
How do you do it?
Graybeard: Thanks! It took me a lot of time and practice to
get there. And I’m still improving.
Youngstar: That’s a long journey, I don’t have so much time.
Can you share some of the highlights?
Graybeard: Will do, but you need to keep improving.
Youngstar: Yeah, yeah - I’ll “sharpen my axe”.
Graybeard: Good girl! The main theme is simplicity.
Youngstar: Like in KISS²?
Graybeard: Somewhat. As developers, we spend most of our
time reading code, not writing it.
Youngstar: Which means it needs to be readable.
Graybeard: Exactly.
Youngstar: OK, so how do I write readable code?
²Keep it simple, Stupid.

Graybeard: By rewriting. I see the first iterations of code I
write as sketches³.
Youngstar: How do I find time to write several iterations of code?
Graybeard: As Fred Brooks said: “plan to throw one away;
you will, anyhow.”
Code rewrites will happen, you need to allocate time for them.
Youngstar: Is this from The Mythical Man Month⁴? That’s an
old book.
Graybeard: It’s old but about people, and people haven’t
changed that much since it was written.
Youngstar: We haven’t changed a lot in the last 10,000 years.
Back on track, what else will help me write good code?
Graybeard: Reading good code.
Youngstar: Where will I find that? I know where to find bad
code - it’s everywhere.
Graybeard: Not everywhere. There are a few places where
you can see amazing code. For example, almost everything
written by Peter Norvig.
Youngstar: Yes, I’ve seen his spell checker⁵, it’s awesome!
Graybeard: It is. There’s also some good code and advice in
the AOSA book⁶.
Youngstar: Oh, I’ve read some chapters. The one on Berkeley DB⁷
was good. I’ll keep on reading this book.

Graybeard: Yup, and along the way you’ll find people to
follow and read their code. You might even find a good mentor.
Youngstar: That I have. Though he’s getting old.
Graybeard: Like wine, I get better with age.
Youngstar: You keep telling yourself that. Anything else
about writing good code?
Graybeard: Read bad code.
Youngstar: Learn from other people’s mistakes?
Graybeard: Yes, but also look out for things you do. From
time to time I go and read “How to Write Unmaintainable
Code”⁸ and try to see if I do anything shown there in my code.
Youngstar: OK, will pay it a visit. What else?
Graybeard: What code does not have any bugs?
Youngstar: Eh… none?
Graybeard: Exactly!
Youngstar: You lost me there grandpa.
Graybeard: The code you don’t write, or code that you delete.
Youngstar: Oh. It’s also the fastest.
Graybeard: Exactly. In a way code is our enemy⁹, we’d like
to have less of it.
Youngstar: Can you give me an example?
Graybeard: Sure. Assume you’re asked to process some data
in Excel files. This will require you to install an external
library to read Excel (such as xlrd¹⁰). However if you ask them
to send over the files in CSV format - there’s already a csv¹¹
module in Python. No need to install and maintain third-party
libraries.
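A minimal sketch of the CSV route, using only the standard library csv module. The column names and sample data are made up for illustration; in practice the input would be the file you were sent.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file.
SAMPLE = """name,amount
alice,10
bob,32
"""


def total_amount(fileobj):
    """Sum the 'amount' column of an open CSV file."""
    reader = csv.DictReader(fileobj)  # uses the first row as column names
    return sum(int(row["amount"]) for row in reader)


if __name__ == "__main__":
    print(total_amount(io.StringIO(SAMPLE)))
```

With a real file you would pass `open("data.csv", newline="")` instead of the in-memory sample; nothing needs to be installed either way.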
Youngstar: I see.
Graybeard: Also, many times due to specification changes you’ll have code that does nothing. Make sure to delete it. One
of my most productive days was deleting a few thousand lines
of unused code.
Youngstar: How did that happen?
Graybeard: Specification changes, libraries came about that
did the same work …
Youngstar: I’m beginning to see what you mean by “code is
our enemy”. What else?
Graybeard: Keep your functions short and with a small
number of parameters. A good rule of thumb is no more than
forty lines of code per function.
Youngstar: Forty? Doesn’t seem like much.
Graybeard: It’s not a law of nature, but it’ll make your code
nicer. It’ll make you think in small pieces of code which are
easier to understand and maintain.
Youngstar: Also avoid globals?
Graybeard: Yup. I like functional programming since it’s
easier to reason about. However you can’t avoid state, no
matter how hard you try¹².
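One way to picture the difference, as a small sketch with invented names and numbers: the first function reads a module-level global, so its result depends on state you cannot see at the call site; the second takes everything it needs as parameters.

```python
# Hypothetical module-level state, for illustration only.
_tax_rate = 0.25


def price_with_tax_global(price):
    """Depends on hidden state: the answer changes if _tax_rate changes."""
    return price * (1 + _tax_rate)


def price_with_tax(price, rate):
    """Same computation, but every input is visible in the signature."""
    return price * (1 + rate)


if __name__ == "__main__":
    print(price_with_tax_global(100))
    print(price_with_tax(100, 0.25))
```

The second version is easier to test because a test can pass any rate without touching shared state.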
Youngstar: Sometimes TDD¹³ helps with that.

Graybeard: Yes, especially when you start out. It forces you
to write small pieces of code that are easy to test. However
Google for “TDD is dead”¹⁴ for some interesting discussion
about TDD.
Youngstar: OK. Any more?
Graybeard: Did I tell you that old linguistics joke?
Youngstar: Old and linguistics? Must be a good one - do tell.
Graybeard: I’ll make it brief.
During the cold war the US created an automatic system
for translating from Russian to English. When the system
was ready they tested it by giving it an English sentence to
translate to Russian and back. The input was “The spirit is
willing but the flesh is weak” and the output was “The vodka
is good but the meat is rotten.”
Youngstar: Ha! Not that bad.
Graybeard: The secret is starting with low expectations.
Youngstar: OK, and how is this related to what we’re talking about?
Graybeard: The idea is that every language has a different
way of saying the same thing. In Python we call it “pythonic”.
Youngstar: I’ve heard that term before. Mostly with reference
to the Zen of Python ¹⁵.
Graybeard: Good old Tim Peters, he is someone to learn from.
Youngstar: So learn how to speak the language?

Graybeard: Yes. A lot of people when they start write Java
in Python, C in Python etc… But you need to learn how to
properly speak the language.
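As an illustration of “C in Python” versus speaking the language, here is a small sketch. The function names and sample data are invented for this example; both functions produce the same result.

```python
names = ["vim", "emacs", "pycharm"]  # made-up sample data


def numbered_c_style(items):
    """C written in Python: manual index bookkeeping."""
    result = []
    i = 0
    while i < len(items):
        result.append(str(i) + ": " + items[i])
        i += 1
    return result


def numbered_pythonic(items):
    """The same thing said in Python: enumerate and a list comprehension."""
    return [f"{i}: {item}" for i, item in enumerate(items)]


if __name__ == "__main__":
    print(numbered_c_style(names))
    print(numbered_pythonic(names))
```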
Youngstar: OK, will do. Any other advice?
Graybeard: The most important thing is to have a good
mental model of what you do. You’ll hear people talking about
building an ontology¹⁶, which means figuring out how to talk
about things.
Youngstar: The “two hard things…”¹⁷?
Graybeard: Naming is important, especially in Python which
is dynamically typed.
Youngstar: It’s also hard to get right.
Graybeard: Yeah, it usually takes me a couple of iterations
until I get names right. Generic names like
“object”, “other”, … are a red flag.
But back to ontology, it’s important to define what “things”
are. At a place I worked we got a bug report that we count
unique users wrong. The code seemed OK so my boss went to
talk to people. Turned out we had four different definitions of
“unique users” in the company.
Youngstar: Ouch. I see what you mean - it starts before you write code.
Graybeard: Sometimes things emerge as you write the code,
then you need to revise your model.
Youngstar: OK, will do. Anything else?
Graybeard: There are many rules to follow - DRY¹⁸, SPOT¹⁹,
minimizing coupling … You’ll find them as you go.
¹⁸Do not repeat yourself
¹⁹Single point of truth
Youngstar: Any reference?
Graybeard: There’s a good summary in “The Art of Unix
Programming”²⁰, and many other²¹ places.
One trick you can do is see if you can understand your code
without the comments²².
Youngstar: OK. I’ll practice and read. More beer?
Graybeard: You keep asking these rhetorical questions.

• Have a good mental model
• Aim for readability
• Don’t stop writing the first time the code works
• Read other people’s code
• Find a mentor
• Learn how to speak the language


Which Python?
Gentlemen, choose your weapons.
- A Night in Casablanca
Youngstar: I’ve been thinking of using PyPy for my new
project, I heard it’s super fast.
Graybeard: Before we get into that, let’s take a step back.
Why use Python?
Youngstar: Seriously? Coming from you?
Graybeard: Programming languages are tools, not religion
like some people tend to make them.
Youngstar: And if all you have is a hammer…
Graybeard: Exactly. You have some experience with other languages.
Youngstar: Mainly thanks to you.
Graybeard: So again, why Python?
Youngstar: I’m most productive with Python. Going from
zero to working is fastest.
Graybeard: OK, so speed of development - which is important in a startup. What else?
Youngstar: There are many great packages I can use.
Graybeard: Yes, a good ecosystem. Audrey Tang said that
“perl5 is just syntax; CPAN is the language”. I believe this is
true for Python as well.

Youngstar: CPAN is Perl’s PyPI²³?
Graybeard: Yes. What other reasons do you have for choosing Python?
Youngstar: It’s open source?
Graybeard: And why is that a good thing?
Youngstar: It means nobody can take it away from me. And
worst case, I can fix bugs in Python before an official release.
Graybeard: Yup. Gimme one more.
Youngstar: Oh, the community is great. People are usually
nice and helpful, and there are a lot of articles and videos out there.
Graybeard: Right. Now let’s try to think of places where you
won’t use Python, it’ll help clarify some things.
Youngstar: Embedded?
Graybeard: You mean small devices or real time requirements?
Youngstar: I guess both.
Graybeard: Yeah, it’s hard to fit Python on small devices.
However it’s possible and MicroPython²⁴ does a good job.
Youngstar: I’ve never heard about MicroPython, I’ll take a look.
Graybeard: As for real time - most garbage collected languages don’t fit the bill. Anything else Python’s not good for?
Youngstar: I guess if you need a lot of formal checking of
your system.

Graybeard: Yeah. This leads me to what I call “the cost of error”
which has implications in many areas both in development
and in business. For example, Jane Street²⁵ is a trading company that uses OCaml²⁶ - they claim it helps them make sure
their code is correct.
Youngstar: I guess that in trading systems you feel the pain
of bugs right away.
Graybeard: Yeah, ask someone from Knight Capital²⁷ once.
On the other hand, I worked in an HFT²⁸ firm once and we
used Python and made money.
Youngstar: Yeah, yeah - we all heard your war stories many times.
Graybeard: Be nice to your elders! Anything else?
Youngstar: I can’t think of anything else - do tell.
Graybeard: Hiring is one.
Youngstar: You mean finding programmers?
Graybeard: Yes, try to recruit some good Haskell²⁹ programmers sometime.
Youngstar: Try recruiting good programmers in any language.
Graybeard: Right. Remind me what your startup is all about.
Youngstar: It’s a backend thingie with REST API.

²⁸High Frequency Trading

Graybeard: Seriously? This is almost as bad as “It doesn’t
work!” bug reports. However it’ll do for now. Looks like
Python is a good fit for you.
Youngstar: What a surprise…
Graybeard: Huh! Now let’s try to see which Python. What
Python distributions do you know?
Youngstar: There’s CPython³⁰, Jython³¹, IronPython³², PyPy³³
and now I know of MicroPython³⁴. Oh and there’s the subject
of Python 2 and Python 3.
Graybeard: IronPython is for .NET shops, which you’re not.
Jython is for Java shops or when you need to use Java libraries
- and I don’t think this is your case either.
Youngstar: And I’m running on hosted servers so MicroPython is not for me either.
Graybeard: When will you want to use PyPy?
Youngstar: For the speed?
Graybeard: TANSTAAFL
Youngstar: Gesundheit!
Graybeard: It’s an acronym for “there is no such thing as a
free lunch”. What’s the downside of using PyPy?
Youngstar: Well, packages I guess. Not all of them support PyPy.
Graybeard: Yes. Going off mainstream has its downside.
Youngstar: Says the man who uses archlinux³⁵.

Graybeard: Trust me, there are days I regret it. But most days
I’m very happy - it fits my preferences. Which is exactly what
the Python you choose should do for you. So let me ask you what are your speed requirements?
Youngstar: The faster the better?
Graybeard: Then why not pick assembly as your programming language? Even better - manufacture your own hardware.
Youngstar: I see what you mean. I need to write some business
requirements and then see if Python fits them. I have a hunch
it will.
Graybeard: In God we trust; all others must bring data.
Youngstar: Good one. Yours?
Graybeard: Not mine - W. Edwards Deming.
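Bringing data can start small. The sketch below times a stand-in function with the standard library timeit module and compares the result to a budget. The 50ms budget and the `handle_request` function are hypothetical placeholders, not requirements from the conversation.

```python
import timeit

BUDGET_SECONDS = 0.05  # hypothetical requirement: 50ms per call


def handle_request(n=10_000):
    """Stand-in for the real work; replace with your own code."""
    return sum(i * i for i in range(n))


def seconds_per_call(func, number=20):
    """Average wall-clock time of one call, measured with timeit."""
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    elapsed = seconds_per_call(handle_request)
    verdict = "within" if elapsed <= BUDGET_SECONDS else "over"
    print(f"{elapsed * 1000:.2f}ms per call, {verdict} budget")
```

Running the same script under CPython and PyPy gives numbers to compare instead of a hunch.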
Youngstar: I’ll spec and measure. Now let’s talk about Python 2
vs Python 3.
Graybeard: OK. Python 3 is the future, choose it.
Youngstar: That was easy! Should I tell it to all the people
who still use Python 2?
Graybeard: There are many good reasons to keep using
Python 2.
Youngstar: Because you’re an old fossil who can’t change?
Graybeard: Get off my lawn!
Youngstar: Sure, can I finish my beer first?
Graybeard: I’d say dependencies are the main reason. However the situation has improved significantly in the last couple
of years. If you head over to Python 3 Wall of Superpowers³⁶
(which used to be called “Python 3 Wall of Shame”) you’ll
see mostly green now, which means most “top downloaded”
packages support Python 3 now.
Youngstar: What other reasons are there? Legacy code?
Graybeard: You won’t believe how fast the new cool code
you wrote a while ago becomes legacy code. Most of the time
we improve existing code, not write new stuff. If you already
have a decent code base, writing new code from scratch is
a dangerous thing. Read “Things You Should Never Do³⁷”.
Youngstar: How do you find the time to read all of these articles?
Graybeard: I don’t have time not to. But this is something for
later conversation. Another thing you learn with experience is
to appreciate things that work. Zach Holman, then at GitHub,
said “Your product should be cutting edge, not your tech …
stability is sexy.”
Youngstar: I wonder how we make progress then.
Graybeard: Sometimes the advantages of new technology
outweigh the risk. Also, people are way too optimistic for
their own good.
Youngstar: Oh, what about Anaconda³⁸? I heard people talking about it.
Graybeard: Anaconda is based on CPython, and comes bundled with scientific packages. There are other scientific Python
distributions out there but it seems to be the dominant one.

If you plan to use a lot of scientific packages, such as numpy,
scipy, matplotlib and others, give Anaconda a try.
Youngstar: I don’t have plans for that now, and as you said
earlier switching is not that painful.
Graybeard: Just make sure you have a good test suite.
Youngstar: Will do, but testing is a big subject and we’re
getting to the point where my boyfriend gets jealous of you.
Final recommendation?
Graybeard: Don’t be lazy, do your homework and find the
right Python, or other programming language, for you. Note
that switching from one Python to another shouldn’t be that
difficult. At one place we had to switch from Python 3 to 2 due
to a dependency issue, it took us about half a day to do that.
Youngstar: So the decision is not that crucial?
Graybeard: It is, don’t take it lightly. We were lucky the
switch was easy, you might not be.


• Choose CPython 3.x if you have a new
project with few dependencies
– Python 3 is the future
• Choose CPython 2.x if you have older code
base or dependencies that do not support
Python 3
• Choose Jython³⁹ if you need interaction
with Java
– Or if you’re in a Java shop and want
to sneak Python in the back door ;)
• Similarly, choose IronPython⁴⁰ if you need
interaction with .NET
• Choose PyPy⁴¹ if you need some speed and
love living on the edge
• Use Anaconda⁴² distribution if you use a lot
of scientific Python packages


IDEs and Editors
All mail clients suck. This one just sucks less.
- Michael R. Elkins (mutt website⁴³)
Youngstar: What are you using to write Python code?
Graybeard: Vim⁴⁴, I use it for everything.
Youngstar: Cool, so I’ll start using it.
Graybeard: Hold your horses. Mastering Vim is a long and
sometimes painful experience. I’ve been using it for about
20 years and I’m still learning.
Youngstar: Whoa! I don’t have 20 years, I need to get productive now.
Graybeard: Since you’re going to spend most of your time
inside an editor/IDE⁴⁵ - try to pick a good one and master it.
Youngstar: I know I’ll regret this… But which one should I use?
Graybeard: It’s not that simple, there are several factors you
need to consider. At the end, it’s a matter of personal taste.
Check out the editor war⁴⁶ sometime.
Youngstar: Editor war?
Graybeard: Yeah, some people get too passionate sometimes.
⁴⁵Integrated Development Environment

Youngstar: OK. Let’s start with what you’re using. Why are
you using Vim?
Graybeard: As I said - it takes time to master Vim and get
used to its dual editing mode. However once you’ve mastered
Vim you’ll be super productive with it not just in Python but
with almost any other language. Vim itself is a pretty bare-bones editor, but it has a rich plugin ecosystem which can
transform it into a powerful IDE. One of the main advantages, at
least for backend developers like me, is that on most Unix-like
systems - it’s already there. Vim can work in “terminal mode”
which does not require a windowing system. This means you
can SSH to a box and start editing. Oh - and you can write
Vim scripts in Python.
Youngstar: Isn’t Vim old?
Graybeard: In tech old usually means working - take me for example.
Youngstar: Ha! What’s the other editor old developers use?
The lispy one?
Graybeard: Emacs⁴⁷?
Youngstar: That’s the one.
Graybeard: Emacs is a text editor that does everything⁴⁸. It
has excellent Python support with python-mode⁴⁹ and many
core Python developers use it.
Youngstar: Then why don’t you use it?
Graybeard: Since I picked the dark side of the editor war.
Youngstar: And something more modern?

Graybeard: Before going modern, I’d like to stress that both
of these editors take a lot of work to master⁵⁰. But once you
grok them, both will offer you things that most other editors
or IDEs will not.
Youngstar: Noted, I’ll invest some time learning one of them.
Maybe emacs just to annoy you.
Graybeard: I never get annoyed by the stupid editors people use.
Youngstar: Something more modern?
Graybeard: I’m seeing a lot of people using PyCharm⁵¹, from
JetBrains, the makers of IntelliJ. There’s also PyDev⁵² which sits
on top of Eclipse.
Youngstar: IntelliJ? Eclipse? Aren’t those Java IDEs?
Graybeard: They started there, but now they are very powerful general purpose IDEs. You will need Java to run them,
and a lot of memory. A strong CPU won’t hurt either.
Youngstar: And PyCharm/PyDev are the Python environment?
Graybeard: Yes. There’s also Aptana⁵³ which is Eclipse already bundled with PyDev.
Youngstar: Doesn’t it take time to start them?
Graybeard: People usually have them running for weeks at a
time. You can switch projects without closing the IDE.
Youngstar: OK. Any other options?

Graybeard: In the Windows world, Visual Studio⁵⁴ comes with
excellent Python support called PTVS⁵⁵.
Youngstar: Windows? Visual Studio? You?
Graybeard: Some claim that Visual Studio is the best IDE out
there, but then again - they are using Windows ;)
Youngstar: Thanks but I don’t think I’ll switch to Windows
just for that.
Graybeard: Smart girl.
Youngstar: After all the brainwashing you did?
Graybeard: I prefer “showing you the light”.
Youngstar: Yeah, yeah. Back on track - any more?
Graybeard: There are so many.
Microsoft also makes Visual Studio Code⁵⁶, which is cross
platform and has good Python support, and a good Vi plugin.
Spyder⁵⁷ is good if you’re doing a lot of scientific Python or
coming from Matlab. It’s not as polished but fits better with
scientific development.
There are also Atom⁵⁸, Sublime⁵⁹, and many other good
editors out there with Python support. There are wiki pages
for both editors⁶⁰ and IDEs⁶¹ on the Python web
site if the above are not enough.
Youngstar: As usual, I’m more confused than before.

Graybeard: My advice - pick one or two, and make sure Vim
is one of them ;), and try them out. Do a little project with
each, see what fits your work style and then start specializing.
I personally try a new one every now and then - but always
get back to Vim eventually. Maybe I’m too old to learn new tricks.
Youngstar: OK. Anything I need to pay attention to while
learning or using these IDEs?
Graybeard: Most of them have good integration with linters,
make sure to enable it.
Youngstar: Linters?
Graybeard: Programs that check your code for common
errors and coding conventions. We’ll talk more on them later,
but the editor will mark lines with errors so you can fix them
right away. For example I use flake8⁶² integration in Vim⁶³.
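To give a feel for the kind of check a linter performs, here is a toy sketch that finds imported names which are never used, built on the standard library ast module. It is an illustration only and is not how flake8 is implemented; the `SOURCE` snippet and the function name are made up for this example.

```python
import ast

# Made-up source code to check; "os" is imported but never used.
SOURCE = '''
import os
import sys

print(sys.argv)
'''


def unused_imports(source):
    """Return sorted (line, name) pairs for imports that are never referenced."""
    tree = ast.parse(source)
    imported = {}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a as b" binds "b".
                name = (alias.asname or alias.name).split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(
        (line, name) for name, line in imported.items() if name not in used
    )


if __name__ == "__main__":
    for line, name in unused_imports(SOURCE):
        print(f"line {line}: '{name}' imported but unused")
```

A real linter covers far more cases (scopes, `__all__`, star imports), which is a good reason to enable one in your editor rather than write your own.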
Youngstar: Fixing errors closer to when you introduce them
is always better.
Graybeard: Yes. I think some of them run the tests in the
background whenever the code changes.
Youngstar: Cool!
Graybeard: Depends on how fast your tests are.
Youngstar: I can see that. Any other advice?
Graybeard: What? That was not enough for you? I guess
another good piece of advice is to be patient.
Youngstar: Have you seen my hair color? I wasn’t born with
the patience gene.

Graybeard: You kids … The point is that it takes time to
master an editor or an IDE. Give it time, and you’ll see
your productivity soaring. I call it the “output” part of a
programmer I/O.
Youngstar: I/O? As in input/output?
Graybeard: Yes. Most of your time as a developer should be
spent thinking. However reading and writing are also part of
the process and a good editor or IDE can increase the output
part. Another bonus of fast writing is that you can write
several drafts of your code and not lock into the first one you write.
Youngstar: Good point. I guess I’ll brush up on my speed reading
to get the input part faster.
Graybeard: Yes. We programmers spend a lot of time reading,
both code and technical documents.
Youngstar: And in your case a lot of Sci-Fi.
Graybeard: Where do you think I get all my ideas from?
Youngstar: Thin air?
Graybeard: You’re too kind, I thought you were going to
mention a certain body part.
Youngstar: What are you? Six?
Graybeard: Mentally? Not much more. But I see this conversation has taken a bad turn so I’ll stop here.
Youngstar: Right as usual, cheers!


• Give Vim or Emacs a try, they will rock
your world
– See here⁶⁴ on how to turn Vim into a
Python IDE
• PyCharm is a good choice
– Make sure you have plenty of RAM
– Also if you’re in a Java shop - there’s
probably a lot of knowledge on IntelliJ
(which PyCharm is based on)
• Visual Studio Code is great
• If you’re in a Windows shop, give Visual
Studio a try
• If you’re doing a lot of scientific Python take a look at Spyder


Project Structure
organizations which design systems … are constrained to produce designs which are copies of the
communication structures of these organizations.
- Conway’s Law
Youngstar: How should I structure my code? I currently have
everything in one directory and it looks messy.
Graybeard: Are you facing a specific problem?
Youngstar: Not really, but I assume I should be more organized.
Graybeard: As the bad guy said: “Assumptions are the mother
of all !#?@ups”.
Youngstar: Which movie was that?
Graybeard: “Under Siege 2” if my one bit memory serves me right.
Youngstar: Don’t think I saw that one.
Graybeard: Trust me - you’re not missing anything. But back
to your question. Why are you trying to fix something that
you don’t know is broken?
Youngstar: You’re probably right. I’ll leave it for now.
Graybeard: I didn’t say it’s not broken. I just said you think
it’s not broken.
Youngstar: OK, enlighten me.

Graybeard: Do you have some tests?
Youngstar: Sure!
Graybeard: How do you make sure they don’t get to production?
Youngstar: Why shouldn’t they?
Graybeard: Ask GitHub who had a few hours of downtime⁶⁵
a while back. The cause was tests deleting the production database.
Youngstar: Ouch!
Graybeard: Yes, and GitHub are not the only ones bitten by
this problem.
Python has an established way to organize projects. It’s not
mandatory but I found it’s a good practice. Let’s assume that
the name of your project is archer.
Youngstar: Do you have to bring that TV show into everything?
Graybeard: Please be quiet, I’m trying to teach you something here. I’m also still hurt you didn’t take my suggestion
for a project name.
Youngstar: I’m being quiet.
Graybeard draws the following diagram on a napkin:


├── Makefile
├── requirements.txt
├── archer
├── docs
└── tests

Graybeard: Let’s go over this. The top archer directory is
your project - the one you clone from source control.
The second archer directory is your Python package where
the code is. tests are outside of the code so they won’t get to production.
Youngstar: And the rest of the files?
Graybeard: Every project should have a README with at least
an elevator pitch. This focuses people on what we’re doing
here. It should also contain instructions for developers not
found in the docs.
The docs directory is the generated documentation, I don’t
usually have docs other than what’s in the code and in the README.
Youngstar: .md stands for Markdown⁶⁶ right?
Graybeard: Yes. You can also use ReStructuredText⁶⁷ or plain
text. But Markdown has become very dominant these days. There
are several variants of Markdown, pick one and stick to it.

Youngstar: Markdown it is then.
Graybeard: What else? Oh, I usually have a main Makefile
to automate some tasks, requirements.txt to specify external
requirements. And one script to run all the tests. We’ll discuss
what’s in requirements.txt and when we talk
about dependencies and testing.
Youngstar: OK, I’ll try to remind you - considering your one
bit memory.
Graybeard: Yay, an external memory! I’ll drink to that.
As said, this is my personal preference which is based on how
many Python projects are structured. You might find another
one better for you but I suggest you start with it.
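If you want to try this layout quickly, a small script can scaffold it. The sketch below uses pathlib and creates the directories and files discussed above. The README.md file name and the `__init__.py` file inside the package are assumptions added for this example, and the demo runs in a temporary directory so it leaves nothing behind.

```python
from pathlib import Path
import tempfile


def create_layout(root, name):
    """Create an empty project skeleton under root/name and return its path."""
    project = Path(root) / name
    for directory in (name, "docs", "tests"):
        (project / directory).mkdir(parents=True, exist_ok=True)
    # Assumption: the package gets an __init__.py so it can be imported.
    (project / name / "__init__.py").touch()
    # Assumption: the README uses the Markdown extension.
    for filename in ("Makefile", "README.md", "requirements.txt"):
        (project / filename).touch()
    return project


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = create_layout(tmp, "archer")
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```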
Youngstar: Anything else?
Graybeard: Yes, don’t overthink and spend too much time on it.
Start with one structure and only if it becomes a problem fix it.
Youngstar: That’s advice you give for many things.
Graybeard: Because it’s a good one, and hopefully one day
you’ll make it a habit.
Youngstar: Is there a way to automatically generate documentation?
Graybeard: Yeah, write simple code that people can understand.
Youngstar: That’s a manual way.
Graybeard: OK “Miss Always Right”. I stand, actually sit,
corrected. Still, I say that the only up-to-date documentation is the code itself.

Youngstar: That’s good in the general case. However, sometimes I need to write tricky code, for example when optimizing.
Graybeard: Optimization is a subject for another talk. But
you’re right, when you do stuff that is not that obvious - write
good docstrings and comments.
Youngstar: Are there tools to generate nice documentation
from docstrings?
Graybeard: Of course. In the Python world we mostly use
Sphinx⁶⁸. It has a format for documentation strings and can
generate HTML, PDF and other formats. A nice feature of
Sphinx is that it can run doctest⁶⁹ tests.
Youngstar: doctest is where you write snippets of code in
your docstrings?
Graybeard: Exactly, and I find it cool that you have testable
documentation.
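A minimal sketch of a docstring that doubles as a test. The function word_count is made up for the example. The lines starting with >>> are what doctest executes, and it compares the result with the line that follows.

```python
import doctest


def word_count(text):
    """Return the number of whitespace separated words in text.

    >>> word_count('danger zone')
    2
    >>> word_count('')
    0
    """
    return len(text.split())


if __name__ == '__main__':
    # Runs every example found in the docstrings of this module
    doctest.testmod()
```

Running the file directly prints nothing when all the examples pass. python -m doctest -v on the file shows each example as it runs.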
Youngstar: How about the “big stuff”? Things that don’t fit
inside one module?
Graybeard: You have the README for that and also Sphinx
can have top level documentation. Note that if you have
documentation, you’ll need to add checking it as part of the
code review.
Youngstar: How did we get from project structure to writing
documentation?
Graybeard: Not sure. The last thing about documentation is that
several times I saw people investing a lot of time in generating
very nice documentation that nobody looks at.

Youngstar: I’ll start with simple documentation. Anything
else about project structure?
Graybeard: There are more files you might need: a setup script
to help with packaging, ChangeLog to list changes, NOTICE.txt
or LICENSE.txt for specifying the license, tox.ini for running
tests on multiple versions of Python and many other files.
Start with the least amount of items and add new ones only
when you need to.
Youngstar: Then trim and restructure periodically?
Graybeard: Exactly.
Youngstar: What about a setup script? I’ve seen one in many projects.
Graybeard: It is used for packaging. Do you need
packaging?
Youngstar: Currently I deploy directly from git.
Graybeard: So you probably don’t need packaging. A setup script
is mostly used when creating packages for other people to
use and in open source code. There are a lot of options there,
and when you decide to release some of your code as open
source we can talk about it.
Youngstar: I’ll live without it for now. Priorities …
Graybeard: Very good.


• Start with an established project structure
(like Graybeard’s example above)
• Separate code from tests
• Have a README with an elevator pitch and
development instructions
• Use a Makefile or other tool to automate
common tasks
• Have one script to run the tests
• Look into Sphinx⁷⁰ for generating documentation
– But only if you need to


Managing Dependencies

Only the paranoid survive.
- Andy Grove
Youngstar: You won’t believe the stupid bug I was chasing.
Graybeard: Do tell.
Youngstar: I was updating some packages …
Graybeard: … and one of the new versions had a regression
bug that took you all day to figure out.
Youngstar: What do you know? I’m not that special after all.
Graybeard: Oh, you are unique - just like everybody else.
Youngstar: Funny! So how can I avoid bugs like this in the
future?
Graybeard: You know that the best way to solve a bug is to
make sure that it’s impossible to introduce such bugs in the
first place.
Youngstar: Yeah, forgot who taught me that …
Graybeard: Buy me another beer and I’ll refresh your memory.
Youngstar: Sure thing. Now back to my question…
Graybeard: How do you manage your dependencies?
Youngstar: I have a requirements.txt with package per line,
and I run pip install -r requirements.txt to install them.
Graybeard: You know you can pin a specific version
using ==, for example requests==2.12.4.
Youngstar: I didn’t know that. But why would you do that? You won’t get all the bug fixes … Doh!
Graybeard: Exactly!
Youngstar: Then I should probably version all my packages.
Graybeard: I agree.
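A small sketch of how one might check that every line in a requirements file is pinned. The helper name unpinned and the sample packages are made up for the example, and the parsing is deliberately naive: it doesn't handle URLs or environment markers.

```python
def unpinned(lines):
    """Return requirement lines that don't pin an exact version with ==."""
    missing = []
    for line in lines:
        line = line.split('#', 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith('-'):  # skip blanks and pip options
            continue
        if '==' not in line:
            missing.append(line)
    return missing


requirements = [
    'requests==2.12.4',
    'flask',          # no version: pip installs whatever is latest
    'pyyaml>=3.12',   # a lower bound still lets new versions in
]
print(unpinned(requirements))
```

A check like this can run before the tests, failing the build when the returned list is not empty.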
Youngstar: I know I’ll regret this… But any other pointers on
dependency management?
Graybeard: As I said many times, one of the biggest factors in
your development practice is the price of error. For example
it’s much harder to fix a bug in an embedded system than
in a small website’s server. The bigger the cost of error, the
stricter you want to be with your requirements, to enable
stable builds.
For example, do you use virtual environments?
Youngstar: Yes, I use virtualenv⁷¹.
Graybeard: Why?
Youngstar: So that packages are installed in isolation per
project and not globally in the system.
Graybeard: Good, this is one more isolation level. By the way,
newer versions of Python come with the venv⁷² module, which
does basically the same work. And there’s also a newer tool
called pipenv⁷³.
Youngstar: That’s nice, one less dependency. What are the
differences between virtualenv, venv and pipenv?
Graybeard: With virtualenv you can specify a different
Python interpreter, for example even if your default Python is
3 you can still create a virtual environment with the Python
2 interpreter.
Also, since venv is in the Python standard library, it’ll be updated
only when a new version of Python is released. virtualenv
will probably have a faster release cycle.
pipenv combines pip and virtualenv into one tool.
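A minimal sketch of creating an environment with the standard library venv module from Python code. It is the programmatic cousin of running python -m venv on the command line. The helper name create_env and the directory name are made up for the example.

```python
import tempfile
import venv
from pathlib import Path


def create_env(root):
    """Create a virtual environment under root and return its path."""
    env_dir = Path(root) / 'venv'
    # with_pip=False keeps the example quick, pass True to get pip as well
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(tmp)
    # Every environment venv creates has a pyvenv.cfg file at its top level
    has_cfg = (env_dir / 'pyvenv.cfg').exists()

print(has_cfg)
```

The environment always uses the interpreter that runs the script, which is the limitation mentioned above when comparing venv with virtualenv.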

Youngstar: Good to know. The downside of using virtual
environments is I need to teach my IDE which is the right
interpreter.
Graybeard: Which IDE are you playing with right now?
Youngstar: VSCode⁷⁴.
Graybeard: That’s a cool one, almost as good as Vim.
Youngstar: Yeah, yeah. Any other pointers for managing
dependencies?
Graybeard: Don’t use the system Python.
Youngstar: Why?
Graybeard: In general, it’s preferred to leave the system
Python alone since a lot of system utilities are written in
Python and a system upgrade might break your code. Red
Hat⁷⁵ based distros use a lot of Python.

Youngstar: On the other side of things, if I upgrade a package
that a system tool depends on - I might break a system tool.
Graybeard: Right.
What will happen to your code once the next Debian⁷⁶ ships
with Python 3 as default?
Youngstar: I see, I’ll install a Python for my application with
the right version. Is Debian a popular distro?
Graybeard: Very. Several other distros are based on Debian,
such as Ubuntu⁷⁷ and Mint⁷⁸. Changes to Debian will find their
way to these distros eventually.
Youngstar: I use Mint, now I remember reading somewhere
it’s Debian based.
Graybeard: Yup. Now what happens if PyPI⁷⁹ is down when
you deploy?
Youngstar: I’m pretty much screwed, but how can I overcome
this?
Graybeard: In some cases it might be OK to wait for PyPI to
get back up. It has been more stable in recent years. If you
need to deploy no matter what, then you need to pre-build
your dependencies and tell pip to install them from your servers.
Youngstar: pip can do that?
Graybeard: pip can do many things, this is one of them. See
the --index-url and --find-links options of pip install.
Youngstar: OK.
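A sketch of what such an install could look like when driven from a deployment script. The helper name pip_install_cmd and the /opt/wheels directory are assumptions made for the example. The pip options themselves are real: --no-index stops pip from contacting any index and --find-links points it at a directory (or page) of pre-built packages. To use an internal index server instead, --index-url takes its URL.

```python
import sys


def pip_install_cmd(requirements, wheel_dir):
    """Build a pip command that installs only from a local directory."""
    return [
        sys.executable, '-m', 'pip', 'install',
        '--no-index',               # don't contact PyPI or any other index
        '--find-links', wheel_dir,  # look for packages in this directory
        '-r', requirements,
    ]


cmd = pip_install_cmd('requirements.txt', '/opt/wheels')
print(' '.join(cmd[1:]))
```

Passing the list to subprocess.check_call runs the install and raises an exception if pip fails.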
Graybeard: Now about the version of the C compiler…

Youngstar: I write Python code, not C.
Graybeard: You can write Python modules in C, and there are
many good reasons for doing that - but mostly as last resort.
It’s likely that one of your dependencies is a C extension.
Then you’ll need a C compiler and possibly some libraries
and header files. Some libraries require a Fortran compiler.
Youngstar: Fortran?
Graybeard: Yes, in some cases a Fortran compiler can do
better optimization than a C compiler.
Youngstar: How do people in the Windows world find a C
compiler?
Graybeard: There’s a free C compiler for every major platform: gcc⁸⁰ or clang⁸¹ on Unix-like systems, and the Microsoft
compiler comes free nowadays.
Youngstar: Good to know. And what’s the solution here for
the C extensions problem?
Graybeard: The idea is that you build all your dependencies
in advance and then use them. The latest packaging format is
called wheel⁸². It’s basically a zip file that contains both the
Python code and the compiled extension as a shared library.
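Since a wheel is a zip file, the standard zipfile module can look inside one. The sketch below builds a tiny stand-in archive in memory so that it runs anywhere. The file names in it are made up, and a real wheel is produced by a build tool and carries more metadata.

```python
import io
import zipfile


def list_wheel(fileobj):
    """Return the names of the files stored inside a wheel."""
    with zipfile.ZipFile(fileobj) as zf:
        return sorted(zf.namelist())


# Stand-in for a wheel: Python code, a compiled extension and metadata
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('archer/__init__.py', '')
    zf.writestr('archer/_speedups.so', b'not a real shared library')
    zf.writestr('archer-1.0.dist-info/METADATA', 'Name: archer\n')

for name in list_wheel(buf):
    print(name)
```

list_wheel also accepts a path, so it can be pointed at a .whl file on disk.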
Youngstar: What happened to eggs⁸³?
Graybeard: wheel is the new egg.
Youngstar: I’ll get the T-Shirt.
Graybeard: Some companies have a “build machine” which
has all the required dependencies to build the packages. This
way you don’t need to install a lot of tools on your production
machines. This build machine is usually also the one serving
these third party packages. By the way, this process of keeping third party dependencies locally is sometimes known as
vendoring.
Youngstar: How deep does this rabbit hole go?
Graybeard: Just you wait Alice. Oh! The places we’ll go…
Dependency management is an old and unsolved problem.
Pick any package manager: yum, apt, gem, npm … - all of them
have their problems.
Youngstar: Consolation of fools… Can we get back to the
Python realm?
Graybeard: Yes.
Youngstar: And …
Graybeard: Hold on, collecting my thoughts… OK. If you’re
doing a lot of scientific computing with numpy, pandas, matplotlib
and other packages, pip installing them can be a pain.
Youngstar: Right… Should I wax my legs while doing this?
Graybeard: Not sure what will hurt more. Anyway … There’s
an alternate package manager called conda⁸⁴. conda was
developed by Anaconda to solve the problem of installing
scientific packages. Over time it became a general installer
and you can install other packages with it. Note that not all
of the packages on PyPI can be installed with conda.
Youngstar: What do I do then?
Graybeard: conda plays well with pip and you can use both.
conda has its own notion of “environments” and it installs pip
in them for just this case. conda supports Linux, Windows
and macOS.
Youngstar: Do you get royalties from Anaconda?
Graybeard: Nope, but since I’ve been doing a lot of scientific
Python lately it has saved me tons of time and agony. Going
deeper …
You can use docker⁸⁵. This will give you a system where you
know exactly what’s going on - which version of Python, of
libc … However docker comes with its own set of issues, mainly what’s called “orchestration”, but I won’t get into that.
The simple approach is just to run a single container as your
application on the host.
Youngstar: OK.
Graybeard: Alan Kay once said “People who are really serious about software should make their own hardware.”
Youngstar: Let’s stop here, I have no intention of starting a
hardware company.
Graybeard: CPUs have bugs as well, you might want to
control the version of CPU you use.
Youngstar: OK. A related question - How do you choose
which package to use?
Graybeard: If the package implements a known protocol or
a connection to an external tool (such as a database), chances are
that the main site of the protocol/tool will list recommended
“language bindings”. For example, the bottom part of the msgpack
site⁸⁶ has a “Languages” section with a Python entry pointing to
the recommended package.
Youngstar: And if I don’t find a reference to Python in the
main site?
Graybeard: Most packages are hosted on public sites such as
GitHub⁸⁷. There you can see the project “health” - how many
committers, commit history and last commit, number of open
bugs …
Ask around, the Python community is very friendly and helpful. There are also sites that have a curated list of packages.
However don’t blindly trust them, make up your own mind. I
find they have a tendency to recommend the shiny new toys.
Youngstar: Err toward mature packages.
Graybeard: “Stability is sexy.”
Youngstar: We need to have a talk about how you define
“sexy”, but another time.
Graybeard: Ha!
Another thing you should do is test before you use. Pick a
package or two and try them out to see how they behave. Try to
simulate a real environment and load as much as you can, and
always make sure to write code in a way that makes switching
packages as easy as possible.
Youngstar: Do I really need to do so much before writing
even one line of code?
Graybeard: This is sometimes called “accidental complexity.”
But no, don’t start with having your own build machine and
internal PyPI. Start simple with pip, a virtual environment and
a versioned requirements file.
Youngstar: Pain vs Gain?

Graybeard: Exactly. Start with minimal effort that works for
you and grow when you need.
Youngstar: Thanks for that. My head is full and my beer glass
is empty - time to go home.
Graybeard: Cheers!

• Depending on the cost of error - pick a
strategy for versioning
• Version your dependencies, write them
down and place them in source control
• Use wheels when possible
• conda is a good alternative to pip
• docker will give you even more control but
it comes with a cost
• You might want to invest in your own
internal package repository
• Have a process for evaluating new packages. Lean toward old and stable ones

Storage

Two rules of database systems
1. It takes 7 years minimum to create a production-ready database system
2. You’re not an exception to rule 1
- Luca Candela
Youngstar: I need to store some data and was thinking of
using MySQL, what do you think?
Graybeard: I think you mean MariaDB.
Youngstar: What?
Graybeard: MariaDB is the community fork of MySQL, done
after Oracle bought MySQL.
Youngstar: Like OpenOffice and LibreOffice?
Graybeard: Exactly.
Youngstar: OK. Now that we clarified this issue, can we get
back to my initial question?
Graybeard: I don’t know enough about your data to give you
a good answer.
Youngstar: Currently I don’t have much data. Some user
information, some session data. Things are very much in flux
so it’s hard to know.



Graybeard: I’ll give you my usual advice - start simple.
Youngstar: Gee, why didn’t I think of that? What do you
mean by “simple”?
Graybeard: When you start with a database such as MySQL
you add complexity to your system. You need to serialize/deserialize your objects, you have schemas to design and update
- and schema migration can be tricky. Using MySQL also
means you need a server, users, backup …
Youngstar: OK, so what do you suggest?
Graybeard: When I need storage, I usually start with shelve⁸⁸.
It’s very much like a dict which is backed by a file on disk. The main
limitation is that the keys have to be strings, the values can be
anything that pickle⁸⁹ can handle. I don’t have to worry about
serialization, schemas and other things.
Youngstar: How do I query it?
Graybeard: By running for loops in Python.
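A minimal sketch of storing and querying with shelve. The user records and the temporary file location are made up for the example.

```python
import os
import shelve
import tempfile

# A throwaway location for the example database
db_path = os.path.join(tempfile.mkdtemp(), 'users')

with shelve.open(db_path) as db:
    # Keys must be strings, values can be anything pickle can handle
    db['lana'] = {'name': 'Lana', 'logins': 12}
    db['cyril'] = {'name': 'Cyril', 'logins': 3}
    db['pam'] = {'name': 'Pam', 'logins': 27}

with shelve.open(db_path) as db:
    # A "query" is a plain loop over the stored values
    active = sorted(u['name'] for u in db.values() if u['logins'] > 10)

print(active)
```

Every query scans all the values, which is why this approach fits small data sets and not large ones.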
Youngstar: Isn’t it slow?
Graybeard: (sighs) Speed again? What’s your speed requirement? How many objects do you have? Have you profiled
your code? …
Youngstar: OK, OK …
Graybeard: As a rule of thumb, for a system that’s not that
loaded and around tens of thousands of objects - shelve will
work reasonably well.
Youngstar: Is it thread safe?
Graybeard: Is your application multi-threaded?



Youngstar: I haven’t decided on the web server yet, so I don’t
know.
Graybeard: Well, if you find you need to be thread safe - slap a threading.Lock⁹⁰ on it. It’s a good idea to have your
own data access layer anyway, so switching storage backends
shouldn’t be that hard. Writing a nice DAL⁹¹ also forces you
to think about your storage API. Most of the time the usual
CRUD is enough, maybe some search as well.
Youngstar: DAL? CRUD?
Graybeard: DAL is Data Access Layer. CRUD is Create, Retrieve, Update, Delete.
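One possible shape for such a data access layer, as a sketch and not a definitive design. The class name UserStore and its method names are made up for the example. A single threading.Lock guards every access to the shelf, and the rest of the application only sees the four CRUD methods, so the backend can be swapped later.

```python
import os
import shelve
import tempfile
from threading import Lock


class UserStore:
    """Tiny data access layer, callers never touch shelve directly."""

    def __init__(self, path):
        self._db = shelve.open(path)
        self._lock = Lock()  # one lock guards every access to the shelf

    def create(self, key, user):
        with self._lock:
            if key in self._db:
                raise KeyError('{!r} already exists'.format(key))
            self._db[key] = user

    def retrieve(self, key):
        with self._lock:
            return self._db[key]

    def update(self, key, user):
        with self._lock:
            if key not in self._db:
                raise KeyError(key)
            self._db[key] = user

    def delete(self, key):
        with self._lock:
            del self._db[key]

    def close(self):
        with self._lock:
            self._db.close()


store = UserStore(os.path.join(tempfile.mkdtemp(), 'users'))
store.create('lana', {'name': 'Lana'})
store.update('lana', {'name': 'Lana', 'admin': True})
print(store.retrieve('lana'))
store.delete('lana')
store.close()
```

Moving to a database server later means rewriting this one class while the callers stay the same.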
Youngstar: Ah. What about ORMs⁹² (Object Relational Mapping)? I heard SQLAlchemy is good.
Graybeard: I have mixed feelings about ORMs. On one hand
they save you a lot of boilerplate coding. However I found
out that when your data usage becomes more sophisticated,
you need to work around them. Also I haven’t found a good
ORM for NoSQL databases yet. If you end up using an ORM,
make sure it’s easy to rip it out if it becomes more of a problem
than a solution.
Youngstar: NoSQL as in MongoDB?
Graybeard: Yup. There are so many of them.
Youngstar: Are they better than SQL ones?
Graybeard: It really depends on your usage. I found NoSQL
databases good for early stages when your data model is still
in flux and schemas are just in your way. I usually start with
shelve⁹³ and switch to a NoSQL database if I need support for
a large amount of data or a client/server architecture.
Note that in NoSQL the schema does exist, it’s in the code
instead of in the database.
Youngstar: Will I need client/server support?
Graybeard: My crystal ball is broken today. However the
answer is probably yes. You usually run more than one server
for failover or load handling, and you’ll want all of these
servers looking at the same data.
Youngstar: I guess if I can make my server stateless it’ll be
easier.
Graybeard: Good insight. In practice this is really hard to
achieve, but a good goal to strive for. I worked at a company
that stored all the required data in HTTP cookies. This meant
the client sent all the data we needed in every request. This
saved us a lot of database queries, however you need to be
aware of the security risks of storing data on the client.
Youngstar: When will you pick an SQL database?
Graybeard: There are many parameters that point to an SQL
database. One thing is that many people know SQL, and if
you have many hands touching the data - it’s a good thing.
Also many tools, mainly reporting ones, work well with SQL.
The other thing is that some of the SQL databases, I personally
prefer PostgreSQL⁹⁴, are wicked fast when you have many
more reads than writes.
SQL databases have transactions, which means when you
insert ten records either all of them will enter the database
or none of them. In some NoSQL systems this is really hard
to achieve.
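A small sketch of the all-or-nothing behavior, using the sqlite3 module from the standard library so it runs without a server. The table and names are made up. The third insert violates the UNIQUE constraint, so the whole batch is rolled back.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT UNIQUE)')

names = ['lana', 'cyril', 'lana']  # the duplicate makes the third insert fail
try:
    # As a context manager the connection commits when the block succeeds
    # and rolls back when an exception is raised inside it
    with conn:
        for name in names:
            conn.execute('INSERT INTO users (name) VALUES (?)', (name,))
except sqlite3.IntegrityError:
    pass

count = conn.execute('SELECT COUNT(*) FROM users').fetchone()[0]
print(count)
```

Even though the first two inserts succeeded on their own, none of them is in the table after the failure.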
Also, SQL databases tend to be older, which means they are
more stable and have more tooling and knowledge around
them.
Youngstar: You prefer older? You love all this new and shiny
stuff.
Graybeard: I know, but I’ve been bitten by “new” databases.
At one company we worked with a two-year-old database.
About 90% of our downtime was due to database issues.
Youngstar: Ouch.
Graybeard: Yes. I hear the situation has improved since then,
it takes time for a database to mature and be production ready.
Youngstar: OK, I’ll learn some SQL then.
Graybeard: It’s not just SQL you need to learn but also
NoSQL. There are many ways to model your data and you
need to know things like normalization, fact tables, type 2
dimension tables and more. One of the more effective ways I
know is to start from the UI and think about the queries you’re
going to perform. After that you start modeling the data.
Thinking and designing your data layer is very important. In
“The Mythical Man-Month⁹⁵” Fred Brooks says: “Show me
your flowcharts and conceal your tables, and I shall continue
to be mystified. Show me your tables, and I won’t usually
need your flowcharts; they’ll be obvious.”
Youngstar: flowcharts?
Graybeard: Yeah, this book is from 1975.



Youngstar: 75? Are you kidding me?
Graybeard: It’s timeless. It talks mostly about people and
communication, and people haven’t changed a lot in the last
few thousand years.
Youngstar: But still … 75?
Graybeard: Read it for yourself and decide. Well worth the
time in my opinion.
Youngstar: OK… Going back to present day - any more
advice?
Graybeard: A couple of tidbits:
You’ll probably have some complex queries in your code. I
recommend saving them in external files - SQL, YAML …
and not in code. I once worked at a company that used the
Spring framework. They went halfway and stored the
SQL queries in the Spring XML configuration files. It was
really hard to read the SQL embedded in the XML, there was
no syntax highlighting and viewing diffs was a mess.
The second thing is that most of Python’s SQL database drivers
support accessing columns by name and not just by index.
Accessing by index is both less readable and prone to error:
someone changes the SQL query and suddenly row[2] is not
the column you want. For example in sqlite3⁹⁶ you need to
set the connection’s row_factory attribute to sqlite3.Row and
then each column can be accessed both by position and by
name.
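A minimal sketch of the sqlite3 setting in action. The table and data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # rows now support access by column name
conn.execute('CREATE TABLE users (id INTEGER, name TEXT, email TEXT)')
conn.execute("INSERT INTO users VALUES (1, 'lana', 'lana@example.com')")

row = conn.execute('SELECT id, name, email FROM users').fetchone()
by_position = row[2]    # points at another column if the query changes
by_name = row['email']  # keeps working when columns are added or moved
print(by_position == by_name)
```

row.keys() returns the column names, which is handy when converting a row to a dict.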
Youngstar: OK, I’ll remember these. Now what about backup?
How often do I need to back up my databases?
Graybeard: You don’t need backup.



Youngstar: I don’t?
Graybeard: No - you need recovery. You’ll be surprised
how many companies had backups of their data but couldn’t
restore from it when time came.
Youngstar: So backup is part of recovery. How often should
I do it?
Graybeard: Again, depending on your audit and recovery
needs - this question can have very different answers. Another
thing is that backups tend to grow in size and accumulate,
so have a good retention policy.
If you use a hosted database - that might take care of backup
and recovery for you.
Youngstar: Hosted?
Graybeard: Yup. And considering that they take all the
operations headache from you it might be a good solution.
Google has BigQuery⁹⁷, Amazon has Athena⁹⁸… and many
more.
An extra benefit of BigQuery and Athena is that they scale.
Both claim they can process billions of records in seconds.
Youngstar: Don’t they cost money?
Graybeard: TANSTAAFL⁹⁹ (there ain’t no such thing as a free lunch). Don’t make the common mistake
of underestimating the cost of running your own servers.
Deployment, monitoring, alerting, backup and more - all take
time and effort. And developer time is expensive. In The
Art of Unix Programming¹⁰⁰ Eric Raymond says the rule
of Economy is: “Programmer time is expensive; conserve it
in preference to machine time.” This is true in most cases,
whenever you can save developer time - do it.
This is also why people like Google App Engine¹⁰¹ - zero ops.
Youngstar: I have to say now I’m totally confused.
Graybeard: Yeah, too many options is not a good thing.
Remember this when we talk about monitoring. But for now
- just start with shelve or something as simple. When things
get more interesting - go over the queries you do, the business
requirements and then select the right solution. Who knows?
You might find yourself using a graph database at the end.
Youngstar: A graph database?
Graybeard: Yes. You store not just objects, but also the relationships between them. Look up neo4j¹⁰², which is a very popular
graph database. They have some good usage examples on their
site.
Youngstar: Any other types of databases I need to know of?
Graybeard: There are so many. I think we considered the
main ones except search based ones.
Youngstar: Like Elasticsearch¹⁰³?
Graybeard: Yes, it’s actually my favorite.
Youngstar: My God, it’s full of databases!
Graybeard: Yes Dave. Also, it’s not uncommon to use more
than just one database. For example a combination of SQL
for fast queries and a search database for textual search. Some
people use Redis¹⁰⁴ for fast key/value and MongoDB for
document storage. It really all depends, but having just one
is a big plus.
Youngstar: I’ll start simple and grow when it hurts.
Graybeard: Wise words to end the night. My beer is empty
and home is calling. Next time…

• Start simple, shelve¹⁰⁵ is a great option
• Know your data and queries before selecting a database
– Think of things like embedded vs
client server, SQL vs NoSQL vs Key/Value vs Graph …
• Consider hosted database - let someone else
wake at 3am
• Pick a mature database
• Make sure you can recover from backup
• Have a policy to trim your backups


Testing

A computer lets you make more mistakes faster
than any invention in human history, with the
possible exceptions of handguns and tequila.
- Mitch Ratcliffe
Youngstar: I fixed a bug today and accidentally introduced a
new one.
Graybeard: Sounds like the “99 little bugs in the code” poem.
Youngstar: I can guess the rest of it.
Graybeard: Don’t you have regression tests?
Youngstar: I have a few unit tests, but that’s about it. What
are regression tests?
Graybeard: Tests that guard against exactly what happened
to you - that new changes didn’t break anything old. There
are many kinds of tests and this is an important one.
Youngstar: So I should do more regression testing?
Graybeard: Let’s back off a bit. Why do you test?
Youngstar: Well, for one thing to make sure I don’t break
things.
Graybeard: Any other reason?
Youngstar: Check that the code runs as intended?
Graybeard: These are mainly unit tests. More reasons?



Youngstar: Hmm, nothing comes to mind currently. What are
more reasons?
Graybeard: There are many - integration tests check that all
parts of the system connect together. Fuzzing tries to bring
down your system with unusual input and there are many
more kinds of tests.
What do you think are the downsides of testing?
Youngstar: Downside? Let’s see … Well - they take time to
write, that’s for sure.
Graybeard: Anything else?
Youngstar: Every time I change my code - I need to change
the tests as well. This makes sure that I didn’t mess anything
up, but also takes more time.
Graybeard: Yes. This is what the guys in Getting Real¹⁰⁶ call
“mass”. The more mass you have, the harder it is to make
changes.
The amount and kind of testing is influenced by the cost of
error. If you’re writing a life support system - you’ll use much
more testing than what you need in your little project right
now.
The main point here is that testing is a “pain vs gain” balance.
Make sure the extra mass and time pain is worth the gain.
Youngstar: Speaking of tests, do you practice TDD¹⁰⁷?
Graybeard: Sometimes, mostly when working with new
developers. I found out it helps them design cleaner code. You
should fit the methodology to the team you’re working with. I
personally write tests after the first or second draft of the code
is working.
Youngstar: How do you know it’s working?
Graybeard: I try it out in the REPL.
Youngstar: The what?
Graybeard: REPL stands for “read eval print loop”, you might
also know it as “the interactive prompt”. You write little pieces
of code and test them as you go. After I’m done and happy
with the code, I write some tests.
People underestimate how much the REPL helps during
development, give it a try next time.
Youngstar: OK, I will. Which testing framework do you use?
Graybeard: I personally prefer pytest¹⁰⁸, I’ve used unittest¹⁰⁹
with discover mode as well.
Youngstar: Why do you prefer pytest?
Graybeard: I find pytest simpler, and I always go for simple.
Also love their parametrize fixtures¹¹⁰ which let you run the
same test with different input (AKA table driven testing).
Their xunit output¹¹¹ is great for Jenkins¹¹² integration as well.
Oh, and I also use tox¹¹³ for testing the same code on multiple
versions/implementations of Python.
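A short sketch of table driven testing with pytest.mark.parametrize. It assumes pytest is installed, and the function under test, to_slug, is made up for the example. pytest runs test_to_slug once per row in the table and reports each row as a separate test.

```python
import pytest


def to_slug(title):
    """Function under test: lower case and join the words with dashes."""
    return '-'.join(title.lower().split())


# One row per test case: the input and the expected output
cases = [
    ('Danger Zone', 'danger-zone'),
    ('  Phrasing  ', 'phrasing'),
    ('', ''),
]


@pytest.mark.parametrize('title, expected', cases)
def test_to_slug(title, expected):
    assert to_slug(title) == expected
```

Adding a test case is a one line change to the table, with no new test function needed.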
Youngstar: I’ll start with pytest then, don’t need multi
version testing currently. How do I run the tests?



Graybeard: pytest comes with a pytest script that discovers
and executes tests. But this is usually the last thing I run.
Youngstar: Last? What do you run before it?
Graybeard: A few things: I check that there are no calls to pdb
in the code.
Youngstar: pdb is the Python debugger?
Graybeard: Yes, you can insert calls to it if the breakpoint
condition becomes too complicated. We’ll talk about debugging later. Another thing I do is clean all the compiled
files.
Youngstar: The .pyc files that are generated on import?
Graybeard: Say you renamed a module but forgot to change
the import in your code. Since the .pyc of the old module is
still there - your test will pass.
Youngstar: Gotcha.
Graybeard: I also run a linter. I use flake8¹¹⁴, which combines
pyflakes¹¹⁵ and pep8¹¹⁶, before the tests and fail on any output.
Youngstar: Does flake8 check for coding conventions?
Graybeard: Yes, this is how I avoid wasting time on coding
convention talks. If the code passes flake8 - it’s fine. However
don’t get too stuck on coding conventions, see Raymond
Hettinger’s talk called Beyond PEP8¹¹⁷.
Youngstar: Will do, anything else?
Graybeard: Nope. After that I run the test suite.



Youngstar: Sounds like a lot of steps. Knowing you, you
probably have a script to do this.
Graybeard: Correct, I’ll mail it over if I remember. But I’m
sure you can code it yourself. The steps are: Clean .pyc, search
for pdb, run flake8 and finally run the tests.
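One way the four steps could be coded, as a sketch and not the script Graybeard mentions. The function names are made up, and it assumes flake8 and pytest are installed in the current environment.

```python
import re
import subprocess
import sys
from pathlib import Path


def clean_pyc(root):
    """Delete compiled .pyc files under root, return how many were removed."""
    count = 0
    for path in Path(root).rglob('*.pyc'):
        path.unlink()
        count += 1
    return count


def find_pdb(root):
    """Return (file name, line number) pairs for lines that mention pdb."""
    hits = []
    for path in sorted(Path(root).rglob('*.py')):
        text = path.read_text(errors='ignore')
        for lineno, line in enumerate(text.splitlines(), 1):
            if re.search(r'\bpdb\b', line):
                hits.append((str(path), lineno))
    return hits


def main(root):
    """Run all the steps on root, return a process exit code."""
    clean_pyc(root)
    hits = find_pdb(root)
    for name, lineno in hits:
        print('{}:{}: debugger call found'.format(name, lineno))
    if hits:
        return 1
    # flake8 exits with a non-zero code when it reports anything
    for tool in ('flake8', 'pytest'):
        code = subprocess.call([sys.executable, '-m', tool, root])
        if code != 0:
            return code
    return 0
```

Calling sys.exit(main('archer')) at the bottom of the script turns it into the single command to run. Point it at the package and tests directories and not at the script itself, since the script mentions the debugger by name and would flag its own source.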
Youngstar: I’ll remind you to mail me.
Graybeard: Thanks. Having one command to run your tests
also makes sure other members in your team don’t forget
steps. I’m not the only one with a one bit memory.
Youngstar: In some cases I found out the tests run for a long
time. Which makes it annoying to run them every time I make
a change.
Graybeard: My rule of thumb is that developers won’t run
tests that take more than about a minute.
Youngstar: So how do you run longer tests?
Graybeard: With my friend Jenkins¹¹⁸.
Youngstar: Is it the system that monitors your source tree and
runs tests on every change?
Graybeard: Yes. It’s called “continuous integration” or CI for
short. Jenkins can do much more but at its heart this is exactly
what it does.
I separate the tests to faster ones that can run on a developer
machine without too much setup and longer ones that run on
Jenkins.
pytest has a way to mark tests and pick a subset of tests
to run. In unittest I use environment variables and a special
exception that’s called SkipTest.
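A sketch of the unittest variant. The environment variable name RUN_SLOW_TESTS is made up for the example. The CI server would set it, developer machines would not.

```python
import os
import unittest


class TestStorage(unittest.TestCase):
    def test_fast(self):
        # Quick test, runs everywhere
        self.assertEqual(sum([1, 2, 3]), 6)

    def test_slow(self):
        # Skipped unless the (made up) RUN_SLOW_TESTS variable is set
        if not os.environ.get('RUN_SLOW_TESTS'):
            raise unittest.SkipTest('slow test, set RUN_SLOW_TESTS to run')
        self.assertEqual(len(range(10 ** 6)), 10 ** 6)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestStorage)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test is reported as skipped and not as a failure, so the run still counts as successful. In pytest the matching tools are custom markers such as @pytest.mark.slow together with the -m command line option.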



Youngstar: And when Jenkins runs the tests it selects all of
them?
Graybeard: Yup. A common mistake that people make is to write
a lot of code in the Jenkins execute field.
Youngstar: Why is it a mistake?
Graybeard: Because then it’s usually not in source control.
Youngstar: Ah! And then if you want to make changes to how
tests are run - you change the script and commit.
Graybeard: Exactly. Note that Jenkins can do much more but
start simple as always.
Youngstar: Another thing I recall we talked about was to
make sure tests don’t get into production.
Graybeard: Yes, try to make it impossible for tests to get or
touch production.
Youngstar: Any more advice?
Graybeard: Yes - clean up at the start of the test.
Youngstar: Say what?
Graybeard: Most test frameworks give you setup and
teardown methods. Most people create what they need in
the setup, for example setting up database tables and populating
them with data. Then they use the teardown to clean up everything. The problem is that teardown gets called even when the
tests fail, and then if you want to debug - the data is missing. If
on the other hand you use only the setup method and first
clean up and then populate, you’ll still have data to debug if
the tests fail.
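A sketch of the clean-up-on-setup idea with unittest and sqlite3. The database file name and table are made up for the example. There is no tearDown, so the file stays on disk after the run and can be inspected when a test fails.

```python
import os
import sqlite3
import tempfile
import unittest

# Hypothetical location of the test database
DB_FILE = os.path.join(tempfile.gettempdir(), 'archer_test.db')


class TestUsers(unittest.TestCase):
    def setUp(self):
        # Clean up leftovers from the previous run first ...
        if os.path.exists(DB_FILE):
            os.remove(DB_FILE)
        # ... and only then populate
        self.conn = sqlite3.connect(DB_FILE)
        self.conn.execute('CREATE TABLE users (name TEXT)')
        self.conn.execute("INSERT INTO users VALUES ('lana')")
        self.conn.commit()
        # Closes the connection after the test, the file itself is kept
        self.addCleanup(self.conn.close)

    def test_count(self):
        query = 'SELECT COUNT(*) FROM users'
        count = self.conn.execute(query).fetchone()[0]
        self.assertEqual(count, 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestUsers)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After a failing run, opening the file with the sqlite3 command line tool shows the exact data the test saw.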
Youngstar: Will do.
Graybeard: The last thing to remember…



Youngstar: Yay, there’s more!
Graybeard: Testing is a craft of its own, and done right it’ll
save you a lot of agony. But no matter how hard you test - bugs will get out into production and you need to be ready
for that. Monitoring and alerting is something we’ll talk about
another time. NASA, which has a very strict and thorough
development process¹¹⁹, still manages¹²⁰ to ship bugs to outer
space.
Youngstar: Really?
Graybeard: Yup. But they have a system in place to fix bugs
in outer space as well.
Youngstar: I guess I’ll have to mock some parts of the system
for testing, any advice on this?
Graybeard: In general - don’t mock! Every time you use a
mock you cheat and don’t really test your system. Mocks are
another “mass” you acquire and need to be updated to match
what they are mocking. I’ve found out that with a little effort
you can usually avoid mocking. I once worked at a company
where we were doing web scraping, getting HTML pages,
parsing them, analyzing and storing in a database. At first
someone suggested we mock the HTTP connection and get
a canned HTML. But with a bit more coding we created an
HTTP server using Flask¹²¹ which returned canned HTML
pages. This way we also tested our connection infrastructure
and when we wanted to test accessing pages with user/password - it was easy to add these kinds of pages to the test HTTP
server.
However sometimes the cost of not mocking is too much -
“pain vs gain” again. There’s a mock¹²² package in the Python
3 standard library, and for Python 2 it’s available on PyPI¹²³.
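A sketch of the canned server idea. To stay self-contained it uses http.server from the standard library and not Flask, which is what the story above used. The pages and the fetch helper are made up for the example. The point is the same: the code under test makes a real HTTP request and nothing is mocked.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned pages keyed by path
PAGES = {
    '/': b'<html><body><h1>Home</h1></body></html>',
    '/about': b'<html><body><h1>About</h1></body></html>',
}


class CannedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the test output quiet


def fetch(port, path):
    """Stand-in for the code under test, makes a real HTTP request."""
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    try:
        conn.request('GET', path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


# Port 0 asks the operating system for any free port
server = HTTPServer(('127.0.0.1', 0), CannedHandler)
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()
try:
    status, html = fetch(server.server_port, '/about')
    missing, _ = fetch(server.server_port, '/nope')
finally:
    server.shutdown()
    server.server_close()

print(status, missing)
```

Adding a new scenario, such as a page that requires a password, means adding a handler branch to the test server and leaves the production code untouched.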
Youngstar: Any more advice?
Graybeard: Testing is a bottomless pit. We can talk on it for
hours, but I’m getting tired and I think we covered the main
points. Also my beer is empty - going home now.
Youngstar: Cheers.

• Find the “gain vs pain” balance for your
• Have one script to run tests
• Have a CI system, Jenkins is a good bet
• Separate tests into ones developers run and
ones Jenkins runs
• Cleanup on setup
• Make it impossible for tests to get into
• Avoid mocking as much as you can
• No matter how hard you test, some bugs
will slip through - be ready for this


Configuration
Amateurs think about tactics, but professionals
think about logistics.
- General Robert H. Barrow
Youngstar: I now have two environments where the code runs.
We have a production environment but we also have a QA
environment. I have an if env == 'PROD': in my code but I’m
not too happy about it. I also remember you once said I should
try to minimize if in my code. How would you handle it?
Graybeard: What makes you think you have only two environments?
Youngstar: Oh, you’re right. There’s also the local development environment on my machine.
Graybeard: Yeah, and the number of environments will grow.
You might want to check a new database version, a new
package version …
Youngstar: Eeeek, again accidental complexity bites us in the
Graybeard: How much did you drink? You usually get depressed later on.
Youngstar: You’re right, lemme get another round and you
can tell me how to solve my problems.
Graybeard: Sure, I’ll wait.



Youngstar fetches a new round, they drink in silence for a few
Graybeard: OK, did you figure how to solve your problem by
Youngstar: I thought of some kind of configuration system,
then have a configuration file per environment. Probably use
JSON since writing my own format is bad.
Graybeard: Why JSON?
Youngstar: There’s already a parser and it’s well known.
Graybeard: Would you like to have some comments in your configuration?
Youngstar: Probably yes … that rules out JSON. YAML?
Graybeard: YAML is a great format for configuration. I use
it a lot, but there’s something even simpler.
Youngstar: YAML is pretty simple, you just load the configuration file. The only way it could be simpler is if the configuration
were already in Python … Oh - so I’ll use Python.
Graybeard: Yes. I usually start with a system where I have the configuration in a Python module and just import it. Having said that, a YAML (or
other format) based system is good as well. But start the
simplest way you can.
Youngstar: But then how do I get a different configuration
per system?
Graybeard: You can have an overrides file where you place
values per system, or use environment variables. Then I have
a system that reads the overrides and updates the configuration.
Youngstar: How do you manage these override files?



Graybeard: Yes. In most cases the deployment system, say
Ansible¹²⁴, will generate one based on the environment.
Youngstar: And I guess the defaults in the Python file should be
for the local development environment?
Graybeard: That’s right.
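A Python-based configuration with overrides from environment variables could look roughly like this. It is a sketch under assumptions: the setting names, the MYAPP_ prefix and the load_config helper are hypothetical, and an overrides file could be merged in the same way.

```python
import os

# Defaults target the local development environment.
DEFAULTS = {
    "db_host": "localhost",
    "db_port": 5432,
    "debug": True,
}

ENV_PREFIX = "MYAPP_"  # hypothetical prefix for override variables


def _convert(raw, default):
    """Convert the string from the environment to the type of the default."""
    if isinstance(default, bool):  # check bool first, it is a subclass of int
        return raw.lower() in ("1", "true", "yes")
    return type(default)(raw)


def load_config(environ=None):
    """Return the defaults updated with any MYAPP_* overrides."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    for key, default in DEFAULTS.items():
        raw = environ.get(ENV_PREFIX + key.upper())
        if raw is not None:
            config[key] = _convert(raw, default)
    return config
```

On a developer machine nothing is set and the defaults apply; on QA the deployment system would export something like MYAPP_DB_HOST before starting the program.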
Youngstar: This system looks good enough for my usage,
anything else?
Graybeard: There are many ways to do configuration, and
you should pick the one that fits your case. We talked about
overrides, the usual order is defaults < configuration < environment variables < command line switches. You can use
something like ChainMap¹²⁵ for this.
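The override order above maps directly onto collections.ChainMap, which looks a key up in each mapping in turn and returns the first hit. In this sketch the option names, the APP_ prefix and the build_config helper are assumptions made for the example.

```python
import argparse
from collections import ChainMap

DEFAULTS = {"port": "8080", "log_level": "INFO"}
ENV_PREFIX = "APP_"  # hypothetical prefix for environment variables


def build_config(argv, environ, file_config):
    """Combine the sources; mappings listed first take priority."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--port")
    parser.add_argument("--log-level", dest="log_level")
    args = parser.parse_args(argv)
    # Keep only the switches the user actually passed.
    cli = {key: value for key, value in vars(args).items() if value is not None}
    env = {
        key: environ[ENV_PREFIX + key.upper()]
        for key in DEFAULTS
        if ENV_PREFIX + key.upper() in environ
    }
    # command line > environment > configuration file > defaults
    return ChainMap(cli, env, file_config, DEFAULTS)
```

In a real program the call would be something like build_config(sys.argv[1:], os.environ, file_config), with file_config loaded from the overrides file.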
Youngstar: OK. I guess adding command line support helps
in quickly testing other systems.
Graybeard: Yes, sometimes the command that starts your
program (say docker¹²⁶) gives all the right switches. Then you
can go without a configuration system at all in your code.
Youngstar: It’s not true, you just moved the configuration
system to the deployment/running system.
Graybeard: I said “in your code”. Glad you caught that, many
people when they talk about “zero configuration” mean “in
the code”. There’s a nice thing about not having configuration
in your code, but I found out that the code is usually tested
better than the configuration system. I prefer to have the
complexity where there are more tests.
Youngstar: What about storing configuration in a server?



Graybeard: People do that as well, they use systems like
Consul¹²⁷, etcd¹²⁸, ZooKeeper¹²⁹ and others.
Youngstar: Then you just need to know where the configuration server is.
Graybeard: Yeah, but then someone needs to populate the
configuration values on the server.
Youngstar: Agree. Anything else about configuration?
Graybeard: There’s so much more. Some people believe you
should use just environment variables.
Youngstar: Why?
Graybeard: Read the 12 factor app¹³⁰ and see.
Youngstar: Yay, more reading. By the way: I am using fabric¹³¹ for deployment, should I switch to Ansible?
Graybeard: Depends on the complexity of your deployment.
fabric is very simple so I usually start there and switch
to something more complex only when I need to. If you
use a docker based system like docker-compose¹³² or kubernetes¹³³, they have their own system for hooking containers
Youngstar: And then my code uses less configuration.
Graybeard: Exactly. But beware of jumping into docker - it’s
cool but comes with its own set of problems.
Youngstar: Which are?



Graybeard: Let’s talk about it later when we discuss deployment.
Youngstar: OK. I guess as usual I’ll start simple and grow in
complexity when I need to.
Graybeard: So young and so wise.
Youngstar: That’s right. Anything else I should know regarding configuration?
Graybeard: Sometimes you compose configuration values
from other configuration values. Make sure to do that after
you read the overrides. When I get there I usually add an
init function to the configuration system and call it when
the program starts.
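An explicit init function for composed values might look like the sketch below. The setting names and the URL format are hypothetical; the point is only that db_url is built after the overrides were applied.

```python
# Sketch of a configuration module; all names are hypothetical.
values = {
    "db_host": "localhost",
    "db_port": 5432,
    "db_url": None,  # composed from the two values above
}


def init(overrides=None):
    """Apply overrides, then compose derived values. Call once at startup."""
    values.update(overrides or {})
    values["db_url"] = "postgresql://{db_host}:{db_port}/app".format(**values)
```

The program's entry point calls init(...) once, before any other module reads values["db_url"].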
Youngstar: Why not do it automatically on import?
Graybeard: You tell me.
Youngstar: Since I don’t control the order of imports. Some
module I import can import the configuration as well.
Graybeard: Also as the Zen says: “Explicit is better than implicit.”
Youngstar: Good old Tim, he knew what he was talking
about. Anything else?
Graybeard: Configuration can get very tricky, fight hard to
keep it simple so you won’t end up with a very complex set of
rules. There will also be an edge case where your configuration
system falls short. As long as it supports the majority of cases
- you’re fine.
Youngstar: As usual, simple things go very deep with you.
Graybeard: A good configuration system will reduce the
complexity in your code. This complexity doesn’t go away,



but it’s contained somewhere else which is a good thing.
Youngstar: What about passwords and other “secret” stuff?
Where do I store it?
Graybeard: Make sure they don’t make it into the configuration or
get checked in by mistake. We’ll have a talk on security later…
Youngstar: OK then.

• Start simple. A Python based configuration
system with overrides will get you a long way
• Know that most times you move configuration complexity to another system.
• Learn about the various solutions out there
and what people do, then adapt to your
system what works.
• Give more than one way to specify configuration. Usually we have default < configuration file < environment variables <
command line switches
• Make sure “secrets” are protected in your
configuration system and not checked into
source control

Debugging
If debugging is the process of removing bugs, then
programming must be the process of putting them in.
- Edsger Dijkstra
Youngstar: I have a bug at work that I just can’t figure out.
How do you debug?
Graybeard: I mostly don’t.
Youngstar: Come on, you’re not that good.
Graybeard: Oh, I have not mastered the art of writing bug
free code… yet. What I’m saying is that I don’t debug in the
traditional sense of using a debugger.
Youngstar: Ah, so how do you solve code problems?
Graybeard: Ever heard about Rob Pike?
Youngstar: The name rings a bell, not sure from where.
Graybeard: Look him up, he did a lot. Anyway he once said:
“If you dive into the bug, you tend to fix the local issue in the
code, but if you think about the bug first, how the bug came
to be, you often find and correct a higher-level problem in the
code that will improve the design and prevent further bugs.”
I think it was his experience when working with Ken Thompson.
Youngstar: Ken Thompson of Unix?



Graybeard: Among other things.
Youngstar: That’s all very nice, but the way to understanding
sometimes goes through debugging.
Graybeard: Right. However I’m a backend guy and most of
the time debugging is impossible. I mostly use logging to
understand what’s going on. If I do debug, it’s usually with
the command line debugger that comes with Python - pdb¹³⁴.
Youngstar: Why not a visual one?
Graybeard: Since most of the time I’m in an SSH session to
a server, which makes UI hard or impossible. Also once you
get to know pdb it’s very effective.
Youngstar: Just like mastering Vim? OK, I’ll spend some time
with it.
Graybeard: However, if you use a good IDE it’ll have a visual
debugger and sometimes these are nice. As we discussed before,
knowing your IDE well will save you tons of time.
Youngstar: OK. What else?
Graybeard: Why do you assume there’s more?
Youngstar: Since with you there’s always more.
Graybeard: Fair point. One of the tricks I use is to sometimes
place a “hard” breakpoint in the code. I do this when the
condition for the breakpoint becomes pretty complex.
Youngstar: I thought pdb supports conditional breakpoints.
Graybeard: You’re right. I can do that in pdb or other debuggers but in some cases it’s much easier to specify the condition
in Python code. What you do is something like this (writes on



if some_complex_condition():
    import pdb; pdb.set_trace()

Youngstar: I thought there were no semi-colons in Python.
Graybeard: There are, but rarely used. In this case where it’s
just debugging it’s convenient to have it in one line. I have a
Vim abbreviation for this line.
Youngstar: I bet you do.
Graybeard: Then you run your code normally, not via pdb.
And once the condition is met - you’ll get the pdb prompt. If
you have IPython installed you can use its debugger instead
of pdb, it’s a bit nicer.
Youngstar: And you make sure this is not left in the code
with your test script.
Graybeard: Exactly. I have a rule in the script that runs the
tests to check no stray pdb.set_trace() are in the code.
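Such a rule can be a few lines of Python that the test script runs before the tests. This is a sketch, not the author's actual script: the function name is made up, and the check is a plain text match that would also flag a breakpoint mentioned in a comment or a string.

```python
import re
from pathlib import Path

# Matches calls such as pdb.set_trace() and the built-in breakpoint().
BREAKPOINT_RE = re.compile(r"\b(pdb\.set_trace|breakpoint)\s*\(")


def find_stray_breakpoints(root):
    """Return (path, line number) pairs for breakpoints left under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if BREAKPOINT_RE.search(line):
                hits.append((str(path), lineno))
    return hits
```

The test script can fail the run when the returned list is not empty.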
But as I said earlier, I mostly use logs. It’s an art to get the right
balance between huge logs and too little information. Try to
err on the TMI side.
Youngstar: TMI as in “Too Much Information”?
Graybeard: Yes. Storage is very cheap compared to programmer time.
Youngstar: But what if the logs get too big?
Graybeard: You usually save only a window of time backwards. There are great tools for log rotation, both in the
standard library and Unix utilities.
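On the standard library side, logging.handlers.TimedRotatingFileHandler keeps such a window of time. A minimal sketch, where the helper name, the log format and the 14-day window are arbitrary choices for the example:

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def add_rotating_log(logger, path, days_to_keep=14):
    """Log to path, start a new file at midnight, keep days_to_keep old files."""
    handler = TimedRotatingFileHandler(
        path, when="midnight", backupCount=days_to_keep
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

Older files beyond backupCount are deleted by the handler itself when it rolls over.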
Youngstar: Like logrotate¹³⁵?



Graybeard: Exactly. You can also ship logs to log aggregation
services, we’ll talk about logging and monitoring later.
Oh, and Python’s logging module can listen on a socket¹³⁶
and change the logging configuration at run time. This way
you can temporarily change the log level in one of your modules,
collect enough data and then return it to the
normal level.
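The socket listener lives in logging.config.listen. The sketch below shows both sides; the helper names and the myapp.db logger are hypothetical. The listener expects a 4-byte big-endian length followed by the configuration text. Since the Python documentation warns that accepting configuration this way can be a security risk, consider passing a verify callable to listen() in real use.

```python
import json
import logging.config
import socket
import struct

PORT = logging.config.DEFAULT_LOGGING_CONFIG_PORT


def start_config_listener(port=PORT):
    """Run inside the service: accept new logging configuration on a socket."""
    listener = logging.config.listen(port)  # returns a thread, not yet started
    listener.daemon = True
    listener.start()
    return listener  # call logging.config.stopListening() to stop it


def encode_config(config):
    """Frame a dictConfig-style dict: 4-byte big-endian length + JSON."""
    payload = json.dumps(config).encode("utf-8")
    return struct.pack(">L", len(payload)) + payload


def send_config(config, host="localhost", port=PORT):
    """Run from an admin shell: push a new configuration to the service."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(encode_config(config))


# Hypothetical request: turn on DEBUG for one module only.
DEBUG_DB = {
    "version": 1,
    "incremental": True,
    "loggers": {"myapp.db": {"level": "DEBUG"}},
}
```

Sending the same dictionary with "INFO" later returns the module to its normal level.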
Youngstar: Cool, I’ll look it up. Anything else about debugging?
Graybeard: Today’s systems usually have more than one
part. Debugging such a system is even more complicated.
One thing I found that helps is to pass around a context
object between sub systems. This way you can search the logs
and get a logical view of an operation between several sub systems.
Youngstar: What’s in the context object?
Graybeard: Anything you think is useful. The bare minimum
is just an identifier for the current operation/session.
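One way to carry such an identifier into every log line is logging.LoggerAdapter. In this sketch the op_id field and the helper names are assumptions; passing the context between processes (an HTTP header, a message field) is left out.

```python
import logging
import uuid

# Log format that includes the operation id from the context.
LOG_FORMAT = "%(name)s op=%(op_id)s %(message)s"


def new_context():
    """Create the context at the entry point of an operation."""
    return {"op_id": uuid.uuid4().hex}


def get_logger(name, context):
    """Return a logger that adds the context fields to every record."""
    return logging.LoggerAdapter(logging.getLogger(name), context)
```

Each sub system calls get_logger with the context it received, so searching the logs for one op_id shows the whole operation.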
Another thing people do is sometimes connect to a running
service and inspect what’s going on with the Python REPL.
There are several such systems, see Twisted manhole¹³⁷ for
Youngstar: OK. Armed with this knowledge I’m heading back
to the office.
Graybeard: Remind me to talk with you about work/life
balance sometime.
Youngstar: OK.



Graybeard: But before you head back, another thing that
really helps is giving it time. Letting what Daniel Kahneman
calls “system 2”¹³⁸ work on the problem.
Youngstar: System 2?
Graybeard: Yeah, not a very imaginative name. Think of it
as the part of your brain that works below the surface. It’s the
one that does most of the leaps in understanding but it needs
time. Instead of heading back to the office, go home and watch
a video called “Hammock Driven Development”¹³⁹ by Rich Hickey.
I can’t tell you how many bugs I solved while jogging.
Youngstar: Oh, we definitely need to talk about work/life
balance and how you have time to learn all this stuff.
Now that you mention it, I see my empty beer glass.
I guess I’m over my “Ballmer Peak”¹⁴⁰, so I’ll go home and
watch that video.
Graybeard: Kudos on knowing your XKCD.
Youngstar: Thanks and g’night.



• Writing simple code will make debugging easier
• Understand the bug before you fix it
• Know how to work a debugger. Both from
IDE and command line
• When fixing a bug try to make sure this
kind of bug won’t happen again
• Use logs, err on the TMI side
• Use automation to make sure debugging
code doesn’t get to the source tree
• Give your subconscious time to work

Deployment
May the queries flow, and the pagers remain silent.
- SRE¹⁴¹ Benediction
Youngstar: I’d like to place my code out there in alpha state
so people can play with it.
Graybeard: Getting feedback early is a very good thing.
Where are you going to put the code?
Youngstar: That’s what I was going to ask you. There are so
many options - AWS, GAE, Heroku, Azure, my own servers
… Which one do you use?
Graybeard: I use the one that fits my needs.
Youngstar: That was helpful.
Graybeard: The point is that there’s no “one size fits all”.
It depends on many factors. And I use different hosting
solutions in different situations.
Youngstar: One of these factors is if I can place my data
Graybeard: Yes. A lot of companies think their data is safer
if they keep it in house. However I tend to trust the Google/Amazon security experts much more than the local IT.
Youngstar: I don’t know much about security.



Graybeard: We’ll fix that later. However today it’s more common for companies to host data outside. And even companies
that say “we host data ourselves” usually mean “on our hosted
servers”. Sometimes you can’t host data outside due to legal
reasons or some regulation.
Youngstar: IANAL, but I think I’m OK with hosting data
Graybeard: What most companies underestimate is the cost
of having your own servers. Scaling up becomes much more
painful, and you need people doing rotation who can drive
at 3AM to some Colo, have the right keys and know how to
reboot the servers.
Youngstar: Colo?
Graybeard: Short for “co-location centre”. It’s usually a secure place for your servers with good network, security and
other goodies.
Youngstar: So not from the office network?
Graybeard: Sadly I’ve seen that too.
Youngstar: OK, I’ll start with the cloud then. Which one?
Graybeard: There are many options and many variables you
need to consider. As usual - some research required.
Youngstar: Such as pricing?
Graybeard: Pricing is one aspect. However most companies
don’t fathom how time consuming operations can be.
Youngstar: And by time you mean money.
Graybeard: Exactly. I’d do my best to limit my operational
Youngstar: OK, less ops is better. What else?



Graybeard: Try to avoid vendor lock-in.
Youngstar: By using open standards?
Graybeard: Yes, and also creating abstractions in your code.
Youngstar: “All problems in computer science can be solved
by another level of indirection”.
Graybeard: Did you catch my quote addiction? Was this
David Wheeler?
Youngstar: Yup. Just stumbled on this the other day.
Graybeard: Another thing you need to take into consideration when choosing who to use is size and reputation.
Youngstar: Very much like selecting technologies to use.
Graybeard: As the old joke says: “Nobody ever got fired for
buying IBM”. Sometimes it’s OK to bet on younger products,
but infrastructure is something you need working.
Youngstar: “Stability is sexy”.
Graybeard: Oh, you actually listen to what I say. I’m flattered.
Youngstar: Yeah, yeah. Go on.
Graybeard: Once you’ve decided on hosting that fits your budget and seems decent enough, you need to fit deployment to
your process. The ideal today is called continuous delivery¹⁴²
- once tests pass on Jenkins, the code goes to production.
Youngstar: I heard that deployment is painful.
Graybeard: It doesn’t have to be. There’s a piece by the late
Aaron Swartz called “Lean into the Pain”¹⁴³. He says that just



like sport, we need to do the stuff that hurts us a lot in order
to get better at it.
Youngstar: And when we deploy a lot it won’t be an issue.
Graybeard: Yup. Note that there are deploys and there are
deploys. Most of them will be a non issue, but some of them
will give you a headache.
Youngstar: Can you give me an example?
Graybeard: Changing a database schema in a non backward-compatible way.
Youngstar: Which means you need to re-process all the data?
Graybeard: Yes. And also you’ll have some processes still
working with the old format and some working with the new
Youngstar: Ouch!
Graybeard: There’s a reason NoSQL is popular.
Youngstar: You can make breaking changes in NoSQL.
Graybeard: That you can, but it’s sometimes easier. However
you pay in other areas, like lack of transactions. Pick your
Youngstar: OK. I’ll think about it and try to automate my
deployment as much as possible.
Graybeard: Good plan. Another thing which is hard in some
platforms is zero downtime.
Youngstar: I read about it. So many options - Blue Green¹⁴⁴,
Canary Releases¹⁴⁵, Rolling deployments¹⁴⁶ …



Graybeard: As usual, go simple and scale when you need.
Some platforms like GAE do it for you.
Youngstar: Cool. They scale as well?
Graybeard: Yes. So does AWS and others. You need to take
care to limit scaling otherwise a spike in load can get you
Youngstar: Ouch!
Graybeard: It also hurts when users can’t access your site due
to load.
Youngstar: I’ll pick my poison.
Graybeard: You’re learning. It’s all about trade-offs.
Youngstar: What else?
Graybeard: You need to make sure you don’t have snowflake servers.
Youngstar: I thought servers like cold temperatures.
Graybeard: What Martin Fowler means is a unique server
that you can’t rebuild if it’s gone.
Youngstar: So automate again. Which tool? Ansible¹⁴⁸, SaltStack¹⁴⁹, Chef¹⁵⁰, Terraform¹⁵¹…
Graybeard: Do your homework and ask around. I usually
start simple with Fabric¹⁵² and move to the heavyweights
when I need them.
Youngstar: OK. I will.



Graybeard: Automation also helps with avoiding errors. Some
people swear by checklists, but manage to forget a step.
Youngstar: I get it, you sent me the “automate all the things”¹⁵³
meme enough times already.
Graybeard: OK, moving on then … It’s important that production
won’t be your only environment. You need one or more
for QA.
Youngstar: But probably not that fancy.
Graybeard: Yup. So make sure to parameterize everything - cluster size, machine type …
Youngstar: What about Docker¹⁵⁴?
Graybeard: Docker helps in some aspects - it takes you out
of dependency hell. However it comes with another level of
Youngstar: TANSTAAFL?
Graybeard: Exactly. Docker also lets you create a copy
of the production environment on your local machine, which is
Youngstar: Anything else?
Graybeard: A nice thing is to mark deployment times on your
monitoring graphs. This way if you see a spike in errors it’s
easy to see if it’s related to a specific release.
Youngstar: Just a vertical line?
Graybeard: Any way you want, as long as it’s visible.
Youngstar: OK.



Graybeard: Also make sure you can do a rollback. If
a release goes bad you need to be able to quickly get back.
Blue-Green and rolling releases help with this.
Youngstar: Don’t forget the cute canaries.
Graybeard: That’s right. They were helpful at the coal mines
and they are helpful now. Every release is a risk.
Youngstar: And we don’t like risk.
Graybeard: Yeah. In “Keys to SRE”¹⁵⁵ Ben Treynor talks about
“error budget”. If a deployment went bad and there’s down
time - it’s taken out of your error budget and you release less.
Youngstar: Sounds reasonable. It seems there’s so much infrastructure to build and process to develop.
Graybeard: Yeah. And backups which work, and security and
Youngstar: OK. I get it - ops is a lot of time and money. Final
advice before my head explodes?
Graybeard: Get more beer?
Youngstar: I mean deployment wise.
Graybeard: I usually start with GAE which is zero ops and
once things start to heat up - I look into other platforms. Or
stay in GAE if it gives me all that I need.
Oh, and no deploys close to weekends or vacations.
Youngstar: OK. I’ll take a good look at my architecture and
see if it can fit in one of the no-ops hosting options. And now that
beer please.
Graybeard: Sure thing.



• Don’t underestimate how much operations
will cost you in time and money
• Pick a solution that will reduce the operations burden
– Automate everything you can
• Do your homework. Learn about deployment methods, tools and procedures
• Be ready to roll back releases
• Mark releases in your monitoring tools

Monitoring & Alerting
On a long enough timeline, the survival rate for
everyone drops to zero.
- “Fight Club” movie
Youngstar: Our logging system paid off this week.
Graybeard: Do tell.
Youngstar: A customer called to say they are missing some
data. A quick search in the log files found that one sub system
was down for a couple of days, we brought it back up and the
missing data was in front of the customer’s eyes in about an hour.
Graybeard: Fixing a system in an hour is indeed good.
However I think you can do better.
Youngstar: Better than that? How?
Graybeard: You need to know about problems before your customers do.
Youngstar: Well, we have great logging. But we look at the
logs after we found out there’s a problem. We do monitor our
machines for load, disk space and other things. However this
was an application crash and didn’t cause a system problem,
it actually reduced the load.
Graybeard: Two things: One is that monitoring without
alerting is not that helpful - nobody is watching the graphs



24/7. Second is that there are better things to monitor than
disk space.
Youngstar: Let’s take these one at a time. You’re saying I need
some automated system that will alert me when a metric goes
Graybeard: Yes. You usually start with a fixed threshold,
but as your system grows complex you need more advanced
methods. Remember that if you have too many alerts - people
will ignore them. It’s the classic “the boy who cried wolf”
story. There are some cool new systems now that apply
“anomaly detection” algorithms to metrics. There are even
companies that provide a service where you send them your
metrics and they alert when they find an anomaly.
Youngstar: I’ll start simple with manual thresholds and move
to more sophisticated stuff later.
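A fixed-threshold check can start as a single function. The metric names and limits below are invented for the example; a missing metric is reported too, since a sub system that is down often stops reporting instead of crossing a limit.

```python
def check_thresholds(metrics, thresholds):
    """Return alert messages for metrics that are missing or above their limit."""
    alerts = []
    for name, limit in sorted(thresholds.items()):
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: no data")
        elif value > limit:
            alerts.append(f"{name}: {value} > {limit}")
    return alerts
```

A periodic job would call it with the latest metrics and hand any messages to whatever does the paging.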
Graybeard: Yup. “start simple” always wins. Other questions
you need to ask yourself about alerting are “who?” and
Youngstar: We’re a small team, I guess everyone should pitch in.
Graybeard: Yeah. One company I worked with had a good
rotation system. There were weekly shifts, rotating on Monday
at noon. Each shift had a primary and secondary role.
Youngstar: I don’t believe that everyone can solve every
Graybeard: Yeah, but it’s the Pareto principle¹⁵⁶ - most errors
are easy to solve. The big bonus is that everyone feels the pain
of a failing system and starts writing more robust code, and
also pays more attention in code reviews.



I saw a great talk called “Keys to SRE”¹⁵⁷ by the guy who
started the SRE team in Google.
Youngstar: SRE?
Graybeard: Site Reliability Engineering. It’s the group that makes
sure things keep running in Google.
Youngstar: OK.
Graybeard: Where was I? … Oh yeah, in the video he mentions that a couple of sleepless nights do wonders for the
stability of code people write.
Youngstar: I can see that. And I think that will be a good fit
for my small team. I’ll give it a try - getting woken up at 3am
gets old real fast. How do you actually alert?
Graybeard: Usually by an alert to a cellphone, pagerduty¹⁵⁸ seems
to be very popular. It’s also good to alert the ops chat room.
Youngstar: OK. And if I recall you recommend doing a postmortem on every issue.
Graybeard: Yeah, start with 5 whys¹⁵⁹ and develop your own
system. Along the way update your “red book” for what to do
when shit happens.
Youngstar: I thought shit happens all the time.
Graybeard: That’s right. Now let’s talk about what to monitor.
Youngstar: I guess the usual - disk space, load, memory …
Graybeard: Right and wrong.
Youngstar: Gee, that’s helpful.



Graybeard: Let me ask you - how does an 80% full disk affect
your revenue?
Youngstar: Hmm. Well, it’s an indication that I’m going to
have a problem and this might drive out users. Hard to place
a number on this.
Graybeard: Right. Also let’s say everything looks OK system
wise but your users can’t see data from the last 2 days.
Youngstar: I guess I need to check that as well.
Graybeard: Most people start “bottom up” from system metrics to system health. But the more important one is “system
health”, you need to monitor your KPIs.
Youngstar: The what?
Graybeard: KPI - Key Performance Indicator. You need to be
up to date with your TLAs.
Youngstar: Three Letter Acronym?
Graybeard: Yup. Take Netflix for example, they have one
major KPI they monitor called SPS¹⁶⁰ - starts per second. It
follows a wave pattern, and if there’s some deviation from this
pattern - they take a look.
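A very simple way to spot a deviation in a periodic KPI is to compare the current value with the value at the same time one week earlier. This is an assumption made for illustration and not a description of what Netflix does; the 30% tolerance is arbitrary.

```python
def deviates(current, same_time_last_week, tolerance=0.3):
    """True if current differs from last week's value by more than tolerance."""
    if same_time_last_week == 0:
        return current != 0
    change = abs(current - same_time_last_week) / same_time_last_week
    return change > tolerance
```

Comparing week over week absorbs the daily wave, but holidays and regional differences still need a smarter baseline.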
Youngstar: I see. But then you need to hook your own
monitoring to your programs. It’s also harder to find problems
in a wave-like pattern, which I guess differs from country to
country and changes over the holidays.
Graybeard: Yes, it’s harder but better. Most of the time people
measure what’s easy and not what’s important. Take highway
police for example.
Youngstar: What about them?



Graybeard: They do a lot of speed traps, not because speed is
the major cause of accidents, but because it’s easy to measure.
Unlike reckless driving, which is far more dangerous but
harder to catch.
Youngstar: I see. And how do I find these all important KPIs?
Graybeard: That’s a business question, I’m a tech guy. You’re
the one owning a company - go and figure it out. As usual
start simple and optimize along the way.
Youngstar: What about the other monitoring - disk, CPU,
memory …
Graybeard: Keep them, but try to figure out how they
affect your business.
Youngstar: Anything else?
Graybeard: Yes - automate as much as you can.
Youngstar: For example?
Graybeard: If the disk is getting full, and you know a place
where you can clean up - do it. Even better run what I call a
janitor process periodically to clean things up.
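A janitor can be as small as a function that removes files older than some age, run from cron or a similar scheduler. The function name and the age-based policy are assumptions made for this sketch.

```python
import time
from pathlib import Path


def clean_old_files(directory, max_age_days, now=None):
    """Delete regular files in directory older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Returning the removed names makes it easy to log what the janitor did on each run.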
Youngstar: Sounds good. What system do you recommend
for this?
Graybeard: There are many, many systems out there. See
what you need and what they offer and try to find a good
match. As usual go with boring reliable technology. Lately
I’ve been using the ELK¹⁶¹ stack, but that’s just a personal
preference. I already had Elasticsearch in place, so not using
yet another system looked like a win to me. But really - have a
look around, there are many and it might be that one of them
is a better fit to your needs than ELK.



Youngstar: Great, more homework. Anything else?
Graybeard: It’s a good idea to do “ops drills” where you
simulate problems and people solve them.
Youngstar: I guess we’ll have plenty of the real thing to
practice on.
Graybeard: It’s better to deal with your first outage not at
3am with a customer shouting over the phone. Also other
team members can look and learn.
Youngstar: Isn’t that what Netflix chaos monkey¹⁶² does?
Graybeard: Sort of, but wait until you get there. By the way,
they have more tools that destroy things; it’s
called the Simian Army¹⁶³ now.
Youngstar: Oh my… I need another drink to reflect on that.
Want some?
Graybeard: OK, I get the hint. I’ll shut up about monitoring
and alerting now :)

• Identify your KPIs and monitor them
• Start with simple thresholds and move to
more sophisticated systems later
• Have a pager duty rotation, everyone
should pitch in
• Automate recovery as much as you can
• Update a “red book” for solving problems
• Do a postmortem for every outage
• Have ops drills

Security
First rule of computer security: don’t buy a computer. Second rule: if you buy one, don’t turn it on.
- Dark Avenger
Youngstar: I was going over our HTTP logs and found some
weird stuff there.
Graybeard: “Little Bobby Tables”¹⁶⁴?
Youngstar: There was some SQL injection, some attempts to
run scripts and other fishy requests. How do I protect myself
against such things?
Graybeard: One thing you need to keep in mind is that if
someone is really targeting you - you will get hacked. Hackers
managed to get into NASA, banks and many other secure
Youngstar: So I should just give up?
Graybeard: Why do you lock your door when you leave the house?
Youngstar: So bad people won’t be able to get in?
Graybeard: And you think that people who rob banks can’t
get in your house?
Youngstar: They’ll be able to. But I do it to deter most casual
thieves. Oh, I see where you’re going with this. I shouldn’t
make myself an easy target.
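For the SQL injection attempts mentioned above, the usual way of not being an easy target is to pass user input as query parameters instead of building SQL strings. A minimal sketch with the standard sqlite3 module; the users table and the find_user helper are made up for the example.

```python
import sqlite3


def find_user(conn, name):
    """Look up users by name; name is passed as a parameter, never formatted in."""
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With the placeholder, input such as x' OR '1'='1 is compared as a plain string and matches nothing.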



Graybeard: Exactly. I’ll give you some simple rules to follow.
Keep in mind I’m not a security expert.
Youngstar: If I had a penny for every thing you’re not an
expert in…
Graybeard: You’ll probably have problems carrying all this
Youngstar: Ha. OK, rules?
Graybeard: Let’s start with the social aspect. All the security
in the world won’t help if you have weak passwords, if
your computer doesn’t ask for login when you turn it on, if
employees write passwords on a sticky note, or blindly click
on any link sent to them.
Youngstar: You mean phishing¹⁶⁵?
Graybeard: Yup. And other social hacks. The key is to be
aware, keep learning and educate people.
Youngstar: Good paranoid culture, sounds like fun.
Graybeard: Nah, just be careful - that’s all. You don’t think
locking your door makes you paranoid.
Youngstar: You’re right. But you told me that only the
paranoid survive.
Graybeard: That was Andy Grove¹⁶⁶, not me.
Youngstar: OK. Apart from culture?
Graybeard: One more thing about culture is that you need
to make security part of the process. Do security reviews
of your code - both as part of code reviews and dedicated




security audits. Appoint someone in your company to be the
security tsar.
Youngstar: Anything special I should look for in those reviews?
Graybeard: Try to think like the bad guy. “How can I break
this piece of code?”. Read “The Security Mindset¹⁶⁷” by Bruce