Forging Python

Best practices and life lessons developing Python.

Miki Tebeka

This book is for sale at

This version was published on 2019-01-14

* * * * *

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.

* * * * *

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

Table of Contents




Writing Good Code

Which Python?

IDEs and Editors

Project Structure

Managing Dependencies






Monitoring & Alerting


Going Faster


Time Management

Asking Questions

Thanks Contributors



There is a saying: “I’ve learned a lot from my teachers, more from my peers and most from my students.”

I’ve been learning from teachers and peers for a long time, and love the informal round table talks as a form of education. I try to implement this method of teaching at my company 353solutions and people really like it. As a bonus I’m learning so much from my students.

Terry Pratchett said “Writing is the most fun you can have by yourself.” I did have fun writing this book, not only from the writing process itself but also from the discussions with the people who helped. I am grateful to anyone who contributed and taught me along the way.

I hope this book will inspire you to come out and talk to people as a way of learning. You will get many different perspectives on the problems you’re facing, and as Alan Kay said: “A change in perspective is worth 80 IQ points.”

This book is open source, feel free to head over to and submit bugs, offer ideas and ask questions. I will do my best to improve this book according to your suggestions.

Happy Hacking,

Miki Tebeka, April 2018


To Adi & Shira.

Who are young and their stars shine ever so brightly.


Being honest may not get you many friends but it’ll always get you the right ones.

- John Lennon

Youngstar: Hey Graybeard, tell our readers a bit about yourself.

Graybeard: Why don’t you introduce me and I’ll introduce you?

Youngstar: Great idea. Let’s see… You’ve been around the IT industry since punch cards, you’re also the best proof that you can teach an old dog new tricks. I have no idea how you find time to learn all the cool stuff you know. You’re smart and usually pretty quiet until you start talking about technology. You’re currently not doing much work but still manage to earn a lot. Oh - and you have a quirky sense of humor.

Graybeard: Cute, I’m not that old though. About you… You’re somewhat new to IT, finished college a few years back. You’re very bright and motivated and you sold your company not long ago for way too much money. You also like to learn and are one of the few people who get my humor. Oh - and you’re a great example that a woman can make it in high tech.

Youngstar: Thanks. And yeah - I’m good at pretending to like your humor.

Graybeard: At least you try, my wife doesn’t even bother.

Youngstar: Oh, and we’re fictional characters.

Graybeard: We are?

Youngstar: Don’t pretend you don’t know. How does that make you feel?

Graybeard: Really? This is not that kind of book.

Youngstar: Can you recall how we met?

Graybeard: I think it was just when I was leaving that company to start freelancing. And you’d just arrived, still wet behind the ears.

Youngstar: Yeah, I think we had about a month together before you left. Man… those were big shoes to fill!

Graybeard: I hope the smell wasn’t that bad.

Youngstar: It was OK, I killed most of the fungus. Can you tell the readers about this book?

Graybeard: After I left, we decided to meet about once a week in “The Forge”.

Youngstar: “The Forge” is a great pub just down the road.

Graybeard: Thanks for the closed captioning. And yes - it’s a great pub. We were geeking out regularly and I was kinda mentoring you when you started that startup doing that online thingie.

Youngstar: That was both great help and a lot of fun.

Graybeard: Yeah, and we keep meeting about once a week. But it has been less fun since you made all that money selling your company and became a snob.

Youngstar: I truly hope you’re joking. Also you got some of that money, if you recall you got some equity for all the advice you gave.

Graybeard: I’m joking. Money didn’t spoil you, and once you’re out of this big company we might hack together on a new one.

Youngstar: Anything else our readers need to know?

Graybeard: The meetings we had were around Python. But I think most of the things we talked about apply to other technologies as well.

Youngstar: I agree. Well, that’s about all the time we have for the introduction. The attention span of the average reader nowadays is pretty short. We hope you’ll have as much fun reading the book as we had in those meetings.

Graybeard: Cheers!

Writing Good Code

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

- Bill Mitchell

Youngstar: Your code is always easy to read and maintain. How do you do it?

Graybeard: Thanks! It took me a lot of time and practice to get there. And I’m still improving.

Youngstar: That’s a long journey, I don’t have so much time. Can you share some of the highlights?

Graybeard: Will do, but you need to keep improving.

Youngstar: Yeah, yeah - I’ll “sharpen my axe”.

Graybeard: Good girl! The main theme is simplicity.

Youngstar: Like in KISS1?

Graybeard: Somewhat. As developers, we spend most of our time reading code, not writing it.

Youngstar: Which means it needs to be readable.

Graybeard: Exactly.

Youngstar: OK, so how do I write readable code?

Graybeard: By rewriting. I see the first iterations of code I write as sketches.

Youngstar: How do I find time to write several iterations of code?

Graybeard: As Fred Brooks said: “plan to throw one away; you will, anyhow.”

Code rewrites will happen, you need to allocate time for them.

Youngstar: Is this from The Mythical Man-Month? That’s an old book.

Graybeard: It’s old but about people, and people haven’t changed that much since it was written.

Youngstar: We haven’t changed a lot in the last 10,000 years. Back on track, what else will help me write good code?

Graybeard: Reading good code.

Youngstar: Where will I find that? I know where to find bad code - it’s everywhere.

Graybeard: Not everywhere. There are a few places where you can see amazing code. For example, almost everything written by Peter Norvig.

Youngstar: Yes, I’ve seen his spell checker, it’s awesome!

Graybeard: It is. There’s also some good code and advice in the AOSA book.

Youngstar: Oh, I’ve read some chapters. The one on Berkeley DB was good. I’ll keep on reading this book.

Graybeard: Yup, and along the way you’ll find people to follow and read their code. You might even find a good mentor.

Youngstar: That I have. Though he’s getting old.

Graybeard: Like wine, I get better with age.

Youngstar: You keep telling yourself that. Anything else about writing good code?

Graybeard: Read bad code.

Youngstar: Learn from other people’s mistakes?

Graybeard: Yes, but also look out for things you do. From time to time I go and read “How to Write Unmaintainable Code” and try to see if I do anything shown there in my code.

Youngstar: OK, will pay it a visit. What else?

Graybeard: What code does not have any bugs?

Youngstar: Eh… none?

Graybeard: Exactly!

Youngstar: You lost me there grandpa.

Graybeard: The code you don’t write, or code that you delete.

Youngstar: Oh. It’s also the fastest.

Graybeard: Exactly. In a way code is our enemy, we’d like to have less of it.

Youngstar: Can you give me an example?

Graybeard: Sure. Assume you’re asked to process some data in Excel files. This will require you to install an external library to read excel (such as xlrd). However if you ask them to send over the files in CSV format - there’s already a csv module in Python. No need to install and maintain third-party packages.

Youngstar: I see.
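Graybeard's point about the built-in csv module can be sketched roughly as follows. The column names (user, amount) and the sample rows are made up for illustration; a real file would be opened with open(path, newline='') instead of the in-memory buffer used here.

```python
import csv
import io


def total_by_user(csv_file):
    """Sum the 'amount' column per 'user' from an open CSV file."""
    totals = {}
    for row in csv.DictReader(csv_file):
        user = row['user']
        totals[user] = totals.get(user, 0.0) + float(row['amount'])
    return totals


# Hypothetical sample data standing in for a file the customer sent over.
sample = io.StringIO('user,amount\nlana,10.5\ncyril,3\nlana,2\n')
print(total_by_user(sample))
```

Nothing here needs to be installed or kept up to date, which is the whole argument for asking for CSV in the first place.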

Graybeard: Also, many times due to specification changes - you’ll have code that does nothing. Make sure to delete it. One of my most productive days was deleting a few thousand lines of unused code.

Youngstar: How did that happen?

Graybeard: Specification changes, libraries came about that did the same work …

Youngstar: I’m beginning to see what you mean by “code is our enemy”. What else?

Graybeard: Keep your functions short and with a small number of parameters. A good rule of thumb is no more than forty lines of code per function.

Youngstar: Forty? Doesn’t seem like much.

Graybeard: It’s not a law of nature, but it’ll make your code nicer. It’ll make you think in small pieces of code, which are easier to understand and maintain.

Youngstar: Also avoid globals?

Graybeard: Yup. I like functional programming since it’s easier to reason about. However you can’t avoid state, no matter how hard you try.
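A small sketch of why code without globals is easier to reason about. Both functions and the running-total scenario are hypothetical, chosen only to contrast hidden state with state that is passed in explicitly.

```python
# Style 1: hidden module-level state. The same call gives different answers.
_total = 0


def add_to_total(amount):
    """Add amount to a global running total and return the new total."""
    global _total
    _total += amount
    return _total


# Style 2: state is passed in and returned. Same input, same output.
def add(total, amount):
    """Return a new total without touching anything outside the function."""
    return total + amount


print(add_to_total(5), add_to_total(5))  # identical calls, different results
print(add(0, 5), add(0, 5))              # identical calls, identical results
```

The state did not disappear in the second style; the caller now holds it, which at least makes it visible.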

Youngstar: Sometimes TDD helps with that.

Graybeard: Yes, especially when you start out. It forces you to write small pieces of code that are easy to test. However Google for “TDD is dead” for some interesting discussion about TDD.

Youngstar: OK. Any more?

Graybeard: Did I tell you that old linguistics joke?

Youngstar: Old and linguistics? Must be a good one - do tell.

Graybeard: I’ll make it brief.

During the cold war the US created an automatic system for translating from Russian to English. When the system was ready they tested it by giving it an English sentence to translate to Russian and back. The input was “The spirit is willing but the flesh is weak” and the output was “The vodka is good but the meat is rotten.”

Youngstar: Ha! Not that bad.

Graybeard: The secret is starting with low expectations.

Youngstar: OK, and how is this related to what we’re talking about?

Graybeard: The idea is that every language has a different way of saying the same thing. In Python we call it “pythonic code”.

Youngstar: I’ve heard that term before. Mostly with reference to the Zen of Python .

Graybeard: Good old Tim Peters, he is someone to learn from.

Youngstar: So learn how to speak the language?

Graybeard: Yes. A lot of people when they start write Java in Python, C in Python etc… But you need to learn how to properly speak the language.
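To make "C in Python" concrete, here is a minimal sketch of the same task written twice. The function names and the sample list are invented for this example.

```python
names = ['lana', 'cyril', 'pam']  # made-up sample data


def numbered_c_style(items):
    """'C in Python': manual index bookkeeping."""
    result = []
    i = 0
    while i < len(items):
        result.append(str(i + 1) + '. ' + items[i])
        i += 1
    return result


def numbered_pythonic(items):
    """Pythonic: let enumerate and a list comprehension do the work."""
    return ['{}. {}'.format(i, item) for i, item in enumerate(items, start=1)]


print(numbered_c_style(names))
print(numbered_pythonic(names))
```

Both versions return the same list; the second one just says it the way the language prefers. Typing import this at the Python prompt prints the Zen of Python that Youngstar mentioned.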

Youngstar: OK, will do. Any other advice?

Graybeard: The most important thing is to have a good mental model of what you do. You’ll hear people talking about building an ontology, which means figuring out how to talk about things.

Youngstar: The “two hard things…”?

Graybeard: Naming is important, especially in Python which is dynamically typed.

Youngstar: It’s also hard to get right.

Graybeard: Yeah, it usually takes me a couple of iterations until I get names right. Generic names like “object”, “other”, … are a red flag.

But back to ontology, it’s important to define what “things” are. At a place I worked we got a bug report that we count unique users wrong. The code seemed OK so my boss went to talk to people. It turned out we had four different definitions of “unique users” in the company.

Youngstar: Ouch. I see what you mean - it starts before you code.

Graybeard: Sometimes things emerge as you write the code, then you need to revise your model.

Youngstar: OK, will do. Anything else?

Graybeard: There are many rules to follow - DRY2, SPOT3, minimizing coupling … You’ll find them as you go.

Youngstar: Any reference?

Graybeard: There’s a good summary in “The Art of Unix Programming”, and in many other places.

One trick you can do is see if you can understand your code without the comments.

Youngstar: OK. I’ll practice and read. More beer?

Graybeard: You keep asking these rhetorical questions.


Have a good mental model

Aim for readability

Don’t stop writing the first time the code works

Read other people’s code

Find a mentor

Learn how to speak the language

Which Python?

Gentlemen, choose your weapons.

- A Night in Casablanca

Youngstar: I’ve been thinking of using PyPy for my new project, I heard it’s super fast.

Graybeard: Before we get into that, let’s take a step back. Why use Python?

Youngstar: Seriously? Coming from you?

Graybeard: Programming languages are tools, not religion like some people tend to make them.

Youngstar: And if all you have is a hammer…

Graybeard: Exactly. You have some experience with other languages.

Youngstar: Mainly thanks to you.

Graybeard: So again, why Python?

Youngstar: I’m most productive with Python. Going from zero to working is fastest.

Graybeard: OK, so speed of development - which is important in a startup. What else?

Youngstar: There are many great packages I can use.

Graybeard: Yes, a good ecosystem. Audrey Tang said that “perl5 is just syntax; CPAN is the language”. I believe this is true for Python as well.

Youngstar: CPAN is Perl’s PyPI?

Graybeard: Yes. What other reasons do you have for choosing Python?

Youngstar: It’s open source?

Graybeard: And why is that a good thing?

Youngstar: It means nobody can take it away from me. And worst case, I can fix bugs in Python before an official release.

Graybeard: Yup. Gimme one more.

Youngstar: Oh, the community is great. People are usually nice and helpful, and there are a lot of articles and videos out there.

Graybeard: Right. Now let’s try to think of places where you won’t use Python, it’ll help clarify some things.

Youngstar: Embedded?

Graybeard: You mean small devices or real time requirements?

Youngstar: I guess both.

Graybeard: Yeah, it’s hard to fit Python on small devices. However it’s possible and MicroPython does a good job.

Youngstar: I’ve never heard about MicroPython, I’ll take a look.

Graybeard: As for real time - most garbage collected languages don’t fit the bill. Anything else Python’s not good for?

Youngstar: I guess if you need a lot of formal checking of your system.

Graybeard: Yeah. This leads me to what I call “the cost of error”, which has implications for many areas both in development and in business. For example, Jane Street is a trading company that uses OCaml - they claim it helps them make sure their code is correct.

Youngstar: I guess that in trading systems you feel the pain of bugs right away.

Graybeard: Yeah, ask someone from Knight Capital sometime. On the other hand, I worked in an HFT4 firm once and we used Python and made money.

Youngstar: Yeah, yeah - we all heard your war stories many times.

Graybeard: Be nice to your elders! Anything else?

Youngstar: I can’t think of anything else - do tell.

Graybeard: Hiring is one.

Youngstar: You mean finding programmers?

Graybeard: Yes, try to recruit some good Haskell programmers sometime.

Youngstar: Try recruiting good programmers in any language.

Graybeard: Right. Remind me what your startup is all about.

Youngstar: It’s a backend thingie with REST API.

Graybeard: Seriously? This is almost as bad as “It doesn’t work!” bug reports. However it’ll do for now. Looks like Python is a good fit for you.

Youngstar: What a surprise…

Graybeard: Huh! Now let’s try to see which Python. What Python distributions do you know?

Youngstar: There’s CPython, Jython, IronPython, PyPy and now I know of MicroPython. Oh and there’s the subject of Python 2 and Python 3.

Graybeard: IronPython is for .NET shops, which you’re not. Jython is for Java shops or when you need to use Java libraries - and I don’t think this is your case either.

Youngstar: And I’m running on hosted servers so MicroPython is not for me as well.
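When several Pythons are installed side by side, it helps to ask the interpreter which one is actually running. A minimal sketch using only the standard library; the helper name describe_python is made up for this example.

```python
import platform
import sys


def describe_python():
    """Return (implementation, major, minor) for the running interpreter."""
    # python_implementation() returns names such as 'CPython' or 'PyPy'.
    impl = platform.python_implementation()
    return impl, sys.version_info[0], sys.version_info[1]


print('%s %d.%d' % describe_python())
```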

Graybeard: When will you want to use PyPy?

Youngstar: For the speed?

Graybeard: TANSTAAFL

Youngstar: Gesundheit!

Graybeard: It’s an acronym for “there is no such thing as a free lunch”. What’s the downside of using PyPy?

Youngstar: Well, packages I guess. Not all of them support PyPy.

Graybeard: Yes. Going off mainstream has its downside.

Youngstar: Says the man who uses Arch Linux.

Graybeard: Trust me, there are days I regret it. But most days I’m very happy - it fits my preferences. Which is exactly what the Python you choose should do for you. So let me ask you - what are your speed requirements?

Youngstar: The faster the better?

Graybeard: Then why not pick assembly as your programming language? Even better - manufacture your own hardware.

Youngstar: I see what you mean. I need to write some business requirements and then see if Python fits them. I have a hunch it will.

Graybeard: In God we trust; all others must bring data.

Youngstar: Good one. Yours?

Graybeard: Not mine - W. Edwards Deming.
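"Bringing data" can be as simple as timing two candidate implementations with the standard timeit module. The sketch below is only an illustration; the two string-building functions and the measure helper are hypothetical, and the numbers will differ between machines and between Python implementations.

```python
import timeit


def join_with_loop(n):
    """Build a string by repeated concatenation."""
    out = ''
    for i in range(n):
        out += str(i)
    return out


def join_with_join(n):
    """Build the same string with str.join."""
    return ''.join(str(i) for i in range(n))


def measure(func, n=1000, repeat=5, number=100):
    """Return the best time in seconds for `number` calls of func(n)."""
    return min(timeit.repeat(lambda: func(n), repeat=repeat, number=number))


for func in (join_with_loop, join_with_join):
    print(func.__name__, measure(func))
```

Running the same script under CPython and under PyPy gives numbers for your own code rather than someone else's benchmark.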

Youngstar: I’ll spec and measure. Now let’s talk about Python 2 vs Python 3.

Graybeard: OK. Python 3 is the future, choose it.

Youngstar: That was easy! Should I tell it to all the people who still use Python 2?

Graybeard: There are many good reasons to keep using Python 2.

Youngstar: Because you’re an old fossil who can’t change?

Graybeard: Get off my lawn!

Youngstar: Sure, can I finish my beer first?

Graybeard: I’d say dependencies are the main reason. However the situation has improved significantly in the last couple of years. If you head over to Python 3 Wall of Superpowers (which used to be called “Python 3 Wall of Shame”) you’ll see mostly green now, which means most “top downloaded” packages support Python 3 now.

Youngstar: What other reasons are there? Legacy code?

Graybeard: You won’t believe how fast the new cool code you wrote a while ago becomes legacy code. Most of the time we improve existing code, not write new stuff. If you already have a decent code base, writing new code from scratch is a dangerous thing. Read “Things You Should Never Do” sometime.

Youngstar: How do you find the time to read all of these things?

Graybeard: I don’t have time not to. But this is something for a later conversation. Another thing you learn with experience is to appreciate things that work. Zach Holman, then at GitHub, said “Your product should be cutting edge, not your tech … stability is sexy.”

Youngstar: I wonder how we make progress then.

Graybeard: Sometimes the advantages of new technology outweigh the risk. Also, people are way too optimistic for their own good.

Youngstar: Oh, what about Anaconda? I heard people talking about it.

Graybeard: Anaconda is based on CPython, and comes bundled with scientific packages. There are other scientific Python distributions out there but it seems to be the dominant one. If you plan to use a lot of scientific packages, such as numpy, scipy, matplotlib and others, give Anaconda a try.

Youngstar: I don’t have plans for that now, and as you said earlier switching is not that painful.

Graybeard: Just make sure you have a good test suite.

Youngstar: Will do, but testing is a big subject and we’re getting to the point where my boyfriend gets jealous of you. Final recommendation?

Graybeard: Don’t be lazy, do your homework and find the right Python, or other programming language, for you. Note that switching from one Python to another shouldn’t be that difficult. At one place we had to switch from Python 3 to 2 due to a dependency issue, it took us about half a day to do that.

Youngstar: So the decision is not that crucial?

Graybeard: It is, don’t take it lightly. We were lucky the switch was easy, you might not be.


Choose CPython 3.x if you have a new project with few dependencies; Python 3 is the future

Choose CPython 2.x if you have an older code base or dependencies that do not support Python 3

Choose Jython if you need interaction with Java, or if you’re in a Java shop and want to sneak Python in the back door ;)

Similarly, choose IronPython if you need interaction with .NET

Choose PyPy if you need some speed and love living on the edge

Use the Anaconda distribution if you use a lot of scientific Python packages

IDEs and Editors

All mail clients suck. This one just sucks less.

- Michael R. Elkins (mutt website)

Youngstar: What are you using to write Python code?

Graybeard: Vim, I use it for everything.

Youngstar: Cool, so I’ll start using it.

Graybeard: Hold your horses. Mastering Vim is a long and sometimes a painful experience. I’ve been using it for about 20 years and I’m still learning.

Youngstar: Whoa! I don’t have 20 years, I need to get productive now.

Graybeard: Since you’re going to spend most of your time inside an editor/IDE5 - try to pick a good one and master it.

Youngstar: I know I’ll regret this… But which one should I use?

Graybeard: It’s not that simple, there are several factors you need to consider. At the end, it’s a matter of personal taste. Check out the editor war sometime.

Youngstar: Editor war?

Graybeard: Yeah, some people get too passionate sometimes.

Youngstar: OK. Let’s start with what you’re using. Why are you using Vim?

Graybeard: As I said - it takes time to master Vim and get used to its dual editing mode. However once you’ve mastered Vim you’ll be super productive with it not just in Python but with almost any other language. Vim itself is a pretty bare-bones editor, but it has a rich plugin ecosystem which can transform it into a powerful IDE. One of the main advantages, at least for backend developers like me, is that on most Unix-like systems - it’s already there. Vim can work in “terminal mode” which does not require a windowing system. This means you can SSH to a box and start editing. Oh - and you can write Vim scripts in Python.

Youngstar: Isn’t Vim old?

Graybeard: In tech old usually means working - take me for example.

Youngstar: Ha! What’s the other editor old developers use? The lispy one?

Graybeard: Emacs?

Youngstar: That’s the one.

Graybeard: Emacs is a text editor that does everything. It has excellent Python support with python-mode and many core Python developers use it.

Youngstar: Then why don’t you use it?

Graybeard: Since I picked the dark side of the editor war.

Youngstar: And something more modern?

Graybeard: Before going modern, I’d like to stress that both of these editors take a lot of work to master. But once you grok them, both will offer you things that most other editors or IDEs will not.

Youngstar: Noted, I’ll invest some time learning one of them. Maybe emacs just to annoy you.

Graybeard: I never get annoyed by the stupid editors people use.

Youngstar: Something more modern?

Graybeard: I’m seeing a lot of people using PyCharm, from JetBrains, the makers of IntelliJ. There’s also PyDev which sits on top of Eclipse.

Youngstar: IntelliJ? Eclipse? Aren’t those Java IDEs?

Graybeard: They started there, but now they are very powerful general purpose IDEs. You will need Java to run them, and a lot of memory. A strong CPU won’t hurt as well.

Youngstar: And PyCharm/PyDev are the Python environments?

Graybeard: Yes. There’s also Aptana which is Eclipse already bundled with PyDev.

Youngstar: Doesn’t it take time to start them?

Graybeard: People usually have them running for weeks at a time. You can switch projects without closing the IDE.

Youngstar: OK. Any other options?

Graybeard: In the Windows world, Visual Studio comes with excellent Python support called PTVS.

Youngstar: Windows? Visual Studio? You?

Graybeard: Some claim that Visual Studio is the best IDE out there, but then again - they are using Windows ;)

Youngstar: Thanks but I don’t think I’ll switch to Windows just for that.

Graybeard: Smart girl.

Youngstar: After all the brainwashing you did?

Graybeard: I prefer “showing you the light”.

Youngstar: Yeah, yeah. Back on track - any more?

Graybeard: There are so many.

Microsoft also makes Visual Studio Code, which is cross platform and has good Python support, and a good Vi plugin.

Spyder is good if you’re doing a lot of scientific Python or coming from Matlab. It’s not as polished but fits better with scientific development.

There are also Atom, Sublime, and many other good editors out there with Python support. There are wiki pages for both editors and IDEs on the Python web site if the above are not enough.

Youngstar: As usual, I’m more confused than before.

Graybeard: My advice - pick one or two, and make sure Vim is one of them ;), and try them out. Do a little project with each, see what fits your work style and then start specializing. I personally try a new one every now and then - but always get back to Vim eventually. Maybe I’m too old to learn new tricks.

Youngstar: OK. Anything I need to pay attention to while learning or using these IDEs?

Graybeard: Most of them have good integration with linters, make sure to enable it.

Youngstar: Linters?

Graybeard: Programs that check your code for common errors and coding conventions. We’ll talk more on them later, but the editor will mark lines with errors so you can fix them right away. For example I use flake8 integration in Vim.
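To give a feel for the kind of check a linter performs, here is a toy sketch that reports imports which are never used. It is not how flake8 is implemented and it misses many cases; the function name unused_imports and the sample source are made up for illustration.

```python
import ast


def unused_imports(source):
    """Return names imported in `source` but never referenced (toy check)."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # 'import os.path' binds the name 'os'
                imported.add((alias.asname or alias.name).split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


code = '''
import os
import sys

print(sys.argv)
'''
print(unused_imports(code))
```

A real linter runs dozens of such checks on every save, which is why the editor integration is worth switching on.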

Youngstar: Fixing errors closer to when you introduce them is always better.

Graybeard: Yes. I think some of them run the tests in the background whenever the code changes.

Youngstar: Cool!

Graybeard: Depends on how fast your tests are.

Youngstar: I can see that. Any other advice?

Graybeard: What? That was not enough for you? I guess another good piece of advice is to be patient.

Youngstar: Have you seen my hair color? I wasn’t born with the patience gene.

Graybeard: You kids … The point is that it takes time to master an editor or an IDE. Give it time, and you’ll see your productivity soaring. I call it the “output” part of a programmer’s I/O.

Youngstar: I/O? As in input/output?

Graybeard: Yes. Most of your time as a developer should be spent thinking. However reading and writing are also part of the process and a good editor or IDE can increase the output part. Another bonus of fast writing is that you can write several drafts of your code and not lock into the first one you write.

Youngstar: Good point. I guess I’ll brush up on my speed reading to get the input part faster.

Graybeard: Yes. We programmers spend a lot of time reading, both code and technical documents.

Youngstar: And in your case a lot of Sci-Fi.

Graybeard: Where do you think I get all my ideas from?

Youngstar: Thin air?

Graybeard: You’re too kind, I thought you were going to mention a certain body part.

Youngstar: What are you? Six?

Graybeard: Mentally? Not much more. But I see this conversation has taken a bad turn so I’ll stop here.

Youngstar: Right as usual, cheers!


Give Vim or Emacs a try, they will rock your world. See here on how to turn Vim into a Python IDE

PyCharm is a good choice; make sure you have plenty of RAM

Also if you’re in a Java shop - there’s probably a lot of knowledge on IntelliJ (which PyCharm is based on)

Visual Studio Code is great

If you’re in a Windows shop, give Visual Studio a try

If you’re doing a lot of scientific Python - take a look at Spyder

Project Structure

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.

- Conway’s Law

Youngstar: How should I structure my code? I currently have everything in one directory and it looks messy.

Graybeard: Are you facing a specific problem?

Youngstar: Not really, but I assume I should be more organized.

Graybeard: As the bad guy said: “Assumptions are the mother of all !#?@ups”.

Youngstar: Which movie was that?

Graybeard: “Under Siege 2” if my one bit memory serves me right.

Youngstar: Don’t think I saw that one.

Graybeard: Trust me - you’re not missing anything. But back to your question. Why are you trying to fix something that you don’t know is broken?

Youngstar: You’re probably right. I’ll leave it for now.

Graybeard: I didn’t say it’s not broken. I just said you think it’s not broken.

Youngstar: OK, enlighten me.

Graybeard: Do you have some tests?

Youngstar: Sure!

Graybeard: How do you make sure they don’t get to production?

Youngstar: Why shouldn’t they?

Graybeard: Ask GitHub, who had a few hours of downtime a while back. The cause was tests deleting the production database.

Youngstar: Ouch!

Graybeard: Yes, and GitHub are not the only ones bitten by this problem.

Python has an established way to organize projects. It’s not mandatory but I found it’s a good practice. Let’s assume that the name of your project is archer.

Youngstar: Do you have to bring that TV show into everything?

Graybeard: Please be quiet, I’m trying to teach you something here. I’m also still hurt you didn’t take my suggestion for a project name.

Youngstar: I’m being quiet.

Graybeard draws the following diagram on a napkin:

archer
├──
├── Makefile
├──
├── requirements.txt
├── archer
│   └──
├── docs
└── tests
    └──

Graybeard: Let’s go over this. The top archer directory is your project - the one you clone from source control.

The second archer directory is your Python package where the code is. tests are outside of the code so they won’t get deployed.
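The layout above can be sketched as a small script that creates an empty skeleton. File names the text does not spell out (README.md, __init__.py, test_archer.py) are assumptions made for this example, and the skeleton is created in a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path

# Assumed names: README.md, __init__.py and test_archer.py are illustrative.
LAYOUT = [
    'README.md',
    'Makefile',
    'requirements.txt',
    'archer/__init__.py',
    'docs/',
    'tests/test_archer.py',
]


def create_skeleton(root, layout=LAYOUT):
    """Create empty files and directories for the layout under root."""
    root = Path(root)
    for entry in layout:
        path = root / entry
        if entry.endswith('/'):
            path.mkdir(parents=True, exist_ok=True)
        else:
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
    return root


with tempfile.TemporaryDirectory() as tmp:
    project = create_skeleton(Path(tmp) / 'archer')
    for path in sorted(project.rglob('*')):
        print(path.relative_to(project))
```

Note that tests sits next to the inner archer package, not inside it, so deploying the package alone leaves the tests behind.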

Youngstar: And the rest of the files?

Graybeard: Every project should have a README with at least an elevator pitch. This focuses people on what we’re doing here. It should also contain instructions for developers not found in the docs.

The docs directory is the generated documentation, I don’t usually have docs other than what’s in the code and in the README.

Youngstar: .md stands for Markdown, right?

Graybeard: Yes. You can also use reStructuredText or plain text. But Markdown has become very dominant these days. There are several variants of Markdown, pick one and stick to it.

Youngstar: Markdown it is then.

Graybeard: What else? Oh, I usually have a main Makefile to automate some tasks, requirements.txt to specify external requirements. And one script to run all the tests. We’ll discuss what’s in requirements.txt and when we talk about dependencies and testing.

Youngstar: OK, I’ll try to remind you - considering your one bit memory.

Graybeard: Yay, an external memory! I’ll drink to that.

As said, this is my personal preference which is based on how many Python projects are structured. You might find another one better for you but I suggest you start with it.

Youngstar: Anything else?

Graybeard: Yes, don’t overthink it or spend too much time on it. Start with one structure and only if it becomes a problem fix it.

Youngstar: That’s advice you give for many things.

Graybeard: Because it’s a good one, and hopefully one day you’ll make it a habit.

Youngstar: Is there a way to automatically generate documentation?

Graybeard: Yeah, write simple code that people can understand.

Youngstar: That’s a manual way.

Graybeard: OK “Miss Always Right”. I stand, actually sit, corrected.

I say that the only updated documentation is the code itself.

Youngstar: That’s good in the general case, however sometimes I need to write tricky code. For example when optimizing.

Graybeard: Optimization is a subject for another talk. But you’re right, when you do stuff that is not that obvious - write good docstrings and comments.

Youngstar: Are there tools to generate nice documentation from docstrings?

Graybeard: Of course. In the Python world we mostly use Sphinx. It has a format for documentation strings and can generate HTML, PDF and other formats. A nice feature of Sphinx is that it can run doctest tests.

Youngstar: doctest is where you write snippets of code in your docstrings?

Graybeard: Exactly, and I find it cool that you have testable documentation.
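A minimal sketch of testable documentation using the standard doctest module. The median function and the check_docs helper are invented for this example; in a real project a test runner or Sphinx would usually collect the docstring examples for you.

```python
import doctest


def median(values):
    """Return the median of a non-empty sequence of numbers.

    >>> median([3, 1, 2])
    2
    >>> median([4, 1, 3, 2])
    2.5
    """
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def check_docs(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    # Pass the function in explicitly so the examples can call it by name.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return failed, attempted


print(check_docs(median))
```

If someone changes median and forgets the docstring, the examples fail, so the documentation cannot silently drift away from the code.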

Youngstar: How about the “big stuff”? Things that don’t fit inside one module?

Graybeard: You have the README for that and also Sphinx can have top level documentation. Note that if you have documentation, you’ll need to make checking it part of the code review.

Youngstar: How did we get from project structure to writing documentation?

Graybeard: Not sure. One last thing about documentation: several times I’ve seen people invest a lot of time in generating very nice documentation that nobody looks at.

Youngstar: I’ll start with simple documentation. Anything else about project structure?

Graybeard: There are more files you might need: a file to help with packaging, a ChangeLog to list changes, NOTICE.txt or LICENSE.txt for specifying the license, tox.ini for running tests on multiple versions of Python and many others. Start with the least amount of items and add new ones only when you need to.

Youngstar: Then trim and restructure periodically?

Graybeard: Exactly.

Youngstar: What about the packaging file? I’ve seen it in many projects.

Graybeard: It is used for packaging. Do you need packaging?

Youngstar: Currently I deploy directly from git.

Graybeard: So you probably don’t need packaging. It is mostly used when creating packages for other people to use and in open source code. There are a lot of options there, and when you decide to release some of your code as open source we can talk about it.

Youngstar: I’ll live without it for now. Priorities …

Graybeard: Very good.


Start with an established project structure (like Graybeard’s example above)

Separate code from tests

Have a README with an elevator pitch and development instructions

Use a Makefile or other tool to automate common tasks

Have one script to run the tests

Look into Sphinx for generating documentation, but only if you need to

Managing Dependencies

Only the paranoid survive.

- Andy Grove

Youngstar: You won’t believe the stupid bug I was chasing today.

Graybeard: Do tell.

Youngstar: I was updating some packages …

Graybeard: … and one of the new versions had a regression bug that took you all day to figure out.

Youngstar: What do you know? I’m not that special after all.

Graybeard: Oh, you are unique - just like everybody else.

Youngstar: Funny! So how can I avoid bugs like this in the future?

Graybeard: You know that the best way to solve a bug is to make sure that it’s impossible to introduce such bugs in the future.

Youngstar: Yeah, forgot who taught me that …

Graybeard: Buy me another beer and I’ll refresh your memory.

Youngstar: Sure thing. Now back to my question…

Graybeard: How do you manage your dependencies?

Youngstar: I have a requirements.txt with one package per line, and I run pip install -r requirements.txt to install them.

Graybeard: You know you can pin a specific version using ==, for example requests==2.12.4.

Youngstar: I didn’t know that. But why would you do that - you won’t get all the bug fixes … Doh!

Graybeard: Exactly!

Youngstar: Then I should probably version all my packages.

Graybeard: I agree.
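One way to keep this honest is a small check that flags requirement lines without an exact == pin. This is only a sketch: the function name unpinned and the sample packages are made up, and it deliberately ignores less common requirement forms such as URLs or editable installs.

```python
def unpinned(requirement_lines):
    """Return the requirement lines that do not pin an exact version with ==."""
    loose = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            loose.append(line)
    return loose


requirements = """\
requests==2.12.4
flask
# a comment
numpy>=1.15
""".splitlines()

print(unpinned(requirements))  # ['flask', 'numpy>=1.15']
```

A check like this could run together with the tests and fail when the returned list is not empty.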

Youngstar: I know I’ll regret this… But any other pointers on dependency management?

Graybeard: As I said many times, one of the biggest factors in your development practice is the price of error. For example it’s much harder to fix a bug in an embedded system than in a small site web server. The bigger the cost of error, the stricter you want to be with your requirements so that your builds are stable.

For example, do you use virtual environments?

Youngstar: Yes, I use virtualenv.

Graybeard: Why?

Youngstar: So that packages are installed in isolation per project and not globally in the system.

Graybeard: Good, this is one more isolation level. By the way, newer versions of Python come with the venv module, which does basically the same work. And there is also a newer tool called pipenv.

Youngstar: That’s nice, one less dependency. What are the differences between virtualenv, venv and pipenv?

Graybeard: With virtualenv you can specify a different Python interpreter, for example even if your default Python is 3 you can still create a virtual environment with the Python 2 interpreter.

Also since venv is in the Python standard library, it’ll be updated only when a new version of Python is released. virtualenv will probably have a faster release cycle.

pipenv combines pip and virtualenv to one tool.

Youngstar: Good to know. The downside of using virtual environments is I need to teach my IDE which is the right Python.

Graybeard: Which IDE are you playing with right now?

Youngstar: VSCode.

Graybeard: That’s a cool one, almost as good as Vim.

Youngstar: Yeah, yeah. Any other pointers for managing dependencies?

Graybeard: Don’t use the system Python.

Youngstar: Why?

Graybeard: In general, it’s preferred to leave the system Python alone since a lot of system utilities are written in Python and a system upgrade might break your code. Red Hat based distros use a lot of Python.

Youngstar: On the other side of things, if I upgrade a package that a system tool depends on - I might break a system tool.

Graybeard: Right.

What will happen to your code once the next Debian ships with Python 3 as default?

Youngstar: I see, I’ll install a Python for my application with the right version. Is Debian a popular distro?

Graybeard: Very. Several other distros are based on Debian, such as Ubuntu and Mint. Changes to Debian will find their way to these distros eventually.

Youngstar: I use Mint, now I remember reading somewhere it’s Debian based.

Graybeard: Yup. Now what happens if PyPI is down when you deploy?

Youngstar: I’m pretty much screwed, but how can I overcome this?

Graybeard: In some cases it might be OK to wait for PyPI to get back up. It has been more stable in recent years. If you need to deploy no matter what, then you need to pre-build your dependencies and tell pip to install them from your servers.

Youngstar: pip can do that?

Graybeard: pip can do many things, this is one of them. See the --index-url and --find-links options of pip install.

Youngstar: OK.

Graybeard: Now about the version of the C compiler…

Youngstar: I write Python code, not C.

Graybeard: You can write Python modules in C, and there are many good reasons for doing that - but mostly as a last resort. It’s likely that one of your dependencies is a C extension. Then you’ll need a C compiler and possibly some libraries and header files. Some libraries require a Fortran compiler.

Youngstar: Fortran?

Graybeard: Yes, in some cases a Fortran compiler can do better optimization than a C compiler.

Youngstar: How do people in the Windows world find a C compiler?

Graybeard: There’s a free C compiler for every major platform: gcc or clang on Unix-like systems, and the Microsoft compiler comes free nowadays.

Youngstar: Good to know. And what’s the solution here for the C extensions problem?

Graybeard: The idea is that you build all your dependencies in advance and then use them. The latest packaging format is called wheel. It’s basically a zip file that contains both the Python code and the compiled extension as a shared library.

Youngstar: What happened to eggs?

Graybeard: wheel is the new egg.

Youngstar: I’ll get the T-Shirt.

Graybeard: Some companies have a “build machine” which has all the required dependencies to build the packages. This way you don’t need to install a lot of tools on your production machines. This build machine is usually also the one serving these third party packages. By the way, this process of keeping third party dependencies locally is sometimes known as “vendoring”.

Youngstar: How deep does this rabbit hole go?

Graybeard: Just you wait Alice. Oh! The places we’ll go… Dependency management is an old and unsolved problem. Pick any package manager: yum, apt, gem, npm … - all of them have their problems.

Youngstar: Consolation of fools… Can we get back to the Python realm?

Graybeard: Yes.

Youngstar: And …

Graybeard: Hold on, collecting my thoughts… OK. If you’re doing a lot of scientific computing with numpy, pandas, matplotlib and other packages, pip installing them can be a pain.

Youngstar: Right… Should I wax my legs while doing this?

Graybeard: Not sure what will hurt more. Anyway … There’s an alternate package manager called conda. conda was developed by Anaconda to solve the problem of installing scientific packages. Over time it became a general installer and you can install other packages with it. Note that not all of the packages on PyPI can be installed with conda.

Youngstar: What do I do then?

Graybeard: conda plays well with pip and you can use both. conda has its own notion of “environments” and it installs pip in them for just this case. conda supports Linux, Windows, OSX, ARM …

Youngstar: Do you get royalties from Anaconda?

Graybeard: Nope, but since I’ve been doing a lot of scientific Python lately it has saved me tons of time and agony. Going deeper …

You can use docker. This will give you a system where you know exactly what’s going on - which version of Python, of libc … However docker comes with its own set of issues - mainly what’s called “orchestration” but I won’t get into that. The simple approach is just to run a single container as your application on the host.

Youngstar: OK.

Graybeard: Alan Kay once said “People who are really serious about software should make their own hardware.”

Youngstar: Let’s stop here, I have no intention of starting a hardware company.

Graybeard: CPUs have bugs as well, you might want to control the version of CPU you use.

Youngstar: OK. A related question - How do you choose which package to use?

Graybeard: If the package implements a known protocol or a connection to an external tool (such as a database), chances are that the main site of the protocol/tool will list recommended “language bindings”. For example, the bottom part of the msgpack site has a “Languages” section with Python pointing to msgpack-python.

Youngstar: And if I don’t find a reference to Python in the main site?

Graybeard: Most packages are hosted on public sites such as GitHub. There you can see the project “health” - how many committers, commit history and last commit, number of open bugs …

Ask around, the Python community is very friendly and helpful. There are also sites that have curated lists of packages. However don’t blindly trust them, make up your own mind. I find they have a tendency to recommend the shiny new toys.

Youngstar: Err toward mature packages.

Graybeard: “Stability is sexy.”

Youngstar: We need to have a talk about how you define “sexy”, but another time.

Graybeard: Ha!

Another thing you should do is test before you use. Pick a package or two and try them out to see how they behave. Try to simulate the real environment and load as much as you can, and always make sure to write code in a way that makes switching packages as easy as possible.
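One way to keep switching cheap is to hide the package behind a thin module of your own. The sketch below assumes a hypothetical module of two functions, dumps and loads, that the rest of the code calls instead of using the library directly. It uses the standard json module as the backend; moving to another serializer would mean changing only these two functions.

```python
import json


def dumps(obj):
    """Serialize obj to bytes. Only this module knows that json is used."""
    return json.dumps(obj).encode("utf-8")


def loads(data):
    """Deserialize bytes that were produced by dumps."""
    return json.loads(data.decode("utf-8"))


message = {"user": "alice", "scores": [1, 2, 3]}
print(loads(dumps(message)) == message)  # True
```

The wrapper costs a few lines up front, and in return the choice of package stays a local decision instead of spreading through the code base.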

Youngstar: Do I really need to do so much before writing even one line of code?

Graybeard: This is sometimes called “accidental complexity.” But no, don’t start with having your own build machine and internal PyPI. Start simple with pip, a virtual environment and a versioned requirements file.

Youngstar: Pain vs Gain?

Graybeard: Exactly. Start with minimal effort that works for you and grow when you need.

Youngstar: Thanks for that. My head is full and my beer glass is empty - time to go home.

Graybeard: Cheers!


Depending on the cost of error - pick a strategy for versioning

Version your dependencies, write them down and place them in source control

Use wheels when possible

conda is a good alternative to pip

docker will give you even more control but it comes with a cost

You might want to invest in your own internal package repository

Have a process for evaluating new packages. Lean toward old and stable ones


Two rules of database systems

It takes 7 years minimum to create a production-ready database system

You’re not an exception to rule 1

- Luca Candela

Youngstar: I need to store some data and was thinking of using MySQL, what do you think?

Graybeard: I think you mean MariaDB.

Youngstar: What?

Graybeard: MariaDB is the community fork of MySQL, done after Oracle bought MySQL.

Youngstar: Like OpenOffice and LibreOffice?

Graybeard: Exactly.

Youngstar: OK. Now that we clarified this issue, can we get back to my initial question?

Graybeard: I don’t know enough about your data to give you a good answer.

Youngstar: Currently I don’t have much data. Some user information, some session data. Things are very much in flux so it’s hard to know.

Graybeard: I’ll give you my usual advice - start simple.

Youngstar: Gee, why didn’t I think of that? What do you mean by “simple”?

Graybeard: When you start with a database such as MySQL you add complexity to your system. You need to serialize/deserialize your objects, you have schemas to design and update - and schema migration can be tricky. Using MySQL also means you need a server, users, backup …

Youngstar: OK, so what do you suggest?

Graybeard: When I need storage, I usually start with shelve. It’s very much like a dict which is backed by disk. The main limitation is that the keys have to be strings; the values can be anything that pickle can handle. I don’t have to worry about serialization, schemas and other things.

Youngstar: How do I query it?

Graybeard: By running for loops in Python.
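A minimal sketch of storing and querying with shelve. The user records and the temporary file location are made up for illustration; a real application would use a fixed path.

```python
import os
import shelve
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "users")  # throwaway location

with shelve.open(db_path) as db:
    db["alice"] = {"name": "Alice", "age": 31}
    db["bob"] = {"name": "Bob", "age": 25}
    db["carol"] = {"name": "Carol", "age": 42}

with shelve.open(db_path) as db:
    # A "query" is a plain loop (here a generator expression) over the values.
    over_30 = sorted(user["name"] for user in db.values() if user["age"] > 30)

print(over_30)  # ['Alice', 'Carol']
```

Every query scans all the stored objects, which is why this approach fits small data sets and lightly loaded systems.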

Youngstar: Isn’t it slow?

Graybeard: (sighs) Speed again? What’s your speed requirement? How many objects do you have? Have you profiled your code? …

Youngstar: OK, OK …

Graybeard: As a rule of thumb, for a system that’s not that loaded and around tens of thousands of objects - shelve will work reasonably well.

Youngstar: Is it thread safe?

Graybeard: Is your application multi-threaded?

Youngstar: I haven’t decided on the web server yet, so I don’t know.

Graybeard: Well, if you find you need to be thread safe - slap a threading.Lock on it. It’s a good idea to have your own data access layer anyway, so switching storage backends shouldn’t be that hard. Writing a nice DAL also forces you to think about your storage API. Most of the time the usual CRUD is enough, maybe some search as well.

Youngstar: DAL? CRUD?

Graybeard: DAL is Data Access Layer. CRUD is Create, Read, Update, Delete.
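A small data access layer along these lines might look like the sketch below. The class name UserStore and its methods are hypothetical; the point is that callers see only the four operations, while shelve and the threading.Lock stay hidden inside.

```python
import os
import shelve
import tempfile
import threading


class UserStore:
    """Hypothetical data access layer: callers never touch shelve directly."""

    def __init__(self, path):
        self._db = shelve.open(path)
        self._lock = threading.Lock()  # serialize access across threads

    def create(self, user_id, user):
        with self._lock:
            if user_id in self._db:
                raise KeyError(f"{user_id!r} already exists")
            self._db[user_id] = user

    def read(self, user_id):
        with self._lock:
            return self._db[user_id]

    def update(self, user_id, user):
        with self._lock:
            if user_id not in self._db:
                raise KeyError(user_id)
            self._db[user_id] = user

    def delete(self, user_id):
        with self._lock:
            del self._db[user_id]

    def close(self):
        with self._lock:
            self._db.close()


store = UserStore(os.path.join(tempfile.mkdtemp(), "users"))
store.create("alice", {"name": "Alice"})
store.update("alice", {"name": "Alice", "admin": True})
print(store.read("alice"))
store.delete("alice")
store.close()
```

Replacing shelve with another backend later would mean rewriting the bodies of these methods while the calling code stays as it is.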

Youngstar: Ah. What about ORMs? I heard SQLAlchemy is great.

Graybeard: I have mixed feelings about ORMs. On one hand they save you a lot of boilerplate coding. However I found out that when your data usage becomes more sophisticated, you need to work around them. Also I haven’t found a good ORM for NoSQL databases yet. If you end up using an ORM, make sure it’s easy to rip it out if it becomes more of a problem than a solution.

Youngstar: NoSQL as in MongoDB?

Graybeard: Yup. There are so many of them.

Youngstar: Are they better than SQL ones?

Graybeard: It really depends on your usage. I found NoSQL databases good for early stages when your data model is still in flux and schemas are just in your way. I usually start with shelve and switch to a NoSQL database if I need support for a large amount of data or a client/server architecture.

Note that in NoSQL the schema does exist, it’s in the code instead of in the database.

Youngstar: Will I need client/server support?

Graybeard: My crystal ball is broken today. However the answer is probably yes. You usually run more than one server for failover or load handling, and you’ll want all of these servers looking at the same data.

Youngstar: I guess if I can make my server stateless it’ll be best.

Graybeard: Good insight. In practice this is really hard to achieve, but it’s a good goal to strive for. I worked at a company that stored all the required data in HTTP cookies. This meant the client sent all the data we needed in every request. This saved us a lot of database queries, however you need to be aware of the security risks of storing data on the client.

Youngstar: When will you pick an SQL database?

Graybeard: There are many parameters that point to an SQL database. One is that many people know SQL, and if you have many hands touching the data - it’s a good thing. Also many tools, mainly reporting ones, work well with SQL.

The other thing is that some of the SQL databases, I personally prefer PostgreSQL, are wicked fast when you have many more reads than writes.

SQL databases have transactions, which means when you insert ten records either all of them will enter the database or none of them. In some NoSQL systems this is really hard to achieve.
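The all-or-nothing behavior can be seen with the sqlite3 module from the standard library. This is a sketch with a made-up users table; the third row breaks the primary key on purpose, so the whole batch is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

rows = [(1, "alice"), (2, "bob"), (2, "duplicate id")]  # last row violates the key
try:
    with conn:  # commits on success, rolls back if the block raises
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 0
```

Using the connection as a context manager keeps the commit and the rollback in one place, so a failure halfway through does not leave partial data behind.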

Also, SQL databases tend to be older, which means they are more stable and have more tooling and knowledge around them.

Youngstar: You prefer older? You love all this new and shiny stuff.

Graybeard: I know, but I’ve been bitten by “new” databases. At one company we worked with a two-year-old database. About 90% of our downtime was due to database issues.

Youngstar: Ouch.

Graybeard: Yes. I hear the situation has improved since then, it takes time for a database to mature and be production ready.

Youngstar: OK, I’ll learn some SQL then.

Graybeard: It’s not just SQL you need to learn but also NoSQL. There are many ways to model your data and you need to know things like normalization, fact tables, type 2 dimension tables and more. One of the more effective ways I know is to start from the UI and think about the queries you’re going to perform. After that you start modeling the data.

Thinking and designing your data layer is very important. In “The Mythical Man-Month” Fred Brooks says: “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”

Youngstar: flowcharts?

Graybeard: Yeah, this book is from 1975.

Youngstar: 75? Are you kidding me?

Graybeard: It’s timeless. It talks mostly about people and communication, and people haven’t changed a lot in the last few thousand years.

Youngstar: But still … 75?

Graybeard: Read it for yourself and decide. Well worth the time in my opinion.

Youngstar: OK… Going back to present day - any more advice?

Graybeard: A couple of tidbits:

You’ll probably have some complex queries in your code. I recommend saving them in external files - SQL, YAML … and not in code. I once worked in a company that used the Spring framework. They went halfway and stored the SQL queries in the Spring XML configuration files. It was really hard to read the SQL embedded in the XML, there was no syntax highlighting and viewing diffs was a mess.

The second thing is that most of Python’s SQL database drivers support accessing columns by name and not just by index. Accessing by index is both less readable and prone to error: someone changes the SQL query and suddenly row[2] is not the column you want. For example in sqlite3 you need to set the connection row_factory attribute to sqlite3.Row and then each column can be accessed both by position and by name.
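A short sketch of the sqlite3 case, with a made-up users table and sample row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows now support access by column name
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com')")

row = conn.execute("SELECT id, name, email FROM users").fetchone()
print(row["email"], row[2])  # alice@example.com alice@example.com
```

If someone later reorders the columns in the query, row["email"] keeps working while row[2] would quietly return something else.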

Youngstar: OK, I’ll remember these. Now what about backup? How often do I need to back up my databases?

Graybeard: You don’t need backup.

Youngstar: I don’t?

Graybeard: No - you need recovery. You’ll be surprised how many companies had backups of their data but couldn’t restore from it when time came.

Youngstar: So backup is part of recovery. How often should I do it?

Graybeard: Again, depending on your audit and recovery needs - this question can have very different answers. Another thing is that backups tend to grow in size and accumulate, so have a good retention policy.

If you use a hosted database - that might take care of backup and recovery for you.

Youngstar: Hosted?

Graybeard: Yup. And considering that they take all the operations headache from you it might be a good solution. Google has BigQuery, Amazon has Athena… and many others.

An extra benefit of BigQuery and Athena is that they scale. Both claim they can process billions of records in seconds.

Youngstar: Don’t they cost money?

Graybeard: TANSTAAFL (there ain’t no such thing as a free lunch). Don’t make the common mistake of underestimating the cost of running your own servers. Deployment, monitoring, alerting, backup and more - all take time and effort. And developer time is expensive. In The Art of Unix Programming Eric Raymond says the rule of Economy is: “Programmer time is expensive; conserve it in preference to machine time.” This is true in most cases, whenever you can save developer time - do it.

This is also why people like Google App Engine - zero ops.

Youngstar: I have to say now I’m totally confused.

Graybeard: Yeah, too many options is not a good thing. Remember this when we talk about monitoring. But for now - just start with shelve or something as simple. When things get more interesting - go over the queries you do and the business requirements, and then select the right solution. Who knows? You might find yourself using a graph database in the end.

Youngstar: A graph database?

Graybeard: Yes. You store not just objects, but also the relationships between them. Look up neo4j which is a very popular graph database, they have some good usage examples on their site.

Youngstar: Any other types of databases I need to know of?

Graybeard: There are so many. I think we considered the main ones except search based ones.

Youngstar: Like Elasticsearch?

Graybeard: Yes, it’s actually my favorite.

Youngstar: My God, it’s full of databases!

Graybeard: Yes Dave. Also, it’s not uncommon to use more than one database. For example a combination of SQL for fast queries and a search database for textual search. Some people use Redis for fast key/value and MongoDB for document storage. It really all depends, but having just one is a big plus.

Youngstar: I’ll start simple and grow when it hurts.

Graybeard: Wise words to end the night. My beer is empty and home is calling. Next time…


Start simple, shelve is a great option

Know your data and queries before selecting a database. Think of things like embedded vs client server, SQL vs NoSQL vs Key/Value vs Graph …

Consider hosted database - let someone else wake at 3am

Pick a mature database

Make sure you can recover from backup

Have a policy to trim your backups


A computer lets you make more mistakes faster than any invention in human history, with the possible exceptions of handguns and tequila.

- Mitch Radcliffe

Youngstar: I fixed a bug today and accidentally introduced a new one.

Graybeard: Sounds like the “99 little bugs in the code” poem.

Youngstar: I can guess the rest of it.

Graybeard: Don’t you have regression tests?

Youngstar: I have a few unit tests, but that’s about it. What are regression tests?

Graybeard: Tests that guard against exactly what happened to you - that new changes didn’t break anything old. There are many kinds of tests and this is an important one.

Youngstar: So I should do more regression testing?

Graybeard: Let’s back off a bit. Why do you test?

Youngstar: Well, for one thing to make sure I don’t break anything.

Graybeard: Any other reason?

Youngstar: Check that the code runs as intended?

Graybeard: These are mainly unit tests. More reasons?

Youngstar: Hmm, nothing comes to mind currently. What are more reasons?

Graybeard: There are many - integration tests check that all parts of the system connect together. Fuzzing tries to bring down your system with unusual input and there are many more kinds of tests.

What do you think are the down sides of testing?

Youngstar: Downside? Let’s see … Well - they take time to write, that’s for sure.

Graybeard: Anything else?

Youngstar: Every time I change my code - I need to change the tests as well. This makes sure that I didn’t mess anything up, but also takes more time.

Graybeard: Yes. This is what the guys in Getting Real call “mass”. The more mass you have, the harder it is to make changes.

The amount and kind of testing is influenced by the cost of error. If you’re writing a life support system - you’ll use much more testing than what you need in your little project right now.

The main point here is that testing is a “pain vs gain” balance. Make sure the extra mass and time pain is worth the gain.

Youngstar: Speaking of tests, do you practice TDD?

Graybeard: Sometimes, mostly when working with new developers. I found out it helps them design cleaner code. You should fit the methodology to the team you’re working with. I personally write tests after the first or second draft of the code is working.

Youngstar: How do you know it’s working?

Graybeard: I try it out in the REPL.

Youngstar: The what?

Graybeard: REPL stands for “read eval print loop”, you might also know it as “the interactive prompt”. You write little pieces of code and test them as you go. After I’m done and happy with the code, I write some tests.

People underestimate how much the REPL helps during development; give it a try next time.

Youngstar: OK, I will. Which testing framework do you use?

Graybeard: I personally prefer pytest, I’ve used unittest with discover mode as well.

Youngstar: Why do you prefer pytest?

Graybeard: I find pytest simpler, and I always go for simple. I also love its parametrize decorator, which lets you run the same test with different inputs (AKA table driven testing). Its xunit output is great for Jenkins integration as well.

Oh, and I also use tox for testing the same code on multiple versions/implementations of Python.
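Table driven testing means keeping the inputs and the expected outputs in a table and running one test body over every row. In pytest this is done with the @pytest.mark.parametrize decorator. To stay within the standard library, the sketch below swaps in unittest and its subTest context manager instead; the function slugify is made up for illustration.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lowercase and join words with '-'."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    # Each row is (input, expected output).
    cases = [
        ("Hello World", "hello-world"),
        ("  extra   spaces  ", "extra-spaces"),
        ("already-slugged", "already-slugged"),
    ]

    def test_slugify(self):
        for title, expected in self.cases:
            with self.subTest(title=title):  # report each failing row separately
                self.assertEqual(slugify(title), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new case is a one line change to the table, which makes it cheap to cover the odd inputs found while debugging.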

Youngstar: I’ll start with pytest then, don’t need multi version testing currently. How do I run the tests?

Graybeard: pytest comes with a pytest script that discovers and executes tests. But this is usually the last thing I run.

Youngstar: Last? What do you run before it?

Graybeard: A few things. First, I check that there are no calls to pdb in the code.

Youngstar: pdb is the Python debugger?

Graybeard: Yes, you can insert calls to it if the breakpoint condition becomes too complicated. We’ll talk about debugging later. Another thing I do is clean all the compiled modules.

Youngstar: The .pyc files that are generated on import? Why?

Graybeard: Say you renamed a module but forgot to change the import in your code. Since the .pyc of the old module is still there - your test will pass.

Youngstar: Gotcha.

Graybeard: I also run a linter before the tests and fail on any output. I use flake8, which combines pyflakes and pep8.

Youngstar: Does flake8 check for coding conventions?

Graybeard: Yes, this is how I avoid wasting time on coding convention talks. If the code passes flake8 - it’s fine. However don’t get too stuck on coding conventions, see Raymond Hettinger’s talk called Beyond PEP8.

Youngstar: Will do, anything else?

Graybeard: Nope. After that I run the test suite.

Youngstar: Sounds like a lot of steps. Knowing you, you probably have a script to do this.

Graybeard: Correct, I’ll mail it over if I remember. But I’m sure you can code it yourself. The steps are: Clean .pyc, search for pdb, run flake8 and finally run the tests.

Youngstar: I’ll remind you to mail me.

Graybeard: Thanks. Having one command to run your tests also makes sure other members in your team don’t forget steps. I’m not the only one with a one bit memory.
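The four steps could be sketched as a small Python script. This is not Graybeard’s script: the function names are made up, only pdb.set_trace calls are searched for, and run_all assumes that flake8 and pytest are installed in the active environment.

```python
import re
import subprocess
import sys
from pathlib import Path

PDB_CALL = re.compile(r"\bpdb\.set_trace\(")


def clean_pyc(root):
    """Delete compiled modules under root; return how many were removed."""
    removed = 0
    for path in list(Path(root).rglob("*.pyc")):
        path.unlink()
        removed += 1
    return removed


def find_pdb_calls(root):
    """Return (file, line number) pairs for leftover debugger calls."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if PDB_CALL.search(line):
                hits.append((str(path), lineno))
    return hits


def run_all(root="."):
    """Run the steps in order and return a process exit code."""
    clean_pyc(root)
    hits = find_pdb_calls(root)
    if hits:
        for name, lineno in hits:
            print(f"{name}:{lineno}: debugger call", file=sys.stderr)
        return 1
    for tool in ("flake8", "pytest"):
        # Assumes both tools are installed in the active environment.
        status = subprocess.call([sys.executable, "-m", tool, str(root)])
        if status != 0:
            return status
    return 0
```

A script like this would end with sys.exit(run_all()), so that the Makefile or the CI job sees a failing exit code as soon as any step fails.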

Youngstar: In some cases I found out the tests run for a long time. Which makes it annoying to run them every time I make a change.

Graybeard: My rule of thumb is that developers won’t run tests that take more than about a minute.

Youngstar: So how do you run longer tests?

Graybeard: With my friend Jenkins.

Youngstar: Is it the system that monitors your source tree and runs tests on every change?

Graybeard: Yes. It’s called “continuous integration” or CI for short. Jenkins can do much more but at its heart this is exactly what it does.

I separate the tests into faster ones that can run on a developer machine without too much setup and longer ones that run on Jenkins.

pytest has a way to mark tests and pick a subset of tests to run. In unittest I use environment variables and a special exception called SkipTest.
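The unittest variant might look like the sketch below. The environment variable name RUN_SLOW_TESTS and the test class are made up; the CI job would set the variable, while developers leave it unset.

```python
import os
import unittest

RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"  # hypothetical variable name


class ReportTest(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(sum([1, 2, 3]), 6)

    def test_slow(self):
        if not RUN_SLOW:
            raise unittest.SkipTest("set RUN_SLOW_TESTS=1 to run slow tests")
        self.assertEqual(sum(range(10_000_000)), 49_999_995_000_000)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReportTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Skipped tests are reported as skipped and not as passed, so it stays visible that part of the suite did not run on the developer machine.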

Youngstar: And when Jenkins runs the tests it selects all of them?

Graybeard: Yup. A common mistake people make is to write a lot of code in the Jenkins execute field.

Youngstar: Why is it a mistake?

Graybeard: Since then it’s usually not in source control.

Youngstar: Ah! And then if you want to make changes to how tests are run - you change the script and commit.

Graybeard: Exactly. Note that Jenkins can do much more but start simple as always.

Youngstar: Another thing I recall we talked about was to make sure tests don’t get into production.

Graybeard: Yes, try to make it impossible for tests to get or touch production.

Youngstar: Any more advice?

Graybeard: Yes - clean up at the start of the test.

Youngstar: Say what?

Graybeard: Most test frameworks give you setup and teardown methods. Most people create what they need in the setup, for example creating database tables and populating them with data. Then they use the teardown to clean up everything. The problem is that teardown gets called even when the tests fail, and then if you want to debug - the data is missing. If on the other hand you use only the setup method, and in it first clean up and then populate, you’ll still have data to debug if the tests fail.
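A sketch of the clean-then-populate idea using unittest and a SQLite file. The table, the data and the file location are made up for illustration. The suite runs twice to show that leftovers from one run do not break the next.

```python
import os
import sqlite3
import tempfile
import unittest

DB_PATH = os.path.join(tempfile.mkdtemp(), "test.db")  # throwaway location


class UserQueryTest(unittest.TestCase):
    def setUp(self):
        # Clean first, then populate. There is deliberately no tearDown,
        # so the data is still on disk for inspection after a failure.
        if os.path.exists(DB_PATH):
            os.remove(DB_PATH)
        self.conn = sqlite3.connect(DB_PATH)
        self.conn.execute("CREATE TABLE users (name TEXT)")
        self.conn.executemany(
            "INSERT INTO users VALUES (?)", [("alice",), ("bob",)]
        )
        self.conn.commit()
        self.addCleanup(self.conn.close)  # closes the handle, keeps the file

    def test_count(self):
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 2)


for _ in range(2):  # the second run starts from the leftovers of the first
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserQueryTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After a failing run, the database file can be opened with any SQLite client to see exactly what the test saw.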

Youngstar: Will do.

Graybeard: The last thing to remember…

Youngstar: Yay, there’s more!

Graybeard: Testing is a mastery by itself, and done right it’ll save you a lot of agony. But no matter how hard you test - bugs will get out into production and you need to be ready for that. Monitoring and alerting is something we’ll talk about another time. NASA, which has a very strict and thorough development process, still manages to ship bugs to outer space.

Youngstar: Really?

Graybeard: Yup. But they have a system in place to fix bugs in outer space as well.

Youngstar: I guess I’ll have to mock some parts of the system for testing, any advice on this?

Graybeard: In general - don’t mock! Every time you use a mock you cheat and don’t really test your system. Mocks are another “mass” you acquire, and they need to be updated to match what they are mocking. I’ve found out that with a little effort you can usually avoid mocking. I once worked at a company where we were doing web scraping: getting HTML pages, parsing them, analyzing and storing in a database. At first someone suggested we mock the HTTP connection and return canned HTML. But with a bit more coding we created an HTTP server using Flask which returned canned HTML pages. This way we also tested our connection infrastructure, and when we wanted to test accessing pages with user/password - it was easy to add these kinds of pages to the test HTTP server.
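The story above used Flask. The sketch below swaps in http.server from the standard library to show the same idea without a third-party dependency: a real HTTP server that returns canned pages, started on a free local port for the duration of the test. The page content and the handler name are made up.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CANNED_PAGES = {"/": b"<html><body><h1>Hello</h1></body></html>"}  # made-up page


class CannedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = CANNED_PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


server = HTTPServer(("127.0.0.1", 0), CannedHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The code under test would fetch this URL with its real HTTP client.
url = f"http://127.0.0.1:{server.server_port}/"
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # no proxy
with opener.open(url) as response:
    html = response.read()

server.shutdown()
server.server_close()
print(html)
```

Because the request goes through a real socket, the connection code is exercised as well, and new kinds of pages are added by extending CANNED_PAGES or the handler.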

However sometimes the cost of not mocking is too much - “pain vs gain” again. There’s a mock package in the Python 3 standard library (unittest.mock), and for Python 2 it’s available on PyPI.

Youngstar: Any more advice?

Graybeard: Testing is a bottomless pit. We can talk about it for hours, but I’m getting tired and I think we covered the main points. Also my beer is empty - going home now.

Youngstar: Cheers.


Find the “gain vs pain” balance for your tests

Have one script to run tests

Have a CI system, Jenkins is a good bet

Separate tests into ones developers run and ones Jenkins runs

Cleanup on setup

Make it impossible for tests to get into production

Avoid mocking as much as you can

No matter how hard you test, some bugs will slip through - be ready for this


Amateurs think about tactics, but professionals think about logistics.

- General Robert H. Barrow

Youngstar: I now have two environments where the code runs. We have a production environment but we also have a QA environment. I have an if env == 'PROD': in my code but I’m not too happy about it. I also remember you once said I should try to minimize if in my code. How would you handle it?

Graybeard: What makes you think you have only two environments?

Youngstar: Oh, you’re right. There’s also the local development environment on my machine.

Graybeard: Yeah, and the number of environments will grow. You might want to check a new database version, a new package version …

Youngstar: Eeeek, again accidental complexity bites us in the behind.

Graybeard: How much did you drink? You usually get depressed later on.

Youngstar: You’re right, lemme get another round and you can tell me how to solve my problems.

Graybeard: Sure, I’ll wait.

Youngstar fetches a new round, they drink in silence for a few minutes.

Graybeard: OK, did you figure how to solve your problem by now?

Youngstar: I thought of some kind of configuration system, then have a configuration file per environment. Probably use JSON since writing my own format is bad.

Graybeard: Why JSON?

Youngstar: There’s already a parser and it’s a well-known format.

Graybeard: Would you like to have some comments in your configuration?

Youngstar: Probably yes … that rules out JSON. YAML?

Graybeard: YAML is a great format for configuration. I use it a lot, but there’s something even simpler.

Youngstar: YAML is pretty simple, you just load the configuration file. The only way it’ll be simpler is if the configuration is already in Python … Oh - so I’ll use Python.

Graybeard: Yes. I usually start with a system where I have a Python configuration module and just import it. Having said that, a YAML (or other format) based system is good as well. But start the simplest way you can.

Youngstar: But then how do I get a different configuration per system?

Graybeard: You can have an overrides file where you place values per system, or use environment variables. Then I have a system that reads the overrides and updates the configuration.

Youngstar: How do you manage these override files?

Graybeard: Yes. In most cases the deployment system, say Ansible, will generate one based on the environment.

Youngstar: And I guess the defaults should be for the local development environment?

Graybeard: That’s right.
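A minimal sketch of a Python-based configuration with overrides, assuming the overrides arrive as environment variables. The setting names (DB_HOST, DB_PORT), the APP_ prefix and the load_config helper are hypothetical, not taken from the book.

```python
import os

# Defaults target the local development environment.
DEFAULTS = {
    "DB_HOST": "localhost",
    "DB_PORT": 5432,
}


def load_config(environ=None, prefix="APP_"):
    """Return the defaults updated with overrides from environment variables.

    An override is converted to the type of the default it replaces, so
    APP_DB_PORT=6543 becomes the integer 6543. This simple conversion is
    only sensible for plain types such as str and int.
    """
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    for key, default in DEFAULTS.items():
        raw = environ.get(prefix + key)
        if raw is not None:
            config[key] = type(default)(raw)
    return config
```

With no overrides set, load_config() returns the local development defaults; a deployment tool only has to set the variables that differ per environment.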

Youngstar: This system looks good enough for my usage, anything else?

Graybeard: There are many ways to do configuration, and you should pick the one that fits your case. We talked about overrides, the usual order is defaults < configuration < environment variables < command line switches. You can use something like ChainMap for this.
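A small sketch of the layering with collections.ChainMap. Lookups go through the maps from left to right, so the highest-priority layer is listed first. The keys and values here are invented for illustration.

```python
from collections import ChainMap


def build_config(defaults, file_values, env_values, cli_values):
    """Combine configuration layers; the first map that has a key wins."""
    return ChainMap(cli_values, env_values, file_values, defaults)


config = build_config(
    defaults={"host": "localhost", "port": 8080, "debug": False},
    file_values={"host": "qa.example.com"},
    env_values={"port": 9090},
    cli_values={"debug": True},
)
```

Here host comes from the configuration file, port from the environment and debug from the command line, while anything not overridden falls back to the defaults.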

Youngstar: OK. I guess adding command line support helps in quickly testing other systems.

Graybeard: Yes, sometimes the command that starts your program (say docker) gives all the right switches. Then you can go without a configuration system at all in your code.

Youngstar: It’s not true, you just moved the configuration system to the deployment/running system.

Graybeard: I said “in your code”. Glad you caught that, many people when they talk about “zero configuration” mean “in the code”. There’s a nice thing about not having configuration in your code, but I found out that the code is usually tested better than the configuration system. I prefer to have the complexity where there are more tests.

Youngstar: What about storing configuration in a server?

Graybeard: People do that as well, they use systems like Consul, etcd, ZooKeeper and others.

Youngstar: Then you need just to know where the configuration server is.

Graybeard: Yeah, but then someone needs to populate the configuration values on the server.

Youngstar: Agree. Anything else about configuration?

Graybeard: There’s so much more. Some people believe you should use just environment variables.

Youngstar: Why?

Graybeard: Read the 12 factor app and see.

Youngstar: Yay, more reading. By the way: I am using fabric for deployment, should I switch to Ansible?

Graybeard: Depends on the complexity of your deployment. fabric is very simple so I usually start there and switch to something more complex only when I need to. If you use a Docker-based system like docker-compose or Kubernetes, they have their own system for hooking containers together.

Youngstar: And then my code uses less configuration.

Graybeard: Exactly. But beware of jumping into docker - it’s cool but comes with its own set of problems.

Youngstar: Which are?

Graybeard: Let’s talk about it later when we discuss deployment.

Youngstar: OK. I guess as usual I’ll start simple and grow in complexity when I need to.

Graybeard: So young and so wise.

Youngstar: That’s right. Anything else I should know regarding configuration?

Graybeard: Sometimes you compose configuration values from other configuration values. Make sure to do that after you read the overrides. When I get there I usually add an init function to the configuration system and call it when the program starts.
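A sketch of such an init function, under the assumption that overrides come from environment variables. The settings dictionary, the APP_ variable names and the composed db_url value are all hypothetical.

```python
import os

# Hypothetical settings; the defaults are for local development.
settings = {
    "db_host": "localhost",
    "db_port": 5432,
}


def init(environ=None):
    """Apply overrides, then compose derived values. Call once at startup."""
    environ = os.environ if environ is None else environ
    if "APP_DB_HOST" in environ:
        settings["db_host"] = environ["APP_DB_HOST"]
    if "APP_DB_PORT" in environ:
        settings["db_port"] = int(environ["APP_DB_PORT"])
    # Composed last, so it sees the overridden host and port.
    settings["db_url"] = "postgresql://{db_host}:{db_port}/app".format(**settings)
    return settings
```

The program's entry point calls init() explicitly before anything reads settings["db_url"], which keeps the order of events independent of import order.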

Youngstar: Why not do it automatically on import?

Graybeard: You tell me.

Youngstar: Since I don’t control the order of imports. Some module I import can import the configuration as well.

Graybeard: Also as the Zen says: “Explicit is better than implicit.”

Youngstar: Good old Tim, he knew what he was talking about. Anything else?

Graybeard: Configuration can get very tricky, fight hard to keep it simple so you won’t end up with a very complex set of rules. There will also be an edge case where your configuration system falls short. As long as it supports the majority of cases - you’re fine.

Youngstar: As usual, simple things go very deep with you.

Graybeard: A good configuration system will reduce the complexity in your code. This complexity doesn’t go away, but it’s contained somewhere else which is a good thing.

Youngstar: What about passwords and other “secret” stuff? Where do I store it?

Graybeard: Make sure they don’t make it to configuration or checked in by mistake. We’ll have a talk on security later…

Youngstar: OK then.


Start simple. A Python based configuration system with overrides will get you a long way

Know that most times you move configuration complexity to another system.

Learn about the various solutions out there and what people do, then adapt what works to your system.

Give more than one way to specify configuration. Usually we have default < configuration file < environment variables < command line switches

Make sure “secrets” are protected in your configuration system and not checked into source control


If debugging is the process of removing bugs, then programming must be the process of putting them in.

- Edsger Dijkstra

Youngstar: I have a bug at work that I just can’t figure out. How do you debug?

Graybeard: I mostly don’t.

Youngstar: Come on, you’re not that good.

Graybeard: Oh, I have not mastered the art of writing bug free code… yet. What I’m saying is that I don’t debug in the traditional sense of using a debugger.

Youngstar: Ah, so how do you solve code problems?

Graybeard: Ever heard about Rob Pike?

Youngstar: The name rings a bell, not sure from where.

Graybeard: Look him up, he did a lot. Anyway he once said:

“If you dive into the bug, you tend to fix the local issue in the code, but if you think about the bug first, how the bug came to be, you often find and correct a higher-level problem in the code that will improve the design and prevent further bugs.”

I think it was his experience when working with Ken Thompson.

Youngstar: Ken Thompson of Unix?

Graybeard: Among other things.

Youngstar: That’s all very nice, but the way to understanding sometimes goes through debugging.

Graybeard: Right. However I’m a backend guy and most of the time debugging is impossible. I mostly use logging to understand what’s going on. If I do debug, it’s usually with the command line debugger that comes with Python - pdb.

Youngstar: Why not a visual one?

Graybeard: Since most of the time I’m in an SSH session to a server, which makes UI hard or impossible. Also once you get to know pdb it’s very effective.

Youngstar: Just like mastering Vim? OK, I’ll spend some time with it.

Graybeard: However, if you use a good IDE it’ll have a visual debugger and sometimes these are nice. As we talked before, knowing your IDE well will save you tons of time.

Youngstar: OK. What else?

Graybeard: Why do you assume there’s more?

Youngstar: Since with you there’s always more.

Graybeard: Fair point. One of the tricks I use is to sometimes place a “hard” breakpoint in the code. I do this when the condition for the breakpoint becomes pretty complex.

Youngstar: I thought pdb supports conditional breakpoints.

Graybeard: You’re right. I can do that in pdb or other debuggers but in some cases it’s much easier to specify the condition in Python code. What you do is something like this (writes on napkin):

    if some_complex_condition():
        import pdb; pdb.set_trace()

Youngstar: I thought there were no semi-colons in Python.

Graybeard: There are, but rarely used. In this case where it’s just debugging it’s convenient to have it in one line. I have a Vim abbreviation for this line.

Youngstar: I bet you do.

Graybeard: Then you run your code normally, not via pdb. And once the condition is met - you’ll get the pdb prompt. If you have IPython installed you can use its debugger instead of pdb, it’s a bit nicer.

Youngstar: And your test script makes sure this is not left in the code.

Graybeard: Exactly. I have a rule in the script that runs the tests to check no stray pdb.set_trace() are in the code.
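One way such a rule could look, as a sketch: a helper that scans the source tree for the debugger call and lets the test script fail when it finds one. The function name and the choice to also catch ipdb are assumptions for illustration, not the author's actual script.

```python
import re
from pathlib import Path

# Matches the debugger call from either pdb or ipdb.
BREAKPOINT_RE = re.compile(r"\bi?pdb\.set_trace\(\)")


def find_stray_breakpoints(root):
    """Return (path, line number) pairs for debugger calls under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if BREAKPOINT_RE.search(line):
                hits.append((str(path), lineno))
    return hits
```

The test script can call this before running the tests and exit with a non-zero status when the list is not empty. Note that this plain text search also flags the call when it appears in a comment or a string.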

But as I said earlier, I mostly use logs. It’s an art to get the right balance between huge logs and too little information. Try to err on the TMI side.

Youngstar: TMI as in “Too Much Information”?

Graybeard: Yes. Storage is very cheap compared to programmer time.

Youngstar: But what if the logs get too big?

Graybeard: You usually save only a window of time backwards. There are great tools for log rotation, both in the standard library and Unix utilities.
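A sketch of time-based rotation using the standard library's TimedRotatingFileHandler. The seven-day window, the log format and the make_rotating_logger helper are assumptions chosen for the example.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def make_rotating_logger(name, path, days_to_keep=7):
    """Return a logger that rotates its file at midnight.

    Only the last days_to_keep rotated files are kept; older ones are
    deleted by the handler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = TimedRotatingFileHandler(
        path, when="midnight", backupCount=days_to_keep
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

For size-based rotation the same module offers RotatingFileHandler with maxBytes and backupCount.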

Youngstar: Like logrotate?

Graybeard: Exactly. You can also ship logs to log aggregation services, we’ll talk about logging and monitoring later.

Oh, and Python’s logging module can listen on a socket (see logging.config.listen) and change the logging configuration at run time. This way you can temporarily set a log level in one of your modules for a while, collect enough data and then return it back to the normal level.

Youngstar: Cool, I’ll look it up. Anything else about debugging?

Graybeard: Today’s systems usually have more than one part. Debugging such a system is even more complicated. One thing I found that helps is to pass around a context object between sub systems. This way you can search the logs and get a logical view of an operation between several sub systems.

Youngstar: What’s in the context object?

Graybeard: Anything you think is useful. The bare minimum is just an identifier for the current operation/session.
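A minimal sketch of the bare-minimum context: a dictionary holding one operation identifier, stamped on every log record through logging.LoggerAdapter. The op_id field name, the helper names and the log format are assumptions for illustration.

```python
import logging
import uuid

# The op_id field is filled in by the adapter returned from get_logger.
FORMAT = "%(asctime)s %(name)s op=%(op_id)s %(message)s"


def new_context():
    """Create a context holding just an identifier for the operation."""
    return {"op_id": uuid.uuid4().hex}


def get_logger(name, context):
    """Return a logger that adds the context's op_id to every record."""
    return logging.LoggerAdapter(logging.getLogger(name), context)
```

Each sub system receives the context (for example in a request header or message field), builds its logger with get_logger, and a search for one op_id across the logs then shows the whole operation.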

Another thing people do is sometimes connect to a running service and inspect what’s going on with the Python REPL. There are several such systems, see Twisted manhole for example.

Youngstar: OK. Armed with this knowledge I’m heading back to the office.

Graybeard: Remind me to talk with you about work/life balance sometime.

Youngstar: OK.

Graybeard: But before you head back, another thing that really helps is giving it time. Letting what Daniel Kahneman calls “system 2” work on the problem.

Youngstar: System 2?

Graybeard: Yeah, not a very imaginative name. Think of it as the part of your brain that works below the surface. It’s the one that does most of the leaps in understanding but it needs time. Instead of heading back to the office, go home and watch a video called “Hammock Driven Development” by Rich Hickey.

I can’t tell you how many bugs I solved during jogging.

Youngstar: Oh, we definitely need to talk about work/life balance and how you have time to learn all this stuff.

Now that you mention it, I see my beer glass is empty. I guess I’m over my “Ballmer Peak”, so I’ll go home and watch that video.

Graybeard: Kudos on knowing your XKCD.

Youngstar: Thanks and g’night.


Writing simple code will make debugging easier

Understand the bug before you fix it

Know how to work a debugger, both from the IDE and from the command line

When fixing a bug try to make sure this kind of bug won’t happen again

Use logs, err on the TMI side

Use automation to make sure debugging code doesn’t get to the source tree

Give your subconscious time to work


May the queries flow, and the pagers remain silent.

- SRE Benediction

Youngstar: I’d like to place my code out there in alpha state so people can play with it.

Graybeard: Getting feedback early is a very good thing. Where are you going to put the code?

Youngstar: That’s what I was going to ask you. There are so many options - AWS, GAE, Heroku, Azure, my own servers … Which one do you use?

Graybeard: I use the one that fits my needs.

Youngstar: That was helpful.

Graybeard: The point is that there’s no “one size fits all”. It depends on many factors. And I use different hosting solutions in different situations.

Youngstar: One of these factors is if I can place my data outside?

Graybeard: Yes. A lot of companies think their data is safer if they keep it in house. However I tend to trust the Google/Amazon security experts much more than the local IT.

Youngstar: I don’t know much about security.

Graybeard: We’ll fix that later. However today it’s more common for companies to host data outside. And even companies that say “we host data ourselves” usually mean “on our hosted servers”. Sometimes you can’t host data outside due to legal reasons or some regulation.

Youngstar: IANAL, but I think I’m OK with hosting data outside.

Graybeard: What most companies underestimate is the cost of having your own servers. Scaling up becomes much more painful, and you need people doing rotation who can drive at 3AM to some Colo, have the right keys and know how to reboot the servers.

Youngstar: Colo?

Graybeard: Short for “co-location centre”. It’s usually a secure place for your servers with good network, security and other goodies.

Youngstar: So not from the office network?

Graybeard: Sadly I’ve seen that too.

Youngstar: OK, I’ll start with the cloud then. Which one?

Graybeard: There are many options and many variables you need to consider. As usual - some research required.

Youngstar: Such as pricing?

Graybeard: Pricing is one aspect. However most companies don’t fathom how time-consuming operations can be.

Youngstar: And by time you mean money.

Graybeard: Exactly. I’d do my best to limit my operational involvement.

Youngstar: OK, less ops is better. What else?

Graybeard: Try to avoid vendor lock-in.

Youngstar: By using open standards?

Graybeard: Yes, and also creating abstractions in your code.
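A sketch of what such an abstraction might look like for file storage: application code talks to a small interface, and only one class knows where the bytes actually live. The BlobStore interface, the LocalBlobStore implementation and save_report are invented for this example.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """The only storage interface the application code sees."""

    @abstractmethod
    def put(self, key, data):
        """Store bytes under key."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under key."""


class LocalBlobStore(BlobStore):
    """Keeps blobs on the local disk; handy for development and tests."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()


def save_report(store, name, text):
    """Application code depends on BlobStore, not on a vendor SDK."""
    store.put(f"reports/{name}.txt", text.encode("utf-8"))
```

Moving to a hosted object store then means writing one more BlobStore subclass around that vendor's client, while save_report and its callers stay unchanged.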

Youngstar: “All problems in computer science can be solved by another level of indirection”.

Graybeard: Did you catch my quote addiction? Was this David Wheeler?

Youngstar: Yup. Just stumbled on this the other day.

Graybeard: Another thing you need to take into consideration when choosing who to use is size and reputation.

Youngstar: Very much like selecting technologies to use.

Graybeard: As the old joke says: “Nobody ever got fired for buying IBM”. Sometimes it’s OK to bet on younger products, but infrastructure is something you need working.

Youngstar: “Stability is sexy”.

Graybeard: Oh, you actually listen to what I say. I’m flattered.

Youngstar: Yeah, yeah. Go on.

Graybeard: Once you’ve decided on hosting that fits your budget and seems decent enough, you need to fit deployment to your process. The ideal today is called continuous delivery - once tests pass on Jenkins, the code goes to production.

Youngstar: I heard that deployment is painful.

Graybeard: It doesn’t have to be. There’s a piece by the late Aaron Swartz called “Lean into the Pain”. He says that just like sport, we need to do the stuff that hurts us a lot in order to get better at it.

Youngstar: And when we deploy a lot it won’t be an issue.

Graybeard: Yup. Note that there are deploys and there are deploys. Most of them will be a non issue, but some of them will give you a headache.

Youngstar: Can you give me an example?

Graybeard: Changing a database schema in a non backward-compatible way.

Youngstar: Which means you need to re-process all the data?

Graybeard: Yes. And also you’ll have some processes still working with the old format and some working with the new format.

Youngstar: Ouch!

Graybeard: There’s a reason NoSQL is popular.

Youngstar: You can make breaking changes in NoSQL.

Graybeard: That you can, but it’s sometimes easier. However you pay in other areas, like lack of transactions. Pick your poison…

Youngstar: OK. I’ll think about it and try to automate my deployment as much as possible.

Graybeard: Good plan. Another thing which is hard in some platforms is zero downtime.

Youngstar: I read about it. So many options - Blue Green, Canary Releases, Rolling deployments …

Graybeard: As usual, go simple and scale when you need. Some platforms like GAE do it for you.

Youngstar: Cool. They scale as well?

Graybeard: Yes. So does AWS and others. You need to take care to limit scaling otherwise a spike in load can get you bankrupt.

Youngstar: Ouch!

Graybeard: It also hurts when users can’t access your site due to load.

Youngstar: I’ll pick my poison.

Graybeard: You’re learning. It’s all about trade-offs.

Youngstar: What else?

Graybeard: You need to make sure you don’t have snowflake servers.

Youngstar: I thought servers like cold temperatures.

Graybeard: What Martin Fowler means is a unique server that you can’t rebuild if it’s gone.

Youngstar: So automate again. Which tool? Ansible, SaltStack, Chef, Terraform…

Graybeard: Do your homework and ask around. I usually start simple with Fabric and move to the heavyweights when I need them.

Youngstar: OK. I will.

Graybeard: Automation also helps with avoiding errors. Some people swear by checklists, but manage to forget a step.

Youngstar: I get it, you sent me the “automate all the things” meme enough times already.

Graybeard: OK, moving on then … It’s important that there won’t be one production environment. You need one or more for QA.

Youngstar: But probably not that fancy.

Graybeard: Yup. So make sure to parameterize everything - cluster size, machine type …

Youngstar: What about Docker?

Graybeard: Docker helps in some aspects - it takes you out of dependency hell. However it comes with another level of orchestration.

Youngstar: TANSTAAFL?

Graybeard: Exactly. Docker also lets you create a copy of the production environment on your local machine, which is handy.

Youngstar: Anything else?

Graybeard: A nice thing is to mark deployment times on your monitoring graphs. This way if you see a spike in errors it’s easy to see if it’s related to a specific release.

Youngstar: Just a vertical line?

Graybeard: Any way you want, as long as it’s visible.

Youngstar: OK.

Graybeard: Also make sure you can do a rollback as well. If a release goes bad you need to be able to quickly get back. Blue-Green and rolling releases help with this.

Youngstar: Don’t forget the cute canaries.

Graybeard: That’s right. They were helpful at the coal mines and they are helpful now. Every release is a risk.

Youngstar: And we don’t like risk.

Graybeard: Yeah. In “Keys to SRE” Ben Treynor talks about “error budget”. If a deployment went bad and there’s down time - it’s taken out of your error budget and you release less.
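To make the error budget idea concrete, here is a small worked calculation. The 99.9% availability target and the 30-day period are illustrative assumptions, not numbers from the talk.

```python
def error_budget_minutes(availability_target, days=30):
    """Downtime allowed, in minutes, for a target over a period of days."""
    return (1 - availability_target) * days * 24 * 60


def remaining_budget_minutes(availability_target, downtime_minutes, days=30):
    """Budget left after subtracting the downtime already spent."""
    return error_budget_minutes(availability_target, days) - downtime_minutes
```

With a 99.9% target, a 30-day period has 43,200 minutes and allows roughly 43.2 minutes of downtime. A bad deployment that costs 30 minutes leaves about 13 minutes of budget for the rest of the period, which is the signal to slow down releases.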

Youngstar: Sounds reasonable. It seems there’s so much infrastructure to build and process to develop.

Graybeard: Yeah. And backups which work, and security and …

Youngstar: OK. I get it - ops is a lot of time and money. Final advice before my head explodes?

Graybeard: Get more beer?

Youngstar: I mean deployment wise.

Graybeard: I usually start with GAE which is zero ops and once things start to heat up - I look into other platforms. Or stay in GAE if it gives me all that I need.

Oh, and no deploys close to weekends or vacations.

Youngstar: OK. I’ll take a good look at my architecture and see if it can fit in one of the no-ops hosting. And now that beer please.

Graybeard: Sure thing.


Don’t underestimate how much operations will cost you in time and money

Pick a solution that will reduce the operations burden

Automate everything you can

Do your homework. Learn about deployment methods, tools and procedures

Be ready to roll back releases

Mark releases in your monitoring tools

Monitoring & Alerting

On a long enough timeline, the survival rate for everyone drops to zero.

- “Fight Club” movie

Youngstar: Our logging system paid off this week.

Graybeard: Do tell.

Youngstar: A customer called to say they are missing some data. A quick search in the log files found that one sub system was down for a couple of days, we brought it back up and the missing data was in front of the customer’s eyes in about an hour.

Graybeard: Fixing a system in an hour is indeed good. However I think you can do better.

Youngstar: Better than that? How?

Graybeard: You need to know about problems before your customers do.

Youngstar: Well, we have great logging. But we look at the logs after we found out there’s a problem. We do monitor our machines for load, disk space and other things. However this was an application crash and didn’t cause a system problem, it actually reduced the load.

Graybeard: Two things: One is that monitoring without alerting is not that helpful - nobody is watching the graphs 24/7. Second is that there are better things to monitor than disk space.

Youngstar: Let’s take these one at a time. You’re saying I need some automated system that will alert me when a metric goes funky?

Graybeard: Yes. You usually start with a fixed threshold, but as your system grows complex you need more advanced methods. Remember that if you have too many alerts - people will ignore them. It’s the classic “the boy who cried wolf” story. There are some cool new systems now that apply “anomaly detection” algorithms to metrics. There are even companies that provide a service where you send them your metrics and they alert when they find an anomaly.

Youngstar: I’ll start simple with manual thresholds and move to more sophisticated stuff later.
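A sketch of the simple starting point: fixed thresholds checked against the latest metric values. The metric names and limits are made up, and how the resulting messages are delivered is left out.

```python
# Hypothetical thresholds; tune them to your own system.
THRESHOLDS = {
    "disk_used_percent": 80.0,
    "error_rate_percent": 1.0,
}


def check_thresholds(metrics, thresholds=THRESHOLDS):
    """Return an alert message for every metric above its threshold."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name}={value} is above threshold {limit}")
    return alerts
```

A periodic job can collect the metrics, call check_thresholds and page someone only when the returned list is not empty, which keeps the number of alerts under control.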

Graybeard: Yup. “start simple” always wins. Other questions you need to ask yourself about alerting are “who?” and “how?”.

Youngstar: We’re a small team, I guess everyone should pitch in.

Graybeard: Yeah. One company I worked with had a good rotation system. There were weekly shifts, rotating at Monday noon. Each shift had a primary and secondary role.

Youngstar: I don’t believe that everyone can solve every problem.

Graybeard: Yeah, but it’s the Pareto principle - most errors are easy to solve. The big bonus is that everyone feels the pain of a failing system and starts writing more robust code, and also pays more attention in code reviews.

I saw a great talk called “Keys to SRE” by the guy who started the SRE team at Google.

Youngstar: SRE?

Graybeard: Site Reliability Engineering. It’s the group that makes sure things keep running at Google.

Youngstar: OK.

Graybeard: Where was I? … Oh yeah, in the video he mentions that a couple of sleepless nights does wonders to the stability of code people write.

Youngstar: I can see that. And I think that will be a good fit for my small team. I’ll give it a try - getting woken up at 3am gets old real fast. How do you actually alert?

Graybeard: Usually by an alert to a cellphone; PagerDuty seems to be very popular. It’s also good to alert the ops chat room.

Youngstar: OK. And if I recall you recommend doing a postmortem on every issue.

Graybeard: Yeah, start with 5 whys and develop your own system. Along the way update your “red book” for what to do when shit happens.

Youngstar: I thought shit happens all the time.

Graybeard: That’s right. Now let’s talk about what to monitor.

Youngstar: I guess the usual - disk space, load, memory …

Graybeard: Right and wrong.

Youngstar: Gee, that’s helpful.

Graybeard: Let me ask you - how does an 80% full disk affect your revenue?

Youngstar: Hmm. Well, it’s an indication that I’m going to have a problem and this might drive out users. Hard to place a number on this.

Graybeard: Right. Also let’s say everything looks OK system wise but your users can’t see data from the last 2 days.

Youngstar: I guess I need to check that as well.

Graybeard: Most people start “bottom up”, from system metrics to system health. But the more important one is “system health”: you need to monitor your KPIs.

Youngstar: The what?

Graybeard: KPI - Key Performance Indicator. You need to be up to date with your TLAs.

Youngstar: Three Letter Acronym?

Graybeard: Yup. Take Netflix for example, they have one major KPI they monitor called SPS - starts per second. It follows a wave pattern, and if there’s some deviation from this pattern - they take a look.

Youngstar: I see. But then you need to hook your own monitoring to your programs. It’s also harder to find problems in a wave-like pattern, which I guess differs from country to country and changes over the holidays.

Graybeard: Yes, it’s harder but better. Most of the time people measure what’s easy and not what’s important. Take highway police for example.

Youngstar: What about them?

Graybeard: They do a lot of speed traps, not because speed is the major cause of accidents, but because it’s easy to measure. Unlike reckless driving, which is far more dangerous but harder to catch.

Youngstar: I see. And how do I find these all important KPIs?

Graybeard: That’s a business question, I’m a tech guy. You’re the one owning a company - go and figure it out. As usual start simple and optimize along the way.

Youngstar: What about the other monitoring - disk, CPU, memory …

Graybeard: Keep them, but try to figure out how they affect your business.

Youngstar: Anything else?

Graybeard: Yes - automate as much as you can.

Youngstar: For example?

Graybeard: If the disk is getting full, and you know a place where you can clean up - do it. Even better run what I call a janitor process periodically to clean things up.
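A sketch of a janitor step that deletes old files from one directory. Which directory is safe to clean and how old is "old" are assumptions you have to make for your own system; the function name and parameters are hypothetical.

```python
import time
from pathlib import Path


def clean_old_files(directory, max_age_days, now=None):
    """Delete regular files older than max_age_days, return their names.

    Only looks at files directly inside directory; sub directories are
    left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Run it periodically (from cron, for example) and log the returned names, so the cleanup itself leaves a trail in the logs.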

Youngstar: Sounds good. What system do you recommend for this?

Graybeard: There are many, many systems out there. See what you need and what they offer and try to find a good match. As usual go with boring reliable technology. Lately I’ve been using the ELK stack, but that’s just a personal preference. I already had Elasticsearch in place, so not using yet another system looked like a win to me. But really - have a look around, there are many and it might be that one of them is a better fit to your needs than ELK.

Youngstar: Great, more homework. Anything else?

Graybeard: It’s a good idea to do “ops drills” where you simulate problems and people solve them.

Youngstar: I guess we’ll have plenty of the real thing to practice on.

Graybeard: It’s better to deal with your first outage not at 3am with a customer shouting over the phone. Also other team members can look and learn.

Youngstar: Isn’t that what Netflix’s Chaos Monkey does?

Graybeard: Sort of, but wait until you get there. By the way, they have more tools that destroy things; the whole set is called the Simian Army now.

Youngstar: Oh my… I need another drink to reflect on that. Want some?

Graybeard: OK, I get the hint. I’ll shut up about monitoring and alerting now :)


Identify your KPIs and monitor them

Start with simple thresholds and move to more sophisticated systems later

Have a pager duty rotation, everyone should pitch in

Automate recovery as much as you can

Update a “red book” for solving problems

Do a postmortem for every outage

Have ops drills


First rule of computer security: don’t buy a computer. Second rule: if you buy one, don’t turn it on.

- Dark Avenger

Youngstar: I was going over our HTTP logs and found some weird stuff there.

Graybeard: “Little Bobby Tables”?

Youngstar: There was some SQL injection, some attempts to run scripts and other fishy requests. How do I protect myself against such things?

Graybeard: One thing you need to keep in mind is that if someone is really targeting you - you will get hacked. Hackers managed to get into NASA, banks and many other secure places.

Youngstar: So I should just give up?

Graybeard: Why do you lock your door when you leave the house?

Youngstar: So bad people won’t be able to get in?

Graybeard: And you think that people who rob banks can’t get in your house?

Youngstar: They’ll be able to. But I do it to deter most casual thieves. Oh, I see where you’re going with this. I shouldn’t make myself an easy target.

Graybeard: Exactly. I’ll give you some simple rules to follow. Keep in mind I’m not a security expert.

Youngstar: If I had a penny for everything you’re not an expert in…

Graybeard: You’d probably have problems carrying all this weight.

Youngstar: Ha. OK, rules?

Graybeard: Let’s start with the social aspect. All the security in the world won’t help if you have weak passwords, if your computer doesn’t ask for login when you turn it on, if employees write passwords on a sticky note, or blindly click on any link sent to them.

Youngstar: You mean phishing?

Graybeard: Yup. And other social hacks. The key is to be aware, keep learning and educate people.

Youngstar: Good paranoid culture, sounds like fun.

Graybeard: Nah, just be careful - that’s all. You don’t think locking your door makes you paranoid.

Youngstar: You’re right. But you told me that only the paranoid survive.

Graybeard: That was Andy Grove, not me.

Youngstar: OK. Apart from culture?

Graybeard: One more thing about culture is that you need to make security part of the process. Do security reviews of your code - both as part of code reviews and as dedicated security audits. Appoint someone in your company to be the security tsar.

Youngstar: Anything special I should look for in those reviews?

Graybeard: Try to think like the bad guy. “How can I break this piece of code?”. Read “The Security Mindset” by Bruce Schneier to get some ideas.

Youngstar: OK. What else?

Graybeard: We usually think of security in layers. There’s network layer, server layer, deception layer, encryption layer and more. Each has its own set of tools and practices. Think about the layers that are more valuable and effective and invest your time there.

Youngstar: Deception?

Graybeard: Yeah, something called honeypots.

Youngstar: Now I can’t get the image of Winnie the Pooh out of my head.

Graybeard: Funny, now I can’t either. In any case, security is a cat & mouse game and you need to be updated all the time. One good practice is to keep things patched. Depending on your hosting choice, the provider usually does a good job patching. But you should keep track and make sure you’re up to date.

Youngstar: OK. I’ll patch away.

Graybeard: Note that some patches require reboots. You need to be ready for this and plan how to keep things up while rebooting.

Youngstar: I remember our talk on “hot deploys”. Any security tools I should familiarize myself with?

Graybeard: There are many. A good starting point is what comes with Kali Linux.

Youngstar: Isn’t Kali some Hindu goddess?

Graybeard: Envy of the competition?

Youngstar: Never envy, always cautious.

Graybeard: If you have time and money, you can hire a pentesting team.

Youngstar: pentesting?

Graybeard: Penetration testing. These companies will try to break into your site and will give you a report.

Youngstar: Like in the Sneakers movie?

Graybeard: Yup.

Youngstar: I’ll go and watch it again. I love Robert Redford.

Graybeard: Should I tell your boyfriend he should be worried?

Youngstar: … Sure, I like to keep him on his toes.

Graybeard: The poor guy. I hope he appreciates his luck.

Youngstar: Let’s get back to security please?

Graybeard: OK. Do what you did - monitor your logs. Add some automation to alert you when something fishy happens. There are several tools for that, the technical term you’re looking for is SIEM.

Youngstar: OK. You mentioned hosting companies doing patches. Do they do more?

Graybeard: Yeah they do, sometimes for free since it’s their reputation as well, sometimes at cost. And there are companies who give security as a service, WAF for example.

Youngstar: I’ll Google what WAF is. How much should I spend on security?

Graybeard: You need to think how much each security breach will cost you, not just money but also reputation. Then prioritize and protect.

Youngstar: Oh, I like that slogan.

Graybeard: Now about secrets…

Youngstar: Secrets? I don’t have any.

Graybeard: Sure you do. Your email password, keys to your hosting provider and more.

Youngstar: Oh these, what about them?

Graybeard: How do you keep them safe?

Youngstar: I have a file encrypted with gpg for these. The master password is in my head.

Graybeard: And if you have software that needs some of these keys?

Youngstar: I set it in the environment when deploying.
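
A minimal sketch of the receiving side of this approach, assuming the deploy step has already exported the secret as an environment variable. The names `get_secret`, `MissingSecretError` and `EXAMPLE_API_KEY` are hypothetical and only serve the illustration. Failing loudly at startup when a secret is missing is usually easier to debug than failing later on the first request.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name, environ=os.environ):
    """Return the secret called `name`, or raise if it is missing or empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to test without touching real secrets.
    """
    value = environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not set in the environment")
    return value


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a made-up name; use whatever your deploy script sets.
    try:
        key = get_secret("EXAMPLE_API_KEY")
        print("secret loaded, length:", len(key))
    except MissingSecretError as err:
        print("cannot start:", err)
```

Note that the code never prints or logs the secret itself, only its length.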

Graybeard: And how does the deploy script know?

Youngstar: It asks me.

Graybeard: So it’s not fully automated then.

Youngstar: Yup. By the way, is gpg good enough?

Graybeard: It’s better than rot13, which I saw people use.

Youngstar: rot13?

Graybeard: It’s a substitution cypher where each letter is replaced with the letter 13 places after it, in a cyclic manner.

Youngstar: And since there are 26 letters in the English alphabet, if you rot13 and rot13 you’ll get the original.

Graybeard: Yes. Not that secure but I’ve seen people use it. You can implement it with a single tr command8.
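
The same substitution can be sketched in Python with a translation table, which is close in spirit to what `tr` does. The function name `rot13` is chosen here for illustration. The sketch also shows the property Youngstar points out: applying it twice gives back the original text, which is exactly why it offers no real protection.

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# Map every letter to the one 13 places after it, wrapping around the alphabet.
_ROT13_TABLE = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13],
)


def rot13(text):
    """Return `text` with each ASCII letter rotated by 13 places.

    Characters that are not ASCII letters are left unchanged.
    """
    return text.translate(_ROT13_TABLE)


if __name__ == "__main__":
    scrambled = rot13("Hello, World!")
    print(scrambled)         # Uryyb, Jbeyq!
    print(rot13(scrambled))  # Hello, World!
```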

Youngstar: You and your aliases. Let’s get back to how I can fully automate my secrets.

Graybeard: Some of the automation systems like Ansible have modules that automate this process. There are special databases for managing secrets, and some companies roll their own.

Youngstar: NIH syndrome?

Graybeard: Probably. Sadly it’s a very common syndrome.

Youngstar: Any other things I should know?

Graybeard: I think we’ve covered the major points.

Youngstar: Right. Now I’m heading back to my place, and will make sure the door is locked.

Graybeard: Sadly they didn’t invent virtual guard dogs like the beast you have at home.

Youngstar: What do you mean by “beast”? He’s a cutie!

Graybeard: He is cute, but also very big and scary sometimes.

Youngstar: And probably needs a walk, I’m out of here.

Graybeard: Cheers.


Get in a security mindset

Appoint someone to be in charge of security

Make security part of your process. Do security audits and look for violations in code review

Keep software up to date, make sure patches happen

Secure in layers. Invest in the ones that give you the best benefit

If you have money, hire a pentesting team

Have a process to keep secrets

Going Faster

Write clear, precise code. Every ten years it will run 1,000 times faster.

- Joe Armstrong

Youngstar: We’re starting to get traffic on our site and some of the servers became busy. I think I need to rewrite some of my modules in C.

Graybeard: You know the three rules of optimization9?

Youngstar: Nope.

Graybeard: First rule is: “Don’t.”

Youngstar: Very helpful.

Graybeard: Actually it is. Second rule is: “Don’t… yet”.

Youngstar: And the third is “never”?

Graybeard: Nope, it’s: “Profile before you optimize.”

Youngstar: That one I get, but why avoid optimization?

Graybeard: Because there are so many better ways to make things run faster than writing code which is hard to understand and maintain.

Youngstar: Do tell.

Graybeard: Let’s start with the industry obsession for speed. The question you should ask is not “Can I make it faster?” but “Is it fast enough?”.

Youngstar: What’s “enough?”

Graybeard: This is an excellent question, and a lot of companies are not asking it. Try to get hard numbers from the product manager/business people. They need to understand that you’ll build a totally different system depending on whether they need response times in minutes or in milliseconds.

Youngstar: Minutes?

Graybeard: There are batch systems in the enterprise that run once a day, so even days might be a valid answer.

Youngstar: And if I hit these numbers, spend my time elsewhere developing new features?

Graybeard: Exactly. A lot of the time people say: “Make it as fast as you can.” Don’t let them get away with it.

Youngstar: And how do I do that?

Graybeard: I tell them something like: “OK, but I’ll need a supercomputer and two years to get as fast as I can.”

Youngstar: Nice!

Graybeard: One thing you want to do before optimizing is make sure your code works.

Youngstar: Doh!

Graybeard: You’d be surprised how many times people optimize bugs. Make sure you have a good regression/acceptance test before you start. Also spend time with the code and understand what it’s doing.

Youngstar: Makes sense.

Graybeard: After you’re ready, the first thing you should do is profile.

Youngstar: I know about the Python profilers and pstats.

Graybeard: Excellent, these will help you identify the problem. Note that there are several UI front ends to pstats and some IDEs have excellent integration.
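
As a rough sketch of the profiler and pstats workflow mentioned here, the snippet below profiles a single call with the standard library `cProfile` module and formats the result with `pstats`. The functions `slow_sum` and `profile_call` are made up for the illustration; in practice you would profile your own entry point.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive workload so the profiler has something to show."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, **kwargs):
    """Run `func` under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in memory instead of printing straight to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show only the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 1_000_000)
    print(value)
    print(report)
```

Sorting by cumulative time tends to surface the high-level functions where time is spent; sorting by `"tottime"` instead points at the functions that burn time in their own body.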

Youngstar: You? Preaching UI?

Graybeard: Sometimes a picture is worth a thousand words. Note, however, that pictures can lie just as well as words.

Youngstar: Meaning?

Graybeard: You need to understand what you’re viewing and how you measured it. For example, people who use Windows should disable the anti-virus software before running profilers.

Youngstar: Oh! “Lies, damn lies and benchmarks.”10?

Graybeard: Exactly. Also note that there are several kinds of profilers. You usually start with time-based ones, but there are event-based, memory and other profilers out there. Know the tools.

Youngstar: I’ll make sure to have more than a hammer in my toolbox. However, my system is more complex than just one component. How can I find out how much time each part takes?

Graybeard: I tend to use a timing decorator on functions. This decorator logs the function execution time and then I can see what’s taking time. This, combined with a context object that tells me which functions belong to the same request, helps me understand what’s going on.
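
One possible shape for such a decorator is sketched below. It is not Graybeard’s actual code: the names `timed`, `request_id` and the `"timing"` logger are assumptions made for the example, and the “context object” is approximated with a `contextvars.ContextVar` that holds a request identifier.

```python
import contextvars
import functools
import logging
import time

logger = logging.getLogger("timing")

# Hypothetical context object: whoever handles a request sets this once,
# and every timed function called during that request logs the same id.
request_id = contextvars.ContextVar("request_id", default="-")


def timed(func):
    """Log how long each call to `func` takes, tagged with the request id."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even when func raises, so failures are timed as well.
            elapsed = time.perf_counter() - start
            logger.info(
                "request=%s func=%s took %.6f seconds",
                request_id.get(), func.__name__, elapsed,
            )

    return wrapper


@timed
def load_user(user_id):
    """Stand-in for real work such as a database query."""
    time.sleep(0.01)
    return {"id": user_id}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    request_id.set("req-1")
    load_user(7)
```

Grouping the log lines by the request value afterwards gives a per-request breakdown of where the time went.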

Youngstar: Something like yslow?

Graybeard: Not as fancy, but yes.

Once you’ve identified the most promising candidates for optimization, it’s time to evaluate how much effort each one will take to improve and pick the one with the best effort/speedup ratio.

Youngstar: Pain vs Gain again?

Graybeard: It’s always there.

Youngstar: Now that I have what to optimize, how do I do it?

Graybeard: There are many tools and techniques out there. I’ll try to point out some of the major ones. But do your homework.

Youngstar: I always do.

Graybeard: Always?