SeaGL 2024 Keynote accompaniment

It is my great honor to have been asked to keynote for SeaGL, Seattle GNU/Linux, a Free/Libre Open Source Software conference in Seattle, WA. I’ll post some form of my slides here, and I’d also like to include the list of linked credits, one-pagers I wrote/assembled, and resources I reference during the talk.

Links

A great introduction to password managers and other personal security auth tools on TeamPassword: https://teampassword.com/blog/one-time-passwords-vs-two-factor-authentication

Get 1Password: https://1password.com/product/password-manager

Get Flickr: https://identity.flickr.com/sign-up and then Pro mode

Get Kagi: https://kagi.com/pricing

91% of adults do not read ToS: https://www.businessinsider.com/deloitte-study-91-percent-agree-terms-of-service-without-reading-2017-11?op=1
Most ToS are incomprehensible to most adults: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3313837
“The revenue of the service provider is maximized when the user blindly uses the service.” https://www.sciencedirect.com/science/article/pii/S2667096823000204

Lawsuits involving Meta Platforms: https://en.wikipedia.org/wiki/Lawsuits_involving_Meta_Platforms

Firefox Multi-Account Containers:
– Firefox extension: https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/
– More info: https://support.mozilla.org/en-US/kb/containers

New communities

Image credit

Angelo Pantazis photographed and created the header image for my slides; find his work here: https://unsplash.com/@angelopantazis

“RKODE Boot Camp”

If you’re in tech and you have friends outside our industry, you are probably regularly asked some flavor of the question, “How do I do what you do?” What they’re usually asking about is the high pay, the freedom of work location, what I would call an “adult” level of trust from colleagues and superiors, and the flexibility in how you get the work done, among other benefits. There is no magic to working in tech; if you want to use your brain without needing an advanced degree, it’s a great way to get paid and survive, and maybe even buy a house and retire some day. For many, a position of privilege is what enables getting there – for me that was absolutely the case, which is a discussion better had in person than on a public blog post. Furthermore, the way to “get a tech job” evolves all the time – the path I took would not look the same today.

Recently my partner was asking more seriously about what a path like this could look like, and asked somewhat rhetorically, “Should I just go to a code school?” I have pretty strong negative feelings on that. The ONE code school I used to think was pretty decent has really disappointed me in the last few years. In my opinion, the folks who will get a job and stay in industry for more than a year or two would have done so without the expense and the unaccredited, ever-changing program of a code school. Saliently, you cannot earn as an instructor at a boot camp/code school (I use the terms interchangeably in this post) what you can earn in industry. I can think of a dozen people who did these programs whose instructors suddenly dropped out in the middle of a term, because they had been doing the instructor job only while applying to software roles. I used to tell folks to check whether the instructors had themselves graduated from the code school they were teaching at, which tells you whether the school is teaching you to perform its own program, or to actually learn practices for industry.

So! What to do? I’m self-taught, with one code class that I took at the beginning of my journey in 2013. After the course was over, another woman from the class recommended PyLadies, and I made the decision to go every Saturday, work on my projects, and publish a blog post on what I was learning each week. I did that for a year while I was finishing my degree in French and minor in Mathematics. That led to a weird little internship, which led to 18 months in Support Engineering at a server config management company, which then, roundaboutedly, led to five years at my old company and three years at my current company as an operations engineer/devops engineer/site reliability engineer (the terms evolve), a space I’ve now inhabited since 2016 across a few roles. So I do think there is real possibility in background-but-consistent learning toward a tech role; I just find myself frustrated by how much money that seems to cost. With that in mind, I started punching up what I think would give you a base on which to seek out and competently interview for entry-level technical roles – junior software engineer, technical support engineer (emphasis on technical), IT desk/service, and more.

Language – pick ONE!

Lots to choose from, but below is the list I think one could pick from and stick with – don’t bounce around too much once you’ve chosen. It’s important to learn deeply before casting your net too widely. Once you can read one language and make out its control flow reliably, you really can read most any language, and starting to work in a different language is not too bad. You will find yourself forever comparing a new language’s paradigm to what you already know, so you want to make your first mental model strong. I started with Python and have seen it used everywhere from frontend to backend to personal scripting to simple data management. I have learned some C/C++, and boy oh boy can you get paid big bucks for being able to navigate its ancient and hallowed nooks and crannies well. JavaScript and TypeScript are foundational and critical for web development, and Java is just ubiquitous. Describing these languages beyond this very brief sketch is outside the scope of this already monster-sized blog post. If there is another language you are strongly considering that is not on this list, go for it! Learn it deeply. I think the program I’ve developed could very much be applied to many other languages as well; I just think the five I’ve listed are terrific ones for beginners.

JavaScript, TypeScript, C++, Java, Python

Who is this for?

I’m thinking of the adult professionals who enjoy reading for its own sake and who are motivated to spend a couple of hours a week – maybe more if you get really into something. Have you really enjoyed figuring out some piece of Excel automation? Have you gotten really into managing your own workflow, or improved upon or created effective, repeatable, teachable systems for getting some piece of work done? Have you, on your own, decided to pick up and complete a non-fiction book without external pressure? Do you enjoy engaging people on complex topics? If the answer to a couple of these is yes, I think you’d probably do well in a program like this and in this line of work, which can encompass hundreds of kinds of roles.

You’ll need a computer. You don’t need anything fancy, genuinely. A Mac is probably best – you can find used ones between three and eight years old that will still work for a while – but a Windows computer will work fine too. With Windows 10, you can use the Windows Subsystem for Linux 2, or WSL 2, for a real Linux command line. Personally, I have not yet stumped my WSL 2 install with anything I’ve tried to do – Python, bash, Docker, Rust, etc. If you would like to pursue a PowerShell and native-Windows programming career, this isn’t really the blog for you, and I don’t think I’m the person to advise you on it, but godspeed!
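A quick aside for Windows folks: on recent builds, running wsl --install from an administrator PowerShell sets up WSL 2 with Ubuntu (check Microsoft’s docs for older Windows 10 builds). Once you’re in, a sanity check of the new Linux shell might look like this – the tools listed are just examples, install whichever you actually need:

uname -a              # should report a Linux kernel
python3 --version     # the distro's Python
git --version         # version control, which comes up again below
echo $SHELL           # probably /bin/bash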

The Program

Finally we come to the program itself, which I jotted down one evening about a week ago and keep chewing on. I think it could really get you a job in 6–9 months, for very little money – a fraction of a fraction of the cost of a code school. And if you get to the end (spoiler: the last step is “take a class at your community college”) and have decided it’s not for you, the money you’ll be out is the cost of a few books, NOT $30,000 or whatever they’re trying to charge these days.

  1. Join the main language’s slack or discord. Join the beginners channel, and pick five other channels to follow and keep up with. Make one of them social, and join conversation a couple times a week. If the channels aren’t active, pick new ones. Come up with a technical question to ask at least once a week in your learnings. Try to answer other newbie questions. This will be part of your learning experience for the rest of this program.
  2. Optional: a Codecademy (or comparable free) class on one of the recommended languages
  3. One beginning programming book with exercises
  4. 3-5 videos and exercises on Git and version control
  5. PROJECT: Spend two weeks building something with what you learned. Feel free to liberally crib from the book you selected in step 3!
  6. One beginning HTML and CSS book with exercises 
  7. PROJECT: Spend two weeks building a local website with just html/css. Cribbing rules apply.
  8. One beginning systems or networking or ops or AWS book (not azure and not gcp)
  9. Optional but highly recommended: bash and Linux command line basics
  10. The Phoenix Project book
  11. Forge Your Future with Open Source book
  12. A second and more specialized programming book which contains exercises in the same language you chose above
  13. PROJECT: Spend two to three weeks building a project based on the specialized book you read.
  14. Workshop a larger idea* for a project, which will require your language of choice and html/css. Find a collaborator from your slack. Work on it for six weeks. Scoping this can be hard, but your new colleagues and your collaborator can work on it with you.  *If your language is C++, skip the html/css aspect.
  15. A community college course in your chosen language
  16. Stay active in the slack, start building a resume of projects with old jobs relegated to 1-2 lines each.

“But why this and not that?”

There are some pretty opinionated recommendations in the above list! My intention is not to offend but to offer my professional perspective. I really think you can throw a dart and find a perfectly good book to learn with, and as you develop and make connections with people, you’ll find the books you want to read on these topics. They can’t DRM you out of your own physical book, either. If you prefer e-reading or audiobooks (probably a little harder for hands-on computer learning but not to be discounted and plenty powerful still!), please feel free. Personally, I like to have the paper book, and find it’s harder to ignore than a PDF in yet another browser tab, competing for my attention with social media and whatever else I’m looking at at the moment. This isn’t a program I am administering, it is a set of guidelines I am recommending.

I feel that Git is something that is best suited to personal instruction, so instead of videos, you could ask a friend in your new slack for an intro if that is appropriate, or honestly if you’ve read this far and you’re still interested, I will try to make time for this for you, the reader!
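For the curious, a first Git session really is just a handful of commands – a minimal sketch to demystify things before the videos (or the friend):

# one-time identity setup
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# create a repository, stage a file, and commit it
git init my-project
cd my-project
echo "# My Project" > README.md
git add README.md
git commit -m "Add README"
git status            # confirm there's nothing left to commit
git log --oneline     # see your one-commit history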

If you’re going to learn a cloud hosting system, just learn AWS. GCP and Azure exist, but the ideas are the same so using a different one after learning a bit about AWS is not too big a lift. AWS is used all over and defined the game. Just be careful with the free tier if/when you start making infrastructure on there.

I think that this curriculum is missing an introduction to databases, and I really welcome your feedback on where that would fit in this setup.

The Phoenix Project and Forge Your Future might seem like the odd items out in this list. However, TPP is a business-of-technology book that is well written, exciting, and ~fictionalized; the insights it offers are innumerable, and years later I still find myself thinking about it and recommending it. Forge Your Future with Open Source is a how-to for getting involved with Open Source, which is a common but challenging recommendation for newbies, and this book explains and demystifies – and justifies! – how and why to get involved at this level.

Don’t sleep on the slack. Seriously, do not proceed without step 1! Join the language slack, get involved, get to know people, make it part of your community and your life and this program. You will not get a job if you don’t make friends in this network of people. Folks hire their friends. If you’re a beginner, you simply will not get a job over someone else if you are a total unknown. Once you have the resume, this will be less (or differently) important, but you probably don’t yet, so it’s time to add to the rich community of this language with your awesomeness! And you’ll be adding as much as you are receiving! Look for other questions to answer, ways to help, volunteer opportunities, and the “payoff” in lifetime connections and strength of networking will be utterly invaluable.

Thanks for reading! Please comment and let me know what you think! You can also email me at bootcamp at rkode dot com if you would prefer.

Modern Python Environmenting with Pyenv and Pipenv

One of my favorite things about Python is the environmenting. I first learned virtualenv, then virtualenvwrapper, and now just pipenv, the current favored method to put a special location on top of the $PATH for a specific purpose.

PATH and venv/virtualenv explanation

Let’s talk a bit about how Python creates “environments.” When new to Python and to code isolation in general, some will compare a virtualenv to a virtual machine, or a Docker container, or a remote server; the truth, however, is that nothing is actually isolated at all. A symlink to the Python interpreter you created the virtualenv with, as well as the location of any Python modules you have installed, is given to your current session as the FIRST place to look when executing or running anything.

We’ll get there, but first let’s talk briefly about $PATH, because I see it misunderstood even by professionals who have been working in technology for many years, and it’s important. The $PATH variable is an environment variable (a variable that has a value in your current session – so, in each new terminal you open) that is a _list of paths_, or a list of locations. Just as my address is 123 Main Street and my friend’s address is 125 Main Street, you’d have a reasonable guess as to where to find each of us. In this analogy, let’s say that PATH="123 Main Street:125 Main Street". Your Mac or Linux machine works the same way, except the locations are directories like /usr/local/bin/ and /home/rachel/.local/bin and /sbin. The order of the directories is crucial – if you have an executable with the same name in the first directory and in the third, your computer will STOP looking after it finds the one in the first!

So, to my knowledge, all Python environmenting works by changing the $PATH variable to put the Python stuff for the desired venv at the very front of $PATH. We would also say “on top of the $PATH,” because this is absolutely analogous to a stack. With the Python interpreter you’ve specified, and the Python modules you’ve installed, and the $PATH changed to look for these FIRST, we have a venv! It’s important to understand that it’s really all on the same machine and with the same level of access – we just artificially make the desired stuff MORE available.
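You can watch this happen for yourself. A minimal sketch – the exact paths will differ on your machine:

echo $PATH                  # e.g. /usr/local/bin:/usr/bin:/bin
which python3               # e.g. /usr/local/bin/python3

python3 -m venv .venv       # create a venv in the current directory
source .venv/bin/activate   # prepends its bin/ to $PATH
which python3               # now .venv/bin/python3 – found FIRST
echo $PATH | tr ':' '\n' | head -n 3   # .venv/bin now sits on top of the stack

deactivate                  # restores the previous $PATH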

Automate a bunch of virtualenvs at once

So, at my job, I’m working on improving onboarding. This is a really tall order with a lot of different vectors, but the technical aspects are all pretty exciting. First up, for whatever reason, I chose the problem of how challenging it is to get a new engineer on our team going with a ton of different venvs. There’s a boatload of Python in our day-to-day, and laboriously setting up venvs in 10+ places is a pain and error-prone. The way we’ve been doing it is the old-school way, with virtualenvwrapper (a tool I’ve loved for a long time but which is starting to go stale with newer Pythons [and newer versions of Macs]!), so weird things have begun to go wrong.

It’s not only time for an update to our methodology, it’s time to abstract this problem away entirely. Folks on my team interact with Python but don’t write much of it; still, we must use venvs all day, every day for various tasks, because there’s just tons of tooling written in various versions of Python. So I wanted to write something that would just… DO IT. Something to take all the places where there should be a virtualenv, and make it, based on a spec.

I researched in a lot of places: I thought about Makefiles (really not the right tool), thought about Docker (make an image, not committed to the repo*, for every single one??? no), looked up some Python docs, and finally stumbled across a fabulous blog post by Bruno Michetti, https://www.rootstrap.com/blog/how-to-manage-your-python-projects-with-pipenv-pyenv/, detailing how to make pyenv and pipenv play nicely together. Edit: And I assembled it all in bash, because that’s the system scripting language I’m most comfortable with. It’s not in Python, sorry! Mentally I consider this level of abstraction to be one level above Python, though it may be possible to bootstrap this in Python itself – I don’t really know.

Pyenv is a really lovely abstraction for downloading whaaaaatever version of Python you want, and switching among versions very easily without having to know their locations, whether they’re symlinks, yadda yadda.

Pipenv is a tool that bundles pip, which is the Python module install tool, with virtualenv! How sensible, given that the workflow previous to the existence of pipenv was to make a virtualenv, activate the virtualenv, and then pip install everything in requirements.txt. Pipenv condenses that with some nice bells and whistles besides.
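Concretely, the condensation looks something like this (a sketch – the version number and script name are placeholders):

# the old three-step dance
virtualenv -p python3.9 .venv
source .venv/bin/activate
pip install -r requirements.txt

# the pipenv equivalent
pipenv install --python 3.9 -r requirements.txt
pipenv run python my_script.py   # or `pipenv shell` for an activated session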

The remaining challenge was getting it all into bash so the user could just run one thing, and organizing the script comprehensibly.

*not committed to the repo because this is not a task where I’m trying to convince fifteen teams to generate their module in a completely new way

The code

I typed this out by hand – I hope you like it, whether you too have a ton of virtualenvs to install at once, or you want to give this to a team to use once you fill in the blanks: the repo location, and the repos themselves, i.e. Python modules using various Python interpreters. The only things that aren’t real are the repo names. It would be cool to get these dynamically – if you know how this could be done, based on a requirements file or something else, please holler in the comments! (I sketch one possible approach just after the script.)

#!/bin/bash

repo_location=$HOME/repos

# brew install pyenv & pipenv
brewing () {
    brew install pyenv
    brew install pipenv
    brew update && brew upgrade pyenv && brew upgrade pipenv
}

# some tasks for pyenv to feel happy
pyenv_setup () {
    if [[ "$PIPENV_VENV_IN_PROJECT" == "1" ]]; then
        echo "environment set up for pyenv already, probably"
    else
        echo 'if command -v pyenv 1>/dev/null 2>&1; then' >> ~/.zshrc
        echo '  eval "$(pyenv init -)"' >> ~/.zshrc
        echo 'fi' >> ~/.zshrc
        echo 'export PIPENV_VENV_IN_PROJECT=1' >> ~/.zshrc
        source ~/.zshrc
    fi

    # if these already exist, the script will ask if you want to download them again
    pyenv install 2.7.18
    pyenv install 3.8.13
    pyenv install 3.9.13
}

# now the juicy stuff!  repo names are fake.  eat your vegetables!
venv_creation () {
    cd $repo_location

    # py 3.8 envs:
    three_eights="zucchini rutabaga cabbage"
    for i in $three_eights; do
        cd $i
        pyenv local 3.8.13
        pipenv install --python 3.8 -r requirements.txt
        cd $repo_location
    done
    cd $repo_location   # can you tell I've been scarred by bad dir management

    # py 3.9 envs:
    three_nines="cauliflower radicchio spinach"
    for i in $three_nines; do
        cd $i
        pyenv local 3.9.13
        pipenv install --python 3.9 -r requirements.txt
        cd $repo_location
    done
    cd $repo_location
}

brewing
pyenv_setup
venv_creation
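On the “get the repo names dynamically” question from above, here is one untested, hypothetical idea: let each repo’s own files drive the loop. A requirements.txt marks a directory as a Python project, and pyenv local already leaves a .python-version file behind that records the interpreter:

# hypothetical sketch – assumes every repo keeps requirements.txt at its root
shopt -s nullglob   # skip the loop cleanly if nothing matches
for req in "$repo_location"/*/requirements.txt; do
    repo=$(dirname "$req")
    version=$(cat "$repo/.python-version" 2>/dev/null || echo "3.9.13")
    (cd "$repo" && pyenv local "$version" && pipenv install --python "$version" -r requirements.txt)
done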

Technical Onboarding information category

Hey folks, been a while on the main blog. Got lots of stuff I’d like to talk about here at some point (like getting off Twitter via buying another WP blog)! Today’s topic is born of some onboarding work I’m doing for a new person on the other side of the world from me, which brings endless fascinating challenges, basically all of them good!! I was talking with my PyLadies besties (six of us ran the Portland PyLadies chapter for a few years and we have all stayed close) and we got to some interesting places as far as technical (and possibly other kinds that I can’t speak to) onboarding.

As I see it, there are four categories of information relevant to a new engineer*.

  1. What I expect you to bring to this role
  2. What I expect you to absorb and internalize *without* documentation
  3. What I expect you to read and reference from documented internal sources
  4. What is undocumented but should be documented

You’ll notice that three of these are pretty straightforward. Or at least, ideally they would be. Regarding item 1, what I expect you to bring to this role should be clearly laid out in the job req. In our case in Data Center Provisioning, we wanted strong networking, decent general command line ability (this can be switch-specific, though – switches like Cisco’s and Arista’s have their own command structure that is NOT filesystem-tree based, but is somewhat cross-applicable to Linux-style command line interfaces, which is where my familiarity lies), virtualization experience, and a little scripting; cloud ops experience and some programming are a bonus.

Re the third (we’ll come back to the second), what I expect you to read and reference from documented internal sources is never as grand as you want it to be! It isn’t that we don’t have a ton of docs – we do, but as anyone in this field knows, it is a monumental task to keep it updated and constantly added to in relevant, searchable ways. We do our best and are always trying to come up with ways to incorporate this work into a regular cadence of attention and discussion.

Which brings me to our fourth category, what’s not documented but should be. This is a huge category, and its boundary with the second category is absolutely fuzzy! There is so much that I can CLEARLY state should be documented, though; even with the gray area, there is a huge quantity that is utterly black and white.

Finally, the category of what I expect you to learn and internalize *without* documentation. I bristled at this idea earlier in my career. Now I am pretty sure it is the secret sauce of any technical role, maybe any role period, and I believe my success as a colleague-trainer depends on clarifying this category above all others. The clearest examples in this category are (in my field, anyway) things like how to use code repositories, including how to submit your code for review and how to generally use a cloud remote tool like GitHub (that’s probably a pretty odd label for what GitHub is… please forgive it). These are the kinds of things you would explicitly show a junior-level person, or someone whose background is ENTIRELY networking or physical racking-and-stacking in data centers, but at my job we generally expect that you have this.

The less clear pieces are harder to describe (no kidding, lol). I think this is ultimately the category of company- and job-specific knowledge that, as you deepen your understanding, hopefully gives you tools and knowledge for your future career as well. Some of it is historical context for your job that you don’t really get until you’ve been there for a minute. The context I have, having been at Fastly for 18 months now (not a long time, but more than a flash in the pan), is a key part of the value I bring, and the longer I’m in this job, the more valuable that will be (but not necessarily – there is an upper limit that is probably worth exploring in a “should you move on from your job” kind of post). Another piece might be, at least in my case, the unique design of what we build. I’m not going to get too specific in a public blog post, of course, and while you can learn design generalities for what we do and make and provide, the specifics of our builds will not be known until you develop some working, repetitive knowledge. And then it will change subtly, over and over and over!

“But wait, Rachel, shouldn’t this stuff be documented?” This question is basically THE reason I felt the need to write this blog post. The answer to some of it is yes! The answer to a LOT of it is, “uh… kinda?” Like, yes, we have design docs, which are thoughtfully created and introduced to the relevant teams! And they apply to one version of it, or maybe one category of versions of it. But this one is different for pretty good reasons, and you can tell by looking at X and Y. And this other one is different for reasons we later discovered aren’t very good, and you can kiiiinda tell by looking at Z or W! I hope this is not too abstract. I’d like to get to a place where I can really define this category.

I think this second category is also partially the reflection of the fact that we simply cannot document everything. It’s not possible, and further, we got work to do folks! One of the common personal aphorisms I return to is that they’ve hired us here for our brains. This, and my last job, are not widget jobs, like my support engineer job from 2014-2016, which was extremely technical and demanding, but which 90% depended on a) ticket inflow and b) having TOO MUCH ticket inflow in order to never “overstaff” in this “loss-leader” department (a characterization I find short-sighted and even cruel – can you tell my feelings on this with what I’m putting in scare-quotes??). They need us to be finding and solving the continual edge-cases we encounter before they become major incidents or otherwise impede the function of our organization… and we are very lucky to be able to do that. Sometimes we’ll all gripe about some bizarre complexity or a decision handed down to us from a few levels up, but really, we’re all just trying to solve technical AND business problems, and it’s genuinely a rare and special thing to be given the freedom and the salary to do so with as much of your brain as you care to engage.

Quick sidebar: I do think that a very technical job (to a point!! no ocean-boiling, that’s burnout territory) where you really are trying to solve problems and get shit done a little better each time is rather a fountain of youth. Keep your brain sharp and you keep dementia at bay a bit better! No guarantees in this world of ours, but cognitive function is a real thing that declines, and like any muscle, the way to keep it going is to USE it!

So. Getting back to the categories of onboarding information… What do we do with this? Well, I’m working with it as we speak, and it’s HARD! It’s a bit know-it-when-I-see-it, and part of trying to write all this out was to personally try to be able to recognize the “second category” of information a little faster, so I can impart it better, so our newest engineer can internalize it more easily, so we don’t fall into the trap of trying to document something like “and then, type cd foldername…..”. And of course, the point of it all is to prepare our newest engineers on our team in the best possible way we can for themselves and for our organization!

Truly I’d LOVE to hear what you think about these categories, and if this reminds you of Some Book (or blog post!) I Should Read, please let me know!

* this isn't the place to discuss coder/programmer/sysadmin/noc/etc all having the title of engineer, it's fine, your opinion probably isn't wrong even if we don't agree, just not going to talk about it here.

2021 Job Search Retrospective

The last time I was job-seeking, I described what I was looking for on this very blog. The post was descriptive and even a bit demanding, and the role I found was so utterly, ridiculously perfect for me that I can only call it a success. I think there’s a bit of psychology to a “here’s what I want in a job & here’s what I DON’T want” blog post, no matter your level or newness to a given industry. Describing characteristics of a job you want lights up a “hey, that’s me/my company!” affirmation in a brain. Describing characteristics you do NOT want lights up a whole other, defensive part of the brain, I think – one that says, “well, we don’t do exactly that, and here’s how we’re actually really good about this even though it might seem to share this characteristic” – which serves to engage people and make them want what you are. Also, I think displaying some “taste” and preference paints a more whole picture of you as a human being, which helps you stand out. These are all my opinions and just what worked for me, and I may even be laying my cards out too much, since it does kinda feel like a “trick.” At any rate it’s all a bit moot, since I hadn’t done this since 2016, and the job landscape for newcomers to tech is so different than it was five years ago.

I was looking for a new job privately this time. The reasons are not complicated or dramatic; I wanted something very different to broaden my perspective, I wanted to be paid better, and I wanted to get onto a track where I could have a senior title by the time I’m 40 (about five years from now). I’m a latecomer to tech & I worry about the perception people might have of me, being in my 40s with junior-level responsibilities, alongside the double whammy of sexism and ageism, which is not here yet but is certainly on its way – anyway, a lot of anxiety, as you can see. However, I only had my applications, network, and cover letters to express these things, rather than the more inbound approach of the above-mentioned blog post, because despite truly loving my job and trusting my team at work, I did not want to tip my hand. The power imbalance is just too great to feel comfortable doing that – I have a mortgage to pay and an uninsulated house to heat!

The Process

So first off, I put a few feelers out, and made a Trello board. I made a Trello board for myself the last time I was job-seeking, and it served to organize my efforts extremely well. Everything in job seeking is phases, so each job, represented by a card, would travel along each phase, sometimes skipping a lot of them straight to the Rejected column, and sometimes seemingly languishing in a pre-interview state for months (for good reasons).

Surprisingly, most of the cold jobs I applied for were via LinkedIn. Their interface has gotten better for this in recent years, though I don’t think that the “instant apply” feature is very useful – if no one makes any effort to personalize or customize your first touch, then without STANDOUT credentials, I don’t think it will usually result in anyone reaching out. The best ones I found were those that said “apply on website,” and while that’s definitely harder and there were many that asked what felt like short-form essay questions, I had good luck with those – I like writing, haha.

But is writing the job, when looking for Operations/SRE/DevOps/Infra (all of these are search terms that more or less apply to what I was looking for)? I do think good communication is an undersung part of working with others; however, I’m no better an engineer than someone who didn’t have the education and focus on writing that I had in my early life, and yet I’m favored because I can express myself clearly and enthusiastically. So, while it advantages me, I think it’s ultimately an odd additional hoop that makes it harder for people with various learning disabilities, ADHD, anxiety, and others to apply. I don’t know a solution to this – more phone screens? That is hard to do. The answer is probably “spend more time/money” rather than any technical solution.

So, the breakdown. I applied for fourteen jobs over the course of about a month, received one offer, was rejected by eight, and told a further five that I was no longer interested. Fourteen in a month might not seem like a lot – and indeed, if I had not been employed, that number would have been at least double – but it was very nearly too much for me while continuing to work at my current job, which again, I like very much, and I respect my coworkers hugely.

My strategy for each “job card” in Trello was to add it to the first column, To Apply, with a URL to the job posting. When I came back to apply to the job, I moved the card to the Applied column, copy-pasted the entirety of the job posting into the card’s Description, and attached the resume and cover letter (if applicable) to the card itself. This makes each card pretty heavy, honestly, and I’m happy I was able to do this all on a free Trello plan, because they’re hosting a lot of very similar resumes/cover letters for me now!! And I constantly referred back to the job postings – often taken down before they’d even hired for the role – in trying to find the right things to say during screens and interviews. With each new job, I tweaked the resume a bit, and wrote an entirely new cover letter every time.

quick cover letter sidebar:

Quick sidebar, let me tell you my strategy for cover letters! “Hi, thanks for receiving my application. I’m excited to work with you because of a, b, and c from the job posting, and I know I’ll be a great fit because of x, y, and z alignment. My experience suits this role particularly well because REASON. Thank you for your time!” The attitude needs to be “I’m a professional, so are you, let’s work together as mutually respectful humans!” No supplication here, and no more than a paragraph, SOMETIMES two! I could post all of my cover letters at some point; I think that could be useful. No names, though, obviously.

Anyway, back to the process. So after I sent things off, preferably to their own career page portal, it was usually a few days before people got back to me. Sometimes it was the next day, sometimes it was a couple weeks. This felt pretty fast to me, because I’m fortunate enough to be in a specialty that is in high demand. That will change someday, so uh, I’m gonna try to put some money away for that inevitability.

I also talked with a couple recruiters. Local recruiters often have access to jobs you wouldn’t otherwise know about, and are always worth chatting with – everybody makes money if it works out, and it’s in their best interest to help it really work out! If you end up leaving a job which you have been recruited into, their fee is often reduced, and the trust relationship between the recruiter and the company is harmed. In my experience, they did not have access to many larger organizations, but to lots of good local companies, which may be what you are looking for.

The Nos (From Me)

So let’s talk about some of the ones I turned down. A lot of the smaller local companies ended up supporting Software-as-a-Service (SaaS) applications – a few cloud servers and the networking behind them, and all the other bits and bobs to make that run smoothly. No small task, but that’s what I’m doing in my current role, and I want to see what else is out there. Another really common trait in many of these jobs was that they were ready to hire their FIRST ops person. They’d gotten by with some setups in the cloud, and now wanted to hire someone dedicated to the task, to save money & grow. I don’t want to be anyone’s first ops hire again – I was at my current role, and while I know more than I did then, the pressure is too much. Possibly BECAUSE I know more now: I know more of how it can go wrong!

  • A friend of a friend reached out – we’ve talked about jobs before – wanting their first ops person, as described above. I did turn him down (no interviews yet, so who knows if I’d even have made it to the point of an offer) and suggested that instead of one ops person, they hire two. A hard sell, I know, but one that would create a collaborative working environment, scale for much longer than just one person, and ease the inevitable crispiness of the lone ops engineer.
  • Another place made a device and app that monitored and controlled kids’ internet and computer usage. Fuck your surveillance software, I will never take a paycheck for that. That kind of shit KILLS QUEER KIDS.
  • Another place did dynamic ad insertion on all-“free” tv. No thanks, that’s another shade of data gathering/selling (who’s paying for all this bandwidth??) and who knows, maybe/probably more insidious shit.
  • Another company’s hiring had seemed good, if corporate, until I talked with my technical interviewer, who wasn’t on the team but had been serving in the role I was interviewing for on an interim basis. This person was stretched very thin, I am guessing, but hadn’t prepared for my interview, which really is not acceptable. He had no prepared questions, and instead seemed to pick pieces of my resume to quiz me on edge cases as he thought of them. I did ok generally; he asked me detailed questions about nginx configuration, I didn’t do well, and he said so – and yo, those docs are rough, and imo nginx config benefits from collaborative pairing because it’s tricky AND critical to get right. I was actually counting down the minutes til the interview was over. Toward the end I asked, “so would we be working closely together?” and he said “oh yes, oh yes, very closely,” and I made my decision to withdraw at that moment. Pretty sure I would have been condescended to there! So I had a bit of (perfectly professional) back and forth with the recruiter, who asked me to reconsider, but man, I can’t work with that dude.
  • Finally, the last place I withdrew from, I’m actually a bit torn on. I had a recruiter screen where I made it super clear that my coding is no better than Fine and has never really been production (though I do have a hair of Python in production at my current job), and from there the experience was quite good – initial chat, engineering manager chat, and then a technical challenge which was my favorite of any I did (some places had take-homes, some didn’t), where I submitted a chonky bash script and a Dockerfile to modernize & automate a manual process that a hypothetical team needed. It was all via GitHub. Because it was over the holidays, it took a bit for them to get back to me (this is a good thing), and by the time they did, there were a number of comments on my PR. This was also the week of the actual fucking attempted coup in the United States. I was feeling frayed. I didn’t know if I was proceeding with any of the jobs, because of a bit of silence (again – a good thing, because it means people were taking actual time off!) and a lot of fear on my part. I didn’t really enjoy what seemed like deliberately obtuse questions, but I responded, and then he asked MORE questions, and – I was really tired. I’d been applying & interviewing for jobs nonstop for nearly three months by that point, while going to my day job. So I withdrew, after learning that this person was on the team – it just seemed like he was communicating very indirectly, where it could have been a collaboration. I know that evaluating PRs for candidates is difficult. I also wish there had been expectations set around the number of back-and-forths on the PR – I do that all day at work, and it’s critical to be able to do it well and with kindness – but I didn’t know we’d be doing that with this PR, and I was honestly out of juice by that point. At the end of the day, I think I’d have been able to handle this if it had not been such a civically stressful week.
  • Neck and neck with the job I took was one at a… platform-as-a-service company I can’t really describe without giving them away. I was pretty intimidated by this entire process; they said they really wanted Ruby and/or Rust, and I just don’t have either of those (“some python?”), so I made that really clear and they still wanted to go forward with me. I thought I would be out after an initial set of technical questions – I wasn’t. I thought I’d be out after the technical interview – I wasn’t. Then I made it to the final round of interviews, but by that point I had an offer in hand from where I ended up accepting, so I withdrew. I would totally apply here again; despite a high technical bar, everyone was very kind.

Rejections

  • The first place I was rejected was for an SRE role at a monitoring company. This would have been awesome, but I was probably underqualified; they really wanted someone with more coding. I wish it didn’t feel like so many places wanted someone who was good at EVERYTHING – back-end production development AND advanced, modern operations and monitoring. That is how many of these roles felt, and as my job search and discussions proceeded, I started to make it very clear in the beginning stages that I really don’t have much of a coding background.
  • The next place I really wanted has a market that is very specific, so I don’t think I can generalize it here. Ha, I can’t even really say one of the biggest reasons I wanted to work there! Anyway, the pre-conversations went really well – lovely chats with the recruiter and the EM, where they asked about ops stuff, how ya do. Then they sent me a challenge which was a pared-down version of their known product, gave me a test suite and skeleton vars, and said: make the tests pass. And I BASICALLY did – I think 7 out of 11. I nearly made another pass, wrote a lot about another, didn’t super know how to tackle another, & had really NO idea how to make the last one pass. Honestly I felt good about it, which might sound wacky, but I think I demonstrated enough Python to show that I can be competent with reviewing and analyzing code while doing a terrific job with ops. They felt differently. This was the one big bummer.
  • Another, a cloud services/tooling place, it seemed like they were looking for jesus christ zirself and after a few interviews they turned me down. No surprise there.
  • Another, a budgeting app (sorta), was a GREAT process up until the challenge. They asked me to write a lot for the initial application, which I’m always happy to do, and then I had thoughtful, great conversations with folks. Then they gave me an ops challenge whose flavor I really did like, as it’s the kind of thing I’m good at and can do, but it was (at least) twice as much work as I think a reasonable challenge should be. I spent about fifteen hours on it over a few days, doing nothing else and depending on my husband to do ALL the cooking and cleaning and errands and stuff, which we usually share pretty equitably. It would have made a good “let’s see how far we get” pairing interview over a couple of hours. It also would have been good if they’d set a time limit or suggestion on the challenge, or told me when they’d want it back. I asked for some of those a couple of times, and was just told, “hey, however long it takes you, that’s fine,” but folks, it’s not fine! I have a job and endless other obligations. So: I did ok on it. I wrote a ton of server provisioning templating from scratch, got stuff installed, struggled a fair bit with a from-scratch Nagios setup, ended up entirely losing access to one of the four servers (lol), and sent them 5500 words on the process, describing backup plans, safeties, & what I’d do in the workplace if these kinds of things went wrong. I did not get that job, and I feel like that’s fine – it was a poorly scoped challenge, though the content was good. But I was sad about it, because I think I would have done well with a better-scoped challenge, and I really like the product.
  • Another, I was iffy on culture-wise, because it seemed like a bunch of dudes in a small, not-growing company, and they too were looking for a jesus christ-ian replacement for someone who’d departed for happier trails and, no doubt, much better pay for his skillset. Also, they had misrepresented some things to the recruiter – there were no real docs, and there wasn’t any intention of moving entirely from metal to the cloud. So I wasn’t sure, and then found out they made an offer to someone else about an hour after I had the initial chat with the EM. So that was probably for the best.

And finally…

Finally, the place that hired me! I have long been intimidated by this company, and I literally would not even have looked if not for a friend who pointed me to the listings. It said “Senior,” so I almost turned away without reading it! But I read it and thought: hmm, not only am I capable of doing these tasks as stated, I would LOVE to. So I sent off my things the week before Thanksgiving, and then there was another delay over Christmas and NYE. There was a recruiter conversation, then a conversation with the Engineering Manager I’d be reporting to, and then a technical conversation that felt kind and even informal, and THEN I had my final interview on the first business day of 2021. I took that day off work (it felt weird – I really like to be there for planning and for that lovely first-day, potential-energy feeling) and the interviews went great. I talked with a data center sourcing manager; a VP, about the values of the organization; someone for a communication & collaboration interview, who reminded me SO MUCH of a sweet friend of mine that I felt immediately at ease and did really well; and finally, for a technical background & project work interview, a person from a team that works in the data centers themselves.

I cannot tell you how honestly easy and lovely this entire process was. I had GREAT conversations with everyone, and was comfortable enough to come up with the right stories and data from my past experience. HOWEVER, these were not casual conversations; each person had a prescribed series of questions they needed to ask. I cannot TELL you how much I appreciate this kind of process and planning and effort. A week later, they asked for my references, and after each call my reference told me that the EM was very excited about me! A few days after that, I received an offer, and I spent a day in a VERY CHALLENGING haze while I waited for the offer to finalize before giving notice at work and having all of those hard conversations.

So! I am halfway through my two weeks’ notice at my current job, and looking forward to three weeks off between, and then starting the new job at the end of February. I will be working on provisioning software for bare metal servers as managed entirely by the company, and I am so excited and so just… relieved, and lucky, that it feels like I have really found a niche which fits me and my interests. Also, omg, no on-call rotation!! Whaaaaaat!!

I hope this was useful for anyone trying to get a scope on the tech job market! Hmu if you have any questions or thoughts. Thanks for reading!

Local Politics: Iannarone and Raiford for Portland, OR Mayor 2020

And now for something completely different. I have gone “off topic” a few times before on the blog, but obviously it’s mostly technical in nature around here. Today I’d like to talk about something incredibly important: the upcoming Portland mayoral election, a couple of months from now.

Vote for Sarah Iannarone. Not because I think she’s a better candidate than Teressa Raiford, but because she is on the ballot. Think of the voters we can pull off of Ted Wheeler – they will either vote for Iannarone or Raiford. There are no swing voters between Wheeler and Raiford, there are just Wheeler voters and Iannarone/Raiford voters. I believe in being able to vote your actual truth for people that represent you, and I also know that Wheeler has gotten far too many of my community maimed and killed, and he has GOT. TO. GO.

Raiford is amazing. She founded and runs Don’t Shoot PDX, which provides legal support for families affected by gun violence. Her entire pedigree is fantastic. I want to see Raiford in local politics for as long as she’ll have us.

However, the election needs to go to Iannarone. She has utterly endless, DETAILED policy, which Raiford’s handful of campaign sites lack. And on the generalities of Raiford’s sites – advocacy for public education, transportation, transparency, labor rights – Iannarone has practically the same POV, except that a) she has details about every level of these policies and how to institute them, and b) she is on the ballot.

That’s really the critical piece here. Teressa is great! But Sarah is ALSO GREAT, and she is on the ballot. Sarah is on the ballot. Sarah has a genuine shot at being able to capitalize on how much a tremendous, continuous, and DELIBERATE failure Wheeler has been in his role as police commissioner and mayor.

The thing is, Sarah is on the ballot. She’s the one who can win. We must must MUST vote to change our voting structure to Ranked Choice voting rather than first-past-the-post. Until we do however, if it’s Wheeler vs Iannarone, then Sarah might get over 50%. She’s an outside candidate not entrenched in local government already & not incredibly, densely backed by enormously moneyed interests. If it’s Wheeler vs Iannarone vs Raiford, Wheeler will get the same number of votes, and Sarah and Teressa will be sharing the remaining pool of votes.

Here’s the very critical article: https://medium.com/@adie.bovee/an-open-letter-regarding-portlands-upcoming-mayoral-runoff-eb31e2624181, posted about Iannarone’s involvement with Don’t Shoot PDX. It appears to be based on a now-deleted tweet of Sarah’s that questioned the wisdom of running a write-in campaign, and it argues that telling a Black woman that running against Iannarone is a vote for Wheeler is itself a silencing and racist action. I don’t think I agree that saying a write-in campaign is less likely to win than one on the ballot is itself racist, but please, come to your own conclusions. The article also says, “… it doesn’t take much digging at all to learn some critical history of Iannarone’s campaign’s relationship with Don’tShootPDX,” and hey, maybe it’s out there, but I couldn’t find anything in the first two pages of results for the search term “iannarone “don’t shoot pdx””. What I found was coverage of mayoral debates which mentioned Iannarone, Raiford, and Don’t Shoot PDX.

Finally, please read what each candidate says, and decide for yourself. PLEASE read. I’ve provided handy links to ALL of the official campaign policy for each candidate. I’ve also included word counts, because I find the level of detail – and the difference – pretty remarkable.

Raiford’s platform is here, 2599 words: https://www.movingportlandforward.com/the-peoples-platform

Sarah’s platform on entirely reimagining public safety, 5933 words (be sure to click through all the “Show Full Policy” expansions): https://sarah2020.com/en/policies/rethinking-public-safety/

Sarah’s platform on transportation and the “green new deal”, 1215 words: https://sarah2020.com/en/policies/green-new-deal/

All of Sarah’s writings and platform proposals for Coronavirus response, 1726 words: https://sarah2020.com/en/policies/covid-19/

Sarah’s massive reformation ideas for monetary and economic support for marginalized and out of work members of our community, 7519 words (!!):
https://sarah2020.com/en/policies/recovery-and-resiliency/

Sarah’s platform on Housing for All, 3644 words:
https://sarah2020.com/en/policies/housing-for-all/

Sarah’s platform on vastly VASTLY transparented (not a word, but it’s fine) government, including municipal internet, FOIA request improvement, cracking open wide the voter rolls, and so so so much more, 3327 words:
https://sarah2020.com/en/policies/good-government/

I have looked and looked, and asked staunch advocates for Raiford, for similar policy plans from her, and I just haven’t been able to find them.

Thanks for reading.

An update on the De-Google

So a few months ago, I got a bug to get off Google, so I want to talk a little about how that’s gone! Lots of progress, not done yet.

Table of Contents:
Fastmail
Calendar
Maps
Photos
Drive
Android

First of all, the very positive. FastMail has been an absolute and complete delight. I was very skeptical that I would love an email client more than Gmail, because Gmail’s search and apps are great, and I’ve been using it since… 2002 or 2003 or 2004? Since it was in beta, which was a long time ago, and I don’t really care to log in to find out! THAT SAID. FastMail is great. It is absolutely instant in its snappiness and rich in customizability, its support is fantastic, their docs are great, and the product is just a pleasure.

One of the most wonderful parts about it, aside from how gosh darn FAST it is, is the auto-expire settings you can create for a folder. For example – and I know everybody has emails like these – I have a Twitch folder into which all messages announcing that so-and-so has gone live on their channel get filtered (the filtering is better than Gmail’s, though I can’t quite discern in what way; it may be my imagination). I see them once, and then I never look at them again. If I’m available, I’ll go check on the stream. But regardless, after I’ve seen practically just the SUBJECT of the email, I never, ever need to look at it again. So I’ve set a 30-day deletion rule on that folder! I have one for Twitch, one for GitHub emails, and I’m going to set one up for Meetup too. This basically means that the email that doesn’t get deleted is stuff I generally want to keep, and that the spammier (but still desired) stuff will never contribute demonstrably to the space I’m paying for.

The aliases are great. I know folks who either never went to Google or who run their own email are familiar with this, but aside from adding +whatever to the end of my regular email, I’d never gotten to experience it before, since I’m now receiving email through my actual domain. So I have name@fastmail dot com, but I also have:
* “rachel@”, which I can give out for personal things, to people individually
* “subs@”, for comic subscriptions, newsletters, and other things that are not going to be personal to me
* “business@”, replacing my former dedicated Gmail address for all business things, Amazon orders, transactional emails of all kinds, and finally
* “junk@”, for true junk mail. Hilton gets this; anyone who claims they HAVE TO have an email in order to proceed gets junk, etc.

What the aliases have meant, then, is that I don’t need to maintain multiple accounts. Things get filtered really beautifully and immediately. Everything I need is right there.

Which brings me to the next item: Calendar! FastMail also has a calendar included, because I think Outlook and Google have made that standard in any email offering. The calendar… is fine. It is just ok. It, like… mostly integrates with Google Calendar. I think it re-sends a given invite to everybody if you add anyone to the list. Its defaults seem weird – sometimes a new event lands at 12am and sometimes at the time you clicked on. However, it is usable and a perfectly fine replacement for Google Calendar, so it’s enough, and I have moved completely off of GCal. But GCal is just so… invested-in, and it shows, and I miss its UI.

And so is Google Maps. I have tried to switch over to OsmAnd, a mobile app based on OpenStreetMap, which – let’s be very honest about how hard these problems are – does an admirable job. But GMaps is also incredibly heavily invested-in, and this is one where I really do feel it is not quite usable enough for me. It’s fine around town, where I know where I’m going and am happy to find info in other ways, but if I need complete directions, I still pull out Google Maps, because it’s incredibly reliable. I know this is an important one to stop using, too, so if anyone has tips on other OpenStreetMap apps I could use – I’m even happy to pay – please hmu in the comments.

Ah, and Google Drive. It’s funny, I never even felt terribly reliant on GDrive, and yet it is the one thing I’ve almost entirely put off doing. It’s just going to be such a slog, to pull it all down and set up all the Stuff to put it all elsewhere. But I need to do this, so I’m literally going to set a reminder right now to take the following steps:
* Set my Linux machine up on a job to pull it all down. I imagine this will be a zip file, god help me if I have to do it one by one. (if this is the case I will look for a third party tool)
* ADDITIONALLY get all my wedding photos onto… oh shoot. My Windows desktop. Ah well, this will be a good opportunity to interact with the AWS CLI from Powershell, something it would be great to get to know a bit better (for fun).
* Create an S3 bucket on my Amazon account, and then probably just make a job to push it all up and lock it tf down.
I can see all of this becoming an Automation Project, which sounds fun but which also makes me nervous, because there’s nothing that makes me put a project off like “gotta do it the RIGHT way,” so I’ll probably just roll through the gui and, other than the loop to upload all the stuff, it will all be pretty manual and nonrepeatable. I think that’s basically fine. If I do script the upload half, it would look something like the sketch below.
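A minimal sketch of that S3 half with the AWS CLI – the bucket name and local path here are made up, and the “lock it tf down” part is the public-access block:

aws s3 mb s3://my-drive-archive                    # create the bucket (name is hypothetical)
aws s3api put-public-access-block \
    --bucket my-drive-archive \
    --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
aws s3 sync ~/drive-export s3://my-drive-archive/  # recursive upload; this is the “for loop”, done for me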

Google Photos is GREAT. GRRRRREEAT. I really wish it weren’t quite so good. Here’s a note to go check out auto-sync to Flickr, as I’ve been pretty pleased with their open source support and ethic these last few years, and it just seems high quality and workable. I think transferring those photos over will be a challenge, and not one I’m super likely to prioritize at the moment. But if you’ve done this specifically, please let me know how that has gone!

Now we get to the truly difficult stuff. I’m probably never getting off Android, and I rely too much on so many apps (read: the google play store) that are unlikely to ever be supported in smaller open source mobile OSs. Ultimately I don’t want to be an iconoclast in this stuff! I’m willing to make many changes, but just kissing goodbye to everything I know in my phone… is just going to be too far, for me. I spend all day in Twitter, my email, some games, Authy, and Podcast Addict (a fabulous podcast app maintained by ONE PERSON who takes bug reports and has a Patreon which is highly deserving of your bucks, if you like to listen to podcasts on Android!).

This is getting rather long, but I think that’s more or less what I wanted to say! Just like last time I’d love to hear from you if you’ve done this too. Cheers!

The De-Google

Hi folks, so, it has been a bit, but I’ve changed my personal email setup and this would be a good place to talk about why, and how I’m doing so.

First, I’ve got a new email through FastMail. I’ve set up my DNS through them and through WordPress.com so that FastMail can accept and send email on my behalf. For $5/mo you get access to as many aliases as you could want, so I have a few of them going to my email, all included.

So I figure step one is moving all my business/subscription emails over to their respective aliases. Well, step 1 (step 0?) was actually setting up the DNS, but I’m not going to go into that, because getting mail from my domain into Fastmail was all extremely discoverable. If we’re calling everything post-setup step 1, then: I’ve started in on moving all my emails to the new addresses. Amazon, Steam, Kickstarter, and myriad others are now all pointing to the new spot.

Next will be actually following their migration guide. This will involve pulling all my contacts and calendar items over to Fastmail. I’ll also need to figure out how I want to handle my storage in Google Drive, which I think is going to be Amazon S3 – I’m not wild about Bezos, but of all the cloud providers I’m most familiar with AWS (slash, sorry all, totally in love), and GB-months are extremely cheap on S3.

I’m doing this because I think it’s important to pay for the technology that is meaningful and useful to you, if you can, and the consequence of not doing so is that Google/Facebook/Amazon (I know) have yet another lump of data to sell to someone, about you and people like you. I want to opt out, and I’m technically savvy enough to do so*. I just think it’s, like, beyond ironic to take “don’t be evil” out of the organizational credo. Project Dragonfly and D&I lipservice and condoned internal sexual assault… I just don’t need to be in their web any longer.

What’s going to be hard, though? Many things.
One, I’ll be paying $5/mo for.. ever. Unless I start hosting my own, which I will never do. It feels like a second marriage, signing up to HAVE TO pay this, forever. I KNOW it is worth it, and five bucks a month is NOT going to make a dent in my spending. But it’s a long-term commitment, and it’s wild to me, for some reason.
Two, turning off location for gmaps and not using Maps at all is.. probably going to take me some time. They’ve poured a lot of money into making it a beautiful, usable map interface. It’s so good. Gah. I’m pre-missing it.
Three, actually getting all the right stuff to go to my new email (and aliases) is really going to be a thing. I will be tidying up pieces of this for a year, I estimate.
Four, and finally, convincing all the individual humans I know to email me at the new address is going to be an absolutely serious pain. I changed my email about eight years ago, when my original gmail account, rachelkelly@gmail.com, got so overflowed with other Rachel Kellys’ valid emails that I set a permanent “vacation responder” on anything that comes in, saying “you should try to contact me in other ways,” which applies to people who know me AND to people who know the other Rachel Kellys. Then I created a personal email and a business(/junk) email, both in Google space. This separation has worked well. But I recall family giving me grief for changing my email, and not Getting it entirely for some time – this shouldn’t be so hard on others, but a) companies like Google absolutely have a vested interest in it being so, and b) email is an old protocol, yo, and the evolutions that have happened with email are absolutely the result of an actual crapload of work and enhancement over existing supercomplexity.

So, I’m starting out. I want to get off Google’s grid, in so much as I can. I’ll let you know how it’s gone in a bit.

* you shouldn’t need a ton of technical savvy to de-google (there are guides), but it probably doesn’t hurt

Mongo migration

For the past few months I’ve been at a terrific job, doing devops at a small SaaS company. Real quick, SaaS means “Software as a Service” & refers to companies that have a webapp that they sell access to and/or set up a version of for their customers. There are a lot of challenges with doing devops for a company like this, trying to find the balance between the heavyweight solutions and the latest and greatest to land on what’s right for us, all the while (personally speaking) doing a LOT of learning on the topic. That’s not to say that heavyweight versus the latest & greatest are opposed; there are a few more weights on that spinning disk, not the least of which is “what we were doing before was …”.

So what I’ve been working on for the last few weeks, somewhere between the old solution & the new hotness, has been a Mongo problem. We deal in data that must be scrubbed before we analyze it. The way that works is that the host captures data, scrubs ALL of it there, and sends it on to our long-term storage database; all local data on that host is then removed after a couple days. What we’ll do with all of this in five, ten years will hopefully be the subject of another post, but for now we are only dealing with about 30GB of data in the long-term storage DB, collected over the last couple years. Let’s call that “Storeo,” and the hosts that it comes from “partner databases,” which is true enough.

We’ve developed a couple of schemas for Storeo, and we only upgrade our partners from one to the next with code releases. So we have a couple old versions of Storeo kicking around. The next piece of this story is that we have an analytics dashboard set up for each partner, which pulls from Storeo, based on a domain field in the data we get from each partner. There’s one for each version of Storeo that they (and we) have to refer to, which means multiple dashboards just to get all the info! So that’s foolish, yeah? As a result, a previous engineer wrote a Mongo migration script to migrate all data from version 1 to 2, and then from version 2 to 3, the current version. So there are two steps to this – first, to migrate all the legacy data up to the current version so everything can be analyzed in the same way, and second, to do this regularly so even if partners are using older versions, we roll that data up so there is ONE source of truth for all their data.

As happens occasionally, no one can quite remember how I got this project, but it’s been a ride. Mostly good, occasionally “how the hell does Mongo even work?”. Some of the problems I’ve gone through have been of a Mongo nature, some of them of a sysadmin nature, some of them just basic DBA. Many of these steps might make you scream, but I’m cataloguing them because I want to get down everything I’ve done and learned. When you are self-taught, your education comes in fits and starts, and in no particular (and sometimes infuriating) order. So I’m going to do my best to show you all the things I did wrong, too.

Problem 1 – Where to Test

I wanted to test the migration locally, not on the production Storeo server, which continues to receive data from all our partner databases. First, I fired up the mongodump docs and tried that. Well, I nearly immediately ran out of room, and deleted the dump/ directory along with its contents. When I looked around with df -h /, a command which shows you disk usage on the root filesystem, human-readable, the output showed there were only a couple gigs left. Well, I knew that dumping a 15GB database wasn’t going to work locally. So I investigated a lot of other options, like sending the mongodump to another server (technically possible), or SSHing into the server but sending all dumped data to my local machine, which has plenty of space on it. This probably took a couple days of investigation between other tasks.

None of this really panned out (but I still think it should have), and my boss let me know that there’s a 300GB volume attached to Storeo. I said, wait, I didn’t see that, I looked for something like that – and they gently let me know not to give df any arguments if I want to see all the disks mounted on a server. With that, a df -h showed me the 300GB volume, mounted on /var/lib! Excellent. On a practical note, it’s extremely sensible to have all the data for your application stored on a volume rather than on some enormously provisioned server. When you use AWS, one volume is much the same as the next, so putting databases on their own volumes is pretty sensible. Keep your basic server’s disk very bare bones, and put more complex stuff on modular disks that you can move around if you need to.
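For posterity, the lesson in sketch form:

df -h /    # only the root filesystem – all I had been looking at
df -h      # every mounted filesystem, which is where that 300GB volume on /var/lib shows up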

So with that!! I made a directory for myself there, separate from the production stuff, confirmed that mongodump/mongorestore do NOT interrupt read/write operations, and made mongodumps of versions 1, 2, and 3. This took.. maybe an hour. Then, because they were still quite large (Mongo is very jealous of disk space), I tarballed & gzipped them down to half a gig or so. We use magic-wormhole all the time at work (available with a quick pip install magic-wormhole [assuming you have Python and pip installed {but it doesn’t have to be just a Python thing, just like I use ag and that’s a super Perl-y tool}]), so I sent these tarballs to my local machine, untarred/ungzipped them, and mongorestored them into the local copies of Storeo versions 1, 2, & 3 that I keep to run our app on my own machine. This probably, with carefulness and lots of reading, took another couple hours. At this point we’re probably a week in.
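Roughly, the round trip looked like this – a sketch from memory, with illustrative paths and names:

# on the server
mongodump -d storeo1 -o /var/lib/rachel/storeo1-dump        # dump just the one database
tar czf storeo1-dump.tar.gz -C /var/lib/rachel storeo1-dump # tarball & gzip it
wormhole send storeo1-dump.tar.gz                           # prints a code phrase

# on my local machine
wormhole receive                                            # paste the code phrase when prompted
tar xzf storeo1-dump.tar.gz
mongorestore -d storeo1 storeo1-dump/storeo1                # restore into my local copy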

Problem 2 – How to Test

At this point, I finally started testing the migration itself, since everything was a safe copy and totally destructible. I also retained the tarballs in case I ended up wanting to drop a database or fiddle with it in some unrecoverable way. I took a count of the documents being migrated, and of the space taken up by each DB (which was different than on prod – I thought until this week that those sizes should be constant from prod to mongodump to tarball to mongorestore, but that’s not true – apparently most databases are wiggly with their sizing). The migration script is a javascript script (how do you even say that) that you feed into mongo like so: mongo migration1-to-2.js. Within it you define dbSource and dbTarget. The source, in this case, is version 1 of Storeo, and the target is version 2. Each of these is a distinct database managed by Mongo. With great trepidation, I did iiiit. Ok, I’ve left a piece out. I, um, didn’t know how to run JS. Googling said “oh just give the path to the browser!” so I did and, uh – that didn’t work. You may be saying “Duh.” Look, I’ve never done any front-end at all, and have never touched javascript outside that Codecademy series I did on here a couple years back. With my tail between my legs I asked my boss again, & was told about the above: just mongo filename.js.

The script took three hours!! Gah! So I ran the next one, which took SEVEN (since it contained everything from the first one, too) and required regular attention to the ssh session so I didn’t lose the process (don’t worry, linux-loving friends, I’ll get there, just keep reading). These two migrations took two business days. At this point, we started talking to the team who manages the data analysis dashboards for our partners about some of the complexities. Because a) this isn’t a tool from Mongo, so there are no public docs on it, and b) you can only test Storeo performance after the data has been scrubbed and sent, even locally, we decided to set up a few demo servers pointed at test versions of the database.

Remember the volume attached to Storeo on production? Whoo! I logged onto Storeo and learned a ton more about mongodump & mongorestore, and made teststoreo1, teststoreo2, and teststoreo3 – exact mongodump/restore copies of versions 1, 2 & 3 of Storeo. Their sizes, again, were different, but we’ve learned that that’s ok! Mongo has a lot of guarantees; space management isn’t one of them, so pack extra disk and we’ll be fine. This took a lot of googling and careful testing, because the last thing I wanted to do was mongorestore back into the place I’d mongodumped from – at the time I wasn’t sure whether mongorestore overwrites the target entirely, and I wanted to be cautious about potential data loss. So: make the directory, and mongodump into it while specifying the database. Then restore into a new database (with the same name as the directory you’ve just made – this isn’t mandatory but made it easier to trace) while feeding it the path where the mongodump lives.

mkdir teststoreo1 # make the directory
mongodump -d storeo1 -o teststoreo1/ # dump the database named storeo1 into the dir we just made (-o sets the output path)
... # this takes some time, depending of course on the size
mongorestore -d teststoreo1 teststoreo1/storeo1 # there could be a dump/ in front of this end path

So after doing this for the other two Storeo databases as well, a show dbs command in the Mongo shell outputs all three production Storeos, as well as all three test Storeos. This meant we were in a good place to do some final testing. There were a few more meetings assessing risk and the complexity of all the pieces of our infrastructure that touch Storeo, how you do. Because the function of Storeo is to continually take in stripped data, I had to ensure that we weren’t going to lose information sent during the migration. Because it’s not an officially supported tool but instead something we wrote in-house, and I hadn’t been able to find a tool that moves data from one mongo DB to another, it’s hard to know what will and won’t impact production. So I set up one of our demo servers to send its stripped data to teststoreo1, and then kicked off the migration from teststoreo1 to teststoreo2 to make sure there was no data loss. On that demo server, while the migration was migratin’, I made a bunch of new dummy data that I’d be able to trace back to this demo server. A few hours later, when the 1-to-2 migration was complete, sure enough there were a handful of documents in teststoreo1 that were new – they’d been held back rather than migrated mid-flight, and nothing was lost! With this, I was very happy with the migration script.
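The check itself was just document counts on both sides – something like this (collection names match the script below; the exact invocation is from memory):

mongo teststoreo1 --quiet --eval 'db.collection_2.count()'   # source: should hold only the new, post-kickoff docs
mongo teststoreo2 --quiet --eval 'db.collection_2.count()'   # target: should hold everything created before kickoff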

So I kicked off the following script with mongo migrate1-2.js, suspended the process with ctrl-z, and put it in the background (after identifying it as job 1) with bg %1, so it wouldn’t be interrupted by my leaving the session (see?)..

'use strict';

var dbSource = connect("localhost/storeo1");
var dbTarget = connect("localhost/storeo2");

// The migration process could take so long that new documents may be created
// while the script is still running. We will move only the ones created
// before the start of the process
var now = new ISODate();

dbSource.collection_1.find().forEach(function(elem){
    elem.schemaVersion = 2; // this means each element is given the NEW schema version
    dbTarget.collection_1.insert(elem);
});

dbSource.collection_2.find({createTime: {$lt: now}}).forEach(function(elem){
    elem.schemaVersion = 2;
    dbTarget.collection_2.insert(elem);
});

dbSource.collection_3.find({timestamp: {$lt: now}}).forEach(function(elem){
    elem.schemaVersion = 2;
    dbTarget.collection_3.insert(elem);
});


dbSource.collection_1.remove({}); // this collection did not have a timestamp
dbSource.collection_2.remove({createTime: {$lt: now}});
dbSource.collection_3.remove({timestamp: {$lt: now}});

The second script was the same, but with dbSource and dbTarget defined as storeo2 and storeo3, respectively. As with the testing, the first one took about three hours; the second, seven. With each one, I kicked it off, put it in the background, then checked on it… later. Because it’d been backgrounded (that’s a verb, sure), it wasn’t quiiiite possible to tell when it was done. That could be fixed with some kind of output at the end of the script, but that’s not how I did it!
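If I were doing it again, I’d probably skip the ctrl-z dance and launch with nohup plus a log file, so the job shrugs off a dropped session AND tells me when it’s done – a sketch, assuming a print() gets added at the end of the script:

nohup mongo migrate2-3.js > migrate2-3.log 2>&1 &   # immune to hangup from the start; all output goes to the log
tail -f migrate2-3.log                              # watch for the script's final print() to know it finished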

Then I set up a lil cron job at the end to regularly move data from 1 to 2, and once that had run for the first time, I set up a second cron job to move it from 2 to 3.
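The crontab entries were nothing fancy – something like this, with the schedule and paths being illustrative, staggered so the 1-to-2 run finishes before 2-to-3 begins:

# m h dom mon dow   command
0 1 * * *   mongo /opt/storeo/migrate1-2.js >> /var/log/storeo-migrations.log 2>&1
0 4 * * *   mongo /opt/storeo/migrate2-3.js >> /var/log/storeo-migrations.log 2>&1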

Who wants to talk about Mongo????????

Exploring Dockerfiles

I’d like to continue the previous entry on Docker a little further. Last time we talked about the installation process & a little more, so this time we’re going to talk about the next part of getting started with Docker – writing a Dockerfile.

Here’s what we talked about last time, and with one odd little exception (why did I promise to talk about load testing…) we’re going to cover all these things!

So, next steps, make the container persistent – it isn’t yet, and play around with Dockerfiles, and just do a little more spying on the produced container itself & probably try to do some babby’s frist load testing things in there & spy on the container as a process without the box & all its processes within!

First let’s take a look at the Docker process we created last time. Docker commands tend to resemble familiar low-level Linux commands, so just like you’d use ps to look at the processes running at any given time on your machine, you can use docker ps to see all the containers Docker is managing at any given time. If you followed along last time, you’ll have some containers that have exited and which you no longer have access to – each time you run docker run -it … bash you get a new one. But the old ones are still there! The all flag will show us these Exited boxes: docker ps -a.

rachel $ docker ps -a
CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS                     PORTS               NAMES
b5de9583d7b3        fedora                     "bash"                   10 minutes ago      Exited (0) 3 seconds ago                       pedantic_morse
35192bfa05d4        images/cowsay-dockerfile   "/usr/games/cowsay *P"   2 hours ago         Exited (0) 2 hours ago                         gigantic_goldberg
a0e40d55125a        images/cowsayimage         "/usr/games/cowsay 'D"   3 hours ago         Exited (0) 3 hours ago                         jovial_mcnulty
d32381833772        debian                     "bash"                   3 hours ago         Exited (0) 3 hours ago                         cowsay

You’ll notice a few things: first, that the names are all a mix of adjective_noun, except one – the cowsay container examples are from the excellent Using Docker, where I’ve gained a lot of my recent Docker information. Their statuses are all Exited. Some of the container-specific commands are similar to the init.d service commands, like start, stop, and rm, so let’s start one of the containers in that list up there. The container we’re going to start up is similar to the one we made before & is Fedora, though it is true that I only made it ~10m ago!

docker start pedantic_morse

So now the output of docker ps includes the container we just started. So how do we keep it? We commit it, just like with Git! Replace pedantic_morse with whatever name yours has been assigned beneath the NAMES column.

rachel $ docker commit pedantic_morse images/morse
sha256:b398fe28d7fd26a52e0947fc8eebb7614b8a8d6d19a5332359df167c9296c04f

So what we’ve done here is create an image, from which we can create containers. images/morse is the image; pedantic_morse is the container we crafted it from. Every time we run the image images/morse, it creates a new container, so at this point our changes still aren’t persistent in ONE long-lived box; HOWEVER, we can use this image to perform one-offs.
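A “one-off” here just means a throwaway container run from the image, for instance:

docker run -it images/morse bash        # a fresh container from our committed image, every time
docker run --rm -it images/morse bash   # --rm deletes the container on exit, so no Exited clutter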

Clearly we’re not getting into the strength of Docker, yet. So now it’s time for a very basic Dockerfile. Just like Vagrantfile and Procfile & probably a few other similarly intended setup files, the D in Dockerfile is capitalized and there’s no extension to it, because remember – Linux doesn’t care about file extensions!

The main piece to know with Dockerfiles is that their syntax can be as minimal as you like, and personally I recommend keeping them non-complex – major structural pieces only, and insert kickoff scripts or use some config management in the container itself for anything much more complicated. I reserve the right to change my mind on this later! And this is also more for next time to learn. But the way it looks, the RUN command will run any shell command you put in it, though if you need anything more complex, the contents become a lot more murky, in my opinion. Simple is better than complex, but complex is better than complicated, so let’s do what we need to here.

For posterity and a simplistic example, here’s the first Dockerfile I ever wrote. (ed note: I trimmed this down because each instruction in a Dockerfile creates a new filesystem layer – try to consolidate Dockerfile lines as much as possible)

FROM fedora:23
RUN /bin/bash
RUN echo "the dockerfile took!"

RUN dnf install -y wget tar man

MAINTAINER Rachel!

The output of this, which is a bit long to post, pulls down version 23 of Fedora, runs /bin/bash (which, I’ve since learned, just starts a shell that immediately exits – it does not set the shell for the following commands), prints “the dockerfile took!” to stdout during the build, and then installs those three packages. I’m unsure why some of those aren’t present in a base Fedora image – base images are kept deliberately minimal – but it doesn’t appear to be related to what I’m working on in this blog post, so we’ll leave it be for now.
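To actually get that output, you build from the directory containing the Dockerfile – the tag name here is just my choice:

docker build -t images/fedora-example .     # -t tags the resulting image; . means “the Dockerfile is right here”
docker run -it images/fedora-example bash   # then run containers from it, same as with the committed image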

This is about ten times longer than I thought it would be, woohoo! I hope you learned something, please please let me know if I’ve missed the mark on anything, cheers!

Tune in next time and we’ll talk about a more complicated Dockerfile, and syncing it up to… something 🙂 come back and you’ll find out what!