13 June 2013

Phone call: 'I've got this sick machine ...'

me:  well, why is it sick?

them:  yum complains about a missing signing key

me: so install the key; it is down in /etc/pki/rpm-gpg/, and rpm --import ... will do the trick

them: that directory is not there

me: who set up the machine?

them:  well, I was handed it, and ...

me: so, take a level zero backup and then clean up the machine before trying to work on it, or deploy a new one

them: well, I can't

I just got off that call, from a friend in a new job

I outlined the technical fix long ago, and I sent the caller an email with the link
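For the key step discussed on the call, a minimal sketch might look like the following. It is not the write-up I linked, only an illustration: the helper name list_signing_keys is made up for this note, and the RPM-GPG-KEY-* file naming is an assumption based on the usual convention. The point is to check that the directory exists before trying to import anything from it.

```shell
#!/bin/sh
# Sketch only. KEY_DIR is the stock location named on the call;
# list_signing_keys is a hypothetical helper, not part of rpm or yum.
KEY_DIR="${KEY_DIR:-/etc/pki/rpm-gpg}"

# Print the candidate key files in a directory, one per line.
# Returns 1 (with a message on stderr) if the directory is missing,
# which is exactly the symptom the caller reported.
list_signing_keys() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing key directory: $dir" >&2
        return 1
    fi
    # Assumption: keys follow the conventional RPM-GPG-KEY-* naming.
    for f in "$dir"/RPM-GPG-KEY-*; do
        if [ -f "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}

# Possible usage, as root, on the real machine:
#   list_signing_keys "$KEY_DIR" | while read -r key; do
#       rpm --import "$key"
#   done
#   rpm -q gpg-pubkey    # list the keys rpm now knows about
```

If the helper reports a missing directory, stop there: that is the sign the machine was not built the way one expects, and importing a key fetched from somewhere else only papers over it.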

BUT: fixing the mindset inside the caller's head is harder: do not try to work in an undefined (here: broken) environment

The caller also has a problem in their work-flow process. A fix has to be done, and sooner is probably better than later; a broken machine in production is 'technical debt', pure and simple.  Fundamental expectations are not met.  Binary partitioning will not isolate the problems well, because more than one piece is probably broken.  The machine will break again, and a perception may well form that the caller is the problem, rather than the broken environment they were handed

Be sure to make a note to yourself to also address the broken process that permitted that machine to escape into production.  Dollars to doughnuts, there is 'more than one roach' lurking.  I'll cover a bet that there are no tested backups in that shop