=============================================================================
Section 9: All about file permissions...
=============================================================================

From: jdell@maggie.mit.edu (John Ellithorpe)
Organization: Massachusetts Institute of Technology

Here's a pretty bad story. I wanted to have root use tcsh instead of the Bourne shell. So I decided to copy tcsh to /usr/local/bin. I created the file, /etc/shells, and put in /usr/local/bin/tcsh, along with /bin/sh and /bin/csh.

All seems fine, so I used the chsh command and changed root's shell to /usr/local/bin/tcsh. So I logged out and tried to log back in. Only to find out that I couldn't get back in. Every time I tried to log in, I only got the statement:

    /usr/local/bin/tcsh: permission denied!

I instantly realized what I had done. I forgot to check that tcsh has execute privileges and I couldn't get in as root!

After about 30 minutes of getting mad at myself, I finally figured out to just bring the system down to single-user mode, which ONLY uses the /bin/sh, thankfully, and edited the password file back to /bin/sh.

-----------------------------------------------------------------------------
From: djd@csg.cs.reading.ac.uk (David J Dawkins)
Organization: University of Reading

About a year back, I was looking through /etc and found that a few system files had world write permission. Gasping with horror, I went to put it right with something like

    dipshit# chmod -r 664 /etc/*

(I know, I know, goddamnit!.. now)

Everything was OK for about two to three weeks, then the machine went down for some reason (other than the obvious). Well, I expect that you can imagine the result. The booting procedure was unable to run fsck, so barfed and mounted the file systems read-only, and bunged me into single-user mode. Dumb expression..gradual realisation..cold sweat. Of course, now I can't do a frigging chmod +x on anything because it's all read-only.
In fact I can't run anything that isn't part of sh. Wedgerama. Hysteria time. Consider reformatting disks. All sorts of crap ideas. Headless chicken scene. Confession. "You did WHAT??!!" Much forehead slapping, solemn oaths and floor pacing.

Luckily, we have a local MegaUnixGenius who, having sat puzzled for an hour or more, decided to boot from a cdrom and take things from there. He fixed it.

My boss, totally amazed at the fix I'd got the system into, luckily saw the funny side of it. I didn't. Even though at that stage, I didn't know much about unix/suns/booting/admin, I did actually know enough to NOT use a command like the one above. Don't ask. Must be the drugs.

BTW, if my future employer _is_ reading this (like they say he/she might), then I have certainly learned tonnes of stuff in the last year, especially having had to set up a complete Sun system, fix local problems, etc :-)

Anyone else got a tale of SGS (Spontaneous Gross Stupidity) ?

-----------------------------------------------------------------------------
From: mfraioli@grebyn.com (Marc Fraioli)
Organization: Grebyn Timesharing

I was happily churning along developing something on a Sun workstation, and was getting a number of annoying permission denieds from trying to write into a directory hierarchy that I didn't own. Getting tired of that, I decided to set the permissions on that subtree to 777 while I was working, so I wouldn't have to worry about it.

Someone had recently told me that rather than using plain "su", it was good to use "su -", but the implications had not yet sunk in. (You can probably see where this is going already, but I'll go to the bitter end.) Anyway, I cd'd to where I wanted to be, the top of my subtree, and did "su -". Then I did "chmod -R 777 .". I then started to wonder why it was taking so damn long when there were only about 45 files in 20 directories under where I (thought) I was.
Well, needless to say, su - simulates a real login, and had put me into root's home directory, /, so I was proceeding to set file permissions for the whole system to wide open. I aborted it before it finished, realizing that something was wrong, but this took quite a while to straighten out.

-----------------------------------------------------------------------------
From: jerry@incc.com (Jerry Rocteur)
Organization: InCC.com Perwez Belgium

I sent one of my support guys to do an Oracle update in Madrid. As instructed he created a new user called esf and changed the files in /u/appl to owner esf, however in doing so he *must* have cocked up his find command, the command was:

    find /u/appl -user appl -exec chown esf {} \;

He rang me up to tell me there was a problem, I logged in via x25 and about 75% of files on system belonged to owner esf. VERY little worked on system.

What a mess, it took me a while and I came up with a brain wave to fix it but it really screwed up the system.

Moral: be *very* careful of find execs, get the syntax right!!!

-----------------------------------------------------------------------------
From: weave@bach.udel.edu (Ken Weaverling)
Organization: University of Delaware

A friend of mine called me up saying he no longer could log into his system. I asked him what he had done recently, and found out that he thought that all executable programs in /bin /usr/bin /etc and so on should be owned by bin, since they were all binaries! So he had chown'ed them all.

-----------------------------------------------------------------------------
From: rob@wzv.win.tue.nl (Rob J. Nauta)
Organization: None

At my previous employer, the sysadmin would create new user accounts by hand by editing the passwd file, create a home dir, put some files in it, and chown '*' and '.*' to that new user. Thus, /home/machine was also chowned ('.*' also matches '..').
It was quite handy to see who was added last, but after a while I slipped him the hint to chown '.[a-z]*' which works much better of course.

But the stories told now are more folklore than real horror. Having read 2 Stephen Kings this weekend I beg everyone to tell more interesting stories, about demons, the system clock running backwards, old files reappearing etc !

-----------------------------------------------------------------------------
From: alan@spuddy.uucp (Alan Saunders)
Organization: Spuddy's Public Usenet Domain

About inexperienced sysadmins .. One such had been on a Sun sysadmin course, and learned all about security. One of the topics was on file and group access. On his return, he decided to put what he had learned into practice, and changed the ownership of all files in /bin, /usr/bin to bin.bin!

I was called in when no one could log in to the system (of course /bin/login needs to be setuid root!)

-----------------------------------------------------------------------------
From: pete@tecc.co.uk (Pete Bentley)
Organization: T.E.C.C. Ltd, London, England

The guys next door had just got a Sun 3/360 (or some such) to host a VME-bus image processing system - none of them knew much (or cared much) about Un*x and so early on a student on loan to them got a space in the wrong place and did

    pillock# chmod -r -x ~ /*

with the same results (system in single user, refusing to run any commands or go multi-user). As it happened

a) This was a government establishment, and so the order for the QIC tapes for backups had not yet been approved, hence no backups...

b) The install script for the kernel drivers for the image processing stuff had not worked 'out of the box', and so the company had sent an engineer down to install it. I hadn't been around when he came and built their drivers, and they hadn't a clue what he had done.
So, there was no way to rebuild the drivers without another engineer call and because of (a) there were no backups of the driver... Anyway, a complete reload was therefore out of the question.

These were the days before SunOS on CD-ROM. In the end I managed to get the thing up by booting from tape, installing the miniroot into the swap partition and booting from that. This gave me a working tar and a working mount, but no chmod. Also no mt command. Also at this time very little of my Un*x experience was on Suns, so I had no idea of the layout of the distribution tape. Various experiments with dd and the non-rewinding tape device eventually found the file on the tape with a chmod I could extract. chmod +x /etc/* /bin/* /usr/bin/* on the system's existing disk was enough to make it bootable.

After that I sat the student down with a SunOS manual and let him figure out the mess and correct the permissions that had been todged all over the system...

-----------------------------------------------------------------------------
From: dvsc-a@minster.york.ac.uk
Organization: Department of Computer Science, University of York, England

I was changing the UIDs of a few users on one of our major servers, due to a clash with some machines newly connected to the net. Fine, edit /etc/passwd then chown all their files to the new UID. So, rather than just assume that all files owned by "fred" live in /home/machine/fred I did this:

    machine# find / -user old_uid -exec chown username {} \;

This was fine... except it was late at night and I was tired, and in a hurry to get home. I had six of these commands to type, and as they would take a long time I'd just let them run in the background over night.....

So, you come in the next morning and a user complains... I can't login to the 4/490 - it says "/bin/login: setgid: not owner". Okay.... naive user problem no?
    rlogin machine -l root
    /bin/login: setgid: not owner

    machine console login: root
    /bin/login: setgid: not owner

Okay - I REALLY can't get in... let's reboot single user and see what's on... this worked. /bin/login is owned (and setuid to) one of the users whose UID I changed the previous day... in fact ALL FILES in the ENTIRE filesystem are owned by this user..problem!

We `only' lost about 200 man hours through my little typing mistake.

The moral: Beware anything recursive when logged in as root!

-----------------------------------------------------------------------------
From: joslin_paul@ae.ge.com
Organization: GE Aircraft Engines

True confession time: Cron is a great way to hide your flubs.

I installed the COPS security package on a system, then set up cron to recheck the system once a month. No problem, right? Except that I had configured COPS to put the reports in /. As a security measure, COPS chmods its directory to u-rwx,w-rwx so that only the COPS owner can read the reports. The chronology was

1) Run cops. Add cops entry to root's crontab. Later that day, notice that / was 600; change it back.

2) 30 days later: get calls from users - can't log in, "No shell" error messages. Find / is 600; change it. Vaguely remember that this happened once before. The machine was a sandbox, so almost anything could have changed /.

3) 30 days later: get calls from users - can't log in, "No shell" error messages. Find / is 600; change it. Vaguely remember that this happened once before. Happen to think "cron"; notice that the only cron activity for root last night was COPS. Read COPS source and discover problem.

Moral: RTFM. Keep logs, so that you can notice patterns in your data. Don't do anything as root that you can do as a mortal.
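
The moral about keeping logs and noticing patterns can be sketched in shell. This is only an illustration, not anything COPS itself provides: dir_mode and check_mode are hypothetical helper names, and the expected mode string is whatever you decide the directory should have on your system. Run regularly from an unprivileged account, a check along these lines might have flagged the change to / the first time it happened rather than the third.

```shell
# Hypothetical helper: print the ten-character mode string of a directory,
# as shown in the first column of "ls -ld" (for example drwxr-xr-x).
dir_mode() {
    ls -ld "$1" | cut -c1-10
}

# Hypothetical check: compare a directory's current mode with the expected
# one. Prints a warning and returns non-zero when they differ.
check_mode() {
    # $1 = directory, $2 = expected mode string
    actual=$(dir_mode "$1")
    if [ "$actual" != "$2" ]; then
        echo "WARNING: $1 is $actual, expected $2"
        return 1
    fi
    return 0
}
```

For example, "check_mode / drwxr-xr-x" prints nothing while / has that mode, and prints a warning (and returns non-zero) once something has changed it. Appending the output to a dated log file gives you the pattern to notice.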
-----------------------------------------------------------------------------
From: johnd@cortex.physiol.su.oz.au (John Dodson)
Organization: Department of Physiology, University of Sydney, NSW, Australia

Some years ago when we went from Version 7 Unix on a PDP11 to a flavour of BSD on a Vax, I was working on the Vax in my home directory & came across a file that I had no permission on (I'd created it as root) so the following ensued...

    $ /bin/su -
    Password:
    # chown -R me *

mmmmm this seems to be taking a long time ! kill.

    # ls -l

the result was that I was in / after the su ! (good old V7 su used to leave you in the current directory ;-)

It took me quite a while to restore all the right ownerships to /bin /etc & /dev (especially the suid/sgid files). I'd managed to kill it before it got off the root filesystem.

-----------------------------------------------------------------------------
From: adb@geac.com (Anthony DeBoer)
Organization: Geac Computer Corporation

I was once called in to save a system where most things worked, but the main application package being used on it hung the moment you entered it (leaving the system more than a little useless for getting things done). I poked around for awhile, verified that the application's files were all present, undamaged, and had the right permissions. The folks who normally used the machine had also discovered that all was well if root tried to run it. But nothing was visibly wrong anywhere.

So, being a bit hungry by then, I took a break for supper, and about halfway through, the little voice at the back of my head that sometimes helps me said, "/dev/tty". Sure enough, somebody had chmod'ded it to 0644, and the application directed (or tried to direct, in this case) all its I/O through it rather than just using stdin/stdout like a sane normal process.

-------------------------------------------------------------------------------
*NEW*
From: mike@sojurn.lns.pa.us (Mike Sangrey)

To set the stage: We used the csh.
We were fairly new to Unix. We were developing a fairly elaborate system in ``C''.

We made some fairly harmless (most of the time) mistakes: We had ``.'' (dot) in root's PATH. (Yeah, I know, so sue me.) We had the foresight to set up a pseudo-user for our package. Certain of these programs were to run setuid as the pseudo-user; others weren't setuid and were to be run only as that pseudo-user. You know the scenario.

The problem was that sometimes during development, one of us didn't have the permission to execute a program. We frequently fell into executing things as root.

One particularly frustrating day we did something even more stupid: chmod 777 *. Then, just to make sure (of how stupid we can be) we flipped to a virtual terminal that was su'ed to root. The next command, which used the csh's history mechanism, executed a ``C'' program -- NOT the executable, mind you, the source. Believe it or not, the end effect was the same as

    cd /
    rm -fr *

Sort of reminds me of the story of a hurricane, a junk yard and the creation of a 747. Who'd a thunk it?!!

Take some inexperienced people and a powerful system; add profuse doses of frustration and wha-la! -- You have a Stephen King shell script.

-----------------------------------------------------------------------------
From: mba@controls.ccd.harris.com (Belinda Asbell)
Organization: Harris Controls

In article, JRowe@cen.ex.ac.uk (J.Rowe) writes:
>> Am I the only one to have mangled a root shell?

Probably not. I learned the hard way to be careful if messing with /etc/passwd. One day, for some reason, I couldn't login as root (pretty scary, since I knew the root passwd and hadn't changed it). Turned out that I'd somehow blitzed the first letter of /etc/passwd (vi does bizarre things sometimes). So I logged in as 'oot' and fixed it.

NEVER do a "chmod -R u-s .", especially not in /usr....
I think that "mount -o" or something similar will mount a filesystem read-write if it's come up in single-user mode and is mounted read-only.....

=============================================================================
Section 10: Depends on the machine...
=============================================================================

From: kochmar@sei.cmu.edu (John Kochmar)
Organization: The Software Engineering Institute

A long time ago, back when the Apollo 460 was around and I had just graduated from college, I had the good fortune of being one of two administrators in charge of making a cluster of 460's a part of our environment. One of the things I was tasked with was getting them onto our network.

Well, I was young, I had the manuals, and a guy from Apollo tech support was there to help. How hard could it be, right? Well, we got out the manuals, configured the system (relying heavily on the defaults), and within 2 hours, we had that puppy on the network. Life was good.

About 3 hours later, I get a phone call from a systems programmer / developer from CMU campus (the SEI is a part of CMU, and we are on their network.) He told me that if I didn't take the &%@*ing Apollo off the network, he was going to do hurtful things to me physically. Life was not so good.

As it turned out, in default mode, the Apollo answered every address request it saw, even if it was not the machine the request was for. Kind of a "hey, I'm not who you are looking for, but I'm out here in case you decide you'd rather talk to me." Apollo considered this a feature, and they took advantage of it in their OS environment.

However, one of the earlier versions of a heavily network dependent OS developed at CMU considered this a bug. The OS would issue a request, and expect only the machine it was looking for to answer it. Of course, it would assume that if it got an answer to its request, it must be the machine it expected to talk to.
It didn't look at the address of the answer it got, so if it wasn't the correct machine, most of the time the OS would hang or panic.

The outcome? Over about 3 hours time, more and more of campus was talking to our little 460, which had just enough muscle to keep up with the requests. By the time campus figured out what was going on, we had an Apollo merrily answering the network requests for hundreds of machines (the ones that were still up, that is.) The result was the part of campus that used the new OS going to hell in a bucket, one very busy Apollo 460, and one very warm ethernet.

Well, we turned off the Apollo, configured it not to chat to all of campus before putting it back on the ethernet (this time, we did it while talking with campus, making sure we didn't cause the same problems we did the last time -- we didn't have a packet monitor at the time), and campus changed their OS to look at the request response before assuming it was the correct one.

I also learned to think very carefully about default values before using them.

-----------------------------------------------------------------------------
From: dinicola@itnux2.cineca.it (Attilio Dinicola)
Organization: Laboratorio di Fisica Computazionale, INFM. Trento Italia

I was more'ing something at the system console, Ultrix OS under me! I wanted to press a ^L and, unfortunately, the nearest ^P suspended system activities: a console mode prompt appeared. So, I pressed:

    res

Thinking .. resume .. but res became restart and the system rebooted destroying all processes. Naturally, Murphy was in front of me and some batch jobs had been running for four or five days. WERE .. RUNNING!

-----------------------------------------------------------------------------
From: sam@bsu-cs.bsu.edu (B. Samuel Blanchard)
Organization: Dept. of CS Ball State University Muncie IN

kill -1 1 on an Altos SV box is not good. I pulled this one trying to show off. No more gettys appeared when users logged off.
When I went to the console, I calmly typed 0 to the Run Level request prompt. 2 would have been nice? It was my first SystemV like box, and it seemed to have such nice Berkeley commands.

A control-s on a Sequent S27 console can cause processes to hang waiting to write to the console. Unfortunately, su is one such process. No real problem since I don't blindly reboot on request ;-)

=============================================================================
Section 11: The miscellaneous collection (a.k.a. 'oops')...
=============================================================================

From: hirai@cc.swarthmore.edu (Eiji Hirai)
Organization: Information Services, Swarthmore College, Swarthmore, PA, USA

We were running system software that had a serious bug where if anyone had logged out ungracefully, the system wouldn't let any more users onto the system and users who were logged on couldn't execute any new commands. (The newest release of the software later on did fix this bug.) I had to reboot the machine to restore the system to a sane state.

I did a wall < exists, overwrite (y/n)?" ... since it was started from cron, it just read "EOF". Tried again. Read "EOF". And so on. All output went to /tmp... which was full after the file reached 90 MB!

What happened next? I'm using a SCO machine, /tmp is in my root filesystem and when trying to login, the machine said something about not being able to write login information - and threw me out again. Switched machine off. Power on, go to single user mode. Tried to login - immediately thrown out again.

I finally managed to repair the mess by booting from floppy disk, mounting (and fsck-ing) the root filesystem and cleaning /tmp/*

=============================================================================
Section 12: The morals of these stories...
=============================================================================

From: jarocki@dvorak.amd.com (John Jarocki)
Organization: Advanced Micro Devices, Inc.; Austin, Texas

- Never hand out directions on "how to" do some sysadmin task until the directions have been tested thoroughly.
- Corollary: Just because it works on one flavor of *nix says nothing about the others. '-}
- Corollary: This goes for changes to rc.local (and other such "vital" scripties).

-----------------------------------------------------------------------------
From: ericw@hobbes.amd.com (Eric Wedaa)
Organization: Advanced Micro Devices, Inc.

-NEVER use 'rm ', use rm -i ' instead.
-Do backups more often than you go to church.
-Read the backup media at least as often as you go to church.
-Set up your prompt to do a `pwd` every time you cd.
-Always do a `cd .` before doing anything.
-DOCUMENT all your changes to the system (We use a text file called /Changes)
-Don't nuke stuff you are not sure about.
-Do major changes to the system on Saturday morning so you will have all weekend to fix it.
-Have a shadow watching you when you do anything major.
-Don't do systems work on a Friday afternoon. (or any other time when you are tired and not paying attention.)

-----------------------------------------------------------------------------
From: rca@Ingres.COM (Bob Arnold)
Organization: Ask Computer Systems Inc., Ingres Division, Alameda CA 94501

1) The "man" pages don't tell you everything you need to know.
2) Don't do backups to floppies.
3) Test your backups to make sure they are readable.
4) Handle the format program (and anything else that writes directly to disk devices) like nitroglycerine.
5) Strenuously avoid systems with inadequate backup and restore programs wherever possible (thank goodness for "restore" with an "e"!).
6) If you've never done sysadmin work before, take a formal training class.
7) You get what you pay for.
8) There's no substitute for experience.
9) It's a lot less painful to learn from someone else's experience than your own (that's what this thread is about, I guess :-) )

-----------------------------------------------------------------------------
From: jimh@pacdata.uucp (Jim Harkins)
Organization: Pacific Data Products

If you appoint someone to admin your machine you better be willing to train them. If they've never had a hard disk crash on them you might want to ensure they understand hardware does stuff like that.

-----------------------------------------------------------------------------
From: dvsc-a@minster.york.ac.uk
Organization: Department of Computer Science, University of York, England

Beware anything recursive when logged in as root!

-----------------------------------------------------------------------------
From: matthews@oberon.umd.edu (Mike Matthews)
Organization: /etc/organization

*NEVER* move something important. Copy, VERIFY, and THEN delete.

-----------------------------------------------------------------------------
From: almquist@chopin.udel.edu (Squish)
Organization: Human Interface Technology Lab (on vacation)

When you are doing something BIG, type the command and reread what you've typed about 100 times to make sure it's sunk in (:

-----------------------------------------------------------------------------
*NEW*
From: Nick Sayer

If / is full, du /dev.

-----------------------------------------------------------------------------
*NEW*
From: TRIEMER@EAGLE.WESLEYAN.EDU
Organization: Wesleyan College

Never ever assume that some prepackaged script that you are running does anything right.
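
Several of the morals above (beware anything recursive as root, reread what you've typed) amount to looking before you leap. One way to do that with find, sketched here under assumptions: list the matches first, and only then run the same find with the -exec attached. preview_owned and count_owned are hypothetical helper names, not standard commands; -xdev keeps find on a single filesystem, and -user takes a login name or a numeric UID.

```shell
# Hypothetical helper: print the files under a directory that belong to a
# given owner, without changing anything. Review this list before running
# the same find with "-exec chown ..." appended.
preview_owned() {
    # $1 = directory to search, $2 = owner (login name or numeric UID)
    find "$1" -xdev -user "$2" -print
}

# Hypothetical helper: count the matches as a quick sanity check.
# It counts output lines, so a file name containing a newline is
# counted more than once.
count_owned() {
    preview_owned "$1" "$2" | wc -l | tr -d ' '
}
```

If "count_owned /u/appl appl" reports far more files than you expect, or the listing from preview_owned shows paths you did not intend to touch, stop and reread the command before anything gets chowned.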