* wildcard character
*.doc is a wildcard pattern, a simple cousin of the "regular expression"
Unix uses them more, and more flexibly.
b*r.??? matches anything that starts with b, ends with r, and has any three-character extension.
There are books on this.
All documents r through z: [r-z]*.doc
Lines that start with something or end with something...
See man regexp (but it is dense and tough going...)
Most Unix commands that search text use the same regexp structure for regular expressions
How do we use this stuff?
Applying regular expressions to files: grep (global regular expression print)
Suppose you want to find which files contain a particular pattern
grep pattern filename (or filepattern)
grep botany.css *.html
in my files produces a list of all files with the stylesheet call tag.
Echo
Echoes stuff to the screen
grep never sees the asterisks, the shell does an expansion first.
Type echo *.jpg *.html to see what grep *.jpg *.html sees
Quotes prevent the shell from expanding a regular expression before handing off to grep
grep '*.jpg' *.html
Ken was after the logo80.gif file and other logoNN.gif files. A special construct was necessary, though it took him some time to get to it:
grep "logo[0-9]*.gif" *.html in /home/httpd/html
to get all logo files
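With ls the simpler ls logo*.gif works, but not with grep: the shell reads * as "any run of characters", while grep reads it as "zero or more of the preceding character". There are twists and turns in here.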
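The difference between what the shell expands and what grep receives can be tried out safely in a scratch directory. This is only a sketch: the two .html files and their contents are made up for the illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Two hypothetical pages; only the first mentions a logoNN.gif file.
echo '<img src="logo80.gif">' > index.html
echo '<img src="photo.jpg">'  > about.html

# The shell expands *.html before the command ever runs.
expanded=$(echo *.html)
echo "grep is handed these names: $expanded"

# The pattern is quoted, so it reaches grep unexpanded.
# -l prints only the names of the files that contain a match.
matches=$(grep -l "logo[0-9]*.gif" *.html)
echo "files that mention a logo: $matches"
```

The quoted first argument is grep's pattern; the unquoted *.html is still expanded by the shell into the list of files to search.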
find files by date, length, type,...
find . -name "*.html" -print
The period refers to the current directory (it also searches subdirectories). The quotes are necessary: without them the shell expands *.html itself before find runs, and find only sees what ls would show in the current directory.
Suppose we are looking for logo80.gif using grep "logo[0-9]*.gif"
Well we can grep "logo[0-9]*.gif" `find . -name "*.html" -print`
Note the backtick in front of the find command and at the end. This returns every line that mentions a logoNN.gif file in the .html files of the current directory AND all subdirectories. Find, in the tail, produces a master list of file names (via *.html; find handles searching the current directory and subdirectories). The backticks force the shell to execute the find command first, and then all those files get searched by grep.
Note on backticks: echo date produces the word "date" on the screen. echo `date` produces the system date.
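The backtick idea can be rehearsed on a small made-up directory tree. A sketch only: the directory layout and file contents are invented, and the find output is stored in a variable first to keep the quoting readable.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub"
echo '<img src="logo80.gif">' > "$demo/top.html"
echo '<img src="logo12.gif">' > "$demo/sub/deep.html"
echo 'no images here'         > "$demo/sub/plain.html"
cd "$demo" || exit 1

# The backticks run find first; its output becomes a list of file names.
files=`find . -name "*.html" -print`

# grep then searches every file on that list, subdirectories included.
found=`grep -l "logo[0-9]*.gif" $files | sort`
echo "$found"

# Without backticks echo just repeats the word it was given.
plain=`echo date`
echo "$plain"
```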
More regular expressions:
In a man page you can type / and then a regexp to find that: /beedlebody will find the phrase beedlebody when you are in the man page.
/10.2 takes you to the first occurrence of 10.2 in the file.
In grep -i means ignore the case
cd /etc/rc.d
more rc.local
octal coding
-rw-rw-rw-
0110110110
or 0666 octal
If you want rwx for you and read and execute for everyone else:
-rwxr-xr-x
0111101101
0755
A common permission
To change a file to 0755 (an executable) you
issue the command
chmod 0755 filename
Another common combination is chmod 0711 filename (execute only for others in your group and all others). The three digits are (owner:group:others) or (user:group:others)
Another way to chmod (change mode) is chmod u+x filename (add execute privilege for the user)
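The octal digits can be built up by adding read=4, write=2 and execute=1 for each of owner, group and others. A minimal sketch on a scratch file (the file name hello is made up):

```shell
demo=$(mktemp -d)
script="$demo/hello"      # hypothetical file name
echo 'echo hello' > "$script"

# Each digit is a sum: read=4, write=2, execute=1.
owner=$((4 + 2 + 1))   # rwx
group=$((4 + 1))       # r-x
other=$((4 + 1))       # r-x
mode="0$owner$group$other"
echo "mode: $mode"

chmod "$mode" "$script"

# The first ten characters of a long listing show the result.
perms=$(ls -l "$script" | cut -c1-10)
echo "permissions: $perms"

# Symbolic form: take execute away from others, then look again.
chmod o-x "$script"
perms_after=$(ls -l "$script" | cut -c1-10)
echo "after o-x:   $perms_after"
```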
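If you
set -x
it puts the shell in trace mode. The shell then displays each command, after expansion, before it runs it.
(set -v is the verbose mode, which echoes each line as the shell reads it.)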
sh -n program
check program for syntax but do not run the program
The options for set and unset are in "man sh" not in "man set" nor "man unset"
In DOS you only have a path variable and a defined temp directory. In Linux there are a lot more environment variables. Type "env" and press enter.
The set command displays ALL variables that have been set in your shell, not just the exported ones in your environment.
USERNAME=Dana
and the USERNAME variable is set to Dana. For other programs to see this variable, if you want it to stick, you have to export it:
export USERNAME
unset USERNAME
and it is completely GONE.
To put it back:
USERNAME=Dana
export USERNAME
To blank it:
USERNAME=
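Whether a variable "sticks" for other programs can be checked by asking a child shell what it sees. A small sketch; GREETING is a made-up variable name used so that nothing real is disturbed.

```shell
# GREETING is a made-up variable name used only for this sketch.
unset GREETING
GREETING=hello

# Not exported yet: a child shell does not see it.
before=$(sh -c 'echo "${GREETING:-unset}"')

export GREETING
# Exported: the child shell inherits it.
after=$(sh -c 'echo "${GREETING:-unset}"')

unset GREETING
# Gone again, in this shell and in any new child.
gone=$(sh -c 'echo "${GREETING:-unset}"')

echo "before export: $before, after export: $after, after unset: $gone"
```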
How do you show your variables?
echo PATH
and you see the word PATH
so to reference a variable you say $PATH
echo $PATH
This is true in regular commands too ... if[$PATH =
man sh has a list of these...
$$ is PID (process ID not the medical PID)
$# is the number of positional arguments
$? is the return value of the last command run by the shell. Things that run successfully return a particular value (zero) to the shell that you can check to see that things ran.
Everything you type after a command on the command line is given a positional argument number.
The set command impacts $#. When you issue a set command with arguments it also sets the positional parameters:
set bunch of junk
echo $# results in 3
echo $1 yields bunch
echo $2 yields of
echo $3 yields junk
Suppose you need $1, $2,... $99 all set. Then the shift command shifts the list over, tossing the first n arguments (plain shift tosses one). If you shift by the whole count you toss them all: shift $# ($# is just the number of positional arguments).
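The positional parameters, shift and $? can be watched together in a few lines. A sketch only; the variable names on the left are made up to hold the values for inspection.

```shell
set bunch of junk
count=$#          # number of positional parameters
first=$1          # the first one

shift             # drop the first argument; the rest slide down
after_shift=$1    # what used to be $2
remaining=$#

shift $#          # drop everything that is left
empty=$#

true;  ok=$?      # zero means success
false; failed=$?  # non-zero means failure

echo "count=$count first=$first after_shift=$after_shift remaining=$remaining empty=$empty"
echo "ok=$ok failed=$failed"
```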
Creating a shell script: programming lite
Two files run every time you log in:
.bash_profile
.bashrc
Older shells used to run a .profile file, and the old C shell ran .login and .cshrc. .bashrc just looks for a global one that sits out in another directory. .bash_profile does a little bit, but not a whole lot. You can cat this file:
cat .bash_profile
We might want to know more than what .bash_profile tells us. Maybe we want to run who or current time or current internet bandwidth use.
man -k time
is a keyword list search for manual pages with "time" in them. Note only section one is relevant on the man -k time return, as only section one hits are keyword hits for commands you can issue.
The date command returns the date
/home/kgirrard/gwatch.out contains a series of four digit numbers in five columns that are timing references from Telecom. Hour and minute, bandwidth in (kilobytes per second), out, and number of connections.
Suppose you want to get the last line out of gwatch.out. How do you get the last line of a file? One of the commands was head, the other was the tail command. Pull up the man page for tail to see the format to tell it to pull just the last line, as it normally returns the last 10 lines.
tail --lines 1 gwatch.out
--lines is the option, 1 asks for a single line, and the file name goes at the end.
Now we need to use the set command to get at a positional result. We want the third column. We need the output of tail to use it with the set command. Position three is $3. There are a couple ways. We could redirect the output, either to a file using > or to another command using |.
We can also use ` `
We used ` ` to pass the results of a find command to a grep command (see above). We use the same approach: pass tail result out.
set `tail --lines 1 /home/kgirrard/gwatch.out`
That sets five positional parameters and $3 will be the third one. See it with echo $3
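The same steps can be tried without access to the real file. In this sketch the log is a stand-in with invented numbers in the five-column layout described above, and tail -n 1 is used as the portable spelling of --lines 1.

```shell
demo=$(mktemp -d)
log="$demo/gwatch.out"   # stand-in for /home/kgirrard/gwatch.out

# Made-up sample rows: five columns of four-digit numbers.
cat > "$log" <<'EOF'
0900 0012 0034 0005 0001
0905 0015 0040 0007 0002
0910 0020 0051 0009 0003
EOF

# The backticks run tail first; its one line of output becomes $1 ... $5.
set `tail -n 1 "$log"`
fields=$#
third=$3

echo "fields: $fields, third column of the last line: $third"
```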
Use pico to create the following two line file (named startup here):
pico startup
who
date
then you have to make it executable using chmod
chmod 0755 startup
To run the command you must issue ./startup The ./ says, essentially, "look in the current directory" as the current directory is not in the PATH variable (nor should it be for security reasons).
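The whole create, chmod, run cycle can be rehearsed in a scratch directory. A sketch only: the file is written with printf instead of an editor, and a #!/bin/sh first line is added here as an assumption so the script names its own shell.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Write the script without an editor (the #!/bin/sh line is an addition).
printf '#!/bin/sh\nwho\ndate\n' > startup

# Make it executable.
chmod 0755 startup

# Run it from the current directory and keep what it printed.
output=$(./startup)
status=$?

echo "$output"
echo "exit status: $status"
```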
More UNIX file systems: you've seen files. You've seen directories. Now experience links. The way UNIX stores and manages files does not fit how you want to think about it. It is, however, elegant. They split up storage from the references to the storage. There is not a strong connection between the data in a file and the name you give a file. Storage can be anywhere on a disk. The reference to a file sits in a directory. A directory entry is simply a name paired with an inode number. The inode records where the file's data is stored on the disk. So the directory entry you see refers only to an inode, which in turn knows how to access the stored file.
When you remove something you are removing references. If a file is referenced by two references then deleting one reference does not remove the file. When all references are removed, then UNIX deletes the file. Thus one file can appear under multiple names, even in multiple directories.
Ken once set up multiple users interfacing to a common program, each home directory had a link to a common file. Updating the common file updated all users.
ls -al
ln (the link command)
ln filename.extant filename.created.and.linked.to_filename.extant
ln email.txt address.txt
Would create a file named address.txt that links to email.txt. This is called a hard link as it is a file with a real reference. Now on the number of references: note that directories are referenced by the files . and .. Yes, that's dot and dot dot. You see them in ls command results.
To find who is using a file system or file:
/usr/sbin/fuser -v /home
would look at who is using the /home directory. The full path name is necessary on the command as /usr/sbin is not in the path.
Back to references: when the last reference to a file is deleted the file is deleted by UNIX. Suppose cracker hacker joe installs a spy program on a UNIX system, sets it to run in the background, and then deletes the directory entry, so the program disappears from the directories. The only way to see it is to run ps -ax (look up the process ID) and kill it. Without the directory reference there is no way to get at the file by name. Once killed, it is gone forever.
Symbolic links: issue an ln -s filenameorig filenamelinkingtoorig
This is more akin to a Mac OS alias or a Windows shortcut. These do not affect the link count, and they don't "preserve" the original file. You can remove the original file and the symbolic link does not know it. It is a loose link, if you will. Why use it? If you need to create a reference to a file on another disk you need to use it. Each disk has its own set of inode numbers. The hard link runs only on inode number. The hard link is only calling an inode number, like "5", so there is no way to say "Inode 5 on disk zztop" In this cross disk situation only a symbolic link can be used.
ln -s email.txt email.sym
If you try to link to a directory with a hard link you will be rebuffed. You cannot hard link to a directory. You can do a symbolic or soft link to a directory. Thus you can "alias" a directory. Why use it: if I'm moving stuff around in my operating system. Especially when you are the sysadmin. Especially in a big transition. You can transition in stages by using symbolically linked directories.
This allows one to copy files into a directory without actually copying them in: just soft link across to the correct directory.
Symbolic links are prefixed with the letter l in the output of the command ls -l
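The difference between the two kinds of link shows up as soon as the original name is removed. A minimal sketch in a scratch directory; the file contents are made up.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

echo 'dana@example.com' > email.txt   # made-up content

ln email.txt address.txt      # hard link: a second name for the same inode
ln -s email.txt email.sym     # symbolic link: a pointer to the name email.txt

# The second column of ls -l is the number of hard links.
links=$(ls -l email.txt | awk '{print $2}')

rm email.txt

# The data survives through the remaining hard link...
kept=$(cat address.txt)

# ...but the symbolic link now points at a name that no longer exists.
if [ -e email.sym ]; then dangling=no; else dangling=yes; fi

echo "links=$links kept=$kept dangling=$dangling"
```

The symbolic link does not raise the link count; only the hard link does.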
/dev
Device directory. Prefixes are unusual in here. b: block devices. c: character devices. l: symbolic links. This has to do with how the device will handle input. Character devices take a character at a time. Some devices read in blocks; disks are block devices. p devices: pipe devices used for communication. You can make pipe devices with a command (mkfifo). Pipes are FIFO devices (first in first out). A pipe has the advantage of tossing everything that has been read once it has been read.
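A pipe device can be made and tried out in a scratch directory. A sketch only, using mkfifo; the pipe name and message are made up.

```shell
demo=$(mktemp -d)
pipe="$demo/demo.pipe"   # hypothetical name

mkfifo "$pipe"

# In a long listing a pipe shows up with a leading p.
kind=$(ls -l "$pipe" | cut -c1)

# A writer blocks until a reader opens the pipe, so run it in the background.
echo 'first in, first out' > "$pipe" &

# Reading drains the pipe; nothing is left behind afterwards.
message=$(cat "$pipe")
wait

echo "type: $kind, message: $message"
```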
Runlevel Configuration
From AT&T System V. In /etc/rc.d
Runlevel 0 is halt: it shuts the machine down.
If you run in S (or runlevel 1) you are running in single user mode. No root level password needed at this level. This is a level that you can use to recover a lost sysadmin password.
Runlevel 2 is multiuser configuration. What does not happen at this level is networking support. The machine runs standalone but with all multiuser configuration stuff booting.
Runlevel 3 is multiuser with Networking.
Runlevel 4 is not used by default on Red Hat.
Runlevel 5 is GUI level in Linux. Instead of a command prompt you get a GUI screen with a log in screen.
Runlevel 6 is reboot: it reboots the machine.
There is a directory rc.d (that's right, there's a dot in that directory name).
You can, from the command line, move from runlevel to runlevel. There is a program called init, the overseer of all. Run pstree and you can see init at the root level; it has process ID 1. There is a file called inittab that holds all the configuration information for init: /etc/inittab
There is another file, but I missed what it does:
/etc/rc.d/rc.sysinit
Back to inittab. Perform a more on inittab in /etc
The run level information is there, along with what happens when the power goes off. Also, inittab calls rc.sysinit from a line in it.
Up in rc.d you will see all this other stuff. rc.sysinit based on boot check program by Smoorenburg.
Comments: . /etc/sysconfig/network. Runs network configuration files. Note especially the dot space slash at the lead-in. That is not a mistake. We have been running stuff in our current shell. Remember I said you can set environment vars and export them so programs can access them?
Programs run in another shell - a child shell. Variables set in a child shell are not passed back to the parent shell. So anything that the program sets is lost when it exits. To run a file in the current shell instead, so that what it sets stays set, use . /filename (dot space)
You can also, in the world of shell, issue an sh command to create a child shell and then exit back out to where you were in the environment before.
So that is why the dot space notation: to set the environment for the parent shell.
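The effect of the dot space notation can be seen by running the same file both ways. A sketch only: the file is a stand-in for /etc/sysconfig/network and HOSTNAME_DEMO is a made-up variable name.

```shell
demo=$(mktemp -d)
conf="$demo/network"     # stand-in for /etc/sysconfig/network

# HOSTNAME_DEMO is a made-up variable name.
echo 'HOSTNAME_DEMO=shark' > "$conf"

unset HOSTNAME_DEMO

# Run in a child shell: the assignment happens there and is lost on exit.
sh "$conf"
child_result=${HOSTNAME_DEMO:-unset}

# Run with dot space: the assignment happens in the current shell and stays.
. "$conf"
sourced_result=${HOSTNAME_DEMO:-unset}

echo "after child shell: $child_result, after dot space: $sourced_result"
```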
sysconfig/network sets all the network information for the server. IP address host name.
See rc.sysinit for information on finding this file.
We are now in Runlevel 3 directory.
K: stop (kill) process
S: start process
K and S are followed by a number. These are run in that order. Each is run in sequence. It invokes K01snmpd as K01snmpd stop. S99local is run as S99local start.
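The ordering can be imitated with a pretend runlevel directory. This is a rough sketch of the idea, not the real rc script: the directory and the four service scripts are made up, and each script only reports how it was called.

```shell
demo=$(mktemp -d)
rcdir="$demo/rc3.d"      # stand-in for the runlevel 3 directory
mkdir "$rcdir"

# Hypothetical service scripts that just report how they were called.
for name in S99local K01snmpd S10network K20nfs; do
    printf 'echo "%s $1"\n' "$name" > "$rcdir/$name"
done

# Kill scripts first, then start scripts; the shell sorts each group by name,
# which puts them in number order.
order=$(
    for script in "$rcdir"/K*; do sh "$script" stop; done
    for script in "$rcdir"/S*; do sh "$script" start; done
)
echo "$order"
```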
Let's look at one I'm fond of, inet. Internet.
S99local is full of stuff we have to do that is unique to us.
rc0.d is shutting the computer down: it is all Kill commands. The only start is a halt command.
Strings command: looks through compiled files and looks for strings.
01 December 1999
Monday we looked at what Linux does when it starts up.
Today: Other things we do at bootup
They run in a couple of different ways. Some are a type of application that is used heavily, others infrequently. How they are built into the op sys differs. The web server is busy: it is a statically installed process. Run pstree and you will see httpd, and it runs all the time. Telnetd is behind inetd, what is different about that?
Well, inetd is the internet superserver daemon. It listens for requests for a lot of protocols. Telnet, ftp, time, for mail, for pop and imap, finger. It listens for all of them. When you telnet to shark, inetd looks for that request. It starts up telnetd and then telnetd runs your application. Then inetd steps out of the picture. There are a whole bunch of these. These don't run until you want them to.
If I ftp into Shark and then run pstree, in addition to the telnets there is now an ftpd
Command: ftp shark
inetd has a configuration file in /etc/inetd.conf
There is a whole nother class of services called RPC: remote procedure calls. Sun's contribution to UNIX (and Berkeley's). The idea behind it was that the network is the computer. In a networked environment you could run even parts of a program on other computers to distribute the load. This is the RPC. You set up a list of procedures available to others and anyone who knows the unique number can access that service. There are a lot of services available...
file that sets this up /etc/rpc
This is a list of services; they are not all active. We do not use any right now. Eastern Oregon uses many of these, but many are Sun centric, so if you do not have Sun clients you do not use a lot of RPCs. We don't use most of this. Network information systems, user file management in a distributed UNIX environment.
Another class is things that listen all the time:
The only wrinkle in this, if you look at inetd you will see a security layer in place. inside inetd
ftp stream tcp nowait root (run as root) /usr/sbin/tcpd in.ftpd -l -a
note the tcpd file: a security wrapper. It checks against a security list of whether the computer is OK or not and then either opens or does not open the connection. Shell, login, telnet, ftp are notoriously insecure. We limit who can telnet to us. That is why so many services in inetd call tcpd.
tcpd is compiled
The config files are hosts allowed and hosts denied: /etc/hosts.allow /etc/hosts.deny
140.211 Eastern Oregon 206.49.89 FSM Telecom
/etc/services
Ties a service name to a number.
Well behaved applications that act as servers will take advantage of this service. This is how programs are supposed to look up port numbers.
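The name-to-number lookup can be done by hand with awk. In this sketch the file is a small stand-in written in the /etc/services layout (name, then port/protocol); pointing the same awk line at the real /etc/services should work the same way, assuming the service is listed under that name.

```shell
demo=$(mktemp -d)
services="$demo/services"   # stand-in for /etc/services

# A few sample entries in the /etc/services layout.
cat > "$services" <<'EOF'
ftp     21/tcp
telnet  23/tcp
pop-3   110/tcp
EOF

# Find the service by name, then strip the protocol to leave the port number.
port=$(awk '$1 == "pop-3" { split($2, parts, "/"); print parts[1] }' "$services")

echo "pop-3 listens on port $port"
```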
/etc/rpc is also a list of numbers. Service port numbers run up to 65,535. RPC program numbers start at 100,000.
Suppose you know we have a pop server on this campus and you know it is pop3
telnet shark.palikir pop-3
Now you are a client on the pop server. If you know how to talk to it
user dleeling
then the password and you can log in.
If you have services that are not password protected then you are in. You can port scan to find open ports. All these tcp services use clear text commands like user, pass, even help. They all speak plain text over the same transport: tcp.
Setting up a listserv with majordomo
Source Code
majordomo-1.94.4.tar.gz
issue a tar -ztf maj*|more to read the contents of the archive. (piping to more, like ps ax |more)
To unpack the file:
tar -zxvf maj* and press enter (the v lists each file as the archive is unpacked; without it the command is silent). The maj* works because there is only one maj type file.
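Listing and unpacking can be practiced on a tiny archive built for the purpose. A sketch only: pkg-1.0 and its README are made-up stand-ins for the majordomo archive.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Build a tiny stand-in archive; the names are made up.
mkdir pkg-1.0
echo 'read me first' > pkg-1.0/README
tar -zcf pkg-1.0.tar.gz pkg-1.0
rm -r pkg-1.0

# t lists the contents without unpacking anything.
listing=$(tar -ztf pkg*.tar.gz)
echo "$listing"

# x unpacks; the wildcard works because only one file matches it.
tar -zxf pkg*.tar.gz
unpacked=$(cat pkg-1.0/README)
echo "$unpacked"
```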
Take a look at the readme file. The first most important information. You need PERL 4.036 or 5.002 or better (5 is object oriented, 4 is not) (we use PERL 5.004). You also need a C compiler.
Then look at common problems and security notes. Now look at the INSTALL file.
less INSTALL
This file tells us what to do. Some programs need to have configure run first. Majordomo is an older program and does not support configure. Configure goes out and looks at what your system has and is running. So for majordomo we have to do it all ourselves. We need to choose a user and group.
After that you edit the makefile (sometimes named Makefile). Makefiles are used for a lot of things. They define dependencies between different files. The main purpose is for source code, but they can be used for other things. Anytime there is a dependency you can use a Makefile. They are also used to install things. Because this is PERL, Makefile will probably only do the INSTALL for us. There are notes in here for POSIX compliant systems and POSIX non-compliant systems.
Ken is running with two terminals open, one with the directions, one with the Makefile open.
First we need to tell it where PERL is. Check ls -l /bin/perl Nope. Ok, check ls -l /usr/bin/perl Yep, the file exists. It is probably in your path, probably in the first few items. Or get smart and use the find command. If you forgot find, get the man page with man find. Where do you want it installed? W_HOME = /usr/local/majordomo-$(VERSION)
Hmm... he opts now to create a majordomo user. To encapsulate the security. This will create a special user id. It will allow tightly specified security. By running everything under one user id you can set things tighter. As opposed to using root. If run as root then that opens up security problems. Let it run not under root.
He is now fiddling with the passwd file. In the majordomo...
more /etc/group
and it took the daemon number (2) and put it into the passwd file, subbing 101:2 for 101:101 in the passwd file.
back to configuration to enter this data
W_USER = 101
W_GROUP = 2
Ken modifies the path to remove usr/ucb (a Sun path)
Now edit majordomo.cf and the variables to set. We need to copy sample.cf to majordomo.cf
cp sample.cf majordomo.cf
Edits $homedir to /usr/local/majordomo
The version name is in the path: edit it out. When you upgrade it creates a whole new folder with the new version number as a folder name. I couldn't follow where he went to edit that path information. Sendmail is usually in /usr/lib/sendmail. But on Linux it is in /sbin/sendmail. There is a symbolic link back to /sbin/sendmail from /usr/lib/sendmail. It should not have been in /usr/lib/sendmail, but until all programs change their reference, the symbolic link holds the system together.
Now we do make wrapper
Issue the command. If no errors (and Ken got through this clean)...
Do make install as root! Standard users cannot do this. Linux itself does not use /usr/local/. It is a popular spot for majordomo and major applications. /usr/local replicates /src /bin /sbin /etc just like root. An artificial root or higher level root for programs to lounge around in.
There are some processes that require majordomo to be root in order to run.
OK, some alias file stuff. Ken is not using M4 sendmail. M4 is the most confusing configuration file format. Really dorky things like different versions of quote. Just for cuteness. syslog uses M4, almost nothing else. So we won't use M4's define command. Sendmail is, however, just as bad as M4. There are books on sendmail configuration files.
Time passes...
OK, so we are still trying to determine how to tell majordomo where its aliases file is, which way to execute section four of this install stuff. Looking for majordomo.aliases file.
find . -name \*aliases -print
Does not yet exist. Ken does not like the OA/usr/local/majordomo/majordomo.aliases way of doing it, but can see no other way to do this.
cd /etc
more aliases
to see all mail aliases
Sets up path to majordomo in aliases file:
# Majordomo aliases
majordomo: "|/usr/local/majordomo/wrapper majordomo"
owner-majordomo: kgirrard
majordomo-owner: kgirrard
A glitch: newaliases, a sendmail program that must be run, reports a problem with the majordomo aliases file.
Goes back to sendmail.cf and changes OA to O Aliasfile=/usr/local/majordomo
He had to change ownership on majordomo, the current directory.
chown root . (current directory)
chgrp root .
If folder is owned by username and file inside is owned by root, this is security problem: username could remove and then load up a file under the same name or replace a root owned file.
I run a command. majordomo wants to write to the directory to handle mock locks, but sendmail does not want the folder to be writable. Did a chmod 0755 which made the directory writable. newaliases is then run. NOT FATAL! newaliases pops warnings, but nothing fatal. The writable directory is a security risk, but the change to root ownership and writability have kept everyone happy except... security.
Unknown error number nine. Ken decides to head back to his office to tackle this error as config-test does not indicate any problems.
When you see the prompt after you log in, what is providing the prompt? The shell. Which shell? The one specified in /etc/passwd
grep kgirrard /etc/passwd
Linux default is bash
commands are separated from switches by whitespace. Whitespace is important.
Why do some switches prefix with dash? System V ATT used the dash option. The new latest and greatest BSD version is moving in another direction. Moving to where the options are not preceded by a dash.
Some use two dashes:
The -- is appearing more often than no dash.
Dash dash options are considered "long options" Options that follow a single dash are usually single letter options that can be concatenated. Double dash options are often words and cannot be concatenated.
Process management. ps command, what is running.
ps: what is running
kill: stop processes. Only superusers (root) can kill processes owned by others.
For daemon processes, you can also send them a kill -HUP pid (where pid is the process id) which is a hang up signal. Tied to the days of modems. When the modem disconnected all processes got the signal and terminated. With most daemons, rather than going away they use HUP as a signal to reload their configuration file. When you edit the config file you can force a reload of config files with kill -HUP. kill -1 is the same as -HUP.
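How a program can catch HUP and carry on can be imitated with the shell's trap command. A sketch only: a child shell stands in for the daemon, and the echoed message stands in for re-reading a configuration file.

```shell
# A child sh plays the daemon. It traps HUP, sends HUP to itself ($$ is its
# own process ID), and keeps running afterwards instead of terminating.
result=$(sh -c 'trap "echo reloading configuration" HUP; kill -HUP $$; echo still running')

echo "$result"
```

Without the trap line the child shell would simply be terminated by the signal.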
kill -9 can't be caught; it is designed that way. It cannot be trapped. Programs can trap other kill signals and do what they want to do prior to shutdown. Kill 9 is elimination with extreme prejudice: the process never sees the signal. Use it for tossing a user who has something running that traps and ignores regular kill commands: kill -9 pid. To determine someone's process ids use:
ps aux
Or, more specifically,
ps aux|grep peterp
To find a specific user.
who has a format for process ids too.
command &
Start a background process. Caveat. If you log off in the middle of it: it dies. To change that behavior:
nohup command &
See man nohup, but nohup makes the background process immune to HUPs.
Symbolic links, a review.
Here's another use of symbolic links. Completely hypothetical situation. I want to distinguish between symbolic links and hard links. Say you have a file in a directory. When you do an ls -l listing, the number in the second column is the number of hard links to that file. You can create other links to the file by saying ln file1 newfile2 where ln is the link command. When you do these hard links the entries are identical with each other. The file exists until you eliminate every single reference to the file. Hard links are limited to a single disk volume. Unix separates directory entries for data from data on the disk.
Symbolic links are the ones that show up with a -> on the right hand side. They do not preserve the original file. Back to the hypo situation. You have a program that is writing out log files. The file grows inordinately large. So once a week you move a copy of that file off to another location and it creates an empty file where the old one existed. Suppose somewhere else you want a reference to that, you can do that.
So I have /logs/logfile
Elsewhere I have /home/prog.logfile hardlinked back. If the program that rotates them removes /logs/logfile during its erase portion, the file /home/prog.logfile holds the original file: it never actually gets deleted. Now suppose you create a new logfile with touch /logs/logfile. /home/prog.logfile still points at the original logfile, not the new one.
So you use a symbolic link. The symbolic link continues to point to the same location so it pulls the new log file data; the symbolic link does not preserve the original file that it pointed to.
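The log rotation scenario can be replayed in a scratch directory. A sketch only: the logs and home directories live under a temporary directory, and the link names prog.hard and prog.sym are made up.

```shell
demo=$(mktemp -d)
mkdir "$demo/logs" "$demo/home"
cd "$demo" || exit 1

echo 'old entries' > logs/logfile
ln    logs/logfile    home/prog.hard     # hard link to the current inode
ln -s ../logs/logfile home/prog.sym      # symbolic link to the path

# Rotate: remove the old log and start a new one under the same name.
rm logs/logfile
echo 'new entries' > logs/logfile

hard_sees=$(cat home/prog.hard)   # still the old data
sym_sees=$(cat home/prog.sym)     # whatever now lives at the path

echo "hard link sees: $hard_sees"
echo "symbolic link sees: $sym_sees"
```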
Symbolic links: Another usage
#!/bin/sh
is the way a program tells UNIX what shell it wants to use.
The listserv last week included two cgi scripts. They called for #!/usr/local/bin/perl but in Red Hat Linux perl is at /usr/bin/perl
So you put a symbolic link named perl in /usr/local/bin to point to /usr/bin/perl