Files in Perl
This tutorial walks through
using files in Perl.
Information About Files
Even without opening files we can get useful information about them. We can
apply file test operators to filenames (scalar ‘string’ variables)
or to filehandles (those things that look like <STDIN> or <PASSWDFILE>).
foreach $filename (@ARGV) {
    if (-e $filename) {
        print "$filename exists, ";
        if (-r $filename) {
            # -s returns the size in bytes
            $fileSize = -s $filename;
            print "and we can read it! It is $fileSize bytes long.\n";
        } else {
            print "but we can't read it!\n";
        }
    } else {
        print "There is no file called $filename, homer.\n";
    }
}
Most of the file tests return ‘1’ if they’re true, and nothing
if they’re not. The main exceptions are the ‘-s’ operator,
which returns the size of the file, and the various ‘age’ operators,
which return a number of days.
| Test | True when? |
| -r | We can read the file. |
| -w | We can write to the file. |
| -x | We can run the file. |
| -o | We own the file. |
| -e | The file exists. |
| -s | Returns the size of the file in bytes. |
| -f | File is a plain file (not a directory or special file). |
| -d | File is a directory. |
| -l | File is a symbolic link. |
| -t | Filehandle is open to terminal input or output. |
| -T | File is a text file. |
| -B | File is a binary file. |
| -M | Returns age of file in days since last modified. |
| -A | Returns age of file in days since last accessed. |
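To try a few of the tests in the table without touching your own files, here is a minimal sketch. It assumes the standard File::Temp module is available, and the file name "sample.txt" is made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";    # hypothetical file name

# Create a five-byte file so the tests have something to look at.
open(SAMPLE, ">$file") || die "Cannot create $file.\n";
print SAMPLE "hello";
close SAMPLE;

my $size   = -s $file;              # size in bytes
my $isFile = (-f $file) ? 1 : 0;    # plain file?
my $isDir  = (-d $dir)  ? 1 : 0;    # directory?
my $age    = -M $file;              # days since last modified (a fraction)

print "$file is $size bytes long.\n";
print "It is a plain file.\n"  if $isFile;
print "$dir is a directory.\n" if $isDir;
print "It was modified less than a day ago.\n" if $age < 1;
```

Note that the ‘age’ operators return fractions of a day, so a file you just created has an age very close to zero.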
Here’s a program that warns you if a file hasn’t
been used for a long time:
#Syntax: useless <age in days> <list of files to check>
$LongTime = shift(@ARGV);
foreach $filename (@ARGV) {
    # -A returns the days since the file was last accessed
    if (-A $filename > $LongTime) {
        print "You haven't used $filename in a long time!\n";
    }
}
Opening and Closing
Now comes the fun part, actually getting into and making use of files. Most
of the time, if you use Perl for writing small utilities, you’ll be
opening files, reading their data, and either reporting on the data, or writing
a modified file out to the screen. The screen is known in Perl as STDOUT.
You’ve already used it quite extensively. The correct syntax for the “print” statement
is “print FILEHANDLE Stuff to print”. Like most Perl statements,
however, if you leave out the FILEHANDLE it assumes something, and what it
assumes is STDOUT.
When you want to open a file, you
use the ‘open’ function. When
you’re done with the file, you use the ‘close’ statement.
open(MYFILE,$FileToOpen);
#do things with it
close MYFILE;
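If your scalar variable has the same name as the filehandle (all capitals,
by convention), you can use a simpler form of the open function. This works
only with ordinary global variables, not ones declared with ‘my’: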
$DEADBASE = "/jp/jerry/Dead/Venues.txt";
open(DEADBASE);
Generally, you’ll want to
make sure you opened it successfully:
if (open(DEADBASE)) {
#do things with it
} else {
print "Could not open $DEADBASE.\n";
}
If the file is so important that
you can’t do anything without it, you
can just up and die right there:
open(DEADBASE) || die "Could not open $DEADBASE.\n";
That’s your standard ‘or’ expression in there. It takes advantage of the
fact that if the first expression in an ‘or’ is true, the rest are ignored,
so ‘die’ never runs when the open succeeds. If the first expression is
false, the rest have to be evaluated, and the program dies.
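The same ‘or’ trick works with things other than ‘die’. As a small sketch, the helper below (the name can_open and the path are made up for this example) returns early when the open fails. The path is assumed not to exist on your system.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the file opens, 0 when it does not.
sub can_open {
    my ($name) = @_;
    open(PROBE, $name) || return 0;    # right side runs only if open fails
    close PROBE;
    return 1;
}

my $missing = "/no/such/dir/no-such-file.txt";    # assumed not to exist
my $result  = can_open($missing);
print "can_open returned $result for $missing\n";
```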
Normally, the only thing you can
do with your open file is read it. If you want to write to it, you have to
tell Perl when you open the file that you’re
going to be writing to it. Under normal circumstances you won’t be both
reading and writing to the same file.
| Open | Does What? |
| "Filename" | We can read the file. |
| "< Filename" | We can read the file. |
| "> Filename" | We are creating a new file. It will erase any old file with the same name. |
| ">> Filename" | We are appending to the file. It will create the file if it doesn’t already exist. |
$LOGFILE = "dopey.log";
open(LOGFILE) || die "Dopey will not open.\n";
print LOGFILE "You just started me up.\n";
close LOGFILE;
This program should tell you “Dopey will not open” when we try
to open it, because the file doesn’t already exist and opening a file for
reading never creates it. To create it, we would have to open it for writing,
and we didn’t ask for that. Try it again with ‘>’ added in front
of the “dopey.log” filename, and then again with ‘>>’ in
front of “dopey.log”:
$LOGFILE = ">dopey.log";
$LOGFILE = ">>dopey.log";
Try each option a number of times, and make sure you look at the log file
after each time!
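If you’d rather see the difference in one run, here is a rough sketch that counts the lines in the log after each kind of open. It assumes the standard File::Temp module, and the helpers write_line and count_lines are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
my $log = "$dir/dopey.log";

# Hypothetical helper: open $target (mode prefix included), write one line.
sub write_line {
    my ($target, $text) = @_;
    open(LOGFILE, $target) || die "Dopey will not open.\n";
    print LOGFILE "$text\n";
    close LOGFILE;
}

# Hypothetical helper: count the lines currently in the file.
sub count_lines {
    my ($name) = @_;
    open(LOGFILE, "<$name") || die "Cannot read $name.\n";
    my @lines = <LOGFILE>;
    close LOGFILE;
    return scalar(@lines);
}

write_line(">$log", "first run");
write_line(">$log", "second run");     # '>' erases the first line
my $afterCreate = count_lines($log);

write_line(">>$log", "third run");     # '>>' keeps what is there
write_line(">>$log", "fourth run");
my $afterAppend = count_lines($log);

print "After two '>' opens: $afterCreate line(s).\n";
print "After two '>>' opens: $afterAppend line(s).\n";
```

The first count is 1 because each ‘>’ open starts the file over; the second is 3 because each ‘>>’ open adds to what is already there.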
Reading and Writing
When you want to read what’s in the file, you have quite a few options,
but the most common is to use ‘while’. We’ve already done
quite a bit of this. Surround the filehandle with the two angle brackets and
go to it. You’ll be handed each line of the file in ‘$_’.
$SearchTerm = shift(@ARGV);
$OUTFILE = ">" . shift(@ARGV);
open(OUTFILE) || die "Cannot open $OUTFILE.\n";
while (@ARGV) {
    $INFILE = shift(@ARGV);
    if (open(INFILE)) {
        while (<INFILE>) {
            # copy each line that matches, ignoring case
            print OUTFILE if /$SearchTerm/i;
        }
        # close each input file once we are done with it
        close INFILE;
    } else {
        print "Cannot open $INFILE.\n";
    }
}
close OUTFILE;
It probably would have been easier
and looked nicer to use while (<>), which reads each file named in @ARGV in turn,
and not worry about $INFILE, but then we wouldn’t have been able to print our own
warning when a file couldn’t be opened for reading.
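Here is the same read-and-match loop as a self-contained sketch that collects the matching lines in an array instead of writing them to a second file. It assumes the standard File::Temp module, and the sample file and its contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $infile = "$dir/venues.txt";    # hypothetical sample file

open(SAMPLE, ">$infile") || die "Cannot create $infile.\n";
print SAMPLE "Winterland\nFillmore West\nfillmore east\nRed Rocks\n";
close SAMPLE;

my $SearchTerm = "fillmore";
my @matches;

open(INFILE, $infile) || die "Cannot open $infile.\n";
while (<INFILE>) {
    # $_ holds the current line, newline included
    push(@matches, $_) if /$SearchTerm/i;
}
close INFILE;

print "Found ", scalar(@matches), " matching line(s):\n";
print @matches;
```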
Automatically Modifying Files
Remember that line you put at the start of every Perl script you write? That
means something! It is not only the location of Perl in your operating system,
it is the ‘command line options’ that you want to pass to Perl.
So you can tell Perl that you want to “edit your files in place” by
means of that top line. Add “-iext” to the line, and any “print” statements
that you do while you are reading a file using <> will be written back
to the file. The original file will be backed up with the same filename and
the extension that you specify. Here’s an example:
#!/usr/local/bin/perl -i_bak
while (<>) {
    print "All work and no play makes Jack a dull boy.\n";
}
print "Honey, I'm home!\n";
This program will replace every
single line in the files you give it with the sentence “All work and no play makes Jack a dull boy.” When
it’s done, it will print “Honey, I'm home!” to the screen,
and will leave the original files in a file ending with ‘_bak’.
Be careful! If you leave off the ‘extension’, and just do ‘-i’,
Perl will go right ahead and modify your old file and not back up the original!
Change the top line in the above program to “#!/usr/local/bin/perl -i” and
you will completely lose any files you specify on the command line. Not a pleasant
prospect!
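A safe way to experiment is to practice on a scratch file. The sketch below does not use the top line at all; instead it sets Perl’s special variable $^I, which holds the same backup extension that ‘-i’ sets, inside a block. Treat it as an illustration rather than the usual way to do it. It assumes the standard File::Temp module, and the file name is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/overlook.txt";    # hypothetical sample file

open(SAMPLE, ">$file") || die "Cannot create $file.\n";
print SAMPLE "line one\nline two\n";
close SAMPLE;

{
    # Same effect as "-i_bak" on the top line, but only inside this block.
    local $^I   = "_bak";
    local @ARGV = ($file);
    while (<>) {
        s/line/LINE/;    # change the line held in $_
        print;           # goes back into the file, not to the screen
    }
}

print "Edited $file; the original is in ${file}_bak.\n";
```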
Directories
Directories (folders if you’re more familiar with the Macintosh) work
pretty much the same as files. You just add ‘dir’ to the end of
the open, close, and read statements. You can’t use the all-caps trick
on directories.
$CurrentDir = shift(@ARGV);
opendir(CURRENTDIR,$CurrentDir) || die "Cannot open directory $CurrentDir.\n";
@FileNames = readdir(CURRENTDIR);
closedir CURRENTDIR;
foreach $filename (@FileNames) {
    # readdir returns bare names, so build the full path for the tests
    $FullPath = $CurrentDir . "/" . $filename;
    print $filename;
    if (-d $FullPath) {
        print "/";
    } elsif (-x $FullPath) {
        print "*";
    }
    print "\n";
}
You pretty much never want to write to directories.
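One thing you’ll notice when you run the listing above is that readdir also hands back the special entries “.” and “..”. Here is a small sketch that filters them out and sorts the rest. It assumes the standard File::Temp module, and the file and directory names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Two hypothetical files and one subdirectory to list.
foreach my $name ("a.txt", "b.txt") {
    open(SAMPLE, ">$dir/$name") || die "Cannot create $name.\n";
    close SAMPLE;
}
mkdir("$dir/sub", 0755) || die "Cannot create directory.\n";

opendir(CURRENTDIR, $dir) || die "Cannot open directory $dir.\n";
# readdir also returns "." and ".."; drop them here.
my @FileNames = sort grep { $_ ne "." && $_ ne ".." } readdir(CURRENTDIR);
closedir CURRENTDIR;

foreach my $filename (@FileNames) {
    # Mark directories with a trailing slash, as in the listing above.
    print $filename, (-d "$dir/$filename" ? "/" : ""), "\n";
}
```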
Attaching to Arrays
If you want the contents of your arrays to last over time, you can “attach” those
arrays to files. Any changes to your array will be reflected in the file and
will still be there the next time you use that file. In Perl terms, you tie
the array to the file.
use NDBM_File;
use Fcntl;
$db = "/jp/jerry/swine";
tie(%Names, 'NDBM_File', "$db/ages", O_CREAT|O_RDWR, oct('640'));
while (<>) {
    chomp;    # drop the newline so it isn't stored with the age
    ($name,$age) = split(/:/);
    $Names{$name} = $age;
}
untie %Names;
This sets up the file “/jp/jerry/swine/ages” as the associative
array ‘%Names’, and it takes whatever we’ve piped to the
command and uses that as the initial data. Pipe the following file to it:
Jerold M. Stratton:33
James M. Stratton:32
Steven Spear:30
Jack W. Pope:52
John Paul:52
Hsiao-Ping Feng:29
Hannah Kinney:49
Shahra Meshkaty:34
Daniel Kramarsky:25
Then, show the ties with:
#!/usr/local/bin/perl
use NDBM_File;
use Fcntl;
$db = "/jp/jerry/swine";
tie(%Names, 'NDBM_File', "$db/ages", O_RDWR, oct('440'));
foreach $name (keys(%Names)) {
    print "$name : $Names{$name}\n";
}
untie %Names;
You can create more than one tie in the same Perl script, allowing you to
create databases with multiple fields. Since these are associative arrays,
you need to make sure that your key is unique to each record.
Once you tie an associative array
to a file, you can work with the ‘file’ just
like it was a normal associative array.
The “O_CREAT” and “O_RDWR” have to do with how the file is opened:
“O_CREAT” creates the file if it doesn’t exist yet, and “O_RDWR” opens
it for both reading and writing. The numbers “440” and “640” have to do with
the permissions of the file you’re creating. These are Unix things that you don’t
have to worry too much about: read the man pages for ‘ls’ and ‘chmod’ to
make sure your files aren’t accessible to the entire world.
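If NDBM_File isn’t installed on your system, you can still see a tie survive from one use to the next. The sketch below swaps in SDBM_File, a different database module that comes bundled with Perl, and takes the same arguments. It is not the module used above. It also assumes the standard File::Temp module, and the database name is made up.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $db  = "$dir/ages";    # hypothetical database name

my %Names;
tie(%Names, 'SDBM_File', $db, O_CREAT|O_RDWR, oct('640'))
    || die "Cannot tie $db.\n";
$Names{"Steven Spear"} = 30;
$Names{"John Paul"}    = 52;
untie %Names;

# Tie again, as a second program would, and read the values back.
my %Stored;
tie(%Stored, 'SDBM_File', $db, O_RDWR, oct('640'))
    || die "Cannot tie $db.\n";
my $age   = $Stored{"Steven Spear"};
my $count = scalar(keys(%Stored));
untie %Stored;

print "Steven Spear is $age; $count names are stored.\n";
```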
About this Tutorial
This tutorial is written by Jerry Stratton and is published
under the GNU Free Documentation License.