[LUAU] Sharing a cute script for FTP based file grooming

Brian Chee chee at hawaii.edu
Thu Jul 17 17:07:47 PDT 2014


Considering how poor the other scripts I found in my search were, I thought
I'd share this with the community. This came out of trying to get rid of
old files on a National Instruments cRIO device. The gist is that the cRIO
will crash if the flash fills up, and the system does NOT recycle space, so
we MUST remove old files to prevent this. However, the unit outputs a very
strange file listing, and we had to parse it on some sort of separator to
strip the HTML and leave just the file name. Since we have the date
embedded in the filename, checking filenames against a cutoff date is how
we solved the "pruning" process.

Lastly, this is referenced in the /var/spool/cron/ directory for user
"*insertusernamehere*", which owns the script and the target directory. We
run this weekly; here's the cron statement:

## Run the file prune for UHMC cRIO ONLY on Thursdays to delete 7 days or
## older
15 3 * * 4 /export/grianach/*insertusernamehere*/uhmc-crio-fileprune.sh

The 4 in the fifth field is the day of the week, so this runs every
Thursday at 3:15am HST.
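
If you want to double-check which weekday a crontab day-of-week number
refers to, GNU date can print the same 0-6 numbering (Sunday is 0). A quick
sketch, assuming GNU date (the script's -d option needs it anyway); the
variable name dow is just for illustration:

```shell
# %w prints the day of week as 0..6 with Sunday = 0,
# which lines up with the fifth field of a crontab entry
dow=$(date -d 2014-07-17 +%w)
echo "$dow"    # 2014-07-17 was a Thursday
```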

/brian chee


*<Insert weird file listing off cRIO>*
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head>
<title>Index of /MainData on 166.122.250.83:21</title>
</head>
<body>
<h1>Index of /MainData on 166.122.250.83:21</h1>
<hr>
<pre>
  2014 Jul 17 03:14  File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_17_Maui.txt">14_07_17_Maui.txt</a>  (9721166 bytes)
  2014 Jul 10        File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_09_Maui.txt">14_07_09_Maui.txt</a>  (72090182 bytes)
  2014 Jul 10 23:59  File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_10_Maui.txt">14_07_10_Maui.txt</a>  (71997734 bytes)
  2014 Jul 12        File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_11_Maui.txt">14_07_11_Maui.txt</a>  (72000302 bytes)
  2014 Jul 12 23:59  File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_12_Maui.txt">14_07_12_Maui.txt</a>  (71954934 bytes)
  2014 Jul 13 23:59  File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_13_Maui.txt">14_07_13_Maui.txt</a>  (71938670 bytes)
  2014 Jul 14 23:59  File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_14_Maui.txt">14_07_14_Maui.txt</a>  (71898438 bytes)
  2014 Jul 16        File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_15_Maui.txt">14_07_15_Maui.txt</a>  (72109014 bytes)
  2014 Jul 17        File        <a href="ftp://User:hig411@166.122.250.83:21/MainData/14_07_16_Maui.txt">14_07_16_Maui.txt</a>  (72046526 bytes)
</pre>
</body>
</html>

*</Insert weird file listing off cRIO>*
NOTE: I'm searching for a less-than and then a greater-than symbol since I
don't have spaces to search for in the string. Splitting each line on "<"
and then on ">" leaves just the filename.
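
To make the two-step split concrete, here is a small sketch that runs the
same pair of awk commands on a single listing line. The sample line is
modeled on the listing above, with placeholder credentials and address; the
variable names line and name are only for illustration.

```shell
# One line in the style of the cRIO listing (placeholder user, password, host)
line='  2014 Jul 17 03:14  File        <a href="ftp://User:password@ipaddress/MainData/14_07_17_Maui.txt">14_07_17_Maui.txt</a>  (9721166 bytes)'

# Split on "<" and keep field 2:  a href="...">14_07_17_Maui.txt
# Split that on ">" and keep field 2:  14_07_17_Maui.txt
name=$(printf '%s\n' "$line" | awk -F\< '{print $2}' | awk -F\> '{print $2}')
echo "$name"
```

If your server's listing puts other tags before the link, the field numbers
in the print statements would need to change to match.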

We found a similar problem on IIS-based FTP servers and some ISPs, so
considering the pain we went through getting this to work, I figured others
might want a copy to start from.

As always, mileage may vary and much depends upon how your FTP service
presents file listings. But hopefully I've put enough comments in that
you'll be able to figure out what we did and modify it for your use. While
I will answer questions about it, I AM NOT PROVIDING SUPPORT, THIS IS
PROVIDED AS IS.

This is on a CentOS server connecting to a National Instruments cRIO DAC
system.

/brian chee

P.S. This was written by Ross Ishida, and I hovered over him trying to
learn this script.


*<Paste>*
#!/bin/sh

# Script to list out and find the oldest files that are x number of days old
# Then delete only the files that are more than x days old.

# Set current date into a variable
curdate=`date +%y_%m_%d`

###### Set who to mail the log file to
# (the list archive obscures addresses; put a real user@host address here)
mailto="grianach at soest.hawaii.edu"

###### Set subject line of the email
subject="Some kind of subject for $curdate"

# temporary name of log file
logfile=/tmp/output-$curdate.log

# Set the date you want to start pruning off old files at
# Literally just change the number in the string "...days ago" to get the
# prune date (the -d option as used here needs GNU date)

prunedate=`date +%y_%m_%d -d "7 days ago"`
#prunedate=`date +%y_%m_%d -d "2 days ago"`

# This wget command does the following:
# --no-remove-listing keeps the temporary .listing file that wget
#    generates for FTP retrievals instead of deleting it
# --no-verbose turns off wget's verbose output (errors and basic
#    information still get printed)
# -O - (dash, capital letter O, then a dash) writes the result to
#    standard output
# The FTP URL carries the username and password in line; errors go to
#    /dev/null and the output of the command is appended to the logfile
#    specified above
# This first wget command just lists out the files with the full URL into
#    the log file so that we know which files we're starting with

wget --no-remove-listing --no-verbose -O - \
  "ftp://User:password@ipaddress/MainData/." 2>/dev/null >> "$logfile"

##### This is only because Windows FTP via IIS returns a URL, not a
##### standard file listing
# Start a loop to go thru the listing of files from the above wget command
# The pipeline carves just the filename off the big URL and then sorts
#   the file list, so the oldest dates come first
# The if-then inside the loop looks to see if the file being looked at
#   has the prunedate in its name
# Files that sort before the prunedate file are older, so for those the
#   statements after the fi line run
# As soon as the prunedate file is found, break ends the loop
# NOTE: if no filename contains the prunedate exactly, the loop never
#   breaks and every file in the listing is treated as old
# The first awk command splits each line on the less-than symbol and the
#   second splits on the greater-than symbol
# This gets rid of the HTML around the symbols, leaving the filename we
#   want. Each print command outputs the second field of its split
# Notice the backslash before the less-than and greater-than symbols; this
#   is called escaping the symbol so that the shell doesn't try to process
#   it (as a redirect) but rather uses it as part of the string.

for i in $(wget --no-remove-listing --no-verbose -O - \
    "ftp://User:password@ipaddress/MainData/." 2>/dev/null | grep File |
    awk -F\< '{print $2}' | awk -F\> '{print $2}' | sort)
do
  result=`echo "$i" | grep "$prunedate"`
  if [ -n "$result" ]
  then
    break
  # This is the end of the prunedate check
  fi
  # echo to the std output the truncated filename that you plan to delete
  echo remove $i
  # one file at a time, hand the filename to lftp
  # The here-document goes from the lftp statement to the matching
  # EOLSTUFF flag. The rm line is commented out here, so only the ls -l
  # line runs; take the "# " off the rm line to really delete.

  lftp ftp://User:password@IPaddress/MainData << EOLSTUFF
ls -l $i
# rm $i
EOLSTUFF
# This is the end of the for-do loop; everything it printed is appended
# to the logfile

done  >> $logfile

# Once the entire loop is done, its results have been appended to the raw
# URL listing written by the first wget command.
# Keep in mind that if this script terminates early it will keep appending
# to the same log file each time it's run in that day
# Mail the logfile out and then, once sent, delete the log file
mailx -s "$subject" "$mailto" < "$logfile"

# get rid of log once mailed
rm "$logfile"

*</Paste>*
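
One thing to watch if you adapt this: the loop stops only when a filename
contains the prune date exactly, so a missing day in the listing would let
it run through every file. Because yy_mm_dd strings sort in date order
within the same century, one possible alternative is to compare each file's
date prefix against the prune date directly. This is only a sketch, not
what the script above does; is_older and the sample filenames are
hypothetical, and it assumes the date is always the first 8 characters of
the name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): succeeds when the
# yy_mm_dd prefix of filename $1 sorts strictly before prune date $2.
is_older() {
  filedate=$(printf '%s' "$1" | cut -c1-8)
  # expr compares non-numeric operands as strings and exits 0 when true;
  # LC_ALL=C keeps the comparison bytewise
  LC_ALL=C expr "$filedate" \< "$2" >/dev/null
}

# Sample run with a gap: there is no file for the prune date itself
prunedate=14_07_10
for f in 14_07_08_Maui.txt 14_07_09_Maui.txt 14_07_11_Maui.txt
do
  if is_older "$f" "$prunedate"
  then echo "remove $f"
  else echo "keep $f"
  fi
done
```

With this check the two files dated before 14_07_10 are flagged for removal
and the 14_07_11 file is kept, even though no file matches the prune date.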
-- 
********************************************
University of Hawaii SOEST
Advanced Network Computing Laboratory (ANCL)
Brian Chee <chee at hawaii.edu>
2525 Correa Road, HIG 500
Honolulu, HI 96822
Office: 808-956-5797

