I suppose this is heading into Server Fault territory, but I added
the following cron job to delete old metrics of ours that haven't been
written to for over 30 days (e.g. those of cloud instances that have been
disposed of):
As others have pointed out, removing the files is the way to go. Expanding on the previous answers, I wrote this script, which removes any Whisper file whose age (time since it was last written to) exceeds its maximum retention. Run it as a cron job fairly regularly.
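#!/bin/bash
# Remove Whisper files that have not been written to for longer than
# their own maximum retention, then prune any directories left empty.
d=$1
now=$(date +%s)
MINRET=86400  # skip files modified within the last day (24*60*60 seconds)
if [ -z "$d" ]; then
    echo "Must specify a directory to clean" >&2
    exit 1
fi
# -print0 with read -d '' keeps paths containing spaces or newlines intact
find "$d" -name '*.wsp' -print0 | while IFS= read -r -d '' w; do
    age=$((now - $(stat -c '%Y' "$w")))
    if [ "$age" -gt "$MINRET" ]; then
        # whisper-info.py is expensive, so only call it for files past MINRET
        retention=$(whisper-info.py "$w" maxRetention)
        if [ "$age" -gt "$retention" ]; then
            echo "Removing $w ($age > $retention)"
            rm -- "$w"
        fi
    fi
done
# Clean up directories left empty by the deletions above
find "$d" -empty -type d -delete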
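For the cron side, something like the following could work. It is only a sketch: the script path, the Whisper directory and the daily 03:15 schedule are all assumptions, so substitute whatever matches your install.

```shell
#!/bin/bash
# Hypothetical locations; neither path comes from the script above.
CLEAN_SCRIPT=/usr/local/bin/clean-whisper.sh
WHISPER_DIR=/var/lib/graphite/whisper

# Crontab fields: minute hour day-of-month month day-of-week command.
# This entry would run the cleanup once a day at 03:15.
cron_entry="15 3 * * * $CLEAN_SCRIPT $WHISPER_DIR"
echo "$cron_entry"

# To append it to the current user's crontab (not executed here):
#   ( crontab -l 2>/dev/null; echo "$cron_entry" ) | crontab -
```

Running the snippet only prints the entry; the commented-out line shows one way to install it.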
One thing to be aware of: the whisper-info.py call is quite heavyweight. To reduce the number of calls to it I've added the MINRET constant, so that no file is considered for deletion until it has gone unmodified for at least 1 day (24*60*60 seconds); adjust to fit your needs. There are probably other things that could be done to shard the job or generally improve its efficiency, but I haven't needed to yet.
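A minimal sketch of one such efficiency tweak, assuming GNU find and stat: let find apply the minimum-age filter itself with -mmin, so the loop only ever sees files that are already candidates. The names clean_expired, MINRET_MINUTES and WHISPER_INFO are made up for this illustration; WHISPER_INFO is just a hook for pointing at a different whisper-info.py, not anything Whisper provides.

```shell
#!/bin/bash
# Hypothetical override hook for the info command (defaults to whisper-info.py).
WHISPER_INFO=${WHISPER_INFO:-whisper-info.py}
# Same one-day threshold as MINRET=86400 seconds, expressed in minutes.
MINRET_MINUTES=1440

clean_expired() {
    local dir=$1 now w mtime age retention
    now=$(date +%s)
    # -mmin +N matches files last modified more than N minutes ago,
    # so recently written files never reach the loop at all.
    find "$dir" -name '*.wsp' -type f -mmin +"$MINRET_MINUTES" -print0 |
    while IFS= read -r -d '' w; do
        mtime=$(stat -c '%Y' "$w") || continue
        age=$((now - mtime))
        # Skip the file if the info command fails rather than guessing.
        retention=$("$WHISPER_INFO" "$w" maxRetention) || continue
        if [ "$age" -gt "$retention" ]; then
            echo "Removing $w ($age > $retention)"
            rm -- "$w"
        fi
    done
    # Prune directories left empty by the deletions above.
    find "$dir" -empty -type d -delete
}
```

Call it as clean_expired /path/to/whisper. The per-file whisper-info.py call is still there, so this only trims the work done for recently written files; it does not shard the job.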