Subject: Re: Files tree
From: vallor (at) *nospam* cultnix.org (vallor)
Newsgroups: comp.os.linux.misc
Date: 12 Apr 2024, 15:08:02
Organisation : A noiseless patient Spider
Message-ID : <uvbf82$2c2fr$3@dont-email.me>
References : 1
User-Agent : Pan/0.157 (Mariinka; 1e36d04 gitlab.gnome.org/GNOME/pan.git; x86_64-pc-linux-gnu)
On Fri, 12 Apr 2024 13:39:34 +0100, James Harris
<james.harris.1@gmail.com> wrote in <uvba27$2c40q$1@dont-email.me>:
> For a number of reasons I am looking for a way of recording a list of
> the files (and file-like objects) on a Unix system at certain points in
> time. The main output would simply be sorted text with one
> fully-qualified file name on each line.
>
> What follows is my first attempt at it. I'd appreciate any feedback on
> whether I am going about it the right way or whether it could be
> improved, either in concept or in coding.
>
> There are two tiny scripts. In the examples below they write to
> temporary files f1 and f2 to test the mechanism, but the idea is that
> the reports would be stored in timestamped files so that comparisons
> between one report and another could be made later.
>
> The first, and primary, script generates nothing other than names and
> is as follows.
>
>   export LC_ALL=C
>   sudo find / \
>     -path "/proc/*" -prune -o \
>     -path "/run/*" -prune -o \
>     -path "/sys/*" -prune -o \
>     -path "/tmp/*/*" -prune -o \
>     -print0 | sort -z | tr '\0' '\n' > /tmp/f1
Filenames with newlines will contain your record delimiter, which
will be passed through without modification. You might want
to rethink this.
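One way around that (a sketch, not part of the original scripts) is to
stay NUL-delimited end to end and only convert to newlines for display.
This assumes GNU coreutils 8.25 or later for `comm -z`, and is
demonstrated on a scratch directory rather than /:

```shell
# Keep records NUL-terminated so embedded newlines in names survive.
# Scratch directory stands in for the real / scan.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b"
find "$dir" -print0 | sort -z > /tmp/snap1
touch "$dir/c"
find "$dir" -print0 | sort -z > /tmp/snap2
# Records only in snap2 = names added between the two runs;
# tr is applied only at display time.
comm -z -13 /tmp/snap1 /tmp/snap2 | tr '\0' '\n'
rm -rf "$dir" /tmp/snap1 /tmp/snap2
```

Staying NUL-delimited does mean diff(1) can't read the reports
directly, hence `comm -z` for the comparison here.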
> You'll see I made some choices, such as to omit files from /proc but
> not from /dev, to record any lost+found contents, to record mounted
> filesystems, to show just one level of /tmp, etc.
>
> I am not sure I coded the command right, albeit that it seems to work
> on test cases.
>
> The output from that starts with lines such as
>
>   /
>   /bin
>   /boot
>   /boot/System.map-5.15.0-101-generic
>   /boot/System.map-5.15.0-102-generic
>   ...etc...
>
> Such a form would be ideal for input to grep and diff to look for
> relevant files that have been added or removed between any two runs.
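For instance, with two made-up miniature reports standing in for two
timestamped runs (the file names and contents here are invented for
illustration):

```shell
# Fabricated mini-reports standing in for two nightly snapshots.
printf '/etc\n/etc/hosts\n/etc/motd\n' > /tmp/files-2024-04-11.txt
printf '/etc\n/etc/hosts\n/etc/issue\n' > /tmp/files-2024-04-12.txt
# '<' marks names only in the older run (removed),
# '>' marks names only in the newer run (added).
diff /tmp/files-2024-04-11.txt /tmp/files-2024-04-12.txt | grep '^[<>]'
rm -f /tmp/files-2024-04-11.txt /tmp/files-2024-04-12.txt
```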
> The second, and less important, part is to store (in a separate file)
> info about each of the file names, as that may be relevant in some
> cases. That takes the first file as input and has the following form.
>
>   cat /tmp/f1 |
>     tr '\n' '\0' |
>     xargs -0 sudo ls -ld > /tmp/f2
>
> The output from that is such as
>
>   drwxr-xr-x 23 root root 4096 Apr 13  2023 /
>   lrwxrwxrwx  1 root root    7 Mar  7  2023 /bin -> usr/bin
>   drwxr-xr-x  3 root root 4096 Apr 11 11:30 /boot
>   ...etc...
> As for run times, if anyone's interested: despite the server I ran
> this on having multiple locally mounted filesystems and one NFS mount,
> the initial tests took 90 seconds to generate the first file and 5
> minutes to generate the second. That would mean (as long as no faults
> are found) that it would be no problem to run at least the first
> script whenever required. Other than that, I'd probably also schedule
> both to run each night.
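Nightly scheduling could be as simple as a crontab entry; the script
name and paths below are hypothetical:

```
# Hypothetical root crontab line: snapshot at 02:30 each night.
30 2 * * * /usr/local/sbin/file-snapshot.sh > /var/log/file-snapshot.log 2>&1
```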
Since there is a significant difference in run times, you might want
to try running your first find(1) with the -ls option, instead of using
the pipeline to ls(1). (You could also possibly do it all with one
find(1) command, and use cut(1), awk(1) or perl(1) to split things
up, but my brain isn't fully booted yet this morning to figure that
out. ;) )
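A rough sketch of that single-pass idea, using GNU find's -printf so
that the ls(1) pipeline isn't needed at all. The field layout is my own
choice, and it's shown on a scratch directory; the real run would start
at / with the prunes:

```shell
# One pass instead of two: -printf emits mode, owner, group, size,
# mtime and name together (GNU find assumed).
dir=$(mktemp -d)
touch "$dir/example"
find "$dir" -printf '%M %u %g %8s %TY-%Tm-%Td %p\n' | sort -k6
rm -rf "$dir"
```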
> That's the idea. As I say, comments, advice and criticisms on the idea
> or on the coding would be appreciated!
A commendable first effort! Just be careful -- filenames can contain
pretty much any character, including newlines.
BTW, there is a newsgroup "comp.unix.shell" that is alive and
active, if you were inclined to broaden your audience.
-- 
-v