Subject: Re: Files tree
From: james.harris.1 (at) *nospam* gmail.com (James Harris)
Newsgroups: comp.os.linux.misc
Date: 20 Apr 2024, 16:23:27
Organisation: A noiseless patient Spider
Message-ID: <v00mlg$3nvjj$1@dont-email.me>
References: 1 2
User-Agent: Mozilla Thunderbird
On 12/04/2024 15:13, Rich wrote:
James Harris <james.harris.1@gmail.com> wrote:
export LC_ALL=C
sudo find /\
-path "/proc/*" -prune -o\
-path "/run/*" -prune -o\
-path "/sys/*" -prune -o\
-path "/tmp/*/*" -prune -o\
-print0 | sort -z | tr '\0' '\n' > /tmp/f1
If you are going to output null terminated filenames (-print0) then
don't almost immediately throw out the nulls by converting them to
newlines. The purpose of -print0 and the nulls is to avoid *any*
problems with any filename character (i.e., a filename /can/ contain a
newline). If, by any chance, you have even one filename with a
newline in the name, converting the nulls to newlines for storage will
break the storage file (i.e., you can't differentiate the "newlines
ending filenames" from the "newlines that belong inside a filename").
Convert the nulls to newlines only when you want to view with less;
then your "files of records" are not corrupt from the start:
tr '\0' '\n' < /tmp/f1 | less ; or
< /tmp/f1 tr '\0' '\n' | less ; if you prefer the input file on the
left
I am trying to do the zero termination just now but have run into a problem. The above find command may report errors such as permission failures and missing files. I really should include such info in the output, but coming from stderr those lines are newline- rather than NUL-terminated and therefore cannot be combined with just 2>&1.
To get around that I tried
find 2> >(tr '\n' '\0')
That partly works. After sorting, error messages appear at the end (they begin with 'f' where non-error lines begin with '/'), which is fine, but there is usually one garbled line between the good results and the error messages, and very likely some other corruption elsewhere in the file.
I guess that's due to output buffering but even
stdbuf -oL -eL find 2> >(stdbuf -oL -eL tr '\n' '\0')
doesn't work. This is already well beyond my comfort zone and is getting increasingly complex, which, it has to be said, would not be the case with newline terminators.
Hence this post to ask for suggestions on where to go next.
I guess I could write find's stdout and stderr to temp files, sed the stderr data, convert its newlines to NULs, combine it with the stdout data, and then I'd be back on track and could sort the result. But before doing that I thought to check back for suggestions. Are there any simpler ways?
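For what it's worth, that temp-file route can be fairly short. A minimal sketch, assuming a find invocation like the one earlier in the thread (the pruned paths and temp-file handling here are illustrative, not a tested recipe):

```shell
#!/bin/sh
# Sketch: capture stdout and stderr in separate files so the two streams
# can never interleave, then normalise both to NUL termination before
# sorting. Paths and file names are illustrative.
out=$(mktemp) err=$(mktemp)

# stdout is already NUL-terminated (-print0); stderr is plain
# newline-terminated text, kept apart for now.
find / -path "/proc/*" -prune -o -print0 > "$out" 2> "$err"

# Turn each error line into a NUL-terminated record and append it.
tr '\n' '\0' < "$err" >> "$out"

# Everything is now one record format, so it can be sorted safely.
sort -z "$out" > /tmp/f1

rm -f "$out" "$err"
```

Because the streams never share a pipe, there is no interleaving race to worry about, at the cost of one pass through a temp file.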
...
However, if you want 'data' on the files, you'd be better off using the
'stat' command, as it is intended to 'acquire' meta-data about files
(and by far more meta-data than what ls shows).
You'd feed it filenames with xargs as well:
xargs -0 stat --format="%i %h %n" > /tmp/f2
To output inode, number of links, and name (granted, you can get this
from ls as well, but look through the stat man page, there is a lot
more stuff you can pull out).
Agreed, stat would be better.
The one tool that does not (yet) seem to consume "null terminated
lines" is diff, and you can work around that by converting to newlines
at the time you do the diff:
diff -u <(tr '\0' '\n' < /tmp/f1) <(tr '\0' '\n' < /tmp/f2)
And, note, all of these "convert nulls to newlines at time of need" can
be scripted, so you could have a "file-list-diff" script that contains:
#!/bin/bash
diff -u <(tr '\0' '\n' < "$1") <(tr '\0' '\n' < "$2")
And then you can diff two of your audit files by:
file-list-diff /tmp/f1 /tmp/f2
And not have to remember the tr invocation and process substitution
syntax to do the same at every call.
Understood. Thanks for the clear info.
-- James Harris