inn 2.7.0, expire takes ages and hangs
Subject: inn 2.7.0, expire takes ages and hangs
From: aw (at) *nospam* somewhere.invalid (Adam W.)
Newsgroups: news.software.nntp
Date: 29 Oct 2024, 14:47:54
Organization: news.chmurka.net
Message-ID: <5dcbedb5-81e3-4ba1-a218-ac52c626cb05-aw@news.chmurka.net>
User-Agent: tin/2.6.1-20211226 ("Convalmore") (Linux/6.1.21-v7+ (armv7l))
Hi,
Users started complaining that they were getting "400 Expiring process ..."
errors on my server. I looked, and it was true: expire had been running
since cron started it during the night, but it was hung. strace showed that
the expire process (expire -v1, called from news.daily) was waiting for data
on a socket:
$ strace -f -p 28158
strace: Process 28158 attached
select(6, [5], NULL, NULL, {tv_sec=28, tv_usec=42890}
It just waits on select, and does nothing else.
$ ls -l /proc/28158/fd/
total 0
lr-x------ 1 news news 64 Oct 29 14:23 0 -> 'pipe:[318604403]'
l-wx------ 1 news news 64 Oct 29 14:23 1 -> 'pipe:[318607484]'
lrwx------ 1 news news 64 Oct 29 14:23 10 -> /usr/local/news/spool/cnfs/cnfs1
lrwx------ 1 news news 64 Oct 29 14:23 11 -> /usr/local/news/spool/cnfs/cnfs2
lrwx------ 1 news news 64 Oct 29 14:23 12 -> /usr/local/news/spool/cnfs/cnfs3
lrwx------ 1 news news 64 Oct 29 14:23 13 -> /usr/local/news/spool/cnfs/cnfs4
lrwx------ 1 news news 64 Oct 29 14:23 14 -> /usr/local/news/spool/cnfs/cnfs5
lrwx------ 1 news news 64 Oct 29 14:23 15 -> /usr/local/news/spool/cnfs/junk1
lrwx------ 1 news news 64 Oct 29 14:23 16 -> /usr/local/news/spool/cnfs/ctrl1
lrwx------ 1 news news 64 Oct 29 14:23 17 -> /usr/local/news/spool/cnfs/priv1
lrwx------ 1 news news 64 Oct 29 14:23 18 -> /usr/local/news/spool/cnfs/big81
lrwx------ 1 news news 64 Oct 29 14:23 19 -> /usr/local/news/spool/cnfs/alt1
l-wx------ 1 news news 64 Oct 29 14:23 2 -> 'pipe:[318607484]'
lrwx------ 1 news news 64 Oct 29 14:23 20 -> /usr/local/news/spool/cnfs/spam1
lrwx------ 1 news news 64 Oct 29 14:23 21 -> /usr/local/news/spool/cnfs/spam2
l-wx------ 1 news news 64 Oct 29 14:23 22 -> /usr/local/news/db/history.n
lr-x------ 1 news news 64 Oct 29 14:23 23 -> /usr/local/news/db/history.n
lr-x------ 1 news news 64 Oct 29 14:23 24 -> /usr/local/news/db/history
lrwx------ 1 news news 64 Oct 29 14:23 3 -> '/tmp/#7734312 (deleted)'
lrwx------ 1 news news 64 Oct 29 14:23 4 -> 'socket:[318607485]'
lrwx------ 1 news news 64 Oct 29 14:23 5 -> 'socket:[318607492]'
lr-x------ 1 news news 64 Oct 29 14:23 6 -> /usr/local/news/db/history
lrwx------ 1 news news 64 Oct 29 14:23 7 -> /usr/local/news/db/history.n.dir
lrwx------ 1 news news 64 Oct 29 14:23 8 -> /usr/local/news/db/history.n.index
lrwx------ 1 news news 64 Oct 29 14:23 9 -> /usr/local/news/db/history.n.hash
$ cat /proc/28158/fdinfo/5
pos: 0
flags: 02
mnt_id: 9
scm_fds: 0
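One way to dig further into what fd 5 is connected to would be to take the
socket inode from the fd listing above (318607492) and look it up in the
kernel's socket tables. What follows is only a sketch, assuming the socket is
a Unix-domain one and is therefore listed in /proc/net/unix; a TCP socket
would appear in /proc/net/tcp instead, which this sketch does not parse. The
function names are made up for illustration.

```python
import os
import re


def socket_inode(link_target):
    """Extract the inode from an fd link target such as 'socket:[318607492]'.

    Returns None for anything that is not a socket (pipes, regular files).
    """
    match = re.fullmatch(r"socket:\[(\d+)\]", link_target)
    return int(match.group(1)) if match else None


def find_unix_socket(proc_net_unix_text, inode):
    """Return the rows of /proc/net/unix whose Inode column equals `inode`.

    Columns are: Num RefCount Protocol Flags Type St Inode [Path].
    The Path column is absent for unnamed sockets.
    """
    rows = []
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 7 and fields[6] == str(inode):
            path = fields[7] if len(fields) > 7 else ""
            rows.append({"state": fields[5], "path": path})
    return rows


def lookup_fd(pid, fd):
    """Resolve /proc/<pid>/fd/<fd> and search /proc/net/unix for its inode."""
    inode = socket_inode(os.readlink(f"/proc/{pid}/fd/{fd}"))
    if inode is None:
        return []
    with open("/proc/net/unix") as table:
        return find_unix_socket(table.read(), inode)
```

Run as the news user or root while expire is stuck, lookup_fd(28158, 5)
would return the matching row, if there is one. An empty list would suggest
the socket is not a Unix-domain socket at all.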
I killed the process, restarted the server, and it seems to be fine
(unless it's not?).
So now the questions:
1. Is it possible that I messed anything up by forcefully killing it?
2. Why did it hang / how can I diagnose it / can I diagnose it?
3. I added noexpire to the news.daily cron line to avoid this in the future.
I guess there will be no consequences in my setup, as:
a) I'm using huge CNFS buffers for bulk storage (around 100 GB),
b) I'm using timehash for low-volume local groups (currently 53 MB),
c) my expire.ctl is set to never expire ("*:A:never:never:never").
Am I correct?
4. What is responsible for expiring history? Expireover or expire? I'd
guess the former (which I still keep enabled), but now I'm not sure
anymore.
BTW, I remember that this has happened before; here's a sample report.
Nobody complained then, which is strange because people use this server
constantly (maybe it wasn't throttled then?). Note how long expire took
to run:
expire begin Sat Oct 19 03:08:38 CEST 2024: (-v1)
Article lines processed 6057586
Articles retained 6051793
Entries expired 5793
expire end Sat Oct 19 16:15:51 CEST 2024
One earlier run was even longer:
expire begin Thu Oct 10 03:10:09 CEST 2024: (-v1)
Article lines processed 6012924
Articles retained 6007938
Entries expired 4986
expire end Fri Oct 11 19:58:42 CEST 2024
Most of the runs take at most two minutes, like this one:
expire begin Mon Oct 7 03:06:06 CEST 2024: (-v1)
Article lines processed 5986522
Articles retained 5979903
Entries expired 6619
expire end Mon Oct 7 03:07:51 CEST 2024
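To put numbers on those run times, the begin and end lines of each report can
be subtracted from each other. A minimal sketch, assuming both timestamps of
a run are in the same time zone (all of them say CEST here), so the zone token
is simply dropped, and assuming English month abbreviations; the helper names
are hypothetical:

```python
from datetime import datetime


def parse_stamp(line):
    """Parse an 'expire begin ...' or 'expire end ...' report line.

    Example input: 'expire begin Sat Oct 19 03:08:38 CEST 2024: (-v1)'.
    The weekday and the time zone token are ignored, so the result is a
    naive datetime in the server's local time.
    """
    tokens = line.split()
    month, day, clock = tokens[3], tokens[4], tokens[5]
    year = tokens[7].rstrip(":")  # the begin line has a trailing colon
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def run_duration(begin_line, end_line):
    """Return how long one expire run took, as a timedelta."""
    return parse_stamp(end_line) - parse_stamp(begin_line)


reports = [
    ("expire begin Sat Oct 19 03:08:38 CEST 2024: (-v1)",
     "expire end Sat Oct 19 16:15:51 CEST 2024"),
    ("expire begin Thu Oct 10 03:10:09 CEST 2024: (-v1)",
     "expire end Fri Oct 11 19:58:42 CEST 2024"),
    ("expire begin Mon Oct 7 03:06:06 CEST 2024: (-v1)",
     "expire end Mon Oct 7 03:07:51 CEST 2024"),
]
for begin, end in reports:
    print(run_duration(begin, end))
```

For the three reports above this works out to about 13 h 7 min (Oct 19),
40 h 49 min (Oct 10 to 11) and 1 min 45 s (Oct 7).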
Another question: why does it expire anything if I have expiration
disabled? I guess "Entries expired" should be 0 in all cases? Or is it
about removing history entries for articles that are no longer in CNFS, or
that are too old? (I have "remember" set to 721: "/remember/:721".)
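For reference, the two expire.ctl settings quoted in this post can be split
into their fields as below. This is only a sketch: the field names (pattern,
flag, min, default, max) are my reading of the expire.ctl man page, the
function is hypothetical, and it handles just the two line shapes that
appear here, not every form the file allows.

```python
def parse_expire_ctl_line(line):
    """Split one expire.ctl line into named fields.

    Handles only '/remember/:<days>' and the five-field
    '<pattern>:<flag>:<min>:<default>:<max>' form. Comments and blank
    lines yield None; anything else raises ValueError.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split(":")
    if len(fields) == 2 and fields[0] == "/remember/":
        return {"kind": "remember", "days": float(fields[1])}
    if len(fields) == 5:
        pattern, flag, minimum, default, maximum = fields
        return {"kind": "group", "pattern": pattern, "flag": flag,
                "min": minimum, "default": default, "max": maximum}
    raise ValueError(f"line shape not covered by this sketch: {line!r}")


for entry in ("/remember/:721", "*:A:never:never:never"):
    print(parse_expire_ctl_line(entry))
```

Seen this way, the "never" values apply to the group pattern line, while
/remember/ is a separate setting with its own day count.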
Thanks.