inn 2.7.0, expire takes ages and hangs

Subject: inn 2.7.0, expire takes ages and hangs
From: aw (at) *nospam* somewhere.invalid (Adam W.)
Newsgroups: news.software.nntp
Date: 29 Oct 2024, 14:47:54
Organization: news.chmurka.net
Message-ID: <5dcbedb5-81e3-4ba1-a218-ac52c626cb05-aw@news.chmurka.net>
User-Agent: tin/2.6.1-20211226 ("Convalmore") (Linux/6.1.21-v7+ (armv7l))

Hi,

Users started complaining that they're getting "400 Expiring process ..."
errors on my server. I looked and it's true: expire had been running since
cron started it at night, but it had hung. strace showed that the expire
process (expire -v1, called from news.daily) was waiting for data on a
socket:

$ strace -f -p 28158
strace: Process 28158 attached
select(6, [5], NULL, NULL, {tv_sec=28, tv_usec=42890}

It just waits on select, and does nothing else.

$ ls -l /proc/28158/fd/
total 0
lr-x------ 1 news news 64 Oct 29 14:23 0 -> 'pipe:[318604403]'
l-wx------ 1 news news 64 Oct 29 14:23 1 -> 'pipe:[318607484]'
lrwx------ 1 news news 64 Oct 29 14:23 10 -> /usr/local/news/spool/cnfs/cnfs1
lrwx------ 1 news news 64 Oct 29 14:23 11 -> /usr/local/news/spool/cnfs/cnfs2
lrwx------ 1 news news 64 Oct 29 14:23 12 -> /usr/local/news/spool/cnfs/cnfs3
lrwx------ 1 news news 64 Oct 29 14:23 13 -> /usr/local/news/spool/cnfs/cnfs4
lrwx------ 1 news news 64 Oct 29 14:23 14 -> /usr/local/news/spool/cnfs/cnfs5
lrwx------ 1 news news 64 Oct 29 14:23 15 -> /usr/local/news/spool/cnfs/junk1
lrwx------ 1 news news 64 Oct 29 14:23 16 -> /usr/local/news/spool/cnfs/ctrl1
lrwx------ 1 news news 64 Oct 29 14:23 17 -> /usr/local/news/spool/cnfs/priv1
lrwx------ 1 news news 64 Oct 29 14:23 18 -> /usr/local/news/spool/cnfs/big81
lrwx------ 1 news news 64 Oct 29 14:23 19 -> /usr/local/news/spool/cnfs/alt1
l-wx------ 1 news news 64 Oct 29 14:23 2 -> 'pipe:[318607484]'
lrwx------ 1 news news 64 Oct 29 14:23 20 -> /usr/local/news/spool/cnfs/spam1
lrwx------ 1 news news 64 Oct 29 14:23 21 -> /usr/local/news/spool/cnfs/spam2
l-wx------ 1 news news 64 Oct 29 14:23 22 -> /usr/local/news/db/history.n
lr-x------ 1 news news 64 Oct 29 14:23 23 -> /usr/local/news/db/history.n
lr-x------ 1 news news 64 Oct 29 14:23 24 -> /usr/local/news/db/history
lrwx------ 1 news news 64 Oct 29 14:23 3 -> '/tmp/#7734312 (deleted)'
lrwx------ 1 news news 64 Oct 29 14:23 4 -> 'socket:[318607485]'
lrwx------ 1 news news 64 Oct 29 14:23 5 -> 'socket:[318607492]'
lr-x------ 1 news news 64 Oct 29 14:23 6 -> /usr/local/news/db/history
lrwx------ 1 news news 64 Oct 29 14:23 7 -> /usr/local/news/db/history.n.dir
lrwx------ 1 news news 64 Oct 29 14:23 8 -> /usr/local/news/db/history.n.index
lrwx------ 1 news news 64 Oct 29 14:23 9 -> /usr/local/news/db/history.n.hash

$ cat /proc/28158/fdinfo/5
pos:    0
flags:  02
mnt_id: 9
scm_fds: 0
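
For anyone who wants to dig further: one way to find out what fd 5 is
connected to would be to take the inode from the socket:[...] link and look
it up in /proc/net/unix. Below is a minimal Python sketch under that
assumption. The helper names and the sample table are made up for
illustration (only inode 318607492 comes from the listing above), and the
socket could just as well be a TCP one, in which case /proc/net/tcp would be
the place to look instead.

```python
import re


def socket_inode(link_target):
    """Extract the inode from a /proc/<pid>/fd link like 'socket:[318607492]'.

    Returns None for anything that is not a socket link (pipes, files, ...).
    """
    m = re.fullmatch(r"socket:\[(\d+)\]", link_target)
    return int(m.group(1)) if m else None


def find_unix_socket(proc_net_unix_text, inode):
    """Look up an inode in the text of /proc/net/unix.

    Columns are: Num RefCount Protocol Flags Type St Inode [Path].
    Returns (inode, path), with an empty path for unnamed sockets,
    or None if the inode is not listed.
    """
    for line in proc_net_unix_text.splitlines():
        fields = line.split()
        # The header row never matches because "Inode" is not a number.
        if len(fields) >= 7 and fields[6] == str(inode):
            path = fields[7] if len(fields) > 7 else ""
            return inode, path
    return None


# Illustrative sample only: the first data row and its path are invented.
SAMPLE = (
    "Num       RefCount Protocol Flags    Type St Inode Path\n"
    "0000000000000000: 00000002 00000000 00010000 0001 01 12345 /run/example.sock\n"
    "0000000000000000: 00000003 00000000 00000000 0001 03 318607492\n"
)

inode = socket_inode("socket:[318607492]")
print(find_unix_socket(SAMPLE, inode))
```

On a live system the link text would come from os.readlink() on
/proc/<pid>/fd/5 and the table from reading /proc/net/unix, both while the
process is still hanging.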

I killed the process, restarted the server, and it seems to be fine
(unless it's not?).

So now the questions:

1. Is it possible that I messed anything up by forcefully killing it?

2. Why did it hang / how can I diagnose it / can I diagnose it?

3. I added noexpire to the news.daily cron line to avoid this in the future.
I guess there will be no consequences in my setup, as:

a) I'm using huge CNFS buffers for bulk storage (around 100 GB),
b) I'm using timehash for low-volume local groups (currently 53 MB),
c) my expire.ctl is set to never expire ("*:A:never:never:never").

Am I correct?

4. What is responsible for expiring history: expireover or expire? I'd
guess the former (which I still keep enabled), but now I'm not sure
anymore.

BTW, I remember that this has happened before; sample reports are below.
Nobody complained then, which is strange because people use this server
constantly (maybe it wasn't throttled then?). Note how long expire took to
run:

expire begin Sat Oct 19 03:08:38 CEST 2024: (-v1)
    Article lines processed  6057586
    Articles retained        6051793
    Entries expired             5793
expire end Sat Oct 19 16:15:51 CEST 2024

One earlier run was even longer:

expire begin Thu Oct 10 03:10:09 CEST 2024: (-v1)
    Article lines processed  6012924
    Articles retained        6007938
    Entries expired             4986
expire end Fri Oct 11 19:58:42 CEST 2024

Most of the runs take at most two minutes, like this one:

expire begin Mon Oct  7 03:06:06 CEST 2024: (-v1)
    Article lines processed  5986522
    Articles retained        5979903
    Entries expired             6619
expire end Mon Oct  7 03:07:51 CEST 2024
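
To put numbers on the three runs above, here is a small Python sketch that
computes the elapsed time from the "expire begin" / "expire end" lines. The
function names are my own, and the time zone is simply ignored, which is only
safe because every pair here is in CEST.

```python
import re
from datetime import datetime

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches e.g. "expire begin Mon Oct  7 03:06:06 CEST 2024: (-v1)"
STAMP = re.compile(
    r"expire (begin|end) \w{3} (\w{3})\s+(\d+) (\d\d):(\d\d):(\d\d) \w+ (\d{4})")


def parse_stamp(line):
    """Return ('begin' | 'end', naive datetime), or None for other lines."""
    m = STAMP.match(line)
    if m is None:
        return None
    kind, mon, day, hh, mm, ss, year = m.groups()
    when = datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    return kind, when


def run_durations(report):
    """Pair each 'begin' with the next 'end' and return the timedeltas."""
    durations = []
    begin = None
    for line in report.splitlines():
        parsed = parse_stamp(line.strip())
        if parsed is None:
            continue
        kind, when = parsed
        if kind == "begin":
            begin = when
        elif begin is not None:
            durations.append(when - begin)
            begin = None
    return durations


REPORT = """\
expire begin Sat Oct 19 03:08:38 CEST 2024: (-v1)
expire end Sat Oct 19 16:15:51 CEST 2024
expire begin Thu Oct 10 03:10:09 CEST 2024: (-v1)
expire end Fri Oct 11 19:58:42 CEST 2024
expire begin Mon Oct  7 03:06:06 CEST 2024: (-v1)
expire end Mon Oct  7 03:07:51 CEST 2024
"""

for d in run_durations(REPORT):
    print(d)
# prints:
# 13:07:13
# 1 day, 16:48:33
# 0:01:45
```

So the slow runs took about 13 hours and almost 41 hours, against under two
minutes for a normal one, with nearly the same number of history lines in
each case.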

Another question: why does it expire anything if I have expiration
disabled? I guess "Entries expired" should be 0 in all cases? Or is it
about removing history entries for articles that are no longer in CNFS (or
that are too old, I have "remember" set to 721: "/remember/:721")?
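
For reference, this is how I read the two expire.ctl entries I quoted, written
as a minimal Python sketch. It assumes the group-based line format
pattern:flag:min:default:max plus the special /remember/:days line; the
function name and the returned dictionary layout are hypothetical, and
nothing beyond these two forms is handled.

```python
def parse_expire_ctl_line(line):
    """Parse one expire.ctl line (simplified, two line forms only).

    Returns None for blank lines and comments, a dict for recognised
    entries, and raises ValueError for anything else.
    """
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None

    fields = line.split(":")

    # "/remember/:721" -> how long history remembers an article, in days
    if fields[0] == "/remember/" and len(fields) == 2:
        return {"type": "remember", "days": float(fields[1])}

    # "*:A:never:never:never" -> pattern, flag, then three retention
    # values, each either a number of days or the word "never"
    if len(fields) == 5:
        pattern, flag, keep_min, keep_default, keep_max = fields

        def days(value):
            return None if value == "never" else float(value)

        return {
            "type": "group",
            "pattern": pattern,
            "flag": flag,
            "min": days(keep_min),
            "default": days(keep_default),
            "max": days(keep_max),
        }

    raise ValueError("unrecognised expire.ctl line: " + line)


print(parse_expire_ctl_line("/remember/:721"))
print(parse_expire_ctl_line("*:A:never:never:never"))
```

Here None stands for "never", so the second entry means no group has a
retention limit, while the first one is independent of it.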

Thanks.

Date       Subject                                           #  Author
29 Oct 24  * inn 2.7.0, expire takes ages and hangs          6  Adam W.
2 Nov 24   +* Re: inn 2.7.0, expire takes ages and hangs     3  Matija Nalis
5 Nov 24   i+- Re: inn 2.7.0, expire takes ages and hangs    1  Adam W.
5 Nov 24   i`- Re: inn 2.7.0, expire takes ages and hangs    1  Adam W.
2 Nov 24   `* Re: inn 2.7.0, expire takes ages and hangs     2  Julien ÉLIE
5 Nov 24    `- Re: inn 2.7.0, expire takes ages and hangs    1  Adam W.
