In article <tWhGP.1790040$TBhc.304536@fx16.iad>,
EricP <ThatWouldBeTelling@thevillage.com> wrote:
Dan Cross wrote:
In article <t3CFP.529522$f81.217510@fx48.iad>,
EricP <ThatWouldBeTelling@thevillage.com> wrote:
Dan Cross wrote:
In article <8uzEP.115550$Xq5f.66883@fx38.iad>,
EricP <ThatWouldBeTelling@thevillage.com> wrote:
Dan Cross wrote:
In article <kVgEP.1277108$_N6e.605199@fx17.iad>,
EricP <ThatWouldBeTelling@thevillage.com> wrote:
Dan Cross wrote:
[snip]
Consider a thread that takes a spinlock; suppose some
high-priority interrupt comes in while the thread is holding
that lock. In response to the interrupt, software decides to
suspend the thread and switch to some other thread; that thread
wants to lock the spin lock that the now-descheduled thread is
holding: a classic deadlock scenario.
Terminology: mutexes coordinate mutual exclusion between threads,
spinlocks coordinate mutual exclusion between cpu cores.
Windows "critical sections" are mutexes with a fast path.
A spin lock is simply any lock where you spin trying to acquire
the lock, as opposed to a blocking synchronization protocol.
Here I'm using the terminology of Herlihy and Shavit [Her08].
I'm using terminology in common use on multiprocessor systems like
VMS since the 1980s and later WinNT.
>
Each OS data structure has different reentrancy requirements.
- Ones accessed by threads are guarded by mutexes.
Example on Windows would be the page tables as paging is done by threads.
- Ones accessed by OS software interrupt level are guarded by spinlocks.
Example on Windows is the Scheduler tables.
- Ones accessed by HW interrupts are guarded by interrupt spinlocks.
Example on Windows is device interrupt service routines.
I'm not sure what you're saying here, or rather, how it is
relevant.
>
Your earlier statement was that "mutexes coordinate mutual
exclusion between threads, spinlocks coordinate mutual exclusion
between cpu cores." But this is not the case.
>
Taking VMS as an example, from [Gol94], sec 9.5 (page 255):
>
|A thread of execution running on one processor acquires a
|spinlock to serialize access to the data with threads of
|execution running on other processors. Before acquiring
|the spinlock, the thread of execution raises IPL to block
|accesses by other threads of execution running on the same
|processor. The IPL value is determined by the spinlock being
|locked.
>
Goldenberg and Saravanan seem quite explicit that threads can
acquire and hold spinlocks on OpenVMS, which seems to be
something that you were objecting to?
It took me a while to find a copy of that book online.
That 1994 book relates to VMS before they added kernel threads to the OS.
Before you said you were using terminology commonly used on
systems "since [the] 1980s". The systems you cited were VMS and
Windows. I quoted references about both systems from the mid
1990s and mid 2000s, respectively. If that terminology were
common in the 1980s, as you suggested, surely it would be in a
book from 1994? Regardless, see below.
>
SMP OS (synchronized with spinlocks) was added to VMS in 1988.
VMS clusters were released in 1983 with its locking system and
cluster read-write locks can be used to build other synchronization
structures like mutexes, semaphores, quorums.
>
I don't know when the term "thread" meaning "scheduled units of execution
within a common address space" first showed up. I found a reference to it on
Unix in 1989. Previously the term "task" was used to mean the same thing,
such as Ada tasks which on DEC Ada in 1986 were similar to what are now
called user mode threads. It looks to me that the DEC Ada runtime library
was later morphed into DECThreads user mode library released in 1991.
The point is that if, as you assert, Windows and VMS both use
the terminology of _cores_ (or processors) acquiring and holding
spinlocks instead of threads doing so, there should be some
reference to that in those books. There is not. The
terminology on those systems is consistent: threads are the
things that hold spinlocks; it's axiomatic that those threads
are running on processors, since the whole point is that it is
faster to busy-wait on the spinlock rather than potentially
context switch as with a different mutual exclusion primitive,
like what VMS calls a mutex (which is a specific software
object, not the generic concept of a "mutex" as a mutual
exclusion primitive. In the latter context, a spinlock is a
kind of mutex; in the former, it's different).
At that point VMS scheduled processes, an execution context plus virtual
space, like most *nix (except I think Solaris which had kernel threads).
>
It says on pg 3: "The term thread of execution as used in this book
refers to a computational entity, an agent of execution."
This refers generally to the various code sequences from different context
that may be intertwined on a processor core for interrupts, exceptions,
Dpc (called driver forks in VMS), process AST's (software interrupts),
and applications.
>
In a later book on VMS after they added kernel scheduled threads they
consistently use the two terms "thread of execution" and "kernel thread"
to distinguish the two meanings.
Are you referring to the 1997 revision of Goldenberg et al,
that covers VMS 7.0? [Gol97] In that context, VMS "kernel
threads" refer to threads of execution within a program that
are managed and scheduled by the VMS executive (aka kernel).
>
I saw a reference to the 1997 internals book and that kernel threads were added.
However the only other OpenVMS internals document I could find was
OpenVMS Alpha Internals and Data Structures - Memory Management,
Goldenberg et al, 2003.
Ok, I have that one, too.
This is described on page 151, section 5.1:
|A thread of execution is a sequential flow of instruction
|execution. A traditional program is single-threaded,
|consisting of one main thread of execution. In contrast, a
|multithreaded program divides its work among multiple threads
|that can run concurrently. OpenVMS supplies DECthreads, a
|run-time library, to facilitate the creation, synchronization,
|and deletion of multiple user-mode threads.
|
|OpenVMS Alpha Version 7.0 supports multiple execution contexts
|within a process: each execution context has its own hardware
|context and stacks, and can execute independently of the others;
|all execution contexts within a process share the same virtual
|address space. The term _kernel thread_ refers to one of
|those execution contexts and the thread of execution running
|within it. This volume uses the term _multithreaded_ to refer
|to a process with multiple kernel threads.
|
|The kernel thread is now the basis for OpenVMS scheduling.
|[...]
>
This "kernel thread" is the definition I have been using:
unit of execution *scheduled* by the operating system.
But that's not particularly relevant. The interesting bit vis-à-vis
spinlocks in the kernel is what threads are running in kernel
mode, and how those threads synchronize between themselves to
control access to kernel objects.
It continues on page 152:
|The term _kernel thread_ is a somewhat confusing name, in that
|instructions executing in a kernel thread can run in any of the
|four access modes, not only kernel mode. The term arose to
|distinguish the current multithreaded model, supported within
|the OpenVMS executive (or kernel), from the previous model,
|supported only by DECthreads.
Sadly, the later books on OpenVMS internals are more focused on
specific subsystems (scheduling and process management; memory
management) than the earlier volumes. Details of the
implementation of synchronization primitives are almost entirely
missing.
There is discussion about how those primitives are _used_,
however. [Gol03] mentions spinlocks, such as the SCHED
spinlock, in the context of memory management data structures.
E.g., on page 51:
|The swappability of the PHD [Process Header] results in several
|different methods for synchronizing access to fields within it.
|Because a PHD can be inswapped to a different balance set slot
|than it last occupied, accesses to a PHD that use its system
|space address must be synchronized against swapper
|interference. Accesses from a kernel thread to its own PHD can
|be made with the SCHED spinlock held to block any rescheduling
|and possible swapping of the process. Holding the MMG spinlock
|is another way to block swapping.
Notably, the language here is again in terms of (software)
threads. It seems clear enough that in the VMS context,
spinlocks are conceptually held by threads, not cores.
>
Also Gol91 VAX VMS Internals and Data Structures v5.2 1991
>
All three, Gol91, Gol94, Rus12 have a section on spinlocks and
explicitly say spinlocks synchronize processors, not threads.
>
Gol91 pg 172, Gol94 pg 254: "A spinlock is acquired by a processor to
synchronize access to data shared by members of an SMP system."
Also [Gol91], page 173:
|A resource synchronized through elevated IPL on a uniprocessor
|is synchronized through a combination of spinlock and elevated
|IPL on an SMP system. *A thread of execution* running on one
|processor acquires a spinlock to serialize access to the data
|with *threads of execution* running on other processors.
|Before acquiring, the _thread of execution_ raises IPL to block
|accesses *by other threads of execution* running on the *same*
|processor.
Emphasis added. Note that the entities that do the acquiring
and holding are threads.
(Note, I already quoted this text from [Gol94] in an earlier
response; page 255.)
Rus12 pg 179: "The mechanism the kernel uses to achieve multiprocessor
mutual exclusion is called a spinlock."
Yes, I already pointed out that this is the usual case. But you
are still ignoring the uniprocessor case: spinlocks must work
there, as well.
But more relevantly, this does not imply that _processors_ "own"
spinlocks in the way you have asserted. That is done by
threads. A thread does not magically "turn into" a processor
when it acquires a spinlock; it remains a thread, albeit one
that is now holding a spinlock that it wasn't before.
Furthermore a search for the word "spinlock " in all three documents
finds repeated references to "processor spinlock". There are NO usages
of "process spinlock" on VMS or "thread spinlock" on WinNT.
I don't know what "process spinlock" you are referring to on VMS
or why you are mentioning it; this seems to be a strawman.
With respect to Windows, unless you have some very specific
meaning in mind for the term "thread spinlock", if one takes
this to be a generic search for the two terms "thread" and
"spinlock" to see if they occur together, this is demonstrably
false: I have repeatedly provided specific references and
quotations showing that Windows refers to _threads_ acquiring
spinlocks.
It is mutexes that synchronize VMS processes or WNT threads.
I don't know what processes on VMS have to do with it. The
available documentation quite clearly says that spinlocks are
one of the synchronization mechanisms used by VMS to serialize
access to critical sections among multiple threads of execution,
both on the same processor and on other processors in
multiprocessor configurations. Remember: my objection to your
terminology is that cores/processors are the things that hold
spinlocks.
The difference is spinlocks do not invoke the scheduler to trigger
an execution context switch, whereas mutexes may invoke the scheduler.
That's obvious, and I've stated as much repeatedly. Refer to my
earlier posts where I drew a distinction between blocking
lock protocols as opposed to spinning protocols. Actually, to
quote myself from earlier in the thread:
|A spin lock is simply any lock where you spin trying to acquire
|the lock, as opposed to a blocking synchronization protocol.
|Here I'm using the terminology of Herlihy and Shavit [Her08].
|
|Traditional Unix "sleep" and "wakeup" are an example of a
|blocking protocol, where a thread may "sleep" on some
|"channel", yielding the locking thread while the lock cannot be
|acquired, presumably scheduling something else until the thread
|is later marked runnable by virtue of something else calling
|"wakeup" on the sleep channel.
There is a section on mutexes in
Gol91 pg 196, Gol94 pg 277, Rus12 pg 183 Low-IRQL Synchronization
This is all presented wildly out of context, and doesn't support
your earlier assertion, that spinlocks are held by cores and not
threads.
Gol91 pg 25, Gol94 pg 29
"Thus, the VMS executive requires a third synchronization tool to allow
synchronized access to pageable data structures. This tool must also allow
a process to be removed from execution while it maintains ownership of the
structure in question. One synchronization tool that fulfills these
requirements is called a mutual exclusion semaphore (mutex)."
Yup. Use of VMS "mutex" primitives is a blocking protocol, as
described above.
Gol91 pg 196, Gol94 pg 277
"However, in some situations requiring synchronization, elevated IPL is an
unacceptable technique."
...
"Typically, the threads of execution whose accesses are synchronized
through a mutex are process context threads."
>
Rus12 pg 183 Low-IRQL Synchronization
"Because waiting for a spinlock literally stalls a processor, spinlocks
can be used only under the following strictly limited circumstances"
You omitted important context that makes this clearer. This
whole section begins, "Executive software outside the kernel
also needs to synchronize access to global data structures in
a multiprocessor environment." Here, they're describing other
mechanisms for achieving synchronization between competing
software running on different CPUs.
Here's the full quote corresponding to the snippet above:
|Spinlocks only partially fill the executive's need for
|synchronization mechanisms, however. Because waiting for a
|spinlock literally stalls a processor, spinlocks can only be
|used under the following strictly limited circumstances:
|
|* The protected resource must be accessed quickly and without
| complicated interactions with other code.
|* The critical section code can't be paged out of memory,
| can't make references to pageable data, can't call external
| procedures (including system services), and can't generate
| interrupts or exceptions.
(One presumes they mean synchronous interrupts, here; as we've
seen, asynchronous interrupts are blocked by means of setting
the IRQL.)
But this is all well-known; and the first point is also
something I've said repeatedly throughout this discussion. Quoting
myself again:
|Spinlocks, of the form I was describing, are an optimization
|for cases where the critical section guarded by the lock is
|short, and the overhead of invoking a scheduler and context
|switching to some other thread is higher than just spinning
|until the lock becomes available again. The whole point is
|to avoid blocking and switching.
These criteria from Russinovich et al are entirely consistent
with this.
"There are several additional synchronization mechanisms for use when
spinlocks are not suitable:
- Kernel dispatcher objects
- Fast mutexes and guarded mutexes
- Pushlocks
- Executive resources
>
Additionally, user-mode code, which also executes at low IRQL, must be able
to have its own locking primitives. Windows supports various
user-mode-specific primitives:
- Condition variables (CondVars)
- Slim Reader-Writer Locks (SRW Locks)
- Run-once initialization (InitOnce)
- Critical sections"
>
These books consistently refer to spinlocks as synchronizing *processors*,
No, they really do not, as the other direct quotations I have
provided show.
Spinlocks synchronize threads of execution, on both
uniprocessor and multiprocessor systems. On the latter, they
synchronize between multiple threads of execution running either
on the same processor or on different processor cores; on
uniprocessors, they synchronize the threads of execution
interleaved on the single processor (together with raised IPL,
as described above).
Really, this is standard and normal, and is pretty much how all
relevant systems describe it, including Windows and VMS.
aka cores, at raised IPL (>= dispatch), and mutexes synchronize scheduled
processes/threads at lower IPL (< dispatch).
See above.
The reason is that when a kernel thread, executing a system call,
raises the processor >= dispatch, which is required to acquire a spinlock,
it stops being a scheduled thread, and may have temporarily switched
kernel stacks as processor interrupt levels require their own stacks.
I think perhaps you mean that processor _modes_, which on x86_64
means the processor's current protection ring, require their own
stacks. Windows supports three broad categories of stacks: for
each kernel-managed thread, a user stack and a kernel stack, and
for each physical processor, a DPC stack (since, as Russinovich
et al put it, "DPC's run in arbitrary thread context.")
When a thread is running in user mode (ring 3), it's running on
its user-mode stack; when running in kernel mode (ring 0), it's
running on its kernel-mode stack. DPCs are run on their own
stack (presumably switched to via some kind of trampoline).
Systems running on x86 usually also have a handful of special
stacks for specific types of interrupts (NMIs, double-fault
handlers, debug interrupts, and machine-check exceptions come to
mind). A field in the IDT entry for any given
interrupt/exception vector points to the index in the interrupt
stack table in the TSS that should be used for that
interrupt/exception; if not specified, the default ring 0 stack
pointed to by the RSP0 entry is used. If a trap occurs while
running in ring 3, the appropriate stack for the trap will be
selected and the processor will switch to it automatically. If
already running in ring 0, this won't happen unless the trap
vector entry in the IDT specifies a non-zero IST index, in which
case that stack will be selected and switched to
unconditionally. Note that there are also fields in the TSS for
stacks for rings 1 and 2, but almost no one uses those (Windows
does not), in large part because the page table format used by
Intel and AMD only distinguishes between 0 and 3 for page-level
permissions.
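For concreteness, the structures involved look like this (layouts
are as the Intel SDM and AMD APM describe them; the struct and
field names are mine):

#include <stdint.h>

/* 16-byte long-mode IDT gate descriptor. */
struct idt_gate {
    uint16_t offset_low;   /* handler address, bits 0..15 */
    uint16_t selector;     /* kernel code segment selector */
    uint8_t  ist;          /* bits 0..2: IST index; 0 means no
                              unconditional stack switch */
    uint8_t  type_attr;    /* gate type, DPL, present bit */
    uint16_t offset_mid;   /* handler address, bits 16..31 */
    uint32_t offset_high;  /* handler address, bits 32..63 */
    uint32_t reserved;
} __attribute__((packed));

/* 104-byte long-mode TSS. */
struct tss64 {
    uint32_t reserved0;
    uint64_t rsp0;         /* default ring 0 stack for ring 3 -> 0 traps */
    uint64_t rsp1;         /* rings 1 and 2; present but rarely used */
    uint64_t rsp2;
    uint64_t reserved1;
    uint64_t ist[7];       /* IST1..IST7; switched to unconditionally
                              when the IDT entry names a non-zero index */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopb_offset;  /* I/O permission bitmap offset */
} __attribute__((packed));

_Static_assert(sizeof(struct idt_gate) == 16, "IDT gate is 16 bytes");
_Static_assert(sizeof(struct tss64) == 104, "long-mode TSS is 104 bytes");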
The sysenter/syscall mechanism is a little more involved here
(in particular, it does not automatically load the stack pointer
register to point to the kernel-mode stack), and the system call
entry point (e.g., the address pointed to by the LSTAR MSR on
x64 when considering SYSCALL) must be carefully written.
Since on entry to the code pointed at by LSTAR the stack pointer
for the currently executing thread is not changed, %rsp still
points to the user-mode stack. But the processor is in kernel
mode, so interrupt delivery won't generally trigger switching to
the stack in RSP0 in the TSS (if specified in the IDT entry for
a given trap, switching to one of the other stacks in the IST
will happen, but those are typically reserved for a limited set
of traps that require special handling with respect to stacks;
there are only seven IST entries in the TSS anyway). So a
system may inhibit interrupt delivery for a short time until it
has saved user-mode state and set up its kernel-mode stack.
But systems need not inhibit interrupt delivery while servicing
most system calls, and indeed, they often do not. Importantly,
inhibiting interrupt delivery initially isn't to avoid being
rescheduled, but rather to protect the kernel from faulting if
the user mode stack is trashed (and to avoid accidentally
leaking kernel data into user mode via taking an interrupt on
the user-mode stack).
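To be concrete about that last part, here are the relevant MSRs and
a sketch of how a kernel might arrange for IF to be cleared on
SYSCALL entry. The MSR numbers are architectural; wrmsr() and
setup_syscall_entry() are illustrative names of my own (and WRMSR
is of course a privileged instruction, so this only makes sense in
ring 0):

#include <stdint.h>

#define MSR_IA32_STAR   0xC0000081u  /* ring 0/ring 3 segment bases */
#define MSR_IA32_LSTAR  0xC0000082u  /* 64-bit SYSCALL entry point */
#define MSR_IA32_FMASK  0xC0000084u  /* RFLAGS bits cleared on SYSCALL */

#define RFLAGS_IF       (1u << 9)    /* interrupt enable flag */

static inline void
wrmsr(uint32_t msr, uint64_t value)
{
    uint32_t lo = (uint32_t)value, hi = (uint32_t)(value >> 32);
    __asm__ volatile("wrmsr" :: "c"(msr), "a"(lo), "d"(hi));
}

/* On SYSCALL the CPU saves RIP in RCX and RFLAGS in R11, loads RIP
   from LSTAR, and clears in RFLAGS every bit set in FMASK.  Setting
   IF in FMASK means the entry stub runs with interrupt delivery
   inhibited until it has switched to its kernel stack and executes
   STI (or returns). */
static void
setup_syscall_entry(uint64_t entry_stub)
{
    wrmsr(MSR_IA32_LSTAR, entry_stub);
    wrmsr(MSR_IA32_FMASK, RFLAGS_IF);
}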
At raised IPL many of the OS data structures associated with a kernel thread
cannot be touched as they might trigger a page fault which can only be
serviced by a kernel thread which can sleep, and not by processors/cores
which can only spin-wait. (Both VMS and WNT will crash/panic if code
page faults at IPL >= dispatch for exactly this reason.)
No, that's not really the issue.
A page fault is a synchronous exception; similarly with a
software interrupt (e.g., the effect of the `INT` instruction on
x86). These are always delivered regardless of what the current
thread has set the interrupt priority threshold to on the
current processor, or whether it has locally inhibited interrupt
delivery (e.g., via the `CLI` instruction on x86_64). If such
were to be allowed while the current thread is holding a
spinlock, then the trap handler could try and acquire that same
spinlock (perhaps indirectly), which would result in a deadlock.
Similarly with why threads holding spinlocks shouldn't invoke
system calls themselves.
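A trivial sketch of that failure mode (the lock name and both
functions here are made up for illustration; no real kernel is this
naive):

#include <stdatomic.h>

static atomic_flag pfn_db_lock = ATOMIC_FLAG_INIT;

static void
lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;               /* spin */
}

static void
unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Delivered synchronously, no matter what the interrupted code has
   done about (asynchronous) interrupt delivery or IPL/IRQL. */
void
page_fault_handler(void)
{
    lock(&pfn_db_lock);         /* needs the same lock... */
    /* ...resolve the fault... */
    unlock(&pfn_db_lock);
}

void
some_kernel_path(void)
{
    lock(&pfn_db_lock);
    /* If anything touched here is pageable, the resulting fault runs
       page_fault_handler() on top of this very context, which then
       spins forever on a lock that its own interrupted thread holds:
       deadlock.  Hence the rule that code holding a spinlock must
       not touch pageable data or otherwise trap. */
    unlock(&pfn_db_lock);
}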
The statement, "...and not by processors/cores which can only
spin-wait" is not meaningful. Again "processors" and "cores"
are hardware devices, not software objects. They are not
restricted to spin-waiting; they just execute instructions and
do other processor-y things.
You continue to try and force these terms to mean something else
but still have not produced any reference that uses those terms
in that manner.
All of this is for controlling access to non-reentrant shared data.
Non-reentrant data shared between processors/cores is controlled by
IPL and blocked by spinlocks. Non-reentrant data shared between kernel
threads is controlled with scheduler synchronization and thread switching.
I'm afraid you've got the details rather muddled.
For instance, saying that, "non-reentrant data shared between
processors/cores is controlled by IPL and blocked by spinlocks"
is rather meaningless; those words just don't relate to each
other that way.
The correct thing to say is that spinlocks are mutual exclusion
structures that can be used to serialize access to data shared
between multiple concurrent threads of execution, running either
on the same processor or in parallel across multiple processors.
You also don't explain how "non-reentrant data shared between
kernel threads" is appreciably different from, "non-reentrant
data shared between processors/cores." For that matter, one
does not usually speak of _data_ as being reentrant; rather,
reentrancy is a concept one usually applies to programs.
I think this stems from this idea you seem to have that threads
somehow turn into "processors/cores" (whatever that means, as
these are obviously not the hardware devices) when interrupts
are disabled on the CPUs they are running on. Near as I can
tell this is your own unique invention, and nothing else uses
that terminology; use by the systems you cited is not supported
by evidence.
Consequently, I find your repeated assertions about these terms
rather strange.
- Dan C.