|
|
 |
 |
 |
 |
Scheme Programming Language
|
 |
 |
 |
 |
 |
 |
 |
 |
OS Continuations and the Incredible Disappearing Grad Student
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 It's a long posting, so I hope that I haven't violated any community rules and that this problem will interest somebody enough to comment. To match the title, I'll start off with a quoted blog posting retrieved from the ether by the Wayback Machine from Pinku Surana. The actual blog seems to have disappeared when Northwestern University's CS department reorganized their website.
> OS & Continuations > Microkernel operating systems should expose continuations, not > threads, as the low-level interface to the kernel. Threads should be > provided as a library in user-space layered on top of the > microkernel. Richard Draves' thesis demonstrated that using > continuations as a programming technique within the MACH 3.0 kernel > simplified the code and vastly improved performance. Randall Dean got > good results when he modified the C-Threads library (user-space > threads) to use continuations. If these results are valid, then the > L4 microkernel's API should expose continuations, and the OS layered > on top (L4Linux) should use continuations to implement high-level OS > features like LWP and scheduling. > Continuations should be useful in other systems projects. For > example, Apache (AFAIK) is multithreaded so it can handle a pipeline > of requests without blocking. Though this is the right goal, they > might be able to create a queue of continuations instead, each > representing a request somewhere in the pipeline. By using > continuations the way the MACH group did they might be able to save > memory space, reduce code complexity and reduce critical section > size. All of this will improve performance. In addition, you no > longer rely on various OS threading interfaces, policies, and bugs. > I should mention that the MACH group implemented first-class > continuations, but they did not use the expensive, generic technique > employed by Scheme/SML. These languages save the entire stack - in > some cases they duplicate the stack - to preserve all possible > information required to continue the computation. In addition, the > compiler can't determine when a continuation is no longer valid, thus > creating space leaks and weird exception handling problems. MACH > explicitly maintained the data they needed for a continuation so they > could, in many cases, dispose of the stack. In fact, I think their > technique could be described as a very efficient implementation of > thread pools (w/o the threads, of course). Posted by surana at > February 28, 2003 10:51 AM
The significance is that I came to the same conclusion as I was working on my microkernel. I wanted it to be so small that the timer driver and scheduler (if there *was* a scheduler) ran in untrusted user-space servers. Hence, I needed a concurrency mechanism that could work without an OS-level scheduler. Here's the place for the alt.os.development guys to get interested. I discovered that such a thing could be done with threads. I could implement two operators: Fork() - Unix-style fork that returns the identifier of the child thread or an identifier of the invoking thread. Invoke(thread) -> identifier - Runs thread, passing it the identifier of the current thread (the one that called Invoke). Invoking a thread causes it to return the "current thread identifier" either from the Fork() that created it or the last Invoke() it called. Lispers take note: doesn't that Invoke look like applying call/cc to a continuation? I noticed the resemblance, and decided to switch from threads over to continuations. Doing so instantly made kernel programming more elegant. Instead of threads that could be valid to operate upon (ie: contain program state) *or not* at any time, I ended up with "virtual processors" that contain the OS state of a running continuation and "saved continuations" that contain the state of reified continuations. A continuation is the contents of the saved processor. Of course, some concessions must be made to the fact that we're talking about an operating system kernel written in Object Pascal that wants to interoperate with C -- lack of garbage collection and closures and inefficiency of copying a full machine continuation (ie: all user- and kernel-level machine state one would normally find in a thread or process). So I end up with this set of operators: CallCC(entry,closure) -> value - Calls procedure entry of two parameters, one a continuation and the other a pointer, passing the reified calling continuation as the first argument and closure as the second (to provide for pseudo-closures in non-closure languages). Entry is called in an entirely new continuation created by the kernel. Throw(continuation,value) - Passes value to continuation and throws to continuation, causing the original CallCC to return value. Destroys the calling continuation. CallCC(continuation) -> value - Throws to continuation, passing it the current continuation (which is not destroyed) instead of an arbitrary value. Copy(continuation) -> continuation - Copies the given continuation. Destroy(continuation) - Destroys the given continuation. These are one-shot continuations, but given Copy implementing multiple-use continuations in user-space shouldn't be very tough. Lispers, you come in *here*, as we start talking about the use of continuations for concurrency. Concurrency (on shared-memory multiprocessor systems) comes easily: simply create a virtual processor for each physical processor the operating system sees. A master processor starts each one up with a continuation, and they pretty much run themselves concurrently. If a scheduler is present in user-space, any number of virtual processors can access it by running continuations that act as coroutines of that scheduler. The kernel supplies mutexes based on a TSL instruction. Of course, if that seems like a stupid concurrency model to you, only say so! Now here's the hard bit: asynchronous events. I define them here as tasks that run in response to asynchronous events or signals from hardware or software, ie: generalized interrupt handlers. In most microkernel systems (like L4) the kernel simply translates an event into a message that gets sent to some handler process, which handles the event by receiving the message in its own time. But we're talking about a continuation-based kernel, in which there's no scheduler to run that handler process. Event handlers also need some mechanism to handle the task they interrupted to implement more conventional preemptive multiprocessing models. That can easily be accomplished by running an event handler via CallCC. The mechanism I use to implement a continuation switch runs at the completion of an Interrupt Service Routine, so handling interrupts with events with CallCC's leaves a pretty low interrupt latency. Now the question is: *how do you make that fast*? Interrupt handlers, even if they don't run in the ISR itself, must run fast enough that the interrupted continuation doesn't suffer unduly. How do you think that should work? The next bit is the solution I worked out, and an alternative that just occurred to me. <Spoilers> I decided to bring back the persistence (continued existence across operations) of threads by creating "persistent continuations". These come in one of two states: "filled" or "empty". References to a persistent continuation from user-space are only valid when that continuation is in the "filled" state, which it only obtains from being reified by CallCC. However, internal references to the continuation held by the kernel are always valid, because the actual lifetime of the persistent continuation is managed by reference counting. Now here's the trick: Throw()'ing a persistent continuation saves the current continuation from the virtual processor to the persistent continuation structure instead of deleting the continuation, but sets the continuation's state to "empty". So the user thinks the continuation no longer exists, while the kernel event-handling machinery sees a continuation filled with data. Then, when the event handler must be run again, the kernel passes the "empty" persistent continuation to CallCC, resuming it. In short: event handlers act like threads (running, suspending themselves, and running again), while normal computations act like continuations (running and terminating or running and reifying). Of course, being able to CallCC to a continuation that suspended itself on a Throw means somehow passing the Current Continuation that puts the CC in there to the called continuation passed explicitly to CallCC. Put simply, Throw needs to start returning a continuation, *but only for persistent continuations*. The normal kind get destroyed as before. The kernel internally converts a normal saved continuation to a persistent continuation when it saves that continuation as an event handler. Note that by CallCC'ing to a continuation on an event and Throw'ing to a continuation (the interrupted one or a new one) at the end of the event handler, saving the state of the event handler in a continuation as a side-effect, these performance gains are made: 1.No continuations are created when an event is handled, except by explicit request of user code. This includes Copy'ing. 2.No continuations are destroyed when an event is handled, except by explicit request of user code. 3.Event handling system calls only need to be invoked once for a continuation to handle the given event forever. 4.The total overhead of an event handler (time and dynamically-allocated memory not spent in the interrupted continuation or the event-handler continuation) drops no memory at all and two context switches (one to the handler, one out of it). But now Throw() is a leaky abstraction, a routine that sometimes returns and sometimes doesn't according to the whims of the kernel. Any more elegant way to handle asynchronous events quickly? </Spoilers> Alternative
... read more »
When you talk about kernel-space vs. user-space, what do you mean exactly? You said that the kernel is written in Object Pascal so I assume it runs on top of DOS/Windows; is that correct? Or does it run on bare hardware? Also, how is user code evaluated? Is it also written in Pascal? Or is it interpreted by the kernel? Aziz,,,
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Abdulaziz Ghuloum wrote: > When you talk about kernel-space vs. user-space, what do you mean > exactly? You said that the kernel is written in Object Pascal so I > assume it runs on top of DOS/Windows; is that correct? Or does it run > on bare hardware? Also, how is user code evaluated? Is it also > written in Pascal? Or is it interpreted by the kernel? > Aziz,,,
Kernel-space vs. user-space is a distinction made by hardware between code that runs with the ability to perform privileged instructions (such as changing the control registers and turning interrupts on or off) and code that lacks such abilities. We call the latter user-space, because user applications run at that privilege level, and the former kernel-space, because the kernel alone runs with that much privilege. *Object* Pascal. Pascal hasn't been interpreted for 30 years! The kernel compiles to binary and runs on bare metal. User code comes in binary form (just like on any other operating system) and runs in that form. It can be written in any language that compiles to binary, and requires no interpretation by the kernel or anyone else. So that, hopefully, gives you a good idea of what kind of environment you're dealing with here. User programs run as binary in unprivileged mode, as on most major operating systems. This means that the "current continuation" is defined as the contents of the processor registers, the current address space, and the current stack. Eli -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGDoCxXxM/2T/QnN0RAmQfAJ4thSziuBq4u1xKv9nqCnJZ00OsFQCgrVcH 1VlzH+rpEOJmLCr/13jP7VI= =gunk -----END PGP SIGNATURE-----
I'm no expert in OS development, but your idea reminds me a popular technique of developing console/arcade video games which have historically run close to bare metal (and been required to run very efficiently). Essentially it is to write the whole program in CPS---you break things into small chunk of tasks, and whenever you ask system-level service or your job can yield CPU, you call the system routine with the continuation of your job. A good thing is that you can pass several continuations to the system if necessary (e.g. for event handlers). Writing everything in CPS manually is a bit of pain, though, and I guess a compiler might help here, translating user code that uses call/cc and continuation invocation into CPS. --shiro
On Fri, 30 Mar 2007 23:26:36 -0400, Eli Gottlieb
<eligottl @gmail.com> wrote: >To match the title, I'll start off with a quoted blog posting retrieved >from the ether by the Wayback Machine from Pinku Surana. The actual >blog seems to have disappeared when Northwestern University's CS >department reorganized their website. >> OS & Continuations >> Microkernel operating systems should expose continuations, not >> threads, as the low-level interface to the kernel. Threads should be >> provided as a library in user-space layered on top of the >> microkernel. Richard Draves' thesis demonstrated that using >> continuations as a programming technique within the MACH 3.0 kernel >> simplified the code and vastly improved performance. Randall Dean got >> good results when he modified the C-Threads library (user-space >> threads) to use continuations. If these results are valid, then the >> L4 microkernel's API should expose continuations, and the OS layered >> on top (L4Linux) should use continuations to implement high-level OS >> features like LWP and scheduling. >> : >> I should mention that the MACH group implemented first-class >> continuations, but they did not use the expensive, generic technique >> employed by Scheme/SML. These languages save the entire stack - in >> some cases they duplicate the stack - to preserve all possible >> information required to continue the computation. In addition, the >> compiler can't determine when a continuation is no longer valid, thus >> creating space leaks and weird exception handling problems. MACH >> explicitly maintained the data they needed for a continuation so they >> could, in many cases, dispose of the stack. In fact, I think their >> technique could be described as a very efficient implementation of >> thread pools (w/o the threads, of course). Posted by surana at >> February 28, 2003 10:51 AM
This is not entirely correct. Draves' and Dean's "continuations" are simply a known callback function. They are implemented in the protected kernel space - the program cannot access them - and in Scheme terms they are one-shot, escape (upward) continuations only. In their system, synchronous (blocking) calls are split into an entry function and an exit (continuation) function. The entry function constructs a callback to the exit function and then initiates an asynchronous operation which results in an interrupt (which may be handled by any thread). When the operation is complete and the "continuation" becomes runnable, it may be resumed by whatever thread is current by simply calling the exit function. This setup allowed them to dispense with per-thread kernel stacks and enabled any thread to resume any computation because the thread kept no in-kernel context. According to the paper, this proved advantageous performance-wise for RPC and especially for (Mach) exception handling. The original paper is here: http://citeseer.ist.psu.edu/draves91using.html Another paper in a similar vein that may be of interest is "Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism". This paper describes a method for allowing the kernel to directly support user-space thread management. It can be found here: http://citeseer.ist.psu.edu/anderson92scheduler.html
>The significance is that I came to the same conclusion as I was working >on my microkernel. I wanted it to be so small that the timer driver and >scheduler (if there *was* a scheduler) ran in untrusted user-space >servers. Hence, I needed a concurrency mechanism that could work >without an OS-level scheduler. >Here's the place for the alt.os.development guys to get interested. >I discovered that such a thing could be done with threads. I could >implement two operators: >Fork() - Unix-style fork that returns the identifier of the child thread >or an identifier of the invoking thread. >Invoke(thread) -> identifier - Runs thread, passing it the identifier of >the current thread (the one that called Invoke). Invoking a thread >causes it to return the "current thread identifier" either from the >Fork() that created it or the last Invoke() it called. >Lispers take note: doesn't that Invoke look like applying call/cc to a >continuation? I noticed the resemblance, and decided to switch from >threads over to continuations. >Doing so instantly made kernel programming more elegant. Instead of >threads that could be valid to operate upon (ie: contain program state) >*or not* at any time, I ended up with "virtual processors" that contain >the OS state of a running continuation and "saved continuations" that >contain the state of reified continuations. A continuation is the >contents of the saved processor. >Of course, some concessions must be made to the fact that we're talking >about an operating system kernel written in Object Pascal that wants to >interoperate with C -- lack of garbage collection and closures and >inefficiency of copying a full machine continuation (ie: all user- and >kernel-level machine state one would normally find in a thread or >process). So I end up with this set of operators: >CallCC(entry,closure) -> value - Calls procedure entry of two >parameters, one a continuation and the other a pointer, passing the >reified calling continuation as the first argument and closure as the >second (to provide for pseudo-closures in non-closure languages). Entry >is called in an entirely new continuation created by the kernel. >Throw(continuation,value) - Passes value to continuation and throws to >continuation, causing the original CallCC to return value. Destroys the >calling continuation. >CallCC(continuation) -> value - Throws to continuation, passing it the >current continuation (which is not destroyed) instead of an arbitrary value. >Copy(continuation) -> continuation - Copies the given continuation. >Destroy(continuation) - Destroys the given continuation. >These are one-shot continuations, but given Copy implementing >multiple-use continuations in user-space shouldn't be very tough. >Lispers, you come in *here*, as we start talking about the use of >continuations for concurrency. >Concurrency (on shared-memory multiprocessor systems) comes easily: >simply create a virtual processor for each physical processor the >operating system sees. A master processor starts each one up with a >continuation, and they pretty much run themselves concurrently. If a >scheduler is present in user-space, any number of virtual processors can >access it by running continuations that act as coroutines of that >scheduler. The kernel supplies mutexes based on a TSL instruction. >Of course, if that seems like a stupid concurrency model to you, only >say so! >Now here's the hard bit: asynchronous events. I define them here as >tasks that run in response to asynchronous events or signals from >hardware or software, ie: generalized interrupt handlers. In most >microkernel systems (like L4) the kernel simply translates an event into >a message that gets sent to some handler process, which handles the >event by receiving the message in its own time. >But we're talking about a continuation-based kernel, in which there's no >scheduler to run that handler process. Event handlers also need some >mechanism to handle the task they interrupted to implement more >conventional preemptive multiprocessing models. >That can easily be accomplished by running an event handler via CallCC. > The mechanism I use to implement a continuation switch runs at the >completion of an Interrupt Service Routine, so handling interrupts with >events with CallCC's leaves a pretty low interrupt latency. >Now the question is: *how do you make that fast*? Interrupt handlers, >even if they don't run in the ISR itself, must run fast enough that the >interrupted continuation doesn't suffer unduly. >How do you think that should work? The next bit is the solution I >worked out, and an alternative that just occurred to me.
It sounds like you've reinvented and expanded upon a fair bit of the machinery from the papers I referred to above. ><Spoilers> >I decided to bring back the persistence (continued existence across >operations) of threads by creating "persistent continuations". These >come in one of two states: "filled" or "empty". References to a >persistent continuation from user-space are only valid when that >continuation is in the "filled" state, which it only obtains from being >reified by CallCC. However, internal references to the continuation >held by the kernel are always valid, because the actual lifetime of the >persistent continuation is managed by reference counting.
I am uncomfortable with user-space programs having access to the continuations - it seems unnecessary from the OS's point of view and it complicates the implementation by asking the question: what is the program allowed to do with them?
>Now here's the trick: Throw()'ing a persistent continuation saves the >current continuation from the virtual processor to the persistent >continuation structure instead of deleting the continuation, but sets >the continuation's state to "empty". So the user thinks the >continuation no longer exists, while the kernel event-handling machinery >sees a continuation filled with data. Then, when the event handler must >be run again, the kernel passes the "empty" persistent continuation to >CallCC, resuming it. >In short: event handlers act like threads (running, suspending >themselves, and running again), while normal computations act like >continuations (running and terminating or running and reifying). >Of course, being able to CallCC to a continuation that suspended itself >on a Throw means somehow passing the Current Continuation that puts the >CC in there to the called continuation passed explicitly to CallCC. Put >simply, Throw needs to start returning a continuation, *but only for >persistent continuations*. The normal kind get destroyed as
... read more »
Eli Gottlieb wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > It's a long posting, so I hope that I haven't violated any community > rules and that this problem will interest somebody enough to comment. > To match the title, I'll start off with a quoted blog posting retrieved > from the ether by the Wayback Machine from Pinku Surana. The actual > blog seems to have disappeared when Northwestern University's CS > department reorganized their website. >> OS & Continuations >>snip
This is interesting, as it is similar, conceptually, to the method I devised for my OS. I hadn't couched the concept in the terms you use, but the idea is similar, in that all user code runs without any blocking calls. If a routine needs to do something that will required hardware access, it builds a 'message' that requests that data is obtained from an interrupt routine and then this builds a message that the data is available, and thus calls the next stage of processing. One stage can 'call' more than one thread of processing by effectively passing more than one message. The difference is that the messages are basically all the same, and the 'calling' routine has no control over the code that is called. It alters the data object, and the data object's metadata defines what code should be called when a given type of alteration occurs. Thus this section of code is effectively calling the next piece of code, but needs no information about whether it exists, what it does, or what it's interface is. It is the ultimate in object orientation, as whilst most object oriented programs hide the method from the caller, but expose the programming interface, this exposes nothing, not even it's existence, and every piece of code exists within total isolation. (Technically, there is an exposure, in that the data structure itself is pre-defined, thus forming a commonality between the code objects, but that is as far as it goes.) Oddly enough, I'm just writing a B+tree handler, which runs at the memory management level, and this is a set of basic recursive routines. However, the filesystem handler uses the same code, as all objects are handled by the same system, which simply maps objects from the disc to the memory. The idea is that a single system handles all of the free space on both disc and in memory, using essentially the same structures, so that the data objects are never seen as 'on disc' or 'in memory' by the user code, but are simply accessed, and they will be mapped in as required. The disc structure and the memory structure are identical, and simply use a flag to denote whether the object is currently on disc or in memory. Sorry, I digress. The access methods are all recursive, and I was analysing them, as I don't want the kernel to block waiting for the next tree node to be read from the disc during tree searches, and found that the recursive routine lends itself very well to a message-passing methodology. I can process a step in the search, and then pass the results to the message structure, which call the same routine again if the data is available, or queues it until the disc access is complete. Matt
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
George Neuner wrote: > This is not entirely correct. Draves' and Dean's "continuations" are > simply a known callback function. They are implemented in the > protected kernel space - the program cannot access them - and in > Scheme terms they are one-shot, escape (upward) continuations only. > In their system, synchronous (blocking) calls are split into an entry > function and an exit (continuation) function. The entry function > constructs a callback to the exit function and then initiates an > asynchronous operation which results in an interrupt (which may be > handled by any thread). When the operation is complete and the > "continuation" becomes runnable, it may be resumed by whatever thread > is current by simply calling the exit function. > This setup allowed them to dispense with per-thread kernel stacks and > enabled any thread to resume any computation because the thread kept > no in-kernel context. According to the paper, this proved > advantageous performance-wise for RPC and especially for (Mach) > exception handling. > The original paper is here: > http://citeseer.ist.psu.edu/draves91using.html
I've read the original paper before, and understand it. That's why I'm not using their "continuations", which are really just an optimization for thread blocking. > Another paper in a similar vein that may be of interest is "Scheduler > Activations: Effective Kernel Support for the User-Level Management of > Parallelism". This paper describes a method for allowing the kernel > to directly support user-space thread management. > It can be found here: > http://citeseer.ist.psu.edu/anderson92scheduler.html
That one's a very interesting paper (I could never find it on CiteSeer, thanks!). They seem to be trying for something similar to me, with a couple of differences: 1) Their "virtual processors" are abstractions that can be multiplexed across any number of physical processors. In my design, every physical processor is abstracted by exactly one virtual processor. Which virtual processor abstracts which physical processor never changes, because virtual processors are really just an extension of the physical processor that keeps track of some OS variables (like page tables and the stack) in addition to the processor registers. 2) They use threads. I use continuations. Since continuations do not naturally expect to be interrupted (nor do reified continuations ever have their contents changed), this one seems rather significant. Or do you see something I don't? > It sounds like you've reinvented and expanded upon a fair bit of the > machinery from the papers I referred to above.
See above. Lots of similarities, with some differences. > This is why "events" are traditionally handled in the kernel. The > interrupt handler invokes the dispatcher/scheduler upon exit and the > dispatcher, in turn, decides which computation gets to resume. > George > -- > for email reply remove "/" from address
At least, it certainly shows why traditional kernels maintain some level of control over what runs. But I wrote to comp.lang.scheme hoping that somebody had seen this problem before. I know that coroutines-with-scheduler is the traditional method of using continuations to build cooperative threading, but has any Schemer ever had to deal with the occurance of asynchronous events before? How did they deal with it? - -- Eli Gottlieb -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGD/jWXxM/2T/QnN0RAusGAKCpZJ/GJMXjeZs/oE3d2gRKsrBwOgCfd66/ wQwJHbDNtxbZGn6LH8J0kEs= =V+oT -----END PGP SIGNATURE-----
On Sun, 01 Apr 2007 14:24:22 -0400, Eli Gottlieb
<eligottl @gmail.com> wrote: >George Neuner wrote: >> Another paper in a similar vein that may be of interest is "Scheduler >> Activations: Effective Kernel Support for the User-Level Management of >> Parallelism". This paper describes a method for allowing the kernel >> to directly support user-space thread management. >> It can be found here: >> http://citeseer.ist.psu.edu/anderson92scheduler.html >That one's a very interesting paper (I could never find it on CiteSeer, >thanks!). They seem to be trying for something similar to me, with a >couple of differences: >1) Their "virtual processors" are abstractions that can be multiplexed >across any number of physical processors. In my design, every physical >processor is abstracted by exactly one virtual processor. Which virtual >processor abstracts which physical processor never changes, because >virtual processors are really just an extension of the physical >processor that keeps track of some OS variables (like page tables and >the stack) in addition to the processor registers.
In either case the "processor" represents a compute resource that the program is free to use any way it desires. It's perfectly reasonable for an OS to provide a virtual processor - other than for performance reasons, I see no relevance in an application distinguishing real from virtual. >2) They use threads. I use continuations. Since continuations do not >naturally expect to be interrupted (nor do reified continuations ever >have their contents changed), this one seems rather significant. Or do >you see something I don't?
When I use the term "thread", I am not referring to a particular control model but rather to the old multiprogramming notion of a thread being a locus or path of control. Unless I misunderstood, your event model seems intended to implement multiprogramming ... threads in spirit if not in name. If you look at section 3.1, there is a description of kernel interaction with the user-space thread scheduler. I interpret that as follows (with some additional notes for those who haven't read the paper): An "activation" represents the actual processor - in conventional OS terms you can think of it as a kernel thread assigned to the process. A process can have more than one activation. Threads run on activations. When an activation chases into the kernel, the thread attached to it stores its own continuation in a user space data structure. The kernel starts the operation asynchronously and makes an upcall to an event handler, passing control back to user space and notifying the thread manager that the thread has blocked. In response the thread manager selects a stored, runnable continuation and invokes it - in effect switching to a new thread. Activation upcalls also happen in response to events such as I/O completion - for which the event handler would mark the appropriate thread's continuation to be runnable before entering the thread manager's scheduler. The paper doesn't really address preemption or interrupt handling, but the activation model can directly support both by allowing an interrupt to cause an activation to transition into the kernel. If you think of the activations as processors, your scheduling event handlers collectively as a user space thread manager, and your continuations as user threads, the parallel becomes clear. You keep describing this thing you've created as a kernel, but in rereading your original post, it looks now more like you have created in large part a state machine multitasker rather than a support kernel for user space tasking. FSM can be a perfectly good method of multitasking but I suspect it is not what you actually intended. I think you're having trouble figuring out how to connect applications / tasks / drivers to it because you're thinking of them as separate entities whereas with a state machine, the "application" code to be executed must be closely coupled. Re: a dispatcher/scheduler - you have to have one somewhere to make it work. Even though, in the paper, threads were user space entities, there was a system scheduler that decided which process to give the activation to. The only way to avoid a system scheduler is if there is only _one_ process. Deciding what is active at any particular time is the one job that a kernel can't delegate - the single process case is a degenerate example. Drivers don't have to be in the kernel, but they do need some kernel support - at a minimum the interrupt handler has to be able to identify and schedule the driver task to be run. If a driver is going to affect task scheduling, then the kernel has to expose APIs for that. Similarly, you have to ask yourself how your tasks will communicate if not through the kernel. Continuations make a fine RPC mechanism in a shared address space, but they suck rocks when you want to isolate and decouple the tasks (even if only conceptually). >I wrote to comp.lang.scheme hoping that somebody had seen this >problem before. I know that coroutines-with-scheduler is the >traditional method of using continuations to build cooperative >threading, but has any Schemer ever had to deal with the occurance of >asynchronous events before? How did they deal with it?
Asynchronous event handling is implementation dependent in Scheme and the mechanism is essentially hidden from the programmer. Most provide an event abstraction (which includes exceptions), a system event queue, Unix-like out-of-band handlers and analogs of select(). Threaded Schemes add per-thread queues and synchronization. Coming from an OS background, I'm having trouble understanding exactly what you are looking for. Scheme's event model is application oriented and IMO not very useful for the kind of close coupled control you seem to be targeting ... unless you intend your kernel to be embedded within (and linked to) the application. Even so, with the direction you've gone in, I think rather than looking at the event model from the application point of view, you'd be better off studying the source of a Scheme that implements continuation based threading - Chicken comes to mind but there are probably others out there as well. George -- for email reply remove "/" from address
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 George Neuner wrote: > In either case the "processor" represents a compute resource that the > program is free to use any way it desires. It's perfectly reasonable > for an OS to provide a virtual processor - other than for performance > reasons, I see no relevance in an application distinguishing real from > virtual.
True. >> 2) They use threads. I use continuations. Since continuations do not >> naturally expect to be interrupted (nor do reified continuations ever >> have their contents changed), this one seems rather significant. Or do >> you see something I don't? > When I use the term "thread", I am not referring to a particular > control model but rather to the old multiprogramming notion of a > thread being a locus or path of control. Unless I misunderstood, your > event model seems intended to implement multiprogramming ... threads > in spirit if not in name.
As it is written (in "Continuations and threads: Expressing machine concurrency directly in advanced languages"): A thread is a trajectory in continuation space. If you want to speak practically, a thread is a consistently-named object that can contain a continuation and supports two operations: 1) Block - Saves a continuation into the thread and makes the OS find another thread to run. 2) Run - Throws to the stored continuation. Each of these changes the state of the thread to X'ing, where X is the operation.
> If you look at section 3.1, there is a description of kernel > interaction with the user-space thread scheduler. I interpret that as > follows (with some additional notes for those who haven't read the > paper): > An "activation" represents the actual processor - in conventional OS > terms you can think of it as a kernel thread assigned to the process. > A process can have more than one activation. Threads run on > activations. > When an activation chases into the kernel, the thread attached to it > stores its own continuation in a user space data structure. The > kernel starts the operation asynchronously and makes an upcall to an > event handler, passing control back to user space and notifying the > thread manager that the thread has blocked. In response the thread > manager selects a stored, runnable continuation and invokes it - in > effect switching to a new thread. > Activation upcalls also happen in response to events such as I/O > completion - for which the event handler would mark the appropriate > thread's continuation to be runnable before entering the thread > manager's scheduler. > The paper doesn't really address preemption or interrupt handling, but > the activation model can directly support both by allowing an > interrupt to cause an activation to transition into the kernel. > If you think of the activations as processors, your scheduling event > handlers collectively as a user space thread manager, and your > continuations as user threads, the parallel becomes clear.
Going to need some time to wrap my brain around that. > You keep describing this thing you've created as a kernel, but in > rereading your original post, it looks now more like you have created > in large part a state machine multitasker rather than a support kernel > for user space tasking. FSM can be a perfectly good method of > multitasking but I suspect it is not what you actually intended. I > think you're having trouble figuring out how to connect applications / > tasks / drivers to it because you're thinking of them as separate > entities whereas with a state machine, the "application" code to be > executed must be closely coupled.
No, I was not thinking of a state machine. > Re: a dispatcher/scheduler - you have to have one somewhere to make it > work. Even though, in the paper, threads were user space entities, > there was a system scheduler that decided which process to give the > activation to. The only way to avoid a system scheduler is if there > is only _one_ process. Deciding what is active at any particular time > is the one job that a kernel can't delegate - the single process case > is a degenerate example.
There *is* a dispatcher in my kernel, but it's only meant to provide three guarantees: 1) Every event handler runs until it throws to another continuation or terminates. 2) Event handlers run in FIFO order. 3) Event handlers can see the continuation they interrupted. So yes, the operating system does run a dispatcher on each interrupt (along with continuation operations, to make sure that an event handler can preempt a normal continuation thrown by another event handler). However, as far as user-level programs are concerned (with the small exception of event handlers) there is no operating-system level scheduler to handle concurrency for them. > Drivers don't have to be in the kernel, but they do need some kernel > support - at a minimum the interrupt handler has to be able to > identify and schedule the driver task to be run. If a driver is going > to affect task scheduling, then the kernel has to expose APIs for > that.
Read above. > Similarly, you have to ask yourself how your tasks will communicate if > not through the kernel. Continuations make a fine RPC mechanism in a > shared address space, but they suck rocks when you want to isolate and > decouple the tasks (even if only conceptually).
The kernel has an IPC mechanism that is sufficiently separate from continuations and concurrency that I didn't include it here. >> I wrote to comp.lang.scheme hoping that somebody had seen this >> problem before. I know that coroutines-with-scheduler is the >> traditional method of using continuations to build cooperative >> threading, but has any Schemer ever had to deal with the occurance of >> asynchronous events before? How did they deal with it? > Asynchronous event handling is implementation dependent in Scheme and > the mechanism is essentially hidden from the programmer. Most provide > an event abstraction (which includes exceptions), a system event > queue, Unix-like out-of-band handlers and analogs of select(). > Threaded Schemes add per-thread queues and synchronization.
Most of what I've seen is like select() or polling. > Coming from an OS background, I'm having trouble understanding exactly > what you are looking for. Scheme's event model is application > oriented and IMO not very useful for the kind of close coupled control > you seem to be targeting ... unless you intend your kernel to be > embedded within (and linked to) the application.
I'm looking for a model that lets me hand user programs the basic tools of concurrency (abstract continuations, parallel agents of computation aka processors, and asynchronous events for preemption) and have those user programs (or at least the sufficiently privileged ones) implement whatever concurrency model they can use best. Since manipulating continuations is a privilege (checked by the kernel's security system), lower-level server programs can implement threading models that higher-level user applications *must* use. > Even so, with the direction you've gone in, I think rather than > looking at the event model from the application point of view, you'd > be better off studying the source of a Scheme that implements > continuation based threading - Chicken comes to mind but there are > probably others out there as well. > George > -- > for email reply remove "/" from address
Going to take a look at Chicken, but I have to go now -- holidays don't run themselves. - -- The science of economics is the cleverest proof of free will yet constructed. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGEiMXXxM/2T/QnN0RAuUhAJ4pO4Eo1/HkDEV+OtvQdT5ySeAHBwCfYJoA tj/zedubCaMylWF9xIL2TaU= =h2YP -----END PGP SIGNATURE-----
On Apr 1, 8:16 am, "Shiro Kawai" <shiro.ka@gmail.com> wrote: > I'm no expert in OS development, but your idea reminds me a > popular technique of developing console/arcade video games > which have historically run close to bare metal (and been > required to run very efficiently). > Essentially it is to write the whole program in CPS---you > break things into small chunk of tasks, and whenever you ask > system-level service or your job can yield CPU, you call > the system routine with the continuation of your job. > A good thing is that you can pass several continuations > to the system if necessary (e.g. for event handlers). > Writing everything in CPS manually is a bit of pain, though, > and I guess a compiler might help here, translating user > code that uses call/cc and continuation invocation into CPS.
Hm... This sounds familiar... cheers, felix
> If you want to speak practically, a thread is a consistently-named > object that can contain a continuation and supports two operations: > 1) Block - Saves a continuation into the thread and makes the OS find > another thread to run. > 2) Run - Throws to the stored continuation.
Looks like such continuations (implemented as callbacks) are the usual way of programming in the contexts where the thread wait() or sleep() primitives are not available, like DISPATCH_LEVEL in Windows kernel and "nonblocking" contexts in Linux kernel (where you cannot wait). -- Maxim Shatskih, Windows DDK MVP StorageCraft Corporation m@storagecraft.com http://www.storagecraft.com
On 4?3?, ??1:34, "bunny@yoho-gmail.com" <bunny@gmail.com> wrote:
> On Apr 1, 8:16 am, "Shiro Kawai" <shiro.ka @gmail.com> wrote: > > I'm no expert in OS development, but your idea reminds me a > > popular technique of developing console/arcade video games > > which have historically run close to bare metal (and been > > required to run very efficiently). > > Essentially it is to write the whole program in CPS---you > > break things into small chunk of tasks, and whenever you ask > > system-level service or your job can yield CPU, you call > > the system routine with the continuation of your job. > > A good thing is that you can pass several continuations > > to the system if necessary (e.g. for event handlers). > > Writing everything in CPS manually is a bit of pain, though,> > and I guess a compiler might help here, translating user > > code that uses call/cc and continuation invocation into CPS. > Hm... This sounds familiar... > cheers, > felix
I bet it does to you :-) FYI, there are some other constraints in the system I described. Game developers (at least whom I know) tend to prefer using their own memory management to generic allocator/GC. They pre-allocate chunks of memories, and put each manually-created continuation into a chunk. No state sharing between chunks. Chunks can be freed once it's done, unless it is explicitly retained. This makes memory management significantly easier and predictable. I think compiler can arrange the generated CPS in this way, as well.
On Tue, 03 Apr 2007 05:49:12 -0400, Eli Gottlieb <eligottl @gmail.com> wrote: >George Neuner wrote: >> When I use the term "thread", I am not referring to a particular >> control model but rather to the old multiprogramming notion of a >> thread being a locus or path of control. >As it is written (in "Continuations and threads: Expressing machine >concurrency directly in advanced languages"): A thread is a trajectory >in continuation space.
I think I first came across the term "threading" in the way I use it in Hoare's CSP. >There *is* a dispatcher in my kernel, but it's only meant to provide >three guarantees: >1) Every event handler runs until it throws to another continuation or >terminates. >2) Event handlers run in FIFO order. >3) Event handlers can see the continuation they interrupted. >So yes, the operating system does run a dispatcher on each interrupt >(along with continuation operations, to make sure that an event handler >can preempt a normal continuation thrown by another event handler). >However, as far as user-level programs are concerned (with the small >exception of event handlers) there is no operating-system level >scheduler to handle concurrency for them.
Ok, that wasn't clear from your previous posts. >I'm looking for a model that lets me hand user programs the basic tools >of concurrency (abstract continuations, parallel agents of computation >aka processors, and asynchronous events for preemption) and have those >user programs (or at least the sufficiently privileged ones) implement >whatever concurrency model they can use best. Since manipulating >continuations is a privilege (checked by the kernel's security system), >lower-level server programs can implement threading models that >higher-level user applications *must* use.
Hmm ... I'd have to mull over that one for a while. I am not aware of any existing design that enables a non-kernel server module to provide concurrency support for other user-space programs. At first glance it seems unwieldy and problem fraught. >holidays don't run themselves.
Hope you have a good one. George -- for email reply remove "/" from address
|
 |
 |
 |
 |
|