QNX Messaging (Send/Receive/Reply) has two distinct advantages. First, it copies the actual message, rather than a pointer to a buffer, so the code works the same way across the network or within the local computer. Each process has its own copy of the data and can freely modify it. Second, and more importantly, this mechanism inherently synchronizes the processes, so the CPU is allocated efficiently.
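The two properties above can be illustrated with a small simulation. This is only a sketch in portable Python, not the QNX API: the helper names send(), receive() and reply() are hypothetical stand-ins built on thread-safe queues, and a thread plays the role of each process.

```python
import copy
import queue
import threading

def send(inbox, msg):
    """Model of Send(): copy msg to the receiver, then block until it replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, copy.deepcopy(msg)))
    return reply_box.get()              # the sender is "Reply blocked" here

def receive(inbox):
    """Model of Receive(): block until a message arrives."""
    return inbox.get()                  # the receiver is "Receive blocked" here

def reply(reply_box, msg):
    """Model of Reply(): unblock the sender with a copy of the answer."""
    reply_box.put(copy.deepcopy(msg))

def demo():
    server_inbox = queue.Queue()

    def server():
        reply_box, msg = receive(server_inbox)
        msg["samples"].append(99)       # changes the server's private copy only
        reply(reply_box, {"count": len(msg["samples"])})

    worker = threading.Thread(target=server)
    worker.start()
    original = {"samples": [1, 2, 3]}
    answer = send(server_inbox, original)   # returns only after the server replies
    worker.join()
    return original, answer
```

Calling demo() returns the sender's message unchanged, even though the server modified what it received, and the sender does not continue until the reply arrives. That blocking is the synchronization the paragraph describes.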

Thus, the "QNX way" of designing software leads to many small co-operative processes. Initially many programmers feel that they do not want their processes to ever be blocked, which is a result of writing software for less elegant systems and not breaking down processes into small enough blocks.

I will discuss exceptions later, but for now, assume that all processes will frequently be blocked and that this is good. Ideally, the CPU should be idle the majority of the time; otherwise the system will be unable to respond when a number of events occur together. I will show how a typical system is designed, so that you will see why this is true, what the exception cases are and how to handle them. The first key is to break up the design into small functional blocks. Initially programmers make each process do too much.

As an example, let's take a process control system which is handling an assembly line. Perhaps there are three A to D cards which are used to monitor a plastic extruder and two D to A ports that control it.

You would have a Collection Process and an Analyzer Process for each A to D card, an Output Process for each D to A port, and a Control Process to ultimately control the machine. That is a minimum of 9 processes.

The Collection Processes set up the A to D cards for data collection, which likely occurs under DMA or during interrupts. So these processes should not actually get any CPU time during data collection and will be blocked on a Receive() call. They will be unblocked when the interrupt routine Trigger()s a proxy, which is a special prearranged message for this purpose. When a Collection Process Receive()s a proxy message, it will be marked Ready. Since this process is of quite high priority, it will run immediately. It will package up the data into a message and restart the next data collection phase. It will then Send() a message to its Analyzer Process, containing either the actual data or some details regarding where in shared memory the data is.

The Analyzer Process has been Receive() blocked waiting for this message, so the receipt of the message unblocks it and blocks the Collection Process (although data collection is actually continuing during interrupts at this point). The Analyzer Process immediately Reply()s, which unblocks the Collection Process. The Collection Process then calls Receive() and blocks waiting for the next proxy message.

The Analyzer Process examines the data and Send()s a message to the Control Process, which has been Receive() blocked. It Reply()s. The Analyzer Process calls Receive() and blocks waiting for more data.
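The chain described so far (interrupt, Collection, Analyzer, Control) can be sketched as a simulation. This is portable Python rather than QNX code: send() and trigger() are hypothetical models of Send() and Trigger(), the STOP marker exists only so the sketch can terminate, and the "heat"/"cool" threshold of 50 is an invented placeholder for the real analysis.

```python
import queue
import threading

STOP = "stop"  # hypothetical shutdown marker, not part of the design in the text

def send(inbox, msg):
    """Model of Send(): deliver msg, then block until the receiver replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, msg))
    return reply_box.get()

def trigger(inbox, proxy_msg):
    """Model of Trigger(): queue a canned message without blocking the caller."""
    inbox.put((None, proxy_msg))

def run_pipeline(batches):
    collector_in, analyzer_in, control_in = queue.Queue(), queue.Queue(), queue.Queue()
    card_buffer = queue.Queue()     # stands in for data gathered under DMA
    decisions = []

    def collector():
        while True:
            _, proxy = collector_in.get()       # Receive blocked until a proxy arrives
            if proxy == STOP:
                send(analyzer_in, STOP)
                return
            send(analyzer_in, list(card_buffer.get()))   # hand a copy of the data up

    def analyzer():
        while True:
            reply_box, data = analyzer_in.get()
            reply_box.put("ok")                 # Reply at once; collector returns to Receive
            if data == STOP:
                send(control_in, STOP)
                return
            send(control_in, sum(data) / len(data))

    def control():
        while True:
            reply_box, average = control_in.get()
            reply_box.put("ok")
            if average == STOP:
                return
            decisions.append("cool" if average > 50 else "heat")

    threads = [threading.Thread(target=f) for f in (collector, analyzer, control)]
    for t in threads:
        t.start()
    for batch in batches:                       # plays the role of the interrupt routine
        card_buffer.put(batch)
        trigger(collector_in, "data-ready")
    trigger(collector_in, STOP)
    for t in threads:
        t.join()
    return decisions
```

No thread in this sketch polls. Each one sleeps in a blocking get() until the stage below it has something to deliver, which mirrors the Receive() blocked state in the text.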

The Control Process has Receive()ed messages from each Analyzer Process. Based on the data, it determines what new control outputs are required. The mistake that might initially be made here is for the Control Process to Send() a message to the Output Process. The rule of thumb is that the lowest-level process (closest to the hardware) has the highest priority and Send()s up the hierarchy. The Output Process has nothing to do until some new command is available. After the last output, the Output Process did a Send() to the Control Process, with a message indicating that the previous command was completed and that it is ready for another. So the Output Process is Reply() blocked on the Control Process, which now Reply()s with the new output value. Thus the Control Process is never blocked on a Send() to a lower priority process.
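The inverted direction between Control and Output is easy to get wrong, so here is a simulated sketch of it. As before this is portable Python, not the QNX API; send() is a hypothetical model of Send(), a reply of None is an invented shutdown signal, and for brevity the Control side here only hands out a fixed list of commands instead of also receiving from Analyzer Processes.

```python
import queue
import threading

def send(inbox, msg):
    """Model of Send(): deliver msg, then block until the receiver replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, msg))
    return reply_box.get()

def run_output_stage(commands):
    control_in = queue.Queue()
    applied = []

    def output_process():
        status = "ready"
        while True:
            command = send(control_in, status)   # Reply blocked until there is work
            if command is None:                  # hypothetical shutdown signal
                return
            applied.append(command)              # stand-in for writing the D to A port
            status = "done"

    def control_process():
        for command in list(commands) + [None]:
            reply_box, _status = control_in.get()   # Output reports that it is idle
            reply_box.put(command)                  # the reply never blocks Control

    threads = [threading.Thread(target=f) for f in (output_process, control_process)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return applied
```

The Output thread spends its idle time blocked inside send(), and the Control thread delivers each command through the reply. Control therefore never waits on the lower process to accept a message.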

Generally there are a number of functional units as described above. Draw a diagram similar to the root structure of a tree. At the bottom tips of the roots are the processes that interface to the hardware. They Send() to processes above that do something with the data. Higher up are lower priority processes that have a broader overview of the entire system, but less detail work. The performance of the system can be tuned by adjusting the priority levels of the processes.

You can look at the design imagining a corporate structure. The President has ultimate control but does nothing but co-ordinate and knows no detail. He never leaves his office, but the Managers come running to him to ask what to do about this information they have. He replies and they carry out his wishes, by repeating the same system with their underlings.

This system is thus easy to design. CPU usage is extremely efficient, because there is no polling and no semaphores to check. Only processes that truly have useful work get CPU time. Through the system design, the programmer uses the kernel itself to schedule the processes.

Using non-blocking messaging merely moves the co-ordination of the processes to a different mechanism, because the processes still need to be co-ordinated for the system to operate. Other mechanisms rely on the programmer and on more CPU time to implement. They are liable to require changes when modifications are made to the system.

If a process higher up the tree cannot keep up, then the hardware is inadequate. Sometimes buffering can help, but if the inputs need to be examined and new outputs generated at a particular frequency, more buffering will not solve the problem. Changing to a non-blocking messaging system will not improve matters, since it is the amount of work to do that is of concern, not the mechanism of synchronization.

All that said, let's look at handling some special cases that initially appear to prevent the design described above.

Sometimes it is necessary for a process up the tree to get a message to a process down the line. Since processes are typically Receive() blocked, the lower process is able to receive a proxy. So the process up the tree can use a proxy message to tell the lower process to Send() a message, and can then use Reply() to pass the data down.
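A simulated sketch of this downward hand-off follows. It is portable Python, not QNX code: send() and trigger() are hypothetical models of Send() and Trigger(), and the proxy text "call-me" is an invented placeholder.

```python
import queue
import threading

def send(inbox, msg):
    """Model of Send(): deliver msg, then block until the receiver replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, msg))
    return reply_box.get()

def trigger(inbox, proxy_msg):
    """Model of Trigger(): queue a canned message without blocking the caller."""
    inbox.put((None, proxy_msg))

def run_downward_request(new_setting):
    upper_in, lower_in = queue.Queue(), queue.Queue()
    received = []

    def lower():
        _, proxy = lower_in.get()                # Receive blocked, as usual
        if proxy == "call-me":
            # Send up the tree; the data travels back down in the reply.
            received.append(send(upper_in, "what is the new setting?"))

    def upper():
        trigger(lower_in, "call-me")             # non-blocking nudge down the tree
        reply_box, _question = upper_in.get()
        reply_box.put(new_setting)               # pass the data with the reply

    threads = [threading.Thread(target=f) for f in (lower, upper)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The upper thread never blocks on the lower one: the proxy is fire-and-forget, and the payload is delivered by a reply.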

Timers are handled similarly. The timer is attached to a proxy and whenever the timer fires, the process Receive()s the proxy. This avoids two different processing loops within the same process, since all messages are handled by the same Receive(). The process blocks, waiting for some event, which could be the timer firing or a message from another process. Besides avoiding polling, this simplifies the code. There are no race conditions as there are with signals.
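The single-loop idea can be sketched as a simulation. This is portable Python, not the QNX timer API: threading.Timer stands in for a timer attached to a proxy, send() and trigger() are hypothetical models of Send() and Trigger(), and the loop handles exactly two events only so that the sketch terminates.

```python
import queue
import threading

def send(inbox, msg):
    """Model of Send(): deliver msg, then block until the receiver replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, msg))
    return reply_box.get()

def trigger(inbox, proxy_msg):
    """Model of Trigger(): queue a canned message without blocking the caller."""
    inbox.put((None, proxy_msg))

def run_event_loop(timer_delay=0.01):
    inbox = queue.Queue()
    handled = []

    def worker():
        for _ in range(2):                   # one timer tick plus one client message
            reply_box, _msg = inbox.get()    # the only blocking point in the process
            if reply_box is None:            # proxies carry no reply handle
                handled.append("timer")
            else:
                reply_box.put("ok")
                handled.append("client")

    thread = threading.Thread(target=worker)
    thread.start()
    timer = threading.Timer(timer_delay, trigger, args=(inbox, "tick"))
    timer.start()                            # fires the proxy from another thread
    send(inbox, "hello")                     # an ordinary message from a client
    timer.join()
    thread.join()
    return handled
```

Both kinds of event arrive through the same blocking call, so the worker needs no second loop and no polling. The order in which the two events are handled depends on timing.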

Another case is two processes on the same level that need to notify one another of some event. You do not want to blindly do a Send() since it is not up the chain of command. There are two approaches that work here. One is to send a proxy, which notifies the other process to do a Send(). That Send() is safe because the first process has been designed to Reply() immediately and is waiting to do so. The other is to create a buffering mailbox process. This is most useful when there are many processes involved and a number of messages. A process does a Send() to the mailbox, which immediately Reply()s. The mailbox buffers the message and Trigger()s a proxy to the recipient process, which then does a Send() to the mailbox, which Reply()s with the buffered message. This way each process can retrieve messages when convenient, rather than being obligated to interact with another process when it is convenient for that other process.
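The mailbox approach can be sketched as a simulation. This is portable Python, not QNX code: send() and trigger() are hypothetical models of Send() and Trigger(), and the "post" and "fetch" request tags are invented for the illustration.

```python
import collections
import queue
import threading

def send(inbox, msg):
    """Model of Send(): deliver msg, then block until the receiver replies."""
    reply_box = queue.Queue(maxsize=1)
    inbox.put((reply_box, msg))
    return reply_box.get()

def trigger(inbox, proxy_msg):
    """Model of Trigger(): queue a canned message without blocking the caller."""
    inbox.put((None, proxy_msg))

def run_mailbox(messages):
    mailbox_in, recipient_in = queue.Queue(), queue.Queue()
    delivered = []

    def mailbox():
        stored = collections.deque()
        remaining = len(messages)            # fetches to serve before exiting
        while remaining:
            reply_box, (op, body) = mailbox_in.get()
            if op == "post":
                stored.append(body)
                reply_box.put("ok")          # the sender is unblocked immediately
                trigger(recipient_in, "mail")
            else:                            # "fetch"
                reply_box.put(stored.popleft())
                remaining -= 1

    def recipient():
        for _ in messages:
            recipient_in.get()               # wait for the "mail" proxy
            delivered.append(send(mailbox_in, ("fetch", None)))

    threads = [threading.Thread(target=f) for f in (mailbox, recipient)]
    for t in threads:
        t.start()
    for message in messages:                 # the main thread plays the sending peer
        send(mailbox_in, ("post", message))
    for t in threads:
        t.join()
    return delivered
```

Neither peer ever sends directly to the other. Both only Send() to the mailbox, which replies at once, so neither can be held up by the other.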

When there is a large amount of data to be dispersed to many processes, sending a message to each is typically inefficient. If each of us had to call to ask about the weather, rather than listening to the radio, the system would be overwhelmed. Also, often only a portion of the data is required. For this scenario, shared memory works well. It is important to design the system so that only one process modifies a particular variable. Depending on the requirements, the processes can read the values as required or can be notified with a proxy message that values have been changed.
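A simulated sketch of the single-writer rule with proxy notification follows. It is portable Python, not QNX shared memory: a dictionary stands in for the shared segment, trigger() is a hypothetical model of Trigger(), and the field names are invented.

```python
import queue
import threading

def trigger(inbox, proxy_msg):
    """Model of Trigger(): queue a canned message without blocking the caller."""
    inbox.put((None, proxy_msg))

def run_shared(updates):
    shared = {"temperature": 0, "pressure": 0}   # stand-in for a shared memory segment
    reader_inboxes = {"temperature": queue.Queue(), "pressure": queue.Queue()}
    last_seen = {}

    def writer():                                # the only process that modifies `shared`
        for temperature, pressure in updates:
            shared["temperature"] = temperature
            shared["pressure"] = pressure
            for inbox in reader_inboxes.values():
                trigger(inbox, "changed")        # notify readers after each update

    def reader(field):
        for _ in updates:
            reader_inboxes[field].get()          # block until told something changed
            last_seen[field] = shared[field]     # read only the part this process needs

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader, args=(field,)) for field in reader_inboxes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return last_seen
```

A reader may observe a newer value than the one that caused a given notification, because the writer does not wait for it. By the time the last proxy is handled, every reader has seen the final value of its field.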

The QNX messaging system is highly efficient and makes it quite easy to design complex and reliable systems. Other than the case of porting code from another operating system, native QNX messaging should be used.


Copyright © 1998 Qenesis Inc. All rights reserved.