Figure 1 shows a conceptual view of how most operating systems implement shared memory. Each process has its own virtual address space, which is typically larger than the physical address space. Certain regions of the virtual address space are mapped to regions of physical memory, while other regions remain unused. The operating system, in conjunction with the hardware architecture, typically implements such mappings by means of a separate page table per process. Most regions of physical memory are owned by a single process, and only mapped into the virtual address space of that process. However, a region of shared memory may be created if the operating system maps a region of physical memory into the virtual address spaces of several different processes. This enables all the processes to access the same region of physical memory.
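The mapping described above can be illustrated with a toy model. The following is a minimal sketch, not an account of how any particular operating system works: the page size, the dictionary-based page tables and the frame numbers are all assumptions chosen for the illustration. Physical frame 12 plays the role of the shared region, and is mapped at virtual page 5 in one process and at virtual page 2 in the other.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only

# One (hypothetical) page table per process:
# virtual page number -> physical frame number.
page_table_a = {0: 7, 1: 3, 5: 12}   # frame 12 is the shared region
page_table_b = {0: 9, 2: 12}         # the same frame, at another virtual page


def translate(page_table, virtual_address):
    """Translate a virtual address to a physical address using one page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"virtual page {page} is not mapped")
    return page_table[page] * PAGE_SIZE + offset


# Both processes reach the same physical byte, but through different
# virtual addresses.
physical_from_a = translate(page_table_a, 5 * PAGE_SIZE + 16)
physical_from_b = translate(page_table_b, 2 * PAGE_SIZE + 16)

# Conversely, the same virtual address leads to different physical memory.
private_in_a = translate(page_table_a, 16)
private_in_b = translate(page_table_b, 16)
```

In this model, a virtual address only has meaning together with the page table of the process that uses it.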
The fact that each process has its own virtual address space, with a mapping defined by a separate page table, has a consequence that is not immediately obvious. It implies that the virtual address at which a given shared memory region starts may in fact vary from process to process. Therefore, a location in shared memory cannot be uniquely identified by means of a virtual address, since the same virtual address may refer to different physical addresses in different processes. From a programmer's point of view, this means that a pointer (which is simply a virtual address) cannot uniquely identify a location in shared memory across multiple processes. Consequently, recursive data structures, such as trees or linked lists, cannot be represented in shared memory using pointers.
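The effect can be observed directly. The sketch below is an illustration under simplifying assumptions: instead of two processes, it maps the same temporary file twice within a single process using Python's mmap module, which is enough to obtain two virtual address ranges backed by the same memory. The helper base_address is a hypothetical name, and reading the address through ctypes is specific to CPython.

```python
import ctypes
import mmap
import tempfile


def base_address(mapping):
    """Return the virtual address at which a writable mapping starts (CPython)."""
    # The temporary ctypes object is released as soon as the address has been
    # read, so the mapping can still be closed afterwards.
    return ctypes.addressof(ctypes.c_char.from_buffer(mapping))


with tempfile.TemporaryFile() as backing:
    backing.write(b"\x00" * mmap.PAGESIZE)
    backing.flush()

    # Two independent shared mappings of the same underlying page.
    first = mmap.mmap(backing.fileno(), mmap.PAGESIZE)
    second = mmap.mmap(backing.fileno(), mmap.PAGESIZE)
    try:
        first[0:5] = b"hello"              # write through one mapping
        shared_contents = second[0:5]      # read through the other
        addr_first = base_address(first)
        addr_second = base_address(second)
    finally:
        first.close()
        second.close()
```

The bytes written through one mapping are visible through the other, yet the two mappings start at different virtual addresses. A pointer computed from one mapping would therefore be meaningless if it were interpreted relative to the other.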
POSH deals with this limitation by introducing the concept of memory handles. POSH assigns a unique identifier to each shared memory region upon creation. A memory handle consists of the identifier for a shared memory region and a byte offset in that region. This information uniquely identifies the memory location to all processes, regardless of the actual virtual address at which the region starts. Each process maintains a separate table that stores the starting virtual address (in that process) of every shared memory region. When a process needs to access the memory location to which a memory handle refers, the actual address is found by looking up the starting address in the table, and adding the given byte offset. Since the table is implemented as an array, indexed by the shared memory region's identifier, the lookup is a constant-time operation.
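The lookup described above can be sketched as follows. This is a minimal model of the idea rather than POSH's actual implementation: the names MemoryHandle, RegionTable, attach and resolve are hypothetical, and the base addresses are made-up integers standing in for the addresses at which each process has mapped the region.

```python
from typing import List, NamedTuple, Optional


class MemoryHandle(NamedTuple):
    """Process-independent reference to a location in shared memory."""
    region_id: int  # identifier assigned to the region on creation
    offset: int     # byte offset within that region


class RegionTable:
    """Per-process table: region identifier -> starting virtual address."""

    def __init__(self) -> None:
        self._bases: List[Optional[int]] = []

    def attach(self, region_id: int, base_address: int) -> None:
        # Grow the array so that it can be indexed by region_id.
        if region_id >= len(self._bases):
            self._bases.extend([None] * (region_id + 1 - len(self._bases)))
        self._bases[region_id] = base_address

    def resolve(self, handle: MemoryHandle) -> int:
        # Constant-time array lookup followed by one addition.
        base = self._bases[handle.region_id]
        if base is None:
            raise LookupError(f"region {handle.region_id} is not attached")
        return base + handle.offset


# The same handle resolved in two (simulated) processes that have mapped
# region 1 at different, made-up virtual addresses.
handle = MemoryHandle(region_id=1, offset=0x40)

table_a = RegionTable()
table_a.attach(1, 0x7F0000100000)
table_b = RegionTable()
table_b.attach(1, 0x7F00AA000000)

address_in_a = table_a.resolve(handle)
address_in_b = table_b.resolve(handle)
```

The handle itself is identical in both processes and can therefore be stored inside shared memory; only the result of resolving it is process-specific.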
Python's standard container objects (dictionaries, lists and tuples) are examples of recursive data structures. Their object structures contain pointers that refer to other objects. For instance, a tuple object contains an array of pointers, where each pointer refers to an element of the tuple. As explained above, the use of pointers makes the standard container objects unsuited for placement in shared memory. By using memory handles in place of pointers, POSH implements separate shareable versions of the standard container types. This allows dictionaries, lists and tuples to be shared. The ability to create shared dictionaries also paves the way for supporting shared objects with attributes, since the attribute values of such an object are stored in a shared dictionary.
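To make the idea of a handle-based container concrete, the sketch below lays out a tuple of integers inside a byte buffer, storing a (region identifier, offset) pair for each element where an ordinary tuple would store a pointer. The binary layout, the struct formats and the function names are assumptions made for this illustration; they do not reflect the object layout that POSH actually uses. A plain list indexed by region identifier stands in for the per-process table of regions.

```python
import struct

# Hypothetical on-memory layout, chosen for the illustration only.
COUNT = struct.Struct("<I")    # number of elements in the tuple
HANDLE = struct.Struct("<II")  # memory handle: (region_id, offset)
INT = struct.Struct("<q")      # payload of one integer element


def write_int_tuple(regions, region_id, offset, values):
    """Store a tuple of ints in regions[region_id] and return a handle to it."""
    region = regions[region_id]
    COUNT.pack_into(region, offset, len(values))
    slots = offset + COUNT.size                    # array of element handles
    payload = slots + HANDLE.size * len(values)    # element data follows it
    for index, value in enumerate(values):
        INT.pack_into(region, payload, value)
        # Store a handle to the element, not a virtual address.
        HANDLE.pack_into(region, slots + index * HANDLE.size, region_id, payload)
        payload += INT.size
    return (region_id, offset)


def read_int_tuple(regions, handle):
    """Rebuild the tuple by following the element handles."""
    region_id, offset = handle
    region = regions[region_id]
    (count,) = COUNT.unpack_from(region, offset)
    slots = offset + COUNT.size
    items = []
    for index in range(count):
        elem_region, elem_offset = HANDLE.unpack_from(
            region, slots + index * HANDLE.size)
        (value,) = INT.unpack_from(regions[elem_region], elem_offset)
        items.append(value)
    return tuple(items)


# The "writer" sees region 0 as a bytearray; the "reader" sees the same bytes
# through a separate view, standing in for a mapping in another process.
writer_regions = [bytearray(256)]
tuple_handle = write_int_tuple(writer_regions, 0, 16, (10, 20, 30))

reader_regions = [memoryview(writer_regions[0])]
result = read_int_tuple(reader_regions, tuple_handle)
```

Because every reference inside the buffer is a handle, the reader needs only its own table of regions to reconstruct the tuple, irrespective of where the region happens to be mapped.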