author: Sebastian Wick <sebastian.wick@redhat.com> 2025-10-28 00:36:53 +0100
committer: Pekka Paalanen <pq@iki.fi> 2025-11-27 17:39:11 +0200
commit: bbb5fa66a7fe53c492e4cc2e616c4ada38712542
tree: c2e10fe75e0f9f4e1e2c9ac5a375e5de97d6cb36 /doc/publican/Protocol.xml
parent: build: Bump to meson version 0.64.0
doc: Refactor the build system for complete build dir docs
By structuring things differently, it becomes possible to have a complete build of the docs in the build dir, without having to install anything.

Signed-off-by: Sebastian Wick <sebastian.wick@redhat.com>
Diffstat (limited to 'doc/publican/Protocol.xml')
-rw-r--r-- doc/publican/Protocol.xml 592
1 file changed, 592 insertions, 0 deletions
diff --git a/doc/publican/Protocol.xml b/doc/publican/Protocol.xml
new file mode 100644
index 0000000..e4087e9
--- /dev/null
+++ b/doc/publican/Protocol.xml
@@ -0,0 +1,592 @@
+<?xml version='1.0' encoding='utf-8' ?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
+%BOOK_ENTITIES;
+]>
+<chapter id="chap-Protocol">
+ <title>Wayland Protocol and Model of Operation</title>
+ <section id="sect-Protocol-Basic-Principles">
+ <title>Basic Principles</title>
+ <para>
+ The Wayland protocol is an asynchronous, object-oriented protocol. All
+ requests are method invocations on some object. The requests include
+ an object ID that uniquely identifies an object on the server. Each
+ object implements an interface and the requests include an opcode that
+ identifies which method in the interface to invoke.
+ </para>
+ <para>
+ The protocol is message-based. A message sent by a client to the server
+ is called a request. A message from the server to a client is called an event.
+ A message has a number of arguments, each of which has a certain type (see
+ <xref linkend="sect-Protocol-Wire-Format"/> for a list of argument types).
+ </para>
+ <para>
+ Additionally, the protocol can specify <type>enum</type>s which associate
+ names to specific numeric enumeration values. These are primarily just
+ descriptive in nature: at the wire format level enums are just integers.
+ But they also serve a secondary purpose to enhance type safety or
+ otherwise add context for use in language bindings or other such code.
+ This latter usage is only supported so long as code written before these
+ attributes were introduced still works after; in other words, adding an
+ enum should not break API, otherwise it puts backwards compatibility at
+ risk.
+ </para>
+ <para>
+ <type>enum</type>s can be defined as just a set of integers, or as
+ bitfields. This is specified via the <type>bitfield</type> boolean
+ attribute in the <type>enum</type> definition. If this attribute is true,
+ the enum is intended to be accessed primarily using bitwise operations,
+ for example when arbitrarily many choices of the enum can be ORed
+ together; if it is false, or the attribute is omitted, then the enum
+ arguments are just a sequence of numerical values.
+ </para>
+ <para>
+ The <type>enum</type> attribute can be used on either <type>uint</type>
+ or <type>int</type> arguments; however, if the <type>enum</type> is
+ defined as a <type>bitfield</type>, it can only be used on
+ <type>uint</type> args.
+ </para>
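+ <para>
+ The difference between a plain enum and a bitfield enum can be sketched
+ in C as follows. This is a minimal illustration only: the
+ <code>example_*</code> names and values are hypothetical and do not come
+ from any protocol XML. It shows that a bitfield argument is an ordinary
+ <type>uint</type> on the wire whose bits can be ORed together and tested
+ individually.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bitfield enum: every value is a distinct bit. */
enum example_capability {
    EXAMPLE_CAPABILITY_POINTER  = 1,
    EXAMPLE_CAPABILITY_KEYBOARD = 2,
    EXAMPLE_CAPABILITY_TOUCH    = 4
};

/* Hypothetical plain enum: an argument carries exactly one of these values. */
enum example_orientation {
    EXAMPLE_ORIENTATION_NORMAL  = 0,
    EXAMPLE_ORIENTATION_FLIPPED = 1
};

/* On the wire a bitfield argument is just a uint; test one bit of it. */
bool example_has_capability(uint32_t caps, enum example_capability cap)
{
    return (caps & (uint32_t)cap) != 0;
}
```
+]]></programlisting>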
+ <para>
+ The server sends back events to the client; each event is emitted from
+ an object. Events can be error conditions. The event includes the
+ object ID and the event opcode, from which the client can determine
+ the type of event. Events are generated either in response to requests
+ (in which case the request and the event constitute a round trip) or
+ spontaneously when the server state changes.
+ </para>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ State is broadcast on connect, and events are sent
+ out when state changes. Clients must listen for
+ these changes and cache the state.
+ There is no need (or mechanism) to query server state.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The server will broadcast the presence of a number of global objects,
+ which in turn will broadcast their current state.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </section>
+ <section id="sect-Protocol-Code-Generation">
+ <title>Code Generation</title>
+ <para>
+ The interfaces, requests and events are defined in
+ <filename>protocol/wayland.xml</filename>.
+ This xml is used to generate the function prototypes that can be used by
+ clients and compositors.
+ </para>
+ <para>
+ The protocol entry points are generated as inline functions which just
+ wrap the <function>wl_proxy_*</function> functions. The inline functions aren't
+ part of the library ABI and language bindings should generate their
+ own stubs for the protocol entry points from the xml.
+ </para>
+ </section>
+ <section id="sect-Protocol-Wire-Format">
+ <title>Wire Format</title>
+ <para>
+ The protocol is sent over a UNIX domain stream socket, where the endpoint
+ usually is named <systemitem class="service">wayland-0</systemitem>
+ (although it can be changed via <emphasis>WAYLAND_DISPLAY</emphasis>
+ in the environment). Beginning in Wayland 1.15, implementations can
+ optionally support server socket endpoints located at arbitrary
+ locations in the filesystem by setting <emphasis>WAYLAND_DISPLAY</emphasis>
+ to the absolute path at which the server endpoint listens. The socket may
+ also be provided through file descriptor inheritance, in which case
+ <emphasis>WAYLAND_SOCKET</emphasis> is set.
+ </para>
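+ <para>
+ The endpoint lookup described above can be sketched as a small helper.
+ This is an illustration under stated assumptions, not the libwayland
+ implementation: <code>resolve_socket_path</code> is a hypothetical name,
+ and relative names are assumed to be resolved against the
+ <emphasis>XDG_RUNTIME_DIR</emphasis> directory, which the text above
+ does not specify. The <emphasis>WAYLAND_SOCKET</emphasis> case is not
+ covered.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stddef.h>
#include <stdio.h>

/*
 * display:     value of WAYLAND_DISPLAY, or NULL if unset
 * runtime_dir: value of XDG_RUNTIME_DIR, or NULL if unset (assumption:
 *              relative names live in this directory)
 * Returns 0 on success, -1 if no path can be formed or it does not fit.
 */
int resolve_socket_path(const char *display, const char *runtime_dir,
                        char *out, size_t out_size)
{
    int n;

    if (display == NULL)
        display = "wayland-0";          /* the usual default name */

    if (display[0] == '/') {
        /* Absolute path: use it verbatim. */
        n = snprintf(out, out_size, "%s", display);
    } else {
        if (runtime_dir == NULL)
            return -1;
        n = snprintf(out, out_size, "%s/%s", runtime_dir, display);
    }

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```
+]]></programlisting>
+ <para>
+ A caller would typically pass the results of
+ <function>getenv</function> for both variables and then connect a UNIX
+ domain stream socket to the resulting path.
+ </para>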
+ <para>
+ Every message is structured as 32-bit words; values are represented in the
+ host's byte-order. The message header has 2 words in it:
+ <itemizedlist>
+ <listitem>
+ <para>
+ The first word is the sender's object ID (32-bit).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The second word has two 16-bit parts. The upper 16 bits are the message
+ size in bytes, starting at the header (i.e. it has a minimum value of 8).
+ The lower 16 bits are the request/event opcode.
+ </para>
+ </listitem>
+ </itemizedlist>
+ The payload describes the request/event arguments. Every argument is always
+ aligned to 32 bits. Where padding is required, the value of padding bytes is
+ undefined. There is no prefix that describes the type of an argument;
+ it is inferred implicitly from the xml specification.
+ </para>
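+ <para>
+ The two header words described above can be packed and unpacked as in
+ the following sketch. The <code>wire_header</code> struct and both
+ function names are hypothetical and are not part of any library API;
+ values are kept in host byte order, as the wire format requires.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view of a message header. */
struct wire_header {
    uint32_t object_id;  /* first word: sender's object ID */
    uint16_t size;       /* upper 16 bits of second word: bytes incl. header */
    uint16_t opcode;     /* lower 16 bits of second word */
};

/* Pack a header into two 32-bit words in host byte order. */
void wire_header_pack(const struct wire_header *h, uint32_t words[2])
{
    assert(h->size >= 8);  /* the header alone is already 8 bytes */
    words[0] = h->object_id;
    words[1] = ((uint32_t)h->size << 16) | h->opcode;
}

/* Recover the header fields from two 32-bit words. */
struct wire_header wire_header_unpack(const uint32_t words[2])
{
    struct wire_header h;

    h.object_id = words[0];
    h.size = (uint16_t)(words[1] >> 16);
    h.opcode = (uint16_t)(words[1] & 0xffffu);
    return h;
}
```
+]]></programlisting>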
+ <para>
+
+ The representations of the argument types are as follows:
+ <variablelist>
+ <varlistentry>
+ <term>int</term>
+ <term>uint</term>
+ <listitem>
+ <para>
+ The value is the 32-bit value of the signed/unsigned
+ int.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>fixed</term>
+ <listitem>
+ <para>
+ Signed 24.8 fixed-point numbers. It is a signed fixed-point type which
+ offers a sign bit, 23 bits of integer precision and 8 bits of
+ fractional precision. This is exposed as an opaque struct with
+ conversion helpers to and from double and int on the C API side.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>string</term>
+ <listitem>
+ <para>
+ Starts with an unsigned 32-bit length (including null terminator),
+ followed by the UTF-8 encoded string contents, including
+ terminating null byte, then padding to a 32-bit boundary. A null
+ value is represented with a length of 0. Interior null bytes are
+ not permitted.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>object</term>
+ <listitem>
+ <para>
+ 32-bit object ID. A null value is represented with an ID of 0.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>new_id</term>
+ <listitem>
+ <para>
+ The 32-bit object ID. Generally, the interface used for the new
+ object is inferred from the xml, but in the case where it's not
+ specified, a new_id is preceded by a <code>string</code> specifying
+ the interface name, and a <code>uint</code> specifying the version.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>array</term>
+ <listitem>
+ <para>
+ Starts with 32-bit array size in bytes, followed by the array
+ contents verbatim, and finally padding to a 32-bit boundary.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>fd</term>
+ <listitem>
+ <para>
+ The file descriptor is not stored in the message buffer, but in
+ the ancillary data of the UNIX domain socket message (msg_control).
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
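+ <para>
+ Two of the encodings above, <code>fixed</code> and
+ <code>string</code>, can be sketched as follows. All names here
+ (<code>fixed_t</code>, <code>fixed_from_double</code>,
+ <code>wire_put_string</code> and so on) are hypothetical stand-ins, not
+ the C API helpers mentioned above. The conversion simply scales by 256
+ and truncates, and the padding bytes, whose value the protocol leaves
+ undefined, are zeroed here as an arbitrary choice.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the opaque fixed type: 24.8 in an int32. */
typedef int32_t fixed_t;

fixed_t fixed_from_double(double d) { return (fixed_t)(d * 256.0); }
double  fixed_to_double(fixed_t f)  { return f / 256.0; }
int     fixed_to_int(fixed_t f)     { return f / 256; }

/*
 * Encode a string argument into buf: 32-bit length (including the
 * terminating null byte), the bytes, then padding to a 32-bit boundary.
 * A NULL string is encoded as a length of 0.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t wire_put_string(uint8_t *buf, size_t cap, const char *s)
{
    uint32_t len = s ? (uint32_t)strlen(s) + 1 : 0;
    size_t padded = ((size_t)len + 3) & ~(size_t)3;
    size_t total = 4 + padded;

    if (total > cap)
        return 0;

    memcpy(buf, &len, 4);        /* length word in host byte order */
    memset(buf + 4, 0, padded);  /* zero the padding (value is undefined) */
    if (s)
        memcpy(buf + 4, s, len); /* contents including the null byte */
    return total;
}
```
+]]></programlisting>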
+ <para>
+ The protocol does not specify the exact position of the ancillary data
+ in the stream, except that the order of file descriptors is the same as
+ the order of messages and <code>fd</code> arguments within messages on
+ the wire.
+ </para>
+ <para>
+ In particular, it means that any byte of the stream, even the message
+ header, may carry the ancillary data with file descriptors.
+ </para>
+ <para>
+ Clients and compositors should queue incoming data until they have
+ whole messages to process, as file descriptors may arrive earlier
+ or later than the corresponding data bytes.
+ </para>
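+ <para>
+ Passing a file descriptor in the ancillary data of a UNIX domain socket
+ message can be sketched with the standard
+ <function>sendmsg</function> and <function>recvmsg</function> calls and
+ an <code>SCM_RIGHTS</code> control message. The helper names
+ <code>send_with_fd</code> and <code>recv_with_fd</code> are
+ hypothetical, and the sketch handles exactly one descriptor per call;
+ a real implementation also has to queue data and descriptors as
+ described above.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for one file descriptor. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send len bytes of data and attach fd as ancillary data. */
ssize_t send_with_fd(int sock, const void *data, size_t len, int fd)
{
    struct iovec iov = { .iov_base = (void *)data, .iov_len = len };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0);
}

/* Receive up to cap bytes; *fd_out is the received fd, or -1 if none. */
ssize_t recv_with_fd(int sock, void *data, size_t cap, int *fd_out)
{
    struct iovec iov = { .iov_base = data, .iov_len = cap };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    *fd_out = -1;
    n = recvmsg(sock, &msg, 0);
    if (n < 0)
        return n;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
            memcpy(fd_out, CMSG_DATA(cmsg), sizeof(int));
            break;
        }
    }
    return n;
}
```
+]]></programlisting>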
+ </section>
+ <xi:include href="ProtocolInterfaces.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
+ <section id="sect-Protocol-Versioning">
+ <title>Versioning</title>
+ <para>
+ Every interface is versioned and every protocol object implements a
+ particular version of its interface. For global objects, the maximum
+ version supported by the server is advertised with the global and the
+ actual version of the created protocol object is determined by the
+ version argument passed to wl_registry.bind(). For objects that are
+ not globals, their version is inferred from the object that created
+ them.
+ </para>
+ <para>
+ In order to keep things sane, this has a few implications for
+ interface versions:
+ <itemizedlist>
+ <listitem>
+ <para>
+ The object creation hierarchy must be a tree. Otherwise,
+ inferring object versions from the parent object becomes much
+ more difficult to track properly.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When the version of an interface increases, so does the version
+ of its parent (recursively until you get to a global interface).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A global interface's version number acts like a counter for all
+ of its child interfaces. Whenever a child interface gets
+ modified, the global parent's interface version number also
+ increases (see above). The child interface then takes on the
+ same version number as the new version of its parent global
+ interface.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ To illustrate the above, consider the wl_compositor interface. It
+ has two children, wl_surface and wl_region. As of Wayland version
+ 1.2, wl_surface and wl_compositor are both at version 3. If
+ something is added to the wl_region interface, both wl_region and
+ wl_compositor will get bumped to version 4. If, afterwards,
+ wl_surface is changed, both wl_compositor and wl_surface will be at
+ version 5. In this way the global interface version is used as a
+ sort of "counter" for all of its child interfaces. This makes it
+ very simple to know the version of the child given the version of its
+ parent. The child is at the highest possible interface version that
+ is less than or equal to its parent's version.
+ </para>
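+ <para>
+ The rule in the last sentence can be sketched as a lookup over the list
+ of versions at which a child interface was changed. The function name
+ <code>child_version</code> is hypothetical, and the example histories
+ mentioned below are the hypothetical future bumps from the paragraph
+ above (with wl_region assumed to start at version 1), not a statement
+ about the real protocol history.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/*
 * versions: the versions at which the child interface was changed
 * count:    number of entries in versions
 * Returns the highest listed version that is less than or equal to
 * parent_version, or 0 if the child did not exist at that version.
 */
uint32_t child_version(const uint32_t *versions, size_t count,
                       uint32_t parent_version)
{
    uint32_t best = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (versions[i] <= parent_version && versions[i] > best)
            best = versions[i];
    }
    return best;
}
```
+]]></programlisting>
+ <para>
+ With a wl_surface history of 1, 2, 3, 5 and a wl_region history of 1, 4,
+ a wl_compositor bound at version 4 yields wl_surface at version 3 and
+ wl_region at version 4.
+ </para>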
+ <para>
+ It is worth noting a particular exception to the above versioning
+ scheme. The wl_display (and, by extension, wl_registry) interface
+ cannot change because it is the core protocol object and its version
+ is never advertised nor is there a mechanism to request a different
+ version.
+ </para>
+ </section>
+ <section id="sect-Protocol-Connect-Time">
+ <title>Connect Time</title>
+ <para>
+ There is no fixed connection setup information. Instead, the server
+ emits multiple events at connect time to indicate the presence and
+ properties of global objects: outputs, compositor, input devices.
+ </para>
+ </section>
+ <section id="sect-Protocol-Security-and-Authentication">
+ <title>Security and Authentication</title>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ mostly about access to underlying buffers, need new drm auth
+ mechanism (the grant-to ioctl idea), need to check the cmd stream?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ getting the server socket depends on the compositor type: it could
+ be a system-wide name, it could be obtained through fd passing on
+ the session D-Bus, or the client is forked by the compositor and
+ the fd is already opened.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </section>
+ <section id="sect-Protocol-Creating-Objects">
+ <title>Creating Objects</title>
+ <para>
+ Each object has a unique ID. The IDs are allocated by the entity
+ creating the object (either client or server). IDs allocated by the
+ client are in the range [1, 0xfeffffff] while IDs allocated by the
+ server are in the range [0xff000000, 0xffffffff]. The 0 ID is
+ reserved to represent a null or non-existent object.
+
+ For efficiency purposes, the IDs are densely packed in the sense that
+ the ID N will not be used until N-1 has been used. This ordering is
+ not merely a guideline, but a strict requirement, and there are
+ implementations of the protocol that rigorously enforce this rule,
+ including the ubiquitous libwayland.
+ </para>
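+ <para>
+ The ID ranges and the dense-packing rule can be sketched as two small
+ checks. The names <code>classify_id</code> and
+ <code>client_id_is_acceptable</code> are hypothetical, and the second
+ check is deliberately simplified: it only tests the range and the
+ "no gaps" rule and does not track which IDs are currently in use, which
+ a real implementation would also have to do.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stdbool.h>
#include <stdint.h>

#define CLIENT_ID_MAX 0xfeffffffu
#define SERVER_ID_MIN 0xff000000u

enum id_owner { ID_OWNER_NONE, ID_OWNER_CLIENT, ID_OWNER_SERVER };

/* Tell which side allocated an ID, based only on its range. */
enum id_owner classify_id(uint32_t id)
{
    if (id == 0)
        return ID_OWNER_NONE;    /* null or non-existent object */
    if (id <= CLIENT_ID_MAX)
        return ID_OWNER_CLIENT;
    return ID_OWNER_SERVER;      /* SERVER_ID_MIN .. 0xffffffff */
}

/*
 * highest_used: largest client ID used so far (0 if none yet).
 * A new client ID may extend the used range by at most one, so ID N is
 * never accepted before N-1 has been used.
 */
bool client_id_is_acceptable(uint32_t id, uint32_t highest_used)
{
    if (classify_id(id) != ID_OWNER_CLIENT)
        return false;
    return id <= highest_used + 1;
}
```
+]]></programlisting>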
+ </section>
+ <section id="sect-Protocol-Compositor">
+ <title>Compositor</title>
+ <para>
+ The compositor is a global object, advertised at connect time.
+ </para>
+ <para>
+ See <xref linkend="protocol-spec-wl_compositor"/> for the
+ protocol description.
+ </para>
+ </section>
+ <section id="sect-Protocol-Surface">
+ <title>Surfaces</title>
+ <para>
+ A surface manages a rectangular grid of pixels that clients create
+ for displaying their content to the screen. Clients don't know
+ the global position of their surfaces, and cannot access other
+ clients' surfaces.
+ </para>
+ <para>
+ Once the client has finished writing pixels, it 'commits' the
+ buffer; this permits the compositor to access the buffer and read
+ the pixels. When the compositor is finished, it releases the
+ buffer back to the client.
+ </para>
+ <para>
+ See <xref linkend="protocol-spec-wl_surface"/> for the protocol
+ description.
+ </para>
+ </section>
+ <section id="sect-Protocol-Input">
+ <title>Input</title>
+ <para>
+ A seat represents a group of input devices including mice,
+ keyboards and touchscreens. It has a keyboard and pointer
+ focus. Seats are global objects. Pointer events are delivered
+ in surface-local coordinates.
+ </para>
+ <para>
+ The compositor maintains an implicit grab when a button is
+ pressed, to ensure that the corresponding button release
+ event gets delivered to the same surface. But there is no way
+ for clients to take an explicit grab. Instead, surfaces can
+ be mapped as 'popup', which combines transient window semantics
+ with a pointer grab.
+ </para>
+ <para>
+ To avoid race conditions, input events that are likely to
+ trigger further requests (such as button presses, key events,
+ pointer motions) carry serial numbers, and requests such as
+ wl_shell_surface.set_popup require that the serial number of the
+ triggering event is specified. The server maintains a
+ monotonically increasing counter for these serial numbers.
+ </para>
+ <para>
+ Input events also carry timestamps with millisecond granularity.
+ Their base is undefined, so they can't be compared against
+ system time (as obtained with clock_gettime or gettimeofday).
+ They can be compared with each other though, and for instance
+ be used to identify sequences of button presses as double
+ or triple clicks.
+ </para>
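+ <para>
+ Comparing two event timestamps to detect a double click can be sketched
+ as below. The function name <code>is_double_click</code> and the
+ 400 millisecond threshold are hypothetical choices. The sketch assumes
+ the timestamps are 32-bit unsigned millisecond values and uses unsigned
+ subtraction so that the comparison still works if the counter wraps
+ around between the two events.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold; toolkits usually make this configurable. */
#define DOUBLE_CLICK_MS 400u

/*
 * prev_time, time: timestamps of two consecutive button press events,
 * in milliseconds with an undefined base. Only their difference is
 * meaningful, so they are never compared against system time.
 */
bool is_double_click(uint32_t prev_time, uint32_t time)
{
    uint32_t elapsed = (uint32_t)(time - prev_time);  /* wraps safely */

    return elapsed <= DOUBLE_CLICK_MS;
}
```
+]]></programlisting>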
+ <para>
+ See <xref linkend="protocol-spec-wl_seat"/> for the
+ protocol description.
+ </para>
+ <para>
+ Talk about:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ keyboard map, change events
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ xkb on Wayland
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ multi pointer Wayland
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ A surface can change the pointer image when the surface is the pointer
+ focus of the input device. Wayland doesn't automatically change the
+ pointer image when a pointer enters a surface, but expects the
+ application to set the cursor it wants in response to the pointer
+ focus and motion events. The rationale is that a client has to manage
+ changing pointer images for UI elements within the surface in response
+ to motion events anyway, so we'll make that the only mechanism for
+ setting or changing the pointer image. If the server receives a request
+ to set the pointer image after the surface loses pointer focus, the
+ request is ignored. To the client this will look like it successfully
+ set the pointer image.
+ </para>
+ <para>
+ Setting the pointer image to NULL causes the cursor to be hidden.
+ </para>
+ <para>
+ The compositor will revert the pointer image back to a default image
+ when no surface has the pointer focus for that device.
+ </para>
+ <para>
+ What if the pointer moves from one window which has set a special
+ pointer image to a surface that doesn't set an image in response to
+ the motion event? The new surface will be stuck with the special
+ pointer image. We can't just revert the pointer image on leaving a
+ surface, since if we immediately enter a surface that sets a different
+ image, the image will flicker. If a client does not set a pointer image
+ when the pointer enters a surface, the pointer stays with the image set
+ by the last surface that changed it, possibly even hidden. Such a client
+ is likely just broken.
+ </para>
+ </section>
+ <section id="sect-Protocol-Output">
+ <title>Output</title>
+ <para>
+ An output is a global object, advertised at connect time or as it
+ comes and goes.
+ </para>
+ <para>
+ See <xref linkend="protocol-spec-wl_output"/> for the protocol
+ description.
+ </para>
+ <para>
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ laid out in a big (compositor) coordinate system
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ basically xrandr over Wayland
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ geometry needs position in compositor coordinate system
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ events to advertise available modes, requests to move and change
+ modes
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section id="sect-Protocol-data-sharing">
+ <title>Data sharing between clients</title>
+ <para>
+ The Wayland protocol provides clients a mechanism for sharing
+ data that allows the implementation of copy-paste and
+ drag-and-drop. The client providing the data creates a
+ <function>wl_data_source</function> object and the clients
+ obtaining the data will see it as a <function>wl_data_offer</function>
+ object. This interface allows the clients to agree on a mutually
+ supported mime type and transfer the data via a file descriptor
+ that is passed through the protocol.
+ </para>
+ <para>
+ The next section explains the negotiation between data source and
+ data offer objects. <xref linkend="sect-Protocol-data-sharing-devices"/>
+ explains how these objects are created and passed to different
+ clients using the <function>wl_data_device</function> interface
+ that implements copy-paste and drag-and-drop support.
+ </para>
+ <para>
+ See <xref linkend="protocol-spec-wl_data_offer"/>,
+ <xref linkend="protocol-spec-wl_data_source"/>,
+ <xref linkend="protocol-spec-wl_data_device"/> and
+ <xref linkend="protocol-spec-wl_data_device_manager"/> for
+ protocol descriptions.
+ </para>
+ <para>
+ MIME is defined in RFCs 2045-2049. A
+ <ulink url="https://www.iana.org/assignments/media-types/media-types.xhtml">
+ registry of MIME types</ulink> is maintained by the Internet Assigned
+ Numbers Authority (IANA).
+ </para>
+ <section>
+ <title>Data negotiation</title>
+ <para>
+ A client providing data to other clients will create a <function>wl_data_source</function>
+ object and advertise the mime types for the formats it supports for
+ that data through the <function>wl_data_source.offer</function>
+ request. On the receiving end, the data offer object will generate one
+ <function>wl_data_offer.offer</function> event for each supported mime
+ type.
+ </para>
+ <para>
+ The actual data transfer happens when the receiving client sends a
+ <function>wl_data_offer.receive</function> request. This request takes
+ a mime type and a file descriptor as arguments. This request will generate a
+ <function>wl_data_source.send</function> event on the sending client
+ with the same arguments, and the latter client is expected to write its
+ data to the given file descriptor using the chosen mime type.
+ </para>
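+ <para>
+ The file descriptor side of this exchange can be sketched without any
+ Wayland calls by using a pipe: the receiving client keeps the read end
+ and hands the write end to the source. Everything here is an
+ illustration under assumptions. The function names
+ <code>source_handle_send</code> and <code>receive_all</code>, the
+ payload and the single supported mime type are hypothetical, and the
+ function call stands in for the
+ <function>wl_data_offer.receive</function> request and the
+ <function>wl_data_source.send</function> event.
+ </para>
+ <programlisting language="C"><![CDATA[
```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Source side: write the data in the requested mime type to fd and
 * close it so that the reader sees end-of-file.
 * Returns 0 on success, -1 on an unsupported type or a short write.
 */
int source_handle_send(const char *mime_type, int fd)
{
    static const char text[] = "hello from the source";
    size_t len = sizeof text - 1;
    int ok = -1;

    if (strcmp(mime_type, "text/plain;charset=utf-8") == 0 &&
        write(fd, text, len) == (ssize_t)len)
        ok = 0;

    close(fd);
    return ok;
}

/*
 * Receiving side: read from fd until end-of-file or until buf is full.
 * Returns the number of bytes read, or -1 on a read error.
 */
ssize_t receive_all(int fd, char *buf, size_t cap)
{
    size_t total = 0;
    ssize_t n;

    while (total < cap) {
        n = read(fd, buf + total, cap - total);
        if (n < 0)
            return -1;
        if (n == 0)
            break;               /* writer closed its end */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```
+]]></programlisting>
+ <para>
+ In a real client the read would be driven from the event loop rather
+ than done in a blocking loop, since the source may write slowly or not
+ at all.
+ </para>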
+ </section>
+ <section id="sect-Protocol-data-sharing-devices">
+ <title>Data devices</title>
+ <para>
+ Data devices glue data sources and offers together. A data device is
+ associated with a <function>wl_seat</function> and is obtained by the clients using the
+ <function>wl_data_device_manager</function> factory object, which is also responsible for
+ creating data sources.
+ </para>
+ <para>
+ Clients are informed of new data offers through the
+ <function>wl_data_device.data_offer</function> event. After this
+ event is generated the data offer will advertise the available mime
+ types. New data offers are introduced prior to their use for
+ copy-paste or drag-and-drop.
+ </para>
+ <section>
+ <title>Selection</title>
+ <para>
+ Each data device has a selection data source. Clients create a data
+ source object using the device manager and may set it as the
+ current selection for a given data device. Whenever the current
+ selection changes, the client with keyboard focus receives a
+ <function>wl_data_device.selection</function> event. This event is
+ also generated on a client immediately before it receives keyboard
+ focus.
+ </para>
+ <para>
+ The data offer is introduced with the
+ <function>wl_data_device.data_offer</function> event before the
+ selection event.
+ </para>
+ </section>
+ <section>
+ <title>Drag and Drop</title>
+ <para>
+ A drag-and-drop operation is started using the
+ <function>wl_data_device.start_drag</function> request. This
+ request causes a pointer grab that will generate enter, motion and
+ leave events on the data device. A data source is supplied as
+ argument to start_drag, and data offers associated with it are
+ supplied to client surfaces under the pointer in the
+ <function>wl_data_device.enter</function> event. The data offer
+ is introduced to the client prior to the enter event with the
+ <function>wl_data_device.data_offer</function> event.
+ </para>
+ <para>
+ Clients are expected to provide feedback to the data sending client
+ by sending the <function>wl_data_offer.accept</function> request with
+ a mime type they accept. If none of the advertised mime types is
+ supported by the receiving client, it should supply NULL to the
+ accept request. The accept request causes the sending client to
+ receive a <function>wl_data_source.target</function> event with the
+ chosen mime type.
+ </para>
+ <para>
+ When the drag ends, the receiving client receives a
+ <function>wl_data_device.drop</function> event, at which point it is
+ expected to transfer the data using the
+ <function>wl_data_offer.receive</function> request.
+ </para>
+ </section>
+ </section>
+ </section>
+</chapter>