Network Working Group                                         J. Lazzaro
Request for Comments: 4695                                  J. Wawrzynek
Category: Standards Track                                    UC Berkeley
                                                           November 2006

RTP Payload Format for MIDI

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The IETF Trust (2006).

Abstract

This memo describes a Real-time Transport Protocol (RTP) payload format for the MIDI (Musical Instrument Digital Interface) command language. The format encodes all commands that may legally appear on a MIDI 1.0 DIN cable. The format is suitable for interactive applications (such as network musical performance) and content-delivery applications (such as file streaming).
The format may be used over unicast and multicast UDP and TCP, and it defines tools for graceful recovery from packet loss. Stream behavior, including the MIDI rendering method, may be customized during session setup. The format also serves as a mode for the mpeg4-generic format, to support the MPEG 4 Audio Object Types for General MIDI, Downloadable Sounds Level 2, and Structured Audio.

Table of Contents

1. Introduction
   1.1. Terminology
   1.2. Bitfield Conventions
2. Packet Format
   2.1. RTP Header
   2.2. MIDI Payload
3. MIDI Command Section
   3.1. Timestamps
   3.2. Command Coding
4. The Recovery Journal System
5. Recovery Journal Format
6. Session Description Protocol
   6.1. Session Descriptions for Native Streams
   6.2. Session Descriptions for mpeg4-generic Streams
   6.3. Parameters
7. Extensibility
8. Congestion Control
9. Security Considerations
10. Acknowledgements
11. IANA Considerations
   11.1. rtp-midi Media Type Registration
      11.1.1. Repository Request for "audio/rtp-midi"
   11.2. mpeg4-generic Media Type Registration
      11.2.1. Repository Request for Mode rtp-midi for mpeg4-generic
   11.3. asc Media Type Registration
A. The Recovery Journal Channel Chapters
   A.1. Recovery Journal Definitions
   A.2. Chapter P: MIDI Program Change
   A.3. Chapter C: MIDI Control Change
      A.3.1. Log Inclusion Rules
      A.3.2. Controller Log Format
      A.3.3. Log List Coding Rules
      A.3.4. The Parameter System
   A.4. Chapter M: MIDI Parameter System
      A.4.1. Log Inclusion Rules
      A.4.2. Log Coding Rules
         A.4.2.1. The Value Tool
         A.4.2.2. The Count Tool
   A.5. Chapter W: MIDI Pitch Wheel
   A.6. Chapter N: MIDI NoteOff and NoteOn
      A.6.1. Header Structure
      A.6.2. Note Structures
   A.7. Chapter E: MIDI Note Command Extras
      A.7.1. Note Log Format
      A.7.2. Log Inclusion Rules
   A.8. Chapter T: MIDI Channel Aftertouch
   A.9. Chapter A: MIDI Poly Aftertouch
B. The Recovery Journal System Chapters
   B.1. System Chapter D: Simple System Commands
      B.1.1. Undefined System Commands
   B.2. System Chapter V: Active Sense Command
   B.3. System Chapter Q: Sequencer State Commands
      B.3.1. Non-compliant Sequencers
   B.4. System Chapter F: MIDI Time Code Tape Position
      B.4.1. Partial Frames
   B.5. System Chapter X: System Exclusive
      B.5.1. Chapter Format
      B.5.2. Log Inclusion Semantics
      B.5.3. TCOUNT and COUNT Fields
C. Session Configuration Tools
   C.1. Configuration Tools: Stream Subsetting
   C.2. Configuration Tools: The Journalling System
      C.2.1. The j_sec Parameter
      C.2.2. The j_update Parameter
         C.2.2.1. The anchor Sending Policy
         C.2.2.2. The closed-loop Sending Policy
         C.2.2.3. The open-loop Sending Policy
      C.2.3. Recovery Journal Chapter Inclusion Parameters
   C.3. Configuration Tools: Timestamp Semantics
      C.3.1. The comex Algorithm
      C.3.2. The async Algorithm
      C.3.3. The buffer Algorithm
   C.4. Configuration Tools: Packet Timing Tools
      C.4.1. Packet Duration Tools
      C.4.2. The guardtime Parameter
   C.5. Configuration Tools: Stream Description
   C.6. Configuration Tools: MIDI Rendering
      C.6.1. The multimode Parameter
      C.6.2. Renderer Specification
      C.6.3. Renderer Initialization
      C.6.4. MIDI Channel Mapping
         C.6.4.1. The smf_info Parameter
         C.6.4.2. The smf_inline, smf_url, and smf_cid Parameters
         C.6.4.3. The chanmask Parameter
      C.6.5. The audio/asc Media Type
   C.7. Interoperability
      C.7.1. MIDI Content Streaming Applications
      C.7.2. MIDI Network Musical Performance Applications
D. Parameter Syntax Definitions
E. A MIDI Overview for Networking Specialists
   E.1. Commands Types
   E.2. Running Status
   E.3. Command Timing
   E.4. AudioSpecificConfig Templates for MMA Renderers
References
   Normative References
   Informative References

1. Introduction

The Internet Engineering Task Force (IETF) has developed a set of focused tools for multimedia networking ([RFC3550] [RFC4566] [RFC3261] [RFC2326]). These tools can be combined in different ways to support a variety of real-time applications over Internet Protocol (IP) networks.

For example, a telephony application might use the Session Initiation Protocol (SIP, [RFC3261]) to set up a phone call. Call setup would include negotiations to agree on a common audio codec [RFC3264]. Negotiations would use the Session Description Protocol (SDP, [RFC4566]) to describe candidate codecs. After a call is set up, audio data would flow between the parties using the Real-time Transport Protocol (RTP, [RFC3550]) under any applicable profile (for example, the Audio/Visual Profile (AVP, [RFC3551])).
The tools used in this telephony example (SIP, SDP, RTP) might be combined in a different way to support a content streaming application, perhaps in conjunction with other tools, such as the Real Time Streaming Protocol (RTSP, [RFC2326]).

The MIDI (Musical Instrument Digital Interface) command language [MIDI] is widely used in musical applications that are analogous to the examples described above. On stage and in the recording studio, MIDI is used for the interactive remote control of musical instruments, an application similar in spirit to telephony. On web pages, Standard MIDI Files (SMFs, [MIDI]) rendered using the General MIDI standard [MIDI] provide a low-bandwidth substitute for audio streaming.

This memo is motivated by a simple premise: if MIDI performances could be sent as RTP streams that are managed by IETF session tools, a hybridization of the MIDI and IETF application domains may occur. For example, interoperable MIDI networking may foster network music performance applications, in which a group of musicians, located at different physical locations, interact over a network to perform as they would if they were located in the same room [NMP]. As a second example, the streaming community may begin to use MIDI for low-bitrate audio coding, perhaps in conjunction with normative sound synthesis methods [MPEGSA].

To enable MIDI applications to use RTP, this memo defines an RTP payload format and its media type. Sections 2-5 and Appendices A-B define the RTP payload format. Section 6 and Appendices C-D define the media types identifying the payload format, the parameters needed for configuration, and how the parameters are utilized in SDP. Appendix C also includes interoperability guidelines for the example applications described above: network musical performance using SIP (Appendix C.7.2) and content-streaming using RTSP (Appendix C.7.1).
Another potential application area for RTP MIDI is MIDI networking for professional audio equipment and electronic musical instruments. We do not offer interoperability guidelines for this application in this memo. However, RTP MIDI has been designed with stage and studio applications in mind, and we expect that efforts to define a stage and studio framework will rely on RTP MIDI for MIDI transport services.

Some applications may require MIDI media delivery at a certain service quality level (latency, jitter, packet loss, etc.). RTP itself does not provide service guarantees. However, applications may use lower-layer network protocols to configure the quality of the transport services that RTP uses. These protocols may act to reserve network resources for RTP flows [RFC2205] or may simply direct RTP traffic onto a dedicated "media network" in a local installation. Note that RTP and the MIDI payload format do provide tools that applications may use to achieve the best possible real-time performance at a given service level.

This memo normatively defines the syntax and semantics of the MIDI payload format. However, this memo does not define algorithms for sending and receiving packets. An ancillary document [RFC4696] provides informative guidance on algorithms. Supplemental information may be found in related conference publications [NMP] [GRAME].

Throughout this memo, the phrase "native stream" refers to a stream that uses the rtp-midi media type. The phrase "mpeg4-generic stream" refers to a stream that uses the mpeg4-generic media type (in mode rtp-midi) to operate in an MPEG 4 environment [RFC3640]. Section 6 describes this distinction in detail.

1.1. Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [RFC2119].

1.2. Bitfield Conventions

In this document, the packet bitfields that share a common name often have identical semantics. As most of these bitfields appear in Appendices A-B, we define the common bitfield names in Appendix A.1.

However, a few of these common names also appear in the main text of this document. For convenience, we list these definitions below:

   o  R flag bit. R flag bits are reserved for future use. Senders MUST set R bits to 0. Receivers MUST ignore R bit values.

   o  LENGTH field. Every field named LENGTH (as distinct from LEN) codes the number of octets in the structure that contains it, including the header it resides in and all hierarchical levels below it. If a structure contains a LENGTH field, a receiver MUST use the LENGTH field value to advance past the structure during parsing, rather than use knowledge about the internal format of the structure.

2. Packet Format

In this section, we introduce the format of RTP MIDI packets. The description includes some background information on RTP, for the benefit of MIDI implementors new to IETF tools. Implementors should consult [RFC3550] for an authoritative description of RTP.

This memo assumes that the reader is familiar with MIDI syntax and semantics. Appendix E provides a MIDI overview, at a level of detail sufficient to understand most of this memo. Implementors should consult [MIDI] for an authoritative description of MIDI.

The MIDI payload format maps a MIDI command stream (16 voice channels + systems) onto an RTP stream. An RTP media stream is a sequence of logical packets that share a common format. Each packet consists of two parts: the RTP header and the MIDI payload. Figure 1 shows this format (vertical space delineates the header and payload).

We describe RTP packets as "logical" packets to highlight the fact that RTP itself is not a network-layer protocol. Instead, RTP packets are mapped onto network protocols (such as unicast UDP, multicast UDP, or TCP) by an application [ALF].
The interleaved mode of the Real Time Streaming Protocol (RTSP, [RFC2326]) is an example of an RTP mapping to TCP transport, as is [RFC4571].

2.1. RTP Header

[RFC3550] provides a complete description of the RTP header fields. In this section, we clarify the role of a few RTP header fields for MIDI applications. All fields are coded in network byte order (big-endian).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |P|X|  CC   |M|     PT      |        Sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     MIDI command section ...                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Journal section ...                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1 -- Packet format

The behavior of the 1-bit M field depends on the media type of the stream. For native streams, the M bit MUST be set to 1 if the MIDI command section has a non-zero LEN field, and MUST be set to 0 otherwise. For mpeg4-generic streams, the M bit MUST be set to 1 for all packets (to conform to [RFC3640]).

In an RTP MIDI stream, the 16-bit sequence number field is initialized to a randomly chosen value and is incremented by one (modulo 2^16) for each packet sent in the stream. A related quantity, the 32-bit extended packet sequence number, may be computed by tracking rollovers of the 16-bit sequence number. Note that different receivers of the same stream may compute different extended packet sequence numbers, depending on when the receiver joined the session.

The 32-bit timestamp field sets the base timestamp value for the packet. The payload codes MIDI command timing relative to this value.
The timestamp units are set by the clock rate parameter. For example, if the clock rate has a value of 44100 Hz, two packets whose base timestamp values differ by 2 seconds have RTP timestamp fields that differ by 88200.

Note that the clock rate parameter is not encoded within each RTP MIDI packet. A receiver of an RTP MIDI stream becomes aware of the clock rate as part of the session setup process. For example, if a session management tool uses the Session Description Protocol (SDP, [RFC4566]) to describe a media session, the clock rate parameter is set using the rtpmap attribute. We show examples of session setup in Section 6.

For RTP MIDI streams destined to be rendered into audio, the clock rate SHOULD be an audio sample rate of 32 kHz or higher. This recommendation is due to the sensitivity of human musical perception to small timing errors in musical note sequences, and due to the timbral changes that occur when clock rate quantization causes two near-simultaneous MIDI NoteOns to be rendered with a different timing than that desired by the content author.

RTP MIDI streams that are not destined for audio rendering (such as MIDI streams that control stage lighting) MAY use a lower clock rate but SHOULD use a clock rate high enough to avoid timing artifacts in the application.

For RTP MIDI streams destined to be rendered into audio, the clock rate SHOULD be chosen from rates in common use in professional audio applications or in consumer audio distribution. At the time of this writing, these rates include 32 kHz, 44.1 kHz, 48 kHz, 64 kHz, 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz. If the RTP MIDI session is a part of a synchronized media session that includes another (non-MIDI) RTP audio stream with a clock rate of 32 kHz or higher, the RTP MIDI stream SHOULD use a clock rate that matches the clock rate of the other audio stream.
However, if the RTP MIDI stream is destined to be rendered into audio, the RTP MIDI stream SHOULD NOT use a clock rate lower than 32 kHz, even if this second stream has a clock rate less than 32 kHz.

Timestamps of consecutive packets do not necessarily increment at a fixed rate, because RTP MIDI packets are not necessarily sent at a fixed rate. The degree of packet transmission regularity reflects the underlying application dynamics. Interactive applications may vary the packet sending rate to track the gestural rate of a human performer, whereas content-streaming applications may send packets at a fixed rate.

Therefore, the timestamps for two sequential RTP packets may be identical, or the second packet may have a timestamp arbitrarily larger than the first packet (modulo 2^32). Section 3 places additional restrictions on the RTP timestamps for two sequential RTP packets, as does the guardtime parameter (Appendix C.4.2).

We use the term "media time" to denote the temporal duration of the media coded by an RTP packet. The media time coded by a packet is computed by subtracting the last command timestamp in the MIDI command section from the RTP timestamp (modulo 2^32). If the MIDI list of the MIDI command section of a packet is empty, the media time coded by the packet is 0 ms. Appendix C.4.1 discusses media time issues in detail.

We now define RTP session semantics, in the context of sessions specified using the Session Description Protocol [RFC4566]. A session description media line ("m=") specifies an RTP session. An RTP session has an independent space of 2^32 synchronization sources. Synchronization source identifiers are coded in the SSRC header field of RTP session packets. The payload types that may appear in the PT header field of RTP session packets are listed at the end of the media line.

Several RTP MIDI streams may appear in an RTP session.
Each stream is distinguished by a unique SSRC value and has a unique sequence number and RTP timestamp space. Multiple streams in the RTP session may be sent by a single party. Multiple parties may send streams in the RTP session. An RTP MIDI stream encodes data for a single MIDI command name space (16 voice channels + systems).

Streams in an RTP session may use different payload types, or they may use the same payload type. However, each party may send, at most, one RTP MIDI stream for each payload type mapped to an RTP MIDI payload format in an RTP session. Recall that dynamic binding of payload type numbers in [RFC4566] lets a party map many payload type numbers to the RTP MIDI payload format; thus a party may send many RTP MIDI streams in a single RTP session. Pairs of streams (unicast or multicast) that communicate between two parties in an RTP session and that share a payload type have the same association as a MIDI cable pair that cross-connects two devices in a MIDI 1.0 DIN network.

The RTP session architecture described above is efficient in its use of network ports, as one RTP session (using a port pair per party) supports the transport of many MIDI name spaces (16 MIDI channels + systems). We define tools for grouping and labelling MIDI name spaces across streams and sessions in Appendix C.5 of this memo.

The RTP header timestamps for each stream in an RTP session have separately and randomly chosen initialization values. Receivers use the timing fields encoded in the RTP control protocol (RTCP, [RFC3550]) sender reports to synchronize the streams sent by a party. The SSRC values for each stream in an RTP session are also separately and randomly chosen, as described in [RFC3550]. Receivers use the CNAME field encoded in RTCP sender reports to verify that streams were sent by the same party, and to detect SSRC collisions, as described in [RFC3550].
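
As an informal illustration of the RTP header arithmetic described earlier in this section, the following Python sketch computes the M bit, tracks the 32-bit extended packet sequence number across rollovers of the 16-bit sequence number, and converts durations to RTP timestamp units. All names (marker_bit, ExtendedSeqTracker, seconds_to_ticks, timestamp_delta) are hypothetical and are not defined by this memo. The rollover heuristic, which treats a forward distance of less than 2^15 as normal progress, is a common implementation assumption rather than a rule stated here.

```python
def marker_bit(native_stream, len_field):
    """M bit: native streams set it only when LEN is non-zero;
    mpeg4-generic streams always set it."""
    if native_stream:
        return 1 if len_field != 0 else 0
    return 1


class ExtendedSeqTracker:
    """Hypothetical helper: derives 32-bit extended sequence numbers."""

    def __init__(self):
        self.cycles = 0      # count of observed 16-bit rollovers
        self.highest = None  # most advanced 16-bit sequence number seen

    def update(self, seq):
        if not 0 <= seq <= 0xFFFF:
            raise ValueError("sequence number must fit in 16 bits")
        cycles = self.cycles
        if self.highest is None:
            self.highest = seq
        else:
            delta = (seq - self.highest) & 0xFFFF  # forward distance mod 2^16
            if 0 < delta < 0x8000:
                # Forward progress; a numerically smaller value means a wrap.
                if seq < self.highest:
                    self.cycles += 1
                self.highest = seq
                cycles = self.cycles
            elif delta >= 0x8000 and seq > self.highest:
                # Assumption: a late packet sent before the most recent wrap.
                cycles = max(self.cycles - 1, 0)
        return (cycles << 16) | seq


def seconds_to_ticks(seconds, clock_rate_hz):
    """Convert a duration to RTP timestamp units at the session clock rate."""
    return round(seconds * clock_rate_hz)


def timestamp_delta(later, earlier):
    """Difference of two 32-bit RTP timestamps, modulo 2^32."""
    return (later - earlier) & 0xFFFFFFFF
```

For example, seconds_to_ticks(2, 44100) yields 88200, matching the clock rate example in this section. Because each receiver starts its tracker when it joins, two receivers of the same stream may report different extended sequence numbers for the same packet.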
In some applications, a receiver renders MIDI commands into audio (or into control actions, such as the rewind of a tape deck or the dimming of stage lights). In other applications, a receiver presents a MIDI stream to software programs via an Application Programmer Interface (API). Appendix C.6 defines session configuration tools to specify what receivers should do with a MIDI command stream.

If a multimedia session uses different RTP MIDI streams to send different classes of media, the streams MUST be sent over different RTP sessions. For example, if a multimedia session uses one MIDI stream for audio and a second MIDI stream to control a lighting system, the audio and lighting streams MUST be sent over different RTP sessions, each with its own media line.

Session description tools defined in Appendix C.5 let a sending party split a single MIDI name space (16 voice channels + systems) over several RTP MIDI streams. Split transport of a MIDI command stream is a delicate task, because correct command stream reconstruction by a receiver depends on exact timing synchronization across the streams.

To support split name spaces, we define the following requirements:

   o  A party MUST NOT send several RTP MIDI streams that share a MIDI name space in the same RTP session. Instead, each stream MUST be sent in a different RTP session.

   o  If several RTP MIDI streams sent by a party share a MIDI name space, all streams MUST use the same SSRC value and MUST use the same randomly chosen RTP timestamp initialization value.

These rules let a receiver identify streams that share a MIDI name space (by matching SSRC values) and also let a receiver accurately reconstruct the source MIDI command stream (by using RTP timestamps to interleave commands from the streams). Care MUST be taken by senders to ensure that SSRC changes due to collisions are reflected in all linked streams.
Receivers MUST regularly examine the RTCP CNAME fields associated with the linked streams, to ensure that the assumed link is legitimate and not the result of an SSRC collision by another sender.

Except for the special cases described above, a party may send many RTP MIDI streams in the same session. However, it is sometimes advantageous for two RTP MIDI streams to be sent over different RTP sessions. For example, two streams may need different values for RTP session-level attributes (such as the sendonly and recvonly attributes). As a second example, two RTP sessions may be needed to send two unicast streams in a multimedia session that originate on different computers (with different IP addresses). Two RTP sessions are needed in this case because transport addresses are specified on the RTP-session or multimedia-session level, not on a payload type level.

On a final note, in some uses of MIDI, parties send bidirectional traffic to conduct transactions (such as file exchange). These commands were designed to work over MIDI 1.0 DIN cable networks, which may be configured in a multicast topology and which use pure "party-line" signalling. Thus, if a multimedia session ensures a multicast connection between all parties, bidirectional MIDI commands will work without additional support from the RTP MIDI payload format.

2.2. MIDI Payload

The payload (Figure 1) MUST begin with the MIDI command section. The MIDI command section codes a (possibly empty) list of timestamped MIDI commands, and provides the essential service of the payload format.

The payload MAY also contain a journal section. The journal section provides resiliency by coding the recent history of the stream. A flag in the MIDI command section codes the presence of a journal section in the payload.

Section 3 defines the MIDI command section. Sections 4-5 and Appendices A-B define the recovery journal, the default format for the journal section.
Here, we describe how these payload sections operate in a stream in an RTP session.

The journalling method for a stream is set at the start of a session and MUST NOT be changed thereafter. A stream may be set to use the recovery journal, to use an alternative journal format (none are defined in this memo), or not to use a journal.

The default journalling method of a stream is inferred from its transport type. Streams that use unreliable transport (such as UDP) default to using the recovery journal. Streams that use reliable transport (such as TCP) default to not using a journal. Appendix C.2.1 defines session configuration tools for overriding these defaults. For all types of transport, a sender MUST transmit an RTP packet stream with consecutive sequence numbers (modulo 2^16).

If a stream uses the recovery journal, every payload in the stream MUST include a journal section. If a stream does not use journalling, a journal section MUST NOT appear in a stream payload. If a stream uses an alternative journal format, the specification for the journal format defines an inclusion policy.

If a stream is sent over UDP transport, and the application requires predictable and minimal packet transmission latency, the Maximum Transmission Unit (MTU) of the underlying network limits the practical size of the payload section (for example, an Ethernet MTU is 1500 octets). A sender SHOULD NOT create RTP MIDI UDP packets whose size exceeds the MTU of the underlying network. Instead, the sender SHOULD take steps to keep the maximum packet size under the MTU limit.

These steps may take many forms. The default closed-loop recovery journal sending policy (defined in Appendix C.2.2.2) uses RTP control protocol (RTCP, [RFC3550]) feedback to manage the RTP MIDI packet size. In addition, Section 3.2 and Appendix B.5.2 provide specific tools for managing the size of packets that code MIDI System Exclusive (0xF0) commands.
Appendix C.5 defines session configuration tools that may be used to split a dense MIDI name space into several UDP streams (each sent in a different RTP session, per Section 2.1) so that the payload fits comfortably into an MTU. Another option is to use TCP. Section 4.3 of [RFC4696] provides non-normative advice for packet size management.

3. MIDI Command Section

Figure 2 shows the format of the MIDI command section.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|B|J|Z|P|LEN... |  MIDI list ...                                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2 -- MIDI command section

The MIDI command section begins with a variable-length header.

The header field LEN codes the number of octets in the MIDI list that follow the header. If the header flag B is 0, the header is one octet long, and LEN is a 4-bit field, supporting a maximum MIDI list length of 15 octets.

If B is 1, the header is two octets long, and LEN is a 12-bit field, supporting a maximum MIDI list length of 4095 octets. LEN is coded in network byte order (big-endian): the 4 bits of LEN that appear in the first header octet code the most significant 4 bits of the 12-bit LEN value.

A LEN value of 0 is legal, and it codes an empty MIDI list.

If the J header bit is set to 1, a journal section MUST appear after the MIDI command section in the payload. If the J header bit is set to 0, the payload MUST NOT contain a journal section.

We define the semantics of the P header bit in Section 3.2.

If the LEN header field is nonzero, the MIDI list has the structure shown in Figure 3.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Delta Time 0     (1-4 octets long, or 0 octets if Z = 0)     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  MIDI Command 0   (1 or more octets long)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Delta Time 1     (1-4 octets long)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  MIDI Command 1   (1 or more octets long)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Delta Time N     (1-4 octets long)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  MIDI Command N   (0 or more octets long)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3 -- MIDI list structure

If the header flag Z is 1, the MIDI list begins with a complete MIDI command (coded in the MIDI Command 0 field, in Figure 3) preceded by a delta time (coded in the Delta Time 0 field). If Z is 0, the Delta Time 0 field is not present in the MIDI list, and the command coded in the MIDI Command 0 field has an implicit delta time of 0.

The MIDI list structure may also optionally encode a list of N additional complete MIDI commands, each coded in a MIDI Command K field. Each additional command MUST be preceded by a Delta Time K field, which codes the command's delta time. We discuss exceptions to the "command fields code complete MIDI commands" rule in Section 3.2.

The final MIDI command field (i.e., the MIDI Command N field, shown in Figure 3) in the MIDI list MAY be empty. Moreover, a MIDI list MAY consist of a single delta time (encoded in the Delta Time 0 field) without an associated command (which would have been encoded in the MIDI Command 0 field).

These rules enable MIDI coding features that are explained in Section 3.1.
We delay the explanations because an understanding of RTP MIDI timestamps is necessary to describe the features.

3.1. Timestamps

In this section, we describe how RTP MIDI encodes a timestamp for each MIDI list command. Command timestamps have the same units as RTP packet header timestamps (described in Section 2.1 and [RFC3550]). Recall that RTP timestamps have units of seconds, whose scaling is set during session configuration (see Section 6.1 and [RFC4566]).

As shown in Figure 3, the MIDI list encodes time using a compact delta-time format. The RTP MIDI delta time syntax is a modified form of the MIDI File delta time syntax [MIDI]. RTP MIDI delta times use 1-4 octet fields to encode 32-bit unsigned integers. Figure 4 shows the encoded and decoded forms of delta times. Note that delta time values may be legally encoded in multiple formats; for example, there are four legal ways to encode the zero delta time (0x00, 0x8000, 0x808000, 0x80808000).

RTP MIDI uses delta times to encode a timestamp for each MIDI command. The timestamp for MIDI Command K is the summation (modulo 2^32) of the RTP timestamp and decoded delta times 0 through K. This cumulative coding technique, borrowed from MIDI File delta time coding, is efficient because it reduces the number of multi-octet delta times.

All command timestamps in a packet MUST be less than or equal to the RTP timestamp of the next packet in the stream (modulo 2^32).

This restriction ensures that a particular RTP MIDI packet in a stream is uniquely responsible for encoding time starting at the moment after the RTP timestamp encoded in the RTP packet header, and ending at the moment before the final command timestamp encoded in the MIDI list. The "moment before" and "moment after" qualifiers acknowledge the "less than or equal" semantics (as opposed to "strictly less than") in the sentence above this paragraph.
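As an illustration of the header rules in the preamble of Section 3 and of the delta time coding described here, the following Python sketch parses the MIDI command section header, decodes and encodes delta times, and derives command timestamps. All function and type names are hypothetical, and the sketch assumes that bit 0 in Figure 2 is the most significant bit of the first header octet, which is the usual convention for RFC bitfield diagrams. It does not parse the MIDI commands themselves.

```python
from typing import List, NamedTuple, Sequence, Tuple


class CommandSectionHeader(NamedTuple):
    b: int            # 1 if the header is two octets long
    j: int            # 1 if a journal section follows the command section
    z: int            # 1 if the Delta Time 0 field is present
    p: int            # phantom status bit (Section 3.2)
    length: int       # LEN: number of octets in the MIDI list
    header_size: int  # number of header octets consumed (1 or 2)


def parse_command_section_header(payload: bytes) -> CommandSectionHeader:
    """Parse the variable-length header of Figure 2 (assumed MSB-first)."""
    if not payload:
        raise ValueError("empty payload")
    first = payload[0]
    b, j, z, p = (first >> 7) & 1, (first >> 6) & 1, (first >> 5) & 1, (first >> 4) & 1
    length, size = first & 0x0F, 1
    if b:
        if len(payload) < 2:
            raise ValueError("truncated two-octet header")
        # The 4 bits in the first octet are the most significant bits of LEN.
        length, size = (length << 8) | payload[1], 2
    return CommandSectionHeader(b, j, z, p, length, size)


def decode_delta_time(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (decoded value, octets consumed) for a 1-4 octet delta time."""
    value = 0
    for count in range(1, 5):
        if offset + count > len(data):
            raise ValueError("truncated delta time")
        octet = data[offset + count - 1]
        value = (value << 7) | (octet & 0x7F)
        if octet & 0x80 == 0:  # a clear top bit marks the final octet
            return value, count
    raise ValueError("delta time longer than four octets")


def encode_delta_time(value: int) -> bytes:
    """Encode a delta time in its shortest legal form (a choice made here)."""
    if not 0 <= value < (1 << 28):
        raise ValueError("delta time does not fit in four octets")
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def command_timestamps(rtp_timestamp: int, deltas: Sequence[int]) -> List[int]:
    """Timestamp of command K = RTP timestamp + deltas 0..K (modulo 2^32)."""
    stamps, current = [], rtp_timestamp
    for delta in deltas:
        current = (current + delta) & 0xFFFFFFFF
        stamps.append(current)
    return stamps
```

A receiver would typically call parse_command_section_header first, then walk the LEN octets that follow, calling decode_delta_time before each command (skipping that step for the first command when Z is 0).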
Note that it is possible to "pad" the end of an RTP MIDI packet with time that is guaranteed to be void of MIDI commands, by setting the "Delta Time N" field of the MIDI list to the end of the void time, and by omitting its corresponding "MIDI Command N" field (a syntactic construction the preamble of Section 3 expressly made legal).

In addition, it is possible to code an RTP MIDI packet to express that a period of time in the stream is void of MIDI commands. The RTP timestamp in the header would code the start of the void time. The MIDI list of this packet would consist of a "Delta Time 0" field that coded the end of the void time. No other fields would be present in the MIDI list (a syntactic construction the preamble of Section 3 also expressly made legal).

By default, a command timestamp indicates the execution time for the command. The difference between two timestamps indicates the time delay between the execution of the commands. This difference may be zero, coding simultaneous execution. In this memo, we refer to this interpretation of timestamps as "comex" (COMmand EXecution) semantics. We formally define comex semantics in Appendix C.3.

The comex interpretation of timestamps works well for transcoding a Standard MIDI File (SMF) into an RTP MIDI stream, as SMFs code a timestamp for each MIDI command stored in the file. To transcode an SMF that uses metric time markers, use the SMF tempo map (encoded in the SMF as meta-events) to convert metric SMF timestamp units into seconds-based RTP timestamp units.

The comex interpretation also works well for MIDI hardware controllers that are coding raw sensor data directly onto an RTP MIDI stream. Note that this controller design is preferable to a design that converts raw sensor data into a MIDI 1.0 cable command stream and then transcodes the stream onto an RTP MIDI stream.
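The tempo-map conversion mentioned above for SMF transcoding can be sketched as follows. This is an illustrative sketch, not part of the payload format: the function name and the tempo-map representation (a sorted list of (tick, microseconds per quarter note) pairs) are assumptions, the SMF is assumed to use a metric division (ticks per quarter note), and the tempo before the first tempo event is taken to be the SMF default of 500000 microseconds per quarter note.

```python
from fractions import Fraction
from typing import Sequence, Tuple

DEFAULT_TEMPO_US = 500_000  # SMF default tempo: 120 quarter notes per minute


def ticks_to_rtp_units(
    tick: int,
    ppq: int,
    tempo_map: Sequence[Tuple[int, int]],
    clock_rate: int,
) -> int:
    """Convert an absolute SMF tick position to RTP timestamp units.

    tempo_map holds (start_tick, microseconds_per_quarter) pairs sorted by
    start_tick; ppq is the SMF division in ticks per quarter note.
    """
    seconds = Fraction(0)
    prev_tick, tempo = 0, DEFAULT_TEMPO_US
    for start, us_per_quarter in tempo_map:
        if start >= tick:
            break  # later tempo changes do not affect the elapsed time
        # Accumulate the time spent at the previous tempo.
        seconds += Fraction((start - prev_tick) * tempo, ppq * 1_000_000)
        prev_tick, tempo = start, us_per_quarter
    seconds += Fraction((tick - prev_tick) * tempo, ppq * 1_000_000)
    return round(seconds * clock_rate)
```

Exact rational arithmetic is used so that rounding happens only once, at the final conversion to the RTP clock rate. The result is an offset from the start of the file; a sender would add it to the RTP timestamp chosen for tick 0.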
The comex interpretation of timestamps is usually not the best timestamp interpretation for transcoding a MIDI source that uses implicit command timing (such as MIDI 1.0 DIN cables) into an RTP MIDI stream. Appendix C.3 defines alternatives to comex semantics and describes session configuration tools for selecting the timestamp interpretation semantics for a stream.

One-Octet Delta Time:

   Encoded form: 0ddddddd
   Decoded form: 00000000 00000000 00000000 0ddddddd

Two-Octet Delta Time:

   Encoded form: 1ccccccc 0ddddddd
   Decoded form: 00000000 00000000 00cccccc cddddddd

Three-Octet Delta Time:

   Encoded form: 1bbbbbbb 1ccccccc 0ddddddd
   Decoded form: 00000000 000bbbbb bbcccccc cddddddd

Four-Octet Delta Time:

   Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd
   Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd

Figure 4 -- Decoding delta time formats

3.2. Command Coding

Each non-empty MIDI Command field in the MIDI list codes one of the MIDI command types that may legally appear on a MIDI 1.0 DIN cable. Standard MIDI File meta-events do not fit this definition and MUST NOT appear in the MIDI list. As a rule, each MIDI Command field codes a complete command, in the binary command format defined in [MIDI]. In the remainder of this section, we describe exceptions to this rule.

The first MIDI channel command in the MIDI list MUST include a status octet. Running status coding, as defined in [MIDI], MAY be used for all subsequent MIDI channel commands in the list. As in [MIDI], System Common and System Exclusive messages (0xF0 ... 0xF7) cancel the running status state, but System Real-time messages (0xF8 ... 0xFF) do not affect the running status state. All System commands in the MIDI list MUST include a status octet.

As we note above, the first channel command in the MIDI list MUST include a status octet.
However, the corresponding command in the original MIDI source data stream might not have a status octet (in this case, the source would be coding the command using running status). If the status octet of the first channel command in the MIDI list does not appear in the source data stream, the P (phantom) header bit MUST be set to 1. In all other cases, the P bit MUST be set to 0.

Note that the P bit describes the MIDI source data stream, not the MIDI list encoding; regardless of the state of the P bit, the MIDI list MUST include the status octet.

As receivers MUST be able to decode running status, sender implementors should feel free to use running status to improve bandwidth efficiency. However, senders SHOULD NOT introduce timing jitter into an existing MIDI command stream through an inappropriate use or removal of running status coding. This warning primarily applies to senders whose RTP MIDI streams may be transcoded onto a MIDI 1.0 DIN cable [MIDI] by the receiver: both the timestamps and the command coding (running status or not) must comply with the physical restrictions of implicit time coding over a slow serial line.

On a MIDI 1.0 DIN cable [MIDI], a System Real-time command may be embedded inside of another "host" MIDI command. This syntactic construction is not supported in the payload format: a MIDI Command field in the MIDI list codes exactly one MIDI command (partially or completely).

To encode an embedded System Real-time command, senders MUST extract the command from its host and code it in the MIDI list as a separate command. The host command and System Real-time command SHOULD appear in the same MIDI list. The delta time of the System Real-time command SHOULD result in a command timestamp that encodes the System Real-time command placement in its original embedded position.

Two methods are provided for encoding MIDI System Exclusive (SysEx) commands in the MIDI list.
A SysEx command may be encoded in a MIDI Command field verbatim: a 0xF0 octet, followed by an arbitrary number of data octets, followed by a 0xF7 octet.

Alternatively, a SysEx command may be encoded as multiple segments. The command is divided into two or more SysEx command segments; each segment is encoded in its own MIDI Command field in the MIDI list.

The payload format supports segmentation in order to encode SysEx commands that encode information in the temporal pattern of data octets. By encoding these commands as a series of segments, each data octet may be associated with a distinct delta time. Segmentation also supports the coding of large SysEx commands across several packets.

To segment a SysEx command, first partition its data octet list into two or more sublists. The last sublist MAY be empty (i.e., contain no octets); all other sublists MUST contain at least one data octet. To complete the segmentation, add the status octets defined in Figure 5 to the head and tail of the first, last, and any "middle" sublists. Figure 6 shows example segmentations of a SysEx command.

A sender MAY cancel a segmented SysEx command transmission that is in progress, by sending the "cancel" sublist shown in Figure 5. A "cancel" sublist MAY follow a "first" or "middle" sublist in the transmission, but MUST NOT follow a "last" sublist. The cancel MUST be empty (thus, 0xF7 0xF4 is the only legal cancel sublist).

The cancellation feature is needed because Appendix C.1 defines configuration tools that let session parties exclude certain SysEx commands in the stream. Senders that transcode a MIDI source onto an RTP MIDI stream under these constraints have the responsibility of excluding undesired commands from the RTP MIDI stream.

The cancellation feature lets a sender start the transmission of a command before the MIDI source has sent the entire command.
If a sender determines that the command whose transmission is in progress should not appear on the RTP stream, it cancels the command. Without a method for cancelling a SysEx command transmission, senders would be forced to use a high-latency store-and-forward approach to transcoding SysEx commands onto RTP MIDI packets, in order to validate each SysEx command before transmission.

The recommended receiver reaction to a cancellation depends on the capabilities of the receiver. For example, a sound synthesizer that is directly parsing RTP MIDI packets and rendering them to audio will be aware of the fact that SysEx commands may be cancelled in RTP MIDI. These receivers SHOULD detect a SysEx cancellation in the MIDI list and act as if they had never received the SysEx command.

As a second example, a synthesizer may be receiving MIDI data from an RTP MIDI stream via a MIDI DIN cable (or a software API emulation of a MIDI DIN cable). In this case, an RTP-MIDI-aware system receives the RTP MIDI stream and transcodes it onto the MIDI DIN cable (or its emulation). Upon the receipt of the cancel sublist, the RTP-MIDI-aware transcoder might have already sent the first part of the SysEx command on the MIDI DIN cable to the receiver.

Unfortunately, the MIDI DIN cable protocol cannot directly code "cancel SysEx in progress" semantics. However, MIDI DIN cable receivers begin SysEx processing after the complete command arrives. The receiver checks to see if it recognizes the command (coded in the first few octets) and then checks to see if the command is the correct length. Thus, in practice, a transcoder can cancel a SysEx command by sending an 0xF7 to (prematurely) end the SysEx command -- the receiver will detect the incorrect command length and discard the command.

Appendix C.1 defines configuration tools that may be used to prohibit SysEx command cancellation.
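The segmentation procedure described in this section (partition the data octets into sublists, then add the head and tail status octets of Figure 5) can be sketched in Python as follows. The function names and the representation of a partition as a list of cut indices are hypothetical, chosen only for illustration. The sketch does not check that data octets are below 0x80, and it does not handle the "dropped 0xF7" construction.

```python
from typing import List, Optional, Sequence

SYSEX_START = 0xF0
SYSEX_END = 0xF7
CANCEL_SUBLIST = bytes([0xF7, 0xF4])  # the only legal cancel sublist


def segment_sysex(command: bytes, cuts: Sequence[int]) -> List[bytes]:
    """Split a verbatim SysEx command into segments.

    cuts lists the data-octet indices at which a new sublist begins; a cut
    equal to the number of data octets yields an empty last sublist.
    """
    if len(command) < 2 or command[0] != SYSEX_START or command[-1] != SYSEX_END:
        raise ValueError("expected a verbatim SysEx command (0xF0 ... 0xF7)")
    data = command[1:-1]
    bounds = [0, *cuts, len(data)]
    if len(bounds) < 3:
        raise ValueError("segmentation needs two or more sublists")
    # Every sublist except the last must hold at least one data octet.
    for start, end in zip(bounds[:-2], bounds[1:-1]):
        if end <= start:
            raise ValueError("non-final sublists must not be empty")
    if bounds[-2] > bounds[-1]:
        raise ValueError("cut index beyond the end of the data octets")
    last = len(bounds) - 2
    segments = []
    for i in range(last + 1):
        head = SYSEX_START if i == 0 else SYSEX_END   # first: 0xF0, else 0xF7
        tail = SYSEX_END if i == last else SYSEX_START  # last: 0xF7, else 0xF0
        segments.append(bytes([head]) + data[bounds[i]:bounds[i + 1]] + bytes([tail]))
    return segments


def reassemble_sysex(segments: Sequence[bytes]) -> Optional[bytes]:
    """Rebuild the verbatim command, or return None if it was cancelled."""
    if segments and segments[-1] == CANCEL_SUBLIST:
        return None
    data = b"".join(segment[1:-1] for segment in segments)
    return bytes([SYSEX_START]) + data + bytes([SYSEX_END])
```

Returning None for a cancelled transmission mirrors the recommendation that a receiver parsing RTP MIDI directly act as if it had never received the command.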
The relative ordering of SysEx command segments in a MIDI list must match the relative ordering of the sublists in the original SysEx command. By default, commands other than System Real-time MIDI commands MUST NOT appear between SysEx command segments (Appendix C.1 defines configuration tools to change this default, to let other command types appear between segments). If the command segments of a SysEx command are placed in the MIDI lists of two or more RTP packets, the segment ordering rules apply to the concatenation of all affected MIDI lists.

 -----------------------------------------------------------
| Sublist Position  |  Head Status Octet | Tail Status Octet |
|-----------------------------------------------------------|
|    first          |       0xF0         |       0xF0        |
|-----------------------------------------------------------|
|    middle         |       0xF7         |       0xF0        |
|-----------------------------------------------------------|
|    last           |       0xF7         |       0xF7        |
|-----------------------------------------------------------|
|    cancel         |       0xF7         |       0xF4        |
 -----------------------------------------------------------

Figure 5 -- Command segmentation status octets

[MIDI] permits 0xF7 octets that are not part of a (0xF0, 0xF7) pair to appear on a MIDI 1.0 DIN cable. Unpaired 0xF7 octets have no semantic meaning in MIDI, apart from cancelling running status.

Unpaired 0xF7 octets MUST NOT appear in the MIDI list of the MIDI Command section. We impose this restriction to avoid interference with the command segmentation coding defined in Figure 5.

SysEx commands carried on a MIDI 1.0 DIN cable may use the "dropped 0xF7" construction [MIDI]. In this coding method, the 0xF7 octet is dropped from the end of the SysEx command, and the status octet of the next MIDI command acts both to terminate the SysEx command and start the next command. To encode this construction in the payload format, follow these steps:

o  Determine the appropriate delta times for the SysEx command and the command that follows the SysEx command.

o  Insert the "dropped" 0xF7 octet at the end of the SysEx command, to form the standard SysEx syntax.

o  Code both commands into the MIDI list using the rules above.

o  Replace the 0xF7 octet that terminates the verbatim SysEx encoding or the last segment of the segmented SysEx encoding with a 0xF5 octet. This substitution informs the receiver of the original dropped 0xF7 coding.

[MIDI] reserves the undefined System Common commands 0xF4 and 0xF5 and the undefined System Real-time commands 0xF9 and 0xFD for future use. By default, undefined commands MUST NOT appear in a MIDI Command field in the MIDI list, with the exception of the 0xF5 octets used to code the "dropped 0xF7" construction and the 0xF4 octets used by SysEx "cancel" sublists.

During session configuration, a stream may be customized to transport undefined commands (Appendix C.1). For this case, we now define how senders encode undefined commands in the MIDI list.

An undefined System Real-time command MUST be coded using the System Real-time rules.

If the undefined System Common commands are put to use in a future version of [MIDI], the command will begin with an 0xF4 or 0xF5 status octet, followed by an arbitrary number of data octets (i.e., zero or more data bytes). To encode these commands, senders MUST terminate the command with an 0xF7 octet and place the modified command into the MIDI Command field.

Unfortunately, non-compliant uses of the undefined System Common commands may appear in MIDI implementations. To model these commands, we assume that the command begins with an 0xF4 or 0xF5 status octet, followed by zero or more data octets, followed by zero or more trailing 0xF7 status octets. To encode the command, senders MUST first remove all trailing 0xF7 status octets from the command. Then, senders MUST terminate the command with an 0xF7 octet and place the modified command into the MIDI Command field.
Note that we include the trailing octets in our model as a cautionary measure: if such commands appeared in a non-compliant use of an undefined System Common command, an RTP MIDI encoding of the command that did not remove trailing octets could be mistaken for an encoding of a "middle" or "last" sublist of a segmented SysEx command (Figure 5) under certain packet loss conditions.

Original SysEx command:

   0xF0 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

A two-segment segmentation:

   0xF0 0x01 0x02 0x03 0x04 0xF0
   0xF7 0x05 0x06 0x07 0x08 0xF7

A different two-segment segmentation:

   0xF0 0x01 0xF0
   0xF7 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

A three-segment segmentation:

   0xF0 0x01 0x02 0xF0
   0xF7 0x03 0x04 0xF0
   0xF7 0x05 0x06 0x07 0x08 0xF7

The segmentation with the largest number of segments:

   0xF0 0x01 0xF0
   0xF7 0x02 0xF0
   0xF7 0x03 0xF0
   0xF7 0x04 0xF0
   0xF7 0x05 0xF0
   0xF7 0x06 0xF0
   0xF7 0x07 0xF0
   0xF7 0x08 0xF0
   0xF7 0xF7

Figure 6 -- Example segmentations

4. The Recovery Journal System

The recovery journal is the default resiliency tool for unreliable transport. In this section, we normatively define the roles that senders and receivers play in the recovery journal system.

MIDI is a fragile code. A single lost command in a MIDI command stream may produce an artifact in the rendered performance. We normatively classify rendering artifacts into two categories:

o  Transient artifacts. Transient artifacts produce immediate but short-term glitches in the performance. For example, a lost NoteOn (0x9) command produces a transient artifact: one note fails to play, but the artifact does not extend beyond the end of that note.

o  Indefinite artifacts. Indefinite artifacts produce long-lasting errors in the rendered performance. For example, a lost NoteOff (0x8) command may produce an indefinite artifact: the note that should have been ended by the lost NoteOff command may sustain indefinitely. As a second example, the loss of a Control Change (0xB) command for controller number 7 (Channel Volume) may produce an indefinite artifact: after the loss, all notes on the channel may play too softly or too loudly.

The purpose of the recovery journal system is to satisfy the recovery journal mandate: the MIDI performance rendered from an RTP MIDI stream sent over unreliable transport MUST NOT contain indefinite artifacts.

The recovery journal system does not use packet retransmission to satisfy this mandate. Instead, each packet includes a special section, called the recovery journal.

The recovery journal codes the history of the stream, back to an earlier packet called the checkpoint packet. The range of coverage for the journal is called the checkpoint history. The recovery journal codes the information necessary to recover from the loss of an arbitrary number of packets in the checkpoint history. Appendix A.1 normatively defines the checkpoint packet and the checkpoint history.

When a receiver detects a packet loss, it compares its own knowledge about the history of the stream with the history information coded in the recovery journal of the packet that ends the loss event. By noting the differences in these two versions of the past, a receiver is able to transform all indefinite artifacts in the rendered performance into transient artifacts, by executing MIDI commands to repair the stream.

We now state the normative role for senders in the recovery journal system.

Senders prepare a recovery journal for every packet in the stream. In doing so, senders choose the checkpoint packet identity for the journal. Senders make this choice by applying a sending policy. Appendix C.2.2 normatively defines three sending policies: "closed-loop", "open-loop", and "anchor".

By default, senders MUST use the closed-loop sending policy.
If the session description overrides this default policy, by using the parameter j_update defined in Appendix C.2.2, senders MUST use the specified policy.

After choosing the checkpoint packet identity for a packet, the sender creates the recovery journal. By default, this journal MUST conform to the normative semantics in Section 5 and Appendices A-B in this memo. In Appendix C.2.3, we define parameters that modify the normative semantics for recovery journals. If the session description uses these parameters, the journal created by the sender MUST conform to the modified semantics.

Next, we state the normative role for receivers in the recovery journal system.

A receiver MUST detect each RTP sequence number break in a stream. If the sequence number break is due to a packet loss event (as defined in [RFC3550]), the receiver MUST repair all indefinite artifacts in the rendered MIDI performance caused by the loss. If the sequence number break is due to an out-of-order packet (as defined in [RFC3550]), the receiver MUST NOT take actions that introduce indefinite artifacts (ignoring the out-of-order packet is a safe option).

Receivers take special precautions when entering or exiting a session. A receiver MUST process the first received packet in a stream as if it were a packet that ends a loss event. Upon exiting a session, a receiver MUST ensure that the rendered MIDI performance does not end with indefinite artifacts.

Receivers are under no obligation to perform indefinite artifact repairs at the moment a packet arrives. A receiver that uses a playout buffer may choose to wait until the moment of rendering before processing the recovery journal, as the "lost" packet may be a late packet that arrives in time to use.

Next, we state the normative role for the creator of the session description in the recovery journal system.
Depending on the application, the sender, the receivers, and other parties may take part in creating or approving the session description.

A session description that specifies the default closed-loop sending policy and the default recovery journal semantics satisfies the recovery journal mandate. However, these default behaviors may not be appropriate for all sessions. If the creators of a session description use the parameters defined in Appendix C.2 to override these defaults, the creators MUST ensure that the parameters define a system that satisfies the recovery journal mandate.

Finally, we note that this memo does not specify sender or receiver recovery journal algorithms. Implementations are free to use any algorithm that conforms to the requirements in this section. The non-normative [RFC4696] discusses sender and receiver algorithm design.

5. Recovery Journal Format

This section introduces the structure of the recovery journal and defines the bitfields of recovery journal headers. Appendices A-B complete the bitfield definition of the recovery journal.

The recovery journal has a three-level structure:

o  Top-level header.

o  Channel and system journal headers. These headers encode recovery information for a single voice channel (channel journal) or for all systems commands (system journal).

o  Chapters. Chapters describe recovery information for a single MIDI command type.

Figure 7 shows the top-level structure of the recovery journal. The recovery journal consists of a 3-octet header, followed by an optional system journal (labeled S-journal in Figure 7) and an optional list of channel journals. Figure 8 shows the recovery journal header format.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Recovery journal header            | S-journal ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Channel journals ...                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7 -- Top-level recovery journal format

 0                   1                   2
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|Y|A|H|TOTCHAN|   Checkpoint Packet Seqnum    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 8 -- Recovery journal header

If the Y header bit is set to 1, the system journal appears in the recovery journal, directly following the recovery journal header.

If the A header bit is set to 1, the recovery journal ends with a list of (TOTCHAN + 1) channel journals (the 4-bit TOTCHAN header field is interpreted as an unsigned integer).

A MIDI channel MAY be represented by (at most) one channel journal in a recovery journal. Channel journals MUST appear in the recovery journal in ascending channel-number order.

If A and Y are both zero, the recovery journal only contains its 3-octet header and is considered to be an "empty" journal.

The S (single-packet loss) bit appears in most recovery journal structures, including the recovery journal header. The S bit helps receivers efficiently parse the recovery journal in the common case of the loss of a single packet. Appendix A.1 defines S bit semantics.

The H bit indicates if MIDI channels in the stream have been configured to use the enhanced Chapter C encoding (Appendix A.3.3).

By default, the payload format does not use enhanced Chapter C encoding. In this default case, the H bit MUST be set to 0 for all packets in the stream.

If the stream has been configured so that controller numbers for one or more MIDI channels use enhanced Chapter C encoding, the H bit MUST be set to 1 in all packets in the stream. In Appendix C.2.3, we show how to configure a stream to use enhanced Chapter C encoding.

The 16-bit Checkpoint Packet Seqnum header field codes the sequence number of the checkpoint packet for this journal, in network byte order (big-endian).
The choice of the checkpoint packet sets the depth of the checkpoint history for the journal (defined in Appendix A.1).

Receivers may use the Checkpoint Packet Seqnum field of the packet that ends a loss event to verify that the journal checkpoint history covers the entire loss event. The checkpoint history covers the loss event if the Checkpoint Packet Seqnum field is less than or equal to one plus the highest RTP sequence number previously received on the stream (modulo 2^16).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S| CHAN  |H|      LENGTH       |P|C|M|W|N|E|T|A|  Chapters ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 9 -- Channel journal format

Figure 9 shows the structure of a channel journal: a 3-octet header, followed by a list of leaf elements called channel chapters. A channel journal encodes information about MIDI commands on the MIDI channel coded by the 4-bit CHAN header field. Note that CHAN uses the same bit encoding as the channel nibble in MIDI Channel Messages (the cccc field in Figure E.1 of Appendix E).

The 10-bit LENGTH field codes the length of the channel journal. The semantics for LENGTH fields are uniform throughout the recovery journal, and are defined in Appendix A.1.

The third octet of the channel journal header is the Table of Contents (TOC) of the channel journal. The TOC is a set of bits that encode the presence of a chapter in the journal.
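The header layouts of Figures 8 and 9, together with the checkpoint coverage test, can be sketched in Python as shown below. The names are hypothetical, bit 0 of each figure is assumed to be the most significant bit of the first octet, and the "less than or equal (modulo 2^16)" comparison is implemented with a half-range serial-number test, which is one plausible reading rather than a rule stated in this memo.

```python
from typing import List, NamedTuple


class JournalHeader(NamedTuple):
    s: int
    y: int                  # 1 if a system journal is present
    a: int                  # 1 if channel journals are present
    h: int                  # enhanced Chapter C encoding in use
    totchan: int            # number of channel journals minus one (if a == 1)
    checkpoint_seqnum: int  # 16-bit sequence number of the checkpoint packet


class ChannelJournalHeader(NamedTuple):
    s: int
    chan: int            # MIDI channel, coded as in the channel nibble
    h: int
    length: int          # 10-bit LENGTH field
    chapters: List[str]  # chapter letters whose TOC bit is set, in TOC order


def parse_journal_header(data: bytes, offset: int = 0) -> JournalHeader:
    """Parse the 3-octet recovery journal header of Figure 8."""
    if len(data) - offset < 3:
        raise ValueError("truncated recovery journal header")
    first = data[offset]
    seqnum = int.from_bytes(data[offset + 1:offset + 3], "big")
    return JournalHeader((first >> 7) & 1, (first >> 6) & 1, (first >> 5) & 1,
                         (first >> 4) & 1, first & 0x0F, seqnum)


def parse_channel_journal_header(data: bytes, offset: int = 0) -> ChannelJournalHeader:
    """Parse the 3-octet channel journal header of Figure 9."""
    if len(data) - offset < 3:
        raise ValueError("truncated channel journal header")
    word = int.from_bytes(data[offset:offset + 2], "big")
    toc = data[offset + 2]
    chapters = [name for i, name in enumerate("PCMWNETA") if toc & (0x80 >> i)]
    return ChannelJournalHeader(word >> 15, (word >> 11) & 0x0F,
                                (word >> 10) & 1, word & 0x3FF, chapters)


def checkpoint_covers_loss(checkpoint_seqnum: int, highest_received: int) -> bool:
    """True if checkpoint <= highest_received + 1, compared modulo 2^16.

    Assumption: differences below 2^15 count as "not greater than".
    """
    return ((highest_received + 1 - checkpoint_seqnum) & 0xFFFF) < 0x8000
```

A receiver that finds checkpoint_covers_loss returning False knows the journal cannot repair the whole loss event and may need another recovery strategy.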
Each chapter contains information about a certain class of MIDI channel command:

o  Chapter P: MIDI Program Change (0xC)

o  Chapter C: MIDI Control Change (0xB)

o  Chapter M: MIDI Parameter System (part of 0xB)

o  Chapter W: MIDI Pitch Wheel (0xE)

o  Chapter N: MIDI NoteOff (0x8), NoteOn (0x9)

o  Chapter E: MIDI Note Command Extras (0x8, 0x9)

o  Chapter T: MIDI Channel Aftertouch (0xD)

o  Chapter A: MIDI Poly Aftertouch (0xA)

Chapters appear in a list following the header, in order of their appearance in the TOC. Appendices A.2-9 describe the bitfield format for each chapter, and define the conditions under which a chapter type MUST appear in the recovery journal. If any chapter types are required for a channel, an associated channel journal MUST appear in the recovery journal.

The H bit indicates if controller numbers on a MIDI channel have been configured to use the enhanced Chapter C encoding (Appendix A.3.3).

By default, controller numbers on a MIDI channel do not use enhanced Chapter C encoding. In this default case, the H bit MUST be set to 0 for all channel journal headers for the channel in the recovery journal, for all packets in the stream.

However, if at least one controller number for a MIDI channel has been configured to use the enhanced Chapter C encoding, the H bit for its channel journal MUST be set to 1, for all packets in the stream.

In Appendix C.2.3, we show how to configure a controller number to use enhanced Chapter C encoding.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|D|V|Q|F|X|      LENGTH       |  System chapters ...          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 10 -- System journal format

Figure 10 shows the structure of the system journal: a 2-octet header, followed by a list of system chapters.
Each chapter codes information about a specific class of MIDI Systems command:

o  Chapter D: Song Select (0xF3), Tune Request (0xF6), Reset (0xFF), undefined System commands (0xF4, 0xF5, 0xF9, 0xFD)

o  Chapter V: Active Sense (0xFE)

o  Chapter Q: Sequencer State (0xF2, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC)

o  Chapter F: MTC Tape Position (0xF1, 0xF0 0x7F 0xcc 0x01 0x01)

o  Chapter X: System Exclusive (all other 0xF0)

The 10-bit LENGTH field codes the size of the system journal and conforms to semantics described in Appendix A.1.

The D, V, Q, F, and X header bits form a Table of Contents (TOC) for the system journal. A TOC bit that is set to 1 codes the presence of a chapter in the journal. Chapters appear in a list following the header, in the order of their appearance in the TOC.

Appendix B describes the bitfield format for the system chapters and defines the conditions under which a chapter type MUST appear in the recovery journal. If any system chapter type is required to appear in the recovery journal, the system journal MUST appear in the recovery journal.

6. Session Description Protocol

RTP does not perform session management. Instead, RTP works together with session management tools, such as the Session Initiation Protocol (SIP, [RFC3261]) and the Real Time Streaming Protocol (RTSP, [RFC2326]).

RTP payload formats define media type parameters for use in session management (for example, this memo defines "rtp-midi" as the media type for native RTP MIDI streams).

In most cases, session management tools use the media type parameters via another standard, the Session Description Protocol (SDP, [RFC4566]).

SDP is a textual format for specifying session descriptions. Session descriptions specify the network transport and media encoding for RTP sessions. Session management tools coordinate the exchange of session descriptions between participants ("parties").
Some session management tools use SDP to negotiate details of media transport (network addresses, ports, etc.). We refer to this use of SDP as "negotiated usage". One example of negotiated usage is the Offer/Answer protocol ([RFC3264] and Appendix C.7.2 in this memo) as used by SIP.

Other session management tools use SDP to declare the media encoding for the session but use other techniques to negotiate network transport. We refer to this use of SDP as "declarative usage". One example of declarative usage is RTSP ([RFC2326] and Appendix C.7.1 in this memo).

Below, we show session description examples for native (Section 6.1) and mpeg4-generic (Section 6.2) streams. In Section 6.3, we introduce session configuration tools that may be used to customize streams.

6.1. Session Descriptions for Native Streams

The session description below defines a unicast UDP RTP session (via a media ("m=") line) whose sole payload type (96) is mapped to a minimal native RTP MIDI stream.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100

The rtpmap attribute line uses the "rtp-midi" media type to specify an RTP MIDI native stream. The clock rate specified on the rtpmap line (in the example above, 44100 Hz) sets the scaling for the RTP timestamp header field (see Section 2.1, and also [RFC3550]).

Note that this document does not specify a default clock rate value for RTP MIDI. When RTP MIDI is used with SDP, parties MUST use the rtpmap line to communicate the clock rate. Guidance for selecting the RTP MIDI clock rate value appears in Section 2.1.

We consider the RTP MIDI stream shown above to be "minimal" because the session description does not customize the stream with parameters. Without such customization, a native RTP MIDI stream has these characteristics:

   1. If the stream uses unreliable transport (unicast UDP, multicast UDP, etc.), the recovery journal system is in use, and the RTP payload contains both the MIDI command section and the journal section. If the stream uses reliable transport (such as TCP), the stream does not use journalling, and the payload contains only the MIDI command section (Section 2.2).

   2. If the stream uses the recovery journal system, the recovery journal system uses the default sending policy and the default journal semantics (Section 4).

   3. In the MIDI command section of the payload, command timestamps use the default "comex" semantics (Section 3).

   4. The recommended temporal duration ("media time") of an RTP packet ranges from 0 to 200 ms, and the RTP timestamp difference between sequential packets in the stream may be arbitrarily large (Section 2.1).

   5. If more than one minimal rtp-midi stream appears in a session, the MIDI name spaces for these streams are independent: channel 1 in the first stream does not reference the same MIDI channel as channel 1 in the second stream (see Appendix C.5 for a discussion of the independence of minimal rtp-midi streams).

   6. The rendering method for the stream is not specified. What the receiver "does" with a minimal native MIDI stream is "out of scope" of this memo. For example, in content creation environments, a user may manually configure client software to render the stream with a specific software package.

As is standard in RTP, RTP sessions managed by SIP are sendrecv by default (parties send and receive MIDI), and RTP sessions managed by RTSP are recvonly by default (server sends and client receives).

In sendrecv RTP MIDI sessions for the session description shown above, the 16 voice channel + systems MIDI name space is unique for each sender. Thus, in a two-party session, the voice channel 0 sent by one party is distinct from the voice channel 0 sent by the other party.
This behavior corresponds to what occurs when two MIDI 1.0 DIN devices are cross-connected with two MIDI cables (one cable routing MIDI Out from the first device into MIDI In of the second device, a second cable routing MIDI In from the first device into MIDI Out of the second device). We define this "association" formally in Section 2.1.

MIDI 1.0 DIN networks may be configured in a "party-line" multicast topology. For these networks, the MIDI protocol itself provides tools for addressing specific devices in transactions on a multicast network, and for device discovery. Thus, apart from providing a 1-to-many forward path and a many-to-1 reverse path, IETF protocols do not need to provide any special support for MIDI multicast networking.

6.2. Session Descriptions for mpeg4-generic Streams

An mpeg4-generic [RFC3640] RTP MIDI stream uses an MPEG 4 Audio Object Type to render MIDI into audio. Three Audio Object Types accept MIDI input:

   o General MIDI (Audio Object Type ID 15), based on the General MIDI rendering standard [MIDI].

   o Wavetable Synthesis (Audio Object Type ID 14), based on the Downloadable Sounds Level 2 (DLS 2) rendering standard [DLS2].

   o Main Synthetic (Audio Object Type ID 13), based on Structured Audio and the programming language SAOL [MPEGSA].

The primary service of an mpeg4-generic stream is to code Access Units (AUs). We define the mpeg4-generic RTP MIDI AU as the MIDI payload shown in Figure 1 of Section 2.1 of this memo: a MIDI command section optionally followed by a journal section.

Exactly one RTP MIDI AU MUST be mapped to one mpeg4-generic RTP MIDI packet. The mpeg4-generic options for placing several AUs in an RTP packet MUST NOT be used with RTP MIDI. The mpeg4-generic options for fragmenting and interleaving AUs MUST NOT be used with RTP MIDI. The mpeg4-generic RTP packet payload (Figure 1 in [RFC3640]) MUST contain empty AU Header and Auxiliary sections.
These rules yield mpeg4-generic packets that are structurally identical to native RTP MIDI packets, an essential property for the correct operation of the payload format.

The session description that follows defines a unicast UDP RTP session (via a media ("m=") line) whose sole payload type (96) is mapped to a minimal mpeg4-generic RTP MIDI stream. This example uses the General MIDI Audio Object Type under Synthesis Profile @ Level 2.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
   config=7A0A0000001A4D546864000000060000000100604D54726B0000
   000600FF2F000

(The a=fmtp line has been wrapped to fit the page to accommodate memo formatting restrictions; it comprises a single line in SDP.)

The fmtp attribute line codes the four parameters (streamtype, mode, profile-level-id, and config) that are required in all mpeg4-generic session descriptions [RFC3640]. For RTP MIDI streams, the streamtype parameter MUST be set to 5, the "mode" parameter MUST be set to "rtp-midi", and the "profile-level-id" parameter MUST be set to the MPEG-4 Profile Level for the stream. For the Synthesis Profile, legal profile-level-id values are 11, 12, and 13, coding low (11), medium (12), or high (13) decoder computational complexity, as defined by MPEG conformance tests.

In a minimal RTP MIDI session description, the config value MUST be a hexadecimal encoding [RFC3640] of the AudioSpecificConfig data block [MPEGAUDIO] for the stream. AudioSpecificConfig encodes the Audio Object Type for the stream and also encodes initialization data (SAOL programs, DLS 2 wave tables, etc.). Standard MIDI Files encoded in AudioSpecificConfig in a minimal session description MUST be ignored by the receiver.
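As an informal illustration of the fmtp rules above, the Python sketch below splits an unwrapped a=fmtp line into its parameters, checks the three fixed-value requirements, and reads the Audio Object Type from the config value. It is a sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the profile-level-id check assumes the stream uses the Synthesis Profile, and the Audio Object Type is assumed to occupy the first 5 bits of AudioSpecificConfig with no escape coding.

```python
import string


def parse_fmtp(line):
    """Split an unwrapped 'a=fmtp:<pt> name=value; ...' line."""
    head, _, body = line.strip().partition(" ")
    if not head.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute line")
    payload_type = int(head[len("a=fmtp:"):])
    params = {}
    for item in body.split(";"):
        name, _, value = item.strip().partition("=")
        if name:
            params[name] = value.strip()
    return payload_type, params


def check_minimal_mpeg4_generic(params):
    """Return a list of problems; an empty list means no problem found."""
    problems = []
    if params.get("streamtype") != "5":
        problems.append("streamtype MUST be 5")
    if params.get("mode") != "rtp-midi":
        problems.append("mode MUST be rtp-midi")
    # Assumption: Synthesis Profile, whose legal levels are 11, 12, 13.
    if params.get("profile-level-id") not in {"11", "12", "13"}:
        problems.append("profile-level-id not a Synthesis Profile level")
    config = params.get("config", "")
    if not config or any(c not in string.hexdigits for c in config):
        problems.append("config MUST be a hexadecimal string")
    return problems


def audio_object_type(config_hex):
    """Read the first 5 bits of the config value as an unsigned integer."""
    return int(config_hex[:2], 16) >> 3


# The example fmtp line from the session description, unwrapped.
line = ("a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12; "
        "config=7A0A0000001A4D546864000000060000000100604D54726B0000"
        "000600FF2F000")
pt, params = parse_fmtp(line)
print(pt, check_minimal_mpeg4_generic(params),
      audio_object_type(params["config"]))
```

The sketch deliberately inspects only the leading octet of the config value; decoding the rest of AudioSpecificConfig requires the bitstream definitions in [MPEGAUDIO].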
Receivers determine the rendering algorithm for the session by interpreting the first 5 bits of AudioSpecificConfig as an unsigned integer that codes the Audio Object Type. In our example above, the leading config string nibbles "7A" yield the Audio Object Type 15 (General MIDI). In Appendix E.4, we derive the config string value in the session description shown above; the starting point of the derivation is the MPEG bitstreams defined in [MPEGSA] and [MPEGAUDIO].

We consider the stream to be "minimal" because the session description does not customize the stream through the use of parameters, other than the 4 required mpeg4-generic parameters described above. In Section 6.1, we describe the behavior of a minimal native stream, as a numbered list of characteristics. Items 1-4 on that list also describe the minimal mpeg4-generic stream, but items 5 and 6 require restatements, as listed below:

   5. If more than one minimal mpeg4-generic stream appears in a session, each stream uses an independent instance of the Audio Object Type coded in the config parameter value.

   6. A minimal mpeg4-generic stream encodes the AudioSpecificConfig as an inline hexadecimal constant. If a session description is sent over UDP, it may be impossible to transport large AudioSpecificConfig blocks within the Maximum Transmission Unit (MTU) of the underlying network (for Ethernet, the MTU is 1500 octets). In some cases, the AudioSpecificConfig block may exceed the maximum size of the UDP packet itself.

The comments in Section 6.1 on SIP and RTSP stream directional defaults, sendrecv MIDI channel usage, and MIDI 1.0 DIN multicast networks also apply to mpeg4-generic RTP MIDI sessions.

In sendrecv sessions, each party's session description MUST use identical values for the mpeg4-generic parameters (including the required streamtype, mode, profile-level-id, and config parameters).
As a consequence, each party uses an identically configured MPEG 4 Audio Object Type to render MIDI commands into audio. The preamble to Appendix C discusses a way to create "virtual sendrecv" sessions that do not have this restriction.

6.3. Parameters

This section introduces parameters for session configuration for RTP MIDI streams. In session descriptions, parameters modify the semantics of a payload type. Parameters are specified on an fmtp attribute line. See the session description example in Section 6.2 for an example of an fmtp attribute line.

The parameters add features to the minimal streams described in Sections 6.1-2, and support several types of services:

   o Stream subsetting. By default, all MIDI commands that are legal to appear on a MIDI 1.0 DIN cable may appear in an RTP MIDI stream. The cm_unused parameter overrides this default by prohibiting certain commands from appearing in the stream. The cm_used parameter is used in conjunction with cm_unused, to simplify the specification of complex exclusion rules. We describe cm_unused and cm_used in Appendix C.1.

   o Journal customization. The j_sec and j_update parameters configure the use of the journal section. The ch_default, ch_never, and ch_anchor parameters configure the semantics of the recovery journal chapters. These parameters are described in Appendix C.2 and override the default stream behaviors 1 and 2, listed in Section 6.1 and referenced in Section 6.2.

   o MIDI command timestamp semantics. The tsmode, octpos, mperiod, and linerate parameters customize the semantics of timestamps in the MIDI command section. These parameters let RTP MIDI accurately encode the implicit time coding of MIDI 1.0 DIN cables. These parameters are described in Appendix C.3 and override default stream behavior 3, listed in Section 6.1 and referenced in Section 6.2.

   o Media time. The rtp_ptime and rtp_maxptime parameters define the temporal duration ("media time") of an RTP MIDI packet.
The guardtime parameter sets the minimum sending rate of stream packets. These parameters are described in Appendix C.4 and override default stream behavior 4, listed in Section 6.1 and referenced in Section 6.2.

   o Stream description. The musicport parameter labels the MIDI name space of RTP streams in a multimedia session. Musicport is described in Appendix C.5. The musicport parameter overrides default stream behavior 5, in Sections 6.1 and 6.2.

   o MIDI rendering. Several parameters specify the MIDI rendering method of a stream. These parameters are described in Appendix C.6 and override default stream behavior 6, in Sections 6.1 and 6.2.

In Appendix C.7, we specify interoperability guidelines for two RTP MIDI application areas: content-streaming using RTSP (Appendix C.7.1) and network musical performance using SIP (Appendix C.7.2).

7. Extensibility

The payload format defined in this memo exclusively encodes all commands that may legally appear on a MIDI 1.0 DIN cable.

Many worthy uses of MIDI over RTP do not fall within the narrow scope of the payload format. For example, the payload format does not support the direct transport of Standard MIDI File (SMF) meta-event and metric timing data. As a second example, the payload format does not define transport tools for user-defined commands (apart from tools to support System Exclusive commands [MIDI]).

The payload format does not provide an extension mechanism to support new features of this nature, by design. Instead, we encourage the development of new payload formats for specialized musical applications. The IETF session management tools [RFC3264] [RFC2326] support codec negotiation, to facilitate the use of new payload formats in a backward-compatible way.

However, the payload format does provide several extensibility tools, which we list below:

   o Journalling.
As described in Appendix C.2, new token values for the j_sec and j_update parameters may be defined in IETF standards-track documents. This mechanism supports the design of new journal formats and the definition of new journal sending policies.

   o Rendering. The payload format may be extended to support new MIDI renderers (Appendix C.6.2). Certain general aspects of the RTP MIDI rendering process may also be extended, via the definition of new token values for the render (Appendix C.6) and smf_info (Appendix C.6.4.1) parameters.

   o Undefined commands. [MIDI] reserves 4 MIDI System commands for future use (0xF4, 0xF5, 0xF9, 0xFD). If updates to [MIDI] define the reserved commands, IETF standards-track documents may be defined to provide resiliency support for the commands. Opaque LEGAL fields appear in System Chapter D for this purpose (Appendix B.1.1).

A final form of extensibility involves the inclusion of the payload format in framework documents. Framework documents describe how to combine protocols to form a platform for interoperable applications. For example, a stage and studio framework might define how to use SIP [RFC3261], RTSP [RFC2326], SDP [RFC4566], and RTP [RFC3550] to support media networking for professional audio equipment and electronic musical instruments.

8. Congestion Control

The RTP congestion control requirements defined in [RFC3550] apply to RTP MIDI sessions, and implementors should carefully read the congestion control section in [RFC3550]. As noted in [RFC3550], all transport protocols used on the Internet need to address congestion control in some way, and RTP is not an exception.

In addition, the congestion control requirements defined in [RFC3551] apply to RTP MIDI sessions run under applicable profiles.
The basic congestion control requirement defined in [RFC3551] is that RTP sessions that use UDP transport should monitor packet loss (via RTCP or other means) to ensure that the RTP stream competes fairly with TCP flows that share the network.

Finally, RTP MIDI has congestion control issues that are unique for an audio RTP payload format. In applications such as network musical performance [NMP], the packet rate is linked to the gestural rate of a human performer. Senders MUST monitor the MIDI command source for patterns that result in excessive packet rates and take actions during RTP transcoding to reduce the RTP packet rate. [RFC4696] offers implementation guidance on this issue.

9. Security Considerations

Implementors should carefully read the Security Considerations sections of the RTP [RFC3550], AVP [RFC3551], and other RTP profile documents, as the issues discussed in these sections directly apply to RTP MIDI streams. Implementors should also review the Secure Real-time Transport Protocol (SRTP, [RFC3711]), an RTP profile that addresses the security issues discussed in [RFC3550] and [RFC3551].

Here, we discuss security issues that are unique to the RTP MIDI payload format.
When using RTP MIDI, authentication of incoming RTP and RTCP packets is RECOMMENDED. Per-packet authentication may be provided by SRTP or by other means. Without the use of authentication, attackers could forge MIDI commands into an ongoing stream, damaging speakers and eardrums. An attacker could also craft RTP and RTCP packets to exploit known bugs in the client and take effective control of a client machine.

Session management tools (such as SIP [RFC3261]) SHOULD use authentication during the transport of all session descriptions containing RTP MIDI media streams. For SIP, the Security Considerations section in [RFC3261] provides an overview of possible authentication mechanisms. RTP MIDI session descriptions should use authentication because the session descriptions may code initialization data using the parameters described in Appendix C. If an attacker inserts bogus initialization data into a session description, the attacker can corrupt the session or forge a client attack.

Session descriptions may also code renderer initialization data by reference, via the url (Appendix C.6.3) and smf_url (Appendix C.6.4.2) parameters. If the coded URL is spoofed, both session and client are open to attack, even if the session description itself is authenticated. Therefore, URLs specified in url and smf_url parameters SHOULD use [RFC2818].
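A receiver might apply the URL guidance above as a simple scheme check on parsed fmtp parameters. The Python sketch below is illustrative only: the function name insecure_renderer_urls is hypothetical, the input is assumed to be a dictionary of already-parsed parameter values, and stripping surrounding double quotes from a value is an assumption about how the value was parsed. Because the requirement is a SHOULD, the sketch reports offending parameters rather than rejecting the session description.

```python
from urllib.parse import urlsplit


def insecure_renderer_urls(params):
    """Return the names of url/smf_url parameters that are not https URLs."""
    flagged = []
    for name in ("url", "smf_url"):
        value = params.get(name)
        if value is None:
            continue
        # Assumption: the parsed value may still carry surrounding quotes.
        scheme = urlsplit(value.strip('"')).scheme.lower()
        if scheme != "https":  # [RFC2818] is HTTP over TLS
            flagged.append(name)
    return flagged


example = {
    "url": '"http://example.com/instruments.dls"',
    "smf_url": '"https://example.com/song.mid"',
}
print(insecure_renderer_urls(example))
```

How a client reacts to a flagged parameter (warn, prompt, or refuse to fetch) is a local policy decision that this memo does not specify.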
Section 2.1 allows streams sent by a party in two RTP sessions to have the same SSRC value and the same RTP timestamp initialization value, under certain circumstances. Normally, these values are randomly chosen for each stream in a session, to make plaintext guessing harder to do if the payloads are encrypted. Thus, Section 2.1 weakens this aspect of RTP security.

10. Acknowledgements

We thank the networking, media compression, and computer music community members who have commented or contributed to the effort, including Kurt B, Cynthia Bruyns, Steve Casner, Paul Davis, Robin Davies, Joanne Dow, Tobias Erichsen, Nicolas Falquet, Dominique Fober, Philippe Gentric, Michael Godfrey, Chris Grigg, Todd Hager, Michel Jullian, Phil Kerr, Young-Kwon Lim, Jessica Little, Jan van der Meer, Colin Perkins, Charlie Richmond, Herbie Robinson, Larry Rowe, Eric Scheirer, Dave Singer, Martijn Sipkema, William Stewart, Kent Terry, Magnus Westerlund, Tom White, Jim Wright, Doug Wyatt, and Giorgio Zoia. We also thank the members of
the San Francisco Bay Area music and audio community for creating the context for the work, including Don Buchla, Chris Chafe, Richard Duda, Dan Ellis, Adrian Freed, Ben Gold, Jaron Lanier, Roger Linn, Richard Lyon, Dana Massie, Max Mathews, Keith McMillen, Carver Mead, Nelson Morgan, Tom Oberheim, Malcolm Slaney, Dave Smith, Julius Smith, David Wessel, and Matt Wright.

11. IANA Considerations

This section makes a series of requests to IANA. The IANA has completed registration/assignments of the below requests.

The sub-sections that follow hold the actual, detailed requests. All registrations in this section are in the IETF tree and follow the rules of [RFC4288] and [RFC3555], as appropriate.

In Section 11.1, we request the registration of a new media type: "audio/rtp-midi". Paired with this request is a request for a repository for new values for several
parameters associated with "audio/rtp-midi". We request this repository in Section 11.1.1.

In Section 11.2, we request the registration of a new value ("rtp-midi") for the "mode" parameter of the "mpeg4-generic" media type. The "mpeg4-generic" media type is defined in [RFC3640], and [RFC3640] defines a repository for the "mode" parameter. However, we believe we are the first to request the
registration of a "mode" value, so we believe the registry for "mode" has not yet been created by IANA.

Paired with our "mode" parameter value request for "mpeg4-generic" is a request for a repository for new values for several parameters we have defined for use with the "rtp-midi" mode value. We request this repository in Section 11.2.1.

In Section 11.3, we request the registration of a new media type: "audio/asc". No repository request is associated with this request.

11.1. rtp-midi Media Type Registration

This section requests the registration of the "rtp-midi" subtype for the "audio" media type. We request the registration of the parameters listed in the "optional parameters" section below (both the "non-extensible parameters" and the "extensible parameters" lists). We also request the creation of repositories for the "extensible parameters"; the details of this request appear in Section 11.1.1, below.

Media type name: audio

Subtype name: rtp-midi

Required parameters:

   rate: The RTP timestamp clock rate. See Sections 2.1 and
6.1 for usage details.

Optional parameters:

   Non-extensible parameters:

      ch_anchor: See Appendix C.2.3 for usage details.
      ch_default: See Appendix C.2.3 for usage details.
      ch_never: See Appendix C.2.3 for usage details.
      cm_unused: See Appendix C.1 for usage details.
      cm_used: See Appendix C.1 for usage details.
      chanmask: See Appendix C.6.4.3 for usage details.
      cid: See Appendix C.6.3 for usage details.
      guardtime: See Appendix C.4.2 for usage details.
      inline: See Appendix C.6.3 for usage details.
      linerate: See Appendix C.3 for usage details.
      mperiod: See Appendix C.3 for usage details.
      multimode: See Appendix C.6.1 for usage details.
      musicport: See Appendix C.5 for usage details.
      octpos: See Appendix C.3 for usage details.
      rinit: See Appendix C.6.3 for usage details.
      rtp_maxptime: See Appendix C.4.1 for usage details.
      rtp_ptime: See Appendix C.4.1 for usage details.
      smf_cid: See Appendix C.6.4.2 for usage details.
      smf_inline: See Appendix C.6.4.2 for usage details.
      smf_url: See Appendix C.6.4.2 for usage details.
      tsmode: See Appendix C.3 for usage details.
      url: See Appendix C.6.3 for usage details.

   Extensible parameters:

      j_sec: See Appendix C.2.1 for usage details. See Section 11.1.1 for repository details.
      j_update: See Appendix C.2.2 for usage details. See Section 11.1.1 for repository details.
      render: See Appendix C.6 for usage details. See Section 11.1.1 for repository details.
      subrender: See Appendix C.6.2 for usage details. See Section 11.1.1 for repository details.
      smf_info: See Appendix C.6.4.1 for usage details. See Section 11.1.1 for repository details.

Encoding considerations: The format for this type is framed and binary.

Restrictions on usage: This type is only defined for real-time transfers of MIDI streams via RTP.
Stored-file semantics for rtp-midi may be defined in the future.

Security considerations: See Section 9 of this memo.

Interoperability considerations: None.

Published specification: This memo and [MIDI] serve as the normative specification. In addition, references [NMP], [GRAME], and [RFC4696] provide non-normative implementation guidance.

Applications that use this media type: Audio content-creation hardware, such as MIDI controller piano keyboards and MIDI audio synthesizers. Audio content-creation software, such as music sequencers, digital audio workstations, and soft synthesizers. Computer operating systems, for network support of MIDI Application Programmer Interfaces. Content distribution servers and terminals may use this media type for low bit-rate music coding.

Additional information: None.

Person & email address to contact for further information: John Lazzaro <lazzaro@cs.berkeley.edu>

Intended usage: COMMON.

Author: John Lazzaro <lazzaro@cs.berkeley.edu>

Change controller: IETF Audio/Video Transport Working Group delegated from the IESG.

11.1.1. Repository Request for "audio/rtp-midi"

For the "rtp-midi" subtype, we request the creation of repositories for extensions to the following parameters (which are those listed as "extensible parameters" in Section 11.1).

j_sec:

Registrations for this repository may only occur via an IETF standards-track document.
Appendix C.2.1 of this memo describes appropriate registrations for this repository.

Initial values for this repository appear below:

   "none": Defined in Appendix C.2.1 of this memo.
   "recj": Defined in Appendix C.2.1 of this memo.

j_update:

Registrations for this repository may only occur via an IETF standards-track document. Appendix C.2.2 of this memo describes appropriate registrations for this repository.

Initial values for this repository appear below:

   "anchor": Defined in Appendix C.2.2 of this memo.
   "open-loop": Defined in Appendix C.2.2 of this memo.
   "closed-loop": Defined in Appendix C.2.2 of this memo.

render:

Registrations for this repository MUST include a specification of the usage of the proposed value. See text in the preamble of Appendix C.6 for details (the paragraph that begins "Other render token ...").

Initial values for this repository appear below:

   "unknown": Defined in Appendix C.6 of this memo.
   "synthetic": Defined in Appendix C.6 of this memo.
   "api": Defined in Appendix C.6 of this memo.
   "null": Defined in Appendix C.6 of this memo.

subrender:

Registrations for this repository MUST include a specification of the usage of the proposed value. See text in Appendix C.6.2 for details (the paragraph that begins "Other subrender token ...").

Initial values for this repository appear below:

   "default": Defined in Appendix C.6.2 of this memo.

smf_info:

Registrations for this repository MUST include a specification of the usage of the proposed value. See text in Appendix C.6.4.1 for details (the paragraph that begins "Other smf_info token ...").

Initial values for this repository appear below:

   "ignore": Defined in Appendix C.6.4.1 of this memo.
   "sdp_start": Defined in Appendix C.6.4.1 of this memo.
   "identity": Defined in Appendix C.6.4.1 of this memo.

11.2. mpeg4-generic Media Type Registration

This section requests the registration of the "rtp-midi" value for the "mode" parameter of the "mpeg4-generic" media type. The "mpeg4-generic" media type is defined in [RFC3640], and [RFC3640] defines a repository for the "mode" parameter.
We are registering mode rtp-midi to support the MPEG Audio codecs [MPEGSA] that use MIDI.

In conjunction with this registration request, we request the registration of the parameters listed in the "optional parameters" section below (both the "non-extensible parameters" and the "extensible parameters" lists). We also request the creation of repositories for the "extensible parameters"; the details of this request appear in Section 11.2.1, below.

Media type name: audio

Subtype name: mpeg4-generic

Required parameters: The "mode" parameter is required by [RFC3640]. [RFC3640] requests a repository for "mode", so that new values for mode may be added. We request that the value "rtp-midi" be added to the "mode" repository. In mode rtp-midi, the
mpeg4-generic parameter rate is a required parameter. Rate specifies the RTP timestamp clock rate. See Sections 2.1 and 6.2 for usage details of rate in mode rtp-midi.

Optional parameters: We request registration of the following parameters for use in mode rtp-midi for mpeg4-generic.

Non-extensible parameters:
ch_anchor: See Appendix C.2.3 for usage details.
ch_default: See Appendix C.2.3 for usage details.
ch_never: See Appendix C.2.3 for usage details.
cm_unused: See Appendix C.1 for usage details.
cm_used: See Appendix C.1 for usage details.
chanmask: See Appendix C.6.4.3 for usage details.
cid: See Appendix C.6.3 for usage details.
guardtime: See Appendix C.4.2 for usage details.
inline: See Appendix C.6.3 for usage details.
linerate: See Appendix C.3 for usage details.
mperiod: See Appendix C.3 for usage details.
multimode: See Appendix C.6.1 for usage details.
musicport: See Appendix C.5 for usage details.
octpos: See Appendix C.3 for usage details.
rinit: See Appendix C.6.3 for usage details.
rtp_maxptime: See Appendix C.4.1 for usage details.
rtp_ptime: See Appendix C.4.1 for usage details.
smf_cid: See Appendix C.6.4.2 for usage details.
smf_inline: See Appendix C.6.4.2 for usage details.
smf_url: See Appendix C.6.4.2 for usage details.
tsmode: See Appendix C.3 for usage details.
url: See Appendix C.6.3 for usage details.
Extensible parameters:
j_sec: See Appendix C.2.1 for usage details. See Section 11.2.1 for repository details.
j_update: See Appendix C.2.2 for usage details. See Section 11.2.1 for repository details.
render: See Appendix C.6 for usage details. See Section 11.2.1 for repository details.
subrender: See Appendix C.6.2 for usage details. See Section 11.2.1 for repository details.
smf_info: See Appendix C.6.4.1 for usage details. See Section 11.2.1 for repository details.

Encoding considerations: The format for this type is framed and binary.

Restrictions on usage: Only defined for real-time transfers of audio/mpeg4-generic RTP streams with mode=rtp-midi.

Security considerations: See Section 9 of this memo.

Interoperability considerations: Except for the marker bit (Section 2.1), the packet formats for audio/rtp-midi and audio/mpeg4-generic (mode rtp-midi) are identical. The formats differ in use: audio/mpeg4-generic is for MPEG work, and audio/rtp-midi is for all other work.

Published specification: This memo, [MIDI], and [MPEGSA] are the normative references. In addition, references [NMP], [GRAME], and [RFC4696] provide non-normative implementation guidance.

Applications that use this media type: MPEG 4 servers and terminals that support [MPEGSA].

Additional information: None.

Person & email address to contact for further information: John Lazzaro <lazzaro@cs.berkeley.edu>

Intended usage: COMMON.
Author: John Lazzaro <lazzaro@cs.berkeley.edu>

Change controller: IETF Audio/Video Transport Working Group delegated from the IESG.

11.2.1. Repository Request for Mode rtp-midi for mpeg4-generic

For mode rtp-midi of the mpeg4-generic subtype, we request the creation of repositories for extensions to the following parameters (which are those listed as "extensible parameters" in Section 11.2).

j_sec: Registrations for this repository may only occur via an IETF standards-track document. Appendix C.2.1 of this memo describes appropriate registrations for this repository. Initial values for this repository appear below:
"none": Defined in Appendix C.2.1 of this memo.
"recj": Defined in Appendix C.2.1 of this memo.

j_update: Registrations for this repository may only occur via an IETF standards-track document. Appendix C.2.2 of this memo describes appropriate registrations for this repository. Initial values for this repository appear below:
"anchor": Defined in Appendix C.2.2 of this memo.
"open-loop": Defined in Appendix C.2.2 of this memo.
"closed-loop": Defined in Appendix C.2.2 of this memo.

render: Registrations for this repository MUST include a specification of the usage of the proposed value. See text in the preamble of Appendix C.6 for details (the paragraph that begins "Other render token ..."). Initial values for this repository appear below:
"unknown": Defined in Appendix C.6 of this memo.
"synthetic": Defined in Appendix C.6 of this memo.
"null": Defined in Appendix C.6 of this memo.

subrender: Registrations for this repository MUST include a specification of the usage of the proposed value. See text in Appendix C.6.2 for details (the paragraph that begins "Other subrender token ..."
and subsequent paragraphs). Note that the text in Appendix C.6.2 contains restrictions on subrender registrations for mpeg4-generic ("Registrations for mpeg4-generic subrender values ..."). Initial values for this repository appear below:
"default": Defined in Appendix C.6.2 of this memo.

smf_info: Registrations for this repository MUST include a specification of the usage of the proposed value. See text in Appendix C.6.4.1 for details (the paragraph that begins "Other smf_info token ..."). Initial values for this repository appear below:
"ignore": Defined in Appendix C.6.4.1 of this memo.
"sdp_start": Defined in Appendix C.6.4.1 of this memo.
"identity": Defined in Appendix C.6.4.1 of this memo.

11.3. asc Media Type Registration

This section registers "asc" as a subtype for the "audio" media type. We register this subtype to support the remote transfer of the "config" parameter of the mpeg4-generic media type [RFC3640] when it is used with mpeg4-generic mode rtp-midi (registered in Section 11.2 above). We explain the mechanics of using "audio/asc" to set the config parameter in Section 6.2 and Appendix C.6.5 of this document.

Note that this registration is a new subtype registration and is not an addition to a repository defined by MPEG-related memos (such as [RFC3640]). Also note that this request for "audio/asc" does not register parameters, and does not request the creation of a repository.

Media type name: audio

Subtype name: asc

Required parameters: None.

Optional parameters: None.
Encoding considerations: The native form of the data object is binary data, zero-padded to an octet boundary.

Restrictions on usage: This type is only defined for data object (stored file) transfer. The most common transports for the type are HTTP and SMTP.

Security considerations: See Section 9 of this memo.

Interoperability considerations: None.

Published specification: The audio/asc data object is the AudioSpecificConfig binary data structure, which is normatively defined in [MPEGAUDIO].

Applications that use this media type: MPEG 4 Audio servers and terminals that support audio/mpeg4-generic RTP streams for mode rtp-midi.

Additional information: None.

Person & email address to contact for further information: John Lazzaro <lazzaro@cs.berkeley.edu>

Intended usage: COMMON.

Author: John Lazzaro <lazzaro@cs.berkeley.edu>

Change controller: IETF Audio/Video Transport Working Group delegated from the IESG.

A. The Recovery Journal Channel Chapters

A.1. Recovery Journal Definitions

This appendix defines the terminology and the coding idioms that are used in the recovery journal bitfield descriptions in Section 5
(journal header structure), Appendices A.2 to A.9 (channel journal chapters) and Appendices B.1 to B.5 (system journal chapters).

We assume that the recovery journal resides in the journal section of an RTP packet with sequence number I ("packet I") and that the Checkpoint Packet Seqnum field in the top-level recovery journal header refers to a previous packet with sequence number C (an exception is the self-referential C = I case). Unless stated otherwise, algorithms are assumed to use modulo 2^16 arithmetic for calculations on 16-bit sequence numbers and modulo 2^32 arithmetic for calculations on 32-bit extended sequence numbers.

Several bitfield coding idioms appear throughout the recovery journal system, with consistent semantics. Most recovery journal elements begin with an "S" (Single-packet loss) bit. S bits are designed to help receivers efficiently parse through the recovery journal hierarchy in the common case of the loss of a single packet.

As a rule, S bits MUST be set to 1. However, an exception applies if a recovery journal element in packet I encodes data about a command stored in the MIDI command section of packet I - 1. In this case, the S bit of the recovery journal element MUST be set to 0. If a recovery journal element has its S bit set to 0, all higher-level recovery journal elements that contain it MUST also have S bits that are set to 0, including the top-level recovery journal header.
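The modulo arithmetic and the S bit rule described above can be illustrated with a minimal Python sketch. It is not part of the payload format: the names checkpoint_span, JournalElement, codes_previous_packet, and assign_s_bits are hypothetical, and the element tree is a simplified stand-in for the real journal hierarchy.

```python
from dataclasses import dataclass, field
from typing import List

SEQ_MOD = 1 << 16  # 16-bit RTP sequence numbers wrap modulo 2^16


def checkpoint_span(c: int, i: int) -> int:
    """Distance from packet C to packet I, computed modulo 2^16."""
    return (i - c) % SEQ_MOD


@dataclass
class JournalElement:
    """Hypothetical stand-in for any recovery journal element with an S bit."""
    name: str
    # True if this element encodes data about a command that is stored in
    # the MIDI command section of packet I - 1.
    codes_previous_packet: bool = False
    children: List["JournalElement"] = field(default_factory=list)
    s_bit: int = 1  # as a rule, S bits are set to 1


def assign_s_bits(element: JournalElement) -> int:
    """Set S bits bottom-up and return the S bit chosen for `element`."""
    child_bits = [assign_s_bits(child) for child in element.children]
    # S = 0 if this element codes packet I - 1 data, or if any element it
    # contains already has S = 0 (the rule propagates up to the header).
    cleared = element.codes_previous_packet or any(b == 0 for b in child_bits)
    element.s_bit = 0 if cleared else 1
    return element.s_bit
```

In this sketch, a sender would call assign_s_bits on the top-level header element after deciding which leaf elements code data from packet I - 1. Sibling elements that do not code such data keep S = 1.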
Other consistent bitfield coding idioms are described below:

o R flag bit. R flag bits are reserved for future use. Senders MUST set R bits to 0. Receivers MUST ignore R bit values.

o LENGTH field. All fields named LENGTH (as distinct from LEN) code the number of octets in the structure that contains it, including the header it resides in and all hierarchical levels below it. If a structure contains a LENGTH field, a receiver MUST use the LENGTH field value to advance past the structure during parsing, rather than use knowledge about the internal format of the structure.

We now define normative terms used to describe recovery journal semantics.

o Checkpoint history. The checkpoint history of a recovery journal is the concatenation of the MIDI command sections of packets C through I - 1. The final command in the MIDI command section for packet I - 1 is considered the most recent command; the first command in the MIDI command section for packet C is the oldest command. If command X is less recent than command Y, X is considered to be "before Y". A checkpoint history with no commands is considered to be empty. The checkpoint history never contains the MIDI command section of packet I (the packet containing the recovery journal), so if C == I, the checkpoint history is empty by definition.

o Session history.
The session history of a recovery journal is the concatenation of MIDI command sections from the first packet of the session up to packet I - 1. The definitions of command recency and history emptiness follow those in the checkpoint history. The session history never contains the MIDI command section of packet I, and so the session history of the first packet in the session is empty by definition.

o Finished/unfinished commands. If all octets of a MIDI command appear in the session history, the command is defined as being finished. If some but not all octets of a command appear in the session history, the command is defined as being unfinished. Unfinished commands occur if segments of a SysEx command appear in several RTP packets. For example, if a SysEx command is coded as 3 segments, with segment 1 in packet K, segment 2 in packet K + 1, and segment 3 in packet K + 2, the session histories for packets K + 1 and K + 2 contain unfinished versions of the command. A session history contains a finished version of a cancelled SysEx command if the history contains the cancel sublist for the command.

o Reset State commands.
Reset State (RS) commands reset renderers to an initialized "powerup" condition. The RS commands are: System Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI 2 System Enable (0xF0 0x7E 0xcc 0x09 0x03 0xF7), General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7), Turn DLS On (0xF0 0x7E 0xcc 0x0A 0x01 0xF7), and Turn DLS Off (0xF0 0x7E 0xcc 0x0A 0x02 0xF7). Registrations of subrender parameter token values (Appendix C.6.2) and IETF standards-track documents MAY specify additional RS commands.

o Active commands. Active commands are MIDI commands that do not appear before a Reset State command in the session history.

o N-active commands. N-active commands are MIDI commands that do not appear before one of the following commands in the session history: MIDI Control Change numbers 123-127 (numbers with All Notes Off semantics) or 120 (All Sound Off), and any Reset State command.

o C-active commands. C-active commands are MIDI commands that do not appear before one of the following commands in the session history: MIDI Control Change number 121 (Reset All Controllers) and any Reset State command.

o Oldest-first ordering rule. Several recovery journal chapters contain a list of elements, where each element is associated with a MIDI command that appears in the session history. In most cases, the chapter definition requires that list elements be ordered in accordance with the "oldest-first ordering rule".
Below, we normatively define this rule:

Elements associated with the most recent command in the session history coded in the list MUST appear at the end of the list.

Elements associated with the oldest command in the session history coded in the list MUST appear at the start of the list.

All other list elements MUST be arranged with respect to these boundary elements, to produce a list ordering that strictly reflects the relative session history recency of the commands coded by the elements in the list.

o Parameter system. A MIDI feature that provides two sets of 16,384 parameters to expand the 0-127 controller number space. The Registered Parameter Names (RPN) system and the Non-Registered Parameter Names (NRPN) system each provides 16,384 parameters.

o Parameter system transaction. The values of RPNs and NRPNs are changed by a series of Control Change commands that form a parameter system transaction. A canonical transaction begins with two Control Change commands to set the parameter number (controller numbers 99 and 98 for NRPNs, controller numbers 101 and
100 for RPNs). The transaction continues with an arbitrary number of Data Entry (controller numbers 6 and 38), Data Increment (controller number 96), and Data Decrement (controller number 97) Control Change commands to set the parameter value. The transaction ends with a second pair of (99, 98) or (101, 100) Control Change commands that specify the null parameter (MSB value 0x7F, LSB value 0x7F).

Several variants of the canonical transaction sequence are possible. Most commonly, the terminal pair of (99, 98) or (101, 100) Control Change commands may specify a parameter other than the null parameter. In this case, the command pair terminates the first transaction and starts a second transaction. The command pair is considered to be a part of both transactions. This variant is legal and recommended in [MIDI]. We refer to this variant as a "type 1 variant".

Less commonly, the MSB (99 or 101) or LSB (98 or 100) command of a (99, 98) or (101, 100) Control Change pair may be omitted. If the MSB command is omitted, the transaction uses the MSB value of the most recent C-active Control Change command for controller number 99 or 101 that
appears in the session history. We refer to this variant as a "type 2 variant". If the LSB command is omitted, the LSB value 0x00 is assumed. We refer to this variant as a "type 3 variant". The type 2 and type 3 variants are defined as legal, but are not recommended, in [MIDI].

System real-time commands may appear at any point during a transaction (even between octets of individual commands in the transaction). More generally, [MIDI] does not forbid the appearance of unrelated MIDI commands during an open transaction. As a rule, these commands are considered to be "outside" the transaction and do not affect the status of the transaction in any way. Exceptions to this rule are commands whose semantics act to terminate transactions: Reset State commands, and Control Change (0xB) for controller number 121 (Reset All Controllers) [RP015].

o Initiated parameter system transaction. A canonical parameter system transaction whose (99, 98) or (101, 100) initial Control Change command pair appears in the session history
is considered to be an initiated parameter system transaction. This definition also holds for type 1 variants. For type 2 variants (dropped MSB), a transaction whose initial LSB Control Change command appears in the session history is an initiated transaction. For type 3 variants (dropped LSB), a transaction is considered to be initiated if at least one transaction command follows the initial MSB (99 or 101) Control Change command in the session history. The completion of a transaction does not nullify its "initiated" status.

o Session history reference counts. Several recovery journal chapters include a reference count field, which codes the total number of commands of a type that appear in the session history. Examples include the Reset and Tune Request command logs (Chapter D, Appendix B.1) and the Active Sense command (Chapter V, Appendix B.2). Upon the detection of a loss event, reference count fields let a
receiver deduce if any instances of the command have been lost, by comparing the journal reference count with its own reference count. Thus, a reference count field makes sense, even for command types in which knowing the NUMBER of lost commands is irrelevant (as is true with all of the example commands mentioned above).

The chapter definitions in Appendices A.2 to A.9 and B.1 to B.5 reflect the default recovery journal behavior. The ch_default, ch_never, and ch_anchor parameters modify these definitions, as described in Appendix C.2.3.

The chapter definitions specify if data MUST be present in the journal. Senders MAY also include non-required data in the journal. This optional data MUST
comply with the normative chapter definition. For example, if a chapter definition states that a field codes data from the most recent active command in the session history, the sender MUST NOT code inactive commands or older commands in the field.

Finally, we note that a channel journal only encodes information about MIDI commands appearing on the MIDI channel the journal protects. All references to MIDI commands in Appendices A.2 to A.9 should be read as "MIDI commands appearing on this channel."

A.2. Chapter P: MIDI Program Change

A channel journal MUST contain Chapter P if an active Program Change (0xC) command appears in the checkpoint history.

Figure A.2.1 shows the format for Chapter P.
    0                   1                   2
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|   PROGRAM   |B|   BANK-MSB  |X|  BANK-LSB   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.2.1 -- Chapter P format

The chapter has a fixed size of 24 bits. The PROGRAM field indicates the data value of the most recent active Program Change command in the session history.

By default, the B, BANK-MSB, X, and BANK-LSB fields MUST be set to 0. Below, we define exceptions to this default condition.

If an active Control Change (0xB) command for controller number 0 (Bank Select MSB) appears before the Program Change command in the session history, the B bit MUST be set to 1, and the BANK-MSB field MUST code the data value of the Control Change command.

If B is set to 1, the BANK-LSB field MUST code the data value of the most recent Control Change command for controller number 32 (Bank Select LSB) that preceded the Program Change command coded in the PROGRAM field and followed the Control Change command coded in the BANK-MSB field. If no such Control Change command exists, the BANK-LSB field MUST be set to 0.
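As an illustration only, the 24-bit layout of Figure A.2.1 can be unpacked with a few shifts and masks. The sketch below is not part of the payload format: the names ChapterP and parse_chapter_p are hypothetical, and reporting the bank fields as None when B = 0 is a convenience chosen here, not a rule taken from the text. The X bit is extracted as a plain flag; its meaning is defined in the text that follows.

```python
from typing import NamedTuple, Optional


class ChapterP(NamedTuple):
    """Hypothetical container for the fields of Figure A.2.1."""
    s: int                    # S bit of the chapter
    program: int              # 7-bit PROGRAM field
    b: int                    # B bit: 1 if bank data is coded
    bank_msb: Optional[int]   # 7-bit BANK-MSB, or None when B = 0
    x: int                    # X bit
    bank_lsb: Optional[int]   # 7-bit BANK-LSB, or None when B = 0


def parse_chapter_p(data: bytes) -> ChapterP:
    """Unpack the fixed 3-octet Chapter P bitfield."""
    if len(data) < 3:
        raise ValueError("Chapter P has a fixed size of 24 bits")
    o0, o1, o2 = data[0], data[1], data[2]
    b = o1 >> 7
    return ChapterP(
        s=o0 >> 7,
        program=o0 & 0x7F,
        b=b,
        # By default the bank fields are zero; only report them when B = 1.
        bank_msb=(o1 & 0x7F) if b else None,
        x=o2 >> 7,
        bank_lsb=(o2 & 0x7F) if b else None,
    )
```

For example, the octets 0x05 0x81 0x02 decode to program 5 with B = 1, BANK-MSB 1, and BANK-LSB 2.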
If B is set to 1, and if a Control Change command for controller number 121 (Reset All Controllers) appears in the MIDI stream between the Control Change command coded by the BANK-MSB field and the Program Change command coded by the PROGRAM field, the X bit MUST be set to 1.

Note that [RP015] specifies that Reset All Controllers does not reset the values of controller numbers 0 (Bank Select MSB) and 32 (Bank Select LSB). Thus, the X bit does not affect how receivers will use the BANK-LSB and BANK-MSB values when recovering from a lost Program Change command. The X bit serves to aid recovery in MIDI applications where controller numbers 0 and 32 are used in a non-standard way.

A.3. Chapter C: MIDI Control Change

Figure A.3.1 shows the format for Chapter C.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|     LEN     |S|   NUMBER    |A|  VALUE/ALT  |S|   NUMBER    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |A|  VALUE/ALT  |  ....                                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure A.3.1 -- Chapter C format

The chapter consists of a 1-octet header, followed by a variable length list of 2-octet controller logs. The list MUST contain at least one controller log. The 7-bit LEN field codes the number of controller logs in the list, minus one. We define the semantics of the controller log fields in Appendix A.3.2.

A channel journal MUST contain Chapter C if the rules defined in this appendix require that one or more controller logs appear in the list.

A.3.1. Log Inclusion Rules

A controller log encodes information about a particular Control Change command in the session history.

In the default use of the payload format, list logs MUST encode information about the most recent active command in the session history for a controller number. Logs encoding earlier commands MUST NOT appear in the list.

Also, as a rule, the list MUST contain a log for the most recent active command for a controller number that appears in the checkpoint history.
Below, we define exceptions to this rule:

o  MIDI streams may transmit 14-bit controller values using paired Most Significant Byte (MSB, controller numbers 0-31, 99, 101) and Least Significant Byte (LSB, controller numbers 32-63, 98, 100) Control Change commands [MIDI].

   If the most recent active Control Change command in the session history for a 14-bit controller pair uses the MSB number, Chapter C MAY omit the controller log for the most recent active Control Change command for the associated LSB number, as the command ordering makes this LSB value irrelevant. However, this exception MUST NOT be applied if the sender is not certain that the MIDI source uses 14-bit semantics for the controller number pair. Note that some MIDI sources ignore 14-bit controller semantics and use the LSB controller numbers as independent 7-bit controllers.

o  If active Control Change commands for controller numbers 0 (Bank Select MSB) or 32 (Bank Select LSB) appear in the checkpoint history, and if the command instances are also coded in the BANK-MSB and BANK-LSB fields of Chapter P (Appendix A.2), Chapter C MAY omit the controller logs for the commands.

o  Several controller number pairs are defined to be mutually exclusive. Controller numbers 124 (Omni Off) and 125 (Omni On) form a mutually exclusive pair, as do controller numbers 126 (Mono) and 127 (Poly).

   If active Control Change commands for one or both members of a mutually exclusive pair appear in the checkpoint history, a log for the controller number of the most recent command for the pair in the checkpoint history MUST appear in the controller list.
   However, the list MAY omit the controller log for the most recent active command for the other number in the pair.

   If active Control Change commands for one or both members of a mutually exclusive pair appear in the session history, and if a log for the controller number of the most recent command for the pair does not appear in the controller list, a log for the most recent command for the other number of the pair MUST NOT appear in the controller list.

o  If an active Control Change command for controller number 121 (Reset All Controllers) appears in the session history, the controller list MAY omit logs for Control Change commands that precede the Reset All Controllers command in the session history, under certain conditions.

   Namely, a log MAY be omitted if the sender is certain that a command stream follows the Reset All Controllers semantics defined in [RP015], and if the log codes a controller number for which [RP015] specifies a reset value.

   For example, [RP015] specifies that controller number 1 (Modulation Wheel) is reset to the value 0, and thus a controller log for Modulation Wheel MAY be omitted from the controller log list. In contrast, [RP015] specifies that controller number 7 (Channel Volume) is not reset, and thus a controller log for Channel Volume MUST NOT be omitted from the controller log list.

o  Appendix A.3.4 defines exception rules for the MIDI Parameter System controller numbers 6, 38, and 96-101.

A.3.2. Controller Log Format

Figure A.3.2 shows the controller log structure of Chapter C.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|    NUMBER   |A|  VALUE/ALT  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure A.3.2 -- Chapter C controller log

The 7-bit NUMBER field identifies the controller number of the coded command. The 7-bit VALUE/ALT field codes recovery information for the command. The A bit sets the format of the VALUE/ALT field.

A log encodes recovery information using one of the following tools: the value tool, the toggle tool, or the count tool.

A log uses the value tool if the A bit is set to 0. The value tool codes the 7-bit data value of a command in the VALUE/ALT field. The value tool works best for controllers that code a continuous quantity, such as number 1 (Modulation Wheel).

The A bit is set to 1 to code the toggle or count tool. These tools work best for controllers that code discrete actions. Figure A.3.3 shows the controller log for these tools.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|    NUMBER   |1|T|    ALT    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

     Figure A.3.3 -- Controller log for ALT tools

A log uses the toggle tool if the T bit is set to 0. A log uses the count tool if the T bit is set to 1. Both methods use the 6-bit ALT field as an unsigned integer.
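The bitfields of Figures A.3.1 through A.3.3 can be decoded mechanically. The following is a minimal sketch under stated assumptions, not a reference implementation: the names ControllerLog, parse_controller_log, and parse_chapter_c are hypothetical, and the sketch only unpacks fields (it applies none of the inclusion or ordering rules).

```python
from typing import List, NamedTuple


class ControllerLog(NamedTuple):
    """Hypothetical container for one 2-octet controller log."""
    s: int        # S bit of the log
    number: int   # 7-bit controller NUMBER
    tool: str     # "value", "toggle", or "count"
    data: int     # 7-bit VALUE, or 6-bit ALT for toggle/count


def parse_controller_log(pair: bytes) -> ControllerLog:
    """Decode one log laid out as in Figures A.3.2 and A.3.3."""
    if len(pair) != 2:
        raise ValueError("a controller log is exactly 2 octets")
    s = pair[0] >> 7
    number = pair[0] & 0x7F
    if pair[1] & 0x80 == 0:
        # A = 0: value tool, 7-bit data value in VALUE/ALT.
        return ControllerLog(s, number, "value", pair[1] & 0x7F)
    # A = 1: T selects toggle (0) or count (1); ALT is 6 bits.
    tool = "count" if pair[1] & 0x40 else "toggle"
    return ControllerLog(s, number, tool, pair[1] & 0x3F)


def parse_chapter_c(data: bytes) -> List[ControllerLog]:
    """Decode the 1-octet header and the log list of Figure A.3.1."""
    if not data:
        raise ValueError("Chapter C needs at least a header octet")
    count = (data[0] & 0x7F) + 1   # LEN codes the number of logs minus one
    end = 1 + 2 * count
    if len(data) < end:
        raise ValueError("Chapter C is shorter than LEN indicates")
    return [parse_controller_log(data[i:i + 2]) for i in range(1, end, 2)]
```

For instance, the octets 0x01 0x01 0x40 0x40 0x83 decode to two logs: a value tool log for controller 1 with value 64, and a toggle tool log for controller 64 with ALT = 3.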
The toggle tool works best for controllers that act as on/off switches, such as 64 (Damper Pedal (Sustain)). These controllers code the "off" state with control values 0-63 and the "on" state with 64-127.

For the toggle tool, the ALT field codes the total number of toggles (off->on and on->off) due to Control Change commands in the session history, up to and including a toggle caused by the command coded by the log. The toggle count includes toggles caused by Control Change commands for controller number 121 (Reset All Controllers).

Toggle counting is performed modulo 64. The toggle count is reset at the start of a session, and whenever a Reset State command (Appendix A.1) appears in the session history. When these reset events occur, the toggle count for a controller is set to 0 (for controllers whose default value is 0-63) or 1 (for controllers whose default value is 64-127).

The Damper Pedal (Sustain) controller illustrates the benefits of the toggle tool over the value tool for switch controllers. As often used in piano applications, the "on" state of the controller lets notes resonate, while the "off" state immediately damps notes to silence. The loss of the "off" command in an "on->off->on" sequence results in ringing notes that should have been damped silent. The toggle tool lets receivers detect this lost "off" command, but the value tool does not.

The count tool is conceptually similar to the toggle tool. For the count tool, the ALT field codes the total number of Control Change commands in the session history, up to and including the command coded by the log. Command counting is performed modulo 64. The command count is set to 0 at the start of the session and is reset to 0 whenever a Reset State command (Appendix A.1) appears in the session history.

Because the count tool ignores the data value, it is a good match for controllers whose controller value is ignored, such as number 123 (All Notes Off). More generally, the count tool may be used to code a (modulo 64) identification number for a command.

A.3.3. Log List Coding Rules

In this section, we describe the organization of controller logs in the Chapter C log list.

A log encodes information about a particular Control Change command in the session history. In most cases, a command SHOULD be coded by a single tool (and, thus, a single log). If a number is coded with a single tool and this tool is the count tool, recovery Control Change commands generated by a receiver SHOULD use the default control value for the controller.

However, a command MAY be coded by several tool types (and, thus, several logs, each using a different tool). This technique may improve recovery performance for controllers with complex semantics, such as controller number 84 (Portamento Control) or controller number 121 (Reset All Controllers) when used with a non-zero data octet (with the semantics described in [DLS2]).

If a command is encoded by multiple tools, the logs MUST be placed in the list in the following order: count tool log (if any), followed by value tool log (if any), followed by toggle tool log (if any).

The Chapter C log list MUST obey the oldest-first ordering rule (defined in Appendix A.1). Note that this ordering preserves the information necessary for the recovery of 14-bit controller values, without precluding the use of MSB and LSB controller pairs as independent 7-bit controllers.

In the default use of the payload format, all logs that appear in the list for a controller number encode information about one Control Change command -- namely, the most recent active Control Change command in the session history for the number.

This coding scheme provides good recovery performance for the standard uses of Control Change commands defined in [MIDI]. However, not all MIDI applications restrict the use of Control Change commands to those defined in [MIDI].

For example, consider the common MIDI encoding of rotary encoders ("infinite" rotation knobs). The mixing console MIDI convention defined in [LCP] codes the position of rotary encoders as a series of Control Change commands. Each command encodes a relative change of knob position from the last update (expressed as a clockwise or counter-clockwise knob turning angle).

As the knob position is encoded incrementally over a series of Control Change commands, the best recovery performance is obtained if the log list encodes all Control Change commands for encoder controller numbers that appear in the checkpoint history, not only the most recent command.

To support application areas that use Control Change commands in this way, Chapter C may be configured to encode information about several Control Change commands for a controller number. We use the term "enhanced" to describe this encoding method, which we describe below.

In Appendix C.2.3, we show how to configure a stream to use enhanced Chapter C encoding for specific controller numbers. In Section 5 of the main text, we show how the H bits in the recovery journal header (Figure 8) and in the channel journal header (Figure 9) indicate the use of enhanced Chapter C encoding.

Here, we define how to encode a Chapter C log list that uses the enhanced encoding method.

Senders that use the enhanced encoding method for a controller number MUST obey the rules below. These rules let a receiver determine which logs in the list correspond to lost commands. Note that these rules override the exceptions listed in Appendix A.3.1.

o  If N commands for a controller number are encoded in the list, the commands MUST be the N most recent commands for the controller number in the session history. For example, for N = 2, the sender MUST encode the most recent command and the second most recent command, not the most recent command and the third most recent command.
o  If a controller number uses enhanced encoding, the encoding of the least-recent command for the controller number in the list MUST include a count tool log. In addition, if commands are encoded for the controller number whose logs have S bits set to 0, the encoding of the least-recent command with S = 0 logs MUST include a count tool log.

   The count tool is OPTIONAL for the other commands for the controller number encoded in the list, as a receiver is able to efficiently deduce the count tool value for these commands, for both single-packet and multi-packet loss events.

o  The use of the value and toggle tools MUST be identical for all commands for a controller number encoded in the list. For example, a value tool log either MUST appear for all commands for the controller number coded in the list, or alternatively, value tool logs for the controller number MUST NOT appear in the list.
   Likewise, a toggle tool log either MUST appear for all commands for the controller number coded in the list, or alternatively, toggle tool logs for the controller number MUST NOT appear in the list.

o  If a command is encoded by multiple tools, the logs MUST be placed in the list in the following order: count tool log (if any), followed by value tool log (if any), followed by toggle tool log (if any).

These rules permit a receiver recovering from a packet loss to use the count tool log to match the commands encoded in the list with its own history of the stream, as we describe below. Note that the text below describes a non-normative algorithm; receivers are free to use any algorithm to match their history with the log list.

In a typical implementation of the enhanced encoding method, a receiver computes and stores count, value, and toggle tool data field values for the most recent Control Change command it has received for a controller number.

After a loss event, a receiver parses the Chapter C list and processes list logs for a controller number that uses enhanced encoding as follows.

The receiver compares the count tool ALT field for the least-recent command for the controller number in the list against its stored count data for the controller number, to determine if recovery is necessary for the command coded in the list. The value and toggle tool logs (if any) that directly follow the count tool log are associated with this least-recent command.

To check more-recent commands for the controller, the receiver detects additional value and/or toggle tool logs for the controller number in the list and infers count tool data for the command coded by these logs. This inferred data is used to determine if recovery is necessary for the command coded by the value and/or toggle tool logs. In this way, a receiver is able to execute only lost commands, without executing a command twice.

While recovering from a single packet loss, a receiver may skip through S = 1 logs in the list, as the first S = 0 log for an enhanced controller number is always a count tool log.

Note that the requirements in Appendix C.2.2.2 for protective sender and receiver actions during session startup for multicast operation are of particular importance for enhanced encoding, as receivers need to initialize their count tool data structures with recovery journal data in order to match commands correctly after a loss event.

Finally, we note in passing that in some applications of rotary encoders, a good user experience may be possible without the use of enhanced encoding. These applications are distinguished by visual feedback of encoding position that is driven by the post-recovery rotary encoding stream, and relatively low packet loss. In these domains, recovery performance may be acceptable for rotary encoders if the log list encodes only the most recent command, if both count and value logs appear for the command.

A.3.4. The Parameter System

Readers may wish to review the Appendix A.1 definitions of "parameter system", "parameter system transaction", and "initiated parameter system transaction" before reading this section.

Parameter system transactions update a MIDI Registered Parameter Number (RPN) or Non-Registered Parameter Number (NRPN) value.
A parameter system transaction is a sequence of Control Change commands that may use the following controller numbers:

o Data Entry MSB (6)
o Data Entry LSB (38)
o Data Increment (96)
o Data Decrement (97)
o Non-Registered Parameter Number (NRPN) LSB (98)
o Non-Registered Parameter Number (NRPN) MSB (99)
o Registered Parameter Number (RPN) LSB (100)
o Registered Parameter Number (RPN) MSB (101)

Control Change commands that are a part of a parameter system transaction MUST NOT be coded in Chapter C controller logs. Instead, these commands are coded in Chapter M, the MIDI Parameter chapter defined in Appendix A.4.

However, Control Change commands that use the listed controllers as general-purpose controllers (i.e., outside of a parameter system transaction) MUST NOT be coded in Chapter M. Instead, the controllers are coded in Chapter C controller logs. The controller logs follow the coding rules stated in Appendix A.3.2 and A.3.3. The rules for coding paired LSB and MSB controllers, as defined in Appendix A.3.1, apply to the pairs (6, 38), (99, 98), and (101, 100) when coded in Chapter C.

If active Control Change commands for controller numbers 6, 38, or 96-101 appear in the checkpoint history, and these commands are used as general-purpose controllers, the most recent general-purpose command instance for these controller numbers MUST appear as entries in the Chapter C controller list.

MIDI syntax permits a source to use controllers 6, 38, 96, and 97 as parameter-system controllers and general-purpose controllers in the same stream.
An RTP MIDI sender MUST deduce the role of each Control Change command for these controller numbers by noting the placement of the command in the stream and MUST use this information to code the command in Chapter C or Chapter M, as appropriate.

Specifically, active Control Change commands for controllers 6, 38, 96, and 97 act in a general-purpose way when

o no active Control Change commands that set an RPN or NRPN parameter number appear in the session history, or

o the most recent active Control Change commands in the session history that set an RPN or NRPN parameter number code the null parameter (MSB value 0x7F, LSB value 0x7F), or

o a Control Change command for controller number 121 (Reset All Controllers) appears more recently in the session history than all active Control Change commands that set an RPN or NRPN parameter number (see [RP015] for details).

Finally, we note that a MIDI source that follows the recommendations of [MIDI] exclusively uses numbers 98-101 as parameter system controllers.
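As a non-normative illustration, the three conditions above can be sketched in Python. The function name and the history representation (an ordered list of (controller, value) pairs for one channel) are assumptions made for this sketch, as is the premise that the caller has already filtered the history down to active commands in the Appendix A.1 sense.

```python
# Hypothetical helper, not part of the specification: decides whether a
# Control Change for controller 6, 38, 96, or 97 that arrives after
# `history` acts in a general-purpose way.
NRPN_LSB, NRPN_MSB, RPN_LSB, RPN_MSB = 98, 99, 100, 101
RESET_ALL_CONTROLLERS = 121


def acts_general_purpose(history):
    """history: ordered (controller, value) pairs, assumed active."""
    msb = lsb = None          # most recent (kind, value) for MSB and LSB
    last_param_idx = None     # index of the last parameter-number command
    last_reset_idx = None     # index of the last Reset All Controllers
    for idx, (controller, value) in enumerate(history):
        if controller in (RPN_MSB, NRPN_MSB):
            msb = ("rpn" if controller == RPN_MSB else "nrpn", value)
            last_param_idx = idx
        elif controller in (RPN_LSB, NRPN_LSB):
            lsb = ("rpn" if controller == RPN_LSB else "nrpn", value)
            last_param_idx = idx
        elif controller == RESET_ALL_CONTROLLERS:
            last_reset_idx = idx

    # Condition 1: no command has set a parameter number.
    if last_param_idx is None:
        return True
    # Condition 3: Reset All Controllers is more recent than all of them.
    if last_reset_idx is not None and last_reset_idx > last_param_idx:
        return True
    # Condition 2: the most recent selection codes the null parameter.
    if msb and lsb and msb[0] == lsb[0] and msb[1] == lsb[1] == 0x7F:
        return True
    return False
```

A sender could call such a helper before coding a Data Entry, Data Increment, or Data Decrement command, choosing Chapter C when it returns True and Chapter M otherwise.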
Alternatively, a MIDI source may exclusively use 98-101 as general-purpose controllers and lose the ability to perform parameter system transactions in a stream.

In the language of [MIDI], the general-purpose use of controllers 98-101 constitutes a non-standard controller assignment. As most real-world MIDI sources use the standard controller assignment for controller numbers 98-101, an RTP MIDI sender SHOULD assume these controllers act as parameter system controllers, unless it knows that a MIDI source uses controller numbers 98-101 in a general-purpose way.

A.4. Chapter M: MIDI Parameter System

Readers may wish to review the Appendix A.1 definitions for "C-active", "parameter system", "parameter system transaction", and "initiated parameter system transaction" before reading this appendix.

Chapter M protects parameter system transactions for Registered Parameter Number (RPN) and Non-Registered Parameter Number (NRPN) values. Figure A.4.1 shows the format for Chapter M.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|P|E|U|W|Z|      LENGTH       |Q|   PENDING   |  Log list ... |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.4.1 -- Top-level Chapter M format

Chapter M begins with a 2-octet header. If the P header bit is set to 1, a 1-octet field follows the header, coding the 7-bit PENDING value and its associated Q bit.
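The header layout in Figure A.4.1 can be read with a few shifts and masks. The following Python sketch is illustrative only: the function name and the dictionary return shape are assumptions, and the sketch treats the buffer as starting at the first octet of Chapter M.

```python
# Hypothetical parser for the top-level Chapter M header (Figure A.4.1).
def parse_chapter_m_header(data: bytes):
    """Return (header_fields, offset_of_log_list)."""
    if len(data) < 2:
        raise ValueError("Chapter M header needs at least 2 octets")
    word = (data[0] << 8) | data[1]      # network (big-endian) order
    header = {
        "S": (word >> 15) & 1,
        "P": (word >> 14) & 1,
        "E": (word >> 13) & 1,
        "U": (word >> 12) & 1,
        "W": (word >> 11) & 1,
        "Z": (word >> 10) & 1,
        "LENGTH": word & 0x3FF,          # low 10 bits
    }
    offset = 2
    if header["P"]:
        # The Q|PENDING octet is present only when P = 1.
        if len(data) < 3:
            raise ValueError("P = 1 but the PENDING octet is missing")
        header["Q"] = data[2] >> 7
        header["PENDING"] = data[2] & 0x7F
        offset = 3
    return header, offset
```

For example, the octets 0x44 0x05 0x92 decode to P = 1, Z = 1, LENGTH = 5, Q = 1, and PENDING = 0x12, with the log list starting at offset 3.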
The 10-bit LENGTH field codes the size of Chapter M and conforms to semantics described in Appendix A.1.

Chapter M ends with a list of zero or more variable-length parameter logs. Appendix A.4.2 defines the bitfield format of a parameter log. Appendix A.4.1 defines the inclusion semantics of the log list.

A channel journal MUST contain Chapter M if the rules defined in Appendix A.4.1 require that one or more parameter logs appear in the list.

A channel journal also MUST contain Chapter M if the most recent C-active Control Change command involved in a parameter system transaction in the checkpoint history is

o an RPN MSB (101) or NRPN MSB (99) controller, or

o an RPN LSB (100) or NRPN LSB (98) controller that completes the coding of the null parameter (MSB value 0x7F, LSB value 0x7F).

This rule provides loss protection for partially transmitted parameter numbers and for the null parameter numbers.

If the most recent C-active Control Change command involved in a parameter system transaction in the session history is for the RPN MSB or NRPN MSB controller, the P header bit MUST be set to 1, and the PENDING field (and its associated Q bit) MUST follow the Chapter M header. Otherwise, the P header bit MUST be set to 0, and the PENDING field and Q bit MUST NOT appear in Chapter M.

If PENDING codes an NRPN MSB, the Q bit MUST be set to 1. If PENDING codes an RPN MSB, the Q bit MUST be set to 0.

The E header bit codes the current transaction state of the MIDI stream. If E = 1, an initiated transaction is in progress. Below, we define the rules for setting the E header bit:

o If no C-active parameter system transaction Control Change commands appear in the session history, the E bit MUST be set to 0.

o If the P header bit is set to 1, the E bit MUST be set to 0.

o If the most recent C-active parameter system transaction Control Change command in the session history is for the NRPN LSB or RPN LSB controller number, and if this command acts to complete the coding of the null parameter (MSB value 0x7F, LSB value 0x7F), the E bit MUST be set to 0.

o Otherwise, an initiated transaction is in progress, and the E bit MUST be set to 1.
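The P and E rules above reduce to a short decision chain. The Python sketch below is one possible reading, not a normative algorithm. It assumes the sender already tracks the most recent C-active parameter system transaction command as a (controller, completes_null) pair, or None when no such command appears in the session history; both the function name and that representation are hypothetical.

```python
# Hypothetical helper: derive the Chapter M P and E header bits from the
# most recent C-active parameter system transaction command.
MSB_CONTROLLERS = {99, 101}   # NRPN MSB, RPN MSB
LSB_CONTROLLERS = {98, 100}   # NRPN LSB, RPN LSB


def chapter_m_p_and_e(last_command):
    """last_command: None, or (controller, completes_null) where
    completes_null is True if the command completes the coding of the
    null parameter (MSB 0x7F, LSB 0x7F). Returns (P, E)."""
    if last_command is None:
        return 0, 0                      # no C-active commands: E = 0
    controller, completes_null = last_command
    if controller in MSB_CONTROLLERS:
        return 1, 0                      # pending MSB: P = 1 forces E = 0
    if controller in LSB_CONTROLLERS and completes_null:
        return 0, 0                      # null parameter closes the transaction
    return 0, 1                          # otherwise a transaction is in progress
```

Note that a Data Entry command (controller 6 or 38) as the most recent command falls through to the last branch, which matches the "Otherwise" rule.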
The U, W, and Z header bits code properties that are shared by all parameter logs in the list. If these properties are set, parameter logs may be coded with improved efficiency (we explain how in A.4.1).

By default, the U, W, and Z bits MUST be set to 0. If all parameter logs in the list code RPN parameters, the U bit MAY be set to 1. If all parameter logs in the list code NRPN parameters, the W bit MAY be set to 1. If the parameter numbers of all RPN and NRPN logs in the list lie in the range 0-127 (and thus have an MSB value of 0), the Z bit MAY be set to 1.

Note that C-active semantics appear in the preceding paragraphs because [RP015] specifies that pending Parameter System transactions are closed by a Control Change command for controller number 121 (Reset All Controllers).

A.4.1. Log Inclusion Rules

Parameter logs code recovery information for a specific RPN or NRPN parameter.

A parameter log MUST appear in the list if an active Control Change command that forms a part of an initiated transaction for the parameter appears in the checkpoint history.

An exception to this rule applies if the checkpoint history only contains transaction Control Change commands for controller numbers 98-101 that act to terminate the transaction. In this case, a log for the parameter MAY be omitted from the list.

A log MAY appear in the list if an active Control Change command that forms a part of an initiated transaction for the parameter appears in the session history. Otherwise, a log for the parameter MUST NOT appear in the list.

Multiple logs for the same RPN or NRPN parameter MUST NOT appear in the log list.

The parameter log list MUST obey the oldest-first ordering rule (defined in Appendix A.1), with the phrase "parameter transaction" replacing the word "command" in the rule definition.

Parameter logs associated with the RPN or NRPN null parameter (LSB = 0x7F, MSB = 0x7F) MUST NOT appear in the log list.
Chapter M uses the E header bit (Figure A.4.1) and the log list ordering rules to code null parameter semantics.

Note that "active" semantics (rather than "C-active" semantics) appear in the preceding paragraphs because [RP015] specifies that pending Parameter System transactions are not reset by a Control Change command for controller number 121 (Reset All Controllers). However, the rule that follows uses C-active semantics, because it concerns the protection of the transaction system itself, and [RP015] specifies that Reset All Controllers acts to close a transaction in progress.

In most cases, parameter logs for RPN and NRPN parameters that are assigned to the ch_never parameter (Appendix C.2.3) MAY be omitted from the list. An exception applies if

o the log codes the most recent initiated transaction in the session history, and

o a C-active command that forms a part of the transaction appears in the checkpoint history, and

o the E header bit for the top-level Chapter M header (Figure A.4.1) is set to 1.

In this case, a log for the parameter MUST appear in the list. This log informs receivers recovering from a loss that a transaction is in progress, so that the receiver is able to correctly interpret RPN or NRPN Control Change commands that follow the loss event.

A.4.2. Log Coding Rules

Figure A.4.2 shows the parameter log structure of Chapter M.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|  PNUM-LSB   |Q|  PNUM-MSB   |J|K|L|M|N|T|V|R|  Fields ...   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.4.2 -- Parameter log format

The log begins with a header, whose default size (as shown in Figure A.4.2) is 3 octets. If the Q header bit is set to 0, the log encodes an RPN parameter. If Q = 1, the log encodes an NRPN parameter. The 7-bit PNUM-MSB and PNUM-LSB fields code the parameter number and reflect the Control Change command data values for controllers 99 and 98 (for NRPNs) or 101 and 100 (for RPNs).

The J, K, L, M, and N header bits form a Table of Contents (TOC) for the log and signal the presence of fixed-sized fields that follow the header. A header bit that is set to 1 codes the presence of a field in the log. The ordering of fields in the log follows the ordering of the header bits in the TOC. Appendices A.4.2.1-2 define the fields associated with each TOC header bit.

The T and V header bits code information about the parameter log but are not part of the TOC. A set T or V bit does not signal the presence of any parameter log field.

If the rules in Appendix A.4.1 state that a log for a given parameter MUST appear in Chapter M, the log MUST code sufficient information to protect the parameter from the loss of active parameter transaction Control Change commands in the checkpoint history.

This rule does not apply if the parameter coded by the log is assigned to the ch_never parameter (Appendix C.2.3). In this case, senders MAY choose to set the J, K, L, M, and N TOC bits to 0, coding a parameter log with no fields.

Note that logs to protect parameters that are assigned to ch_never are REQUIRED under certain conditions (see Appendix A.4.1). The purpose of the log is to inform receivers recovering from a loss that a transaction is in progress, so that the receiver is able to correctly interpret RPN or NRPN Control Change commands that follow the loss event.

Parameter logs provide two tools for parameter protection: the value tool and the count tool. Depending on the semantics of the parameter, senders may use either tool, both tools, or neither tool to protect a given parameter.

The value tool codes information a receiver may use to determine the current value of an RPN or NRPN parameter. If a parameter log uses the value tool, the V header bit MUST be set to 1, and the semantics defined in Appendix A.4.2.1 for setting the J, K, L, and M TOC bits MUST be followed.
If a parameter log does not use the value tool, the V bit MUST be set to 0, and the J, K, L, and M TOC bits MUST also be set to 0.

The count tool codes the number of transactions for an RPN or NRPN parameter. If a parameter log uses the count tool, the T header bit MUST be set to 1, and the semantics defined in Appendix A.4.2.2 for setting the N TOC bit MUST be followed. If a parameter log does not use the count tool, the T bit and the N TOC bit MUST be set to 0.

Note that V and T are set if the sender uses the value (V) or count (T) tool for the log on an ongoing basis. Thus, V may be set even if J = K = L = M = 0, and T may be set even if N = 0.

In many cases, all parameters coded in the log list are of one type (RPN or NRPN), and all parameter numbers lie in the range 0-127. As described in Appendix A.4.1, senders MAY signal this condition by setting the top-level Chapter M header bit Z to 1 (to code the restricted range) and by setting the U or W bit to 1 (to code the parameter type).

If the top-level Chapter M header codes Z = 1 and either U = 1 or W = 1, all logs in the parameter log list MUST use a modified header format. This modification deletes bits 8-15 of the bitfield shown in Figure A.4.2, to yield a 2-octet header.
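To make the two header forms concrete, the following Python sketch decodes either the default 3-octet header or the modified 2-octet header. It is a sketch under stated assumptions: the function name, argument order, and return shape are hypothetical, and the inference of the deleted fields (Q = 0 when U = 1, Q = 1 when W = 1, PNUM-MSB = 0 when Z = 1) follows the meaning this appendix gives to the U, W, and Z bits.

```python
# Hypothetical decoder for a Chapter M parameter log header (Figure A.4.2).
TOC_NAMES = ("J", "K", "L", "M", "N", "T", "V", "R")


def parse_parameter_log_header(data: bytes, offset: int, u: int, w: int, z: int):
    """Return (fields, header_size). u, w, z come from the Chapter M header."""
    modified = z == 1 and (u == 1 or w == 1)
    first = data[offset]
    fields = {"S": first >> 7, "PNUM-LSB": first & 0x7F}
    if modified:
        # Bits 8-15 (Q and PNUM-MSB) are deleted and must be inferred.
        fields["Q"] = 1 if w == 1 else 0
        fields["PNUM-MSB"] = 0
        flags = data[offset + 1]
        size = 2
    else:
        second = data[offset + 1]
        fields["Q"] = second >> 7
        fields["PNUM-MSB"] = second & 0x7F
        flags = data[offset + 2]
        size = 3
    # J is the most significant bit of the flags octet, R the least.
    for position, name in enumerate(TOC_NAMES):
        fields[name] = (flags >> (7 - position)) & 1
    return fields, size
```

The field list begins at offset + size; which fields are present is then determined by the J, K, L, M, and N bits, in that order.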
The values of the deleted PNUM-MSB and Q fields may be inferred from the U, W, and Z bit values.

A.4.2.1. The Value Tool

The value tool uses several fields to track the value of an RPN or NRPN parameter.

The J TOC bit codes the presence of the octet shown in Figure A.4.3 in the field list.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |X|  ENTRY-MSB  |
   +-+-+-+-+-+-+-+-+

   Figure A.4.3 -- ENTRY-MSB field

The 7-bit ENTRY-MSB field codes the data value of the most recent active Control Change command for controller number 6 (Data Entry MSB) in the session history that appears in a transaction for the log parameter.

The X bit MUST be set to 1 if the command coded by ENTRY-MSB precedes the most recent Control Change command for controller 121 (Reset All Controllers) in the session history. Otherwise, the X bit MUST be set to 0.

A parameter log that uses the value tool MUST include the ENTRY-MSB field if an active Control Change command for controller number 6 appears in the checkpoint history.

Note that [RP015] specifies that Control Change commands for controller 121 (Reset All Controllers) do not reset RPN and NRPN values, and thus the X bit would not play a recovery role for MIDI systems that comply with [RP015].

However, certain renderers (such as DLS 2 [DLS2]) specify that certain RPN values are reset for some uses of Reset All Controllers. The X bit (and other bitfield features of this nature in this appendix) plays a role in recovery for renderers of this type.

The K TOC bit codes the presence of the octet shown in Figure A.4.4 in the field list.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |X|  ENTRY-LSB  |
   +-+-+-+-+-+-+-+-+

   Figure A.4.4 -- ENTRY-LSB field

The 7-bit ENTRY-LSB field codes the data value of the most recent active Control Change command for controller number 38 (Data Entry LSB) in the session history that appears in a transaction for the log parameter.

The X bit MUST be set to 1 if the command coded by ENTRY-LSB precedes the most recent Control Change command for controller 121 (Reset All Controllers) in the session history. Otherwise, the X bit MUST be set to 0.

As a rule, a parameter log that uses the value tool MUST include the ENTRY-LSB field if an active Control Change command for controller number 38 appears in the checkpoint history. However, the ENTRY-LSB field MUST NOT appear in a parameter log if the Control Change command associated with the ENTRY-LSB precedes a Control Change command for controller number 6 (Data Entry MSB) that appears in a transaction for the log parameter in the session history.

The L TOC bit codes the presence of the octets shown in Figure A.4.5 in the field list.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|X|         A-BUTTON          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.4.5 -- A-BUTTON field
The 14-bit A-BUTTON field codes a count of the number of active Control Change commands for controller numbers 96 and 97 (Data Increment and Data Decrement) in the session history that appear in a transaction for the log parameter.

The M TOC bit codes the presence of the octets shown in Figure A.4.6 in the field list.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |G|R|         C-BUTTON          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure A.4.6 -- C-BUTTON field

The 14-bit C-BUTTON field has semantics identical to A-BUTTON, except that Data Increment and Data Decrement Control Change commands that precede the most recent Control Change command for controller 121 (Reset All Controllers) in the session history are not counted.

For both A-BUTTON and C-BUTTON, Data Increment and Data Decrement Control Change commands are not counted if they precede Control Change commands for controller numbers 6 (Data Entry MSB) or 38 (Data Entry LSB) that appear in a transaction for the log parameter in the session history.

The A-BUTTON and C-BUTTON fields are interpreted as unsigned integers, and the G bit associated with the field codes the sign of the integer (G = 0 for positive or zero, G = 1 for negative).
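The sign-and-magnitude coding of the button fields can be sketched in Python as follows. The sketch is non-normative: the function names are hypothetical, the input is assumed to be only the qualifying Data Increment (96) and Data Decrement (97) commands in stream order, and the counting rule it applies (add 1, subtract 1, limit the magnitude to 16383 after each step) is the one this appendix specifies for these fields.

```python
# Hypothetical helpers for the A-BUTTON / C-BUTTON fields.
DATA_INCREMENT, DATA_DECREMENT = 96, 97
MAX_MAGNITUDE = 16383  # largest value a 14-bit field can hold


def button_count(qualifying_controllers):
    """Signed count over qualifying commands, clamped after every step."""
    count = 0
    for controller in qualifying_controllers:
        if controller == DATA_INCREMENT:
            count += 1
        elif controller == DATA_DECREMENT:
            count -= 1
        count = max(-MAX_MAGNITUDE, min(MAX_MAGNITUDE, count))
    return count


def encode_button_field(count, flag_bit):
    """Pack G (sign), the second flag bit (X or R), and the magnitude
    into the 16-bit layout of Figures A.4.5 and A.4.6."""
    g = 1 if count < 0 else 0
    word = (g << 15) | ((flag_bit & 1) << 14) | (abs(count) & 0x3FFF)
    return word.to_bytes(2, "big")
```

Clamping after each step, rather than once at the end, means a long run of increments followed by decrements starts counting down from 16383 and not from the unclamped total.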
To compute and code the count value, initialize the count value to 0, add 1 for each qualifying Data Increment command, and subtract 1 for each qualifying Data Decrement command. After each add or subtract, limit the count magnitude to 16383. The G bit codes the sign of the count, and the A-BUTTON or C-BUTTON field codes the count magnitude.

For the A-BUTTON field, if the most recent qualified Data Increment or Data Decrement command precedes the most recent Control Change command for controller 121 (Reset All Controllers) in the session history, the X bit associated with the A-BUTTON field MUST be set to 1. Otherwise, the X bit MUST be set to 0.

A parameter log that uses the value tool MUST include the A-BUTTON and C-BUTTON fields if an active Control Change command for controller numbers 96 or 97 appears in the checkpoint history. However, to improve coding efficiency, this rule has several exceptions:

o If the log includes the A-BUTTON field, and if the X bit of the A-BUTTON field is set to 1, the C-BUTTON field (and its associated R and G bits) MAY be omitted from the log.

o If the log includes the A-BUTTON field, and if the A-BUTTON and C-BUTTON fields (and their associated G bits) code identical values, the C-BUTTON field (and its associated R and G bits) MAY be omitted from the log.

A.4.2.2. The Count Tool

The count tool tracks the number of transactions for an RPN or NRPN parameter. The N TOC bit codes the presence of the octet shown in Figure A.4.7 in the field list.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |X|    COUNT    |
    +-+-+-+-+-+-+-+-+

    Figure A.4.7 -- COUNT field

The 7-bit COUNT codes the number of initiated transactions for the log parameter that appear in the session history. Initiated transactions are counted if they contain one or more active Control Change commands, including commands for controllers 98-101 that initiate the parameter transaction.

If the most recent counted transaction precedes the most recent Control Change command for controller 121 (Reset All Controllers) in the session history, the X bit associated with the COUNT field MUST be set to 1.
Otherwise, the X bit MUST be set to 0.

Transaction counting is performed modulo 128. The transaction count is set to 0 at the start of a session and is reset to 0 whenever a Reset State command (Appendix A.1) appears in the session history.

A parameter log that uses the count tool MUST include the COUNT field if an active command that increments the transaction count (modulo 128) appears in the checkpoint history.

A.5. Chapter W: MIDI Pitch Wheel

A channel journal MUST contain Chapter W if a C-active MIDI Pitch Wheel (0xE) command appears in the checkpoint history. Figure A.5.1 shows the format for Chapter W.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|    FIRST    |R|   SECOND    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.5.1 -- Chapter W format

The chapter has a fixed size of 16 bits. The FIRST and SECOND fields are the 7-bit values of the first and second data octets of the most recent C-active Pitch Wheel command in the session history.

Note that Chapter W encodes C-active commands and thus does not encode active commands that are not C-active (see the second-to-last paragraph of Appendix A.1 for an explanation of chapter inclusion text in this regard).

Chapter W does not encode "active but not C-active" commands because [RP015] declares that Control Change commands for controller number 121 (Reset All Controllers) act to reset the Pitch Wheel value to 0. If Chapter W encoded "active but not C-active" commands, a repair operation following a Reset All Controllers command could incorrectly repair the stream with a stale Pitch Wheel value.

A.6. Chapter N: MIDI NoteOff and NoteOn

In this appendix, we consider NoteOn commands with zero velocity to be NoteOff commands. Readers may wish to review the Appendix A.1 definition of "N-active commands" before reading this appendix.

Chapter N completely protects note commands in streams that alternate between NoteOn and NoteOff commands for a particular note number. However, in rare applications, multiple overlapping NoteOn commands may appear for a note number.
Chapter E, described in Appendix A.7, augments Chapter N to completely protect these streams.

A channel journal MUST contain Chapter N if an N-active MIDI NoteOn (0x9) or NoteOff (0x8) command appears in the checkpoint history. Figure A.6.1 shows the format for Chapter N.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |B|     LEN     |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |Y|  VELOCITY   |             ....              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    OFFBITS    |    OFFBITS    |     ....      |    OFFBITS    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.6.1 -- Chapter N format

Chapter N consists of a 2-octet header, followed by at least one of the following data structures:

o A list of note logs to code NoteOn commands.

o A NoteOff bitfield structure to code NoteOff commands.

We define the header bitfield semantics in Appendix A.6.1. We define the note log semantics and the NoteOff bitfield semantics in Appendix A.6.2.

If one or more N-active NoteOn or NoteOff commands in the checkpoint history reference a note number, the note number MUST be coded in either the note log list or the NoteOff bitfield structure.

The note log list MUST contain an entry for all note numbers whose most recent checkpoint history appearance is in an N-active NoteOn command. The NoteOff bitfield structure MUST contain a set bit for all note numbers whose most recent checkpoint history appearance is in an N-active NoteOff command.

A note number MUST NOT be coded in both structures.

All note logs and NoteOff bitfield set bits MUST code the most recent N-active NoteOn or NoteOff reference to a note number in the session history.

The note log list MUST obey the oldest-first ordering rule (defined in Appendix A.1).
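The division of note numbers between the two structures can be pictured with the following non-normative sketch. The function name and the tuple layout are hypothetical. The sketch assumes the caller supplies only the N-active note commands of the checkpoint history, oldest first, and it assumes that the oldest-first ordering rule orders note logs by the most recent appearance of each note number (Appendix A.1 holds the actual definition).

```python
def partition_notes(n_active_commands):
    """Split note numbers between the note log list and the NoteOff bitfield.

    n_active_commands: iterable of (kind, note, velocity) tuples, oldest
    first, where kind is "on" or "off".  Returns (note_logs, off_notes):
    note_logs is a list of (note, velocity) pairs, off_notes is a set.
    """
    latest = {}
    for kind, note, velocity in n_active_commands:
        if kind == "on" and velocity == 0:
            kind = "off"            # zero-velocity NoteOn counts as NoteOff
        latest.pop(note, None)      # re-insert so dict order tracks recency
        latest[note] = (kind, velocity)

    note_logs = [(n, v) for n, (k, v) in latest.items() if k == "on"]
    off_notes = {n for n, (k, _) in latest.items() if k == "off"}
    return note_logs, off_notes
```

Because each note number keeps only its most recent command, a note number can never land in both structures.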
A.6.1. Header Structure

The header for Chapter N, shown in Figure A.6.2, codes the size of the note list and bitfield structures.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |B|     LEN     |  LOW  | HIGH  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.6.2 -- Chapter N header

The LEN field, a 7-bit integer value, codes the number of 2-octet note logs in the note list. Zero is a valid value for LEN and codes an empty note list.

The 4-bit LOW and HIGH fields code the number of OFFBITS octets that follow the note log list. LOW and HIGH are unsigned integer values. If LOW <= HIGH, there are (HIGH - LOW + 1) OFFBITS octets in the chapter.
The value pairs (LOW = 15, HIGH = 0) and (LOW = 15, HIGH = 1) code an empty NoteOff bitfield structure (i.e., no OFFBITS octets). Other (LOW > HIGH) value pairs MUST NOT appear in the header.

The B bit provides S-bit functionality (Appendix A.1) for the NoteOff bitfield structure. By default, the B bit MUST be set to 1. However, if the MIDI command section of the previous packet (packet I - 1, with I as defined in Appendix A.1) includes a NoteOff command for the channel, the B bit MUST be set to 0. If the B bit is set to 0, the higher-level recovery journal elements that contain Chapter N MUST have S bits that are set to 0, including the top-level journal header.

The LEN value of 127 codes a note list length of 127 or 128 note logs, depending on the values of LOW and HIGH.
If LEN = 127, LOW = 15, and HIGH = 0, the note list holds 128 note logs, and the NoteOff bitfield structure is empty. For other values of LOW and HIGH, LEN = 127 codes that the note list contains 127 note logs. In this case, the chapter has (HIGH - LOW + 1) NoteOff OFFBITS octets if LOW <= HIGH and has no OFFBITS octets if LOW = 15 and HIGH = 1.

A.6.2. Note Structures

Figure A.6.3 shows the 2-octet note log structure.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |Y|  VELOCITY   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.6.3 -- Chapter N note log

The 7-bit NOTENUM field codes the note number for the log. A note number MUST NOT be represented by multiple note logs in the note list.

The 7-bit VELOCITY field codes the velocity value for the most recent N-active NoteOn command for the note number in the session history. Multiple overlapping NoteOns for a given note number may be coded using Chapter E, as discussed in Appendix A.7.

VELOCITY is never zero; NoteOn commands with zero velocity are coded as NoteOff commands in the NoteOff bitfield structure.

The note log does not code the execution time of the NoteOn command. However, the Y bit codes a hint from the sender about the NoteOn execution time. The Y bit codes a recommendation to play (Y = 1) or skip (Y = 0) the NoteOn command recovered from the note log. See Section 4.2 of [RFC4696] for non-normative guidance on the use of the Y bit.

Figure A.6.1 shows the NoteOff bitfield structure, as the list of OFFBITS octets at the end of the chapter. A NoteOff OFFBITS octet codes NoteOff information for eight consecutive MIDI note numbers, with the most-significant bit representing the lowest note number.
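As a non-normative sketch, a receiver might size the Chapter N structures and expand the OFFBITS octets as shown below. The function names are hypothetical. The sketch takes bit 0 of the header figure as the most-significant bit of the first octet, and it takes the first OFFBITS octet to begin at note number 8*LOW.

```python
def parse_chapter_n_header(header):
    """Decode the 2-octet Chapter N header.

    Returns (b_bit, note_log_count, offbits_count, low).
    """
    b_bit = header[0] >> 7
    length = header[0] & 0x7F
    low = header[1] >> 4
    high = header[1] & 0x0F

    if low <= high:
        offbits_count = high - low + 1
    elif low == 15 and high in (0, 1):
        offbits_count = 0           # empty NoteOff bitfield structure
    else:
        raise ValueError("LOW > HIGH pair that MUST NOT appear")

    # LEN = 127 codes 128 note logs only for the (LOW = 15, HIGH = 0) pair.
    if length == 127 and low == 15 and high == 0:
        note_log_count = 128
    else:
        note_log_count = length
    return b_bit, note_log_count, offbits_count, low


def decode_offbits(offbits, low):
    """List the note numbers whose bits are set in the OFFBITS octets."""
    notes = []
    for index, octet in enumerate(offbits):
        base = 8 * (low + index)    # note number coded by this octet's MSB
        for bit in range(8):
            if octet & (0x80 >> bit):
                notes.append(base + bit)
    return notes
```

For example, a single OFFBITS octet of 0x08 with LOW = 7 codes a NoteOff for note number 60.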
The most-significant bit of the first OFFBITS octet codes the note number 8*LOW; the most-significant bit of the last OFFBITS octet codes the note number 8*HIGH.

A set bit codes a NoteOff command for the note number. In the most efficient coding for the NoteOff bitfield structure, the first and last octets of the structure contain at least one set bit. Note that Chapter N does not code NoteOff velocity data.

Note that in the general case, the recovery journal does not code the relative placement of a NoteOff command and a Control Change command for controller 64 (Damper Pedal (Sustain)). In many cases, a receiver processing a loss event may deduce this relative placement from the history of the stream and thus determine if a NoteOff note is sustained by the pedal. If such a determination is not possible, receivers SHOULD err on the side of silencing pedal sustains, as erroneously sustained notes may produce unpleasant (albeit transient) artifacts.

A.7. Chapter E: MIDI Note Command Extras

Readers may wish to review the Appendix A.1 definition of "N-active commands" before reading this appendix. In this appendix, a NoteOn command with a velocity of 0 is considered to be a NoteOff command with a release velocity value of 64.

Chapter E encodes recovery information about MIDI NoteOn (0x9) and NoteOff (0x8) command features that rarely appear in MIDI streams.
Receivers use Chapter E to reduce transient artifacts for streams where several NoteOn commands appear for a note number without an intervening NoteOff. Receivers also use Chapter E to reduce transient artifacts for streams that use NoteOff release velocity. Chapter E supplements the note information coded in Chapter N (Appendix A.6).

Figure A.7.1 shows the format for Chapter E.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|     LEN     |S|   NOTENUM   |V|  COUNT/VEL  |S|   NOTENUM   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |V|  COUNT/VEL  |                     ....                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.7.1 -- Chapter E format

The chapter consists of a 1-octet header, followed by a variable-length list of 2-octet note logs. Appendix A.7.1 defines the bitfield format for a note log.

The log list MUST contain at least one note log. The 7-bit LEN header field codes the number of note logs in the list, minus one. A channel journal MUST contain Chapter E if the rules defined in this appendix require that one or more note logs appear in the list. The note log list MUST obey the oldest-first ordering rule (defined in Appendix A.1).

A.7.1. Note Log Format

Figure A.7.2 reproduces the note log structure of Chapter E.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |V|  COUNT/VEL  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.7.2 -- Chapter E note log

A note log codes information about the MIDI note number coded by the 7-bit NOTENUM field. The nature of the information depends on the value of the V flag bit.

If the V bit is set to 1, the COUNT/VEL field codes the release velocity value for the most recent N-active NoteOff command for the note number that appears in the session history.

If the V bit is set to 0, the COUNT/VEL field codes a reference count of the number of NoteOn and NoteOff commands for the note number that appear in the session history.

The reference count is set to 0 at the start of the session. NoteOn commands increment the count by 1. NoteOff commands decrement the count by 1. However, a decrement that generates a negative count value is not performed.

If the reference count is in the range 0-126, the 7-bit COUNT/VEL field codes an unsigned integer representation of the count. If the count is greater than or equal to 127, COUNT/VEL is set to 127.

By default, the count is reset to 0 whenever a Reset State command (Appendix A.1) appears in the session history, and whenever MIDI Control Change commands for controller numbers 123-127 (numbers with All Notes Off semantics) or 120 (All Sound Off) appear in the session history.

A.7.2. Log Inclusion Rules

If the most recent N-active NoteOn or NoteOff command for a note number in the checkpoint history is a NoteOff command with a release velocity value other than 64, a note log whose V bit is set to 1 MUST appear in Chapter E for the note number.

If the most recent N-active NoteOn or NoteOff command for a note number in the checkpoint history is a NoteOff command, and if the reference count for the note number is greater than 0, a note log whose V bit is set to 0 MUST appear in Chapter E for the note number.

If the most recent N-active NoteOn or NoteOff command for a note number in the checkpoint history is a NoteOn command, and if the reference count for the note number is greater than 1, a note log whose V bit is set to 0 MUST appear in Chapter E for the note number.

At most, two note logs MAY appear in Chapter E for a note number: one log whose V bit is set to 0, and one log whose V bit is set to 1.

Chapter E codes a maximum of 128 note logs. If the log inclusion rules yield more than 128 REQUIRED logs, note logs whose V bit is set to 1 MUST be dropped from Chapter E in order to reach the 128-log limit. Note logs whose V bit is set to 0 MUST NOT be dropped.
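The reference count and the three inclusion rules above may be easier to follow in code form. The sketch below is illustrative only and uses hypothetical function names. It omits the count resets triggered by Reset State commands and by Control Change commands for controllers 120 and 123-127, and it leaves the distinction between session history and checkpoint history to the caller.

```python
def count_vel_reference(commands):
    """Compute the COUNT/VEL value (V = 0) for one note number.

    commands: sequence of "on"/"off" strings, oldest first.  Decrements
    that would make the count negative are skipped, and the coded value
    saturates at 127.
    """
    count = 0
    for kind in commands:
        if kind == "on":
            count += 1
        elif count > 0:
            count -= 1
    return min(count, 127)


def required_logs(last_kind, reference_count, release_velocity=64):
    """Return the V-bit values of the note logs REQUIRED for a note number.

    last_kind is the kind ("on" or "off") of the most recent N-active
    note command for the note number in the checkpoint history.
    """
    v_bits = []
    if last_kind == "off":
        if release_velocity != 64:
            v_bits.append(1)        # release velocity log
        if reference_count > 0:
            v_bits.append(0)        # reference count log
    elif last_kind == "on" and reference_count > 1:
        v_bits.append(0)
    return v_bits
```

A stream that strictly alternates NoteOn and NoteOff with the default release velocity yields an empty list for every note number, which matches the observation that most streams never require Chapter E.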
Most MIDI streams do not use NoteOn and NoteOff commands in ways that would trigger the log inclusion rules. For these streams, Chapter E would never be REQUIRED to appear in a channel journal.

The ch_never parameter (Appendix C.2.3) may be used to configure the log inclusion rules for Chapter E.

A.8. Chapter T: MIDI Channel Aftertouch

A channel journal MUST contain Chapter T if an N-active and C-active MIDI Channel Aftertouch (0xD) command appears in the checkpoint history. Figure A.8.1 shows the format for Chapter T.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|  PRESSURE   |
    +-+-+-+-+-+-+-+-+

    Figure A.8.1 -- Chapter T format

The chapter has a fixed size of 8 bits. The 7-bit PRESSURE field holds the pressure value of the most recent N-active and C-active Channel Aftertouch command in the session history.

Chapter T only encodes commands that are C-active and N-active. We define a C-active restriction because [RP015] declares that a Control Change command for controller 121 (Reset All Controllers) acts to reset the channel pressure to 0 (see the discussion at the end of Appendix A.5 for a more complete rationale).

We define an N-active restriction on the assumption that aftertouch commands are linked to note activity, and thus Channel Aftertouch commands that are not N-active are stale and should not be used to repair a stream.

A.9. Chapter A: MIDI Poly Aftertouch

A channel journal MUST contain Chapter A if a C-active Poly Aftertouch (0xA) command appears in the checkpoint history. Figure A.9.1 shows the format for Chapter A.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|     LEN     |S|   NOTENUM   |X|  PRESSURE   |S|   NOTENUM   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |X|  PRESSURE   |                     ....                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.9.1 -- Chapter A format

The chapter consists of a 1-octet header, followed by a variable-length list of 2-octet note logs. A note log MUST appear for a note number if a C-active Poly Aftertouch command for the note number appears in the checkpoint history.

A note number MUST NOT be represented by multiple note logs in the note list.
The note log list MUST obey the oldest-first ordering rule (defined in Appendix A.1). The 7-bit LEN field codes the number of note logs in the list, minus one. Figure A.9.2 reproduces the note log structure of Chapter A.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |X|  PRESSURE   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure A.9.2 -- Chapter A note log

The 7-bit PRESSURE field codes the pressure value of the most recent C-active Poly Aftertouch command in the session history for the MIDI note number coded in the 7-bit NOTENUM field.

As a rule, the X bit MUST be set to 0. However, the X bit MUST be set to 1 if the command coded by the log appears before one of the following commands in the session history: MIDI Control Change numbers 123-127 (numbers with All Notes Off semantics) or 120 (All Sound Off).

We define C-active restrictions for Chapter A because [RP015] declares that a Control Change command for controller 121 (Reset All Controllers) acts to reset the polyphonic pressure to 0 (see the discussion at the end of Appendix A.5 for a more complete rationale).

B. The Recovery Journal System Chapters

B.1. System Chapter D: Simple System Commands

The system journal MUST contain Chapter D if an active MIDI Reset (0xFF), MIDI Tune Request (0xF6), MIDI Song Select (0xF3), undefined MIDI System Common (0xF4 and 0xF5), or undefined MIDI System Real-time (0xF9 and 0xFD) command appears in the checkpoint history.

Figure B.1.1 shows the variable-length format for Chapter D.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|B|G|H|J|K|Y|Z|     Command logs ...                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.1.1 -- System Chapter D format

The chapter consists of a 1-octet header, followed by one or more command logs.
Header flag bits indicate the presence of command logs for the Reset (B = 1), Tune Request (G = 1), Song Select (H = 1), undefined System Common 0xF4 (J = 1), undefined System Common 0xF5 (K = 1), undefined System Real-time 0xF9 (Y = 1), or undefined System Real-time 0xFD (Z = 1) commands.

Command logs appear in a list following the header, in the order that the flag bits appear in the header.

Figure B.1.2 shows the 1-octet command log format for the Reset and Tune Request commands.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|    COUNT    |
    +-+-+-+-+-+-+-+-+

    Figure B.1.2 -- Command log for Reset and Tune Request

Chapter D MUST contain the Reset command log if an active Reset command appears in the checkpoint history. The 7-bit COUNT field codes the total number of Reset commands (modulo 128) present in the session history.

Chapter D MUST contain the Tune Request command log if an active Tune Request command appears in the checkpoint history. The 7-bit COUNT field codes the total number of Tune Request commands (modulo 128) present in the session history.

For these commands, the COUNT field acts as a reference count. See the definition of "session history reference counts" in Appendix A.1 for more information.

Figure B.1.3 shows the 1-octet command log format for the Song Select command.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|    VALUE    |
    +-+-+-+-+-+-+-+-+

    Figure B.1.3 -- Song Select command log format

Chapter D MUST contain the Song Select command log if an active Song Select command appears in the checkpoint history.
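As an illustration of the header rule described above, the following Python sketch maps a Chapter D header octet to the ordered list of command logs that follow it, and splits a 1-octet Reset or Tune Request log into its S bit and COUNT value. The bit positions follow Figure B.1.1; the names chapter_d_log_order, parse_count_log, and CHAPTER_D_FLAGS are hypothetical and are not defined by this memo.

```python
from typing import List, Tuple

# Header flags in Figure B.1.1 order: bit 0 is S, followed by B G H J K Y Z.
# The strings are illustrative labels, not identifiers from the memo.
CHAPTER_D_FLAGS = [
    (0x40, "Reset"),                            # B
    (0x20, "Tune Request"),                     # G
    (0x10, "Song Select"),                      # H
    (0x08, "undefined System Common 0xF4"),     # J
    (0x04, "undefined System Common 0xF5"),     # K
    (0x02, "undefined System Real-time 0xF9"),  # Y
    (0x01, "undefined System Real-time 0xFD"),  # Z
]


def chapter_d_log_order(header: int) -> List[str]:
    """Return the command logs signalled by a Chapter D header octet,
    in the order in which they follow the header."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must be a single octet")
    return [name for mask, name in CHAPTER_D_FLAGS if header & mask]


def parse_count_log(octet: int) -> Tuple[bool, int]:
    """Split a 1-octet Reset or Tune Request log into (S bit, 7-bit COUNT)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("log must be a single octet")
    return bool(octet & 0x80), octet & 0x7F
```

Because the logs are listed in flag order, a receiver can walk the header bits once and consume each log in turn without any per-log type tag.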
The 7-bit VALUE field codes the song number of the most recent active Song Select command in the session history.

B.1.1. Undefined System Commands

In this section, we define the Chapter D command logs for the undefined System commands. [MIDI] reserves the undefined System commands 0xF4, 0xF5, 0xF9, and 0xFD for future use. At the time of this writing, any MIDI command stream that uses these commands is non-compliant with [MIDI]. However, future versions of [MIDI] may define these commands, and a few products do use these commands in a non-compliant manner.

Figure B.1.4 shows the variable-length command log format for the undefined System Common commands (0xF4 and 0xF5).

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|C|V|L|DSZ|      LENGTH       |     COUNT     |  VALUE ...    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  LEGAL ...                                                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.1.4 -- Undefined System Common command log format

The command log codes a single command type (0xF4 or 0xF5, not both). Chapter D MUST contain a command log if an active 0xF4 command appears in the checkpoint history and MUST contain an independent command log if an active 0xF5 command appears in the checkpoint history.

The command log consists of a two-octet header followed by a variable number of data fields. Header flag bits indicate the presence of the COUNT field (C = 1), the VALUE field (V = 1), and the LEGAL field (L = 1). The 10-bit LENGTH field codes the size of the command log and conforms to semantics described in Appendix A.1.

The 2-bit DSZ field codes the number of data octets in the command instance that appears most recently in the session history. If DSZ = 0-2, the command has 0-2 data octets. If DSZ = 3, the command has 3 or more command data octets.

We now define the default rules for the use of the COUNT, VALUE, and LEGAL fields.
The session configuration tools defined in Appendix C.2.3 may be used to override this behavior.

By default, if the DSZ field is set to 0, the command log MUST include the COUNT field. The 8-bit COUNT field codes the total number of commands of the type coded by the log (0xF4 or 0xF5) present in the session history, modulo 256.

By default, if the DSZ field is set to 1-3, the command log MUST include the VALUE field. The variable-length VALUE field codes a verbatim copy of the data octets for the most recent use of the command type coded by the log (0xF4 or 0xF5) in the session history. The most-significant bit of the final data octet MUST be set to 1, and the most-significant bit of all other data octets MUST be set to 0.

The LEGAL field is reserved for future use. If an update to [MIDI] defines the 0xF4 or 0xF5 command, an IETF standards-track document may define the LEGAL field. Until such a document appears, senders MUST NOT use the LEGAL field, and receivers MUST use the LENGTH field to skip over the LEGAL field. The LEGAL field would be defined by the IETF if the semantics of the new 0xF4 or 0xF5 command could not be protected from packet loss via the use of the COUNT and VALUE fields.

Figure B.1.5 shows the variable-length command log format for the undefined System Real-time commands (0xF9 and 0xFD).

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|C|L| LENGTH  |     COUNT     |  LEGAL ...                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.1.5 -- Undefined System Real-time command log format

The command log codes a single command type (0xF9 or 0xFD, not both). Chapter D MUST contain a command log if an active 0xF9 command appears in the checkpoint history and MUST contain an independent command log if an active 0xFD command appears in the checkpoint history.

The command log consists of a one-octet header followed by a variable number of data fields. Header flag bits indicate the presence of the COUNT field (C = 1) and the LEGAL field (L = 1). The 5-bit LENGTH field codes the size of the command log and conforms to semantics described in Appendix A.1.

We now define the default rules for the use of the COUNT and LEGAL fields. The session configuration tools defined in Appendix C.2.3 may be used to override this behavior.

The 8-bit COUNT field codes the total number of commands of the type coded by the log present in the session history, modulo 256. By default, the COUNT field MUST be present in the command log.

The LEGAL field is reserved for future use. If an update to [MIDI] defines the 0xF9 or 0xFD command, an IETF standards-track document may define the LEGAL field to protect the command. Until such a document appears, senders MUST NOT use the LEGAL field, and receivers MUST use the LENGTH field to skip over the LEGAL field.
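The skip-over rule for the LEGAL field can be sketched as follows for the undefined System Real-time command log (Figure B.1.5). This is a minimal illustration, not a normative parser. It assumes that LENGTH counts every octet of the command log, including the header octet; that assumption stands in for the Appendix A.1 semantics, which are not reproduced in this section. The name parse_realtime_log is hypothetical.

```python
from typing import Optional, Tuple


def parse_realtime_log(buf: bytes, offset: int = 0) -> Tuple[dict, int]:
    """Parse one undefined System Real-time command log (Figure B.1.5).

    Returns (fields, next_offset). ASSUMPTION: LENGTH is the size of the
    whole log in octets, header included (see Appendix A.1 for the
    authoritative definition).
    """
    if offset >= len(buf):
        raise ValueError("no header octet available")
    header = buf[offset]
    s_bit = bool(header & 0x80)
    c_bit = bool(header & 0x40)
    l_bit = bool(header & 0x20)
    length = header & 0x1F          # 5-bit LENGTH field

    end = offset + length
    if length < 1 or end > len(buf):
        raise ValueError("LENGTH is inconsistent with the buffer")

    pos = offset + 1
    count: Optional[int] = None
    if c_bit:
        if pos >= end:
            raise ValueError("C = 1 but no room for COUNT")
        count = buf[pos]            # 8-bit count, modulo 256
        pos += 1

    # Any remaining octets belong to the reserved LEGAL field. A receiver
    # does not interpret them; it jumps to the end of the log via LENGTH.
    fields = {"S": s_bit, "C": c_bit, "L": l_bit,
              "LENGTH": length, "COUNT": count}
    return fields, end
```

Advancing by LENGTH rather than by the sum of the known fields is what allows a receiver built today to step over a LEGAL field defined by some later document.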
The LEGAL field would be defined by the IETF if the semantics of the new 0xF9 or 0xFD command could not be protected from packet loss via the use of the COUNT field.

Finally, we note that some non-standard uses of the undefined System Real-time commands act to implement non-compliant variants of the MIDI sequencer system. In Appendix B.3.1, we describe resiliency tools for the MIDI sequencer system that provide some protection in this case.

B.2. System Chapter V: Active Sense Command

The system journal MUST contain Chapter V if an active MIDI Active Sense (0xFE) command appears in the checkpoint history. Figure B.2.1 shows the format for Chapter V.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|    COUNT    |
    +-+-+-+-+-+-+-+-+

    Figure B.2.1 -- System Chapter V format

The 7-bit COUNT field codes the total number of Active Sense commands (modulo 128) present in the session history. The COUNT field acts as a reference count. See the definition of "session history reference counts" in Appendix A.1 for more information.

B.3. System Chapter Q: Sequencer State Commands

This appendix describes Chapter Q, the system chapter for the MIDI sequencer commands.
The system journal MUST contain Chapter Q if an active MIDI Song Position Pointer (0xF2), MIDI Clock (0xF8), MIDI Start (0xFA), MIDI Continue (0xFB), or MIDI Stop (0xFC) command appears in the checkpoint history, and if the rules defined in this appendix require a change in the Chapter Q bitfield contents because of the command appearance.

Figure B.3.1 shows the variable-length format for Chapter Q.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|N|D|C|T| TOP |            CLOCK              | TIMETOOLS ... |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |              ...              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.3.1 -- System Chapter Q format

Chapter Q consists of a 1-octet header followed by several optional fields, in the order shown in Figure B.3.1.
Header flag bits signal the presence of the 16-bit CLOCK field (C = 1) and the 24-bit TIMETOOLS field (T = 1). The 3-bit TOP header field is interpreted as an unsigned integer, as are CLOCK and TIMETOOLS. We describe the TIMETOOLS field in Appendix B.3.1.

Chapter Q encodes the most recent state of the sequencer system. Receivers use the chapter to re-synchronize the sequencer after a packet loss episode. Chapter fields encode the on/off state of the sequencer, the current position in the song, and the downbeat.

The N header bit encodes the relative occurrence of the Start, Stop, and Continue commands in the session history. If an active Start or Continue command appears most recently, the N bit MUST be set to 1. If an active Stop appears most recently, or if no active Start, Stop, or Continue commands appear in the session history, the N bit MUST be set to 0.

The C header flag, the TOP header field, and the CLOCK field act to code the current position in the sequence:

o If C = 1, the 3-bit TOP header field and the 16-bit CLOCK field are combined to form the 19-bit unsigned quantity 65536*TOP + CLOCK. This value encodes the song position in units of MIDI Clocks (24 clocks per quarter note), modulo 524288. Note that the maximum song position value that may be coded by the Song Position Pointer command is 98303 clocks (which may be coded with 17 bits), and that MIDI-coded songs are generally constructed to avoid durations longer than this value. However, the 19-bit size may be useful for real-time applications, such as a drum machine MIDI output that is sending clock commands for long periods of time.

o If C = 0, the song position is the start of the song. The C = 0 position is identical to the position coded by C = 1, TOP = 0, and CLOCK = 0, for the case where the song position is less than 524288 MIDI clocks. In certain situations (defined later in this section), normative text may require the C = 0 or the C = 1, TOP = 0, CLOCK = 0 encoding of the start of the song.
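The song position arithmetic above can be sketched in a few lines of Python. The example unpacks a Chapter Q bitfield laid out as in Figure B.3.1 and computes 65536*TOP + CLOCK. It assumes that CLOCK, like TIME, is coded in network byte order; the function names are hypothetical, and the D bit is returned without interpretation.

```python
def song_position(c: bool, top: int, clock: int = 0) -> int:
    """Song position in MIDI clocks coded by the C, TOP, and CLOCK fields."""
    if not c:
        return 0                      # C = 0 codes the start of the song
    if not (0 <= top <= 7 and 0 <= clock <= 0xFFFF):
        raise ValueError("TOP is 3 bits and CLOCK is 16 bits")
    return 65536 * top + clock        # 19-bit quantity, modulo 524288


def encode_song_position(position_clocks: int):
    """Split a clock count into (TOP, CLOCK), wrapping modulo 524288."""
    wrapped = position_clocks % 524288
    return wrapped >> 16, wrapped & 0xFFFF


def parse_chapter_q(buf: bytes) -> dict:
    """Unpack Chapter Q (Figure B.3.1).

    ASSUMPTION: the 16-bit CLOCK field is big-endian, matching the byte
    order that the memo states explicitly for the TIME field.
    """
    if len(buf) < 1:
        raise ValueError("Chapter Q needs at least one octet")
    header = buf[0]
    out = {
        "S": bool(header & 0x80),
        "N": bool(header & 0x40),
        "D": bool(header & 0x20),
        "C": bool(header & 0x10),
        "T": bool(header & 0x08),
        "TOP": header & 0x07,
        "CLOCK": 0,
        "TIMETOOLS": None,
    }
    pos = 1
    if out["C"]:
        if len(buf) < pos + 2:
            raise ValueError("C = 1 but CLOCK is truncated")
        out["CLOCK"] = int.from_bytes(buf[pos:pos + 2], "big")
        pos += 2
    if out["T"]:
        if len(buf) < pos + 3:
            raise ValueError("T = 1 but TIMETOOLS is truncated")
        out["TIMETOOLS"] = int.from_bytes(buf[pos:pos + 3], "big")
        pos += 3
    out["position"] = song_position(out["C"], out["TOP"], out["CLOCK"])
    out["size"] = pos
    return out
```

With TOP = 1 and CLOCK = 32767 the decoded position is 98303 clocks, the Song Position Pointer maximum cited above, which shows why 17 bits suffice for that command.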
The C, TOP, and CLOCK fields MUST be set to code the current song position, for both N = 0 and N = 1 conditions. If C = 0, the TOP field MUST be set to 0. See [MIDI] for a precise definition of a song position.

The D header bit encodes information about the downbeat and acts to qualify the song position coded by the C, TOP, and CLOCK fields.

If the D bit is set to 1, the song position represents the most recent position in the sequence that has played. If D = 1, the next Clock command (if N = 1) or the next (Continue, Clock) pair (if N = 0) acts to increment the song position by one clock, and to play the updated position.

If the D bit is set to 0, the song position represents a position in the sequence that has not yet been played. If D = 0, the next Clock command (if N = 1) or the next (Continue, Clock) pair (if N = 0) acts to play the point in the song coded by the song position. The song position is not incremented.
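The effect of the D bit on the next Clock command can be expressed as a small state-update function. The sketch below is illustrative only. The function name next_clock is hypothetical, and the returned D value of True is an inference from the definitions above (once a position has played, it is the most recent position played), not a rule quoted from this memo.

```python
from typing import Tuple

POSITION_MODULUS = 524288  # song positions are coded modulo 2**19 clocks


def next_clock(position: int, d_bit: bool) -> Tuple[int, int, bool]:
    """Apply one Clock command (N = 1) to a decoded Chapter Q state.

    Returns (played_position, new_position, new_d_bit).
    """
    if d_bit:
        # D = 1: the coded position has already played, so the Clock
        # increments the position and plays the updated position.
        new_position = (position + 1) % POSITION_MODULUS
    else:
        # D = 0: the coded position has not yet played, so the Clock
        # plays it and the position is not incremented.
        new_position = position
    # Inferred: the position just played is now the most recent one played.
    return new_position, new_position, True
```

For the N = 0 case, the same update applies to the Clock of the next (Continue, Clock) pair.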
An example of a stream that uses D = 0 coding is one whose most recent sequence command is a Start or Song Position Pointer command (both N = 1 conditions). However, it is also possible to construct examples where D = 0 and N = 0. A Start command immediately followed by a Stop command is coded in Chapter Q by setting C = 0, D = 0, N = 0, TOP = 0.

If N = 1 (coding Start or Continue), D = 0 (coding that the downbeat has yet to be played), and the song position is at the start of the song, the C = 0 song position encoding MUST be used if a Start command occurs more recently than a Continue command in the session history, and the C = 1, TOP = 0, CLOCK = 0 song position encoding MUST be used if a Continue command occurs more recently than a Start command in the session history.

B.3.1. Non-compliant Sequencers

The Chapter Q description in this appendix assumes that the sequencer system counts off time with Clock commands, as mandated in [MIDI]. However, a few non-compliant products do not use Clock commands to count off time, but instead use non-standard methods.

Chapter Q uses the TIMETOOLS field to provide resiliency support for these non-standard products. By default, the TIMETOOLS field MUST NOT appear in Chapter Q, and the T header bit MUST be set to 0.
The session configuration tools described in Appendix C.2.3 may be used to select TIMETOOLS coding.

Figure B.3.2 shows the format of the 24-bit TIMETOOLS field.

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     TIME                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.3.2 -- TIMETOOLS format

The TIME field is a 24-bit unsigned integer quantity, with units of milliseconds. TIME codes an additive correction term for the song position coded by the TOP, CLOCK, and C fields. TIME is coded in network byte order (big-endian).

A receiver computes the correct song position by converting TIME into units of MIDI clocks and adding it to 65536*TOP + CLOCK (assuming C = 1). Alternatively, a receiver may convert 65536*TOP + CLOCK into milliseconds (assuming C = 1) and add it to TIME. The downbeat (D header bit) semantics defined in Appendix B.3 apply to the corrected song position.

B.4. System Chapter F: MIDI Time Code Tape Position

This appendix describes Chapter F, the system chapter for the MIDI Time Code (MTC) commands. Readers may wish to review the Appendix A.1 definition of "finished/unfinished commands" before reading this appendix.
The system journal MUST contain Chapter F if an active System Common Quarter Frame command (0xF1) or an active finished System Exclusive (Universal Real Time) MTC Full Frame command (F0 7F cc 01 01 hr mn sc fr F7) appears in the checkpoint history. Otherwise, the system journal MUST NOT contain Chapter F.

Figure B.4.1 shows the variable-length format for Chapter F.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|C|P|Q|D|POINT|  COMPLETE ...                                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     ...       |  PARTIAL  ...                                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     ...       |
    +-+-+-+-+-+-+-+-+

    Figure B.4.1 -- System Chapter F format

Chapter F holds information about recent MTC tape positions coded in the session history. Receivers use Chapter F to re-synchronize the MTC system after a packet loss episode.

Chapter F consists of a 1-octet header followed by several optional fields, in the order shown in Figure B.4.1. The C and P header bits form a Table of Contents (TOC) and signal the presence of the 32-bit COMPLETE field (C = 1) and the 32-bit PARTIAL field (P = 1).

The Q header bit codes information about the COMPLETE field format.
If Chapter F does not contain a COMPLETE field, Q MUST be set to 0.

The D header bit codes the tape movement direction. If the tape is moving forward, or if the tape direction is indeterminate, the D bit MUST be set to 0. If the tape is moving in the reverse direction, the D bit MUST be set to 1. In most cases, the ordering of commands in the session history clearly defines the tape direction. However, a few command sequences have an indeterminate direction (such as a session history consisting of one Full Frame command).

The 3-bit POINT header field is interpreted as an unsigned integer.
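As a rough sketch of the Table of Contents described above, the following Python function unpacks the Chapter F header octet of Figure B.4.1 and derives the chapter size from the C and P bits. The function name and the returned dictionary keys are hypothetical. The only normative rule it checks is the one stated above for the Q bit.

```python
def parse_chapter_f_header(header: int) -> dict:
    """Unpack the 1-octet Chapter F header (Figure B.4.1)."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must be a single octet")
    fields = {
        "S": bool(header & 0x80),
        "C": bool(header & 0x40),   # 32-bit COMPLETE field present
        "P": bool(header & 0x20),   # 32-bit PARTIAL field present
        "Q": bool(header & 0x10),   # COMPLETE field format selector
        "D": bool(header & 0x08),   # 1 = tape moving in reverse
        "POINT": header & 0x07,
    }
    if fields["Q"] and not fields["C"]:
        raise ValueError("Q MUST be 0 when COMPLETE is absent")
    # One header octet plus four octets for each optional field present.
    fields["size"] = 1 + 4 * fields["C"] + 4 * fields["P"]
    return fields
```

A receiver can use the computed size to locate the chapter that follows Chapter F in the system journal before it interprets COMPLETE or PARTIAL.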
Appendix B.4.1 defines how the POINT field codes information about the contents of the PARTIAL field. If Chapter F does not contain a PARTIAL field, POINT MUST be set to 7 (if D = 0) or 0 (if D = 1).

Chapter F MUST include the COMPLETE field if an active finished Full Frame command appears in the checkpoint history, or if an active Quarter Frame command that completes the encoding of a frame value appears in the checkpoint history.

The COMPLETE field encodes the most recent active complete MTC frame value that appears in the session history. This frame value may take the form of a series of 8 active Quarter Frame commands (0xF1 0x0n through 0xF1 0x7n for forward tape movement, 0xF1 0x7n through 0xF1 0x0n for reverse tape movement) or may take the form of an active finished Full Frame command.

If the COMPLETE field encodes a Quarter Frame command series, the Q header bit MUST be set to 1, and the COMPLETE field MUST have the format shown in Figure B.4.2. The 4-bit fields MT0 through MT7 code the data (lower) nibble for the Quarter Frame commands for Message Type 0 through Message Type 7 [MIDI]. These nibbles encode a complete frame value, in addition to fields reserved for future use by [MIDI].

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  MT0  |  MT1  |  MT2  |  MT3  |  MT4  |  MT5  |  MT6  |  MT7  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure B.4.2 -- COMPLETE field format, Q = 1

In this usage, the frame value encoded in the COMPLETE field MUST be offset by 2 frames (relative to the frame value encoded in the Quarter Frame commands) if the frame value codes a 0xF1 0x0n through 0xF1 0x7n command sequence. This offset compensates for the two-frame latency of the Quarter Frame encoding for forward tape movement. No offset is applied if the frame value codes a 0xF1 0x7n through 0xF1 0x0n Quarter Frame command sequence.

The most recent active complete MTC frame value may alternatively be encoded by an active finished Full Frame command.
In this case, the Q header bit MUST be set to 0, and the COMPLETE field MUST have the format shown in Figure B.4.3. The HR, MN, SC, and FR fields correspond to the hr, mn, sc, and fr data octets of the Full Frame command.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      HR       |      MN       |      SC       |      FR       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure B.4.3 -- COMPLETE field format, Q = 0

B.4.1. Partial Frames

The most recent active session history command that encodes MTC frame value data may be a Quarter Frame command other than a forward-moving 0xF1 0x7n command (which completes a frame value for forward tape movement) or a reverse-moving 0xF1 0x1n command (which completes a frame value for reverse tape movement).

We consider this type of Quarter Frame command to be associated with a partial frame value.
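As an informal illustration (not part of the normative text), the two COMPLETE field layouts of Figures B.4.2 and B.4.3 can be sketched in Python. The function names are hypothetical. The sketch assumes the usual bitfield convention in which bit 0 of a figure is the most-significant bit of the first octet. It does not apply the 2-frame offset required for forward Quarter Frame sequences; callers are assumed to have adjusted the frame value beforehand.

```python
def pack_complete_quarter_frames(nibbles):
    """Pack MT0..MT7 data nibbles into the 4-octet Q = 1 layout (Figure B.4.2)."""
    if len(nibbles) != 8 or any(not 0 <= n <= 0x0F for n in nibbles):
        raise ValueError("expected eight 4-bit values (MT0 through MT7)")
    # Each octet carries two nibbles; the lower-numbered MT field is the high nibble.
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, 8, 2))


def unpack_complete_quarter_frames(field):
    """Recover the MT0..MT7 nibbles from a 4-octet Q = 1 COMPLETE field."""
    if len(field) != 4:
        raise ValueError("COMPLETE field is 4 octets long")
    nibbles = []
    for octet in field:
        nibbles.append(octet >> 4)
        nibbles.append(octet & 0x0F)
    return nibbles


def pack_complete_full_frame(hr, mn, sc, fr):
    """Pack Full Frame data octets into the 4-octet Q = 0 layout (Figure B.4.3)."""
    octets = (hr, mn, sc, fr)
    if any(not 0 <= o <= 0x7F for o in octets):
        raise ValueError("MIDI data octets are 7-bit values")
    return bytes(octets)
```

A receiver would select between the two layouts by inspecting the Q header bit before interpreting the four octets.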
The Quarter Frame sequence that defines a partial frame value MUST either start at Message Type 0 and increment contiguously to an intermediate Message Type less than 7, or start at Message Type 7 and decrement contiguously to an intermediate Message Type greater than 0. A Quarter Frame command sequence that does not follow this pattern is not associated with a partial frame value.

Chapter F MUST include a PARTIAL field if the most recent active command in the checkpoint history that encodes MTC frame value data is a Quarter Frame command that is associated with a partial frame value. Otherwise, Chapter F MUST NOT include a PARTIAL field.

The partial frame value consists of the data (lower) nibbles of the Quarter Frame command sequence. The PARTIAL field codes the partial frame value, using the format shown in Figure B.4.2. Message Type fields that are not associated with a Quarter Frame command MUST be set to 0.

The POINT header field indicates which Message Type fields in the PARTIAL field code valid data. If P = 1, the POINT field MUST encode the unsigned integer value formed by the lower 3 bits of the upper nibble of the data value of the most recent active Quarter Frame command in the session history. If D = 0 and P = 1, POINT MUST take on a value in the range 0-6. If D = 1 and P = 1, POINT MUST take on a value in the range 1-7.

If D = 0, MT fields (Figure B.4.2) in the inclusive range from 0 up to and including the POINT value encode the partial frame value. If D = 1, MT fields in the inclusive range from 7 down to and including the POINT value encode the partial frame value. Note that, unlike the COMPLETE field encoding, senders MUST NOT add a 2-frame offset to the partial frame value encoded in PARTIAL.

For the default semantics, if a recovery journal contains Chapter F, and if the session history codes a legal [MIDI] series of Quarter Frame and Full Frame commands, the chapter always contains a COMPLETE or a PARTIAL field (and may contain both fields). Thus, a one-octet Chapter F (C = P = 0) always codes the presence of an illegal command sequence in the session history (under some conditions, the C = 1, P = 0 condition may also code the presence of an illegal command sequence).
The illegal command sequence conditions are transient in nature and usually indicate that a Quarter Frame command sequence began with an intermediate Message Type.

B.5. System Chapter X: System Exclusive

This appendix describes Chapter X, the system chapter for MIDI System Exclusive (SysEx) commands (0xF0). Readers may wish to review the Appendix A.1 definition of "finished/unfinished commands" before reading this appendix.

Chapter X consists of a list of one or more command logs. Each log in the list codes information about a specific finished or unfinished SysEx command that appears in the session history. The system journal MUST contain Chapter X if the rules defined in Appendix B.5.2 require that one or more logs appear in the list.

The log list is not preceded by a header. Instead, each log implicitly encodes its own length. Given the length of the N'th list log, the presence of the (N+1)'th list log may be inferred from the LENGTH field of the system journal header (Figure 10 in Section 5 of the main text). The log list MUST obey the oldest-first ordering rule (defined in Appendix A.1).

B.5.1. Chapter Format

Figure B.5.1 shows the bitfield format for the Chapter X command log.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|T|C|F|D|L|STA|    TCOUNT     |     COUNT     |  FIRST ...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  DATA ...                                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure B.5.1 -- Chapter X command log format

A Chapter X command log consists of a 1-octet header, followed by the optional TCOUNT, COUNT, FIRST, and DATA fields.

The T, C, F, and D header bits act as a Table of Contents (TOC) for the log. If T is set to 1, the 1-octet TCOUNT field appears in the log. If C is set to 1, the 1-octet COUNT field appears in the log. If F is set to 1, the variable-length FIRST field appears in the log. If D is set to 1, the variable-length DATA field appears in the log.

The L header bit sets the coding tool for the log.
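As an informal illustration (not part of the normative text), the header octet of Figure B.5.1 can be split into its bit fields as sketched below. The LogHeader type and the parse_log_header function are hypothetical names. The sketch assumes that bit 0 of the figure is the most-significant bit of the octet, so that S occupies the top bit and the 2-bit STA field occupies the two lowest bits.

```python
from typing import NamedTuple


class LogHeader(NamedTuple):
    """Fields of the 1-octet Chapter X command log header (Figure B.5.1)."""
    s: int
    t: int    # 1 if the TCOUNT field is present
    c: int    # 1 if the COUNT field is present
    f: int    # 1 if the FIRST field is present
    d: int    # 1 if the DATA field is present
    l: int    # coding tool selector
    sta: int  # 2-bit status value


def parse_log_header(octet):
    """Split a header octet into its S, T, C, F, D, L, and STA fields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("header must be a single octet")
    return LogHeader(
        s=(octet >> 7) & 1,
        t=(octet >> 6) & 1,
        c=(octet >> 5) & 1,
        f=(octet >> 4) & 1,
        d=(octet >> 3) & 1,
        l=(octet >> 2) & 1,
        sta=octet & 0x03,
    )
```

A parser would use the t, c, f, and d values to decide which optional fields to read after the header octet.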
We define the log coding tools in Appendix B.5.2.

The STA field codes the status of the command coded by the log. The 2-bit STA value is interpreted as an unsigned integer. If STA is 0, the log codes an unfinished command. Non-zero STA values code different classes of finished commands. An STA value of 1 codes a cancelled command, an STA value of 2 codes a command that uses the "dropped F7" construction, and an STA value of 3 codes all other finished commands. Section 3.2 in the main text describes cancelled and "dropped F7" commands.

The S bit (Appendix A.1) of the first log in the list acts as the S bit for Chapter X. For the other logs in the list, the S bit refers to the log itself. The value of the "phantom" S bit associated with the first log is defined by the following rules:

o If the list codes one log, the phantom S-bit value is the same as the Chapter X S-bit value.

o If the list codes multiple logs, the phantom S-bit value is the logical OR of the S-bit value of the first and second command logs in the list.

In all other respects, the S bit follows the semantics defined in Appendix A.1.

The FIRST field (present if F = 1) encodes a variable-length unsigned integer value that sets the coverage of the DATA field: it specifies which SysEx data bytes are encoded in the DATA field of the log.
The FIRST field consists of an octet whose most-significant bit is set to 0, optionally preceded by one or more octets whose most-significant bit is set to 1. The algorithm shown in Figure B.5.2 decodes this format into an unsigned integer, to yield the value dec(FIRST).

FIRST uses a variable-length encoding because dec(FIRST) references a data octet in a SysEx command, and a SysEx command may contain an arbitrary number of data octets.

   One-Octet FIRST value:

      Encoded form: 0ddddddd
      Decoded form: 00000000 00000000 00000000 0ddddddd

   Two-Octet FIRST value:

      Encoded form: 1ccccccc 0ddddddd
      Decoded form: 00000000 00000000 00cccccc cddddddd

   Three-Octet FIRST value:

      Encoded form: 1bbbbbbb 1ccccccc 0ddddddd
      Decoded form: 00000000 000bbbbb bbcccccc cddddddd

   Four-Octet FIRST value:

      Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd
      Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd

           Figure B.5.2 -- Decoding FIRST field formats

The DATA field (present if D = 1) encodes a modified version of the data octets of the SysEx command coded by the log. Status octets MUST NOT be coded in the DATA field.

If F = 0, the DATA field begins with the first data octet of the SysEx command and includes all subsequent data octets for the command that appear in the session history.
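As an informal illustration (not part of the normative text), the FIRST decoding of Figure B.5.2 can be sketched in Python. The function name decode_first is hypothetical. The four-octet limit is an assumption taken from the longest form shown in the figure.

```python
def decode_first(buf, offset=0):
    """Decode a FIRST field starting at buf[offset].

    Returns (dec(FIRST), number of octets consumed).
    """
    value = 0
    for count in range(1, 5):  # assumption: at most four octets, as in Figure B.5.2
        index = offset + count - 1
        if index >= len(buf):
            raise ValueError("truncated FIRST field")
        octet = buf[index]
        # Each octet contributes its lower 7 bits, most-significant group first.
        value = (value << 7) | (octet & 0x7F)
        if octet & 0x80 == 0:
            # A cleared most-significant bit marks the final octet.
            return value, count
    raise ValueError("FIRST field longer than four octets")
```

For instance, the two-octet encoding 0x81 0x00 decodes to 128, and a receiver would then continue parsing the log at the octet that follows the consumed FIRST octets.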
If F = 1, the DATA field begins with the (dec(FIRST) + 1)'th data octet of the SysEx command and includes all subsequent data octets for the command that appear in the session history. Note that the word "command" in the descriptions above refers to the original SysEx command as it appears in the source MIDI data stream, not to a particular MIDI list SysEx command segment.

The length of the DATA field is coded implicitly, using the most-significant bit of each octet. The most-significant bit of the final octet of the DATA field MUST be set to 1. The most-significant bit of all other DATA octets MUST be set to 0. This coding method relies on the fact that the most-significant bit of a MIDI data octet is 0 by definition. Apart from this length-coding modification, the DATA field encodes a verbatim copy of all data octets it encodes.

B.5.2. Log Inclusion Semantics

Chapter X offers two tools to protect SysEx commands: the "recency" tool and the "list" tool. The tool definitions use the concept of the "SysEx type" of a command, which we now define.

Each SysEx command instance in a session, excepting MTC Full Frame commands, is said to have a "SysEx type". Types are used in equality comparisons: two SysEx commands in a session are said to have "the same SysEx type" or "different SysEx types".

If efficiency is not a concern, a sender may follow a simple typing rule: every SysEx command in the session history has a different SysEx type, and thus no two commands in the session have the same type.

To improve efficiency, senders MAY implement exceptions to this rule. These exceptions declare that certain sets of SysEx command instances have the same SysEx type. Any command not covered by an exception follows the simple rule. We list exceptions below:

o All commands with identical data octet fields (same number of data octets, same value for each data octet) have the same type. This rule MUST be applied to all SysEx commands in the session, or not at all. Note that the implementation of this exception requires no sender knowledge of the format and semantics of the SysEx commands in the stream, merely the ability to count and compare octets.

o Two instances of the same command whose semantics set or report the value of the same "parameter" have the same type. The implementation of this exception requires specific knowledge of the format and semantics of SysEx commands. In practice, a sender implementation chooses to support this exception for certain classes of commands (such as the Universal System Exclusive commands defined in [MIDI]). If a sender supports this exception for a particular command in a class (for example, the Universal Real Time System Exclusive message for Master Volume, F0 7F cc 04 01 vv vv F7, defined in [MIDI]), it MUST support the exception for all instances of this particular command in the session.

We now use this definition of "SysEx type" to define the "recency" tool and the "list" tool for Chapter X.
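As an informal illustration (not part of the normative text), the first exception above (identical data octet fields share a type) can be sketched in Python. The function name and the use of small integers as type identifiers are assumptions made for this sketch; the memo does not prescribe any representation for a SysEx type.

```python
def assign_sysex_types(commands):
    """Assign a type identifier to each SysEx command in session order.

    `commands` is a sequence of bytes-like objects, each holding the data
    octets of one SysEx command (status octets excluded). Commands whose
    data octets are identical in number and value receive the same
    identifier; all other commands receive distinct identifiers.
    """
    type_ids = {}
    result = []
    for data in commands:
        key = bytes(data)
        if key not in type_ids:
            # First time this exact octet sequence is seen: allocate a new type.
            type_ids[key] = len(type_ids)
        result.append(type_ids[key])
    return result
```

Because the comparison only counts and compares octets, the sketch needs no knowledge of the format or semantics of the commands it classifies.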
By default, the Chapter X log list MUST code sufficient information to protect the rendered MIDI performance from indefinite artifacts caused by the loss of all finished or unfinished active SysEx commands that appear in the checkpoint history (excluding finished MTC Full Frame commands, which are coded in Chapter F (Appendix B.4)).

To protect a command of a specific SysEx type with the recency tool, senders MUST code a log in the log list for the most recent finished active instance of the SysEx type that appears in the checkpoint history. Additionally, if an unfinished active instance of the SysEx type appears in the checkpoint history, senders MUST code a log in the log list for the unfinished command instance. The L header bit of both command logs MUST be set to 0.

To protect a command of a specific SysEx type with the list tool, senders MUST code a log in the Chapter X log list for each finished or unfinished active instance of the SysEx type that appears in the checkpoint history. The L header bit of list tool command logs MUST be set to 1.

As a rule, a log REQUIRED by the list or recency tool MUST include a DATA field that codes all data octets that appear in the checkpoint history for the SysEx command instance associated with the log. The FIRST field MAY be used to configure a DATA field that minimally meets this requirement.

An exception to this rule applies to cancelled commands (defined in Section 3.2). REQUIRED command logs associated with cancelled commands MAY be coded with no DATA field. However, if DATA appears in the log, DATA MUST code all data octets that appear in the checkpoint history for the command associated with the log.

As defined by the preceding text in this section, by default all finished or unfinished active SysEx commands that appear in the checkpoint history (excluding finished MTC Full Frame commands) MUST be protected by the list tool or the recency tool.

For some MIDI source streams, this default yields a Chapter X whose size is too large. For example, imagine that a sender begins to transcode a SysEx command with 10,000 data octets onto a UDP RTP stream "on the fly", by sending SysEx command segments as soon as data octets are delivered by the MIDI source. After 1000 octets have been sent, the expansion of Chapter X yields an RTP packet that is too large to fit in the Maximum Transmission Unit (MTU) for the stream.

In this situation, if a sender uses the closed-loop sending policy for SysEx commands, the RTP packet size may always be capped by stalling the stream. In a stream stall, once the packet reaches a maximum size, the sender refrains from sending new packets with non-empty MIDI Command Sections until receiver feedback permits the trimming of Chapter X. If the stream permits arbitrary commands to appear between SysEx segments (selectable during configuration using the tools defined in Appendix C.1), the sender may stall the SysEx segment stream but continue to code other commands in the MIDI list.

Stalls are a workable but sub-optimal solution to Chapter X size issues. As an alternative to stalls, senders SHOULD take preemptive action during session configuration to reduce the anticipated size of Chapter X, using the methods described below:

o Partitioned transport. Appendix C.5 provides tools for sending a MIDI name space over several RTP streams. Senders may use these tools to map a MIDI source into a low-latency UDP RTP stream (for channel commands and short SysEx commands) and a reliable [RFC4571] TCP stream (for bulk-data SysEx commands). The cm_unused and cm_used parameters (Appendix C.1) may be used to communicate the nature of the SysEx command partition. As TCP is reliable, the RTP MIDI TCP stream would not use the recovery journal.
To minimize transmission latency for short SysEx commands, senders may begin segmental transmission for all SysEx commands over the UDP stream and then cancel the UDP transmission of long commands (using tools described in Section 3.2) and resend the commands over the TCP stream.

o Selective protection. Journal protection may not be necessary for all SysEx commands in a stream. The ch_never parameter (Appendix C.2) may be used to communicate which SysEx commands are excluded from Chapter X.

B.5.3. TCOUNT and COUNT Fields

If the T header bit is set to 1, the 8-bit TCOUNT field appears in the command log. If the C header bit is set to 1, the 8-bit COUNT field appears in the command log. TCOUNT and COUNT are interpreted as unsigned integers.

The TCOUNT field codes the total number of SysEx commands of the SysEx type coded by the log that appear in the session history, at the moment after the (finished or unfinished) command coded by the log enters the session history.
The COUNT field codes the total number of SysEx commands that appear in the session history, excluding commands that are excluded from Chapter X via the ch_never parameter (Appendix C.2), at the moment after the (finished or unfinished) command coded by the log enters the session history.

Command counting for TCOUNT and COUNT uses modulo-256 arithmetic. MTC Full Frame command instances (Appendix B.4) are included in command counting if the TCOUNT and COUNT definitions warrant their inclusion, as are cancelled commands (Section 3.2).

Senders use the TCOUNT and COUNT fields to track the identity and (for TCOUNT) the sequence position of a command instance. Senders MUST use the TCOUNT or COUNT fields if identity or sequence information is necessary to protect the command type coded by the log.

If a sender uses the COUNT field in a session, the final command log in every Chapter X in the stream MUST code the COUNT field. This rule lets receivers resynchronize the COUNT value after a packet loss.

C. Session Configuration Tools

In Sections 6.1-2 of the main text, we show session descriptions for minimal native and mpeg4-generic RTP MIDI streams.
Minimal streams lack the flexibility to support some applications. In this appendix, we describe how to customize stream behavior through the use of the payload format parameters.

The appendix begins with 6 sections, each devoted to parameters that affect a particular aspect of stream behavior:

o Appendix C.1 describes the stream subsetting system (cm_unused and cm_used).

o Appendix C.2 describes the journalling system (ch_anchor, ch_default, ch_never, j_sec, j_update).

o Appendix C.3 describes MIDI command timestamp semantics (linerate, mperiod, octpos, tsmode).

o Appendix C.4 describes the temporal duration ("media time") of an RTP MIDI packet (guardtime, rtp_maxptime, rtp_ptime).

o Appendix C.5 concerns stream description (musicport).

o Appendix C.6 describes MIDI rendering (chanmask, cid, inline, multimode, render, rinit, subrender, smf_cid, smf_info, smf_inline, smf_url, url).

The parameters listed above may optionally appear in session descriptions of RTP MIDI streams. If these parameters are used in an SDP session description, the parameters appear on an fmtp attribute line. This attribute line applies to the payload type associated with the fmtp line.

The parameters listed above add extra functionality ("features") to minimal RTP MIDI streams. In Appendix C.7, we show how to use these features to support two classes of applications: content-streaming using RTSP (Appendix C.7.1) and network musical performance using SIP (Appendix C.7.2).

The participants in a multimedia session MUST share a common view of all of the RTP MIDI streams that appear in an RTP session, as defined by a single media (m=) line. In some RTP MIDI applications, the "common view" restriction makes it difficult to use sendrecv streams (all parties send and receive), as each party has its own requirements. For example, a two-party network musical performance application may wish to customize the renderer on each host to match the CPU performance of the host [NMP].

We solve this problem by using two RTP MIDI streams -- one sendonly, one recvonly -- in lieu of one sendrecv stream. The data flows in the two streams travel in opposite directions, to control receivers configured to use different renderers. In the third example in Appendix C.5, we show how the musicport parameter may be used to define virtual sendrecv streams.
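As an informal illustration of how the payload format parameters are carried on an fmtp attribute line, the following Python sketch splits such a line into its payload type and an ordered list of parameter assignments. The function name parse_fmtp is hypothetical and not part of this memo. The sketch assumes that assignments are separated by semicolons, as in the session description examples of this appendix, and it does not handle quoted values that themselves contain semicolons. The normative syntax is the ABNF of Appendix D.

```python
def parse_fmtp(line):
    """Split an SDP fmtp attribute line into (payload_type, assignments).

    Illustrative sketch only: assumes "a=fmtp:<pt> name=value; name=value".
    Order and duplicates are preserved, because several parameters
    (for example cm_used and cm_unused) may be assigned more than once.
    """
    prefix, _, body = line.strip().partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute line")
    payload_type = int(prefix[len("a=fmtp:"):])

    assignments = []
    for item in body.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        assignments.append((name.strip(), value.strip()))
    return payload_type, assignments
```

Returning a list of pairs rather than a dictionary keeps repeated assignments in their order of appearance, which matters for parameters whose assignments are cumulative.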
As a general rule, the RTP MIDI protocol does not handle parameter changes during a session well, because the parameters describe heavyweight or stateful configuration that is not easily changed once a session has begun. Thus, parties SHOULD NOT expect that parameter change requests during a session will be accepted by other parties. However, implementors SHOULD support in-session parameter changes that are easy to handle (for example, the guardtime parameter defined in Appendix C.4) and SHOULD be capable of accepting requests for changes of those parameters, as received by its session management protocol (for example, re-offers in SIP [RFC3264]).

Appendix D defines the Augmented Backus-Naur Form (ABNF, [RFC4234]) syntax for the payload parameters. Section 11 provides information to the Internet Assigned Numbers Authority (IANA) on the media types and parameters defined in this document.

Appendix C.6.5 defines the media type "audio/asc", a stored object for initializing mpeg4-generic renderers. As described in Appendix C.6, the audio/asc media type is assigned to the "rinit" parameter to specify an initialization data object for the default mpeg4-generic renderer. Note that RTP stream semantics are not defined for "audio/asc". Therefore, the "asc" subtype MUST NOT appear on the rtpmap line of a session description.

C.1. Configuration Tools: Stream Subsetting

As defined in Section 3.2 in the main text, the MIDI list of an RTP MIDI packet may encode any MIDI command that may legally appear on a MIDI 1.0 DIN cable.

In this appendix, we define two parameters (cm_unused and cm_used) that modify this default condition, by excluding certain types of MIDI commands from the MIDI list of all packets in a stream. For example, if a multimedia session partitions a MIDI name space into two RTP MIDI streams, the parameters may be used to define which commands appear in each stream.

In this appendix, we define a simple language for specifying MIDI command types.
If a command type is assigned to cm_unused, the commands coded by the string MUST NOT appear in the MIDI list. If a command type is assigned to cm_used, the commands coded by the string MAY appear in the MIDI list.

The parameter list may code multiple assignments to cm_used and cm_unused. Assignments have a cumulative effect and are applied in the order of appearance in the parameter list. A later assignment of a command type to the same parameter expands the scope of the earlier assignment. A later assignment of a command type to the opposite parameter cancels (partially or completely) the effect of an earlier assignment.

To initialize the stream subsetting system, "implicit" assignments to cm_unused and cm_used are processed before processing the actual assignments that appear in the parameter list. The System Common undefined commands (0xF4, 0xF5) and the System Real-Time Undefined commands (0xF9, 0xFD) are implicitly assigned to cm_unused. All other command types are implicitly assigned to cm_used.
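The cumulative processing described above can be sketched in Python as follows. This is a simplified illustration, not a conforming implementation: it tracks whole command types only, using the command-type letters defined later in this appendix (J and K for the System Common undefined commands, Y and Z for the System Real-Time Undefined commands), and it ignores channel lists, field lists, and the SysEx class syntax. The function name used_command_types is hypothetical.

```python
ALL_LETTERS = set("ABCFGHJKMNPQTVWXYZ")
IMPLICITLY_UNUSED = set("JKYZ")  # 0xF4, 0xF5, 0xF9, 0xFD


def used_command_types(assignments):
    """Return the set of command-type letters that may appear in the MIDI list.

    `assignments` is an ordered list of (parameter, letters) pairs, in the
    order in which they appear in the parameter list.
    """
    # Implicit assignments are processed first.
    used = ALL_LETTERS - IMPLICITLY_UNUSED

    # Actual assignments are then applied in order of appearance;
    # a later assignment to the opposite parameter cancels an earlier one.
    for parameter, letters in assignments:
        known = set(letters) & ALL_LETTERS  # unknown letters are ignored
        if parameter == "cm_used":
            used |= known
        elif parameter == "cm_unused":
            used -= known
        else:
            raise ValueError("unexpected parameter: " + parameter)
    return used
```

With no assignments at all, the function returns every command type except J, K, Y, and Z, which corresponds to the default stream behavior.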
Note that the implicit assignments code the default behavior of an RTP MIDI stream as defined in Section 3.2 in the main text (namely, that all commands that may legally appear on a MIDI 1.0 DIN cable may appear in the stream). Also note that assignments of the System Common undefined commands (0xF4, 0xF5) apply to the use of these commands in the MIDI source command stream, not the special use of 0xF4 and 0xF5 in SysEx segment encoding defined in Section 3.2 in the main text.

As a rule, parameter assignments obey the following syntax (see Appendix D for ABNF):

   <parameter> = [channel list]<command-type list>[field list]

The command-type list is mandatory; the channel and field lists are optional.

The command-type list specifies the MIDI command types for which the parameter applies. The command-type list is a concatenated sequence of one or more of the letters (ABCFGHJKMNPQTVWXYZ). The letters code the following command types:

o A: Poly Aftertouch (0xA)
o B: System Reset (0xFF)
o C: Control Change (0xB)
o F: System Time Code (0xF1)
o G: System Tune Request (0xF6)
o H: System Song Select (0xF3)
o J: System Common Undefined (0xF4)
o K: System Common Undefined (0xF5)
o N: NoteOff (0x8), NoteOn (0x9)
o P: Program Change (0xC)
o Q: System Sequencer (0xF2, 0xF8, 0xFA, 0xFB, 0xFC)
o T: Channel Aftertouch (0xD)
o V: System Active Sense (0xFE)
o W: Pitch Wheel (0xE)
o X: SysEx (0xF0)
o Y: System Real-Time Undefined (0xF9)
o Z: System Real-Time Undefined (0xFD)

In addition to the letters above, the letter M may also appear in the command-type list. The letter M refers to the MIDI parameter system (see definition in Appendix A.1 and in [MIDI]). An assignment of M to cm_unused codes that no RPN or NRPN transactions may appear in the MIDI list.

Note that if cm_unused is assigned the letter M, Control Change (0xB) commands for the controller numbers in the standard controller assignment might still appear in the MIDI list. For an explanation, see the discussion in Appendix A.3.4 of the "general-purpose" use of parameter system controller numbers.

In the text below, rules that apply to "MIDI voice channel commands" also apply to the letter M.

The letters in the command-type list MUST be uppercase and MUST appear in alphabetical order. Letters other than (ABCFGHJKMNPQTVWXYZ) that appear in the list MUST be ignored.

For MIDI voice channel commands, the channel list specifies the MIDI channels for which the parameter applies. If no channel list is provided, the parameter applies to all MIDI channels (0-15). The channel list takes the form of a list of channel numbers (0 through 15) and dash-separated channel number ranges (e.g., 0-5 or 8-12). Dots (i.e., "." characters) separate elements in the channel list.

Recall that System commands do not have a MIDI channel associated with them. Thus, for most command-type letters that code System commands (B, F, G, H, J, K, Q, V, Y, and Z), the channel list is ignored.

For the command-type letter X, the appearance of certain numbers in the channel list codes special semantics.
o The digit 0 codes that SysEx "cancel" sublists (Section 3.2 in the main text) MUST NOT appear in the MIDI list.

o The digit 1 codes that cancel sublists MAY appear in the MIDI list (the default condition).

o The digit 2 codes that commands other than System Real-time MIDI commands MUST NOT appear between SysEx command segments in the MIDI list (the default condition).

o The digit 3 codes that any MIDI command type may appear between SysEx command segments in the MIDI list, with the exception of the segmented encoding of a second SysEx command (verbatim SysEx commands are OK).

For command-type X, the channel list MUST NOT contain both digits 0 and 1, and it MUST NOT contain both digits 2 and 3. For command-type X, channel list numbers other than the numbers defined above are ignored. If X does not have a channel list, the semantics marked "the default condition" in the list above apply.

The syntax for field lists in a parameter assignment follows the syntax for channel lists. If no field list is provided, the parameter applies to all controller or note numbers.

For command-type C (Control Change), the field list codes the controller numbers (0-255) for which the parameter applies.

For command-type M (Parameter System), the field list codes the Registered Parameter Numbers (RPNs) and Non-Registered Parameter Numbers (NRPNs) for which the parameter applies. The number range 0-16383 specifies RPNs, and the number range 16384-32767 specifies NRPNs (16384 corresponds to NRPN 0, 32767 corresponds to NRPN 16383).

For command-types N (NoteOn and NoteOff) and A (Poly Aftertouch), the field list codes the note numbers for which the parameter applies.

For command-types J and K (System Common Undefined), the field list consists of a single digit, which specifies the number of data octets that follow the command octet.

For command-type X (SysEx), the field list codes the number of data octets that may appear in a SysEx command. Thus, the field list 0-255 specifies SysEx commands with 255 or fewer data octets, the field list 256-4294967295 specifies SysEx commands with more than 255 data octets but excludes commands with 255 or fewer data octets, and the field list 0 excludes all commands.
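The primary assignment syntax, [channel list]<command-type list>[field list], can be split into its three parts with a short Python sketch. The names parse_number_list and parse_assignment are hypothetical, and the regular expression is an assumption derived from the prose above (digits, dots, and dashes around a run of uppercase letters), not a transcription of the Appendix D ABNF. The sketch does not enforce the alphabetical-order rule and does not interpret the special channel-list digits defined for command-type X. Lists are returned as (low, high) pairs so that large ranges such as 256-4294967295 are never expanded.

```python
import re

VALID_LETTERS = "ABCFGHJKMNPQTVWXYZ"
_ASSIGNMENT = re.compile(r"^([0-9.\-]*)([A-Z]+)([0-9.\-]*)$")


def parse_number_list(text):
    """Parse a dot-separated list of numbers and dash-separated ranges."""
    ranges = []
    for element in text.split("."):
        low, _, high = element.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def parse_assignment(value):
    """Return (channel_ranges, command_types, field_ranges).

    A list that is absent from the assignment is returned as None, so the
    caller can apply the defaults described in the text.
    """
    match = _ASSIGNMENT.match(value)
    if match is None:
        raise ValueError("malformed assignment: " + value)
    channel_text, letters, field_text = match.groups()

    # Letters outside the defined set are ignored.
    command_types = "".join(ch for ch in letters if ch in VALID_LETTERS)

    channels = parse_number_list(channel_text) if channel_text else None
    fields = parse_number_list(field_text) if field_text else None
    return channels, command_types, fields
```

For instance, the illustrative value "4.11-13N" would select NoteOn and NoteOff commands on channels 4 and 11 through 13, while "C7.64" would select Control Change commands for controller numbers 7 and 64 on all channels.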
A secondary parameter assignment syntax customizes command-type X (see Appendix D for complete ABNF):

   <parameter> = "__" <h-list> ["_" <h-list>] "__"

The assignment defines the class of SysEx commands that obeys the semantics of the assigned parameter. The command class is specified by listing the permitted values of the first N data octets that follow the SysEx 0xF0 command octet. Any SysEx command whose first N data octets match the list is a member of the class.

Each <h-list> defines a data octet of the command, as a dot-separated (".") list of one or more hexadecimal constants (such as "7F") or dash-separated hexadecimal ranges (such as "01-1F"). Underscores ("_") separate each <h-list>. Double-underscores ("__") delineate the data octet list.

Using this syntax, each assignment specifies a single SysEx command class. Session descriptions may use several assignments to cm_used and cm_unused to specify complex behaviors.
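The class-matching rule for the secondary syntax can be sketched in Python as follows. The names parse_sysex_class and sysex_in_class are hypothetical. The sketch assumes that a value holds one or more underscore-separated h-lists between the double-underscore delimiters, and that data_octets holds the octets that follow the 0xF0 command octet. It performs no validation beyond what int() provides.

```python
def parse_sysex_class(value):
    """Parse a value such as "__7F_00-7F_01_01__".

    Returns one entry per data octet; each entry is a list of
    (low, high) pairs giving the permitted values of that octet.
    """
    if not (value.startswith("__") and value.endswith("__")) or len(value) < 5:
        raise ValueError("malformed SysEx class: " + value)
    octet_specs = []
    for h_list in value[2:-2].split("_"):
        ranges = []
        for element in h_list.split("."):
            low, _, high = element.partition("-")
            ranges.append((int(low, 16), int(high or low, 16)))
        octet_specs.append(ranges)
    return octet_specs


def sysex_in_class(data_octets, octet_specs):
    """True if the first N data octets match the N h-lists of the class."""
    if len(data_octets) < len(octet_specs):
        return False
    return all(
        any(low <= octet <= high for low, high in ranges)
        for octet, ranges in zip(data_octets, octet_specs)
    )
```

Under these assumptions, the class "__7F_00-7F_01_01__" matches any SysEx command whose data begins with 0x7F, any device ID in the range 0x00-0x7F, 0x01, 0x01, which is the octet pattern of an MTC Full Frame command.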
The example session description below illustrates the use of the stream subsetting parameters:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 cm_unused=ACGHJKNMPTVWXYZ; cm_used=__7F_00-7F_01_01__

The session description configures the stream for use in clock applications. All voice channels are unused, as are all System Commands except those used for MIDI Time Code (command-type F, and the Full Frame SysEx command that is matched by the string assigned to cm_used), the System Sequencer commands (command-type Q), and System Reset (command-type B).

C.2. Configuration Tools: The Journalling System

In this appendix, we define the payload format parameters that configure stream journalling and the recovery journal system.

The j_sec parameter (Appendix C.2.1) sets the journalling method for the stream. The j_update parameter (Appendix C.2.2) sets the recovery journal sending policy for the stream. Appendix C.2.2 also defines the sending policies of the recovery journal system.

Appendix C.2.3 defines several parameters that modify the recovery journal semantics. These parameters change the default recovery journal semantics as defined in Section 5 and Appendices A-B.

The journalling method for a stream is set at the start of a session and MUST NOT be changed thereafter. This requirement forbids changes to the j_sec parameter once a session has begun.

A related requirement, defined in the appendix sections below, forbids the acceptance of parameter values that would violate the recovery journal mandate. In many cases, a change in one of the parameters defined in this appendix during an ongoing session would result in a violation of the recovery journal mandate for an implementation; in this case, the parameter change MUST NOT be accepted.

C.2.1. The j_sec Parameter

Section 2.2 defines the default journalling method for a stream. Streams that use unreliable transport (such as UDP) default to using the recovery journal. Streams that use reliable transport (such as TCP) default to not using a journal.

The parameter j_sec may be used to override this default. This memo defines two symbolic values for j_sec: "none", to indicate that all stream payloads MUST NOT contain a journal section, and "recj", to indicate that all stream payloads MUST contain a journal section that uses the recovery journal format.

For example, the j_sec parameter might be set to "none" for a UDP stream that travels between two hosts on a local network that is known to provide reliable datagram delivery.

The session description below configures a UDP stream that does not use the recovery journal:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 j_sec=none

Other IETF standards-track documents may define alternative journal formats.
These documents MUST define new symbolic values for the j_sec parameter to signal the use of the format.

Parties MUST NOT accept a j_sec value that violates the recovery journal mandate (see Section 4 for details). If a session description uses a j_sec value unknown to the recipient, the recipient MUST NOT accept the description.

Special j_sec issues arise when sessions are managed by session management tools (like RTSP, [RFC2326]) that use SDP for "declarative usage" purposes (see the preamble of Section 6 for details). For these session management tools, SDP does not code transport details (such as UDP or TCP) for the session. Instead, server and client negotiate transport details via other means (for RTSP, the SETUP method).

In this scenario, the use of the j_sec parameter may be ill-advised, as the creator of the session description may not yet know the transport type for the session. In this case, the session description SHOULD configure the journalling system using the parameters defined in the remainder of Appendix C.2, but it SHOULD NOT use j_sec to set the journalling status. Recall that if j_sec does not appear in the session description, the default method for choosing the journalling method is in effect (no journal for reliable transport, recovery journal for unreliable transport).

However, in declarative usage situations where the creator of the session description knows that journalling is always required or never required, the session description SHOULD use the j_sec parameter.

C.2.2. The j_update Parameter

In Section 4, we use the term "sending policy" to describe the method a sender uses to choose the checkpoint packet identity for each recovery journal in a stream. In the sub-sections that follow, we normatively define three sending policies: anchor, closed-loop, and open-loop.

As stated in Section 4, the default sending policy for a stream is the closed-loop policy. The j_update parameter may be used to override this default.

We define three symbolic values for j_update: "anchor", to indicate that the stream uses the anchor sending policy, "open-loop", to indicate that the stream uses the open-loop sending policy, and "closed-loop", to indicate that the stream uses the closed-loop sending policy. See Appendix C.2.3 for example session descriptions that use the j_update parameter.

Parties MUST NOT accept a j_update value that violates the recovery journal mandate (Section 4).

Other IETF standards-track documents may define additional sending policies for the recovery journal system. These documents MUST define new symbolic values for the j_update parameter to signal the use of the new policy. If a session description uses a j_update value unknown to the recipient, the recipient MUST NOT accept the description.

C.2.2.1. The anchor Sending Policy

In the anchor policy, the sender uses the first packet in the stream as the checkpoint packet for all packets in the stream. The anchor policy satisfies the recovery journal mandate (Section 4), as the checkpoint history always covers the entire stream.

The anchor policy does not require the use of the RTP control protocol (RTCP, [RFC3550]) or other feedback from receiver to sender. Senders do not need to take special actions to ensure that received streams start up free of artifacts, as the recovery journal always covers the entire history of the stream.
Receivers are relieved of the responsibility of tracking the changing identity of the checkpoint packet, because the checkpoint packet never changes.

The main drawback of the anchor policy is bandwidth efficiency. Because the checkpoint history covers the entire stream, the size of the recovery journals produced by this policy usually exceeds the journal size of alternative policies. For single-channel MIDI data streams, the bandwidth overhead of the anchor policy is often acceptable (see Appendix A.4 of [NMP]). For dense streams, the closed-loop or open-loop policies may be more appropriate.

C.2.2.2. The closed-loop Sending Policy

The closed-loop policy is the default policy of the recovery journal system. For each packet in the stream, the policy lets senders choose the smallest possible checkpoint history that satisfies the recovery journal mandate.
As smaller checkpoint histories generally yield smaller recovery journals, the closed-loop policy reduces the bandwidth of a stream, relative to the anchor policy.

The closed-loop policy relies on feedback from receiver to sender. The policy assumes that a receiver periodically informs the sender of the highest sequence number it has seen so far in the stream, coded in the 32-bit extension format defined in [RFC3550]. For RTCP, receivers transmit this information in the Extended Highest Sequence Number Received (EHSNR) field of Receiver Reports. RTCP Sender or Receiver Reports MUST be sent by any participant in a session with the closed-loop sending policy, unless another feedback mechanism has been agreed upon.

The sender may safely use receiver sequence number feedback to guide checkpoint history management, because Section 4 requires that receivers repair indefinite artifacts whenever a packet loss event occurs.

We now normatively define the closed-loop policy. At the moment a sender prepares an RTP packet for transmission, the sender is aware of R >= 0 receivers for the stream. Senders may become aware of a receiver via RTCP traffic from the receiver, via RTP packets from a paired stream sent by the receiver to the sender, via messages from a session management tool, or by other means. As receivers join and leave a session, the value of R changes.
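As an informal illustration of the 32-bit extension format mentioned above, the sketch below combines a 16-bit RTP sequence number with a rollover count, placing the count in the upper 16 bits as in the EHSNR field. The function names (extend_seq, update_highest) are hypothetical, and the wrap detection is a simplified assumption; it omits the probation and resynchronization logic of the sample code in [RFC3550].

```python
def extend_seq(seq16: int, cycles: int) -> int:
    """Combine a 16-bit sequence number with a rollover (cycle) count."""
    if not 0 <= seq16 <= 0xFFFF:
        raise ValueError("seq16 must fit in 16 bits")
    if not 0 <= cycles <= 0xFFFF:
        raise ValueError("cycles must fit in 16 bits")
    return (cycles << 16) | seq16


def update_highest(prev_ext: int, seq16: int) -> int:
    """Return the extended highest sequence number after seeing seq16.

    Simplifying assumption: a packet is "newer" if it is ahead of the
    previous highest value by less than half of the 16-bit number space.
    """
    cycles = prev_ext >> 16
    prev16 = prev_ext & 0xFFFF
    delta = (seq16 - prev16) & 0xFFFF
    if delta == 0 or delta >= 0x8000:
        return prev_ext          # duplicate or late packet: no change
    if seq16 < prev16:
        cycles += 1              # the 16-bit counter wrapped around
    return extend_seq(seq16, cycles & 0xFFFF)


if __name__ == "__main__":
    highest = extend_seq(0xFFFE, 0)
    highest = update_highest(highest, 0x0001)   # crosses the wrap point
    print(hex(highest))
```

A sender that tracks one such extended value per receiver has what it needs to evaluate the M(k) values used by the closed-loop policy.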
Each known receiver k (1 <= k <= R) is associated with a 32-bit extended packet sequence number M(k), where the extension reflects the sequence number rollover count of the sender.

If the sender has received at least one feedback report from receiver k, M(k) is the most recent report of the highest RTP packet sequence number seen by the receiver, normalized to reflect the rollover count of the sender.

If the sender has not received a feedback report from the receiver, M(k) is the extended sequence number of the last packet the sender transmitted before it became aware of the receiver. If the sender became aware of this receiver before it sent the first packet in the stream, M(k) is the extended sequence number of the first packet in the stream.

Given this definition of M(), we now state the closed-loop policy. When preparing a new packet for transmission, a sender MUST choose a checkpoint packet with extended sequence number N, such that M(k) >= (N - 1) for all k, 1 <= k <= R, where R >= 1. The policy does not restrict sender behavior in the R == 0 (no known receivers) case.

Under the closed-loop policy as defined above, a sender may transmit packets whose checkpoint history is shorter than the session history (as defined in Appendix A.1). In this event, a new receiver that joins the stream may experience indefinite artifacts.

For example, if a Control Change (0xB) command for Channel Volume (controller number 7) was sent early in a stream, and later a new receiver joins the session, the closed-loop policy may permit all packets sent to the new receiver to use a checkpoint history that does not include the Channel Volume Control Change command. As a result, the new receiver experiences an indefinite artifact, and plays all notes on a channel too loudly or too softly.

To address this issue, the closed-loop policy states that whenever a sender becomes aware of a new receiver, the sender MUST determine if the receiver would be subject to indefinite artifacts under the closed-loop policy. If so, the sender MUST ensure that the receiver starts the session free of indefinite artifacts.
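The checkpoint constraint stated in this section, M(k) >= (N - 1) for all known receivers, implies that the most recent checkpoint a sender may choose is N = min(M(k)) + 1. The sketch below illustrates this arithmetic only. The function names are hypothetical, and the M(k) values are assumed to be extended sequence numbers that have already been normalized to the rollover count of the sender.

```python
from typing import Iterable, Optional


def latest_allowed_checkpoint(m_values: Iterable[int]) -> Optional[int]:
    """Return the largest N with M(k) >= N - 1 for every known receiver.

    Returns None when no receivers are known (R == 0), because the
    closed-loop policy does not restrict the sender in that case.
    """
    values = list(m_values)
    if not values:
        return None
    return min(values) + 1


def checkpoint_is_valid(n: int, m_values: Iterable[int]) -> bool:
    """Check a candidate checkpoint N against the closed-loop constraint."""
    return all(m >= n - 1 for m in m_values)


if __name__ == "__main__":
    m = [120, 98, 105]                    # hypothetical M(k) for R = 3
    n = latest_allowed_checkpoint(m)
    print(n, checkpoint_is_valid(n, m), checkpoint_is_valid(n + 1, m))
```

The receiver with the smallest M(k) bounds the checkpoint, which is why a receiver that stops sending feedback forces the checkpoint history to grow.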
For example, to solve the Channel Volume issue described above, the sender may code the current state of the Channel Volume controller numbers in the recovery journal Chapter C, until it receives the first RTCP RR report that signals that a packet containing this Chapter C has been received.

In satisfying this requirement, senders MAY infer the initial MIDI state of the receiver from the session description. For example, the stream example in Section 6.2 has the initial state defined in [MIDI] for General MIDI.

In a unicast RTP session, a receiver may safely assume that the sender is aware of its presence as a receiver from the first packet sent in the RTP stream. However, in other types of RTP sessions (multicast, conference focus, RTP translator/mixer), a receiver is often not able to determine if the sender is initially aware of its presence as a receiver.

To address this issue, the closed-loop policy states that if a receiver participates in a session where it may have access to a stream whose sender is not aware of the receiver, the receiver MUST take actions to ensure that its rendered MIDI performance does not contain indefinite artifacts. These protections will be necessarily incomplete.
For example, a receiver may monitor the Checkpoint Packet Seqnum for uncovered loss events, and "err on the side of caution" with respect to handling stuck notes due to lost MIDI NoteOff commands, but the receiver is not able to compensate for the lack of Channel Volume initialization data in the recovery journal.

The receiver MUST NOT discontinue these protective actions until it is certain that the sender is aware of its presence. If a receiver is not able to ascertain sender awareness, the receiver MUST continue these protective actions for the duration of the session.

Note that in a multicast session where all parties are expected to send and receive, the reception of RTCP receiver reports from the sender about the RTP stream a receiver is multicasting is evidence of the sender's awareness that the RTP stream multicast by the sender is being monitored by the receiver. Receivers may also obtain sender awareness evidence from session management tools, or by other means. In practice, ongoing observation of the Checkpoint Packet Seqnum to determine if the sender is taking actions to prevent loss events for a receiver is a good indication of sender awareness, as is the sudden appearance of recovery journal chapters with numerous Control Change controller data that was not foreshadowed by recent commands coded in the MIDI list shortly after sending an RTCP RR.

The final set of normative closed-loop policy requirements concerns how senders and receivers handle unplanned disruptions of RTCP feedback from a receiver to a sender. By "unplanned", we refer to disruptions that are not due to the signalled termination of an RTP stream, via an RTCP BYE or via session management tools.

As defined earlier in this section, the closed-loop policy states that a sender MUST choose a checkpoint packet with extended sequence number N, such that M(k) >= (N - 1) for all k, 1 <= k <= R, where R >= 1. If the sender has received at least one feedback report from receiver k, M(k) is the most recent report of the highest RTP packet sequence number seen by the receiver, normalized to reflect the rollover count of the sender.

If this receiver k stops sending feedback to the sender, the M(k) value used by the sender reflects the last feedback report from the receiver. As time progresses without feedback from receiver k, this fixed M(k) value forces the sender to increase the size of the checkpoint history, and thus increases the bandwidth of the stream.

At some point, the sender may need to take action in order to limit the bandwidth of the stream. In most envisioned uses of RTP MIDI, long before this point is reached, the SSRC time-out mechanism defined in [RFC3550] will remove the uncooperative receiver from the session (note that the closed-loop policy does not suggest or require any special sender behavior upon an SSRC time-out, other than the sender actions related to changing R, described earlier in this section).

However, in rare situations, the bandwidth of the stream (due to a lack of feedback reports from the receiver) may become too large to continue sending the stream to the receiver before the SSRC time-out occurs for the receiver. In this case, the closed-loop policy states that the sender should invoke the SSRC time-out for the receiver early.

We now discuss receiver responsibilities in the case of unplanned disruptions of RTCP feedback from receiver to sender.

In the unicast case, if a sender invokes the SSRC time-out mechanism for a receiver, the receiver stops receiving packets from the sender. The sender behavior imposed by the guardtime parameter (Appendix C.4.2) lets the receiver conclude that an SSRC time-out has occurred in a reasonable time period.

In this case of a time-out, a receiver MUST keep sending RTCP feedback, in order to re-establish the RTP flow from the sender. Unless the receiver expects a prompt recovery of the RTP flow, the receiver MUST take actions to ensure that the rendered MIDI performance does not exhibit "very long transient artifacts" (for example, by silencing NoteOns to prevent stuck notes) while awaiting reconnection of the flow.

In the multicast case, if a sender invokes the SSRC time-out mechanism for a receiver, the receiver may continue to receive packets, but the sender will no longer be using the M(k) feedback from the receiver to choose each checkpoint packet. If the receiver does not have additional information that precludes an SSRC time-out (such as RTCP Receiver Reports from the sender about an RTP stream the receiver is multicasting back to the sender), the receiver MUST monitor the Checkpoint Packet Seqnum to detect an SSRC time-out. If an SSRC time-out is detected, the receiver MUST follow the instructions for SSRC time-outs described for the unicast case above.

Finally, we note that the closed-loop policy is suitable for use in RTP/RTCP sessions that use multicast transport. However, aspects of the closed-loop policy do not scale well to sessions with large numbers of participants. The sender state scales linearly with the number of receivers, as the sender needs to track the identity and M(k) value for each receiver k.
The average recovery journal size is not independent of the number of receivers, as the RTCP reporting interval backoff slows down the rate of a full update of M(k) values. The backoff algorithm may also increase the amount of ancillary state used by implementations of the normative sender and receiver behaviors defined in Section 4.

C.2.2.3. The open-loop Sending Policy

The open-loop policy is suitable for sessions that are not able to implement the receiver-to-sender feedback required by the closed-loop policy, and that are also not able to use the anchor policy because of bandwidth constraints.

The open-loop policy does not place constraints on how a sender chooses the checkpoint packet for each packet in the stream. In the absence of such constraints, a receiver may find that the recovery journal in the packet that ends a loss event has a checkpoint history that does not cover the entire loss event. We refer to loss events of this type as uncovered loss events.

To ensure that uncovered loss events do not compromise the recovery journal mandate, the open-loop policy assigns specific recovery tasks to senders, receivers, and the creators of session descriptions. The underlying premise of the open-loop policy is that the indefinite artifacts produced during uncovered loss events fall into two classes.

One class of artifacts is recoverable indefinite artifacts. Receivers are able to repair recoverable artifacts that occur during an uncovered loss event without intervention from the sender, at the potential cost of unpleasant transient artifacts.

For example, after an uncovered loss event, receivers are able to repair indefinite artifacts due to NoteOff (0x8) commands that may have occurred during the loss event, by executing NoteOff commands for all active NoteOn commands. This action causes a transient artifact (a sudden silent period in the performance), but ensures that no stuck notes sound indefinitely. We refer to MIDI commands that are amenable to repair in this fashion as recoverable MIDI commands.

A second class of artifacts is unrecoverable indefinite artifacts. If this class of artifact occurs during an uncovered loss event, the receiver is not able to repair the stream.

For example, after an uncovered loss event, receivers are not able to repair indefinite artifacts due to Control Change (0xB) Channel Volume (controller number 7) commands that have occurred during the loss event. A repair is impossible because the receiver has no way of determining the data value of a lost Channel Volume command.
We refer to MIDI commands that are fragile in this way as unrecoverable MIDI commands.

The open-loop policy does not specify how to partition the MIDI command set into recoverable and unrecoverable commands. Instead, it assumes that the creators of the session descriptions are able to come to agreement on a suitable recoverable/unrecoverable MIDI command partition for an application.

Given these definitions, we now state the normative requirements for the open-loop policy.

In the open-loop policy, the creators of the session description MUST use the ch_anchor parameter (defined in Appendix C.2.3) to protect all unrecoverable MIDI command types from indefinite artifacts, or alternatively MUST use the cm_unused parameter (defined in Appendix C.1) to exclude the command types from the stream. These options act to shield command types from artifacts during an uncovered loss event.

In the open-loop policy, receivers MUST examine the Checkpoint Packet Seqnum field of the recovery journal header after every loss event, to check if the loss event is an uncovered loss event. Section 5 shows how to perform this check. If an uncovered loss event has occurred, a receiver MUST perform indefinite artifact recovery for all MIDI command types that are not shielded by ch_anchor and cm_unused parameter assignments in the session description.

The open-loop policy does not place specific constraints on the sender. However, the open-loop policy works best if the sender manages the size of the checkpoint history to ensure that uncovered losses occur infrequently, by taking into account the delay and loss characteristics of the network.
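The receiver-side check for an uncovered loss event can be sketched informally as follows. Section 5 defines the authoritative procedure; the comparison used here is an assumption derived from the closed-loop inequality M(k) >= (N - 1), and treats a loss as covered only if the checkpoint packet is no later than the first lost packet. The function names, the command type labels, and the use of already-extended sequence numbers are all hypothetical.

```python
from typing import Iterable, List


def is_uncovered_loss(last_received_ext: int, checkpoint_ext: int) -> bool:
    """Return True if the checkpoint history misses part of the loss.

    last_received_ext: extended sequence number of the last packet
        received before the loss event.
    checkpoint_ext: extended form of the Checkpoint Packet Seqnum found
        in the packet that ends the loss event (assumed to be extended
        by the caller).
    """
    first_lost = last_received_ext + 1
    return checkpoint_ext > first_lost


def types_needing_recovery(in_use: Iterable[str],
                           shielded: Iterable[str]) -> List[str]:
    """Command types the receiver must repair after an uncovered loss."""
    return sorted(set(in_use) - set(shielded))


if __name__ == "__main__":
    # Packets 101..104 were lost; the journal's checkpoint is packet 103.
    print(is_uncovered_loss(100, 103))
    # Hypothetical labels: types shielded by ch_anchor or cm_unused.
    print(types_needing_recovery({"N", "C", "P"}, {"C", "P"}))
```

In this sketch, only the types left unshielded by the session description (here the hypothetical label "N") would require receiver-side repair, such as ending all active notes.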
Also, as each checkpoint packet change incurs the risk of an uncovered loss, senders should only move the checkpoint if it reduces the size of the journal.

C.2.3. Recovery Journal Chapter Inclusion Parameters

The recovery journal chapter definitions (Appendices A-B) specify under what conditions a chapter MUST appear in the recovery journal. In most cases, the definition states that if a certain command appears in the checkpoint history, a certain chapter type MUST appear in the recovery journal to protect the command.

In this section, we describe the chapter inclusion parameters. These parameters modify the conditions under which a chapter appears in the journal. These parameters are essential to the use of the open-loop policy (Appendix C.2.2.3) and may also be used to simplify implementations of the closed-loop (Appendix C.2.2.2) and anchor (Appendix C.2.2.1) policies.

Each parameter represents a type of chapter inclusion semantics. An assignment to a parameter declares which chapters (or chapter subsets) obey the inclusion semantics. We describe the assignment syntax for these parameters later in this section.
A party MUST NOT accept chapter inclusion parameter values that violate the recovery journal mandate (Section 4). All assignments of the subsetting parameters (cm_used and cm_unused) MUST precede the first assignment of a chapter inclusion parameter in the parameter list.

Below, we normatively define the semantics of the chapter inclusion parameters. For clarity, we define the action of parameters on complete chapters. If a parameter is assigned a subset of a chapter, the definition applies only to the chapter subset.

o ch_never. A chapter assigned to the ch_never parameter MUST NOT appear in the recovery journal (Appendix A.4.1-2 defines exceptions to this rule for Chapter M).
To signal the exclusion of a chapter from the journal, an assignment to ch_never MUST be made, even if the commands coded by the chapter are assigned to cm_unused. This rule simplifies the handling of command types that may be coded in several chapters.

o ch_default. A chapter assigned to the ch_default parameter MUST follow the default semantics for the chapter, as defined in Appendices A-B.

o ch_anchor. A chapter assigned to the ch_anchor parameter MUST obey a modified version of the default chapter semantics. In the modified semantics, all references to the checkpoint history are replaced with references to the session history, and all references to the checkpoint packet are replaced with references to the first packet sent in the stream.

Parameter assignments obey the following syntax (see Appendix D for ABNF):

   <parameter> = [channel list]<chapter list>[field list]

The chapter list is mandatory; the channel and field lists are optional. Multiple assignments to parameters have a cumulative effect and are applied in the order of parameter appearance in a media description.

To determine the semantics of a list of chapter inclusion parameter assignments, we begin by assuming an implicit assignment of all channel and system chapters to the ch_default parameter, with the default values for the channel list and field list for each chapter that are defined below. We then interpret the semantics of the actual parameter assignments, using the rules below.
A later assignment of a chapter to the same parameter expands the scope of the earlier assignment. In most cases, a later assignment of a chapter to a different parameter cancels (partially or completely) the effect of an earlier assignment.

The chapter list specifies the channel or system chapters for which the parameter applies. The chapter list is a concatenated sequence of one or more of the letters corresponding to the chapter types (ACDEFMNPQTVWX). In addition, the list may contain one or more of the letters for the sub-chapter types (BGHJKYZ) of System Chapter D.

The letters in a chapter list MUST be uppercase and MUST appear in alphabetical order. Letters other than (ABCDEFGHJKMNPQTVWXYZ) that appear in the chapter list MUST be ignored.

The channel list specifies the channel journals for which this parameter applies; if no channel list is provided, the parameter applies to all channel journals. The channel list takes the form of a list of channel numbers (0 through 15) and dash-separated channel number ranges (e.g., 0-5 or 8-12). Dots (i.e., "." characters) separate elements in the channel list.

Several of the systems chapters may be configured to have special semantics. Configuration occurs by specifying a channel list for the systems channel, using the coding described below (note that MIDI Systems commands do not have a "channel", and thus the original purpose of the channel list does not apply to systems chapters).
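The assignment syntax described above can be illustrated with a minimal sketch that splits a primary-syntax value such as "4.11-13N" into its channel list, chapter list, and field list. The function names (parse_channel_list, split_assignment) are hypothetical and are not defined by this memo. The sketch does not enforce the uppercase or alphabetical-order requirements, does not interpret the special systems-chapter digits, and leaves the field list as raw text because its meaning depends on the chapter.

```python
import re

# Letters the text allows in a chapter list (chapter and sub-chapter types).
CHAPTER_LETTERS = set("ABCDEFGHJKMNPQTVWXYZ")

# Optional channel list, mandatory chapter letters, optional field list.
_ASSIGNMENT = re.compile(r"^([0-9.\-]*)([A-Za-z]+)([0-9.\-]*)$")


def parse_channel_list(text):
    """Expand a channel list such as "4.11-13" into a set of channel numbers."""
    channels = set()
    for element in text.split("."):
        first, dash, last = element.partition("-")
        if dash and not last:
            raise ValueError(f"incomplete range: {element!r}")
        start = int(first)
        end = int(last) if last else start
        if not 0 <= start <= end <= 15:
            raise ValueError(f"bad channel list element: {element!r}")
        channels.update(range(start, end + 1))
    return channels


def split_assignment(value):
    """Split a primary-syntax value into (channels, chapters, field_text).

    channels is None when no channel list is given (all channel journals);
    field_text is None when no field list is given.
    """
    match = _ASSIGNMENT.match(value)
    if match is None:
        raise ValueError(f"not a primary-syntax assignment: {value!r}")
    channel_text, chapter_text, field_text = match.groups()
    # Letters outside the permitted set are ignored, as the text requires.
    chapters = "".join(c for c in chapter_text if c in CHAPTER_LETTERS)
    channels = parse_channel_list(channel_text) if channel_text else None
    return channels, chapters, field_text or None
```

For example, split_assignment("4.11-13N") yields channels {4, 11, 12, 13} with chapter list "N", and split_assignment("C7.64") yields chapter list "C" with the unparsed field list "7.64".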
The expression "the digit N" in the text below refers to the inclusion of N as a "channel" in the channel list for a systems chapter.

For the J and K Chapter D sub-chapters (undefined System Common), the digit 0 codes that the parameter applies to the LEGAL field of the associated command log (Figure B.1.4 of Appendix B.1), the digit 1 codes that the parameter applies to the VALUE field of the command log, and the digit 2 codes that the parameter applies to the COUNT field of the command log.

For the Y and Z Chapter D sub-chapters (undefined System Real-time), the digit 0 codes that the parameter applies to the LEGAL field of the associated command log (Figure B.1.5 of Appendix B.1) and the digit 1 codes that the parameter applies to the COUNT field of the command log.

For Chapter Q (Sequencer State Commands), the digit 0 codes that the parameter applies to the default Chapter Q definition, which forbids the TIME field. The digit 1 codes that the parameter applies to the optional Chapter Q definition, which supports the TIME field.

The syntax for field lists follows the syntax for channel lists. If no field list is provided, the parameter applies to all controller or note numbers. For Chapter C, if no field list is provided, the controller numbers do not use enhanced Chapter C encoding (Appendix A.3.3).

For Chapter C, the field list may take on values in the range 0 to 255. A field value X in the range 0-127 refers to a controller number X, and indicates that the controller number does not use enhanced Chapter C encoding. A field value X in the range 128-255 refers to a controller number "X minus 128" and indicates the controller number does use the enhanced Chapter C encoding.

Assignments made to configure the Chapter C encoding method for a controller number MUST be made to the ch_default or ch_anchor parameters, as assignments to ch_never act to exclude the number from the recovery journal (and thus the indicated encoding method is irrelevant).

A Chapter C field list MUST NOT encode conflicting information about the enhanced encoding status of a particular controller number. For example, values 0 and 128 MUST NOT both be coded by a field list.

For Chapter M, the field list codes the Registered Parameter Numbers (RPNs) and Non-Registered Parameter Numbers (NRPNs) for which the parameter applies. The number range 0-16383 specifies RPNs, the number range 16384-32767 specifies NRPNs (16384 corresponds to NRPN 0, 32767 corresponds to NRPN 16383).

For Chapters N and A, the field list codes the note numbers for which the parameter applies. The note number range specified for Chapter N also applies to Chapter E.

For Chapter E, the digit 0 codes that the parameter applies to Chapter E note logs whose V bit is set to 0, and the digit 1 codes that the parameter applies to note logs whose V bit is set to 1.

For Chapter X, the field list codes the number of data octets that may appear in a SysEx command that is coded in the chapter. Thus, the field list 0-255 specifies SysEx commands with 255 or fewer data octets, the field list 256-4294967295 specifies SysEx commands with more than 255 data octets but excludes commands with 255 or fewer data octets, and the field list 0 excludes all commands.

A secondary parameter assignment syntax customizes Chapter X (see Appendix D for complete ABNF):

   <parameter> = "__" <h-list> ["_" <h-list>] "__"

The assignment defines a class of SysEx commands whose Chapter X coding obeys the semantics of the assigned parameter. The command class is specified by listing the permitted values of the first N data octets that follow the SysEx 0xF0 command octet. Any SysEx command whose first N data octets match the list is a member of the class.

Each <h-list> defines a data octet of the command, as a dot-separated (".") list of one or more hexadecimal constants (such as "7F") or dash-separated hexadecimal ranges (such as "01-1F"). Underscores ("_") separate each <h-list>. Double-underscores ("__") delineate the data octet list.

Using this syntax, each assignment specifies a single SysEx command class. Session descriptions may use several assignments to the same (or different) parameters to specify complex Chapter X behaviors.
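As an illustration of the secondary assignment syntax, the following minimal sketch parses a value such as "__7E_00-7F_09_01.02.03__" into one set of permitted values per data octet, and tests whether a SysEx command belongs to the resulting class. The function names are hypothetical. The sketch assumes that a command with fewer data octets than the class lists is not a member, a case the text above does not spell out.

```python
def parse_sysex_class(value):
    """Parse "__<h-list>_<h-list>...__" into a list of sets of octet values."""
    if len(value) < 5 or not (value.startswith("__") and value.endswith("__")):
        raise ValueError(f"not a secondary-syntax assignment: {value!r}")
    octet_sets = []
    for h_list in value[2:-2].split("_"):      # one <h-list> per data octet
        allowed = set()
        for element in h_list.split("."):      # constants or ranges
            first, _, last = element.partition("-")
            start = int(first, 16)
            end = int(last, 16) if last else start
            allowed.update(range(start, end + 1))
        octet_sets.append(allowed)
    return octet_sets


def in_class(octet_sets, sysex):
    """True if the first N data octets after 0xF0 match the class."""
    if sysex[:1] != b"\xF0":
        return False
    data = sysex[1:]
    if len(data) < len(octet_sets):            # assumption: too short, no match
        return False
    return all(octet in allowed for octet, allowed in zip(data, octet_sets))
```

With this sketch, the class "__7E_00-7F_09_01.02.03__" accepts the General MIDI System On command (F0 7E 7F 09 01 F7) and rejects a command whose third data octet is not 09.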
The ordering behavior of multiple assignments follows the guidelines for chapter parameter assignments described earlier in this section.

The example session description below illustrates the use of the chapter inclusion parameters:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 j_update=open-loop; cm_unused=ABCFGHJKMQTVWXYZ;
   cm_used=__7E_00-7F_09_01.02.03__;
   cm_used=__7F_00-7F_04_01.02__; cm_used=C7.64;
   ch_never=ABCDEFGHJKMQTVWXYZ; ch_never=4.11-13N;
   ch_anchor=P; ch_anchor=C7.64;
   ch_anchor=__7E_00-7F_09_01.02.03__;
   ch_anchor=__7F_00-7F_04_01.02__

(The a=fmtp line has been wrapped to fit the page to accommodate memo formatting restrictions; it comprises a single line in SDP.)

The j_update parameter codes that the stream uses the open-loop policy. Most MIDI command-types are assigned to cm_unused and thus do not appear in the stream. As a consequence, the assignments to the first ch_never parameter reflect that most chapters are not in use.

Chapter N for several MIDI channels is assigned to ch_never. Chapter N for MIDI channels other than 4, 11, 12, and 13 may appear in the recovery journal, using the (default) ch_default semantics.
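Because assignments are applied in order of appearance, an implementation needs to keep the parameter list as an ordered sequence and not as a dictionary. The minimal sketch below splits the example a=fmtp line into ordered (name, value) pairs and checks the rule that cm_used and cm_unused assignments precede the first chapter inclusion parameter. The function names are hypothetical, and the sketch assumes the line has already been unwrapped into a single SDP line.

```python
SUBSETTING = {"cm_used", "cm_unused"}
INCLUSION = {"ch_never", "ch_default", "ch_anchor"}

# The example fmtp line from the session description, as one SDP line.
EXAMPLE_FMTP = (
    "a=fmtp:96 j_update=open-loop; cm_unused=ABCFGHJKMQTVWXYZ; "
    "cm_used=__7E_00-7F_09_01.02.03__; cm_used=__7F_00-7F_04_01.02__; "
    "cm_used=C7.64; ch_never=ABCDEFGHJKMQTVWXYZ; ch_never=4.11-13N; "
    "ch_anchor=P; ch_anchor=C7.64; ch_anchor=__7E_00-7F_09_01.02.03__; "
    "ch_anchor=__7F_00-7F_04_01.02__"
)


def parse_fmtp(line):
    """Return (payload_type, [(name, value), ...]) preserving order."""
    prefix, _, params = line.partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp line")
    payload_type = int(prefix[len("a=fmtp:"):])
    assignments = []
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        assignments.append((name.strip(), value.strip()))
    return payload_type, assignments


def subsetting_precedes_inclusion(assignments):
    """Check that no cm_used/cm_unused follows a chapter inclusion parameter."""
    seen_inclusion = False
    for name, _ in assignments:
        if name in INCLUSION:
            seen_inclusion = True
        elif name in SUBSETTING and seen_inclusion:
            return False
    return True
```

Applied to the example, parse_fmtp returns payload type 96 and eleven assignments, and the ordering check passes.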
In practice, excluding Chapter N for channels 4, 11, 12, and 13 in this way would reflect knowledge about a resilient rendering method in use for those channels.

The MIDI Program Change command and several MIDI Control Change controller numbers are assigned to ch_anchor. Note that placing the ch_anchor Chapter C assignment after the ch_never assignments overrides ch_never for the listed controller numbers (7 and 64).

The assignment of command-type X to cm_unused excludes most SysEx commands from the stream. Exceptions are made for General MIDI System On/Off commands and for the Master Volume and Balance commands, via the use of the secondary assignment syntax. The cm_used assignment codes the exception, and the ch_anchor assignment codes how these commands are protected in Chapter X.

C.3. Configuration Tools: Timestamp Semantics

The MIDI command section of the payload format consists of a list of commands, each with an associated timestamp. The semantics of command timestamps may be set during session configuration, using the parameters we describe in this section.

The parameter "tsmode" specifies the timestamp semantics for a stream. The parameter takes on one of three token values: "comex", "async", or "buffer".
The default "comex" value specifies that timestamps code the execution time for a command (Appendix C.3.1) and supports the accurate transcoding of Standard MIDI Files (SMFs, [MIDI]). The "comex" value is also RECOMMENDED for new MIDI user-interface controller designs. The "async" value specifies an asynchronous timestamp sampling algorithm for time-of-arrival sources (Appendix C.3.2). The "buffer" value specifies a synchronous timestamp sampling algorithm (Appendix C.3.3) for time-of-arrival sources.

Ancillary parameters MAY follow tsmode in a media description. We define these parameters in Appendices C.3.2-3 below.

C.3.1. The comex Algorithm

The default "comex" (COMmand EXecution) tsmode value specifies the execution time for the command. With comex, the difference between two timestamps indicates the time delay between the execution of the commands. This difference may be zero, coding simultaneous execution.

The comex interpretation of timestamps works well for transcoding a Standard MIDI File (SMF, [MIDI]) into an RTP MIDI stream, as SMFs code a timestamp for each MIDI command stored in the file. To transcode an SMF that uses metric time markers, use the SMF tempo map (encoded in the SMF as meta-events) to convert metric SMF timestamp units into seconds-based RTP timestamp units.

New MIDI controller designs (piano keyboard, drum pads, etc.) that support RTP MIDI and that have direct access to sensor data SHOULD use comex interpretation for timestamps, so that simultaneous gestural events may be accurately coded by RTP MIDI.

Comex is a poor choice for transcoding MIDI 1.0 DIN cables [MIDI], for a reason that we will now explain. A MIDI DIN cable carries an asynchronous serial protocol (320 microseconds per MIDI byte). MIDI commands on a DIN cable are not tagged with timestamps. Instead, MIDI DIN receivers infer command timing from the time of arrival of the bytes.
Thus, two two-byte MIDI commands that occur at a source simultaneously are encoded on a MIDI 1.0 DIN cable with a 640 microsecond time offset. A MIDI DIN receiver is unable to tell if this time offset existed in the source performance or is an artifact of the serial speed of the cable. However, the RTP MIDI comex interpretation of timestamps declares that a timestamp offset between two commands reflects the timing of the source performance.

This semantic mismatch is the reason that comex is a poor choice for transcoding MIDI DIN cables. Note that the choice of the RTP timestamp rate (Section 6.1-2 in the main text) cannot fix this inaccuracy issue.

In the sections that follow, we describe two alternative timestamp interpretations ("async" and "buffer") that are a better match to MIDI 1.0 DIN cable timing, and to other MIDI time-of-arrival sources.

The "octpos", "linerate", and "mperiod" ancillary parameters (defined below) SHOULD NOT be used with comex.

C.3.2. The async Algorithm

The "async" tsmode value specifies the asynchronous sampling of a MIDI time-of-arrival source. In asynchronous sampling, the moment an octet is received from a source, it is labelled with a wall-clock time value. The time value has RTP timestamp units.

The "octpos" ancillary parameter defines how RTP command timestamps are derived from octet time values. If octpos has the token value "first", a timestamp codes the time value of the first octet of the command. If octpos has the token value "last", a timestamp codes the time value of the last octet of the command. If the octpos parameter does not appear in the media description, the sender does not know which octet of the command the timestamp references (for example, the sender may be relying on an operating system service that does not specify this information).

The octpos semantics refer to the first or last octet of a command as it appears on a time-of-arrival MIDI source, not as it appears in an RTP MIDI packet. This distinction is significant because the RTP coding may contain octets that are not present in the source. For example, the status octet of the first MIDI command in a packet may have been added to the MIDI stream during transcoding, to comply with the RTP MIDI running status requirements (Section 3.2).
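The octpos rule for asynchronous sampling can be sketched as follows. The sketch takes the wall-clock arrival times of the octets of one command as they appeared on the source, selects the first or last according to octpos, and converts the result to RTP timestamp units. The function names, the use of nanoseconds for arrival times, and the omission of the RTP timestamp offset and 32-bit wraparound are simplifying assumptions.

```python
def to_rtp_units(arrival_ns, clock_rate):
    """Convert a wall-clock time in nanoseconds to RTP timestamp units."""
    # Integer arithmetic, rounding to the nearest unit.
    return (arrival_ns * clock_rate + 500_000_000) // 1_000_000_000


def async_command_timestamp(octet_arrivals_ns, octpos, clock_rate=44100):
    """Derive a command timestamp from source octet arrival times.

    octet_arrivals_ns lists the arrival times of the command's octets on
    the time-of-arrival source, not the octets of the RTP coding.
    """
    if not octet_arrivals_ns:
        raise ValueError("a command has at least one octet")
    if octpos == "first":
        chosen = octet_arrivals_ns[0]
    elif octpos == "last":
        chosen = octet_arrivals_ns[-1]
    else:
        raise ValueError(f"unsupported octpos value: {octpos!r}")
    return to_rtp_units(chosen, clock_rate)
```

For a three-octet command whose octets arrive 320 microseconds apart starting one second into the session, the "first" and "last" choices differ by 28 timestamp units at a 44100 Hz clock rate.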
The "linerate" ancillary parameter defines the timespan of one MIDI octet on the transmission medium of the MIDI source to be sampled (such as a MIDI 1.0 DIN cable). The parameter has units of nanoseconds, and takes on integral values. For MIDI 1.0 DIN cables, the correct linerate value is 320000 (this value is also the default value for the parameter).

We now show a session description example for the async algorithm. Consider a sender that is transcoding a MIDI 1.0 DIN cable source into RTP. The sender runs on a computing platform that assigns time values to every incoming octet of the source, and the sender uses the time values to label the first octet of each command in the RTP packet. This session description describes the transcoding:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100
   a=sendonly
   a=fmtp:96 tsmode=async; linerate=320000; octpos=first

C.3.3. The buffer Algorithm

The "buffer" tsmode value specifies the synchronous sampling of a MIDI time-of-arrival source.

In synchronous sampling, octets received from a source are placed in a holding buffer upon arrival. At periodic intervals, the RTP sender examines the buffer. The sender removes complete commands from the buffer and codes those commands in an RTP packet. The command timestamp codes the moment of buffer examination, expressed in RTP timestamp units.
Note that several commands may have the same timestamp value.

The "mperiod" ancillary parameter defines the nominal periodic sampling interval. The parameter takes on positive integral values and has RTP timestamp units.

The "octpos" ancillary parameter, defined in Appendix C.3.1 for asynchronous sampling, plays a different role in synchronous sampling. In synchronous sampling, the parameter specifies the timestamp semantics of a command whose octets span several sampling periods.

If octpos has the token value "first", the timestamp reflects the arrival period of the first octet of the command. If octpos has the token value "last", the timestamp reflects the arrival period of the last octet of the command. The octpos semantics refer to the first or last octet of the command as it appears on a time-of-arrival source, not as it appears in the RTP packet.

If the octpos parameter does not appear in the media description, the timestamp MAY reflect the arrival period of any octet of the command; senders use this option to signal a lack of knowledge about the timing details of the buffering process at sub-command granularity.

We now show a session description example for the buffer algorithm.
Consider a sender that is transcoding a MIDI 1.0 DIN cable source into RTP. The sender runs on a computing platform that places source data into a buffer upon receipt. The sender polls the buffer 1000 times a second, extracts all complete commands from the buffer, and places the commands in an RTP packet. This session description describes the transcoding:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=sendonly
   a=fmtp:96 tsmode=buffer; linerate=320000; octpos=last; mperiod=44

The mperiod value of 44 is derived by dividing the clock rate specified by the rtpmap attribute (44100 Hz) by the 1000 Hz buffer sampling rate and rounding to the nearest integer. Command timestamps might not increment by exact multiples of 44, as the actual sampling period might not precisely match the nominal mperiod value.

C.4. Configuration Tools: Packet Timing Tools

In this appendix, we describe session configuration tools for customizing the temporal behavior of MIDI stream packets.

C.4.1. Packet Duration Tools

Senders control the granularity of a stream by setting the temporal duration ("media time") of the packets in the stream. Short media times (20 ms or less) often imply an interactive session. Longer media times (100 ms or more) usually indicate a content streaming session. The RTP AVP profile [RFC3551] recommends audio packet media times in a range from 0 to 200 ms.

By default, an RTP receiver dynamically senses the media time of packets in a stream and chooses the length of its playout buffer to match the stream. A receiver typically sizes its playout buffer to fit several audio packets and adjusts the buffer length to reflect the network jitter and the sender timing fidelity.

Alternatively, the packet media time may be statically set during session configuration. Session descriptions MAY use the RTP MIDI parameter "rtp_ptime" to set the recommended media time for a packet. Session descriptions MAY also use the RTP MIDI parameter "rtp_maxptime" to set the maximum media time for a packet permitted in a stream. Both parameters MAY be used together to configure a stream.

The values assigned to the rtp_ptime and rtp_maxptime parameters have the units of the RTP timestamp for the stream, as set by the rtpmap attribute (see Section 6.1). Thus, if rtpmap sets the clock rate of a stream to 44100 Hz, a maximum packet media time of 10 ms is coded by setting rtp_maxptime=441. As stated in the Appendix C preamble, the senders and receivers of a stream MUST agree on common values for rtp_ptime and rtp_maxptime if the parameters appear in the media description for the stream.

0 ms is a reasonable media time value for MIDI packets and is often used in low-latency interactive applications. In a packet with a 0 ms media time, all commands execute at the instant they are coded by the packet timestamp. The session description below configures all packets in the stream to have 0 ms media time:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 rtp_ptime=0; rtp_maxptime=0

The session attributes ptime and maxptime [RFC4566] MUST NOT be used to configure an RTP MIDI stream. Sessions MUST use rtp_ptime in lieu of ptime and MUST use rtp_maxptime in lieu of maxptime. RTP MIDI defines its own parameters for media time configuration because 0 ms values for ptime and maxptime are forbidden by [RFC3264] but are essential for certain applications of RTP MIDI.

See the Appendix C.7 examples for additional discussion about using rtp_ptime and rtp_maxptime for session configuration.
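
The timestamp-unit arithmetic used for mperiod (Appendix C.3.3) and for rtp_ptime and rtp_maxptime (this section) can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of the payload format; the sketch only assumes that the clock rate is the value carried by the rtpmap attribute. Note that Python's round() resolves exact ties toward the even integer, which may differ from the rounding a given implementation uses.

```python
def ms_to_rtp_units(milliseconds: float, clock_rate_hz: int) -> int:
    """Convert a media time in milliseconds to RTP timestamp units.

    Hypothetical helper: rounds to the nearest integer, because the
    rtp_ptime and rtp_maxptime parameters carry timestamp units.
    """
    return round(milliseconds * clock_rate_hz / 1000)


def mperiod_for(clock_rate_hz: int, polls_per_second: int) -> int:
    """Nominal buffer sampling interval in RTP timestamp units.

    Hypothetical helper: divides the rtpmap clock rate by the buffer
    polling rate and rounds to the nearest integer.
    """
    return round(clock_rate_hz / polls_per_second)


CLOCK_RATE_HZ = 44100  # clock rate from "a=rtpmap:96 rtp-midi/44100"

print(mperiod_for(CLOCK_RATE_HZ, 1000))    # buffer polled 1000 times a second
print(ms_to_rtp_units(10, CLOCK_RATE_HZ))  # 10 ms maximum media time
print(ms_to_rtp_units(0, CLOCK_RATE_HZ))   # 0 ms media time
```

With the 44100 Hz clock rate of the examples, the three calls reproduce the values mperiod=44, rtp_maxptime=441, and rtp_ptime=0 that appear in the session descriptions.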

C.4.2. The guardtime Parameter

RTP permits a sender to stop sending audio packets for an arbitrary period of time during a session. When sending resumes, the RTP sequence number series continues unbroken, and the RTP timestamp value reflects the media time silence gap.

This RTP feature has its roots in telephony, but it is also well matched to interactive MIDI sessions, as players may fall silent for several seconds during (or between) songs.

Certain MIDI applications benefit from a slight enhancement to this RTP feature. In interactive applications, receivers may use on-line network models to guide heuristics for handling lost and late RTP packets. These models may work poorly if a sender ceases packet transmission for long periods of time.

Session descriptions may use the parameter "guardtime" to set a minimum sending rate for a media session. The value assigned to guardtime codes the maximum separation time between two sequential packets, as expressed in RTP timestamp units.

Typical guardtime values are 500-2000 ms. This value range is not a normative bound, and parties SHOULD be prepared to process values outside this range.

The congestion control requirements for sender implementations (described in Section 8 and [RFC3550]) take precedence over the guardtime parameter. Thus, if the guardtime parameter requests a minimum sending rate, but sending at this rate would violate the congestion control requirements, senders MUST ignore the guardtime parameter value. In this case, senders SHOULD use the lowest minimum sending rate that satisfies the congestion control requirements.

Below, we show a session description that uses the guardtime parameter.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 guardtime=44100; rtp_ptime=0; rtp_maxptime=0

C.5. Configuration Tools: Stream Description

As we discussed in Section 2.1, a party may send several RTP MIDI streams in the same RTP session, and several RTP sessions that carry MIDI may appear in a multimedia session.

By default, the MIDI name space (16 channels + systems) of each RTP stream sent by a party in a multimedia session is independent. By independent, we mean three distinct things:

   o  If a party sends two RTP MIDI streams (A and B), MIDI voice channel 0 in stream A is a different "channel 0" than MIDI voice channel 0 in stream B.

   o  MIDI voice channel 0 in stream B is not considered to be "channel 16" of a 32-channel MIDI voice channel space whose "channel 0" is channel 0 of stream A.

   o  Streams sent by different parties over different RTP sessions, or over the same RTP session but with different payload type numbers, do not share the association that is shared by a MIDI cable pair that cross-connects two devices in a MIDI 1.0 DIN network. By default, this association is only held by streams sent by different parties in the same RTP session that use the same payload type number.

In this appendix, we show how to express that specific RTP MIDI streams in a multimedia session are not independent but instead are related in one of the three ways defined above. We use two tools to express these relations:

   o  The musicport parameter. This parameter is assigned a non-negative integer value between 0 and 4294967295. It appears in the fmtp lines of payload types.

   o  The FID grouping attribute [RFC3388] signals that several RTP sessions in a multimedia session are using the musicport parameter to express an inter-session relationship.

If a multimedia session has several payload types whose musicport parameters are assigned the same integer value, streams using these payload types share an "identity relationship" (including streams that use the same payload type). Streams in an identity relationship share two properties:

   o  Identity relationship streams sent by the same party target the same MIDI name space. Thus, if streams A and B share an identity relationship, voice channel 0 in stream A is the same "channel 0" as voice channel 0 in stream B.

   o  Pairs of identity relationship streams that are sent by different parties share the association that is shared by a MIDI cable pair that cross-connects two devices in a MIDI 1.0 DIN network.

A party MUST NOT send two RTP MIDI streams that share an identity relationship in the same RTP session. Instead, each stream MUST be in a separate RTP session. As explained in Section 2.1, this restriction is necessary to support the RTP MIDI method for the synchronization of streams that share a MIDI name space.

If a multimedia session has several payload types whose musicport parameters are assigned sequential values (i.e., i, i+1, ... i+k), the streams using the payload types share an "ordered relationship". For example, if payload type A assigns 2 to musicport and payload type B assigns 3 to musicport, A and B are in an ordered relationship.

Streams in an ordered relationship that are sent by the same party are considered by renderers to form a single larger MIDI space. For example, if stream A has a musicport value of 2 and stream B has a musicport value of 3, MIDI voice channel 0 in stream B is considered to be voice channel 16 in the larger MIDI space formed by the relationship. Note that it is possible for streams to participate in both an identity relationship and an ordered relationship.

We now state several rules for using musicport:

   o  If streams from several RTP sessions in a multimedia session use the musicport parameter, the RTP sessions MUST be grouped using the FID grouping attribute defined in [RFC3388].

   o  An ordered or identity relationship MUST NOT contain both native RTP MIDI streams and mpeg4-generic RTP MIDI streams. An exception applies if a relationship consists of sendonly and recvonly (but not sendrecv) streams. In this case, the sendonly streams MUST NOT contain both types of streams, and the recvonly streams MUST NOT contain both types of streams.

   o  It is possible to construct identity relationships that violate the recovery journal mandate (for example, sending NoteOns for a voice channel on stream A and NoteOffs for the same voice channel on stream B). Parties MUST NOT generate (or accept) session descriptions that exhibit this flaw.

   o  Other payload formats MAY define musicport media type parameters. Formats would define these parameters so that their sessions could be bundled into RTP MIDI name spaces. The parameter definitions MUST be compatible with the musicport semantics defined in this appendix.
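
The musicport relationships defined above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names and the dictionary that maps payload-type labels to musicport values are hypothetical, and the sketch does not model the case of several streams that use the same payload type. The channel arithmetic follows the example in the text, in which voice channel 0 of the stream with the next sequential musicport value becomes voice channel 16 of the larger MIDI space.

```python
from collections import defaultdict


def identity_groups(musicports: dict[str, int]) -> list[list[str]]:
    """Group payload-type labels that are assigned the same musicport value."""
    by_value: dict[int, list[str]] = defaultdict(list)
    for label, value in musicports.items():
        by_value[value].append(label)
    # Only values shared by two or more payload types form a group here.
    return [sorted(labels) for _, labels in sorted(by_value.items())
            if len(labels) > 1]


def ordered_runs(musicports: dict[str, int]) -> list[list[int]]:
    """Return runs of sequential musicport values (i, i+1, ..., i+k), k >= 1."""
    runs: list[list[int]] = []
    current: list[int] = []
    for value in sorted(set(musicports.values())):
        if current and value == current[-1] + 1:
            current.append(value)
        else:
            if len(current) > 1:
                runs.append(current)
            current = [value]
    if len(current) > 1:
        runs.append(current)
    return runs


def extended_channel(musicport: int, run_base: int, voice_channel: int) -> int:
    """Map a voice channel (0-15) into the larger space of an ordered run.

    run_base is the lowest musicport value in the run (an assumption of
    this sketch, matching the examples in the text).
    """
    if not 0 <= voice_channel <= 15:
        raise ValueError("voice channel must be in the range 0-15")
    return 16 * (musicport - run_base) + voice_channel


print(identity_groups({"A": 12, "B": 12, "C": 5}))
print(ordered_runs({"A": 2, "B": 3}))
print(extended_channel(musicport=3, run_base=2, voice_channel=0))
```

A real implementation would also have to check the rules listed above (FID grouping, no mixing of native and mpeg4-generic streams, and the recovery journal mandate), which this sketch does not attempt.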

As a rule, at most one payload type in a relationship may specify a MIDI renderer. An exception to the rule applies to relationships that contain sendonly and recvonly streams but no sendrecv streams. In this case, one sendonly session and one recvonly session may each define a renderer.

Renderer specification in a relationship may be done using the tools described in Appendix C.6. These tools work for both native streams and mpeg4-generic streams. An mpeg4-generic stream that uses the Appendix C.6 tools MUST set all "config" parameters to the empty string ("").

Alternatively, for mpeg4-generic streams, renderer specification may be done by setting one "config" parameter in the relationship to the renderer configuration string, and all other config parameters to the empty string ("").

We now define sender and receiver rules that apply when a party sends several streams that target the same MIDI name space.

Senders MAY use the subsetting parameters (Appendix C.1) to predefine the partitioning of commands between streams, or they MAY use a dynamic partitioning strategy.

Receivers that merge identity relationship streams into a single MIDI command stream MUST maintain the structural integrity of the MIDI commands coded in each stream during the merging process, in the same way that software that merges traditional MIDI 1.0 DIN cable flows is responsible for creating a merged command flow compatible with [MIDI].

Senders MUST partition the name space so that the rendered MIDI performance does not contain indefinite artifacts (as defined in Section 4). This responsibility holds even if all streams are sent over reliable transport, as different stream latencies may yield indefinite artifacts. For example, stuck notes may occur in a performance split over two TCP streams, if NoteOn commands are sent on one stream and NoteOff commands are sent on the other.

Senders MUST NOT split a Registered Parameter Name (RPN) or Non-Registered Parameter Name (NRPN) transaction appearing on a MIDI channel across multiple identity relationship sessions. Receivers MUST assume that the RPN/NRPN transactions that appear on different identity relationship sessions are independent and MUST preserve transactional integrity during the MIDI merge.

A simple way to safely partition voice channel commands is to place all MIDI commands for a particular voice channel into the same session. Safe partitioning of MIDI Systems commands may be more complicated for sessions that extensively use System Exclusive.

We now show several session description examples that use the musicport parameter.

Our first session description example shows two RTP MIDI streams that drive the same General MIDI decoder. The sender partitions MIDI commands between the streams dynamically. The musicport values indicate that the streams share an identity relationship.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   a=group:FID 1 2
   c=IN IP4 192.0.2.94
   m=audio 5004 RTP/AVP 96
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:1
   a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
   config=7A0A0000001A4D546864000000060000000100604D54726B0
   000000600FF2F000; musicport=12
   m=audio 5006 RTP/AVP 96
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:2
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; musicport=12

(The a=fmtp lines have been wrapped to fit the page to accommodate memo formatting restrictions; they comprise single lines in SDP.)

Recall that Section 2.1 defines rules for streams that target the same MIDI name space. Those rules, implemented in the example above, require that each stream resides in a separate RTP session, and that the grouping mechanisms defined in [RFC3388] signal an inter-session relationship. The "group" and "mid" attribute lines implement this grouping mechanism.

A variant on this example, whose session description is not shown, would use two streams in an identity relationship driving the same MIDI renderer, each with a different transport type. One stream would use UDP and would be dedicated to real-time messages. A second stream would use TCP [RFC4571] and would be used for SysEx bulk data messages.

In the next example, two mpeg4-generic streams form an ordered relationship to drive a Structured Audio decoder with 32 MIDI voice channels. Both streams reside in the same RTP session.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5006 RTP/AVP 96 97
   c=IN IP6 2001:DB80::7F2E:172A:1E24
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=13; musicport=5
   a=rtpmap:97 mpeg4-generic/44100
   a=fmtp:97 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=13; musicport=6; render=synthetic;
   rinit="audio/asc";
   url="http://example.com/cardinal.asc";
   cid="azsldkaslkdjqpwojdkmsldkfpe"

(The a=fmtp lines have been wrapped to fit the page to accommodate memo formatting restrictions; they comprise single lines in SDP.)

The sequential musicport values for the two sessions establish the ordered relationship.
The musicport=5 session maps to Structured Audio extended channels range 0-15; the musicport=6 session maps to Structured Audio extended channels range 16-31.

Both config strings are empty. The configuration data is specified by parameters that appear in the fmtp line of the second media description. We define this configuration method in Appendix C.6.

The next example shows two RTP MIDI streams (one recvonly, one sendonly) that form a "virtual sendrecv" session. Each stream resides in a different RTP session (a requirement because sendonly and recvonly are RTP session attributes).

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   a=group:FID 1 2
   c=IN IP4 192.0.2.94
   m=audio 5004 RTP/AVP 96
   a=sendonly
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:1
   a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
   config=7A0A0000001A4D546864000000060000000100604D54726B0
   000000600FF2F000; musicport=12
   m=audio 5006 RTP/AVP 96
   a=recvonly
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:2
   a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
   config=7A0A0000001A4D546864000000060000000100604D54726B0
   000000600FF2F000; musicport=12

(The a=fmtp lines have been wrapped to fit the page to accommodate memo formatting restrictions; they comprise single lines in SDP.)

To signal the "virtual sendrecv" semantics, the two streams assign musicport to the same value (12). As defined earlier in this section, pairs of identity relationship streams that are sent by different parties share the association that is shared by a MIDI cable pair that cross-connects two devices in a MIDI 1.0 network. We use the term "virtual sendrecv" because streams sent by different parties in a true sendrecv session also have this property.

As discussed in the preamble to Appendix C, the primary advantage of the virtual sendrecv configuration is that each party can customize the property of the stream it receives.
In the example above, each stream defines its own "config" string that could customize the rendering algorithm for each party (in fact, the particular strings shown in this example are identical, because General MIDI is not a configurable MPEG 4 renderer).

C.6. Configuration Tools: MIDI Rendering

This appendix defines the session configuration tools for rendering.

The "render" parameter specifies a rendering method for a stream. The parameter is assigned a token value that signals the top-level rendering class. This memo defines four token values for render: "unknown", "synthetic", "api", and "null":

o An "unknown" renderer is a renderer whose nature is unspecified. It is the default renderer for native RTP MIDI streams.

o A "synthetic" renderer transforms the MIDI stream into audio output (or sometimes into stage lighting changes or other actions). It is the default renderer for mpeg4-generic RTP MIDI streams.

o An "api" renderer presents the command stream to applications via an Application Programming Interface (API).

o The "null" renderer discards the MIDI stream.

The "null" render value plays special roles during Offer/Answer negotiations [RFC3264].
A party uses the "null" value in an answer to reject an offered renderer. Note that rejecting a renderer is independent from rejecting a payload type (coded by removing the payload type from a media line) and rejecting a media stream (coded by zeroing the port of a media line that uses the renderer).

Other render token values MAY be registered with IANA. The token value MUST adhere to the ABNF for render tokens defined in Appendix D. Registrations MUST include a complete specification of parameter value usage, similar in depth to the specifications that appear throughout Appendix C.6 for "synthetic" and "api" render values. If a party is offered a session description that uses a render token value that is not known to the party, the party MUST NOT accept the renderer. Options include rejecting the renderer (using the "null" value), the payload type, the media stream, or the session description.

Other parameters MAY follow a render parameter in a parameter list.
The additional parameters act to define the exact nature of the renderer. For example, the "subrender" parameter (defined in Appendix C.6.2) specifies the exact nature of the renderer.

Special rules apply to using the render parameter in an mpeg4-generic stream. We define these rules in Appendix C.6.5.

C.6.1. The multimode Parameter

A media description MAY contain several render parameters. By default, if a parameter list includes several render parameters, a receiver MUST choose exactly one renderer from the list to render the stream.

The "multimode" parameter may be used to override this default. We define two token values for multimode: "one" and "all":

o The default "one" value requests rendering by exactly one of the listed renderers.

o The "all" value requests the synchronized rendering of the RTP MIDI stream by all listed renderers, if possible.

If the multimode parameter appears in a parameter list, it MUST appear before the first render parameter assignment.

Render parameters appear in the parameter list in order of decreasing priority.
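
The selection rules above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the payload format: the function names are hypothetical, the token "foo" is a placeholder subrender value, the parser assumes that no quoted parameter value contains a semicolon, and choosing the first supported renderer for multimode "one" is only one possible receiver policy.

```python
def parse_parameter_list(fmtp_params: str) -> list[tuple[str, str]]:
    """Split 'name=value; name=value' into ordered pairs.

    Order is preserved because render parameters are priority ordered.
    Assumption: quoted values never contain ';'.
    """
    pairs = []
    for item in fmtp_params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        pairs.append((name.strip(), value.strip().strip('"')))
    return pairs


def render_choices(pairs: list[tuple[str, str]]) -> tuple[str, list[str]]:
    """Return (multimode value, render tokens in priority order)."""
    multimode = "one"  # default when the parameter is absent
    renders: list[str] = []
    for name, value in pairs:
        if name == "multimode":
            if renders:
                raise ValueError("multimode must precede the first render parameter")
            multimode = value
        elif name == "render":
            renders.append(value)
    return multimode, renders


def answer_render_values(offered: list[str], supported: set[str],
                         multimode: str = "one") -> list[str]:
    """Build the render values of an answer, rejecting renderers with "null".

    Hypothetical policy: with multimode "one", keep the highest-priority
    supported renderer; with "all", keep every supported renderer.
    """
    answer = []
    chosen_one = False
    for token in offered:
        acceptable = token in supported and token != "null"
        if multimode == "one" and chosen_one:
            acceptable = False
        answer.append(token if acceptable else "null")
        chosen_one = chosen_one or acceptable
    return answer
```

For example, parsing the parameter list "multimode=all; render=synthetic; subrender=foo; render=api" yields the multimode value "all" and the render list ["synthetic", "api"]; a receiver that supports only "api" would answer with ["null", "api"].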
A receiver MAY use the priority ordering to decide which renderer(s) to retain in a session. If the "offer" in an Offer/Answer-style negotiation [RFC3264] contains a parameter list with one or more render parameters, the "answer" MUST set the render parameters of all unchosen renderers to "null".

C.6.2. Renderer Specification

The render parameter (Appendix C.6 preamble) specifies, in a broad sense, what a renderer does with a MIDI stream. In this appendix, we describe the "subrender" parameter. The token value assigned to subrender defines the exact nature of the renderer. Thus, "render" and "subrender" combine to define a renderer, in the same way as MIME types and MIME subtypes combine to define a type of media [RFC2045].

If the subrender parameter is used for a renderer definition, it MUST appear immediately after the render parameter in the parameter list. At most one subrender parameter may appear in a renderer definition.

This document defines one value for subrender: the value "default".
The "default" token specifies the use of the default renderer for the stream type (native or mpeg4-generic). The default renderer for native RTP MIDI streams is a renderer whose nature is unspecified (see point 6 in Section 6.1 for details). The default renderer for mpeg4-generic RTP MIDI streams is an MPEG 4 Audio Object Type whose ID number is 13, 14, or 15 (see Section 6.2 for details).

If a renderer definition does not use the subrender parameter, the value "default" is assumed for subrender.

Other subrender token values may be registered with IANA. We now discuss guidelines for registering subrender values.

A subrender value is registered for a specific stream type (native or mpeg4-generic) and a specific render value (excluding "null" and "unknown"). Registrations for mpeg4-generic subrender values are restricted to new MPEG 4 Audio Object Types that accept MIDI input.
The syntax of the token MUST adhere to the token definition in Appendix D.

For "render=synthetic" renderers, a subrender value registration specifies an exact method for transforming the MIDI stream into audio (or sometimes into video or control actions, such as stage lighting). For standardized renderers, this specification is usually a pointer to a standards document, perhaps supplemented by RTP-MIDI-specific information. For commercial products and open-source projects, this specification usually takes the form of instructions for interfacing the RTP MIDI stream with the product or project software. A "render=synthetic" registration MAY specify additional Reset State commands for the renderer (Appendix A.1).

A "render=api" subrender value registration specifies how an RTP MIDI stream interfaces with an API (Application Programming Interface). This specification is usually a pointer to programmer's documentation for the API, perhaps supplemented by RTP-MIDI-specific information.

A subrender registration MAY specify an initialization file (referred to in this document as an initialization data object) for the stream. The initialization data object MAY be encoded in the parameter list (verbatim or by reference) using the coding tools defined in Appendix C.6.3. An initialization data object MUST have a registered [RFC4288] media type and subtype [RFC2045].

For "render=synthetic" renderers, the data object usually encodes initialization data for the renderer (sample files, synthesis patch parameters, reverberation room impulse responses, etc.).

For "render=api" renderers, the data object usually encodes data about the stream used by the API (for example, for an RTP MIDI stream generated by a piano keyboard controller, the manufacturer and model number of the keyboard, for use in GUI presentation).

Usually, only one initialization object is encoded for a renderer. If a renderer uses multiple data objects, the correct receiver interpretation of multiple data objects MUST be defined in the subrender registration.

A subrender value registration may also specify additional parameters, to appear in the parameter list immediately after subrender. These parameter names MUST begin with the subrender value, followed by an underscore ("_"), to avoid name space collisions with future RTP MIDI parameter names (for example, a parameter "foo_bar" defined for subrender value "foo").

We now specify guidelines for interpreting the subrender parameter during session configuration.

If a party is offered a session description that uses a renderer whose subrender value is not known to the party, the party MUST NOT accept the renderer.
Options include rejecting the renderer (using the "null" value), the payload type, the media stream, or the session description.

Receivers MUST be aware of the Reset State commands (Appendix A.1) for the renderer specified by the subrender parameter and MUST ensure that the renderer does not experience indefinite artifacts due to the presence (or the loss) of a Reset State command.

C.6.3. Renderer Initialization

If the renderer for a stream uses an initialization data object, an "rinit" parameter MUST appear in the parameter list immediately after the "subrender" parameter.
If the renderer parameter list does not include a subrender parameter (recall the semantics for "default" in Appendix C.6.2), the "rinit" parameter MUST appear immediately after the "render" parameter.

The value assigned to the rinit parameter MUST be the media type/subtype [RFC2045] for the initialization data object. If an initialization object type is registered with several media types, including audio, the assignment to rinit MUST use the audio media type.

RTP MIDI supports several parameters for encoding initialization data objects for renderers in the parameter list: "inline", "url", and "cid".

If the "inline", "url", and/or "cid" parameters are used by a renderer, these parameters MUST immediately follow the "rinit" parameter.

If a "url" parameter appears for a renderer, an "inline" parameter MUST NOT appear. If an "inline" parameter appears for a renderer, a "url" parameter MUST NOT appear. However, neither "url" nor "inline" is required to appear. If neither a "url" nor an "inline" parameter follows "rinit", the "cid" parameter MUST follow "rinit".

The "inline" parameter supports the inline encoding of the data object. The parameter is assigned a double-quoted Base64 [RFC2045] encoding of the binary data object, with no line breaks.
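
As an illustration of the coding rules above, the following Python sketch assembles the initialization parameters for one data object. It is a hedged example rather than a normative encoder: the function name is hypothetical, the media type "audio/example" and the four data octets are placeholders, and placing "cid" last in the group is a choice of this sketch.

```python
import base64
from typing import Optional


def init_object_params(media_type: str,
                       data: Optional[bytes] = None,
                       url: Optional[str] = None,
                       cid: Optional[str] = None) -> str:
    """Return 'rinit=...' followed by inline/url/cid for one data object."""
    if data is not None and url is not None:
        raise ValueError('"inline" and "url" are mutually exclusive')
    if data is None and url is None and cid is None:
        raise ValueError('"cid" is required when neither "inline" nor "url" is present')

    params = [f'rinit="{media_type}"']
    if data is not None:
        # b64encode emits a single line, satisfying "no line breaks".
        encoded = base64.b64encode(data).decode("ascii")
        params.append(f'inline="{encoded}"')
    if url is not None:
        params.append(f'url="{url}"')
    if cid is not None:
        params.append(f'cid="{cid}"')
    return "; ".join(params)


# Placeholder payload: the four octets "MThd".
print(init_object_params("audio/example", data=b"MThd"))
```

The returned string is meant to be spliced into a renderer's parameter list directly after its "render" (or "subrender") parameter.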
Appendix E.4 shows an example that constructs an inline parameter value.

The "url" parameter is assigned a double-quoted string representation of a Uniform Resource Locator (URL) for the data object. The string MUST specify a Hypertext Transfer Protocol URL (HTTP, [RFC2616]). HTTP MAY be used over TCP or MAY be used over a secure network transport, such as the method described in [RFC2818]. The media type/subtype for the data object SHOULD be specified in the appropriate HTTP transport header.

The "cid" parameter supports data object caching. The parameter is assigned a double-quoted string value that encodes a globally unique identifier for the data object.

A cid parameter MAY immediately follow an inline parameter, in which case the cid identifier value MUST be associated with the inline data object.

If a url parameter is present, and if the data object for the URL is expected to be unchanged for the life of the URL, a cid parameter MAY immediately follow the url parameter. The cid identifier value MUST be associated with the data object for the URL. A cid parameter assigned to the same identifier value SHOULD be specified following the data object type/subtype in the appropriate HTTP transport header.

If a url parameter is present, and if the data object for the URL is expected to change during the life of the URL, a cid parameter MUST NOT follow the url parameter. A receiver interprets the presence of a cid parameter as an indication that it is safe to use a cached copy of the url data object; the absence of a cid parameter is an indication that it is not safe to use a cached copy, as it may change.

Finally, the cid parameter MAY be used without the inline and url parameters. In this case, the identifier references a local or distributed catalog of data objects.

In most cases, only one data object is coded in the parameter list for each renderer. For example, the default renderer for mpeg4-generic streams uses a single data object (see Appendix C.6.5 for example usage).

However, a subrender registration MAY permit the use of multiple data objects for a renderer. If multiple data objects are encoded for a renderer, each object encoding begins with an "rinit" parameter, followed by "inline", "url", and/or "cid" parameters.

Initialization data objects MAY encapsulate a Standard MIDI File (SMF). By default, the SMFs that are encapsulated in a data object MUST be ignored by an RTP MIDI receiver. We define parameters to override this default in Appendix C.6.4.

To end this section, we offer guidelines for registering media types for initialization data objects. These guidelines are in addition to the information in [RFC4288] [RFC4289].

Some initialization data objects are also capable of encoding MIDI note information and thus complete audio performances. These objects SHOULD be registered using the "audio" media type, so that the objects may also be used for store-and-forward rendering, and the "application" media type, to support editing tools.
Initialization objects without note storage, or initialization objects for non-audio renderers, SHOULD be registered only for an "application" media type.

C.6.4. MIDI Channel Mapping

In this appendix, we specify how to map MIDI name spaces (16 voice channels + systems) onto a renderer.

In the general case:

o A session may define an ordered relationship (Appendix C.5) that presents more than one MIDI name space to a renderer.

o A renderer may accept an arbitrary number of MIDI name spaces, or it may expect a specific number of MIDI name spaces.

A session description SHOULD provide a compatible MIDI name space to each renderer in the session. If a receiver detects that a session description has too many or too few MIDI name spaces for a renderer, MIDI data from extra stream name spaces MUST be discarded, and extra renderer name spaces MUST NOT be driven with MIDI data (except as described in Appendix C.6.4.1, below).

If a parameter list defines several renderers and assigns the "all" token value to the multimode parameter, the same name space is presented to each renderer. However, the "chanmask" parameter may be used to mask out selected voice channels to each renderer.
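
The mismatch rule above (discard extra stream name spaces, leave extra renderer name spaces undriven) can be pictured with a small Python sketch. The function name and the returned dictionary keys are hypothetical, the positional pairing of name spaces (first with first, second with second) is an assumption of this sketch, and the SMF exception of Appendix C.6.4.1 is not modeled.

```python
def map_name_spaces(stream_spaces: int, renderer_spaces: int) -> dict:
    """Pair stream name spaces with renderer name spaces by position.

    Indices are zero-based. Surplus name spaces on either side are
    reported but never paired.
    """
    driven = min(stream_spaces, renderer_spaces)
    return {
        # (stream index, renderer index) pairs that carry MIDI data
        "mapped": [(i, i) for i in range(driven)],
        # stream name spaces whose MIDI data is discarded
        "discarded_stream_spaces": list(range(driven, stream_spaces)),
        # renderer name spaces that receive no MIDI data
        "undriven_renderer_spaces": list(range(driven, renderer_spaces)),
    }


# Three stream name spaces offered to a renderer that expects two.
print(map_name_spaces(3, 2))
```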
We define "chanmask" and other MIDI management parameters in the sub-sections below.

C.6.4.1. The smf_info Parameter

The smf_info parameter defines the use of the SMFs encapsulated in renderer data objects (if any). The smf_info parameter also defines the use of SMFs coded in the smf_inline, smf_url, and smf_cid parameters (defined in Appendix C.6.4.2).

The smf_info parameter describes the "render" parameter that most recently precedes it in the parameter list. The smf_info parameter MUST NOT appear in parameter lists that do not use the "render" parameter, and MUST NOT appear before the first use of "render" in the parameter list.

We define three token values for smf_info: "ignore", "sdp_start", and "identity":

o The "ignore" value indicates that the SMFs MUST be discarded. This behavior is the default SMF rendering behavior.

o The "sdp_start" value codes that SMFs MUST be rendered, and that the rendering MUST begin upon the acceptance of the session description.
  If a receiver is offered a session description with a renderer that uses an smf_info parameter set to sdp_start, and if the receiver does not support rendering SMFs, the receiver MUST NOT accept the renderer associated with the smf_info parameter. Options include rejecting the renderer (by setting the "render" parameter to "null"), the payload type, the media stream, or the entire session description.

o The "identity" value indicates that the SMFs code the identity of the renderer. The value is meant for use with the "unknown" renderer (see Appendix C.6 preamble). The MIDI commands coded in the SMF are informational in nature and MUST NOT be presented to a renderer for audio presentation. In typical use, the SMF would use SysEx Identity Reply commands (F0 7E nn 06 02, as defined in [MIDI]) to identify devices, and use device-specific SysEx commands to describe the current state of the devices (patch memory contents, etc.).

Other smf_info token values MAY be registered with IANA. The token value MUST adhere to the ABNF for render tokens defined in Appendix D. Registrations MUST include a complete specification of parameter usage, similar in depth to the specifications that appear in this appendix for "sdp_start" and "identity".

If a party is offered a session description that uses an smf_info parameter value that is not known to the party, the party MUST NOT accept the renderer associated with the smf_info parameter. Options include rejecting the renderer, the payload type, the media stream, or the entire session description.

We now define the rendering semantics for the "sdp_start" token value in detail.

The SMFs and RTP MIDI streams in a session description share the same MIDI name space(s). In the simple case of a single RTP MIDI stream and a single SMF, the SMF MIDI commands and RTP MIDI commands are merged into a single name space and presented to the renderer. The indefinite artifact responsibilities for merged MIDI streams defined in Appendix C.5 also apply to merging RTP and SMF MIDI data.

If a payload type codes multiple SMFs, the SMF name spaces are presented as an ordered entity to the renderer. To determine the ordering of SMFs for a renderer (which SMF is "first", which is "second", etc.), use the following rules:

o If the renderer uses a single data object, the order of appearance of the SMFs in the object's internal structure defines the order of the SMFs (the earliest SMF in the object is "first", the next SMF in the object is "second", etc.).

o If multiple data objects are encoded for a renderer, the appearance of each data object in the parameter list sets the relative order of the SMFs encoded in each data object (SMFs encoded in parameters that appear earlier in the list are ordered before SMFs encoded in parameters that appear later in the list).

o If SMFs are encoded in data object parameters and in the parameters defined in C.6.4.2, the relative order of the data object parameters and C.6.4.2 parameters in the parameter list sets the relative order of SMFs (SMFs encoded in parameters that appear earlier in the list are ordered before SMFs in parameters that appear later in the list).

Given this ordering of SMFs, we now define the mapping of SMFs to renderer name spaces. The SMF that appears first for a renderer maps to the first renderer name space. The SMF that appears second for a renderer maps to the second renderer name space, etc. If the associated RTP MIDI streams also form an ordered relationship, the first SMF is merged with the first name space of the relationship, the second SMF is merged with the second name space of the relationship, etc.

Unless the streams and the SMFs both use MIDI Time Code, the time offset between SMF and stream data is unspecified. This restriction limits the use of SMFs to applications where synchronization is not critical, such as the transport of System Exclusive commands for renderer initialization, or human-SMF interactivity.

Finally, we note that each SMF in the sdp_start discussion above encodes exactly one MIDI name space (16 voice channels + systems).
Thus, the use of the Device Name SMF meta event to specify several MIDI name spaces in an SMF is not supported for sdp_start.

C.6.4.2. The smf_inline, smf_url, and smf_cid Parameters

In some applications, the renderer data object may not encapsulate SMFs, but an application may wish to use SMFs in the manner defined in Appendix C.6.4.1.

The "smf_inline", "smf_url", and "smf_cid" parameters address this situation. These parameters use the syntax and semantics of the inline, url, and cid parameters defined in Appendix C.6.3, except that the encoded data object is an SMF.

The "smf_inline", "smf_url", and "smf_cid" parameters belong to the "render" parameter that most recently precedes them in the session description. The "smf_inline", "smf_url", and "smf_cid" parameters MUST NOT appear in parameter lists that do not use the "render" parameter and MUST NOT appear before the first use of "render" in the parameter list. If several "smf_inline", "smf_url", or "smf_cid" parameters appear for a renderer, the order of the parameters defines the SMF name space ordering.

C.6.4.3. The chanmask Parameter

The chanmask parameter instructs the renderer to ignore all MIDI voice commands for certain channel numbers. The parameter value is a concatenated string of "1" and "0" digits. Each string position maps to a MIDI voice channel number (system channels may not be masked). A "1" instructs the renderer to process the voice channel; a "0" instructs the renderer to ignore the voice channel.

The string length of the chanmask parameter value MUST be 16 (for a single stream or an identity relationship) or a multiple of 16 (for an ordered relationship).

The chanmask parameter describes the "render" parameter that most recently precedes it in the session description; chanmask MUST NOT appear in parameter lists that do not use the "render" parameter and MUST NOT appear before the first use of "render" in the parameter list.

The chanmask parameter describes the final MIDI name spaces presented to the renderer. The SMF and stream components of the MIDI name spaces may not be independently masked.

If a receiver is offered a session description with a renderer that uses the chanmask parameter, and if the receiver does not implement the semantics of the chanmask parameter, the receiver MUST NOT accept the renderer unless the chanmask parameter value contains only "1"s.

C.6.5. The audio/asc Media Type

In Section 11.3, we register the audio/asc media type. The data object for audio/asc is a binary encoding of the AudioSpecificConfig data block used to initialize mpeg4-generic streams (Section 6.2 and [MPEGAUDIO]).

An mpeg4-generic parameter list MAY use the render, subrender, and rinit parameters with the audio/asc media type for renderer configuration. Several restrictions apply to the use of these parameters in mpeg4-generic parameter lists:

o  An mpeg4-generic media description that uses the render parameter MUST assign the empty string ("") to the mpeg4-generic "config" parameter. The use of the streamtype, mode, and profile-level-id parameters MUST follow the normative text in Section 6.2.

o  Sessions that use identity or ordered relationships MUST follow the mpeg4-generic configuration restrictions in Appendix C.5.

o  The render parameter MUST be assigned the value "synthetic", "unknown", "null", or a render value that has been added to the IANA repository for use with mpeg4-generic RTP MIDI streams. The "api" token value for render MUST NOT be used.

o  If a subrender parameter is present, it MUST immediately follow the render parameter, and it MUST be assigned the token value "default" or assigned a subrender value added to the IANA repository for use with mpeg4-generic RTP MIDI streams. A subrender parameter assignment may be left out of the renderer configuration, in which case the implied value of subrender is the default value of "default".

o  If the render parameter is assigned the value "synthetic" and the subrender parameter has the value "default" (assigned or implied), the rinit parameter MUST be assigned the value "audio/asc", and an AudioSpecificConfig data object MUST be encoded using the mechanisms defined in C.6.2-3. The AudioSpecificConfig data MUST encode one of the MPEG 4 Audio Object Types defined for use with mpeg4-generic in Section 6.2. If the subrender value is other than "default", refer to the subrender registration for information on the use of "audio/asc" with the renderer.

o  If the render parameter is assigned the value "null" or "unknown", the data object MAY be omitted.

Several general restrictions apply to the use of the audio/asc media type in RTP MIDI:

o  A native stream MUST NOT assign "audio/asc" to rinit. The audio/asc media type is not intended to be a general-purpose container for rendering systems outside of MPEG usage.

o  The audio/asc media type defines a stored object type; it does not define semantics for RTP streams. Thus, audio/asc MUST NOT appear on an rtpmap line of a session description.

Below, we show session description examples for audio/asc. The session description below uses the inline parameter to code the AudioSpecificConfig block for an mpeg4-generic General MIDI stream. We derive the value assigned to the inline parameter in Appendix E.4.
The subrender token value of "default" is implied by the absence of the subrender parameter in the parameter list.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; render=synthetic; rinit="audio/asc";
   inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"

(The a=fmtp line has been wrapped to fit the page to accommodate memo formatting restrictions; it comprises a single line in SDP.)

The session description below uses the url parameter to code the AudioSpecificConfig block for the same General MIDI stream:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; render=synthetic; rinit="audio/asc";
   url="http://example.net/oski.asc";
   cid="xjflsoeiurvpa09itnvlduihgnvet98pa3w9utnuighbuk"

(The a=fmtp line has been wrapped to fit the page to accommodate memo formatting restrictions; it comprises a single line in SDP.)

C.7. Interoperability

In this appendix, we define interoperability guidelines for two application areas:

o  MIDI content-streaming applications. RTP MIDI is added to RTSP-based content-streaming servers, so that viewers may experience MIDI performances (produced by a specified client-side renderer) in synchronization with other streams (video, audio).

o  Long-distance network musical performance applications. RTP MIDI is added to SIP-based voice chat or videoconferencing programs, as an alternative, or as an addition, to audio and/or video RTP streams.

For each application, we define a core set of functionality that all implementations MUST implement.

The applications we address in this section are not an exhaustive list of potential RTP MIDI uses. We expect framework documents for other applications to be developed, within the IETF or within other organizations. We discuss other potential application areas for RTP MIDI in Section 1 of the main text of this memo.

C.7.1. MIDI Content Streaming Applications

In content-streaming applications, a user invokes an RTSP client to initiate a request to an RTSP server to view a multimedia session. For example, clicking on a web page link for an Internet Radio channel launches an RTSP client that uses the link's RTSP URL to contact the RTSP server hosting the radio channel.

The content may be pre-recorded (for example, on-demand replay of yesterday's football game) or "live" (for example, football game coverage as it occurs), but in either case the user is usually an "audience member" as opposed to a "participant" (as the user would be in telephony).

Note that these examples describe the distribution of audio content to an audience member. The interoperability guidelines in this appendix address RTP MIDI applications of this nature, not applications such as the transmission of raw MIDI command streams for use in a professional environment (recording studio, performance stage, etc.).

In an RTSP session, a client accesses a session description that is "declared" by the server, either via the RTSP DESCRIBE method, or via other means, such as HTTP or email. The session description defines the session from the perspective of the client. For example, if a media line in the session description contains a non-zero port number, it encodes the server's preference for the client's port numbers for RTP and RTCP reception. Once media flow begins, the server sends an RTP MIDI stream to the client, which renders it for presentation, perhaps in synchrony with video or other audio streams.

We now define the interoperability text for content-streaming RTSP applications.

In most cases, server interoperability responsibilities are described in terms of limits on the "reference" session description a server provides for a performance if it has no information about the capabilities of the client. The reference session is a "lowest common denominator" session that maximizes the odds that a client will be able to view the session. If a server is aware of the capabilities of the client, the server is free to provide a session description customized for the client in the DESCRIBE reply.

Clients MUST support unicast UDP RTP MIDI streams that use the recovery journal with the closed-loop or the anchor sending policies. Clients MUST be able to interpret stream subsetting and chapter inclusion parameters in the session description that qualify the sending policies. Client support of enhanced Chapter C encoding is OPTIONAL.

The reference session description offered by a server MUST send all RTP MIDI UDP streams as unicast streams that use the recovery journal and the closed-loop or anchor sending policies. Servers SHOULD use the stream subsetting and chapter inclusion parameters in the reference session description, to simplify the rendering task of the client. Server support of enhanced Chapter C encoding is OPTIONAL.

Clients and servers MUST support the use of RTSP interleaved mode (a method for interleaving RTP onto the RTSP TCP transport).

Clients MUST be able to interpret the timestamp semantics signalled by the "comex" value of the tsmode parameter (i.e., the timestamp semantics of Standard MIDI Files [MIDI]). Servers MUST use the "comex" value for the "tsmode" parameter in the reference session description.

Clients MUST be able to process an RTP MIDI stream whose packets encode an arbitrary temporal duration ("media time"). Thus, in practice, clients MUST implement a MIDI playout buffer.
Clients MUST NOT depend on the presence of rtp_ptime, rtp_maxptime, and guardtime parameters in the session description in order to process packets, but they SHOULD be able to use these parameters to improve packet processing.

Servers SHOULD strive to send RTP MIDI streams in the same way media servers send conventional audio streams: a sequence of packets that either all code the same temporal duration (non-normative example: 50 ms packets) or that code one of an integral number of temporal durations (non-normative example: 50 ms, 100 ms, 250 ms, or 500 ms packets).
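
As a rough illustration of the packetization guidance above, the sketch below converts the non-normative example durations into RTP timestamp units for a given clock rate. It assumes that packet durations carried in a session description are expressed in units of the RTP timestamp clock; the function names and the 44100 Hz clock rate are illustrative choices, not part of this memo.

```python
from typing import Iterable, List


def duration_to_timestamp_units(duration_ms: float, clock_rate_hz: int) -> int:
    """Convert a packet duration in milliseconds to RTP timestamp units.

    Assumption: durations are expressed in ticks of the RTP timestamp clock.
    """
    if duration_ms < 0:
        raise ValueError("duration must not be negative")
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return round(duration_ms * clock_rate_hz / 1000)


def allowed_durations(durations_ms: Iterable[float], clock_rate_hz: int) -> List[int]:
    """Return the sorted, de-duplicated set of packet durations in timestamp units."""
    return sorted({duration_to_timestamp_units(d, clock_rate_hz) for d in durations_ms})


if __name__ == "__main__":
    # Hypothetical server that sends 50, 100, 250, or 500 ms packets at 44100 Hz.
    units = allowed_durations([50, 100, 250, 500], 44100)
    print(units)       # candidate packet durations, in timestamp units
    print(max(units))  # the longest duration a client would need to buffer
```

A server that uses a small fixed set of durations, as in this sketch, gives a client a simple upper bound for sizing its playout buffer.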
Servers SHOULD encode information about the packetization method in the rtp_ptime and rtp_maxptime parameters in the session description.

Clients MUST be able to examine the render and subrender parameters, to determine if a multimedia session uses a renderer it supports.
Clients MUST be able to interpret the default "one" value of the "multimode" parameter, to identify supported renderers from a list of renderer descriptions. Clients MUST be able to interpret the musicport parameter, to the degree that it is relevant to the renderers it supports. Clients MUST be able to interpret the chanmask parameter.

Clients supporting renderers whose data object (as encoded by a parameter value for "inline") could exceed 300 octets in size MUST support the url and cid parameters and thus must implement the HTTP protocol in addition to RTSP.

Servers MUST specify complete rendering systems for RTP MIDI streams. Note that a minimal RTP MIDI native stream does not meet this requirement (Section 6.1), as the rendering method for such streams is "not specified".

At the time of this memo, the only way for servers to specify a complete rendering system is to specify an mpeg4-generic RTP MIDI stream in mode rtp-midi (Section 6.2 and C.6.5). As a consequence, the only rendering systems that may be presently used are General MIDI [MIDI], DLS 2 [DLS2], or Structured Audio [MPEGSA]. Note that the maximum inline value for General MIDI is well under 300 octets (and thus clients need not support the "url" parameter), and that the maximum inline values for DLS 2 and Structured Audio may be much larger than 300 octets (and thus clients MUST support the url parameter).
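
The client requirements above for the chanmask and inline parameters can be sketched as two small checks. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the leftmost chanmask digit is assumed to map to the lowest voice channel number, and the 300-octet threshold is applied to the decoded data object (it could also be read as applying to the encoded parameter value).

```python
import base64
from typing import List

INLINE_LIMIT_OCTETS = 300  # threshold named in the client requirements


def inline_object_size(inline_value: str) -> int:
    """Size in octets of the data object coded by a base64 inline value."""
    return len(base64.b64decode(inline_value, validate=True))


def needs_url_support(inline_value: str) -> bool:
    """True if the decoded object exceeds the 300-octet threshold (assumed reading)."""
    return inline_object_size(inline_value) > INLINE_LIMIT_OCTETS


def parse_chanmask(value: str) -> List[List[bool]]:
    """Split a chanmask value into one 16-entry mask per MIDI name space.

    True means the renderer processes the voice channel; False means it
    ignores the channel. Assumption: the leftmost digit is the lowest channel.
    """
    if not value or len(value) % 16 != 0:
        raise ValueError("chanmask length must be 16 or a multiple of 16")
    if set(value) - {"0", "1"}:
        raise ValueError("chanmask may contain only '0' and '1' digits")
    return [[digit == "1" for digit in value[i:i + 16]]
            for i in range(0, len(value), 16)]


if __name__ == "__main__":
    # inline value taken from the General MIDI session description example
    gm_inline = "egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"
    print(inline_object_size(gm_inline), needs_url_support(gm_inline))
    print(parse_chanmask("1111111111111110"))
```

A receiver that does not implement chanmask semantics could use the parsed masks to confirm that every entry is True before accepting the renderer.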
We anticipate that; cid-block is enclosed with double-quotes ; external references ; URI-reference: from [RFC2396] and [RFC2732] ; ; Endthe owners ofABNF The mpeg4-genericrendering systems (both standardized and proprietary) will register subrender parameters for their renderers. Once registration occurs, native RTPpayload [RFC3640] defines a "mode" parameter that signalsMIDI sessions may use render and subrender (Appendix C.6.2) to specify complete rendering systems for RTSP content-streaming multimedia sessions. Servers MUST NOT use thetype of MPEG stream in use. We add a new mode value, "rtp- midi", usingsdp_start value for theABNF rule below: ; ; mpeg4-generic modesmf_info parameterextension ; mode /= "rtp-midi" ; as describedinSection 6.2 ofthe reference session description, as thismemo E. Ause would require that clients be able to parse and render Standard MIDIOverview for Networking Specialists This Appendix presents an overview of theFiles. Clients MUST support mpeg4-generic mode rtp-midi General MIDIstandard, for(GM) sessions, at a polyphony limited by thebenefithardware capabilities ofnetworking specialists new to musical applications. Implementors should consult [MIDI] forthe client. This requirement provides anormative description"lowest common denominator" rendering system for content providers to target. Note that this requirement does not force implementors ofMIDI. Musicians make music by performingacontrolled sequence of physical movements. For example,non-GM renderer (such as DLS 2 or Structured Audio) to add apianist playssecond rendering engine. Instead, a client may satisfy the requirement bycoordinatingincluding aseriesset ofkey presses, key releases,voice patches that implement the GM instrument set, andpedal actions.using this emulation for mpeg4-generic GM sessions. It is RECOMMENDED that servers use General MIDIrepresents a musical performance by encoding these physical gesturesasa sequence of MIDI commands. 
the renderer for the reference session description, because clients are REQUIRED to support it. We do not require General MIDI as the reference renderer, because for normative applications it is an inappropriate choice. Servers using General MIDI as a "lowest common denominator" renderer SHOULD use Universal Real-Time SysEx MIP message [SPMIDI] to communicate the priority of voices to polyphony-limited clients.

C.7.2. MIDI Network Musical Performance Applications

In Internet telephony and videoconferencing applications, parties interact over an IP network as they would face-to-face. Good user experiences require low end-to-end audio latency and tight audiovisual synchronization (for "lip-sync"). The Session Initiation Protocol (SIP, [RFC3261]) is used for session management.

In this appendix section, we define interoperability guidelines for using RTP MIDI streams in interactive SIP applications. Our primary interest is supporting Network Musical Performances (NMP), where musicians in different locations interact over the network as if they were in the same room. See [NMP] for background information on NMP, and see [RFC4696] for a discussion of low-latency RTP MIDI implementation techniques for NMP.

Note that the goal of NMP applications is telepresence: the parties should hear audio that is close to what they would hear if they were in the same room. The interoperability guidelines in this appendix address RTP MIDI applications of this nature, not applications such as the transmission of raw MIDI command streams for use in
a professional environment (recording studio, performance stage, etc.).

We focus on session management for two-party unicast sessions that specify a renderer for RTP MIDI streams. Within this limited scope, the guidelines defined here are sufficient to let applications interoperate. We define the REQUIRED capabilities of RTP MIDI senders and receivers in
NMP sessions and define how session descriptions exchanged are used to set up network musical performance sessions.

SIP lets parties negotiate details of the session, using the Offer/Answer protocol [RFC3264]. However, RTP MIDI has so many parameters that "blind" negotiations between two parties using different applications might not yield a common session configuration.

Thus, we now define a set of capabilities that NMP parties MUST support. Session description offers whose options lie outside the envelope of REQUIRED party behavior risk negotiation failure. We also define session description idioms that the RTP MIDI part of an offer MUST follow, in order to structure the offer for simpler analysis.

We use the term "offerer" for the party making a SIP offer, and "answerer" for the party answering the offer. Finally, we note that unless it is qualified by the adjective "sender" or "receiver", a statement that a party MUST support X implies that it MUST support X for both sending and receiving.

If an offerer wishes to define a "sendrecv" RTP MIDI stream, it may use a true sendrecv session or the "virtual sendrecv" construction described in the preamble to Appendix C and in Appendix C.5.
A true sendrecv session indicates that the offerer wishes to participate in a session where both parties use identically configured renderers. A virtual sendrecv session indicates that the offerer is willing to participate in a session where the two parties may be using different renderer configurations. Thus, parties MUST be prepared to see both real and virtual sendrecv sessions in an offer.

Parties MUST support unicast UDP transport of RTP MIDI streams. These streams MUST use the recovery journal with the closed-loop or anchor sending policies. These streams MUST use the stream subsetting and chapter inclusion parameters to declare the types of MIDI commands that will be sent on the stream (for sendonly streams) or will be processed (for recvonly streams), including the size limits on System Exclusive commands. Support of enhanced Chapter C encoding is OPTIONAL.

Note that both TCP and multicast UDP support are OPTIONAL. We make TCP OPTIONAL because we expect NMP renderers to rely on data objects (signalled by "rinit" and associated parameters) for initialization at the start of the session, and only to use System Exclusive commands for interactive control during the session.
These interactive commands are small enough to be protected via the recovery journal mechanism of RTP MIDI UDP streams.

We now discuss timestamps, packet timing, and packet sending algorithms.

Recall that the tsmode parameter controls the semantics of command timestamps in the MIDI list of RTP packets.

Parties MUST support clock rates of 44.1 kHz, 48 kHz, 88.2 kHz, and 96 kHz. Parties MUST support streams using the "comex", "async", and "buffer" tsmode values. Recvonly offers MUST offer the default "comex".

Parties MUST support a wide range of packet temporal durations: from rtp_ptime and rtp_maxptime values of 0, to rtp_ptime and rtp_maxptime values that code 100 ms. Thus, receivers MUST be able to implement a playout buffer.

Offers and answers MUST present rtp_ptime, rtp_maxptime, and guardtime values that support the latency that users would expect in the
application, subject to bandwidth constraints. As senders MUST abide by values set for these parameters in a session description, a receiver SHOULD use these values to size its playout buffer to produce the lowest reliable latency for a session. Implementers should refer to [RFC4696] for information on packet sending algorithms for latency-sensitive applications. Parties MUST be able to implement the semantics of the guardtime parameter, for times from 5 ms to 5000 ms.

We now discuss the use of the render parameter.

Sessions MUST specify complete rendering systems for all RTP MIDI streams. Note that a minimal RTP MIDI native stream does not meet this requirement (Section 6.1), as the rendering method for such streams is "not specified".

At the time of this writing, the only way for parties to specify a complete rendering system is to specify an mpeg4-generic RTP MIDI stream in
mode rtp-midi (Section 6.2 and C.6.5).

We anticipate that the owners of rendering systems (both standardized and proprietary) will register subrender values for their renderers. Once IANA registration occurs, native RTP MIDI sessions may use render and subrender (Appendix C.6.2) to specify complete rendering systems for SIP network musical performance multimedia sessions.

All parties MUST support General MIDI (GM) sessions, at a polyphony limited by the hardware capabilities of the party. This requirement provides a "lowest common denominator" rendering system, without which practical interoperability will be quite difficult. When using GM, parties SHOULD use Universal Real-Time SysEx MIP message [SPMIDI] to communicate the priority of voices to polyphony-limited clients.

Note that this requirement does not force implementors of a non-GM renderer (for mpeg4-generic sessions, DLS 2, or Structured Audio) to add a second rendering engine. Instead, a client may satisfy the
requirement by including a set of voice patches that implement the GM instrument set, and using this emulation for mpeg4-generic GM sessions. We require GM support so that an offerer that wishes to maximize interoperability may do so by offering GM if its preferred renderer is not accepted by the answerer.

Offerers MUST NOT present several renderers as options in a session description by listing several payload types on a media line, as Section 2.1 uses this construct to let a party send several RTP MIDI streams in the same RTP session.

Instead, an offerer wishing to present rendering options SHOULD offer a single payload type that offers several renderers. In this construct, the parameter list codes a list of render parameters (each followed by its support parameters). As discussed in Appendix C.6.1, the order of renderers in the list declares the offerer's preference.
The "unknown" and "null" values MUST NOT appear in the offer. The answer MUST set all render values except the desired renderer to "null". Thus, "unknown" MUST NOT appear in the answer.

We use SHOULD instead of MUST in the first sentence in the paragraph above, because this technique does not work in all situations (example: an offerer wishes to offer both mpeg4-generic renderers and native RTP MIDI renderers as options). In this case, the offerer MUST present a series of session descriptions, each offering a single renderer, until the answerer accepts a session description.

Parties MUST support the musicport, chanmask, subrender, rinit, and inline parameters. Parties supporting renderers whose data object (as encoded by a parameter value for "inline") could exceed 300 octets in size MUST support the url and cid parameters and thus must implement the HTTP protocol. Note that in mpeg4-generic, General MIDI data objects cannot exceed 300 octets, but DLS 2 and Structured Audio data objects may. Support for the other rendering parameters (smf_cid, smf_info, smf_inline, smf_url) is OPTIONAL.

Thus far in this document, our discussion has assumed that the only MIDI flows that drive a renderer are the network flows described in the session description.
In NMP applications, this assumption would require two rendering engines: one for local use by a party, a second for the remote party.

In practice, applications may wish to have both parties share a single rendering engine. In this case, the session description MUST use a virtual sendrecv session and MUST use the stream subsetting and chapter inclusion parameters to allocate which MIDI channels are intended for use by a party. If two parties are sharing a MIDI channel, the application MUST ensure that appropriate MIDI merging occurs at the input to the renderer.

We now discuss the use of (non-MIDI) audio streams in the session.

Audio streams may be used for two purposes: as a "talkback" channel for parties to converse, or as a way to conduct a performance that includes MIDI and audio channels.
In the latter case, offers MUST use sample rates and the packet temporal durations for the audio and MIDI streams that support low-latency synchronized rendering.

We now show an example of an offer/answer exchange in a network musical performance application.

Below, we show an offer that complies with the interoperability text in this appendix section.

   v=0
   o=first 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   a=group:FID 1 2
   c=IN IP4 192.0.2.94
   m=audio 16112 RTP/AVP 96
   a=recvonly
   a=mid:1
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=2NPTW;
   cm_used=2C0.1.7.10.11.64.121.123; cm_used=2M0.1.2;
   cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ;
   ch_default=2NPTW; ch_default=2C0.1.7.10.11.64.121.123;
   ch_default=2M0.1.2; cm_default=X0-16;
   rtp_ptime=0; rtp_maxptime=0; guardtime=44100;
   musicport=1; render=synthetic; rinit="audio/asc";
   inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"
   m=audio 16114 RTP/AVP 96
   a=sendonly
   a=mid:2
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=1NPTW;
   cm_used=1C0.1.7.10.11.64.121.123; cm_used=1M0.1.2;
   cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ;
   ch_default=1NPTW; ch_default=1C0.1.7.10.11.64.121.123;
   ch_default=1M0.1.2; cm_default=X0-16;
   rtp_ptime=0; rtp_maxptime=0; guardtime=44100;
   musicport=1; render=synthetic; rinit="audio/asc";
   inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"

(The a=fmtp lines have been wrapped to fit the page to accommodate memo formatting restrictions; each comprises a single line in SDP.)
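As an informal aid to reading the offer, the Python sketch below decodes the "inline" value and unpacks its leading fields. It assumes the simplified AudioSpecificConfig layout that this memo describes for General MIDI and DLS 2 data (5-bit AOTYPE, 4-bit FREQIDX, 4-bit CHANNEL, 3-bit SACNK, then a 32-bit FILE_LEN and the file data). The function name and the dictionary keys are hypothetical, and the sketch reads only the first file block.

```python
import base64


def parse_asc_prefix(inline_value):
    """Unpack the leading fields of a simplified AudioSpecificConfig.

    Assumed layout: AOTYPE (5 bits), FREQIDX (4 bits), CHANNEL (4 bits),
    SACNK (3 bits), FILE_LEN (32 bits), followed by the file data.
    """
    raw = base64.b64decode(inline_value)
    # The first four fields occupy exactly 16 bits.
    header = int.from_bytes(raw[:2], "big")
    return {
        "aotype": header >> 11,           # Audio Object Type
        "freqidx": (header >> 7) & 0xF,   # sampling rate index
        "channel": (header >> 3) & 0xF,   # audio channel configuration
        "sacnk": header & 0x7,            # type of the first file block
        "file_len": int.from_bytes(raw[2:6], "big"),
        "file_magic": raw[6:10],          # first four octets of the file
    }


fields = parse_asc_prefix("egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA")
```

For the value used in the offer, the sketch reports an Audio Object Type of 15 (General MIDI) and a 26-octet file block that begins with the Standard MIDI File header tag "MThd".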
The owner line (o=) identifies the session owner as "first". The session description defines two MIDI streams: a recvonly stream on which "first" receives a performance, and a sendonly stream that "first" uses to send a performance.

The recvonly port number encodes the ports on which "first" wishes to receive RTP (16112) and RTCP (16113) media at IP4 address 192.0.2.94. The sendonly port number encodes the port on which "first" wishes to receive RTCP for the stream (16115).

The musicport parameters code that the two streams share an identity relationship and thus form a virtual sendrecv stream.

Both streams are mpeg4-generic RTP MIDI streams that specify a General MIDI renderer. The stream subsetting parameters code that the recvonly stream uses MIDI channel 1 exclusively for voice commands, and that the
sendonly stream uses MIDI channel 2 exclusively for voice commands. This mapping permits the application software to share a single renderer for local and remote performers.

We now show the answer to the offer.

   v=0
   o=second 2520644554 2838152170 IN IP4 second.example.net
   s=Example
   t=0 0
   a=group:FID 1 2
   c=IN IP4 192.0.2.105
   m=audio 5004 RTP/AVP 96
   a=sendonly
   a=mid:1
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=2NPTW;
   cm_used=2C0.1.7.10.11.64.121.123; cm_used=2M0.1.2;
   cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ;
   ch_default=2NPTW; ch_default=2C0.1.7.10.11.64.121.123;
   ch_default=2M0.1.2; cm_default=X0-16;
   rtp_ptime=0; rtp_maxptime=882; guardtime=44100;
   musicport=1; render=synthetic; rinit="audio/asc";
   inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"
   m=audio 5006 RTP/AVP 96
   a=recvonly
   a=mid:2
   a=rtpmap:96 mpeg4-generic/44100
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=1NPTW;
   cm_used=1C0.1.7.10.11.64.121.123; cm_used=1M0.1.2;
   cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ;
   ch_default=1NPTW; ch_default=1C0.1.7.10.11.64.121.123;
   ch_default=1M0.1.2; cm_default=X0-16;
   rtp_ptime=0; rtp_maxptime=0; guardtime=88200;
   musicport=1; render=synthetic; rinit="audio/asc";
   inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"

(The a=fmtp lines have been wrapped to fit the page to accommodate memo formatting restrictions; they comprise single lines in SDP.)

The owner line (o=) identifies the session owner as "second". The port numbers for both media streams are non-zero; thus, "second" has accepted the session description. The stream marked "sendonly" in the offer is marked "recvonly" in the
answer, and vice versa, coding the different view of the session held by "second". The IP4 address (192.0.2.105) and the RTP (5004 and 5006) and RTCP (5005 and 5007) ports have been changed by "second" to match its transport wishes.

In addition, "second" has made several parameter changes: rtp_maxptime for the sendonly stream has been changed to code 20 ms (882 in clock units), and the guardtime for the recvonly stream has been doubled. As these parameter modifications request capabilities that are REQUIRED to be implemented by interoperable parties, "second" can make these changes with confidence that "first" can abide by them.

D. Parameter Syntax Definitions

In this appendix, we define the syntax for the RTP MIDI media type parameters in Augmented Backus-Naur Form (ABNF, [RFC4234]). When using these parameters with SDP, all parameters MUST appear on a single fmtp attribute line of an RTP MIDI media description. For mpeg4-generic RTP MIDI streams, this line MUST also include any mpeg4-generic parameters (usage described in Section 6.2). An fmtp attribute line may be defined (after [RFC3640]) as:

   ;
   ; SDP fmtp line definition
   ;

   fmtp = "a=fmtp:" token SP param-assign 0*(";" SP param-assign) CRLF

where <token> codes the RTP payload type. Note that white space MUST NOT appear between the "a=fmtp:" and the RTP payload type.
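The fmtp grammar above can be illustrated with a short Python sketch that splits an fmtp line into its payload type and an ordered list of parameter assignments. The function names are hypothetical and not part of this memo. The splitter is deliberately more lenient than the ABNF (it tolerates a missing space after ";"), and it ignores ";" characters inside double-quoted values, since cid values may contain them. It returns a list rather than a mapping because parameters such as cm_used may legally appear several times. The helper ticks_to_ms assumes that the value is expressed in units of the RTP clock rate given on the rtpmap line.

```python
import re

FMTP_RE = re.compile(r"^a=fmtp:(\S+) (.+)$")


def split_params(text):
    """Split on ';' separators that lie outside double-quoted values."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == ";" and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_fmtp(line):
    """Return (payload_type, [(name, value), ...]) for one fmtp line."""
    match = FMTP_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not an fmtp attribute line")
    payload_type, rest = match.groups()
    params = []
    for item in split_params(rest):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("parameter without '=': " + item)
        params.append((name, value))
    return payload_type, params


def ticks_to_ms(ticks, clock_rate):
    """Convert a duration in RTP clock units to milliseconds."""
    return ticks * 1000 / clock_rate


pt, params = parse_fmtp(
    'a=fmtp:96 mode=rtp-midi; cm_used=1NPTW; cm_used=X0-16; '
    'rtp_maxptime=882; guardtime=88200'
)
```

With the 44100 Hz clock rate used in the example session descriptions, ticks_to_ms(882, 44100) gives 20 ms and ticks_to_ms(88200, 44100) gives 2000 ms, which falls inside the 5 ms to 5000 ms guardtime range that parties are required to support.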
We now define the syntax of the parameters defined in Appendix C. The
definition takes the form of the incremental assembly of the
<param-assign> token. See [RFC3640] for the syntax of the
mpeg4-generic parameters discussed in Section 6.2.

;
;
; top-level definition for all parameters
;
;

;
; Parameters defined in Appendix C.1

param-assign =  ("cm_unused=" (([channel-list] command-type
                                [f-list]) / sysex-data))

param-assign =/ ("cm_used="   (([channel-list] command-type
                                [f-list]) / sysex-data))

;
; Parameters defined in Appendix C.2

param-assign =/ ("j_sec="      ("none" / "recj" / *ietf-extension))

param-assign =/ ("j_update="   ("anchor" / "closed-loop" /
                                "open-loop" / *ietf-extension))

param-assign =/ ("ch_default=" (([channel-list] chapter-list
                                 [f-list]) / sysex-data))

param-assign =/ ("ch_never="   (([channel-list] chapter-list
                                 [f-list]) / sysex-data))

param-assign =/ ("ch_anchor="  (([channel-list] chapter-list
                                 [f-list]) / sysex-data))

;
; Parameters defined in Appendix C.3

param-assign =/ ("tsmode="   ("comex" / "async" / "buffer"))

param-assign =/ ("linerate=" nonzero-four-octet)

param-assign =/ ("octpos="   ("first" / "last"))

param-assign =/ ("mperiod="  nonzero-four-octet)

;
; Parameters defined in Appendix C.4

param-assign =/ ("guardtime="    nonzero-four-octet)

param-assign =/ ("rtp_ptime="    four-octet)

param-assign =/ ("rtp_maxptime=" four-octet)

;
; Parameters defined in Appendix C.5

param-assign =/ ("musicport=" four-octet)

;
; Parameters defined in Appendix C.6

param-assign =/ ("chanmask="   ( 1*( 16( "0" / "1" ) )))

param-assign =/ ("cid="        double-quote cid-block double-quote)

param-assign =/ ("inline="     double-quote base-64-block double-quote)

param-assign =/ ("multimode="  ("all" / "one"))

param-assign =/ ("render="     ("synthetic" / "api" / "null" /
                                "unknown" / *extension))

param-assign =/ ("rinit="      mime-type "/" mime-subtype)

param-assign =/ ("smf_cid="    double-quote cid-block double-quote)

param-assign =/ ("smf_info="   ("ignore" / "identity" /
                                "sdp_start" / *extension))

param-assign =/ ("smf_inline=" double-quote base-64-block double-quote)

param-assign =/ ("smf_url="    double-quote uri-element double-quote)

param-assign =/ ("subrender="  ("default" / *extension))

param-assign =/ ("url="        double-quote uri-element double-quote)

;
; list definitions for the cm_ command-type
;

command-type  = command-part1 command-part2 command-part3

command-part1 = (*1"A") (*1"B") (*1"C") (*1"F") (*1"G") (*1"H")
command-part2 = (*1"J") (*1"K") (*1"M") (*1"N") (*1"P") (*1"Q")
command-part3 = (*1"T") (*1"V") (*1"W") (*1"X") (*1"Y") (*1"Z")

;
; list definitions for the ch_ chapter-list
;

chapter-list = ch-part1 ch-part2 ch-part3

ch-part1 = (*1"A") (*1"B") (*1"C") (*1"D") (*1"E") (*1"F") (*1"G")
ch-part2 = (*1"H") (*1"J") (*1"K") (*1"M") (*1"N") (*1"P") (*1"Q")
ch-part3 = (*1"T") (*1"V") (*1"W") (*1"X") (*1"Y") (*1"Z")

;
; list definitions for the ch_ channel-list
;

channel-list      = midi-chan-element *("." midi-chan-element)

midi-chan-element = midi-chan / midi-chan-range

midi-chan-range   = midi-chan "-" midi-chan
                  ; decimal value of left midi-chan
                  ; MUST be strictly less than decimal
                  ; value of right midi-chan

midi-chan         = %d0-15

;
; list definitions for the ch_ field list (f-list)
;

f-list             = midi-field-element *("." midi-field-element)

midi-field-element = midi-field / midi-field-range

midi-field-range   = midi-field "-" midi-field
                   ;
                   ; decimal value of left midi-field
                   ; MUST be strictly less than decimal
                   ; value of right midi-field

midi-field         = four-octet
                   ;
                   ; large range accommodates Chapter M
                   ; RPN (0-16383) and NRPN (16384-32767)
                   ; parameters, and Chapter X octet sizes.

;
; definitions for ch_ sysex-data
;

sysex-data = "__" h-list *("_" h-list) "__"

h-list     = hex-field-element *("."
             hex-field-element)

hex-field-element = hex-octet / hex-field-range

hex-field-range   = hex-octet "-" hex-octet
                  ;
                  ; hexadecimal value of left hex-octet
                  ; MUST be strictly less than hexadecimal
                  ; value of right hex-octet

hex-octet         = 2("0" / "1" / "2" / "3" / "4" /
                      "5" / "6" / "7" / "8" / "9" /
                      "A" / "B" / "C" / "D" / "E" / "F")
                  ;
                  ; rewritten version of hex-octet in [RFC2045]
                  ; (page 23).
                  ; note that a-f are not permitted, only A-F.
                  ; hex-octet values MUST NOT exceed 7F.

;
; definitions for rinit parameter
;

mime-type    = "audio" / "application"

mime-subtype = token
             ;
             ; See Appendix C.6.2 for registration
             ; requirements for rinit type/subtypes.

;
; definitions for base64 encoding
; copied from [RFC4566]

base-64-block = *base64-unit [base64-pad]

base64-unit   = 4base64-char

base64-pad    = 2base64-char "==" / 3base64-char "="

base64-char   = %x41-5A / %x61-7A / %x30-39 / "+" / "/"
              ; A-Z, a-z, 0-9, "+" and "/"

;
; generic rules
;

ietf-extension     = token
                   ;
                   ; ietf-extension may only be defined in
                   ; standards-track RFCs.

extension          = token
                   ;
                   ; extension may be defined by filing
                   ; a registration with IANA.

four-octet         = %d0-4294967295
                   ; unsigned encoding of 32-bits

nonzero-four-octet = %d1-4294967295
                   ; unsigned encoding of 32-bits, ex-zero

uri-element        = URI-reference
                   ; as defined in [RFC3986]

double-quote       = %x22
                   ; the double-quote (") character

token              = 1*token-char
                   ; copied from [RFC4566]

token-char         = %x21 / %x23-27 / %x2A-2B / %x2D-2E /
                     %x30-39 / %x41-5A / %x5E-7E
                   ; copied from [RFC4566]

cid-block          = 1*cid-char

cid-char           =  token-char
cid-char           =/ "@"
cid-char           =/ ","
cid-char           =/ ";"
cid-char           =/ ":"
cid-char           =/ "\"
cid-char           =/ "/"
cid-char           =/ "["
cid-char           =/ "]"
cid-char           =/ "?"
cid-char           =/ "="
                   ;
                   ; add back in the tspecials [RFC2045], except for
                   ; double-quote and the non-email safe () <>
                   ; note that "cid" defined above ensures that
                   ; cid-block is enclosed with double-quotes

; external references
; URI-reference: from [RFC3986]

;
; End of ABNF

The mpeg4-generic RTP payload [RFC3640] defines a "mode" parameter
that signals the type of MPEG stream in use. We add a new mode value,
"rtp-midi", using the ABNF rule below:

;
; mpeg4-generic mode parameter extension
;

mode =/ "rtp-midi"
     ; as described in Section 6.2 of this memo

for creating the context for the work, including Don Buchla, Chris
Chafe, Richard Duda, Dan Ellis, Adrian Freed, Ben Gold, Jaron Lanier,
Roger Linn, Richard Lyon, Dana Massie, Max Mathews, Keith McMillen,
Carver Mead, Nelson Morgan, Tom Oberheim, Malcolm Slaney, Dave Smith,
Julius Smith, David Wessel, and Matt Wright.

G. Security Considerations

Implementors should carefully read the Security Considerations
sections of the RTP [RFC3550], AVP [RFC3551], and other RTP profile
documents, as the issues discussed in these sections directly apply to
RTP MIDI streams. Implementors should also review the Secure Real-time
Transport Protocol (SRTP, [RFC3711]), an RTP profile that addresses
the security issues discussed in [RFC3550] [RFC3551]. In this
Appendix, we discuss security issues that are unique to the RTP MIDI
payload format.

When using RTP MIDI, authentication of incoming RTP and RTCP packets
is RECOMMENDED. Per-packet authentication may be provided by SRTP or
by other means. Without the use of authentication, attackers could
forge MIDI commands into an ongoing stream, damaging speakers and
eardrums. An attacker could also craft RTP and RTCP packets to exploit
known bugs in the client, and take effective control of a client
machine.

Session management tools (such as SIP [RFC3261]) SHOULD use
authentication during the transport of all session descriptions
containing RTP MIDI media streams. For SIP, the Security
Considerations section in [RFC3261] provides an overview of possible
authentication mechanisms. RTP MIDI session descriptions should use
authentication because the session descriptions may code
initialization data using the parameters described in Appendix C. If
an attacker inserts bogus initialization data into a session
description, he can corrupt the session or forge a client attack.

Session descriptions may also code renderer initialization data by
reference, via the url (Appendix C.6.3) and smf_url (Appendix C.6.4.2)
parameters. If the coded URL is spoofed, both session and client are
open to attack, even if the session description itself is
authenticated. Therefore, URLs specified in url and smf_url parameters
SHOULD use [RFC2818].

Section 2.1 allows streams sent by a party in two RTP sessions to have
the same SSRC value and the same RTP timestamp initialization value,
under certain circumstances. Normally, these values are randomly
chosen for each stream in a session, to make plaintext guessing harder
to do if the payloads are encrypted. Thus, Section 2.1 weakens this
aspect of RTP security.

H. IANA Considerations

This Appendix makes a series of requests to IANA. Thus, we begin with
an outline of our requests. The sub-appendices that follow hold the
actual, detailed requests. All registrations in this Appendix are in
the IETF tree, and follow the rules of [MTYPE] and [RFC3555] as
appropriate.

In Appendix H.1, we request the registration of a new media type:
"audio/rtp-midi". Paired with this request is a request for a
repository for new values for several parameters associated with
"audio/rtp-midi". We request this repository in Appendix H.1.1.

In Appendix H.2, we request the registration of a new value
("rtp-midi") for the "mode" parameter of the "mpeg4-generic" media
type. The "mpeg4-generic" media type is defined in [RFC3640], and
[RFC3640] defines a repository for the "mode" parameter. However, we
believe we are the first to request the registration of a "mode"
value, and so we believe the registry for "mode" has not yet been
created by IANA.

Paired with our "mode" parameter value request for "mpeg4-generic" is
a request for a repository for new values for several parameters we
have defined for use with the "rtp-midi" mode value. We request this
repository in Appendix H.2.1.

In Appendix H.3, we request the registration of a new media type:
"audio/asc". No repository request is associated with this request.

H.1 rtp-midi Media Type Registration

This Appendix requests the registration of the "rtp-midi" subtype for
the "audio" media type. We request

E. A MIDI Overview for Networking Specialists

This appendix presents an overview of the MIDI standard, for the
benefit of networking specialists new to musical applications.
Implementors should consult [MIDI] for a normative description of
MIDI.

Musicians make music by performing a controlled sequence of physical
movements. For example, a pianist plays by coordinating a series of
key presses, key releases, and pedal actions. MIDI represents a
musical performance by encoding these physical gestures as a sequence
of MIDI commands. This high-level musical representation is compact
but fragile: one lost command may be catastrophic to the performance.

MIDI commands have much in common with the machine instructions of a
microprocessor. MIDI commands are defined as binary elements.
Bitfields within a MIDI command have a regular structure and a
specialized purpose. For example, the upper nibble of the first
command octet (the opcode field) codes the command type. MIDI commands
may consist of an arbitrary number of complete octets, but most MIDI
commands are 1, 2, or 3 octets in length.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Channel Voice Messages         | Bitfield Pattern           |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| NoteOff (end a note)           | 1000cccc 0nnnnnnn 0vvvvvvv |
|-------------------------------------------------------------|
| NoteOn (start a note)          | 1001cccc 0nnnnnnn 0vvvvvvv |
|-------------------------------------------------------------|
| PTouch (Polyphonic Aftertouch) | 1010cccc 0nnnnnnn 0aaaaaaa |
|-------------------------------------------------------------|
| CControl (Controller Change)   | 1011cccc 0xxxxxxx 0yyyyyyy |
|-------------------------------------------------------------|
| PChange (Program Change)       | 1100cccc 0ppppppp          |
|-------------------------------------------------------------|
| CTouch (Channel Aftertouch)    | 1101cccc 0aaaaaaa          |
|-------------------------------------------------------------|
| PWheel (Pitch Wheel)           | 1110cccc 0xxxxxxx 0yyyyyyy |
 -------------------------------------------------------------

             Figure E.1 -- MIDI Channel Messages

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| System Common Messages         | Bitfield Pattern           |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| System Exclusive               | 11110000, followed by a    |
|                                | list of 0xxxxxxx octets,   |
|                                | followed by 11110111       |
|-------------------------------------------------------------|
| MIDI Time Code Quarter Frame   | 11110001 0xxxxxxx          |
|-------------------------------------------------------------|
| Song Position Pointer          | 11110010 0xxxxxxx 0yyyyyyy |
|-------------------------------------------------------------|
| Song Select                    | 11110011 0xxxxxxx          |
|-------------------------------------------------------------|
| Undefined                      | 11110100                   |
|-------------------------------------------------------------|
| Undefined                      | 11110101                   |
|-------------------------------------------------------------|
| Tune Request                   | 11110110                   |
|-------------------------------------------------------------|
| System Exclusive End Marker    | 11110111                   |
 -------------------------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| System Realtime Messages       | Bitfield Pattern           |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Clock                          | 11111000                   |
|-------------------------------------------------------------|
| Undefined                      | 11111001                   |
|-------------------------------------------------------------|
| Start                          | 11111010                   |
|-------------------------------------------------------------|
| Continue                       | 11111011                   |
|-------------------------------------------------------------|
| Stop                           | 11111100                   |
|-------------------------------------------------------------|
| Undefined                      | 11111101                   |
|-------------------------------------------------------------|
| Active Sense                   | 11111110                   |
|-------------------------------------------------------------|
| System Reset                   | 11111111                   |
 -------------------------------------------------------------

             Figure E.2 -- MIDI System Messages

Figures E.1 and E.2 show the MIDI command family. There are three
major classes of commands: voice commands (opcode field values in the
range 0x8 through 0xE), system common commands (opcode field 0xF,
commands 0xF0 through 0xF7), and system real-time commands (opcode
field 0xF, commands 0xF8 through 0xFF). Voice commands code the
musical gestures for each timbre in a composition. System commands
perform functions that usually affect all voice channels, such as
System Reset (0xFF).

E.1. Command Types

Voice commands execute on one of 16 MIDI channels, as coded by the
4-bit channel field (field cccc in Figure E.1). In most applications,
notes for different timbres are assigned to different channels. To
support applications that require more than 16 channels, MIDI systems
use several MIDI command streams in parallel, to yield 32, 48, or 64
MIDI channels.

As an example of a voice command, consider a NoteOn command (opcode
0x9), with binary encoding 1001cccc 0nnnnnnn 0vvvvvvv. This command
signals the start of a musical note on MIDI channel cccc. The note has
a pitch coded by the note number nnnnnnn, and an onset amplitude coded
by the note velocity vvvvvvv.
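The opcode ranges and the NoteOn layout described above translate directly into a few bit operations. The sketch below is illustrative only; the function names classify and decode_note_on and the dictionary keys are hypothetical, and it handles a single, complete command rather than a command stream.

```python
def classify(status):
    """Return the command class for the first octet of a MIDI command."""
    if not 0x80 <= status <= 0xFF:
        raise ValueError("the first octet of a command has its leading bit set")
    opcode = status >> 4            # upper nibble: the opcode field
    if opcode <= 0xE:               # 0x8 through 0xE
        return "voice"
    # Opcode 0xF: 0xF0-0xF7 are system common, 0xF8-0xFF are real-time.
    return "system common" if status <= 0xF7 else "system real-time"


def decode_note_on(octets):
    """Decode a 3-octet NoteOn command (opcode 0x9) into its fields."""
    status, note, velocity = octets
    if status >> 4 != 0x9:
        raise ValueError("not a NoteOn command")
    if note > 0x7F or velocity > 0x7F:
        raise ValueError("data octets must have a leading 0 bit")
    return {
        "channel": status & 0x0F,   # lower nibble: the channel field
        "note": note,               # 7-bit note number
        "velocity": velocity,       # 7-bit note velocity
    }


if __name__ == "__main__":
    print(classify(0x93), decode_note_on(bytes([0x93, 60, 100])))
```

For the octets 0x93 0x3C 0x64, this yields a NoteOn on channel 3 with note number 60 and velocity 100.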
Other voice commands signal theregistrationend of notes (NoteOff, opcode 0x8), map a specific timbre to a MIDI channel (PChange, opcode 0xC), or set the value of parameterslisted in the "optional parameters" section below (both the "non- extensible parameters" and the "extensible parameters" lists). We also requestthat modulate thecreationtimbral quality (all other voice commands). The exact meaning ofrepositories formost voice channel commands depends on the"extensible parameters";rendering algorithms thedetails of this request appear in Appendix H.1.1 below. Media type name: audio Subtype name: rtp-midi Required parameters: rate: The RTP timestamp clock rate. See Sections 2.1 and 6.1 for usage details. Optional parameters: Non-extensible parameters: ch_anchor: See Appendix C.2.3 for usage details. ch_default: See Appendix C.2.3 for usage details. ch_never: See Appendix C.2.3 for usage details. cm_unused: See Appendix C.1 for usage details. cm_used: See Appendix C.1 for usage details. chanmask: See Appendix C.6.4.3 for usage details. cid: See Appendix C.6.3 for usage details. guardtime: See Appendix C.4.2 for usage details. inline: See Appendix C.6.3 for usage details. linerate: See Appendix C.3 for usage details. mperiod: See Appendix C.3 for usage details. multimode: See Appendix C.6.1 for usage details. musicport: See Appendix C.5 for usage details. octpos: See Appendix C.3 for usage details. rinit: See Appendix C.6.3 for usage details. rtp_maxptime: See Appendix C.4.1 for usage details. rtp_ptime: See Appendix C.4.1 for usage details. smf_cid: See Appendix C.6.4.2 for usage details. smf_inline: See Appendix C.6.4.2 for usage details. smf_url: See Appendix C.6.4.2 for usage details. tsmode: See Appendix C.3 for usage details. url: See Appendix C.6.3 for usage details. Extensible parameters: j_sec: See Appendix C.2.1 for usage details. See Appendix H.1.1 for repository details. j_update: See Appendix C.2.2 for usage details. 
See Appendix H.1.1 for repository details. render: See Appendix C.6 for usage details. See Appendix H.1.1 for repository details. subrender: See Appendix C.6.2 for usage details. See Appendix H.1.1 for repository details. smf_info: See Appendix C.6.4.1 for usage details. See Appendix H.1.1 for repository details. Encoding considerations: The format for this type is framed and binary. Restrictions on usage: This type is only defined for real-time transfers ofMIDIstreams via RTP. Stored-file semantics for rtp-midi may be defined in the future. Security considerations: See Appendix Greceiver uses to generate sound. In most applications, a MIDI sender has a model (in some sense) ofthis memo. Interoperability considerations: None. Published specification: This memo and [MIDI] serve asthenormative specification. In addition, references [NMP], [GRAME], and [GUIDE] provide non-normative implementation guidance. Applications which use this media type: Audio content-creation hardware, such asrendering method used by the receiver. System commands perform a variety of global tasks in the stream, including "sequencer" playback control of pre-recorded MIDIcontroller piano keyboardscommands (the Song Position Pointer, Song Select, Clock, Start, Continue, and Stop messages), SMPTE time code (the MIDIaudio synthesizers. Audio content-creation software, such as music sequencers, digital audio workstations,Time Code Quarter Frame command), andsoft synthesizers. Computer operating systems, for network supportthe communication of device-specific data (the System Exclusive messages). E.2. Running Status All MIDIApplication Programmer Interfaces. Content distribution servers and terminals may use this media type for low bit-rate music coding. Additional information: None. Person & email address to contact for further information: John Lazzaro <lazzaro@cs.berkeley.edu> Intended usage: COMMON. 
Author: John Lazzaro <lazzaro@cs.berkeley.edu> Change Controller: IETF Audio/Video Transport Working Group delegated fromcommand bitfields share a special structure: theIESG. H.1.1 Repository request for "audio/rtp-midi" Forleading bit of the"rtp-midi" subtype, we requestfirst octet is set to 1, and thecreationleading bit ofrepositories for extensionsall subsequent octets is set to 0. This structure supports a data compression system, called running status [MIDI], that improves thefollowing parameters (which are those listed as "extensible parameters" in Appendix H.1). j_sec: Registrations for this repository may only occur via an IETF standards-track document. Appendix C.2.1coding efficiency ofthis memo describes appropriate registrations for this repository. Initial values for this repository appear below: "none": Defined in Appendix C.2.1MIDI. In running status coding, the first octet ofthis memo. "recj": Defineda MIDI voice command may be dropped if it is identical to the first octet of the previous MIDI voice command. This rule, inAppendix C.2.1combination with a convention to consider NoteOn commands with a null third octet as NoteOff commands, supports the coding ofthis memo. j_update: Registrations for this repository maynote sequences using two octets per command. Running status coding is onlyoccur via an IETF standards-track document. Appendix C.2.2 of this memo describes appropriate registrations for this repository. Initial valuesused forthis repository appear below: "anchor": Defined in Appendix C.2.2voice commands. The presence ofthis memo. "open-loop": Defineda system common message inAppendix C.2.2 of this memo. "closed-loop": Definedthe stream cancels running status mode for the next voice command. However, system real-time messages do not cancel running status mode. E.3. Command Timing The bitfield formats inAppendix C.2.2 of this memo. 
render: RegistrationsFigures E.1 and E.2 do not encode the execution time forthis repository MUST includeaspecificationcommand. Timing information is not a part of theusageMIDI command syntax itself; different applications of theproposed value. See text inMIDI command language use different methods to encode timing. For example, thepreamble of Appendix C.6MIDI command set acts as the transport layer fordetails (the paragraphMIDI 1.0 DIN cables [MIDI]. MIDI cables are short asynchronous serial lines thatbegins "Other render token ..."). Initial values for this repository appear below: "unknown": Defined in Appendix C.6 of this memo. "synthetic": Defined in Appendix C.6facilitate the remote operation ofthis memo. "api": Defined in Appendix C.6musical instruments and audio equipment. Timestamps are not sent over a MIDI 1.0 DIN cable. Instead, the standard uses an implicit "time ofthis memo. "null": Defined in Appendix C.6arrival" code. Receivers execute MIDI commands at the moment ofthis memo. subrender: Registrationsarrival. In contrast, Standard MIDI Files (SMFs, [MIDI]), a file format forthis repository MUST includerepresenting complete musical performances, add an explicit timestamp to each MIDI command, using aspecificationdelta encoding scheme that is optimized for statistics of musical performance. SMF timestamps usually code timing using theusagemetric notation of a musical score. SMF meta-events are used to add a tempo map to theproposed value. See text Appendix C.6.2 for details (the paragraphfile, so thatbegins "Other subrender token ..."). Initial values for this repository appear below: "default": Defined in Appendix C.6.2score beats may be accurately converted into units ofthis memo. smf_info: Registrationsseconds during rendering. E.4. 
AudioSpecificConfig Templates forthis repository MUSTMMA Renderers In Section 6.2 and Appendix C.6.5, we describe how session descriptions include an AudioSpecificConfig data block to specify aspecificationMIDI rendering algorithm for mpeg4-generic RTP MIDI streams. The bitfield format ofthe usageAudioSpecificConfig is defined in [MPEGAUDIO]. StructuredAudioSpecificConfig, a key data structure coded in AudioSpecificConfig, is defined in [MPEGSA]. For implementors wishing to specify Structured Audio renderers, a full understanding of [MPEGSA] and [MPEGAUDIO] is essential. However, many implementors will limit their rendering options to theproposed value. See texttwo MIDI Manufacturers Association renderers that may be specified inAppendix C.6.4.1AudioSpecificConfig: General MIDI (GM, [MIDI]) and Downloadable Sounds 2 (DLS 2, [DLS2]). To aid these implementors, we reproduce the AudioSpecificConfig bitfield formats fordetails (the paragrapha GM renderer and a DLS 2 renderer below. We have checked these bitfields carefully and believe they are correct. However, we stress thatbegins "Other smf_info token ..."). Initial valuesthe material below is informative, and that [MPEGAUDIO] and [MPEGSA] are the normative definitions forthis repository appear below: "ignore": DefinedAudioSpecificConfig. As described inAppendix C.6.4.1 of this memo. "sdp_start": DefinedSection 6.2, a minimal mpeg4-generic session description encodes the AudioSpecificConfig binary bitfield as a hexadecimal string (whose format is defined inAppendix C.6.4.1 of this memo. "identity": Defined[RFC3640]) that is assigned to the "config" parameter. As described in AppendixC.6.4.1 of this memo. 
H.2 mpeg4-generic Media Type Registration This Appendix requestsC.6.3, a session description that uses theregistration ofrender parameter encodes the"rtp-midi" value forAudioSpecificConfig binary bitfield as a Base64-encoded string assigned to the"mode" parameter"inline" parameter, or in the body of an HTTP URL assigned to the"mpeg4-generic" media type. The "mpeg4-generic" media type is defined in [RFC3640], and [RFC3640] defines"url" parameter. Below, we show arepositorysimplified binary AudioSpecificConfig bitfield format, suitable for sending and receiving GM and DLS 2 data: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | AOTYPE |FREQIDX|CHANNEL|SACNK| FILE_BLK 1 (required) ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |1|SACNK| FILE_BLK 2 (optional) ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... |1|SACNK| FILE_BLK N (optional) ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0|0| (first "0" bit terminates FILE_BLK list) +-+-+ Figure E.3 -- Simplified AudioSpecificConfig The 5-bit AOTYPE field specifies the"mode" parameter. We are registering mode rtp-midi to support the MPEGAudiocodecs [MPEGSA] thatObject Type as an unsigned integer. The legal values for useMIDI. In conjunctionwiththis registration request, we requestmpeg4-generic RTP MIDI streams are "15" (General MIDI), "14" (DLS 2), and "13" (Structured Audio). Thus, receivers that do not support all three mpeg4-generic renderers may parse theregistrationfirst 5 bits ofthe parameters listedan AudioSpecificConfig coded inthe "optional parameters" section below (both the "non-extensible parameters"a session description and reject sessions that specify unsupported renderers. The 4-bit FREQIDX field specifies the"extensible parameters" lists). We also request the creationsampling rate ofrepositories forthe"extensible parameters";renderer. 
We show thedetailsmapping ofthis request appearFREQIDX values to sampling rates inAppendix H.2.1 below. Media type name: audio Subtype name: mpeg4-generic Required parameters: The "mode" parameter is required by [RFC3640]. [RFC3640] requestsFigure E.4. Senders MUST specify arepository for "mode", so that new values for mode may be added. We requestsampling frequency that matches thevalue "rtp-midi" be added to the "mode" repository. In mode rtp-midi,RTP clock rate, if possible; if not, senders MUST specify thempeg4-generic parameter rate is a required parameter. Rate specifiesescape value. Receivers MUST consult the RTPtimestampclockrate. See Sections 2.1 and 6.2parameter forusage details ofthe true sampling ratein mode rtp-midi. Optional parameters: We request registration ofif thefollowing parameters for use in mode rtp-midi for mpeg4-generic. Non-extensible parameters: ch_anchor: See Appendix C.2.3 for usage details. ch_default: See Appendix C.2.3 for usage details. ch_never: See Appendix C.2.3 for usage details. cm_unused: See Appendix C.1 for usage details. cm_used: See Appendix C.1 for usage details. chanmask: See Appendix C.6.4.3 for usage details. cid: See Appendix C.6.3 for usage details. guardtime: See Appendix C.4.2 for usage details. inline: See Appendix C.6.3 for usage details. linerate: See Appendix C.3 for usage details. mperiod: See Appendix C.3 for usage details. multimode: See Appendix C.6.1 for usage details. musicport: See Appendix C.5 for usage details. octpos: See Appendix C.3 for usage details. rinit: See Appendix C.6.3 for usage details. rtp_maxptime: See Appendix C.4.1 for usage details. rtp_ptime: See Appendix C.4.1 for usage details. smf_cid: See Appendix C.6.4.2 for usage details. smf_inline: See Appendix C.6.4.2 for usage details. smf_url: See Appendix C.6.4.2 for usage details. tsmode: See Appendix C.3 for usage details. url: See Appendix C.6.3 for usage details. Extensible parameters: j_sec: See Appendix C.2.1 for usage details. 
See Appendix H.2.1 for repository details. j_update: See Appendix C.2.2 for usage details. See Appendix H.2.1 for repository details. render: See Appendix C.6 for usage details. See Appendix H.2.1 for repository details. subrender: See Appendix C.6.2 for usage details. See Appendix H.2.1 for repository details. smf_info: See Appendix C.6.4.1 for usage details. See Appendix H.2.1 for repository details. Encoding considerations: The format for this typeescape value isframed and binary. Restrictions on usage: Only defined for real-time transfers of audio/mpeg4-generic RTP streams with mode=rtp-midi. Security considerations: See Appendix G of this memo. Interoperability considerations: Except for the marker bit (Section 2.1),specified. FREQIDX Sampling Frequency 0x0 96000 0x1 88200 0x2 64000 0x3 48000 0x4 44100 0x5 32000 0x6 24000 0x7 22050 0x8 16000 0x9 12000 0xa 11025 0xb 8000 0xc reserved 0xd reserved 0xe reserved 0xf escape value Figure E.4 -- FreqIdx encoding The 4-bit CHANNEL field specifies thepacket formatsnumber of audio channels foraudio/rtp-midi and audio/mpeg4-generic (mode rtp-midi) are identical.the renderer. Theformats differ in use: audio/mpeg4-generic is for MPEG work, audio/rtp-midi is for all other work. Published specification: This memo, [MIDI], and [MPEGSA] arevalues 0x1 to 0x5 specify 1 to 5 audio channels; thenormative references. In addition, references [NMP], [GRAME], and [GUIDE] provide non-normative implementation guidance. Applications which use this media type: MPEG 4 serversvalue 0x6 specifies 5+1 surround sound, andterminals that support [MPEGSA]. Additional information: None. Person & email address to contact for further information: John Lazzaro <lazzaro@cs.berkeley.edu> Intended usage: COMMON. Author: John Lazzaro <lazzaro@cs.berkeley.edu> Change Controller: IETF Audio/Video Transport Working Group delegated fromtheIESG. 
H.2.1 Repository request for mode rtp-midi for mpeg4-generic For mode rtp-midi ofvalue 0x7 specifies 7+1 surround sound. If thempeg4-generic subtype, we requestrtpmap line in thecreationsession description specifies one ofrepositories for extensionsthese formats, CHANNEL MUST be set to thefollowing parameters (which are those listed as "extensible parameters" in Appendix H.2). j_sec: Registrations for this repository may only occur via an IETF standards-track document. Appendix C.2.1corresponding value. Otherwise, CHANNEL MUST be set to 0x0. The CHANNEL field is followed by a list ofthis memo describes appropriate registrations for this repository. Initial values for this repository appear below: "none": Definedone or more binary file data blocks. The 3-bit SACNK field (the chunk_type field inAppendix C.2.1 of this memo. "recj": Definedclass StructuredAudioSpecificConfig, defined inAppendix C.2.1[MPEGSA]) specifies the type ofthis memo. j_update: Registrations for this repository mayeach data block. For General MIDI, onlyoccur via an IETF standards-track document. Appendix C.2.2 of this memo describes appropriate registrations for this repository. Initial values for this repositoryStandard MIDI Files may appearbelow: "anchor": Defined in Appendix C.2.2 of this memo. "open-loop": DefinedinAppendix C.2.2the list (SACNK field value 2). For DLS 2, only Standard MIDI Files and DLS 2 RIFF files (SACNK field value 4) may appear. For both ofthis memo. "closed-loop": Definedthese file types, the FILE_BLK field has the format shown inAppendix C.2.2 of this memo. render: Registrations for this repository MUST includeFigure E.5: aspecification of32- bit unsigned integer value (FILE_LEN) coding theusagenumber ofthe proposed value. See textbytes in thepreambleSMF or RIFF file, followed by FILE_LEN bytes coding the file data. 
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | FILE_LEN (32-bit, a byte count SMF file or RIFF file) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | FILE_DATA (file contents, a list ofAppendix C.6 for details (the paragraphFILE_LEN bytes) ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure E.5 -- The FILE_BLK field format Note thatbegins "Other render token ..."). Initial values for this repository appear below: "unknown": Definedseveral files may follow CHANNEL field. The "1" constant fields inAppendix C.6Figure E.3 code the presence ofthis memo. "synthetic": Defined in Appendix C.6another file; the "0" constant field codes the end ofthis memo. "null": Definedthe list. The final "0" bit inAppendix C.6Figure E.3 codes the absence ofthis memo. subrender: Registrationsspecial coding tools (see [MPEGAUDIO] for details). Senders not using these tools MUST append thisrepository"0" bit; receivers that do not understand these coding tools MUSTincludeignore all data following aspecification of"1" in this position. The StructuredAudioSpecificConfig bitfield structure requires theusagepresence ofthe proposed value. See text Appendix C.6.2 for details (the paragraph that begins "Other subrender token ..." and subsequent paragraphs). Note that the text in Appendix C.6.2 contains restrictions on subrender registrations forone FILE_BLK. For mpeg4-generic("Registrations forRTP MIDI use of DLS 2, FILE_BLKs MUST code RIFF files or SMF files. For mpeg4-genericsubrender values ..."). Initial values for this repository appear below: "default": Defined in Appendix C.6.2RTP MIDI use of General MIDI, FILE_BLKs MUST code SMF files. By default, thismemo. smf_info: Registrations forSMF will be ignored (Appendix C.6.4.1). 
In thisrepository MUST includedefault case, aspecification ofGM StructuredAudioSpecificConfig bitfield SHOULD code a FILE_BLK whose FILE_LEN is 0, and whose FILE_DATA is empty. To complete this appendix, we derive theusage ofStructuredAudioSpecificConfig that we use in theproposed value. See textGeneral MIDI session examples inAppendix C.6.4.1 for details (the paragraphthis memo. Referring to Figure E.3, we note thatbegins "Other smf_info token ..."). Initial valuesforthis repository appear below: "ignore": DefinedGM, AOTYPE = 15. Our examples use a 44,100 Hz sample rate (FREQIDX = 4) and are inAppendix C.6.4.1 of this memo. "sdp_start": Definedmono (CHANNEL = 1). For GM, a single SMF is encoded (SACNK = 2), using the SMF shown inAppendix C.6.4.1 of this memo. "identity": DefinedFigure E.6 (a 26 byte file). -------------------------------------------- | MIDI File = <Header Chunk> <Track Chunk> | -------------------------------------------- <Header Chunk> = <chunk type> <length> <format> <ntrks> <divsn> 4D 54 68 64 00 00 00 06 00 00 00 01 00 60 <Track Chunk> = <chunk type> <length> <delta-time> <end-event> 4D 54 72 6B 00 00 00 04 00 FF 2F 00 Figure E.6 -- SMF file encoded in the example Placing these constants in binary format into the data structure shown in Figure E.3 yields the constant shown in Figure E.7. 
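The simplified layout of Figure E.3 can be checked by decoding the hexadecimal config string that this memo gives for its General MIDI examples. The sketch below is informative only: the function names and dictionary keys are hypothetical, it covers only the simplified format of Figure E.3 (not the full AudioSpecificConfig of [MPEGAUDIO]), and it assumes that an odd-length hexadecimal string is completed with a zero nibble before conversion to octets.

```python
import base64

# Config string copied from the General MIDI examples in this memo.
GM_CONFIG_HEX = "7A0A0000001A4D546864000000060000000100604D54726B0000000600FF2F000"


def _to_octets(hex_string):
    # Assumption: pad an odd number of hex digits with a zero nibble.
    if len(hex_string) % 2:
        hex_string += "0"
    return bytes.fromhex(hex_string)


def parse_simplified_asc(hex_string):
    """Decode the simplified AudioSpecificConfig layout of Figure E.3."""
    data = _to_octets(hex_string)
    bits = int.from_bytes(data, "big")
    total = len(data) * 8
    pos = 0

    def take(n):
        # Read the next n bits, most significant bit first.
        nonlocal pos
        if pos + n > total:
            raise ValueError("config string is truncated")
        value = (bits >> (total - pos - n)) & ((1 << n) - 1)
        pos += n
        return value

    config = {"aotype": take(5), "freqidx": take(4),
              "channel": take(4), "files": []}
    more = 1                        # the first FILE_BLK is always present
    while more:
        sacnk = take(3)             # chunk type of this data block
        file_len = take(32)         # FILE_LEN, a byte count
        if pos + 8 * file_len > total:
            raise ValueError("FILE_LEN exceeds the available data")
        file_data = bytes(take(8) for _ in range(file_len))
        config["files"].append((sacnk, file_data))
        more = take(1)              # "1": another file follows; "0": end of list
    config["tools_flag"] = take(1)  # final bit of Figure E.3
    return config


def hex_to_inline(hex_string):
    """Re-express a hexadecimal config string in Base64."""
    return base64.b64encode(_to_octets(hex_string)).decode("ascii")


if __name__ == "__main__":
    cfg = parse_simplified_asc(GM_CONFIG_HEX)
    print(cfg["aotype"], cfg["freqidx"], cfg["channel"], len(cfg["files"][0][1]))
    print(hex_to_inline(GM_CONFIG_HEX))
```

Reading the first 5 bits alone (the aotype value here) is enough for a receiver to reject a session whose renderer it does not support, as described for the AOTYPE field.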

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 1 1 1|0 1 0 0|0 0 0 1|0 1 0|0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0|0 1 0 0|1 1 0 1|0 1 0 1|0 1 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 1 0|1 0 0 0|0 1 1 0|0 1 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0|0 0 0 0|0 0 0 0|0 1 1 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 1|0 0 0 0|0 0 0 0|0 1 1 0|0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0|1 1 0 1|0 1 0 1|0 1 0 0|0 1 1 1|0 0 1 0|0 1 1 0|1 0 1 1|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 1 1 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0|0 0 0 0|1 1 1 1|1 1 1 1|0 0 1 0|1 1 1 1|0 0 0 0|0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0|
   +-+-+

         Figure E.7 -- AudioSpecificConfig used in GM examples

   Expressing this bitfield as an ASCII hexadecimal string yields:

      7A0A0000001A4D546864000000060000000100604D54726B0000000600FF2F000

   This string is assigned to the "config" parameter in the minimal
   mpeg4-generic General MIDI examples in this memo (such as the example
   in Section 6.2).  Expressing this string in Base64 [RFC2045] yields:

      egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA

   This string is assigned to the "inline" parameter in the General MIDI
   example shown in Appendix C.6.5.

References

Normative References

   [MIDI]        MIDI Manufacturers Association. "The Complete MIDI 1.0
                 Detailed Specification", 1996.

   [RFC3550]     Schulzrinne, H., Casner, S., Frederick, R., and V.
                 Jacobson, "RTP: A Transport Protocol for Real-Time
                 Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]     Schulzrinne, H. and S. Casner, "RTP Profile for Audio
                 and Video Conferences with Minimal Control", STD 65,
                 RFC 3551, July 2003.

   [RFC3640]     van der Meer, J., Mackie, D., Swaminathan, V., Singer,
                 D., and P. Gentric, "RTP Payload Format for Transport
                 of MPEG-4 Elementary Streams", RFC 3640, November 2003.

   [MPEGSA]      International Standards Organization. "ISO/IEC 14496
                 MPEG-4", Part 3 (Audio), Subpart 5 (Structured Audio),
                 2001.

   [RFC4566]     Handley, M., Jacobson, V., and C. Perkins, "SDP:
                 Session Description Protocol", RFC 4566, July 2006.

   [MPEGAUDIO]   International Standards Organization. "ISO 14496
                 MPEG-4", Part 3 (Audio), 2001.

   [RFC2045]     Freed, N. and N. Borenstein, "Multipurpose Internet
                 Mail Extensions (MIME) Part One: Format of Internet
                 Message Bodies", RFC 2045, November 1996.

   [DLS2]        MIDI Manufacturers Association. "The MIDI Downloadable
                 Sounds Specification", v98.2, 1998.

   [RFC4234]     Crocker, D. and P. Overell, "Augmented BNF for Syntax
                 Specifications: ABNF", RFC 4234, October 2005.

   [RFC2119]     Bradner, S., "Key words for use in RFCs to Indicate
                 Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3711]     Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
                 K. Norrman, "The Secure Real-time Transport Protocol
                 (SRTP)", RFC 3711, March 2004.

   [RFC3264]     Rosenberg, J. and H. Schulzrinne, "An Offer/Answer
                 Model with Session Description Protocol (SDP)", RFC
                 3264, June 2002.

   [RFC3986]     Berners-Lee, T., Fielding, R., and L. Masinter,
                 "Uniform Resource Identifier (URI): Generic Syntax",
                 STD 66, RFC 3986, January 2005.

   [RFC2616]     Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
                 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
                 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3388]     Camarillo, G., Eriksson, G., Holler, J., and H.
                 Schulzrinne, "Grouping of Media Lines in the Session
                 Description Protocol (SDP)", RFC 3388, December 2002.

   [RP015]       MIDI Manufacturers Association. "Recommended Practice
                 015 (RP-015): Response to Reset All Controllers",
                 11/98.

   [RFC4288]     Freed, N. and J. Klensin, "Media Type Specifications
                 and Registration Procedures", BCP 13, RFC 4288,
                 December 2005.

   [RFC3555]     Casner, S. and P. Hoschka, "MIME Type Registration of
                 RTP Payload Formats", RFC 3555, July 2003.

Informative References

   [NMP]         Lazzaro, J. and J. Wawrzynek. "A Case for Network
                 Musical Performance", 11th International Workshop on
                 Network and Operating Systems Support for Digital Audio
                 and Video (NOSSDAV 2001) June 25-26, 2001, Port
                 Jefferson, New York.

   [GRAME]       Fober, D., Orlarey, Y. and S. Letz. "Real Time Musical
                 Events Streaming over Internet", Proceedings of the
                 International Conference on WEB Delivering of Music
                 2001, pages 147-154.

   [RFC3261]     Rosenberg, J., Schulzrinne, H., Camarillo, G.,
                 Johnston, A., Peterson, J., Sparks, R., Handley, M.,
                 and E. Schooler, "SIP: Session Initiation Protocol",
                 RFC 3261, June 2002.

   [RFC2326]     Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
                 Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [ALF]         Clark, D. D. and D. L. Tennenhouse. "Architectural
                 considerations for a new generation of protocols",
                 SIGCOMM Symposium on Communications Architectures and
                 Protocols, (Philadelphia, Pennsylvania), pp. 200-208,
                 IEEE, Sept. 1990.

   [RFC4696]     Lazzaro, J. and J. Wawrzynek, "An Implementation Guide
                 for RTP MIDI", RFC 4696, November 2006.

   [RFC2205]     Braden, R., Zhang, L., Berson, S., Herzog, S., and S.
                 Jamin, "Resource ReSerVation Protocol (RSVP) -- Version
                 1 Functional Specification", RFC 2205, September 1997.

   [RFC4288]     Freed, N. and J. Klensin, "Media Type Specifications
                 and Registration Procedures", BCP 13, RFC 4288,
                 December 2005.

   [RFC4289]     Freed, N. and J. Klensin, "Multipurpose Internet Mail
                 Extensions (MIME) Part Four: Registration Procedures",
                 BCP 13, RFC 4289, December 2005.

   [RFC4571]     Lazzaro, J. "Framing Real-time Transport Protocol (RTP)
                 and RTP Control Protocol (RTCP) Packets over
                 Connection-Oriented Transport", RFC 4571, July 2006.

   [RFC2818]     Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.

   [SPMIDI]      MIDI Manufacturers Association. "Scalable Polyphony
                 MIDI, Specification and Device Profiles", Document
                 Version 1.0a, 2002.

   [LCP]         Apple Computer. "Logic 7 Dedicated Control Surface
                 Support", Appendix B.  Product manual available from
                 www.apple.com.

Authors' Addresses

   John Lazzaro (corresponding author)
   UC Berkeley
   CS Division
   315 Soda Hall
   Berkeley CA 94720-1776

   EMail: lazzaro@cs.berkeley.edu

   John Wawrzynek
   UC Berkeley
   CS Division
   631 Soda Hall
   Berkeley CA 94720-1776

   EMail: johnw@cs.berkeley.edu

Full Copyright Statement

   Copyright (C) The IETF Trust (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST,
   AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
   PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).