



Internet Engineering Task Force                                 B. Liang
Internet-Draft                                       Tsinghua University
Intended status: Standards Track                                Y. Xiang
Expires: 8 August 2026                                  Yunshan Networks
                                                                  X. Shi
                                                                  X. Yin
                                                     Tsinghua University
                                                         4 February 2026


                    TCP Provenance Identifier Option
                  draft-liang-tcp-provenance-option-01

Abstract

   This document describes a TCP option that carries a Provenance
   Identifier (ProvID) to enable correlation of TCP connections when
   transport-layer identifiers change along the path.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 August 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.











Liang, et al.             Expires 8 August 2026                 [Page 1]

Internet-Draft      TCP Provenance Identifier Option       February 2026


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   2.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Association of Traffic Across Rewriting . . . . . . . . .   3
     2.2.  Process-level Origin Attribution for Remediation  . . . .   4
   3.  Option Format . . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  Middlebox Considerations  . . . . . . . . . . . . . . . . . .   5
     4.1.  Non-terminating Middleboxes . . . . . . . . . . . . . . .   5
     4.2.  Terminating Middleboxes . . . . . . . . . . . . . . . . .   5
     4.3.  Domain Boundary Handling  . . . . . . . . . . . . . . . .   5
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   6
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   6
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   6
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .   6
     7.2.  Informative References  . . . . . . . . . . . . . . . . .   6
   Appendix A.  Appendix 1 . . . . . . . . . . . . . . . . . . . . .   6
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   6

1.  Introduction

   In administrative domains (e.g., cloud platforms, enterprise
   networks, and data centers), TCP traffic often traverses devices such
   as NATs, load balancers, and service proxies that rewrite transport-
   layer identifiers or terminate and re-originate TCP connections.  As
   a result, a single end-to-end exchange between two workloads may
   correspond to a sequence of distinct TCP connections within the
   domain.  This document refers to that end-to-end exchange as a
   "logical communication".

   These transformations break provenance continuity.  Observations of
   TCP traffic at different points in the domain cannot reliably be
   associated with the same logical communication, and operators cannot
   determine which workload instance originally initiated a connection
   once rewriting has occurred.






Liang, et al.             Expires 8 August 2026                 [Page 2]

Internet-Draft      TCP Provenance Identifier Option       February 2026


   This document defines an experimental TCP option that carries a small
   Provenance Identifier (ProvID).  A ProvID is a compact value
   generated for the duration of a logical communication using workload-
   scoped attributes, such as host IP with process identifier.  As
   illustrated in Figure 1, the ProvID enables provenance correlation
   across rewriting boundaries within the domain.

   The ProvID option is intended for use within administrative domains
   and is not designed for use on the open Internet.

+------------------+          +--------------+          +-------------+
| Originating Host |          |  Middlebox   |          | Destination |
|                  |          | (terminates) |          |             |
|  [Workload A]    |  TCP #1  |              |  TCP #2  |             |
|   ProvID = P1    | (P1)     |              | (P1)     |             |
|                  |--------->|   Relay P1   |--------->|             |
|  [Workload B]    |          |              |          |             |
|   ProvID = P2    |          |              |          |             |
|                  |          |              |          |             |
+------------------+          +--------------+          +-------------+
        |                             |                        |
Observation Point A          Observation Point B       Observation Point C
logs: (TCP #1, P1)           logs: (TCP #1, P1)        logs: (TCP #2, P1)
                                   (TCP #2, P1)
Logical Communication (ProvID = P1) for Workload A:
{ TCP #1, TCP #2 }

     Figure 1: Provenance Correlation for a Logical Communication
                          across Middleboxes

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Use Cases

2.1.  Association of Traffic Across Rewriting

   In many administrative domains, operators need to reconstruct the
   path of a logical communication for troubleshooting, incident
   investigation, or auditing.  Flow logs and measurements collected at
   different observation points commonly reference different TCP
   connections that correspond to the same logical communication, and
   the originating endpoint cannot be reliably identified from transport



Liang, et al.             Expires 8 August 2026                 [Page 3]

Internet-Draft      TCP Provenance Identifier Option       February 2026


   identifiers alone.

   In this use case, the originating endpoint includes the ProvID in the
   TCP header options.  Observation points record the ProvID alongside
   locally observed connection identifiers.  When a logical
   communication is realized as a series of distinct TCP connections
   within the domain, the ProvID provides a stable correlation handle
   for aggregating these disparate records.  This enables end-to-end
   reconstruction of communication paths across middleboxes, supports
   cross-layer observability (for example, linking network telemetry
   with application or process context), and maintains provenance
   continuity.

2.2.  Process-level Origin Attribution for Remediation

   An operator may detect anomalous TCP behavior at an observation point
   and need to remediate the issue by acting on the specific originator
   responsible.  At that point, the original source may be obscured by
   address translation or connection rewriting, and even host-level
   attribution may be insufficient because multiple independent
   processes can share a host.

   In this use case, the originating endpoint generates a ProvID using
   the pair (host IP address, process identifier) of the process that
   created the socket.  An operator that observes abnormal traffic
   associated with a ProvID can map the ProvID to the initiating (host
   IP, process identifier) and take immediate action on that process
   (for example, suspend or restart the process, isolate the workload
   instance, or apply a narrowly scoped network policy).

3.  Option Format

   The Provenance Identifier (ProvID) option uses a fixed-length
   experimental TCP option format.  The option is identified by the
   experimental option kind and is distinguished by a fixed option
   length.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Kind      |    Length     |              ExID             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           ProvID                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: ProvID TCP Option Format





Liang, et al.             Expires 8 August 2026                 [Page 4]

Internet-Draft      TCP Provenance Identifier Option       February 2026


   Kind
      The TCP option kind.  The value of this field is 253.

   Length
      The total length of the TCP option in bytes.  For the ProvID
      option defined in this document, the value of this field is 12.

   ExID
      The Experiment Identifier (ExID).  This 2-byte field identifies
      the ProvID experiment when used with experimental TCP option
      kinds.  The value of this field is 0xDEE9.

   ProvID
      The Provenance Identifier.  This field is 8 bytes in length and
      carries a provenance identifier defined by the sender.

4.  Middlebox Considerations

   To maintain provenance continuity within an administrative domain,
   middleboxes (as defined in [RFC3234]) MUST handle the ProvID option
   according to their function.

4.1.  Non-terminating Middleboxes

   A non-terminating middlebox is a device that resides on the
   communication path but does not terminate the end-to-end TCP
   connection.  These middleboxes MUST forward the ProvID option
   unmodified in any segment where it appears.  If the middlebox
   modifies transport-layer identifiers (e.g., performing Network
   Address Translation), it MUST NOT strip or alter the ProvID option.

4.2.  Terminating Middleboxes

   A terminating middlebox is a device that acts as the endpoint for a
   TCP connection (e.g., a service proxy).  To maintain provenance
   continuity, these middleboxes MUST relay the ProvID from the incoming
   connection to the outgoing connection.  Specifically, whenever a
   ProvID is observed in an incoming TCP segment, the middlebox MUST
   include the identical ProvID value in the corresponding segment of
   the outgoing connection.

4.3.  Domain Boundary Handling

   To prevent the leakage of internal network metadata, middleboxes at
   the boundary of the administrative domain MUST strip the ProvID
   option from any TCP segments exiting the domain.  Similarly, any
   ProvID options present in traffic entering the domain from the open
   Internet MUST be stripped.



Liang, et al.             Expires 8 August 2026                 [Page 5]

Internet-Draft      TCP Provenance Identifier Option       February 2026


5.  IANA Considerations

   This document's IANA considerations are to be determined and will be
   provided in a subsequent revision of this draft.  [TODO]

6.  Security Considerations

   This document's security considerations are to be determined and will
   be provided in a subsequent revision of this draft.  [TODO]

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC6994]  Touch, J., "Shared Use of Experimental TCP Options",
              RFC 6994, DOI 10.17487/RFC6994, August 2013,
              <https://www.rfc-editor.org/info/rfc6994>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC9293]  Eddy, W., Ed., "Transmission Control Protocol (TCP)",
              STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
              <https://www.rfc-editor.org/info/rfc9293>.

   [RFC3234]  Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and
              Issues", RFC 3234, DOI 10.17487/RFC3234, February 2002,
              <https://www.rfc-editor.org/info/rfc3234>.

7.2.  Informative References

Appendix A.  Appendix 1

   TODO

Authors' Addresses

   Bowen Liang
   Tsinghua University
   Email: liangbw25@mails.tsinghua.edu.cn





Liang, et al.             Expires 8 August 2026                 [Page 6]

Internet-Draft      TCP Provenance Identifier Option       February 2026


   Yang Xiang
   Yunshan Networks
   Email: xiangyang@yunshan.net


   Xingang Shi
   Tsinghua University
   Email: shixg@cernet.edu.cn


   Xia Yin
   Tsinghua University
   Email: yinxia@tsinghua.edu.cn






































Liang, et al.             Expires 8 August 2026                 [Page 7]
