rfc9766v2.txt | rfc9766.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) T. Haynes | Internet Engineering Task Force (IETF) T. Haynes | |||
Request for Comments: 9766 T. Myklebust | Request for Comments: 9766 T. Myklebust | |||
Category: Standards Track Hammerspace | Category: Standards Track Hammerspace | |||
ISSN: 2070-1721 April 2025 | ISSN: 2070-1721 April 2025 | |||
Extensions for Weak Cache Consistency in NFSv4.2's Flexible File Layout | Extensions for Weak Cache Consistency in NFSv4.2's Flexible File Layout | |||
Abstract | Abstract | |||
This document specifies extensions to Parallel NFS (pNFS) for | This document specifies extensions to NFSv4.2 for improving Weak | |||
improving Weak Cache Consistency (WCC). These extensions introduce | Cache Consistency (WCC). These extensions introduce mechanisms that | |||
mechanisms that ensure partial writes performed under a pNFS layout | ensure partial writes performed under a Parallel NFS (pNFS) layout | |||
remain coherent and correctly tracked. The solution addresses | remain coherent and correctly tracked. The solution addresses | |||
concurrency and data integrity concerns that may arise when multiple | concurrency and data integrity concerns that may arise when multiple | |||
clients write to the same file through separate data servers. By | clients write to the same file through separate data servers. By | |||
defining additional interactions among clients, metadata servers, and | defining additional interactions among clients, metadata servers, and | |||
data servers, this specification enhances the reliability of NFSv4 in | data servers, this specification enhances the reliability of NFSv4 in | |||
parallel-access environments and ensures consistency across diverse | parallel-access environments and ensures consistency across diverse | |||
deployment scenarios. | deployment scenarios. | |||
Status of This Memo | Status of This Memo | |||
skipping to change at line 140 | skipping to change at line 140 | |||
capitals, as shown here. | capitals, as shown here. | |||
2. Weak Cache Consistency (WCC) | 2. Weak Cache Consistency (WCC) | |||
A pNFS layout type enables the metadata server to inform the client | A pNFS layout type enables the metadata server to inform the client | |||
of both the storage protocol and the locations of the data that the | of both the storage protocol and the locations of the data that the | |||
client should use when communicating with the storage devices. The | client should use when communicating with the storage devices. The | |||
flexible file layout type, as specified in [RFC8435], describes how | flexible file layout type, as specified in [RFC8435], describes how | |||
data servers using NFSv3 can be accessed. The client is restricted | data servers using NFSv3 can be accessed. The client is restricted | |||
to performing the following NFSv3 operations on the filehandles | to performing the following NFSv3 operations on the filehandles | |||
provided in the layout: READ (Section 3.3.6 of [RFC1813]), WRITE | provided in the layout: READ, WRITE, and COMMIT (see Sections 3.3.6, | |||
(Section 3.3.7 of [RFC1813]), and COMMIT (Section 3.3.21 of | 3.3.7, and 3.3.21 of [RFC1813], respectively). In other words, the | |||
[RFC1813]). In other words, the client may only use NFSv3 operations | client may only use NFSv3 operations that act directly on the data | |||
that act directly on the data portion of the file. | portion of the file. | |||
Because there is no control protocol (see [RFC8434]) possible with | Because there is no control protocol (see [RFC8434]) possible with | |||
all data servers, NFSv3 is used as the control protocol. As such, | all data servers, NFSv3 is used as the control protocol. As such, | |||
the following NFSv3 operations are commonly used by the metadata | the following NFSv3 operations are commonly used by the metadata | |||
server: CREATE (see Section 3.3.8 of [RFC1813]), GETATTR (see | server: CREATE, GETATTR, and SETATTR (see Sections 3.3.8, 3.3.1, and | |||
Section 3.3.1 of [RFC1813]), and SETATTR (see Section 3.3.2 of | 3.3.2 of [RFC1813], respectively). That is, the metadata server is | |||
[RFC1813]). That is, the metadata server is only allowed to use | only allowed to use NFSv3 operations that directly act on the | |||
NFSv3 operations that directly act on the metadata portion of the | metadata portion of the data file. GETATTR allows the metadata | |||
data file. GETATTR allows the metadata server to mainly retrieve the | server to mainly retrieve the mtime (modify time), ctime (change | |||
mtime (modify time), ctime (change time), and atime (access time). | time), and atime (access time). The metadata server can use this | |||
The metadata server can use this information to determine if the | information to determine if the client modified the file whilst it | |||
client modified the file whilst it held an iomode of LAYOUTIOMODE4_RW | held an iomode of LAYOUTIOMODE4_RW (see Section 3.3.20 of [RFC8881]). | |||
(see Section 3.3.20 of [RFC8881]). Then it can determine the | Then it can determine the following for the metadata file: | |||
following for the metadata file: time_modify (see Section 5.8.2.43 of | time_modify, time_metadata, and time_access (see Sections 5.8.2.43, | |||
[RFC8881]), time_metadata (see Section 5.8.2.42 of [RFC8881]), and | 5.8.2.42, and 5.8.2.37 of [RFC8881], respectively). That is, it can | |||
time_access (see Section 5.8.2.37 of [RFC8881]). That is, it can | ||||
determine the information to return to clients in an NFSv4.2 GETATTR | determine the information to return to clients in an NFSv4.2 GETATTR | |||
response. | response. | |||
For example, the metadata server might issue an NFSv3 GETATTR | For example, the metadata server might issue an NFSv3 GETATTR | |||
operation to the data server, which is typically triggered by a | operation to the data server, which is typically triggered by a | |||
client's NFSv4 GETATTR request to the metadata server. In addition | client's NFSv4 GETATTR request to the metadata server. In addition | |||
to the cost of each individual GETATTR operation, the data server can | to the cost of each individual GETATTR operation, the data server can | |||
be overwhelmed by a large volume of such requests. NFSv3 addressed a | be overwhelmed by a large volume of such requests. NFSv3 addressed a | |||
similar challenge by including a post-operation attribute in the READ | similar challenge by including a post-operation attribute in the READ | |||
and WRITE operations to report WCC data (see Section 2.6 of | and WRITE operations to report WCC data (see Section 2.6 of | |||
End of changes. 3 change blocks. | ||||
19 lines changed or deleted | 18 lines changed or added | |||
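
The GETATTR-based check described in Section 2 can be illustrated with a minimal sketch. It is not taken from the RFC: the names `DataFileTimes`, `modified_while_layout_held`, and `metadata_file_times` are hypothetical, and the sketch assumes the metadata server keeps the data file's timestamps from an NFSv3 GETATTR issued when the LAYOUTIOMODE4_RW layout was granted and compares them with a later GETATTR. The pairing of mtime, ctime, and atime with time_modify, time_metadata, and time_access follows the order in which the text lists them.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical representation: (seconds, nanoseconds) pairs.
Timestamp = Tuple[int, int]


@dataclass(frozen=True)
class DataFileTimes:
    """Timestamps of the data file as reported by an NFSv3 GETATTR (illustrative)."""
    mtime: Timestamp  # modify time
    ctime: Timestamp  # change time
    atime: Timestamp  # access time


def modified_while_layout_held(at_grant: DataFileTimes,
                               at_check: DataFileTimes) -> bool:
    """Assumed heuristic: the data file changed if mtime or ctime moved."""
    return at_check.mtime != at_grant.mtime or at_check.ctime != at_grant.ctime


def metadata_file_times(at_check: DataFileTimes) -> Dict[str, Timestamp]:
    """Derive the attributes to return in an NFSv4.2 GETATTR response."""
    return {
        "time_modify": at_check.mtime,
        "time_metadata": at_check.ctime,
        "time_access": at_check.atime,
    }


if __name__ == "__main__":
    granted = DataFileTimes(mtime=(100, 0), ctime=(100, 0), atime=(100, 0))
    later = DataFileTimes(mtime=(160, 500), ctime=(160, 500), atime=(150, 0))
    if modified_while_layout_held(granted, later):
        print(metadata_file_times(later))
```

Every such check costs one GETATTR round trip to the data server, which is the load that the section describes as potentially overwhelming the data server.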