

RFC 1813

                  NFS Version 3 Protocol Specification


Table of Contents

   1.    Introduction
   1.1     Scope of the NFS version 3 protocol
   1.2     Useful terms
   1.3     Remote Procedure Call
   1.4     External Data Representation
   1.5     Authentication and Permission Checking
   1.6     Philosophy
   1.7     Changes from the NFS version 2 protocol
   2.    RPC Information
   2.1     Authentication
   2.2     Constants
   2.3     Transport address
   2.4     Sizes
   2.5     Basic Data Types
   2.6     Defined Error Numbers
   3.    Server Procedures
   3.1     General comments on attributes
   3.2     General comments on filenames
   3.3.0   NULL: Do nothing



Callaghan, et al             Informational                      [Page 1]

RFC 1813              NFS Version 3 Protocol                  June 1995

   3.3.1   GETATTR: Get file attributes
   3.3.2   SETATTR: Set file attributes
   3.3.3   LOOKUP: Lookup filename
   3.3.4   ACCESS: Check access permission
   3.3.5   READLINK: Read from symbolic link
   3.3.6   READ: Read from file
   3.3.7   WRITE: Write to file
   3.3.8   CREATE: Create a file
   3.3.9   MKDIR: Create a directory
   3.3.10  SYMLINK: Create a symbolic link
   3.3.11  MKNOD: Create a special device
   3.3.12  REMOVE: Remove a file
   3.3.13  RMDIR: Remove a directory
   3.3.14  RENAME: Rename a file or directory
   3.3.15  LINK: Create link to an object
   3.3.16  READDIR: Read From directory
   3.3.17  READDIRPLUS: Extended read from directory
   3.3.18  FSSTAT: Get dynamic file system information
   3.3.19  FSINFO: Get static file system information
   3.3.20  PATHCONF: Retrieve POSIX information
   3.3.21  COMMIT: Commit cached data on a server to stable storage
   4.    Implementation issues
   4.1     Multiple version support
   4.2     Server/client relationship
   4.3     Path name interpretation
   4.4     Permission issues
   4.5     Duplicate request cache
   4.6     File name component handling
   4.7     Synchronous modifying operations
   4.8     Stable storage
   4.9     Lookups and name resolution
   4.10    Adaptive retransmission
   4.11    Caching policies
   4.12    Stable versus unstable writes
   4.13    32 bit clients/servers and 64 bit clients/servers
   5.    Appendix I: Mount protocol
   5.1     RPC Information
   5.1.1     Authentication
   5.1.2     Constants
   5.1.3     Transport address
   5.1.4     Sizes
   5.1.5     Basic Data Types
   5.2     Server Procedures
   5.2.0     NULL: Do nothing
   5.2.1     MNT: Add mount entry
   5.2.2     DUMP: Return mount entries
   5.2.3     UMNT: Remove mount entry
   5.2.4     UMNTALL: Remove all mount entries
   5.2.5     EXPORT: Return export list
   6.    Appendix II: Lock manager protocol
   6.1     RPC Information
   6.1.1     Authentication
   6.1.2     Constants
   6.1.3     Transport Address
   6.1.4     Basic Data Types
   6.2     NLM Procedures
   6.2.0     NULL: Do nothing
   6.3     Implementation issues
   6.3.1     64-bit offsets and lengths
   6.3.2     File handles
   7.    Appendix III: Bibliography
   8.    Security Considerations
   9.    Acknowledgements
   10.   Authors' Addresses

1. Introduction

   Sun's NFS protocol provides transparent remote access to shared
   file systems across networks. The NFS protocol is designed to be
   machine, operating system, network architecture, and transport
   protocol independent. This independence is achieved through the
   use of Remote Procedure Call (RPC) primitives built on top of an
   eXternal Data Representation (XDR). Implementations of the NFS
   version 2 protocol exist for a variety of machines, from personal
   computers to supercomputers.

   The initial version of the NFS protocol is specified in the
   Network File System Protocol Specification [RFC1094]. A
   description of the initial implementation can be found in
   [Sandberg].

   The supporting MOUNT protocol performs the operating
   system-specific functions that allow clients to attach remote
   directory trees to a point within the local file system.
   The mount process also allows the server to grant remote access
   privileges to a restricted set of clients via export control.

   The Lock Manager provides support for file locking when used in
   the NFS environment. The Network Lock Manager (NLM) protocol
   isolates the inherently stateful aspects of file locking into a
   separate protocol.

   A complete description of the above protocols and their
   implementation is to be found in [X/OpenNFS].

   The purpose of this document is to:

      o Specify the NFS version 3 protocol.

      o Describe semantics of the protocol through annotation
        and description of intended implementation.

      o Specify the MOUNT version 3 protocol.

      o Briefly describe the changes between the NLM version 3
        protocol and the NLM version 4 protocol.

   The normative text is the description of the RPC procedures and
   arguments and results, which defines the over-the-wire protocol,
   and the semantics of those procedures. The material describing
   implementation practice aids the understanding of the protocol
   specification and describes some possible implementation issues
   and solutions. It is not possible to describe all implementations
   and the UNIX operating system implementation of the NFS version 3
   protocol is most often used to provide examples. Given that, the
   implementation discussion does not bear the authority of the
   description of the over-the-wire protocol itself.

1.1 Scope of the NFS version 3 protocol

   This revision of the NFS protocol addresses new requirements.
   The need to support larger files and file systems has prompted
   extensions to allow 64 bit file sizes and offsets. The revision
   enhances security by adding support for an access check to be
   done on the server. Performance modifications are of three
   types:

   1. The number of over-the-wire packets for a given
      set of file operations is reduced by returning file
      attributes on every operation, thus decreasing the number
      of calls to get modified attributes.

   2. The write throughput bottleneck caused by the synchronous
      definition of write in the NFS version 2 protocol has been
      addressed by adding support so that the NFS server can do
      unsafe writes. Unsafe writes are writes which have not
      been committed to stable storage before the operation
      returns. This specification defines a method for
      committing these unsafe writes to stable storage in a
      reliable way.

   3. Limitations on transfer sizes have been relaxed.
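
   The commit method mentioned in item 2 can be illustrated with a
   small simulation. This is a sketch under stated assumptions, not
   the wire protocol: ToyServer and client_write_all are hypothetical
   names, the server is an in-memory object, and the idea that the
   client compares an 8-byte verifier returned by WRITE with the one
   returned by COMMIT follows the WRITE and COMMIT procedure
   descriptions later in this specification.

```python
import struct


class ToyServer:
    """Hypothetical in-memory stand-in for an NFS server (not a real API)."""

    def __init__(self):
        self.stable = {}      # offset -> data that survives a crash
        self.unstable = {}    # offset -> data held only in volatile memory
        self.boot_count = 1

    @property
    def verifier(self) -> bytes:
        # Eight opaque bytes that change whenever the server restarts.
        return struct.pack(">Q", self.boot_count)

    def write_unstable(self, offset: int, data: bytes) -> bytes:
        # Unsafe write: reply before the data reaches stable storage.
        self.unstable[offset] = data
        return self.verifier

    def commit(self) -> bytes:
        # Flush everything still held in memory to stable storage.
        self.stable.update(self.unstable)
        self.unstable.clear()
        return self.verifier

    def reboot(self) -> None:
        # A crash loses uncommitted data and changes the verifier.
        self.unstable.clear()
        self.boot_count += 1


def client_write_all(server, chunks, crash_once=False) -> int:
    """Send chunks as unsafe writes; keep them until COMMIT confirms them.

    Returns the number of write/commit rounds that were needed.
    """
    rounds = 0
    while chunks:
        rounds += 1
        seen = {server.write_unstable(off, data) for off, data in chunks.items()}
        if crash_once:          # simulate a server crash before COMMIT
            server.reboot()
            crash_once = False
        if seen == {server.commit()}:
            break               # verifiers match: data is on stable storage
        # Verifier changed: the server restarted, so resend everything.
    return rounds
```

   The client discards its copy of the data only after a COMMIT
   reply carries the same verifier as every preceding WRITE reply;
   a mismatch is the signal to retransmit.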

   The ability to support multiple versions of a protocol in RPC
   will allow implementors of the NFS version 3 protocol to define
   clients and servers that provide backwards compatibility with
   the existing installed base of NFS version 2 protocol
   implementations.

   The extensions described here represent an evolution of the
   existing NFS protocol and most of the design features of the
   NFS protocol described in [Sandberg] persist. See section 1.7,
   Changes from the NFS version 2 protocol, for a more
   detailed summary of the changes introduced by this revision.

1.2 Useful terms

   In this specification, a "server" is a machine that provides
   resources to the network; a "client" is a machine that accesses
   resources over the network; a "user" is a person logged in on a
   client; an "application" is a program that executes on a client.

1.3 Remote Procedure Call

   The Sun Remote Procedure Call specification provides a
   procedure-oriented interface to remote services. Each server
   supplies a program, which is a set of procedures. The NFS
   service is one such program. The combination of host address,
   program number, version number, and procedure number specifies
   one remote service procedure. Servers can support multiple
   versions of a program by using different protocol version
   numbers.

   The NFS protocol was designed to not require any specific level
   of reliability from its lower levels so it could potentially be
   used on many underlying transport protocols. The NFS service is
   based on RPC which provides the abstraction above lower level
   network and transport protocols.

   The rest of this document assumes the NFS environment is
   implemented on top of Sun RPC, which is specified in [RFC1057].
   A complete discussion is found in [Corbin].

1.4 External Data Representation

   The eXternal Data Representation (XDR) specification provides a
   standard way of representing a set of data types on a network.
   This solves the problem of different byte orders, structure
   alignment, and data type representation on different,
   communicating machines.

   In this document, the RPC Data Description Language is used to
   specify the XDR format parameters and results to each of the RPC
   service procedures that an NFS server provides. The RPC Data
   Description Language is similar to declarations in the C
   programming language. A few new constructs have been added.
   The notation:

      string  name[SIZE];
      string  data<DSIZE>;

   defines name, which is a fixed size block of SIZE bytes, and
   data, which is a variable sized block of up to DSIZE bytes. This
   notation indicates fixed-length arrays and arrays with a
   variable number of elements up to a fixed maximum. A
   variable-length definition with no size specified means there is
   no maximum size for the field.

   The discriminated union definition:

      union example switch (enum status) {
           case OK:
                struct {
                     filename      file1;
                     filename      file2;
                     integer       count;
                }
           case ERROR:
                struct {
                     errstat       error;
                     integer       errno;
                }
           default:
                void;
      }

   defines a structure where the first thing over the network is an
   enumeration type called status. If the value of status is OK,
   the next thing on the network will be the structure containing
   file1, file2, and count. Else, if the value of status is ERROR,
   the next thing on the network will be a structure containing
   error and errno. If the value of status is neither OK nor
   ERROR, then there is no more data in the structure.

   The XDR type, hyper, is an 8 byte (64 bit) quantity. It is used
   in the same way as the integer type. For example:

      hyper          foo;
      unsigned hyper bar;

   foo is an 8 byte signed value, while bar is an 8 byte unsigned
   value.
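
   The byte layout implied by these declarations can be illustrated
   with a short sketch. It follows the general XDR rules (big-endian
   values, 4-byte units, zero padding) and makes several assumptions
   that are not part of the text above: the helper names are
   hypothetical, the enumeration values OK = 0 and ERROR = 1 are
   invented for the example, filename is treated as a variable-length
   string, and errstat is treated as a 4-byte enumeration.

```python
import struct

OK, ERROR = 0, 1  # assumed enum values; the example union does not give them


def xdr_int(value: int) -> bytes:
    """Signed 32-bit integer (also used for enum discriminants)."""
    return struct.pack(">i", value)


def xdr_hyper(value: int) -> bytes:
    """Signed 64-bit quantity, as in `hyper foo;`."""
    return struct.pack(">q", value)


def xdr_uhyper(value: int) -> bytes:
    """Unsigned 64-bit quantity, as in `unsigned hyper bar;`."""
    return struct.pack(">Q", value)


def _pad(data: bytes) -> bytes:
    # Pad with zero bytes up to the next multiple of four.
    return data + b"\x00" * (-len(data) % 4)


def xdr_fixed(data: bytes, size: int) -> bytes:
    """Fixed-size block such as name[SIZE]: no length prefix."""
    if len(data) != size:
        raise ValueError("fixed-size field must be exactly SIZE bytes")
    return _pad(data)


def xdr_variable(data: bytes, max_size=None) -> bytes:
    """Variable-size block such as data<DSIZE>: length prefix, then bytes."""
    if max_size is not None and len(data) > max_size:
        raise ValueError("field exceeds its declared maximum")
    return struct.pack(">I", len(data)) + _pad(data)


def xdr_example(status: int, *fields) -> bytes:
    """Encode the `example` union: discriminant first, then the chosen arm."""
    out = xdr_int(status)
    if status == OK:
        file1, file2, count = fields
        out += xdr_variable(file1) + xdr_variable(file2) + xdr_int(count)
    elif status == ERROR:
        error, errno = fields
        out += xdr_int(error) + xdr_int(errno)
    # default: void, so nothing follows the discriminant
    return out
```

   For instance, xdr_example(OK, b"a", b"bb", 3) yields 24 bytes:
   4 for the discriminant, 8 for each padded filename, and 4 for
   count.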

   Although RPC/XDR compilers exist to generate client and server
   stubs from RPC Data Description Language input, NFS
   implementations do not require their use. Any software that
   provides equivalent encoding and decoding to the canonical
   network order of data defined by XDR can be used to interoperate
   with other NFS implementations.

   XDR is described in [RFC1014].

1.5 Authentication and Permission Checking

   The RPC protocol includes a slot for authentication parameters
   on every call. The contents of the authentication parameters are
   determined by the type of authentication used by the server and
   client. A server may support several different flavors of
   authentication at once. The AUTH_NONE flavor provides null
   authentication, that is, no authentication information is
   passed. The AUTH_UNIX flavor provides UNIX-style user ID, group
   ID, and groups with each call. The AUTH_DES flavor provides
   DES-encrypted authentication parameters based on a network-wide
   name, with session keys exchanged via a public key scheme. The
   AUTH_KERB flavor provides DES encrypted authentication
   parameters based on a network-wide name with session keys
   exchanged via Kerberos secret keys.

   The NFS server checks permissions by taking the credentials from
   the RPC authentication information in each remote request. For
   example, using the AUTH_UNIX flavor of authentication, the
   server gets the user's effective user ID, effective group ID and
   groups on each call, and uses them to check access. Using user
   ids and group ids implies that the client and server either
   share the same ID list or do local user and group ID mapping.
   Servers and clients must agree on the mapping from user to uid
   and from group to gid, for those sites that do not implement a
   consistent user ID and group ID space. In practice, such mapping
   is typically performed on the server, following a static mapping
   scheme or a mapping established by the user from a client at
   mount time.
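
   The AUTH_UNIX check described above might look like the following
   on a UNIX-style server. This is a sketch only: UnixCred, map_cred,
   and may_access are hypothetical names, the static mapping tables
   are invented for illustration, superuser handling is omitted, and
   the owner/group/other mode-bit test is the conventional UNIX
   policy rather than anything the protocol mandates.

```python
from dataclasses import dataclass, field


@dataclass
class UnixCred:
    """The IDs an AUTH_UNIX credential carries on each call."""
    uid: int
    gid: int
    gids: list = field(default_factory=list)


def map_cred(cred: UnixCred, uid_map: dict, gid_map: dict) -> UnixCred:
    """Apply a static client-to-server ID mapping; unmapped IDs pass through."""
    return UnixCred(
        uid=uid_map.get(cred.uid, cred.uid),
        gid=gid_map.get(cred.gid, cred.gid),
        gids=[gid_map.get(g, g) for g in cred.gids],
    )


def may_access(cred: UnixCred, owner_uid: int, owner_gid: int,
               mode: int, want: int) -> bool:
    """Check a 3-bit rwx mask (for example 0o4 for read) against mode bits."""
    if cred.uid == owner_uid:
        granted = (mode >> 6) & 0o7      # owner bits
    elif cred.gid == owner_gid or owner_gid in cred.gids:
        granted = (mode >> 3) & 0o7      # group bits
    else:
        granted = mode & 0o7             # other bits
    return (granted & want) == want
```

   A server would call map_cred once per request and then apply
   may_access to the attributes of the target file. Sites with a
   consistent ID space can pass empty mapping tables.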

   The AUTH_DES and AUTH_KERB style of authentication is based on a
   network-wide name. It provides greater security through the use
   of DES encryption and public keys in the case of AUTH_DES, and
   DES encryption and Kerberos secret keys (and tickets) in the
   AUTH_KERB case. Again, the server and client must agree on the
   identity of a particular name on the network, but the name to
   identity mapping is more operating system independent than the
   uid and gid mapping in AUTH_UNIX. Also, because the
   authentication parameters are encrypted, a malicious user must
   know another user's network password or private key to
   masquerade as that user. Similarly, the server returns a
   verifier that is also encrypted so that masquerading as a server
   requires knowing a network password.

   The NULL procedure typically requires no authentication.

1.6 Philosophy

   This specification defines the NFS version 3 protocol, that is
   the over-the-wire protocol by which a client accesses a server.
   The protocol provides a well-defined interface to a server's
   file resources. A client or server implements the protocol and
   provides a mapping of the local file system semantics and
   actions into those defined in the NFS version 3 protocol.
   Implementations may differ to varying degrees, depending on the
   extent to which a given environment can support all the
   operations and semantics defined in the NFS version 3 protocol.
   Although implementations exist and are used to illustrate
   various aspects of the NFS version 3 protocol, the protocol
   specification itself is the final description of how clients
   access server resources.

   Because the NFS version 3 protocol is designed to be
   operating-system independent, it does not necessarily match the
   semantics of any existing system. Server implementations are
   expected to make a best effort at supporting the protocol. If a
   server cannot support a particular protocol procedure, it may
   return the error, NFS3ERR_NOTSUPP, that indicates that the
   operation is not supported. For example, many operating systems
   do not support the notion of a hard link. A server that cannot
   support hard links should return NFS3ERR_NOTSUPP in response to
   a LINK request. FSINFO describes the most commonly unsupported
   procedures in the properties bit map. Alternatively, a server
   may not natively support a given operation, but can emulate it
   in the NFS version 3 protocol implementation to provide greater
   functionality.

   In some cases, a server can support most of the semantics
   described by the protocol but not all.
   For example, the ctime field in the fattr structure gives the
   time that a file's attributes were last modified. Many systems
   do not keep this information. In this case, rather than not
   support the GETATTR operation, a server could simulate it by
   returning the last modified time in place of ctime. Servers
   must be careful when simulating attribute information because
   of possible side effects on clients. For example, many clients
   use file modification times as a basis for their cache
   consistency scheme.

   NFS servers are dumb and NFS clients are smart. It is the
   clients that do the work required to convert the generalized
   file access that servers provide into a file access method that
   is useful to applications and users. In the LINK example given
   above, a UNIX client that received an NFS3ERR_NOTSUPP error from
   a server would do the recovery necessary to either make it look
   to the application like the link request had succeeded or return
   a reasonable error. In general, it is the burden of the client
   to recover.

   The NFS version 3 protocol assumes a stateless server
   implementation. Statelessness means that the server does not
   need to maintain state about any of its clients in order to
   function correctly. Stateless servers have a distinct advantage
   over stateful servers in the event of a crash. With stateless
   servers, a client need only retry a request until the server
   responds; the client does not even need to know that the server
   has crashed. See additional comments in section 4.5, Duplicate
   request cache.

   For a server to be useful, it holds nonvolatile state: data
   stored in the file system. Design assumptions in the NFS version
   3 protocol regarding flushing of modified data to stable storage
   reduce the number of failure modes in which data loss can occur.
   In this way, NFS version 3 protocol implementations can tolerate
   transient failures, including transient failures of the network.
   In general, server implementations of the NFS version 3 protocol
   cannot tolerate a non-transient failure of the stable storage
   itself. However, there exist fault tolerant implementations
   which attempt to address such problems.

   That is not to say that an NFS version 3 protocol server can't
   maintain noncritical state. In many cases, servers will maintain
   state (cache) about previous operations to increase performance.
   For example, a client READ request might trigger a read-ahead of
   the next block of the file into the server's data cache in the
   anticipation that the client is doing a sequential read and the
   next client READ request will be satisfied from the server's
   data cache instead of from the disk. Read-ahead on the server
   increases performance by overlapping server disk I/O with client
   requests. The important point here is that the read-ahead block
   is not necessary for correct server behavior. If the server
   crashes and loses its memory cache of read buffers, recovery is
   simple on reboot - clients will continue read operations
   retrieving data from the server disk.
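
   The distinction between noncritical cache state and correctness
   can be shown with a toy model. The class below is a sketch, not
   a description of any real server: ReadAheadServer, its block
   size, and its crash method are hypothetical, and the "disk" is
   just a byte string.

```python
class ReadAheadServer:
    """Toy server whose read-ahead cache is purely a performance aid."""

    def __init__(self, disk: bytes, block_size: int = 4):
        self.disk = disk
        self.block_size = block_size
        self.cache = {}        # block number -> prefetched data (volatile)
        self.cache_hits = 0

    def _read_disk(self, block: int) -> bytes:
        start = block * self.block_size
        return self.disk[start:start + self.block_size]

    def read(self, block: int) -> bytes:
        data = self.cache.pop(block, None)
        if data is None:
            data = self._read_disk(block)   # cache miss: go to the disk
        else:
            self.cache_hits += 1
        # Anticipate a sequential reader by prefetching the next block.
        self.cache[block + 1] = self._read_disk(block + 1)
        return data

    def crash(self) -> None:
        # Losing the cache loses performance, never correctness.
        self.cache.clear()
```

   A client reading blocks 0, 1, 2 in order gets the same bytes
   whether or not crash() is called between requests; only the
   number of cache hits differs.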

   Most data-modifying operations in the NFS protocol are
   synchronous. That is, when a data modifying procedure returns
   to the client, the client can assume that the operation has
   completed and any modified data associated with the request is
   now on stable storage. For example, a synchronous client WRITE
   request may cause the server to update data blocks, file system
   information blocks, and file attribute information - the latter
   information is usually referred to as metadata. When the WRITE
   operation completes, the client can assume that the write data
   is safe and discard it. This is a very important part of the
   stateless nature of the server. If the server did not flush
   dirty data to stable storage before returning to the client, the
   client would have no way of knowing when it was safe to discard
   modified data. The following data modifying procedures are
   synchronous: WRITE (with stable flag set to FILE_SYNC), CREATE,
   MKDIR, SYMLINK, MKNOD, REMOVE, RMDIR, RENAME, LINK, and COMMIT.

   The NFS version 3 protocol introduces safe asynchronous writes
   on the server, when the WRITE procedure is used in conjunction
   with the COMMIT procedure. The COMMIT procedure provides a way
   for the client to flush data from previous asynchronous WRITE
   requests on the server to stable storage and to detect whether
   it is necessary to retransmit the data. See the procedure
   descriptions of WRITE (section 3.3.7) and COMMIT (section
   3.3.21).

   The LOOKUP procedure is used by the client to traverse
   multicomponent file names (pathnames). Each call to LOOKUP is
   used to resolve one segment of a pathname. There are two reasons
   for restricting LOOKUP to a single segment: it is hard to
   standardize a common format for hierarchical file names and the
   client and server may have different mappings of pathnames to
   file systems.
   This would imply that either the client must break the path
   name at file system attachment points, or the server must know
   about the client's file system attachment points. In NFS
   version 3 protocol implementations, it is the client that
   constructs the hierarchical file name space using mounts to
   build a hierarchy. Support utilities, such as the Automounter,
   provide a way to manage a shared, consistent image of the file
   name space while still being driven by the client mount
   process.

   Clients can perform caching in various ways. The general
   practice with the NFS version 2 protocol was to implement a
   time-based client-server cache consistency mechanism. It is
   expected NFS version 3 protocol implementations will use a
   similar mechanism. The NFS version 3 protocol has some explicit
   support, in the form of additional attribute information, to
   eliminate explicit attribute checks. However, caching is not
   required, nor is any caching policy defined by the protocol.
   Neither the NFS version 2 protocol nor the NFS version 3
   protocol provides a means of maintaining strict client-server
   consistency (and, by implication, consistency across client
   caches).

1.7 Changes from the NFS Version 2 Protocol

   The ROOT and WRITECACHE procedures have been removed. A MKNOD
   procedure has been defined to allow the creation of special
   files, eliminating the overloading of CREATE. Caching on the
   client is not defined nor dictated by the NFS version 3
   protocol, but additional information and hints have been added
   to the protocol to allow clients that implement caching to
   manage their caches more effectively. Procedures that affect
   the attributes of a file or directory may now return the new
   attributes after the operation has completed to optimize out a
   subsequent GETATTR used in validating attribute caches. In
   addition, operations that modify the directory in which the
   target object resides return the old and new attributes of the
   directory to allow clients to implement more intelligent cache
   invalidation procedures. The ACCESS procedure provides access
   permission checking on the server, the FSSTAT procedure returns
   dynamic information about a file system, the FSINFO procedure
   returns static information about a file system and server, the
   READDIRPLUS procedure returns file handles and attributes in
   addition to directory entries, and the PATHCONF procedure
   returns POSIX pathconf information about a file.

   Below is a list of the important changes between the NFS version
   2 protocol and the NFS version 3 protocol.

   File handle size
      The file handle has been increased to a variable-length
      array of 64 bytes maximum from a fixed array of 32
      bytes. This addresses some known requirements for a
      slightly larger file handle size.
      The file handle was converted from fixed length to
      variable length to reduce local storage and network
      bandwidth requirements for systems which do not
      utilize the full 64 bytes of length.

   Maximum data sizes
      The maximum size of a data transfer used in the READ and
      WRITE procedures is now set by values in the FSINFO
      return structure. In addition, preferred transfer sizes
      are returned by FSINFO. The protocol does not place any
      artificial limits on the maximum transfer sizes.

      Filenames and pathnames are now specified as strings of
      variable length. The actual length restrictions are
      determined by the client and server implementations as
      appropriate. The protocol does not place any
      artificial limits on the length. The error,
      NFS3ERR_NAMETOOLONG, is provided to allow the server to
      return an indication to the client that it received a
      pathname that was too long for it to handle.

   Error return
      Error returns in some instances now return data (for
      example, attributes). nfsstat3 now defines the full set
      of errors that can be returned by a server. No other
      values are allowed.

   File type
      The file type now includes NF3CHR and NF3BLK for
      special files. Attributes for these types include
      subfields for UNIX major and minor device numbers.
      NF3SOCK and NF3FIFO are now defined for sockets and
      fifos in the file system.

   File attributes
      The blocksize (the size in bytes of a block in the file)
      field has been removed. The mode field no longer
      contains file type information. The size and fileid
      fields have been widened to eight-byte unsigned integers
      from four-byte integers. Major and minor device
      information is now presented in a distinct structure.
      The blocks field name has been changed to used and now
      contains the total number of bytes used by the file. It
      is also an eight-byte unsigned integer.

   Set file attributes
      In the NFS version 2 protocol, the settable attributes
      were represented by a subset of the file attributes
      structure; the client indicated those attributes which
      were not to be modified by setting the corresponding
      field to -1, overloading some unsigned fields. The set
      file attributes structure now uses a discriminated union
      for each field to tell whether or how to set that field.
      The atime and mtime fields can be set to either the
      server's current time or a time supplied by the client.

   LOOKUP
      The LOOKUP return structure now includes the attributes
      for the directory searched.

   ACCESS
      An ACCESS procedure has been added to allow an explicit
      over-the-wire permissions check. This addresses known
      problems with the superuser ID mapping feature in many
      server implementations (where, due to mapping of root
      user, unexpected permission denied errors could occur
      while reading from or writing to a file). This also
      removes the assumption which was made in the NFS version
      2 protocol that access to files was based solely on UNIX
      style mode bits.

   READ
      The reply structure includes a Boolean that is TRUE if
      the end-of-file was encountered during the READ. This
      allows the client to correctly detect end-of-file.

   WRITE
      The beginoffset and totalcount fields were removed from
      the WRITE arguments. The reply now includes a count so
      that the server can write less than the requested amount
      of data, if required. An indicator was added to the
      arguments to instruct the server as to the level of
      cache synchronization that is required by the client.

   CREATE
      An exclusive flag and a create verifier were added for
      the exclusive creation of regular files.

   MKNOD
      This procedure was added to support the creation of
      special files. This avoids overloading fields of CREATE
      as was done in some NFS version 2 protocol
      implementations.

   READDIR
      The READDIR arguments now include a verifier to allow
      the server to validate the cookie. The cookie is now a
      64 bit unsigned integer instead of the 4 byte array
      which was used in the NFS version 2 protocol. This
      will help to reduce interoperability problems.

   READDIRPLUS
      This procedure was added to return file handles and
      attributes in an extended directory list.

   FSINFO
      FSINFO was added to provide nonvolatile information
      about a file system. The reply includes preferred and
      maximum read transfer size, preferred and maximum write
      transfer size, and flags stating whether links or
      symbolic links are supported. Also returned are
      preferred transfer size for READDIR procedure replies,
      server time granularity, and whether times can be set in
      a SETATTR request.

   FSSTAT
      FSSTAT was added to provide volatile information about a
      file system, for use by utilities such as the Unix
      system df command. The reply includes the total size
      and free space in the file system specified in bytes,
      the total number of files and number of free file slots
      in the file system, and an estimate of time between file
      system modifications (for use in cache consistency
      checking algorithms).

   COMMIT
      The COMMIT procedure provides the synchronization
      mechanism to be used with asynchronous WRITE operations.

2. RPC Information

2.1 Authentication

   The NFS service uses AUTH_NONE in the NULL procedure. AUTH_UNIX,
   AUTH_DES, or AUTH_KERB are used for all other procedures. Other
   authentication types may be supported in the future.

2.2 Constants

   These are the RPC constants needed to call the NFS Version 3
   service. They are given in decimal.

      PROGRAM  100003
      VERSION  3

2.3 Transport address

   The NFS protocol is normally supported over the TCP and UDP
   protocols. It uses port 2049, the same as the NFS version 2
   protocol.

2.4 Sizes

   These are the sizes, given in decimal bytes, of various XDR
   structures used in the NFS version 3 protocol:

   NFS3_FHSIZE 64
      The maximum size in bytes of the opaque file handle.

   NFS3_COOKIEVERFSIZE 8
      The size in bytes of the opaque cookie verifier passed by
      READDIR and READDIRPLUS.

   NFS3_CREATEVERFSIZE 8
      The size in bytes of the opaque verifier used for
      exclusive CREATE.

   NFS3_WRITEVERFSIZE 8
      The size in bytes of the opaque verifier used for
      asynchronous WRITE.

2.5 Basic Data Types

   The following XDR definitions are basic definitions that are
   used in other structures.

   uint64
      typedef unsigned hyper uint64;

   int64
      typedef hyper int64;

   uint32
      typedef unsigned long uint32;

   int32
      typedef long int32;

   filename3
      typedef string filename3<>;

   nfspath3
      typedef string nfspath3<>;

   fileid3
      typedef uint64 fileid3;

   cookie3
      typedef uint64 cookie3;

   cookieverf3
      typedef opaque cookieverf3[NFS3_COOKIEVERFSIZE];

   createverf3
         typedef opaque createverf3[NFS3_CREATEVERFSIZE];

   writeverf3
         typedef opaque writeverf3[NFS3_WRITEVERFSIZE];

   uid3
         typedef uint32 uid3;

   gid3
         typedef uint32 gid3;

   size3
         typedef uint64 size3;

   offset3
         typedef uint64 offset3;

   mode3
         typedef uint32 mode3;

   count3
         typedef uint32 count3;

   nfsstat3
      enum nfsstat3 {
         NFS3_OK             = 0,
         NFS3ERR_PERM        = 1,
         NFS3ERR_NOENT       = 2,
         NFS3ERR_IO          = 5,
         NFS3ERR_NXIO        = 6,
         NFS3ERR_ACCES       = 13,
         NFS3ERR_EXIST       = 17,
         NFS3ERR_XDEV        = 18,
         NFS3ERR_NODEV       = 19,
         NFS3ERR_NOTDIR      = 20,
         NFS3ERR_ISDIR       = 21,
         NFS3ERR_INVAL       = 22,
         NFS3ERR_FBIG        = 27,
         NFS3ERR_NOSPC       = 28,
         NFS3ERR_ROFS        = 30,
         NFS3ERR_MLINK       = 31,
         NFS3ERR_NAMETOOLONG = 63,
         NFS3ERR_NOTEMPTY    = 66,
         NFS3ERR_DQUOT       = 69,
         NFS3ERR_STALE       = 70,
         NFS3ERR_REMOTE      = 71,
         NFS3ERR_BADHANDLE   = 10001,
         NFS3ERR_NOT_SYNC    = 10002,
         NFS3ERR_BAD_COOKIE  = 10003,
         NFS3ERR_NOTSUPP     = 10004,
         NFS3ERR_TOOSMALL    = 10005,
         NFS3ERR_SERVERFAULT = 10006,
         NFS3ERR_BADTYPE     = 10007,
         NFS3ERR_JUKEBOX     = 10008
      };

   The nfsstat3 type is returned with every procedure's results
   except for the NULL procedure. A value of NFS3_OK indicates
   that the call completed successfully. Any other value
   indicates that some error occurred on the call, as identified
   by the error code. Note that the precise numeric encoding
   must be followed. No other values may be returned by a
   server. Servers are expected to make a best effort mapping of
   error conditions to the set of error codes defined. In
   addition, no error precedences are specified by this
   specification. Error precedences determine the error value
   that should be returned when more than one error applies in a
   given situation. The error precedence will be determined by
   the individual server implementation. If the client requires
   specific error precedences, it should check for the specific
   errors for itself.

2.6 Defined Error Numbers

   A description of each defined error follows:

   NFS3_OK
      Indicates the call completed successfully.

   NFS3ERR_PERM
      Not owner. The operation was not allowed because the
      caller is either not a privileged user (root) or not the
      owner of the target of the operation.

   NFS3ERR_NOENT
      No such file or directory. The file or directory name
      specified does not exist.

   NFS3ERR_IO
      I/O error. A hard error (for example, a disk error)
      occurred while processing the requested operation.

   NFS3ERR_NXIO
      I/O error. No such device or address.

   NFS3ERR_ACCES
      Permission denied. The caller does not have the correct
      permission to perform the requested operation. Contrast
      this with NFS3ERR_PERM, which restricts itself to owner
      or privileged user permission failures.

   NFS3ERR_EXIST
      File exists. The file specified already exists.

   NFS3ERR_XDEV
      Attempt to do a cross-device hard link.

   NFS3ERR_NODEV
      No such device.

   NFS3ERR_NOTDIR
      Not a directory. The caller specified a non-directory in
      a directory operation.

   NFS3ERR_ISDIR
      Is a directory. The caller specified a directory in a
      non-directory operation.

   NFS3ERR_INVAL
      Invalid argument or unsupported argument for an
      operation. Two examples are attempting a READLINK on an
      object other than a symbolic link or attempting to
      SETATTR a time field on a server that does not support
      this operation.

   NFS3ERR_FBIG
      File too large. The operation would have caused a file to
      grow beyond the server's limit.

   NFS3ERR_NOSPC
      No space left on device. The operation would have caused
      the server's file system to exceed its limit.

   NFS3ERR_ROFS
      Read-only file system. A modifying operation was
      attempted on a read-only file system.

   NFS3ERR_MLINK
      Too many hard links.

   NFS3ERR_NAMETOOLONG
      The filename in an operation was too long.

   NFS3ERR_NOTEMPTY
      An attempt was made to remove a directory that was not
      empty.

   NFS3ERR_DQUOT
      Resource (quota) hard limit exceeded. The user's resource
      limit on the server has been exceeded.

   NFS3ERR_STALE
      Invalid file handle. The file handle given in the
      arguments was invalid. The file referred to by that file
      handle no longer exists or access to it has been
      revoked.

   NFS3ERR_REMOTE
      Too many levels of remote in path. The file handle given
      in the arguments referred to a file on a non-local file
      system on the server.

   NFS3ERR_BADHANDLE
      Illegal NFS file handle. The file handle failed internal
      consistency checks.

   NFS3ERR_NOT_SYNC
      Update synchronization mismatch was detected during a
      SETATTR operation.

   NFS3ERR_BAD_COOKIE
      READDIR or READDIRPLUS cookie is stale.

   NFS3ERR_NOTSUPP
      Operation is not supported.

   NFS3ERR_TOOSMALL
      Buffer or request is too small.

   NFS3ERR_SERVERFAULT
      An error occurred on the server which does not map to any
      of the legal NFS version 3 protocol error values. The
      client should translate this into an appropriate error.
      UNIX clients may choose to translate this to EIO.

   NFS3ERR_BADTYPE
      An attempt was made to create an object of a type not
      supported by the server.

   NFS3ERR_JUKEBOX
      The server initiated the request, but was not able to
      complete it in a timely fashion. The client should wait
      and then try the request with a new RPC transaction ID.
      For example, this error should be returned from a server
      that supports hierarchical storage and receives a request
      to process a file that has been migrated. In this case,
      the server should start the immigration process and
      respond to the client with this error.

   ftype3

      enum ftype3 {
         NF3REG    = 1,
         NF3DIR    = 2,
         NF3BLK    = 3,
         NF3CHR    = 4,
         NF3LNK    = 5,
         NF3SOCK   = 6,
         NF3FIFO   = 7
      };

   The enumeration, ftype3, gives the type of a file. The type,
   NF3REG, is a regular file, NF3DIR is a directory, NF3BLK is a
   block special device file, NF3CHR is a character special
   device file, NF3LNK is a symbolic link, NF3SOCK is a socket,
   and NF3FIFO is a named pipe. Note that the precise enum
   encoding must be followed.

   specdata3

      struct specdata3 {
           uint32     specdata1;
           uint32     specdata2;
      };

   The interpretation of the two words depends on the type of
   file system object. For a block special (NF3BLK) or character
   special (NF3CHR) file, specdata1 and specdata2 are the major
   and minor device numbers, respectively. (This is obviously a
   UNIX-specific interpretation.) For all other file types,
   these two elements should either be set to 0 or the values
   should be agreed upon by the client and server. If the client
   and server do not agree upon the values, the client should
   treat these fields as if they are set to 0. This data field
   is returned as part of the fattr3 structure and so is
   available from all replies returning attributes. Since these
   fields are otherwise unused for objects which are not
   devices, out of band
   information can be passed from the server to the client.
   However, once again, both the server and the client must
   agree on the values passed.

   nfs_fh3

      struct nfs_fh3 {
         opaque       data<NFS3_FHSIZE>;
      };

   The nfs_fh3 is the variable-length opaque object returned by
   the server on LOOKUP, CREATE, SYMLINK, MKNOD, LINK, or
   READDIRPLUS operations, which is used by the client on
   subsequent operations to reference the file. The file handle
   contains all the information the server needs to distinguish
   an individual file. To the client, the file handle is opaque.
   The client stores file handles for use in a later request and
   can compare two file handles from the same server for
   equality by doing a byte-by-byte comparison, but cannot
   otherwise interpret the contents of file handles. If two file
   handles from the same server are equal, they must refer to
   the same file, but if they are not equal, no conclusions can
   be drawn. Servers should try to maintain a one-to-one
   correspondence between file handles and files, but this is
   not required. Clients should use file handle comparisons only
   to improve performance, not for correct behavior.

   Servers can revoke the access provided by a file handle at
   any time. If the file handle passed in a call refers to a
   file system object that no longer exists on the server or
   access for that file handle has been revoked, the error,
   NFS3ERR_STALE, should be returned.

   nfstime3

      struct nfstime3 {
         uint32   seconds;
         uint32   nseconds;
      };

   The nfstime3 structure gives the number of seconds and
   nanoseconds since midnight January 1, 1970 Greenwich Mean
   Time. It is used to pass time and date information. The times
   associated with files are all server times except in the case
   of a SETATTR operation where the client can explicitly set
   the file time. A server converts to and from local time when
   processing time values, preserving as much accuracy as
   possible.
   If the precision of timestamps stored for a file is less than that
   defined by NFS version 3 protocol, loss of precision can
   occur. An adjunct time maintenance protocol is recommended to
   reduce client and server time skew.

   fattr3

      struct fattr3 {
         ftype3     type;
         mode3      mode;
         uint32     nlink;
         uid3       uid;
         gid3       gid;
         size3      size;
         size3      used;
         specdata3  rdev;
         uint64     fsid;
         fileid3    fileid;
         nfstime3   atime;
         nfstime3   mtime;
         nfstime3   ctime;
      };

   This structure defines the attributes of a file system
   object. It is returned by most operations on an object; in
   the case of operations that affect two objects (for example,
   a MKDIR that modifies the target directory attributes and
   defines new attributes for the newly created directory), the
   attributes for both may be returned. In some cases, the
   attributes are returned in the structure, wcc_data, which is
   defined below; in other cases the attributes are returned
   alone. The main changes from the NFS version 2 protocol are
   that many of the fields have been widened and the major/minor
   device information is now presented in a distinct structure
   rather than being packed into a word.

   The fattr3 structure contains the basic attributes of a file.
   All servers should support this set of attributes even if
   they have to simulate some of the fields. Type is the type of
   the file. Mode is the protection mode bits. Nlink is the
   number of hard links to the file - that is, the number of
   different names for the same file. Uid is the user ID of the
   owner of the file. Gid is the group ID of the group of the
   file. Size is the size of the file in bytes. Used is the
   number of bytes of disk space that the file actually uses
   (which can be smaller than the size because the file may have
   holes or it may be larger due to fragmentation). Rdev
   describes the device file if the file type is NF3CHR or
   NF3BLK - see specdata3 above. Fsid is the file system
   identifier for the file system.
   Fileid is a number which uniquely identifies the file within its file system (on UNIX
   this would be the inumber). Atime is the time when the file
   data was last accessed. Mtime is the time when the file data
   was last modified. Ctime is the time when the attributes of
   the file were last changed. Writing to the file changes the
   ctime in addition to the mtime.

   The mode bits are defined as follows:

      0x00800 Set user ID on execution.
      0x00400 Set group ID on execution.
      0x00200 Save swapped text (not defined in POSIX).
      0x00100 Read permission for owner.
      0x00080 Write permission for owner.
      0x00040 Execute permission for owner on a file. Or lookup
              (search) permission for owner in directory.
      0x00020 Read permission for group.
      0x00010 Write permission for group.
      0x00008 Execute permission for group on a file. Or lookup
              (search) permission for group in directory.
      0x00004 Read permission for others.
      0x00002 Write permission for others.
      0x00001 Execute permission for others on a file. Or lookup
              (search) permission for others in directory.

   post_op_attr

      union post_op_attr switch (bool attributes_follow) {
      case TRUE:
           fattr3   attributes;
      case FALSE:
           void;
      };

   This structure is used for returning attributes in those
   operations that are not directly involved with manipulating
   attributes. One of the principles of this revision of the NFS
   protocol is to return the real value from the indicated
   operation and not an error from an incidental operation. The
   post_op_attr structure was designed to allow the server to
   recover from errors encountered while getting attributes.

   This appears to make returning attributes optional. However,
   server implementors are strongly encouraged to make best
   effort to return attributes whenever possible, even when
   returning an error.
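
   As an illustration of how the post_op_attr union and the
   fattr3 structure might look to a client, the following is a
   minimal Python sketch. It assumes standard XDR encoding
   (big-endian, a bool carried as a 4-byte integer). The name
   decode_post_op_attr and the dictionary layout are
   hypothetical and are not part of the protocol.

```python
import struct

# fattr3 on the wire, assuming standard XDR layout:
#   5 x uint32 (type, mode, nlink, uid, gid)
#   2 x uint64 (size, used)
#   2 x uint32 (rdev.specdata1, rdev.specdata2)
#   2 x uint64 (fsid, fileid)
#   3 x nfstime3, each (seconds, nseconds) as uint32
FATTR3 = struct.Struct(">5I2Q2I2Q6I")


def decode_post_op_attr(buf, offset=0):
    """Hypothetical helper: return (attributes or None, next offset)."""
    (follows,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    if not follows:
        # attributes_follow == FALSE: the arm is void.
        return None, offset
    f = FATTR3.unpack_from(buf, offset)
    attrs = {
        "type": f[0], "mode": f[1], "nlink": f[2],
        "uid": f[3], "gid": f[4],
        "size": f[5], "used": f[6],
        "rdev": (f[7], f[8]),
        "fsid": f[9], "fileid": f[10],
        "atime": (f[11], f[12]),
        "mtime": (f[13], f[14]),
        "ctime": (f[15], f[16]),
    }
    return attrs, offset + FATTR3.size
```

   A caller would test individual mode bits from the table
   above with a mask, for example attrs["mode"] & 0x00100 for
   owner read permission, and must be prepared for the
   attributes to be absent.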

   wcc_attr

      struct wcc_attr {
         size3       size;
         nfstime3    mtime;
         nfstime3    ctime;
      };

   This is the subset of pre-operation attributes needed to
   better support the weak cache consistency semantics. Size is
   the file size in bytes of the object before the operation.
   Mtime is the time of last modification of the object before
   the operation. Ctime is the time of last change to the
   attributes of the object before the operation. See the
   discussion under wcc_data below.

   The use of mtime by clients to detect changes to file system
   objects residing on a server is dependent on the granularity
   of the time base on the server.

   pre_op_attr

      union pre_op_attr switch (bool attributes_follow) {
      case TRUE:
           wcc_attr  attributes;
      case FALSE:
           void;
      };

   wcc_data

      struct wcc_data {
         pre_op_attr    before;
         post_op_attr   after;
      };

   When a client performs an operation that modifies the state
   of a file or directory on the server, it cannot immediately
   determine from the post-operation attributes whether the
   operation just performed was the only operation on the object
   since the last time the client received the attributes for
   the object. This is important, since if an intervening
   operation has changed the object, the client will need to
   invalidate any cached data for the object (except for the
   data that it just wrote).

   To deal with this, the notion of weak cache consistency data
   or wcc_data is introduced. A wcc_data structure consists of
   certain key fields from the object attributes before the
   operation, together with the object attributes after the
   operation. This
   information allows the client to manage its cache more
   accurately than in NFS version 2 protocol implementations.
   The term, weak cache consistency, emphasizes the fact that
   this mechanism does not provide the strict server-client
   consistency that a cache consistency protocol would provide.

   In order to support the weak cache consistency model, the
   server will need to be able to get the pre-operation
   attributes of the object, perform the intended modify
   operation, and then get the post-operation attributes
   atomically. If there is a window for the object to get
   modified between the operation and either of the get
   attributes operations, then the client will not be able to
   determine whether it was the only entity to modify the
   object. Some information will have been lost, thus weakening
   the weak cache consistency guarantees.

   post_op_fh3

      union post_op_fh3 switch (bool handle_follows) {
      case TRUE:
           nfs_fh3  handle;
      case FALSE:
           void;
      };

   One of the principles of this revision of the NFS protocol is
   to return the real value from the indicated operation and not
   an error from an incidental operation. The post_op_fh3
   structure was designed to allow the server to recover from
   errors encountered while constructing a file handle.

   This is the structure used to return a file handle from the
   CREATE, MKDIR, SYMLINK, MKNOD, and READDIRPLUS requests. In
   each case, the client can get the file handle by issuing a
   LOOKUP request after a successful return from one of the
   listed operations. Returning the file handle is an
   optimization so that the client is not forced to immediately
   issue a LOOKUP request to get the file handle.

   sattr3

      enum time_how {
         DONT_CHANGE        = 0,
         SET_TO_SERVER_TIME = 1,
         SET_TO_CLIENT_TIME = 2
      };

      union set_mode3 switch (bool set_it) {
      case TRUE:
         mode3    mode;
      default:
         void;
      };

      union set_uid3 switch (bool set_it) {
      case TRUE:
         uid3     uid;
      default:
         void;
      };

      union set_gid3 switch (bool set_it) {
      case TRUE:
         gid3     gid;
      default:
         void;
      };

      union set_size3 switch (bool set_it) {
      case TRUE:
         size3    size;
      default:
         void;
      };

      union set_atime switch (time_how set_it) {
      case SET_TO_CLIENT_TIME:
         nfstime3  atime;
      default:
         void;
      };

      union set_mtime switch (time_how set_it) {
      case SET_TO_CLIENT_TIME:
         nfstime3  mtime;
      default:
         void;
      };

      struct sattr3 {
         set_mode3   mode;
         set_uid3    uid;
         set_gid3    gid;
         set_size3   size;
         set_atime   atime;
         set_mtime   mtime;
      };

   The sattr3 structure contains the file attributes that can be
   set from the client. The fields are the same as the similarly
   named fields in the fattr3 structure. In the NFS version 3
   protocol, the settable attributes are described by a
   structure containing a set of discriminated unions. Each
   union indicates whether the corresponding attribute is to be
   updated, and if so, how.

   There are two forms of discriminated unions used. In setting
   the mode, uid, gid, or size, the discriminated union is
   switched on a boolean, set_it; if it is TRUE, a value of the
   appropriate type is then encoded.

   In setting the atime or mtime, the union is switched on an
   enumeration type, set_it. If set_it has the value
   DONT_CHANGE, the corresponding attribute is unchanged. If it
   has the value, SET_TO_SERVER_TIME, the corresponding
   attribute is set by the server to its local time; no data is
   provided by the client. Finally, if set_it has the value,
   SET_TO_CLIENT_TIME, the attribute is set to the time passed
   by the client in an nfstime3 structure. (See FSINFO, section
   3.3.19, which addresses the issue of time granularity.)

   diropargs3

      struct diropargs3 {
         nfs_fh3     dir;
         filename3   name;
      };

   The diropargs3 structure is used in directory operations. The
   file handle, dir, identifies the directory in which to
   manipulate or access the file, name. See additional comments
   in File name component handling (section 4.6).

3. Server Procedures

   The following sections define the RPC procedures that are
   supplied by an NFS version 3 protocol server. The RPC
   procedure number is given at the top of the page with the
   name. The SYNOPSIS provides the name of the procedure, the
   list of the names of the arguments, the list of the names of
   the results, followed by the XDR argument declarations and
   results declarations. The information in the SYNOPSIS is
   specified in RPC Data Description Language as defined in
   [RFC1014].
   The DESCRIPTION section tells what the procedure
   is expected to do and how its arguments and results are used.
   The ERRORS section lists the errors returned for specific
   types of failures. These lists are not intended to be the
   definitive statement of all of the errors which can be
   returned by any specific procedure, but as a guide for the
   more common errors which may be returned. Client
   implementations should be prepared to deal with unexpected
   errors coming from a server. The IMPLEMENTATION field gives
   information about how the procedure is expected to work and
   how it should be used by clients.

      program NFS_PROGRAM {
         version NFS_V3 {

            void
             NFSPROC3_NULL(void)                    = 0;

            GETATTR3res
             NFSPROC3_GETATTR(GETATTR3args)         = 1;

            SETATTR3res
             NFSPROC3_SETATTR(SETATTR3args)         = 2;

            LOOKUP3res
             NFSPROC3_LOOKUP(LOOKUP3args)           = 3;

            ACCESS3res
             NFSPROC3_ACCESS(ACCESS3args)           = 4;

            READLINK3res
             NFSPROC3_READLINK(READLINK3args)       = 5;

            READ3res
             NFSPROC3_READ(READ3args)               = 6;

            WRITE3res
             NFSPROC3_WRITE(WRITE3args)             = 7;

            CREATE3res
             NFSPROC3_CREATE(CREATE3args)           = 8;

            MKDIR3res
             NFSPROC3_MKDIR(MKDIR3args)             = 9;

            SYMLINK3res
             NFSPROC3_SYMLINK(SYMLINK3args)         = 10;

5.2.1 Procedure 1: MNT - Add mount entry

   SYNOPSIS

      mountres3 MOUNTPROC3_MNT(dirpath) = 1;

      struct mountres3_ok {
           fhandle3   fhandle;
           int        auth_flavors<>;
      };

      union mountres3 switch (mountstat3 fhs_status) {
      case MNT3_OK:
           mountres3_ok  mountinfo;
      default:
           void;
      };

   DESCRIPTION

      Procedure MNT maps a pathname on the server to a file
      handle. The pathname is an ASCII string that describes a
      directory on the server. If the call is successful
      (MNT3_OK), the server returns an NFS version 3 protocol
      file handle and a vector of RPC authentication flavors
      that are supported with the client's use of the file
      handle (or any file handles derived from it). The
      authentication flavors are defined in Section 7.2 and
      section 9 of [RFC1057].

   IMPLEMENTATION

      If mountres3.fhs_status is MNT3_OK, then
      mountres3.mountinfo contains the file handle for the
      directory and a list of acceptable authentication
      flavors. This file handle may only be used in the NFS
      version 3 protocol. This procedure also results in the
      server adding a new entry to its mount list recording that
      this client has mounted the directory. AUTH_UNIX
      authentication or better is required.

   ERRORS

      MNT3ERR_NOENT
      MNT3ERR_IO
      MNT3ERR_ACCES
      MNT3ERR_NOTDIR
      MNT3ERR_NAMETOOLONG
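
   The shape of a MNT reply can be sketched as follows. This is
   a minimal Python illustration, assuming standard XDR encoding
   (big-endian integers, variable-length opaque data prefixed by
   a length and padded to a multiple of four bytes) and assuming
   fhandle3 is a variable-length opaque of at most FHSIZE3 (64)
   bytes. The function name decode_mountres3 is hypothetical.

```python
import struct

MNT3_OK = 0
FHSIZE3 = 64  # assumed maximum fhandle3 length, in bytes


def decode_mountres3(buf):
    """Hypothetical helper: return (status, file handle, auth flavors)."""
    (status,) = struct.unpack_from(">I", buf, 0)
    if status != MNT3_OK:
        # Any other status selects the void arm of the union.
        return status, None, []

    # fhandle3: 4-byte length, then the bytes, padded to a 4-byte boundary.
    (fhlen,) = struct.unpack_from(">I", buf, 4)
    if fhlen > FHSIZE3:
        raise ValueError("file handle longer than FHSIZE3")
    fh = bytes(buf[8:8 + fhlen])
    if len(fh) != fhlen:
        raise ValueError("truncated file handle")
    offset = 8 + ((fhlen + 3) & ~3)

    # auth_flavors<>: 4-byte element count, then that many 4-byte ints.
    (count,) = struct.unpack_from(">I", buf, offset)
    flavors = list(struct.unpack_from(">%di" % count, buf, offset + 4))
    return status, fh, flavors
```

   A client would typically pick the first flavor in the
   returned list that it also supports and use it for
   subsequent NFS version 3 protocol calls on the file handle.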

5.2.2 Procedure 2: DUMP - Return mount entries

   SYNOPSIS

      mountlist MOUNTPROC3_DUMP(void) = 2;

      typedef struct mountbody *mountlist;

      struct mountbody {
           name       ml_hostname;
           dirpath    ml_directory;
           mountlist  ml_next;
      };

   DESCRIPTION

      Procedure DUMP returns the list of remotely mounted file
      systems. The mountlist contains one entry for each client
      host name and directory pair.

   IMPLEMENTATION

      This list is derived from a list maintained on the server
      of clients that have requested file handles with the MNT
      procedure. Entries are removed from this list only when a
      client calls the UMNT or UMNTALL procedure. Entries may
      become stale if a client crashes and does not issue either
      UMNT calls for all of the file systems that it had
      previously mounted or a UMNTALL to remove all entries that
      existed for it on the server.

   ERRORS

      There are no MOUNT protocol errors which can be returned
      from this procedure. However, RPC errors may be returned
      for authentication or other RPC failures.
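
   The mountlist result is a linked list expressed with XDR
   optional data: each pointer is encoded as a 4-byte boolean
   that says whether another mountbody follows. The following
   Python sketch walks such a reply. It assumes standard XDR
   string encoding (length, bytes, padding to four bytes) and
   ASCII names; decode_mountlist and _xdr_string are
   hypothetical names.

```python
import struct


def _xdr_string(buf, off):
    """Decode one XDR string; return (text, next offset)."""
    (n,) = struct.unpack_from(">I", buf, off)
    start = off + 4
    text = bytes(buf[start:start + n]).decode("ascii")
    return text, start + ((n + 3) & ~3)


def decode_mountlist(buf, off=0):
    """Hypothetical helper: return ([(hostname, directory), ...], next offset)."""
    entries = []
    while True:
        # Optional-data marker: nonzero means a mountbody follows.
        (present,) = struct.unpack_from(">I", buf, off)
        off += 4
        if not present:
            return entries, off
        host, off = _xdr_string(buf, off)      # ml_hostname
        path, off = _xdr_string(buf, off)      # ml_directory
        entries.append((host, path))
        # ml_next is decoded by the next loop iteration.
```

   Decoding iteratively rather than recursively keeps the sketch
   safe for long mount lists. Because entries can be stale, a
   caller should treat the result as advisory.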

5.2.3 Procedure 3: UMNT - Remove mount entry

   SYNOPSIS

      void MOUNTPROC3_UMNT(dirpath) = 3;

   DESCRIPTION

      Procedure UMNT removes the mount list entry for the
      directory that was previously the subject of a MNT call
      from this client. AUTH_UNIX authentication or better is
      required.

   IMPLEMENTATION

      Typically, server implementations have maintained a list
      of clients which have file systems mounted. In the past,
      this list has been used to inform clients that the server
      was going to be shut down.

   ERRORS

      There are no MOUNT protocol errors which can be returned
      from this procedure. However, RPC errors may be returned
      for authentication or other RPC failures.

5.2.4 Procedure 4: UMNTALL - Remove all mount entries

   SYNOPSIS

      void MOUNTPROC3_UMNTALL(void) = 4;

   DESCRIPTION

      Procedure UMNTALL removes all of the mount entries for
      this client previously recorded by calls to MNT. AUTH_UNIX
      authentication or better is required.

   IMPLEMENTATION

      This procedure should be used by clients when they are
      recovering after a system shutdown. If the client could
      not successfully unmount all of its file systems before
      being shut down or the client crashed because of a
      software or hardware problem, there may be servers which
      still have mount entries for this client. This is an easy
      way for the client to inform all servers at once that it
      does not have any mounted file systems. However, since
      this procedure is generally implemented using broadcast
      RPC, it is only of limited usefulness.

   ERRORS

      There are no MOUNT protocol errors which can be returned
      from this procedure. However, RPC errors may be returned
      for authentication or other RPC failures.

5.2.5 Procedure 5: EXPORT - Return export list

   SYNOPSIS

      exports MOUNTPROC3_EXPORT(void) = 5;

      typedef struct groupnode *groups;

      struct groupnode {
           name     gr_name;
           groups   gr_next;
      };

      typedef struct exportnode *exports;

      struct exportnode {
           dirpath  ex_dir;
           groups   ex_groups;
           exports  ex_next;
      };

   DESCRIPTION

      Procedure EXPORT returns a list of all the exported file
      systems and which clients are allowed to mount each one.
      The names in the group list are implementation-specific
      and cannot be directly interpreted by clients. These names
      can represent hosts or groups of hosts.

   IMPLEMENTATION

      This procedure generally returns the contents of a list of
      shared or exported file systems. These are the file
      systems which are made available to NFS version 3 protocol
      clients.

   ERRORS

      There are no MOUNT protocol errors which can be returned
      from this procedure. However, RPC errors may be returned
      for authentication or other RPC failures.

6.0 Appendix II: Lock manager protocol

   Because the NFS version 2 protocol as well as the NFS version
   3 protocol is stateless, an additional Network Lock Manager
   (NLM) protocol is required to support locking of NFS-mounted
   files. The NLM version 3 protocol, which is used with the NFS
   version 2 protocol, is documented in [X/OpenNFS].

   Some of the changes in the NFS version 3 protocol require a
   new version of the NLM protocol. This new protocol is the NLM
   version 4 protocol. The following table summarizes the
   correspondence between versions of the NFS protocol and NLM
   protocol.

       NFS and NLM protocol compatibility

               +---------+---------+
               |   NFS   |   NLM   |
               | Version | Version |
               +===================+
               |    2    |   1,3   |
               +---------+---------+
               |    3    |    4    |
               +---------+---------+

   This appendix only discusses the differences between the NLM
   version 3 protocol and the NLM version 4 protocol. As in the
   NFS version 3 protocol, almost all the names in the NLM
   version 4 protocol have been changed to include a version
   number. This appendix does not discuss changes that consist
   solely of a name change.

6.1 RPC Information

6.1.1 Authentication

   The NLM service uses AUTH_NONE in the NULL procedure.
   AUTH_UNIX, AUTH_SHORT, AUTH_DES, and AUTH_KERB are used for
   all other procedures. Other authentication types may be
   supported in the future.

6.1.2 Constants

   These are the RPC constants needed to call the NLM service.
   They are given in decimal.

      PROGRAM    100021
      VERSION    4
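
   A client that supports more than one NFS version has to pick
   a matching lock manager version. The compatibility table and
   the program number above could be captured as in the
   following Python sketch. The names NLM_PROGRAM,
   NLM_VERSIONS_FOR_NFS, and nlm_versions_for are hypothetical
   and only illustrate the lookup.

```python
NLM_PROGRAM = 100021  # RPC program number of the NLM service (decimal)

# NFS protocol version -> NLM protocol versions used with it,
# taken from the compatibility table.
NLM_VERSIONS_FOR_NFS = {
    2: (1, 3),
    3: (4,),
}


def nlm_versions_for(nfs_version):
    """Return the NLM versions that pair with the given NFS version."""
    try:
        return NLM_VERSIONS_FOR_NFS[nfs_version]
    except KeyError:
        raise ValueError("no NLM mapping for NFS version %r" % (nfs_version,))
```

   For example, nlm_versions_for(3) yields the single version 4
   to request from the NLM program.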

6.1.3 Transport Address

   The NLM service is normally supported over the TCP and UDP
   protocols. The rpcbind daemon should be queried for the
   correct transport address.

6.1.4 Basic Data Types

   uint64
         typedef unsigned hyper uint64;

   int64
         typedef hyper int64;

   uint32
         typedef unsigned long uint32;

   int32
         typedef long int32;

   These types are new for the NLM version 4 protocol. They are
   the same as in the NFS version 3 protocol.

   nlm4_stats

      enum nlm4_stats {
         NLM4_GRANTED = 0,
         NLM4_DENIED = 1,
         NLM4_DENIED_NOLOCKS = 2,
         NLM4_BLOCKED = 3,
         NLM4_DENIED_GRACE_PERIOD = 4,
         NLM4_DEADLCK = 5,
         NLM4_ROFS = 6,
         NLM4_STALE_FH = 7,
         NLM4_FBIG = 8,
         NLM4_FAILED = 9
      };

   Nlm4_stats indicates the success or failure of a call. This
   version contains several new error codes, so that clients can
   provide more precise failure information to applications.

   NLM4_GRANTED
      The call completed successfully.

   NLM4_DENIED
      The call failed. For attempts to set a lock, this status
      implies that if the client retries the call later, it may
      succeed.

   NLM4_DENIED_NOLOCKS
      The call failed because the server could not allocate the
      necessary resources.

   NLM4_BLOCKED
      Indicates that a blocking request cannot be granted
      immediately. The server will issue an NLMPROC4_GRANTED
      callback to the client when the lock is granted.

   NLM4_DENIED_GRACE_PERIOD
      The call failed because the server is reestablishing old
      locks after a reboot and is not yet ready to resume normal
      service.

   NLM4_DEADLCK
      The request could not be granted and blocking would cause
      a deadlock.

   NLM4_ROFS
      The call failed because the remote file system is
      read-only. For example, some server implementations might
      not support exclusive locks on read-only file systems.

   NLM4_STALE_FH
      The call failed because it uses an invalid file handle.
      This can happen if the file has been removed or if access
      to the file has been revoked on the server.

   NLM4_FBIG
      The call failed because it specified a length or offset
      that exceeds the range supported by the server.

   NLM4_FAILED
      The call failed for some reason not already listed. The
      client should take this status as a strong hint not to
      retry the request.

   nlm4_holder

      struct nlm4_holder {
           bool     exclusive;
           int32    svid;
           netobj   oh;
           uint64   l_offset;
           uint64   l_len;
      };

   This structure indicates the holder of a lock. The exclusive
   field tells whether the holder has an exclusive lock or a
   shared lock. The svid field identifies the process that is
   holding the lock. The oh field is an opaque object that
   identifies the host or process that is holding the lock. The
   l_len and l_offset fields identify the region that is locked.
   The only difference between the NLM version 3 protocol and
   the NLM version 4 protocol is that in the NLM version 3
   protocol, the l_len and l_offset fields are 32 bits wide,
   while they are 64 bits wide in the NLM version 4 protocol.

   nlm4_lock

      struct nlm4_lock {
           string   caller_name<LM_MAXSTRLEN>;
           netobj   fh;
           netobj   oh;
           int32    svid;
           uint64   l_offset;
           uint64   l_len;
      };

   This structure describes a lock request. The caller_name
   field identifies the host that is making the request. The fh
   field identifies the file to lock. The oh field is an opaque
   object that identifies the host or process that is making the
   request, and the svid field identifies the process that is
   making the request. The l_offset and l_len fields identify
   the region of the file that the lock controls. An l_len of 0
   means "to end of file".

   There are two differences between the NLM version 3 protocol
   and the NLM version 4 protocol versions of this structure.
   First, in the NLM version 3 protocol, the length and offset
   are 32 bits wide, while they are 64 bits wide in the NLM
   version 4 protocol. Second, in the NLM version 3 protocol,
   the file handle is a fixed-length NFS version 2 protocol file
   handle, which is encoded as a byte count followed by a byte
   array. In the NFS version 3 protocol, the file handle is
   already variable-length, so it is copied directly into the fh
   field. That is, the first four bytes of the fh field are the
   same as the byte count in an NFS version 3 protocol nfs_fh3.
   The rest of the fh field contains the byte array from the NFS
   version 3 protocol nfs_fh3.
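
   The encoding of nlm4_lock described above can be sketched in
   Python as follows. This is an illustration only: it assumes
   standard XDR encoding (big-endian, variable-length data
   prefixed by a 4-byte count and padded to four bytes), it does
   not enforce LM_MAXSTRLEN or the netobj size limit, and the
   names encode_nlm4_lock and lock_covers are hypothetical.

```python
import struct


def _xdr_opaque(data):
    """Encode variable-length opaque data: byte count, bytes, padding."""
    return struct.pack(">I", len(data)) + data + b"\0" * (-len(data) % 4)


def encode_nlm4_lock(caller_name, fh, oh, svid, l_offset, l_len):
    """Hypothetical helper: XDR-encode one nlm4_lock structure."""
    return (
        _xdr_opaque(caller_name.encode("ascii"))  # string caller_name<>
        + _xdr_opaque(fh)   # netobj fh: count, then the nfs_fh3 byte array
        + _xdr_opaque(oh)   # netobj oh
        + struct.pack(">iQQ", svid, l_offset, l_len)  # int32, 2 x uint64
    )


def lock_covers(l_offset, l_len, position):
    """Return True if the byte at `position` lies inside the locked region.

    An l_len of 0 means the region extends to the end of the file.
    """
    if position < l_offset:
        return False
    return l_len == 0 or position < l_offset + l_len
```

   Because fh is encoded as a count followed by the handle
   bytes, the first four bytes of the encoded fh field equal the
   byte count of the corresponding nfs_fh3, as the text
   describes.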

   nlm4_share

      struct nlm4_share {
           string      caller_name<LM_MAXSTRLEN>;
           netobj      fh;
           netobj      oh;
           fsh4_mode   mode;
           fsh4_access access;
      };

   This structure is used to support DOS file sharing. The
   caller_name field identifies the host making the request. The
   fh field identifies the file to be operated on. The oh field
   is an opaque object that identifies the host or process that
   is making the request. The mode and access fields specify the
   file-sharing and access modes. The encoding of fh is a byte
   count, followed by the file handle byte array. See the
   description of nlm4_lock for more details.

6.2 NLM Procedures

   The procedures in the NLM version 4 protocol are semantically
   the same as those in the NLM version 3 protocol. The only
   semantic difference is the addition of a NULL procedure that
   can be used to test for server responsiveness. The procedure
   names with _MSG and _RES suffixes denote asynchronous
   messages; for these the void response implies no reply. A
   syntactic change is that the procedures were renamed to avoid
   name conflicts with the values of nlm4_stats. Thus the
   procedure definition is as follows.

      version NLM4_VERS {

         void
            NLMPROC4_NULL(void)                  = 0;

         nlm4_testres
            NLMPROC4_TEST(nlm4_testargs)         = 1;

         nlm4_res
            NLMPROC4_LOCK(nlm4_lockargs)         = 2;

         nlm4_res
            NLMPROC4_CANCEL(nlm4_cancargs)       = 3;

         nlm4_res
            NLMPROC4_UNLOCK(nlm4_unlockargs)     = 4;
RFC 1813 NFS Version 3 Protocol June 1995 nlm4_res NLMPROC4_GRANTED(nlm4_testargs) = 5; void NLMPROC4_TEST_MSG(nlm4_testargs) = 6; void NLMPROC4_LOCK_MSG(nlm4_lockargs) = 7; void NLMPROC4_CANCEL_MSG(nlm4_cancargs) = 8; void NLMPROC4_UNLOCK_MSG(nlm4_unlockargs) = 9; void NLMPROC4_GRANTED_MSG(nlm4_testargs) = 10; void NLMPROC4_TEST_RES(nlm4_testres) = 11; void NLMPROC4_LOCK_RES(nlm4_res) = 12; void NLMPROC4_CANCEL_RES(nlm4_res) = 13; void NLMPROC4_UNLOCK_RES(nlm4_res) = 14; void NLMPROC4_GRANTED_RES(nlm4_res) = 15; nlm4_shareres NLMPROC4_SHARE(nlm4_shareargs) = 20; nlm4_shareres NLMPROC4_UNSHARE(nlm4_shareargs) = 21; nlm4_res NLMPROC4_NM_LOCK(nlm4_lockargs) = 22; void NLMPROC4_FREE_ALL(nlm4_notify) = 23; } = 4; Callaghan, el al Informational [Page 119]
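
   As an illustration of the naming rule for asynchronous
   messages, the procedure numbers above can be held in a lookup
   table and classified by suffix. This is a sketch only; the
   table name NLM4_PROCEDURES and the helper functions are
   hypothetical and not part of the protocol.

```python
NLM4_VERS = 4

# Procedure numbers and names as listed in the NLM4_VERS definition.
# Numbers 16 through 19 are not assigned in that definition.
NLM4_PROCEDURES = {
    0: "NLMPROC4_NULL",
    1: "NLMPROC4_TEST",
    2: "NLMPROC4_LOCK",
    3: "NLMPROC4_CANCEL",
    4: "NLMPROC4_UNLOCK",
    5: "NLMPROC4_GRANTED",
    6: "NLMPROC4_TEST_MSG",
    7: "NLMPROC4_LOCK_MSG",
    8: "NLMPROC4_CANCEL_MSG",
    9: "NLMPROC4_UNLOCK_MSG",
    10: "NLMPROC4_GRANTED_MSG",
    11: "NLMPROC4_TEST_RES",
    12: "NLMPROC4_LOCK_RES",
    13: "NLMPROC4_CANCEL_RES",
    14: "NLMPROC4_UNLOCK_RES",
    15: "NLMPROC4_GRANTED_RES",
    20: "NLMPROC4_SHARE",
    21: "NLMPROC4_UNSHARE",
    22: "NLMPROC4_NM_LOCK",
    23: "NLMPROC4_FREE_ALL",
}


def is_async_message(proc: int) -> bool:
    """True if the procedure name carries the _MSG or _RES suffix."""
    return NLM4_PROCEDURES[proc].endswith(("_MSG", "_RES"))


def async_procedures() -> list:
    """Return the numbers of all asynchronous message procedures."""
    return sorted(p for p in NLM4_PROCEDURES if is_async_message(p))
```

   Note that NLMPROC4_NULL and NLMPROC4_FREE_ALL also return
   void but carry neither suffix, so this classification does
   not treat them as asynchronous messages.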
6.2.0 Procedure 0: NULL - Do nothing

   SYNOPSIS

      void NLMPROC4_NULL(void) = 0;

   DESCRIPTION

      The NULL procedure does no work. It is made available in
      all RPC services to allow server response testing and
      timing.

   IMPLEMENTATION

      It is important that this procedure do no work at all so
      that it can be used to measure the overhead of processing
      a service request. By convention, the NULL procedure
      should never require any authentication.

   ERRORS

      It is possible that some server implementations may return
      RPC errors based on security and authentication
      requirements.

6.3 Implementation issues

6.3.1 64-bit offsets and lengths

   Some NFS version 3 protocol servers can only support
   requests where the file offset or length fits in 32 or fewer
   bits. For these servers, the lock manager will have the same
   restriction. If such a lock manager receives a request that
   it cannot handle (because the offset or length uses more than
   32 bits), it should return the error NLM4_FBIG.

6.3.2 File handles

   The change in the file handle format from the NFS version 2
   protocol to the NFS version 3 protocol complicates the lock
   manager. First, the lock manager needs some way to tell when
   an NFS version 2 protocol file handle refers to the same file
   as an NFS version 3 protocol file handle. (This is assuming
   that the lock manager supports both NLM version 3 protocol
   clients and NLM version 4 protocol clients.) Second, if the
   lock manager runs the file handle through a hashing function,
   the hashing function may need to be retuned to work with NFS
   version 3 protocol file handles as well as NFS version 2
   protocol file handles.
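
   The range check described in section 6.3.1 might look like
   the following sketch. The function name range_error and the
   server_is_64bit flag are hypothetical. The value 8 for
   NLM4_FBIG is taken from the nlm4_stats enumeration defined
   earlier in this appendix and should be verified against that
   definition.

```python
from typing import Optional

NLM4_FBIG = 8            # assumed value; see the nlm4_stats enumeration
UINT32_MAX = 0xFFFFFFFF  # largest value that fits in 32 bits


def range_error(l_offset: int, l_len: int,
                server_is_64bit: bool) -> Optional[int]:
    """Return NLM4_FBIG if a 32-bit-only lock manager cannot handle
    the requested region, otherwise None.

    Hypothetical helper: a real lock manager would go on to process
    the request when None is returned.
    """
    if server_is_64bit:
        return None
    if l_offset > UINT32_MAX or l_len > UINT32_MAX:
        return NLM4_FBIG
    return None
```

   Only the offset and the length are tested individually here,
   which follows the wording of section 6.3.1. Whether their sum
   must also fit in 32 bits is left to the implementation.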
7.0 Appendix III: Bibliography

[Corbin]        Corbin, John, "The Art of Distributed
                Programming-Programming Techniques for Remote
                Procedure Calls." Springer-Verlag, New York, New
                York. 1991. Basic description of RPC and XDR and
                how to program distributed applications using
                them.

[Glover]        Glover, Fred, "TNFS Protocol Specification,"
                Trusted System Interest Group, Work in Progress.

[Israel]        Israel, Robert K., Sandra Jett, James Pownell,
                George M. Ericson, "Eliminating Data Copies in
                UNIX-based NFS Servers," Uniforum Conference
                Proceedings, San Francisco, CA, February 27 -
                March 2, 1989. Describes two methods for reducing
                data copies in NFS server code.

[Jacobson]      Jacobson, V., "Congestion Control and Avoidance,"
                Proc. ACM SIGCOMM '88, Stanford, CA, August 1988.
                The paper describing improvements to TCP to allow
                use over Wide Area Networks and through gateways
                connecting networks of varying capacity. This
                work was a starting point for the NFS Dynamic
                Retransmission work.

[Juszczak]      Juszczak, Chet, "Improving the Performance and
                Correctness of an NFS Server," USENIX Conference
                Proceedings, USENIX Association, Berkeley, CA,
                June 1990, pages 53-63. Describes reply cache
                implementation that avoids work in the server by
                handling duplicate requests. More important,
                though listed as a side-effect, the reply cache
                aids in the avoidance of destructive
                non-idempotent operation re-application --
                improving correctness.

[Kazar]         Kazar, Michael Leon, "Synchronization and Caching
                Issues in the Andrew File System," USENIX
                Conference Proceedings, USENIX Association,
                Berkeley, CA, Dallas Winter 1988, pages 27-36. A
                description of the cache consistency scheme in
                AFS. Contrasted with other distributed file
                systems.

[Macklem]       Macklem, Rick, "Lessons Learned Tuning the 4.3BSD
                Reno Implementation of the NFS Protocol," Winter
                USENIX Conference Proceedings, USENIX
                Association, Berkeley, CA, January 1991.
                Describes performance work in tuning the 4.3BSD
                Reno NFS implementation. Describes performance
                improvement (reduced CPU loading) through
                elimination of data copies.

[Mogul]         Mogul, Jeffrey C., "A Recovery Protocol for
                Spritely NFS," USENIX File System Workshop
                Proceedings, Ann Arbor, MI, USENIX Association,
                Berkeley, CA, May 1992. The second paper on
                Spritely NFS; proposes a lease-based scheme for
                recovering the state of the consistency protocol.

[Nowicki]       Nowicki, Bill, "Transport Issues in the Network
                File System," ACM SIGCOMM newsletter Computer
                Communication Review, April 1989. A brief
                description of the basis for the dynamic
                retransmission work.

[Pawlowski]     Pawlowski, Brian, Ron Hixon, Mark Stein, Joseph
                Tumminaro, "Network Computing in the UNIX and IBM
                Mainframe Environment," Uniforum '89 Conf. Proc.,
                (1989). Description of an NFS server
                implementation for IBM's MVS operating system.

[RFC1014]       Sun Microsystems, Inc., "XDR: External Data
                Representation Standard", RFC 1014, Sun
                Microsystems, Inc., June 1987. Specification for
                canonical format for data exchange, used with
                RPC.

[RFC1057]       Sun Microsystems, Inc., "RPC: Remote Procedure
                Call Protocol Specification", RFC 1057, Sun
                Microsystems, Inc., June 1988. Remote procedure
                protocol specification.

[RFC1094]       Sun Microsystems, Inc., "Network Filesystem
                Specification", RFC 1094, Sun Microsystems, Inc.,
                March 1989. NFS version 2 protocol specification.

[Sandberg]      Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh,
                B. Lyon, "Design and Implementation of the Sun
                Network Filesystem," USENIX Conference
                Proceedings, USENIX Association, Berkeley, CA,
                Summer 1985. The basic paper describing the SunOS
                implementation of the NFS version 2 protocol; it
                discusses the goals, protocol specification and
                trade-offs.

[Srinivasan]    Srinivasan, V., Jeffrey C. Mogul, "Spritely NFS:
                Implementation and Performance of Cache
                Consistency Protocols", WRL Research Report 89/5,
                Digital Equipment Corporation Western Research
                Laboratory, 100 Hamilton Ave., Palo Alto, CA,
                94301, May 1989. This paper analyzes the effect
                of applying a Sprite-like consistency protocol to
                standard NFS. The issues of recovery in a
                stateful environment are covered in [Mogul].

[X/OpenNFS]     X/Open Company, Ltd., X/Open CAE Specification:
                Protocols for X/Open Internetworking: XNFS,
                X/Open Company, Ltd., Apex Plaza, Forbury Road,
                Reading, Berkshire, RG1 1AX, United Kingdom,
                1991. This is an indispensable reference for NFS
                version 2 protocol and accompanying protocols,
                including the Lock Manager and the Portmapper.

[X/OpenPCNFS]   X/Open Company, Ltd., X/Open CAE Specification:
                Protocols for X/Open Internetworking: (PC)NFS,
                Developer's Specification, X/Open Company, Ltd.,
                Apex Plaza, Forbury Road, Reading, Berkshire, RG1
                1AX, United Kingdom, 1991. This is an
                indispensable reference for NFS version 2
                protocol and accompanying protocols, including
                the Lock Manager and the Portmapper.

8. Security Considerations

   Since sensitive file data may be transmitted to or received
   from a server by the NFS protocol, authentication, privacy,
   and data integrity issues should be addressed by
   implementations of this protocol.

   As with the previous protocol revision (version 2), NFS
   version 3 defers to the authentication provisions of the
   supporting RPC protocol [RFC1057], and assumes that data
   privacy and integrity are provided by underlying transport
   layers as available in each implementation of the protocol.
   See section 4.4 for a discussion relating to file access
   permissions.

9. Acknowledgements

   This description of the protocol is derived from an original
   document written by Brian Pawlowski and revised by Peter
   Staubach. This protocol is the result of a co-operative
   effort that comprises the contributions of Geoff Arnold,
   Brent Callaghan, John Corbin, Fred Glover, Chet Juszczak,
   Mike Eisler, John Gillono, Dave Hitz, Mike Kupfer, Rick
   Macklem, Ron Minnich, Brian Pawlowski, David Robinson, Rusty
   Sandberg, Craig Schamp, Spencer Shepler, Carl Smith, Mark
   Stein, Peter Staubach, Tom Talpey, Rob Thurlow, and Mark
   Wittle.

10. Authors' Addresses

   Address comments related to this protocol to:

      nfs3@eng.sun.com


