| [[Internal_Inodes]] |
| = Internal Inodes |
| |
| XFS allocates several inodes when a filesystem is created. These are internal |
| and not accessible from the standard directory structure. These inodes are only |
| accessible from the superblock. |
| |
| [[Metadata_Directories]] |
| == Metadata Directory Tree |
| |
| If the +XFS_SB_FEAT_INCOMPAT_METADIR+ feature is enabled, the +sb_metadirino+ |
| field in the superblock points to the root of a directory tree containing |
| metadata files. This directory tree is completely internal to the filesystem |
| and must not be exposed to user programs. |
| |
| When this feature is enabled, metadata files should be found by walking the |
| metadata directory tree. The superblock fields that formerly pointed to some |
| of those inodes are no longer used for that purpose and may be reused by |
| future features. |
| |
| .Metadata Directory Paths |
| [options="header"] |
| |===== |
| | Metadata File | Location |
| | xref:Quota_Inodes[User Quota] | /quota/user |
| | xref:Quota_Inodes[Group Quota] | /quota/group |
| | xref:Quota_Inodes[Project Quota] | /quota/project |
| | xref:Real-Time_Bitmap_Inode[Realtime Bitmap] | /rtgroups/*.bitmap |
| | xref:Real-Time_Summary_Inode[Realtime Summary] | /rtgroups/*.summary |
| | xref:Real_time_Reverse_Mapping_Btree[Realtime Reverse Mapping B+tree] | /rtgroups/*.rmap |
| | xref:Real_time_Refcount_Btree[Realtime Reference Count B+tree] | /rtgroups/*.refcount |
| |===== |
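
As an illustration of the per-group paths in the table, the following minimal sketch formats one such path. It assumes that the `*` in each location stands for the decimal realtime group number, as the `/rtgroups/0.rmap` path used later in this section suggests. The helper name `rtgroup_meta_path` is hypothetical and not part of any XFS tool or library.

```c
#include <stdio.h>

/*
 * Format "/rtgroups/<rgno>.<suffix>" into buf, for example
 * "/rtgroups/0.rmap". Hypothetical helper for illustration only.
 *
 * Returns 0 on success, or -1 if the buffer is too small or
 * snprintf reports an error.
 */
static int rtgroup_meta_path(char *buf, size_t len, unsigned int rgno,
                             const char *suffix)
{
    int n = snprintf(buf, len, "/rtgroups/%u.%s", rgno, suffix);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A path built this way could then be handed to the xfs_db +path -m+ command shown in the example below.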
| |
| Metadata files are flagged by the +XFS_DIFLAG2_METADATA+ flag in the |
| +di_flags2+ field. Metadata files must have the following properties: |
| |
| * Must be either a directory or a regular file. |
| * Permission bits must all be zero (mode 0000). |
| * User and group IDs set to zero. |
| * The +XFS_DIFLAG_IMMUTABLE+, +XFS_DIFLAG_SYNC+, +XFS_DIFLAG_NOATIME+, +XFS_DIFLAG_NODUMP+, and +XFS_DIFLAG_NODEFRAG+ flags must all be set in +di_flags+. |
| * For a directory, the +XFS_DIFLAG_NOSYMLINKS+ flag must also be set. |
| * The +XFS_DIFLAG2_METADATA+ flag must be set in +di_flags2+. |
| * The +XFS_DIFLAG2_DAX+ flag must not be set. |
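
The property list above can be expressed as a single check. The sketch below is one way to do it, under stated assumptions: the structure `meta_inode_fields` is a hypothetical host-endian copy of the relevant inode fields, the +di_flags+ bit positions are assumed to match the kernel's `xfs_format.h`, and the masks for +XFS_DIFLAG2_METADATA+ and +XFS_DIFLAG2_DAX+ are passed in by the caller because this section does not state their bit positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* di_flags bits; values assumed to match the kernel's xfs_format.h. */
#define XFS_DIFLAG_IMMUTABLE   (1u << 3)
#define XFS_DIFLAG_SYNC        (1u << 5)
#define XFS_DIFLAG_NOATIME     (1u << 6)
#define XFS_DIFLAG_NODUMP      (1u << 7)
#define XFS_DIFLAG_NOSYMLINKS  (1u << 10)
#define XFS_DIFLAG_NODEFRAG    (1u << 13)

/* File type and permission masks of the POSIX mode encoding. */
#define MODE_TYPE_MASK  0170000u
#define MODE_TYPE_DIR   0040000u
#define MODE_TYPE_REG   0100000u
#define MODE_PERM_MASK  0007777u

/* Hypothetical host-endian copy of the inode fields being checked. */
struct meta_inode_fields {
    uint16_t mode;
    uint32_t uid;
    uint32_t gid;
    uint16_t flags;   /* di_flags */
    uint64_t flags2;  /* di_flags2 */
};

/* Return true if the inode satisfies every property in the list. */
static bool meta_inode_ok(const struct meta_inode_fields *ip,
                          uint64_t metadata_bit, uint64_t dax_bit)
{
    uint32_t type = ip->mode & MODE_TYPE_MASK;
    uint16_t need = XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_SYNC |
                    XFS_DIFLAG_NOATIME | XFS_DIFLAG_NODUMP |
                    XFS_DIFLAG_NODEFRAG;

    if (type != MODE_TYPE_DIR && type != MODE_TYPE_REG)
        return false;               /* directory or regular file only */
    if (ip->mode & MODE_PERM_MASK)
        return false;               /* permission bits must be zero */
    if (ip->uid != 0 || ip->gid != 0)
        return false;               /* owner and group must be zero */
    if (type == MODE_TYPE_DIR)
        need |= XFS_DIFLAG_NOSYMLINKS;
    if ((ip->flags & need) != need)
        return false;               /* a required di_flags bit is clear */
    if (!(ip->flags2 & metadata_bit))
        return false;               /* metadata flag must be set */
    if (ip->flags2 & dax_bit)
        return false;               /* DAX flag must not be set */
    return true;
}
```

Passing the two +di_flags2+ masks as parameters keeps the sketch independent of any particular encoding of those flags.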
| |
| === Metadata Directory Example |
| |
| This example shows a metadata directory from a freshly formatted root |
| filesystem: |
| |
| ---- |
| xfs_db> sb 0 |
| xfs_db> p |
| magicnum = 0x58465342 |
| blocksize = 4096 |
| dblocks = 5192704 |
| rblocks = 0 |
| rextents = 0 |
| uuid = cbf2ceef-658e-46b0-8f96-785661c37976 |
| logstart = 4194311 |
| rootino = 128 |
| rbmino = 130 |
| rsumino = 131 |
| ... |
| meta_uuid = 00000000-0000-0000-0000-000000000000 |
| metadirino = 129 |
| ... |
| ---- |
| |
| Notice how the listing includes the root of the metadata directory tree |
| (+metadirino+). |
| |
| ---- |
| xfs_db> path -m / |
| xfs_db> ls |
| 8 129 directory 0x0000002e 1 . (good) |
| 10 129 directory 0x0000172e 2 .. (good) |
| 12 33685632 directory 0x2d18ab4c 8 rtgroups (good) |
| ---- |
| |
| Here we use the +path+ and +ls+ commands to display the root of the metadata |
| directory tree. We can also examine the directory the old way, by printing |
| its inode: |
| |
| ---- |
| xfs_db> p |
| core.magic = 0x494e |
| core.mode = 040000 |
| core.version = 3 |
| core.format = 1 (local) |
| core.onlink = 0 |
| core.uid = 0 |
| core.gid = 0 |
| ... |
| v3.flags2 = 0x8000000000000018 |
| v3.cowextsize = 0 |
| v3.crtime.sec = Wed Aug 7 10:22:36 2024 |
| v3.crtime.nsec = 273744000 |
| v3.inumber = 129 |
| v3.uuid = 7e55b909-8728-4d69-a1fa-891427314eea |
| v3.reflink = 0 |
| v3.cowextsz = 0 |
| v3.dax = 0 |
| v3.bigtime = 1 |
| v3.nrext64 = 1 |
| v3.metadata = 1 |
| u3.sfdir3.hdr.count = 1 |
| u3.sfdir3.hdr.i8count = 0 |
| u3.sfdir3.hdr.parent.i4 = 129 |
| u3.sfdir3.list[0].namelen = 8 |
| u3.sfdir3.list[0].offset = 0x60 |
| u3.sfdir3.list[0].name = "rtgroups" |
| u3.sfdir3.list[0].inumber.i4 = 33685632 |
| u3.sfdir3.list[0].filetype = 2 |
| ---- |
| |
| The root of the metadata directory is a short format directory, and looks just |
| like any other directory. The only difference is that the metadata flag is |
| set, and the directory can only be viewed in the XFS debugger. |
| |
| ---- |
| xfs_db> path -m /rtgroups/0.rmap |
| xfs_db> btdump |
| u3.rtrmapbt.recs[1] = [startblock,blockcount,owner,offset,extentflag,attrfork,bmbtblock] |
| 1:[0,1,-3,0,0,0,0] |
| ---- |
| |
| Observe that we can use the xfs_db +path+ command to navigate the metadata |
| directory tree to the realtime reverse mapping B+tree file of realtime group 0 |
| and display its contents. |
| |
| [[Quota_Inodes]] |
| == Quota Inodes |
| |
| On filesystems older than version 5, at most two inodes are allocated for |
| quota management. The first inode is used for user quotas. The second inode |
| is used for either group quotas or project quotas, depending on mount options, |
| so group and project quotas are mutually exclusive on these filesystems. |
| |
| In version 5 or later filesystems, each quota type is allocated its own inode, |
| making it possible to use group and project quota management simultaneously. |
| |
| * Project quota's primary purpose is to track and monitor disk usage for |
| directories. For this to occur, the directory inode must have the |
| +XFS_DIFLAG_PROJINHERIT+ flag set so all inodes created underneath the directory |
| inherit the project ID. |
| |
| * Inodes and blocks owned by ID zero are subject to quota accounting only; |
| limits are not enforced for them. |
| |
| * Extended attributes do not contribute towards the ID's quota. |
| |
| * To access each ID's quota information in the file, seek to the ID offset |
| multiplied by the size of +xfs_dqblk_t+ (136 bytes). |
| |
| .Quota inode layout |
| image::images/76.png[] |
| |
| Quota information is stored in the data extents of the reserved quota |
| inodes as an array of the +xfs_dqblk+ structures, where there is one array |
| element for each ID in the system: |
| |
| [source, c] |
| ---- |
| struct xfs_disk_dquot { |
| __be16 d_magic; |
| __u8 d_version; |
| __u8 d_flags; |
| __be32 d_id; |
| __be64 d_blk_hardlimit; |
| __be64 d_blk_softlimit; |
| __be64 d_ino_hardlimit; |
| __be64 d_ino_softlimit; |
| __be64 d_bcount; |
| __be64 d_icount; |
| __be32 d_itimer; |
| __be32 d_btimer; |
| __be16 d_iwarns; |
| __be16 d_bwarns; |
| __be32 d_pad0; |
| __be64 d_rtb_hardlimit; |
| __be64 d_rtb_softlimit; |
| __be64 d_rtbcount; |
| __be32 d_rtbtimer; |
| __be16 d_rtbwarns; |
| __be16 d_pad; |
| }; |
| struct xfs_dqblk { |
| struct xfs_disk_dquot dd_diskdq; |
| char dd_fill[4]; |
| |
| /* version 5 filesystem fields begin here */ |
| __be32 dd_crc; |
| __be64 dd_lsn; |
| uuid_t dd_uuid; |
| }; |
| ---- |
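
The 136-byte record size quoted earlier can be checked against the structure definition. The sketch below mirrors the layout with fixed-width host types and verifies the sizes at compile time. The names `dquot_layout` and `dqblk_layout` are hypothetical, +uuid_t+ is assumed to be 16 bytes, and because the big-endian fields are declared as plain integers, only sizes and offsets are meaningful here, not byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size-only mirror of struct xfs_disk_dquot. */
struct dquot_layout {
    uint16_t d_magic;
    uint8_t  d_version;
    uint8_t  d_flags;
    uint32_t d_id;
    uint64_t d_blk_hardlimit;
    uint64_t d_blk_softlimit;
    uint64_t d_ino_hardlimit;
    uint64_t d_ino_softlimit;
    uint64_t d_bcount;
    uint64_t d_icount;
    uint32_t d_itimer;
    uint32_t d_btimer;
    uint16_t d_iwarns;
    uint16_t d_bwarns;
    uint32_t d_pad0;
    uint64_t d_rtb_hardlimit;
    uint64_t d_rtb_softlimit;
    uint64_t d_rtbcount;
    uint32_t d_rtbtimer;
    uint16_t d_rtbwarns;
    uint16_t d_pad;
};

/* Size-only mirror of struct xfs_dqblk. */
struct dqblk_layout {
    struct dquot_layout dd_diskdq;
    char     dd_fill[4];
    uint32_t dd_crc;
    uint64_t dd_lsn;
    uint8_t  dd_uuid[16];   /* assumes a 16-byte uuid_t */
};

/* Every field falls on its natural alignment, so no padding is added. */
static_assert(sizeof(struct dquot_layout) == 104, "disk dquot is 104 bytes");
static_assert(sizeof(struct dqblk_layout) == 136, "dquot block entry is 136 bytes");
static_assert(offsetof(struct dqblk_layout, dd_crc) == 108, "CRC follows dd_fill");
```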
| |
| *d_magic*:: |
| Specifies the signature where these two bytes are 0x4451 (+XFS_DQUOT_MAGIC+), |
| or ``DQ'' in ASCII. |
| |
| *d_version*:: |
| The structure version. Currently this is 1 (+XFS_DQUOT_VERSION+). |
| |
| *d_flags*:: |
| Specifies which type of ID the structure applies to: |
| |
| [source, c] |
| ---- |
| #define XFS_DQ_USER 0x0001 |
| #define XFS_DQ_PROJ 0x0002 |
| #define XFS_DQ_GROUP 0x0004 |
| ---- |
| |
| *d_id*:: |
| The ID for the quota structure. This will be a uid, gid or projid based on the |
| value of +d_flags+. |
| |
| *d_blk_hardlimit*:: |
| The hard limit for the number of filesystem blocks the ID can own. The |
| ID will not be able to use more space than this limit. Any attempt to do so |
| fails with +ENOSPC+. |
| |
| *d_blk_softlimit*:: |
| The soft limit for the number of filesystem blocks the ID can own. |
| The ID can temporarily use more space than specified by +d_blk_softlimit+ up to |
| +d_blk_hardlimit+. If the space is not freed by the time limit specified by ID |
| zero's +d_btimer+ value, the ID will be denied more space until the total |
| blocks owned goes below +d_blk_softlimit+. |
| |
| *d_ino_hardlimit*:: |
| The hard limit for the number of inodes the ID can own. The ID will |
| not be able to create or own any more inodes if +d_icount+ reaches this value. |
| |
| *d_ino_softlimit*:: |
| The soft limit for the number of inodes the ID can own. The ID can |
| temporarily create or own more inodes than specified by +d_ino_softlimit+ up to |
| +d_ino_hardlimit+. If the inode count is not reduced by the time limit specified |
| by ID zero's +d_itimer+ value, the ID will be denied from creating or owning more |
| inodes until the count goes below +d_ino_softlimit+. |
| |
| *d_bcount*:: |
| How many filesystem blocks are actually owned by the ID. |
| |
| *d_icount*:: |
| How many inodes are actually owned by the ID. |
| |
| *d_itimer*:: |
| Specifies the time when the ID's +d_icount+ exceeded +d_ino_softlimit+. The soft |
| limit will turn into a hard limit after the elapsed time exceeds ID zero's |
| +d_itimer+ value. When +d_icount+ goes back below +d_ino_softlimit+, +d_itimer+ |
| is reset back to zero. |
| |
| If the +XFS_SB_FEAT_INCOMPAT_BIGTIME+ feature is enabled, the 32 bits used by |
| the timestamp field are interpreted as the upper 32 bits of a 34-bit unsigned |
| seconds counter. See the section about xref:Quota_Timers[quota expiration |
| timers] for more details. |
| |
| *d_btimer*:: |
| Specifies the time when the ID's +d_bcount+ exceeded +d_blk_softlimit+. The soft |
| limit will turn into a hard limit after the elapsed time exceeds ID zero's |
| +d_btimer+ value. When +d_bcount+ goes back below +d_blk_softlimit+, +d_btimer+ |
| is reset back to zero. |
| |
| *d_iwarns*:: |
| *d_bwarns*:: |
| *d_rtbwarns*:: |
| Specifies how many times a warning has been issued. Currently not used. |
| |
| *d_rtb_hardlimit*:: |
| The hard limit for the number of real-time blocks the ID can own. The |
| ID cannot own more space on the real-time subvolume beyond this limit. |
| |
| *d_rtb_softlimit*:: |
| The soft limit for the number of real-time blocks the ID can own. The |
| ID can temporarily own more space than specified by +d_rtb_softlimit+ up to |
| +d_rtb_hardlimit+. If +d_rtbcount+ is not reduced by the time limit specified |
| by ID zero's +d_rtbtimer+ value, the ID will be denied from owning more space |
| until the count goes below +d_rtb_softlimit+. |
| |
| *d_rtbcount*:: |
| How many real-time blocks are currently owned by the ID. |
| |
| *d_rtbtimer*:: |
| Specifies the time when the ID's +d_rtbcount+ exceeded +d_rtb_softlimit+. The |
| soft limit will turn into a hard limit after the elapsed time exceeds ID zero's |
| +d_rtbtimer+ value. When +d_rtbcount+ goes back below +d_rtb_softlimit+, |
| +d_rtbtimer+ is reset back to zero. |
| |
| *dd_uuid*:: |
| The UUID of this block, which must match either +sb_uuid+ or +sb_meta_uuid+ |
| depending on which features are set. |
| |
| *dd_lsn*:: |
| Log sequence number of the last DQ block write. |
| |
| *dd_crc*:: |
| Checksum of the DQ block. |
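
The interaction between soft limits, hard limits, and timers described in the field list above can be summarized in a short sketch. It is an illustration under stated assumptions, not the kernel's implementation: `count` is the usage the ID would have after the request, a limit of zero is assumed to mean "no limit", `exceeded_at` follows the timer description given above, and `grace` is the time limit taken from ID zero's timer field. The function and type names are hypothetical. The helper `timer_to_seconds` shows the bigtime interpretation, in which the 32 stored bits are the upper 32 bits of a 34-bit seconds counter.

```c
#include <stdbool.h>
#include <stdint.h>

enum quota_verdict {
    QUOTA_OK,       /* at or below the soft limit */
    QUOTA_GRACE,    /* above the soft limit, time limit not yet expired */
    QUOTA_DENIED    /* hard limit exceeded, or time limit expired */
};

/* Expand an on-disk timer field to seconds. */
static uint64_t timer_to_seconds(uint32_t raw, bool bigtime)
{
    /* With bigtime, raw holds bits 2..33 of the seconds counter. */
    return bigtime ? (uint64_t)raw << 2 : raw;
}

/* Evaluate one resource (blocks, inodes, or real-time blocks). */
static enum quota_verdict check_limit(uint64_t count, uint64_t soft,
                                      uint64_t hard, uint64_t exceeded_at,
                                      uint64_t grace, uint64_t now)
{
    if (hard != 0 && count > hard)
        return QUOTA_DENIED;

    if (soft != 0 && count > soft) {
        /* The soft limit acts as a hard limit once the grace time passes. */
        if (exceeded_at != 0 && now > exceeded_at &&
            now - exceeded_at > grace)
            return QUOTA_DENIED;
        return QUOTA_GRACE;
    }

    return QUOTA_OK;
}
```

The same function serves all three resources because each one has its own count, soft limit, hard limit, and timer field with identical semantics.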
| |
| [[Real-time_Inodes]] |
| == Real-time Inodes |
| |
| Two inodes are allocated to manage the real-time device's space: the |
| xref:Real-Time_Bitmap_Inode[Bitmap Inode] and the |
| xref:Real-Time_Summary_Inode[Summary Inode]. |
| |
| Each realtime group can allocate one inode to manage a |
| xref:Real_time_Reverse_Mapping_Btree[reverse index of space] usage, and |
| a second one to manage xref:Real_time_Refcount_Btree[reference counts] of space |
| usage. |