| From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> |
| Subject: mm/shmem: update shmem to use mmap_prepare |
| Date: Wed, 17 Sep 2025 20:11:03 +0100 |
| |
| Patch series "expand mmap_prepare functionality, port more users", v4. |
| |
| Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file |
| callback"), the f_op->mmap hook has been deprecated in favour of |
| f_op->mmap_prepare. |
| |
| This was introduced in order to make it possible for us to eventually |
| eliminate the f_op->mmap hook which is highly problematic as it allows |
| drivers and filesystems raw access to a VMA which is not yet correctly |
| initialised. |
| |
| This hook also introduced complexity for the memory mapping operation, as |
| we must correctly unwind what we do should an error arise. |
| |
| Overall, this interface being so open has caused significant problems for |
| us, including security issues. It is important for us to simply eliminate |
| this as a source of problems. |
| |
| Therefore this series continues what was established by extending the |
| functionality further to permit more drivers and filesystems to use |
| mmap_prepare. |
| |
| We start by updating some existing users who can use the mmap_prepare |
| functionality as-is. |
| |
| We then introduce the concept of an mmap 'action', which a user, on |
| mmap_prepare, can request to be performed upon the VMA: |
| |
| * Nothing - default, we're done |
| * Remap PFN - perform PFN remap with specified parameters |
| * I/O remap PFN - perform I/O PFN remap with specified parameters |
| |
| Setting the action in mmap_prepare allows us to decide dynamically what to |
| do next, so if a driver/filesystem needs to determine whether to e.g. |
| remap or use a mixed map, it can decide at that point and set the action |
| accordingly. |
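
As a rough illustration of the action concept described above, here is a minimal user-space sketch. Every name in it (the `sketch_*` types, the `is_io` flag, the placeholder PFN) is hypothetical and chosen for illustration only; it does not reproduce the kernel's actual structures or function signatures. It only shows the shape of the idea: the hook records which action it wants, and a stand-in for the mm core performs it afterwards.

```c
/* Hypothetical user-space stand-ins; these are NOT the kernel's definitions. */
enum sketch_action_type {
	SKETCH_NOTHING,       /* default, we're done */
	SKETCH_REMAP_PFN,     /* perform PFN remap with specified parameters */
	SKETCH_IO_REMAP_PFN,  /* perform I/O PFN remap with specified parameters */
};

struct sketch_action {
	enum sketch_action_type type;
	unsigned long start_pfn;  /* first PFN to map (assumed parameter) */
	unsigned long size;       /* length of the range in bytes */
};

struct sketch_desc {
	unsigned long start;  /* proposed mapping start address */
	unsigned long end;    /* proposed mapping end address */
	int is_io;            /* hypothetical: backing memory needs an I/O remap */
	struct sketch_action action;
};

/*
 * Hypothetical driver hook: it never touches a VMA. It only describes
 * what should happen once the mapping has been set up.
 */
int sketch_mmap_prepare(struct sketch_desc *desc)
{
	desc->action.type = desc->is_io ? SKETCH_IO_REMAP_PFN : SKETCH_REMAP_PFN;
	desc->action.start_pfn = 0x1000;  /* placeholder value */
	desc->action.size = desc->end - desc->start;
	return 0;
}

/*
 * Stand-in for the mm core: run the hook, then report which action it
 * would carry out on the caller's behalf.
 */
enum sketch_action_type sketch_run(struct sketch_desc *desc)
{
	desc->action.type = SKETCH_NOTHING;  /* default if the hook sets nothing */
	if (sketch_mmap_prepare(desc))
		return SKETCH_NOTHING;
	return desc->action.type;
}
```

Because the hook only fills in a request, the decision between the two remap variants is made at prepare time while the work itself stays under the control of the core.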
| |
| This significantly expands the capabilities of the mmap_prepare hook, while |
| maintaining as much control as possible in the mm logic. |
| |
| We split the remap_pfn_range*() functions, which allow for PFN remap (a |
| typical mapping prepopulation operation), into prepare and complete |
| steps, and add io_remap_pfn_range_prepare() and |
| io_remap_pfn_range_complete() for a similar purpose. |
| |
| From there we update various mm-adjacent logic to use this functionality as |
| a first set of changes. |
| |
| We also add success and error hooks for post-action processing, e.g. to |
| output a debug log on success or to filter error codes. |
| |
| |
| This patch (of 14): |
| |
| shmem_mmap() simply assigns the vm_ops, so it is easily updated - do so. |
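
The pattern behind this conversion can be sketched in user space as follows. This is an assumption-laden mock, not kernel code: all `sketch_*` types and the `sketch_do_mmap()` helper are hypothetical stand-ins. The point it illustrates is that the hook writes its chosen ops into a descriptor, and only a stand-in for the core copies them into the mapping object once the hook has succeeded.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects involved. */
struct sketch_vm_ops { const char *name; };

static const struct sketch_vm_ops sketch_shmem_ops = { "shmem" };
static const struct sketch_vm_ops sketch_shmem_anon_ops = { "shmem-anon" };

struct sketch_file { unsigned int nlink; };  /* link count of the inode */

struct sketch_area_desc {
	const struct sketch_file *file;
	const struct sketch_vm_ops *vm_ops;  /* filled in by the hook */
};

struct sketch_vma { const struct sketch_vm_ops *vm_ops; };

/*
 * Mirrors the logic in the diff: an unlinked file at mmap time is treated
 * as anonymous shared memory. The hook sees the descriptor, never the VMA.
 */
int sketch_shmem_mmap_prepare(struct sketch_area_desc *desc)
{
	if (desc->file->nlink)
		desc->vm_ops = &sketch_shmem_ops;
	else
		desc->vm_ops = &sketch_shmem_anon_ops;
	return 0;
}

/*
 * Stand-in for the core: the VMA is only updated after the hook returns
 * success, so a failing hook leaves nothing to unwind.
 */
int sketch_do_mmap(struct sketch_vma *vma, const struct sketch_file *file)
{
	struct sketch_area_desc desc = { .file = file, .vm_ops = NULL };
	int err = sketch_shmem_mmap_prepare(&desc);

	if (err)
		return err;
	vma->vm_ops = desc.vm_ops;
	return 0;
}
```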
| |
| Link: https://lkml.kernel.org/r/cover.1758135681.git.lorenzo.stoakes@oracle.com |
| Link: https://lkml.kernel.org/r/86029a4f59733826c8419e48f6ad4000932a6d08.1758135681.git.lorenzo.stoakes@oracle.com |
| Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> |
| Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> |
| Reviewed-by: David Hildenbrand <david@redhat.com> |
| Reviewed-by: Jan Kara <jack@suse.cz> |
| Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> |
| Reviewed-by: Pedro Falcato <pfalcato@suse.de> |
| Cc: Alexander Gordeev <agordeev@linux.ibm.com> |
| Cc: Al Viro <viro@zeniv.linux.org.uk> |
| Cc: Andreas Larsson <andreas@gaisler.com> |
| Cc: Andrey Konovalov <andreyknvl@gmail.com> |
| Cc: Arnd Bergmann <arnd@arndb.de> |
| Cc: Baoquan He <bhe@redhat.com> |
| Cc: Chatre, Reinette <reinette.chatre@intel.com> |
| Cc: Christian Borntraeger <borntraeger@linux.ibm.com> |
| Cc: Christian Brauner <brauner@kernel.org> |
| Cc: Dan Williams <dan.j.williams@intel.com> |
| Cc: Dave Jiang <dave.jiang@intel.com> |
| Cc: Dave Martin <dave.martin@arm.com> |
| Cc: Dave Young <dyoung@redhat.com> |
| Cc: David S. Miller <davem@davemloft.net> |
| Cc: Dmitriy Vyukov <dvyukov@google.com> |
| Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| Cc: Guo Ren <guoren@kernel.org> |
| Cc: Heiko Carstens <hca@linux.ibm.com> |
| Cc: Hugh Dickins <hughd@google.com> |
| Cc: James Morse <james.morse@arm.com> |
| Cc: Jann Horn <jannh@google.com> |
| Cc: Jonathan Corbet <corbet@lwn.net> |
| Cc: Kevin Tian <kevin.tian@intel.com> |
| Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> |
| Cc: Liam Howlett <liam.howlett@oracle.com> |
| Cc: "Luck, Tony" <tony.luck@intel.com> |
| Cc: Matthew Wilcox (Oracle) <willy@infradead.org> |
| Cc: Michal Hocko <mhocko@suse.com> |
| Cc: Mike Rapoport <rppt@kernel.org> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Cc: Nicolas Pitre <nico@fluxnic.net> |
| Cc: Oscar Salvador <osalvador@suse.de> |
| Cc: Robin Murphy <robin.murphy@arm.com> |
| Cc: Suren Baghdasaryan <surenb@google.com> |
| Cc: Sven Schnelle <svens@linux.ibm.com> |
| Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> |
| Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> |
| Cc: Vasily Gorbik <gor@linux.ibm.com> |
| Cc: Vishal Verma <vishal.l.verma@intel.com> |
| Cc: Vivek Goyal <vgoyal@redhat.com> |
| Cc: Vlastimil Babka <vbabka@suse.cz> |
| Cc: Will Deacon <will@kernel.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| mm/shmem.c | 9 +++++---- |
| 1 file changed, 5 insertions(+), 4 deletions(-) |
| |
| --- a/mm/shmem.c~mm-shmem-update-shmem-to-use-mmap_prepare |
| +++ a/mm/shmem.c |
| @@ -2924,16 +2924,17 @@ out_nomem: |
| return retval; |
| } |
| |
| -static int shmem_mmap(struct file *file, struct vm_area_struct *vma) |
| +static int shmem_mmap_prepare(struct vm_area_desc *desc) |
| { |
| + struct file *file = desc->file; |
| struct inode *inode = file_inode(file); |
| |
| file_accessed(file); |
| /* This is anonymous shared memory if it is unlinked at the time of mmap */ |
| if (inode->i_nlink) |
| - vma->vm_ops = &shmem_vm_ops; |
| + desc->vm_ops = &shmem_vm_ops; |
| else |
| - vma->vm_ops = &shmem_anon_vm_ops; |
| + desc->vm_ops = &shmem_anon_vm_ops; |
| return 0; |
| } |
| |
| @@ -5203,7 +5204,7 @@ static const struct address_space_operat |
| }; |
| |
| static const struct file_operations shmem_file_operations = { |
| - .mmap = shmem_mmap, |
| + .mmap_prepare = shmem_mmap_prepare, |
| .open = shmem_file_open, |
| .get_unmapped_area = shmem_get_unmapped_area, |
| #ifdef CONFIG_TMPFS |
| _ |