Merge branch 'vj/papr_health' into pending

Add support to libndctl for reporting health for nvdimms that support
the PAPR standard[2]. The standard defines a mechanism (HCALL) through
which a guest kernel can query and fetch health and performance stats of
an nvdimm attached to the hypervisor[3]. Until now 'ndctl' was unable to
report these stats for papr_scm dimms on PPC64 guests due to the absence
of ACPI/NFIT, a limitation which this patch-set tries to address.

The patch-set introduces support for the new PAPR PDSM family defined at
[4] & [5] via a new dimm-op named 'papr_dimm_ops'. Infrastructure to
probe and distinguish papr-scm dimms from other dimm families that may
support ACPI/NFIT is implemented by updating the 'struct ndctl_dimm'
initialization routines to bifurcate based on the nvdimm type. We also
introduce two new dimm-ops members for handling initialization of
dimm-specific data for specific DSM families.

These changes coupled with proposed kernel changes located at Ref[1]
should provide a way for the user to retrieve NVDIMM health status using
ndctl for pseries guests. Below is sample output using the proposed
kernel and ndctl changes:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "flag_smart_event":true,
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

References
==========
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference

[3] "Hypercall Op-codes (hcalls)"
https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_hcalls.rst

[4] "powerpc/papr_scm: Add support for reporting nvdimm health"
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-1-vaibhav@linux.ibm.com/

[5] "ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods"
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-6-vaibhav@linux.ibm.com/
tree: c531f515dbf75e43f8fdf3d11aaf4190c4e7ca75
README.md

ndctl

Utility library for managing the libnvdimm (non-volatile memory device) sub-system in the Linux kernel

Build

./autogen.sh
./configure CFLAGS='-g -O2' --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib64
make
make check
sudo make install

There are a number of packages required for the build steps that may not be installed by default. For information about the required packages, see the “BuildRequires:” lines in ndctl.spec.in.

https://github.com/pmem/ndctl/blob/master/ndctl.spec.in
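
As a rough aid for the step above, the following sketch prints the dependencies named on the “BuildRequires:” lines of the spec file. The function name `list_build_requires` is hypothetical and not part of ndctl; the sketch only assumes that each dependency sits on its own “BuildRequires:” line, and mapping the printed names to packages for a given distribution is left to the reader.

```shell
# list_build_requires [SPEC]
# Hypothetical helper (not shipped with ndctl): print the dependency named on
# each "BuildRequires:" line of an RPM spec file. Defaults to ./ndctl.spec.in.
list_build_requires() {
    spec="${1:-ndctl.spec.in}"
    # Print only lines that start with the tag, with the tag itself removed.
    sed -n 's/^BuildRequires:[[:space:]]*//p' "$spec"
}
```

Run from the top of the ndctl source tree, `list_build_requires` with no argument reads `ndctl.spec.in` in the current directory.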

Documentation

See the latest documentation for the NVDIMM kernel sub-system here:

https://www.kernel.org/doc/html/latest/driver-api/nvdimm/index.html

A getting started guide is also available on the kernel.org nvdimm wiki:

https://nvdimm.wiki.kernel.org/start

Unit Tests

The unit tests run by make check require the nfit_test.ko module to be loaded. To build and install nfit_test.ko:

  1. Obtain the kernel source. For example,
    git clone -b libnvdimm-for-next git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git

  2. Skip to step 3 if the kernel version is >= v4.8. Otherwise, for kernel versions < v4.8, configure the kernel to make some memory available to CMA (contiguous memory allocator). This will be used to emulate DAX.

    CONFIG_DMA_CMA=y
    CONFIG_CMA_SIZE_MBYTES=200
    

    or
    cma=200M on the kernel command line.

  3. Compile the libnvdimm sub-system as a module, make sure “zone device” memory is enabled, and enable the btt, pfn, and dax features of the sub-system:

    CONFIG_X86_PMEM_LEGACY=m
    CONFIG_ZONE_DEVICE=y
    CONFIG_LIBNVDIMM=m
    CONFIG_BLK_DEV_PMEM=m
    CONFIG_ND_BLK=m
    CONFIG_BTT=y
    CONFIG_NVDIMM_PFN=y
    CONFIG_NVDIMM_DAX=y
    CONFIG_DEV_DAX_PMEM=m
    
  4. Build and install the unit test enabled libnvdimm modules in the following order. The unit test modules need to be in place prior to the depmod that runs during the final modules_install.

    make M=tools/testing/nvdimm
    sudo make M=tools/testing/nvdimm modules_install
    sudo make modules_install
    
  5. Now run make check in the ndctl source directory, or ndctl test, if ndctl was built with --enable-test.
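
Before building, it may help to confirm that a kernel configuration file carries the options listed in step 3. The sketch below is one possible check, not part of ndctl: `check_nvdimm_config` is a hypothetical name, and it compares each option against the file as an exact line, so an option set to a different value is reported the same way as a missing one.

```shell
# check_nvdimm_config CONFIG_FILE
# Hypothetical helper (not shipped with ndctl): print every option from step 3
# that is absent from, or set differently in, CONFIG_FILE. Returns non-zero if
# at least one option does not match.
check_nvdimm_config() {
    cfg="$1"
    rc=0
    for want in \
        CONFIG_X86_PMEM_LEGACY=m CONFIG_ZONE_DEVICE=y CONFIG_LIBNVDIMM=m \
        CONFIG_BLK_DEV_PMEM=m CONFIG_ND_BLK=m CONFIG_BTT=y \
        CONFIG_NVDIMM_PFN=y CONFIG_NVDIMM_DAX=y CONFIG_DEV_DAX_PMEM=m
    do
        # -x matches the whole line, -F treats the pattern as a fixed string.
        if ! grep -qxF "$want" "$cfg"; then
            echo "missing or different: $want"
            rc=1
        fi
    done
    return $rc
}
```

For example, `check_nvdimm_config .config` can be run from the top of the kernel source tree once the kernel has been configured.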

Troubleshooting

The unit tests will validate that the environment is set up correctly before they try to run. If the platform is misconfigured, i.e. the unit test modules are not available, or the test versions of the modules are superseded by the “in-tree/production” version of the modules, make check will skip tests and report a message like the following in test/test-suite.log:

SKIP: libndctl
==============
test/init: nfit_test_init: nfit.ko: appears to be production version: /lib/modules/4.8.8-200.fc24.x86_64/kernel/drivers/acpi/nfit/nfit.ko.xz
__ndctl_test_skip: explicit skip test_libndctl:2684
nfit_test unavailable skipping tests
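
A quick way to spot this condition after a run is to look for “SKIP:” lines in the log. The following is a minimal sketch; `report_skipped_tests` is a hypothetical name, and the sketch assumes skipped suites appear on lines beginning with “SKIP:” as in the sample above.

```shell
# report_skipped_tests [LOG]
# Hypothetical helper (not shipped with ndctl): print the name of every test
# recorded as skipped in the log. Defaults to test/test-suite.log.
report_skipped_tests() {
    log="${1:-test/test-suite.log}"
    # Print only lines that start with "SKIP:", with that prefix removed.
    sed -n 's/^SKIP:[[:space:]]*//p' "$log"
}
```

Empty output means no test was recorded as skipped in that log.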

If the unit test modules are indeed available in the modules ‘extra’ directory the default depmod policy can be overridden by adding a file to /etc/depmod.d with the following contents:

override nfit * extra
override device_dax * extra
override dax_pmem * extra
override dax_pmem_core * extra
override dax_pmem_compat * extra
override libnvdimm * extra
override nd_blk * extra
override nd_btt * extra
override nd_e820 * extra
override nd_pmem * extra
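
The override file can also be generated rather than typed by hand. The sketch below writes the lines listed above into a directory given as an argument; the function name `write_depmod_override` and the file name `nfit_test.conf` are assumptions chosen for illustration, not names that ndctl requires.

```shell
# write_depmod_override [DIR]
# Hypothetical helper (not shipped with ndctl): write one "override" line per
# module listed above into DIR/nfit_test.conf. DIR defaults to /etc/depmod.d,
# which normally requires root privileges to write to.
write_depmod_override() {
    dir="${1:-/etc/depmod.d}"
    for mod in nfit device_dax dax_pmem dax_pmem_core dax_pmem_compat \
               libnvdimm nd_blk nd_btt nd_e820 nd_pmem
    do
        # The asterisk is quoted, so the shell does not expand it.
        echo "override $mod * extra"
    done > "$dir/nfit_test.conf"
}
```

After writing the file, running `depmod -a` as root rebuilds the module dependency data so the new policy is picked up.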

The nfit_test module emulates pmem with memory allocated via vmalloc(). One side effect is that this breaks ‘physically contiguous’ assumptions in the driver. Use the ‘--align=4K’ option to ‘ndctl create-namespace’ to avoid these corner case scenarios.