| ****************************************************** |
| Simulating persistent memory configurations using QEMU |
| ****************************************************** |
| Quick Summary |
| ~~~~~~~~~~~~~ |
| |
| QEMU provides excellent options for simulating persistent memory |
| configurations. For details on the relevant QEMU command line options, |
| see the QEMU NVDIMM documentation: |
| |
| https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt |
| |
| In my setup I use the following options to define my NVDIMMs: |
| |
| :: |
| |
| -object memory-backend-file,id=mem1,share=on,mem-path=/home/rzwisler/vms/nvdimm-2,size=17G,align=128M |
| -device nvdimm,memdev=mem1,id=nv1,label-size=2M |
| |
| Multiple DIMMs can be defined by varying the **id** and **mem-path** |
| object options, and the **memdev** and **id** device options. For example: |
| |
| :: |
| |
| -object memory-backend-file,id=mem1,share=on,mem-path=/home/rzwisler/vms/nvdimm-3.1,size=17G,align=128M |
| -device nvdimm,memdev=mem1,id=nv1,label-size=2M |
| |
| -object memory-backend-file,id=mem2,share=on,mem-path=/home/rzwisler/vms/nvdimm-3.2,size=17G,align=128M |
| -device nvdimm,memdev=mem2,id=nv2,label-size=2M |
| |
| According to the QEMU documentation, the **align** option was introduced |
| in QEMU v2.12.0, so depending on your distribution you may need to build |
| a newer version of QEMU from source. |
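| |
| The NVDIMM options above are only fragments of a full command line. The |
| QEMU NVDIMM documentation also calls for NVDIMM support to be enabled on |
| the machine type and for enough memory slots and **maxmem** headroom to |
| hold the DIMMs. The sketch below shows one way to assemble these pieces. |
| The directory, file names, RAM size, and the qemu-system-x86_64 binary |
| are placeholders rather than part of my setup, **share=on** is the |
| explicit spelling of **share**, and the script only prints the command: |
| |
```shell
# Sketch only: NVDIMM_DIR, the file names, and the RAM size are placeholders.
NVDIMM_DIR="${NVDIMM_DIR:-/path/to/vms}"
RAM_GIB=4
NVDIMM_GIB=17
NUM_DIMMS=2

# maxmem has to cover guest RAM plus every NVDIMM.
MAXMEM_GIB=$(( RAM_GIB + NUM_DIMMS * NVDIMM_GIB ))

# Enable NVDIMM support and reserve one memory slot per DIMM.
QEMU_ARGS="-machine pc,nvdimm=on -m ${RAM_GIB}G,slots=${NUM_DIMMS},maxmem=${MAXMEM_GIB}G"

# Append one backend object and one nvdimm device per DIMM.
i=1
while [ "$i" -le "$NUM_DIMMS" ]; do
    QEMU_ARGS="$QEMU_ARGS -object memory-backend-file,id=mem${i},share=on,mem-path=${NVDIMM_DIR}/nvdimm-${i},size=${NVDIMM_GIB}G,align=128M"
    QEMU_ARGS="$QEMU_ARGS -device nvdimm,memdev=mem${i},id=nv${i},label-size=2M"
    i=$(( i + 1 ))
done

# Print rather than run; disk, CPU, and network options still need to be added.
echo "qemu-system-x86_64 $QEMU_ARGS"
```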
| |
| More Details |
| ~~~~~~~~~~~~ |
| |
| Of the various options used in the QEMU command lines above, the |
| following are worth noting: |
| |
| label-size=2M |
| ^^^^^^^^^^^^^ |
| |
| This is necessary so that the virtual NVDIMMs have label space. Without |
| label space our DIMMs are used in label-less mode, which is a more |
| restricted configuration that prevents us from creating multiple |
| namespaces per region. |
| |
| We choose a 2 MiB label size because the label area is consumed out of |
| the total size of the DIMM itself (17 GiB in my configuration), and |
| keeping it to a multiple of 2 MiB allows the resulting namespaces to |
| still use 2 MiB pages in filesystem DAX and device DAX configurations. |
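| |
| The arithmetic behind this can be checked quickly in the shell. The |
| helper below is purely illustrative: it takes a label size in KiB and |
| prints how much of the remaining capacity does not fit into whole 2 MiB |
| pages. The 128 KiB label is a hypothetical comparison point, not a |
| value used in my configuration: |
| |
```shell
# Print the usable capacity modulo 2 MiB, in KiB, for a label size given in KiB.
usable_remainder_kib() {
    dimm_kib=$(( 17 * 1024 * 1024 ))   # 17 GiB backing file
    page_kib=$(( 2 * 1024 ))           # 2 MiB page
    echo $(( (dimm_kib - $1) % page_kib ))
}

usable_remainder_kib 2048   # 2 MiB label: prints 0
usable_remainder_kib 128    # hypothetical 128 KiB label: prints 1920
```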
| |
| align=128M |
| ^^^^^^^^^^ |
| |
| This option controls the starting alignment of the physical addresses |
| consumed by the NVDIMMs. Because 2 MiB of each 17 GiB NVDIMM is set |
| aside for label space, each NVDIMM exposes (17 GiB - 2 MiB) of usable |
| capacity. By default QEMU places the DIMMs directly next to one another |
| in physical address space, so the boundary between them does not fall |
| on the 128 MiB memory section size imposed by the Linux kernel. |
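| |
| As a rough model of the placement described above (not QEMU's actual |
| allocator), the following shell arithmetic shows where a second DIMM |
| would start with and without the alignment. The base address is an |
| arbitrary 128 MiB aligned value chosen for illustration: |
| |
```shell
# All values in MiB. BASE_MIB is an arbitrary, 128 MiB aligned start address.
SECTION_MIB=128
BASE_MIB=$(( 4 * 1024 ))
DIMM_MIB=$(( 17 * 1024 - 2 ))      # 17 GiB minus the 2 MiB label area

# Without align: the second DIMM starts right where the first one ends.
PACKED_START_MIB=$(( BASE_MIB + DIMM_MIB ))
echo "packed start is $(( PACKED_START_MIB % SECTION_MIB )) MiB past a section boundary"

# With align=128M: round the start up to the next 128 MiB boundary.
ALIGNED_START_MIB=$(( (PACKED_START_MIB + SECTION_MIB - 1) / SECTION_MIB * SECTION_MIB ))
echo "aligned start is $(( ALIGNED_START_MIB % SECTION_MIB )) MiB past a section boundary"
```
| |
| In this model the packed start lands 126 MiB past a section boundary, |
| and the alignment costs only 2 MiB of address space per DIMM. |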
| |
| In short, Linux needs the physical addresses for each NVDIMM to start on |
| a 128 MiB boundary. There are three ways to achieve this: omit the label |
| area so that the total physical size of each NVDIMM stays 1 GiB aligned, |
| make the label area 128 MiB, or use the **align** command line option to |
| specify the alignment explicitly. Giving up the label area means that |
| we can't have more than a single namespace per region, which reduces our |
| configuration flexibility. A 128 MiB label area creates label space we |
| will never need and, in my experience, slows down the VM because the |
| label space needs to be initialized. This is why the configuration |
| above uses **align**. |
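| |
| The three alternatives can be compared with a small, purely |
| illustrative helper. For a 17 GiB DIMM and a label size in MiB, it |
| prints how far the end of the usable area sits past a 128 MiB boundary, |
| followed by the capacity given up to labels: |
| |
```shell
# Print "<misalignment> <label overhead>" in MiB for a 17 GiB DIMM.
label_tradeoff() {
    total_mib=$(( 17 * 1024 ))
    echo "$(( (total_mib - $1) % 128 )) $1"
}

label_tradeoff 0     # no label area: aligned, but label-less mode
label_tradeoff 128   # 128 MiB label area: aligned, 128 MiB of label space
label_tradeoff 2     # 2 MiB label area: misaligned unless align=128M is used
```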