Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental

Pull documentation fixes from Mauro Carvalho Chehab:
 "This solves a series of broken links for files under Documentation,
  and improves a script meant to detect such broken links (see
  scripts/documentation-file-ref-check).

  The changes on this series are:

   - can.rst: fix a footnote reference;

   - crypto_engine.rst: Fix two parsing warnings;

   - Fix a lot of broken references to Documentation/*;

   - improve the scripts/documentation-file-ref-check script, in order
     to help detecting/fixing broken references, preventing
     false-positives.

  After this patch series, only 33 broken references to doc files are
  detected by scripts/documentation-file-ref-check"

* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
  fix a series of Documentation/ broken file name references
  Documentation: rstFlatTable.py: fix a broken reference
  ABI: sysfs-devices-system-cpu: remove a broken reference
  devicetree: fix a series of wrong file references
  devicetree: fix name of pinctrl-bindings.txt
  devicetree: fix some bindings file names
  MAINTAINERS: fix location of DT npcm files
  MAINTAINERS: fix location of some display DT bindings
  kernel-parameters.txt: fix pointers to sound parameters
  bindings: nvmem/zii: Fix location of nvmem.txt
  docs: Fix more broken references
  scripts/documentation-file-ref-check: check tools/*/Documentation
  scripts/documentation-file-ref-check: get rid of false-positives
  scripts/documentation-file-ref-check: hint: dash or underline
  scripts/documentation-file-ref-check: add a fix logic for DT
  scripts/documentation-file-ref-check: accept more wildcards at filenames
  scripts/documentation-file-ref-check: fix help message
  media: max2175: fix location of driver's companion documentation
  media: v4l: fix broken video4linux docs locations
  media: dvb: point to the location of the old README.dvb-usb file
  ...
diff --git a/Documentation/riscv/pmu.txt b/Documentation/riscv/pmu.txt
new file mode 100644
index 0000000..b29f03a
--- /dev/null
+++ b/Documentation/riscv/pmu.txt
@@ -0,0 +1,249 @@
+Supporting PMUs on RISC-V platforms
+==========================================
+Alan Kao <alankao@andestech.com>, Mar 2018
+
+Introduction
+------------
+
+As of this writing, perf_event-related features mentioned in The RISC-V ISA
+Privileged Version 1.10 are as follows:
+(please check the manual for more details)
+
+* [m|s]counteren
+* mcycle[h], cycle[h]
+* minstret[h], instret[h]
+* mhpeventx, mhpcounterx[h]
+
+With such function set only, porting perf would require a lot of work, due to
+the lack of the following general architectural performance monitoring features:
+
+* Enabling/Disabling counters
+  Counters are just free-running all the time in our case.
+* Interrupt caused by counter overflow
+  No such feature in the spec.
+* Interrupt indicator
+  It is not possible to have many interrupt ports for all counters, so an
+  interrupt indicator is required for software to tell which counter has
+  just overflowed.
+* Writing to counters
+  There will be an SBI to support this since the kernel cannot modify the
+  counters [1].  Alternatively, some vendor considers to implement
+  hardware-extension for M-S-U model machines to write counters directly.
+
+This document aims to provide developers a quick guide on supporting their
+PMUs in the kernel.  The following sections briefly explain perf' mechanism
+and todos.
+
+You may check previous discussions here [1][2].  Also, it might be helpful
+to check the appendix for related kernel structures.
+
+
+1. Initialization
+-----------------
+
+*riscv_pmu* is a global pointer of type *struct riscv_pmu*, which contains
+various methods according to perf's internal convention and PMU-specific
+parameters.  One should declare such instance to represent the PMU.  By default,
+*riscv_pmu* points to a constant structure *riscv_base_pmu*, which has very
+basic support to a baseline QEMU model.
+
+Then he/she can either assign the instance's pointer to *riscv_pmu* so that
+the minimal and already-implemented logic can be leveraged, or invent his/her
+own *riscv_init_platform_pmu* implementation.
+
+In other words, existing sources of *riscv_base_pmu* merely provide a
+reference implementation.  Developers can flexibly decide how many parts they
+can leverage, and in the most extreme case, they can customize every function
+according to their needs.
+
+
+2. Event Initialization
+-----------------------
+
+When a user launches a perf command to monitor some events, it is first
+interpreted by the userspace perf tool into multiple *perf_event_open*
+system calls, and then each of them calls to the body of *event_init*
+member function that was assigned in the previous step.  In *riscv_base_pmu*'s
+case, it is *riscv_event_init*.
+
+The main purpose of this function is to translate the event provided by user
+into bitmap, so that HW-related control registers or counters can directly be
+manipulated.  The translation is based on the mappings and methods provided in
+*riscv_pmu*.
+
+Note that some features can be done in this stage as well:
+
+(1) interrupt setting, which is stated in the next section;
+(2) privilege level setting (user space only, kernel space only, both);
+(3) destructor setting.  Normally it is sufficient to apply *riscv_destroy_event*;
+(4) tweaks for non-sampling events, which will be utilized by functions such as
+*perf_adjust_period*, usually something like the follows:
+
+if (!is_sampling_event(event)) {
+        hwc->sample_period = x86_pmu.max_period;
+        hwc->last_period = hwc->sample_period;
+        local64_set(&hwc->period_left, hwc->sample_period);
+}
+
+In the case of *riscv_base_pmu*, only (3) is provided for now.
+
+
+3. Interrupt
+------------
+
+3.1. Interrupt Initialization
+
+This often occurs at the beginning of the *event_init* method. In common
+practice, this should be a code segment like
+
+int x86_reserve_hardware(void)
+{
+        int err = 0;
+
+        if (!atomic_inc_not_zero(&pmc_refcount)) {
+                mutex_lock(&pmc_reserve_mutex);
+                if (atomic_read(&pmc_refcount) == 0) {
+                        if (!reserve_pmc_hardware())
+                                err = -EBUSY;
+                        else
+                                reserve_ds_buffers();
+                }
+                if (!err)
+                        atomic_inc(&pmc_refcount);
+                mutex_unlock(&pmc_reserve_mutex);
+        }
+
+        return err;
+}
+
+And the magic is in *reserve_pmc_hardware*, which usually does atomic
+operations to make implemented IRQ accessible from some global function pointer.
+*release_pmc_hardware* serves the opposite purpose, and it is used in event
+destructors mentioned in previous section.
+
+(Note: From the implementations in all the architectures, the *reserve/release*
+pair are always IRQ settings, so the *pmc_hardware* seems somehow misleading.
+It does NOT deal with the binding between an event and a physical counter,
+which will be introduced in the next section.)
+
+3.2. IRQ Structure
+
+Basically, a IRQ runs the following pseudo code:
+
+for each hardware counter that triggered this overflow
+
+    get the event of this counter
+
+    // following two steps are defined as *read()*,
+    // check the section Reading/Writing Counters for details.
+    count the delta value since previous interrupt
+    update the event->count (# event occurs) by adding delta, and
+               event->hw.period_left by subtracting delta
+
+    if the event overflows
+        sample data
+        set the counter appropriately for the next overflow
+
+        if the event overflows again
+            too frequently, throttle this event
+        fi
+    fi
+
+end for
+
+However as of this writing, none of the RISC-V implementations have designed an
+interrupt for perf, so the details are to be completed in the future.
+
+4. Reading/Writing Counters
+---------------------------
+
+They seem symmetric but perf treats them quite differently.  For reading, there
+is a *read* interface in *struct pmu*, but it serves more than just reading.
+According to the context, the *read* function not only reads the content of the
+counter (event->count), but also updates the left period to the next interrupt
+(event->hw.period_left).
+
+But the core of perf does not need direct write to counters.  Writing counters
+is hidden behind the abstraction of 1) *pmu->start*, literally start counting so one
+has to set the counter to a good value for the next interrupt; 2) inside the IRQ
+it should set the counter to the same resonable value.
+
+Reading is not a problem in RISC-V but writing would need some effort, since
+counters are not allowed to be written by S-mode.
+
+
+5. add()/del()/start()/stop()
+-----------------------------
+
+Basic idea: add()/del() adds/deletes events to/from a PMU, and start()/stop()
+starts/stop the counter of some event in the PMU.  All of them take the same
+arguments: *struct perf_event *event* and *int flag*.
+
+Consider perf as a state machine, then you will find that these functions serve
+as the state transition process between those states.
+Three states (event->hw.state) are defined:
+
+* PERF_HES_STOPPED:	the counter is stopped
+* PERF_HES_UPTODATE:	the event->count is up-to-date
+* PERF_HES_ARCH:	arch-dependent usage ... we don't need this for now
+
+A normal flow of these state transitions are as follows:
+
+* A user launches a perf event, resulting in calling to *event_init*.
+* When being context-switched in, *add* is called by the perf core, with a flag
+  PERF_EF_START, which means that the event should be started after it is added.
+  At this stage, a general event is bound to a physical counter, if any.
+  The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, because it is now
+  stopped, and the (software) event count does not need updating.
+** *start* is then called, and the counter is enabled.
+   With flag PERF_EF_RELOAD, it writes an appropriate value to the counter (check
+   previous section for detail).
+   Nothing is written if the flag does not contain PERF_EF_RELOAD.
+   The state now is reset to none, because it is neither stopped nor updated
+   (the counting already started)
+* When being context-switched out, *del* is called.  It then checks out all the
+  events in the PMU and calls *stop* to update their counts.
+** *stop* is called by *del*
+   and the perf core with flag PERF_EF_UPDATE, and it often shares the same
+   subroutine as *read* with the same logic.
+   The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, again.
+
+** Life cycle of these two pairs: *add* and *del* are called repeatedly as
+  tasks switch in-and-out; *start* and *stop* is also called when the perf core
+  needs a quick stop-and-start, for instance, when the interrupt period is being
+  adjusted.
+
+Current implementation is sufficient for now and can be easily extended to
+features in the future.
+
+A. Related Structures
+---------------------
+
+* struct pmu: include/linux/perf_event.h
+* struct riscv_pmu: arch/riscv/include/asm/perf_event.h
+
+  Both structures are designed to be read-only.
+
+  *struct pmu* defines some function pointer interfaces, and most of them take
+*struct perf_event* as a main argument, dealing with perf events according to
+perf's internal state machine (check kernel/events/core.c for details).
+
+  *struct riscv_pmu* defines PMU-specific parameters.  The naming follows the
+convention of all other architectures.
+
+* struct perf_event: include/linux/perf_event.h
+* struct hw_perf_event
+
+  The generic structure that represents perf events, and the hardware-related
+details.
+
+* struct riscv_hw_events: arch/riscv/include/asm/perf_event.h
+
+  The structure that holds the status of events, has two fixed members:
+the number of events and the array of the events.
+
+References
+----------
+
+[1] https://github.com/riscv/riscv-linux/pull/124
+[2] https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/f19TmCNP6yA
diff --git a/MAINTAINERS b/MAINTAINERS
index fd3fc63..9d5eeff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10273,18 +10273,16 @@
 F:	arch/arm/boot/dts/*dra7*
 
 OMAP DISPLAY SUBSYSTEM and FRAMEBUFFER SUPPORT (DSS2)
-M:	Tomi Valkeinen <tomi.valkeinen@ti.com>
 L:	linux-omap@vger.kernel.org
 L:	linux-fbdev@vger.kernel.org
-S:	Maintained
+S:	Orphan
 F:	drivers/video/fbdev/omap2/
 F:	Documentation/arm/OMAP/DSS
 
 OMAP FRAMEBUFFER SUPPORT
-M:	Tomi Valkeinen <tomi.valkeinen@ti.com>
 L:	linux-fbdev@vger.kernel.org
 L:	linux-omap@vger.kernel.org
-S:	Maintained
+S:	Orphan
 F:	drivers/video/fbdev/omap/
 
 OMAP GENERAL PURPOSE MEMORY CONTROLLER SUPPORT
@@ -12179,7 +12177,7 @@
 
 RISC-V ARCHITECTURE
 M:	Palmer Dabbelt <palmer@sifive.com>
-M:	Albert Ou <albert@sifive.com>
+M:	Albert Ou <aou@eecs.berkeley.edu>
 L:	linux-riscv@lists.infradead.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux.git
 S:	Supported
@@ -12939,6 +12937,14 @@
 F:	drivers/media/usb/siano/
 F:	drivers/media/mmc/siano/
 
+SIFIVE DRIVERS
+M:	Palmer Dabbelt <palmer@sifive.com>
+L:	linux-riscv@lists.infradead.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux.git
+S:	Supported
+K:	sifive
+N:	sifive
+
 SILEAD TOUCHSCREEN DRIVER
 M:	Hans de Goede <hdegoede@redhat.com>
 L:	linux-input@vger.kernel.org
@@ -15007,8 +15013,7 @@
 USER-MODE LINUX (UML)
 M:	Jeff Dike <jdike@addtoit.com>
 M:	Richard Weinberger <richard@nod.at>
-L:	user-mode-linux-devel@lists.sourceforge.net
-L:	user-mode-linux-user@lists.sourceforge.net
+L:	linux-um@lists.infradead.org
 W:	http://user-mode-linux.sourceforge.net
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml.git
 S:	Maintained
diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index aa9e785..7841b8a 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -134,7 +134,13 @@
 void pnv_power9_force_smt4_catch(void);
 void pnv_power9_force_smt4_release(void);
 
+/* Transaction memory related */
 void tm_enable(void);
 void tm_disable(void);
 void tm_abort(uint8_t cause);
+
+struct kvm_vcpu;
+void _kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
+void _kvmppc_save_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
+
 #endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index e7377b7..1f345a0 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -104,6 +104,7 @@
 	ulong vtb;		/* virtual timebase */
 	ulong conferring_threads;
 	unsigned int halt_poll_ns;
+	atomic_t online_count;
 };
 
 struct kvmppc_vcpu_book3s {
@@ -209,6 +210,7 @@
 extern void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
 					  unsigned int vec);
 extern void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags);
+extern void kvmppc_trigger_fac_interrupt(struct kvm_vcpu *vcpu, ulong fac);
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
 			   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
@@ -256,6 +258,21 @@
 extern int kvmppc_hcall_impl_hv_realmode(unsigned long cmd);
 extern void kvmppc_copy_to_svcpu(struct kvm_vcpu *vcpu);
 extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu);
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu);
+void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu);
+void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu);
+void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu);
+#else
+static inline void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu) {}
+#endif
+
+void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac);
+
 extern int kvm_irq_bypass;
 
 static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
@@ -274,12 +291,12 @@
 
 static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
 {
-	vcpu->arch.gpr[num] = val;
+	vcpu->arch.regs.gpr[num] = val;
 }
 
 static inline ulong kvmppc_get_gpr(struct kvm_vcpu *vcpu, int num)
 {
-	return vcpu->arch.gpr[num];
+	return vcpu->arch.regs.gpr[num];
 }
 
 static inline void kvmppc_set_cr(struct kvm_vcpu *vcpu, u32 val)
@@ -294,42 +311,42 @@
 
 static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.xer = val;
+	vcpu->arch.regs.xer = val;
 }
 
 static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.xer;
+	return vcpu->arch.regs.xer;
 }
 
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.ctr = val;
+	vcpu->arch.regs.ctr = val;
 }
 
 static inline ulong kvmppc_get_ctr(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.ctr;
+	return vcpu->arch.regs.ctr;
 }
 
 static inline void kvmppc_set_lr(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.lr = val;
+	vcpu->arch.regs.link = val;
 }
 
 static inline ulong kvmppc_get_lr(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.lr;
+	return vcpu->arch.regs.link;
 }
 
 static inline void kvmppc_set_pc(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.pc = val;
+	vcpu->arch.regs.nip = val;
 }
 
 static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.pc;
+	return vcpu->arch.regs.nip;
 }
 
 static inline u64 kvmppc_get_msr(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index c424e44..dc435a5 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -483,15 +483,15 @@
 static inline void copy_from_checkpoint(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.cr  = vcpu->arch.cr_tm;
-	vcpu->arch.xer = vcpu->arch.xer_tm;
-	vcpu->arch.lr  = vcpu->arch.lr_tm;
-	vcpu->arch.ctr = vcpu->arch.ctr_tm;
+	vcpu->arch.regs.xer = vcpu->arch.xer_tm;
+	vcpu->arch.regs.link  = vcpu->arch.lr_tm;
+	vcpu->arch.regs.ctr = vcpu->arch.ctr_tm;
 	vcpu->arch.amr = vcpu->arch.amr_tm;
 	vcpu->arch.ppr = vcpu->arch.ppr_tm;
 	vcpu->arch.dscr = vcpu->arch.dscr_tm;
 	vcpu->arch.tar = vcpu->arch.tar_tm;
-	memcpy(vcpu->arch.gpr, vcpu->arch.gpr_tm,
-	       sizeof(vcpu->arch.gpr));
+	memcpy(vcpu->arch.regs.gpr, vcpu->arch.gpr_tm,
+	       sizeof(vcpu->arch.regs.gpr));
 	vcpu->arch.fp  = vcpu->arch.fp_tm;
 	vcpu->arch.vr  = vcpu->arch.vr_tm;
 	vcpu->arch.vrsave = vcpu->arch.vrsave_tm;
@@ -500,15 +500,15 @@
 static inline void copy_to_checkpoint(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.cr_tm  = vcpu->arch.cr;
-	vcpu->arch.xer_tm = vcpu->arch.xer;
-	vcpu->arch.lr_tm  = vcpu->arch.lr;
-	vcpu->arch.ctr_tm = vcpu->arch.ctr;
+	vcpu->arch.xer_tm = vcpu->arch.regs.xer;
+	vcpu->arch.lr_tm  = vcpu->arch.regs.link;
+	vcpu->arch.ctr_tm = vcpu->arch.regs.ctr;
 	vcpu->arch.amr_tm = vcpu->arch.amr;
 	vcpu->arch.ppr_tm = vcpu->arch.ppr;
 	vcpu->arch.dscr_tm = vcpu->arch.dscr;
 	vcpu->arch.tar_tm = vcpu->arch.tar;
-	memcpy(vcpu->arch.gpr_tm, vcpu->arch.gpr,
-	       sizeof(vcpu->arch.gpr));
+	memcpy(vcpu->arch.gpr_tm, vcpu->arch.regs.gpr,
+	       sizeof(vcpu->arch.regs.gpr));
 	vcpu->arch.fp_tm  = vcpu->arch.fp;
 	vcpu->arch.vr_tm  = vcpu->arch.vr;
 	vcpu->arch.vrsave_tm = vcpu->arch.vrsave;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index bc6e29e..d513e3e 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -36,12 +36,12 @@
 
 static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
 {
-	vcpu->arch.gpr[num] = val;
+	vcpu->arch.regs.gpr[num] = val;
 }
 
 static inline ulong kvmppc_get_gpr(struct kvm_vcpu *vcpu, int num)
 {
-	return vcpu->arch.gpr[num];
+	return vcpu->arch.regs.gpr[num];
 }
 
 static inline void kvmppc_set_cr(struct kvm_vcpu *vcpu, u32 val)
@@ -56,12 +56,12 @@
 
 static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.xer = val;
+	vcpu->arch.regs.xer = val;
 }
 
 static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.xer;
+	return vcpu->arch.regs.xer;
 }
 
 static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
@@ -72,32 +72,32 @@
 
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.ctr = val;
+	vcpu->arch.regs.ctr = val;
 }
 
 static inline ulong kvmppc_get_ctr(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.ctr;
+	return vcpu->arch.regs.ctr;
 }
 
 static inline void kvmppc_set_lr(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.lr = val;
+	vcpu->arch.regs.link = val;
 }
 
 static inline ulong kvmppc_get_lr(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.lr;
+	return vcpu->arch.regs.link;
 }
 
 static inline void kvmppc_set_pc(struct kvm_vcpu *vcpu, ulong val)
 {
-	vcpu->arch.pc = val;
+	vcpu->arch.regs.nip = val;
 }
 
 static inline ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.pc;
+	return vcpu->arch.regs.nip;
 }
 
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 17498e9..fa4efa7 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -269,7 +269,6 @@
 	unsigned long host_lpcr;
 	unsigned long sdr1;
 	unsigned long host_sdr1;
-	int tlbie_lock;
 	unsigned long lpcr;
 	unsigned long vrma_slb_v;
 	int mmu_ready;
@@ -454,6 +453,12 @@
 #define KVMPPC_VSX_COPY_WORD		1
 #define KVMPPC_VSX_COPY_DWORD		2
 #define KVMPPC_VSX_COPY_DWORD_LOAD_DUMP	3
+#define KVMPPC_VSX_COPY_WORD_LOAD_DUMP	4
+
+#define KVMPPC_VMX_COPY_BYTE		8
+#define KVMPPC_VMX_COPY_HWORD		9
+#define KVMPPC_VMX_COPY_WORD		10
+#define KVMPPC_VMX_COPY_DWORD		11
 
 struct openpic;
 
@@ -486,7 +491,7 @@
 	struct kvmppc_book3s_shadow_vcpu *shadow_vcpu;
 #endif
 
-	ulong gpr[32];
+	struct pt_regs regs;
 
 	struct thread_fp_state fp;
 
@@ -521,14 +526,10 @@
 	u32 qpr[32];
 #endif
 
-	ulong pc;
-	ulong ctr;
-	ulong lr;
 #ifdef CONFIG_PPC_BOOK3S
 	ulong tar;
 #endif
 
-	ulong xer;
 	u32 cr;
 
 #ifdef CONFIG_PPC_BOOK3S
@@ -626,7 +627,6 @@
 
 	struct thread_vr_state vr_tm;
 	u32 vrsave_tm; /* also USPRG0 */
-
 #endif
 
 #ifdef CONFIG_KVM_EXIT_TIMING
@@ -681,16 +681,17 @@
 	 * Number of simulations for vsx.
 	 * If we use 2*8bytes to simulate 1*16bytes,
 	 * then the number should be 2 and
-	 * mmio_vsx_copy_type=KVMPPC_VSX_COPY_DWORD.
+	 * mmio_copy_type=KVMPPC_VSX_COPY_DWORD.
 	 * If we use 4*4bytes to simulate 1*16bytes,
 	 * the number should be 4 and
 	 * mmio_vsx_copy_type=KVMPPC_VSX_COPY_WORD.
 	 */
 	u8 mmio_vsx_copy_nums;
 	u8 mmio_vsx_offset;
-	u8 mmio_vsx_copy_type;
 	u8 mmio_vsx_tx_sx_enabled;
 	u8 mmio_vmx_copy_nums;
+	u8 mmio_vmx_offset;
+	u8 mmio_copy_type;
 	u8 osi_needed;
 	u8 osi_enabled;
 	u8 papr_enabled;
@@ -772,6 +773,8 @@
 	u64 busy_preempt;
 
 	u32 emul_inst;
+
+	u32 online;
 #endif
 
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index abe7032..e991821 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -52,7 +52,7 @@
 	EMULATE_EXIT_USER,    /* emulation requires exit to user-space */
 };
 
-enum instruction_type {
+enum instruction_fetch_type {
 	INST_GENERIC,
 	INST_SC,		/* system call */
 };
@@ -81,10 +81,10 @@
 extern int kvmppc_handle_vsx_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
 				unsigned int rt, unsigned int bytes,
 			int is_default_endian, int mmio_sign_extend);
-extern int kvmppc_handle_load128_by2x64(struct kvm_run *run,
-		struct kvm_vcpu *vcpu, unsigned int rt, int is_default_endian);
-extern int kvmppc_handle_store128_by2x64(struct kvm_run *run,
-		struct kvm_vcpu *vcpu, unsigned int rs, int is_default_endian);
+extern int kvmppc_handle_vmx_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
+		unsigned int rt, unsigned int bytes, int is_default_endian);
+extern int kvmppc_handle_vmx_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
+		unsigned int rs, unsigned int bytes, int is_default_endian);
 extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			       u64 val, unsigned int bytes,
 			       int is_default_endian);
@@ -93,7 +93,7 @@
 				int is_default_endian);
 
 extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
-				 enum instruction_type type, u32 *inst);
+				 enum instruction_fetch_type type, u32 *inst);
 
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
 		     bool data);
@@ -265,6 +265,8 @@
 	vector128 vval;
 	u64	vsxval[2];
 	u32	vsx32val[4];
+	u16	vsx16val[8];
+	u8	vsx8val[16];
 	struct {
 		u64	addr;
 		u64	length;
@@ -324,13 +326,14 @@
 	int (*get_rmmu_info)(struct kvm *kvm, struct kvm_ppc_rmmu_info *info);
 	int (*set_smt_mode)(struct kvm *kvm, unsigned long mode,
 			    unsigned long flags);
+	void (*giveup_ext)(struct kvm_vcpu *vcpu, ulong msr);
 };
 
 extern struct kvmppc_ops *kvmppc_hv_ops;
 extern struct kvmppc_ops *kvmppc_pr_ops;
 
 static inline int kvmppc_get_last_inst(struct kvm_vcpu *vcpu,
-					enum instruction_type type, u32 *inst)
+				enum instruction_fetch_type type, u32 *inst)
 {
 	int ret = EMULATE_DONE;
 	u32 fetched_inst;
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 75c5b2c..5625684 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -385,6 +385,7 @@
 #define SPRN_PSSCR	0x357	/* Processor Stop Status and Control Register (ISA 3.0) */
 #define SPRN_PSSCR_PR	0x337	/* PSSCR ISA 3.0, privileged mode access */
 #define SPRN_PMCR	0x374	/* Power Management Control Register */
+#define SPRN_RWMR	0x375	/* Region-Weighting Mode Register */
 
 /* HFSCR and FSCR bit numbers are the same */
 #define FSCR_SCV_LG	12	/* Enable System Call Vectored */
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 833ed9a..1b32b56 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -633,6 +633,7 @@
 #define KVM_REG_PPC_PSSCR	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbd)
 
 #define KVM_REG_PPC_DEC_EXPIRY	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbe)
+#define KVM_REG_PPC_ONLINE	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbf)
 
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9fc9e09..0a05443 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -426,20 +426,20 @@
 	OFFSET(VCPU_HOST_STACK, kvm_vcpu, arch.host_stack);
 	OFFSET(VCPU_HOST_PID, kvm_vcpu, arch.host_pid);
 	OFFSET(VCPU_GUEST_PID, kvm_vcpu, arch.pid);
-	OFFSET(VCPU_GPRS, kvm_vcpu, arch.gpr);
+	OFFSET(VCPU_GPRS, kvm_vcpu, arch.regs.gpr);
 	OFFSET(VCPU_VRSAVE, kvm_vcpu, arch.vrsave);
 	OFFSET(VCPU_FPRS, kvm_vcpu, arch.fp.fpr);
 #ifdef CONFIG_ALTIVEC
 	OFFSET(VCPU_VRS, kvm_vcpu, arch.vr.vr);
 #endif
-	OFFSET(VCPU_XER, kvm_vcpu, arch.xer);
-	OFFSET(VCPU_CTR, kvm_vcpu, arch.ctr);
-	OFFSET(VCPU_LR, kvm_vcpu, arch.lr);
+	OFFSET(VCPU_XER, kvm_vcpu, arch.regs.xer);
+	OFFSET(VCPU_CTR, kvm_vcpu, arch.regs.ctr);
+	OFFSET(VCPU_LR, kvm_vcpu, arch.regs.link);
 #ifdef CONFIG_PPC_BOOK3S
 	OFFSET(VCPU_TAR, kvm_vcpu, arch.tar);
 #endif
 	OFFSET(VCPU_CR, kvm_vcpu, arch.cr);
-	OFFSET(VCPU_PC, kvm_vcpu, arch.pc);
+	OFFSET(VCPU_PC, kvm_vcpu, arch.regs.nip);
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 	OFFSET(VCPU_MSR, kvm_vcpu, arch.shregs.msr);
 	OFFSET(VCPU_SRR0, kvm_vcpu, arch.shregs.srr0);
@@ -696,10 +696,10 @@
 
 #else /* CONFIG_PPC_BOOK3S */
 	OFFSET(VCPU_CR, kvm_vcpu, arch.cr);
-	OFFSET(VCPU_XER, kvm_vcpu, arch.xer);
-	OFFSET(VCPU_LR, kvm_vcpu, arch.lr);
-	OFFSET(VCPU_CTR, kvm_vcpu, arch.ctr);
-	OFFSET(VCPU_PC, kvm_vcpu, arch.pc);
+	OFFSET(VCPU_XER, kvm_vcpu, arch.regs.xer);
+	OFFSET(VCPU_LR, kvm_vcpu, arch.regs.link);
+	OFFSET(VCPU_CTR, kvm_vcpu, arch.regs.ctr);
+	OFFSET(VCPU_PC, kvm_vcpu, arch.regs.nip);
 	OFFSET(VCPU_SPRG9, kvm_vcpu, arch.sprg9);
 	OFFSET(VCPU_LAST_INST, kvm_vcpu, arch.last_inst);
 	OFFSET(VCPU_FAULT_DEAR, kvm_vcpu, arch.fault_dear);
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 4b19da8..f872c04 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -63,6 +63,9 @@
 	book3s_64_mmu.o \
 	book3s_32_mmu.o
 
+kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
+	tm.o
+
 ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
 	book3s_rmhandlers.o
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 97d4a11..edaf4720 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -134,7 +134,7 @@
 {
 	kvmppc_unfixup_split_real(vcpu);
 	kvmppc_set_srr0(vcpu, kvmppc_get_pc(vcpu));
-	kvmppc_set_srr1(vcpu, kvmppc_get_msr(vcpu) | flags);
+	kvmppc_set_srr1(vcpu, (kvmppc_get_msr(vcpu) & ~0x783f0000ul) | flags);
 	kvmppc_set_pc(vcpu, kvmppc_interrupt_offset(vcpu) + vec);
 	vcpu->arch.mmu.reset_msr(vcpu);
 }
@@ -256,18 +256,15 @@
 {
 	kvmppc_set_dar(vcpu, dar);
 	kvmppc_set_dsisr(vcpu, flags);
-	kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE);
+	kvmppc_inject_interrupt(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE, 0);
 }
-EXPORT_SYMBOL_GPL(kvmppc_core_queue_data_storage);	/* used by kvm_hv */
+EXPORT_SYMBOL_GPL(kvmppc_core_queue_data_storage);
 
 void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong flags)
 {
-	u64 msr = kvmppc_get_msr(vcpu);
-	msr &= ~(SRR1_ISI_NOPT | SRR1_ISI_N_OR_G | SRR1_ISI_PROT);
-	msr |= flags & (SRR1_ISI_NOPT | SRR1_ISI_N_OR_G | SRR1_ISI_PROT);
-	kvmppc_set_msr_fast(vcpu, msr);
-	kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
+	kvmppc_inject_interrupt(vcpu, BOOK3S_INTERRUPT_INST_STORAGE, flags);
 }
+EXPORT_SYMBOL_GPL(kvmppc_core_queue_inst_storage);
 
 static int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu,
 					 unsigned int priority)
@@ -450,8 +447,8 @@
 	return r;
 }
 
-int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
-					 u32 *inst)
+int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
+		enum instruction_fetch_type type, u32 *inst)
 {
 	ulong pc = kvmppc_get_pc(vcpu);
 	int r;
@@ -509,8 +506,6 @@
 {
 	int i;
 
-	vcpu_load(vcpu);
-
 	regs->pc = kvmppc_get_pc(vcpu);
 	regs->cr = kvmppc_get_cr(vcpu);
 	regs->ctr = kvmppc_get_ctr(vcpu);
@@ -532,7 +527,6 @@
 	for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
 		regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
 
-	vcpu_put(vcpu);
 	return 0;
 }
 
@@ -540,8 +534,6 @@
 {
 	int i;
 
-	vcpu_load(vcpu);
-
 	kvmppc_set_pc(vcpu, regs->pc);
 	kvmppc_set_cr(vcpu, regs->cr);
 	kvmppc_set_ctr(vcpu, regs->ctr);
@@ -562,7 +554,6 @@
 	for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
 		kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
 
-	vcpu_put(vcpu);
 	return 0;
 }
 
diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
index 4ad5e28..14ef035 100644
--- a/arch/powerpc/kvm/book3s.h
+++ b/arch/powerpc/kvm/book3s.h
@@ -31,4 +31,10 @@
 extern int kvmppc_book3s_init_pr(void);
 extern void kvmppc_book3s_exit_pr(void);
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+extern void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val);
+#else
+static inline void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val) {}
+#endif
+
 #endif
diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index 1992676..45c8ea4 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -52,7 +52,7 @@
 static inline bool check_debug_ip(struct kvm_vcpu *vcpu)
 {
 #ifdef DEBUG_MMU_PTE_IP
-	return vcpu->arch.pc == DEBUG_MMU_PTE_IP;
+	return vcpu->arch.regs.nip == DEBUG_MMU_PTE_IP;
 #else
 	return true;
 #endif
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index a93d719..cf9d686 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -38,7 +38,16 @@
 
 static void kvmppc_mmu_book3s_64_reset_msr(struct kvm_vcpu *vcpu)
 {
-	kvmppc_set_msr(vcpu, vcpu->arch.intr_msr);
+	unsigned long msr = vcpu->arch.intr_msr;
+	unsigned long cur_msr = kvmppc_get_msr(vcpu);
+
+	/* If transactional, change to suspend mode on IRQ delivery */
+	if (MSR_TM_TRANSACTIONAL(cur_msr))
+		msr |= MSR_TS_S;
+	else
+		msr |= cur_msr & MSR_TS_MASK;
+
+	kvmppc_set_msr(vcpu, msr);
 }
 
 static struct kvmppc_slb *kvmppc_mmu_book3s_64_find_slbe(
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 1b3fcaf..7f3a8cf 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -272,6 +272,9 @@
 	if (!cpu_has_feature(CPU_FTR_HVMODE))
 		return -EINVAL;
 
+	if (!mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
+		return -EINVAL;
+
 	/* POWER7 has 10-bit LPIDs (12-bit in POWER8) */
 	host_lpid = mfspr(SPRN_LPID);
 	rsvd_lpid = LPID_RSVD;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 481da8f..176f911 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -139,44 +139,24 @@
 	return 0;
 }
 
-#ifdef CONFIG_PPC_64K_PAGES
-#define MMU_BASE_PSIZE	MMU_PAGE_64K
-#else
-#define MMU_BASE_PSIZE	MMU_PAGE_4K
-#endif
-
 static void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
 				    unsigned int pshift)
 {
-	int psize = MMU_BASE_PSIZE;
+	unsigned long psize = PAGE_SIZE;
 
-	if (pshift >= PUD_SHIFT)
-		psize = MMU_PAGE_1G;
-	else if (pshift >= PMD_SHIFT)
-		psize = MMU_PAGE_2M;
-	addr &= ~0xfffUL;
-	addr |= mmu_psize_defs[psize].ap << 5;
-	asm volatile("ptesync": : :"memory");
-	asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1)
-		     : : "r" (addr), "r" (kvm->arch.lpid) : "memory");
-	if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG))
-		asm volatile(PPC_TLBIE_5(%0, %1, 0, 0, 1)
-			     : : "r" (addr), "r" (kvm->arch.lpid) : "memory");
-	asm volatile("eieio ; tlbsync ; ptesync": : :"memory");
+	if (pshift)
+		psize = 1UL << pshift;
+
+	addr &= ~(psize - 1);
+	radix__flush_tlb_lpid_page(kvm->arch.lpid, addr, psize);
 }
 
-static void kvmppc_radix_flush_pwc(struct kvm *kvm, unsigned long addr)
+static void kvmppc_radix_flush_pwc(struct kvm *kvm)
 {
-	unsigned long rb = 0x2 << PPC_BITLSHIFT(53); /* IS = 2 */
-
-	asm volatile("ptesync": : :"memory");
-	/* RIC=1 PRS=0 R=1 IS=2 */
-	asm volatile(PPC_TLBIE_5(%0, %1, 1, 0, 1)
-		     : : "r" (rb), "r" (kvm->arch.lpid) : "memory");
-	asm volatile("eieio ; tlbsync ; ptesync": : :"memory");
+	radix__flush_pwc_lpid(kvm->arch.lpid);
 }
 
-unsigned long kvmppc_radix_update_pte(struct kvm *kvm, pte_t *ptep,
+static unsigned long kvmppc_radix_update_pte(struct kvm *kvm, pte_t *ptep,
 				      unsigned long clr, unsigned long set,
 				      unsigned long addr, unsigned int shift)
 {
@@ -228,6 +208,167 @@
 	kmem_cache_free(kvm_pmd_cache, pmdp);
 }
 
+static void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte,
+			     unsigned long gpa, unsigned int shift)
+
+{
+	unsigned long page_size = 1ul << shift;
+	unsigned long old;
+
+	old = kvmppc_radix_update_pte(kvm, pte, ~0UL, 0, gpa, shift);
+	kvmppc_radix_tlbie_page(kvm, gpa, shift);
+	if (old & _PAGE_DIRTY) {
+		unsigned long gfn = gpa >> PAGE_SHIFT;
+		struct kvm_memory_slot *memslot;
+
+		memslot = gfn_to_memslot(kvm, gfn);
+		if (memslot && memslot->dirty_bitmap)
+			kvmppc_update_dirty_map(memslot, gfn, page_size);
+	}
+}
+
+/*
+ * kvmppc_free_p?d are used to free existing page tables, and recursively
+ * descend and clear and free children.
+ * Callers are responsible for flushing the PWC.
+ *
+ * When page tables are being unmapped/freed as part of page fault path
+ * (full == false), ptes are not expected. There is code to unmap them
+ * and emit a warning if encountered, but there may already be data
+ * corruption due to the unexpected mappings.
+ */
+static void kvmppc_unmap_free_pte(struct kvm *kvm, pte_t *pte, bool full)
+{
+	if (full) {
+		memset(pte, 0, sizeof(long) << PTE_INDEX_SIZE);
+	} else {
+		pte_t *p = pte;
+		unsigned long it;
+
+		for (it = 0; it < PTRS_PER_PTE; ++it, ++p) {
+			if (pte_val(*p) == 0)
+				continue;
+			WARN_ON_ONCE(1);
+			kvmppc_unmap_pte(kvm, p,
+					 pte_pfn(*p) << PAGE_SHIFT,
+					 PAGE_SHIFT);
+		}
+	}
+
+	kvmppc_pte_free(pte);
+}
+
+static void kvmppc_unmap_free_pmd(struct kvm *kvm, pmd_t *pmd, bool full)
+{
+	unsigned long im;
+	pmd_t *p = pmd;
+
+	for (im = 0; im < PTRS_PER_PMD; ++im, ++p) {
+		if (!pmd_present(*p))
+			continue;
+		if (pmd_is_leaf(*p)) {
+			if (full) {
+				pmd_clear(p);
+			} else {
+				WARN_ON_ONCE(1);
+				kvmppc_unmap_pte(kvm, (pte_t *)p,
+					 pte_pfn(*(pte_t *)p) << PAGE_SHIFT,
+					 PMD_SHIFT);
+			}
+		} else {
+			pte_t *pte;
+
+			pte = pte_offset_map(p, 0);
+			kvmppc_unmap_free_pte(kvm, pte, full);
+			pmd_clear(p);
+		}
+	}
+	kvmppc_pmd_free(pmd);
+}
+
+static void kvmppc_unmap_free_pud(struct kvm *kvm, pud_t *pud)
+{
+	unsigned long iu;
+	pud_t *p = pud;
+
+	for (iu = 0; iu < PTRS_PER_PUD; ++iu, ++p) {
+		if (!pud_present(*p))
+			continue;
+		if (pud_huge(*p)) {
+			pud_clear(p);
+		} else {
+			pmd_t *pmd;
+
+			pmd = pmd_offset(p, 0);
+			kvmppc_unmap_free_pmd(kvm, pmd, true);
+			pud_clear(p);
+		}
+	}
+	pud_free(kvm->mm, pud);
+}
+
+void kvmppc_free_radix(struct kvm *kvm)
+{
+	unsigned long ig;
+	pgd_t *pgd;
+
+	if (!kvm->arch.pgtable)
+		return;
+	pgd = kvm->arch.pgtable;
+	for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
+		pud_t *pud;
+
+		if (!pgd_present(*pgd))
+			continue;
+		pud = pud_offset(pgd, 0);
+		kvmppc_unmap_free_pud(kvm, pud);
+		pgd_clear(pgd);
+	}
+	pgd_free(kvm->mm, kvm->arch.pgtable);
+	kvm->arch.pgtable = NULL;
+}
+
+static void kvmppc_unmap_free_pmd_entry_table(struct kvm *kvm, pmd_t *pmd,
+					      unsigned long gpa)
+{
+	pte_t *pte = pte_offset_kernel(pmd, 0);
+
+	/*
+	 * Clearing the pmd entry then flushing the PWC ensures that the pte
+	 * page no longer be cached by the MMU, so can be freed without
+	 * flushing the PWC again.
+	 */
+	pmd_clear(pmd);
+	kvmppc_radix_flush_pwc(kvm);
+
+	kvmppc_unmap_free_pte(kvm, pte, false);
+}
+
+static void kvmppc_unmap_free_pud_entry_table(struct kvm *kvm, pud_t *pud,
+					unsigned long gpa)
+{
+	pmd_t *pmd = pmd_offset(pud, 0);
+
+	/*
+	 * Clearing the pud entry then flushing the PWC ensures that the pmd
+	 * page and any children pte pages will no longer be cached by the MMU,
+	 * so can be freed without flushing the PWC again.
+	 */
+	pud_clear(pud);
+	kvmppc_radix_flush_pwc(kvm);
+
+	kvmppc_unmap_free_pmd(kvm, pmd, false);
+}
+
+/*
+ * There are a number of bits which may differ between different faults to
+ * the same partition scope entry. RC bits, in the course of cleaning and
+ * aging. And the write bit can change, either the access could have been
+ * upgraded, or a read fault could happen concurrently with a write fault
+ * that sets those bits first.
+ */
+#define PTE_BITS_MUST_MATCH (~(_PAGE_WRITE | _PAGE_DIRTY | _PAGE_ACCESSED))
+
 static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
 			     unsigned int level, unsigned long mmu_seq)
 {
@@ -235,7 +376,6 @@
 	pud_t *pud, *new_pud = NULL;
 	pmd_t *pmd, *new_pmd = NULL;
 	pte_t *ptep, *new_ptep = NULL;
-	unsigned long old;
 	int ret;
 
 	/* Traverse the guest's 2nd-level tree, allocate new levels needed */
@@ -273,42 +413,39 @@
 	if (pud_huge(*pud)) {
 		unsigned long hgpa = gpa & PUD_MASK;
 
+		/* Check if we raced and someone else has set the same thing */
+		if (level == 2) {
+			if (pud_raw(*pud) == pte_raw(pte)) {
+				ret = 0;
+				goto out_unlock;
+			}
+			/* Valid 1GB page here already, add our extra bits */
+			WARN_ON_ONCE((pud_val(*pud) ^ pte_val(pte)) &
+							PTE_BITS_MUST_MATCH);
+			kvmppc_radix_update_pte(kvm, (pte_t *)pud,
+					      0, pte_val(pte), hgpa, PUD_SHIFT);
+			ret = 0;
+			goto out_unlock;
+		}
 		/*
 		 * If we raced with another CPU which has just put
 		 * a 1GB pte in after we saw a pmd page, try again.
 		 */
-		if (level <= 1 && !new_pmd) {
+		if (!new_pmd) {
 			ret = -EAGAIN;
 			goto out_unlock;
 		}
-		/* Check if we raced and someone else has set the same thing */
-		if (level == 2 && pud_raw(*pud) == pte_raw(pte)) {
-			ret = 0;
-			goto out_unlock;
-		}
 		/* Valid 1GB page here already, remove it */
-		old = kvmppc_radix_update_pte(kvm, (pte_t *)pud,
-					      ~0UL, 0, hgpa, PUD_SHIFT);
-		kvmppc_radix_tlbie_page(kvm, hgpa, PUD_SHIFT);
-		if (old & _PAGE_DIRTY) {
-			unsigned long gfn = hgpa >> PAGE_SHIFT;
-			struct kvm_memory_slot *memslot;
-			memslot = gfn_to_memslot(kvm, gfn);
-			if (memslot && memslot->dirty_bitmap)
-				kvmppc_update_dirty_map(memslot,
-							gfn, PUD_SIZE);
-		}
+		kvmppc_unmap_pte(kvm, (pte_t *)pud, hgpa, PUD_SHIFT);
 	}
 	if (level == 2) {
 		if (!pud_none(*pud)) {
 			/*
 			 * There's a page table page here, but we wanted to
 			 * install a large page, so remove and free the page
-			 * table page.  new_pmd will be NULL since level == 2.
+			 * table page.
 			 */
-			new_pmd = pmd_offset(pud, 0);
-			pud_clear(pud);
-			kvmppc_radix_flush_pwc(kvm, gpa);
+			kvmppc_unmap_free_pud_entry_table(kvm, pud, gpa);
 		}
 		kvmppc_radix_set_pte_at(kvm, gpa, (pte_t *)pud, pte);
 		ret = 0;
@@ -324,42 +461,40 @@
 	if (pmd_is_leaf(*pmd)) {
 		unsigned long lgpa = gpa & PMD_MASK;
 
+		/* Check if we raced and someone else has set the same thing */
+		if (level == 1) {
+			if (pmd_raw(*pmd) == pte_raw(pte)) {
+				ret = 0;
+				goto out_unlock;
+			}
+			/* Valid 2MB page here already, add our extra bits */
+			WARN_ON_ONCE((pmd_val(*pmd) ^ pte_val(pte)) &
+							PTE_BITS_MUST_MATCH);
+			kvmppc_radix_update_pte(kvm, pmdp_ptep(pmd),
+					      0, pte_val(pte), lgpa, PMD_SHIFT);
+			ret = 0;
+			goto out_unlock;
+		}
+
 		/*
 		 * If we raced with another CPU which has just put
 		 * a 2MB pte in after we saw a pte page, try again.
 		 */
-		if (level == 0 && !new_ptep) {
+		if (!new_ptep) {
 			ret = -EAGAIN;
 			goto out_unlock;
 		}
-		/* Check if we raced and someone else has set the same thing */
-		if (level == 1 && pmd_raw(*pmd) == pte_raw(pte)) {
-			ret = 0;
-			goto out_unlock;
-		}
 		/* Valid 2MB page here already, remove it */
-		old = kvmppc_radix_update_pte(kvm, pmdp_ptep(pmd),
-					      ~0UL, 0, lgpa, PMD_SHIFT);
-		kvmppc_radix_tlbie_page(kvm, lgpa, PMD_SHIFT);
-		if (old & _PAGE_DIRTY) {
-			unsigned long gfn = lgpa >> PAGE_SHIFT;
-			struct kvm_memory_slot *memslot;
-			memslot = gfn_to_memslot(kvm, gfn);
-			if (memslot && memslot->dirty_bitmap)
-				kvmppc_update_dirty_map(memslot,
-							gfn, PMD_SIZE);
-		}
+		kvmppc_unmap_pte(kvm, pmdp_ptep(pmd), lgpa, PMD_SHIFT);
 	}
 	if (level == 1) {
 		if (!pmd_none(*pmd)) {
 			/*
 			 * There's a page table page here, but we wanted to
 			 * install a large page, so remove and free the page
-			 * table page.  new_ptep will be NULL since level == 1.
+			 * table page.
 			 */
-			new_ptep = pte_offset_kernel(pmd, 0);
-			pmd_clear(pmd);
-			kvmppc_radix_flush_pwc(kvm, gpa);
+			kvmppc_unmap_free_pmd_entry_table(kvm, pmd, gpa);
 		}
 		kvmppc_radix_set_pte_at(kvm, gpa, pmdp_ptep(pmd), pte);
 		ret = 0;
@@ -378,12 +513,12 @@
 			ret = 0;
 			goto out_unlock;
 		}
-		/* PTE was previously valid, so invalidate it */
-		old = kvmppc_radix_update_pte(kvm, ptep, _PAGE_PRESENT,
-					      0, gpa, 0);
-		kvmppc_radix_tlbie_page(kvm, gpa, 0);
-		if (old & _PAGE_DIRTY)
-			mark_page_dirty(kvm, gpa >> PAGE_SHIFT);
+		/* Valid page here already, add our extra bits */
+		WARN_ON_ONCE((pte_val(*ptep) ^ pte_val(pte)) &
+							PTE_BITS_MUST_MATCH);
+		kvmppc_radix_update_pte(kvm, ptep, 0, pte_val(pte), gpa, 0);
+		ret = 0;
+		goto out_unlock;
 	}
 	kvmppc_radix_set_pte_at(kvm, gpa, ptep, pte);
 	ret = 0;
@@ -565,9 +700,13 @@
 			unsigned long mask = (1ul << shift) - PAGE_SIZE;
 			pte = __pte(pte_val(pte) | (hva & mask));
 		}
-		if (!(writing || upgrade_write))
-			pte = __pte(pte_val(pte) & ~ _PAGE_WRITE);
-		pte = __pte(pte_val(pte) | _PAGE_EXEC);
+		pte = __pte(pte_val(pte) | _PAGE_EXEC | _PAGE_ACCESSED);
+		if (writing || upgrade_write) {
+			if (pte_val(pte) & _PAGE_WRITE)
+				pte = __pte(pte_val(pte) | _PAGE_DIRTY);
+		} else {
+			pte = __pte(pte_val(pte) & ~(_PAGE_WRITE | _PAGE_DIRTY));
+		}
 	}
 
 	/* Allocate space in the tree and write the PTE */
@@ -734,51 +873,6 @@
 	return 0;
 }
 
-void kvmppc_free_radix(struct kvm *kvm)
-{
-	unsigned long ig, iu, im;
-	pte_t *pte;
-	pmd_t *pmd;
-	pud_t *pud;
-	pgd_t *pgd;
-
-	if (!kvm->arch.pgtable)
-		return;
-	pgd = kvm->arch.pgtable;
-	for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
-		if (!pgd_present(*pgd))
-			continue;
-		pud = pud_offset(pgd, 0);
-		for (iu = 0; iu < PTRS_PER_PUD; ++iu, ++pud) {
-			if (!pud_present(*pud))
-				continue;
-			if (pud_huge(*pud)) {
-				pud_clear(pud);
-				continue;
-			}
-			pmd = pmd_offset(pud, 0);
-			for (im = 0; im < PTRS_PER_PMD; ++im, ++pmd) {
-				if (pmd_is_leaf(*pmd)) {
-					pmd_clear(pmd);
-					continue;
-				}
-				if (!pmd_present(*pmd))
-					continue;
-				pte = pte_offset_map(pmd, 0);
-				memset(pte, 0, sizeof(long) << PTE_INDEX_SIZE);
-				kvmppc_pte_free(pte);
-				pmd_clear(pmd);
-			}
-			kvmppc_pmd_free(pmd_offset(pud, 0));
-			pud_clear(pud);
-		}
-		pud_free(kvm->mm, pud_offset(pgd, 0));
-		pgd_clear(pgd);
-	}
-	pgd_free(kvm->mm, kvm->arch.pgtable);
-	kvm->arch.pgtable = NULL;
-}
-
 static void pte_ctor(void *addr)
 {
 	memset(addr, 0, RADIX_PTE_TABLE_SIZE);
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 4dffa61..d066e37 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -176,14 +176,12 @@
 
 		if (!tbltmp)
 			continue;
-		/*
-		 * Make sure hardware table parameters are exactly the same;
-		 * this is used in the TCE handlers where boundary checks
-		 * use only the first attached table.
-		 */
-		if ((tbltmp->it_page_shift == stt->page_shift) &&
-				(tbltmp->it_offset == stt->offset) &&
-				(tbltmp->it_size == stt->size)) {
+		/* Make sure hardware table parameters are compatible */
+		if ((tbltmp->it_page_shift <= stt->page_shift) &&
+				(tbltmp->it_offset << tbltmp->it_page_shift ==
+				 stt->offset << stt->page_shift) &&
+				(tbltmp->it_size << tbltmp->it_page_shift ==
+				 stt->size << stt->page_shift)) {
 			/*
 			 * Reference the table to avoid races with
 			 * add/remove DMA windows.
@@ -237,7 +235,7 @@
 	kfree(stt);
 }
 
-static int kvm_spapr_tce_fault(struct vm_fault *vmf)
+static vm_fault_t kvm_spapr_tce_fault(struct vm_fault *vmf)
 {
 	struct kvmppc_spapr_tce_table *stt = vmf->vma->vm_file->private_data;
 	struct page *page;
@@ -302,7 +300,8 @@
 	int ret = -ENOMEM;
 	int i;
 
-	if (!args->size)
+	if (!args->size || args->page_shift < 12 || args->page_shift > 34 ||
+		(args->offset + args->size > (ULLONG_MAX >> args->page_shift)))
 		return -EINVAL;
 
 	size = _ALIGN_UP(args->size, PAGE_SIZE >> 3);
@@ -396,7 +395,7 @@
 	return H_SUCCESS;
 }
 
-static long kvmppc_tce_iommu_unmap(struct kvm *kvm,
+static long kvmppc_tce_iommu_do_unmap(struct kvm *kvm,
 		struct iommu_table *tbl, unsigned long entry)
 {
 	enum dma_data_direction dir = DMA_NONE;
@@ -416,7 +415,24 @@
 	return ret;
 }
 
-long kvmppc_tce_iommu_map(struct kvm *kvm, struct iommu_table *tbl,
+static long kvmppc_tce_iommu_unmap(struct kvm *kvm,
+		struct kvmppc_spapr_tce_table *stt, struct iommu_table *tbl,
+		unsigned long entry)
+{
+	unsigned long i, ret = H_SUCCESS;
+	unsigned long subpages = 1ULL << (stt->page_shift - tbl->it_page_shift);
+	unsigned long io_entry = entry * subpages;
+
+	for (i = 0; i < subpages; ++i) {
+		ret = kvmppc_tce_iommu_do_unmap(kvm, tbl, io_entry + i);
+		if (ret != H_SUCCESS)
+			break;
+	}
+
+	return ret;
+}
+
+long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
 		unsigned long entry, unsigned long ua,
 		enum dma_data_direction dir)
 {
@@ -453,6 +469,27 @@
 	return 0;
 }
 
+static long kvmppc_tce_iommu_map(struct kvm *kvm,
+		struct kvmppc_spapr_tce_table *stt, struct iommu_table *tbl,
+		unsigned long entry, unsigned long ua,
+		enum dma_data_direction dir)
+{
+	unsigned long i, pgoff, ret = H_SUCCESS;
+	unsigned long subpages = 1ULL << (stt->page_shift - tbl->it_page_shift);
+	unsigned long io_entry = entry * subpages;
+
+	for (i = 0, pgoff = 0; i < subpages;
+			++i, pgoff += IOMMU_PAGE_SIZE(tbl)) {
+
+		ret = kvmppc_tce_iommu_do_map(kvm, tbl,
+				io_entry + i, ua + pgoff, dir);
+		if (ret != H_SUCCESS)
+			break;
+	}
+
+	return ret;
+}
+
 long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 		      unsigned long ioba, unsigned long tce)
 {
@@ -491,10 +528,10 @@
 
 	list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
 		if (dir == DMA_NONE)
-			ret = kvmppc_tce_iommu_unmap(vcpu->kvm,
+			ret = kvmppc_tce_iommu_unmap(vcpu->kvm, stt,
 					stit->tbl, entry);
 		else
-			ret = kvmppc_tce_iommu_map(vcpu->kvm, stit->tbl,
+			ret = kvmppc_tce_iommu_map(vcpu->kvm, stt, stit->tbl,
 					entry, ua, dir);
 
 		if (ret == H_SUCCESS)
@@ -570,7 +607,7 @@
 			return H_PARAMETER;
 
 		list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
-			ret = kvmppc_tce_iommu_map(vcpu->kvm,
+			ret = kvmppc_tce_iommu_map(vcpu->kvm, stt,
 					stit->tbl, entry + i, ua,
 					iommu_tce_direction(tce));
 
@@ -615,10 +652,10 @@
 		return H_PARAMETER;
 
 	list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
-		unsigned long entry = ioba >> stit->tbl->it_page_shift;
+		unsigned long entry = ioba >> stt->page_shift;
 
 		for (i = 0; i < npages; ++i) {
-			ret = kvmppc_tce_iommu_unmap(vcpu->kvm,
+			ret = kvmppc_tce_iommu_unmap(vcpu->kvm, stt,
 					stit->tbl, entry + i);
 
 			if (ret == H_SUCCESS)
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 6651f73..925fc31 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -221,7 +221,7 @@
 	return H_SUCCESS;
 }
 
-static long kvmppc_rm_tce_iommu_unmap(struct kvm *kvm,
+static long kvmppc_rm_tce_iommu_do_unmap(struct kvm *kvm,
 		struct iommu_table *tbl, unsigned long entry)
 {
 	enum dma_data_direction dir = DMA_NONE;
@@ -245,7 +245,24 @@
 	return ret;
 }
 
-static long kvmppc_rm_tce_iommu_map(struct kvm *kvm, struct iommu_table *tbl,
+static long kvmppc_rm_tce_iommu_unmap(struct kvm *kvm,
+		struct kvmppc_spapr_tce_table *stt, struct iommu_table *tbl,
+		unsigned long entry)
+{
+	unsigned long i, ret = H_SUCCESS;
+	unsigned long subpages = 1ULL << (stt->page_shift - tbl->it_page_shift);
+	unsigned long io_entry = entry * subpages;
+
+	for (i = 0; i < subpages; ++i) {
+		ret = kvmppc_rm_tce_iommu_do_unmap(kvm, tbl, io_entry + i);
+		if (ret != H_SUCCESS)
+			break;
+	}
+
+	return ret;
+}
+
+static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
 		unsigned long entry, unsigned long ua,
 		enum dma_data_direction dir)
 {
@@ -290,6 +307,27 @@
 	return 0;
 }
 
+static long kvmppc_rm_tce_iommu_map(struct kvm *kvm,
+		struct kvmppc_spapr_tce_table *stt, struct iommu_table *tbl,
+		unsigned long entry, unsigned long ua,
+		enum dma_data_direction dir)
+{
+	unsigned long i, pgoff, ret = H_SUCCESS;
+	unsigned long subpages = 1ULL << (stt->page_shift - tbl->it_page_shift);
+	unsigned long io_entry = entry * subpages;
+
+	for (i = 0, pgoff = 0; i < subpages;
+			++i, pgoff += IOMMU_PAGE_SIZE(tbl)) {
+
+		ret = kvmppc_rm_tce_iommu_do_map(kvm, tbl,
+				io_entry + i, ua + pgoff, dir);
+		if (ret != H_SUCCESS)
+			break;
+	}
+
+	return ret;
+}
+
 long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 		unsigned long ioba, unsigned long tce)
 {
@@ -327,10 +365,10 @@
 
 	list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
 		if (dir == DMA_NONE)
-			ret = kvmppc_rm_tce_iommu_unmap(vcpu->kvm,
+			ret = kvmppc_rm_tce_iommu_unmap(vcpu->kvm, stt,
 					stit->tbl, entry);
 		else
-			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm,
+			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm, stt,
 					stit->tbl, entry, ua, dir);
 
 		if (ret == H_SUCCESS)
@@ -477,7 +515,7 @@
 			return H_PARAMETER;
 
 		list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
-			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm,
+			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm, stt,
 					stit->tbl, entry + i, ua,
 					iommu_tce_direction(tce));
 
@@ -526,10 +564,10 @@
 		return H_PARAMETER;
 
 	list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
-		unsigned long entry = ioba >> stit->tbl->it_page_shift;
+		unsigned long entry = ioba >> stt->page_shift;
 
 		for (i = 0; i < npages; ++i) {
-			ret = kvmppc_rm_tce_iommu_unmap(vcpu->kvm,
+			ret = kvmppc_rm_tce_iommu_unmap(vcpu->kvm, stt,
 					stit->tbl, entry + i);
 
 			if (ret == H_SUCCESS)
@@ -571,7 +609,7 @@
 	page = stt->pages[idx / TCES_PER_PAGE];
 	tbl = (u64 *)page_address(page);
 
-	vcpu->arch.gpr[4] = tbl[idx % TCES_PER_PAGE];
+	vcpu->arch.regs.gpr[4] = tbl[idx % TCES_PER_PAGE];
 
 	return H_SUCCESS;
 }
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 68d6898..36b11c5 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -23,7 +23,9 @@
 #include <asm/reg.h>
 #include <asm/switch_to.h>
 #include <asm/time.h>
+#include <asm/tm.h>
 #include "book3s.h"
+#include <asm/asm-prototypes.h>
 
 #define OP_19_XOP_RFID		18
 #define OP_19_XOP_RFI		50
@@ -47,6 +49,12 @@
 #define OP_31_XOP_EIOIO		854
 #define OP_31_XOP_SLBMFEE	915
 
+#define OP_31_XOP_TBEGIN	654
+#define OP_31_XOP_TABORT	910
+
+#define OP_31_XOP_TRECLAIM	942
+#define OP_31_XOP_TRCHKPT	1006
+
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ		1010
 
@@ -87,6 +95,157 @@
 	return true;
 }
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static inline void kvmppc_copyto_vcpu_tm(struct kvm_vcpu *vcpu)
+{
+	memcpy(&vcpu->arch.gpr_tm[0], &vcpu->arch.regs.gpr[0],
+			sizeof(vcpu->arch.gpr_tm));
+	memcpy(&vcpu->arch.fp_tm, &vcpu->arch.fp,
+			sizeof(struct thread_fp_state));
+	memcpy(&vcpu->arch.vr_tm, &vcpu->arch.vr,
+			sizeof(struct thread_vr_state));
+	vcpu->arch.ppr_tm = vcpu->arch.ppr;
+	vcpu->arch.dscr_tm = vcpu->arch.dscr;
+	vcpu->arch.amr_tm = vcpu->arch.amr;
+	vcpu->arch.ctr_tm = vcpu->arch.regs.ctr;
+	vcpu->arch.tar_tm = vcpu->arch.tar;
+	vcpu->arch.lr_tm = vcpu->arch.regs.link;
+	vcpu->arch.cr_tm = vcpu->arch.cr;
+	vcpu->arch.xer_tm = vcpu->arch.regs.xer;
+	vcpu->arch.vrsave_tm = vcpu->arch.vrsave;
+}
+
+static inline void kvmppc_copyfrom_vcpu_tm(struct kvm_vcpu *vcpu)
+{
+	memcpy(&vcpu->arch.regs.gpr[0], &vcpu->arch.gpr_tm[0],
+			sizeof(vcpu->arch.regs.gpr));
+	memcpy(&vcpu->arch.fp, &vcpu->arch.fp_tm,
+			sizeof(struct thread_fp_state));
+	memcpy(&vcpu->arch.vr, &vcpu->arch.vr_tm,
+			sizeof(struct thread_vr_state));
+	vcpu->arch.ppr = vcpu->arch.ppr_tm;
+	vcpu->arch.dscr = vcpu->arch.dscr_tm;
+	vcpu->arch.amr = vcpu->arch.amr_tm;
+	vcpu->arch.regs.ctr = vcpu->arch.ctr_tm;
+	vcpu->arch.tar = vcpu->arch.tar_tm;
+	vcpu->arch.regs.link = vcpu->arch.lr_tm;
+	vcpu->arch.cr = vcpu->arch.cr_tm;
+	vcpu->arch.regs.xer = vcpu->arch.xer_tm;
+	vcpu->arch.vrsave = vcpu->arch.vrsave_tm;
+}
+
+static void kvmppc_emulate_treclaim(struct kvm_vcpu *vcpu, int ra_val)
+{
+	unsigned long guest_msr = kvmppc_get_msr(vcpu);
+	int fc_val = ra_val ? ra_val : 1;
+	uint64_t texasr;
+
+	/* CR0 = 0 | MSR[TS] | 0 */
+	vcpu->arch.cr = (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)) |
+		(((guest_msr & MSR_TS_MASK) >> (MSR_TS_S_LG - 1))
+		 << CR0_SHIFT);
+
+	preempt_disable();
+	tm_enable();
+	texasr = mfspr(SPRN_TEXASR);
+	kvmppc_save_tm_pr(vcpu);
+	kvmppc_copyfrom_vcpu_tm(vcpu);
+
+	/* failure recording depends on Failure Summary bit */
+	if (!(texasr & TEXASR_FS)) {
+		texasr &= ~TEXASR_FC;
+		texasr |= ((u64)fc_val << TEXASR_FC_LG) | TEXASR_FS;
+
+		texasr &= ~(TEXASR_PR | TEXASR_HV);
+		if (kvmppc_get_msr(vcpu) & MSR_PR)
+			texasr |= TEXASR_PR;
+
+		if (kvmppc_get_msr(vcpu) & MSR_HV)
+			texasr |= TEXASR_HV;
+
+		vcpu->arch.texasr = texasr;
+		vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+		mtspr(SPRN_TEXASR, texasr);
+		mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+	}
+	tm_disable();
+	/*
+	 * treclaim need quit to non-transactional state.
+	 */
+	guest_msr &= ~(MSR_TS_MASK);
+	kvmppc_set_msr(vcpu, guest_msr);
+	preempt_enable();
+
+	if (vcpu->arch.shadow_fscr & FSCR_TAR)
+		mtspr(SPRN_TAR, vcpu->arch.tar);
+}
+
+static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
+{
+	unsigned long guest_msr = kvmppc_get_msr(vcpu);
+
+	preempt_disable();
+	/*
+	 * need flush FP/VEC/VSX to vcpu save area before
+	 * copy.
+	 */
+	kvmppc_giveup_ext(vcpu, MSR_VSX);
+	kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+	kvmppc_copyto_vcpu_tm(vcpu);
+	kvmppc_save_tm_sprs(vcpu);
+
+	/*
+	 * as a result of trecheckpoint. set TS to suspended.
+	 */
+	guest_msr &= ~(MSR_TS_MASK);
+	guest_msr |= MSR_TS_S;
+	kvmppc_set_msr(vcpu, guest_msr);
+	kvmppc_restore_tm_pr(vcpu);
+	preempt_enable();
+}
+
+/* emulate tabort. at guest privilege state */
+void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val)
+{
+	/* currently we only emulate tabort. but no emulation of other
+	 * tabort variants since there is no kernel usage of them at
+	 * present.
+	 */
+	unsigned long guest_msr = kvmppc_get_msr(vcpu);
+	uint64_t org_texasr;
+
+	preempt_disable();
+	tm_enable();
+	org_texasr = mfspr(SPRN_TEXASR);
+	tm_abort(ra_val);
+
+	/* CR0 = 0 | MSR[TS] | 0 */
+	vcpu->arch.cr = (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)) |
+		(((guest_msr & MSR_TS_MASK) >> (MSR_TS_S_LG - 1))
+		 << CR0_SHIFT);
+
+	vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+	/* failure recording depends on Failure Summary bit,
+	 * and tabort will be treated as nops in non-transactional
+	 * state.
+	 */
+	if (!(org_texasr & TEXASR_FS) &&
+			MSR_TM_ACTIVE(guest_msr)) {
+		vcpu->arch.texasr &= ~(TEXASR_PR | TEXASR_HV);
+		if (guest_msr & MSR_PR)
+			vcpu->arch.texasr |= TEXASR_PR;
+
+		if (guest_msr & MSR_HV)
+			vcpu->arch.texasr |= TEXASR_HV;
+
+		vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+	}
+	tm_disable();
+	preempt_enable();
+}
+
+#endif
+
 int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			      unsigned int inst, int *advance)
 {
@@ -117,11 +276,28 @@
 	case 19:
 		switch (get_xop(inst)) {
 		case OP_19_XOP_RFID:
-		case OP_19_XOP_RFI:
+		case OP_19_XOP_RFI: {
+			unsigned long srr1 = kvmppc_get_srr1(vcpu);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+			unsigned long cur_msr = kvmppc_get_msr(vcpu);
+
+			/*
+			 * add rules to fit in ISA specification regarding TM
+			 * state transistion in TM disable/Suspended state,
+			 * and target TM state is TM inactive(00) state. (the
+			 * change should be suppressed).
+			 */
+			if (((cur_msr & MSR_TM) == 0) &&
+				((srr1 & MSR_TM) == 0) &&
+				MSR_TM_SUSPENDED(cur_msr) &&
+				!MSR_TM_ACTIVE(srr1))
+				srr1 |= MSR_TS_S;
+#endif
 			kvmppc_set_pc(vcpu, kvmppc_get_srr0(vcpu));
-			kvmppc_set_msr(vcpu, kvmppc_get_srr1(vcpu));
+			kvmppc_set_msr(vcpu, srr1);
 			*advance = 0;
 			break;
+		}
 
 		default:
 			emulated = EMULATE_FAIL;
@@ -304,6 +480,140 @@
 
 			break;
 		}
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+		case OP_31_XOP_TBEGIN:
+		{
+			if (!cpu_has_feature(CPU_FTR_TM))
+				break;
+
+			if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+				kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			if (!(kvmppc_get_msr(vcpu) & MSR_PR)) {
+				preempt_disable();
+				vcpu->arch.cr = (CR0_TBEGIN_FAILURE |
+				  (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)));
+
+				vcpu->arch.texasr = (TEXASR_FS | TEXASR_EXACT |
+					(((u64)(TM_CAUSE_EMULATE | TM_CAUSE_PERSISTENT))
+						 << TEXASR_FC_LG));
+
+				if ((inst >> 21) & 0x1)
+					vcpu->arch.texasr |= TEXASR_ROT;
+
+				if (kvmppc_get_msr(vcpu) & MSR_HV)
+					vcpu->arch.texasr |= TEXASR_HV;
+
+				vcpu->arch.tfhar = kvmppc_get_pc(vcpu) + 4;
+				vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+
+				kvmppc_restore_tm_sprs(vcpu);
+				preempt_enable();
+			} else
+				emulated = EMULATE_FAIL;
+			break;
+		}
+		case OP_31_XOP_TABORT:
+		{
+			ulong guest_msr = kvmppc_get_msr(vcpu);
+			unsigned long ra_val = 0;
+
+			if (!cpu_has_feature(CPU_FTR_TM))
+				break;
+
+			if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+				kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			/* only emulate for privilege guest, since problem state
+			 * guest can run with TM enabled and we don't expect to
+			 * trap at here for that case.
+			 */
+			WARN_ON(guest_msr & MSR_PR);
+
+			if (ra)
+				ra_val = kvmppc_get_gpr(vcpu, ra);
+
+			kvmppc_emulate_tabort(vcpu, ra_val);
+			break;
+		}
+		case OP_31_XOP_TRECLAIM:
+		{
+			ulong guest_msr = kvmppc_get_msr(vcpu);
+			unsigned long ra_val = 0;
+
+			if (!cpu_has_feature(CPU_FTR_TM))
+				break;
+
+			if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+				kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			/* generate interrupts based on priorities */
+			if (guest_msr & MSR_PR) {
+				/* Privileged Instruction type Program Interrupt */
+				kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			if (!MSR_TM_ACTIVE(guest_msr)) {
+				/* TM bad thing interrupt */
+				kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			if (ra)
+				ra_val = kvmppc_get_gpr(vcpu, ra);
+			kvmppc_emulate_treclaim(vcpu, ra_val);
+			break;
+		}
+		case OP_31_XOP_TRCHKPT:
+		{
+			ulong guest_msr = kvmppc_get_msr(vcpu);
+			unsigned long texasr;
+
+			if (!cpu_has_feature(CPU_FTR_TM))
+				break;
+
+			if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+				kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			/* generate interrupt based on priorities */
+			if (guest_msr & MSR_PR) {
+				/* Privileged Instruction type Program Intr */
+				kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			tm_enable();
+			texasr = mfspr(SPRN_TEXASR);
+			tm_disable();
+
+			if (MSR_TM_ACTIVE(guest_msr) ||
+				!(texasr & (TEXASR_FS))) {
+				/* TM bad thing interrupt */
+				kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+				emulated = EMULATE_AGAIN;
+				break;
+			}
+
+			kvmppc_emulate_trchkpt(vcpu);
+			break;
+		}
+#endif
 		default:
 			emulated = EMULATE_FAIL;
 		}
@@ -465,13 +775,38 @@
 		break;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	case SPRN_TFHAR:
-		vcpu->arch.tfhar = spr_val;
-		break;
 	case SPRN_TEXASR:
-		vcpu->arch.texasr = spr_val;
-		break;
 	case SPRN_TFIAR:
-		vcpu->arch.tfiar = spr_val;
+		if (!cpu_has_feature(CPU_FTR_TM))
+			break;
+
+		if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+			kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+			emulated = EMULATE_AGAIN;
+			break;
+		}
+
+		if (MSR_TM_ACTIVE(kvmppc_get_msr(vcpu)) &&
+			!((MSR_TM_SUSPENDED(kvmppc_get_msr(vcpu))) &&
+					(sprn == SPRN_TFHAR))) {
+			/* it is illegal to mtspr() TM regs in
+			 * other than non-transactional state, with
+			 * the exception of TFHAR in suspend state.
+			 */
+			kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+			emulated = EMULATE_AGAIN;
+			break;
+		}
+
+		tm_enable();
+		if (sprn == SPRN_TFHAR)
+			mtspr(SPRN_TFHAR, spr_val);
+		else if (sprn == SPRN_TEXASR)
+			mtspr(SPRN_TEXASR, spr_val);
+		else
+			mtspr(SPRN_TFIAR, spr_val);
+		tm_disable();
+
 		break;
 #endif
 #endif
@@ -618,13 +953,25 @@
 		break;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	case SPRN_TFHAR:
-		*spr_val = vcpu->arch.tfhar;
-		break;
 	case SPRN_TEXASR:
-		*spr_val = vcpu->arch.texasr;
-		break;
 	case SPRN_TFIAR:
-		*spr_val = vcpu->arch.tfiar;
+		if (!cpu_has_feature(CPU_FTR_TM))
+			break;
+
+		if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+			kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+			emulated = EMULATE_AGAIN;
+			break;
+		}
+
+		tm_enable();
+		if (sprn == SPRN_TFHAR)
+			*spr_val = mfspr(SPRN_TFHAR);
+		else if (sprn == SPRN_TEXASR)
+			*spr_val = mfspr(SPRN_TEXASR);
+		else if (sprn == SPRN_TFIAR)
+			*spr_val = mfspr(SPRN_TFIAR);
+		tm_disable();
 		break;
 #endif
 #endif
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8858ab8..de686b3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -123,6 +123,32 @@
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
+/*
+ * RWMR values for POWER8.  These control the rate at which PURR
+ * and SPURR count and should be set according to the number of
+ * online threads in the vcore being run.
+ */
+#define RWMR_RPA_P8_1THREAD	0x164520C62609AECA
+#define RWMR_RPA_P8_2THREAD	0x7FFF2908450D8DA9
+#define RWMR_RPA_P8_3THREAD	0x164520C62609AECA
+#define RWMR_RPA_P8_4THREAD	0x199A421245058DA9
+#define RWMR_RPA_P8_5THREAD	0x164520C62609AECA
+#define RWMR_RPA_P8_6THREAD	0x164520C62609AECA
+#define RWMR_RPA_P8_7THREAD	0x164520C62609AECA
+#define RWMR_RPA_P8_8THREAD	0x164520C62609AECA
+
+static unsigned long p8_rwmr_values[MAX_SMT_THREADS + 1] = {
+	RWMR_RPA_P8_1THREAD,
+	RWMR_RPA_P8_1THREAD,
+	RWMR_RPA_P8_2THREAD,
+	RWMR_RPA_P8_3THREAD,
+	RWMR_RPA_P8_4THREAD,
+	RWMR_RPA_P8_5THREAD,
+	RWMR_RPA_P8_6THREAD,
+	RWMR_RPA_P8_7THREAD,
+	RWMR_RPA_P8_8THREAD,
+};
+
 static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc,
 		int *ip)
 {
@@ -371,13 +397,13 @@
 
 	pr_err("vcpu %p (%d):\n", vcpu, vcpu->vcpu_id);
 	pr_err("pc  = %.16lx  msr = %.16llx  trap = %x\n",
-	       vcpu->arch.pc, vcpu->arch.shregs.msr, vcpu->arch.trap);
+	       vcpu->arch.regs.nip, vcpu->arch.shregs.msr, vcpu->arch.trap);
 	for (r = 0; r < 16; ++r)
 		pr_err("r%2d = %.16lx  r%d = %.16lx\n",
 		       r, kvmppc_get_gpr(vcpu, r),
 		       r+16, kvmppc_get_gpr(vcpu, r+16));
 	pr_err("ctr = %.16lx  lr  = %.16lx\n",
-	       vcpu->arch.ctr, vcpu->arch.lr);
+	       vcpu->arch.regs.ctr, vcpu->arch.regs.link);
 	pr_err("srr0 = %.16llx srr1 = %.16llx\n",
 	       vcpu->arch.shregs.srr0, vcpu->arch.shregs.srr1);
 	pr_err("sprg0 = %.16llx sprg1 = %.16llx\n",
@@ -385,7 +411,7 @@
 	pr_err("sprg2 = %.16llx sprg3 = %.16llx\n",
 	       vcpu->arch.shregs.sprg2, vcpu->arch.shregs.sprg3);
 	pr_err("cr = %.8x  xer = %.16lx  dsisr = %.8x\n",
-	       vcpu->arch.cr, vcpu->arch.xer, vcpu->arch.shregs.dsisr);
+	       vcpu->arch.cr, vcpu->arch.regs.xer, vcpu->arch.shregs.dsisr);
 	pr_err("dar = %.16llx\n", vcpu->arch.shregs.dar);
 	pr_err("fault dar = %.16lx dsisr = %.8x\n",
 	       vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
@@ -1526,6 +1552,9 @@
 		*val = get_reg_val(id, vcpu->arch.dec_expires +
 				   vcpu->arch.vcore->tb_offset);
 		break;
+	case KVM_REG_PPC_ONLINE:
+		*val = get_reg_val(id, vcpu->arch.online);
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -1757,6 +1786,14 @@
 		vcpu->arch.dec_expires = set_reg_val(id, *val) -
 			vcpu->arch.vcore->tb_offset;
 		break;
+	case KVM_REG_PPC_ONLINE:
+		i = set_reg_val(id, *val);
+		if (i && !vcpu->arch.online)
+			atomic_inc(&vcpu->arch.vcore->online_count);
+		else if (!i && vcpu->arch.online)
+			atomic_dec(&vcpu->arch.vcore->online_count);
+		vcpu->arch.online = i;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -2850,6 +2887,25 @@
 		}
 	}
 
+	/*
+	 * On POWER8, set RWMR register.
+	 * Since it only affects PURR and SPURR, it doesn't affect
+	 * the host, so we don't save/restore the host value.
+	 */
+	if (is_power8) {
+		unsigned long rwmr_val = RWMR_RPA_P8_8THREAD;
+		int n_online = atomic_read(&vc->online_count);
+
+		/*
+		 * Use the 8-thread value if we're doing split-core
+		 * or if the vcore's online count looks bogus.
+		 */
+		if (split == 1 && threads_per_subcore == MAX_SMT_THREADS &&
+		    n_online >= 1 && n_online <= MAX_SMT_THREADS)
+			rwmr_val = p8_rwmr_values[n_online];
+		mtspr(SPRN_RWMR, rwmr_val);
+	}
+
 	/* Start all the threads */
 	active = 0;
 	for (sub = 0; sub < core_info.n_subcores; ++sub) {
@@ -2902,6 +2958,32 @@
 	for (sub = 0; sub < core_info.n_subcores; ++sub)
 		spin_unlock(&core_info.vc[sub]->lock);
 
+	if (kvm_is_radix(vc->kvm)) {
+		int tmp = pcpu;
+
+		/*
+		 * Do we need to flush the process scoped TLB for the LPAR?
+		 *
+		 * On POWER9, individual threads can come in here, but the
+		 * TLB is shared between the 4 threads in a core, hence
+		 * invalidating on one thread invalidates for all.
+		 * Thus we make all 4 threads use the same bit here.
+		 *
+		 * Hash must be flushed in realmode in order to use tlbiel.
+		 */
+		mtspr(SPRN_LPID, vc->kvm->arch.lpid);
+		isync();
+
+		if (cpu_has_feature(CPU_FTR_ARCH_300))
+			tmp &= ~0x3UL;
+
+		if (cpumask_test_cpu(tmp, &vc->kvm->arch.need_tlb_flush)) {
+			radix__local_flush_tlb_lpid_guest(vc->kvm->arch.lpid);
+			/* Clear the bit after the TLB flush */
+			cpumask_clear_cpu(tmp, &vc->kvm->arch.need_tlb_flush);
+		}
+	}
+
 	/*
 	 * Interrupts will be enabled once we get into the guest,
 	 * so tell lockdep that we're about to enable interrupts.
@@ -3356,6 +3438,15 @@
 	}
 #endif
 
+	/*
+	 * Force online to 1 for the sake of old userspace which doesn't
+	 * set it.
+	 */
+	if (!vcpu->arch.online) {
+		atomic_inc(&vcpu->arch.vcore->online_count);
+		vcpu->arch.online = 1;
+	}
+
 	kvmppc_core_prepare_to_enter(vcpu);
 
 	/* No need to go into the guest when all we'll do is come back out */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index de18299..d4a3f4d 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -18,6 +18,7 @@
 #include <linux/cma.h>
 #include <linux/bitops.h>
 
+#include <asm/asm-prototypes.h>
 #include <asm/cputable.h>
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
@@ -211,9 +212,9 @@
 
 	/* Only need to do the expensive mfmsr() on radix */
 	if (kvm_is_radix(vcpu->kvm) && (mfmsr() & MSR_IR))
-		r = powernv_get_random_long(&vcpu->arch.gpr[4]);
+		r = powernv_get_random_long(&vcpu->arch.regs.gpr[4]);
 	else
-		r = powernv_get_random_real_mode(&vcpu->arch.gpr[4]);
+		r = powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]);
 	if (r)
 		return H_SUCCESS;
 
@@ -562,7 +563,7 @@
 {
 	if (!kvmppc_xics_enabled(vcpu))
 		return H_TOO_HARD;
-	vcpu->arch.gpr[5] = get_tb();
+	vcpu->arch.regs.gpr[5] = get_tb();
 	if (xive_enabled()) {
 		if (is_rm())
 			return xive_rm_h_xirr(vcpu);
@@ -633,7 +634,19 @@
 
 void kvmppc_bad_interrupt(struct pt_regs *regs)
 {
-	die("Bad interrupt in KVM entry/exit code", regs, SIGABRT);
+	/*
+	 * 100 could happen at any time, 200 can happen due to invalid real
+	 * address access for example (or any time due to a hardware problem).
+	 */
+	if (TRAP(regs) == 0x100) {
+		get_paca()->in_nmi++;
+		system_reset_exception(regs);
+		get_paca()->in_nmi--;
+	} else if (TRAP(regs) == 0x200) {
+		machine_check_exception(regs);
+	} else {
+		die("Bad interrupt in KVM entry/exit code", regs, SIGABRT);
+	}
 	panic("Bad KVM trap");
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S
index 0e84930..82f2ff9 100644
--- a/arch/powerpc/kvm/book3s_hv_interrupts.S
+++ b/arch/powerpc/kvm/book3s_hv_interrupts.S
@@ -137,7 +137,7 @@
 /*
  * We return here in virtual mode after the guest exits
  * with something that we can't handle in real mode.
- * Interrupts are enabled again at this point.
+ * Interrupts are still hard-disabled.
  */
 
 	/*
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 78e6a39..1f22d9e 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -418,7 +418,8 @@
 		    long pte_index, unsigned long pteh, unsigned long ptel)
 {
 	return kvmppc_do_h_enter(vcpu->kvm, flags, pte_index, pteh, ptel,
-				 vcpu->arch.pgdir, true, &vcpu->arch.gpr[4]);
+				 vcpu->arch.pgdir, true,
+				 &vcpu->arch.regs.gpr[4]);
 }
 
 #ifdef __BIG_ENDIAN__
@@ -434,24 +435,6 @@
 		(HPTE_R_KEY_HI | HPTE_R_KEY_LO));
 }
 
-static inline int try_lock_tlbie(unsigned int *lock)
-{
-	unsigned int tmp, old;
-	unsigned int token = LOCK_TOKEN;
-
-	asm volatile("1:lwarx	%1,0,%2\n"
-		     "	cmpwi	cr0,%1,0\n"
-		     "	bne	2f\n"
-		     "  stwcx.	%3,0,%2\n"
-		     "	bne-	1b\n"
-		     "  isync\n"
-		     "2:"
-		     : "=&r" (tmp), "=&r" (old)
-		     : "r" (lock), "r" (token)
-		     : "cc", "memory");
-	return old == 0;
-}
-
 static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
 		      long npages, int global, bool need_sync)
 {
@@ -463,8 +446,6 @@
 	 * the RS field, this is backwards-compatible with P7 and P8.
 	 */
 	if (global) {
-		while (!try_lock_tlbie(&kvm->arch.tlbie_lock))
-			cpu_relax();
 		if (need_sync)
 			asm volatile("ptesync" : : : "memory");
 		for (i = 0; i < npages; ++i) {
@@ -483,7 +464,6 @@
 		}
 
 		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
-		kvm->arch.tlbie_lock = 0;
 	} else {
 		if (need_sync)
 			asm volatile("ptesync" : : : "memory");
@@ -561,13 +541,13 @@
 		     unsigned long pte_index, unsigned long avpn)
 {
 	return kvmppc_do_h_remove(vcpu->kvm, flags, pte_index, avpn,
-				  &vcpu->arch.gpr[4]);
+				  &vcpu->arch.regs.gpr[4]);
 }
 
 long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
-	unsigned long *args = &vcpu->arch.gpr[4];
+	unsigned long *args = &vcpu->arch.regs.gpr[4];
 	__be64 *hp, *hptes[4];
 	unsigned long tlbrb[4];
 	long int i, j, k, n, found, indexes[4];
@@ -787,8 +767,8 @@
 			r = rev[i].guest_rpte | (r & (HPTE_R_R | HPTE_R_C));
 			r &= ~HPTE_GR_RESERVED;
 		}
-		vcpu->arch.gpr[4 + i * 2] = v;
-		vcpu->arch.gpr[5 + i * 2] = r;
+		vcpu->arch.regs.gpr[4 + i * 2] = v;
+		vcpu->arch.regs.gpr[5 + i * 2] = r;
 	}
 	return H_SUCCESS;
 }
@@ -834,7 +814,7 @@
 			}
 		}
 	}
-	vcpu->arch.gpr[4] = gr;
+	vcpu->arch.regs.gpr[4] = gr;
 	ret = H_SUCCESS;
  out:
 	unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
@@ -881,7 +861,7 @@
 			kvmppc_set_dirty_from_hpte(kvm, v, gr);
 		}
 	}
-	vcpu->arch.gpr[4] = gr;
+	vcpu->arch.regs.gpr[4] = gr;
 	ret = H_SUCCESS;
  out:
 	unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 2a86261..758d1d2 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -517,7 +517,7 @@
 	} while (!icp_rm_try_update(icp, old_state, new_state));
 
 	/* Return the result in GPR4 */
-	vcpu->arch.gpr[4] = xirr;
+	vcpu->arch.regs.gpr[4] = xirr;
 
 	return check_too_hard(xics, icp);
 }
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b97d261..153988d 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -39,8 +39,6 @@
 	extsw	reg, reg;			\
 END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 
-#define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
-
 /* Values in HSTATE_NAPPING(r13) */
 #define NAPPING_CEDE	1
 #define NAPPING_NOVCPU	2
@@ -639,6 +637,10 @@
 	/* Primary thread switches to guest partition. */
 	cmpwi	r6,0
 	bne	10f
+
+	/* Radix has already switched LPID and flushed core TLB */
+	bne	cr7, 22f
+
 	lwz	r7,KVM_LPID(r9)
 BEGIN_FTR_SECTION
 	ld	r6,KVM_SDR1(r9)
@@ -650,7 +652,7 @@
 	mtspr	SPRN_LPID,r7
 	isync
 
-	/* See if we need to flush the TLB */
+	/* See if we need to flush the TLB. Hash has to be done in RM */
 	lhz	r6,PACAPACAINDEX(r13)	/* test_bit(cpu, need_tlb_flush) */
 BEGIN_FTR_SECTION
 	/*
@@ -677,15 +679,10 @@
 	li	r7,0x800		/* IS field = 0b10 */
 	ptesync
 	li	r0,0			/* RS for P9 version of tlbiel */
-	bne	cr7, 29f
 28:	tlbiel	r7			/* On P9, rs=0, RIC=0, PRS=0, R=0 */
 	addi	r7,r7,0x1000
 	bdnz	28b
-	b	30f
-29:	PPC_TLBIEL(7,0,2,1,1)		/* for radix, RIC=2, PRS=1, R=1 */
-	addi	r7,r7,0x1000
-	bdnz	29b
-30:	ptesync
+	ptesync
 23:	ldarx	r7,0,r6			/* clear the bit after TLB flushed */
 	andc	r7,r7,r8
 	stdcx.	r7,0,r6
@@ -799,7 +796,10 @@
 	/*
 	 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 	 */
-	bl	kvmppc_restore_tm
+	mr      r3, r4
+	ld      r4, VCPU_MSR(r3)
+	bl	kvmppc_restore_tm_hv
+	ld	r4, HSTATE_KVM_VCPU(r13)
 91:
 #endif
 
@@ -1783,7 +1783,10 @@
 	/*
 	 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 	 */
-	bl	kvmppc_save_tm
+	mr      r3, r9
+	ld      r4, VCPU_MSR(r3)
+	bl	kvmppc_save_tm_hv
+	ld	r9, HSTATE_KVM_VCPU(r13)
 91:
 #endif
 
@@ -2686,8 +2689,9 @@
 	/*
 	 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 	 */
-	ld	r9, HSTATE_KVM_VCPU(r13)
-	bl	kvmppc_save_tm
+	ld	r3, HSTATE_KVM_VCPU(r13)
+	ld      r4, VCPU_MSR(r3)
+	bl	kvmppc_save_tm_hv
 91:
 #endif
 
@@ -2805,7 +2809,10 @@
 	/*
 	 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 	 */
-	bl	kvmppc_restore_tm
+	mr      r3, r4
+	ld      r4, VCPU_MSR(r3)
+	bl	kvmppc_restore_tm_hv
+	ld	r4, HSTATE_KVM_VCPU(r13)
 91:
 #endif
 
@@ -3126,11 +3133,22 @@
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 /*
  * Save transactional state and TM-related registers.
- * Called with r9 pointing to the vcpu struct.
+ * Called with r3 pointing to the vcpu struct and r4 containing
+ * the guest MSR value.
  * This can modify all checkpointed registers, but
- * restores r1, r2 and r9 (vcpu pointer) before exit.
+ * restores r1 and r2 before exit.
  */
-kvmppc_save_tm:
+kvmppc_save_tm_hv:
+	/* See if we need to handle fake suspend mode */
+BEGIN_FTR_SECTION
+	b	__kvmppc_save_tm
+END_FTR_SECTION_IFCLR(CPU_FTR_P9_TM_HV_ASSIST)
+
+	lbz	r0, HSTATE_FAKE_SUSPEND(r13) /* Were we fake suspended? */
+	cmpwi	r0, 0
+	beq	__kvmppc_save_tm
+
+	/* The following code handles the fake_suspend = 1 case */
 	mflr	r0
 	std	r0, PPC_LR_STKOFF(r1)
 	stdu	r1, -PPC_MIN_STKFRM(r1)
@@ -3141,59 +3159,37 @@
 	rldimi	r8, r0, MSR_TM_LG, 63-MSR_TM_LG
 	mtmsrd	r8
 
-	ld	r5, VCPU_MSR(r9)
-	rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
-	beq	1f	/* TM not active in guest. */
-
-	std	r1, HSTATE_HOST_R1(r13)
-	li	r3, TM_CAUSE_KVM_RESCHED
-
-BEGIN_FTR_SECTION
-	lbz	r0, HSTATE_FAKE_SUSPEND(r13) /* Were we fake suspended? */
-	cmpwi	r0, 0
-	beq	3f
 	rldicl. r8, r8, 64 - MSR_TS_S_LG, 62 /* Did we actually hrfid? */
 	beq	4f
-BEGIN_FTR_SECTION_NESTED(96)
+BEGIN_FTR_SECTION
 	bl	pnv_power9_force_smt4_catch
-END_FTR_SECTION_NESTED(CPU_FTR_P9_TM_XER_SO_BUG, CPU_FTR_P9_TM_XER_SO_BUG, 96)
+END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_XER_SO_BUG)
 	nop
-	b	6f
-3:
-	/* Emulation of the treclaim instruction needs TEXASR before treclaim */
-	mfspr	r6, SPRN_TEXASR
-	std	r6, VCPU_ORIG_TEXASR(r9)
-6:
-END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
 
-	/* Clear the MSR RI since r1, r13 are all going to be foobar. */
+	std	r1, HSTATE_HOST_R1(r13)
+
+	/* Clear the MSR RI since r1, r13 may be foobar. */
 	li	r5, 0
 	mtmsrd	r5, 1
 
-	/* All GPRs are volatile at this point. */
+	/* We have to treclaim here because that's the only way to do S->N */
+	li	r3, TM_CAUSE_KVM_RESCHED
 	TRECLAIM(R3)
 
-	/* Temporarily store r13 and r9 so we have some regs to play with */
-	SET_SCRATCH0(r13)
-	GET_PACA(r13)
-	std	r9, PACATMSCRATCH(r13)
-
-	/* If doing TM emulation on POWER9 DD2.2, check for fake suspend mode */
-BEGIN_FTR_SECTION
-	lbz	r9, HSTATE_FAKE_SUSPEND(r13)
-	cmpwi	r9, 0
-	beq	2f
 	/*
 	 * We were in fake suspend, so we are not going to save the
 	 * register state as the guest checkpointed state (since
 	 * we already have it), therefore we can now use any volatile GPR.
 	 */
-	/* Reload stack pointer and TOC. */
+	/* Reload PACA pointer, stack pointer and TOC. */
+	GET_PACA(r13)
 	ld	r1, HSTATE_HOST_R1(r13)
 	ld	r2, PACATOC(r13)
+
 	/* Set MSR RI now we have r1 and r13 back. */
 	li	r5, MSR_RI
 	mtmsrd	r5, 1
+
 	HMT_MEDIUM
 	ld	r6, HSTATE_DSCR(r13)
 	mtspr	SPRN_DSCR, r6
@@ -3208,85 +3204,9 @@
 	li	r0, PSSCR_FAKE_SUSPEND
 	andc	r3, r3, r0
 	mtspr	SPRN_PSSCR, r3
-	ld	r9, HSTATE_KVM_VCPU(r13)
+
 	/* Don't save TEXASR, use value from last exit in real suspend state */
-	b	11f
-2:
-END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
-
 	ld	r9, HSTATE_KVM_VCPU(r13)
-
-	/* Get a few more GPRs free. */
-	std	r29, VCPU_GPRS_TM(29)(r9)
-	std	r30, VCPU_GPRS_TM(30)(r9)
-	std	r31, VCPU_GPRS_TM(31)(r9)
-
-	/* Save away PPR and DSCR soon so don't run with user values. */
-	mfspr	r31, SPRN_PPR
-	HMT_MEDIUM
-	mfspr	r30, SPRN_DSCR
-	ld	r29, HSTATE_DSCR(r13)
-	mtspr	SPRN_DSCR, r29
-
-	/* Save all but r9, r13 & r29-r31 */
-	reg = 0
-	.rept	29
-	.if (reg != 9) && (reg != 13)
-	std	reg, VCPU_GPRS_TM(reg)(r9)
-	.endif
-	reg = reg + 1
-	.endr
-	/* ... now save r13 */
-	GET_SCRATCH0(r4)
-	std	r4, VCPU_GPRS_TM(13)(r9)
-	/* ... and save r9 */
-	ld	r4, PACATMSCRATCH(r13)
-	std	r4, VCPU_GPRS_TM(9)(r9)
-
-	/* Reload stack pointer and TOC. */
-	ld	r1, HSTATE_HOST_R1(r13)
-	ld	r2, PACATOC(r13)
-
-	/* Set MSR RI now we have r1 and r13 back. */
-	li	r5, MSR_RI
-	mtmsrd	r5, 1
-
-	/* Save away checkpinted SPRs. */
-	std	r31, VCPU_PPR_TM(r9)
-	std	r30, VCPU_DSCR_TM(r9)
-	mflr	r5
-	mfcr	r6
-	mfctr	r7
-	mfspr	r8, SPRN_AMR
-	mfspr	r10, SPRN_TAR
-	mfxer	r11
-	std	r5, VCPU_LR_TM(r9)
-	stw	r6, VCPU_CR_TM(r9)
-	std	r7, VCPU_CTR_TM(r9)
-	std	r8, VCPU_AMR_TM(r9)
-	std	r10, VCPU_TAR_TM(r9)
-	std	r11, VCPU_XER_TM(r9)
-
-	/* Restore r12 as trap number. */
-	lwz	r12, VCPU_TRAP(r9)
-
-	/* Save FP/VSX. */
-	addi	r3, r9, VCPU_FPRS_TM
-	bl	store_fp_state
-	addi	r3, r9, VCPU_VRS_TM
-	bl	store_vr_state
-	mfspr	r6, SPRN_VRSAVE
-	stw	r6, VCPU_VRSAVE_TM(r9)
-1:
-	/*
-	 * We need to save these SPRs after the treclaim so that the software
-	 * error code is recorded correctly in the TEXASR.  Also the user may
-	 * change these outside of a transaction, so they must always be
-	 * context switched.
-	 */
-	mfspr	r7, SPRN_TEXASR
-	std	r7, VCPU_TEXASR(r9)
-11:
 	mfspr	r5, SPRN_TFHAR
 	mfspr	r6, SPRN_TFIAR
 	std	r5, VCPU_TFHAR(r9)
@@ -3299,149 +3219,63 @@
 
 /*
  * Restore transactional state and TM-related registers.
- * Called with r4 pointing to the vcpu struct.
+ * Called with r3 pointing to the vcpu struct
+ * and r4 containing the guest MSR value.
  * This potentially modifies all checkpointed registers.
- * It restores r1, r2, r4 from the PACA.
+ * It restores r1 and r2 from the PACA.
  */
-kvmppc_restore_tm:
-	mflr	r0
-	std	r0, PPC_LR_STKOFF(r1)
-
-	/* Turn on TM/FP/VSX/VMX so we can restore them. */
-	mfmsr	r5
-	li	r6, MSR_TM >> 32
-	sldi	r6, r6, 32
-	or	r5, r5, r6
-	ori	r5, r5, MSR_FP
-	oris	r5, r5, (MSR_VEC | MSR_VSX)@h
-	mtmsrd	r5
-
-	/*
-	 * The user may change these outside of a transaction, so they must
-	 * always be context switched.
-	 */
-	ld	r5, VCPU_TFHAR(r4)
-	ld	r6, VCPU_TFIAR(r4)
-	ld	r7, VCPU_TEXASR(r4)
-	mtspr	SPRN_TFHAR, r5
-	mtspr	SPRN_TFIAR, r6
-	mtspr	SPRN_TEXASR, r7
-
-	li	r0, 0
-	stb	r0, HSTATE_FAKE_SUSPEND(r13)
-	ld	r5, VCPU_MSR(r4)
-	rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
-	beqlr		/* TM not active in guest */
-	std	r1, HSTATE_HOST_R1(r13)
-
-	/* Make sure the failure summary is set, otherwise we'll program check
-	 * when we trechkpt.  It's possible that this might have been not set
-	 * on a kvmppc_set_one_reg() call but we shouldn't let this crash the
-	 * host.
-	 */
-	oris	r7, r7, (TEXASR_FS)@h
-	mtspr	SPRN_TEXASR, r7
-
+kvmppc_restore_tm_hv:
 	/*
 	 * If we are doing TM emulation for the guest on a POWER9 DD2,
 	 * then we don't actually do a trechkpt -- we either set up
 	 * fake-suspend mode, or emulate a TM rollback.
 	 */
 BEGIN_FTR_SECTION
-	b	.Ldo_tm_fake_load
-END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
+	b	__kvmppc_restore_tm
+END_FTR_SECTION_IFCLR(CPU_FTR_P9_TM_HV_ASSIST)
+	mflr	r0
+	std	r0, PPC_LR_STKOFF(r1)
+
+	li	r0, 0
+	stb	r0, HSTATE_FAKE_SUSPEND(r13)
+
+	/* Turn on TM so we can restore TM SPRs */
+	mfmsr	r5
+	li	r0, 1
+	rldimi	r5, r0, MSR_TM_LG, 63-MSR_TM_LG
+	mtmsrd	r5
 
 	/*
-	 * We need to load up the checkpointed state for the guest.
-	 * We need to do this early as it will blow away any GPRs, VSRs and
-	 * some SPRs.
+	 * The user may change these outside of a transaction, so they must
+	 * always be context switched.
 	 */
+	ld	r5, VCPU_TFHAR(r3)
+	ld	r6, VCPU_TFIAR(r3)
+	ld	r7, VCPU_TEXASR(r3)
+	mtspr	SPRN_TFHAR, r5
+	mtspr	SPRN_TFIAR, r6
+	mtspr	SPRN_TEXASR, r7
 
-	mr	r31, r4
-	addi	r3, r31, VCPU_FPRS_TM
-	bl	load_fp_state
-	addi	r3, r31, VCPU_VRS_TM
-	bl	load_vr_state
-	mr	r4, r31
-	lwz	r7, VCPU_VRSAVE_TM(r4)
-	mtspr	SPRN_VRSAVE, r7
+	rldicl. r5, r4, 64 - MSR_TS_S_LG, 62
+	beqlr		/* TM not active in guest */
 
-	ld	r5, VCPU_LR_TM(r4)
-	lwz	r6, VCPU_CR_TM(r4)
-	ld	r7, VCPU_CTR_TM(r4)
-	ld	r8, VCPU_AMR_TM(r4)
-	ld	r9, VCPU_TAR_TM(r4)
-	ld	r10, VCPU_XER_TM(r4)
-	mtlr	r5
-	mtcr	r6
-	mtctr	r7
-	mtspr	SPRN_AMR, r8
-	mtspr	SPRN_TAR, r9
-	mtxer	r10
+	/* Make sure the failure summary is set */
+	oris	r7, r7, (TEXASR_FS)@h
+	mtspr	SPRN_TEXASR, r7
 
-	/*
-	 * Load up PPR and DSCR values but don't put them in the actual SPRs
-	 * till the last moment to avoid running with userspace PPR and DSCR for
-	 * too long.
-	 */
-	ld	r29, VCPU_DSCR_TM(r4)
-	ld	r30, VCPU_PPR_TM(r4)
-
-	std	r2, PACATMSCRATCH(r13) /* Save TOC */
-
-	/* Clear the MSR RI since r1, r13 are all going to be foobar. */
-	li	r5, 0
-	mtmsrd	r5, 1
-
-	/* Load GPRs r0-r28 */
-	reg = 0
-	.rept	29
-	ld	reg, VCPU_GPRS_TM(reg)(r31)
-	reg = reg + 1
-	.endr
-
-	mtspr	SPRN_DSCR, r29
-	mtspr	SPRN_PPR, r30
-
-	/* Load final GPRs */
-	ld	29, VCPU_GPRS_TM(29)(r31)
-	ld	30, VCPU_GPRS_TM(30)(r31)
-	ld	31, VCPU_GPRS_TM(31)(r31)
-
-	/* TM checkpointed state is now setup.  All GPRs are now volatile. */
-	TRECHKPT
-
-	/* Now let's get back the state we need. */
-	HMT_MEDIUM
-	GET_PACA(r13)
-	ld	r29, HSTATE_DSCR(r13)
-	mtspr	SPRN_DSCR, r29
-	ld	r4, HSTATE_KVM_VCPU(r13)
-	ld	r1, HSTATE_HOST_R1(r13)
-	ld	r2, PACATMSCRATCH(r13)
-
-	/* Set the MSR RI since we have our registers back. */
-	li	r5, MSR_RI
-	mtmsrd	r5, 1
-9:
-	ld	r0, PPC_LR_STKOFF(r1)
-	mtlr	r0
-	blr
-
-.Ldo_tm_fake_load:
 	cmpwi	r5, 1		/* check for suspended state */
 	bgt	10f
 	stb	r5, HSTATE_FAKE_SUSPEND(r13)
-	b	9b		/* and return */
+	b	9f		/* and return */
 10:	stdu	r1, -PPC_MIN_STKFRM(r1)
 	/* guest is in transactional state, so simulate rollback */
-	mr	r3, r4
 	bl	kvmhv_emulate_tm_rollback
 	nop
-	ld      r4, HSTATE_KVM_VCPU(r13) /* our vcpu pointer has been trashed */
 	addi	r1, r1, PPC_MIN_STKFRM
-	b	9b
-#endif
+9:	ld	r0, PPC_LR_STKOFF(r1)
+	mtlr	r0
+	blr
+#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
 /*
  * We come here if we get any exception or interrupt while we are
@@ -3572,6 +3406,8 @@
 	bcl	20, 31, .+4
 5:	mflr	r3
 	addi	r3, r3, 9f - 5b
+	li	r4, -1
+	rldimi	r3, r4, 62, 0	/* ensure 0xc000000000000000 bits are set */
 	ld	r4, PACAKMSR(r13)
 	mtspr	SPRN_SRR0, r3
 	mtspr	SPRN_SRR1, r4
diff --git a/arch/powerpc/kvm/book3s_hv_tm.c b/arch/powerpc/kvm/book3s_hv_tm.c
index bf710ad..0082850 100644
--- a/arch/powerpc/kvm/book3s_hv_tm.c
+++ b/arch/powerpc/kvm/book3s_hv_tm.c
@@ -19,7 +19,7 @@
 	u64 texasr, tfiar;
 	u64 msr = vcpu->arch.shregs.msr;
 
-	tfiar = vcpu->arch.pc & ~0x3ull;
+	tfiar = vcpu->arch.regs.nip & ~0x3ull;
 	texasr = (failure_cause << 56) | TEXASR_ABORT | TEXASR_FS | TEXASR_EXACT;
 	if (MSR_TM_SUSPENDED(vcpu->arch.shregs.msr))
 		texasr |= TEXASR_SUSP;
@@ -57,8 +57,8 @@
 			       (newmsr & MSR_TM)));
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
-		vcpu->arch.cfar = vcpu->arch.pc - 4;
-		vcpu->arch.pc = vcpu->arch.shregs.srr0;
+		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.regs.nip = vcpu->arch.shregs.srr0;
 		return RESUME_GUEST;
 
 	case PPC_INST_RFEBB:
@@ -90,8 +90,8 @@
 		vcpu->arch.bescr = bescr;
 		msr = (msr & ~MSR_TS_MASK) | MSR_TS_T;
 		vcpu->arch.shregs.msr = msr;
-		vcpu->arch.cfar = vcpu->arch.pc - 4;
-		vcpu->arch.pc = vcpu->arch.ebbrr;
+		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.regs.nip = vcpu->arch.ebbrr;
 		return RESUME_GUEST;
 
 	case PPC_INST_MTMSRD:
diff --git a/arch/powerpc/kvm/book3s_hv_tm_builtin.c b/arch/powerpc/kvm/book3s_hv_tm_builtin.c
index d98ccfd..b2c7c6f 100644
--- a/arch/powerpc/kvm/book3s_hv_tm_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_tm_builtin.c
@@ -35,8 +35,8 @@
 			return 0;
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
-		vcpu->arch.cfar = vcpu->arch.pc - 4;
-		vcpu->arch.pc = vcpu->arch.shregs.srr0;
+		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.regs.nip = vcpu->arch.shregs.srr0;
 		return 1;
 
 	case PPC_INST_RFEBB:
@@ -58,8 +58,8 @@
 		mtspr(SPRN_BESCR, bescr);
 		msr = (msr & ~MSR_TS_MASK) | MSR_TS_T;
 		vcpu->arch.shregs.msr = msr;
-		vcpu->arch.cfar = vcpu->arch.pc - 4;
-		vcpu->arch.pc = mfspr(SPRN_EBBRR);
+		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.regs.nip = mfspr(SPRN_EBBRR);
 		return 1;
 
 	case PPC_INST_MTMSRD:
@@ -103,7 +103,7 @@
 void kvmhv_emulate_tm_rollback(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.shregs.msr &= ~MSR_TS_MASK;	/* go to N state */
-	vcpu->arch.pc = vcpu->arch.tfhar;
+	vcpu->arch.regs.nip = vcpu->arch.tfhar;
 	copy_from_checkpoint(vcpu);
 	vcpu->arch.cr = (vcpu->arch.cr & 0x0fffffff) | 0xa0000000;
 }
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index d3f304d..c3b8006 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -42,6 +42,8 @@
 #include <linux/highmem.h>
 #include <linux/module.h>
 #include <linux/miscdevice.h>
+#include <asm/asm-prototypes.h>
+#include <asm/tm.h>
 
 #include "book3s.h"
 
@@ -53,7 +55,9 @@
 
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 			     ulong msr);
-static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac);
+#ifdef CONFIG_PPC_BOOK3S_64
+static int kvmppc_handle_fac(struct kvm_vcpu *vcpu, ulong fac);
+#endif
 
 /* Some compatibility defines */
 #ifdef CONFIG_PPC_BOOK3S_32
@@ -114,6 +118,8 @@
 
 	if (kvmppc_is_split_real(vcpu))
 		kvmppc_fixup_split_real(vcpu);
+
+	kvmppc_restore_tm_pr(vcpu);
 }
 
 static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
@@ -133,6 +139,7 @@
 
 	kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
 	kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+	kvmppc_save_tm_pr(vcpu);
 
 	/* Enable AIL if supported */
 	if (cpu_has_feature(CPU_FTR_HVMODE) &&
@@ -147,25 +154,25 @@
 {
 	struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
 
-	svcpu->gpr[0] = vcpu->arch.gpr[0];
-	svcpu->gpr[1] = vcpu->arch.gpr[1];
-	svcpu->gpr[2] = vcpu->arch.gpr[2];
-	svcpu->gpr[3] = vcpu->arch.gpr[3];
-	svcpu->gpr[4] = vcpu->arch.gpr[4];
-	svcpu->gpr[5] = vcpu->arch.gpr[5];
-	svcpu->gpr[6] = vcpu->arch.gpr[6];
-	svcpu->gpr[7] = vcpu->arch.gpr[7];
-	svcpu->gpr[8] = vcpu->arch.gpr[8];
-	svcpu->gpr[9] = vcpu->arch.gpr[9];
-	svcpu->gpr[10] = vcpu->arch.gpr[10];
-	svcpu->gpr[11] = vcpu->arch.gpr[11];
-	svcpu->gpr[12] = vcpu->arch.gpr[12];
-	svcpu->gpr[13] = vcpu->arch.gpr[13];
+	svcpu->gpr[0] = vcpu->arch.regs.gpr[0];
+	svcpu->gpr[1] = vcpu->arch.regs.gpr[1];
+	svcpu->gpr[2] = vcpu->arch.regs.gpr[2];
+	svcpu->gpr[3] = vcpu->arch.regs.gpr[3];
+	svcpu->gpr[4] = vcpu->arch.regs.gpr[4];
+	svcpu->gpr[5] = vcpu->arch.regs.gpr[5];
+	svcpu->gpr[6] = vcpu->arch.regs.gpr[6];
+	svcpu->gpr[7] = vcpu->arch.regs.gpr[7];
+	svcpu->gpr[8] = vcpu->arch.regs.gpr[8];
+	svcpu->gpr[9] = vcpu->arch.regs.gpr[9];
+	svcpu->gpr[10] = vcpu->arch.regs.gpr[10];
+	svcpu->gpr[11] = vcpu->arch.regs.gpr[11];
+	svcpu->gpr[12] = vcpu->arch.regs.gpr[12];
+	svcpu->gpr[13] = vcpu->arch.regs.gpr[13];
 	svcpu->cr  = vcpu->arch.cr;
-	svcpu->xer = vcpu->arch.xer;
-	svcpu->ctr = vcpu->arch.ctr;
-	svcpu->lr  = vcpu->arch.lr;
-	svcpu->pc  = vcpu->arch.pc;
+	svcpu->xer = vcpu->arch.regs.xer;
+	svcpu->ctr = vcpu->arch.regs.ctr;
+	svcpu->lr  = vcpu->arch.regs.link;
+	svcpu->pc  = vcpu->arch.regs.nip;
 #ifdef CONFIG_PPC_BOOK3S_64
 	svcpu->shadow_fscr = vcpu->arch.shadow_fscr;
 #endif
@@ -182,10 +189,45 @@
 	svcpu_put(svcpu);
 }
 
+static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
+{
+	ulong guest_msr = kvmppc_get_msr(vcpu);
+	ulong smsr = guest_msr;
+
+	/* Guest MSR values */
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE |
+		MSR_TM | MSR_TS_MASK;
+#else
+	smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE;
+#endif
+	/* Process MSR values */
+	smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
+	/* External providers the guest reserved */
+	smsr |= (guest_msr & vcpu->arch.guest_owned_ext);
+	/* 64-bit Process MSR values */
+#ifdef CONFIG_PPC_BOOK3S_64
+	smsr |= MSR_ISF | MSR_HV;
+#endif
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/*
+	 * in guest privileged state, we want to fail all TM transactions.
+	 * So disable MSR TM bit so that all tbegin. will be able to be
+	 * trapped into host.
+	 */
+	if (!(guest_msr & MSR_PR))
+		smsr &= ~MSR_TM;
+#endif
+	vcpu->arch.shadow_msr = smsr;
+}
+
 /* Copy data touched by real-mode code from shadow vcpu back to vcpu */
 void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	ulong old_msr;
+#endif
 
 	/*
 	 * Maybe we were already preempted and synced the svcpu from
@@ -194,25 +236,25 @@
 	if (!svcpu->in_use)
 		goto out;
 
-	vcpu->arch.gpr[0] = svcpu->gpr[0];
-	vcpu->arch.gpr[1] = svcpu->gpr[1];
-	vcpu->arch.gpr[2] = svcpu->gpr[2];
-	vcpu->arch.gpr[3] = svcpu->gpr[3];
-	vcpu->arch.gpr[4] = svcpu->gpr[4];
-	vcpu->arch.gpr[5] = svcpu->gpr[5];
-	vcpu->arch.gpr[6] = svcpu->gpr[6];
-	vcpu->arch.gpr[7] = svcpu->gpr[7];
-	vcpu->arch.gpr[8] = svcpu->gpr[8];
-	vcpu->arch.gpr[9] = svcpu->gpr[9];
-	vcpu->arch.gpr[10] = svcpu->gpr[10];
-	vcpu->arch.gpr[11] = svcpu->gpr[11];
-	vcpu->arch.gpr[12] = svcpu->gpr[12];
-	vcpu->arch.gpr[13] = svcpu->gpr[13];
+	vcpu->arch.regs.gpr[0] = svcpu->gpr[0];
+	vcpu->arch.regs.gpr[1] = svcpu->gpr[1];
+	vcpu->arch.regs.gpr[2] = svcpu->gpr[2];
+	vcpu->arch.regs.gpr[3] = svcpu->gpr[3];
+	vcpu->arch.regs.gpr[4] = svcpu->gpr[4];
+	vcpu->arch.regs.gpr[5] = svcpu->gpr[5];
+	vcpu->arch.regs.gpr[6] = svcpu->gpr[6];
+	vcpu->arch.regs.gpr[7] = svcpu->gpr[7];
+	vcpu->arch.regs.gpr[8] = svcpu->gpr[8];
+	vcpu->arch.regs.gpr[9] = svcpu->gpr[9];
+	vcpu->arch.regs.gpr[10] = svcpu->gpr[10];
+	vcpu->arch.regs.gpr[11] = svcpu->gpr[11];
+	vcpu->arch.regs.gpr[12] = svcpu->gpr[12];
+	vcpu->arch.regs.gpr[13] = svcpu->gpr[13];
 	vcpu->arch.cr  = svcpu->cr;
-	vcpu->arch.xer = svcpu->xer;
-	vcpu->arch.ctr = svcpu->ctr;
-	vcpu->arch.lr  = svcpu->lr;
-	vcpu->arch.pc  = svcpu->pc;
+	vcpu->arch.regs.xer = svcpu->xer;
+	vcpu->arch.regs.ctr = svcpu->ctr;
+	vcpu->arch.regs.link  = svcpu->lr;
+	vcpu->arch.regs.nip  = svcpu->pc;
 	vcpu->arch.shadow_srr1 = svcpu->shadow_srr1;
 	vcpu->arch.fault_dar   = svcpu->fault_dar;
 	vcpu->arch.fault_dsisr = svcpu->fault_dsisr;
@@ -228,12 +270,116 @@
 	to_book3s(vcpu)->vtb += get_vtb() - vcpu->arch.entry_vtb;
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		vcpu->arch.ic += mfspr(SPRN_IC) - vcpu->arch.entry_ic;
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/*
+	 * Unlike other MSR bits, MSR[TS]bits can be changed at guest without
+	 * notifying host:
+	 *  modified by unprivileged instructions like "tbegin"/"tend"/
+	 * "tresume"/"tsuspend" in PR KVM guest.
+	 *
+	 * It is necessary to sync here to calculate a correct shadow_msr.
+	 *
+	 * privileged guest's tbegin will be failed at present. So we
+	 * only take care of problem state guest.
+	 */
+	old_msr = kvmppc_get_msr(vcpu);
+	if (unlikely((old_msr & MSR_PR) &&
+		(vcpu->arch.shadow_srr1 & (MSR_TS_MASK)) !=
+				(old_msr & (MSR_TS_MASK)))) {
+		old_msr &= ~(MSR_TS_MASK);
+		old_msr |= (vcpu->arch.shadow_srr1 & (MSR_TS_MASK));
+		kvmppc_set_msr_fast(vcpu, old_msr);
+		kvmppc_recalc_shadow_msr(vcpu);
+	}
+#endif
+
 	svcpu->in_use = false;
 
 out:
 	svcpu_put(svcpu);
 }
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu)
+{
+	tm_enable();
+	vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
+	vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+	vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
+	tm_disable();
+}
+
+void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu)
+{
+	tm_enable();
+	mtspr(SPRN_TFHAR, vcpu->arch.tfhar);
+	mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+	mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+	tm_disable();
+}
+
+/* loadup math bits which is enabled at kvmppc_get_msr() but not enabled at
+ * hardware.
+ */
+static void kvmppc_handle_lost_math_exts(struct kvm_vcpu *vcpu)
+{
+	ulong exit_nr;
+	ulong ext_diff = (kvmppc_get_msr(vcpu) & ~vcpu->arch.guest_owned_ext) &
+		(MSR_FP | MSR_VEC | MSR_VSX);
+
+	if (!ext_diff)
+		return;
+
+	if (ext_diff == MSR_FP)
+		exit_nr = BOOK3S_INTERRUPT_FP_UNAVAIL;
+	else if (ext_diff == MSR_VEC)
+		exit_nr = BOOK3S_INTERRUPT_ALTIVEC;
+	else
+		exit_nr = BOOK3S_INTERRUPT_VSX;
+
+	kvmppc_handle_ext(vcpu, exit_nr, ext_diff);
+}
+
+void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu)
+{
+	if (!(MSR_TM_ACTIVE(kvmppc_get_msr(vcpu)))) {
+		kvmppc_save_tm_sprs(vcpu);
+		return;
+	}
+
+	kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+	kvmppc_giveup_ext(vcpu, MSR_VSX);
+
+	preempt_disable();
+	_kvmppc_save_tm_pr(vcpu, mfmsr());
+	preempt_enable();
+}
+
+void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu)
+{
+	if (!MSR_TM_ACTIVE(kvmppc_get_msr(vcpu))) {
+		kvmppc_restore_tm_sprs(vcpu);
+		if (kvmppc_get_msr(vcpu) & MSR_TM) {
+			kvmppc_handle_lost_math_exts(vcpu);
+			if (vcpu->arch.fscr & FSCR_TAR)
+				kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+		}
+		return;
+	}
+
+	preempt_disable();
+	_kvmppc_restore_tm_pr(vcpu, kvmppc_get_msr(vcpu));
+	preempt_enable();
+
+	if (kvmppc_get_msr(vcpu) & MSR_TM) {
+		kvmppc_handle_lost_math_exts(vcpu);
+		if (vcpu->arch.fscr & FSCR_TAR)
+			kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+	}
+}
+#endif
+
 static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
 {
 	int r = 1; /* Indicate we want to get back into the guest */
@@ -306,32 +452,29 @@
 
 /*****************************************/
 
-static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
-{
-	ulong guest_msr = kvmppc_get_msr(vcpu);
-	ulong smsr = guest_msr;
-
-	/* Guest MSR values */
-	smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE;
-	/* Process MSR values */
-	smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
-	/* External providers the guest reserved */
-	smsr |= (guest_msr & vcpu->arch.guest_owned_ext);
-	/* 64-bit Process MSR values */
-#ifdef CONFIG_PPC_BOOK3S_64
-	smsr |= MSR_ISF | MSR_HV;
-#endif
-	vcpu->arch.shadow_msr = smsr;
-}
-
 static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
 {
-	ulong old_msr = kvmppc_get_msr(vcpu);
+	ulong old_msr;
+
+	/* For PAPR guest, make sure MSR reflects guest mode */
+	if (vcpu->arch.papr_enabled)
+		msr = (msr & ~MSR_HV) | MSR_ME;
 
 #ifdef EXIT_DEBUG
 	printk(KERN_INFO "KVM: Set MSR to 0x%llx\n", msr);
 #endif
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/* We should never target guest MSR to TS=10 && PR=0,
+	 * since we always fail transaction for guest privilege
+	 * state.
+	 */
+	if (!(msr & MSR_PR) && MSR_TM_TRANSACTIONAL(msr))
+		kvmppc_emulate_tabort(vcpu,
+			TM_CAUSE_KVM_FAC_UNAV | TM_CAUSE_PERSISTENT);
+#endif
+
+	old_msr = kvmppc_get_msr(vcpu);
 	msr &= to_book3s(vcpu)->msr_mask;
 	kvmppc_set_msr_fast(vcpu, msr);
 	kvmppc_recalc_shadow_msr(vcpu);
@@ -387,6 +530,11 @@
 	/* Preload FPU if it's enabled */
 	if (kvmppc_get_msr(vcpu) & MSR_FP)
 		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	if (kvmppc_get_msr(vcpu) & MSR_TM)
+		kvmppc_handle_lost_math_exts(vcpu);
+#endif
 }
 
 void kvmppc_set_pvr_pr(struct kvm_vcpu *vcpu, u32 pvr)
@@ -584,24 +732,20 @@
 		pte.may_execute = !data;
 	}
 
-	if (page_found == -ENOENT) {
-		/* Page not found in guest PTE entries */
-		u64 ssrr1 = vcpu->arch.shadow_srr1;
-		u64 msr = kvmppc_get_msr(vcpu);
-		kvmppc_set_dar(vcpu, kvmppc_get_fault_dar(vcpu));
-		kvmppc_set_dsisr(vcpu, vcpu->arch.fault_dsisr);
-		kvmppc_set_msr_fast(vcpu, msr | (ssrr1 & 0xf8000000ULL));
-		kvmppc_book3s_queue_irqprio(vcpu, vec);
-	} else if (page_found == -EPERM) {
-		/* Storage protection */
-		u32 dsisr = vcpu->arch.fault_dsisr;
-		u64 ssrr1 = vcpu->arch.shadow_srr1;
-		u64 msr = kvmppc_get_msr(vcpu);
-		kvmppc_set_dar(vcpu, kvmppc_get_fault_dar(vcpu));
-		dsisr = (dsisr & ~DSISR_NOHPTE) | DSISR_PROTFAULT;
-		kvmppc_set_dsisr(vcpu, dsisr);
-		kvmppc_set_msr_fast(vcpu, msr | (ssrr1 & 0xf8000000ULL));
-		kvmppc_book3s_queue_irqprio(vcpu, vec);
+	if (page_found == -ENOENT || page_found == -EPERM) {
+		/* Page not found in guest PTE entries, or protection fault */
+		u64 flags;
+
+		if (page_found == -EPERM)
+			flags = DSISR_PROTFAULT;
+		else
+			flags = DSISR_NOHPTE;
+		if (data) {
+			flags |= vcpu->arch.fault_dsisr & DSISR_ISSTORE;
+			kvmppc_core_queue_data_storage(vcpu, eaddr, flags);
+		} else {
+			kvmppc_core_queue_inst_storage(vcpu, flags);
+		}
 	} else if (page_found == -EINVAL) {
 		/* Page not found in guest SLB */
 		kvmppc_set_dar(vcpu, kvmppc_get_fault_dar(vcpu));
@@ -683,7 +827,7 @@
 }
 
 /* Give up facility (TAR / EBB / DSCR) */
-static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac)
+void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (!(vcpu->arch.shadow_fscr & (1ULL << fac))) {
@@ -802,7 +946,7 @@
 
 #ifdef CONFIG_PPC_BOOK3S_64
 
-static void kvmppc_trigger_fac_interrupt(struct kvm_vcpu *vcpu, ulong fac)
+void kvmppc_trigger_fac_interrupt(struct kvm_vcpu *vcpu, ulong fac)
 {
 	/* Inject the Interrupt Cause field and trigger a guest interrupt */
 	vcpu->arch.fscr &= ~(0xffULL << 56);
@@ -864,6 +1008,18 @@
 		break;
 	}
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/* Since we disabled MSR_TM at privilege state, the mfspr instruction
+	 * for TM spr can trigger TM fac unavailable. In this case, the
+	 * emulation is handled by kvmppc_emulate_fac(), which invokes
+	 * kvmppc_emulate_mfspr() finally. But note the mfspr can include
+	 * RT for NV registers. So it need to restore those NV reg to reflect
+	 * the update.
+	 */
+	if ((fac == FSCR_TM_LG) && !(kvmppc_get_msr(vcpu) & MSR_PR))
+		return RESUME_GUEST_NV;
+#endif
+
 	return RESUME_GUEST;
 }
 
@@ -872,7 +1028,12 @@
 	if ((vcpu->arch.fscr & FSCR_TAR) && !(fscr & FSCR_TAR)) {
 		/* TAR got dropped, drop it in shadow too */
 		kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+	} else if (!(vcpu->arch.fscr & FSCR_TAR) && (fscr & FSCR_TAR)) {
+		vcpu->arch.fscr = fscr;
+		kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+		return;
 	}
+
 	vcpu->arch.fscr = fscr;
 }
 #endif
@@ -1017,10 +1178,8 @@
 			kvmppc_mmu_pte_flush(vcpu, kvmppc_get_pc(vcpu), ~0xFFFUL);
 			r = RESUME_GUEST;
 		} else {
-			u64 msr = kvmppc_get_msr(vcpu);
-			msr |= shadow_srr1 & 0x58000000;
-			kvmppc_set_msr_fast(vcpu, msr);
-			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+			kvmppc_core_queue_inst_storage(vcpu,
+						shadow_srr1 & 0x58000000);
 			r = RESUME_GUEST;
 		}
 		break;
@@ -1059,9 +1218,7 @@
 			r = kvmppc_handle_pagefault(run, vcpu, dar, exit_nr);
 			srcu_read_unlock(&vcpu->kvm->srcu, idx);
 		} else {
-			kvmppc_set_dar(vcpu, dar);
-			kvmppc_set_dsisr(vcpu, fault_dsisr);
-			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
+			kvmppc_core_queue_data_storage(vcpu, dar, fault_dsisr);
 			r = RESUME_GUEST;
 		}
 		break;
@@ -1092,10 +1249,13 @@
 	case BOOK3S_INTERRUPT_EXTERNAL:
 	case BOOK3S_INTERRUPT_EXTERNAL_LEVEL:
 	case BOOK3S_INTERRUPT_EXTERNAL_HV:
+	case BOOK3S_INTERRUPT_H_VIRT:
 		vcpu->stat.ext_intr_exits++;
 		r = RESUME_GUEST;
 		break;
+	case BOOK3S_INTERRUPT_HMI:
 	case BOOK3S_INTERRUPT_PERFMON:
+	case BOOK3S_INTERRUPT_SYSTEM_RESET:
 		r = RESUME_GUEST;
 		break;
 	case BOOK3S_INTERRUPT_PROGRAM:
@@ -1225,8 +1385,7 @@
 	}
 #ifdef CONFIG_PPC_BOOK3S_64
 	case BOOK3S_INTERRUPT_FAC_UNAVAIL:
-		kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56);
-		r = RESUME_GUEST;
+		r = kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56);
 		break;
 #endif
 	case BOOK3S_INTERRUPT_MACHINE_CHECK:
@@ -1379,6 +1538,73 @@
 		else
 			*val = get_reg_val(id, 0);
 		break;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	case KVM_REG_PPC_TFHAR:
+		*val = get_reg_val(id, vcpu->arch.tfhar);
+		break;
+	case KVM_REG_PPC_TFIAR:
+		*val = get_reg_val(id, vcpu->arch.tfiar);
+		break;
+	case KVM_REG_PPC_TEXASR:
+		*val = get_reg_val(id, vcpu->arch.texasr);
+		break;
+	case KVM_REG_PPC_TM_GPR0 ... KVM_REG_PPC_TM_GPR31:
+		*val = get_reg_val(id,
+				vcpu->arch.gpr_tm[id-KVM_REG_PPC_TM_GPR0]);
+		break;
+	case KVM_REG_PPC_TM_VSR0 ... KVM_REG_PPC_TM_VSR63:
+	{
+		int i, j;
+
+		i = id - KVM_REG_PPC_TM_VSR0;
+		if (i < 32)
+			for (j = 0; j < TS_FPRWIDTH; j++)
+				val->vsxval[j] = vcpu->arch.fp_tm.fpr[i][j];
+		else {
+			if (cpu_has_feature(CPU_FTR_ALTIVEC))
+				val->vval = vcpu->arch.vr_tm.vr[i-32];
+			else
+				r = -ENXIO;
+		}
+		break;
+	}
+	case KVM_REG_PPC_TM_CR:
+		*val = get_reg_val(id, vcpu->arch.cr_tm);
+		break;
+	case KVM_REG_PPC_TM_XER:
+		*val = get_reg_val(id, vcpu->arch.xer_tm);
+		break;
+	case KVM_REG_PPC_TM_LR:
+		*val = get_reg_val(id, vcpu->arch.lr_tm);
+		break;
+	case KVM_REG_PPC_TM_CTR:
+		*val = get_reg_val(id, vcpu->arch.ctr_tm);
+		break;
+	case KVM_REG_PPC_TM_FPSCR:
+		*val = get_reg_val(id, vcpu->arch.fp_tm.fpscr);
+		break;
+	case KVM_REG_PPC_TM_AMR:
+		*val = get_reg_val(id, vcpu->arch.amr_tm);
+		break;
+	case KVM_REG_PPC_TM_PPR:
+		*val = get_reg_val(id, vcpu->arch.ppr_tm);
+		break;
+	case KVM_REG_PPC_TM_VRSAVE:
+		*val = get_reg_val(id, vcpu->arch.vrsave_tm);
+		break;
+	case KVM_REG_PPC_TM_VSCR:
+		if (cpu_has_feature(CPU_FTR_ALTIVEC))
+			*val = get_reg_val(id, vcpu->arch.vr_tm.vscr.u[3]);
+		else
+			r = -ENXIO;
+		break;
+	case KVM_REG_PPC_TM_DSCR:
+		*val = get_reg_val(id, vcpu->arch.dscr_tm);
+		break;
+	case KVM_REG_PPC_TM_TAR:
+		*val = get_reg_val(id, vcpu->arch.tar_tm);
+		break;
+#endif
 	default:
 		r = -EINVAL;
 		break;
@@ -1412,6 +1638,72 @@
 	case KVM_REG_PPC_LPCR_64:
 		kvmppc_set_lpcr_pr(vcpu, set_reg_val(id, *val));
 		break;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	case KVM_REG_PPC_TFHAR:
+		vcpu->arch.tfhar = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TFIAR:
+		vcpu->arch.tfiar = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TEXASR:
+		vcpu->arch.texasr = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_GPR0 ... KVM_REG_PPC_TM_GPR31:
+		vcpu->arch.gpr_tm[id - KVM_REG_PPC_TM_GPR0] =
+			set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_VSR0 ... KVM_REG_PPC_TM_VSR63:
+	{
+		int i, j;
+
+		i = id - KVM_REG_PPC_TM_VSR0;
+		if (i < 32)
+			for (j = 0; j < TS_FPRWIDTH; j++)
+				vcpu->arch.fp_tm.fpr[i][j] = val->vsxval[j];
+		else
+			if (cpu_has_feature(CPU_FTR_ALTIVEC))
+				vcpu->arch.vr_tm.vr[i-32] = val->vval;
+			else
+				r = -ENXIO;
+		break;
+	}
+	case KVM_REG_PPC_TM_CR:
+		vcpu->arch.cr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_XER:
+		vcpu->arch.xer_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_LR:
+		vcpu->arch.lr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_CTR:
+		vcpu->arch.ctr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_FPSCR:
+		vcpu->arch.fp_tm.fpscr = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_AMR:
+		vcpu->arch.amr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_PPR:
+		vcpu->arch.ppr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_VRSAVE:
+		vcpu->arch.vrsave_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_VSCR:
+		if (cpu_has_feature(CPU_FTR_ALTIVEC))
+			vcpu->arch.vr.vscr.u[3] = set_reg_val(id, *val);
+		else
+			r = -ENXIO;
+		break;
+	case KVM_REG_PPC_TM_DSCR:
+		vcpu->arch.dscr_tm = set_reg_val(id, *val);
+		break;
+	case KVM_REG_PPC_TM_TAR:
+		vcpu->arch.tar_tm = set_reg_val(id, *val);
+		break;
+#endif
 	default:
 		r = -EINVAL;
 		break;
@@ -1687,6 +1979,17 @@
 
 	return 0;
 }
+
+static int kvm_configure_mmu_pr(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg)
+{
+	if (!cpu_has_feature(CPU_FTR_ARCH_300))
+		return -ENODEV;
+	/* Require flags and process table base and size to all be zero. */
+	if (cfg->flags || cfg->process_table)
+		return -EINVAL;
+	return 0;
+}
+
 #else
 static int kvm_vm_ioctl_get_smmu_info_pr(struct kvm *kvm,
 					 struct kvm_ppc_smmu_info *info)
@@ -1735,9 +2038,12 @@
 static int kvmppc_core_check_processor_compat_pr(void)
 {
 	/*
-	 * Disable KVM for Power9 untill the required bits merged.
+	 * PR KVM can work on POWER9 inside a guest partition
+	 * running in HPT mode.  It can't work if we are using
+	 * radix translation (because radix provides no way for
+	 * a process to have unique translations in quadrant 3).
 	 */
-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_300) && radix_enabled())
 		return -EIO;
 	return 0;
 }
@@ -1781,7 +2087,9 @@
 	.arch_vm_ioctl  = kvm_arch_vm_ioctl_pr,
 #ifdef CONFIG_PPC_BOOK3S_64
 	.hcall_implemented = kvmppc_hcall_impl_pr,
+	.configure_mmu = kvm_configure_mmu_pr,
 #endif
+	.giveup_ext = kvmppc_giveup_ext,
 };
 
 
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index 93a180c..98ccc7e 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -383,6 +383,19 @@
 	 */
 
 	PPC_LL	r6, HSTATE_HOST_MSR(r13)
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/*
+	 * We don't want to change MSR[TS] bits via rfi here.
+	 * The actual TM handling logic will be in host with
+	 * recovered DR/IR bits after HSTATE_VMHANDLER.
+	 * And MSR_TM can be enabled in HOST_MSR so rfid may
+	 * not suppress this change and can lead to exception.
+	 * Manually set MSR to prevent TS state change here.
+	 */
+	mfmsr   r7
+	rldicl  r7, r7, 64 - MSR_TS_S_LG, 62
+	rldimi  r6, r7, MSR_TS_S_LG, 63 - MSR_TS_T_LG
+#endif
 	PPC_LL	r8, HSTATE_VMHANDLER(r13)
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
index 99c3620..6e41ba7 100644
--- a/arch/powerpc/kvm/book3s_xive_template.c
+++ b/arch/powerpc/kvm/book3s_xive_template.c
@@ -334,7 +334,7 @@
 	 */
 
 	/* Return interrupt and old CPPR in GPR4 */
-	vcpu->arch.gpr[4] = hirq | (old_cppr << 24);
+	vcpu->arch.regs.gpr[4] = hirq | (old_cppr << 24);
 
 	return H_SUCCESS;
 }
@@ -369,7 +369,7 @@
 	hirq = GLUE(X_PFX,scan_interrupts)(xc, pending, scan_poll);
 
 	/* Return interrupt and old CPPR in GPR4 */
-	vcpu->arch.gpr[4] = hirq | (xc->cppr << 24);
+	vcpu->arch.regs.gpr[4] = hirq | (xc->cppr << 24);
 
 	return H_SUCCESS;
 }
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 876d4f2..a9ca016 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -77,8 +77,10 @@
 {
 	int i;
 
-	printk("pc:   %08lx msr:  %08llx\n", vcpu->arch.pc, vcpu->arch.shared->msr);
-	printk("lr:   %08lx ctr:  %08lx\n", vcpu->arch.lr, vcpu->arch.ctr);
+	printk("pc:   %08lx msr:  %08llx\n", vcpu->arch.regs.nip,
+			vcpu->arch.shared->msr);
+	printk("lr:   %08lx ctr:  %08lx\n", vcpu->arch.regs.link,
+			vcpu->arch.regs.ctr);
 	printk("srr0: %08llx srr1: %08llx\n", vcpu->arch.shared->srr0,
 					    vcpu->arch.shared->srr1);
 
@@ -491,24 +493,25 @@
 	if (allowed) {
 		switch (int_class) {
 		case INT_CLASS_NONCRIT:
-			set_guest_srr(vcpu, vcpu->arch.pc,
+			set_guest_srr(vcpu, vcpu->arch.regs.nip,
 				      vcpu->arch.shared->msr);
 			break;
 		case INT_CLASS_CRIT:
-			set_guest_csrr(vcpu, vcpu->arch.pc,
+			set_guest_csrr(vcpu, vcpu->arch.regs.nip,
 				       vcpu->arch.shared->msr);
 			break;
 		case INT_CLASS_DBG:
-			set_guest_dsrr(vcpu, vcpu->arch.pc,
+			set_guest_dsrr(vcpu, vcpu->arch.regs.nip,
 				       vcpu->arch.shared->msr);
 			break;
 		case INT_CLASS_MC:
-			set_guest_mcsrr(vcpu, vcpu->arch.pc,
+			set_guest_mcsrr(vcpu, vcpu->arch.regs.nip,
 					vcpu->arch.shared->msr);
 			break;
 		}
 
-		vcpu->arch.pc = vcpu->arch.ivpr | vcpu->arch.ivor[priority];
+		vcpu->arch.regs.nip = vcpu->arch.ivpr |
+					vcpu->arch.ivor[priority];
 		if (update_esr == true)
 			kvmppc_set_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
@@ -826,7 +829,7 @@
 
 	case EMULATE_FAIL:
 		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
-		       __func__, vcpu->arch.pc, vcpu->arch.last_inst);
+		       __func__, vcpu->arch.regs.nip, vcpu->arch.last_inst);
 		/* For debugging, encode the failing instruction and
 		 * report it to userspace. */
 		run->hw.hardware_exit_reason = ~0ULL << 32;
@@ -875,7 +878,7 @@
 	 */
 	vcpu->arch.dbsr = 0;
 	run->debug.arch.status = 0;
-	run->debug.arch.address = vcpu->arch.pc;
+	run->debug.arch.address = vcpu->arch.regs.nip;
 
 	if (dbsr & (DBSR_IAC1 | DBSR_IAC2 | DBSR_IAC3 | DBSR_IAC4)) {
 		run->debug.arch.status |= KVMPPC_DEBUG_BREAKPOINT;
@@ -971,7 +974,7 @@
 
 	case EMULATE_FAIL:
 		pr_debug("%s: load instruction from guest address %lx failed\n",
-		       __func__, vcpu->arch.pc);
+		       __func__, vcpu->arch.regs.nip);
 		/* For debugging, encode the failing instruction and
 		 * report it to userspace. */
 		run->hw.hardware_exit_reason = ~0ULL << 32;
@@ -1169,7 +1172,7 @@
 	case BOOKE_INTERRUPT_SPE_FP_DATA:
 	case BOOKE_INTERRUPT_SPE_FP_ROUND:
 		printk(KERN_CRIT "%s: unexpected SPE interrupt %u at %08lx\n",
-		       __func__, exit_nr, vcpu->arch.pc);
+		       __func__, exit_nr, vcpu->arch.regs.nip);
 		run->hw.hardware_exit_reason = exit_nr;
 		r = RESUME_HOST;
 		break;
@@ -1299,7 +1302,7 @@
 	}
 
 	case BOOKE_INTERRUPT_ITLB_MISS: {
-		unsigned long eaddr = vcpu->arch.pc;
+		unsigned long eaddr = vcpu->arch.regs.nip;
 		gpa_t gpaddr;
 		gfn_t gfn;
 		int gtlb_index;
@@ -1391,7 +1394,7 @@
 	int i;
 	int r;
 
-	vcpu->arch.pc = 0;
+	vcpu->arch.regs.nip = 0;
 	vcpu->arch.shared->pir = vcpu->vcpu_id;
 	kvmppc_set_gpr(vcpu, 1, (16<<20) - 8); /* -8 for the callee-save LR slot */
 	kvmppc_set_msr(vcpu, 0);
@@ -1440,10 +1443,10 @@
 
 	vcpu_load(vcpu);
 
-	regs->pc = vcpu->arch.pc;
+	regs->pc = vcpu->arch.regs.nip;
 	regs->cr = kvmppc_get_cr(vcpu);
-	regs->ctr = vcpu->arch.ctr;
-	regs->lr = vcpu->arch.lr;
+	regs->ctr = vcpu->arch.regs.ctr;
+	regs->lr = vcpu->arch.regs.link;
 	regs->xer = kvmppc_get_xer(vcpu);
 	regs->msr = vcpu->arch.shared->msr;
 	regs->srr0 = kvmppc_get_srr0(vcpu);
@@ -1471,10 +1474,10 @@
 
 	vcpu_load(vcpu);
 
-	vcpu->arch.pc = regs->pc;
+	vcpu->arch.regs.nip = regs->pc;
 	kvmppc_set_cr(vcpu, regs->cr);
-	vcpu->arch.ctr = regs->ctr;
-	vcpu->arch.lr = regs->lr;
+	vcpu->arch.regs.ctr = regs->ctr;
+	vcpu->arch.regs.link = regs->lr;
 	kvmppc_set_xer(vcpu, regs->xer);
 	kvmppc_set_msr(vcpu, regs->msr);
 	kvmppc_set_srr0(vcpu, regs->srr0);
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index a82f645..d23e582 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -34,19 +34,19 @@
 
 static void kvmppc_emul_rfi(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.pc = vcpu->arch.shared->srr0;
+	vcpu->arch.regs.nip = vcpu->arch.shared->srr0;
 	kvmppc_set_msr(vcpu, vcpu->arch.shared->srr1);
 }
 
 static void kvmppc_emul_rfdi(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.pc = vcpu->arch.dsrr0;
+	vcpu->arch.regs.nip = vcpu->arch.dsrr0;
 	kvmppc_set_msr(vcpu, vcpu->arch.dsrr1);
 }
 
 static void kvmppc_emul_rfci(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.pc = vcpu->arch.csrr0;
+	vcpu->arch.regs.nip = vcpu->arch.csrr0;
 	kvmppc_set_msr(vcpu, vcpu->arch.csrr1);
 }
 
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 990db69..3f8189e 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -53,7 +53,7 @@
 
 static int kvmppc_e500_emul_msgclr(struct kvm_vcpu *vcpu, int rb)
 {
-	ulong param = vcpu->arch.gpr[rb];
+	ulong param = vcpu->arch.regs.gpr[rb];
 	int prio = dbell2prio(param);
 
 	if (prio < 0)
@@ -65,7 +65,7 @@
 
 static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, int rb)
 {
-	ulong param = vcpu->arch.gpr[rb];
+	ulong param = vcpu->arch.regs.gpr[rb];
 	int prio = dbell2prio(rb);
 	int pir = param & PPC_DBELL_PIR_MASK;
 	int i;
@@ -94,7 +94,7 @@
 	switch (get_oc(inst)) {
 	case EHPRIV_OC_DEBUG:
 		run->exit_reason = KVM_EXIT_DEBUG;
-		run->debug.arch.address = vcpu->arch.pc;
+		run->debug.arch.address = vcpu->arch.regs.nip;
 		run->debug.arch.status = 0;
 		kvmppc_account_exit(vcpu, DEBUG_EXITS);
 		emulated = EMULATE_EXIT_USER;
diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index ddbf8f0..24296f4 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -513,7 +513,7 @@
 {
 	unsigned int as = !!(vcpu->arch.shared->msr & MSR_IS);
 
-	kvmppc_e500_deliver_tlb_miss(vcpu, vcpu->arch.pc, as);
+	kvmppc_e500_deliver_tlb_miss(vcpu, vcpu->arch.regs.nip, as);
 }
 
 void kvmppc_mmu_dtlb_miss(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index c878b4f..8f2985e 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -625,8 +625,8 @@
 }
 
 #ifdef CONFIG_KVM_BOOKE_HV
-int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
-			  u32 *instr)
+int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
+		enum instruction_fetch_type type, u32 *instr)
 {
 	gva_t geaddr;
 	hpa_t addr;
@@ -715,8 +715,8 @@
 	return EMULATE_DONE;
 }
 #else
-int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, enum instruction_type type,
-			  u32 *instr)
+int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
+		enum instruction_fetch_type type, u32 *instr)
 {
 	return EMULATE_AGAIN;
 }
diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
index a382e15..afde788 100644
--- a/arch/powerpc/kvm/emulate_loadstore.c
+++ b/arch/powerpc/kvm/emulate_loadstore.c
@@ -31,6 +31,7 @@
 #include <asm/kvm_ppc.h>
 #include <asm/disassemble.h>
 #include <asm/ppc-opcode.h>
+#include <asm/sstep.h>
 #include "timing.h"
 #include "trace.h"
 
@@ -84,8 +85,9 @@
 	struct kvm_run *run = vcpu->run;
 	u32 inst;
 	int ra, rs, rt;
-	enum emulation_result emulated;
+	enum emulation_result emulated = EMULATE_FAIL;
 	int advance = 1;
+	struct instruction_op op;
 
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
@@ -107,580 +109,276 @@
 	vcpu->arch.mmio_vsx_tx_sx_enabled = get_tx_or_sx(inst);
 	vcpu->arch.mmio_vsx_copy_nums = 0;
 	vcpu->arch.mmio_vsx_offset = 0;
-	vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_NONE;
+	vcpu->arch.mmio_copy_type = KVMPPC_VSX_COPY_NONE;
 	vcpu->arch.mmio_sp64_extend = 0;
 	vcpu->arch.mmio_sign_extend = 0;
 	vcpu->arch.mmio_vmx_copy_nums = 0;
+	vcpu->arch.mmio_vmx_offset = 0;
+	vcpu->arch.mmio_host_swabbed = 0;
 
-	switch (get_op(inst)) {
-	case 31:
-		switch (get_xop(inst)) {
-		case OP_31_XOP_LWZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
+	emulated = EMULATE_FAIL;
+	vcpu->arch.regs.msr = vcpu->arch.shared->msr;
+	vcpu->arch.regs.ccr = vcpu->arch.cr;
+	if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
+		int type = op.type & INSTR_TYPE_MASK;
+		int size = GETSIZE(op.type);
+
+		switch (type) {
+		case LOAD:  {
+			int instr_byte_swap = op.type & BYTEREV;
+
+			if (op.type & SIGNEXT)
+				emulated = kvmppc_handle_loads(run, vcpu,
+						op.reg, size, !instr_byte_swap);
+			else
+				emulated = kvmppc_handle_load(run, vcpu,
+						op.reg, size, !instr_byte_swap);
+
+			if ((op.type & UPDATE) && (emulated != EMULATE_FAIL))
+				kvmppc_set_gpr(vcpu, op.update_reg, op.ea);
+
 			break;
+		}
+#ifdef CONFIG_PPC_FPU
+		case LOAD_FP:
+			if (kvmppc_check_fp_disabled(vcpu))
+				return EMULATE_DONE;
 
-		case OP_31_XOP_LWZUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
+			if (op.type & FPCONV)
+				vcpu->arch.mmio_sp64_extend = 1;
+
+			if (op.type & SIGNEXT)
+				emulated = kvmppc_handle_loads(run, vcpu,
+					     KVM_MMIO_REG_FPR|op.reg, size, 1);
+			else
+				emulated = kvmppc_handle_load(run, vcpu,
+					     KVM_MMIO_REG_FPR|op.reg, size, 1);
+
+			if ((op.type & UPDATE) && (emulated != EMULATE_FAIL))
+				kvmppc_set_gpr(vcpu, op.update_reg, op.ea);
+
 			break;
+#endif
+#ifdef CONFIG_ALTIVEC
+		case LOAD_VMX:
+			if (kvmppc_check_altivec_disabled(vcpu))
+				return EMULATE_DONE;
 
-		case OP_31_XOP_LBZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
+			/* Hardware enforces alignment of VMX accesses */
+			vcpu->arch.vaddr_accessed &= ~((unsigned long)size - 1);
+			vcpu->arch.paddr_accessed &= ~((unsigned long)size - 1);
+
+			if (size == 16) { /* lvx */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_DWORD;
+			} else if (size == 4) { /* lvewx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_WORD;
+			} else if (size == 2) { /* lvehx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_HWORD;
+			} else if (size == 1) { /* lvebx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_BYTE;
+			} else
+				break;
+
+			vcpu->arch.mmio_vmx_offset =
+				(vcpu->arch.vaddr_accessed & 0xf)/size;
+
+			if (size == 16) {
+				vcpu->arch.mmio_vmx_copy_nums = 2;
+				emulated = kvmppc_handle_vmx_load(run,
+						vcpu, KVM_MMIO_REG_VMX|op.reg,
+						8, 1);
+			} else {
+				vcpu->arch.mmio_vmx_copy_nums = 1;
+				emulated = kvmppc_handle_vmx_load(run, vcpu,
+						KVM_MMIO_REG_VMX|op.reg,
+						size, 1);
+			}
 			break;
+#endif
+#ifdef CONFIG_VSX
+		case LOAD_VSX: {
+			int io_size_each;
 
-		case OP_31_XOP_LBZUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
+			if (op.vsx_flags & VSX_CHECK_VEC) {
+				if (kvmppc_check_altivec_disabled(vcpu))
+					return EMULATE_DONE;
+			} else {
+				if (kvmppc_check_vsx_disabled(vcpu))
+					return EMULATE_DONE;
+			}
+
+			if (op.vsx_flags & VSX_FPCONV)
+				vcpu->arch.mmio_sp64_extend = 1;
+
+			if (op.element_size == 8)  {
+				if (op.vsx_flags & VSX_SPLAT)
+					vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_DWORD_LOAD_DUMP;
+				else
+					vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_DWORD;
+			} else if (op.element_size == 4) {
+				if (op.vsx_flags & VSX_SPLAT)
+					vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_WORD_LOAD_DUMP;
+				else
+					vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_WORD;
+			} else
+				break;
+
+			if (size < op.element_size) {
+				/* precision convert case: lxsspx, etc */
+				vcpu->arch.mmio_vsx_copy_nums = 1;
+				io_size_each = size;
+			} else { /* lxvw4x, lxvd2x, etc */
+				vcpu->arch.mmio_vsx_copy_nums =
+					size/op.element_size;
+				io_size_each = op.element_size;
+			}
+
+			emulated = kvmppc_handle_vsx_load(run, vcpu,
+					KVM_MMIO_REG_VSX | (op.reg & 0x1f),
+					io_size_each, 1, op.type & SIGNEXT);
 			break;
+		}
+#endif
+		case STORE:
+			/* if need byte reverse, op.val has been reversed by
+			 * analyse_instr().
+			 */
+			emulated = kvmppc_handle_store(run, vcpu, op.val,
+					size, 1);
 
-		case OP_31_XOP_STDX:
+			if ((op.type & UPDATE) && (emulated != EMULATE_FAIL))
+				kvmppc_set_gpr(vcpu, op.update_reg, op.ea);
+
+			break;
+#ifdef CONFIG_PPC_FPU
+		case STORE_FP:
+			if (kvmppc_check_fp_disabled(vcpu))
+				return EMULATE_DONE;
+
+			/* The FP registers need to be flushed so that
+			 * kvmppc_handle_store() can read actual FP vals
+			 * from vcpu->arch.
+			 */
+			if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+				vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu,
+						MSR_FP);
+
+			if (op.type & FPCONV)
+				vcpu->arch.mmio_sp64_extend = 1;
+
 			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 8, 1);
-			break;
+					VCPU_FPR(vcpu, op.reg), size, 1);
 
-		case OP_31_XOP_STDUX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
+			if ((op.type & UPDATE) && (emulated != EMULATE_FAIL))
+				kvmppc_set_gpr(vcpu, op.update_reg, op.ea);
 
-		case OP_31_XOP_STWX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 4, 1);
 			break;
+#endif
+#ifdef CONFIG_ALTIVEC
+		case STORE_VMX:
+			if (kvmppc_check_altivec_disabled(vcpu))
+				return EMULATE_DONE;
 
-		case OP_31_XOP_STWUX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 4, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
+			/* Hardware enforces alignment of VMX accesses. */
+			vcpu->arch.vaddr_accessed &= ~((unsigned long)size - 1);
+			vcpu->arch.paddr_accessed &= ~((unsigned long)size - 1);
+
+			if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+				vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu,
+						MSR_VEC);
+			if (size == 16) { /* stvx */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_DWORD;
+			} else if (size == 4) { /* stvewx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_WORD;
+			} else if (size == 2) { /* stvehx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_HWORD;
+			} else if (size == 1) { /* stvebx  */
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VMX_COPY_BYTE;
+			} else
+				break;
+
+			vcpu->arch.mmio_vmx_offset =
+				(vcpu->arch.vaddr_accessed & 0xf)/size;
+
+			if (size == 16) {
+				vcpu->arch.mmio_vmx_copy_nums = 2;
+				emulated = kvmppc_handle_vmx_store(run,
+						vcpu, op.reg, 8, 1);
+			} else {
+				vcpu->arch.mmio_vmx_copy_nums = 1;
+				emulated = kvmppc_handle_vmx_store(run,
+						vcpu, op.reg, size, 1);
+			}
+
 			break;
+#endif
+#ifdef CONFIG_VSX
+		case STORE_VSX: {
+			int io_size_each;
 
-		case OP_31_XOP_STBX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 1, 1);
+			if (op.vsx_flags & VSX_CHECK_VEC) {
+				if (kvmppc_check_altivec_disabled(vcpu))
+					return EMULATE_DONE;
+			} else {
+				if (kvmppc_check_vsx_disabled(vcpu))
+					return EMULATE_DONE;
+			}
+
+			if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+				vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu,
+						MSR_VSX);
+
+			if (op.vsx_flags & VSX_FPCONV)
+				vcpu->arch.mmio_sp64_extend = 1;
+
+			if (op.element_size == 8)
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_DWORD;
+			else if (op.element_size == 4)
+				vcpu->arch.mmio_copy_type =
+						KVMPPC_VSX_COPY_WORD;
+			else
+				break;
+
+			if (size < op.element_size) {
+				/* precise conversion case, like stxsspx */
+				vcpu->arch.mmio_vsx_copy_nums = 1;
+				io_size_each = size;
+			} else { /* stxvw4x, stxvd2x, etc */
+				vcpu->arch.mmio_vsx_copy_nums =
+						size/op.element_size;
+				io_size_each = op.element_size;
+			}
+
+			emulated = kvmppc_handle_vsx_store(run, vcpu,
+					op.reg & 0x1f, io_size_each, 1);
 			break;
-
-		case OP_31_XOP_STBUX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 1, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_LHAX:
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
-			break;
-
-		case OP_31_XOP_LHAUX:
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_LHZX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
-			break;
-
-		case OP_31_XOP_LHZUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_STHX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 2, 1);
-			break;
-
-		case OP_31_XOP_STHUX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 2, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_DCBST:
-		case OP_31_XOP_DCBF:
-		case OP_31_XOP_DCBI:
+		}
+#endif
+		case CACHEOP:
 			/* Do nothing. The guest is performing dcbi because
 			 * hardware DMA is not snooped by the dcache, but
 			 * emulated DMA either goes through the dcache as
 			 * normal writes, or the host kernel has handled dcache
-			 * coherence. */
-			break;
-
-		case OP_31_XOP_LWBRX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 4, 0);
-			break;
-
-		case OP_31_XOP_STWBRX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 4, 0);
-			break;
-
-		case OP_31_XOP_LHBRX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 2, 0);
-			break;
-
-		case OP_31_XOP_STHBRX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 2, 0);
-			break;
-
-		case OP_31_XOP_LDBRX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 8, 0);
-			break;
-
-		case OP_31_XOP_STDBRX:
-			emulated = kvmppc_handle_store(run, vcpu,
-					kvmppc_get_gpr(vcpu, rs), 8, 0);
-			break;
-
-		case OP_31_XOP_LDX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 8, 1);
-			break;
-
-		case OP_31_XOP_LDUX:
-			emulated = kvmppc_handle_load(run, vcpu, rt, 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_LWAX:
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 4, 1);
-			break;
-
-		case OP_31_XOP_LWAUX:
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 4, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-#ifdef CONFIG_PPC_FPU
-		case OP_31_XOP_LFSX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_load(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 4, 1);
-			break;
-
-		case OP_31_XOP_LFSUX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_load(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 4, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_LFDX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_load(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 8, 1);
-			break;
-
-		case OP_31_XOP_LFDUX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_load(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_LFIWAX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_loads(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 4, 1);
-			break;
-
-		case OP_31_XOP_LFIWZX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_load(run, vcpu,
-				KVM_MMIO_REG_FPR|rt, 4, 1);
-			break;
-
-		case OP_31_XOP_STFSX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_store(run, vcpu,
-				VCPU_FPR(vcpu, rs), 4, 1);
-			break;
-
-		case OP_31_XOP_STFSUX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_store(run, vcpu,
-				VCPU_FPR(vcpu, rs), 4, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_STFDX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_store(run, vcpu,
-				VCPU_FPR(vcpu, rs), 8, 1);
-			break;
-
-		case OP_31_XOP_STFDUX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_store(run, vcpu,
-				VCPU_FPR(vcpu, rs), 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-
-		case OP_31_XOP_STFIWX:
-			if (kvmppc_check_fp_disabled(vcpu))
-				return EMULATE_DONE;
-			emulated = kvmppc_handle_store(run, vcpu,
-				VCPU_FPR(vcpu, rs), 4, 1);
-			break;
-#endif
-
-#ifdef CONFIG_VSX
-		case OP_31_XOP_LXSDX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 8, 1, 0);
-			break;
-
-		case OP_31_XOP_LXSSPX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 4, 1, 0);
-			break;
-
-		case OP_31_XOP_LXSIWAX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 4, 1, 1);
-			break;
-
-		case OP_31_XOP_LXSIWZX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 4, 1, 0);
-			break;
-
-		case OP_31_XOP_LXVD2X:
-		/*
-		 * In this case, the official load/store process is like this:
-		 * Step1, exit from vm by page fault isr, then kvm save vsr.
-		 * Please see guest_exit_cont->store_fp_state->SAVE_32VSRS
-		 * as reference.
-		 *
-		 * Step2, copy data between memory and VCPU
-		 * Notice: for LXVD2X/STXVD2X/LXVW4X/STXVW4X, we use
-		 * 2copies*8bytes or 4copies*4bytes
-		 * to simulate one copy of 16bytes.
-		 * Also there is an endian issue here, we should notice the
-		 * layout of memory.
-		 * Please see MARCO of LXVD2X_ROT/STXVD2X_ROT as more reference.
-		 * If host is little-endian, kvm will call XXSWAPD for
-		 * LXVD2X_ROT/STXVD2X_ROT.
-		 * So, if host is little-endian,
-		 * the postion of memeory should be swapped.
-		 *
-		 * Step3, return to guest, kvm reset register.
-		 * Please see kvmppc_hv_entry->load_fp_state->REST_32VSRS
-		 * as reference.
-		 */
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 2;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 8, 1, 0);
-			break;
-
-		case OP_31_XOP_LXVW4X:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 4;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_WORD;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 4, 1, 0);
-			break;
-
-		case OP_31_XOP_LXVDSX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type =
-				 KVMPPC_VSX_COPY_DWORD_LOAD_DUMP;
-			emulated = kvmppc_handle_vsx_load(run, vcpu,
-				KVM_MMIO_REG_VSX|rt, 8, 1, 0);
-			break;
-
-		case OP_31_XOP_STXSDX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_store(run, vcpu,
-						 rs, 8, 1);
-			break;
-
-		case OP_31_XOP_STXSSPX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			vcpu->arch.mmio_sp64_extend = 1;
-			emulated = kvmppc_handle_vsx_store(run, vcpu,
-						 rs, 4, 1);
-			break;
-
-		case OP_31_XOP_STXSIWX:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_offset = 1;
-			vcpu->arch.mmio_vsx_copy_nums = 1;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_WORD;
-			emulated = kvmppc_handle_vsx_store(run, vcpu,
-							 rs, 4, 1);
-			break;
-
-		case OP_31_XOP_STXVD2X:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 2;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_DWORD;
-			emulated = kvmppc_handle_vsx_store(run, vcpu,
-							 rs, 8, 1);
-			break;
-
-		case OP_31_XOP_STXVW4X:
-			if (kvmppc_check_vsx_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.mmio_vsx_copy_nums = 4;
-			vcpu->arch.mmio_vsx_copy_type = KVMPPC_VSX_COPY_WORD;
-			emulated = kvmppc_handle_vsx_store(run, vcpu,
-							 rs, 4, 1);
-			break;
-#endif /* CONFIG_VSX */
-
-#ifdef CONFIG_ALTIVEC
-		case OP_31_XOP_LVX:
-			if (kvmppc_check_altivec_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.vaddr_accessed &= ~0xFULL;
-			vcpu->arch.paddr_accessed &= ~0xFULL;
-			vcpu->arch.mmio_vmx_copy_nums = 2;
-			emulated = kvmppc_handle_load128_by2x64(run, vcpu,
-					KVM_MMIO_REG_VMX|rt, 1);
-			break;
-
-		case OP_31_XOP_STVX:
-			if (kvmppc_check_altivec_disabled(vcpu))
-				return EMULATE_DONE;
-			vcpu->arch.vaddr_accessed &= ~0xFULL;
-			vcpu->arch.paddr_accessed &= ~0xFULL;
-			vcpu->arch.mmio_vmx_copy_nums = 2;
-			emulated = kvmppc_handle_store128_by2x64(run, vcpu,
-					rs, 1);
-			break;
-#endif /* CONFIG_ALTIVEC */
-
-		default:
-			emulated = EMULATE_FAIL;
-			break;
-		}
-		break;
-
-	case OP_LWZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
-		break;
-
-#ifdef CONFIG_PPC_FPU
-	case OP_STFS:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		vcpu->arch.mmio_sp64_extend = 1;
-		emulated = kvmppc_handle_store(run, vcpu,
-			VCPU_FPR(vcpu, rs),
-			4, 1);
-		break;
-
-	case OP_STFSU:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		vcpu->arch.mmio_sp64_extend = 1;
-		emulated = kvmppc_handle_store(run, vcpu,
-			VCPU_FPR(vcpu, rs),
-			4, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_STFD:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		emulated = kvmppc_handle_store(run, vcpu,
-			VCPU_FPR(vcpu, rs),
-	                               8, 1);
-		break;
-
-	case OP_STFDU:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		emulated = kvmppc_handle_store(run, vcpu,
-			VCPU_FPR(vcpu, rs),
-	                               8, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-#endif
-
-	case OP_LD:
-		rt = get_rt(inst);
-		switch (inst & 3) {
-		case 0:	/* ld */
-			emulated = kvmppc_handle_load(run, vcpu, rt, 8, 1);
-			break;
-		case 1: /* ldu */
-			emulated = kvmppc_handle_load(run, vcpu, rt, 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-		case 2:	/* lwa */
-			emulated = kvmppc_handle_loads(run, vcpu, rt, 4, 1);
+			 * coherence.
+			 */
+			emulated = EMULATE_DONE;
 			break;
 		default:
-			emulated = EMULATE_FAIL;
-		}
-		break;
-
-	case OP_LWZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_LBZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
-		break;
-
-	case OP_LBZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_STW:
-		emulated = kvmppc_handle_store(run, vcpu,
-					       kvmppc_get_gpr(vcpu, rs),
-		                               4, 1);
-		break;
-
-	case OP_STD:
-		rs = get_rs(inst);
-		switch (inst & 3) {
-		case 0:	/* std */
-			emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 8, 1);
 			break;
-		case 1: /* stdu */
-			emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 8, 1);
-			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-			break;
-		default:
-			emulated = EMULATE_FAIL;
 		}
-		break;
-
-	case OP_STWU:
-		emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 4, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_STB:
-		emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 1, 1);
-		break;
-
-	case OP_STBU:
-		emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 1, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_LHZ:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
-		break;
-
-	case OP_LHZU:
-		emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_LHA:
-		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
-		break;
-
-	case OP_LHAU:
-		emulated = kvmppc_handle_loads(run, vcpu, rt, 2, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_STH:
-		emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 2, 1);
-		break;
-
-	case OP_STHU:
-		emulated = kvmppc_handle_store(run, vcpu,
-				kvmppc_get_gpr(vcpu, rs), 2, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-#ifdef CONFIG_PPC_FPU
-	case OP_LFS:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		vcpu->arch.mmio_sp64_extend = 1;
-		emulated = kvmppc_handle_load(run, vcpu,
-			KVM_MMIO_REG_FPR|rt, 4, 1);
-		break;
-
-	case OP_LFSU:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		vcpu->arch.mmio_sp64_extend = 1;
-		emulated = kvmppc_handle_load(run, vcpu,
-			KVM_MMIO_REG_FPR|rt, 4, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-
-	case OP_LFD:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		emulated = kvmppc_handle_load(run, vcpu,
-			KVM_MMIO_REG_FPR|rt, 8, 1);
-		break;
-
-	case OP_LFDU:
-		if (kvmppc_check_fp_disabled(vcpu))
-			return EMULATE_DONE;
-		emulated = kvmppc_handle_load(run, vcpu,
-			KVM_MMIO_REG_FPR|rt, 8, 1);
-		kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-		break;
-#endif
-
-	default:
-		emulated = EMULATE_FAIL;
-		break;
 	}
 
 	if (emulated == EMULATE_FAIL) {
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3764d00..0e8c20c5 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -648,9 +648,8 @@
 #endif
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	case KVM_CAP_PPC_HTM:
-		r = hv_enabled &&
-		    (!!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_HTM) ||
-		     cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST));
+		r = !!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_HTM) ||
+		     (hv_enabled && cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST));
 		break;
 #endif
 	default:
@@ -907,6 +906,26 @@
 	}
 }
 
+static inline void kvmppc_set_vsr_word_dump(struct kvm_vcpu *vcpu,
+	u32 gpr)
+{
+	union kvmppc_one_reg val;
+	int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
+
+	if (vcpu->arch.mmio_vsx_tx_sx_enabled) {
+		val.vsx32val[0] = gpr;
+		val.vsx32val[1] = gpr;
+		val.vsx32val[2] = gpr;
+		val.vsx32val[3] = gpr;
+		VCPU_VSX_VR(vcpu, index) = val.vval;
+	} else {
+		val.vsx32val[0] = gpr;
+		val.vsx32val[1] = gpr;
+		VCPU_VSX_FPR(vcpu, index, 0) = val.vsxval[0];
+		VCPU_VSX_FPR(vcpu, index, 1) = val.vsxval[0];
+	}
+}
+
 static inline void kvmppc_set_vsr_word(struct kvm_vcpu *vcpu,
 	u32 gpr32)
 {
@@ -933,30 +952,110 @@
 #endif /* CONFIG_VSX */
 
 #ifdef CONFIG_ALTIVEC
-static inline void kvmppc_set_vmx_dword(struct kvm_vcpu *vcpu,
-		u64 gpr)
+static inline int kvmppc_get_vmx_offset_generic(struct kvm_vcpu *vcpu,
+		int index, int element_size)
 {
+	int offset;
+	int elts = sizeof(vector128)/element_size;
+
+	if ((index < 0) || (index >= elts))
+		return -1;
+
+	if (kvmppc_need_byteswap(vcpu))
+		offset = elts - index - 1;
+	else
+		offset = index;
+
+	return offset;
+}
+
+static inline int kvmppc_get_vmx_dword_offset(struct kvm_vcpu *vcpu,
+		int index)
+{
+	return kvmppc_get_vmx_offset_generic(vcpu, index, 8);
+}
+
+static inline int kvmppc_get_vmx_word_offset(struct kvm_vcpu *vcpu,
+		int index)
+{
+	return kvmppc_get_vmx_offset_generic(vcpu, index, 4);
+}
+
+static inline int kvmppc_get_vmx_hword_offset(struct kvm_vcpu *vcpu,
+		int index)
+{
+	return kvmppc_get_vmx_offset_generic(vcpu, index, 2);
+}
+
+static inline int kvmppc_get_vmx_byte_offset(struct kvm_vcpu *vcpu,
+		int index)
+{
+	return kvmppc_get_vmx_offset_generic(vcpu, index, 1);
+}
+
+
+static inline void kvmppc_set_vmx_dword(struct kvm_vcpu *vcpu,
+	u64 gpr)
+{
+	union kvmppc_one_reg val;
+	int offset = kvmppc_get_vmx_dword_offset(vcpu,
+			vcpu->arch.mmio_vmx_offset);
 	int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
-	u32 hi, lo;
-	u32 di;
 
-#ifdef __BIG_ENDIAN
-	hi = gpr >> 32;
-	lo = gpr & 0xffffffff;
-#else
-	lo = gpr >> 32;
-	hi = gpr & 0xffffffff;
-#endif
-
-	di = 2 - vcpu->arch.mmio_vmx_copy_nums;		/* doubleword index */
-	if (di > 1)
+	if (offset == -1)
 		return;
 
-	if (vcpu->arch.mmio_host_swabbed)
-		di = 1 - di;
+	val.vval = VCPU_VSX_VR(vcpu, index);
+	val.vsxval[offset] = gpr;
+	VCPU_VSX_VR(vcpu, index) = val.vval;
+}
 
-	VCPU_VSX_VR(vcpu, index).u[di * 2] = hi;
-	VCPU_VSX_VR(vcpu, index).u[di * 2 + 1] = lo;
+static inline void kvmppc_set_vmx_word(struct kvm_vcpu *vcpu,
+	u32 gpr32)
+{
+	union kvmppc_one_reg val;
+	int offset = kvmppc_get_vmx_word_offset(vcpu,
+			vcpu->arch.mmio_vmx_offset);
+	int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
+
+	if (offset == -1)
+		return;
+
+	val.vval = VCPU_VSX_VR(vcpu, index);
+	val.vsx32val[offset] = gpr32;
+	VCPU_VSX_VR(vcpu, index) = val.vval;
+}
+
+static inline void kvmppc_set_vmx_hword(struct kvm_vcpu *vcpu,
+	u16 gpr16)
+{
+	union kvmppc_one_reg val;
+	int offset = kvmppc_get_vmx_hword_offset(vcpu,
+			vcpu->arch.mmio_vmx_offset);
+	int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
+
+	if (offset == -1)
+		return;
+
+	val.vval = VCPU_VSX_VR(vcpu, index);
+	val.vsx16val[offset] = gpr16;
+	VCPU_VSX_VR(vcpu, index) = val.vval;
+}
+
+static inline void kvmppc_set_vmx_byte(struct kvm_vcpu *vcpu,
+	u8 gpr8)
+{
+	union kvmppc_one_reg val;
+	int offset = kvmppc_get_vmx_byte_offset(vcpu,
+			vcpu->arch.mmio_vmx_offset);
+	int index = vcpu->arch.io_gpr & KVM_MMIO_REG_MASK;
+
+	if (offset == -1)
+		return;
+
+	val.vval = VCPU_VSX_VR(vcpu, index);
+	val.vsx8val[offset] = gpr8;
+	VCPU_VSX_VR(vcpu, index) = val.vval;
 }
 #endif /* CONFIG_ALTIVEC */
 
@@ -1041,6 +1140,9 @@
 		kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, gpr);
 		break;
 	case KVM_MMIO_REG_FPR:
+		if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+			vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu, MSR_FP);
+
 		VCPU_FPR(vcpu, vcpu->arch.io_gpr & KVM_MMIO_REG_MASK) = gpr;
 		break;
 #ifdef CONFIG_PPC_BOOK3S
@@ -1054,18 +1156,36 @@
 #endif
 #ifdef CONFIG_VSX
 	case KVM_MMIO_REG_VSX:
-		if (vcpu->arch.mmio_vsx_copy_type == KVMPPC_VSX_COPY_DWORD)
+		if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+			vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu, MSR_VSX);
+
+		if (vcpu->arch.mmio_copy_type == KVMPPC_VSX_COPY_DWORD)
 			kvmppc_set_vsr_dword(vcpu, gpr);
-		else if (vcpu->arch.mmio_vsx_copy_type == KVMPPC_VSX_COPY_WORD)
+		else if (vcpu->arch.mmio_copy_type == KVMPPC_VSX_COPY_WORD)
 			kvmppc_set_vsr_word(vcpu, gpr);
-		else if (vcpu->arch.mmio_vsx_copy_type ==
+		else if (vcpu->arch.mmio_copy_type ==
 				KVMPPC_VSX_COPY_DWORD_LOAD_DUMP)
 			kvmppc_set_vsr_dword_dump(vcpu, gpr);
+		else if (vcpu->arch.mmio_copy_type ==
+				KVMPPC_VSX_COPY_WORD_LOAD_DUMP)
+			kvmppc_set_vsr_word_dump(vcpu, gpr);
 		break;
 #endif
 #ifdef CONFIG_ALTIVEC
 	case KVM_MMIO_REG_VMX:
-		kvmppc_set_vmx_dword(vcpu, gpr);
+		if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+			vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu, MSR_VEC);
+
+		if (vcpu->arch.mmio_copy_type == KVMPPC_VMX_COPY_DWORD)
+			kvmppc_set_vmx_dword(vcpu, gpr);
+		else if (vcpu->arch.mmio_copy_type == KVMPPC_VMX_COPY_WORD)
+			kvmppc_set_vmx_word(vcpu, gpr);
+		else if (vcpu->arch.mmio_copy_type ==
+				KVMPPC_VMX_COPY_HWORD)
+			kvmppc_set_vmx_hword(vcpu, gpr);
+		else if (vcpu->arch.mmio_copy_type ==
+				KVMPPC_VMX_COPY_BYTE)
+			kvmppc_set_vmx_byte(vcpu, gpr);
 		break;
 #endif
 	default:
@@ -1228,7 +1348,7 @@
 	u32 dword_offset, word_offset;
 	union kvmppc_one_reg reg;
 	int vsx_offset = 0;
-	int copy_type = vcpu->arch.mmio_vsx_copy_type;
+	int copy_type = vcpu->arch.mmio_copy_type;
 	int result = 0;
 
 	switch (copy_type) {
@@ -1344,14 +1464,16 @@
 #endif /* CONFIG_VSX */
 
 #ifdef CONFIG_ALTIVEC
-/* handle quadword load access in two halves */
-int kvmppc_handle_load128_by2x64(struct kvm_run *run, struct kvm_vcpu *vcpu,
-		unsigned int rt, int is_default_endian)
+int kvmppc_handle_vmx_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
+		unsigned int rt, unsigned int bytes, int is_default_endian)
 {
 	enum emulation_result emulated = EMULATE_DONE;
 
+	if (vcpu->arch.mmio_vsx_copy_nums > 2)
+		return EMULATE_FAIL;
+
 	while (vcpu->arch.mmio_vmx_copy_nums) {
-		emulated = __kvmppc_handle_load(run, vcpu, rt, 8,
+		emulated = __kvmppc_handle_load(run, vcpu, rt, bytes,
 				is_default_endian, 0);
 
 		if (emulated != EMULATE_DONE)
@@ -1359,55 +1481,127 @@
 
 		vcpu->arch.paddr_accessed += run->mmio.len;
 		vcpu->arch.mmio_vmx_copy_nums--;
+		vcpu->arch.mmio_vmx_offset++;
 	}
 
 	return emulated;
 }
 
-static inline int kvmppc_get_vmx_data(struct kvm_vcpu *vcpu, int rs, u64 *val)
+int kvmppc_get_vmx_dword(struct kvm_vcpu *vcpu, int index, u64 *val)
 {
-	vector128 vrs = VCPU_VSX_VR(vcpu, rs);
-	u32 di;
-	u64 w0, w1;
+	union kvmppc_one_reg reg;
+	int vmx_offset = 0;
+	int result = 0;
 
-	di = 2 - vcpu->arch.mmio_vmx_copy_nums;		/* doubleword index */
-	if (di > 1)
+	vmx_offset =
+		kvmppc_get_vmx_dword_offset(vcpu, vcpu->arch.mmio_vmx_offset);
+
+	if (vmx_offset == -1)
 		return -1;
 
-	if (vcpu->arch.mmio_host_swabbed)
-		di = 1 - di;
+	reg.vval = VCPU_VSX_VR(vcpu, index);
+	*val = reg.vsxval[vmx_offset];
 
-	w0 = vrs.u[di * 2];
-	w1 = vrs.u[di * 2 + 1];
-
-#ifdef __BIG_ENDIAN
-	*val = (w0 << 32) | w1;
-#else
-	*val = (w1 << 32) | w0;
-#endif
-	return 0;
+	return result;
 }
 
-/* handle quadword store in two halves */
-int kvmppc_handle_store128_by2x64(struct kvm_run *run, struct kvm_vcpu *vcpu,
-		unsigned int rs, int is_default_endian)
+int kvmppc_get_vmx_word(struct kvm_vcpu *vcpu, int index, u64 *val)
+{
+	union kvmppc_one_reg reg;
+	int vmx_offset = 0;
+	int result = 0;
+
+	vmx_offset =
+		kvmppc_get_vmx_word_offset(vcpu, vcpu->arch.mmio_vmx_offset);
+
+	if (vmx_offset == -1)
+		return -1;
+
+	reg.vval = VCPU_VSX_VR(vcpu, index);
+	*val = reg.vsx32val[vmx_offset];
+
+	return result;
+}
+
+int kvmppc_get_vmx_hword(struct kvm_vcpu *vcpu, int index, u64 *val)
+{
+	union kvmppc_one_reg reg;
+	int vmx_offset = 0;
+	int result = 0;
+
+	vmx_offset =
+		kvmppc_get_vmx_hword_offset(vcpu, vcpu->arch.mmio_vmx_offset);
+
+	if (vmx_offset == -1)
+		return -1;
+
+	reg.vval = VCPU_VSX_VR(vcpu, index);
+	*val = reg.vsx16val[vmx_offset];
+
+	return result;
+}
+
+int kvmppc_get_vmx_byte(struct kvm_vcpu *vcpu, int index, u64 *val)
+{
+	union kvmppc_one_reg reg;
+	int vmx_offset = 0;
+	int result = 0;
+
+	vmx_offset =
+		kvmppc_get_vmx_byte_offset(vcpu, vcpu->arch.mmio_vmx_offset);
+
+	if (vmx_offset == -1)
+		return -1;
+
+	reg.vval = VCPU_VSX_VR(vcpu, index);
+	*val = reg.vsx8val[vmx_offset];
+
+	return result;
+}
+
+int kvmppc_handle_vmx_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
+		unsigned int rs, unsigned int bytes, int is_default_endian)
 {
 	u64 val = 0;
+	unsigned int index = rs & KVM_MMIO_REG_MASK;
 	enum emulation_result emulated = EMULATE_DONE;
 
+	if (vcpu->arch.mmio_vsx_copy_nums > 2)
+		return EMULATE_FAIL;
+
 	vcpu->arch.io_gpr = rs;
 
 	while (vcpu->arch.mmio_vmx_copy_nums) {
-		if (kvmppc_get_vmx_data(vcpu, rs, &val) == -1)
-			return EMULATE_FAIL;
+		switch (vcpu->arch.mmio_copy_type) {
+		case KVMPPC_VMX_COPY_DWORD:
+			if (kvmppc_get_vmx_dword(vcpu, index, &val) == -1)
+				return EMULATE_FAIL;
 
-		emulated = kvmppc_handle_store(run, vcpu, val, 8,
+			break;
+		case KVMPPC_VMX_COPY_WORD:
+			if (kvmppc_get_vmx_word(vcpu, index, &val) == -1)
+				return EMULATE_FAIL;
+			break;
+		case KVMPPC_VMX_COPY_HWORD:
+			if (kvmppc_get_vmx_hword(vcpu, index, &val) == -1)
+				return EMULATE_FAIL;
+			break;
+		case KVMPPC_VMX_COPY_BYTE:
+			if (kvmppc_get_vmx_byte(vcpu, index, &val) == -1)
+				return EMULATE_FAIL;
+			break;
+		default:
+			return EMULATE_FAIL;
+		}
+
+		emulated = kvmppc_handle_store(run, vcpu, val, bytes,
 				is_default_endian);
 		if (emulated != EMULATE_DONE)
 			break;
 
 		vcpu->arch.paddr_accessed += run->mmio.len;
 		vcpu->arch.mmio_vmx_copy_nums--;
+		vcpu->arch.mmio_vmx_offset++;
 	}
 
 	return emulated;
@@ -1422,11 +1616,11 @@
 	vcpu->arch.paddr_accessed += run->mmio.len;
 
 	if (!vcpu->mmio_is_write) {
-		emulated = kvmppc_handle_load128_by2x64(run, vcpu,
-				vcpu->arch.io_gpr, 1);
+		emulated = kvmppc_handle_vmx_load(run, vcpu,
+				vcpu->arch.io_gpr, run->mmio.len, 1);
 	} else {
-		emulated = kvmppc_handle_store128_by2x64(run, vcpu,
-				vcpu->arch.io_gpr, 1);
+		emulated = kvmppc_handle_vmx_store(run, vcpu,
+				vcpu->arch.io_gpr, run->mmio.len, 1);
 	}
 
 	switch (emulated) {
@@ -1570,8 +1764,10 @@
 		}
 #endif
 #ifdef CONFIG_ALTIVEC
-		if (vcpu->arch.mmio_vmx_copy_nums > 0)
+		if (vcpu->arch.mmio_vmx_copy_nums > 0) {
 			vcpu->arch.mmio_vmx_copy_nums--;
+			vcpu->arch.mmio_vmx_offset++;
+		}
 
 		if (vcpu->arch.mmio_vmx_copy_nums > 0) {
 			r = kvmppc_emulate_mmio_vmx_loadstore(vcpu, run);
@@ -1784,16 +1980,16 @@
 	void __user *argp = (void __user *)arg;
 	long r;
 
-	vcpu_load(vcpu);
-
 	switch (ioctl) {
 	case KVM_ENABLE_CAP:
 	{
 		struct kvm_enable_cap cap;
 		r = -EFAULT;
+		vcpu_load(vcpu);
 		if (copy_from_user(&cap, argp, sizeof(cap)))
 			goto out;
 		r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
+		vcpu_put(vcpu);
 		break;
 	}
 
@@ -1815,9 +2011,11 @@
 	case KVM_DIRTY_TLB: {
 		struct kvm_dirty_tlb dirty;
 		r = -EFAULT;
+		vcpu_load(vcpu);
 		if (copy_from_user(&dirty, argp, sizeof(dirty)))
 			goto out;
 		r = kvm_vcpu_ioctl_dirty_tlb(vcpu, &dirty);
+		vcpu_put(vcpu);
 		break;
 	}
 #endif
@@ -1826,7 +2024,6 @@
 	}
 
 out:
-	vcpu_put(vcpu);
 	return r;
 }
 
diff --git a/arch/powerpc/kvm/tm.S b/arch/powerpc/kvm/tm.S
new file mode 100644
index 0000000..90e330f
--- /dev/null
+++ b/arch/powerpc/kvm/tm.S
@@ -0,0 +1,384 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Derived from book3s_hv_rmhandlers.S, which is:
+ *
+ * Copyright 2011 Paul Mackerras, IBM Corp. <paulus@au1.ibm.com>
+ *
+ */
+
+#include <asm/reg.h>
+#include <asm/ppc_asm.h>
+#include <asm/asm-offsets.h>
+#include <asm/export.h>
+#include <asm/tm.h>
+#include <asm/cputable.h>
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+#define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
+
+/*
+ * Save transactional state and TM-related registers.
+ * Called with:
+ * - r3 pointing to the vcpu struct
+ * - r4 points to the MSR with current TS bits:
+ * 	(For HV KVM, it is VCPU_MSR ; For PR KVM, it is host MSR).
+ * This can modify all checkpointed registers, but
+ * restores r1, r2 before exit.
+ */
+_GLOBAL(__kvmppc_save_tm)
+	mflr	r0
+	std	r0, PPC_LR_STKOFF(r1)
+
+	/* Turn on TM. */
+	mfmsr	r8
+	li	r0, 1
+	rldimi	r8, r0, MSR_TM_LG, 63-MSR_TM_LG
+	ori     r8, r8, MSR_FP
+	oris    r8, r8, (MSR_VEC | MSR_VSX)@h
+	mtmsrd	r8
+
+	rldicl. r4, r4, 64 - MSR_TS_S_LG, 62
+	beq	1f	/* TM not active in guest. */
+
+	std	r1, HSTATE_SCRATCH2(r13)
+	std	r3, HSTATE_SCRATCH1(r13)
+
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+BEGIN_FTR_SECTION
+	/* Emulation of the treclaim instruction needs TEXASR before treclaim */
+	mfspr	r6, SPRN_TEXASR
+	std	r6, VCPU_ORIG_TEXASR(r3)
+END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
+#endif
+
+	/* Clear the MSR RI since r1, r13 are all going to be foobar. */
+	li	r5, 0
+	mtmsrd	r5, 1
+
+	li	r3, TM_CAUSE_KVM_RESCHED
+
+	/* All GPRs are volatile at this point. */
+	TRECLAIM(R3)
+
+	/* Temporarily store r13 and r9 so we have some regs to play with */
+	SET_SCRATCH0(r13)
+	GET_PACA(r13)
+	std	r9, PACATMSCRATCH(r13)
+	ld	r9, HSTATE_SCRATCH1(r13)
+
+	/* Get a few more GPRs free. */
+	std	r29, VCPU_GPRS_TM(29)(r9)
+	std	r30, VCPU_GPRS_TM(30)(r9)
+	std	r31, VCPU_GPRS_TM(31)(r9)
+
+	/* Save away PPR and DSCR soon so don't run with user values. */
+	mfspr	r31, SPRN_PPR
+	HMT_MEDIUM
+	mfspr	r30, SPRN_DSCR
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	ld	r29, HSTATE_DSCR(r13)
+	mtspr	SPRN_DSCR, r29
+#endif
+
+	/* Save all but r9, r13 & r29-r31 */
+	reg = 0
+	.rept	29
+	.if (reg != 9) && (reg != 13)
+	std	reg, VCPU_GPRS_TM(reg)(r9)
+	.endif
+	reg = reg + 1
+	.endr
+	/* ... now save r13 */
+	GET_SCRATCH0(r4)
+	std	r4, VCPU_GPRS_TM(13)(r9)
+	/* ... and save r9 */
+	ld	r4, PACATMSCRATCH(r13)
+	std	r4, VCPU_GPRS_TM(9)(r9)
+
+	/* Reload stack pointer and TOC. */
+	ld	r1, HSTATE_SCRATCH2(r13)
+	ld	r2, PACATOC(r13)
+
+	/* Set MSR RI now we have r1 and r13 back. */
+	li	r5, MSR_RI
+	mtmsrd	r5, 1
+
+	/* Save away checkpinted SPRs. */
+	std	r31, VCPU_PPR_TM(r9)
+	std	r30, VCPU_DSCR_TM(r9)
+	mflr	r5
+	mfcr	r6
+	mfctr	r7
+	mfspr	r8, SPRN_AMR
+	mfspr	r10, SPRN_TAR
+	mfxer	r11
+	std	r5, VCPU_LR_TM(r9)
+	stw	r6, VCPU_CR_TM(r9)
+	std	r7, VCPU_CTR_TM(r9)
+	std	r8, VCPU_AMR_TM(r9)
+	std	r10, VCPU_TAR_TM(r9)
+	std	r11, VCPU_XER_TM(r9)
+
+	/* Restore r12 as trap number. */
+	lwz	r12, VCPU_TRAP(r9)
+
+	/* Save FP/VSX. */
+	addi	r3, r9, VCPU_FPRS_TM
+	bl	store_fp_state
+	addi	r3, r9, VCPU_VRS_TM
+	bl	store_vr_state
+	mfspr	r6, SPRN_VRSAVE
+	stw	r6, VCPU_VRSAVE_TM(r9)
+1:
+	/*
+	 * We need to save these SPRs after the treclaim so that the software
+	 * error code is recorded correctly in the TEXASR.  Also the user may
+	 * change these outside of a transaction, so they must always be
+	 * context switched.
+	 */
+	mfspr	r7, SPRN_TEXASR
+	std	r7, VCPU_TEXASR(r9)
+11:
+	mfspr	r5, SPRN_TFHAR
+	mfspr	r6, SPRN_TFIAR
+	std	r5, VCPU_TFHAR(r9)
+	std	r6, VCPU_TFIAR(r9)
+
+	ld	r0, PPC_LR_STKOFF(r1)
+	mtlr	r0
+	blr
+
+/*
+ * _kvmppc_save_tm_pr() is a wrapper around __kvmppc_save_tm(), so that it can
+ * be invoked from C function by PR KVM only.
+ */
+_GLOBAL(_kvmppc_save_tm_pr)
+	mflr	r5
+	std	r5, PPC_LR_STKOFF(r1)
+	stdu    r1, -SWITCH_FRAME_SIZE(r1)
+	SAVE_NVGPRS(r1)
+
+	/* save MSR since TM/math bits might be impacted
+	 * by __kvmppc_save_tm().
+	 */
+	mfmsr	r5
+	SAVE_GPR(5, r1)
+
+	/* also save DSCR/CR/TAR so that it can be recovered later */
+	mfspr   r6, SPRN_DSCR
+	SAVE_GPR(6, r1)
+
+	mfcr    r7
+	stw     r7, _CCR(r1)
+
+	mfspr   r8, SPRN_TAR
+	SAVE_GPR(8, r1)
+
+	bl	__kvmppc_save_tm
+
+	REST_GPR(8, r1)
+	mtspr   SPRN_TAR, r8
+
+	ld      r7, _CCR(r1)
+	mtcr	r7
+
+	REST_GPR(6, r1)
+	mtspr   SPRN_DSCR, r6
+
+	/* need preserve current MSR's MSR_TS bits */
+	REST_GPR(5, r1)
+	mfmsr   r6
+	rldicl  r6, r6, 64 - MSR_TS_S_LG, 62
+	rldimi  r5, r6, MSR_TS_S_LG, 63 - MSR_TS_T_LG
+	mtmsrd  r5
+
+	REST_NVGPRS(r1)
+	addi    r1, r1, SWITCH_FRAME_SIZE
+	ld	r5, PPC_LR_STKOFF(r1)
+	mtlr	r5
+	blr
+
+EXPORT_SYMBOL_GPL(_kvmppc_save_tm_pr);
+
+/*
+ * Restore transactional state and TM-related registers.
+ * Called with:
+ *  - r3 pointing to the vcpu struct.
+ *  - r4 is the guest MSR with desired TS bits:
+ * 	For HV KVM, it is VCPU_MSR
+ * 	For PR KVM, it is provided by caller
+ * This potentially modifies all checkpointed registers.
+ * It restores r1, r2 from the PACA.
+ */
+_GLOBAL(__kvmppc_restore_tm)
+	mflr	r0
+	std	r0, PPC_LR_STKOFF(r1)
+
+	/* Turn on TM/FP/VSX/VMX so we can restore them. */
+	mfmsr	r5
+	li	r6, MSR_TM >> 32
+	sldi	r6, r6, 32
+	or	r5, r5, r6
+	ori	r5, r5, MSR_FP
+	oris	r5, r5, (MSR_VEC | MSR_VSX)@h
+	mtmsrd	r5
+
+	/*
+	 * The user may change these outside of a transaction, so they must
+	 * always be context switched.
+	 */
+	ld	r5, VCPU_TFHAR(r3)
+	ld	r6, VCPU_TFIAR(r3)
+	ld	r7, VCPU_TEXASR(r3)
+	mtspr	SPRN_TFHAR, r5
+	mtspr	SPRN_TFIAR, r6
+	mtspr	SPRN_TEXASR, r7
+
+	mr	r5, r4
+	rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
+	beqlr		/* TM not active in guest */
+	std	r1, HSTATE_SCRATCH2(r13)
+
+	/* Make sure the failure summary is set, otherwise we'll program check
+	 * when we trechkpt.  It's possible that this might have been not set
+	 * on a kvmppc_set_one_reg() call but we shouldn't let this crash the
+	 * host.
+	 */
+	oris	r7, r7, (TEXASR_FS)@h
+	mtspr	SPRN_TEXASR, r7
+
+	/*
+	 * We need to load up the checkpointed state for the guest.
+	 * We need to do this early as it will blow away any GPRs, VSRs and
+	 * some SPRs.
+	 */
+
+	mr	r31, r3
+	addi	r3, r31, VCPU_FPRS_TM
+	bl	load_fp_state
+	addi	r3, r31, VCPU_VRS_TM
+	bl	load_vr_state
+	mr	r3, r31
+	lwz	r7, VCPU_VRSAVE_TM(r3)
+	mtspr	SPRN_VRSAVE, r7
+
+	ld	r5, VCPU_LR_TM(r3)
+	lwz	r6, VCPU_CR_TM(r3)
+	ld	r7, VCPU_CTR_TM(r3)
+	ld	r8, VCPU_AMR_TM(r3)
+	ld	r9, VCPU_TAR_TM(r3)
+	ld	r10, VCPU_XER_TM(r3)
+	mtlr	r5
+	mtcr	r6
+	mtctr	r7
+	mtspr	SPRN_AMR, r8
+	mtspr	SPRN_TAR, r9
+	mtxer	r10
+
+	/*
+	 * Load up PPR and DSCR values but don't put them in the actual SPRs
+	 * till the last moment to avoid running with userspace PPR and DSCR for
+	 * too long.
+	 */
+	ld	r29, VCPU_DSCR_TM(r3)
+	ld	r30, VCPU_PPR_TM(r3)
+
+	std	r2, PACATMSCRATCH(r13) /* Save TOC */
+
+	/* Clear the MSR RI since r1, r13 are all going to be foobar. */
+	li	r5, 0
+	mtmsrd	r5, 1
+
+	/* Load GPRs r0-r28 */
+	reg = 0
+	.rept	29
+	ld	reg, VCPU_GPRS_TM(reg)(r31)
+	reg = reg + 1
+	.endr
+
+	mtspr	SPRN_DSCR, r29
+	mtspr	SPRN_PPR, r30
+
+	/* Load final GPRs */
+	ld	29, VCPU_GPRS_TM(29)(r31)
+	ld	30, VCPU_GPRS_TM(30)(r31)
+	ld	31, VCPU_GPRS_TM(31)(r31)
+
+	/* TM checkpointed state is now setup.  All GPRs are now volatile. */
+	TRECHKPT
+
+	/* Now let's get back the state we need. */
+	HMT_MEDIUM
+	GET_PACA(r13)
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	ld	r29, HSTATE_DSCR(r13)
+	mtspr	SPRN_DSCR, r29
+#endif
+	ld	r1, HSTATE_SCRATCH2(r13)
+	ld	r2, PACATMSCRATCH(r13)
+
+	/* Set the MSR RI since we have our registers back. */
+	li	r5, MSR_RI
+	mtmsrd	r5, 1
+	ld	r0, PPC_LR_STKOFF(r1)
+	mtlr	r0
+	blr
+
+/*
+ * _kvmppc_restore_tm_pr() is a wrapper around __kvmppc_restore_tm(), so that it
+ * can be invoked from C function by PR KVM only.
+ */
+_GLOBAL(_kvmppc_restore_tm_pr)
+	mflr	r5
+	std	r5, PPC_LR_STKOFF(r1)
+	stdu    r1, -SWITCH_FRAME_SIZE(r1)
+	SAVE_NVGPRS(r1)
+
+	/* save MSR to avoid TM/math bits change */
+	mfmsr	r5
+	SAVE_GPR(5, r1)
+
+	/* also save DSCR/CR/TAR so that it can be recovered later */
+	mfspr   r6, SPRN_DSCR
+	SAVE_GPR(6, r1)
+
+	mfcr    r7
+	stw     r7, _CCR(r1)
+
+	mfspr   r8, SPRN_TAR
+	SAVE_GPR(8, r1)
+
+	bl	__kvmppc_restore_tm
+
+	REST_GPR(8, r1)
+	mtspr   SPRN_TAR, r8
+
+	ld      r7, _CCR(r1)
+	mtcr	r7
+
+	REST_GPR(6, r1)
+	mtspr   SPRN_DSCR, r6
+
+	/* need preserve current MSR's MSR_TS bits */
+	REST_GPR(5, r1)
+	mfmsr   r6
+	rldicl  r6, r6, 64 - MSR_TS_S_LG, 62
+	rldimi  r5, r6, MSR_TS_S_LG, 63 - MSR_TS_T_LG
+	mtmsrd  r5
+
+	REST_NVGPRS(r1)
+	addi    r1, r1, SWITCH_FRAME_SIZE
+	ld	r5, PPC_LR_STKOFF(r1)
+	mtlr	r5
+	blr
+
+EXPORT_SYMBOL_GPL(_kvmppc_restore_tm_pr);
+#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 42e581a..f12680c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -32,6 +32,7 @@
 	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_GENERIC_DMA_COHERENT
+	select HAVE_PERF_EVENTS
 	select IRQ_DOMAIN
 	select NO_BOOTMEM
 	select RISCV_ISA_A if SMP
@@ -193,6 +194,19 @@
 config RISCV_ISA_A
 	def_bool y
 
+menu "supported PMU type"
+	depends on PERF_EVENTS
+
+config RISCV_BASE_PMU
+	bool "Base Performance Monitoring Unit"
+	def_bool y
+	help
+	  A base PMU that serves as a reference implementation and has limited
+	  feature of perf.  It can run on any RISC-V machines so serves as the
+	  fallback, but this option can also be disable to reduce kernel size.
+
+endmenu
+
 endmenu
 
 menu "Kernel type"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 76e958a..6d4a5f6c 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -71,6 +71,9 @@
 # architectures.  It's faster to have GCC emit only aligned accesses.
 KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
 
+# arch specific predefines for sparse
+CHECKFLAGS += -D__riscv -D__riscv_xlen=$(BITS)
+
 head-y := arch/riscv/kernel/head.o
 
 core-y += arch/riscv/kernel/ arch/riscv/mm/
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index bca0eee..0732646 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -44,6 +44,7 @@
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_HVC_RISCV_SBI=y
 # CONFIG_PTP_1588_CLOCK is not set
 CONFIG_DRM=y
 CONFIG_DRM_RADEON=y
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 4286a5f..576ffdca 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -25,6 +25,7 @@
 generic-y += kmap_types.h
 generic-y += kvm_para.h
 generic-y += local.h
+generic-y += local64.h
 generic-y += mm-arch-hooks.h
 generic-y += mman.h
 generic-y += module.h
diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index efd89a8..8f13074 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -47,7 +47,7 @@
 
 #else /* CONFIG_SMP */
 
-#define flush_icache_all() sbi_remote_fence_i(0)
+#define flush_icache_all() sbi_remote_fence_i(NULL)
 void flush_icache_mm(struct mm_struct *mm, bool local);
 
 #endif /* CONFIG_SMP */
diff --git a/arch/riscv/include/asm/perf_event.h b/arch/riscv/include/asm/perf_event.h
new file mode 100644
index 0000000..0e638a0
--- /dev/null
+++ b/arch/riscv/include/asm/perf_event.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2018 SiFive
+ * Copyright (C) 2018 Andes Technology Corporation
+ *
+ */
+
+#ifndef _ASM_RISCV_PERF_EVENT_H
+#define _ASM_RISCV_PERF_EVENT_H
+
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
+
+#define RISCV_BASE_COUNTERS	2
+
+/*
+ * The RISCV_MAX_COUNTERS parameter should be specified.
+ */
+
+#ifdef CONFIG_RISCV_BASE_PMU
+#define RISCV_MAX_COUNTERS	2
+#endif
+
+#ifndef RISCV_MAX_COUNTERS
+#error "Please provide a valid RISCV_MAX_COUNTERS for the PMU."
+#endif
+
+/*
+ * These are the indexes of bits in counteren register *minus* 1,
+ * except for cycle.  It would be coherent if it can directly mapped
+ * to counteren bit definition, but there is a *time* register at
+ * counteren[1].  Per-cpu structure is scarce resource here.
+ *
+ * According to the spec, an implementation can support counter up to
+ * mhpmcounter31, but many high-end processors has at most 6 general
+ * PMCs, we give the definition to MHPMCOUNTER8 here.
+ */
+#define RISCV_PMU_CYCLE		0
+#define RISCV_PMU_INSTRET	1
+#define RISCV_PMU_MHPMCOUNTER3	2
+#define RISCV_PMU_MHPMCOUNTER4	3
+#define RISCV_PMU_MHPMCOUNTER5	4
+#define RISCV_PMU_MHPMCOUNTER6	5
+#define RISCV_PMU_MHPMCOUNTER7	6
+#define RISCV_PMU_MHPMCOUNTER8	7
+
+#define RISCV_OP_UNSUPP		(-EOPNOTSUPP)
+
+struct cpu_hw_events {
+	/* # currently enabled events*/
+	int			n_events;
+	/* currently enabled events */
+	struct perf_event	*events[RISCV_MAX_COUNTERS];
+	/* vendor-defined PMU data */
+	void			*platform;
+};
+
+struct riscv_pmu {
+	struct pmu	*pmu;
+
+	/* generic hw/cache events table */
+	const int	*hw_events;
+	const int	(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+				       [PERF_COUNT_HW_CACHE_OP_MAX]
+				       [PERF_COUNT_HW_CACHE_RESULT_MAX];
+	/* method used to map hw/cache events */
+	int		(*map_hw_event)(u64 config);
+	int		(*map_cache_event)(u64 config);
+
+	/* max generic hw events in map */
+	int		max_events;
+	/* number total counters, 2(base) + x(general) */
+	int		num_counters;
+	/* the width of the counter */
+	int		counter_width;
+
+	/* vendor-defined PMU features */
+	void		*platform;
+
+	irqreturn_t	(*handle_irq)(int irq_num, void *dev);
+	int		irq;
+};
+
+#endif /* _ASM_RISCV_PERF_EVENT_H */
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 7b209ae..85c2d8b 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -49,7 +49,7 @@
 
 #include <asm/sbi.h>
 
-#define flush_tlb_all() sbi_remote_sfence_vma(0, 0, -1)
+#define flush_tlb_all() sbi_remote_sfence_vma(NULL, 0, -1)
 #define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, 0)
 #define flush_tlb_range(vma, start, end) \
 	sbi_remote_sfence_vma(mm_cpumask((vma)->vm_mm)->bits, \
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 14b0b22..473cfc8 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -392,19 +392,21 @@
 })
 
 
-extern unsigned long __must_check __copy_user(void __user *to,
+extern unsigned long __must_check __asm_copy_to_user(void __user *to,
+	const void *from, unsigned long n);
+extern unsigned long __must_check __asm_copy_from_user(void *to,
 	const void __user *from, unsigned long n);
 
 static inline unsigned long
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	return __copy_user(to, from, n);
+	return __asm_copy_to_user(to, from, n);
 }
 
 static inline unsigned long
 raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	return __copy_user(to, from, n);
+	return __asm_copy_from_user(to, from, n);
 }
 
 extern long strncpy_from_user(char *dest, const char __user *src, long count);
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 8586dd9..e1274fc 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -39,4 +39,6 @@
 obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o ftrace.o
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= mcount-dyn.o
 
+obj-$(CONFIG_PERF_EVENTS)      += perf_event.o
+
 clean:
diff --git a/arch/riscv/kernel/mcount.S b/arch/riscv/kernel/mcount.S
index ce9bdc5..5721624 100644
--- a/arch/riscv/kernel/mcount.S
+++ b/arch/riscv/kernel/mcount.S
@@ -126,5 +126,5 @@
 	RESTORE_ABI_STATE
 	ret
 ENDPROC(_mcount)
-EXPORT_SYMBOL(_mcount)
 #endif
+EXPORT_SYMBOL(_mcount)
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 5dddba3..1d5e9b9 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -17,6 +17,17 @@
 #include <linux/errno.h>
 #include <linux/moduleloader.h>
 
+static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
+{
+	if (v != (u32)v) {
+		pr_err("%s: value %016llx out of range for 32-bit field\n",
+		       me->name, v);
+		return -EINVAL;
+	}
+	*location = v;
+	return 0;
+}
+
 static int apply_r_riscv_64_rela(struct module *me, u32 *location, Elf_Addr v)
 {
 	*(u64 *)location = v;
@@ -265,6 +276,7 @@
 
 static int (*reloc_handlers_rela[]) (struct module *me, u32 *location,
 				Elf_Addr v) = {
+	[R_RISCV_32]			= apply_r_riscv_32_rela,
 	[R_RISCV_64]			= apply_r_riscv_64_rela,
 	[R_RISCV_BRANCH]		= apply_r_riscv_branch_rela,
 	[R_RISCV_JAL]			= apply_r_riscv_jal_rela,
diff --git a/arch/riscv/kernel/perf_event.c b/arch/riscv/kernel/perf_event.c
new file mode 100644
index 0000000..b0e10c4
--- /dev/null
+++ b/arch/riscv/kernel/perf_event.c
@@ -0,0 +1,485 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2008 Thomas Gleixner <tglx@linutronix.de>
+ * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar
+ * Copyright (C) 2009 Jaswinder Singh Rajput
+ * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter
+ * Copyright (C) 2008-2009 Red Hat, Inc., Peter Zijlstra
+ * Copyright (C) 2009 Intel Corporation, <markus.t.metzger@intel.com>
+ * Copyright (C) 2009 Google, Inc., Stephane Eranian
+ * Copyright 2014 Tilera Corporation. All Rights Reserved.
+ * Copyright (C) 2018 Andes Technology Corporation
+ *
+ * Perf_events support for RISC-V platforms.
+ *
+ * Since the spec. (as of now, Priv-Spec 1.10) does not provide enough
+ * functionality for perf event to fully work, this file provides
+ * the very basic framework only.
+ *
+ * For platform portings, please check Documentations/riscv/pmu.txt.
+ *
+ * The Copyright line includes x86 and tile ones.
+ */
+
+#include <linux/kprobes.h>
+#include <linux/kernel.h>
+#include <linux/kdebug.h>
+#include <linux/mutex.h>
+#include <linux/bitmap.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/perf_event.h>
+#include <linux/atomic.h>
+#include <linux/of.h>
+#include <asm/perf_event.h>
+
+static const struct riscv_pmu *riscv_pmu __read_mostly;
+static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
+
+/*
+ * Hardware & cache maps and their methods
+ */
+
+static const int riscv_hw_event_map[] = {
+	[PERF_COUNT_HW_CPU_CYCLES]		= RISCV_PMU_CYCLE,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= RISCV_PMU_INSTRET,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_CACHE_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BUS_CYCLES]		= RISCV_OP_UNSUPP,
+};
+
+#define C(x) PERF_COUNT_HW_CACHE_##x
+static const int riscv_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+[PERF_COUNT_HW_CACHE_OP_MAX]
+[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] =  RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] =  RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+};
+
+static int riscv_map_hw_event(u64 config)
+{
+	if (config >= riscv_pmu->max_events)
+		return -EINVAL;
+
+	return riscv_pmu->hw_events[config];
+}
+
+int riscv_map_cache_decode(u64 config, unsigned int *type,
+			   unsigned int *op, unsigned int *result)
+{
+	return -ENOENT;
+}
+
+static int riscv_map_cache_event(u64 config)
+{
+	unsigned int type, op, result;
+	int err = -ENOENT;
+		int code;
+
+	err = riscv_map_cache_decode(config, &type, &op, &result);
+	if (!riscv_pmu->cache_events || err)
+		return err;
+
+	if (type >= PERF_COUNT_HW_CACHE_MAX ||
+	    op >= PERF_COUNT_HW_CACHE_OP_MAX ||
+	    result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+		return -EINVAL;
+
+	code = (*riscv_pmu->cache_events)[type][op][result];
+	if (code == RISCV_OP_UNSUPP)
+		return -EINVAL;
+
+	return code;
+}
+
+/*
+ * Low-level functions: reading/writing counters
+ */
+
+static inline u64 read_counter(int idx)
+{
+	u64 val = 0;
+
+	switch (idx) {
+	case RISCV_PMU_CYCLE:
+		val = csr_read(cycle);
+		break;
+	case RISCV_PMU_INSTRET:
+		val = csr_read(instret);
+		break;
+	default:
+		WARN_ON_ONCE(idx < 0 ||	idx > RISCV_MAX_COUNTERS);
+		return -EINVAL;
+	}
+
+	return val;
+}
+
+static inline void write_counter(int idx, u64 value)
+{
+	/* currently not supported */
+	WARN_ON_ONCE(1);
+}
+
+/*
+ * pmu->read: read and update the counter
+ *
+ * Other architectures' implementation often have a xxx_perf_event_update
+ * routine, which can return counter values when called in the IRQ, but
+ * return void when being called by the pmu->read method.
+ */
+static void riscv_pmu_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev_raw_count, new_raw_count;
+	u64 oldval;
+	int idx = hwc->idx;
+	u64 delta;
+
+	do {
+		prev_raw_count = local64_read(&hwc->prev_count);
+		new_raw_count = read_counter(idx);
+
+		oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+					 new_raw_count);
+	} while (oldval != prev_raw_count);
+
+	/*
+	 * delta is the value to update the counter we maintain in the kernel.
+	 */
+	delta = (new_raw_count - prev_raw_count) &
+		((1ULL << riscv_pmu->counter_width) - 1);
+	local64_add(delta, &event->count);
+	/*
+	 * Something like local64_sub(delta, &hwc->period_left) here is
+	 * needed if there is an interrupt for perf.
+	 */
+}
+
+/*
+ * State transition functions:
+ *
+ * stop()/start() & add()/del()
+ */
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void riscv_pmu_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		riscv_pmu->pmu->read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+/*
+ * pmu->start: start the event.
+ */
+static void riscv_pmu_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+		return;
+
+	if (flags & PERF_EF_RELOAD) {
+		WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
+
+		/*
+		 * Set the counter to the period to the next interrupt here,
+		 * if you have any.
+		 */
+	}
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+
+	/*
+	 * Since we cannot write to counters, this serves as an initialization
+	 * to the delta-mechanism in pmu->read(); otherwise, the delta would be
+	 * wrong when pmu->read is called for the first time.
+	 */
+	local64_set(&hwc->prev_count, read_counter(hwc->idx));
+}
+
+/*
+ * pmu->add: add the event to PMU.
+ */
+static int riscv_pmu_add(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (cpuc->n_events == riscv_pmu->num_counters)
+		return -ENOSPC;
+
+	/*
+	 * We don't have general conunters, so no binding-event-to-counter
+	 * process here.
+	 *
+	 * Indexing using hwc->config generally not works, since config may
+	 * contain extra information, but here the only info we have in
+	 * hwc->config is the event index.
+	 */
+	hwc->idx = hwc->config;
+	cpuc->events[hwc->idx] = event;
+	cpuc->n_events++;
+
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		riscv_pmu->pmu->start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+/*
+ * pmu->del: delete the event from PMU.
+ */
+static void riscv_pmu_del(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	cpuc->events[hwc->idx] = NULL;
+	cpuc->n_events--;
+	riscv_pmu->pmu->stop(event, PERF_EF_UPDATE);
+	perf_event_update_userpage(event);
+}
+
+/*
+ * Interrupt: a skeletion for reference.
+ */
+
+static DEFINE_MUTEX(pmc_reserve_mutex);
+
+irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev)
+{
+	return IRQ_NONE;
+}
+
+static int reserve_pmc_hardware(void)
+{
+	int err = 0;
+
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0 && riscv_pmu->handle_irq) {
+		err = request_irq(riscv_pmu->irq, riscv_pmu->handle_irq,
+				  IRQF_PERCPU, "riscv-base-perf", NULL);
+	}
+	mutex_unlock(&pmc_reserve_mutex);
+
+	return err;
+}
+
+void release_pmc_hardware(void)
+{
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0)
+		free_irq(riscv_pmu->irq, NULL);
+	mutex_unlock(&pmc_reserve_mutex);
+}
+
+/*
+ * Event Initialization/Finalization
+ */
+
+static atomic_t riscv_active_events = ATOMIC_INIT(0);
+
+static void riscv_event_destroy(struct perf_event *event)
+{
+	if (atomic_dec_return(&riscv_active_events) == 0)
+		release_pmc_hardware();
+}
+
+static int riscv_event_init(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+	struct hw_perf_event *hwc = &event->hw;
+	int err;
+	int code;
+
+	if (atomic_inc_return(&riscv_active_events) == 1) {
+		err = reserve_pmc_hardware();
+
+		if (err) {
+			pr_warn("PMC hardware not available\n");
+			atomic_dec(&riscv_active_events);
+			return -EBUSY;
+		}
+	}
+
+	switch (event->attr.type) {
+	case PERF_TYPE_HARDWARE:
+		code = riscv_pmu->map_hw_event(attr->config);
+		break;
+	case PERF_TYPE_HW_CACHE:
+		code = riscv_pmu->map_cache_event(attr->config);
+		break;
+	case PERF_TYPE_RAW:
+		return -EOPNOTSUPP;
+	default:
+		return -ENOENT;
+	}
+
+	event->destroy = riscv_event_destroy;
+	if (code < 0) {
+		event->destroy(event);
+		return code;
+	}
+
+	/*
+	 * idx is set to -1 because the index of a general event should not be
+	 * decided until binding to some counter in pmu->add().
+	 *
+	 * But since we don't have such support, later in pmu->add(), we just
+	 * use hwc->config as the index instead.
+	 */
+	hwc->config = code;
+	hwc->idx = -1;
+
+	return 0;
+}
+
+/*
+ * Initialization
+ */
+
+static struct pmu min_pmu = {
+	.name		= "riscv-base",
+	.event_init	= riscv_event_init,
+	.add		= riscv_pmu_add,
+	.del		= riscv_pmu_del,
+	.start		= riscv_pmu_start,
+	.stop		= riscv_pmu_stop,
+	.read		= riscv_pmu_read,
+};
+
+static const struct riscv_pmu riscv_base_pmu = {
+	.pmu = &min_pmu,
+	.max_events = ARRAY_SIZE(riscv_hw_event_map),
+	.map_hw_event = riscv_map_hw_event,
+	.hw_events = riscv_hw_event_map,
+	.map_cache_event = riscv_map_cache_event,
+	.cache_events = &riscv_cache_event_map,
+	.counter_width = 63,
+	.num_counters = RISCV_BASE_COUNTERS + 0,
+	.handle_irq = &riscv_base_pmu_handle_irq,
+
+	/* This means this PMU has no IRQ. */
+	.irq = -1,
+};
+
+static const struct of_device_id riscv_pmu_of_ids[] = {
+	{.compatible = "riscv,base-pmu",	.data = &riscv_base_pmu},
+	{ /* sentinel value */ }
+};
+
+int __init init_hw_perf_events(void)
+{
+	struct device_node *node = of_find_node_by_type(NULL, "pmu");
+	const struct of_device_id *of_id;
+
+	riscv_pmu = &riscv_base_pmu;
+
+	if (node) {
+		of_id = of_match_node(riscv_pmu_of_ids, node);
+
+		if (of_id)
+			riscv_pmu = of_id->data;
+	}
+
+	perf_pmu_register(riscv_pmu->pmu, "cpu", PERF_TYPE_RAW);
+	return 0;
+}
+arch_initcall(init_hw_perf_events);
diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c
index 5517342..f247d6d 100644
--- a/arch/riscv/kernel/riscv_ksyms.c
+++ b/arch/riscv/kernel/riscv_ksyms.c
@@ -13,6 +13,7 @@
  * Assembly functions that may be used (directly or indirectly) by modules
  */
 EXPORT_SYMBOL(__clear_user);
-EXPORT_SYMBOL(__copy_user);
+EXPORT_SYMBOL(__asm_copy_to_user);
+EXPORT_SYMBOL(__asm_copy_from_user);
 EXPORT_SYMBOL(memset);
 EXPORT_SYMBOL(memcpy);
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index b99d9dd..81a1952 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -148,7 +148,7 @@
 
 	if (pc < PAGE_OFFSET)
 		return 0;
-	if (probe_kernel_address((bug_insn_t __user *)pc, insn))
+	if (probe_kernel_address((bug_insn_t *)pc, insn))
 		return 0;
 	return (insn == __BUG_INSN);
 }
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
index 58fb287..399e6f0 100644
--- a/arch/riscv/lib/uaccess.S
+++ b/arch/riscv/lib/uaccess.S
@@ -13,7 +13,8 @@
 	.previous
 	.endm
 
-ENTRY(__copy_user)
+ENTRY(__asm_copy_to_user)
+ENTRY(__asm_copy_from_user)
 
 	/* Enable access to user memory */
 	li t6, SR_SUM
@@ -63,7 +64,8 @@
 	addi a0, a0, 1
 	bltu a1, a3, 5b
 	j 3b
-ENDPROC(__copy_user)
+ENDPROC(__asm_copy_to_user)
+ENDPROC(__asm_copy_from_user)
 
 
 ENTRY(__clear_user)
@@ -84,7 +86,7 @@
 	bgeu t0, t1, 2f
 	bltu a0, t0, 4f
 1:
-	fixup REG_S, zero, (a0), 10f
+	fixup REG_S, zero, (a0), 11f
 	addi a0, a0, SZREG
 	bltu a0, t1, 1b
 2:
@@ -96,12 +98,12 @@
 	li a0, 0
 	ret
 4: /* Edge case: unalignment */
-	fixup sb, zero, (a0), 10f
+	fixup sb, zero, (a0), 11f
 	addi a0, a0, 1
 	bltu a0, t0, 4b
 	j 1b
 5: /* Edge case: remainder */
-	fixup sb, zero, (a0), 10f
+	fixup sb, zero, (a0), 11f
 	addi a0, a0, 1
 	bltu a0, a3, 5b
 	j 3b
@@ -109,9 +111,14 @@
 
 	.section .fixup,"ax"
 	.balign 4
+	/* Fixup code for __copy_user(10) and __clear_user(11) */
 10:
 	/* Disable access to user memory */
 	csrs sstatus, t6
-	sub a0, a3, a0
+	mv a0, a2
+	ret
+11:
+	csrs sstatus, t6
+	mv a0, a1
 	ret
 	.previous
diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index 627075e..50ee3bb 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -188,7 +188,7 @@
 	if (strncmp(transport, TRANS_TAP, TRANS_TAP_LEN) == 0)
 		return (vec_rx | VECTOR_BPF);
 	if (strncmp(transport, TRANS_RAW, TRANS_RAW_LEN) == 0)
-		return (vec_rx | vec_tx);
+		return (vec_rx | vec_tx | VECTOR_QDISC_BYPASS);
 	return (vec_rx | vec_tx);
 }
 
@@ -504,15 +504,19 @@
 
 	result = kmalloc(sizeof(struct vector_queue), GFP_KERNEL);
 	if (result == NULL)
-		goto out_fail;
+		return NULL;
 	result->max_depth = max_size;
 	result->dev = vp->dev;
 	result->mmsg_vector = kmalloc(
 		(sizeof(struct mmsghdr) * max_size), GFP_KERNEL);
+	if (result->mmsg_vector == NULL)
+		goto out_mmsg_fail;
 	result->skbuff_vector = kmalloc(
 		(sizeof(void *) * max_size), GFP_KERNEL);
-	if (result->mmsg_vector == NULL || result->skbuff_vector == NULL)
-		goto out_fail;
+	if (result->skbuff_vector == NULL)
+		goto out_skb_fail;
+
+	/* further failures can be handled safely by destroy_queue*/
 
 	mmsg_vector = result->mmsg_vector;
 	for (i = 0; i < max_size; i++) {
@@ -563,6 +567,11 @@
 	result->head = 0;
 	result->tail = 0;
 	return result;
+out_skb_fail:
+	kfree(result->mmsg_vector);
+out_mmsg_fail:
+	kfree(result);
+	return NULL;
 out_fail:
 	destroy_queue(result);
 	return NULL;
@@ -1232,9 +1241,8 @@
 
 	if ((vp->options & VECTOR_QDISC_BYPASS) != 0) {
 		if (!uml_raw_enable_qdisc_bypass(vp->fds->rx_fd))
-			vp->options = vp->options | VECTOR_BPF;
+			vp->options |= VECTOR_BPF;
 	}
-
 	if ((vp->options & VECTOR_BPF) != 0)
 		vp->bpf = uml_vector_default_bpf(vp->fds->rx_fd, dev->dev_addr);
 
diff --git a/arch/um/include/asm/common.lds.S b/arch/um/include/asm/common.lds.S
index b30d73c..7adb4e6 100644
--- a/arch/um/include/asm/common.lds.S
+++ b/arch/um/include/asm/common.lds.S
@@ -53,12 +53,6 @@
 	CON_INITCALL
   }
 
-  .uml.initcall.init : {
-	__uml_initcall_start = .;
-	*(.uml.initcall.init)
-	__uml_initcall_end = .;
-  }
-
   SECURITY_INIT
 
   .exitcall : {
diff --git a/arch/um/include/shared/init.h b/arch/um/include/shared/init.h
index b3f5865..c66de43 100644
--- a/arch/um/include/shared/init.h
+++ b/arch/um/include/shared/init.h
@@ -64,14 +64,10 @@
         int (*setup_func)(char *, int *);
 };
 
-extern initcall_t __uml_initcall_start, __uml_initcall_end;
 extern initcall_t __uml_postsetup_start, __uml_postsetup_end;
 extern const char *__uml_help_start, *__uml_help_end;
 #endif
 
-#define __uml_initcall(fn)					  	\
-	static initcall_t __uml_initcall_##fn __uml_init_call = fn
-
 #define __uml_exitcall(fn)						\
 	static exitcall_t __uml_exitcall_##fn __uml_exit_call = fn
 
@@ -108,7 +104,6 @@
  */
 #define __uml_init_setup	__used __section(.uml.setup.init)
 #define __uml_setup_help	__used __section(.uml.help.init)
-#define __uml_init_call		__used __section(.uml.initcall.init)
 #define __uml_postsetup_call	__used __section(.uml.postsetup.init)
 #define __uml_exit_call		__used __section(.uml.exitcall.exit)
 
diff --git a/arch/um/os-Linux/main.c b/arch/um/os-Linux/main.c
index 5f970ec..f1fee2b 100644
--- a/arch/um/os-Linux/main.c
+++ b/arch/um/os-Linux/main.c
@@ -40,17 +40,6 @@
 	}
 }
 
-static __init void do_uml_initcalls(void)
-{
-	initcall_t *call;
-
-	call = &__uml_initcall_start;
-	while (call < &__uml_initcall_end) {
-		(*call)();
-		call++;
-	}
-}
-
 static void last_ditch_exit(int sig)
 {
 	uml_cleanup();
@@ -151,7 +140,6 @@
 	scan_elf_aux(envp);
 #endif
 
-	do_uml_initcalls();
 	change_sig(SIGPIPE, 0);
 	ret = linux_main(argc, argv);
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d0dd35d5..559a12b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4429,16 +4429,14 @@
 			goto out_vmcs;
 		memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE);
 
-#if IS_ENABLED(CONFIG_HYPERV)
-		if (static_branch_unlikely(&enable_evmcs) &&
+		if (IS_ENABLED(CONFIG_HYPERV) &&
+		    static_branch_unlikely(&enable_evmcs) &&
 		    (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)) {
 			struct hv_enlightened_vmcs *evmcs =
 				(struct hv_enlightened_vmcs *)loaded_vmcs->vmcs;
 
 			evmcs->hv_enlightenments_control.msr_bitmap = 1;
 		}
-#endif
-
 	}
 	return 0;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6bcecc3..0046aa7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8567,7 +8567,7 @@
 		/*
 		 * Make sure the user can only configure tsc_khz values that
 		 * fit into a signed integer.
-		 * A min value is not calculated needed because it will always
+		 * A min value is not calculated because it will always
 		 * be 1 on all machines.
 		 */
 		u64 max = min(0x7fffffffULL,
diff --git a/drivers/gpu/drm/shmobile/Kconfig b/drivers/gpu/drm/shmobile/Kconfig
index c987c82..0426d66 100644
--- a/drivers/gpu/drm/shmobile/Kconfig
+++ b/drivers/gpu/drm/shmobile/Kconfig
@@ -2,7 +2,6 @@
 	tristate "DRM Support for SH Mobile"
 	depends on DRM && ARM
 	depends on ARCH_SHMOBILE || COMPILE_TEST
-	depends on FB_SH_MOBILE_MERAM || !FB_SH_MOBILE_MERAM
 	select BACKLIGHT_CLASS_DEVICE
 	select BACKLIGHT_LCD_SUPPORT
 	select DRM_KMS_HELPER
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
index e773893..40df888 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
@@ -21,8 +21,6 @@
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
 
-#include <video/sh_mobile_meram.h>
-
 #include "shmob_drm_backlight.h"
 #include "shmob_drm_crtc.h"
 #include "shmob_drm_drv.h"
@@ -47,20 +45,12 @@
 		if (ret < 0)
 			return ret;
 	}
-#if 0
-	if (sdev->meram_dev && sdev->meram_dev->pdev)
-		pm_runtime_get_sync(&sdev->meram_dev->pdev->dev);
-#endif
 
 	return 0;
 }
 
 static void shmob_drm_clk_off(struct shmob_drm_device *sdev)
 {
-#if 0
-	if (sdev->meram_dev && sdev->meram_dev->pdev)
-		pm_runtime_put_sync(&sdev->meram_dev->pdev->dev);
-#endif
 	if (sdev->clock)
 		clk_disable_unprepare(sdev->clock);
 }
@@ -269,12 +259,6 @@
 	if (!scrtc->started)
 		return;
 
-	/* Disable the MERAM cache. */
-	if (scrtc->cache) {
-		sh_mobile_meram_cache_free(sdev->meram, scrtc->cache);
-		scrtc->cache = NULL;
-	}
-
 	/* Stop the LCDC. */
 	shmob_drm_crtc_start_stop(scrtc, false);
 
@@ -305,7 +289,6 @@
 {
 	struct drm_crtc *crtc = &scrtc->crtc;
 	struct drm_framebuffer *fb = crtc->primary->fb;
-	struct shmob_drm_device *sdev = crtc->dev->dev_private;
 	struct drm_gem_cma_object *gem;
 	unsigned int bpp;
 
@@ -321,11 +304,6 @@
 			      + y / (bpp == 4 ? 2 : 1) * fb->pitches[1]
 			      + x * (bpp == 16 ? 2 : 1);
 	}
-
-	if (scrtc->cache)
-		sh_mobile_meram_cache_update(sdev->meram, scrtc->cache,
-					     scrtc->dma[0], scrtc->dma[1],
-					     &scrtc->dma[0], &scrtc->dma[1]);
 }
 
 static void shmob_drm_crtc_update_base(struct shmob_drm_crtc *scrtc)
@@ -372,9 +350,7 @@
 {
 	struct shmob_drm_crtc *scrtc = to_shmob_crtc(crtc);
 	struct shmob_drm_device *sdev = crtc->dev->dev_private;
-	const struct sh_mobile_meram_cfg *mdata = sdev->pdata->meram;
 	const struct shmob_drm_format_info *format;
-	void *cache;
 
 	format = shmob_drm_format_info(crtc->primary->fb->format->format);
 	if (format == NULL) {
@@ -386,24 +362,6 @@
 	scrtc->format = format;
 	scrtc->line_size = crtc->primary->fb->pitches[0];
 
-	if (sdev->meram) {
-		/* Enable MERAM cache if configured. We need to de-init
-		 * configured ICBs before we can re-initialize them.
-		 */
-		if (scrtc->cache) {
-			sh_mobile_meram_cache_free(sdev->meram, scrtc->cache);
-			scrtc->cache = NULL;
-		}
-
-		cache = sh_mobile_meram_cache_alloc(sdev->meram, mdata,
-						    crtc->primary->fb->pitches[0],
-						    adjusted_mode->vdisplay,
-						    format->meram,
-						    &scrtc->line_size);
-		if (!IS_ERR(cache))
-			scrtc->cache = cache;
-	}
-
 	shmob_drm_crtc_compute_base(scrtc, x, y);
 
 	return 0;
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_crtc.h b/drivers/gpu/drm/shmobile/shmob_drm_crtc.h
index f152973..c11f421 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_crtc.h
+++ b/drivers/gpu/drm/shmobile/shmob_drm_crtc.h
@@ -28,7 +28,6 @@
 	int dpms;
 
 	const struct shmob_drm_format_info *format;
-	void *cache;
 	unsigned long dma[2];
 	unsigned int line_size;
 	bool started;
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_drv.h b/drivers/gpu/drm/shmobile/shmob_drm_drv.h
index 02ea315..088a6e55 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_drv.h
+++ b/drivers/gpu/drm/shmobile/shmob_drm_drv.h
@@ -23,7 +23,6 @@
 struct clk;
 struct device;
 struct drm_device;
-struct sh_mobile_meram_info;
 
 struct shmob_drm_device {
 	struct device *dev;
@@ -31,7 +30,6 @@
 
 	void __iomem *mmio;
 	struct clk *clock;
-	struct sh_mobile_meram_info *meram;
 	u32 lddckr;
 	u32 ldmt1r;
 
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_kms.c b/drivers/gpu/drm/shmobile/shmob_drm_kms.c
index d36919b..4476385 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_kms.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_kms.c
@@ -18,8 +18,6 @@
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 
-#include <video/sh_mobile_meram.h>
-
 #include "shmob_drm_crtc.h"
 #include "shmob_drm_drv.h"
 #include "shmob_drm_kms.h"
@@ -35,55 +33,46 @@
 		.bpp = 16,
 		.yuv = false,
 		.lddfr = LDDFR_PKF_RGB16,
-		.meram = SH_MOBILE_MERAM_PF_RGB,
 	}, {
 		.fourcc = DRM_FORMAT_RGB888,
 		.bpp = 24,
 		.yuv = false,
 		.lddfr = LDDFR_PKF_RGB24,
-		.meram = SH_MOBILE_MERAM_PF_RGB,
 	}, {
 		.fourcc = DRM_FORMAT_ARGB8888,
 		.bpp = 32,
 		.yuv = false,
 		.lddfr = LDDFR_PKF_ARGB32,
-		.meram = SH_MOBILE_MERAM_PF_RGB,
 	}, {
 		.fourcc = DRM_FORMAT_NV12,
 		.bpp = 12,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_420,
-		.meram = SH_MOBILE_MERAM_PF_NV,
 	}, {
 		.fourcc = DRM_FORMAT_NV21,
 		.bpp = 12,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_420,
-		.meram = SH_MOBILE_MERAM_PF_NV,
 	}, {
 		.fourcc = DRM_FORMAT_NV16,
 		.bpp = 16,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_422,
-		.meram = SH_MOBILE_MERAM_PF_NV,
 	}, {
 		.fourcc = DRM_FORMAT_NV61,
 		.bpp = 16,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_422,
-		.meram = SH_MOBILE_MERAM_PF_NV,
 	}, {
 		.fourcc = DRM_FORMAT_NV24,
 		.bpp = 24,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_444,
-		.meram = SH_MOBILE_MERAM_PF_NV24,
 	}, {
 		.fourcc = DRM_FORMAT_NV42,
 		.bpp = 24,
 		.yuv = true,
 		.lddfr = LDDFR_CC | LDDFR_YF_444,
-		.meram = SH_MOBILE_MERAM_PF_NV24,
 	},
 };
 
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_kms.h b/drivers/gpu/drm/shmobile/shmob_drm_kms.h
index 06d5b7c..753e281 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_kms.h
+++ b/drivers/gpu/drm/shmobile/shmob_drm_kms.h
@@ -24,7 +24,6 @@
 	unsigned int bpp;
 	bool yuv;
 	u32 lddfr;
-	unsigned int meram;
 };
 
 const struct shmob_drm_format_info *shmob_drm_format_info(u32 fourcc);
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_plane.c b/drivers/gpu/drm/shmobile/shmob_drm_plane.c
index 97f6e4a..1d0359f 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_plane.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_plane.c
@@ -17,8 +17,6 @@
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 
-#include <video/sh_mobile_meram.h>
-
 #include "shmob_drm_drv.h"
 #include "shmob_drm_kms.h"
 #include "shmob_drm_plane.h"
diff --git a/drivers/media/platform/via-camera.c b/drivers/media/platform/via-camera.c
index f01c3e8..c8bb82f 100644
--- a/drivers/media/platform/via-camera.c
+++ b/drivers/media/platform/via-camera.c
@@ -27,7 +27,12 @@
 #include <linux/via-core.h>
 #include <linux/via-gpio.h>
 #include <linux/via_i2c.h>
+
+#ifdef CONFIG_X86
 #include <asm/olpc.h>
+#else
+#define machine_is_olpc(x) 0
+#endif
 
 #include "via-camera.h"
 
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 448d1fa..f4d8176 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -325,6 +325,8 @@
 	struct tasklet_struct	qs_err_task;
 	struct work_struct	reset_task;
 	struct nicvf_work       rx_mode_work;
+	/* spinlock to protect workqueue arguments from concurrent access */
+	spinlock_t              rx_mode_wq_lock;
 
 	/* PTP timestamp */
 	struct cavium_ptp	*ptp_clock;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7135db4..135766c 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1923,17 +1923,12 @@
 	}
 }
 
-static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
+static void __nicvf_set_rx_mode_task(u8 mode, struct xcast_addr_list *mc_addrs,
+				     struct nicvf *nic)
 {
-	struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work,
-						  work.work);
-	struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work);
 	union nic_mbx mbx = {};
 	int idx;
 
-	if (!vf_work)
-		return;
-
 	/* From the inside of VM code flow we have only 128 bits memory
 	 * available to send message to host's PF, so send all mc addrs
 	 * one by one, starting from flush command in case if kernel
@@ -1944,7 +1939,7 @@
 	mbx.xcast.msg = NIC_MBOX_MSG_RESET_XCAST;
 	nicvf_send_msg_to_pf(nic, &mbx);
 
-	if (vf_work->mode & BGX_XCAST_MCAST_FILTER) {
+	if (mode & BGX_XCAST_MCAST_FILTER) {
 		/* once enabling filtering, we need to signal to PF to add
 		 * its' own LMAC to the filter to accept packets for it.
 		 */
@@ -1954,23 +1949,46 @@
 	}
 
 	/* check if we have any specific MACs to be added to PF DMAC filter */
-	if (vf_work->mc) {
+	if (mc_addrs) {
 		/* now go through kernel list of MACs and add them one by one */
-		for (idx = 0; idx < vf_work->mc->count; idx++) {
+		for (idx = 0; idx < mc_addrs->count; idx++) {
 			mbx.xcast.msg = NIC_MBOX_MSG_ADD_MCAST;
-			mbx.xcast.data.mac = vf_work->mc->mc[idx];
+			mbx.xcast.data.mac = mc_addrs->mc[idx];
 			nicvf_send_msg_to_pf(nic, &mbx);
 		}
-		kfree(vf_work->mc);
+		kfree(mc_addrs);
 	}
 
 	/* and finally set rx mode for PF accordingly */
 	mbx.xcast.msg = NIC_MBOX_MSG_SET_XCAST;
-	mbx.xcast.data.mode = vf_work->mode;
+	mbx.xcast.data.mode = mode;
 
 	nicvf_send_msg_to_pf(nic, &mbx);
 }
 
+static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
+{
+	struct nicvf_work *vf_work = container_of(work_arg, struct nicvf_work,
+						  work.work);
+	struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work);
+	u8 mode;
+	struct xcast_addr_list *mc;
+
+	if (!vf_work)
+		return;
+
+	/* Save message data locally to prevent them from
+	 * being overwritten by next ndo_set_rx_mode call().
+	 */
+	spin_lock(&nic->rx_mode_wq_lock);
+	mode = vf_work->mode;
+	mc = vf_work->mc;
+	vf_work->mc = NULL;
+	spin_unlock(&nic->rx_mode_wq_lock);
+
+	__nicvf_set_rx_mode_task(mode, mc, nic);
+}
+
 static void nicvf_set_rx_mode(struct net_device *netdev)
 {
 	struct nicvf *nic = netdev_priv(netdev);
@@ -2004,9 +2022,12 @@
 			}
 		}
 	}
+	spin_lock(&nic->rx_mode_wq_lock);
+	kfree(nic->rx_mode_work.mc);
 	nic->rx_mode_work.mc = mc_list;
 	nic->rx_mode_work.mode = mode;
-	queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 2 * HZ);
+	queue_delayed_work(nicvf_rx_mode_wq, &nic->rx_mode_work.work, 0);
+	spin_unlock(&nic->rx_mode_wq_lock);
 }
 
 static const struct net_device_ops nicvf_netdev_ops = {
@@ -2163,6 +2184,7 @@
 	INIT_WORK(&nic->reset_task, nicvf_reset_task);
 
 	INIT_DELAYED_WORK(&nic->rx_mode_work.work, nicvf_set_rx_mode_task);
+	spin_lock_init(&nic->rx_mode_wq_lock);
 
 	err = register_netdev(netdev);
 	if (err) {
diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index 2edfdbd..7b795ed 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -3362,10 +3362,17 @@
 
 	err = sysfs_create_group(&adapter->port[0]->dev.kobj,
 				 &cxgb3_attr_group);
+	if (err) {
+		dev_err(&pdev->dev, "cannot create sysfs group\n");
+		goto out_close_led;
+	}
 
 	print_port_info(adapter, ai);
 	return 0;
 
+out_close_led:
+	t3_set_reg_field(adapter, A_T3DBG_GPIO_EN, F_GPIO0_OUT_VAL, 0);
+
 out_free_dev:
 	iounmap(adapter->regs);
 	for (i = ai->nports0 + ai->nports1 - 1; i >= 0; --i)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index fc534e9..144d5fe 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -760,9 +760,9 @@
 #define IXGBE_RSS_KEY_SIZE     40  /* size of RSS Hash Key in bytes */
 	u32 *rss_key;
 
-#ifdef CONFIG_XFRM
+#ifdef CONFIG_XFRM_OFFLOAD
 	struct ixgbe_ipsec *ipsec;
-#endif /* CONFIG_XFRM */
+#endif /* CONFIG_XFRM_OFFLOAD */
 };
 
 static inline u8 ixgbe_max_rss_indices(struct ixgbe_adapter *adapter)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 344a1f2..c116f45 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -158,7 +158,16 @@
 	reg |= IXGBE_SECRXCTRL_RX_DIS;
 	IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
 
-	IXGBE_WRITE_FLUSH(hw);
+	/* If both Tx and Rx are ready there are no packets
+	 * that we need to flush so the loopback configuration
+	 * below is not necessary.
+	 */
+	t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+		IXGBE_SECTXSTAT_SECTX_RDY;
+	r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+		IXGBE_SECRXSTAT_SECRX_RDY;
+	if (t_rdy && r_rdy)
+		return;
 
 	/* If the tx fifo doesn't have link, but still has data,
 	 * we can't clear the tx sec block.  Set the MAC loopback
@@ -185,7 +194,7 @@
 			IXGBE_SECTXSTAT_SECTX_RDY;
 		r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
 			IXGBE_SECRXSTAT_SECRX_RDY;
-	} while (!t_rdy && !r_rdy && limit--);
+	} while (!(t_rdy && r_rdy) && limit--);
 
 	/* undo loopback if we played with it earlier */
 	if (!link) {
@@ -966,10 +975,22 @@
  **/
 void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
 {
+	struct ixgbe_hw *hw = &adapter->hw;
 	struct ixgbe_ipsec *ipsec;
+	u32 t_dis, r_dis;
 	size_t size;
 
-	if (adapter->hw.mac.type == ixgbe_mac_82598EB)
+	if (hw->mac.type == ixgbe_mac_82598EB)
+		return;
+
+	/* If there is no support for either Tx or Rx offload
+	 * we should not be advertising support for IPsec.
+	 */
+	t_dis = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+		IXGBE_SECTXSTAT_SECTX_OFF_DIS;
+	r_dis = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+		IXGBE_SECRXSTAT_SECRX_OFF_DIS;
+	if (t_dis || r_dis)
 		return;
 
 	ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
@@ -1001,13 +1022,6 @@
 
 	adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
 
-#define IXGBE_ESP_FEATURES	(NETIF_F_HW_ESP | \
-				 NETIF_F_HW_ESP_TX_CSUM | \
-				 NETIF_F_GSO_ESP)
-
-	adapter->netdev->features |= IXGBE_ESP_FEATURES;
-	adapter->netdev->hw_enc_features |= IXGBE_ESP_FEATURES;
-
 	return;
 
 err2:
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 893a920..d361f57 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -593,6 +593,14 @@
 	}
 
 #endif
+	/* To support macvlan offload we have to use num_tc to
+	 * restrict the queues that can be used by the device.
+	 * By doing this we can avoid reporting a false number of
+	 * queues.
+	 */
+	if (vmdq_i > 1)
+		netdev_set_num_tc(adapter->netdev, 1);
+
 	/* populate TC0 for use by pool 0 */
 	netdev_set_tc_queue(adapter->netdev, 0,
 			    adapter->num_rx_queues_per_pool, 0);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0b1ba3a..3e87dbb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6117,6 +6117,7 @@
 #ifdef CONFIG_IXGBE_DCB
 	ixgbe_init_dcb(adapter);
 #endif
+	ixgbe_init_ipsec_offload(adapter);
 
 	/* default flow control settings */
 	hw->fc.requested_mode = ixgbe_fc_full;
@@ -8822,14 +8823,6 @@
 	} else {
 		netdev_reset_tc(dev);
 
-		/* To support macvlan offload we have to use num_tc to
-		 * restrict the queues that can be used by the device.
-		 * By doing this we can avoid reporting a false number of
-		 * queues.
-		 */
-		if (!tc && adapter->num_rx_pools > 1)
-			netdev_set_num_tc(dev, 1);
-
 		if (adapter->hw.mac.type == ixgbe_mac_82598EB)
 			adapter->hw.fc.requested_mode = adapter->last_lfc_mode;
 
@@ -9904,7 +9897,7 @@
 	 * the TSO, so it's the exception.
 	 */
 	if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID)) {
-#ifdef CONFIG_XFRM
+#ifdef CONFIG_XFRM_OFFLOAD
 		if (!skb->sp)
 #endif
 			features &= ~NETIF_F_TSO;
@@ -10437,6 +10430,14 @@
 	if (hw->mac.type >= ixgbe_mac_82599EB)
 		netdev->features |= NETIF_F_SCTP_CRC;
 
+#ifdef CONFIG_XFRM_OFFLOAD
+#define IXGBE_ESP_FEATURES	(NETIF_F_HW_ESP | \
+				 NETIF_F_HW_ESP_TX_CSUM | \
+				 NETIF_F_GSO_ESP)
+
+	if (adapter->ipsec)
+		netdev->features |= IXGBE_ESP_FEATURES;
+#endif
 	/* copy netdev features into list of user selectable features */
 	netdev->hw_features |= netdev->features |
 			       NETIF_F_HW_VLAN_CTAG_FILTER |
@@ -10499,8 +10500,6 @@
 					 NETIF_F_FCOE_MTU;
 	}
 #endif /* IXGBE_FCOE */
-	ixgbe_init_ipsec_offload(adapter);
-
 	if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
 		netdev->hw_features |= NETIF_F_LRO;
 	if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index e8ed377..44cfb20 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -599,13 +599,15 @@
 #define IXGBE_SECTXCTRL_STORE_FORWARD   0x00000004
 
 #define IXGBE_SECTXSTAT_SECTX_RDY       0x00000001
-#define IXGBE_SECTXSTAT_ECC_TXERR       0x00000002
+#define IXGBE_SECTXSTAT_SECTX_OFF_DIS   0x00000002
+#define IXGBE_SECTXSTAT_ECC_TXERR       0x00000004
 
 #define IXGBE_SECRXCTRL_SECRX_DIS       0x00000001
 #define IXGBE_SECRXCTRL_RX_DIS          0x00000002
 
 #define IXGBE_SECRXSTAT_SECRX_RDY       0x00000001
-#define IXGBE_SECRXSTAT_ECC_RXERR       0x00000002
+#define IXGBE_SECRXSTAT_SECRX_OFF_DIS   0x00000002
+#define IXGBE_SECRXSTAT_ECC_RXERR       0x00000004
 
 /* LinkSec (MacSec) Registers */
 #define IXGBE_LSECTXCAP         0x08A00
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 77b2adb..6aaaf3d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -4756,12 +4756,6 @@
 	kfree(mlxsw_sp_rt6);
 }
 
-static bool mlxsw_sp_fib6_rt_can_mp(const struct fib6_info *rt)
-{
-	/* RTF_CACHE routes are ignored */
-	return (rt->fib6_flags & (RTF_GATEWAY | RTF_ADDRCONF)) == RTF_GATEWAY;
-}
-
 static struct fib6_info *
 mlxsw_sp_fib6_entry_rt(const struct mlxsw_sp_fib6_entry *fib6_entry)
 {
@@ -4771,11 +4765,11 @@
 
 static struct mlxsw_sp_fib6_entry *
 mlxsw_sp_fib6_node_mp_entry_find(const struct mlxsw_sp_fib_node *fib_node,
-				 const struct fib6_info *nrt, bool replace)
+				 const struct fib6_info *nrt, bool append)
 {
 	struct mlxsw_sp_fib6_entry *fib6_entry;
 
-	if (!mlxsw_sp_fib6_rt_can_mp(nrt) || replace)
+	if (!append)
 		return NULL;
 
 	list_for_each_entry(fib6_entry, &fib_node->entry_list, common.list) {
@@ -4790,8 +4784,7 @@
 			break;
 		if (rt->fib6_metric < nrt->fib6_metric)
 			continue;
-		if (rt->fib6_metric == nrt->fib6_metric &&
-		    mlxsw_sp_fib6_rt_can_mp(rt))
+		if (rt->fib6_metric == nrt->fib6_metric)
 			return fib6_entry;
 		if (rt->fib6_metric > nrt->fib6_metric)
 			break;
@@ -5170,7 +5163,7 @@
 mlxsw_sp_fib6_node_entry_find(const struct mlxsw_sp_fib_node *fib_node,
 			      const struct fib6_info *nrt, bool replace)
 {
-	struct mlxsw_sp_fib6_entry *fib6_entry, *fallback = NULL;
+	struct mlxsw_sp_fib6_entry *fib6_entry;
 
 	list_for_each_entry(fib6_entry, &fib_node->entry_list, common.list) {
 		struct fib6_info *rt = mlxsw_sp_fib6_entry_rt(fib6_entry);
@@ -5179,18 +5172,13 @@
 			continue;
 		if (rt->fib6_table->tb6_id != nrt->fib6_table->tb6_id)
 			break;
-		if (replace && rt->fib6_metric == nrt->fib6_metric) {
-			if (mlxsw_sp_fib6_rt_can_mp(rt) ==
-			    mlxsw_sp_fib6_rt_can_mp(nrt))
-				return fib6_entry;
-			if (mlxsw_sp_fib6_rt_can_mp(nrt))
-				fallback = fallback ?: fib6_entry;
-		}
+		if (replace && rt->fib6_metric == nrt->fib6_metric)
+			return fib6_entry;
 		if (rt->fib6_metric > nrt->fib6_metric)
-			return fallback ?: fib6_entry;
+			return fib6_entry;
 	}
 
-	return fallback;
+	return NULL;
 }
 
 static int
@@ -5316,7 +5304,8 @@
 }
 
 static int mlxsw_sp_router_fib6_add(struct mlxsw_sp *mlxsw_sp,
-				    struct fib6_info *rt, bool replace)
+				    struct fib6_info *rt, bool replace,
+				    bool append)
 {
 	struct mlxsw_sp_fib6_entry *fib6_entry;
 	struct mlxsw_sp_fib_node *fib_node;
@@ -5342,7 +5331,7 @@
 	/* Before creating a new entry, try to append route to an existing
 	 * multipath entry.
 	 */
-	fib6_entry = mlxsw_sp_fib6_node_mp_entry_find(fib_node, rt, replace);
+	fib6_entry = mlxsw_sp_fib6_node_mp_entry_find(fib_node, rt, append);
 	if (fib6_entry) {
 		err = mlxsw_sp_fib6_entry_nexthop_add(mlxsw_sp, fib6_entry, rt);
 		if (err)
@@ -5350,6 +5339,14 @@
 		return 0;
 	}
 
+	/* We received an append event, yet did not find any route to
+	 * append to.
+	 */
+	if (WARN_ON(append)) {
+		err = -EINVAL;
+		goto err_fib6_entry_append;
+	}
+
 	fib6_entry = mlxsw_sp_fib6_entry_create(mlxsw_sp, fib_node, rt);
 	if (IS_ERR(fib6_entry)) {
 		err = PTR_ERR(fib6_entry);
@@ -5367,6 +5364,7 @@
 err_fib6_node_entry_link:
 	mlxsw_sp_fib6_entry_destroy(mlxsw_sp, fib6_entry);
 err_fib6_entry_create:
+err_fib6_entry_append:
 err_fib6_entry_nexthop_add:
 	mlxsw_sp_fib_node_put(mlxsw_sp, fib_node);
 	return err;
@@ -5717,7 +5715,7 @@
 	struct mlxsw_sp_fib_event_work *fib_work =
 		container_of(work, struct mlxsw_sp_fib_event_work, work);
 	struct mlxsw_sp *mlxsw_sp = fib_work->mlxsw_sp;
-	bool replace;
+	bool replace, append;
 	int err;
 
 	rtnl_lock();
@@ -5728,8 +5726,10 @@
 	case FIB_EVENT_ENTRY_APPEND: /* fall through */
 	case FIB_EVENT_ENTRY_ADD:
 		replace = fib_work->event == FIB_EVENT_ENTRY_REPLACE;
+		append = fib_work->event == FIB_EVENT_ENTRY_APPEND;
 		err = mlxsw_sp_router_fib6_add(mlxsw_sp,
-					       fib_work->fen6_info.rt, replace);
+					       fib_work->fen6_info.rt, replace,
+					       append);
 		if (err)
 			mlxsw_sp_router_fib_abort(mlxsw_sp);
 		mlxsw_sp_rt6_release(fib_work->fen6_info.rt);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index e97652c..eea5666 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1018,8 +1018,10 @@
 	int err;
 
 	/* No need to continue if only VLAN flags were changed */
-	if (mlxsw_sp_port_vlan->bridge_port)
+	if (mlxsw_sp_port_vlan->bridge_port) {
+		mlxsw_sp_port_vlan_put(mlxsw_sp_port_vlan);
 		return 0;
+	}
 
 	err = mlxsw_sp_port_vlan_fid_join(mlxsw_sp_port_vlan, bridge_port);
 	if (err)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 19cfa16..1decf3a 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -455,6 +455,7 @@
 
 	eth_hw_addr_random(nn->dp.netdev);
 	netif_keep_dst(nn->dp.netdev);
+	nn->vnic_no_name = true;
 
 	return 0;
 
diff --git a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
index ec524d9..78afe75 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
@@ -381,6 +381,8 @@
 	err = PTR_ERR_OR_ZERO(rt);
 	if (err)
 		return NOTIFY_DONE;
+
+	ip_rt_put(rt);
 #else
 	return NOTIFY_DONE;
 #endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 57cb035d..2a71a9f 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -590,6 +590,8 @@
  * @vnic_list:		Entry on device vNIC list
  * @pdev:		Backpointer to PCI device
  * @app:		APP handle if available
+ * @vnic_no_name:	For non-port PF vNIC make ndo_get_phys_port_name return
+ *			-EOPNOTSUPP to keep backwards compatibility (set by app)
  * @port:		Pointer to nfp_port structure if vNIC is a port
  * @app_priv:		APP private data for this vNIC
  */
@@ -663,6 +665,8 @@
 	struct pci_dev *pdev;
 	struct nfp_app *app;
 
+	bool vnic_no_name;
+
 	struct nfp_port *port;
 
 	void *app_priv;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 75110c8..d4c27f8 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -3121,7 +3121,7 @@
 	struct nfp_net *nn = netdev_priv(netdev);
 	int r;
 
-	for (r = 0; r < nn->dp.num_r_vecs; r++) {
+	for (r = 0; r < nn->max_r_vecs; r++) {
 		struct nfp_net_r_vector *r_vec = &nn->r_vecs[r];
 		u64 data[3];
 		unsigned int start;
@@ -3286,7 +3286,7 @@
 	if (nn->port)
 		return nfp_port_get_phys_port_name(netdev, name, len);
 
-	if (nn->dp.is_vf)
+	if (nn->dp.is_vf || nn->vnic_no_name)
 		return -EOPNOTSUPP;
 
 	n = snprintf(name, len, "n%d", nn->id);
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.c
index 2dd89db..d32af59 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.c
@@ -98,21 +98,18 @@
 
 static int nfp_cpp_resource_find(struct nfp_cpp *cpp, struct nfp_resource *res)
 {
-	char name_pad[NFP_RESOURCE_ENTRY_NAME_SZ] = {};
 	struct nfp_resource_entry entry;
 	u32 cpp_id, key;
 	int ret, i;
 
 	cpp_id = NFP_CPP_ID(NFP_RESOURCE_TBL_TARGET, 3, 0);  /* Atomic read */
 
-	strncpy(name_pad, res->name, sizeof(name_pad));
-
 	/* Search for a matching entry */
-	if (!memcmp(name_pad, NFP_RESOURCE_TBL_NAME "\0\0\0\0\0\0\0\0", 8)) {
+	if (!strcmp(res->name, NFP_RESOURCE_TBL_NAME)) {
 		nfp_err(cpp, "Grabbing device lock not supported\n");
 		return -EOPNOTSUPP;
 	}
-	key = crc32_posix(name_pad, sizeof(name_pad));
+	key = crc32_posix(res->name, NFP_RESOURCE_ENTRY_NAME_SZ);
 
 	for (i = 0; i < NFP_RESOURCE_TBL_ENTRIES; i++) {
 		u64 addr = NFP_RESOURCE_TBL_BASE +
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index e78e5db..c694e34 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -384,6 +384,7 @@
 		}
 
 		sgmii_pdev = of_find_device_by_node(np);
+		of_node_put(np);
 		if (!sgmii_pdev) {
 			dev_err(&pdev->dev, "invalid internal-phy property\n");
 			return -ENODEV;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
index 4ff231d..c597956 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
@@ -334,9 +334,10 @@
 
 	dwmac->data = (const struct meson8b_dwmac_data *)
 		of_device_get_match_data(&pdev->dev);
-	if (!dwmac->data)
-		return -EINVAL;
-
+	if (!dwmac->data) {
+		ret = -EINVAL;
+		goto err_remove_config_dt;
+	}
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 	dwmac->regs = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(dwmac->regs)) {
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
index 14770fc..1f50e83 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
@@ -252,13 +252,8 @@
 				return ret;
 		}
 
-		/* Run quirks, if needed */
-		if (entry->quirks) {
-			ret = entry->quirks(priv);
-			if (ret)
-				return ret;
-		}
-
+		/* Save quirks, if needed for posterior use */
+		priv->hwif_quirks = entry->quirks;
 		return 0;
 	}
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 025efbf..76649ad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -129,6 +129,7 @@
 	struct net_device *dev;
 	struct device *device;
 	struct mac_device_info *hw;
+	int (*hwif_quirks)(struct stmmac_priv *priv);
 	struct mutex lock;
 
 	/* RX Queue */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 11fb7c7..e79b0d7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3182,17 +3182,22 @@
 
 static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
 {
-	struct ethhdr *ehdr;
+	struct vlan_ethhdr *veth;
+	__be16 vlan_proto;
 	u16 vlanid;
 
-	if ((dev->features & NETIF_F_HW_VLAN_CTAG_RX) ==
-	    NETIF_F_HW_VLAN_CTAG_RX &&
-	    !__vlan_get_tag(skb, &vlanid)) {
+	veth = (struct vlan_ethhdr *)skb->data;
+	vlan_proto = veth->h_vlan_proto;
+
+	if ((vlan_proto == htons(ETH_P_8021Q) &&
+	     dev->features & NETIF_F_HW_VLAN_CTAG_RX) ||
+	    (vlan_proto == htons(ETH_P_8021AD) &&
+	     dev->features & NETIF_F_HW_VLAN_STAG_RX)) {
 		/* pop the vlan tag */
-		ehdr = (struct ethhdr *)skb->data;
-		memmove(skb->data + VLAN_HLEN, ehdr, ETH_ALEN * 2);
+		vlanid = ntohs(veth->h_vlan_TCI);
+		memmove(skb->data + VLAN_HLEN, veth, ETH_ALEN * 2);
 		skb_pull(skb, VLAN_HLEN);
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlanid);
+		__vlan_hwaccel_put_tag(skb, vlan_proto, vlanid);
 	}
 }
 
@@ -4130,6 +4135,13 @@
 	if (priv->dma_cap.tsoen)
 		dev_info(priv->device, "TSO supported\n");
 
+	/* Run HW quirks, if any */
+	if (priv->hwif_quirks) {
+		ret = priv->hwif_quirks(priv);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
@@ -4235,7 +4247,7 @@
 	ndev->watchdog_timeo = msecs_to_jiffies(watchdog);
 #ifdef STMMAC_VLAN_TAG_USED
 	/* Both mac100 and gmac support receive VLAN tag detection */
-	ndev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+	ndev->features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX;
 #endif
 	priv->msg_enable = netif_msg_init(debug, default_msg_level);
 
diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 69e31ce..2a0c06e 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -123,7 +123,6 @@
  * @phy_node:		pointer to the PHY device node
  * @mii_bus:		pointer to the MII bus
  * @last_link:		last link status
- * @has_mdio:		indicates whether MDIO is included in the HW
  */
 struct net_local {
 
@@ -144,7 +143,6 @@
 	struct mii_bus *mii_bus;
 
 	int last_link;
-	bool has_mdio;
 };
 
 
@@ -863,14 +861,14 @@
 	bus->write = xemaclite_mdio_write;
 	bus->parent = dev;
 
-	lp->mii_bus = bus;
-
 	rc = of_mdiobus_register(bus, np);
 	if (rc) {
 		dev_err(dev, "Failed to register mdio bus.\n");
 		goto err_register;
 	}
 
+	lp->mii_bus = bus;
+
 	return 0;
 
 err_register:
@@ -1145,9 +1143,7 @@
 	xemaclite_update_address(lp, ndev->dev_addr);
 
 	lp->phy_node = of_parse_phandle(ofdev->dev.of_node, "phy-handle", 0);
-	rc = xemaclite_mdio_setup(lp, &ofdev->dev);
-	if (rc)
-		dev_warn(&ofdev->dev, "error registering MDIO bus\n");
+	xemaclite_mdio_setup(lp, &ofdev->dev);
 
 	dev_info(dev, "MAC address is now %pM\n", ndev->dev_addr);
 
@@ -1191,7 +1187,7 @@
 	struct net_local *lp = netdev_priv(ndev);
 
 	/* Un-register the mii_bus, if configured */
-	if (lp->has_mdio) {
+	if (lp->mii_bus) {
 		mdiobus_unregister(lp->mii_bus);
 		mdiobus_free(lp->mii_bus);
 		lp->mii_bus = NULL;
diff --git a/drivers/net/hyperv/Kconfig b/drivers/net/hyperv/Kconfig
index 23a2d14..0765d5f 100644
--- a/drivers/net/hyperv/Kconfig
+++ b/drivers/net/hyperv/Kconfig
@@ -2,6 +2,5 @@
 	tristate "Microsoft Hyper-V virtual network driver"
 	depends on HYPERV
 	select UCS2_STRING
-	select FAILOVER
 	help
 	  Select this option to enable the Hyper-V virtual network driver.
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 23304ac..1a924b8 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -901,6 +901,8 @@
 	struct hv_device *device_ctx;
 	/* netvsc_device */
 	struct netvsc_device __rcu *nvdev;
+	/* list of netvsc net_devices */
+	struct list_head list;
 	/* reconfigure work */
 	struct delayed_work dwork;
 	/* last reconfig time */
@@ -931,8 +933,6 @@
 	u32 vf_alloc;
 	/* Serial number of the VF to team with */
 	u32 vf_serial;
-
-	struct failover *failover;
 };
 
 /* Per channel data */
@@ -1277,17 +1277,17 @@
 
 struct ndis_ipsecv2_offload {
 	u32	encap;
-	u16	ip6;
-	u16	ip4opt;
-	u16	ip6ext;
-	u16	ah;
-	u16	esp;
-	u16	ah_esp;
-	u16	xport;
-	u16	tun;
-	u16	xport_tun;
-	u16	lso;
-	u16	extseq;
+	u8	ip6;
+	u8	ip4opt;
+	u8	ip6ext;
+	u8	ah;
+	u8	esp;
+	u8	ah_esp;
+	u8	xport;
+	u8	tun;
+	u8	xport_tun;
+	u8	lso;
+	u8	extseq;
 	u32	udp_esp;
 	u32	auth;
 	u32	crypto;
@@ -1295,8 +1295,8 @@
 };
 
 struct ndis_rsc_offload {
-	u16	ip4;
-	u16	ip6;
+	u8	ip4;
+	u8	ip6;
 };
 
 struct ndis_encap_offload {
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 7b18a8c..fe2256b 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -42,7 +42,6 @@
 #include <net/pkt_sched.h>
 #include <net/checksum.h>
 #include <net/ip6_checksum.h>
-#include <net/failover.h>
 
 #include "hyperv_net.h"
 
@@ -68,6 +67,8 @@
 module_param(debug, int, 0444);
 MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 
+static LIST_HEAD(netvsc_dev_list);
+
 static void netvsc_change_rx_flags(struct net_device *net, int change)
 {
 	struct net_device_context *ndev_ctx = netdev_priv(net);
@@ -1780,6 +1781,36 @@
 	rtnl_unlock();
 }
 
+static struct net_device *get_netvsc_bymac(const u8 *mac)
+{
+	struct net_device_context *ndev_ctx;
+
+	list_for_each_entry(ndev_ctx, &netvsc_dev_list, list) {
+		struct net_device *dev = hv_get_drvdata(ndev_ctx->device_ctx);
+
+		if (ether_addr_equal(mac, dev->perm_addr))
+			return dev;
+	}
+
+	return NULL;
+}
+
+static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
+{
+	struct net_device_context *net_device_ctx;
+	struct net_device *dev;
+
+	dev = netdev_master_upper_dev_get(vf_netdev);
+	if (!dev || dev->netdev_ops != &device_ops)
+		return NULL;	/* not a netvsc device */
+
+	net_device_ctx = netdev_priv(dev);
+	if (!rtnl_dereference(net_device_ctx->nvdev))
+		return NULL;	/* device is removed */
+
+	return dev;
+}
+
 /* Called when VF is injecting data into network stack.
  * Change the associated network device from VF to netvsc.
  * note: already called with rcu_read_lock
@@ -1802,6 +1833,46 @@
 	return RX_HANDLER_ANOTHER;
 }
 
+static int netvsc_vf_join(struct net_device *vf_netdev,
+			  struct net_device *ndev)
+{
+	struct net_device_context *ndev_ctx = netdev_priv(ndev);
+	int ret;
+
+	ret = netdev_rx_handler_register(vf_netdev,
+					 netvsc_vf_handle_frame, ndev);
+	if (ret != 0) {
+		netdev_err(vf_netdev,
+			   "can not register netvsc VF receive handler (err = %d)\n",
+			   ret);
+		goto rx_handler_failed;
+	}
+
+	ret = netdev_master_upper_dev_link(vf_netdev, ndev,
+					   NULL, NULL, NULL);
+	if (ret != 0) {
+		netdev_err(vf_netdev,
+			   "can not set master device %s (err = %d)\n",
+			   ndev->name, ret);
+		goto upper_link_failed;
+	}
+
+	/* set slave flag before open to prevent IPv6 addrconf */
+	vf_netdev->flags |= IFF_SLAVE;
+
+	schedule_delayed_work(&ndev_ctx->vf_takeover, VF_TAKEOVER_INT);
+
+	call_netdevice_notifiers(NETDEV_JOIN, vf_netdev);
+
+	netdev_info(vf_netdev, "joined to %s\n", ndev->name);
+	return 0;
+
+upper_link_failed:
+	netdev_rx_handler_unregister(vf_netdev);
+rx_handler_failed:
+	return ret;
+}
+
 static void __netvsc_vf_setup(struct net_device *ndev,
 			      struct net_device *vf_netdev)
 {
@@ -1852,95 +1923,104 @@
 	rtnl_unlock();
 }
 
-static int netvsc_pre_register_vf(struct net_device *vf_netdev,
-				  struct net_device *ndev)
+static int netvsc_register_vf(struct net_device *vf_netdev)
 {
+	struct net_device *ndev;
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device *netvsc_dev;
+	int ret;
+
+	if (vf_netdev->addr_len != ETH_ALEN)
+		return NOTIFY_DONE;
+
+	/*
+	 * We will use the MAC address to locate the synthetic interface to
+	 * associate with the VF interface. If we don't find a matching
+	 * synthetic interface, move on.
+	 */
+	ndev = get_netvsc_bymac(vf_netdev->perm_addr);
+	if (!ndev)
+		return NOTIFY_DONE;
 
 	net_device_ctx = netdev_priv(ndev);
 	netvsc_dev = rtnl_dereference(net_device_ctx->nvdev);
 	if (!netvsc_dev || rtnl_dereference(net_device_ctx->vf_netdev))
-		return -ENODEV;
+		return NOTIFY_DONE;
 
-	return 0;
-}
+	/* if syntihetic interface is a different namespace,
+	 * then move the VF to that namespace; join will be
+	 * done again in that context.
+	 */
+	if (!net_eq(dev_net(ndev), dev_net(vf_netdev))) {
+		ret = dev_change_net_namespace(vf_netdev,
+					       dev_net(ndev), "eth%d");
+		if (ret)
+			netdev_err(vf_netdev,
+				   "could not move to same namespace as %s: %d\n",
+				   ndev->name, ret);
+		else
+			netdev_info(vf_netdev,
+				    "VF moved to namespace with: %s\n",
+				    ndev->name);
+		return NOTIFY_DONE;
+	}
 
-static int netvsc_register_vf(struct net_device *vf_netdev,
-			      struct net_device *ndev)
-{
-	struct net_device_context *ndev_ctx = netdev_priv(ndev);
+	netdev_info(ndev, "VF registering: %s\n", vf_netdev->name);
 
-	/* set slave flag before open to prevent IPv6 addrconf */
-	vf_netdev->flags |= IFF_SLAVE;
-
-	schedule_delayed_work(&ndev_ctx->vf_takeover, VF_TAKEOVER_INT);
-
-	call_netdevice_notifiers(NETDEV_JOIN, vf_netdev);
-
-	netdev_info(vf_netdev, "joined to %s\n", ndev->name);
+	if (netvsc_vf_join(vf_netdev, ndev) != 0)
+		return NOTIFY_DONE;
 
 	dev_hold(vf_netdev);
-	rcu_assign_pointer(ndev_ctx->vf_netdev, vf_netdev);
-
-	return 0;
+	rcu_assign_pointer(net_device_ctx->vf_netdev, vf_netdev);
+	return NOTIFY_OK;
 }
 
 /* VF up/down change detected, schedule to change data path */
-static int netvsc_vf_changed(struct net_device *vf_netdev,
-			     struct net_device *ndev)
+static int netvsc_vf_changed(struct net_device *vf_netdev)
 {
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device *netvsc_dev;
+	struct net_device *ndev;
 	bool vf_is_up = netif_running(vf_netdev);
 
+	ndev = get_netvsc_byref(vf_netdev);
+	if (!ndev)
+		return NOTIFY_DONE;
+
 	net_device_ctx = netdev_priv(ndev);
 	netvsc_dev = rtnl_dereference(net_device_ctx->nvdev);
 	if (!netvsc_dev)
-		return -ENODEV;
+		return NOTIFY_DONE;
 
 	netvsc_switch_datapath(ndev, vf_is_up);
 	netdev_info(ndev, "Data path switched %s VF: %s\n",
 		    vf_is_up ? "to" : "from", vf_netdev->name);
 
-	return 0;
+	return NOTIFY_OK;
 }
 
-static int netvsc_pre_unregister_vf(struct net_device *vf_netdev,
-				    struct net_device *ndev)
+static int netvsc_unregister_vf(struct net_device *vf_netdev)
 {
+	struct net_device *ndev;
 	struct net_device_context *net_device_ctx;
 
+	ndev = get_netvsc_byref(vf_netdev);
+	if (!ndev)
+		return NOTIFY_DONE;
+
 	net_device_ctx = netdev_priv(ndev);
 	cancel_delayed_work_sync(&net_device_ctx->vf_takeover);
 
-	return 0;
-}
-
-static int netvsc_unregister_vf(struct net_device *vf_netdev,
-				struct net_device *ndev)
-{
-	struct net_device_context *net_device_ctx;
-
-	net_device_ctx = netdev_priv(ndev);
-
 	netdev_info(ndev, "VF unregistering: %s\n", vf_netdev->name);
 
+	netdev_rx_handler_unregister(vf_netdev);
+	netdev_upper_dev_unlink(vf_netdev, ndev);
 	RCU_INIT_POINTER(net_device_ctx->vf_netdev, NULL);
 	dev_put(vf_netdev);
 
-	return 0;
+	return NOTIFY_OK;
 }
 
-static struct failover_ops netvsc_failover_ops = {
-	.slave_pre_register	= netvsc_pre_register_vf,
-	.slave_register		= netvsc_register_vf,
-	.slave_pre_unregister	= netvsc_pre_unregister_vf,
-	.slave_unregister	= netvsc_unregister_vf,
-	.slave_link_change	= netvsc_vf_changed,
-	.slave_handle_frame	= netvsc_vf_handle_frame,
-};
-
 static int netvsc_probe(struct hv_device *dev,
 			const struct hv_vmbus_device_id *dev_id)
 {
@@ -2024,23 +2104,19 @@
 	else
 		net->max_mtu = ETH_DATA_LEN;
 
-	ret = register_netdev(net);
+	rtnl_lock();
+	ret = register_netdevice(net);
 	if (ret != 0) {
 		pr_err("Unable to register netdev.\n");
 		goto register_failed;
 	}
 
-	net_device_ctx->failover = failover_register(net, &netvsc_failover_ops);
-	if (IS_ERR(net_device_ctx->failover)) {
-		ret = PTR_ERR(net_device_ctx->failover);
-		goto err_failover;
-	}
+	list_add(&net_device_ctx->list, &netvsc_dev_list);
+	rtnl_unlock();
+	return 0;
 
-	return ret;
-
-err_failover:
-	unregister_netdev(net);
 register_failed:
+	rtnl_unlock();
 	rndis_filter_device_remove(dev, nvdev);
 rndis_failed:
 	free_percpu(net_device_ctx->vf_stats);
@@ -2080,14 +2156,13 @@
 	rtnl_lock();
 	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
 	if (vf_netdev)
-		failover_slave_unregister(vf_netdev);
+		netvsc_unregister_vf(vf_netdev);
 
 	if (nvdev)
 		rndis_filter_device_remove(dev, nvdev);
 
 	unregister_netdevice(net);
-
-	failover_unregister(ndev_ctx->failover);
+	list_del(&ndev_ctx->list);
 
 	rtnl_unlock();
 	rcu_read_unlock();
@@ -2115,8 +2190,54 @@
 	.remove = netvsc_remove,
 };
 
+/*
+ * On Hyper-V, every VF interface is matched with a corresponding
+ * synthetic interface. The synthetic interface is presented first
+ * to the guest. When the corresponding VF instance is registered,
+ * we will take care of switching the data path.
+ */
+static int netvsc_netdev_event(struct notifier_block *this,
+			       unsigned long event, void *ptr)
+{
+	struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
+
+	/* Skip our own events */
+	if (event_dev->netdev_ops == &device_ops)
+		return NOTIFY_DONE;
+
+	/* Avoid non-Ethernet type devices */
+	if (event_dev->type != ARPHRD_ETHER)
+		return NOTIFY_DONE;
+
+	/* Avoid Vlan dev with same MAC registering as VF */
+	if (is_vlan_dev(event_dev))
+		return NOTIFY_DONE;
+
+	/* Avoid Bonding master dev with same MAC registering as VF */
+	if ((event_dev->priv_flags & IFF_BONDING) &&
+	    (event_dev->flags & IFF_MASTER))
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_REGISTER:
+		return netvsc_register_vf(event_dev);
+	case NETDEV_UNREGISTER:
+		return netvsc_unregister_vf(event_dev);
+	case NETDEV_UP:
+	case NETDEV_DOWN:
+		return netvsc_vf_changed(event_dev);
+	default:
+		return NOTIFY_DONE;
+	}
+}
+
+static struct notifier_block netvsc_netdev_notifier = {
+	.notifier_call = netvsc_netdev_event,
+};
+
 static void __exit netvsc_drv_exit(void)
 {
+	unregister_netdevice_notifier(&netvsc_netdev_notifier);
 	vmbus_driver_unregister(&netvsc_drv);
 }
 
@@ -2135,6 +2256,7 @@
 	if (ret)
 		return ret;
 
+	register_netdevice_notifier(&netvsc_netdev_notifier);
 	return 0;
 }
 
diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index 4e4c8da..3326574 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -26,10 +26,7 @@
 #include <linux/platform_device.h>
 #include <linux/mdio-bitbang.h>
 #include <linux/mdio-gpio.h>
-#include <linux/gpio.h>
 #include <linux/gpio/consumer.h>
-
-#include <linux/of_gpio.h>
 #include <linux/of_mdio.h>
 
 struct mdio_gpio_info {
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index 9825bfd..18e819d 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -3572,11 +3572,14 @@
 	hwsim_wq = alloc_workqueue("hwsim_wq", 0, 0);
 	if (!hwsim_wq)
 		return -ENOMEM;
-	rhashtable_init(&hwsim_radios_rht, &hwsim_rht_params);
+
+	err = rhashtable_init(&hwsim_radios_rht, &hwsim_rht_params);
+	if (err)
+		goto out_free_wq;
 
 	err = register_pernet_device(&hwsim_net_ops);
 	if (err)
-		return err;
+		goto out_free_rht;
 
 	err = platform_driver_register(&mac80211_hwsim_driver);
 	if (err)
@@ -3701,6 +3704,10 @@
 	platform_driver_unregister(&mac80211_hwsim_driver);
 out_unregister_pernet:
 	unregister_pernet_device(&hwsim_net_ops);
+out_free_rht:
+	rhashtable_destroy(&hwsim_radios_rht);
+out_free_wq:
+	destroy_workqueue(hwsim_wq);
 	return err;
 }
 module_init(init_mac80211_hwsim);
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 679da1a..922ce0a 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -239,7 +239,7 @@
 static int netfront_tx_slot_available(struct netfront_queue *queue)
 {
 	return (queue->tx.req_prod_pvt - queue->tx.rsp_cons) <
-		(NET_TX_RING_SIZE - MAX_SKB_FRAGS - 2);
+		(NET_TX_RING_SIZE - XEN_NETIF_NR_SLOTS_MIN - 1);
 }
 
 static void xennet_maybe_wake_tx(struct netfront_queue *queue)
@@ -790,7 +790,7 @@
 	RING_IDX cons = queue->rx.rsp_cons;
 	struct sk_buff *skb = xennet_get_rx_skb(queue, cons);
 	grant_ref_t ref = xennet_get_rx_ref(queue, cons);
-	int max = MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD);
+	int max = XEN_NETIF_NR_SLOTS_MIN + (rx->status <= RX_COPY_THRESHOLD);
 	int slots = 1;
 	int err = 0;
 	unsigned long ret;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index ce8c95b..a502f1a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2349,6 +2349,9 @@
 	struct vhost_msg_node *node = kmalloc(sizeof *node, GFP_KERNEL);
 	if (!node)
 		return NULL;
+
+	/* Make sure all padding within the structure is initialized. */
+	memset(&node->msg, 0, sizeof node->msg);
 	node->vq = vq;
 	node->msg.type = type;
 	return node;
diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index d942542..591a13a 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -1437,7 +1437,7 @@
 
 config FB_VIA
        tristate "VIA UniChrome (Pro) and Chrome9 display support"
-       depends on FB && PCI && X86 && GPIOLIB && I2C
+       depends on FB && PCI && GPIOLIB && I2C && (X86 || COMPILE_TEST)
        select FB_CFB_FILLRECT
        select FB_CFB_COPYAREA
        select FB_CFB_IMAGEBLIT
@@ -1888,7 +1888,6 @@
 config FB_SH_MOBILE_LCDC
 	tristate "SuperH Mobile LCDC framebuffer support"
 	depends on FB && (SUPERH || ARCH_RENESAS) && HAVE_CLK
-	depends on FB_SH_MOBILE_MERAM || !FB_SH_MOBILE_MERAM
 	select FB_SYS_FILLRECT
 	select FB_SYS_COPYAREA
 	select FB_SYS_IMAGEBLIT
@@ -2253,39 +2252,6 @@
 	  and could also have been called by other names when coupled with
 	  a bridge adapter.
 
-config FB_AUO_K190X
-	tristate "AUO-K190X EPD controller support"
-	depends on FB
-	select FB_SYS_FILLRECT
-	select FB_SYS_COPYAREA
-	select FB_SYS_IMAGEBLIT
-	select FB_SYS_FOPS
-	select FB_DEFERRED_IO
-	help
-	  Provides support for epaper controllers from the K190X series
-	  of AUO. These controllers can be used to drive epaper displays
-	  from Sipix.
-
-	  This option enables the common support, shared by the individual
-	  controller drivers. You will also have to enable the driver
-	  for the controller type used in your device.
-
-config FB_AUO_K1900
-	tristate "AUO-K1900 EPD controller support"
-	depends on FB && FB_AUO_K190X
-	help
-	  This driver implements support for the AUO K1900 epd-controller.
-	  This controller can drive Sipix epaper displays but can only do
-	  serial updates, reducing the number of possible frames per second.
-
-config FB_AUO_K1901
-	tristate "AUO-K1901 EPD controller support"
-	depends on FB && FB_AUO_K190X
-	help
-	  This driver implements support for the AUO K1901 epd-controller.
-	  This controller can drive Sipix epaper displays and supports
-	  concurrent updates, making higher frames per second possible.
-
 config FB_JZ4740
 	tristate "JZ4740 LCD framebuffer support"
 	depends on FB && MACH_JZ4740
@@ -2346,18 +2312,6 @@
 source "drivers/video/fbdev/omap2/Kconfig"
 source "drivers/video/fbdev/mmp/Kconfig"
 
-config FB_SH_MOBILE_MERAM
-	tristate "SuperH Mobile MERAM read ahead support"
-	depends on (SUPERH || ARCH_SHMOBILE)
-	select GENERIC_ALLOCATOR
-	---help---
-	  Enable MERAM support for the SuperH controller.
-
-	  This will allow for caching of the framebuffer to provide more
-	  reliable access under heavy main memory bus traffic situations.
-	  Up to 4 memory channels can be configured, allowing 4 RGB or
-	  2 YCbCr framebuffers to be configured.
-
 config FB_SSD1307
 	tristate "Solomon SSD1307 framebuffer support"
 	depends on FB && I2C
diff --git a/drivers/video/fbdev/Makefile b/drivers/video/fbdev/Makefile
index 55282a2..13c9003 100644
--- a/drivers/video/fbdev/Makefile
+++ b/drivers/video/fbdev/Makefile
@@ -100,9 +100,6 @@
 obj-$(CONFIG_FB_MAXINE)		  += maxinefb.o
 obj-$(CONFIG_FB_METRONOME)        += metronomefb.o
 obj-$(CONFIG_FB_BROADSHEET)       += broadsheetfb.o
-obj-$(CONFIG_FB_AUO_K190X)	  += auo_k190x.o
-obj-$(CONFIG_FB_AUO_K1900)	  += auo_k1900fb.o
-obj-$(CONFIG_FB_AUO_K1901)	  += auo_k1901fb.o
 obj-$(CONFIG_FB_S1D13XXX)	  += s1d13xxxfb.o
 obj-$(CONFIG_FB_SH7760)		  += sh7760fb.o
 obj-$(CONFIG_FB_IMX)              += imxfb.o
@@ -116,7 +113,6 @@
 obj-$(CONFIG_FB_UDL)		  += udlfb.o
 obj-$(CONFIG_FB_SMSCUFX)	  += smscufx.o
 obj-$(CONFIG_FB_XILINX)           += xilinxfb.o
-obj-$(CONFIG_FB_SH_MOBILE_MERAM)  += sh_mobile_meram.o
 obj-$(CONFIG_FB_SH_MOBILE_LCDC)	  += sh_mobile_lcdcfb.o
 obj-$(CONFIG_FB_OMAP)             += omap/
 obj-y                             += omap2/
diff --git a/drivers/video/fbdev/aty/aty128fb.c b/drivers/video/fbdev/aty/aty128fb.c
index 09b0e55..6cc4686 100644
--- a/drivers/video/fbdev/aty/aty128fb.c
+++ b/drivers/video/fbdev/aty/aty128fb.c
@@ -2442,7 +2442,7 @@
 		(void)aty_ld_pll(POWER_MANAGEMENT);
 		aty_st_le32(BUS_CNTL1, 0x00000010);
 		aty_st_le32(MEM_POWER_MISC, 0x0c830000);
-		mdelay(100);
+		msleep(100);
 
 		/* Switch PCI power management to D2 */
 		pci_set_power_state(pdev, PCI_D2);
diff --git a/drivers/video/fbdev/aty/radeon_pm.c b/drivers/video/fbdev/aty/radeon_pm.c
index 7137c12..e695adb 100644
--- a/drivers/video/fbdev/aty/radeon_pm.c
+++ b/drivers/video/fbdev/aty/radeon_pm.c
@@ -2678,17 +2678,17 @@
 		 * it, we'll restore the dynamic clocks state on wakeup
 		 */
 		radeon_pm_disable_dynamic_mode(rinfo);
-		mdelay(50);
+		msleep(50);
 		radeon_pm_save_regs(rinfo, 1);
 
 		if (rinfo->is_mobility && !(rinfo->pm_mode & radeon_pm_d2)) {
 			/* Switch off LVDS interface */
-			mdelay(1);
+			usleep_range(1000, 2000);
 			OUTREG(LVDS_GEN_CNTL, INREG(LVDS_GEN_CNTL) & ~(LVDS_BL_MOD_EN));
-			mdelay(1);
+			usleep_range(1000, 2000);
 			OUTREG(LVDS_GEN_CNTL, INREG(LVDS_GEN_CNTL) & ~(LVDS_EN | LVDS_ON));
 			OUTREG(LVDS_PLL_CNTL, (INREG(LVDS_PLL_CNTL) & ~30000) | 0x20000);
-			mdelay(20);
+			msleep(20);
 			OUTREG(LVDS_GEN_CNTL, INREG(LVDS_GEN_CNTL) & ~(LVDS_DIGON));
 		}
 		pci_disable_device(pdev);
diff --git a/drivers/video/fbdev/au1100fb.c b/drivers/video/fbdev/au1100fb.c
index d555a78..0adf068 100644
--- a/drivers/video/fbdev/au1100fb.c
+++ b/drivers/video/fbdev/au1100fb.c
@@ -464,7 +464,7 @@
 					    PAGE_ALIGN(fbdev->fb_len),
 					    &fbdev->fb_phys, GFP_KERNEL);
 	if (!fbdev->fb_mem) {
-		print_err("fail to allocate frambuffer (size: %dK))",
+		print_err("fail to allocate framebuffer (size: %dK))",
 			  fbdev->fb_len / 1024);
 		return -ENOMEM;
 	}
diff --git a/drivers/video/fbdev/au1200fb.c b/drivers/video/fbdev/au1200fb.c
index 87d5a62..3872cce 100644
--- a/drivers/video/fbdev/au1200fb.c
+++ b/drivers/video/fbdev/au1200fb.c
@@ -1696,7 +1696,7 @@
 				&fbdev->fb_phys, GFP_KERNEL,
 				DMA_ATTR_NON_CONSISTENT);
 		if (!fbdev->fb_mem) {
-			print_err("fail to allocate frambuffer (size: %dK))",
+			print_err("fail to allocate framebuffer (size: %dK))",
 				  fbdev->fb_len / 1024);
 			ret = -ENOMEM;
 			goto failed;
diff --git a/drivers/video/fbdev/auo_k1900fb.c b/drivers/video/fbdev/auo_k1900fb.c
deleted file mode 100644
index 7637c60..0000000
--- a/drivers/video/fbdev/auo_k1900fb.c
+++ /dev/null
@@ -1,204 +0,0 @@
-/*
- * auok190xfb.c -- FB driver for AUO-K1900 controllers
- *
- * Copyright (C) 2011, 2012 Heiko Stuebner <heiko@sntech.de>
- *
- * based on broadsheetfb.c
- *
- * Copyright (C) 2008, Jaya Kumar
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Layout is based on skeletonfb.c by James Simmons and Geert Uytterhoeven.
- *
- * This driver is written to be used with the AUO-K1900 display controller.
- *
- * It is intended to be architecture independent. A board specific driver
- * must be used to perform all the physical IO interactions.
- *
- * The controller supports different update modes:
- * mode0+1 16 step gray (4bit)
- * mode2 4 step gray (2bit) - FIXME: add strange refresh
- * mode3 2 step gray (1bit) - FIXME: add strange refresh
- * mode4 handwriting mode (strange behaviour)
- * mode5 automatic selection of update mode
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <linux/string.h>
-#include <linux/mm.h>
-#include <linux/slab.h>
-#include <linux/delay.h>
-#include <linux/interrupt.h>
-#include <linux/fb.h>
-#include <linux/init.h>
-#include <linux/platform_device.h>
-#include <linux/list.h>
-#include <linux/firmware.h>
-#include <linux/gpio.h>
-#include <linux/pm_runtime.h>
-
-#include <video/auo_k190xfb.h>
-
-#include "auo_k190x.h"
-
-/*
- * AUO-K1900 specific commands
- */
-
-#define AUOK1900_CMD_PARTIALDISP	0x1001
-#define AUOK1900_CMD_ROTATION		0x1006
-#define AUOK1900_CMD_LUT_STOP		0x1009
-
-#define AUOK1900_INIT_TEMP_AVERAGE	(1 << 13)
-#define AUOK1900_INIT_ROTATE(_x)	((_x & 0x3) << 10)
-#define AUOK1900_INIT_RESOLUTION(_res)	((_res & 0x7) << 2)
-
-static void auok1900_init(struct auok190xfb_par *par)
-{
-	struct device *dev = par->info->device;
-	struct auok190x_board *board = par->board;
-	u16 init_param = 0;
-
-	pm_runtime_get_sync(dev);
-
-	init_param |= AUOK1900_INIT_TEMP_AVERAGE;
-	init_param |= AUOK1900_INIT_ROTATE(par->rotation);
-	init_param |= AUOK190X_INIT_INVERSE_WHITE;
-	init_param |= AUOK190X_INIT_FORMAT0;
-	init_param |= AUOK1900_INIT_RESOLUTION(par->resolution);
-	init_param |= AUOK190X_INIT_SHIFT_RIGHT;
-
-	auok190x_send_cmdargs(par, AUOK190X_CMD_INIT, 1, &init_param);
-
-	/* let the controller finish */
-	board->wait_for_rdy(par);
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-}
-
-static void auok1900_update_region(struct auok190xfb_par *par, int mode,
-						u16 y1, u16 y2)
-{
-	struct device *dev = par->info->device;
-	unsigned char *buf = (unsigned char *)par->info->screen_base;
-	int xres = par->info->var.xres;
-	int line_length = par->info->fix.line_length;
-	u16 args[4];
-
-	pm_runtime_get_sync(dev);
-
-	mutex_lock(&(par->io_lock));
-
-	/* y1 and y2 must be a multiple of 2 so drop the lowest bit */
-	y1 &= 0xfffe;
-	y2 &= 0xfffe;
-
-	dev_dbg(dev, "update (x,y,w,h,mode)=(%d,%d,%d,%d,%d)\n",
-		1, y1+1, xres, y2-y1, mode);
-
-	/* to FIX handle different partial update modes */
-	args[0] = mode | 1;
-	args[1] = y1 + 1;
-	args[2] = xres;
-	args[3] = y2 - y1;
-	buf += y1 * line_length;
-	auok190x_send_cmdargs_pixels(par, AUOK1900_CMD_PARTIALDISP, 4, args,
-				     ((y2 - y1) * line_length)/2, (u16 *) buf);
-	auok190x_send_command(par, AUOK190X_CMD_DATA_STOP);
-
-	par->update_cnt++;
-
-	mutex_unlock(&(par->io_lock));
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-}
-
-static void auok1900fb_dpy_update_pages(struct auok190xfb_par *par,
-						u16 y1, u16 y2)
-{
-	int mode;
-
-	if (par->update_mode < 0) {
-		mode = AUOK190X_UPDATE_MODE(1);
-		par->last_mode = -1;
-	} else {
-		mode = AUOK190X_UPDATE_MODE(par->update_mode);
-		par->last_mode = par->update_mode;
-	}
-
-	if (par->flash)
-		mode |= AUOK190X_UPDATE_NONFLASH;
-
-	auok1900_update_region(par, mode, y1, y2);
-}
-
-static void auok1900fb_dpy_update(struct auok190xfb_par *par)
-{
-	int mode;
-
-	if (par->update_mode < 0) {
-		mode = AUOK190X_UPDATE_MODE(0);
-		par->last_mode = -1;
-	} else {
-		mode = AUOK190X_UPDATE_MODE(par->update_mode);
-		par->last_mode = par->update_mode;
-	}
-
-	if (par->flash)
-		mode |= AUOK190X_UPDATE_NONFLASH;
-
-	auok1900_update_region(par, mode, 0, par->info->var.yres);
-	par->update_cnt = 0;
-}
-
-static bool auok1900fb_need_refresh(struct auok190xfb_par *par)
-{
-	return (par->update_cnt > 10);
-}
-
-static int auok1900fb_probe(struct platform_device *pdev)
-{
-	struct auok190x_init_data init;
-	struct auok190x_board *board;
-
-	/* pick up board specific routines */
-	board = pdev->dev.platform_data;
-	if (!board)
-		return -EINVAL;
-
-	/* fill temporary init struct for common init */
-	init.id = "auo_k1900fb";
-	init.board = board;
-	init.update_partial = auok1900fb_dpy_update_pages;
-	init.update_all = auok1900fb_dpy_update;
-	init.need_refresh = auok1900fb_need_refresh;
-	init.init = auok1900_init;
-
-	return auok190x_common_probe(pdev, &init);
-}
-
-static int auok1900fb_remove(struct platform_device *pdev)
-{
-	return auok190x_common_remove(pdev);
-}
-
-static struct platform_driver auok1900fb_driver = {
-	.probe	= auok1900fb_probe,
-	.remove = auok1900fb_remove,
-	.driver	= {
-		.name	= "auo_k1900fb",
-		.pm = &auok190x_pm,
-	},
-};
-module_platform_driver(auok1900fb_driver);
-
-MODULE_DESCRIPTION("framebuffer driver for the AUO-K1900 EPD controller");
-MODULE_AUTHOR("Heiko Stuebner <heiko@sntech.de>");
-MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/auo_k1901fb.c b/drivers/video/fbdev/auo_k1901fb.c
deleted file mode 100644
index 681fe61..0000000
--- a/drivers/video/fbdev/auo_k1901fb.c
+++ /dev/null
@@ -1,257 +0,0 @@
-/*
- * auok190xfb.c -- FB driver for AUO-K1901 controllers
- *
- * Copyright (C) 2011, 2012 Heiko Stuebner <heiko@sntech.de>
- *
- * based on broadsheetfb.c
- *
- * Copyright (C) 2008, Jaya Kumar
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Layout is based on skeletonfb.c by James Simmons and Geert Uytterhoeven.
- *
- * This driver is written to be used with the AUO-K1901 display controller.
- *
- * It is intended to be architecture independent. A board specific driver
- * must be used to perform all the physical IO interactions.
- *
- * The controller supports different update modes:
- * mode0+1 16 step gray (4bit)
- * mode2+3 4 step gray (2bit)
- * mode4+5 2 step gray (1bit)
- * - mode4 is described as "without LUT"
- * mode7 automatic selection of update mode
- *
- * The most interesting difference to the K1900 is the ability to do screen
- * updates in an asynchronous fashion. Where the K1900 needs to wait for the
- * current update to complete, the K1901 can process later updates already.
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <linux/string.h>
-#include <linux/mm.h>
-#include <linux/slab.h>
-#include <linux/delay.h>
-#include <linux/interrupt.h>
-#include <linux/fb.h>
-#include <linux/init.h>
-#include <linux/platform_device.h>
-#include <linux/list.h>
-#include <linux/firmware.h>
-#include <linux/gpio.h>
-#include <linux/pm_runtime.h>
-
-#include <video/auo_k190xfb.h>
-
-#include "auo_k190x.h"
-
-/*
- * AUO-K1901 specific commands
- */
-
-#define AUOK1901_CMD_LUT_INTERFACE	0x0005
-#define AUOK1901_CMD_DMA_START		0x1001
-#define AUOK1901_CMD_CURSOR_START	0x1007
-#define AUOK1901_CMD_CURSOR_STOP	AUOK190X_CMD_DATA_STOP
-#define AUOK1901_CMD_DDMA_START		0x1009
-
-#define AUOK1901_INIT_GATE_PULSE_LOW	(0 << 14)
-#define AUOK1901_INIT_GATE_PULSE_HIGH	(1 << 14)
-#define AUOK1901_INIT_SINGLE_GATE	(0 << 13)
-#define AUOK1901_INIT_DOUBLE_GATE	(1 << 13)
-
-/* Bits to pixels
- *   Mode	15-12	11-8	7-4	3-0
- *   format2	2	T	1	T
- *   format3	1	T	2	T
- *   format4	T	2	T	1
- *   format5	T	1	T	2
- *
- *   halftone modes:
- *   format6	2	2	1	1
- *   format7	1	1	2	2
- */
-#define AUOK1901_INIT_FORMAT2		(1 << 7)
-#define AUOK1901_INIT_FORMAT3		((1 << 7) | (1 << 6))
-#define AUOK1901_INIT_FORMAT4		(1 << 8)
-#define AUOK1901_INIT_FORMAT5		((1 << 8) | (1 << 6))
-#define AUOK1901_INIT_FORMAT6		((1 << 8) | (1 << 7))
-#define AUOK1901_INIT_FORMAT7		((1 << 8) | (1 << 7) | (1 << 6))
-
-/* res[4] to bit 10
- * res[3-0] to bits 5-2
- */
-#define AUOK1901_INIT_RESOLUTION(_res)	(((_res & (1 << 4)) << 6) \
-					 | ((_res & 0xf) << 2))
-
-/*
- * portrait / landscape orientation in AUOK1901_CMD_DMA_START
- */
-#define AUOK1901_DMA_ROTATE90(_rot)		((_rot & 1) << 13)
-
-/*
- * equivalent to 1 << 11, needs the ~ to have same rotation like K1900
- */
-#define AUOK1901_DDMA_ROTATE180(_rot)		((~_rot & 2) << 10)
-
-static void auok1901_init(struct auok190xfb_par *par)
-{
-	struct device *dev = par->info->device;
-	struct auok190x_board *board = par->board;
-	u16 init_param = 0;
-
-	pm_runtime_get_sync(dev);
-
-	init_param |= AUOK190X_INIT_INVERSE_WHITE;
-	init_param |= AUOK190X_INIT_FORMAT0;
-	init_param |= AUOK1901_INIT_RESOLUTION(par->resolution);
-	init_param |= AUOK190X_INIT_SHIFT_LEFT;
-
-	auok190x_send_cmdargs(par, AUOK190X_CMD_INIT, 1, &init_param);
-
-	/* let the controller finish */
-	board->wait_for_rdy(par);
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-}
-
-static void auok1901_update_region(struct auok190xfb_par *par, int mode,
-						u16 y1, u16 y2)
-{
-	struct device *dev = par->info->device;
-	unsigned char *buf = (unsigned char *)par->info->screen_base;
-	int xres = par->info->var.xres;
-	int line_length = par->info->fix.line_length;
-	u16 args[5];
-
-	pm_runtime_get_sync(dev);
-
-	mutex_lock(&(par->io_lock));
-
-	/* y1 and y2 must be a multiple of 2 so drop the lowest bit */
-	y1 &= 0xfffe;
-	y2 &= 0xfffe;
-
-	dev_dbg(dev, "update (x,y,w,h,mode)=(%d,%d,%d,%d,%d)\n",
-		1, y1+1, xres, y2-y1, mode);
-
-	/* K1901: first transfer the region data */
-	args[0] = AUOK1901_DMA_ROTATE90(par->rotation) | 1;
-	args[1] = y1 + 1;
-	args[2] = xres;
-	args[3] = y2 - y1;
-	buf += y1 * line_length;
-	auok190x_send_cmdargs_pixels_nowait(par, AUOK1901_CMD_DMA_START, 4,
-					    args, ((y2 - y1) * line_length)/2,
-					    (u16 *) buf);
-	auok190x_send_command_nowait(par, AUOK190X_CMD_DATA_STOP);
-
-	/* K1901: second tell the controller to update the region with mode */
-	args[0] = mode | AUOK1901_DDMA_ROTATE180(par->rotation);
-	args[1] = 1;
-	args[2] = y1 + 1;
-	args[3] = xres;
-	args[4] = y2 - y1;
-	auok190x_send_cmdargs_nowait(par, AUOK1901_CMD_DDMA_START, 5, args);
-
-	par->update_cnt++;
-
-	mutex_unlock(&(par->io_lock));
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-}
-
-static void auok1901fb_dpy_update_pages(struct auok190xfb_par *par,
-						u16 y1, u16 y2)
-{
-	int mode;
-
-	if (par->update_mode < 0) {
-		mode = AUOK190X_UPDATE_MODE(1);
-		par->last_mode = -1;
-	} else {
-		mode = AUOK190X_UPDATE_MODE(par->update_mode);
-		par->last_mode = par->update_mode;
-	}
-
-	if (par->flash)
-		mode |= AUOK190X_UPDATE_NONFLASH;
-
-	auok1901_update_region(par, mode, y1, y2);
-}
-
-static void auok1901fb_dpy_update(struct auok190xfb_par *par)
-{
-	int mode;
-
-	/* When doing full updates, wait for the controller to be ready
-	 * This will hopefully catch some hangs of the K1901
-	 */
-	par->board->wait_for_rdy(par);
-
-	if (par->update_mode < 0) {
-		mode = AUOK190X_UPDATE_MODE(0);
-		par->last_mode = -1;
-	} else {
-		mode = AUOK190X_UPDATE_MODE(par->update_mode);
-		par->last_mode = par->update_mode;
-	}
-
-	if (par->flash)
-		mode |= AUOK190X_UPDATE_NONFLASH;
-
-	auok1901_update_region(par, mode, 0, par->info->var.yres);
-	par->update_cnt = 0;
-}
-
-static bool auok1901fb_need_refresh(struct auok190xfb_par *par)
-{
-	return (par->update_cnt > 10);
-}
-
-static int auok1901fb_probe(struct platform_device *pdev)
-{
-	struct auok190x_init_data init;
-	struct auok190x_board *board;
-
-	/* pick up board specific routines */
-	board = pdev->dev.platform_data;
-	if (!board)
-		return -EINVAL;
-
-	/* fill temporary init struct for common init */
-	init.id = "auo_k1901fb";
-	init.board = board;
-	init.update_partial = auok1901fb_dpy_update_pages;
-	init.update_all = auok1901fb_dpy_update;
-	init.need_refresh = auok1901fb_need_refresh;
-	init.init = auok1901_init;
-
-	return auok190x_common_probe(pdev, &init);
-}
-
-static int auok1901fb_remove(struct platform_device *pdev)
-{
-	return auok190x_common_remove(pdev);
-}
-
-static struct platform_driver auok1901fb_driver = {
-	.probe	= auok1901fb_probe,
-	.remove = auok1901fb_remove,
-	.driver	= {
-		.name	= "auo_k1901fb",
-		.pm = &auok190x_pm,
-	},
-};
-module_platform_driver(auok1901fb_driver);
-
-MODULE_DESCRIPTION("framebuffer driver for the AUO-K1901 EPD controller");
-MODULE_AUTHOR("Heiko Stuebner <heiko@sntech.de>");
-MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/auo_k190x.c b/drivers/video/fbdev/auo_k190x.c
deleted file mode 100644
index 9d24d1b..0000000
--- a/drivers/video/fbdev/auo_k190x.c
+++ /dev/null
@@ -1,1195 +0,0 @@
-/*
- * Common code for AUO-K190X framebuffer drivers
- *
- * Copyright (C) 2012 Heiko Stuebner <heiko@sntech.de>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/module.h>
-#include <linux/sched/mm.h>
-#include <linux/kernel.h>
-#include <linux/gpio.h>
-#include <linux/platform_device.h>
-#include <linux/pm_runtime.h>
-#include <linux/fb.h>
-#include <linux/delay.h>
-#include <linux/uaccess.h>
-#include <linux/vmalloc.h>
-#include <linux/regulator/consumer.h>
-
-#include <video/auo_k190xfb.h>
-
-#include "auo_k190x.h"
-
-struct panel_info {
-	int w;
-	int h;
-};
-
-/* table of panel specific parameters to be indexed into by the board drivers */
-static struct panel_info panel_table[] = {
-	/* standard 6" */
-	[AUOK190X_RESOLUTION_800_600] = {
-		.w = 800,
-		.h = 600,
-	},
-	/* standard 9" */
-	[AUOK190X_RESOLUTION_1024_768] = {
-		.w = 1024,
-		.h = 768,
-	},
-	[AUOK190X_RESOLUTION_600_800] = {
-		.w = 600,
-		.h = 800,
-	},
-	[AUOK190X_RESOLUTION_768_1024] = {
-		.w = 768,
-		.h = 1024,
-	},
-};
-
-/*
- * private I80 interface to the board driver
- */
-
-static void auok190x_issue_data(struct auok190xfb_par *par, u16 data)
-{
-	par->board->set_ctl(par, AUOK190X_I80_WR, 0);
-	par->board->set_hdb(par, data);
-	par->board->set_ctl(par, AUOK190X_I80_WR, 1);
-}
-
-static void auok190x_issue_cmd(struct auok190xfb_par *par, u16 data)
-{
-	par->board->set_ctl(par, AUOK190X_I80_DC, 0);
-	auok190x_issue_data(par, data);
-	par->board->set_ctl(par, AUOK190X_I80_DC, 1);
-}
-
-/**
- * Conversion of 16bit color to 4bit grayscale
- * does roughly (0.3 * R + 0.6 G + 0.1 B) / 2
- */
-static inline int rgb565_to_gray4(u16 data, struct fb_var_screeninfo *var)
-{
-	return ((((data & 0xF800) >> var->red.offset) * 77 +
-		 ((data & 0x07E0) >> (var->green.offset + 1)) * 151 +
-		 ((data & 0x1F) >> var->blue.offset) * 28) >> 8 >> 1);
-}
-
-static int auok190x_issue_pixels_rgb565(struct auok190xfb_par *par, int size,
-					u16 *data)
-{
-	struct fb_var_screeninfo *var = &par->info->var;
-	struct device *dev = par->info->device;
-	int i;
-	u16 tmp;
-
-	if (size & 7) {
-		dev_err(dev, "issue_pixels: size %d must be a multiple of 8\n",
-			size);
-		return -EINVAL;
-	}
-
-	for (i = 0; i < (size >> 2); i++) {
-		par->board->set_ctl(par, AUOK190X_I80_WR, 0);
-
-		tmp  = (rgb565_to_gray4(data[4*i], var) & 0x000F);
-		tmp |= (rgb565_to_gray4(data[4*i+1], var) << 4) & 0x00F0;
-		tmp |= (rgb565_to_gray4(data[4*i+2], var) << 8) & 0x0F00;
-		tmp |= (rgb565_to_gray4(data[4*i+3], var) << 12) & 0xF000;
-
-		par->board->set_hdb(par, tmp);
-		par->board->set_ctl(par, AUOK190X_I80_WR, 1);
-	}
-
-	return 0;
-}
-
-static int auok190x_issue_pixels_gray8(struct auok190xfb_par *par, int size,
-				       u16 *data)
-{
-	struct device *dev = par->info->device;
-	int i;
-	u16 tmp;
-
-	if (size & 3) {
-		dev_err(dev, "issue_pixels: size %d must be a multiple of 4\n",
-			size);
-		return -EINVAL;
-	}
-
-	for (i = 0; i < (size >> 1); i++) {
-		par->board->set_ctl(par, AUOK190X_I80_WR, 0);
-
-		/* simple reduction of 8bit staticgray to 4bit gray
-		 * combines 4 * 4bit pixel values into a 16bit value
-		 */
-		tmp  = (data[2*i] & 0xF0) >> 4;
-		tmp |= (data[2*i] & 0xF000) >> 8;
-		tmp |= (data[2*i+1] & 0xF0) << 4;
-		tmp |= (data[2*i+1] & 0xF000);
-
-		par->board->set_hdb(par, tmp);
-		par->board->set_ctl(par, AUOK190X_I80_WR, 1);
-	}
-
-	return 0;
-}
-
-static int auok190x_issue_pixels(struct auok190xfb_par *par, int size,
-				 u16 *data)
-{
-	struct fb_info *info = par->info;
-	struct device *dev = par->info->device;
-
-	if (info->var.bits_per_pixel == 8 && info->var.grayscale)
-		auok190x_issue_pixels_gray8(par, size, data);
-	else if (info->var.bits_per_pixel == 16)
-		auok190x_issue_pixels_rgb565(par, size, data);
-	else
-		dev_err(dev, "unsupported color mode (bits: %d, gray: %d)\n",
-			info->var.bits_per_pixel, info->var.grayscale);
-
-	return 0;
-}
-
-static u16 auok190x_read_data(struct auok190xfb_par *par)
-{
-	u16 data;
-
-	par->board->set_ctl(par, AUOK190X_I80_OE, 0);
-	data = par->board->get_hdb(par);
-	par->board->set_ctl(par, AUOK190X_I80_OE, 1);
-
-	return data;
-}
-
-/*
- * Command interface for the controller drivers
- */
-
-void auok190x_send_command_nowait(struct auok190xfb_par *par, u16 data)
-{
-	par->board->set_ctl(par, AUOK190X_I80_CS, 0);
-	auok190x_issue_cmd(par, data);
-	par->board->set_ctl(par, AUOK190X_I80_CS, 1);
-}
-EXPORT_SYMBOL_GPL(auok190x_send_command_nowait);
-
-void auok190x_send_cmdargs_nowait(struct auok190xfb_par *par, u16 cmd,
-				  int argc, u16 *argv)
-{
-	int i;
-
-	par->board->set_ctl(par, AUOK190X_I80_CS, 0);
-	auok190x_issue_cmd(par, cmd);
-
-	for (i = 0; i < argc; i++)
-		auok190x_issue_data(par, argv[i]);
-	par->board->set_ctl(par, AUOK190X_I80_CS, 1);
-}
-EXPORT_SYMBOL_GPL(auok190x_send_cmdargs_nowait);
-
-int auok190x_send_command(struct auok190xfb_par *par, u16 data)
-{
-	int ret;
-
-	ret = par->board->wait_for_rdy(par);
-	if (ret)
-		return ret;
-
-	auok190x_send_command_nowait(par, data);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(auok190x_send_command);
-
-int auok190x_send_cmdargs(struct auok190xfb_par *par, u16 cmd,
-			   int argc, u16 *argv)
-{
-	int ret;
-
-	ret = par->board->wait_for_rdy(par);
-	if (ret)
-		return ret;
-
-	auok190x_send_cmdargs_nowait(par, cmd, argc, argv);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(auok190x_send_cmdargs);
-
-int auok190x_read_cmdargs(struct auok190xfb_par *par, u16 cmd,
-			   int argc, u16 *argv)
-{
-	int i, ret;
-
-	ret = par->board->wait_for_rdy(par);
-	if (ret)
-		return ret;
-
-	par->board->set_ctl(par, AUOK190X_I80_CS, 0);
-	auok190x_issue_cmd(par, cmd);
-
-	for (i = 0; i < argc; i++)
-		argv[i] = auok190x_read_data(par);
-	par->board->set_ctl(par, AUOK190X_I80_CS, 1);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(auok190x_read_cmdargs);
-
-void auok190x_send_cmdargs_pixels_nowait(struct auok190xfb_par *par, u16 cmd,
-				  int argc, u16 *argv, int size, u16 *data)
-{
-	int i;
-
-	par->board->set_ctl(par, AUOK190X_I80_CS, 0);
-
-	auok190x_issue_cmd(par, cmd);
-
-	for (i = 0; i < argc; i++)
-		auok190x_issue_data(par, argv[i]);
-
-	auok190x_issue_pixels(par, size, data);
-
-	par->board->set_ctl(par, AUOK190X_I80_CS, 1);
-}
-EXPORT_SYMBOL_GPL(auok190x_send_cmdargs_pixels_nowait);
-
-int auok190x_send_cmdargs_pixels(struct auok190xfb_par *par, u16 cmd,
-				  int argc, u16 *argv, int size, u16 *data)
-{
-	int ret;
-
-	ret = par->board->wait_for_rdy(par);
-	if (ret)
-		return ret;
-
-	auok190x_send_cmdargs_pixels_nowait(par, cmd, argc, argv, size, data);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(auok190x_send_cmdargs_pixels);
-
-/*
- * fbdefio callbacks - common on both controllers.
- */
-
-static void auok190xfb_dpy_first_io(struct fb_info *info)
-{
-	/* tell runtime-pm that we wish to use the device in a short time */
-	pm_runtime_get(info->device);
-}
-
-/* this is called back from the deferred io workqueue */
-static void auok190xfb_dpy_deferred_io(struct fb_info *info,
-				struct list_head *pagelist)
-{
-	struct fb_deferred_io *fbdefio = info->fbdefio;
-	struct auok190xfb_par *par = info->par;
-	u16 line_length = info->fix.line_length;
-	u16 yres = info->var.yres;
-	u16 y1 = 0, h = 0;
-	int prev_index = -1;
-	struct page *cur;
-	int h_inc;
-	int threshold;
-
-	if (!list_empty(pagelist))
-		/* the device resume should've been requested through first_io,
-		 * if the resume did not finish until now, wait for it.
-		 */
-		pm_runtime_barrier(info->device);
-	else
-		/* We reached this via the fsync or some other way.
-		 * In either case the first_io function did not run,
-		 * so we runtime_resume the device here synchronously.
-		 */
-		pm_runtime_get_sync(info->device);
-
-	/* Do a full screen update every n updates to prevent
-	 * excessive darkening of the Sipix display.
-	 * If we do this, there is no need to walk the pages.
-	 */
-	if (par->need_refresh(par)) {
-		par->update_all(par);
-		goto out;
-	}
-
-	/* height increment is fixed per page */
-	h_inc = DIV_ROUND_UP(PAGE_SIZE , line_length);
-
-	/* calculate number of pages from pixel height */
-	threshold = par->consecutive_threshold / h_inc;
-	if (threshold < 1)
-		threshold = 1;
-
-	/* walk the written page list and swizzle the data */
-	list_for_each_entry(cur, &fbdefio->pagelist, lru) {
-		if (prev_index < 0) {
-			/* just starting so assign first page */
-			y1 = (cur->index << PAGE_SHIFT) / line_length;
-			h = h_inc;
-		} else if ((cur->index - prev_index) <= threshold) {
-			/* page is within our threshold for single updates */
-			h += h_inc * (cur->index - prev_index);
-		} else {
-			/* page not consecutive, issue previous update first */
-			par->update_partial(par, y1, y1 + h);
-
-			/* start over with our non consecutive page */
-			y1 = (cur->index << PAGE_SHIFT) / line_length;
-			h = h_inc;
-		}
-		prev_index = cur->index;
-	}
-
-	/* if we still have any pages to update we do so now */
-	if (h >= yres)
-		/* its a full screen update, just do it */
-		par->update_all(par);
-	else
-		par->update_partial(par, y1, min((u16) (y1 + h), yres));
-
-out:
-	pm_runtime_mark_last_busy(info->device);
-	pm_runtime_put_autosuspend(info->device);
-}
-
-/*
- * framebuffer operations
- */
-
-/*
- * this is the slow path from userspace. they can seek and write to
- * the fb. it's inefficient to do anything less than a full screen draw
- */
-static ssize_t auok190xfb_write(struct fb_info *info, const char __user *buf,
-				size_t count, loff_t *ppos)
-{
-	struct auok190xfb_par *par = info->par;
-	unsigned long p = *ppos;
-	void *dst;
-	int err = 0;
-	unsigned long total_size;
-
-	if (info->state != FBINFO_STATE_RUNNING)
-		return -EPERM;
-
-	total_size = info->fix.smem_len;
-
-	if (p > total_size)
-		return -EFBIG;
-
-	if (count > total_size) {
-		err = -EFBIG;
-		count = total_size;
-	}
-
-	if (count + p > total_size) {
-		if (!err)
-			err = -ENOSPC;
-
-		count = total_size - p;
-	}
-
-	dst = (void *)(info->screen_base + p);
-
-	if (copy_from_user(dst, buf, count))
-		err = -EFAULT;
-
-	if  (!err)
-		*ppos += count;
-
-	par->update_all(par);
-
-	return (err) ? err : count;
-}
-
-static void auok190xfb_fillrect(struct fb_info *info,
-				   const struct fb_fillrect *rect)
-{
-	struct auok190xfb_par *par = info->par;
-
-	sys_fillrect(info, rect);
-
-	par->update_all(par);
-}
-
-static void auok190xfb_copyarea(struct fb_info *info,
-				   const struct fb_copyarea *area)
-{
-	struct auok190xfb_par *par = info->par;
-
-	sys_copyarea(info, area);
-
-	par->update_all(par);
-}
-
-static void auok190xfb_imageblit(struct fb_info *info,
-				const struct fb_image *image)
-{
-	struct auok190xfb_par *par = info->par;
-
-	sys_imageblit(info, image);
-
-	par->update_all(par);
-}
-
-static int auok190xfb_check_var(struct fb_var_screeninfo *var,
-				   struct fb_info *info)
-{
-	struct device *dev = info->device;
-	struct auok190xfb_par *par = info->par;
-	struct panel_info *panel = &panel_table[par->resolution];
-	int size;
-
-	/*
-	 * Color depth
-	 */
-
-	if (var->bits_per_pixel == 8 && var->grayscale == 1) {
-		/*
-		 * For 8-bit grayscale, R, G, and B offset are equal.
-		 */
-		var->red.length = 8;
-		var->red.offset = 0;
-		var->red.msb_right = 0;
-
-		var->green.length = 8;
-		var->green.offset = 0;
-		var->green.msb_right = 0;
-
-		var->blue.length = 8;
-		var->blue.offset = 0;
-		var->blue.msb_right = 0;
-
-		var->transp.length = 0;
-		var->transp.offset = 0;
-		var->transp.msb_right = 0;
-	} else if (var->bits_per_pixel == 16) {
-		var->red.length = 5;
-		var->red.offset = 11;
-		var->red.msb_right = 0;
-
-		var->green.length = 6;
-		var->green.offset = 5;
-		var->green.msb_right = 0;
-
-		var->blue.length = 5;
-		var->blue.offset = 0;
-		var->blue.msb_right = 0;
-
-		var->transp.length = 0;
-		var->transp.offset = 0;
-		var->transp.msb_right = 0;
-	} else {
-		dev_warn(dev, "unsupported color mode (bits: %d, grayscale: %d)\n",
-			info->var.bits_per_pixel, info->var.grayscale);
-		return -EINVAL;
-	}
-
-	/*
-	 * Dimensions
-	 */
-
-	switch (var->rotate) {
-	case FB_ROTATE_UR:
-	case FB_ROTATE_UD:
-		var->xres = panel->w;
-		var->yres = panel->h;
-		break;
-	case FB_ROTATE_CW:
-	case FB_ROTATE_CCW:
-		var->xres = panel->h;
-		var->yres = panel->w;
-		break;
-	default:
-		dev_dbg(dev, "Invalid rotation request\n");
-		return -EINVAL;
-	}
-
-	var->xres_virtual = var->xres;
-	var->yres_virtual = var->yres;
-
-	/*
-	 *  Memory limit
-	 */
-
-	size = var->xres_virtual * var->yres_virtual * var->bits_per_pixel / 8;
-	if (size > info->fix.smem_len) {
-		dev_err(dev, "Memory limit exceeded, requested %dK\n",
-			size >> 10);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static int auok190xfb_set_fix(struct fb_info *info)
-{
-	struct fb_fix_screeninfo *fix = &info->fix;
-	struct fb_var_screeninfo *var = &info->var;
-
-	fix->line_length = var->xres_virtual * var->bits_per_pixel / 8;
-
-	fix->type = FB_TYPE_PACKED_PIXELS;
-	fix->accel = FB_ACCEL_NONE;
-	fix->visual = (var->grayscale) ? FB_VISUAL_STATIC_PSEUDOCOLOR
-				       : FB_VISUAL_TRUECOLOR;
-	fix->xpanstep = 0;
-	fix->ypanstep = 0;
-	fix->ywrapstep = 0;
-
-	return 0;
-}
-
-static int auok190xfb_set_par(struct fb_info *info)
-{
-	struct auok190xfb_par *par = info->par;
-
-	par->rotation = info->var.rotate;
-	auok190xfb_set_fix(info);
-
-	/* reinit the controller to honor the rotation */
-	par->init(par);
-
-	/* wait for init to complete */
-	par->board->wait_for_rdy(par);
-
-	return 0;
-}
-
-static struct fb_ops auok190xfb_ops = {
-	.owner		= THIS_MODULE,
-	.fb_read	= fb_sys_read,
-	.fb_write	= auok190xfb_write,
-	.fb_fillrect	= auok190xfb_fillrect,
-	.fb_copyarea	= auok190xfb_copyarea,
-	.fb_imageblit	= auok190xfb_imageblit,
-	.fb_check_var	= auok190xfb_check_var,
-	.fb_set_par     = auok190xfb_set_par,
-};
-
-/*
- * Controller-functions common to both K1900 and K1901
- */
-
-static int auok190x_read_temperature(struct auok190xfb_par *par)
-{
-	struct device *dev = par->info->device;
-	u16 data[4];
-	int temp;
-
-	pm_runtime_get_sync(dev);
-
-	mutex_lock(&(par->io_lock));
-
-	auok190x_read_cmdargs(par, AUOK190X_CMD_READ_VERSION, 4, data);
-
-	mutex_unlock(&(par->io_lock));
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-
-	/* sanitize and split of half-degrees for now */
-	temp = ((data[0] & AUOK190X_VERSION_TEMP_MASK) >> 1);
-
-	/* handle positive and negative temperatures */
-	if (temp >= 201)
-		return (255 - temp + 1) * (-1);
-	else
-		return temp;
-}
-
-static void auok190x_identify(struct auok190xfb_par *par)
-{
-	struct device *dev = par->info->device;
-	u16 data[4];
-
-	pm_runtime_get_sync(dev);
-
-	mutex_lock(&(par->io_lock));
-
-	auok190x_read_cmdargs(par, AUOK190X_CMD_READ_VERSION, 4, data);
-
-	mutex_unlock(&(par->io_lock));
-
-	par->epd_type = data[1] & AUOK190X_VERSION_TEMP_MASK;
-
-	par->panel_size_int = AUOK190X_VERSION_SIZE_INT(data[2]);
-	par->panel_size_float = AUOK190X_VERSION_SIZE_FLOAT(data[2]);
-	par->panel_model = AUOK190X_VERSION_MODEL(data[2]);
-
-	par->tcon_version = AUOK190X_VERSION_TCON(data[3]);
-	par->lut_version = AUOK190X_VERSION_LUT(data[3]);
-
-	dev_dbg(dev, "panel %d.%din, model 0x%x, EPD 0x%x TCON-rev 0x%x, LUT-rev 0x%x",
-		par->panel_size_int, par->panel_size_float, par->panel_model,
-		par->epd_type, par->tcon_version, par->lut_version);
-
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-}
-
-/*
- * Sysfs functions
- */
-
-static ssize_t update_mode_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
-{
-	struct fb_info *info = dev_get_drvdata(dev);
-	struct auok190xfb_par *par = info->par;
-
-	return sprintf(buf, "%d\n", par->update_mode);
-}
-
-static ssize_t update_mode_store(struct device *dev,
-				 struct device_attribute *attr,
-				 const char *buf, size_t count)
-{
-	struct fb_info *info = dev_get_drvdata(dev);
-	struct auok190xfb_par *par = info->par;
-	int mode, ret;
-
-	ret = kstrtoint(buf, 10, &mode);
-	if (ret)
-		return ret;
-
-	par->update_mode = mode;
-
-	/* if we enter a better mode, do a full update */
-	if (par->last_mode > 1 && mode < par->last_mode)
-		par->update_all(par);
-
-	return count;
-}
-
-static ssize_t flash_show(struct device *dev, struct device_attribute *attr,
-			  char *buf)
-{
-	struct fb_info *info = dev_get_drvdata(dev);
-	struct auok190xfb_par *par = info->par;
-
-	return sprintf(buf, "%d\n", par->flash);
-}
-
-static ssize_t flash_store(struct device *dev, struct device_attribute *attr,
-			   const char *buf, size_t count)
-{
-	struct fb_info *info = dev_get_drvdata(dev);
-	struct auok190xfb_par *par = info->par;
-	int flash, ret;
-
-	ret = kstrtoint(buf, 10, &flash);
-	if (ret)
-		return ret;
-
-	if (flash > 0)
-		par->flash = 1;
-	else
-		par->flash = 0;
-
-	return count;
-}
-
-static ssize_t temp_show(struct device *dev, struct device_attribute *attr,
-			 char *buf)
-{
-	struct fb_info *info = dev_get_drvdata(dev);
-	struct auok190xfb_par *par = info->par;
-	int temp;
-
-	temp = auok190x_read_temperature(par);
-	return sprintf(buf, "%d\n", temp);
-}
-
-static DEVICE_ATTR_RW(update_mode);
-static DEVICE_ATTR_RW(flash);
-static DEVICE_ATTR(temp, 0644, temp_show, NULL);
-
-static struct attribute *auok190x_attributes[] = {
-	&dev_attr_update_mode.attr,
-	&dev_attr_flash.attr,
-	&dev_attr_temp.attr,
-	NULL
-};
-
-static const struct attribute_group auok190x_attr_group = {
-	.attrs		= auok190x_attributes,
-};
-
-static int auok190x_power(struct auok190xfb_par *par, bool on)
-{
-	struct auok190x_board *board = par->board;
-	int ret;
-
-	if (on) {
-		/* We should maintain POWER up for at least 80ms before set
-		 * RST_N and SLP_N to high (TCON spec 20100803_v35 p59)
-		 */
-		ret = regulator_enable(par->regulator);
-		if (ret)
-			return ret;
-
-		msleep(200);
-		gpio_set_value(board->gpio_nrst, 1);
-		gpio_set_value(board->gpio_nsleep, 1);
-		msleep(200);
-	} else {
-		regulator_disable(par->regulator);
-		gpio_set_value(board->gpio_nrst, 0);
-		gpio_set_value(board->gpio_nsleep, 0);
-	}
-
-	return 0;
-}
-
-/*
- * Recovery - powercycle the controller
- */
-
-static void auok190x_recover(struct auok190xfb_par *par)
-{
-	struct device *dev = par->info->device;
-
-	auok190x_power(par, 0);
-	msleep(100);
-	auok190x_power(par, 1);
-
-	/* after powercycling the device, it's always active */
-	pm_runtime_set_active(dev);
-	par->standby = 0;
-
-	par->init(par);
-
-	/* wait for init to complete */
-	par->board->wait_for_rdy(par);
-}
-
-/*
- * Power-management
- */
-static int __maybe_unused auok190x_runtime_suspend(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct fb_info *info = platform_get_drvdata(pdev);
-	struct auok190xfb_par *par = info->par;
-	struct auok190x_board *board = par->board;
-	u16 standby_param;
-
-	/* take and keep the lock until we are resumed, as the controller
-	 * will never reach the non-busy state when in standby mode
-	 */
-	mutex_lock(&(par->io_lock));
-
-	if (par->standby) {
-		dev_warn(dev, "already in standby, runtime-pm pairing mismatch\n");
-		mutex_unlock(&(par->io_lock));
-		return 0;
-	}
-
-	/* according to runtime_pm.txt runtime_suspend only means, that the
-	 * device will not process data and will not communicate with the CPU
-	 * As we hold the lock, this stays true even without standby
-	 */
-	if (board->quirks & AUOK190X_QUIRK_STANDBYBROKEN) {
-		dev_dbg(dev, "runtime suspend without standby\n");
-		goto finish;
-	} else if (board->quirks & AUOK190X_QUIRK_STANDBYPARAM) {
-		/* for some TCON versions STANDBY expects a parameter (0) but
-		 * it seems the real tcon version has to be determined yet.
-		 */
-		dev_dbg(dev, "runtime suspend with additional empty param\n");
-		standby_param = 0;
-		auok190x_send_cmdargs(par, AUOK190X_CMD_STANDBY, 1,
-				      &standby_param);
-	} else {
-		dev_dbg(dev, "runtime suspend without param\n");
-		auok190x_send_command(par, AUOK190X_CMD_STANDBY);
-	}
-
-	msleep(64);
-
-finish:
-	par->standby = 1;
-
-	return 0;
-}
-
-static int __maybe_unused auok190x_runtime_resume(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct fb_info *info = platform_get_drvdata(pdev);
-	struct auok190xfb_par *par = info->par;
-	struct auok190x_board *board = par->board;
-
-	if (!par->standby) {
-		dev_warn(dev, "not in standby, runtime-pm pairing mismatch\n");
-		return 0;
-	}
-
-	if (board->quirks & AUOK190X_QUIRK_STANDBYBROKEN) {
-		dev_dbg(dev, "runtime resume without standby\n");
-	} else {
-		/* when in standby, controller is always busy
-		 * and only accepts the wakeup command
-		 */
-		dev_dbg(dev, "runtime resume from standby\n");
-		auok190x_send_command_nowait(par, AUOK190X_CMD_WAKEUP);
-
-		msleep(160);
-
-		/* wait for the controller to be ready and release the lock */
-		board->wait_for_rdy(par);
-	}
-
-	par->standby = 0;
-
-	mutex_unlock(&(par->io_lock));
-
-	return 0;
-}
-
-static int __maybe_unused auok190x_suspend(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct fb_info *info = platform_get_drvdata(pdev);
-	struct auok190xfb_par *par = info->par;
-	struct auok190x_board *board = par->board;
-	int ret;
-
-	dev_dbg(dev, "suspend\n");
-	if (board->quirks & AUOK190X_QUIRK_STANDBYBROKEN) {
-		/* suspend via powering off the ic */
-		dev_dbg(dev, "suspend with broken standby\n");
-
-		auok190x_power(par, 0);
-	} else {
-		dev_dbg(dev, "suspend using sleep\n");
-
-		/* the sleep state can only be entered from the standby state.
-		 * pm_runtime_get_noresume gets called before the suspend call.
-		 * So the devices usage count is >0 but it is not necessarily
-		 * active.
-		 */
-		if (!pm_runtime_status_suspended(dev)) {
-			ret = auok190x_runtime_suspend(dev);
-			if (ret < 0) {
-				dev_err(dev, "auok190x_runtime_suspend failed with %d\n",
-					ret);
-				return ret;
-			}
-			par->manual_standby = 1;
-		}
-
-		gpio_direction_output(board->gpio_nsleep, 0);
-	}
-
-	msleep(100);
-
-	return 0;
-}
-
-static int __maybe_unused auok190x_resume(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct fb_info *info = platform_get_drvdata(pdev);
-	struct auok190xfb_par *par = info->par;
-	struct auok190x_board *board = par->board;
-
-	dev_dbg(dev, "resume\n");
-	if (board->quirks & AUOK190X_QUIRK_STANDBYBROKEN) {
-		dev_dbg(dev, "resume with broken standby\n");
-
-		auok190x_power(par, 1);
-
-		par->init(par);
-	} else {
-		dev_dbg(dev, "resume from sleep\n");
-
-		/* device should be in runtime suspend when we were suspended
-		 * and pm_runtime_put_sync gets called after this function.
-		 * So there is no need to touch the standby mode here at all.
-		 */
-		gpio_direction_output(board->gpio_nsleep, 1);
-		msleep(100);
-
-		/* an additional init call seems to be necessary after sleep */
-		auok190x_runtime_resume(dev);
-		par->init(par);
-
-		/* if we were runtime-suspended before, suspend again*/
-		if (!par->manual_standby)
-			auok190x_runtime_suspend(dev);
-		else
-			par->manual_standby = 0;
-	}
-
-	return 0;
-}
-
-const struct dev_pm_ops auok190x_pm = {
-	SET_RUNTIME_PM_OPS(auok190x_runtime_suspend, auok190x_runtime_resume,
-			   NULL)
-	SET_SYSTEM_SLEEP_PM_OPS(auok190x_suspend, auok190x_resume)
-};
-EXPORT_SYMBOL_GPL(auok190x_pm);
-
-/*
- * Common probe and remove code
- */
-
-int auok190x_common_probe(struct platform_device *pdev,
-			  struct auok190x_init_data *init)
-{
-	struct auok190x_board *board = init->board;
-	struct auok190xfb_par *par;
-	struct fb_info *info;
-	struct panel_info *panel;
-	int videomemorysize, ret;
-	unsigned char *videomemory;
-
-	/* check board contents */
-	if (!board->init || !board->cleanup || !board->wait_for_rdy
-	    || !board->set_ctl || !board->set_hdb || !board->get_hdb
-	    || !board->setup_irq)
-		return -EINVAL;
-
-	info = framebuffer_alloc(sizeof(struct auok190xfb_par), &pdev->dev);
-	if (!info)
-		return -ENOMEM;
-
-	par = info->par;
-	par->info = info;
-	par->board = board;
-	par->recover = auok190x_recover;
-	par->update_partial = init->update_partial;
-	par->update_all = init->update_all;
-	par->need_refresh = init->need_refresh;
-	par->init = init->init;
-
-	/* init update modes */
-	par->update_cnt = 0;
-	par->update_mode = -1;
-	par->last_mode = -1;
-	par->flash = 0;
-
-	par->regulator = regulator_get(info->device, "vdd");
-	if (IS_ERR(par->regulator)) {
-		ret = PTR_ERR(par->regulator);
-		dev_err(info->device, "Failed to get regulator: %d\n", ret);
-		goto err_reg;
-	}
-
-	ret = board->init(par);
-	if (ret) {
-		dev_err(info->device, "board init failed, %d\n", ret);
-		goto err_board;
-	}
-
-	ret = gpio_request(board->gpio_nsleep, "AUOK190x sleep");
-	if (ret) {
-		dev_err(info->device, "could not request sleep gpio, %d\n",
-			ret);
-		goto err_gpio1;
-	}
-
-	ret = gpio_direction_output(board->gpio_nsleep, 0);
-	if (ret) {
-		dev_err(info->device, "could not set sleep gpio, %d\n", ret);
-		goto err_gpio2;
-	}
-
-	ret = gpio_request(board->gpio_nrst, "AUOK190x reset");
-	if (ret) {
-		dev_err(info->device, "could not request reset gpio, %d\n",
-			ret);
-		goto err_gpio2;
-	}
-
-	ret = gpio_direction_output(board->gpio_nrst, 0);
-	if (ret) {
-		dev_err(info->device, "could not set reset gpio, %d\n", ret);
-		goto err_gpio3;
-	}
-
-	ret = auok190x_power(par, 1);
-	if (ret) {
-		dev_err(info->device, "could not power on the device, %d\n",
-			ret);
-		goto err_gpio3;
-	}
-
-	mutex_init(&par->io_lock);
-
-	init_waitqueue_head(&par->waitq);
-
-	ret = par->board->setup_irq(par->info);
-	if (ret) {
-		dev_err(info->device, "could not setup ready-irq, %d\n", ret);
-		goto err_irq;
-	}
-
-	/* wait for init to complete */
-	par->board->wait_for_rdy(par);
-
-	/*
-	 * From here on the controller can talk to us
-	 */
-
-	/* initialise fix, var, resolution and rotation */
-
-	strlcpy(info->fix.id, init->id, 16);
-	info->var.bits_per_pixel = 8;
-	info->var.grayscale = 1;
-
-	panel = &panel_table[board->resolution];
-
-	par->resolution = board->resolution;
-	par->rotation = 0;
-
-	/* videomemory handling */
-
-	videomemorysize = roundup((panel->w * panel->h) * 2, PAGE_SIZE);
-	videomemory = vzalloc(videomemorysize);
-	if (!videomemory) {
-		ret = -ENOMEM;
-		goto err_irq;
-	}
-
-	info->screen_base = (char *)videomemory;
-	info->fix.smem_len = videomemorysize;
-
-	info->flags = FBINFO_FLAG_DEFAULT | FBINFO_VIRTFB;
-	info->fbops = &auok190xfb_ops;
-
-	ret = auok190xfb_check_var(&info->var, info);
-	if (ret)
-		goto err_defio;
-
-	auok190xfb_set_fix(info);
-
-	/* deferred io init */
-
-	info->fbdefio = devm_kzalloc(info->device,
-				     sizeof(struct fb_deferred_io),
-				     GFP_KERNEL);
-	if (!info->fbdefio) {
-		dev_err(info->device, "Failed to allocate memory\n");
-		ret = -ENOMEM;
-		goto err_defio;
-	}
-
-	dev_dbg(info->device, "targeting %d frames per second\n", board->fps);
-	info->fbdefio->delay = HZ / board->fps;
-	info->fbdefio->first_io = auok190xfb_dpy_first_io,
-	info->fbdefio->deferred_io = auok190xfb_dpy_deferred_io,
-	fb_deferred_io_init(info);
-
-	/* color map */
-
-	ret = fb_alloc_cmap(&info->cmap, 256, 0);
-	if (ret < 0) {
-		dev_err(info->device, "Failed to allocate colormap\n");
-		goto err_cmap;
-	}
-
-	/* controller init */
-
-	par->consecutive_threshold = 100;
-	par->init(par);
-	auok190x_identify(par);
-
-	platform_set_drvdata(pdev, info);
-
-	ret = register_framebuffer(info);
-	if (ret < 0)
-		goto err_regfb;
-
-	ret = sysfs_create_group(&info->device->kobj, &auok190x_attr_group);
-	if (ret)
-		goto err_sysfs;
-
-	dev_info(info->device, "fb%d: %dx%d using %dK of video memory\n",
-		 info->node, info->var.xres, info->var.yres,
-		 videomemorysize >> 10);
-
-	/* increase autosuspend_delay when we use alternative methods
-	 * for runtime_pm
-	 */
-	par->autosuspend_delay = (board->quirks & AUOK190X_QUIRK_STANDBYBROKEN)
-					? 1000 : 200;
-
-	pm_runtime_set_active(info->device);
-	pm_runtime_enable(info->device);
-	pm_runtime_set_autosuspend_delay(info->device, par->autosuspend_delay);
-	pm_runtime_use_autosuspend(info->device);
-
-	return 0;
-
-err_sysfs:
-	unregister_framebuffer(info);
-err_regfb:
-	fb_dealloc_cmap(&info->cmap);
-err_cmap:
-	fb_deferred_io_cleanup(info);
-err_defio:
-	vfree((void *)info->screen_base);
-err_irq:
-	auok190x_power(par, 0);
-err_gpio3:
-	gpio_free(board->gpio_nrst);
-err_gpio2:
-	gpio_free(board->gpio_nsleep);
-err_gpio1:
-	board->cleanup(par);
-err_board:
-	regulator_put(par->regulator);
-err_reg:
-	framebuffer_release(info);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(auok190x_common_probe);
-
-int  auok190x_common_remove(struct platform_device *pdev)
-{
-	struct fb_info *info = platform_get_drvdata(pdev);
-	struct auok190xfb_par *par = info->par;
-	struct auok190x_board *board = par->board;
-
-	pm_runtime_disable(info->device);
-
-	sysfs_remove_group(&info->device->kobj, &auok190x_attr_group);
-
-	unregister_framebuffer(info);
-
-	fb_dealloc_cmap(&info->cmap);
-
-	fb_deferred_io_cleanup(info);
-
-	vfree((void *)info->screen_base);
-
-	auok190x_power(par, 0);
-
-	gpio_free(board->gpio_nrst);
-	gpio_free(board->gpio_nsleep);
-
-	board->cleanup(par);
-
-	regulator_put(par->regulator);
-
-	framebuffer_release(info);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(auok190x_common_remove);
-
-MODULE_DESCRIPTION("Common code for AUO-K190X controllers");
-MODULE_AUTHOR("Heiko Stuebner <heiko@sntech.de>");
-MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/auo_k190x.h b/drivers/video/fbdev/auo_k190x.h
deleted file mode 100644
index e35af1f..0000000
--- a/drivers/video/fbdev/auo_k190x.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/*
- * Private common definitions for AUO-K190X framebuffer drivers
- *
- * Copyright (C) 2012 Heiko Stuebner <heiko@sntech.de>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-/*
- * I80 interface specific defines
- */
-
-#define AUOK190X_I80_CS			0x01
-#define AUOK190X_I80_DC			0x02
-#define AUOK190X_I80_WR			0x03
-#define AUOK190X_I80_OE			0x04
-
-/*
- * AUOK190x commands, common to both controllers
- */
-
-#define AUOK190X_CMD_INIT		0x0000
-#define AUOK190X_CMD_STANDBY		0x0001
-#define AUOK190X_CMD_WAKEUP		0x0002
-#define AUOK190X_CMD_TCON_RESET		0x0003
-#define AUOK190X_CMD_DATA_STOP		0x1002
-#define AUOK190X_CMD_LUT_START		0x1003
-#define AUOK190X_CMD_DISP_REFRESH	0x1004
-#define AUOK190X_CMD_DISP_RESET		0x1005
-#define AUOK190X_CMD_PRE_DISPLAY_START	0x100D
-#define AUOK190X_CMD_PRE_DISPLAY_STOP	0x100F
-#define AUOK190X_CMD_FLASH_W		0x2000
-#define AUOK190X_CMD_FLASH_E		0x2001
-#define AUOK190X_CMD_FLASH_STS		0x2002
-#define AUOK190X_CMD_FRAMERATE		0x3000
-#define AUOK190X_CMD_READ_VERSION	0x4000
-#define AUOK190X_CMD_READ_STATUS	0x4001
-#define AUOK190X_CMD_READ_LUT		0x4003
-#define AUOK190X_CMD_DRIVERTIMING	0x5000
-#define AUOK190X_CMD_LBALANCE		0x5001
-#define AUOK190X_CMD_AGINGMODE		0x6000
-#define AUOK190X_CMD_AGINGEXIT		0x6001
-
-/*
- * Common settings for AUOK190X_CMD_INIT
- */
-
-#define AUOK190X_INIT_DATA_FILTER	(0 << 12)
-#define AUOK190X_INIT_DATA_BYPASS	(1 << 12)
-#define AUOK190X_INIT_INVERSE_WHITE	(0 << 9)
-#define AUOK190X_INIT_INVERSE_BLACK	(1 << 9)
-#define AUOK190X_INIT_SCAN_DOWN		(0 << 1)
-#define AUOK190X_INIT_SCAN_UP		(1 << 1)
-#define AUOK190X_INIT_SHIFT_LEFT	(0 << 0)
-#define AUOK190X_INIT_SHIFT_RIGHT	(1 << 0)
-
-/* Common bits to pixels
- *   Mode	15-12	11-8	7-4	3-0
- *   format0	4	3	2	1
- *   format1	3	4	1	2
- */
-
-#define AUOK190X_INIT_FORMAT0		0
-#define AUOK190X_INIT_FORMAT1		(1 << 6)
-
-/*
- * settings for AUOK190X_CMD_RESET
- */
-
-#define AUOK190X_RESET_TCON		(0 << 0)
-#define AUOK190X_RESET_NORMAL		(1 << 0)
-#define AUOK190X_RESET_PON		(1 << 1)
-
-/*
- * AUOK190X_CMD_VERSION
- */
-
-#define AUOK190X_VERSION_TEMP_MASK		(0x1ff)
-#define AUOK190X_VERSION_EPD_MASK		(0xff)
-#define AUOK190X_VERSION_SIZE_INT(_val)		((_val & 0xfc00) >> 10)
-#define AUOK190X_VERSION_SIZE_FLOAT(_val)	((_val & 0x3c0) >> 6)
-#define AUOK190X_VERSION_MODEL(_val)		(_val & 0x3f)
-#define AUOK190X_VERSION_LUT(_val)		(_val & 0xff)
-#define AUOK190X_VERSION_TCON(_val)		((_val & 0xff00) >> 8)
-
-/*
- * update modes for CMD_PARTIALDISP on K1900 and CMD_DDMA on K1901
- */
-
-#define AUOK190X_UPDATE_MODE(_res)		((_res & 0x7) << 12)
-#define AUOK190X_UPDATE_NONFLASH		(1 << 15)
-
-/*
- * track panel specific parameters for common init
- */
-
-struct auok190x_init_data {
-	char *id;
-	struct auok190x_board *board;
-
-	void (*update_partial)(struct auok190xfb_par *par, u16 y1, u16 y2);
-	void (*update_all)(struct auok190xfb_par *par);
-	bool (*need_refresh)(struct auok190xfb_par *par);
-	void (*init)(struct auok190xfb_par *par);
-};
-
-
-extern void auok190x_send_command_nowait(struct auok190xfb_par *par, u16 data);
-extern int auok190x_send_command(struct auok190xfb_par *par, u16 data);
-extern void auok190x_send_cmdargs_nowait(struct auok190xfb_par *par, u16 cmd,
-					 int argc, u16 *argv);
-extern int auok190x_send_cmdargs(struct auok190xfb_par *par, u16 cmd,
-				  int argc, u16 *argv);
-extern void auok190x_send_cmdargs_pixels_nowait(struct auok190xfb_par *par,
-						u16 cmd, int argc, u16 *argv,
-						int size, u16 *data);
-extern int auok190x_send_cmdargs_pixels(struct auok190xfb_par *par, u16 cmd,
-					int argc, u16 *argv, int size,
-					u16 *data);
-extern int auok190x_read_cmdargs(struct auok190xfb_par *par, u16 cmd,
-				  int argc, u16 *argv);
-
-extern int auok190x_common_probe(struct platform_device *pdev,
-				 struct auok190x_init_data *init);
-extern int auok190x_common_remove(struct platform_device *pdev);
-
-extern const struct dev_pm_ops auok190x_pm;
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index 487d5e3..82c20c6 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -37,7 +37,7 @@
 }
 
 /* this is to find and return the vmalloc-ed fb pages */
-static int fb_deferred_io_fault(struct vm_fault *vmf)
+static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
 {
 	unsigned long offset;
 	struct page *page;
@@ -90,7 +90,7 @@
 EXPORT_SYMBOL_GPL(fb_deferred_io_fsync);
 
 /* vm_ops->page_mkwrite handler */
-static int fb_deferred_io_mkwrite(struct vm_fault *vmf)
+static vm_fault_t fb_deferred_io_mkwrite(struct vm_fault *vmf)
 {
 	struct page *page = vmf->page;
 	struct fb_info *info = vmf->vma->vm_private_data;
diff --git a/drivers/video/fbdev/mmp/fb/mmpfb.c b/drivers/video/fbdev/mmp/fb/mmpfb.c
index f27697e..ee212be 100644
--- a/drivers/video/fbdev/mmp/fb/mmpfb.c
+++ b/drivers/video/fbdev/mmp/fb/mmpfb.c
@@ -495,10 +495,9 @@
 	/* put videomode list to info structure */
 	videomodes = kcalloc(videomode_num, sizeof(struct fb_videomode),
 			     GFP_KERNEL);
-	if (!videomodes) {
-		dev_err(fbi->dev, "can't malloc video modes\n");
+	if (!videomodes)
 		return -ENOMEM;
-	}
+
 	for (i = 0; i < videomode_num; i++)
 		mmpmode_to_fbmode(&videomodes[i], &mmp_modes[i]);
 	fb_videomode_to_modelist(videomodes, videomode_num, &info->modelist);
diff --git a/drivers/video/fbdev/mmp/hw/mmp_ctrl.c b/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
index b6f83d5..fcdbb2d 100644
--- a/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
+++ b/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
@@ -406,12 +406,10 @@
 	dev_info(ctrl->dev, "%s: %s\n", __func__, config->name);
 
 	/* init driver data */
-	path_info = kzalloc(sizeof(struct mmp_path_info), GFP_KERNEL);
-	if (!path_info) {
-		dev_err(ctrl->dev, "%s: unable to alloc path_info for %s\n",
-				__func__, config->name);
+	path_info = kzalloc(sizeof(*path_info), GFP_KERNEL);
+	if (!path_info)
 		return 0;
-	}
+
 	path_info->name = config->name;
 	path_info->id = path_plat->id;
 	path_info->dev = ctrl->dev;
diff --git a/drivers/video/fbdev/nvidia/nvidia.c b/drivers/video/fbdev/nvidia/nvidia.c
index 2e50120..fbeeed5 100644
--- a/drivers/video/fbdev/nvidia/nvidia.c
+++ b/drivers/video/fbdev/nvidia/nvidia.c
@@ -1548,7 +1548,7 @@
 		 "(default=0)");
 module_param(noscale, int, 0);
 MODULE_PARM_DESC(noscale,
-		 "Disables screen scaleing. (0 or 1=disable) "
+		 "Disables screen scaling. (0 or 1=disable) "
 		 "(default=0, do scaling)");
 module_param(paneltweak, int, 0);
 MODULE_PARM_DESC(paneltweak,
diff --git a/drivers/video/fbdev/omap/lcd_ams_delta.c b/drivers/video/fbdev/omap/lcd_ams_delta.c
index a4ee947..e8c748a 100644
--- a/drivers/video/fbdev/omap/lcd_ams_delta.c
+++ b/drivers/video/fbdev/omap/lcd_ams_delta.c
@@ -197,3 +197,7 @@
 };
 
 module_platform_driver(ams_delta_panel_driver);
+
+MODULE_AUTHOR("Jonathan McDowell <noodles@earth.li>");
+MODULE_DESCRIPTION("LCD panel support for the Amstrad E3 (Delta) videophone");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_h3.c b/drivers/video/fbdev/omap/lcd_h3.c
index 796f463..fd0ac99 100644
--- a/drivers/video/fbdev/omap/lcd_h3.c
+++ b/drivers/video/fbdev/omap/lcd_h3.c
@@ -89,3 +89,7 @@
 };
 
 module_platform_driver(h3_panel_driver);
+
+MODULE_AUTHOR("Imre Deak");
+MODULE_DESCRIPTION("LCD panel support for the TI OMAP H3 board");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_htcherald.c b/drivers/video/fbdev/omap/lcd_htcherald.c
index 9d692f5b..db4ff1c 100644
--- a/drivers/video/fbdev/omap/lcd_htcherald.c
+++ b/drivers/video/fbdev/omap/lcd_htcherald.c
@@ -66,3 +66,7 @@
 };
 
 module_platform_driver(htcherald_panel_driver);
+
+MODULE_AUTHOR("Cory Maccarrone");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("LCD panel support for the HTC Herald");
diff --git a/drivers/video/fbdev/omap/lcd_inn1510.c b/drivers/video/fbdev/omap/lcd_inn1510.c
index b284050..1ea775f 100644
--- a/drivers/video/fbdev/omap/lcd_inn1510.c
+++ b/drivers/video/fbdev/omap/lcd_inn1510.c
@@ -73,3 +73,7 @@
 };
 
 module_platform_driver(innovator1510_panel_driver);
+
+MODULE_AUTHOR("Imre Deak");
+MODULE_DESCRIPTION("LCD panel support for the TI OMAP1510 Innovator board");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_inn1610.c b/drivers/video/fbdev/omap/lcd_inn1610.c
index 1841710..8d0cf68 100644
--- a/drivers/video/fbdev/omap/lcd_inn1610.c
+++ b/drivers/video/fbdev/omap/lcd_inn1610.c
@@ -106,3 +106,7 @@
 };
 
 module_platform_driver(innovator1610_panel_driver);
+
+MODULE_AUTHOR("Imre Deak");
+MODULE_DESCRIPTION("LCD panel support for the TI OMAP1610 Innovator board");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_osk.c b/drivers/video/fbdev/omap/lcd_osk.c
index b0be577..9fc43a1 100644
--- a/drivers/video/fbdev/omap/lcd_osk.c
+++ b/drivers/video/fbdev/omap/lcd_osk.c
@@ -93,3 +93,7 @@
 };
 
 module_platform_driver(osk_panel_driver);
+
+MODULE_AUTHOR("Imre Deak");
+MODULE_DESCRIPTION("LCD panel support for the TI OMAP OSK board");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_palmte.c b/drivers/video/fbdev/omap/lcd_palmte.c
index cef9638..a0e8886 100644
--- a/drivers/video/fbdev/omap/lcd_palmte.c
+++ b/drivers/video/fbdev/omap/lcd_palmte.c
@@ -59,3 +59,7 @@
 };
 
 module_platform_driver(palmte_panel_driver);
+
+MODULE_AUTHOR("Romain Goyet <r.goyet@gmail.com>, Laurent Gonzalez <palmte.linux@free.fr>");
+MODULE_DESCRIPTION("LCD panel support for the Palm Tungsten E");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_palmtt.c b/drivers/video/fbdev/omap/lcd_palmtt.c
index 627f13d..2c45375 100644
--- a/drivers/video/fbdev/omap/lcd_palmtt.c
+++ b/drivers/video/fbdev/omap/lcd_palmtt.c
@@ -72,3 +72,7 @@
 };
 
 module_platform_driver(palmtt_panel_driver);
+
+MODULE_AUTHOR("Marek Vasut <marek.vasut@gmail.com>");
+MODULE_DESCRIPTION("LCD panel support for Palm Tungsten|T");
+MODULE_LICENSE("GPL");
diff --git a/drivers/video/fbdev/omap/lcd_palmz71.c b/drivers/video/fbdev/omap/lcd_palmz71.c
index c46d4db..c99a15a 100644
--- a/drivers/video/fbdev/omap/lcd_palmz71.c
+++ b/drivers/video/fbdev/omap/lcd_palmz71.c
@@ -66,3 +66,7 @@
 };
 
 module_platform_driver(palmz71_panel_driver);
+
+MODULE_AUTHOR("Romain Goyet, Laurent Gonzalez, Marek Vasut");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("LCD panel support for the Palm Zire71");
diff --git a/drivers/video/fbdev/omap/omapfb_main.c b/drivers/video/fbdev/omap/omapfb_main.c
index 3479a47..585f39e 100644
--- a/drivers/video/fbdev/omap/omapfb_main.c
+++ b/drivers/video/fbdev/omap/omapfb_main.c
@@ -1645,7 +1645,7 @@
 		goto cleanup;
 	}
 
-	fbdev = kzalloc(sizeof(struct omapfb_device), GFP_KERNEL);
+	fbdev = kzalloc(sizeof(*fbdev), GFP_KERNEL);
 	if (fbdev == NULL) {
 		dev_err(&pdev->dev,
 			"unable to allocate memory for device info\n");
diff --git a/drivers/video/fbdev/omap2/omapfb/Kconfig b/drivers/video/fbdev/omap2/omapfb/Kconfig
index e6226ae..3bf154e 100644
--- a/drivers/video/fbdev/omap2/omapfb/Kconfig
+++ b/drivers/video/fbdev/omap2/omapfb/Kconfig
@@ -5,6 +5,7 @@
         tristate "OMAP2+ frame buffer support"
         depends on FB
         depends on DRM_OMAP = n
+	depends on GPIOLIB
 
         select FB_OMAP2_DSS
 	select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3
diff --git a/drivers/video/fbdev/omap2/omapfb/displays/panel-dsi-cm.c b/drivers/video/fbdev/omap2/omapfb/displays/panel-dsi-cm.c
index bef4315..87497a0 100644
--- a/drivers/video/fbdev/omap2/omapfb/displays/panel-dsi-cm.c
+++ b/drivers/video/fbdev/omap2/omapfb/displays/panel-dsi-cm.c
@@ -387,8 +387,7 @@
 static ssize_t dsicm_num_errors_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	struct omap_dss_device *in = ddata->in;
 	u8 errors = 0;
 	int r;
@@ -419,8 +418,7 @@
 static ssize_t dsicm_hw_revision_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	struct omap_dss_device *in = ddata->in;
 	u8 id1, id2, id3;
 	int r;
@@ -451,8 +449,7 @@
 		struct device_attribute *attr,
 		const char *buf, size_t count)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	struct omap_dss_device *in = ddata->in;
 	unsigned long t;
 	int r;
@@ -486,8 +483,7 @@
 		struct device_attribute *attr,
 		char *buf)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	unsigned t;
 
 	mutex_lock(&ddata->lock);
@@ -501,8 +497,7 @@
 		struct device_attribute *attr,
 		const char *buf, size_t count)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	struct omap_dss_device *in = ddata->in;
 	unsigned long t;
 	int r;
@@ -533,8 +528,7 @@
 		struct device_attribute *attr,
 		char *buf)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
+	struct panel_drv_data *ddata = dev_get_drvdata(dev);
 	unsigned t;
 
 	mutex_lock(&ddata->lock);
diff --git a/drivers/video/fbdev/pxafb.c b/drivers/video/fbdev/pxafb.c
index c3d49e1..76722a5 100644
--- a/drivers/video/fbdev/pxafb.c
+++ b/drivers/video/fbdev/pxafb.c
@@ -2115,12 +2115,10 @@
 	if (ret)
 		s = "color-tft";
 
-	for (i = 0; lcd_types[i]; i++)
-		if (!strcmp(s, lcd_types[i]))
-			break;
-	if (!i || !lcd_types[i]) {
+	i = match_string(lcd_types, -1, s);
+	if (i < 0) {
 		dev_err(dev, "lcd-type %s is unknown\n", s);
-		return -EINVAL;
+		return i;
 	}
 	info->lcd_conn |= LCD_CONN_TYPE(i);
 	info->lcd_conn |= LCD_CONN_WIDTH(bus_width);
diff --git a/drivers/video/fbdev/savage/savagefb_driver.c b/drivers/video/fbdev/savage/savagefb_driver.c
index c204683..c09d742 100644
--- a/drivers/video/fbdev/savage/savagefb_driver.c
+++ b/drivers/video/fbdev/savage/savagefb_driver.c
@@ -1892,11 +1892,11 @@
 	vga_out8(0x3d4, 0x66, par);
 	cr66 = vga_in8(0x3d5, par);
 	vga_out8(0x3d5, cr66 | 0x02, par);
-	mdelay(10);
+	usleep_range(10000, 11000);
 
 	vga_out8(0x3d4, 0x66, par);
 	vga_out8(0x3d5, cr66 & ~0x02, par);	/* clear reset flag */
-	mdelay(10);
+	usleep_range(10000, 11000);
 
 
 	/*
@@ -1906,11 +1906,11 @@
 	vga_out8(0x3d4, 0x3f, par);
 	cr3f = vga_in8(0x3d5, par);
 	vga_out8(0x3d5, cr3f | 0x08, par);
-	mdelay(10);
+	usleep_range(10000, 11000);
 
 	vga_out8(0x3d4, 0x3f, par);
 	vga_out8(0x3d5, cr3f & ~0x08, par);	/* clear reset flags */
-	mdelay(10);
+	usleep_range(10000, 11000);
 
 	/* Savage ramdac speeds */
 	par->numClocks = 4;
diff --git a/drivers/video/fbdev/sh_mobile_lcdcfb.c b/drivers/video/fbdev/sh_mobile_lcdcfb.c
index c3a4650..dc46be3 100644
--- a/drivers/video/fbdev/sh_mobile_lcdcfb.c
+++ b/drivers/video/fbdev/sh_mobile_lcdcfb.c
@@ -29,7 +29,6 @@
 #include <linux/vmalloc.h>
 
 #include <video/sh_mobile_lcdc.h>
-#include <video/sh_mobile_meram.h>
 
 #include "sh_mobile_lcdcfb.h"
 
@@ -217,7 +216,6 @@
 	struct notifier_block notifier;
 	int started;
 	int forced_fourcc; /* 2 channel LCDC must share fourcc setting */
-	struct sh_mobile_meram_info *meram_dev;
 };
 
 /* -----------------------------------------------------------------------------
@@ -346,16 +344,12 @@
 		if (priv->dot_clk)
 			clk_prepare_enable(priv->dot_clk);
 		pm_runtime_get_sync(priv->dev);
-		if (priv->meram_dev && priv->meram_dev->pdev)
-			pm_runtime_get_sync(&priv->meram_dev->pdev->dev);
 	}
 }
 
 static void sh_mobile_lcdc_clk_off(struct sh_mobile_lcdc_priv *priv)
 {
 	if (atomic_sub_return(1, &priv->hw_usecnt) == -1) {
-		if (priv->meram_dev && priv->meram_dev->pdev)
-			pm_runtime_put_sync(&priv->meram_dev->pdev->dev);
 		pm_runtime_put(priv->dev);
 		if (priv->dot_clk)
 			clk_disable_unprepare(priv->dot_clk);
@@ -1073,7 +1067,6 @@
 
 static int sh_mobile_lcdc_start(struct sh_mobile_lcdc_priv *priv)
 {
-	struct sh_mobile_meram_info *mdev = priv->meram_dev;
 	struct sh_mobile_lcdc_chan *ch;
 	unsigned long tmp;
 	int ret;
@@ -1106,9 +1099,6 @@
 
 	/* Compute frame buffer base address and pitch for each channel. */
 	for (k = 0; k < ARRAY_SIZE(priv->ch); k++) {
-		int pixelformat;
-		void *cache;
-
 		ch = &priv->ch[k];
 		if (!ch->enabled)
 			continue;
@@ -1117,45 +1107,6 @@
 		ch->base_addr_c = ch->dma_handle
 				+ ch->xres_virtual * ch->yres_virtual;
 		ch->line_size = ch->pitch;
-
-		/* Enable MERAM if possible. */
-		if (mdev == NULL || ch->cfg->meram_cfg == NULL)
-			continue;
-
-		/* Free the allocated MERAM cache. */
-		if (ch->cache) {
-			sh_mobile_meram_cache_free(mdev, ch->cache);
-			ch->cache = NULL;
-		}
-
-		switch (ch->format->fourcc) {
-		case V4L2_PIX_FMT_NV12:
-		case V4L2_PIX_FMT_NV21:
-		case V4L2_PIX_FMT_NV16:
-		case V4L2_PIX_FMT_NV61:
-			pixelformat = SH_MOBILE_MERAM_PF_NV;
-			break;
-		case V4L2_PIX_FMT_NV24:
-		case V4L2_PIX_FMT_NV42:
-			pixelformat = SH_MOBILE_MERAM_PF_NV24;
-			break;
-		case V4L2_PIX_FMT_RGB565:
-		case V4L2_PIX_FMT_BGR24:
-		case V4L2_PIX_FMT_BGR32:
-		default:
-			pixelformat = SH_MOBILE_MERAM_PF_RGB;
-			break;
-		}
-
-		cache = sh_mobile_meram_cache_alloc(mdev, ch->cfg->meram_cfg,
-					ch->pitch, ch->yres, pixelformat,
-					&ch->line_size);
-		if (!IS_ERR(cache)) {
-			sh_mobile_meram_cache_update(mdev, cache,
-					ch->base_addr_y, ch->base_addr_c,
-					&ch->base_addr_y, &ch->base_addr_c);
-			ch->cache = cache;
-		}
 	}
 
 	for (k = 0; k < ARRAY_SIZE(priv->overlays); ++k) {
@@ -1223,13 +1174,6 @@
 		}
 
 		sh_mobile_lcdc_display_off(ch);
-
-		/* Free the MERAM cache. */
-		if (ch->cache) {
-			sh_mobile_meram_cache_free(priv->meram_dev, ch->cache);
-			ch->cache = NULL;
-		}
-
 	}
 
 	/* stop the lcdc */
@@ -1851,11 +1795,6 @@
 	base_addr_c = ch->dma_handle + ch->xres_virtual * ch->yres_virtual
 		    + c_offset;
 
-	if (ch->cache)
-		sh_mobile_meram_cache_update(priv->meram_dev, ch->cache,
-					     base_addr_y, base_addr_c,
-					     &base_addr_y, &base_addr_c);
-
 	ch->base_addr_y = base_addr_y;
 	ch->base_addr_c = base_addr_c;
 	ch->pan_y_offset = y_offset;
@@ -2149,10 +2088,8 @@
 	if (info->fbdefio) {
 		ch->sglist = vmalloc(sizeof(struct scatterlist) *
 				     ch->fb_size >> PAGE_SHIFT);
-		if (!ch->sglist) {
-			dev_err(ch->lcdc->dev, "cannot allocate sglist\n");
+		if (!ch->sglist)
 			return -ENOMEM;
-		}
 	}
 
 	info->bl_dev = ch->bl;
@@ -2354,8 +2291,7 @@
 
 static int sh_mobile_lcdc_runtime_suspend(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct sh_mobile_lcdc_priv *priv = platform_get_drvdata(pdev);
+	struct sh_mobile_lcdc_priv *priv = dev_get_drvdata(dev);
 
 	/* turn off LCDC hardware */
 	lcdc_write(priv, _LDCNT1R, 0);
@@ -2365,8 +2301,7 @@
 
 static int sh_mobile_lcdc_runtime_resume(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct sh_mobile_lcdc_priv *priv = platform_get_drvdata(pdev);
+	struct sh_mobile_lcdc_priv *priv = dev_get_drvdata(dev);
 
 	__sh_mobile_lcdc_start(priv);
 
@@ -2718,13 +2653,11 @@
 	}
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv) {
-		dev_err(&pdev->dev, "cannot allocate device data\n");
+	if (!priv)
 		return -ENOMEM;
-	}
 
 	priv->dev = &pdev->dev;
-	priv->meram_dev = pdata->meram_dev;
+
 	for (i = 0; i < ARRAY_SIZE(priv->ch); i++)
 		mutex_init(&priv->ch[i].open_lock);
 	platform_set_drvdata(pdev, priv);
diff --git a/drivers/video/fbdev/sh_mobile_lcdcfb.h b/drivers/video/fbdev/sh_mobile_lcdcfb.h
index cc52c74..b8e47a8 100644
--- a/drivers/video/fbdev/sh_mobile_lcdcfb.h
+++ b/drivers/video/fbdev/sh_mobile_lcdcfb.h
@@ -61,7 +61,6 @@
 	unsigned long *reg_offs;
 	unsigned long ldmt1r_value;
 	unsigned long enabled; /* ME and SE in LDCNT2R */
-	void *cache;
 
 	struct mutex open_lock;		/* protects the use counter */
 	int use_count;
diff --git a/drivers/video/fbdev/sh_mobile_meram.c b/drivers/video/fbdev/sh_mobile_meram.c
deleted file mode 100644
index baadfb2..0000000
--- a/drivers/video/fbdev/sh_mobile_meram.c
+++ /dev/null
@@ -1,758 +0,0 @@
-/*
- * SuperH Mobile MERAM Driver for SuperH Mobile LCDC Driver
- *
- * Copyright (c) 2011	Damian Hobson-Garcia <dhobsong@igel.co.jp>
- *                      Takanari Hayama <taki@igel.co.jp>
- *
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- */
-
-#include <linux/device.h>
-#include <linux/err.h>
-#include <linux/export.h>
-#include <linux/genalloc.h>
-#include <linux/io.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/platform_device.h>
-#include <linux/pm_runtime.h>
-#include <linux/slab.h>
-
-#include <video/sh_mobile_meram.h>
-
-/* -----------------------------------------------------------------------------
- * MERAM registers
- */
-
-#define MEVCR1			0x4
-#define MEVCR1_RST		(1 << 31)
-#define MEVCR1_WD		(1 << 30)
-#define MEVCR1_AMD1		(1 << 29)
-#define MEVCR1_AMD0		(1 << 28)
-#define MEQSEL1			0x40
-#define MEQSEL2			0x44
-
-#define MExxCTL			0x400
-#define MExxCTL_BV		(1 << 31)
-#define MExxCTL_BSZ_SHIFT	28
-#define MExxCTL_MSAR_MASK	(0x7ff << MExxCTL_MSAR_SHIFT)
-#define MExxCTL_MSAR_SHIFT	16
-#define MExxCTL_NXT_MASK	(0x1f << MExxCTL_NXT_SHIFT)
-#define MExxCTL_NXT_SHIFT	11
-#define MExxCTL_WD1		(1 << 10)
-#define MExxCTL_WD0		(1 << 9)
-#define MExxCTL_WS		(1 << 8)
-#define MExxCTL_CB		(1 << 7)
-#define MExxCTL_WBF		(1 << 6)
-#define MExxCTL_WF		(1 << 5)
-#define MExxCTL_RF		(1 << 4)
-#define MExxCTL_CM		(1 << 3)
-#define MExxCTL_MD_READ		(1 << 0)
-#define MExxCTL_MD_WRITE	(2 << 0)
-#define MExxCTL_MD_ICB_WB	(3 << 0)
-#define MExxCTL_MD_ICB		(4 << 0)
-#define MExxCTL_MD_FB		(7 << 0)
-#define MExxCTL_MD_MASK		(7 << 0)
-#define MExxBSIZE		0x404
-#define MExxBSIZE_RCNT_SHIFT	28
-#define MExxBSIZE_YSZM1_SHIFT	16
-#define MExxBSIZE_XSZM1_SHIFT	0
-#define MExxMNCF		0x408
-#define MExxMNCF_KWBNM_SHIFT	28
-#define MExxMNCF_KRBNM_SHIFT	24
-#define MExxMNCF_BNM_SHIFT	16
-#define MExxMNCF_XBV		(1 << 15)
-#define MExxMNCF_CPL_YCBCR444	(1 << 12)
-#define MExxMNCF_CPL_YCBCR420	(2 << 12)
-#define MExxMNCF_CPL_YCBCR422	(3 << 12)
-#define MExxMNCF_CPL_MSK	(3 << 12)
-#define MExxMNCF_BL		(1 << 2)
-#define MExxMNCF_LNM_SHIFT	0
-#define MExxSARA		0x410
-#define MExxSARB		0x414
-#define MExxSBSIZE		0x418
-#define MExxSBSIZE_HDV		(1 << 31)
-#define MExxSBSIZE_HSZ16	(0 << 28)
-#define MExxSBSIZE_HSZ32	(1 << 28)
-#define MExxSBSIZE_HSZ64	(2 << 28)
-#define MExxSBSIZE_HSZ128	(3 << 28)
-#define MExxSBSIZE_SBSIZZ_SHIFT	0
-
-#define MERAM_MExxCTL_VAL(next, addr)	\
-	((((next) << MExxCTL_NXT_SHIFT) & MExxCTL_NXT_MASK) | \
-	 (((addr) << MExxCTL_MSAR_SHIFT) & MExxCTL_MSAR_MASK))
-#define	MERAM_MExxBSIZE_VAL(rcnt, yszm1, xszm1) \
-	(((rcnt) << MExxBSIZE_RCNT_SHIFT) | \
-	 ((yszm1) << MExxBSIZE_YSZM1_SHIFT) | \
-	 ((xszm1) << MExxBSIZE_XSZM1_SHIFT))
-
-static const unsigned long common_regs[] = {
-	MEVCR1,
-	MEQSEL1,
-	MEQSEL2,
-};
-#define MERAM_REGS_SIZE ARRAY_SIZE(common_regs)
-
-static const unsigned long icb_regs[] = {
-	MExxCTL,
-	MExxBSIZE,
-	MExxMNCF,
-	MExxSARA,
-	MExxSARB,
-	MExxSBSIZE,
-};
-#define ICB_REGS_SIZE ARRAY_SIZE(icb_regs)
-
-/*
- * sh_mobile_meram_icb - MERAM ICB information
- * @regs: Registers cache
- * @index: ICB index
- * @offset: MERAM block offset
- * @size: MERAM block size in KiB
- * @cache_unit: Bytes to cache per ICB
- * @pixelformat: Video pixel format of the data stored in the ICB
- * @current_reg: Which of Start Address Register A (0) or B (1) is in use
- */
-struct sh_mobile_meram_icb {
-	unsigned long regs[ICB_REGS_SIZE];
-	unsigned int index;
-	unsigned long offset;
-	unsigned int size;
-
-	unsigned int cache_unit;
-	unsigned int pixelformat;
-	unsigned int current_reg;
-};
-
-#define MERAM_ICB_NUM			32
-
-struct sh_mobile_meram_fb_plane {
-	struct sh_mobile_meram_icb *marker;
-	struct sh_mobile_meram_icb *cache;
-};
-
-struct sh_mobile_meram_fb_cache {
-	unsigned int nplanes;
-	struct sh_mobile_meram_fb_plane planes[2];
-};
-
-/*
- * sh_mobile_meram_priv - MERAM device
- * @base: Registers base address
- * @meram: MERAM physical address
- * @regs: Registers cache
- * @lock: Protects used_icb and icbs
- * @used_icb: Bitmask of used ICBs
- * @icbs: ICBs
- * @pool: Allocation pool to manage the MERAM
- */
-struct sh_mobile_meram_priv {
-	void __iomem *base;
-	unsigned long meram;
-	unsigned long regs[MERAM_REGS_SIZE];
-
-	struct mutex lock;
-	unsigned long used_icb;
-	struct sh_mobile_meram_icb icbs[MERAM_ICB_NUM];
-
-	struct gen_pool *pool;
-};
-
-/* settings */
-#define MERAM_GRANULARITY		1024
-#define MERAM_SEC_LINE			15
-#define MERAM_LINE_WIDTH		2048
-
-/* -----------------------------------------------------------------------------
- * Registers access
- */
-
-#define MERAM_ICB_OFFSET(base, idx, off)	((base) + (off) + (idx) * 0x20)
-
-static inline void meram_write_icb(void __iomem *base, unsigned int idx,
-				   unsigned int off, unsigned long val)
-{
-	iowrite32(val, MERAM_ICB_OFFSET(base, idx, off));
-}
-
-static inline unsigned long meram_read_icb(void __iomem *base, unsigned int idx,
-					   unsigned int off)
-{
-	return ioread32(MERAM_ICB_OFFSET(base, idx, off));
-}
-
-static inline void meram_write_reg(void __iomem *base, unsigned int off,
-				   unsigned long val)
-{
-	iowrite32(val, base + off);
-}
-
-static inline unsigned long meram_read_reg(void __iomem *base, unsigned int off)
-{
-	return ioread32(base + off);
-}
-
-/* -----------------------------------------------------------------------------
- * MERAM allocation and free
- */
-
-static unsigned long meram_alloc(struct sh_mobile_meram_priv *priv, size_t size)
-{
-	return gen_pool_alloc(priv->pool, size);
-}
-
-static void meram_free(struct sh_mobile_meram_priv *priv, unsigned long mem,
-		       size_t size)
-{
-	gen_pool_free(priv->pool, mem, size);
-}
-
-/* -----------------------------------------------------------------------------
- * LCDC cache planes allocation, init, cleanup and free
- */
-
-/* Allocate ICBs and MERAM for a plane. */
-static int meram_plane_alloc(struct sh_mobile_meram_priv *priv,
-			     struct sh_mobile_meram_fb_plane *plane,
-			     size_t size)
-{
-	unsigned long mem;
-	unsigned long idx;
-
-	idx = find_first_zero_bit(&priv->used_icb, 28);
-	if (idx == 28)
-		return -ENOMEM;
-	plane->cache = &priv->icbs[idx];
-
-	idx = find_next_zero_bit(&priv->used_icb, 32, 28);
-	if (idx == 32)
-		return -ENOMEM;
-	plane->marker = &priv->icbs[idx];
-
-	mem = meram_alloc(priv, size * 1024);
-	if (mem == 0)
-		return -ENOMEM;
-
-	__set_bit(plane->marker->index, &priv->used_icb);
-	__set_bit(plane->cache->index, &priv->used_icb);
-
-	plane->marker->offset = mem - priv->meram;
-	plane->marker->size = size;
-
-	return 0;
-}
-
-/* Free ICBs and MERAM for a plane. */
-static void meram_plane_free(struct sh_mobile_meram_priv *priv,
-			     struct sh_mobile_meram_fb_plane *plane)
-{
-	meram_free(priv, priv->meram + plane->marker->offset,
-		   plane->marker->size * 1024);
-
-	__clear_bit(plane->marker->index, &priv->used_icb);
-	__clear_bit(plane->cache->index, &priv->used_icb);
-}
-
-/* Is this a YCbCr(NV12, NV16 or NV24) colorspace? */
-static int is_nvcolor(int cspace)
-{
-	if (cspace == SH_MOBILE_MERAM_PF_NV ||
-	    cspace == SH_MOBILE_MERAM_PF_NV24)
-		return 1;
-	return 0;
-}
-
-/* Set the next address to fetch. */
-static void meram_set_next_addr(struct sh_mobile_meram_priv *priv,
-				struct sh_mobile_meram_fb_cache *cache,
-				unsigned long base_addr_y,
-				unsigned long base_addr_c)
-{
-	struct sh_mobile_meram_icb *icb = cache->planes[0].marker;
-	unsigned long target;
-
-	icb->current_reg ^= 1;
-	target = icb->current_reg ? MExxSARB : MExxSARA;
-
-	/* set the next address to fetch */
-	meram_write_icb(priv->base, cache->planes[0].cache->index, target,
-			base_addr_y);
-	meram_write_icb(priv->base, cache->planes[0].marker->index, target,
-			base_addr_y + cache->planes[0].marker->cache_unit);
-
-	if (cache->nplanes == 2) {
-		meram_write_icb(priv->base, cache->planes[1].cache->index,
-				target, base_addr_c);
-		meram_write_icb(priv->base, cache->planes[1].marker->index,
-				target, base_addr_c +
-				cache->planes[1].marker->cache_unit);
-	}
-}
-
-/* Get the next ICB address. */
-static void
-meram_get_next_icb_addr(struct sh_mobile_meram_info *pdata,
-			struct sh_mobile_meram_fb_cache *cache,
-			unsigned long *icb_addr_y, unsigned long *icb_addr_c)
-{
-	struct sh_mobile_meram_icb *icb = cache->planes[0].marker;
-	unsigned long icb_offset;
-
-	if (pdata->addr_mode == SH_MOBILE_MERAM_MODE0)
-		icb_offset = 0x80000000 | (icb->current_reg << 29);
-	else
-		icb_offset = 0xc0000000 | (icb->current_reg << 23);
-
-	*icb_addr_y = icb_offset | (cache->planes[0].marker->index << 24);
-	if (cache->nplanes == 2)
-		*icb_addr_c = icb_offset
-			    | (cache->planes[1].marker->index << 24);
-}
-
-#define MERAM_CALC_BYTECOUNT(x, y) \
-	(((x) * (y) + (MERAM_LINE_WIDTH - 1)) & ~(MERAM_LINE_WIDTH - 1))
-
-/* Initialize MERAM. */
-static int meram_plane_init(struct sh_mobile_meram_priv *priv,
-			    struct sh_mobile_meram_fb_plane *plane,
-			    unsigned int xres, unsigned int yres,
-			    unsigned int *out_pitch)
-{
-	struct sh_mobile_meram_icb *marker = plane->marker;
-	unsigned long total_byte_count = MERAM_CALC_BYTECOUNT(xres, yres);
-	unsigned long bnm;
-	unsigned int lcdc_pitch;
-	unsigned int xpitch;
-	unsigned int line_cnt;
-	unsigned int save_lines;
-
-	/* adjust pitch to 1024, 2048, 4096 or 8192 */
-	lcdc_pitch = (xres - 1) | 1023;
-	lcdc_pitch = lcdc_pitch | (lcdc_pitch >> 1);
-	lcdc_pitch = lcdc_pitch | (lcdc_pitch >> 2);
-	lcdc_pitch += 1;
-
-	/* derive settings */
-	if (lcdc_pitch == 8192 && yres >= 1024) {
-		lcdc_pitch = xpitch = MERAM_LINE_WIDTH;
-		line_cnt = total_byte_count >> 11;
-		*out_pitch = xres;
-		save_lines = plane->marker->size / 16 / MERAM_SEC_LINE;
-		save_lines *= MERAM_SEC_LINE;
-	} else {
-		xpitch = xres;
-		line_cnt = yres;
-		*out_pitch = lcdc_pitch;
-		save_lines = plane->marker->size / (lcdc_pitch >> 10) / 2;
-		save_lines &= 0xff;
-	}
-	bnm = (save_lines - 1) << 16;
-
-	/* TODO: we better to check if we have enough MERAM buffer size */
-
-	/* set up ICB */
-	meram_write_icb(priv->base, plane->cache->index,  MExxBSIZE,
-			MERAM_MExxBSIZE_VAL(0x0, line_cnt - 1, xpitch - 1));
-	meram_write_icb(priv->base, plane->marker->index, MExxBSIZE,
-			MERAM_MExxBSIZE_VAL(0xf, line_cnt - 1, xpitch - 1));
-
-	meram_write_icb(priv->base, plane->cache->index,  MExxMNCF, bnm);
-	meram_write_icb(priv->base, plane->marker->index, MExxMNCF, bnm);
-
-	meram_write_icb(priv->base, plane->cache->index,  MExxSBSIZE, xpitch);
-	meram_write_icb(priv->base, plane->marker->index, MExxSBSIZE, xpitch);
-
-	/* save a cache unit size */
-	plane->cache->cache_unit = xres * save_lines;
-	plane->marker->cache_unit = xres * save_lines;
-
-	/*
-	 * Set MERAM for framebuffer
-	 *
-	 * we also chain the cache_icb and the marker_icb.
-	 * we also split the allocated MERAM buffer between two ICBs.
-	 */
-	meram_write_icb(priv->base, plane->cache->index, MExxCTL,
-			MERAM_MExxCTL_VAL(plane->marker->index, marker->offset)
-			| MExxCTL_WD1 | MExxCTL_WD0 | MExxCTL_WS | MExxCTL_CM |
-			MExxCTL_MD_FB);
-	meram_write_icb(priv->base, plane->marker->index, MExxCTL,
-			MERAM_MExxCTL_VAL(plane->cache->index, marker->offset +
-					  plane->marker->size / 2) |
-			MExxCTL_WD1 | MExxCTL_WD0 | MExxCTL_WS | MExxCTL_CM |
-			MExxCTL_MD_FB);
-
-	return 0;
-}
-
-static void meram_plane_cleanup(struct sh_mobile_meram_priv *priv,
-				struct sh_mobile_meram_fb_plane *plane)
-{
-	/* disable ICB */
-	meram_write_icb(priv->base, plane->cache->index,  MExxCTL,
-			MExxCTL_WBF | MExxCTL_WF | MExxCTL_RF);
-	meram_write_icb(priv->base, plane->marker->index, MExxCTL,
-			MExxCTL_WBF | MExxCTL_WF | MExxCTL_RF);
-
-	plane->cache->cache_unit = 0;
-	plane->marker->cache_unit = 0;
-}
-
-/* -----------------------------------------------------------------------------
- * MERAM operations
- */
-
-unsigned long sh_mobile_meram_alloc(struct sh_mobile_meram_info *pdata,
-				    size_t size)
-{
-	struct sh_mobile_meram_priv *priv = pdata->priv;
-
-	return meram_alloc(priv, size);
-}
-EXPORT_SYMBOL_GPL(sh_mobile_meram_alloc);
-
-void sh_mobile_meram_free(struct sh_mobile_meram_info *pdata, unsigned long mem,
-			  size_t size)
-{
-	struct sh_mobile_meram_priv *priv = pdata->priv;
-
-	meram_free(priv, mem, size);
-}
-EXPORT_SYMBOL_GPL(sh_mobile_meram_free);
-
-/* Allocate memory for the ICBs and mark them as used. */
-static struct sh_mobile_meram_fb_cache *
-meram_cache_alloc(struct sh_mobile_meram_priv *priv,
-		  const struct sh_mobile_meram_cfg *cfg,
-		  int pixelformat)
-{
-	unsigned int nplanes = is_nvcolor(pixelformat) ? 2 : 1;
-	struct sh_mobile_meram_fb_cache *cache;
-	int ret;
-
-	cache = kzalloc(sizeof(*cache), GFP_KERNEL);
-	if (cache == NULL)
-		return ERR_PTR(-ENOMEM);
-
-	cache->nplanes = nplanes;
-
-	ret = meram_plane_alloc(priv, &cache->planes[0],
-				cfg->icb[0].meram_size);
-	if (ret < 0)
-		goto error;
-
-	cache->planes[0].marker->current_reg = 1;
-	cache->planes[0].marker->pixelformat = pixelformat;
-
-	if (cache->nplanes == 1)
-		return cache;
-
-	ret = meram_plane_alloc(priv, &cache->planes[1],
-				cfg->icb[1].meram_size);
-	if (ret < 0) {
-		meram_plane_free(priv, &cache->planes[0]);
-		goto error;
-	}
-
-	return cache;
-
-error:
-	kfree(cache);
-	return ERR_PTR(-ENOMEM);
-}
-
-void *sh_mobile_meram_cache_alloc(struct sh_mobile_meram_info *pdata,
-				  const struct sh_mobile_meram_cfg *cfg,
-				  unsigned int xres, unsigned int yres,
-				  unsigned int pixelformat, unsigned int *pitch)
-{
-	struct sh_mobile_meram_fb_cache *cache;
-	struct sh_mobile_meram_priv *priv = pdata->priv;
-	struct platform_device *pdev = pdata->pdev;
-	unsigned int nplanes = is_nvcolor(pixelformat) ? 2 : 1;
-	unsigned int out_pitch;
-
-	if (priv == NULL)
-		return ERR_PTR(-ENODEV);
-
-	if (pixelformat != SH_MOBILE_MERAM_PF_NV &&
-	    pixelformat != SH_MOBILE_MERAM_PF_NV24 &&
-	    pixelformat != SH_MOBILE_MERAM_PF_RGB)
-		return ERR_PTR(-EINVAL);
-
-	dev_dbg(&pdev->dev, "registering %dx%d (%s)", xres, yres,
-		!pixelformat ? "yuv" : "rgb");
-
-	/* we can't handle wider than 8192px */
-	if (xres > 8192) {
-		dev_err(&pdev->dev, "width exceeding the limit (> 8192).");
-		return ERR_PTR(-EINVAL);
-	}
-
-	if (cfg->icb[0].meram_size == 0)
-		return ERR_PTR(-EINVAL);
-
-	if (nplanes == 2 && cfg->icb[1].meram_size == 0)
-		return ERR_PTR(-EINVAL);
-
-	mutex_lock(&priv->lock);
-
-	/* We now register the ICBs and allocate the MERAM regions. */
-	cache = meram_cache_alloc(priv, cfg, pixelformat);
-	if (IS_ERR(cache)) {
-		dev_err(&pdev->dev, "MERAM allocation failed (%ld).",
-			PTR_ERR(cache));
-		goto err;
-	}
-
-	/* initialize MERAM */
-	meram_plane_init(priv, &cache->planes[0], xres, yres, &out_pitch);
-	*pitch = out_pitch;
-	if (pixelformat == SH_MOBILE_MERAM_PF_NV)
-		meram_plane_init(priv, &cache->planes[1],
-				 xres, (yres + 1) / 2, &out_pitch);
-	else if (pixelformat == SH_MOBILE_MERAM_PF_NV24)
-		meram_plane_init(priv, &cache->planes[1],
-				 2 * xres, (yres + 1) / 2, &out_pitch);
-
-err:
-	mutex_unlock(&priv->lock);
-	return cache;
-}
-EXPORT_SYMBOL_GPL(sh_mobile_meram_cache_alloc);
-
-void
-sh_mobile_meram_cache_free(struct sh_mobile_meram_info *pdata, void *data)
-{
-	struct sh_mobile_meram_fb_cache *cache = data;
-	struct sh_mobile_meram_priv *priv = pdata->priv;
-
-	mutex_lock(&priv->lock);
-
-	/* Cleanup and free. */
-	meram_plane_cleanup(priv, &cache->planes[0]);
-	meram_plane_free(priv, &cache->planes[0]);
-
-	if (cache->nplanes == 2) {
-		meram_plane_cleanup(priv, &cache->planes[1]);
-		meram_plane_free(priv, &cache->planes[1]);
-	}
-
-	kfree(cache);
-
-	mutex_unlock(&priv->lock);
-}
-EXPORT_SYMBOL_GPL(sh_mobile_meram_cache_free);
-
-void
-sh_mobile_meram_cache_update(struct sh_mobile_meram_info *pdata, void *data,
-			     unsigned long base_addr_y,
-			     unsigned long base_addr_c,
-			     unsigned long *icb_addr_y,
-			     unsigned long *icb_addr_c)
-{
-	struct sh_mobile_meram_fb_cache *cache = data;
-	struct sh_mobile_meram_priv *priv = pdata->priv;
-
-	mutex_lock(&priv->lock);
-
-	meram_set_next_addr(priv, cache, base_addr_y, base_addr_c);
-	meram_get_next_icb_addr(pdata, cache, icb_addr_y, icb_addr_c);
-
-	mutex_unlock(&priv->lock);
-}
-EXPORT_SYMBOL_GPL(sh_mobile_meram_cache_update);
-
-/* -----------------------------------------------------------------------------
- * Power management
- */
-
-#ifdef CONFIG_PM
-static int sh_mobile_meram_suspend(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct sh_mobile_meram_priv *priv = platform_get_drvdata(pdev);
-	unsigned int i, j;
-
-	for (i = 0; i < MERAM_REGS_SIZE; i++)
-		priv->regs[i] = meram_read_reg(priv->base, common_regs[i]);
-
-	for (i = 0; i < 32; i++) {
-		if (!test_bit(i, &priv->used_icb))
-			continue;
-		for (j = 0; j < ICB_REGS_SIZE; j++) {
-			priv->icbs[i].regs[j] =
-				meram_read_icb(priv->base, i, icb_regs[j]);
-			/* Reset ICB on resume */
-			if (icb_regs[j] == MExxCTL)
-				priv->icbs[i].regs[j] |=
-					MExxCTL_WBF | MExxCTL_WF | MExxCTL_RF;
-		}
-	}
-	return 0;
-}
-
-static int sh_mobile_meram_resume(struct device *dev)
-{
-	struct platform_device *pdev = to_platform_device(dev);
-	struct sh_mobile_meram_priv *priv = platform_get_drvdata(pdev);
-	unsigned int i, j;
-
-	for (i = 0; i < 32; i++) {
-		if (!test_bit(i, &priv->used_icb))
-			continue;
-		for (j = 0; j < ICB_REGS_SIZE; j++)
-			meram_write_icb(priv->base, i, icb_regs[j],
-					priv->icbs[i].regs[j]);
-	}
-
-	for (i = 0; i < MERAM_REGS_SIZE; i++)
-		meram_write_reg(priv->base, common_regs[i], priv->regs[i]);
-	return 0;
-}
-#endif /* CONFIG_PM */
-
-static UNIVERSAL_DEV_PM_OPS(sh_mobile_meram_dev_pm_ops,
-			    sh_mobile_meram_suspend,
-			    sh_mobile_meram_resume, NULL);
-
-/* -----------------------------------------------------------------------------
- * Probe/remove and driver init/exit
- */
-
-static int sh_mobile_meram_probe(struct platform_device *pdev)
-{
-	struct sh_mobile_meram_priv *priv;
-	struct sh_mobile_meram_info *pdata = pdev->dev.platform_data;
-	struct resource *regs;
-	struct resource *meram;
-	unsigned int i;
-	int error;
-
-	if (!pdata) {
-		dev_err(&pdev->dev, "no platform data defined\n");
-		return -EINVAL;
-	}
-
-	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	meram = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-	if (regs == NULL || meram == NULL) {
-		dev_err(&pdev->dev, "cannot get platform resources\n");
-		return -ENOENT;
-	}
-
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv) {
-		dev_err(&pdev->dev, "cannot allocate device data\n");
-		return -ENOMEM;
-	}
-
-	/* Initialize private data. */
-	mutex_init(&priv->lock);
-	priv->used_icb = pdata->reserved_icbs;
-
-	for (i = 0; i < MERAM_ICB_NUM; ++i)
-		priv->icbs[i].index = i;
-
-	pdata->priv = priv;
-	pdata->pdev = pdev;
-
-	/* Request memory regions and remap the registers. */
-	if (!request_mem_region(regs->start, resource_size(regs), pdev->name)) {
-		dev_err(&pdev->dev, "MERAM registers region already claimed\n");
-		error = -EBUSY;
-		goto err_req_regs;
-	}
-
-	if (!request_mem_region(meram->start, resource_size(meram),
-				pdev->name)) {
-		dev_err(&pdev->dev, "MERAM memory region already claimed\n");
-		error = -EBUSY;
-		goto err_req_meram;
-	}
-
-	priv->base = ioremap_nocache(regs->start, resource_size(regs));
-	if (!priv->base) {
-		dev_err(&pdev->dev, "ioremap failed\n");
-		error = -EFAULT;
-		goto err_ioremap;
-	}
-
-	priv->meram = meram->start;
-
-	/* Create and initialize the MERAM memory pool. */
-	priv->pool = gen_pool_create(ilog2(MERAM_GRANULARITY), -1);
-	if (priv->pool == NULL) {
-		error = -ENOMEM;
-		goto err_genpool;
-	}
-
-	error = gen_pool_add(priv->pool, meram->start, resource_size(meram),
-			     -1);
-	if (error < 0)
-		goto err_genpool;
-
-	/* initialize ICB addressing mode */
-	if (pdata->addr_mode == SH_MOBILE_MERAM_MODE1)
-		meram_write_reg(priv->base, MEVCR1, MEVCR1_AMD1);
-
-	platform_set_drvdata(pdev, priv);
-	pm_runtime_enable(&pdev->dev);
-
-	dev_info(&pdev->dev, "sh_mobile_meram initialized.");
-
-	return 0;
-
-err_genpool:
-	if (priv->pool)
-		gen_pool_destroy(priv->pool);
-	iounmap(priv->base);
-err_ioremap:
-	release_mem_region(meram->start, resource_size(meram));
-err_req_meram:
-	release_mem_region(regs->start, resource_size(regs));
-err_req_regs:
-	mutex_destroy(&priv->lock);
-	kfree(priv);
-
-	return error;
-}
-
-
-static int sh_mobile_meram_remove(struct platform_device *pdev)
-{
-	struct sh_mobile_meram_priv *priv = platform_get_drvdata(pdev);
-	struct resource *regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	struct resource *meram = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-
-	pm_runtime_disable(&pdev->dev);
-
-	gen_pool_destroy(priv->pool);
-
-	iounmap(priv->base);
-	release_mem_region(meram->start, resource_size(meram));
-	release_mem_region(regs->start, resource_size(regs));
-
-	mutex_destroy(&priv->lock);
-
-	kfree(priv);
-
-	return 0;
-}
-
-static struct platform_driver sh_mobile_meram_driver = {
-	.driver	= {
-		.name		= "sh_mobile_meram",
-		.pm		= &sh_mobile_meram_dev_pm_ops,
-	},
-	.probe		= sh_mobile_meram_probe,
-	.remove		= sh_mobile_meram_remove,
-};
-
-module_platform_driver(sh_mobile_meram_driver);
-
-MODULE_DESCRIPTION("SuperH Mobile MERAM driver");
-MODULE_AUTHOR("Damian Hobson-Garcia / Takanari Hayama");
-MODULE_LICENSE("GPL v2");
diff --git a/drivers/video/fbdev/sm501fb.c b/drivers/video/fbdev/sm501fb.c
index 6f0a195..dde52d0 100644
--- a/drivers/video/fbdev/sm501fb.c
+++ b/drivers/video/fbdev/sm501fb.c
@@ -1932,8 +1932,7 @@
 	int ret;
 
 	/* allocate our framebuffers */
-
-	info = kzalloc(sizeof(struct sm501fb_info), GFP_KERNEL);
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
 	if (!info) {
 		dev_err(dev, "failed to allocate state\n");
 		return -ENOMEM;
diff --git a/drivers/video/fbdev/via/global.h b/drivers/video/fbdev/via/global.h
index 275dbbb..649d2ca 100644
--- a/drivers/video/fbdev/via/global.h
+++ b/drivers/video/fbdev/via/global.h
@@ -33,6 +33,12 @@
 #include <linux/console.h>
 #include <linux/timer.h>
 
+#ifdef CONFIG_X86
+#include <asm/olpc.h>
+#else
+#define machine_is_olpc(x) 0
+#endif
+
 #include "debug.h"
 
 #include "viafbdev.h"
diff --git a/drivers/video/fbdev/via/hw.c b/drivers/video/fbdev/via/hw.c
index 22450908..48969c6 100644
--- a/drivers/video/fbdev/via/hw.c
+++ b/drivers/video/fbdev/via/hw.c
@@ -20,7 +20,6 @@
  */
 
 #include <linux/via-core.h>
-#include <asm/olpc.h>
 #include "global.h"
 #include "via_clock.h"
 
diff --git a/drivers/video/fbdev/via/via-core.c b/drivers/video/fbdev/via/via-core.c
index 77774d8..b041eb2 100644
--- a/drivers/video/fbdev/via/via-core.c
+++ b/drivers/video/fbdev/via/via-core.c
@@ -17,7 +17,6 @@
 #include <linux/platform_device.h>
 #include <linux/list.h>
 #include <linux/pm.h>
-#include <asm/olpc.h>
 
 /*
  * The default port config.
diff --git a/drivers/video/fbdev/via/via_clock.c b/drivers/video/fbdev/via/via_clock.c
index bf269fa..3d0efdb 100644
--- a/drivers/video/fbdev/via/via_clock.c
+++ b/drivers/video/fbdev/via/via_clock.c
@@ -25,7 +25,7 @@
 
 #include <linux/kernel.h>
 #include <linux/via-core.h>
-#include <asm/olpc.h>
+
 #include "via_clock.h"
 #include "global.h"
 #include "debug.h"
diff --git a/drivers/video/fbdev/via/viafbdev.c b/drivers/video/fbdev/via/viafbdev.c
index 52f577b..d2f7850 100644
--- a/drivers/video/fbdev/via/viafbdev.c
+++ b/drivers/video/fbdev/via/viafbdev.c
@@ -25,7 +25,6 @@
 #include <linux/stat.h>
 #include <linux/via-core.h>
 #include <linux/via_i2c.h>
-#include <asm/olpc.h>
 
 #define _MASTER_FILE
 #include "global.h"
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index b563a44..705aebd 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -578,6 +578,8 @@
 	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
 	struct device *dev = get_device(&vp_dev->vdev.dev);
 
+	pci_disable_sriov(pci_dev);
+
 	unregister_virtio_device(&vp_dev->vdev);
 
 	if (vp_dev->ioaddr)
@@ -589,6 +591,33 @@
 	put_device(dev);
 }
 
+static int virtio_pci_sriov_configure(struct pci_dev *pci_dev, int num_vfs)
+{
+	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
+	struct virtio_device *vdev = &vp_dev->vdev;
+	int ret;
+
+	if (!(vdev->config->get_status(vdev) & VIRTIO_CONFIG_S_DRIVER_OK))
+		return -EBUSY;
+
+	if (!__virtio_test_bit(vdev, VIRTIO_F_SR_IOV))
+		return -EINVAL;
+
+	if (pci_vfs_assigned(pci_dev))
+		return -EPERM;
+
+	if (num_vfs == 0) {
+		pci_disable_sriov(pci_dev);
+		return 0;
+	}
+
+	ret = pci_enable_sriov(pci_dev, num_vfs);
+	if (ret < 0)
+		return ret;
+
+	return num_vfs;
+}
+
 static struct pci_driver virtio_pci_driver = {
 	.name		= "virtio-pci",
 	.id_table	= virtio_pci_id_table,
@@ -597,6 +626,7 @@
 #ifdef CONFIG_PM_SLEEP
 	.driver.pm	= &virtio_pci_pm_ops,
 #endif
+	.sriov_configure = virtio_pci_sriov_configure,
 };
 
 module_pci_driver(virtio_pci_driver);
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 2555d80..07571da 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -153,14 +153,28 @@
 	return features;
 }
 
+static void vp_transport_features(struct virtio_device *vdev, u64 features)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	struct pci_dev *pci_dev = vp_dev->pci_dev;
+
+	if ((features & BIT_ULL(VIRTIO_F_SR_IOV)) &&
+			pci_find_ext_capability(pci_dev, PCI_EXT_CAP_ID_SRIOV))
+		__virtio_set_bit(vdev, VIRTIO_F_SR_IOV);
+}
+
 /* virtio config->finalize_features() implementation */
 static int vp_finalize_features(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u64 features = vdev->features;
 
 	/* Give virtio_ring a chance to accept features. */
 	vring_transport_features(vdev);
 
+	/* Give virtio_pci a chance to accept features. */
+	vp_transport_features(vdev, features);
+
 	if (!__virtio_test_bit(vdev, VIRTIO_F_VERSION_1)) {
 		dev_err(&vdev->dev, "virtio: device uses modern interface "
 			"but does not have VIRTIO_F_VERSION_1\n");
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 532acae..5468740 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -5,7 +5,7 @@
 
 afs-cache-$(CONFIG_AFS_FSCACHE) := cache.o
 
-kafs-objs := \
+kafs-y := \
 	$(afs-cache-y) \
 	addr_list.o \
 	callback.o \
@@ -21,7 +21,6 @@
 	main.o \
 	misc.o \
 	mntpt.o \
-	proc.o \
 	rotate.o \
 	rxrpc.o \
 	security.o \
@@ -34,4 +33,5 @@
 	write.o \
 	xattr.o
 
+kafs-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_AFS_FS)  := kafs.o
diff --git a/fs/afs/addr_list.c b/fs/afs/addr_list.c
index 2c46c46..025a9a5 100644
--- a/fs/afs/addr_list.c
+++ b/fs/afs/addr_list.c
@@ -215,7 +215,7 @@
 	_enter("%s", cell->name);
 
 	ret = dns_query("afsdb", cell->name, cell->name_len,
-			"ipv4", &vllist, _expiry);
+			"", &vllist, _expiry);
 	if (ret < 0)
 		return ERR_PTR(ret);
 
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index 571437d..5f261fb 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -21,6 +21,66 @@
 #include "internal.h"
 
 /*
+ * Create volume and callback interests on a server.
+ */
+static struct afs_cb_interest *afs_create_interest(struct afs_server *server,
+						   struct afs_vnode *vnode)
+{
+	struct afs_vol_interest *new_vi, *vi;
+	struct afs_cb_interest *new;
+	struct hlist_node **pp;
+
+	new_vi = kzalloc(sizeof(struct afs_vol_interest), GFP_KERNEL);
+	if (!new_vi)
+		return NULL;
+
+	new = kzalloc(sizeof(struct afs_cb_interest), GFP_KERNEL);
+	if (!new) {
+		kfree(new_vi);
+		return NULL;
+	}
+
+	new_vi->usage = 1;
+	new_vi->vid = vnode->volume->vid;
+	INIT_HLIST_NODE(&new_vi->srv_link);
+	INIT_HLIST_HEAD(&new_vi->cb_interests);
+
+	refcount_set(&new->usage, 1);
+	new->sb = vnode->vfs_inode.i_sb;
+	new->vid = vnode->volume->vid;
+	new->server = afs_get_server(server);
+	INIT_HLIST_NODE(&new->cb_vlink);
+
+	write_lock(&server->cb_break_lock);
+
+	for (pp = &server->cb_volumes.first; *pp; pp = &(*pp)->next) {
+		vi = hlist_entry(*pp, struct afs_vol_interest, srv_link);
+		if (vi->vid < new_vi->vid)
+			continue;
+		if (vi->vid > new_vi->vid)
+			break;
+		vi->usage++;
+		goto found_vi;
+	}
+
+	new_vi->srv_link.pprev = pp;
+	new_vi->srv_link.next = *pp;
+	if (*pp)
+		(*pp)->pprev = &new_vi->srv_link.next;
+	*pp = &new_vi->srv_link;
+	vi = new_vi;
+	new_vi = NULL;
+found_vi:
+
+	new->vol_interest = vi;
+	hlist_add_head(&new->cb_vlink, &vi->cb_interests);
+
+	write_unlock(&server->cb_break_lock);
+	kfree(new_vi);
+	return new;
+}
+
+/*
  * Set up an interest-in-callbacks record for a volume on a server and
  * register it with the server.
  * - Called with vnode->io_lock held.
@@ -77,20 +137,10 @@
 	}
 
 	if (!cbi) {
-		new = kzalloc(sizeof(struct afs_cb_interest), GFP_KERNEL);
+		new = afs_create_interest(server, vnode);
 		if (!new)
 			return -ENOMEM;
 
-		refcount_set(&new->usage, 1);
-		new->sb = vnode->vfs_inode.i_sb;
-		new->vid = vnode->volume->vid;
-		new->server = afs_get_server(server);
-		INIT_LIST_HEAD(&new->cb_link);
-
-		write_lock(&server->cb_break_lock);
-		list_add_tail(&new->cb_link, &server->cb_interests);
-		write_unlock(&server->cb_break_lock);
-
 		write_lock(&slist->lock);
 		if (!entry->cb_interest) {
 			entry->cb_interest = afs_get_cb_interest(new);
@@ -126,11 +176,22 @@
  */
 void afs_put_cb_interest(struct afs_net *net, struct afs_cb_interest *cbi)
 {
+	struct afs_vol_interest *vi;
+
 	if (cbi && refcount_dec_and_test(&cbi->usage)) {
-		if (!list_empty(&cbi->cb_link)) {
+		if (!hlist_unhashed(&cbi->cb_vlink)) {
 			write_lock(&cbi->server->cb_break_lock);
-			list_del_init(&cbi->cb_link);
+
+			hlist_del_init(&cbi->cb_vlink);
+			vi = cbi->vol_interest;
+			cbi->vol_interest = NULL;
+			if (--vi->usage == 0)
+				hlist_del(&vi->srv_link);
+			else
+				vi = NULL;
+
 			write_unlock(&cbi->server->cb_break_lock);
+			kfree(vi);
 			afs_put_server(net, cbi->server);
 		}
 		kfree(cbi);
@@ -182,20 +243,34 @@
 static void afs_break_one_callback(struct afs_server *server,
 				   struct afs_fid *fid)
 {
+	struct afs_vol_interest *vi;
 	struct afs_cb_interest *cbi;
 	struct afs_iget_data data;
 	struct afs_vnode *vnode;
 	struct inode *inode;
 
 	read_lock(&server->cb_break_lock);
+	hlist_for_each_entry(vi, &server->cb_volumes, srv_link) {
+		if (vi->vid < fid->vid)
+			continue;
+		if (vi->vid > fid->vid) {
+			vi = NULL;
+			break;
+		}
+		//atomic_inc(&vi->usage);
+		break;
+	}
+
+	/* TODO: Find all matching volumes if we couldn't match the server and
+	 * break them anyway.
+	 */
+	if (!vi)
+		goto out;
 
 	/* Step through all interested superblocks.  There may be more than one
 	 * because of cell aliasing.
 	 */
-	list_for_each_entry(cbi, &server->cb_interests, cb_link) {
-		if (cbi->vid != fid->vid)
-			continue;
-
+	hlist_for_each_entry(cbi, &vi->cb_interests, cb_vlink) {
 		if (fid->vnode == 0 && fid->unique == 0) {
 			/* The callback break applies to an entire volume. */
 			struct afs_super_info *as = AFS_FS_S(cbi->sb);
@@ -217,6 +292,7 @@
 		}
 	}
 
+out:
 	read_unlock(&server->cb_break_lock);
 }
 
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index fdf4c36..f3d0bef 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -15,6 +15,7 @@
 #include <linux/dns_resolver.h>
 #include <linux/sched.h>
 #include <linux/inet.h>
+#include <linux/namei.h>
 #include <keys/rxrpc-type.h>
 #include "internal.h"
 
@@ -341,8 +342,8 @@
 
 	/* install the new cell */
 	write_seqlock(&net->cells_lock);
-	old_root = net->ws_cell;
-	net->ws_cell = new_root;
+	old_root = rcu_access_pointer(net->ws_cell);
+	rcu_assign_pointer(net->ws_cell, new_root);
 	write_sequnlock(&net->cells_lock);
 
 	afs_put_cell(net, old_root);
@@ -528,12 +529,14 @@
 					     NULL, 0,
 					     cell, 0, true);
 #endif
-	ret = afs_proc_cell_setup(net, cell);
+	ret = afs_proc_cell_setup(cell);
 	if (ret < 0)
 		return ret;
-	spin_lock(&net->proc_cells_lock);
+
+	mutex_lock(&net->proc_cells_lock);
 	list_add_tail(&cell->proc_link, &net->proc_cells);
-	spin_unlock(&net->proc_cells_lock);
+	afs_dynroot_mkdir(net, cell);
+	mutex_unlock(&net->proc_cells_lock);
 	return 0;
 }
 
@@ -544,11 +547,12 @@
 {
 	_enter("%s", cell->name);
 
-	afs_proc_cell_remove(net, cell);
+	afs_proc_cell_remove(cell);
 
-	spin_lock(&net->proc_cells_lock);
+	mutex_lock(&net->proc_cells_lock);
 	list_del_init(&cell->proc_link);
-	spin_unlock(&net->proc_cells_lock);
+	afs_dynroot_rmdir(net, cell);
+	mutex_unlock(&net->proc_cells_lock);
 
 #ifdef CONFIG_AFS_FSCACHE
 	fscache_relinquish_cookie(cell->cache, NULL, false);
@@ -755,8 +759,8 @@
 	_enter("");
 
 	write_seqlock(&net->cells_lock);
-	ws = net->ws_cell;
-	net->ws_cell = NULL;
+	ws = rcu_access_pointer(net->ws_cell);
+	RCU_INIT_POINTER(net->ws_cell, NULL);
 	write_sequnlock(&net->cells_lock);
 	afs_put_cell(net, ws);
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 238fd28..9e51d6f 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -526,7 +526,7 @@
 	nifs = 0;
 	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
 	if (ifs) {
-		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		nifs = afs_get_ipv4_interfaces(call->net, ifs, 32, false);
 		if (nifs < 0) {
 			kfree(ifs);
 			ifs = NULL;
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
index 983f394..174e843 100644
--- a/fs/afs/dynroot.c
+++ b/fs/afs/dynroot.c
@@ -1,4 +1,4 @@
-/* dir.c: AFS dynamic root handling
+/* AFS dynamic root handling
  *
  * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
@@ -46,7 +46,7 @@
 		return 0;
 	}
 
-	ret = dns_query("afsdb", name, len, "ipv4", NULL, NULL);
+	ret = dns_query("afsdb", name, len, "", NULL, NULL);
 	if (ret == -ENODATA)
 		ret = -EDESTADDRREQ;
 	return ret;
@@ -207,3 +207,125 @@
 	.d_release	= afs_d_release,
 	.d_automount	= afs_d_automount,
 };
+
+/*
+ * Create a manually added cell mount directory.
+ * - The caller must hold net->proc_cells_lock
+ */
+int afs_dynroot_mkdir(struct afs_net *net, struct afs_cell *cell)
+{
+	struct super_block *sb = net->dynroot_sb;
+	struct dentry *root, *subdir;
+	int ret;
+
+	if (!sb || atomic_read(&sb->s_active) == 0)
+		return 0;
+
+	/* Let the ->lookup op do the creation */
+	root = sb->s_root;
+	inode_lock(root->d_inode);
+	subdir = lookup_one_len(cell->name, root, cell->name_len);
+	if (IS_ERR(subdir)) {
+		ret = PTR_ERR(subdir);
+		goto unlock;
+	}
+
+	/* Note that we're retaining an extra ref on the dentry */
+	subdir->d_fsdata = (void *)1UL;
+	ret = 0;
+unlock:
+	inode_unlock(root->d_inode);
+	return ret;
+}
+
+/*
+ * Remove a manually added cell mount directory.
+ * - The caller must hold net->proc_cells_lock
+ */
+void afs_dynroot_rmdir(struct afs_net *net, struct afs_cell *cell)
+{
+	struct super_block *sb = net->dynroot_sb;
+	struct dentry *root, *subdir;
+
+	if (!sb || atomic_read(&sb->s_active) == 0)
+		return;
+
+	root = sb->s_root;
+	inode_lock(root->d_inode);
+
+	/* Don't want to trigger a lookup call, which will re-add the cell */
+	subdir = try_lookup_one_len(cell->name, root, cell->name_len);
+	if (IS_ERR_OR_NULL(subdir)) {
+		_debug("lookup %ld", PTR_ERR(subdir));
+		goto no_dentry;
+	}
+
+	_debug("rmdir %pd %u", subdir, d_count(subdir));
+
+	if (subdir->d_fsdata) {
+		_debug("unpin %u", d_count(subdir));
+		subdir->d_fsdata = NULL;
+		dput(subdir);
+	}
+	dput(subdir);
+no_dentry:
+	inode_unlock(root->d_inode);
+	_leave("");
+}
+
+/*
+ * Populate a newly created dynamic root with cell names.
+ */
+int afs_dynroot_populate(struct super_block *sb)
+{
+	struct afs_cell *cell;
+	struct afs_net *net = afs_sb2net(sb);
+	int ret;
+
+	if (mutex_lock_interruptible(&net->proc_cells_lock) < 0)
+		return -ERESTARTSYS;
+
+	net->dynroot_sb = sb;
+	list_for_each_entry(cell, &net->proc_cells, proc_link) {
+		ret = afs_dynroot_mkdir(net, cell);
+		if (ret < 0)
+			goto error;
+	}
+
+	ret = 0;
+out:
+	mutex_unlock(&net->proc_cells_lock);
+	return ret;
+
+error:
+	net->dynroot_sb = NULL;
+	goto out;
+}
+
+/*
+ * When a dynamic root that's in the process of being destroyed, depopulate it
+ * of pinned directories.
+ */
+void afs_dynroot_depopulate(struct super_block *sb)
+{
+	struct afs_net *net = afs_sb2net(sb);
+	struct dentry *root = sb->s_root, *subdir, *tmp;
+
+	/* Prevent more subdirs from being created */
+	mutex_lock(&net->proc_cells_lock);
+	if (net->dynroot_sb == sb)
+		net->dynroot_sb = NULL;
+	mutex_unlock(&net->proc_cells_lock);
+
+	inode_lock(root->d_inode);
+
+	/* Remove all the pins for dirs created for manually added cells */
+	list_for_each_entry_safe(subdir, tmp, &root->d_subdirs, d_child) {
+		if (subdir->d_fsdata) {
+			subdir->d_fsdata = NULL;
+			dput(subdir);
+		}
+	}
+
+	inode_unlock(root->d_inode);
+}
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 5907601..50929cb 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -138,10 +138,6 @@
 	u64 data_version, size;
 	u32 type, abort_code;
 	u8 flags = 0;
-	int ret;
-
-	if (vnode)
-		write_seqlock(&vnode->cb_lock);
 
 	abort_code = ntohl(xdr->abort_code);
 
@@ -154,8 +150,7 @@
 			 * case.
 			 */
 			status->abort_code = abort_code;
-			ret = 0;
-			goto out;
+			return 0;
 		}
 
 		pr_warn("Unknown AFSFetchStatus version %u\n", ntohl(xdr->if_version));
@@ -164,8 +159,7 @@
 
 	if (abort_code != 0 && inline_error) {
 		status->abort_code = abort_code;
-		ret = 0;
-		goto out;
+		return 0;
 	}
 
 	type = ntohl(xdr->type);
@@ -235,17 +229,35 @@
 					     flags);
 	}
 
-	ret = 0;
-
-out:
-	if (vnode)
-		write_sequnlock(&vnode->cb_lock);
-	return ret;
+	return 0;
 
 bad:
 	xdr_dump_bad(*_bp);
-	ret = afs_protocol_error(call, -EBADMSG);
-	goto out;
+	return afs_protocol_error(call, -EBADMSG);
+}
+
+/*
+ * Decode the file status.  We need to lock the target vnode if we're going to
+ * update its status so that stat() sees the attributes update atomically.
+ */
+static int afs_decode_status(struct afs_call *call,
+			     const __be32 **_bp,
+			     struct afs_file_status *status,
+			     struct afs_vnode *vnode,
+			     const afs_dataversion_t *expected_version,
+			     struct afs_read *read_req)
+{
+	int ret;
+
+	if (!vnode)
+		return xdr_decode_AFSFetchStatus(call, _bp, status, vnode,
+						 expected_version, read_req);
+
+	write_seqlock(&vnode->cb_lock);
+	ret = xdr_decode_AFSFetchStatus(call, _bp, status, vnode,
+					expected_version, read_req);
+	write_sequnlock(&vnode->cb_lock);
+	return ret;
 }
 
 /*
@@ -387,8 +399,8 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	xdr_decode_AFSCallBack(call, vnode, &bp);
 	if (call->reply[1])
@@ -568,8 +580,8 @@
 			return ret;
 
 		bp = call->buffer;
-		if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-					      &vnode->status.data_version, req) < 0)
+		if (afs_decode_status(call, &bp, &vnode->status, vnode,
+				      &vnode->status.data_version, req) < 0)
 			return afs_protocol_error(call, -EBADMSG);
 		xdr_decode_AFSCallBack(call, vnode, &bp);
 		if (call->reply[1])
@@ -721,9 +733,9 @@
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	if (xdr_decode_AFSFetchStatus(call, &bp, call->reply[2], NULL, NULL, NULL) < 0 ||
-	    xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) < 0 ||
+	    afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
@@ -827,8 +839,8 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -917,9 +929,9 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode, NULL, NULL) < 0 ||
-	    xdr_decode_AFSFetchStatus(call, &bp, &dvnode->status, dvnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL) < 0 ||
+	    afs_decode_status(call, &bp, &dvnode->status, dvnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -1004,9 +1016,9 @@
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
 	xdr_decode_AFSFid(&bp, call->reply[1]);
-	if (xdr_decode_AFSFetchStatus(call, &bp, call->reply[2], NULL, NULL, NULL) ||
-	    xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) ||
+	    afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -1110,12 +1122,12 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &orig_dvnode->status, orig_dvnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	if (new_dvnode != orig_dvnode &&
-	    xdr_decode_AFSFetchStatus(call, &bp, &new_dvnode->status, new_dvnode,
-				      &call->expected_version_2, NULL) < 0)
+	    afs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
+			      &call->expected_version_2, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -1219,8 +1231,8 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -1395,8 +1407,8 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	if (xdr_decode_AFSFetchStatus(call, &bp, &vnode->status, vnode,
-				      &call->expected_version, NULL) < 0)
+	if (afs_decode_status(call, &bp, &vnode->status, vnode,
+			      &call->expected_version, NULL) < 0)
 		return afs_protocol_error(call, -EBADMSG);
 	/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
 
@@ -2097,8 +2109,8 @@
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(call, &bp, status, vnode,
-				  &call->expected_version, NULL);
+	afs_decode_status(call, &bp, status, vnode,
+			  &call->expected_version, NULL);
 	callback[call->count].version	= ntohl(bp[0]);
 	callback[call->count].expiry	= ntohl(bp[1]);
 	callback[call->count].type	= ntohl(bp[2]);
@@ -2209,9 +2221,9 @@
 
 		bp = call->buffer;
 		statuses = call->reply[1];
-		if (xdr_decode_AFSFetchStatus(call, &bp, &statuses[call->count],
-					      call->count == 0 ? vnode : NULL,
-					      NULL, NULL) < 0)
+		if (afs_decode_status(call, &bp, &statuses[call->count],
+				      call->count == 0 ? vnode : NULL,
+				      NULL, NULL) < 0)
 			return afs_protocol_error(call, -EBADMSG);
 
 		call->count++;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index e3f8a46..9778df1 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -22,6 +22,8 @@
 #include <linux/backing-dev.h>
 #include <linux/uuid.h>
 #include <net/net_namespace.h>
+#include <net/netns/generic.h>
+#include <net/sock.h>
 #include <net/af_rxrpc.h>
 
 #include "afs.h"
@@ -40,7 +42,8 @@
 	afs_voltype_t		type;		/* type of volume requested */
 	int			volnamesz;	/* size of volume name */
 	const char		*volname;	/* name of volume to mount */
-	struct afs_net		*net;		/* Network namespace in effect */
+	struct net		*net_ns;	/* Network namespace in effect */
+	struct afs_net		*net;		/* the AFS net namespace stuff */
 	struct afs_cell		*cell;		/* cell in which to find volume */
 	struct afs_volume	*volume;	/* volume record */
 	struct key		*key;		/* key to use for secure mounting */
@@ -189,7 +192,7 @@
  * - there's one superblock per volume
  */
 struct afs_super_info {
-	struct afs_net		*net;		/* Network namespace */
+	struct net		*net_ns;	/* Network namespace */
 	struct afs_cell		*cell;		/* The cell in which the volume resides */
 	struct afs_volume	*volume;	/* volume record */
 	bool			dyn_root;	/* True if dynamic root */
@@ -210,7 +213,6 @@
 	char			*subs[AFS_NR_SYSNAME];
 	refcount_t		usage;
 	unsigned short		nr;
-	short			error;
 	char			blank[1];
 };
 
@@ -218,6 +220,7 @@
  * AFS network namespace record.
  */
 struct afs_net {
+	struct net		*net;		/* Backpointer to the owning net namespace */
 	struct afs_uuid		uuid;
 	bool			live;		/* F if this namespace is being removed */
 
@@ -231,13 +234,13 @@
 
 	/* Cell database */
 	struct rb_root		cells;
-	struct afs_cell		*ws_cell;
+	struct afs_cell __rcu	*ws_cell;
 	struct work_struct	cells_manager;
 	struct timer_list	cells_timer;
 	atomic_t		cells_outstanding;
 	seqlock_t		cells_lock;
 
-	spinlock_t		proc_cells_lock;
+	struct mutex		proc_cells_lock;
 	struct list_head	proc_cells;
 
 	/* Known servers.  Theoretically each fileserver can only be in one
@@ -261,6 +264,7 @@
 	struct mutex		lock_manager_mutex;
 
 	/* Misc */
+	struct super_block	*dynroot_sb;	/* Dynamic root mount superblock */
 	struct proc_dir_entry	*proc_afs;	/* /proc/net/afs directory */
 	struct afs_sysnames	*sysnames;
 	rwlock_t		sysnames_lock;
@@ -280,7 +284,6 @@
 };
 
 extern const char afs_init_sysname[];
-extern struct afs_net __afs_net;// Dummy AFS network namespace; TODO: replace with real netns
 
 enum afs_cell_state {
 	AFS_CELL_UNSET,
@@ -404,16 +407,27 @@
 	rwlock_t		fs_lock;	/* access lock */
 
 	/* callback promise management */
-	struct list_head	cb_interests;	/* List of superblocks using this server */
+	struct hlist_head	cb_volumes;	/* List of volume interests on this server */
 	unsigned		cb_s_break;	/* Break-everything counter. */
 	rwlock_t		cb_break_lock;	/* Volume finding lock */
 };
 
 /*
+ * Volume collation in the server's callback interest list.
+ */
+struct afs_vol_interest {
+	struct hlist_node	srv_link;	/* Link in server->cb_volumes */
+	struct hlist_head	cb_interests;	/* List of callback interests on the server */
+	afs_volid_t		vid;		/* Volume ID to match */
+	unsigned int		usage;
+};
+
+/*
  * Interest by a superblock on a server.
  */
 struct afs_cb_interest {
-	struct list_head	cb_link;	/* Link in server->cb_interests */
+	struct hlist_node	cb_vlink;	/* Link in vol_interest->cb_interests */
+	struct afs_vol_interest	*vol_interest;
 	struct afs_server	*server;	/* Server on which this interest resides */
 	struct super_block	*sb;		/* Superblock on which inodes reside */
 	afs_volid_t		vid;		/* Volume ID to match */
@@ -720,6 +734,10 @@
 extern const struct dentry_operations afs_dynroot_dentry_operations;
 
 extern struct inode *afs_try_auto_mntpt(struct dentry *, struct inode *);
+extern int afs_dynroot_mkdir(struct afs_net *, struct afs_cell *);
+extern void afs_dynroot_rmdir(struct afs_net *, struct afs_cell *);
+extern int afs_dynroot_populate(struct super_block *);
+extern void afs_dynroot_depopulate(struct super_block *);
 
 /*
  * file.c
@@ -806,34 +824,36 @@
  * main.c
  */
 extern struct workqueue_struct *afs_wq;
+extern int afs_net_id;
+
+static inline struct afs_net *afs_net(struct net *net)
+{
+	return net_generic(net, afs_net_id);
+}
+
+static inline struct afs_net *afs_sb2net(struct super_block *sb)
+{
+	return afs_net(AFS_FS_S(sb)->net_ns);
+}
 
 static inline struct afs_net *afs_d2net(struct dentry *dentry)
 {
-	return &__afs_net;
+	return afs_sb2net(dentry->d_sb);
 }
 
 static inline struct afs_net *afs_i2net(struct inode *inode)
 {
-	return &__afs_net;
+	return afs_sb2net(inode->i_sb);
 }
 
 static inline struct afs_net *afs_v2net(struct afs_vnode *vnode)
 {
-	return &__afs_net;
+	return afs_i2net(&vnode->vfs_inode);
 }
 
 static inline struct afs_net *afs_sock2net(struct sock *sk)
 {
-	return &__afs_net;
-}
-
-static inline struct afs_net *afs_get_net(struct afs_net *net)
-{
-	return net;
-}
-
-static inline void afs_put_net(struct afs_net *net)
-{
+	return net_generic(sock_net(sk), afs_net_id);
 }
 
 static inline void __afs_stat(atomic_t *s)
@@ -861,16 +881,25 @@
 /*
  * netdevices.c
  */
-extern int afs_get_ipv4_interfaces(struct afs_interface *, size_t, bool);
+extern int afs_get_ipv4_interfaces(struct afs_net *, struct afs_interface *,
+				   size_t, bool);
 
 /*
  * proc.c
  */
+#ifdef CONFIG_PROC_FS
 extern int __net_init afs_proc_init(struct afs_net *);
 extern void __net_exit afs_proc_cleanup(struct afs_net *);
-extern int afs_proc_cell_setup(struct afs_net *, struct afs_cell *);
-extern void afs_proc_cell_remove(struct afs_net *, struct afs_cell *);
+extern int afs_proc_cell_setup(struct afs_cell *);
+extern void afs_proc_cell_remove(struct afs_cell *);
 extern void afs_put_sysnames(struct afs_sysnames *);
+#else
+static inline int afs_proc_init(struct afs_net *net) { return 0; }
+static inline void afs_proc_cleanup(struct afs_net *net) {}
+static inline int afs_proc_cell_setup(struct afs_cell *cell) { return 0; }
+static inline void afs_proc_cell_remove(struct afs_cell *cell) {}
+static inline void afs_put_sysnames(struct afs_sysnames *sysnames) {}
+#endif
 
 /*
  * rotate.c
@@ -1002,7 +1031,7 @@
  * super.c
  */
 extern int __init afs_fs_init(void);
-extern void __exit afs_fs_exit(void);
+extern void afs_fs_exit(void);
 
 /*
  * vlclient.c
diff --git a/fs/afs/main.c b/fs/afs/main.c
index d756016..e84fe82 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -15,6 +15,7 @@
 #include <linux/completion.h>
 #include <linux/sched.h>
 #include <linux/random.h>
+#include <linux/proc_fs.h>
 #define CREATE_TRACE_POINTS
 #include "internal.h"
 
@@ -32,7 +33,7 @@
 MODULE_PARM_DESC(rootcell, "root AFS cell name and VL server IP addr list");
 
 struct workqueue_struct *afs_wq;
-struct afs_net __afs_net;
+static struct proc_dir_entry *afs_proc_symlink;
 
 #if defined(CONFIG_ALPHA)
 const char afs_init_sysname[] = "alpha_linux26";
@@ -67,11 +68,13 @@
 /*
  * Initialise an AFS network namespace record.
  */
-static int __net_init afs_net_init(struct afs_net *net)
+static int __net_init afs_net_init(struct net *net_ns)
 {
 	struct afs_sysnames *sysnames;
+	struct afs_net *net = afs_net(net_ns);
 	int ret;
 
+	net->net = net_ns;
 	net->live = true;
 	generate_random_uuid((unsigned char *)&net->uuid);
 
@@ -83,7 +86,7 @@
 	INIT_WORK(&net->cells_manager, afs_manage_cells);
 	timer_setup(&net->cells_timer, afs_cells_timer, 0);
 
-	spin_lock_init(&net->proc_cells_lock);
+	mutex_init(&net->proc_cells_lock);
 	INIT_LIST_HEAD(&net->proc_cells);
 
 	seqlock_init(&net->fs_lock);
@@ -142,8 +145,10 @@
 /*
  * Clean up and destroy an AFS network namespace record.
  */
-static void __net_exit afs_net_exit(struct afs_net *net)
+static void __net_exit afs_net_exit(struct net *net_ns)
 {
+	struct afs_net *net = afs_net(net_ns);
+
 	net->live = false;
 	afs_cell_purge(net);
 	afs_purge_servers(net);
@@ -152,6 +157,13 @@
 	afs_put_sysnames(net->sysnames);
 }
 
+static struct pernet_operations afs_net_ops = {
+	.init	= afs_net_init,
+	.exit	= afs_net_exit,
+	.id	= &afs_net_id,
+	.size	= sizeof(struct afs_net),
+};
+
 /*
  * initialise the AFS client FS module
  */
@@ -178,7 +190,7 @@
 		goto error_cache;
 #endif
 
-	ret = afs_net_init(&__afs_net);
+	ret = register_pernet_subsys(&afs_net_ops);
 	if (ret < 0)
 		goto error_net;
 
@@ -187,10 +199,18 @@
 	if (ret < 0)
 		goto error_fs;
 
+	afs_proc_symlink = proc_symlink("fs/afs", NULL, "../self/net/afs");
+	if (IS_ERR(afs_proc_symlink)) {
+		ret = PTR_ERR(afs_proc_symlink);
+		goto error_proc;
+	}
+
 	return ret;
 
+error_proc:
+	afs_fs_exit();
 error_fs:
-	afs_net_exit(&__afs_net);
+	unregister_pernet_subsys(&afs_net_ops);
 error_net:
 #ifdef CONFIG_AFS_FSCACHE
 	fscache_unregister_netfs(&afs_cache_netfs);
@@ -219,8 +239,9 @@
 {
 	printk(KERN_INFO "kAFS: Red Hat AFS client v0.1 unregistering.\n");
 
+	proc_remove(afs_proc_symlink);
 	afs_fs_exit();
-	afs_net_exit(&__afs_net);
+	unregister_pernet_subsys(&afs_net_ops);
 #ifdef CONFIG_AFS_FSCACHE
 	fscache_unregister_netfs(&afs_cache_netfs);
 #endif
diff --git a/fs/afs/netdevices.c b/fs/afs/netdevices.c
index 50bd5bb..2a009d1 100644
--- a/fs/afs/netdevices.c
+++ b/fs/afs/netdevices.c
@@ -17,8 +17,8 @@
  * - maxbufs must be at least 1
  * - returns the number of interface records in the buffer
  */
-int afs_get_ipv4_interfaces(struct afs_interface *bufs, size_t maxbufs,
-			    bool wantloopback)
+int afs_get_ipv4_interfaces(struct afs_net *net, struct afs_interface *bufs,
+			    size_t maxbufs, bool wantloopback)
 {
 	struct net_device *dev;
 	struct in_device *idev;
@@ -27,7 +27,7 @@
 	ASSERT(maxbufs > 0);
 
 	rtnl_lock();
-	for_each_netdev(&init_net, dev) {
+	for_each_netdev(net->net, dev) {
 		if (dev->type == ARPHRD_LOOPBACK && !wantloopback)
 			continue;
 		idev = __in_dev_get_rtnl(dev);
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 3aad327..0c3285c 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -17,197 +17,18 @@
 #include <linux/uaccess.h>
 #include "internal.h"
 
-static inline struct afs_net *afs_proc2net(struct file *f)
-{
-	return &__afs_net;
-}
-
 static inline struct afs_net *afs_seq2net(struct seq_file *m)
 {
-	return &__afs_net; // TODO: use seq_file_net(m)
+	return afs_net(seq_file_net(m));
 }
 
-static int afs_proc_cells_open(struct inode *inode, struct file *file);
-static void *afs_proc_cells_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_cells_next(struct seq_file *p, void *v, loff_t *pos);
-static void afs_proc_cells_stop(struct seq_file *p, void *v);
-static int afs_proc_cells_show(struct seq_file *m, void *v);
-static ssize_t afs_proc_cells_write(struct file *file, const char __user *buf,
-				    size_t size, loff_t *_pos);
-
-static const struct seq_operations afs_proc_cells_ops = {
-	.start	= afs_proc_cells_start,
-	.next	= afs_proc_cells_next,
-	.stop	= afs_proc_cells_stop,
-	.show	= afs_proc_cells_show,
-};
-
-static const struct file_operations afs_proc_cells_fops = {
-	.open		= afs_proc_cells_open,
-	.read		= seq_read,
-	.write		= afs_proc_cells_write,
-	.llseek		= seq_lseek,
-	.release	= seq_release,
-};
-
-static ssize_t afs_proc_rootcell_read(struct file *file, char __user *buf,
-				      size_t size, loff_t *_pos);
-static ssize_t afs_proc_rootcell_write(struct file *file,
-				       const char __user *buf,
-				       size_t size, loff_t *_pos);
-
-static const struct file_operations afs_proc_rootcell_fops = {
-	.read		= afs_proc_rootcell_read,
-	.write		= afs_proc_rootcell_write,
-	.llseek		= no_llseek,
-};
-
-static void *afs_proc_cell_volumes_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_cell_volumes_next(struct seq_file *p, void *v,
-					loff_t *pos);
-static void afs_proc_cell_volumes_stop(struct seq_file *p, void *v);
-static int afs_proc_cell_volumes_show(struct seq_file *m, void *v);
-
-static const struct seq_operations afs_proc_cell_volumes_ops = {
-	.start	= afs_proc_cell_volumes_start,
-	.next	= afs_proc_cell_volumes_next,
-	.stop	= afs_proc_cell_volumes_stop,
-	.show	= afs_proc_cell_volumes_show,
-};
-
-static void *afs_proc_cell_vlservers_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_cell_vlservers_next(struct seq_file *p, void *v,
-					  loff_t *pos);
-static void afs_proc_cell_vlservers_stop(struct seq_file *p, void *v);
-static int afs_proc_cell_vlservers_show(struct seq_file *m, void *v);
-
-static const struct seq_operations afs_proc_cell_vlservers_ops = {
-	.start	= afs_proc_cell_vlservers_start,
-	.next	= afs_proc_cell_vlservers_next,
-	.stop	= afs_proc_cell_vlservers_stop,
-	.show	= afs_proc_cell_vlservers_show,
-};
-
-static void *afs_proc_servers_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_servers_next(struct seq_file *p, void *v,
-					loff_t *pos);
-static void afs_proc_servers_stop(struct seq_file *p, void *v);
-static int afs_proc_servers_show(struct seq_file *m, void *v);
-
-static const struct seq_operations afs_proc_servers_ops = {
-	.start	= afs_proc_servers_start,
-	.next	= afs_proc_servers_next,
-	.stop	= afs_proc_servers_stop,
-	.show	= afs_proc_servers_show,
-};
-
-static int afs_proc_sysname_open(struct inode *inode, struct file *file);
-static int afs_proc_sysname_release(struct inode *inode, struct file *file);
-static void *afs_proc_sysname_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_sysname_next(struct seq_file *p, void *v,
-					loff_t *pos);
-static void afs_proc_sysname_stop(struct seq_file *p, void *v);
-static int afs_proc_sysname_show(struct seq_file *m, void *v);
-static ssize_t afs_proc_sysname_write(struct file *file,
-				      const char __user *buf,
-				      size_t size, loff_t *_pos);
-
-static const struct seq_operations afs_proc_sysname_ops = {
-	.start	= afs_proc_sysname_start,
-	.next	= afs_proc_sysname_next,
-	.stop	= afs_proc_sysname_stop,
-	.show	= afs_proc_sysname_show,
-};
-
-static const struct file_operations afs_proc_sysname_fops = {
-	.open		= afs_proc_sysname_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= afs_proc_sysname_release,
-	.write		= afs_proc_sysname_write,
-};
-
-static int afs_proc_stats_show(struct seq_file *m, void *v);
-
-/*
- * initialise the /proc/fs/afs/ directory
- */
-int afs_proc_init(struct afs_net *net)
+static inline struct afs_net *afs_seq2net_single(struct seq_file *m)
 {
-	_enter("");
-
-	net->proc_afs = proc_mkdir("fs/afs", NULL);
-	if (!net->proc_afs)
-		goto error_dir;
-
-	if (!proc_create("cells", 0644, net->proc_afs, &afs_proc_cells_fops) ||
-	    !proc_create("rootcell", 0644, net->proc_afs, &afs_proc_rootcell_fops) ||
-	    !proc_create_seq("servers", 0644, net->proc_afs, &afs_proc_servers_ops) ||
-	    !proc_create_single("stats", 0644, net->proc_afs, afs_proc_stats_show) ||
-	    !proc_create("sysname", 0644, net->proc_afs, &afs_proc_sysname_fops))
-		goto error_tree;
-
-	_leave(" = 0");
-	return 0;
-
-error_tree:
-	proc_remove(net->proc_afs);
-error_dir:
-	_leave(" = -ENOMEM");
-	return -ENOMEM;
+	return afs_net(seq_file_single_net(m));
 }
 
 /*
- * clean up the /proc/fs/afs/ directory
- */
-void afs_proc_cleanup(struct afs_net *net)
-{
-	proc_remove(net->proc_afs);
-	net->proc_afs = NULL;
-}
-
-/*
- * open "/proc/fs/afs/cells" which provides a summary of extant cells
- */
-static int afs_proc_cells_open(struct inode *inode, struct file *file)
-{
-	return seq_open(file, &afs_proc_cells_ops);
-}
-
-/*
- * set up the iterator to start reading from the cells list and return the
- * first item
- */
-static void *afs_proc_cells_start(struct seq_file *m, loff_t *_pos)
-	__acquires(rcu)
-{
-	struct afs_net *net = afs_seq2net(m);
-
-	rcu_read_lock();
-	return seq_list_start_head(&net->proc_cells, *_pos);
-}
-
-/*
- * move to next cell in cells list
- */
-static void *afs_proc_cells_next(struct seq_file *m, void *v, loff_t *pos)
-{
-	struct afs_net *net = afs_seq2net(m);
-
-	return seq_list_next(v, &net->proc_cells, pos);
-}
-
-/*
- * clean up after reading from the cells list
- */
-static void afs_proc_cells_stop(struct seq_file *m, void *v)
-	__releases(rcu)
-{
-	rcu_read_unlock();
-}
-
-/*
- * display a header line followed by a load of cell lines
+ * Display the list of cells known to the namespace.
  */
 static int afs_proc_cells_show(struct seq_file *m, void *v)
 {
@@ -225,32 +46,49 @@
 	return 0;
 }
 
+static void *afs_proc_cells_start(struct seq_file *m, loff_t *_pos)
+	__acquires(rcu)
+{
+	rcu_read_lock();
+	return seq_list_start_head(&afs_seq2net(m)->proc_cells, *_pos);
+}
+
+static void *afs_proc_cells_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	return seq_list_next(v, &afs_seq2net(m)->proc_cells, pos);
+}
+
+static void afs_proc_cells_stop(struct seq_file *m, void *v)
+	__releases(rcu)
+{
+	rcu_read_unlock();
+}
+
+static const struct seq_operations afs_proc_cells_ops = {
+	.start	= afs_proc_cells_start,
+	.next	= afs_proc_cells_next,
+	.stop	= afs_proc_cells_stop,
+	.show	= afs_proc_cells_show,
+};
+
 /*
  * handle writes to /proc/fs/afs/cells
  * - to add cells: echo "add <cellname> <IP>[:<IP>][:<IP>]"
  */
-static ssize_t afs_proc_cells_write(struct file *file, const char __user *buf,
-				    size_t size, loff_t *_pos)
+static int afs_proc_cells_write(struct file *file, char *buf, size_t size)
 {
-	struct afs_net *net = afs_proc2net(file);
-	char *kbuf, *name, *args;
+	struct seq_file *m = file->private_data;
+	struct afs_net *net = afs_seq2net(m);
+	char *name, *args;
 	int ret;
 
-	/* start by dragging the command into memory */
-	if (size <= 1 || size >= PAGE_SIZE)
-		return -EINVAL;
-
-	kbuf = memdup_user_nul(buf, size);
-	if (IS_ERR(kbuf))
-		return PTR_ERR(kbuf);
-
 	/* trim to first NL */
-	name = memchr(kbuf, '\n', size);
+	name = memchr(buf, '\n', size);
 	if (name)
 		*name = 0;
 
 	/* split into command, name and argslist */
-	name = strchr(kbuf, ' ');
+	name = strchr(buf, ' ');
 	if (!name)
 		goto inval;
 	do {
@@ -269,9 +107,9 @@
 		goto inval;
 
 	/* determine command to perform */
-	_debug("cmd=%s name=%s args=%s", kbuf, name, args);
+	_debug("cmd=%s name=%s args=%s", buf, name, args);
 
-	if (strcmp(kbuf, "add") == 0) {
+	if (strcmp(buf, "add") == 0) {
 		struct afs_cell *cell;
 
 		cell = afs_lookup_cell(net, name, strlen(name), args, true);
@@ -287,10 +125,9 @@
 		goto inval;
 	}
 
-	ret = size;
+	ret = 0;
 
 done:
-	kfree(kbuf);
 	_leave(" = %d", ret);
 	return ret;
 
@@ -300,169 +137,59 @@
 	goto done;
 }
 
-static ssize_t afs_proc_rootcell_read(struct file *file, char __user *buf,
-				      size_t size, loff_t *_pos)
+/*
+ * Display the name of the current workstation cell.
+ */
+static int afs_proc_rootcell_show(struct seq_file *m, void *v)
 {
 	struct afs_cell *cell;
-	struct afs_net *net = afs_proc2net(file);
-	unsigned int seq = 0;
-	char name[AFS_MAXCELLNAME + 1];
-	int len;
+	struct afs_net *net;
 
-	if (*_pos > 0)
-		return 0;
-	if (!net->ws_cell)
-		return 0;
-
-	rcu_read_lock();
-	do {
-		read_seqbegin_or_lock(&net->cells_lock, &seq);
-		len = 0;
-		cell = rcu_dereference_raw(net->ws_cell);
-		if (cell) {
-			len = cell->name_len;
-			memcpy(name, cell->name, len);
-		}
-	} while (need_seqretry(&net->cells_lock, seq));
-	done_seqretry(&net->cells_lock, seq);
-	rcu_read_unlock();
-
-	if (!len)
-		return 0;
-
-	name[len++] = '\n';
-	if (len > size)
-		len = size;
-	if (copy_to_user(buf, name, len) != 0)
-		return -EFAULT;
-	*_pos = 1;
-	return len;
+	net = afs_seq2net_single(m);
+	if (rcu_access_pointer(net->ws_cell)) {
+		rcu_read_lock();
+		cell = rcu_dereference(net->ws_cell);
+		if (cell)
+			seq_printf(m, "%s\n", cell->name);
+		rcu_read_unlock();
+	}
+	return 0;
 }
 
 /*
- * handle writes to /proc/fs/afs/rootcell
- * - to initialize rootcell: echo "cell.name:192.168.231.14"
+ * Set the current workstation cell and optionally supply its list of volume
+ * location servers.
+ *
+ *	echo "cell.name:192.168.231.14" >/proc/fs/afs/rootcell
  */
-static ssize_t afs_proc_rootcell_write(struct file *file,
-				       const char __user *buf,
-				       size_t size, loff_t *_pos)
+static int afs_proc_rootcell_write(struct file *file, char *buf, size_t size)
 {
-	struct afs_net *net = afs_proc2net(file);
-	char *kbuf, *s;
+	struct seq_file *m = file->private_data;
+	struct afs_net *net = afs_seq2net_single(m);
+	char *s;
 	int ret;
 
-	/* start by dragging the command into memory */
-	if (size <= 1 || size >= PAGE_SIZE)
-		return -EINVAL;
-
-	kbuf = memdup_user_nul(buf, size);
-	if (IS_ERR(kbuf))
-		return PTR_ERR(kbuf);
-
 	ret = -EINVAL;
-	if (kbuf[0] == '.')
+	if (buf[0] == '.')
 		goto out;
-	if (memchr(kbuf, '/', size))
+	if (memchr(buf, '/', size))
 		goto out;
 
 	/* trim to first NL */
-	s = memchr(kbuf, '\n', size);
+	s = memchr(buf, '\n', size);
 	if (s)
 		*s = 0;
 
 	/* determine command to perform */
-	_debug("rootcell=%s", kbuf);
+	_debug("rootcell=%s", buf);
 
-	ret = afs_cell_init(net, kbuf);
-	if (ret >= 0)
-		ret = size;	/* consume everything, always */
+	ret = afs_cell_init(net, buf);
 
 out:
-	kfree(kbuf);
 	_leave(" = %d", ret);
 	return ret;
 }
 
-/*
- * initialise /proc/fs/afs/<cell>/
- */
-int afs_proc_cell_setup(struct afs_net *net, struct afs_cell *cell)
-{
-	struct proc_dir_entry *dir;
-
-	_enter("%p{%s},%p", cell, cell->name, net->proc_afs);
-
-	dir = proc_mkdir(cell->name, net->proc_afs);
-	if (!dir)
-		goto error_dir;
-
-	if (!proc_create_seq_data("vlservers", 0, dir,
-			&afs_proc_cell_vlservers_ops, cell))
-		goto error_tree;
-	if (!proc_create_seq_data("volumes", 0, dir, &afs_proc_cell_volumes_ops,
-			cell))
-		goto error_tree;
-
-	_leave(" = 0");
-	return 0;
-
-error_tree:
-	remove_proc_subtree(cell->name, net->proc_afs);
-error_dir:
-	_leave(" = -ENOMEM");
-	return -ENOMEM;
-}
-
-/*
- * remove /proc/fs/afs/<cell>/
- */
-void afs_proc_cell_remove(struct afs_net *net, struct afs_cell *cell)
-{
-	_enter("");
-
-	remove_proc_subtree(cell->name, net->proc_afs);
-
-	_leave("");
-}
-
-/*
- * set up the iterator to start reading from the cells list and return the
- * first item
- */
-static void *afs_proc_cell_volumes_start(struct seq_file *m, loff_t *_pos)
-	__acquires(cell->proc_lock)
-{
-	struct afs_cell *cell = PDE_DATA(file_inode(m->file));
-
-	_enter("cell=%p pos=%Ld", cell, *_pos);
-
-	read_lock(&cell->proc_lock);
-	return seq_list_start_head(&cell->proc_volumes, *_pos);
-}
-
-/*
- * move to next cell in cells list
- */
-static void *afs_proc_cell_volumes_next(struct seq_file *p, void *v,
-					loff_t *_pos)
-{
-	struct afs_cell *cell = PDE_DATA(file_inode(p->file));
-
-	_enter("cell=%p pos=%Ld", cell, *_pos);
-	return seq_list_next(v, &cell->proc_volumes, _pos);
-}
-
-/*
- * clean up after reading from the cells list
- */
-static void afs_proc_cell_volumes_stop(struct seq_file *p, void *v)
-	__releases(cell->proc_lock)
-{
-	struct afs_cell *cell = PDE_DATA(file_inode(p->file));
-
-	read_unlock(&cell->proc_lock);
-}
-
 static const char afs_vol_types[3][3] = {
 	[AFSVL_RWVOL]	= "RW",
 	[AFSVL_ROVOL]	= "RO",
@@ -470,7 +197,7 @@
 };
 
 /*
- * display a header line followed by a load of volume lines
+ * Display the list of volumes known to a cell.
  */
 static int afs_proc_cell_volumes_show(struct seq_file *m, void *v)
 {
@@ -490,10 +217,56 @@
 	return 0;
 }
 
+static void *afs_proc_cell_volumes_start(struct seq_file *m, loff_t *_pos)
+	__acquires(cell->proc_lock)
+{
+	struct afs_cell *cell = PDE_DATA(file_inode(m->file));
+
+	read_lock(&cell->proc_lock);
+	return seq_list_start_head(&cell->proc_volumes, *_pos);
+}
+
+static void *afs_proc_cell_volumes_next(struct seq_file *m, void *v,
+					loff_t *_pos)
+{
+	struct afs_cell *cell = PDE_DATA(file_inode(m->file));
+
+	return seq_list_next(v, &cell->proc_volumes, _pos);
+}
+
+static void afs_proc_cell_volumes_stop(struct seq_file *m, void *v)
+	__releases(cell->proc_lock)
+{
+	struct afs_cell *cell = PDE_DATA(file_inode(m->file));
+
+	read_unlock(&cell->proc_lock);
+}
+
+static const struct seq_operations afs_proc_cell_volumes_ops = {
+	.start	= afs_proc_cell_volumes_start,
+	.next	= afs_proc_cell_volumes_next,
+	.stop	= afs_proc_cell_volumes_stop,
+	.show	= afs_proc_cell_volumes_show,
+};
+
 /*
- * set up the iterator to start reading from the cells list and return the
- * first item
+ * Display the list of Volume Location servers we're using for a cell.
  */
+static int afs_proc_cell_vlservers_show(struct seq_file *m, void *v)
+{
+	struct sockaddr_rxrpc *addr = v;
+
+	/* display header on line 1 */
+	if (v == (void *)1) {
+		seq_puts(m, "ADDRESS\n");
+		return 0;
+	}
+
+	/* display one cell per line on subsequent lines */
+	seq_printf(m, "%pISp\n", &addr->transport);
+	return 0;
+}
+
 static void *afs_proc_cell_vlservers_start(struct seq_file *m, loff_t *_pos)
 	__acquires(rcu)
 {
@@ -516,14 +289,11 @@
 	return alist->addrs + pos;
 }
 
-/*
- * move to next cell in cells list
- */
-static void *afs_proc_cell_vlservers_next(struct seq_file *p, void *v,
+static void *afs_proc_cell_vlservers_next(struct seq_file *m, void *v,
 					  loff_t *_pos)
 {
 	struct afs_addr_list *alist;
-	struct afs_cell *cell = PDE_DATA(file_inode(p->file));
+	struct afs_cell *cell = PDE_DATA(file_inode(m->file));
 	loff_t pos;
 
 	alist = rcu_dereference(cell->vl_addrs);
@@ -536,72 +306,27 @@
 	return alist->addrs + pos;
 }
 
-/*
- * clean up after reading from the cells list
- */
-static void afs_proc_cell_vlservers_stop(struct seq_file *p, void *v)
+static void afs_proc_cell_vlservers_stop(struct seq_file *m, void *v)
 	__releases(rcu)
 {
 	rcu_read_unlock();
 }
 
-/*
- * display a header line followed by a load of volume lines
- */
-static int afs_proc_cell_vlservers_show(struct seq_file *m, void *v)
-{
-	struct sockaddr_rxrpc *addr = v;
-
-	/* display header on line 1 */
-	if (v == (void *)1) {
-		seq_puts(m, "ADDRESS\n");
-		return 0;
-	}
-
-	/* display one cell per line on subsequent lines */
-	seq_printf(m, "%pISp\n", &addr->transport);
-	return 0;
-}
+static const struct seq_operations afs_proc_cell_vlservers_ops = {
+	.start	= afs_proc_cell_vlservers_start,
+	.next	= afs_proc_cell_vlservers_next,
+	.stop	= afs_proc_cell_vlservers_stop,
+	.show	= afs_proc_cell_vlservers_show,
+};
 
 /*
- * Set up the iterator to start reading from the server list and return the
- * first item.
- */
-static void *afs_proc_servers_start(struct seq_file *m, loff_t *_pos)
-	__acquires(rcu)
-{
-	struct afs_net *net = afs_seq2net(m);
-
-	rcu_read_lock();
-	return seq_hlist_start_head_rcu(&net->fs_proc, *_pos);
-}
-
-/*
- * move to next cell in cells list
- */
-static void *afs_proc_servers_next(struct seq_file *m, void *v, loff_t *_pos)
-{
-	struct afs_net *net = afs_seq2net(m);
-
-	return seq_hlist_next_rcu(v, &net->fs_proc, _pos);
-}
-
-/*
- * clean up after reading from the cells list
- */
-static void afs_proc_servers_stop(struct seq_file *p, void *v)
-	__releases(rcu)
-{
-	rcu_read_unlock();
-}
-
-/*
- * display a header line followed by a load of volume lines
+ * Display the list of fileservers we're using within a namespace.
  */
 static int afs_proc_servers_show(struct seq_file *m, void *v)
 {
 	struct afs_server *server;
 	struct afs_addr_list *alist;
+	int i;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(m, "UUID                                 USE ADDR\n");
@@ -610,87 +335,116 @@
 
 	server = list_entry(v, struct afs_server, proc_link);
 	alist = rcu_dereference(server->addresses);
-	seq_printf(m, "%pU %3d %pISp\n",
+	seq_printf(m, "%pU %3d %pISpc%s\n",
 		   &server->uuid,
 		   atomic_read(&server->usage),
-		   &alist->addrs[alist->index].transport);
+		   &alist->addrs[0].transport,
+		   alist->index == 0 ? "*" : "");
+	for (i = 1; i < alist->nr_addrs; i++)
+		seq_printf(m, "                                         %pISpc%s\n",
+			   &alist->addrs[i].transport,
+			   alist->index == i ? "*" : "");
 	return 0;
 }
 
-void afs_put_sysnames(struct afs_sysnames *sysnames)
+static void *afs_proc_servers_start(struct seq_file *m, loff_t *_pos)
+	__acquires(rcu)
 {
-	int i;
-
-	if (sysnames && refcount_dec_and_test(&sysnames->usage)) {
-		for (i = 0; i < sysnames->nr; i++)
-			if (sysnames->subs[i] != afs_init_sysname &&
-			    sysnames->subs[i] != sysnames->blank)
-				kfree(sysnames->subs[i]);
-	}
+	rcu_read_lock();
+	return seq_hlist_start_head_rcu(&afs_seq2net(m)->fs_proc, *_pos);
 }
 
-/*
- * Handle opening of /proc/fs/afs/sysname.  If it is opened for writing, we
- * assume the caller wants to change the substitution list and we allocate a
- * buffer to hold the list.
- */
-static int afs_proc_sysname_open(struct inode *inode, struct file *file)
+static void *afs_proc_servers_next(struct seq_file *m, void *v, loff_t *_pos)
 {
-	struct afs_sysnames *sysnames;
-	struct seq_file *m;
-	int ret;
+	return seq_hlist_next_rcu(v, &afs_seq2net(m)->fs_proc, _pos);
+}
 
-	ret = seq_open(file, &afs_proc_sysname_ops);
-	if (ret < 0)
-		return ret;
+static void afs_proc_servers_stop(struct seq_file *m, void *v)
+	__releases(rcu)
+{
+	rcu_read_unlock();
+}
 
-	if (file->f_mode & FMODE_WRITE) {
-		sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);
-		if (!sysnames) {
-			seq_release(inode, file);
-			return -ENOMEM;
-		}
+static const struct seq_operations afs_proc_servers_ops = {
+	.start	= afs_proc_servers_start,
+	.next	= afs_proc_servers_next,
+	.stop	= afs_proc_servers_stop,
+	.show	= afs_proc_servers_show,
+};
 
-		refcount_set(&sysnames->usage, 1);
-		m = file->private_data;
-		m->private = sysnames;
-	}
+/*
+ * Display the list of strings that may be substituted for the @sys pathname
+ * macro.
+ */
+static int afs_proc_sysname_show(struct seq_file *m, void *v)
+{
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *sysnames = net->sysnames;
+	unsigned int i = (unsigned long)v - 1;
 
+	if (i < sysnames->nr)
+		seq_printf(m, "%s\n", sysnames->subs[i]);
 	return 0;
 }
 
-/*
- * Handle writes to /proc/fs/afs/sysname to set the @sys substitution.
- */
-static ssize_t afs_proc_sysname_write(struct file *file,
-				      const char __user *buf,
-				      size_t size, loff_t *_pos)
+static void *afs_proc_sysname_start(struct seq_file *m, loff_t *pos)
+	__acquires(&net->sysnames_lock)
 {
-	struct afs_sysnames *sysnames;
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *names;
+
+	read_lock(&net->sysnames_lock);
+
+	names = net->sysnames;
+	if (*pos >= names->nr)
+		return NULL;
+	return (void *)(unsigned long)(*pos + 1);
+}
+
+static void *afs_proc_sysname_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct afs_net *net = afs_seq2net(m);
+	struct afs_sysnames *names = net->sysnames;
+
+	*pos += 1;
+	if (*pos >= names->nr)
+		return NULL;
+	return (void *)(unsigned long)(*pos + 1);
+}
+
+static void afs_proc_sysname_stop(struct seq_file *m, void *v)
+	__releases(&net->sysnames_lock)
+{
+	struct afs_net *net = afs_seq2net(m);
+
+	read_unlock(&net->sysnames_lock);
+}
+
+static const struct seq_operations afs_proc_sysname_ops = {
+	.start	= afs_proc_sysname_start,
+	.next	= afs_proc_sysname_next,
+	.stop	= afs_proc_sysname_stop,
+	.show	= afs_proc_sysname_show,
+};
+
+/*
+ * Allow the @sys substitution to be configured.
+ */
+static int afs_proc_sysname_write(struct file *file, char *buf, size_t size)
+{
+	struct afs_sysnames *sysnames, *kill;
 	struct seq_file *m = file->private_data;
-	char *kbuf = NULL, *s, *p, *sub;
+	struct afs_net *net = afs_seq2net(m);
+	char *s, *p, *sub;
 	int ret, len;
 
-	sysnames = m->private;
+	sysnames = kzalloc(sizeof(*sysnames), GFP_KERNEL);
 	if (!sysnames)
-		return -EINVAL;
-	if (sysnames->error)
-		return sysnames->error;
+		return -ENOMEM;
+	refcount_set(&sysnames->usage, 1);
+	kill = sysnames;
 
-	if (size >= PAGE_SIZE - 1) {
-		sysnames->error = -EINVAL;
-		return -EINVAL;
-	}
-	if (size == 0)
-		return 0;
-
-	kbuf = memdup_user_nul(buf, size);
-	if (IS_ERR(kbuf))
-		return PTR_ERR(kbuf);
-
-	inode_lock(file_inode(file));
-
-	p = kbuf;
+	p = buf;
 	while ((s = strsep(&p, " \t\n"))) {
 		len = strlen(s);
 		if (len == 0)
@@ -731,85 +485,36 @@
 		sysnames->nr++;
 	}
 
-	ret = size;	/* consume everything, always */
+	if (sysnames->nr == 0) {
+		sysnames->subs[0] = sysnames->blank;
+		sysnames->nr++;
+	}
+
+	write_lock(&net->sysnames_lock);
+	kill = net->sysnames;
+	net->sysnames = sysnames;
+	write_unlock(&net->sysnames_lock);
+	ret = 0;
 out:
-	inode_unlock(file_inode(file));
-	kfree(kbuf);
+	afs_put_sysnames(kill);
 	return ret;
 
 invalid:
 	ret = -EINVAL;
 error:
-	sysnames->error = ret;
 	goto out;
 }
 
-static int afs_proc_sysname_release(struct inode *inode, struct file *file)
+void afs_put_sysnames(struct afs_sysnames *sysnames)
 {
-	struct afs_sysnames *sysnames, *kill = NULL;
-	struct seq_file *m = file->private_data;
-	struct afs_net *net = afs_seq2net(m);
+	int i;
 
-	sysnames = m->private;
-	if (sysnames) {
-		if (!sysnames->error) {
-			kill = sysnames;
-			if (sysnames->nr == 0) {
-				sysnames->subs[0] = sysnames->blank;
-				sysnames->nr++;
-			}
-			write_lock(&net->sysnames_lock);
-			kill = net->sysnames;
-			net->sysnames = sysnames;
-			write_unlock(&net->sysnames_lock);
-		}
-		afs_put_sysnames(kill);
+	if (sysnames && refcount_dec_and_test(&sysnames->usage)) {
+		for (i = 0; i < sysnames->nr; i++)
+			if (sysnames->subs[i] != afs_init_sysname &&
+			    sysnames->subs[i] != sysnames->blank)
+				kfree(sysnames->subs[i]);
 	}
-
-	return seq_release(inode, file);
-}
-
-static void *afs_proc_sysname_start(struct seq_file *m, loff_t *pos)
-	__acquires(&net->sysnames_lock)
-{
-	struct afs_net *net = afs_seq2net(m);
-	struct afs_sysnames *names = net->sysnames;
-
-	read_lock(&net->sysnames_lock);
-
-	if (*pos >= names->nr)
-		return NULL;
-	return (void *)(unsigned long)(*pos + 1);
-}
-
-static void *afs_proc_sysname_next(struct seq_file *m, void *v, loff_t *pos)
-{
-	struct afs_net *net = afs_seq2net(m);
-	struct afs_sysnames *names = net->sysnames;
-
-	*pos += 1;
-	if (*pos >= names->nr)
-		return NULL;
-	return (void *)(unsigned long)(*pos + 1);
-}
-
-static void afs_proc_sysname_stop(struct seq_file *m, void *v)
-	__releases(&net->sysnames_lock)
-{
-	struct afs_net *net = afs_seq2net(m);
-
-	read_unlock(&net->sysnames_lock);
-}
-
-static int afs_proc_sysname_show(struct seq_file *m, void *v)
-{
-	struct afs_net *net = afs_seq2net(m);
-	struct afs_sysnames *sysnames = net->sysnames;
-	unsigned int i = (unsigned long)v - 1;
-
-	if (i < sysnames->nr)
-		seq_printf(m, "%s\n", sysnames->subs[i]);
-	return 0;
 }
 
 /*
@@ -817,7 +522,7 @@
  */
 static int afs_proc_stats_show(struct seq_file *m, void *v)
 {
-	struct afs_net *net = afs_seq2net(m);
+	struct afs_net *net = afs_seq2net_single(m);
 
 	seq_puts(m, "kAFS statistics\n");
 
@@ -842,3 +547,101 @@
 		   atomic_long_read(&net->n_store_bytes));
 	return 0;
 }
+
+/*
+ * initialise /proc/fs/afs/<cell>/
+ */
+int afs_proc_cell_setup(struct afs_cell *cell)
+{
+	struct proc_dir_entry *dir;
+	struct afs_net *net = cell->net;
+
+	_enter("%p{%s},%p", cell, cell->name, net->proc_afs);
+
+	dir = proc_net_mkdir(net->net, cell->name, net->proc_afs);
+	if (!dir)
+		goto error_dir;
+
+	if (!proc_create_net_data("vlservers", 0444, dir,
+				  &afs_proc_cell_vlservers_ops,
+				  sizeof(struct seq_net_private),
+				  cell) ||
+	    !proc_create_net_data("volumes", 0444, dir,
+				  &afs_proc_cell_volumes_ops,
+				  sizeof(struct seq_net_private),
+				  cell))
+		goto error_tree;
+
+	_leave(" = 0");
+	return 0;
+
+error_tree:
+	remove_proc_subtree(cell->name, net->proc_afs);
+error_dir:
+	_leave(" = -ENOMEM");
+	return -ENOMEM;
+}
+
+/*
+ * remove /proc/fs/afs/<cell>/
+ */
+void afs_proc_cell_remove(struct afs_cell *cell)
+{
+	struct afs_net *net = cell->net;
+
+	_enter("");
+	remove_proc_subtree(cell->name, net->proc_afs);
+	_leave("");
+}
+
+/*
+ * initialise the /proc/fs/afs/ directory
+ */
+int afs_proc_init(struct afs_net *net)
+{
+	struct proc_dir_entry *p;
+
+	_enter("");
+
+	p = proc_net_mkdir(net->net, "afs", net->net->proc_net);
+	if (!p)
+		goto error_dir;
+
+	if (!proc_create_net_data_write("cells", 0644, p,
+					&afs_proc_cells_ops,
+					afs_proc_cells_write,
+					sizeof(struct seq_net_private),
+					NULL) ||
+	    !proc_create_net_single_write("rootcell", 0644, p,
+					  afs_proc_rootcell_show,
+					  afs_proc_rootcell_write,
+					  NULL) ||
+	    !proc_create_net("servers", 0444, p, &afs_proc_servers_ops,
+			     sizeof(struct seq_net_private)) ||
+	    !proc_create_net_single("stats", 0444, p, afs_proc_stats_show, NULL) ||
+	    !proc_create_net_data_write("sysname", 0644, p,
+					&afs_proc_sysname_ops,
+					afs_proc_sysname_write,
+					sizeof(struct seq_net_private),
+					NULL))
+		goto error_tree;
+
+	net->proc_afs = p;
+	_leave(" = 0");
+	return 0;
+
+error_tree:
+	proc_remove(p);
+error_dir:
+	_leave(" = -ENOMEM");
+	return -ENOMEM;
+}
+
+/*
+ * clean up the /proc/fs/afs/ directory
+ */
+void afs_proc_cleanup(struct afs_net *net)
+{
+	proc_remove(net->proc_afs);
+	net->proc_afs = NULL;
+}
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 0873594..a1b1808 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -46,7 +46,7 @@
 
 	_enter("");
 
-	ret = sock_create_kern(&init_net, AF_RXRPC, SOCK_DGRAM, PF_INET6, &socket);
+	ret = sock_create_kern(net->net, AF_RXRPC, SOCK_DGRAM, PF_INET6, &socket);
 	if (ret < 0)
 		goto error_1;
 
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 3af4625..1d329e6 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -228,7 +228,7 @@
 	server->flags = (1UL << AFS_SERVER_FL_NEW);
 	server->update_at = ktime_get_real_seconds() + afs_server_update_delay;
 	rwlock_init(&server->fs_lock);
-	INIT_LIST_HEAD(&server->cb_interests);
+	INIT_HLIST_HEAD(&server->cb_volumes);
 	rwlock_init(&server->cb_break_lock);
 
 	afs_inc_servers_outstanding(net);
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 9e5d796..4d3e274 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -48,6 +48,8 @@
 };
 MODULE_ALIAS_FS("afs");
 
+int afs_net_id;
+
 static const struct super_operations afs_super_ops = {
 	.statfs		= afs_statfs,
 	.alloc_inode	= afs_alloc_inode,
@@ -117,7 +119,7 @@
 /*
  * clean up the filesystem
  */
-void __exit afs_fs_exit(void)
+void afs_fs_exit(void)
 {
 	_enter("");
 
@@ -351,14 +353,19 @@
 	struct afs_super_info *as1 = data;
 	struct afs_super_info *as = AFS_FS_S(sb);
 
-	return (as->net == as1->net &&
+	return (as->net_ns == as1->net_ns &&
 		as->volume &&
-		as->volume->vid == as1->volume->vid);
+		as->volume->vid == as1->volume->vid &&
+		!as->dyn_root);
 }
 
 static int afs_dynroot_test_super(struct super_block *sb, void *data)
 {
-	return false;
+	struct afs_super_info *as1 = data;
+	struct afs_super_info *as = AFS_FS_S(sb);
+
+	return (as->net_ns == as1->net_ns &&
+		as->dyn_root);
 }
 
 static int afs_set_super(struct super_block *sb, void *data)
@@ -418,10 +425,14 @@
 	if (!sb->s_root)
 		goto error;
 
-	if (params->dyn_root)
+	if (as->dyn_root) {
 		sb->s_d_op = &afs_dynroot_dentry_operations;
-	else
+		ret = afs_dynroot_populate(sb);
+		if (ret < 0)
+			goto error;
+	} else {
 		sb->s_d_op = &afs_fs_dentry_operations;
+	}
 
 	_leave(" = 0");
 	return 0;
@@ -437,7 +448,7 @@
 
 	as = kzalloc(sizeof(struct afs_super_info), GFP_KERNEL);
 	if (as) {
-		as->net = afs_get_net(params->net);
+		as->net_ns = get_net(params->net_ns);
 		if (params->dyn_root)
 			as->dyn_root = true;
 		else
@@ -450,12 +461,31 @@
 {
 	if (as) {
 		afs_put_volume(as->cell, as->volume);
-		afs_put_cell(as->net, as->cell);
-		afs_put_net(as->net);
+		afs_put_cell(afs_net(as->net_ns), as->cell);
+		put_net(as->net_ns);
 		kfree(as);
 	}
 }
 
+static void afs_kill_super(struct super_block *sb)
+{
+	struct afs_super_info *as = AFS_FS_S(sb);
+	struct afs_net *net = afs_net(as->net_ns);
+
+	if (as->dyn_root)
+		afs_dynroot_depopulate(sb);
+	
+	/* Clear the callback interests (which will do ilookup5) before
+	 * deactivating the superblock.
+	 */
+	if (as->volume)
+		afs_clear_callback_interests(net, as->volume->servers);
+	kill_anon_super(sb);
+	if (as->volume)
+		afs_deactivate_volume(as->volume);
+	afs_destroy_sbi(as);
+}
+
 /*
  * get an AFS superblock
  */
@@ -472,12 +502,13 @@
 	_enter(",,%s,%p", dev_name, options);
 
 	memset(&params, 0, sizeof(params));
-	params.net = &__afs_net;
 
 	ret = -EINVAL;
 	if (current->nsproxy->net_ns != &init_net)
 		goto error;
-
+	params.net_ns = current->nsproxy->net_ns;
+	params.net = afs_net(params.net_ns);
+	
 	/* parse the options and device name */
 	if (options) {
 		ret = afs_parse_options(&params, options, &dev_name);
@@ -563,21 +594,6 @@
 	return ERR_PTR(ret);
 }
 
-static void afs_kill_super(struct super_block *sb)
-{
-	struct afs_super_info *as = AFS_FS_S(sb);
-
-	/* Clear the callback interests (which will do ilookup5) before
-	 * deactivating the superblock.
-	 */
-	if (as->volume)
-		afs_clear_callback_interests(as->net, as->volume->servers);
-	kill_anon_super(sb);
-	if (as->volume)
-		afs_deactivate_volume(as->volume);
-	afs_destroy_sbi(as);
-}
-
 /*
  * Initialise an inode cache slab element prior to any use.  Note that
  * afs_alloc_inode() *must* reset anything that could incorrectly leak from one
diff --git a/fs/aio.c b/fs/aio.c
index 134e5b6..e1d2012 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1661,7 +1661,7 @@
 	if (mask && !(mask & req->events))
 		return 0;
 
-	mask = file->f_op->poll_mask(file, req->events);
+	mask = file->f_op->poll_mask(file, req->events) & req->events;
 	if (!mask)
 		return 0;
 
@@ -1719,7 +1719,7 @@
 
 	spin_lock_irq(&ctx->ctx_lock);
 	spin_lock(&req->head->lock);
-	mask = req->file->f_op->poll_mask(req->file, req->events);
+	mask = req->file->f_op->poll_mask(req->file, req->events) & req->events;
 	if (!mask) {
 		__add_wait_queue(req->head, &req->wait);
 		list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
diff --git a/fs/eventfd.c b/fs/eventfd.c
index 61c9514..ceb1031 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -156,11 +156,11 @@
 	count = READ_ONCE(ctx->count);
 
 	if (count > 0)
-		events |= EPOLLIN;
+		events |= (EPOLLIN & eventmask);
 	if (count == ULLONG_MAX)
 		events |= EPOLLERR;
 	if (ULLONG_MAX - 1 > count)
-		events |= EPOLLOUT;
+		events |= (EPOLLOUT & eventmask);
 
 	return events;
 }
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 67db22f..ea4436f 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -922,14 +922,18 @@
 	return 0;
 }
 
-static __poll_t ep_eventpoll_poll(struct file *file, poll_table *wait)
+static struct wait_queue_head *ep_eventpoll_get_poll_head(struct file *file,
+		__poll_t eventmask)
+{
+	struct eventpoll *ep = file->private_data;
+	return &ep->poll_wait;
+}
+
+static __poll_t ep_eventpoll_poll_mask(struct file *file, __poll_t eventmask)
 {
 	struct eventpoll *ep = file->private_data;
 	int depth = 0;
 
-	/* Insert inside our poll wait queue */
-	poll_wait(file, &ep->poll_wait, wait);
-
 	/*
 	 * Proceed to find out if wanted events are really available inside
 	 * the ready list.
@@ -968,7 +972,8 @@
 	.show_fdinfo	= ep_show_fdinfo,
 #endif
 	.release	= ep_eventpoll_release,
-	.poll		= ep_eventpoll_poll,
+	.get_poll_head	= ep_eventpoll_get_poll_head,
+	.poll_mask	= ep_eventpoll_poll_mask,
 	.llseek		= noop_llseek,
 };
 
diff --git a/fs/namei.c b/fs/namei.c
index 2490ddb..734cef5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2464,6 +2464,35 @@
 }
 
 /**
+ * try_lookup_one_len - filesystem helper to lookup single pathname component
+ * @name:	pathname component to lookup
+ * @base:	base directory to lookup from
+ * @len:	maximum length @len should be interpreted to
+ *
+ * Look up a dentry by name in the dcache, returning NULL if it does not
+ * currently exist.  The function does not try to create a dentry.
+ *
+ * Note that this routine is purely a helper for filesystem usage and should
+ * not be called by generic code.
+ *
+ * The caller must hold base->i_mutex.
+ */
+struct dentry *try_lookup_one_len(const char *name, struct dentry *base, int len)
+{
+	struct qstr this;
+	int err;
+
+	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
+
+	err = lookup_one_len_common(name, base, len, &this);
+	if (err)
+		return ERR_PTR(err);
+
+	return lookup_dcache(&this, base, 0);
+}
+EXPORT_SYMBOL(try_lookup_one_len);
+
+/**
  * lookup_one_len - filesystem helper to lookup single pathname component
  * @name:	pathname component to lookup
  * @base:	base directory to lookup from
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index 63a1ca4..e2bea2a 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -79,12 +79,11 @@
  */
 static int dnotify_handle_event(struct fsnotify_group *group,
 				struct inode *inode,
-				struct fsnotify_mark *inode_mark,
-				struct fsnotify_mark *vfsmount_mark,
 				u32 mask, const void *data, int data_type,
 				const unsigned char *file_name, u32 cookie,
 				struct fsnotify_iter_info *iter_info)
 {
+	struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
 	struct dnotify_mark *dn_mark;
 	struct dnotify_struct *dn;
 	struct dnotify_struct **prev;
@@ -95,7 +94,8 @@
 	if (!S_ISDIR(inode->i_mode))
 		return 0;
 
-	BUG_ON(vfsmount_mark);
+	if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
+		return 0;
 
 	dn_mark = container_of(inode_mark, struct dnotify_mark, fsn_mark);
 
@@ -319,7 +319,7 @@
 		dn_mark = container_of(fsn_mark, struct dnotify_mark, fsn_mark);
 		spin_lock(&fsn_mark->lock);
 	} else {
-		error = fsnotify_add_mark_locked(new_fsn_mark, inode, NULL, 0);
+		error = fsnotify_add_inode_mark_locked(new_fsn_mark, inode, 0);
 		if (error) {
 			mutex_unlock(&dnotify_group->mark_mutex);
 			goto out_err;
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index d94e803..f90842e 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -87,17 +87,17 @@
 	return ret;
 }
 
-static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark,
-				       struct fsnotify_mark *vfsmnt_mark,
-				       u32 event_mask,
-				       const void *data, int data_type)
+static bool fanotify_should_send_event(struct fsnotify_iter_info *iter_info,
+				       u32 event_mask, const void *data,
+				       int data_type)
 {
 	__u32 marks_mask = 0, marks_ignored_mask = 0;
 	const struct path *path = data;
+	struct fsnotify_mark *mark;
+	int type;
 
-	pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mask=%x data=%p"
-		 " data_type=%d\n", __func__, inode_mark, vfsmnt_mark,
-		 event_mask, data, data_type);
+	pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
+		 __func__, iter_info->report_mask, event_mask, data, data_type);
 
 	/* if we don't have enough info to send an event to userspace say no */
 	if (data_type != FSNOTIFY_EVENT_PATH)
@@ -108,20 +108,21 @@
 	    !d_can_lookup(path->dentry))
 		return false;
 
-	/*
-	 * if the event is for a child and this inode doesn't care about
-	 * events on the child, don't send it!
-	 */
-	if (inode_mark &&
-	    (!(event_mask & FS_EVENT_ON_CHILD) ||
-	     (inode_mark->mask & FS_EVENT_ON_CHILD))) {
-		marks_mask |= inode_mark->mask;
-		marks_ignored_mask |= inode_mark->ignored_mask;
-	}
+	fsnotify_foreach_obj_type(type) {
+		if (!fsnotify_iter_should_report_type(iter_info, type))
+			continue;
+		mark = iter_info->marks[type];
+		/*
+		 * if the event is for a child and this inode doesn't care about
+		 * events on the child, don't send it!
+		 */
+		if (type == FSNOTIFY_OBJ_TYPE_INODE &&
+		    (event_mask & FS_EVENT_ON_CHILD) &&
+		    !(mark->mask & FS_EVENT_ON_CHILD))
+			continue;
 
-	if (vfsmnt_mark) {
-		marks_mask |= vfsmnt_mark->mask;
-		marks_ignored_mask |= vfsmnt_mark->ignored_mask;
+		marks_mask |= mark->mask;
+		marks_ignored_mask |= mark->ignored_mask;
 	}
 
 	if (d_is_dir(path->dentry) &&
@@ -178,8 +179,6 @@
 
 static int fanotify_handle_event(struct fsnotify_group *group,
 				 struct inode *inode,
-				 struct fsnotify_mark *inode_mark,
-				 struct fsnotify_mark *fanotify_mark,
 				 u32 mask, const void *data, int data_type,
 				 const unsigned char *file_name, u32 cookie,
 				 struct fsnotify_iter_info *iter_info)
@@ -199,8 +198,7 @@
 	BUILD_BUG_ON(FAN_ACCESS_PERM != FS_ACCESS_PERM);
 	BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
 
-	if (!fanotify_should_send_event(inode_mark, fanotify_mark, mask, data,
-					data_type))
+	if (!fanotify_should_send_event(iter_info, mask, data, data_type))
 		return 0;
 
 	pr_debug("%s: group=%p inode=%p mask=%x\n", __func__, group, inode,
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index d478629..10aac19 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -77,7 +77,7 @@
 	struct inotify_inode_mark *inode_mark;
 	struct inode *inode;
 
-	if (!(mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE))
+	if (mark->connector->type != FSNOTIFY_OBJ_TYPE_INODE)
 		return;
 
 	inode_mark = container_of(mark, struct inotify_inode_mark, fsn_mark);
@@ -116,7 +116,7 @@
 	if (mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
 		mflags |= FAN_MARK_IGNORED_SURV_MODIFY;
 
-	if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+	if (mark->connector->type == FSNOTIFY_OBJ_TYPE_INODE) {
 		inode = igrab(mark->connector->inode);
 		if (!inode)
 			return;
@@ -126,7 +126,7 @@
 		show_mark_fhandle(m, inode);
 		seq_putc(m, '\n');
 		iput(inode);
-	} else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
+	} else if (mark->connector->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
 		struct mount *mnt = real_mount(mark->connector->mnt);
 
 		seq_printf(m, "fanotify mnt_id:%x mflags:%x mask:%x ignored_mask:%x\n",
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 613ec7e..f174397 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -184,8 +184,6 @@
 EXPORT_SYMBOL_GPL(__fsnotify_parent);
 
 static int send_to_group(struct inode *to_tell,
-			 struct fsnotify_mark *inode_mark,
-			 struct fsnotify_mark *vfsmount_mark,
 			 __u32 mask, const void *data,
 			 int data_is, u32 cookie,
 			 const unsigned char *file_name,
@@ -195,48 +193,45 @@
 	__u32 test_mask = (mask & ~FS_EVENT_ON_CHILD);
 	__u32 marks_mask = 0;
 	__u32 marks_ignored_mask = 0;
+	struct fsnotify_mark *mark;
+	int type;
 
-	if (unlikely(!inode_mark && !vfsmount_mark)) {
-		BUG();
+	if (WARN_ON(!iter_info->report_mask))
 		return 0;
-	}
 
 	/* clear ignored on inode modification */
 	if (mask & FS_MODIFY) {
-		if (inode_mark &&
-		    !(inode_mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY))
-			inode_mark->ignored_mask = 0;
-		if (vfsmount_mark &&
-		    !(vfsmount_mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY))
-			vfsmount_mark->ignored_mask = 0;
+		fsnotify_foreach_obj_type(type) {
+			if (!fsnotify_iter_should_report_type(iter_info, type))
+				continue;
+			mark = iter_info->marks[type];
+			if (mark &&
+			    !(mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY))
+				mark->ignored_mask = 0;
+		}
 	}
 
-	/* does the inode mark tell us to do something? */
-	if (inode_mark) {
-		group = inode_mark->group;
-		marks_mask |= inode_mark->mask;
-		marks_ignored_mask |= inode_mark->ignored_mask;
+	fsnotify_foreach_obj_type(type) {
+		if (!fsnotify_iter_should_report_type(iter_info, type))
+			continue;
+		mark = iter_info->marks[type];
+		/* does the object mark tell us to do something? */
+		if (mark) {
+			group = mark->group;
+			marks_mask |= mark->mask;
+			marks_ignored_mask |= mark->ignored_mask;
+		}
 	}
 
-	/* does the vfsmount_mark tell us to do something? */
-	if (vfsmount_mark) {
-		group = vfsmount_mark->group;
-		marks_mask |= vfsmount_mark->mask;
-		marks_ignored_mask |= vfsmount_mark->ignored_mask;
-	}
-
-	pr_debug("%s: group=%p to_tell=%p mask=%x inode_mark=%p"
-		 " vfsmount_mark=%p marks_mask=%x marks_ignored_mask=%x"
+	pr_debug("%s: group=%p to_tell=%p mask=%x marks_mask=%x marks_ignored_mask=%x"
 		 " data=%p data_is=%d cookie=%d\n",
-		 __func__, group, to_tell, mask, inode_mark, vfsmount_mark,
-		 marks_mask, marks_ignored_mask, data,
-		 data_is, cookie);
+		 __func__, group, to_tell, mask, marks_mask, marks_ignored_mask,
+		 data, data_is, cookie);
 
 	if (!(test_mask & marks_mask & ~marks_ignored_mask))
 		return 0;
 
-	return group->ops->handle_event(group, to_tell, inode_mark,
-					vfsmount_mark, mask, data, data_is,
+	return group->ops->handle_event(group, to_tell, mask, data, data_is,
 					file_name, cookie, iter_info);
 }
 
@@ -264,6 +259,57 @@
 }
 
 /*
+ * iter_info is a multi head priority queue of marks.
+ * Pick a subset of marks from queue heads, all with the
+ * same group and set the report_mask for selected subset.
+ * Returns the report_mask of the selected subset.
+ */
+static unsigned int fsnotify_iter_select_report_types(
+		struct fsnotify_iter_info *iter_info)
+{
+	struct fsnotify_group *max_prio_group = NULL;
+	struct fsnotify_mark *mark;
+	int type;
+
+	/* Choose max prio group among groups of all queue heads */
+	fsnotify_foreach_obj_type(type) {
+		mark = iter_info->marks[type];
+		if (mark &&
+		    fsnotify_compare_groups(max_prio_group, mark->group) > 0)
+			max_prio_group = mark->group;
+	}
+
+	if (!max_prio_group)
+		return 0;
+
+	/* Set the report mask for marks from same group as max prio group */
+	iter_info->report_mask = 0;
+	fsnotify_foreach_obj_type(type) {
+		mark = iter_info->marks[type];
+		if (mark &&
+		    fsnotify_compare_groups(max_prio_group, mark->group) == 0)
+			fsnotify_iter_set_report_type(iter_info, type);
+	}
+
+	return iter_info->report_mask;
+}
+
+/*
+ * Pop from iter_info multi head queue, the marks that were iterated in the
+ * current iteration step.
+ */
+static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
+{
+	int type;
+
+	fsnotify_foreach_obj_type(type) {
+		if (fsnotify_iter_should_report_type(iter_info, type))
+			iter_info->marks[type] =
+				fsnotify_next_mark(iter_info->marks[type]);
+	}
+}
+
+/*
  * This is the main call to fsnotify.  The VFS calls into hook specific functions
  * in linux/fsnotify.h.  Those functions then in turn call here.  Here will call
  * out to all of the registered fsnotify_group.  Those groups can then use the
@@ -307,15 +353,15 @@
 
 	if ((mask & FS_MODIFY) ||
 	    (test_mask & to_tell->i_fsnotify_mask)) {
-		iter_info.inode_mark =
+		iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
 			fsnotify_first_mark(&to_tell->i_fsnotify_marks);
 	}
 
 	if (mnt && ((mask & FS_MODIFY) ||
 		    (test_mask & mnt->mnt_fsnotify_mask))) {
-		iter_info.inode_mark =
+		iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
 			fsnotify_first_mark(&to_tell->i_fsnotify_marks);
-		iter_info.vfsmount_mark =
+		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
 			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
 	}
 
@@ -324,32 +370,14 @@
 	 * ignore masks are properly reflected for mount mark notifications.
 	 * That's why this traversal is so complicated...
 	 */
-	while (iter_info.inode_mark || iter_info.vfsmount_mark) {
-		struct fsnotify_mark *inode_mark = iter_info.inode_mark;
-		struct fsnotify_mark *vfsmount_mark = iter_info.vfsmount_mark;
-
-		if (inode_mark && vfsmount_mark) {
-			int cmp = fsnotify_compare_groups(inode_mark->group,
-							  vfsmount_mark->group);
-			if (cmp > 0)
-				inode_mark = NULL;
-			else if (cmp < 0)
-				vfsmount_mark = NULL;
-		}
-
-		ret = send_to_group(to_tell, inode_mark, vfsmount_mark, mask,
-				    data, data_is, cookie, file_name,
-				    &iter_info);
+	while (fsnotify_iter_select_report_types(&iter_info)) {
+		ret = send_to_group(to_tell, mask, data, data_is, cookie,
+				    file_name, &iter_info);
 
 		if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
 			goto out;
 
-		if (inode_mark)
-			iter_info.inode_mark =
-				fsnotify_next_mark(iter_info.inode_mark);
-		if (vfsmount_mark)
-			iter_info.vfsmount_mark =
-				fsnotify_next_mark(iter_info.vfsmount_mark);
+		fsnotify_iter_next(&iter_info);
 	}
 	ret = 0;
 out:
diff --git a/fs/notify/fsnotify.h b/fs/notify/fsnotify.h
index 60f365d..34515d2 100644
--- a/fs/notify/fsnotify.h
+++ b/fs/notify/fsnotify.h
@@ -9,12 +9,6 @@
 
 #include "../mount.h"
 
-struct fsnotify_iter_info {
-	struct fsnotify_mark *inode_mark;
-	struct fsnotify_mark *vfsmount_mark;
-	int srcu_idx;
-};
-
 /* destroy all events sitting in this groups notification queue */
 extern void fsnotify_flush_notify(struct fsnotify_group *group);
 
diff --git a/fs/notify/group.c b/fs/notify/group.c
index b7a4b6a..aa5468f2 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -67,7 +67,7 @@
 	fsnotify_group_stop_queueing(group);
 
 	/* Clear all marks for this group and queue them for destruction */
-	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_ALL_TYPES);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_ALL_TYPES_MASK);
 
 	/*
 	 * Some marks can still be pinned when waiting for response from
diff --git a/fs/notify/inotify/inotify.h b/fs/notify/inotify/inotify.h
index c00d2ca..7e4578d 100644
--- a/fs/notify/inotify/inotify.h
+++ b/fs/notify/inotify/inotify.h
@@ -25,8 +25,6 @@
 					   struct fsnotify_group *group);
 extern int inotify_handle_event(struct fsnotify_group *group,
 				struct inode *inode,
-				struct fsnotify_mark *inode_mark,
-				struct fsnotify_mark *vfsmount_mark,
 				u32 mask, const void *data, int data_type,
 				const unsigned char *file_name, u32 cookie,
 				struct fsnotify_iter_info *iter_info);
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index 40dedb3..9ab6dde 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -65,12 +65,11 @@
 
 int inotify_handle_event(struct fsnotify_group *group,
 			 struct inode *inode,
-			 struct fsnotify_mark *inode_mark,
-			 struct fsnotify_mark *vfsmount_mark,
 			 u32 mask, const void *data, int data_type,
 			 const unsigned char *file_name, u32 cookie,
 			 struct fsnotify_iter_info *iter_info)
 {
+	struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
 	struct inotify_inode_mark *i_mark;
 	struct inotify_event_info *event;
 	struct fsnotify_event *fsn_event;
@@ -78,7 +77,8 @@
 	int len = 0;
 	int alloc_len = sizeof(struct inotify_event_info);
 
-	BUG_ON(vfsmount_mark);
+	if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
+		return 0;
 
 	if ((inode_mark->mask & FS_EXCL_UNLINK) &&
 	    (data_type == FSNOTIFY_EVENT_PATH)) {
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index ef32f36..1cf5b77 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -485,10 +485,14 @@
 				    struct fsnotify_group *group)
 {
 	struct inotify_inode_mark *i_mark;
+	struct fsnotify_iter_info iter_info = { };
+
+	fsnotify_iter_set_report_type_mark(&iter_info, FSNOTIFY_OBJ_TYPE_INODE,
+					   fsn_mark);
 
 	/* Queue ignore event for the watch */
-	inotify_handle_event(group, NULL, fsn_mark, NULL, FS_IN_IGNORED,
-			     NULL, FSNOTIFY_EVENT_NONE, NULL, 0, NULL);
+	inotify_handle_event(group, NULL, FS_IN_IGNORED, NULL,
+			     FSNOTIFY_EVENT_NONE, NULL, 0, &iter_info);
 
 	i_mark = container_of(fsn_mark, struct inotify_inode_mark, fsn_mark);
 	/* remove this mark from the idr */
@@ -578,7 +582,7 @@
 	}
 
 	/* we are on the idr, now get on the inode */
-	ret = fsnotify_add_mark_locked(&tmp_i_mark->fsn_mark, inode, NULL, 0);
+	ret = fsnotify_add_inode_mark_locked(&tmp_i_mark->fsn_mark, inode, 0);
 	if (ret) {
 		/* we failed to get on the inode, get off the idr */
 		inotify_remove_from_idr(group, tmp_i_mark);
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index e9191b4..61f4c5f 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -119,9 +119,9 @@
 		if (mark->flags & FSNOTIFY_MARK_FLAG_ATTACHED)
 			new_mask |= mark->mask;
 	}
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE)
 		conn->inode->i_fsnotify_mask = new_mask;
-	else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT)
+	else if (conn->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT)
 		real_mount(conn->mnt)->mnt_fsnotify_mask = new_mask;
 }
 
@@ -139,7 +139,7 @@
 	spin_lock(&conn->lock);
 	__fsnotify_recalc_mask(conn);
 	spin_unlock(&conn->lock);
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE)
+	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE)
 		__fsnotify_update_child_dentry_flags(conn->inode);
 }
 
@@ -166,18 +166,18 @@
 {
 	struct inode *inode = NULL;
 
-	if (conn->flags & FSNOTIFY_OBJ_TYPE_INODE) {
+	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) {
 		inode = conn->inode;
 		rcu_assign_pointer(inode->i_fsnotify_marks, NULL);
 		inode->i_fsnotify_mask = 0;
 		conn->inode = NULL;
-		conn->flags &= ~FSNOTIFY_OBJ_TYPE_INODE;
-	} else if (conn->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
+		conn->type = FSNOTIFY_OBJ_TYPE_DETACHED;
+	} else if (conn->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
 		rcu_assign_pointer(real_mount(conn->mnt)->mnt_fsnotify_marks,
 				   NULL);
 		real_mount(conn->mnt)->mnt_fsnotify_mask = 0;
 		conn->mnt = NULL;
-		conn->flags &= ~FSNOTIFY_OBJ_TYPE_VFSMOUNT;
+		conn->type = FSNOTIFY_OBJ_TYPE_DETACHED;
 	}
 
 	return inode;
@@ -294,12 +294,12 @@
 
 bool fsnotify_prepare_user_wait(struct fsnotify_iter_info *iter_info)
 {
-	/* This can fail if mark is being removed */
-	if (!fsnotify_get_mark_safe(iter_info->inode_mark))
-		return false;
-	if (!fsnotify_get_mark_safe(iter_info->vfsmount_mark)) {
-		fsnotify_put_mark_wake(iter_info->inode_mark);
-		return false;
+	int type;
+
+	fsnotify_foreach_obj_type(type) {
+		/* This can fail if mark is being removed */
+		if (!fsnotify_get_mark_safe(iter_info->marks[type]))
+			goto fail;
 	}
 
 	/*
@@ -310,13 +310,20 @@
 	srcu_read_unlock(&fsnotify_mark_srcu, iter_info->srcu_idx);
 
 	return true;
+
+fail:
+	for (type--; type >= 0; type--)
+		fsnotify_put_mark_wake(iter_info->marks[type]);
+	return false;
 }
 
 void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info)
 {
+	int type;
+
 	iter_info->srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
-	fsnotify_put_mark_wake(iter_info->inode_mark);
-	fsnotify_put_mark_wake(iter_info->vfsmount_mark);
+	fsnotify_foreach_obj_type(type)
+		fsnotify_put_mark_wake(iter_info->marks[type]);
 }
 
 /*
@@ -442,10 +449,10 @@
 	spin_lock_init(&conn->lock);
 	INIT_HLIST_HEAD(&conn->list);
 	if (inode) {
-		conn->flags = FSNOTIFY_OBJ_TYPE_INODE;
+		conn->type = FSNOTIFY_OBJ_TYPE_INODE;
 		conn->inode = igrab(inode);
 	} else {
-		conn->flags = FSNOTIFY_OBJ_TYPE_VFSMOUNT;
+		conn->type = FSNOTIFY_OBJ_TYPE_VFSMOUNT;
 		conn->mnt = mnt;
 	}
 	/*
@@ -479,8 +486,7 @@
 	if (!conn)
 		goto out;
 	spin_lock(&conn->lock);
-	if (!(conn->flags & (FSNOTIFY_OBJ_TYPE_INODE |
-			     FSNOTIFY_OBJ_TYPE_VFSMOUNT))) {
+	if (conn->type == FSNOTIFY_OBJ_TYPE_DETACHED) {
 		spin_unlock(&conn->lock);
 		srcu_read_unlock(&fsnotify_mark_srcu, idx);
 		return NULL;
@@ -646,16 +652,16 @@
 	return NULL;
 }
 
-/* Clear any marks in a group with given type */
+/* Clear any marks in a group with given type mask */
 void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
-				   unsigned int type)
+				   unsigned int type_mask)
 {
 	struct fsnotify_mark *lmark, *mark;
 	LIST_HEAD(to_free);
 	struct list_head *head = &to_free;
 
 	/* Skip selection step if we want to clear all marks. */
-	if (type == FSNOTIFY_OBJ_ALL_TYPES) {
+	if (type_mask == FSNOTIFY_OBJ_ALL_TYPES_MASK) {
 		head = &group->marks_list;
 		goto clear;
 	}
@@ -670,7 +676,7 @@
 	 */
 	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
-		if (mark->connector->flags & type)
+		if ((1U << mark->connector->type) & type_mask)
 			list_move(&mark->g_list, &to_free);
 	}
 	mutex_unlock(&group->mark_mutex);
diff --git a/fs/orangefs/devorangefs-req.c b/fs/orangefs/devorangefs-req.c
index 74b37cb..33ee8cb 100644
--- a/fs/orangefs/devorangefs-req.c
+++ b/fs/orangefs/devorangefs-req.c
@@ -719,37 +719,6 @@
 	__s32 count;
 };
 
-static unsigned long translate_dev_map26(unsigned long args, long *error)
-{
-	struct ORANGEFS_dev_map_desc32 __user *p32 = (void __user *)args;
-	/*
-	 * Depending on the architecture, allocate some space on the
-	 * user-call-stack based on our expected layout.
-	 */
-	struct ORANGEFS_dev_map_desc __user *p =
-	    compat_alloc_user_space(sizeof(*p));
-	compat_uptr_t addr;
-
-	*error = 0;
-	/* get the ptr from the 32 bit user-space */
-	if (get_user(addr, &p32->ptr))
-		goto err;
-	/* try to put that into a 64-bit layout */
-	if (put_user(compat_ptr(addr), &p->ptr))
-		goto err;
-	/* copy the remaining fields */
-	if (copy_in_user(&p->total_size, &p32->total_size, sizeof(__s32)))
-		goto err;
-	if (copy_in_user(&p->size, &p32->size, sizeof(__s32)))
-		goto err;
-	if (copy_in_user(&p->count, &p32->count, sizeof(__s32)))
-		goto err;
-	return (unsigned long)p;
-err:
-	*error = -EFAULT;
-	return 0;
-}
-
 /*
  * 32 bit user-space apps' ioctl handlers when kernel modules
  * is compiled as a 64 bit one
@@ -758,25 +727,26 @@
 				      unsigned long args)
 {
 	long ret;
-	unsigned long arg = args;
 
 	/* Check for properly constructed commands */
 	ret = check_ioctl_command(cmd);
 	if (ret < 0)
 		return ret;
 	if (cmd == ORANGEFS_DEV_MAP) {
-		/*
-		 * convert the arguments to what we expect internally
-		 * in kernel space
-		 */
-		arg = translate_dev_map26(args, &ret);
-		if (ret < 0) {
-			gossip_err("Could not translate dev map\n");
-			return ret;
-		}
+		struct ORANGEFS_dev_map_desc desc;
+		struct ORANGEFS_dev_map_desc32 d32;
+
+		if (copy_from_user(&d32, (void __user *)args, sizeof(d32)))
+			return -EFAULT;
+
+		desc.ptr = compat_ptr(d32.ptr);
+		desc.total_size = d32.total_size;
+		desc.size = d32.size;
+		desc.count = d32.count;
+		return orangefs_bufmap_initialize(&desc);
 	}
 	/* no other ioctl requires translation */
-	return dispatch_ioctl_command(cmd, arg);
+	return dispatch_ioctl_command(cmd, args);
 }
 
 #endif /* CONFIG_COMPAT is in .config */
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 7b4d971..6ac1c92 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -409,7 +409,7 @@
 	if (!ent)
 		goto out;
 
-	if (qstr.len + 1 <= sizeof(ent->inline_name)) {
+	if (qstr.len + 1 <= SIZEOF_PDE_INLINE_NAME) {
 		ent->name = ent->inline_name;
 	} else {
 		ent->name = kmalloc(qstr.len + 1, GFP_KERNEL);
@@ -740,3 +740,27 @@
 	return __PDE_DATA(inode);
 }
 EXPORT_SYMBOL(PDE_DATA);
+
+/*
+ * Pull a user buffer into memory and pass it to the file's write handler if
+ * one is supplied.  The ->write() method is permitted to modify the
+ * kernel-side buffer.
+ */
+ssize_t proc_simple_write(struct file *f, const char __user *ubuf, size_t size,
+			  loff_t *_pos)
+{
+	struct proc_dir_entry *pde = PDE(file_inode(f));
+	char *buf;
+	int ret;
+
+	if (!pde->write)
+		return -EACCES;
+	if (size == 0 || size > PAGE_SIZE - 1)
+		return -EINVAL;
+	buf = memdup_user_nul(ubuf, size);
+	if (IS_ERR(buf))
+		return PTR_ERR(buf);
+	ret = pde->write(f, buf, size);
+	kfree(buf);
+	return ret == 0 ? size : ret;
+}
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 2cf3b74..85ffbd2 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -105,9 +105,8 @@
 		kmem_cache_create("pde_opener", sizeof(struct pde_opener), 0,
 				  SLAB_ACCOUNT|SLAB_PANIC, NULL);
 	proc_dir_entry_cache = kmem_cache_create_usercopy(
-		"proc_dir_entry", sizeof(struct proc_dir_entry), 0, SLAB_PANIC,
-		offsetof(struct proc_dir_entry, inline_name),
-		sizeof_field(struct proc_dir_entry, inline_name), NULL);
+		"proc_dir_entry", SIZEOF_PDE_SLOT, 0, SLAB_PANIC,
+		OFFSETOF_PDE_NAME, SIZEOF_PDE_INLINE_NAME, NULL);
 }
 
 static int proc_show_options(struct seq_file *seq, struct dentry *root)
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 50cb22a..da3dbfa 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -48,6 +48,7 @@
 		const struct seq_operations *seq_ops;
 		int (*single_show)(struct seq_file *, void *);
 	};
+	proc_write_t write;
 	void *data;
 	unsigned int state_size;
 	unsigned int low_ino;
@@ -61,14 +62,20 @@
 	char *name;
 	umode_t mode;
 	u8 namelen;
-#ifdef CONFIG_64BIT
-#define SIZEOF_PDE_INLINE_NAME	(192-155)
-#else
-#define SIZEOF_PDE_INLINE_NAME	(128-95)
-#endif
-	char inline_name[SIZEOF_PDE_INLINE_NAME];
+	char inline_name[];
 } __randomize_layout;
 
+#define OFFSETOF_PDE_NAME offsetof(struct proc_dir_entry, inline_name)
+#define SIZEOF_PDE_SLOT					\
+	(OFFSETOF_PDE_NAME + 34 <= 64 ? 64 :		\
+	 OFFSETOF_PDE_NAME + 34 <= 128 ? 128 :		\
+	 OFFSETOF_PDE_NAME + 34 <= 192 ? 192 :		\
+	 OFFSETOF_PDE_NAME + 34 <= 256 ? 256 :		\
+	 OFFSETOF_PDE_NAME + 34 <= 512 ? 512 :		\
+	 0)
+
+#define SIZEOF_PDE_INLINE_NAME (SIZEOF_PDE_SLOT - OFFSETOF_PDE_NAME)
+
 extern struct kmem_cache *proc_dir_entry_cache;
 void pde_free(struct proc_dir_entry *pde);
 
@@ -189,6 +196,7 @@
 {
 	return S_ISDIR(pde->mode) && !pde->proc_iops;
 }
+extern ssize_t proc_simple_write(struct file *, const char __user *, size_t, loff_t *);
 
 /*
  * inode.c
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 7d94fa0..d5e0fcb 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -46,6 +46,9 @@
 
 	WARN_ON_ONCE(state_size < sizeof(*p));
 
+	if (file->f_mode & FMODE_WRITE && !PDE(inode)->write)
+		return -EACCES;
+
 	net = get_proc_net(inode);
 	if (!net)
 		return -ENXIO;
@@ -73,6 +76,7 @@
 static const struct file_operations proc_net_seq_fops = {
 	.open		= seq_open_net,
 	.read		= seq_read,
+	.write		= proc_simple_write,
 	.llseek		= seq_lseek,
 	.release	= seq_release_net,
 };
@@ -93,6 +97,50 @@
 }
 EXPORT_SYMBOL_GPL(proc_create_net_data);
 
+/**
+ * proc_create_net_data_write - Create a writable net_ns-specific proc file
+ * @name: The name of the file.
+ * @mode: The file's access mode.
+ * @parent: The parent directory in which to create.
+ * @ops: The seq_file ops with which to read the file.
+ * @write: The write method which which to 'modify' the file.
+ * @data: Data for retrieval by PDE_DATA().
+ *
+ * Create a network namespaced proc file in the @parent directory with the
+ * specified @name and @mode that allows reading of a file that displays a
+ * series of elements and also provides for the file accepting writes that have
+ * some arbitrary effect.
+ *
+ * The functions in the @ops table are used to iterate over items to be
+ * presented and extract the readable content using the seq_file interface.
+ *
+ * The @write function is called with the data copied into a kernel space
+ * scratch buffer and has a NUL appended for convenience.  The buffer may be
+ * modified by the @write function.  @write should return 0 on success.
+ *
+ * The @data value is accessible from the @show and @write functions by calling
+ * PDE_DATA() on the file inode.  The network namespace must be accessed by
+ * calling seq_file_net() on the seq_file struct.
+ */
+struct proc_dir_entry *proc_create_net_data_write(const char *name, umode_t mode,
+						  struct proc_dir_entry *parent,
+						  const struct seq_operations *ops,
+						  proc_write_t write,
+						  unsigned int state_size, void *data)
+{
+	struct proc_dir_entry *p;
+
+	p = proc_create_reg(name, mode, &parent, data);
+	if (!p)
+		return NULL;
+	p->proc_fops = &proc_net_seq_fops;
+	p->seq_ops = ops;
+	p->state_size = state_size;
+	p->write = write;
+	return proc_register(parent, p);
+}
+EXPORT_SYMBOL_GPL(proc_create_net_data_write);
+
 static int single_open_net(struct inode *inode, struct file *file)
 {
 	struct proc_dir_entry *de = PDE(inode);
@@ -119,6 +167,7 @@
 static const struct file_operations proc_net_single_fops = {
 	.open		= single_open_net,
 	.read		= seq_read,
+	.write		= proc_simple_write,
 	.llseek		= seq_lseek,
 	.release	= single_release_net,
 };
@@ -138,6 +187,49 @@
 }
 EXPORT_SYMBOL_GPL(proc_create_net_single);
 
+/**
+ * proc_create_net_single_write - Create a writable net_ns-specific proc file
+ * @name: The name of the file.
+ * @mode: The file's access mode.
+ * @parent: The parent directory in which to create.
+ * @show: The seqfile show method with which to read the file.
+ * @write: The write method which which to 'modify' the file.
+ * @data: Data for retrieval by PDE_DATA().
+ *
+ * Create a network-namespaced proc file in the @parent directory with the
+ * specified @name and @mode that allows reading of a file that displays a
+ * single element rather than a series and also provides for the file accepting
+ * writes that have some arbitrary effect.
+ *
+ * The @show function is called to extract the readable content via the
+ * seq_file interface.
+ *
+ * The @write function is called with the data copied into a kernel space
+ * scratch buffer and has a NUL appended for convenience.  The buffer may be
+ * modified by the @write function.  @write should return 0 on success.
+ *
+ * The @data value is accessible from the @show and @write functions by calling
+ * PDE_DATA() on the file inode.  The network namespace must be accessed by
+ * calling seq_file_single_net() on the seq_file struct.
+ */
+struct proc_dir_entry *proc_create_net_single_write(const char *name, umode_t mode,
+						    struct proc_dir_entry *parent,
+						    int (*show)(struct seq_file *, void *),
+						    proc_write_t write,
+						    void *data)
+{
+	struct proc_dir_entry *p;
+
+	p = proc_create_reg(name, mode, &parent, data);
+	if (!p)
+		return NULL;
+	p->proc_fops = &proc_net_single_fops;
+	p->single_show = show;
+	p->write = write;
+	return proc_register(parent, p);
+}
+EXPORT_SYMBOL_GPL(proc_create_net_single_write);
+
 static struct net *get_proc_task_net(struct inode *dir)
 {
 	struct task_struct *task;
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 61b7340..f4b1a9d 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -204,8 +204,7 @@
 	.proc_fops	= &proc_root_operations,
 	.parent		= &proc_root,
 	.subdir		= RB_ROOT,
-	.name		= proc_root.inline_name,
-	.inline_name	= "/proc",
+	.name		= "/proc",
 };
 
 int pid_ns_prepare_proc(struct pid_namespace *ns)
diff --git a/fs/signalfd.c b/fs/signalfd.c
index cbb42f7..4fcd149 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -259,10 +259,8 @@
 	.llseek		= noop_llseek,
 };
 
-static int do_signalfd4(int ufd, sigset_t __user *user_mask, size_t sizemask,
-			int flags)
+static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 {
-	sigset_t sigmask;
 	struct signalfd_ctx *ctx;
 
 	/* Check the SFD_* constants for consistency.  */
@@ -272,18 +270,15 @@
 	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))
 		return -EINVAL;
 
-	if (sizemask != sizeof(sigset_t) ||
-	    copy_from_user(&sigmask, user_mask, sizeof(sigmask)))
-		return -EINVAL;
-	sigdelsetmask(&sigmask, sigmask(SIGKILL) | sigmask(SIGSTOP));
-	signotset(&sigmask);
+	sigdelsetmask(mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+	signotset(mask);
 
 	if (ufd == -1) {
 		ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
 		if (!ctx)
 			return -ENOMEM;
 
-		ctx->sigmask = sigmask;
+		ctx->sigmask = *mask;
 
 		/*
 		 * When we call this, the initialization must be complete, since
@@ -303,7 +298,7 @@
 			return -EINVAL;
 		}
 		spin_lock_irq(&current->sighand->siglock);
-		ctx->sigmask = sigmask;
+		ctx->sigmask = *mask;
 		spin_unlock_irq(&current->sighand->siglock);
 
 		wake_up(&current->sighand->signalfd_wqh);
@@ -316,46 +311,51 @@
 SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,
 		size_t, sizemask, int, flags)
 {
-	return do_signalfd4(ufd, user_mask, sizemask, flags);
+	sigset_t mask;
+
+	if (sizemask != sizeof(sigset_t) ||
+	    copy_from_user(&mask, user_mask, sizeof(mask)))
+		return -EINVAL;
+	return do_signalfd4(ufd, &mask, flags);
 }
 
 SYSCALL_DEFINE3(signalfd, int, ufd, sigset_t __user *, user_mask,
 		size_t, sizemask)
 {
-	return do_signalfd4(ufd, user_mask, sizemask, 0);
+	sigset_t mask;
+
+	if (sizemask != sizeof(sigset_t) ||
+	    copy_from_user(&mask, user_mask, sizeof(mask)))
+		return -EINVAL;
+	return do_signalfd4(ufd, &mask, 0);
 }
 
 #ifdef CONFIG_COMPAT
 static long do_compat_signalfd4(int ufd,
-			const compat_sigset_t __user *sigmask,
+			const compat_sigset_t __user *user_mask,
 			compat_size_t sigsetsize, int flags)
 {
-	sigset_t tmp;
-	sigset_t __user *ksigmask;
+	sigset_t mask;
 
 	if (sigsetsize != sizeof(compat_sigset_t))
 		return -EINVAL;
-	if (get_compat_sigset(&tmp, sigmask))
+	if (get_compat_sigset(&mask, user_mask))
 		return -EFAULT;
-	ksigmask = compat_alloc_user_space(sizeof(sigset_t));
-	if (copy_to_user(ksigmask, &tmp, sizeof(sigset_t)))
-		return -EFAULT;
-
-	return do_signalfd4(ufd, ksigmask, sizeof(sigset_t), flags);
+	return do_signalfd4(ufd, &mask, flags);
 }
 
 COMPAT_SYSCALL_DEFINE4(signalfd4, int, ufd,
-		     const compat_sigset_t __user *, sigmask,
+		     const compat_sigset_t __user *, user_mask,
 		     compat_size_t, sigsetsize,
 		     int, flags)
 {
-	return do_compat_signalfd4(ufd, sigmask, sigsetsize, flags);
+	return do_compat_signalfd4(ufd, user_mask, sigsetsize, flags);
 }
 
 COMPAT_SYSCALL_DEFINE3(signalfd, int, ufd,
-		     const compat_sigset_t __user *,sigmask,
+		     const compat_sigset_t __user *, user_mask,
 		     compat_size_t, sigsetsize)
 {
-	return do_compat_signalfd4(ufd, sigmask, sigsetsize, 0);
+	return do_compat_signalfd4(ufd, user_mask, sigsetsize, 0);
 }
 #endif
diff --git a/fs/splice.c b/fs/splice.c
index 2365ab0..b3daa97 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1243,38 +1243,26 @@
  * For lack of a better implementation, implement vmsplice() to userspace
  * as a simple copy of the pipes pages to the user iov.
  */
-static long vmsplice_to_user(struct file *file, const struct iovec __user *uiov,
-			     unsigned long nr_segs, unsigned int flags)
+static long vmsplice_to_user(struct file *file, struct iov_iter *iter,
+			     unsigned int flags)
 {
-	struct pipe_inode_info *pipe;
-	struct splice_desc sd;
-	long ret;
-	struct iovec iovstack[UIO_FASTIOV];
-	struct iovec *iov = iovstack;
-	struct iov_iter iter;
+	struct pipe_inode_info *pipe = get_pipe_info(file);
+	struct splice_desc sd = {
+		.total_len = iov_iter_count(iter),
+		.flags = flags,
+		.u.data = iter
+	};
+	long ret = 0;
 
-	pipe = get_pipe_info(file);
 	if (!pipe)
 		return -EBADF;
 
-	ret = import_iovec(READ, uiov, nr_segs,
-			   ARRAY_SIZE(iovstack), &iov, &iter);
-	if (ret < 0)
-		return ret;
-
-	sd.total_len = iov_iter_count(&iter);
-	sd.len = 0;
-	sd.flags = flags;
-	sd.u.data = &iter;
-	sd.pos = 0;
-
 	if (sd.total_len) {
 		pipe_lock(pipe);
 		ret = __splice_from_pipe(pipe, &sd, pipe_to_user);
 		pipe_unlock(pipe);
 	}
 
-	kfree(iov);
 	return ret;
 }
 
@@ -1283,14 +1271,11 @@
  * as splice-from-memory, where the regular splice is splice-from-file (or
  * to file). In both cases the output is a pipe, naturally.
  */
-static long vmsplice_to_pipe(struct file *file, const struct iovec __user *uiov,
-			     unsigned long nr_segs, unsigned int flags)
+static long vmsplice_to_pipe(struct file *file, struct iov_iter *iter,
+			     unsigned int flags)
 {
 	struct pipe_inode_info *pipe;
-	struct iovec iovstack[UIO_FASTIOV];
-	struct iovec *iov = iovstack;
-	struct iov_iter from;
-	long ret;
+	long ret = 0;
 	unsigned buf_flag = 0;
 
 	if (flags & SPLICE_F_GIFT)
@@ -1300,22 +1285,31 @@
 	if (!pipe)
 		return -EBADF;
 
-	ret = import_iovec(WRITE, uiov, nr_segs,
-			   ARRAY_SIZE(iovstack), &iov, &from);
-	if (ret < 0)
-		return ret;
-
 	pipe_lock(pipe);
 	ret = wait_for_space(pipe, flags);
 	if (!ret)
-		ret = iter_to_pipe(&from, pipe, buf_flag);
+		ret = iter_to_pipe(iter, pipe, buf_flag);
 	pipe_unlock(pipe);
 	if (ret > 0)
 		wakeup_pipe_readers(pipe);
-	kfree(iov);
 	return ret;
 }
 
+static int vmsplice_type(struct fd f, int *type)
+{
+	if (!f.file)
+		return -EBADF;
+	if (f.file->f_mode & FMODE_WRITE) {
+		*type = WRITE;
+	} else if (f.file->f_mode & FMODE_READ) {
+		*type = READ;
+	} else {
+		fdput(f);
+		return -EBADF;
+	}
+	return 0;
+}
+
 /*
  * Note that vmsplice only really supports true splicing _from_ user memory
  * to a pipe, not the other way around. Splicing from user memory is a simple
@@ -1332,57 +1326,69 @@
  * Currently we punt and implement it as a normal copy, see pipe_to_user().
  *
  */
-static long do_vmsplice(int fd, const struct iovec __user *iov,
-			unsigned long nr_segs, unsigned int flags)
+static long do_vmsplice(struct file *f, struct iov_iter *iter, unsigned int flags)
 {
-	struct fd f;
-	long error;
-
 	if (unlikely(flags & ~SPLICE_F_ALL))
 		return -EINVAL;
-	if (unlikely(nr_segs > UIO_MAXIOV))
-		return -EINVAL;
-	else if (unlikely(!nr_segs))
+
+	if (!iov_iter_count(iter))
 		return 0;
 
-	error = -EBADF;
-	f = fdget(fd);
-	if (f.file) {
-		if (f.file->f_mode & FMODE_WRITE)
-			error = vmsplice_to_pipe(f.file, iov, nr_segs, flags);
-		else if (f.file->f_mode & FMODE_READ)
-			error = vmsplice_to_user(f.file, iov, nr_segs, flags);
-
-		fdput(f);
-	}
-
-	return error;
+	if (iov_iter_rw(iter) == WRITE)
+		return vmsplice_to_pipe(f, iter, flags);
+	else
+		return vmsplice_to_user(f, iter, flags);
 }
 
-SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, iov,
+SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
 		unsigned long, nr_segs, unsigned int, flags)
 {
-	return do_vmsplice(fd, iov, nr_segs, flags);
+	struct iovec iovstack[UIO_FASTIOV];
+	struct iovec *iov = iovstack;
+	struct iov_iter iter;
+	long error;
+	struct fd f;
+	int type;
+
+	f = fdget(fd);
+	error = vmsplice_type(f, &type);
+	if (error)
+		return error;
+
+	error = import_iovec(type, uiov, nr_segs,
+			     ARRAY_SIZE(iovstack), &iov, &iter);
+	if (!error) {
+		error = do_vmsplice(f.file, &iter, flags);
+		kfree(iov);
+	}
+	fdput(f);
+	return error;
 }
 
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, iov32,
 		    unsigned int, nr_segs, unsigned int, flags)
 {
-	unsigned i;
-	struct iovec __user *iov;
-	if (nr_segs > UIO_MAXIOV)
-		return -EINVAL;
-	iov = compat_alloc_user_space(nr_segs * sizeof(struct iovec));
-	for (i = 0; i < nr_segs; i++) {
-		struct compat_iovec v;
-		if (get_user(v.iov_base, &iov32[i].iov_base) ||
-		    get_user(v.iov_len, &iov32[i].iov_len) ||
-		    put_user(compat_ptr(v.iov_base), &iov[i].iov_base) ||
-		    put_user(v.iov_len, &iov[i].iov_len))
-			return -EFAULT;
+	struct iovec iovstack[UIO_FASTIOV];
+	struct iovec *iov = iovstack;
+	struct iov_iter iter;
+	long error;
+	struct fd f;
+	int type;
+
+	f = fdget(fd);
+	error = vmsplice_type(f, &type);
+	if (error)
+		return error;
+
+	error = compat_import_iovec(type, iov32, nr_segs,
+			     ARRAY_SIZE(iovstack), &iov, &iter);
+	if (!error) {
+		error = do_vmsplice(f.file, &iter, flags);
+		kfree(iov);
 	}
-	return do_vmsplice(fd, iov, nr_segs, flags);
+	fdput(f);
+	return error;
 }
 #endif
 
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index e64c029..b38964a 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -98,8 +98,6 @@
 struct fsnotify_ops {
 	int (*handle_event)(struct fsnotify_group *group,
 			    struct inode *inode,
-			    struct fsnotify_mark *inode_mark,
-			    struct fsnotify_mark *vfsmount_mark,
 			    u32 mask, const void *data, int data_type,
 			    const unsigned char *file_name, u32 cookie,
 			    struct fsnotify_iter_info *iter_info);
@@ -201,6 +199,57 @@
 #define FSNOTIFY_EVENT_PATH	1
 #define FSNOTIFY_EVENT_INODE	2
 
+enum fsnotify_obj_type {
+	FSNOTIFY_OBJ_TYPE_INODE,
+	FSNOTIFY_OBJ_TYPE_VFSMOUNT,
+	FSNOTIFY_OBJ_TYPE_COUNT,
+	FSNOTIFY_OBJ_TYPE_DETACHED = FSNOTIFY_OBJ_TYPE_COUNT
+};
+
+#define FSNOTIFY_OBJ_TYPE_INODE_FL	(1U << FSNOTIFY_OBJ_TYPE_INODE)
+#define FSNOTIFY_OBJ_TYPE_VFSMOUNT_FL	(1U << FSNOTIFY_OBJ_TYPE_VFSMOUNT)
+#define FSNOTIFY_OBJ_ALL_TYPES_MASK	((1U << FSNOTIFY_OBJ_TYPE_COUNT) - 1)
+
+struct fsnotify_iter_info {
+	struct fsnotify_mark *marks[FSNOTIFY_OBJ_TYPE_COUNT];
+	unsigned int report_mask;
+	int srcu_idx;
+};
+
+static inline bool fsnotify_iter_should_report_type(
+		struct fsnotify_iter_info *iter_info, int type)
+{
+	return (iter_info->report_mask & (1U << type));
+}
+
+static inline void fsnotify_iter_set_report_type(
+		struct fsnotify_iter_info *iter_info, int type)
+{
+	iter_info->report_mask |= (1U << type);
+}
+
+static inline void fsnotify_iter_set_report_type_mark(
+		struct fsnotify_iter_info *iter_info, int type,
+		struct fsnotify_mark *mark)
+{
+	iter_info->marks[type] = mark;
+	iter_info->report_mask |= (1U << type);
+}
+
+#define FSNOTIFY_ITER_FUNCS(name, NAME) \
+static inline struct fsnotify_mark *fsnotify_iter_##name##_mark( \
+		struct fsnotify_iter_info *iter_info) \
+{ \
+	return (iter_info->report_mask & FSNOTIFY_OBJ_TYPE_##NAME##_FL) ? \
+		iter_info->marks[FSNOTIFY_OBJ_TYPE_##NAME] : NULL; \
+}
+
+FSNOTIFY_ITER_FUNCS(inode, INODE)
+FSNOTIFY_ITER_FUNCS(vfsmount, VFSMOUNT)
+
+#define fsnotify_foreach_obj_type(type) \
+	for (type = 0; type < FSNOTIFY_OBJ_TYPE_COUNT; type++)
+
 /*
  * Inode / vfsmount point to this structure which tracks all marks attached to
  * the inode / vfsmount. The reference to inode / vfsmount is held by this
@@ -209,11 +258,7 @@
  */
 struct fsnotify_mark_connector {
 	spinlock_t lock;
-#define FSNOTIFY_OBJ_TYPE_INODE		0x01
-#define FSNOTIFY_OBJ_TYPE_VFSMOUNT	0x02
-#define FSNOTIFY_OBJ_ALL_TYPES		(FSNOTIFY_OBJ_TYPE_INODE | \
-					 FSNOTIFY_OBJ_TYPE_VFSMOUNT)
-	unsigned int flags;	/* Type of object [lock] */
+	unsigned int type;	/* Type of object [lock] */
 	union {	/* Object pointer [lock] */
 		struct inode *inode;
 		struct vfsmount *mnt;
@@ -356,7 +401,21 @@
 extern int fsnotify_add_mark(struct fsnotify_mark *mark, struct inode *inode,
 			     struct vfsmount *mnt, int allow_dups);
 extern int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
-				    struct inode *inode, struct vfsmount *mnt, int allow_dups);
+				    struct inode *inode, struct vfsmount *mnt,
+				    int allow_dups);
+/* attach the mark to the inode */
+static inline int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
+					  struct inode *inode,
+					  int allow_dups)
+{
+	return fsnotify_add_mark(mark, inode, NULL, allow_dups);
+}
+static inline int fsnotify_add_inode_mark_locked(struct fsnotify_mark *mark,
+						 struct inode *inode,
+						 int allow_dups)
+{
+	return fsnotify_add_mark_locked(mark, inode, NULL, allow_dups);
+}
 /* given a group and a mark, flag mark to be freed when all references are dropped */
 extern void fsnotify_destroy_mark(struct fsnotify_mark *mark,
 				  struct fsnotify_group *group);
@@ -369,12 +428,12 @@
 /* run all the marks in a group, and clear all of the vfsmount marks */
 static inline void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT_FL);
 }
 /* run all the marks in a group, and clear all of the inode marks */
 static inline void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
 {
-	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_INODE);
+	fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_INODE_FL);
 }
 extern void fsnotify_get_mark(struct fsnotify_mark *mark);
 extern void fsnotify_put_mark(struct fsnotify_mark *mark);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index a982bb7..a78606e 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -81,6 +81,7 @@
 extern struct dentry *kern_path_locked(const char *, struct path *);
 extern int kern_path_mountpoint(int, const char *, struct path *, unsigned int);
 
+extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
 
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 04551af..dd2052f 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -345,7 +345,7 @@
 
 	rcu_read_lock();
 	nat_hook = rcu_dereference(nf_nat_hook);
-	if (nat_hook->decode_session)
+	if (nat_hook && nat_hook->decode_session)
 		nat_hook->decode_session(skb, fl);
 	rcu_read_unlock();
 #endif
diff --git a/include/linux/netfilter/ipset/ip_set_timeout.h b/include/linux/netfilter/ipset/ip_set_timeout.h
index bfb3531..8ce271e 100644
--- a/include/linux/netfilter/ipset/ip_set_timeout.h
+++ b/include/linux/netfilter/ipset/ip_set_timeout.h
@@ -23,6 +23,9 @@
 /* Set is defined with timeout support: timeout value may be 0 */
 #define IPSET_NO_TIMEOUT	UINT_MAX
 
+/* Max timeout value, see msecs_to_jiffies() in jiffies.h */
+#define IPSET_MAX_TIMEOUT	(UINT_MAX >> 1)/MSEC_PER_SEC
+
 #define ip_set_adt_opt_timeout(opt, set)	\
 ((opt)->ext.timeout != IPSET_NO_TIMEOUT ? (opt)->ext.timeout : (set)->timeout)
 
@@ -32,11 +35,10 @@
 	unsigned int timeout = ip_set_get_h32(tb);
 
 	/* Normalize to fit into jiffies */
-	if (timeout > UINT_MAX/MSEC_PER_SEC)
-		timeout = UINT_MAX/MSEC_PER_SEC;
+	if (timeout > IPSET_MAX_TIMEOUT)
+		timeout = IPSET_MAX_TIMEOUT;
 
-	/* Userspace supplied TIMEOUT parameter: adjust crazy size */
-	return timeout == IPSET_NO_TIMEOUT ? IPSET_NO_TIMEOUT - 1 : timeout;
+	return timeout;
 }
 
 static inline bool
@@ -65,8 +67,14 @@
 static inline u32
 ip_set_timeout_get(const unsigned long *timeout)
 {
-	return *timeout == IPSET_ELEM_PERMANENT ? 0 :
-		jiffies_to_msecs(*timeout - jiffies)/MSEC_PER_SEC;
+	u32 t;
+
+	if (*timeout == IPSET_ELEM_PERMANENT)
+		return 0;
+
+	t = jiffies_to_msecs(*timeout - jiffies)/MSEC_PER_SEC;
+	/* Zero value in userspace means no timeout */
+	return t == 0 ? 1 : t;
 }
 
 #endif	/* __KERNEL__ */
diff --git a/include/linux/platform_data/shmob_drm.h b/include/linux/platform_data/shmob_drm.h
index 7c686d3..ee495d7 100644
--- a/include/linux/platform_data/shmob_drm.h
+++ b/include/linux/platform_data/shmob_drm.h
@@ -18,9 +18,6 @@
 
 #include <drm/drm_mode.h>
 
-struct sh_mobile_meram_cfg;
-struct sh_mobile_meram_info;
-
 enum shmob_drm_clk_source {
 	SHMOB_DRM_CLK_BUS,
 	SHMOB_DRM_CLK_PERIPHERAL,
@@ -93,7 +90,6 @@
 	struct shmob_drm_interface_data iface;
 	struct shmob_drm_panel_data panel;
 	struct shmob_drm_backlight_data backlight;
-	const struct sh_mobile_meram_cfg *meram;
 };
 
 #endif /* __SHMOB_DRM_H__ */
diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index e518352..626fc65 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -14,6 +14,8 @@
 
 #ifdef CONFIG_PROC_FS
 
+typedef int (*proc_write_t)(struct file *, char *, size_t);
+
 extern void proc_root_init(void);
 extern void proc_flush_task(struct task_struct *);
 
@@ -61,6 +63,16 @@
 struct proc_dir_entry *proc_create_net_single(const char *name, umode_t mode,
 		struct proc_dir_entry *parent,
 		int (*show)(struct seq_file *, void *), void *data);
+struct proc_dir_entry *proc_create_net_data_write(const char *name, umode_t mode,
+						  struct proc_dir_entry *parent,
+						  const struct seq_operations *ops,
+						  proc_write_t write,
+						  unsigned int state_size, void *data);
+struct proc_dir_entry *proc_create_net_single_write(const char *name, umode_t mode,
+						    struct proc_dir_entry *parent,
+						    int (*show)(struct seq_file *, void *),
+						    proc_write_t write,
+						    void *data);
 
 #else /* CONFIG_PROC_FS */
 
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index bbf3252..fab0213 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -35,7 +35,7 @@
 	if (weak_barriers)
 		virt_rmb();
 	else
-		rmb();
+		dma_rmb();
 }
 
 static inline void virtio_wmb(bool weak_barriers)
@@ -43,7 +43,7 @@
 	if (weak_barriers)
 		virt_wmb();
 	else
-		wmb();
+		dma_wmb();
 }
 
 static inline void virtio_store_mb(bool weak_barriers,
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 6d6e21d..a0bec23 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -631,6 +631,7 @@
 
 	/* alternate persistence engine */
 	struct ip_vs_pe __rcu	*pe;
+	int			conntrack_afmask;
 
 	struct rcu_head		rcu_head;
 };
@@ -1611,6 +1612,35 @@
 	return false;
 }
 
+static inline int ip_vs_register_conntrack(struct ip_vs_service *svc)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	int afmask = (svc->af == AF_INET6) ? 2 : 1;
+	int ret = 0;
+
+	if (!(svc->conntrack_afmask & afmask)) {
+		ret = nf_ct_netns_get(svc->ipvs->net, svc->af);
+		if (ret >= 0)
+			svc->conntrack_afmask |= afmask;
+	}
+	return ret;
+#else
+	return 0;
+#endif
+}
+
+static inline void ip_vs_unregister_conntrack(struct ip_vs_service *svc)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	int afmask = (svc->af == AF_INET6) ? 2 : 1;
+
+	if (svc->conntrack_afmask & afmask) {
+		nf_ct_netns_put(svc->ipvs->net, svc->af);
+		svc->conntrack_afmask &= ~afmask;
+	}
+#endif
+}
+
 static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
diff --git a/include/net/netfilter/nf_conntrack_count.h b/include/net/netfilter/nf_conntrack_count.h
index 1910b65..3a188a0 100644
--- a/include/net/netfilter/nf_conntrack_count.h
+++ b/include/net/netfilter/nf_conntrack_count.h
@@ -20,7 +20,8 @@
 				 bool *addit);
 
 bool nf_conncount_add(struct hlist_head *head,
-		      const struct nf_conntrack_tuple *tuple);
+		      const struct nf_conntrack_tuple *tuple,
+		      const struct nf_conntrack_zone *zone);
 
 void nf_conncount_cache_free(struct hlist_head *hhead);
 
diff --git a/include/net/netfilter/nft_dup.h b/include/net/netfilter/nft_dup.h
deleted file mode 100644
index 4d9d512..0000000
--- a/include/net/netfilter/nft_dup.h
+++ /dev/null
@@ -1,10 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _NFT_DUP_H_
-#define _NFT_DUP_H_
-
-struct nft_dup_inet {
-	enum nft_registers	sreg_addr:8;
-	enum nft_registers	sreg_dev:8;
-};
-
-#endif /* _NFT_DUP_H_ */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index ebf809e..dbe1b91 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1133,6 +1133,11 @@
 };
 #define SCTP_INPUT_CB(__skb)	((struct sctp_input_cb *)&((__skb)->cb[0]))
 
+struct sctp_output_cb {
+	struct sk_buff *last;
+};
+#define SCTP_OUTPUT_CB(__skb)	((struct sctp_output_cb *)&((__skb)->cb[0]))
+
 static inline const struct sk_buff *sctp_gso_headskb(const struct sk_buff *skb)
 {
 	const struct sctp_chunk *chunk = SCTP_INPUT_CB(skb)->chunk;
diff --git a/include/net/tls.h b/include/net/tls.h
index 70c2737..7f84ea3 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -109,8 +109,7 @@
 
 	struct strparser strp;
 	void (*saved_data_ready)(struct sock *sk);
-	unsigned int (*sk_poll)(struct file *file, struct socket *sock,
-				struct poll_table_struct *wait);
+	__poll_t (*sk_poll_mask)(struct socket *sock, __poll_t events);
 	struct sk_buff *recv_pkt;
 	u8 control;
 	bool decrypted;
@@ -225,8 +224,7 @@
 void tls_sw_free_resources_rx(struct sock *sk);
 int tls_sw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		   int nonblock, int flags, int *addr_len);
-unsigned int tls_sw_poll(struct file *file, struct socket *sock,
-			 struct poll_table_struct *wait);
+__poll_t tls_sw_poll_mask(struct socket *sock, __poll_t events);
 ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index 7584616..d002213 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -109,7 +109,7 @@
 #undef IFLITTLE
 
 struct __aio_sigset {
-	sigset_t __user	*sigmask;
+	const sigset_t __user	*sigmask;
 	size_t		sigsetsize;
 };
 
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index c712eb6..336014b 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -112,7 +112,7 @@
 				 IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING |
 				 IPS_SEQ_ADJUST | IPS_TEMPLATE | IPS_OFFLOAD),
 
-	__IPS_MAX_BIT = 14,
+	__IPS_MAX_BIT = 15,
 };
 
 /* Connection tracking event types */
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index c9bf74b..89438e6 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -266,7 +266,7 @@
  * @NFT_SET_INTERVAL: set contains intervals
  * @NFT_SET_MAP: set is used as a dictionary
  * @NFT_SET_TIMEOUT: set uses timeouts
- * @NFT_SET_EVAL: set contains expressions for evaluation
+ * @NFT_SET_EVAL: set can be updated from the evaluation path
  * @NFT_SET_OBJECT: set contains stateful objects
  */
 enum nft_set_flags {
diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 28b3654..27e4e44 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -981,18 +981,18 @@
  *	only the %NL80211_ATTR_IE data is used and updated with this command.
  *
  * @NL80211_CMD_SET_PMK: For offloaded 4-Way handshake, set the PMK or PMK-R0
- *	for the given authenticator address (specified with &NL80211_ATTR_MAC).
- *	When &NL80211_ATTR_PMKR0_NAME is set, &NL80211_ATTR_PMK specifies the
+ *	for the given authenticator address (specified with %NL80211_ATTR_MAC).
+ *	When %NL80211_ATTR_PMKR0_NAME is set, %NL80211_ATTR_PMK specifies the
  *	PMK-R0, otherwise it specifies the PMK.
  * @NL80211_CMD_DEL_PMK: For offloaded 4-Way handshake, delete the previously
  *	configured PMK for the authenticator address identified by
- *	&NL80211_ATTR_MAC.
+ *	%NL80211_ATTR_MAC.
  * @NL80211_CMD_PORT_AUTHORIZED: An event that indicates that the 4 way
  *	handshake was completed successfully by the driver. The BSSID is
- *	specified with &NL80211_ATTR_MAC. Drivers that support 4 way handshake
+ *	specified with %NL80211_ATTR_MAC. Drivers that support 4 way handshake
  *	offload should send this event after indicating 802.11 association with
- *	&NL80211_CMD_CONNECT or &NL80211_CMD_ROAM. If the 4 way handshake failed
- *	&NL80211_CMD_DISCONNECT should be indicated instead.
+ *	%NL80211_CMD_CONNECT or %NL80211_CMD_ROAM. If the 4 way handshake failed
+ *	%NL80211_CMD_DISCONNECT should be indicated instead.
  *
  * @NL80211_CMD_CONTROL_PORT_FRAME: Control Port (e.g. PAE) frame TX request
  *	and RX notification.  This command is used both as a request to transmit
@@ -1029,9 +1029,9 @@
  *	initiated the connection through the connect request.
  *
  * @NL80211_CMD_STA_OPMODE_CHANGED: An event that notify station's
- *	ht opmode or vht opmode changes using any of &NL80211_ATTR_SMPS_MODE,
- *	&NL80211_ATTR_CHANNEL_WIDTH,&NL80211_ATTR_NSS attributes with its
- *	address(specified in &NL80211_ATTR_MAC).
+ *	ht opmode or vht opmode changes using any of %NL80211_ATTR_SMPS_MODE,
+ *	%NL80211_ATTR_CHANNEL_WIDTH,%NL80211_ATTR_NSS attributes with its
+ *	address(specified in %NL80211_ATTR_MAC).
  *
  * @NL80211_CMD_MAX: highest used command number
  * @__NL80211_CMD_AFTER_LAST: internal use
@@ -2218,7 +2218,7 @@
  * @NL80211_ATTR_EXTERNAL_AUTH_ACTION: Identify the requested external
  *     authentication operation (u32 attribute with an
  *     &enum nl80211_external_auth_action value). This is used with the
- *     &NL80211_CMD_EXTERNAL_AUTH request event.
+ *     %NL80211_CMD_EXTERNAL_AUTH request event.
  * @NL80211_ATTR_EXTERNAL_AUTH_SUPPORT: Flag attribute indicating that the user
  *     space supports external authentication. This attribute shall be used
  *     only with %NL80211_CMD_CONNECT request. The driver may offload
@@ -3491,7 +3491,7 @@
  * @NL80211_RRF_AUTO_BW: maximum available bandwidth should be calculated
  *	base on contiguous rules and wider channels will be allowed to cross
  *	multiple contiguous/overlapping frequency ranges.
- * @NL80211_RRF_IR_CONCURRENT: See &NL80211_FREQUENCY_ATTR_IR_CONCURRENT
+ * @NL80211_RRF_IR_CONCURRENT: See %NL80211_FREQUENCY_ATTR_IR_CONCURRENT
  * @NL80211_RRF_NO_HT40MINUS: channels can't be used in HT40- operation
  * @NL80211_RRF_NO_HT40PLUS: channels can't be used in HT40+ operation
  * @NL80211_RRF_NO_80MHZ: 80MHz operation not allowed
@@ -5643,11 +5643,11 @@
  * @NL80211_NAN_SRF_INCLUDE: present if the include bit of the SRF set.
  *	This is a flag.
  * @NL80211_NAN_SRF_BF: Bloom Filter. Present if and only if
- *	&NL80211_NAN_SRF_MAC_ADDRS isn't present. This attribute is binary.
+ *	%NL80211_NAN_SRF_MAC_ADDRS isn't present. This attribute is binary.
  * @NL80211_NAN_SRF_BF_IDX: index of the Bloom Filter. Mandatory if
- *	&NL80211_NAN_SRF_BF is present. This is a u8.
+ *	%NL80211_NAN_SRF_BF is present. This is a u8.
  * @NL80211_NAN_SRF_MAC_ADDRS: list of MAC addresses for the SRF. Present if
- *	and only if &NL80211_NAN_SRF_BF isn't present. This is a nested
+ *	and only if %NL80211_NAN_SRF_BF isn't present. This is a nested
  *	attribute. Each nested attribute is a MAC address.
  * @NUM_NL80211_NAN_SRF_ATTR: internal
  * @NL80211_NAN_SRF_ATTR_MAX: highest NAN SRF attribute
diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index 308e209..449132c 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -45,11 +45,14 @@
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED		0x80
 
-/* Some virtio feature bits (currently bits 28 through 32) are reserved for the
- * transport being used (eg. virtio_ring), the rest are per-device feature
- * bits. */
+/*
+ * Virtio feature bits VIRTIO_TRANSPORT_F_START through
+ * VIRTIO_TRANSPORT_F_END are reserved for the transport
+ * being used (e.g. virtio_ring, virtio_pci etc.), the
+ * rest are per-device feature bits.
+ */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		34
+#define VIRTIO_TRANSPORT_F_END		38
 
 #ifndef VIRTIO_CONFIG_NO_LEGACY
 /* Do we get callbacks when the ring is completely used, even if we've
@@ -71,4 +74,9 @@
  * this is for compatibility with legacy systems.
  */
 #define VIRTIO_F_IOMMU_PLATFORM		33
+
+/*
+ * Does the device support Single Root I/O Virtualization?
+ */
+#define VIRTIO_F_SR_IOV			37
 #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
diff --git a/include/video/auo_k190xfb.h b/include/video/auo_k190xfb.h
deleted file mode 100644
index ac329ee..0000000
--- a/include/video/auo_k190xfb.h
+++ /dev/null
@@ -1,107 +0,0 @@
-/*
- * Definitions for AUO-K190X framebuffer drivers
- *
- * Copyright (C) 2012 Heiko Stuebner <heiko@sntech.de>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#ifndef _LINUX_VIDEO_AUO_K190XFB_H_
-#define _LINUX_VIDEO_AUO_K190XFB_H_
-
-/* Controller standby command needs a param */
-#define AUOK190X_QUIRK_STANDBYPARAM	(1 << 0)
-
-/* Controller standby is completely broken */
-#define AUOK190X_QUIRK_STANDBYBROKEN	(1 << 1)
-
-/*
- * Resolutions for the displays
- */
-#define AUOK190X_RESOLUTION_800_600		0
-#define AUOK190X_RESOLUTION_1024_768		1
-#define AUOK190X_RESOLUTION_600_800		4
-#define AUOK190X_RESOLUTION_768_1024		5
-
-/*
- * struct used by auok190x. board specific stuff comes from *board
- */
-struct auok190xfb_par {
-	struct fb_info *info;
-	struct auok190x_board *board;
-
-	struct regulator *regulator;
-
-	struct mutex io_lock;
-	struct delayed_work work;
-	wait_queue_head_t waitq;
-	int resolution;
-	int rotation;
-	int consecutive_threshold;
-	int update_cnt;
-
-	/* panel and controller informations */
-	int epd_type;
-	int panel_size_int;
-	int panel_size_float;
-	int panel_model;
-	int tcon_version;
-	int lut_version;
-
-	/* individual controller callbacks */
-	void (*update_partial)(struct auok190xfb_par *par, u16 y1, u16 y2);
-	void (*update_all)(struct auok190xfb_par *par);
-	bool (*need_refresh)(struct auok190xfb_par *par);
-	void (*init)(struct auok190xfb_par *par);
-	void (*recover)(struct auok190xfb_par *par);
-
-	int update_mode; /* mode to use for updates */
-	int last_mode; /* update mode last used */
-	int flash;
-
-	/* power management */
-	int autosuspend_delay;
-	bool standby;
-	bool manual_standby;
-};
-
-/**
- * Board specific platform-data
- * @init:		initialize the controller interface
- * @cleanup:		cleanup the controller interface
- * @wait_for_rdy:	wait until the controller is not busy anymore
- * @set_ctl:		change an interface control
- * @set_hdb:		write a value to the data register
- * @get_hdb:		read a value from the data register
- * @setup_irq:		method to setup the irq handling on the busy gpio
- * @gpio_nsleep:	sleep gpio
- * @gpio_nrst:		reset gpio
- * @gpio_nbusy:		busy gpio
- * @resolution:		one of the AUOK190X_RESOLUTION constants
- * @rotation:		rotation of the framebuffer
- * @quirks:		controller quirks to honor
- * @fps:		frames per second for defio
- */
-struct auok190x_board {
-	int (*init)(struct auok190xfb_par *);
-	void (*cleanup)(struct auok190xfb_par *);
-	int (*wait_for_rdy)(struct auok190xfb_par *);
-
-	void (*set_ctl)(struct auok190xfb_par *, unsigned char, u8);
-	void (*set_hdb)(struct auok190xfb_par *, u16);
-	u16 (*get_hdb)(struct auok190xfb_par *);
-
-	int (*setup_irq)(struct fb_info *);
-
-	int gpio_nsleep;
-	int gpio_nrst;
-	int gpio_nbusy;
-
-	int resolution;
-	int quirks;
-	int fps;
-};
-
-#endif
diff --git a/include/video/sh_mobile_lcdc.h b/include/video/sh_mobile_lcdc.h
index f706b0f..84aa976 100644
--- a/include/video/sh_mobile_lcdc.h
+++ b/include/video/sh_mobile_lcdc.h
@@ -3,7 +3,6 @@
 #define __ASM_SH_MOBILE_LCDC_H__
 
 #include <linux/fb.h>
-#include <video/sh_mobile_meram.h>
 
 /* Register definitions */
 #define _LDDCKR			0x410
@@ -184,7 +183,6 @@
 	struct sh_mobile_lcdc_panel_cfg panel_cfg;
 	struct sh_mobile_lcdc_bl_info bl_info;
 	struct sh_mobile_lcdc_sys_bus_cfg sys_bus_cfg; /* only for SYSn I/F */
-	const struct sh_mobile_meram_cfg *meram_cfg;
 
 	struct platform_device *tx_dev;	/* HDMI/DSI transmitter device */
 };
@@ -193,7 +191,6 @@
 	int clock_source;
 	struct sh_mobile_lcdc_chan_cfg ch[2];
 	struct sh_mobile_lcdc_overlay_cfg overlays[4];
-	struct sh_mobile_meram_info *meram_dev;
 };
 
 #endif /* __ASM_SH_MOBILE_LCDC_H__ */
diff --git a/include/video/sh_mobile_meram.h b/include/video/sh_mobile_meram.h
deleted file mode 100644
index f4efc21..0000000
--- a/include/video/sh_mobile_meram.h
+++ /dev/null
@@ -1,95 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __VIDEO_SH_MOBILE_MERAM_H__
-#define __VIDEO_SH_MOBILE_MERAM_H__
-
-/* For sh_mobile_meram_info.addr_mode */
-enum {
-	SH_MOBILE_MERAM_MODE0 = 0,
-	SH_MOBILE_MERAM_MODE1
-};
-
-enum {
-	SH_MOBILE_MERAM_PF_NV = 0,
-	SH_MOBILE_MERAM_PF_RGB,
-	SH_MOBILE_MERAM_PF_NV24
-};
-
-
-struct sh_mobile_meram_priv;
-
-/*
- * struct sh_mobile_meram_info - MERAM platform data
- * @reserved_icbs: Bitmask of reserved ICBs (for instance used through UIO)
- */
-struct sh_mobile_meram_info {
-	int				addr_mode;
-	u32				reserved_icbs;
-	struct sh_mobile_meram_priv	*priv;
-	struct platform_device		*pdev;
-};
-
-/* icb config */
-struct sh_mobile_meram_icb_cfg {
-	unsigned int meram_size;	/* MERAM Buffer Size to use */
-};
-
-struct sh_mobile_meram_cfg {
-	struct sh_mobile_meram_icb_cfg icb