author: David S. Miller <davem@davemloft.net> 2006-02-28 15:10:26 -0800
committer: David S. Miller <davem@sunset.davemloft.net> 2006-03-20 01:14:09 -0800
commit: b830ab665ad96c6b20d51a89b35cbc09ab5a2c29
tree: 57c2c75b3e069f9f244259ae02f6f2fe3de68612 /include/asm-sparc64
parent: aac0aadf09b98ba36eab0bb02a560ebcb82ac39f
[SPARC64]: Fix bugs in SUN4V cpu mondo dispatch.

There were several bugs in the SUN4V cpu mondo dispatch code.

In fact, if we ever got a EWOULDBLOCK or other error from
the hypervisor call, we'd potentially send a cpu mondo multiple
times to the same cpu and even worse we could loop until the
timeout resending the same mondo over and over to such cpus.

So let's bulletproof this thing as follows:

1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
   in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h

2) Don't build and update the cpulist using inline functions, this
   was causing the cpu mask to not get updated in the caller.

3) Disable interrupts during the entire mondo send, otherwise our
   cpu list and/or mondo block could get overwritten if we take
   an interrupt and do a cpu mondo send on the current cpu.

4) Check for all possible error return types from the cpu_mondo_send()
   hypervisor call.  In particular:

   HV_EOK) Our work is done, all cpus have received the mondo.
   HV_CPUERROR) One or more of the cpus in the cpu list we passed
                to the hypervisor are in error state.  Use cpu_state()
                calls over the entries in the cpu list to see which
                ones.  Record them in "error_mask" and report this
                after we are done sending the mondo to cpus which are
                not in error state.
   HV_EWOULDBLOCK) We need to keep trying.

   Any other error we consider fatal, we report the event and exit
   immediately.

5) We only timeout if forward progress is not made.  Forward progress
   is defined as having at least one cpu get the mondo successfully
   in a given cpu_mondo_send() call.  Otherwise we bump a counter
   and delay a little.  If the counter hits a limit, we signal an
   error and report the event.

Also, smp_call_function_mask() error handling reports the number
of cpus incorrectly.

Signed-off-by: David S. Miller <davem@davemloft.net>

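The retry policy in items 4 and 5 of the commit message can be sketched in plain user-space C. This is not the kernel code from this commit (that lives in files outside this limited diff); it is a minimal illustration under stated assumptions. All `ex_`/`EX_` names are hypothetical, the numeric status codes are illustrative rather than copied from asm/hypervisor.h, and the sketch assumes the hypervisor marks a delivered cpu-list entry with 0xffff. The two hypervisor calls are replaced by function pointers so the logic can be tried against a toy hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative status codes; the authoritative values are in asm/hypervisor.h. */
enum { EX_EOK = 0, EX_EWOULDBLOCK = 9, EX_ECPUERROR = 12 };

#define EX_CPU_STATE_ERROR 0x03    /* same value as HV_CPU_STATE_ERROR in the diff */
#define EX_CPU_SENT        0xffffu /* assumed "delivered" marker in the cpu list */
#define EX_MAX_CPUS        64      /* width of the illustrative error mask */
#define EX_RETRY_LIMIT     10000   /* hypothetical no-progress limit */

/* Hypothetical stand-ins for the cpu_mondo_send() and cpu_state() calls. */
typedef unsigned long (*ex_send_fn)(uint16_t *list, size_t count);
typedef long (*ex_state_fn)(uint16_t cpu);

struct ex_result {
    int status;          /* 0 = done, -1 = no forward progress, -2 = fatal error */
    uint64_t error_mask; /* cpus found in error state */
};

static struct ex_result ex_mondo_dispatch(uint16_t *list, size_t count,
                                          ex_send_fn send, ex_state_fn state)
{
    struct ex_result res = { 0, 0 };
    unsigned int retries = 0;

    assert(count <= EX_MAX_CPUS);
    while (count > 0) {
        unsigned long status = send(list, count);
        size_t remaining = 0, delivered = 0;

        if (status == EX_EOK)
            break;                   /* every listed cpu got the mondo */
        if (status != EX_EWOULDBLOCK && status != EX_ECPUERROR) {
            res.status = -2;         /* any other error is treated as fatal */
            return res;
        }

        /* Rebuild the list so no cpu is sent the same mondo twice. */
        for (size_t i = 0; i < count; i++) {
            uint16_t cpu = list[i];

            if (cpu == EX_CPU_SENT) {
                delivered++;
            } else if (status == EX_ECPUERROR &&
                       state(cpu) == EX_CPU_STATE_ERROR) {
                assert(cpu < EX_MAX_CPUS);
                res.error_mask |= UINT64_C(1) << cpu; /* report after the send */
            } else {
                list[remaining++] = cpu;
            }
        }
        count = remaining;

        if (delivered > 0) {
            retries = 0;             /* forward progress resets the counter */
        } else if (++retries > EX_RETRY_LIMIT) {
            res.status = -1;         /* a real caller would also delay between tries */
            return res;
        }
    }
    return res;
}

/* Toy hypervisor: cpu 3 is broken and only one healthy cpu accepts per call. */
static unsigned long ex_fake_send(uint16_t *list, size_t count)
{
    int delivered = 0, pending = 0, errored = 0;

    for (size_t i = 0; i < count; i++) {
        if (list[i] == 3) {
            errored = 1;
        } else if (!delivered) {
            list[i] = EX_CPU_SENT;
            delivered = 1;
        } else {
            pending = 1;
        }
    }
    if (errored)
        return EX_ECPUERROR;
    return pending ? EX_EWOULDBLOCK : EX_EOK;
}

static long ex_fake_state(uint16_t cpu)
{
    return cpu == 3 ? EX_CPU_STATE_ERROR : 0x02; /* 0x02 = running */
}

/* Toy hypervisor that never delivers anything. */
static unsigned long ex_stuck_send(uint16_t *list, size_t count)
{
    (void)list;
    (void)count;
    return EX_EWOULDBLOCK;
}
```

With the toy hypervisor, dispatching to the list {1, 3, 5} finishes with status 0 and bit 3 set in error_mask, while the stuck variant gives up with status -1 once the retry counter passes the limit. The counter only grows on calls where no cpu accepted the mondo, which is the "forward progress" rule from item 5.
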
Diffstat (limited to 'include/asm-sparc64')
-rw-r--r-- include/asm-sparc64/hypervisor.h 10
1 files changed, 10 insertions, 0 deletions
diff --git a/include/asm-sparc64/hypervisor.h b/include/asm-sparc64/hypervisor.h
index 726e2ea03ce3..612bf319753f 100644
--- a/include/asm-sparc64/hypervisor.h
+++ b/include/asm-sparc64/hypervisor.h
@@ -342,6 +342,8 @@ extern unsigned long sun4v_cpu_qconf(unsigned long type,
  * ENOCPU Invalid cpu in CPU list
  * EWOULDBLOCK Some or all of the listed CPUs did not receive
  * the mondo
+ * ECPUERROR One or more of the listed CPUs are in error
+ * state, use HV_FAST_CPU_STATE to see which ones
  * EINVAL CPU list includes caller's CPU ID
  *
  * Send a mondo interrupt to the CPUs in the given CPU list with the
@@ -355,6 +357,10 @@ extern unsigned long sun4v_cpu_qconf(unsigned long type,
  */
 #define HV_FAST_CPU_MONDO_SEND 0x42
 
+#ifndef __ASSEMBLY__
+extern unsigned long sun4v_cpu_mondo_send(unsigned long cpu_count, unsigned long cpu_list_pa, unsigned long mondo_block_pa);
+#endif
+
 /* cpu_myid()
  * TRAP: HV_FAST_TRAP
  * FUNCTION: HV_FAST_CPU_MYID
@@ -382,6 +388,10 @@ extern unsigned long sun4v_cpu_qconf(unsigned long type,
 #define HV_CPU_STATE_RUNNING 0x02
 #define HV_CPU_STATE_ERROR 0x03
 
+#ifndef __ASSEMBLY__
+extern long sun4v_cpu_state(unsigned long cpuid);
+#endif
+
 /* cpu_set_rtba()
  * TRAP: HV_FAST_TRAP
  * FUNCTION: HV_FAST_CPU_SET_RTBA