kernel-shark: Multi-thread the computation of stream/combo plots

Parallelize _newCPUGraph() and _newTaskGraph() calls to dramatically
speed up graph rendering, particularly for traces from very large systems.

OpenMP is technically a new dependency here, but it ships with GCC. As long
as your GCC is >= v4.9, its bundled libgomp library is enough to build the
code.
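
For illustration only, the general pattern looks roughly like the sketch
below. It is not the code from this patch: Graph, buildGraph() and
buildAllGraphs() are hypothetical stand-ins for the real graph type and the
_newCPUGraph()/_newTaskGraph() calls, and it assumes each graph can be built
independently without touching shared mutable state.

```cpp
#include <vector>

// Hypothetical stand-in for a per-CPU or per-task graph (not KernelShark's type).
struct Graph {
    int id;
    long binSum;
};

// Hypothetical stand-in for _newCPUGraph()/_newTaskGraph(): builds one graph
// from read-only inputs and modifies no shared state.
static Graph buildGraph(int id, const std::vector<int> &bins)
{
    Graph g{id, 0};
    for (int b : bins)
        g.binSum += static_cast<long>(b) * (id + 1);
    return g;
}

// Build all graphs in parallel. The output vector is sized up front and each
// iteration writes only to its own slot, so the loop body needs no locking.
std::vector<Graph> buildAllGraphs(const std::vector<int> &ids,
                                  const std::vector<int> &bins)
{
    std::vector<Graph> graphs(ids.size());
    const long n = static_cast<long>(ids.size());

    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        graphs[i] = buildGraph(ids[i], bins);

    return graphs;
}
```

With GCC the pragma takes effect only when building with -fopenmp; without
that flag it is ignored and the loop simply runs serially.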

Signed-off-by: Libo Chen <libo.chen@oracle.com>
Signed-off-by: Yordan Karadzhov <y.karadz@gmail.com>
2 files changed