From ecf590ae9bb13ba2b2f01c3bf7a53056a8b1467b Mon Sep 17 00:00:00 2001
From: Markus Wittmann
Date: Thu, 26 Oct 2017 09:43:56 +0200
Subject: [PATCH] add HTML documentation

---
 doc/html/main.html | 623 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 623 insertions(+)
 create mode 100644 doc/html/main.html

diff --git a/doc/html/main.html b/doc/html/main.html
new file mode 100644
index 0000000..36579cc
--- /dev/null
+++ b/doc/html/main.html
@@ -0,0 +1,623 @@
+

LBM Benchmark Kernels Documentation

+ + + +
+

1   Compilation

+

The benchmark framework currently supports only Linux systems with the GCC and Intel compilers. Any other configuration probably requires adjustments to the code and the makefiles. Furthermore, some code might be platform specific, or at least POSIX specific.

+

The benchmark can be built via make from the src subdirectory. This generates one binary that hosts all implemented benchmark kernels.

+

Binaries are located under the bin subdirectory and will have different names depending on compiler and build configuration.

+
+

1.1   Debug and Verification

+
+make
+
+

Running make without any arguments builds the debug version (BUILD=debug) of the benchmark kernels: no optimizations are performed, line numbers and debug symbols are included, and DEBUG is defined. The resulting binary is placed in the bin subdirectory and named lbmbenchk-linux-<compiler>-debug.

+

Without any further specification the binary is built with verification (VERIFICATION=on), statistics (STATISTICS=on), and VTK output (VTK_OUTPUT=on) enabled.

+

Please note that the generated binary will therefore exhibit poor performance.

+
+
+

1.2   Benchmarking

+

To generate a binary for benchmarking run make with

+
+make BENCHMARK=on BUILD=release
+
+

Here BUILD=release turns optimizations on and BENCHMARK=on disables verification, statistics, and VTK output.

+
+
+

1.3   Release and Verification

+

Verification with debug builds can be extremely slow. Hence verification capabilities can also be built into release builds:

+
+make BUILD=release
+
+
+
+

1.4   Compilers

+

Currently only the GCC and Intel compilers under Linux are supported. The configuration is chosen via CONFIG=linux-gcc or CONFIG=linux-intel.
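The binary name follows from the chosen compiler configuration and build type. The following is a minimal sketch, not part of the framework: the helper name lbm_binary_path is hypothetical, and the pattern bin/lbmbenchk-<config>-<build> is inferred from the binary names that appear in this document. The fallback values mirror the defaults listed in the options summary.

```shell
# Hypothetical helper: print the expected binary path for a given
# CONFIG and BUILD value. The naming pattern is an assumption inferred
# from names such as lbmbenchk-linux-intel-release in this document.
lbm_binary_path() {
    config="${1:-linux-intel}"   # documented default for CONFIG
    build="${2:-debug}"          # documented default for BUILD
    printf 'bin/lbmbenchk-%s-%s\n' "$config" "$build"
}

# Example: path expected after "make CONFIG=linux-gcc BUILD=release"
lbm_binary_path linux-gcc release
```

Such a helper is convenient in job scripts that build several configurations and need to pick the matching binary afterwards.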

+
+
+

1.5   Options Summary

+

Options that can be specified when building the framework with make:

name          values                  default      description
------------  ----------------------  -----------  -----------
TARCH         --                      --           Overrides the architecture the compiler generates code for. The value depends on the chosen compiler.
BENCHMARK     on, off                 off          If enabled, disables VERIFICATION, STATISTICS, and VTK_OUTPUT.
BUILD         debug, release          debug        debug: no optimization, debug symbols, DEBUG defined. release: optimizations enabled.
CONFIG        linux-gcc, linux-intel  linux-intel  Select the GCC or Intel compiler.
ISA           avx, sse                avx          Determines which ISA extension is used for macro definitions. This is not the architecture the compiler generates code for.
OPENMP        on, off                 on           OpenMP, i.e. threading support.
STATISTICS    on, off                 off          View statistics, such as density, during simulation.
VERIFICATION  on, off                 off          Turn verification on/off.
VTK_OUTPUT    on, off                 off          Enable/disable VTK file output.
+
+
+
+

2   Invocation

+

Running the binary prints, along with the GPL licence header, a line like the following:

+
+LBM Benchmark Kernels 0.1, compiled Jul 5 2017 21:59:22, type: verification
+

if verification was enabled during compilation, or

+
+LBM Benchmark Kernels 0.1, compiled Jul 5 2017 21:59:22, type: benchmark
+

if verification was disabled during compilation.

+
+

2.1   Command Line Parameters

+

Running the binary with -h lists all available parameters:

+
+Usage:
+./lbmbenchk -list
+./lbmbenchk
+    [-dims XxYyZ] [-geometry box|channel|pipe|blocks[-<block size>]] [-iterations <iterations>] [-lattice-dump-ascii]
+    [-rho-in <density>] [-rho-out <density] [-omega <omega>] [-kernel <kernel>]
+    [-periodic-x]
+    [-t <number of threads>]
+    [-pin core{,core}*]
+    [-verify]
+    -- <kernel specific parameters>
+
+-list           List available kernels.
+
+-dims XxYxZ     Specify geometry dimensions.
+
+-geometry blocks-<block size>
+                Geometetry with blocks of size <block size> regularily layout out.
+
+

If an option is specified multiple times, the last one overrides the previous ones. This also holds for -verify, which sets geometry dimensions, iterations, etc.; these can be overridden afterwards, e.g.:

+
+$ bin/lbmbenchk-linux-intel-release -verify -dims 32x32x32
+
+

Kernel-specific parameters can be obtained by selecting the specific kernel and passing -h as a parameter:

+
+$ bin/lbmbenchk-linux-intel-release -kernel <kernel> -- -h
+...
+Kernel parameters:
+[-blk <n>] [-blk-[xyz] <n>]
+
+

A list of all available kernels can be obtained via -list:

+
+$ ../bin/lbmbenchk-linux-gcc-debug -list
+Lattice Boltzmann Benchmark Kernels (LbmBenchKernels) Copyright (C) 2016, 2017 LSS, RRZE
+This program comes with ABSOLUTELY NO WARRANTY; for details see LICENSE.
+This is free software, and you are welcome to redistribute it under certain conditions.
+
+LBM Benchmark Kernels 0.1, compiled Jul  5 2017 21:59:22, type: verification
+Available kernels to benchmark:
+   list-aa-pv-soa
+   list-aa-ria-soa
+   list-aa-soa
+   list-aa-aos
+   list-pull-split-nt-1s-soa
+   list-pull-split-nt-2s-soa
+   list-push-soa
+   list-push-aos
+   list-pull-soa
+   list-pull-aos
+   push-soa
+   push-aos
+   pull-soa
+   pull-aos
+   blk-push-soa
+   blk-push-aos
+   blk-pull-soa
+   blk-pull-aos
+
+
+
+
+
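The kernel names printed by -list can be passed back to -kernel one at a time. The following is a minimal sketch under stated assumptions: the helper name kernel_commands is hypothetical, and it only prints the command lines instead of running them, so that they can be inspected or extended with further parameters first.

```shell
# Hypothetical helper: print one command line per kernel name.
# Usage: kernel_commands <binary> <kernel> [<kernel> ...]
kernel_commands() {
    binary="$1"
    shift
    for kernel in "$@"; do
        printf '%s -kernel %s\n' "$binary" "$kernel"
    done
}

# Example with two kernel names taken from the -list output above.
kernel_commands bin/lbmbenchk-linux-intel-release push-soa pull-soa
```

Printing the commands first keeps the sketch free of assumptions about further parameters; the output can be piped to sh once the command lines look right.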

3   Benchmarking

+

Correct benchmarking is a nontrivial task. Whenever benchmark results are to be produced, make sure the binary was compiled with:

+
  • BENCHMARK=on,
  • BUILD=release,
  • the correct ISA for macros, selected via ISA, and
  • TARCH set to the architecture the compiler should generate code for.
+
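The build settings listed above can be combined into a single make invocation. The following is a minimal sketch: the helper name lbm_bench_make_cmd is hypothetical, and it only prints the command line. The TARCH value is passed through unchanged because its valid values depend on the chosen compiler and are not specified here.

```shell
# Hypothetical helper: print the make command for a benchmarking build.
# Usage: lbm_bench_make_cmd <config> <isa> [<tarch>]
lbm_bench_make_cmd() {
    config="$1"
    isa="$2"
    tarch="${3:-}"   # optional, compiler-specific value (assumption: caller knows it)
    cmd="make BENCHMARK=on BUILD=release CONFIG=$config ISA=$isa"
    if [ -n "$tarch" ]; then
        cmd="$cmd TARCH=$tarch"
    fi
    printf '%s\n' "$cmd"
}

# Example: GCC build using the AVX macro definitions, TARCH left at its default.
lbm_bench_make_cmd linux-gcc avx
```

Run the printed command from the src subdirectory, as described in the compilation section.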

During benchmarking, pinning should be used via the -pin parameter. Running a benchmark with 10 threads and pinning them to the first 10 cores works like this:

+
+$ bin/lbmbenchk-linux-intel-release ... -t 10 -pin $(seq -s , 0 9)
+
+
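When the cores to pin to are not simply 0 to N-1, the comma-separated list expected by -pin can be generated with a small loop. The following is a minimal sketch: the helper name pin_list and its stride argument are hypothetical, and whether a stride is appropriate (for example to skip sibling hardware threads) depends on how the system numbers its cores, which is not covered here.

```shell
# Hypothetical helper: print a comma-separated core list for -pin.
# Usage: pin_list <first core> <number of cores> [<stride>]
pin_list() {
    first="$1"
    count="$2"
    stride="${3:-1}"
    list=""
    i=0
    while [ "$i" -lt "$count" ]; do
        core=$((first + i * stride))
        # Append with a separating comma unless the list is still empty.
        list="${list:+$list,}$core"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Same list as $(seq -s , 0 9) in the example above.
pin_list 0 10
```

The result can be used in place of the seq call, e.g. -t 10 -pin "$(pin_list 0 10)".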

Things the binary does not check or control:

+
  • Transparent huge pages: when allocating memory, small 4 KiB pages might be replaced with larger ones. This is generally beneficial, but whether it actually happens depends on the system settings.
  • CPU/core frequency: for reproducible results the frequency of all cores should be fixed.
  • NUMA placement policy: the benchmark assumes a first-touch policy, which means memory is placed in the NUMA domain the touching core belongs to. If a different policy is in place or the NUMA domain to be used is already full, memory might be allocated in a remote domain. Accesses to remote domains typically have higher latency and lower bandwidth.
  • System load: interference from other applications, especially on desktop systems, should be avoided.
  • Padding: most kernels do not take care of padding against cache or TLB thrashing. Even if the number of (fluid) nodes suggests everything is fine, problems might still occur through parallelization.
  • CPU dispatcher function: the compiler might add different versions of a function for different ISA extensions. Make sure the code you think is executed is actually the code that is executed.
+
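Some of the settings listed above can be inspected before a run. The following is a minimal sketch for Linux, not part of the benchmark: the helper names active_choice and report_setting are hypothetical, and the sysfs paths are the usual locations for the transparent huge page mode and the cpufreq governor, which may be absent depending on kernel configuration and driver.

```shell
# Hypothetical helper: extract the bracketed (active) entry from a
# sysfs choice line such as "always [madvise] never" read from stdin.
active_choice() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Hypothetical helper: print "<label>: <file content>" or mark the
# setting as unavailable if the file cannot be read.
report_setting() {
    label="$1"
    path="$2"
    if [ -r "$path" ]; then
        printf '%s: %s\n' "$label" "$(cat "$path")"
    else
        printf '%s: unavailable\n' "$label"
    fi
}

# Assumed sysfs locations; they may not exist on every system.
thp=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp" ]; then
    printf 'THP mode: %s\n' "$(active_choice < "$thp")"
fi
report_setting 'cpu0 governor' /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```

The script only reports the settings. Changing them usually requires root privileges and is system specific.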
+
+

4   Acknowledgements

+

This work was funded by BMBF, grant no. 01IH15003A (project SKAMPY).

+

This work was funded by KONWIHR project OMI4PAPS.

+

Document was generated at 2017-10-26 09:43.

+
+
+ + -- 2.25.1