-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
60 lines (49 loc) · 2.64 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
INSTRUCTIONS TO BUILD TASKPROF WITH AUTOMATIC REGION IDENTIFICATION,DIFFERENTIAL ANALYSIS,SCHEDULING OVERHEAD
-------------------------------------------------------------------------------------------------------------
OFFLINE ANALYSIS
----------------
1. First build original TBB library without modification for TaskProf
> source build_orig_tbb.sh
2. Install TaskProf in a separate shell session.
> source build_tprof.sh
TaskProf generates 3 profiles
1) regions_automatic.csv - shows the regions that need to optimized to improve parallelism.
2) step_diff_file.csv - The differential profile
3) sc_ov_profile.csv - The scheduling overhead profile.
Try false sharing and scheduling overhead test programs in tests/tbb_tests/false_sharing and tests/tbb_tests/sched_overhead respectively.
PROCESS FOR SELECTING PERFORMANCE COUNTERS
------------------------------------------
TaskProf can use a number of hardware performance counter types to profile applications. For instance, TaskProf can use hardware cycles, instructions, page faults, etc to measure work performed in a program.
The programmer can control the profiling by specifying the type of hardware performance counter to use while profiling.
The program counter type can be specified using TD_Activate function call, which is called at the beginning of the program.
TD_Activate(__FILE__,__LINE__,<counter_type>);
<counter_type> accepts integer values. If <counter_type> is not provided, TaskProf uses 0 which corresponds to harware cycles.
The following are the integer <counter_type> values for hardware performance counter types currently supported by TaskProf.
CYCLES = 0
INSTRUCTIONS = 1
LOCAL_HITM = 2
REMOTE_HITM = 3
LOCAL_DRAM = 4
REMOTE_DRAM = 5
PAGE_FAULTS = 6
LLC_MISS = 7
L1_M_REPLACE = 8
L1_EVICTION = 9
L1_REPLACEMENT = 10
FP_DIVIDE = 11
ONLINE ANALYSIS
---------------
Install TaskProf online profiler in a separate shell session.
> source build_online.sh
Build the input program.
We provide a Python script that automates the what-if analysis and generates the profiles.
> cd <path_to_TaskProf>/src/ptprof_lib/online_profiling/scripts
Use the help option with causal_profiler.py script to specify command line options.
> python causal_profiler.py --help
-p -> Full path to program executable (required)
-b -> name of executable binary (required)
-a -> arguments to the program
-f -> optimization factor
-t -> threshold parallelism
The python script after performing analysis generates the parallelism profile in parallelism_profile.csv,
what-if profile in causal_profile.csv and differential profile in step_diff_file.csv.