
the benefit of multithreading depends heavily on two factors:
- whether the db is in the filesystem cache or needs real disk I/O
- whether you use Java or pure C

if there is real disk I/O, then multithreading gives a big
performance enhancement: 8 threads outperform a single one
by a factor of 4.2 (C) or 3.3 (Java) respectively,
and C outperforms Java only by a factor of
1.3 (single-threaded) or 1.6 (8 threads).

if there is sufficient memory available to cache the whole db,
so that the process is mostly CPU bound,
then 8 threads give only a 9% throughput advantage over one in C,
and an 18% performance degradation (!) in Java.
C outperforms Java by a factor of 50 (yes, 5000%!)
with 8 threads or 38 single-threaded.
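
the factors quoted above follow from the rows-per-second figures of the 800MHz PIII runs recorded further down in these notes. as a cross-check, here is a small Python sketch that recomputes them. the table layout and the function names (thread_speedup, c_over_java) are made up for this illustration and are not part of the crashtest.

```python
# Throughput in rows per second, copied from the 800MHz PIII runs in these notes.
# "cached" = lots of memory available, "disk" = nearly all memory locked down.
ROWS_PER_SEC = {
    ("C", "cached"): {1: 75544, 2: 79600, 4: 81916, 8: 82384},
    ("C", "disk"): {1: 288, 2: 467, 4: 624, 8: 1199},
    ("Java", "cached"): {1: 2013, 2: 1836, 4: 1785, 8: 1645},
    ("Java", "disk"): {1: 225, 2: 380, 4: 558, 8: 738},
}


def thread_speedup(lang, mode, threads=8):
    """Throughput with `threads` threads relative to a single thread."""
    runs = ROWS_PER_SEC[(lang, mode)]
    return runs[threads] / runs[1]


def c_over_java(mode, threads):
    """How many times faster C is than Java at the same thread count."""
    return ROWS_PER_SEC[("C", mode)][threads] / ROWS_PER_SEC[("Java", mode)][threads]


if __name__ == "__main__":
    for mode in ("disk", "cached"):
        for lang in ("C", "Java"):
            print(f"{lang:4} {mode:6} 8 threads vs 1: {thread_speedup(lang, mode):.2f}x")
        for threads in (1, 8):
            print(f"C/Java {mode:6} {threads} thread(s): {c_over_java(mode, threads):.1f}x")
```

a speedup below 1.0 (Java, cached) is the degradation mentioned above.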


======

conclusion:
For a mostly I/O-bound server, you might well use threaded Java:
it is only about 40% slower than multithreaded C,
but still beats single-threaded C by a factor of 2.5.

If, on the other hand, you want to go for real performance,
get the memory needed and use C.
As long as you are not too concerned with fairness and there
is no risk of overly long operations, it may as well be
single-threaded.


======

the following numbers were derived without any thread interlocking (mutex)
on a single processor
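
to make the setup concrete, here is a rough Python sketch of how a rows-per-second measurement with N threads, with or without a mutex around each read, could be structured. this is not the crashtest code: run_benchmark, read_row and the striped split of rows across threads are assumptions made for the illustration. because of Python's global interpreter lock it only shows the shape of such a harness and will not reproduce the numbers below.

```python
import threading
import time


def read_row(row):
    # Hypothetical stand-in for fetching one record from the db.
    return len(row)


def run_benchmark(rows, n_threads, use_mutex=False):
    """Read every row once, split across n_threads.

    Returns (rows_read, rows_per_second).
    """
    lock = threading.Lock() if use_mutex else None
    counts = [0] * n_threads  # one slot per thread, so no shared counter

    def worker(idx):
        # Thread idx takes every n_threads-th row, starting at its own index.
        for row in rows[idx::n_threads]:
            if lock is not None:
                with lock:  # the "thread interlocking" case
                    read_row(row)
            else:
                read_row(row)
            counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    total = sum(counts)
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__":
    rows = ["x" * 64] * 58880  # same row count as the db used in these notes
    for n in (8, 2, 1, 4):
        total, rate = run_benchmark(rows, n)
        print(f"{n} threads {rate:.0f} rows per sec")
```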


some numbers from the C multithreaded crashtest (make crash)
on the unesb db for prefix "Z" running on an 800MHz PIII with 1/4G RAM
under Linux 2.4.13

409 terms
4205 postings max mfn 58880

* with lots of memory available for caching:
$
sequential read 58880 rows in 0.618 seconds 95275 rows per sec

8 threads 82384 rows per sec
2 threads 79600 rows per sec
1 threads 75544 rows per sec
4 threads 81916 rows per sec

real 0m3.040s
user 0m1.860s
sys 0m0.920s
$

* with nearly all memory locked down by root:
$
sequential read 58880 rows in 6.720 seconds 8761 rows per sec

8 threads 1199 rows per sec
2 threads 467 rows per sec
1 threads 288 rows per sec
4 threads 624 rows per sec

real 5m6.422s
user 0m1.940s
sys 0m2.230s
$



some numbers from the Java multithreaded crashtest (make jcrash)
on the same db and machine

409 terms
4205 postings max mfn 58880

* with lots of memory available for caching:
$
sequential read 58880 rows in 30.508 seconds 1929 rows per sec

8 threads 1645 rows per sec
2 threads 1836 rows per sec
1 threads 2013 rows per sec
4 threads 1785 rows per sec
129.00user 7.40system 2:20.80elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (4978major+7237minor)pagefaults 0swaps
$

* with nearly all memory locked down by root:
$
sequential read 58880 rows in 58.827 seconds 1000 rows per sec

8 threads 738 rows per sec
2 threads 380 rows per sec
1 threads 225 rows per sec
4 threads 558 rows per sec
123.68user 3.87system 8:16.98elapsed 25%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5286major+7366minor)pagefaults 0swaps
$

* on a 360MHz mobile PII with 40MB and Linux 2.2.12,
numbers differ even more:
without disk activity,
a single thread gives about 50% more throughput than 8;
with lots of disk I/O, on the other hand,
8 threads nearly quadruple the single-thread throughput

* the mobile, now with a 2.4.18 kernel:
$
enough memory:
8 threads 54558 rows per sec
4 threads 49822 rows per sec
2 threads 43575 rows per sec
1 threads 37656 rows per sec

with mutex:
8 threads 49525 rows per sec
4 threads 45174 rows per sec
2 threads 41859 rows per sec
1 threads 36262 rows per sec

no memory (mutex):
8 threads 696 rows per sec
4 threads 367 rows per sec
2 threads 258 rows per sec
1 threads 150 rows per sec
$
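
the cost of the mutex in the "enough memory" case can be read off the two blocks above. a small Python sketch of that arithmetic follows; it assumes the first block is the run without mutex, and the name mutex_overhead is made up for this illustration.

```python
# Rows per second on the mobile PII under Linux 2.4.18 with enough memory,
# copied from the run above. Assumption: the first block is the no-mutex run.
NO_MUTEX = {1: 37656, 2: 43575, 4: 49822, 8: 54558}
WITH_MUTEX = {1: 36262, 2: 41859, 4: 45174, 8: 49525}


def mutex_overhead(threads):
    """Fraction of throughput lost when the mutex is enabled."""
    return 1.0 - WITH_MUTEX[threads] / NO_MUTEX[threads]


if __name__ == "__main__":
    for threads in sorted(NO_MUTEX):
        pct = mutex_overhead(threads) * 100
        print(f"{threads} thread(s): {pct:.1f}% slower with mutex")
```

on these figures the mutex costs roughly 4% with 1 or 2 threads and roughly 9% with 4 or 8.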