Several concurrency issues arise whenever a database is accessed
simultaneously by multiple processes or threads
(lightweight processes sharing all their system
resources like memory and open files, including file positions).


* Multiprocess (MP) environments

MP environments are distinguished according to
- whether all processes are readonly or there are one or more writers
- whether processes are single-shot
	(i.e. open, work, exit like CGI scripts, including PHP in CGI mode)
	or resident (like the PHP module living in Apache 1.x/Unix children).
	Note that a PHP module in a multithreaded server is not multiprocess.

Within a readonly environment, there is not much of a problem.
Each process may read and cache file contents independently of the others.
So the rest of this section discusses read/write access.

In the presence of writers, there are some problems:
- at least the actual writing accesses must be *strictly* mutually exclusive
- it must be ensured that readers do not use old cached data
	(or at least use it in a well controlled manner)
- it must be ensured that changed data is written and read in a consistent way

These problems are addressed in reverse order:
- the data structures used in OpenIsis are designed so
	that a consistent way of reading and writing can be defined.
	For example, the XRF pointer to a new or changed record is written
	after the record, so readers will not see an invalid pointer.
	However, this will work only where the operating system guarantees
	such semantics for file reads/writes. This in turn may depend on the
	filesystem and will not hold for most network file systems.
- learning which cached blocks are outdated is not possible
	with reasonable effort. One possible approach is to not cache at all,
	i.e. resort to the operating system cache, which hopefully is
	properly synchronized (see above).
- a simple and well supported means of mutual exclusion is the use
	of exclusive file locks. flock-(BSD-)style locks are sufficient;
	we do not need locking of file regions nor locking over NFS,
	which is not reliable anyway (and it's not much better with SMB).
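
The "record first, pointer second" ordering from the first item can be
sketched as follows. This is a minimal illustration in Python, not the
OpenIsis implementation: the 8-byte pointer slots, the length-prefixed
record layout and the names write_record/read_record are assumptions made
up for this example. As stated above, the ordering only helps on a local
file system that makes completed writes visible to other processes.

```python
import os
import struct

PTR = struct.Struct("<q")   # hypothetical 8-byte pointer slot; 0 means "no record"
LEN = struct.Struct("<I")   # hypothetical 4-byte length prefix of a record

def write_record(data_fd, ptr_fd, slot, payload):
    """Append the record first, then publish its position in the pointer file."""
    offset = os.lseek(data_fd, 0, os.SEEK_END)
    os.pwrite(data_fd, LEN.pack(len(payload)) + payload, offset)
    os.fsync(data_fd)                       # record is complete before it is published
    os.pwrite(ptr_fd, PTR.pack(offset + 1), slot * PTR.size)  # +1 keeps 0 as "empty"
    os.fsync(ptr_fd)

def read_record(data_fd, ptr_fd, slot):
    """Follow the pointer; the reader sees either no pointer or a complete record."""
    raw = os.pread(ptr_fd, PTR.size, slot * PTR.size)
    if len(raw) < PTR.size:
        return None                         # slot lies beyond the end of the file
    (pointer,) = PTR.unpack(raw)
    if pointer == 0:
        return None                         # slot exists but was never written
    offset = pointer - 1
    (length,) = LEN.unpack(os.pread(data_fd, LEN.size, offset))
    return os.pread(data_fd, length, offset + LEN.size)
```

A changed record is simply appended again and the slot is overwritten, so a
reader holding the old pointer still finds the old, complete record.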

The easiest and most reliable solution is to completely encapsulate
any access (from database open to close) within an exclusive lock.
That way there clearly is no inconsistent cache.

- for single-shot processes, this solution is reliable and does
	not incur too much cost: exit will release any lock and the
	processes would not benefit from caching anyway.
- for resident processes, the need to open and close on every request
	is more of a disadvantage, and it is a problem how to guarantee
	that any lock is released after processing.
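
Encapsulating the whole open-to-close span in an exclusive flock could look
roughly like this. The sketch uses Python's fcntl.flock; the helper name
exclusive_db and the idea of locking one file per database are assumptions
for illustration, not the OpenIsis API.

```python
import fcntl
import os
from contextlib import contextmanager

@contextmanager
def exclusive_db(path):
    """Hold an exclusive flock on `path` from open to close."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks while another process holds the lock
        yield fd                         # all database access happens in here
    finally:
        os.close(fd)                     # closing the descriptor releases the lock
```

A resident process would wrap each request in "with exclusive_db(path) as fd:",
so the lock is released even if request processing raises an error. A
single-shot process gets the same effect for free when it exits.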

Possible performance enhancements that might be implemented one day:
- readers could use shared locks.
	However, this carries a risk of writer starvation.
- if all data access methods are carefully checked and a reasonable
	local file system is used, non-caching readers could get by without
	locking at all.
- another, quite complex approach is to share cache memory between processes,
	similar to ORACLE's SGA. This would also help in guaranteeing consistent
	read-write sequences.
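
The shared-lock variant from the first item, and the starvation risk that
comes with it, can be illustrated with flock's LOCK_SH and LOCK_EX modes.
This is a sketch only; open_locked and try_write_lock are hypothetical
helper names.

```python
import fcntl
import os

def open_locked(path, writer=False):
    """Open `path` and take a shared (reader) or exclusive (writer) flock."""
    fd = os.open(path, os.O_RDWR if writer else os.O_RDONLY)
    fcntl.flock(fd, fcntl.LOCK_EX if writer else fcntl.LOCK_SH)
    return fd

def try_write_lock(path):
    """Non-blocking writer attempt: a descriptor, or None while the file is locked."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Any number of readers may hold the shared lock at once, but the writer gets
in only when the last of them has closed its descriptor. With a steady
stream of overlapping readers that moment may never come.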


To summarize the multiprocess issues:
- readonly access is fine
- DO NOT TRY TO WRITE ON A NETWORK DRIVE
	(or at least make sure it is accessed only by one host at a time)
- the best solution for multiple processes is to contact
	a server for writing instead of doing it themselves
- for PHP as module in read/write mode,
	we have to rely on register_shutdown_function to close any db


* Multithreaded (MT) environments

OpenIsis is designed to run multithreaded.
Multithreading is used only within some sort of server
(like a database, web or servlet engine) in order to run multiple
requests from multiple clients in parallel.

MT environments are distinguished according to
- whether they support active dispatching of requests to threads
- whether they support parallel IO.
	Besides the basic calls for parallel IO (like pread, pwrite,
	or ReadFileEx "overlapped" IO in Win speak, which is missing on Win 9x/Me),
	this also requires condition variables (like pthread_cond_wait/broadcast,
	which are rather difficult to emulate on Win 9x/Me in the absence
	of SignalObjectAndWait) and should include memory mapping
	(like mmap, msync, which works poorly on Win 9x/Me).
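
Why positioned reads matter for parallel IO: threads share the file
position of a descriptor, so a seek followed by a read in one thread can be
disturbed by another thread. pread takes the offset as an argument and
leaves the shared position alone. A small Python sketch using os.pread;
the block size and function names are assumptions for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 512   # hypothetical block size

def read_block(fd, n):
    """Read block n without touching the descriptor's shared file position."""
    return os.pread(fd, BLOCK, n * BLOCK)

def read_blocks_parallel(fd, numbers, workers=4):
    """Fetch several blocks concurrently from one shared descriptor."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: read_block(fd, n), numbers))
```

No lock is needed around the reads themselves; the results come back in the
order of the requested block numbers.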

All threads of a single process share the same cache,
so dirty caches are not an issue here.
Synchronization is cheaper and easier to use.

However, this great performance benefit comes at a price:
Apart from a few utilities without any side effects
(i.e. proper FORTRAN functions),
not only access to the database and its cache,
but any access to system resources like files or the memory
heap must be carefully checked for possible collisions and,
when in doubt, must be synchronized, even in a readonly environment.


* Session synchronization

Our strategy is to share as little as possible between threads
and to protect all that must be shared (basically the database)
by a single lock. The means to give each thread its own,
unshared environment is the SESSION.


A session represents a single client accessing the database.
(At least this is the idea, but it depends on the dispatcher's abilities,
see below.) The session may hold result sets from previous queries,
some authentication info from the client and other temporary data.
In a standalone environment like the Tk GUI not connected to a server,
there is only one session, the "default session" (session id 0).
In a database or web server, however, there may exist several sessions
on behalf of several users at the same time.

Requests from each session are serialized by some dispatcher,
so that each session is accessed by at most one thread at a time.
Consequently, in an environment with only one session,
there also is only one thread used to access the database.


To summarize, from a session's point of view, the world is single threaded.
Each session has a private memory heap and even its own IO stream buffers
stdin, stdout and stderr (as streams 0,1,2)
and need not care about how it is connected and to whom.
Since the dispatcher guarantees that no session is accessed
by more than one thread at a time, dynamic memory, streaming
IO and other session resources can be used without further interlocking.


* Dispatching requests and locking sessions

Due to the dual nature of a session as both representing a user and serving
as object of synchronization, dispatching requests has two tasks:
- ensuring serialized (single-threaded) access
- finding the session bound to a given user

While the former is crucial in MT environments, the latter is used only if
- the environment identifies a user session in the first place
- the session object's ability to keep state (like result sets) is used

We distinguish two cases of when and how dispatching is done:
- passive/late dispatching:
	In most environments we have to get the session from within a thread
	dedicated to that request. The dispatcher is implemented as a call,
	accessing a session pool protected by some mutex.
- active/early dispatching:
	Within the database server, the proper session can be looked up
	before a thread is allocated for a request.
	Here, the dispatcher is an active component, probably running in a
	thread of its own (thus not requiring a mutex on the session pool).
	That way several requests on the same session may be queued
	(or discarded) without consuming any thread resources.
	This should yield better performance under high load and somewhat
	better protection against denial of service.
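
A passive dispatcher covering both tasks (finding the session for a user
and serializing access to it) might be sketched like this in Python. The
classes Session and PassiveDispatcher and the function handle_request are
hypothetical names for illustration; the actual OpenIsis interface is not
shown here.

```python
import threading

class Session:
    """Per-client state; touched by at most one thread at a time."""
    def __init__(self, sid):
        self.sid = sid
        self.busy = False
        self.results = {}   # e.g. result sets kept between requests

class PassiveDispatcher:
    """Session pool guarded by one mutex, with a condition for waiting threads."""
    def __init__(self):
        self._cond = threading.Condition()   # wraps the pool mutex
        self._pool = {}

    def acquire(self, sid):
        """Called from the request thread: look up the session and mark it busy."""
        with self._cond:
            if sid not in self._pool:
                self._pool[sid] = Session(sid)
            session = self._pool[sid]
            while session.busy:              # another request runs in this session
                self._cond.wait()
            session.busy = True
            return session

    def release(self, session):
        with self._cond:
            session.busy = False
            self._cond.notify_all()

def handle_request(dispatcher, sid, work):
    """Run one request inside its session and always give the session back."""
    session = dispatcher.acquire(sid)
    try:
        return work(session)
    finally:
        dispatcher.release(session)
```

Requests for different session ids proceed in parallel, while a second
request for a busy session blocks inside acquire and thereby occupies its
thread. That is the cost an active dispatcher avoids by queueing requests
before a thread is allocated.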

There are also two different situations with regard to the scope
of synchronization:
- per request:
	The session is "locked" (somehow marked as busy) until processing
	the request has finished. Locking is done by the dispatcher,
	and unlocking must be performed on exit,
	e.g. using register_shutdown_function in PHP.
	For the passive dispatcher, if some user session id is used to locate
	an existing session and there is already a request executing in this session,
	the current thread has to wait.
- per use:
	In a high level language, e.g. Java, basic synchronization is achieved
	by having a Java object representing the session and marking the
	appropriate methods as synchronized.
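
The "per use" scheme can be imitated in a language without a synchronized
keyword by giving each session object its own lock and taking it in every
method. A minimal Python sketch; the decorator synchronized and the class
SyncSession with its store_result method are made up for this example.

```python
import threading
from functools import wraps

def synchronized(method):
    """Run `method` while holding the object's own lock, like a synchronized method."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper

class SyncSession:
    """Session object whose state is only touched under its own lock."""
    def __init__(self):
        self._lock = threading.RLock()   # reentrant, as Java monitors are
        self.results = {}

    @synchronized
    def store_result(self, rows):
        """Keep a result set and return the id under which it was stored."""
        rid = len(self.results)
        self.results[rid] = rows
        return rid
```

Here the lock is held only for the duration of a single method call, not for
a whole request, so state may change between two calls made by one request.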

Note that unless we promise that sessions actually will remember some state,
a simple dispatcher may decide to operate on a session pool of size 1 (one),
containing only the default session, thus ruling out any parallel operation.


* Configuration synchronization

Operations that change the overall system state, like opening a database,
are allowed for session 0 only. Consequently, IO (logging) and memory
associated with such operations are bound to the default session.
Databases may be marked for exclusive use by session 0,
for example during a lengthy batch index update
or in order to perform structural changes like modifying the FDT.


On the other hand, the worker sessions need some confidence that
the configuration is not going to change while they are in the midst of
processing a request. Therefore, any database that is accessed
by a session is marked as used by the session and marked as unused
when the session is released. This protects the database from being
closed or put in exclusive ("single-user") mode and thus also
the configuration from being changed.
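
The used/unused marking can be pictured as a small bookkeeping structure
per database. The following Python sketch is an assumption about how such
marking could work, not the OpenIsis data structure; in particular, whether
a worker session is refused or made to wait while session 0 holds a
database exclusively is not specified above, and the sketch simply refuses.

```python
import threading

class Database:
    """Tracks which sessions use the database; guards the switch to exclusive mode."""
    def __init__(self, name):
        self.name = name
        self._mutex = threading.Lock()
        self._users = set()       # ids of sessions currently using this database
        self.exclusive = False    # True while session 0 has single-user access

    def mark_used(self, sid):
        with self._mutex:
            if self.exclusive and sid != 0:
                return False      # worker sessions are refused in exclusive mode
            self._users.add(sid)
            return True

    def mark_unused(self, sid):
        """Called when the session is released."""
        with self._mutex:
            self._users.discard(sid)

    def set_exclusive(self, sid, on=True):
        """Only session 0 may switch modes, and only while no worker uses the db."""
        with self._mutex:
            if sid != 0 or (on and self._users - {0}):
                return False
            self.exclusive = on
            return True
```

As long as any worker session is marked as a user, set_exclusive fails, so
the configuration cannot change underneath a running request.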


Note that a request for the database need not be the same as
the original user request. For a database server, the request for
a database operation is all that is known, thus clients issuing
several remote requests get no guarantee that the DB is unchanged
between database accesses (regardless of the environment they are running in).
When accessing a local database, the scope of locking depends on
the environment as described above. An explicit lock on a local
database might be provided for Java (to be unlocked in a finally clause).


However, the situation is not as bad as it might look, since there are
complex database accesses, bundling several operations into one.
A standard example is to perform a query and not only obtain a result set,
but also the contents of the first n records, like with a Z39.50
piggybacked "present". For remote database access,
this is the most efficient operation mode anyway.


* Database synchronization

All database resources like master file and index have associated
in-memory structures like a cache. These structures must not be
accessed by more than one thread at once and are therefore protected
by a mutex (some "mutual exclusion" object like a critical section).

Again, there are two modes to distinguish:
- basic mutex
	The database (actually all databases) is locked when starting an
	access like reading or writing a record or searching an index,
	and unlocked when done.
	Since there is not very much and especially no IO happening outside
	the database access, it doesn't make much sense to allow parallel
	access in the first place and we will rather resort to a
	one-session environment.
- parallel IO
	This is the interesting case to be discussed now.

Parallel IO aims at using the time one thread has to wait for
an IO operation to complete in order to let another thread
use the CPU and possibly start additional IOs.
Therefore, the mutex is released during IO.

In certain situations, like thread A wishing to access a cache page
being read by another thread B, A has to wait on a condition
which will be signaled by B after returning from the IO.
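
The interplay of mutex, IO and condition can be sketched with a tiny page
cache. This is an illustration in Python using threading.Condition and
os.pread; the class PageCache and its fields are assumptions for this
example and say nothing about the actual OpenIsis cache layout.

```python
import os
import threading

class PageCache:
    """One mutex guards the cache; it is released while a page is read from disk."""
    def __init__(self, fd, page_size=4096):
        self.fd = fd
        self.page_size = page_size
        self._cond = threading.Condition()   # mutex plus associated condition
        self._pages = {}                     # page number -> bytes
        self._loading = set()                # pages some thread is reading right now
        self.reads = 0                       # number of actual disk reads

    def get(self, n):
        with self._cond:
            while n in self._loading:        # thread B is reading this page: wait
                self._cond.wait()
            if n in self._pages:
                return self._pages[n]
            self._loading.add(n)             # claim the read, then drop the mutex
        data = None
        try:
            data = os.pread(self.fd, self.page_size, n * self.page_size)
        finally:
            with self._cond:                 # back from IO: retake the mutex
                self._loading.discard(n)
                if data is not None:
                    self._pages[n] = data
                    self.reads += 1
                self._cond.notify_all()      # signal threads waiting for this page
        return data
```

While one thread is blocked in pread, other threads can enter get for other
pages and start their own IO. Threads asking for the same page wait on the
condition instead of reading it a second time.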


The mutex and condition are implemented by an OpenIsisLockFunc,
which may map them to a pthread mutex and associated condition variable.
This is very similar to the concept of a monitor as implemented
by Java's synchronized blocks.


The mutual exclusion could be made even more fine-grained by using
one mutex per database and another one for global structures.
With parallel IO, however, the mutex is locked only during CPU use and
released during IO, so this, while adding overhead,
would hardly increase concurrency on a single CPU system.
On a Windows box capable of basic mutex only, on the other hand,
you would probably not access multiple databases anyway.


* Summary by environments

The following gives an overview of simple approaches
to be used in basic implementations:
- PHP/Apache1.x/Unix, any CGI:
	Multiple processes use mutual exclusion based on file locking.
	The database must be closed after each request.
	Actually, file locking is always performed on database open/close,
	without asking whether there might be other processes.
- PHP/MT/Windows:
	Uses trivial dispatcher, requests fully synchronized on default session.
- PHP/MT/Apache2.0:
	May use real dispatcher, once the MT-Apache is stable.
- Java:
	May use non-trivial dispatcher, if it provides LockFunc.
- OpenIsis server:
	Uses active dispatcher /
	> Server multiplexer


* Notes on PHP

For various PHP run modes, see
> http://www.php.net/manual/en/features.persistent-connections.php

As of Feb. 2003, several extensions are
> http://www.php.net/manual/en/faq.obtaining.php listed
as being NOT thread-safe!


---
$Id: Concurrency.txt,v 1.6 2003/02/18 18:10:20 kripke Exp $