QEMU Driver Threading: The Rules
================================

.. contents::

This document describes how thread safety is ensured throughout
the QEMU driver. The criteria for this model are:

 - Objects must never be exclusively locked for any prolonged time
 - Code which sleeps must be able to time out after a suitable period
 - Code must be safe against dispatch of asynchronous events from the monitor

Basic locking primitives
------------------------

There are a number of locks on various objects:

  ``virQEMUDriver``

    The ``qemu_conf.h`` file has inline comments describing the locking
    needs for each field. Any field marked immutable or self-locking
    can be accessed without the driver lock. For other fields there
    are typically helper APIs in ``qemu_conf.c`` that provide serialized
    access to the data. No code outside ``qemu_conf.c`` should ever
    acquire this lock.

  ``virDomainObj``

    Will be locked and the reference counter will be increased after calling
    any of the ``virDomainObjListFindBy{ID,Name,UUID}`` methods. The preferred way
    of decrementing the reference counter and unlocking the domain is using the
    ``virDomainObjEndAPI()`` function.

    The lock must be held when changing/reading any variable in the
    ``virDomainObj``.

    This lock must not be held for anything which sleeps/waits (i.e. monitor
    commands).


  ``qemuMonitorPrivatePtr`` job conditions

    Since ``virDomainObj`` lock must not be held during sleeps, the job
    conditions provide additional protection for code making updates.

    The QEMU driver uses three kinds of job conditions: asynchronous, agent
    and normal.

    The asynchronous job condition is used for long-running jobs (such as
    migration) that consist of several monitor commands, where it is
    desirable to allow calling a limited set of other monitor commands
    while such a job is running.  This allows clients to, e.g., query
    statistical data, cancel the job, or change parameters of the job.

    The normal job condition is used by all other jobs to get exclusive
    access to the monitor and also by every monitor command issued by an
    asynchronous job.  When acquiring the normal job condition, the job must
    specify what kind of action it is about to take, and this is checked
    against the allowed set of jobs in case an asynchronous job is
    running.  If the job is incompatible with the current asynchronous job,
    it needs to wait until the asynchronous job ends and try to acquire
    the job again.

    The agent job condition is used when a thread wishes to talk to the QEMU
    agent monitor. It is possible to acquire just an agent job
    (``virDomainObjBeginAgentJob``), or only a normal job
    (``virDomainObjBeginJob``), but not both at the same time. Holding an
    agent job and a normal job would allow an unresponsive or malicious agent
    to block normal libvirt APIs and potentially result in a denial of
    service. Which type of job to grab depends on whether the caller wishes
    to communicate only with the agent socket or only with the QEMU monitor
    socket.

    Immediately after acquiring the ``virDomainObj`` lock, any method
    which intends to update state must acquire an asynchronous, normal or
    agent job. The ``virDomainObj`` lock is released while blocking on
    these condition variables.  Once the job condition is acquired, a
    method can safely release the ``virDomainObj`` lock whenever it hits
    a piece of code which may sleep/wait, and re-acquire it after the
    sleep/wait.  Whenever an asynchronous job wants to talk to the
    monitor, it needs to acquire a nested job (a special kind of normal
    job) to obtain exclusive access to the monitor.

    Since the ``virDomainObj`` lock was dropped while waiting for the
    job condition, it is possible that the domain is no longer active
    when the condition is finally obtained.  The monitor lock is only
    safe to grab after verifying that the domain is still active.


  ``qemuMonitor`` mutex

    Lock to be used when invoking any monitor command to ensure safety
    with respect to any asynchronous events that may be dispatched from
    the monitor. It should be acquired before running a command.

    The job condition *MUST* be held before acquiring the monitor lock.

    The ``virDomainObj`` lock *MUST* be held before acquiring the monitor
    lock.

    The ``virDomainObj`` lock *MUST* then be released when invoking the
    monitor command.
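
The compatibility check between a normal job and a running asynchronous
job, described above, can be pictured as a bitmask test. The following is
only a minimal sketch under stated assumptions, not the driver's actual
code: the enum ``sketch_job``, the macro ``SKETCH_JOB_MASK`` and the function
``sketch_job_allowed`` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical job types; the real driver's enum differs. */
enum sketch_job {
    SKETCH_JOB_NONE = 0,
    SKETCH_JOB_QUERY,
    SKETCH_JOB_MODIFY,
    SKETCH_JOB_ABORT,
};

/* One bit per real job type (SKETCH_JOB_NONE has no bit). */
#define SKETCH_JOB_MASK(job) (1u << ((job) - 1))

/*
 * Returns true when a normal job of type `job` may start right now.
 * `async_running` says whether an asynchronous job is active and
 * `allowed_mask` is the set of normal jobs that async job tolerates.
 */
static bool
sketch_job_allowed(bool async_running, unsigned int allowed_mask,
                   enum sketch_job job)
{
    assert(job != SKETCH_JOB_NONE);  /* a real job type must be requested */

    if (!async_running)
        return true;                 /* nothing to be compatible with */

    return (allowed_mask & SKETCH_JOB_MASK(job)) != 0;
}
```

In this sketch, a long-running job such as migration would set a mask
that permits query and abort jobs, so a modify job would have to wait
until the asynchronous job ends.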


Helper methods
--------------

To lock the ``virDomainObj``

  ``virObjectLock()``
    - Acquires the ``virDomainObj`` lock

  ``virObjectUnlock()``
    - Releases the ``virDomainObj`` lock


To acquire the normal job condition

  ``virDomainObjBeginJob()``
    - Waits until the job is compatible with current async job or no
      async job is running
    - Waits on the ``job.cond`` condition while ``job.active != 0``, using
      the ``virDomainObj`` mutex
    - Rechecks if the job is still compatible and repeats waiting if it
      isn't
    - Sets ``job.active`` to the job type

  ``virDomainObjEndJob()``
    - Sets ``job.active`` to 0
    - Signals on ``job.cond`` condition


To acquire the agent job condition

  ``virDomainObjBeginAgentJob()``
    - Waits until there is no other agent job set
    - Sets ``job.agentActive`` to the job type

  ``virDomainObjEndAgentJob()``
    - Sets ``job.agentActive`` to 0
    - Signals on ``job.cond`` condition


To acquire the asynchronous job condition

  ``virDomainObjBeginAsyncJob()``
    - Waits until no async job is running
    - Waits on the ``job.cond`` condition while ``job.active != 0``, using
      the ``virDomainObj`` mutex
    - Rechecks if any async job was started while waiting on ``job.cond``
      and repeats waiting in that case
    - Sets ``job.asyncJob`` to the asynchronous job type

  ``virDomainObjEndAsyncJob()``
    - Sets ``job.asyncJob`` to 0
    - Broadcasts on ``job.asyncCond`` condition


To acquire the QEMU monitor lock

  ``qemuDomainObjEnterMonitor()``
    - Acquires the ``qemuMonitorObj`` lock
    - Releases the ``virDomainObj`` lock

  ``qemuDomainObjExitMonitor()``
    - Releases the ``qemuMonitorObj`` lock
    - Acquires the ``virDomainObj`` lock

  These functions must not be used by an asynchronous job.
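
The lock hand-off performed by this enter/exit pair can be sketched with
two plain mutexes. This is an illustration only: ``struct sketch_vm``,
``sketch_enter_monitor`` and ``sketch_exit_monitor`` are hypothetical names,
and the job condition that must already be held is assumed rather than
modelled.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical pairing of the two locks involved in a monitor call. */
struct sketch_vm {
    pthread_mutex_t obj_lock;   /* stands in for the virDomainObj lock */
    pthread_mutex_t mon_lock;   /* stands in for the qemuMonitor mutex */
};

/* Caller holds obj_lock; on return the caller holds only mon_lock. */
static void
sketch_enter_monitor(struct sketch_vm *vm)
{
    assert(vm != NULL);
    pthread_mutex_lock(&vm->mon_lock);     /* take the monitor lock first */
    pthread_mutex_unlock(&vm->obj_lock);   /* then let go of the object */
}

/* Caller holds only mon_lock; on return the caller holds only obj_lock. */
static void
sketch_exit_monitor(struct sketch_vm *vm)
{
    assert(vm != NULL);
    pthread_mutex_unlock(&vm->mon_lock);
    pthread_mutex_lock(&vm->obj_lock);
}
```

Taking the monitor lock before releasing the object lock means there is
no window in which the caller holds neither, while the object lock is
still free for other threads during the potentially slow monitor command.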


To acquire the QEMU monitor lock as part of an asynchronous job

  ``qemuDomainObjEnterMonitorAsync()``
    - Validates that the right async job is still running
    - Acquires the ``qemuMonitorObj`` lock
    - Releases the ``virDomainObj`` lock
    - Validates that the VM is still active

  ``qemuDomainObjExitMonitor()``
    - Releases the ``qemuMonitorObj`` lock
    - Acquires the ``virDomainObj`` lock

  These functions are for use inside an asynchronous job; the caller
  must check for a return of -1 (VM not running, so nothing to exit).
  Helper functions may also call this with ``VIR_ASYNC_JOB_NONE`` when
  used from a sync job (such as when first starting a domain).


To keep a domain alive while waiting on a remote command

  ``qemuDomainObjEnterRemote()``
    - Releases the ``virDomainObj`` lock

  ``qemuDomainObjExitRemote()``
    - Acquires the ``virDomainObj`` lock


Design patterns
---------------

 * Accessing something directly to do with a ``virDomainObj``::

     virDomainObj *obj;

     obj = qemuDomObjFromDomain(dom);

     ...do work...

     virDomainObjEndAPI(&obj);


 * Updating something directly to do with a ``virDomainObj``::

     virDomainObj *obj;

     obj = qemuDomObjFromDomain(dom);

     virDomainObjBeginJob(obj, VIR_JOB_TYPE);

     ...do work...

     virDomainObjEndJob(obj);

     virDomainObjEndAPI(&obj);


 * Invoking a monitor command on a ``virDomainObj``::

     virDomainObj *obj;
     qemuDomainObjPrivate *priv;

     obj = qemuDomObjFromDomain(dom);
     priv = obj->privateData;

     virDomainObjBeginJob(obj, VIR_JOB_TYPE);

     ...do prep work...

     if (virDomainObjIsActive(obj)) {
         qemuDomainObjEnterMonitor(obj);
         qemuMonitorXXXX(priv->mon);
         qemuDomainObjExitMonitor(obj);
     }

     ...do final work...

     virDomainObjEndJob(obj);
     virDomainObjEndAPI(&obj);


 * Invoking an agent command on a ``virDomainObj``::

     virDomainObj *obj;
     qemuAgent *agent;

     obj = qemuDomObjFromDomain(dom);

     virDomainObjBeginAgentJob(obj, VIR_AGENT_JOB_TYPE);

     ...do prep work...

     if (!qemuDomainAgentAvailable(obj, true))
         goto cleanup;

     agent = qemuDomainObjEnterAgent(obj);
     qemuAgentXXXX(agent, ..);
     qemuDomainObjExitAgent(obj, agent);

     ...do final work...

     virDomainObjEndAgentJob(obj);
     virDomainObjEndAPI(&obj);


 * Running an asynchronous job::

     virDomainObj *obj;
     qemuDomainObjPrivate *priv;

     obj = qemuDomObjFromDomain(dom);

     virDomainObjBeginAsyncJob(obj, VIR_ASYNC_JOB_TYPE);
     qemuDomainObjSetAsyncJobMask(obj, allowedJobs);

     ...do prep work...

     if (qemuDomainObjEnterMonitorAsync(driver, obj,
                                        VIR_ASYNC_JOB_TYPE) < 0) {
         /* domain died in the meantime */
         goto error;
     }
     ...start qemu job...
     qemuDomainObjExitMonitor(obj);

     while (!finished) {
         if (qemuDomainObjEnterMonitorAsync(driver, obj,
                                            VIR_ASYNC_JOB_TYPE) < 0) {
             /* domain died in the meantime */
             goto error;
         }
         ...monitor job progress...
         qemuDomainObjExitMonitor(obj);

         virObjectUnlock(obj);
         sleep(aWhile);
         virObjectLock(obj);
     }

     ...do final work...

     virDomainObjEndAsyncJob(obj);
     virDomainObjEndAPI(&obj);


 * Coordinating with a remote server for migration::

     virDomainObj *obj;
     qemuDomainObjPrivate *priv;

     obj = qemuDomObjFromDomain(dom);

     virDomainObjBeginAsyncJob(obj, VIR_ASYNC_JOB_TYPE);

     ...do prep work...

     if (virDomainObjIsActive(obj)) {
         qemuDomainObjEnterRemote(obj);
         ...communicate with remote...
         qemuDomainObjExitRemote(obj);
         /* domain may have been stopped while we were talking to remote */
         if (!virDomainObjIsActive(obj)) {
             qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                             _("guest unexpectedly quit"));
         }
     }

     ...do final work...

     virDomainObjEndAsyncJob(obj);
     virDomainObjEndAPI(&obj);