[Qemu-devel] [PATCH COLO-Frame v13 04/39] migration: Export migrate_set_state()

2015-12-28 Thread zhanghailiang
Fix the first parameter of migrate_set_state(), and export it. We will use it later. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- v12: - Add Reviewed-by tag v11: - New patch which is split from patch 'migration: Add state records for migration incoming' (Juan's sugge

[Qemu-devel] [PATCH COLO-Frame v13 03/39] COLO: migrate colo related info to secondary node

2015-12-28 Thread zhanghailiang
We can tell whether the VM on the destination should go into COLO mode by referring to the info that has been migrated from the PVM. We skip this section if colo is not enabled (i.e. migrate_set_capability colo off), so that it does not break compatibility with migration however the --enable-colo/disable-colo on the source/d

[Qemu-devel] [PATCH COLO-Frame v13 08/39] migration: Rename the 'file' member of MigrationState

2015-12-28 Thread zhanghailiang
Rename the 'file' member of MigrationState to 'to_dst_file'. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- v12: - Add Reviewed-by tag - Add the missed modification for RDMA migration. (Found by Wen Congyang) v11: - Only rename 'file' member of MigrationState --- include/m

[Qemu-devel] [PATCH COLO-Frame v13 18/39] COLO: Flush PVM's cached RAM into SVM's memory

2015-12-28 Thread zhanghailiang
While the VM is running, the PVM may dirty some pages; we transfer the PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the next checkpoint. So the content of the SVM's RAM cache will always be the same as the PVM's memory after a checkpoint. Instead of flushing all content of PVM's RAM

[Qemu-devel] [PATCH COLO-Frame v13 13/39] COLO: Save PVM state to secondary side when do checkpoint

2015-12-28 Thread zhanghailiang
The main job of a checkpoint is to synchronize the SVM with the PVM. A VM's state includes RAM and device state, so we migrate the PVM's state to the SVM when doing a checkpoint, just as migration does. We cache the PVM's state on the slave, using QEMUSizedBuffer to store the data; we need to know the size of VM

[Qemu-devel] [PATCH COLO-Frame v13 21/39] COLO failover: Introduce a new command to trigger a failover

2015-12-28 Thread zhanghailiang
We leave users free to choose whatever heartbeat solution they want. If the heartbeat is lost, or they detect other errors, they can use the experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover, and COLO will act accordingly. For example, if the command is sent to the PVM, the Pri

[Qemu-devel] [PATCH COLO-Frame v13 09/39] COLO/migration: Create a new communication path from destination to source

2015-12-28 Thread zhanghailiang
This new communication path will be used for returning messages from destination to source. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v13: - Remove useless error report v12: - Add Reviewed-by tag v11: - Rebase master to use qemu_file_get_retu

[Qemu-devel] [PATCH COLO-Frame v13 22/39] COLO failover: Introduce state to record failover process

2015-12-28 Thread zhanghailiang
When handling failover, we do different things depending on the stage of the failover process, so here we introduce a global atomic variable to record the status of failover. We add four failover statuses to indicate the different stages of the failover process. You should use the helpers to get and s
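A minimal sketch of get/set helpers around a global atomic status variable, as the snippet above describes. It assumes C11 atomics rather than QEMU's own atomic macros, and the four status names and the helper names are hypothetical: the truncated snippet does not list them.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical status names: the snippet only says there are four. */
typedef enum {
    FAILOVER_STATUS_NONE,
    FAILOVER_STATUS_REQUEST,
    FAILOVER_STATUS_HANDLING,
    FAILOVER_STATUS_COMPLETED,
} FailoverStatus;

/* Global atomic variable recording the current failover stage. */
static _Atomic int failover_state = FAILOVER_STATUS_NONE;

/*
 * Move from old_state to new_state only if the current value is old_state.
 * Returns the value observed before the attempt, so the caller can tell
 * whether the transition happened (return value == old_state).
 */
static int failover_set_state(int old_state, int new_state)
{
    int expected = old_state;

    assert(new_state >= FAILOVER_STATUS_NONE &&
           new_state <= FAILOVER_STATUS_COMPLETED);
    /* On failure, 'expected' is overwritten with the actual current value. */
    atomic_compare_exchange_strong(&failover_state, &expected, new_state);
    return expected;
}

/* Read the current failover stage. */
static int failover_get_state(void)
{
    return atomic_load(&failover_state);
}
```

Using compare-and-exchange rather than a plain store means that two threads racing to start failover cannot both succeed; only the one that observes the expected old state makes the transition.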

[Qemu-devel] [PATCH COLO-Frame v13 24/39] COLO: Implement failover work for Secondary VM

2015-12-28 Thread zhanghailiang
If users require the SVM to take over, the colo incoming thread should exit from its loop while the failover BH helps it return to the migration incoming coroutine. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert v12: - Improve error message as suggested by Dave - Add

[Qemu-devel] [PATCH COLO-Frame v13 30/39] savevm: Split load vm state function qemu_loadvm_state

2015-12-28 Thread zhanghailiang
qemu_loadvm_state is too long, and we can simplify it by splitting it into three helper functions. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert v13: - Add Reviewed-by tag --- migration/savevm.c | 156 +++-- 1 file changed, 92 i

[Qemu-devel] [PATCH COLO-Frame v13 31/39] savevm: Introduce two helper functions for save/find loadvm_handlers entry

2015-12-28 Thread zhanghailiang
For COLO's checkpoint process, we do savevm/loadvm repeatedly. So every time we call qemu_loadvm_section_start_full(), we add all the section information to the loadvm_handlers list once more. There will be many instances in loadvm_handlers for one section, and this will lead to memory leak
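A minimal sketch of the find-before-add idea described above, so that repeated loadvm passes reuse one entry per section instead of appending duplicates. The struct, the list, and all function names are hypothetical stand-ins, not QEMU's loadvm_handlers code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified handler entry keyed by section id. */
typedef struct SketchHandler {
    uint32_t section_id;
    struct SketchHandler *next;
} SketchHandler;

static SketchHandler *handlers_head;

/* Return the existing entry for section_id, or NULL if there is none. */
static SketchHandler *sketch_find_handler(uint32_t section_id)
{
    for (SketchHandler *h = handlers_head; h != NULL; h = h->next) {
        if (h->section_id == section_id) {
            return h;
        }
    }
    return NULL;
}

/* Add an entry only if the section is not already recorded. */
static SketchHandler *sketch_save_handler(uint32_t section_id)
{
    SketchHandler *h = sketch_find_handler(section_id);

    if (h != NULL) {
        return h; /* reuse: no new allocation on later checkpoints */
    }
    h = calloc(1, sizeof(*h));
    assert(h != NULL);
    h->section_id = section_id;
    h->next = handlers_head;
    handlers_head = h;
    return h;
}

/* Count entries; stays constant across repeated saves of the same ids. */
static size_t sketch_handler_count(void)
{
    size_t n = 0;

    for (SketchHandler *h = handlers_head; h != NULL; h = h->next) {
        n++;
    }
    return n;
}
```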

[Qemu-devel] [PATCH COLO-Frame v13 20/39] COLO: synchronize PVM's state to SVM periodically

2015-12-28 Thread zhanghailiang
Do a checkpoint periodically; the default interval is 200ms. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v12: - Add Reviewed-by tag v11: - Fix wrong sleep time for checkpoint period. (Dave's review comment) --- migration/colo.c | 12

[Qemu-devel] [PATCH COLO-Frame v13 02/39] migration: Introduce capability 'x-colo' to migration

2015-12-28 Thread zhanghailiang
We add a helper function colo_supported() to indicate whether colo is supported; we use it to control whether the 'x-colo' string is shown to users. They can use the qmp command 'query-migrate-capabilities' or the hmp command 'info migrate_capabilities' to learn whether colo is supported. Cc: Ju

[Qemu-devel] [PATCH COLO-Frame v13 39/39] COLO: Add block replication into colo process

2015-12-28 Thread zhanghailiang
Make sure the master starts block replication after the slave's block replication has started. Signed-off-by: zhanghailiang Signed-off-by: Wen Congyang Signed-off-by: Li Zhijian --- migration/colo.c | 52 trace-events | 2 ++ 2 files changed, 54 in

[Qemu-devel] [PATCH COLO-Frame v13 06/39] migration: Integrate COLO checkpoint process into migration

2015-12-28 Thread zhanghailiang
Add a migration state, MIGRATION_STATUS_COLO, and enter it after the first live migration has finished successfully. We reuse the migration thread, so if colo is enabled by the user, the migration thread will go into the colo process. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed

[Qemu-devel] [PATCH COLO-Frame v13 23/39] COLO: Implement failover work for Primary VM

2015-12-28 Thread zhanghailiang
For the PVM, if there is a failover request from users, the colo thread will exit the loop while the failover BH does the cleanup work and resumes the VM. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v13: - Add Reviewed-by tag v12: - Fix error report and

[Qemu-devel] [PATCH COLO-Frame v13 15/39] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-12-28 Thread zhanghailiang
We should not load the PVM's state directly into the SVM, because errors may occur while the SVM is receiving data, which would break the SVM. We need to make sure all data has been received before loading the state into the SVM. We use extra memory to cache this data (the PVM's RAM). The ram cache in secondary side is i

[Qemu-devel] [PATCH COLO-Frame v13 34/39] net/filter-buffer: Add default filter-buffer for each netdev

2015-12-28 Thread zhanghailiang
We add a default filter-buffer to each netdev (except vhost-net), which will be used by COLO or Micro-checkpoint to buffer the VM's packets. The name of the default filter-buffer is 'nop'. By default, the default filter-buffer does not buffer any packets, so it has no side effect on the netdev. Sign

[Qemu-devel] [PATCH COLO-Frame v13 11/39] COLO: Add a new RunState RUN_STATE_COLO

2015-12-28 Thread zhanghailiang
Guest will enter this state when paused to save/restore VM state under colo checkpoint. Cc: Eric Blake Cc: Markus Armbruster Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Gonglei Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Eric Blake --- qapi-schema.json | 5 ++

[Qemu-devel] [PATCH COLO-Frame v13 27/39] COLO failover: Don't do failover during loading VM's state

2015-12-28 Thread zhanghailiang
We should not do failover work while the main thread is loading the VM's state; otherwise it will destroy the consistency of the VM's memory and device state. Here we add a new failover status 'RELAUNCH', which means we should relaunch the failover process. Signed-off-by: zhanghailiang Signed-off-by: L

[Qemu-devel] [PATCH COLO-Frame v13 25/39] qmp event: Add COLO_EXIT event to notify users while exited from COLO

2015-12-28 Thread zhanghailiang
If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x_colo_lost_heartbeat', users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users

[Qemu-devel] [PATCH COLO-Frame v13 32/39] COLO: Separate the process of saving/loading ram and device state

2015-12-28 Thread zhanghailiang
We separate the process of saving/loading ram and device state when doing a checkpoint, and add new helpers for saving/loading ram/device state. With this change, we can transfer ram directly from master to slave without using QEMUSizedBuffer as an intermediary, which also reduces the size of extra memory used during

[Qemu-devel] [PATCH COLO-Frame v13 28/39] COLO: Process shutdown command for VM in COLO state

2015-12-28 Thread zhanghailiang
If the VM is in COLO FT state, we should do some extra work before the normal shutdown process. The SVM will ignore the shutdown command if it is issued directly to it; the PVM will send the shutdown command to the SVM if it gets this command. Cc: Paolo Bonzini Signed-off-by: zhanghailiang Signed-off-by:

[Qemu-devel] [PATCH COLO-Frame v13 14/39] ram: Split host_from_stream_offset() into two helper functions

2015-12-28 Thread zhanghailiang
Split host_from_stream_offset() into two parts: one gets the ram block, whose block idstr may be read from the migration stream; the other gets the hva (host) address from the block and the offset. In addition, we do the range check in a new helper, offset_in_ramblock(). Signed-off-by: zhanghailian
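A minimal sketch of the second half of that split: a range check in offset_in_ramblock() and a helper that turns a block plus offset into a host address. The struct is a simplified stand-in for QEMU's RAMBlock, and the name host_from_block_offset_sketch() and the field layout are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a RAM block; only the fields needed here. */
typedef struct {
    uint8_t *host;        /* start of the block in host memory */
    uint64_t used_length; /* number of bytes currently in use */
} SketchRAMBlock;

/* Range check: is 'offset' inside the used part of the block? */
static bool offset_in_ramblock(const SketchRAMBlock *b, uint64_t offset)
{
    assert(b != NULL); /* passing no block at all is a caller bug */
    return b->host != NULL && offset < b->used_length;
}

/* Host virtual address for (block, offset), or NULL if out of range. */
static void *host_from_block_offset_sketch(const SketchRAMBlock *b,
                                           uint64_t offset)
{
    if (!offset_in_ramblock(b, offset)) {
        return NULL;
    }
    return b->host + offset;
}
```

Keeping the check in its own helper lets the block lookup and the address computation share one definition of "in range".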

Re: [Qemu-devel] [PATCH COLO-Frame v13 00/39] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

2015-12-28 Thread Hailiang Zhang
Cc: Markus Armbruster On 2015/12/29 15:08, zhanghailiang wrote: This is the 13th version of COLO (still only supporting periodic checkpoint). This is only the COLO frame part; you can get the whole code from github: https://github.com/coloft/qemu/commits/colo-v2.4-periodic-mode Please ignore patch

[Qemu-devel] [PATCH COLO-Frame v13 33/39] COLO: Split qemu_savevm_state_begin out of checkpoint process

2015-12-28 Thread zhanghailiang
It is unnecessary to call qemu_savevm_state_begin() in every checkpoint process. It mainly sets up devices and does the first device state pass. This data will not change during later checkpoints. So we split it out of colo_do_checkpoint_transaction(); in this way, we can reduce these

[Qemu-devel] [PATCH COLO-Frame v13 29/39] COLO: Update the global runstate after going into colo state

2015-12-28 Thread zhanghailiang
If we start qemu with -S, the runstate will change from 'prelaunch' to 'running' after going into colo state. So it is necessary to update the global runstate after going into colo state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v13: - Add R

[Qemu-devel] [PATCH COLO-Frame v13 37/39] filter-buffer: Introduce a helper function to release packets

2015-12-28 Thread zhanghailiang
We need to release all the packets from the VM in COLO or Micro-checkpoint, so here we add a new helper function to release the packets buffered by the default buffer-filter. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v12: - Rename this helper function v11: - New patch --- inclu

[Qemu-devel] [PATCH COLO-Frame v13 35/39] filter-buffer: Accept zero interval

2015-12-28 Thread zhanghailiang
For default buffer filter, its 'interval' value is zero, so here we should accept zero interval. Signed-off-by: zhanghailiang Reviewed-by: Yang Hongyang Cc: Jason Wang --- v12: - Add Reviewed-by tag v11: - Add comment v10: - new patch --- net/filter-buffer.c | 10 -- 1 file changed, 10

[Qemu-devel] [PATCH COLO-Frame v13 26/39] COLO failover: Shutdown related socket fd when do failover

2015-12-28 Thread zhanghailiang
If the network connection between COLO's two sides is broken while the colo/colo incoming thread is blocked in a 'read'/'write' on a socket fd, it will not detect this error until the connection times out, which takes a long time. Here we shutdown all the related socket file descriptors to wake up the blocking operation

[Qemu-devel] [PATCH COLO-Frame v13 36/39] filter-buffer: Introduce a helper function to enable/disable default filter

2015-12-28 Thread zhanghailiang
The default buffer filter doesn't buffer packets by default, but we need to buffer packets for COLO or Micro-checkpoint. Here we add a helper function to enable/disable the filter's buffer capability. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v12: - Rename the helper function

[Qemu-devel] [PATCH COLO-Frame v13 38/39] colo: Use default buffer-filter to buffer and release packets

2015-12-28 Thread zhanghailiang
Enable the default filter to buffer packets, and release the packets after a checkpoint. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v12: - Add a helper function to check whether all netdevs support buffering packets. - Flush buffered packets when doing failover. v11: - Use new helper fun
