Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-22 Thread Peter Zijlstra
On Sat, 2012-02-04 at 11:08 +0900, Takuya Yoshikawa wrote: > The latter needs a fundamental change: I heard (from Avi) that we can > change mmu_lock to mutex_lock if mmu_notifier becomes preemptible. > > So I was planning to restart this work when Peter's > "mm: Preemptibility" >

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-18 Thread Alexander Graf
On 18.02.2012, at 11:00, Avi Kivity wrote: > On 02/17/2012 02:19 AM, Alexander Graf wrote: >>> >>> Or we try to be less clever unless we have a really compelling reason. >>> qemu monitor and gdb support aren't compelling reasons to optimize. >> >> The goal here was simplicity with a grain of

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-18 Thread Avi Kivity
On 02/17/2012 02:09 AM, Michael Ellerman wrote: > On Thu, 2012-02-16 at 21:28 +0200, Avi Kivity wrote: > > On 02/16/2012 03:04 AM, Michael Ellerman wrote: > > > > > > > > ioctl is good for hardware devices and stuff that you want to enumerate > > > > and/or control permissions on. For something li

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-18 Thread Avi Kivity
On 02/17/2012 02:19 AM, Alexander Graf wrote: > > > > Or we try to be less clever unless we have a really compelling reason. > > qemu monitor and gdb support aren't compelling reasons to optimize. > > The goal here was simplicity with a grain of performance concerns. > Shared memory is simple in

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-18 Thread Avi Kivity
On 02/16/2012 10:41 PM, Scott Wood wrote: > >>> Sharing the data structures is not needed. Simply synchronize them before > >>> lookup, like we do for ordinary registers. > >> > >> Ordinary registers are a few bytes. We're talking of dozens of kbytes here. > > > > A TLB way is a few dozen bytes, no

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-17 Thread Scott Wood
On 02/16/2012 06:23 PM, Alexander Graf wrote: > On 16.02.2012, at 21:41, Scott Wood wrote: >> And yes, we do have fancier hardware coming fairly soon for which this >> breaks (TLB0 entries can be loaded without host involvement, as long as >> there's a translation from guest physical to physical in

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Alexander Graf
On 16.02.2012, at 21:41, Scott Wood wrote: > On 02/16/2012 01:38 PM, Avi Kivity wrote: >> On 02/16/2012 09:34 PM, Alexander Graf wrote: >>> On 16.02.2012, at 20:24, Avi Kivity wrote: >>> On 02/15/2012 04:08 PM, Alexander Graf wrote: >> >> Well, the scatter/gather registers I propos

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Alexander Graf
On 16.02.2012, at 20:38, Avi Kivity wrote: > On 02/16/2012 09:34 PM, Alexander Graf wrote: >> On 16.02.2012, at 20:24, Avi Kivity wrote: >> >>> On 02/15/2012 04:08 PM, Alexander Graf wrote: > > Well, the scatter/gather registers I proposed will give you just one > register or all of

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Michael Ellerman
On Thu, 2012-02-16 at 21:28 +0200, Avi Kivity wrote: > On 02/16/2012 03:04 AM, Michael Ellerman wrote: > > > > > > ioctl is good for hardware devices and stuff that you want to enumerate > > > and/or control permissions on. For something like KVM that is really a > > > core kernel service, a sysca

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Scott Wood
On 02/16/2012 01:38 PM, Avi Kivity wrote: > On 02/16/2012 09:34 PM, Alexander Graf wrote: >> On 16.02.2012, at 20:24, Avi Kivity wrote: >> >>> On 02/15/2012 04:08 PM, Alexander Graf wrote: > > Well, the scatter/gather registers I proposed will give you just one > register or all of them

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Avi Kivity
On 02/16/2012 09:34 PM, Alexander Graf wrote: > On 16.02.2012, at 20:24, Avi Kivity wrote: > > > On 02/15/2012 04:08 PM, Alexander Graf wrote: > >>> > >>> Well, the scatter/gather registers I proposed will give you just one > >>> register or all of them. > >> > >> One register is hardly any use.

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Avi Kivity
On 02/16/2012 04:46 PM, Anthony Liguori wrote: >> What will it buy us? Surely not speed. Entering a guest is not much >> (if at all) faster than exiting to userspace and any non trivial >> operation will require exit to userspace anyway, > > > You can emulate the PIT/RTC entirely within the guest u

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Alexander Graf
On 16.02.2012, at 20:24, Avi Kivity wrote: > On 02/15/2012 04:08 PM, Alexander Graf wrote: >>> >>> Well, the scatter/gather registers I proposed will give you just one >>> register or all of them. >> >> One register is hardly any use. We either need all ways of a respective >> address to do a

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Avi Kivity
On 02/16/2012 03:04 AM, Michael Ellerman wrote: > > > > ioctl is good for hardware devices and stuff that you want to enumerate > > and/or control permissions on. For something like KVM that is really a > > core kernel service, a syscall makes much more sense. > > Yeah maybe. That distinction is a

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Avi Kivity
On 02/15/2012 04:08 PM, Alexander Graf wrote: > > > > Well, the scatter/gather registers I proposed will give you just one > > register or all of them. > > One register is hardly any use. We either need all ways of a respective > address to do a full fledged lookup or all of them. I should have

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Arnd Bergmann
On Tuesday 07 February 2012, Alexander Graf wrote: > >> > >> Not sure we'll ever get there. For PPC, it will probably take another 1-2 > >> years until we get the 32-bit targets stabilized. By then we will have new > >> 64-bit support though. And then the next gen will come out giving us even >

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Arnd Bergmann
On Tuesday 07 February 2012, Alexander Graf wrote: > On 07.02.2012, at 07:58, Michael Ellerman wrote: > > > On Mon, 2012-02-06 at 13:46 -0600, Scott Wood wrote: > >> You're exposing a large, complex kernel subsystem that does very > >> low-level things with the hardware. It's a potential source o

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Anthony Liguori
On 02/16/2012 02:57 AM, Gleb Natapov wrote: On Wed, Feb 15, 2012 at 03:59:33PM -0600, Anthony Liguori wrote: On 02/15/2012 07:39 AM, Avi Kivity wrote: On 02/07/2012 08:12 PM, Rusty Russell wrote: I would really love to have this, but the problem is that we'd need a general purpose bytecode VM

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Avi Kivity
On 02/16/2012 12:21 AM, Arnd Bergmann wrote: > ioctl is good for hardware devices and stuff that you want to enumerate > and/or control permissions on. For something like KVM that is really a > core kernel service, a syscall makes much more sense. > > I would certainly never mix the two concepts: I

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-16 Thread Gleb Natapov
On Wed, Feb 15, 2012 at 03:59:33PM -0600, Anthony Liguori wrote: > On 02/15/2012 07:39 AM, Avi Kivity wrote: > >On 02/07/2012 08:12 PM, Rusty Russell wrote: > >>>I would really love to have this, but the problem is that we'd need a > >>>general purpose bytecode VM with binding to some kernel APIs.

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Rusty Russell
On Wed, 15 Feb 2012 15:39:41 +0200, Avi Kivity wrote: > On 02/07/2012 08:12 PM, Rusty Russell wrote: > > > I would really love to have this, but the problem is that we'd need a > > > general purpose bytecode VM with binding to some kernel APIs. The > > > bytecode VM, if made general enough to hos

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Michael Ellerman
On Wed, 2012-02-15 at 22:21 +, Arnd Bergmann wrote: > On Tuesday 07 February 2012, Alexander Graf wrote: > > On 07.02.2012, at 07:58, Michael Ellerman wrote: > > > > > On Mon, 2012-02-06 at 13:46 -0600, Scott Wood wrote: > > >> You're exposing a large, complex kernel subsystem that does very >

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Anthony Liguori
On 02/15/2012 07:39 AM, Avi Kivity wrote: On 02/07/2012 08:12 PM, Rusty Russell wrote: I would really love to have this, but the problem is that we'd need a general purpose bytecode VM with binding to some kernel APIs. The bytecode VM, if made general enough to host more complicated devices, wo

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Scott Wood
On 02/15/2012 05:57 AM, Alexander Graf wrote: > > On 15.02.2012, at 12:18, Avi Kivity wrote: > >> Well the real reason is we have an extra bit reported by page faults >> that we can control. Can't you set up a hashed pte that is configured >> in a way that it will fault, no matter what type of a

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Alexander Graf
On 15.02.2012, at 14:57, Avi Kivity wrote: > On 02/15/2012 03:37 PM, Alexander Graf wrote: >> On 15.02.2012, at 14:29, Avi Kivity wrote: >> >>> On 02/15/2012 01:57 PM, Alexander Graf wrote: > > Is an extra syscall for copying TLB entries to user space prohibitively > expensive?

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/15/2012 03:37 PM, Alexander Graf wrote: > On 15.02.2012, at 14:29, Avi Kivity wrote: > > > On 02/15/2012 01:57 PM, Alexander Graf wrote: > >>> > >>> Is an extra syscall for copying TLB entries to user space prohibitively > >>> expensive? > >> > >> The copying can be very expensive, yes. We

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/07/2012 06:19 PM, Anthony Liguori wrote: >> Ah. But then ioeventfd has that as well, unless the other end is in >> the kernel too. > > > Yes, that was my point exactly :-) > > ioeventfd/mmio-over-socketpair to a different thread is not faster than > a synchronous KVM_RUN + writing to an eventf

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/07/2012 06:29 PM, Jan Kiszka wrote: > >>> > >> > >> Isn't there another level in between just scheduling and full syscall > >> return if the user return notifier has some real work to do? > > > > Depends on whether you're scheduling a kthread or a userspace process, no? > > If > > Kthread

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/07/2012 08:12 PM, Rusty Russell wrote: > > I would really love to have this, but the problem is that we'd need a > > general purpose bytecode VM with binding to some kernel APIs. The > > bytecode VM, if made general enough to host more complicated devices, > > would likely be much larger tha

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Alexander Graf
On 15.02.2012, at 14:29, Avi Kivity wrote: > On 02/15/2012 01:57 PM, Alexander Graf wrote: >>> >>> Is an extra syscall for copying TLB entries to user space prohibitively >>> expensive? >> >> The copying can be very expensive, yes. We want to have the possibility of >> exposing a very large TL

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/07/2012 05:23 PM, Anthony Liguori wrote: > On 02/07/2012 07:40 AM, Alexander Graf wrote: >> >> Why? For the HPET timer register for example, we could have a simple >> MMIO hook that says >> >>on_read: >> return read_current_time() - shared_page.offset; >>on_write: >> handle_
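
The quoted HPET hook can be read as a small pair of handlers: a read is answered in place from shared state, and a write is bounced to userspace. The following is only a sketch of that idea in C. Every name in it (`struct shared_page`, `struct hook_result`, `timer_on_read`, `timer_on_write`) is hypothetical, and the current time is passed in as a parameter rather than read from a clock so the example stays deterministic. It is not taken from any proposed kvm interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page shared between the hook and userspace. */
struct shared_page {
    uint64_t offset;    /* value subtracted from the current time on reads */
};

/* Hypothetical hook result: handled in place, or deferred to userspace. */
struct hook_result {
    bool handled;       /* false means "exit to userspace" */
    uint64_t value;     /* only meaningful when handled is true */
};

/* on_read: return read_current_time() - shared_page.offset */
static struct hook_result timer_on_read(const struct shared_page *sp,
                                        uint64_t now)
{
    assert(sp != NULL);
    struct hook_result r = { .handled = true, .value = now - sp->offset };
    return r;
}

/* on_write: handle_in_user_space() */
static struct hook_result timer_on_write(void)
{
    struct hook_result r = { .handled = false, .value = 0 };
    return r;
}
```

With `offset` set to 1000 and a current time of 1500, `timer_on_read` reports a handled access with value 500, while `timer_on_write` always reports that the access must be completed in userspace.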

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/12/2012 09:10 AM, Takuya Yoshikawa wrote: > Avi Kivity wrote: > > > > > Slot searching is quite fast since there's a small number of slots, > > > > and we sort the larger ones to be in the front, so positive lookups are > > > > fast. We cache negative lookups in the shadow page tables (a
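
The slot search described in the quote (few slots, larger ones sorted to the front, linear scan) might look roughly like the sketch below. The `struct slot` layout and the function names are assumptions made for illustration and do not reproduce kvm's actual memslot code. A `NULL` return stands for a negative lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical slot descriptor: a contiguous range of guest frame numbers. */
struct slot {
    uint64_t base_gfn;  /* first guest frame number covered */
    uint64_t npages;    /* number of pages in the slot */
};

/* qsort comparator that orders larger slots first. */
static int cmp_slot_size_desc(const void *a, const void *b)
{
    const struct slot *x = a, *y = b;

    if (x->npages == y->npages)
        return 0;
    return x->npages > y->npages ? -1 : 1;
}

/* Sort once, whenever the slot array changes. */
static void sort_slots(struct slot *slots, size_t n)
{
    assert(slots != NULL || n == 0);
    if (n > 1)
        qsort(slots, n, sizeof(*slots), cmp_slot_size_desc);
}

/* Linear scan from the front; returns the matching slot or NULL. */
static const struct slot *find_slot(const struct slot *slots, size_t n,
                                    uint64_t gfn)
{
    for (size_t i = 0; i < n; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return &slots[i];
    }
    return NULL;
}
```

Putting the largest slot first presumably helps because a lookup that hits is most likely to land in the slot covering the most pages, so the scan tends to stop early.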

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/15/2012 01:57 PM, Alexander Graf wrote: > > > > Is an extra syscall for copying TLB entries to user space prohibitively > > expensive? > > The copying can be very expensive, yes. We want to have the possibility of > exposing a very large TLB to the guest, in the order of multiple kentries.

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Alexander Graf
On 15.02.2012, at 12:18, Avi Kivity wrote: > On 02/07/2012 04:39 PM, Alexander Graf wrote: >>> >>> Syscalls are orthogonal to that - they're to avoid the fget_light() and to >>> tighten the vcpu/thread and vm/process relationship. >> >> How about keeping the ioctl interface but moving vcpu_run

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-15 Thread Avi Kivity
On 02/07/2012 04:39 PM, Alexander Graf wrote: > > > > Syscalls are orthogonal to that - they're to avoid the fget_light() and to > > tighten the vcpu/thread and vm/process relationship. > > How about keeping the ioctl interface but moving vcpu_run to a syscall then? I dislike half-and-half inter

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-11 Thread Takuya Yoshikawa
Avi Kivity wrote: > > > Slot searching is quite fast since there's a small number of slots, and > > > we sort the larger ones to be in the front, so positive lookups are fast. > > > We cache negative lookups in the shadow page tables (an spte can be > > > either "not mapped", "mapped to RAM"

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-09 Thread Jamie Lokier
Anthony Liguori wrote: > >The new API will do away with the IOAPIC/PIC/PIT emulation and defer > >them to userspace. > > I'm a big fan of this. I agree with getting rid of unnecessary emulations. (Why were those things emulated in the first place?) But it would be good to retain some way to "plu

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-08 Thread Alan Cox
> >register_pio_hook_ptr_r(PIO_IDE, SIZE_BYTE,&s->cmd[0]); > >for (i = 1; i< 7; i++) { > > register_pio_hook_ptr_r(PIO_IDE + i, SIZE_BYTE,&s->cmd[i]); > > register_pio_hook_ptr_w(PIO_IDE + i, SIZE_BYTE,&s->cmd[i]); > >} > > You can't easily serialize updates to that address

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-08 Thread Alan Cox
> If the fd overhead really is a problem, perhaps the fd could be retained > for setup operations, and omitted only on calls that require a vcpu to > have been already set up on the current thread? Quite frankly I'd like to have an fd because it means you've got a meaningful way of ensuring that i

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-08 Thread Scott Wood
On 02/07/2012 06:28 AM, Anthony Liguori wrote: > On 02/06/2012 01:46 PM, Scott Wood wrote: >> On 02/03/2012 04:52 PM, Anthony Liguori wrote: >>> On 02/03/2012 12:07 PM, Eric Northup wrote: How would the ability to use sys_kvm_* be regulated? >>> >>> Why should it be regulated? >>> >>> It's not

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Rusty Russell
On Mon, 06 Feb 2012 11:34:01 +0200, Avi Kivity wrote: > On 02/05/2012 06:36 PM, Anthony Liguori wrote: > > If userspace had a way to upload bytecode to the kernel that was > > executed for a PIO operation, it could either pass the operation to > > userspace or handle it within the kernel when poss

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Chris Wright
* Anthony Liguori (anth...@codemonkey.ws) wrote: > On 02/07/2012 07:18 AM, Avi Kivity wrote: > >On 02/07/2012 02:51 PM, Anthony Liguori wrote: > >>On 02/07/2012 06:40 AM, Avi Kivity wrote: > >>>On 02/07/2012 02:28 PM, Anthony Liguori wrote: > > >It's a potential source of exploits > >(

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Jan Kiszka
On 2012-02-07 17:21, Anthony Liguori wrote: > On 02/07/2012 10:18 AM, Jan Kiszka wrote: >> On 2012-02-07 17:02, Avi Kivity wrote: >>> On 02/07/2012 05:17 PM, Anthony Liguori wrote: On 02/07/2012 06:03 AM, Avi Kivity wrote: > On 02/06/2012 09:11 PM, Anthony Liguori wrote: >> >> I'm

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 10:18 AM, Jan Kiszka wrote: On 2012-02-07 17:02, Avi Kivity wrote: On 02/07/2012 05:17 PM, Anthony Liguori wrote: On 02/07/2012 06:03 AM, Avi Kivity wrote: On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put t

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 10:02 AM, Avi Kivity wrote: On 02/07/2012 05:17 PM, Anthony Liguori wrote: On 02/07/2012 06:03 AM, Avi Kivity wrote: On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put the kthread to sleep while it waits for t

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Jan Kiszka
On 2012-02-07 17:02, Avi Kivity wrote: > On 02/07/2012 05:17 PM, Anthony Liguori wrote: >> On 02/07/2012 06:03 AM, Avi Kivity wrote: >>> On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put the kthread to sleep

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 05:17 PM, Anthony Liguori wrote: On 02/07/2012 06:03 AM, Avi Kivity wrote: On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put the kthread to sleep while it waits for the other end to process it. This is effec

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Alexander Graf
On 07.02.2012, at 16:23, Anthony Liguori wrote: > On 02/07/2012 07:40 AM, Alexander Graf wrote: >> >> Why? For the HPET timer register for example, we could have a simple MMIO >> hook that says >> >> on_read: >> return read_current_time() - shared_page.offset; >> on_write: >> handl

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 07:40 AM, Alexander Graf wrote: Why? For the HPET timer register for example, we could have a simple MMIO hook that says on_read: return read_current_time() - shared_page.offset; on_write: handle_in_user_space(); For IDE, it would be as simple as register_pio

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 06:03 AM, Avi Kivity wrote: On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put the kthread to sleep while it waits for the other end to process it. This is effectively equivalent to a heavy weight exit. The diff

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 07:18 AM, Avi Kivity wrote: On 02/07/2012 02:51 PM, Anthony Liguori wrote: On 02/07/2012 06:40 AM, Avi Kivity wrote: On 02/07/2012 02:28 PM, Anthony Liguori wrote: It's a potential source of exploits (from bugs in KVM or in hardware). I can see people wanting to be selective wi

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Alexander Graf
On 07.02.2012, at 15:21, Avi Kivity wrote: > On 02/07/2012 03:40 PM, Alexander Graf wrote: >> >> >> >> Not sure we'll ever get there. For PPC, it will probably take another >> >> 1-2 years until we get the 32-bit targets stabilized. By then we will >> >> have new 64-bit support though. And the

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 03:40 PM, Alexander Graf wrote: >> >> Not sure we'll ever get there. For PPC, it will probably take another 1-2 years until we get the 32-bit targets stabilized. By then we will have new 64-bit support though. And then the next gen will come out giving us even more new constrain

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Alexander Graf
On 07.02.2012, at 14:16, Avi Kivity wrote: > On 02/07/2012 02:51 PM, Alexander Graf wrote: >> On 07.02.2012, at 13:24, Avi Kivity wrote: >> >> > On 02/07/2012 03:08 AM, Alexander Graf wrote: >> >> I don't like the idea too much. On s390 and ppc we can set other vcpu's >> >> interrupt status.

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 02:51 PM, Anthony Liguori wrote: On 02/07/2012 06:40 AM, Avi Kivity wrote: On 02/07/2012 02:28 PM, Anthony Liguori wrote: It's a potential source of exploits (from bugs in KVM or in hardware). I can see people wanting to be selective with access because of that. As is true of

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 02:51 PM, Alexander Graf wrote: On 07.02.2012, at 13:24, Avi Kivity wrote: > On 02/07/2012 03:08 AM, Alexander Graf wrote: >> I don't like the idea too much. On s390 and ppc we can set other vcpu's interrupt status. How would that work in this model? > > It would be a "vm-wide

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Alexander Graf
On 07.02.2012, at 13:24, Avi Kivity wrote: > On 02/07/2012 03:08 AM, Alexander Graf wrote: >> I don't like the idea too much. On s390 and ppc we can set other vcpu's >> interrupt status. How would that work in this model? > > It would be a "vm-wide syscall". You can also do that on x86 (throug

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/07/2012 06:40 AM, Avi Kivity wrote: On 02/07/2012 02:28 PM, Anthony Liguori wrote: It's a potential source of exploits (from bugs in KVM or in hardware). I can see people wanting to be selective with access because of that. As is true of the rest of the kernel. If you want finer grain

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 02:28 PM, Anthony Liguori wrote: It's a potential source of exploits (from bugs in KVM or in hardware). I can see people wanting to be selective with access because of that. As is true of the rest of the kernel. If you want finer grain access control, that's exactly why we ha

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Anthony Liguori
On 02/06/2012 01:46 PM, Scott Wood wrote: On 02/03/2012 04:52 PM, Anthony Liguori wrote: On 02/03/2012 12:07 PM, Eric Northup wrote: On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: [...] Moving to syscalls avoids these problems, but introduces new ones: - adding new syscalls is generally

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/07/2012 03:08 AM, Alexander Graf wrote: I don't like the idea too much. On s390 and ppc we can set other vcpu's interrupt status. How would that work in this model? It would be a "vm-wide syscall". You can also do that on x86 (through KVM_IRQ_LINE). I really do like the ioctl model

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/06/2012 09:11 PM, Anthony Liguori wrote: I'm not so sure. ioeventfds and a future mmio-over-socketpair have to put the kthread to sleep while it waits for the other end to process it. This is effectively equivalent to a heavy weight exit. The difference in cost is dropping to userspa

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Avi Kivity
On 02/06/2012 07:41 PM, Rob Earhart wrote: >> >> I like the ioctl() interface. If the overhead matters in your hot path, > > I can't say that it's a pressing problem, but it's not negligible. > >> I suspect you're doing it wrong; > > What am I doing wrong? "You the vmm" not "you the KVM mai

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Alexander Graf
On 07.02.2012, at 07:58, Michael Ellerman wrote: > On Mon, 2012-02-06 at 13:46 -0600, Scott Wood wrote: >> On 02/03/2012 04:52 PM, Anthony Liguori wrote: >>> On 02/03/2012 12:07 PM, Eric Northup wrote: On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: [...] > > Moving to sysca

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Michael Ellerman
On Mon, 2012-02-06 at 13:46 -0600, Scott Wood wrote: > On 02/03/2012 04:52 PM, Anthony Liguori wrote: > > On 02/03/2012 12:07 PM, Eric Northup wrote: > >> On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: > >> [...] > >>> > >>> Moving to syscalls avoids these problems, but introduces new ones: > >

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Alexander Graf
On 03.02.2012, at 03:09, Anthony Liguori wrote: > On 02/02/2012 10:09 AM, Avi Kivity wrote: >> The kvm api has been accumulating cruft for several years now. This is >> due to feature creep, fixing mistakes, experience gained by the >> maintainers and developers on how to do things, ports to new

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Scott Wood
On 02/03/2012 04:52 PM, Anthony Liguori wrote: > On 02/03/2012 12:07 PM, Eric Northup wrote: >> On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: >> [...] >>> >>> Moving to syscalls avoids these problems, but introduces new ones: >>> >>> - adding new syscalls is generally frowned upon, and kvm wil

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Anthony Liguori
On 02/06/2012 11:41 AM, Rob Earhart wrote: On Sun, Feb 5, 2012 at 5:14 AM, Avi Kivity wrote: On 02/03/2012 12:13 AM, Rob Earhart wrote: On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: The kvm api has been accumulating cruft for several years now. This is

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Rob Earhart
On Sun, Feb 5, 2012 at 5:14 AM, Avi Kivity wrote: > On 02/03/2012 12:13 AM, Rob Earhart wrote: >> On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: >> >> The kvm api has been accumulating cruft for several years now. >> This is >> due to feature creep, fixi

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Avi Kivity
On 02/06/2012 04:00 PM, Anthony Liguori wrote: >> Do guests always read an unlatched counter? Doesn't seem reasonable >> since they can't get a stable count this way. > > > Perhaps. You could have the latching done by writing to persisted > scratch memory but then locking becomes an issue. Oh, y

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Anthony Liguori
On 02/06/2012 07:54 AM, Avi Kivity wrote: On 02/06/2012 03:33 PM, Anthony Liguori wrote: Look at arch/x86/kvm/i8254.c:pit_ioport_read() for a counterexample. There are also interactions with other devices (for example the apic/ioapic interaction via the apic bus). Hrm, maybe I'm missing it, b

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Avi Kivity
On 02/06/2012 03:33 PM, Anthony Liguori wrote: >> Look at arch/x86/kvm/i8254.c:pit_ioport_read() for a counterexample. >> There are also interactions with other devices (for example the >> apic/ioapic interaction via the apic bus). > > > Hrm, maybe I'm missing it, but the path that would be hot is:

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Anthony Liguori
On 02/06/2012 03:34 AM, Avi Kivity wrote: On 02/05/2012 06:36 PM, Anthony Liguori wrote: On 02/05/2012 03:51 AM, Gleb Natapov wrote: On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: On 02/05/2012 11:37 AM, Gleb Natapov wrote: On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wro

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-06 Thread Avi Kivity
On 02/05/2012 06:36 PM, Anthony Liguori wrote: > On 02/05/2012 03:51 AM, Gleb Natapov wrote: >> On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: >>> On 02/05/2012 11:37 AM, Gleb Natapov wrote: On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > Device model > -

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Anthony Liguori
On 02/05/2012 03:51 AM, Gleb Natapov wrote: On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: On 02/05/2012 11:37 AM, Gleb Natapov wrote: On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: Device model Currently kvm virtualizes or emulates a set of x86 cores, wi

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Avi Kivity
On 02/05/2012 12:58 PM, Gleb Natapov wrote: > > > > > > > Reduced performance is what I mean. Obviously old guests will continue > > > working. > > > > I'm not happy about it either. > > > It is not only about old guests either. In RHEL we pretend to not > support HPET because when some guests

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Avi Kivity
On 02/03/2012 12:13 AM, Rob Earhart wrote: > On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity > wrote: > > The kvm api has been accumulating cruft for several years now. > This is > due to feature creep, fixing mistakes, experience gained by the > maintainers and

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Gleb Natapov
On Sun, Feb 05, 2012 at 11:56:21AM +0200, Avi Kivity wrote: > On 02/05/2012 11:51 AM, Gleb Natapov wrote: > > On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: > > > On 02/05/2012 11:37 AM, Gleb Natapov wrote: > > > > On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > > > > > D

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Avi Kivity
On 02/05/2012 11:51 AM, Gleb Natapov wrote: > On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: > > On 02/05/2012 11:37 AM, Gleb Natapov wrote: > > > On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > > > > Device model > > > > > > > > Currently kvm virtualizes or

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Gleb Natapov
On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote: > On 02/05/2012 11:37 AM, Gleb Natapov wrote: > > On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > > > Device model > > > > > > Currently kvm virtualizes or emulates a set of x86 cores, with or > > > without local

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Avi Kivity
On 02/05/2012 11:37 AM, Gleb Natapov wrote: > On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > > Device model > > > > Currently kvm virtualizes or emulates a set of x86 cores, with or > > without local APICs, a 24-input IOAPIC, a PIC, a PIT, and a number of > > PCI devices

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Gleb Natapov
On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote: > Device model > > Currently kvm virtualizes or emulates a set of x86 cores, with or > without local APICs, a 24-input IOAPIC, a PIC, a PIT, and a number of > PCI devices assigned from the host. The API allows emulating the l

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-05 Thread Avi Kivity
On 02/03/2012 04:09 AM, Anthony Liguori wrote: > >> Note: this may cause a regression for older guests >> that don't support MSI or kvmclock. Device assignment will be done >> using VFIO, that is, without direct kvm involvement. >> >> Local APICs will be mandatory, but it will be possible to hide

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-03 Thread Takuya Yoshikawa
Hope to get comments from live migration developers, Anthony Liguori wrote: > > Guest memory management > > --- > > Instead of managing each memory slot individually, a single API will be > > provided that replaces the entire guest physical memory map atomically. > > This mat

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-03 Thread Anthony Liguori
On 02/03/2012 12:07 PM, Eric Northup wrote: On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: [...] Moving to syscalls avoids these problems, but introduces new ones: - adding new syscalls is generally frowned upon, and kvm will need several - syscalls into modules are harder and rarer than i

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-03 Thread Rob Earhart
On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: > The kvm api has been accumulating cruft for several years now. This is > due to feature creep, fixing mistakes, experience gained by the > maintainers and developers on how to do things, ports to new > architectures, and simply as a side effect

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-03 Thread Rob Earhart
(Resending as plain text to appease vger.kernel.org :-) On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: > > The kvm api has been accumulating cruft for several years now.  This is > due to feature creep, fixing mistakes, experience gained by the > maintainers and developers on how to do things,

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-03 Thread Eric Northup
On Thu, Feb 2, 2012 at 8:09 AM, Avi Kivity wrote: [...] > > Moving to syscalls avoids these problems, but introduces new ones: > > - adding new syscalls is generally frowned upon, and kvm will need several > - syscalls into modules are harder and rarer than into core kernel code > - will need to a

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-02 Thread Anthony Liguori
On 02/02/2012 10:09 AM, Avi Kivity wrote: The kvm api has been accumulating cruft for several years now. This is due to feature creep, fixing mistakes, experience gained by the maintainers and developers on how to do things, ports to new architectures, and simply as a side effect of a code base

[Qemu-devel] [RFC] Next gen kvm api

2012-02-02 Thread Avi Kivity
The kvm api has been accumulating cruft for several years now. This is due to feature creep, fixing mistakes, experience gained by the maintainers and developers on how to do things, ports to new architectures, and simply as a side effect of a code base that is developed slowly and incrementally.