date:20111115

[Qemu-devel] [PATCH v5 1.0] configure: build position independent executables across for x86 hosts

2011-11-15 Thread Avi Kivity

Change the default on x86 hosts to building PIE (position independent
executables); instead of restricting the option to user-only targets,
apply it to all targets.

In addition, set the relocation sections to read-only (relro) when available;
this reduces the attack surface by disallowing changes to relocation tables
at runtime.

While PIE reduces performance and relro increases load time, it greatly
improves security, with the potential to reduce a code execution vulnerability
to a self denial of service.

Non-x86 are not changed, as they require TCG changes.

Signed-off-by: Avi Kivity 
---

v5: fix typos; only default enable for x86; mutually exclusive with -static

v4: say it's v4 and for 1.0

v3: detect toolchain support for PIE at configure time

v2: improve description to include relro


 configure |   55 +--
 1 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/configure b/configure
index 6c77fbb..024e603 100755
--- a/configure
+++ b/configure
@@ -172,7 +172,7 @@ aix="no"
 blobs="yes"
 pkgversion=""
 check_utests=""
-user_pie="no"
+pie=""
 zero_malloc=""
 trace_backend="nop"
 trace_file="trace"
@@ -701,9 +701,9 @@ for opt do
   ;;
   --disable-guest-base) guest_base="no"
   ;;
-  --enable-user-pie) user_pie="yes"
+  --enable-pie) pie="yes"
   ;;
-  --disable-user-pie) user_pie="no"
+  --disable-pie) pie="no"
   ;;
   --enable-uname-release=*) uname_release="$optarg"
   ;;
@@ -1031,8 +1031,8 @@ echo "  --disable-bsd-user   disable all BSD usermode emulation targets"
 echo "  --enable-guest-base  enable GUEST_BASE support for usermode"
 echo "   emulation targets"
 echo "  --disable-guest-base disable GUEST_BASE support"
-echo "  --enable-user-pie    build usermode emulation targets as PIE"
-echo "  --disable-user-pie   do not build usermode emulation targets as PIE"
+echo "  --enable-pie         build Position Independent Executables"
+echo "  --disable-pie        do not build Position Independent Executables"
 echo "  --fmod-lib   path to FMOD library"
 echo "  --fmod-inc   path to FMOD includes"
 echo "  --oss-lib    path to OSS library"
@@ -1099,6 +1099,37 @@ for flag in $gcc_flags; do
 fi
 done
 
+if test "$pie" = "yes" -a "$static" = "yes" ; then
+  echo "static and pie are mutually incompatible"
+  exit 1
+fi
+
+if test "$pie" != "no" -a "$static" != "yes" ; then
+  case "$cpu" in
+    i386|x86_64)
+      pie="yes"
+      ;;
+    *)
+      ;;
+  esac
+fi
+
+if test "$pie" = "yes" ; then
+  cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  if compile_prog "-fPIE -DPIE" "-Wl,-pie"; then
+    QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
+    LDFLAGS="-Wl,-pie $LDFLAGS"
+    if compile_prog "-fPIE -DPIE" "-Wl,-pie -Wl,-z,relro -Wl,-z,now"; then
+      LDFLAGS="-Wl,-z,relro -Wl,-z,now $LDFLAGS"
+    fi
+  else
+    echo "Disabling PIE due to missing toolchain support"
+    pie="no"
+  fi
+fi
+
 #
 # Solaris specific configure tool chain decisions
 #
@@ -2765,7 +2796,7 @@ echo "Documentation $docs"
 echo "uname -r  $uname_release"
 echo "NPTL support  $nptl"
 echo "GUEST_BASE$guest_base"
-echo "PIE user targets  $user_pie"
+echo "PIE   $pie"
 echo "vde support   $vde"
 echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
@@ -3225,9 +3256,6 @@ for d in libdis libdis-user; do
 symlink $source_path/Makefile.dis $d/Makefile
 echo > $d/config.mak
 done
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > libdis-user/config.mak
-fi
 
 for target in $target_list; do
 target_dir="$target"
@@ -3646,12 +3674,6 @@ if test "$target_softmmu" = "yes" ; then
   esac
 fi
 
-if test "$target_user_only" = "yes" -a "$static" = "no" -a \
-   "$user_pie" = "yes" ; then
-  cflags="-fpie $cflags"
-  ldflags="-pie $ldflags"
-fi
-
 if test "$target_softmmu" = "yes" -a \( \
 "$TARGET_ARCH" = "microblaze" -o \
 "$TARGET_ARCH" = "cris" \) ; then
@@ -3775,9 +3797,6 @@ d=libuser
 mkdir -p $d
 mkdir -p $d/trace
 symlink $source_path/Makefile.user $d/Makefile
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > $d/config.mak
-fi
 
 if test "$docs" = "yes" ; then
   mkdir -p QMP
-- 
1.7.7.1

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Avi Kivity

On 11/15/2011 12:41 AM, Anthony Liguori wrote:
> Right now our specs are written in pseudo-wiki syntax.  This series converts
> them to markdown.  markdown is a simple markup format that's gaining in
> popularity.
>
> The big advantage of using markdown is that there are tools that can convert it
> to relatively simple HTML.  That means we can build a make infrastructure that
> generates a nice set of static web pages.
>
> The syntax is also more human friendly than mediawiki syntax.
>
> To see what the stylized version of this looks like, check out:
>
>   https://github.com/aliguori/qemu/tree/markdown/docs/specs
>
>

Nice.  Suggest you enable rename detection, to make patches like these
easier to read (not that it truly matters in the particular case).

-- 
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH v6 1.0] configure: build position independent executables on x86 hosts

2011-11-15 Thread Avi Kivity

Change the default on x86 hosts to building PIE (position independent
executables); instead of restricting the option to user-only targets,
apply it to all targets.

In addition, set the relocation sections to read-only (relro) when available;
this reduces the attack surface by disallowing changes to relocation tables
at runtime.

While PIE reduces performance and relro increases load time, it greatly
improves security, with the potential to reduce a code execution vulnerability
to a self denial of service.

Non-x86 are not changed, as they require TCG changes.

Signed-off-by: Avi Kivity 
---

v6: fix subject line. sigh.

v5: fix typos; only default enable for x86; mutually exclusive with -static

v4: say it's v4 and for 1.0

v3: detect toolchain support for PIE at configure time

v2: improve description to include relro

 configure |   55 +--
 1 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/configure b/configure
index 6c77fbb..024e603 100755
--- a/configure
+++ b/configure
@@ -172,7 +172,7 @@ aix="no"
 blobs="yes"
 pkgversion=""
 check_utests=""
-user_pie="no"
+pie=""
 zero_malloc=""
 trace_backend="nop"
 trace_file="trace"
@@ -701,9 +701,9 @@ for opt do
   ;;
   --disable-guest-base) guest_base="no"
   ;;
-  --enable-user-pie) user_pie="yes"
+  --enable-pie) pie="yes"
   ;;
-  --disable-user-pie) user_pie="no"
+  --disable-pie) pie="no"
   ;;
   --enable-uname-release=*) uname_release="$optarg"
   ;;
@@ -1031,8 +1031,8 @@ echo "  --disable-bsd-user   disable all BSD usermode emulation targets"
 echo "  --enable-guest-base  enable GUEST_BASE support for usermode"
 echo "   emulation targets"
 echo "  --disable-guest-base disable GUEST_BASE support"
-echo "  --enable-user-pie    build usermode emulation targets as PIE"
-echo "  --disable-user-pie   do not build usermode emulation targets as PIE"
+echo "  --enable-pie         build Position Independent Executables"
+echo "  --disable-pie        do not build Position Independent Executables"
 echo "  --fmod-lib   path to FMOD library"
 echo "  --fmod-inc   path to FMOD includes"
 echo "  --oss-lib    path to OSS library"
@@ -1099,6 +1099,37 @@ for flag in $gcc_flags; do
 fi
 done
 
+if test "$pie" = "yes" -a "$static" = "yes" ; then
+  echo "static and pie are mutually incompatible"
+  exit 1
+fi
+
+if test "$pie" != "no" -a "$static" != "yes" ; then
+  case "$cpu" in
+    i386|x86_64)
+      pie="yes"
+      ;;
+    *)
+      ;;
+  esac
+fi
+
+if test "$pie" = "yes" ; then
+  cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  if compile_prog "-fPIE -DPIE" "-Wl,-pie"; then
+    QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
+    LDFLAGS="-Wl,-pie $LDFLAGS"
+    if compile_prog "-fPIE -DPIE" "-Wl,-pie -Wl,-z,relro -Wl,-z,now"; then
+      LDFLAGS="-Wl,-z,relro -Wl,-z,now $LDFLAGS"
+    fi
+  else
+    echo "Disabling PIE due to missing toolchain support"
+    pie="no"
+  fi
+fi
+
 #
 # Solaris specific configure tool chain decisions
 #
@@ -2765,7 +2796,7 @@ echo "Documentation $docs"
 echo "uname -r  $uname_release"
 echo "NPTL support  $nptl"
 echo "GUEST_BASE$guest_base"
-echo "PIE user targets  $user_pie"
+echo "PIE   $pie"
 echo "vde support   $vde"
 echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
@@ -3225,9 +3256,6 @@ for d in libdis libdis-user; do
 symlink $source_path/Makefile.dis $d/Makefile
 echo > $d/config.mak
 done
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > libdis-user/config.mak
-fi
 
 for target in $target_list; do
 target_dir="$target"
@@ -3646,12 +3674,6 @@ if test "$target_softmmu" = "yes" ; then
   esac
 fi
 
-if test "$target_user_only" = "yes" -a "$static" = "no" -a \
-   "$user_pie" = "yes" ; then
-  cflags="-fpie $cflags"
-  ldflags="-pie $ldflags"
-fi
-
 if test "$target_softmmu" = "yes" -a \( \
 "$TARGET_ARCH" = "microblaze" -o \
 "$TARGET_ARCH" = "cris" \) ; then
@@ -3775,9 +3797,6 @@ d=libuser
 mkdir -p $d
 mkdir -p $d/trace
 symlink $source_path/Makefile.user $d/Makefile
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > $d/config.mak
-fi
 
 if test "$docs" = "yes" ; then
   mkdir -p QMP
-- 
1.7.7.1

Re: [Qemu-devel] [PATCH v5 1.0] configure: build position independent executables across for x86 hosts

2011-11-15 Thread Peter Maydell

On 15 November 2011 08:00, Avi Kivity  wrote:

> @@ -1099,6 +1099,37 @@ for flag in $gcc_flags; do
>     fi
>  done
>
> +if test "$pie" = "yes" -a "$static" = "yes" ; then
> +  echo "static and pie are mutually incompatible"
> +  exit 1
> +fi

The -a operator to test has been marked obsolescent in
POSIX -- please don't use it in new code. (Use
if test "$pie" = yes && test "$static" = yes; then )

> +if test "$pie" = "yes" ; then
> +  cat > $TMPC << EOF
> +int main(void) { return 0; }
> +EOF
> +  if compile_prog "-fPIE -DPIE" "-Wl,-pie"; then
> +    QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
> +    LDFLAGS="-Wl,-pie $LDFLAGS"
> +    if compile_prog "-fPIE -DPIE" "-Wl,-pie -Wl,-z,relro -Wl,-z,now"; then
> +      LDFLAGS="-Wl,-z,relro -Wl,-z,now $LDFLAGS"

Why does this second compile test put -fPIE -DPIE into
its local cflags and -Wl,-pie into its local ldflags
when we just put them into the global cflags/ldflags?

> +    fi
> +  else
> +    echo "Disabling PIE due to missing toolchain support"
> +    pie="no"

This means that if the user explicitly asked for PIE (with
--enable-pie) we will carry on even if we couldn't do it.
Usually for configure if the user asked for something then
not providing it is a fatal error.

-- PMM
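
The two review points above (avoiding the obsolescent `test -a`, and treating
an explicitly requested but unavailable feature as fatal) can be sketched in
plain POSIX shell. This is only an illustration under stated assumptions, not
the code that was posted or committed: `check_pie_static` and
`pie_probe_failed` are hypothetical helper names, and they return a status
instead of calling `exit` so that they can be tried in isolation.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the review comments; not part of configure.

# Reject the pie+static combination using "&&" between two test invocations
# rather than the obsolescent "test ... -a ...".
#   $1 = pie setting, $2 = static setting
# Returns 1 on conflict, 0 otherwise.
check_pie_static() {
  if test "$1" = "yes" && test "$2" = "yes"; then
    echo "static and pie are mutually incompatible" >&2
    return 1
  fi
  return 0
}

# Decide what to do once the toolchain probe for PIE has failed.
#   $1 = what the user asked for: "yes" (explicit --enable-pie) or "" (default)
# Prints the resulting setting; returns 1 when configure should abort.
pie_probe_failed() {
  if test "$1" = "yes"; then
    echo "PIE not available due to missing toolchain support" >&2
    return 1
  fi
  echo "no"
}
```

A caller in a configure-style script would follow a failing return with
`exit 1`, so that an explicit request is never silently dropped.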

Re: [Qemu-devel] [PATCH v5 1.0] configure: build position independent executables across for x86 hosts

2011-11-15 Thread Avi Kivity

On 11/15/2011 11:10 AM, Peter Maydell wrote:
> On 15 November 2011 08:00, Avi Kivity  wrote:
>
> > @@ -1099,6 +1099,37 @@ for flag in $gcc_flags; do
> > fi
> >  done
> >
> > +if test "$pie" = "yes" -a "$static" = "yes" ; then
> > +  echo "static and pie are mutually incompatible"
> > +  exit 1
> > +fi
>
> The -a operator to test has been marked obsolescent in
> POSIX -- please don't use it in new code. (Use
> if test "$pie" = yes && test "$static" = yes; then )

Okay.  For 1.1, I'll convert this script to python.

> > +if test "$pie" = "yes" ; then
> > +  cat > $TMPC << EOF
> > +int main(void) { return 0; }
> > +EOF
> > +  if compile_prog "-fPIE -DPIE" "-Wl,-pie"; then
> > +QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
> > +LDFLAGS="-Wl,-pie $LDFLAGS"
> > +if compile_prog "-fPIE -DPIE" "-Wl,-pie -Wl,-z,relro -Wl,-z,now"; then
> > +  LDFLAGS="-Wl,-z,relro -Wl,-z,now $LDFLAGS"
>
> Why does this second compile test put -fPIE -DPIE into
> its local cflags and -Wl,-pie into its local ldflags
> when we just put them into the global cflags/ldflags?

Ah, I didn't realize compile_prog considered those.  Will make
parallelizing it harder.  Will fix.

> > +fi
> > +  else
> > +echo "Disabling PIE due to missing toolchain support"
> > +pie="no"
>
> This means that if the user explicitly asked for PIE (with
> --enable-pie) we will carry on even if we couldn't do it.
> Usually for configure if the user asked for something then
> not providing it is a fatal error.

Yeah.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Memory sync algorithm during migration

2011-11-15 Thread Takuya Yoshikawa


Adding qemu-devel ML to CC.

Your question should have been sent to qemu-devel ML because the logic
is implemented in QEMU, not KVM.

(2011/11/11 1:35), Oliver Hookins wrote:
> Hi,
>
> I am performing some benchmarks on KVM migration on two different types of VM.
> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes about 20
> seconds to migrate on our hardware while the 32GB VM takes about a minute.
>
> With a reasonable amount of memory activity going on (in the hundreds of MB per
> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
> completes. Intuitively this tells me there is some watermarking of dirty pages
> going on that is not particularly efficient when the dirty pages ratio is high
> compared to total memory, but I may be completely incorrect.

You can change the ratio IIRC.
Hopefully, someone who knows well about QEMU will tell you better ways.

Takuya

> Could anybody fill me in on what might be going on here? We're using libvirt
> 0.8.2 and kvm-83-224.el5.centos.1

Re: [Qemu-devel] [PATCH] migration: add a MAINTAINERS entry for migration

2011-11-15 Thread Kevin Wolf

Am 14.11.2011 22:08, schrieb Anthony Liguori:
> On 11/14/2011 11:40 AM, Juan Quintela wrote:
>> Anthony Liguori  wrote:
>>> I think this is an accurate reflection of the state of migration today.
>>> This is the second release in a row where we're scrambling to fix a
>>> critical issue in migration.
>>
>> We need to make our mind about it.
>
> Ultimately, we need to make migration a priority.  That's what I'm trying to
> do here.

When you make everything a priority, being a priority doesn't have much
of a meaning any more. Our current priorities are changing the entire
device model, the monitor, migration, turning the block layer upside
down - what's left? Okay, maybe vvfat and slirp.

> The first step is to be open about the state of migration today.  I
> personally don't have the bandwidth to invest a lot of effort in migration,
> but I can invest time in trying to find more people to work on migration,
> and help put together a proper roadmap.
>
> We need to outline and document what we support and what we don't support.
> We need to invest in a test infrastructure.  We need a roadmap that we can
> reasonably execute on.  In short, we need to turn migration into a first
> class subsystem.
>
> It's not about any single person or any single patch series.  It's about
> deciding that migration is an important feature and deserves more focus and
> attention.

I don't doubt that everyone will agree with this. The harder question is
who should concentrate less on which other feature to have time to spend
for migration.

Kevin

[Qemu-devel] [PATCH v7 1.0] configure: build position independent executables on x86 hosts

2011-11-15 Thread Avi Kivity

Change the default on x86 hosts to building PIE (position independent
executables); instead of restricting the option to user-only targets,
apply it to all targets.

In addition, set the relocation sections to read-only (relro) when available;
this reduces the attack surface by disallowing changes to relocation tables
at runtime.

While PIE reduces performance and relro increases load time, it greatly
improves security, with the potential to reduce a code execution vulnerability
to a self denial of service.

Non-x86 are not changed, as they require TCG changes.

Signed-off-by: Avi Kivity 
---

v7: avoid 'test -a'
optimize relro/now linker flag test
fail if toolchain doesn't support pie while the user explicitly asked for it

v6: fix subject line. sigh.

v5: fix typos; only default enable for x86; mutually exclusive with -static

v4: say it's v4 and for 1.0

v3: detect toolchain support for PIE at configure time

v2: improve description to include relro


 configure |   65 
 1 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/configure b/configure
index 6c77fbb..ba7143a 100755
--- a/configure
+++ b/configure
@@ -172,7 +172,7 @@ aix="no"
 blobs="yes"
 pkgversion=""
 check_utests=""
-user_pie="no"
+pie=""
 zero_malloc=""
 trace_backend="nop"
 trace_file="trace"
@@ -701,9 +701,9 @@ for opt do
   ;;
   --disable-guest-base) guest_base="no"
   ;;
-  --enable-user-pie) user_pie="yes"
+  --enable-pie) pie="yes"
   ;;
-  --disable-user-pie) user_pie="no"
+  --disable-pie) pie="no"
   ;;
   --enable-uname-release=*) uname_release="$optarg"
   ;;
@@ -1031,8 +1031,8 @@ echo "  --disable-bsd-user   disable all BSD usermode emulation targets"
 echo "  --enable-guest-base  enable GUEST_BASE support for usermode"
 echo "   emulation targets"
 echo "  --disable-guest-base disable GUEST_BASE support"
-echo "  --enable-user-pie    build usermode emulation targets as PIE"
-echo "  --disable-user-pie   do not build usermode emulation targets as PIE"
+echo "  --enable-pie         build Position Independent Executables"
+echo "  --disable-pie        do not build Position Independent Executables"
 echo "  --fmod-lib   path to FMOD library"
 echo "  --fmod-inc   path to FMOD includes"
 echo "  --oss-lib    path to OSS library"
@@ -1099,6 +1099,47 @@ for flag in $gcc_flags; do
 fi
 done
 
+if test "$static" = "yes" ; then
+  if test "$pie" = "yes" ; then
+    echo "static and pie are mutually incompatible"
+    exit 1
+  else
+    pie="no"
+  fi
+fi
+
+if test "$pie" = ""; then
+  case "$cpu" in
+    i386|x86_64)
+      ;;
+    *)
+      pie="no"
+      ;;
+  esac
+fi
+
+if test "$pie" != "no" ; then
+  cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  if compile_prog "-fPIE -DPIE" "-Wl,-pie"; then
+    QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
+    LDFLAGS="-Wl,-pie $LDFLAGS"
+    pie="yes"
+    if compile_prog "" "-Wl,-z,relro -Wl,-z,now" ; then
+      LDFLAGS="-Wl,-z,relro -Wl,-z,now $LDFLAGS"
+    fi
+  else
+    if test "$pie" = "yes"; then
+      echo "PIE not available due to missing toolchain support"
+      exit 1
+    else
+      echo "Disabling PIE due to missing toolchain support"
+      pie="no"
+    fi
+  fi
+fi
+
 #
 # Solaris specific configure tool chain decisions
 #
@@ -2765,7 +2806,7 @@ echo "Documentation $docs"
 echo "uname -r  $uname_release"
 echo "NPTL support  $nptl"
 echo "GUEST_BASE$guest_base"
-echo "PIE user targets  $user_pie"
+echo "PIE   $pie"
 echo "vde support   $vde"
 echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
@@ -3225,9 +3266,6 @@ for d in libdis libdis-user; do
 symlink $source_path/Makefile.dis $d/Makefile
 echo > $d/config.mak
 done
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > libdis-user/config.mak
-fi
 
 for target in $target_list; do
 target_dir="$target"
@@ -3646,12 +3684,6 @@ if test "$target_softmmu" = "yes" ; then
   esac
 fi
 
-if test "$target_user_only" = "yes" -a "$static" = "no" -a \
-   "$user_pie" = "yes" ; then
-  cflags="-fpie $cflags"
-  ldflags="-pie $ldflags"
-fi
-
 if test "$target_softmmu" = "yes" -a \( \
 "$TARGET_ARCH" = "microblaze" -o \
 "$TARGET_ARCH" = "cris" \) ; then
@@ -3775,9 +3807,6 @@ d=libuser
 mkdir -p $d
 mkdir -p $d/trace
 symlink $source_path/Makefile.user $d/Makefile
-if test "$static" = "no" -a "$user_pie" = "yes" ; then
-  echo "QEMU_CFLAGS+=-fpie" > $d/config.mak
-fi
 
 if test "$docs" = "yes" ; then
   mkdir -p QMP
-- 
1.7.7.1
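
The v7 patch treats `pie` as a tri-state value: "yes" (explicitly requested),
"no" (explicitly disabled), or empty (no preference). The resolution that
happens before the toolchain probe can be sketched as a standalone function.
This is a hedged restatement of the hunk above, not a replacement for it:
`resolve_pie_request` is a hypothetical name, and printing "auto" for the
empty value is an assumption made here purely so the result is visible.

```shell
#!/bin/sh
# Sketch of the v7 option resolution that runs before the toolchain probe.
# resolve_pie_request is a hypothetical name; "auto" stands in for the empty
# string that configure uses to mean "try PIE, fall back if unsupported".
#   $1 = static ("yes"/"no"), $2 = pie ("yes"/"no"/""), $3 = host cpu
resolve_pie_request() {
  rp_static=$1
  rp_pie=$2
  rp_cpu=$3

  if test "$rp_static" = "yes"; then
    if test "$rp_pie" = "yes"; then
      echo "static and pie are mutually incompatible" >&2
      return 1
    fi
    rp_pie="no"
  fi

  if test "$rp_pie" = ""; then
    case "$rp_cpu" in
      i386|x86_64)
        ;;            # leave unset: probe the toolchain, best effort
      *)
        rp_pie="no"   # other hosts stay opt-in
        ;;
    esac
  fi

  echo "${rp_pie:-auto}"
}

resolve_pie_request no "" x86_64    # prints "auto"
resolve_pie_request no "" arm       # prints "no"
resolve_pie_request no yes arm      # prints "yes"
```

Keeping "auto" distinct from "yes" is what lets the later probe fail hard only
when the user asked for PIE explicitly.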

Re: [Qemu-devel] [PATCH] migration: add a MAINTAINERS entry for migration

2011-11-15 Thread Stefan Hajnoczi

On Mon, Nov 14, 2011 at 03:08:25PM -0600, Anthony Liguori wrote:
> On 11/14/2011 11:40 AM, Juan Quintela wrote:
> >Anthony Liguori  wrote:
> >>I think this is an accurate reflection of the state of migration today.
> >>This is the second release in a row where we're scrambling to fix a
> >>critical issue in migration.
> >
> >We need to make our mind about it.
> 
> Ultimately, we need to make migration a priority.  That's what I'm
> trying to do here.
> 
> The first step is to be open about the state of migration today.  I
> personally don't have the bandwidth to invest a lot of effort in
> migration, but I can invest time in trying to find more people to
> work on migration, and help put together a proper roadmap.

It would help to have a migration wiki page or document that explains
the implications of migration on QEMU code - what to look out for in
device emulation code.

Although regular QEMU contributors may know the background on
migration/save/load, it would be not only helpful for new contributors
but also a good refresher for those of us who have picked up the
assumptions around migration piecewise.

I think a good document would raise migration awareness and help us
review new patches with an eye towards correct migration behavior.

The rules need to be laid down by someone who understands migration
quite well.

Stefan

Re: [Qemu-devel] Memory sync algorithm during migration

2011-11-15 Thread Juan Quintela

Takuya Yoshikawa  wrote:
> Adding qemu-devel ML to CC.
>
> Your question should have been sent to qemu-devel ML because the logic
> is implemented in QEMU, not KVM.
>
> (2011/11/11 1:35), Oliver Hookins wrote:
>> Hi,
>>
>> I am performing some benchmarks on KVM migration on two different types of
>> VM. One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes
>> about 20 seconds to migrate on our hardware while the 32GB VM takes about a
>> minute.
>>
>> With a reasonable amount of memory activity going on (in the hundreds of MB
>> per second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
>> completes. Intuitively this tells me there is some watermarking of dirty
>> pages going on that is not particularly efficient when the dirty pages ratio
>> is high compared to total memory, but I may be completely incorrect.

> You can change the ratio IIRC.
> Hopefully, someone who knows well about QEMU will tell you better ways.
>
>   Takuya
>
>>
>> Could anybody fill me in on what might be going on here? We're using libvirt
>> 0.8.2 and kvm-83-224.el5.centos.1

This is pretty old qemu/kvm code base.
In principle, it makes no sense that with 32GB RAM migration finishes,
and with 4GB RAM it is unable (intuitively it should be, if ever, the
other way around).

Do you have an easy test that makes the problem easily reproducible?
Have you tried upstream qemu.git? (some improvements in that department).

Later, Juan.
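
The intuition above can be illustrated with a toy model of iterative pre-copy:
each round resends whatever was dirtied while the previous round was being
transferred. This is only a sketch under loud assumptions, not QEMU's or
kvm-83's actual algorithm: `precopy_rounds` and all of its parameters are
hypothetical, the dirty rate is taken as constant, and the model ignores the
fact that the dirty set can never exceed guest RAM.

```shell
#!/bin/sh
# Toy pre-copy model (an assumption for illustration, not QEMU code).
# Sizes in MB, rates in MB/s, integer arithmetic only.
#   $1 = guest RAM still to send   $2 = link bandwidth
#   $3 = guest dirty rate          $4 = stop threshold (final stop-and-copy)
#   $5 = round limit
# Prints the number of rounds needed, or "never" if the limit is reached.
precopy_rounds() {
  pr_remaining=$1
  pr_bandwidth=$2
  pr_dirty_rate=$3
  pr_threshold=$4
  pr_max_rounds=$5
  pr_round=0

  while test "$pr_remaining" -gt "$pr_threshold"; do
    if test "$pr_round" -ge "$pr_max_rounds"; then
      echo "never"
      return 0
    fi
    # Sending takes remaining/bandwidth seconds; the guest dirties
    # dirty_rate MB per second during that time.
    pr_remaining=$(( pr_remaining * pr_dirty_rate / pr_bandwidth ))
    pr_round=$(( pr_round + 1 ))
  done

  echo "$pr_round"
}

precopy_rounds 4096 100 50 32 100    # prints "7"
precopy_rounds 4096 100 100 32 20    # prints "never"
```

In this simplified model, convergence depends on the ratio of dirty rate to
bandwidth rather than on total RAM, which is why a small VM failing where a
large one succeeds looks surprising.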

Re: [Qemu-devel] [RFC 1.0] pc_piix: set qxl revision to 2 for pc-0.14

2011-11-15 Thread Gerd Hoffmann

  Hi,

> I mean, change stable-0.15 so that if it is invoked with -M 0.14, you
> get a rev 2 qxl device.

0.15 has qxl rev 2 too, 1.0 got rev 3.  No need to do anything for
-stable.  Also 0.15 didn't get its own machine type, so the pc-0.14
compat property (master branch) covers both 0.14 and 0.15.  Everything
is fine.

> The Fedora 16 regression which triggered this was running 0.15, I
> imagine (but it could be running virt-preview, or something, so not sure).

Must have been due to backports then, or running a master snapshot.

cheers,
  Gerd

Re: [Qemu-devel] Memory sync algorithm during migration

2011-11-15 Thread Oliver Hookins

On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
> Takuya Yoshikawa  wrote:
> > Adding qemu-devel ML to CC.
> >
> > Your question should have been sent to qemu-devel ML because the logic
> > is implemented in QEMU, not KVM.
> >
> > (2011/11/11 1:35), Oliver Hookins wrote:
> >> Hi,
> >>
> >> I am performing some benchmarks on KVM migration on two different types
> >> of VM. One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM
> >> takes about 20 seconds to migrate on our hardware while the 32GB VM takes
> >> about a minute.
> >>
> >> With a reasonable amount of memory activity going on (in the hundreds of
> >> MB per second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM
> >> never completes. Intuitively this tells me there is some watermarking of
> >> dirty pages going on that is not particularly efficient when the dirty
> >> pages ratio is high compared to total memory, but I may be completely
> >> incorrect.
> 
> > You can change the ratio IIRC.
> > Hopefully, someone who knows well about QEMU will tell you better ways.
> >
> > Takuya
> >
> >>
> >> Could anybody fill me in on what might be going on here? We're using
> >> libvirt 0.8.2 and kvm-83-224.el5.centos.1
> 
> This is pretty old qemu/kvm code base.
> In principle, it makes no sense that with 32GB RAM migration finishes,
> and with 4GB RAM it is unable (intuitively it should be, if ever, the
> other way around).

If the syncing of dirty pages is based on some kind of ratio that causes it to
resync everything if more than a certain percentage of total pages are dirty
then this could explain it, as with a smaller amount of memory it could be
continually restarting.

I can certainly state from observation that the network traffic continues at
high speed the whole time so perhaps this makes some sense. I was hoping that
this behaviour would cause someone to recall a problem like this that was solved
- maybe this is not the case!

> 
> Do you have an easy test that makes the problem easily reproducible?
> Have you tried upstream qemu.git? (some improvements in that department).

I haven't just yet, but I'll give this a try. Thanks for the suggestion!

[Qemu-devel] [PATCH] Fixing some spelling in docs/libcacard.txt

2011-11-15 Thread matthias . bgg

From: Matthias Brugger 


Signed-off-by: Matthias Brugger 
---
 docs/libcacard.txt |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/libcacard.txt b/docs/libcacard.txt
index 5dee6fa..576af57 100644
--- a/docs/libcacard.txt
+++ b/docs/libcacard.txt
@@ -170,7 +170,7 @@ public entry point:
int cert_count);
 
   The parameters for this are:
-  card   - the virtual card structure which will prepresent this card.
+  card   - the virtual card structure which will represent this card.
   flags  - option flags that may be specific to this card type.
   cert   - array of binary certificates.
   cert_len   - array of lengths of each of the certificates specified in cert.
@@ -179,7 +179,7 @@ public entry point:
   cert_count - number of entries in cert, cert_len, and key arrays.
 
   Any cert, cert_len, or key with the same index are matching sets. That is
-  cert[0] is cert_len[0] long and has the corresponsing private key of key[0].
+  cert[0] is cert_len[0] long and has the corresponding private key of key[0].
 
 The card type emulator is expected to own the VCardKeys, but it should copy
 any raw cert data it wants to save. It can create new applets and add them to
@@ -261,7 +261,7 @@ Prior to processing calling the card type emulator's VCardProcessAPDU function,
apdu->a_Le   - The expected length of any returned data.
apdu->a_cla  - The raw apdu class.
apdu->a_channel - The channel (decoded from the class).
-   apdu->a_secure_messaging_type - The decoded secure messagin type
+   apdu->a_secure_messaging_type - The decoded secure messaging type
(from class).
apdu->a_type - The decode class type.
apdu->a_gen_type - the generic class type (7816, PROPRIETARY, RFU, PTS).
@@ -273,7 +273,7 @@ Creating a Response --
 
 The expected result of any APDU call is a response. The card type emulator must
 set *response with an appropriate VCardResponse value if it returns VCARD_DONE.
-Reponses could be as simple as returning a 2 byte status word response, to as
+Responses could be as simple as returning a 2 byte status word response, to as
 complex as returning a block of data along with a 2 byte response. Which is
 returned will depend on the semantics of the APDU. The following functions will
 create card responses.
@@ -282,12 +282,12 @@ create card responses.
 
 This is the most basic function to get a response. This function will
 return a response the consists soley one 2 byte status code. If that status
-code is defined in card_7816t.h, then this function is guarrenteed to
+code is defined in card_7816t.h, then this function is guaranteed to
 return a response with that status. If a cart type specific status code
 is passed and vcard_make_response fails to allocate the appropriate memory
 for that response, then vcard_make_response will return a VCardResponse
 of VCARD7816_STATUS_EXC_ERROR_MEMORY. In any case, this function is
-guarrenteed to return a valid VCardResponse.
+guaranteed to return a valid VCardResponse.
 
 VCardResponse *vcard_response_new(unsigned char *buf, int len,
   VCard7816Status status);
-- 
1.7.1

[Qemu-devel] [PATCH] virtio-blk: fix cross-endian config space

2011-11-15 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/virtio-blk.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 01aeb28..4a15f0c 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -481,14 +481,14 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 stq_raw(&blkcfg.capacity, capacity);
 stl_raw(&blkcfg.seg_max, 128 - 2);
 stw_raw(&blkcfg.cylinders, cylinders);
+stl_raw(&blkcfg.blk_size, s->conf->logical_block_size);
+stw_raw(&blkcfg.min_io_size, s->conf->min_io_size / blkcfg.blk_size);
+stw_raw(&blkcfg.opt_io_size, s->conf->opt_io_size / blkcfg.blk_size);
 blkcfg.heads = heads;
 blkcfg.sectors = secs & ~s->sector_mask;
-blkcfg.blk_size = s->conf->logical_block_size;
 blkcfg.size_max = 0;
 blkcfg.physical_block_exp = get_physical_block_exp(s->conf);
 blkcfg.alignment_offset = 0;
-blkcfg.min_io_size = s->conf->min_io_size / blkcfg.blk_size;
-blkcfg.opt_io_size = s->conf->opt_io_size / blkcfg.blk_size;
 memcpy(config, &blkcfg, sizeof(struct virtio_blk_config));
 }
 
-- 
1.7.7.1

[Qemu-devel] [PATCH 01/14] slavio_misc: convert apc to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 1f5a2d7..7d427f7 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -48,6 +48,7 @@ typedef struct MiscState {
 
 typedef struct APCState {
 SysBusDevice busdev;
+MemoryRegion iomem;
 qemu_irq cpu_halt;
 } APCState;
 
@@ -270,7 +271,8 @@ static CPUWriteMemoryFunc * const slavio_aux2_mem_write[3] = {
 NULL,
 };
 
-static void apc_mem_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+static void apc_mem_writeb(void *opaque, target_phys_addr_t addr,
+   uint64_t val, unsigned size)
 {
 APCState *s = opaque;
 
@@ -278,7 +280,8 @@ static void apc_mem_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
 qemu_irq_raise(s->cpu_halt);
 }
 
-static uint32_t apc_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t apc_mem_readb(void *opaque, target_phys_addr_t addr,
+  unsigned size)
 {
 uint32_t ret = 0;
 
@@ -286,16 +289,14 @@ static uint32_t apc_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const apc_mem_read[3] = {
-apc_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const apc_mem_write[3] = {
-apc_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps apc_mem_ops = {
+.read = apc_mem_readb,
+.write = apc_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+}
 };
 
 static uint32_t slavio_sysctrl_mem_readl(void *opaque, target_phys_addr_t addr)
@@ -407,14 +408,13 @@ static const VMStateDescription vmstate_misc = {
 static int apc_init1(SysBusDevice *dev)
 {
 APCState *s = FROM_SYSBUS(APCState, dev);
-int io;
 
 sysbus_init_irq(dev, &s->cpu_halt);
 
 /* Power management (APC) XXX: not a Slavio device */
-io = cpu_register_io_memory(apc_mem_read, apc_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->iomem, &apc_mem_ops, s,
+  "apc", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->iomem);
 return 0;
 }
 
-- 
1.7.5.4

[Qemu-devel] [PATCH 03/14] slavio_misc: convert diagnostic to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 7b98f11..60a115d 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -37,6 +37,7 @@
 typedef struct MiscState {
 SysBusDevice busdev;
 MemoryRegion cfg_iomem;
+MemoryRegion diag_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -133,7 +134,7 @@ static const MemoryRegionOps slavio_cfg_mem_ops = {
 };
 
 static void slavio_diag_mem_writeb(void *opaque, target_phys_addr_t addr,
-   uint32_t val)
+   uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -141,7 +142,8 @@ static void slavio_diag_mem_writeb(void *opaque, target_phys_addr_t addr,
 s->diag = val & 0xff;
 }
 
-static uint32_t slavio_diag_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_diag_mem_readb(void *opaque, target_phys_addr_t addr,
+  unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -151,16 +153,14 @@ static uint32_t slavio_diag_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const slavio_diag_mem_read[3] = {
-slavio_diag_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_diag_mem_write[3] = {
-slavio_diag_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps slavio_diag_mem_ops = {
+.read = slavio_diag_mem_readb,
+.write = slavio_diag_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 static void slavio_mdm_mem_writeb(void *opaque, target_phys_addr_t addr,
@@ -433,10 +433,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 sysbus_init_mmio_region(dev, &s->cfg_iomem);
 
 /* Diagnostics */
-io = cpu_register_io_memory(slavio_diag_mem_read,
-slavio_diag_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->diag_iomem, &slavio_diag_mem_ops, s,
+  "diagnostic", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->diag_iomem);
 
 /* Modem control */
 io = cpu_register_io_memory(slavio_mdm_mem_read,
-- 
1.7.5.4

[Qemu-devel] [PATCH 07/14] slavio_misc: convert aux1 to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 29eca9b..7a51e1b 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -41,6 +41,7 @@ typedef struct MiscState {
 MemoryRegion mdm_iomem;
 MemoryRegion led_iomem;
 MemoryRegion sysctrl_iomem;
+MemoryRegion aux1_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -197,7 +198,7 @@ static const MemoryRegionOps slavio_mdm_mem_ops = {
 };
 
 static void slavio_aux1_mem_writeb(void *opaque, target_phys_addr_t addr,
-   uint32_t val)
+   uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -213,7 +214,8 @@ static void slavio_aux1_mem_writeb(void *opaque, target_phys_addr_t addr,
 s->aux1 = val & 0xff;
 }
 
-static uint32_t slavio_aux1_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_aux1_mem_readb(void *opaque, target_phys_addr_t addr,
+  unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -223,16 +225,14 @@ static uint32_t slavio_aux1_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const slavio_aux1_mem_read[3] = {
-slavio_aux1_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_aux1_mem_write[3] = {
-slavio_aux1_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps slavio_aux1_mem_ops = {
+.read = slavio_aux1_mem_readb,
+.write = slavio_aux1_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 static void slavio_aux2_mem_writeb(void *opaque, target_phys_addr_t addr,
@@ -455,10 +455,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 sysbus_init_mmio_region(dev, &s->sysctrl_iomem);
 
 /* AUX 1 (Misc System Functions) */
-io = cpu_register_io_memory(slavio_aux1_mem_read,
-slavio_aux1_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->aux1_iomem, &slavio_aux1_mem_ops, s,
+  "misc-system-functions", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->aux1_iomem);
 
 /* AUX 2 (Software Powerdown Control) */
 io = cpu_register_io_memory(slavio_aux2_mem_read,
-- 
1.7.5.4

[Qemu-devel] [PATCH 06/14] slavio_misc: convert system control to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index db266ba..29eca9b 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -40,6 +40,7 @@ typedef struct MiscState {
 MemoryRegion diag_iomem;
 MemoryRegion mdm_iomem;
 MemoryRegion led_iomem;
+MemoryRegion sysctrl_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -300,7 +301,8 @@ static const MemoryRegionOps apc_mem_ops = {
 }
 };
 
-static uint32_t slavio_sysctrl_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_sysctrl_mem_readl(void *opaque, target_phys_addr_t addr,
+ unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -317,7 +319,7 @@ static uint32_t slavio_sysctrl_mem_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static void slavio_sysctrl_mem_writel(void *opaque, target_phys_addr_t addr,
-  uint32_t val)
+  uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -334,16 +336,14 @@ static void slavio_sysctrl_mem_writel(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const slavio_sysctrl_mem_read[3] = {
-NULL,
-NULL,
-slavio_sysctrl_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const slavio_sysctrl_mem_write[3] = {
-NULL,
-NULL,
-slavio_sysctrl_mem_writel,
+static const MemoryRegionOps slavio_sysctrl_mem_ops = {
+.read = slavio_sysctrl_mem_readl,
+.write = slavio_sysctrl_mem_writel,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 };
 
 static uint64_t slavio_led_mem_readw(void *opaque, target_phys_addr_t addr,
@@ -450,10 +450,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 
 /* 32 bit registers */
 /* System control */
-io = cpu_register_io_memory(slavio_sysctrl_mem_read,
-slavio_sysctrl_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, SYSCTRL_SIZE, io);
+memory_region_init_io(&s->sysctrl_iomem, &slavio_sysctrl_mem_ops, s,
+  "system-control", SYSCTRL_SIZE);
+sysbus_init_mmio_region(dev, &s->sysctrl_iomem);
 
 /* AUX 1 (Misc System Functions) */
 io = cpu_register_io_memory(slavio_aux1_mem_read,
-- 
1.7.5.4

[Qemu-devel] [PATCH 11/14] sun4c_intctl: convert to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/sun4c_intctl.c |   32 +++-
 1 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/hw/sun4c_intctl.c b/hw/sun4c_intctl.c
index 5c7fdef..4d01d1c 100644
--- a/hw/sun4c_intctl.c
+++ b/hw/sun4c_intctl.c
@@ -46,6 +46,7 @@
 
 typedef struct Sun4c_INTCTLState {
 SysBusDevice busdev;
+MemoryRegion iomem;
 #ifdef DEBUG_IRQ_COUNT
 uint64_t irq_count;
 #endif
@@ -60,7 +61,8 @@ typedef struct Sun4c_INTCTLState {
 
 static void sun4c_check_interrupts(void *opaque);
 
-static uint32_t sun4c_intctl_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t sun4c_intctl_mem_readb(void *opaque, target_phys_addr_t addr,
+   unsigned size)
 {
 Sun4c_INTCTLState *s = opaque;
 uint32_t ret;
@@ -72,7 +74,7 @@ static uint32_t sun4c_intctl_mem_readb(void *opaque, target_phys_addr_t addr)
 }
 
 static void sun4c_intctl_mem_writeb(void *opaque, target_phys_addr_t addr,
-uint32_t val)
+uint64_t val, unsigned size)
 {
 Sun4c_INTCTLState *s = opaque;
 
@@ -82,16 +84,14 @@ static void sun4c_intctl_mem_writeb(void *opaque, target_phys_addr_t addr,
 sun4c_check_interrupts(s);
 }
 
-static CPUReadMemoryFunc * const sun4c_intctl_mem_read[3] = {
-sun4c_intctl_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const sun4c_intctl_mem_write[3] = {
-sun4c_intctl_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps sun4c_intctl_mem_ops = {
+.read = sun4c_intctl_mem_readb,
+.write = sun4c_intctl_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 void sun4c_pic_info(Monitor *mon, void *opaque)
@@ -192,13 +192,11 @@ static void sun4c_intctl_reset(DeviceState *d)
 static int sun4c_intctl_init1(SysBusDevice *dev)
 {
 Sun4c_INTCTLState *s = FROM_SYSBUS(Sun4c_INTCTLState, dev);
-int io_memory;
 unsigned int i;
 
-io_memory = cpu_register_io_memory(sun4c_intctl_mem_read,
-   sun4c_intctl_mem_write, s,
-   DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, INTCTL_SIZE, io_memory);
+memory_region_init_io(&s->iomem, &sun4c_intctl_mem_ops, s,
+  "interrupt-controller", INTCTL_SIZE);
+sysbus_init_mmio_region(dev, &s->iomem);
 qdev_init_gpio_in(&dev->qdev, sun4c_set_irq, 8);
 
 for (i = 0; i < MAX_PILS; i++) {
-- 
1.7.5.4

Re: [Qemu-devel] [PATCH] Fixing some spelling in docs/libcacard.txt

2011-11-15 Thread Alon Levy

On Tue, Nov 15, 2011 at 11:57:14AM +, matthias@googlemail.com wrote:
> From: Matthias Brugger 
> 

Thanks!

Reviewed-by: Alon Levy 

> 
> Signed-off-by: Matthias Brugger 
> ---
>  docs/libcacard.txt |   12 ++--
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/docs/libcacard.txt b/docs/libcacard.txt
> index 5dee6fa..576af57 100644
> --- a/docs/libcacard.txt
> +++ b/docs/libcacard.txt
> @@ -170,7 +170,7 @@ public entry point:
> int cert_count);
>  
>The parameters for this are:
> -  card   - the virtual card structure which will prepresent this card.
> +  card   - the virtual card structure which will represent this card.
>flags  - option flags that may be specific to this card type.
>cert   - array of binary certificates.
>cert_len   - array of lengths of each of the certificates specified in 
> cert.
> @@ -179,7 +179,7 @@ public entry point:
>cert_count - number of entries in cert, cert_len, and key arrays.
>  
>Any cert, cert_len, or key with the same index are matching sets. That is
> -  cert[0] is cert_len[0] long and has the corresponsing private key of 
> key[0].
> +  cert[0] is cert_len[0] long and has the corresponding private key of 
> key[0].
>  
>  The card type emulator is expected to own the VCardKeys, but it should copy
>  any raw cert data it wants to save. It can create new applets and add them to
> @@ -261,7 +261,7 @@ Prior to processing calling the card type emulator's 
> VCardProcessAPDU function,
> apdu->a_Le   - The expected length of any returned data.
> apdu->a_cla  - The raw apdu class.
> apdu->a_channel - The channel (decoded from the class).
> -   apdu->a_secure_messaging_type - The decoded secure messagin type
> +   apdu->a_secure_messaging_type - The decoded secure messaging type
> (from class).
> apdu->a_type - The decode class type.
> apdu->a_gen_type - the generic class type (7816, PROPRIETARY, RFU, PTS).
> @@ -273,7 +273,7 @@ Creating a Response --
>  
>  The expected result of any APDU call is a response. The card type emulator 
> must
>  set *response with an appropriate VCardResponse value if it returns 
> VCARD_DONE.
> -Reponses could be as simple as returning a 2 byte status word response, to as
> +Responses could be as simple as returning a 2 byte status word response, to 
> as
>  complex as returning a block of data along with a 2 byte response. Which is
>  returned will depend on the semantics of the APDU. The following functions 
> will
>  create card responses.
> @@ -282,12 +282,12 @@ create card responses.
>  
>  This is the most basic function to get a response. This function will
>  return a response the consists soley one 2 byte status code. If that 
> status
> -code is defined in card_7816t.h, then this function is guarrenteed to
> +code is defined in card_7816t.h, then this function is guaranteed to
>  return a response with that status. If a cart type specific status code
>  is passed and vcard_make_response fails to allocate the appropriate 
> memory
>  for that response, then vcard_make_response will return a VCardResponse
>  of VCARD7816_STATUS_EXC_ERROR_MEMORY. In any case, this function is
> -guarrenteed to return a valid VCardResponse.
> +guaranteed to return a valid VCardResponse.
>  
>  VCardResponse *vcard_response_new(unsigned char *buf, int len,
>VCard7816Status status);
> -- 
> 1.7.1
> 
>

Re: [Qemu-devel] [PATCH v7 1.0] configure: build position independent executables on x86 hosts

2011-11-15 Thread Peter Maydell

On 15 November 2011 09:34, Avi Kivity  wrote:
> Change the default on x86 hosts to building PIE (position independent
> executables); instead of restricting the option to user-only targets,
> apply it to all targets.
>
> In addition, set the relocation sections to read-only (relro) when available;
> this reduces the attack surface by disallowing changes to relocation tables
> at runtime.
>
> While PIE reduces performance and relro increases load time, it greatly
> improves security, with the potential to reduce a code execution vulnerability
> to a self denial of service.
>
> Non-x86 are not changed, as they require TCG changes.
>
> Signed-off-by: Avi Kivity 

Reviewed-by: Peter Maydell 

...as far as the technical content of the patch is concerned.
I'm still rather dubious about the merits of putting this patch
in this late in the release cycle.

-- PMM

[Qemu-devel] [PATCH 13/14] sun4m_iommu: convert to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/sun4m_iommu.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/sun4m_iommu.c b/hw/sun4m_iommu.c
index 6eeadfa..86d135a 100644
--- a/hw/sun4m_iommu.c
+++ b/hw/sun4m_iommu.c
@@ -128,13 +128,15 @@
 
 typedef struct IOMMUState {
 SysBusDevice busdev;
+MemoryRegion iomem;
 uint32_t regs[IOMMU_NREGS];
 target_phys_addr_t iostart;
 qemu_irq irq;
 uint32_t version;
 } IOMMUState;
 
-static uint32_t iommu_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t iommu_mem_readl(void *opaque, target_phys_addr_t addr,
+unsigned size)
 {
 IOMMUState *s = opaque;
 target_phys_addr_t saddr;
@@ -156,7 +158,7 @@ static uint32_t iommu_mem_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static void iommu_mem_writel(void *opaque, target_phys_addr_t addr,
- uint32_t val)
+ uint64_t val, unsigned size)
 {
 IOMMUState *s = opaque;
 target_phys_addr_t saddr;
@@ -237,16 +239,14 @@ static void iommu_mem_writel(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const iommu_mem_read[3] = {
-NULL,
-NULL,
-iommu_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const iommu_mem_write[3] = {
-NULL,
-NULL,
-iommu_mem_writel,
+static const MemoryRegionOps iommu_mem_ops = {
+.read = iommu_mem_readl,
+.write = iommu_mem_writel,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 };
 
 static uint32_t iommu_page_get_flags(IOMMUState *s, target_phys_addr_t addr)
@@ -347,13 +347,12 @@ static void iommu_reset(DeviceState *d)
 static int iommu_init1(SysBusDevice *dev)
 {
 IOMMUState *s = FROM_SYSBUS(IOMMUState, dev);
-int io;
 
 sysbus_init_irq(dev, &s->irq);
 
-io = cpu_register_io_memory(iommu_mem_read, iommu_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, IOMMU_NREGS * sizeof(uint32_t), io);
+memory_region_init_io(&s->iomem, &iommu_mem_ops, s,
+  "iommu", IOMMU_NREGS * sizeof(uint32_t));
+sysbus_init_mmio_region(dev, &s->iomem);
 
 return 0;
 }
-- 
1.7.5.4

[Qemu-devel] [PATCH 00/14] Convert Sun devices to memory API.

2011-11-15 Thread Benoît Canet

.valid was used where the access size is specified, as in
http://www.ibiblio.org/pub/historic-linux/early-ports/Sparc/NCR/NCR89C105.txt
.impl was used where the behavior is not known.
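
To illustrate the difference between the two fields, here is a minimal
standalone sketch. It does not use the QEMU headers: AccessLimits,
RegionOpsSketch, access_allowed() and callback_size() are hypothetical
stand-ins, and the default callback sizes of 1 and 4 are an assumption of
the sketch. The idea it models is that .valid bounds which guest access
sizes are accepted at all, while .impl only describes what the callbacks
implement, so other sizes are adjusted rather than rejected.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the size limits carried by MemoryRegionOps. */
typedef struct AccessLimits {
    unsigned min_access_size;   /* 0 = no lower bound given */
    unsigned max_access_size;   /* 0 = no upper bound given */
} AccessLimits;

typedef struct RegionOpsSketch {
    AccessLimits valid;   /* guest-visible: other sizes are rejected */
    AccessLimits impl;    /* internal: sizes the callbacks implement */
} RegionOpsSketch;

/* Only .valid decides whether a guest access of `size` bytes is accepted. */
static bool access_allowed(const RegionOpsSketch *ops, unsigned size)
{
    if (ops->valid.min_access_size && size < ops->valid.min_access_size) {
        return false;
    }
    if (ops->valid.max_access_size && size > ops->valid.max_access_size) {
        return false;
    }
    return true;
}

/*
 * .impl only shapes the call: the size is clamped into the range the
 * callback implements (defaults of 1 and 4 are assumed by this sketch).
 */
static unsigned callback_size(const RegionOpsSketch *ops, unsigned size)
{
    unsigned lo = ops->impl.min_access_size ? ops->impl.min_access_size : 1;
    unsigned hi = ops->impl.max_access_size ? ops->impl.max_access_size : 4;

    if (size < lo) {
        return lo;
    }
    if (size > hi) {
        return hi;
    }
    return size;
}

/* Shaped like the apc conversion: .valid restricts the guest to bytes. */
static const RegionOpsSketch apc_like = {
    .valid = { .min_access_size = 1, .max_access_size = 1 },
};

/* Shaped like the iommu conversion: 32-bit callbacks, no guest limits. */
static const RegionOpsSketch iommu_like = {
    .impl = { .min_access_size = 4, .max_access_size = 4 },
};
```

With these definitions a 4-byte access to apc_like is refused, whereas a
1-byte access to iommu_like is accepted and reaches the callback as a
4-byte access.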

Benoît Canet (14):
  slavio_misc: convert apc to memory API
  slavio_misc: convert configuration to memory API
  slavio_misc: convert diagnostic to memory API
  slavio_misc: convert modem to memory API
  slavio_misc: convert leds to memory API
  slavio_misc: convert system control to memory API
  slavio_misc: convert aux1 to memory API
  slavio_misc: convert aux2 to memory API
  slavio_intctl: convert master interrupt controller to memory API
  slavio_intctl: convert slaves interrupt controllers to memory API
  sun4c_intctl: convert to memory API
  slavio_timer: convert to memory API
  sun4m_iommu: convert to memory API
  esp: Fix memory API conversion

 hw/esp.c   |1 +
 hw/slavio_intctl.c |   67 +++---
 hw/slavio_misc.c   |  250 +---
 hw/slavio_timer.c  |   41 -
 hw/sun4c_intctl.c  |   32 +++
 hw/sun4m_iommu.c   |   31 +++
 6 files changed, 205 insertions(+), 217 deletions(-)

-- 
1.7.5.4

[Qemu-devel] [PATCH V2 00/12] Proxy FS driver for VirtFS

2011-11-15 Thread M. Mohan Kumar

The pass-through security model in the QEMU 9p server needs root privilege
for a few file operations (such as chown, or chmod to any mode/uid:gid).
There are two issues with the pass-through security model:

1) TOCTTOU vulnerability: Following symbolic links in the server could
provide access to files beyond 9p export path.

2) Running QEMU with root privilege could be a security issue.

To overcome the above issues, the following approach is used: a new
filesystem type 'proxy' is introduced. Proxy FS uses a chroot + socket
combination to close the known vulnerability of following symbolic links.
The intention of adding a new filesystem type is to allow QEMU to run
in non-root mode while performing privileged operations over socket I/O.

The proxy helper (a standalone binary that is part of QEMU) is invoked with
root privileges. It chroots into the 9p export path and creates
a socket pair or a named socket, depending on the command line parameter.
QEMU and the proxy helper communicate over this socket. The QEMU proxy FS
driver sends filesystem requests to the proxy helper and receives the
responses from it.

The proxy helper is designed so that it can drop root privilege while
retaining the capabilities needed for filesystem operations
(such as CAP_DAC_OVERRIDE and CAP_FOWNER).
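
As a rough illustration of the socket exchange described above, the
following standalone sketch passes one status reply across a socketpair.
It is not the code from this series: SketchReply, the SK_* type codes and
the sketch_* functions are hypothetical, and the reply is sent in host
byte order as a raw struct rather than through the series' marshalling
helpers. It only shows the shape of the protocol: a small header (type,
size) followed by a status that is 0 on success or -errno on failure.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum { SK_SUCCESS = 1, SK_ERROR = 2 };   /* hypothetical type codes */

typedef struct SketchReply {
    int32_t type;     /* SK_SUCCESS or SK_ERROR */
    int32_t size;     /* payload bytes that follow the header */
    int32_t status;   /* 0 on success, -errno on failure */
} SketchReply;

/* Helper side: report the outcome of a privileged operation. */
static int sketch_send_status(int fd, int status)
{
    SketchReply r = {
        .type = status < 0 ? SK_ERROR : SK_SUCCESS,
        .size = (int32_t)sizeof(r.status),
        .status = status,
    };

    return write(fd, &r, sizeof(r)) == (ssize_t)sizeof(r) ? 0 : -1;
}

/* QEMU side: read one reply, looping over short reads. */
static int sketch_recv_status(int fd, int *status)
{
    SketchReply r;
    char *p = (char *)&r;
    size_t got = 0;

    while (got < sizeof(r)) {
        ssize_t n = read(fd, p + got, sizeof(r) - got);
        if (n <= 0) {
            return -1;
        }
        got += (size_t)n;
    }
    if (r.size != (int32_t)sizeof(r.status)) {
        return -1;
    }
    *status = r.status;
    return 0;
}

/* Connect both ends with socketpair() and pass one status across. */
static int sketch_roundtrip(int status, int *out)
{
    int sv[2];
    int ret = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (sketch_send_status(sv[0], status) == 0) {
        ret = sketch_recv_status(sv[1], out);
    }
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

In the real design the two ends live in different processes, with the
privileged helper holding one descriptor and the unprivileged QEMU
process holding the other; the sketch keeps both in one process only so
that it can be run on its own.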

M. Mohan Kumar (12):
  hw/9pfs: Move pdu_marshal/unmarshal code to a seperate file
  hw/9pfs: Add new proxy filesystem driver
  hw/9pfs: File system helper process for qemu 9p proxy FS
  hw/9pfs: Open and create files
  hw/9pfs: Create other filesystem objects
  hw/9pfs: Add stat/readlink/statfs for proxy FS
  hw/9pfs: File ownership and others
  hw/9pfs: xattr interfaces in proxy filesystem driver
  hw/9pfs: Proxy getversion
  hw/9pfs: Documentation changes related to proxy fs
  hw/9pfs: man page for proxy helper
  hw/9pfs: Add support to use named socket for proxy FS

 Makefile   |   15 +-
 Makefile.objs  |4 +-
 configure  |   19 +
 fsdev/file-op-9p.h |3 +-
 fsdev/qemu-fsdev.c |1 +
 fsdev/qemu-fsdev.h |1 +
 fsdev/virtfs-proxy-helper.c|  947 +
 fsdev/virtfs-proxy-helper.texi |   63 +++
 fsdev/virtio-9p-marshal.c  |  338 
 fsdev/virtio-9p-marshal.h  |   87 +++
 hw/9pfs/virtio-9p-proxy.c  | 1123 
 hw/9pfs/virtio-9p-proxy.h  |   80 +++
 hw/9pfs/virtio-9p.c|  297 +---
 hw/9pfs/virtio-9p.h|   85 +---
 qemu-config.c  |   13 +
 qemu-options.hx|   32 +-
 vl.c   |   10 +-
 17 files changed, 2736 insertions(+), 382 deletions(-)
 create mode 100644 fsdev/virtfs-proxy-helper.c
 create mode 100644 fsdev/virtfs-proxy-helper.texi
 create mode 100644 fsdev/virtio-9p-marshal.c
 create mode 100644 fsdev/virtio-9p-marshal.h
 create mode 100644 hw/9pfs/virtio-9p-proxy.c
 create mode 100644 hw/9pfs/virtio-9p-proxy.h

-- 
1.7.6

[Qemu-devel] [PATCH V2 07/12] hw/9pfs: File ownership and others

2011-11-15 Thread M. Mohan Kumar

Add file ownership interfaces (chmod/chown) as well as utime update,
rename, remove and truncate operations for proxy FS.

Signed-off-by: M. Mohan Kumar 
---
 Makefile|2 +-
 fsdev/virtfs-proxy-helper.c |   66 +
 hw/9pfs/virtio-9p-proxy.c   |  134 +++
 hw/9pfs/virtio-9p-proxy.h   |6 ++
 4 files changed, 195 insertions(+), 13 deletions(-)

diff --git a/Makefile b/Makefile
index 19b481a..378ee4d 100644
--- a/Makefile
+++ b/Makefile
@@ -153,7 +153,7 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y)
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
 
-fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o oslib-posix.o
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
 
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index eb33504..ded0ead 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -495,6 +495,9 @@ static void usage(char *prog)
 
 static int process_requests(int sock)
 {
+uint64_t offset;
+int mode, uid, gid;
+struct timespec spec[2];
 int type, retval = 0;
 V9fsString oldpath, path;
 struct iovec in_iovec, out_iovec;
@@ -535,6 +538,63 @@ static int process_requests(int sock)
 case T_READLINK:
 size = do_readlink(&in_iovec, &out_iovec);
 break;
+case T_CHMOD:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ, "sd",
+&path, &mode);
+retval = chmod(path.data, mode);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+break;
+case T_CHOWN:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ, "sdd", &path,
+&uid, &gid);
+retval = lchown(path.data, uid, gid);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+break;
+case T_TRUNCATE:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ, "sq",
+&path, &offset);
+retval = truncate(path.data, offset);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+break;
+case T_UTIME:
+proxy_unmarshal(&in_iovec, 1,
+   HDR_SZ, "sqqqq", &path,
+   &spec[0].tv_sec, &spec[0].tv_nsec,
+   &spec[1].tv_sec, &spec[1].tv_nsec);
+retval = qemu_utimensat(AT_FDCWD, path.data, spec,
+AT_SYMLINK_NOFOLLOW);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+break;
+case T_RENAME:
+proxy_unmarshal(&in_iovec, 1,
+   HDR_SZ, "ss", &oldpath, &path);
+retval = rename(oldpath.data, path.data);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&oldpath);
+v9fs_string_free(&path);
+break;
+case T_REMOVE:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ, "s", &path);
+retval = remove(path.data);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+break;
 default:
 goto error;
 break;
@@ -550,6 +610,12 @@ static int process_requests(int sock)
 case T_MKDIR:
 case T_SYMLINK:
 case T_LINK:
+case T_CHMOD:
+case T_CHOWN:
+case T_TRUNCATE:
+case T_UTIME:
+case T_RENAME:
+case T_REMOVE:
 send_status(sock, &out_iovec, retval);
 break;
 case T_LSTAT:
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
index 090db44..aefdc61 100644
--- a/hw/9pfs/virtio-9p-proxy.c
+++ b/hw/9pfs/virtio-9p-proxy.c
@@ -242,6 +242,8 @@ static int v9fs_request(V9fsProxy *proxy, int type,
 struct iovec *iovec = NULL, *reply = NULL;
 dev_t rdev;
 int size = 0;
+struct timespec spec[2];
+uint64_t offset;
 
 qemu_mutex_lock(&proxy->mutex);
 
@@ -339,6 +341,63 @@ static int v9fs_request(V9fsProxy *proxy, int type,
 proxy_marshal(iovec, 1, 0, "dd", header.type, header.size);
 header.size += HDR_SZ;
 break;
+case T_CHMOD:
+path = va_arg(ap, V9fsString *);
+mode = va_arg(ap, int);
+header.size = proxy_marshal(iovec, 1, HDR_SZ, "sd",
+   path, mode);
+header.type = T_CHMOD;
+proxy_marshal(iovec, 1, 0, "dd", header

[Qemu-devel] [PATCH V2 05/12] hw/9pfs: Create other filesystem objects

2011-11-15 Thread M. Mohan Kumar

Add interfaces to create filesystem objects such as directories,
device nodes, symbolic links and hard links for the proxy filesystem driver.

Signed-off-by: M. Mohan Kumar 
---
 fsdev/virtfs-proxy-helper.c |  105 --
 hw/9pfs/virtio-9p-proxy.c   |  173 +++
 hw/9pfs/virtio-9p-proxy.h   |8 ++-
 3 files changed, 261 insertions(+), 25 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 68d27f1..377e91a 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -210,6 +210,28 @@ static void send_fd(int sockfd, int fd)
 }
 }
 
+static int send_status(int sockfd, struct iovec *iovec, int status)
+{
+int retval, msg_size;
+ProxyHeader header;
+
+if (status < 0) {
+header.type = T_ERROR;
+} else {
+header.type = T_SUCCESS;
+}
+header.size = sizeof(status);
+
+/* marshal the return status */
+msg_size = proxy_marshal(iovec, 1, 0, "ddd", header.type, header.size,
+status);
+retval = socket_write(sockfd, iovec->iov_base, msg_size);
+if (retval != msg_size) {
+return -EIO;
+}
+return 0;
+}
+
 /*
  * from man 7 capabilities, section
  * Effect of User ID Changes on Capabilities:
@@ -229,6 +251,49 @@ static int setfsugid(int uid, int gid)
 }
 
 /*
+ * create other filesystem objects and send 0 on success
+ * return -errno on error
+ */
+static int do_create_others(int type, struct iovec *iovec)
+{
+dev_t rdev;
+int retval = 0;
+V9fsString oldpath, path;
+int mode, uid, gid, cur_uid, cur_gid;
+int offset = HDR_SZ;
+
+cur_uid = geteuid();
+cur_gid = getegid();
+
+offset += proxy_unmarshal(iovec, 1, offset, "dd", &uid, &gid);
+if (setfsugid(uid, gid) < 0) {
+return -EPERM;
+}
+switch (type) {
+case T_MKNOD:
+proxy_unmarshal(iovec, 1, offset, "sdq", &path, &mode, &rdev);
+retval = mknod(path.data, mode, rdev);
+break;
+case T_MKDIR:
+proxy_unmarshal(iovec, 1, offset, "sd", &path, &mode);
+retval = mkdir(path.data, mode);
+break;
+case T_SYMLINK:
+proxy_unmarshal(iovec, 1, offset, "ss", &oldpath, &path);
+retval = symlink(oldpath.data, path.data);
+v9fs_string_free(&oldpath);
+break;
+}
+
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+setfsugid(cur_uid, cur_gid);
+return retval;
+}
+
+/*
  * create a file and send fd on success
  * return -errno on error
  */
@@ -281,18 +346,36 @@ static void usage(char *prog)
 static int process_requests(int sock)
 {
 int type, retval = 0;
-struct iovec iovec;
+V9fsString oldpath, path;
+struct iovec in_iovec, out_iovec;
+
+in_iovec.iov_base = g_malloc(BUFF_SZ);
+in_iovec.iov_len = BUFF_SZ;
+out_iovec.iov_base = g_malloc(BUFF_SZ);
+out_iovec.iov_len = BUFF_SZ;
 
-iovec.iov_base = g_malloc(BUFF_SZ);
-iovec.iov_len = BUFF_SZ;
 while (1) {
-type = read_request(sock, &iovec);
+type = read_request(sock, &in_iovec);
 switch (type) {
 case T_OPEN:
-retval = do_open(&iovec);
+retval = do_open(&in_iovec);
 break;
 case T_CREATE:
-retval = do_create(&iovec);
+retval = do_create(&in_iovec);
+break;
+case T_MKNOD:
+case T_MKDIR:
+case T_SYMLINK:
+retval = do_create_others(type, &in_iovec);
+break;
+case T_LINK:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ, "ss", &oldpath, &path);
+retval = link(oldpath.data, path.data);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&oldpath);
+v9fs_string_free(&path);
 break;
 default:
 goto error;
@@ -305,14 +388,20 @@ static int process_requests(int sock)
 case T_CREATE:
 send_fd(sock, retval);
 break;
+case T_MKNOD:
+case T_MKDIR:
+case T_SYMLINK:
+case T_LINK:
+send_status(sock, &out_iovec, retval);
+break;
 default:
 goto error;
 break;
 }
 }
-(void)socket_write;
 error:
-g_free(iovec.iov_base);
+g_free(in_iovec.iov_base);
+g_free(out_iovec.iov_base);
 return -1;
 }
 
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
index 8cc55d6..683d762 100644
--- a/hw/9pfs/virtio-9p-proxy.c
+++ b/hw/9pfs/virtio-9p-proxy.c
@@ -19,7 +19,8 @@
 typedef struct V9fsProxy {
 int sockfd;
 QemuMutex mutex;
-struct iovec iovec;
+struct iovec in_iovec;
+struct iovec out_iovec;
 } V9fsProxy;
 
 /*
@@ -79,6 +80,38 @@ static int v9fs_receivefd(int sockfd, int *sock_error)
 return -ENFILE; /* Ancillary data sent but not received */
 }
 
+static ssize_t socket_read(int sockfd, vo

[Qemu-devel] [PATCH V2 06/12] hw/9pfs: Add stat/readlink/statfs for proxy FS

2011-11-15 Thread M. Mohan Kumar

Signed-off-by: M. Mohan Kumar 
---
 fsdev/virtfs-proxy-helper.c |  165 
 hw/9pfs/virtio-9p-proxy.c   |  174 +--
 hw/9pfs/virtio-9p-proxy.h   |   34 +
 3 files changed, 367 insertions(+), 6 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 377e91a..eb33504 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -27,6 +27,8 @@
 #include 
 #include "bswap.h"
 #include 
+#include 
+#include 
 #include "qemu-common.h"
 #include "virtio-9p-marshal.h"
 #include "hw/9pfs/virtio-9p-proxy.h"
@@ -251,6 +253,154 @@ static int setfsugid(int uid, int gid)
 }
 
 /*
+ * send response in two parts
+ * 1) ProxyHeader
+ * 2) Response or error status
+ * This function should be called with marshaling response
+ * send_response constructs header part and error part only.
+ * send response sends {ProxyHeader,Response} if the request was success
+ * otherwise sends {ProxyHeader,error status}
+ */
+static int send_response(int sock, struct iovec *iovec, int size)
+{
+int retval;
+ProxyHeader header;
+
+if (size < 0) {
+header.type = T_ERROR;
+header.size = sizeof(size);
+proxy_marshal(iovec, 1, HDR_SZ, "d", size);
+} else {
+header.type = T_SUCCESS;
+header.size = size;
+}
+
+proxy_marshal(iovec, 1, 0, "dd", header.type, header.size);
+retval = socket_write(sock, iovec->iov_base, header.size + HDR_SZ);
+if (retval != header.size + HDR_SZ) {
+return -EIO;
+}
+return 0;
+}
+
+static void stat_to_prstat(ProxyStat *pr_stat, struct stat *stat)
+{
+memset(pr_stat, 0, sizeof(*pr_stat));
+pr_stat->st_dev = stat->st_dev;
+pr_stat->st_ino = stat->st_ino;
+pr_stat->st_nlink = stat->st_nlink;
+pr_stat->st_mode = stat->st_mode;
+pr_stat->st_uid = stat->st_uid;
+pr_stat->st_gid = stat->st_gid;
+pr_stat->st_rdev = stat->st_rdev;
+pr_stat->st_size = stat->st_size;
+pr_stat->st_blksize = stat->st_blksize;
+pr_stat->st_blocks = stat->st_blocks;
+pr_stat->st_atim_sec = stat->st_atim.tv_sec;
+pr_stat->st_atim_nsec = stat->st_atim.tv_nsec;
+pr_stat->st_mtim_sec = stat->st_mtim.tv_sec;
+pr_stat->st_mtim_nsec = stat->st_mtim.tv_nsec;
+pr_stat->st_ctim_sec = stat->st_ctim.tv_sec;
+pr_stat->st_ctim_nsec = stat->st_ctim.tv_nsec;
+}
+
+static void statfs_to_prstatfs(ProxyStatFS *pr_stfs, struct statfs *stfs)
+{
+memset(pr_stfs, 0, sizeof(*pr_stfs));
+pr_stfs->f_type = stfs->f_type;
+pr_stfs->f_bsize = stfs->f_bsize;
+pr_stfs->f_blocks = stfs->f_blocks;
+pr_stfs->f_bfree = stfs->f_bfree;
+pr_stfs->f_bavail = stfs->f_bavail;
+pr_stfs->f_files = stfs->f_files;
+pr_stfs->f_ffree = stfs->f_ffree;
+pr_stfs->f_fsid[0] = stfs->f_fsid.__val[0];
+pr_stfs->f_fsid[1] = stfs->f_fsid.__val[1];
+pr_stfs->f_namelen = stfs->f_namelen;
+pr_stfs->f_frsize = stfs->f_frsize;
+}
+
+/*
+ * Gets stat/statfs information and packs in out_iovec structure
+ * on success returns number of bytes packed in out_iovec structure
+ * otherwise returns -errno
+ */
+static int do_stat(int type, struct iovec *iovec, struct iovec *out_iovec)
+{
+V9fsString path;
+int retval = 0;
+
+proxy_unmarshal(iovec, 1, HDR_SZ, "s", &path);
+
+switch (type) {
+case T_LSTAT: {
+struct stat st_buf;
+ProxyStat pr_stat;
+
+retval = lstat(path.data, &st_buf);
+if (retval < 0) {
+retval = -errno;
+} else {
+stat_to_prstat(&pr_stat, &st_buf);
+retval = proxy_marshal(out_iovec, 1, HDR_SZ,
+"qqqdddqqqqqqqqqq", pr_stat.st_dev,
+pr_stat.st_ino, pr_stat.st_nlink,
+pr_stat.st_mode, pr_stat.st_uid,
+pr_stat.st_gid, pr_stat.st_rdev,
+pr_stat.st_size, pr_stat.st_blksize,
+pr_stat.st_blocks,
+pr_stat.st_atim_sec, pr_stat.st_atim_nsec,
+pr_stat.st_mtim_sec, pr_stat.st_mtim_nsec,
+pr_stat.st_ctim_sec, pr_stat.st_ctim_nsec);
+}
+break;
+}
+case T_STATFS: {
+struct statfs stfs_buf;
+ProxyStatFS pr_stfs;
+
+retval = statfs(path.data, &stfs_buf);
+if (retval < 0) {
+retval = -errno;
+} else {
+statfs_to_prstatfs(&pr_stfs, &stfs_buf);
+retval = proxy_marshal(out_iovec, 1, HDR_SZ,
+"qqq", pr_stfs.f_type, pr_stfs.f_bsize,
+pr_stfs.f_blocks, pr_stfs.f_bfree, pr_stfs.f_bavail,
+pr_stfs.f_files, pr_stfs.f_ffree,
+pr_stfs.f_fsid[0], pr_stfs.f_fsid[1],
+pr_stfs.f_namelen, pr_stfs.f_frsize);
+}
+

[Qemu-devel] [PATCH V2 02/12] hw/9pfs: Add new proxy filesystem driver

2011-11-15 Thread M. Mohan Kumar

Add a new proxy filesystem driver so that the qemu process can perform
operations that need root privilege. It needs a helper process started
by the root user.

The following command line can be used to select the proxy filesystem driver:
-virtfs proxy,id=,mount_tag=,socket_fd=

Signed-off-by: M. Mohan Kumar 
---
 Makefile.objs |2 +-
 fsdev/file-op-9p.h|1 -
 fsdev/qemu-fsdev.c|1 +
 fsdev/qemu-fsdev.h|1 +
 fsdev/virtio-9p-marshal.h |2 +-
 hw/9pfs/virtio-9p-proxy.c |  374 +
 hw/9pfs/virtio-9p-proxy.h |   10 ++
 qemu-config.c |6 +
 vl.c  |6 +-
 9 files changed, 399 insertions(+), 4 deletions(-)
 create mode 100644 hw/9pfs/virtio-9p-proxy.c
 create mode 100644 hw/9pfs/virtio-9p-proxy.h

diff --git a/Makefile.objs b/Makefile.objs
index c256fdc..8201202 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -311,7 +311,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)
 9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
 9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-coth.o cofs.o codir.o cofile.o
 9pfs-nested-$(CONFIG_VIRTFS) += coxattr.o virtio-9p-handle.o
-9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-synth.o
+9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-synth.o virtio-9p-proxy.o
 
 hw-obj-$(CONFIG_REALLY_VIRTFS) += $(addprefix 9pfs/, $(9pfs-nested-y))
 $(addprefix 9pfs/, $(9pfs-nested-y)): QEMU_CFLAGS+=$(GLIB_CFLAGS)
diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 22849c9..84e5375 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -60,7 +60,6 @@ typedef struct extended_ops {
 
 #define V9FS_SEC_MASK   0x001C
 
-
 typedef struct FileOperations FileOperations;
 /*
  * Structure to store the various fsdev's passed through command line.
diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index efbfea1..b31d116 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -25,6 +25,7 @@ static FsDriverTable FsDrivers[] = {
 { .name = "local", .ops = &local_ops},
 { .name = "handle", .ops = &handle_ops},
 { .name = "synth", .ops = &synth_ops},
+{ .name = "proxy", .ops = &proxy_ops},
 };
 
 int qemu_fsdev_add(QemuOpts *opts)
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index 921452d..1af1f54 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -44,4 +44,5 @@ FsDriverEntry *get_fsdev_fsentry(char *id);
 extern FileOperations local_ops;
 extern FileOperations handle_ops;
 extern FileOperations synth_ops;
+extern FileOperations proxy_ops;
 #endif
diff --git a/fsdev/virtio-9p-marshal.h b/fsdev/virtio-9p-marshal.h
index fe2d34b..45dba20 100644
--- a/fsdev/virtio-9p-marshal.h
+++ b/fsdev/virtio-9p-marshal.h
@@ -30,7 +30,7 @@ typedef struct V9fsStat
 V9fsString muid;
 /* 9p2000.u */
 V9fsString extension;
-   int32_t n_uid;
+int32_t n_uid;
 int32_t n_gid;
 int32_t n_muid;
 } V9fsStat;
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
new file mode 100644
index 000..0e539e3
--- /dev/null
+++ b/hw/9pfs/virtio-9p-proxy.c
@@ -0,0 +1,374 @@
+/*
+ * Virtio 9p Proxy callback
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ * M. Mohan Kumar 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+#include "hw/virtio.h"
+#include "virtio-9p.h"
+#include "fsdev/qemu-fsdev.h"
+#include "virtio-9p-proxy.h"
+
+typedef struct V9fsProxy {
+int sockfd;
+QemuMutex mutex;
+struct iovec iovec;
+} V9fsProxy;
+
+static int proxy_lstat(FsContext *fs_ctx, V9fsPath *fs_path, struct stat *stbuf)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
+static ssize_t proxy_readlink(FsContext *fs_ctx, V9fsPath *fs_path,
+  char *buf, size_t bufsz)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
+static int proxy_close(FsContext *ctx, V9fsFidOpenState *fs)
+{
+return close(fs->fd);
+}
+
+static int proxy_closedir(FsContext *ctx, V9fsFidOpenState *fs)
+{
+return closedir(fs->dir);
+}
+
+static int proxy_open(FsContext *ctx, V9fsPath *fs_path,
+  int flags, V9fsFidOpenState *fs)
+{
+fs->fd = -1;
+return fs->fd;
+}
+
+static int proxy_opendir(FsContext *ctx,
+ V9fsPath *fs_path, V9fsFidOpenState *fs)
+{
+fs->dir = NULL;
+errno = EOPNOTSUPP;
+return -1;
+}
+
+static void proxy_rewinddir(FsContext *ctx, V9fsFidOpenState *fs)
+{
+return rewinddir(fs->dir);
+}
+
+static off_t proxy_telldir(FsContext *ctx, V9fsFidOpenState *fs)
+{
+return telldir(fs->dir);
+}
+
+static int proxy_readdir_r(FsContext *ctx, V9fsFidOpenState *fs,
+   struct dirent *entry,
+   struct dirent **result)
+{
+return readdir_r(fs->dir, entry, result);
+}
+
+static void proxy_seekdir(FsContext *ctx, V9fsFidOpenState *fs, off_t off)
+{
+return seekdir(fs->dir, off);
+}
+
+static ssize_t proxy_preadv(FsContext

[Qemu-devel] [PATCH V2 12/12] hw/9pfs: Add support to use named socket for proxy FS

2011-11-15 Thread M. Mohan Kumar

Add an option to use a named socket for communication between the proxy
helper and the qemu proxy FS. Access to the socket can be granted with the
command line options -u and -g. We can achieve the same using a shell script
over qemu and virtfs-proxy-helper using exec fd<>, and then
passing that fd as argument to qemu and virtfs-proxy-helper. Also, having
a server like virtfs-proxy-helper listening on a pathname without any
authentication is a little bit scary, so we have to decide whether this
patch is really needed.
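
For readers who want to experiment with the named-socket idea, here is a minimal sketch, not the code from this patch. All sketch_* names are hypothetical. It assumes a plain AF_UNIX stream socket, rejects paths that do not fit in sun_path, and sets the address length before calling accept(), which POSIX treats as a value-result argument. The chown() step and any authentication are left out.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un, refusing paths that do not fit in sun_path. */
static int sketch_fill_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len;

    assert(addr != NULL && path != NULL);
    len = strlen(path);
    if (len == 0 || len >= sizeof(addr->sun_path)) {
        return -ENAMETOOLONG;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1);
    return 0;
}

/* Helper side: bind and listen on the named socket.
 * Returns the listening descriptor or -errno. */
static int sketch_listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int sock;
    int ret = sketch_fill_addr(&addr, path);

    if (ret < 0) {
        return ret;
    }
    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -errno;
    }
    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(sock, 1) < 0) {
        ret = -errno;
        close(sock);
        return ret;
    }
    return sock;
}

/* Helper side: accept one client. The length argument is initialised
 * to the size of the address buffer before the call. */
static int sketch_accept_one(int sock)
{
    struct sockaddr_un peer;
    socklen_t size = sizeof(peer);
    int client = accept(sock, (struct sockaddr *)&peer, &size);

    return client < 0 ? -errno : client;
}

/* Client side: connect to the named socket. Returns fd or -errno. */
static int sketch_connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int sock;
    int ret = sketch_fill_addr(&addr, path);

    if (ret < 0) {
        return ret;
    }
    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0) {
        return -errno;
    }
    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        ret = -errno;
        close(sock);
        return ret;
    }
    return sock;
}
```

A caller would run sketch_listen_unix() and sketch_accept_one() in the helper and sketch_connect_unix() in the client; anyone with permission on the socket path can connect, which is the concern raised above.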

Signed-off-by: M. Mohan Kumar 
Signed-off-by: Aneesh Kumar K.V 
---
 fsdev/file-op-9p.h |2 +
 fsdev/virtfs-proxy-helper.c|   88 ++--
 fsdev/virtfs-proxy-helper.texi |4 ++
 hw/9pfs/virtio-9p-proxy.c  |   52 ---
 qemu-config.c  |7 +++
 qemu-options.hx|   15 +--
 vl.c   |6 ++-
 7 files changed, 158 insertions(+), 16 deletions(-)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 84e5375..ac98e10 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -57,6 +57,8 @@ typedef struct extended_ops {
  */
 #define V9FS_SM_NONE0x0010
 #define V9FS_RDONLY 0x0020
+#define V9FS_PROXY_SOCK_FD  0x0040
+#define V9FS_PROXY_SOCK_NAME0x0080
 
 #define V9FS_SEC_MASK   0x001C
 
diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 917bb26..6c9ee3b 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -60,6 +60,9 @@ static struct option helper_opts[] = {
 {"fd", required_argument, NULL, 'f'},
 {"path", required_argument, NULL, 'p'},
 {"nodaemon", no_argument, NULL, 'n'},
+{"socket", required_argument, NULL, 's'},
+{"uid", required_argument, NULL, 'u'},
+{"gid", required_argument, NULL, 'g'},
 };
 
 int is_daemon;
@@ -584,11 +587,61 @@ static int do_open(struct iovec *iovec)
 return fd;
 }
 
+/* create unix domain socket and return the descriptor */
+static int proxy_socket(const char *path, uid_t uid, gid_t gid)
+{
+int sock, client;
+struct sockaddr_un proxy, qemu;
+socklen_t size;
+
+/* requested socket already exists, refuse to start */
+if (!access(path, F_OK)) {
+do_log(LOG_CRIT, "socket already exists\n");
+return -1;
+}
+
+sock = socket(AF_UNIX, SOCK_STREAM, 0);
+if (sock < 0) {
+do_perror("socket");
+return -1;
+}
+
+/* mask other part of mode bits */
+umask(7);
+
+proxy.sun_family = AF_UNIX;
+strcpy(proxy.sun_path, path);
+if (bind(sock, (struct sockaddr *)&proxy,
+sizeof(struct sockaddr_un)) < 0) {
+do_perror("bind");
+return -1;
+}
+if (chown(proxy.sun_path, uid, gid) < 0) {
+do_perror("chown");
+return -1;
+}
+if (listen(sock, 1) < 0) {
+do_perror("listen");
+return -1;
+}
+
+client = accept(sock, (struct sockaddr *)&qemu, &size);
+if (client < 0) {
+do_perror("accept");
+return -1;
+}
+return client;
+}
+
 static void usage(char *prog)
 {
 fprintf(stderr, "usage: %s\n"
 " -p|--path  9p path to export\n"
 " {-f|--fd } socket file descriptor to be used\n"
+" {-s|--socket  socket file used for communication\n"
+" \t-u|--uid  -g|--gid } - uid:gid combination to give "
+" access to this socket\n"
+" \tNote: -s & -f can not be used together\n"
 " [-n|--nodaemon] Run as a normal program\n",
 basename(prog));
 }
@@ -774,18 +827,22 @@ error:
 int main(int argc, char **argv)
 {
 int sock;
+char sock_name[PATH_MAX];
 char rpath[PATH_MAX];
 struct stat stbuf;
 int c, option_index;
 int retval;
 struct statfs st_fs;
+uid_t own_u;
+gid_t own_g;
 
 is_daemon = 1;
-rpath[0] = '\0';
+sock_name[0] = rpath[0] = '\0';
 sock = -1;
+own_u = own_g = -1;
 while (1) {
 option_index = 0;
-c = getopt_long(argc, argv, "p:nh?f:", helper_opts,
+c = getopt_long(argc, argv, "p:nh?f:s:u:g:", helper_opts,
 &option_index);
 if (c == -1) {
 break;
@@ -800,6 +857,15 @@ int main(int argc, char **argv)
 case 'f':
 sock = atoi(optarg);
 break;
+case 's':
+strcpy(sock_name, optarg);
+break;
+case 'u':
+own_u = atoi(optarg);
+break;
+case 'g':
+own_g = atoi(optarg);
+break;
 case '?':
 case 'h':
 default:
@@ -810,8 +876,16 @@ int main(int argc, char **argv)
 }
 
 /* Parameter validation */
-if (sock == -1 || rpath[0] == '\0') {
-fprintf(stderr, "socket descriptor or path not specified\n");
+if ((sock_name[0] == '\0' && sock == -1) || rpath[0] == '\0') {
+

[Qemu-devel] [PATCH V2 10/12] hw/9pfs: Documentation changes related to proxy fs

2011-11-15 Thread M. Mohan Kumar

Signed-off-by: M. Mohan Kumar 
---
 qemu-options.hx |   25 -
 1 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 681eaf1..cde17ed 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -530,19 +530,19 @@ DEFHEADING()
 DEFHEADING(File system options:)
 
 DEF("fsdev", HAS_ARG, QEMU_OPTION_fsdev,
-"-fsdev fsdriver,id=id,path=path,[security_model={mapped|passthrough|none}]\n"
-"   [,writeout=immediate][,readonly]\n",
+"-fsdev fsdriver,id=id[,path=path,][security_model={mapped|passthrough|none}]\n"
+" [,writeout=immediate][,readonly][,sock_fd=sock_fd]\n",
 QEMU_ARCH_ALL)
 
 STEXI
 
-@item -fsdev @var{fsdriver},id=@var{id},path=@var{path},[security_model=@var{security_model}][,writeout=@var{writeout}][,readonly]
+@item -fsdev @var{fsdriver},id=@var{id},path=@var{path},[security_model=@var{security_model}][,writeout=@var{writeout}][,readonly][,sock_fd=@var{sock_fd}]
 @findex -fsdev
 Define a new file system device. Valid options are:
 @table @option
 @item @var{fsdriver}
 This option specifies the fs driver backend to use.
-Currently "local" and "handle" file system drivers are supported.
+Currently "local", "handle" and "proxy" file system drivers are supported.
 @item id=@var{id}
 Specifies identifier for this device
 @item path=@var{path}
@@ -559,7 +559,7 @@ file attributes. Directories exported by this security model cannot
 interact with other unix tools. "none" security model is same as
 passthrough except the sever won't report failures if it fails to
 set file attributes like ownership. Security model is mandatory
-only for local fsdriver. Other fsdrivers (like handle) don't take
+only for local fsdriver. Other fsdrivers (like handle, proxy) don't take
 security model as a parameter.
 @item writeout=@var{writeout}
 This is an optional argument. The only supported value is "immediate".
@@ -569,6 +569,10 @@ reported as written by the storage subsystem.
 @item readonly
 Enables exporting 9p share as a readonly mount for guests. By default
 read-write access is given.
+@item sock_fd=@var{sock_fd}
+Enables proxy filesystem driver to use passed socket descriptor for
+communicating with virtfs-proxy-helper. Usually a helper like libvirt
+will create socketpair and pass one of the fds as sock_fd
 @end table
 
 -fsdev option is used along with -device driver "virtio-9p-pci".
@@ -589,19 +593,19 @@ DEFHEADING(Virtual File system pass-through options:)
 
 DEF("virtfs", HAS_ARG, QEMU_OPTION_virtfs,
 "-virtfs local,path=path,mount_tag=tag,security_model=[mapped|passthrough|none]\n"
-"[,writeout=immediate][,readonly]\n",
+"[,writeout=immediate][,readonly][,sock_fd=sock_fd]\n",
 QEMU_ARCH_ALL)
 
 STEXI
 
-@item -virtfs @var{fsdriver},path=@var{path},mount_tag=@var{mount_tag},security_model=@var{security_model}[,writeout=@var{writeout}][,readonly]
+@item -virtfs @var{fsdriver}[,path=@var{path}],mount_tag=@var{mount_tag}[,security_model=@var{security_model}][,writeout=@var{writeout}][,readonly][,sock_fd=@var{sock_fd}]
 @findex -virtfs
 
 The general form of a Virtual File system pass-through options are:
 @table @option
 @item @var{fsdriver}
 This option specifies the fs driver backend to use.
-Currently "local" and "handle" file system drivers are supported.
+Currently "local", "handle" and "proxy" file system drivers are supported.
 @item id=@var{id}
 Specifies identifier for this device
 @item path=@var{path}
@@ -618,7 +622,7 @@ file attributes. Directories exported by this security model cannot
 interact with other unix tools. "none" security model is same as
 passthrough except the sever won't report failures if it fails to
 set file attributes like ownership. Security model is mandatory only
-for local fsdriver. Other fsdrivers (like handle) don't take security
+for local fsdriver. Other fsdrivers (like handle, proxy) don't take security
 model as a parameter.
 @item writeout=@var{writeout}
 This is an optional argument. The only supported value is "immediate".
@@ -628,6 +632,9 @@ reported as written by the storage subsystem.
 @item readonly
 Enables exporting 9p share as a readonly mount for guests. By default
 read-write access is given.
+@item sock_fd
+Enables proxy filesystem driver to use passed 'sock_fd' as the socket
+descriptor for interfacing with virtfs-proxy-helper
 @end table
 ETEXI
 
-- 
1.7.6

[Qemu-devel] [PATCH 08/14] slavio_misc: convert aux2 to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   32 +++-
 1 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 7a51e1b..ccc1c53 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -42,6 +42,7 @@ typedef struct MiscState {
 MemoryRegion led_iomem;
 MemoryRegion sysctrl_iomem;
 MemoryRegion aux1_iomem;
+MemoryRegion aux2_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -236,7 +237,7 @@ static const MemoryRegionOps slavio_aux1_mem_ops = {
 };
 
 static void slavio_aux2_mem_writeb(void *opaque, target_phys_addr_t addr,
-   uint32_t val)
+   uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -251,7 +252,8 @@ static void slavio_aux2_mem_writeb(void *opaque, target_phys_addr_t addr,
 slavio_misc_update_irq(s);
 }
 
-static uint32_t slavio_aux2_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_aux2_mem_readb(void *opaque, target_phys_addr_t addr,
+  unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -261,16 +263,14 @@ static uint32_t slavio_aux2_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const slavio_aux2_mem_read[3] = {
-slavio_aux2_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_aux2_mem_write[3] = {
-slavio_aux2_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps slavio_aux2_mem_ops = {
+.read = slavio_aux2_mem_readb,
+.write = slavio_aux2_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 static void apc_mem_writeb(void *opaque, target_phys_addr_t addr,
@@ -421,7 +421,6 @@ static int apc_init1(SysBusDevice *dev)
 static int slavio_misc_init1(SysBusDevice *dev)
 {
 MiscState *s = FROM_SYSBUS(MiscState, dev);
-int io;
 
 sysbus_init_irq(dev, &s->irq);
 sysbus_init_irq(dev, &s->fdc_tc);
@@ -460,10 +459,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 sysbus_init_mmio_region(dev, &s->aux1_iomem);
 
 /* AUX 2 (Software Powerdown Control) */
-io = cpu_register_io_memory(slavio_aux2_mem_read,
-slavio_aux2_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->aux2_iomem, &slavio_aux2_mem_ops, s,
+  "software-powerdown-control", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->aux2_iomem);
 
 qdev_init_gpio_in(&dev->qdev, slavio_set_power_fail, 1);
 
-- 
1.7.5.4

[Qemu-devel] [PATCH V2 01/12] hw/9pfs: Move pdu_marshal/unmarshal code to a seperate file

2011-11-15 Thread M. Mohan Kumar

Move the p9 marshaling/unmarshaling code to a separate file so that the
proxy filesystem driver can use these calls. Also make the marshaling
code generic so that it accepts "struct iovec" instead of V9fsPDU.

Signed-off-by: M. Mohan Kumar 
---
 Makefile.objs |2 +-
 fsdev/virtio-9p-marshal.c |  338 +
 fsdev/virtio-9p-marshal.h |   87 
 hw/9pfs/virtio-9p.c   |  297 +---
 hw/9pfs/virtio-9p.h   |   85 ++--
 5 files changed, 440 insertions(+), 369 deletions(-)
 create mode 100644 fsdev/virtio-9p-marshal.c
 create mode 100644 fsdev/virtio-9p-marshal.h

diff --git a/Makefile.objs b/Makefile.objs
index d7a6539..c256fdc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -61,7 +61,7 @@ ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
 # only pull in the actual virtio-9p device if we also enabled virtio.
 CONFIG_REALLY_VIRTFS=y
-fsdev-nested-y = qemu-fsdev.o
+fsdev-nested-y = qemu-fsdev.o virtio-9p-marshal.o
 else
 fsdev-nested-y = qemu-fsdev-dummy.o
 endif
diff --git a/fsdev/virtio-9p-marshal.c b/fsdev/virtio-9p-marshal.c
new file mode 100644
index 000..2da0a34
--- /dev/null
+++ b/fsdev/virtio-9p-marshal.c
@@ -0,0 +1,338 @@
+/*
+ * Virtio 9p backend
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "compiler.h"
+#include "virtio-9p-marshal.h"
+#include "bswap.h"
+
+void v9fs_string_init(V9fsString *str)
+{
+str->data = NULL;
+str->size = 0;
+}
+
+void v9fs_string_free(V9fsString *str)
+{
+g_free(str->data);
+str->data = NULL;
+str->size = 0;
+}
+
+void v9fs_string_null(V9fsString *str)
+{
+v9fs_string_free(str);
+}
+
+void GCC_FMT_ATTR(2, 3)
+v9fs_string_sprintf(V9fsString *str, const char *fmt, ...)
+{
+va_list ap;
+
+v9fs_string_free(str);
+
+va_start(ap, fmt);
+str->size = g_vasprintf(&str->data, fmt, ap);
+va_end(ap);
+}
+
+void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs)
+{
+v9fs_string_free(lhs);
+v9fs_string_sprintf(lhs, "%s", rhs->data);
+}
+
+
+static size_t v9fs_packunpack(void *addr, struct iovec *sg, int sg_count,
+  size_t offset, size_t size, int pack)
+{
+int i = 0;
+size_t copied = 0;
+
+for (i = 0; size && i < sg_count; i++) {
+size_t len;
+if (offset >= sg[i].iov_len) {
+/* skip this sg */
+offset -= sg[i].iov_len;
+continue;
+} else {
+len = MIN(sg[i].iov_len - offset, size);
+if (pack) {
+memcpy(sg[i].iov_base + offset, addr, len);
+} else {
+memcpy(addr, sg[i].iov_base + offset, len);
+}
+size -= len;
+copied += len;
+addr += len;
+if (size) {
+offset = 0;
+continue;
+}
+}
+}
+
+return copied;
+}
+
+static size_t v9fs_unpack(void *dst, struct iovec *out_sg, int out_num,
+  size_t offset, size_t size)
+{
+return v9fs_packunpack(dst, out_sg, out_num, offset, size, 0);
+}
+
+size_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
+const void *src, size_t size)
+{
+return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
+}
+
+static int v9fs_copy_sg(struct iovec *src_sg, unsigned int num,
+size_t offset, struct iovec *sg)
+{
+size_t pos = 0;
+int i, j;
+
+j = 0;
+for (i = 0; i < num; i++) {
+if (offset <= pos) {
+sg[j].iov_base = src_sg[i].iov_base;
+sg[j].iov_len = src_sg[i].iov_len;
+j++;
+} else if (offset < (src_sg[i].iov_len + pos)) {
+sg[j].iov_base = src_sg[i].iov_base;
+sg[j].iov_len = src_sg[i].iov_len;
+sg[j].iov_base += (offset - pos);
+sg[j].iov_len -= (offset - pos);
+j++;
+}
+pos += src_sg[i].iov_len;
+}
+
+return j;
+}
+
+size_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
+int convert, const char *fmt, ...)
+{
+size_t old_offset = offset;
+va_list ap;
+int i;
+
+va_start(ap, fmt);
+for (i = 0; fmt[i]; i++) {
+switch (fmt[i]) {
+case 'b': {
+uint8_t *valp = va_arg(ap, uint8_t *);
+offset += v9fs_unpack(valp, out_sg, out_num, offset, sizeof(*valp));
+break;
+}
+case 'w': {
+uint16_t val, *valp;
+valp = va_arg(ap, uint16_t *);
+offset += v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
+

Re: [Qemu-devel] [PATCH V2 00/12] Proxy FS driver for VirtFS

2011-11-15 Thread M. Mohan Kumar


Changes from previous version:

1) Communication between qemu and helper process is similar to 9p way of
packing elements (pdu marshaling).
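
As a rough illustration of what 9p-style packing over "struct iovec" looks like, the sketch below writes a little-endian 32-bit value at a byte offset that may straddle two iovec segments, then reads it back. The sketch_* names are hypothetical and this is not the marshaling code from the series; it only shows the offset-walking idea.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Copy `size` bytes between `buf` and the iovec array, starting at byte
 * `offset` into the logical concatenation of all segments.
 * pack != 0 copies buf -> iovec, pack == 0 copies iovec -> buf.
 * Returns the number of bytes actually copied. */
static size_t sketch_iov_copy(struct iovec *sg, int sg_count, size_t offset,
                              void *buf, size_t size, int pack)
{
    unsigned char *p = buf;
    size_t copied = 0;
    int i;

    assert(buf != NULL);
    for (i = 0; size > 0 && i < sg_count; i++) {
        unsigned char *base = sg[i].iov_base;
        size_t len;

        if (offset >= sg[i].iov_len) {
            offset -= sg[i].iov_len;    /* segment lies before the offset */
            continue;
        }
        len = sg[i].iov_len - offset;
        if (len > size) {
            len = size;
        }
        if (pack) {
            memcpy(base + offset, p, len);
        } else {
            memcpy(p, base + offset, len);
        }
        p += len;
        size -= len;
        copied += len;
        offset = 0;                     /* later segments start at byte 0 */
    }
    return copied;
}

/* Pack a 32-bit value in little-endian byte order. */
static size_t sketch_put_le32(struct iovec *sg, int n, size_t off, uint32_t v)
{
    unsigned char b[4];

    b[0] = (unsigned char)(v & 0xff);
    b[1] = (unsigned char)((v >> 8) & 0xff);
    b[2] = (unsigned char)((v >> 16) & 0xff);
    b[3] = (unsigned char)((v >> 24) & 0xff);
    return sketch_iov_copy(sg, n, off, b, sizeof(b), 1);
}

/* Unpack a little-endian 32-bit value; *v is set only on a full read. */
static size_t sketch_get_le32(struct iovec *sg, int n, size_t off, uint32_t *v)
{
    unsigned char b[4];
    size_t got = sketch_iov_copy(sg, n, off, b, sizeof(b), 0);

    if (got == sizeof(b)) {
        *v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
             ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    }
    return got;
}
```

A short return value signals that the iovec array ran out of room, which a caller would treat as a marshaling error.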

M. Mohan Kumar wrote:

Pass-through security model in QEMU 9p server needs root privilege to do
few file operations (like chown, chmod to any mode/uid:gid).  There are two
issues in pass-through security model

1) TOCTTOU vulnerability: Following symbolic links in the server could
provide access to files beyond 9p export path.

2) Running QEMU with root privilege could be a security issue.

To overcome the above issues, the following approach is used: a new filesystem
type 'proxy' is introduced. Proxy FS uses a chroot + socket combination
to close the vulnerability that arises from following symbolic links.
The intention of adding a new filesystem type is to allow qemu to run
in non-root mode while performing privileged operations over socket I/O.

Proxy helper(a stand alone binary part of qemu) is invoked with
root privileges. Proxy helper chroots into 9p export path and creates
a socket pair or a named socket based on the command line parameter.
Qemu and proxy helper communicate using this socket. QEMU proxy fs
driver sends filesystem request to proxy helper and receives the
response from it.

Proxy helper is designed so that it can drop the root privilege while
retaining the capabilities that are needed for doing filesystem operations
(like CAP_DAC_OVERRIDE, CAP_FOWNER etc)
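
The request/response flow described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the helper from this series: the sketch_* names and the wire format (a raw path in, an int status out) are hypothetical, and the chroot and capability handling are omitted because they need root. Short reads and writes are not retried.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SKETCH_PATH_MAX 256   /* hypothetical request size limit */

/* Helper side: read one path, lstat() it, reply with 0 or -errno. */
static void sketch_helper(int fd)
{
    char path[SKETCH_PATH_MAX];
    struct stat st;
    int reply;
    ssize_t n = read(fd, path, sizeof(path) - 1);

    if (n <= 0) {
        _exit(1);
    }
    path[n] = '\0';
    reply = lstat(path, &st) < 0 ? -errno : 0;
    if (write(fd, &reply, sizeof(reply)) != (ssize_t)sizeof(reply)) {
        _exit(1);
    }
    _exit(0);
}

/* Client side: create a socket pair, fork the helper, send the path
 * and wait for the status. Returns the helper's reply (0 or -errno),
 * or a negative errno value if the transport itself failed. */
static int sketch_proxy_lstat(const char *path)
{
    int sv[2];
    int reply = -EIO;
    size_t len;
    pid_t pid;

    assert(path != NULL);
    len = strlen(path);
    if (len == 0 || len >= SKETCH_PATH_MAX) {
        return -EINVAL;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -errno;
    }
    pid = fork();
    if (pid < 0) {
        int err = errno;
        close(sv[0]);
        close(sv[1]);
        return -err;
    }
    if (pid == 0) {
        close(sv[0]);
        sketch_helper(sv[1]);   /* never returns */
    }
    close(sv[1]);
    if (write(sv[0], path, len) != (ssize_t)len ||
        read(sv[0], &reply, sizeof(reply)) != (ssize_t)sizeof(reply)) {
        reply = -EIO;
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return reply;
}
```

The point of the split is that only the small helper needs elevated privilege; the client never touches the filesystem object itself and only sees the status that comes back over the socket.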

M. Mohan Kumar (12):
   hw/9pfs: Move pdu_marshal/unmarshal code to a seperate file
   hw/9pfs: Add new proxy filesystem driver
   hw/9pfs: File system helper process for qemu 9p proxy FS
   hw/9pfs: Open and create files
   hw/9pfs: Create other filesystem objects
   hw/9pfs: Add stat/readlink/statfs for proxy FS
   hw/9pfs: File ownership and others
   hw/9pfs: xattr interfaces in proxy filesystem driver
   hw/9pfs: Proxy getversion
   hw/9pfs: Documentation changes related to proxy fs
   hw/9pfs: man page for proxy helper
   hw/9pfs: Add support to use named socket for proxy FS

  Makefile   |   15 +-
  Makefile.objs  |4 +-
  configure  |   19 +
  fsdev/file-op-9p.h |3 +-
  fsdev/qemu-fsdev.c |1 +
  fsdev/qemu-fsdev.h |1 +
  fsdev/virtfs-proxy-helper.c|  947 +
  fsdev/virtfs-proxy-helper.texi |   63 +++
  fsdev/virtio-9p-marshal.c  |  338 
  fsdev/virtio-9p-marshal.h  |   87 +++
  hw/9pfs/virtio-9p-proxy.c  | 1123 
  hw/9pfs/virtio-9p-proxy.h  |   80 +++
  hw/9pfs/virtio-9p.c|  297 +---
  hw/9pfs/virtio-9p.h|   85 +---
  qemu-config.c  |   13 +
  qemu-options.hx|   32 +-
  vl.c   |   10 +-
  17 files changed, 2736 insertions(+), 382 deletions(-)
  create mode 100644 fsdev/virtfs-proxy-helper.c
  create mode 100644 fsdev/virtfs-proxy-helper.texi
  create mode 100644 fsdev/virtio-9p-marshal.c
  create mode 100644 fsdev/virtio-9p-marshal.h
  create mode 100644 hw/9pfs/virtio-9p-proxy.c
  create mode 100644 hw/9pfs/virtio-9p-proxy.h

Re: [Qemu-devel] [PATCH v2] qcow2: Unlock during COW

2011-11-15 Thread Stefan Hajnoczi

On Mon, Nov 14, 2011 at 06:55:18PM +0100, Kevin Wolf wrote:
> Unlocking during COW allows for more parallelism. One change it requires is
> that buffers are dynamically allocated instead of just using a per-image
> buffer.
> 
> While touching the code, drop the synchronous qcow2_read() function and replace
> it by a bdrv_read() call.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/qcow2-cluster.c |  104 
>  1 files changed, 35 insertions(+), 69 deletions(-)

This should be safe because dependent requests are queued so we can
perform copy_sectors() in parallel with the non-dependent requests.

> -static int qcow2_read(BlockDriverState *bs, int64_t sector_num,
> -  uint8_t *buf, int nb_sectors)
> -{
> -BDRVQcowState *s = bs->opaque;
> -int ret, index_in_cluster, n, n1;
> -uint64_t cluster_offset;
> -struct iovec iov;
> -QEMUIOVector qiov;
> -
> -while (nb_sectors > 0) {
> -n = nb_sectors;
> -
> -ret = qcow2_get_cluster_offset(bs, sector_num << 9, &n,
> -&cluster_offset);
> -if (ret < 0) {
> -return ret;
> -}
> -
> -index_in_cluster = sector_num & (s->cluster_sectors - 1);
> -if (!cluster_offset) {
> -if (bs->backing_hd) {
> -/* read from the base image */
> -iov.iov_base = buf;
> -iov.iov_len = n * 512;
> -qemu_iovec_init_external(&qiov, &iov, 1);
> -
> -n1 = qcow2_backing_read1(bs->backing_hd, &qiov, sector_num, n);
> -if (n1 > 0) {
> -BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING);
> -ret = bdrv_read(bs->backing_hd, sector_num, buf, n1);
> -if (ret < 0)
> -return -1;
> -}
> -} else {
> -memset(buf, 0, 512 * n);
> -}
> -} else if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
> -if (qcow2_decompress_cluster(bs, cluster_offset) < 0)
> -return -1;
> -memcpy(buf, s->cluster_cache + index_in_cluster * 512, 512 * n);
> -} else {
> -BLKDBG_EVENT(bs->file, BLKDBG_READ);

BLKDBG_READ is now unused, there is no other occurrence in QEMU.  Not
sure if you want to remove it from blkdebug now?

> -ret = bdrv_pread(bs->file, cluster_offset + index_in_cluster * 512, buf, n * 512);
> -if (ret != n * 512)
> -return -1;
> -if (s->crypt_method) {
> -qcow2_encrypt_sectors(s, sector_num, buf, buf, n, 0,
> -&s->aes_decrypt_key);
> -}
> -}
> -nb_sectors -= n;
> -sector_num += n;
> -buf += n * 512;
> -}
> -return 0;
> -}
> -
>  static int copy_sectors(BlockDriverState *bs, uint64_t start_sect,
>  uint64_t cluster_offset, int n_start, int n_end)
>  {
>  BDRVQcowState *s = bs->opaque;
>  int n, ret;
> +void *buf;
> +
> +/*
> + * If this is the last cluster and it is only partially used, we must only
> + * copy until the end of the image, or bdrv_check_request will fail for the
> + * bdrv_read/write calls below.
> + */
> +if (start_sect + n_end > bs->total_sectors) {
> +n_end = bs->total_sectors - start_sect;
> +}
>  
>  n = n_end - n_start;
> -if (n <= 0)
> +if (n <= 0) {
>  return 0;
> +}
> +
> +buf = qemu_blockalign(bs, n * BDRV_SECTOR_SIZE);
> +
>  BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
> -ret = qcow2_read(bs, start_sect + n_start, s->cluster_data, n);
> -if (ret < 0)
> -return ret;
> +ret = bdrv_read(bs, start_sect + n_start, buf, n);
> +if (ret < 0) {
> +goto out;
> +}
> +
>  if (s->crypt_method) {
>  qcow2_encrypt_sectors(s, start_sect + n_start,
> -s->cluster_data,
> -s->cluster_data, n, 1,
> +buf, buf, n, 1,
>  &s->aes_encrypt_key);
>  }
> +
>  BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
> -ret = bdrv_write(bs->file, (cluster_offset >> 9) + n_start,
> -s->cluster_data, n);
> -if (ret < 0)
> -return ret;
> -return 0;
> +ret = bdrv_write(bs->file, (cluster_offset >> 9) + n_start, buf, n);
> +if (ret < 0) {
> +goto out;
> +}
> +
> +ret = 0;

bdrv_write() returns 0 on success or -errno on failure.  We don't need
to check for ret < 0 or set ret = 0 here.  Just continuing execution
should be fine unless I missed something.
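
Roughly what I have in mind, as a standalone sketch rather than the real function. The sketch_* names are made up, the I/O callbacks stand in for bdrv_read()/bdrv_write(), and I am assuming the out: label only frees the buffer and returns ret:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in for bdrv_read()/bdrv_write(): 0 on success, -errno on failure. */
typedef int (*sketch_io_fn)(void *buf, int n);

static int sketch_io_ok(void *buf, int n)
{
    (void)buf;
    (void)n;
    return 0;
}

static int sketch_io_fail(void *buf, int n)
{
    (void)buf;
    (void)n;
    return -EIO;
}

static int sketch_cow_copy(sketch_io_fn rd, sketch_io_fn wr, int n)
{
    void *buf;
    int ret;

    assert(rd != NULL && wr != NULL);
    if (n <= 0) {
        return 0;
    }
    buf = malloc((size_t)n * 512);
    if (!buf) {
        return -ENOMEM;
    }

    ret = rd(buf, n);
    if (ret < 0) {
        goto out;
    }

    /* The write result is already 0 or -errno, so fall through with it. */
    ret = wr(buf, n);
out:
    free(buf);
    return ret;
}
```

With this shape the success path and both failure paths share the same cleanup, and no extra ret = 0 is needed.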

Stefan

[Qemu-devel] [PATCH V2 11/12] hw/9pfs: man page for proxy helper

2011-11-15 Thread M. Mohan Kumar

Signed-off-by: M. Mohan Kumar 
---
 Makefile   |   12 +++-
 fsdev/virtfs-proxy-helper.texi |   59 
 2 files changed, 70 insertions(+), 1 deletions(-)
 create mode 100644 fsdev/virtfs-proxy-helper.texi

diff --git a/Makefile b/Makefile
index 378ee4d..29ae332 100644
--- a/Makefile
+++ b/Makefile
@@ -37,6 +37,7 @@ LIBS+=-lz $(LIBS_TOOLS)
 
 ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 QMP/qmp-commands.txt
+DOCS+=fsdev/virtfs-proxy-helper.1
 else
 DOCS=
 endif
@@ -280,7 +281,10 @@ ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) qemu-nbd.8 "$(DESTDIR)$(mandir)/man8"
 endif
-
+ifdef CONFIG_VIRTFS
+   $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
+   $(INSTALL_DATA) fsdev/virtfs-proxy-helper.1 "$(DESTDIR)$(mandir)/man1"
+endif
 install-sysconfig:
$(INSTALL_DIR) "$(DESTDIR)$(sysconfdir)/qemu"
$(INSTALL_DATA) $(SRC_PATH)/sysconfigs/target/target-x86_64.conf 
"$(DESTDIR)$(sysconfdir)/qemu"
@@ -358,6 +362,12 @@ qemu-img.1: qemu-img.texi qemu-img-cmds.texi
  pod2man --section=1 --center=" " --release=" " qemu-img.pod > $@, \
  "  GEN   $@")
 
+fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
+   $(call quiet-command, \
+ perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< fsdev/virtfs-proxy-helper.pod && \
+ pod2man --section=1 --center=" " --release=" " fsdev/virtfs-proxy-helper.pod > $@, \
+ "  GEN   $@")
+
 qemu-nbd.8: qemu-nbd.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
diff --git a/fsdev/virtfs-proxy-helper.texi b/fsdev/virtfs-proxy-helper.texi
new file mode 100644
index 000..3816382
--- /dev/null
+++ b/fsdev/virtfs-proxy-helper.texi
@@ -0,0 +1,59 @@
+@example
+@c man begin SYNOPSIS
+usage: virtfs-proxy-helper options
+@c man end
+@end example
+
+@c man begin DESCRIPTION
+@table @description
+Pass-through security model in QEMU 9p server needs root privilege to do
+few file operations (like chown, chmod to any mode/uid:gid).  There are two
+issues in pass-through security model
+
+1) TOCTTOU vulnerability: Following symbolic links in the server could
+provide access to files beyond 9p export path.
+
+2) Running QEMU with root privilege could be a security issue.
+
+To overcome the above issues, the following approach is used: a new filesystem
+type 'proxy' is introduced. Proxy FS uses a chroot + socket combination
+to close the vulnerability that arises from following symbolic links.
+The intention of adding a new filesystem type is to allow qemu to run
+in non-root mode while performing privileged operations over socket I/O.
+
+Proxy helper(a stand alone binary part of qemu) is invoked with
+root privileges. Proxy helper chroots into 9p export path and creates
+a socket pair or a named socket based on the command line parameter.
+Qemu and proxy helper communicate using this socket. QEMU proxy fs
+driver sends filesystem request to proxy helper and receives the
+response from it.
+
+Proxy helper is designed so that it can drop the root privilege while
+retaining only the capabilities needed for doing filesystem operations.
+
+@end table
+@c man end
+
+@c man begin OPTIONS
+The following options are supported:
+@table @option
+@item -h
+@findex -h
+Display help and exit
+@item -p|--path path
+Path to export for proxy filesystem driver
+@item -f|--fd socket-id
+Use the given file descriptor as the socket descriptor for communicating with
+the qemu proxy fs driver. Usually a helper like libvirt will create a
+socketpair and pass one of the fds as the parameter to -f|--fd
+@item -n|--nodaemon
+Run as a normal program. By default program will run in daemon mode
+@end table
+@c man end
+
+@setfilename virtfs-proxy-helper
+@settitle QEMU 9p virtfs proxy filesystem helper
+
+@c man begin AUTHOR
+M. Mohan Kumar
+@c man end
-- 
1.7.6

[Qemu-devel] [PATCH 00/12] Proxy FS driver for VirtFS

2011-11-15 Thread M. Mohan Kumar

Pass-through security model in QEMU 9p server needs root privilege to do
few file operations (like chown, chmod to any mode/uid:gid).  There are two
issues in pass-through security model

1) TOCTTOU vulnerability: Following symbolic links in the server could
provide access to files beyond 9p export path.

2) Running QEMU with root privilege could be a security issue.

To overcome the above issues, the following approach is used: a new filesystem
type 'proxy' is introduced. Proxy FS uses a chroot + socket combination
to close the vulnerability that arises from following symbolic links.
The intention of adding a new filesystem type is to allow qemu to run
in non-root mode while performing privileged operations over socket I/O.

Proxy helper(a stand alone binary part of qemu) is invoked with
root privileges. Proxy helper chroots into 9p export path and creates
a socket pair or a named socket based on the command line parameter.
Qemu and proxy helper communicate using this socket. QEMU proxy fs
driver sends filesystem request to proxy helper and receives the
response from it.

Proxy helper is designed so that it can drop root privileges while
retaining the capabilities that are needed for doing filesystem operations
(like CAP_DAC_OVERRIDE, CAP_FOWNER etc).
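
The request/response exchange described above can be pictured as a small
framed protocol. A minimal sketch, assuming a header of two 32-bit fields
(type, then payload size), which mirrors the "dd" header format used later
in this series. The names hdr_pack, hdr_unpack and SKETCH_HDR_SZ are
illustrative only, and the little-endian byte order is an assumption of this
sketch; the series may well use host byte order since both ends run on the
same machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header size: two 32-bit fields (type, payload size). */
#define SKETCH_HDR_SZ 8

/* Store a 32-bit value in little-endian byte order (sketch assumption). */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write the header at the start of buf; returns the number of bytes used. */
size_t hdr_pack(uint8_t *buf, size_t len, uint32_t type, uint32_t size)
{
    assert(buf != NULL && len >= SKETCH_HDR_SZ);
    put_le32(buf, type);
    put_le32(buf + 4, size);
    return SKETCH_HDR_SZ;
}

/* Read the header back from the start of buf. */
void hdr_unpack(const uint8_t *buf, size_t len, uint32_t *type, uint32_t *size)
{
    assert(buf != NULL && len >= SKETCH_HDR_SZ);
    *type = get_le32(buf);
    *size = get_le32(buf + 4);
}
```

The receiver reads the fixed-size header first, learns the payload size from
it, and only then reads the payload, which is the same order read_request()
follows in the helper.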

M. Mohan Kumar (12):
  hw/9pfs: Move pdu_marshal/unmarshal code to a seperate file
  hw/9pfs: Add new proxy filesystem driver
  hw/9pfs: File system helper process for qemu 9p proxy FS
  hw/9pfs: Open and create files
  hw/9pfs: Create other filesystem objects
  hw/9pfs: Add stat/readlink/statfs for proxy FS
  hw/9pfs: File ownership and others
  hw/9pfs: xattr interfaces in proxy filesystem driver
  hw/9pfs: Proxy getversion
  hw/9pfs: Documentation changes related to proxy fs
  hw/9pfs: man page for proxy helper
  hw/9pfs: Add support to use named socket for proxy FS

 Makefile   |   15 +-
 Makefile.objs  |4 +-
 configure  |   19 +
 fsdev/file-op-9p.h |3 +-
 fsdev/qemu-fsdev.c |1 +
 fsdev/qemu-fsdev.h |1 +
 fsdev/virtfs-proxy-helper.c|  947 +
 fsdev/virtfs-proxy-helper.texi |   63 +++
 fsdev/virtio-9p-marshal.c  |  338 
 fsdev/virtio-9p-marshal.h  |   87 +++
 hw/9pfs/virtio-9p-proxy.c  | 1123 
 hw/9pfs/virtio-9p-proxy.h  |   80 +++
 hw/9pfs/virtio-9p.c|  297 +---
 hw/9pfs/virtio-9p.h|   85 +---
 qemu-config.c  |   13 +
 qemu-options.hx|   32 +-
 vl.c   |   10 +-
 17 files changed, 2736 insertions(+), 382 deletions(-)
 create mode 100644 fsdev/virtfs-proxy-helper.c
 create mode 100644 fsdev/virtfs-proxy-helper.texi
 create mode 100644 fsdev/virtio-9p-marshal.c
 create mode 100644 fsdev/virtio-9p-marshal.h
 create mode 100644 hw/9pfs/virtio-9p-proxy.c
 create mode 100644 hw/9pfs/virtio-9p-proxy.h

-- 
1.7.6

Re: [Qemu-devel] [PATCH] virtio-blk: fix cross-endian config space

2011-11-15 Thread Stefan Hajnoczi

On Tue, Nov 15, 2011 at 12:07:53PM +0100, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/virtio-blk.c |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)

Works with existing x86 guests and does the right thing for big-endian
guests.

Reviewed-by: Stefan Hajnoczi

[Qemu-devel] [PATCH V2 08/12] hw/9pfs: xattr interfaces in proxy filesystem driver

2011-11-15 Thread M. Mohan Kumar

Add xattr support for proxy FS

Signed-off-by: M. Mohan Kumar 
---
 fsdev/virtfs-proxy-helper.c |   78 -
 hw/9pfs/virtio-9p-proxy.c   |  119 +++
 hw/9pfs/virtio-9p-proxy.h   |4 ++
 3 files changed, 190 insertions(+), 11 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index ded0ead..ccd7ed8 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "qemu-common.h"
 #include "virtio-9p-marshal.h"
 #include "hw/9pfs/virtio-9p-proxy.h"
@@ -283,6 +284,50 @@ static int send_response(int sock, struct iovec *iovec, int size)
 return 0;
 }
 
+static int do_getxattr(int type, struct iovec *iovec, struct iovec *out_iovec)
+{
+int size = 0, offset, retval;
+V9fsString path, name, xattr;
+
+v9fs_string_init(&xattr);
+
+offset = HDR_SZ;
+offset += proxy_unmarshal(iovec, 1, HDR_SZ, "ds", &size, &path);
+if (size) {
+xattr.data = g_malloc(size);
+xattr.size = size;
+}
+switch (type) {
+case T_LGETXATTR:
+proxy_unmarshal(iovec, 1, offset, "s", &name);
+retval = lgetxattr(path.data, name.data, xattr.data, size);
+if (retval < 0) {
+retval = -errno;
+v9fs_string_free(&name);
+goto error;
+}
+   v9fs_string_free(&name);
+break;
+case T_LLISTXATTR:
+retval = llistxattr(path.data, xattr.data, size);
+if (retval < 0) {
+retval = -errno;
+goto error;
+}
+break;
+}
+
+if (!size) {
+proxy_marshal(out_iovec, 1, HDR_SZ, "d", retval);
+retval = sizeof(retval);
+} else {
+retval = proxy_marshal(out_iovec, 1, HDR_SZ, "s", &xattr);
+}
+error:
+v9fs_string_free(&path);
+return retval;
+}
+
 static void stat_to_prstat(ProxyStat *pr_stat, struct stat *stat)
 {
 memset(pr_stat, 0, sizeof(*pr_stat));
@@ -499,9 +544,10 @@ static int process_requests(int sock)
 int mode, uid, gid;
 struct timespec spec[2];
 int type, retval = 0;
+V9fsString name, value;
 V9fsString oldpath, path;
 struct iovec in_iovec, out_iovec;
-int size = 0;
+int size = 0, flags;
 
 in_iovec.iov_base = g_malloc(BUFF_SZ);
 in_iovec.iov_len = BUFF_SZ;
@@ -595,6 +641,32 @@ static int process_requests(int sock)
 }
 v9fs_string_free(&path);
 break;
+case T_LGETXATTR:
+case T_LLISTXATTR:
+size = do_getxattr(type, &in_iovec, &out_iovec);
+break;
+case T_LSETXATTR:
+proxy_unmarshal(&in_iovec, 1, HDR_SZ,
+   "sssdd", &path, &name, &value, &size,
+   &flags);
+retval = lsetxattr(path.data, name.data, value.data, size, flags);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+v9fs_string_free(&name);
+v9fs_string_free(&value);
+break;
+case T_LREMOVEXATTR:
+proxy_unmarshal(&in_iovec, 1,
+   HDR_SZ, "ss", &path, &name);
+retval = lremovexattr(path.data, name.data);
+if (retval < 0) {
+retval = -errno;
+}
+v9fs_string_free(&path);
+v9fs_string_free(&name);
+break;
 default:
 goto error;
 break;
@@ -616,11 +688,15 @@ static int process_requests(int sock)
 case T_UTIME:
 case T_RENAME:
 case T_REMOVE:
+case T_LSETXATTR:
+case T_LREMOVEXATTR:
 send_status(sock, &out_iovec, retval);
 break;
 case T_LSTAT:
 case T_STATFS:
 case T_READLINK:
+case T_LGETXATTR:
+case T_LLISTXATTR:
 if (send_response(sock, &out_iovec, size) < 0) {
 goto error;
 }
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
index aefdc61..f672ac3 100644
--- a/hw/9pfs/virtio-9p-proxy.c
+++ b/hw/9pfs/virtio-9p-proxy.c
@@ -136,7 +136,7 @@ static void prstat_to_stat(struct stat *stbuf, ProxyStat *prstat)
  * size of errno/response is given by header.size
  */
 static int v9fs_receive_response(V9fsProxy *proxy, int type,
-int *sock_error, void *response)
+int *sock_error, int size, void *response)
 {
 int retval, error;
 ProxyHeader header;
@@ -196,6 +196,19 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
 v9fs_string_free(&target);
 break;
 }
+case T_LGETXATTR:
+case T_LLISTXATTR: {
+V9fsString xattr;
+if (!size) {
+proxy_unmarshal(reply, 1, HDR_SZ, "d", &size);
+return size;
+} else {
+proxy_unmarshal(reply, 1, HDR_SZ, "s", &xattr);
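
The size handling in do_getxattr() above follows the usual xattr convention:
a call with size 0 returns the length the value needs, and a second call with
a buffer of that length fetches it. A minimal standalone sketch of that
two-step pattern on Linux; read_xattr is a hypothetical helper and is not
part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Read extended attribute `name` of `path` without following symlinks.
 * On success returns the value length and stores a malloc()ed buffer in
 * *out (NULL for an empty value); on failure returns -errno.
 */
ssize_t read_xattr(const char *path, const char *name, char **out)
{
    ssize_t len;
    char *buf;

    assert(path != NULL && name != NULL && out != NULL);
    *out = NULL;

    /* Step 1: size 0 asks only for the required length. */
    len = lgetxattr(path, name, NULL, 0);
    if (len < 0) {
        return -errno;
    }
    if (len == 0) {
        return 0;
    }

    /* Step 2: fetch the value into a buffer of that length. */
    buf = malloc((size_t)len);
    if (buf == NULL) {
        return -ENOMEM;
    }
    len = lgetxattr(path, name, buf, (size_t)len);
    if (len < 0) {
        int err = errno;   /* free() may clobber errno */
        free(buf);
        return -err;
    }
    *out = buf;
    return len;
}
```

The attribute can change between the two calls, so a caller that needs to be
robust should be prepared for the second call to fail with ERANGE and retry.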

Re: [Qemu-devel] [PATCH 00/14] Convert Sun devices to memory API.

2011-11-15 Thread Avi Kivity

On 11/15/2011 01:13 PM, Benoît Canet wrote:
> .valid was used where the access size is specified like in
> http://www.ibiblio.org/pub/historic-linux/early-ports/Sparc/NCR/NCR89C105.txt
> .impl was used when the behavior is not known.

Thanks, all applied except:

>   sun4c_intctl: convert to memory API
>   sun4m_iommu: convert to memory API

Where we had raced - I just wrote those two conversions as well.  As it
happens, these were the only two patches that used .impl, which is a
behaviour change; please avoid behaviour changes and do them as separate
patches.

Note I don't think .impl works well when .min_access_size = 4 - it
requires RMW which we don't do yet.
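
To illustrate the RMW point: if a device only implements 4-byte accesses, a
narrower guest write has to become a read of the containing word, a merge,
and a write back. A minimal sketch of the merge step, assuming little-endian
byte lanes within the word; rmw_merge is a hypothetical helper, not part of
the memory API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Merge a narrow write into a 32-bit word that was read from the device.
 *   old    - current 32-bit register contents (the "read" step)
 *   offset - byte offset of the access within the word (0..3)
 *   size   - access size in bytes (1, 2 or 4)
 *   val    - value the guest wrote
 * Returns the word to write back (the "write" step).
 * Byte lanes are assumed little-endian: offset 0 is bits 7..0.
 */
uint32_t rmw_merge(uint32_t old, unsigned offset, unsigned size, uint32_t val)
{
    uint32_t mask;

    assert(size == 1 || size == 2 || size == 4);
    assert(offset + size <= 4);

    if (size == 4) {
        return val;
    }
    mask = ((1u << (size * 8)) - 1) << (offset * 8);
    return (old & ~mask) | ((val << (offset * 8)) & mask);
}
```

The read step can itself have side effects on some devices (clear-on-read
registers, for example), which is one reason this kind of emulation needs
care.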

>   esp: Fix memory API conversion
>

Thanks for that too.  Will fold it into the bad patch.

-- 
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH V2 09/12] hw/9pfs: Proxy getversion

2011-11-15 Thread M. Mohan Kumar

Add proxy getversion to get generation number

Signed-off-by: M. Mohan Kumar 
---
 fsdev/virtfs-proxy-helper.c |   74 +++
 hw/9pfs/virtio-9p-proxy.c   |   31 ++
 hw/9pfs/virtio-9p-proxy.h   |1 +
 3 files changed, 106 insertions(+), 0 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index ccd7ed8..917bb26 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -30,6 +30,11 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#ifdef CONFIG_LINUX_MAGIC_H
+#include 
+#endif
 #include "qemu-common.h"
 #include "virtio-9p-marshal.h"
 #include "hw/9pfs/virtio-9p-proxy.h"
@@ -38,6 +43,19 @@
 
 #define PROGNAME "virtfs-proxy-helper"
 
+#ifndef XFS_SUPER_MAGIC
+#define XFS_SUPER_MAGIC  0x58465342
+#endif
+#ifndef EXT2_SUPER_MAGIC
+#define EXT2_SUPER_MAGIC 0xEF53
+#endif
+#ifndef REISERFS_SUPER_MAGIC
+#define REISERFS_SUPER_MAGIC 0x52654973
+#endif
+#ifndef BTRFS_SUPER_MAGIC
+#define BTRFS_SUPER_MAGIC 0x9123683E
+#endif
+
 static struct option helper_opts[] = {
 {"fd", required_argument, NULL, 'f'},
 {"path", required_argument, NULL, 'p'},
@@ -45,6 +63,7 @@ static struct option helper_opts[] = {
 };
 
 int is_daemon;
+int get_version; /* IOC getversion IOCTL supported */
 
 static void do_perror(const char *string)
 {
@@ -284,6 +303,42 @@ static int send_response(int sock, struct iovec *iovec, int size)
 return 0;
 }
 
+/*
+ * gets generation number
+ * returns -errno on failure and sizeof(generation number) on success
+ */
+static int do_getversion(struct iovec *iovec, struct iovec *out_iovec)
+{
+int fd, retval;
+uint64_t version;
+V9fsString path;
+
+retval = sizeof(version);
+/* no need to issue ioctl */
+if (!get_version) {
+version = 0;
+proxy_marshal(out_iovec, 1, HDR_SZ, "q", version);
+return retval;
+}
+
+proxy_unmarshal(iovec, 1, HDR_SZ, "s", &path);
+
+fd = open(path.data, O_RDONLY);
+if (fd < 0) {
+retval = -errno;
+goto done;
+}
+if (ioctl(fd, FS_IOC_GETVERSION, &version) < 0) {
+retval = -errno;
+} else {
+proxy_marshal(out_iovec, 1, HDR_SZ, "q", version);
+}
+close(fd);
+done:
+v9fs_string_free(&path);
+return retval;
+}
+
 static int do_getxattr(int type, struct iovec *iovec, struct iovec *out_iovec)
 {
 int size = 0, offset, retval;
@@ -667,6 +722,9 @@ static int process_requests(int sock)
 v9fs_string_free(&path);
 v9fs_string_free(&name);
 break;
+case T_GETVERSION:
+size = do_getversion(&in_iovec, &out_iovec);
+break;
 default:
 goto error;
 break;
@@ -697,6 +755,7 @@ static int process_requests(int sock)
 case T_READLINK:
 case T_LGETXATTR:
 case T_LLISTXATTR:
+case T_GETVERSION:
 if (send_response(sock, &out_iovec, size) < 0) {
 goto error;
 }
@@ -718,6 +777,8 @@ int main(int argc, char **argv)
 char rpath[PATH_MAX];
 struct stat stbuf;
 int c, option_index;
+int retval;
+struct statfs st_fs;
 
 is_daemon = 1;
 rpath[0] = '\0';
@@ -775,6 +836,19 @@ int main(int argc, char **argv)
 
 do_log(LOG_INFO, "Started");
 
+/* check whether underlying FS supports IOC_GETVERSION */
+retval = statfs(rpath, &st_fs);
+if (!retval) {
+switch (st_fs.f_type) {
+case EXT2_SUPER_MAGIC:
+case BTRFS_SUPER_MAGIC:
+case REISERFS_SUPER_MAGIC:
+case XFS_SUPER_MAGIC:
+get_version = 1;
+break;
+}
+}
+
 if (chroot(rpath) < 0) {
 do_perror("chroot");
 goto error;
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/virtio-9p-proxy.c
index f672ac3..bc63835 100644
--- a/hw/9pfs/virtio-9p-proxy.c
+++ b/hw/9pfs/virtio-9p-proxy.c
@@ -209,6 +209,9 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
 }
 break;
 }
+case T_GETVERSION:
+proxy_unmarshal(reply, 1, HDR_SZ, "q", response);
+break;
 default:
 *sock_error = 1;
 return -1;
@@ -450,6 +453,13 @@ static int v9fs_request(V9fsProxy *proxy, int type,
 proxy_marshal(iovec, 1, 0, "dd", header.type, header.size);
 header.size += HDR_SZ;
 break;
+case T_GETVERSION:
+path = va_arg(ap, V9fsString *);
+header.size = proxy_marshal(iovec, 1, HDR_SZ, "s", path);
+header.type = T_GETVERSION;
+proxy_marshal(iovec, 1, 0, "dd", header.type, header.size);
+header.size += HDR_SZ;
+break;
 default:
 error_report("Invalid type %d\n", type);
 va_end(ap);
@@ -497,6 +507,7 @@ static int v9fs_request(V9fsProxy *proxy, int type,
 case T_STATFS:
 case T_LGETXATTR:
 case T_LLISTXATTR:
+case T_GETVERSION:
 retval = v9fs_receive_response(proxy, type, &sock
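
The probe added to main() in the helper above boils down to a lookup on the
filesystem magic number reported by statfs(). A minimal sketch of just that
decision, using the magic values defined in the patch; fs_has_getversion and
the SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>

/* Magic numbers with the same values as the fallback defines in the patch. */
#define SKETCH_XFS_MAGIC      0x58465342
#define SKETCH_EXT2_MAGIC     0xEF53
#define SKETCH_REISERFS_MAGIC 0x52654973
#define SKETCH_BTRFS_MAGIC    0x9123683E

/* Return 1 if the filesystem type is one the helper treats as supporting
 * FS_IOC_GETVERSION, 0 otherwise. */
int fs_has_getversion(const struct statfs *st)
{
    assert(st != NULL);

    /* Compare as unsigned so the btrfs magic also matches where long is
     * 32 bits wide. */
    switch ((unsigned long)st->f_type) {
    case SKETCH_EXT2_MAGIC:
    case SKETCH_BTRFS_MAGIC:
    case SKETCH_REISERFS_MAGIC:
    case SKETCH_XFS_MAGIC:
        return 1;
    default:
        return 0;
    }
}
```

ext3 and ext4 report the same magic number as ext2, so the single
EXT2_SUPER_MAGIC case covers all three.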

[Qemu-devel] [PATCH 10/14] slavio_intctl: convert slaves interrupt controllers to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_intctl.c |   36 ++--
 1 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/hw/slavio_intctl.c b/hw/slavio_intctl.c
index 0bc2a0b..e7812ed 100644
--- a/hw/slavio_intctl.c
+++ b/hw/slavio_intctl.c
@@ -46,6 +46,7 @@
 struct SLAVIO_INTCTLState;
 
 typedef struct SLAVIO_CPUINTCTLState {
+MemoryRegion iomem;
 struct SLAVIO_INTCTLState *master;
 uint32_t intreg_pending;
 uint32_t cpu;
@@ -77,7 +78,8 @@ typedef struct SLAVIO_INTCTLState {
 static void slavio_check_interrupts(SLAVIO_INTCTLState *s, int set_irqs);
 
 // per-cpu interrupt controller
-static uint32_t slavio_intctl_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_intctl_mem_readl(void *opaque, target_phys_addr_t addr,
+unsigned size)
 {
 SLAVIO_CPUINTCTLState *s = opaque;
 uint32_t saddr, ret;
@@ -97,7 +99,7 @@ static uint32_t slavio_intctl_mem_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static void slavio_intctl_mem_writel(void *opaque, target_phys_addr_t addr,
- uint32_t val)
+ uint64_t val, unsigned size)
 {
 SLAVIO_CPUINTCTLState *s = opaque;
 uint32_t saddr;
@@ -122,16 +124,14 @@ static void slavio_intctl_mem_writel(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const slavio_intctl_mem_read[3] = {
-NULL,
-NULL,
-slavio_intctl_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const slavio_intctl_mem_write[3] = {
-NULL,
-NULL,
-slavio_intctl_mem_writel,
+static const MemoryRegionOps slavio_intctl_mem_ops = {
+.read = slavio_intctl_mem_readl,
+.write = slavio_intctl_mem_writel,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 };
 
 // master system interrupt controller
@@ -422,8 +422,8 @@ static void slavio_intctl_reset(DeviceState *d)
 static int slavio_intctl_init1(SysBusDevice *dev)
 {
 SLAVIO_INTCTLState *s = FROM_SYSBUS(SLAVIO_INTCTLState, dev);
-int io_memory;
 unsigned int i, j;
+char slave_name[45];
 
 qdev_init_gpio_in(&dev->qdev, slavio_set_irq_all, 32 + MAX_CPUS);
 memory_region_init_io(&s->iomem, &slavio_intctlm_mem_ops, s,
@@ -431,14 +431,14 @@ static int slavio_intctl_init1(SysBusDevice *dev)
 sysbus_init_mmio_region(dev, &s->iomem);
 
 for (i = 0; i < MAX_CPUS; i++) {
+snprintf(slave_name, sizeof(slave_name),
+ "slave-interrupt-controller-%i", i);
 for (j = 0; j < MAX_PILS; j++) {
 sysbus_init_irq(dev, &s->cpu_irqs[i][j]);
 }
-io_memory = cpu_register_io_memory(slavio_intctl_mem_read,
-   slavio_intctl_mem_write,
-   &s->slaves[i],
-   DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, INTCTL_SIZE, io_memory);
+memory_region_init_io(&s->slaves[i].iomem, &slavio_intctl_mem_ops,
+  &s->slaves[i], slave_name, INTCTL_SIZE);
+sysbus_init_mmio_region(dev, &s->slaves[i].iomem);
 s->slaves[i].cpu = i;
 s->slaves[i].master = s;
 }
-- 
1.7.5.4
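
The .valid block in the ops table above declares which access sizes the guest
may use on the region. A simplified, standalone model of that check, assuming
an access is accepted only when its size lies between the two bounds. The
SketchOps type and sketch_access_ok are hypothetical and only mimic the shape
of MemoryRegionOps; they are not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the .valid part of an ops table. */
typedef struct SketchOps {
    struct {
        unsigned min_access_size;
        unsigned max_access_size;
    } valid;
} SketchOps;

/* Return true if an access of `size` bytes falls inside the declared range. */
bool sketch_access_ok(const SketchOps *ops, unsigned size)
{
    assert(ops != NULL);
    return size >= ops->valid.min_access_size &&
           size <= ops->valid.max_access_size;
}
```

With min and max both set to 4, as in the slave interrupt controller table,
only 32-bit accesses pass the check, which matches the old tables where only
the third (long) slot was populated.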

[Qemu-devel] [PATCH 09/14] slavio_intctl: convert master interrupt controller to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_intctl.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_intctl.c b/hw/slavio_intctl.c
index 329c251..0bc2a0b 100644
--- a/hw/slavio_intctl.c
+++ b/hw/slavio_intctl.c
@@ -54,6 +54,7 @@ typedef struct SLAVIO_CPUINTCTLState {
 
 typedef struct SLAVIO_INTCTLState {
 SysBusDevice busdev;
+MemoryRegion iomem;
 #ifdef DEBUG_IRQ_COUNT
 uint64_t irq_count[32];
 #endif
@@ -134,7 +135,8 @@ static CPUWriteMemoryFunc * const slavio_intctl_mem_write[3] = {
 };
 
 // master system interrupt controller
-static uint32_t slavio_intctlm_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_intctlm_mem_readl(void *opaque, target_phys_addr_t addr,
+ unsigned size)
 {
 SLAVIO_INTCTLState *s = opaque;
 uint32_t saddr, ret;
@@ -160,7 +162,7 @@ static uint32_t slavio_intctlm_mem_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static void slavio_intctlm_mem_writel(void *opaque, target_phys_addr_t addr,
-  uint32_t val)
+  uint64_t val, unsigned size)
 {
 SLAVIO_INTCTLState *s = opaque;
 uint32_t saddr;
@@ -192,16 +194,14 @@ static void slavio_intctlm_mem_writel(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const slavio_intctlm_mem_read[3] = {
-NULL,
-NULL,
-slavio_intctlm_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const slavio_intctlm_mem_write[3] = {
-NULL,
-NULL,
-slavio_intctlm_mem_writel,
+static const MemoryRegionOps slavio_intctlm_mem_ops = {
+.read = slavio_intctlm_mem_readl,
+.write = slavio_intctlm_mem_writel,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 };
 
 void slavio_pic_info(Monitor *mon, DeviceState *dev)
@@ -426,10 +426,9 @@ static int slavio_intctl_init1(SysBusDevice *dev)
 unsigned int i, j;
 
 qdev_init_gpio_in(&dev->qdev, slavio_set_irq_all, 32 + MAX_CPUS);
-io_memory = cpu_register_io_memory(slavio_intctlm_mem_read,
-   slavio_intctlm_mem_write, s,
-   DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, INTCTLM_SIZE, io_memory);
+memory_region_init_io(&s->iomem, &slavio_intctlm_mem_ops, s,
+  "master-interrupt-controller", INTCTLM_SIZE);
+sysbus_init_mmio_region(dev, &s->iomem);
 
 for (i = 0; i < MAX_CPUS; i++) {
 for (j = 0; j < MAX_PILS; j++) {
-- 
1.7.5.4

[Qemu-devel] [PATCH 04/14] slavio_misc: convert modem to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 60a115d..9110c64 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -38,6 +38,7 @@ typedef struct MiscState {
 SysBusDevice busdev;
 MemoryRegion cfg_iomem;
 MemoryRegion diag_iomem;
+MemoryRegion mdm_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -164,7 +165,7 @@ static const MemoryRegionOps slavio_diag_mem_ops = {
 };
 
 static void slavio_mdm_mem_writeb(void *opaque, target_phys_addr_t addr,
-  uint32_t val)
+  uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -172,7 +173,8 @@ static void slavio_mdm_mem_writeb(void *opaque, target_phys_addr_t addr,
 s->mctrl = val & 0xff;
 }
 
-static uint32_t slavio_mdm_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_mdm_mem_readb(void *opaque, target_phys_addr_t addr,
+ unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -182,16 +184,14 @@ static uint32_t slavio_mdm_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const slavio_mdm_mem_read[3] = {
-slavio_mdm_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_mdm_mem_write[3] = {
-slavio_mdm_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps slavio_mdm_mem_ops = {
+.read = slavio_mdm_mem_readb,
+.write = slavio_mdm_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 static void slavio_aux1_mem_writeb(void *opaque, target_phys_addr_t addr,
@@ -438,10 +438,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 sysbus_init_mmio_region(dev, &s->diag_iomem);
 
 /* Modem control */
-io = cpu_register_io_memory(slavio_mdm_mem_read,
-slavio_mdm_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->mdm_iomem, &slavio_mdm_mem_ops, s,
+  "modem", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->mdm_iomem);
 
 /* 16 bit registers */
 /* ss600mp diag LEDs */
-- 
1.7.5.4

[Qemu-devel] [PATCH 14/14] esp: Fix memory API conversion

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/esp.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 2f23df9..9ab41ba 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -745,6 +745,7 @@ static int esp_init1(SysBusDevice *dev)
 
 memory_region_init_io(&s->iomem, &esp_mem_ops, s,
   "esp", ESP_REGS << s->it_shift);
+sysbus_init_mmio_region(dev, &s->iomem);
 
 qdev_init_gpio_in(&dev->qdev, esp_gpio_demux, 2);
 
-- 
1.7.5.4

[Qemu-devel] [PATCH V2 04/12] hw/9pfs: Open and create files

2011-11-15 Thread M. Mohan Kumar

Add interfaces to open and create files for proxy file system driver.

Signed-off-by: M. Mohan Kumar 
---
 fsdev/virtfs-proxy-helper.c |  136 +++-
 hw/9pfs/virtio-9p-proxy.c   |  180 +--
 hw/9pfs/virtio-9p-proxy.h   |9 ++
 3 files changed, 314 insertions(+), 11 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 69daf7c..68d27f1 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -30,6 +30,8 @@
 #include "qemu-common.h"
 #include "virtio-9p-marshal.h"
 #include "hw/9pfs/virtio-9p-proxy.h"
+#include "fsdev/virtio-9p-marshal.h"
+
 
 #define PROGNAME "virtfs-proxy-helper"
 
@@ -148,20 +150,125 @@ static int read_request(int sockfd, struct iovec *iovec)
 ProxyHeader header;
 
 /* read the header */
-retval = socket_read(sockfd, iovec->iov_base, sizeof(header));
-if (retval != sizeof(header)) {
+retval = socket_read(sockfd, iovec->iov_base, HDR_SZ);
+if (retval != HDR_SZ) {
 return -EIO;
 }
 /* unmarshal header */
 proxy_unmarshal(iovec, 1, 0, "dd", &header.type, &header.size);
 /* read the request */
-retval = socket_read(sockfd, iovec->iov_base + sizeof(header), header.size);
+retval = socket_read(sockfd, iovec->iov_base + HDR_SZ, header.size);
 if (retval != header.size) {
 return -EIO;
 }
 return header.type;
 }
 
+static void send_fd(int sockfd, int fd)
+{
+struct msghdr msg = { };
+struct iovec iov;
+struct cmsghdr *cmsg;
+int retval, data;
+union MsgControl msg_control;
+
+iov.iov_base = &data;
+iov.iov_len = sizeof(data);
+
+memset(&msg, 0, sizeof(msg));
+msg.msg_iov = &iov;
+msg.msg_iovlen = 1;
+/* No ancillary data on error */
+if (fd < 0) {
+/*
+ * fd is really negative errno if the request failed. Or simply
+ * zero if the request is successful and it doesn't need a file
+ * descriptor.
+ */
+data = fd;
+} else {
+data = V9FS_FD_VALID;
+msg.msg_control = &msg_control;
+msg.msg_controllen = sizeof(msg_control);
+
+cmsg = &msg_control.cmsg;
+cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
+cmsg->cmsg_level = SOL_SOCKET;
+cmsg->cmsg_type = SCM_RIGHTS;
+memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
+}
+
+do {
+retval = sendmsg(sockfd, &msg, 0);
+} while (retval < 0 && errno == EINTR);
+if (retval < 0) {
+do_perror("sendmsg");
+exit(1);
+}
+if (fd >= 0) {
+close(fd);
+}
+}
+
+/*
+ * from man 7 capabilities, section
+ * Effect of User ID Changes on Capabilities:
+ * 4. If the file system user ID is changed from 0 to nonzero (see setfsuid(2))
+ * then the following capabilities are cleared from the effective set:
+ * CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH,  CAP_FOWNER, CAP_FSETID,
+ * CAP_LINUX_IMMUTABLE  (since  Linux 2.2.30), CAP_MAC_OVERRIDE, and CAP_MKNOD
+ * (since Linux 2.2.30). If the file system UID is changed from nonzero to 0,
+ * then any of these capabilities that are enabled in the permitted set
+ * are enabled in the effective set.
+ */
+static int setfsugid(int uid, int gid)
+{
+setfsgid(gid);
+setfsuid(uid);
+return cap_set();
+}
+
+/*
+ * create a file and send fd on success
+ * return -errno on error
+ */
+static int do_create(struct iovec *iovec)
+{
+V9fsString path;
+int flags, fd, mode, uid, gid, cur_uid, cur_gid;
+proxy_unmarshal(iovec, 1, HDR_SZ, "sdddd",
+   &path, &flags, &mode, &uid, &gid);
+cur_uid = geteuid();
+cur_gid = getegid();
+if (setfsugid(uid, gid) < 0) {
+return -EPERM;
+}
+fd = open(path.data, flags, mode);
+if (fd < 0) {
+fd = -errno;
+}
+v9fs_string_free(&path);
+setfsugid(cur_uid, cur_gid);
+return fd;
+}
+
+/*
+ * open a file and send fd on success
+ * return -errno on error
+ */
+static int do_open(struct iovec *iovec)
+{
+V9fsString path;
+int flags, fd;
+proxy_unmarshal(iovec, 1, HDR_SZ, "sd", &path, &flags);
+fd = open(path.data, flags);
+if (fd < 0) {
+fd = -errno;
+}
+v9fs_string_free(&path);
+return fd;
+}
+
 static void usage(char *prog)
 {
 fprintf(stderr, "usage: %s\n"
@@ -173,15 +280,34 @@ static void usage(char *prog)
 
 static int process_requests(int sock)
 {
-int type;
+int type, retval = 0;
 struct iovec iovec;
 
 iovec.iov_base = g_malloc(BUFF_SZ);
 iovec.iov_len = BUFF_SZ;
 while (1) {
 type = read_request(sock, &iovec);
-if (type <= 0) {
+switch (type) {
+case T_OPEN:
+retval = do_open(&iovec);
+break;
+case T_CREATE:
+retval = do_create(&iovec);
+break;
+default:
+goto error;
+break;
+}
+
+/* Send response */
+switch
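
send_fd() above passes an open descriptor over the UNIX socket as SCM_RIGHTS
ancillary data. A minimal sketch of both directions, assuming a one-int
payload alongside the descriptor. sketch_send_fd, sketch_recv_fd and
SketchFdCtl are hypothetical names, and the payload convention is simplified
relative to the patch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer big enough for one descriptor, suitably aligned. */
typedef union {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
} SketchFdCtl;

/* Send `fd` over the UNIX socket `sock`; returns 0 or -errno. */
int sketch_send_fd(int sock, int fd)
{
    int data = 0;
    struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
    SketchFdCtl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

    do {
        n = sendmsg(sock, &msg, 0);
    } while (n < 0 && errno == EINTR);
    return n < 0 ? -errno : 0;
}

/* Receive a descriptor from `sock`; returns the new fd or -errno. */
int sketch_recv_fd(int sock)
{
    int data = 0, fd = -1;
    struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
    SketchFdCtl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    do {
        n = recvmsg(sock, &msg, 0);
    } while (n < 0 && errno == EINTR);
    if (n < 0) {
        return -errno;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (n == 0 || cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -EIO;   /* peer closed, or no descriptor attached */
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

The received descriptor is a new entry in the receiver's descriptor table
that refers to the same open file, so the sender can close its copy after
sendmsg() succeeds, as send_fd() does.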

[Qemu-devel] [PATCH 05/14] slavio_misc: convert leds to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 9110c64..db266ba 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -39,6 +39,7 @@ typedef struct MiscState {
 MemoryRegion cfg_iomem;
 MemoryRegion diag_iomem;
 MemoryRegion mdm_iomem;
+MemoryRegion led_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -345,7 +346,8 @@ static CPUWriteMemoryFunc * const slavio_sysctrl_mem_write[3] = {
 slavio_sysctrl_mem_writel,
 };
 
-static uint32_t slavio_led_mem_readw(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_led_mem_readw(void *opaque, target_phys_addr_t addr,
+ unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -362,7 +364,7 @@ static uint32_t slavio_led_mem_readw(void *opaque, target_phys_addr_t addr)
 }
 
 static void slavio_led_mem_writew(void *opaque, target_phys_addr_t addr,
-  uint32_t val)
+  uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -376,16 +378,14 @@ static void slavio_led_mem_writew(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const slavio_led_mem_read[3] = {
-NULL,
-slavio_led_mem_readw,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_led_mem_write[3] = {
-NULL,
-slavio_led_mem_writew,
-NULL,
+static const MemoryRegionOps slavio_led_mem_ops = {
+.read = slavio_led_mem_readw,
+.write = slavio_led_mem_writew,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 2,
+.max_access_size = 2,
+},
 };
 
 static const VMStateDescription vmstate_misc = {
@@ -444,10 +444,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 
 /* 16 bit registers */
 /* ss600mp diag LEDs */
-io = cpu_register_io_memory(slavio_led_mem_read,
-slavio_led_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->led_iomem, &slavio_led_mem_ops, s,
+  "leds", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->led_iomem);
 
 /* 32 bit registers */
 /* System control */
-- 
1.7.5.4

[Qemu-devel] [PATCH 02/14] slavio_misc: convert configuration to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_misc.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/slavio_misc.c b/hw/slavio_misc.c
index 7d427f7..7b98f11 100644
--- a/hw/slavio_misc.c
+++ b/hw/slavio_misc.c
@@ -36,6 +36,7 @@
 
 typedef struct MiscState {
 SysBusDevice busdev;
+MemoryRegion cfg_iomem;
 qemu_irq irq;
 qemu_irq fdc_tc;
 uint32_t dummy;
@@ -101,7 +102,7 @@ static void slavio_set_power_fail(void *opaque, int irq, int power_failing)
 }
 
 static void slavio_cfg_mem_writeb(void *opaque, target_phys_addr_t addr,
-  uint32_t val)
+  uint64_t val, unsigned size)
 {
 MiscState *s = opaque;
 
@@ -110,7 +111,8 @@ static void slavio_cfg_mem_writeb(void *opaque, target_phys_addr_t addr,
 slavio_misc_update_irq(s);
 }
 
-static uint32_t slavio_cfg_mem_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_cfg_mem_readb(void *opaque, target_phys_addr_t addr,
+ unsigned size)
 {
 MiscState *s = opaque;
 uint32_t ret = 0;
@@ -120,16 +122,14 @@ static uint32_t slavio_cfg_mem_readb(void *opaque, target_phys_addr_t addr)
 return ret;
 }
 
-static CPUReadMemoryFunc * const slavio_cfg_mem_read[3] = {
-slavio_cfg_mem_readb,
-NULL,
-NULL,
-};
-
-static CPUWriteMemoryFunc * const slavio_cfg_mem_write[3] = {
-slavio_cfg_mem_writeb,
-NULL,
-NULL,
+static const MemoryRegionOps slavio_cfg_mem_ops = {
+.read = slavio_cfg_mem_readb,
+.write = slavio_cfg_mem_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
 };
 
 static void slavio_diag_mem_writeb(void *opaque, target_phys_addr_t addr,
@@ -428,10 +428,9 @@ static int slavio_misc_init1(SysBusDevice *dev)
 
 /* 8 bit registers */
 /* Slavio control */
-io = cpu_register_io_memory(slavio_cfg_mem_read,
-slavio_cfg_mem_write, s,
-DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, MISC_SIZE, io);
+memory_region_init_io(&s->cfg_iomem, &slavio_cfg_mem_ops, s,
+  "configuration", MISC_SIZE);
+sysbus_init_mmio_region(dev, &s->cfg_iomem);
 
 /* Diagnostics */
 io = cpu_register_io_memory(slavio_diag_mem_read,
-- 
1.7.5.4

[Qemu-devel] [PATCH V2 03/12] hw/9pfs: File system helper process for qemu 9p proxy FS

2011-11-15 Thread M. Mohan Kumar

Provide root privilege access to QEMU 9p proxy filesystem using socket
communication.

Proxy helper is started by root user as:
~ # virtfs-proxy-helper -f|--fd  -p|--path 

Signed-off-by: M. Mohan Kumar 
---
 Makefile|3 +
 configure   |   19 +++
 fsdev/virtfs-proxy-helper.c |  271 +++
 hw/9pfs/virtio-9p-proxy.h   |   10 ++
 4 files changed, 303 insertions(+), 0 deletions(-)
 create mode 100644 fsdev/virtfs-proxy-helper.c

diff --git a/Makefile b/Makefile
index ba8d738..19b481a 100644
--- a/Makefile
+++ b/Makefile
@@ -153,6 +153,9 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y)
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
 
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o
+fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $@")
 
diff --git a/configure b/configure
index a6cf6d6..a2b55e8 100755
--- a/configure
+++ b/configure
@@ -1873,6 +1873,22 @@ else
 fi
 
 ##
+# libcap probe
+
+if test "$cap" != "no" ; then
+  cat > $TMPC <
+#include 
+int main(void) { cap_t caps; caps = cap_init(); }
+EOF
+  if compile_prog "" "-lcap" ; then
+cap=yes
+  else
+cap=no
+  fi
+fi
+
+##
 # pthread probe
 PTHREADLIBS_LIST="-pthread -lpthread -lpthreadGC2"
 
@@ -2662,6 +2678,9 @@ confdir=$sysconfdir$confsuffix
 tools=
 if test "$softmmu" = yes ; then
   tools="qemu-img\$(EXESUF) qemu-io\$(EXESUF) $tools"
+  if [ "$cap" = "yes" -a "$linux" = "yes" ] ; then
+  tools="$tools fsdev/virtfs-proxy-helper\$(EXESUF)"
+  fi
   if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" ] ; then
   tools="qemu-nbd\$(EXESUF) $tools"
 if [ "$guest_agent" = "yes" ]; then
diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
new file mode 100644
index 000..69daf7c
--- /dev/null
+++ b/fsdev/virtfs-proxy-helper.c
@@ -0,0 +1,271 @@
+/*
+ * Helper for QEMU Proxy FS Driver
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ * M. Mohan Kumar 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "bswap.h"
+#include 
+#include "qemu-common.h"
+#include "virtio-9p-marshal.h"
+#include "hw/9pfs/virtio-9p-proxy.h"
+
+#define PROGNAME "virtfs-proxy-helper"
+
+static struct option helper_opts[] = {
+{"fd", required_argument, NULL, 'f'},
+{"path", required_argument, NULL, 'p'},
+{"nodaemon", no_argument, NULL, 'n'},
+};
+
+int is_daemon;
+
+static void do_perror(const char *string)
+{
+if (is_daemon) {
+syslog(LOG_CRIT, "%s:%s", string, strerror(errno));
+} else {
+fprintf(stderr, "%s:%s\n", string, strerror(errno));
+}
+}
+
+static void do_log(int level, const char *string)
+{
+if (is_daemon) {
+syslog(level, "%s", string);
+} else {
+fprintf(stderr, "%s\n", string);
+}
+}
+
+static int cap_set(void)
+{
+int retval;
+cap_t caps;
+cap_value_t cap_list[10];
+
+/* helper needs the following capabilities only */
+cap_list[0] = CAP_CHOWN;
+cap_list[1] = CAP_DAC_OVERRIDE;
+cap_list[2] = CAP_DAC_READ_SEARCH;
+cap_list[3] = CAP_FOWNER;
+cap_list[4] = CAP_FSETID;
+cap_list[5] = CAP_SETGID;
+cap_list[6] = CAP_MKNOD;
+cap_list[7] = CAP_SETUID;
+
+caps = cap_init();
+if (caps == NULL) {
+do_perror("cap_init");
+return -1;
+}
+retval = cap_set_flag(caps, CAP_PERMITTED, 8, cap_list, CAP_SET);
+if (retval < 0) {
+do_perror("cap_set_flag");
+goto error;
+}
+retval = cap_set_proc(caps);
+if (retval < 0) {
+do_perror("cap_set_proc");
+}
+retval = cap_set_flag(caps, CAP_EFFECTIVE, 8, cap_list, CAP_SET);
+if (retval < 0) {
+do_perror("cap_set_flag");
+goto error;
+}
+retval = cap_set_proc(caps);
+if (retval < 0) {
+do_perror("cap_set_proc");
+}
+
+error:
+cap_free(caps);
+return retval;
+}
+
+static int init_capabilities(void)
+{
+if (prctl(PR_SET_KEEPCAPS, 1) < 0) {
+do_perror("prctl");
+return -1;
+}
+if (cap_set() < 0) {
+return -1;
+}
+return 0;
+}
+
+static int socket_read(int sockfd, void *buff, ssize_t size)
+{
+int retval;
+
+do {
+retval = read(sockfd, buff, size);
+} while (retval < 0 && errno == EINTR);
+if (retval != size) {
+return -EIO;
+}
+return retval;
+}
+
+static int socket_write(int sockfd, void *buff,

[Qemu-devel] [PATCH 12/14] slavio_timer: convert to memory API

2011-11-15 Thread Benoît Canet

Signed-off-by: Benoit Canet 
---
 hw/slavio_timer.c |   41 -
 1 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/hw/slavio_timer.c b/hw/slavio_timer.c
index 84449ba..c23f990 100644
--- a/hw/slavio_timer.c
+++ b/hw/slavio_timer.c
@@ -61,6 +61,7 @@ typedef struct SLAVIO_TIMERState {
 } SLAVIO_TIMERState;
 
 typedef struct TimerContext {
+MemoryRegion iomem;
 SLAVIO_TIMERState *s;
 unsigned int timer_index; /* 0 for system, 1 ... MAX_CPUS for CPU timers */
 } TimerContext;
@@ -128,7 +129,8 @@ static void slavio_timer_irq(void *opaque)
 }
 }
 
-static uint32_t slavio_timer_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t slavio_timer_mem_readl(void *opaque, target_phys_addr_t addr,
+   unsigned size)
 {
 TimerContext *tc = opaque;
 SLAVIO_TIMERState *s = tc->s;
@@ -188,7 +190,7 @@ static uint32_t slavio_timer_mem_readl(void *opaque, target_phys_addr_t addr)
 }
 
 static void slavio_timer_mem_writel(void *opaque, target_phys_addr_t addr,
-uint32_t val)
+uint64_t val, unsigned size)
 {
 TimerContext *tc = opaque;
 SLAVIO_TIMERState *s = tc->s;
@@ -311,16 +313,14 @@ static void slavio_timer_mem_writel(void *opaque, target_phys_addr_t addr,
 }
 }
 
-static CPUReadMemoryFunc * const slavio_timer_mem_read[3] = {
-NULL,
-NULL,
-slavio_timer_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const slavio_timer_mem_write[3] = {
-NULL,
-NULL,
-slavio_timer_mem_writel,
+static const MemoryRegionOps slavio_timer_mem_ops = {
+.read = slavio_timer_mem_readl,
+.write = slavio_timer_mem_writel,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
 };
 
 static const VMStateDescription vmstate_timer = {
@@ -374,13 +374,15 @@ static void slavio_timer_reset(DeviceState *d)
 
 static int slavio_timer_init1(SysBusDevice *dev)
 {
-int io;
 SLAVIO_TIMERState *s = FROM_SYSBUS(SLAVIO_TIMERState, dev);
 QEMUBH *bh;
 unsigned int i;
 TimerContext *tc;
 
 for (i = 0; i <= MAX_CPUS; i++) {
+uint64_t size;
+char timer_name[20];
+
 tc = g_malloc0(sizeof(TimerContext));
 tc->s = s;
 tc->timer_index = i;
@@ -389,14 +391,11 @@ static int slavio_timer_init1(SysBusDevice *dev)
 s->cputimer[i].timer = ptimer_init(bh);
 ptimer_set_period(s->cputimer[i].timer, TIMER_PERIOD);
 
-io = cpu_register_io_memory(slavio_timer_mem_read,
-slavio_timer_mem_write, tc,
-DEVICE_NATIVE_ENDIAN);
-if (i == 0) {
-sysbus_init_mmio(dev, SYS_TIMER_SIZE, io);
-} else {
-sysbus_init_mmio(dev, CPU_TIMER_SIZE, io);
-}
+size = i == 0 ? SYS_TIMER_SIZE : CPU_TIMER_SIZE;
+snprintf(timer_name, sizeof(timer_name), "timer-%i", i);
+memory_region_init_io(&tc->iomem, &slavio_timer_mem_ops, tc,
+  timer_name, size);
+sysbus_init_mmio_region(dev, &tc->iomem);
 
 sysbus_init_irq(dev, &s->cputimer[i].irq);
 }
-- 
1.7.5.4
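
For readers unfamiliar with the pattern this conversion adopts, here is a minimal, self-contained sketch of an ops table that bundles a read callback, a write callback and access-size limits. ToyRegionOps, ToyTimer and the toy_* functions are made up for illustration and are not QEMU's MemoryRegionOps API. The sketch simply rejects accesses outside the declared 4-byte limit; that is an assumption of the sketch, not a statement about how QEMU treats such accesses.

```c
#include <stdint.h>

/* Simplified stand-in for an MMIO ops table; NOT QEMU's MemoryRegionOps. */
typedef struct ToyRegionOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t val, unsigned size);
    unsigned min_access_size;   /* smallest access the device accepts */
    unsigned max_access_size;   /* largest access the device accepts */
} ToyRegionOps;

/* Hypothetical device state: one 32-bit limit register at offset 0. */
typedef struct ToyTimer {
    uint32_t limit;
} ToyTimer;

static uint64_t toy_timer_read(void *opaque, uint64_t addr, unsigned size)
{
    ToyTimer *t = opaque;

    (void)size;                 /* the dispatcher only lets 4-byte accesses in */
    return addr == 0 ? t->limit : 0;
}

static void toy_timer_write(void *opaque, uint64_t addr, uint64_t val,
                            unsigned size)
{
    ToyTimer *t = opaque;

    (void)size;
    if (addr == 0) {
        t->limit = (uint32_t)val;
    }
}

/* One table replaces the per-width arrays of function pointers. */
static const ToyRegionOps toy_timer_ops = {
    .read = toy_timer_read,
    .write = toy_timer_write,
    .min_access_size = 4,
    .max_access_size = 4,
};

/* Return 1 if an access of the given size is within the declared bounds. */
static int toy_access_ok(const ToyRegionOps *ops, unsigned size)
{
    return size >= ops->min_access_size && size <= ops->max_access_size;
}

/* Dispatch a read; returns 0 on success, -1 if the size is rejected. */
static int toy_region_read(const ToyRegionOps *ops, void *opaque,
                           uint64_t addr, unsigned size, uint64_t *val)
{
    if (!toy_access_ok(ops, size)) {
        return -1;
    }
    *val = ops->read(opaque, addr, size);
    return 0;
}

/* Dispatch a write; returns 0 on success, -1 if the size is rejected. */
static int toy_region_write(const ToyRegionOps *ops, void *opaque,
                            uint64_t addr, uint64_t val, unsigned size)
{
    if (!toy_access_ok(ops, size)) {
        return -1;
    }
    ops->write(opaque, addr, val, size);
    return 0;
}
```

The point of the design is that the size constraint lives next to the callbacks as data, instead of being encoded as NULL slots in a three-entry array.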

[Qemu-devel] [PATCH v3 6/6] LICENSE: There is no libqemu.a anymore

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove statement about libqemu.a from LICENSE.

Signed-off-by: Chen Wei-Ren 
---
 LICENSE |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/LICENSE b/LICENSE
index cbd92c0..acae9a3 100644
--- a/LICENSE
+++ b/LICENSE
@@ -6,9 +6,7 @@ The following points clarify the QEMU license:
 GNU General Public License. Hence each source file contains its own
 licensing information.
 
-In particular, the QEMU virtual CPU core library (libqemu.a) is
-released under the GNU Lesser General Public License. Many hardware
-device emulation sources are released under the BSD license.
+Many hardware device emulation sources are released under the BSD license.
 
 3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 5/6] Makefile.objs: Remove libqemu_common.a from the comment

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove libqemu_common.a from the comment.

Signed-off-by: Chen Wei-Ren 
---
 Makefile.objs |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index d7a6539..64c5c24 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -68,10 +68,9 @@ endif
 fsdev-obj-$(CONFIG_VIRTFS) += $(addprefix fsdev/, $(fsdev-nested-y))
 
 ##
-# libqemu_common.a: Target independent part of system emulation. The
-# long term path is to suppress *all* target specific code in case of
-# system emulation, i.e. a single QEMU executable should support all
-# CPUs and machines.
+# Target independent part of system emulation. The long term path is
+# to suppress *all* target specific code in case of system emulation,
+# i.e. a single QEMU executable should support all CPUs and machines.
 
 common-obj-y = $(block-obj-y) blockdev.o
 common-obj-y += $(net-obj-y)
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 1/6] tests/Makefile: Remove qruncom target

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove the qruncom target from the Makefile.

Signed-off-by: Chen Wei-Ren 
---
 tests/Makefile |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index 430e0c1..15e36a2 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -115,12 +115,6 @@ speed: sha1 sha1-i386
time ./sha1
time $(QEMU) ./sha1-i386
 
-# broken test
-# NOTE: -fomit-frame-pointer is currently needed : this is a bug in libqemu
-qruncom: qruncom.c ../ioport-user.c ../i386-user/libqemu.a
-   $(CC) $(CFLAGS) -fomit-frame-pointer $(LDFLAGS) -I../target-i386 -I.. -I../i386-user -I../fpu \
-  -o $@ $(filter %.c, $^) -L../i386-user -lqemu -lm
-
 # arm test
 hello-arm: hello-arm.o
arm-linux-ld -o $@ $<
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 4/6] Makefile.target: Remove out of date comment

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove the out of date comment, i.e., "# libqemu" since libqemu.a is not
available anymore.

Signed-off-by: Chen Wei-Ren 
---
v3:
 - Only remove out-of-date comment about libqemu.a from Makefile.target,
   leave manually inserted dependencies alone.

 Makefile.target |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index a111521..7369a89 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -92,8 +92,6 @@ tci-dis.o: QEMU_CFLAGS += -I$(SRC_PATH)/tcg -I$(SRC_PATH)/tcg/tci
 
 $(libobj-y): $(GENERATED_HEADERS)
 
-# libqemu
-
 translate.o: translate.c cpu.h
 
 translate-all.o: translate-all.c cpu.h
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 3/6] qemu-tech.texi: Remove libqemu related stuff from the document

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove libqemu related stuff from the document since libqemu.a is not
supported anymore.

Signed-off-by: Chen Wei-Ren 
---
 qemu-tech.texi |   10 --
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/qemu-tech.texi b/qemu-tech.texi
index 62afe45..5676fb7 100644
--- a/qemu-tech.texi
+++ b/qemu-tech.texi
@@ -96,10 +96,6 @@ Alpha and S390 hosts, but TCG (see below) doesn't support those yet.
 
 @item Precise exceptions support.
 
-@item The virtual CPU is a library (@code{libqemu}) which can be used
-in other projects (look at @file{qemu/tests/qruncom.c} to have an
-example of user mode @code{libqemu} usage).
-
 @item
 Floating point library supporting both full software emulation and
 native host FPU instructions.
@@ -685,7 +681,6 @@ are available. They are used for regression testing.
 @menu
 * test-i386::
 * linux-test::
-* qruncom.c::
 @end menu
 
 @node test-i386
@@ -711,11 +706,6 @@ This program tests various Linux system calls. It is used to verify
 that the system call parameters are correctly converted between target
 and host CPUs.
 
-@node qruncom.c
-@section @file{qruncom.c}
-
-Example of usage of @code{libqemu} to emulate a user mode i386 CPU.
-
 @node Index
 @chapter Index
 @printindex cp
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 2/6] tests/qruncom.c: Remove libqemu.a example

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  Remove libqemu example since libqemu.a is not available anymore.

Signed-off-by: Chen Wei-Ren 
---
 tests/qruncom.c |  284 ---
 1 files changed, 0 insertions(+), 284 deletions(-)
 delete mode 100644 tests/qruncom.c

diff --git a/tests/qruncom.c b/tests/qruncom.c
deleted file mode 100644
index 2e93aaf..000
--- a/tests/qruncom.c
+++ /dev/null
@@ -1,284 +0,0 @@
-/*
- * Example of use of user mode libqemu: launch a basic .com DOS
- * executable
- */
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "cpu.h"
-
-//#define SIGTEST
-
-int cpu_get_pic_interrupt(CPUState *env)
-{
-return -1;
-}
-
-uint64_t cpu_get_tsc(CPUState *env)
-{
-return 0;
-}
-
-static void set_gate(void *ptr, unsigned int type, unsigned int dpl,
- unsigned long addr, unsigned int sel)
-{
-unsigned int e1, e2;
-e1 = (addr & 0x) | (sel << 16);
-e2 = (addr & 0x) | 0x8000 | (dpl << 13) | (type << 8);
-stl((uint8_t *)ptr, e1);
-stl((uint8_t *)ptr + 4, e2);
-}
-
-uint64_t idt_table[256];
-
-/* only dpl matters as we do only user space emulation */
-static void set_idt(int n, unsigned int dpl)
-{
-set_gate(idt_table + n, 0, dpl, 0, 0);
-}
-
-void g_free(void *ptr)
-{
-free(ptr);
-}
-
-void *g_malloc(size_t size)
-{
-return malloc(size);
-}
-
-void *g_malloc0(size_t size)
-{
-void *ptr;
-ptr = g_malloc(size);
-if (!ptr)
-return NULL;
-memset(ptr, 0, size);
-return ptr;
-}
-
-void *qemu_vmalloc(size_t size)
-{
-return memalign(4096, size);
-}
-
-void qemu_vfree(void *ptr)
-{
-free(ptr);
-}
-
-void qemu_printf(const char *fmt, ...)
-{
-va_list ap;
-va_start(ap, fmt);
-vprintf(fmt, ap);
-va_end(ap);
-}
-
-/* XXX: this is a bug in helper2.c */
-int errno;
-
-/**/
-
-#define COM_BASE_ADDR0x10100
-
-static void usage(void)
-{
-printf("qruncom version 0.1 (c) 2003 Fabrice Bellard\n"
-   "usage: qruncom file.com\n"
-   "user mode libqemu demo: run simple .com DOS executables\n");
-exit(1);
-}
-
-static inline uint8_t *seg_to_linear(unsigned int seg, unsigned int reg)
-{
-return (uint8_t *)((seg << 4) + (reg & 0x));
-}
-
-static inline void pushw(CPUState *env, int val)
-{
-env->regs[R_ESP] = (env->regs[R_ESP] & ~0x) | ((env->regs[R_ESP] - 2) & 0x);
-*(uint16_t *)seg_to_linear(env->segs[R_SS].selector, env->regs[R_ESP]) = val;
-}
-
-static void host_segv_handler(int host_signum, siginfo_t *info,
-  void *puc)
-{
-if (cpu_signal_handler(host_signum, info, puc)) {
-return;
-}
-abort();
-}
-
-int main(int argc, char **argv)
-{
-uint8_t *vm86_mem;
-const char *filename;
-int fd, ret, seg;
-CPUState *env;
-
-if (argc != 2)
-usage();
-filename = argv[1];
-
-vm86_mem = mmap((void *)0x, 0x11,
-PROT_WRITE | PROT_READ | PROT_EXEC,
-MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
-if (vm86_mem == MAP_FAILED) {
-perror("mmap");
-exit(1);
-}
-
-/* load the MSDOS .com executable */
-fd = open(filename, O_RDONLY);
-if (fd < 0) {
-perror(filename);
-exit(1);
-}
-ret = read(fd, vm86_mem + COM_BASE_ADDR, 65536 - 256);
-if (ret < 0) {
-perror("read");
-exit(1);
-}
-close(fd);
-
-/* install exception handler for CPU emulator */
-{
-struct sigaction act;
-
-sigfillset(&act.sa_mask);
-act.sa_flags = SA_SIGINFO;
-//act.sa_flags |= SA_ONSTACK;
-
-act.sa_sigaction = host_segv_handler;
-sigaction(SIGSEGV, &act, NULL);
-sigaction(SIGBUS, &act, NULL);
-}
-
-//cpu_set_log(CPU_LOG_TB_IN_ASM | CPU_LOG_TB_OUT_ASM | CPU_LOG_EXEC);
-
-env = cpu_init("qemu32");
-
-cpu_x86_set_cpl(env, 3);
-
-env->cr[0] = CR0_PG_MASK | CR0_WP_MASK | CR0_PE_MASK;
-/* NOTE: hflags duplicates some of the virtual CPU state */
-env->hflags |= HF_PE_MASK | VM_MASK;
-
-/* flags setup : we activate the IRQs by default as in user
-   mode. We also activate the VM86 flag to run DOS code */
-env->eflags |= IF_MASK | VM_MASK;
-
-/* init basic registers */
-env->eip = 0x100;
-env->regs[R_ESP] = 0xfffe;
-seg = (COM_BASE_ADDR - 0x100) >> 4;
-
-cpu_x86_load_seg_cache(env, R_CS, seg,
-   (seg << 4), 0x, 0);
-cpu_x86_load_seg_cache(env, R_SS, seg,
-   (seg << 4), 0x, 0);
-cpu_x86_load_seg_cache(env, R_DS, seg,
-   (seg << 4), 0x, 0);
-cpu_x86_load_seg_cache(env, R_ES, seg,
-   (seg << 4), 0x, 0);
-cpu_x86_load_seg_cache(env, R_FS, seg,
-   (seg << 4), 0x, 0);
-cpu_x86

[Qemu-devel] [PATCH v3 0/6] Remove libqemu related stuff from QEMU source tree

2011-11-15 Thread 陳韋任

From: Chen Wei-Ren 

  According to [1], libqemu is not available anymore. Remove libqemu
related stuff from QEMU source tree.

[1] http://www.mail-archive.com/address@hidden/msg49809.html

v2:
 - Remove entry "qruncom.c" from "3 Regression Tests" in qemu-tech.texi.
 - Undo the deletion of common-obj-y. Only remove libqemu_common.a from
   the comment.

v3:
 - Change tests/Makefile first before removing tests/qruncom.c.
 - Only remove out-of-date comment about libqemu.a from Makefile.target,
   leave manually inserted dependencies alone.

Chen Wei-Ren (6):
  tests/Makefile: Remove qruncom target
  tests/qruncom.c: Remove libqemu.a example
  qemu-tech.texi: Remove libqemu related stuff from the document
  Makefile.target: Remove out of date comment
  Makefile.objs: Remove libqemu_common.a from the comment
  LICENSE: There is no libqemu.a anymore

 LICENSE |4 +-
 Makefile.objs   |7 +-
 Makefile.target |2 -
 qemu-tech.texi  |   10 --
 tests/Makefile  |6 -
 tests/qruncom.c |  284 ---
 6 files changed, 4 insertions(+), 309 deletions(-)
 delete mode 100644 tests/qruncom.c

-- 
1.7.3.4

Re: [Qemu-devel] [PATCH] migration: add a MAINTAINERS entry for migration

2011-11-15 Thread Avi Kivity

On 11/15/2011 10:32 AM, Stefan Hajnoczi wrote:
> It would help to have a migration wiki page or document that explains
> the implications of migration on QEMU code - what to look out for in
> device emulation code.
>
> Although regular QEMU contributors may know the background on
> migration/save/load, it would be not only helpful for new contributors
> but also a good refresher for those of us who have picked up the
> assumptions around migration piecewise.
>
> I think a good document would raise migration awareness and help us
> review new patches with an eye towards correct migration behavior.
>
> The rules need to be laid down by someone who understands migration
> quite well.
>

Good idea.  There needs to be a good explanation of what the migration
state is; I think that's the biggest obstacle.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] Windows 7 shutdown causes BSOD

2011-11-15 Thread Stefan Hajnoczi

On Fri, Nov 4, 2011 at 11:25 AM, Stefan Hajnoczi  wrote:
> On Fri, Nov 4, 2011 at 10:48 AM, Stefan Hajnoczi  wrote:
>> Windows 7 32-bit guest blue screens when I shut it down properly with
>> Start | Shut Down.  The blue screen is only displayed for a split
>> second before the guest reboots so I am not able to easily tell what
>> it says.  My guess is that Windows is triple-faulting or soft
>> rebooting - note that I told Windows to shut down, not reboot.
>>
>> This issue happens on qemu.git/master (and Debian kvm 0.14.1+dfsg-3).
>> Here is the QEMU command-line:
>>
>> x86_64-softmmu/qemu-system-x86_64 -L pc-bios -cpu qemu32 -enable-kvm
>> -m 1024 -rtc base=localtime -drive
>> file=win7.img,if=none,id=drive-ide0-0-0,format=raw -device
>> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
>>
>> Questions:
>>
>> Is anyone else experiencing this?
>>
>> Is anyone fixing this?
>>
>> If not I will play with it.  Disabling ACPI might reveal the source of
>> the problem.  If that turns up nothing I will try to get the BSOD or
>> WinDbg output.
>
> Thanks to Andreas Faerber and Michael Tokarev I found out the
> automatic reboot can be disabled in Windows.  Here is the BSOD
> information:
>
> IRQL_NOT_LESS_OR_EQUAL
> STOP: 0x000A (0x,0x00FF,0x0001,0x828B7220)

This decodes to:
"Windows or a kernel-mode driver accessed paged memory at
DISPATCH_LEVEL or above."

Memory referenced: 0x
IRQL: 0xff
Read/write: Write (1)
Address which referenced memory: 0x828B7220

http://msdn.microsoft.com/en-us/library/ff560129%28v=VS.85%29.aspx

Looks like a NULL pointer reference or maybe a deliberate "we should
never get here" failure.

Stefan

Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions

2011-11-15 Thread Juan Quintela

Anthony Liguori  wrote:
> On 11/14/2011 04:16 AM, Daniel P. Berrange wrote:
>> On Sat, Nov 12, 2011 at 12:25:34PM +0200, Avi Kivity wrote:
>>> On 11/11/2011 12:15 PM, Kevin Wolf wrote:
>>>> Am 10.11.2011 22:30, schrieb Anthony Liguori:
>>>>> Live migration with qcow2 or any other image format is just not going to
>>>>> work right now even with proper clustered storage.  I think doing a block
>>>>> level flush cache interface and letting block devices decide how to do it
>>>>> is the best approach.
>>>>
>>>> I would really prefer reusing the existing open/close code. It means
>>>> less (duplicated) code, is existing code that is well tested and doesn't
>>>> make migration much of a special case.
>>>>
>>>> If you want to avoid reopening the file on the OS level, we can reopen
>>>> only the topmost layer (i.e. the format, but not the protocol) for now
>>>> and in 1.1 we can use bdrv_reopen().
>>>>
>>>
>>> Intuitively I dislike _reopen style interfaces.  If the second open
>>> yields different results from the first, does it invalidate any
>>> computations in between?
>>>
>>> What's wrong with just delaying the open?
>>
>> If you delay the 'open' until the mgmt app issues 'cont', then you lose
>> the ability to rollback to the source host upon open failure for most
>> deployed versions of libvirt. We only fairly recently switched to a five
>> stage migration handshake to cope with rollback when 'cont' fails.
>
> Delayed open isn't a panacea.  With the series I sent, we should be
> able to migrate with a qcow2 file on coherent shared storage.
>
> There are two other cases that we care about: migration with nfs
> cache!=none and direct attached storage with cache!=none
>
> Whether the open is deferred matters less with NFS than if the open
> happens after the close on the source.  To fix NFS cache!=none, we
> would have to do a bdrv_close() before sending the last byte of
> migration data and make sure that we bdrv_open() after receiving the
> last byte of migration data.
>
> The problem with this IMHO is it creates a large window where no one
> has the file open and you're critically vulnerable to losing your VM.

A Red Hat NFS guru told me that fsync() on the source, followed by open()
on the target, is enough.  But anyway, it still depends on nothing else
having the file open on the target.

> I'm much more in favor of a smarter caching policy.  If we can fcntl()
> our way to O_DIRECT on NFS, that would be fairly interesting.  I'm not
> sure if this is supported today but it's something we could look into
> adding in the kernel. That way we could force NFS to O_DIRECT during
> migration which would solve this problem robustly.

We would need O_DIRECT on the target during migration; I agree that that
would work.

> Deferred open doesn't help with direct attached storage.  There simply
> is no guarantee that there isn't data in the page cache.

Yep, I asked the clustered filesystem people how they fixed the
problem, because clustered filesystems have this problem too.  After
lots of arm twisting, I got ioctl(BLKFLSBUF, ...), but that only
works:
- on Linux
- on some block devices

So, we are back to square 1.

> Again, I think defaulting DAS to cache=none|directsync is what makes
> the most sense here.

I think it is the only sane solution.  Otherwise, we need to write the
equivalent of a lock manager, to know _who_ has the storage, and
distributed lock managers are a mess :-(

> We can even add a migration blocker for DAS with cache=on.  If we can
> do dynamic toggling of the cache setting, then that's pretty friendly
> at the end of the day.

That could fix the problem also.  At the moment that we start migration,
we do an fsync() + switch to O_DIRECT for all filesystems.

As you said, time for implementing fcntl(O_DIRECT).

Later, Juan.
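
The "fsync() + switch to O_DIRECT" step mentioned above could look roughly like the sketch below. It assumes Linux, where fcntl(F_SETFL) is allowed to change O_DIRECT; set_direct_io() is a hypothetical helper name, not an existing QEMU function. Whether the flag change is honoured depends on the filesystem, and NFS is exactly the case this thread leaves open.

```c
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <fcntl.h>
#include <unistd.h>

/*
 * Flush dirty data for fd, then switch the descriptor to O_DIRECT
 * (enable != 0) or back to buffered I/O (enable == 0).
 * Returns 0 on success, -1 on failure with errno set.
 * Hypothetical helper for illustration; not part of QEMU.
 */
static int set_direct_io(int fd, int enable)
{
    int flags;

    /* Push cached writes to stable storage before changing the mode. */
    if (fsync(fd) < 0) {
        return -1;
    }

    flags = fcntl(fd, F_GETFL);
    if (flags < 0) {
        return -1;
    }

    if (enable) {
        flags |= O_DIRECT;
    } else {
        flags &= ~O_DIRECT;
    }

    /* Some filesystems refuse O_DIRECT; the caller must handle -1. */
    return fcntl(fd, F_SETFL, flags);
}
```

A caller would use set_direct_io(fd, 1) when migration starts and set_direct_io(fd, 0) once the guest runs on the destination with a cached mode. Note that O_DIRECT I/O also places alignment requirements on buffers and offsets, which the sketch does not address.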

Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions

2011-11-15 Thread Avi Kivity

On 11/14/2011 11:58 AM, Kevin Wolf wrote:
> Am 12.11.2011 11:25, schrieb Avi Kivity:
> > On 11/11/2011 12:15 PM, Kevin Wolf wrote:
> >> Am 10.11.2011 22:30, schrieb Anthony Liguori:
> >>> Live migration with qcow2 or any other image format is just not going to
> >>> work right now even with proper clustered storage.  I think doing a block
> >>> level flush cache interface and letting block devices decide how to do it
> >>> is the best approach.
> >>
> >> I would really prefer reusing the existing open/close code. It means
> >> less (duplicated) code, is existing code that is well tested and doesn't
> >> make migration much of a special case.
> >>
> >> If you want to avoid reopening the file on the OS level, we can reopen
> >> only the topmost layer (i.e. the format, but not the protocol) for now
> >> and in 1.1 we can use bdrv_reopen().
> > 
> > Intuitively I dislike _reopen style interfaces.  If the second open
> > yields different results from the first, does it invalidate any
> > computations in between?
>
> Not sure what results and what computation you mean,

Result = open succeeded.  Computation = anything that derives from the
image, like size, or reading some stuff to guess CHS or something.

>  but let me clarify
> a bit about bdrv_reopen:
>
> The main purpose of bdrv_reopen() is to change flags, for example toggle
> O_SYNC during runtime in order to allow the guest to toggle WCE. This
> doesn't necessarily mean a close()/open() sequence if there are other
> means to change the flags, like fcntl() (or even using other protocols
> than files).
>
> The idea here was to extend this to invalidate all caches if some
> specific flag is set. As you don't change any other flag, this will
> usually not be a reopen on a lower level.
>
> If we need to use open() though, and it fails (this is really the only
> "different" result that comes to mind)

(yes)

>  then bdrv_reopen() would fail and
> the old fd would stay in use. Migration would have to fail, but I don't
> think this case is ever needed for reopening after migration.

Okay.

>
> > What's wrong with just delaying the open?
>
> Nothing, except that with today's code it's harder to do.
>

This has never stopped us (though it may delay us).

-- 
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 1.0] scsi: fix fw path

2011-11-15 Thread Paolo Bonzini

The pre-1.0 firmware path for SCSI devices already included the LUN
using the suffix argument to add_boot_device_path.  Avoid including it
twice, and convert the colons to commas for consistency with other
kinds of devices.

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-bus.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 372fe7f..b4e6e29 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -1304,7 +1304,7 @@ static char *scsibus_get_fw_dev_path(DeviceState *dev)
 SCSIDevice *d = DO_UPCAST(SCSIDevice, qdev, dev);
 char path[100];
 
-snprintf(path, sizeof(path), "%s@%d:%d:%d", qdev_fw_name(dev),
+snprintf(path, sizeof(path), "%s@%d,%d,%d", qdev_fw_name(dev),
  d->channel, d->id, d->lun);
 
 return strdup(path);
-- 
1.7.7.1
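
To make the resulting path format concrete, here is a small standalone sketch that uses the same format string as the patched line. format_scsi_fw_path() and the "disk" name used in the usage note are illustrative only; in QEMU the name comes from qdev_fw_name(dev).

```c
#include <stdio.h>

/*
 * Format a firmware path component as "<name>@<channel>,<id>,<lun>",
 * using the comma-separated format string from the patch.
 * Illustrative helper; not part of QEMU.
 * Returns the snprintf() result (the length that would have been written).
 */
static int format_scsi_fw_path(char *buf, size_t len, const char *fw_name,
                               int channel, int id, int lun)
{
    return snprintf(buf, len, "%s@%d,%d,%d", fw_name, channel, id, lun);
}
```

With a hypothetical name "disk", channel 0, id 1 and lun 2, this yields "disk@0,1,2" where the old format string would have produced "disk@0:1:2".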

Re: [Qemu-devel] Summary of Anthony's 'next' queue

2011-11-15 Thread Anthony Liguori


On 11/14/2011 09:34 PM, Stefan Berger wrote:
> On 11/14/2011 03:17 PM, Anthony Liguori wrote:
>> Hi,
>>
>> This is a summary of the patches that I have queued in my next tree that were
>> identified as 1.1 candidates. These patches will not be applied until after
>> the 1.1 tree opens (December 1st).
>>
>> These patches have not been tested yet and may receive additional review
>> comments. This note is meant to let submitters know that their patch has not
>> been forgotten.
>>
>> Here's the full list:
>>
>> 050 stef...@linux.vnet.ibm.com Support for TPM command line options
>> 051 stef...@linux.vnet.ibm.com Add TPM (frontend) hardware interface (TPM TIS) to Qemu
>> 052 stef...@linux.vnet.ibm.com Add a debug register
>> 053 stef...@linux.vnet.ibm.com Build the TPM frontend code
>> 054 stef...@linux.vnet.ibm.com Add a TPM Passthrough backend driver implementation
>> 055 stef...@linux.vnet.ibm.com Introduce --enable-tpm-passthrough configure option
>> 056 stef...@linux.vnet.ibm.com Move parsing of filedescriptor into common function
>> 057 stef...@linux.vnet.ibm.com Add fd parameter for TPM passthrough driver
>
> I have two patches that fix a deallocation issue in an error path and that
> restrict the 'fd' passed via command line to only be of chardev type. Should I
> post it 'now' or only once 1.1 opens?

I would suggest sending them out in case anyone is interested.  Personally, I'm
focusing on release issues at the moment so I won't look at them for a couple
more weeks.

Regards,

Anthony Liguori

> Stefan

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Anthony Liguori


On 11/15/2011 02:28 AM, Avi Kivity wrote:
> On 11/15/2011 12:41 AM, Anthony Liguori wrote:
>> Right now our specs are written in pseudo-wiki syntax.  This series converts
>> them to markdown.  markdown is a simple markup format that's gaining in
>> popularity.
>>
>> The big advantage of using markdown is that there are tools that can convert it
>> to relatively simple HTML.  That means we can build a make infrastructure that
>> generates a nice set of static web pages.
>>
>> The syntax is also more human friendly than mediawiki syntax.
>>
>> To see what the stylized version of this looks like, check out:
>>
>>    https://github.com/aliguori/qemu/tree/markdown/docs/specs
>
> Nice.  Suggest you enable rename detection, to make patches like these
> easier to read (not that it truly matters in the particular case).

I haven't figured out yet how to make this sane to merge, but I've also
converted qemu-doc.texi to a bunch of separate markdown files[1].

The info is fairly out of date.  I'll try to get patches out RSN so that we can
all take a pass at trying to modernize some of the sections before the release.

[1] https://github.com/aliguori/qemu/tree/markdown/docs/manual

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH] migration: add a MAINTAINERS entry for migration

2011-11-15 Thread Anthony Liguori


On 11/15/2011 02:32 AM, Stefan Hajnoczi wrote:
> On Mon, Nov 14, 2011 at 03:08:25PM -0600, Anthony Liguori wrote:
>> On 11/14/2011 11:40 AM, Juan Quintela wrote:
>>> Anthony Liguori wrote:
>>>> I think this is an accurate reflection of the state of migration today.  This
>>>> is the second release in a row where we're scrambling to fix a critical issue
>>>> in migration.
>>>
>>> We need to make up our minds about it.
>>
>> Ultimately, we need to make migration a priority.  That's what I'm
>> trying to do here.
>>
>> The first step is to be open about the state of migration today.  I
>> personally don't have the bandwidth to invest a lot of effort in
>> migration, but I can invest time in trying to find more people to
>> work on migration, and help put together a proper roadmap.
>
> It would help to have a migration wiki page or document that explains
> the implications of migration on QEMU code - what to look out for in
> device emulation code.
>
> Although regular QEMU contributors may know the background on
> migration/save/load, it would be not only helpful for new contributors
> but also a good refresher for those of us who have picked up the
> assumptions around migration piecewise.
>
> I think a good document would raise migration awareness and help us
> review new patches with an eye towards correct migration behavior.
>
> The rules need to be laid down by someone who understands migration
> quite well.

100% agreed.

I'll volunteer to start by taking the storage requirements wiki page, converting
it to markdown, and adding it to docs/migration.

Regards,

Anthony Liguori

> Stefan

Re: [Qemu-devel] [PATCH] migration: add a MAINTAINERS entry for migration

2011-11-15 Thread Anthony Liguori


On 11/15/2011 03:36 AM, Kevin Wolf wrote:
> Am 14.11.2011 22:08, schrieb Anthony Liguori:
>> On 11/14/2011 11:40 AM, Juan Quintela wrote:
>>> Anthony Liguori wrote:
>>>> I think this is an accurate reflection of the state of migration today.  This
>>>> is the second release in a row where we're scrambling to fix a critical issue
>>>> in migration.
>>>
>>> We need to make up our minds about it.
>>
>> Ultimately, we need to make migration a priority.  That's what I'm trying to do
>> here.
>
> When you make everything a priority, being a priority doesn't have much
> of a meaning any more. Our current priorities are changing the entire
> device model, the monitor, migration, turning the block layer upside
> down - what's left? Okay, maybe vvfat and slirp.

Well, think of it as employment insurance :-)

>> The first step is to be open about the state of migration today.  I personally
>> don't have the bandwidth to invest a lot of effort in migration, but I can
>> invest time in trying to find more people to work on migration, and help put
>> together a proper roadmap.
>>
>> We need to outline and document what we support and what we don't support.  We
>> need to invest in a test infrastructure.  We need a roadmap that we can
>> reasonably execute on.  In short, we need to turn migration into a first class
>> subsystem.
>>
>> It's not about any single person or any single patch series.  It's about
>> deciding that migration is an important feature and deserves more focus and
>> attention.
>
> I don't doubt that everyone will agree with this. The harder question is
> who should concentrate less on which other feature to have time to spend
> for migration.

I don't think it's a question of trading patches in one subsystem for patches in
another subsystem.

I think it's more about having a planned, concerted effort that systematically
tackles the problems we're facing in migration.

By spending more time planning, it makes it much easier for people to
contribute.  There's a lot of interest in migration.  If we made it easier to
participate in improving it, I'm sure we would attract at least a few more
people to working on it.

Regards,

Anthony Liguori

> Kevin

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Avi Kivity

On 11/15/2011 03:44 PM, Anthony Liguori wrote:
> On 11/15/2011 02:28 AM, Avi Kivity wrote:
>> On 11/15/2011 12:41 AM, Anthony Liguori wrote:
>>> Right now our specs are written in pseudo-wiki syntax.  This series
>>> converts them to markdown.  markdown is a simple markup format that's
>>> gaining in popularity.
>>>
>>> The big advantage of using markdown is that there are tools that can
>>> convert it to relatively simple HTML.  That means we can build a make
>>> infrastructure that generates a nice set of static web pages.
>>>
>>> The syntax is also more human friendly than mediawiki syntax.
>>>
>>> To see what the stylized version of this looks like, check out:
>>>
>>>    https://github.com/aliguori/qemu/tree/markdown/docs/specs
>>>
>>>
>>
>> Nice.  Suggest you enable rename detection, to make patches like these
>> easier to read (not that it truly matters in the particular case).
>
> I haven't figured out yet how to make this sane to merge, but I've
> also converted qemu-doc.texi to a bunch of separate markdown files[1].
>
> The info is fairly out of date.  I'll try to get patches out RSN so
> that we can all take a pass at trying to modernize some of the
> sections before the release.
>
> [1] https://github.com/aliguori/qemu/tree/markdown/docs/manual
>

Does markdown support rendering into man pages?

A similar alternative is asciidoc, which is used by git.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions

2011-11-15 Thread Anthony Liguori


On 11/15/2011 07:20 AM, Juan Quintela wrote:

Again, I think defaulting DAS to cache=none|directsync is what makes
the most sense here.


I think it is the only sane solution.  Otherwise, we need to write the
equivalent of a lock manager, to know _who_ has the storage, and
distributed lock managers are a mess :-(


We can even add a migration blocker for DAS with cache=on.  If we can
do dynamic toggling of the cache setting, then that's pretty friendly
at the end of the day.


That could fix the problem also.  At the moment that we start migration,
we do an fsync() + switch to O_DIRECT for all filesystems.

As you said, time for implementing fcntl(O_DIRECT).


Yeah, I think this ends up being a very elegant solution.

We always open block devices O_DIRECT to start with.  That ensures reads go 
directly to disk if it's DAS or result in NFS protocol reads.


As long as we fsync on the source (and we do), then we're okay.

For cache=write{back,through}, we would then just fcntl() away O_DIRECT as soon 
as we start the guest.  Then we can start doing reads through the page cache.
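
A minimal sketch of the fcntl() step, assuming Linux, where F_SETFL is allowed
to change O_DIRECT on an already-open descriptor.  The helper name
set_o_direct() is made up for illustration; it is not an existing QEMU
function:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <fcntl.h>

/*
 * Set or clear O_DIRECT on an already-open descriptor.
 * Returns 0 on success, -1 on failure (errno is set by fcntl).
 * Hypothetical helper, not part of QEMU.
 */
int set_o_direct(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1) {
        return -1;
    }
    if (enable) {
        flags |= O_DIRECT;
    } else {
        flags &= ~O_DIRECT;
    }
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

For cache=write{back,through} this would be called as set_o_direct(fd, 0) once
the guest starts, and as set_o_direct(fd, 1) after the fsync() if we switch
back at migration time.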


Regards,

Anthony Liguori


Later, Juan.

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Alex Bradbury

On 15 November 2011 13:51, Avi Kivity  wrote:
> Does markdown support rendering into man pages?

You can do this via pandoc:
http://johnmacfarlane.net/pandoc/

Alex

Re: [Qemu-devel] [PATCH V2 03/12] hw/9pfs: File system helper process for qemu 9p proxy FS

2011-11-15 Thread Stefan Hajnoczi

On Tue, Nov 15, 2011 at 11:57 AM, M. Mohan Kumar  wrote:
> diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
> new file mode 100644
> index 000..69daf7c
> --- /dev/null
> +++ b/fsdev/virtfs-proxy-helper.c
> @@ -0,0 +1,271 @@
> +/*
> + * Helper for QEMU Proxy FS Driver
> + * Copyright IBM, Corp. 2011
> + *
> + * Authors:
> + * M. Mohan Kumar 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "bswap.h"

Where is "bswap.h" used and why above ?

> +#include 
> +#include "qemu-common.h"
> +#include "virtio-9p-marshal.h"
> +#include "hw/9pfs/virtio-9p-proxy.h"
> +
> +#define PROGNAME "virtfs-proxy-helper"
> +
> +static struct option helper_opts[] = {
> +    {"fd", required_argument, NULL, 'f'},
> +    {"path", required_argument, NULL, 'p'},
> +    {"nodaemon", no_argument, NULL, 'n'},
> +};
> +
> +int is_daemon;

static?

Also, please use the bool type from <stdbool.h>; it makes things easier
for readers, who don't have to guess how the variable works (it might be a
bitfield or reference count too).

> +static int socket_read(int sockfd, void *buff, ssize_t size)
> +{
> +    int retval;
> +
> +    do {
> +        retval = read(sockfd, buff, size);
> +    } while (retval < 0 && errno == EINTR);
> +    if (retval != size) {
> +        return -EIO;
> +    }

Shouldn't this loop until size bytes have been read?
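
Something along these lines is what I have in mind.  It is only a sketch: the
name socket_read_full() is made up, and returning -errno on failure is my
assumption about what callers would want:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly `size` bytes unless EOF or an error intervenes.
 * Returns size on success, -EIO on premature EOF, -errno on error.
 * Hypothetical helper name; sketch only.
 */
ssize_t socket_read_full(int sockfd, void *buff, size_t size)
{
    char *p = buff;
    size_t done = 0;

    while (done < size) {
        ssize_t n = read(sockfd, p + done, size - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before any data: retry */
            }
            return -errno;      /* pass the real error back */
        }
        if (n == 0) {
            return -EIO;        /* peer closed before we got everything */
        }
        done += (size_t)n;      /* short read: keep going */
    }
    return (ssize_t)done;
}
```

A short read on a socket is normal, so treating it as -EIO would make the
helper fail randomly under load.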

> +    return retval;
> +}
> +
> +static int socket_write(int sockfd, void *buff, ssize_t size)
> +{
> +    int retval;
> +
> +    do {
> +        retval = write(sockfd, buff, size);
> +    } while (retval < 0 && errno == EINTR);
> +    if (retval != size) {
> +        return -EIO;

We could pass the actual -errno here if retval < 0.

> +    }
> +    return retval;
> +}
> +
> +static int read_request(int sockfd, struct iovec *iovec)
> +{
> +    int retval;
> +    ProxyHeader header;
> +
> +    /* read the header */
> +    retval = socket_read(sockfd, iovec->iov_base, sizeof(header));
> +    if (retval != sizeof(header)) {
> +        return -EIO;
> +    }
> +    /* unmarshal header */
> +    proxy_unmarshal(iovec, 1, 0, "dd", &header.type, &header.size);
> +    /* read the request */
> +    retval = socket_read(sockfd, iovec->iov_base + sizeof(header), 
> header.size);
> +    if (retval != header.size) {
> +        return -EIO;
> +    }
> +    return header.type;
> +}

Size checks are missing and we're trusting what the client sends!
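
A sketch of the kind of check I mean, to run after unmarshalling the header and
before reading the payload.  The helper name and the choice of -ENOBUFS are
mine, and I am assuming header.size can be treated as a signed 32-bit value:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a client-supplied payload size against the receive buffer.
 * hdr_len is the header size already stored at the start of the buffer;
 * buf_len is the total buffer capacity.
 * Returns 0 if the payload fits, -ENOBUFS otherwise.
 * Hypothetical helper; the names are not from the patch.
 */
int check_request_size(int32_t payload_size, size_t hdr_len, size_t buf_len)
{
    if (hdr_len > buf_len) {
        return -ENOBUFS;        /* header alone does not fit */
    }
    if (payload_size < 0) {
        return -ENOBUFS;        /* negative size from the client */
    }
    if ((size_t)payload_size > buf_len - hdr_len) {
        return -ENOBUFS;        /* payload would overrun the buffer */
    }
    return 0;
}
```

In read_request() this would be called with sizeof(header) and
iovec->iov_len before the second socket_read().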

> +
> +static void usage(char *prog)
> +{
> +    fprintf(stderr, "usage: %s\n"
> +            " -p|--path  9p path to export\n"
> +            " {-f|--fd } socket file descriptor to be 
> used\n"
> +            " [-n|--nodaemon] Run as a normal program\n",
> +            basename(prog));
> +}
> +
> +static int process_requests(int sock)
> +{
> +    int type;
> +    struct iovec iovec;
> +
> +    iovec.iov_base = g_malloc(BUFF_SZ);
> +    iovec.iov_len = BUFF_SZ;
> +    while (1) {
> +        type = read_request(sock, &iovec);
> +        if (type <= 0) {
> +            goto error;
> +        }
> +    }
> +    (void)socket_write;
> +error:
> +    g_free(iovec.iov_base);
> +    return -1;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    int sock;
> +    char rpath[PATH_MAX];
> +    struct stat stbuf;
> +    int c, option_index;
> +
> +    is_daemon = 1;
> +    rpath[0] = '\0';
> +    sock = -1;
> +    while (1) {
> +        option_index = 0;
> +        c = getopt_long(argc, argv, "p:nh?f:", helper_opts,
> +                        &option_index);
> +        if (c == -1) {
> +            break;
> +        }
> +        switch (c) {
> +        case 'p':
> +            strcpy(rpath, optarg);

Buffer overflow.  The whole thing would be simpler like this:

const char *rpath = "";
[...]
case 'p':
rpath = optarg;
break;

> +            break;
> +        case 'n':
> +            is_daemon = 0;
> +            break;
> +        case 'f':
> +            sock = atoi(optarg);
> +            break;
> +        case '?':
> +        case 'h':
> +        default:
> +            usage(argv[0]);
> +            return -1;

The convention is for programs to exit with 1 (EXIT_FAILURE) on error.

> +            break;
> +        }
> +    }
> +
> +    /* Parameter validation */
> +    if (sock == -1 || rpath[0] == '\0') {
> +        fprintf(stderr, "socket descriptor or path not specified\n");
> +        usage(argv[0]);
> +        return -1;
> +    }
> +
> +    if (lstat(rpath, &stbuf) < 0) {
> +        fprintf(stderr, "invalid path \"%s\" specified?\n", rpath);

strerror() would provide further details on what went wrong.

> +        return -1;
> +    }
> +
> +    if (!S_ISDIR(stbuf.st_mode)) {
> +        fprintf(stderr, "specified path \"%s\" is not directory

Re: [Qemu-devel] endless loop when use qemu-system-mipsel to load bios

2011-11-15 Thread Markus Armbruster

rui chen  writes:

> Hi all,
> When I try to use command line "qemu-system-mipsel -M malta -L .
> -nographic" to run redboot,  it will have an endless loop, then I find this
> bug, here is my patch:
>
>
> Author: Chen Rui 
> Date:   Sat Nov 12 01:38:23 2011 +0800
>
> resolve an endless loop when use qemu-system-mipsel to load bios
>
> Signed-off-by: Chen Rui 

Please use git-format-patch, not git-show.  And please put all of the
description in the commit message.

Finally, it helps to cc: the maintainer.  scripts/get_maintainer.pl can
help find him.  For your patch, it points to Aurelien (cc'ed).

> diff --git a/hw/mips_malta.c b/hw/mips_malta.c
> index bb49749..e7dfbd6 100644
> --- a/hw/mips_malta.c
> +++ b/hw/mips_malta.c
> @@ -911,6 +911,7 @@ void mips_malta_init (ram_addr_t ram_size,
>  uint32_t *end = addr + bios_size;
>  while (addr < end) {
>  bswap32s(addr);
> +addr++;
>  }
>  }
>  #endif
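
For reference, the fixed loop can be reproduced outside QEMU.  This is a
standalone sketch: swap32_in_place() is a stand-in for bswap32s(), and both
function names are made up for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-swap one 32-bit word in place (stand-in for QEMU's bswap32s). */
static void swap32_in_place(uint32_t *p)
{
    uint32_t v = *p;

    *p = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Swap every word in a buffer of `words` 32-bit entries. */
void swap32_buffer(uint32_t *addr, size_t words)
{
    uint32_t *end = addr + words;

    while (addr < end) {
        swap32_in_place(addr);
        addr++;     /* without this the loop never terminates */
    }
}
```

Without the increment the loop keeps swapping the first word back and forth,
which matches the hang described in the report.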

[Qemu-devel] [PATCH 0/4] prevent Qemu from waking up needlessly

2011-11-15 Thread Stefano Stabellini

Hi all,
this small patch series prevents Qemu from waking up needlessly on Xen
several times a second in order to check some timers.

The first two patches stop Qemu from emulating the RTC and the PIT on
Xen; both are already emulated in the hypervisor, and emulating them again
consumes precious cpu cycles because they need qemu-timers to work.

The third patch makes use of a new mechanism to receive buffered io
event notifications from Xen, so that Qemu doesn't need to check the
buffered io page for data 10 times a sec for the entire life of the VM.

Finally, the last patch increases the default select timeout to 1h:
nothing should rely on the select timeout being 1sec, so we might as
well increase it to 1h.


Stefano Stabellini (4):
  xen: introduce mc146818rtcxen
  xen: do not initialize the interval timer emulator
  xen: introduce an event channel for buffered io event notifications
  qemu_calculate_timeout: increase minimum timeout to 1h

 hw/mc146818rtc.c |   36 +++-
 hw/pc.c  |7 +--
 qemu-timer.c |2 +-
 xen-all.c|   38 --
 4 files changed, 73 insertions(+), 10 deletions(-)


A git tree based on v1.0-rc2 is available here:

git://xenbits.xen.org/people/sstabellini/qemu-dm.git timers-1.0-rc2

Cheers,

Stefano

[Qemu-devel] [PATCH 2/4] xen: do not initialize the interval timer emulator

2011-11-15 Thread stefano.stabellini

From: Stefano Stabellini 

PIT is emulated by the hypervisor so we don't need to emulate it in Qemu:
this patch prevents Qemu from waking up needlessly at PIT_FREQ on Xen.

Signed-off-by: Stefano Stabellini 
---
 hw/pc.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 33778fe..a0ae981 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -43,6 +43,7 @@
 #include "ui/qemu-spice.h"
 #include "memory.h"
 #include "exec-memory.h"
+#include "arch_init.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -1121,7 +1122,7 @@ void pc_basic_device_init(qemu_irq *gsi,
 DriveInfo *fd[MAX_FD];
 qemu_irq rtc_irq = NULL;
 qemu_irq *a20_line;
-ISADevice *i8042, *port92, *vmmouse, *pit;
+ISADevice *i8042, *port92, *vmmouse, *pit = NULL;
 qemu_irq *cpu_exit_irq;
 
 register_ioport_write(0x80, 1, 1, ioport80_write, NULL);
@@ -1142,7 +1143,9 @@ void pc_basic_device_init(qemu_irq *gsi,
 
 qemu_register_boot_set(pc_boot_set, *rtc_state);
 
-pit = pit_init(0x40, 0);
+if (!xen_available()) {
+pit = pit_init(0x40, 0);
+}
 pcspk_init(pit);
 
 for(i = 0; i < MAX_SERIAL_PORTS; i++) {
-- 
1.7.2.3

[Qemu-devel] [PATCH 4/4] qemu_calculate_timeout: increase minimum timeout to 1h

2011-11-15 Thread stefano.stabellini

From: Stefano Stabellini 

There is no reason why the minimum timeout should be 1sec, it could
easily be 1h and we would save lots of cpu cycles.

Signed-off-by: Stefano Stabellini 
---
 qemu-timer.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index cd026c6..3a9987e 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -846,6 +846,6 @@ fail:
 
 int qemu_calculate_timeout(void)
 {
-return 1000;
+return 1000*60*60;
 }
 
-- 
1.7.2.3

[Qemu-devel] [PATCH 1/4] xen: introduce mc146818rtcxen

2011-11-15 Thread stefano.stabellini

From: Stefano Stabellini 

Xen doesn't need full RTC emulation in Qemu because the RTC is already
emulated by the hypervisor. In particular we want to avoid the timers
initialization so that Qemu doesn't need to wake up needlessly.

Signed-off-by: Stefano Stabellini 
---
 hw/mc146818rtc.c |   36 +++-
 1 files changed, 35 insertions(+), 1 deletions(-)

diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index 2aaca2f..91242d0 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -28,6 +28,7 @@
 #include "apic.h"
 #include "isa.h"
 #include "mc146818rtc.h"
+#include "arch_init.h"
 
 //#define DEBUG_CMOS
 //#define DEBUG_COALESCED
@@ -614,6 +615,17 @@ static const MemoryRegionOps cmos_ops = {
 .old_portio = cmos_portio
 };
 
+static int rtcxen_initfn(ISADevice *dev)
+{
+int base = 0x70;
+RTCState *s = DO_UPCAST(RTCState, dev, dev);
+
+memory_region_init_io(&s->io, &cmos_ops, s, "rtc", 2);
+isa_register_ioport(dev, &s->io, base);
+
+return 0;
+}
+
 static int rtc_initfn(ISADevice *dev)
 {
 RTCState *s = DO_UPCAST(RTCState, dev, dev);
@@ -655,7 +667,11 @@ ISADevice *rtc_init(int base_year, qemu_irq intercept_irq)
 ISADevice *dev;
 RTCState *s;
 
-dev = isa_create("mc146818rtc");
+if (xen_available()) {
+dev = isa_create("mc146818rtcxen");
+} else {
+dev = isa_create("mc146818rtc");
+}
 s = DO_UPCAST(RTCState, dev, dev);
 qdev_prop_set_int32(&dev->qdev, "base_year", base_year);
 qdev_init_nofail(&dev->qdev);
@@ -684,3 +700,21 @@ static void mc146818rtc_register(void)
 isa_qdev_register(&mc146818rtc_info);
 }
 device_init(mc146818rtc_register)
+
+static ISADeviceInfo mc146818rtcxen_info = {
+.qdev.name = "mc146818rtcxen",
+.qdev.size = sizeof(RTCState),
+.qdev.no_user  = 1,
+.qdev.vmsd = &vmstate_rtc,
+.init  = rtcxen_initfn,
+.qdev.props= (Property[]) {
+DEFINE_PROP_INT32("base_year", RTCState, base_year, 1980),
+DEFINE_PROP_END_OF_LIST(),
+}
+};
+
+static void mc146818rtcxen_register(void)
+{
+isa_qdev_register(&mc146818rtcxen_info);
+}
+device_init(mc146818rtcxen_register)
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Anthony Liguori


On 11/15/2011 07:51 AM, Avi Kivity wrote:

On 11/15/2011 03:44 PM, Anthony Liguori wrote:

Nice.  Suggest you enable rename detection, to make patches like these
easier to read (not that it truly matters in the particular case).


I haven't figured out yet how to make this sane to merge, but I've
also converted qemu-doc.texi to a bunch of separate markdown files[1].

The info is fairly out of date.  I'll try to get patches out RSN so
that we can all take a pass at trying to modernize some of the
sections before the release.

[1] https://github.com/aliguori/qemu/tree/markdown/docs/manual



Does markdown support rendering into man pages?

A similar alternative is asciidoc, which is used by git.


I was thinking of doing a2x for the man pages (which is more or less what git 
does).

The man pages are generated by qemu-doc.texi so I think I'm going to have to 
strip out the extracted info, but leave enough in qemu-doc.texi so that we can 
keep generating the man pages.  Once we clean up the user docs a bit, we can 
convert the man pages too.


Regards,

Anthony Liguori

[Qemu-devel] [PATCH 3/4] xen: introduce an event channel for buffered io event notifications

2011-11-15 Thread stefano.stabellini

From: Stefano Stabellini 

Use the newly introduced HVM_PARAM_BUFIOREQ_EVTCHN to receive
notifications for buffered io events.
After the first notification is received leave the event channel masked
and set up a timer to process the rest of the batch.
Once we have completed processing the batch, unmask the event channel
and delete the timer.

Signed-off-by: Stefano Stabellini 
---
 xen-all.c |   38 --
 1 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index b5e28ab..b28d7e7 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -70,6 +70,8 @@ typedef struct XenIOState {
 QEMUTimer *buffered_io_timer;
 /* the evtchn port for polling the notification, */
 evtchn_port_t *ioreq_local_port;
+/* evtchn local port for buffered io */
+evtchn_port_t bufioreq_local_port;
 /* the evtchn fd for polling */
 XenEvtchn xce_handle;
 /* which vcpu we are serving */
@@ -516,6 +518,12 @@ static ioreq_t *cpu_get_ioreq(XenIOState *state)
 evtchn_port_t port;
 
 port = xc_evtchn_pending(state->xce_handle);
+if (port == state->bufioreq_local_port) {
+qemu_mod_timer(state->buffered_io_timer,
+BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+return NULL;
+}
+
 if (port != -1) {
 for (i = 0; i < smp_cpus; i++) {
 if (state->ioreq_local_port[i] == port) {
@@ -664,16 +672,18 @@ static void handle_ioreq(ioreq_t *req)
 }
 }
 
-static void handle_buffered_iopage(XenIOState *state)
+static int handle_buffered_iopage(XenIOState *state)
 {
 buf_ioreq_t *buf_req = NULL;
 ioreq_t req;
 int qw;
 
 if (!state->buffered_io_page) {
-return;
+return 0;
 }
 
+memset(&req, 0x00, sizeof(req));
+
 while (state->buffered_io_page->read_pointer != 
state->buffered_io_page->write_pointer) {
 buf_req = &state->buffered_io_page->buf_ioreq[
 state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
@@ -698,15 +708,21 @@ static void handle_buffered_iopage(XenIOState *state)
 xen_mb();
 state->buffered_io_page->read_pointer += qw ? 2 : 1;
 }
+
+return req.count;
 }
 
 static void handle_buffered_io(void *opaque)
 {
 XenIOState *state = opaque;
 
-handle_buffered_iopage(state);
-qemu_mod_timer(state->buffered_io_timer,
-   BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+if (handle_buffered_iopage(state)) {
+qemu_mod_timer(state->buffered_io_timer,
+BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+} else {
+qemu_del_timer(state->buffered_io_timer);
+xc_evtchn_unmask(state->xce_handle, state->bufioreq_local_port);
+}
 }
 
 static void cpu_handle_ioreq(void *opaque)
@@ -836,7 +852,6 @@ static void xen_main_loop_prepare(XenIOState *state)
 
 state->buffered_io_timer = qemu_new_timer_ms(rt_clock, handle_buffered_io,
  state);
-qemu_mod_timer(state->buffered_io_timer, qemu_get_clock_ms(rt_clock));
 
 if (evtchn_fd != -1) {
 qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, state);
@@ -888,6 +903,7 @@ int xen_hvm_init(void)
 {
 int i, rc;
 unsigned long ioreq_pfn;
+unsigned long bufioreq_evtchn;
 XenIOState *state;
 
 state = g_malloc0(sizeof (XenIOState));
@@ -937,6 +953,16 @@ int xen_hvm_init(void)
 state->ioreq_local_port[i] = rc;
 }
 
+xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_EVTCHN,
+&bufioreq_evtchn);
+rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
+(uint32_t)bufioreq_evtchn);
+if (rc == -1) {
+fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
+return -1;
+}
+state->bufioreq_local_port = rc;
+
 /* Init RAM management */
 xen_map_cache_init();
 xen_ram_init(ram_size);
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 1/4] xen: introduce mc146818rtcxen

2011-11-15 Thread Anthony Liguori


On 11/15/2011 08:51 AM, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini

Xen doesn't need full RTC emulation in Qemu because the RTC is already
emulated by the hypervisor. In particular we want to avoid the timers
initialization so that Qemu doesn't need to wake up needlessly.

Signed-off-by: Stefano Stabellini


Yuck.  There's got to be a better way to do this.

I think it would be better to name timers and then in Xen specific machine code, 
disable the RTC timers.


Regards,

Anthony Liguori


---
  hw/mc146818rtc.c |   36 +++-
  1 files changed, 35 insertions(+), 1 deletions(-)

diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index 2aaca2f..91242d0 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -28,6 +28,7 @@
  #include "apic.h"
  #include "isa.h"
  #include "mc146818rtc.h"
+#include "arch_init.h"

  //#define DEBUG_CMOS
  //#define DEBUG_COALESCED
@@ -614,6 +615,17 @@ static const MemoryRegionOps cmos_ops = {
  .old_portio = cmos_portio
  };

+static int rtcxen_initfn(ISADevice *dev)
+{
+int base = 0x70;
+RTCState *s = DO_UPCAST(RTCState, dev, dev);
+
+memory_region_init_io(&s->io,&cmos_ops, s, "rtc", 2);
+isa_register_ioport(dev,&s->io, base);
+
+return 0;
+}
+
  static int rtc_initfn(ISADevice *dev)
  {
  RTCState *s = DO_UPCAST(RTCState, dev, dev);
@@ -655,7 +667,11 @@ ISADevice *rtc_init(int base_year, qemu_irq intercept_irq)
  ISADevice *dev;
  RTCState *s;

-dev = isa_create("mc146818rtc");
+if (xen_available()) {
+dev = isa_create("mc146818rtcxen");
+} else {
+dev = isa_create("mc146818rtc");
+}
  s = DO_UPCAST(RTCState, dev, dev);
  qdev_prop_set_int32(&dev->qdev, "base_year", base_year);
  qdev_init_nofail(&dev->qdev);
@@ -684,3 +700,21 @@ static void mc146818rtc_register(void)
  isa_qdev_register(&mc146818rtc_info);
  }
  device_init(mc146818rtc_register)
+
+static ISADeviceInfo mc146818rtcxen_info = {
+.qdev.name = "mc146818rtcxen",
+.qdev.size = sizeof(RTCState),
+.qdev.no_user  = 1,
+.qdev.vmsd =&vmstate_rtc,
+.init  = rtcxen_initfn,
+.qdev.props= (Property[]) {
+DEFINE_PROP_INT32("base_year", RTCState, base_year, 1980),
+DEFINE_PROP_END_OF_LIST(),
+}
+};
+
+static void mc146818rtcxen_register(void)
+{
+isa_qdev_register(&mc146818rtcxen_info);
+}
+device_init(mc146818rtcxen_register)

Re: [Qemu-devel] [PATCH v7 1.0] configure: build position independent executables on x86 hosts

2011-11-15 Thread Anthony Liguori


On 11/15/2011 05:25 AM, Peter Maydell wrote:

On 15 November 2011 09:34, Avi Kivity  wrote:

Change the default on x86 hosts to building PIE (position independent
executables); instead of restricting the option to user-only targets,
apply it to all targets.

In addition, set the relocation sections to read-only (relro) when available;
this reduces the attack surface by disallowing changes to relocation tables
at runtime.

While PIE reduces performance and relro increases load time, it greatly
improves security, with the potential to reduce a code execution vulnerability
to a self denial of service.

Non-x86 are not changed, as they require TCG changes.

Signed-off-by: Avi Kivity


Reviewed-by: Peter Maydell

...as far as the technical content of the patch is concerned.
I'm still rather dubious about the merits of putting this patch
in this late in the release cycle.


How about we limit this to be enabled by default on x86 Linux hosts?

That would make me a lot more comfortable for 1.0 since I expect we can test 
that exhaustively.


Regards,

Anthony Liguori



-- PMM

Re: [Qemu-devel] [PATCH] hw/omap_gpio: Fix infinite recursion when doing 8/16 bit reads

2011-11-15 Thread Anthony Liguori


On 11/07/2011 07:25 AM, Peter Maydell wrote:

Fix a long-standing bug which meant that any attempt to do an
8 or 16 bit read from the OMAP GPIO module would cause qemu to
crash due to an infinite recursion.

Signed-off-by: Peter Maydell


Applied.  Thanks.

Regards,

Anthony Liguori


---
This has actually been in the code since the original OMAP2 support
was added in 2008; we've never noticed before because the kernel happened
to always do 32 bit accesses...
Long term we should fix this by conversion to MemoryRegion; this is
the minimally invasive fix for 1.0.

  hw/omap_gpio.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/omap_gpio.c b/hw/omap_gpio.c
index d775df6..d630748 100644
--- a/hw/omap_gpio.c
+++ b/hw/omap_gpio.c
@@ -510,7 +510,7 @@ static void omap2_gpio_module_write(void *opaque, 
target_phys_addr_t addr,

  static uint32_t omap2_gpio_module_readp(void *opaque, target_phys_addr_t addr)
  {
-return omap2_gpio_module_readp(opaque, addr)>>  ((addr&  3)<<  3);
+return omap2_gpio_module_read(opaque, addr&  ~3)>>  ((addr&  3)<<  3);
  }

  static void omap2_gpio_module_writep(void *opaque, target_phys_addr_t addr,
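
To illustrate the fix in isolation, here is a small sketch of a sub-word read
built on top of a 32-bit read.  The register array and the function names are
invented for the example and are not the omap_gpio code:

```c
#include <stdint.h>

/* Hypothetical 32-bit register file standing in for the GPIO module. */
static uint32_t regs[4] = { 0x11223344u, 0, 0, 0 };

/* Full-width read; the address is expected to be 4-byte aligned. */
static uint32_t module_read(uint32_t addr)
{
    return regs[(addr >> 2) & 3];
}

/*
 * Sub-word read: fetch the aligned word, then shift the addressed byte
 * down to bit 0.  Calling module_readp() here instead of module_read()
 * would recurse forever, which is the bug the patch fixes.
 */
uint32_t module_readp(uint32_t addr)
{
    return module_read(addr & ~3u) >> ((addr & 3) << 3);
}
```

The caller is expected to truncate the result to the access width, so the
upper bits left over from the shift do not matter.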

Re: [Qemu-devel] [RFC 1.0] pc_piix: set qxl revision to 2 for pc-0.14

2011-11-15 Thread Anthony Liguori


On 11/13/2011 07:27 AM, Alon Levy wrote:

The default is still 3, and I didn't change older machine types.

Signed-off-by: Alon Levy


Applied.  Thanks.

Regards,

Anthony Liguori


---
Is there a better way than copy-pasting this to the older pc types to get
the revision == 2 for them as well?

  hw/pc_piix.c |   12 
  1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 27ea570..970f43c 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -311,6 +311,18 @@ static QEMUMachine pc_machine_v0_14 = {
  .desc = "Standard PC",
  .init = pc_init_pci,
  .max_cpus = 255,
+.compat_props = (GlobalProperty[]) {
+{
+.driver   = "qxl",
+.property = "revision",
+.value= stringify(2),
+},{
+.driver   = "qxl-vga",
+.property = "revision",
+.value= stringify(2),
+},
+{ /* end of list */ }
+},
  };

  static QEMUMachine pc_machine_v0_13 = {

Re: [Qemu-devel] [PATCH 00/14] Convert Sun devices to memory API.

2011-11-15 Thread Benoît Canet

When converting lines like :

-cpu_register_physical_memory_offset(0x1f80, 0x1000,
-sh7750_io_memory, 0x1f80);
-cpu_register_physical_memory_offset(0xff80, 0x1000,
-sh7750_io_memory, 0x1f80);

I'm tempted to do :

+memory_region_init_alias(&s->iomem_1f8, "memory-1f8",
+ &s->iomem, 0x1f80, 0x1000);
+memory_region_add_subregion(sysmem, 0x1f80, &s->iomem_1f8);
+
+memory_region_init_alias(&s->iomem_ff8, "memory-ff8",
+ &s->iomem, 0xff80, 0x1000);
+memory_region_add_subregion(sysmem, 0xff80, &s->iomem_ff8);

but I'm afraid to lose some information contained in the offset different
from the base address (0xff80 != 0x1f80).

What is the current recommendation regarding such conversions ?

Benoît


2011/11/15 Avi Kivity 

> On 11/15/2011 01:13 PM, Benoît Canet wrote:
> > .valid was used where the access size is specified like in
> >
> http://www.ibiblio.org/pub/historic-linux/early-ports/Sparc/NCR/NCR89C105.txt
> > .impl was used when the behavior is not known.
>
> Thanks, all applied except:
>
> >   sun4c_intctl: convert to memory API
> >   sun4m_iommu: convert to memory API
>
> Where we had raced - I just wrote those two conversions as well.  As it
> happens, these were the only two patches that used .impl, which is a
> behaviour change; please avoid behaviour changes and do them as separate
> patches.
>
> Note I don't think .impl works well when .min_access_size = 4 - it
> requires RMW which we don't do yet.
>
> >   esp: Fix memory API conversion
> >
>
> Thanks for that too.  Will fold it into the bad patch.
>
> --
> error compiling committee.c: too many arguments to function
>
>

[Qemu-devel] [PATCH 1.0] scsi-disk: guess geometry

2011-11-15 Thread Paolo Bonzini

Old operating systems rely on correct geometry to convert from CHS
addresses to LBA.  Providing correct data is necessary for them to boot.

Signed-off-by: Paolo Bonzini 
---
This fixes booting the FreeDOS image on bochs.sf.net with
virtio-scsi.  I haven't tested with LSI option ROMs, but
it should qualify for 1.0.

 hw/scsi-disk.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 16a4714..cd77780 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -828,7 +828,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 break;
 }
 /* if a geometry hint is available, use it */
-bdrv_get_geometry_hint(bdrv, &cylinders, &heads, &secs);
+bdrv_guess_geometry(bdrv, &cylinders, &heads, &secs);
 p[2] = (cylinders >> 16) & 0xff;
 p[3] = (cylinders >> 8) & 0xff;
 p[4] = cylinders & 0xff;
@@ -862,7 +862,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, 
uint8_t **p_outbuf,
 p[2] = 5000 >> 8;
 p[3] = 5000 & 0xff;
 /* if a geometry hint is available, use it */
-bdrv_get_geometry_hint(bdrv, &cylinders, &heads, &secs);
+bdrv_guess_geometry(bdrv, &cylinders, &heads, &secs);
 p[4] = heads & 0xff;
 p[5] = secs & 0xff;
 p[6] = s->qdev.blocksize >> 8;
-- 
1.7.7.1
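
As background for the commit message above, this is the standard CHS to LBA
conversion an old guest performs with the geometry it is given.  The sketch is
generic; chs_to_lba() is a made-up name and nothing here is taken from
scsi-disk.c:

```c
#include <stdint.h>

/*
 * Convert a cylinder/head/sector address to a logical block address.
 * `heads` and `secs` describe the geometry; sector numbers are 1-based.
 */
uint64_t chs_to_lba(uint32_t c, uint32_t h, uint32_t s,
                    uint32_t heads, uint32_t secs)
{
    return ((uint64_t)c * heads + h) * secs + (s - 1);
}
```

If the heads and sectors-per-track values reported in the mode pages differ
from the ones the disk was partitioned with, the same CHS address maps to a
different block and the boot loader reads the wrong data.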

Re: [Qemu-devel] KVM call agenda for November 15th

2011-11-15 Thread Anthony Liguori


On 11/14/2011 11:44 AM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.

Proposal:
- Migration debacle.


Just to capture my action items so I don't forget:

1) write up a markdown document for qemu.git that describes the past, current, 
and future state of storage requirements for migration


2) write up the notes from the discussion about doing an IDL based vmstate 
description.


Regards,

Anthony Liguori



Thanks, Juan.

[Qemu-devel] [PATCH 1.0 v2] scsi: fix fw path

2011-11-15 Thread Paolo Bonzini

The pre-1.0 firmware path for SCSI devices already included the LUN
using the suffix argument to add_boot_device_path.  I missed that when
making channel and LUN customizable.  Avoid including it twice, and
convert the colons to commas for consistency with other kinds of devices.

Signed-off-by: Paolo Bonzini 
---
v1->v2: include scsi-disk hunk too

 hw/scsi-bus.c  |2 +-
 hw/scsi-disk.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 372fe7f..b4e6e29 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -1304,7 +1304,7 @@ static char *scsibus_get_fw_dev_path(DeviceState *dev)
 SCSIDevice *d = DO_UPCAST(SCSIDevice, qdev, dev);
 char path[100];
 
-snprintf(path, sizeof(path), "%s@%d:%d:%d", qdev_fw_name(dev),
+snprintf(path, sizeof(path), "%s@%d,%d,%d", qdev_fw_name(dev),
  d->channel, d->id, d->lun);
 
 return strdup(path);
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 9da6d36..16a4714 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1581,7 +1581,7 @@ static int scsi_initfn(SCSIDevice *dev)
 bdrv_set_buffer_alignment(s->qdev.conf.bs, s->qdev.blocksize);
 
 bdrv_iostatus_enable(s->qdev.conf.bs);
-add_boot_device_path(s->qdev.conf.bootindex, &dev->qdev, ",0");
+add_boot_device_path(s->qdev.conf.bootindex, &dev->qdev, NULL);
 return 0;
 }
 
-- 
1.7.7.1
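
A standalone sketch of the resulting path format, for illustration only.
format_fw_path() is a made-up helper and "disk" below is just an example name;
the real name comes from qdev_fw_name():

```c
#include <stdio.h>

/*
 * Format a firmware device path component as name@channel,id,lun.
 * Returns the snprintf() result: the length that would have been
 * written, so a value >= len signals truncation.
 */
int format_fw_path(char *buf, size_t len, const char *name,
                   int channel, int id, int lun)
{
    return snprintf(buf, len, "%s@%d,%d,%d", name, channel, id, lun);
}
```

With channel 0, id 1 and LUN 0 this yields "disk@0,1,0".  Because the LUN is
already part of that string, scsi-disk no longer passes a ",0" suffix to
add_boot_device_path().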

Re: [Qemu-devel] [PATCH 1/4] xen: introduce mc146818rtcxen

2011-11-15 Thread Stefano Stabellini

On Tue, 15 Nov 2011, Anthony Liguori wrote:
> On 11/15/2011 08:51 AM, stefano.stabell...@eu.citrix.com wrote:
> > From: Stefano Stabellini
> >
> > Xen doesn't need full RTC emulation in Qemu because the RTC is already
> > emulated by the hypervisor. In particular we want to avoid the timers
> > initialization so that Qemu doesn't need to wake up needlessly.
> >
> > Signed-off-by: Stefano Stabellini
> 
> Yuck.  There's got to be a better way to do this.

Yeah, it is pretty ugly, I was hoping for some good suggestions to
improve this patch :)


> I think it would be better to name timers and then in Xen specific machine 
> code, 
> disable the RTC timers.

Good idea!
I was thinking that I could implement an rtc_stop function in
mc146818rtc.c that stops and frees the timers.

Now the problem is that from xen-all.c I cannot easily find the
ISADevice instance to pass to rtc_stop. Do you think it would be
reasonable to call rtc_stop from pc_basic_device_init, inside the same
if (!xen_available()) introduced by the next patch?

Otherwise I could implement functions to walk the isa bus, similarly to
pci_for_each_device.


This is just an example:

diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index 2aaca2f..568c540 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -667,6 +667,28 @@ ISADevice *rtc_init(int base_year, qemu_irq intercept_irq)
 return dev;
 }
 
+void rtc_stop(ISADevice *dev)
+{
+RTCState *s = DO_UPCAST(RTCState, dev, dev);
+
+qemu_del_timer(s->periodic_timer);
+qemu_del_timer(s->second_timer);
+qemu_del_timer(s->second_timer2);
+#ifdef TARGET_I386
+if (rtc_td_hack) {
+qemu_del_timer(s->coalesced_timer);
+}
+#endif
+qemu_free_timer(s->periodic_timer);
+qemu_free_timer(s->second_timer);
+qemu_free_timer(s->second_timer2);
+#ifdef TARGET_I386
+if (rtc_td_hack) {
+qemu_free_timer(s->coalesced_timer);
+}
+#endif
+}
+
 static ISADeviceInfo mc146818rtc_info = {
 .qdev.name = "mc146818rtc",
 .qdev.size = sizeof(RTCState),
diff --git a/hw/mc146818rtc.h b/hw/mc146818rtc.h
index 575968c..aa2b8ab 100644
--- a/hw/mc146818rtc.h
+++ b/hw/mc146818rtc.h
@@ -8,5 +8,6 @@
 ISADevice *rtc_init(int base_year, qemu_irq intercept_irq);
 void rtc_set_memory(ISADevice *dev, int addr, int val);
 void rtc_set_date(ISADevice *dev, const struct tm *tm);
+void rtc_stop(ISADevice *dev);
 
 #endif /* !MC146818RTC_H */
diff --git a/hw/pc.c b/hw/pc.c
index a0ae981..d734f75 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1145,6 +1145,8 @@ void pc_basic_device_init(qemu_irq *gsi,
 
 if (!xen_available()) {
 pit = pit_init(0x40, 0);
+} else {
+rtc_stop(*rtc_state);
 }
 pcspk_init(pit);

[Qemu-devel] [RFC PATCH 0/4] virtio-scsi device model

2011-11-15 Thread Paolo Bonzini

Here is the first sneak peek of virtio-scsi.  It's on top of my scsi-sg
branch at http://github.com/bonzini/qemu.  I'm more interested
in getting early reviews in the virtio side, so I'm omitting the
scsi-specific patches that introduce support for scatter/gather I/O
in the SCSI layer.

What's missing is 1) support for writable config space, 2) testing TMF,
3) events; 4) migration.  The last two do not need to be in at the
first commit, they can come later.

Tested lightly with a seabios driver (see the seabios ML).  The Linux
driver will come next...

Paolo Bonzini (2):
  virtio-scsi: add basic SCSI bus operation
  virtio-scsi: process control queue requests

Stefan Hajnoczi (2):
  virtio-scsi: Add virtio-scsi stub device
  virtio-scsi: Add basic request processing infrastructure

 Makefile.target |1 +
 default-configs/pci.mak |1 +
 hw/pci.h|1 +
 hw/virtio-pci.c |   42 
 hw/virtio-pci.h |2 +
 hw/virtio-scsi.c|  502 +++
 hw/virtio-scsi.h|   28 +++
 hw/virtio.h |3 +
 8 files changed, 580 insertions(+), 0 deletions(-)
 create mode 100644 hw/virtio-scsi.c
 create mode 100644 hw/virtio-scsi.h

-- 
1.7.7.1

Re: [Qemu-devel] [Xen-devel] [PATCH 3/4] xen: introduce an event channel for buffered io event notifications

2011-11-15 Thread Ian Campbell

On Tue, 2011-11-15 at 14:51 +, stefano.stabell...@eu.citrix.com
wrote:
> From: Stefano Stabellini 
> 
> Use the newly introduced HVM_PARAM_BUFIOREQ_EVTCHN to receive
> notifications for buffered io events.
> After the first notification is received leave the event channel masked
> and setup a timer to process the rest of the batch.
> Once we have completed processing the batch, unmask the event channel
> and delete the timer.
> 
> Signed-off-by: Stefano Stabellini 
> ---
>  xen-all.c |   38 --
>  1 files changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/xen-all.c b/xen-all.c
> index b5e28ab..b28d7e7 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -70,6 +70,8 @@ typedef struct XenIOState {
>  QEMUTimer *buffered_io_timer;
>  /* the evtchn port for polling the notification, */
>  evtchn_port_t *ioreq_local_port;
> +/* evtchn local port for buffered io */
> +evtchn_port_t bufioreq_local_port;
>  /* the evtchn fd for polling */
>  XenEvtchn xce_handle;
>  /* which vcpu we are serving */
> @@ -516,6 +518,12 @@ static ioreq_t *cpu_get_ioreq(XenIOState *state)
>  evtchn_port_t port;
>  
>  port = xc_evtchn_pending(state->xce_handle);
> +if (port == state->bufioreq_local_port) {
> +qemu_mod_timer(state->buffered_io_timer,
> +BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
> +return NULL;
> +}
> +
>  if (port != -1) {
>  for (i = 0; i < smp_cpus; i++) {
>  if (state->ioreq_local_port[i] == port) {
> @@ -664,16 +672,18 @@ static void handle_ioreq(ioreq_t *req)
>  }
>  }
>  
> -static void handle_buffered_iopage(XenIOState *state)
> +static int handle_buffered_iopage(XenIOState *state)
>  {
>  buf_ioreq_t *buf_req = NULL;
>  ioreq_t req;
>  int qw;
>  
>  if (!state->buffered_io_page) {
> -return;
> +return 0;
>  }
>  
> +memset(&req, 0x00, sizeof(req));
> +
>  while (state->buffered_io_page->read_pointer != state->buffered_io_page->write_pointer) {
>  buf_req = &state->buffered_io_page->buf_ioreq[
>  state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
> @@ -698,15 +708,21 @@ static void handle_buffered_iopage(XenIOState *state)
>  xen_mb();
>  state->buffered_io_page->read_pointer += qw ? 2 : 1;
>  }
> +
> +return req.count;
>  }
>  
>  static void handle_buffered_io(void *opaque)
>  {
>  XenIOState *state = opaque;
>  
> -handle_buffered_iopage(state);
> -qemu_mod_timer(state->buffered_io_timer,
> -   BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
> +if (handle_buffered_iopage(state)) {
> +qemu_mod_timer(state->buffered_io_timer,
> +BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
> +} else {
> +qemu_del_timer(state->buffered_io_timer);
> +xc_evtchn_unmask(state->xce_handle, state->bufioreq_local_port);
> +}
>  }
>  
>  static void cpu_handle_ioreq(void *opaque)
> @@ -836,7 +852,6 @@ static void xen_main_loop_prepare(XenIOState *state)
>  
>  state->buffered_io_timer = qemu_new_timer_ms(rt_clock, 
> handle_buffered_io,
>   state);
> -qemu_mod_timer(state->buffered_io_timer, qemu_get_clock_ms(rt_clock));
>  
>  if (evtchn_fd != -1) {
>  qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, state);
> @@ -888,6 +903,7 @@ int xen_hvm_init(void)
>  {
>  int i, rc;
>  unsigned long ioreq_pfn;
> +unsigned long bufioreq_evtchn;
>  XenIOState *state;
>  
>  state = g_malloc0(sizeof (XenIOState));
> @@ -937,6 +953,16 @@ int xen_hvm_init(void)
>  state->ioreq_local_port[i] = rc;
>  }
>  
> +xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_EVTCHN,
> +&bufioreq_evtchn);
> +rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
> +(uint32_t)bufioreq_evtchn);
> +if (rc == -1) {
> +fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
> +return -1;
> +}
> +state->bufioreq_local_port = rc;

Does this fall back gracefully on hypervisors which don't have this new
hvm param? It doesn't look like it but perhaps I'm missing something.

> +
>  /* Init RAM management */
>  xen_map_cache_init();
>  xen_ram_init(ram_size);
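
The batching scheme described in the patch (first notification leaves the
channel masked and arms a timer; the timer keeps polling while requests keep
arriving; once a poll finds nothing, the channel is unmasked and the timer is
dropped) can be sketched as a small stand-alone simulation. This is not the
xen-all.c code: ring_t, io_state_t, drain_ring, on_notification and on_timer
are hypothetical names, the event channel and timer are modelled as plain
booleans, and drain_ring returns the number of entries it consumed rather
than req.count as in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 8u /* hypothetical ring size, not the real IOREQ_BUFFER_SLOT_NUM */

/* Hypothetical stand-in for the shared buffered-io page. */
typedef struct {
    uint32_t read_pointer;
    uint32_t write_pointer;
    int slots[SLOTS];
} ring_t;

/* Hypothetical stand-in for the device model state. */
typedef struct {
    ring_t ring;
    bool evtchn_masked; /* true while notifications are suppressed */
    bool timer_armed;   /* true while the batch timer is pending */
    int handled;        /* total requests processed so far */
} io_state_t;

/* Consume everything currently in the ring; return how many were handled. */
int drain_ring(io_state_t *s)
{
    int n = 0;
    while (s->ring.read_pointer != s->ring.write_pointer) {
        (void)s->ring.slots[s->ring.read_pointer % SLOTS]; /* "process" it */
        s->ring.read_pointer++;
        n++;
    }
    s->handled += n;
    return n;
}

/* First notification of a batch: stay masked and let the timer do the work. */
void on_notification(io_state_t *s)
{
    s->evtchn_masked = true;
    s->timer_armed = true;
}

/* Timer callback: re-arm while work keeps arriving, otherwise unmask. */
void on_timer(io_state_t *s)
{
    if (drain_ring(s) > 0) {
        s->timer_armed = true;    /* more may follow, poll again later */
    } else {
        s->timer_armed = false;   /* batch finished, delete the timer */
        s->evtchn_masked = false; /* accept notifications again */
    }
}
```

With three queued requests, one on_notification() followed by two on_timer()
calls processes the batch on the first tick and unmasks on the second.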

Re: [Qemu-devel] [Xen-devel] [PATCH 3/4] xen: introduce an event channel for buffered io event notifications

2011-11-15 Thread Stefano Stabellini

On Tue, 15 Nov 2011, Ian Campbell wrote:
> > +xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_EVTCHN,
> > +&bufioreq_evtchn);
> > +rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
> > +(uint32_t)bufioreq_evtchn);
> > +if (rc == -1) {
> > +fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
> > +return -1;
> > +}
> > +state->bufioreq_local_port = rc;
> 
> Does this fall back gracefully on hypervisors which don't have this new
> hvm param? It doesn't look like it but perhaps I'm missing something.

No, it does not.
However, upstream Qemu doesn't work very well with Xen 4.1 anyway; the
first Xen release that is going to support it will be Xen 4.2, which
should have this feature.

[Qemu-devel] [PATCH 1/4] virtio-scsi: Add virtio-scsi stub device

2011-11-15 Thread Paolo Bonzini

From: Stefan Hajnoczi 

Add a useless virtio SCSI HBA device:

  qemu -device virtio-scsi-pci

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 Makefile.target |1 +
 default-configs/pci.mak |1 +
 hw/pci.h|1 +
 hw/virtio-pci.c |   42 ++
 hw/virtio-pci.h |2 +
 hw/virtio-scsi.c|  194 +++
 hw/virtio-scsi.h|   28 +++
 hw/virtio.h |3 +
 8 files changed, 272 insertions(+), 0 deletions(-)
 create mode 100644 hw/virtio-scsi.c
 create mode 100644 hw/virtio-scsi.h

diff --git a/Makefile.target b/Makefile.target
index a111521..f3bc562 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -199,6 +199,7 @@ obj-y = arch_init.o cpus.o monitor.o machine.o gdbstub.o balloon.o ioport.o
 # need to fix this properly
 obj-$(CONFIG_NO_PCI) += pci-stub.o
 obj-$(CONFIG_VIRTIO) += virtio.o virtio-blk.o virtio-balloon.o virtio-net.o virtio-serial-bus.o
+obj-$(CONFIG_VIRTIO_SCSI) += virtio-scsi.o
 obj-y += vhost_net.o
 obj-$(CONFIG_VHOST_NET) += vhost.o
 obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/virtio-9p-device.o
diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index 22bd350..9c8edd4 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -1,5 +1,6 @@
 CONFIG_PCI=y
 CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_SCSI=y
 CONFIG_VIRTIO=y
 CONFIG_USB_UHCI=y
 CONFIG_USB_OHCI=y
diff --git a/hw/pci.h b/hw/pci.h
index 4b2e785..e21ffd2 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -75,6 +75,7 @@
 #define PCI_DEVICE_ID_VIRTIO_BLOCK   0x1001
 #define PCI_DEVICE_ID_VIRTIO_BALLOON 0x1002
 #define PCI_DEVICE_ID_VIRTIO_CONSOLE 0x1003
+#define PCI_DEVICE_ID_VIRTIO_SCSI0x1004
 
 #define FMT_PCIBUS  PRIx64
 
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index ca5923c..72be056 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -19,6 +19,7 @@
 #include "virtio-blk.h"
 #include "virtio-net.h"
 #include "virtio-serial.h"
+#include "virtio-scsi.h"
 #include "pci.h"
 #include "qemu-error.h"
 #include "msix.h"
@@ -784,6 +785,32 @@ static int virtio_balloon_exit_pci(PCIDevice *pci_dev)
 return virtio_exit_pci(pci_dev);
 }
 
+static int virtio_scsi_init_pci(PCIDevice *pci_dev)
+{
+VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
+VirtIODevice *vdev;
+
+vdev = virtio_scsi_init(&pci_dev->qdev, &proxy->scsi);
+if (!vdev) {
+return -EINVAL;
+}
+
+vdev->nvectors = proxy->nvectors;
+virtio_init_pci(proxy, vdev);
+
+/* make the actual value visible */
+proxy->nvectors = vdev->nvectors;
+return 0;
+}
+
+static int virtio_scsi_exit_pci(PCIDevice *pci_dev)
+{
+VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
+
+virtio_scsi_exit(proxy->vdev);
+return virtio_exit_pci(pci_dev);
+}
+
 static PCIDeviceInfo virtio_info[] = {
 {
 .qdev.name = "virtio-blk-pci",
@@ -869,6 +896,21 @@ static PCIDeviceInfo virtio_info[] = {
 },
 .qdev.reset = virtio_pci_reset,
 },{
+.qdev.name = "virtio-scsi-pci",
+.qdev.size = sizeof(VirtIOPCIProxy),
+.init  = virtio_scsi_init_pci,
+.exit  = virtio_scsi_exit_pci,
+.vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET,
+.device_id = PCI_DEVICE_ID_VIRTIO_SCSI,
+.class_id  = PCI_CLASS_STORAGE_SCSI,
+.revision  = 0x00,
+.qdev.props = (Property[]) {
+DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
+DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
+DEFINE_PROP_UINT32("num_queues", VirtIOPCIProxy, scsi.num_queues, 1),
+DEFINE_PROP_END_OF_LIST(),
+},
+}, {
 /* end of list */
 }
 };
diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
index f8404de..20523af 100644
--- a/hw/virtio-pci.h
+++ b/hw/virtio-pci.h
@@ -17,6 +17,7 @@
 
 #include "virtio-net.h"
 #include "virtio-serial.h"
+#include "virtio-scsi.h"
 
 /* Performance improves when virtqueue kick processing is decoupled from the
  * vcpu thread using ioeventfd for some devices. */
@@ -40,6 +41,7 @@ typedef struct {
 #endif
 virtio_serial_conf serial;
 virtio_net_conf net;
+VirtIOSCSIConf scsi;
 bool ioeventfd_disabled;
 bool ioeventfd_started;
 } VirtIOPCIProxy;
diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
new file mode 100644
index 000..ff86376
--- /dev/null
+++ b/hw/virtio-scsi.c
@@ -0,0 +1,194 @@
+/*
+ * Virtio SCSI HBA
+ *
+ * Copyright IBM, Corp. 2010
+ * Copyright Red Hat, Inc. 2011
+ *
+ * Authors:
+ *   Stefan Hajnoczi
+ *   Paolo Bonzini  
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "virtio-scsi.h"
+#include 
+#include 
+
+#define VIRTIO_SCSI_VQ_SIZE128
+#define VIRTIO_SCSI_CDB_SIZE   32
+#define VIRTIO_SCSI_SENSE_SIZE 96
+
+/* Response codes

[Qemu-devel] [PATCH 3/4] virtio-scsi: add basic SCSI bus operation

2011-11-15 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |   99 --
 1 files changed, 88 insertions(+), 11 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index 7e6348a..5fc3c00 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -119,6 +119,7 @@ typedef struct {
 DeviceState *qdev;
 VirtIOSCSIConf *conf;
 
+SCSIBus bus;
 VirtQueue *ctrl_vq;
 VirtQueue *event_vq;
 VirtQueue *cmd_vq;
@@ -149,6 +150,22 @@ typedef struct VirtIOSCSIReq {
 } resp;
 } VirtIOSCSIReq;
 
+static inline int virtio_scsi_get_lun(uint8_t *lun)
+{
+return ((lun[2] << 8) | lun[3]) & 0x3FFF;
+}
+
+static inline SCSIDevice *virtio_scsi_device_find(VirtIOSCSI *s, uint8_t *lun)
+{
+if (lun[0] != 1) {
+return NULL;
+}
+if (lun[2] != 0 && !(lun[2] >= 0x40 && lun[2] < 0x7F)) {
+return NULL;
+}
+return scsi_device_find(&s->bus, 0, lun[1], virtio_scsi_get_lun(lun));
+}
+
 static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
 {
 VirtQueue *vq = req->vq;
@@ -228,6 +245,36 @@ static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 }
 }
 
+static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
+ int32_t resid)
+{
+VirtIOSCSIReq *req = r->hba_private;
+
+req->resp.cmd->response = VIRTIO_SCSI_S_OK;
+req->resp.cmd->status = status;
+if (req->resp.cmd->status == GOOD) {
+req->resp.cmd->resid = resid;
+if (resid) {
+req->resp.cmd->response = VIRTIO_SCSI_S_UNDERRUN;
+}
+} else {
+   req->resp.cmd->resid = 0;
+   scsi_req_get_sense(r, req->resp.cmd->sense, VIRTIO_SCSI_SENSE_SIZE);
+}
+virtio_scsi_complete_req(req);
+}
+
+static void virtio_scsi_request_cancelled(SCSIRequest *r)
+{
+VirtIOSCSIReq *req = r->hba_private;
+
+if (!req) {
+return;
+}
+req->resp.cmd->response = VIRTIO_SCSI_S_ABORTED;
+virtio_scsi_complete_req(req);
+}
+
 static void virtio_scsi_fail_cmd_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
 req->resp.cmd->response = VIRTIO_SCSI_S_FAILURE;
@@ -238,8 +285,10 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIOSCSI *s = (VirtIOSCSI *)vdev;
 VirtIOSCSIReq *req;
+int n;
 
 while ((req = virtio_scsi_parse_req(s, vq))) {
+SCSIDevice *d;
 int out_size, in_size;
 if (req->elem.out_num < 1 || req->elem.in_num < 1) {
 virtio_scsi_bad_req();
@@ -257,17 +306,31 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 continue;
 }
 
-req->resp.cmd->resid = 0;
-req->resp.cmd->status_qualifier = 0;
-req->resp.cmd->status = CHECK_CONDITION;
-req->resp.cmd->sense_len = 4;
-req->resp.cmd->sense[0] = 0xf0; /* Fixed format current sense */
-req->resp.cmd->sense[1] = ILLEGAL_REQUEST;
-req->resp.cmd->sense[2] = 0x20;
-req->resp.cmd->sense[3] = 0x00;
-req->resp.cmd->response = VIRTIO_SCSI_S_OK;
-
-virtio_scsi_complete_req(req);
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+req->resp.cmd->response = VIRTIO_SCSI_S_BAD_TARGET;
+virtio_scsi_complete_req(req);
+continue;
+}
+req->sreq = scsi_req_new(d, req->req.cmd->tag,
+ virtio_scsi_get_lun(req->req.cmd->lun),
+ req->req.cmd->cdb, req);
+
+if (req->sreq->cmd.mode != SCSI_XFER_NONE) {
+int req_mode =
+(req->elem.in_num > 1 ? SCSI_XFER_FROM_DEV : SCSI_XFER_TO_DEV);
+
+if (req->sreq->cmd.mode != req_mode) {
+virtio_scsi_fail_cmd_req(s, req);
+scsi_req_cancel(req->sreq);
+continue;
+}
+}
+
+n = scsi_req_enqueue(req->sreq, &req->qsgl);
+if (n) {
+scsi_req_continue(req->sreq);
+}
 }
 }
 
@@ -290,6 +353,15 @@ static uint32_t virtio_scsi_get_features(VirtIODevice *vdev,
 return requested_features;
 }
 
+static struct SCSIBusInfo virtio_scsi_scsi_info = {
+.tcq = true,
+.max_target = 255,
+.max_lun = 16383,
+
+.complete = virtio_scsi_command_complete,
+.cancel = virtio_scsi_request_cancelled,
+};
+
 VirtIODevice *virtio_scsi_init(DeviceState *dev, VirtIOSCSIConf *proxyconf)
 {
 VirtIOSCSI *s;
@@ -316,6 +388,11 @@ VirtIODevice *virtio_scsi_init(DeviceState *dev, VirtIOSCSIConf *proxyconf)
 s->cmd_vq = virtio_add_queue(&s->vdev, VIRTIO_SCSI_VQ_SIZE,
virtio_scsi_handle_cmd);
 
+scsi_bus_new(&s->bus, dev, &virtio_scsi_scsi_info);
+if (!dev->hotplugged) {
+scsi_bus_legacy_handle_cmdline(&s->bus);
+}
+
 /* TODO savevm */
 /* TODO boot device path */
 
-- 
1.7.7.1
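
The LUN handling added by this patch can be tried in isolation. The sketch
below mirrors the arithmetic of virtio_scsi_get_lun() and the checks in
virtio_scsi_device_find() on a bare 8-byte array; lun_number, lun_target and
lun_is_addressable are hypothetical names chosen for this example, and the
accepted range for byte 2 is copied from the patch as posted rather than
taken from any specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes 2 and 3 carry the LUN; the top two bits are masked off,
 * leaving a 14-bit value (0..16383), as in virtio_scsi_get_lun(). */
int lun_number(const uint8_t lun[8])
{
    return ((lun[2] << 8) | lun[3]) & 0x3FFF;
}

/* Byte 1 carries the target id passed to scsi_device_find() in the patch. */
int lun_target(const uint8_t lun[8])
{
    return lun[1];
}

/* Same acceptance test as the patch: byte 0 must be 1, and byte 2 must
 * either be 0 or fall in the range 0x40 <= b < 0x7F. */
bool lun_is_addressable(const uint8_t lun[8])
{
    if (lun[0] != 1) {
        return false;
    }
    if (lun[2] != 0 && !(lun[2] >= 0x40 && lun[2] < 0x7F)) {
        return false;
    }
    return true;
}
```

For the bytes {1, 5, 0x40, 0x07, ...} this yields target 5 and LUN 7, since
0x4007 & 0x3FFF is 0x0007.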

[Qemu-devel] [PATCH 2/4] virtio-scsi: Add basic request processing infrastructure

2011-11-15 Thread Paolo Bonzini

From: Stefan Hajnoczi 

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |  138 +-
 1 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index ff86376..7e6348a 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -127,14 +127,148 @@ typedef struct {
 uint32_t cdb_size;
 } VirtIOSCSI;
 
+typedef struct VirtIOSCSIReq {
+struct VirtIOSCSIReq *next;
+VirtIOSCSI *dev;
+VirtQueue *vq;
+VirtQueueElement elem;
+QEMUSGList qsgl;
+SCSIRequest *sreq;
+union {
+char  *buf;
+VirtIOSCSICmdReq  *cmd;
+VirtIOSCSICtrlTMFReq  *tmf;
+VirtIOSCSICtrlANReq   *an;
+} req;
+union {
+char  *buf;
+VirtIOSCSICmdResp *cmd;
+VirtIOSCSICtrlTMFResp *tmf;
+VirtIOSCSICtrlANResp  *an;
+VirtIOSCSIEvent   *event;
+} resp;
+} VirtIOSCSIReq;
+
+static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
+{
+VirtQueue *vq = req->vq;
+virtqueue_push(vq, &req->elem, req->qsgl.size + req->elem.in_sg[0].iov_len);
+qemu_sglist_destroy(&req->qsgl);
+if (req->sreq) {
+req->sreq->hba_private = NULL;
+scsi_req_unref(req->sreq);
+}
+virtio_notify(&req->dev->vdev, vq);
+g_free(req);
+}
+
+static void virtio_scsi_bad_req(void)
+{
+error_report("wrong size for virtio-scsi headers");
+exit(1);
+}
+
+static void qemu_sgl_init_external(QEMUSGList *qsgl, struct iovec *sg,
+   target_phys_addr_t *addr, int num)
+{
+memset(qsgl, 0, sizeof(*qsgl));
+while (num--) {
+qemu_sglist_add(qsgl, *(addr++), (sg++)->iov_len);
+}
+}
+
+static VirtIOSCSIReq *virtio_scsi_parse_req(VirtIOSCSI *s, VirtQueue *vq)
+{
+VirtIOSCSIReq *req;
+req = g_malloc(sizeof(*req));
+if (!virtqueue_pop(vq, &req->elem)) {
+g_free(req);
+return NULL;
+}
+
+assert(req->elem.out_num && req->elem.in_num);
+req->vq = vq;
+req->dev = s;
+req->next = NULL;
+req->sreq = NULL;
+req->req.buf = req->elem.out_sg[0].iov_base;
+req->resp.buf = req->elem.in_sg[0].iov_base;
+
+if (req->elem.out_num > 1) {
+qemu_sgl_init_external(&req->qsgl, &req->elem.out_sg[1],
+   &req->elem.out_addr[1],
+   req->elem.out_num - 1);
+} else {
+qemu_sgl_init_external(&req->qsgl, &req->elem.in_sg[1],
+   &req->elem.in_addr[1],
+   req->elem.in_num - 1);
+}
+
+return req;
+}
+
+static void virtio_scsi_fail_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
+{
+if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
+req->resp.tmf->response = VIRTIO_SCSI_S_FAILURE;
+} else {
+req->resp.an->response = VIRTIO_SCSI_S_FAILURE;
+}
+
+virtio_scsi_complete_req(req);
+}
+
 static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 {
-/* TODO */
+VirtIOSCSI *s = (VirtIOSCSI *)vdev;
+VirtIOSCSIReq *req;
+
+while ((req = virtio_scsi_parse_req(s, vq))) {
+virtio_scsi_fail_ctrl_req(s, req);
+}
+}
+
+static void virtio_scsi_fail_cmd_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
+{
+req->resp.cmd->response = VIRTIO_SCSI_S_FAILURE;
+virtio_scsi_complete_req(req);
 }
 
 static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 {
-/* TODO */
+VirtIOSCSI *s = (VirtIOSCSI *)vdev;
+VirtIOSCSIReq *req;
+
+while ((req = virtio_scsi_parse_req(s, vq))) {
+int out_size, in_size;
+if (req->elem.out_num < 1 || req->elem.in_num < 1) {
+virtio_scsi_bad_req();
+}
+
+out_size = req->elem.out_sg[0].iov_len;
+in_size = req->elem.in_sg[0].iov_len;
+if (out_size < sizeof(VirtIOSCSICmdReq) + VIRTIO_SCSI_CDB_SIZE ||
+in_size < sizeof(VirtIOSCSICmdResp) + VIRTIO_SCSI_SENSE_SIZE) {
+virtio_scsi_bad_req();
+}
+
+if (req->elem.out_num > 1 && req->elem.in_num > 1) {
+virtio_scsi_fail_cmd_req(s, req);
+continue;
+}
+
+req->resp.cmd->resid = 0;
+req->resp.cmd->status_qualifier = 0;
+req->resp.cmd->status = CHECK_CONDITION;
+req->resp.cmd->sense_len = 4;
+req->resp.cmd->sense[0] = 0xf0; /* Fixed format current sense */
+req->resp.cmd->sense[1] = ILLEGAL_REQUEST;
+req->resp.cmd->sense[2] = 0x20;
+req->resp.cmd->sense[3] = 0x00;
+req->resp.cmd->response = VIRTIO_SCSI_S_OK;
+
+virtio_scsi_complete_req(req);
+}
 }
 
 static void virtio_scsi_get_config(VirtIODevice *vdev,
-- 
1.7.7.1

[Qemu-devel] converging around a single guest agent

2011-11-15 Thread Barak Azulay

Hi,

One of the breakout sessions during the ovirt workshop [1] was about the guest 
tools, and focused mainly on the ovirt-guest-agent [2]. 

One of the issues discussed there was the variety of existing guest agents out
there, and the need to converge the efforts on a single agent that will serve
all.

While 4 agents were mentioned (Matahari, vdagent, qemu-ga & ovirt-guest-agent)
during that discussion, we narrowed it down to 2 candidates:

qemu-ga (aka virt-agent):
-------------------------
- Qemu specific - it was aimed at specific qemu needs (mainly quiescing guest
  I/O)
- Communicates directly with qemu (not implemented yet)
- Supports ?
- So far Linux only
- Written in C

Ovirt-guest-agent:
------------------
- Has been around for a long time (~5 years) - considered stable
- Started as rhevm specific but evolved a lot since then
- Currently the only fully functional guest agent available for ovirt
- Written in Python
- Some VDI related sub components are written in C & C++
- Supports a well defined list of message types / protocol [3]
- Supports the following guest OSs
  Linux: RHEL5, RHEL6, F15, F16 (soon)
  Windows: xp, 2k3 (32/64), w7 (32/64), 2k8 (32/64/R2)

  
The need to converge is obvious. Now that ovirt-guest-agent is open-sourced
under the ovirt stack, already produces value for enterprise installations,
and is cross-platform, I propose that we join hands around ovirt-guest-agent
and formalize a single code base that will serve us all.

git @ git://gerrit.ovirt.org/ovirt-guest-agent

Thoughts ?

Thanks
Barak Azulay

[1] http://www.ovirt.org/news-and-events/workshop
[2] http://www.ovirt.org/wiki/File:Ovirt-guest-agent.odp
[3] http://www.ovirt.org/wiki/Ovirt_guest_agent

Re: [Qemu-devel] [PATCH 0/5] docs: convert specifications to markdown

2011-11-15 Thread Stefano Stabellini

On Tue, 15 Nov 2011, Alex Bradbury wrote:
> On 15 November 2011 13:51, Avi Kivity  wrote:
> > Does markdown support rendering into man pages?
> 
> You can do this via pandoc:
> http://johnmacfarlane.net/pandoc/

Actually we are having the very same issue on xen right now: we have a
manual written in markdown and we would like to render it into a man
page. I found that pandoc generates man pages of terrible quality, but
ronn works pretty well. Unfortunately it is not easy to find it in
distros.
Overall I think it is better to use something other than markdown for man
pages whenever possible.

Re: [Qemu-devel] [Xen-devel] [PATCH 3/4] xen: introduce an event channel for buffered io event notifications

2011-11-15 Thread Ian Campbell

On Tue, 2011-11-15 at 17:20 +, Stefano Stabellini wrote:
> On Tue, 15 Nov 2011, Ian Campbell wrote:
> > > +xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_EVTCHN,
> > > +&bufioreq_evtchn);
> > > +rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
> > > +(uint32_t)bufioreq_evtchn);
> > > +if (rc == -1) {
> > > +fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
> > > +return -1;
> > > +}
> > > +state->bufioreq_local_port = rc;
> > 
> > Does this fall back gracefully on hypervisors which don't have this new
> > hvm param? It doesn't look like it but perhaps I'm missing something.
> 
> No, it does not.
> However, upstream Qemu doesn't work very well with Xen 4.1 anyway; the
> first Xen release that is going to support it will be Xen 4.2, which
> should have this feature.

In which case I think you need to handle the resultant error from
xc_get_hvm_param() gracefully with a suitable error message which says
something along those lines.

Ian.
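
A minimal sketch of the kind of check being asked for, under stated
assumptions: get_param_fn is a hypothetical stand-in for the hypervisor query
(the real xc_get_hvm_param() takes an interface handle and a domain id as
well), setup_bufioreq_evtchn and the two stub callbacks are invented for this
example, and the wording of the error message is only a suggestion.

```c
#include <stdio.h>

/* Hypothetical stand-in for the hypervisor query: returns 0 on success and
 * a negative value when the parameter is not known to the hypervisor. */
typedef int (*get_param_fn)(int param, unsigned long *value);

/* Fetch the remote port; fail with a readable message instead of carrying on
 * with an uninitialised value when the parameter is unavailable. */
int setup_bufioreq_evtchn(get_param_fn get_param, int param,
                          unsigned long *port_out)
{
    unsigned long port = 0;
    int rc = get_param(param, &port);

    if (rc < 0) {
        fprintf(stderr,
                "failed to get buffered io event channel (rc=%d); "
                "the hypervisor may be too old\n", rc);
        return -1;
    }
    *port_out = port;
    return 0;
}

/* Stub modelling a hypervisor that lacks the parameter. */
int param_unsupported(int param, unsigned long *value)
{
    (void)param;
    (void)value;
    return -1;
}

/* Stub modelling a hypervisor that provides it (42 is an arbitrary port). */
int param_supported(int param, unsigned long *value)
{
    (void)param;
    *value = 42;
    return 0;
}
```

The point of the sketch is only that the return value is examined before the
port is bound, so an old hypervisor produces a clear message rather than a
confusing bind failure.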

[Qemu-devel] [PATCH 4/4] virtio-scsi: process control queue requests

2011-11-15 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |  111 ++---
 1 files changed, 104 insertions(+), 7 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index 5fc3c00..146ea6e 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -224,15 +224,88 @@ static VirtIOSCSIReq *virtio_scsi_parse_req(VirtIOSCSI *s, VirtQueue *vq)
 return req;
 }
 
-static void virtio_scsi_fail_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
+static void virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
-if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
-req->resp.tmf->response = VIRTIO_SCSI_S_FAILURE;
-} else {
-req->resp.an->response = VIRTIO_SCSI_S_FAILURE;
+SCSIDevice *d = virtio_scsi_device_find(s, req->req.cmd->lun);
+SCSIRequest *r, *next;
+DeviceState *qdev;
+int target;
+
+switch (req->req.tmf->subtype) {
+case VIRTIO_SCSI_T_TMF_ABORT_TASK:
+case VIRTIO_SCSI_T_TMF_QUERY_TASK:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+
+QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) {
+if (r->tag == req->req.cmd->tag) {
+break;
+}
+}
+if (r && r->hba_private) {
+if (req->req.tmf->subtype == VIRTIO_SCSI_T_TMF_ABORT_TASK) {
+scsi_req_cancel(r);
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED;
+} else {
+req->resp.tmf->response = VIRTIO_SCSI_S_OK;
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+if (d->lun == virtio_scsi_get_lun(req->req.cmd->lun)) {
+qdev_reset_all(&d->qdev);
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET:
+case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET:
+case VIRTIO_SCSI_T_TMF_QUERY_TASK_SET:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+if (d->lun != virtio_scsi_get_lun(req->req.cmd->lun)) {
+req->resp.tmf->response = VIRTIO_SCSI_S_OK;
+break;
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_OK;
+QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) {
+if (r->hba_private) {
+if (req->req.tmf->subtype != VIRTIO_SCSI_T_TMF_QUERY_TASK_SET) {
+scsi_req_cancel(r);
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED;
+}
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
+target = req->req.cmd->lun[1];
+QTAILQ_FOREACH(qdev, &s->bus.qbus.children, sibling) {
+ d = DO_UPCAST(SCSIDevice, qdev, qdev);
+ if (d->channel == 0 && d->id == target) {
+qdev_reset_all(&d->qdev);
+ }
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_CLEAR_ACA:
+default:
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
+break;
 }
 
-virtio_scsi_complete_req(req);
+return;
+
+fail:
+req->resp.tmf->response = VIRTIO_SCSI_S_BAD_TARGET;
 }
 
 static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
@@ -241,7 +314,31 @@ static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 VirtIOSCSIReq *req;
 
 while ((req = virtio_scsi_parse_req(s, vq))) {
-virtio_scsi_fail_ctrl_req(s, req);
+int out_size, in_size;
+if (req->elem.out_num < 1 || req->elem.in_num < 1) {
+virtio_scsi_bad_req();
+continue;
+}
+
+out_size = req->elem.out_sg[0].iov_len;
+in_size = req->elem.in_sg[0].iov_len;
+if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
+if (out_size < sizeof(VirtIOSCSICtrlTMFReq) ||
+in_size < sizeof(VirtIOSCSICtrlTMFResp)) {
+virtio_scsi_bad_req();
+}
+virtio_scsi_do_tmf(s, req);
+
+} else if (req->req.tmf->type == VIRTIO_SCSI_T_AN_QUERY ||
+   req->req.tmf->type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
+if (out_size < sizeof(VirtIOSCSICtrlANReq) ||
+in_size < sizeof(VirtIOSCSICtrlANResp)) {
+virtio_scsi_bad_req();
+}
+req->resp.an->event_actual = 0;
+req->resp.an->response = VIRTIO_SCSI_S_OK;
+}
+virtio_scsi_complete_req(req);
 }
 }
 
-- 
1.7.7.1

Re: [Qemu-devel] converging around a single guest agent

2011-11-15 Thread Alon Levy

On Tue, Nov 15, 2011 at 07:24:40PM +0200, Barak Azulay wrote:
> Hi,
> 
> One of the breakout sessions during the ovirt workshop [1] was about the guest
> tools, and focused mainly on the ovirt-guest-agent [2]. 
> 
> One of the issues discussed there, was the various existing guest agents out 
> there, and the need to converge the efforts to a single agent that will serve 
> all. 
> 
> while 4 agents were mentioned (Matahari, vdagent, qemu-ga & ovirt-guest-agent)
> during that discussion, we narrowed it down to 2 candidates:  
> 
> qemu-ga (aka virt-agent):
> -
> - Qemu specific - it was aimed for specific qemu needs (mainly quiesce guest 
> I/O)
> - Communicates directly with qemu  (not implemented yet) 
> - Supports ? 
> - So far linux only
> - written in C
> 
> Ovirt-guest-agent:
> --
> - Has been around for a long time (~5 years) - considered stable
> - Started as rhevm specific but evolved a lot since then
> - Currently the only fully functional guest agent available for ovirt
> - Written in python 
> - Some VDI related sub components are written in C & C++
> - Supports a well defined list of message types / protocol [3]
> - Supports the following guest OSs
>   Linux: RHEL5, RHEL6, F15, F16 (soon)

Does it have a separate system-level and user-level part on Linux? It
does on Windows, right? This is a requirement for replacing
vdagent+vdservice and the Linux spice-agent: they both need to be active
during the login stage, and then launch a new session agent when the user is
logged in. This is true for both Linux and Windows, although we have
completely different code bases for them:

 http://cgit.freedesktop.org/spice/linux/vd_agent/
 http://cgit.freedesktop.org/spice/win32/vd_agent/

linux is C, windows is C++ btw.

>   Windows: xp, 2k3 (32/64), w7 (32/64), 2k8 (32/64/R2)
> 
>   
> The need to converge is obvious, and now that ovirt-guest-agent is opensourced
> under the ovirt stack, and since it already produces value for enterprise
> installations, and is cross platform, I offer to join hands around
> ovirt-guest-agent and formalize a single code base that will serve us all.
> 
> git @ git://gerrit.ovirt.org/ovirt-guest-agent
> 
> Thoughts ?
> 
> Thanks
> Barak Azulay
> 
> [1] http://www.ovirt.org/news-and-events/workshop
> [2] http://www.ovirt.org/wiki/File:Ovirt-guest-agent.odp
> [3] http://www.ovirt.org/wiki/Ovirt_guest_agent
>

[Qemu-devel] Sharing virtio-devices between several kvm virtual machines over network

2011-11-15 Thread Leib, David

Hi,
I am trying to share devices between VMs. For example, I want to use a
cdrom drive that is exposed to one VM from another VM over the network.
In addition, I want to use virtio for this.
What I am trying to do, step by step:
1.  When virtqueue_pop is called on KVM 2, I take the iovec structure
    information
2.  I send it over to KVM 1
3.  KVM 1 puts it into its own virtqueue_pop
4.  KVM 1 waits for virtqueue_push
5.  KVM 1 takes the information from virtqueue_push
6.  KVM 1 sends it over to KVM 2
7.  KVM 2 puts it into its virtqueue_push

 +-----------+                  +-----------+
 |   KVM 1   | <<------------>> |   KVM 2   |
 +-----------+                  +-----------+
       |                              |
 +-----------+                  +-----------+
 |  Host 1   |                  |  Host 2   |
 |  [CDROM]  |                  |           |
 +-----------+                  +-----------+

I have already tried a slightly different approach, stopping KVM 1 and only
waiting for requests from KVM 2, but there are problems with the iovec buffer
address, which I am not able to use as the address of the buffer.
Does anybody have experience with this, or an idea for doing it in a
smarter way? Is it possible to do this at all?
Thank you for your help
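
One way to look at the buffer address problem described above: the iov_base
pointers in an iovec only have meaning inside the process that produced them,
so the other machine would need the bytes they point to, not the addresses.
The sketch below is an assumption about how steps 1-2 could be arranged, not
existing QEMU code. iov_flatten and its record format (a 32-bit length in
host byte order followed by the payload) are hypothetical; only struct iovec
from <sys/uio.h> is a real interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Copy the contents of iov[0..cnt) into out as a sequence of
 * [uint32_t length][payload bytes] records, so the data can be sent to
 * another host instead of the process-local iov_base pointers.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t iov_flatten(const struct iovec *iov, int cnt, uint8_t *out, size_t cap)
{
    size_t off = 0;

    for (int i = 0; i < cnt; i++) {
        uint32_t len = (uint32_t)iov[i].iov_len;

        /* Room for the length prefix and then for the payload? */
        if (cap - off < sizeof(len) || cap - off - sizeof(len) < len) {
            return 0;
        }
        memcpy(out + off, &len, sizeof(len)); /* host byte order */
        off += sizeof(len);
        memcpy(out + off, iov[i].iov_base, len);
        off += len;
    }
    return off;
}
```

The receiving side would allocate its own buffers, rebuild an iovec that
points at them, and copy the payloads back in the same order. A real protocol
would also need a fixed byte order for the length field.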



David Leib
SAP Research Belfast
SAP (UK) Limited   I   The Concourse   I   Queen's Road   I   Queen's Island   
I   Belfast BT3 9DT

mailto: david.l...@sap.com  I   
www.sap.com/research


Re: [Qemu-devel] [PATCH 00/14] Convert Sun devices to memory API.

2011-11-15 Thread Avi Kivity

On 11/15/2011 05:22 PM, Benoît Canet wrote:
> When converting lines like :
>
> -cpu_register_physical_memory_offset(0x1f80, 0x1000,
> -sh7750_io_memory, 0x1f80);
> -cpu_register_physical_memory_offset(0xff80, 0x1000,
> -sh7750_io_memory, 0x1f80);
>
> I'm tempted to do :
>
> +memory_region_init_alias(&s->iomem_1f8, "memory-1f8",
> + &s->iomem, 0x1f80, 0x1000);
> +memory_region_add_subregion(sysmem, 0x1f80, &s->iomem_1f8);
> +
> +memory_region_init_alias(&s->iomem_ff8, "memory-ff8",
> + &s->iomem, 0xff80, 0x1000);
> +memory_region_add_subregion(sysmem, 0xff80, &s->iomem_ff8);
>
> but I'm afraid of losing some information contained in the offset, which is
> different from the base address (0xff80 != 0x1f80).
>

I think the last lines need to be

memory_region_init_alias(&s->iomem_ff8, "memory-ff8",
 &s->iomem, 0x1f80, 0x1000);
memory_region_add_subregion(sysmem, 0xff80, &s->iomem_ff8);

This redirects writes to 0xff800xxx in sysmem to 0x1f800xxx in iomem,
which is what I think the original code intends.
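
To make the redirection concrete, here is a minimal standalone sketch of the
address arithmetic an alias performs. It is a model only, not the QEMU memory
API: AliasWindow, alias_contains() and alias_translate() are hypothetical
names, and the full addresses in the usage note are illustrative values chosen
to match the 0xff800xxx -> 0x1f800xxx pattern above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not QEMU code: an alias window maps
 * [base, base + size) in the parent address space onto
 * [offset, offset + size) in the target region. */
typedef struct AliasWindow {
    uint64_t base;    /* where the window is mapped in the parent */
    uint64_t offset;  /* where the window starts inside the target */
    uint64_t size;    /* length of the window in bytes */
} AliasWindow;

/* Return 1 if addr falls inside the window, 0 otherwise. */
static int alias_contains(const AliasWindow *w, uint64_t addr)
{
    return addr >= w->base && addr - w->base < w->size;
}

/* Translate a parent address into an offset inside the target region.
 * The caller must pass an address that lies inside the window. */
static uint64_t alias_translate(const AliasWindow *w, uint64_t addr)
{
    assert(alias_contains(w, addr));
    return w->offset + (addr - w->base);
}
```

With a window of { base = 0xff800000, offset = 0x1f800000, size = 0x1000 },
alias_translate() maps 0xff800010 to 0x1f800010. The base plays the role of
the address passed to memory_region_add_subregion() and the offset the role
of the one passed to memory_region_init_alias(), which is why the two values
are allowed to differ.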

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v7 1.0] configure: build position independent executables on x86 hosts

2011-11-15 Thread Avi Kivity

On 11/15/2011 04:57 PM, Anthony Liguori wrote:
> On 11/15/2011 05:25 AM, Peter Maydell wrote:
>> On 15 November 2011 09:34, Avi Kivity  wrote:
>>> Change the default on x86 hosts to building PIE (position independent
>>> executables); instead of restricting the option to user-only targets,
>>> apply it to all targets.
>>>
>>> In addition, set the relocation sections to read-only (relro) when
>>> available;
>>> this reduces the attack surface by disallowing changes to relocation
>>> tables
>>> at runtime.
>>>
>>> While PIE reduces performance and relro increases load time, it greatly
>>> improves security, with the potential to reduce a code execution
>>> vulnerability
>>> to a self denial of service.
>>>
>>> Non-x86 are not changed, as they require TCG changes.
>>>
>>> Signed-off-by: Avi Kivity
>>
>> Reviewed-by: Peter Maydell
>>
>> ...as far as the technical content of the patch is concerned.
>> I'm still rather dubious about the merits of putting this patch
>> in this late in the release cycle.
>
> How about we limit this to be enabled by default on x86 Linux hosts?
>
> That would make me a lot more comfortable for 1.0 since I expect we
> can test that exhaustively.

It certainly suits me.  v8 coming up.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] converging around a single guest agent

2011-11-15 Thread Perry Myers

On 11/15/2011 12:24 PM, Barak Azulay wrote:
> Hi,
> 
> One of the breakout sessions during the ovirt workshop [1] was about the
> guest tools, and focused mainly on the ovirt-guest-agent [2].
>
> One of the issues discussed there was the various existing guest agents out
> there, and the need to converge the efforts on a single agent that will
> serve all.
>
> While 4 agents were mentioned (Matahari, vdagent, qemu-ga &
> ovirt-guest-agent) during that discussion, we narrowed it down to 2
> candidates:
> 
> qemu-ga (aka virt-agent):
> -------------------------
> - Qemu specific - it was aimed at specific qemu needs (mainly quiescing
>   guest I/O)
> - Communicates directly with qemu  (not implemented yet) 
> - Supports ? 
> - So far linux only
> - written in C
> 
> Ovirt-guest-agent:
> ------------------
> - Has been around for a long time (~5 years) - considered stable
> - Started as rhevm specific but evolved a lot since then
> - Currently the only fully functional guest agent available for ovirt
> - Written in python 
> - Some VDI related sub components are written in C & C++
> - Supports a well defined list of message types / protocol [3]
> - Supports the following guest OSs
>   Linux: RHEL5, RHEL6, F15, F16 (soon)
>   Windows: xp, 2k3 (32/64), w7 (32/64), 2k8 (32/64/R2)
> 
>   
> The need to converge is obvious, and now that ovirt-guest-agent is
> open sourced under the ovirt stack, and since it already produces value for
> enterprise installations, and is cross platform, I offer to join hands
> around ovirt-guest-agent and formalize a single code base that will serve
> us all.
> 
> git @ git://gerrit.ovirt.org/ovirt-guest-agent
> 
> Thoughts ?

+1

The only downside that I concretely heard from folks re:
ovirt-guest-agent was that it is written in Python.  Two thoughts there:

1. On Windows it is compiled to an executable, so no separate python
   stack needed

2. ovirt-guest-agent is not very large and does not bring in a lot
   (any?) additional python class dependencies above/beyond the core
   language and interpreter.  Given this, the chances of dealing with
   python stack issues are probably minimal and also the overhead of
   including _just_ the base python interpreter in a given guest OS is
   very lightweight.  Core python RPM in F16 is about 80k.

Perry
