Re: [Qemu-devel] [PATCH] block: Set cdrom device read only flag

2012-08-07 Thread ronnie sahlberg
Since pretty much every cdrom drive use scsi today,  shouldnt the
readonly/writeable flag for MMC devices rather
be done down in the hw/scsi* code rather than the generic block code?

If/when/ever I get enough time I would like to port the writeable
dvd+r emulation I wrote for STGT to qemu.


In STGT, MMC/DVD devices can be either read-only or read-write.
If the filesize for the backing file is >0 bytes, it is assumed the
file is an iso image and that the file is a read-only iso image.
If filesize is ==0, then the file is opened read-write and is treated
as a "blank dvd+r disk that the initiator can write/burn to"


regards
ronnie sahlberg


On Tue, Aug 7, 2012 at 6:47 PM, Markus Armbruster  wrote:
> Kevin Wolf  writes:
>
>> Am 02.08.2012 09:20, schrieb Kevin Shanahan:
>>> On Thu, Aug 02, 2012 at 02:49:52PM +0930, Kevin Shanahan wrote:
>>>> On Thu, Aug 02, 2012 at 11:46:13AM +0930, Kevin Shanahan wrote:
>>>>> On Thu, Aug 02, 2012 at 11:02:42AM +0930, Kevin Shanahan wrote:
>>>>>> Set the block driver read_only flag for cdrom devices so that
>>>>>> qmp_change_blockdev does not attempt to open cdrom files in read-write
>>>>>> mode when changing media.
>>>>>
>>>>> Hrm, this fixes my simple test case using the kvm monitor directly but
>>>>> changing media via libvirt still has the same issue (fails for RO
>>>>> files, succeeds for writable files).
>>>>>
>>>>> $ virsh attach-disk --type cdrom --mode readonly test1
>>>>> /srv/kvm/iso/ubuntu-12.04-server-amd64.iso hdc
>>>>> error: Failed to attach disk
>>>>> error: internal error unable to execute QEMU command 'change':
>>>>> Could not open '/srv/kvm/iso/ubuntu-12.04-server-amd64.iso'
>>>>>
>>>>> I'll keep looking into it.
>>>>
>>>> In the libvirt case, it seems libvirt is failing to add media=cdrom to
>>>> the commandline, so in this case qemu is defaulting to media=disk and
>>>> my proposed fix has no effect. Diving into libvirt now to see why no
>>>> media=disk is getting added...
>>>>
>>>> Common test case has this xml (generated by virt-install):
>>>>
>>>> 
>>>>   
>>>>   
>>>>   
>>>>   
>>>>   
>>>> 
>>>
>>> Ok, looks like libvirt is intentionally leaving media=cdrom off the
>>> command line in the case that "-device ide-cd,..." is
>>> supported. Presumably by specifying the device this way, qemu is
>>> supposed to work out that the media type is cdrom automatically (but
>>> it doesn't, it defaults to disk).
>>>
>>> Libvirt wants to use:
>>>
>>> qemu-kvm ... \
>>>  -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw \
>>>  -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 ...
>>>
>>> If I hack qemu/qemu_command.c::qemuBuildDriveStr() to ignore the check
>>> for QEMU_CAPS_IDE_CD and always add media=cdrom, then (with my qemu as
>>> well patch) qemu will open cdrom media files read-only.
>>>
>>> There's probably a neater way to just get qemu to set the media type
>>> if "-device ide-cd,..." is used, but I haven't worked it out yet.
>>>
>>> Anyway, apologies for the rambling conversation with myself on your
>>> lists. Hope this is helpful in some way.
>>
>> Thanks, that's some good information.
>>
>> However, I don't think you should start from media=cdrom. libvirt
>> already does specify readonly=on and that is the property you're really
>> interested in. Since commit 528f7663 (released with 0.13) the 'change'
>> command should keep the read-only flag for all kinds of media.
>
> Correct.
>
>> Now I'm not sure where you lost the read-only flag. At least on git
>> master it doesn't seem to reproduce for me.
>
> I can:
>
> $ qemu --enable-kvm -S -m 384 -vnc localhost:5500 -monitor stdio \
> -drive if=none,id=cd,readonly=on,format=raw \
> -device ide-cd,bus=ide.1,unit=0,drive=cd
> QEMU 1.1.50 monitor - type 'help' for more information
> (qemu) change cd r5.iso
> Could not open 'r5.iso'
> (qemu) q
> armbru@blackfin:~/work/images$ ls -l r5.iso
> -r--r--r--. 1 armbru armbru 2872639488 Mar 31  2011 r5.iso
>
> Looks like a QEMU bug to me.
>



Re: [Qemu-devel] [PATCH] ISCSI: Pick default initiator-name based on the name of the VM

2012-08-08 Thread ronnie sahlberg
csi_open(BlockDriverState *bs, const char 
> *filename, int flags)
>  if (ret != 0) {
>  error_report("Failed to set initiator username and password");
>  ret = -EINVAL;
> -goto failed;
> +goto out;
>  }
>  }
>
> @@ -985,13 +981,13 @@ static int iscsi_open(BlockDriverState *bs, const char 
> *filename, int flags)
>  if (parse_chap(iscsi, iscsi_url->target) != 0) {
>  error_report("iSCSI: Failed to set CHAP user/password");
>  ret = -EINVAL;
> -goto failed;
> +goto out;
>  }
>
>  if (iscsi_set_session_type(iscsi, ISCSI_SESSION_NORMAL) != 0) {
>  error_report("iSCSI: Failed to set session type to normal.");
>  ret = -EINVAL;
> -goto failed;
> +goto out;
>  }
>
>  iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE_CRC32C);
> @@ -1012,7 +1008,7 @@ static int iscsi_open(BlockDriverState *bs, const char 
> *filename, int flags)
>  != 0) {
>  error_report("iSCSI: Failed to start async connect.");
>  ret = -EINVAL;
> -goto failed;
> +goto out;
>  }
>
>  while (!task.complete) {
> @@ -1023,11 +1019,7 @@ static int iscsi_open(BlockDriverState *bs, const char 
> *filename, int flags)
>  error_report("iSCSI: Failed to connect to LUN : %s",
>   iscsi_get_error(iscsi));
>  ret = -EINVAL;
> -goto failed;
> -}
> -
> -if (iscsi_url != NULL) {
> -iscsi_destroy_url(iscsi_url);
> +goto out;
>  }
>
>  /* Medium changer or tape. We dont have any emulation for this so this 
> must
> @@ -1039,19 +1031,22 @@ static int iscsi_open(BlockDriverState *bs, const 
> char *filename, int flags)
>  bs->sg = 1;
>  }
>
> -return 0;
> +ret = 0;
>
> -failed:
> +out:
>  if (initiator_name != NULL) {
>  g_free(initiator_name);
>  }
>  if (iscsi_url != NULL) {
>  iscsi_destroy_url(iscsi_url);
>  }
> -if (iscsi != NULL) {
> -iscsi_destroy_context(iscsi);
> +
> +if (ret) {
> +if (iscsi != NULL) {
> +iscsi_destroy_context(iscsi);
> +}
> +memset(iscsilun, 0, sizeof(IscsiLun));
>  }
> -memset(iscsilun, 0, sizeof(IscsiLun));
>  return ret;
>  }
>
>
> Can you confirm that this is how the initiator name should be passed, so I can
> split the above patch and commit it?
>


That looks good.
Thanks!

ronnie sahlberg



Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-09 Thread ronnie sahlberg
On Thu, Aug 9, 2012 at 10:31 PM, Stefan Priebe - Profihost AG
 wrote:
> Am 09.08.2012 14:19, schrieb Paolo Bonzini:
>
>> Il 09/08/2012 14:08, Stefan Priebe - Profihost AG ha scritto:
>>>
>>>
>>> virtio-scsi:
>>> rand 4k:
>>>write: io=822448KB, bw=82228KB/s, iops=20557, runt= 10002msec
>>>read : io=950920KB, bw=94694KB/s, iops=23673, runt= 10042msec
>>> seq:
>>>write: io=2436MB, bw=231312KB/s, iops=56, runt= 10784msec
>>>read : io=3248MB, bw=313799KB/s, iops=76, runt= 10599msec
>>>
>>> virtio-blk:
>>> rand 4k:
>>>write: io=896472KB, bw=89051KB/s, iops=22262, runt= 10067msec
>>>read : io=1710MB, bw=175073KB/s, iops=43768, runt= 10002msec
>>> seq:
>>>write: io=4008MB, bw=391285KB/s, iops=95, runt= 10489msec
>>>read : io=5748MB, bw=570178KB/s, iops=139, runt= 10323msec
>>
>>
>> Thanks; some overhead is expected, but not this much.  Especially the
>> sequential case is bad, what disk is this?
>
>
> right now this is an external iscsi Nexentastor. Locally i can't get this
> bandwith nor these iops to test.
>
>
>
>> Things to test include:
>>
>> - using the deadline I/O scheduler on at least the host, and possibly
>> the guest too
>
> guest uses noop right now. Disk Host is nexentastor running open solaris. I
> use libiscsi right now so the disks are not visible in both cases
> (virtio-blk and virtio-scsi) to the host right now.
>

And if you mount the disks locally on the host using open-iscsi, and
access them as /dev/sg* from qemu, what performance do you get?


virtio-blk would first go to scsi emulation and then call out to
block/iscsi.c to translate back to scsi commands to send to libiscsi

while virtio-scsi (I think) would treat libiscsi as a generic scsi
passthrough device.  I.e. all commands just go straight through
bdrv_aio_ioctl(SG_IO)


If anything, I think the codepaths should be shorter for the
virtio-scsi case, and it should due to the lack of scsi emulate and
scsi re-encode perform better.


Can you try also using normal scsi-generic and see how it performs
compared to virtio-blk/-scsi ?

git show 983924532f61091fd90d1f2fafa4aa938c414dbb
This command shows how to set up libiscsi with passthrough via an
emulated scsi hba.


as  virtio-blk/-scsi both use libiscsi, i think the bottleneck might
either be the interface between guest and qemu,  or the difference to
the guest when talking to local scsi-emulation vs talking to the
passthrough remote target.

also is it possible to map the LUNs locally on the host using
open-iscsi and then use the scsi-generic devices /dev/sg*  for qemu
and see how it compares?



Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-10 Thread ronnie sahlberg
You want discard to work?



That should not be a problem with iscsi.

You are using qemu 1.0 ?
So you dont have the qemu support for scsi-generic passthrough to iscsi devices.


This should though work without too much trouble



First you need an iscsi target that supports SBC UNMAP command.
STGT  does support this :   http://stgt.sourceforge.net/
This is a userspace software iscsi target that works on most distros of linux.

Support for thin-provisioning in STGT is very very recent :
commit 9a855ac0026971c2b5c7f7133febfaf9462410dc
Author: Ronnie Sahlberg 
Date:   Sun Apr 15 12:07:32 2012 +1000

SBC: Add support for thin-provisioning and the UNMAP command

The UNMAP command is implemented using FALLOC_FL_PUNCH_HOLE and
release UNMAPPED blocks back to the underlying filesystem.

FALLOC_FL_PUNCH_HOLE is fairly new addition to Linux but works o
ext4 and XFS filesystems currently.

Signed-off-by: Ronnie Sahlberg 
Signed-off-by: FUJITA Tomonori 


That means STGT version 1.0.27 or later.
As this is very recent your distro probably does not have support for this yet,
so you probably want to download, compile and install STGT from the
current git tree.

Thin-provisioning in STGT requires the also very recent addition on
FALLOC_FL_PUNCH_HOLE
(and also SEEK_HOLE/SEEK_DATA if you want "get_lba_status" support)
Because STGT just calls out to these functions.

I think you need to run the target on linux 3.2 or later kernels using
ext4/xfs filessytem for this to work since I dont think any
other filesystems support it. Never tested XFS myself but google
claims it works.


Once you have STGT running on a 3.2 or later kernel and using a
filesystem that supports discard,  this is the command to tell TGTD to
activate
thin-provisioning support for the LUN :

   tgtadm --lld iscsi --mode logicalunit --op update --tid $TID --lun
1 --params thin_provisioning=1

STGT will show the thin provisioning status like this

LUN: 1
Type: disk
SCSI ID: IET 00010001
SCSI SN: beaf11
Size: 105 MB, Block size: 512
Online: Yes
Removable media: Yes
Prevent removal: No
Readonly: No
Thin-provisioning: Yes
Backing store type: rdwr
Backing store path: /100mb.img
Backing store flags:
...

Once you got STGT up and running and setup for thin provisioning that
should be almost all you need.
(Other iscsi targets may also probably support thin-provisioning but I
have no idea on how to set them up)


Once you have set STGT up,  you just need a guest that supports discard.
Any recent linux distro with a kernel 3.2 or later should do.
I often use latest mint when I test.

Just set it up and put a ext4 filesystem on the iscsi lun, and use the
'discard' mount option in the guest.
Use 'ls -ls ' on the target and see that the file
'shrinks' in size when  you delete files from the ext4 filesystem.

You can also use wireshark, it understands and decodes the unmap UNMAP
that is used to punch holes in the medium.




NOTE: STGT does not yet support/implement the "thresholds" for
thin-provisioning so there is not yet any mechanism to automatically
inform your guest when the unterlying storage is running low on space.
So you do need to track space utilization on the target backing
filesystem youself!
At some stage I will add the thresholds from sbc 4.7.3.8 but it wont
be anytime soon. (patches are likely welcome)



That should be all you need to do to get it running. It is pretty easy.
Ping me if you have any trouble.



regards
ronnie sahlberg



On Fri, Aug 10, 2012 at 8:30 PM, Paolo Bonzini  wrote:
> Il 10/08/2012 12:28, Stefan Priebe - Profihost AG ha scritto:
>> I'm using iscsi. So no raw or qcow2.
>
> Ok, then you need to use scsi-block as your device instead of scsi-disk
> or scsi-hd.  This will disable the QEMU SCSI emulation and let your VM
> talk directly to the NAS.
>
> CCing Ronnie who may be interested in bug reports since I'm on holiday
> starting "soon".
>
> Paolo
>
>>
>> Thanks,
>>
>> Stefan
>>
>> Am 10.08.2012 12:20, schrieb Paolo Bonzini:
>>> Il 10/08/2012 11:22, Stefan Priebe - Profihost AG ha scritto:
>>>> virtio-scsi is now working fine. Could you please help me to get discard
>>>> / trim running? I can't find any information what is needed to get
>>>> discard / trim working.
>>>
>>> You need to add discard_granularity=NNN, where NNN is usually 512 for
>>> raw and the cluster size (65536) for qcow2.
>>>
>>> However, discard is supported only for XFS on raw, and on qcow2 it will
>>> not reclaim space---the space will only be reused for future qcow2
>>> allocation.
>>>
>>> Bette

Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-10 Thread ronnie sahlberg
On Fri, Aug 10, 2012 at 8:30 PM, Paolo Bonzini  wrote:
> Il 10/08/2012 12:28, Stefan Priebe - Profihost AG ha scritto:
>> I'm using iscsi. So no raw or qcow2.
>
> Ok, then you need to use scsi-block as your device instead of scsi-disk
> or scsi-hd.  This will disable the QEMU SCSI emulation and let your VM
> talk directly to the NAS.
>
> CCing Ronnie who may be interested in bug reports since I'm on holiday
> starting "soon".
>

I think it works on any,
You can of course not boot from a if=scsi disk in qemu,

but any '-drive file=iscsi://...,if=scsi' should work as long as it is
not the boot device.

SCSI emulation in qemu picks this up via WRITESAME10/16 and then calls
 bdrv_aio_discard()
block/iscsi.c is invoked for discard and then translates this back to
a SBC UNMAP command it sends to the target.


Now, block/iscsi.c does assume that any target that reports that it
supports thin-provisioning actually implements UNMAP command.
There could be targets that support thin-provision ing that does NOT
support UNMAP and unly support discard via WRITESAME10/16
so at some stage I should send a patch to iscsi.c to check which
commands the target supprots and use one of the supported ones instead
of a blanket
"you say you support thin-provisioning, I take that as confirmation
you support SBC UNMAP"


regards
ronnie sahlberg



Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-10 Thread ronnie sahlberg
You can easily verify if your target supports thin-provisioning via
the UNMAP command.

Download the sg3-utils package
 and either mount the LUN locally via the kernel open-iscsi or apply
the libiscsi patch to sg3-utils to make it iscsi-aware

then use the commands"sg_unmap"  to try to unmap regions   and
"sg_get_lba_status" to check that the regions are now unmapped.



On Fri, Aug 10, 2012 at 9:54 PM, Stefan Priebe - Profihost AG
 wrote:
> http://www.nexenta.com/corp/products/what-is-openstorage/nexentastor
>
> tells me:
> "SCSI UNMAP as a client-side feature frees up storage in the back end, in
> the context of thin provisioning (a 100-to-one reduction in space for VDI
> when using NexentaStor)."
>
> So i would say nexenta supports it. But i'm using virtio-scsi-pci? I'm
> really sorry to ask so many questions.
>
> Stefan
> Am 10.08.2012 13:20, schrieb ronnie sahlberg:
>
>> On Fri, Aug 10, 2012 at 8:30 PM, Paolo Bonzini 
>> wrote:
>>>
>>> Il 10/08/2012 12:28, Stefan Priebe - Profihost AG ha scritto:
>>>>
>>>> I'm using iscsi. So no raw or qcow2.
>>>
>>>
>>> Ok, then you need to use scsi-block as your device instead of scsi-disk
>>> or scsi-hd.  This will disable the QEMU SCSI emulation and let your VM
>>> talk directly to the NAS.
>>>
>>> CCing Ronnie who may be interested in bug reports since I'm on holiday
>>> starting "soon".
>>>
>>
>> I think it works on any,
>> You can of course not boot from a if=scsi disk in qemu,
>>
>> but any '-drive file=iscsi://...,if=scsi' should work as long as it is
>> not the boot device.
>>
>> SCSI emulation in qemu picks this up via WRITESAME10/16 and then calls
>>   bdrv_aio_discard()
>> block/iscsi.c is invoked for discard and then translates this back to
>> a SBC UNMAP command it sends to the target.
>>
>>
>> Now, block/iscsi.c does assume that any target that reports that it
>> supports thin-provisioning actually implements UNMAP command.
>> There could be targets that support thin-provision ing that does NOT
>> support UNMAP and unly support discard via WRITESAME10/16
>> so at some stage I should send a patch to iscsi.c to check which
>> commands the target supprots and use one of the supported ones instead
>> of a blanket
>> "you say you support thin-provisioning, I take that as confirmation
>> you support SBC UNMAP"
>>
>>
>> regards
>> ronnie sahlberg
>>
>



Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-10 Thread ronnie sahlberg
On Fri, Aug 10, 2012 at 9:57 PM, Stefan Priebe - Profihost AG
 wrote:
> Am 10.08.2012 13:12, schrieb ronnie sahlberg:
>
>> You want discard to work?
>
> Yes
>
>
>> You are using qemu 1.0 ?
>
> actual qemu-kvm git
>
>
>> So you dont have the qemu support for scsi-generic passthrough to iscsi
>> devices.
>
> Why?
>

scsi-generic passthrough I think was added for iscsi in 1.1
so in 1.0 your guest will talk scsi to qemu, and invoke the
scsi-emulation in qemu.
It then will call functions like 'bdrv_aio_discard()" in libiscsi
that will translate it back into a scsi command again and pass it to
the target.

It still works, it just means you have a small degradation of
performance compared to if you could send the SCSI CDB straight
through to the iscsi target as you can in qemu 1.1
Very likely so small performance hit that you can not even measure it.


Biggest difference is cosmetic if you run 'sg_inq' in your guest.  in
1.0 it will talk to the qemu scsi emulation layer. in 1.1 when scsi
passthrough is use you will talk to the iscsi target.



Re: [Qemu-devel] virtio-scsi vs. virtio-blk

2012-08-10 Thread ronnie sahlberg
On Fri, Aug 10, 2012 at 10:14 PM, Stefan Priebe - Profihost AG
 wrote:
> Am 10.08.2012 14:04, schrieb ronnie sahlberg:
>
>> On Fri, Aug 10, 2012 at 9:57 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>>
>>> Am 10.08.2012 13:12, schrieb ronnie sahlberg:
>>>
>>>> You want discard to work?
>>>
>>>
>>> Yes
>>>
>>>
>>>> You are using qemu 1.0 ?
>>>
>>>
>>> actual qemu-kvm git
>>>
>>>
>>>> So you dont have the qemu support for scsi-generic passthrough to iscsi
>>>> devices.
>>>
>>>
>>> Why?
>>>
>>
>> scsi-generic passthrough I think was added for iscsi in 1.1
>> so in 1.0 your guest will talk scsi to qemu, and invoke the
>> scsi-emulation in qemu.
>> It then will call functions like 'bdrv_aio_discard()" in libiscsi
>> that will translate it back into a scsi command again and pass it to
>> the target.
>>
>> It still works, it just means you have a small degradation of
>> performance compared to if you could send the SCSI CDB straight
>> through to the iscsi target as you can in qemu 1.1
>> Very likely so small performance hit that you can not even measure it.
>
>
> which version are you talking about? I use qemu-kvm.git so this is upcomming
> 1.2 and i use libiscsi 1.5.0.

I dont know the kvm version numbers.

But you can check the file
block/iscsi.c for the version you use  for this :

  .bdrv_aio_discard = iscsi_aio_discard,

If it has bdrv_aio_discard then you have support for 'discard' when
using the scsi emulation. i.e.   -drive ...,if=scsi,...



#ifdef __linux__
.bdrv_ioctl   = iscsi_ioctl,
.bdrv_aio_ioctl   = iscsi_aio_ioctl,
#endif

If it has these two lines too, then you have scsi-passthrough and can
bypass the qemu scsi emulation.
One way to activate passthough is via scsi-generic:
Example:
-device lsi -device scsi-generic,drive=MyISCSI \
-drive file=iscsi://10.1.1.125/iqn.ronnie.test/1,if=none,id=MyI


regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH] iscsi: fix race between task completition and task abortion

2012-08-14 Thread ronnie sahlberg
Stefan H

How should I do Acked-by properly,


Is a reply with the text

Acked-by: Ronnie Sahlberg 

sufficient ?


regards
ronnie sahlberg


On Tue, Aug 14, 2012 at 8:35 PM, Stefan Hajnoczi  wrote:
> On Tue, Aug 14, 2012 at 08:44:46AM +0200, Stefan Priebe wrote:
>> From: spriebe 
>
> CCing Ronnie, iSCSI block driver author.
>
>>
>> ---
>>  block/iscsi.c |   36 
>>  1 files changed, 20 insertions(+), 16 deletions(-)
>>
>> diff --git a/block/iscsi.c b/block/iscsi.c
>> index 12ca76d..257f97f 100644
>> --- a/block/iscsi.c
>> +++ b/block/iscsi.c
>> @@ -76,6 +76,10 @@ static void
>>  iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void 
>> *command_data,
>>  void *private_data)
>>  {
>> +   IscsiAIOCB *acb = (IscsiAIOCB *)private_data;
>> +
>> +   scsi_free_scsi_task(acb->task);
>> +   acb->task = NULL;
>>  }
>>
>>  static void
>> @@ -85,14 +89,19 @@ iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
>>  IscsiLun *iscsilun = acb->iscsilun;
>>
>>  acb->common.cb(acb->common.opaque, -ECANCELED);
>> -acb->canceled = 1;
>>
>> -/* send a task mgmt call to the target to cancel the task on the target 
>> */
>> -iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
>> - iscsi_abort_task_cb, NULL);
>> +if (acb->canceled != 0)
>> +return;
>> +
>> +acb->canceled = 1;
>>
>> -/* then also cancel the task locally in libiscsi */
>> -iscsi_scsi_task_cancel(iscsilun->iscsi, acb->task);
>> +/* send a task mgmt call to the target to cancel the task on the target
>> + * this also cancels the task in libiscsi
>> + */
>> +if (acb->task) {
>> +iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
>> + iscsi_abort_task_cb, &acb);
>> +}
>>  }
>>
>>  static AIOPool iscsi_aio_pool = {
>> @@ -184,6 +193,11 @@ iscsi_readv_writev_bh_cb(void *p)
>>  }
>>
>>  qemu_aio_release(acb);
>> +
>> +if (acb->task) {
>> +  scsi_free_scsi_task(acb->task);
>> +  acb->task = NULL;
>> +}
>>  }
>>
>>
>> @@ -212,8 +226,6 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int 
>> status,
>>  }
>>
>>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
>> -scsi_free_scsi_task(acb->task);
>> -acb->task = NULL;
>>  }
>>
>>  static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun)
>> @@ -313,8 +325,6 @@ iscsi_aio_read16_cb(struct iscsi_context *iscsi, int 
>> status,
>>  }
>>
>>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
>> -scsi_free_scsi_task(acb->task);
>> -acb->task = NULL;
>>  }
>>
>>  static BlockDriverAIOCB *
>> @@ -429,8 +439,6 @@ iscsi_synccache10_cb(struct iscsi_context *iscsi, int 
>> status,
>>  }
>>
>>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
>> -scsi_free_scsi_task(acb->task);
>> -acb->task = NULL;
>>  }
>>
>>  static BlockDriverAIOCB *
>> @@ -483,8 +491,6 @@ iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
>>  }
>>
>>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
>> -scsi_free_scsi_task(acb->task);
>> -acb->task = NULL;
>>  }
>>
>>  static BlockDriverAIOCB *
>> @@ -560,8 +566,6 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int 
>> status,
>>  }
>>
>>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
>> -scsi_free_scsi_task(acb->task);
>> -acb->task = NULL;
>>  }
>>
>>  static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
>> --
>> 1.7.2.5
>>
>>



Re: [Qemu-devel] [PATCH] PATCH V2: fix NULL dereferences / races between task completition and abort

2012-08-15 Thread ronnie sahlberg
Acked-by: Ronnie Sahlberg 



On Wed, Aug 15, 2012 at 5:09 PM, Stefan Priebe  wrote:
> Signed-off-by: Stefan Priebe 
>
> ---
>  block/iscsi.c |   55 +++
>  1 files changed, 23 insertions(+), 32 deletions(-)
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 12ca76d..1c8b049 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -76,6 +76,10 @@ static void
>  iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void 
> *command_data,
>  void *private_data)
>  {
> +   IscsiAIOCB *acb = (IscsiAIOCB *)private_data;
> +
> +   scsi_free_scsi_task(acb->task);
> +   acb->task = NULL;
>  }
>
>  static void
> @@ -84,15 +88,15 @@ iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
>  IscsiAIOCB *acb = (IscsiAIOCB *)blockacb;
>  IscsiLun *iscsilun = acb->iscsilun;
>
> -acb->common.cb(acb->common.opaque, -ECANCELED);
>  acb->canceled = 1;
>
> -/* send a task mgmt call to the target to cancel the task on the target 
> */
> -iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
> - iscsi_abort_task_cb, NULL);
> +acb->common.cb(acb->common.opaque, -ECANCELED);
>
> -/* then also cancel the task locally in libiscsi */
> -iscsi_scsi_task_cancel(iscsilun->iscsi, acb->task);
> +/* send a task mgmt call to the target to cancel the task on the target
> + * this also cancels the task in libiscsi
> + */
> +iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
> + iscsi_abort_task_cb, &acb);
>  }
>
>  static AIOPool iscsi_aio_pool = {
> @@ -179,11 +183,18 @@ iscsi_readv_writev_bh_cb(void *p)
>
>  qemu_bh_delete(acb->bh);
>
> -if (acb->canceled == 0) {
> +if (!acb->canceled) {
>  acb->common.cb(acb->common.opaque, acb->status);
>  }
>
>  qemu_aio_release(acb);
> +
> +if (acb->canceled) {
> +return;
> +}
> +
> +scsi_free_scsi_task(acb->task);
> +acb->task = NULL;
>  }
>
>
> @@ -197,10 +208,8 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int 
> status,
>
>  g_free(acb->buf);
>
> -if (acb->canceled != 0) {
> +if (acb->canceled) {
>  qemu_aio_release(acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  return;
>  }
>
> @@ -212,8 +221,6 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int 
> status,
>  }
>
>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  }
>
>  static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun)
> @@ -298,10 +305,8 @@ iscsi_aio_read16_cb(struct iscsi_context *iscsi, int 
> status,
>
>  trace_iscsi_aio_read16_cb(iscsi, status, acb, acb->canceled);
>
> -if (acb->canceled != 0) {
> +if (acb->canceled) {
>  qemu_aio_release(acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  return;
>  }
>
> @@ -313,8 +318,6 @@ iscsi_aio_read16_cb(struct iscsi_context *iscsi, int 
> status,
>  }
>
>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  }
>
>  static BlockDriverAIOCB *
> @@ -414,10 +417,8 @@ iscsi_synccache10_cb(struct iscsi_context *iscsi, int 
> status,
>  {
>  IscsiAIOCB *acb = opaque;
>
> -if (acb->canceled != 0) {
> +if (acb->canceled) {
>  qemu_aio_release(acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  return;
>  }
>
> @@ -429,8 +430,6 @@ iscsi_synccache10_cb(struct iscsi_context *iscsi, int 
> status,
>  }
>
>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  }
>
>  static BlockDriverAIOCB *
> @@ -468,10 +467,8 @@ iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
>  {
>  IscsiAIOCB *acb = opaque;
>
> -if (acb->canceled != 0) {
> +if (acb->canceled) {
>  qemu_aio_release(acb);
> -scsi_free_scsi_task(acb->task);
> -acb->task = NULL;
>  return;
>  }
>
> @@ -483,8 +480,6 @@ iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
>  }
>
>  iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
> -scsi_free_scsi_task(acb->task);
> -acb-&g

[Qemu-devel] [PATCH] Two trivial patches for iSCSI and blank DVDs

2012-08-16 Thread Ronnie Sahlberg
Kevin, List, Paolo

Please find two trivial patches to the iscsi block driver.

1, If the LUN contains a blank disk then the readcapacity10 data will
contain a ReturnedLogicalBlockAddress of 0. For this case this does not mean 
that the device contains one readable block and can be read from.
This is a special case for a blank disk that contains no readable blocks at 
all. This change makes bdrv_getlength()/iscsi_getlength() now rerurn a size of 
0  for iSCSI LUNs that contain a blank disk.

2, If the iSCSI LUN contains a blank disk then we can not read or write to it 
using bdrv_*() functions since there is no bdrv api for writeable mmc devices.
For this case with a blank disk force the use of scsi-generic just like 
we do for tape and mediachanger devices.
This allows the guest to talk directly to the target LUN and write to/burn the 
disk.


I have confirmed that 

  dvdrecord -dao -ignsize -overburn dev=/dev/sg1 .iso

works from a qemu guest to burn an iSCSI cdrom with these patches.

regards
ronnie sahlberg



[Qemu-devel] [PATCH 1/2] ISCSI: Set number of blocks to 0 for blank CDROM devices

2012-08-16 Thread Ronnie Sahlberg
The number of blocks of the device is used to compute the device size
in bdrv_getlength()/iscsi_getlength().
For MMC devices, the ReturnedLogicalBlockAddress in the READCAPACITY10
has a special meaning when it is 0.
In this case it does not mean that LBA 0 is the last accessible LBA,
and thus the device has 1 readable block, but instead it means that the
disc is blank and there are no readable blocks.

This change ensures that when the iSCSI LUN is loaded with a blank
DVD-R disk or similar that bdrv_getlength() will return the correct
size of the device as 0 bytes.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bb9cf82..fb420ea 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -717,7 +717,12 @@ iscsi_readcapacity10_cb(struct iscsi_context *iscsi, int 
status,
 }
 
 itask->iscsilun->block_size = rc10->block_size;
-itask->iscsilun->num_blocks = rc10->lba + 1;
+if (rc10->lba == 0) {
+/* blank disk loaded */
+itask->iscsilun->num_blocks = 0;
+} else {
+itask->iscsilun->num_blocks = rc10->lba + 1;
+}
 itask->bs->total_sectors= itask->iscsilun->num_blocks *
itask->iscsilun->block_size / BDRV_SECTOR_SIZE ;
 
-- 
1.7.3.1




[Qemu-devel] [PATCH 2/2] ISCSI: Force scsi-generic for MMC with blank disks

2012-08-16 Thread Ronnie Sahlberg
There is no bdrv_* API for the commands for burning a blank MMC disk
so when iSCSI LUNs are specified and the LUN is a MMC device with
0 available blocks. This is a blank disk so force scsi generic.

This allows the guest to talk directly to the target to burn data on
the disk.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index fb420ea..ca53afa 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1017,10 +1017,19 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 /* Medium changer or tape. We dont have any emulation for this so this must
  * be sg ioctl compatible. We force it to be sg, otherwise qemu will try
  * to read from the device to guess the image format.
+ * MMC device with no blocks contain a blank disk so force them to use sg
+ * too.
  */
-if (iscsilun->type == TYPE_MEDIUM_CHANGER ||
-iscsilun->type == TYPE_TAPE) {
+switch (iscsilun->type) {
+case TYPE_ROM:
+if (iscsilun->num_blocks > 0) {
+break;
+}
+case TYPE_MEDIUM_CHANGER:
+case TYPE_TAPE:
 bs->sg = 1;
+default:
+break;
 }
 
 ret = 0;
-- 
1.7.3.1




Re: [Qemu-devel] [PATCH 2/2] ISCSI: Force scsi-generic for MMC with blank disks

2012-08-18 Thread ronnie sahlberg
On Sun, Aug 19, 2012 at 7:58 AM, Paolo Bonzini  wrote:
> Il 17/08/2012 04:36, Ronnie Sahlberg ha scritto:
>> There is no bdrv_* API for the commands for burning a blank MMC disk
>> so when iSCSI LUNs are specified and the LUN is a MMC device with
>> 0 available blocks. This is a blank disk so force scsi generic.
>>
>> This allows the guest to talk directly to the target to burn data on
>> the disk.
>>
>> Signed-off-by: Ronnie Sahlberg 
>
> What happens without the patch?  It's ok that scsi-{hd,cd} does not
> work, but do scsi-{block,generic} work without the patch?
>

Neither of them work, basically because in
block.c:find_image_format()

if bs->sg is not set in

 if (bs->sg || !bdrv_is_inserted(bs)) {

then we continue to

  ret = bdrv_pread(bs, 0, buf, sizeof(buf));

which fails with an error.
So this patch is basically to prevent find_image_format() from trying
to read from a blank disk.

Maybe there is a better way to do this?



regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH 2/2] ISCSI: Force scsi-generic for MMC with blank disks

2012-08-18 Thread ronnie sahlberg
Ah,

This patch only affects the case when there is a blank / empty disk loaded.
It has no effect on when real *.iso images are loaded and the disk
contains data.

The use case to be able to "burn" to an iscsi cdrom is probably not
very urgent, so maybe it is best to delay this until post 1.2


regards
ronnie sahlberg


On Sun, Aug 19, 2012 at 8:02 AM, ronnie sahlberg
 wrote:
> On Sun, Aug 19, 2012 at 7:58 AM, Paolo Bonzini  wrote:
>> Il 17/08/2012 04:36, Ronnie Sahlberg ha scritto:
>>> There is no bdrv_* API for the commands for burning a blank MMC disk
>>> so when iSCSI LUNs are specified and the LUN is a MMC device with
>>> 0 available blocks. This is a blank disk so force scsi generic.
>>>
>>> This allows the guest to talk directly to the target to burn data on
>>> the disk.
>>>
>>> Signed-off-by: Ronnie Sahlberg 
>>
>> What happens without the patch?  It's ok that scsi-{hd,cd} does not
>> work, but do scsi-{block,generic} work without the patch?
>>
>
> Neither of them work, basically because in
> block.c:find_image_format()
>
> if bs->sg is not set in
>
>  if (bs->sg || !bdrv_is_inserted(bs)) {
>
> then we continue to
>
>   ret = bdrv_pread(bs, 0, buf, sizeof(buf));
>
> which fails with an error.
> So this patch is basically to prevent find_image_format() from trying
> to read from a blank disk.
>
> Maybe there is a better way to do this?
>
>
>
> regards
> ronnie sahlberg



Re: [Qemu-devel] [PATCH 1/2] ISCSI: Set number of blocks to 0 for blank CDROM devices

2012-08-18 Thread ronnie sahlberg
On Sun, Aug 19, 2012 at 7:57 AM, Paolo Bonzini  wrote:
> Il 17/08/2012 04:36, Ronnie Sahlberg ha scritto:
>> The number of blocks of the device is used to compute the device size
>> in bdrv_getlength()/iscsi_getlength().
>> For MMC devices, the ReturnedLogicalBlockAddress in the READCAPACITY10
>> has a special meaning when it is 0.
>> In this case it does not mean that LBA 0 is the last accessible LBA,
>> and thus the device has 1 readable block, but instead it means that the
>> disc is blank and there are no readable blocks.
>>
>> This change ensures that when the iSCSI LUN is loaded with a blank
>> DVD-R disk or similar that bdrv_getlength() will return the correct
>> size of the device as 0 bytes.
>>
>> Signed-off-by: Ronnie Sahlberg 
>> ---
>>  block/iscsi.c |7 ++-
>>  1 files changed, 6 insertions(+), 1 deletions(-)
>>
>> diff --git a/block/iscsi.c b/block/iscsi.c
>> index bb9cf82..fb420ea 100644
>> --- a/block/iscsi.c
>> +++ b/block/iscsi.c
>> @@ -717,7 +717,12 @@ iscsi_readcapacity10_cb(struct iscsi_context *iscsi, 
>> int status,
>>  }
>>
>>  itask->iscsilun->block_size = rc10->block_size;
>> -itask->iscsilun->num_blocks = rc10->lba + 1;
>> +if (rc10->lba == 0) {
>> +/* blank disk loaded */
>> +itask->iscsilun->num_blocks = 0;
>> +} else {
>> +itask->iscsilun->num_blocks = rc10->lba + 1;
>> +}
>>  itask->bs->total_sectors= itask->iscsilun->num_blocks *
>> itask->iscsilun->block_size / 
>> BDRV_SECTOR_SIZE ;
>>
>>
>
> Thanks, applied to SCSI branch.  We'll see whether this hits 1.2.

This patch only matters for the use case of having empty/blank media
loaded into an iscsi cdrom.
I doubt being able to "burn" data to an iscsi drive is very important or urgent,
so it is fine with me to delay until post 1.2

regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH 2/2] ISCSI: Force scsi-generic for MMC with blank disks

2012-08-18 Thread ronnie sahlberg
On Sun, Aug 19, 2012 at 8:16 AM, Paolo Bonzini  wrote:
> Il 19/08/2012 00:02, ronnie sahlberg ha scritto:
>> Neither of them work, basically because in
>> block.c:find_image_format()
>>
>> if bs->sg is not set in
>>
>>  if (bs->sg || !bdrv_is_inserted(bs)) {
>>
>> then we continue to
>>
>>   ret = bdrv_pread(bs, 0, buf, sizeof(buf));
>>
>> which fails with an error.
>> So this patch is basically to prevent find_image_format() from trying
>> to read from a blank disk.
>>
>> Maybe there is a better way to do this?
>
> Yeah, I think in this case find_image_format should just use raw.

Ok, so that is basically what the patch does. It forces bs->sg==true
so that we pick "raw" right there instead of trying to read from the
device.


So you are happy with the patch ?


regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH 2/2] ISCSI: Force scsi-generic for MMC with blank disks

2012-08-18 Thread ronnie sahlberg
On Sun, Aug 19, 2012 at 8:20 AM, Paolo Bonzini  wrote:
> Il 19/08/2012 00:02, ronnie sahlberg ha scritto:
>> On Sun, Aug 19, 2012 at 7:58 AM, Paolo Bonzini  wrote:
>>> Il 17/08/2012 04:36, Ronnie Sahlberg ha scritto:
>>>> There is no bdrv_* API for the commands for burning a blank MMC disk
>>>> so when iSCSI LUNs are specified and the LUN is a MMC device with
>>>> 0 available blocks. This is a blank disk so force scsi generic.
>>>>
>>>> This allows the guest to talk directly to the target to burn data on
>>>> the disk.
>>>>
>>>> Signed-off-by: Ronnie Sahlberg 
>>>
>>> What happens without the patch?  It's ok that scsi-{hd,cd} does not
>>> work, but do scsi-{block,generic} work without the patch?
>>>
>>
>> Neither of them work, basically because in
>> block.c:find_image_format()
>>
>> if bs->sg is not set in
>>
>>  if (bs->sg || !bdrv_is_inserted(bs)) {
>>
>> then we continue to
>>
>>   ret = bdrv_pread(bs, 0, buf, sizeof(buf));
>>
>> which fails with an error.
>> So this patch is basically to prevent find_image_format() from trying
>> to read from a blank disk.
>>
>> Maybe there is a better way to do this?
>
> Ah, BTW does using -drive ...,format=raw work, or does it hiccup later
> again on something else?
>

format=raw works !

That then begs the question if would it be possible to force
format=raw always for iscsi devices?

A iscsi device as far as I can see would always just be a raw block
device and there would never be a "header" on such devices
so maybe that would be a solution?


regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH RFT 0/3] iscsi: fix NULL dereferences / races between task completion and abort

2012-08-20 Thread ronnie sahlberg
On Mon, Aug 20, 2012 at 6:12 PM, Stefan Priebe - Profihost AG
 wrote:
> Hi Ronnie,
>
> Am 20.08.2012 10:08, schrieb Paolo Bonzini:
>
>> That's because the "big QEMU lock" is held by the thread that called
>> qemu_aio_cancel.
>>
>>> and i also see
>>> no cancellation message in kernel log.
>>
>>
>> And that's because the UNMAP actually ultimately succeeds.  You'll
>> probably see soft lockup messages though.
>>
>> The solution here is to bump the timeout of the UNMAP command (either in
>> the kernel or in libiscsi, I didn't really understand who's at fault).
>
>
> What's your suggestion / idea about that?
>

There are no timeouts in libiscsi itself.
But you can probably tweak it through the guest kernel :


$ cat /sys/bus/scsi/devices/0\:0\:0\:0/timeout
30

should be the default scsi timeout for this device, and

$ cat /sys/bus/scsi/devices/0\:0\:0\:0/queue_depth
31

would be how many concurrent i/o that the guest will drive to the device.


When performing the UNMAPS, maybe setting timeout to something really
big, and at the same time also dropping queue-depth to something
really small so that the guest kernel will not be able to send so many
UNMAPs concurrently.


ronnie s



[Qemu-devel] [PATCH] iSCSI: Add support for SG_IO to bdrv_ioctl()/iscsi_ioctl()

2012-08-20 Thread Ronnie Sahlberg
Paolo, List

Please find a patch that adds emulation of SG_IO to the synchronous function
bdrv_ioctl()/iscsi_ioctl().
Previously we have only supported emulation for this ioctl in iSCSI for the
asynchronous function iscsi_aio_ioctl() since that is the only place 
scsi-generic uses SG_IO.

By adding support for SG_IO to iscsi_ioctl() this makes scsi-block and 
scsi-disk work too.


Since scsi-block/scsi-disk never worked with iscsi, and only scsi-generic 
worked, this is not a new regression. So whether this should go in now or wait 
until post 1.2 is for your disgression.

regards
ronnie sahlberg





[Qemu-devel] [PATCH] iSCSI: Add support for SG_IO in bdrv_ioctl()

2012-08-20 Thread Ronnie Sahlberg
We need to support SG_IO in the synchronous bdrv_ioctl() since this
is used by scsi-block

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |  109 -
 1 files changed, 108 insertions(+), 1 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 993a86d..9e98bfe 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -548,7 +548,8 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
 
 #define SG_ERR_DRIVER_SENSE0x08
 
-if (status == SCSI_STATUS_CHECK_CONDITION && acb->task->datain.size >= 2) {
+if (status == SCSI_STATUS_CHECK_CONDITION
+&& acb->task->datain.size >= 2) {
 int ss;
 
 acb->ioh->driver_status |= SG_ERR_DRIVER_SENSE;
@@ -633,9 +634,53 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState 
*bs,
 return &acb->common;
 }
 
+struct IoctlTask {
+int status;
+int complete;
+sg_io_hdr_t *ioh;
+struct scsi_task *task;
+};
+
+static void
+iscsi_ioctl_cb(struct iscsi_context *iscsi, int status,
+ void *command_data, void *opaque)
+{
+struct IoctlTask *itask = opaque;
+
+if (status < 0) {
+error_report("Failed to ioctl(SG_IO) to iSCSI lun. %s",
+ iscsi_get_error(iscsi));
+itask->status = -EIO;
+}
+
+itask->ioh->driver_status = 0;
+itask->ioh->host_status   = 0;
+itask->ioh->resid = 0;
+
+#define SG_ERR_DRIVER_SENSE0x08
+
+if (status == SCSI_STATUS_CHECK_CONDITION
+&& itask->task->datain.size >= 2) {
+int ss;
+
+itask->ioh->driver_status |= SG_ERR_DRIVER_SENSE;
+
+itask->ioh->sb_len_wr = itask->task->datain.size - 2;
+ss = (itask->ioh->mx_sb_len >= itask->ioh->sb_len_wr) ?
+ itask->ioh->mx_sb_len : itask->ioh->sb_len_wr;
+memcpy(itask->ioh->sbp, &itask->task->datain.data[2], ss);
+}
+
+itask->complete = 1;
+}
+
 static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
 {
 IscsiLun *iscsilun = bs->opaque;
+struct iscsi_context *iscsi = iscsilun->iscsi;
+struct IoctlTask itask;
+struct scsi_task *task;
+struct iscsi_data data;
 
 switch (req) {
 case SG_GET_VERSION_NUM:
@@ -644,6 +689,68 @@ static int iscsi_ioctl(BlockDriverState *bs, unsigned long 
int req, void *buf)
 case SG_GET_SCSI_ID:
 ((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
 break;
+case SG_IO:
+itask.ioh = buf;
+task = malloc(sizeof(struct scsi_task));
+if (task == NULL) {
+error_report("iSCSI: Failed to allocate task for scsi command. %s",
+ iscsi_get_error(iscsi));
+return -1;
+}
+memset(task, 0, sizeof(struct scsi_task));
+
+switch (itask.ioh->dxfer_direction) {
+case SG_DXFER_TO_DEV:
+task->xfer_dir = SCSI_XFER_WRITE;
+break;
+case SG_DXFER_FROM_DEV:
+task->xfer_dir = SCSI_XFER_READ;
+break;
+default:
+task->xfer_dir = SCSI_XFER_NONE;
+break;
+}
+task->cdb_size = itask.ioh->cmd_len;
+memcpy(&task->cdb[0], itask.ioh->cmdp, itask.ioh->cmd_len);
+task->expxferlen = itask.ioh->dxfer_len;
+
+if (task->xfer_dir == SCSI_XFER_WRITE) {
+data.data = itask.ioh->dxferp;
+data.size = itask.ioh->dxfer_len;
+}
+
+if (iscsi_scsi_command_async(iscsi, iscsilun->lun, task,
+ iscsi_ioctl_cb,
+ (task->xfer_dir == SCSI_XFER_WRITE) ?
+ &data : NULL,
+ &itask) != 0) {
+scsi_free_scsi_task(task);
+return -1;
+}
+
+/* tell libiscsi to read straight into the buffer we got from ioctl */
+if (task->xfer_dir == SCSI_XFER_READ) {
+scsi_task_add_data_in_buffer(task,
+ itask.ioh->dxfer_len,
+ itask.ioh->dxferp);
+}
+
+itask.complete = 0;
+itask.status   = 0;
+itask.task = task;
+while (!itask.complete) {
+iscsi_set_events(iscsilun);
+qemu_aio_wait();
+}
+scsi_free_scsi_task(task);
+
+if (itask.status != 0) {
+error_report("iSCSI: Failed to send async command to target : %s",
+ iscsi_get_error(iscsi));
+return -1;
+}
+
+return 0;
 default:
 return -1;
 }
-- 
1.7.3.1




Re: [Qemu-devel] [PATCH v2 1/3] net: asynchronous send/receive infrastructure for net/socket.c

2012-08-21 Thread ronnie sahlberg
In
net_socket_update_fd_handler()

shouldnt you call qemu_notify_event() if any of the handlers have
changed from NULL to non-NULL ?
or else it might take a while before the change takes effect.


regards
ronnie sahlberg

On Wed, Aug 22, 2012 at 1:52 AM, Stefan Hajnoczi
 wrote:
> The net/socket.c net client is not truly asynchronous.  This patch
> borrows the qemu_set_fd_handler2() code from net/tap.c as the basis for
> proper asynchronous send/receive.
>
> Only read packets from the socket when the peer is able to receive.
> This avoids needless queuing.
>
> Later patches implement asynchronous send.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  net/socket.c |   58 
> --
>  1 file changed, 52 insertions(+), 6 deletions(-)
>
> diff --git a/net/socket.c b/net/socket.c
> index c172c24..54e32f0 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -42,9 +42,51 @@ typedef struct NetSocketState {
>  unsigned int packet_len;
>  uint8_t buf[4096];
>  struct sockaddr_in dgram_dst; /* contains inet host and port destination 
> iff connectionless (SOCK_DGRAM) */
> +IOHandler *send_fn;   /* differs between SOCK_STREAM/SOCK_DGRAM 
> */
> +bool read_poll;   /* waiting to receive data? */
> +bool write_poll;  /* waiting to transmit data? */
>  } NetSocketState;
>
>  static void net_socket_accept(void *opaque);
> +static void net_socket_writable(void *opaque);
> +
> +/* Only read packets from socket when peer can receive them */
> +static int net_socket_can_send(void *opaque)
> +{
> +NetSocketState *s = opaque;
> +
> +return qemu_can_send_packet(&s->nc);
> +}
> +
> +static void net_socket_update_fd_handler(NetSocketState *s)
> +{
> +qemu_set_fd_handler2(s->fd,
> + s->read_poll  ? net_socket_can_send : NULL,
> + s->read_poll  ? s->send_fn : NULL,
> + s->write_poll ? net_socket_writable : NULL,
> + s);
> +}
> +
> +static void net_socket_read_poll(NetSocketState *s, bool enable)
> +{
> +s->read_poll = enable;
> +net_socket_update_fd_handler(s);
> +}
> +
> +static void net_socket_write_poll(NetSocketState *s, bool enable)
> +{
> +s->write_poll = enable;
> +net_socket_update_fd_handler(s);
> +}
> +
> +static void net_socket_writable(void *opaque)
> +{
> +NetSocketState *s = opaque;
> +
> +net_socket_write_poll(s, false);
> +
> +qemu_flush_queued_packets(&s->nc);
> +}
>
>  /* XXX: we consider we can send the whole packet without blocking */
>  static ssize_t net_socket_receive(NetClientState *nc, const uint8_t *buf, 
> size_t size)
> @@ -81,7 +123,8 @@ static void net_socket_send(void *opaque)
>  } else if (size == 0) {
>  /* end of connection */
>  eoc:
> -qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
> +net_socket_read_poll(s, false);
> +net_socket_write_poll(s, false);
>  if (s->listen_fd != -1) {
>  qemu_set_fd_handler(s->listen_fd, net_socket_accept, NULL, s);
>  }
> @@ -152,7 +195,8 @@ static void net_socket_send_dgram(void *opaque)
>  return;
>  if (size == 0) {
>  /* end of connection */
> -qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
> +net_socket_read_poll(s, false);
> +net_socket_write_poll(s, false);
>  return;
>  }
>  qemu_send_packet(&s->nc, s->buf, size);
> @@ -243,7 +287,8 @@ static void net_socket_cleanup(NetClientState *nc)
>  {
>  NetSocketState *s = DO_UPCAST(NetSocketState, nc, nc);
>  if (s->fd != -1) {
> -qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
> +net_socket_read_poll(s, false);
> +net_socket_write_poll(s, false);
>  close(s->fd);
>  s->fd = -1;
>  }
> @@ -314,8 +359,8 @@ static NetSocketState 
> *net_socket_fd_init_dgram(NetClientState *peer,
>
>  s->fd = fd;
>  s->listen_fd = -1;
> -
> -qemu_set_fd_handler(s->fd, net_socket_send_dgram, NULL, s);
> +s->send_fn = net_socket_send_dgram;
> +net_socket_read_poll(s, true);
>
>  /* mcast: save bound address as dst */
>  if (is_connected) {
> @@ -332,7 +377,8 @@ err:
>  static void net_socket_connect(void *opaque)
>  {
>  NetSocketState *s = opaque;
> -qemu_set_fd_handler(s->fd, net_socket_send, NULL, s);
> +s->send_fn = net_socket_send;
> +net_socket_read_poll(s, true);
>  }
>
>  static NetClientInfo net_socket_info = {
> --
> 1.7.10.4
>
>



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-24 Thread ronnie sahlberg
On Tue, Jan 24, 2012 at 5:07 AM, Eric Blake  wrote:
>
> Can -readconfig support reading from an inherited fd, rather than only
> taking a file name that qemu has to open()?
>
...
>
> Can you give an actual command line that uses -readconfig, as part of
> your example?
>

Thanks for the review Eric.
I will update the patch with an example using -readconfig and fix the typo.


Yes,  -readconfig can read from a pipe()  and it can also read from an
arbitrary file descriptor.

Reading from a pipe :
cat iscsi.conf | ./x86_64-softmmu/qemu-system-x86_64 -enable-kvm
-display vnc=127.0.0.1:0 -drive
file=iscsi://127.0.0.1/iqn.ronnie.test/1 -readconfig /proc/self/fd/0

Read from an arbitrary filedescriptor inherited from the parent process :
9

Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-25 Thread ronnie sahlberg
Fair enough.

I will send a separate tiny patch to add 'fd:' support to specify
to qemu to -readconfig from a preexisting filedescriptor.

Other protocols like 'exec:' can easily be added later as needed.


regards
ronnie sahlberg

On Thu, Jan 26, 2012 at 2:57 AM, Eric Blake  wrote:
> On 01/24/2012 11:47 PM, ronnie sahlberg wrote:
>> Read from an arbitrary filedescriptor inherited from the parent process :
>> 9> vnc=127.0.0.1:0 -drive file=iscsi://127.0.0.1/iqn.ronnie.test/1
>> -readconfig /proc/self/fd/9
>
> That requires the existence of procfs, which is not portable (although
> it does work on Linux).  I'd rather see:
>
> -readconfig fd:9
>
> which matches things for -incoming; that is, if -readconfig starts with
> '/' or '.', it is a filename; otherwise, it is a protocol:value
> designation, where we recognize at least the fd: protocol where a value
> is the incoming fd, but we could also recognize things like exec:
> protocol which is an arbitrary command to use via popen.
>
>> I imagine you would pipe() then fork() and pass the read side of your
>> pipe to qemu here ?
>
> Yes, the idea is that libvirt would rather pipe() and then pass the read
> size fd to qemu, so that libvirt's handling of the decrypted secret
> information is only ever passed over the pipe and not stored on disk.
>
>> If this works well or at least in some acceptable form it might be
>> useful for other users needing to pass sensitive config data into QEMU
>> too?
>
> Yes, the fd: notation of -incoming should be reusable in multiple
> contexsts, including any other location where sensitive information must
> be passed in.
>
> --
> Eric Blake   ebl...@redhat.com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>



[Qemu-devel] [PATCH 0/0] Allow -readconfig to read from a pre-existing filedescriptor

2012-01-25 Thread Ronnie Sahlberg
List,

Please find attached a trivial patch to allow -readconfig to read from a 
pre-existing filedescriptor instead of a file off disk.

Syntax is '-readconfig fd:' to read from filedescriptor .

This is useful for example for libvirt which allow it to pass configuration 
data, including possibly sensitive such, into qemu without this data hitting 
the filesystem or a disk.

regards
ronnie sahlberg





[Qemu-devel] [PATCH] READCONFIG: Allow reading the configuration from a pre-existing filedescriptor

2012-01-25 Thread Ronnie Sahlberg
Update the readconfig filename parsing to allow specifying an existing, 
inherited, filedescriptor as 'fd:'
This is useful when you want to pass potentially sensitive onfiguration data to 
qemu without having it hit the filesystem/stable-storage

Signed-off-by: Ronnie Sahlberg 
---
 qemu-config.c   |   15 +--
 qemu-options.hx |3 ++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index b030205..c12c5eb 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -770,8 +770,19 @@ out:
 
 int qemu_read_config_file(const char *filename)
 {
-FILE *f = fopen(filename, "r");
-int ret;
+FILE *f;
+int ret, fd;
+
+if (strncmp(filename, "fd:", 3)) {
+f = fopen(filename, "r");
+} else {
+errno = 0;
+fd = strtol(filename + 3, NULL, 10);
+if (errno != 0) {
+return -errno;
+}
+f = fdopen(fd, "r");
+}
 
 if (f == NULL) {
 return -errno;
diff --git a/qemu-options.hx b/qemu-options.hx
index 3a07ae8..653ff2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2577,7 +2577,8 @@ DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig,
 STEXI
 @item -readconfig @var{file}
 @findex -readconfig
-Read device configuration from @var{file}.
+Read device configuration from @var{file}. Use 'fd:' as filename
+to read from an existing, inherited, filedesriptor.
 ETEXI
 DEF("writeconfig", HAS_ARG, QEMU_OPTION_writeconfig,
 "-writeconfig \n"
-- 
1.7.3.1




[Qemu-devel] [PATCH 0/0] Add configuration variables for iSCSI

2012-01-25 Thread Ronnie Sahlberg
Kevin, List
Please review and/or apply.


This is version 3 of the patch to add configuration variables for iSCSI.
Version 2 added the feature to specify configuration blocks that apply to a 
specific target name, allowing qemu to use different settings if/when 
connecting one guest to multiple different iscsi targets.
Version 3 fixes a typo and adds an example to the documentation on how to use 
-readconfig for reading the configuration from a file


This patch adds configuration variables for iSCSI to set
initiator-name to use when logging in to the target,
which type of header-digest to negotiate with the target
and username and password for CHAP authentication.

This allows specifying a initiator-name either from the command line
-iscsi initiator-name=iqn.2004-01.com.example:test
or from a configuration file included with -readconfig
[iscsi]
 initiator-name = iqn.2004-01.com.example:test
 header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
 user = CHAP username
 password = CHAP password

In the configuration file it is also possible to set a target specific
configuratyion block using the header
[iscsi "iqn.target.name"]

When a iscsi session is initialized, it will first try to use a configuration
section that matches the target name.
If no such block is found, it will fall-back to try the default [iscsi] section 
instead.

regards
ronnie sahlberg




[Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-25 Thread Ronnie Sahlberg
This patch adds configuration variables for iSCSI to set
initiator-name to use when logging in to the target,
which type of header-digest to negotiate with the target
and username and password for CHAP authentication.

This allows specifying a initiator-name either from the command line
-iscsi initiator-name=iqn.2004-01.com.example:test
or from a configuration file included with -readconfig
[iscsi]
  initiator-name = iqn.2004-01.com.example:test
  header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
  user = CHAP username
  password = CHAP password

If you use several different targets, you can also configure this on a per
target basis by using a group name:
[iscsi "iqn.target.name"]
...

The configuration file can be read using -readconfig.
Example :
qemu-system-i386 -drive file=iscsi://127.0.0.1/iqn.ronnie.test/1
 -readconfig iscsi.conf

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c   |  139 +++
 qemu-config.c   |   27 +++
 qemu-doc.texi   |   54 +-
 qemu-options.hx |   16 +--
 vl.c|8 +++
 5 files changed, 229 insertions(+), 15 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 938c568..bd3ca11 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -455,6 +455,109 @@ iscsi_connect_cb(struct iscsi_context *iscsi, int status, 
void *command_data,
 }
 }
 
+static int parse_chap(struct iscsi_context *iscsi, const char *target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *user = NULL;
+const char *password = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return 0;
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return 0;
+}
+}
+
+user = qemu_opt_get(opts, "user");
+if (!user) {
+return 0;
+}
+
+password = qemu_opt_get(opts, "password");
+if (!password) {
+error_report("CHAP username specified but no password was given");
+return -1;
+}
+
+if (iscsi_set_initiator_username_pwd(iscsi, user, password)) {
+error_report("Failed to set initiator username and password");
+return -1;
+}
+
+return 0;
+}
+
+static void parse_header_digest(struct iscsi_context *iscsi, const char 
*target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *digest = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return;
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return;
+}
+}
+
+digest = qemu_opt_get(opts, "header-digest");
+if (!digest) {
+return;
+}
+
+if (!strcmp(digest, "CRC32C")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_CRC32C);
+} else if (!strcmp(digest, "NONE")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE);
+} else if (!strcmp(digest, "CRC32C-NONE")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_CRC32C_NONE);
+} else if (!strcmp(digest, "NONE-CRC32C")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE_CRC32C);
+} else {
+error_report("Invalid header-digest setting : %s", digest);
+}
+}
+
+static char *parse_initiator_name(const char *target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *name = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+}
+
+name = qemu_opt_get(opts, "initiator-name");
+if (!name) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+
+return g_strdup(name);
+}
+
 /*
  * We support iscsi url's on the form
  * iscsi://[%@][:]//
@@ -465,6 +568,7 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 struct iscsi_context *iscsi = NULL;
 struct iscsi_url *iscsi_url = NULL;
 struct IscsiTask task;
+char *initiator_name = NULL;
 int ret;
 
 if ((BDRV_SECTOR_SIZE % 512) != 0) {
@@ -474,16 +578,6 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 return -EINVAL;
 }
 
-memset(iscsilun, 0, sizeof(IscsiLun));
-
-/* Should really append the KVM name after the ':' here */
-iscsi = iscsi_create_context("iqn.2008-11.org.linux-kvm:");
-if (iscsi == NULL) {
-error_report("iSCSI: Failed to create 

Re: [Qemu-devel] [PATCH v2 11/33] scsi-disk: support DVD profile in GET CONFIGURATION

2012-01-26 Thread ronnie sahlberg
Hi Paolo, List,

I see you are very active in the MMC emulation for qemu.
I have written several MMC emulators, the third of which (in STGT) is
open source.

I did the MMC emulation in STGT, (see README.mmc  for how to use it as
"blank burnable media")


Do you have DVD+/-R + morhping implemented yet?
STGT has a feature where once an initiator has finished "burning" a
new image to a "blank" disk
where STGT automagically morphs the LUN from being profile:dvd+/-r
to be profile:dvd-rom
to indicate it is no longer a blank burnable disk   but a readonly
disk containing an iso.

Would that be of interest to you or the community?   It is a few
thousand lines of semi-trivial code to port but nothing
that hairy.


Im happy to port it to qemu if there is need/desire/interest for it.



There was a bug in STGT that made it impossible to build a virtual
jukebox with writeable/burnable media for about 6 months and no one
noticed, so maybe not too many people use such features :-(




regards
ronnie sahlberg


On Thu, Jan 26, 2012 at 7:09 PM, Paolo Bonzini  wrote:
> On 01/25/2012 09:13 PM, Artyom Tarasenko wrote:
>>
>> On 1/25/12, Paolo Bonzini  wrote:
>>>
>>> On 01/25/2012 05:34 PM, Artyom Tarasenko wrote:
>>>>
>>>> This patch produces the following error when booting Solaris/SPARC:
>>>>
>>>> WARNING:
>>>> /iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880
>>>> (esp0):
>>>>          data transfer overrun: current esp state:
>>>> esp:            State=DATA Last State=DATA_DONE
>>>> esp:            Latched stat=0x91   intr=0x10   fifo
>>>> 0x0
>>>> esp:            last msg out: IDENTIFY; last msg in: COMMAND COMPLETE
>>>> esp:            DMA csr=0xa4240030
>>>> esp:            addr=fc005860 dmacnt=0 last=fc005858 last_cnt=8
>>>> esp:            Cmd dump for Target 6 Lun 0:
>>>> esp:            cdblen=10, cdb=[ 0x46 0x2 0x0 0x0 0x0 0x0 0x0 0x0 0x8
>>>> 0x0
>>>> ]
>>>> esp:            pkt_state=0xf   pkt_flags=0x1
>>>> pkt_statistics=0x0
>>>> esp:            cmd_flags=0x462 cmd_timeout=60
>>>>
>>>> Particularly it seems that Solaris is not happy with respose length
>>>>>
>>>>> 8. Any idea, why?
>
>
> The response length is not 8, it is 40; but Solaris only expected 8 (see the
> CDB in the error dump).
>
>
>>>> $ sparc-softmmu/qemu-system-sparc -bios ~/ss5-170.bin -m 32 -nographic
>>>> -drive if=scsi,unit=6,media=cdrom,file=~/Solaris9cd1.iso
>>>
>>>
>>>   From a quick look, it's the same bug that was reported recently on
>>> INQUIRY commands.
>>
>>
>> If you need I can provide more input.
>
>
> You can test http://article.gmane.org/gmane.comp.emulators.qemu/133434 if
> you have time.
>
> Paolo
>



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-26 Thread ronnie sahlberg
Kevin,

Collissions are bad, but what about

IF ! STRNCMP (filename, "/proc/self/fd/", 14)  THEN
 fopen(filename, "r")
ELSE
fdopen(atoi(filename+14), "r")
FI

modulo better validation for the atio() arguments.


Probability of anyone using "/proc/self/fd/" as a prefix for normal
files is very small.
Small enough it will be justifiable to say "do that and you are on your own". ?


regards
ronnie sahlberg

On Thu, Jan 26, 2012 at 8:08 PM, Kevin Wolf  wrote:
> Am 25.01.2012 16:57, schrieb Eric Blake:
>> On 01/24/2012 11:47 PM, ronnie sahlberg wrote:
>>> Read from an arbitra filedescriptor inherited from the parent process :
>>> 9>> vnc=127.0.0.1:0 -drive file=iscsi://127.0.0.1/iqn.ronnie.test/1
>>> -readconfig /proc/self/fd/9
>>
>> That requires the existence of procfs, which is not portable (although
>> it does work on Linux).  I'd rather see:
>>
>> -readconfig fd:9
>>
>> which matches things for -incoming; that is, if -readconfig starts with
>> '/' or '.', it is a filename; otherwise, it is a protocol:value
>> designation, where we recognize at least the fd: protocol where a value
>> is the incoming fd, but we could also recognize things like exec:
>> protocol which is an arbitrary command to use via popen.
>
> Magic prefixes like this have one big problem: What if someone has a
> config file called "fd:9"? We have the very same problem with protocols
> in the block layer and while in the general case it's a convenient
> syntax, we've come to hate it in cases where it misinterprets things.
>
> Kevin



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-26 Thread ronnie sahlberg
Ok   so what about this

You use a filename starting with "/proc/self/fd/"  and you dont have a
proc filesystem mounted? you are on your own!


regards
ronnie sahlberg



On Thu, Jan 26, 2012 at 8:27 PM, Kevin Wolf  wrote:
> Am 26.01.2012 10:18, schrieb ronnie sahlberg:
>> Kevin,
>>
>> Collissions are bad, but what about
>>
>> IF ! STRNCMP (filename, "/proc/self/fd/", 14)  THEN
>>          fopen(filename, "r")
>> ELSE
>>         fdopen(atoi(filename+14), "r")
>> FI
>>
>> modulo better validation for the atio() arguments.
>>
>>
>> Probability of anyone using "/proc/self/fd/" as a prefix for normal
>> files is very small.
>> Small enough it will be justifiable to say "do that and you are on your 
>> own". ?
>
> Interesting idea. Maybe that could work.
>
> Kevin


0001-READCONFIG-make-proc-self-fd-a-magic-prefix-to-refer.patch.gz
Description: GNU Zip compressed data
From 65aa496d77b839f1c48745fc5545e3e6772f724c Mon Sep 17 00:00:00 2001
From: Ronnie Sahlberg 
Date: Thu, 26 Jan 2012 20:47:40 +1100
Subject: [PATCH] READCONFIG: make /proc/self/fd/ a magic prefix to refer to a specific
 filedesriptor inherited from the parent.

Signed-off-by: Ronnie Sahlberg 
---
 qemu-config.c   |   16 ++--
 qemu-options.hx |3 ++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index b030205..9e709d4 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -770,8 +770,20 @@ out:
 
 int qemu_read_config_file(const char *filename)
 {
-FILE *f = fopen(filename, "r");
-int ret;
+FILE *f;
+int ret, fd;
+char *ptr = NULL;
+
+if (strncmp(filename, "/proc/self/fd/", 14)) {
+f = fopen(filename, "r");
+} else {
+errno = 0;
+fd = strtol(filename + 14, &ptr, 10);
+if (errno != 0 || ptr == filename + 14 || *ptr != '\0') {
+return -EINVAL;
+}
+f = fdopen(fd, "r");
+}
 
 if (f == NULL) {
 return -errno;
diff --git a/qemu-options.hx b/qemu-options.hx
index 3a07ae8..e54af58 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2577,7 +2577,8 @@ DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig,
 STEXI
 @item -readconfig @var{file}
 @findex -readconfig
-Read device configuration from @var{file}.
+Read device configuration from @var{file}. Use '/proc/self/fd/' as filename
+to read from an existing, inherited, filedesriptor.
 ETEXI
 DEF("writeconfig", HAS_ARG, QEMU_OPTION_writeconfig,
 "-writeconfig \n"
-- 
1.7.3.1



Re: [Qemu-devel] qemu-img convert with block driver without .bdrv_create (like iscsi)

2012-10-25 Thread ronnie sahlberg
On Thu, Oct 25, 2012 at 1:02 AM, Kevin Wolf  wrote:
> Am 25.10.2012 09:52, schrieb Paolo Bonzini:
>> Il 25/10/2012 09:46, Kevin Wolf ha scritto:
>>>>> 1)add a .bdrv_create in block/iscsi.c ?
>>>>>
>>>>> (like host_device block driver, only open/close the device and check if 
>>>>> size if big enough)
>>> Yes, this is the right way.
>>
>> Could it be a default implementation of .bdrv_create (i.e. something
>> you'd do in bdrv_create if the protocol doesn't have it)?
>
> No, there are block drivers that really can't create images. They should
> keep failing.

Technically, you can not create new LUNs via the iscsi protocol
itself, you can only access pre-existing luns
that have been created by some out-of-band method.

So basically, with iscsi you can only ever access already preexisting luns.

In that case I think requiring a .bdrv_create feels wrong. We are not
creating a new LUN here but just opening an already existing LUN.


What about changing bdrv_create() so that IF there is no .bdrv_create
method, then assume it might be a pre-existing file in which case we
fallback to use .bdrv_open instead.

regards
ronnie sahlberg



Re: [Qemu-devel] qemu-img convert with block driver without .bdrv_create (like iscsi)

2012-10-25 Thread ronnie sahlberg
On Thu, Oct 25, 2012 at 6:58 AM, Paolo Bonzini  wrote:
> Il 25/10/2012 15:41, ronnie sahlberg ha scritto:
>> On Thu, Oct 25, 2012 at 1:02 AM, Kevin Wolf  wrote:
>>> Am 25.10.2012 09:52, schrieb Paolo Bonzini:
>>>> Il 25/10/2012 09:46, Kevin Wolf ha scritto:
>>>>>>> 1)add a .bdrv_create in block/iscsi.c ?
>>>>>>>
>>>>>>> (like host_device block driver, only open/close the device and check if 
>>>>>>> size if big enough)
>>>>> Yes, this is the right way.
>>>>
>>>> Could it be a default implementation of .bdrv_create (i.e. something
>>>> you'd do in bdrv_create if the protocol doesn't have it)?
>>>
>>> No, there are block drivers that really can't create images. They should
>>> keep failing.
>>
>> Technically, you can not create new LUNs via the iscsi protocol
>> itself, you can only access pre-existing luns
>> that have been created by some out-of-band method.
>>
>> So basically, with iscsi you can only ever access already preexisting luns.
>>
>> In that case I think requiring a .bdrv_create feels wrong. We are not
>> creating a new LUN here but just opening an already existing LUN.
>
> The problem is that bdrv_create is overloaded to mean both "create the
> backing storage" and "format the image".  Only the latter applies to
> iSCSI and, in general, as far as protocols are concerned bdrv_create is
> usually a no-op.
>
> However, Kevin said he prefers to have an explicit override of
> bdrv_create for iSCSI.  Can you implement that?  (I'll then do the same
> for NBD).
>

Yepp. Since there is consensus.
We can do a .bdrv_create for iscsi.c

regards
ronnie sahlberg



Re: [Qemu-devel] Ubuntu/Debian Installer + Virtio-SCSI -> Bad ram pointer

2012-10-30 Thread ronnie sahlberg
But older distros/kernels work fine?

Can you take a network trace?

About half a year there was an issue where recent kernels had added
support to start using new scsi opcodes,  but the qemu functions that
determine "which transfer direction is used for this opcode" had not
yet been updated, so that the opcode was sent with the wrong transfer
direction.

That caused the guests memory to be overwritten and crash.

I dont have (easy) access to the git tree right now, but it was a
patch for the ATA_PASSTHROUGH command that fixed that.

Could you take a network trace and check maybe this is something
similar ?  I.e. does the guest send an "unusual" scsi opcode just
before the crash ?


regards
ronnie sahlberg


On Tue, Oct 30, 2012 at 12:37 PM, Peter Lieven  wrote:
> Am 30.10.2012 19:27, schrieb Stefan Hajnoczi:
>
>> On Tue, Oct 30, 2012 at 4:56 PM, Peter Lieven  wrote:
>>>
>>> On 30.10.2012 09:32, Stefan Hajnoczi wrote:
>>>>
>>>> On Mon, Oct 29, 2012 at 03:09:37PM +0100, Peter Lieven wrote:
>>>>>
>>>>> Hi,
>>>>
>>>> Bug subject should be virtio-blk, not virtio-scsi.  virtio-scsi is a
>>>> different virtio device type from virtoi-blk and is not present in the
>>>> backtrace you posted.
>>>>
>>>> Sounds pedantic but I want to make sure this gets chalked up against the
>>>> right device :).
>>>>
>>>>> If I try to Install Ubuntu 12.04 LTS / 12.10 64-bit on a virtio
>>>>> storage backend that supports iSCSI
>>>>> qemu-kvm crashes reliably with the following error:
>>>>
>>>> Are you using vanilla qemu-kvm-1.2.0 or are there patches applied?
>>>>
>>>> Have you tried qemu-kvm.git/master?
>>>>
>>>> Have you tried a local raw disk image to check whether libiscsi is
>>>> involved?
>>>>
>>>>> Bad ram pointer 0x3039303620008000
>>>>>
>>>>> This happens directly after the confirmation of the Timezone before
>>>>> the Disk is partitioned.
>>>>>
>>>>> If I specify  -global virtio-blk-pci.scsi=off in the cmdline this
>>>>> does not happen.
>>>>>
>>>>> Here is a stack trace:
>>>>>
>>>>> Thread 1 (Thread 0x77fee700 (LWP 8226)):
>>>>> #0 0x763c0a10 in abort () from /lib/x86_64-linux-gnu/libc.so.6
>>>>> No symbol table info available.
>>>>> #1 <https://github.com/sahlberg/libiscsi/issues/1>
>>>>> 0x557b751d in qemu_ram_addr_from_host_nofail (
>>>>> ptr=0x3039303620008000) at /usr/src/qemu-kvm-1.2.0/exec.c:2835
>>>>> ram_addr = 0
>>>>> #2 <https://github.com/sahlberg/libiscsi/issues/2>
>>>>> 0x557b9177 in cpu_physical_memory_unmap (
>>>>> buffer=0x3039303620008000, len=4986663671065686081, is_write=1,
>>>>> access_len=1) at /usr/src/qemu-kvm-1.2.0/exec.c:3645
>>>>
>>>> buffer and len are ASCII junk.  It appears to be hex digits and it's not
>>>> clear where they come from.
>>>>
>>>> It would be interesting to print *elem one stack frame up in #3
>>>> virtqueue_fill() to show the iovecs and in/out counts.
>>>
>>>
>>> (gdb) print *elem
>>
>> Great, thanks for providing this info:
>>
>>> $6 = {index = 3, out_num = 2, in_num = 4, in_addr = {1914920960,
>>> 1916656688,
>>>  2024130072, 2024130088, 0 , 4129, 93825009696000,
>>>  140737328183160, 0 }, out_addr = {2024130056,
>>>  2038414056, 0, 8256, 4128, 93824999311936, 0, 3, 0 >> times>,
>>>  12385, 93825009696000, 140737328183160, 0 },
>>
>> Up to here everything is fine.
>>
>>> in_sg =
>>> {{
>>>iov_base = 0x3039303620008000, iov_len = 4986663671065686081}, {
>>>iov_base = 0x383038454635, iov_len = 3544389261899019573}, {
>>
>> The fields are bogus, in_sg has been overwritten with ASCII data.
>> Unfortunately I don't see any hint of where this ASCII data came from
>> yet.
>>
>> The hdr fields you provided in stack frame #6 show that in_sg was
>> overwritten during or after the bdrv_ioctl() call.  We pulled valid
>> data out of the vring and mapped buffers correctly.  But something is
>> overwriting in_sg and when we complete the request we blow up due to
>> the bogus values.
>
> Ok. What I have to mention. I've been testing with qemu-kvm 1.2.0
> an

Re: [Qemu-devel] QEMU question: is eventfd not thread safe?

2012-07-01 Thread ronnie sahlberg
Hi,

As Paolo said,
I hit this with block/iscsi.c a few months back.
Then about a month back I recall something that looks alkmost
identical to this hit the IDE driver on the KVM list.
( At least the symptoms looked just like my symptoms, and the KVM guys
managed to bisect it down to the exact same patch that I did that
would expose the bug in block.iscsi.c, namely the lack of calling
qemu_notify_event() )

If we have hit a problem 3 times recently due to developers not
realizing the importance of calling qemu_notify_event(), maybe the API
is suboptimal.

Would it be possible to change the set-event-handlers functions to
automatically call qemu_notify_event() when the descriptos change?
To eliminate the need to call this function at all ?


regards
ronnie sahlberg

On Mon, Jul 2, 2012 at 10:06 AM, Alexey Kardashevskiy  wrote:
> On 02/07/12 09:07, Benjamin Herrenschmidt wrote:
>>>>> diff --git a/iohandler.c b/iohandler.c
>>>>> index 3c74de6..54f4c48 100644
>>>>> --- a/iohandler.c
>>>>> +++ b/iohandler.c
>>>>> @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
>>>>>          ioh->fd_write = fd_write;
>>>>>          ioh->opaque = opaque;
>>>>>          ioh->deleted = 0;
>>>>> +        kill(getpid(), SIGUSR2);
>>>>>      }
>>>>>      return 0;
>>>>>  }
>>
>> That probably wants to be a pthread_kill targetted at the main loop.
>>
>>>>> +static void sigusr2_print(int signal)
>>>>> +{
>>>>> +}
>>>>> +
>>>>> +static void sigusr2_init(void)
>>>>> +{
>>>>> +    struct sigaction action;
>>>>> +
>>>>> +    memset(&action, 0, sizeof(action));
>>>>> +    sigfillset(&action.sa_mask);
>>>>> +    action.sa_handler = sigusr2_print;
>>>>> +    action.sa_flags = 0;
>>>>> +    sigaction(SIGUSR2, &action, NULL);
>>>>> +}
>>>>> +
>>
>> Won't that conflict with the business in coroutine-sigaltstack.c ?
>
> The code which touches SIGUSR2 does not compile on power.
>
>> Hrm... looking at it, it looks like it will save/restore the handler,
>> so that should be good.
>>
>> Still, one might want to wrap that into something, like
>> qemu_wake_main_loop();
>
>
> I already posted another patch with qemu_notify_event() in this mail thread 
> later :)
>
>
>>
>> Cheers,
>> Ben.
>>
>>>>>  int main_loop_init(void)
>>>>>  {
>>>>>      int ret;
>>>>>
>>>>> +    sigusr2_init();
>>>>> +
>>>>>      qemu_mutex_lock_iothread();
>>>>>      ret = qemu_signal_init();
>>>>>      if (ret) {
>>>>> --
>>>>> 1.7.10
>>>
>>>
>>
>>
>
>
> --
> Alexey
>



[Qemu-devel] [PATCH] SCSI rd/wr/vr protect != 0 is an error

2012-07-11 Thread Ronnie Sahlberg
List, Paolo,

Please find a small patch for the scsi emulation in qemu.
Since QEMU scsi emulation does not implement/support protection information
any READ/WRITE/VERIFY command that specifies a non-zero value for the 
rd/vr/wr-protect field in the CDB should fail with 
ILLEGAL_REQUEST/INVALID_FIELD_IN_CDB


regards
ronnie sahlberg





[Qemu-devel] [PATCH] SCSI: rd/wr/vr-protect !=0 is an error

2012-07-11 Thread Ronnie Sahlberg
The QEMU SCSI emulation does not support protection information,
so any READ/WRITE/VERIFY commands that has the protect bits set to
non-zero should fail with ILLEGAL_REQUEST/INVALID_FIELD_IN_CDB

>From SCSI SBC :
If the logical unit does not support protection information,
then the device server should terminate the command with CHECK CONDITION
status with the sense key set to ILLEGAL REQUEST and the additional sense
code set to INVALID FIELD IN CDB.

Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 34336b1..b2f3c0c 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1555,6 +1555,9 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 case READ_16:
 len = r->req.cmd.xfer / s->qdev.blocksize;
 DPRINTF("Read (sector %" PRId64 ", count %d)\n", r->req.cmd.lba, len);
+if (r->req.cmd.buf[1] & 0xe0) {
+goto fail;
+}
 if (r->req.cmd.lba > s->qdev.max_lba) {
 goto illegal_lba;
 }
@@ -1575,6 +1578,9 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 DPRINTF("Write %s(sector %" PRId64 ", count %d)\n",
 (command & 0xe) == 0xe ? "And Verify " : "",
 r->req.cmd.lba, len);
+if (r->req.cmd.buf[1] & 0xe0) {
+goto fail;
+}
 if (r->req.cmd.lba > s->qdev.max_lba) {
 goto illegal_lba;
 }
-- 
1.7.3.1




[Qemu-devel] [PATCH] SCSI improved LBA-out-of-range checks

2012-07-11 Thread Ronnie Sahlberg

Paolo, List

Please find a small patch to the scsi emulation.
This patch improves the checkign that the requested lbas are all available.
We check both that lba+len is not going past the end of the device
but also iflba+len < lba

This second condition could occur for deviously crafted scsi packets where
lba is set to 0x  and len is set to 2
in which case lba+len would wrap to 1





[Qemu-devel] [PATCH] SCSI: improve the lba-out-of-range tests for read/write/verify

2012-07-11 Thread Ronnie Sahlberg
Improve the tests for the LBA to cover more cases, the new test looks like this
if (r->req.cmd.lba > r->req.cmd.lba + len
||  r->req.cmd.lba + len > s->qdev.max_lba) {

For the 16 byte opcodes, the lba is a uint64, so the first check is to make 
sure that we do not wrap.
For example if an opcode would specify the LBA:0x and LEN:2
then lba+len would wrap to 1.

The second part of the test is to verify that ALL requested blocks are 
available, not just the first one.

Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index b2f3c0c..40c05de 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1558,7 +1558,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 if (r->req.cmd.buf[1] & 0xe0) {
 goto fail;
 }
-if (r->req.cmd.lba > s->qdev.max_lba) {
+if (r->req.cmd.lba > r->req.cmd.lba + len
+||  r->req.cmd.lba + len > s->qdev.max_lba) {
 goto illegal_lba;
 }
 r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
@@ -1581,7 +1582,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 if (r->req.cmd.buf[1] & 0xe0) {
 goto fail;
 }
-if (r->req.cmd.lba > s->qdev.max_lba) {
+if (r->req.cmd.lba > r->req.cmd.lba + len
+||  r->req.cmd.lba + len > s->qdev.max_lba) {
 goto illegal_lba;
 }
 r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
-- 
1.7.3.1




[Qemu-devel] [PATCH] SCSI improved LBA-out-of-range checks BUGFIX

2012-07-12 Thread Ronnie Sahlberg
Paolo,

Sorry but the previous patch was bad.
Use this patch instead it uses the correct check of
if (r->req.cmd.lba > r->req.cmd.lba + len
||  r->req.cmd.lba + len > s->qdev.max_lba + 1) {




[Qemu-devel] [PATCH] SCSI Improved checking for LBA out of range

2012-07-12 Thread Ronnie Sahlberg
Improve the tests for the LBA to cover more cases, the new test looks like this
if (r->req.cmd.lba > r->req.cmd.lba + len
||  r->req.cmd.lba + len > s->qdev.max_lba + 1) {

For the 16 byte opcodes, the lba is a uint64, so the first check is to make 
sure that we do not wrap.
For example if an opcode would specify the LBA:0x and LEN:2
then lba+len would wrap to 1.

The second part of the test is to verify that ALL requested blocks are 
available, not just the first one.

Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index b2f3c0c..2c2be33 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1558,7 +1558,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 if (r->req.cmd.buf[1] & 0xe0) {
 goto fail;
 }
-if (r->req.cmd.lba > s->qdev.max_lba) {
+if (r->req.cmd.lba > r->req.cmd.lba + len
+||  r->req.cmd.lba + len > s->qdev.max_lba + 1) {
 goto illegal_lba;
 }
 r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
@@ -1581,7 +1582,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 if (r->req.cmd.buf[1] & 0xe0) {
 goto fail;
 }
-if (r->req.cmd.lba > s->qdev.max_lba) {
+if (r->req.cmd.lba > r->req.cmd.lba + len
+||  r->req.cmd.lba + len > s->qdev.max_lba + 1) {
 goto illegal_lba;
 }
 r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
-- 
1.7.3.1




Re: [Qemu-devel] [PATCH] SCSI: improve the lba-out-of-range tests for read/write/verify

2012-07-12 Thread ronnie sahlberg
Sorry,   there is a bug in it.

Can you change it to :
  ||  r->req.cmd.lba + len > s->qdev.max_lba + 1) {

or else the last lba of the device will be flagged as out of range.



On Thu, Jul 12, 2012 at 5:04 PM, Paolo Bonzini  wrote:
> Il 12/07/2012 08:52, Ronnie Sahlberg ha scritto:
>> Improve the tests for the LBA to cover more cases, the new test looks like 
>> this
>> if (r->req.cmd.lba > r->req.cmd.lba + len
>> ||  r->req.cmd.lba + len > s->qdev.max_lba) {
>>
>> For the 16 byte opcodes, the lba is a uint64, so the first check is to make 
>> sure that we do not wrap.
>> For example if an opcode would specify the LBA:0x and LEN:2
>> then lba+len would wrap to 1.
>>
>> The second part of the test is to verify that ALL requested blocks are 
>> available, not just the first one.
>
> Fixed code style for || and applied to scsi-next.
>
> Thanks!
>
> Paolo
>
>> Signed-off-by: Ronnie Sahlberg 
>> ---
>>  hw/scsi-disk.c |6 --
>>  1 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
>> index b2f3c0c..40c05de 100644
>> --- a/hw/scsi-disk.c
>> +++ b/hw/scsi-disk.c
>> @@ -1558,7 +1558,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
>> uint8_t *buf)
>>  if (r->req.cmd.buf[1] & 0xe0) {
>>  goto fail;
>>  }
>> -if (r->req.cmd.lba > s->qdev.max_lba) {
>> +if (r->req.cmd.lba > r->req.cmd.lba + len
>> +||  r->req.cmd.lba + len > s->qdev.max_lba) {
>>  goto illegal_lba;
>>  }
>>  r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
>> @@ -1581,7 +1582,8 @@ static int32_t scsi_send_command(SCSIRequest *req, 
>> uint8_t *buf)
>>  if (r->req.cmd.buf[1] & 0xe0) {
>>  goto fail;
>>  }
>> -if (r->req.cmd.lba > s->qdev.max_lba) {
>> +if (r->req.cmd.lba > r->req.cmd.lba + len
>> +||  r->req.cmd.lba + len > s->qdev.max_lba) {
>>  goto illegal_lba;
>>  }
>>  r->sector = r->req.cmd.lba * (s->qdev.blocksize / 512);
>>
>
>



Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication

2013-01-02 Thread ronnie sahlberg
Do you really need to resolve the conflicts?
It might be easier and sufficient to just flag those hashes where a
conflict has been detected as : "dont dedup this hash anymore,
collissions have been seen."


On Wed, Jan 2, 2013 at 10:40 AM, Benoît Canet  wrote:
> Le Wednesday 02 Jan 2013 à 12:26:37 (-0600), Troy Benjegerdes a écrit :
>> The probability may be 'low' but it is not zero. Just because it's
>> hard to calculate the hash doesn't mean you can't do it. If your
>> input data is not random the probability of a hash collision is
>> going to get scewed.
>>
>> Read about how Bitcoin uses hashes.
>>
>> I need a budget of around $10,000 or so for some FPGAs and/or GPU cards,
>> and I can make a regression test that will create deduplication hash
>> collisions on purpose.
>
> It's not a problem as Eric pointed out while reviewing the previous patchset
> there is a small place left with zeroes on the deduplication block.
> A bit could be set on it when a collision is detected and an offset could 
> point
> to a cluster used to resolve collisions.
>
>>
>>
>> On Wed, Jan 02, 2013 at 06:33:24PM +0100, Beno?t Canet wrote:
>> > > How does this code handle hash collisions, and do you have some 
>> > > regression
>> > > tests that purposefully create a dedup hash collision, and verify that 
>> > > the
>> > > 'right thing' happens?
>> >
>> > The two hash function that can be used are cryptographics and not broken 
>> > yet.
>> > So nobody knows how to generate a collision.
>> >
>> > You can do the math to calculate the probability of collision using a 256 
>> > bit
>> > hash while processing 1EiB of data the result is so low you can consider it
>> > won't happen.
>> > The sha256 ZFS deduplication works the same way regarding collisions.
>> >
>> > I currently use qemu-io-test for testing purpose and iozone with the -w 
>> > flag in
>> > the guest.
>> > I would like to find a good deduplication stress test to run in a guest.
>> >
>> > Regards
>> >
>> > Beno?t
>> >
>> > > It's great that this almost works, but it seems rather dangerous to put
>> > > something like this into the mainline code without some regression tests.
>> > >
>> > > (I'm also suspecting the regression test will be a great way to find
>> > > flakey hardware)
>> > >
>> > > --
>> > > Troy Benjegerdes'da hozer' 
>> > > ho...@hozed.org
>> > >
>> > > Somone asked my why I work on this free (http://www.fsf.org/philosophy/)
>> > > software & hardware (http://q3u.be) stuff and not get a real job.
>> > > Charles Shultz had the best answer:
>> > >
>> > > "Why do musicians compose symphonies and poets write poems? They do it
>> > > because life wouldn't have any meaning for them if they didn't. That's 
>> > > why
>> > > I draw cartoons. It's my life." -- Charles Shultz
>>
>> --
>> --
>> Troy Benjegerdes'da hozer' ho...@hozed.org
>>
>> Somone asked my why I work on this free (http://www.fsf.org/philosophy/)
>> software & hardware (http://q3u.be) stuff and not get a real job.
>> Charles Shultz had the best answer:
>>
>> "Why do musicians compose symphonies and poets write poems? They do it
>> because life wouldn't have any meaning for them if they didn't. That's why
>> I draw cartoons. It's my life." -- Charles Shultz
>>
>



Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication

2013-01-02 Thread ronnie sahlberg
On Wed, Jan 2, 2013 at 11:18 AM, Troy Benjegerdes  wrote:
> If you do get a hash collision, it's a rather exceptional event, so I'd
> say every effort should be made to log the event and the data that created
> it in multiple places.
>
> There are three questions I'd ask on a hash collision:
>
> 1) was it the data?
> 2) was it the hardware?
> 3) was it a software bug?

Yes, that is probably good too, and saving off the old and new block
content that collided.

Unless you are checksumming the blocks, I suspect that the most common
reason for "collisions" would just be cases where the original block
was corrupted/changed on disk and you dont detect it and then when you
re-write an identical one the blocks no longer match and thus you get
a false collision.



[Qemu-devel] [PATCH] iSCSI: remove the now obsolete call to qemu_notify_event()

2012-08-30 Thread Ronnie Sahlberg
Paolo, List

Please find a trivial patch to iscsi that removes the call to 
qemu_notify_event()
We do not need to call this any more since a recent patch that made
qemu_aio_set_fd_handler() and friends invoke this call automatically when 
required.

regards
ronnie sahlberg




[Qemu-devel] [PATCH] iSCSI: We dont need to explicitely call qemu_notify_event() any more

2012-08-30 Thread Ronnie Sahlberg
We no longer need to explicitely call qemu_notify_event() any more since this 
is now done automatically any time the filehandles we listen to change.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 0b96165..355ce65 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -167,12 +167,6 @@ iscsi_set_events(IscsiLun *iscsilun)
 
 }
 
-/* If we just added an event, the callback might be delayed
- * unless we call qemu_notify_event().
- */
-if (ev & ~iscsilun->events) {
-qemu_notify_event();
-}
 iscsilun->events = ev;
 }
 
-- 
1.7.3.1




[Qemu-devel] [PATCH] iSCSI: Add support for SG_IO to iscsi_ioctl()

2012-08-30 Thread Ronnie Sahlberg
Paolo, List

Please find a small patch that adds support for SG_IO to the synchronous 
iscsi_ioctl() function.
This is needed to make scsi-block work with iscsi.

regards
ronnie sahlberg




[Qemu-devel] [PATCH] iSCSI: We need to support SG_IO also from iscsi_ioctl()

2012-08-30 Thread Ronnie Sahlberg
We need to support SG_IO from the synchronous iscsi_ioctl() since
scsi-block uses this to do an INQ to the device to discover its properties
This patch makes scsi-block work with iscsi.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   20 +++-
 1 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 355ce65..189ab6f 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -537,7 +537,8 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
 
 #define SG_ERR_DRIVER_SENSE0x08
 
-if (status == SCSI_STATUS_CHECK_CONDITION && acb->task->datain.size >= 2) {
+if (status == SCSI_STATUS_CHECK_CONDITION
+&& acb->task->datain.size >= 2) {
 int ss;
 
 acb->ioh->driver_status |= SG_ERR_DRIVER_SENSE;
@@ -622,9 +623,17 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState 
*bs,
 return &acb->common;
 }
 
+
+static void ioctl_cb(void *opaque, int status)
+{
+int *p_status = opaque;
+*p_status = status;
+}
+
 static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
 {
 IscsiLun *iscsilun = bs->opaque;
+int status;
 
 switch (req) {
 case SG_GET_VERSION_NUM:
@@ -633,6 +642,15 @@ static int iscsi_ioctl(BlockDriverState *bs, unsigned long 
int req, void *buf)
 case SG_GET_SCSI_ID:
 ((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
 break;
+case SG_IO:
+status = -EINPROGRESS;
+iscsi_aio_ioctl(bs, req, buf, ioctl_cb, &status);
+
+while (status == -EINPROGRESS) {
+qemu_aio_wait();
+}
+
+return 0;
 default:
 return -1;
 }
-- 
1.7.3.1




Re: [Qemu-devel] [PATCH v6 2/2] block: Support GlusterFS as a QEMU block backend

2012-09-06 Thread ronnie sahlberg
On Fri, Sep 7, 2012 at 1:47 AM, Daniel P. Berrange wrote:

> On Thu, Sep 06, 2012 at 09:10:04PM +0530, Bharata B Rao wrote:
> > On Thu, Sep 06, 2012 at 11:29:36AM +0300, Avi Kivity wrote:
> > > On 08/14/2012 12:58 PM, Kevin Wolf wrote:
> > > >
> > > >> While we are at this, let me bring out another issue. Gluster
> supports 3
> > > >> transport types:
> > > >>
> > > >> - socket in which case the server will be hostname, ipv4 or ipv4
> address.
> > > >> - rdma in which case server will be interpreted similar to socket.
> > > >> - unix in which case server will be a path to unix domain socket
> and this
> > > >>   will look like any other filesystem path. (Eg.
> /tmp/glusterd.socket)
> > > >>
> > > >> I don't think we can fit 'unix' within the standard URI scheme (RFC
> 3986)
> > > >> easily, but I am planning to specify the 'unix' transport as below:
> > > >>
> > > >> gluster://[/path/to/unix/domain/socket]/volname/image?transport=unix
> > > >>
> > > >> i,e., I am asking the user to put the unix domain socket path within
> > > >> square brackets when transport type is unix.
> > > >>
> > > >> Do you think this is fine ?
> > > >
> > > > Never saw something like this before, but it does seem reasonable to
> me.
> > > > Excludes ] from the valid characters in the file name of the socket,
> but
> > > > that shouldn't be a problem in practice.
> > >
> > > Bikeshedding, but I prefer
> > >
> > >   gluster:///path/to/unix/domain/socket:/volname/image?transport=unix
> >
> > So if the unix domain socket is /tmp/glusterd.socket, then this would
> look like:
> >
> > gluster:///tmp/glusterd.socket:/volname/image?transport=unix.
> >
> > So you are saying : will the separator b/n the unix domain socket path
> > and rest of the URI components.
> >
> > Unless you or others strongly feel about this, I would like to go with
> > [ ] based spec, which I feel is less prone to errors like missing a colon
> > by mistake :)
> >
> > >
> > > as being more similar to file://, or even
> > >
> > >   gluster:///path/to/unix/domain/socket/volname/image?transport=unix
> > >
> > > with the last two components implied to be part of the payload, not the
> > > path.
> >
> > Note that image is a file path by itself like /dir1/a.img. So I guess it
> > becomes difficult to figure out where the unix domain socket path ends
> > and rest of the URI components begin w/o a separator in between.
>
> IMHO this is all gross. URIs already have a well defined way to provide
> multiple parameters, dealing with escaping of special characters. ie query
> parameters. The whole benefit of using URI syntax is to let apps process
> the URIs using standard APIs. We should not be trying to define some extra
> magic encoding to let us stuff 2 separate parameters into the path
> component
> since that means apps have to now write custom parsing code again. Either
> the UNIX socket path, or the volume path should be in the URI path, not
> both. The other part should be a URI parameter. I'd really expect us to
> use:
>
>   gluster:///volname/image?transport=unix&sockpath=/path/to/unix/sock
>
>
+1

ronnie sahlberg


Re: [Qemu-devel] [PATCH] iSCSI: We dont need to explicitely call qemu_notify_event() any more

2012-09-07 Thread ronnie sahlberg
Ping?

On Thu, Aug 30, 2012 at 4:56 PM, Ronnie Sahlberg
 wrote:
> We no longer need to explicitely call qemu_notify_event() any more since this 
> is now done automatically any time the filehandles we listen to change.
>
> Signed-off-by: Ronnie Sahlberg 
> ---
>  block/iscsi.c |6 --
>  1 files changed, 0 insertions(+), 6 deletions(-)
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 0b96165..355ce65 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -167,12 +167,6 @@ iscsi_set_events(IscsiLun *iscsilun)
>
>  }
>
> -/* If we just added an event, the callback might be delayed
> - * unless we call qemu_notify_event().
> - */
> -if (ev & ~iscsilun->events) {
> -qemu_notify_event();
> -}
>  iscsilun->events = ev;
>  }
>
> --
> 1.7.3.1
>



Re: [Qemu-devel] [PATCH] iSCSI: We need to support SG_IO also from iscsi_ioctl()

2012-09-07 Thread ronnie sahlberg
ping?

On Thu, Aug 30, 2012 at 5:28 PM, Ronnie Sahlberg
 wrote:
> We need to support SG_IO from the synchronous iscsi_ioctl() since
> scsi-block uses this to do an INQ to the device to discover its properties
> This patch makes scsi-block work with iscsi.
>
> Signed-off-by: Ronnie Sahlberg 
> ---
>  block/iscsi.c |   20 +++-
>  1 files changed, 19 insertions(+), 1 deletions(-)
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 355ce65..189ab6f 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -537,7 +537,8 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int 
> status,
>
>  #define SG_ERR_DRIVER_SENSE0x08
>
> -if (status == SCSI_STATUS_CHECK_CONDITION && acb->task->datain.size >= 
> 2) {
> +if (status == SCSI_STATUS_CHECK_CONDITION
> +&& acb->task->datain.size >= 2) {
>  int ss;
>
>  acb->ioh->driver_status |= SG_ERR_DRIVER_SENSE;
> @@ -622,9 +623,17 @@ static BlockDriverAIOCB 
> *iscsi_aio_ioctl(BlockDriverState *bs,
>  return &acb->common;
>  }
>
> +
> +static void ioctl_cb(void *opaque, int status)
> +{
> +int *p_status = opaque;
> +*p_status = status;
> +}
> +
>  static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void 
> *buf)
>  {
>  IscsiLun *iscsilun = bs->opaque;
> +int status;
>
>  switch (req) {
>  case SG_GET_VERSION_NUM:
> @@ -633,6 +642,15 @@ static int iscsi_ioctl(BlockDriverState *bs, unsigned 
> long int req, void *buf)
>  case SG_GET_SCSI_ID:
>  ((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
>  break;
> +case SG_IO:
> +status = -EINPROGRESS;
> +iscsi_aio_ioctl(bs, req, buf, ioctl_cb, &status);
> +
> +while (status == -EINPROGRESS) {
> +qemu_aio_wait();
> +}
> +
> +return 0;
>  default:
>  return -1;
>  }
> --
> 1.7.3.1
>



[Qemu-devel] [PATCH] SCSI disk, set HISUP bit in INQ data

2012-09-14 Thread Ronnie Sahlberg
Paolo, List

Please find a trivial patch that make SCSI-DISK set the HISUP bit in the INQ 
data.
Since I think all LUN numbers are reported as SAM4 describes, we should set 
this bit.

Its already sed in SCSI-BUS


regards
ronnie sahlberg




[Qemu-devel] [PATCH] SCSI: Standard INQUIRY data should report HiSup flag as set.

2012-09-14 Thread Ronnie Sahlberg
QEMU as far as I know only reports LUN numbers using the modes that
are described in SAM4.
As such, since all LUN numbers generated by the SCSI emulation in QEMU
follow SAM4, we should set the HiSup bit in the standard INQUIRY data
to indicate such.

>From SAM4:
  4.6.3 LUNs overview
  All LUN formats described in this standard are hierarchical in
  structure even when only a single level in that hierarchy is used.
  The HISUP bit shall be set to one in the standard INQUIRY data
  (see SPC-4) when any LUN format described in this standard is used.
  Non-hierarchical formats are outside the scope of this standard.

Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 1585683..52bc062 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -678,7 +678,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
  * is actually implemented, but we're good enough.
  */
 outbuf[2] = 5;
-outbuf[3] = 2; /* Format 2 */
+outbuf[3] = 2 | 0x10; /* Format 2, HiSup */
 
 if (buflen > 36) {
 outbuf[4] = buflen - 5; /* Additional Length = (Len - 1) - 4 */
-- 
1.7.3.1




Re: [Qemu-devel] [RFC] find_next_bit optimizations

2013-03-11 Thread ronnie sahlberg
Even more efficient might be to do bitwise instead of logical or

>if (tmp | d1 | d2 | d3) {

that should remove 3 of the 4 conditional jumps
and should become 3 bitwise ors and one conditional jump


On Mon, Mar 11, 2013 at 8:58 AM, Paolo Bonzini  wrote:
> Il 11/03/2013 16:37, Peter Lieven ha scritto:
>>
>> Am 11.03.2013 um 16:29 schrieb Paolo Bonzini :
>>
>>> Il 11/03/2013 16:24, Peter Lieven ha scritto:

> How would that be different in your patch?  But you can solve it by
> making two >= loops, one checking for 4*BITS_PER_LONG and one checking
> BITS_PER_LONG.

 This is what I have now:

 diff --git a/util/bitops.c b/util/bitops.c
 index e72237a..b0dc93f 100644
 --- a/util/bitops.c
 +++ b/util/bitops.c
 @@ -24,12 +24,13 @@ unsigned long find_next_bit(const unsigned long *addr, 
 unsigned long size,
 const unsigned long *p = addr + BITOP_WORD(offset);
 unsigned long result = offset & ~(BITS_PER_LONG-1);
 unsigned long tmp;
 +unsigned long d0,d1,d2,d3;

 if (offset >= size) {
 return size;
 }
 size -= result;
 -offset %= BITS_PER_LONG;
 +offset &= (BITS_PER_LONG-1);
 if (offset) {
 tmp = *(p++);
 tmp &= (~0UL << offset);
 @@ -42,7 +43,19 @@ unsigned long find_next_bit(const unsigned long *addr, 
 unsigned long size,
 size -= BITS_PER_LONG;
 result += BITS_PER_LONG;
 }
 -while (size & ~(BITS_PER_LONG-1)) {
 +while (size >= 4*BITS_PER_LONG) {
 +d0 = *p;
 +d1 = *(p+1);
 +d2 = *(p+2);
 +d3 = *(p+3);
 +if (d0 || d1 || d2 || d3) {
 +break;
 +}
 +p+=4;
 +result += 4*BITS_PER_LONG;
 +size -= 4*BITS_PER_LONG;
 +}
 +while (size >= BITS_PER_LONG) {
 if ((tmp = *(p++))) {
 goto found_middle;
 }

>>>
>>> Minus the %= vs. &=,
>>>
>>> Reviewed-by: Paolo Bonzini 
>>>
>>> Perhaps:
>>>
>>>tmp = *p;
>>>d1 = *(p+1);
>>>d2 = *(p+2);
>>>d3 = *(p+3);
>>>if (tmp) {
>>>goto found_middle;
>>>}
>>>if (d1 || d2 || d3) {
>>>break;
>>>}
>>
>> i do not know what gcc interally makes of the d0 || d1 || d2 || d3 ?
>
> It depends on the target and how expensive branches are.
>
>> i would guess its sth like one addition w/ carry and 1 test?
>
> It could be either 4 compare-and-jump sequences, or 3 bitwise ORs
> followed by a compare-and-jump.
>
> That is, either:
>
>  test %r8, %r8
>  jz   second_loop
>  test %r9, %r9
>  jz   second_loop
>  test %r10, %r10
>  jz   second_loop
>  test %r11, %r11
>  jz   second_loop
>
> or
>
>  or %r9, %r8
>  or %r11, %r10
>  or %r8, %r10
>  jz   second_loop
>
> Don't let the length of the code fool you.  The processor knows how to
> optimize all of these, and GCC knows too.
>
>> your proposed change would introduce 2 tests (maybe)?
>
> Yes, but I expect they to be fairly well predicted.
>
>> what about this to be sure?
>>
>>tmp = *p;
>>d1 = *(p+1);
>>d2 = *(p+2);
>>d3 = *(p+3);
>>if (tmp || d1 || d2 || d3) {
>>if (tmp) {
>>goto found_middle;
>
> I suspect that GCC would rewrite it my version (definitely if it
> produces 4 compare-and-jumps; but possibly it does it even if it goes
> for bitwise ORs, I haven't checked.
>
> Regarding your other question ("one last thought. would it make sense to
> update only `size`in the while loops and compute the `result` at the end
> as `orgsize` - `size`?"), again the compiler knows better and might even
> do this for you.  It will likely drop the p increases and use p[result],
> so if you do that change you may even get the same code, only this time
> p is increased and you get an extra subtraction at the end. :)
>
> Bottom line: don't try to outsmart an optimizing C compiler on
> micro-optimization, unless you have benchmarked it and it shows there is
> a problem.
>
> Paolo
>
>>}
>>break;
>>}
>>
>> Peter
>>
>
>



Re: [Qemu-devel] [PULL 00/14] SCSI changes for 2014-06-18

2014-06-18 Thread ronnie sahlberg
On Wed, Jun 18, 2014 at 8:57 AM, Paolo Bonzini  wrote:
> Il 18/06/2014 17:53, Peter Maydell ha scritto:
>
>> Then you probably need to fix the configure test ;-) This is
>> Ubuntu Trusty so really pretty recent (it's my main x86 build
>> machine, not some oddball platform); libiscsi-dev 1.4.0-3.
>
>
> This is a very old release.  I'll apply Peter's patch to disable libiscsi
> for pre-1.9.0 versions and respin the pull request.
>
> I didn't want to do that because 1.4.0->1.9.0 breaks ABI, but Peter thinks
> that versions before 1.8.0 are not reliable anyway.

+1

Versions prior are good for causal use but Peter is the authority on
production use here.
I think qemu should have a configure check and just force disable
iscsi support if libiscsi < 1.9.0


>
> If Ubuntu wants to distribute both 1.4.0 and 1.9.0, they can move 1.9.0
> header files to a separate directory using pkgconfig (1.4.0 does not support
> pkgconfig).
>
> Paolo
>



Re: [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context()

2014-05-08 Thread ronnie sahlberg
On Thu, May 8, 2014 at 4:33 AM, Stefan Hajnoczi  wrote:
> On Wed, May 07, 2014 at 04:09:27PM +0200, Peter Lieven wrote:
>> On 07.05.2014 12:29, Paolo Bonzini wrote:
>> >Il 07/05/2014 12:07, Stefan Hajnoczi ha scritto:
>> >>On Fri, May 02, 2014 at 12:39:06AM +0200, Peter Lieven wrote:
>> +static void iscsi_attach_aio_context(BlockDriverState *bs,
>> + AioContext *new_context)
>> +{
>> +IscsiLun *iscsilun = bs->opaque;
>> +
>> +iscsilun->aio_context = new_context;
>> +iscsi_set_events(iscsilun);
>> +
>> +#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
>> +/* Set up a timer for sending out iSCSI NOPs */
>> +iscsilun->nop_timer = aio_timer_new(iscsilun->aio_context,
>> + QEMU_CLOCK_REALTIME, SCALE_MS,
>> + iscsi_nop_timed_event, iscsilun);
>> +timer_mod(iscsilun->nop_timer,
>> +  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
>> +#endif
>> +}
>> >>>
>> >>>Is it still guaranteed that iscsi_nop_timed_event for a target is not 
>> >>>invoked
>> >>>while we are in another function/callback of the iscsi driver for the 
>> >>>same target?
>> >
>> >Yes, since the timer is in the same AioContext as the iscsi driver 
>> >callbacks.
>>
>>
>> Ok. Stefan: What MUST NOT happen is that the timer gets fired while we are 
>> in iscsi_service.
>> As Paolo outlined, this cannot happen, right?
>
> Okay, I think we're safe then.  The timer can only be invoked during
> aio_poll() event loop iterations.  It cannot be invoked while we're
> inside iscsi_service().
>
>> >>BTW, is iscsi_reconnect() the right libiscsi interface to use since it
>> >>is synchronous?  It seems like this would block QEMU until the socket
>> >>has connected!  The guest would be frozen.
>> >
>> >There is no asynchronous interface yet for reconnection, unfortunately.
>>
>> We initiate the reconnect after we miss a few NOP replies. So the target is 
>> already down for approx. 30 seconds.
>> Every process inside the guest is already haging or has timed out.
>>
>> If I understand correctly with the new patches only the communication with 
>> this target is hanging or isn't it?
>> So what benefit would an asyncronous reconnect have?
>
> Asynchronous reconnect is desirable:
>
> 1. The QEMU monitor is blocked while we're waiting for the iSCSI target
>to accept our reconnect.  This means the management stack (libvirt)
>cannot control QEMU until we time out or succeed.
>
> 2. The guest is totally frozen - cannot execute instructions - because
>it will soon reach a point in the code that locks the QEMU global
>mutex (which is being held while we reconnect to the iSCSI target).
>
>This may be okayish for guests where the iSCSI LUN contains the
>"main" data that is being processed.  But what if an iSCSI LUN was
>just attached to a guest that is also doing other things that are
>independent (e.g. serving a website, processing data from a local
>disk, etc) - now the reconnect causes downtime for the entire guest.

I will look into making the reconnect async over the next few days.



Re: [Qemu-devel] [PATCH 08/20] nfs: Handle failure for potentially large allocations

2014-05-22 Thread ronnie sahlberg
For this case and for the iscsi case, isn't it likely that once this
happens the guest is pretty much doomed since I/O to the disk will no
longer work ?

Should we also change the block layer so that IF *_readv/_writev fails
with -ENOMEM
then it should try again but break the request up into a chain of
smaller chunks ?



On Wed, May 21, 2014 at 9:28 AM, Kevin Wolf  wrote:
> Some code in the block layer makes potentially huge allocations. Failure
> is not completely unexpected there, so avoid aborting qemu and handle
> out-of-memory situations gracefully.
>
> This patch addresses the allocations in the nfs block driver.
>
> Signed-off-by: Kevin Wolf 
> ---
>  block/nfs.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/block/nfs.c b/block/nfs.c
> index 539bd95..e3d6216 100644
> --- a/block/nfs.c
> +++ b/block/nfs.c
> @@ -165,7 +165,11 @@ static int coroutine_fn nfs_co_writev(BlockDriverState 
> *bs,
>
>  nfs_co_init_task(client, &task);
>
> -buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
> +buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
> +if (buf == NULL) {
> +return -ENOMEM;
> +}
> +
>  qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
>
>  if (nfs_pwrite_async(client->context, client->fh,
> --
> 1.8.3.1
>
>



Re: [Qemu-devel] [PATCH] block/iscsi: use 16 byte CDBs only when necessary

2014-09-02 Thread ronnie sahlberg
That is one big request.  I assume the device reports "no limit" in
the VPD page so we can not state it is the guest/application going
beyond the allowed limit?


I am not entirely sure what meaning the target assigns to Protocol
Error means here.
It could be that ~100M is way higher than MaxBurstLength ?  What is
the MaxBurstLength that was reported by the server during login
negotiation?
If so, we should make libiscsi check the maxburstlength and fail the
request early. We would still fail the I/O so it will not really solve
anything much
but at least we should not send the request to the server.

Best would probably be to take the smallest of a non-zero
Block-Limits.max_transfer_length and iscsi-MaxBurstLength/block-size
and pass this back to the guest in the emulated Block-Limits-VPD.
At least then you have tried to tell the guest "never do SCSI I/O
bigger than this".

I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
no limit to QEMU, QEMU should probably take the iscsi transport limit
into account and pass this to the guest
by setting the emulated BlockLimits page it passes to scale to the
maximum that MaxBurstLength allows.


Then if BTRFS or SG_IO in the guest ignores the BlockLimits it is
clearly a guest problem.

(A different interpretation for ProtocolError could be the mismatch
between the iscsi expected data transfer length and the scsi transfer
length, but that should result in residuals, not protocol error.)



Hypothetically there could be targets that support really huge
MaxBurstLengths > 32MB. For those you probably want to switch to
WRITE16 when the SCSI transfer length goes > 0x.

- if (iscsilun->use_16_for_rw)  {
+ if (iscsilun->use_16_for_rw || num_sectors > 0x)  {


regards
ronnie sahlberg

On Mon, Sep 1, 2014 at 8:21 AM, Peter Lieven  wrote:
> On 17.06.2014 13:46, Paolo Bonzini wrote:
>
> Il 17/06/2014 13:37, Peter Lieven ha scritto:
>
> On 17.06.2014 13:15, Paolo Bonzini wrote:
>
> Il 17/06/2014 08:14, Peter Lieven ha scritto:
>
>
>
> BTW, while debugging a case with a bigger storage supplier I found
> that open-iscsi seems to do exactly this undeterministic behaviour.
> I have a 3TB LUN. If I access < 2TB sectors it uses READ10/WRITE10 and
> if I go beyond 2TB it changes to READ16/WRITE16.
>
>
> Isn't that exactly what your latest patch does for >64K sector writes? :)
>
>
> Not exactly, we choose the default by checking the LUN size. 10 Byte for
> < 2TB and 16 Byte otherwise.
>
>
> Yeah, I meant introducing the non-determinism.
>
> My latest patch makes an exception if a request is bigger than 64K
> sectors and
> switches to 16 Byte requests. These would otherwise end in an I/O error.
>
>
> It could also be split at the block layer, like we do for unmap.  I think
> there's also a maximum transfer size somewhere in the VPD, we could to
> READ16/WRITE16 if it is >64K sectors.
>
>
> It seems that there might be a real world example where Linux issues >32MB
> write requests. Maybe someone familiar with btrfs can advise.
> I see iSCSI Protocol Errors in my logs:
>
> Sep  1 10:10:14 libiscsi:0 PDU header: 01 a1 00 00 00 01 00 00 00 00 00 00
> 00 00 00 00 00 00 00 07 06 8f 30 00 00 00 00 06 00 00 00 0a 2a 00 01 09 9e
> 50 00 47 98 00 00 00 00 00 00 00 [XXX]
> Sep  1 10:10:14 qemu-2.0.0: iSCSI: Failed to write10 data to iSCSI lun.
> Request was rejected with reason: 0x04 (Protocol Error)
>
> Looking at the headers the xferlen in the iSCSI PDU is 110047232 Byte which
> is 214936 sectors.
> 214936 % 65536 = 18328 which is exactly the number of blocks in the SCSI
> WRITE10 CDB.
>
> Can someone advise if this is something that btrfs can cause
> or if I have to
> blame the customer that he issues very big write requests with Direct I/O?
>
> The user sseems something like this in the log:
> [34640.489284] BTRFS: bdev /dev/vda2 errs: wr 8232, rd 0, flush 0, corrupt
> 0, gen 0
> [34640.490379] end_request: I/O error, dev vda, sector 17446880
> [34640.491251] end_request: I/O error, dev vda, sector 5150144
> [34640.491290] end_request: I/O error, dev vda, sector 17472080
> [34640.492201] end_request: I/O error, dev vda, sector 17523488
> [34640.492201] end_request: I/O error, dev vda, sector 17536592
> [34640.492201] end_request: I/O error, dev vda, sector 17599088
> [34640.492201] end_request: I/O error, dev vda, sector 17601104
> [34640.685611] end_request: I/O error, dev vda, sector 15495456
> [34640.685650] end_request: I/O error, dev vda, sector 7138216
>
> Thanks,
> Peter
>



Re: [Qemu-devel] [PATCH] block/iscsi: use 16 byte CDBs only when necessary

2014-09-03 Thread ronnie sahlberg
On Wed, Sep 3, 2014 at 1:09 AM, Peter Lieven  wrote:
>
>
>> Am 02.09.2014 um 21:30 schrieb Peter Lieven :
>>
>> Looking at the code, is it possible that not the guest is causing trouble 
>> here, but
>> multiwrite_merge code?
>>
>> From what I see the only limit it has when merging requests is the number of 
>> IOVs.
>>
>>
>> Any thoughts?
>>
>> Mine are:
>> a) Introducing bs->bl.max_request_size and set merge = 0 if the result would 
>> be too big. Default
>> max request size to 32768 sectors (see below).
>> b) Hardcoding the limit in multiwrite_merge for now limiting the merged size 
>> to 16MB (32768 sectors).
>> Which is the limit we already use in bdrv_co_discard and 
>> bdrv_co_write_zeroes if we don't know
>> better.
>
> or c) disabling multiwrite merge for RAW or only iSCSI completely.
>

I think (a) would be best.
But I would suggest some small modifications:

Set the default max to something even smaller, like 256 sectors. This
should not have much effect on throughput since the client/initiator
can just send several concurrent i/os instead of one large i/o.

Also update scsi_disk_emulate_inquiry() so we tell the guest the
maximum transfer length for a single i/o
so that the guest kernel will break up any huge requests into proper sizes.



Re: [Qemu-devel] [PATCH] block/iscsi: use 16 byte CDBs only when necessary

2014-09-03 Thread ronnie sahlberg
On Wed, Sep 3, 2014 at 7:18 AM, Paolo Bonzini  wrote:
> Il 03/09/2014 16:17, ronnie sahlberg ha scritto:
>> I think (a) would be best.
>> But I would suggest some small modifications:
>>
>> Set the default max to something even smaller, like 256 sectors. This
>> should not have much effect on throughput since the client/initiator
>> can just send several concurrent i/os instead of one large i/o.
>
> I disagree.  256 sectors are 128k, that's a fairly normal I/O size.
>

Fair enough.
But maybe a command line argument to set/override the max?

This would be useful when using scsi-passthrough to usb-scsi disks.
Linux kernel has I think a hard limit on 120k for the usb transport,
which means when doing SG_IO to a usb disk you are limited to this
as the max transfer length since anything larger will break in the usb layer.
This limit also I think is not easily discoverable by an application since
* the actual device still reports, in most cases, max_transfer_length
== unlimited
* and it is semi-hard to discover whether a /dev/sg* device is on a
usb bus or not.



Re: [Qemu-devel] [PATCH 0/4] introduce max_transfer_length

2014-09-05 Thread ronnie sahlberg
Looks good to me.

(minor question is just why not let default max be 0x for both 10
and 16 CDBs ?)

On Fri, Sep 5, 2014 at 9:51 AM, Peter Lieven  wrote:
> This series adds the basics for introducing a maximum transfer length
> to the block layer. Its main purpose is currently avoiding that
> a multiwrite_merge exceeds the max_xfer_len of an attached iSCSI LUN.
> This is a required bug fix.
>
> Discussed reporting of this maximum in the SCSI Disk Inquiry Emulation
> of the Block Limits VPD is currently not added as we do not import any
> of the other limits there. This has to be addresses in a seperate series.
>
> Peter Lieven (4):
>   BlockLimits: introduce max_transfer_length
>   block: immediate cancel oversized read/write requests
>   block/iscsi: set max_transfer_length
>   block: avoid creating oversized writes in multiwrite_merge
>
>  block.c   |   23 +++
>  block/iscsi.c |   12 ++--
>  include/block/block_int.h |3 +++
>  3 files changed, 36 insertions(+), 2 deletions(-)
>
> --
> 1.7.9.5
>



Re: [Qemu-devel] [PATCH 0/4] introduce max_transfer_length

2014-09-05 Thread ronnie sahlberg
Feel free to add a

Reviewed-by: Ronnie Sahlberg 

to the patches.

On Fri, Sep 5, 2014 at 12:52 PM, Peter Lieven  wrote:
> Am 05.09.2014 um 19:05 schrieb ronnie sahlberg:
>> Looks good to me.
>>
>> (minor question is just why not let default max be 0x for both 10
>> and 16 CDBs ?)
>
> You are right. I was looking at the technical limit, but in fact it doesn't
> make sense to have different limits. Its ridiculous to say, you wan't to
> do big I/O then you need a target thats bigger than 2TB ;-)
>
> Peter
>
>>
>> On Fri, Sep 5, 2014 at 9:51 AM, Peter Lieven  wrote:
>>> This series adds the basics for introducing a maximum transfer length
>>> to the block layer. Its main purpose is currently avoiding that
>>> a multiwrite_merge exceeds the max_xfer_len of an attached iSCSI LUN.
>>> This is a required bug fix.
>>>
>>> Discussed reporting of this maximum in the SCSI Disk Inquiry Emulation
>>> of the Block Limits VPD is currently not added as we do not import any
>>> of the other limits there. This has to be addresses in a seperate series.
>>>
>>> Peter Lieven (4):
>>>   BlockLimits: introduce max_transfer_length
>>>   block: immediate cancel oversized read/write requests
>>>   block/iscsi: set max_transfer_length
>>>   block: avoid creating oversized writes in multiwrite_merge
>>>
>>>  block.c   |   23 +++
>>>  block/iscsi.c |   12 ++--
>>>  include/block/block_int.h |3 +++
>>>  3 files changed, 36 insertions(+), 2 deletions(-)
>>>
>>> --
>>> 1.7.9.5
>>>
>



Re: [Qemu-devel] [PATCH 2/4] block: immediately cancel oversized read/write requests

2014-09-08 Thread ronnie sahlberg
On Mon, Sep 8, 2014 at 7:42 AM, Paolo Bonzini  wrote:
> Il 08/09/2014 16:35, Peter Lieven ha scritto:
>
> messages. :)
 So you would not throw an error msg here?
>>> No, though a trace could be useful.
>>
>> Is there a howto somewhere how to implement that?
>
> Try commit 4ac4458076e1aaf3b01a6361154117df20e22215.
>
>> Whats your opinion changed the max_xfer_len to 0x regardsless
>> of use_16_for_rw in iSCSI?
>
> If you implemented request splitting in the block layer, it would be
> okay to force max_xfer_len to 0x.

I think it should be OK if that is a different series. This only
affects the iSCSI transport since it is the only transport so far to
record or report a max transfer length.
If a guest generates these big requests(i.e. not multiwrite) we
already fail them due to the READ10 limitations. We would just fail
the request at a different layer in the stack with these patches.


What I would like to see would also be to report these limitations to
the guest itself to prevent it from generating too large I/Os.
I am willing to do that part once the initial max_xfer_len ones are finished.



Re: [Qemu-devel] [RFC] How to handle feature regressions in new QEMU releases

2014-07-16 Thread ronnie sahlberg
On Wed, Jul 16, 2014 at 10:11 AM, Stefan Weil  wrote:
> Am 16.07.2014 18:49, schrieb Paolo Bonzini:
>> Il 16/07/2014 18:28, Stefan Weil ha scritto:
>>> Debian testing includes a brand new libiscsi, but it
>>> does not include libiscsi.pc, so pkg-config won't know that it is
>>> available and configure will disable libiscsi.
>>
>> That's a packaging bug.
>
> CC'ing Michael as he is the Debian maintainer of this package and
> Aurélien who maintains QEMU for Debian.
>
> Michael, should I send a Debian bug report for libiscsi-dev? Would an
> update of libiscsi for Debian stable be reasonable if versions older
> than 1.9 are too buggy to be used?

If you ask debian to upgrade. Could you ask them to wait and upgrade after I
have release the next version, hopefully if all goes well, at the end
of this week?

It contains new functionality, thanks to plieven, to better handle
cases where active/passive storage arrays
perform failover.


> I must admit that I'm a little bit
> surprised because iSCSI support worked for me quite well the last time I
> used it with Debian wheezy.

I think, and plieven please correct me if I am wrong, earlier version
would work reasonably well for basic use
but there were bugs and gaps in functionality that made it ill suited
for enterprise environments.

>
> Regards
> Stefan
>
>>
>>> I have a patch which
>>> fixes this, so QEMU for Debian testing could include libiscsi again.
>>>
>>> Is a feature regression like this one acceptable? Do we need additional
>>> testing (maybe run the build bots with --enable-xxx, so builds fail when
>>> xxx no longer works)?
>>
>> As mentioned in the e49ab19fcaa617ad6cdfe1ac401327326b6a2552 commit
>> message, this was intentional.  I was reluctant to do it, but ultimately
>> Peter Lieven convinced me that it isn't just about using fancy new APIs;
>> libiscsi was too buggy to be useful until release 1.8.0 (even 1.9.0
>> requires a patch to avoid segfaults, and many more if you want to run it
>> on ARM).
>>
>> Paolo
>
>



Re: [Qemu-devel] [RFC] How to handle feature regressions in new QEMU releases

2014-07-23 Thread ronnie sahlberg
On Wed, Jul 16, 2014 at 10:29 AM, Michael Tokarev  wrote:
> 16.07.2014 21:23, ronnie sahlberg wrote:
>
>> If you ask debian to upgrade. Could you ask them to wait and upgrade after I
>> have release the next version, hopefully if all goes well, at the end
>> of this week?
>
> There's no problem in updating now to fix missing .pc file and to update
> next week to include a new version.
>

Please find a new version 1.12 on the website.

Thanks.
ronnie sahlberg



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2011-12-22 Thread ronnie sahlberg
Hi,

You mean using the '-S' mode and setting the key after qemu has
initialized all devices but before it actually starts booting the
guest?
That would not be easy and would probably require some intrusive in
qemu itself that might affect other device types, so I am unsure.

The difference between qcow2 and iscsi and the problem is that .open()
is called for all devices before the monitor is started, so .open() is
called before we would have a chance to even hand the password to
qemu.

For qcow2 this is not a problem since even if the file is password
protected the file header is not, so you can still open the file and
read the header (to discover it is password protected) without knowing
the password.
So qcow2 can still open the file and do all the sanity checks it needs
without yet knowing the password.
You only need the password at a much later at the stage when you want
to actually start doing I/O to the device.


iSCSI is different because the username/password is not really
associated with the LUN or doing I/O, the password is required to
connect to the iscsi target itself, so without the password we can not
even connect to the target, much less run REPORTLUNS or TESTUNITREADY
to check if the LUN even exists. We cant even check if the target iqn
name even exists at all without the password.


So to pass the password similarly to how qcow2 does it, I see three
options, neither of which are especially attractive :

1, Change iscsi_open() to become a NOOP, dont even try to connect to
the target until the CPU is started (and -S has passed the password)
and do the connect on first i/o  instead of .open()
This would mean that if the target or the LUN does not even exist at
all, we wont get an early failure where QEMU would fail to start due
to "the device does not exist", instead the error would be postponed
until much later and cause qemu to error out on first i/o.
This looks like a horrible kludge.

2, Change -S so that it is called before qemu calls any of the .open()
functions for block devices.
I am not sure what impact this would have and if -S users would be
"surprised" if the monitor is open but the block devices have not yet
been scanned.
This looks difficult to easily integrate with the command line parsing anyway.
This too sounds like a horrible kludge.

3, Add a new block device method   .open_post_start() which is just
like .open()  but invoked after the CPU has been started.
So .open() is called like today before -S monitor is started, then the
new .open_post_start() would be called once -S has started the CPU
with the 'c' command.
In this case for iscsi, .open() could be changed to just create a
context for the device and establish the TCP connection to the target
but it would not log in to the target.
.open_post_start() which could be called for all block devices once
the CPU has been started would then use the context and connection
from .open()
and perform the actual login and discover the LUN.

3 would still be a little bit ugly since a whole set of error
conditions such as "target iqn does not exist", "lun does not exist",
"wrong type of LUN", "incorrect password" ...  would not be detected
until once the guest CPU is actually started.
Would be less ugly than options 1,2 though.


But even with something like option 3, to make it possible to use -S
to provide the password, this is mainly for addressing the case when
running via libvirt ? I still want to provide a mechanism for setting
the password directly in a configuration file for users that run qemu
directly from the command line and are not using -S.


comments ?


regards
ronnie sahlberg


On Mon, Dec 19, 2011 at 12:48 AM, Paolo Bonzini  wrote:
> On 12/18/2011 05:48 AM, Ronnie Sahlberg wrote:
>>
>> This patch adds configuration variables for iSCSI to set
>> initiator-name to use when logging in to the target,
>> which type of header-digest to negotiate with the target
>> and username and password for CHAP authentication.
>>
>> This allows specifying a initiator-name either from the command line
>> -iscsi initiator-name=iqn.2004-01.com.example:test
>> or from a configuration file included with -readconfig
>> [iscsi]
>>   initiator-name = iqn.2004-01.com.example:test
>>   header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
>>   user = CHAP username
>>   password = CHAP password
>
>
> header digest and user look like they should be options to the block driver.
>  For the password you can reuse the system used by qcow2 perhaps?
>
> Paolo



Re: [Qemu-devel] OEM Windows in Qemu

2011-12-22 Thread ronnie sahlberg
Why not just buy a non-OEM version ?
They are surely not tied to a specific piece of hardware and its
licence should allow to run on "arbitrary PC HW you happen to want to
install/run it on once activated"




On Fri, Dec 23, 2011 at 5:11 PM,   wrote:
>
>
>>"My patch" does not produce any SLIC at all. The instructions mentions
>>using SLIC from your machine - "my patch" is just a way to _embed_ a given
>>data into VM, not a way to "produce" anything. You get in your VM what
>>you have outside, either in a file or in your own BIOS, depending on where
>>you took that data from.
>
> Right, I meant to say simply that it did copy the SLIC in a recognizable
> form and embed it into the BIOS.
>
>
>
>>Yes, the patch only changes RSDT to match SLIC - this is what win7 verifies,
>>all other tables does not matter for win7. And yes, it might be different
>>for winXP - you may try setting all tables in VM to be of DELL OEM and see
>>what happens.
>
> I had high hopes for this, but no joy.  All the DMI tables now read Dell
> Inc but XP refuses to activate.  I think the key is those specific
> memory addresses I mentioned earlier.  I tried raw write of "Dell Inc"
> into the specified address space, but there are 2 problems with this.
> First is that the address space already contains data so I overwrote
> something.  Second even then Windows did not activate, I think the
> activation check only occurs at certain times, probably on boot, so I
> made my changes after it checked and it didn't check again and so didn't
> see the changes I made.
>
> I need to modify the BIOS to contain the string so it is visible on boot
> without me having to manually edit it.  Is there a way to hardcode the
> string into a specific memory address when I compile the BIOS?
>
> Brian
>
>



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2011-12-23 Thread ronnie sahlberg
I do recognize that there is real need to provide password to iscsi
(and rbd?) via the monitor.
I myself do not use libvirt (it is just too much pain to be worth it
to try to upgrade it out-of-step with your distribution)
but I recognize that likely vast majority of users would use libvirt.


Once consensus is reached on "how to handle devices that need
two-stage setup" and what it would look like
I can volunteer to implement it. It would however be way out of scope
of and bigger than a simple iscsi-local patch


What about the idea about splitting .open() into two stages:
.open()   /* stage one, called before -S monitor is invoked */
.open_stage_2() /* called after -S finishes and when it starts to boot
the guest CPU */

?



regards
ronnie sahlberg


On Fri, Dec 23, 2011 at 8:08 PM, Paolo Bonzini  wrote:
> On 12/22/2011 09:51 PM, ronnie sahlberg wrote:
>>
>> The difference between qcow2 and iscsi and the problem is that .open()
>> is called for all devices before the monitor is started, so .open() is
>> called before we would have a chance to even hand the password to
>> qemu.
>>
>> For qcow2 this is not a problem since even if the file is password
>> protected the file header is not, so you can still open the file and
>> read the header (to discover it is password protected) without knowing
>> the password.
>> So qcow2 can still open the file and do all the sanity checks it needs
>> without yet knowing the password.
>
>
> You're right.
>
> I see that the libvirt driver for rbd simply passes it on the command line.
>  I hope I'll be able to review the patch further today.
>
> Paolo



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-20 Thread ronnie sahlberg
On Thu, Jan 19, 2012 at 11:17 PM, Kevin Wolf  wrote:
> Am 18.12.2011 05:48, schrieb Ronnie Sahlberg:
>> This patch adds configuration variables for iSCSI to set
>> initiator-name to use when logging in to the target,
>> which type of header-digest to negotiate with the target
>> and username and password for CHAP authentication.
>>
>> This allows specifying a initiator-name either from the command line
>> -iscsi initiator-name=iqn.2004-01.com.example:test
>> or from a configuration file included with -readconfig
>> [iscsi]
>>   initiator-name = iqn.2004-01.com.example:test
>>   header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
>>   user = CHAP username
>>   password = CHAP password
>>
>> The patch also updates the manpage and qemu-doc
>>
>> Signed-off-by: Ronnie Sahlberg 
>
> So these options are global? What if I wanted to use two different
> setups for two different images?
>


Good point.
I will rework the patch so that it first checks for
[iscsi "iqn.target.name"]
and if that is not found it falls-back to just checking for [iscsi]

That would allow to have one "catch all" section for all targets, but
also the possibility to override and use different settings on a
per-target basis.

I will post an updated patch in a day or two.



regards
ronnie sahlberg



[Qemu-devel] [PATCH] Add configuration variables for iscsi

2012-01-21 Thread Ronnie Sahlberg

Kevin, List


This is version 2 of the patch to add configuration variables for iSCSI.
This version adds the feature to specify configuration blocks that apply to a 
specific target name, allowing qemu to use different settings if/when 
connecting one guest to multiple different iscsi targets.


This patch adds configuration variables for iSCSI to set
initiator-name to use when logging in to the target,
which type of header-digest to negotiate with the target
and username and password for CHAP authentication.

This allows specifying a initiator-name either from the command line
-iscsi initiator-name=iqn.2004-01.com.example:test
or from a configuration file included with -readconfig
[iscsi]
 initiator-name = iqn.2004-01.com.example:test
 header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
 user = CHAP username
 password = CHAP password

In the configuration file it is also possible to set a target specific
configuratyion block using the header
[iscsi "iqn.target.name"]

When a iscsi session is initialized, it will first try to use a configuration
section that matches the target name.
If no such block is found, it will fall-back to try the default [iscsi] section 
instead.


regards
ronnie sahlberg





[Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-21 Thread Ronnie Sahlberg
This patch adds configuration variables for iSCSI to set
initiator-name to use when logging in to the target,
which type of header-digest to negotiate with the target
and username and password for CHAP authentication.

This allows specifying a initiator-name either from the command line
-iscsi initiator-name=iqn.2004-01.com.example:test
or from a configuration file included with -readconfig
[iscsi]
  initiator-name = iqn.2004-01.com.example:test
  header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
  user = CHAP username
  password = CHAP password

If you use several different targets, you can also configure this on a per
target basis by using a group name:
[iscsi "iqn.target.name"]
...

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c   |  139 +++
 qemu-config.c   |   27 +++
 qemu-doc.texi   |   38 +++-
 qemu-options.hx |   16 +--
 vl.c|8 +++
 5 files changed, 213 insertions(+), 15 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 938c568..bd3ca11 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -455,6 +455,109 @@ iscsi_connect_cb(struct iscsi_context *iscsi, int status, 
void *command_data,
 }
 }
 
+static int parse_chap(struct iscsi_context *iscsi, const char *target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *user = NULL;
+const char *password = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return 0;
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return 0;
+}
+}
+
+user = qemu_opt_get(opts, "user");
+if (!user) {
+return 0;
+}
+
+password = qemu_opt_get(opts, "password");
+if (!password) {
+error_report("CHAP username specified but no password was given");
+return -1;
+}
+
+if (iscsi_set_initiator_username_pwd(iscsi, user, password)) {
+error_report("Failed to set initiator username and password");
+return -1;
+}
+
+return 0;
+}
+
+static void parse_header_digest(struct iscsi_context *iscsi, const char 
*target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *digest = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return;
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return;
+}
+}
+
+digest = qemu_opt_get(opts, "header-digest");
+if (!digest) {
+return;
+}
+
+if (!strcmp(digest, "CRC32C")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_CRC32C);
+} else if (!strcmp(digest, "NONE")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE);
+} else if (!strcmp(digest, "CRC32C-NONE")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_CRC32C_NONE);
+} else if (!strcmp(digest, "NONE-CRC32C")) {
+iscsi_set_header_digest(iscsi, ISCSI_HEADER_DIGEST_NONE_CRC32C);
+} else {
+error_report("Invalid header-digest setting : %s", digest);
+}
+}
+
+static char *parse_initiator_name(const char *target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *name = NULL;
+
+list = qemu_find_opts("iscsi");
+if (!list) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+
+opts = qemu_opts_find(list, target);
+if (opts == NULL) {
+opts = QTAILQ_FIRST(&list->head);
+if (!opts) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+}
+
+name = qemu_opt_get(opts, "initiator-name");
+if (!name) {
+return g_strdup("iqn.2008-11.org.linux-kvm");
+}
+
+return g_strdup(name);
+}
+
 /*
  * We support iscsi url's on the form
  * iscsi://[%@][:]//
@@ -465,6 +568,7 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 struct iscsi_context *iscsi = NULL;
 struct iscsi_url *iscsi_url = NULL;
 struct IscsiTask task;
+char *initiator_name = NULL;
 int ret;
 
 if ((BDRV_SECTOR_SIZE % 512) != 0) {
@@ -474,16 +578,6 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 return -EINVAL;
 }
 
-memset(iscsilun, 0, sizeof(IscsiLun));
-
-/* Should really append the KVM name after the ':' here */
-iscsi = iscsi_create_context("iqn.2008-11.org.linux-kvm:");
-if (iscsi == NULL) {
-error_report("iSCSI: Failed to create iSCSI context.");
-ret = -ENOMEM;
-goto failed;
-}
-
 iscsi_url = iscsi_parse_full_url(iscsi, filename);
 if (iscsi_url

Re: [Qemu-devel] [PATCH] readconfig: add emulation of /dev/fd/ to platforms that that lacks this API

2012-03-20 Thread ronnie sahlberg
None.
No platforms that I care about.


On Tue, Mar 20, 2012 at 3:43 AM, Anthony Liguori  wrote:
> On 03/03/2012 01:49 AM, Ronnie Sahlberg wrote:
>>
>> Please find a patch to -readconfig.
>>
>> On many platforms -readconfig /dev/fd/  can be used to read from an
>> already opened and inherited filedescriptor.
>>
>> On platforms that do not natively provide /dev/fd/
>
>
> What platforms don't provide /dev/fd/ where fd passing makes sense?
>
> Regards,
>
> Anthony Liguori
>
>
>> add emulation of this by checking if the ofiginal fopen(path) failed, then
>> IF
>> the path starts with /dev/fd/ then try to read the descriptorvalue and
>> fdopen() it.
>>
>>
>> It means that we can use -readconfig /dev/fd/  on all platforms. On
>> those that provide a /dev/fd we just open that file. On those that do not we
>> emulate it.
>>
>>
>> regards
>> ronnie sahlberg
>>
>>
>



[Qemu-devel] (no subject)

2011-06-11 Thread Ronnie Sahlberg
Please find attached a patch to add built-in support for iSCSI into QEMU.
Please review and/or apply this patch.

This is the latest version of this patch and I think I have addressed all 
previous concerns and suggestions.


Using built-in iSCSI support has many advantages for certain use cases :

* if you have very many iSCSI devices it may be impractical to expose
all LUNs to the underlying host.

* the devices become private to the guest and are not visible to the host.
This automatically avoids polluting the page-cache on the host.

* security, the devices are private to the guest, which prevents other guests 
or the host itself from accessing the devices.

* security, it allows non-root users a secure way to get private and password 
protected access to the device by using CHAP authentication.

* migration, many other virtualization systems provide built-in iscsi clients 
like this. Also providing this as feature in QEMU may make it easier for such 
users to migrate over to QEMU/KVM.

* easier to maintain. For users with very many QEMU instances I think having 
guest-private iscsi targets and LUNs may be easier to manage than 'huge set of 
files in /dev/scsi/*'

* easier to maintain, when copying a QEMU instance from one host to another, 
this offers a more self-contained solution where the QEMU command line itself 
cotnains all required storage configuration, instead of also having to 
coordinate this move with setting up and tearing down open-iscsi logins on the 
underlying hosts.




The patch has been tested by installing and running several distributions such 
as RHEL6 and Ubuntu on devices, both system disk as well as the installer CDROM 
as being iscsi devices.

This testing has also been performed by running full QEMU under valgrind.

Performance is comparable to using open-iscsi devices.





[Qemu-devel] [PATCH] iSCSI block driver support

2011-06-11 Thread Ronnie Sahlberg
This patch adds a new block driver : block.iscsi.c
This driver interfaces with the multiplatform posix library
for iscsi initiator/client access to iscsi devices hosted at
git://github.com/sahlberg/libiscsi.git

The patch adds the driver to interface with the iscsi library.
It also updated the configure script to
* by default, probe is libiscsi is available and if so, build
  qemu against libiscsi.
* --enable-libiscsi
  Force a build against libiscsi. If libiscsi is not available
  the build will fail.
* --disable-libiscsi
  Do not link against libiscsi, even if it is available.

When linked with libiscsi, qemu gains support to access iscsi resources
such as disks and cdrom directly, without having to make the devices visible
to the host.

You can specify devices using a iscsi url of the form :
iscsi://[[:@]][:/
When using authentication, the password can optionally be set with
LIBISCSI_CHAP_PASSWORD="password" to avoid it showing up in the process list

Example:
./x86_64-softmmu/qemu-system-x86_64 -m 1024 -cdrom 
iscsi://127.0.0.1/iqn.ronnie.test/2 --drive 
file=iscsi://127.0.0.1/iqn.ronnie.test/1 -boot d -enable-kvm

Signed-off-by: Ronnie Sahlberg 
---
 Makefile.objs |1 +
 block/iscsi.c |  596 +
 configure |   31 +++
 trace-events  |6 +
 4 files changed, 634 insertions(+), 0 deletions(-)
 create mode 100644 block/iscsi.c

diff --git a/Makefile.objs b/Makefile.objs
index 52d8b23..1b21c14 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -25,6 +25,7 @@ block-nested-y += qed-check.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
+block-nested-$(CONFIG_LIBISCSI) += iscsi.o
 block-nested-$(CONFIG_CURL) += curl.o
 block-nested-$(CONFIG_RBD) += rbd.o
 
diff --git a/block/iscsi.c b/block/iscsi.c
new file mode 100644
index 000..07fb3bc
--- /dev/null
+++ b/block/iscsi.c
@@ -0,0 +1,596 @@
+/*
+ * QEMU Block driver for iSCSI images
+ *
+ * Copyright (c) 2010 Ronnie Sahlberg 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "config-host.h"
+
+#include 
+#include "sysemu.h"
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "block_int.h"
+#include "trace.h"
+
+#include 
+#include 
+
+
+typedef struct IscsiLun {
+struct iscsi_context *iscsi;
+int lun;
+int block_size;
+unsigned long num_blocks;
+} IscsiLun;
+
+typedef struct IscsiAIOCB {
+BlockDriverAIOCB common;
+QEMUIOVector *qiov;
+QEMUBH *bh;
+IscsiLun *iscsilun;
+struct scsi_task *task;
+uint8_t *buf;
+int canceled;
+int status;
+size_t read_size;
+size_t read_offset;
+} IscsiAIOCB;
+
+struct IscsiTask {
+IscsiLun *iscsilun;
+int status;
+int complete;
+};
+
+static void
+iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void 
*command_data,
+void *private_data)
+{
+}
+
+static void
+iscsi_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+IscsiAIOCB *acb = (IscsiAIOCB *)blockacb;
+IscsiLun *iscsilun = acb->iscsilun;
+
+acb->status = -ECANCELED;
+acb->common.cb(acb->common.opaque, acb->status);
+acb->canceled = 1;
+
+iscsi_task_mgmt_abort_task_async(iscsilun->iscsi, acb->task,
+ iscsi_abort_task_cb, NULL);
+}
+
+static AIOPool iscsi_aio_pool = {
+.aiocb_size = sizeof(IscsiAIOCB),
+.cancel = iscsi_aio_cancel,
+};
+
+
+static void iscsi_process_read(void *arg);
+static void iscsi_process_write(void *arg);
+
+static int iscsi_process_flush(void *arg)
+{
+IscsiLun *iscsilun = arg;
+
+return iscsi_queue_length(iscsilun->iscsi) > 0;
+}
+
+static void
+iscsi_set_events(IscsiLun *iscsilun)
+{
+ 

[Qemu-devel] iSCSI support for QEMU

2011-06-11 Thread ronnie sahlberg
Sorry about the missing subject line in the previous mail
Got confused when using git-send-patch :-)

regards
ronnie sahlberg

On Sun, Jun 12, 2011 at 12:47 PM, Ronnie Sahlberg
 wrote:
> Please find attached a patch to add built-in support for iSCSI into QEMU.
> Please review and/or apply this patch.
>
> This is the latest version of this patch and I think I have addressed all 
> previous concerns and suggestions.
>
>
> Using built-in iSCSI support has many advantages for certain use cases :
>
> * if you have very many iSCSI devices it may be impractical to expose
> all LUNs to the underlying host.
>
> * the devices become private to the guest and are not visible to the host.
> This automatically avoids polluting the page-cache on the host.
>
> * security, the devices are private to the guest, which prevents other guests 
> or the host itself from accessing the devices.
>
> * security, it allows non-root users a secure way to get private and password 
> protected access to the device by using CHAP authentication.
>
> * migration, many other virtualization systems provide built-in iscsi clients 
> like this. Also providing this as feature in QEMU may make it easier for such 
> users to migrate over to QEMU/KVM.
>
> * easier to maintain. For users with very many QEMU instances I think having 
> guest-private iscsi targets and LUNs may be easier to manage than 'huge set 
> of files in /dev/scsi/*'
>
> * easier to maintain, when copying a QEMU instance from one host to another, 
> this offers a more self-contained solution where the QEMU command line itself 
> cotnains all required storage configuration, instead of also having to 
> coordinate this move with setting up and tearing down open-iscsi logins on 
> the underlying hosts.
>
>
>
>
> The patch has been tested by installing and running several distributions 
> such as RHEL6 and Ubuntu on devices, both system disk as well as the 
> installer CDROM as being iscsi devices.
>
> This testing has also been performed by running full QEMU under valgrind.
>
> Performance is comparable to using open-iscsi devices.
>
>
>



[Qemu-devel] [PATCH 0/4] ISCSI: Passthough updates

2012-05-25 Thread Ronnie Sahlberg

Paolo, list

Please find a set of patches for iscsi.

1, The recent change to the eventsystem in iscsi is racy. If the async connect 
of the socket takes longer than almost instant, the shortcircuit trying to 
write directly to the socket may try to do so before the socket is established.
So we hcange the logic here and ONLY try the shortcircuit avoiding setting up 
POLLOUT if we have been fully logged in.

This avoids the race where we might crash if the socket is not yet connected.

2, Swithc to using READCAPACITY10  for MMC devices. Only SBC devices support 
READCAPACITY16 so far.

3, Switch to use READ10 for MMC devices instead of READ16. With this patch 
ISCSI and MMC devices now work correctly.

4, If the device is a SMC device, then force the device to be sg.
We dont support emulation as far as i know for SMV devices, so these device 
types would only work for sg or iscsi passthrough anyway.
The reason we need to forcer sg is that since we can not READ from a SMC 
device, we need to shortcircuit  find_image_format()  before it tries to do the 
bdrv_pread() trying to guess what format the image is.

With this patch I have verified I can access and manage a media changer via 
iscsi-passthrough.


regards
ronnie sahblerg




[Qemu-devel] [PATCH 1/4] ISCSI: Only call READCAPACITY16 for SBC devices

2012-05-25 Thread Ronnie Sahlberg
Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index df0b6c8..39d75cb 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -727,13 +727,20 @@ iscsi_inquiry_cb(struct iscsi_context *iscsi, int status, 
void *command_data,
 
 scsi_free_scsi_task(task);
 
-task = iscsi_readcapacity16_task(iscsi, itask->iscsilun->lun,
+switch (itask->iscsilun->type) {
+case TYPE_DISK:
+task = iscsi_readcapacity16_task(iscsi, itask->iscsilun->lun,
iscsi_readcapacity16_cb, opaque);
-if (task == NULL) {
-error_report("iSCSI: failed to send readcapacity16 command.");
-itask->status   = 1;
+if (task == NULL) {
+error_report("iSCSI: failed to send readcapacity16 command.");
+itask->status   = 1;
+itask->complete = 1;
+return;
+}
+break;
+default:
+itask->status   = 0;
 itask->complete = 1;
-return;
 }
 }
 
-- 
1.7.3.1




[Qemu-devel] [PATCH 2/4] ISCSI: Use READCAPACITY10 for MMC devices

2012-05-25 Thread Ronnie Sahlberg
Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   47 +++
 1 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 39d75cb..2ddb9e5 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -700,6 +700,42 @@ iscsi_readcapacity16_cb(struct iscsi_context *iscsi, int 
status,
 }
 
 static void
+iscsi_readcapacity10_cb(struct iscsi_context *iscsi, int status,
+void *command_data, void *opaque)
+{
+struct IscsiTask *itask = opaque;
+struct scsi_readcapacity10 *rc10;
+struct scsi_task *task = command_data;
+
+if (status != 0) {
+error_report("iSCSI: Failed to read capacity of iSCSI lun. %s",
+ iscsi_get_error(iscsi));
+itask->status   = 1;
+itask->complete = 1;
+scsi_free_scsi_task(task);
+return;
+}
+
+rc10 = scsi_datain_unmarshall(task);
+if (rc10 == NULL) {
+error_report("iSCSI: Failed to unmarshall readcapacity10 data.");
+itask->status   = 1;
+itask->complete = 1;
+scsi_free_scsi_task(task);
+return;
+}
+
+itask->iscsilun->block_size = rc10->block_size;
+itask->iscsilun->num_blocks = rc10->lba + 1;
+itask->bs->total_sectors= itask->iscsilun->num_blocks *
+   itask->iscsilun->block_size / BDRV_SECTOR_SIZE ;
+
+itask->status   = 0;
+itask->complete = 1;
+scsi_free_scsi_task(task);
+}
+
+static void
 iscsi_inquiry_cb(struct iscsi_context *iscsi, int status, void *command_data,
  void *opaque)
 {
@@ -738,6 +774,17 @@ iscsi_inquiry_cb(struct iscsi_context *iscsi, int status, 
void *command_data,
 return;
 }
 break;
+case TYPE_ROM:
+task = iscsi_readcapacity10_task(iscsi, itask->iscsilun->lun,
+   0, 0,
+   iscsi_readcapacity10_cb, opaque);
+if (task == NULL) {
+error_report("iSCSI: failed to send readcapacity16 command.");
+itask->status   = 1;
+itask->complete = 1;
+return;
+}
+break;
 default:
 itask->status   = 0;
 itask->complete = 1;
-- 
1.7.3.1




[Qemu-devel] [PATCH 3/4] ISCSI: Only use READ16 for SBC devices. Use READ10 for other device types such as MMC

2012-05-25 Thread Ronnie Sahlberg
Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   21 -
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 2ddb9e5..a015a52 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -379,14 +379,25 @@ iscsi_aio_readv(BlockDriverState *bs, int64_t sector_num,
 memset(acb->task, 0, sizeof(struct scsi_task));
 
 acb->task->xfer_dir = SCSI_XFER_READ;
-acb->task->cdb_size = 16;
-acb->task->cdb[0]  = 0x88;
 lba = sector_qemu2lun(sector_num, iscsilun);
-*(uint32_t *)&acb->task->cdb[2]  = htonl(lba >> 32);
-*(uint32_t *)&acb->task->cdb[6]  = htonl(lba & 0x);
-*(uint32_t *)&acb->task->cdb[10] = htonl(num_sectors);
 acb->task->expxferlen = qemu_read_size;
 
+switch (iscsilun->type) {
+case TYPE_DISK:
+acb->task->cdb_size = 16;
+acb->task->cdb[0]  = 0x88;
+*(uint32_t *)&acb->task->cdb[2]  = htonl(lba >> 32);
+*(uint32_t *)&acb->task->cdb[6]  = htonl(lba & 0x);
+*(uint32_t *)&acb->task->cdb[10] = htonl(num_sectors);
+break;
+default:
+acb->task->cdb_size = 10;
+acb->task->cdb[0]  = 0x28;
+*(uint32_t *)&acb->task->cdb[2] = htonl(lba);
+*(uint16_t *)&acb->task->cdb[7] = htons(num_sectors);
+break;
+}
+
 if (iscsi_scsi_command_async(iscsi, iscsilun->lun, acb->task,
  iscsi_aio_read16_cb,
  NULL,
-- 
1.7.3.1




[Qemu-devel] [PATCH 4/4] ISCSI: If the device we open is a SMC device, then force the use of sg. We dont have any medium changer emulation so only passthrough via real sg or scsi-generic via iscsi wou

2012-05-25 Thread Ronnie Sahlberg
Forcing sg also makes qemu skip trying to read from the device to guess the 
image format by reading from the device (find_image_format()).
SMC devices do not implement READ6/10/12/16  so it is noit possible to read 
from them.

With this patch I can successfully manage a SMC device wiht iscsi in 
passthrough mode.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index a015a52..9ce38b5 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1032,6 +1032,15 @@ static int iscsi_open(BlockDriverState *bs, const char 
*filename, int flags)
 if (iscsi_url != NULL) {
 iscsi_destroy_url(iscsi_url);
 }
+
+/* Medium changer. We dont have any emulation for this so this must
+   be sg ioctl compatible. We force it to be sg, otherwise qemu will try
+   to read from the device to guess the image format.
+ */
+if (iscsilun->type == TYPE_MEDIUM_CHANGER) {
+bs->sg = 1;
+}
+
 return 0;
 
 failed:
-- 
1.7.3.1




Re: [Qemu-devel] [PULL 1.1 0/2] SCSI patches for 1.1.0-rc3

2012-05-25 Thread ronnie sahlberg
Paolo,

You need this patch too since without it it might crash.

commit 6e46eb1846a862dad253be1a576f8554071b154a
Author: Ronnie Sahlberg 
Date:   Sat May 26 10:28:05 2012 +1000

ISCSI: We can only do the shortcircuit and write directly to the socket
IFF we know the socket is open (and writeable).
If the target is more than insignificant distance away, we could otherwise
try to write to the socket before the nonblocking connect has completed
which would cause a crash.

Signed-off-by: Ronnie Sahlberg 


The other patches in the series can wait until later,  but this one is
needed for 1.1


regards
ronnie sahlberg



On Fri, May 25, 2012 at 9:11 PM, Paolo Bonzini  wrote:
> Il 22/05/2012 14:25, Paolo Bonzini ha scritto:
>> The following changes since commit 76ee152a86d5f2533443ce4d2be6fe253cfb3c45:
>>
>>   Update version to 1.1.0-rc2 (2012-05-14 17:56:50 -0500)
>>
>> are available in the git repository at:
>>
>>   git://github.com/bonzini/qemu.git scsi-next
>>
>> for you to fetch changes up to e1a2d34f4abd8a117f5c5a25a5bb2e67d597de23:
>>
>>   ISCSI: call qemu_notify_event() after updating events (2012-05-22 14:14:05 
>> +0200)
>
> Rebased, commit cd0b3cc9eee6f9b6b165f24787b4290bdc1c3f68 is now the tip
> of the branch.
>
> Paolo
>
>> ----
>> Jim Meyering (1):
>>       scsi: declare vmstate_info_scsi_requests to be static
>>
>> Ronnie Sahlberg (1):
>>       ISCSI: call qemu_notify_event() after updating events
>>
>>  block/iscsi.c |    7 +++
>>  hw/scsi-bus.c |    2 +-
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>
>



Re: [Qemu-devel] [PULL 1.1 0/2] SCSI patches for 1.1.0-rc3

2012-05-26 Thread ronnie sahlberg
Almost,
but connecter=1 should be set after the if statement not inside it.

 if (status != 0) {
+itask->iscsilun->connected = 1;
 itask->status   = 1;

I.e. dont set it in the error path, set it after the block. then it
should be all good.



optionally you can revert 5b5e96bba6835b794ba237c1ddd6580dea8d4aef
and use the patch I attach instead


regards
ronnie sahlberg


 2012 at 5:46 PM, Paolo Bonzini  wrote:
> Il 26/05/2012 07:41, ronnie sahlberg ha scritto:
>> Paolo,
>>
>> You need this patch too since without it it might crash.
>>
>> commit 6e46eb1846a862dad253be1a576f8554071b154a
>> Author: Ronnie Sahlberg 
>> Date:   Sat May 26 10:28:05 2012 +1000
>>
>>     ISCSI: We can only do the shortcircuit and write directly to the socket
>>     IFF we know the socket is open (and writeable).
>>     If the target is more than insignificant distance away, we could 
>> otherwise
>>     try to write to the socket before the nonblocking connect has completed
>>     which would cause a crash.
>>
>>     Signed-off-by: Ronnie Sahlberg 
>>
>>
>> The other patches in the series can wait until later,  but this one is
>> needed for 1.1
>
> I didn't get the patch, so I redid it.
>
> Can you test asap git://github.com/bonzini/qemu.git, branch
> scsi-candidate-1.1 to see if it matches what you want to be in 1.1?
>
> Paolo


0001-ISCSI-We-can-only-do-the-shortcircuit-and-write-dire.patch
Description: Binary data


Re: [Qemu-devel] [PULL 1.1 0/2] SCSI patches for 1.1.0-rc3

2012-05-26 Thread ronnie sahlberg
Im compiling your branch now and will verify all is good.

It is missing an include  so you need to add this patch to compile :




On Sat, May 26, 2012 at 5:59 PM, ronnie sahlberg
 wrote:
> Almost,
> but connecter=1 should be set after the if statement not inside it.
>
>     if (status != 0) {
> +        itask->iscsilun->connected = 1;
>         itask->status   = 1;
>
> I.e. dont set it in the error path, set it after the block. then it
> should be all good.
>
>
>
> optionally you can revert 5b5e96bba6835b794ba237c1ddd6580dea8d4aef
> and use the patch I attach instead
>
>
> regards
> ronnie sahlberg
>
>
>  2012 at 5:46 PM, Paolo Bonzini  wrote:
>> Il 26/05/2012 07:41, ronnie sahlberg ha scritto:
>>> Paolo,
>>>
>>> You need this patch too since without it it might crash.
>>>
>>> commit 6e46eb1846a862dad253be1a576f8554071b154a
>>> Author: Ronnie Sahlberg 
>>> Date:   Sat May 26 10:28:05 2012 +1000
>>>
>>>     ISCSI: We can only do the shortcircuit and write directly to the socket
>>>     IFF we know the socket is open (and writeable).
>>>     If the target is more than insignificant distance away, we could 
>>> otherwise
>>>     try to write to the socket before the nonblocking connect has completed
>>>     which would cause a crash.
>>>
>>>     Signed-off-by: Ronnie Sahlberg 
>>>
>>>
>>> The other patches in the series can wait until later,  but this one is
>>> needed for 1.1
>>
>> I didn't get the patch, so I redid it.
>>
>> Can you test asap git://github.com/bonzini/qemu.git, branch
>> scsi-candidate-1.1 to see if it matches what you want to be in 1.1?
>>
>> Paolo


0001-ISCSI-need-to-include-scsi-defs.h.patch
Description: Binary data


Re: [Qemu-devel] [PULL 1.1 0/2] SCSI patches for 1.1.0-rc3

2012-05-26 Thread ronnie sahlberg
I have compiled your branch and run through some tests.

It all looks good as long as you apply the patch to #include "hw/scsi-defs.h"



On Sat, May 26, 2012 at 6:17 PM, ronnie sahlberg
 wrote:
> Im compiling your branch now and will verify all is good.
>
> It is missing an include  so you need to add this patch to compile :
>
>
>
>
> On Sat, May 26, 2012 at 5:59 PM, ronnie sahlberg
>  wrote:
>> Almost,
>> but connecter=1 should be set after the if statement not inside it.
>>
>>     if (status != 0) {
>> +        itask->iscsilun->connected = 1;
>>         itask->status   = 1;
>>
>> I.e. dont set it in the error path, set it after the block. then it
>> should be all good.
>>
>>
>>
>> optionally you can revert 5b5e96bba6835b794ba237c1ddd6580dea8d4aef
>> and use the patch I attach instead
>>
>>
>> regards
>> ronnie sahlberg
>>
>>
>>  2012 at 5:46 PM, Paolo Bonzini  wrote:
>>> Il 26/05/2012 07:41, ronnie sahlberg ha scritto:
>>>> Paolo,
>>>>
>>>> You need this patch too since without it it might crash.
>>>>
>>>> commit 6e46eb1846a862dad253be1a576f8554071b154a
>>>> Author: Ronnie Sahlberg 
>>>> Date:   Sat May 26 10:28:05 2012 +1000
>>>>
>>>>     ISCSI: We can only do the shortcircuit and write directly to the socket
>>>>     IFF we know the socket is open (and writeable).
>>>>     If the target is more than insignificant distance away, we could 
>>>> otherwise
>>>>     try to write to the socket before the nonblocking connect has completed
>>>>     which would cause a crash.
>>>>
>>>>     Signed-off-by: Ronnie Sahlberg 
>>>>
>>>>
>>>> The other patches in the series can wait until later,  but this one is
>>>> needed for 1.1
>>>
>>> I didn't get the patch, so I redid it.
>>>
>>> Can you test asap git://github.com/bonzini/qemu.git, branch
>>> scsi-candidate-1.1 to see if it matches what you want to be in 1.1?
>>>
>>> Paolo



Re: [Qemu-devel] [PATCH 4/4] ISCSI: If the device we open is a SMC device, then force the use of sg. We dont have any medium changer emulation so only passthrough via real sg or scsi-generic via iscsi

2012-05-28 Thread ronnie sahlberg
Paolo


I think I have seen a problem inside libiscsi that could be triggered
by the shortcut.

Can you remove this shortcut completely :

-/* Try to write as much as we can to the socket
- * without setting up an event.
- * Only do this if we are completely logged in, so we know that
- * the socket is in connected state.
- */
-if (iscsi_is_logged_in(iscsi)) {
-if (iscsi_which_events(iscsi) & POLLOUT) {
-iscsi_process_write(iscsilun);
-}
-}

I think there is a problem inside libiscsi if the socket becomes full
and is no longer writeable and we try to write via this shortcurcuit.
It will take a while until I can verify or fix that issue and before a
new version of libiscsi can be available
so I would feel most comfortable with if we just remove this
optimization from QEMU for now.

It can be added back later once libiscsi is fixed.



regards
ronnie sahlberg


On Mon, May 28, 2012 at 4:48 PM, Paolo Bonzini  wrote:
> Il 27/05/2012 15:12, Andreas Färber ha scritto:
>>> > Modified to also do the same for tapes, applied to scsi-next branch for 
>>> > 1.2.
>> Paolo, it seems you haven't pushed scsi-next since then.
>
> Yeah, I have a pending push request for scsi-next, so I'm waiting till
> Anthony applies it before pushing 1.2-only patches (I wasn't expecting
> parallel 1.1/1.2 development for SCSI).
>
>> I hope you've
>> also shortened the subject to a humanly bearable length?
>
> Yes. :)
>
> Paolo



Re: [Qemu-devel] Block job commands in QEMU 1.2 [v2, including support for replication]

2012-05-31 Thread ronnie sahlberg
On Fri, May 25, 2012 at 11:25 PM, Paolo Bonzini  wrote:
> Il 25/05/2012 14:09, Stefan Hajnoczi ha scritto:
>>> >
>>> > Perhaps that be simply a new qemu-img subcommand?  It should be possible
>>> > to run it while the VM is offline.  Then the file that is produced could
>>> > be fed to blockdev-dirty-enable.
>> For both continuous replication and incremental backups we cannot
>> require that the guest is shut down in order to collect the dirty
>> bitmap, I think.
>
> Yes, that is a problem for internal snapshots.  For external snapshots,
> see the drive-mirror command's sync parameter.  Perhaps we can add a
> blockdev-dirty-fill command that adds allocated sectors up to a given
> base to the dirty bitmap.
>
>> I think we really need a libvirt API because a local file not only has
>> permissions issues but also is not network transparent.  The continuous
>> replication server runs on another machine, how will it access the dirty
>> bitmap file?
>
> This is still using a "push" model where the dirty data is sent from
> QEMU to the replication server, so the dirty bitmap is not needed on the
> machine that runs the replication server---only on the machine that runs
> the VM (to preserve the bitmap across VM shutdowns including power
> loss).  It has to be stored on shared storage if you plan to run the VM
> from multiple hosts.

Why reinventing the wheel?
Wouldnt it be much better to externalize the snapshotting.

Some/Many filesystems support snapshotting today.
A-whole-lot-of/most-non-consumer-grade block storage devices support
it too.


So a different way to do this would be to use a mechanism to
quiescence the backing file/device and then call out to an external
agent to snapshot the backing file or the backing device.
Other external tools can then be used to compute a dense delta between
this new snapshot and the previous snapshot and transfer to the other
side.

I think all filesystems that support snapshotting on file level
support APIs for a cheap way to cumpute the block deltas.
I would imagine all midrange or better block storage devices that
support LUN snapshotting provide this too.


So why do snapshotting and computation of snapshot deltas in qemu ?
Why not just externalize it with "you want snapshotting and
incremental replication => you must use a system where file/block
supports it".


regards
ronnie sahlberg



[Qemu-devel] [PATCH 0/2] Two trivial patches for SCSI thin provisioning

2012-04-19 Thread Ronnie Sahlberg

List,

Please find two trivial patches for the recent thin provisioning support

1, you can support unmap via WRITE_SAME_10 just as well as WRITE_SAME_16 since 
they are the same/identical as far as how QEMU works ?

2, if implementing LBPWS and LBPWS10   we should also tell the guest that the 
LUN is thin-provisionied.



regards
ronnie sahlberg





[Qemu-devel] [PATCH 2/2] SCSI emulation: should tell the guest that we actually support thin provisioning

2012-04-19 Thread Ronnie Sahlberg
Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index ccd9291..d6c197f 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -532,7 +532,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 outbuf[3] = buflen = 8;
 outbuf[4] = 0;
 outbuf[5] = 0x60; /* write_same 10/16 supported */
-outbuf[6] = 0;
+outbuf[6] = 2;/* thin provisioning */
 outbuf[7] = 0;
 break;
 }
-- 
1.7.3.1




[Qemu-devel] [PATCH 1/2] SCSI emulation: Support unmap via WRITE_SAME_10. It is the same as WRITE_SAME_16 as far as QEMU is concerned

2012-04-19 Thread Ronnie Sahlberg
Signed-off-by: Ronnie Sahlberg 
---
 hw/scsi-disk.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 9949786..ccd9291 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -531,7 +531,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 {
 outbuf[3] = buflen = 8;
 outbuf[4] = 0;
-outbuf[5] = 0x40; /* write same with unmap supported */
+outbuf[5] = 0x60; /* write_same 10/16 supported */
 outbuf[6] = 0;
 outbuf[7] = 0;
 break;
@@ -1456,10 +1456,11 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 goto illegal_lba;
 }
 break;
+case WRITE_SAME_10:
 case WRITE_SAME_16:
 len = r->req.cmd.xfer / s->qdev.blocksize;
 
-DPRINTF("WRITE SAME(16) (sector %" PRId64 ", count %d)\n",
+DPRINTF("WRITE SAME() (sector %" PRId64 ", count %d)\n",
 r->req.cmd.lba, len);
 
 if (r->req.cmd.lba > s->qdev.max_lba) {
-- 
1.7.3.1




Re: [Qemu-devel] [PATCH 00/16] QEMU vhost-scsi support

2012-04-20 Thread ronnie sahlberg
On Fri, Apr 20, 2012 at 5:00 PM, Nicholas A. Bellinger
 wrote:
> On Thu, 2012-04-19 at 19:20 -0500, Anthony Liguori wrote:
>> Hi Nicholas,
>>
>> On 04/19/2012 06:53 PM, Nicholas A. Bellinger wrote:
>> > On Thu, 2012-04-19 at 07:30 -0500, Anthony Liguori wrote:
>> >> However, for storage, be it scsi or direct access, the same problem really
>> >> doesn't exist.  There isn't an obvious benefit to being in the kernel.
>> >>
>> >
>> > In the modern Linux v3.x tree, it was decided there is an obvious
>> > benefit to fabric drivers developers for going ahead and putting proper
>> > SCSI target logic directly into the kernel..  ;)
>>
>> I'm sure there are obvious benefits to having the kernel have SCSI target 
>> logic.
>>   I'm not claiming that there isn't.
>>
>> But there is not an obvious benefit to doing SCSI emulation *for virtual
>> machine* guests in the kernel.
>>
>> Guests are unconditionally hostile.  There is no qualification here.  Public
>> clouds are the obvious example of this.
>>
>> TCM runs in the absolute most privileged context possible.  When you're 
>> dealing
>> with extremely hostile input, it's pretty obvious that you want to run it in 
>> the
>> lowest privileged context as humanly possible.
>>
>
> The argument that a SCSI target for virtual machines is so complex that
> it can't possibly be implemented properly in the kernel is a bunch of
> non-sense.

There are also other benefits to NOT implement scsi emulation in the
kernel, aside from the security aspect of running large amounts of
code inside kernel context vs within restricted userspace context.

I am very happy to be able to add emulation of new opcodes or new
features to tgtd WITHOUT having to recompile my kernel.


regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH 00/16] QEMU vhost-scsi support

2012-04-23 Thread ronnie sahlberg
On Mon, Apr 23, 2012 at 7:33 PM, Stefan Hajnoczi  wrote:
> On Sat, Apr 21, 2012 at 9:51 AM, Nicholas A. Bellinger
>  wrote:
>> On Fri, 2012-04-20 at 12:09 +0100, Stefan Hajnoczi wrote:
>>> On Fri, Apr 20, 2012 at 8:46 AM, Paolo Bonzini  wrote:
>>> > Il 20/04/2012 09:00, Nicholas A. Bellinger ha scritto:
>>
>> 
>>
>>> > - no support for migration (there can be pending SCSI requests at
>>> > migration time, that need to be restarted on the destination)
>>>
>>> Yes and it hasn't been thought through by me at least ;-).  So
>>> migration is indeed a challenge that needs to be worked through.
>>>
>>> > - no support for non-raw images (fix: use NBD on a Unix socket? perhaps
>>> > add an NBD backend to lio)
>>>
>>> For me this is the biggest issue with kernel-level storage for virtual
>>> machines.  We have NBD today but it goes through the network stack
>>> using a limited protocol and probably can't do zero-copy.
>>>
>>> The most promising option I found was dm-userspace
>>> (http://wiki.xensource.com/xenwiki/DmUserspace), which implements a
>>> device-mapper target with an in-kernel MMU-like lookup mechanism that
>>> calls out to userspace when block addresses need to be translated.
>>> It's not anywhere near to upstream and hasn't been pushed for several
>>> years.  On the plus side we could also write a userspace
>>> implementation of this so that QEMU image formats continue to be
>>> portable to other host OSes without duplicating code.
>>>
>>> If tcm_vhost only works with raw images then I don't see it as a
>>> realistic option given the effort it will require to complete and
>>> maintain.
>>>
>>
>> So there has been interest in the past for creating a TCM backend that
>> allows for a userspace passthrough, but so far the code to do this has
>> not materialized yet..
>>
>> There are pieces of logic from STGT that provide an interface for doing
>> something similar that still exist in the upstream kernel.  Allowing
>> different QEMU formats to be processed (in userspace) through a hybrid
>> TCM backend driver that fits into the existing HBA/DEV layout in
>> /sys/kernel/config/target/$HBA/$DEV/ is what would be required to really
>> do this properly..
>
> Could you point to the existing upstream code?
>
> I think the hybrid TCM backend driver means a way for a userspace
> process to execute SCSI Tasks from the target - implementing a subset
> of se_subsystem_api in userspace?
>
> If we solve the problem at the block driver level instead of inside
> the SCSI target then it's also possible for the host to inspect VM
> disk images similar to loopback devices (mount -o loop).  Do you think
> putting the userspace interface into the SCSI target has advantages
> over the block driver or device-mapper level?
>

Hi Stefan,

A little bit off-topic but

When you design the proper place and API to plug virt-scsi into an
external SCSI parser outside of qemu like the target in the kernel ...

It would be very nice if one could also plug virt-scsi into libiscsi
and pass the CDBs straight to the remote iSCSI target too.
Keep some thoughts on virt-scsi + libiscsi integration.


regards
ronnie sahlberg



[Qemu-devel] [PATCH 0/1] Add support for iSCSI thin-provisioning

2012-04-23 Thread Ronnie Sahlberg

List,

Please find a patch that updates the iscsi block device to provide support for
1, LUNs bigger than 2TB by switching to READCAPACITY16
2, Thin-provisioning by implementing bdrv_aio_discard using SCSI UNMAP commands.


The unmapping/discard of blocks was done by booting a guest and using
sg_write_same --16 --unmap  from within the guest.
This was trapped in hw/scsi-disc.c and then appeared on the wire to the iscsi 
target as a proper SCSI UNMAP command.


regards
ronnie sahlberg





[Qemu-devel] [PATCH] ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNs

2012-04-23 Thread Ronnie Sahlberg
Update the configure test for libiscsi support to detect version 1.3 or later.
Version 1.3 of libiscsi provides both READCAPACITY16 as well as UNMAP commands.

Update the iscsi block layer to use READCAPACITY16 to detect the size of the 
LUN instead of READCAPACITY10. This allows support for LUNs larger than 2TB.

Update to implement bdrv_aio_discard() using the UNMAP command.
This allows us to use thin-provisioned LUNs from TGTD and other iSCSI targets 
that support thin-provisioning.

Signed-off-by: Ronnie Sahlberg 
---
 block/iscsi.c |   86 
 configure |5 ++-
 2 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd3ca11..eb49093 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -383,6 +383,65 @@ iscsi_aio_flush(BlockDriverState *bs,
 return &acb->common;
 }
 
+static void
+iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
+ void *command_data, void *opaque)
+{
+IscsiAIOCB *acb = opaque;
+
+if (acb->canceled != 0) {
+qemu_aio_release(acb);
+scsi_free_scsi_task(acb->task);
+acb->task = NULL;
+return;
+}
+
+acb->status = 0;
+if (status < 0) {
+error_report("Failed to unmap data on iSCSI lun. %s",
+ iscsi_get_error(iscsi));
+acb->status = -EIO;
+}
+
+iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb);
+scsi_free_scsi_task(acb->task);
+acb->task = NULL;
+}
+
+static BlockDriverAIOCB *
+iscsi_aio_discard(BlockDriverState *bs,
+  int64_t sector_num, int nb_sectors,
+  BlockDriverCompletionFunc *cb, void *opaque)
+{
+IscsiLun *iscsilun = bs->opaque;
+struct iscsi_context *iscsi = iscsilun->iscsi;
+IscsiAIOCB *acb;
+struct unmap_list list[1];
+
+acb = qemu_aio_get(&iscsi_aio_pool, bs, cb, opaque);
+
+acb->iscsilun = iscsilun;
+acb->canceled   = 0;
+
+list[0].lba = sector_qemu2lun(sector_num, iscsilun);
+list[0].num = nb_sectors * BDRV_SECTOR_SIZE / iscsilun->block_size;
+
+acb->task = iscsi_unmap_task(iscsi, iscsilun->lun,
+ 0, 0, &list[0], 1,
+ iscsi_unmap_cb,
+ acb);
+if (acb->task == NULL) {
+error_report("iSCSI: Failed to send unmap command. %s",
+ iscsi_get_error(iscsi));
+qemu_aio_release(acb);
+return NULL;
+}
+
+iscsi_set_events(iscsilun);
+
+return &acb->common;
+}
+
 static int64_t
 iscsi_getlength(BlockDriverState *bs)
 {
@@ -396,11 +455,11 @@ iscsi_getlength(BlockDriverState *bs)
 }
 
 static void
-iscsi_readcapacity10_cb(struct iscsi_context *iscsi, int status,
+iscsi_readcapacity16_cb(struct iscsi_context *iscsi, int status,
 void *command_data, void *opaque)
 {
 struct IscsiTask *itask = opaque;
-struct scsi_readcapacity10 *rc10;
+struct scsi_readcapacity16 *rc16;
 struct scsi_task *task = command_data;
 
 if (status != 0) {
@@ -412,26 +471,25 @@ iscsi_readcapacity10_cb(struct iscsi_context *iscsi, int 
status,
 return;
 }
 
-rc10 = scsi_datain_unmarshall(task);
-if (rc10 == NULL) {
-error_report("iSCSI: Failed to unmarshall readcapacity10 data.");
+rc16 = scsi_datain_unmarshall(task);
+if (rc16 == NULL) {
+error_report("iSCSI: Failed to unmarshall readcapacity16 data.");
 itask->status   = 1;
 itask->complete = 1;
 scsi_free_scsi_task(task);
 return;
 }
 
-itask->iscsilun->block_size = rc10->block_size;
-itask->iscsilun->num_blocks = rc10->lba;
-itask->bs->total_sectors = (uint64_t)rc10->lba *
-   rc10->block_size / BDRV_SECTOR_SIZE ;
+itask->iscsilun->block_size = rc16->block_length;
+itask->iscsilun->num_blocks = rc16->returned_lba;
+itask->bs->total_sectors= rc16->returned_lba *
+   rc16->block_length / BDRV_SECTOR_SIZE ;
 
 itask->status   = 0;
 itask->complete = 1;
 scsi_free_scsi_task(task);
 }
 
-
 static void
 iscsi_connect_cb(struct iscsi_context *iscsi, int status, void *command_data,
  void *opaque)
@@ -445,10 +503,10 @@ iscsi_connect_cb(struct iscsi_context *iscsi, int status, 
void *command_data,
 return;
 }
 
-task = iscsi_readcapacity10_task(iscsi, itask->iscsilun->lun, 0, 0,
-   iscsi_readcapacity10_cb, opaque);
+task = iscsi_readcapacity16_task(iscsi, itask->iscsilun->lun,
+   iscsi_readcapacity16_cb, opaque);
 if (task == NULL) {
-error_report("iSCSI: failed to send readcapacity command.&qu

Re: [Qemu-devel] [PATCH] ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNs

2012-04-24 Thread ronnie sahlberg
On Tue, Apr 24, 2012 at 4:46 PM, Paolo Bonzini  wrote:
> Il 24/04/2012 08:29, Ronnie Sahlberg ha scritto:
>> Update the configure test for libiscsi support to detect version 1.3 or 
>> later.
>> Version 1.3 of libiscsi provides both READCAPACITY16 as well as UNMAP 
>> commands.
>>
>> Update the iscsi block layer to use READCAPACITY16 to detect the size of the 
>> LUN instead of READCAPACITY10. This allows support for LUNs larger than 2TB.
>>
>> Update to implement bdrv_aio_discard() using the UNMAP command.
>> This allows us to use thin-provisioned LUNs from TGTD and other iSCSI 
>> targets that support thin-provisioning.
>
> Looks good.  Kevin, do you want me to take libiscsi patches via the SCSI
> tree?
>
> As an aside, I am not really sure of the utility of adding these utility
> functions directly in libiscsi, rather than making it a pure transport
> library.  block/iscsi.c is going to grow as you add more functionality
> (e.g. WRITE SAME commands), and libiscsi will have to be updated each
> time in lockstep.
>
> I can see the value of basic read/write/flush and readcap10/16, but with
> unmap it's starting to be a bit more specific.  Are there other clients
> of libiscsi that use these functions?  Should they be placed into
> block/iscsi.c or a new block/iscsi-cdb.c instead?

I see your point.
I like to add scsi commands as I use them to libiscsi, since then I
dont have to
re-write the marshalling/unmarshalling code everytime in my small test
and utility programs.
For example when i want to write some one-off small tools to test something.


But, yes. There is no real need to use them directly from qemu.


So ignore this patch for now. I will redo UNMAP in the patch to
instead use the generic scsi function inside libiscsi.

That will serve the purpose to verify that the public API in libiscsi
is sufficient for accessing the generic transport and
secondly that will show a useful example on how to send CDB+data to
and to receive data back from the generic function.

This generic API would be what a future virt-scsi->libiscsi
integration would use anyway.



regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH] ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNs

2012-04-24 Thread ronnie sahlberg
On Tue, Apr 24, 2012 at 6:05 PM, Paolo Bonzini  wrote:
> Il 24/04/2012 10:02, ronnie sahlberg ha scritto:
>> So ignore this patch for now. I will redo UNMAP in the patch to
>> instead use the generic scsi function inside libiscsi.
>>
>> That will serve the purpose to verify that the public API in libiscsi
>> is sufficient for accessing the generic transport and
>> secondly that will show a useful example on how to send CDB+data to
>> and to receive data back from the generic function.
>>
>> This generic API would be what a future virt-scsi->libiscsi
>> integration would use anyway.
>
> I will be on holiday from tomorrow to May 1, and I won't be able to send
> a pull request today.  As I want to minimize the time I spend looking at
> qemu-devel, :) I'll take this patch as is, since the new libiscsi
> version is needed anyway for READ CAPACITY (16).
>
> You could write a follow-up patch to teach iscsi.c about WRITE SAME(10)
> and WRITE SAME(16) with the unmap bit set, and use generic CDB+data
> functions for those.
>

Ok. That works for me. Thanks.

regards
ronnie sahlberg



Re: [Qemu-devel] [PATCH 00/16] QEMU vhost-scsi support

2012-04-24 Thread ronnie sahlberg
On Tue, Apr 24, 2012 at 7:13 PM, Stefan Hajnoczi  wrote:
> On Tue, Apr 24, 2012 at 8:05 AM, Paolo Bonzini  wrote:
>> Il 24/04/2012 06:21, ronnie sahlberg ha scritto:
>>> Hi Stefan,
>>>
>>> A little bit off-topic but
>>>
>>> When you design the proper place and API to plug virt-scsi into an
>>> external SCSI parser outside of qemu like the target in the kernel ...
>>>
>>> It would be very nice if one could also plug virt-scsi into libiscsi
>>> and pass the CDBs straight to the remote iSCSI target too.
>>> Keep some thoughts on virt-scsi + libiscsi integration.
>>
>> Yes, that makes a lot of sense.  It's a bit harder than scsi-generic but
>> we do want to get there.
>
> Yep.  I think previously there was discussion about a libiscsi
> SCSIDevice so that guest SCSI commands can be sent to libiscsi LUNs
> without going through the QEMU block layer.  (Another way to pass
> arbitrary SCSI commands to libiscsi is by hooking up .bdrv_aio_ioctl()
> with SG_IO scsi-generic compatible code in block/iscsi.c.)

bdrv_aio_ioctl() and SG_IO would mean #ifdef __linux__

So maybe it would be better to instead create a new hw/scsi-scsi.c
that calls straight into block/iscsi.c ?
That would be a lot more work than emulating SG_IO but would work on
all platforms.


Comments? How important is !linux support ?

regards
ronnie sahlberg



Re: [Qemu-devel] Wireshark SCSI dissectors for new transports

2012-02-28 Thread ronnie sahlberg
Hi Stefan.



Wireshark supports a lot more than just SCSI over iSCSI

It dissects SCSI over USB, FCOE, raw-FC, FCIP, iFCP (I never got
access to traces for mFCP :-( )   and also over NDMP.
I never got access to HyperSCSI traces either, so that is missing too.

Since I never got HyperSCSI or mFCP (very short-lived attempt from HBA
vendors)  those two are the only ones today I think where we miss
decode.
virt-scsi from QEMU sounds interesting!





SCSI has a very well defined API in wireshark  so assind a new
transport should be trivial.

I have done so several times.


First you need a DLT value from the tcpdump folks to wrap your packets in.
Once you have that  it should be semi-trivial to hook the
SCSI-dissector into your transport/DLT


Depending on what the framing looks like, you need at least a wrapper
around the SCSI payload that can contain
an I_T identifier, then  a LUN field, and then scoped per LUN you need
a task-tag or similar.
Wireshark would need I_T to be able to track initiators and targets
separately and form a "conversation" between an arbitrary pair.
A LUN identifier to track different luns on the same I_T separately.
Finally it also needs a task-tag so that on a specific ILT nexus it
will be able to match a SCSI CDB with DATA-IN/OUT blobs and a SCSI
response/sense

to make it map well  you might need to wrap thing inside a

struct scsi_wrapper {
initiator identifier
target identifier
lun
task-tag
opcode (cdb, datain, dataout, response/sense)

scsi *
}



if your transport also supports multiple datain/out blobs for a single
task,  in order to reassemble the data we would also need a
offset/length for each datain/out blob.


regards
ronnie sahlberg


On Tue, Feb 28, 2012 at 8:59 PM, Stefan Hajnoczi  wrote:
> Wireshark today supports SCSI dissectors for iSCSI.
>
> In the QEMU system emulator we have an emulated SCSI target which
> handles devices for SCSI Parallel Interface (SPI), USB Mass Storage
> Device, and now supports the new virtio-scsi transport (for efficient
> virtual machine SCSI I/O).
>
> Cong and I would like to add pcap support to the emulated SCSI target
> in QEMU, making it easy to use Wireshark for SCSI debugging and
> analysis.  I'm looking for advice on getting started because we are
> not familiar with Wireshark/dissector internals.
>
> I believe the SCSI dissector does not support raw CDB dissection - it
> requires a SCSI transport dissector (currently iSCSI).  A few options
> come to mind:
>
> 1. Change the SCSI dissector to support top-level dissecting of raw
> SCSI CDBs without transport information (initiator, target, and other
> metadata).
>
> 2. Add virtio-scsi and perhaps SPI SCSI transport dissectors.
>
> 3. A simpler approach I'm missing? :)
>
> Suggestions appreciated.
>
> Stefan



[Qemu-devel] [PATCH] Rename libiscsi to libiscsiclient

2012-03-02 Thread Ronnie Sahlberg

Please find a patch that changes the name of the iscsi library from libiscsi to 
libiscsiclient

While libiscsi is a very nice name that works on all platforms, including 
win32, it clashes with a linux library. 



regards
ronnie sahlberg




[Qemu-devel] [PATCH] Change library from libiscsi to libiscsiclient to match library rename in libiscsi

2012-03-02 Thread Ronnie Sahlberg

Signed-off-by: Ronnie Sahlberg 
---
 configure |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index fb0e18e..294c0c1 100755
--- a/configure
+++ b/configure
@@ -2503,9 +2503,9 @@ if test "$libiscsi" != "no" ; then
 #include 
 int main(void) { iscsi_create_context(""); return 0; }
 EOF
-  if compile_prog "-Werror" "-liscsi" ; then
+  if compile_prog "-Werror" "-liscsiclient" ; then
 libiscsi="yes"
-LIBS="$LIBS -liscsi"
+LIBS="$LIBS -liscsiclient"
   else
 if test "$libiscsi" = "yes" ; then
   feature_not_found "libiscsi"
-- 
1.7.3.1




Re: [Qemu-devel] QEMU desired libiscsi.so clashes with libiscsi.so from iscsi-initiator-utils

2012-03-02 Thread ronnie sahlberg
Yes,

Very unfortuante since libiscsi is such a nice name for a
multiplatform library what even works on win32 :-(

I have so renamed it to libiscsiclient  and sent a patch to qemu to
this list to use -liscsiclient instead of -liscsi



tarballs  can be found at
https://github.com/sahlberg/libiscsi/downloads


That resolves all issues you are concerned about ?


regards
ronnie sahlberg

On Tue, Feb 14, 2012 at 12:24 AM, Daniel P. Berrange
 wrote:
> I was investigating how to build latest QEMU with the iSCSI block driver
> enabled. I saw that configure wanted a libiscsi.so, so I installed that
> library from Fedora RPMs via the iscsi-initiator-utils package, but it
> still wouldn't build.
>
> After further investigation, I find that QEMU in fact wants a completely
> different, unlreated libiscsi.so library:
>
>  https://github.com/sahlberg/libiscsi
>
> Obviously we have a problem here because we can't have two different
> libraries called libiscsi.so installed at the same time.
>
> Since iscsi-initiator-utils is a standard Linux distro package whose usage
> of libiscsi.so predates this github project, it seems that to resolve this
> it will be neccessary to rename the latter. eg perhaps libiscsi-client.so ?
>
> The followup question is where to find actual libiscsi releases to package
> up for OS distros ? It is not very desirable to just package GIT snapshots.
>
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



  1   2   3   >