Re: [PATCH] Send quota messages via netlink
Jan Kara wrote:

+	}
+	ret = nla_put_u32(skb, QUOTA_NL_A_QTYPE, dquot->dq_type);
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u64(skb, QUOTA_NL_A_EXCESS_ID, dquot->dq_id);
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u32(skb, QUOTA_NL_A_WARNING, warntype);
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u32(skb, QUOTA_NL_A_DEV_MAJOR,
+		MAJOR(dquot->dq_sb->s_dev));
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u32(skb, QUOTA_NL_A_DEV_MINOR,
+		MINOR(dquot->dq_sb->s_dev));
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u64(skb, QUOTA_NL_A_CAUSED_ID, current->user->uid);
+	if (ret)
+		goto attr_err_out;
+	genlmsg_end(skb, msg_head);
+

>> Have you looked at ensuring that the data structure works across 32 bit
>> and 64 bit systems (in terms of binary compatibility)? That's usually
>> a nice to have feature.

> Generic netlink should take care of this - arguments are typed so it
> knows how many bits numbers have. So this should be no issue. Are there any
> other problems that you have in mind?
>

Yes, but apart from that, if I remember Jamal Hadi's initial comments on
taskstats, he recommended that we align everything to 64 bit so that the
data is well aligned for 64 bit systems.

You could also consider creating a data structure, document its members,
align them and use that to send out the data.

+	ret = genlmsg_multicast(skb, 0, quota_genl_family.id, GFP_NOFS);
+	if (ret < 0 && ret != -ESRCH)
+		printk(KERN_ERR
+			"VFS: Failed to send notification message: %d\n", ret);
+	return;
+attr_err_out:
+	printk(KERN_ERR "VFS: Failed to compose quota message: %d\n", ret);
+err_out:
+	kfree_skb(skb);
+}
+#endif

>>> This is it. Normally netlink payloads are represented as a struct. How
>>> come this one is built-by-hand?
>>>
>>> It doesn't appear to be versioned. Should it be?
>>>
>> Yes, versioning is always nice and genetlink supports it.
>> It would be nice for you to use the versioning feature.

>> The memory controller or VM would also be interested in notifications
>> of OOM. At OLS this year interest was shown in getting OOM notifications
>> and allow the user space a chance to handle the notification and take
>> action (especially for containers). We already have containerstats for
>> containers (which I was planning to reuse), but I was told that we would
>> be interested in user space OOM notifications in general.

> Generic netlink can be used to pass this information (although in OOM
> situation, it may be a bit hairy to get the network stack working...). But
> I guess it's not related to my patch.

We could have a pre-allocated buffer stored at startup and use that for
OOM notification. In the case of container OOM, we are likely to have
free global memory.

Working towards an infrastructure so that anybody can build on top of it
and sending notifications on interesting events becomes easier would be
nice. We can reuse code that way and add fewer bugs :-)

--
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
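For illustration only: the suggestion above of a documented structure with 64-bit aligned members might look roughly like the sketch below. The struct name `quota_nl_msg` and its field names are hypothetical and are not part of the patch; they merely mirror the six attributes the patch emits. The 64-bit members are placed last so that they land on 8-byte offsets without any compiler-inserted padding, which should keep the layout identical on 32 bit and 64 bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fixed-layout payload (NOT from the patch): the same six
 * values the patch sends as separate attributes, ordered so that the
 * 64-bit members sit on 8-byte offsets with no implicit padding.
 */
struct quota_nl_msg {
	uint32_t qtype;      /* quota type, cf. QUOTA_NL_A_QTYPE */
	uint32_t warning;    /* warning type, cf. QUOTA_NL_A_WARNING */
	uint32_t dev_major;  /* device major, cf. QUOTA_NL_A_DEV_MAJOR */
	uint32_t dev_minor;  /* device minor, cf. QUOTA_NL_A_DEV_MINOR */
	uint64_t excess_id;  /* id over quota, cf. QUOTA_NL_A_EXCESS_ID */
	uint64_t caused_id;  /* id that caused it, cf. QUOTA_NL_A_CAUSED_ID */
};

/* Compile-time checks that the layout is what the comments claim. */
static_assert(sizeof(struct quota_nl_msg) == 32, "unexpected size");
static_assert(offsetof(struct quota_nl_msg, excess_id) == 16,
	      "excess_id not on an 8-byte offset");
static_assert(offsetof(struct quota_nl_msg, caused_id) == 24,
	      "caused_id not on an 8-byte offset");
```

Whether a fixed struct is preferable to typed attributes is exactly what the thread is debating; the sketch only shows what "aligned and documented" could mean in practice.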
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, 31 Aug 2007, Jens Axboe wrote:

> > So if we try to push a too large buffer down with submit_bh() we get a
> > failure.
>
> Only partly, you may be violating a number of other restrictions (size
> is many things, not just length of the data).

Could you be more specific?
Re: [PATCH] ACPI: EC: Check if boot_ec was really found in DSDT
Andrew Morton wrote:

I use sylpheed. thunderbird can be used, but one needs to follow the
steps in http://mbligh.org/linuxdocs/Email/Clients/Thunderbird to get it
out of i-know-better mode.

Can we get something like this into the kernel tree, please?

Documentation/email-clients.txt would go a long way towards making our
collective lives more simple.

	Jeff
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, Aug 31 2007, Christoph Lameter wrote:
> On Fri, 31 Aug 2007, Jens Axboe wrote:
>
> > > So if we try to push a too large buffer down with submit_bh() we get a
> > > failure.
> >
> > Only partly, you may be violating a number of other restrictions (size
> > is many things, not just length of the data).
>
> Could you be more specific?

Size of a single segment, for instance. Or if the bio crosses a dma
boundary. If your block is 64kb and the maximum segment size is 32kb,
then you would need to clone the bio and split it into two.

Things like that. This isn't a problem with single page requests, as we
based the lower possible boundaries on that.

--
Jens Axboe
Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
On Thu, 2007-08-30 at 23:43 -0500, Eric Sandeen wrote:
> The xfs filesystem can exceed the current lockdep
> MAX_LOCK_DEPTH, because when deleting an entire cluster of inodes,
> they all get locked in xfs_ifree_cluster(). The normal cluster
> size is 8192 bytes, and with the default (and minimum) inode size
> of 256 bytes, that's up to 32 inodes that get locked. Throw in a
> few other locks along the way, and 40 seems enough to get me through
> all the tests in the xfsqa suite on 4k blocks. (block sizes
> above 8K will still exceed this though, I think)

As 40 will still not be enough for people with larger block sizes, this
does not seem like a solid solution. Could XFS possibly batch in
smaller (fixed sized) chunks, or does that have significant down sides?

> Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>
>
> Index: linux-2.6.23-rc3/include/linux/sched.h
> ===================================================================
> --- linux-2.6.23-rc3.orig/include/linux/sched.h
> +++ linux-2.6.23-rc3/include/linux/sched.h
> @@ -1125,7 +1125,7 @@ struct task_struct {
>  	int softirq_context;
>  #endif
>  #ifdef CONFIG_LOCKDEP
> -# define MAX_LOCK_DEPTH 30UL
> +# define MAX_LOCK_DEPTH 40UL
>  	u64 curr_chain_key;
>  	int lockdep_depth;
>  	struct held_lock held_locks[MAX_LOCK_DEPTH];
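The arithmetic behind the objection can be written down as a small user-space sketch. The 8192-byte cluster and 256-byte inode come from the patch description; the macro and function names are made up for illustration, and the count of "other locks" is just a parameter, since the thread only says "a few".

```c
#include <assert.h>

/* Values quoted in the patch description; the macro names are hypothetical. */
#define SKETCH_CLUSTER_BYTES 8192u	/* "normal cluster size" */
#define SKETCH_INODE_BYTES    256u	/* default and minimum inode size */

/* Number of inode locks held at once when a whole cluster is freed. */
static unsigned int inodes_per_cluster(unsigned int cluster_bytes,
				       unsigned int inode_bytes)
{
	assert(inode_bytes != 0);
	return cluster_bytes / inode_bytes;
}

/* Nonzero if depth_limit covers the cluster's inodes plus other_locks. */
static int depth_suffices(unsigned int depth_limit,
			  unsigned int cluster_bytes,
			  unsigned int inode_bytes,
			  unsigned int other_locks)
{
	return inodes_per_cluster(cluster_bytes, inode_bytes) + other_locks
		<= depth_limit;
}
```

With the quoted numbers, `inodes_per_cluster(SKETCH_CLUSTER_BYTES, SKETCH_INODE_BYTES)` is 32, which already exceeds the old limit of 30. If a larger block size implies a larger cluster (say 16384 bytes, an assumption for this sketch), 64 inodes would exceed 40 as well, which is the concern raised in the reply.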
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, 31 Aug 2007, Jens Axboe wrote:

> > Could you be more specific?
>
> Size of a single segment, for instance. Or if the bio crosses a dma
> boundary. If your block is 64kb and the maximum segment size is 32kb,
> then you would need to clone the bio and split it into two.

A DMA boundary cannot be crossed AFAIK. The compound pages are aligned to
the power of two boundaries and the page allocator will not create pages
that cross the zone boundaries.

It looks like the code will correctly signal a failure if you try to write
a 64k block on a device with a maximum segment size of 32k. Isn't this
okay? One would not want to use a larger block size than supported by the
underlying hardware?

> Things like that. This isn't a problem with single page requests, as we
> based the lower possible boundaries on that.

submit_bh() is used to submit a single buffer and I think that was our
main concern here.
Re: [PATCH] Add all thread stats for TASKSTATS_CMD_ATTR_TGID
Jonathan Lim wrote:
> On Sat Aug 25 21:58:44 2007, [EMAIL PROTECTED] wrote:
>>> Also, I don't understand why the code to update btime:
>>>
>>>	/* calculate task elapsed time in timespec */
>>>	do_posix_clock_monotonic_gettime(&uptime);
>>>	ts = timespec_sub(uptime, tsk->start_time);
>>>	...
>>>	stats->ac_btime = get_seconds() - ts.tv_sec;
>>>
>>> does not simply use tsk->start_time or tsk->real_start_time without
>>> comparing it to the current time.
>> From what I understand, task->start_time and task->real_start_time
>> are taken from the realtime clock. The accounting in CSA seems
>> to be very similar to the accounting done in do_acct_process()
>> (kernel/acct.c).
>
> In CSA 3.0 ...
>
>	csa_acct_eop(int exitcode, struct task_struct *p)
>
>	csa->ac_btime = boottime +
>		((p->start_time.tv_nsec < NSEC_PER_SEC/2) ?
>		 p->start_time.tv_sec :
>		 p->start_time.tv_sec +1);
>
> where
>
>	do_posix_clock_monotonic_gettime(&uptime);
>	boottime = xtime.tv_sec - uptime.tv_sec;
>
> In an upcoming version of CSA ...
>
>	csa_acct_eop(struct taskstats *p)
>
>	csa->ac_btime = p->ac_btime;
>
> where
>
>	do_posix_clock_monotonic_gettime(&uptime);
>	ts = uptime - tsk->start_time;
>	p->ac_btime = get_seconds() - ts.tv_sec;
>	            = xtime.tv_sec - (uptime - tsk->start_time);
>	            = (xtime.tv_sec - uptime) + tsk->start_time;
>
> So they're basically equivalent.

Excellent, so can Guillaume change ac_btime to be just tsk->start_time?

--
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
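The equivalence Jonathan derives can be checked with plain integers. The sketch below is user-space C, not kernel code: the function names are invented, all values are whole seconds, and the half-second rounding that CSA 3.0 applies to tv_nsec is deliberately left out. It assumes, as the derivation above does, that the task start time and the uptime are read from the same monotonic clock.

```c
#include <assert.h>

/*
 * wall   - current wall-clock seconds (xtime.tv_sec / get_seconds())
 * uptime - current monotonic seconds
 * start  - monotonic seconds at which the task started
 */

/* CSA 3.0 style: boot time plus the task's start time. */
static long btime_from_boottime(long wall, long uptime, long start)
{
	long boottime = wall - uptime;

	return boottime + start;
}

/* taskstats style: wall-clock "now" minus the task's elapsed time. */
static long btime_from_elapsed(long wall, long uptime, long start)
{
	long elapsed = uptime - start;

	assert(elapsed >= 0);	/* a task cannot start in the future */
	return wall - elapsed;
}
```

Both functions reduce to `wall - uptime + start`, so they agree for any inputs. Note that the result is the boot time plus the start time rather than the bare start time, so a change along the lines asked about above would presumably still need the boot-time offset.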
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, Aug 31 2007, Christoph Lameter wrote:
> On Fri, 31 Aug 2007, Jens Axboe wrote:
>
> > > Could you be more specific?
> >
> > Size of a single segment, for instance. Or if the bio crosses a dma
> > boundary. If your block is 64kb and the maximum segment size is 32kb,
> > then you would need to clone the bio and split it into two.
>
> A DMA boundary cannot be crossed AFAIK. The compound pages are aligned to
> the power of two boundaries and the page allocator will not create pages
> that cross the zone boundaries.

With a 64k page and a dma boundary of 0x7fff, that's two segments.

> It looks like the code will correctly signal a failure if you try to write
> a 64k block on a device with a maximum segment size of 32k. Isnt this
> okay? One would not want to use a larger block size than supported by the
> underlying hardware?

That's just the size in sectors limitation again. And that also needs to
be handled, the fact that it currently errors out is reassuring but
definitely not a long term solution. You don't want to knowingly set up
such a system where the fs block size is larger than what the hardware
would want, but it should work. You could be moving hardware around, for
recovery or otherwise.

> > Things like that. This isn't a problem with single page requests, as we
> > based the lower possible boundaries on that.
>
> submit_bh() is used to submit a single buffer and I think that was our
> main concern here.

And how large can that be?

--
Jens Axboe
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, 31 Aug 2007, Jens Axboe wrote:

> > A DMA boundary cannot be crossed AFAIK. The compound pages are aligned to
> > the power of two boundaries and the page allocator will not create pages
> > that cross the zone boundaries.
>
> With a 64k page and a dma boundary of 0x7fff, that's two segments.

Ok so DMA memory restrictions not conforming to the DMA zones? The
example is a bit weird. DMA only to the first 32k of memory? If the limit
would be higher like 16MB then we would not have an issue. Is there really
a device that can only do I/O to the first 32k of memory?

How do we split that up today? We could add processing to submit_bio
to check for the boundary and create two bios.

> > submit_bh() is used to submit a single buffer and I think that was our
> > main concern here.
>
> And how large can that be?

As large as mkxxxfs allowed it to be. For XFS and extX with the
current patchset 32k is the limit (64k with the fixes to ext2) but a
new filesystem could theoretically use a larger blocksize.
Re: [PATCH] fix maxcpus=1 oops in show_stat()
On Fri, Aug 31, 2007 at 04:26:50AM +0100, Hugh Dickins wrote:
> --- 2.6.23-rc4/init/main.c
> +++ linux/init/main.c
> @@ -397,10 +397,6 @@ static void __init smp_init(void)
>  {
>  	unsigned int cpu;
>
> -#ifndef CONFIG_HOTPLUG_CPU
> -	cpu_possible_map = cpu_present_map;
> -#endif
> -
>  	/* FIXME: This should be done in userspace --RR */
>  	for_each_present_cpu(cpu) {
>  		if (num_online_cpus() >= max_cpus)
> @@ -545,10 +541,6 @@ asmlinkage void __init start_kernel(void
>  	setup_arch(&command_line);
>  	setup_command_line(command_line);
>  	unwind_setup();
> -#ifndef CONFIG_HOTPLUG_CPU
> -	if (max_cpus < 2)
> -		cpu_possible_map = cpu_online_map;
> -#endif
>  	setup_per_cpu_areas();
>  	smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
>

Works here, I'll try on another box soon.
Re: recent nfs change causes autofs regression
On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
> ...
> > Why aren't we doing that for any other filesystem than NFS?
>
> How hard is it to acknowledge the following little word:
>
>	"regression"
>
> It's simple. You broke things. You may want to fix them, but you need to
> fix them in a way that does not break user space.

Trond has a point Linus.

What he "broke" is, for example, a ro mount being mounted as rw.

That *could* be a very serious security (etc.etc.) problem which he just
fixed. Anything depending on read-only not being enforced will cease to
work, of course, and that is what a few people complain about(!).

If ext3 in some rare case (which would still mean it hit a few thousand users)
failed to remember that a file had been marked read-only and allowed writes to
it, wouldn't we want to fix that too? It would cause regressions, but we'd
fix it, right?

mount passes back the error code on a failed mount. autofs passes that
error along too (when people configure syslog correctly).

In short; when these serious mistakes are made and caught, the admin sees
an error in his logs.

This is not wrong. This is good.

--
 / jakob
Re: [PATCH] docs: ramdisk/initrd/initramfs corrections
On Thu, 30 Aug 2007, Rob Landley wrote:

> On Thursday 30 August 2007 1:28:17 pm Robert P. J. Day wrote:
> > On Thu, 30 Aug 2007, Randy Dunlap wrote:
> >
> > ...
> >
> > > The old "ramdisk=" has been changed to
> > > "ramdisk_size=" to make it clearer. The original
> > > "ramdisk=" has been kept around for compatibility reasons,
> > > but it may be removed in the future.
> >
> > ...
> >
> > i just the other day submitted a patch to remove that backward
> > compatibility, and the m68k portion of it has already been acked
> > by geert uytterhoeven.
>
> Could you mention it in feature-removal-schedule.txt? (People check
> that for warning of upcoming changes that impact existing code.
> They may not notice something elsewhere after they've got it
> working...)

you know, if it makes everyone happier, why don't i just leave that as
it is and move on? apparently, i have a different understanding of
the word "deprecated" from a number of others here, and it's really
not worth arguing about anymore.

rday
--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, Aug 31 2007, Christoph Lameter wrote:
> On Fri, 31 Aug 2007, Jens Axboe wrote:
>
> > > A DMA boundary cannot be crossed AFAIK. The compound pages are aligned to
> > > the power of two boundaries and the page allocator will not create pages
> > > that cross the zone boundaries.
> >
> > With a 64k page and a dma boundary of 0x7fff, that's two segments.
>
> Ok so DMA memory restrictions not conforming to the DMA zones? The
> example is a bit weird. DMA only to the first 32k of memory? If the limit
> would be higher like 16MB then we would not have an issue. Is there really
> a device that can only do I/O to the first 32k of memory?

They have nothing to do with each other, you are mixing things up. It
has nothing to do with the device being able to dma into that memory or
not, we have fine existing infrastructure to handle that. But different
hardware have different characteristics on what a single segment is. You
can say "a single segment cannot cross a 32kb boundary". So from the
example above, your single 64k page may need to be split into two
segments. Or it could have a maximum segment size of 32k, in which case
it would have to be split as well.

Do you see what I mean now?

> How do we split that up today? We could add processing to submit_bio
> to check for the boundary and create two bios.

But we do not split them up today - see what I wrote! Today we impose
the restriction that a device must be able to handle a single "normal"
page, and if it can't do that, it has to split it up itself.

But yes, you would have to create some out-of-line function to use
bio_split() until you have chopped things down enough. It's not a good
thing for performance naturally, but if we consider this a "just make it
work" fallback, I don't think it's too bad. You want to make a note of
that it is happening though, so people realize that it is happening.

> > > submit_bh() is used to submit a single buffer and I think that was
> > > our main concern here.
> >
> > And how large can that be?
>
> As large as mkxxxfs allowed it to be. For XFS and extX with the
> current patchset 32k is the limit (64k with the fixes to ext2) but a
> new filesystem could theoretically use a larger blocksize.

OK, since it goes direct to bio anyway, it can be handled there.

--
Jens Axboe
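To make the two segment limits discussed above concrete, here is a rough user-space sketch of how many segments a physically contiguous buffer would need. It is not block-layer code: `count_segments` is a made-up name, and the function only models the two rules Jens describes (a maximum segment size, and a boundary that a single segment may not cross).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the segments needed for a physically contiguous buffer.
 *   max_seg_size  - largest segment the device accepts, in bytes
 *   boundary_mask - a segment may not cross a multiple of
 *                   (boundary_mask + 1), e.g. 0x7fff for 32kb;
 *                   expected to be of the form 2^n - 1
 */
static unsigned int count_segments(uint64_t addr, uint64_t len,
				   uint64_t max_seg_size,
				   uint64_t boundary_mask)
{
	unsigned int nsegs = 0;

	assert(max_seg_size > 0);
	while (len > 0) {
		/* Bytes that still fit after addr before the boundary. */
		uint64_t room = boundary_mask - (addr & boundary_mask);
		uint64_t chunk = len < max_seg_size ? len : max_seg_size;

		/* Clip the chunk so it ends at the boundary at the latest. */
		if (chunk - 1 > room)
			chunk = room + 1;

		addr += chunk;
		len -= chunk;
		nsegs++;
	}
	return nsegs;
}
```

For the example in the thread, a 64k buffer starting at a 64k-aligned address with a boundary mask of 0x7fff gives two segments, and so does a 64k buffer with a 32k maximum segment size and no tighter boundary. A single 4k page fits in one segment under either limit, which matches the remark that single page requests are not a problem.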
Re: [PATCH 2.6.23-rc4][reRESEND] ata_piix: IDE mode SATA patch for Intel Tolapai
In the future, consider following the information convention we have for
patch revisions:

	[PATCH 2.6.23-rc4] ata_piix: do some stuff
	[PATCH 2.6.23-rc4 v2] ata_piix: do some stuff
	[PATCH 2.6.23-rc4 v3] ata_piix: do some stuff
	[PATCH 2.6.23-rc4 v4] ata_piix: do some stuff
	etc.

Regards,

	Jeff
Re: [PATCH 2.6.23-rc4][reRESEND] ahci: RAID mode SATA patch for Intel Tolapai
Jason Gaston wrote:

Resend trying to remove 8-bit characters in the email.
This patch adds the Intel Tolapai RAID controller DID's for SATA support.

Signed-off-by: Jason Gaston <[EMAIL PROTECTED]>

--- linux-2.6.23-rc4/drivers/ata/ahci.c.orig	2007-08-27 18:32:35.0 -0700
+++ linux-2.6.23-rc4/drivers/ata/ahci.c	2007-08-28 16:58:11.0 -0700
@@ -411,6 +411,8 @@
 	{ PCI_VDEVICE(INTEL, 0x292f), board_ahci_pi }, /* ICH9M */
 	{ PCI_VDEVICE(INTEL, 0x294d), board_ahci_pi }, /* ICH9 */
 	{ PCI_VDEVICE(INTEL, 0x294e), board_ahci_pi }, /* ICH9M */
+	{ PCI_VDEVICE(INTEL, 0x502a), board_ahci }, /* Tolapai */
+	{ PCI_VDEVICE(INTEL, 0x502b), board_ahci }, /* Tolapai */

Why did you not use board_ahci_pi? Is the AHCI ports-implemented register
unreliable on this platform?
Re: [1/4] 2.6.23-rc4: known regressions
Here is the fix for alpha:

From [EMAIL PROTECTED] Thu Aug 30 14:13:57 2007
Subject: SLUB: Force inlining for functions in slub_def.h

Some compilers (especially older gcc releases) may skip inlining sometimes
which will lead to link failures. Force the inlining of key functions in
slub_def.h to avoid these issues.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
Acked-by: Jan Dittmer <[EMAIL PROTECTED]>

---
 include/linux/slub_def.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2007-08-30 14:12:25.0 -0700
+++ linux-2.6/include/linux/slub_def.h	2007-08-30 14:13:07.0 -0700
@@ -78,7 +78,7 @@ extern struct kmem_cache kmalloc_caches[
  * Sorry that the following has to be that ugly but some versions of GCC
  * have trouble with constant propagation and loops.
  */
-static inline int kmalloc_index(size_t size)
+static __always_inline int kmalloc_index(size_t size)
 {
 	if (!size)
 		return 0;
@@ -133,7 +133,7 @@ static inline int kmalloc_index(size_t s
  * This ought to end up with a global pointer to the right cache
  * in kmalloc_caches.
  */
-static inline struct kmem_cache *kmalloc_slab(size_t size)
+static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
 {
 	int index = kmalloc_index(size);
 
@@ -166,7 +166,7 @@ static inline struct kmem_cache *kmalloc
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
 void *__kmalloc(size_t size, gfp_t flags);
 
-static inline void *kmalloc(size_t size, gfp_t flags)
+static __always_inline void *kmalloc(size_t size, gfp_t flags)
 {
 	if (__builtin_constant_p(size) && !(flags & SLUB_DMA)) {
 		struct kmem_cache *s = kmalloc_slab(size);
@@ -183,7 +183,7 @@ static inline void *kmalloc(size_t size,
 void *__kmalloc_node(size_t size, gfp_t flags, int node);
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
 
-static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
+static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	if (__builtin_constant_p(size) && !(flags & SLUB_DMA)) {
 		struct kmem_cache *s = kmalloc_slab(size);
Re: [PATCH 2.6.23-rc4][reRESEND] ata_piix: IDE mode SATA patch for Intel Tolapai
Jason Gaston wrote:

Resend trying to remove 8-bit characters in the email.
This patch adds the Intel Tolapai IDE mode SATA controller DID's.

Signed-off-by: Jason Gaston <[EMAIL PROTECTED]>

applied
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, 31 Aug 2007, Jens Axboe wrote:

> They have nothing to do with each other, you are mixing things up. It
> has nothing to do with the device being able to dma into that memory or
> not, we have fine existing infrastructure to handle that. But different
> hardware have different characteristics on what a single segment is. You
> can say "a single segment cannot cross a 32kb boundary". So from the
> example above, your single 64k page may need to be split into two
> segments. Or it could have a maximum segment size of 32k, in which case
> it would have to be split as well.
>
> Do you see what I mean now?

Ok. So another solution may be to limit the blocksizes that can be used
with a device?

> > How do we split that up today? We could add processing to submit_bio
> > to check for the boundary and create two bios.
>
> But we do not split them up today - see what I wrote! Today we impose
> the restriction that a device must be able to handle a single "normal"
> page, and if it can't do that, it has to split it up itself.
>
> But yes, you would have to create some out-of-line function to use
> bio_split() until you have chopped things down enough. It's not a good
> thing for performance naturally, but if we consider this a "just make it
> work" fallback, I don't think it's too bad. You want to make a note of
> that it is happening though, so people realize that it is happening.

Hmmm... We could keep the existing scheme too and check that device
drivers split things up if they are too large? Isn't it possible today
to create a huge bio of 2M for huge pages and send it to a device?
Re: [PATCH] Fix out-by-one error in traps.c
On Fri, 31 Aug 2007, Rusty Russell wrote:

> On Thu, 2007-08-30 at 21:44 -0700, Linus Torvalds wrote:
> >
> > Hmm.. This *really* cannot happen with a normal kernel - it implies that
> > the stack has crossed into an invalid page.
>
> AFAICT, a corrupt stack could lead us to touch a page which isn't
> mapped. If we assume the stack isn't corrupt, we don't have to do the
> valid_stack_ptr() check at all...

Fair enough. That said, you seem to see this even without a corrupt stack.

> > Why is that allowed with lguest? What kind of code could validly *ever*
> > come in here and cause problems?
>
> head.S pushes a "$0" on the stack to stop the unwinder, lguest doesn't.

The unwinder should stop when it sees an invalid frame pointer, and even
without the push 0 I'd have expected it to be invalid.

But I suspect lguest triggers another thing: you actually make the stack
start at the *very*top* of the stack area. Afaik, normal x86 does not.

A normal x86 kernel will start off with a pt_regs[] setup, I think - ie
the kernel stack is always set up so that it has the "return to user mode"
information. And *that* difference may be what triggers this for lguest,
even though it can never trigger for a "real" kernel.

But your patch does improve the sanity checking of the frame pointer. That
said, I think the following patch improves it more: does this also work
for you?

(Totally untested, but it looks like the RightThing(tm) to do)

		Linus

---
diff --git a/arch/i386/kernel/traps.c b/arch/i386/kernel/traps.c
index cfffe3d..b9998f3 100644
--- a/arch/i386/kernel/traps.c
+++ b/arch/i386/kernel/traps.c
@@ -100,10 +100,10 @@ asmlinkage void machine_check(void);
 int kstack_depth_to_print = 24;
 static unsigned int code_bytes = 64;
 
-static inline int valid_stack_ptr(struct thread_info *tinfo, void *p)
+static inline int valid_stack_ptr(struct thread_info *tinfo, void *p, unsigned size)
 {
 	return	p > (void *)tinfo &&
-		p < (void *)tinfo + THREAD_SIZE - 3;
+		p <= (void *)tinfo + THREAD_SIZE - size;
 }
 
 static inline unsigned long print_context_stack(struct thread_info *tinfo,
@@ -113,7 +113,7 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo,
 	unsigned long addr;
 
 #ifdef	CONFIG_FRAME_POINTER
-	while (valid_stack_ptr(tinfo, (void *)ebp)) {
+	while (valid_stack_ptr(tinfo, (void *)ebp, 2*sizeof(unsigned long))) {
 		unsigned long new_ebp;
 		addr = *(unsigned long *)(ebp + 4);
 		ops->address(data, addr);
@@ -129,7 +129,7 @@ static inline unsigned long print_context_stack(struct thread_info *tinfo,
 		ebp = new_ebp;
 	}
 #else
-	while (valid_stack_ptr(tinfo, stack)) {
+	while (valid_stack_ptr(tinfo, stack, sizeof(*stack))) {
 		addr = *stack++;
 		if (__kernel_text_address(addr))
 			ops->address(data, addr);
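The difference between the old and the patched bounds check can be tried out in user space. The sketch below is an illustration only: `SKETCH_THREAD_SIZE` is an assumed value rather than the kernel's `THREAD_SIZE`, the function names are invented, and addresses are modelled as plain integers instead of pointers into a real stack.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_THREAD_SIZE 8192u	/* assumed stack area size, illustration only */

/* Old check: only guarantees that the 4 bytes at p are inside the area. */
static int valid_old(uintptr_t base, uintptr_t p)
{
	return p > base && p < base + SKETCH_THREAD_SIZE - 3;
}

/* Patched check: guarantees that p .. p+size-1 is inside the area. */
static int valid_new(uintptr_t base, uintptr_t p, unsigned int size)
{
	assert(size <= SKETCH_THREAD_SIZE);
	return p > base && p <= base + SKETCH_THREAD_SIZE - size;
}
```

Take a frame pointer sitting in the last 4 bytes of the area, `p = base + SKETCH_THREAD_SIZE - 4`. The old check accepts it, yet the frame-pointer loop also reads the return address at `ebp + 4`, which is the first word past the area. With the patched check and a size of 8 (two 4-byte words, as on i386) the same pointer is rejected, while a plain one-word read of size 4 at that address is still allowed.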
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation: kexec jump
Hi!

> To support jumping back from kexeced kernel, before executing the new
> kernel, the devices are put into quiescent state (to be fully
> implemented), and the state of devices and CPU is saved. After jumping
> back from kexeced kernel, the state of devices and CPU are restored
> accordingly. The devices/CPU state save/restore code of software
> suspend is called to implement corresponding function.
>
> Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

Looks quite ok to me...

> Index: linux-2.6.23-rc3/include/asm-i386/kexec.h
> ===================================================================
> --- linux-2.6.23-rc3.orig/include/asm-i386/kexec.h	2007-08-25 21:56:54.0 +0800
> +++ linux-2.6.23-rc3/include/asm-i386/kexec.h	2007-08-25 21:57:00.0 +0800
> @@ -94,6 +94,10 @@
>  		unsigned long start_address,
>  		unsigned int has_pae) ATTRIB_NORET;
>
> +#ifdef CONFIG_KEXEC_JUMP
> +extern asmlinkage int machine_kexec_real_jump(void *buf);
> +#endif

Is it really necessary to have ifdef here?

> +#ifdef CONFIG_KEXEC_JUMP
> +#define KEXEC_JUMP_FLAG_IS_KEXECED_KERNEL	0x1
> +#endif /* CONFIG_KEXEC_JUMP */

And here? ...

It would be nice to use a slightly shorter identifier.
'KJUMP_IS_KEXECED' should be enough.

> +/*
> + * Must be relocatable PIC code callable as a C function
> + */
> +#define HALF_PAGE_ALIGNED (1 << (PAGE_SHIFT-1))
> +
> +#define EBX	0x0
> +#define ESI	0x4
> +#define EDI	0x8
> +#define EBP	0xc
> +#define ESP	0x10
> +#define CR0	0x14
> +#define CR3	0x18
> +#define CR4	0x1c
> +#define FLAG	0x20
> +#define RET	0x24

Hmm, is this enough? Should it use struct pt_regs for normal registers?
What about segment registers -- they could change between kernel
versions. Should some kind of 'version of kjump protocol' be introduced?
What about CX/DX/fpu state? GDT pointer?

Actually I think that you _do_ need to save FPU. You should probably
use relevant swsusp parts here.

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: NFS hang + umount -f: better behaviour requested.
On Tue, 21 Aug 2007, John Stoffel wrote:

> > "Peter" == Peter Staubach <[EMAIL PROTECTED]> writes:
>
> Peter> John Stoffel wrote:
> Robin> I'm bringing this up again (I know it's been mentioned here
> Robin> before) because I had been told that NFS support had gotten
> Robin> better in Linux recently, so I have been (for my $dayjob)
> Robin> testing the behaviour of NFS (autofs NFS, specifically) under
> Robin> Linux with hard,intr and using iptables to simulate a hang.
> >>
> >> So why are you mounting with hard,intr semantics? At my current
> >> SysAdmin job, we mount everything (solaris included) with 'soft,intr'
> >> and it works well. If an NFS server goes down, clients don't hang for
> >> large periods of time.
>
> Peter> Wow! That's _really_ a bad idea. NFS READ operations which
> Peter> timeout can lead to executables which mysteriously fail, file
> Peter> corruption, etc. NFS WRITE operations which fail may or may
> Peter> not lead to file corruption.
>
> Peter> Anything writable should _always_ be mounted "hard" for safety
> Peter> purposes. Readonly mounted file systems _may_ be mounted
> Peter> "soft", depending upon what is located on them.
>
> Not in my experience. We use NetApps as our backing NFS servers, so
> maybe my experience isn't totally relevant. But with a mix of Linux
> and Solaris clients, we've never had problems with soft,intr on our
> NFS clients.

So, there's a power outage and the UPS had a glitch.
Oops, you've got to recover multiple TB and tell users everything
since the last incremental backup is gone.

You use UPS in the computer room but management, in its cost cutting
wisdom, hasn't provided for UPS for your Unix workstations and there's a
power outage. Oops, you've got lots of corrupt files but you don't know
which ones they are so you've got to recover multiple TB and tell users
everything since the last incremental backup is gone.

Ok, so hard mounting may not always save you in these circumstances but
soft mounting will surely get you in the neck.

Ian
Re: recent nfs change causes autofs regression
On Fri, 31 Aug 2007, Jakob Oestergaard wrote:
>
> Trond has a point Linus.

I don't dispute that the new code does something good.

But it changes existing behaviour.

When we add NEW BEHAVIOUR, we don't add it to old interfaces when that breaks old user mode! We add a new flag saying "I want the new behaviour".

This is not rocket science, guys. This is very basic kernel behaviour. The kernel exists only to serve user space, and that means that there is no more important thing to do than to make sure you don't break existing users, unless you have some *damn* strong reasons.

> What he "broke" is, for example, a ro mount being mounted as rw.

No. What he broke was a working and sane setup.

The fact that he may *also* have broken insane setups is totally irrelevant. Don't go off on some tangent that has nothing to do with the regression in question!

> If ext3 in some rare case (which would still mean it hit a few thousand users)
> failed to remember that a file had been marked read-only and allowed writes to
> it, wouldn't we want to fix that too? It would cause regressions, but we'd fix
> it, right?

Stop blathering. Of course we fix security holes. But we don't break things that don't need breaking. This wasn't a security hole. You are making up irrelevant arguments that have nothing to do with this regression.

If you want new behaviour, you add a new flag saying you want new behaviour. You don't just start behaving differently from what you've always done before (and what *other* UNIXes do, for that matter).

Besides, even *if* it was a matter of somebody doing a mount with "rw", when the previous mount was "ro", returning EBUSY is still the wrong thing to do! If the user asks for a new mount that is read-write, he should just get it - ie we should not re-use the old client handles, and we should do what Solaris apparently does, namely to just make it a totally different mount.

In other words, it should (as I already mentioned once) have used "nosharecache" by default, which makes it all work.

Then, people who want to re-use the caches (which in turn may mean that everything needs to have the same flags), THOSE PEOPLE, who want the NEW SEMANTICS (errors and all) should then use a "sharecache" flag.

See? You don't have to screw people over.

> mount passes back the error code on a failed mount. autofs passes that error
> along too (when people configure syslog correctly). In short; when these
> serious mistakes are made and caught, the admin sees an error in his logs.

Bullshit. "Seeing the error in his logs" doesn't help anything.

The problem wasn't the lack of error, the problem was that it was a new and unnecessary error in the first place. Logging it doesn't make it any less buggy.

Linus
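The opt-in approach argued for in this message can be sketched in ordinary userspace C. This is an illustration only, not the NFS client's code: `struct nfs_mount_policy`, `parse_share_option` and `can_mount` are hypothetical names, and only the option names "sharecache" and "nosharecache" and the EBUSY result come from the thread. The default keeps the old behaviour (no cache sharing, so mismatched options never fail); only an explicit "sharecache" opts into the new semantics, errors included.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-mount policy; not a real kernel structure. */
struct nfs_mount_policy {
    bool share_cache;  /* true only if the user asked for "sharecache" */
    bool read_only;    /* "ro" versus "rw" */
};

/*
 * Parse a comma-separated option string. Unknown options are ignored.
 * The default is the OLD behaviour: no cache sharing, read-write.
 */
struct nfs_mount_policy parse_share_option(const char *opts)
{
    struct nfs_mount_policy p = { .share_cache = false, .read_only = false };
    char buf[256];
    char *tok;

    assert(opts != NULL);
    strncpy(buf, opts, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strcmp(tok, "sharecache") == 0)
            p.share_cache = true;
        else if (strcmp(tok, "nosharecache") == 0)
            p.share_cache = false;
        else if (strcmp(tok, "ro") == 0)
            p.read_only = true;
        else if (strcmp(tok, "rw") == 0)
            p.read_only = false;
    }
    return p;
}

/*
 * Decide whether a second mount of the same server filesystem may proceed.
 * Returns 0 on success, or -EBUSY when both mounts asked to share a cache
 * but their flags differ. Without sharing the mounts are independent.
 */
int can_mount(struct nfs_mount_policy existing, struct nfs_mount_policy wanted)
{
    if (!existing.share_cache || !wanted.share_cache)
        return 0;
    if (existing.read_only != wanted.read_only)
        return -EBUSY;
    return 0;
}
```

With these definitions, mounting "ro" and then "rw" from the same server filesystem succeeds by default, while "ro,sharecache" followed by "rw,sharecache" yields -EBUSY, which is the behaviour only those who asked for sharing would see.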
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On Fri, Aug 31 2007, Christoph Lameter wrote:
> On Fri, 31 Aug 2007, Jens Axboe wrote:
>
> > They have nothing to do with each other, you are mixing things up. It
> > has nothing to do with the device being able to dma into that memory or
> > not, we have fine existing infrastructure to handle that. But different
> > hardware have different characteristics on what a single segment is. You
> > can say "a single segment cannot cross a 32kb boundary". So from the
> > example above, your single 64k page may need to be split into two
> > segments. Or it could have a maximum segment size of 32k, in which case
> > it would have to be split as well.
> >
> > Do you see what I mean now?
>
> Ok. So another solution maybe to limit the blocksizes that can be used
> with a device?

That'd work for creation, but not for moving things around.

> > > How do we split that up today? We could add processing to submit_bio
> > > to check for the boundary and create two bios.
> >
> > But we do not split them up today - see what I wrote! Today we impose
> > the restriction that a device must be able to handle a single "normal"
> > page, and if it can't do that, it has to split it up itself.
> >
> > But yes, you would have to create some out-of-line function to use
> > bio_split() until you have chopped things down enough. It's not a good
> > thing for performance naturally, but if we consider this a "just make it
> > work" fallback, I don't think it's too bad. You want to make a note of
> > that it is happening though, so people realize that it is happening.
>
> H.. We could keep the existing scheme too and check that device
> drivers split things up if they are too large? Isn't it possible today
> to create a huge bio of 2M for huge pages and send it to a device?

Not sure, aren't the constituents of compound pages the basis for IO?

--
Jens Axboe
Re: New x86-Setup code breaks HVM-XEN boot
On Thu, Aug 30, 2007 at 12:04:09PM -0700, Jeremy Fitzhardinge wrote:
> Are there any messages on the xen console ("xm dmesg")? Or logs ("xm
> log")? What version of Xen are you using? What does your domain config
> file look like?

Domain Config:

| kernel = "hvmloader"
| builder='hvm'
| memory = 256
| shadow_memory = 16
| name = "ceh-lin-hvm"
| vif = [ 'type=ioemu, mac=00:16:3e:aa:00:08, bridge=xenbr0,
| model=ne2k_pci' ]
| disk = [ 'file:/xen/ceh-lin-hvm.img,ioemu:hda,w', 'file:/xen/cd/debian-40r0-i386-netinst.iso,hdc:cdrom,r' ]
| device_model = 'qemu-dm'
| boot="cad"
| sdl=0
| vnc=1
| vncunused=0
| stdvga=0
| serial='pty'

- xm dmesg output -

(XEN) vmx_do_launch(): GUEST_CR3<=00bbfda0, HOST_CR3<=a1f47000
(XEN) (GUEST: 353) HVM Loader
(XEN) (GUEST: 353) Detected Xen v3.0.3-1
(XEN) (GUEST: 353) Writing SMBIOS tables ...
(XEN) (GUEST: 353) Loading ROMBIOS ...
(XEN) (GUEST: 353) Loading Cirrus VGABIOS ...
(XEN) (GUEST: 353) Loading VMXAssist ...
(XEN) (GUEST: 353) VMX go ...
(XEN) (GUEST: 353) VMXAssist (Nov 2 2006)
(XEN) (GUEST: 353) Memory size 256 MB
(XEN) (GUEST: 353) E820 map:
(XEN) (GUEST: 353) - 0009F000 (RAM)
(XEN) (GUEST: 353) 0009F000 - 000A (Reserved)
(XEN) (GUEST: 353) 000A - 000C (Type 16)
(XEN) (GUEST: 353) 000F - 0010 (Reserved)
(XEN) (GUEST: 353) 0010 - 0FFF (RAM)
(XEN) (GUEST: 353) 0FFF - 0FFFA000 (ACPI Data)
(XEN) (GUEST: 353) 0FFFA000 - 0FFFD000 (ACPI NVS)
(XEN) (GUEST: 353) 0FFFD000 - 0FFFE000 (Type 19)
(XEN) (GUEST: 353) 0FFFE000 - 0000 (Type 18)
(XEN) (GUEST: 353) 0000 - 1000 (Type 17)
(XEN) (GUEST: 353) FEC0 - 0001 (Type 16)
(XEN) (GUEST: 353)
(XEN) (GUEST: 353) Start BIOS ...
(XEN) (GUEST: 353) Starting emulated 16-bit real-mode: ip=F000:FFF0
(XEN) (GUEST: 353) rombios.c,v 1.138 2005/05/07 15:55:26 vruppert Exp $
(XEN) (GUEST: 353) Remapping master: ICW2 0x8 -> 0x20
(XEN) (GUEST: 353) Remapping slave: ICW2 0x70 -> 0x28
(XEN) (GUEST: 353) VGABios $Id: vgabios.c,v 1.61 2005/05/24 16:50:50 vruppert Exp $
(XEN) (GUEST: 353) HVMAssist BIOS, 1 cpu, $Revision: 1.138 $ $Date: 2005/05/07 15:55:26 $
(XEN) (GUEST: 353)
(XEN) (GUEST: 353) ata0-0: PCHS=5079/16/63 translation=lba LCHS=634/128/63
(XEN) (GUEST: 353) ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (2500 MBytes)
(XEN) (GUEST: 353) ata0 slave: Unknown device
(XEN) (GUEST: 353) ata1 master: QEMU CD-ROM ATAPI-4 CD-Rom/DVD-Rom
(XEN) (GUEST: 353) ata1 slave: Unknown device
(XEN) (GUEST: 353)
(XEN) (GUEST: 353) Booting from Hard Disk...
(XEN) (GUEST: 353) KBD: unsupported int 16h function 03
(XEN) (GUEST: 353) *** int 15h function AX=E980, BX= not yet supported!
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=81
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=81
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=81
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=82
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=82
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=82
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=83
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=83
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=83
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=84
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=84
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=84
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=85
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=85
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=85
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=86
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=86
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=86
(XEN) (GUEST: 353) int13_harddisk: function 41, unmapped device for ELDL=87
(XEN) (GUEST: 353) int13_harddisk: function 42, unmapped device for ELDL=87
(XEN) (GUEST: 353) int13_harddisk: function 02, unmapped device for ELDL=87
(XEN) (GUEST: 353) int13_harddisk: function 41, ELDL out of range 88
(XEN) (GUEST: 353) int13_harddisk: function 42, ELDL out of range 88
(XEN) (GUEST: 353) int13_harddisk: function 02, ELDL out of range 88
(XEN) (GUEST: 353) int13_harddisk: function 41, ELDL out of range 89
(XEN) (GUEST: 353) int13_harddisk: function 42, ELDL out of range 89
(XEN) (GUEST: 353) int13_harddisk: function 02, ELDL out of range 89
(XEN) (GUEST: 353) int13_harddisk: function 41, ELDL out of range 8A
(XEN) (GUEST: 353) int13_harddisk: function 42, ELDL
Re: recent nfs change causes autofs regression
> > It's not very conservative to suddenly change default behavior and break
> > autofs mounts. There is not even one kernel message that "_tells_ user why
> > it thinks it's wrong". It just silently fails.
>
> No it doesn't. It reports an error code to the caller. If autofs is
> failing silently, then that is a bug in autofs: mount will report the
> error to the user.

Wrong(tm). autofs AND mounting at the command line just say:

    mount.nfs: /mnt is already mounted or busy

Which has an actual information value of about 1%.

In my case I moved an NFS-exported directory inside another NFS-exported directory months ago and placed a symlink where the directory was (on the server side). It never occurred to me that that was "wrong"(tm). Now I can only mount one of the two mounts and the other just says "busy".

After reading this I could fix my case easily. I just erased the "deeper" mount and symlinked the directory from the other mount. But YOU HAVE TO KNOW THAT YOU DID SOMETHING WRONG. Just getting a "busy" leaves you with question marks flying around your head!

So long
--
Real Programmers consider "what you see is what you get" to be just as bad a concept in Text Editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous.
Re: hda: set_drive_speed_status: status=0x51 { DriveReady SeekComplete Error }
Alan Cox wrote:

> John Sigler wrote:
>
> > http://www.pqimemory.com/documents/domdata.pdf
> >
> > PIO mode 2 is mentioned. Even DMA seems to be supported. Or am I
> > mistaken? Could there be a bug in my south bridge?
>
> Nothing there about DMA support.

cf. document's page 12.

DMACK- (DMA acknowledge)
This signal shall be used by the host in response to DMARQ to initiate DMA transfers.

DMARQ (DMA request)
This signal, used for DMA data transfer between host and device, shall be asserted by the device when it is ready to transfer data to or from the host. The direction of data transfer is controlled by DIOR- and DIOW-. This signal is used in a handshake manner with DMACK- i.e., the device shall wait until the host asserts DMACK- before negating DMARQ, and re-asserting DMARQ if there is more data to transfer. This line shall be released (high impedance state) whenever the device is not selected or is selected and no DMA command is in progress. When enabled by DMA transfer, it shall be driven high and low by the device. When a DMA operation is enabled, CS0- and CS1- shall not be asserted and transfers shall be 16-bits wide.

I took the above to mean the device was designed to support DMA. Where did I err?

> The data sheet says the media can only do 4.1MB/second which is
> consistent with only needing PIO2 (actually it's far slower than PIO2)

Is such a slow speed typical of DOMs sold today? Or do DOMs sold today support DMA bus mastering, much higher interface rates, and much higher sustained throughput?

Regards.
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation: kexec jump
On Mon, 2007-08-27 at 18:48 +, Pavel Machek wrote:
> Hi!
>
> > To support jumping back from kexeced kernel, before executing the new
> > kernel, the devices are put into quiescent state (to be fully
> > implemented), and the state of devices and CPU is saved. After jumping
> > back from kexeced kernel, the state of devices and CPU are restored
> > accordingly. The devices/CPU state save/restore code of software
> > suspend is called to implement corresponding function.
> >
> > Signed-off-by: Huang Ying <[EMAIL PROTECTED]>
>
> Looks quite ok to me...
>
> > Index: linux-2.6.23-rc3/include/asm-i386/kexec.h
> > ===
> > --- linux-2.6.23-rc3.orig/include/asm-i386/kexec.h 2007-08-25 21:56:54.0 +0800
> > +++ linux-2.6.23-rc3/include/asm-i386/kexec.h 2007-08-25 21:57:00.0 +0800
> > @@ -94,6 +94,10 @@
> > unsigned long start_address,
> > unsigned int has_pae) ATTRIB_NORET;
> >
> > +#ifdef CONFIG_KEXEC_JUMP
> > +extern asmlinkage int machine_kexec_real_jump(void *buf);
> > +#endif
>
> Is it really necessary to have ifdef here?

It is not necessary. I will fix it in the next version.

> > +#ifdef CONFIG_KEXEC_JUMP
> > +#define KEXEC_JUMP_FLAG_IS_KEXECED_KERNEL 0x1
> > +#endif /* CONFIG_KEXEC_JUMP */
>
> And here? ... It would be nice to use slightly shorter identifier.
> 'KJUMP_IS_KEXECED' should be enough.

Yes, that is nicer. I will fix it.

> > +/*
> > + * Must be relocatable PIC code callable as a C function
> > + */
> > +#define HALF_PAGE_ALIGNED (1 << (PAGE_SHIFT-1))
> > +
> > +#define EBX  0x0
> > +#define ESI  0x4
> > +#define EDI  0x8
> > +#define EBP  0xc
> > +#define ESP  0x10
> > +#define CR0  0x14
> > +#define CR3  0x18
> > +#define CR4  0x1c
> > +#define FLAG 0x20
> > +#define RET  0x24
>
> Hmm, is this enough? Should it use struct ptregs for normal registers?
> What about segment registers -- they could change between kernel
> version. Should some kind of 'version of kjump protocol' be
> introduced?

All "preserve" registers defined in the ABI are saved; I think that is sufficient. The "swsusp_arch_suspend" function saves only these registers too.

An extensible inter-kernel kjump protocol and a corresponding version number seem sensible. I will work on this.

> > What about CX/DX/fpu state? GDT pointer?
>
> Actually I think that you _do_ need to save FPU. You should probably
> use relevant swsusp parts here.

Before and after "machine_kexec_jump" is called, save_processor_state() and restore_processor_state() are called, where the MTRR/FPU/GDT/IDT/TR/segments/cr are saved and restored. These two functions come from swsusp. Thanks swsusp guys. :)

Best Regards,
Huang Ying
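The register offsets quoted from the patch describe a small save area made of ten 4-byte slots. As an illustration only, the same layout can be written as a C structure and checked against those offsets. `struct kjump_regs` and the two functions are hypothetical names; the patch itself uses plain #define offsets for assembly, and this sketch assumes 32-bit slots as on i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the save area; field order follows the #defines. */
struct kjump_regs {
    uint32_t ebx;   /* EBX  0x0  */
    uint32_t esi;   /* ESI  0x4  */
    uint32_t edi;   /* EDI  0x8  */
    uint32_t ebp;   /* EBP  0xc  */
    uint32_t esp;   /* ESP  0x10 */
    uint32_t cr0;   /* CR0  0x14 */
    uint32_t cr3;   /* CR3  0x18 */
    uint32_t cr4;   /* CR4  0x1c */
    uint32_t flag;  /* FLAG 0x20 */
    uint32_t ret;   /* RET  0x24 */
};

/* Returns 1 if every field sits at the offset quoted in the patch. */
int kjump_layout_matches(void)
{
    return offsetof(struct kjump_regs, ebx)  == 0x0  &&
           offsetof(struct kjump_regs, esi)  == 0x4  &&
           offsetof(struct kjump_regs, edi)  == 0x8  &&
           offsetof(struct kjump_regs, ebp)  == 0xc  &&
           offsetof(struct kjump_regs, esp)  == 0x10 &&
           offsetof(struct kjump_regs, cr0)  == 0x14 &&
           offsetof(struct kjump_regs, cr3)  == 0x18 &&
           offsetof(struct kjump_regs, cr4)  == 0x1c &&
           offsetof(struct kjump_regs, flag) == 0x20 &&
           offsetof(struct kjump_regs, ret)  == 0x24;
}

/* Aborts if the C layout and the assembly offsets ever drift apart. */
void kjump_check_layout(void)
{
    assert(kjump_layout_matches());
}
```

A check of this kind is one way to keep C and assembly views of a shared area in step; whether the real patch should adopt a structure, or struct pt_regs as suggested in the review, is left open in the thread.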
Re: recent nfs change causes autofs regression
On Fri, Aug 31, 2007 at 09:40:28AM +0200, Jakob Oestergaard wrote:
> On Thu, Aug 30, 2007 at 10:16:37PM -0700, Linus Torvalds wrote:
> > > ...
> > > Why aren't we doing that for any other filesystem than NFS?
> >
> > How hard is it to acknowledge the following little word:
> >
> > "regression"
> >
> > It's simple. You broke things. You may want to fix them, but you need to
> > fix them in a way that does not break user space.
>
> Trond has a point Linus.
>
> What he "broke" is, for example, a ro mount being mounted as rw.
>
> That *could* be a very serious security (etc.etc.) problem which he just fixed.
> Anything depending on read-only not being enforced will cease to work, of
> course, and that is what a few people complain about(!).
>
> If ext3 in some rare case (which would still mean it hit a few thousand users)
> failed to remember that a file had been marked read-only and allowed writes to
> it, wouldn't we want to fix that too? It would cause regressions, but we'd fix
> it, right?
>
> mount passes back the error code on a failed mount. autofs passes that error
> along too (when people configure syslog correctly). In short; when these
> serious mistakes are made and caught, the admin sees an error in his logs.

Hua explained already that seeing the error is not the same as fixing the error: he cannot fix it because NFS implies other systems we _must_ co-operate with.

--
Frank
Re: recent nfs change causes autofs regression
On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote:
> I am re-sending this after help from Ian and git-bisect. To me it's a
> show-stopper: I cannot find an acceptable workaround that I can implement.
>
> The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
> mounts to fail silently - they just do not appear when they should.
>
> I believe it's caused by the NFS change that forces multiple mounts from
> different directories under the same server side filesystem to have the same
> mount options by default, otherwise it returns EBUSY.
>
> For example, if a server has a filesystem /a, and it exports /a/x and /a/y
> (maybe with rw or ro), a client must now mount /a/x and /a/y with the same
> mount options.
>
> Since in my setup they are managed by autofs, and the autofs map is managed
> by nis, there is no way I could easily work around it.
>
> If we have to live with this regression, I want to hear some suggestions
> about how to fix them realistically. Thanks.
>
> By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
> says:
>
> c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
> commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
> Author: Frank van Maarseveen <[EMAIL PROTECTED]>
> Date: Mon Jul 9 22:25:29 2007 +0200
>
> NLM: fix source address of callback to client
>
> Use the destination address of the original NLM request as the
> source address in callbacks to the client.
>
> Signed-off-by: Frank van Maarseveen <[EMAIL PROTECTED]>
> Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
>
> :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f 105fbd3cb3fa5e3019836b4b5268125d0181a72d M fs
> :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41 2fec08debe51c20423a88b1a0d4281c683ba5daf M include

This commit has no relation to the mount problem, assuming the commit and its comment match.

--
Frank
possible build system oddity
Hi,

I only touched sound/usb/usbaudio.c. Nevertheless the whole subtree under sound/ is recompiling. What's happening?

Regards
Oliver
Re: [11/36] Use page_cache_xxx in fs/buffer.c
On 00:52 Fri 31 Aug , Christoph Lameter wrote:
> On Fri, 31 Aug 2007, Jens Axboe wrote:
>
> > They have nothing to do with each other, you are mixing things up. It
> > has nothing to do with the device being able to dma into that memory or
> > not, we have fine existing infrastructure to handle that. But different
> > hardware have different characteristics on what a single segment is. You
> > can say "a single segment cannot cross a 32kb boundary". So from the
> > example above, your single 64k page may need to be split into two
> > segments. Or it could have a maximum segment size of 32k, in which case
> > it would have to be split as well.
> >
> > Do you see what I mean now?
>
> Ok. So another solution maybe to limit the blocksizes that can be used
> with a device?

IMHO that is not good, because after a filesystem has been created with a big block size, its image can't be used on other devices. We may just rewrite submit_bh similar to drivers/md/dm-io.c:do_region, with the following pseudocode:

    remaining = super_page_size();
    while (remaining) {
        init_bio(bio);
        /* Try and add as many pages as possible */
        while (remaining) {
            dp->get_page(dp, &page, &len, &offset);
            len = min(len, to_bytes(remaining));
            if (!bio_add_page(bio, page, len, offset))
                break;
            offset = 0;
            remaining -= to_sector(len);
            dp->next_page(dp);
        }
        atomic_inc(&io->count);
        submit_bio(rw, bio);
    }

> > > How do we split that up today? We could add processing to submit_bio
> > > to check for the boundary and create two bios.
> >
> > But we do not split them up today - see what I wrote! Today we impose
> > the restriction that a device must be able to handle a single "normal"
> > page, and if it can't do that, it has to split it up itself.
> >
> > But yes, you would have to create some out-of-line function to use
> > bio_split() until you have chopped things down enough. It's not a good
> > thing for performance naturally, but if we consider this a "just make it
> > work" fallback, I don't think it's too bad. You want to make a note of
> > that it is happening though, so people realize that it is happening.
>
> H.. We could keep the existing scheme too and check that device
> drivers split things up if they are too large? Isn't it possible today
> to create a huge bio of 2M for huge pages and send it to a device?
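The two hardware limits mentioned in the quoted text (a segment may not cross an aligned boundary, and a segment may not exceed a maximum size) can be illustrated with a small standalone C function. This is a sketch under stated assumptions, not block-layer code: `struct segment` and `split_segments` are hypothetical names, addresses are plain integers, and the boundary is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one hardware segment. */
struct segment {
    uint64_t start;  /* start address of the segment */
    uint32_t len;    /* length in bytes */
};

/*
 * Split the range [start, start + len) into segments that are at most
 * max_seg bytes long and never cross a boundary-aligned address.
 * boundary must be a power of two. Writes at most cap entries to out and
 * returns how many were written; it stops early if cap is reached.
 */
size_t split_segments(uint64_t start, uint64_t len, uint32_t max_seg,
                      uint64_t boundary, struct segment *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    assert(max_seg > 0);
    assert(boundary > 0 && (boundary & (boundary - 1)) == 0);

    while (len > 0 && n < cap) {
        /* Bytes left before the next boundary-aligned address. */
        uint64_t to_boundary = boundary - (start & (boundary - 1));
        uint64_t chunk = len;

        if (chunk > max_seg)
            chunk = max_seg;
        if (chunk > to_boundary)
            chunk = to_boundary;

        out[n].start = start;
        out[n].len = (uint32_t)chunk;
        n++;

        start += chunk;
        len -= chunk;
    }
    return n;
}
```

For the example from the thread, a 64k page at a 64k-aligned address with a 32k boundary produces two 32k segments, and the same page with a 32k maximum segment size (and a much larger boundary) is also split in two.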
Re: hda: set_drive_speed_status: status=0x51 { DriveReady SeekComplete Error }
Eric wrote:

> John Sigler wrote:
>
> > According to my supplier, here is the data sheet for the DOMs:
> > http://www.pqimemory.com/documents/domdata.pdf
> >
> > PIO mode 2 is mentioned. Even DMA seems to be supported. Or am I
> > mistaken?
>
> Page 3 states max interface burst speed is 8.3MB/s in PIO2. I wouldn't
> assume it supports DMA

The reason I suspected DMA support is because I noticed the description of DMACK- (DMA acknowledge) and DMARQ (DMA request).

> Based on the quoted media transfer rates (1.2MB/s write and 4.1MB/s
> read), DMA would buy you a transfer checksum but probably not much
> performance, unless your embedded application is CPU bound.

What I fear is that programmed I/O will tie up the CPU and add non-deterministic latency to my real-time apps.

Suppose that an app is waiting for an acknowledgement from a PCI device when the OS suddenly decides it is time to write 4 KB to disk. Typical write rate is quoted as 1.2 MB/s, i.e. the write would require at least 3.4 ms to complete. My fear is that the entire transfer is done in a non-preemptible critical section. In other words, my real-time app would be delayed several milliseconds, which is unacceptable.

Am I mistaken?

Regards.
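The 3.4 ms estimate in this message is simple arithmetic: transfer size divided by sustained rate. A minimal C sketch follows, assuming 4 KB means 4096 bytes and 1 MB/s means 1,000,000 bytes per second (the data sheet's exact unit is not stated in the thread); `transfer_time_us` is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Time in microseconds needed to move `bytes` at `bytes_per_sec`,
 * rounded up to the next whole microsecond.
 */
unsigned long transfer_time_us(unsigned long bytes, unsigned long bytes_per_sec)
{
    assert(bytes_per_sec > 0);

    unsigned long long us =
        ((unsigned long long)bytes * 1000000ULL + bytes_per_sec - 1) /
        bytes_per_sec;

    return (unsigned long)us;
}
```

Under those assumptions, transfer_time_us(4096, 1200000) gives 3414 microseconds for the quoted write rate, which matches the "at least 3.4 ms" figure, and transfer_time_us(4096, 4100000) gives 1000 microseconds for the quoted read rate. Whether the CPU is actually held for that whole time depends on how the PIO transfer is scheduled, which is the open question in the message.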
Re: recent nfs change causes autofs regression
On Fri, Aug 31, 2007 at 01:07:56AM -0700, Linus Torvalds wrote:
...
> When we add NEW BEHAVIOUR, we don't add it to old interfaces when that
> breaks old user mode! We add a new flag saying "I want the new behaviour".
>
> This is not rocket science, guys. This is very basic kernel behaviour. The
> kernel exists only to serve user space, and that means that there is no
> more important thing to do than to make sure you don't break existing
> users, unless you have some *damn* strong reasons.

100% agreed.

> The fact that he may *also* have broken insane setups is totally
> irrelevant. Don't go off on some tangent that has nothing to do with the
> regression in question!

It does not have "nothing" to do with the regression.

Some setups which worked more by accident than by design earlier on were broken by the fix. This could have been avoided, I agree, but the breakage was caused by the fix (or the breakage is the fix, however you prefer to look at it).

> > If ext3 in some rare case (which would still mean it hit a few thousand users)
> > failed to remember that a file had been marked read-only and allowed writes to
> > it, wouldn't we want to fix that too? It would cause regressions, but we'd fix
> > it, right?
>
> Stop blathering. Of course we fix security holes. But we don't break
> things that don't need breaking. This wasn't a security hole.

*Part* of it wasn't a security hole. The other half very much was.

...
> In other words, it should (as I already mentioned once) have used
> "nosharecache" by default, which makes it all work.
>
> Then, people who want to re-use the caches (which in turn may mean that
> everything needs to have the same flags), THOSE PEOPLE, who want the NEW
> SEMANTICS (errors and all) should then use a "sharecache" flag.
>
> See? You don't have to screw people over.

Sure, given that Trond (or whoever) has the time it takes to go and implement all of this, there's no need to screw anyone.

Assuming he's on a schedule and this will have to wait, I agree with him that it makes the most sense to play it safe security/consistency-wise rather than functionality-wise.

> > mount passes back the error code on a failed mount. autofs passes that error
> > along too (when people configure syslog correctly). In short; when these
> > serious mistakes are made and caught, the admin sees an error in his logs.
>
> Bullshit. "Seeing the error in his logs" doesn't help anything.

It makes troubleshooting possible, which addresses *the* major complaint from *one* of the *two* people who complained about this.

--
/ jakob
Re: [PATCH 2.6.23] ibmebus: Prevent bus_id collisions
[EMAIL PROTECTED] (Linas Vepstas) wrote on 30.08.2007 23:28:16:

> On Thu, Aug 30, 2007 at 04:00:56PM +0200, Joachim Fenkes wrote:
> >
> > Plus, I rather like using
> > the full_name since it also contains a descriptive name as opposed to
> > being just nondescript numbers, helping the layman (ie user) to make sense
> > out of a dev_id.
>
> [...]
> Location codes are nice.

I sure agree with you. Still, they're not unique, so we can't use them as bus_id. That's why we're having this discussion at all.

What I meant was "I like using the full_name over using the device address / DRC index / you-name-it only". Location code is right out.

Cheers,
Joachim
Re: recent nfs change causes autofs regression
--- Ian Kent <[EMAIL PROTECTED]> wrote:

> On Thu, 30 Aug 2007, Linus Torvalds wrote:
> >
> > On Fri, 31 Aug 2007, Trond Myklebust wrote:
> > >
> > > It did not. The previous behaviour was to always silently override the
> > > user mount options.
> >
> > ..so it still worked for any sane setup, at least.
> >
> > You broke that. Hua gave good reasons for why he cannot use the current
> > kernel. It's a regression.
> >
> > In other words, the new behaviour is *worse* than the behaviour you
> > consider to be the incorrect one.
>
> This all came about due to complaints about not being able to mount the
> same server file system with different options, most commonly ro vs. rw,
> which I think was due to the shared super block changes some time ago.
> And, to some extent, I have to plead guilty for not complaining enough
> about this default in the beginning, which is basically unacceptable for
> sure.
>
> We have seen breakage in Fedora with the introduction of the patches and
> this is typical of it. It also breaks amd and admins have no way of
> altering this that I'm aware of (help us here Ion).
>
> I understand Trond's concerns but the fact remains that other Unixes allow
> this behaviour but don't assert cache coherency, and many sysadmins don't
> realize this. So the broken behavior is expected to work and we can't
> simply stop allowing it unless we want to attend a public hanging with us
> as the participants.
>
> There is no question that the new behavior is worse and this change is
> unacceptable as a solution to the original problem.
>
> I really think that reversing the default, as has been suggested,
> documenting the risk in the mount.nfs man page and perhaps issuing a
> warning from the kernel is a better way to handle this. At least we will
> be doing more to raise public awareness of the issue than others.

I can only second that. Changing the default behavior in this way is really bad. Not that I am disagreeing with the technical reasons, but the change breaks working setups. And -EBUSY is not very helpful as a message here. It does not matter that the user tools may handle the breakage incorrectly. The users (admins) had working setups for years. And they were obviously working "good enough".

And one should not forget that there will be a considerable time until "nosharecache" trickles down into distributions.

If the situation stays this way, quite a few people will not be able to move beyond 2.6.22 for some time. For example, I work for a company that operates some Linux "clusters" at a few German automotive companies. For certain reasons everything there is based on automounter maps (both autofs and amd style). We have almost zero influence on that setup. The maps are a mess - we will run into the sharecache problem. At the same time I am trying to fight the notorious "system turns into frozen molasses on moderate I/O load". There may be some interesting developments coming after 2.6.22. Not good :-(

What I would like to see done for the situation at hand is:

- make "nosharecache" the default for the foreseeable future
- log any attempt to mount option-inconsistent NFS filesystems to dmesg and syslog (apparently the NFS client is able to detect them :-). Do this regardless of the "nosharecache" option. This way admins will at least be made aware of the situation.
- In a year or so we can talk about making the default safe. With proper advertising.

Just my 0.02.

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
[git patches] libata fixes
Fixes, some new ids, and a version bump that we discovered was missing from several drivers. Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream-linus to receive the following updates: drivers/ata/ata_generic.c |2 +- drivers/ata/ata_piix.c | 72 +++- drivers/ata/libata-core.c | 16 +++-- drivers/ata/pata_ali.c |2 +- drivers/ata/pata_amd.c |2 +- drivers/ata/pata_atiixp.c |2 +- drivers/ata/pata_cs5520.c |2 +- drivers/ata/pata_cs5530.c |2 +- drivers/ata/pata_isapnp.c |2 +- drivers/ata/pata_it821x.c |2 +- drivers/ata/pata_marvell.c |2 + drivers/ata/pata_mpc52xx.c |2 +- drivers/ata/pata_pcmcia.c |2 +- drivers/ata/pata_pdc2027x.c|2 +- drivers/ata/pata_platform.c|2 +- drivers/ata/pata_sc1200.c |2 +- drivers/ata/pata_scc.c |2 +- drivers/ata/pata_serverworks.c |2 +- drivers/ata/pata_sil680.c |2 +- drivers/ata/pata_sl82c105.c|2 +- drivers/ata/pdc_adma.c |2 +- drivers/ata/sata_inic162x.c|2 +- drivers/ata/sata_mv.c |2 +- drivers/ata/sata_nv.c |2 +- drivers/ata/sata_promise.c |6 ++-- drivers/ata/sata_qstor.c |2 +- drivers/ata/sata_sil.c |2 +- drivers/ata/sata_sil24.c |2 +- drivers/ata/sata_sis.c |2 +- drivers/ata/sata_svw.c |2 +- drivers/ata/sata_sx4.c |2 +- drivers/ata/sata_uli.c |2 +- drivers/ata/sata_via.c |2 +- drivers/ata/sata_vsc.c |2 +- include/linux/ata.h| 13 +++ include/linux/libata.h |1 + 36 files changed, 133 insertions(+), 37 deletions(-) Alan Cox (2): libata-core: Allow translation setting to fail pata_marvell: Add more identifiers Bartlomiej Zolnierkiewicz (1): ata: add ATA_MWDMA* and ATA_SWDMA* defines Jason Gaston (1): ata_piix: IDE mode SATA patch for Intel Tolapai Jeff Garzik (1): [libata] Bump driver versions Mikael Pettersson (1): sata_promise: FastTrack TX4200 is a second-generation chip Tejun Heo (3): ata_piix: add Satellite U200 to broken suspend list libata: implement BROKEN_HPA horkage and apply it to affected drives ata_piix: implement IOCFG bit18 quirk diff --git 
a/drivers/ata/ata_generic.c b/drivers/ata/ata_generic.c index 430fcf4..9454669 100644 --- a/drivers/ata/ata_generic.c +++ b/drivers/ata/ata_generic.c @@ -26,7 +26,7 @@ #include #define DRV_NAME "ata_generic" -#define DRV_VERSION "0.2.12" +#define DRV_VERSION "0.2.13" /* * A generic parallel ATA driver using libata diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c index 071d274..e40c94f 100644 --- a/drivers/ata/ata_piix.c +++ b/drivers/ata/ata_piix.c @@ -94,7 +94,7 @@ #include #define DRV_NAME "ata_piix" -#define DRV_VERSION"2.11" +#define DRV_VERSION"2.12" enum { PIIX_IOCFG = 0x54, /* IDE I/O configuration register */ @@ -130,6 +130,7 @@ enum { ich6m_sata_ahci = 8, ich8_sata_ahci = 9, piix_pata_mwdma = 10, /* PIIX3 MWDMA only */ + tolapai_sata_ahci = 11, /* constants for mapping table */ P0 = 0, /* port 0 */ @@ -253,6 +254,8 @@ static const struct pci_device_id piix_pci_tbl[] = { { 0x8086, 0x292d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci }, /* SATA Controller IDE (ICH9M) */ { 0x8086, 0x292e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci }, + /* SATA Controller IDE (Tolapai) */ + { 0x8086, 0x5028, PCI_ANY_ID, PCI_ANY_ID, 0, 0, tolapai_sata_ahci }, { } /* terminate list */ }; @@ -441,12 +444,25 @@ static const struct piix_map_db ich8_map_db = { }, }; +static const struct piix_map_db tolapai_map_db = { +.mask = 0x3, +.port_enable = 0x3, +.map = { +/* PM PS SM SS MAP */ +{ P0, NA, P1, NA }, /* 00b */ +{ RV, RV, RV, RV }, /* 01b */ +{ RV, RV, RV, RV }, /* 10b */ +{ RV, RV, RV, RV }, +}, +}; + static const struct piix_map_db *piix_map_db_table[] = { [ich5_sata] = &ich5_map_db, [ich6_sata] = &ich6_map_db, [ich6_sata_ahci]= &ich6_map_db, [ich6m_sata_ahci] = &ich6m_map_db, [ich8_sata_ahci]= &ich8_map_db, + [tolapai_sata_ahci] = &tolapai_map_db, }; static struct ata_port_info piix_port_info[] = { @@ -560,6 +576,17 @@ static struct ata_port_info piix_port_info[] = { .mwdma_mask = 0x06, /* mwdma1-2 ?? CHECK 0 should be ok but slow */ .port_ops = &piix_pata_ops, },
PROBLEM: kernel 2.6.22.6 pata_pdc202xx_old.c limiting to UDMA/33 instead of UDMA/100 (UPDATED 2.6.22.6)
Update with kernel 2.6.22.6 i am getting this error now ata2.00: ATA-6: ST3120026A, 3.06, max UDMA/100 here is the new error. ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata2.01: cmd ca/00:00:25:9c:fc/00:00:00:00:00/f6 tag 0 cdb 0x0 data 131072 out res 40/00:00:3f:00:00/00:00:00:00:00/f0 Emask 0x4 (timeout) ata2: port is slow to respond, please be patient (Status 0xfe) ata2: device not ready (errno=-16), forcing hardreset ata2: soft resetting port ata2.01: configured for UDMA/25 ata2: EH complete here is dmseg output [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sde: sde1 sd 3:0:0:0: [sde] Attached SCSI disk scsi 3:0:1:0: Direct-Access ATA Maxtor 6Y120L0 YAR4 PQ: 0 ANSI: 5 sd 3:0:1:0: [sdf] 240121728 512-byte hardware sectors (122942 MB) sd 3:0:1:0: [sdf] Write Protect is off sd 3:0:1:0: [sdf] Mode Sense: 00 3a 00 00 sd 3:0:1:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 3:0:1:0: [sdf] 240121728 512-byte hardware sectors (122942 MB) sd 3:0:1:0: [sdf] Write Protect is off sd 3:0:1:0: [sdf] Mode Sense: 00 3a 00 00 sd 3:0:1:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdf: sdf1 sd 3:0:1:0: [sdf] Attached SCSI disk usbmon: debugfs is not available USB Universal Host Controller Interface driver v3.0 ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11 PCI: setting IRQ 11 as level-triggered ACPI: PCI Interrupt :00:04.2[D] -> Link [LNKC] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd :00:04.2: UHCI Host Controller uhci_hcd :00:04.2: new USB bus registered, assigned bus number 1 uhci_hcd :00:04.2: irq 11, io base 0xd400 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected ACPI: PCI Interrupt :00:04.3[D] -> Link [LNKC] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd :00:04.3: UHCI Host Controller uhci_hcd :00:04.3: new USB bus registered, assigned bus number 2 uhci_hcd :00:04.3: irq 11, io base 0xd000 usb usb2: 
configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1 PNP: PS/2 controller doesn't have AUX irq; using default 12 serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [EMAIL PROTECTED] input: AT Translated Set 2 keyboard as /class/input/input2 usb 1-1: new low speed USB device using uhci_hcd and address 2 usb 1-1: configuration #1 chosen from 1 choice usb 1-2: new low speed USB device using uhci_hcd and address 3 usb 1-2: configuration #1 chosen from 1 choice usbcore: registered new interface driver hiddev hiddev96: USB HID v1.10 Device [CPS UPS AE550] on usb-:00:04.2-1 input: Logitech USB Receiver as /class/input/input3 input: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-:00:04.2-2 usbcore: registered new interface driver usbhid drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver Advanced Linux Sound Architecture Driver Version 1.0.14 (Thu May 31 09:03:25 2007 UTC). ACPI: PCI Interrupt :00:04.5[C] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10 PCI: Setting latency timer of device :00:04.5 to 64 ALSA device list: #0: VIA 82C686A/B rev20 with AD1881A at 0xb800, irq 10 TCP cubic registered Initializing XFRM netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI Shortcut mode XFS mounting filesystem sdc2 Ending clean XFS mount for filesystem: sdc2 VFS: Mounted root (xfs filesystem) readonly. 
Freeing unused kernel memory: 208k freed sd 0:0:0:0: Attached scsi generic sg0 type 0 sd 1:0:1:0: Attached scsi generic sg1 type 0 sd 2:0:0:0: Attached scsi generic sg2 type 0 sd 2:0:1:0: Attached scsi generic sg3 type 0 sd 3:0:0:0: Attached scsi generic sg4 type 0 sd 3:0:1:0: Attached scsi generic sg5 type 0 parport_pc: VIA 686A/8231 detected parport_pc: probing current configuration parport_pc: Current parallel port base: 0x378 parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE] parport_pc: VIA parallel port: io=0x378, irq=7 r8169 Gigabit Ethernet driver 2.2LK-NAPI loaded ACPI: PCI Interrupt :00:0f.0[A] -> Link [LNKC] -> GSI 11 (level, low) -> IRQ 11 eth0: RTL8169sb/8110sb at 0xe882c000, 00:14:d1:38:5e:25, IRQ 11 8139too Fast Ethernet driver 0.9.28 ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5 PCI: setting IRQ 5 as level-triggered ACPI: PCI Interrupt :00:10.0[A] -> Link [LNKD] -> GSI 5 (level, low) -> IRQ 5 eth1: RealTek RTL8139 at 0x8800, 00:40:05:3a:15:c4, IRQ 5 eth1: Identified 8139 chip type 'RTL-8100B/8139D' via686a :00:04.4: base address not set - upgrade BIOS or use force_addr=0xaddr Adding 390560k swap on /dev/sdc1. Priority:-1 extents:1 across:390560k XFS mounting filesystem sdc3 Ending clean XFS mount for filesystem: sdc3 XFS mounting filesystem sdc4 Ending
Re: recent nfs change causes autofs regression
On Fri, 31 Aug 2007, Frank van Maarseveen wrote: > On Thu, Aug 30, 2007 at 02:07:43PM -0700, Hua Zhong wrote: > > I am re-sending this after help from Ian and git-bisect. To me it's a > > show-stopper: I cannot find an acceptable workaround that I can implement. > > > > The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs > > mounts to fail silently - they just not appear when they should. > > > > I believe it's caused by the NFS change that forces multiple mounts from > > different directories under the same server side filesystem to have the same > > mount options by default, otherwise it returns EBUSY. > > > > For example, if server has a filesystem /a, and it exports /a/x and /a/y > > (maybe with rw or ro), and a client must mount /a/x and /a/y with the same > > mount options now. > > > > Since in my setup they are managed by autofs, and the autofs map is managed > > by nis, there is no way I could easily workaround it.. > > > > If we have to live with this regression, I want to hear some suggestions > > about how to fix them realistically. Thanks. > > > > By the way, I am not sure if I did the bisect right, but FWIW, git-bisect > > says: > > > > c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit > > commit c98451bdb2f3e6d6cc1e03adad641e9497512b49 > > Author: Frank van Maarseveen <[EMAIL PROTECTED]> > > Date: Mon Jul 9 22:25:29 2007 +0200 > > > > NLM: fix source address of callback to client > > > > Use the destination address of the original NLM request as the > > source address in callbacks to the client. > > > > Signed-off-by: Frank van Maarseveen <[EMAIL PROTECTED]> > > Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]> > > > > :04 04 675c84bd8b2c50744018becaa0db4aeca19b8f9f > > 105fbd3cb3fa5e3019836b4b5268125d0181a72d M fs > > :04 04 0138796e0806b4ebd1cc3850ed4e8c7ab24d2d41 > > 2fec08debe51c20423a88b1a0d4281c683ba5daf M include > > This does not have any relation with the mount problem, assuming commit > and comment do match. 
That's right. The commits we're discussing here are (I believe):

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=75180df2ed467866ada839fe73cf7cc7d75c0a22
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=275a5d24bf56b2d9dd4644c54a56366b89a028f1

The latter being the one returning EBUSY for the option mismatch and the former the addition of the "nosharecache" option.

Ian
[PATCH] H8/300: Fix misnamed "CONFIG_BLKDEV_RESERVE_ADDRESS" Kconfig variable.
Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>

---

if i read correctly an email i just got from Yoshinori Sato, he wanted
me to post this to the main list. it seems an obvious enough error
that it can probably be pushed to the main tree fairly soon, unless
i've messed something up here.

diff --git a/arch/h8300/Kconfig.debug b/arch/h8300/Kconfig.debug
index 554efe6..996d97e 100644
--- a/arch/h8300/Kconfig.debug
+++ b/arch/h8300/Kconfig.debug
@@ -59,7 +59,7 @@ config BLKDEV_RESERVE
 	help
 	  Reserved BLKDEV area.
 
-config CONFIG_BLKDEV_RESERVE_ADDRESS
+config BLKDEV_RESERVE_ADDRESS
 	hex 'start address'
 	depends on BLKDEV_RESERVE
 	help

--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA
http://crashcourse.ca
Re: possible build system oddity
On 8/31/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> I only touched sound/usb/usbaudio.c
> Nevertheless the whole subtree under sound/ is recompiling. What's
> happening?

Not here. Do make V=2, it should say why it's recompiling.
Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver
Jesper Juhl <[EMAIL PROTECTED]> wrote:
> On 30/08/2007, Daniel Drake <[EMAIL PROTECTED]> wrote:
>> Jesper Juhl wrote:
>> > Since kmalloc() returns a void pointer there is no reason to cast
>> > its return value.
>> > This patch also removes a pointless initialization of a variable.
>>
>> NAK: adds a sparse warning
>> zd_chip.c:116:15: warning: implicit cast to nocast type
>>
> Ok, I must admit I didn't check with sparse since it seemed pointless
> - we usually never cast void pointers to other pointer types,
> specifically because the C language nicely guarantees that the right
> thing will happen without the cast. Sometimes we have to cast them to
> integer types, so sure we need the cast there. But what on earth
> makes a "zd_addr_t *" so special that we have to explicitly cast a
> "void *" to that type?
>
> I see it's defined as
> typedef u32 __nocast zd_addr_t;
> in drivers/net/wireless/zd1211rw/zd_types.h , but why the __nocast ?

Nevermind the __nocast, this looks like a bug in sparse. Just because
a base type is __nocast, sparse shouldn't infer that a pointer to it
should also be __nocast.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
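For readers following the cast argument: in C a void * converts implicitly to any object pointer type, which is why the cast on kmalloc()'s return value is normally redundant. Below is a minimal userspace sketch of that rule. It assumes malloc() in place of kmalloc(), uint32_t in place of the kernel's u32, and an empty __nocast define, since the annotation only means something to sparse; alloc_addrs() is a hypothetical helper, not driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* sparse-only annotation; an ordinary compiler sees nothing here */
#define __nocast

/* stand-in for the driver's typedef; uint32_t replaces the kernel's u32 */
typedef uint32_t __nocast zd_addr_t;

/* hypothetical helper: allocate n addresses, counting up from 'first' */
static zd_addr_t *alloc_addrs(size_t n, zd_addr_t first)
{
	/* no cast needed: void * converts implicitly to zd_addr_t * in C */
	zd_addr_t *a = malloc(n * sizeof(*a));
	size_t i;

	if (!a)
		return NULL;
	for (i = 0; i < n; i++)
		a[i] = first + (zd_addr_t)i;
	return a;
}
```

A normal compiler accepts the uncast assignment; whether sparse should warn about it for a pointer to a __nocast base type is exactly the open question in this thread.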
Re: [ANNOUNCE/RFC] Really Fair Scheduler
On Fri, 2007-08-31 at 04:05 +0200, Roman Zippel wrote: > Hi, Greetings, > I'm glad to announce a working prototype of the basic algorithm I > already suggested last time. (finding it difficult to resist the urge to go shopping, I fast-forwarded to test drive... grep shopping arch/i386/kernel/tcs.c if you're curious;) I plunked it into 2.6.23-rc4 to see how it reacts to various sleeper loads, and hit some starvation. If I got it in right (think so) there's a bug lurking somewhere. taskset -c 1 fairtest2 resulted in the below. It starts up running both tasks at about 60/40 for hog/sleeper, then after a short while goes nuts. The hog component eats 100% cpu and starves the sleeper (and events, forever). runnable tasks: task PIDtree-key delta waiting switches priosum-execsum-wait sum-sleepwait-overrun wait-underrun -- events/1 8 13979193020350 -3984516524180 541606276813 2014 115 0 0 0 0 0 R fairtest2 7984 10027571241955 -7942765479096 5707836335486 294 120 0 0 0 0 0 fairtest2 7989 13539381091732 -4424328443109 8147144513458 286 120 0 0 0 0 0 taskset -c 1 massive_intr 3 produces much saner looking numbers, and is fair... 
runnable tasks: task PIDtree-key delta waiting switches priosum-execsum-wait sum-sleepwait-overrun wait-underrun -- massive_intr 12808 29762662234003 21699 -506538 4650 120 0 0 0 0 0 R massive_intr 12809 29762662225939 -687 53406 4633 120 0 0 0 0 0 massive_intr 12810 29762662220183 7879453097 4619 120 0 0 0 0 0 // gcc -O2 -o fiftyp fiftyp.c -lrt // code from interbench.c #include #include #include #include #include #include #include int forks=1; int runus,sleepus=7000; unsigned long loops_per_ms; void terminal_error(const char *name) { fprintf(stderr, "\n"); perror(name); exit (1); } unsigned long long get_nsecs(struct timespec *myts) { if (clock_gettime(CLOCK_REALTIME, myts)) terminal_error("clock_gettime"); return (myts->tv_sec * 10 + myts->tv_nsec ); } void burn_loops(unsigned long loops) { unsigned long i; /* * We need some magic here to prevent the compiler from optimising * this loop away. Otherwise trying to emulate a fixed cpu load * with this loop will not work. */ for (i = 0 ; i < loops ; i++) asm volatile("" : : : "memory"); } /* Use this many usecs of cpu time */ void burn_usecs(unsigned long usecs) { unsigned long ms_loops; ms_loops = loops_per_ms / 1000 * usecs; burn_loops(ms_loops); } void microsleep(unsigned long long usecs) { struct timespec req, rem; rem.tv_sec = rem.tv_nsec = 0; req.tv_sec = usecs / 100; req.tv_nsec = (usecs - (req.tv_sec * 100)) * 1000; continue_sleep: if ((nanosleep(&req, &rem)) == -1) { if (errno == EINTR) { if (rem.tv_sec || rem.tv_nsec) { req.tv_sec = rem.tv_sec; req.tv_nsec = rem.tv_nsec; goto continue_sleep; } goto out; } terminal_error("nanosleep"); } out: return; } /* * In an unoptimised loop we try to benchmark how many meaningless loops * per second we can perform on this hardware to fairly accurately * reproduce certain percentage cpu usage */ void calibrate_loop(void) { unsigned long long start_time, loops_per_msec, run_time = 0; unsigned long loops; struct timespec myts; loops_per_msec = 100; redo: /* Calibrate to 
within 1% accuracy */ while (run_time > 101 || run_time < 99) { loops = loops_per_msec; start_time = get_nsecs(&myts); burn_loops(loops); run_time = get_nsecs(&myts) - start_time; loops_per_msec = (100 * loops_per_msec / run_time ? : loops_per_msec); } /* Rechecking after a pause increases reproducibility */ sleep(1); loops = loops_per_msec; start_time = get_nsecs(&myts); burn_loops(loops); run_time = get_nsecs(&myts) - start_time; /* Tolerate 5% difference on checking */ if (run_time > 105 || run_time < 95) goto redo; loops_per_ms=loops_per_msec; sleep(1); start_time=get_nsecs(&myts); microsleep(sleepus); run_time=get_nsecs(&myts)-start_time; runus=run_time/
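A note on the microsleep() helper in the test program above: the divisors in the archived copy (usecs / 100, tv_sec * 100) look truncated, and the conversion presumably intended one million microseconds per second. The following is a minimal sketch of that conversion under that assumption; usecs_to_timespec() and the two constants are names made up for this illustration and are not taken from interbench.

```c
#include <assert.h>
#include <time.h>

#define SKETCH_USEC_PER_SEC   1000000ULL  /* microseconds in one second */
#define SKETCH_NSEC_PER_USEC  1000ULL     /* nanoseconds in one microsecond */

/* hypothetical helper: split a microsecond count into seconds + nanoseconds */
static struct timespec usecs_to_timespec(unsigned long long usecs)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(usecs / SKETCH_USEC_PER_SEC);
	ts.tv_nsec = (long)((usecs % SKETCH_USEC_PER_SEC) * SKETCH_NSEC_PER_USEC);
	return ts;
}
```

The result can be handed to nanosleep() as the request argument; the remainder is always below one second, so tv_nsec stays in its valid range.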
Re: [PATCH take #6] [libata] libata driver for bf548 on chip ATAPI controller.
Sonic Zhang wrote:
Fix all issues pointed out in Jeff's email.

Acked-by: Alan Cox <[EMAIL PROTECTED]>
Signed-off-by: Sonic Zhang <[EMAIL PROTECTED]>
---
 drivers/ata/Kconfig      |   16
 drivers/ata/Makefile     |    1
 drivers/ata/pata_bf54x.c | 1627 +++
 3 files changed, 1644 insertions(+)

applied
Re: possible build system oddity
On Fri, Aug 31, 2007 at 01:14:26PM +0400, Alexey Dobriyan wrote:
> On 8/31/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> > I only touched sound/usb/usbaudio.c
> > Nevertheless the whole subtree under sound/ is recompiling. What's
> > happening?
>
> Not here. Do make V=2, it should say why it's recompiling.

And if the problem persists then do a make V=1 and post the output here.
This information combined should tell what goes wrong.

Sam
Re: possible build system oddity
On Friday 31 August 2007, Sam Ravnborg wrote:
> On Fri, Aug 31, 2007 at 01:14:26PM +0400, Alexey Dobriyan wrote:
> > On 8/31/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> > > I only touched sound/usb/usbaudio.c
> > > Nevertheless the whole subtree under sound/ is recompiling. What's
> > > happening?
> >
> > Not here. Do make V=2, it should say why it's recompiling.
>
> And if the problem persists then do a make V=1 and post the output here.
> This information combined should tell what goes wrong.

Not reproducible. Sorry.

Regards
Oliver
[PATCH] CONFIG_ZONE_MOVABLE [0/2] introduction
This is a rebased version of the patches making ZONE_MOVABLE configurable.

[1/2] clean up. This changes the ZONE_xxx definitions and helps avoid excessive use of CONFIG_ZONE_xxx.
[2/2] make ZONE_MOVABLE configurable.

This patch set is against 2.6.23-rc3-mm1, tested on i386/UP and ia64/NUMA.

If I should wait for a while (until vm has settled down), I'll wait.

Regards,
-Kame
[PATCH] CONFIG_ZONE_MOVABLE [1/2] zone ifdef cleanup by renumbering
zone_ifdef_cleanup_by_renumbering.patch Now, this patch defines zone_idx for not-configured-zones. like enum_zone_type { (ZONE_DMA configured) (ZONE_DMA32 configured) ZONE_NORMAL (ZONE_HIGHMEM configured) ZONE_MOVABLE MAX_NR_ZONES, (ZONE_DMA not-configured) (ZONE_DMA32 not-configured) (ZONE_HIGHMEM not-configured) }; By this, we can determine zone is configured or not by zone_idx < MAX_NR_ZONES. We can avoid #ifdef for CONFIG_ZONE_xxx to some extent. This patch also replaces CONFIG_ZONE_DMA_FLAG by is_configured_zone(ZONE_DMA). Changelog: v2 -> v3 - updated against 2.6.23-rc3-mm1. Changelog: v1 -> v2 - rebased to 2.6.23-rc1 - Removed MAX_POSSIBLE_ZONES - Added comments Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> include/linux/gfp.h| 18 --- include/linux/mmzone.h | 79 ++--- include/linux/vmstat.h | 24 +++--- mm/Kconfig |5 --- mm/page-writeback.c|7 +--- mm/page_alloc.c| 33 mm/slab.c |4 +- 7 files changed, 87 insertions(+), 83 deletions(-) Index: devel-2.6.23-rc3-mm1/include/linux/mmzone.h === --- devel-2.6.23-rc3-mm1.orig/include/linux/mmzone.h +++ devel-2.6.23-rc3-mm1/include/linux/mmzone.h @@ -178,10 +178,36 @@ enum zone_type { ZONE_HIGHMEM, #endif ZONE_MOVABLE, - MAX_NR_ZONES + MAX_NR_ZONES, +#ifndef CONFIG_ZONE_DMA + ZONE_DMA, +#endif +#ifndef CONFIG_ZONE_DMA32 + ZONE_DMA32, +#endif +#ifndef CONFIG_HIGHMEM + ZONE_HIGHMEM, +#endif }; /* + * Test zone type is configured or not. + * You can use this function for avoiding #ifdef. + * + * #ifdef CONFIG_ZONE_DMA + * do_something... + * #endif + * can be written as + * if (is_configured_zone(ZONE_DMA)) { + * do_something.. + * } + */ +static inline int is_configured_zone(enum zone_type zoneidx) +{ + return (zoneidx < MAX_NR_ZONES); +} + +/* * When a memory allocation must conform to specific limitations (such * as being suitable for DMA) the caller will pass in hints to the * allocator in the gfp_mask, in the zone modifier bits. 
These bits @@ -573,28 +599,35 @@ static inline int populated_zone(struct extern int movable_zone; -static inline int zone_movable_is_highmem(void) +/* + * Check zone is configured && specified "idx" is equal to target zone type. + * Zone's index is calucalted by above zone_idx(). + */ +static inline int zone_idx_is(enum zone_type idx, enum zone_type target) { -#if defined(CONFIG_HIGHMEM) && defined(CONFIG_ARCH_POPULATES_NODE_MAP) - return movable_zone == ZONE_HIGHMEM; -#else + if (is_configured_zone(target) && (idx == target)) + return 1; return 0; +} + +static inline int zone_movable_is_highmem(void) +{ +#if CONFIG_ARCH_POPULATES_NODE_MAP + if (is_configured_zone(ZONE_HIGHMEM)) + return movable_zone == ZONE_HIGHMEM; #endif + return 0; } static inline int is_highmem_idx(enum zone_type idx) { -#ifdef CONFIG_HIGHMEM - return (idx == ZONE_HIGHMEM || - (idx == ZONE_MOVABLE && zone_movable_is_highmem())); -#else - return 0; -#endif + return (zone_idx_is(idx, ZONE_HIGHMEM) || + (zone_idx_is(idx, ZONE_MOVABLE) && zone_movable_is_highmem())); } static inline int is_normal_idx(enum zone_type idx) { - return (idx == ZONE_NORMAL); + return zone_idx_is(idx, ZONE_NORMAL); } /** @@ -605,36 +638,22 @@ static inline int is_normal_idx(enum zon */ static inline int is_highmem(struct zone *zone) { -#ifdef CONFIG_HIGHMEM - int zone_idx = zone - zone->zone_pgdat->node_zones; - return zone_idx == ZONE_HIGHMEM || - (zone_idx == ZONE_MOVABLE && zone_movable_is_highmem()); -#else - return 0; -#endif + return is_highmem_idx(zone_idx(zone)); } static inline int is_normal(struct zone *zone) { - return zone == zone->zone_pgdat->node_zones + ZONE_NORMAL; + return zone_idx_is(zone_idx(zone), ZONE_NORMAL); } static inline int is_dma32(struct zone *zone) { -#ifdef CONFIG_ZONE_DMA32 - return zone == zone->zone_pgdat->node_zones + ZONE_DMA32; -#else - return 0; -#endif + return zone_idx_is(zone_idx(zone), ZONE_DMA32); } static inline int is_dma(struct zone *zone) { -#ifdef CONFIG_ZONE_DMA - return 
zone == zone->zone_pgdat->node_zones + ZONE_DMA; -#else - return 0; -#endif + return zone_idx_is(zone_idx(zone), ZONE_DMA); } /* These two functions are used to setup the per zone pages min values */ Index: devel-2.6.23-rc3-mm1/include/linux/vmstat.h === --- devel-2.6.23-rc3-m
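The renumbering idea in this patch can be tried outside the kernel. What follows is a minimal sketch, not the patch itself: the SKETCH_HAVE_* switches are hypothetical stand-ins for the CONFIG_ZONE_xxx options, and only three zones are modelled. Zones that are not configured still get an enum name, but their value lands after MAX_NR_ZONES, so one comparison replaces an #ifdef.

```c
#include <assert.h>

/* Hypothetical configuration for this sketch: HIGHMEM built in, DMA32 not. */
#define SKETCH_HAVE_HIGHMEM 1
/* SKETCH_HAVE_DMA32 is intentionally left undefined */

enum zone_type {
#ifdef SKETCH_HAVE_DMA32
	ZONE_DMA32,
#endif
	ZONE_NORMAL,
#ifdef SKETCH_HAVE_HIGHMEM
	ZONE_HIGHMEM,
#endif
	MAX_NR_ZONES,
	/* unconfigured zones still get a name, with an index above MAX_NR_ZONES */
#ifndef SKETCH_HAVE_DMA32
	ZONE_DMA32,
#endif
#ifndef SKETCH_HAVE_HIGHMEM
	ZONE_HIGHMEM,
#endif
};

/* a zone is configured exactly when its index is a valid zone index */
static inline int is_configured_zone(enum zone_type idx)
{
	return idx < MAX_NR_ZONES;
}

/* true only if 'target' is configured and 'idx' names that zone */
static inline int zone_idx_is(enum zone_type idx, enum zone_type target)
{
	return is_configured_zone(target) && idx == target;
}
```

Because is_configured_zone() is called with an enum constant, the compiler can fold the test at build time, which is presumably why the patch expects no runtime cost compared with the #ifdef version.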
[PATCH] CONFIG_ZONE_MOVABLE [2/2] config zone movable
Makes ZONE_MOVABLE as configurable Based on "zone_ifdef_cleanup_by_renumbering.patch" Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.23-rc3-mm1/include/linux/mmzone.h === --- devel-2.6.23-rc3-mm1.orig/include/linux/mmzone.h +++ devel-2.6.23-rc3-mm1/include/linux/mmzone.h @@ -177,7 +177,9 @@ enum zone_type { */ ZONE_HIGHMEM, #endif +#ifdef CONFIG_ZONE_MOVABLE ZONE_MOVABLE, +#endif MAX_NR_ZONES, #ifndef CONFIG_ZONE_DMA ZONE_DMA, @@ -188,6 +190,9 @@ enum zone_type { #ifndef CONFIG_HIGHMEM ZONE_HIGHMEM, #endif +#ifndef CONFIG_ZONE_MOVABLE + ZONE_MOVABLE, +#endif }; /* @@ -612,11 +617,13 @@ static inline int zone_idx_is(enum zone_ static inline int zone_movable_is_highmem(void) { -#if CONFIG_ARCH_POPULATES_NODE_MAP - if (is_configured_zone(ZONE_HIGHMEM)) - return movable_zone == ZONE_HIGHMEM; -#endif +#ifdef CONFIG_ARCH_POPULATES_NODE_MAP + return is_configured_zone(ZONE_HIGHMEM) && + is_configured_zone(ZONE_MOVABLE) && + (movable_zone == ZONE_HIGHMEM); +#else return 0; +#endif } static inline int is_highmem_idx(enum zone_type idx) Index: devel-2.6.23-rc3-mm1/include/linux/gfp.h === --- devel-2.6.23-rc3-mm1.orig/include/linux/gfp.h +++ devel-2.6.23-rc3-mm1/include/linux/gfp.h @@ -130,7 +130,8 @@ static inline enum zone_type gfp_zone(gf if (is_configured_zone(ZONE_DMA32) && (flags & __GFP_DMA32)) return base + ZONE_DMA32; - if ((flags & (__GFP_HIGHMEM | __GFP_MOVABLE)) == + if (is_configured_zone(ZONE_MOVABLE) && + (flags & (__GFP_HIGHMEM | __GFP_MOVABLE)) == (__GFP_HIGHMEM | __GFP_MOVABLE)) return base + ZONE_MOVABLE; Index: devel-2.6.23-rc3-mm1/mm/Kconfig === --- devel-2.6.23-rc3-mm1.orig/mm/Kconfig +++ devel-2.6.23-rc3-mm1/mm/Kconfig @@ -125,6 +125,18 @@ config SPARSEMEM_VMEMMAP depends on SPARSEMEM default y if (SPARSEMEM_VMEMMAP_ENABLE) + +config ZONE_MOVABLE + bool"Zone for movable pages" + depends on ARCH_POPULATES_NODE_MAP + help + Allows creating a zone type only for movable pages, e.g. page cache + and anonymous memory. 
Because movable pages are easily reclaimed + and page migration technique can move them, your chance for allocating + contiguous memory such as huge pages will be better than other zones. + To use this zone, please see "kernelcore=" or "movablecore=" in + Documentation/kernel-parameters.txt + # eventually, we can have this option just 'select SPARSEMEM' config MEMORY_HOTPLUG bool "Allow for memory hot-add" Index: devel-2.6.23-rc3-mm1/mm/page_alloc.c === --- devel-2.6.23-rc3-mm1.orig/mm/page_alloc.c +++ devel-2.6.23-rc3-mm1/mm/page_alloc.c @@ -95,7 +95,9 @@ int sysctl_lowmem_reserve_ratio[MAX_NR_Z #ifdef CONFIG_HIGHMEM 32, #endif +#ifdef CONFIG_ZONE_MOVABLE 32, +#endif }; EXPORT_SYMBOL(totalram_pages); @@ -4000,6 +4002,10 @@ static int __init cmdline_parse_core(cha if (!p) return -EINVAL; + if (!is_configured_zone(ZONE_MOVABLE)) { + printk ("ZONE_MOVABLE is not configured, %s is ignored.\n",p); + return 0; + } coremem = memparse(p, &p); *core = coremem >> PAGE_SHIFT; Index: devel-2.6.23-rc3-mm1/mm/vmstat.c === --- devel-2.6.23-rc3-mm1.orig/mm/vmstat.c +++ devel-2.6.23-rc3-mm1/mm/vmstat.c @@ -678,8 +678,14 @@ const struct seq_operations pagetypeinfo #define TEXT_FOR_HIGHMEM(xx) #endif +#ifdef CONFIG_ZONE_MOVABLE +#define TEXT_FOR_MOVABLE(xx) xx "_movable", +#else +#define TEXT_FOR_MOVABLE(xx) +#endif + #define TEXTS_FOR_ZONES(xx) TEXT_FOR_DMA(xx) TEXT_FOR_DMA32(xx) xx "_normal", \ - TEXT_FOR_HIGHMEM(xx) xx "_movable", + TEXT_FOR_HIGHMEM(xx) xx TEXT_FOR_MOVABLE(xx) static const char * const vmstat_text[] = { /* Zoned VM counters */ Index: devel-2.6.23-rc3-mm1/include/linux/vmstat.h === --- devel-2.6.23-rc3-mm1.orig/include/linux/vmstat.h +++ devel-2.6.23-rc3-mm1/include/linux/vmstat.h @@ -25,7 +25,14 @@ #define HIGHMEM_ZONE(xx) #endif -#define FOR_ALL_ZONES(xx) DMA_ZONE(xx) DMA32_ZONE(xx) xx##_NORMAL HIGHMEM_ZONE(xx) , xx##_MOVABLE +#ifdef CONFIG_ZONE_MOVABLE +#define MOVABLE_ZONE(xx) , xx##_MOVABLE +#else +#define MOVABLE_ZONE(xx) +#endif + + +#define 
FOR_ALL_ZONES(xx) DMA_ZONE(xx) DMA32_ZONE(xx) xx##_NORMAL HIGHMEM_ZONE(xx) MOVABLE_ZONE(xx) enum vm_event_item { P
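The vmstat.c hunk above builds its string table from per-zone macros that expand to nothing when a zone is not configured. Here is a minimal sketch of that pattern, assuming a hypothetical SKETCH_HAVE_MOVABLE switch and a two-zone table; none of the sketch_ names come from the kernel. One observation, offered tentatively: as archived, the posted TEXTS_FOR_ZONES line reads "TEXT_FOR_HIGHMEM(xx) xx TEXT_FOR_MOVABLE(xx)", which appears to leave a stray xx in front of the new macro; the sketch uses the form without it.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_HAVE_MOVABLE 1	/* hypothetical switch, stands in for CONFIG_ZONE_MOVABLE */

#ifdef SKETCH_HAVE_MOVABLE
#define TEXT_FOR_MOVABLE(xx) xx "_movable",
#else
#define TEXT_FOR_MOVABLE(xx)	/* expands to nothing: no table entry */
#endif

/* one entry per configured zone; adjacent string literals are concatenated */
#define TEXTS_FOR_ZONES(xx) xx "_normal", TEXT_FOR_MOVABLE(xx)

static const char * const sketch_text[] = {
	TEXTS_FOR_ZONES("pgalloc")
	TEXTS_FOR_ZONES("pgscan")
};

#define SKETCH_NR_TEXT (sizeof(sketch_text) / sizeof(sketch_text[0]))
```

With the switch defined the table has four entries; removing the define shrinks it to two without touching the table itself.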
Re: [PATCH] kexec: reenable HPET before kexec
On Thu, 30 Aug 2007 12:04:33 -0600 [EMAIL PROTECTED] (Eric W. Biederman) wrote:
> I was assuming that CLOCK_EVT_MODE_SHUTDOWN just mapped
> to the shutdown method of the clock events or something else.
> But if shutdown means something different in this context we
> can certainly find a better place to hook into the device
> tree and call shutdown methods. Especially if that will
> make the code simpler.

Agree. I've got rid of timekeeping. Now I'm using the system device tree shutdown interface, as you suggested. I've added an HPET system device class with a shutdown method and an HPET device to sysdev.

> This is a design feature. machine_crash_shutdown is not really
> supposed to disable any hardware. There is a very minimal set that we
> haven't been able to figure out how to get the kernels initialization
> routines to deal with properly. Which is a temporary justification
> for not doing more now. If we can't find any way to make the
> initialization code more robust for the hpet we can revisit this.

So you suggest checking in the HPET init code whether HPET is present, even if HPET is disabled on the kernel command line or the APIC is disabled? And if HPET is present and the kernel is not going to use it, disabling HPET interrupts?

> Please remove the CONFIG_KEXEC. We need to do this on a reboot also
> so we don't confuse the BIOS. BIOS's frequently but not always
> can just reset the board to avoid complications like this, but if
> we need a shutdown method we need a shutdown method. The kexec
> case just exercises things more.

Removed CONFIG_KEXEC.

So here is a new version of the fix. Patch against 2.6.23-rc3.
Eric, review please.

Thanks.
Signed-off-by: Konstantin Baydarov <[EMAIL PROTECTED]> arch/i386/kernel/hpet.c | 48 1 file changed, 48 insertions(+) Index: linux-2.6.23-rc3/arch/i386/kernel/hpet.c === --- linux-2.6.23-rc3.orig/arch/i386/kernel/hpet.c +++ linux-2.6.23-rc3/arch/i386/kernel/hpet.c @@ -144,11 +144,31 @@ static void hpet_enable_int(void) { unsigned long cfg = hpet_readl(HPET_CFG); + if (hpet_legacy_int_enabled) + return; + cfg |= HPET_CFG_LEGACY; hpet_writel(cfg, HPET_CFG); hpet_legacy_int_enabled = 1; } +static void hpet_disable_int(void) +{ + unsigned long cfg; + + if (!hpet_legacy_int_enabled) + return; + + if (!is_hpet_capable()) + return; + + cfg = hpet_readl(HPET_CFG); + cfg &= ~HPET_CFG_LEGACY; + hpet_writel(cfg, HPET_CFG); + hpet_legacy_int_enabled = 0; + +} + static void hpet_set_mode(enum clock_event_mode mode, struct clock_event_device *evt) { @@ -551,3 +571,31 @@ irqreturn_t hpet_rtc_interrupt(int irq, return IRQ_HANDLED; } #endif + +static int hpet_shutdown(struct sys_device *dev) +{ + /* We need this to make PIT works in KEXECuted kernel */ + hpet_disable_int(); + + return 0; +} + +static struct sysdev_class hpet_sysdev_class = { + set_kset_name("hpet"), + .shutdown = hpet_shutdown, +}; + +static struct sys_device device_hpet = { + .id = 0, + .cls= &hpet_sysdev_class, +}; + +static int __init hpet_init_sysfs(void) +{ + int error = sysdev_class_register(&hpet_sysdev_class); + if (!error) + error = sysdev_register(&device_hpet); + return error; +} + +device_initcall(hpet_init_sysfs); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
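The enable/disable pair in this patch is guarded by a state flag, so calling either function twice touches the register only once. The following is a minimal userspace sketch of that pattern under stated assumptions: a plain variable stands in for the HPET configuration register, the bit value is chosen for illustration, and all sketch_ names are hypothetical. The is_hpet_capable() check from the patch is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CFG_LEGACY 0x2u	/* assumed bit value, for illustration only */

static uint32_t sketch_cfg_reg;		/* stands in for the HPET config register */
static int sketch_legacy_enabled;	/* mirrors hpet_legacy_int_enabled */
static unsigned int sketch_writes;	/* counts simulated register writes */

static void sketch_enable_int(void)
{
	if (sketch_legacy_enabled)
		return;			/* already on: skip the register write */
	sketch_cfg_reg |= SKETCH_CFG_LEGACY;
	sketch_writes++;
	sketch_legacy_enabled = 1;
}

static void sketch_disable_int(void)
{
	if (!sketch_legacy_enabled)
		return;			/* never enabled: nothing to undo */
	sketch_cfg_reg &= ~SKETCH_CFG_LEGACY;
	sketch_writes++;
	sketch_legacy_enabled = 0;
}
```

The early return in the disable path is what makes it safe to call from a shutdown hook unconditionally, including on systems where legacy routing was never switched on.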
Re: possible build system oddity
On Aug 31 2007 11:51, Oliver Neukum wrote:
>On Friday 31 August 2007, Sam Ravnborg wrote:
>> On Fri, Aug 31, 2007 at 01:14:26PM +0400, Alexey Dobriyan wrote:
>> > On 8/31/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
>> > > I only touched sound/usb/usbaudio.c
>> > > Nevertheless the whole subtree under sound/ is recompiling. What's
>> > > happening?
>> >
>> > Not here. Do make V=2, it should say why it's recompiling.
>>
>> And if the problem persists then do a make V=1 and post output here.
>> This information combined should tell what goes wrong.
>
>Not reproducible. Sorry.

NFS/etc. involved?

	Jan
Re: [1/4] 2.6.23-rc4: known regressions
On Fri, Aug 31, 2007 at 12:48:45AM -0700, Christoph Lameter wrote:
> Here is the fix for alpha:
>
> From [EMAIL PROTECTED] Thu Aug 30 14:13:57 2007
> Subject: SLUB: Force inlining for functions in slub_def.h
>
> Some compilers (especially older gcc releases) may skip inlining sometimes
> which will lead to link failures. Force the inlining of key functions in
> slub_def.h to avoid these issues.
>...

This also explains why it's an Alpha-only problem - the Alpha port pretty much diverges from the rest of the kernel regarding "inline" semantics and usage...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
DRM and/or X trouble (was Re: CFS review)
On 08/31/2007 08:46 AM, Tilman Sauerbeck wrote:

> On 08/29/2007 09:56 PM, Rene Herman wrote:
>
>>> With X server 1.3, I'm getting consistent crashes with two glxgear
>>> instances running. So, if you're getting any output, it's better than my
>>> situation.
>>
>> Before people focus on software rendering too much -- also with 1.3.0 (and
>> a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly crummy using
>> hardware rendering. While I can move the glxgears window itself, the actual
>> spinning wheels stay in the upper-left corner of the screen and the movement
>> leaves a non-repainting trace on the screen.
>
> This sounds like you're running an older version of Mesa.
> The bugfix went into Mesa 6.3 and 7.0.

I have Mesa 6.5.2 it seems (slackware-12.0 standard):

OpenGL renderer string: Mesa DRI G400 20061030 AGP 2x x86/MMX+/3DNow!+/SSE
OpenGL version string: 1.2 Mesa 6.5.2

The bit of the problem sketched above -- the gears just sitting there in the upper left corner of the screen and not moving alongside their window -- is fully reproducible. The bit below ... :

>> Running a second instance of glxgears in addition seems to make both
>> instances unkillable -- and when I just now forcefully killed X in this
>> situation (the spinning wheels were covering the upper left corner of all
>> my desktops) I got the below.
>>
>> [ two kernel BUGs ]

... isn't. This seems to (again) have been a race of sorts that I hit by accident since I haven't reproduced yet. Had the same type of "racyness" trouble with keyboard behaviour in this version of X earlier.

> Running two instances of glxgears and killing them works for me, too.
>
> I'm using xorg-server 1.3.0.0, Mesa 7.0.1 with the latest DRM bits from
> http://gitweb.freedesktop.org/?p=mesa/drm.git;a=summary

For me, everything standard slackware-12.0 (X.org 1.3.0) and kernel 2.6.22 DRM.

> I'm not running CFS though, but I guess the oops wasn't related to that.

I've noticed before the Matrox driver seems to get little attention/testing so maybe that's just it. A G550 is of course in graphics-time a Model T by now.

I'm rather decidedly not a graphics person so I don't care a lot but every time I try to do something fashionable (run Google Earth for example) I notice things are horribly, horribly broken.

X bugs I do not find very interesting (there's just too many) and the kernel bugs are requiring more time to reproduce than I have available. If the BUGs as posted aren't enough for a diagnosis, please consider the report withdrawn.

Rene.
Re: [ANNOUNCE/RFC] Really Fair Scheduler
* Roman Zippel <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm glad to announce a working prototype of the basic algorithm I
> already suggested last time. As I already tried to explain previously
> CFS has a considerable algorithmic and computational complexity. [...]

hey, thanks for working on this, it's appreciated! In terms of merging your stuff, your patch looks a bit large and because you did not tell us that you were coding in this area, you probably missed Peter Zijlstra's excellent CFS work:

  http://programming.kicks-ass.net/kernel-patches/sched-cfs/

The following portion of Peter's series does much of the core math changes of what your patch provides (and which makes up for most of the conceptual delta in your patch), on a step by step basis:

  sched-update_weight_inv.patch
  sched-se_load.patch
  sched-se_load-2.patch
  sched-64-wait_runtime.patch
  sched-scaled-limit.patch
  sched-wait_runtime-scale.patch
  sched-calc_delta.patch

So the most intrusive (math) aspects of your patch have been implemented already for CFS (almost a month ago), in a fine-grained way.

Peter's patches change the CFS calculations gradually over from 'normalized' to 'non-normalized' wait-runtime, to avoid the normalizing/denormalizing overhead and rounding error. Turn off sleeper fairness, remove the limit code and we should arrive at something quite close to the core math in your patch, with similar rounding properties and similar overhead/complexity. (there are some other small details in the math but this is the biggest item by far.) I find Peter's series very understandable and he outlined the math arguments in his replies to your review mails. (would be glad to re-open those discussions though if you still think there are disagreements.)

Peter's work fully solves the rounding corner-cases you described as:

> This model is far more accurate than CFS is and doesn't add an error
> over time, thus there are no more underflow/overflow anymore within
> the described limits.

( your characterisation errs in that it makes it appear to be a common
  problem, while in practice it's only a corner-case limited to extreme
  negative nice levels and even there it needs a very high rate of
  scheduling and an artificially constructed workload: several hundred
  thousand context switches per second with a yield-ing loop to be
  even measurable with unmodified CFS. So this is not a 2.6.23 issue at
  all - unless there's some testcase that proves the opposite. )

with Peter's queue there are no underflows/overflows either anymore in any synthetic corner-case we could come up with. Peter's queue works well but it's 2.6.24 material.

Non-normalized wait-runtime is simply a different unit (resulting in slightly higher context-switch performance); the principle and the end result do not change.

All in all, we don't disagree, this is an incremental improvement we are thinking about for 2.6.24. We do disagree with this being positioned as something fundamentally different though - it's just the same thing mathematically, expressed without a "/weight" divisor, resulting in no change in scheduling behavior. (except for a small shift of CPU utilization for a synthetic corner-case)

And if we handled that fundamental aspect via Peter's queue, all the remaining changes you did can be done (and considered and merged) evolutionarily instead of revolutionarily, on top of CFS - this should cut down on the size and the impact of your changes significantly!

So if you'd like to work with us on this and get items that make sense merged (which we'd very much like to see happen), could you please re-shape the rest of your changes and ideas (such as whether to use ready-queueing or a runqueue concept, which does look interesting) on top of Peter's queue, and please do it as a fine-grained, reviewable, mergeable series of patches, like Peter did. Thanks!

	Ingo
Re: possible build system oddity
On Friday 31 August 2007, Jan Engelhardt wrote:
> On Aug 31 2007 11:51, Oliver Neukum wrote:
> >On Friday 31 August 2007, Sam Ravnborg wrote:
> >> On Fri, Aug 31, 2007 at 01:14:26PM +0400, Alexey Dobriyan wrote:
> >> > On 8/31/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:
> >> > > I only touched sound/usb/usbaudio.c
> >> > > Nevertheless the whole subtree under sound/ is recompiling. What's
> >> > > happening?
> >> >
> >> > Not here. Do make V=2, it should say why it's recompiling.
> >>
> >> And if the problem persists then do a make V=1 and post output here.
> >> This information combined should tell what goes wrong.
> >
> >Not reproducible. Sorry.
>
> NFS/etc. involved?

No, exclusively on reiserfs.

	Regards
		Oliver
Re: [PATCH] IOC3: Program UART predividers.
Ralf Baechle wrote:
> The IOC3 driver's UART detection bits used to rely on the firmware
> setting the UART pre-divider in a way that's appropriate for the 8250
> driver which doesn't currently program this register. This happens to
> work for the console but not rarely for additional ports.
>
> While at it, also program the UART to RS-232 PIO mode; the UART might
> have been in mac-serial and/or DMA mode though that hasn't actually
> been observed in practice.
>
> Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

applied
Re: [PATCH] H8/300: Fix misnamed "CONFIG_BLKDEV_RESERVE_ADDRESS" Kconfig variable.
On Fri, 31 Aug 2007, Robert P. J. Day wrote:
>
> Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>

Acked-by: Satyam Sharma <[EMAIL PROTECTED]>

This is a bugfix, true.

>   if i read correctly an email i just got from Yoshinori Sato, he
> wanted me to post this to the main list. it seems an obvious enough
> error that it can probably be pushed to the main tree fairly soon,
> unless i've messed something up here.
>
>
> diff --git a/arch/h8300/Kconfig.debug b/arch/h8300/Kconfig.debug
> index 554efe6..996d97e 100644
> --- a/arch/h8300/Kconfig.debug
> +++ b/arch/h8300/Kconfig.debug
> @@ -59,7 +59,7 @@ config BLKDEV_RESERVE
>  	help
>  	  Reserved BLKDEV area.
>
> -config CONFIG_BLKDEV_RESERVE_ADDRESS
> +config BLKDEV_RESERVE_ADDRESS
>  	hex 'start address'
>  	depends on BLKDEV_RESERVE
>  	help
Re: [PATCH] snd_hda_intel for F/S T4210
At Thu, 30 Aug 2007 17:04:14 +0200,
I wrote:
>
> At Wed, 29 Aug 2007 17:34:19 +0200,
> Thomas Richter wrote:
> >
> > Hi folks,
> >
> > the patch below, to be applied to sound/pci/hda/patch_sigmatel.c fixes the
> > audio output on the Fujitsu/Siemens lifebook T4210 (and probably on others).
> > It is suitable for the kernel 2.6.23-rc4 (and probably others).
> >
> > Without the patch, audio fails and the hda driver fails to load with
> >
> > No available DAC for pin 0x0
> >
> > However, the indicated pin has no connections in first place and thus
> > should be ignored.
> > With the patch applied audio output works fine.
>
> The problem is that nid = 0 is used. So, your patch just hides
> another bug.

Just looking at the code, the suspicious part is the automatic addition of shared I/O pins. Does the patch below fix it?


Takashi

diff -r e7c9355af3ff sound/pci/hda/patch_sigmatel.c
--- a/sound/pci/hda/patch_sigmatel.c	Fri Aug 31 12:52:19 2007 +0200
+++ b/sound/pci/hda/patch_sigmatel.c	Fri Aug 31 13:05:22 2007 +0200
@@ -1479,7 +1479,8 @@ static int stac92xx_add_dyn_out_pins(str
 	case 3:
 		/* add line-in as side */
 		if (cfg->input_pins[AUTO_PIN_LINE] && num_dacs > 3) {
-			cfg->line_out_pins[3] = cfg->input_pins[AUTO_PIN_LINE];
+			cfg->line_out_pins[cfg->line_outs] =
+				cfg->input_pins[AUTO_PIN_LINE];
 			spec->line_switch = 1;
 			cfg->line_outs++;
 		}
@@ -1487,12 +1488,14 @@ static int stac92xx_add_dyn_out_pins(str
 	case 2:
 		/* add line-in as clfe and mic as side */
 		if (cfg->input_pins[AUTO_PIN_LINE] && num_dacs > 2) {
-			cfg->line_out_pins[2] = cfg->input_pins[AUTO_PIN_LINE];
+			cfg->line_out_pins[cfg->line_outs] =
+				cfg->input_pins[AUTO_PIN_LINE];
 			spec->line_switch = 1;
 			cfg->line_outs++;
 		}
 		if (cfg->input_pins[AUTO_PIN_MIC] && num_dacs > 3) {
-			cfg->line_out_pins[3] = cfg->input_pins[AUTO_PIN_MIC];
+			cfg->line_out_pins[cfg->line_outs] =
+				cfg->input_pins[AUTO_PIN_MIC];
 			spec->mic_switch = 1;
 			cfg->line_outs++;
 		}
@@ -1500,12 +1503,14 @@ static int stac92xx_add_dyn_out_pins(str
 	case 1:
 		/* add line-in as surr and mic as clfe */
 		if (cfg->input_pins[AUTO_PIN_LINE] && num_dacs > 1) {
-			cfg->line_out_pins[1] = cfg->input_pins[AUTO_PIN_LINE];
+			cfg->line_out_pins[cfg->line_outs] =
+				cfg->input_pins[AUTO_PIN_LINE];
 			spec->line_switch = 1;
 			cfg->line_outs++;
 		}
 		if (cfg->input_pins[AUTO_PIN_MIC] && num_dacs > 2) {
-			cfg->line_out_pins[2] = cfg->input_pins[AUTO_PIN_MIC];
+			cfg->line_out_pins[cfg->line_outs] =
+				cfg->input_pins[AUTO_PIN_MIC];
 			spec->mic_switch = 1;
 			cfg->line_outs++;
 		}
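To illustrate why indexing by cfg->line_outs rather than by a fixed slot number could matter, here is a small user-space sketch. It is not the driver code: struct sketch_cfg, the helper names and the NID values are hypothetical, and the scenario (line-in skipped, mic added) is only an assumption about how a gap holding NID 0 might arise, which would be consistent with the "No available DAC for pin 0x0" message quoted in this thread.

```c
#include <assert.h>

#define SKETCH_MAX_OUTS 4	/* hypothetical array size for this sketch */

struct sketch_cfg {
	int line_outs;					/* number of valid entries */
	unsigned int line_out_pins[SKETCH_MAX_OUTS];	/* pin NIDs, 0 = unset */
};

/* Pre-patch style: the slot is fixed by the case label. */
static void add_pin_fixed(struct sketch_cfg *cfg, int slot, unsigned int nid)
{
	cfg->line_out_pins[slot] = nid;
	cfg->line_outs++;
}

/* Patched style: always append at the first free slot. */
static void add_pin_append(struct sketch_cfg *cfg, unsigned int nid)
{
	cfg->line_out_pins[cfg->line_outs] = nid;
	cfg->line_outs++;
}

/* Returns 1 if any of the first line_outs entries still holds NID 0. */
static int has_unset_pin(const struct sketch_cfg *cfg)
{
	for (int i = 0; i < cfg->line_outs; i++)
		if (cfg->line_out_pins[i] == 0)
			return 1;
	return 0;
}
```

With two line-outs already present and only the mic being added, the fixed-slot variant writes slot 3 while line_outs becomes 3, so slot 2 is counted but never filled; the appending variant leaves no such hole.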
[RFC][PATCH 0/6] An IPC implementation based on Linux IDRs
A couple of months ago, I dropped a series of patches that introduced the AKT framework (Automatic Kernel Tunables).
(see thread http://lkml.org/lkml/2007/1/16/16)

When reading the patches, people complained that I was trying to treat a symptom, not the problem itself, and that it'd be better to try to rewrite things where necessary (better data structures / implementations).
(see http://lkml.org/lkml/2007/2/7/248)

I'm trying to achieve this for the IPC case, with this new series of patches: after proposing a radix tree based implementation at the beginning of July, I'm now proposing a new ipc implementation based on the Linux idr API.

In the current ipc implementation, the ipc structure pointers are stored in an array that is resized each time the corresponding tunable is changed (resized means that a new array is allocated with the new size, the old one is copied into that new array and the old array is de-allocated).

With this new implementation, there is no need for the array resizing since the size change becomes "natural".

With this completely dynamic implementation, it becomes possible to set the kernel default maximum values higher in order to prevent early DoS feedback, on configurations with a high amount of memory.
Even if the maximum values are huge, no memory space will be wasted, since the allocations are done "on demand". This is not the case with the existing array based implementation: an array of the maximum number of entries is allocated even if only a single ipc object is actually created.

These patches should be applied to 2.6.23-rc2, in the following order:

[PATCH 1/6]: ipc_idr.patch
[PATCH 2/6]: ipc_syscalls.patch
[PATCH 3/6]: ipc_get.patch
[PATCH 4/6]: ipc_lock_and_check.patch
[PATCH 5/6]: ipcid_to_idx.patch
[PATCH 6/6]: ipc_buildid.patch

--
Regards,
Nadia
[RFC][PATCH 2/6] Unifying the syscalls code
[PATCH 02/06] This patch introduces a change into the sys_msgget(), sys_semget() and sys_shmget() routines: they now share a common code, which is better for maintainability. Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]> --- ipc/msg.c | 61 ++-- ipc/sem.c | 78 ++- ipc/shm.c | 75 - ipc/util.c | 101 - ipc/util.h | 43 + 5 files changed, 217 insertions(+), 141 deletions(-) Index: linux-2.6.23-rc2/ipc/util.h === --- linux-2.6.23-rc2.orig/ipc/util.h2007-08-31 11:29:21.0 +0200 +++ linux-2.6.23-rc2/ipc/util.h 2007-08-31 11:38:50.0 +0200 @@ -35,6 +35,35 @@ struct ipc_ids { struct idr ipcs_idr; }; +/* + * Structure that holds the parameters needed by the ipc operations + * (see after) + */ +struct ipc_params { + key_t key; + int flg; + union { + size_t size;/* for shared memories */ + int nsems; /* for semaphores */ + } u;/* holds the getnew() specific param */ +}; + +/* + * Structure that holds some ipc operations. This structure is used to unify + * the calls to sys_msgget(), sys_semget(), sys_shmget() + * . routine to call to create a new ipc object. Can be one of newque, + *newary, newseg + * . routine to call to call to check permissions for a new ipc object. + *Can be one of security_msg_associate, security_sem_associate, + *security_shm_associate + * . 
routine to call for an extra check if needed + */ +struct ipc_ops { + int (*getnew) (struct ipc_namespace *, struct ipc_params *); + int (*associate) (void *, int); + int (*more_checks) (void *, struct ipc_params *); +}; + struct seq_file; void ipc_init_ids(struct ipc_ids *); @@ -50,7 +79,6 @@ void __init ipc_init_proc_interface(cons #define IPC_SHM_IDS2 /* must be called with ids->mutex acquired.*/ -struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key); int ipc_addid(struct ipc_ids *, struct kern_ipc_perm *, int); int ipc_get_maxid(struct ipc_ids *); @@ -95,5 +123,18 @@ int ipc_parse_version (int *cmd); extern void free_msg(struct msg_msg *msg); extern struct msg_msg *load_msg(const void __user *src, int len); extern int store_msg(void __user *dest, struct msg_msg *msg, int len); +extern int ipcget_new(struct ipc_namespace *, struct ipc_ids *, + struct ipc_ops *, struct ipc_params *); +extern int ipcget_public(struct ipc_namespace *, struct ipc_ids *, + struct ipc_ops *, struct ipc_params *); + +static inline int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids, + struct ipc_ops *ops, struct ipc_params *params) +{ + if (params->key == IPC_PRIVATE) + return ipcget_new(ns, ids, ops, params); + else + return ipcget_public(ns, ids, ops, params); +} #endif Index: linux-2.6.23-rc2/ipc/util.c === --- linux-2.6.23-rc2.orig/ipc/util.c2007-08-31 11:34:17.0 +0200 +++ linux-2.6.23-rc2/ipc/util.c 2007-08-31 11:40:03.0 +0200 @@ -197,7 +197,7 @@ void __init ipc_init_proc_interface(cons * If key is found ipc contains its ipc structure */ -struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key) +static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key) { struct kern_ipc_perm *ipc; int next_id; @@ -301,6 +301,105 @@ int ipc_addid(struct ipc_ids* ids, struc } /** + * ipcget_new - create a new ipc object + * @ns: namespace + * @ids: identifer set + * @ops: the actual creation routine to call + * @params: its parameters + * + * This routine is 
called sys_msgget, sys_semget() and sys_shmget() when + * the key is IPC_PRIVATE + */ +int ipcget_new(struct ipc_namespace *ns, struct ipc_ids *ids, + struct ipc_ops *ops, struct ipc_params *params) +{ + int err; + + err = idr_pre_get(&ids->ipcs_idr, GFP_KERNEL); + + if (!err) + return -ENOMEM; + + mutex_lock(&ids->mutex); + err = ops->getnew(ns, params); + mutex_unlock(&ids->mutex); + + return err; +} + +/** + * ipc_check_perms - check security and permissions for an IPC + * @ipcp: ipc permission set + * @ids: identifer set + * @ops: the actual security routine to call + * @params: its parameters + */ +static int ipc_check_perms(struct kern_ipc_perm *ipcp, struct ipc_ops *ops, + struct ipc_params *params) +{ + int err; + + if (ipcperms(ipcp, params->flg)) + err = -EACCES; + else { + err = ops->associate(ipcp,
[RFC][PATCH 4/6] Integrating ipc_checkid() into ipc_lock()
[PATCH 04/06] This patch introduces a new ipc_lock_check() routine interface: . each time ipc_checkid() is called, this is done after calling ipc_lock(). ipc_checkid() is now called from inside ipc_lock_check(). Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]> --- ipc/msg.c | 62 -- ipc/sem.c | 70 ++- ipc/shm.c | 90 - ipc/util.c | 23 +-- ipc/util.h | 40 --- 5 files changed, 149 insertions(+), 136 deletions(-) Index: linux-2.6.23-rc2/ipc/util.h === --- linux-2.6.23-rc2.orig/ipc/util.h2007-08-31 11:59:11.0 +0200 +++ linux-2.6.23-rc2/ipc/util.h 2007-08-31 12:26:31.0 +0200 @@ -103,11 +103,8 @@ void* ipc_rcu_alloc(int size); void ipc_rcu_getref(void *ptr); void ipc_rcu_putref(void *ptr); -struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id); -void ipc_lock_by_ptr(struct kern_ipc_perm *ipcp); -void ipc_unlock(struct kern_ipc_perm* perm); +struct kern_ipc_perm *ipc_lock(struct ipc_ids *, int); int ipc_buildid(struct ipc_ids* ids, int id, int seq); -int ipc_checkid(struct ipc_ids* ids, struct kern_ipc_perm* ipcp, int uid); void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out); void ipc64_perm_to_ipc_perm(struct ipc64_perm *in, struct ipc_perm *out); @@ -127,6 +124,41 @@ extern int ipcget_new(struct ipc_namespa extern int ipcget_public(struct ipc_namespace *, struct ipc_ids *, struct ipc_ops *, struct ipc_params *); +static inline int ipc_checkid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp, + int uid) +{ + if (uid / SEQ_MULTIPLIER != ipcp->seq) + return 1; + return 0; +} + +static inline void ipc_lock_by_ptr(struct kern_ipc_perm *perm) +{ + spin_lock(&perm->lock); +} + +static inline void ipc_unlock(struct kern_ipc_perm *perm) +{ + spin_unlock(&perm->lock); +} + +static inline struct kern_ipc_perm *ipc_lock_check(struct ipc_ids *ids, + int id) +{ + struct kern_ipc_perm *out; + + out = ipc_lock(ids, id); + if (IS_ERR(out)) + return out; + + if (ipc_checkid(ids, out, id)) { + spin_unlock(&out->lock); + return ERR_PTR(-EIDRM); + } + + return out; 
+} + static inline int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids, struct ipc_ops *ops, struct ipc_params *params) { Index: linux-2.6.23-rc2/ipc/util.c === --- linux-2.6.23-rc2.orig/ipc/util.c2007-08-31 11:59:46.0 +0200 +++ linux-2.6.23-rc2/ipc/util.c 2007-08-31 12:29:32.0 +0200 @@ -678,7 +678,7 @@ struct kern_ipc_perm *ipc_lock(struct ip out = idr_find(&ids->ipcs_idr, lid); if (out == NULL) { rcu_read_unlock(); - return NULL; + return ERR_PTR(-EINVAL); } spin_lock(&out->lock); @@ -689,36 +689,17 @@ struct kern_ipc_perm *ipc_lock(struct ip if (out->deleted) { spin_unlock(&out->lock); rcu_read_unlock(); - return NULL; + return ERR_PTR(-EINVAL); } return out; } -void ipc_lock_by_ptr(struct kern_ipc_perm *perm) -{ - rcu_read_lock(); - spin_lock(&perm->lock); -} - -void ipc_unlock(struct kern_ipc_perm* perm) -{ - spin_unlock(&perm->lock); - rcu_read_unlock(); -} - int ipc_buildid(struct ipc_ids* ids, int id, int seq) { return SEQ_MULTIPLIER*seq + id; } -int ipc_checkid(struct ipc_ids* ids, struct kern_ipc_perm* ipcp, int uid) -{ - if(uid/SEQ_MULTIPLIER != ipcp->seq) - return 1; - return 0; -} - #ifdef __ARCH_WANT_IPC_PARSE_VERSION Index: linux-2.6.23-rc2/ipc/msg.c === --- linux-2.6.23-rc2.orig/ipc/msg.c 2007-08-31 11:42:10.0 +0200 +++ linux-2.6.23-rc2/ipc/msg.c 2007-08-31 12:33:38.0 +0200 @@ -73,10 +73,7 @@ static struct ipc_ids init_msg_ids; #define msg_ids(ns)(*((ns)->ids[IPC_MSG_IDS])) -#define msg_lock(ns, id) ((struct msg_queue*)ipc_lock(&msg_ids(ns), id)) #define msg_unlock(msq)ipc_unlock(&(msq)->q_perm) -#define msg_checkid(ns, msq, msgid)\ - ipc_checkid(&msg_ids(ns), &msq->q_perm, msgid) #define msg_buildid(ns, id, seq) \ ipc_buildid(&msg_ids(ns), id, seq) @@ -139,6 +136,17 @@ void __init msg_init(void) IPC_MSG_IDS, sysvipc_msg_proc_show); } +static inline struct msg_queue *msg_lock(struct ipc_namespace *ns, int id) +{ + return (struct msg_queue *) ipc_lock(&msg_ids(ns), id); +} + +static inline st
[RFC][PATCH 3/6] Removing the ipc_get() routine
[PATCH 03/06]

This is a trivial patch that removes the ipc_get() routine: it is replaced by a call to idr_find().

Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]>

---
 ipc/shm.c  |   16 +---
 ipc/util.c |   19 ---
 ipc/util.h |    1 -
 3 files changed, 13 insertions(+), 23 deletions(-)

Index: linux-2.6.23-rc2/ipc/util.h
===
--- linux-2.6.23-rc2.orig/ipc/util.h	2007-08-31 11:38:50.0 +0200
+++ linux-2.6.23-rc2/ipc/util.h	2007-08-31 11:59:11.0 +0200
@@ -103,7 +103,6 @@ void* ipc_rcu_alloc(int size);
 void ipc_rcu_getref(void *ptr);
 void ipc_rcu_putref(void *ptr);
 
-struct kern_ipc_perm* ipc_get(struct ipc_ids* ids, int id);
 struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id);
 void ipc_lock_by_ptr(struct kern_ipc_perm *ipcp);
 void ipc_unlock(struct kern_ipc_perm* perm);
Index: linux-2.6.23-rc2/ipc/util.c
===
--- linux-2.6.23-rc2.orig/ipc/util.c	2007-08-31 11:40:03.0 +0200
+++ linux-2.6.23-rc2/ipc/util.c	2007-08-31 11:59:46.0 +0200
@@ -669,25 +669,6 @@ void ipc64_perm_to_ipc_perm (struct ipc6
 	out->seq	= in->seq;
 }
 
-/*
- * So far only shm_get_stat() calls ipc_get() via shm_get(), so ipc_get()
- * is called with shm_ids.mutex locked.  Since grow_ary() is also called with
- * shm_ids.mutex down(for Shared Memory), there is no need to add read
- * barriers here to gurantee the writes in grow_ary() are seen in order
- * here (for Alpha).
- *
- * However ipc_get() itself does not necessary require ipc_ids.mutex down. So
- * if in the future ipc_get() is used by other places without ipc_ids.mutex
- * down, then ipc_get() needs read memery barriers as ipc_lock() does.
- */
-struct kern_ipc_perm *ipc_get(struct ipc_ids *ids, int id)
-{
-	struct kern_ipc_perm *out;
-	int lid = id % SEQ_MULTIPLIER;
-	out = idr_find(&ids->ipcs_idr, lid);
-	return out;
-}
-
 struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
 {
 	struct kern_ipc_perm *out;
Index: linux-2.6.23-rc2/ipc/shm.c
===
--- linux-2.6.23-rc2.orig/ipc/shm.c	2007-08-31 11:45:57.0 +0200
+++ linux-2.6.23-rc2/ipc/shm.c	2007-08-31 12:13:27.0 +0200
@@ -63,8 +63,6 @@ static struct ipc_ids init_shm_ids;
 	((struct shmid_kernel*)ipc_lock(&shm_ids(ns),id))
 #define shm_unlock(shp)			\
 	ipc_unlock(&(shp)->shm_perm)
-#define shm_get(ns, id)			\
-	((struct shmid_kernel*)ipc_get(&shm_ids(ns),id))
 #define shm_buildid(ns, id, seq)	\
 	ipc_buildid(&shm_ids(ns), id, seq)
 
@@ -562,7 +560,19 @@ static void shm_get_stat(struct ipc_name
 		struct shmid_kernel *shp;
 		struct inode *inode;
 
-		shp = shm_get(ns, next_id);
+		/*
+		 * idr_find() is called via shm_get(), so with shm_ids.mutex
+		 * locked. Since ipc_addid() is also called with
+		 * shm_ids.mutex down, there is no need to add read barriers
+		 * here to guarantee the writes in ipc_addid() are seen in
+		 * order here (for Alpha).
+		 * However idr_find() itself does not necessarily require
+		 * ipc_ids.mutex down. So if idr_find() is used by other
+		 * places without ipc_ids.mutex down, then it needs
+		 * read memory barriers as ipc_lock() does.
+		 */
+
+		shp = idr_find(&shm_ids(ns).ipcs_idr, next_id);
 		if (shp == NULL)
 			continue;
 
--
[RFC][PATCH 6/6] Inlining ipc_buildid()
[PATCH 06/06]

This is a trivial patch that changes the ipc_buildid() routine into a static inline.

Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]>

---
 ipc/util.c |    5 -
 ipc/util.h |    6 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: linux-2.6.23-rc2/ipc/util.h
===
--- linux-2.6.23-rc2.orig/ipc/util.h	2007-08-31 12:53:30.0 +0200
+++ linux-2.6.23-rc2/ipc/util.h	2007-08-31 12:58:14.0 +0200
@@ -106,7 +106,6 @@ void ipc_rcu_getref(void *ptr);
 void ipc_rcu_putref(void *ptr);
 
 struct kern_ipc_perm *ipc_lock(struct ipc_ids *, int);
-int ipc_buildid(struct ipc_ids* ids, int id, int seq);
 
 void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out);
 void ipc64_perm_to_ipc_perm(struct ipc64_perm *in, struct ipc_perm *out);
@@ -126,6 +125,11 @@ extern int ipcget_new(struct ipc_namespa
 extern int ipcget_public(struct ipc_namespace *, struct ipc_ids *,
 			struct ipc_ops *, struct ipc_params *);
 
+static inline int ipc_buildid(struct ipc_ids *ids, int id, int seq)
+{
+	return SEQ_MULTIPLIER * seq + id;
+}
+
 static inline int ipc_checkid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp,
 				int uid)
 {
Index: linux-2.6.23-rc2/ipc/util.c
===
--- linux-2.6.23-rc2.orig/ipc/util.c	2007-08-31 12:54:39.0 +0200
+++ linux-2.6.23-rc2/ipc/util.c	2007-08-31 12:58:40.0 +0200
@@ -695,11 +695,6 @@ struct kern_ipc_perm *ipc_lock(struct ip
 	return out;
 }
 
-int ipc_buildid(struct ipc_ids* ids, int id, int seq)
-{
-	return SEQ_MULTIPLIER*seq + id;
-}
-
 #ifdef __ARCH_WANT_IPC_PARSE_VERSION
 
 
--
[RFC][PATCH 1/6] Storing ipcs into IDRs
[PATCH 01/06] This patch introduces ipcs storage into IDRs. The main changes are: . This ipc_ids structure is changed: the entries array is changed into a root idr structure. . The grow_ary() routine is removed: it is not needed anymore when adding an ipc structure, since we are now using the IDR facility. . The ipc_rmid() routine interface is changed: . there is no need for this routine to return the pointer passed in as argument: it is now declared as a void . since the id is now part of the kern_ipc_perm structure, no need to have it as an argument to the routine Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]> --- include/linux/ipc.h |1 include/linux/msg.h |1 include/linux/sem.h |1 include/linux/shm.h |1 ipc/msg.c | 113 +- ipc/sem.c | 111 - ipc/shm.c | 116 +- ipc/util.c | 264 +++- ipc/util.h | 32 +- 9 files changed, 329 insertions(+), 311 deletions(-) Index: linux-2.6.23-rc2/include/linux/ipc.h === --- linux-2.6.23-rc2.orig/include/linux/ipc.h 2007-08-31 07:49:01.0 +0200 +++ linux-2.6.23-rc2/include/linux/ipc.h2007-08-31 08:05:36.0 +0200 @@ -61,6 +61,7 @@ struct kern_ipc_perm { spinlock_t lock; int deleted; + int id; key_t key; uid_t uid; gid_t gid; Index: linux-2.6.23-rc2/include/linux/msg.h === --- linux-2.6.23-rc2.orig/include/linux/msg.h 2007-08-31 07:49:01.0 +0200 +++ linux-2.6.23-rc2/include/linux/msg.h2007-08-31 08:06:24.0 +0200 @@ -77,7 +77,6 @@ struct msg_msg { /* one msq_queue structure for each present queue on the system */ struct msg_queue { struct kern_ipc_perm q_perm; - int q_id; time_t q_stime; /* last msgsnd time */ time_t q_rtime; /* last msgrcv time */ time_t q_ctime; /* last change time */ Index: linux-2.6.23-rc2/include/linux/sem.h === --- linux-2.6.23-rc2.orig/include/linux/sem.h 2007-08-31 07:49:01.0 +0200 +++ linux-2.6.23-rc2/include/linux/sem.h2007-08-31 08:06:45.0 +0200 @@ -90,7 +90,6 @@ struct sem { /* One sem_array data structure for each set of semaphores in the system. 
*/ struct sem_array { struct kern_ipc_permsem_perm; /* permissions .. see ipc.h */ - int sem_id; time_t sem_otime; /* last semop time */ time_t sem_ctime; /* last change time */ struct sem *sem_base; /* ptr to first semaphore in array */ Index: linux-2.6.23-rc2/include/linux/shm.h === --- linux-2.6.23-rc2.orig/include/linux/shm.h 2007-08-31 07:49:01.0 +0200 +++ linux-2.6.23-rc2/include/linux/shm.h2007-08-31 08:07:07.0 +0200 @@ -77,7 +77,6 @@ struct shmid_kernel /* private to the ke { struct kern_ipc_permshm_perm; struct file * shm_file; - int id; unsigned long shm_nattch; unsigned long shm_segsz; time_t shm_atim; Index: linux-2.6.23-rc2/ipc/util.h === --- linux-2.6.23-rc2.orig/ipc/util.h2007-08-31 07:49:21.0 +0200 +++ linux-2.6.23-rc2/ipc/util.h 2007-08-31 11:29:21.0 +0200 @@ -10,6 +10,8 @@ #ifndef _IPC_UTIL_H #define _IPC_UTIL_H +#include + #define USHRT_MAX 0x #define SEQ_MULTIPLIER (IPCMNI) @@ -25,24 +27,17 @@ void sem_exit_ns(struct ipc_namespace *n void msg_exit_ns(struct ipc_namespace *ns); void shm_exit_ns(struct ipc_namespace *ns); -struct ipc_id_ary { - int size; - struct kern_ipc_perm *p[0]; -}; - struct ipc_ids { int in_use; - int max_id; unsigned short seq; unsigned short seq_max; struct mutex mutex; - struct ipc_id_ary nullentry; - struct ipc_id_ary* entries; + struct idr ipcs_idr; }; struct seq_file; -void ipc_init_ids(struct ipc_ids *ids, int size); +void ipc_init_ids(struct ipc_ids *); #ifdef CONFIG_PROC_FS void __init ipc_init_proc_interface(const char *path, const char *header, int ids, int (*show)(struct seq_file *, void *)); @@ -55,11 +50,12 @@ void __init ipc_init_proc_interface(cons #define IPC_SHM_IDS2 /* must be called with ids->mutex acquired.*/ -int ipc_findkey(struct ipc_ids* ids, key_t key); -int ipc_addid(struct ipc_ids* ids, struct ker
[RFC][PATCH 5/6] Introducing the ipcid_to_idx macro
[PATCH 05/06]

This is a trivial patch that changes all the (id % SEQ_MULTIPLIER) into a call
to the ipcid_to_idx(id) macro.

Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]>

---
 ipc/util.c |    4 ++--
 ipc/util.h |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6.23-rc2/ipc/util.h
===================================================================
--- linux-2.6.23-rc2.orig/ipc/util.h	2007-08-31 12:26:31.0 +0200
+++ linux-2.6.23-rc2/ipc/util.h	2007-08-31 12:53:30.0 +0200
@@ -78,6 +78,8 @@ void __init ipc_init_proc_interface(cons
 #define IPC_MSG_IDS	1
 #define IPC_SHM_IDS	2
 
+#define ipcid_to_idx(id) ((id) % SEQ_MULTIPLIER)
+
 /* must be called with ids->mutex acquired.*/
 int ipc_addid(struct ipc_ids *, struct kern_ipc_perm *, int);
 int ipc_get_maxid(struct ipc_ids *);
Index: linux-2.6.23-rc2/ipc/util.c
===================================================================
--- linux-2.6.23-rc2.orig/ipc/util.c	2007-08-31 12:29:32.0 +0200
+++ linux-2.6.23-rc2/ipc/util.c	2007-08-31 12:54:39.0 +0200
@@ -413,7 +413,7 @@ int ipcget_public(struct ipc_namespace *
 
 void ipc_rmid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)
 {
-	int lid = ipcp->id % SEQ_MULTIPLIER;
+	int lid = ipcid_to_idx(ipcp->id);
 
 	idr_remove(&ids->ipcs_idr, lid);
 
@@ -672,7 +672,7 @@ void ipc64_perm_to_ipc_perm (struct ipc6
 struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
 {
 	struct kern_ipc_perm *out;
-	int lid = id % SEQ_MULTIPLIER;
+	int lid = ipcid_to_idx(id);
 
 	rcu_read_lock();
 	out = idr_find(&ids->ipcs_idr, lid);

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
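As a rough illustration of what the macro computes, here is a small userspace sketch. It assumes SEQ_MULTIPLIER is 32768 (the patch only says it is defined as IPCMNI; the numeric value is an assumption here) and that a user-visible id is composed as seq * SEQ_MULTIPLIER + idx. build_id() is a hypothetical helper made up for this example, not something the patch adds.

```c
#include <assert.h>

/* Assumed value for illustration; the kernel defines SEQ_MULTIPLIER as IPCMNI. */
#define SEQ_MULTIPLIER 32768

/* Same definition as the macro introduced by the patch. */
#define ipcid_to_idx(id) ((id) % SEQ_MULTIPLIER)

/*
 * Hypothetical helper: compose a user-visible id from a sequence
 * number and an IDR index (assumed layout, see the text above).
 */
static int build_id(int seq, int idx)
{
	return SEQ_MULTIPLIER * seq + idx;
}
```

Under these assumptions, ipcid_to_idx(build_id(seq, idx)) gives back idx for any idx below SEQ_MULTIPLIER, which is the value ipc_rmid() and ipc_lock() pass to the IDR calls.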
Re: 4KSTACKS + DEBUG_STACKOVERFLOW harmful
On Wednesday 29 August 2007 23:34, Eric Sandeen wrote:
> Noticed today that the combination of 4KSTACKS and DEBUG_STACKOVERFLOW
> config options is a bit deadly.
>
> DEBUG_STACKOVERFLOW warns in do_IRQ if we're within THREAD_SIZE/8 of the
> end of useable stack space, or 512 bytes on a 4k stack.
...
> The large stack usage in those 2 functions is due to big char arrays, of
> size KSYM_NAME_LEN (128 bytes) and KSYM_SYMBOL_LEN (223 bytes).
>
> IOW, the stack warning effectively reduces useful stack left in our itty
> bitty 4k stacks by over 10%.

KSYM_NAME_LEN = 128 sounds stupid. A name that is wider than 80 chars??
The kernel shouldn't have names that long. Say, 50 chars ought to be enough.
--
vda
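The numbers quoted in the report can be checked with a few lines of C. This is only a sketch: it treats the whole 4096 bytes as usable stack and ignores struct thread_info, and the helper names are made up for illustration.

```c
#include <assert.h>

/* Distance from the end of the stack at which DEBUG_STACKOVERFLOW warns. */
static unsigned int stack_warn_margin(unsigned int thread_size)
{
	return thread_size / 8;
}

/* The same margin, expressed in tenths of a percent of the whole stack. */
static unsigned int margin_permille(unsigned int thread_size)
{
	return stack_warn_margin(thread_size) * 1000 / thread_size;
}

/* Combined size of the two symbol buffers mentioned in the report. */
static unsigned int ksym_buffers(void)
{
	return 128 + 223;	/* KSYM_NAME_LEN + KSYM_SYMBOL_LEN */
}
```

For a 4k stack the margin is 512 bytes, i.e. 12.5% of the stack, and the two buffers together take 351 bytes.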
Re: possible build system oddity
On Fri, Aug 31, 2007 at 10:37:28AM +0200, Oliver Neukum wrote:
> Hi,
>
> I only touched sound/usb/usbaudio.c
> Nevertheless the whole subtree under sound/ is recompiling. What's
> happening?

The only file that gets compiled (CC) should be sound/usb/usbaudio.c

It seems you are confused by the fact that this recompilation results in
several relinks (LD)?

> Regards
> Oliver

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: possible build system oddity
On Friday 31 August 2007 you wrote:
> On Fri, Aug 31, 2007 at 10:37:28AM +0200, Oliver Neukum wrote:
> > Hi,
> >
> > I only touched sound/usb/usbaudio.c
> > Nevertheless the whole subtree under sound/ is recompiling. What's
> > happening?
>
> The only file that gets compiled (CC) should be sound/usb/usbaudio.c
>
> It seems you are confused by the fact that this recompilation results in
> several relinks (LD)?
>
> > Regards
> > Oliver
>
> cu
> Adrian

Nope, it recompiled everything under sound/ and a lot of stuff under net/.
I'll check the system for odd scripts running at night touching stuff.

Regards
Oliver
Re: [PATCH 2.6.23 1/2] cxgb3 - Fix dev->priv usage
Divy Le Ray wrote: From: Divy Le Ray <[EMAIL PROTECTED]> cxgb3 used netdev_priv() and dev->priv for different purposes. In 2.6.23, netdev_priv() == dev->priv, cxgb3 needs a fix. This patch is a partial backport of Dave Miller's changes in the net-2.6.24 git branch. Without this fix, cxgb3 crashes on 2.6.23. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> --- drivers/infiniband/hw/cxgb3/cxio_hal.c |2 - drivers/net/cxgb3/adapter.h|2 + drivers/net/cxgb3/cxgb3_main.c | 126 ++-- drivers/net/cxgb3/cxgb3_offload.c | 16 +++- drivers/net/cxgb3/cxgb3_offload.h |2 + drivers/net/cxgb3/sge.c| 23 -- drivers/net/cxgb3/t3cdev.h |3 - 7 files changed, 105 insertions(+), 69 deletions(-) applied 1-2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: hda: set_drive_speed_status: status=0x51 { DriveReady SeekComplete Error }
> When a DMA operation is enabled, CS0- and CS1- shall not be asserted and > transfers shall be 16-bits wide. > > I took the above to mean the device was designed to support DMA. > Where did I err? The bus is specified for DMA, not the device. > > > The data sheet says the media can only do 4.1MB/second which is > > consistent with only needing PIO2 (actually it's far slower than PIO2) > > Is such a slow speed typical of DOMs sold today? > > Or do DOMs sold today support DMA bus mastering, much higher interface > rates, and much higher sustained throughput? Most CF is pretty slow but the newer CF cards do DMA and are getting far better. And yes I'd expect your real time app to deal for 3 or 4mS (Mind you you'll get 1mS normal operating worst cases off even a fast DMA IDE disk) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Section mismatch in 2.6.23-rc4
When compiling a kernel, I get the warnings below. I am using ARCH=powerpc
and 'make mpc8641_hpcn_defconfig', 'make uImage'. This didn't appear in
2.6.22, but in arch/powerpc/kernel/head_32.S and setup_32.c, the section
info apparently didn't change.

  LD      vmlinux.o
  MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x18): Section mismatch: reference to .init.text:early_init (between '__start' and '__after_mmu_off')
WARNING: vmlinux.o(.text+0x3788): Section mismatch: reference to .init.text:machine_init (between 'start_here' and 'set_context')
WARNING: vmlinux.o(.text+0x3790): Section mismatch: reference to .init.text:MMU_init (between 'start_here' and 'set_context')
WARNING: vmlinux.o(.text+0x37ba): Section mismatch: reference to .init.text:start_kernel (between 'start_here' and 'set_context')
WARNING: vmlinux.o(.text+0x37be): Section mismatch: reference to .init.text:start_kernel (between 'start_here' and 'set_context')
  LD      vmlinux
  SYSMAP  System.map
Re: [PATCH] i386 and x86_64: randomize brk()
On Thu, 30 Aug 2007, Andrew Morton wrote: > Not strongly, but the general opinion seems to be that ARCH_HAS_FOO is > sucky. It should at least be done in Kconfig rather than in .h, but > even better is just to implement the thing for all architectures. Below is an updated patch. checkpatch complains about adding extern into arch/x86_64/ia32/ia32_binfmt.c, but it's needed to work with this crazy #include "../../../fs/binfmt_elf.c" in it. It also complains about multiple assignments, but I don't seem to see anything wrong with them here. From: Jiri Kosina <[EMAIL PROTECTED]> i386 and x86_64: randomize brk() This patch randomizes the location of the heap (brk) for i386 and x86_64. The range is randomized in the range starting at current brk location up to 0x0200 offset for both architectures. This, together with pie-executable-randomization.patch and pie-executable-randomization-fix.patch, should make the address space randomization on i386 and x86_64 complete. The empty stubs are not added for architectures that don't support ELF binaries, namely blackfin, h8300, m68knommu and v850. 
Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]> diff --git a/arch/i386/kernel/process.c b/arch/i386/kernel/process.c index 8466471..ba8ad15 100644 --- a/arch/i386/kernel/process.c +++ b/arch/i386/kernel/process.c @@ -949,3 +949,17 @@ unsigned long arch_align_stack(unsigned long sp) sp -= get_random_int() % 8192; return sp & ~0xf; } + +void arch_randomize_brk(void) +{ + unsigned long new_brk; + unsigned long range_start; + unsigned long range_end; + + range_start = current->mm->brk; + range_end = range_start + 0x0200; + new_brk = randomize_range(range_start, range_end, 0); + if (new_brk) + current->mm->brk = current->mm->start_brk = new_brk; +} + diff --git a/arch/x86_64/ia32/ia32_binfmt.c b/arch/x86_64/ia32/ia32_binfmt.c index dffd2ac..ccc4350 100644 --- a/arch/x86_64/ia32/ia32_binfmt.c +++ b/arch/x86_64/ia32/ia32_binfmt.c @@ -262,6 +262,7 @@ static void elf32_init(struct pt_regs *); #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1 #define arch_setup_additional_pages syscall32_setup_pages extern int syscall32_setup_pages(struct linux_binprm *, int exstack); +extern void arch_randomize_brk(void); #include "../../../fs/binfmt_elf.c" diff --git a/arch/x86_64/kernel/process.c b/arch/x86_64/kernel/process.c index 2842f50..fe7203b 100644 --- a/arch/x86_64/kernel/process.c +++ b/arch/x86_64/kernel/process.c @@ -902,3 +902,17 @@ unsigned long arch_align_stack(unsigned long sp) sp -= get_random_int() % 8192; return sp & ~0xf; } + +void arch_randomize_brk(void) +{ + unsigned long new_brk; + unsigned long range_start; + unsigned long range_end; + + range_start = current->mm->brk; + range_end = range_start + 0x0200; + new_brk = randomize_range(range_start, range_end, 0); + if (new_brk) + current->mm->brk = current->mm->start_brk = new_brk; +} + diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index d65f1d9..4c92461 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1073,6 +1073,9 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) current->mm->end_data = 
end_data; current->mm->start_stack = bprm->p; + if (current->flags & PF_RANDOMIZE) + arch_randomize_brk(); + if (current->personality & MMAP_PAGE_ZERO) { /* Why this, you ask??? Well SVr4 maps page 0 as read-only, and some applications "depend" upon this behavior. diff --git a/include/asm-alpha/elf.h b/include/asm-alpha/elf.h index 6c2d78f..18210cc 100644 --- a/include/asm-alpha/elf.h +++ b/include/asm-alpha/elf.h @@ -163,5 +163,9 @@ extern int alpha_l3_cacheshape; NEW_AUX_ENT(AT_L3_CACHESHAPE, alpha_l3_cacheshape);\ } while (0) +static inline void arch_randomize_brk(void) +{ +} + #endif /* __KERNEL__ */ #endif /* __ASM_ALPHA_ELF_H */ diff --git a/include/asm-arm/elf.h b/include/asm-arm/elf.h index ec1c685..e3f 100644 --- a/include/asm-arm/elf.h +++ b/include/asm-arm/elf.h @@ -116,4 +116,8 @@ extern char elf_platform[]; #endif +static inline void arch_randomize_brk(void) +{ +} + #endif diff --git a/include/asm-avr32/elf.h b/include/asm-avr32/elf.h index d334b49..61b7d81 100644 --- a/include/asm-avr32/elf.h +++ b/include/asm-avr32/elf.h @@ -107,4 +107,8 @@ typedef struct user_fpu_struct elf_fpregset_t; #define SET_PERSONALITY(ex, ibcs2) set_personality(PER_LINUX_32BIT) #endif +static inline void arch_randomize_brk(void) +{ +} + #endif /* __ASM_AVR32_ELF_H */ diff --git a/include/asm-cris/elf.h b/include/asm-cris/elf.h index 96a40c1..10607c7 100644 --- a/include/asm-cris/elf.h +++ b/include/asm-cris/elf.h @@ -93,4 +93,8 @@ typedef unsigned long elf_fpregset_t; #endif /* __KERNEL__ */ +static inline void arch_randomize_brk(void) +{ +}
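The selection logic of arch_randomize_brk() can be tried out in userspace with a sketch like the one below. It assumes that randomize_range() returns 0 for an empty range and otherwise a page-aligned address derived from a random number, which is what the "if (new_brk)" check in the patch suggests. pick_in_range() and randomize_brk() are hypothetical stand-ins written for this example, the random value is passed in as a parameter so the result is reproducible, the page size is assumed to be 4096, and the size of the range is left as a parameter rather than taken from the patch.

```c
#include <assert.h>

#define PAGE_SIZE	4096UL	/* assumed page size */
#define PAGE_ALIGN(a)	(((a) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/*
 * Hypothetical stand-in for randomize_range(); rnd replaces the
 * kernel's random source so that the result is deterministic.
 */
static unsigned long pick_in_range(unsigned long start, unsigned long end,
				   unsigned long len, unsigned long rnd)
{
	if (end <= start + len)
		return 0;	/* empty range: nothing to choose from */
	return PAGE_ALIGN(rnd % (end - len - start) + start);
}

/* Mirrors the logic of arch_randomize_brk() on a plain variable. */
static unsigned long randomize_brk(unsigned long brk, unsigned long span,
				   unsigned long rnd)
{
	unsigned long new_brk = pick_in_range(brk, brk + span, 0, rnd);

	/* Keep the old brk if no new location could be picked. */
	return new_brk ? new_brk : brk;
}
```

With brk = 0x1000, span = 0x10000 and rnd = 0x2345 the sketch yields 0x4000: the offset 0x2345 is added to the start and the sum is rounded up to the next page boundary.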
Re: [PATCH] Documentation/00-INDEX notice ecryptfs.txt moved.
Hi Rob, On Thu, 30 Aug 2007, Rob Landley wrote: > On Thursday 30 August 2007 2:04:37 pm Randy Dunlap wrote: > > Please use the expected (canonical) patch format. > > > > See Documentation/SubmittingPatches: > > 14) The canonical patch format > > from Rob Landley <[EMAIL PROTECTED]> > Signed-off-by: Rob Landley <[EMAIL PROTECTED]> > > ecryptfs.txt moved into filesystems, make 00-INDEX follow. That's still not quite right :-) What Randy meant is that the sign-off must come /after/ the patch description: From: Random J Hacker <[EMAIL PROTECTED]> /* * Above line not really required if it's your own patch that you're * sending. But required when you're "relaying" someone else's patch, * or if there are multiple people in the sign-off chain below, * to differentiate as to who was the author of the patch. */ [PATCH] : /* * Above line not really required if Subject: of email already such. * But may be required if you're replying on some other LKML thread, * whose subject may have nothing to do with this patch. */ Signed-off-by: Random J Hacker <[EMAIL PROTECTED]> Signed-off-by: Bob Z Maintainer <[EMAIL PROTECTED]> /* * It is generally preferable to order the sign-off's above so that * they reflect the patch journey -- so the original author preferably * comes first, then those who relayed it or picked it up from the * author (often the maintainer), and so on, till the person who is * finally relaying it currently (so from the above it seems this is * being sent out by Bob). If this is the first time a patch is being * sent, there would naturally be just a single sign-off line. */ Then three hyphens, followed by diffstat and patch ... so the below looks good :-) [ All this is not too important, admittedly, but causes least amount of processing time to be wasted on the recipient's end, and also does not confuse scripts that may be used to extract patches (and git commit command-line arguments) from mails automatically. 
] > --- > > 00-INDEX |2 -- > filesystems/00-INDEX |2 ++ > 2 files changed, 2 insertions(+), 2 deletions(-) > > --- linux-2.6.23-rc4/Documentation/00-INDEX 2007-08-27 20:32:35.0 > -0500 > +++ linux-new/Documentation/00-INDEX 2007-08-30 14:43:15.0 -0500 > @@ -134,8 +134,6 @@ > - info on Linux Digital Video Broadcast (DVB) subsystem. > early-userspace/ > - info about initramfs, klibc, and userspace early during boot. > -ecryptfs.txt > - - docs on eCryptfs: stacked cryptographic filesystem for Linux. > eisa.txt > - info on EISA bus support. > exception.txt > --- linux-2.6.23-rc4/Documentation/filesystems/00-INDEX 2007-08-27 > 20:32:35.0 -0500 > +++ linux-new/Documentation/filesystems/00-INDEX 2007-08-30 > 14:42:08.0 -0500 > @@ -32,6 +32,8 @@ > - info about the locking scheme used for directory operations. > dlmfs.txt > - info on the userspace interface to the OCFS2 DLM. > +ecryptfs.txt > + - docs on eCryptfs: stacked cryptographic filesystem for Linux. > ext2.txt > - info, mount options and specifications for the Ext2 filesystem. > ext3.txt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Kernel 2.6.22.6 iPod conflict with PS/2 device.
When I have an iPod attached via USB to an ABIT IC7-G board before it boots up and let X start etc, the mouse (PS/2) does not function, but the keyboard works OK. GPM does not work either. When I attach the iPod after the machine has booted up, everything is OK, until the next reboot (with the iPod still attached via USB) Any idea why this happens? The kernel .config is attached. There are no modules in use, everything is compiled into the kernel, the distribution is Debian Testing. Justin. config-2.6.22.6.bz2 Description: Binary data
Re: recent nfs change causes autofs regression
On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote: > > If you want new behaviour, you add a new flag saying you want new > behaviour. You don't just start behaving differently from what you've > always done before (and what *other* UNIXes do, for that matter). > > Besides, even *if* it was a matter of somebody doing a mount with "rw", > when the previous mount was "ro", returning EBUSY is still the wrong thing > to do! If the user asks for a new mount that is read-write, he should just > get it - ie we should not re-use the old client handles, and we should do > what Solaris apparently does, namely to just make it a totally different > mount. > > In other words, it should (as I already mentioned once) have used > "nosharecache" by default, which makes it all work. > > Then, people who want to re-use the caches (which in turn may mean that > everything needs to have the same flags), THOSE PEOPLE, who want the NEW > SEMANTICS (errors and all) should then use a "sharecache" flag. That would be a major change in existing semantics. The default has been "sharecache" ever since Al Viro introduced the "sget()" function some 6 or 7 years ago. The problem was that we never advertised the fact that the kernel was overriding your mount options, and so sysadmins were (rightly IMO) complaining that they should _know_ when the client does this. The list of known problems with a "nosharecache" default is nasty too: - file and directory attribute and data caching breaks. Applications will see stale data in cases where they otherwise would not expect it. - the existing dcache and icache issues when a file is renamed or deleted on the server are now extended to also include the case where the rename or deletion occurs on an alias in another directory on the client itself. In particular, sillyrename will break. - file locking breaks (the server knows that the client holds locks on one file, whereas the client thinks it holds locks on several). 
- the NFSv4 delegation model breaks: the client will be using OPEN when it could use cached opens. More importantly, when performing an operation that requires it to return the delegation on the aliased file, it won't know until the server sends it a callback. ...and of course, the amount of unnecessary traffic to the server increases. I'm not aware of any sane way of dealing with those issues, and I doubt Solaris has a solution for them either. Trond - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Nonblocking call may block in a mutex? Nonblocking call after poll may fail?
Hi!

This is a driver-related question on non-blocking writes and poll.

Setup: there is a single output buffer (in kernel space) of 24 bytes for
writes from all processes A, B, and C. Each process is restricted to use
at most 8 bytes (8*3 = 24) until that data has been handled (interrupt
handler...).

Question: if this output buffer has 4 bytes of space remaining for
process A, a non-blocking write by process A could still encounter a
locked mutex if process B is busy writing to the output buffer. Should
process A now block/sleep until that mutex is free and it can access the
output buffer (and its 4 bytes of space)?

What about a non-blocking (write) poll by process A? If the poll call
succeeds (the output buffer has space remaining for process A) and
process A now performs a non-blocking write, what happens if A encounters
a locked mutex because process B is busy writing to the output buffer?

a) Should A block until the mutex is available?
b) Should A return -EAGAIN, even though the poll call succeeded?
c) Should it be impossible for this to happen, i.e. should process A
   already "have" the mutex in question when the poll call succeeds
   (thus preventing B from writing to the output buffer)?

For c): what if process A "has" the mutex but never does the
non-blocking write? Then no process can write, since the mutex is held
by process A...

I'll appreciate any answer or pointer to relevant information.

Thanks
Albert
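For reference, option (b) can be sketched in userspace as follows. This is not driver code and it does not settle which of the three options is right; it only shows what "try the lock, return -EAGAIN if it is busy" looks like. The struct and function names are made up for this example, a C11 atomic_flag stands in for the kernel mutex, and the per-process limit of 8 bytes is left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 24

/* Hypothetical model of the shared output buffer. */
struct outbuf {
	atomic_flag busy;	/* stands in for the kernel mutex */
	char data[BUF_SIZE];
	size_t used;
};

static void outbuf_init(struct outbuf *b)
{
	atomic_flag_clear(&b->busy);
	b->used = 0;
}

/* Returns 1 if the lock was taken, 0 if somebody else holds it. */
static int outbuf_trylock(struct outbuf *b)
{
	return !atomic_flag_test_and_set(&b->busy);
}

static void outbuf_unlock(struct outbuf *b)
{
	atomic_flag_clear(&b->busy);
}

/* Non-blocking write: never waits for the lock or for buffer space. */
static long outbuf_write_nonblock(struct outbuf *b, const char *src, size_t len)
{
	size_t room;

	if (!outbuf_trylock(b))
		return -EAGAIN;		/* lock busy: option (b) */

	room = BUF_SIZE - b->used;
	if (room == 0) {
		outbuf_unlock(b);
		return -EAGAIN;		/* buffer full */
	}
	if (len > room)
		len = room;		/* short write */
	memcpy(b->data + b->used, src, len);
	b->used += len;
	outbuf_unlock(b);
	return (long)len;
}
```

In this model a caller that sees -EAGAIN after a successful poll simply polls again. Whether a real driver should behave this way, or just take a briefly held mutex even for non-blocking writes, is exactly the open question of the mail.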
[PATCH] Fix a lock problem in generic phy code
Lock debugging finds a problem in phy.c and phy_device.c, this patch
fixes it. Tested on an AT91SAM9263-EK board, kernel 2.6.23-rc4.

Signed-off-by: Hans J. Koch <[EMAIL PROTECTED]>
---
Index: linux-2.6.23-rc/drivers/net/phy/phy_device.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy_device.c	2007-08-31 14:07:47.0 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy_device.c	2007-08-31 14:08:22.0 +0200
@@ -644,7 +644,7 @@
 	if (!(phydrv->flags & PHY_HAS_INTERRUPT))
 		phydev->irq = PHY_POLL;
 
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	/* Start out supporting everything. Eventually,
 	 * a controller will attach, and may modify one
@@ -658,7 +658,7 @@
 	if (phydev->drv->probe)
 		err = phydev->drv->probe(phydev);
 
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 
 	return err;
 
Index: linux-2.6.23-rc/drivers/net/phy/phy.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy.c	2007-08-31 14:15:20.0 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy.c	2007-08-31 14:15:43.0 +0200
@@ -755,7 +755,7 @@
  */
 void phy_start(struct phy_device *phydev)
 {
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	switch (phydev->state) {
 		case PHY_STARTING:
@@ -769,7 +769,7 @@
 		default:
 			break;
 	}
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 }
 EXPORT_SYMBOL(phy_stop);
 EXPORT_SYMBOL(phy_start);
Re: [Lguest] [PATCH] modify lguest console to support multiple hvc's
On 8/30/07, Rusty Russell <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-08-30 at 13:38 -0500, Eric Van Hensbergen wrote:
> > From: Eric Van Hensbergen <[EMAIL PROTECTED]>
> >
> > This was a quick modification I did of lguest to be able to support multiple
> > HVC channels for some experiments I was doing. I'm not sure if this is more
> > generally useful, so I'm posting it to the list in case someone else has a
> > need for it.
> >
> > Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>
>
> This is cool! Great that it's useful for you. What do the other
> consoles look like from inside the guest?
>

They just show up on other hvc device minor numbers. I was running 9p
over them, but I wanted a tighter coupling for v9fs so I could tune
performance and incrementally optimize.

-eric
Re: [PATCH 5/5] Net: ath5k, kconfig changes
On Thu, 2007-08-30 at 08:36 -0400, John W. Linville wrote: > On Thu, Aug 30, 2007 at 04:38:09AM +0300, Nick Kossifidis wrote: > > 2007/8/28, Christoph Hellwig <[EMAIL PROTECTED]>: > > > > Also this whole patch seems rather pointless. It saves only > > > very little and turns the driver into a complete ifdef maze. > > > Also most > > people will use 5212 code only, 5211 cards are on some old laptops and > > 5210, well i couldn't even find a 5210 for actual testing :P > > FWIW, I'd bet dollars to donuts that distros will enable them all > together. I would certainly _hope_ that distros enable everything -that is in the kernel- that they can get their hands on, otherwise when you stick a card in, it doesn't just work. Dan > Is saving code space the only reason to turn these off? How much > space do you save? > > Is there some way you can isolate and/or limit the number of ifdef > blocks further? If so, we might consider a version of this patch > that depends on EMBEDDED or somesuch...? > > John - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] uli526x: Add suspend and resume routines (updated)
Rafael J. Wysocki wrote: On Tuesday, 14 August 2007 07:58, Jeff Garzik wrote: Rafael J. Wysocki wrote: On Wednesday, 8 August 2007 00:26, Jeff Garzik wrote: Rafael J. Wysocki wrote: On Tuesday, 7 August 2007 23:40, Jeff Garzik wrote: I'll let our new tulip maintainer see what he thinks about the implementation. Seems fairly sane to me, but should at least get an "it works" test. It has been tested, as stated in the changelog, and works (on my test system). [--snip--] Two comments: 1) Like akpm, the CONFIG_PM_SLEEP is out of place. All other drivers use CONFIG_PM OK 2) just remove the !dev checks, that is an impossible condition OK Updated patch follows. --- From: Rafael J. Wysocki <[EMAIL PROTECTED]> Add suspend/resume support to the uli526x network driver (tested on x86_64, with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 40'). This patch is based on the suspend/resume code in the tg3 driver. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> --- drivers/net/tulip/uli526x.c | 108 +--- 1 file changed, 102 insertions(+), 6 deletions(-) applied - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Add all thread stats for TASKSTATS_CMD_ATTR_TGID (v3)
TASKSTATS_CMD_ATTR_TGID used to return only the delay accounting stats, not
the basic and extended accounting. With this patch,
TASKSTATS_CMD_ATTR_TGID also aggregates the accounting info for all threads
of a thread group. This makes TASKSTATS_CMD_ATTR_TGID usable in a similar
fashion to TASKSTATS_CMD_ATTR_PID, for commands like iotop -P
(http://guichaz.free.fr/misc/iotop.py).

Here is the output of the testcase before the patch:

Name          |      System|        User|  Cached I/O|        Self|       Group|
--------------+------------+------------+------------+------------+------------+
version       |           5|           5|           5|           5|           5|
ac_exitcode   |           0|           0|           0|           0|           0|
ac_flag       |           0|           0|           0|           0|           0|
ac_nice       |           5|          10|          15|           0|           0|
cpu_count     |        2020|        1378|         360|       70645|       74428|
cpu_delay_tota|        2327|    14281152|    14225866|       30475|    44505841|
blkio_delay_to|           0|           0|           0|           0|  1146901308|
swapin_count  |           0|           0|           0|           0|           0|
swapin_delay_t|           0|           0|           0|           0|           0|
cpu_run_real_t|  4794271160|  1269806960|   331949536|    44993160|  6562002424|
cpu_run_virtua|  4792937602|  1247020977|   330330065|    43082258|  6530204192|
ac_comm       |      cached|      cached|      cached|      cached|            |
ac_uid        |         500|         500|         500|         500|           0|
ac_gid        |         500|         500|         500|         500|           0|
ac_pid        |        2513|        2514|        2515|        2512|           0|
ac_ppid       |        2445|        2445|        2445|        2445|           0|
ac_btime      |  1188326011|  1188326012|  1188326013|  1188326011|           0|
ac_etime      |     6619382|     5615437|     4600494|     6622066|           0|
ac_utime      |       25000|     1264000|           2|        2000|           0|
ac_stime      |         477|        6000|      312000|       43000|           0|
ac_minflt     |           2|           0|           1|         320|           0|
ac_majflt     |           0|           0|           0|           0|           0|
coremem       |        3436|         286|        2302|        1285|           0|
virtmem       |        4012|         301|         740|        2198|           0|
hiwater_rss   |         756|         756|         756|         756|           0|
hiwater_vm    |       35004|       35004|       35004|       35004|           0|
read_char     |   208877984|           0|   825517700|        2807|           0|
write_char    |           0|           0|           0|           0|           0|
read_syscalls |      244730|           0|      215782|          10|           0|
write_syscalls|           0|           0|           0|           0|           0|
read_bytes    |           0|           0|           0|           0|           0|
write_bytes   |           0|           0|           0|           0|           0|
cancelled_writ|           0|           0|           0|           0|           0|
nvcsw         |           0|           0|           0|           8|           8|
nivcsw        |        2019|        1378|         360|           9|        3766|

and after the patch:

Name          |      System|        User|  Cached I/O|        Self|       Group|
--------------+------------+------------+------------+------------+------------+
version       |           5|           5|           5|           5|           5|
ac_exitcode   |           0|           0|           0|           0|           0|
ac_flag       |           0|           0|           0|           0|           0|
ac_nice       |           5|          10|          15|           0|           0|
cpu_count     |        1716|        1189|         293|       72795|       76036|
cpu_delay_tota|        3635|    15921729|    15269951|      277683|    46143870|
blkio_delay_to|           0|           0|           0|   113237372|   113237372|
swapin_count  |           0|           0|           0|           0|           0|
swapin_delay_t|           0|           0|           0|           0|           0|
cpu_run_real_t|  4338340472|  1117830064|   284956680|    49992400|  5915100768|
cpu_run_virtua|  4333641573|  1099471940|   282075705|    46039754|  5882048860|
ac_comm       |      cached|      cached|      cached|      cached|      cached|
ac_uid        |         500|         500|         500|         500|         500|
ac_gid        |         500|         500|         500|         500|         500|
ac_pid        |        7144|        7145|        7146|        7143|        7143|
ac_ppid       |        2435|        2435|        2435|        2435|        2435|
ac_btime      |  1188563357|  1188563358|  1188563359|  1188563356|  1188563356|
ac_et
Re: [git patches] net driver fixes
Satyam Sharma wrote: On Mon, 30 Jul 2007, Jeff Garzik wrote: true, we should just remove the dev==NULL check Patch below: [PATCH] nmclan_cs: Remove bogus (dev==NULL) check in mace_interrupt() The (dev == NULL) check in drivers/net/pcmcia/nmclan_cs.c:mace_interrupt() handler is always false, so let's remove it. Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]> --- drivers/net/pcmcia/nmclan_cs.c |6 -- 1 files changed, 0 insertions(+), 6 deletions(-) ACK, but patch does not apply to netdev-2.6.git#upstream alas - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: recent nfs change causes autofs regression
On Fri, Aug 31, 2007 at 08:11:38AM -0400, Trond Myklebust wrote: > On Fri, 2007-08-31 at 01:07 -0700, Linus Torvalds wrote: > > > > > If you want new behaviour, you add a new flag saying you want new > > behaviour. You don't just start behaving differently from what you've > > always done before (and what *other* UNIXes do, for that matter). > > > > Besides, even *if* it was a matter of somebody doing a mount with "rw", > > when the previous mount was "ro", returning EBUSY is still the wrong thing > > to do! If the user asks for a new mount that is read-write, he should just > > get it - ie we should not re-use the old client handles, and we should do > > what Solaris apparently does, namely to just make it a totally different > > mount. > > > > In other words, it should (as I already mentioned once) have used > > "nosharecache" by default, which makes it all work. > > > > Then, people who want to re-use the caches (which in turn may mean that > > everything needs to have the same flags), THOSE PEOPLE, who want the NEW > > SEMANTICS (errors and all) should then use a "sharecache" flag. > > That would be a major change in existing semantics. The default has been > "sharecache" ever since Al Viro introduced the "sget()" function some 6 > or 7 years ago. The problem was that we never advertised the fact that > the kernel was overriding your mount options, and so sysadmins were > (rightly IMO) complaining that they should _know_ when the client does > this. > > The list of known problems with a "nosharecache" default is nasty too: > > - file and directory attribute and data caching breaks. > Applications will see stale data in cases where they otherwise > would not expect it. > > - the existing dcache and icache issues when a file is renamed > or deleted on the server are now extended to also include the > case where the rename or deletion occurs on an alias in another > directory on the client itself. In particular, sillyrename will > break. 
> > - file locking breaks (the server knows that the client holds > locks on one file, whereas the client thinks it holds locks on > several). > > - the NFSv4 delegation model breaks: the client will be using > OPEN when it could use cached opens. More importantly, when > performing an operation that requires it to return the > delegation on the aliased file, it won't know until the server > sends it a callback. > > ...and of course, the amount of unnecessary traffic to the server > increases. I'm not aware of any sane way of dealing with those issues, > and I doubt Solaris has a solution for them either. All of this won't happen when server foo exports /bar and a client mounts /bar/x and /bar/y separately: there must be a shared subtree or hard-links between files within them, right? An obvious (but disruptive) server side workaround is to export the subtrees with different fsid= but that would give the same list of problems as above, right? IMHO I'd only consider returning EBUSY when trying to mount _exactly_ the same directory with different flags, not for arbitrary subtrees. The client should preferably not be bothered with server side disk partitioning (at least not beyond the obvious such as df output). -- Frank - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE/RFC] Really Fair Scheduler
Hi, On Fri, 31 Aug 2007, Ingo Molnar wrote: > So the most intrusive (math) aspects of your patch have been implemented > already for CFS (almost a month ago), in a finegrained way. Interesting claim, please substantiate. > Peter's patches change the CFS calculations gradually over from > 'normalized' to 'non-normalized' wait-runtime, to avoid the > normalizing/denormalizing overhead and rounding error. Actually it changes wait-runtime to a normalized value and it changes nothing about the rounding error I was talking about. It addresses the conversion error between the different units I was mentioning in an earlier mail, but the value is still rounded. > > This model is far more accurate than CFS is and doesn't add an error > > over time, thus there are no more underflow/overflow anymore within > > the described limits. > > ( your characterisation errs in that it makes it appear to be a common > problem, while in practice it's only a corner-case limited to extreme > negative nice levels and even there it needs a very high rate of > scheduling and an artificially constructed workload: several hundreds > of thousand of context switches per second with a yield-ing loop to be > even measurable with unmodified CFS. So this is not a 2.6.23 issue at > all - unless there's some testcase that proves the opposite. ) > > with Peter's queue there are no underflows/overflows either anymore in > any synthetic corner-case we could come up with. Peter's queue works > well but it's 2.6.24 material. Did you even try to understand what I wrote? I didn't say that it's a "common problem", it's a conceptual problem. The rounding has been improved lately, so it's not as easy to trigger with some simple busy loops. Peter's patches don't remove limit_wait_runtime() and AFAICT they can't, so I'm really amazed how you can make such claims. > All in one, we dont disagree, this is an incremental improvement we are > thinking about for 2.6.24. 
We do disagree with this being positioned as > something fundamentally different though - it's just the same thing > mathematically, expressed without a "/weight" divisor, resulting in no > change in scheduling behavior. (except for a small shift of CPU > utilization for a synthetic corner-case)

Every time I'm amazed how quickly you get to your judgements... :-( Especially interesting is that you don't need to ask a single question for that, which would mean you actually understood what I wrote, OTOH your wild claims tell me something completely different. BTW who is "we" and how is it possible that this meta mind can come to such quick judgements?

The basic concept is different enough; one can e.g. see that I have to calculate some of the key CFS variables for the debug output. The concepts are related, but they are definitely not "the same thing mathematically": the method of resolution is quite different. If you think otherwise then please _prove_ it.

bye, Roman
Re: [ANNOUNCE/RFC] Really Fair Scheduler
Hi,

On Fri, 31 Aug 2007, Mike Galbraith wrote:
> I plunked it into 2.6.23-rc4 to see how it reacts to various sleeper
> loads, and hit some starvation. If I got it in right (think so) there's
> a bug lurking somewhere. taskset -c 1 fairtest2 resulted in the below.
> It starts up running both tasks at about 60/40 for hog/sleeper, then
> after a short while goes nuts. The hog component eats 100% cpu and
> starves the sleeper (and events, forever).

Thanks for testing, although your test program does nothing unusual here. Can you please send me your .config? Were there some kernel messages while running it?

bye, Roman
Re: [Bugme-new] [Bug 8961] New: BUG triggered by oidentd in netlink code
On Fri, Aug 31, 2007 at 01:05:04PM +0200, Patrick McHardy wrote:
> Seems to be a bug introduced by the netlink_run_queue conversion,
> since there is no locking and netlink_run_queue doesn't check
> for NULL results from skb_dequeue, it might pass NULL to
> netlink_rcv_skb, which crashes.
>
> Does this patch help?

I'll compile up a new kernel, likely 2.6.22.6, plus this patch, and reboot to it tonight. I still don't know *exactly* how to trigger the bug on demand though; it hasn't recurred since I posted the bug report (but had happened about a week before as well).

thanks,
-Ath
--
- Athanasius = Athanasius(at)miggy.org / http://www.miggy.org/
Finger athan(at)fysh.org for PGP key
"And it's me who is my enemy. Me who beats me up. Me who makes the monsters. Me who strips my confidence." Paula Cole - ME
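The failure mode Patrick describes (a dequeue that can return NULL, with the result handed on unchecked) can be illustrated with a small userspace sketch. This is not the kernel patch under discussion: `toy_buf`, `toy_queue`, `toy_dequeue`, `toy_rcv` and `toy_run_queue` are hypothetical stand-ins for the sk_buff queue, skb_dequeue, netlink_rcv_skb and netlink_run_queue, and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and its queue head. */
struct toy_buf {
    struct toy_buf *next;
    int payload;
};

struct toy_queue {
    struct toy_buf *head;
};

/* Running total of payloads seen by the receive callback. */
static int toy_total;

/* Like skb_dequeue(): returns NULL when the queue is empty. */
static struct toy_buf *toy_dequeue(struct toy_queue *q)
{
    struct toy_buf *b = q->head;

    if (b)
        q->head = b->next;
    return b;
}

/* Stand-in for the per-message receive handler; must never see NULL. */
static void toy_rcv(struct toy_buf *b)
{
    toy_total += b->payload;
}

/* Drain the queue, stopping as soon as the dequeue yields NULL. */
static int toy_run_queue(struct toy_queue *q)
{
    struct toy_buf *b;
    int handled = 0;

    while ((b = toy_dequeue(q)) != NULL) {
        toy_rcv(b);
        handled++;
    }
    return handled;
}
```

The point of the loop shape is that the emptiness test and the dequeue are the same operation, so a queue emptied by someone else between a length check and the dequeue cannot produce a NULL dereference.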
Re: [PATCH 8/11] cxgb3 - Update internal memory management
Divy Le Ray wrote: From: Divy Le Ray <[EMAIL PROTECTED]> Set PM1 internal memory to round robin mode It balances access to this internal memory for multiport adapters. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> --- drivers/net/cxgb3/regs.h |2 ++ drivers/net/cxgb3/t3_hw.c |2 ++ 2 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h index 2824278..5e1bc0d 100644 --- a/drivers/net/cxgb3/regs.h +++ b/drivers/net/cxgb3/regs.h @@ -1326,6 +1326,7 @@ #define V_D0_WEIGHT(x) ((x) << S_D0_WEIGHT) #define A_PM1_RX_CFG 0x5c0 +#define A_PM1_RX_MODE 0x5c4 #define A_PM1_RX_INT_ENABLE 0x5d8 @@ -1394,6 +1395,7 @@ #define A_PM1_RX_INT_CAUSE 0x5dc #define A_PM1_TX_CFG 0x5e0 +#define A_PM1_TX_MODE 0x5e4 #define A_PM1_TX_INT_ENABLE 0x5f8 Future note: would be nice to see the above as enums as well - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 8/11] cxgb3 - Update internal memory management
Divy Le Ray wrote: From: Divy Le Ray <[EMAIL PROTECTED]> Set PM1 internal memory to round robin mode It balances access to this internal memory for multiport adapters. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> --- drivers/net/cxgb3/regs.h |2 ++ drivers/net/cxgb3/t3_hw.c |2 ++ 2 files changed, 4 insertions(+), 0 deletions(-) applied patches 1-8 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 9/11] cxgb3 - engine microcode update
Divy Le Ray wrote: From: Divy Le Ray <[EMAIL PROTECTED]> Load microcode engine when the interface is configured up. Bump up version to 1.1.0. Allow the driver to be and running with older microcode images. Allow ethtool to log the microcode version. Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]> ACK patches 9-14, but dropped, since they do not apply to #upstream (probably due to fixes sent into 2.6.23-rc) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
DTR/DSR Patch
Hello,

I am writing a driver for point-of-sale printers. A wide range of these printers support only DTR/DSR hardware handshaking. When I do the handshaking in userspace, the printer has an overrun problem, and our customer then has a problem with the tax office.

My patch implements simple DTR/DSR handshaking with a small code change. The change needed in userspace is very simple, e.g. cflags |= CDTRDSR; and the stty tool needs a change of 2 lines.

Michael

diff -Nur linux-2.6/drivers/serial/8250.c linux-2.6.dsr_test/drivers/serial/8250.c --- linux-2.6/drivers/serial/8250.c 2007-08-28 11:31:48.0 +0200 +++ linux-2.6.dsr_test/drivers/serial/8250.c 2007-08-31 13:20:57.0 +0200 @@ -1400,12 +1400,15 @@ up->port.info != NULL) { if (status & UART_MSR_TERI) up->port.icount.rng++; - if (status & UART_MSR_DDSR) - up->port.icount.dsr++; if (status & UART_MSR_DDCD) uart_handle_dcd_change(&up->port, status & UART_MSR_DCD); - if (status & UART_MSR_DCTS) - uart_handle_cts_change(&up->port, status & UART_MSR_CTS); + if (status & (UART_MSR_DCTS|UART_MSR_DDSR)) { + if (status & UART_MSR_DDSR) + up->port.icount.dsr++; + else + up->port.icount.cts++; + uart_handle_dsr_cts_change(&up->port, status & UART_MSR_CTS, status & UART_MSR_DSR); + } wake_up_interruptible(&up->port.info->delta_msr_wait); } diff -Nur linux-2.6/drivers/serial/serial_core.c linux-2.6.dsr_test/drivers/serial/serial_core.c --- linux-2.6/drivers/serial/serial_core.c 2007-08-28 11:31:48.0 +0200 +++ linux-2.6.dsr_test/drivers/serial/serial_core.c 2007-08-31 13:29:22.0 +0200 @@ -191,6 +191,13 @@ info->tty->hw_stopped = 1; spin_unlock_irq(&port->lock); } + + if (info->flags & UIF_DSR_FLOW) { + spin_lock_irq(&port->lock); + if (!(port->ops->get_mctrl(port) & TIOCM_DSR)) + info->tty->hw_stopped = 1; + spin_unlock_irq(&port->lock); + } info->flags |= UIF_INITIALIZED; @@ -437,6 +444,11 @@ else state->info->flags &= ~UIF_CTS_FLOW; + if (termios->c_cflag & CDTRDSR) + state->info->flags |= UIF_DSR_FLOW; + else + state->info->flags &= ~UIF_DSR_FLOW; + if
(termios->c_cflag & CLOCAL) state->info->flags &= ~UIF_CHECK_CD; else --- linux-2.6/include/asm-i386/termbits.h 2007-05-21 10:37:10.0 +0200 +++ linux-2.6.dsr_test/include/asm-i386/termbits.h 2007-08-28 11:58:43.0 +0200 @@ -157,6 +157,7 @@ #define B350 0010016 #define B400 0010017 #define CIBAUD 00200360 +#define CDTRDSR 0040 /* dtrdsr flow control */ #define CMSPAR 0100 /* mark or space (stick) parity */ #define CRTSCTS 0200 /* flow control */ diff -Nur linux-2.6/include/linux/serial_core.h linux-2.6.dsr_test/include/linux/serial_core.h --- linux-2.6/include/linux/serial_core.h 2007-08-16 10:30:59.0 +0200 +++ linux-2.6.dsr_test/include/linux/serial_core.h 2007-08-30 16:09:55.0 +0200 @@ -334,6 +334,7 @@ * Definitions for info->flags. These are _private_ to serial_core, and * are specific to this structure. They may be queried by low level drivers. */ +#define UIF_DSR_FLOW ((__force uif_t) (1 << 22)) #define UIF_CHECK_CD ((__force uif_t) (1 << 25)) #define UIF_CTS_FLOW ((__force uif_t) (1 << 26)) #define UIF_NORMAL_ACTIVE ((__force uif_t) (1 << 29)) @@ -493,26 +494,27 @@ /** * uart_handle_cts_change - handle a change of clear-to-send state + * when set DTR/DSR and RTS/CTS send only when both lines ok * @port: uart_port structure for the open port * @status: new clear to send status, nonzero if active */ static inline void -uart_handle_cts_change(struct uart_port *port, unsigned int status) +uart_handle_dsr_cts_change(struct uart_port *port, unsigned int status_cts, unsigned int status_dsr) { struct uart_info *info = port->info; struct tty_struct *tty = info->tty; + int cts_stop = (info->flags & UIF_CTS_FLOW) && !status_cts; + int dsr_stop = (info->flags & UIF_DSR_FLOW) && !status_dsr; - port->icount.cts++; - - if (info->flags & UIF_CTS_FLOW) { + if ((info->flags & UIF_CTS_FLOW) || (info->flags & UIF_DSR_FLOW)) { if (tty->hw_stopped) { - if (status) { + if (!(cts_stop||dsr_stop)) {
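The gating rule in the patch ("send only when both lines ok") can be modelled in a few lines of userspace C. This is only a sketch of the decision as read from the diff above, not driver code: `TOY_CTS_FLOW`, `TOY_DSR_FLOW` and `tx_must_stop` are hypothetical names standing in for UIF_CTS_FLOW, UIF_DSR_FLOW and the hw_stopped decision.

```c
#include <stdbool.h>

/* Hypothetical flag bits mirroring UIF_CTS_FLOW / UIF_DSR_FLOW. */
#define TOY_CTS_FLOW 0x1u
#define TOY_DSR_FLOW 0x2u

/*
 * Returns true when transmission has to be held off.
 * A modem line only matters if its flow-control flag is enabled;
 * with both flags set, both lines must be asserted to transmit.
 */
static bool tx_must_stop(unsigned int flags, bool cts, bool dsr)
{
    bool cts_stop = (flags & TOY_CTS_FLOW) && !cts;
    bool dsr_stop = (flags & TOY_DSR_FLOW) && !dsr;

    return cts_stop || dsr_stop;
}
```

With neither flag set the line states are ignored, which matches the existing behaviour for ports that do not request hardware flow control.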
Re: [PATCH] debloat aic7xxx and aic79xx drivers by deinlining
Jan Engelhardt wrote:
> On Aug 30 2007 13:02, Matthew Wilcox wrote:
>>> Well, you can send it to Linus/Andrew, that will usually upset people and they
>>> start commenting on it. Or they don't, and everything is fine.
>>> (The "default y" approach so to speak ;-)
>> The problem is that we don't really have a maintainer for the aic7xyz
>> drivers any more. Volunteers welcome. NOT IT!
>
> Take it before someone else does!
>
Well, the semi-official maintainers are James B. and me. So I might as well do it officially.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Re: Copy large memory regions from & to userspace
On Friday 31 August 2007 15:25:40 you wrote:
> On 8/30/07, Clemens Kolbitsch <[EMAIL PROTECTED]> wrote:
> > Hi!
> > Just a short question: What is the correct method of copying large areas
> > of memory from userspace into userspace when running in kernel-mode?
>
> relayfs?

no... I'm copying user-memory to user-memory, not kernel-to-user, however running the code in kernel-mode.

what i wanted to know is how to check the access-rights... i didn't get any other answers, so for now i'm just using

    if (access_ok(VERIFY_READ, from, PAGE_SIZE) &&
        access_ok(VERIFY_WRITE, to, PAGE_SIZE)) {
            memcpy(to, from, PAGE_SIZE);
    }

and hope that this is the *correct* way to do it...
Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
On Fri, Aug 31, 2007 at 08:39:49AM +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-30 at 23:43 -0500, Eric Sandeen wrote:
> > The xfs filesystem can exceed the current lockdep
> > MAX_LOCK_DEPTH, because when deleting an entire cluster of inodes,
> > they all get locked in xfs_ifree_cluster(). The normal cluster
> > size is 8192 bytes, and with the default (and minimum) inode size
> > of 256 bytes, that's up to 32 inodes that get locked. Throw in a
> > few other locks along the way, and 40 seems enough to get me through
> > all the tests in the xfsqa suite on 4k blocks. (block sizes
> > above 8K will still exceed this though, I think)
>
> As 40 will still not be enough for people with larger block sizes, this
> does not seems like a solid solution. Could XFS possibly batch in
> smaller (fixed sized) chunks, or does that have significant down sides?

The problem is not filesystem block size; it's the xfs inode cluster buffer size divided by the size of the inodes that determines the lock depth. The common case is 8k/256 = 32 inodes in a buffer, and they all get locked during inode cluster writeback.

This inode writeback clustering is one of the reasons XFS doesn't suffer from atime issues as much as other filesystems - it doesn't need to do as much I/O to write back dirty inodes to disk.

IOWs, we are not going to make the inode clusters smaller - if anything they are going to get *larger* in future so we do less I/O during inode writeback than we do now.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
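The arithmetic in this exchange can be written down as a small userspace sketch. The helper names are hypothetical, and the figure of 8 "other locks" is only an assumption chosen so that the 8k/256 case lands on the proposed limit of 40; the thread itself says only "a few other locks".

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of inodes that share one inode cluster buffer. */
static size_t inodes_per_cluster(size_t cluster_bytes, size_t inode_bytes)
{
    return cluster_bytes / inode_bytes;
}

/*
 * Rough peak number of simultaneously held locks: one per inode in the
 * cluster plus an assumed number of unrelated locks taken on the way.
 */
static size_t estimated_lock_depth(size_t cluster_bytes, size_t inode_bytes,
                                   size_t other_locks)
{
    return inodes_per_cluster(cluster_bytes, inode_bytes) + other_locks;
}

/* Does the estimate fit under a given lock-depth limit? */
static bool fits_lock_depth(size_t depth, size_t max_depth)
{
    return depth <= max_depth;
}
```

For example, an 8192-byte cluster of 256-byte inodes gives 32 inode locks, which fits under 40 with 8 others, while a 16384-byte cluster gives 64 and does not, which is the scaling concern raised above.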
Re: recent nfs change causes autofs regression
On Fri, 2007-08-31 at 15:12 +0200, Frank van Maarseveen wrote:
> IMHO I'd only consider returning EBUSY when trying to mount _exactly_
> the same directory with different flags, not for arbitrary subtrees. The
> client should preferably not be bothered with server side disk
> partitioning (at least not beyond the obvious such as df output).

That is utterly inconsistent and confusing too. If you have a filesystem "/foo" exported on the server "remote", then why should

    mount -oro remote:/foo
    mount -orw remote:/foo/a

be allowed, but

    mount -oro remote:/foo
    mount -orw remote:/foo

be forbidden? The caching problems are the same. Telling the admin that one is safe and the other is not, is just messing with his mind.

Trond
Re: [PATCH] trivial - constify sched.h
On Thu, Aug 30, 2007 at 10:55:49PM +0200, Jan Engelhardt wrote:
> "those callers". There was _exactly one_ caller, and that was an out-of-tree
> module. There were not any in-kernel callers before, and it did not generate
> any warning. That is perhaps why no one had constified it before me. This does
> not mean we should wait for a caller to pop up before constifying IMHO.

In this case we should just kill it instead of messing with constness.
Re: [ANNOUNCE/RFC] Really Fair Scheduler
On Fri, 2007-08-31 at 15:22 +0200, Roman Zippel wrote: > Hi, > > On Fri, 31 Aug 2007, Mike Galbraith wrote: > > > I plunked it into 2.6.23-rc4 to see how it reacts to various sleeper > > loads, and hit some starvation. If I got it in right (think so) there's > > a bug lurking somewhere. taskset -c 1 fairtest2 resulted in the below. > > It starts up running both tasks at about 60/40 for hog/sleeper, then > > after a short while goes nuts. The hog component eats 100% cpu and > > starves the sleeper (and events, forever). > > Thanks for testing, although your test program does nothing unusual here. > Can you please send me your .config? Attached. > Were there some kernel messages while running it? I didn't look actually, was in rather a hurry. I'll try it again tomorrow. -Mike # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23-rc3 # Fri Aug 31 09:04:00 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="-smp-r" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set CONFIG_AUDIT=y CONFIG_AUDITSYSCALL=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_CPUSETS is not set # CONFIG_SYSFS_DEPRECATED is not set CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" # 
CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLAB=y # CONFIG_SLUB is not set # CONFIG_SLOB is not set CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_KMOD=y CONFIG_STOP_MACHINE=y CONFIG_BLOCK=y CONFIG_LBD=y # CONFIG_BLK_DEV_IO_TRACE is not set CONFIG_LSF=y # CONFIG_BLK_DEV_BSG is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_SMP=y CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set # CONFIG_MCORE2 is not set CONFIG_MPENTIUM4=y # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not 
set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=7 CONFIG_X86_XADD=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y CONFIG_NR_CPUS=2 CONFIG_SCHED_SMT=y CONFIG_SCHED_MC=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_PREEMPT_BKL=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_NONFATAL=y CONFIG_X86_MCE_P4THERMAL=y CONFIG_VM86=y # CONFIG_TOSHIBA is not set # CO
Re: [PATCH 5/5] Net: ath5k, kconfig changes
Dan Williams wrote: On Thu, 2007-08-30 at 08:36 -0400, John W. Linville wrote: On Thu, Aug 30, 2007 at 04:38:09AM +0300, Nick Kossifidis wrote: 2007/8/28, Christoph Hellwig <[EMAIL PROTECTED]>: Also this whole patch seems rather pointless. It saves only very little and turns the driver into a complete ifdef maze. Also most people will use 5212 code only, 5211 cards are on some old laptops and 5210, well i couldn't even find a 5210 for actual testing :P FWIW, I'd bet dollars to donuts that distros will enable them all together. I would certainly _hope_ that distros enable everything -that is in the kernel- that they can get their hands on, otherwise when you stick a card in, it doesn't just work. Distros definitely -do not- do this. Plenty of ancient ISA drivers are disabled at build time, for example, in many distros. Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/2] IB/ehca: Fixes for rc5
These two patches fix some ehca issues that should be fixed in 2.6.23.

[1/2] fixes regressions caused by the recent addition of Small QPs.
[2/2] adds missing SRQ-related functionality that would have broken IPoIB CM.

The patches should apply cleanly, in order, against Roland's git. Please review the changes and apply the patches for 2.6.23-rc5 if they are okay.

Regards,
Joachim
--
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/2] IB/ehca: fix Small QP regressions
From: Stefan Roscher <[EMAIL PROTECTED]> The new Small QP code had a few bugs that would also trigger for non-Small QPs. Fix them. Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]> --- drivers/infiniband/hw/ehca/ehca_qp.c | 10 ++ drivers/infiniband/hw/ehca/ipz_pt_fn.c |2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index b178cba..84d435a 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -600,10 +600,12 @@ static struct ehca_qp *internal_create_qp( if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap) && !(context && udata)) { /* no small QP support in userspace ATM */ - ehca_determine_small_queue( - &parms.squeue, max_send_sge, is_llqp); - ehca_determine_small_queue( - &parms.rqueue, max_recv_sge, is_llqp); + if (HAS_SQ(my_qp)) + ehca_determine_small_queue( + &parms.squeue, max_send_sge, is_llqp); + if (HAS_RQ(my_qp)) + ehca_determine_small_queue( + &parms.rqueue, max_recv_sge, is_llqp); parms.qp_storage = (parms.squeue.is_small || parms.rqueue.is_small); } diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c index a090c67..29bd476 100644 --- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c @@ -172,7 +172,7 @@ static void free_small_queue_page(struct ipz_queue *queue, struct ehca_pd *pd) unsigned long bit; int free_page = 0; - bit = ((unsigned long)queue->queue_pages[0] & PAGE_MASK) + bit = ((unsigned long)queue->queue_pages[0] & ~PAGE_MASK) >> (order + 9); mutex_lock(&pd->lock); -- 1.5.2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
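The one-character fix in ipz_pt_fn.c is easier to see with the masks spelled out. In the kernel, PAGE_MASK is ~(PAGE_SIZE - 1), so `addr & PAGE_MASK` keeps the page base and `addr & ~PAGE_MASK` keeps the offset inside the page. The userspace sketch below assumes a 4 KiB page purely for illustration (the hardware this driver targets may use a different page size), and all `TOY_`/`toy_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed 4 KiB page, for illustration only. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  ((uintptr_t)1 << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK  (~(TOY_PAGE_SIZE - 1))

/* Address of the start of the page containing addr. */
static uintptr_t toy_page_base(uintptr_t addr)
{
    return addr & TOY_PAGE_MASK;
}

/* Byte offset of addr within its page. */
static uintptr_t toy_page_offset(uintptr_t addr)
{
    return addr & ~TOY_PAGE_MASK;
}

/* Index of the small-queue slot within the page, as in the fixed code. */
static unsigned long toy_slot_fixed(uintptr_t addr, unsigned int order)
{
    return (unsigned long)(toy_page_offset(addr) >> (order + 9));
}

/* The pre-fix expression: shifts the page base, giving a bogus index. */
static unsigned long toy_slot_buggy(uintptr_t addr, unsigned int order)
{
    return (unsigned long)(toy_page_base(addr) >> (order + 9));
}
```

With the offset, the result stays within the handful of slots a page can hold; with the page base, it is a large number unrelated to the slot, which is why the wrong bit would be computed when freeing a small queue page.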
Re: [PATCH] trivial - constify sched.h
On Fri, Aug 31, 2007 at 02:53:23PM +0100, Christoph Hellwig wrote:
> On Thu, Aug 30, 2007 at 10:55:49PM +0200, Jan Engelhardt wrote:
> > "those callers". There was _exactly one_ caller, and that was an out-of-tree
> > module. There were not any in-kernel callers before, and it did not generate
> > any warning. That is perhaps why no one had constified it before me. This does
> > not mean we should wait for a caller to pop up before constifying IMHO.
>
> In this case we should just kill it instead of messing with constness.

I think Jan mis-spoke -- there were no in-kernel callers calling it with a const argument.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step."
[PATCH 2/2] IB/ehca: SRQ fixes to enable IPoIB CM
a) Report max_srq > 0 if SRQ is supported b) Report "last wqe reached" event when base QP dies Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]> --- drivers/infiniband/hw/ehca/ehca_hca.c | 10 +-- drivers/infiniband/hw/ehca/ehca_irq.c | 48 +--- 2 files changed, 38 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c index fc19ef9..cf22472 100644 --- a/drivers/infiniband/hw/ehca/ehca_hca.c +++ b/drivers/infiniband/hw/ehca/ehca_hca.c @@ -93,9 +93,13 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props) props->max_pd = min_t(int, rblock->max_pd, INT_MAX); props->max_ah = min_t(int, rblock->max_ah, INT_MAX); props->max_fmr = min_t(int, rblock->max_mr, INT_MAX); - props->max_srq = 0; - props->max_srq_wr = 0; - props->max_srq_sge = 0; + + if (EHCA_BMASK_GET(HCA_CAP_SRQ, shca->hca_cap)) { + props->max_srq = props->max_qp; + props->max_srq_wr = props->max_qp_wr; + props->max_srq_sge = 3; + } + props->max_pkeys = 16; props->local_ca_ack_delay = rblock->local_ca_ack_delay; diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c index ee06d8b..a925ea5 100644 --- a/drivers/infiniband/hw/ehca/ehca_irq.c +++ b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -175,41 +175,55 @@ error_data1: } -static void qp_event_callback(struct ehca_shca *shca, u64 eqe, - enum ib_event_type event_type, int fatal) +static void dispatch_qp_event(struct ehca_shca *shca, struct ehca_qp *qp, + enum ib_event_type event_type) { struct ib_event event; - struct ehca_qp *qp; - u32 token = EHCA_BMASK_GET(EQE_QP_TOKEN, eqe); - - read_lock(&ehca_qp_idr_lock); - qp = idr_find(&ehca_qp_idr, token); - read_unlock(&ehca_qp_idr_lock); - - - if (!qp) - return; - - if (fatal) - ehca_error_data(shca, qp, qp->ipz_qp_handle.handle); event.device = &shca->ib_device; + event.event = event_type; if (qp->ext_type == EQPT_SRQ) { if (!qp->ib_srq.event_handler) return; - event.event = fatal ? 
IB_EVENT_SRQ_ERR : event_type; event.element.srq = &qp->ib_srq; qp->ib_srq.event_handler(&event, qp->ib_srq.srq_context); } else { if (!qp->ib_qp.event_handler) return; - event.event = event_type; event.element.qp = &qp->ib_qp; qp->ib_qp.event_handler(&event, qp->ib_qp.qp_context); } +} + +static void qp_event_callback(struct ehca_shca *shca, u64 eqe, + enum ib_event_type event_type, int fatal) +{ + struct ehca_qp *qp; + u32 token = EHCA_BMASK_GET(EQE_QP_TOKEN, eqe); + + read_lock(&ehca_qp_idr_lock); + qp = idr_find(&ehca_qp_idr, token); + read_unlock(&ehca_qp_idr_lock); + + if (!qp) + return; + + if (fatal) + ehca_error_data(shca, qp, qp->ipz_qp_handle.handle); + + dispatch_qp_event(shca, qp, fatal && qp->ext_type == EQPT_SRQ ? + IB_EVENT_SRQ_ERR : event_type); + + /* +* eHCA only processes one WQE at a time for SRQ base QPs, +* so the last WQE has been processed as soon as the QP enters +* error state. +*/ + if (fatal && qp->ext_type == EQPT_SRQBASE) + dispatch_qp_event(shca, qp, IB_EVENT_QP_LAST_WQE_REACHED); return; } -- 1.5.2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
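The event selection that the reworked qp_event_callback() performs can be summarised in a userspace model. This is a sketch of the decision logic as read from the diff above, not driver code: the `toy_` enums and `toy_events_for` are hypothetical, and the NULL event-handler checks of the real code are omitted.

```c
#include <stddef.h>

/* Hypothetical mirrors of EQPT_NORMAL-style QP kinds and IB event types. */
enum toy_qp_type { TOY_QP_NORMAL, TOY_QP_SRQ, TOY_QP_SRQBASE };
enum toy_event { TOY_EV_QP_FATAL, TOY_EV_SRQ_ERR, TOY_EV_LAST_WQE_REACHED };

/*
 * Fills out[] with the events that would be dispatched, in order, and
 * returns how many there are (1 or 2).
 */
static size_t toy_events_for(enum toy_qp_type type, enum toy_event ev,
                             int fatal, enum toy_event out[2])
{
    size_t n = 0;

    /* A fatal error on an SRQ is reported as an SRQ error. */
    out[n++] = (fatal && type == TOY_QP_SRQ) ? TOY_EV_SRQ_ERR : ev;

    /* A dying SRQ base QP additionally reports "last WQE reached". */
    if (fatal && type == TOY_QP_SRQBASE)
        out[n++] = TOY_EV_LAST_WQE_REACHED;

    return n;
}
```

Splitting dispatch from lookup, as the patch does with dispatch_qp_event(), is what makes the second event for SRQ base QPs a one-line addition.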