Kernel BUG in khubd with 2.4.30 x86_64
I'm using a Dell PowerEdge 2850 with dual 3.6GHz Xeon EM64T CPUs.

Using a vanilla 2.4.30 SMP x86_64 kernel, when I try to modprobe usb-uhci I get:

kernel BUG in header file at line 160
Kernel BUG at panic:149
invalid operand:

dmesg and ksymoops output below.

Thanks

James Pearson

dmesg:

usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.275 $ time 17:44:32 Apr 17 2005
usb-uhci.c: High bandwidth mode enabled
PCI: Setting latency timer of device 00:1d.0 to 64
usb-uhci.c: USB UHCI at I/O 0x9ce0, IRQ 16
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Setting latency timer of device 00:1d.1 to 64
usb-uhci.c: USB UHCI at I/O 0x9cc0, IRQ 19
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Setting latency timer of device 00:1d.2 to 64
usb-uhci.c: USB UHCI at I/O 0x9ca0, IRQ 18
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 3
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
hub.c: new USB device 00:1d.0-1, assigned address 2
kernel BUG in header file at line 160
Kernel BUG at panic:149
invalid operand:
CPU 1
Pid: 1380, comm: khubd Not tainted
RIP: 0010:[]
RSP: 0018:01023c3ddcd8 EFLAGS: 00010016
RAX: 0026 RBX: 01023e8f6880 RCX:
RDX: 01023e163f08 RSI: 803e4000 RDI:
RBP: 01023eac6400 R08: R09: 000d
R10: R11: R12:
R13: 0002 R14: 01023c4dcf80 R15:
FS: () GS:803daa00() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 004031a0 CR3: 0d674000 CR4: 06a0
Process khubd (pid: 1380, stackpage=1023c3dd000)
Stack: 01023c3ddcd8 0018 80121ae4 01023eac6400 a013b8c4 01023c3ddd28 0202 01023d641200 01f4 01023e8f6880 01023c3ddd48 01023c3ddd68
Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] []
Code: 0f 0b 02 79 29 80 ff ff ff ff 95 00 eb fe 90 90 90 90 90 90
RIP [] RSP <01023c3ddcd8>

ksymoops:
ksymoops 2.4.11 on x86_64 2.4.30. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.30/ (default)
     -m /boot/System.map-2.4.30 (specified)

Error (expand_objects): cannot stat(/lib/xfs.o) for xfs
Error (expand_objects): cannot stat(/lib/raid1.o) for raid1
Error (expand_objects): cannot stat(/lib/mptscsih.o) for mptscsih
Error (expand_objects): cannot stat(/lib/mptbase.o) for mptbase
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
SGI XFS with no debug enabled
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
kernel BUG in header file at line 160
Kernel BUG at panic:149
invalid operand:
CPU 1
Pid: 1380, comm: khubd Not tainted
RIP: 0010:[]
Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
RSP: 0018:01023c3ddcd8 EFLAGS: 00010016
RAX: 0026 RBX: 01023e8f6880 RCX:
RDX: 01023e163f08 RSI: 803e4000 RDI:
RBP: 01023eac6400 R08: R09: 000d
R10: R11: R12:
R13: 0002 R14: 01023c4dcf80 R15:
FS: () GS:803daa00() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 004031a0 CR3: 0d674000 CR4: 06a0
Process khubd (pid: 1380, stackpage=1023c3dd000)
Stack: 01023c3ddcd8 0018 80121ae4 01023eac6400 a013b8c4 01023c3ddd28 0202 01023d641200 01f4 01023e8f6880 01023c3ddd48 01023c3ddd68
Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] []
Code: 0f 0b 02 79 29 80 ff ff ff ff 95 00 eb fe 90 90 90 90 90 90

>>RIP; 80121ae4 <__out_of_line_bug+14/30> <=
>>RSI; 803e4000

Trace; 80121ae4 <__out_of_line_bug+14/30>
Trace; a013b8c4 <
Re: Kernel BUG in khubd with 2.4.30 x86_64
I've worked out what the problem is: this machine has more than 4GB of memory and I didn't have IOMMU support compiled in. After rebuilding the kernel with IOMMU support enabled, the problem goes away.

James Pearson

James Pearson wrote:
> I'm using a Dell PowerEdge 2850 with dual 3.6GHz Xeon EM64T CPUs.
>
> Using a vanilla 2.4.30 SMP x86_64 kernel, when I try to modprobe usb-uhci I get:
>
> kernel BUG in header file at line 160
> Kernel BUG at panic:149
> invalid operand:
>
> dmesg and ksymoops output below.
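To illustrate why the amount of installed memory can matter here, the following is a minimal sketch, not taken from the thread. It assumes the USB controller can only generate 32-bit DMA addresses (the messages above do not state this), so a buffer that lands above 4GB is unreachable unless something such as an IOMMU remaps it. The function name and the mask constant are hypothetical.

```python
# Sketch only: models the address check, not any real kernel interface.
# Assumption: the device can only address the low 4GB (32-bit DMA mask).
DMA_32BIT_MASK = 0xFFFFFFFF


def dma_reachable(phys_addr, length, dma_mask=DMA_32BIT_MASK):
    """Return True if the whole buffer lies within the device's DMA mask."""
    if length <= 0:
        raise ValueError("length must be positive")
    last_byte = phys_addr + length - 1
    return last_byte <= dma_mask


GIB = 1 << 30
print(dma_reachable(1 * GIB, 4096))  # buffer placed below the 4GB line
print(dma_reachable(5 * GIB, 4096))  # buffer placed above the 4GB line
```

On a machine with less than 4GB of RAM every buffer passes this check, which would be consistent with the problem only showing up on larger machines.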
Re: 4096 byte limit to /proc/PID/environ ?
H. Peter Anvin wrote:
> Guy Streeter wrote:
> > On 6/1/06, James Pearson <[EMAIL PROTECTED]> wrote:
> > > H. Peter Anvin wrote:
> > > > I think this is the wrong approach. Many of these should probably be converted to seq_file, but in the particular case of environ, the right approach is to observe the fact that reading environ is just like reading /proc/PID/mem, except:
> > > >
> > > > a. the access restrictions are less strict, and
> > > > b. there is a range restriction, which needs to be enforced, and
> > > > c. there is an offset.
> > > >
> > > > Pretty much, take the guts from /proc/PID/mem and generalize it slightly, and you have the code that can run either /proc/PID/mem or /proc/PID/environ.
> > >
> > > The following patch is based on the /proc/PID/mem code and appears to work fine.
> >
> > This thread has gone stale. The PAGE_SIZE limit still exists. Is this solution acceptable?
>
> Can we avoid the code duplication?

There isn't that much that is duplicated - and there are also bits of the /proc/PID/mem code that are not needed in this case, so I'm not really sure if it is worth doing.

I did submit a patch a few months ago - see:

<http://marc.info/?l=linux-kernel&m=117862109623007&w=2>

James Pearson

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
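The "range restriction" and "offset" points in the quoted suggestion can be sketched in user-space terms as follows. This is only an illustration of the arithmetic, with hypothetical names; it is not kernel code and it ignores the access checks entirely.

```python
def clamp_read(env_start, env_end, offset, count):
    """Map a file read (offset, count) onto the address range [env_start, env_end).

    Returns (address, length); length is 0 once the offset is at or past the end.
    """
    if offset < 0 or count < 0:
        raise ValueError("offset and count must be non-negative")
    addr = env_start + offset      # (c) the file offset is relative to env_start
    remaining = env_end - addr     # (b) never read past env_end
    if remaining <= 0:
        return addr, 0
    return addr, min(count, remaining)


# A read that starts near the end of the range is shortened to fit.
print(clamp_read(0x1000, 0x3000, 0x1F00, 4096))
```

Reading /proc/PID/mem corresponds to the same mapping with no range restriction and a zero base, which is the generalization the suggestion describes.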
Re: 4096 byte limit to /proc/PID/environ ?
H. Peter Anvin wrote:
> Anton Arapov wrote:
> > Hey guys, the future of this patch is important for me. What do you think, has this patch any chances to be committed to upstream?
> >
> > James Pearson <[EMAIL PROTECTED]> writes:
> > > H. Peter Anvin wrote:
> > >
> > > There isn't that much that is duplicated - and there are also bits of the /proc/PID/mem code that are not needed in this case, so I'm not really sure if it is worth doing.
> > >
> > > I did submit a patch a few months ago - see:
> > >
> > > <http://marc.info/?l=linux-kernel&m=117862109623007&w=2>
>
> Looks reasonable to me, except for the one overlong line.

OK, here is the patch (without the long line) against 2.6.23-rc5 - what else needs to be done to get it committed?

James Pearson

--- ./fs/proc/base.c.dist	2007-09-01 07:08:24.0 +0100
+++ ./fs/proc/base.c	2007-09-05 14:08:15.762518000 +0100
@@ -199,27 +199,6 @@ static int proc_root_link(struct inode *
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len;
-
-		res = -ESRCH;
-		if (!ptrace_may_attach(task))
-			goto out;
-
-		len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-out:
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -658,6 +637,85 @@ static const struct file_operations proc
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file * file, char __user * buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+	size_t max_len;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_USER);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+	while (count > 0) {
+		int this_len, retval;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0) {
+			break;
+		}
+
+		if (this_len > max_len)
+			this_len = max_len;
+
+		retval = access_process_vm(task, (mm->env_start + src),
+			page, this_len, 0);
+
+		if (!ptrace_may_attach(task)) {
+			ret = -ESRCH;
+			break;
+		}
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static const struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -2048,7 +2106,7 @@ static const struct pid_entry tgid_base_
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo", S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 #ifdef CONFIG_SCHED_DEBUG
@@ -2335,7 +2393,7 @@ out_no_task:
 static const struct pid_entry tid_base_stuff[] = {
 	DIR("fd",S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo",S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ", S_IRUSR, pid_environ),
+	REG("environ", S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status",S_IRUGO, pid_status),
 #ifdef CONFIG_SCHED_DEBUG
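The read loop in the patch can be modelled in user space to see how a large environment is returned in page-sized pieces. The sketch below is an illustration only: the names are hypothetical, a bytes object stands in for the target process's memory, and a slice stands in for access_process_vm() plus copy_to_user(). One deliberate difference: as far as can be told from the diff as posted, max_len is computed once before the loop, so a request a little larger than one page (say 5000 bytes) would take a full page again on the second pass. The model instead clamps against the remaining count on every iteration.

```python
PAGE_SIZE = 4096  # assumption: 4KB pages, as in the subject of the thread


def environ_read_model(env, pos, count):
    """Model one read() call: return (data, new_pos)."""
    out = bytearray()
    while count > 0:
        remaining = len(env) - pos          # env_end - (env_start + src)
        if remaining <= 0:
            break
        # Clamp to the end of the range, the caller's buffer, and one page.
        this_len = min(remaining, count, PAGE_SIZE)
        chunk = env[pos:pos + this_len]     # stands in for access_process_vm()
        out += chunk
        pos += len(chunk)
        count -= len(chunk)
    return bytes(out), pos


def read_all(env, bufsize=1024):
    """Call the model repeatedly, as cat(1) would, until it returns no data."""
    pos, parts = 0, []
    while True:
        data, pos = environ_read_model(env, pos, bufsize)
        if not data:
            break
        parts.append(data)
    return b"".join(parts)


env = bytes(range(256)) * 40                # 10240 bytes, more than two pages
print(len(read_all(env)))
```

With the old proc_pid_environ() the equivalent of read_all() would stop after the first 4096 bytes, which is the truncation the thread is about.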
Re: 4096 byte limit to /proc/PID/environ ?
Randy Dunlap wrote:
> > OK, here is the patch (without the long line) against 2.6.23-rc5 - what else needs to be done to get it committed?
>
> Hi,
>
> a. It needs a changelog that describes the problem and the patch.
>
> b. It needs to apply cleanly to a current kernel. (It does not apply cleanly now due to some odd line breaks [see #1 below].)
>
> c. It needs to use tabs instead of spaces. That will probably help on item b as well.
>
> linux-2.6.23-rc5> dryrun < ~/fs-proc-read-sizes.patch
> 4 out of 4 hunks FAILED -- saving rejects to file fs/proc/base.c.rej

I think a 'cut-n-paste' to my mail app mangled the patch - I'll re-submit it 'cleanly' ...

Thanks

James Pearson
Re: 4096 byte limit to /proc/PID/environ ?
Alexey Dobriyan wrote:
> On Wed, Sep 05, 2007 at 06:00:57PM +0100, James Pearson wrote:
> > H. Peter Anvin wrote:
> > > Anton Arapov wrote:
> > > > Hey guys, the future of this patch is important for me. What do you think, has this patch any chances to be committed to upstream?
> > > >
> > > > James Pearson <[EMAIL PROTECTED]> writes:
> > > > > H. Peter Anvin wrote:
> > > > >
> > > > > There isn't that much that is duplicated - and there are also bits of the /proc/PID/mem code that are not needed in this case, so I'm not really sure if it is worth doing.
> > > > >
> > > > > I did submit a patch a few months ago - see:
> > > > >
> > > > > <http://marc.info/?l=linux-kernel&m=117862109623007&w=2>
> > >
> > > Looks reasonable to me, except for the one overlong line.
> >
> > OK, here is the patch (without the long line) against 2.6.23-rc5 - what else needs to be done to get it committed?
>
> Remove duplicate ptrace_may_attach() checks, unnecessary (), {} and spaces before pointer names -- char *buf.

environ_read() in the patch uses ptrace_may_attach() in the same way as mem_read() does.

Given that environ_read() is based on mem_read(), does this mean that duplicate ptrace_may_attach() checks need to be removed from mem_read() as well?

Which ptrace_may_attach() needs to be removed?

Thanks

James Pearson
Re: 4096 byte limit to /proc/PID/environ ?
H. Peter Anvin wrote:
>
> Right, also please use checkpatch.pl.
>

OK - how about:

/proc/PID/environ currently truncates at 4096 characters, patch based on the /proc/PID/mem code.

Patch against 2.6.23-rc5

Signed-off-by: James Pearson <[EMAIL PROTECTED]>

--- ./fs/proc/base.c.dist	2007-09-01 07:08:24.0 +0100
+++ ./fs/proc/base.c	2007-09-06 14:29:46.413680554 +0100
@@ -199,27 +199,6 @@ static int proc_root_link(struct inode *
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len;
-
-		res = -ESRCH;
-		if (!ptrace_may_attach(task))
-			goto out;
-
-		len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-out:
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -658,6 +637,79 @@ static const struct file_operations proc
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file *file, char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+	size_t max_len;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_USER);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+	while (count > 0) {
+		int this_len, retval;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0)
+			break;
+
+		if (this_len > max_len)
+			this_len = max_len;
+
+		retval = access_process_vm(task, (mm->env_start + src),
+			page, this_len, 0);
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static const struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -2048,7 +2100,7 @@ static const struct pid_entry tgid_base_
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo", S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 #ifdef CONFIG_SCHED_DEBUG
@@ -2335,7 +2387,7 @@ out_no_task:
 static const struct pid_entry tid_base_stuff[] = {
 	DIR("fd",S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo",S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ", S_IRUSR, pid_environ),
+	REG("environ", S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status",S_IRUGO, pid_status),
 #ifdef CONFIG_SCHED_DEBUG
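From user space, the effect of the truncation described in the changelog is easy to see, because /proc/PID/environ is a sequence of NUL-terminated NAME=value entries. The following sketch parses such a buffer and shows what is lost when only the first 4096 bytes are available. The helper name is hypothetical, and the sample data is made up for the illustration.

```python
def parse_environ(raw):
    """Parse NUL-separated NAME=value entries into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue                      # empty piece after the final NUL
        name, sep, value = entry.partition(b"=")
        if not sep:
            continue                      # skip entries without an '='
        env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


# Made-up environment: one long variable followed by a short one.
raw = b"LONG=" + b"a" * 5000 + b"\0SHORT=1\0"

full = parse_environ(raw)
truncated = parse_environ(raw[:4096])     # what a 4096-byte limit would return

print(sorted(full))
print(sorted(truncated), len(truncated["LONG"]))
```

In the truncated case the long value is cut short and every later variable is missing, with no indication to the reader that anything was dropped. On a real system the input would come from something like open("/proc/self/environ", "rb").read().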
Understanding cpufreq?
I have a number of dual CPU and dual CPU/dual core Opteron systems that are used as compute servers. In an effort to reduce power consumption and heat output, I would like to make use of the PowerNow! capabilities to clock back the CPUs when the machines are idle.

These machines are running a 2.6.9-42 RHEL4 kernel with the powernow-k8 module loaded - which I believe has cpufreq support backported from more recent mainline kernels.

In trying to achieve what I want, I've become rather confused as to how cpufreq works in a multi-CPU environment:

There is a directory under /sys/devices/system/cpu/cpu*/cpufreq for each CPU, which seems to imply that each CPU's speed can be controlled separately - can this really be the case? Can separate CPU cores run at different speeds?

e.g. I can echo 4 different governor names to the scaling_governor file in each /sys/devices/system/cpu/cpu[0-3]/cpufreq directory on a 4 core machine - and the resulting scaling_cur_freq files can contain different values. However, the "cpu MHz" fields in /proc/cpuinfo are the same for each CPU - I assume the values in /proc/cpuinfo are the 'correct' values?

Also, if I set all the governors to userspace, and then set each CPU's speed via scaling_setspeed to a different (allowed) value, then it appears quite random as to which value is then reflected in /proc/cpuinfo, i.e. sometimes it will take the value given to CPU 0, other times it will be CPU 1 etc.

If I set all the governors to ondemand, the CPUs will, from time to time, clock back their speed in situations where one or more CPUs are being heavily used, i.e. it appears that each CPU is treated separately, and if one CPU is deemed to be idle enough by its given metrics, then it can reduce the speed of all CPUs, regardless of other CPUs being 'busy' ...

I've also tried a couple of userspace daemons (cpuspeed and powernowd) - again, these treat each CPU separately and will also reduce the speed of an 'idle' CPU - and hence reduce the speed of all the CPUs, again, regardless of other CPUs being 'busy'.

Essentially what I want to achieve is something like: if _any_ CPU is 'busy' (usage over some threshold over some sampling period), then run at full speed, and if _all_ CPUs are 'idle' (all below some threshold over some sampling period), then clock back the CPUs.

Is there something/some setting(s) that can do this in a multi-CPU machine?

Thanks

James Pearson
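The policy asked for at the end of the message can be written down as a small decision function. This is only a sketch of the rule as stated, not of any existing governor or daemon: the function name, the threshold values and the "max"/"min" labels are all assumptions made for the illustration, and the loads are taken to be per-CPU utilisation fractions averaged over one sampling period.

```python
def choose_speed(cpu_loads, busy_threshold=0.8, idle_threshold=0.2, current="max"):
    """Pick one speed for the whole machine from per-CPU loads (0.0 to 1.0).

    - any CPU busy   -> run everything at full speed
    - all CPUs idle  -> clock everything back
    - otherwise      -> keep the current setting (simple hysteresis)
    """
    if not cpu_loads:
        raise ValueError("need at least one CPU load sample")
    if any(load >= busy_threshold for load in cpu_loads):
        return "max"
    if all(load <= idle_threshold for load in cpu_loads):
        return "min"
    return current


print(choose_speed([0.05, 0.95, 0.02, 0.01]))   # one busy CPU keeps full speed
print(choose_speed([0.05, 0.10, 0.02, 0.01]))   # all idle, clock back
```

The key difference from the per-CPU behaviour described above is that the decision is made once from all the samples together, so a single idle CPU can never pull the others down.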
[PATCH] Don't truncate /proc/PID/environ at 4096 characters
/proc/PID/environ currently truncates at 4096 characters, patch based on the /proc/PID/mem code.

Signed-off-by: James Pearson <[EMAIL PROTECTED]>

--- ./fs/proc/base.c.dist	2007-04-26 04:08:32.0 +0100
+++ ./fs/proc/base.c	2007-04-27 16:32:44.277664457 +0100
@@ -196,22 +196,6 @@
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-		if (!ptrace_may_attach(task))
-			res = -ESRCH;
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -653,6 +637,84 @@
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file * file, char __user * buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+	size_t max_len;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_USER);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+	while (count > 0) {
+		int this_len, retval;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0) {
+			break;
+		}
+
+		if (this_len > max_len)
+			this_len = max_len;
+
+		retval = access_process_vm(task, (mm->env_start + src), page, this_len, 0);
+
+		if (!ptrace_may_attach(task)) {
+			ret = -ESRCH;
+			break;
+		}
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -1831,7 +1893,7 @@
 static struct pid_entry tgid_base_stuff[] = {
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 	INF("cmdline",S_IRUGO, pid_cmdline),
@@ -2113,7 +2175,7 @@
  */
 static struct pid_entry tid_base_stuff[] = {
 	DIR("fd",S_IRUSR|S_IXUSR, fd),
-	INF("environ", S_IRUSR, pid_environ),
+	REG("environ", S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status",S_IRUGO, pid_status),
 	INF("cmdline", S_IRUGO, pid_cmdline),
Re: [PATCH] Don't truncate /proc/PID/environ at 4096 characters
Eric Dumazet wrote:
> On Fri, 04 May 2007 15:30:57 +0100
> James Pearson <[EMAIL PROTECTED]> wrote:
>
> > /proc/PID/environ currently truncates at 4096 characters, patch based on the /proc/PID/mem code.
> >
> > Signed-off-by: James Pearson <[EMAIL PROTECTED]>
>
> What about latency when reading one *big* environ ?
> Don't we need some cond_resched() test in the loop ?

environ_read() only ever reads a maximum of PAGE_SIZE bytes in the loop, so would this make any difference when reading a big environ?

> > +
> > +static struct file_operations proc_environ_operations = {
> > +	.read = environ_read,
> > +};
> > +
>
> Please use the const qualifier here : static const struct file_operations ...

Thanks - fixed:

Signed-off-by: James Pearson <[EMAIL PROTECTED]>

--- ./fs/proc/base.c.dist	2007-04-26 04:08:32.0 +0100
+++ ./fs/proc/base.c	2007-04-27 16:32:44.277664457 +0100
@@ -196,22 +196,6 @@
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-		if (!ptrace_may_attach(task))
-			res = -ESRCH;
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -653,6 +637,84 @@
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file * file, char __user * buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+	size_t max_len;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_USER);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+	while (count > 0) {
+		int this_len, retval;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0) {
+			break;
+		}
+
+		if (this_len > max_len)
+			this_len = max_len;
+
+		retval = access_process_vm(task, (mm->env_start + src), page, this_len, 0);
+
+		if (!ptrace_may_attach(task)) {
+			ret = -ESRCH;
+			break;
+		}
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static const struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -1831,7 +1893,7 @@
 static struct pid_entry tgid_base_stuff[] = {
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 	INF("cmdline",S_IRUGO, pid_cmdline),
@@ -2113,7 +2175,7 @@
  */
 static struct pid_entry tid_base_stuff[] = {
 	DIR("fd",S_IRUSR|S_IXUSR, fd),
-	INF("environ", S_IRUSR, pid_environ),
+	REG("environ", S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status",S_IRUGO, pid_status),
 	INF("cmdline", S_IRUGO, pid_cmdline),
X display shift with disabled console blanking
I have a problem whereby the X display 'shifts' to the left when anything writes to /dev/console - where console screen blanking has been disabled, i.e. doing something like:

boot to run level 3

If not root, then make sure /dev/console is writeable

login and type:

setterm -blank 0

start X

type into an xterm:

echo "some random text" > /dev/console

(may have to repeat the echo above a few times)

... and the whole X display jumps (and wraps) to the left

I'm using a RHEL4 based distro with a vanilla 2.6.21 x86_64 kernel (although I've seen the problem with various x86_64 and i686 2.6.X kernels).

I've seen this problem on a number of different nVidia cards - using the vesa driver (same problem occurs with nVidia's binary driver). I haven't tried using other makes of graphics cards.

OK, this may be a strange combination of disabling the text console blanking and running X, but something isn't right somewhere ...

Any ideas?

Thanks

James Pearson
Re: X display shift with disabled console blanking
Antonino A. Daplas wrote:
> On Fri, 2007-04-27 at 18:08 +0100, James Pearson wrote:
> > I have a problem whereby the X display 'shifts' to left when anything writes to /dev/console - where console screen blanking has been disabled i.e. doing something like:
> >
> > boot to run level 3
> >
> > If not root, then make sure /dev/console is writeable
> >
> > login and type:
> >
> > setterm -blank 0
> >
> > start X
> >
> > type into an xterm:
> >
> > echo "some random text" > /dev/console
> >
> > (may have to repeat the echo above a few times)
> >
> > ... and the whole X display jumps (and wraps) to the left
> >
> > I'm using a RHEL4 based distro with a vanilla 2.6.21 x86_64 kernel (although I've seen the problem with various x86_64 and i686 2.6.X kernels).
> >
> > I've seen this problem on a number of different nVidia cards - using the vesa driver (same problem occurs with nVidia's binary driver). I haven't tried using other makes of graphics cards.
> >
> > OK, this may be a strange combination of disabling the text console blanking and running X, but something isn't right somewhere ...
>
> Yep, it's strange because I can't reproduce this. And the console write should not succeed if the current console is in KD_GRAPHICS mode, which is done by X (unless your version is different).

I've just installed a vanilla CentOS 4.4 on an i686 SMP machine - with an nVidia Quadro4 980 XGL card. By default, this sets up X using the 'nv' driver (using RedHat's xorg-x11-6.8.2-1.EL.13.37).

If I follow my 'recipe' above, then the screen shifts - note: it looks like you have to write several lines of text to /dev/console (at least 30) to trigger the problem (e.g. run the echo to /dev/console in a loop) - also, I've found that switching to the console and back to X (Ctrl-Alt-F1 then Ctrl-Alt-F7) while this echo loop is running can force the shift to start ...

This is with the RedHat based 2.6.9-42.ELsmp kernel - but I also get the problem with a vanilla 2.6.21 kernel.

> > Any ideas?
>
> I don't. But, what is your current console? Is it VGA, or framebuffer? Can you try doing this again in both VGA and vesafb?

I'm not sure what the current console is - whatever is the default with RHEL4/CentOS4 - how do I select a different type of console?

> And this does not happen if there is no previous setterm -blank 0 command?

It doesn't happen if there is no previous 'setterm -blank 0' - so, arguably, this is the 'fix' ...

James Pearson
Re: X display shift with disabled console blanking
Antonino A. Daplas wrote:
> On Mon, 2007-04-30 at 13:58 +0100, James Pearson wrote:
>> Antonino A. Daplas wrote:
>>> On Fri, 2007-04-27 at 18:08 +0100, James Pearson wrote:
>>>
>>>> I have a problem whereby the X display 'shifts' to left when
>>>> anything writes to /dev/console - where console screen blanking has
>>>> been disabled i.e. doing something like:
>>>>
>>>>   boot to run level 3
>>>>   If not root, then make sure /dev/console is writeable
>>>>   login and type: setterm -blank 0
>>>>   start X
>>>>   type into an xterm: echo "some random text" > /dev/console
>>>>   (may have to repeat the echo above a few times)
>>>>
>>>> ... and the whole X display jumps (and wraps) to the left
>>>>
>>>> I'm using a RHEL4 based distro with a vanilla 2.6.21 x86_64 kernel
>>>> (although I've seen the problem with various x86_64 and i686 2.6.X
>>>> kernels).
>>>>
>>>> I've seen this problem on a number of different nVidia cards -
>>>> using the vesa driver (same problem occurs with nVidia's binary
>>>> driver). I haven't tried using other makes of graphics cards.
>>>>
>>>> OK, this may be a strange combination of disabling the text console
>>>> blanking and running X, but something isn't right somewhere ...
>>>
>>> Yep, it's strange because I can't reproduce this. And the console
>>> write should not succeed if the current console is in KD_GRAPHICS
>>> mode, which is done by X (unless your version is different).
>>
>> I've just installed a vanilla CentOS 4.4 on an i686 SMP machine -
>> with an nVidia Quadro4 980 XGL card. By default, this sets up X using
>> the 'nv' driver (using RedHat's xorg-x11-6.8.2-1.EL.13.37).
>>
>> If I follow my 'recipe' above, then the screen shifts - note: it
>> looks like you have to write several lines of text to /dev/console
>> (at least 30) to trigger the problem (e.g. run the echo to
>> /dev/console in a loop) - also, I've found that switching to the
>> console and back to X (Ctrl-Alt-F1 then Ctrl-Alt-F7) while this echo
>> loop is running can force the shift to start ...
>
> I would understand that switching from text to graphics and vice versa
> can trigger display problems (it shouldn't, but it happens), but not
> while you are only echoing text to the system console in graphics
> mode.

It looks like switching from graphics -> text -> graphics definitely
plays a part in the problem ... see below

>> This is with the RedHat based 2.6.9-42.ELsmp kernel - but I also get
>> the problem with a vanilla 2.6.21 kernel.
>>
>>>> Any ideas?
>>>
>>> I don't. But, what is your current console? Is it VGA, or
>>> framebuffer? Can you try doing this again in both VGA and vesafb?
>>
>> I'm not sure what the current console is - whatever is the default
>> with RHEL4/CentOS4 - how do I select a different type of console?
>
> dmesg | grep "Console:"

Console: colour VGA+ 80x25

>>> And this does not happen if there is no previous setterm -blank 0
>>> command?
>>
>> It doesn't happen if there is no previous 'setterm -blank 0' - so,
>> arguably, this is the 'fix' ...
>
> Weird. The only thing I can think of is that console blanking is being
> triggered while the console is in graphics mode, which is not legal.
>
> How about 'setterm -blank 1', do an infinite echo loop and wait for at
> least 1 minute?

Setting the blank time to anything other than 0 is fine (no screen
shift)

> Also, can you open drivers/char/vt.c and look for the function
> do_blank_screen? You should have this particular segment. Change this
>
>	if (console_blanked) {
>		if (blank_state == blank_vesa_wait) {
>			blank_state = blank_off;
>			vc->vc_sw->con_blank(vc, vesa_blank_mode + 1, 0);
>		}
>		return;
>	}
>
> to
>
>	if (console_blanked && vc->vc_mode == KD_TEXT && !entering_gfx ) {
>		if (blank_state == blank_vesa_wait) {
>			blank_state = blank_off;
>			vc->vc_sw->con_blank(vc, vesa_blank_mode + 1, 0);
>		}
>		return;
>	}
>
> Let me know if that even makes a difference.

Made no difference, although I can't see how it would as
console_blanked is 0 when the problem happens.

It does indeed seem that the switching back and forth between text and
graphics does appear to be part of the issue - in my previous testing I
probably did do this (but didn't include this in my recipe above) - so
here is a new 'recipe' that shows the problem (for me)

  boot to run level 3
  if not root, then make sure /dev/console is writeable
  login and type: setterm -blank 0
  start X
  type into an xterm:
    while true; do echo "" > /dev/console; usleep 10; done
  while the above loop is running switch to the text console and back
  again (Ctrl-Alt-F1 then Ctrl-Alt-F7)

... and the screen will be shifting (and wrapping) to the left.

James Pearson
Re: X display shift with disabled console blanking
Antonino A. Daplas wrote:
> On Tue, 2007-05-01 at 13:17 +0100, James Pearson wrote:
>> Antonino A. Daplas wrote:
>>> On Mon, 2007-04-30 at 13:58 +0100, James Pearson wrote:
>>>> Antonino A. Daplas wrote:
>>>>> On Fri, 2007-04-27 at 18:08 +0100, James Pearson wrote:
>>
>> It does indeed seem that the switching back and forth between text
>> and graphics does appear to be part of the issue - in my previous
>> testing I probably did do this (but didn't include this in my recipe
>> above) - so here is a new 'recipe' that shows the problem (for me)
>>
>>   boot to run level 3
>>   if not root, then make sure /dev/console is writeable
>>   login and type: setterm -blank 0
>>   start X
>>   type into an xterm:
>>     while true; do echo "" > /dev/console; usleep 10; done
>>   while the above loop is running switch to the text console and back
>>   again (Ctrl-Alt-F1 then Ctrl-Alt-F7)
>>
>> ... and the screen will be shifting (and wrapping) to the left.
>
> Okay, this makes me see the problem more clearly. It looks like that
> vt/console layer is unreliable in terms of checking for the
> text/graphics mode of the current console. Instead of auditing the
> console code, I'll just have vgacon check for the mode.
>
> Try the attached patch and let me know if it helps.

That seems to fix it!

Thanks

James Pearson
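The diagnosis above turns on whether the console is in text or graphics mode. The attached patch is not reproduced in this thread, so the following is only a rough user-space illustration, not the vgacon change itself: it reads a virtual terminal's mode with the KDGETMODE ioctl from <linux/kd.h>. The helper names query_kd_mode and is_graphics_mode are made up for this sketch, and opening a tty normally needs suitable permissions.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/kd.h>	/* KDGETMODE, KD_TEXT, KD_GRAPHICS */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: returns KD_TEXT or KD_GRAPHICS for the given tty,
 * or -1 if the device cannot be opened or is not a virtual console. */
int query_kd_mode(const char *tty_path)
{
	int fd, mode = -1;

	assert(tty_path != NULL);

	fd = open(tty_path, O_RDONLY | O_NOCTTY);
	if (fd < 0)
		return -1;

	/* KDGETMODE stores the current mode in the int we point at. */
	if (ioctl(fd, KDGETMODE, &mode) < 0)
		mode = -1;

	close(fd);
	return mode;
}

/* Hypothetical helper: X normally puts its VT into KD_GRAPHICS. */
int is_graphics_mode(int mode)
{
	return mode == KD_GRAPHICS;
}
```

Calling query_kd_mode() on the VT that X is running on (tty7 is only a guess, based on the Ctrl-Alt-F7 in the recipe) would be expected to report KD_GRAPHICS while X is in the foreground.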
Re: [PATCH -mm] Don't truncate /proc/PID/environ at 4096 characters
James Pearson wrote:
> Arvin Moezzi wrote:
>
>> I think that's not true. 'count' is changing through the iteration.
>> The difference in the mem_read():
>>
>> * while (count > 0) {
>> *	int this_len, retval;
>> *
>> *	this_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
>> *	retval = access_process_vm(task, src, page, this_len, 0);
>> *
>> *	...
>> * }
>>
>> is the fact, that this_len = min(PAGE_SIZE, count) is in the
>> iteration block, hence retval <= this_len <= count in each iteration
>> step. So this is ok. But IMHO in your code 'retval' may be bigger than
>> 'count' in the last iteration of the block, because 'max_len' is fix
>> through your iteration but 'count' is changing. Or am i missing
>> something?
>
>
> Yes, you are correct ...

Here is a new patch that fixes the above issue ...

However, I'm not sure if I should be using GFP_TEMPORARY, GFP_KERNEL or
GFP_USER ?

Thanks

James Pearson

Patch against 2.6.23-rc6-mm1

---

/proc/PID/environ currently truncates at 4096 characters, patch based on
the /proc/PID/mem code.

Signed-off-by: James Pearson <[EMAIL PROTECTED]>

--- ./fs/proc/base.c.dist	2007-09-19 12:29:46.244929651 +0100
+++ ./fs/proc/base.c	2007-09-25 12:40:53.194497911 +0100
@@ -202,27 +202,6 @@ static int proc_root_link(struct inode *
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len;
-
-		res = -ESRCH;
-		if (!ptrace_may_attach(task))
-			goto out;
-
-		len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-out:
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -740,6 +719,76 @@ static const struct file_operations proc
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file *file, char __user *buf,
+			size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_TEMPORARY);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	while (count > 0) {
+		int this_len, retval, max_len;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0)
+			break;
+
+		max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+		this_len = (this_len > max_len) ? max_len : this_len;
+
+		retval = access_process_vm(task, (mm->env_start + src),
+			page, this_len, 0);
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static const struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -2092,7 +2141,7 @@ static const struct pid_entry tgid_base_
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo", S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 	INF("limits", S_IRUSR, pid_limits),
@@ -2421,7 +2470,7 @@ out_no_task:
 static const st
[PATCH -mm] Don't truncate /proc/PID/environ at 4096 characters
From: James Pearson <[EMAIL PROTECTED]>

/proc/PID/environ currently truncates at 4096 characters, patch based on
the /proc/PID/mem code.

Signed-off-by: James Pearson <[EMAIL PROTECTED]>

---

Patch against 2.6.23-rc6-mm1

--- ./fs/proc/base.c.dist	2007-09-19 12:29:46.244929651 +0100
+++ ./fs/proc/base.c	2007-09-19 12:36:18.155648760 +0100
@@ -202,27 +202,6 @@ static int proc_root_link(struct inode *
 	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
 	 security_ptrace(current,task) == 0))
 
-static int proc_pid_environ(struct task_struct *task, char * buffer)
-{
-	int res = 0;
-	struct mm_struct *mm = get_task_mm(task);
-	if (mm) {
-		unsigned int len;
-
-		res = -ESRCH;
-		if (!ptrace_may_attach(task))
-			goto out;
-
-		len = mm->env_end - mm->env_start;
-		if (len > PAGE_SIZE)
-			len = PAGE_SIZE;
-		res = access_process_vm(task, mm->env_start, buffer, len, 0);
-out:
-		mmput(mm);
-	}
-	return res;
-}
-
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
 {
 	int res = 0;
@@ -740,6 +719,79 @@ static const struct file_operations proc
 	.open = mem_open,
 };
 
+static ssize_t environ_read(struct file *file, char __user *buf,
+			size_t count, loff_t *ppos)
+{
+	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
+	char *page;
+	unsigned long src = *ppos;
+	int ret = -ESRCH;
+	struct mm_struct *mm;
+	size_t max_len;
+
+	if (!task)
+		goto out_no_task;
+
+	if (!ptrace_may_attach(task))
+		goto out;
+
+	ret = -ENOMEM;
+	page = (char *)__get_free_page(GFP_TEMPORARY);
+	if (!page)
+		goto out;
+
+	ret = 0;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		goto out_free;
+
+	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+	while (count > 0) {
+		int this_len, retval;
+
+		this_len = mm->env_end - (mm->env_start + src);
+
+		if (this_len <= 0)
+			break;
+
+		if (this_len > max_len)
+			this_len = max_len;
+
+		retval = access_process_vm(task, (mm->env_start + src),
+			page, this_len, 0);
+
+		if (retval <= 0) {
+			ret = retval;
+			break;
+		}
+
+		if (copy_to_user(buf, page, retval)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret += retval;
+		src += retval;
+		buf += retval;
+		count -= retval;
+	}
+	*ppos = src;
+
+	mmput(mm);
+out_free:
+	free_page((unsigned long) page);
+out:
+	put_task_struct(task);
+out_no_task:
+	return ret;
+}
+
+static const struct file_operations proc_environ_operations = {
+	.read = environ_read,
+};
+
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -2092,7 +2144,7 @@ static const struct pid_entry tgid_base_
 	DIR("task", S_IRUGO|S_IXUGO, task),
 	DIR("fd", S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo", S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ",S_IRUSR, pid_environ),
+	REG("environ",S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status", S_IRUGO, pid_status),
 	INF("limits", S_IRUSR, pid_limits),
@@ -2421,7 +2473,7 @@ out_no_task:
 static const struct pid_entry tid_base_stuff[] = {
 	DIR("fd",S_IRUSR|S_IXUSR, fd),
 	DIR("fdinfo",S_IRUSR|S_IXUSR, fdinfo),
-	INF("environ", S_IRUSR, pid_environ),
+	REG("environ", S_IRUSR, environ),
 	INF("auxv", S_IRUSR, pid_auxv),
 	INF("status",S_IRUGO, pid_status),
 	INF("limits",S_IRUSR, pid_limits),
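One way to check the effect of a change like this from user space is to read /proc/PID/environ to end-of-file and count the bytes. The sketch below is not part of the patch; count_readable_bytes is a hypothetical helper, and it deliberately uses a buffer smaller than a page so that the kernel's read handler is entered many times with a growing file offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the total number of bytes obtained from
 * path by calling read() until end-of-file, or -1 on any error. */
long count_readable_bytes(const char *path)
{
	char buf[512];	/* smaller than a page, so several reads are needed */
	long total = 0;
	ssize_t n;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Each read() continues from the offset the previous one reached. */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;

	close(fd);
	return n < 0 ? -1 : total;
}
```

For a process started with more than 4096 bytes of environment, count_readable_bytes("/proc/<pid>/environ") would be expected to stop at 4096 on a kernel with the truncation described above and to report the full size once it is lifted. The placeholder <pid> stands for a real process ID.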
Re: [PATCH -mm] Don't truncate /proc/PID/environ at 4096 characters
Arvin Moezzi wrote:
> 2007/9/19, James Pearson <[EMAIL PROTECTED]>:
>
>> +	while (count > 0) {
>> +		int this_len, retval;
>> +
>> +		this_len = mm->env_end - (mm->env_start + src);
>> +
>> +		if (this_len <= 0)
>> +			break;
>> +
>> +		if (this_len > max_len)
>> +			this_len = max_len;
>> +
>> +		retval = access_process_vm(task, (mm->env_start + src),
>> +			page, this_len, 0);
>> +
>> +		if (retval <= 0) {
>> +			ret = retval;
>> +			break;
>> +		}
>> +
>> +		if (copy_to_user(buf, page, retval)) {
>
> shouldn't you only copy min(count,retval) bytes? otherwise you could
> write beyond the users buffer "buf", right?

AFAIK, 'retval' can never be greater than 'this_len', which can never be
greater than 'max_len', which can never be greater than 'count'

James Pearson
Re: [PATCH -mm] Don't truncate /proc/PID/environ at 4096 characters
Andrew Morton wrote:
> On Wed, 19 Sep 2007 14:35:29 +0100 "James Pearson" <[EMAIL PROTECTED]> wrote:
>
>> From: James Pearson <[EMAIL PROTECTED]>
>>
>> /proc/PID/environ currently truncates at 4096 characters, patch based
>> on the /proc/PID/mem code.
>
> patch needs to be carefully reviewed from the security POV (ie:
> permissions) as well as for correctness. Does anyone have time to do
> that?
>
>> Signed-off-by: James Pearson <[EMAIL PROTECTED]>
>>
>> --- ./fs/proc/base.c.dist	2007-09-19 12:29:46.244929651 +0100
>> +++ ./fs/proc/base.c	2007-09-19 12:36:18.155648760 +0100
>> @@ -202,27 +202,6 @@ static int proc_root_link(struct inode *
>>  	 (task->state == TASK_STOPPED || task->state == TASK_TRACED) && \
>>  	 security_ptrace(current,task) == 0))
>>  
>> -static int proc_pid_environ(struct task_struct *task, char * buffer)
>> -{
>> -	int res = 0;
>> -	struct mm_struct *mm = get_task_mm(task);
>> -	if (mm) {
>> -		unsigned int len;
>> -
>> -		res = -ESRCH;
>> -		if (!ptrace_may_attach(task))
>> -			goto out;
>> -
>> -		len = mm->env_end - mm->env_start;
>> -		if (len > PAGE_SIZE)
>> -			len = PAGE_SIZE;
>> -		res = access_process_vm(task, mm->env_start, buffer, len, 0);
>> -out:
>> -		mmput(mm);
>> -	}
>> -	return res;
>> -}
>> -
>>  static int proc_pid_cmdline(struct task_struct *task, char * buffer)
>>  {
>>  	int res = 0;
>> @@ -740,6 +719,79 @@ static const struct file_operations proc
>>  	.open = mem_open,
>>  };
>>  
>> +static ssize_t environ_read(struct file *file, char __user *buf,
>> +			size_t count, loff_t *ppos)
>> +{
>> +	struct task_struct *task = get_proc_task(file->f_dentry->d_inode);
>> +	char *page;
>> +	unsigned long src = *ppos;
>> +	int ret = -ESRCH;
>> +	struct mm_struct *mm;
>> +	size_t max_len;
>> +
>> +	if (!task)
>> +		goto out_no_task;
>> +
>> +	if (!ptrace_may_attach(task))
>> +		goto out;
>> +
>> +	ret = -ENOMEM;
>> +	page = (char *)__get_free_page(GFP_TEMPORARY);
>
> Now I wonder what inspired you to reach for GFP_TEMPORARY? Perhaps the
> fact that it is crappily named and undocumented.
>
> This should be GFP_KERNEL - the page you're allocating here is not
> reclaimable by the VM.

The code is based on mem_read() - and that is what mem_read() does in
2.6.23rc6-mm1 - my previous patch for 2.6.23rc5 used GFP_USER, as that
is what mem_read() does in 2.6.23rc5.

>> +	if (!page)
>> +		goto out;
>> +
>> +	ret = 0;
>> +
>> +	mm = get_task_mm(task);
>> +	if (!mm)
>> +		goto out_free;
>> +
>> +	max_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
>> +
>> +	while (count > 0) {
>> +		int this_len, retval;
>> +
>> +		this_len = mm->env_end - (mm->env_start + src);
>> +
>> +		if (this_len <= 0)
>> +			break;
>> +
>> +		if (this_len > max_len)
>> +			this_len = max_len;
>> +
>> +		retval = access_process_vm(task, (mm->env_start + src),
>> +			page, this_len, 0);
>> +
>> +		if (retval <= 0) {
>> +			ret = retval;
>> +			break;
>> +		}
>> +
>> +		if (copy_to_user(buf, page, retval)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		ret += retval;
>> +		src += retval;
>> +		buf += retval;
>> +		count -= retval;
>> +	}
>
> Now that's a funky loop. Someone please convince me that there is no
> way in which `count - retval' can ever go negative (ie: huge
> positive).

Again, this is exactly the same as in mem_read()

James Pearson
Re: [PATCH -mm] Don't truncate /proc/PID/environ at 4096 characters
Arvin Moezzi wrote:
> I think that's not true. 'count' is changing through the iteration.
> The difference in the mem_read():
>
> * while (count > 0) {
> *	int this_len, retval;
> *
> *	this_len = (count > PAGE_SIZE) ? PAGE_SIZE : count;
> *	retval = access_process_vm(task, src, page, this_len, 0);
> *
> *	...
> * }
>
> is the fact, that this_len = min(PAGE_SIZE, count) is in the
> iteration block, hence retval <= this_len <= count in each iteration
> step. So this is ok. But IMHO in your code 'retval' may be bigger than
> 'count' in the last iteration of the block, because 'max_len' is fix
> through your iteration but 'count' is changing. Or am i missing
> something?

Yes, you are correct ...

Thanks

James Pearson
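The point conceded here can be reproduced outside the kernel. The following is a user-space simulation only, with no kernel calls: it counts how many bytes each variant of the loop would hand to copy_to_user() for a given request. PAGE_SZ is a stand-in for PAGE_SIZE (assumed to be 4096), the function names are invented for this sketch, and each simulated access_process_vm() call is assumed to return everything it was asked for.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* stand-in for the kernel's PAGE_SIZE (assumed) */

/* First patch: max_len is computed once, before the loop starts. */
size_t copied_with_fixed_max(size_t env_len, size_t count)
{
	size_t max_len = (count > PAGE_SZ) ? PAGE_SZ : count;
	size_t src = 0, total = 0;

	assert(max_len <= PAGE_SZ);

	while (count > 0) {
		size_t this_len = env_len - src;

		if (this_len == 0)
			break;
		if (this_len > max_len)
			this_len = max_len;	/* can still exceed 'count' */

		total += this_len;
		src += this_len;
		count -= this_len;	/* wraps to a huge value on overrun */
	}
	return total;
}

/* mem_read() style: the limit is recomputed from 'count' on every pass. */
size_t copied_with_clamp_each_pass(size_t env_len, size_t count)
{
	size_t src = 0, total = 0;

	while (count > 0) {
		size_t this_len = env_len - src;
		size_t max_len = (count > PAGE_SZ) ? PAGE_SZ : count;

		if (this_len == 0)
			break;
		if (this_len > max_len)
			this_len = max_len;	/* never more than 'count' */

		total += this_len;
		src += this_len;
		count -= this_len;
	}
	return total;
}
```

With a 10000-byte environment and a 5000-byte read request, the first variant copies 4096 bytes, then another 4096 although only 904 were still wanted, after which 'count' wraps and the loop runs to the end of the environment. The second variant stops at exactly 5000 bytes.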