Re: Kernel oops with bluetooth usb dongle
* Quel Qun | 2008-02-18 00:01:21 [+]:

Please send me your .config file and process list (ps uax > ps_list) after
the crash. I have a dongle with the same USB ID as yours and I can't
reproduce the crash. So it is either some .config magic or one of your
programs is accessing the dongle.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[BUG] panic after umount (bisected)
After umount of /dev/md/1 on /mnt/sec type xfs
(rw,nosuid,nodev,noatime,nodiratime) I end up with [1] on v2.6.24-rc1. It
worked fine with v2.6.23. Bisecting came to the conclusion that

|fd5d806266935179deda1502101624832eacd01f is first bad commit
|commit fd5d806266935179deda1502101624832eacd01f
|Author: Jens Axboe <[EMAIL PROTECTED]>
|Date:   Tue Oct 16 11:05:02 2007 +0200
|
|    block: convert blkdev_issue_flush() to use empty barriers
|
|    Then we can get rid of ->issue_flush_fn() and all the driver private
|    implementations of that.
|
|    Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
|
|:040000 040000 75f5c511fd785fa4c75b955f1818c382593a22ec adf460e288740ca9c0b5ab823dfc2ca04980f3d2 M	block
|:040000 040000 a972fe9a2433a00c79fd5ead584c482cd08b4b2c 0f4534094b63fe21c5cf8a6170736abe5c518c00 M	drivers
|:040000 040000 ba2a9f681826db89ba5192a7db7139baecde786a fc68a77a41d25c283f39f712216566e9b36d00fc M	include

is the bad boy here. The last panic (with the first bad commit) is [2]
(there is almost no difference). My .config [3], lspci [4], bisect log [5].
The md1 raid is a

|Personalities : [raid1]
|md1 : active raid1 sda2[1]
|      174843328 blocks [2/1] [_U]

I hope this was useful. Now I'm going to rebuild my raid.

[1] http://download.breakpoint.cc/bug/bug_rc1.jpeg 166 KiB
[2] http://download.breakpoint.cc/bug/bug_last_commit.jpeg 217 KiB
[3] http://download.breakpoint.cc/bug/config.txt 27 KiB
[4] http://download.breakpoint.cc/bug/lspci.txt 2.3 KiB
[5] http://download.breakpoint.cc/bug/git-bisect-log.txt 2.3 KiB

Sebastian
Re: [BUG] panic after umount (bisected)
* Jens Axboe | 2007-10-26 11:32:42 [+0200]:
>On Fri, Oct 26 2007, Jens Axboe wrote:
>> >
>> > I hope this was usefull. Now, I'm going to rebuild my raid now
>> >
>> Thanks a lot, a full report on this issue. Will get this fixed up asap.

No problem, thanks for working on that :)

>
>Does this work?
>
>diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>index 61fdaf0..cf47fcb 100644
>--- a/drivers/scsi/scsi_lib.c
>+++ b/drivers/scsi/scsi_lib.c
>@@ -1115,6 +1115,8 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
>         * kmapping pages)
>         */
>        cmd->use_sg = req->nr_phys_segments;
>+       if (!cmd->use_sg)
>+               return 0;
>
>        /*
>         * If sg table allocation fails, requeue request later.
>@@ -1191,7 +1193,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
>        if (req->bio) {
>                int ret;
>
>-               BUG_ON(!req->nr_phys_segments);
>+               BUG_ON(!req->nr_phys_segments && req->bio->bi_size);
>
>                ret = scsi_init_io(cmd);
>                if (unlikely(ret))
>

Nope. I get [1] on manual umount and [2] on system reboot. This is
2.6.24-rc1 with this patch on top.

[1] http://download.breakpoint.cc/bug/bug_rc1_patch_manual.jpeg 163 KiB
[2] http://download.breakpoint.cc/bug/bug_rc1_patch_reboot.jpeg 171 KiB

>--
>Jens Axboe

Sebastian
Re: [BUG] panic after umount (bisected)
* Jens Axboe | 2007-10-26 13:42:30 [+0200]:
>> [2] http://download.breakpoint.cc/bug/bug_rc1_patch_reboot.jpeg 171 KiB
>
>Ah, second BUG() for same issue. Try this one.

This?

>
>diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>index 61fdaf0..57fde7b 100644
>--- a/drivers/scsi/scsi_lib.c
>+++ b/drivers/scsi/scsi_lib.c
>@@ -1115,6 +1115,8 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
>         * kmapping pages)
>         */
>        cmd->use_sg = req->nr_phys_segments;
>+       if (!cmd->use_sg)
>+               return 0;
>
>        /*
>         * If sg table allocation fails, requeue request later.
>@@ -1191,7 +1193,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
>        if (req->bio) {
>                int ret;
>
>-               BUG_ON(!req->nr_phys_segments);
>+               BUG_ON(!req->nr_phys_segments && req->bio->bi_size);
>
>                ret = scsi_init_io(cmd);
>                if (unlikely(ret))
>@@ -1236,9 +1238,10 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
>        if (ret != BLKPREP_OK)
>                return ret;
>        /*
>-        * Filesystem requests must transfer data.
>+        * Filesystem requests must transfer data, unless it's an empty
>+        * barrier.
>         */
>-       BUG_ON(!req->nr_phys_segments);
>+       BUG_ON(!req->nr_phys_segments && !bio_empty_barrier(req->bio));
>
>        cmd = scsi_get_cmd_from_req(sdev, req);
>        if (unlikely(!cmd))
>

I'm afraid you did not make it to the next level. I hope you have another
man :). [1] shows the result. I double checked it, and it seems to be the
same BUG() with the second patch.

[1] http://download.breakpoint.cc/bug/bug_patch2.jpeg 134 KiB

>Jens Axboe

Sebastian
Re: [BUG] panic after umount (bisected)
* Jens Axboe | 2007-10-27 13:39:16 [+0200]:
>> [1] http://download.breakpoint.cc/bug/bug_patch2.jpeg 134 KiB
>
>OK, can you see what this produces?
>
>diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>index 61fdaf0..4042269 100644
>--- a/drivers/scsi/scsi_lib.c
>+++ b/drivers/scsi/scsi_lib.c
>@@ -1115,6 +1115,8 @@ static int scsi_init_io(struct scsi_cmnd *cmd)
>         * kmapping pages)
>         */
>        cmd->use_sg = req->nr_phys_segments;
>+       if (!cmd->use_sg)
>+               return 0;
>
>        /*
>         * If sg table allocation fails, requeue request later.
>@@ -1191,7 +1193,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
>        if (req->bio) {
>                int ret;
>
>-               BUG_ON(!req->nr_phys_segments);
>+               BUG_ON(!req->nr_phys_segments && req->bio->bi_size);
>
>                ret = scsi_init_io(cmd);
>                if (unlikely(ret))
>@@ -1236,9 +1238,11 @@ int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
>        if (ret != BLKPREP_OK)
>                return ret;
>        /*
>-        * Filesystem requests must transfer data.
>+        * Filesystem requests must transfer data, unless it's an empty
>+        * barrier.
>         */
>-       BUG_ON(!req->nr_phys_segments);
>+       if (!req->nr_phys_segments && !bio_empty_barrier(req->bio))
>+               blk_dump_rq_flags(req, "scsi");
>
>        cmd = scsi_get_cmd_from_req(sdev, req);
>        if (unlikely(!cmd))

Good, [1] has the dmesg output after umount.

[1] http://download.breakpoint.cc/bug/bug_patch3.jpeg 36 KiB

>--
>Jens Axboe

Sebastian
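The rule the patches in this thread converge on can be modeled in plain C. The sketch below is a user-space approximation only; `struct bio`, `struct request`, the `is_barrier` field, and the function names are simplified stand-ins, not the kernel's actual definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-ins for the kernel structures. */
struct bio {
        unsigned int bi_size;   /* bytes of data carried by this bio */
        int is_barrier;         /* stand-in for the barrier request flag */
};

struct request {
        unsigned short nr_phys_segments;
        struct bio *bio;
};

/* An "empty barrier" is a barrier request carrying no data: it exists
 * only to flush the drive's write cache, so it legitimately has zero
 * physical segments. */
static int bio_empty_barrier_model(const struct bio *bio)
{
        return bio && bio->is_barrier && bio->bi_size == 0;
}

/* The invariant the thread settles on: a filesystem request with no
 * physical segments is a bug *unless* it is an empty barrier. */
static int fs_request_ok(const struct request *req)
{
        if (req->nr_phys_segments)
                return 1;
        return bio_empty_barrier_model(req->bio);
}
```

The empty-barrier commit that triggered the panic started generating exactly the zero-segment requests that the old unconditional `BUG_ON(!req->nr_phys_segments)` rejected.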
[PATCH] compat_ioctl requires CONFIG_BLOCK
Got with randconfig:

include/linux/loop.h:66: error: expected specifier-qualifier-list before 'request_queue_t'
make[1]: *** [fs/compat_ioctl.o] Error 1

Parts of compat ioctl require CONFIG_BLOCK to be set.

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
Index: b/fs/compat_ioctl.c
===================================================================
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -63,7 +63,9 @@
 #include
 #include
 #include
+#ifdef CONFIG_BLOCK
 #include <linux/loop.h>
+#endif
 #include
 #include
@@ -3491,8 +3493,10 @@ HANDLE_IOCTL(LPSETTIMEOUT, lp_timeout_tr
 IGNORE_IOCTL(VFAT_IOCTL_READDIR_BOTH32)
 IGNORE_IOCTL(VFAT_IOCTL_READDIR_SHORT32)
+#ifdef CONFIG_BLOCK
 /* loop */
 IGNORE_IOCTL(LOOP_CLR_FD)
+#endif
 };
 
 #define IOCTL_HASHSIZE 256
-- 
[PATCH] i2o: defined but not used.
Got with randconfig:

drivers/message/i2o/exec-osm.c:539: warning: 'i2o_exec_lct_notify' defined but not used

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
Index: b/drivers/message/i2o/exec-osm.c
===================================================================
--- a/drivers/message/i2o/exec-osm.c
+++ b/drivers/message/i2o/exec-osm.c
@@ -389,9 +389,7 @@ static void i2o_exec_lct_modified(struct
        if (i2o_device_parse_lct(c) != -EAGAIN)
                change_ind = c->lct->change_ind + 1;
 
-#ifdef CONFIG_I2O_LCT_NOTIFY_ON_CHANGES
        i2o_exec_lct_notify(c, change_ind);
-#endif
 };
 
 /**
@@ -525,6 +523,7 @@ int i2o_exec_lct_get(struct i2o_controll
        return rc;
 }
 
+#ifdef CONFIG_I2O_LCT_NOTIFY_ON_CHANGES
 /**
  * i2o_exec_lct_notify - Send a asynchronus LCT NOTIFY request
  * @c: I2O controller to which the request should be send
@@ -570,6 +569,13 @@ static int i2o_exec_lct_notify(struct i2
        return 0;
 };
 
+#else
+static int i2o_exec_lct_notify(struct i2o_controller *c, u32 change_ind)
+{
+       return 0;
+}
+#endif
+
 /* Exec OSM driver struct */
 struct i2o_driver i2o_exec_driver = {
        .name = OSM_NAME,
-- 
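The pattern the patch applies — move the real function under the config `#ifdef` and provide a no-op stub under `#else`, so callers need no conditionals of their own — can be shown in a small user-space sketch. The feature macro and function names below are hypothetical stand-ins:

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_I2O_LCT_NOTIFY_ON_CHANGES.
 * With it defined you get the real function; without it, the stub.
 * Either way every caller sees a definition, so there is no
 * "defined but not used" warning and no #ifdef at the call sites. */
#define FEATURE_NOTIFY_ON_CHANGES 1

#ifdef FEATURE_NOTIFY_ON_CHANGES
static int lct_notify(unsigned int change_ind)
{
        /* real work would go here; echo the indicator to show it ran */
        return (int)change_ind;
}
#else
/* Stub: compiled when the feature is configured out. */
static int lct_notify(unsigned int change_ind)
{
        (void)change_ind;
        return 0;
}
#endif

static int on_lct_modified(unsigned int change_ind)
{
        /* the caller is now unconditional, mirroring the patched
         * i2o_exec_lct_modified() */
        return lct_notify(change_ind);
}
```

The stub costs nothing after inlining, and the call site stays readable.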
Re: [PATCH] compat_ioctl requires CONFIG_BLOCK
* Arnd Bergmann | 2007-07-21 01:08:57 [+0200]:
>On Saturday 21 July 2007, Sebastian Siewior wrote:
>>
>> Got with randconfig
>> include/linux/loop.h:66: error: expected specifier-qualifier-list before
>> 'request_queue_t'
>> make[1]: *** [fs/compat_ioctl.o] Error 1
>>
>> parts of compat ioctl require CONFIG_BLOCK to be set.
>>
>> Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
>> Index: b/fs/compat_ioctl.c
>> ===================================================================
>> --- a/fs/compat_ioctl.c
>> +++ b/fs/compat_ioctl.c
>> @@ -63,7 +63,9 @@
>>  #include
>>  #include
>>  #include
>> +#ifdef CONFIG_BLOCK
>>  #include
>> +#endif
>
>Adding #ifdef around an #include is considered bad style. Better just
>make loop.h compile without any conditionals. Does the below
>patch work for you?

Yes it does.

>       Arnd <><

Sebastian

>--- a/include/linux/loop.h
>+++ b/include/linux/loop.h
>@@ -63,7 +63,7 @@ struct loop_device {
>        struct task_struct      *lo_thread;
>        wait_queue_head_t       lo_event;
>
>-       request_queue_t         *lo_queue;
>+       struct request_queue    *lo_queue;
>        struct gendisk          *lo_disk;
>        struct list_head        lo_list;
> };
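Arnd's fix works because C only needs a type's full definition to dereference it or take its size; a pointer member needs nothing more than a forward declaration. A minimal sketch (the struct below is a hypothetical, pared-down stand-in for `struct loop_device`):

```c
#include <assert.h>
#include <stddef.h>

/* Incomplete type: declared, never defined in this translation unit.
 * This is what "struct request_queue *" relies on when no block-layer
 * header has been included. */
struct request_queue;

/* Hypothetical stand-in for the loop.h struct: holding a *pointer* to
 * the incomplete type compiles fine, so the header no longer needs the
 * request_queue_t typedef (which only exists under CONFIG_BLOCK). */
struct loop_device_model {
        struct request_queue *lo_queue; /* pointer only: OK */
        int lo_number;
};
```

Had the member been `struct request_queue lo_queue;` (by value), the compiler would need the full definition and the old error would return in another form.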
Re: [PATCH] i2o: defined but not used.
* Andrew Morton | 2007-07-21 03:31:33 [-0700]:
>On Sat, 21 Jul 2007 00:58:01 +0200 Sebastian Siewior <[EMAIL PROTECTED]> wrote:
>
>> Got with randconfig
>
>randconfig apparently generates impossible configs. Please always
>run `make oldconfig' after the randconfig, then do the test build.

Ah okay, I did not know this.

CONFIG_I2O=y
# CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set

is still in my .config after oldconfig.

Sebastian
[patch 2/2] x86_64: use asm() like the other atomic operations already do.
As Segher pointed out, inline asm is better than the volatile casting
all over the place. From the PowerPC patch description:

        Also use inline functions instead of macros; this actually improves
        code generation (some code becomes a little smaller, probably
        because of improved alias information -- just a few hundred bytes
        total on a default kernel build, nothing shocking).

My config with march=k8 and gcc (GCC) 4.1.2 (Gentoo 4.1.2):

   text    data     bss     dec     hex filename
4002473  385936  474440 4862849  4a3381 atomic_normal/vmlinux
4002587  385936  474440 4862963  4a33f3 atomic_inlineasm/vmlinux
4003911  385936  474440 4864287  4a391f atomic_volatile/vmlinux
4003959  385936  474440 4864335  4a394f atomic_volatile_inline/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h
+++ b/include/asm-x86_64/atomic.h
@@ -29,19 +29,34 @@ typedef struct { int counter; } atomic_t
 /**
  * atomic_read - read atomic variable
  * @v: pointer of type atomic_t
- * 
+ *
  * Atomically reads the value of @v.
- */ 
-#define atomic_read(v)          ((v)->counter)
+ */
+static __inline__ int atomic_read(const atomic_t *v)
+{
+       int t;
+
+       __asm__ __volatile__(
+                       "movl %1, %0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
- * 
+ *
  * Atomically sets the value of @v to @i.
- */ 
-#define atomic_set(v,i)         (((v)->counter) = (i))
+ */
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+       __asm__ __volatile__(
+                       "movl %1, %0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable
@@ -206,7 +221,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)       { (i) }
 
@@ -217,7 +232,16 @@ typedef struct { volatile long counter;
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-#define atomic64_read(v)                ((v)->counter)
+static __inline__ long atomic64_read(const atomic64_t *v)
+{
+       long t;
+
+       __asm__ __volatile__(
+                       "movq %1, %0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic64_set - set atomic64 variable
@@ -226,7 +250,13 @@ typedef struct { volatile long counter;
  *
  * Atomically sets the value of @v to @i.
  */
-#define atomic64_set(v,i)              (((v)->counter) = (i))
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+       __asm__ __volatile__(
+                       "movq %1, %0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic64_add - add integer to atomic64 variable
-- 
[patch 0/2] use asm() for atomic_{read|set}
I converted i386 + x86-64. Compiled, booted and played for a while. The
description of both patches contains the file size of four kernel builds:

- "normal" is 28e8351ac22de25034e048c680014ad824323c65 as is
- "inline asm" is with this patch
- "inline volatile" is *(volatile int *)&(v)->counter as a static inline function
- "volatile" is *(volatile int *)&(v)->counter as a #define macro

I hope I don't encourage anyone to use macros over inline functions.

Sebastian
-- 
[patch 1/2] i386: use asm() like the other atomic operations already do.
As Segher pointed out, inline asm is better than the volatile casting
all over the place. From the PowerPC patch description:

        Also use inline functions instead of macros; this actually improves
        code generation (some code becomes a little smaller, probably
        because of improved alias information -- just a few hundred bytes
        total on a default kernel build, nothing shocking).

My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):

   text    data     bss     dec     hex filename
3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
3436201  249176  176128 3861505  3aec01 atomic_inline_volatile/vmlinux
3436203  249176  176128 3861507  3aec03 atomic_volatile/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>

--- a/include/asm-i386/atomic.h
+++ b/include/asm-i386/atomic.h
@@ -22,19 +22,34 @@ typedef struct { int counter; } atomic_t
 /**
  * atomic_read - read atomic variable
  * @v: pointer of type atomic_t
- * 
+ *
  * Atomically reads the value of @v.
- */ 
-#define atomic_read(v)          ((v)->counter)
+ */
+static __inline__ int atomic_read(const atomic_t *v)
+{
+       int t;
+
+       __asm__ __volatile__(
+                       "movl %1,%0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
- * 
+ *
  * Atomically sets the value of @v to @i.
- */ 
-#define atomic_set(v,i)         (((v)->counter) = (i))
+ */
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+       __asm__ __volatile__(
+                       "movl %1,%0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable
-- 
Re: [patch 1/2] i386: use asm() like the other atomic operations already do.
* Andi Kleen | 2007-08-15 02:20:35 [+0200]:
>> My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
>>    text    data     bss     dec     hex filename
>> 3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
>> 3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
>
>What is the difference between atomic_normal and atomic_inlineasm?

atomic_normal is Linus' tree, commit 28e8351. atomic_inlineasm is with
this patch. I wrote this in 0/2 (do you want it here as well?).

>
>> /**
>>  * atomic_read - read atomic variable
>>  * @v: pointer of type atomic_t
>> - * 
>> + *
>
>Please don't change white space in patches

I fixed white space errors. SubmitChecklist:24 says I should not introduce
any new ones, so fixing existing ones sounded like the right thing to do.
No white space fixing in the future?

>
>>  * Atomically reads the value of @v.
>> - */ 
>> -#define atomic_read(v)          ((v)->counter)
>> + */
>> +static __inline__ int atomic_read(const atomic_t *v)
>> +{
>> +    int t;
>> +
>> +    __asm__ __volatile__(
>
>And don't use __*__ in new code

Okay.

>
>-Andi

Sebastian
Re: [patch 1/2] i386: use asm() like the other atomic operations already do.
* Satyam Sharma | 2007-08-15 18:32:10 [+0530]:
>On Wed, 15 Aug 2007, Herbert Xu wrote:
>
>> Andi Kleen <[EMAIL PROTECTED]> wrote:
>> >> My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
>> >>    text    data     bss     dec     hex filename
>> >> 3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
>> >> 3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
>> >
>> > What is the difference between atomic_normal and atomic_inlineasm?
>>
>> The inline asm stops certain optimisations from occurring.
>
>Yup, the "__volatile__" after "__asm__" shouldn't be required. The
>previous definitions allowed the compiler to optimize certain atomic
>ops away, and considering we haven't seen any bugs due to the past
>behaviour anyway, changing it like this sounds unnecessary.
>
>The second behavioral change that comes about here is the force-
>constraining of v->counter to memory in the extended asm's constraint
>sets. But that's required to give "volatility"-like behaviour, which
>I suspect is the goal of this patch.

Exactly.

>Sebastian, could you look at the kernel images in detail and see why,
>how and what in the text expanded by 1158 bytes with these changes?

Wow, easy. "normal" is Linus' git tree. It has no volatile or anything,
so I suspect the compiler "optimizes" away some of the reads and sets
(which remain after my patch is applied). Anyway, I tried to find the
source of this and it is not that easy. If the text gets bigger, all
references to the data section change and I can't simply diff. Then I
searched for two "equal" .o files which differ in size. This was no
better, because gcc reorganized the whole .o file and the functions that
were at the beginning of the first file ended up at different positions
in the second one. For those who have plenty of time on their hands, I
posted the four kernels [1] and my four O= folders [2]. Once I have some
time, I will try to diff the specific functions within the .o files that
use the atomic_{read|set} operations.

>> I'm still unconvinced why we need this because nobody has
>> brought up any examples of kernel code that legitimately
>> need this.
>
>Me too, but I do find these more palatable than the variants using
>"volatile", I admit.

Yes. The inline asm variant is even smaller than the volatile ones,
which suggests better optimization by gcc (and it does exactly what
volatile was used for).

[1] http://download.breakpoint.cc/kernel_builds_vmlinux.tbz2 8.3 MiB
[2] http://download.breakpoint.cc/kernel_builds.tbz2 117 MiB

>
>Satyam

Sebastian
[patch 0/2] use asm() for atomic_{read|set} (shot 2)
Diff to last post:
- no __*__ functions used.
- no white space fixes.

I converted the i386 + x86-64 archs. The description of both patches
contains the file size of four kernel builds:

- "normal" is 28e8351ac22de25034e048c680014ad824323c65 as is
- "inline asm" is with this patch
- "inline volatile" is *(volatile int *)&(v)->counter as a static inline function
- "volatile" is *(volatile int *)&(v)->counter as a #define macro

Sebastian
-- 
[patch 1/2] i386: use asm() like the other atomic operations already do.
As Segher pointed out, inline asm is better than the volatile casting
all over the place. From the PowerPC patch description:

        Also use inline functions instead of macros; this actually improves
        code generation (some code becomes a little smaller, probably
        because of improved alias information -- just a few hundred bytes
        total on a default kernel build, nothing shocking).

My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):

   text    data     bss     dec     hex filename
3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
3436201  249176  176128 3861505  3aec01 atomic_inline_volatile/vmlinux
3436203  249176  176128 3861507  3aec03 atomic_volatile/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Chris Snook <[EMAIL PROTECTED]>

--- a/include/asm-i386/atomic.h
+++ b/include/asm-i386/atomic.h
@@ -25,7 +25,16 @@ typedef struct { int counter; } atomic_t
  *
  * Atomically reads the value of @v.
  */
-#define atomic_read(v)          ((v)->counter)
+static inline int atomic_read(const atomic_t *v)
+{
+       int t;
+
+       asm volatile(
+                       "movl %1,%0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic_set - set atomic variable
@@ -34,7 +43,13 @@ typedef struct { int counter; } atomic_t
  *
  * Atomically sets the value of @v to @i.
  */
-#define atomic_set(v,i)         (((v)->counter) = (i))
+static inline void atomic_set(atomic_t *v, int i)
+{
+       asm volatile(
+                       "movl %1,%0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable
-- 
[patch 2/2] x86_64: use asm() like the other atomic operations already do.
As Segher pointed out, inline asm is better than the volatile casting
all over the place. From the PowerPC patch description:

        Also use inline functions instead of macros; this actually improves
        code generation (some code becomes a little smaller, probably
        because of improved alias information -- just a few hundred bytes
        total on a default kernel build, nothing shocking).

My config with march=k8 and gcc (GCC) 4.1.2 (Gentoo 4.1.2):

   text    data     bss     dec     hex filename
4002473  385936  474440 4862849  4a3381 atomic_normal/vmlinux
4002587  385936  474440 4862963  4a33f3 atomic_inlineasm/vmlinux
4003911  385936  474440 4864287  4a391f atomic_volatile/vmlinux
4003959  385936  474440 4864335  4a394f atomic_volatile_inline/vmlinux

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Chris Snook <[EMAIL PROTECTED]>

--- a/include/asm-x86_64/atomic.h
+++ b/include/asm-x86_64/atomic.h
@@ -32,7 +32,16 @@ typedef struct { int counter; } atomic_t
  *
  * Atomically reads the value of @v.
  */
-#define atomic_read(v)          ((v)->counter)
+static inline int atomic_read(const atomic_t *v)
+{
+       int t;
+
+       asm volatile(
+                       "movl %1, %0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic_set - set atomic variable
@@ -41,7 +50,13 @@ typedef struct { int counter; } atomic_t
  *
  * Atomically sets the value of @v to @i.
  */
-#define atomic_set(v,i)         (((v)->counter) = (i))
+static inline void atomic_set(atomic_t *v, int i)
+{
+       asm volatile(
+                       "movl %1, %0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic_add - add integer to atomic variable
@@ -206,7 +221,7 @@ static __inline__ int atomic_sub_return(
 
 /* An 64bit atomic type */
 
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
 
 #define ATOMIC64_INIT(i)       { (i) }
 
@@ -217,7 +232,16 @@ typedef struct { volatile long counter;
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-#define atomic64_read(v)                ((v)->counter)
+static inline long atomic64_read(const atomic64_t *v)
+{
+       long t;
+
+       asm volatile(
+                       "movq %1, %0"
+                       : "=r"(t)
+                       : "m"(v->counter));
+       return t;
+}
 
 /**
  * atomic64_set - set atomic64 variable
@@ -226,7 +250,13 @@ typedef struct { volatile long counter;
  *
  * Atomically sets the value of @v to @i.
  */
-#define atomic64_set(v,i)              (((v)->counter) = (i))
+static inline void atomic64_set(atomic64_t *v, long i)
+{
+       asm volatile(
+                       "movq %1, %0"
+                       : "=m"(v->counter)
+                       : "ir"(i));
+}
 
 /**
  * atomic64_add - add integer to atomic64 variable
-- 
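The effect these patches are after — a read/set the compiler must actually emit, without marking the whole counter field volatile — can be approximated in portable user-space C with the GCC/Clang `__atomic` builtins instead of x86 inline asm. This is a sketch of the semantics only, not the kernel's implementation; the `atomic_model_*` names are hypothetical:

```c
#include <assert.h>

/* Plain int counter, like the patched atomic_t: no volatile qualifier. */
typedef struct { int counter; } atomic_model_t;

/* A relaxed atomic load: the compiler cannot cache the value in a
 * register across calls or elide the access, which is the property the
 * kernel patch obtains from "asm volatile" with an "m" constraint. */
static inline int atomic_model_read(const atomic_model_t *v)
{
        return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

/* A relaxed atomic store, the counterpart of atomic_set(). */
static inline void atomic_model_set(atomic_model_t *v, int i)
{
        __atomic_store_n(&v->counter, i, __ATOMIC_RELAXED);
}
```

The point of confining the "must emit" property to the accessors (rather than to the struct field, as `volatile long counter` did) is that all *other* operations on the counter keep full optimization freedom, which is where the size savings in the tables above come from.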
[PATCH] [mmc] fix compile without LED Triggers
drivers/mmc/core/host.c: In function 'mmc_remove_host':
drivers/mmc/core/host.c:146: error: implicit declaration of function 'led_trigger_unregister'
drivers/mmc/core/host.c:146: error: 'struct mmc_host' has no member named 'led'

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
---
 include/linux/leds.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/leds.h b/include/linux/leds.h
index dc1178f..24c4830 100644
--- a/include/linux/leds.h
+++ b/include/linux/leds.h
@@ -98,6 +98,7 @@ extern void led_trigger_event(struct led_trigger *trigger,
 #define DEFINE_LED_TRIGGER_GLOBAL(x)
 #define led_trigger_register_simple(x, y) do {} while(0)
 #define led_trigger_unregister_simple(x) do {} while(0)
+#define led_trigger_unregister(x) do {} while(0)
 #define led_trigger_event(x, y) do {} while(0)
 
 #endif
-- 
1.5.3.4
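The `do {} while(0)` form of the stub the patch adds is the standard idiom for a macro that must behave like a single statement. A user-space sketch (the config macro and function names below are hypothetical, not the leds.h API):

```c
#include <assert.h>

/* Hypothetical feature switch: leave it undefined to get the stubs,
 * analogous to building without LED trigger support. */
/* #define HAVE_LED_TRIGGERS */

#ifndef HAVE_LED_TRIGGERS
/* do {} while (0) makes the stub a single statement that still wants a
 * trailing semicolon, so it stays legal in an unbraced if/else:
 *
 *      if (cond)
 *              led_trigger_unregister_model(t);
 *      else
 *              something_else();
 *
 * A bare "{}" stub would turn the semicolon into a second statement and
 * break the else branch. */
#define led_trigger_unregister_model(x) do { } while (0)
#endif

/* Caller in the style of mmc_remove_host(): compiles and runs the same
 * whether the feature is configured in or stubbed out. */
static int remove_host_model(int have_led)
{
        if (have_led)
                led_trigger_unregister_model(have_led);
        else
                return -1;
        return 0;
}
```

That is why the missing `led_trigger_unregister` stub caused a hard build break: every other trigger call already had a no-op fallback, and this one did not.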
Re: 2.6.24-rc1-git: crash on shutdown/unmount?
* Jens Axboe | 2007-11-01 11:51:09 [+0100]:
>On Wed, Oct 31 2007, Alistair John Strachan wrote:
>> Hi Jens,
>>
>> I guessed from the oops that you might have an idea what's causing this oops
>> on shutdown/unmount. The git version (describe), a screenshot showing the
>> oops, a config, and dmesg for a booted kernel are available from:
>>
>> http://devzero.co.uk/~alistair/oops-20071031/
>>
>> I went back to -rc1 and it still happens there too. If you need any more
>> information or want me to bisect it, please let me know.
>
>Does this work for you?

Yes, it does. Thanks for working on that. Sorry for the late reply, but I
ran into -ENOINET.

Acked-by: Sebastian Siewior <[EMAIL PROTECTED]>
Re: [RFC PATCH v1 00/25] printk: new implementation
On 2019-03-13 09:19:32 [+0100], John Ogness wrote:
> recursive situation. As you are pointing out, the notification/wake
> component of printk_safe will still be needed. I will leave that (small)
> part in printk_safe.c.

Does this mean we keep the irq_work part, or do we bury it and solve it by
other means?

> John Ogness

Sebastian
Re: [PATCH] [RFC] rt: kernel/sched/core: fix kthread_park() pending too long when CPU un-plugged
On 2021-01-07 11:45:39 [+0100], Peter Zijlstra wrote:
> On Thu, Jan 07, 2021 at 05:18:41PM +0800, Ran Wang wrote:
> > +
> > +	if (IS_ENABLED(CONFIG_PREEMPT_RT) &&
> > +	    !strncmp(p->comm, "ksoftirqd/", 10))
> > +		schedule_hrtimeout(&to,
> > +			HRTIMER_MODE_REL | HRTIMER_MODE_HARD);
> > +	else
> > +		schedule_hrtimeout(&to, HRTIMER_MODE_REL);
> 
> This is horrific, why did you not self-censor and spare me the mental
> anguish of having to formulate a CoC compliant response?
> 
> It also violates coding style, but given the total lack of any sense,
> that seems like a minor detail.
> 
> Why can't we use HRTIMER_MODE_HARD unconditionally?

I had a similar patch in -RT and dropped it in v5.10-rc7-rt16. It was
added because RT couldn't boot, since creating the boot threads didn't
work before ksoftirqd was up. This was fixed by commit
   26c7295be0c5e ("kthread: Do not preempt current task if it is going
   to call schedule()")
and life was good again.

tglx (also) suggested adding HRTIMER_MODE_HARD unconditionally (it
looked at SYSTEM_STATE back then) and I was only worried about abuse via
userland. This sleep can be triggered by ptrace/strace(), and with brief
testing I can trigger the sleep there, but I don't get it anywhere near
where I would notice it with cyclictest.

Sebastian
Re: [PATCH] [RFC] rt: kernel/sched/core: fix kthread_park() pending too long when CPU un-plugged
On 2021-01-08 08:45:14 [+], Ran Wang wrote:
> Hi Sebastian, Peter

Hi,

> > I had a similar patch in -RT and dropped it in v5.10-rc7-rt16.
> > It was added because RT couldn't boot since creating the boot-threads
> > didn't work before the ksoftirqd was up. This was fixed by commit
> >    26c7295be0c5e ("kthread: Do not preempt current task if it is going
> >    to call schedule()")
> 
> I tried applying above commit to linux-5.6.y-rt, it could resolve my problem
> on LX2160ARDB, THANKS!

So, in other words, all of this could have been avoided by using a
supported or maintained RT series. The v5.4 series has this patch; v5.6
isn't maintained anymore, so it is likely that there is more missing.

Sebastian
Re: [PATCH] [RFC] rt: kernel/sched/core: fix kthread_park() pending too long when CPU un-plugged
On 2021-01-08 10:32:36 [+0100], Peter Zijlstra wrote:
> It's a single task wakeup (the caller), doing that from hardirq context
> really should not be a problem, we do lots of that in RT already.

I'm not worried about that single wakeup, but about an artificial case
where you manage to accumulate multiple single wakeups in a short time
frame.

Sebastian
[PATCH] fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t
From: Thomas Gleixner

Bit spinlocks are problematic if PREEMPT_RT is enabled, because they
disable preemption, which is undesired for latency reasons and breaks when
regular spinlocks are taken within the bit_spinlock locked region because
regular spinlocks are converted to 'sleeping spinlocks' on RT. So RT
replaces the bit spinlocks with regular spinlocks to avoid this problem.
Bit spinlocks are also not covered by lock debugging, e.g. lockdep.

Substitute the BH_Uptodate_Lock bit spinlock with a regular spinlock.

Signed-off-by: Thomas Gleixner
[bigeasy: remove the wrapper and use always spinlock_t]
Signed-off-by: Sebastian Andrzej Siewior
---
 fs/buffer.c                 | 19 +++
 fs/ext4/page-io.c           |  8 +++-
 fs/ntfs/aops.c              |  9 +++--
 include/linux/buffer_head.h |  6 +++---
 4 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 131d39ec7d316..eab37fbaa439f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -275,8 +275,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
	 * decide that the page is now completely done.
	 */
	first = page_buffers(page);
-	local_irq_save(flags);
-	bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
+	spin_lock_irqsave(&first->uptodate_lock, flags);
	clear_buffer_async_read(bh);
	unlock_buffer(bh);
	tmp = bh;
@@ -289,8 +288,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
		}
		tmp = tmp->b_this_page;
	} while (tmp != bh);
-	bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-	local_irq_restore(flags);
+	spin_unlock_irqrestore(&first->uptodate_lock, flags);

	/*
	 * If none of the buffers had errors and they are all
@@ -302,8 +300,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
	return;

still_busy:
-	bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-	local_irq_restore(flags);
+	spin_unlock_irqrestore(&first->uptodate_lock, flags);
	return;
}

@@ -331,8 +328,7 @@ void end_buffer_async_write(struct buffer_head *bh, int uptodate)
	}

	first = page_buffers(page);
-	local_irq_save(flags);
-	bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
+	spin_lock_irqsave(&first->uptodate_lock, flags);

	clear_buffer_async_write(bh);
	unlock_buffer(bh);
@@ -344,14 +340,12 @@ void end_buffer_async_write(struct buffer_head *bh, int uptodate)
		}
		tmp = tmp->b_this_page;
	}
-	bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-	local_irq_restore(flags);
+	spin_unlock_irqrestore(&first->uptodate_lock, flags);
	end_page_writeback(page);
	return;

still_busy:
-	bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-	local_irq_restore(flags);
+	spin_unlock_irqrestore(&first->uptodate_lock, flags);
	return;
}
EXPORT_SYMBOL(end_buffer_async_write);
@@ -3420,6 +3414,7 @@ struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
	struct buffer_head *ret = kmem_cache_zalloc(bh_cachep, gfp_flags);
	if (ret) {
		INIT_LIST_HEAD(&ret->b_assoc_buffers);
+		spin_lock_init(&ret->uptodate_lock);
		preempt_disable();
		__this_cpu_inc(bh_accounting.nr);
		recalc_bh_state();
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 12ceadef32c5a..7745ed23c6ad9 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -87,11 +87,10 @@ static void ext4_finish_bio(struct bio *bio)
		}
		bh = head = page_buffers(page);
		/*
-		 * We check all buffers in the page under BH_Uptodate_Lock
+		 * We check all buffers in the page under uptodate_lock
		 * to avoid races with other end io clearing async_write flags
		 */
-		local_irq_save(flags);
-		bit_spin_lock(BH_Uptodate_Lock, &head->b_state);
+		spin_lock_irqsave(&head->uptodate_lock, flags);
		do {
			if (bh_offset(bh) < bio_start ||
			    bh_offset(bh) + bh->b_size > bio_end) {
@@ -103,8 +102,7 @@ static void ext4_finish_bio(struct bio *bio)
			if (bio->bi_status)
				buffer_io_error(bh);
		} while ((bh = bh->b_this_page) != head);
-		bit_spin_unlock(BH_Uptodate_Lock, &head->b_state);
-		local_irq_restore(flags);
+		spin_unlock_irqrestore(&head->uptodate_lock, flags);
		if (!under_io) {
			fscrypt_free_bounce_page(bounce_page);
			end_page_writeback(page);
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 7202a1e39d70c..14ca433b3a9e4 100644
--- a/fs/ntfs/aops.c
+++ b/fs/
Re: [patch V2 0/7] fs: Substitute bit-spinlocks for PREEMPT_RT and debugging
On 2019-08-10 01:18:34 [-0700], Christoph Hellwig wrote: > > > Does SLUB work on -rt at all? > > > > It's the only allocator we support with a few tweaks :) > > What do you do about this particular piece of code there? This part remains untouched. This "lock" is acquired within ->list_lock which is a raw_spinlock_t and disables preemption/interrupts on -RT. Sebastian
Re: [PATCH] fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t
On 2019-08-20 20:01:14 [+0200], Thomas Gleixner wrote: > On Tue, 20 Aug 2019, Matthew Wilcox wrote: > > On Tue, Aug 20, 2019 at 07:08:18PM +0200, Sebastian Siewior wrote: > > > Bit spinlocks are problematic if PREEMPT_RT is enabled, because they > > > disable preemption, which is undesired for latency reasons and breaks when > > > regular spinlocks are taken within the bit_spinlock locked region because > > > regular spinlocks are converted to 'sleeping spinlocks' on RT. So RT > > > replaces the bit spinlocks with regular spinlocks to avoid this problem. > > > Bit spinlocks are also not covered by lock debugging, e.g. lockdep. > > > > > > Substitute the BH_Uptodate_Lock bit spinlock with a regular spinlock. > > > > > > Signed-off-by: Thomas Gleixner > > > [bigeasy: remove the wrapper and use always spinlock_t] > > > > Uhh ... always grow the buffer_head, even for non-PREEMPT_RT? Why? > > Christoph requested that: > > https://lkml.kernel.org/r/20190802075612.ga20...@infradead.org What do we do about this one? > Thanks, > > tglx Sebastian
Re: [patch 09/10] sched/core: Add migrate_disable/enable()
On 2020-09-18 10:22:32 [+0200], pet...@infradead.org wrote: > > > One reason for not allowing migrate_disable() to sleep was: FPU code. > > > > > > Could it be it does something like: > > > > > > preempt_disable(); > > > spin_lock(); > > > > > > spin_unlock(); > > > preempt_enable(); > > > > > > Where we'll never get preempted while migrate_disable()'d and thus never > > > trigger any of the sleep paths? > > > > I try to get rid of something like that. This doesn't work either way > > because the spin_lock() may block which it can't with disabled > > preemption. > > Yeah, that obviously should have been migrate_disable/enable instead of > spin_lock/unlock :/ Ah. Me stupid. fpregs_lock() does preempt_disable(); local_bh_disable(); which is more or less the "official" pattern. As of today local_bh_disable() does migrate_disable() / spin_lock(). Not sure what we end up with for local_bh_disable() in the end. We used to not have a BKL here on RT but ended up in all kinds of locking problems because vanilla treats local_bh_disable() as a BKL and uses it for locking. Today we have a per-CPU spinlock_t in local_bh_disable() to emulate the BKL. But this pattern above isn't working due to the atomic part… Sebastian
Re: [patch 09/10] sched/core: Add migrate_disable/enable()
On 2020-09-17 16:24:38 [+0200], pet...@infradead.org wrote: > And if I'm not mistaken, the above migrate_enable() *does* require being > able to schedule, and our favourite piece of futex: > > raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock); > spin_unlock(q.lock_ptr); > > is broken. Consider that spin_unlock() doing migrate_enable() with a > pending sched_setaffinity(). There are two instances of the above and only in the futex code and we have sort of duct tape for that by manually balancing the migrate counter so that it does not come to this. But yes, not having to do the manual balance is a plus. Sebastian
Re: [patch 09/10] sched/core: Add migrate_disable/enable()
On 2020-09-17 16:49:37 [+0200], pet...@infradead.org wrote: > I'm aware of the duct-tape :-) But I was under the impression that we > didn't want the duct-tape, and that there was lots of issues with the > FPU code, or was that another issue? Of course it would be better not to need the duct tape. Also symmetrical locking is what you want but clearly futex is one of a kind. I'm currently not aware of any issues in the FPU code in regard to this. A few weeks ago, I was looking for this kind of usage and only futex popped up. Sebastian
Re: [patch 09/10] sched/core: Add migrate_disable/enable()
On 2020-09-17 17:54:10 [+0200], pet...@infradead.org wrote: > I'm not sure what the problem with FPU was, I was throwing alternatives > at tglx to see what would stick, in part to (re)discover the design > constraints of this thing. was this recent or distant in the time line? > One reason for not allowing migrate_disable() to sleep was: FPU code. > > Could it be it does something like: > > preempt_disable(); > spin_lock(); > > spin_unlock(); > preempt_enable(); > > Where we'll never get preempted while migrate_disable()'d and thus never > trigger any of the sleep paths? I try to get rid of something like that. This doesn't work either way because the spin_lock() may block which it can't with disabled preemption. Looking at my queue, FPU related is crypto. And here we break the loops mostly due to the construct: kernel_fpu_begin(); while (bytes) crypto_thingy(); skcipher_walk_done() and skcipher_walk_done() could allocate/free/map memory. This is independent. Ah. We used to have migrate_disable() in pagefault_disable(). The x86 FPU code does preempt_disable(); … pagefault_disable(); but that migrate_disable() was moved from pagefault_disable() to kmap_atomic(). We shouldn't have preempt_disable(); || local_irq_disable(); kmap_atomic(); on RT. I've been running around removing those. See a10dcebacdb0c ("fs/ntfs/aops.c: don't disable interrupts during kmap_atomic()") ce1e518190ea7 ("ide: don't disable interrupts during kmap_atomic()") f3a1075e5fc34 ("block: don't disable interrupts during kmap_atomic()") Sebastian
Re: [patch V2 00/24] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem
On 2017-04-25 17:10:37 [+0100], Mark Rutland wrote:
> Hi,

Hi,

> When we bring the secondary CPU online, we detect an erratum that wasn't
> present on the boot CPU, and try to enable a static branch we use to
> track the erratum. The call to static_branch_enable() blows up as above.

this (cpus_set_cap()) seems only to be used in the CPU-up path.

> I see that we now have static_branch_disable_cpuslocked(), but we don't
> have an equivalent for enable. I'm not sure what we should be doing
> here.

We should introduce static_branch_enable_cpuslocked(). Does this work for
you (after s/static_branch_enable/static_branch_enable_cpuslocked/ in
cpus_set_cap()) ?:

From fe004351e4649da037d5165247e89ab976fe4a26 Mon Sep 17 00:00:00 2001
From: Sebastian Andrzej Siewior
Date: Tue, 25 Apr 2017 18:43:00 +0200
Subject: [PATCH] jump_label: Provide static_key_[enable|slow_inc]_cpuslocked()

Provide static_key_[enable|slow_inc]_cpuslocked() variants that don't take
cpu_hotplug_lock().

Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/jump_label.h |  7 +++
 kernel/jump_label.c        | 10 ++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index d7b17d1ab875..c80d8b1279b5 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -164,6 +164,7 @@ extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
+extern void static_key_enable_cpuslocked(struct static_key *key);
 extern void static_key_disable(struct static_key *key);
 extern void static_key_disable_cpuslocked(struct static_key *key);

@@ -252,6 +253,11 @@ static inline void static_key_enable(struct static_key *key)
 	static_key_slow_inc(key);
 }

+static inline void static_key_enable_cpuslocked(struct static_key *key)
+{
+	static_key_enable(key);
+}
+
 static inline void static_key_disable(struct static_key *key)
 {
 	int count = static_key_count(key);

@@ -429,6 +435,7 @@ extern bool wrong_branch_error(void);
  */
 #define static_branch_enable(x)			static_key_enable(&(x)->key)
+#define static_branch_enable_cpuslocked(x)	static_key_enable_cpuslocked(&(x)->key)
 #define static_branch_disable(x)		static_key_disable(&(x)->key)
 #define static_branch_disable_cpuslocked(x)	static_key_disable_cpuslocked(&(x)->key)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index d71124ee3b14..6343f4c7e27f 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -90,6 +90,16 @@ void static_key_enable(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_enable);

+void static_key_enable_cpuslocked(struct static_key *key)
+{
+	int count = static_key_count(key);
+
+	WARN_ON_ONCE(count < 0 || count > 1);
+
+	if (!count)
+		static_key_slow_inc_cpuslocked(key);
+}
+
 void static_key_disable(struct static_key *key)
 {
 	int count = static_key_count(key);
--
2.11.0

> Thanks,
> Mark.

Sebastian
Re: [patch V2 00/24] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem
On 2017-04-26 11:32:36 [+0100], Mark Rutland wrote: > > So we could end up calling static_branch_enable_cpuslocked() > > without actually holding the lock. Should we do a cpu_hotplug_begin/done in > > setup_cpu_feature_capabilities ? I agree it doesn't look that nice. > > Thoughts ? > > I agree that's hideous, but it looks like the only choice given the > hotplug rwsem cahnges. :/ Would it work for you to provide a locked and an unlocked version? > I can spin a v2 with that and the typo fixes. > > Thanks, > Mark. Sebastian
[RFC PATCH] trace/perf: cure locking issue in perf_event_open() error path
As trinity figured out, there is a recursive get_online_cpus() in
perf_event_open()'s error path:

| Call Trace:
|  dump_stack+0x86/0xce
|  __lock_acquire+0x2520/0x2cd0
|  lock_acquire+0x27c/0x2f0
|  get_online_cpus+0x3d/0x80
|  static_key_slow_dec+0x5a/0x70
|  sw_perf_event_destroy+0x8e/0x100
|  _free_event+0x61b/0x800
|  free_event+0x68/0x70
|  SyS_perf_event_open+0x19db/0x1d80

In order to cure this, I am moving free_event() after the put_online_cpus()
block. Besides that one, there is also the error path in perf_event_alloc()
which also invokes event->destroy. Here I delayed the destroy work to
schedule_work().

Signed-off-by: Sebastian Andrzej Siewior
---
I am not quite happy with the schedule_work() part.

 include/linux/perf_event.h |  1 +
 kernel/events/core.c       | 52 +-
 2 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 24a635887f28..d6a874dbbd21 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -718,6 +718,7 @@ struct perf_event {
 #endif

 	struct list_head		sb_list;
+	struct work_struct		destroy_work;
 #endif /* CONFIG_PERF_EVENTS */
 };

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7aed78b516fc..3358889609f8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9320,6 +9320,16 @@ static void account_event(struct perf_event *event)
 	account_pmu_sb_event(event);
 }

+static void perf_alloc_destroy_ev(struct work_struct *work)
+{
+	struct perf_event *event;
+
+	event = container_of(work, struct perf_event, destroy_work);
+	event->destroy(event);
+	module_put(event->pmu->module);
+	kfree(event);
+}
+
 /*
  * Allocate and initialize a event structure
  */
@@ -9334,6 +9344,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	struct pmu *pmu;
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
+	bool delay_destroy = false;
 	long err = -EINVAL;

 	if ((unsigned)cpu >= nr_cpu_ids) {
@@ -9497,15 +9508,22 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	exclusive_event_destroy(event);

 err_pmu:
-	if (event->destroy)
-		event->destroy(event);
-	module_put(pmu->module);
+	if (event->destroy) {
+		/* delay ->destroy due to nested get_online_cpus() */
+		INIT_WORK(&event->destroy_work, perf_alloc_destroy_ev);
+		delay_destroy = true;
+	} else {
+		module_put(pmu->module);
+	}
 err_ns:
 	if (is_cgroup_event(event))
 		perf_detach_cgroup(event);
 	if (event->ns)
 		put_pid_ns(event->ns);
-	kfree(event);
+	if (delay_destroy)
+		schedule_work(&event->destroy_work);
+	else
+		kfree(event);

 	return ERR_PTR(err);
 }
@@ -9798,7 +9816,7 @@ SYSCALL_DEFINE5(perf_event_open,
 		pid_t, pid, int, cpu, int, group_fd, unsigned long, flags)
 {
 	struct perf_event *group_leader = NULL, *output_event = NULL;
-	struct perf_event *event, *sibling;
+	struct perf_event *event = NULL, *sibling;
 	struct perf_event_attr attr;
 	struct perf_event_context *ctx, *uninitialized_var(gctx);
 	struct file *event_file = NULL;
@@ -9908,13 +9926,14 @@ SYSCALL_DEFINE5(perf_event_open,
 				 NULL, NULL, cgroup_fd);
 	if (IS_ERR(event)) {
 		err = PTR_ERR(event);
+		event = NULL;
 		goto err_cred;
 	}

 	if (is_sampling_event(event)) {
 		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
 			err = -EOPNOTSUPP;
-			goto err_alloc;
+			goto err_cred;
 		}
 	}

@@ -9927,7 +9946,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (attr.use_clockid) {
 		err = perf_event_set_clock(event, attr.clockid);
 		if (err)
-			goto err_alloc;
+			goto err_cred;
 	}

 	if (pmu->task_ctx_nr == perf_sw_context)
@@ -9962,7 +9981,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	ctx = find_get_context(pmu, task, event);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
-		goto err_alloc;
+		goto err_cred;
 	}

 	if ((pmu->capabilities & PERF_PMU_CAP_EXCLUSIVE) && group_leader) {
@@ -10186,18 +10205,21 @@ SYSCALL_DEFINE5(perf_event_open,
 err_context:
 	perf_unpin_context(ctx);
 	put_ctx(ctx);
-err_alloc:
-	/*
-	 * If event_file is set, the fput() above will have called ->release()
-	 * and that will take care of freeing the event.
-	 */
-	if (!event_file)
-		free_event(event);
 err_cred:
 	if (task)
 		mutex_unlock(&task->signal->cred_gua
Re: [RFC PATCH] trace/perf: cure locking issue in perf_event_open() error path
With the last patch on-top I trigger this now and then:

======================================================
WARNING: possible circular locking dependency detected
4.11.0-rc8-00894-g8bd462ee4aac-dirty #84 Not tainted
------------------------------------------------------
trinity-subchil/4966 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){++}, at: [] tp_perf_event_destroy+0xd/0x20

but task is already holding lock:
 (&ctx->mutex){+.+.+.}, at: [] perf_event_exit_task+0x2ae/0x8d0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&ctx->mutex){+.+.+.}:
       __lock_acquire+0x2534/0x2cd0
       lock_acquire+0x27c/0x2f0
       __mutex_lock+0xef/0x1280
       mutex_lock_nested+0x16/0x20
       SyS_perf_event_open+0x11ab/0x1e80
       entry_SYSCALL_64_fastpath+0x23/0xc2

-> #1 (&sig->cred_guard_mutex){+.+.+.}:
       __lock_acquire+0x2534/0x2cd0
       lock_acquire+0x27c/0x2f0
       __mutex_lock+0xef/0x1280
       mutex_lock_interruptible_nested+0x16/0x20
       SyS_perf_event_open+0x1bd6/0x1e80
       entry_SYSCALL_64_fastpath+0x23/0xc2

-> #0 (cpu_hotplug_lock.rw_sem){++}:
       check_prevs_add+0x544/0x18f0
       __lock_acquire+0x2534/0x2cd0
       lock_acquire+0x27c/0x2f0
       get_online_cpus+0x3d/0x80
       tp_perf_event_destroy+0xd/0x20
       _free_event+0x61b/0x800
       free_event+0x68/0x70
       perf_event_exit_task+0x816/0x8d0
       do_exit+0xc90/0x2a90
       do_group_exit+0x1aa/0x2b0
       SyS_exit_group+0x18/0x20
       entry_SYSCALL_64_fastpath+0x23/0xc2

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock.rw_sem --> &sig->cred_guard_mutex --> &ctx->mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ctx->mutex);
                               lock(&sig->cred_guard_mutex);
                               lock(&ctx->mutex);
  lock(cpu_hotplug_lock.rw_sem);

 *** DEADLOCK ***

1 lock held by trinity-subchil/4966:
 #0:  (&ctx->mutex){+.+.+.}, at: [] perf_event_exit_task+0x2ae/0x8d0

stack backtrace:
CPU: 3 PID: 4966 Comm: trinity-subchil Not tainted 4.11.0-rc8-00894-g8bd462ee4aac-dirty #84
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xce
 print_circular_bug+0x5c3/0x620
 ? lockdep_on+0x50/0x50
 check_prevs_add+0x544/0x18f0
 ? check_irq_usage+0x150/0x150
 ? lock_acquire+0x27c/0x2f0
 __lock_acquire+0x2534/0x2cd0
 ? __lock_acquire+0x2534/0x2cd0
 ? find_held_lock+0x36/0x1c0
 lock_acquire+0x27c/0x2f0
 ? tp_perf_event_destroy+0xd/0x20
 get_online_cpus+0x3d/0x80
 ? tp_perf_event_destroy+0xd/0x20
 tp_perf_event_destroy+0xd/0x20
 _free_event+0x61b/0x800
 free_event+0x68/0x70
 perf_event_exit_task+0x816/0x8d0
 do_exit+0xc90/0x2a90
 ? mm_update_next_owner+0x550/0x550
 ? getname_flags+0xde/0x370
 ? getname_flags+0xde/0x370
 ? rcu_read_lock_sched_held+0x14a/0x180
 ? prepare_bprm_creds+0xf0/0xf0
 ? kmem_cache_free+0x250/0x2c0
 ? getname_flags+0xde/0x370
 ? entry_SYSCALL_64_fastpath+0x5/0xc2
 do_group_exit+0x1aa/0x2b0
 SyS_exit_group+0x18/0x20
 entry_SYSCALL_64_fastpath+0x23/0xc2

and I am not sure what to do here…

Sebastian
Re: [patch RT 0/4] rwsem/rt: Lift the single reader restriction
On 2017-04-01 12:50:58 [+0200], Thomas Gleixner wrote: > R/W semaphores in RT do not allow multiple readers because a writer > blocking on the sempahore would have deal with all the readers in terms of > priority or budget inheritance. While multi reader priority boosting would > be possible (it has been attempted before), multi reader budget inheritance > is impossible. … > It cures the Radeon mess, lowers the contention on mmap_sem for certain > workloads and did not have any negative impact in our initial testing on RT > behaviour. > > I think it's worth to expose it to a wider audience of users for testing, > so we can figure out it whether there are dragons lurking. series applied. > Thanks, > > tglx Sebastian
Re: [PATCH 2/3] rtmutex: deboost priority conditionally when rt-mutex unlock
On 2017-04-13 22:02:53 [+0800], Alex Shi wrote: > The rt_mutex_fastunlock() will deboost 'current' task when it should be. Without judging whether or not this patch makes sense: those patches won't apply. Please look at tip/master or tip's locking/core https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core where PeterZ rewrote part of the code. Sebastian
Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper
On 2017-11-13 12:56:53 [-0500], Mikulas Patocka wrote: > Hi Hi, > I'm submitting this patch for the CONFIG_PREEMPT_RT patch. It fixes > deadlocks in device mapper when real time preemption is used. applied, thank you. > Mikulas Sebastian
Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper
On 2017-11-21 22:20:51 [+0100], Mike Galbraith wrote:
> On Tue, 2017-11-21 at 14:56 -0500, Mikulas Patocka wrote:
> >
> > If we don't have any reason why it is needed to unplug block requests when
> > a spinlock is taken - so let's not do this.
>
> That's perfectly fine.  I guess I shouldn't have even mentioned having
> encountered unplug at mutex being insufficient.

While at it, I intend to drop fs-jbd2-pull-your-plug-when-waiting-for-space.patch
from the -RT queue for v4.14 which does

--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -116,6 +116,8 @@ void __jbd2_log_wait_for_space(journal_t
 	nblocks = jbd2_space_needed(journal);
 	while (jbd2_log_space_left(journal) < nblocks) {
 		write_unlock(&journal->j_state_lock);
+		if (current->plug)
+			io_schedule();
 		mutex_lock(&journal->j_checkpoint_mutex);

 		/*

and is/was probably a workaround for the missing schedule while blocking
on mutex/rwsem.

> -Mike

Sebastian
Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper
On 2017-11-18 19:37:10 [+0100], Mike Galbraith wrote: > Below is my 2012 3.0-rt version of that for reference; at that time we > were using slab, and slab_lock referenced below was a local_lock. The > comment came from crash analysis of a deadlock I met before adding the > (yeah, hacky) __migrate_disabled() blocker. At the time, that was not > an optional hack, you WOULD deadlock if you ground disks to fine powder > the way SUSE QA did. Pulling the plug before blocking cured the > xfs/ext[34] IO deadlocks they griped about, but I had to add that hack > to not trade their nasty IO deadlocks for the more mundane variety. So > my question is: are we sure that in the here and now flush won't want > any lock we may be holding? In days of yore, it most definitely would > turn your box into a doorstop if you tried hard enough. I have a deadlock in ftest01/LTP which is cured by that. The root-problem (as I understand it) is that !RT does schedule() -> sched_submit_work() -> blk_schedule_flush_plug() on a lock contention (that is that bit_spinlock or rwsem in jbd2/ext4 for instance). On RT this does not happen because of tsk_is_pi_blocked() check in sched_submit_work(). That check is needed because an additional (rtmutex) lock can not be acquired at this point. With this change we get closer to what !RT does. In regard to that migrate_disable() we could check if it is possible to do that after we acquired the lock (which is something we tried before but failed due to CPU-hotplug requirement, maybe something changed now) or flush the plug before disabling migration if it is really a problem. To your question whether or not delaying IO can cause any deadlocks is something that I can't answer and this something that would affect !RT, too. I tried to add lockdep to bit-spinlocks but this does not work because one context acquires the lock and another does the unlock. 
It has been explained to me that no deadlocks should happen as long as the
IO is flushed before we block/wait on a lock.

> Subject: rt: pull your plug before blocking
> Date: Wed Jul 18 14:43:15 CEST 2012
> From: Mike Galbraith
>
> Queued IO can lead to IO deadlock should a task require wakeup from a task
> which is blocked on that queued IO.
>
> ext3: dbench1 queues a buffer, blocks on journal mutex, it's plug is not
> pulled.  dbench2 mutex owner is waiting for kjournald, who is waiting for
> the buffer queued by dbench1.  Game over.
>
> Signed-off-by: Mike Galbraith
> ---
>  kernel/locking/rtmutex.c | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -22,6 +22,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "rtmutex_common.h"
>
> @@ -930,8 +931,18 @@ static inline void rt_spin_lock_fastlock
>
>  	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
>  		rt_mutex_deadlock_account_lock(lock, current);
> -	else
> +	else {
> +		/*
> +		 * We can't pull the plug if we're already holding a lock
> +		 * else we can deadlock. eg, if we're holding slab_lock,
> +		 * ksoftirqd can block while processing BLOCK_SOFTIRQ after
> +		 * having acquired q->queue_lock.  If _we_ then block on
> +		 * that q->queue_lock while flushing our plug, deadlock.
> +		 */
> +		if (__migrate_disabled(current) < 2 && blk_needs_flush_plug(current))
> +			blk_schedule_flush_plug(current);
>  		slowfn(lock);
> +	}
>  }
>
>  static inline void rt_spin_lock_fastunlock(struct rt_mutex *lock,
> @@ -1892,9 +1903,12 @@ rt_mutex_fastlock(struct rt_mutex *lock,
>  	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
>  		rt_mutex_deadlock_account_lock(lock, current);
>  		return 0;
> -	} else
> +	} else {
> +		if (blk_needs_flush_plug(current))
> +			blk_schedule_flush_plug(current);
>  		return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK, ww_ctx);
> +	}
>  }
>
>  static inline int

Sebastian