date:20160523

Re: [PATCH v7 0/5] Make cpuid <-> nodeid mapping persistent

2016-05-23 Thread Zhu Guihua


Hi,

On 05/19/2016 10:46 PM, Peter Zijlstra wrote:

On Thu, May 19, 2016 at 06:39:41PM +0800, Zhu Guihua wrote:

[Problem]

The cpuid <-> nodeid mapping is first established at boot time, and workqueue
caches the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot time.

When a node goes online/offline, the cpuid <-> nodeid mapping is
established/destroyed, which means the mapping will change if node hotplug
happens. But workqueue does not update wq_numa_possible_cpumask.

So why are you not fixing up wq_numa_possible_cpumask instead? That
seems the far easier solution.


We tried to do that. You can see our patch at
http://www.gossamer-threads.com/lists/linux/kernel/2116748

But the maintainer thought we should establish a persistent cpuid <-> nodeid
relationship, so that there is no need to change the map.

Cc TJ,
Could we return to workqueue to fix this?

Thanks,
Zhu


Do all the other archs that support NUMA and HOTPLUG have the mapping
stable, or will you now go fix each and every one of them?



Re: [PATCH 52/54] MAINTAINERS: Add file patterns for watchdog device tree bindings

2016-05-23 Thread Wim Van Sebroeck

Hi Geert,

> Submitters of device tree binding documentation may forget to CC
> the subsystem maintainer if this is missing.
> 
> Signed-off-by: Geert Uytterhoeven 
> Cc: Wim Van Sebroeck 
> Cc: Guenter Roeck 
> Cc: linux-watch...@vger.kernel.org
> ---
> Please apply this patch directly if you want to be involved in device
> tree binding documentation for your subsystem.
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9f54762f002e16de..7202b6565dd98d50 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12400,6 +12400,7 @@ L:linux-watch...@vger.kernel.org
>  W:   http://www.linux-watchdog.org/
>  T:   git git://www.linux-watchdog.org/linux-watchdog.git
>  S:   Maintained
> +F:   Documentation/devicetree/bindings/watchdog/
>  F:   Documentation/watchdog/
>  F:   drivers/watchdog/
>  F:   include/linux/watchdog.h
> -- 
> 1.9.1
> 

Patch added to linux-watchdog-next.

Kind regards,
Wim.

Re: [PATCH v3] vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive

2016-05-23 Thread Yongji Xie


On 2016/5/20 6:33, Alex Williamson wrote:


On Thu, 12 May 2016 18:20:51 +0800
Yongji Xie  wrote:


The current vfio-pci implementation disallows mmap of
sub-page (size < PAGE_SIZE) MMIO BARs because these BARs' mmio
page may be shared with other BARs. This causes performance
issues when we pass through a PCI device with this kind of BAR:
the guest cannot handle the mmio accesses to the BARs directly,
which leads to mmio emulation in the host.

However, not all sub-page BARs share a page with other BARs.
We should allow mmap of the sub-page MMIO BARs that we can
make sure will not share a page with other BARs.

This patch adds support for this case. We also try to add a
dummy resource to reserve the remainder of the page, which a
hot-added device's BAR might otherwise be assigned into. But it's
not necessary to handle the case when the BAR is not page aligned,
because we can't expect the BAR to be assigned to the same
location in a page in the guest when we pass through the BAR, and
it's hard to access this BAR in userspace because we have
no way to get the BAR's location in a page.

Signed-off-by: Yongji Xie 
---
  drivers/vfio/pci/vfio_pci.c |   70 +++
  drivers/vfio/pci/vfio_pci_private.h |8 
  2 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 188b1ff..253c22f 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -110,6 +110,50 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
  }
  
+static bool vfio_pci_bar_mmap_supported(struct vfio_pci_device *vdev, int index)

+{
+   struct resource *res = vdev->pdev->resource + index;
+   struct vfio_pci_dummy_resource *dummy_res = NULL;
+
+   if (!IS_ENABLED(CONFIG_VFIO_PCI_MMAP))
+   return false;
+
+   if (!(res->flags & IORESOURCE_MEM))
+   return false;
+
+   /*
+* The PCI core shouldn't set up a resource with a type but
+* zero size. But there may be bugs that cause us to do that.
+*/
+   if (!resource_size(res))
+   return false;
+
+   if (resource_size(res) >= PAGE_SIZE)
+   return true;
+
+   if (!(res->start & ~PAGE_MASK)) {
+   /*
+* Add a dummy resource to reserve the remainder
+* of the exclusive page in case that hot-add
+* device's bar is assigned into it.
+*/
+   dummy_res = kzalloc(sizeof(*dummy_res), GFP_KERNEL);
+   if (dummy_res == NULL)
+   return false;
+   dummy_res->resource.start = res->end + 1;
+   dummy_res->resource.end = res->start + PAGE_SIZE - 1;
+   dummy_res->resource.flags = res->flags;
+   if (request_resource(res->parent, &dummy_res->resource)) {
+   kfree(dummy_res);
+   return false;
+   }
+   dummy_res->index = index;
+   list_add(&dummy_res->res_next, &vdev->dummy_resources_list);
+   return true;
+   }
+   return false;
+}

The name of this function is vfio_pci_bar_mmap_supported(), which
suggests we should be able to call it at any point to test if mmap is
supported, but that's not what it does.  It's actually a one time setup
function, that also happens to return what it found or managed to
reserve.  If we were to call this a second time, we might get a
different result.  So I think this either needs to change to something
like:

static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)

where it loops through all the BARs and results in a valid
bar_mmap_supported array, or the function should be made smart enough
to identify if the necessary resource has already been allocated such
that it can be called multiple times per BAR, at which point we could
remove the bar_mmap_supported array.


Thanks for your comment. I will rename this function
and keep the bar_mmap_supported array to cache the result.


A comment describing why we can only support sub-page mmaps for
resources aligned at the start of a page would also be helpful for
future maintenance here.


OK. I will add this.
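
The alignment rule and the one-time probe discussed above can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the vfio-pci code: `struct sketch_bar`, `SKETCH_PAGE_SIZE`, `sketch_bar_mmap_ok()` and `sketch_probe_mmaps()` are hypothetical names, the page size is assumed to be 4 KiB, and the "reservation" is just a computed range rather than a call to request_resource().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for this model */

/* Hypothetical stand-in for the start/size of one memory BAR. */
struct sketch_bar {
	uint64_t start;	/* bus address of the first byte */
	uint64_t size;	/* length in bytes, 0 if the BAR is unused */
};

/*
 * Model of the per-BAR decision: a BAR of at least one page is always
 * mappable; a sub-page BAR is mappable only when it starts on a page
 * boundary, in which case *reserve describes the rest of that page.
 */
static bool sketch_bar_mmap_ok(const struct sketch_bar *bar,
			       struct sketch_bar *reserve, bool *need_reserve)
{
	*need_reserve = false;

	if (bar->size == 0)
		return false;
	if (bar->size >= SKETCH_PAGE_SIZE)
		return true;
	if (bar->start & (SKETCH_PAGE_SIZE - 1))
		return false;	/* not at the start of a page */

	reserve->start = bar->start + bar->size;
	reserve->size = SKETCH_PAGE_SIZE - bar->size;
	*need_reserve = true;
	return true;
}

/* One-time probe over all BARs, caching the result per BAR. */
static void sketch_probe_mmaps(const struct sketch_bar *bars,
			       bool *supported, size_t n)
{
	struct sketch_bar reserve;
	bool need_reserve;
	size_t i;

	for (i = 0; i < n; i++)
		supported[i] = sketch_bar_mmap_ok(&bars[i], &reserve,
						  &need_reserve);
}
```

In this model, calling the probe once and reading the cached array afterwards avoids the problem raised in the review, where a second call to a setup function might give a different answer.
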


+
  static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
  static void vfio_pci_disable(struct vfio_pci_device *vdev);
  
@@ -145,10 +189,12 @@ static bool vfio_pci_nointx(struct pci_dev *pdev)

  static int vfio_pci_enable(struct vfio_pci_device *vdev)
  {
struct pci_dev *pdev = vdev->pdev;
-   int ret;
+   int ret, bar;
u16 cmd;
u8 msix_pos;
  
+	INIT_LIST_HEAD(&vdev->dummy_resources_list);

+
pci_set_power_state(pdev, PCI_D0);
  
  	/* Don't allow our initial saved state to include busmaster */

@@ -218,12 +264,17 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
}
}
  
+	for (bar

[PATCH v3 05/11] perf evlist: Introduce aux perf evlist

2016-05-23 Thread Wang Nan

Introduce auxiliary perf evlists. Such evlists are created by
perf_evlist__new_aux() from an existing evlist. A 'parent' pointer points to
the template.

An 'evlist->parent' pointer is added to 'struct perf_evlist' and points to the
evlist itself for normal evlists (so in this patch changing 'evlist' to
'evlist->parent' won't cause any error).

Following commits use the auxiliary evlist as a container of 'struct perf_mmap',
create an auxiliary evlist for overwritable events, and allow them to create
separate mmaps. To achieve this goal, this patch carefully changes 'evlist'
to 'evlist->parent' in all functions in the path of 'perf_evlist__mmap_ex',
except for 'evlist->mmap' related operations, to make sure all evlist
modifications like pollfd and event id hash tables go to the original evlist.

Signed-off-by: Wang Nan 
Cc: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
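
The parent-pointer idea in the description above can be pictured with a toy model. This is a sketch only: `struct sketch_evlist` and the `sketch_*` helpers are hypothetical stand-ins, and the two counters merely represent "shared bookkeeping" (pollfd, id tables) versus "per-evlist mmaps".

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct perf_evlist. */
struct sketch_evlist {
	struct sketch_evlist *parent;	/* itself for a normal evlist */
	int nr_pollfd;			/* shared state, kept on the parent */
	int nr_mmaps;			/* private state, kept per evlist */
};

/* A normal evlist is its own parent. */
static void sketch_evlist_init(struct sketch_evlist *evlist)
{
	evlist->parent = evlist;
	evlist->nr_pollfd = 0;
	evlist->nr_mmaps = 0;
}

/* An auxiliary evlist points at the evlist it was created from. */
static void sketch_evlist_init_aux(struct sketch_evlist *aux,
				   struct sketch_evlist *tmpl)
{
	sketch_evlist_init(aux);
	aux->parent = tmpl;
}

/* mmap bookkeeping stays local, pollfd bookkeeping goes to the parent. */
static void sketch_evlist_mmap_one(struct sketch_evlist *evlist)
{
	evlist->nr_mmaps++;
	evlist->parent->nr_pollfd++;
}
```

Because a normal evlist is its own parent, code written against `evlist->parent` behaves the same for it as code written against `evlist`, which is why the substitution is safe before any auxiliary evlist exists.
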
 tools/perf/util/evlist.c | 31 +--
 tools/perf/util/evlist.h |  3 +++
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1305910..af0bea7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -45,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct 
cpu_map *cpus,
fdarray__init(&evlist->pollfd, 64);
evlist->workload.pid = -1;
evlist->backward = false;
+   evlist->parent = evlist;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -994,7 +995,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist 
*evlist, int idx,
 {
struct perf_evsel *evsel;
 
-   evlist__for_each(evlist, evsel) {
+   evlist__for_each(evlist->parent, evsel) {
int fd;
 
if (evsel->system_wide && thread)
@@ -1021,16 +1022,16 @@ static int perf_evlist__mmap_per_evsel(struct 
perf_evlist *evlist, int idx,
 * Therefore don't add it for polling.
 */
if (!evsel->system_wide &&
-   __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+   __perf_evlist__add_pollfd(evlist->parent, fd, idx) < 0) {
perf_evlist__mmap_put(evlist, idx);
return -1;
}
 
if (evsel->attr.read_format & PERF_FORMAT_ID) {
-   if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
+   if (perf_evlist__id_add_fd(evlist->parent, evsel, cpu, 
thread,
   fd) < 0)
return -1;
-   perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
+   perf_evlist__set_sid_idx(evlist->parent, evsel, idx, 
cpu,
 thread);
}
}
@@ -1071,13 +1072,13 @@ static int perf_evlist__mmap_per_thread(struct 
perf_evlist *evlist,
struct mmap_params *mp)
 {
int thread;
-   int nr_threads = thread_map__nr(evlist->threads);
+   int nr_threads = thread_map__nr(evlist->parent->threads);
 
pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;
 
-   auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
+   auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist->parent, 
thread,
  false);
 
if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
@@ -1216,8 +1217,8 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, 
unsigned int pages,
 bool auxtrace_overwrite)
 {
struct perf_evsel *evsel;
-   const struct cpu_map *cpus = evlist->cpus;
-   const struct thread_map *threads = evlist->threads;
+   const struct cpu_map *cpus = evlist->parent->cpus;
+   const struct thread_map *threads = evlist->parent->threads;
struct mmap_params mp = {
.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
};
@@ -1225,7 +1226,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, 
unsigned int pages,
if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
return -ENOMEM;
 
-   if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) 
< 0)
+   if (evlist->parent->pollfd.entries == NULL && 
perf_evlist__alloc_pollfd(evlist->parent) < 0)
return -ENOMEM;
 
evlist->overwrite = overwrite;
@@ -1236,7 +1237,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, 
unsigned int pages,
auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
   auxtrace_pages, auxtrace_overwrite);
 
-   evlist__for_each(evlist, evsel) {
+   evlist__for_each(evlist->parent, evs

[PATCH v3 01/11] perf tools: Add API to pause/resume a evlist

2016-05-23 Thread Wang Nan

perf_evlist__{pause,resume}() are introduced to pause/resume
events in an evlist, using the PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
Following commits use them to ensure the overwrite ring buffer is paused
before reading.

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/util/evlist.c | 32 
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 34 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1a370db..9303525 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,6 +679,38 @@ static struct perf_evsel *perf_evlist__event2evsel(struct 
perf_evlist *evlist,
return NULL;
 }
 
+static int perf_evlist__set_paused(struct perf_evlist *evlist, bool pause)
+{
+   int i;
+
+   for (i = 0; i < evlist->nr_mmaps; i++) {
+   int fd = evlist->mmap[i].fd;
+   int err;
+
+   if (fd < 0)
+   continue;
+   err = ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT,
+   pause ? 1 : 0);
+   if (err) {
+   err = (errno == 0 ? -EINVAL : -errno);
+   pr_err("Unable to pause output on %d: %s\n",
+  fd, strerror(-err));
+   return err;
+   }
+   }
+   return 0;
+}
+
+int perf_evlist__pause(struct perf_evlist *evlist)
+{
+   return perf_evlist__set_paused(evlist, true);
+}
+
+int perf_evlist__resume(struct perf_evlist *evlist)
+{
+   return perf_evlist__set_paused(evlist, false);
+}
+
 /* When check_messup is true, 'end' must points to a good entry */
 static union perf_event *
 perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0d165b1..97090b7 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -136,6 +136,8 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist 
*evlist, int idx);
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
 
+int perf_evlist__pause(struct perf_evlist *evlist);
+int perf_evlist__resume(struct perf_evlist *evlist);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 
-- 
1.8.3.4

[PATCH v3 08/11] perf record: Introduce rec->overwrite_evlist for overwritable events

2016-05-23 Thread Wang Nan

Create an auxiliary evlist for overwritable events.

Before mmap, build this evlist and set its 'overwrite' and 'backward'
attributes. Since perf_evlist__mmap_ex() only maps events when
evsel->overwrite matches the evlist's corresponding attribute, with
these two evlists an event goes to either rec->evlist or
rec->overwrite_evlist.

Signed-off-by: Wang Nan 
Cc: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
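
The routing rule in the description above (an event is mapped by whichever evlist has a matching 'overwrite' attribute) can be sketched as a simple filter. The names `struct sketch_evsel` and `sketch_count_mapped()` are hypothetical and only model the matching rule, not the real mmap path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced event selector: a name and its overwrite flag. */
struct sketch_evsel {
	const char *name;
	bool overwrite;
};

/*
 * Count the events an evlist with the given 'overwrite' attribute
 * would map: only those whose own flag matches.
 */
static size_t sketch_count_mapped(const struct sketch_evsel *evsels,
				  size_t n, bool evlist_overwrite)
{
	size_t i, mapped = 0;

	for (i = 0; i < n; i++)
		if (evsels[i].overwrite == evlist_overwrite)
			mapped++;
	return mapped;
}
```

Running the filter once with `false` and once with `true` partitions the events: every event is counted by exactly one of the two passes.
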
 tools/perf/builtin-record.c | 125 
 1 file changed, 102 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9611380..b940c0d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,6 +50,7 @@ struct record {
struct perf_data_file   file;
struct auxtrace_record  *itr;
struct perf_evlist  *evlist;
+   struct perf_evlist  *overwrite_evlist;
struct perf_session *session;
const char  *progname;
int realtime_prio;
@@ -341,6 +342,78 @@ int auxtrace_record__snapshot_start(struct auxtrace_record 
*itr __maybe_unused)
 
 #endif
 
+static int record__create_overwrite_evlist(struct record *rec)
+{
+   struct perf_evlist *evlist = rec->evlist;
+   struct perf_evsel *pos;
+
+   evlist__for_each(evlist, pos) {
+   if (!pos->overwrite)
+   continue;
+
+   if (!rec->overwrite_evlist) {
+   rec->overwrite_evlist = perf_evlist__new_aux(evlist);
+   if (rec->overwrite_evlist) {
+   rec->overwrite_evlist->backward = true;
+   rec->overwrite_evlist->overwrite = true;
+   return 0;
+   } else
+   return -ENOMEM;
+   }
+   }
+   return 0;
+}
+
+static int record__mmap_evlist(struct record *rec,
+  struct perf_evlist *evlist)
+{
+   struct record_opts *opts = &rec->opts;
+   char msg[512];
+
+   if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, evlist->backward,
+opts->auxtrace_mmap_pages,
+opts->auxtrace_snapshot_mode) < 0) {
+   if (errno == EPERM) {
+   pr_err("Permission error mapping pages.\n"
+  "Consider increasing "
+  "/proc/sys/kernel/perf_event_mlock_kb,\n"
+  "or try again with a smaller value of 
-m/--mmap_pages.\n"
+  "(current value: %u,%u)\n",
+  opts->mmap_pages, opts->auxtrace_mmap_pages);
+   return -errno;
+   } else {
+   pr_err("failed to mmap with %d (%s)\n", errno,
+   strerror_r(errno, msg, sizeof(msg)));
+   if (errno)
+   return -errno;
+   else
+   return -EINVAL;
+   }
+   }
+   return 0;
+}
+
+static int record__mmap(struct record *rec)
+{
+   int err;
+
+   err = record__create_overwrite_evlist(rec);
+   if (err)
+   return err;
+
+   err = record__mmap_evlist(rec, rec->evlist);
+   if (err)
+   return err;
+
+   if (!rec->overwrite_evlist)
+   return 0;
+
+   err = record__mmap_evlist(rec, rec->overwrite_evlist);
+   if (err)
+   return err;
+   return 0;
+}
+
 static int record__open(struct record *rec)
 {
char msg[512];
@@ -353,6 +426,13 @@ static int record__open(struct record *rec)
perf_evlist__config(evlist, opts, &callchain_param);
 
evlist__for_each(evlist, pos) {
+   if (pos->overwrite) {
+   if (!pos->attr.write_backward) {
+   ui__warning("Unable to read from overwrite ring 
buffer\n\n");
+   rc = -ENOSYS;
+   goto out;
+   }
+   }
 try_again:
if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) 
{
@@ -377,28 +457,9 @@ try_again:
goto out;
}
 
-   if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
-opts->auxtrace_mmap_pages,
-opts->auxtrace_snapshot_mode) < 0) {
-   if (errno == EPERM) {
-   pr_err("Permission error mapping pages.\n"
-  "Consider increasing "
-  "/proc/sys/kernel/perf_event_mlock_kb,\n"
-  "or try again wi

[PATCH] blk: remove NULL check before freeing functions

2016-05-23 Thread Mike Danese

Coccinelle complains:

WARNING: NULL check before freeing functions like kfree, debugfs_remove,
debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider
reorganizing relevant code to avoid passing NULL values.

Signed-off-by: Mike Danese 
---
 block/bio-integrity.c | 7 ++-
 block/bio.c   | 7 ++-
 block/blk-core.c  | 3 +--
 block/elevator.c  | 3 +--
 4 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 711e4d8d..27bf4e7 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -499,11 +499,8 @@ EXPORT_SYMBOL(bioset_integrity_create);
 
 void bioset_integrity_free(struct bio_set *bs)
 {
-   if (bs->bio_integrity_pool)
-   mempool_destroy(bs->bio_integrity_pool);
-
-   if (bs->bvec_integrity_pool)
-   mempool_destroy(bs->bvec_integrity_pool);
+   mempool_destroy(bs->bio_integrity_pool);
+   mempool_destroy(bs->bvec_integrity_pool);
 }
 EXPORT_SYMBOL(bioset_integrity_free);
 
diff --git a/block/bio.c b/block/bio.c
index 0e4aa42..e1e1773 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1844,11 +1844,8 @@ void bioset_free(struct bio_set *bs)
if (bs->rescue_workqueue)
destroy_workqueue(bs->rescue_workqueue);
 
-   if (bs->bio_pool)
-   mempool_destroy(bs->bio_pool);
-
-   if (bs->bvec_pool)
-   mempool_destroy(bs->bvec_pool);
+   mempool_destroy(bs->bio_pool);
+   mempool_destroy(bs->bvec_pool);
 
bioset_integrity_free(bs);
bio_put_slab(bs);
diff --git a/block/blk-core.c b/block/blk-core.c
index 2475b1c7..916e7c3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -636,8 +636,7 @@ int blk_init_rl(struct request_list *rl, struct 
request_queue *q,
 
 void blk_exit_rl(struct request_list *rl)
 {
-   if (rl->rq_pool)
-   mempool_destroy(rl->rq_pool);
+   mempool_destroy(rl->rq_pool);
 }
 
 struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
diff --git a/block/elevator.c b/block/elevator.c
index c3555c9..5dc91ca 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -847,8 +847,7 @@ int elv_register(struct elevator_type *e)
spin_lock(&elv_list_lock);
if (elevator_find(e->elevator_name)) {
spin_unlock(&elv_list_lock);
-   if (e->icq_cache)
-   kmem_cache_destroy(e->icq_cache);
+   kmem_cache_destroy(e->icq_cache);
return -EBUSY;
}
list_add_tail(&e->list, &elv_list);
-- 
2.5.0

[PATCH v3 11/11] perf tools: Check write_backward during evlist config

2016-05-23 Thread Wang Nan

Before this patch, when using an overwritable ring buffer on an old
kernel, the error message is misleading:

 # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
 Error:
 The raw_syscalls:sys_enter event is not supported.

This patch outputs a clear error message to tell the user that the kernel
is too old:

 # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
 Reading from overwrite event is not supported by this kernel
 Error:
 The raw_syscalls:sys_enter event is not supported.

Signed-off-by: Wang Nan 
Cc: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/util/evsel.c  | 17 +
 tools/perf/util/evsel.h  | 13 +
 tools/perf/util/record.c | 17 +
 3 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6330a4f..994310f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,17 +29,7 @@
 #include "trace-event.h"
 #include "stat.h"
 
-static struct {
-   bool sample_id_all;
-   bool exclude_guest;
-   bool mmap2;
-   bool cloexec;
-   bool clockid;
-   bool clockid_wrong;
-   bool lbr_flags;
-   bool write_backward;
-} perf_missing_features;
-
+struct perf_missing_features perf_missing_features;
 static clockid_t clockid;
 
 static int perf_evsel__no_extra_init(struct perf_evsel *evsel __maybe_unused)
@@ -684,8 +674,11 @@ static void apply_config_terms(struct perf_evsel *evsel,
 * possible to set overwrite globally, without config
 * terms.
 */
-   if (evsel->overwrite)
+   if (evsel->overwrite) {
+   WARN_ONCE(perf_missing_features.write_backward,
+ "Reading from overwrite event is not supported by 
this kernel\n");
attr->write_backward = 1;
+   }
 
/* User explicitly set per-event callgraph, clear the old setting and 
reset. */
if ((callgraph_buf != NULL) || (dump_size > 0)) {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index bce99fa..c9b6716 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -11,6 +11,19 @@
 #include "cpumap.h"
 #include "counts.h"
 
+struct perf_missing_features {
+   bool sample_id_all;
+   bool exclude_guest;
+   bool mmap2;
+   bool cloexec;
+   bool clockid;
+   bool clockid_wrong;
+   bool lbr_flags;
+   bool write_backward;
+};
+
+extern struct perf_missing_features perf_missing_features;
+
 struct perf_evsel;
 
 /*
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 481792c..e3ab812 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -90,6 +90,11 @@ static void perf_probe_context_switch(struct perf_evsel 
*evsel)
evsel->attr.context_switch = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+   evsel->attr.write_backward = 1;
+}
+
 bool perf_can_sample_identifier(void)
 {
return perf_probe_api(perf_probe_sample_identifier);
@@ -129,6 +134,17 @@ bool perf_can_record_cpu_wide(void)
return true;
 }
 
+static void perf_check_write_backward(void)
+{
+   static bool checked = false;
+
+   if (!checked) {
+   perf_missing_features.write_backward =
+   !perf_probe_api(perf_probe_write_backward);
+   checked = true;
+   }
+}
+
 void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts,
 struct callchain_param *callchain)
 {
@@ -136,6 +152,7 @@ void perf_evlist__config(struct perf_evlist *evlist, struct 
record_opts *opts,
bool use_sample_identifier = false;
bool use_comm_exec;
 
+   perf_check_write_backward();
/*
 * Set the evsel leader links before we configure attributes,
 * since some might depend on this info.
-- 
1.8.3.4

[PATCH v3 00/11] perf tools: Support overwritable ring buffer

2016-05-23 Thread Wang Nan

This patch set enables daemonized perf recording by utilizing the
overwritable backward ring buffer. With this feature one can
put perf in the background and dump ring buffer records with a SIGUSR2
when something unusual is found. For example, the following
command records system calls, schedule events and samples on cpu cycles
continuously:
 # perf record -g -e cycles -e raw_syscalls:*/call-graph=no/ \
  -e sched:sched_switch/call-graph=no/ \
  --switch-output --overwrite -a

Then, by sending SIGUSR2 to perf when lag happens, we get multiple
perf.data outputs, each of them corresponding to an abnormal event, and the
data size is reasonable:

 # ls -l ./perf.data*
 -rw--- 1 root root 5122165 May 13 23:51 ./perf.data.2016051323511683
 -rw--- 1 root root 5135093 May 13 23:51 ./perf.data.2016051323512107
 -rw--- 1 root root 5135213 May 13 23:51 ./perf.data.2016051323512215
 -rw--- 1 root root 5135157 May 13 23:51 ./perf.data.2016051323512387

v1 -> v2: Totally redesigned: drop the principle of 'channel', use
  auxiliary evlist instead. Fix missing documentation.

v2 -> v3: Rename perf_evlist__toggle_paused() to perf_evlist__pause/resume.

Wang Nan (11):
  perf tools: Add API to pause/resume a evlist
  perf record: Prevent reading invalid data in record__mmap_read
  perf record: Rename variable to make code clear
  perf record: Read from backward ring buffer
  perf evlist: Introduce aux perf evlist
  perf tools: Don't poll and mmap overwritable events
  perf tools: Enable overwrite settings
  perf record: Introduce rec->overwrite_evlist for overwritable events
  perf record: Toggle overwrite ring buffer for reading
  perf tools: Don't warn about out of order event if write_backward is
used
  perf tools: Check write_backward during evlist config

 tools/perf/Documentation/perf-record.txt |  14 ++
 tools/perf/arch/x86/util/tsc.c   |   2 +
 tools/perf/builtin-record.c  | 354 +++
 tools/perf/perf.h|   1 +
 tools/perf/util/evlist.c |  85 ++--
 tools/perf/util/evlist.h |   6 +
 tools/perf/util/evsel.c  |  27 ++-
 tools/perf/util/evsel.h  |  15 ++
 tools/perf/util/parse-events.c   |  20 +-
 tools/perf/util/parse-events.h   |   2 +
 tools/perf/util/parse-events.l   |   2 +
 tools/perf/util/record.c |  17 ++
 tools/perf/util/session.c|  22 +-
 13 files changed, 498 insertions(+), 69 deletions(-)

Cc: Arnaldo Carvalho de Melo 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Wang Nan 
Cc: Zefan Li 
Cc: pi3or...@163.com

-- 
1.8.3.4

[PATCH v3 10/11] perf tools: Don't warn about out of order event if write_backward is used

2016-05-23 Thread Wang Nan

If the write_backward attribute is set, records are written into the kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid the 'XX out of order events recorded' warning message (timestamps
of records are in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at least one event.

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
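
Why timestamps come out reversed can be seen in a toy backward-writing ring. This is only a sketch: `struct sketch_ring` and its helpers are hypothetical, records are reduced to one fixed-size timestamp slot each, and nothing here touches the real perf mmap layout.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLOTS 8	/* must be a power of two */

/* Hypothetical ring of fixed-size records, one timestamp per slot. */
struct sketch_ring {
	uint64_t slot[SKETCH_SLOTS];
	uint64_t head;	/* moves toward lower offsets on every write */
};

/* Writer: step head backward, then store the new record there. */
static void sketch_write_backward(struct sketch_ring *rb, uint64_t timestamp)
{
	rb->head--;	/* unsigned wrap-around is well defined */
	rb->slot[rb->head & (SKETCH_SLOTS - 1)] = timestamp;
}

/* Reader: start at head and walk toward higher offsets. */
static void sketch_read_forward(const struct sketch_ring *rb,
				uint64_t *out, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = rb->slot[(rb->head + i) & (SKETCH_SLOTS - 1)];
}
```

The newest record always sits at head, so a forward read returns records from newest to oldest, which an ordering check would count as out of order.
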
 tools/perf/util/session.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 2335b28..8e3d9d4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1495,10 +1495,27 @@ int perf_session__register_idle_thread(struct 
perf_session *session)
return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+   const struct ordered_events *oe = &session->ordered_events;
+   struct perf_evsel *evsel;
+   bool should_warn = true;
+
+   evlist__for_each(session->evlist, evsel) {
+   if (evsel->attr.write_backward)
+   should_warn = false;
+   }
+
+   if (!should_warn)
+   return;
+   if (oe->nr_unordered_events != 0)
+   ui__warning("%u out of order events recorded.\n", 
oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
const struct events_stats *stats = &session->evlist->stats;
-   const struct ordered_events *oe = &session->ordered_events;
 
if (session->tool->lost == perf_event__process_lost &&
stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1555,8 +1572,7 @@ static void perf_session__warn_about_errors(const struct 
perf_session *session)
stats->nr_unprocessable_samples);
}
 
-   if (oe->nr_unordered_events != 0)
-   ui__warning("%u out of order events recorded.\n", 
oe->nr_unordered_events);
+   perf_session__warn_order(session);
 
events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

[PATCH v3 03/11] perf record: Rename variable to make code clear

2016-05-23 Thread Wang Nan

record__mmap_read() writes data from the ring buffer into perf.data.
'head' is maintained by the kernel and points to the last written record.
'old' is maintained by perf and points to the record read in the previous
round. record__mmap_read() saves data from 'old' to 'head' to
perf.data.

The names of these variables are not very intuitive. In addition,
when dealing with a backward writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.

Add start and end pointers to make the code clear and set md->prev to 'head'
instead of the moved 'old' pointer. This patch doesn't change
behavior since:

buf = &data[old & md->mask];
size = head - old;
old += size; <--- Here, old == head

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
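
The start/end arithmetic described above can be tried in isolation. The sketch below is a stand-alone model under assumptions: `sketch_ring_copy()` is a hypothetical helper that copies into a flat buffer instead of calling record__write(), and the ring size is `mask + 1` with `mask + 1` a power of two.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) out of a ring of size mask + 1 into
 * 'out', in at most two chunks. Returns the number of bytes copied,
 * or 0 if the range is larger than the ring (the reader fell behind).
 */
static size_t sketch_ring_copy(const unsigned char *data, uint64_t mask,
			       uint64_t start, uint64_t end,
			       unsigned char *out)
{
	size_t size = (size_t)(end - start);
	size_t copied = 0;

	if (size > mask + 1)
		return 0;

	/* Same test as in the patch: true when the range crosses the end. */
	if ((start & mask) + size != (end & mask)) {
		size_t first = (size_t)(mask + 1 - (start & mask));

		memcpy(out, &data[start & mask], first);
		start += first;
		copied += first;
	}

	/* Remaining (or only) chunk; afterwards start would equal end. */
	size = (size_t)(end - start);
	memcpy(out + copied, &data[start & mask], size);
	copied += size;

	return copied;
}
```

After the second chunk the moved start pointer equals end, which is why storing 'head' in md->prev is equivalent to storing the moved 'old'.
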
 tools/perf/builtin-record.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f302cc9..73ce651 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -88,17 +88,18 @@ static int record__mmap_read(struct record *rec, int idx)
struct perf_mmap *md = &rec->evlist->mmap[idx];
u64 head = perf_mmap__read_head(md);
u64 old = md->prev;
+   u64 end = head, start = old;
unsigned char *data = md->base + page_size;
unsigned long size;
void *buf;
int rc = 0;
 
-   if (old == head)
+   if (start == end)
return 0;
 
rec->samples++;
 
-   size = head - old;
+   size = end - start;
if (size > (unsigned long)(md->mask) + 1) {
WARN_ONCE(1, "failed to keep up with mmap data. (warn only 
once)\n");
 
@@ -107,10 +108,10 @@ static int record__mmap_read(struct record *rec, int idx)
return 0;
}
 
-   if ((old & md->mask) + size != (head & md->mask)) {
-   buf = &data[old & md->mask];
-   size = md->mask + 1 - (old & md->mask);
-   old += size;
+   if ((start & md->mask) + size != (end & md->mask)) {
+   buf = &data[start & md->mask];
+   size = md->mask + 1 - (start & md->mask);
+   start += size;
 
if (record__write(rec, buf, size) < 0) {
rc = -1;
@@ -118,16 +119,16 @@ static int record__mmap_read(struct record *rec, int idx)
}
}
 
-   buf = &data[old & md->mask];
-   size = head - old;
-   old += size;
+   buf = &data[start & md->mask];
+   size = end - start;
+   start += size;
 
if (record__write(rec, buf, size) < 0) {
rc = -1;
goto out;
}
 
-   md->prev = old;
+   md->prev = head;
perf_evlist__mmap_consume(rec->evlist, idx);
 out:
return rc;
-- 
1.8.3.4

[PATCH v3 09/11] perf record: Toggle overwrite ring buffer for reading

2016-05-23 Thread Wang Nan

overwrite_evt_state is introduced to reflect the state of overwritable
ring buffers. It is a state machine with 3 states:

 RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
    ^  ^              |                   |
    |  |___(disallow)_/                   |
    |                                     |
     \_________________(3)________________/

 RUNNING      : Overwritable ring buffers are recording
 DATA_PENDING : We are required to collect overwritable ring buffers
 EMPTY        : We have collected data from those ring buffers.

 (1): Pause ring buffers for reading
 (2): N/A
 (3): Resume ring buffers for recording

This complexity cannot be avoided. Because records are deliberately dropped
from an overwritable ring buffer, comparing the head and old pointers cannot
tell us whether unread data remains. The DATA_PENDING state is therefore
mandatory.
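
For illustration only, the transitions can be sketched as a small standalone
C function. This is not the code from the patch: overwrite_action() and the
action enum outside of a function are hypothetical, and the EMPTY branch is an
assumption based on edge (3) of the diagram.

```c
#include <assert.h>

/* Mirrors the enum added by the patch. */
enum overwrite_evt_state {
	OVERWRITE_EVT_RUNNING,
	OVERWRITE_EVT_DATA_PENDING,
	OVERWRITE_EVT_EMPTY,
};

/* What has to be done to the ring buffers on a state change. */
enum action { NONE, PAUSE, RESUME };

/* Hypothetical helper: pick the action for moving from 'old' to 'next'. */
static enum action overwrite_action(enum overwrite_evt_state old,
				    enum overwrite_evt_state next)
{
	switch (old) {
	case OVERWRITE_EVT_RUNNING:
		/* (1): leaving RUNNING pauses the buffers so they can be read. */
		return next != OVERWRITE_EVT_RUNNING ? PAUSE : NONE;
	case OVERWRITE_EVT_DATA_PENDING:
		/* (2) needs no action; going back to RUNNING resumes recording. */
		return next == OVERWRITE_EVT_RUNNING ? RESUME : NONE;
	case OVERWRITE_EVT_EMPTY:
		/* (3): assumed here, based on the diagram only. */
		return next == OVERWRITE_EVT_RUNNING ? RESUME : NONE;
	}
	assert(!"unknown overwrite_evt_state");
	return NONE;
}
```

Keeping the decision separate from the actual pause/resume calls makes each
edge of the diagram easy to check on its own.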

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/builtin-record.c | 146 +---
 1 file changed, 136 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b940c0d..ecce78d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -42,6 +42,28 @@
 #include 
 #include 
 
+/*
+ * State machine of overwrite_evt_state:
+ *
+ * RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
+ *^  ^ |
+ *|  |___(disallow)___/|
+ *||
+ * \_(3)__/
+ *
+ * RUNNING  : Overwritable ring buffers are recording
+ * DATA_PENDING : We are required to collect overwritable ring buffers
+ * EMPTY: We have collected data from those ring buffers.
+ *
+ * (1): Pause ring buffers for reading
+ * (2): N/A
+ * (3): Resume ring buffers for recording
+ */
+enum overwrite_evt_state {
+   OVERWRITE_EVT_RUNNING,
+   OVERWRITE_EVT_DATA_PENDING,
+   OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
struct perf_tooltool;
@@ -61,6 +83,7 @@ struct record {
boolbuildid_all;
booltimestamp_filename;
boolswitch_output;
+   enum overwrite_evt_state overwrite_evt_state;
unsigned long long  samples;
 };
 
@@ -132,9 +155,9 @@ rb_find_range(struct perf_evlist *evlist,
return backward_rb_find_range(data, mask, head, start, end);
 }
 
-static int record__mmap_read(struct record *rec, int idx)
+static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int idx)
 {
-   struct perf_mmap *md = &rec->evlist->mmap[idx];
+   struct perf_mmap *md = &evlist->mmap[idx];
u64 head = perf_mmap__read_head(md);
u64 old = md->prev;
u64 end = head, start = old;
@@ -143,7 +166,7 @@ static int record__mmap_read(struct record *rec, int idx)
void *buf;
int rc = 0;
 
-   if (rb_find_range(rec->evlist, data, md->mask, head,
+   if (rb_find_range(evlist, data, md->mask, head,
  old, &start, &end))
return -1;
 
@@ -157,7 +180,7 @@ static int record__mmap_read(struct record *rec, int idx)
WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
md->prev = head;
-   perf_evlist__mmap_consume(rec->evlist, idx);
+   perf_evlist__mmap_consume(evlist, idx);
return 0;
}
 
@@ -182,7 +205,7 @@ static int record__mmap_read(struct record *rec, int idx)
}
 
md->prev = head;
-   perf_evlist__mmap_consume(rec->evlist, idx);
+   perf_evlist__mmap_consume(evlist, idx);
 out:
return rc;
 }
@@ -462,6 +485,7 @@ try_again:
goto out;
session->evlist = evlist;
perf_session__set_id_hdr_size(session);
+   rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
return rc;
 }
@@ -542,17 +566,72 @@ static struct perf_event_header finished_round_event = {
.type = PERF_RECORD_FINISHED_ROUND,
 };
 
-static int record__mmap_read_all(struct record *rec)
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+   enum overwrite_evt_state state)
+{
+   struct perf_evlist *evlist = rec->overwrite_evlist;
+   enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+   enum action {
+   NONE,
+   PAUSE,
+   RESUME,
+   } action = NONE;
+
+   switch (old_state) {
+   case OVERWRITE_EVT_RUNNING:
+   if (state != OVERWRITE_EVT_RUNNING)
+   action = PAUSE;
+   break;
+   case OVERWRITE_EVT_DATA_PENDING:
+   if (state == OVERWRITE_EVT_RUNNING)
+   action = RESUME;
+   break;
+   case OVERWRITE_EVT_EMPTY:
+   if (sta

[PATCH v3 07/11] perf tools: Enable overwrite settings

2016-05-23 Thread Wang Nan

This patch adds the following option and config terms:

Set all events to overwrite mode globally:

 # perf record --overwrite ...

Set specific events to overwrite or no-overwrite mode:

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

Add the missing config terms and update the config term array size, because
the length of the longest term string has changed.

For overwritable events, attr.write_backward is selected automatically, since
perf requires the ring buffer to be written backward in order to read it.
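
The precedence described above can be sketched in standalone C as follows.
This is only an illustration under assumptions: struct event_cfg, enum
term_overwrite and resolve_overwrite() are hypothetical names, standing in for
evsel->overwrite, the per-event config term and attr.write_backward.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for the per-event config term. */
enum term_overwrite { TERM_UNSET, TERM_OVERWRITE, TERM_NO_OVERWRITE };

struct event_cfg {
	bool overwrite;		/* stands in for evsel->overwrite */
	bool write_backward;	/* stands in for attr.write_backward */
};

/* Global option first, then the per-event term, then derive write_backward. */
static struct event_cfg resolve_overwrite(bool global_overwrite,
					  enum term_overwrite term)
{
	struct event_cfg cfg = { global_overwrite, false };

	switch (term) {
	case TERM_OVERWRITE:
		cfg.overwrite = true;
		break;
	case TERM_NO_OVERWRITE:
		cfg.overwrite = false;
		break;
	case TERM_UNSET:
		break;
	default:
		assert(!"invalid config term");
	}

	/* Done last, so a global --overwrite without any term is covered too. */
	cfg.write_backward = cfg.overwrite;
	return cfg;
}
```

With this ordering, "--overwrite" combined with "cycles/no-overwrite/" leaves
that one event in normal mode, while every other event is written backward.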

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/Documentation/perf-record.txt | 14 ++
 tools/perf/builtin-record.c  |  1 +
 tools/perf/perf.h|  1 +
 tools/perf/util/evsel.c  | 12 
 tools/perf/util/evsel.h  |  2 ++
 tools/perf/util/parse-events.c   | 20 ++--
 tools/perf/util/parse-events.h   |  2 ++
 tools/perf/util/parse-events.l   |  2 ++
 8 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 8dbee83..f5cb932 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -360,6 +360,20 @@ particular perf.data snapshot should be kept or not.
 
 Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
 
+--overwrite::
+Makes all events use an overwritable ring buffer. An event with an overwritable
+ring buffer works like a flight recorder: when the buffer gets full, instead of
+dumping records into the output file, the kernel silently overwrites old
+records. Perf dumps data from overwritable ring buffers when switching output
+(see --switch-output) and before terminating.
+
+Perf behaves like a daemon when '--overwrite' and '--switch-output' are
+provided. It records and drops events in the background, and dumps data when
+something unusual is detected.
+
+The 'overwrite' attribute can also be set or cancelled for a specific event
+using config terms like 'cycles/overwrite/' and 'instructions/no-overwrite/'.
+
 SEE ALSO
 
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d4cf1b0..9611380 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1310,6 +1310,7 @@ struct option __record_options[] = {
OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
&record.opts.no_inherit_set,
"child tasks do not inherit counters"),
+   OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cd8f1b1..608b42b 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
bool record_switch_events;
bool all_kernel;
bool all_user;
+   bool overwrite;
unsigned int freq;
unsigned int mmap_pages;
unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 02c177d..6330a4f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -671,11 +671,22 @@ static void apply_config_terms(struct perf_evsel *evsel,
 */
attr->inherit = term->val.inherit ? 1 : 0;
break;
+   case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+   evsel->overwrite = term->val.overwrite ? 1 : 0;
+   break;
default:
break;
}
}
 
+   /*
+* Set backward after config term processing because it is
+* possible to set overwrite globally, without config
+* terms.
+*/
+   if (evsel->overwrite)
+   attr->write_backward = 1;
+
/* User explicitly set per-event callgraph, clear the old setting and reset. */
if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
@@ -747,6 +758,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
attr->inherit   = !opts->no_inherit;
+   evsel->overwrite= opts->overwrite;
 
perf_evsel__set_sample_bit(evsel, IP);
perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c1f1015..bce99fa 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
PERF_EVSEL__CONFIG_TE

[PATCH v3 06/11] perf tools: Don't poll and mmap overwritable events

2016-05-23 Thread Wang Nan

There is no need to receive events from an overwritable ring buffer. Instead,
perf should let such buffers run in the background until something happens.
This patch makes perf ignore normal poll events (POLLIN) from overwritable
events.

Overwritable events must be mapped read-only and backward, so if an evlist
and an evsel do not match (evsel->overwrite is true but the evlist is either
read/write or not backward, or vice versa), the evsel is not mapped.

It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case. record__pick_pc()
is used to check availability. Further commits will expand it.
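
The two rules above can be sketched as standalone C. The helper names
should_map() and poll_events() are hypothetical and only model the checks made
in the diff below; they are not perf functions.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/* An evsel is mapped only by an evlist whose mode matches its own. */
static bool should_map(bool evsel_overwrite, bool evlist_overwrite,
		       bool evlist_backward)
{
	return evsel_overwrite == (evlist_overwrite && evlist_backward);
}

/*
 * Overwritable events do not ask for POLLIN, so poll() does not wake up
 * for new data. Errors and hangups are still requested.
 */
static short poll_events(bool evsel_overwrite)
{
	short revent = evsel_overwrite ? 0 : POLLIN;

	return (short)(revent | POLLERR | POLLHUP);
}
```

The file descriptor stays in the poll set either way, so a hangup on an
overwritable event can still be noticed.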

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/arch/x86/util/tsc.c |  2 ++
 tools/perf/builtin-record.c|  9 -
 tools/perf/util/evlist.c   | 23 +++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
index 357f1b1..2e5567c 100644
--- a/tools/perf/arch/x86/util/tsc.c
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -62,6 +62,8 @@ int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
struct perf_tsc_conversion tc;
int err;
 
+   if (!pc)
+   return 0;
err = perf_read_tsc_conversion(pc, &tc);
if (err == -EOPNOTSUPP)
return 0;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc3fcb5..d4cf1b0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -655,6 +655,13 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
return 0;
 }
 
+static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
+{
+   if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
+   return rec->evlist->mmap[0].base;
+   return NULL;
+}
+
 static int record__synthesize(struct record *rec)
 {
struct perf_session *session = rec->session;
@@ -692,7 +699,7 @@ static int record__synthesize(struct record *rec)
}
}
 
-   err = perf_event__synth_time_conv(rec->evlist->mmap[0].base, tool,
+   err = perf_event__synth_time_conv(record__pick_pc(rec), tool,
  process_synthesized_event, machine);
if (err)
goto out;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index af0bea7..448c4b4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -463,9 +463,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-   int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+   int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
/*
 * Save the idx so that when we filter out fds POLLHUP'ed we can
 * close the associated evlist->mmap[] entry.
@@ -481,7 +481,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-   return __perf_evlist__add_pollfd(evlist, fd, -1);
+   return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -989,15 +989,28 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
+struct perf_evsel *evsel)
+{
+   if (evsel->overwrite)
+   return false;
+   return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
   struct mmap_params *mp, int cpu,
   int thread, int *output)
 {
struct perf_evsel *evsel;
+   int revent;
 
evlist__for_each(evlist->parent, evsel) {
int fd;
 
+   if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+   continue;
+
if (evsel->system_wide && thread)
continue;
 
@@ -1014,6 +1027,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
perf_evlist__mmap_get(evlist, idx);
}
 
+   revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
+
/*
 * The system_wide flag causes a selected event to be opened
 * always without a pid.  Consequently it will never get a
@@ -1022,7 +1037,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist 
*evlist, int

[PATCH v3 04/11] perf record: Read from backward ring buffer

2016-05-23 Thread Wang Nan

Introduce rb_find_range() to find the start and end positions of the data in
a backward ring buffer.
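
A standalone sketch of the backward walk follows, for illustration only.
struct rec_header is a hypothetical stand-in for struct perf_event_header,
find_backward_range() is not the function from the patch, and the sketch
assumes records are 8-byte aligned so that a header never wraps around the
end of the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct perf_event_header (8 bytes). */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* whole record, header included; 0 = never written */
};

/*
 * Walk a backward-written ring buffer from 'head' (the newest record)
 * toward older records and report the readable range [*start, *end).
 */
static int find_backward_range(const unsigned char *buf, uint64_t mask,
			       uint64_t head, uint64_t *start, uint64_t *end)
{
	uint64_t size = mask + 1;
	uint64_t pos = head;
	struct rec_header hdr = { 0, 0, 0 };

	assert((size & mask) == 0);	/* buffer size must be a power of two */
	*start = head;

	for (;;) {
		if (pos - head >= size) {
			/* Walked a whole buffer; drop a record that overshoots. */
			if (pos - head > size)
				pos -= hdr.size;
			*end = pos;
			return 0;
		}

		memcpy(&hdr, buf + (pos & mask), sizeof(hdr));
		if (hdr.size == 0) {
			/* Reached space that has not been written yet. */
			*end = pos;
			return 0;
		}
		pos += hdr.size;
	}
}
```

Because head only ever moves down in a backward buffer, the newest record
sits at head and every header read points to the next older record.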

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/builtin-record.c | 52 +
 tools/perf/util/evlist.c|  1 +
 tools/perf/util/evlist.h|  1 +
 3 files changed, 54 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 73ce651..dc3fcb5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -83,6 +83,54 @@ static int process_synthesized_event(struct perf_tool *tool,
return record__write(rec, event, event->header.size);
 }
 
+static int
+backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+{
+   struct perf_event_header *pheader;
+   u64 evt_head = head;
+   int size = mask + 1;
+
+   pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+   pheader = (struct perf_event_header *)(buf + (head & mask));
+   *start = head;
+   while (true) {
+   if (evt_head - head >= (unsigned int)size) {
+   pr_debug("Finished reading backward ring buffer: rewind\n");
+   if (evt_head - head > (unsigned int)size)
+   evt_head -= pheader->size;
+   *end = evt_head;
+   return 0;
+   }
+
+   pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+
+   if (pheader->size == 0) {
+   pr_debug("Finished reading backward ring buffer: get start\n");
+   *end = evt_head;
+   return 0;
+   }
+
+   evt_head += pheader->size;
+   pr_debug3("move evt_head: %"PRIx64"\n", evt_head);
+   }
+   WARN_ONCE(1, "Shouldn't get here\n");
+   return -1;
+}
+
+static int
+rb_find_range(struct perf_evlist *evlist,
+ void *data, int mask, u64 head, u64 old,
+ u64 *start, u64 *end)
+{
+   if (!evlist->backward) {
+   *start = old;
+   *end = head;
+   return 0;
+   }
+
+   return backward_rb_find_range(data, mask, head, start, end);
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -94,6 +142,10 @@ static int record__mmap_read(struct record *rec, int idx)
void *buf;
int rc = 0;
 
+   if (rb_find_range(rec->evlist, data, md->mask, head,
+ old, &start, &end))
+   return -1;
+
if (start == end)
return 0;
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9303525..1305910 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -44,6 +44,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
perf_evlist__set_maps(evlist, cpus, threads);
fdarray__init(&evlist->pollfd, 64);
evlist->workload.pid = -1;
+   evlist->backward = false;
 }
 
 struct perf_evlist *perf_evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 97090b7..d740fb8 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -44,6 +44,7 @@ struct perf_evlist {
bool overwrite;
bool enabled;
bool has_user_cpus;
+   bool backward;
size_t   mmap_len;
int  id_pos;
int  is_pos;
-- 
1.8.3.4

[PATCH v3 02/11] perf record: Prevent reading invalid data in record__mmap_read

2016-05-23 Thread Wang Nan

When record__mmap_read() is asked to read more data than the ring buffer can
hold, drop that data to avoid accessing invalid memory.

This can happen when reading from an overwritable ring buffer, which should
be avoided in the first place. Check for it anyway, for robustness.
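
As an illustration, the guard and the two-chunk copy can be sketched in
standalone C. ring_copy() is a hypothetical helper, not perf code, and it
tests for wrap-around slightly differently from record__mmap_read(), with the
same result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) out of a power-of-two ring buffer.
 * Returns the number of bytes copied, or 0 when the range is larger than
 * the buffer, i.e. the reader fell behind and the data is no longer valid.
 */
static size_t ring_copy(const unsigned char *data, uint64_t mask,
			uint64_t start, uint64_t end, unsigned char *out)
{
	uint64_t size = end - start;
	size_t copied = 0;

	assert(((mask + 1) & mask) == 0);	/* power-of-two buffer size */

	if (size > mask + 1)
		return 0;	/* would read overwritten or invalid memory */

	/* First chunk: up to the end of the buffer when the range wraps. */
	if ((start & mask) + size > mask + 1) {
		size_t chunk = (size_t)(mask + 1 - (start & mask));

		memcpy(out, &data[start & mask], chunk);
		copied += chunk;
		start += chunk;
	}

	/* Remaining bytes are contiguous. */
	memcpy(out + copied, &data[start & mask], (size_t)(end - start));
	copied += (size_t)(end - start);
	return copied;
}
```

Without the size check, a range longer than the buffer would make the second
copy run past the end of the mapping.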

Signed-off-by: Wang Nan 
Signed-off-by: He Kuang 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
---
 tools/perf/builtin-record.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f3679c4..f302cc9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 struct record {
@@ -98,6 +99,13 @@ static int record__mmap_read(struct record *rec, int idx)
rec->samples++;
 
size = head - old;
+   if (size > (unsigned long)(md->mask) + 1) {
+   WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+   md->prev = head;
+   perf_evlist__mmap_consume(rec->evlist, idx);
+   return 0;
+   }
 
if ((old & md->mask) + size != (head & md->mask)) {
buf = &data[old & md->mask];
-- 
1.8.3.4

[PATCH v2 2/3] regulator: of: Add support for parsing operation mode

2016-05-23 Thread Henry Chen

Some regulators allow consumers to change their operating mode for
module-specific purposes.

This patch adds support for parsing those properties and filling in the
regulator constraints, so that the regulator core can call regulator_set_mode()
to change the mode.
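
The translation from the devicetree mode list to a mode bitmask can be
sketched in standalone C as below. modes_to_mask() is a hypothetical helper,
the REGULATOR_MODE_* values are assumed to match
include/linux/regulator/consumer.h, and the range check is an addition that is
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's mode flags (one bit per mode). */
#define REGULATOR_MODE_FAST	0x1
#define REGULATOR_MODE_NORMAL	0x2
#define REGULATOR_MODE_IDLE	0x4
#define REGULATOR_MODE_STANDBY	0x8

/* Turn a list of devicetree mode indices (0, 1, 2, ...) into a bitmask. */
static uint32_t modes_to_mask(const uint32_t *modes, size_t cnt)
{
	uint32_t mask = 0;

	assert(modes != NULL || cnt == 0);

	for (size_t i = 0; i < cnt; i++) {
		/* Shifting by 32 or more is undefined, so skip such entries. */
		if (modes[i] >= 32)
			continue;
		mask |= UINT32_C(1) << modes[i];
	}
	return mask;
}
```

With the indices from the binding header (FAST = 0, NORMAL = 1, IDLE = 2,
STANDBY = 3), bit N of the mask lines up with the Nth kernel mode flag.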

Signed-off-by: Henry Chen 
---
 drivers/regulator/of_regulator.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c
index 6b0aa80..7f8d82e 100644
--- a/drivers/regulator/of_regulator.c
+++ b/drivers/regulator/of_regulator.c
@@ -31,7 +31,7 @@ static void of_get_regulation_constraints(struct device_node *np,
struct regulation_constraints *constraints = &(*init_data)->constraints;
struct regulator_state *suspend_state;
struct device_node *suspend_np;
-   int ret, i;
+   int ret, i, cnt;
u32 pval;
 
constraints->name = of_get_property(np, "regulator-name", NULL);
@@ -167,6 +167,19 @@ static void of_get_regulation_constraints(struct device_node *np,
suspend_state = NULL;
suspend_np = NULL;
}
+   cnt = of_property_count_elems_of_size(np,
+ "regulator-supported-modes",
+ sizeof(u32));
+   if (cnt > 0)
+   constraints->valid_ops_mask |= REGULATOR_CHANGE_MODE;
+
+   for (i = 0; i < cnt; i++) {
+   u32 mode;
+
+   of_property_read_u32_index(np, "regulator-supported-modes",
+  i, &mode);
+   constraints->valid_modes_mask |= (1 << mode);
+   }
 }
 
 /**
-- 
1.8.1.1.dirty

[PATCH v2 1/3] regulator: DT: Add DT property for operation mode configuration

2016-05-23 Thread Henry Chen

Some regulators allow consumers to change their operating mode for
module-specific purposes. Add a DT property to support this.

Signed-off-by: Henry Chen 
---
 Documentation/devicetree/bindings/regulator/regulator.txt |  5 +
 include/dt-bindings/regulator/regulator.h | 14 ++
 2 files changed, 19 insertions(+)
 create mode 100644 include/dt-bindings/regulator/regulator.h

diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt
index ecfc593..e505d0b 100644
--- a/Documentation/devicetree/bindings/regulator/regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/regulator.txt
@@ -49,6 +49,11 @@ Optional properties:
0: Disable active discharge.
1: Enable active discharge.
Absence of this property will leave configuration to default.
+- regulator-supported-modes: Regulators can run in a variety of different
+  operating modes depending on output load. This allows further system power
+  savings by selecting the best (and most efficient) regulator mode for a
+  desired load. Each of these operating modes is defined in
+  include/dt-bindings/regulator/regulator.h
 
 Deprecated properties:
 - regulator-compatible: If a regulator chip contains multiple
diff --git a/include/dt-bindings/regulator/regulator.h 
b/include/dt-bindings/regulator/regulator.h
new file mode 100644
index 000..2ed1dfd
--- /dev/null
+++ b/include/dt-bindings/regulator/regulator.h
@@ -0,0 +1,14 @@
+/*
+ * This header provides constants for binding regulator.
+ */
+
+#ifndef _DT_BINDINGS_REGULATOR_REGULATOR_H
+#define _DT_BINDINGS_REGULATOR_REGULATOR_H
+
+/* Regulator operating modes */
+#define REGULATOR_OPERATION_MODE_FAST  0x0
+#define REGULATOR_OPERATION_MODE_NORMAL0x1
+#define REGULATOR_OPERATION_MODE_IDLE  0x2
+#define REGULATOR_OPERATION_MODE_STANDBY   0x3
+
+#endif
-- 
1.8.1.1.dirty

[PATCH v2 3/3] regulator: mt6397: Add buck change mode regulator interface for mt6397

2016-05-23 Thread Henry Chen

The BUCKs of the mt6397 have an auto mode and a PWM mode.
Users can control these modes through the regulator interfaces.
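
The register handling behind this can be sketched in standalone C. struct
modeset_field, modeset_update() and modeset_read() are hypothetical names that
only model the mask and shift arithmetic; the real driver goes through regmap.

```c
#include <assert.h>
#include <stdint.h>

#define BUCK_MODE_AUTO		0u
#define BUCK_MODE_FORCE_PWM	1u

/* Hypothetical description of where the mode bit lives in a register. */
struct modeset_field {
	uint32_t mask;
	unsigned int shift;
};

/* Replace only the mode field and leave all other bits untouched. */
static uint32_t modeset_update(uint32_t reg, struct modeset_field f,
			       uint32_t val)
{
	assert(f.shift < 32);
	return (reg & ~f.mask) | ((val << f.shift) & f.mask);
}

/* Extract the mode field from a register value. */
static uint32_t modeset_read(uint32_t reg, struct modeset_field f)
{
	assert(f.shift < 32);
	return (reg & f.mask) >> f.shift;
}
```

For a buck whose mode bit is bit 11, the field would be { 1u << 11, 11 }, and
switching between the two modes flips only that bit.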

Signed-off-by: Henry Chen 
---
 drivers/regulator/mt6397-regulator.c | 89 
 1 file changed, 80 insertions(+), 9 deletions(-)

diff --git a/drivers/regulator/mt6397-regulator.c b/drivers/regulator/mt6397-regulator.c
index 1c45abb..c6c6aa85 100644
--- a/drivers/regulator/mt6397-regulator.c
+++ b/drivers/regulator/mt6397-regulator.c
@@ -23,6 +23,9 @@
 #include 
 #include 
 
+#define MT6397_BUCK_MODE_AUTO  0
+#define MT6397_BUCK_MODE_FORCE_PWM 1
+
 /*
  * MT6397 regulators' information
  *
@@ -38,10 +41,14 @@ struct mt6397_regulator_info {
u32 vselon_reg;
u32 vselctrl_reg;
u32 vselctrl_mask;
+   u32 modeset_reg;
+   u32 modeset_mask;
+   u32 modeset_shift;
 };
 
 #define MT6397_BUCK(match, vreg, min, max, step, volt_ranges, enreg,   \
-   vosel, vosel_mask, voselon, vosel_ctrl) \
+   vosel, vosel_mask, voselon, vosel_ctrl, _modeset_reg,   \
+   _modeset_shift) \
 [MT6397_ID_##vreg] = { \
.desc = {   \
.name = #vreg,  \
@@ -62,6 +69,9 @@ struct mt6397_regulator_info {
.vselon_reg = voselon,  \
.vselctrl_reg = vosel_ctrl, \
.vselctrl_mask = BIT(1),\
+   .modeset_reg = _modeset_reg,\
+   .modeset_mask = BIT(_modeset_shift),\
+   .modeset_shift = _modeset_shift \
 }
 
 #define MT6397_LDO(match, vreg, ldo_volt_table, enreg, enbit, vosel,   \
@@ -145,6 +155,63 @@ static const u32 ldo_volt_table7[] = {
130, 150, 180, 200, 250, 280, 300, 330,
 };
 
+static int mt6397_regulator_set_mode(struct regulator_dev *rdev,
+unsigned int mode)
+{
+   struct mt6397_regulator_info *info = rdev_get_drvdata(rdev);
+   int ret, val;
+
+   switch (mode) {
+   case REGULATOR_MODE_FAST:
+   val = MT6397_BUCK_MODE_FORCE_PWM;
+   break;
+   case REGULATOR_MODE_NORMAL:
+   val = MT6397_BUCK_MODE_AUTO;
+   break;
+   default:
+   ret = -EINVAL;
+   goto err_mode;
+   }
+
+   dev_dbg(&rdev->dev, "mt6397 buck set_mode %#x, %#x, %#x, %#x\n",
+   info->modeset_reg, info->modeset_mask,
+   info->modeset_shift, val);
+
+   val <<= info->modeset_shift;
+   ret = regmap_update_bits(rdev->regmap, info->modeset_reg,
+info->modeset_mask, val);
+err_mode:
+   if (ret != 0) {
+   dev_err(&rdev->dev,
+   "Failed to set mt6397 buck mode: %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
+static unsigned int mt6397_regulator_get_mode(struct regulator_dev *rdev)
+{
+   struct mt6397_regulator_info *info = rdev_get_drvdata(rdev);
+   int ret, regval;
+
+   ret = regmap_read(rdev->regmap, info->modeset_reg, &regval);
+   if (ret != 0) {
+   dev_err(&rdev->dev,
+   "Failed to get mt6397 buck mode: %d\n", ret);
+   return ret;
+   }
+
+   switch ((regval & info->modeset_mask) >> info->modeset_shift) {
+   case MT6397_BUCK_MODE_AUTO:
+   return REGULATOR_MODE_NORMAL;
+   case MT6397_BUCK_MODE_FORCE_PWM:
+   return REGULATOR_MODE_FAST;
+   default:
+   return -EINVAL;
+   }
+}
+
 static int mt6397_get_status(struct regulator_dev *rdev)
 {
int ret;
@@ -170,6 +237,8 @@ static const struct regulator_ops mt6397_volt_range_ops = {
.disable = regulator_disable_regmap,
.is_enabled = regulator_is_enabled_regmap,
.get_status = mt6397_get_status,
+   .set_mode = mt6397_regulator_set_mode,
+   .get_mode = mt6397_regulator_get_mode,
 };
 
 static const struct regulator_ops mt6397_volt_table_ops = {
@@ -196,28 +265,30 @@ static const struct regulator_ops mt6397_volt_fixed_ops = {
 static struct mt6397_regulator_info mt6397_regulators[] = {
MT6397_BUCK("buck_vpca15", VPCA15, 70, 1493750, 6250,
buck_volt_range1, MT6397_VCA15_CON7, MT6397_VCA15_CON9, 0x7f,
-   MT6397_VCA15_CON10, MT6397_VCA15_CON5),
+   MT6397_VCA15_CON10, MT6397_VCA15_CON5, MT6397_VCA15_CON2, 11),
MT6397_BUCK("buck_vpca7", VPCA7, 70, 1493750, 6250,
buck_volt_range1, MT6397_VPCA7_CON7, MT6397_VPCA7_CON9, 0x7f,
-   MT6397_VPCA7_CON10, MT6397_VPCA7_CON5),
+   MT6397_VPCA7_CON10, MT6397_VPCA7_CON5, MT

Re: [RFC PATCH 2/3] mmc: host: omap_hsmmc: Enable ADMA2

2016-05-23 Thread Felipe Balbi


Hi Kishon,

Kishon Vijay Abraham I  writes:
> Hi Felipe,
>
> On Friday 20 May 2016 12:06 AM, Felipe Balbi wrote:
>> 
>> Hi,
>> 
>> Tony Lindgren  writes:
>>> * Peter Ujfalusi  [160519 01:10]:
>>>> On 05/18/2016 10:30 PM, Tony Lindgren wrote:
>>>>> Ideally the adma support would be a separate loadable module,
>>>>> similar how the cppi41dma is a child of the OTG controller.
>>>>
>>>> The Master DMA is part of the hsmmc IP block. If the same ADMA module is
>>>> present on other IPs it might be beneficial to have a helper library to handle
>>>> it (allocating the descriptor pool, wrinting, updating descriptors, etc).
>>>
>>> OK. Yeah if it's part of the MMC controller it makes no sense to
>>> separate it. So then the conecrns are using alternate DMA
>>> implementations and keeping PM runtime working :)
>>>
>>> BTW, Felipe mentioned that the best thing to do in the long run would
>>> be to set up sdhci-omap.c operating in ADMA mode.
>>>
>>> Felipe, care to summarize what you had in mind?
>> 
>> yeah, just write a new sdhci-omap.c to start moving away from
>> omap-hsmmc.c, just like it was done for 8250-omap.
>> 
>> At the beginning, it could be just the bare minimum to get it working
>> and slowly move over stuff like pm runtime, dmaengine, PIO. Move more
>> platforms over to that driver and, eventually, get rid of omap-hsmmc.c
>> altogether.
>> 
>> That way, development can be focussed on generic layers (SDHCI) to which
>> OMAP MMC controller is compliant (apart from the VERSION register
>> quirk).
>
> About an year back, when I tried using SDHCI for OMAP I ran into
> issues and was not able to get it working. IIRC SDHCI_PRESENT_STATE
> (or OMAP_HSMMC_PSTATE) was not showing the correct state for card
> present and is unable to raise an interrupt when a card is inserted. I
> didn't debug this further.

I'd say this is a bug in hsmmc. I remember seeing some bits in some
TI-specific register (before SDHCI address space starts) which can be
used to keep parts of SDHCI powered on exactly so normal WP and CD pins
work as expected.

In any case, adding support for GPIO-based card detect to generic SDHCI
shouldn't be too difficult :-)

> It also kept me wondering why gpio interrupt was always used for card
> detect instead of using mmci_sdcd line of the controller.

Probably a really, really old bug which nobody ever debugged properly ;-)

ps: you don't need that ADMA2 DT property, btw. There's a bit in another
register which you can check if $this controller was configured with
ADMA2 support or not. IIRC, OMAP5's TRM describes them.

-- 
balbi

Add support for regulator operation mode of mt6397

2016-05-23 Thread Henry Chen

Some regulators support different operating modes, but at present there is no
suitable property for passing the operation mode constraints at runtime.

This series makes it possible to specify the supported modes as a devicetree
list. Consumers can set or get the regulator operation mode with
regulator_set_mode()/regulator_get_mode(), and the supported operating modes
are defined in the devicetree.

This is required by the SVS driver. SVS calibration requires that the
regulator be in its low-noise (PWM mode) state at boot, but at all other times
it can be in normal mode to save power.
http://www.spinics.net/lists/devicetree/msg111204.html

Changes in v2:
- Separate patch for binding document changes.
- Create a header to define operation mode on dt-bindings.
- Remove the property "regulator-supported-modes".

Henry Chen (3):
  regulator: DT: Add DT property for operation mode configuration
  regulator: of: Add support for parsing operation mode
  regulator: mt6397: Add buck change mode regulator interface for mt6397

 .../devicetree/bindings/regulator/regulator.txt|  5 ++
 drivers/regulator/mt6397-regulator.c   | 89 +++---
 drivers/regulator/of_regulator.c   | 15 +++-
 include/dt-bindings/regulator/regulator.h  | 14 
 4 files changed, 113 insertions(+), 10 deletions(-)
 create mode 100644 include/dt-bindings/regulator/regulator.h

-- 
1.8.1.1.dirty

Re: [PATCH 16/54] MAINTAINERS: Add file patterns for drm device tree bindings

2016-05-23 Thread Daniel Vetter

On Sun, May 22, 2016 at 11:05:53AM +0200, Geert Uytterhoeven wrote:
> Submitters of device tree binding documentation may forget to CC
> the subsystem maintainer if this is missing.
> 
> Signed-off-by: Geert Uytterhoeven 
> Cc: David Airlie 
> Cc: dri-de...@lists.freedesktop.org
> ---
> Please apply this patch directly if you want to be involved in device
> tree binding documentation for your subsystem.

Applied to drm-misc, thanks.
-Daniel

> ---
>  MAINTAINERS | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c79b99dd3a0bf22d..75138c09dd603093 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3868,6 +3868,9 @@ T:  git git://people.freedesktop.org/~airlied/linux
>  S:   Maintained
>  F:   drivers/gpu/drm/
>  F:   drivers/gpu/vga/
> +F:   Documentation/devicetree/bindings/display/
> +F:   Documentation/devicetree/bindings/gpu/
> +F:   Documentation/devicetree/bindings/video/
>  F:   Documentation/DocBook/gpu.*
>  F:   include/drm/
>  F:   include/uapi/drm/
> -- 
> 1.9.1
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 23/54] MAINTAINERS: Add file patterns for led device tree bindings

2016-05-23 Thread Jacek Anaszewski


Hi Geert,

On 05/22/2016 11:06 AM, Geert Uytterhoeven wrote:

Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven 
Cc: Richard Purdie 
Cc: Jacek Anaszewski 
Cc: linux-l...@vger.kernel.org
---
Please apply this patch directly if you want to be involved in device
tree binding documentation for your subsystem.
---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ec96da7cb006e823..82c82a72cb9cf02a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6701,6 +6701,7 @@ M:Jacek Anaszewski 
  L:linux-l...@vger.kernel.org
  T:git git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds.git
  S:Maintained
+F: Documentation/devicetree/bindings/leds/
  F:drivers/leds/
  F:include/linux/leds.h




Applied, thanks.

--
Best regards,
Jacek Anaszewski

Re: zone_reclaimable() leads to livelock in __alloc_pages_slowpath()

2016-05-23 Thread Michal Hocko

Hi,
Tetsuo has already pointed you at my oom detection rework which removes
the zone_reclaimable ugliness (btw. one of the top reasons to rework
this area) and it is likely to fix your problem. I would still like to
understand what happens with your test case because we might want to
prepare a stable patch for older kernels.

On Fri 20-05-16 22:28:17, Oleg Nesterov wrote:
> I don't understand vmscan.c, and in fact I don't even understand NR_PAGES_SCANNED
[...]
> counter... why it has to be atomic/per-cpu? It is always updated under ->lru_lock
> except free_pcppages_bulk/free_one_page try to reset this counter. But note that
> they both do

It doesn't really have to be atomic/per-cpu because it is really updated
under the lock. It just uses the generic vmstat infrastructure...

>   nr_scanned = zone_page_state(zone, NR_PAGES_SCANNED);
>   if (nr_scanned)
>   __mod_zone_page_state(zone, NR_PAGES_SCANNED, -nr_scanned);
> 
> and this doesn't look exactly right: zone_page_state() ignores the per-cpu
> ->vm_stat_diff[] counters (and we probably do not want for_each_online_cpu()
> loop here). And I do not know if this is really bad or not, but note that if
> I change calculate_normal_threshold() to return 0, the problem goes away too.

You are absolutely right that this is racy. In the worst case we would
end up missing nr_cpus*threshold scanned pages which would stay behind.
But

bool zone_reclaimable(struct zone *zone)
{
return zone_page_state_snapshot(zone, NR_PAGES_SCANNED) <
zone_reclaimable_pages(zone) * 6;
}

So the left over shouldn't cause it to return true all the time. In
fact it could prematurely say false, right? (note that _snapshot variant
considers per-cpu diffs [1]).
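
A minimal userspace sketch of the race discussed above, assuming a toy counter made of one global value plus per-cpu deltas that are folded in only once they exceed a threshold. The names (toy_counter, toy_mod, toy_read, toy_read_snapshot, toy_reset) and the constants are hypothetical and only model the behaviour described in this mail; this is not the vmstat code itself.

```c
#include <assert.h>

#define NR_CPUS   4
#define THRESHOLD 8	/* hypothetical per-cpu threshold */

/* Toy model of one vmstat counter: a global value plus per-cpu deltas. */
struct toy_counter {
	long global;
	long diff[NR_CPUS];
};

/* Per-cpu update: fold into the global value only past the threshold. */
static void toy_mod(struct toy_counter *c, int cpu, long delta)
{
	long d = c->diff[cpu] + delta;

	if (d > THRESHOLD || d < -THRESHOLD) {
		c->global += d;
		d = 0;
	}
	c->diff[cpu] = d;
}

/* Cheap read: looks at the global part only. */
static long toy_read(const struct toy_counter *c)
{
	return c->global < 0 ? 0 : c->global;
}

/* Precise read: global part plus all per-cpu deltas. */
static long toy_read_snapshot(const struct toy_counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->diff[cpu];
	return sum < 0 ? 0 : sum;
}

/* Reset as in the quoted snippet: subtract only what the cheap read sees. */
static void toy_reset(struct toy_counter *c, int cpu)
{
	long seen = toy_read(c);

	if (seen)
		toy_mod(c, cpu, -seen);
}
```

If every CPU has accumulated exactly THRESHOLD scanned pages, the cheap read sees nothing, the reset does nothing, and the precise read still reports NR_CPUS * THRESHOLD pages left behind.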

That being said, I am not really sure why the 0 threshold would help for
your test case. Could you add some tracing and see what the numbers above
are? Is it possible that zone_reclaimable_pages is some small number
which actually prevents us from scanning anything? Aka a bug in
get_scan_count or somewhere else?

[1] I am not really sure which kernel version you have tested - your
config says 4.6.0-rc7 but this is true since 0db2cb8da89d ("mm, vmscan:
make zone_reclaimable_pages more precise") which is 4.6-rc1.
-- 
Michal Hocko
SUSE Labs

[PATCH] regulator: mt6397: Constify struct regulator_ops

2016-05-23 Thread Henry Chen

Constify the regulator_ops structures.

Signed-off-by: Henry Chen 
---
 drivers/regulator/mt6397-regulator.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/regulator/mt6397-regulator.c 
b/drivers/regulator/mt6397-regulator.c
index 17a5b6c..1c45abb 100644
--- a/drivers/regulator/mt6397-regulator.c
+++ b/drivers/regulator/mt6397-regulator.c
@@ -160,7 +160,7 @@ static int mt6397_get_status(struct regulator_dev *rdev)
return (regval & info->qi) ? REGULATOR_STATUS_ON : REGULATOR_STATUS_OFF;
 }
 
-static struct regulator_ops mt6397_volt_range_ops = {
+static const struct regulator_ops mt6397_volt_range_ops = {
.list_voltage = regulator_list_voltage_linear_range,
.map_voltage = regulator_map_voltage_linear_range,
.set_voltage_sel = regulator_set_voltage_sel_regmap,
@@ -172,7 +172,7 @@ static struct regulator_ops mt6397_volt_range_ops = {
.get_status = mt6397_get_status,
 };
 
-static struct regulator_ops mt6397_volt_table_ops = {
+static const struct regulator_ops mt6397_volt_table_ops = {
.list_voltage = regulator_list_voltage_table,
.map_voltage = regulator_map_voltage_iterate,
.set_voltage_sel = regulator_set_voltage_sel_regmap,
@@ -184,7 +184,7 @@ static struct regulator_ops mt6397_volt_table_ops = {
.get_status = mt6397_get_status,
 };
 
-static struct regulator_ops mt6397_volt_fixed_ops = {
+static const struct regulator_ops mt6397_volt_fixed_ops = {
.list_voltage = regulator_list_voltage_linear,
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
-- 
1.8.1.1.dirty
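
For readers unfamiliar with the pattern, here is a minimal sketch of a const operations table, assuming a much reduced stand-in for struct regulator_ops. All toy_* names are hypothetical. Because the table is never written after build time, declaring it const lets the compiler typically place it in read-only data and rejects accidental writes through the pointer.

```c
#include <assert.h>

/* Hypothetical, much reduced stand-in for struct regulator_ops. */
struct toy_ops {
	int (*enable)(int *state);
	int (*disable)(int *state);
};

static int toy_enable(int *state)
{
	*state = 1;
	return 0;
}

static int toy_disable(int *state)
{
	*state = 0;
	return 0;
}

/* const: the table is never modified after build time. */
static const struct toy_ops toy_fixed_ops = {
	.enable  = toy_enable,
	.disable = toy_disable,
};

/* Users only hold a pointer to const, so they cannot patch the table. */
static int toy_power_cycle(const struct toy_ops *ops, int *state)
{
	int ret = ops->disable(state);

	if (ret)
		return ret;
	return ops->enable(state);
}
```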

Re: [v2 PATCH] mm: move page_ext_init after all struct pages are initialized

2016-05-23 Thread Michal Hocko

On Fri 20-05-16 08:41:09, Shi, Yang wrote:
> On 5/20/2016 6:16 AM, Michal Hocko wrote:
> > On Thu 19-05-16 15:13:26, Yang Shi wrote:
> > [...]
> > > diff --git a/init/main.c b/init/main.c
> > > index b3c6e36..2075faf 100644
> > > --- a/init/main.c
> > > +++ b/init/main.c
> > > @@ -606,7 +606,6 @@ asmlinkage __visible void __init start_kernel(void)
> > >   initrd_start = 0;
> > >   }
> > >  #endif
> > > - page_ext_init();
> > >   debug_objects_mem_init();
> > >   kmemleak_init();
> > >   setup_per_cpu_pageset();
> > > @@ -1004,6 +1003,8 @@ static noinline void __init 
> > > kernel_init_freeable(void)
> > >   sched_init_smp();
> > > 
> > >   page_alloc_init_late();
> > > + /* Initialize page ext after all struct pages are initializaed */
> > > + page_ext_init();
> > > 
> > >   do_basic_setup();
> > 
> > I might be missing something but don't we have the same problem with
> > CONFIG_FLATMEM? page_ext_init_flatmem is called way earlier. Or
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT is never enabled for CONFIG_FLATMEM?
> 
> Yes, CONFIG_DEFERRED_STRUCT_PAGE_INIT depends on MEMORY_HOTPLUG which
> depends on SPARSEMEM. So, this config is not valid for FLATMEM at all.

Well
config MEMORY_HOTPLUG
bool "Allow for memory hot-add"
depends on SPARSEMEM || X86_64_ACPI_NUMA
depends on ARCH_ENABLE_MEMORY_HOTPLUG

I wasn't really sure about X86_64_ACPI_NUMA dependency branch which
depends on X86_64 && NUMA && ACPI && PCI and that didn't sound like
SPARSEMEM only. If the FLATMEM shouldn't exist with
CONFIG_DEFERRED_STRUCT_PAGE_INIT can we make that explicit please?
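
The concern can be written down as a truth function, assuming only the two "depends on" lines quoted above are considered. The function name is hypothetical, and whether other Kconfig constraints rule out the !SPARSEMEM case is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Truth value of the two quoted "depends on" lines, nothing more. */
static bool memory_hotplug_allowed(bool sparsemem, bool x86_64_acpi_numa,
				   bool arch_enable_memory_hotplug)
{
	return (sparsemem || x86_64_acpi_numa) && arch_enable_memory_hotplug;
}
```

Taken in isolation, the expression is satisfiable with sparsemem == false as long as the X86_64_ACPI_NUMA branch is true, which is why an explicit dependency would be clearer.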
-- 
Michal Hocko
SUSE Labs

Re: [PATCH 2/2] dma-buf/fence: add fence_array fences v4

2016-05-23 Thread Christian König


On 20.05.2016 at 16:42, Chris Wilson wrote:

On Fri, May 20, 2016 at 03:56:11PM +0200, Christian König wrote:

From: Gustavo Padovan 

struct fence_collection inherits from struct fence and carries a
collection of fences that needs to be waited together.

It is useful to translate a sync_file to a fence to remove the complexity
of dealing with sync_files on DRM drivers. So even if there are many
fences in the sync_file that needs to waited for a commit to happen,
they all get added to the fence_collection and passed for DRM use as
a standard struct fence.

That means that no changes needed to any driver besides supporting fences.

fence_collection's fence doesn't belong to any timeline context, so
fence_is_later() and fence_later() are not meant to be called with
fence_collections fences.

v2: Comments by Daniel Vetter:
- merge fence_collection_init() and fence_collection_add()
- only add callbacks at ->enable_signalling()
- remove fence_collection_put()
- check for type on to_fence_collection()
- adjust fence_is_later() and fence_later() to WARN_ON() if they
are used with collection fences.

v3: - Initialize fence_cb.node at fence init.

 Comments by Chris Wilson:
- return "unbound" on fence_collection_get_timeline_name()
- don't stop adding callbacks if one fails
- remove redundant !! on fence_collection_enable_signaling()
- remove redundant () on fence_collection_signaled
- use fence_default_wait() instead

v4 (chk): Rework, simplification and cleanup:
- Drop FENCE_NO_CONTEXT handling, always allocate a context.
- Rename to fence_array.
- Return fixed driver name.
- Register only one callback at a time.

Why? Even within a driver I expected there to be some amortization of
the signaling costs for handling multiple fences at once (at least the
driver I'm familiar with!).

So more just curiosity as to your experience that favours sequential
enabling.


Just the mundane reason that I want to save the memory for all the
callbacks.

But thinking about it, you are probably right that we should enable
signaling for all fences immediately. Going to fix this in the next
version of the patch.
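
A minimal userspace sketch of the "enable everything at once" alternative, assuming one callback per member fence and a shared pending counter. The toy_array type and its functions are hypothetical and leave out locking and the callback list; this is not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified array fence: no locking, no callback list. */
struct toy_array {
	atomic_uint num_pending;	/* member fences not yet signaled */
	bool signaled;			/* set once the last member fires */
};

static void toy_array_init(struct toy_array *a, unsigned int num_fences)
{
	atomic_init(&a->num_pending, num_fences);
	a->signaled = (num_fences == 0);
}

/* Called once per member fence when it signals; order does not matter. */
static void toy_array_cb(struct toy_array *a)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&a->num_pending, 1) == 1)
		a->signaled = true;
}
```

The trade-off is the one named above: each member fence needs its own callback structure, which costs memory, but no callback has to register another callback from within a callback.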





+static bool fence_array_add_next_callback(struct fence_array *array)
+{
+   while (array->num_signaled < array->num_fences) {
+   struct fence *next = array->fences[array->num_signaled];
+
+   if (!fence_add_callback(next, &array->cb, fence_array_cb_func))
+   return true;
+
+   ++array->num_signaled;
+   }
+
+   return false;
+}
+
+static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
+{
+   struct fence_array *array = container_of(cb, struct fence_array, cb);

Some chasing around would have been saved by a

assert_spin_locked(&array->lock);

here.


Mhm, actually the array lock isn't held here. Thinking more about it,
adding a new callback from a fence callback can deadlock badly in
certain situations.

I need to double check why the callback is called with the fence lock
held here.





+
+   ++array->num_signaled;
+   if (!fence_array_add_next_callback(array))
+   fence_signal(&array->base);
+}
+
+static bool fence_array_enable_signaling(struct fence *fence)
+{
+   struct fence_array *array = to_fence_array(fence);
+
+   return fence_array_add_next_callback(array);
+}
+
+static bool fence_array_signaled(struct fence *fence)
+{
+   struct fence_array *array = to_fence_array(fence);
+
+   return ACCESS_ONCE(array->num_signaled) == array->num_fences;

Can just be READ_ONCE()


Good point, going to fix that.
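
As an illustration of what a READ_ONCE()-style load does, here is a simplified userspace stand-in. It assumes the GNU __typeof__ extension; TOY_READ_ONCE and the toy_fence_array type are hypothetical names, and the real kernel macro handles more cases than this one.

```c
#include <assert.h>

/*
 * Simplified stand-in for a READ_ONCE()-style load: the volatile cast
 * asks the compiler to perform exactly one load of the object.
 */
#define TOY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct toy_fence_array {
	unsigned int num_fences;
	unsigned int num_signaled;	/* may be updated concurrently */
};

static int toy_array_signaled(struct toy_fence_array *a)
{
	return TOY_READ_ONCE(a->num_signaled) == a->num_fences;
}
```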

Christian.


-Chris

Re: [PATCH v2 0/5] Thermal: Support for hardware-tracked trip points

2016-05-23 Thread Caesar Wang


Hello Eduardo & Zhang Rui,

Is there a chance to merge this patch series for the next kernel?
I have picked the patches up in my GitHub tree and tested them for a
while with Rockchip's internal kernel.

Let me know if anyone has suggestions or objections.
Thanks,

-Caesar
On 2016-05-03 17:33, Caesar Wang wrote:

The earlier patches come from Mikko and Sascha.
http://thread.gmane.org/gmane.linux.power-management.general/59451

Now I am picking them up to continue the upstreaming.
Nevermind!

History of this series:
v1: https://lkml.org/lkml/2016/4/24/227

This series adds support for hardware trip points. It picks up earlier
work from Mikko Perttunen. Mikko implemented hardware trip points as part
of the device tree support. It was suggested back then to move the
functionality to the thermal core instead of putting more code into the
device tree support. This series does exactly that.
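
To make the idea concrete, here is a minimal sketch of how a core could derive the temperature window to program into the hardware from the list of trip points. It is an illustration under assumptions, not code from this series: the toy_trip type, toy_trip_window() and the millidegree units are hypothetical.

```c
#include <assert.h>
#include <limits.h>

struct toy_trip {
	int temp;	/* trip temperature, millidegrees Celsius */
	int hysteresis;	/* millidegrees Celsius */
};

/*
 * Find the window (*low, *high) around the current temperature in which
 * no trip point is crossed. The hardware only needs to interrupt when
 * the temperature leaves this window.
 */
static void toy_trip_window(const struct toy_trip *trips, int ntrips,
			    int temp, int *low, int *high)
{
	*low = INT_MIN;
	*high = INT_MAX;

	for (int i = 0; i < ntrips; i++) {
		int trip_low = trips[i].temp - trips[i].hysteresis;

		/* Closest lower boundary below the current temperature. */
		if (trip_low < temp && trip_low > *low)
			*low = trip_low;
		/* Closest trip point above the current temperature. */
		if (trips[i].temp > temp && trips[i].temp < *high)
			*high = trips[i].temp;
	}
}
```

With trips at 70 C and 85 C (2 C hysteresis each) and a current temperature of 75 C, the window would be 68 C to 85 C, so no polling is needed while the temperature stays inside it.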

This version of the series is rebased to resolve conflicts.
Note that the hardware-tracked trip points are very well tested by now.

Verified and tested on
https://github.com/Caesar-github/rockchip/tree/wip/fixes-thermal-0503
That's based on linux-next 20160502.
Linux version 4.6.0-rc6-next-20160502-08922-g860ed34 (wxt@nb)


Changes in v2:
- update the sysfs-api.txt for set_trips
- add the commit in patch[v2 2/5].
- Update the commit for patch[v2 4/5].

Caesar Wang (1):
   thermal: rockchip: add the set_trips function

Sascha Hauer (4):
   thermal: Add support for hardware-tracked trip points
   thermal: of: implement .set_trips for device tree thermal zones
   thermal: streamline get_trend callbacks
   thermal: bang-bang governor: act on lower trip boundary

  Documentation/thermal/sysfs-api.txt|  7 +++
  drivers/thermal/gov_bang_bang.c|  2 +-
  drivers/thermal/of-thermal.c   | 23 +-
  drivers/thermal/rockchip_thermal.c | 39 
  drivers/thermal/thermal_core.c | 52 ++
  drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 25 ---
  include/linux/thermal.h|  9 +++-
  7 files changed, 128 insertions(+), 29 deletions(-)

Re: [PATCHv3] support for AD5820 camera auto-focus coil

2016-05-23 Thread Pali Rohár

On Saturday 21 May 2016 14:43:43 Ivaylo Dimitrov wrote:
> >diff --git a/include/media/ad5820.h b/include/media/ad5820.h
> >new file mode 100644
> >index 000..f5a1565
> >--- /dev/null
> >+++ b/include/media/ad5820.h
> >@@ -0,0 +1,70 @@
> >+/*
> >+ * include/media/ad5820.h
> >+ *
> >+ * Copyright (C) 2008 Nokia Corporation
> >+ * Copyright (C) 2007 Texas Instruments
> >+ *
> >+ * Contact: Tuukka Toivonen 
> >+ *  Sakari Ailus 
> >+ *
> >+ * Based on af_d88.c by Texas Instruments.
> >+ *
> >+ * This program is free software; you can redistribute it and/or
> >+ * modify it under the terms of the GNU General Public License
> >+ * version 2 as published by the Free Software Foundation.
> >+ *
> >+ * This program is distributed in the hope that it will be useful, but
> >+ * WITHOUT ANY WARRANTY; without even the implied warranty of
> >+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >+ * General Public License for more details.
> >+ *
> >+ * You should have received a copy of the GNU General Public License
> >+ * along with this program; if not, write to the Free Software
> >+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
> >+ * 02110-1301 USA
> >+ */
> >+
> >+#ifndef AD5820_H
> >+#define AD5820_H
> >+
> >+#include 
> >+#include 
> >+#include 
> >+
> >+#include 
> >+#include 
> >+
> >+struct regulator;
> >+
> >+#define AD5820_NAME "ad5820"
> >+#define AD5820_I2C_ADDR (0x18 >> 1)

Maybe write the I2C address in a more readable form? What is the reason
for this bit-shift format?
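
One plausible reading, which the thread does not confirm, is that 0x18 is the 8-bit address byte (with the R/W bit cleared) and the shift turns it into the 7-bit device address. A small sketch of that arithmetic, with hypothetical toy_* helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Drop the R/W bit of an 8-bit address byte to get the 7-bit address. */
static uint8_t toy_i2c_7bit(uint8_t addr_byte)
{
	return addr_byte >> 1;
}

/* First byte on the wire: 7-bit address plus R/W bit (0 write, 1 read). */
static uint8_t toy_i2c_addr_byte(uint8_t addr7, int read)
{
	return (uint8_t)((addr7 << 1) | (read ? 1 : 0));
}
```

Under that reading (0x18 >> 1) is simply 0x0c, and writing 0x0c directly would state the same value.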

> >+/* Register definitions */
> >+#define AD5820_POWER_DOWN   (1 << 15)
> >+#define AD5820_DAC_SHIFT4
> 
> Do those defines really belong here? Isn't it better if they are moved to
> ad5820.c?

To me it looks like these are private to ad5820.c.

> >+#define AD5820_RAMP_MODE_LINEAR (0 << 3)
> >+#define AD5820_RAMP_MODE_64_16  (1 << 3)
> >+
> >+struct ad5820_platform_data {
> >+int (*set_xshutdown)(struct v4l2_subdev *subdev, int set);
> >+};

This is for legacy board code support right? We need DT support for N900
as legacy board code is going to be deleted.

> >+#define to_ad5820_device(sd)container_of(sd, struct ad5820_device, 
> >subdev)
> >+
> >+struct ad5820_device {
> >+struct v4l2_subdev subdev;
> >+struct ad5820_platform_data *platform_data;
> >+struct regulator *vana;
> >+
> >+struct v4l2_ctrl_handler ctrls;
> >+u32 focus_absolute;
> >+u32 focus_ramp_time;
> >+u32 focus_ramp_mode;
> >+
> >+struct mutex power_lock;
> >+int power_count;
> >+
> >+int standby : 1;
> >+};
> >+
> 
> The same for struct ad5820_device, is it really part of the public API?

Yes, this is also private for ad5820.c

> >+#endif /* AD5820_H */
> >
> >
> >

-- 
Pali Rohár
pali.ro...@gmail.com

Re: [PATCH 2/2] dma-buf/fence: add fence_array fences v4

2016-05-23 Thread Daniel Vetter

On Fri, May 20, 2016 at 11:47:28AM -0300, Gustavo Padovan wrote:
> 2016-05-20 Christian König :
> 
> > From: Gustavo Padovan 
> > 
> > struct fence_collection inherits from struct fence and carries a
> > collection of fences that needs to be waited together.
> > 
> > It is useful to translate a sync_file to a fence to remove the complexity
> > of dealing with sync_files on DRM drivers. So even if there are many
> > fences in the sync_file that needs to waited for a commit to happen,
> > they all get added to the fence_collection and passed for DRM use as
> > a standard struct fence.
> > 
> > That means that no changes needed to any driver besides supporting fences.
> > 
> > fence_collection's fence doesn't belong to any timeline context, so
> > fence_is_later() and fence_later() are not meant to be called with
> > fence_collections fences.
> > 
> > v2: Comments by Daniel Vetter:
> > - merge fence_collection_init() and fence_collection_add()
> > - only add callbacks at ->enable_signalling()
> > - remove fence_collection_put()
> > - check for type on to_fence_collection()
> > - adjust fence_is_later() and fence_later() to WARN_ON() if they
> > are used with collection fences.
> > 
> > v3: - Initialize fence_cb.node at fence init.
> > 
> > Comments by Chris Wilson:
> > - return "unbound" on fence_collection_get_timeline_name()
> > - don't stop adding callbacks if one fails
> > - remove redundant !! on fence_collection_enable_signaling()
> > - remove redundant () on fence_collection_signaled
> > - use fence_default_wait() instead
> > 
> > v4 (chk): Rework, simplification and cleanup:
> > - Drop FENCE_NO_CONTEXT handling, always allocate a context.
> > - Rename to fence_array.
> > - Return fixed driver name.
> > - Register only one callback at a time.
> > - Document that create function takes ownership of array.
> 
> This looks good to me. Dropping NO_CONTEXT was a good idea, also
> registering only one callback makes it looks better.

This will make it even harder to eventually add a real fence_context
structure for tracking and debugging. I know you don't care for amdgpu
since you have amdgpu-specific debug files, and there's some lifetime fun
that makes it not immediately obvious how to resolve it. But on "lots of
shitty little drivers" systems aka SoCs generic debugging information is
crucial I think. Not liking too much where this is going.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 13/54] MAINTAINERS: Add file patterns for cris device tree bindings

2016-05-23 Thread Jesper Nilsson

On Sun, May 22, 2016 at 11:05:50AM +0200, Geert Uytterhoeven wrote:
> Submitters of device tree binding documentation may forget to CC
> the subsystem maintainer if this is missing.
> 
> Signed-off-by: Geert Uytterhoeven 
> Cc: Mikael Starvik 

Acked-by: Jesper Nilsson 

> Cc: linux-cris-ker...@axis.com
> ---
> Please apply this patch directly if you want to be involved in device
> tree binding documentation for your subsystem.

/^JN - Jesper Nilsson
-- 
   Jesper Nilsson -- jesper.nils...@axis.com

[PATCH v9 0/2] memory: add Atmel EBI (External Bus Interface) driver

2016-05-23 Thread Boris Brezillon

Hello,

This patch series provides support for the EBI bus.

The EBI (External Bus Interface) is used to access external peripherals
(NOR, SRAM, NAND, and other specific devices like ethernet controllers).
Each device is assigned a CS line and an address range and can have its
own configuration (timings, access mode, bus width, ...).
This driver provides a generic DT binding to configure a device according
to its requirements.
For specific device controllers (like the NAND one) the SMC timings
should be configured by the controller driver through the matrix and
smc syscon regmaps.
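
Since the binding in patch 2/2 expresses all timings in nanoseconds, a driver presumably has to convert them into cycles of the clock feeding the controller. A minimal sketch of such a conversion, assuming round-up semantics; toy_ns_to_cycles() is a hypothetical helper, not a function from this series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a timing in nanoseconds to clock cycles, rounding up so the
 * programmed value is never shorter than the requested time.
 */
static uint32_t toy_ns_to_cycles(uint32_t ns, uint32_t clk_rate_hz)
{
	uint64_t num = (uint64_t)ns * clk_rate_hz;

	return (uint32_t)((num + 999999999ULL) / 1000000000ULL);
}
```

Rounding up is the conservative choice here: a setup or pulse time that is one cycle too long is slower, while one that is too short may violate the device's requirements.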

Best Regards,

Boris

Changes since v8:
- fixed typo in DT bindings doc

Changes since v7:
- move EBI bus config properties directly in the sub-device node
- prefix all EBI/SMC properties with atmel,smc-
- rework the driver to take the new binding into account
- disable the subdevice when EBI implementation fails to configure
  the bus as requested

Changes since v6:
- rework the binding to put per-device config information into a
  separate subnode and have devices connected to the EBI bus
  defined as direct children of the EBI node
- make all EBI timings mandatory
- keep adding sub-devices when a failure occurs on one of them

Changes since v5:
- remove the "atmel,specialized-logic" property: now all devices are
  attached to the SMC logic by default, and specialized drivers
  (like the NAND controller one) should change this configuration
  manually.
- provide hardware readout to allow partial config description.
  This is mainly here to keep existing platforms (where everything
  is configured by the bootloader/bootstrap) in working state.
- rename "atmel,tdf-optimized" into "atmel,tdf-mode" and switch from
  a boolean to string property to properly support partial config

Changes since v4:
- fix inconsistencies in SMC and MATRIX registers definition
- add missing compatible strings for at91sam9rl SoC
- fix DT bindings documentation
- replace "atmel,generic-dev" property by "atmel,specialized-logic"

Changes since v3:
- added AT91_MATRIX_USBPUCR_PUON definition
- removed useless macros (those directly returning SoC specific register
  offsets)
- use syscon_regmap_lookup_by_phandle instead of of_parse_phandle +
  syscon_node_to_regmap
- drop AT91_EBICSA_REGFIELD and AT91_MULTI_EBICSA_REGFIELD macros

Changes since v2:
- minor fixes in DT bindings doc
- fix SMC macros
- make use of SMC macros defined in include/linux/mfd/syscon/atmel-smc.h

Changes since v1:
- almost everything :-)

Boris Brezillon (2):
  memory: add Atmel EBI (External Bus Interface) driver
  memory: atmel-ebi: add DT bindings documentation

 .../bindings/memory-controllers/atmel,ebi.txt  | 136 
 drivers/memory/Kconfig |  11 +
 drivers/memory/Makefile|   1 +
 drivers/memory/atmel-ebi.c | 771 +
 4 files changed, 919 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/atmel,ebi.txt
 create mode 100644 drivers/memory/atmel-ebi.c

-- 
2.7.4

[PATCH v9 2/2] memory: atmel-ebi: add DT bindings documentation

2016-05-23 Thread Boris Brezillon

The EBI (External Bus Interface) is used to access external peripherals
(NOR, SRAM, NAND, and other specific devices like ethernet controllers).
Each device is assigned a CS line and an address range and can have its
own configuration (timings, access mode, bus width, ...).
This driver provides a generic DT binding to configure a device according
to its requirements.
For specific device controllers (like the NAND one) the SMC timings
should be configured by the controller driver through the matrix and smc
syscon regmaps.

Signed-off-by: Boris Brezillon 
Acked-by: Rob Herring 
---
 .../bindings/memory-controllers/atmel,ebi.txt  | 136 +
 1 file changed, 136 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/atmel,ebi.txt

diff --git a/Documentation/devicetree/bindings/memory-controllers/atmel,ebi.txt 
b/Documentation/devicetree/bindings/memory-controllers/atmel,ebi.txt
new file mode 100644
index 000..9bb5f57
--- /dev/null
+++ b/Documentation/devicetree/bindings/memory-controllers/atmel,ebi.txt
@@ -0,0 +1,136 @@
+* Device tree bindings for Atmel EBI
+
+The External Bus Interface (EBI) controller is a bus where you can connect
+asynchronous (NAND, NOR, SRAM, ) and synchronous memories (SDR/DDR SDRAMs).
+The EBI provides a glue-less interface to asynchronous memories through the SMC
+(Static Memory Controller).
+
+Required properties:
+
+- compatible:  "atmel,at91sam9260-ebi"
+   "atmel,at91sam9261-ebi"
+   "atmel,at91sam9263-ebi0"
+   "atmel,at91sam9263-ebi1"
+   "atmel,at91sam9rl-ebi"
+   "atmel,at91sam9g45-ebi"
+   "atmel,at91sam9x5-ebi"
+   "atmel,sama5d3-ebi"
+
+- reg: Contains offset/length value for EBI memory mapping.
+   This property might contain several entries if the EBI
+   memory range is not contiguous
+
+- #address-cells:  Must be 2.
+   The first cell encodes the CS.
+   The second cell encode the offset into the CS memory
+   range.
+
+- #size-cells: Must be set to 1.
+
+- ranges:  Encodes CS to memory region association.
+
+- clocks:  Clock feeding the EBI controller.
+   See clock-bindings.txt
+
+Children device nodes are representing device connected to the EBI bus.
+
+Required device node properties:
+
+- reg: Contains the chip-select id, the offset and the length
+   of the memory region requested by the device.
+
+EBI bus configuration will be defined directly in the device subnode.
+
+Optional EBI/SMC properties:
+
+- atmel,smc-bus-width: width of the asynchronous device's data bus
+   8, 16 or 32.
+   Default to 8 when undefined.
+
+- atmel,smc-byte-access-type   "write" or "select" (see Atmel datasheet).
+   Default to "select" when undefined.
+
+- atmel,smc-read-mode  "nrd" or "ncs".
+   Default to "ncs" when undefined.
+
+- atmel,smc-write-mode "nwe" or "ncs".
+   Default to "ncs" when undefined.
+
+- atmel,smc-exnw-mode  "disabled", "frozen" or "ready".
+   Default to "disabled" when undefined.
+
+- atmel,smc-page-mode  enable page mode if present. The provided value
+   defines the page size (supported values: 4, 8,
+   16 and 32).
+
+- atmel,smc-tdf-mode:  "normal" or "optimized". When set to
+   "optimized" the data float time is optimized
+   depending on the next device being accessed
+   (next device setup time is subtracted to the
+   current device data float time).
+   Default to "normal" when undefined.
+
+If at least one atmel,smc- property is defined the following SMC timing
+properties become mandatory. In the other hand, if none of the atmel,smc-
+properties are specified, we assume that the EBI bus configuration will be
+handled by the sub-device driver, and none of those properties should be
+defined.
+
+All the timings are expressed in nanoseconds (see Atmel datasheet for a full
+description).
+
+- atmel,smc-ncs-rd-setup-ns
+- atmel,smc-nrd-setup-ns
+- atmel,smc-ncs-wr-setup-ns
+- atmel,smc-nwe-setup-ns
+- atmel,smc-ncs-rd-pulse-ns
+- atmel,smc-nrd-pulse-ns
+- atmel,smc-ncs-wr-pulse-ns
+- atmel,smc-nwe-pulse-ns
+- atmel,smc-nwe-cycle-ns
+- atmel,smc-nrd-cycle-ns
+- atmel,smc-tdf-ns
+
+Example:
+
+   ebi: ebi@1000 {
+   compatible = "atmel,sama5d3-ebi";
+   #address-cells = <2>;
+   #size-cells = <1>;

[PATCH v9 1/2] memory: add Atmel EBI (External Bus Interface) driver

2016-05-23 Thread Boris Brezillon

The EBI (External Bus Interface) is used to access external peripherals
(NOR, SRAM, NAND, and other specific devices like ethernet controllers).
Each device is assigned a CS line and an address range and can have its
own configuration (timings, access mode, bus width, ...).
This driver provides a generic DT binding to configure a device according
to its requirements.
For specific device controllers (like the NAND one) the SMC timings
should be configured by the controller driver through the matrix and
smc syscon regmaps.

Signed-off-by: Boris Brezillon 
---
 drivers/memory/Kconfig |  11 +
 drivers/memory/Makefile|   1 +
 drivers/memory/atmel-ebi.c | 771 +
 3 files changed, 783 insertions(+)
 create mode 100644 drivers/memory/atmel-ebi.c

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c61a284..d13ce16 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -25,6 +25,17 @@ config ATMEL_SDRAMC
  Starting with the at91sam9g45, this controller supports SDR, DDR and
  LP-DDR memories.
 
+config ATMEL_EBI
+   bool "Atmel EBI driver"
+   default y
+   depends on ARCH_AT91 && OF
+   select MFD_SYSCON
+   help
+ Driver for Atmel EBI controller.
+ Used to configure the EBI (external bus interface) when the device-
+ tree is used. This bus supports NANDs, external ethernet controller,
+ SRAMs, ATA devices, etc.
+
 config TI_AEMIF
tristate "Texas Instruments AEMIF driver"
depends on (ARCH_DAVINCI || ARCH_KEYSTONE) && OF
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index cb0b7a1..b20ae38 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_OF)+= of_memory.o
 endif
 obj-$(CONFIG_ARM_PL172_MPMC)   += pl172.o
 obj-$(CONFIG_ATMEL_SDRAMC) += atmel-sdramc.o
+obj-$(CONFIG_ATMEL_EBI)+= atmel-ebi.o
 obj-$(CONFIG_TI_AEMIF) += ti-aemif.o
 obj-$(CONFIG_TI_EMIF)  += emif.o
 obj-$(CONFIG_OMAP_GPMC)+= omap-gpmc.o
diff --git a/drivers/memory/atmel-ebi.c b/drivers/memory/atmel-ebi.c
new file mode 100644
index 000..712c767
--- /dev/null
+++ b/drivers/memory/atmel-ebi.c
@@ -0,0 +1,771 @@
+/*
+ * EBI driver for Atmel chips
+ * inspired by the fsl weim bus driver
+ *
+ * Copyright (C) 2013 JJ Hiblot.
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct at91sam9_smc_timings {
+   u32 ncs_rd_setup_ns;
+   u32 nrd_setup_ns;
+   u32 ncs_wr_setup_ns;
+   u32 nwe_setup_ns;
+   u32 ncs_rd_pulse_ns;
+   u32 nrd_pulse_ns;
+   u32 ncs_wr_pulse_ns;
+   u32 nwe_pulse_ns;
+   u32 nrd_cycle_ns;
+   u32 nwe_cycle_ns;
+   u32 tdf_ns;
+};
+
+struct at91sam9_smc_generic_fields {
+   struct regmap_field *setup;
+   struct regmap_field *pulse;
+   struct regmap_field *cycle;
+   struct regmap_field *mode;
+};
+
+struct at91sam9_ebi_dev_config {
+   struct at91sam9_smc_timings timings;
+   u32 mode;
+};
+
+struct at91_ebi_dev_config {
+   int cs;
+   union {
+   struct at91sam9_ebi_dev_config sam9;
+   };
+};
+
+struct at91_ebi;
+
+struct at91_ebi_dev {
+   struct list_head node;
+   struct at91_ebi *ebi;
+   u32 mode;
+   int numcs;
+   struct at91_ebi_dev_config configs[];
+};
+
+struct at91_ebi_caps {
+   unsigned int available_cs;
+   const struct reg_field *ebi_csa;
+   void (*get_config)(struct at91_ebi_dev *ebid,
+  struct at91_ebi_dev_config *conf);
+   int (*xlate_config)(struct at91_ebi_dev *ebid,
+   struct device_node *configs_np,
+   struct at91_ebi_dev_config *conf);
+   int (*apply_config)(struct at91_ebi_dev *ebid,
+   struct at91_ebi_dev_config *conf);
+   int (*init)(struct at91_ebi *ebi);
+};
+
+struct at91_ebi {
+   struct clk *clk;
+   struct regmap *smc;
+   struct regmap *matrix;
+
+   struct regmap_field *ebi_csa;
+
+   struct device *dev;
+   const struct at91_ebi_caps *caps;
+   struct list_head devs;
+   union {
+   struct at91sam9_smc_generic_fields sam9;
+   };
+};
+
+static void at91sam9_ebi_get_config(struct at91_ebi_dev *ebid,
+   struct at91_ebi_dev_config *conf)
+{
+   struct at91sam9_smc_generic_fields *fields = &ebid->ebi->sam9;
+   unsigned int clk_rate = clk_get_rate(ebid->ebi->clk);
+   struct at91sam9_ebi_dev_config *config = &conf->sam9;
+   struct at91sam9_smc_timings *timings = &config->timings;
+   unsigned int val;
+
+   regmap_fields_read(fields->mod

Re: Mount namespace "dominant peer group"?

2016-05-23 Thread Miklos Szeredi

C is slave of B is slave of A.  If a process can see (i.e. has under
its root) A and C but not B then for C it will show
master:B,propagate_from:A.  This piece of information is shown because
it can't see the immediate master (B) and so cannot determine the
chain of propagation between the mounts it can see.

Concrete example:

# mount --bind / /mnt
# mount --bind /proc /mnt/proc
# mount --make-private /mnt
# mount --make-shared /mnt
# mkdir /tmp/etc
# mount --bind /mnt/etc /tmp/etc
# mount --make-slave /tmp/etc
# mount --make-shared /tmp/etc
# mount --bind /tmp/etc /mnt/tmp/etc
# mount --make-slave /mnt/tmp/etc
# cat /proc/self/mountinfo | grep /tmp/etc
164 40 253:1 /etc /tmp/etc rw,relatime shared:100 master:97 - ...
# chroot /mnt
# cat /proc/self/mountinfo
129 62 253:1 / / rw,relatime shared:97 - ...
168 129 253:1 /etc /tmp/etc rw,relatime master:100 propagate_from:97 - ...

Thanks,
Miklos

Re: [PATCH 1/2] Revert "mtd: atmel_nand: Support variable RB_EDGE interrupts"

2016-05-23 Thread Boris Brezillon

On Mon, 9 May 2016 14:51:18 +0800
Wenyou Yang  wrote:

> This reverts commit 5ddc7bd43ccc ("mtd: atmel_nand: Support variable
> RB_EDGE interrupts")
> 
> Because for current SoCs, the RB_EDGE3(i.e. bit 27) of HSMC_SR
> register does not exist, the RB_EDGE0 (i.e. bit 24) is the ready/busy
> line edge status bit. It is a datasheet bug.
> 
> Signed-off-by: Wenyou Yang 

Wenyou, I know you sent it before v4.6 was released, but now we should
probably add

Cc: 

Brian, can you apply this patch directly in your tree (as previously
discussed, I'm not sure creating a nand/fixes branch is really useful)?

Thanks,

Boris

> ---
> 
>  .../devicetree/bindings/mtd/atmel-nand.txt |  2 +-
>  drivers/mtd/nand/atmel_nand.c  | 35 
> +-
>  drivers/mtd/nand/atmel_nand_nfc.h  |  3 +-
>  3 files changed, 10 insertions(+), 30 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/mtd/atmel-nand.txt 
> b/Documentation/devicetree/bindings/mtd/atmel-nand.txt
> index d53aba9..3e7ee99 100644
> --- a/Documentation/devicetree/bindings/mtd/atmel-nand.txt
> +++ b/Documentation/devicetree/bindings/mtd/atmel-nand.txt
> @@ -39,7 +39,7 @@ Optional properties:
>  
>  Nand Flash Controller(NFC) is an optional sub-node
>  Required properties:
> -- compatible : "atmel,sama5d3-nfc" or "atmel,sama5d4-nfc".
> +- compatible : "atmel,sama5d3-nfc".
>  - reg : should specify the address and size used for NFC command registers,
>  NFC registers and NFC SRAM. NFC SRAM address and size can be absent
>  if don't want to use it.
> diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
> index efc8ea2..68b9160 100644
> --- a/drivers/mtd/nand/atmel_nand.c
> +++ b/drivers/mtd/nand/atmel_nand.c
> @@ -67,10 +67,6 @@ struct atmel_nand_caps {
>   uint8_t pmecc_max_correction;
>  };
>  
> -struct atmel_nand_nfc_caps {
> - uint32_t rb_mask;
> -};
> -
>  /*
>   * oob layout for large page size
>   * bad block info is on bytes 0 and 1
> @@ -129,7 +125,6 @@ struct atmel_nfc {
>   /* Point to the sram bank which include readed data via NFC */
>   void*data_in_sram;
>   boolwill_write_sram;
> - const struct atmel_nand_nfc_caps *caps;
>  };
>  static struct atmel_nfc  nand_nfc;
>  
> @@ -1715,9 +1710,9 @@ static irqreturn_t hsmc_interrupt(int irq, void *dev_id)
>   nfc_writel(host->nfc->hsmc_regs, IDR, NFC_SR_XFR_DONE);
>   ret = IRQ_HANDLED;
>   }
> - if (pending & host->nfc->caps->rb_mask) {
> + if (pending & NFC_SR_RB_EDGE) {
>   complete(&host->nfc->comp_ready);
> - nfc_writel(host->nfc->hsmc_regs, IDR, host->nfc->caps->rb_mask);
> + nfc_writel(host->nfc->hsmc_regs, IDR, NFC_SR_RB_EDGE);
>   ret = IRQ_HANDLED;
>   }
>   if (pending & NFC_SR_CMD_DONE) {
> @@ -1735,7 +1730,7 @@ static void nfc_prepare_interrupt(struct 
> atmel_nand_host *host, u32 flag)
>   if (flag & NFC_SR_XFR_DONE)
>   init_completion(&host->nfc->comp_xfer_done);
>  
> - if (flag & host->nfc->caps->rb_mask)
> + if (flag & NFC_SR_RB_EDGE)
>   init_completion(&host->nfc->comp_ready);
>  
>   if (flag & NFC_SR_CMD_DONE)
> @@ -1753,7 +1748,7 @@ static int nfc_wait_interrupt(struct atmel_nand_host 
> *host, u32 flag)
>   if (flag & NFC_SR_XFR_DONE)
>   comp[index++] = &host->nfc->comp_xfer_done;
>  
> - if (flag & host->nfc->caps->rb_mask)
> + if (flag & NFC_SR_RB_EDGE)
>   comp[index++] = &host->nfc->comp_ready;
>  
>   if (flag & NFC_SR_CMD_DONE)
> @@ -1821,7 +1816,7 @@ static int nfc_device_ready(struct mtd_info *mtd)
>   dev_err(host->dev, "Lost the interrupt flags: 0x%08x\n",
>   mask & status);
>  
> - return status & host->nfc->caps->rb_mask;
> + return status & NFC_SR_RB_EDGE;
>  }
>  
>  static void nfc_select_chip(struct mtd_info *mtd, int chip)
> @@ -1994,8 +1989,8 @@ static void nfc_nand_command(struct mtd_info *mtd, 
> unsigned int command,
>   }
>   /* fall through */
>   default:
> - nfc_prepare_interrupt(host, host->nfc->caps->rb_mask);
> - nfc_wait_interrupt(host, host->nfc->caps->rb_mask);
> + nfc_prepare_interrupt(host, NFC_SR_RB_EDGE);
> + nfc_wait_interrupt(host, NFC_SR_RB_EDGE);
>   }
>  }
>  
> @@ -2426,11 +2421,6 @@ static int atmel_nand_nfc_probe(struct platform_device 
> *pdev)
>   }
>   }
>  
> - nfc->caps = (const struct atmel_nand_nfc_caps *)
> - of_device_get_match_data(&pdev->dev);
> - if (!nfc->caps)
> - return -ENODEV;
> -
>   nfc_writel(nfc->hsmc_regs, IDR, 0x);
>   nfc_readl(nfc->hsmc_regs, SR);  /* clear the NFC_SR */
>  
> @@ -2459,17 +2449,8 @@ static int atmel_nand_nfc_remove(struct 
> platform_device *pdev)
>   return 0;
>  }
>  
>

Re: [RFC PATCH 2/3] mmc: host: omap_hsmmc: Enable ADMA2

2016-05-23 Thread Kishon Vijay Abraham I

Hi Felipe,

On Monday 23 May 2016 12:48 PM, Felipe Balbi wrote:
> 
> Hi Kishon,
> 
> Kishon Vijay Abraham I  writes:
>> Hi Felipe,
>>
>> On Friday 20 May 2016 12:06 AM, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> Tony Lindgren  writes:
>>>> * Peter Ujfalusi  [160519 01:10]:
>>>>> On 05/18/2016 10:30 PM, Tony Lindgren wrote:
>>>>>> Ideally the adma support would be a separate loadable module,
>>>>>> similar to how the cppi41dma is a child of the OTG controller.
>>>>>
>>>>> The Master DMA is part of the hsmmc IP block. If the same ADMA module is
>>>>> present on other IPs it might be beneficial to have a helper library to
>>>>> handle it (allocating the descriptor pool, writing, updating descriptors,
>>>>> etc).
>>>>
>>>> OK. Yeah if it's part of the MMC controller it makes no sense to
>>>> separate it. So then the concerns are using alternate DMA
>>>> implementations and keeping PM runtime working :)
>>>>
>>>> BTW, Felipe mentioned that the best thing to do in the long run would
>>>> be to set up sdhci-omap.c operating in ADMA mode.
>>>>
>>>> Felipe, care to summarize what you had in mind?
>>>
>>> yeah, just write a new sdhci-omap.c to start moving away from
>>> omap-hsmmc.c, just like it was done for 8250-omap.
>>>
>>> At the beginning, it could be just the bare minimum to get it working
>>> and slowly move over stuff like pm runtime, dmaengine, PIO. Move more
>>> platforms over to that driver and, eventually, get rid of omap-hsmmc.c
>>> altogether.
>>>
>>> That way, development can be focussed on generic layers (SDHCI) to which
>>> OMAP MMC controller is compliant (apart from the VERSION register
>>> quirk).
>>
>> About an year back, when I tried using SDHCI for OMAP I ran into
>> issues and was not able to get it working. IIRC SDHCI_PRESENT_STATE
>> (or OMAP_HSMMC_PSTATE) was not showing the correct state for card
>> present and is unable to raise an interrupt when a card is inserted. I
>> didn't debug this further.
> 
> I'd say this is a bug in hsmmc. I remember seeing some bits in some
> TI-specific register (before SDHCI address space starts) which can be
> used to keep parts of SDHCI powered on exactly so normal WP and CD pins
> work as expected.
> 
> In any case, adding support for GPIO-based card detect to generic SDHCI
> shouldn't be too difficult :-)
> 
>> It also kept me wondering why gpio interrupt was always used for card
>> detect instead of using mmci_sdcd line of the controller.
> 
> Probably a really, really old bug which nobody ever debugged properly ;-)
> 
> ps: you don't need that ADMA2 DT property, btw. There's a bit in another
> register which you can check if $this controller was configured with
> ADMA2 support or not. IIRC, OMAP5's TRM describes them.

hmm yeah.. Should be the MADMA_EN in MMCHS_HL_HWINFO.

Thanks
Kishon

[PATCH RESEND] printk: add tgid and comm in dump_stack_print_info

2016-05-23 Thread liuhailong

On Android, some threads in different processes have the same name,
so the thread's own comm is not enough to identify it. Also print
the tgid and the comm of the thread's group_leader.

Signed-off-by: Liu Hailong 
Signed-off-by: Li Pengcheng 
Signed-off-by: Chen Feng 
---
 kernel/printk/printk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index bfbf284..84246b3 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3166,6 +3166,9 @@ void dump_stack_print_info(const char *log_lvl)
   (int)strcspn(init_utsname()->version, " "),
   init_utsname()->version);
 
+   printk("TGID: %d Comm: %.20s\n",
+  current->tgid, current->group_leader->comm);
+
if (dump_stack_arch_desc_str[0] != '\0')
printk("%sHardware name: %s\n",
   log_lvl, dump_stack_arch_desc_str);
-- 
1.8.3.2

[PATCH v8 1/3] create SMAF module

2016-05-23 Thread Benjamin Gaignard

The goal of the Secure Memory Allocation Framework (SMAF) is to
allocate memory that can be secured.
There are so many ways to allocate and secure memory that SMAF
does not do it by itself but relies on additional modules.
To be sure to use the correct allocation method, SMAF implements
deferred allocation (i.e. memory is allocated only when really needed).

Allocation modules (smaf-allocator.h):
SMAF can manage multiple allocation modules at the same time.
To select the right one, SMAF calls match() to check that a module
can allocate memory for a given list of devices. It is up to the module
to check whether the devices are compatible with its allocation
method.

Securing module (smaf-secure.h):
How memory is secured is platform specific.
The secure module is responsible for granting and revoking memory access.

Signed-off-by: Benjamin Gaignard 
---
 drivers/Kconfig|   2 +
 drivers/Makefile   |   1 +
 drivers/smaf/Kconfig   |   5 +
 drivers/smaf/Makefile  |   1 +
 drivers/smaf/smaf-core.c   | 818 +
 include/linux/smaf-allocator.h |  45 +++
 include/linux/smaf-secure.h|  65 
 include/uapi/linux/smaf.h  |  85 +
 8 files changed, 1022 insertions(+)
 create mode 100644 drivers/smaf/Kconfig
 create mode 100644 drivers/smaf/Makefile
 create mode 100644 drivers/smaf/smaf-core.c
 create mode 100644 include/linux/smaf-allocator.h
 create mode 100644 include/linux/smaf-secure.h
 create mode 100644 include/uapi/linux/smaf.h

diff --git a/drivers/Kconfig b/drivers/Kconfig
index d2ac339..f5262fd 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -198,4 +198,6 @@ source "drivers/hwtracing/intel_th/Kconfig"
 
 source "drivers/fpga/Kconfig"
 
+source "drivers/smaf/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 8f5d076..b2fb04a 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -172,3 +172,4 @@ obj-$(CONFIG_STM)   += hwtracing/stm/
 obj-$(CONFIG_ANDROID)  += android/
 obj-$(CONFIG_NVMEM)+= nvmem/
 obj-$(CONFIG_FPGA) += fpga/
+obj-$(CONFIG_SMAF) += smaf/
diff --git a/drivers/smaf/Kconfig b/drivers/smaf/Kconfig
new file mode 100644
index 000..d36651a
--- /dev/null
+++ b/drivers/smaf/Kconfig
@@ -0,0 +1,5 @@
+config SMAF
+   tristate "Secure Memory Allocation Framework"
+   depends on DMA_SHARED_BUFFER
+   help
+ Choose this option to enable Secure Memory Allocation Framework
diff --git a/drivers/smaf/Makefile b/drivers/smaf/Makefile
new file mode 100644
index 000..40cd882
--- /dev/null
+++ b/drivers/smaf/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_SMAF) += smaf-core.o
diff --git a/drivers/smaf/smaf-core.c b/drivers/smaf/smaf-core.c
new file mode 100644
index 000..1cfd6a7
--- /dev/null
+++ b/drivers/smaf/smaf-core.c
@@ -0,0 +1,818 @@
+/*
+ * smaf-core.c
+ *
+ * Copyright (C) Linaro SA 2015
+ * Author: Benjamin Gaignard  for Linaro.
+ * License terms:  GNU General Public License (GPL), version 2
+ *
+ * Secure Memory Allocator Framework (SMAF) allows registering memory
+ * allocators and a secure module under a common API.
+ * Multiple allocators can be registered to fit hardware device
+ * requirements. Each allocator must provide a match() function to check
+ * its capacity to handle the attached devices (as defined by dmabuf).
+ * Only one secure module can be registered since it is dedicated to one
+ * hardware platform.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct smaf_handle {
+   struct dma_buf *dmabuf;
+   struct smaf_allocator *allocator;
+   struct dma_buf *db_alloc;
+   size_t length;
+   unsigned int flags;
+   int fd;
+   atomic_t is_secure;
+   void *secure_ctx;
+};
+
+/**
+ * struct smaf_device - smaf device node private data
+ * @misc_dev:  the misc device
+ * @head:  list of allocator
+ * @lock:  list and secure pointer mutex
+ * @secure:pointer to secure functions helpers
+ */
+struct smaf_device {
+   struct miscdevice misc_dev;
+   struct list_head head;
+   /* list and secure pointer lock*/
+   struct mutex lock;
+   struct smaf_secure *secure;
+};
+
+static long smaf_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+
+static const struct file_operations smaf_fops = {
+   .unlocked_ioctl = smaf_ioctl,
+};
+
+static struct smaf_device smaf_dev = {
+   .misc_dev.minor = MISC_DYNAMIC_MINOR,
+   .misc_dev.name  = "smaf",
+   .misc_dev.fops  = &smaf_fops,
+};
+
+static bool have_secure_module(void)
+{
+   return !!smaf_dev.secure;
+}
+
+/**
+ * smaf_grant_access - return true if the specified device can get access
+ * to the memory area
+ *
+ * This function must be called with smaf_dev.lock set
+ */
+static bool smaf_grant_access(struct smaf_handle *handle, struct device *dev,
+

[PATCH v8 0/3] Secure Memory Allocation Framework

2016-05-23 Thread Benjamin Gaignard

version 8 changes:
 - rework of the structures used within ioctl
   by adding a version field and padding to be future proof
 - rename fake secure module to test secure module
 - fix the various remarks made on the previous patchset

version 7 changes:
 - rebased on kernel 4.6-rc7
 - simplify secure module API
 - add vma ops to be able to detect mmap/munmap calls
 - add ioctl to get number and allocator names
 - update libsmaf with adding tests
   https://git.linaro.org/people/benjamin.gaignard/libsmaf.git
 - add debug log in fake secure module

version 6 changes:
 - rebased on kernel 4.5-rc4
 - fix mmapping bug when the requested allocation size isn't a multiple of
   PAGE_SIZE (add a test for this in libsmaf)

version 5 changes:
 - rebased on kernel 4.3-rc6
 - rework locking schema and make handle status use an atomic_t
 - add a fake secure module to allow performing tests without trusted
   environment

version 4 changes:
 - rebased on kernel 4.3-rc3
 - fix missing EXPORT_SYMBOL for smaf_create_handle()

version 3 changes:
 - Remove the ioctl for allocator selection; instead provide the name of
   the targeted allocator with the allocation request.
   Selecting the allocator from userland isn't the preferred way of working
   but is needed when the first user of the buffer is a software component.
 - Fix issues in case of error while creating smaf handle.
 - Fix module license.
 - Update libsmaf and tests to take care of the SMAF API evolution
   https://git.linaro.org/people/benjamin.gaignard/libsmaf.git

version 2 changes:
 - Add one ioctl to allow allocator selection from userspace.
   This is required for the use case where the first user of
   the buffer is a software IP which can't perform dma_buf attachment.
 - Add name and ranking to allocator structure to be able to sort them.
 - Create a tiny library to test SMAF:
   https://git.linaro.org/people/benjamin.gaignard/libsmaf.git
 - Fix one issue when trying to secure a buffer without a secure module
   registered

The outcome of the previous RFC about how to do a secure data path was the need
for a secure memory allocator (https://lkml.org/lkml/2015/5/5/551)

SMAF's goal is to provide a framework that allows allocating and securing
memory by using dma_buf. Each platform has its own way to perform those two
features, so the SMAF design allows registering helper modules to perform them.

To be sure to select the best allocation method for devices, SMAF implements
a deferred allocation mechanism: memory allocation is only done when the first
device effectively requires it.
Allocator modules have to implement a match() to let SMAF know if they are
compatible with the devices' needs.
This patch set provides an example of an allocator module which uses
dma_{alloc/free/mmap}_attrs() and checks if at least one device has
coherent_dma_mask set to DMA_BIT_MASK(32) in its match function.
I have named it smaf-cma.c like it is done for drm_gem_cma_helper.c even if
a better name could be found for this file.
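
To illustrate the selection step described above, here is a minimal
user-space sketch in C. It is not the SMAF code: the demo_* types and
demo_select_allocator() are hypothetical stand-ins for the kernel structures,
and only the matching rule (a CMA-style match() accepts the buffer when at
least one attached device has a 32-bit coherent DMA mask) is taken from this
cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; these are NOT the SMAF structures. */
struct demo_device {
	const char *name;
	uint64_t coherent_dma_mask;
};

struct demo_allocator {
	const char *name;
	/* Returns true if this allocator can serve the attached devices. */
	bool (*match)(const struct demo_device *devs, size_t count);
};

/* Same value as a 32-bit DMA mask; defined locally for the sketch. */
#define DEMO_DMA_BIT_MASK_32 0xffffffffULL

/* CMA-style rule: one device with a 32-bit coherent mask is enough. */
bool demo_cma_match(const struct demo_device *devs, size_t count)
{
	for (size_t i = 0; i < count; i++)
		if (devs[i].coherent_dma_mask == DEMO_DMA_BIT_MASK_32)
			return true;
	return false;
}

/*
 * Walk the registered allocators in order and return the first one whose
 * match() accepts the device list, or NULL if none does.
 */
const struct demo_allocator *
demo_select_allocator(const struct demo_allocator *allocs, size_t nallocs,
		      const struct demo_device *devs, size_t ndevs)
{
	assert(allocs && devs);

	for (size_t i = 0; i < nallocs; i++)
		if (allocs[i].match(devs, ndevs))
			return &allocs[i];
	return NULL;
}
```

Because allocation is deferred, a selection like this would only run once the
list of attached devices is known, i.e. when the first device actually needs
the memory.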

Secure modules are responsible for granting and revoking device access rights
on the memory. The secure module is also called to check whether the CPU can
map memory into kernel and user address spaces.
An example of a secure module implementation can be found here:
http://git.linaro.org/people/benjamin.gaignard/optee-sdp.git
This code isn't yet part of the patch set because it depends on the generic TEE
which is still under discussion (https://lwn.net/Articles/644646/)

For the allocation part of the SMAF code I was inspired by Sumit Semwal's work
on the constraint-aware allocator.

Benjamin Gaignard (3):
  create SMAF module
  SMAF: add CMA allocator
  SMAF: add test secure module

 drivers/Kconfig|   2 +
 drivers/Makefile   |   1 +
 drivers/smaf/Kconfig   |  17 +
 drivers/smaf/Makefile  |   3 +
 drivers/smaf/smaf-cma.c| 188 ++
 drivers/smaf/smaf-core.c   | 818 +
 drivers/smaf/smaf-testsecure.c |  90 +
 include/linux/smaf-allocator.h |  45 +++
 include/linux/smaf-secure.h|  65 
 include/uapi/linux/smaf.h  |  85 +
 10 files changed, 1314 insertions(+)
 create mode 100644 drivers/smaf/Kconfig
 create mode 100644 drivers/smaf/Makefile
 create mode 100644 drivers/smaf/smaf-cma.c
 create mode 100644 drivers/smaf/smaf-core.c
 create mode 100644 drivers/smaf/smaf-testsecure.c
 create mode 100644 include/linux/smaf-allocator.h
 create mode 100644 include/linux/smaf-secure.h
 create mode 100644 include/uapi/linux/smaf.h

-- 
1.9.1

[PATCH v8 3/3] SMAF: add test secure module

2016-05-23 Thread Benjamin Gaignard

This module allows testing the secure calls of SMAF.

Signed-off-by: Benjamin Gaignard 
---
 drivers/smaf/Kconfig   |  6 +++
 drivers/smaf/Makefile  |  1 +
 drivers/smaf/smaf-testsecure.c | 90 ++
 3 files changed, 97 insertions(+)
 create mode 100644 drivers/smaf/smaf-testsecure.c

diff --git a/drivers/smaf/Kconfig b/drivers/smaf/Kconfig
index 058ec4c..aad1f05 100644
--- a/drivers/smaf/Kconfig
+++ b/drivers/smaf/Kconfig
@@ -9,3 +9,9 @@ config SMAF_CMA
depends on SMAF && HAVE_DMA_ATTRS
help
  Choose this option to enable CMA allocation within SMAF
+
+config SMAF_TEST_SECURE
+   tristate "SMAF secure module for test"
+   depends on SMAF
+   help
+ Choose this option to enable secure module for test purpose
diff --git a/drivers/smaf/Makefile b/drivers/smaf/Makefile
index 05bab01b..bca6b9c 100644
--- a/drivers/smaf/Makefile
+++ b/drivers/smaf/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_SMAF) += smaf-core.o
 obj-$(CONFIG_SMAF_CMA) += smaf-cma.o
+obj-$(CONFIG_SMAF_TEST_SECURE) += smaf-testsecure.o
diff --git a/drivers/smaf/smaf-testsecure.c b/drivers/smaf/smaf-testsecure.c
new file mode 100644
index 000..823d0dc
--- /dev/null
+++ b/drivers/smaf/smaf-testsecure.c
@@ -0,0 +1,90 @@
+/*
+ * smaf-testsecure.c
+ *
+ * Copyright (C) Linaro SA 2015
+ * Author: Benjamin Gaignard  for Linaro.
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+#include 
+#include 
+#include 
+
+#define MAGIC 0xDEADBEEF
+
+struct test_private {
+   int magic;
+};
+
+#define to_priv(x) (struct test_private *)(x)
+
+static void *smaf_testsecure_create(void)
+{
+   struct test_private *priv;
+
+   priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+   if (!priv)
+   return NULL;
+
+   priv->magic = MAGIC;
+
+   return priv;
+}
+
+static int smaf_testsecure_destroy(void *ctx)
+{
+   struct test_private *priv = to_priv(ctx);
+
+   WARN_ON(!priv || (priv->magic != MAGIC));
+   kfree(priv);
+
+   return 0;
+}
+
+static bool smaf_testsecure_grant_access(void *ctx,
+struct device *dev,
+size_t addr, size_t size,
+enum dma_data_direction direction)
+{
+   struct test_private *priv = to_priv(ctx);
+
+   WARN_ON(!priv || (priv->magic != MAGIC));
+   pr_debug("grant requested by device %s\n",
+dev->driver ? dev->driver->name : "cpu");
+
+   return priv->magic == MAGIC;
+}
+
+static void smaf_testsecure_revoke_access(void *ctx,
+ struct device *dev,
+ size_t addr, size_t size,
+ enum dma_data_direction direction)
+{
+   struct test_private *priv = to_priv(ctx);
+
+   WARN_ON(!priv || (priv->magic != MAGIC));
+   pr_debug("revoke requested by device %s\n",
+dev->driver ? dev->driver->name : "cpu");
+}
+
+static struct smaf_secure test = {
+   .create_ctx = smaf_testsecure_create,
+   .destroy_ctx = smaf_testsecure_destroy,
+   .grant_access = smaf_testsecure_grant_access,
+   .revoke_access = smaf_testsecure_revoke_access,
+};
+
+static int __init smaf_testsecure_init(void)
+{
+   return smaf_register_secure(&test);
+}
+module_init(smaf_testsecure_init);
+
+static void __exit smaf_testsecure_deinit(void)
+{
+   smaf_unregister_secure(&test);
+}
+module_exit(smaf_testsecure_deinit);
+
+MODULE_DESCRIPTION("SMAF secure module for test purpose");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Benjamin Gaignard ");
-- 
1.9.1

[PATCH v8 2/3] SMAF: add CMA allocator

2016-05-23 Thread Benjamin Gaignard

The SMAF CMA allocator implements helper functions to allow SMAF
to allocate contiguous memory.

match() checks whether at least one of the attached devices has
coherent_dma_mask set to DMA_BIT_MASK(32).

For allocation it uses dma_alloc_attrs() with DMA_ATTR_WRITE_COMBINE and not
dma_alloc_writecombine() to be compatible with the 64-bit ARM architecture.

Signed-off-by: Benjamin Gaignard 
---
 drivers/smaf/Kconfig|   6 ++
 drivers/smaf/Makefile   |   1 +
 drivers/smaf/smaf-cma.c | 188 
 3 files changed, 195 insertions(+)
 create mode 100644 drivers/smaf/smaf-cma.c

diff --git a/drivers/smaf/Kconfig b/drivers/smaf/Kconfig
index d36651a..058ec4c 100644
--- a/drivers/smaf/Kconfig
+++ b/drivers/smaf/Kconfig
@@ -3,3 +3,9 @@ config SMAF
depends on DMA_SHARED_BUFFER
help
  Choose this option to enable Secure Memory Allocation Framework
+
+config SMAF_CMA
+   tristate "SMAF CMA allocator"
+   depends on SMAF && HAVE_DMA_ATTRS
+   help
+ Choose this option to enable CMA allocation within SMAF
diff --git a/drivers/smaf/Makefile b/drivers/smaf/Makefile
index 40cd882..05bab01b 100644
--- a/drivers/smaf/Makefile
+++ b/drivers/smaf/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_SMAF) += smaf-core.o
+obj-$(CONFIG_SMAF_CMA) += smaf-cma.o
diff --git a/drivers/smaf/smaf-cma.c b/drivers/smaf/smaf-cma.c
new file mode 100644
index 000..cabe440
--- /dev/null
+++ b/drivers/smaf/smaf-cma.c
@@ -0,0 +1,188 @@
+/*
+ * smaf-cma.c
+ *
+ * Copyright (C) Linaro SA 2015
+ * Author: Benjamin Gaignard  for Linaro.
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+struct smaf_cma_buffer_info {
+   struct device *dev;
+   size_t size;
+   void *vaddr;
+   dma_addr_t paddr;
+   struct dma_attrs attrs;
+};
+
+/**
+ * find_matching_device - iterate over the attached devices to find one
+ * with coherent_dma_mask correctly set to DMA_BIT_MASK(32).
+ * Matching device (if any) will be used to aim CMA area.
+ */
+static struct device *find_matching_device(struct dma_buf *dmabuf)
+{
+   struct dma_buf_attachment *attach_obj;
+
+   list_for_each_entry(attach_obj, &dmabuf->attachments, node) {
+   if (attach_obj->dev->coherent_dma_mask == DMA_BIT_MASK(32))
+   return attach_obj->dev;
+   }
+
+   return NULL;
+}
+
+/**
+ * smaf_cma_match - return true if at least one device has been found
+ */
+static bool smaf_cma_match(struct dma_buf *dmabuf)
+{
+   return !!find_matching_device(dmabuf);
+}
+
+static void smaf_cma_release(struct dma_buf *dmabuf)
+{
+   struct smaf_cma_buffer_info *info = dmabuf->priv;
+
+   dma_free_attrs(info->dev, info->size, info->vaddr,
+  info->paddr, &info->attrs);
+
+   kfree(info);
+}
+
+static struct sg_table *smaf_cma_map(struct dma_buf_attachment *attachment,
+enum dma_data_direction direction)
+{
+   struct smaf_cma_buffer_info *info = attachment->dmabuf->priv;
+   struct sg_table *sgt;
+   int ret;
+
+   sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
+   if (!sgt)
+   return NULL;
+
+   ret = dma_get_sgtable(info->dev, sgt, info->vaddr,
+ info->paddr, info->size);
+   if (ret < 0)
+   goto out;
+
+   sg_dma_address(sgt->sgl) = info->paddr;
+   return sgt;
+
+out:
+   kfree(sgt);
+   return NULL;
+}
+
+static void smaf_cma_unmap(struct dma_buf_attachment *attachment,
+  struct sg_table *sgt,
+  enum dma_data_direction direction)
+{
+   /* do nothing */
+}
+
+static int smaf_cma_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
+{
+   struct smaf_cma_buffer_info *info = dmabuf->priv;
+
+   if (info->size < vma->vm_end - vma->vm_start)
+   return -EINVAL;
+
+   vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
+   return dma_mmap_attrs(info->dev, vma, info->vaddr, info->paddr,
+ info->size, &info->attrs);
+}
+
+static void *smaf_cma_vmap(struct dma_buf *dmabuf)
+{
+   struct smaf_cma_buffer_info *info = dmabuf->priv;
+
+   return info->vaddr;
+}
+
+static void *smaf_kmap_atomic(struct dma_buf *dmabuf, unsigned long offset)
+{
+   struct smaf_cma_buffer_info *info = dmabuf->priv;
+
+   return (void *)info->vaddr + offset;
+}
+
+static const struct dma_buf_ops smaf_cma_ops = {
+   .map_dma_buf = smaf_cma_map,
+   .unmap_dma_buf = smaf_cma_unmap,
+   .mmap = smaf_cma_mmap,
+   .release = smaf_cma_release,
+   .kmap_atomic = smaf_kmap_atomic,
+   .kmap = smaf_kmap_atomic,
+   .vmap = smaf_cma_vmap,
+};
+
+static struct dma_buf *smaf_cma_allocate(struct dma_buf *dmabuf, size_t length)
+{
+   struct dma_buf_attachment *attach_obj;
+   struct smaf_cma_buffer_info *info;
+   struct d

[PATCH] PCI: pcie: Call pm_runtime_no_callbacks() after device is registered

2016-05-23 Thread Mika Westerberg

Commit 0195d2813547 ("PCI: Add runtime PM support for PCIe ports") added
a call to pm_runtime_no_callbacks() for each port service device to prevent
them from exposing unnecessary runtime PM sysfs files. However, that function
tries to acquire dev->power.lock, which is not yet initialized.

This triggers the following splat:

 BUG: spinlock bad magic on CPU#0, swapper/0/1
  lock: 0x8801be2aa8e8, .magic: , .owner: /-1, .owner_cpu: 0
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0+ #820
   8801beb97be0 812cf42d 
  8801be2aa8e8 8801beb97c00 8109ee58 8801be2aa8e8
  8801be2aa8e8 8801beb97c30 8109efd9 8801be2aa8e8
 Call Trace:
  [] dump_stack+0x4f/0x72
  [] spin_dump+0x78/0xc0
  [] do_raw_spin_lock+0xf9/0x150
  [] _raw_spin_lock_irq+0x20/0x30
  [] pm_runtime_no_callbacks+0x1e/0x40
  [] pcie_port_device_register+0x1fd/0x4e0
  [] pcie_portdrv_probe+0x38/0xa0
  [] local_pci_probe+0x45/0xa0
  [] ? pci_match_device+0xe0/0x110
  [] pci_device_probe+0xdb/0x130
  [] driver_probe_device+0x22c/0x440
  [] __driver_attach+0xd1/0xf0
  [] ? driver_probe_device+0x440/0x440
  [] bus_for_each_dev+0x64/0xa0
  [] driver_attach+0x1e/0x20
  [] bus_add_driver+0x1eb/0x280
  [] ? pcie_port_setup+0x7c/0x7c
  [] driver_register+0x60/0xe0
  [] __pci_register_driver+0x60/0x70
  [] pcie_portdrv_init+0x63/0x75
  [] do_one_initcall+0xab/0x1c0
  [] kernel_init_freeable+0x153/0x1d9
  [] kernel_init+0xe/0x100
  [] ret_from_fork+0x22/0x40
  [] ? rest_init+0x90/0x90

Fix this by calling pm_runtime_no_callbacks() after device_register(), just
as other buses such as I2C already do.
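
The ordering problem can be shown with a small user-space sketch in C. All
demo_* names are hypothetical: the magic field only imitates the spinlock
debug check seen in the splat, and demo_register() stands in for the lock
initialization that happens as part of device_register().

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary marker for "lock has been initialized" in this sketch. */
#define DEMO_LOCK_MAGIC 0x5a5a5a5au

struct demo_lock {
	unsigned int magic;
};

/* Hypothetical stand-in for a port service device. */
struct demo_port_dev {
	struct demo_lock power_lock;
	bool no_callbacks;
};

/* Stand-in for registration, which is what initializes the lock. */
void demo_register(struct demo_port_dev *dev)
{
	dev->power_lock.magic = DEMO_LOCK_MAGIC;
}

/*
 * Stand-in for pm_runtime_no_callbacks(): it takes the lock, so it only
 * works on an initialized lock. Returns false for the "bad magic" case
 * instead of printing a splat.
 */
bool demo_no_callbacks(struct demo_port_dev *dev)
{
	if (dev->power_lock.magic != DEMO_LOCK_MAGIC)
		return false;

	dev->no_callbacks = true;
	return true;
}
```

Calling demo_no_callbacks() before demo_register() fails, which corresponds to
the old ordering; calling it afterwards succeeds, which corresponds to the
ordering this patch introduces.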

Reported-by: Valdis Kletnieks 
Tested-by: Valdis Kletnieks 
Suggested-by: Lukas Wunner 
Signed-off-by: Mika Westerberg 
---
 drivers/pci/pcie/portdrv_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 65b1a624826b..97927dfbbf5f 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -344,7 +344,6 @@ static int pcie_device_init(struct pci_dev *pdev, int service, int irq)
 get_descriptor_id(pci_pcie_type(pdev), service));
device->parent = &pdev->dev;
device_enable_async_suspend(device);
-   pm_runtime_no_callbacks(device);
 
retval = device_register(device);
if (retval) {
@@ -352,6 +351,8 @@ static int pcie_device_init(struct pci_dev *pdev, int service, int irq)
return retval;
}
 
+   pm_runtime_no_callbacks(device);
+
return 0;
 }
 
-- 
2.8.1

Re: linux-next: Tree for May 17

2016-05-23 Thread Xiong Zhou

hi,

On Tue, May 17, 2016 at 1:04 PM, Stephen Rothwell  wrote:
> Hi all,
>
> Please do not add any v4.8 destined material to your linux-next included
> branches until after v4.7-rc1 has been released.
>
> Changes since 20160516:
>
> The vfs tree gained a conflict against the ext4 tree.
>
> The net-next tree gained a conflict against the arm64 tree.
>
> The spi tree lost its build failure.
>
> Non-merge commits (relative to Linus' tree): 9737
>  8314 files changed, 424907 insertions(+), 177068 deletions(-)
>
> 
>
> I have created today's linux-next tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> (patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you

Patches after 0516 are not there.

I'm chasing an OOM issue between the 0516 and 0518 trees but missed the
0517 tag, so is the patch file the only way to get to the 0517 tree?

[PATCH V2] vfio: platform: support No-IOMMU mode

2016-05-23 Thread Peng Fan

The vfio No-IOMMU mode was added by
commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode"),
but it only supports vfio-pci.

By using vfio_iommu_group_get/put instead of iommu_group_get/put,
platform devices can be exposed to userspace with
CONFIG_VFIO_NOIOMMU and the "enable_unsafe_noiommu_mode"
option enabled.

From commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode"):
"This should make it very clear that this mode is not safe.
Additionally, CAP_SYS_RAWIO privileges are necessary to work
with groups and containers using this mode.  Groups making
use of this support are named /dev/vfio/noiommu-$GROUP and
can only make use of the special VFIO_NOIOMMU_IOMMU for the
container.  Use of this mode, specifically binding a device
without a native IOMMU group to a VFIO bus driver will taint
the kernel and should therefore not be considered supported.",

Note that in vfio-platform No-IOMMU mode userspace cannot
do DMA, because the ioctl API of the noiommu container only
supports VFIO_CHECK_EXTENSION; VFIO_IOMMU_MAP_DMA is not
supported.

Signed-off-by: Peng Fan 
Cc: Eric Auger 
Cc: Baptiste Reynal 
Cc: Alex Williamson 
---

V2:
 Rename subject to support No-IOMMU
 Add more commit log.

 I wrote a simple program following
 https://github.com/virtualopensystems/vfio-host-test/blob/master/src_test/vfio_device_test.c
 with no DMA support. The device's registers can be
 accessed from userspace using the command
 './vfio_dev_test 30b6.usdhc 0 1 platform'

 drivers/vfio/platform/vfio_platform_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index e65b142..993b2f9 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -561,7 +561,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 
vdev->device = dev;
 
-   group = iommu_group_get(dev);
+   group = vfio_iommu_group_get(dev);
if (!group) {
pr_err("VFIO: No IOMMU group for device %s\n", vdev->name);
return -EINVAL;
@@ -569,7 +569,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 
ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
if (ret) {
-   iommu_group_put(group);
+   vfio_iommu_group_put(group, dev);
return ret;
}
 
@@ -589,7 +589,7 @@ struct vfio_platform_device *vfio_platform_remove_common(struct device *dev)
 
if (vdev) {
vfio_platform_put_reset(vdev);
-   iommu_group_put(dev->iommu_group);
+   vfio_iommu_group_put(dev->iommu_group, dev);
}
 
return vdev;
-- 
2.6.2

Re: [PATCH] mm: compact: remove watermark check at compact suitable

2016-05-23 Thread Vlastimil Babka


On 05/23/2016 05:20 AM, Chen Feng wrote:

There are two paths calling this function.
For direct compact, there is no need to check the zone watermark here.
For kswapd wakeup kcompactd, since there is a reclaim before this.
It makes sense to do compact even the watermark is ok at this time.


Hi,

I'm just working on v2 of the series [1] and some patches planned for v2 are 
trying to simplify the watermark checks around compaction. The check you are 
removing looked like simple and obvious one, so I didn't change it. But I'll 
think more about your patch, e.g. if there are some corner cases. See for 
example the fragindex check:


 * index of -1000 would imply allocations might succeed depending on
 * watermarks, but we already failed the high-order watermark check

After your patch, there is no more high-order watermark check, so the assumption 
here is gone.

Also the comment above __compaction_suitable() should be updated too.

[1] http://lkml.kernel.org/r/<1462865763-22084-1-git-send-email-vba...@suse.cz>


Signed-off-by: Chen Feng 
---
  mm/compaction.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 8fa2540..cb322df 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1260,13 +1260,6 @@ static unsigned long __compaction_suitable(struct zone 
*zone, int order,
return COMPACT_CONTINUE;

watermark = low_wmark_pages(zone);
-   /*
-* If watermarks for high-order allocation are already met, there
-* should be no need for compaction at all.
-*/
-   if (zone_watermark_ok(zone, order, watermark, classzone_idx,
-   alloc_flags))
-   return COMPACT_PARTIAL;

/*
 * Watermarks for order-0 must be met for compaction. Note the 2UL.

[PATCH v3 1/2] kbuild, x86: Track generated headers with generated-y

2016-05-23 Thread James Hogan

Track generated header files which aren't already in genhdr-y, alongside
generic-y wrappers in the */include/generated/[uapi/]asm/ directories.
Currently only x86 generates extra headers in these directories, for the
purposes of enumerating system calls for different ABIs, and xen
hypercalls.

This will allow the asm-generic wrapper handling code to remove stale
wrappers when files are removed from generic-y, without also removing
these headers which are generated separately.

Reported-by: kbuild test robot 
Signed-off-by: James Hogan 
Cc: Michal Marek 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Jonathan Corbet 
Cc: linux-kbu...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
Changes in v2:
- New patch (thanks to kbuild test robot).
---
 Documentation/kbuild/makefiles.txt | 14 ++
 arch/x86/include/asm/Kbuild|  6 ++
 2 files changed, 20 insertions(+)

diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index 13f888a02a3d..385a5ef41c17 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -47,6 +47,7 @@ This document describes the Linux kernel Makefiles.
--- 7.2 genhdr-y
--- 7.3 destination-y
--- 7.4 generic-y
+   --- 7.5 generated-y
 
=== 8 Kbuild Variables
=== 9 Makefile language
@@ -1319,6 +1320,19 @@ See subsequent chapter for the syntax of the Kbuild file.
Example: termios.h
#include 
 
+   --- 7.5 generated-y
+
+   If an architecture generates other header files alongside generic-y
+   wrappers, and not included in genhdr-y, then generated-y specifies
+   them.
+
+   This prevents them being treated as stale asm-generic wrappers and
+   removed.
+
+   Example:
+   #arch/x86/include/asm/Kbuild
+   generated-y += syscalls_32.h
+
 === 8 Kbuild Variables
 
 The top Makefile exports the following variables:
diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild
index aeac434c9feb..2cfed174e3c9 100644
--- a/arch/x86/include/asm/Kbuild
+++ b/arch/x86/include/asm/Kbuild
@@ -1,5 +1,11 @@
 
 
+generated-y += syscalls_32.h
+generated-y += syscalls_64.h
+generated-y += unistd_32_ia32.h
+generated-y += unistd_64_x32.h
+generated-y += xen-hypercalls.h
+
 genhdr-y += unistd_32.h
 genhdr-y += unistd_64.h
 genhdr-y += unistd_x32.h
-- 
2.4.10

[PATCH v3 2/2] kbuild: Remove stale asm-generic wrappers

2016-05-23 Thread James Hogan

When a header file is removed from generic-y (often accompanied by the
addition of an arch specific header), the generated wrapper file will
persist, and in some cases may still take precedence over the new arch
header.

For example commit f1fe2d21f4e1 ("MIPS: Add definitions for extended
context") removed ucontext.h from generic-y in arch/mips/include/asm/,
and added an arch/mips/include/uapi/asm/ucontext.h. The continued use of
the wrapper when reusing a dirty build tree resulted in build failures
in arch/mips/kernel/signal.c:

arch/mips/kernel/signal.c: In function ‘sc_to_extcontext’:
arch/mips/kernel/signal.c:142:12: error: ‘struct ucontext’ has no member named ‘uc_extcontext’
  return &uc->uc_extcontext;
^

Fix by detecting and removing wrapper headers in generated header
directories that do not correspond to a filename in generic-y, genhdr-y,
or the newly introduced generated-y.
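
As a rough illustration of the detection step only (not the Makefile code
itself), the filtering can be modelled in C. The function names in_list()
and find_unwanted() are hypothetical; the sketch assumes the wanted list is
the union of generic-y, genhdr-y and generated-y, and that the existing list
is what is currently found in the generated header directory:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears in list[0..n-1], else 0. */
static int in_list(const char *name, const char *const *list, size_t n)
{
	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, list[i]) == 0)
			return 1;
	return 0;
}

/*
 * Hypothetical model of filtering the existing headers against the wanted
 * ones: every existing header that is not in wanted[] is copied to out[].
 * out[] must have room for n_existing entries.
 * Returns the number of stale (unwanted) headers found.
 */
static size_t find_unwanted(const char *const *existing, size_t n_existing,
			    const char *const *wanted, size_t n_wanted,
			    const char **out)
{
	size_t count = 0;

	for (size_t i = 0; i < n_existing; i++)
		if (!in_list(existing[i], wanted, n_wanted))
			out[count++] = existing[i];
	return count;
}
```

With ucontext.h still present on disk but no longer listed anywhere, it is
the only entry reported as stale.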

Reported-by: Jacek Anaszewski 
Reported-by: Hauke Mehrtens 
Reported-by: Heinrich Schuchardt 
Signed-off-by: James Hogan 
Acked-by: Arnd Bergmann 
Acked-by: Florian Fainelli 
Cc: Michal Marek 
Cc: Arnd Bergmann 
Cc: Ralf Baechle 
Cc: Paul Burton 
Cc: Florian Fainelli 
Cc: linux-kbu...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-m...@linux-mips.org
---
Changes in v3:
- Ensure FORCE actually gets marked .PHONY.

Changes in v2:
- Rewrite a bit, drawing inspiration from Makefile.headersinst.
- Exclude genhdr-y and generated-y (thanks to kbuild test robot).
---
 scripts/Makefile.asm-generic | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/scripts/Makefile.asm-generic b/scripts/Makefile.asm-generic
index 045e0098e962..e4d017d53819 100644
--- a/scripts/Makefile.asm-generic
+++ b/scripts/Makefile.asm-generic
@@ -13,11 +13,26 @@ include scripts/Kbuild.include
 # Create output directory if not already present
 _dummy := $(shell [ -d $(obj) ] || mkdir -p $(obj))
 
+# Stale wrappers when the corresponding files are removed from generic-y
+# need removing.
+generated-y   := $(generic-y) $(genhdr-y) $(generated-y)
+all-files := $(patsubst %, $(obj)/%, $(generated-y))
+old-headers   := $(wildcard $(obj)/*.h)
+unwanted  := $(filter-out $(all-files),$(old-headers))
+
 quiet_cmd_wrap = WRAP$@
 cmd_wrap = echo "\#include " >$@
 
-all: $(patsubst %, $(obj)/%, $(generic-y))
+quiet_cmd_remove = REMOVE  $(unwanted)
+cmd_remove = rm -f $(unwanted)
+
+all: $(patsubst %, $(obj)/%, $(generic-y)) FORCE
+   $(if $(unwanted),$(call cmd,remove),)
@:
 
 $(obj)/%.h:
$(call cmd,wrap)
+
+PHONY += FORCE
+.PHONY: $(PHONY)
+FORCE: ;
-- 
2.4.10

Re: [PATCH] i2c_hid: enable i2c-hid devices to suspend/resume asynchronously

2016-05-23 Thread Mika Westerberg

On Thu, May 19, 2016 at 10:46:24AM +0800, Fu, Zhonghui wrote:
> Suspend/resume of i2c-hid devices is usually a time-consuming process.
> For example, the touch controller(i2c-ATML1000:00) on ASUS T100 tablet
> takes about 160ms for suspending and 120ms for resuming. This patch
> enables i2c-hid devices to suspend/resume asynchronously. This will
> take advantage of multicore and speed up system suspend/resume process.
> 
> Signed-off-by: Zhonghui Fu 

Looks reasonable to me,

Reviewed-by: Mika Westerberg

Re: [PATCH 1/6] statx: Add a system call to make enhanced file info available

2016-05-23 Thread Christoph Hellwig

On Fri, May 13, 2016 at 05:28:11PM +0200, Arnd Bergmann wrote:
> I'm trying to understand what that means for the 64-bit time_t syscalls.
> 
> The patch series I did last year had a replacement 'sys_newfstatat()'
> syscall but IIRC no other stat variant, the idea being that we would
> only need to provide this one to the libc and have user space emulate
> the stat/fstat/lstat/fstatat variants based on that.
> With the statx introduction, I was hoping to no longer have to add
> that syscall but instead have libc do everything on top of sys_statx().
> 
> Do you think that is reasonable, given that we won't be allowed to
> call any of the existing stat() variants for a y2038-safe libc build[1],
> or should we plan to keep needing replacement fstatat (and possibly
> stat/lstat/fstat) syscalls with 64-bit time_t even after statx() support
> is merged into the kernel.

Honestly I think this really depends on the amount of 'emulation' we
need - if it's just adding a new flag that can be trivially generated
in the syscall stub in userland that's probably fine, but if we have
actually differing semantics (like the stat weak attributes) I'd rather
have a properly documented syscall.  If we otherwise need to rewrite
whole structures I'd much rather do that in kernel space.

And to get back to stat: it would be really useful to coordinate the
new one with glibc so that we don't end up with two different stat
structures again like we do for a lot of platforms at the moment.

[PATCH v3 0/2] kbuild: Remove stale asm-generic wrappers

2016-05-23 Thread James Hogan

This patchset attempts to fix kbuild to automatically remove stale
asm-generic wrappers, i.e. when files are removed from generic-y and
added directly into arch/*/include/uapi/asm/, but where the existing
wrapper in arch/*/include/generated/asm/ continues to be used.

MIPS was recently burned by this in v4.3 (see patch 2), with continuing
reports of build failures when people upgrade their trees, which go away
after arch/mips/include/generated is removed (or reportedly make
mrproper/distclean). It is particularly irritating during bisection.

Since v2 I've seen other cases of this breaking MIPS build, and testing
on x86_64, starting a build first on v4.0 and then on mainline with this
patchset shows one stale generated header:
  REMOVE  arch/x86/include/generated/asm/scatterlist.h

Changes in v3:
- Ensure FORCE actually gets marked .PHONY.

Changes in v2:
- New patch 1 to add tracking of generated headers that aren't generic-y
  wrappers, via generated-y, particularly for x86 (thanks to kbuild test
  robot).
- Rewrite a bit, drawing inspiration from Makefile.headersinst.
- Exclude genhdr-y and generated-y (thanks to kbuild test robot).

James Hogan (2):
  kbuild, x86: Track generated headers with generated-y
  kbuild: Remove stale asm-generic wrappers

 Documentation/kbuild/makefiles.txt | 14 ++
 arch/x86/include/asm/Kbuild|  6 ++
 scripts/Makefile.asm-generic   | 17 -
 3 files changed, 36 insertions(+), 1 deletion(-)

Cc: Michal Marek 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Jonathan Corbet 
Cc: Arnd Bergmann 
Cc: Ralf Baechle 
Cc: Paul Burton 
Cc: Florian Fainelli 
Cc: Heinrich Schuchardt 
Cc: linux-kbu...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-m...@linux-mips.org
-- 
2.4.10

Re: [PATCH] f2fs: introduce on-disk layout version checking functionality

2016-05-23 Thread Christoph Hellwig

On Fri, May 20, 2016 at 11:30:43AM -0700, Viacheslav Dubeyko wrote:
> I am not sure that I follow to your point. The F2FS has "feature" field
> (__le32 feature) into on-disk superblock (struct f2fs_super_block). The
> suggested patch introduces the new F2FS_FEATURE_16TB_SUPPORT flag. And
> it looks like as your comment.

It does, but at the same time you also introduce a major version
superblock field.

> So, necessary changes in on-disk layout for 16+TB volumes support will
> be incompatible with current available version of F2FS driver. It means
> that, anyway, we need to increase version of on-disk layout (major_ver
> of struct f2fs_super_block). The presence of superblock's version and
> F2FS_FEATURE_16TB_SUPPORT flag will be very useful for consistency
> checking by fsck tool.

Why is the feature not enough for that?

[PATCH] powerpc/pseries: start rtasd before PCI probing

2016-05-23 Thread Greg Kurz

A strange behaviour is observed when comparing PCI hotplug in QEMU, between
x86 and pseries. If you consider the following steps:
- start a VM
- add a PCI device via the QEMU monitor before the rtasd has started (for
  example starting the VM in paused state, or hotplug during FW or boot
  loader)
- resume the VM execution

The x86 kernel detects the PCI device, but the pseries one does not.

This happens because the rtasd kernel worker is currently started under
device_initcall, while PCI probing happens earlier under subsys_initcall.

As a consequence, if we have a pending RTAS event at boot time, a message
is printed and the event is dropped.

This patch moves all the initialization of rtasd to arch_initcall, which is
run before subsys_initcall: this way, logging_enabled is true when the RTAS
event pops up and it is not lost anymore.

The proc fs bits stay at device_initcall because they cannot be run before
fs_initcall.
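
To make the ordering argument concrete, here is a minimal user-space sketch
in C. The numeric levels mirror the initcall levels in include/linux/init.h;
the enum, the struct and the helper event_is_logged() are hypothetical names
used only for this illustration:

```c
#include <assert.h>

/* Subset of the initcall levels, in the order the kernel runs them. */
enum initcall_level {
	LEVEL_ARCH   = 3,	/* arch_initcall()   */
	LEVEL_SUBSYS = 4,	/* subsys_initcall() */
	LEVEL_FS     = 5,	/* fs_initcall()     */
	LEVEL_DEVICE = 6	/* device_initcall(), also plain __initcall() */
};

/* Hypothetical description of when each piece of boot code runs. */
struct boot_order {
	enum initcall_level rtasd_start;	/* event scan becomes active */
	enum initcall_level pci_probe;		/* pending hotplug event shows up */
};

/*
 * The event is only kept if the event scan was started at a strictly
 * earlier level than the one where the event pops up.
 */
static int event_is_logged(const struct boot_order *o)
{
	assert(o != 0);
	return o->rtasd_start < o->pci_probe;
}
```

With rtasd_start at LEVEL_DEVICE (the old behaviour) the helper returns 0,
and with LEVEL_ARCH (this patch) it returns 1.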

Signed-off-by: Greg Kurz 
---
 arch/powerpc/kernel/rtasd.c |   19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index e864b7c5884e..ad9e4e1a2d5d 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -526,10 +526,8 @@ void rtas_cancel_event_scan(void)
 }
 EXPORT_SYMBOL_GPL(rtas_cancel_event_scan);
 
-static int __init rtas_init(void)
+static int __init rtas_event_scan_init(void)
 {
-   struct proc_dir_entry *entry;
-
if (!machine_is(pseries) && !machine_is(chrp))
return 0;
 
@@ -562,13 +560,24 @@ static int __init rtas_init(void)
return -ENOMEM;
}
 
+   start_event_scan();
+
+   return 0;
+}
+arch_initcall(rtas_event_scan_init);
+
+static int __init rtas_init(void)
+{
+   struct proc_dir_entry *entry;
+
+   if (!machine_is(pseries) && !machine_is(chrp))
+   return 0;
+
entry = proc_create("powerpc/rtas/error_log", S_IRUSR, NULL,
&proc_rtas_log_operations);
if (!entry)
printk(KERN_ERR "Failed to create error_log proc entry\n");
 
-   start_event_scan();
-
return 0;
 }
 __initcall(rtas_init);

Re: [PATCH V7 04/11] pci: Add new function to unmap IO resources.

2016-05-23 Thread Jayachandran C

On Tue, May 10, 2016 at 8:49 PM, Tomasz Nowicki  wrote:
> It is very useful to release I/O resources so that the same I/O resources
> can be allocated again (pci_remap_iospace), like in PCI hotplug removal
> scenario. Therefore this patch implements new pci_unmap_iospace call which
> unmaps I/O space as the symmetry to pci_remap_iospace.
>
> Signed-off-by: Sinan Kaya 
> Signed-off-by: Tomasz Nowicki 
> ---
>  drivers/pci/pci.c   | 24 
>  include/linux/pci.h |  1 +
>  2 files changed, 25 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index bc0c914..ff97a0b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include "pci.h"
> @@ -3167,6 +3168,29 @@ int __weak pci_remap_iospace(const struct resource 
> *res, phys_addr_t phys_addr)
>  #endif
>  }
>
> +/**
> + * pci_unmap_iospace - Unmap the memory mapped I/O space
> + * @res: resource to be unmapped
> + *
> + * Unmap the CPU virtual address @res from virtual address space.
> + * Only architectures that have memory mapped IO functions defined
> + * (and the PCI_IOBASE value defined) should call this function.
> + */
> +void  pci_unmap_iospace(struct resource *res)
> +{
> +#if defined(PCI_IOBASE) && defined(CONFIG_MMU)
> +   unsigned long vaddr = (unsigned long)PCI_IOBASE + res->start;
> +
> +   unmap_kernel_range(vaddr, resource_size(res));
> +#else
> +   /*
> +* This architecture does not have memory mapped I/O space,
> +* so this function should never be called.
> +*/
> +   WARN_ONCE(1, "This architecture does not support memory mapped 
> I/O\n");
> +#endif
> +}

WARN is not needed here, since we would have already done it in
pci_remap_iospace.

Ideally, we should undo the pci_register_io_range as well, but
re-registering the same range seems to be fine.

JC.

Re: [PATCH] lightnvm: expose mark_blk through core

2016-05-23 Thread Matias Bjørling


On 05/10/2016 09:25 AM, Javier González wrote:

Expose mark_blk through the core LightNVM operations to hide the media
manager, as we do for the rest of the block operations. This is
necessary for targets to mark a growing bad block as bad before
returning it to the media manager.

Signed-off-by: Javier González 
---
  drivers/lightnvm/core.c  | 6 ++
  include/linux/lightnvm.h | 2 ++
  2 files changed, 8 insertions(+)

diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index 160c1a6..13993c9 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -210,6 +210,12 @@ void nvm_put_blk(struct nvm_dev *dev, struct nvm_block 
*blk)
  }
  EXPORT_SYMBOL(nvm_put_blk);

+void nvm_mark_blk(struct nvm_dev *dev, struct ppa_addr ppa, int type)
+{
+   return dev->mt->mark_blk(dev, ppa, type);
+}
+EXPORT_SYMBOL(nvm_mark_blk);
+
  int nvm_submit_io(struct nvm_dev *dev, struct nvm_rq *rqd)
  {
return dev->mt->submit_io(dev, rqd);
diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
index ef2c7d2..9c56148 100644
--- a/include/linux/lightnvm.h
+++ b/include/linux/lightnvm.h
@@ -532,6 +532,8 @@ extern int nvm_register(struct request_queue *, char *,
struct nvm_dev_ops *);
  extern void nvm_unregister(char *);

+void nvm_mark_blk(struct nvm_dev *dev, struct ppa_addr ppa, int type);
+
  extern int nvm_submit_io(struct nvm_dev *, struct nvm_rq *);
  extern void nvm_generic_to_addr_mode(struct nvm_dev *, struct nvm_rq *);
  extern void nvm_addr_to_generic_mode(struct nvm_dev *, struct nvm_rq *);



Thanks Javier. Applied for 4.8. I updated the description a bit.

Re: [PATCH] gpio: remove redundant owner assignments of drivers

2016-05-23 Thread Charles Keepax

On Mon, May 23, 2016 at 10:49:10AM +0900, Masahiro Yamada wrote:
> A platform_driver need not set an owner since it will be populated
> by platform_driver_register().
> Likewise for mcb_driver (gpio-menz127.c).
> 
> Signed-off-by: Masahiro Yamada 
> ---

For the Wolfson bits:

Acked-by: Charles Keepax 

Thanks,
Charles

Re: linux-next: Tree for May 17

2016-05-23 Thread Stephen Rothwell

Hi Xiong,

On Mon, 23 May 2016 16:13:28 +0800 Xiong Zhou  wrote:
>
> hi,
> 
> On Tue, May 17, 2016 at 1:04 PM, Stephen Rothwell  
> wrote:
> >
> > I have created today's linux-next tree at
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > (patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you  
> 
> Patches after 0516 are not there.
> 
> i'm chasing an oom issue between 0516 and 0518 trees while missed
> 0517 tag, so is the patch file the only way to get there trying 0517 tree?

They are there, just the version numbering puts them out of order in the
page listing - they appear before patch-v4.6-rc1-next-20160327.gz

All the (recent) tags are in the git tree as well, of course.
-- 
Cheers,
Stephen Rothwell

Re: [RFC 06/13] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations

2016-05-23 Thread Michal Hocko

On Fri 20-05-16 15:57:08, Vlastimil Babka wrote:
[...]
> From: Vlastimil Babka 
> Date: Wed, 4 May 2016 13:40:03 +0200
> Subject: [PATCH] mm, thp: remove __GFP_NORETRY from khugepaged and madvised
>  allocations
> 
> After the previous patch, we can distinguish costly allocations that should be
> really lightweight, such as THP page faults, with __GFP_NORETRY. This means we
> don't need to recognize khugepaged allocations via PF_KTHREAD anymore. We can
> also change THP page faults in areas where madvise(MADV_HUGEPAGE) was used to
> try as hard as khugepaged, as the process has indicated that it benefits from
> THP's and is willing to pay some initial latency costs.
> 
> We can also make the flags handling less cryptic by distinguishing
> GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
> GFP_TRANSHUGE (only direct reclaim, khugepaged default). Adding __GFP_NORETRY
> or __GFP_KSWAPD_RECLAIM is done where needed.
> 
> The patch effectively changes the current GFP_TRANSHUGE users as follows:
> 
> * get_huge_zero_page() - the zero page lifetime should be relatively long and
>   it's shared by multiple users, so it's worth spending some effort on it.
>   We use GFP_TRANSHUGE, and __GFP_NORETRY is not added. This also restores
>   direct reclaim to this allocation, which was unintentionally removed by
>   commit e4a49efe4e7e ("mm: thp: set THP defrag by default to madvise and add
>   a stall-free defrag option")
> 
> * alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency is not
>   an issue. So if khugepaged "defrag" is enabled (the default), do reclaim
>   via GFP_TRANSHUGE without __GFP_NORETRY. We can remove the PF_KTHREAD check
>   from page alloc.
>   As a side-effect, khugepaged will now no longer check if the initial
>   compaction was deferred or contended. This is OK, as khugepaged sleep times
>   between collapse attempts are long enough to prevent noticeable disruption,
>   so we should allow it to spend some effort.
> 
> * migrate_misplaced_transhuge_page() - already was masking out __GFP_RECLAIM,
>   so just convert to GFP_TRANSHUGE_LIGHT which is equivalent.
> 
> * alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise) are
>   now allocating without __GFP_NORETRY. Other vma's keep using __GFP_NORETRY
>   if direct reclaim/compaction is at all allowed (by default it's allowed only
>   for madvised vma's). The rest is conversion to GFP_TRANSHUGE(_LIGHT).
> 
> Signed-off-by: Vlastimil Babka 

I like it more than the previous approach.

Acked-by: Michal Hocko 

Thanks!
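
For readers following the flag arithmetic, a small stand-alone C sketch of
how the two masks relate. The SK_* bit values are made up for illustration
and are not the real __GFP_* values; only the way the masks are composed
follows the quoted gfp.h hunk:

```c
#include <assert.h>

/* Hypothetical bit assignments, for illustration only. */
#define SK_DIRECT_RECLAIM	0x01u
#define SK_KSWAPD_RECLAIM	0x02u
#define SK_RECLAIM		(SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_NORETRY		0x04u
#define SK_COMP			0x08u
#define SK_NOMEMALLOC		0x10u
#define SK_NOWARN		0x20u
/* Stand-in for GFP_HIGHUSER_MOVABLE, which carries the reclaim bits. */
#define SK_HIGHUSER_MOVABLE	(0x40u | SK_RECLAIM)

/* Same composition as the new definitions in the patch. */
#define SK_TRANSHUGE_LIGHT	((SK_HIGHUSER_MOVABLE | SK_COMP | \
				  SK_NOMEMALLOC | SK_NOWARN) & ~SK_RECLAIM)
#define SK_TRANSHUGE		(SK_TRANSHUGE_LIGHT | SK_DIRECT_RECLAIM)

/* Non-zero if the mask allows direct reclaim/compaction. */
static int may_direct_reclaim(unsigned int gfp)
{
	return (gfp & SK_DIRECT_RECLAIM) != 0;
}

/* Non-zero if the mask allows waking kswapd. */
static int may_wake_kswapd(unsigned int gfp)
{
	return (gfp & SK_KSWAPD_RECLAIM) != 0;
}
```

The light variant allows neither kind of reclaim, the full variant adds
direct reclaim only, and __GFP_NORETRY is now something callers OR in
explicitly where they want it.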

> ---
>  include/linux/gfp.h| 14 --
>  include/trace/events/mmflags.h |  1 +
>  mm/huge_memory.c   | 27 +++
>  mm/migrate.c   |  2 +-
>  mm/page_alloc.c|  6 ++
>  tools/perf/builtin-kmem.c  |  1 +
>  6 files changed, 28 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 570383a41853..1dfca27df492 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -238,9 +238,11 @@ struct vm_area_struct;
>   *   are expected to be movable via page reclaim or page migration. 
> Typically,
>   *   pages on the LRU would also be allocated with GFP_HIGHUSER_MOVABLE.
>   *
> - * GFP_TRANSHUGE is used for THP allocations. They are compound allocations
> - *   that will fail quickly if memory is not available and will not wake
> - *   kswapd on failure.
> + * GFP_TRANSHUGE and GFP_TRANSHUGE_LIGHT are used for THP allocations. They 
> are
> + *   compound allocations that will generally fail quickly if memory is not
> + *   available and will not wake kswapd/kcompactd on failure. The _LIGHT
> + *   version does not attempt reclaim/compaction at all and is by default 
> used
> + *   in page fault path, while the non-light is used by khugepaged.
>   */
>  #define GFP_ATOMIC   (__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
>  #define GFP_KERNEL   (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
> @@ -255,9 +257,9 @@ struct vm_area_struct;
>  #define GFP_DMA32__GFP_DMA32
>  #define GFP_HIGHUSER (GFP_USER | __GFP_HIGHMEM)
>  #define GFP_HIGHUSER_MOVABLE (GFP_HIGHUSER | __GFP_MOVABLE)
> -#define GFP_TRANSHUGE((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
> -  __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN) & \
> -  ~__GFP_RECLAIM)
> +#define GFP_TRANSHUGE_LIGHT  ((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
> +  __GFP_NOMEMALLOC| __GFP_NOWARN) & ~__GFP_RECLAIM)
> +#define GFP_TRANSHUGE(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
>  
>  /* Convert GFP flags to their corresponding migrate type */
>  #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index 43cedbf0c759..5a81ab48a2fb 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -11,6 +11,7 @@
>  
>  #define __def_gf

[PATCH] powerpc: inline current_stack_pointer()

2016-05-23 Thread Christophe Leroy

current_stack_pointer() is a single-instruction function.
It is not worth breaking the execution flow with a bl/blr for a
single instruction.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/reg.h  | 7 ++-
 arch/powerpc/kernel/misc.S  | 4 
 arch/powerpc/kernel/ppc_ksyms.c | 2 --
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c1e82e9..7ce6777 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1301,7 +1301,12 @@ static inline unsigned long mfvtb (void)
 
 #define proc_trap()asm volatile("trap")
 
-extern unsigned long current_stack_pointer(void);
+static inline unsigned long current_stack_pointer(void)
+{
+   register unsigned long *ptr asm("r1");
+
+   return *ptr;
+}
 
 extern unsigned long scom970_read(unsigned int address);
 extern void scom970_write(unsigned int address, unsigned long value);
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
index 0d43219..7ce26d4 100644
--- a/arch/powerpc/kernel/misc.S
+++ b/arch/powerpc/kernel/misc.S
@@ -114,7 +114,3 @@ _GLOBAL(longjmp)
mtlrr0
mr  r3,r4
blr
-
-_GLOBAL(current_stack_pointer)
-   PPC_LL  r3,0(r1)
-   blr
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 9f01e28..eb5c5dc 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -33,5 +33,3 @@ EXPORT_SYMBOL(store_vr_state);
 #ifdef CONFIG_EPAPR_PARAVIRT
 EXPORT_SYMBOL(epapr_hypercall_start);
 #endif
-
-EXPORT_SYMBOL(current_stack_pointer);
-- 
2.1.0

[PATCH] powerpc32: get rid of sub_reloc_offset()

2016-05-23 Thread Christophe Leroy

sub_reloc_offset() has not been used since
commit 917f0af9e5a9 ("powerpc: Remove arch/ppc and include/asm-ppc")
which removed include/asm-ppc/prom.h

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/misc_32.S | 14 --
 1 file changed, 14 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 285ca8c..d9c912b 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -104,20 +104,6 @@ _GLOBAL(mulhdu)
blr
 
 /*
- * sub_reloc_offset(x) returns x - reloc_offset().
- */
-_GLOBAL(sub_reloc_offset)
-   mflrr0
-   bl  1f
-1: mflrr5
-   lis r4,1b@ha
-   addir4,r4,1b@l
-   subfr5,r4,r5
-   subfr3,r5,r3
-   mtlrr0
-   blr
-
-/*
  * reloc_got2 runs through the .got2 section adding an offset
  * to each entry.
  */
-- 
2.1.0

[PATCH] powerpc32: use stmw/lmw for non volatile registers save/restore

2016-05-23 Thread Christophe Leroy

lmw/stmw cost 1 extra cycle (2 cycles for lmw on some ppc) and imply
serialising. However, they reduce the number of instructions, and
hence the amount of instruction fetching, compared to the equivalent
operation with several lwz/stw. This means less pressure on the cache
and fewer fetch delays on slow memory.
When we transfer 20 registers, it is worth it.
gcc uses stmw/lmw at function entry/exit to save/restore non-volatile
registers, so let's also do it that way.

On powerpc64, we can't use lmw/stmw as they only handle 32 bits, so
we move longjmp() and setjmp() from misc.S to misc_64.S, and we
write a 32-bit version in misc_32.S using stmw/lmw.
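
For reference, a small C model of what the two instructions do, assuming
the usual semantics (registers rS through r31 are stored to, or loaded
from, consecutive 32-bit words). The names stmw_model() and lmw_model() are
hypothetical and exist only for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of "stmw rS,D(rA)": store registers rS..r31 to consecutive
 * 32-bit words starting at mem. Returns the number of words stored.
 */
static unsigned int stmw_model(const uint32_t regs[32], unsigned int rs,
			       uint32_t *mem)
{
	unsigned int n = 0;

	assert(rs < 32);
	for (unsigned int r = rs; r < 32; r++)
		mem[n++] = regs[r];
	return n;
}

/*
 * Model of "lmw rT,D(rA)": load registers rT..r31 from consecutive
 * 32-bit words starting at mem. Returns the number of words loaded.
 */
static unsigned int lmw_model(uint32_t regs[32], unsigned int rt,
			      const uint32_t *mem)
{
	unsigned int n = 0;

	assert(rt < 32);
	for (unsigned int r = rt; r < 32; r++)
		regs[r] = mem[n++];
	return n;
}
```

Starting at r13, as SAVE_NVGPRS does, one instruction moves 19 words;
starting at r12, as the 32-bit setjmp() does to also carry the CR value,
it moves 20.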

Signed-off-by: Christophe Leroy 
---
The patch goes on top of "powerpc: inline current_stack_pointer()" or
requires trivial manual merge in arch/powerpc/kernel/misc.S

 arch/powerpc/include/asm/ppc_asm.h |  6 ++--
 arch/powerpc/kernel/misc.S | 61 --
 arch/powerpc/kernel/misc_32.S  | 22 ++
 arch/powerpc/kernel/misc_64.S  | 61 ++
 4 files changed, 85 insertions(+), 65 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 2b31632..e29b649 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -82,10 +82,8 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
 #else
 #define SAVE_GPR(n, base)  stw n,GPR0+4*(n)(base)
 #define REST_GPR(n, base)  lwz n,GPR0+4*(n)(base)
-#define SAVE_NVGPRS(base)  SAVE_GPR(13, base); SAVE_8GPRS(14, base); \
-   SAVE_10GPRS(22, base)
-#define REST_NVGPRS(base)  REST_GPR(13, base); REST_8GPRS(14, base); \
-   REST_10GPRS(22, base)
+#define SAVE_NVGPRS(base)  stmw13, GPR0+4*13(base)
+#define REST_NVGPRS(base)  lmw 13, GPR0+4*13(base)
 #endif
 
 #define SAVE_2GPRS(n, base)SAVE_GPR(n, base); SAVE_GPR(n+1, base)
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
index 7ce26d4..9de71d8 100644
--- a/arch/powerpc/kernel/misc.S
+++ b/arch/powerpc/kernel/misc.S
@@ -53,64 +53,3 @@ _GLOBAL(add_reloc_offset)
 
.align  3
 2: PPC_LONG 1b
-
-_GLOBAL(setjmp)
-   mflrr0
-   PPC_STL r0,0(r3)
-   PPC_STL r1,SZL(r3)
-   PPC_STL r2,2*SZL(r3)
-   mfcrr0
-   PPC_STL r0,3*SZL(r3)
-   PPC_STL r13,4*SZL(r3)
-   PPC_STL r14,5*SZL(r3)
-   PPC_STL r15,6*SZL(r3)
-   PPC_STL r16,7*SZL(r3)
-   PPC_STL r17,8*SZL(r3)
-   PPC_STL r18,9*SZL(r3)
-   PPC_STL r19,10*SZL(r3)
-   PPC_STL r20,11*SZL(r3)
-   PPC_STL r21,12*SZL(r3)
-   PPC_STL r22,13*SZL(r3)
-   PPC_STL r23,14*SZL(r3)
-   PPC_STL r24,15*SZL(r3)
-   PPC_STL r25,16*SZL(r3)
-   PPC_STL r26,17*SZL(r3)
-   PPC_STL r27,18*SZL(r3)
-   PPC_STL r28,19*SZL(r3)
-   PPC_STL r29,20*SZL(r3)
-   PPC_STL r30,21*SZL(r3)
-   PPC_STL r31,22*SZL(r3)
-   li  r3,0
-   blr
-
-_GLOBAL(longjmp)
-   PPC_LCMPI r4,0
-   bne 1f
-   li  r4,1
-1: PPC_LL  r13,4*SZL(r3)
-   PPC_LL  r14,5*SZL(r3)
-   PPC_LL  r15,6*SZL(r3)
-   PPC_LL  r16,7*SZL(r3)
-   PPC_LL  r17,8*SZL(r3)
-   PPC_LL  r18,9*SZL(r3)
-   PPC_LL  r19,10*SZL(r3)
-   PPC_LL  r20,11*SZL(r3)
-   PPC_LL  r21,12*SZL(r3)
-   PPC_LL  r22,13*SZL(r3)
-   PPC_LL  r23,14*SZL(r3)
-   PPC_LL  r24,15*SZL(r3)
-   PPC_LL  r25,16*SZL(r3)
-   PPC_LL  r26,17*SZL(r3)
-   PPC_LL  r27,18*SZL(r3)
-   PPC_LL  r28,19*SZL(r3)
-   PPC_LL  r29,20*SZL(r3)
-   PPC_LL  r30,21*SZL(r3)
-   PPC_LL  r31,22*SZL(r3)
-   PPC_LL  r0,3*SZL(r3)
-   mtcrf   0x38,r0
-   PPC_LL  r0,0(r3)
-   PPC_LL  r1,SZL(r3)
-   PPC_LL  r2,2*SZL(r3)
-   mtlrr0
-   mr  r3,r4
-   blr
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index d9c912b..de419e9 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -1086,3 +1086,25 @@ relocate_new_kernel_end:
 relocate_new_kernel_size:
.long relocate_new_kernel_end - relocate_new_kernel
 #endif
+
+_GLOBAL(setjmp)
+   mflr    r0
+   stw r0, 0(r3)
+   stw r1, 4(r3)
+   stw r2, 8(r3)
+   mfcr    r12
+   stmw    r12, 12(r3)
+   li  r3, 0
+   blr
+
+_GLOBAL(longjmp)
+   lwz r0, 0(r3)
+   lwz r1, 4(r3)
+   lwz r2, 8(r3)
+   lmw r12, 12(r3)
+   mtcrf   0x38, r12
+   mtlrr0
+   mr. r3, r4
+   bnelr
+   li  r3, 1
+   blr
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index f28754c..7e25249 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -701,3 +701,64 @@ _GLOBAL(kexec_sequence)
li  r5,0
blr /* image->start(physid, image->start, 0); */
 #endif /* CONFIG_KEXEC */
+
+_GLOBAL(setjmp)
+   mflrr0
+

Re: [PATCH] drivers: nvmem: atmel-secumod: New driver for Atmel Secumod nvram

2016-05-23 Thread Srinivas Kandagatla


Thanks for the patch,
A few minor comments below.

On 18/05/16 22:06, David Mosberger-Tang wrote:

Signed-off-by: David Mosberger 
---
  .../devicetree/bindings/nvmem/atmel-secumod.txt|  47 +++
  drivers/nvmem/Kconfig  |   7 +
  drivers/nvmem/Makefile |   2 +
  drivers/nvmem/atmel-secumod.c  | 143 +
  4 files changed, 199 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/nvmem/atmel-secumod.txt
  create mode 100644 drivers/nvmem/atmel-secumod.c

diff --git a/Documentation/devicetree/bindings/nvmem/atmel-secumod.txt b/Documentation/devicetree/bindings/nvmem/atmel-secumod.txt
new file mode 100644
index 000..d65cad5
--- /dev/null
+++ b/Documentation/devicetree/bindings/nvmem/atmel-secumod.txt
@@ -0,0 +1,47 @@
+= Atmel Secumod device tree bindings =
+


Can you split the dt-bindings into a separate patch?


+This binding is intended to represent Atmel's Secumod which is found
+in SAMA5D2 and perhaps others.
+
+Required properties:
+- compatible: should be "atmel,sama5d2-secumod"
+- reg: Should contain RAM location and length, followed
+   by register location and length of the Secumod controller.
+
+= Data cells =
+Are child nodes of secumod, bindings of which as described in
+bindings/nvmem/nvmem.txt
+
+Example:
+
+secumod@fc04 {
+compatible = "atmel,sama5d2-secumod";
+reg = <0xf8044000 0x1420>, <0xfc04 0x4000>;
+reg-names = "SECURAM", "SECUMOD";
+status = "okay";
+
+#address-cells = <1>;
+#size-cells = <1>;
+ranges;
+
+secram-auto-erasable@0 {
+reg = <0x 0x1000>;
+};
+secram@1000 {
+reg = <0x1000 0x400>;
+};
+ram@1400 {
+reg = <0x1400 0x20>;
+};
+};
+
+= Data consumers =
+Are device nodes which consume nvmem data cells.
+
+For example:
+
+   ram {
+   ...
+   nvmem-cells = <&ram>;
+   nvmem-cell-names = "RAM";
+   };
diff --git a/drivers/nvmem/Kconfig b/drivers/nvmem/Kconfig
index 3041d48..88b21e3 100644
--- a/drivers/nvmem/Kconfig
+++ b/drivers/nvmem/Kconfig
@@ -101,4 +101,11 @@ config NVMEM_VF610_OCOTP
  This driver can also be build as a module. If so, the module will
  be called nvmem-vf610-ocotp.

+config NVMEM_ATMEL_SECUMOD
+   tristate "Atmel Secure Module driver"
+   depends on ARCH_AT91


COMPILE_TEST ?
Also please add
depends on HAS_IOMEM


+   help
+ Select this to get support for the secure module (SECUMOD) built
+into the SAMA5D2 chips.
+
  endif

...


index 000..fc5a96b
--- /dev/null
+++ b/drivers/nvmem/atmel-secumod.c


...

+
+/*
+ * Security-module register definitions:
+ */
+#define SECUMOD_RAMRDY 0x0014
+
+/*
+ * Since the secure module may need to automatically erase some of the
+ * RAM, it may take a while for it to be ready.  As far as I know,
+ * it's not documented how long this might take in the worst-case.
+ */
+static void
+secumod_wait_ready (void *regs)
+{
+   unsigned long start, stop;
+
+   start = jiffies;
+   while (!(readl(regs + SECUMOD_RAMRDY) & 1))
+   msleep_interruptible(1);


In the worst case the system would loop here forever. Can we add a
worst-case timeout for this and get out of the loop?
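
Something along these lines, shown here as a user-space C model rather than
driver code. struct fake_secumod and fake_ready() are hypothetical
stand-ins for reading the SECUMOD_RAMRDY bit, and max_polls stands in for a
real time limit; in the driver the bound would more likely be a jiffies
deadline checked with time_after(), or readl_poll_timeout() from
linux/iopoll.h:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical fake device: becomes ready after a number of polls. */
struct fake_secumod {
	unsigned int polls_until_ready;
};

/* Stand-in for "readl(regs + SECUMOD_RAMRDY) & 1". */
static int fake_ready(struct fake_secumod *dev)
{
	if (dev->polls_until_ready == 0)
		return 1;
	dev->polls_until_ready--;
	return 0;
}

/*
 * Poll until the device reports ready, but give up after max_polls
 * attempts. Returns 0 on success and -ETIMEDOUT if the device never
 * became ready within the bound.
 */
static int wait_ready_bounded(struct fake_secumod *dev, unsigned int max_polls)
{
	assert(dev != 0);
	for (unsigned int i = 0; i < max_polls; i++) {
		if (fake_ready(dev))
			return 0;
		/* The real driver would sleep here between polls. */
	}
	return -ETIMEDOUT;
}
```

The caller can then fail the probe with the returned error instead of
hanging.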



+   stop = jiffies;
+   if (stop != start)
+   pr_info("nvmem-atmel-secumod: it took %u msec for SECUMOD "
+   "to become ready...\n", jiffies_to_msecs(stop - start));
+   else
+   pr_info("nvmem-atmel-secumod: ready\n");
I don't see any use for these prints. We should probably remove them and
add just one dev_dbg.



+}
+

...
thanks,
srini

[x86] ad81363cd6: kernel BUG at lib/atomic64_test.c:184!

2016-05-23 Thread kernel test robot



FYI, we noticed the following commit:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs iso-atomic
commit ad81363cd63bc4700ad2f98e91cb20faf81e5c04 ("x86: Use ISO atomics")


on test machine: vm-vp-quantal-x86_64: 2 threads qemu-system-x86_64 -enable-kvm 
with 360M memory

caused below changes:


+--+++
|  | 2359154bd8 | ad81363cd6 |
+--+++
| boot_successes   | 10 | 0  |
| boot_failures| 0  | 10 |
| kernel_BUG_at_lib/atomic64_test.c| 0  | 10 |
| invalid_opcode:#[##]SMP  | 0  | 10 |
| RIP:test_atomic64| 0  | 10 |
| Kernel_panic-not_syncing:Fatal_exception | 0  | 10 |
| backtrace:test_atomics   | 0  | 10 |
| backtrace:kernel_init_freeable   | 0  | 10 |
+--+++



[0.949900]generic_sse: 17784.000 MB/sec
[0.950806] xor: using function: prefetch64-sse (19576.000 MB/sec)
[0.951956] [ cut here ]
[0.952898] kernel BUG at lib/atomic64_test.c:184!
[0.954041] invalid opcode:  [#1] SMP 
[0.955069] Modules linked in:
[0.955936] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-00010-gad81363 #1
[0.957161] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Debian-1.8.2-1 04/01/2014
[0.958950] task: 880013028040 ti: 88001303 task.ti: 
88001303
[0.960559] RIP: 0010:[]  [] 
test_atomic64+0x9c2/0x9c4
[0.962392] RSP: :880013033e78  EFLAGS: 00010246
[0.963427] RAX:  RBX: 827f6ac2 RCX: deadbeefdeafcafe
[0.964645] RDX: 2221 RSI: aaa31337c001d00d RDI: 0246
[0.965842] RBP: 880013033e88 R08: 200201101001 R09: 8800131f
[0.967053] R10: 880013033e70 R11: 821b37a0 R12: 8800131f0008
[0.968272] R13:  R14: 8241d050 R15: 880013dd63c0
[0.969481] FS:  () GS:880013c0() 
knlGS:
[0.971179] CS:  0010 DS:  ES:  CR0: 80050033
[0.972239] CR2:  CR3: 02418000 CR4: 06f0
[0.973445] Stack:
[0.974101]  2221  880013033e98 
827f6ad0
[0.975976]  880013033f08 81000403 823a7000 
8239fd88
[0.977856]  880014e1550e 0200 821f5075 
0001
[0.979738] Call Trace:
[0.980451]  [] test_atomics+0xe/0xe
[0.981450]  [] do_one_initcall+0xe8/0x17b
[0.982504]  [] ? set_debug_rodata+0x12/0x12
[0.983572]  [] kernel_init_freeable+0x1cf/0x257
[0.984672]  [] kernel_init+0xe/0xf5
[0.985669]  [] ret_from_fork+0x22/0x50
[0.986687]  [] ? rest_init+0x13b/0x13b
[0.987713] Code: 22 22 11 11 11 11 48 89 45 f0 48 8b 45 f0 48 89 45 f8 48 
8b 45 f8 48 85 c0 7e 10 48 8d 50 ff 48 8b 45 f8 f0 48 0f b1 55 f0 75 e3 <0f> 0b 
55 48 89 e5 e8 fd ee ff ff e8 2e f6 ff ff 55 45 31 c9 48 
[0.995163] RIP  [] test_atomic64+0x9c2/0x9c4
[0.996308]  RSP 
[0.997140] ---[ end trace b63161db4a10d7b2 ]---
[0.998082] Kernel panic - not syncing: Fatal exception


FYI, raw QEMU command line is:

qemu-system-x86_64 -enable-kvm -kernel 
/pkg/linux/x86_64-nfsroot/gcc-6/ad81363cd63bc4700ad2f98e91cb20faf81e5c04/vmlinuz-4.6.0-00010-gad81363
 -append 'root=/dev/ram0 user=lkp 
job=/lkp/scheduled/vm-vp-quantal-x86_64-61/bisect_boot-1-quantal-core-x86_64.cgz-x86_64-nfsroot-ad81363cd63bc4700ad2f98e91cb20faf81e5c04-20160523-106725-1xpyqfq-0.yaml
 ARCH=x86_64 kconfig=x86_64-nfsroot branch=linux-devel/devel-hourly-2016052013 
commit=ad81363cd63bc4700ad2f98e91cb20faf81e5c04 
BOOT_IMAGE=/pkg/linux/x86_64-nfsroot/gcc-6/ad81363cd63bc4700ad2f98e91cb20faf81e5c04/vmlinuz-4.6.0-00010-gad81363
 max_uptime=600 
RESULT_ROOT=/result/boot/1/vm-vp-quantal-x86_64/quantal-core-x86_64.cgz/x86_64-nfsroot/gcc-6/ad81363cd63bc4700ad2f98e91cb20faf81e5c04/0
 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug 
sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw 
ip=vm-vp-quantal-x86_64-61::dhcp drbd.minor_count=8'  -initrd 
/fs/sdd1/initrd-vm-vp-quantal-x86_64-61 -m 360 -smp 2 -device e1000,netdev=net0 
-netdev user,id=net0 -boot order=nc -no-reboot -watchdog i6300esb -rtc 
base=localtime -pidfile /dev/shm/kboot/pid-vm-vp-quantal-x86_64-61 -serial 
file:/dev/shm/kboot/serial-vm-vp-quantal-x86_64-61 -daemonize -display none 
-monitor null

Re: [Intel-gfx] linux-next: build failure after merge of the drm-intel tree

2016-05-23 Thread Jani Nikula

On Mon, 23 May 2016, Stephen Rothwell  wrote:
> Hi all,
>
> After merging the drm-intel tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
>
> In file included from drivers/gpu/drm/i915/i915_trace.h:10:0,
>  from drivers/gpu/drm/i915/i915_drv.h:2735,
>  from drivers/gpu/drm/i915/i915_drv.c:34:
> drivers/gpu/drm/i915/intel_drv.h:36:41: fatal error: 
> drm/drm_dp_dual_mode_helper.h: No such file or directory
>
> Caused by commit
>
>   8d87410a019f ("drm/i915: Respect DP++ adaptor TMDS clock limit")
>
> I have used the drm-intel tree from next-20160520 for today.

Hi Stephen, my bad, should be fixed now, sorry for the trouble.

(Note to self, don't even dream of doing this stuff when you're out
sick. Try to remember there was a reason you were out sick and not at
the office in the first place...)

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Technology Center

Re: [RFC PATCH] Increase in idle power with schedutil

2016-05-23 Thread Lorenzo Pieralisi

On Sun, May 22, 2016 at 01:42:52PM -0700, Steve Muckle wrote:
> On Sun, May 22, 2016 at 12:39:12PM +0200, Peter Zijlstra wrote:
> > On Fri, May 20, 2016 at 05:53:41PM +0530, Shilpasri G Bhat wrote:
> > > 
> > > Below are the comparisons by disabling watchdog.
> > > Both schedutil and ondemand have a similar ramp-down trend. And in both
> > > the cases I can see that frequency of the cpu is not reduced in
> > > deterministic fashion. In a observation window of 30 seconds after
> > > running a workload I can see that the frequency is not ramped down on
> > > some cpus in the system and are idling at max frequency.
> > 
> > So does it actually matter what the frequency is when you idle? Isn't
> > the whole thing clock gated anyway?
> > 
> > Because this seems to generate contradictory requirements, on the one
> > hand we want to stay idle as long as possible while on the other hand
> > you seem to want to clock down while idle, which requires not being
> > idle.
> > 
> > If it matters; should not your idle state muck explicitly set/restore
> > frequency?
> 
> AFAIK this is very platform dependent. Some will waste more power than
> others when a CPU idles above fmin due to things like resource (bus
> bandwidth, shared cache freq etc) voting.

It is also related to static leakage power, which depends on the operating
voltage (i.e. higher operating frequencies require higher voltage), so
scaling the frequency before going idle may not be effective unless the
voltage is scaled down as well.

Lorenzo

Re: [PATCH V2] vfio: platform: support No-IOMMU mode

2016-05-23 Thread Eric Auger

Hi Peng,
On 05/23/2016 10:14 AM, Peng Fan wrote:
> The vfio No-IOMMU mode was supported by this
> 'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
> but it only support vfio-pci.
> 
> Using vfio_iommu_group_get/put, but not iommu_group_get/put,
> the platform devices can be exposed to userspace with
> CONFIG_VFIO_NOIOMMU and the "enable_unsafe_noiommu_mode"
> option enabled.
> 
> From 'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
> "This should make it very clear that this mode is not safe.
> Additionally, CAP_SYS_RAWIO privileges are necessary to work
> with groups and containers using this mode.  Groups making
> use of this support are named /dev/vfio/noiommu-$GROUP and
> can only make use of the special VFIO_NOIOMMU_IOMMU for the
> container.  Use of this mode, specifically binding a device
> without a native IOMMU group to a VFIO bus driver will taint
> the kernel and should therefore not be considered supported.",
> 
> Actually, for vfio-platform No-IOMMU mode, the userspace can
> not do DMA, because the ioctl API of noiommu container only
> supports VFIO_CHECK_EXTENSION and VFIO_IOMMU_MAP_DMA is not
> supported.
I have not played with no-iommu mode yet, but I am surprised by this last
sentence. Without an IOMMU the VFIO_IOMMU_MAP_DMA ioctl is not relevant,
since you do not need to and cannot map anything. But does this really
mean the device cannot perform DMA transfers to physical memory if it is
programmed to do so? I don't think so.

Best Regards

Eric
> 
> Signed-off-by: Peng Fan 
> Cc: Eric Auger 
> Cc: Baptiste Reynal 
> Cc: Alex Williamson 
> ---
> 
> V2:
>  Rename subject to support No-IOMMU
>  Add more commit log.
> 
>  I wrote a simple program following this
>  
> https://github.com/virtualopensystems/vfio-host-test/blob/master/src_test/vfio_device_test.c
>  ,no dma support. The device's register can be
>  accessed in userspace using command './vfio_dev_test 30b6.usdhc 0 1 
> platform'
> 
>  drivers/vfio/platform/vfio_platform_common.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_common.c 
> b/drivers/vfio/platform/vfio_platform_common.c
> index e65b142..993b2f9 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -561,7 +561,7 @@ int vfio_platform_probe_common(struct 
> vfio_platform_device *vdev,
>  
>   vdev->device = dev;
>  
> - group = iommu_group_get(dev);
> + group = vfio_iommu_group_get(dev);
>   if (!group) {
>   pr_err("VFIO: No IOMMU group for device %s\n", vdev->name);
>   return -EINVAL;
> @@ -569,7 +569,7 @@ int vfio_platform_probe_common(struct 
> vfio_platform_device *vdev,
>  
>   ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
>   if (ret) {
> - iommu_group_put(group);
> + vfio_iommu_group_put(group, dev);
>   return ret;
>   }
>  
> @@ -589,7 +589,7 @@ struct vfio_platform_device 
> *vfio_platform_remove_common(struct device *dev)
>  
>   if (vdev) {
>   vfio_platform_put_reset(vdev);
> - iommu_group_put(dev->iommu_group);
> + vfio_iommu_group_put(dev->iommu_group, dev);
>   }
>  
>   return vdev;
>

Re: [PATCH V7 3/3] soc/tegra: pmc: Add support for IO pads power state and voltage

2016-05-23 Thread Jon Hunter


On 20/05/16 15:45, Laxman Dewangan wrote:
> The IO pins of Tegra SoCs are grouped for common control of IO
> interface like setting voltage signal levels and power state of
> the interface. The group is generally referred as IO pads. The
> power state and voltage control of IO pins can be done at IO pads
> level.
> 
> Tegra generation SoC supports the power down of IO pads when it
> is not used even in the active state of system. This saves power
> from that IO interface. Also it supports multiple voltage level
> in IO pins for interfacing on some of pads. The IO pad voltage is
> automatically detected till T124, hence SW need not to configure
> this. But from T210, the automatically detection logic has been
> removed, hence SW need to explicitly set the IO pad voltage into
> IO pad configuration registers.
> 
> Add support to set the power states and voltage level of the IO pads
> from client driver. The implementation for the APIs are in generic
> which is applicable for all generation os Tegra SoC.
> 
> IO pads ID and information of bit field for power state and voltage
> level controls are added for Tegra124, Tegra132 and Tegra210. The SOR
> driver is modified to use the new APIs.
> 
> Signed-off-by: Laxman Dewangan 

Thanks. I will defer to Thierry on how this should be organised for
merging but I am happy with the code. There is one minor typo below, but
otherwise ...

Acked-by: Jon Hunter 


> ---
> Changes from V1:
> This is reworked on earlier path to have separation between IO rails and
> io pads and add power state and voltage control APIs in single call.
> 
> Changes from V2:
> - Remove the tegra_io_rail_power_off/on() apis and change client (sor) driver
> to use the new APIs for IO pad power.
> - Remove the TEGRA_IO_RAIL_ macros.
> 
> Changes from V3:
> - Make all pad_id/io_pad_id to id.
> - tegra_io_pad_ -> tegra_io_pads
> - dpd_bit -> bit, pwr_mask/bit to mask/bit.
> - Rename function to tegra_io_pads_{set,get}_voltage_config
> - Make the io pad tables common for all SoC.
> - Make io_pads enums.
> - Add enums for voltage.
> 
> Changes from V4:
> - IO_PAD->IO_PADS
> - TEGRA_IO_PADS_POWER_SOURCE_ -> TEGRA_IO_PADS_VCONF_
> 
> Changes from V5:
> - Fix comment style to multi-line format.
> - Use -EINVAL instead of -1 to refactor some of function as suggested by Jon.
> 
> Changes from V6:
> - Doc style formatting.
> - io pads id checks.
> - Documenting public functions.
> - Corrected error numbers.
> ---
>  drivers/gpu/drm/tegra/sor.c |   8 +-
>  drivers/soc/tegra/pmc.c | 280 +++-
>  include/soc/tegra/pmc.h | 133 +++--
>  3 files changed, 350 insertions(+), 71 deletions(-)

> +/**
> + * Define the IO_PADS SOC for SOC mask to find out that IO pads supported
> + * or not in given SoC.
> + */

In addition to the typo, I would have made this a normal multi-line comment.

> +#define TEGRA_IO_PADS_T124   0x1
> +#define TEGRA_IO_PADS_T210   0x2
> +#define TEGRA_IO_PADS_T124_T210  (TEGRA_IO_PADS_T124 |   \
> + TEGRA_IO_PADS_T210)
> +

...

> +/**
> + * tegra_io_pads_power_enablei() - Enable the power to IO pads.

Typo ... s/enablei/enable


> +/*
> + * TEGRA_IO_PAD: The IO pins of Tegra SoCs are grouped for common

TEGRA_IO_PADS

> +/* tegra_io_pads_vconf_voltage: The voltage level of IO rails which source
> + *   the IO pads.
> + */
> +enum tegra_io_pads_vconf_voltage {
> + TEGRA_IO_PADS_VCONF_180UV,
> + TEGRA_IO_PADS_VCONF_330UV,
> +};

Normal multi-line comment.

Cheers
Jon

--
nvpublic

Re: [RFC v2 2/5] drm/mediatke: add support for Mediatek SoC MT2701

2016-05-23 Thread CK Hu

Hi, YT:

Some comments below.

On Fri, 2016-05-20 at 23:05 +0800, yt.s...@mediatek.com wrote:
> From: YT Shen 
> 
> This patch add support for the Mediatek MT2701 DISP subsystem.
> There is only one OVL engine in MT2701.
> 
> Signed-off-by: YT Shen 
>  
> +static void mtk_ddp_mux_sel(void __iomem *config_regs,
> + enum mtk_ddp_comp_id cur, enum mtk_ddp_comp_id next)
> +{
> + if (cur == DDP_COMPONENT_BLS && next == DDP_COMPONENT_DSI0) {
> + writel_relaxed(BLS_TO_DSI_RDMA1_TO_DPI1,
> +config_regs + DISP_REG_CONFIG_OUT_SEL);
> + }
> +}
> +

The function name 'mux' looks strange. The register written here
controls the single output selection. I prefer to rename it as
mtk_ddp_sout_sel().

>  
> -static const enum mtk_ddp_comp_id mtk_ddp_main[] = {
> +static const enum mtk_ddp_comp_id mtk_ddp_main_2701[] = {
> + DDP_COMPONENT_OVL0,
> + DDP_COMPONENT_RDMA0,
> + DDP_COMPONENT_COLOR0,
> + DDP_COMPONENT_BLS,
> + DDP_COMPONENT_DSI0,
> +};
> +
> +static const enum mtk_ddp_comp_id mtk_ddp_ext_2701[] = {
> + DDP_COMPONENT_OVL0,
> + DDP_COMPONENT_DSI0,
> +};
> +

These two pipelines share components such as OVL0 and DSI0, so I think
a user program could not enable both crtcs at the same time. Maybe
MT2701 has only one crtc, in which case you should modify the initial flow
to create only one crtc for the main display. Or, if this is a typo in the
external display pipe, please correct it.

Regards,
CK

[PATCH v2 1/2] lightnvm: hold lock until finish the target creation

2016-05-23 Thread Wenwei Tao

From: Wenwei Tao 

When creating a target, we first check whether the target
already exists. If it does not, we release the lock and
continue the creation. This cannot prevent concurrent
creation of the same target, so hold the lock until the
target creation finishes.

Signed-off-by: Wenwei Tao 
---
Changes since v1
-rebase to for-4.8/core

 drivers/lightnvm/gennvm.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/lightnvm/gennvm.c b/drivers/lightnvm/gennvm.c
index c65fb67..39ff0af 100644
--- a/drivers/lightnvm/gennvm.c
+++ b/drivers/lightnvm/gennvm.c
@@ -44,6 +44,7 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
struct nvm_tgt_type *tt;
struct nvm_target *t;
void *targetdata;
+   int ret = -ENOMEM;
 
tt = nvm_find_target_type(create->tgttype, 1);
if (!tt) {
@@ -55,14 +56,13 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
t = gen_find_target(gn, create->tgtname);
if (t) {
pr_err("nvm: target name already exists.\n");
-   mutex_unlock(&gn->lock);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto err_unlock;
}
-   mutex_unlock(&gn->lock);
 
t = kmalloc(sizeof(struct nvm_target), GFP_KERNEL);
if (!t)
-   return -ENOMEM;
+   goto err_unlock;
 
tqueue = blk_alloc_queue_node(GFP_KERNEL, dev->q->node);
if (!tqueue)
@@ -95,8 +95,6 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
t->type = tt;
t->disk = tdisk;
t->dev = dev;
-
-   mutex_lock(&gn->lock);
list_add_tail(&t->list, &gn->targets);
mutex_unlock(&gn->lock);
 
@@ -107,7 +105,9 @@ err_queue:
blk_cleanup_queue(tqueue);
 err_t:
kfree(t);
-   return -ENOMEM;
+err_unlock:
+   mutex_unlock(&gn->lock);
+   return ret;
 }
 
 static void __gen_remove_target(struct nvm_target *t)
-- 
1.8.3.1
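
The locking change above can be illustrated in userspace. This is only a
sketch under stated assumptions: a pthread mutex stands in for gn->lock, a
singly linked list stands in for gn->targets, and struct registry,
find_target() and create_target() are hypothetical names, not the gennvm
code. The point it shows is that the lookup and the insertion happen under
one uninterrupted hold of the lock, with a single unlock on the exit path.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 32

struct target {
	char name[NAME_LEN];
	struct target *next;
};

struct registry {
	pthread_mutex_t lock;
	struct target *targets;
};

/* Look up a target by name. The caller must hold reg->lock. */
static struct target *find_target(struct registry *reg, const char *name)
{
	struct target *t;

	for (t = reg->targets; t; t = t->next)
		if (!strcmp(t->name, name))
			return t;
	return NULL;
}

/*
 * Create a target unless the name is taken. The lock is held from the
 * existence check until the new entry is linked in, so two concurrent
 * callers cannot both pass the check for the same name.
 */
static int create_target(struct registry *reg, const char *name)
{
	struct target *t;
	int ret = -ENOMEM;

	if (strlen(name) >= NAME_LEN)
		return -EINVAL;

	pthread_mutex_lock(&reg->lock);

	if (find_target(reg, name)) {
		ret = -EINVAL;
		goto out_unlock;
	}

	t = calloc(1, sizeof(*t));
	if (!t)
		goto out_unlock;

	strcpy(t->name, name);	/* length was checked above */
	t->next = reg->targets;
	reg->targets = t;
	ret = 0;

out_unlock:
	pthread_mutex_unlock(&reg->lock);
	return ret;
}
```

The trade-off is the same as in the patch: the allocation now happens with
the lock held, which is acceptable for a mutex but lengthens the critical
section.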

[RFC PATCH 2/2] lightnvm: Append device name to target name

2016-05-23 Thread Wenwei Tao

From: Wenwei Tao 

We may create targets with the same name on different
backend devices. This is not what we want, so append
the device name to the target name to make the new target
name unique in the system.

Signed-off-by: Wenwei Tao 
---
 drivers/lightnvm/gennvm.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/lightnvm/gennvm.c b/drivers/lightnvm/gennvm.c
index 39ff0af..ecb09cb 100644
--- a/drivers/lightnvm/gennvm.c
+++ b/drivers/lightnvm/gennvm.c
@@ -43,9 +43,18 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
struct gendisk *tdisk;
struct nvm_tgt_type *tt;
struct nvm_target *t;
+   char tgtname[DISK_NAME_LEN];
void *targetdata;
int ret = -ENOMEM;
 
+   if (strlen(dev->name) + strlen(create->tgtname) + 1 > DISK_NAME_LEN) {
+   pr_err("nvm: target name too long. %s:%s\n",
+   dev->name, create->tgtname);
+   return -EINVAL;
+   }
+
+   sprintf(tgtname, "%s%s", dev->name, create->tgtname);
+
tt = nvm_find_target_type(create->tgttype, 1);
if (!tt) {
pr_err("nvm: target type %s not found\n", create->tgttype);
@@ -53,7 +62,7 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
}
 
mutex_lock(&gn->lock);
-   t = gen_find_target(gn, create->tgtname);
+   t = gen_find_target(gn, tgtname);
if (t) {
pr_err("nvm: target name already exists.\n");
ret = -EINVAL;
@@ -73,7 +82,7 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
if (!tdisk)
goto err_queue;
 
-   sprintf(tdisk->disk_name, "%s", create->tgtname);
+   sprintf(tdisk->disk_name, "%s", tgtname);
tdisk->flags = GENHD_FL_EXT_DEVT;
tdisk->major = 0;
tdisk->first_minor = 0;
-- 
1.8.3.1

RE: [PATCH 1/1 RFC] net/phy: Add Lantiq PHY driver

2016-05-23 Thread Mehrtens, Hauke

Hi Alexander,

> -Original Message-
> From: Alexander Stein [mailto:alexander.st...@systec-electronic.com]
> Sent: Thursday, May 19, 2016 12:22 PM
> To: Mathias Kresin 
> Cc: John Crispin ; Florian Fainelli ;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org; and...@lunn.ch;
> Mehrtens, Hauke 
> Subject: Re: [PATCH 1/1 RFC] net/phy: Add Lantiq PHY driver
> 
> On Thursday 19 May 2016 12:03:10, Mathias Kresin wrote:
> > 2016-05-19 9:03 GMT+02:00 John Crispin :
> > > On 19/05/2016 08:57, Alexander Stein wrote:
> > >> Thanks for the link, I wasn't aware of that patch. I like it in
> > >> general, but there are some things I'd like to get addressed first:
> > >> * vr9_gphy_of_reg_init() writes uncoditionally to led3h and led3l
> > >> even on
> > >>
> > >>   PEf7071 which does not have this register at all
> > >
> > > we use this driver mainly on the 11g and 22f version. mathias
> > > recently added the led3 handling.
> > >
> > > @Mathias, can you have a look at this and fix it inside the lede tree ?
> >
> > Well, I haven't added the led3 handling, I've only changed the initial
> > value (function) of led3.
> >
> > Maybe it's cleaner to not use a default value for the led function and
> > completely rely on the device tree bindings. But by adjusting the
> > initial values, I had to change only the led function of one board in
> > the openwrt xrx200 subtarget instead of touching all dts files.
> 
> I think setting default values is good.

The registers are set to some reset values when the chip comes out of
reset, but we should set them all to the same value. Mathias said that all
boards he knows of except one use only one LED per port, but they often use
different LED pins. I will change my patch.

> > I know that the LTQ Datasheet for the PEF 7071 Version 1.5 mentions
> > the led3 control register albeit there is no pin for a forth led. So I
> > guess it's safe to write to the led3 register even for the PEF 7071.
> 
> Mh, my PEF 7071 User Manual (Version 2.0, 2012-10-17) doesn't mention
> LED3x registers. There is LED3DA and LED3EN in PHY_LED but was removed in
> 1.6 manual.

LED3x is only available in PEF 7072 which is a different package with more pins 
for the LED3 and some other interfaces.

> I think, some flag if the PHY supports LED3 and depend on that is just fine.

I do not know how to distinguish between PEF 7071 and PEF 7072.

Hauke

Re: [PATCH] soc: mtk-pmic-wrap: avoid integer overflow warning

2016-05-23 Thread Henry Chen

Hi,

Thanks for the patch.

On Thu, 2016-05-12 at 23:01 +0200, Arnd Bergmann wrote:
> On ARM64, the mtk-pmic-wrap driver causes a harmless warning:
> 
> mtk-pmic-wrap.c:1062:16: warning: large integer implicitly truncated to 
> unsigned type [-Woverflow]
> mtk-pmic-wrap.c:1074:16: warning: large integer implicitly truncated to 
> unsigned type [-Woverflow]
> mtk-pmic-wrap.c:1086:16: warning: large integer implicitly truncated to 
> unsigned type [-Woverflow]
>   .int_en_all = ~(BIT(31) | BIT(1)),
> 
> The problem is that the result of the BIT() macro is an 'unsigned long',
> so taking the bitwise NOT operation of that results in an integer
> with the upper 32 bits all set and that cannot be assigned to a
> 'u32' variable without loss of information.
> 
> This is harmless because we were never interested in the upper bits
> here anyway, so we can shut up the warning by adding a simple cast
> to 'u32'.
> 
> Signed-off-by: Arnd Bergmann 

Acked-by: Henry Chen 

> ---
>  drivers/soc/mediatek/mtk-pmic-wrap.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-pmic-wrap.c 
> b/drivers/soc/mediatek/mtk-pmic-wrap.c
> index 3c3e56df526e..a003ba26ca6e 100644
> --- a/drivers/soc/mediatek/mtk-pmic-wrap.c
> +++ b/drivers/soc/mediatek/mtk-pmic-wrap.c
> @@ -1059,7 +1059,7 @@ static const struct pmic_wrapper_type pwrap_mt2701 = {
>   .regs = mt2701_regs,
>   .type = PWRAP_MT2701,
>   .arb_en_all = 0x3f,
> - .int_en_all = ~(BIT(31) | BIT(2)),
> + .int_en_all = ~(u32)(BIT(31) | BIT(2)),
>   .spi_w = PWRAP_MAN_CMD_SPI_WRITE_NEW,
>   .wdt_src = PWRAP_WDT_SRC_MASK_ALL,
>   .has_bridge = 0,
> @@ -1071,7 +1071,7 @@ static struct pmic_wrapper_type pwrap_mt8135 = {
>   .regs = mt8135_regs,
>   .type = PWRAP_MT8135,
>   .arb_en_all = 0x1ff,
> - .int_en_all = ~(BIT(31) | BIT(1)),
> + .int_en_all = ~(u32)(BIT(31) | BIT(1)),
>   .spi_w = PWRAP_MAN_CMD_SPI_WRITE,
>   .wdt_src = PWRAP_WDT_SRC_MASK_ALL,
>   .has_bridge = 1,
> @@ -1083,7 +1083,7 @@ static struct pmic_wrapper_type pwrap_mt8173 = {
>   .regs = mt8173_regs,
>   .type = PWRAP_MT8173,
>   .arb_en_all = 0x3f,
> - .int_en_all = ~(BIT(31) | BIT(1)),
> + .int_en_all = ~(u32)(BIT(31) | BIT(1)),
>   .spi_w = PWRAP_MAN_CMD_SPI_WRITE,
>   .wdt_src = PWRAP_WDT_SRC_MASK_NO_STAUPD,
>   .has_bridge = 0,
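
The truncation described in the commit message can be reproduced in
userspace. This is a sketch under one assumption: BIT64() is a local macro
that produces a 64-bit value, standing in for the kernel's BIT() on ARM64
where 'unsigned long' is 64 bits wide. The function names are hypothetical.

```c
#include <stdint.h>

/* Local stand-in for BIT() on a platform with 64-bit 'unsigned long'. */
#define BIT64(n) (UINT64_C(1) << (n))

/*
 * Complement taken on the 64-bit value: bits 63..32 end up set as well,
 * which is what the compiler warns about when assigning to a u32.
 */
static uint64_t mask_not_then_widen(void)
{
	return ~(BIT64(31) | BIT64(1));
}

/*
 * Cast to 32 bits first, then complement: the result fits a u32 and the
 * low 32 bits are identical to the uncast version.
 */
static uint32_t mask_cast_then_not(void)
{
	return ~(uint32_t)(BIT64(31) | BIT64(1));
}
```

The low 32 bits are the same either way, which matches the commit message's
point that the warning is harmless and the cast only silences it.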

Re: [RFC PATCH 2/2] lightnvm: Append device name to target name

2016-05-23 Thread Matias Bjørling


On 05/23/2016 11:13 AM, Wenwei Tao wrote:

From: Wenwei Tao 

We may create targets with the same name on different
backend devices. This is not what we want, so append
the device name to the target name to make the new target
name unique in the system.

Signed-off-by: Wenwei Tao 
---
  drivers/lightnvm/gennvm.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/lightnvm/gennvm.c b/drivers/lightnvm/gennvm.c
index 39ff0af..ecb09cb 100644
--- a/drivers/lightnvm/gennvm.c
+++ b/drivers/lightnvm/gennvm.c
@@ -43,9 +43,18 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
struct gendisk *tdisk;
struct nvm_tgt_type *tt;
struct nvm_target *t;
+   char tgtname[DISK_NAME_LEN];
void *targetdata;
int ret = -ENOMEM;

+   if (strlen(dev->name) + strlen(create->tgtname) + 1 > DISK_NAME_LEN) {
+   pr_err("nvm: target name too long. %s:%s\n",
+   dev->name, create->tgtname);
+   return -EINVAL;
+   }
+
+   sprintf(tgtname, "%s%s", dev->name, create->tgtname);
+
tt = nvm_find_target_type(create->tgttype, 1);
if (!tt) {
pr_err("nvm: target type %s not found\n", create->tgttype);
@@ -53,7 +62,7 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
}

mutex_lock(&gn->lock);
-   t = gen_find_target(gn, create->tgtname);
+   t = gen_find_target(gn, tgtname);
if (t) {
pr_err("nvm: target name already exists.\n");
ret = -EINVAL;
@@ -73,7 +82,7 @@ static int gen_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
if (!tdisk)
goto err_queue;

-   sprintf(tdisk->disk_name, "%s", create->tgtname);
+   sprintf(tdisk->disk_name, "%s", tgtname);
tdisk->flags = GENHD_FL_EXT_DEVT;
tdisk->major = 0;
tdisk->first_minor = 0;



Hi Wenwei, what about the case where a target instance has multiple
devices associated?

I am okay with having the user choose a unique name for the target to
be exposed.

Re: [PATCH 1/2] libnvdimm, dax: autodetect support

2016-05-23 Thread Johannes Thumshirn

On Wed, May 18, 2016 at 04:44:02PM -0700, Dan Williams wrote:
> For autodetecting a previously established dax configuration we need the
> info block to indicate block-device vs device-dax mode, and we need to
> have the default namespace probe hand-off the configuration to the
> dax_pmem driver.
> 
> Signed-off-by: Dan Williams 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH 2/2] libnvdimm, dax: fix alignment validation

2016-05-23 Thread Johannes Thumshirn

On Wed, May 18, 2016 at 04:44:07PM -0700, Dan Williams wrote:
> Testing the dax-device autodetect support revealed a probe failure with
> the following result:
> 
> dax0.1: bad offset: 0x820 dax disabled
> 
> The original pfn-device implementation inferred the alignment from
> ilog2(offset), now that the alignment is explicit the is_power_of_2()
> needs replacing with a real sanity check against the recorded alignment.
> Otherwise the alignment check is useless in the implicit case and only
> the minimum size of the offset matters.
> 
> This self-consistency check is further validated by the probe path that
> will re-check that the offset is large enough to contain all the
> metadata required to enable the device.
> 
> Cc: 
> Signed-off-by: Dan Williams 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
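
To make the distinction in the commit message concrete, here is a small
userspace sketch contrasting a power-of-2 test on the offset with a check
against an explicitly recorded alignment. It is not the libnvdimm code; the
function names are hypothetical, and the offset and alignment values used
with it below are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Implicit scheme: the alignment is inferred from the offset itself. */
static bool offset_ok_implicit(uint64_t offset)
{
	return is_power_of_2(offset);
}

/* Explicit scheme: the offset must be a multiple of the recorded alignment. */
static bool offset_ok_explicit(uint64_t offset, uint64_t align)
{
	return is_power_of_2(align) && (offset % align) == 0;
}
```

For example, a hypothetical offset of 0x8200000 with a recorded 2 MiB
alignment (0x200000) is rejected by the implicit test because it is not a
power of 2, yet it is a valid multiple of the alignment.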

Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity

2016-05-23 Thread Peter Zijlstra

On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote:
> On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote:
> > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote:
> > 
> > > Wakees that were not migrated/normalized eat an unwanted min_vruntime,
> > > and likely take a size XXL latency hit.  Big box running master bled
> > > profusely under heavy load until I turned TTWU_QUEUE off.
> 
> May as well make it official and against master.today.  Fly or die
> little patchlet.
> 
> sched/fair: Move se->vruntime normalization state into struct sched_entity

Does this work?

---
 include/linux/sched.h |  1 +
 kernel/sched/core.c   | 18 +++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1b43b45a22b9..a2001e01b3df 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1534,6 +1534,7 @@ struct task_struct {
unsigned sched_reset_on_fork:1;
unsigned sched_contributes_to_load:1;
unsigned sched_migrated:1;
+   unsigned sched_remote_wakeup:1;
unsigned :0; /* force alignment to the next boundary */
 
/* unserialized, strictly 'current' */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c0784b1fc..7f2cae4620c7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void)
cookie = lockdep_pin_lock(&rq->lock);
 
while (llist) {
+   int wake_flags = 0;
+
p = llist_entry(llist, struct task_struct, wake_entry);
llist = llist_next(llist);
-   /*
-* See ttwu_queue(); we only call ttwu_queue_remote() when
-* its a x-cpu wakeup.
-*/
-   ttwu_do_activate(rq, p, WF_MIGRATED, cookie);
+
+   if (p->sched_remote_wakeup)
+   wake_flags = WF_MIGRATED;
+
+   ttwu_do_activate(rq, p, wake_flags, cookie);
}
 
lockdep_unpin_lock(&rq->lock, cookie);
@@ -1819,10 +1821,12 @@ void scheduler_ipi(void)
irq_exit();
 }
 
-static void ttwu_queue_remote(struct task_struct *p, int cpu)
+static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
 {
struct rq *rq = cpu_rq(cpu);
 
+   p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
+
if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
if (!set_nr_if_polling(rq->idle))
smp_send_reschedule(cpu);
@@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p, int cpu, 
int wake_flags)
 #if defined(CONFIG_SMP)
if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), 
cpu)) {
sched_clock_cpu(cpu); /* sync clocks x-cpu */
-   ttwu_queue_remote(p, cpu);
+   ttwu_queue_remote(p, cpu, wake_flags);
return;
}
 #endif

Re: [RFC PATCH 1/2] Input: rotary-encoder- Add support for absolute encoder

2016-05-23 Thread R, Vignesh

On 5/20/2016 10:04 PM, Dmitry Torokhov wrote:
> On Thu, May 19, 2016 at 02:34:00PM +0530, Vignesh R wrote:
>> There are rotary-encoders where GPIO lines reflect the actual position
>> of the rotary encoder dial. For example, if dial points to 9, then four
>> GPIO lines connected to the rotary encoder will read HLLH(1001b = 9).
>> Add support for such rotary-encoder.
>> The driver relies on rotary-encoder,absolute-encoder DT property to
>> detect such encoders.
>> Since, GPIO IRQs are not necessary to work with
>> such encoders, optional polling mode support is added using
>> input_poll_dev skeleton. This is can be used by enabling
>> CONFIG_INPUT_GPIO_ROTARY_ENCODER_POLL_MODE_SUPPORT.
> 
> Does this really belong to a rotary encoder and not a new driver that
> simply translates gpio-encoded value into ABS* event?
> 

Currently the rotary encoder driver only supports incremental/step-counting
rotary devices. However, the device on am335x-ice is an absolute encoder
which, IMO, is nevertheless a kind of rotary encoder. The only difference
is that there is no need to count steps: the absolute position value is
always available as the binary-encoded state of the connected GPIOs.
The hardware on am335x-ice is a mechanical rotary encoder switch
connected over 4 GPIOs. It is the same as the binary encoder described at [1]
(except that there are 4 GPIO lines), which led me to add support in
rotary-encoder.

[1]https://en.wikipedia.org/wiki/Rotary_encoder#Standard_binary_encoding
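
For illustration, decoding such an absolute encoder amounts to packing the
GPIO levels into an integer. The sketch below is a userspace example, not
the driver code: decode_position() is a hypothetical helper, and it assumes
the first array element is the most significant line, so that HLLH reads as
1001b = 9 as in the commit message.

```c
#include <stddef.h>

/*
 * Pack GPIO levels into a position value.
 * gpio_levels[0] is taken as the most significant bit (assumed ordering);
 * any non-zero level counts as high.
 */
static unsigned int decode_position(const int *gpio_levels, size_t ngpios)
{
	unsigned int pos = 0;
	size_t i;

	for (i = 0; i < ngpios; i++)
		pos = (pos << 1) | (gpio_levels[i] ? 1u : 0u);

	return pos;
}
```

Because the value is available at any time, it can be sampled either from
a GPIO interrupt or from a periodic poll, and no step counting state has to
be kept between reads.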

Re: [RFC PATCHv2] usb: USB Type-C Connector Class

2016-05-23 Thread Heikki Krogerus

On Fri, May 20, 2016 at 10:02:28AM -0700, Guenter Roeck wrote:
> On Fri, May 20, 2016 at 01:47:03PM +0300, Heikki Krogerus wrote:
> > On Thu, May 19, 2016 at 10:53:04AM -0700, Guenter Roeck wrote:
> > > Hello Heikki,
> > > 
> > > On Thu, May 19, 2016 at 03:44:54PM +0300, Heikki Krogerus wrote:
> > > > The purpose of this class is to provide unified interface for user
> > > > space to get the status and basic information about USB Type-C
> > > > Connectors in the system, control data role swapping, and when USB PD
> > > > is available, also power role swapping and Alternate Modes.
> > > > 
> > > > Signed-off-by: Heikki Krogerus 
> > > > ---
> > > >  drivers/usb/Kconfig |   2 +
> > > >  drivers/usb/Makefile|   2 +
> > > >  drivers/usb/type-c/Kconfig  |   7 +
> > > >  drivers/usb/type-c/Makefile |   1 +
> > > >  drivers/usb/type-c/typec.c  | 957 
> > > > 
> > > >  include/linux/usb/typec.h   | 230 +++
> > > >  6 files changed, 1199 insertions(+)
> > > >  create mode 100644 drivers/usb/type-c/Kconfig
> > > >  create mode 100644 drivers/usb/type-c/Makefile
> > > >  create mode 100644 drivers/usb/type-c/typec.c
> > > >  create mode 100644 include/linux/usb/typec.h
> > > > 
> > > > Hi,
> > > > 
> > > > Like I've told some of you guys, I'm trying to implement a bus for
> > > > the Alternate Modes, but I'm still nowhere near finished with that
> > > > one, so let's just get the class ready now. The altmode bus should in
> > > > any case not affect the userspace interface proposed in this patch.
> > > > 
> > > > As you can see, the Alternate Modes are handled completely differently
> > > > compared to the original proposal. Every Alternate Mode will have
> > > > their own device instance (which will be then later bound to an
> > > > Alternate Mode specific driver once we have the bus), but also every
> > > > partner, cable and cable plug will have their own device instances
> > > > representing them.
> > > > 
> > > > Another change is that the data role is now handled in two ways.
> > > > The current_data_role file will represent static mode of the port, and
> > > > it will use the names for the roles as they are defined in the spec:
> > > > DFP, UFP and DRP. This file should be used if the port needs to be
> > > > fixed to one specific role with DRP ports. So this approach will
> > > > replace the suggestions for "preferred" data role we had. The
> > > > current_usb_data_role will use values "host" and "device" and it will
> > > > be used for data role swapping when already connected.
> > > > 
> > > 
> > > What I am missing completely is a means to handle role and alternate mode
> > > changes triggered by the partner. The need for those should be obvious,
> > > unless I am really missing something (just consider two devices supporting
> > > this code connected to each other).
> > 
> > We are missing the notifications that are needed in these cases. But I
> > don't see much more we can do about those cases. We can not put any
> > policies in place at this level, because we have to be able to support
> > also things like USB PD and Type-C controllers that take care of all
> > that, leaving us to not be able to do anything else but to pass the
> > information forward. So the framework at this level has to be
> > "stupid", and if more infrastructure is needed, it has to be
> > introduced in another layer.
> > 
> Ok.
> 
> > > Also, I am not sure where the policy engine is supposed to reside.
> > > I understand that some policy changes (eg unsolicited requests to switch
> > > roles) can be triggered from user space. However, role change requests
> > > triggered from the partner need to be evaluated quickly (typically within
> > > 15 ms), so user space can not get involved. Maybe it would help to have
> > > some text describing where the policy engine is expected to reside and
> > > how it is involved in the decision making process. This includes the
> > > initial decision making process, when it needs to be decided if role
> > > changes should be requested or if one or multiple alternate modes should
> > > be entered after the initial connection has been established.
> > 
> > Well, yes we need to document these things, but you are now coupling
> > this framework with USB PD and we really should not do that.
> > 
> Not really. I was trying to understand where you would expect the policy
> engine to reside, which you answered above.

Ah OK, got it. Sorry.

> > The policy engine, and the whole USB PD stack, belongs inside the
> > kernel, and it will be completely separated from this framework. This
> > framework can not have any dependencies on the future USB PD stack.
> > This is not only because of the USB PD/Type-C controllers which handle
> > the policy engine on their own and only allow "unsolicited" requests
> > like "swap role" and "enter/exit mode", but also because this
> > framework must work smoothly on systems that don'

Re: [PATCH V2] vfio: platform: support No-IOMMU mode

2016-05-23 Thread Peng Fan

Hi Eric,

On Mon, May 23, 2016 at 10:59:24AM +0200, Eric Auger wrote:
>Hi Peng,
>On 05/23/2016 10:14 AM, Peng Fan wrote:
>> The vfio No-IOMMU mode was supported by this
>> 'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
>> but it only support vfio-pci.
>> 
>> Using vfio_iommu_group_get/put, but not iommu_group_get/put,
>> the platform devices can be exposed to userspace with
>> CONFIG_VFIO_NOIOMMU and the "enable_unsafe_noiommu_mode"
>> option enabled.
>> 
>> From 'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
>> "This should make it very clear that this mode is not safe.
>> Additionally, CAP_SYS_RAWIO privileges are necessary to work
>> with groups and containers using this mode.  Groups making
>> use of this support are named /dev/vfio/noiommu-$GROUP and
>> can only make use of the special VFIO_NOIOMMU_IOMMU for the
>> container.  Use of this mode, specifically binding a device
>> without a native IOMMU group to a VFIO bus driver will taint
>> the kernel and should therefore not be considered supported.",
>> 
>> Actually, for vfio-platform No-IOMMU mode, the userspace can
>> not do DMA, because the ioctl API of noiommu container only
>> supports VFIO_CHECK_EXTENSION and VFIO_IOMMU_MAP_DMA is not
>> supported.
>I did not play with no-iommu mode yet but I am surprised by this last
>sentence. Without IOMMU the VFIO_IOMMU_MAP_DMA ioctl is not relevant
>since you do not need to and cannot map anything but does this really
>mean the device cannot perform DMA transfers towards physical memory, if
>programmed to do so; I don't think so?

Sorry, my commit log may be misleading. I mean that we cannot use mmap +
VFIO_IOMMU_MAP_DMA for noiommu.

The platform device can still do DMA if we program the DMA address for the
device correctly, for example with a reserved memory region.

I can discard the last sentence of the commit log in V3.

Thanks,
Peng.
>
>Best Regards
>
>Eric
>> 
>> Signed-off-by: Peng Fan 
>> Cc: Eric Auger 
>> Cc: Baptiste Reynal 
>> Cc: Alex Williamson 
>> ---
>> 
>> V2:
>>  Rename subject to support No-IOMMU
>>  Add more commit log.
>> 
>>  I wrote a simple program following this
>>  
>> https://github.com/virtualopensystems/vfio-host-test/blob/master/src_test/vfio_device_test.c
>>  ,no dma support. The device's register can be
>>  accessed in userspace using command './vfio_dev_test 30b6.usdhc 0 1 
>> platform'
>> 
>>  drivers/vfio/platform/vfio_platform_common.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/vfio/platform/vfio_platform_common.c 
>> b/drivers/vfio/platform/vfio_platform_common.c
>> index e65b142..993b2f9 100644
>> --- a/drivers/vfio/platform/vfio_platform_common.c
>> +++ b/drivers/vfio/platform/vfio_platform_common.c
>> @@ -561,7 +561,7 @@ int vfio_platform_probe_common(struct 
>> vfio_platform_device *vdev,
>>  
>>  vdev->device = dev;
>>  
>> -group = iommu_group_get(dev);
>> +group = vfio_iommu_group_get(dev);
>>  if (!group) {
>>  pr_err("VFIO: No IOMMU group for device %s\n", vdev->name);
>>  return -EINVAL;
>> @@ -569,7 +569,7 @@ int vfio_platform_probe_common(struct 
>> vfio_platform_device *vdev,
>>  
>>  ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
>>  if (ret) {
>> -iommu_group_put(group);
>> +vfio_iommu_group_put(group, dev);
>>  return ret;
>>  }
>>  
>> @@ -589,7 +589,7 @@ struct vfio_platform_device 
>> *vfio_platform_remove_common(struct device *dev)
>>  
>>  if (vdev) {
>>  vfio_platform_put_reset(vdev);
>> -iommu_group_put(dev->iommu_group);
>> +vfio_iommu_group_put(dev->iommu_group, dev);
>>  }
>>  
>>  return vdev;
>> 
>

[PATCH v2 RESEND] mfd: arizona: Check if AOD interrupts are pending before dispatching

2016-05-23 Thread Richard Fitzgerald

Previously the arizona_irq_thread implementation would call
handle_nested_irq() to handle AOD interrupts without checking if any
were actually pending. The kernel will see these as spurious IRQs and
will eventually disable the IRQ.

This patch ensures we only launch the nested handler if there are AOD
interrupts pending in the codec.

Signed-off-by: Simon Trimmer 
Signed-off-by: Richard Fitzgerald 
---
 drivers/mfd/arizona-irq.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/mfd/arizona-irq.c b/drivers/mfd/arizona-irq.c
index edeb495..5e18d3c 100644
--- a/drivers/mfd/arizona-irq.c
+++ b/drivers/mfd/arizona-irq.c
@@ -109,8 +109,20 @@ static irqreturn_t arizona_irq_thread(int irq, void *data)
do {
poll = false;
 
-   if (arizona->aod_irq_chip)
-   handle_nested_irq(irq_find_mapping(arizona->virq, 0));
+   if (arizona->aod_irq_chip) {
+   /*
+* Check the AOD status register to determine whether
+* the nested IRQ handler should be called.
+*/
+   ret = regmap_read(arizona->regmap,
+ ARIZONA_AOD_IRQ1, &val);
+   if (ret)
+   dev_warn(arizona->dev,
+   "Failed to read AOD IRQ1 %d\n", ret);
+   else if (val)
+   handle_nested_irq(
+   irq_find_mapping(arizona->virq, 0));
+   }
 
/*
 * Check if one of the main interrupts is asserted and only
-- 
1.9.1
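
The decision the patch adds can be sketched outside the driver as follows.
This is only an illustration under stated assumptions: the function name
should_dispatch_nested() is hypothetical, and the two arguments stand in for
the return code of the status read and the value of the status register.

```c
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the check in the patch: dispatch the
 * nested handler only if the status read succeeded (read_ret == 0)
 * and at least one status bit is set.
 */
static bool should_dispatch_nested(int read_ret, unsigned int status)
{
	if (read_ret)
		return false;	/* read failed: warn and skip dispatch */

	return status != 0;	/* nothing pending would be a spurious IRQ */
}
```

Skipping the dispatch when nothing is pending is what keeps the core from
counting spurious interrupts and eventually disabling the IRQ.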

Re: [RFC PATCH] Increase in idle power with schedutil

2016-05-23 Thread Peter Zijlstra

On Sun, May 22, 2016 at 01:42:52PM -0700, Steve Muckle wrote:

> > So does it actually matter what the frequency is when you idle? Isn't
> > the whole thing clock gated anyway?
> > 
> > Because this seems to generate contradictory requirements, on the one
> > hand we want to stay idle as long as possible while on the other hand
> > you seem to want to clock down while idle, which requires not being
> > idle.
> > 
> > If it matters; should not your idle state muck explicitly set/restore
> > frequency?
> 
> AFAIK this is very platform dependent. Some will waste more power than
> others when a CPU idles above fmin due to things like resource (bus
> bandwidth, shared cache freq etc) voting.

Oh agreed, completely platform dependent. 'Luckily' all this cpuidle is
already very platform dependent.

> It is also true that there is power spent going to fmin (and then
> perhaps restoring the frequency when idle ends) which will be in part a
> function of how slow the frequency change operation is on that platform.

Agreed.

> I think Daniel Lezcano (added) was exploring the idea of having cpuidle
> drivers take the expected idle duration and potentially communicate to
> cpufreq to reduce the frequency depending on a platform-specific
> cost/benefit analysis.

Right; that's along the lines I was thinking. If the idle guesstimate and
the idle QoS both allow (ie. it wins on power and doesn't violate
wake-up latency) muck with DVFS on the idle path.
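
A minimal sketch of such a cost/benefit check, purely for illustration. The
structure, its fields and the function name are assumptions made for this
example; none of them exist in cpuidle or cpufreq, and a real platform
driver would use its own numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* All fields are illustrative, platform-specific inputs. */
struct idle_freq_params {
	uint64_t predicted_idle_us;	/* idle duration estimate */
	uint64_t latency_limit_us;	/* wake-up latency allowed by QoS */
	uint64_t freq_restore_us;	/* time to restore the frequency */
	uint64_t idle_power_saved_uw;	/* idle power saved at fmin (uW) */
	uint64_t transition_cost_nj;	/* energy of down + up switch (nJ) */
};

static bool should_drop_freq_on_idle(const struct idle_freq_params *p)
{
	/* uW * us = pJ; divide by 1000 to get nJ. */
	uint64_t saved_nj = p->idle_power_saved_uw * p->predicted_idle_us / 1000;

	/* Restoring the frequency must not violate wake-up latency. */
	if (p->freq_restore_us > p->latency_limit_us)
		return false;

	/* Only worth it if the saving exceeds the switching cost. */
	return saved_nj > p->transition_cost_nj;
}
```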

Re: [RFC PATCH] Increase in idle power with schedutil

2016-05-23 Thread Peter Zijlstra

On Mon, May 23, 2016 at 10:00:04AM +0100, Lorenzo Pieralisi wrote:
> It is also related to static leakage power that depends on the operating
> voltage (ie higher operating frequencies require higher voltage) so in a
> way scaling frequency before going idle may not be effective if voltage
> does not scale too in turn.

Sure, but the platform drivers 'know' all this and can make the right
decision.

Re: [PATCH 1/5] iommu/rockchip: fix devm_request_irq and devm_free_irq parameter

2016-05-23 Thread Heiko Stuebner

Am Montag, 23. Mai 2016, 09:37:15 schrieb Shunqian Zheng:
> From: Simon 

Generally a "firstname surname " is expected, so a first name alone is not
really enough.

> 
> When rk_iommu_attach_device or rk_iommu_detach_device is called, the
> second parameter "dev" represents the device that owns the iommu, so it
> is not reasonable to use "dev" as devm_request_irq's first parameter. To
> avoid potential errors, we must use the iommu device itself, "iommu->dev",
> instead; the same applies to devm_free_irq.
> 
> Signed-off-by: Simon 

same here, and the person sending in the patch also needs a signed-off-by 
line, so that should look like

Signed-off-by: Simon 
Signed-off-by: Shunqian Zheng 

The same applies of course for all other affected patches of this series.


After looking at the iommu code, I think the change itself looks sane, so
Reviewed-by: Heiko Stuebner 

although I wonder what we need the devm_* for if we request and free the 
irqs manually all the time anyway?


Heiko

> ---
>  drivers/iommu/rockchip-iommu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/rockchip-iommu.c
> b/drivers/iommu/rockchip-iommu.c index c7d6156..ec0ce62 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -807,7 +807,7 @@ static int rk_iommu_attach_device(struct iommu_domain
> *domain,
> 
>   iommu->domain = domain;
> 
> - ret = devm_request_irq(dev, iommu->irq, rk_iommu_irq,
> + ret = devm_request_irq(iommu->dev, iommu->irq, rk_iommu_irq,
>  IRQF_SHARED, dev_name(dev), iommu);
>   if (ret)
>   return ret;
> @@ -860,7 +860,7 @@ static void rk_iommu_detach_device(struct iommu_domain
> *domain, }
>   rk_iommu_disable_stall(iommu);
> 
> - devm_free_irq(dev, iommu->irq, iommu);
> + devm_free_irq(iommu->dev, iommu->irq, iommu);
> 
>   iommu->domain = NULL;

Re: [PATCH] doc: self-protection: provide initial details

2016-05-23 Thread James Morris

On Mon, 16 May 2016, Kees Cook wrote:

> + Segregation of kernel memory from userspace memory
> +
> +The kernel must never execute userspace memory. The kernel must also never
> +access userspace memory without explicit expectation to do so. These
> +rules can be enforced either by support of hardware-based restrictions
> +(x86's SMEP/SMAP, ARM's PXN/PAN) or via emulation (ARM's Memory Domains).
> +By blocking userspace memory in this way, execution and data parsing
> +cannot be passed to trivially-controlled userspace memory, forcing
> +attacks to operate entirely in kernel memory.

One caveat is that there may be ways to bypass these protections, e.g. via 
aliased (direct mapped) memory.

I'd also note that some platforms have separate kernel and user address
spaces, like Sparc.


> +To protect against even privileged users, systems may need to either
> +disable module loading entirely (e.g. monolithic kernel builds or
> +modules_disabled sysctl), or provide signed modules (e.g.
> +CONFIG_MODULE_SIG_FORCE, or dm-crypt with LoadPin), to keep from having
> +root load arbitrary kernel code via the module loader interface.

Or utilize an appropriate MAC policy.



-- 
James Morris

Re: [PATCH 1/6] statx: Add a system call to make enhanced file info available

2016-05-23 Thread David Howells

Christoph Hellwig  wrote:

> Honestly I think this really matters on the amount of 'emulation' we
> need - if it's just adding a new flag that can be trivially generated
> in the syscall stub in userland that's probably fine, but if we have
> actually differing semantics (like the stat weak attributes) I'd rather
> have a properly documented syscall.  If we otherwise need to rewrite
> whole structures I'd much rather do that in kernel space.

I very much agree.

> And to get back to stat: it would be really useful to coordinate the
> new one with glibc so that we don't end up with two different stat
> structures again like we do for a lot of platforms at the moment.

I've tried reaching out to them and others, but no one responded.

David

Re: [PATCH 16/54] MAINTAINERS: Add file patterns for drm device tree bindings

2016-05-23 Thread Emil Velikov

On 22 May 2016 at 10:05, Geert Uytterhoeven  wrote:
> Submitters of device tree binding documentation may forget to CC
> the subsystem maintainer if this is missing.
>
> Signed-off-by: Geert Uytterhoeven 
> Cc: David Airlie 
> Cc: dri-de...@lists.freedesktop.org
> ---
> Please apply this patch directly if you want to be involved in device
> tree binding documentation for your subsystem.
> ---
>  MAINTAINERS | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c79b99dd3a0bf22d..75138c09dd603093 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3868,6 +3868,9 @@ T:git 
> git://people.freedesktop.org/~airlied/linux
>  S: Maintained
>  F: drivers/gpu/drm/
>  F: drivers/gpu/vga/
> +F: Documentation/devicetree/bindings/display/
Contains a mixed bag of fb (drivers/video) and drm ones. Perhaps one
could/should move the former to a fb/legacy/other subfolder ?

> +F: Documentation/devicetree/bindings/gpu/
nvidia,gk20a.txt
samsung-g2d.txt
samsung-rotator.txt

These three should be listed alongside the respective drivers and/or
moved to display/{tegra,exynos} ?

> +F: Documentation/devicetree/bindings/video/
bridge/anx7814.txt move this to display ?

Just some ideas, feel free to proceed as you think it's better.
-Emil

Re: [PATCH] seqlock: fix raw_read_seqcount_latch()

2016-05-23 Thread Peter Zijlstra

On Sun, May 22, 2016 at 09:50:40PM +0300, Alexey Dobriyan wrote:
> On Sun, May 22, 2016 at 12:48:27PM +0200, Peter Zijlstra wrote:
> > On Sat, May 21, 2016 at 11:14:49PM +0300, Alexey Dobriyan wrote:
> > > lockless_dereference() is supposed to take pointer not integer.
> > 
> > Urgh :/
> > 
> > Is there any way we can make lockless_dereference() issue a warning if
> > we don't feed it a pointer?
> > 
> > Would something like so work? All pointer types should silently cast to
> > void * while integer (and others) should refuse to.
> 
> This works (and spammy enough in case of seqlock, which is good)
> but not for "unsigned long":
> 
>   include/linux/percpu-refcount.h:146:36: warning: initialization makes 
> pointer from integer without a cast [-Wint-conversion]
> percpu_ptr = lockless_dereference(ref->percpu_count_ptr);

TJ; would you prefer casting or not using lockless_dereference() here?

> 
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -544,6 +544,7 @@ static __always_inline void __write_once_size(volatile 
> > void *p, void *res, int s
> >   */
> >  #define lockless_dereference(p) \
> >  ({ \
> > +   __maybe_unused void * _p2 = p; \
> > typeof(p) _p1 = READ_ONCE(p); \
> > smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
> > (_p1); \
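
The type check proposed in the quoted hunk can be tried in user space with a
simplified analogue. This is a sketch only: the macro name deref_ptr_only()
is hypothetical, and READ_ONCE() and the dependency barrier of the real
lockless_dereference() are deliberately left out. It relies on the GNU
statement-expression and typeof extensions, as the kernel macro does.

```c
/*
 * Simplified analogue of the proposed check: initialising a void
 * pointer from (p) is silent for any pointer type, but the compiler
 * diagnoses it when (p) is an integer such as unsigned long.
 */
#define deref_ptr_only(p)						\
({									\
	__attribute__((unused)) const volatile void *_chk = (p);	\
	__typeof__(p) _val = (p);					\
	_val;								\
})
```

Passing an unsigned long to this macro triggers the same kind of int-conversion
diagnostic as the percpu-refcount case quoted above, which is why that
caller needs either a cast or a different accessor.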

Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity

2016-05-23 Thread Mike Galbraith

On Mon, 2016-05-23 at 11:19 +0200, Peter Zijlstra wrote:
> On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote:
> > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote:
> > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote:
> > > 
> > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime,
> > > > and likely take a size XXL latency hit.  Big box running master bled
> > > > profusely under heavy load until I turned TTWU_QUEUE off.
> > 
> > May as well make it official and against master.today.  Fly or die
> > little patchlet.
> > 
> > sched/fair: Move se->vruntime normalization state into struct sched_entity
> 
> Does this work?

Yup, bugs--.  Kinda funny, I considered ~this way first, but thought
you'd not like that approach.. dang, got it back-assward ;-)

> ---
>  include/linux/sched.h |  1 +
>  kernel/sched/core.c   | 18 +++---
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 1b43b45a22b9..a2001e01b3df 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1534,6 +1534,7 @@ struct task_struct {
>   unsigned sched_reset_on_fork:1;
>   unsigned sched_contributes_to_load:1;
>   unsigned sched_migrated:1;
> + unsigned sched_remote_wakeup:1;
>   unsigned :0; /* force alignment to the next boundary */
>  
>   /* unserialized, strictly 'current' */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 404c0784b1fc..7f2cae4620c7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void)
>   cookie = lockdep_pin_lock(&rq->lock);
>  
>   while (llist) {
> + int wake_flags = 0;
> +
>   p = llist_entry(llist, struct task_struct,
> wake_entry);
>   llist = llist_next(llist);
> - /*
> -  * See ttwu_queue(); we only call
> ttwu_queue_remote() when
> -  * its a x-cpu wakeup.
> -  */
> - ttwu_do_activate(rq, p, WF_MIGRATED, cookie);
> +
> + if (p->sched_remote_wakeup)
> + wake_flags = WF_MIGRATED;
> +
> + ttwu_do_activate(rq, p, wake_flags, cookie);
>   }
>  
>   lockdep_unpin_lock(&rq->lock, cookie);
> @@ -1819,10 +1821,12 @@ void scheduler_ipi(void)
>   irq_exit();
>  }
>  
> -static void ttwu_queue_remote(struct task_struct *p, int cpu)
> +static void ttwu_queue_remote(struct task_struct *p, int cpu, int
> wake_flags)
>  {
>   struct rq *rq = cpu_rq(cpu);
>  
> + p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
> +
>   if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
>   if (!set_nr_if_polling(rq->idle))
>   smp_send_reschedule(cpu);
> @@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p,
> int cpu, int wake_flags)
>  #if defined(CONFIG_SMP)
>   if (sched_feat(TTWU_QUEUE) &&
> !cpus_share_cache(smp_processor_id(), cpu)) {
>   sched_clock_cpu(cpu); /* sync clocks x-cpu */
> - ttwu_queue_remote(p, cpu);
> + ttwu_queue_remote(p, cpu, wake_flags);
>   return;
>   }
>  #endif
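
The idea in the quoted patch, recording at queue time whether the wakeup was
a migration and rebuilding the flags when the list is drained, can be
sketched in isolation. Everything below is illustrative: struct demo_task,
WF_MIGRATED_DEMO and both helpers are made-up names, not scheduler code, and
the lock-free llist is replaced by a plain singly linked list.

```c
#include <stddef.h>

#define WF_MIGRATED_DEMO 0x4	/* illustrative flag value */

struct demo_task {
	unsigned int remote_wakeup:1;	/* set if queued after a migration */
	struct demo_task *next;
};

/* Queue side: remember whether this wakeup involved a migration. */
static void demo_queue(struct demo_task **list, struct demo_task *t,
		       int wake_flags)
{
	t->remote_wakeup = !!(wake_flags & WF_MIGRATED_DEMO);
	t->next = *list;
	*list = t;
}

/* Drain side: rebuild the flags per task instead of assuming migration. */
static int demo_pending_flags(const struct demo_task *t)
{
	return t->remote_wakeup ? WF_MIGRATED_DEMO : 0;
}
```

The drain side no longer passes the migrated flag unconditionally, so tasks
that were queued without migrating are not treated as migrated.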

Re: [RFC v2 3/5] drm/mediatek: add *driver_data for different hardware settings

2016-05-23 Thread CK Hu

Hi, YT:

One comment below.

On Fri, 2016-05-20 at 23:05 +0800, yt.s...@mediatek.com wrote:
> From: YT Shen 
> 
> There are some hardware settings changed, between MT8173 & MT2701:
> DISP_OVL address offset changed, color format definition changed.
> DISP_RDMA fifo size changed.
> DISP_COLOR offset changed.
> 
> Signed-off-by: YT Shen 
> ---
> +
> +static inline struct mtk_ddp_comp_driver_data *mtk_ovl_get_driver_data(
> + struct platform_device *pdev)
> +{
> + const struct of_device_id *of_id =
> + of_match_device(mtk_disp_ovl_driver_dt_match, &pdev->dev);
> +
> + return (struct mtk_ddp_comp_driver_data *)of_id->data;
> +}
> +
> +static inline struct mtk_ddp_comp_driver_data *mtk_rdma_get_driver_data(
> + struct platform_device *pdev)
> +{
> + const struct of_device_id *of_id =
> + of_match_device(mtk_disp_rdma_driver_dt_match, &pdev->dev);
> +
> + return (struct mtk_ddp_comp_driver_data *)of_id->data;
> +}
> +
> +static inline struct mtk_ddp_comp_driver_data *mtk_color_get_driver_data(
> + struct device_node *node)
> +{
> + const struct of_device_id *of_id =
> + of_match_node(mtk_disp_color_driver_dt_match, node);
> +
> + return (struct mtk_ddp_comp_driver_data *)of_id->data;
> +}
> + 

These three functions look the same apart from one parameter:
mtk_disp_ovl_driver_dt_match, mtk_disp_rdma_driver_dt_match, and
mtk_disp_color_driver_dt_match. Please merge them to avoid duplicated
code.

Regards,
CK
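
One possible shape for such a merge, shown as a user-space sketch rather than
driver code. struct match_entry is a made-up stand-in for the kernel's match
table entry, and lookup_driver_data() is a hypothetical name; the point is
only that the table becomes a parameter of a single helper.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for a match table entry. */
struct match_entry {
	const char *compatible;
	const void *data;
};

/*
 * One lookup shared by every component. The table is terminated by an
 * entry whose compatible string is NULL.
 */
static const void *lookup_driver_data(const struct match_entry *table,
				      const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table->data;

	return NULL;	/* no match: callers must check */
}
```

Unlike the quoted helpers, this sketch returns NULL when nothing matches
instead of dereferencing the match result unconditionally.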

RE: livepatch: change to a per-task consistency model

2016-05-23 Thread David Laight

From: Jiri Kosina
> Sent: 18 May 2016 21:23
> On Wed, 18 May 2016, Josh Poimboeuf wrote:
> 
> > Yeah, I think this situation -- a task sleeping on an affected function
> > in uninterruptible state for a long period of time -- would be
> > exceedingly rare and not something we need to worry about for now.
> 
> Plus in case task'd be in TASK_UNINTERRUPTIBLE for more than 120s, hung
> task detector would trigger anyway.

Related, please can we have a flag for the sleep and/or process so that
an uninterruptible sleep doesn't trigger the 'hung task' detector
and also stops the process counting towards the 'load average'.

In particular some kernel threads are not signalable, and do not
want to be woken by signals (they exit on a specific request).

David

[PATCH V3] vfio: platform: support No-IOMMU mode

2016-05-23 Thread Peng Fan

The vfio No-IOMMU mode was supported by this
'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
but it only supports vfio-pci.

Using vfio_iommu_group_get/put, but not iommu_group_get/put,
the platform devices can be exposed to userspace with
CONFIG_VFIO_NOIOMMU and the "enable_unsafe_noiommu_mode"
option enabled.

From 'commit 03a76b60f8ba2797 ("vfio: Include No-IOMMU mode")',
"This should make it very clear that this mode is not safe.
Additionally, CAP_SYS_RAWIO privileges are necessary to work
with groups and containers using this mode.  Groups making
use of this support are named /dev/vfio/noiommu-$GROUP and
can only make use of the special VFIO_NOIOMMU_IOMMU for the
container.  Use of this mode, specifically binding a device
without a native IOMMU group to a VFIO bus driver will taint
the kernel and should therefore not be considered supported."

Signed-off-by: Peng Fan 
Cc: Eric Auger 
Cc: Baptiste Reynal 
Cc: Alex Williamson 
---

V3:
 The platform device can be programmed to do DMA without
 carrying out mmap + VFIO_IOMMU_MAP_DMA, which are not
 supported by noiommu. So drop the last sentence of the commit log
 in V2, which is misleading.

V2:
 Rename subject to support No-IOMMU
 Add more commit log.
 I wrote a simple program following
 https://github.com/virtualopensystems/vfio-host-test/blob/master/src_test/vfio_device_test.c
 with no DMA support. The device's registers can be accessed in
 userspace using the command './vfio_dev_test 30b6.usdhc 0 1 platform'

 drivers/vfio/platform/vfio_platform_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c 
b/drivers/vfio/platform/vfio_platform_common.c
index e65b142..993b2f9 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -561,7 +561,7 @@ int vfio_platform_probe_common(struct vfio_platform_device 
*vdev,
 
vdev->device = dev;
 
-   group = iommu_group_get(dev);
+   group = vfio_iommu_group_get(dev);
if (!group) {
pr_err("VFIO: No IOMMU group for device %s\n", vdev->name);
return -EINVAL;
@@ -569,7 +569,7 @@ int vfio_platform_probe_common(struct vfio_platform_device 
*vdev,
 
ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
if (ret) {
-   iommu_group_put(group);
+   vfio_iommu_group_put(group, dev);
return ret;
}
 
@@ -589,7 +589,7 @@ struct vfio_platform_device 
*vfio_platform_remove_common(struct device *dev)
 
if (vdev) {
vfio_platform_put_reset(vdev);
-   iommu_group_put(dev->iommu_group);
+   vfio_iommu_group_put(dev->iommu_group, dev);
}
 
return vdev;
-- 
2.6.2

Re: [PATCH 3.12 00/76] 3.12.60-stable review

2016-05-23 Thread Jiri Slaby

On 05/19/2016, 03:52 PM, Guenter Roeck wrote:
> On 05/19/2016 02:08 AM, Jiri Slaby wrote:
>> This is the start of the stable review cycle for the 3.12.60 release.
>> There are 76 patches in this series, all will be posted as a response
>> to this one.  If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Mon May 23 11:07:53 CEST 2016.
>> Anything received after that time might be too late.
>>
> 
> Build results:
> total: 127 pass: 127 fail: 0
> Qemu test results:
> total: 85 pass: 85 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Thanks!

-- 
js
suse labs

Re: [PATCH 1/1 RFC] net/phy: Add Lantiq PHY driver

2016-05-23 Thread Alexander Stein

Hi Hauke,

On Monday 23 May 2016 09:12:54, Mehrtens, Hauke wrote:
> > On Thursday 19 May 2016 12:03:10, Mathias Kresin wrote:
> > > 2016-05-19 9:03 GMT+02:00 John Crispin :
> > > > On 19/05/2016 08:57, Alexander Stein wrote:
> > > >> Thanks for the link, I wasn't aware of that patch. I like it in
> > > >> general, but there are some things I'd like to get addressed first:
> > > >> * vr9_gphy_of_reg_init() writes unconditionally to led3h and led3l
> > > >>   even on PEF 7071, which does not have this register at all
> > > > 
> > > > we use this driver mainly on the 11g and 22f version. mathias
> > > > recently added the led3 handling.
> > > > 
> > > > @Mathias, can you have a look at this and fix it inside the lede tree
> > > > ?
> > > 
> > > Well, I haven't added the led3 handling, I've only changed the initial
> > > value (function) of led3.
> > > 
> > > Maybe it's cleaner to not use a default value for the led function and
> > > completely rely on the device tree bindings. But by adjusting the
> > > initial values, I had to change only the led function of one board in
> > > the openwrt xrx200 subtarget instead of touching all dts files.
> > 
> > I think setting default values is good.
> 
> The registers are set to some reset values after the chip is coming out of
> reset, but we should set  them all to the same value, Mathias said that all
> except for one board he knows are using only one LED per port, but they are
> often using different LED pins, I will change my patch.

One LED per port? I would think of using one RJ45 socket per port, which
usually has 2 LEDs.

> > > I know that the LTQ Datasheet for the PEF 7071 Version 1.5 mentions
> > > > the led3 control register albeit there is no pin for a fourth led. So I
> > > guess it's safe to write to the led3 register even for the PEF 7071.
> > 
> > Mh, my PEF 7071 User Manual (Version 2.0, 2012-10-17) doesn't mention
> > LED3x registers. There is LED3DA and LED3EN in PHY_LED but was removed in
> > 1.6 manual.
> 
> LED3x is only available in PEF 7072 which is a different package with more
> pins for the LED3 and some other interfaces.
> > I think, some flag if the PHY supports LED3 and depend on that is just
> > fine.
> I do not know how to distinguish between PEF 7071 and PEF 7072.

I expected that PEF 7072 would have a different PHY ID, but apparently this is 
not the case, though I don't have a datasheet for 7072. Is there really no way 
to distinguish those two?

Alexander

Re: [RFC PATCHv2] usb: USB Type-C Connector Class

2016-05-23 Thread Heikki Krogerus

Hi Oliver,

On Fri, May 20, 2016 at 04:19:59PM +0200, Oliver Neukum wrote:
> On Thu, 2016-05-19 at 15:44 +0300, Heikki Krogerus wrote:
> > Like I've told some of you guys, I'm trying to implement a bus for
> > the Alternate Modes, but I'm still nowhere near finished with that
> > one, so let's just get the class ready now. The altmode bus should in
> > any case not affect the userspace interface proposed in this patch.
> 
> Is this strictly divorced from USB PD?

The bus cannot be tied completely to the USB PD stack we will have in the
kernel, or there is no chance of using it with things like UCSI. It's
going to be difficult to achieve that in any case, as we simply won't be
able to send and receive the VDMs with things like UCSI, but let's see.

> How do you trigger a cable reset or a USB PD reset?

There needs to be an API, but I'm sure that's not going to be a
problem. The bus and the altmode specific drivers will reside inside
kernel.

But I'm getting the sense that you are thinking about having some
responsibility of USB PD in userspace. Please correct me if I'm wrong.
I don't think it will be possible. I think the role of userspace can
only be the source for high level requests via this interface, like
enter/exit mode and swap role, and receiving the status and details of
the ports, but any knowledge about the requirements regarding those
steps belongs to the kernel. This includes also the knowledge about
stuff like mode dependencies, for example if cable plug has to be in a
certain mode in order for the partner to be able to enter some
specific mode, etc.

Thanks.

-- 
heikki

[block] e0d3dd5854: INFO: suspicious RCU usage. ]

2016-05-23 Thread kernel test robot

 command line is:

qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel 
/pkg/linux/x86_64-nfsroot/gcc-6/e0d3dd5854af35d080411e2c51308f58f72ed18b/vmlinuz-4.6.0-rc3-00122-ge0d3dd5
 -append 'root=/dev/ram0 user=lkp 
job=/lkp/scheduled/vm-kbuild-1G-8/bisect_boot-1-debian-x86_64-2015-02-07.cgz-x86_64-nfsroot-e0d3dd5854af35d080411e2c51308f58f72ed18b-20160523-89777-1kpfl6n-0.yaml
 ARCH=x86_64 kconfig=x86_64-nfsroot branch=linux-devel/devel-hourly-2016051922 
commit=e0d3dd5854af35d080411e2c51308f58f72ed18b 
BOOT_IMAGE=/pkg/linux/x86_64-nfsroot/gcc-6/e0d3dd5854af35d080411e2c51308f58f72ed18b/vmlinuz-4.6.0-rc3-00122-ge0d3dd5
 max_uptime=600 
RESULT_ROOT=/result/boot/1/vm-kbuild-1G/debian-x86_64-2015-02-07.cgz/x86_64-nfsroot/gcc-6/e0d3dd5854af35d080411e2c51308f58f72ed18b/0
 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug 
sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw 
ip=vm-kbuild-1G-8::dhcp'  -initrd /fs/sdc1/initrd-vm-kbuild-1G-8 -m 1024 
-smp 2 -device e1000,netdev=net0 -netdev user,id=net0,hostfwd=tcp::23007-:22 
-boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -device 
virtio-scsi-pci,id=scsi0 -drive 
file=/fs/sdc1/disk0-vm-kbuild-1G-8,if=none,id=hd0,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd0,scsi-id=1,lun=0 -drive 
file=/fs/sdc1/disk1-vm-kbuild-1G-8,if=none,id=hd1,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd1,scsi-id=1,lun=1 -drive 
file=/fs/sdc1/disk2-vm-kbuild-1G-8,if=none,id=hd2,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd2,scsi-id=1,lun=2 -drive 
file=/fs/sdc1/disk3-vm-kbuild-1G-8,if=none,id=hd3,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd3,scsi-id=1,lun=3 -drive 
file=/fs/sdc1/disk4-vm-kbuild-1G-8,if=none,id=hd4,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd4,scsi-id=1,lun=4 -pidfile 
/dev/shm/kboot/pid-vm-kbuild-1G-8 -serial 
file:/dev/shm/kboot/serial-vm-kbuild-1G-8 -daemonize -display none -monitor 
null 





Thanks,
Kernel Test Robot
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.6.0-rc3 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPA

[PATCH 3/4 v7] ASoC: dwc: Add PIO PCM extension

2016-05-23 Thread Jose Abreu

A PCM extension was added to the I2S driver so that audio
samples can be transferred in PIO mode.

The PCM supports two channels @ 16 or 32 bits with rates
32k, 44.1k and 48k.

Although the mainline I2S driver uses the ALSA DMA engine, the
I2S controller can be built without DMA support. This extension
was added to cover that case.

The driver selects between the DMA engine and PIO mode based on
whether the interrupt parameters are declared in the DT and on
the Kconfig option.
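
The selection rule described above can be modeled in user space. The
following is a minimal sketch, not the driver code: every model_* name
is hypothetical, a negative irq stands for "no interrupts property in
the DT node", and pcm_ext_built stands for the Kconfig option. The
message does not say what happens when an interrupt is declared but the
extension is compiled out; treating that case as unsupported is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the transfer modes named in the message. */
enum model_xfer_mode {
	MODEL_XFER_DMA,
	MODEL_XFER_PIO,
	MODEL_XFER_UNSUPPORTED,	/* assumption: irq declared, extension not built */
};

/*
 * irq < 0 models a DT node without an interrupts property.
 * pcm_ext_built models the Kconfig option for the PIO PCM extension.
 */
enum model_xfer_mode model_select_mode(int irq, bool pcm_ext_built)
{
	if (irq < 0)
		return MODEL_XFER_DMA;

	return pcm_ext_built ? MODEL_XFER_PIO : MODEL_XFER_UNSUPPORTED;
}

/* Returns a short printable name for a mode. */
const char *model_mode_name(enum model_xfer_mode mode)
{
	switch (mode) {
	case MODEL_XFER_DMA:
		return "dma";
	case MODEL_XFER_PIO:
		return "pio";
	case MODEL_XFER_UNSUPPORTED:
		return "unsupported";
	}

	assert(!"unknown transfer mode");
	return "";
}
```

A caller would pass the result of its interrupt lookup as irq; with no
interrupt declared the model always selects the DMA engine, regardless
of the Kconfig setting.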

Signed-off-by: Jose Abreu 
Cc: Carlos Palminha 
Cc: Mark Brown 
Cc: Liam Girdwood 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Rob Herring 
Cc: Alexey Brodkin 
Cc: linux-snps-...@lists.infradead.org
Cc: alsa-de...@alsa-project.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

Changes v6 -> v7:
* Discard the use of memcpy
* Report IRQ_HANDLED only when there is an IRQ
* Use interrupts to check if PIO mode is in use
* Unmask interrupts only when in PIO mode
* Remove empty functions

Changes v5 -> v6:
* Use SNDRV_DMA_TYPE_CONTINUOUS

Changes v4 -> v5:
* Resolve undefined references when compiling as module
* Use DMA properties in I2S to check which mode to use: PIO or DMA (as
  suggested by Lars-Peter Clausen)

Changes v3 -> v4:
* Reintroduced custom PCM driver
* Use DT boolean to switch between ALSA DMA engine PCM or custom PCM

Changes v2 -> v3:
* Removed pll_config functions (as suggested by Alexey Brodkin)
* Dropped custom platform driver, using now ALSA DMA engine
* Dropped IRQ handler for I2S

No changes v1 -> v2.

 sound/soc/dwc/Kconfig  |   9 ++
 sound/soc/dwc/Makefile |   5 +-
 sound/soc/dwc/designware_i2s.c | 148 ---
 sound/soc/dwc/designware_pcm.c | 220 +
 sound/soc/dwc/local.h  | 122 +++
 5 files changed, 415 insertions(+), 89 deletions(-)
 create mode 100644 sound/soc/dwc/designware_pcm.c
 create mode 100644 sound/soc/dwc/local.h

diff --git a/sound/soc/dwc/Kconfig b/sound/soc/dwc/Kconfig
index d50e085..c6fd95f 100644
--- a/sound/soc/dwc/Kconfig
+++ b/sound/soc/dwc/Kconfig
@@ -7,4 +7,13 @@ config SND_DESIGNWARE_I2S
 Synopsys desigwnware I2S device. The device supports upto
 maximum of 8 channels each for play and record.
 
+config SND_DESIGNWARE_PCM
+   bool "PCM PIO extension for I2S driver"
+   depends on SND_DESIGNWARE_I2S
+   help
+Say Y or N if you want to add a custom ALSA extension that registers
+a PCM and uses PIO to transfer data.
+
+This functionality is specially suited for I2S devices that don't have
+DMA support.
 
diff --git a/sound/soc/dwc/Makefile b/sound/soc/dwc/Makefile
index 319371f..11ea966 100644
--- a/sound/soc/dwc/Makefile
+++ b/sound/soc/dwc/Makefile
@@ -1,3 +1,4 @@
 # SYNOPSYS Platform Support
-obj-$(CONFIG_SND_DESIGNWARE_I2S) += designware_i2s.o
-
+obj-$(CONFIG_SND_DESIGNWARE_I2S) += dwc_i2s.o
+dwc_i2s-y := designware_i2s.o
+dwc_i2s-$(CONFIG_SND_DESIGNWARE_PCM) += designware_pcm.o
diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c
index a97be8e..d4c3811 100644
--- a/sound/soc/dwc/designware_i2s.c
+++ b/sound/soc/dwc/designware_i2s.c
@@ -24,90 +24,7 @@
 #include 
 #include 
 #include 
-
-/* common register for all channel */
-#define IER    0x000
-#define IRER   0x004
-#define ITER   0x008
-#define CER    0x00C
-#define CCR    0x010
-#define RXFFR  0x014
-#define TXFFR  0x018
-
-/* I2STxRxRegisters for all channels */
-#define LRBR_LTHR(x)   (0x40 * x + 0x020)
-#define RRBR_RTHR(x)   (0x40 * x + 0x024)
-#define RER(x) (0x40 * x + 0x028)
-#define TER(x) (0x40 * x + 0x02C)
-#define RCR(x) (0x40 * x + 0x030)
-#define TCR(x) (0x40 * x + 0x034)
-#define ISR(x) (0x40 * x + 0x038)
-#define IMR(x) (0x40 * x + 0x03C)
-#define ROR(x) (0x40 * x + 0x040)
-#define TOR(x) (0x40 * x + 0x044)
-#define RFCR(x) (0x40 * x + 0x048)
-#define TFCR(x) (0x40 * x + 0x04C)
-#define RFF(x) (0x40 * x + 0x050)
-#define TFF(x) (0x40 * x + 0x054)
-
-/* I2SCOMPRegisters */
-#define I2S_COMP_PARAM_2   0x01F0
-#define I2S_COMP_PARAM_1   0x01F4
-#define I2S_COMP_VERSION   0x01F8
-#define I2S_COMP_TYPE  0x01FC
-
-/*
- * Component parameter register fields - define the I2S block's
- * configuration.
- */
-#define COMP1_TX_WORDSIZE_3(r)  (((r) & GENMASK(27, 25)) >> 25)
-#define COMP1_TX_WORDSIZE_2(r)  (((r) & GENMASK(24, 22)) >> 22)
-#define COMP1_TX_WORDSIZE_1(r)  (((r) & GENMASK(21, 19)) >> 19)
-#define COMP1_TX_WORDSIZE_0(r)  (((r) & GENMASK(18, 16)) >> 16)
-#define COMP1_TX_CHANNELS(r) (((r) & GENMASK(10, 9)) >> 9)
-#define COMP1_RX_CHANNELS(r) (((r) & GENMASK(8, 7)) >> 7)
-#define COMP1_RX_ENABLED(r) (((r) & BIT(6)) >> 6)
-#define COMP1_TX_ENABLED(r) (((r) & BIT(5)) >

[PATCH 0/4 v7] Add I2S audio support for ARC AXS10x boards

2016-05-23 Thread Jose Abreu

ARC AXS10x platforms consist of a mainboard with several peripherals.
One of those peripherals is an HDMI output port controlled by the ADV7511
transmitter.

This patch set adds I2S audio for the AXS10x platform.

NOTE:
Although the mainline I2S driver uses ALSA DMA engine, this controller
can be built without DMA support so it was necessary to add this
custom platform driver so that HDMI audio works in AXS boards.

Changes v6 -> v7:
* Discard the use of memcpy
* Report IRQ_HANDLED only when there is an IRQ
* Use interrupts to check if PIO mode is in use
* Unmask interrupts only when in PIO mode
* Remove empty functions

Changes v5 -> v6:
* Use SNDRV_DMA_TYPE_CONTINUOUS

Changes v4 -> v5:
* Resolve undefined references when compiling as module
* Dropped adv7511 audio patches
* Use DMA properties in I2S to check which mode to use: PIO or DMA (as
  suggested by Lars-Peter Clausen)

Changes v3 -> v4:
* Reintroduced custom PCM driver (see note below)
* Use DT boolean to switch between ALSA DMA engine PCM or custom PCM
* Use fifo depth to program I2S FCR
* Update I2S documentation

Changes v2 -> v3:
* Removed pll_config functions (as suggested by Alexey Brodkin)
* Removed HDMI start at adv7511_core (as suggested by Archit Taneja)
* Use NOP functions for adv7511_audio (as suggested by Archit Taneja)
* Added adv7511_audio_exit() function (as suggested by Archit Taneja)
* Moved adv7511 to its own folder (as suggested by Archit Taneja)
* Separated file rename of adv7511_core (as suggested by Emil Velikov)
* Compile adv7511 as module if ALSA SoC is compiled as module
* Load adv7511 audio only if declared in device tree (as suggested by
  Laurent Pinchart)
* Dropped custom platform driver, using now ALSA DMA engine
* Dropped IRQ handler for I2S

Changes v1 -> v2:
* DT bindings moved to separate patch (as suggested by Alexey Brodkin)
* Removed defconfigs entries (as suggested by Alexey Brodkin)


Cc: Carlos Palminha 
Cc: Mark Brown 
Cc: Liam Girdwood 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Rob Herring 
Cc: Alexey Brodkin 
Cc: linux-snps-...@lists.infradead.org
Cc: alsa-de...@alsa-project.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Jose Abreu (4):
  ASoC: dwc: Add helper functions to disable/enable irqs
  ASoC: dwc: Do not use devm_clk_get() if using platform data
  ASoC: dwc: Add PIO PCM extension
  ASoC: dwc: Add irq parameter to DOCUMENTATION

 .../devicetree/bindings/sound/designware-i2s.txt   |   4 +
 sound/soc/dwc/Kconfig  |   9 +
 sound/soc/dwc/Makefile |   5 +-
 sound/soc/dwc/designware_i2s.c | 229 ++---
 sound/soc/dwc/designware_pcm.c | 220 
 sound/soc/dwc/local.h  | 122 +++
 6 files changed, 467 insertions(+), 122 deletions(-)
 create mode 100644 sound/soc/dwc/designware_pcm.c
 create mode 100644 sound/soc/dwc/local.h

-- 
1.9.1

[PATCH 4/4 v7] ASoC: dwc: Add irq parameter to DOCUMENTATION

2016-05-23 Thread Jose Abreu

A description of the interrupt parameter of the I2S
controller was added. This parameter should only be
set when the I2S controller does not have DMA support.

Signed-off-by: Jose Abreu 
Cc: Carlos Palminha 
Cc: Mark Brown 
Cc: Liam Girdwood 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Rob Herring 
Cc: Alexey Brodkin 
Cc: linux-snps-...@lists.infradead.org
Cc: alsa-de...@alsa-project.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

Changes v6 -> v7:
* interrupts is now optional property

No changes v5 -> v6.

Changes v4 -> v5:
* interrupts is now required property
* Drop 'snps-use-dmaengine' property

This patch was only introduced in v4.

 Documentation/devicetree/bindings/sound/designware-i2s.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/designware-i2s.txt b/Documentation/devicetree/bindings/sound/designware-i2s.txt
index 7bb5424..6a536d5 100644
--- a/Documentation/devicetree/bindings/sound/designware-i2s.txt
+++ b/Documentation/devicetree/bindings/sound/designware-i2s.txt
@@ -12,6 +12,10 @@ Required properties:
one for receive.
  - dma-names : "tx" for the transmit channel, "rx" for the receive channel.
 
+Optional properties:
+ - interrupts: The interrupt line number for the I2S controller. Add this
+   parameter if the I2S controller that you are using does not support DMA.
+
 For more details on the 'dma', 'dma-names', 'clock' and 'clock-names'
 properties please check:
* resource-names.txt
-- 
1.9.1

[PATCH 1/4 v7] ASoC: dwc: Add helper functions to disable/enable irqs

2016-05-23 Thread Jose Abreu

Helper functions to disable and enable the I2S interrupts were
added. Only the interrupts of the used channels are enabled.

Also, there is no need to enable irqs at dw_i2s_config(), they
are already enabled at startup.
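
The masking done by the two helpers can be illustrated with a small
user-space model. This is a sketch under stated assumptions rather than
the driver code: the model_* names and the array standing in for the
per-channel IMR registers are hypothetical. The bit values (0x30 for
playback, 0x03 for capture), the one-register-per-channel-pair loop
bound and the limit of four registers are taken from the diff in this
patch. Setting the bits masks (disables) the interrupts and clearing
them unmasks (enables) them.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_CHANNEL_PAIRS	4	/* the old i2s_stop() looped over 4 IMRs */
#define MODEL_IMR_TX_BITS	0x30u	/* playback mask bits used in the diff */
#define MODEL_IMR_RX_BITS	0x03u	/* capture mask bits used in the diff */

enum model_stream { MODEL_PLAYBACK, MODEL_CAPTURE };

/* Hypothetical stand-in for the memory-mapped IMR(i) registers. */
struct model_i2s {
	uint32_t imr[MODEL_MAX_CHANNEL_PAIRS];
};

static uint32_t model_stream_bits(enum model_stream stream)
{
	return stream == MODEL_PLAYBACK ? MODEL_IMR_TX_BITS : MODEL_IMR_RX_BITS;
}

/* Mask the interrupts of the first chan_nr / 2 channel pairs. */
void model_disable_irqs(struct model_i2s *dev, enum model_stream stream,
			int chan_nr)
{
	assert(chan_nr >= 0 && chan_nr / 2 <= MODEL_MAX_CHANNEL_PAIRS);

	for (int i = 0; i < chan_nr / 2; i++)
		dev->imr[i] |= model_stream_bits(stream);
}

/* Unmask the interrupts of the first chan_nr / 2 channel pairs. */
void model_enable_irqs(struct model_i2s *dev, enum model_stream stream,
		       int chan_nr)
{
	assert(chan_nr >= 0 && chan_nr / 2 <= MODEL_MAX_CHANNEL_PAIRS);

	for (int i = 0; i < chan_nr / 2; i++)
		dev->imr[i] &= ~model_stream_bits(stream);
}
```

Enabling with chan_nr = 2 touches only the first register, which matches
the statement that only the interrupts of the used channels are enabled,
while disabling with chan_nr = 8 covers all four registers.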

Signed-off-by: Jose Abreu 
Cc: Carlos Palminha 
Cc: Mark Brown 
Cc: Liam Girdwood 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Rob Herring 
Cc: Alexey Brodkin 
Cc: linux-snps-...@lists.infradead.org
Cc: alsa-de...@alsa-project.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

This patch was only introduced in v7.

 sound/soc/dwc/designware_i2s.c | 68 +-
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c
index 0db69b7..4c4f0dc 100644
--- a/sound/soc/dwc/designware_i2s.c
+++ b/sound/soc/dwc/designware_i2s.c
@@ -145,26 +145,54 @@ static inline void i2s_clear_irqs(struct dw_i2s_dev *dev, u32 stream)
}
 }
 
-static void i2s_start(struct dw_i2s_dev *dev,
- struct snd_pcm_substream *substream)
+static inline void i2s_disable_irqs(struct dw_i2s_dev *dev, u32 stream,
+   int chan_nr)
 {
-   struct i2s_clk_config_data *config = &dev->config;
u32 i, irq;
-   i2s_write_reg(dev->i2s_base, IER, 1);
 
-   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
-   for (i = 0; i < (config->chan_nr / 2); i++) {
+   if (stream == SNDRV_PCM_STREAM_PLAYBACK) {
+   for (i = 0; i < (chan_nr / 2); i++) {
+   irq = i2s_read_reg(dev->i2s_base, IMR(i));
+   i2s_write_reg(dev->i2s_base, IMR(i), irq | 0x30);
+   }
+   } else {
+   for (i = 0; i < (chan_nr / 2); i++) {
+   irq = i2s_read_reg(dev->i2s_base, IMR(i));
+   i2s_write_reg(dev->i2s_base, IMR(i), irq | 0x03);
+   }
+   }
+}
+
+static inline void i2s_enable_irqs(struct dw_i2s_dev *dev, u32 stream,
+  int chan_nr)
+{
+   u32 i, irq;
+
+   if (stream == SNDRV_PCM_STREAM_PLAYBACK) {
+   for (i = 0; i < (chan_nr / 2); i++) {
irq = i2s_read_reg(dev->i2s_base, IMR(i));
i2s_write_reg(dev->i2s_base, IMR(i), irq & ~0x30);
}
-   i2s_write_reg(dev->i2s_base, ITER, 1);
} else {
-   for (i = 0; i < (config->chan_nr / 2); i++) {
+   for (i = 0; i < (chan_nr / 2); i++) {
irq = i2s_read_reg(dev->i2s_base, IMR(i));
i2s_write_reg(dev->i2s_base, IMR(i), irq & ~0x03);
}
-   i2s_write_reg(dev->i2s_base, IRER, 1);
}
+}
+
+static void i2s_start(struct dw_i2s_dev *dev,
+ struct snd_pcm_substream *substream)
+{
+   struct i2s_clk_config_data *config = &dev->config;
+
+   i2s_write_reg(dev->i2s_base, IER, 1);
+   i2s_enable_irqs(dev, substream->stream, config->chan_nr);
+
+   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+   i2s_write_reg(dev->i2s_base, ITER, 1);
+   else
+   i2s_write_reg(dev->i2s_base, IRER, 1);
 
i2s_write_reg(dev->i2s_base, CER, 1);
 }
@@ -172,24 +200,14 @@ static void i2s_start(struct dw_i2s_dev *dev,
 static void i2s_stop(struct dw_i2s_dev *dev,
struct snd_pcm_substream *substream)
 {
-   u32 i = 0, irq;
 
i2s_clear_irqs(dev, substream->stream);
-   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+   if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
i2s_write_reg(dev->i2s_base, ITER, 0);
-
-   for (i = 0; i < 4; i++) {
-   irq = i2s_read_reg(dev->i2s_base, IMR(i));
-   i2s_write_reg(dev->i2s_base, IMR(i), irq | 0x30);
-   }
-   } else {
+   else
i2s_write_reg(dev->i2s_base, IRER, 0);
 
-   for (i = 0; i < 4; i++) {
-   irq = i2s_read_reg(dev->i2s_base, IMR(i));
-   i2s_write_reg(dev->i2s_base, IMR(i), irq | 0x03);
-   }
-   }
+   i2s_disable_irqs(dev, substream->stream, 8);
 
if (!dev->active) {
i2s_write_reg(dev->i2s_base, CER, 0);
@@ -223,7 +241,7 @@ static int dw_i2s_startup(struct snd_pcm_substream *substream,
 
 static void dw_i2s_config(struct dw_i2s_dev *dev, int stream)
 {
-   u32 ch_reg, irq;
+   u32 ch_reg;
struct i2s_clk_config_data *config = &dev->config;
 
 
@@ -235,16 +253,12 @@ static void dw_i2s_config(struct dw_i2s_dev *dev, int stream)
  dev->xfer_resolution);
i2s_write_reg(dev->i2s_base, TFCR(ch_reg),
  dev->fifo_th - 1);
-   irq = i2s_read_reg(dev->i2s_base, IMR(ch_reg));
-

[PATCH 2/4 v7] ASoC: dwc: Do not use devm_clk_get() if using platform data

2016-05-23 Thread Jose Abreu

When platform data is used, devm_clk_get() is still called,
which causes a probe failure if the clock is not declared.
Since the clock handler can be passed through platform data,
call devm_clk_get() only when platform data is not used.
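
The resulting probe flow can be sketched as a user-space model. This is
an illustration under stated assumptions, not the driver code: the
model_* types and functions are hypothetical stand-ins, and reporting a
missing clock as -ENOENT is an assumption of the sketch. The -ENODEV
return for platform data without a clock configure method follows the
diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's platform data and device state. */
struct model_pdata {
	bool has_clk_cfg;	/* models a non-NULL clock configure callback */
};

struct model_dev {
	bool clk_enabled;	/* set only when the clock was looked up and enabled */
	int clk_get_calls;	/* counts modeled devm_clk_get() calls */
};

/* Models the clock lookup; assumes a missing clock is reported as -ENOENT. */
static int model_clk_get(struct model_dev *dev, bool clock_declared)
{
	dev->clk_get_calls++;
	return clock_declared ? 0 : -ENOENT;
}

/* Clock setup after the patch: look the clock up only without platform data. */
int model_probe_clock(struct model_dev *dev, const struct model_pdata *pdata,
		      bool clock_declared)
{
	int ret;

	assert(dev != NULL);

	if (pdata) {
		/* Platform data must supply a clock configure method. */
		if (!pdata->has_clk_cfg)
			return -ENODEV;
		return 0;
	}

	ret = model_clk_get(dev, clock_declared);
	if (ret < 0)
		return ret;

	dev->clk_enabled = true;
	return 0;
}
```

With platform data present the model never calls the clock lookup, so an
undeclared clock no longer makes the probe fail in that configuration.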

Signed-off-by: Jose Abreu 
Cc: Carlos Palminha 
Cc: Mark Brown 
Cc: Liam Girdwood 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Rob Herring 
Cc: Alexey Brodkin 
Cc: linux-snps-...@lists.infradead.org
Cc: alsa-de...@alsa-project.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

This patch was only introduced in v7.

 sound/soc/dwc/designware_i2s.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c
index 4c4f0dc..a97be8e 100644
--- a/sound/soc/dwc/designware_i2s.c
+++ b/sound/soc/dwc/designware_i2s.c
@@ -690,15 +690,16 @@ static int dw_i2s_probe(struct platform_device *pdev)
dev_err(&pdev->dev, "no clock configure method\n");
return -ENODEV;
}
-   }
-   dev->clk = devm_clk_get(&pdev->dev, clk_id);
+   } else {
+   dev->clk = devm_clk_get(&pdev->dev, clk_id);
 
-   if (IS_ERR(dev->clk))
-   return PTR_ERR(dev->clk);
+   if (IS_ERR(dev->clk))
+   return PTR_ERR(dev->clk);
 
-   ret = clk_prepare_enable(dev->clk);
-   if (ret < 0)
-   return ret;
+   ret = clk_prepare_enable(dev->clk);
+   if (ret < 0)
+   return ret;
+   }
}
 
dev_set_drvdata(&pdev->dev, dev);
-- 
1.9.1
