Re: [RFC 03/18] memcontrol: present maximum used memory also for cgroup-v2

2016-06-14 Thread Michal Hocko
On Mon 13-06-16 22:44:10, Topi Miettinen wrote:
> Present maximum used memory in cgroup memory.current_max.

It would be much preferable to present the use case in the patch
description. It is true that this information is presented in the
v1 API, but the current policy is to export new knobs only when there is
a reasonable use case for them.

> Signed-off-by: Topi Miettinen 
> ---
>  include/linux/page_counter.h |  7 ++-
>  mm/memcontrol.c  | 13 +
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
> index 7e62920..be4de17 100644
> --- a/include/linux/page_counter.h
> +++ b/include/linux/page_counter.h
> @@ -9,9 +9,9 @@ struct page_counter {
>   atomic_long_t count;
>   unsigned long limit;
>   struct page_counter *parent;
> + unsigned long watermark;
>  
>   /* legacy */
> - unsigned long watermark;
>   unsigned long failcnt;
>  };
>  
> @@ -34,6 +34,11 @@ static inline unsigned long page_counter_read(struct page_counter *counter)
>   return atomic_long_read(&counter->count);
>  }
>  
> +static inline unsigned long page_counter_read_watermark(struct page_counter *counter)
> +{
> + return counter->watermark;
> +}
> +
>  void page_counter_cancel(struct page_counter *counter, unsigned long nr_pages);
>  void page_counter_charge(struct page_counter *counter, unsigned long nr_pages);
>  bool page_counter_try_charge(struct page_counter *counter,
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 75e7440..5513771 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4966,6 +4966,14 @@ static u64 memory_current_read(struct cgroup_subsys_state *css,
>   return (u64)page_counter_read(&memcg->memory) * PAGE_SIZE;
>  }
>  
> +static u64 memory_current_max_read(struct cgroup_subsys_state *css,
> +struct cftype *cft)
> +{
> + struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +
> + return (u64)page_counter_read_watermark(&memcg->memory) * PAGE_SIZE;
> +}
> +
>  static int memory_low_show(struct seq_file *m, void *v)
>  {
>   struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
> @@ -5179,6 +5187,11 @@ static struct cftype memory_files[] = {
>   .read_u64 = memory_current_read,
>   },
>   {
> + .name = "current_max",
> + .flags = CFTYPE_NOT_ON_ROOT,
> + .read_u64 = memory_current_max_read,
> + },
> + {
>   .name = "low",
>   .flags = CFTYPE_NOT_ON_ROOT,
>   .seq_show = memory_low_show,
> -- 
> 2.8.1
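
For readers unfamiliar with the watermark that the quoted patch exports, here
is a minimal userspace sketch of a high-watermark counter. It is not the
kernel's page_counter code: demo_counter, demo_charge, demo_uncharge and
DEMO_PAGE_SIZE are made-up names, and the sketch ignores atomicity and the
parent hierarchy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page size for the sketch; the kernel uses PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL

/* Simplified, non-atomic stand-in for struct page_counter. */
struct demo_counter {
	unsigned long count;     /* pages currently charged */
	unsigned long watermark; /* highest value count has reached */
};

/* Charge nr_pages and remember the peak usage seen so far. */
static void demo_charge(struct demo_counter *c, unsigned long nr_pages)
{
	c->count += nr_pages;
	if (c->count > c->watermark)
		c->watermark = c->count;
}

/* Uncharging lowers the current count but never the watermark. */
static void demo_uncharge(struct demo_counter *c, unsigned long nr_pages)
{
	c->count -= nr_pages;
}

/* Peak usage in bytes, analogous to what memory.current_max would report. */
static uint64_t demo_current_max_bytes(const struct demo_counter *c)
{
	return (uint64_t)c->watermark * DEMO_PAGE_SIZE;
}

/* Usage example: the peak survives a later drop in usage. */
void demo_counter_selftest(void)
{
	struct demo_counter c = { 0, 0 };

	demo_charge(&c, 10);
	demo_uncharge(&c, 4);
	demo_charge(&c, 2);
	assert(c.count == 8);
	assert(c.watermark == 10);
	assert(demo_current_max_bytes(&c) == 10ULL * DEMO_PAGE_SIZE);
}
```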

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v1 1/2] sound: lpass-cpu: add module licence and description

2016-06-14 Thread Kenneth Westfield
On Mon, Jun 13, 2016 at 02:23:16PM +0100, Srinivas Kandagatla wrote:
> This patch adds a module licence to the lpass-cpu driver. Without this
> patch, loading the lpass-cpu module with insmod/modprobe taints the kernel
> with the messages below.
> 
> snd_soc_lpass_cpu: module license 'unspecified' taints kernel.
> Disabling lock debugging due to kernel taint
> snd_soc_lpass_cpu: Unknown symbol regmap_write (err 0)
> snd_soc_lpass_cpu: Unknown symbol devm_kmalloc (err 0)
> ...
> 
> Signed-off-by: Srinivas Kandagatla 
> ---

Acked-by: Kenneth Westfield 

-- 
Kenneth Westfield
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project


Re: lm-senser can't detect thermal on thermal_zone

2016-06-14 Thread Kuninori Morimoto

Hi Zhang

> > > > struct thermal_zone_device *thermal_zone_device_register()
> > > > {
> > > > ...
> > > > if (!tz->tzp || !tz->tzp->no_hwmon) {
> > > > result = thermal_add_hwmon_sysfs(tz);
> > > > ...
> > > > }
> > > > ...
> > > > }
> > > > 
> > > > Does this mean "thermal_zone doesn't use lm-sensors" ?
> 
> I'd prefer to say that an of_thermal registered thermal_zone doesn't use
> lm-sensors.
> If you really want to see the hwmon interface, I think you should use
> the thermal API (thermal_zone_device_register) directly.

Thank you for your feedback.
My driver supports both of_thermal and the thermal API,
so switching is not a big deal.

But can you tell me why of_thermal doesn't use lm-sensors?
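
The condition quoted at the top of this mail can be modelled in a few lines.
This is only a sketch with made-up types (demo_tz, demo_tz_params); it shows
that the hwmon interface is skipped only when zone parameters exist and set
no_hwmon, and it says nothing about which callers actually set that flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct demo_tz_params {
	bool no_hwmon; /* true: do not create a hwmon interface */
};

struct demo_tz {
	const struct demo_tz_params *tzp; /* may be NULL */
};

/* Mirrors the quoted test: hwmon is added unless the params opt out. */
static bool demo_wants_hwmon(const struct demo_tz *tz)
{
	return !tz->tzp || !tz->tzp->no_hwmon;
}

/* Usage example covering the three possible cases. */
void demo_hwmon_selftest(void)
{
	struct demo_tz_params opt_out = { .no_hwmon = true };
	struct demo_tz_params opt_in = { .no_hwmon = false };
	struct demo_tz no_params = { .tzp = NULL };
	struct demo_tz with_hwmon = { .tzp = &opt_in };
	struct demo_tz without_hwmon = { .tzp = &opt_out };

	assert(demo_wants_hwmon(&no_params));
	assert(demo_wants_hwmon(&with_hwmon));
	assert(!demo_wants_hwmon(&without_hwmon));
}
```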


Re: [RFC PATCH 1/3] mm, thp: revert allocstall comparing

2016-06-14 Thread Michal Hocko
On Sat 11-06-16 22:15:59, Ebru Akagunduz wrote:
> This patch reverts the allocstall comparison used when deciding
> whether swapin is worthwhile, because it does not work
> if vmevent is disabled.
> 
> Related commit:
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2548306628308aa6a326640d345a737bc898941d

I guess it would be easier to simply drop
mm-thp-avoid-unnecessary-swapin-in-khugepaged.patch
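
To illustrate why the comparison stops being useful without event counters,
here is a small userspace sketch. It assumes, as the commit message implies,
that a disabled counter always reads the same value; demo_vmstat,
demo_read_allocstall and demo_should_swapin are hypothetical names, not
kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ALLOCSTALL event counter. */
struct demo_vmstat {
	bool counters_enabled;    /* models vm event counters being available */
	unsigned long allocstall; /* real number of allocation stalls */
};

/* Assumption: with counters disabled, every read returns 0. */
static unsigned long demo_read_allocstall(const struct demo_vmstat *vs)
{
	return vs->counters_enabled ? vs->allocstall : 0;
}

/*
 * The heuristic being reverted: swap in only if no stall was recorded
 * since the snapshot and the mm actually has swap entries.
 */
static bool demo_should_swapin(unsigned long snapshot, unsigned long now,
			       unsigned long swap_entries)
{
	return snapshot == now && swap_entries != 0;
}

/* Usage example: the same pressure is seen or missed depending on the flag. */
void demo_allocstall_selftest(void)
{
	struct demo_vmstat vs = { .counters_enabled = true, .allocstall = 5 };
	unsigned long snap = demo_read_allocstall(&vs);

	vs.allocstall = 9; /* memory pressure: stalls happened */
	assert(!demo_should_swapin(snap, demo_read_allocstall(&vs), 3));

	/* Counters disabled: pressure is invisible, swapin always proceeds. */
	vs.counters_enabled = false;
	vs.allocstall = 5;
	snap = demo_read_allocstall(&vs);
	vs.allocstall = 9;
	assert(demo_should_swapin(snap, demo_read_allocstall(&vs), 3));
}
```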

> Signed-off-by: Ebru Akagunduz 
> ---
>  mm/khugepaged.c | 31 ---
>  1 file changed, 8 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 0ac63f7..e3d8da7 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -68,7 +68,6 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
>   */
>  static unsigned int khugepaged_max_ptes_none __read_mostly;
>  static unsigned int khugepaged_max_ptes_swap __read_mostly;
> -static unsigned long allocstall;
>  
>  static int khugepaged(void *none);
>  
> @@ -926,7 +925,6 @@ static void collapse_huge_page(struct mm_struct *mm,
>   struct page *new_page;
>   spinlock_t *pmd_ptl, *pte_ptl;
>   int isolated = 0, result = 0;
> - unsigned long swap, curr_allocstall;
>   struct mem_cgroup *memcg;
>   unsigned long mmun_start;   /* For mmu_notifiers */
>   unsigned long mmun_end; /* For mmu_notifiers */
> @@ -955,8 +953,6 @@ static void collapse_huge_page(struct mm_struct *mm,
>   goto out_nolock;
>   }
>  
> - swap = get_mm_counter(mm, MM_SWAPENTS);
> - curr_allocstall = sum_vm_event(ALLOCSTALL);
>   down_read(&mm->mmap_sem);
>   result = hugepage_vma_revalidate(mm, address);
>   if (result) {
> @@ -972,22 +968,15 @@ static void collapse_huge_page(struct mm_struct *mm,
>   up_read(&mm->mmap_sem);
>   goto out_nolock;
>   }
> -
>   /*
> -  * Don't perform swapin readahead when the system is under pressure,
> -  * to avoid unnecessary resource consumption.
> +  * __collapse_huge_page_swapin always returns with mmap_sem
> +  * locked.  If it fails, release mmap_sem and jump directly
> +  * out.  Continuing to collapse causes inconsistency.
>*/
> - if (allocstall == curr_allocstall && swap != 0) {
> - /*
> -  * __collapse_huge_page_swapin always returns with mmap_sem
> -  * locked.  If it fails, release mmap_sem and jump directly
> -  * out.  Continuing to collapse causes inconsistency.
> -  */
> - if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {
> - mem_cgroup_cancel_charge(new_page, memcg, true);
> - up_read(&mm->mmap_sem);
> - goto out_nolock;
> - }
> + if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {
> + mem_cgroup_cancel_charge(new_page, memcg, true);
> + up_read(&mm->mmap_sem);
> + goto out_nolock;
>   }
>  
>   up_read(&mm->mmap_sem);
> @@ -1822,7 +1811,6 @@ static void khugepaged_wait_work(void)
>   if (!scan_sleep_jiffies)
>   return;
>  
> - allocstall = sum_vm_event(ALLOCSTALL);
>   khugepaged_sleep_expire = jiffies + scan_sleep_jiffies;
>   wait_event_freezable_timeout(khugepaged_wait,
>khugepaged_should_wakeup(),
> @@ -1830,10 +1818,8 @@ static void khugepaged_wait_work(void)
>   return;
>   }
>  
> - if (khugepaged_enabled()) {
> - allocstall = sum_vm_event(ALLOCSTALL);
> + if (khugepaged_enabled())
>   wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
> - }
>  }
>  
>  static int khugepaged(void *none)
> @@ -1842,7 +1828,6 @@ static int khugepaged(void *none)
>  
>   set_freezable();
>   set_user_nice(current, MAX_NICE);
> - allocstall = sum_vm_event(ALLOCSTALL);
>  
>   while (!kthread_should_stop()) {
>   khugepaged_do_scan();
> -- 
> 1.9.1

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v3 1/3] pinctrl/broxton: enable platform device in the absent of ACPI enumeration

2016-06-14 Thread Linus Walleij
On Tue, Jun 7, 2016 at 8:55 AM, Tan Jui Nee  wrote:

> This caters for non-ACPI systems, where
> a platform device has to be created in order to bind
> with the Apollo Lake Pinctrl GPIO platform driver.
>
> Signed-off-by: Tan Jui Nee 

You forgot to put me on the To: line for the patch so it's only luck that I
saw it. (Every once in a blue moon I read the linux-gpio list directly...)

Patch applied with Mika's ACK.

Yours,
Linus Walleij


RE: [PATCH v5 1/2] mm, kasan: improve double-free detection

2016-06-14 Thread Luruo, Kuthonuzo
> > Next time, when/if you send patch series, send patches in one thread, i.e.
> > patches should be replies to the cover letter.
> > Your patches are not linked together, which makes them harder to track.

Thanks for the tip, but doesn't this conflict with the advice in
https://www.kernel.org/doc/Documentation/SubmittingPatches, specifically the
use of the "summary phrase"...

> >
> >
> >> Currently, KASAN may fail to detect concurrent deallocations of the same
> >> object due to a race in kasan_slab_free(). This patch makes double-free
> >> detection more reliable by serializing access to KASAN object metadata.
> >> New functions kasan_meta_lock() and kasan_meta_unlock() are provided to
> >> lock/unlock per-object metadata. Double-free errors are now reported via
> >> kasan_report().
> >>
> >> Per-object lock concept from suggestion/observations by Dmitry Vyukov.
> >>
> >
> >
> > So, I still don't like this; this is way too hacky and complex.

I don't think the patch is particularly complex, but I respect your judgment.

> > I have some thoughts about how to make this lockless and robust enough.
> > I'll try to sort this out tomorrow.
> >
> 
> 
> So, I think something like this should work.
> Tested very briefly.
> 
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index ac4b3c4..8691142 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -75,6 +75,8 @@ struct kasan_cache {
>  int kasan_module_alloc(void *addr, size_t size);
>  void kasan_free_shadow(const struct vm_struct *vm);
> 
> +void kasan_init_slab_obj(struct kmem_cache *cache, const void *object);
> +
>  size_t ksize(const void *);
>  static inline void kasan_unpoison_slab(const void *ptr) { ksize(ptr); }
> 
> @@ -102,6 +104,9 @@ static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
>  static inline void kasan_poison_object_data(struct kmem_cache *cache,
>   void *object) {}
> 
> +static inline void kasan_init_slab_obj(struct kmem_cache *cache,
> + const void *object) { }
> +
>  static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {}
>  static inline void kasan_kfree_large(const void *ptr) {}
>  static inline void kasan_poison_kfree(void *ptr) {}
> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> index 6845f92..ab0fded 100644
> --- a/mm/kasan/kasan.c
> +++ b/mm/kasan/kasan.c
> @@ -388,11 +388,9 @@ void kasan_cache_create(struct kmem_cache *cache, size_t *size,
>   *size += sizeof(struct kasan_alloc_meta);
> 
>   /* Add free meta. */
> - if (cache->flags & SLAB_DESTROY_BY_RCU || cache->ctor ||
> - cache->object_size < sizeof(struct kasan_free_meta)) {
> - cache->kasan_info.free_meta_offset = *size;
> - *size += sizeof(struct kasan_free_meta);
> - }
> + cache->kasan_info.free_meta_offset = *size;
> + *size += sizeof(struct kasan_free_meta);
> +
>   redzone_adjust = optimal_redzone(cache->object_size) -
>   (*size - cache->object_size);
>   if (redzone_adjust > 0)
> @@ -431,13 +429,6 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
>   kasan_poison_shadow(object,
>   round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
>   KASAN_KMALLOC_REDZONE);
> -#ifdef CONFIG_SLAB
> - if (cache->flags & SLAB_KASAN) {
> - struct kasan_alloc_meta *alloc_info =
> - get_alloc_info(cache, object);
> - alloc_info->state = KASAN_STATE_INIT;
> - }
> -#endif
>  }
> 
>  #ifdef CONFIG_SLAB
> @@ -501,6 +492,20 @@ struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
>   BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
>   return (void *)object + cache->kasan_info.free_meta_offset;
>  }
> +
> +void kasan_init_slab_obj(struct kmem_cache *cache, const void *object)
> +{
> + struct kasan_alloc_meta *alloc_info;
> + struct kasan_free_meta *free_info;
> +
> + if (!(cache->flags & SLAB_KASAN))
> + return;
> +
> + alloc_info = get_alloc_info(cache, object);
> + free_info = get_free_info(cache, object);
> + __memset(alloc_info, 0, sizeof(*alloc_info));
> + __memset(free_info, 0, sizeof(*free_info));
> +}
>  #endif
> 
>  void kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
> @@ -523,37 +528,47 @@ static void kasan_poison_slab_free(struct kmem_cache *cache, void *object)
>  bool kasan_slab_free(struct kmem_cache *cache, void *object)
>  {
>  #ifdef CONFIG_SLAB
> + struct kasan_free_meta *free_info = get_free_info(cache, object);
> + struct kasan_track new_free_stack, old_free_stack;
> + s8 old_shadow;
> +
>   /* RCU slabs could be legally used after free within the RCU period */
>   if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
>   return false;
> 
> - if (likely(cache->flags & SLAB_KASAN)) {
> - struct kasan_alloc_meta
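
The quoted diff is cut off here, so its actual mechanism is not visible. As a
generic illustration of lockless double-free detection only (not the method
of either patch), a per-object state word can be flipped with a single
compare-and-swap so that exactly one of several racing frees wins. All names
below (demo_meta, demo_alloc, demo_free) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum demo_state { DEMO_ALLOCATED = 1, DEMO_FREED = 2 };

/* Hypothetical per-object metadata; not KASAN's real layout. */
struct demo_meta {
	_Atomic int state;
};

/* Mark the object as live. */
static void demo_alloc(struct demo_meta *m)
{
	atomic_store(&m->state, DEMO_ALLOCATED);
}

/*
 * Returns true if this call performed the free, false if the object was
 * not in the allocated state (a double free or an invalid free).
 * Only one of several racing callers can win the compare-and-swap.
 */
static bool demo_free(struct demo_meta *m)
{
	int expected = DEMO_ALLOCATED;

	return atomic_compare_exchange_strong(&m->state, &expected, DEMO_FREED);
}

/* Usage example: the second free of the same object is reported. */
void demo_double_free_selftest(void)
{
	struct demo_meta m = { 0 };

	demo_alloc(&m);
	assert(demo_free(&m));  /* first free succeeds */
	assert(!demo_free(&m)); /* second free is detected */
}
```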

Re: [RFC PATCH 2/3] mm, thp: convert from optimistic to conservative

2016-06-14 Thread Michal Hocko
On Sat 11-06-16 22:16:00, Ebru Akagunduz wrote:
> Currently, khugepaged collapses pages on the assumption that a single
> referenced page is enough to create a THP.
> 
> This patch changes the design from optimistic to conservative.
> It gives a default threshold which is half of HPAGE_PMD_NR
> for referenced pages, also introduces a new sysfs knob.

I am not really happy about yet another tunable. khugepaged_max_ptes_none
is too specific already. We do not want to have one knob per page
bit. Shouldn't we rather make the existing knob more generic to allow
the implementation to decide whether the young bit or the present bit is more
important?
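
For reference, the optimistic and conservative rules described in the commit
message can be compared with a small sketch. It assumes 512 PTEs per huge
page and a ">=" comparison against the threshold; the quoted diff is
truncated before the real check, so both are assumptions, and the function
names are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for the sketch: 512 PTEs per huge page (2M page, 4K base). */
#define DEMO_HPAGE_PMD_NR 512

/* Old, optimistic rule: a single referenced PTE is enough. */
static bool demo_collapse_optimistic(int referenced)
{
	return referenced >= 1;
}

/* Proposed, conservative rule: at least min_young referenced PTEs. */
static bool demo_collapse_conservative(int referenced, unsigned int min_young)
{
	return referenced >= 0 && (unsigned int)referenced >= min_young;
}

/* Usage example with the default threshold of half a huge page. */
void demo_threshold_selftest(void)
{
	unsigned int min_young = DEMO_HPAGE_PMD_NR / 2;

	assert(demo_collapse_optimistic(1));
	assert(!demo_collapse_conservative(1, min_young));
	assert(demo_collapse_conservative(256, min_young));
}
```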

> Signed-off-by: Ebru Akagunduz 
> ---
>  include/trace/events/huge_memory.h | 10 
>  mm/khugepaged.c                    | 50 +-
>  2 files changed, 44 insertions(+), 16 deletions(-)
> 
> diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
> index 830d47d..5f14025 100644
> --- a/include/trace/events/huge_memory.h
> +++ b/include/trace/events/huge_memory.h
> @@ -13,7 +13,7 @@
>   EM( SCAN_EXCEED_NONE_PTE,   "exceed_none_pte")  \
>   EM( SCAN_PTE_NON_PRESENT,   "pte_non_present")  \
>   EM( SCAN_PAGE_RO,   "no_writable_page") \
> - EM( SCAN_NO_REFERENCED_PAGE,"no_referenced_page")   \
> + EM( SCAN_LACK_REFERENCED_PAGE,  "lack_referenced_page") \
>   EM( SCAN_PAGE_NULL, "page_null")\
>   EM( SCAN_SCAN_ABORT,"scan_aborted") \
>   EM( SCAN_PAGE_COUNT,"not_suitable_page_count")  \
> @@ -47,7 +47,7 @@ SCAN_STATUS
>  TRACE_EVENT(mm_khugepaged_scan_pmd,
>  
>   TP_PROTO(struct mm_struct *mm, struct page *page, bool writable,
> -  bool referenced, int none_or_zero, int status, int unmapped),
> +  int referenced, int none_or_zero, int status, int unmapped),
>  
>   TP_ARGS(mm, page, writable, referenced, none_or_zero, status, unmapped),
>  
> @@ -55,7 +55,7 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
>   __field(struct mm_struct *, mm)
>   __field(unsigned long, pfn)
>   __field(bool, writable)
> - __field(bool, referenced)
> + __field(int, referenced)
>   __field(int, none_or_zero)
>   __field(int, status)
>   __field(int, unmapped)
> @@ -108,14 +108,14 @@ TRACE_EVENT(mm_collapse_huge_page,
>  TRACE_EVENT(mm_collapse_huge_page_isolate,
>  
>   TP_PROTO(struct page *page, int none_or_zero,
> -  bool referenced, bool  writable, int status),
> +  int referenced, bool  writable, int status),
>  
>   TP_ARGS(page, none_or_zero, referenced, writable, status),
>  
>   TP_STRUCT__entry(
>   __field(unsigned long, pfn)
>   __field(int, none_or_zero)
> - __field(bool, referenced)
> + __field(int, referenced)
>   __field(bool, writable)
>   __field(int, status)
>   ),
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index e3d8da7..43fc41e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -27,7 +27,7 @@ enum scan_result {
>   SCAN_EXCEED_NONE_PTE,
>   SCAN_PTE_NON_PRESENT,
>   SCAN_PAGE_RO,
> - SCAN_NO_REFERENCED_PAGE,
> + SCAN_LACK_REFERENCED_PAGE,
>   SCAN_PAGE_NULL,
>   SCAN_SCAN_ABORT,
>   SCAN_PAGE_COUNT,
> @@ -68,6 +68,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
>   */
>  static unsigned int khugepaged_max_ptes_none __read_mostly;
>  static unsigned int khugepaged_max_ptes_swap __read_mostly;
> +static unsigned int khugepaged_min_ptes_young __read_mostly;
>  
>  static int khugepaged(void *none);
>  
> @@ -282,6 +283,32 @@ static struct kobj_attribute khugepaged_max_ptes_swap_attr =
>   __ATTR(max_ptes_swap, 0644, khugepaged_max_ptes_swap_show,
>  khugepaged_max_ptes_swap_store);
>  
> +static ssize_t khugepaged_min_ptes_young_show(struct kobject *kobj,
> +   struct kobj_attribute *attr,
> +   char *buf)
> +{
> + return sprintf(buf, "%u\n", khugepaged_min_ptes_young);
> +}
> +
> +static ssize_t khugepaged_min_ptes_young_store(struct kobject *kobj,
> +struct kobj_attribute *attr,
> +const char *buf, size_t count)
> +{
> + int err;
> + unsigned long min_ptes_young;
> + err  = kstrtoul(buf, 10, &min_ptes_young);
> + if (err || min_ptes_young > HPAGE_PMD_NR-1)
> + return -EINVAL;
> +
> + khugepaged_min_ptes_young = min_ptes_young;
> +
> + return count;
> +}
> +
> +static struct kobj_attribute khugepaged_min_ptes_young_attr =
> + __ATTR(min_ptes_young, 0644, khugepaged_min_ptes_young_show,
> + khugepaged_min_ptes_young_store);
>
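
The range check in the quoted store handler can be tried out in userspace
with the following sketch. It uses strtoul() rather than the kernel's
kstrtoul(), so the parsing details differ; demo_parse_min_young and
DEMO_HPAGE_PMD_NR (assumed to be 512) are illustrative names only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Assumption for the sketch: 512 PTEs per huge page. */
#define DEMO_HPAGE_PMD_NR 512

/*
 * Parse a decimal knob value, allowing one trailing newline as sysfs
 * writes usually carry. Accepts 0..DEMO_HPAGE_PMD_NR-1 only.
 * Returns 0 on success or -EINVAL on any error.
 */
static int demo_parse_min_young(const char *buf, unsigned int *out)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > DEMO_HPAGE_PMD_NR - 1)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}

/* Usage example: in-range values are stored, everything else is rejected. */
void demo_parse_selftest(void)
{
	unsigned int v = 0;

	assert(demo_parse_min_young("256\n", &v) == 0 && v == 256);
	assert(demo_parse_min_young("512", &v) == -EINVAL);
	assert(demo_parse_min_young("abc", &v) == -EINVAL);
	assert(v == 256); /* rejected input leaves the old value alone */
}
```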

Re: [v2 PATCH 1/2] scsi:stex.c Support Pegasus 3 product

2016-06-14 Thread kbuild test robot
Hi,

[auto build test WARNING on scsi/for-next]
[also build test WARNING on v4.7-rc3 next-20160609]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Charles-Chiou/scsi-stex-c-Support-Pegasus-3-product/20160614-142621
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/scsi/stex.c: In function 'stex_handshake':
>> drivers/scsi/stex.c:1208:2: warning: 'scratch' may be used uninitialized in this function [-Wmaybe-uninitialized]
 memset(scratch, 0, scratch_size);
 ^~~~
   drivers/scsi/stex.c::10: note: 'scratch' was declared here
 __le32 *scratch;
 ^~~
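
As a general illustration of this class of warning (the relevant stex.c code
is not shown in full here, so this is a hypothetical shape, not the driver's
logic): a pointer assigned on only one branch and used afterwards. Giving it
an explicit NULL initialiser and checking it before use is one common way to
make the intent clear. demo_handshake and its parameters are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * 'scratch' is assigned on only one branch but used afterwards.
 * Initialising it to NULL and checking before use makes the intent
 * explicit; without the initialiser GCC may emit -Wmaybe-uninitialized.
 */
static int demo_handshake(int needs_scratch, unsigned char *buf, size_t len)
{
	unsigned char *scratch = NULL;

	if (needs_scratch)
		scratch = buf;

	/* ... handshake work would go here ... */

	if (scratch)
		memset(scratch, 0, len);
	return 0;
}

/* Usage example: the buffer is cleared only when scratch was set. */
void demo_handshake_selftest(void)
{
	unsigned char buf[4] = { 0xff, 0xff, 0xff, 0xff };

	demo_handshake(0, buf, sizeof(buf));
	assert(buf[0] == 0xff); /* untouched: scratch stayed NULL */
	demo_handshake(1, buf, sizeof(buf));
	assert(buf[0] == 0 && buf[3] == 0);
}
```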

vim +/scratch +1208 drivers/scsi/stex.c

0f3f6ee6 Ed Lin  2009-03-31  1192   msleep(1);
0f3f6ee6 Ed Lin  2009-03-31  1193   }
2a48e931 Charles 2016-06-14  1194   } else {
2a48e931 Charles 2016-06-14  1195   while ((readl(base + MAILBOX_BASE + MAILBOX_HNDSHK_STS)
2a48e931 Charles 2016-06-14  1196   & SS_STS_HANDSHAKE) == 0) {
2a48e931 Charles 2016-06-14  1197   if (time_after(jiffies, before + MU_MAX_DELAY * HZ)) {
2a48e931 Charles 2016-06-14  1198   printk(KERN_ERR DRV_NAME
2a48e931 Charles 2016-06-14  1199   "(%s): no signature after handshake frame\n",
2a48e931 Charles 2016-06-14  1200   pci_name(hba->pdev));
2a48e931 Charles 2016-06-14  1201   ret = -1;
2a48e931 Charles 2016-06-14  1202   break;
2a48e931 Charles 2016-06-14  1203   }
2a48e931 Charles 2016-06-14  1204   rmb();
2a48e931 Charles 2016-06-14  1205   msleep(1);
2a48e931 Charles 2016-06-14  1206   }
2a48e931 Charles 2016-06-14  1207   }
9eb46d2a Ed Lin  2009-09-28 @1208   memset(scratch, 0, scratch_size);
0f3f6ee6 Ed Lin  2009-03-31  1209   msg_h->flag = 0;
2a48e931 Charles 2016-06-14  1210  
0f3f6ee6 Ed Lin  2009-03-31  1211   return ret;
0f3f6ee6 Ed Lin  2009-03-31  1212  }
0f3f6ee6 Ed Lin  2009-03-31  1213  
0f3f6ee6 Ed Lin  2009-03-31  1214  static int stex_handshake(struct st_hba *hba)
0f3f6ee6 Ed Lin  2009-03-31  1215  {
0f3f6ee6 Ed Lin  2009-03-31  1216   int err;

:: The code at line 1208 was first introduced by commit
:: 9eb46d2a08de537e14e92216bf18e7cb541d2f67 [SCSI] stex: add support for reset request from firmware

:: TO: Ed Lin 
:: CC: James Bottomley 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [v2 PATCH 1/2] scsi:stex.c Support Pegasus 3 product

2016-06-14 Thread kbuild test robot
Hi,

[auto build test WARNING on scsi/for-next]
[also build test WARNING on v4.7-rc3 next-20160614]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Charles-Chiou/scsi-stex-c-Support-Pegasus-3-product/20160614-142621
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   In file included from arch/x86/include/asm/string.h:2:0,
from include/linux/string.h:18,
from arch/x86/include/asm/page_32.h:34,
from arch/x86/include/asm/page.h:13,
from arch/x86/include/asm/thread_info.h:11,
from include/linux/thread_info.h:54,
from arch/x86/include/asm/preempt.h:6,
from include/linux/preempt.h:59,
from include/linux/spinlock.h:50,
from include/linux/mmzone.h:7,
from include/linux/gfp.h:5,
from include/linux/slab.h:14,
from drivers/scsi/stex.c:20:
   drivers/scsi/stex.c: In function 'stex_handshake':
>> arch/x86/include/asm/string_32.h:325:29: warning: 'scratch' may be used uninitialized in this function [-Wmaybe-uninitialized]
#define memset(s, c, count) __builtin_memset(s, c, count)
^~~~
   drivers/scsi/stex.c::10: note: 'scratch' was declared here
 __le32 *scratch;
 ^~~

vim +/scratch +325 arch/x86/include/asm/string_32.h

^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  309  
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  310  #undef COMMON
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  311  }
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  312  
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  313  #define __constant_c_x_memset(s, c, count) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  314  (__builtin_constant_p(count) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  315   ? __constant_c_and_count_memset((s), (c), (count)) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  316   : __constant_c_memset((s), (c), (count)))
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  317  
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  318  #define __memset(s, c, count) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  319  (__builtin_constant_p(count) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  320   ? __constant_count_memset((s), (c), (count)) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  321   : __memset_generic((s), (c), (count)))
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  322  
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  323  #define __HAVE_ARCH_MEMSET
ff60fab7 arch/x86/include/asm/string_32.h Arjan van de Ven 2009-09-28  324  #if (__GNUC__ >= 4)
ff60fab7 arch/x86/include/asm/string_32.h Arjan van de Ven 2009-09-28 @325  #define memset(s, c, count) __builtin_memset(s, c, count)
ff60fab7 arch/x86/include/asm/string_32.h Arjan van de Ven 2009-09-28  326  #else
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  327  #define memset(s, c, count) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  328  (__builtin_constant_p(c) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  329   ? __constant_c_x_memset((s), (0x01010101UL * (unsigned char)(c)), \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  330   (count)) \
78d64fc2 include/asm-x86/string_32.h      Joe Perches      2008-05-12  331   : __memset((s), (c), (count)))
ff60fab7 arch/x86/include/asm/string_32.h Arjan van de Ven 2009-09-28  332  #endif
^1da177e include/asm-i386/string.h        Linus Torvalds   2005-04-16  333  

:: The code at line 325 was first introduced by commit
:: ff60fab71bb3b4fdbf8caf57ff3739ffd0887396 x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy

:: TO: 

Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices

2016-06-14 Thread Dan Williams
On Mon, Jun 13, 2016 at 5:02 PM, Dan Williams  wrote:
> On Mon, Jun 13, 2016 at 4:59 PM, Kani, Toshimitsu  wrote:
>> On Mon, 2016-06-13 at 16:18 -0700, Dan Williams wrote:
>>> Thanks Toshi!
>>>
>>> On Mon, Jun 13, 2016 at 3:21 PM, Toshi Kani  wrote:
>>> >
>>> > This patch-set adds DAX support to device-mapper dm-linear devices
>>> > used by LVM.  It works with LVM commands as follows:
>>> >  - Creation of a logical volume with all DAX capable devices (such
>>> >as pmem) sets the logical volume DAX capable as well.
>>> >  - Once a logical volume is set to DAX capable, the volume may not
>>> >be extended with non-DAX capable devices.
>>>
>>> I don't mind this, but it seems a policy decision that the kernel does
>>> not need to make.  A sufficiently sophisticated user could cope with
>>> DAX being available at varying LBAs.  Would it be sufficient to move
>>> this policy decision to userspace tooling?
>>
>> I think this is a kernel restriction.  When a block device is declared as
>> DAX capable, it should mean that the whole device is DAX capable.  So, I
>> think we need to assure the same to a mapped device.
>
> Hmm, but we already violate this with badblocks.  The device is DAX
> capable, but certain LBAs will return an error if direct_access is
> attempted.

Never mind, for this to be useful we would need to fall back to regular
mmap for a portion of the linear span.  That's different from the
badblocks case.
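
The all-or-nothing policy from the cover letter can be written down as a
short sketch. The types and function names (demo_dev,
demo_volume_dax_capable, demo_may_extend) are hypothetical and do not
correspond to device-mapper interfaces; treating an empty volume as not DAX
capable is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one underlying device. */
struct demo_dev {
	bool dax_capable;
};

/* A linear volume is DAX capable only if every target device is. */
static bool demo_volume_dax_capable(const struct demo_dev *devs, size_t n)
{
	if (n == 0)
		return false;
	for (size_t i = 0; i < n; i++)
		if (!devs[i].dax_capable)
			return false;
	return true;
}

/* Extending a DAX volume with a non-DAX device is refused. */
static bool demo_may_extend(bool volume_is_dax, const struct demo_dev *new_dev)
{
	return !volume_is_dax || new_dev->dax_capable;
}

/* Usage example: two pmem-like devices, then an attempt to add a plain disk. */
void demo_dax_selftest(void)
{
	struct demo_dev pmem[2] = { { true }, { true } };
	struct demo_dev disk = { false };
	bool dax = demo_volume_dax_capable(pmem, 2);

	assert(dax);
	assert(!demo_may_extend(dax, &disk));
	assert(demo_may_extend(dax, &pmem[0]));
}
```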


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Geert Uytterhoeven
Hi Joonsoo,

On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  wrote:
> On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
>> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
>> > From: Joonsoo Kim 
>> > To check whether free objects exist or not precisely, we need to grab a
>> > lock.  But, accuracy isn't that important because the race window would be
>> > small and if there are too many free objects, the cache reaper would reap
>> > them.  So, this patch makes the check for free object existence not
>> > hold a lock.  This will reduce lock contention in the heavy allocation case.
>> >
>> > Note that until now, n->shared can be freed during the processing by
>> > writing slabinfo, but, with some trick in this patch, we can access it
>> > freely within interrupt disabled period.
>> >
>> > Below is the result of concurrent allocation/free in slab allocation
>> > benchmark made by Christoph a long time ago.  I make the output simpler.
>> > The number shows cycle count during alloc/free respectively so less is
>> > better.
>> >
>> > * Before
>> > Kmalloc N*alloc N*free(32): Average=248/966
>> > Kmalloc N*alloc N*free(64): Average=261/949
>> > Kmalloc N*alloc N*free(128): Average=314/1016
>> > Kmalloc N*alloc N*free(256): Average=741/1061
>> > Kmalloc N*alloc N*free(512): Average=1246/1152
>> > Kmalloc N*alloc N*free(1024): Average=2437/1259
>> > Kmalloc N*alloc N*free(2048): Average=4980/1800
>> > Kmalloc N*alloc N*free(4096): Average=9000/2078
>> >
>> > * After
>> > Kmalloc N*alloc N*free(32): Average=344/792
>> > Kmalloc N*alloc N*free(64): Average=347/882
>> > Kmalloc N*alloc N*free(128): Average=390/959
>> > Kmalloc N*alloc N*free(256): Average=393/1067
>> > Kmalloc N*alloc N*free(512): Average=683/1229
>> > Kmalloc N*alloc N*free(1024): Average=1295/1325
>> > Kmalloc N*alloc N*free(2048): Average=2513/1664
>> > Kmalloc N*alloc N*free(4096): Average=4742/2172
>> >
>> > It shows that allocation performance decreases for the object size up to
>> > 128 and it may be due to extra checks in cache_alloc_refill().  But, with
>> > considering improvement of free performance, net result looks the same.
>> > Result for other size class looks very promising, roughly, 50% performance
>> > improvement.
>> >
>> > v2: replace kick_all_cpus_sync() with synchronize_sched().
>> >
>> > Signed-off-by: Joonsoo Kim 
>>
>> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
>> (Renesas dual Cortex A9) to this patch, which is upstream commit
>> 801faf0db8947e01877920e848a4d338dd7a99e7.
>>
>> I've attached my .config. I don't know if it also happens with
>> shmobile_defconfig, as something went wrong with my remote access to the board,
>> preventing further testing. I also couldn't verify if the issue persists in
>> v4.7-rc3.

In the meantime, I've verified it also happens with shmobile_defconfig.

>>
>> Do you have a clue?
>
> I don't have one yet. Could you help me to narrow down the problem?
> Following diff is half-revert change to check that synchronize_sched()
> has no problem.

Thanks!

Unfortunately the half revert is not sufficient. The full revert is.

> ->8-
> diff --git a/mm/slab.c b/mm/slab.c
> index 763096a..257a0eb 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3016,9 +3016,6 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
> n = get_node(cachep, node);
>
> BUG_ON(ac->avail > 0 || !n);
> -   shared = READ_ONCE(n->shared);
> -   if (!n->free_objects && (!shared || !shared->avail))
> -   goto direct_grow;
>
> spin_lock(&n->list_lock);
> shared = READ_ONCE(n->shared);
> @@ -3047,7 +3044,6 @@ alloc_done:
> spin_unlock(&n->list_lock);
> fixup_objfreelist_debug(cachep, &list);
>
> -direct_grow:
> if (unlikely(!ac->avail)) {
> /* Check if we can use obj in pfmemalloc slab */
> if (sk_memalloc_socks()) {
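
The lines removed by this half-revert are the unlocked early check. A
single-threaded model of that idea is sketched below: peek at the counters
without the lock and, if the node looks empty, skip the lock and go straight
to growing the cache. The structure and names (demo_node, demo_refill) are
made up, and the spinlock is replaced by a counter so the effect is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a per-node slab cache. */
struct demo_node {
	unsigned long free_objects;      /* objects on the node's free lists */
	unsigned long shared_avail;      /* objects in the shared array cache */
	unsigned long lock_acquisitions; /* counts simulated lock takes */
};

/*
 * Refill path. Returns true if an object was taken from the node,
 * false if the caller must grow the cache.
 */
static bool demo_refill(struct demo_node *n, bool lockless_check)
{
	/* Racy peek: a stale answer only costs an unneeded grow or lock. */
	if (lockless_check && !n->free_objects && !n->shared_avail)
		return false; /* corresponds to "goto direct_grow" */

	n->lock_acquisitions++; /* stands in for spin_lock(&n->list_lock) */
	if (n->shared_avail) {
		n->shared_avail--;
		return true;
	}
	if (n->free_objects) {
		n->free_objects--;
		return true;
	}
	return false;
}

/* Usage example: an empty node is handled without touching the lock. */
void demo_refill_selftest(void)
{
	struct demo_node empty = { 0, 0, 0 };

	assert(!demo_refill(&empty, true));
	assert(empty.lock_acquisitions == 0);
	assert(!demo_refill(&empty, false));
	assert(empty.lock_acquisitions == 1);
}
```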

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 3/3] x86/quirks: Add early quirk to reset Apple AirPort card

2016-06-14 Thread Ingo Molnar

* Matt Fleming  wrote:

> On Sun, 12 Jun, at 02:37:26PM, Lukas Wunner wrote:
> > 
> > Resetting the card solves the problem at the root and fixes both,
> > the spurious interrupts and the memory corruption. 
> 
> It also avoids the need to figure out exactly which Boot Services
> regions may have become corrupt.
> 
> This would be necessary since you can't keep all Boot Services regions
> reserved because those regions can add up to be many gigabytes in
> size, even on relatively low-end laptops that don't have huge amounts
> of RAM to begin with.
> 
> Freeing Boot Services regions is pretty important.

Ok!

Thanks,

Ingo


Re: [PATCH 1/9] of: Add a new macro to declare_of for one parameter function returning a value

2016-06-14 Thread Daniel Lezcano

On 06/01/2016 10:34 AM, Daniel Lezcano wrote:

The macro OF_DECLARE_1 expect a void (*func)(struct device_node *) while the
OF_DECLARE_2 expect a int (*func)(struct device_node *, struct device_node *).

The second one allows to pass an init function returning a value, which make
possible to call the functions in the table and check the return value in order
to catch at a higher level the errors and handle them from there instead of
doing a panic in each driver (well at least this is the case for the clkevt).

Unfortunately the OF_DECLARE_1 does not allow that and that lead to some code
duplication and crappyness in the drivers.

The OF_DECLARE_1 is used by all the clk drivers and the clocksource/clockevent
drivers. It is not possible to do the change in one shot as we have to change
all the init functions.

The OF_DECLARE_2 specifies an init function prototype with two parameters with
the node and its parent. The latter won't be used, ever, in the timer drivers.

Introduce an OF_DECLARE_1_RET macro to be used, and hopefully we can smoothly
and iteratively change the users of OF_DECLARE_1 to use the new macro instead.

Signed-off-by: Daniel Lezcano 
---


Rob, Grant,

do you agree with this change ?
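
To make the intent concrete, here is a rough userspace sketch, not the actual
OF code. It only shows why an init function that returns a value lets the
caller handle errors in one place. The names struct node, init_fn_ret,
timer_a_init, timer_b_init and run_inits are made up for the illustration;
struct node merely stands in for struct device_node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	const char *name;
};

/* Init function with one parameter that returns a value. */
typedef int (*init_fn_ret)(struct node *);

static int timer_a_init(struct node *np)
{
	(void)np;
	return 0;		/* success */
}

static int timer_b_init(struct node *np)
{
	(void)np;
	return -ENODEV;		/* report the failure instead of panicking */
}

/*
 * Call every init function in the table and count the failures,
 * so the error policy lives in one place rather than in each driver.
 */
static int run_inits(const init_fn_ret *table, size_t n, struct node *np)
{
	int failed = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i](np) < 0)
			failed++;

	return failed;
}
```

With a void init function the loop in run_inits() would have nothing to
check, which is the limitation described above.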

Thanks.

  -- Daniel

--
  Linaro.org │ Open source software for ARM SoCs




Re: [PATCH v6 3/6] crypto: AF_ALG -- add asymmetric cipher interface

2016-06-14 Thread Andrew Zaborowski
Hi Stephan,

On 14 June 2016 at 07:12, Stephan Mueller  wrote:
> Am Dienstag, 14. Juni 2016, 00:16:11 schrieb Andrew Zaborowski:
>> On 8 June 2016 at 21:14, Mat Martineau
>>
>>  wrote:
>> > On Wed, 8 Jun 2016, Stephan Mueller wrote:
>> >> What is your concern?
>> >
>> > Userspace must allocate larger buffers than it knows are necessary for
>> > expected results.
>> >
>> > It looks like the software rsa implementation handles shorter output
>> > buffers ok (mpi_write_to_sgl will return EOVERFLOW if the buffer is
>> > too small), however I see at least one hardware rsa driver that requires
>> > the output buffer to be the maximum size. But this inconsistency might be
>> > best addressed within the software cipher or drivers rather than in
>> > recvmsg.
>> Should the hardware drivers fix this instead?  I've looked at the qat
>> and caam drivers, they both require the destination buffer size to be
>> the key size and in both cases there would be no penalty for dropping
>> this requirement as far as I see.  Both do a memmove if the result
>> ends up being shorter than key size.  In case the caller knows it is
>> expecting a specific output size, the driver will have to use a self
>> allocated buffer + a memcpy in those same cases where it would later
>> use memmove instead.  Alternatively the sg passed to dma_map_sg can be
>> prepended with a dummy segment the right size to save the memcpy.
>>
>> akcipher.h only says:
>> @dst_len: Size of the output buffer. It needs to be at least as big as
>> the expected result depending on the operation
>>
>> Note that for random input data the memmove will be done about 1 in
>> 256 times but with PKCS#1 padding the signature always has a leading
>> zero.
>>
>> Requiring buffers bigger than needed makes the added work of dropping
>> the zero bytes from the sglist and potentially re-adding them in the
>> client difficult to justify.  RSA doing this sets a precedent for a
>> future pkcs1pad (or other algorithm) implementation to do the same
>> thing and a portable client having to always know the key size and use
>> key-sized buffers.
>
> I think we have agreed on dropping the length enforcement at the interface
> level.

Separately from this there's a problem with the user being unable to
know if the algorithm is going to fail because of destination buffer
size != key size (including kernel users).  For RSA, the qat
implementation will fail while the software implementation won't.  For
pkcs1pad(...) there's currently just one implementation but the user
can't assume that.
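
As a small userspace sketch of the behaviour I would expect from every
implementation (this is not kernel code, and copy_result is a made-up name):
leading zero bytes are dropped, and the call only fails with -EOVERFLOW when
the stripped result really does not fit, regardless of the key size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a big-endian result of up to src_len bytes
 * into dst, dropping leading zero bytes first. Returns the number of
 * bytes written, or -EOVERFLOW if dst is too small for the stripped
 * result. dst_len does not have to equal the key size.
 */
static long copy_result(unsigned char *dst, size_t dst_len,
			const unsigned char *src, size_t src_len)
{
	while (src_len > 0 && *src == 0) {
		src++;
		src_len--;
	}

	if (src_len > dst_len)
		return -EOVERFLOW;

	memcpy(dst, src, src_len);
	return (long)src_len;
}
```

A driver that insists on dst_len == key size would reject the first call in
a case like "4-byte result with one leading zero, 3-byte buffer", while the
sketch above accepts it.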

Best regards


Re: [PATCH] sched: unlikely corrupted stack end

2016-06-14 Thread Ingo Molnar

* WANG Chao  wrote:

> unlikely() was dropped in commit ce03e41 ("sched/core: Drop unlikely
> behind BUG_ON()"), but commit 29d6455 ("sched: panic on corrupted stack
> end") dropped BUG_ON() and called panic directly.
> 
> Now we should bring unlikely() back for branch prediction.
> 
> Signed-off-by: WANG Chao 
> ---
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 017d539..7db442c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3170,7 +3170,7 @@ static noinline void __schedule_bug(struct task_struct 
> *prev)
>  static inline void schedule_debug(struct task_struct *prev)
>  {
>  #ifdef CONFIG_SCHED_STACK_END_CHECK
> - if (task_stack_end_corrupted(prev))
> + if (unlikely(task_stack_end_corrupted(prev)))
>   panic("corrupted stack end detected inside scheduler\n");
>  #endif

It would be better and cleaner to push that into the task_stack_end_corrupted() 
definition. (and to turn it into an inline function while we are touching it.)
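
Something along these lines, as a userspace sketch only: struct task is a
made-up stand-in for struct task_struct, and the magic value is defined
locally for the example rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace equivalent of the kernel's unlikely() hint (GCC/Clang). */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Assumed magic value for illustration; the kernel uses STACK_END_MAGIC. */
#define SKETCH_STACK_END_MAGIC	0x57AC6E9DUL

/* Hypothetical stand-in for struct task_struct. */
struct task {
	unsigned long *stack_end;
};

/* The branch hint lives in the helper, so callers need no unlikely(). */
static inline bool task_stack_end_corrupted(const struct task *t)
{
	return unlikely(*t->stack_end != SKETCH_STACK_END_MAGIC);
}
```

The call site in schedule_debug() could then stay a plain
if (task_stack_end_corrupted(prev)).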

Thanks,

Ingo


Re: [PATCH v8 2/3] CMDQ: Mediatek CMDQ driver

2016-06-14 Thread Horng-Shyang Liao
Hi Matthias,

On Wed, 2016-06-08 at 17:35 +0200, Matthias Brugger wrote:
> 
> On 08/06/16 14:25, Horng-Shyang Liao wrote:
> > Hi Matthias,
> >
> > On Wed, 2016-06-08 at 12:45 +0200, Matthias Brugger wrote:
> >>
> >> On 08/06/16 07:40, Horng-Shyang Liao wrote:
> >>> Hi Matthias,
> >>>
> >>> On Tue, 2016-06-07 at 18:59 +0200, Matthias Brugger wrote:
> 
>  On 03/06/16 15:11, Matthias Brugger wrote:
> >
> >
>  [...]
> 
> >> +
> >> +smp_mb(); /* modify jump before enable thread */
> >> +}
> >> +
> >> +cmdq_thread_writel(thread, task->pa_base +
> >> task->command_size,
> >> +   CMDQ_THR_END_ADDR);
> >> +cmdq_thread_resume(thread);
> >> +}
> >> +list_move_tail(&task->list_entry, 
> >> &thread->task_busy_list);
> >> +spin_unlock_irqrestore(&cmdq->exec_lock, flags);
> >> +}
> >> +
> >> +static void cmdq_handle_error_done(struct cmdq *cmdq,
> >> +   struct cmdq_thread *thread, u32 irq_flag)
> >> +{
> >> +struct cmdq_task *task, *tmp, *curr_task = NULL;
> >> +u32 curr_pa;
> >> +struct cmdq_cb_data cmdq_cb_data;
> >> +bool err;
> >> +
> >> +if (irq_flag & CMDQ_THR_IRQ_ERROR)
> >> +err = true;
> >> +else if (irq_flag & CMDQ_THR_IRQ_DONE)
> >> +err = false;
> >> +else
> >> +return;
> >> +
> >> +curr_pa = cmdq_thread_readl(thread, CMDQ_THR_CURR_ADDR);
> >> +
> >> +list_for_each_entry_safe(task, tmp, 
> >> &thread->task_busy_list,
> >> + list_entry) {
> >> +if (curr_pa >= task->pa_base &&
> >> +curr_pa < (task->pa_base + task->command_size))
> >
> > > What are you checking here? It seems as if you make some implicit
> > > assumptions about pa_base and the order of execution of
> > > commands in the
> > > thread. Is it safe to do so? Does dma_alloc_coherent give any
> > > guarantees
> > > about dma_handle?
> 
>  1. Check what is the current running task in this GCE thread.
>  2. Yes.
>  3. Yes, CMDQ doesn't use iommu, so physical address is 
>  continuous.
> 
> >>>
> >>> Yes, physical addresses might be continuous, but AFAIK there is no
> >>> guarantee that the dma_handle address is steadily growing, when
> >>> calling
> >>> dma_alloc_coherent. And if I understand the code correctly, you
> >>> use this
> >>> assumption to decide if the task picked from task_busy_list is
> >>> currently
> >>> executing. So I think this mechanism is not working.
> >>
> >> I don't use dma_handle address, and just use physical addresses.
> >>  From CPU's point of view, tasks are linked by the busy list.
> >>  From GCE's point of view, tasks are linked by the JUMP 
> >> command.
> >>
> >>> In which cases does the HW thread raise an interrupt.
> >>> In case of error. When does CMDQ_THR_IRQ_DONE get raised?
> >>
> >> GCE will raise interrupt if any task is done or error.
> >> However, GCE is fast, so CPU may get multiple done tasks
> >> when it is running ISR.
> >>
> >> In case of error, that GCE thread will pause and raise interrupt.
> >> So, CPU may get multiple done tasks and one error task.
> >>
> >
> > I think we should reimplement the ISR mechanism. Can't we just read
> > CURR_IRQ_STATUS and THR_IRQ_STATUS in the handler and leave
> > cmdq_handle_error_done to the thread_fn? You will need to pass
> > information from the handler to thread_fn, but that shouldn't be an
> > issue. AFAIK interrupts are disabled in the handler, so we should 
> > stay
> > there as short as possible. Traversing task_busy_list is expensive, 
> > so
> > we need to do it in a thread context.
> 
>  Actually, our initial implementation is similar to your suggestion,
>  but display needs CMDQ to return callback function very precisely,
>  else display will drop frame.
>  For display, CMDQ interrupt will be raised every 16 ~ 17 ms,
>  and CMDQ needs to call callback function in ISR.
>  If we defer callback to workqueue, the time interval may be larger 
>  than
>  32 ms sometimes.
> 
> >>>
> >>> I think the problem is, that you implemented the w

Re: [PATCH 5/7] ARM: OMAP: dmtimer: Do not call PM runtime functions when not needed.

2016-06-14 Thread Tony Lindgren
* Ivaylo Dimitrov  [160613 12:01]:
> Hi,
> 
> On 13.06.2016 10:10, Tony Lindgren wrote:
> > * Ivaylo Dimitrov  [160610 14:23]:
> > > 
> > > On 10.06.2016 13:22, Tony Lindgren wrote:
> > > > 
> > > > OK. And I just applied the related dts changes. Please repost the driver
> > > > changes and DT binding doc with Rob's ack to the driver maintainers to
> > > > apply.
> > > > 
> > > 
> > > Already did, see https://lkml.org/lkml/2016/5/16/429
> > > 
> > > Shall I do anything else?
> > 
> > Probably good idea to repost just the driver changes to the
> > subsystem maintainers. With v4.7 out any pre v4.7 patchsets
> > easily get forgotten.
> > 
> 
> Sorry for the maybe stupid question, but does this mean that I should send
> separate patches instead of series? Or the series without what you've
> already applied?

Always leave out the patches that have been already applied.
Otherwise people will get confused. Just mention it in the
cover letter saying "patch xyz has been already applied into
foo tree, these patches are safe to apply separately into bar
tree" or something similar :)

Tony


Kconfig error: Missing dependency for MEMSTICK_UNSAFE_RESUME

2016-06-14 Thread Sascha El-Sharkawy
Dear Kernel developers,

we detected a missing dependency in the Kconfig model: Memstick support for
power management (MEMSTICK_UNSAFE_RESUME) can be configured even if Power
Management (PM) is disabled. We suggest adding a "depends on" constraint to
the Kconfig model for semantic correctness and to simplify the configuration
process by not offering unnecessary configuration options (patch is attached
to this mail).
Without this dependency, a user configuring the kernel is not able to detect
the divergence between the desired and the real behavior of the kernel.
We also wrote a Bugzilla report for more details: 
https://bugzilla.kernel.org/show_bug.cgi?id=116871

Sincerely yours,
Sascha El-Sharkawy

-- ---
Sascha El-Sharkawy, MSc 
University of Hildesheim        Tel.: +49 (0) 5121 / 883-40336
Institute of Computer Science   Fax:  +49 (0) 5121 / 883-40337
Universitaetsplatz 1            els...@sse.uni-hildesheim.de
D-31141 Hildesheim, Germany     http://www.sse.uni-hildesheim.de



MEMSTICK_UNSAFE_RESUME.patch
Description: MEMSTICK_UNSAFE_RESUME.patch


[PATCH] [RESEND] drivers/base dmam_declare_coherent_memory leaks

2016-06-14 Thread Vyacheslav V. Yurkov
dmam_declare_coherent_memory doesn't take into account the return
value of dma_declare_coherent_memory, which leads to incorrect resource
handling

Signed-off-by: Vyacheslav V. Yurkov 
---
 drivers/base/dma-mapping.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index d799662..f5d2132 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -198,10 +198,13 @@ int dmam_declare_coherent_memory(struct device *dev, 
phys_addr_t phys_addr,
 
rc = dma_declare_coherent_memory(dev, phys_addr, device_addr, size,
 flags);
-   if (rc == 0)
+   if (rc) {
devres_add(dev, res);
-   else
+   rc = 0;
+   } else {
devres_free(res);
+   rc = -ENOMEM;
+   }
 
return rc;
 }
-- 
1.8.4



[rfc patch] sched/fair: Use instantaneous load for fork/exec balancing

2016-06-14 Thread Mike Galbraith
SUSE's regression testing noticed that...

0905f04eb21f sched/fair: Fix new task's load avg removed from source CPU in 
wake_up_new_task()

...introduced a hackbench regression, and indeed it does.  I think this
regression has more to do with randomness than anything else, but in
general...

While averaging calms down load balancing, helping to keep migrations
down to a dull roar, it's not completely wonderful when it comes to
things that live in the here and now, hackbench being one such.

time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'

real    0m55.397s
user    0m8.320s
sys     5m40.789s

echo LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features

real    0m48.049s
user    0m6.510s
sys     5m6.291s
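
As a toy illustration of why an averaged figure can lag behind reality right
after a burst of forks, here is a userspace sketch. It is deliberately not
the kernel's load tracking: struct rq_sketch, update_avg and rq_load are
made up, and the "move halfway each period" decay is an arbitrary
assumption chosen to keep the arithmetic obvious.

```c
#include <assert.h>

enum load_kind { LOAD_AVERAGE, LOAD_INSTANT };

/* Hypothetical, much simplified run queue: not the kernel's struct cfs_rq. */
struct rq_sketch {
	unsigned long weight;	/* instantaneous load */
	unsigned long avg;	/* decayed average load */
};

/* Toy average: move halfway towards the current weight each period. */
static void update_avg(struct rq_sketch *rq)
{
	if (rq->weight >= rq->avg)
		rq->avg += (rq->weight - rq->avg) / 2;
	else
		rq->avg -= (rq->avg - rq->weight) / 2;
}

/* Pick the figure the balancer looks at, as the avg argument does below. */
static unsigned long rq_load(const struct rq_sketch *rq, enum load_kind kind)
{
	return kind == LOAD_INSTANT ? rq->weight : rq->avg;
}
```

In this model a queue that just received two new tasks still reports an
average of zero until update_avg() has run, so a fork balancer reading the
average would happily pile more tasks onto it.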

Signed-off-by: Mike Galbraith 
---
 kernel/sched/fair.c |   54 
 kernel/sched/features.h |1 
 kernel/sched/sched.h|6 +
 3 files changed, 35 insertions(+), 26 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -738,7 +738,7 @@ void post_init_entity_util_avg(struct sc
}
 }
 
-static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq);
+static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq, 
int avg);
 static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq);
 #else
 void init_entity_runnable_average(struct sched_entity *se)
@@ -1229,9 +1229,9 @@ bool should_numa_migrate_memory(struct t
   group_faults_cpu(ng, src_nid) * group_faults(p, dst_nid) * 4;
 }
 
-static unsigned long weighted_cpuload(const int cpu);
-static unsigned long source_load(int cpu, int type);
-static unsigned long target_load(int cpu, int type);
+static unsigned long weighted_cpuload(const int cpu, int avg);
+static unsigned long source_load(int cpu, int type, int avg);
+static unsigned long target_load(int cpu, int type, int avg);
 static unsigned long capacity_of(int cpu);
 static long effective_load(struct task_group *tg, int cpu, long wl, long wg);
 
@@ -1261,7 +1261,7 @@ static void update_numa_stats(struct num
struct rq *rq = cpu_rq(cpu);
 
ns->nr_running += rq->nr_running;
-   ns->load += weighted_cpuload(cpu);
+   ns->load += weighted_cpuload(cpu, LOAD_AVERAGE);
ns->compute_capacity += capacity_of(cpu);
 
cpus++;
@@ -3102,8 +3102,10 @@ void remove_entity_load_avg(struct sched
atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
 }
 
-static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
+static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq, 
int avg)
 {
+   if (sched_feat(LB_INSTANTANEOUS_LOAD) && avg == LOAD_INSTANT)
+   return cfs_rq->load.weight;
return cfs_rq->runnable_load_avg;
 }
 
@@ -4701,9 +4703,9 @@ static void cpu_load_update(struct rq *t
 }
 
 /* Used instead of source_load when we know the type == 0 */
-static unsigned long weighted_cpuload(const int cpu)
+static unsigned long weighted_cpuload(const int cpu, int avg)
 {
-   return cfs_rq_runnable_load_avg(&cpu_rq(cpu)->cfs);
+   return cfs_rq_runnable_load_avg(&cpu_rq(cpu)->cfs, avg);
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
@@ -4748,7 +4750,7 @@ static void cpu_load_update_idle(struct
/*
 * bail if there's load or we're actually up-to-date.
 */
-   if (weighted_cpuload(cpu_of(this_rq)))
+   if (weighted_cpuload(cpu_of(this_rq), LOAD_AVERAGE))
return;
 
cpu_load_update_nohz(this_rq, READ_ONCE(jiffies), 0);
@@ -4769,7 +4771,7 @@ void cpu_load_update_nohz_start(void)
 * concurrently we'll exit nohz. And cpu_load write can race with
 * cpu_load_update_idle() but both updater would be writing the same.
 */
-   this_rq->cpu_load[0] = weighted_cpuload(cpu_of(this_rq));
+   this_rq->cpu_load[0] = weighted_cpuload(cpu_of(this_rq), LOAD_AVERAGE);
 }
 
 /*
@@ -4784,7 +4786,7 @@ void cpu_load_update_nohz_stop(void)
if (curr_jiffies == this_rq->last_load_update_tick)
return;
 
-   load = weighted_cpuload(cpu_of(this_rq));
+   load = weighted_cpuload(cpu_of(this_rq), LOAD_AVERAGE);
raw_spin_lock(&this_rq->lock);
update_rq_clock(this_rq);
cpu_load_update_nohz(this_rq, curr_jiffies, load);
@@ -4810,7 +4812,7 @@ static void cpu_load_update_periodic(str
  */
 void cpu_load_update_active(struct rq *this_rq)
 {
-   unsigned long load = weighted_cpuload(cpu_of(this_rq));
+   unsigned long load = weighted_cpuload(cpu_of(this_rq), LOAD_AVERAGE);
 
if (tick_nohz_tick_stopped())
cpu_load_update_nohz(this_rq, READ_ONCE(jiffies), load);
@@ -4825,10 +4827,10 @@ void cpu_load_update_active(struct rq *t
  * We want to under-estimate the load of migration sources, to
  * balance conservatively.
  */
-static unsigned long source_load(int cpu, int type)
+static unsigned long source_l

Re: [PATCH] mm/zsmalloc: add trace events for zs_compact

2016-06-14 Thread Sergey Senozhatsky
On (06/13/16 15:49), Ganesh Mahendran wrote:
[..]
> > some parts (of the info above) are already available: zram maps to
> > pool name, which maps to a sysfs file name, that can contain the rest.
> > I'm just trying to understand what kind of optimizations we are talking
> > about here and how would timings help... compaction can spin on class
> > lock, for example, if the device in question is busy, etc. etc. on the
> > other hand we have a per-class info in zsmalloc pool stats output, so
> > why not extend it instead of introducing a new debugging interface?
> 
> I've considered adding new interface in /sys/../zsmalloc/ or using
> trace_mm_shrink_slab_[start/end] to get such information.
> But none of them can cover all the cases:
> 1) distinguish which zs pool is compacted.
> 2) freed pages of zs_compact(), total freed pages of zs_compact()
> 3) realtime log printed

I'm not against the patch in general, just curious, do you have any
specific optimization in mind? if so, can we start with that optimization
then, otherwise, can we define what type of optimizations this tracing
will boost?

what I'm thinking of, we have a zsmalloc debugfs file, which provides
per-device->per-pool->per-class stats:

cat /sys/kernel/debug/zsmalloc/zram0/classes
 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable

so the 'missing' thing is just one column, right? the total freed
pages number is already accounted.

thoughts?

-ss


Re: [PATCH 2/2 v16] drm/bridge: Add I2C based driver for ps8640 bridge

2016-06-14 Thread Daniel Kurtz
Hi Jitao,

On Thu, Jun 2, 2016 at 5:57 PM, Jitao Shi  wrote:
>
> This patch adds drm_bridge driver for parade DSI to eDP bridge chip.
>
> Signed-off-by: Jitao Shi 
> Reviewed-by: Daniel Kurtz 
> ---
> Changes since v15:
>  - Drop drm_connector_(un)register calls from parade ps8640.
>The main DRM driver mtk_drm_drv now calls
>drm_connector_register_all() after drm_dev_register() in the
>mtk_drm_bind() function. That function should iterate over all
>connectors and call drm_connector_register() for each of them.
>So, remove drm_connector_(un)register calls from parade ps8640.

[snip...]

> +static void ps8640_pre_enable(struct drm_bridge *bridge)
> +{
> +   struct ps8640 *ps_bridge = bridge_to_ps8640(bridge);
> +   struct i2c_client *client = ps_bridge->page[2];
> +   int err;
> +   u8 set_vdo_done;
> +   ktime_t timeout;
> +
> +   if (ps_bridge->in_fw_update)
> +   return;
> +
> +   if (ps_bridge->enabled)
> +   return;
> +
> +   err = drm_panel_prepare(ps_bridge->panel);
> +   if (err < 0) {
> +   DRM_ERROR("failed to prepare panel: %d\n", err);
> +   return;
> +   }
> +

(1) For patch v10, Philipp Zabel commented that gpio_slp_n &
gpio_rst_n are both active low, and that the device tree should
contain a reset-gpios property with the GPIO_ACTIVE_LOW flag set.
(2) However, you did change the reset logic from v10 -> v11, but
it isn't clear why (nor mentioned in the patch notes).

v10 (https://patchwork.kernel.org/patch/8357851/) had:

+ gpiod_set_value(ps_bridge->gpio_slp_n, 1);
+ gpiod_set_value(ps_bridge->gpio_rst_n, 0);
+
+ err = regulator_bulk_enable(ARRAY_SIZE(ps_bridge->supplies),
+ps_bridge->supplies);
+ if (err < 0) {
+ DRM_ERROR("cannot enable regulators %d\n", err);
+ goto err_panel_unprepare;
+ }
+
+ usleep_range(500, 700);
+ gpiod_set_value(ps_bridge->gpio_rst_n, 1);

In other words:
  (a) de-assert power down
  (b) assert reset
  (c) enable 1.2 & 3.3 regulators
  (d)  (aka regulator-ramp-delay in device-tree)
  (e) wait an additional 2 ms (as requested by Parade for ps8640 to stabilize?)
  (f) de-assert reset
  (g) wait 200 ms (for ps8640 FW to load?)

This mostly made sense to me, except for step (a)... I'm not sure why
you de-assert power down before enabling the regulators.  It seems
like you'd want to do that later, maybe after reset (can you ask
Paradetech?).

Now (as of v11) it has changed to:

> +   err = regulator_bulk_enable(ARRAY_SIZE(ps_bridge->supplies),
> +   ps_bridge->supplies);
> +   if (err < 0) {
> +   DRM_ERROR("cannot enable regulators %d\n", err);
> +   goto err_panel_unprepare;
> +   }
> +
> +   gpiod_set_value(ps_bridge->gpio_slp_n, 1);
> +   gpiod_set_value(ps_bridge->gpio_rst_n, 0);
> +   usleep_range(2000, 2500);
> +   gpiod_set_value(ps_bridge->gpio_rst_n, 1);

Two additional comments:

(3) if you correctly configure these gpios as GPIO_ACTIVE_LOW, you can
drop the "_n" suffix, which will make the driver code easier to
understand.
(4) "gpio_slp_n" is called "PD#" by the PS8640 datasheet, so a better
name might be: "gpio_power_down".
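
To illustrate (3) with a userspace sketch (struct gpio_sketch and
gpio_sketch_set_value are made-up names, not the gpiod API): once the
polarity is recorded with the line, the driver only talks about "asserted"
or "de-asserted" and never about the physical level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a GPIO descriptor; not the kernel's struct gpio_desc. */
struct gpio_sketch {
	bool active_low;	/* set when the DT flags the line GPIO_ACTIVE_LOW */
	int level;		/* physical line level: 0 or 1 */
};

/* Set the logical value: 1 means "asserted", whatever the polarity. */
static void gpio_sketch_set_value(struct gpio_sketch *gpio, int asserted)
{
	gpio->level = gpio->active_low ? !asserted : !!asserted;
}
```

With the real descriptor API the effect should be the same:
gpiod_set_value(reset, 1) asserts reset and drives the pin low when the
line is flagged GPIO_ACTIVE_LOW, so the "_n" suffix carries no extra
information.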

Thanks,
-Dan


Re: Boot failure on emev2/kzm9d (was: Re: [PATCH v2 11/11] mm/slab: lockless decision to grow cache)

2016-06-14 Thread Joonsoo Kim
On Tue, Jun 14, 2016 at 09:31:23AM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Tue, Jun 14, 2016 at 8:24 AM, Joonsoo Kim  wrote:
> > On Mon, Jun 13, 2016 at 09:43:13PM +0200, Geert Uytterhoeven wrote:
> >> On Tue, Apr 12, 2016 at 6:51 AM,   wrote:
> >> > From: Joonsoo Kim 
> >> > To check whether free objects exist or not precisely, we need to grab a
> >> > lock.  But, accuracy isn't that important because the race window would be
> >> > small and if there are too many free objects, the cache reaper would reap
> >> > them.  So, this patch makes the check for free object existence not
> >> > hold a lock.  This will reduce lock contention in the heavy allocation
> >> > case.
> >> >
> >> > Note that until now, n->shared can be freed during the processing by
> >> > writing slabinfo, but, with some trick in this patch, we can access it
> >> > freely within interrupt disabled period.
> >> >
> >> > Below is the result of concurrent allocation/free in slab allocation
> >> > benchmark made by Christoph a long time ago.  I make the output simpler.
> >> > The number shows cycle count during alloc/free respectively so less is
> >> > better.
> >> >
> >> > * Before
> >> > Kmalloc N*alloc N*free(32): Average=248/966
> >> > Kmalloc N*alloc N*free(64): Average=261/949
> >> > Kmalloc N*alloc N*free(128): Average=314/1016
> >> > Kmalloc N*alloc N*free(256): Average=741/1061
> >> > Kmalloc N*alloc N*free(512): Average=1246/1152
> >> > Kmalloc N*alloc N*free(1024): Average=2437/1259
> >> > Kmalloc N*alloc N*free(2048): Average=4980/1800
> >> > Kmalloc N*alloc N*free(4096): Average=9000/2078
> >> >
> >> > * After
> >> > Kmalloc N*alloc N*free(32): Average=344/792
> >> > Kmalloc N*alloc N*free(64): Average=347/882
> >> > Kmalloc N*alloc N*free(128): Average=390/959
> >> > Kmalloc N*alloc N*free(256): Average=393/1067
> >> > Kmalloc N*alloc N*free(512): Average=683/1229
> >> > Kmalloc N*alloc N*free(1024): Average=1295/1325
> >> > Kmalloc N*alloc N*free(2048): Average=2513/1664
> >> > Kmalloc N*alloc N*free(4096): Average=4742/2172
> >> >
> >> > It shows that allocation performance decreases for the object size up to
> >> > 128 and it may be due to extra checks in cache_alloc_refill().  But, with
> >> > considering improvement of free performance, net result looks the same.
> >> > Result for other size class looks very promising, roughly, 50% 
> >> > performance
> >> > improvement.
> >> >
> >> > v2: replace kick_all_cpus_sync() with synchronize_sched().
> >> >
> >> > Signed-off-by: Joonsoo Kim 
> >>
> >> I've bisected a boot failure (no output at all) in v4.7-rc2 on emev2/kzm9d
> >> (Renesas dual Cortex A9) to this patch, which is upstream commit
> >> 801faf0db8947e01877920e848a4d338dd7a99e7.
> >>
> >> I've attached my .config. I don't know if it also happens with
> >> shmobile_defconfig, as something went wrong with my remote access to the 
> >> board,
> >> preventing further testing. I also couldn't verify if the issue persists in
> >> v4.7-rc3.
> 
> In the mean time, I've verified it also happens with shmobile_defconfig.
> 
> >>
> >> Do you have a clue?
> >
> > I don't have yet. Could you help me to narrow down the problem?
> > Following diff is half-revert change to check that synchronize_sched()
> > has no problem.
> 
> Thanks!
> 
> Unfortunately the half revert is not sufficient. The full revert is.

Thanks for quick testing!

Could I ask one more time to check that synchronize_sched() is root
cause of the problem? Testing following two diffs will be helpful to me.

Thanks.

--->8
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..d892364 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -965,7 +965,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 * freed after synchronize_sched().
 */
if (force_change)
-   synchronize_sched();
+   kick_all_cpus_sync();
 
 fail:
kfree(old_shared);

--->8--
diff --git a/mm/slab.c b/mm/slab.c
index 763096a..38d99c2 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -964,8 +964,6 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 * guaranteed to be valid until irq is re-enabled, because it will be
 * freed after synchronize_sched().
 */
-   if (force_change)
-   synchronize_sched();
 
 fail:
kfree(old_shared);



[PATCH 3/7] ubi: Fix whitespace issue in count_fastmap_pebs()

2016-06-14 Thread Richard Weinberger
Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/fastmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 990898b..ab337e6 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -578,7 +578,7 @@ static int count_fastmap_pebs(struct ubi_attach_info *ai)
list_for_each_entry(aeb, &ai->free, u.list)
n++;
 
-ubi_rb_for_each_entry(rb1, av, &ai->volumes, rb)
+   ubi_rb_for_each_entry(rb1, av, &ai->volumes, rb)
ubi_rb_for_each_entry(rb2, aeb, &av->root, u.rb)
n++;
 
-- 
2.7.3



[PATCH 2/7] ubi: Introduce vol_ignored()

2016-06-14 Thread Richard Weinberger
This makes the logic easier to follow.

Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/attach.c | 24 ++--
 drivers/mtd/ubi/ubi.h| 15 +++
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index cc04ca8..6e5fb1d 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -803,6 +803,20 @@ out_unlock:
return err;
 }
 
+static bool vol_ignored(int vol_id)
+{
+   switch (vol_id) {
+   case UBI_LAYOUT_VOLUME_ID:
+   return true;
+   }
+
+#ifdef CONFIG_MTD_UBI_FASTMAP
+   return ubi_is_fm_vol(vol_id);
+#else
+   return false;
+#endif
+}
+
 /**
  * scan_peb - scan and process UBI headers of a PEB.
  * @ubi: UBI device description object
@@ -995,17 +1009,15 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
*vid = vol_id;
if (sqnum)
*sqnum = be64_to_cpu(vidh->sqnum);
-   if (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID) {
+   if (vol_id > UBI_MAX_VOLUMES && !vol_ignored(vol_id)) {
int lnum = be32_to_cpu(vidh->lnum);
 
/* Unsupported internal volume */
switch (vidh->compat) {
case UBI_COMPAT_DELETE:
-   if (vol_id != UBI_FM_SB_VOLUME_ID
-   && vol_id != UBI_FM_DATA_VOLUME_ID) {
-   ubi_msg(ubi, "\"delete\" compatible internal 
volume %d:%d found, will remove it",
-   vol_id, lnum);
-   }
+   ubi_msg(ubi, "\"delete\" compatible internal volume 
%d:%d found, will remove it",
+   vol_id, lnum);
+
err = add_to_list(ai, pnum, vol_id, lnum,
  ec, 1, &ai->erase);
if (err)
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 61d4e99..91075b6 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -1105,4 +1105,19 @@ static inline int idx2vol_id(const struct ubi_device 
*ubi, int idx)
return idx;
 }
 
+/**
+ * ubi_is_fm_vol - check whether a volume ID is a Fastmap volume.
+ * @vol_id: volume ID
+ */
+static inline bool ubi_is_fm_vol(int vol_id)
+{
+   switch (vol_id) {
+   case UBI_FM_SB_VOLUME_ID:
+   case UBI_FM_DATA_VOLUME_ID:
+   return true;
+   }
+
+   return false;
+}
+
 #endif /* !__UBI_UBI_H__ */
-- 
2.7.3



Re: [PATCH v10 00/14] USB OTG/dual-role framework

2016-06-14 Thread Roger Quadros
+Alan. (Sorry Alan, I forgot to copy you on this series).

Hi,

On 14/06/16 05:17, Peter Chen wrote:
> On Fri, Jun 10, 2016 at 04:07:09PM +0300, Roger Quadros wrote:
>> Hi,
>>
>> This series centralizes OTG/Dual-role functionality in the kernel.
>> As of now I've got Dual-role functionality working pretty reliably on
>> dra7-evm and am437x-gp-evm.
>>
>> DWC3 controller and TI platform related patches will be sent separately.
>>
>> Series is based on v4.7-rc1 + balbi/usb.git testing/next
>> commit 4a2786c10462df650965785462ca82c185164d98.
>>
> 
> Roger, I have acked for all your patches. You can send v11 for all
> patches. When you get Felipe's ack for gadget part, Alan's ack
> for hcd part, and your dwc3 and platform related patches ack,
> you can let Felipe help to queue this patch set since some of
> the code based on his tree. Thank you for your hard work for it.

Thanks Peter,

I'll send v11, with your Acks.

cheers,
-roger

> 
> Peter
> 
>> Why?:
>> 
>>
>> Currently there is no central location where OTG/dual-role functionality is
>> implemented in the Linux USB stack and every USB controller driver is
>> doing their own thing for OTG/dual-role. We can benefit from code-reuse
>> and simplicity by adding the OTG/dual-role core driver.
>>
>> Newer OTG cores support standard host interface (e.g. xHCI) so
>> host and gadget functionality are no longer closely knit like older
>> cores. There needs to be a way to co-ordinate the operation of the
>> host and gadget controllers in dual-role mode. i.e. to stop and start them
>> from a central location. This central location should be the
>> USB OTG/dual-role core.
>>
>> Host and gadget controllers might be sharing resources and can't
>> be always running. One has to be stopped for the other to run.
>> This couldn't be done till now but can be done from the OTG core.
>>
>> What?:
>> -
>>
>> The OTG/dual-role core consists of a set of APIs that allow
>> registration of OTG controller device and OTG capable host and gadget
>> controllers.
>>
>> - The OTG controller driver can provide the OTG capabilities and the
>> Finite State Machine work function via 'struct usb_otg_config'
>> at the time of registration i.e. usb_otg_register();
>>
>>  struct usb_otg *usb_otg_register(struct device *dev,
>>   struct usb_otg_config *config);
>>  int usb_otg_unregister(struct device *dev);
>>  /**
>>   * struct usb_otg_config - otg controller configuration
>>   * @caps: otg capabilities of the controller
>>   * @ops: otg fsm operations
>>   * @otg_work: optional custom otg state machine work function
>>   */
>>  struct usb_otg_config {
>>  struct usb_otg_caps *otg_caps;
>>  struct otg_fsm_ops *fsm_ops;
>>  void (*otg_work)(struct work_struct *work);
>>  };
>>
>> The dual-role state machine is built-into the OTG core so nothing
>> special needs to be provided if only dual-role functionality is desired.
>> The low level OTG controller driver ops are provided via
>> 'struct otg_fsm_ops *fsm_ops' in the 'struct usb_otg_config'.
>>
>> After registration, the OTG core waits for host, gadget controller
>> and the gadget function driver to be registered. Once all resources are
>> available it instantiates the Finite State Machine (FSM).
>> The host/gadget controllers are started/stopped according to the FSM.
>>
>> - Host and gadget controllers that are a part of OTG/dual-role port must
>> use the OTG core provided APIs to add/remove the host/gadget.
>> i.e. hosts must use usb_otg_add_hcd() and usb_otg_remove_hcd(),
>> gadgets must use usb_otg_add_gadget_udc() and usb_del_gadget_udc().
>> This ensures that the host and gadget controllers are not started till
>> the state machine is ready and the right bus conditions are met.
>> It also allows the host and gadget controllers to provide the OTG
>> controller device to link them together. For Device tree boots
>> the related OTG controller is automatically picked up via the
>> 'otg-controller' property in the Host/Gadget controller nodes.
>>
>>  int usb_otg_add_hcd(struct usb_hcd *hcd,
>>  unsigned int irqnum, unsigned long irqflags,
>>  struct device *otg_dev);
>>  void usb_otg_remove_hcd(struct usb_hcd *hcd);
>>
>>  int usb_otg_add_gadget_udc(struct device *parent,
>> struct usb_gadget *gadget,
>> struct device *otg_dev);
>>  usb_del_gadget_udc() must be used for removal.
>>
>>
>> - During the lifetime of the FSM, the OTG controller driver can provide
>> input event changes using usb_otg_sync_inputs(). The OTG core will
>> then schedule the FSM work function (or internal dual-role state machine)
>> to update the FSM state. The FSM then calls the OTG controller
>> operations (fsm_ops) as necessary.
>>  void usb_otg_sync_inputs(struct usb_otg *otg);
>>
>> - The following 2 functions are provided 

[PATCH 7/7] ubi: Use bitmaps in Fastmap self-check code

2016-06-14 Thread Richard Weinberger
...don't waste memory by allocating one sizeof(int) per
PEB.

Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/fastmap.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index b66cb3e..48eb55f 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -15,20 +15,22 @@
  */
 
 #include 
+#include 
 #include "ubi.h"
 
 /**
  * init_seen - allocate memory for used for debugging.
  * @ubi: UBI device description object
  */
-static inline int *init_seen(struct ubi_device *ubi)
+static inline unsigned long *init_seen(struct ubi_device *ubi)
 {
-   int *ret;
+   unsigned long *ret;
 
if (!ubi_dbg_chk_fastmap(ubi))
return NULL;
 
-   ret = kcalloc(ubi->peb_count, sizeof(int), GFP_KERNEL);
+   ret = kcalloc(BITS_TO_LONGS(ubi->peb_count), sizeof(unsigned long),
+ GFP_KERNEL);
if (!ret)
return ERR_PTR(-ENOMEM);
 
@@ -39,7 +41,7 @@ static inline int *init_seen(struct ubi_device *ubi)
  * free_seen - free the seen logic integer array.
  * @seen: integer array of @ubi->peb_count size
  */
-static inline void free_seen(int *seen)
+static inline void free_seen(unsigned long *seen)
 {
kfree(seen);
 }
@@ -50,12 +52,12 @@ static inline void free_seen(int *seen)
  * @pnum: The PEB to be makred as seen
  * @seen: integer array of @ubi->peb_count size
  */
-static inline void set_seen(struct ubi_device *ubi, int pnum, int *seen)
+static inline void set_seen(struct ubi_device *ubi, int pnum, unsigned long *seen)
 {
if (!ubi_dbg_chk_fastmap(ubi) || !seen)
return;
 
-   seen[pnum] = 1;
+   set_bit(pnum, seen);
 }
 
 /**
@@ -63,7 +65,7 @@ static inline void set_seen(struct ubi_device *ubi, int pnum, 
int *seen)
  * @ubi: UBI device description object
  * @seen: integer array of @ubi->peb_count size
  */
-static int self_check_seen(struct ubi_device *ubi, int *seen)
+static int self_check_seen(struct ubi_device *ubi, unsigned long *seen)
 {
int pnum, ret = 0;
 
@@ -71,7 +73,7 @@ static int self_check_seen(struct ubi_device *ubi, int *seen)
return 0;
 
for (pnum = 0; pnum < ubi->peb_count; pnum++) {
-   if (!seen[pnum] && ubi->lookuptbl[pnum]) {
+   if (!test_bit(pnum, seen) && ubi->lookuptbl[pnum]) {
ubi_err(ubi, "self-check failed for PEB %d, fastmap didn't see it", pnum);
ret = -EINVAL;
}
@@ -1139,7 +1141,7 @@ static int ubi_write_fastmap(struct ubi_device *ubi,
struct rb_node *tmp_rb;
int ret, i, j, free_peb_count, used_peb_count, vol_count;
int scrub_peb_count, erase_peb_count;
-   int *seen_pebs = NULL;
+   unsigned long *seen_pebs = NULL;
 
fm_raw = ubi->fm_buf;
memset(ubi->fm_buf, 0, ubi->fm_size);
-- 
2.7.3
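
For illustration only (not part of the patch): a minimal userspace sketch of the saving this change is after. The SKETCH_* macros and seen_* helpers are hypothetical stand-ins for the kernel's BITS_TO_LONGS(), set_bit() and test_bit(), and unlike the kernel's set_bit() they are not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for BITS_PER_LONG / BITS_TO_LONGS(). */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Allocate a zeroed bitmap holding one bit per PEB. */
static inline unsigned long *seen_alloc(size_t peb_count)
{
	return calloc(SKETCH_BITS_TO_LONGS(peb_count), sizeof(unsigned long));
}

/* Mark PEB @pnum as seen (non-atomic, unlike the kernel's set_bit()). */
static inline void seen_set(unsigned long *seen, size_t pnum)
{
	assert(seen != NULL);
	seen[pnum / SKETCH_BITS_PER_LONG] |= 1UL << (pnum % SKETCH_BITS_PER_LONG);
}

/* Return 1 if PEB @pnum was marked as seen, 0 otherwise. */
static inline int seen_test(const unsigned long *seen, size_t pnum)
{
	assert(seen != NULL);
	return (seen[pnum / SKETCH_BITS_PER_LONG] >>
		(pnum % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Bytes needed by the old representation: one int per PEB. */
static inline size_t seen_bytes_int_array(size_t peb_count)
{
	return peb_count * sizeof(int);
}

/* Bytes needed by the new representation: one bit per PEB, rounded up. */
static inline size_t seen_bytes_bitmap(size_t peb_count)
{
	return SKETCH_BITS_TO_LONGS(peb_count) * sizeof(unsigned long);
}
```

Assuming a 4-byte int, a device with 8192 PEBs needs 32 KiB for the int array but only 1 KiB for the bitmap.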



[PATCH 5/7] ubi: Check whether the Fastmap anchor matches the super block

2016-06-14 Thread Richard Weinberger
This helps to detect cases where a user copies a UBI image to
another target with different bad blocks.

Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/fastmap.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 12bdb09..b66cb3e 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -975,6 +975,13 @@ int ubi_scan_fastmap(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
goto free_hdr;
}
 
+   if (i == 0 && pnum != fm_anchor) {
+   ubi_err(ubi, "Fastmap anchor PEB mismatch: PEB: %i vs. %i",
+   pnum, fm_anchor);
+   ret = UBI_BAD_FASTMAP;
+   goto free_hdr;
+   }
+
ret = ubi_io_read_ec_hdr(ubi, pnum, ech, 0);
if (ret && ret != UBI_IO_BITFLIPS) {
ubi_err(ubi, "unable to read fastmap block# %i EC (PEB: %i)",
-- 
2.7.3



[PATCH 1/7] ubi: Fix scan_fast() comment

2016-06-14 Thread Richard Weinberger
Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/attach.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index c1aaf03..cc04ca8 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -1326,7 +1326,7 @@ static struct ubi_attach_info *alloc_ai(void)
 #ifdef CONFIG_MTD_UBI_FASTMAP
 
 /**
- * scan_fastmap - try to find a fastmap and attach from it.
+ * scan_fast - try to find a fastmap and attach from it.
  * @ubi: UBI device description object
  * @ai: attach info object
  *
-- 
2.7.3



[PATCH 4/7] ubi: Rework Fastmap attach base code

2016-06-14 Thread Richard Weinberger
Introduce a new list to the UBI attach information
object to be able to deal better with old and corrupted
Fastmap eraseblocks.
Also move more Fastmap specific code into fastmap.c.

Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/attach.c  | 99 +--
 drivers/mtd/ubi/fastmap.c | 36 +++--
 drivers/mtd/ubi/ubi.h | 28 +-
 drivers/mtd/ubi/wl.c  | 41 
 4 files changed, 162 insertions(+), 42 deletions(-)

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index 6e5fb1d..bd6fc52 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -175,6 +175,40 @@ static int add_corrupted(struct ubi_attach_info *ai, int 
pnum, int ec)
 }
 
 /**
+ * add_fastmap - add a Fastmap related physical eraseblock.
+ * @ai: attaching information
+ * @pnum: physical eraseblock number the VID header came from
+ * @vid_hdr: the volume identifier header
+ * @ec: erase counter of the physical eraseblock
+ *
+ * This function allocates a 'struct ubi_ainf_peb' object for a Fastmap
+ * physical eraseblock @pnum and adds it to the 'fastmap' list.
+ * Such blocks can be Fastmap super and data blocks from both the most
+ * recent Fastmap we're attaching from or from old Fastmaps which will
+ * be erased.
+ */
+static int add_fastmap(struct ubi_attach_info *ai, int pnum,
+  struct ubi_vid_hdr *vid_hdr, int ec)
+{
+   struct ubi_ainf_peb *aeb;
+
+   aeb = kmem_cache_alloc(ai->aeb_slab_cache, GFP_KERNEL);
+   if (!aeb)
+   return -ENOMEM;
+
+   aeb->pnum = pnum;
+   aeb->vol_id = be32_to_cpu(vid_hdr->vol_id);
+   aeb->sqnum = be64_to_cpu(vid_hdr->sqnum);
+   aeb->ec = ec;
+   list_add(&aeb->u.list, &ai->fastmap);
+
+   dbg_bld("add to fastmap list: PEB %d, vol_id %d, sqnum: %llu", pnum,
+   aeb->vol_id, aeb->sqnum);
+
+   return 0;
+}
+
+/**
  * validate_vid_hdr - check volume identifier header.
  * @ubi: UBI device description object
  * @vid_hdr: the volume identifier header to check
@@ -822,18 +856,15 @@ static bool vol_ignored(int vol_id)
  * @ubi: UBI device description object
  * @ai: attaching information
  * @pnum: the physical eraseblock number
- * @vid: The volume ID of the found volume will be stored in this pointer
- * @sqnum: The sqnum of the found volume will be stored in this pointer
  *
  * This function reads UBI headers of PEB @pnum, checks them, and adds
  * information about this PEB to the corresponding list or RB-tree in the
  * "attaching info" structure. Returns zero if the physical eraseblock was
  * successfully handled and a negative error code in case of failure.
  */
-static int scan_peb(struct ubi_device *ubi, struct ubi_attach_info *ai,
-   int pnum, int *vid, unsigned long long *sqnum)
+static int scan_peb(struct ubi_device *ubi, struct ubi_attach_info *ai, int pnum)
 {
-   long long uninitialized_var(ec);
+   long long ec;
int err, bitflips = 0, vol_id = -1, ec_err = 0;
 
dbg_bld("scan PEB %d", pnum);
@@ -1005,10 +1036,6 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
}
 
vol_id = be32_to_cpu(vidh->vol_id);
-   if (vid)
-   *vid = vol_id;
-   if (sqnum)
-   *sqnum = be64_to_cpu(vidh->sqnum);
if (vol_id > UBI_MAX_VOLUMES && !vol_ignored(vol_id)) {
int lnum = be32_to_cpu(vidh->lnum);
 
@@ -1049,7 +1076,12 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
if (ec_err)
ubi_warn(ubi, "valid VID header but corrupted EC header at PEB %d",
 pnum);
-   err = ubi_add_to_av(ubi, ai, pnum, ec, vidh, bitflips);
+
+   if (ubi_is_fm_vol(vol_id))
+   err = add_fastmap(ai, pnum, vidh, ec);
+   else
+   err = ubi_add_to_av(ubi, ai, pnum, ec, vidh, bitflips);
+
if (err)
return err;
 
@@ -1198,6 +1230,10 @@ static void destroy_ai(struct ubi_attach_info *ai)
list_del(&aeb->u.list);
kmem_cache_free(ai->aeb_slab_cache, aeb);
}
+   list_for_each_entry_safe(aeb, aeb_tmp, &ai->fastmap, u.list) {
+   list_del(&aeb->u.list);
+   kmem_cache_free(ai->aeb_slab_cache, aeb);
+   }
 
/* Destroy the volume RB-tree */
rb = ai->volumes.rb_node;
@@ -1257,7 +1293,7 @@ static int scan_all(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
cond_resched();
 
dbg_gen("process PEB %d", pnum);
-   err = scan_peb(ubi, ai, pnum, NULL, NULL);
+   err = scan_peb(ubi, ai, pnum);
if (err < 0)
goto out_vidh;
}
@@ -1323,6 +1359,7 @@ static struct ubi_attach_info *alloc_ai(void)
INIT_LIST_HEAD(&ai->free);
INIT_LIST_HEAD(&ai->erase);
INIT_LIST_HEAD(&ai->alien);
+   INIT_LIST_HEA

[PATCH 6/7] ubi: Be more paranoid while seaching for the most recent Fastmap

2016-06-14 Thread Richard Weinberger
Since PEB erasure is asynchronous it can happen that there is
more than one Fastmap on the MTD. This is fine because the attach logic
will pick the Fastmap data structure with the highest sequence number.

On a poorly configured MTD stack spurious ECC errors are common.
Causes vary: bad hardware, wrong operating modes, etc.
If the most recent Fastmap turns out bad due to ECC errors, UBI might
pick an older Fastmap to attach from.
While this can only happen on an already broken setup, it will show
completely different symptoms and make finding the root cause much
more difficult.
So, be debug friendly and fall back to scanning mode if we're facing
an ECC error while scanning for Fastmap.

Signed-off-by: Richard Weinberger 
---
 drivers/mtd/ubi/attach.c | 28 
 drivers/mtd/ubi/ubi.h|  3 +++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/ubi/attach.c b/drivers/mtd/ubi/attach.c
index bd6fc52..903becd 100644
--- a/drivers/mtd/ubi/attach.c
+++ b/drivers/mtd/ubi/attach.c
@@ -856,13 +856,15 @@ static bool vol_ignored(int vol_id)
  * @ubi: UBI device description object
  * @ai: attaching information
  * @pnum: the physical eraseblock number
+ * @fast: true if we're scanning for a Fastmap
  *
  * This function reads UBI headers of PEB @pnum, checks them, and adds
  * information about this PEB to the corresponding list or RB-tree in the
  * "attaching info" structure. Returns zero if the physical eraseblock was
  * successfully handled and a negative error code in case of failure.
  */
-static int scan_peb(struct ubi_device *ubi, struct ubi_attach_info *ai, int pnum)
+static int scan_peb(struct ubi_device *ubi, struct ubi_attach_info *ai,
+   int pnum, bool fast)
 {
long long ec;
int err, bitflips = 0, vol_id = -1, ec_err = 0;
@@ -980,6 +982,20 @@ static int scan_peb(struct ubi_device *ubi, struct 
ubi_attach_info *ai, int pnum
 */
ai->maybe_bad_peb_count += 1;
case UBI_IO_BAD_HDR:
+   /*
+* If we're facing a bad VID header we have to drop *all*
+* Fastmap data structures we find. The most recent Fastmap
+* could be bad and therefore there is a chance that we attach
+* from an old one. On a fine MTD stack a PEB must not render
+* bad all of a sudden, but the reality is different.
+* So, let's be paranoid and help finding the root cause by
+* falling back to scanning mode instead of attaching with a
+* bad EBA table and cause data corruption which is hard to
+* analyze.
+*/
+   if (fast)
+   ai->force_full_scan = 1;
+
if (ec_err)
/*
 * Both headers are corrupted. There is a possibility
@@ -1293,7 +1309,7 @@ static int scan_all(struct ubi_device *ubi, struct 
ubi_attach_info *ai,
cond_resched();
 
dbg_gen("process PEB %d", pnum);
-   err = scan_peb(ubi, ai, pnum);
+   err = scan_peb(ubi, ai, pnum, false);
if (err < 0)
goto out_vidh;
}
@@ -1407,7 +1423,7 @@ static int scan_fast(struct ubi_device *ubi, struct 
ubi_attach_info **ai)
cond_resched();
 
dbg_gen("process PEB %d", pnum);
-   err = scan_peb(ubi, scan_ai, pnum);
+   err = scan_peb(ubi, scan_ai, pnum, true);
if (err < 0)
goto out_vidh;
}
@@ -1415,7 +1431,11 @@ static int scan_fast(struct ubi_device *ubi, struct 
ubi_attach_info **ai)
ubi_free_vid_hdr(ubi, vidh);
kfree(ech);
 
-   err = ubi_scan_fastmap(ubi, *ai, scan_ai);
+   if (scan_ai->force_full_scan)
+   err = UBI_NO_FASTMAP;
+   else
+   err = ubi_scan_fastmap(ubi, *ai, scan_ai);
+
if (err) {
/*
 * Didn't attach via fastmap, do a full scan but reuse what
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index c8b90a8..b616a115 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -715,6 +715,8 @@ struct ubi_ainf_volume {
  * @vols_found: number of volumes found
  * @highest_vol_id: highest volume ID
  * @is_empty: flag indicating whether the MTD device is empty or not
+ * @force_full_scan: flag indicating whether we need to do a full scan and drop
+all existing Fastmap data structures
  * @min_ec: lowest erase counter value
  * @max_ec: highest erase counter value
  * @max_sqnum: highest sequence number value
@@ -742,6 +744,7 @@ struct ubi_attach_info {
int vols_found;
int highest_vol_id;
int is_empty;
+   int

Re: [PATCH v2 00/38] Documentation/sphinx

2016-06-14 Thread Daniel Vetter
Hi Jon,

On Fri, Jun 10, 2016 at 10:41 PM, Dave Airlie  wrote:
> On 11 June 2016 at 04:17, Daniel Vetter  wrote:
>> On Thu, Jun 9, 2016 at 9:55 PM, Jonathan Corbet  wrote:
>>> On Sat,  4 Jun 2016 14:37:01 +0300
>>> Jani Nikula  wrote:
>>>
>>>> When this lands in docs-next and we can backmerge to drm, we'll plunge
>>>> ahead and convert gpu.tmpl to rst, and have that ready for v4.8.
>>>
>>> That is now done — thanks for running with this!  I'm looking forward to
>>> seeing where we can take it from here.
>>>
>>> One little thing: there's a bunch of new warnings in the htmldocs build:
>>>
>>> .//include/net/mac80211.h:671: warning: duplicate section name 'Description'
>>> .//include/net/mac80211.h:3174: warning: duplicate section name 'Description'
>>>
>>> Some quick messing around suggests that it happens when a kerneldoc entry
>>> has free text both above and below the parameter list; there aren't many
>>> such places.  I can send in a patch for mac80211.h to silence most of it,
>>> but it might be nice if it worked as before without whining.
>>
>> Awesome, Jani's patches landed in drm-next. As discussed I'd like to
>> merge this into drm-misc, to be able to synchronize the gpu.tmpl->rst
>> conversion with ongoing drm work. Can I just pull your branch, or do
>> you want me to pull a special tag? With that I can start wreaking
>> havoc ;-)
>>
>> Also we need to coordinate the merge window order. I think as long as
>> I only pull from your tree for the 4.8 cycle (there shouldn't be any
>> conflicts in the conversion itself, as long as we only touch gpu.rst
>> in drm-misc) that would work if drm-next lands after doc-next.
>>
>> Dave, would that be ok with you too?
>
> It would be best if Jon can give us a known tag that won't get rebased,
> and will end up in docs-next and drm-next, then we can arrange for docs-next
> to get merged early and drm-next should be less trouble.
>
> I'm happy to merge that stable branch via drm-misc.

Ping for tag/pull request. Note that for gpu.tmpl->rst conversion we
only need what's currently in docs-next - you can merge Jani's
follow-up series at leisure later on.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [patch 13/20] timer: Switch to a non cascading wheel

2016-06-14 Thread George Spelvin
Nice cleanup!


I think I see a buglet in your level-5 cascading.

Suppose a timer is requested far in the future for a time
that is an exact multiple of 32768 jiffies.

collect_expired_timers() scans level 5 after all the previous ones,
and will cascade it to level 0, in a level-0 bucket which has already
been scanned, and won't be scanned again for 64 jiffies.

I agree that 64 jiffies is well within your allowed rounding accuracy,
and order of timer firing is not guaranteed when they're for the same
time, but it is a bit odd when a timer fires 32 jiffies *before* another
timer scheduled for 32 jiffies later.  That's the sort of peculiarity
that could lead to a subtle bug.


While I like the cleanup of just limiting long-term resolution, if
it turns out to be necessary, it's not too hard to add exact timers
back in if a need is found in future.  All you need is a second
__internal_add_timer function that rounds down rather than up, and to
teach expire_timers() to cascade in the unlikely situation that a timer
does have an expiry time in the future.

(It also gets rid of the special case for level 5.)


Other, mostly minor, code comments:

> +/* Level offsets in the wheel */
> +#define LVL0_OFFS    (0)
> +#define LVL1_OFFS    (LVL_SIZE)
> +#define LVL2_OFFS    (LVL1_OFFS + LVL_SIZE)
> +#define LVL3_OFFS    (LVL2_OFFS + LVL_SIZE)
> +#define LVL4_OFFS    (LVL3_OFFS + LVL_SIZE)
> +#define LVL5_OFFS    (LVL4_OFFS + LVL_SIZE)
> +
> +/* Clock divisor for the next level */
> +#define LVL_CLK_SHIFT 3
> +#define LVL_CLK_DIV  (1 << LVL_CLK_SHIFT)
> +#define LVL_CLK_MASK (LVL_CLK_DIV - 1)
> +
> +/* The shift constants for selecting the bucket at the levels */
> +#define LVL1_SHIFT   (1 * LVL_CLK_SHIFT)
> +#define LVL2_SHIFT   (2 * LVL_CLK_SHIFT)
> +#define LVL3_SHIFT   (3 * LVL_CLK_SHIFT)
> +#define LVL4_SHIFT   (4 * LVL_CLK_SHIFT)
> +#define LVL5_SHIFT   (5 * LVL_CLK_SHIFT)
> +
> +/* The granularity of each level */
> +#define LVL0_GRAN    0x0001
> +#define LVL1_GRAN    (LVL0_GRAN << LVL_CLK_SHIFT)
> +#define LVL2_GRAN    (LVL1_GRAN << LVL_CLK_SHIFT)
> +#define LVL3_GRAN    (LVL2_GRAN << LVL_CLK_SHIFT)
> +#define LVL4_GRAN    (LVL3_GRAN << LVL_CLK_SHIFT)
> +#define LVL5_GRAN    (LVL4_GRAN << LVL_CLK_SHIFT)

Wouldn't this all be so much simpler as

#define LVL_BITS     6   /* Renamed previous LVL_SHIFT */
#define LVL_SIZE     (1 << LVL_BITS)
#define LVL_MASK     (LVL_SIZE - 1)
#define LVL_OFFS(n)  ((n) * LVL_SIZE)
#define LVL_SHIFT(n) ((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)  (1 << LVL_SHIFT(n))

Then you could do
+static inline unsigned calc_index(unsigned expires, unsigned level)
+{
+   /* Round up to next bin */
+   expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
+   return LVL_OFFS(level) + (expires & LVL_MASK);
+}
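
To make the rounding concrete, here is a standalone userspace sketch of the same calculation. It assumes LVL_MASK is meant to be LVL_SIZE - 1 and that the wheel has six levels (LVL_DEPTH = 6, inferred from levels 0..5 above); neither is taken from the posted patch.

```c
#include <assert.h>

#define LVL_BITS	6
#define LVL_CLK_SHIFT	3
#define LVL_DEPTH	6			/* assumption: levels 0..5 */
#define LVL_SIZE	(1U << LVL_BITS)
#define LVL_MASK	(LVL_SIZE - 1)		/* assumption: mask of one level */
#define LVL_OFFS(n)	((n) * LVL_SIZE)
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)

/*
 * Map an expiry time to a bucket index in the flat wheel array:
 * round the expiry up to the granularity of @level, then wrap it
 * into that level's LVL_SIZE buckets.
 */
static inline unsigned calc_index(unsigned expires, unsigned level)
{
	assert(level < LVL_DEPTH);

	/* Round up to next bin */
	expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
	return LVL_OFFS(level) + (expires & LVL_MASK);
}
```

With these constants, an expiry of 64 at level 1 (granularity 8) lands in bucket 64 + 8 = 72, while an expiry of 65 is rounded up to the next bin and lands in bucket 73.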


> +#define LVL1_TSTART  (LVL_SIZE - 1)

Er... isn't that LVL_SIZE, as documented in the table above?
Then it could be
#define LVL_TSTART(n) (LVL_SIZE << LVL_SHIFT(n))

Ideally, you'd like all of that

+   if (delta < LVL1_TSTART) {
+   idx = (expires + LVL0_GRAN) & LVL_MASK;
+   } else if (delta < LVL2_TSTART) {
+   idx = calc_index(expires, LVL1_GRAN, LVL1_SHIFT, LVL1_OFFS);
+   } else if (delta < LVL3_TSTART) {
+   idx = calc_index(expires, LVL2_GRAN, LVL2_SHIFT, LVL2_OFFS);
+   } else if (delta < LVL4_TSTART) {
+   idx = calc_index(expires, LVL3_GRAN, LVL3_SHIFT, LVL3_OFFS);
+   } else if (delta < LVL5_TSTART) {
+   idx = calc_index(expires, LVL4_GRAN, LVL4_SHIFT, LVL4_OFFS);

to be replaced with __builtin_clz or similar:

level = __fls(delta | LVL_MASK);
if (level <  LVL_BITS + LVL_SHIFT(LVL_DEPTH-1)) {   /* or LVL_DEPTH-2, no difference */
level = (level + LVL_CLK_SHIFT - LVL_BITS) / LVL_CLK_SHIFT;
} else if ((long)delta < 0) {
expires = base->clk;
level = 0;
} else {
level = LVL_DEPTH - 1;
}
index = calc_index(expires, level);
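
A userspace sketch of that level selection, for checking the arithmetic. sketch_fls() is a hypothetical portable stand-in for the kernel's __fls(), LVL_DEPTH = 6 is an assumption (levels 0..5), and the already-expired case simply returns level 0 here instead of also resetting expires to base->clk.

```c
#include <assert.h>

#define LVL_BITS	6
#define LVL_CLK_SHIFT	3
#define LVL_DEPTH	6			/* assumption: levels 0..5 */
#define LVL_MASK	((1UL << LVL_BITS) - 1)
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)

/* Index of the most significant set bit; @x must be nonzero. */
static inline unsigned sketch_fls(unsigned long x)
{
	unsigned bit = 0;

	assert(x != 0);
	while (x >>= 1)
		bit++;
	return bit;
}

/* Pick the wheel level for a timer that expires @delta jiffies from now. */
static inline unsigned sketch_pick_level(unsigned long delta)
{
	/* OR-ing in LVL_MASK makes every delta below LVL_SIZE map to level 0. */
	unsigned msb = sketch_fls(delta | LVL_MASK);

	if (msb < LVL_BITS + LVL_SHIFT(LVL_DEPTH - 1))
		return (msb + LVL_CLK_SHIFT - LVL_BITS) / LVL_CLK_SHIFT;
	if ((long)delta < 0)
		return 0;		/* expiry is in the past */
	return LVL_DEPTH - 1;		/* beyond the wheel: clamp to last level */
}
```

The level boundaries come out at 64, 512, 4096, 32768 and 262144 jiffies, i.e. LVL_SIZE << LVL_SHIFT(n) for n = 0..4.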


> +static inline void detach_expired_timer(struct timer_list *timer)
>  {
>   detach_timer(timer, true);
> - if (!(timer->flags & TIMER_DEFERRABLE))
> - base->active_timers--;
> - base->all_timers--;
>  }

Is there even a reason to have this wrapper any more?  Why not
just replace all calls to it in the source?


> + timer = hlist_entry(head->first, struct timer_list, entry);
> + fn = timer->function;
> + data = timer->data;
> +
> + timer_stats_account_timer(timer);
> +
> + base->running_timer = timer;
> + detach_expired_timer(timer);

Is there some non-obvious reason that you have to fetch fn and data
so early?  It seems like a register pressure pessimization, if the
compiler can't figure out that timer_stats code can't change them.

The cache line containing this timer was already prefetched when you
updated its entry.pprev as 

Re: [PATCH v10 01/14] usb: hcd: Initialize hcd->flags to 0

2016-06-14 Thread Roger Quadros
+Alan,

On 10/06/16 16:07, Roger Quadros wrote:
> When using the OTG/DRD library we can call hcd_add/remove
> consecutively without calling usb_put_hcd/usb_create_hcd in between
> so hcd->flags can be stale.
> 
> If the HC dies due to whatever reason then without this
> patch we get the below error on next hcd_add.
> 
> [   91.494257] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
> [   91.502068] hub 3-0:1.0: state 0 ports 1 chg  evt 
> [   91.510240] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
> [   91.516940] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus 
> number 4
> [   91.529745] usb usb4: We don't know the algorithms for LPM for this host, 
> disabling LPM.
> [   91.540637] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
> [   91.757865] irq 254: nobody cared (try booting with the "irqpoll" option)
> [   91.757880] CPU: 0 PID: 68 Comm: kworker/u2:2 Not tainted 
> 4.1.4-00828-g1f0ed8c-dirty #44
> [   91.757885] Hardware name: Generic AM43 (Flattened Device Tree)
> [   91.757914] Workqueue: usb_otg usb_otg_work
> [   91.757921] Backtrace:
> [   91.757954] [] (dump_backtrace) from [] 
> (show_stack+0x18/0x1c)
> [   91.757972]  r6:c089d4a4 r5: r4: r3:ee44
> [   91.757991] [] (show_stack) from [] 
> (dump_stack+0x84/0xd0)
> [   91.758008] [] (dump_stack) from [] 
> (__report_bad_irq+0x28/0xc8)
> [   91.758024]  r7: r6:00fe r5: r4:ee514c40
> [   91.758037] [] (__report_bad_irq) from [] 
> (note_interrupt+0x24c/0x2ac)
> [   91.758052]  r6:00fe r5: r4:ee514c40 r3:
> [   91.758065] [] (note_interrupt) from [] 
> (handle_irq_event_percpu+0xb0/0x158)
> [   91.758085]  r10:ee514c40 r9:c08ce49a r8:00fe r7: r6: 
> r5:
> [   91.758094]  r4: r3:
> [   91.758105] [] (handle_irq_event_percpu) from [] 
> (handle_irq_event+0x44/0x64)
> [   91.758126]  r10:0001 r9:ee441ab0 r8:ee441bb8 r7:c0858b4c r6:ed174280 
> r5:ee514ca0
> [   91.758132]  r4:ee514c40
> [   91.758144] [] (handle_irq_event) from [] 
> (handle_fasteoi_irq+0x100/0x1bc)
> [   91.758159]  r6:c085dba0 r5:ee514ca0 r4:ee514c40 r3:
> [   91.758171] [] (handle_fasteoi_irq) from [] 
> (generic_handle_irq+0x28/0x38)
> [   91.758186]  r7:c0853d40 r6:c0858b4c r5:00fe r4:00fe
> [   91.758197] [] (generic_handle_irq) from [] 
> (__handle_domain_irq+0x98/0x12c)
> [   91.758207]  r4:c0853d40 r3:0100
> [   91.758219] [] (__handle_domain_irq) from [] 
> (gic_handle_irq+0x28/0x68)
> [   91.758239]  r10:0001 r9:ee441bb8 r8:fa240100 r7:c0858d70 r6:ee441ab0 
> r5:00b8
> [   91.758245]  r4:fa24010c
> [   91.758264] [] (gic_handle_irq) from [] 
> (__irq_svc+0x40/0x74)
> [   91.758271] Exception stack(0xee441ab0 to 0xee441af8)
> [   91.758280] 1aa0:  c08d2980 
> ee441ac0 
> [   91.758292] 1ac0: 0008 0089 c0858b4c c0858080  ee441bb8 
> 0001 ee441b3c
> [   91.758301] 1ae0: 0101 ee441af8 c02fc418 c0046a1c 2113 
> [   91.758321]  r8: r7:ee441ae4 r6: r5:2113 r4:c0046a1c 
> r3:c02fc418
> [   91.758347] [] (__do_softirq) from [] 
> (irq_exit+0xb8/0x104)
> [   91.758367]  r10:0001 r9:ee441bb8 r8: r7:c0853d40 r6:c0858b4c 
> r5:0089
> [   91.758373]  r4:
> [   91.758386] [] (irq_exit) from [] 
> (__handle_domain_irq+0xa0/0x12c)
> [   91.758395]  r4: r3:0100
> [   91.758406] [] (__handle_domain_irq) from [] 
> (gic_handle_irq+0x28/0x68)
> [   91.758426]  r10:c08e3510 r9:2013 r8:fa240100 r7:c0858d70 r6:ee441bb8 
> r5:0039
> [   91.758433]  r4:fa24010c
> [   91.758445] [] (gic_handle_irq) from [] 
> (__irq_svc+0x40/0x74)
> [   91.758450] Exception stack(0xee441bb8 to 0xee441c00)
> [   91.758457] 1ba0:   
>  0001
> [   91.758468] 1bc0:  ee44 c08e2524 004d 0274  
>  2013
> [   91.758479] 1be0: c08e3510 ee441c4c ee441b60 ee441c00 c03acfec c0080d4c 
> 6013 
> [   91.758499]  r8: r7:ee441bec r6: r5:6013 r4:c0080d4c 
> r3:c03acfec
> [   91.758524] [] (console_unlock) from [] 
> (vprintk_emit+0x20c/0x500)
> [   91.758544]  r10:ee441cc0 r9:c08d3550 r8:c08e3ea0 r7: r6:0001 
> r5:003d
> [   91.758551]  r4:c08d3550
> [   91.758573] [] (vprintk_emit) from [] 
> (dev_vprintk_emit+0x104/0x1ac)
> [   91.758593]  r10:ee441d8c r9:000e r8:c07951e0 r7:0006 r6:ee441cc0 
> r5:000d
> [   91.758599]  r4:ee731068
> [   91.758612] [] (dev_vprintk_emit) from [] 
> (dev_printk_emit+0x28/0x30)
> [   91.758632]  r10:0001 r9:ee5f8410 r8:ee731000 r7:ed429000 r6:0006 
> r5:ee441dc0
> [   91.758638]  r4:ee731068
> [   91.758651] [] (dev_printk_emit) from [] 
> (__dev_printk+0x50/0x70)
> [   91.758660]  r3:bf2268cc r2:c07951e0
> [   91.758673] [] (__dev_printk) from [] 
> (_dev_info+0x3c/0x48)
> [   91.758686]  r6: r5:ee7310

Re: [PATCH] sched: unlikely corrupted stack end

2016-06-14 Thread Peter Zijlstra
On Tue, Jun 14, 2016 at 02:43:06PM +0800, WANG Chao wrote:
> unlikely() was dropped in commit ce03e41 ("sched/core: Drop unlikely
> behind BUG_ON()"), but commit 29d6455 ("sched: panic on corrupted stack
> end") dropped BUG_ON() and called panic directly.

Please use git config core.abbrev=12 and try again.


Re: [PATCH v10 03/14] usb: hcd.h: Add OTG to HCD interface

2016-06-14 Thread Roger Quadros
+Alan

On 10/06/16 16:07, Roger Quadros wrote:
> The OTG core will use struct otg_hcd_ops to interface
> with the HCD (Host Controller Driver).
> 
> The main purpose of this interface is to avoid directly
> calling HCD APIs from the OTG core as they
> wouldn't be defined in the built-in symbol table if
> CONFIG_USB is m.
> 
> Signed-off-by: Roger Quadros 
> Acked-by: Peter Chen 
> ---
>  include/linux/usb/hcd.h | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
> index 66fc137..7729c1f 100644
> --- a/include/linux/usb/hcd.h
> +++ b/include/linux/usb/hcd.h
> @@ -400,6 +400,30 @@ struct hc_driver {
>  
>  };
>  
> +/**
> + * struct otg_hcd_ops - Interface between OTG core and HCD
> + *
> + * Provided by the HCD core to allow the OTG core to interface with the HCD
> + *
> + * @add: function to add the HCD
> + * @remove: function to remove the HCD
> + * @usb_bus_start_enum: function to immediately start bus enumeration
> + * @usb_control_msg: function to build and send a control URB
> + * @usb_hub_find_child: function to get pointer to the child device
> + */
> +struct otg_hcd_ops {
> + int (*add)(struct usb_hcd *hcd,
> +unsigned int irqnum, unsigned long irqflags);
> + void (*remove)(struct usb_hcd *hcd);
> + int (*usb_bus_start_enum)(struct usb_bus *bus, unsigned int port_num);
> + int (*usb_control_msg)(struct usb_device *dev, unsigned int pipe,
> +__u8 request, __u8 requesttype, __u16 value,
> +__u16 index, void *data, __u16 size,
> +int timeout);
> + struct usb_device * (*usb_hub_find_child)(struct usb_device *hdev,
> +   int port1);
> +};
> +
>  static inline int hcd_giveback_urb_in_bh(struct usb_hcd *hcd)
>  {
>   return hcd->driver->flags & HCD_BH;
> 

--
cheers,
-roger


Re: [PATCH v10 12/14] usb: hcd: Adapt to OTG core

2016-06-14 Thread Roger Quadros
+Alan

On 10/06/16 16:07, Roger Quadros wrote:
> Introduce usb_otg_add/remove_hcd() for use by host
> controllers that are part of OTG/dual-role port.
> 
> Non device tree platforms can use the otg_dev argument
> to specify the OTG controller device. If otg_dev is NULL
> then the device tree node's otg-controller property is used to
> get the otg_dev device.
> 
> Signed-off-by: Roger Quadros 
> Acked-by: Peter Chen 
> ---
>  drivers/usb/core/hcd.c  | 55 
> +
>  include/linux/usb/hcd.h |  4 
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index ae6c76d..c6f4155 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -46,6 +46,11 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
>  
>  #include "usb.h"
>  
> @@ -3025,6 +3030,56 @@ void usb_remove_hcd(struct usb_hcd *hcd)
>  }
>  EXPORT_SYMBOL_GPL(usb_remove_hcd);
>  
> +static struct otg_hcd_ops otg_hcd_intf = {
> + .add = usb_add_hcd,
> + .remove = usb_remove_hcd,
> + .usb_bus_start_enum = usb_bus_start_enum,
> + .usb_control_msg = usb_control_msg,
> + .usb_hub_find_child = usb_hub_find_child,
> +};
> +
> +/**
> + * usb_otg_add_hcd - Register the HCD with OTG core.
> + * @hcd: the usb_hcd structure to initialize
> + * @irqnum: Interrupt line to allocate
> + * @irqflags: Interrupt type flags
> + * @otg_dev: OTG controller device managing this HCD
> + *
> + * Registers the HCD with OTG core. OTG core will call usb_add_hcd()
> + * or usb_remove_hcd() as necessary.
> + * If otg_dev is NULL then device tree node is checked for OTG
> + * controller device via the otg-controller property.
> + */
> +int usb_otg_add_hcd(struct usb_hcd *hcd,
> + unsigned int irqnum, unsigned long irqflags,
> + struct device *otg_dev)
> +{
> + struct device *dev = hcd->self.controller;
> +
> + if (!otg_dev) {
> + hcd->otg_dev = of_usb_get_otg(dev->of_node);
> + if (!hcd->otg_dev)
> + return -ENODEV;
> + } else {
> + hcd->otg_dev = otg_dev;
> + }
> +
> + return usb_otg_register_hcd(hcd, irqnum, irqflags, &otg_hcd_intf);
> +}
> +EXPORT_SYMBOL_GPL(usb_otg_add_hcd);
> +
> +/**
> + * usb_otg_remove_hcd - Unregister the HCD with OTG core.
> + * @hcd: the usb_hcd structure to remove
> + *
> + * Unregisters the HCD from the OTG core.
> + */
> +void usb_otg_remove_hcd(struct usb_hcd *hcd)
> +{
> + usb_otg_unregister_hcd(hcd);
> +}
> +EXPORT_SYMBOL_GPL(usb_otg_remove_hcd);
> +
>  void
>  usb_hcd_platform_shutdown(struct platform_device *dev)
>  {
> diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
> index 36bd54f..0c70282 100644
> --- a/include/linux/usb/hcd.h
> +++ b/include/linux/usb/hcd.h
> @@ -473,6 +473,10 @@ extern int usb_hcd_is_primary_hcd(struct usb_hcd *hcd);
>  extern int usb_add_hcd(struct usb_hcd *hcd,
>   unsigned int irqnum, unsigned long irqflags);
>  extern void usb_remove_hcd(struct usb_hcd *hcd);
> +extern int usb_otg_add_hcd(struct usb_hcd *hcd,
> +unsigned int irqnum, unsigned long irqflags,
> +struct device *otg_dev);
> +extern void usb_otg_remove_hcd(struct usb_hcd *hcd);
>  extern int usb_hcd_find_raw_port_number(struct usb_hcd *hcd, int port1);
>  
>  struct platform_device;
> 

--
cheers,
-roger


RE: [PATCH v10 2/2] dmaengine: Add Xilinx zynqmp dma engine driver support

2016-06-14 Thread Appana Durga Kedareswara Rao
Hi Vinod,

Thanks for the review...

> 
> On Wed, Jun 08, 2016 at 07:40:52AM +, Appana Durga Kedareswara Rao
> wrote:
> > > > +static void zynqmp_dma_desc_config_eod(struct zynqmp_dma_chan
> > > > +*chan, void *desc)
> > >
> > > eod? 80 line?
> 
> What's eod?

End of descriptor...

> 
> > > > +int zynqmp_dma_channel_set_config(struct dma_chan *dchan,
> > > > + struct zynqmp_dma_config *cfg) {
> > > > +   struct zynqmp_dma_chan *chan = to_chan(dchan);
> > > > +
> > > > +   chan->config.ovrfetch = cfg->ovrfetch;
> > > > +   chan->config.has_sg = cfg->has_sg;
> > >
> > > is this HW capability? if so why would anyone not like to use it!
> >
> > Yes it is HW capability. It can be either in simple mode or SG mode
> > Earlier In the driver this configuration is read from the device-tree
> > But as per lars and your suggestion moved it as runtime config parameters.
> 
> If sg mode is available why would anyone _not_ want it?
> 
> I do not think there is point to have this

Do you mean always keeping the device in SG mode and providing an option
for simple DMA mode if the user wants to use it?

A few features that are available in simple DMA mode are not
available in SG mode, such as write-only DMA and read-only DMA.

> 
> >
> > >
> > > > +   chan->config.ratectrl = cfg->ratectrl;
> > > > +   chan->config.src_issue = cfg->src_issue;
> > > > +   chan->config.src_burst_len = cfg->src_burst_len;
> > > > +   chan->config.dst_burst_len = cfg->dst_burst_len;
> > >
> > > can you describe these parameters?
> > ratectl:
> > Rate control can be independently enabled per channel. When rate
> > control is enabled, the DMA channel uses the rate control count to schedule
> successive data read transactions.
> 
> And how is this used by client?

When rate control is enabled, the ZDMA channel uses the rate control count
to schedule successive data read transactions. It is a kind of flow control:
transactions are scheduled at fixed intervals instead of pumping the
transfers without delay or whenever the bus is available.

Rate control count register definition (11:0):
Scheduling interval for SRC AXI transactions, only used if rate control is
enabled.


> 
> > src_issue:
> > Tells outstanding transaction on SRC.
> 
> This should be read only then, right?

It is a Read/Write register:
http://www.xilinx.com/html_docs/registers/ug1087/ug1087-zynq-ultrascale-registers.html

By default it is configured for the maximum number of transactions.
If the user wants to limit it, they can do so using this config option.

> 
> > Burst_len:
> > Configures the burst length of the src and dst transfers...
> 
> Hmmm, but you are on memcpy, so that should be programmed for throughput?

Yes...

> 
> > >
> > > How would a client know how to configure them?
> >
> > With the default values of the config parameters driver will work.
> 
> But how will client know what is default!

Default values mean the IP's default state after reset.
Even if the user is not aware of the above parameters, the driver will still
work for basic functionality.
Do you want me to implement one more API, get_config, so that
whenever the user calls get_config they will know the default values
of the config parameters?

> 
> > If user has specific requirement to change these parameters they can
> > pass It to the driver using set_config API and all these parameters
> > are Documented in the include/linux/dma/xilinx_dma.h file...
> 
> Can you give me an example where user would like to do that


I am using a customized DMA test client.
There I call the set_config API before triggering memcpy/SG operations.

Regards,
Kedar.
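
The get_config/set_config flow discussed above could look roughly like the
userspace sketch below. It is only an illustration: the field names come from
the quoted patch, but the types, the demo_* names, the validation, and the
return values are assumptions, not the driver's API. The one hardware detail
used is the 12-bit (11:0) rate control count mentioned earlier in this mail.

```c
#include <assert.h>

/* Field names follow the quoted patch; the types are assumptions. */
struct demo_dma_config {
	int ovrfetch;
	int has_sg;
	unsigned int ratectrl;       /* scheduling interval, bits 11:0 */
	unsigned int src_issue;
	unsigned int src_burst_len;
	unsigned int dst_burst_len;
};

/* Hypothetical channel holding the currently active configuration. */
struct demo_dma_chan {
	struct demo_dma_config config;
};

#define DEMO_RATECTRL_MASK 0xFFFu    /* rate control count is 12 bits wide */

/* Lets a client read the current (e.g. post-reset default) settings. */
static void demo_get_config(const struct demo_dma_chan *chan,
			    struct demo_dma_config *cfg)
{
	*cfg = chan->config;
}

/* Rejects values that do not fit the register field; -1 is a placeholder. */
static int demo_set_config(struct demo_dma_chan *chan,
			   const struct demo_dma_config *cfg)
{
	if (cfg->ratectrl & ~DEMO_RATECTRL_MASK)
		return -1;
	chan->config = *cfg;
	return 0;
}
```

With a pair like this, a client reads the defaults, changes only the field it
cares about, and writes the result back, so it never has to guess the other
values.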

> 
> --
> ~Vinod


Re: [PATCH 2/3] i2c: octeon: Add retry logic after receiving STAT_RXADDR_NAK

2016-06-14 Thread Jan Glauber
On Thu, Jun 09, 2016 at 10:11:51PM +0200, Wolfram Sang wrote:
> On Wed, Jun 08, 2016 at 08:51:18AM +0200, Jan Glauber wrote:
> > The controller specification states that when receiving STAT_RXADDR_NAK
> > the START should be sent again. Retry several times before finally
> > failing with -ENXIO.
> > 
> > Without this change the IPMI SSIF driver fails executing several commands
> > like 'ipmitool fru' on ThunderX.
> 
> Huh? Looks wrong to me. I'd say the client driver needs to retry. Only
> that one knows if retrying is appropriate or a waste of time.
> 

I've been debugging this and it turned out that there was a related issue with
the clock setting. With that corrected, this patch is not needed anymore,
so you can drop it.

I still see a huge number of RXADDR_NAK's after START but the ipmi_ssif
driver retry logic seems to deal with that.

thanks,
Jan
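
For reference, a client-side retry of the kind Wolfram describes could be
sketched as below. This is a userspace illustration, not the ipmi_ssif code:
the callback type, the demo_* names, and the fake bus are made up. The only
convention taken from the thread is that an address NAK is reported as -ENXIO.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical transfer callback: returns 0 on success or a negative errno. */
typedef int (*demo_xfer_fn)(void *ctx);

/* Retry only on address NAK (-ENXIO); any other result is returned as-is. */
static int demo_xfer_with_retry(demo_xfer_fn xfer, void *ctx, int max_retries)
{
	int ret;
	int attempt = 0;

	do {
		ret = xfer(ctx);
	} while (ret == -ENXIO && attempt++ < max_retries);

	return ret;
}

/* Fake bus for illustration: NAKs the address a fixed number of times. */
struct demo_fake_bus {
	int naks_left;
	int calls;
};

static int demo_fake_xfer(void *ctx)
{
	struct demo_fake_bus *bus = ctx;

	bus->calls++;
	if (bus->naks_left > 0) {
		bus->naks_left--;
		return -ENXIO;
	}
	return 0;
}
```

Keeping the loop in the client means the retry budget is chosen by the code
that knows whether retrying makes sense for that device.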


Re: [PATCH v10 14/14] usb: host: xhci-plat: Add otg device to platform data

2016-06-14 Thread Roger Quadros
Mathias,

On 10/06/16 16:07, Roger Quadros wrote:
> Host controllers that are part of an OTG/dual-role instance
> need to somehow pass the OTG controller device information
> to the HCD core.
> 
> We use platform data to pass the OTG controller device.
> 
> Signed-off-by: Roger Quadros 
> Reviewed-by: Peter Chen 
> ---
>  drivers/usb/host/xhci-plat.c | 35 ---
>  include/linux/usb/xhci_pdriver.h |  3 +++
>  2 files changed, 31 insertions(+), 7 deletions(-)

Any comments on this one?

> 
> diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
> index 676ea45..24d030a 100644
> --- a/drivers/usb/host/xhci-plat.c
> +++ b/drivers/usb/host/xhci-plat.c
> @@ -239,11 +239,20 @@ static int xhci_plat_probe(struct platform_device *pdev)
>   goto put_usb3_hcd;
>   }
>  
> - ret = usb_add_hcd(hcd, irq, IRQF_SHARED);
> + if (pdata && pdata->otg_dev)
> + ret = usb_otg_add_hcd(hcd, irq, IRQF_SHARED, pdata->otg_dev);
> + else
> + ret = usb_add_hcd(hcd, irq, IRQF_SHARED);
> +
>   if (ret)
>   goto disable_usb_phy;
>  
> - ret = usb_add_hcd(xhci->shared_hcd, irq, IRQF_SHARED);
> + if (pdata && pdata->otg_dev)
> + ret = usb_otg_add_hcd(xhci->shared_hcd, irq, IRQF_SHARED,
> +   pdata->otg_dev);
> + else
> + ret = usb_add_hcd(xhci->shared_hcd, irq, IRQF_SHARED);
> +
>   if (ret)
>   goto dealloc_usb2_hcd;
>  
> @@ -251,7 +260,10 @@ static int xhci_plat_probe(struct platform_device *pdev)
>  
>  
>  dealloc_usb2_hcd:
> - usb_remove_hcd(hcd);
> + if (pdata && pdata->otg_dev)
> + usb_otg_remove_hcd(hcd);
> + else
> + usb_remove_hcd(hcd);
>  
>  disable_usb_phy:
>   usb_phy_shutdown(hcd->usb_phy);
> @@ -269,16 +281,25 @@ put_hcd:
>   return ret;
>  }
>  
> -static int xhci_plat_remove(struct platform_device *dev)
> +static int xhci_plat_remove(struct platform_device *pdev)
>  {
> - struct usb_hcd  *hcd = platform_get_drvdata(dev);
> + struct usb_hcd  *hcd = platform_get_drvdata(pdev);
>   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
>   struct clk *clk = xhci->clk;
> + struct usb_xhci_pdata *pdata = dev_get_platdata(&pdev->dev);
> +
> + if (pdata && pdata->otg_dev)
> + usb_otg_remove_hcd(xhci->shared_hcd);
> + else
> + usb_remove_hcd(xhci->shared_hcd);
>  
> - usb_remove_hcd(xhci->shared_hcd);
>   usb_phy_shutdown(hcd->usb_phy);
>  
> - usb_remove_hcd(hcd);
> + if (pdata && pdata->otg_dev)
> + usb_otg_remove_hcd(hcd);
> + else
> + usb_remove_hcd(hcd);
> +
>   usb_put_hcd(xhci->shared_hcd);
>  
>   if (!IS_ERR(clk))
> diff --git a/include/linux/usb/xhci_pdriver.h b/include/linux/usb/xhci_pdriver.h
> index 376654b..5c68b83 100644
> --- a/include/linux/usb/xhci_pdriver.h
> +++ b/include/linux/usb/xhci_pdriver.h
> @@ -18,10 +18,13 @@
>   *
>   * @usb3_lpm_capable:determines if this xhci platform supports USB3
>   *   LPM capability
> + * @otg_dev: OTG controller device. Only required if part of
> + *   OTG/dual-role.
>   *
>   */
>  struct usb_xhci_pdata {
>   unsignedusb3_lpm_capable:1;
> + struct device   *otg_dev;
>  };
>  
>  #endif /* __USB_CORE_XHCI_PDRIVER_H */
> 

--
cheers,
-roger


[PATCH v2] sched: unlikely corrupted stack end

2016-06-14 Thread WANG Chao
unlikely() was dropped in commit ce03e4137bb2 ("sched/core: Drop
unlikely behind BUG_ON()"), but commit 29d6455178a0 ("sched: panic on
corrupted stack end") dropped BUG_ON() and called panic directly.

Now we should bring unlikely() back for branch prediction. While we're
at it, it's better and cleaner to turn task_stack_end_corrupted() into
an inline function.

Signed-off-by: WANG Chao 
---
 include/linux/sched.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e42ada26345..797ca1975431 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2997,8 +2997,11 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
-#define task_stack_end_corrupted(task) \
-   (*(end_of_stack(task)) != STACK_END_MAGIC)
+
+static inline int task_stack_end_corrupted(struct task_struct *p)
+{
+   return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
+}
 
 static inline int object_is_on_stack(void *obj)
 {
-- 
2.8.4
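
A minimal userspace sketch of the same idea, assuming GCC or Clang (the
unlikely() macro below uses __builtin_expect). struct fake_task and its
layout are hypothetical stand-ins for task_struct; only the function names
and the STACK_END_MAGIC value mirror the kernel.

```c
#include <assert.h>

/* Same value as the kernel's STACK_END_MAGIC canary. */
#define STACK_END_MAGIC 0x57AC6E9DUL

/* Hint to the compiler that the condition is rarely true. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for task_struct: just a small stack area. */
struct fake_task {
	unsigned long stack[16];
};

/* Assumes a downward-growing stack, so the end is the lowest word. */
static inline unsigned long *end_of_stack(struct fake_task *p)
{
	return &p->stack[0];
}

/* Typed replacement for the macro: the argument is checked by the compiler. */
static inline int task_stack_end_corrupted(struct fake_task *p)
{
	return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
}
```

Compared with the macro, the inline function gets type checking on its
argument and evaluates it exactly once, while the hint still only affects
code layout, not the result.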



Re: [PATCH 1/3] net: Add MDIO bus driver for the Hisilicon FEMAC

2016-06-14 Thread Li Dongpo


On 2016/6/13 21:32, Andrew Lunn wrote:
> On Mon, Jun 13, 2016 at 02:07:54PM +0800, Dongpo Li wrote:
>> This patch adds a separate driver for the MDIO interface of the
>> Hisilicon Fast Ethernet MAC.
>>
>> Reviewed-by: Jiancheng Xue 
>> Signed-off-by: Dongpo Li 
>> ---
>>  .../bindings/net/hisilicon-femac-mdio.txt  |  22 +++
>>  drivers/net/phy/Kconfig|   8 +
>>  drivers/net/phy/Makefile   |   1 +
>>  drivers/net/phy/mdio-hisi-femac.c  | 165 
>> +
>>  4 files changed, 196 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/net/hisilicon-femac-mdio.txt
>>  create mode 100644 drivers/net/phy/mdio-hisi-femac.c

[...]
>> +
> 
> Hi Dongpo
> 
> Overall this looks good. Just some minor comments
> 
>> +static int hisi_femac_mdio_wait_ready(struct mii_bus *bus)
>> +{
>> +struct hisi_femac_mdio_data *data = bus->priv;
> 
> You could just pass data here. Your read and write functions already
> have it.
> 
Thank you, I will fix it in the next patch version.

>> +data->clk = devm_clk_get(&pdev->dev, NULL);
>> +if (IS_ERR(data->clk)) {
>> +ret = -ENODEV;
>> +goto err_out_free_mdiobus;
>> +}
> 
> Return the error which devm_clk_get() gives you.
> 
ok, I will fix it.

>> +
>> +ret = clk_prepare_enable(data->clk);
>> +if (ret)
>> +goto err_out_free_mdiobus;
>> +
>> +ret = of_mdiobus_register(bus, np);
>> +if (ret)
>> +goto err_out_free_mdiobus;
> 
> You leave the clock prepared and enabled on error.
> 
ok, I will fix it.

> Andrew
> 
> .
> 
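
The two error-path comments above (return the error you were given, and undo
the clock enable if a later step fails) can be sketched in userspace as
follows. Everything named fake_* is a hypothetical stand-in with failure
injection; in the driver, the first step would use PTR_ERR(data->clk) rather
than a plain return code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the clock and MDIO bus, for illustration only. */
struct fake_clk {
	int enabled;
};

struct fake_probe_env {
	int clk_get_err;      /* negative errno to inject, or 0 */
	int clk_enable_err;   /* negative errno to inject, or 0 */
	int register_err;     /* negative errno to inject, or 0 */
	struct fake_clk clk;
};

static int fake_clk_get(struct fake_probe_env *env)
{
	return env->clk_get_err;
}

static int fake_clk_prepare_enable(struct fake_probe_env *env)
{
	if (env->clk_enable_err)
		return env->clk_enable_err;
	env->clk.enabled = 1;
	return 0;
}

static void fake_clk_disable_unprepare(struct fake_probe_env *env)
{
	env->clk.enabled = 0;
}

static int fake_mdiobus_register(struct fake_probe_env *env)
{
	return env->register_err;
}

static int fake_probe(struct fake_probe_env *env)
{
	int ret;

	/* Propagate the lookup error instead of replacing it with -ENODEV. */
	ret = fake_clk_get(env);
	if (ret)
		goto err_out_free_mdiobus;

	ret = fake_clk_prepare_enable(env);
	if (ret)
		goto err_out_free_mdiobus;

	ret = fake_mdiobus_register(env);
	if (ret)
		goto err_out_disable_clk;

	return 0;

err_out_disable_clk:
	/* Undo the enable so the clock is not left running on failure. */
	fake_clk_disable_unprepare(env);
err_out_free_mdiobus:
	/* The real driver would free the mii_bus here. */
	return ret;
}
```

The labels unwind in reverse order of setup, so each failure point jumps to
the label that undoes exactly what has been done so far.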



Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression

2016-06-14 Thread Kirill A. Shutemov
On Mon, Jun 13, 2016 at 11:11:05PM -0700, Linus Torvalds wrote:
> On Mon, Jun 13, 2016 at 5:52 AM, Kirill A. Shutemov
>  wrote:
> > On Sat, Jun 11, 2016 at 06:02:57PM -0700, Linus Torvalds wrote:
> >>
> >> I've timed it at over a thousand cycles on at least some CPU's, but
> >> that's still peanuts compared to a real page fault. It shouldn't be
> >> *that* noticeable, ie no way it's a 6% regression on its own.
> >
> > Looks like setting accessed bit is the problem.
> 
> Ok. I've definitely seen it as an issue, but never to the point of
> several percent on a real benchmark that wasn't explicitly testing
> that cost.
> 
> I reported the excessive dirty/accessed bit cost to Intel back in the
> P4 days, but it's apparently not been high enough for anybody to care.
> 
> > We spend 36% more time in page walk only, about 1% of total userspace time.
> > Combining this with page walk footprint on caches, I guess we can get to
> > this 3.5% score difference I see.
> >
> > I'm not sure if there's anything we can do to solve the issue without
> > > screwing reclaim logic again. :(
> 
> I think we should say "screw the reclaim logic" for now, and revert
> commit 5c0a85fad949 for now.

Okay. I'll prepare the patch.

> Considering how much trouble the accessed bit is on some other
> architectures too, I wonder if we should strive to simply not care
> about it, and always leaving it set. And then rely entirely on just
> unmapping the pages and making the "we took a page fault after
> unmapping" be the real activity tester.
> 
> So get rid of the "if the page is young, mark it old but leave it in
> the page tables" logic entirely. When we unmap a page, it will always
> either be in the swap cache or the page cache anyway, so faulting it
> in again should be just a minor fault with no actual IO happening.
> 
> That might be less of an impact in the end - yes, the unmap and
> re-fault is much more expensive, but it presumably happens to much
> fewer pages.
> 
> What do you think?

Well, we cannot do this for anonymous memory. No swap -- no swap cache, if
I read the code correctly.

I guess it's doable for file mappings, although I would expect regressions
in other benchmarks. IIUC, it would require page unmapping to propagate
the page to the active list, which is suboptimal.

And the implications for page_idle are not clear to me.

Rik, Mel, any comments?

-- 
 Kirill A. Shutemov


Re: [PATCH RFC] slub: reap free slabs periodically

2016-06-14 Thread Vladimir Davydov
On Fri, Jun 10, 2016 at 04:32:26PM -0500, Christoph Lameter wrote:
> One reason for SLUB's creation was the 2 second scans in SLAB, which cause
> significant disruption of latency-sensitive tasks.

That's not good, indeed.

> 
> You can simply implement a reaper in userspace by running
> 
> slabinfo -s
> 
> if you have to have this.

Doing this periodically would probably hurt performance of active caches
as 'slabinfo -s' shrinks all slabs unconditionally, even if they are
being actively used. OTOH, one could trigger shrinking slabs only on
memory pressure. That would require yet another daemon tracking the
system state, but it is doable I guess.

Thanks a lot for your input, Christoph.

> 
> There is no need to duplicate SLAB problems.


[PATCH v2 2/2] sound: lpass-platform: Move dma channel allocation to pcmops

2016-06-14 Thread Srinivas Kandagatla
Move dma channel allocations to pcmops open and close functions. The
reason is that lpass_platform_pcm_free() accesses snd_soc_pcm_runtime
via substream->private_data; however, by this time the runtimes are
already freed as part of the soc_cleanup_card_resources() sequence.

This patch moves the channel allocations/deallocations to pcmops open()
and close() respectively, where the code has a valid snd_soc_pcm_runtime.

Without this patch, unloading the lpass sound card module results in the
crash below:

Unable to handle kernel NULL pointer dereference at virtual address

pgd = 800038f0d000
[] *pgd=
Internal error: Oops: 9604 [#1] PREEMPT SMP
Modules linked in: snd_soc_apq8016_sbc(-) snd_soc_lpass_apq8016
snd_soc_lpass_cpu snd_soc_lpass_platform
CPU: 0 PID: 1573 Comm: rmmod Not tainted 4.7.0-rc2-next-20160609+ #59
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
task: 800038cd ti: 80003929c000 task.ti: 80003929c000
PC is at lpass_platform_pcm_free+0xc4/0x1c0 [snd_soc_lpass_platform]
LR is at lpass_platform_pcm_free+0xb8/0x1c0 [snd_soc_lpass_platform]
pc : [] lr : [] pstate: 6145
sp : 80003929fa90
x29: 80003929fa90 x28: 00b22438
x27: 00b22450 x26: 00b22468
x25: 00b22488 x24: 00b223f0
x23: 00b22418 x22: 800038f428c0
x21: 8000392ae280 x20: 0001
x19: 00b22118 x18: dc331600
x17: b78036c0 x16: 081c16e8
x15: b77f0588 x14: 3d4d554e51455300
x13:  x12: 0028
x11: 0044 x10: 80003929f822
x9 : 80003929f823 x8 : 
x7 : 0004 x6 : 08864890
x5 :  x4 : 
x3 :  x2 : 80003efac228
x1 : 00b22118 x0 : 00b22450

Process rmmod (pid: 1573, stack limit = 0x80003929c020)
Stack: (0x80003929fa90 to 0x8000392a)
fa80:  80003929fb40 086d1f8c
faa0: 08ca5408 800038f42200 08ca5420 000b
fac0: 80003929fd70 0015 0120 006a
fae0: 087f2000 80003929c000 80003929fb40 8000392ae358
fb00: 8000392af900  8000392afa48 000b
fb20: 80003929fd70 0015 80003929fb80 086cc070
fb40: 80003929fb70 086d21d4 800038f7fa00 8000392801a0
fb60: 800038cf2000 0015 80003929fb80 086cc064
fb80: 80003929fba0 086cc1f4 800038f70600 000b
fba0: 80003929fbc0 086c68a8 80003928 800039280540
fbc0: 80003929fbe0 084a7438 800039280540 800039280550
fbe0: 80003929fc10 08355ddc 800039280550 08c64718
fc00: 800038f61d00 08ca5190 80003929fc40 08355e5c
fc20: 800039280550 80003928 80003847b618 80003928
fc40: 80003929fc60 084a77d8  0015
fc60: 80003929fc70 086c6e58 80003929fc90 086c6fbc
fc80: 80003929fcb0 80003928 80003929fcd0 086e3e50
fca0: 80003847b050 80003847b728 8000 
fcc0: 80003929fcc0 80003929fcc0 80003929fd00 086e4c8c
fce0: 80003847b618 800038f61100 8000399ddf90 000b
fd00: 80003929fd20 086f0684 800038f61000 080d51c0
fd20: 80003929fd30 084af904 80003929fd80 084afcf8
fd40: 8000399ddf90 00b3c028 8000399ddff0 08cc8000
fd60: 8000 084ac090 800038f94600 800038f61000
fd80: 80003929fda0 084ac0b0 8000399ddf90 00b3c028
fda0: 80003929fdc0 084ac234 8000399ddf90 00b3c028
fdc0: 80003929fdf0 084ab3d4 00b3c028 08c64000
fde0: 08c64818 0001 80003929fe20 084ac8ac
fe00: 00b3c028 00b3c100 fff5 
fe20: 80003929fe40 084ad998 08c2d000 0015
fe40: 80003929fe50 00b3a460 80003929fe60 08120fe4
fe60:  08084e70  
fe80:  954cca48 0004 5f636f735f646e73
fea0: 5f36313038717061 00636273  08084d64
fec0:   bc814340 0800
fee0: 4fdc43dac03e2300 2002 95548e58 d9f89fb9
ff00:   006a 1999
ff20:   0005 
ff40: 95402a94 9554a588 954cca40 af8d22d0
ff60: d9f8ad70 bc8142e0  
ff80: d9f8be7c  d9f8b0e0 d9f8b2b8
ffa0: aa

[PATCH v2 1/2] sound: lpass-cpu: add module licence and description

2016-06-14 Thread Srinivas Kandagatla
This patch adds a module licence and description to the lpass-cpu driver.
Without this patch, loading the lpass-cpu module taints the kernel with the
errors below:

snd_soc_lpass_cpu: module license 'unspecified' taints kernel.
Disabling lock debugging due to kernel taint
snd_soc_lpass_cpu: Unknown symbol regmap_write (err 0)
snd_soc_lpass_cpu: Unknown symbol devm_kmalloc (err 0)
...

Acked-by: Kenneth Westfield 
Signed-off-by: Srinivas Kandagatla 
---
 sound/soc/qcom/lpass-cpu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/soc/qcom/lpass-cpu.c b/sound/soc/qcom/lpass-cpu.c
index 3cde9fb..eff3f9a 100644
--- a/sound/soc/qcom/lpass-cpu.c
+++ b/sound/soc/qcom/lpass-cpu.c
@@ -586,3 +586,6 @@ int asoc_qcom_lpass_cpu_platform_remove(struct platform_device *pdev)
return 0;
 }
 EXPORT_SYMBOL_GPL(asoc_qcom_lpass_cpu_platform_remove);
+
+MODULE_DESCRIPTION("QTi LPASS CPU Driver");
+MODULE_LICENSE("GPL v2");
-- 
2.8.3



[v3 PATCH 1/2] scsi:stex.c Support Pegasus 3 product

2016-06-14 Thread Charles Chiou
From: Charles 

The Pegasus series is a RAID product line that uses Thunderbolt technology.

The newest product, Pegasus 3, supports Thunderbolt 3 technology with a
different chip.

1. Change the driver version.

2. Add the Pegasus 3 VID and DID and define its device address.

3. Pegasus 3 uses MSI interrupts, so stex_request_irq enables MSI for the
P3 type.

4. For hibernation, use msi_lock in stex_ss_handshake to prevent the MSI
register from being written again during handshaking.

5. Pegasus 3 doesn't need read() as a flush.

6. In stex_ss_intr & stex_abort, P3 only clears the interrupt register when
it gets a vendor-defined interrupt.

Signed-off-by: Charles 
Signed-off-by: Paul 
---
 drivers/scsi/stex.c | 240 ++--
 1 file changed, 175 insertions(+), 65 deletions(-)

diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c
index 5b23175..cad53b5 100644
--- a/drivers/scsi/stex.c
+++ b/drivers/scsi/stex.c
@@ -38,8 +38,8 @@
 #include 
 
 #define DRV_NAME "stex"
-#define ST_DRIVER_VERSION  "5.00..01"
-#define ST_VER_MAJOR   5
+#define ST_DRIVER_VERSION  "6.00..01"
+#define ST_VER_MAJOR   6
 #define ST_VER_MINOR   00
 #define ST_OEM 
 #define ST_BUILD_VER   01
@@ -64,6 +64,13 @@ enum {
YI2H_INT_C  = 0xa0,
YH2I_REQ= 0xc0,
YH2I_REQ_HI = 0xc4,
+   PSCRATCH0   = 0xb0,
+   PSCRATCH1   = 0xb4,
+   PSCRATCH2   = 0xb8,
+   PSCRATCH3   = 0xbc,
+   PSCRATCH4   = 0xc8,
+   MAILBOX_BASE= 0x1000,
+   MAILBOX_HNDSHK_STS  = 0x0,
 
/* MU register value */
MU_INBOUND_DOORBELL_HANDSHAKE   = (1 << 0),
@@ -87,7 +94,7 @@ enum {
MU_STATE_STOP   = 5,
MU_STATE_NOCONNECT  = 6,
 
-   MU_MAX_DELAY= 120,
+   MU_MAX_DELAY= 50,
MU_HANDSHAKE_SIGNATURE  = 0x5555,
MU_HANDSHAKE_SIGNATURE_HALF = 0x5a5a,
MU_HARD_RESET_WAIT  = 3,
@@ -135,6 +142,7 @@ enum {
st_yosemite = 2,
st_seq  = 3,
st_yel  = 4,
+   st_P3   = 5,
 
PASSTHRU_REQ_TYPE   = 0x0001,
PASSTHRU_REQ_NO_WAKEUP  = 0x0100,
@@ -339,6 +347,7 @@ struct st_hba {
u16 rq_size;
u16 sts_count;
u8  supports_pm;
+   int msi_lock;
 };
 
 struct st_card_info {
@@ -540,11 +549,15 @@ stex_ss_send_cmd(struct st_hba *hba, struct req_msg *req, u16 tag)
 
++hba->req_head;
hba->req_head %= hba->rq_count+1;
-
-   writel((addr >> 16) >> 16, hba->mmio_base + YH2I_REQ_HI);
-   readl(hba->mmio_base + YH2I_REQ_HI); /* flush */
-   writel(addr, hba->mmio_base + YH2I_REQ);
-   readl(hba->mmio_base + YH2I_REQ); /* flush */
+   if (hba->cardtype == st_P3) {
+   writel((addr >> 16) >> 16, hba->mmio_base + YH2I_REQ_HI);
+   writel(addr, hba->mmio_base + YH2I_REQ);
+   } else {
+   writel((addr >> 16) >> 16, hba->mmio_base + YH2I_REQ_HI);
+   readl(hba->mmio_base + YH2I_REQ_HI); /* flush */
+   writel(addr, hba->mmio_base + YH2I_REQ);
+   readl(hba->mmio_base + YH2I_REQ); /* flush */
+   }
 }
 
 static void return_abnormal_state(struct st_hba *hba, int status)
@@ -974,15 +987,31 @@ static irqreturn_t stex_ss_intr(int irq, void *__hba)
 
spin_lock_irqsave(hba->host->host_lock, flags);
 
-   data = readl(base + YI2H_INT);
-   if (data && data != 0x) {
-   /* clear the interrupt */
-   writel(data, base + YI2H_INT_C);
-   stex_ss_mu_intr(hba);
-   spin_unlock_irqrestore(hba->host->host_lock, flags);
-   if (unlikely(data & SS_I2H_REQUEST_RESET))
-   queue_work(hba->work_q, &hba->reset_work);
-   return IRQ_HANDLED;
+   if (hba->cardtype == st_yel) {
+   data = readl(base + YI2H_INT);
+   if (data && data != 0x) {
+   /* clear the interrupt */
+   writel(data, base + YI2H_INT_C);
+   stex_ss_mu_intr(hba);
+   spin_unlock_irqrestore(hba->host->host_lock, flags);
+   if (unlikely(data & SS_I2H_REQUEST_RESET))
+   queue_work(hba->work_q, &hba->reset_work);
+   return IRQ_HANDLED;
+   }
+   } else {
+   data = readl(base + PSCRATCH4);
+   if (dat

[v3 PATCH 2/2] scsi:stex.c Add S6 support

2016-06-14 Thread Charles Chiou
From: Charles 

1. Add a reboot notifier and register it in stex_probe for all supported
devices.

2. For all supported devices, in the restart flow we get a callback from the
notifier and set S6flag so that stex_shutdown & stex_hba_stop send the
restart command to the FW.

Signed-off-by: Charles 
Signed-off-by: Paul 
---
 drivers/scsi/stex.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c
index cad53b5..7b29a00 100644
--- a/drivers/scsi/stex.c
+++ b/drivers/scsi/stex.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -362,6 +363,12 @@ struct st_card_info {
u16 sts_count;
 };
 
+int S6flag;
+static int stex_halt(struct notifier_block *nb, ulong event, void *buf);
+static struct notifier_block stex_notifier = {
+   stex_halt, NULL, 0
+};
+
 static int msi;
 module_param(msi, int, 0);
 MODULE_PARM_DESC(msi, "Enable Message Signaled Interrupts(0=off, 1=on)");
@@ -1655,6 +1662,9 @@ static int stex_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
pci_set_master(pdev);
 
+   S6flag = 0;
+   register_reboot_notifier(&stex_notifier);
+
host = scsi_host_alloc(&driver_template, sizeof(struct st_hba));
 
if (!host) {
@@ -1929,15 +1939,20 @@ static void stex_remove(struct pci_dev *pdev)
scsi_host_put(hba->host);
 
pci_disable_device(pdev);
+
+   unregister_reboot_notifier(&stex_notifier);
 }
 
 static void stex_shutdown(struct pci_dev *pdev)
 {
struct st_hba *hba = pci_get_drvdata(pdev);
 
-   if (hba->supports_pm == 0)
+   if (hba->supports_pm == 0) {
stex_hba_stop(hba, ST_IGNORED);
-   else
+   } else if (hba->supports_pm == 1 && S6flag) {
+   unregister_reboot_notifier(&stex_notifier);
+   stex_hba_stop(hba, ST_S6);
+   } else
stex_hba_stop(hba, ST_S5);
 }
 
@@ -1974,6 +1989,12 @@ static int stex_resume(struct pci_dev *pdev)
stex_handshake(hba);
return 0;
 }
+
+static int stex_halt(struct notifier_block *nb, unsigned long event, void *buf)
+{
+   S6flag = 1;
+   return NOTIFY_OK;
+}
 MODULE_DEVICE_TABLE(pci, stex_pci_tbl);
 
 static struct pci_driver stex_pci_driver = {
-- 
1.9.1
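
The branch order in the patched stex_shutdown() can be summarised by the
small userspace sketch below. The enum and function names are hypothetical
and the enum values are placeholders; the driver has its own ST_IGNORED,
ST_S5 and ST_S6 constants.

```c
#include <assert.h>

/* Hypothetical values; the driver defines its own ST_* constants. */
enum demo_stop_kind {
	DEMO_ST_IGNORED,
	DEMO_ST_S5,
	DEMO_ST_S6
};

/* Mirrors the branch order in the patched stex_shutdown(). */
static enum demo_stop_kind demo_pick_stop_kind(int supports_pm, int reboot_seen)
{
	if (supports_pm == 0)
		return DEMO_ST_IGNORED;
	if (supports_pm == 1 && reboot_seen)
		return DEMO_ST_S6;
	return DEMO_ST_S5;
}
```

Here reboot_seen plays the role of S6flag: it only changes the outcome for
devices that support power management.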



Re: [very-RFC 0/8] TSN driver for the kernel

2016-06-14 Thread Henrik Austad
On Mon, Jun 13, 2016 at 08:56:44AM -0700, John Fastabend wrote:
> On 16-06-13 04:47 AM, Richard Cochran wrote:
> > [...]
> > Here is what is missing to support audio TSN:
> > 
> > * User Space
> > 
> > 1. A proper userland stack for AVDECC, MAAP, FQTSS, and so on.  The
> >OpenAVB project does not offer much beyond simple examples.
> > 
> > 2. A user space audio application that puts it all together, making
> >use of the services in #1, the linuxptp gPTP service, the ALSA
> >services, and the network connections.  This program will have all
> >the knowledge about packet formats, AV encodings, and the local HW
> >capabilities.  This program cannot yet be written, as we still need
> >some kernel work in the audio and networking subsystems.
> > 
> > * Kernel Space
> > 
> > 1. Providing frames with a future transmit time.  For normal sockets,
> >this can be in the CMESG data.  For mmap'ed buffers, we will need a
> >new format.  (I think Arnd is working on a new layout.)
> > 
> > 2. Time based qdisc for transmitted frames.  For MACs that support
> >this (like the i210), we only have to place the frame into the
> >correct queue.  For normal HW, we want to be able to reserve a time
> >window in which non-TSN frames are blocked.  This is some work, but
> >in the end it should be a generic solution that not only works
> >"perfectly" with TSN HW but also provides best effort service using
> >any NIC.
> > 
> 
> When I looked at this awhile ago I convinced myself that it could fit
> fairly well into the DCB stack (DCB is also part of 802.1Q). A lot of
> the traffic class to queue mappings and priories could be handled here.
> It might be worth taking a look at ./net/sched/mqprio.c and ./net/dcb/.

Interesting, I'll have a look at dcb and mqprio, I'm not familiar with 
those systems. Thanks for pointing those out!

I hope that the complexity doesn't run crazy though. TSN is not aimed at
datacenters; a lot of the endpoints are going to be embedded devices, and
introducing a massive stack for handling every eventuality in 802.1Q is
going to be counterproductive.

> Unfortunately I didn't get too far along but we probably don't want
> another mechanism to map hw queues/tcs/etc if the existing interfaces
> work or can be extended to support this.

Sure, I get that, as long as the complexity for setting up a link doesn't 
go through the roof :)

Thanks!

-- 
Henrik Austad



RE: [RESEND PATCH v2 1/2] device property: Add function to search for named child of device

2016-06-14 Thread Opensource [Adam Thomson]
On 13 June 2016 20:33, Frank Rowand wrote:

> > DT node names are case insensitive. The of.h header does provide a helper
> > macro which is equivalent to this, but that macro is part of the
> > '#ifdef CONFIG_OF' block. If I were to use it then it would cause non-DT
> > builds to fail. I opted for strcasecmp() directly as I didn't think for
> > just this one scenario it made sense to reorganise the of.h header with
> > regards to the helper macros. Of course if there are other opinions on this
> > then am happy to listen.
>
> DT node names are not always case insensitive.  Please use of_node_cmp().
>
> -Frank

Ok, fair enough. I'll have to move those definitions in the of.h header out of
the CONFIG_OF block then.
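
Roughly what I have in mind, as a userspace sketch only: a comparison macro
that defaults to strcasecmp() but can be overridden, with the caller going
through the macro. The demo_* names and the DEMO_CASE_SENSITIVE_NODE_NAMES
switch are made up for illustration and are not real kernel symbols.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical switch standing in for an architecture override; it is
 * not a real kernel config symbol.
 */
#ifdef DEMO_CASE_SENSITIVE_NODE_NAMES
#define demo_node_cmp(s1, s2) strcmp((s1), (s2))
#else
#define demo_node_cmp(s1, s2) strcasecmp((s1), (s2))
#endif

/* Returns 1 if the child name matches, using the shared comparison macro. */
static int demo_child_name_matches(const char *node_name, const char *wanted)
{
	return demo_node_cmp(node_name, wanted) == 0;
}
```

Callers then never hard-code the case rule, which is the point of using
of_node_cmp() instead of strcasecmp() directly.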


[PATCH 0/2] Reverts to address unixbench regression

2016-06-14 Thread Kirill A. Shutemov
Faultaround changes cause regression in unixbench, let's revert them.

Kirill A. Shutemov (2):
  Revert "mm: make faultaround produce old ptes"
  Revert "mm: disable fault around on emulated access bit architecture"

 include/linux/mm.h |  2 +-
 mm/filemap.c   |  2 +-
 mm/memory.c| 31 +--
 3 files changed, 7 insertions(+), 28 deletions(-)

-- 
2.8.1



[PATCH] [ACPI] Change structure initialisation to C99 style

2016-06-14 Thread Amitoj Kaur Chawla
Replace the in order struct initialisation style with explicit field
style.

The Coccinelle semantic patch used to make this change is as follows:

@decl@
identifier i1,fld;
type T;
field list[n] fs;
@@

struct i1 {
 fs
 T fld;
 ...};

@@
identifier decl.i1,i2,decl.fld;
expression e;
position bad.p, bad.fix;
@@

struct i1 i2@p = { ...,
+ .fld = e
- e@fix
 ,...};

Signed-off-by: Amitoj Kaur Chawla 
---
 arch/ia64/kernel/acpi-ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/acpi-ext.c b/arch/ia64/kernel/acpi-ext.c
index bd09bf7..31e331a 100644
--- a/arch/ia64/kernel/acpi-ext.c
+++ b/arch/ia64/kernel/acpi-ext.c
@@ -80,7 +80,7 @@ static acpi_status find_csr_space(struct acpi_resource *resource, void *data)
 
 static acpi_status hp_crs_locate(acpi_handle obj, u64 *base, u64 *length)
 {
-   struct csr_space space = { 0, 0 };
+   struct csr_space space = { .base = 0, .length = 0 };
 
acpi_walk_resources(obj, METHOD_NAME__CRS, find_csr_space, &space);
if (!space.length)
-- 
1.9.1
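
To illustrate why the explicit field style is preferred, here is a small
standalone sketch. The field names follow the patch; the u64 typedef, the
helper functions, and the non-zero values are assumptions made for the
example.

```c
#include <assert.h>

/* Stand-in for the kernel's u64 so the example builds in userspace. */
typedef unsigned long long u64;

struct csr_space {
	u64 base;
	u64 length;
};

/* Positional form: correctness depends on the declaration order of fields. */
static struct csr_space positional_init(void)
{
	struct csr_space space = { 0x1000, 0x200 };

	return space;
}

/* Designated form: each value is tied to a field name, not a position. */
static struct csr_space designated_init(void)
{
	struct csr_space space = { .length = 0x200, .base = 0x1000 };

	return space;
}
```

Both functions produce the same struct today, but only the designated form
stays correct if the fields of struct csr_space are ever reordered.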



[PATCH 2/2] Revert "mm: disable fault around on emulated access bit architecture"

2016-06-14 Thread Kirill A. Shutemov
This reverts commit d0834a6c2c5b0c76cfb806bd7dba6556d8b4edbb.

After revert of 5c0a85fad949 ("mm: make faultaround produce old ptes")
faultaround doesn't have dependencies on hardware accessed bit, so let's
revert this one too.

Signed-off-by: Kirill A. Shutemov 
---
 mm/memory.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 61fe7e7b56bf..cd1f29e4897e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2898,16 +2898,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
update_mmu_cache(vma, address, pte);
 }
 
-/*
- * If architecture emulates "accessed" or "young" bit without HW support,
- * there is no much gain with fault_around.
- */
 static unsigned long fault_around_bytes __read_mostly =
-#ifndef __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-   PAGE_SIZE;
-#else
rounddown_pow_of_two(65536);
-#endif
 
 #ifdef CONFIG_DEBUG_FS
 static int fault_around_bytes_get(void *data, u64 *val)
-- 
2.8.1
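
For reference, the effect of the removed #ifndef can be checked with a bit
of arithmetic. The sketch below assumes 4 KiB pages and uses demo_* names
that are not kernel symbols; the enable test is the one do_read_fault()
applies, fault_around_bytes >> PAGE_SHIFT > 1.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12                      /* assumes 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Largest power of two that is <= n; n must be non-zero. */
static unsigned long demo_rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Same test do_read_fault() applies before calling do_fault_around(). */
static int demo_faultaround_enabled(unsigned long fault_around_bytes)
{
	return (fault_around_bytes >> DEMO_PAGE_SHIFT) > 1;
}
```

With the default of 65536 bytes the window is 16 pages, so faultaround runs;
with the PAGE_SIZE default that the removed branch selected, the window is a
single page and faultaround is effectively off.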



[PATCH 1/2] Revert "mm: make faultaround produce old ptes"

2016-06-14 Thread Kirill A. Shutemov
This reverts commit 5c0a85fad949212b3e059692deecdeed74ae7ec7.

The commit causes ~6% regression in unixbench.

Let's revert it for now and consider other solution for reclaim problem
later.

Signed-off-by: Kirill A. Shutemov 
Reported-by: "Huang, Ying" 
---
 include/linux/mm.h |  2 +-
 mm/filemap.c   |  2 +-
 mm/memory.c| 23 +--
 3 files changed, 7 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5df5feb49575..ece042dfe23c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -602,7 +602,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 }
 
 void do_set_pte(struct vm_area_struct *vma, unsigned long address,
-   struct page *page, pte_t *pte, bool write, bool anon, bool old);
+   struct page *page, pte_t *pte, bool write, bool anon);
 #endif
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 00ae878b2a38..20f3b1f33f0e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2186,7 +2186,7 @@ repeat:
if (file->f_ra.mmap_miss > 0)
file->f_ra.mmap_miss--;
addr = address + (page->index - vmf->pgoff) * PAGE_SIZE;
-   do_set_pte(vma, addr, page, pte, false, false, true);
+   do_set_pte(vma, addr, page, pte, false, false);
unlock_page(page);
goto next;
 unlock:
diff --git a/mm/memory.c b/mm/memory.c
index 15322b73636b..61fe7e7b56bf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2877,7 +2877,7 @@ static int __do_fault(struct vm_area_struct *vma, unsigned long address,
  * vm_ops->map_pages.
  */
 void do_set_pte(struct vm_area_struct *vma, unsigned long address,
-   struct page *page, pte_t *pte, bool write, bool anon, bool old)
+   struct page *page, pte_t *pte, bool write, bool anon)
 {
pte_t entry;
 
@@ -2885,8 +2885,6 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
entry = mk_pte(page, vma->vm_page_prot);
if (write)
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-   if (old)
-   entry = pte_mkold(entry);
if (anon) {
inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
page_add_new_anon_rmap(page, vma, address, false);
@@ -3032,20 +3030,9 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 */
if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) {
pte = pte_offset_map_lock(mm, pmd, address, &ptl);
-   if (!pte_same(*pte, orig_pte))
-   goto unlock_out;
do_fault_around(vma, address, pte, pgoff, flags);
-   /* Check if the fault is handled by faultaround */
-   if (!pte_same(*pte, orig_pte)) {
-   /*
-* Faultaround produce old pte, but the pte we've
-* handler fault for should be young.
-*/
-   pte_t entry = pte_mkyoung(*pte);
-   if (ptep_set_access_flags(vma, address, pte, entry, 0))
-   update_mmu_cache(vma, address, pte);
+   if (!pte_same(*pte, orig_pte))
goto unlock_out;
-   }
pte_unmap_unlock(pte, ptl);
}
 
@@ -3060,7 +3047,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
put_page(fault_page);
return ret;
}
-   do_set_pte(vma, address, fault_page, pte, false, false, false);
+   do_set_pte(vma, address, fault_page, pte, false, false);
unlock_page(fault_page);
 unlock_out:
pte_unmap_unlock(pte, ptl);
@@ -3111,7 +3098,7 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
}
goto uncharge_out;
}
-   do_set_pte(vma, address, new_page, pte, true, true, false);
+   do_set_pte(vma, address, new_page, pte, true, true);
mem_cgroup_commit_charge(new_page, memcg, false, false);
lru_cache_add_active_or_unevictable(new_page, vma);
pte_unmap_unlock(pte, ptl);
@@ -3164,7 +3151,7 @@ static int do_shared_fault(struct mm_struct *mm, struct vm_area_struct *vma,
put_page(fault_page);
return ret;
}
-   do_set_pte(vma, address, fault_page, pte, true, false, false);
+   do_set_pte(vma, address, fault_page, pte, true, false);
pte_unmap_unlock(pte, ptl);
 
if (set_page_dirty(fault_page))
-- 
2.8.1



Re: USB broken on Banana Pi in Linux 4.6 [solved]

2016-06-14 Thread Marc Haber
Hi,

On Mon, May 30, 2016 at 09:02:54PM +0200, Marc Haber wrote:
> on my Bananapis, in kernel 4.6 USB does not work. Kernel configuration
> is USB-wise identical to 4.5 (grepped for differences in (hci|usb)),
> and in 4.6 there is not even /dev/bus/usb.

This turned out to be a configuration issue. 4.6 kernels on Banana Pi
need CONFIG_AXP20X_POWER for working USB. If that driver is missing,
USB fails silently.
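
For anyone who wants to check a kernel configuration for this option
programmatically, here is a minimal sketch. It assumes the configuration
is available as plain text in the usual .config syntax;
config_symbol_enabled() is a hypothetical helper written for this note,
not an existing kernel or libc function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `symbol` is set to y or m on some line of `config`.
 * Lines such as "# CONFIG_FOO is not set" do not match because they
 * start with '#', and longer symbols sharing the prefix do not match
 * because the character after the symbol must be '='.
 */
static bool config_symbol_enabled(const char *config, const char *symbol)
{
	size_t len = strlen(symbol);
	const char *line = config;

	while (line && *line) {
		if (strncmp(line, symbol, len) == 0 && line[len] == '=' &&
		    (line[len + 1] == 'y' || line[len + 1] == 'm'))
			return true;

		/* Advance to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return false;
}
```

Passing the text of a 4.6 config and "CONFIG_AXP20X_POWER" as the symbol
would tell whether the driver is built in or built as a module.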

Thanks for all your help.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: [PATCH v3 0/9] Add MT8173 Video Decoder Driver

2016-06-14 Thread 李務誠
On Wed, Jun 8, 2016 at 6:13 AM, Hans Verkuil  wrote:
>
>
> On 06/07/2016 11:22 PM, Mauro Carvalho Chehab wrote:
>>
>> Em Mon, 30 May 2016 20:29:14 +0800
>> Tiffany Lin  escreveu:
>>
>>> ==
>>>   Introduction
>>> ==
>>>
>>> The purpose of this series is to add the driver for video codec hw
>>> embedded in the Mediatek's MT8173 SoCs.
>>> Mediatek Video Codec is able to handle video decoding of in a range of
>>> formats.
>>>
>>> This patch series add Mediatek block format V4L2_PIX_FMT_MT21, the
>>> decoder driver will decoded bitstream to
>>> V4L2_PIX_FMT_MT21 format.
>>>
>>> This patch series rely on MTK VPU driver in patch series "Add MT8173
>>> Video Encoder Driver and VPU Driver"[1]
>>> and patch "CHROMIUM: v4l: Add V4L2_PIX_FMT_VP9 definition"[2] for VP9
>>> support.
>>> Mediatek Video Decoder driver rely on VPU driver to load, communicate
>>> with VPU.
>>>
>>> Internally the driver uses videobuf2 framework and MTK IOMMU and MTK SMI
>>> both have been merged in v4.6-rc1.
>>>
>>> [1]https://patchwork.linuxtv.org/patch/33734/
>>> [2]https://chromium-review.googlesource.com/#/c/245241/
>>
>>
>> Hmm... I'm not seeing the firmware for this driver at the
>> linux-firmware tree:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/log/
Tiffany. Can you check the license and add the firmware to linux-firmware?

For your information, both the encoder and decoder drivers require the
firmware to work.
>>
>> Nor I'm seeing any pull request for them. Did you send it?
>> I'll only merge the driver upstream after seeing such pull request.
>
>
> Mauro, are you confusing the decoder and encoder driver? I haven't
> thoroughly reviewed the decoder driver
> yet, so there is no pull request for the decoder driver.
>
> The only pull request I made was for the encoder driver.
>
> Regards,
>
> Hans
>


Re: linux-next: manual merge of the kvms390 tree with the s390 tree

2016-06-14 Thread Christian Borntraeger
On 06/14/2016 06:51 AM, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the kvms390 tree got a conflict in:
> 
>   arch/s390/hypfs/hypfs_diag.c
> 
> between commit:
> 
>   6c22c9863760 ("s390: avoid extable collisions")
> 
> from the s390 tree and commit:
> 
>   e65f30e0cb29 ("s390: hypfs: Move diag implementation and data definitions")
> 
> from the kvms390 tree.
> 
> I fixed it up (using the kvms390 version and then adding the following
> patch) and can carry the fix as necessary. This is now fixed as far as
> linux-next is concerned, but any non trivial conflicts should be
> mentioned to your upstream maintainer when your tree is submitted for
> merging.  You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.
> 
> From: Stephen Rothwell 
> Date: Tue, 14 Jun 2016 14:47:33 +1000
> Subject: [PATCH] s390: merge fix up for __diag204 move
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/s390/kernel/diag.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
> index a44faf4a0454..2289d6f8bec0 100644
> --- a/arch/s390/kernel/diag.c
> +++ b/arch/s390/kernel/diag.c
> @@ -169,7 +169,7 @@ static inline int __diag204(unsigned long subcode, unsigned long size, void *add
> 
>   asm volatile(
>   "   diag%2,%0,0x204\n"
> - "0:\n"
> + "0: nopr%%r7\n"
>   EX_TABLE(0b,0b)
>   : "+d" (_subcode), "+d" (_size) : "d" (addr) : "memory");
>   if (_subcode)
> 


Yes, thanks. This conflict (and the other preexisting one) will soon move
to Paolo's KVM tree, as I plan to submit my first pull request soon.

Christian 



[PATCH] mmc: Change the enhanced area related sysfs output format

2016-06-14 Thread Beata Baranowska
From: Chuanxiao Dong 

When the enhanced area feature is not enabled, the related sysfs
attributes hold -EINVAL (-22), so change the sysfs output format to a
signed one so that this value is displayed correctly.

Signed-off-by: Chuanxiao Dong 
---
 drivers/mmc/core/mmc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 5d438ad3ee32..cc55253a5d47 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -722,9 +722,10 @@ MMC_DEV_ATTR(name, "%s\n", card->cid.prod_name);
 MMC_DEV_ATTR(oemid, "0x%04x\n", card->cid.oemid);
 MMC_DEV_ATTR(prv, "0x%x\n", card->cid.prv);
 MMC_DEV_ATTR(serial, "0x%08x\n", card->cid.serial);
-MMC_DEV_ATTR(enhanced_area_offset, "%llu\n",
+MMC_DEV_ATTR(enhanced_area_offset, "%lld\n",
card->ext_csd.enhanced_area_offset);
-MMC_DEV_ATTR(enhanced_area_size, "%u\n", card->ext_csd.enhanced_area_size);
+MMC_DEV_ATTR(enhanced_area_size, "%d KBytes\n",
+   card->ext_csd.enhanced_area_size);
 MMC_DEV_ATTR(raw_rpmb_size_mult, "%#x\n", card->ext_csd.raw_rpmb_size_mult);
 MMC_DEV_ATTR(rel_sectors, "%#x\n", card->ext_csd.rel_sectors);
 
-- 




[PATCH v3 2/2] pci/aer: interrupt fixup in the quirk

2016-06-14 Thread Po Liu
On some platforms, the root port doesn't support MSI/MSI-X/INTx in RC mode.
When the chip signals AER interrupts by some means other than
MSI/MSI-X/INTx, there may be a dedicated interrupt line for AER, PME, etc.
Look up the interrupt number in the device tree and fix up dev->irq with it.

Signed-off-by: Po Liu 
---
changes for V3:
- Move to quirk;
- Only correct the irq in RC mode;

 drivers/pci/quirks.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ee72ebe..8b39cce 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -4419,3 +4420,31 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
}
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
+
+/* If root port doesn't support MSI/MSI-X/INTx in RC mode,
+ * but use standalone irq. Read the device tree for the aer
+ * interrupt number.
+ */
+static void quirk_aer_interrupt(struct pci_dev *dev)
+{
+   int ret;
+   u8 header_type;
+   struct device_node *np = NULL;
+
+   /* Only for the RC mode device */
+   pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type);
+   if ((header_type & 0x7F) != PCI_HEADER_TYPE_BRIDGE)
+   return;
+
+   if (dev->bus->dev.of_node)
+   np = dev->bus->dev.of_node;
+
+   if (IS_ENABLED(CONFIG_OF_IRQ) && np) {
+   ret = of_irq_get_byname(np, "aer");
+   if (ret > 0) {
+   dev->no_msi = 1;
+   dev->irq = ret;
+   }
+   }
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, 
quirk_aer_interrupt);
-- 
2.1.0.27.g96db324



[PATCH v3 1/2] nxp/dts: add pcie aer interrupt-name property in the dts

2016-06-14 Thread Po Liu
On some NXP platforms the AER interrupt is not delivered via MSI/MSI-X/INTx
but uses an independent interrupt line. This patch adds an "aer" entry to
interrupt-names for the AER interrupt.

Signed-off-by: Po Liu 
---
changes for v3:
- None;

 .../devicetree/bindings/pci/layerscape-pci.txt |  4 ++--
 arch/arm/boot/dts/ls1021a.dtsi |  6 --
 arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 18 +-
 arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi | 16 
 4 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/layerscape-pci.txt b/Documentation/devicetree/bindings/pci/layerscape-pci.txt
index ef683b2..d27973a 100644
--- a/Documentation/devicetree/bindings/pci/layerscape-pci.txt
+++ b/Documentation/devicetree/bindings/pci/layerscape-pci.txt
@@ -19,7 +19,7 @@ Required properties:
 - interrupts: A list of interrupt outputs of the controller. Must contain an
   entry for each entry in the interrupt-names property.
 - interrupt-names: Must include the following entries:
-  "intr": The interrupt that is asserted for controller interrupts
+  "aer" : The interrupt that is asserted for aer interrupts
 - fsl,pcie-scfg: Must include two entries.
   The first entry must be a link to the SCFG device node
   The second entry must be '0' or '1' based on physical PCIe controller index.
@@ -33,7 +33,7 @@ Example:
   0x40 0x 0x0 0x2000>; /* configuration space 
*/
reg-names = "regs", "config";
interrupts = ; /* controller 
interrupt */
-   interrupt-names = "intr";
+   interrupt-names = "aer";
fsl,pcie-scfg = <&scfg 0>;
#address-cells = <3>;
#size-cells = <2>;
diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 5ae8e92..b638697 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -633,7 +633,8 @@
reg = <0x00 0x0340 0x0 0x0001   /* controller 
registers */
   0x40 0x 0x0 0x2000>; /* 
configuration space */
reg-names = "regs", "config";
-   interrupts = ; /* 
controller interrupt */
+   interrupts = ; /* aer 
interrupt */
+   interrupt-names = "aer";
fsl,pcie-scfg = <&scfg 0>;
#address-cells = <3>;
#size-cells = <2>;
@@ -656,7 +657,8 @@
reg = <0x00 0x0350 0x0 0x0001   /* controller 
registers */
   0x48 0x 0x0 0x2000>; /* 
configuration space */
reg-names = "regs", "config";
-   interrupts = ;
+   interrupts = ; /* aer 
interrupt */
+   interrupt-names = "aer";
fsl,pcie-scfg = <&scfg 1>;
#address-cells = <3>;
#size-cells = <2>;
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
index de0323b..4beb760 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
@@ -473,9 +473,9 @@
reg = <0x00 0x0340 0x0 0x0010   /* controller 
registers */
   0x40 0x 0x0 0x2000>; /* 
configuration space */
reg-names = "regs", "config";
-   interrupts = <0 118 0x4>, /* controller interrupt */
-<0 117 0x4>; /* PME interrupt */
-   interrupt-names = "intr", "pme";
+   interrupts = <0 117 0x4>, /* PME interrupt */
+<0 118 0x4>; /* aer interrupt */
+   interrupt-names = "pme", "aer";
#address-cells = <3>;
#size-cells = <2>;
device_type = "pci";
@@ -497,9 +497,9 @@
reg = <0x00 0x0350 0x0 0x0010   /* controller 
registers */
   0x48 0x 0x0 0x2000>; /* 
configuration space */
reg-names = "regs", "config";
-   interrupts = <0 128 0x4>,
-<0 127 0x4>;
-   interrupt-names = "intr", "pme";
+   interrupts = <0 127 0x4>,
+<0 128 0x4>;
+   interrupt-names = "pme", "aer";
#address-cells = <3>;
#size-cells = <2>;
device_type = "pci";
@@ -521,9 +521,9 @@
reg = <0x00 0x0360 0x0 0x0010   /* controller 
registers */
   0x50 0x

Re: [patch 13/20] timer: Switch to a non cascading wheel

2016-06-14 Thread Thomas Gleixner
On Tue, 14 Jun 2016, George Spelvin wrote:
> I think I see a buglet in your level-5 cascading.
> 
> Suppose a timer is requested far in the future for a time
> that is an exact multiple of 32768 jiffies.
> 
> collect_expired_timers() scans level 5 after all the previous ones,
> and will cascade it to level 0, in a level-0 bucket which has already
> been scanned, and won't be scanned again for 64 jiffies.
> 
> I agree that 64 jiffies is well within your allowed rounding accuracy,
> and order of timer firing is not guaranteed when they're for the same
> time, but it is a bit odd when a timer fires 32 jiffies *before* another
> timer scheduled for 32 jiffies later.  That's the sort of peculiarity
> that could lead to a subtle bug.

I thought about that and when looking at those long timeout thingies I came to
the conclusion that it's simply not worth the trouble.
 
> Wouldn't this all be so much simpler as
> 
> #define LVL_BITS  6   /* Renamed previous LVL_SHIFT */
> #define LVL_SIZE  (1 << LVL_BITS)
> #define LVL_MASK  (LVL_BITS - 1)
> #define LVL_OFFS(n)   ((n) * LVL_SIZE)
> #define LVL_SHIFT(n)  ((n) * LVL_CLK_SHIFT)
> #define LVL_GRAN(n)   (1 << LVL_SHIFT(n))

Indeed.
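
A minimal userspace sketch of how the parameterised macros above could
be used to compute a bucket index. Several things here are assumptions:
LVL_CLK_SHIFT = 3 (it is consistent with the 32768-jiffy level-5
granularity mentioned earlier in this mail), LVL_MASK taken as
LVL_SIZE - 1 rather than LVL_BITS - 1, and wheel_index(), which is a
hypothetical helper loosely modelled on the calc_index() calls quoted
further down, not the code from the patch.

```c
#include <assert.h>

#define LVL_CLK_SHIFT	3			/* assumed value */
#define LVL_BITS	6
#define LVL_SIZE	(1UL << LVL_BITS)
#define LVL_MASK	(LVL_SIZE - 1)		/* assumed: masks one level's buckets */
#define LVL_OFFS(n)	((n) * LVL_SIZE)
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)	(1UL << LVL_SHIFT(n))

/*
 * Hypothetical helper: add one granule of level `lvl` to the expiry,
 * scale it down to that level's granularity, wrap it into the level's
 * LVL_SIZE buckets and add the level's offset into a flat bucket array.
 */
static unsigned long wheel_index(unsigned long expires, unsigned int lvl)
{
	unsigned long slot = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);

	return LVL_OFFS(lvl) + (slot & LVL_MASK);
}
```

With these assumptions each level is described by its number alone, so
the per-level LVLn_GRAN/LVLn_SHIFT/LVLn_OFFS constants are not needed.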
 
> Ideally, you'd like all of that
> 
> + if (delta < LVL1_TSTART) {
> + idx = (expires + LVL0_GRAN) & LVL_MASK;
> + } else if (delta < LVL2_TSTART) {
> + idx = calc_index(expires, LVL1_GRAN, LVL1_SHIFT, LVL1_OFFS);
> + } else if (delta < LVL3_TSTART) {
> + idx = calc_index(expires, LVL2_GRAN, LVL2_SHIFT, LVL2_OFFS);
> + } else if (delta < LVL4_TSTART) {
> + idx = calc_index(expires, LVL3_GRAN, LVL3_SHIFT, LVL3_OFFS);
> + } else if (delta < LVL5_TSTART) {
> + idx = calc_index(expires, LVL4_GRAN, LVL4_SHIFT, LVL4_OFFS);
> 
> to be replaced with __builtin_clz or similar:

Except that __fls() is noticeably slower than the if chain.
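
To make the two alternatives concrete, here is a userspace sketch of
both ways of picking the level. It assumes hypothetical level start
values that are pure powers of two (LVL_TSTART(n) below); the
LVLn_TSTART constants in the patch may be defined differently.
highest_bit() is a portable stand-in for a "find last set bit"
primitive, and the sketch says nothing about relative speed, it only
shows that the two forms select the same level under these assumptions.

```c
#include <assert.h>

#define LVL_CLK_SHIFT	3	/* assumed value */
#define LVL_BITS	6
#define LVL_MAX		5	/* hypothetical highest level in this sketch */

/* Hypothetical start of level n (n >= 1): a pure power of two. */
#define LVL_TSTART(n)	(1UL << (LVL_BITS + ((n) - 1) * LVL_CLK_SHIFT))

/* Level selection as an if chain, in the style of the quoted code. */
static unsigned int level_by_chain(unsigned long delta)
{
	if (delta < LVL_TSTART(1))
		return 0;
	if (delta < LVL_TSTART(2))
		return 1;
	if (delta < LVL_TSTART(3))
		return 2;
	if (delta < LVL_TSTART(4))
		return 3;
	if (delta < LVL_TSTART(5))
		return 4;
	return LVL_MAX;
}

/* Index of the most significant set bit; v must be non-zero. */
static unsigned int highest_bit(unsigned long v)
{
	unsigned int bit = 0;

	while (v >>= 1)
		bit++;
	return bit;
}

/* Level selection computed from the position of the top bit. */
static unsigned int level_by_bit(unsigned long delta)
{
	unsigned int lvl;

	if (delta < LVL_TSTART(1))
		return 0;
	lvl = (highest_bit(delta) - LVL_BITS) / LVL_CLK_SHIFT + 1;
	return lvl > LVL_MAX ? LVL_MAX : lvl;
}
```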

> > +static inline void detach_expired_timer(struct timer_list *timer)
> >  {
> > detach_timer(timer, true);
> > -   if (!(timer->flags & TIMER_DEFERRABLE))
> > -   base->active_timers--;
> > -   base->all_timers--;
> >  }
> 
> Is there even a reason to have this wrapper any more?  Why not
> just replace all calls to it in the source?

That just happened to stay there for no particular reason.

> > +   timer = hlist_entry(head->first, struct timer_list, entry);
> > +   fn = timer->function;
> > +   data = timer->data;
> > +
> > +   timer_stats_account_timer(timer);
> > +
> > +   base->running_timer = timer;
> > +   detach_expired_timer(timer);
> 
> Is there some non-obvious reason that you have to fetch fn and data
> so early?  It seems like a register pressure pessimization, if the
> compiler can't figure out that timer_stats code can't change them.
> 
> The cache line containing this timer was already prefetched when you
> updated its entry.pprev as part of removing the previous entry from
> the list.
> 
> I see why you want to fetch them with the lock held in case there's some
> freaky race, but I'd do it all after detach_timer().

That's not new code. We kept the ordering, but yes, we definitely can turn
that around. The only restriction is that we get it before releasing the lock.

Thanks,

tglx



Re: [PATCH v2 1/5] nbd: fix might_sleep warning on socket shutdown.

2016-06-14 Thread Markus Pargmann
Hi,

On Thursday 02 June 2016 13:24:57 Pranay Kr. Srivastava wrote:
> spinlocked ranges should be small and not contain calls into huge
> subfunctions. Fix my mistake and just get the pointer to the socket
> instead of doing everything with spinlock held.
> 
> Reported-by: Mikulas Patocka 
> Signed-off-by: Markus Pargmann 
> 
> Changelog:
> Pranay Kr. Srivastava:
> 
> 1) Use spin_lock instead of irq version for sock_shutdown.
> 
> 2) Use system work queue to actually trigger the shutdown of
>socket. This solves the issue when kernel_sendmsg is currently
>blocked while a timeout occurs.
> 
> Signed-off-by: Pranay Kr. Srivastava 

This looks better. Some smaller things inline. Also this patch does not
apply on my tree anymore. Can you please rebase it onto:

http://git.pengutronix.de/?p=mpa/linux-nbd.git;a=shortlog;h=refs/heads/master

> ---
>  drivers/block/nbd.c | 65 
> ++---
>  1 file changed, 42 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 31e73a7..0339d40 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -39,6 +39,7 @@
>  #include 
>  
>  #include 
> +#include 
>  
>  struct nbd_device {
>   u32 flags;
> @@ -69,6 +70,10 @@ struct nbd_device {
>  #if IS_ENABLED(CONFIG_DEBUG_FS)
>   struct dentry *dbg_dir;
>  #endif
> + /*
> +  *This is specifically for calling sock_shutdown, for now.
> +  */
> + struct work_struct ws_shutdown;
>  };
>  
>  #if IS_ENABLED(CONFIG_DEBUG_FS)
> @@ -95,6 +100,11 @@ static int max_part;
>   */
>  static DEFINE_SPINLOCK(nbd_lock);
>  
> +/*
> + * Shutdown function for nbd_dev work struct.
> + */
> +static void nbd_ws_func_shutdown(struct work_struct *);
> +
>  static inline struct device *nbd_to_dev(struct nbd_device *nbd)
>  {
>   return disk_to_dev(nbd->disk);
> @@ -172,39 +182,35 @@ static void nbd_end_request(struct nbd_device *nbd, struct request *req)
>   */
>  static void sock_shutdown(struct nbd_device *nbd)
>  {
> - spin_lock_irq(&nbd->sock_lock);
> -
> - if (!nbd->sock) {
> - spin_unlock_irq(&nbd->sock_lock);
> - return;
> - }
> + struct socket *sock;
>  
> - dev_warn(disk_to_dev(nbd->disk), "shutting down socket\n");
> - kernel_sock_shutdown(nbd->sock, SHUT_RDWR);
> - sockfd_put(nbd->sock);
> + spin_lock(&nbd->sock_lock);
> + sock = nbd->sock;
>   nbd->sock = NULL;
> - spin_unlock_irq(&nbd->sock_lock);
> + spin_unlock(&nbd->sock_lock);
> +
> + if (!sock)
> + return;
>  
>   del_timer(&nbd->timeout_timer);
> + dev_warn(disk_to_dev(nbd->disk), "shutting down socket\n");
> + kernel_sock_shutdown(sock, SHUT_RDWR);
> + sockfd_put(sock);
>  }
>  
>  static void nbd_xmit_timeout(unsigned long arg)
>  {
>   struct nbd_device *nbd = (struct nbd_device *)arg;
> - unsigned long flags;
>  
>   if (list_empty(&nbd->queue_head))
>   return;
> -
> - spin_lock_irqsave(&nbd->sock_lock, flags);
> -
>   nbd->timedout = true;
> -
> - if (nbd->sock)
> - kernel_sock_shutdown(nbd->sock, SHUT_RDWR);
> -
> - spin_unlock_irqrestore(&nbd->sock_lock, flags);
> -
> + schedule_work(&nbd->ws_shutdown);
> + /*
> +  * Make sure sender thread sees nbd->timedout.
> +  */
> + smp_wmb();

I am not sure that we need this memory barrier here. But as it is just
the timeout path it probably won't hurt.

> + wake_up(&nbd->waiting_wq);
>   dev_err(nbd_to_dev(nbd), "Connection timed out, shutting down 
> connection\n");
>  }
>  
> @@ -592,7 +598,11 @@ static int nbd_thread_send(void *data)
>   spin_unlock_irq(&nbd->queue_lock);
>  
>   /* handle request */
> - nbd_handle_req(nbd, req);
> + if (nbd->timedout) {
> + req->errors++;
> + nbd_end_request(nbd, req);
> + } else
> + nbd_handle_req(nbd, req);
>   }
>  
>   nbd->task_send = NULL;
> @@ -672,6 +682,7 @@ static void nbd_reset(struct nbd_device *nbd)
>   set_capacity(nbd->disk, 0);
>   nbd->flags = 0;
>   nbd->xmit_timeout = 0;
> + INIT_WORK(&nbd->ws_shutdown, nbd_ws_func_shutdown);
>   queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, nbd->disk->queue);
>   del_timer_sync(&nbd->timeout_timer);
>  }
> @@ -804,15 +815,15 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
>   nbd_dev_dbg_close(nbd);
>   kthread_stop(thread);
>  
> - mutex_lock(&nbd->tx_lock);
> -
>   sock_shutdown(nbd);
> + mutex_lock(&nbd->tx_lock);
>   nbd_clear_que(nbd);
>   kill_bdev(bdev);
>   nbd_bdev_reset(bdev);
>  
>   if (nbd->disconnect) /* user requested, ignore socket errors */
>   error = 0;
> +

Random newline here.

Best Regards,

Mark

Re: [PATCH v2] sched: unlikely corrupted stack end

2016-06-14 Thread Ingo Molnar

* WANG Chao  wrote:

> unlikely() was dropped in commit ce03e4137bb2 ("sched/core: Drop
> unlikely behind BUG_ON()"), but commit 29d6455178a0 ("sched: panic on
> corrupted stack end") dropped BUG_ON() and called panic directly.
> 
> Now we should bring unlikely() back for branch prediction. While we're
> at it, it's better and cleaner to turn task_stack_end_corrupted() into
> inline function.
> 
> Signed-off-by: WANG Chao 
> ---
>  include/linux/sched.h | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 6e42ada26345..797ca1975431 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2997,8 +2997,11 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
>  }
>  
>  #endif
> -#define task_stack_end_corrupted(task) \
> - (*(end_of_stack(task)) != STACK_END_MAGIC)
> +
> +static inline int task_stack_end_corrupted(struct task_struct *p)
> +{
> + return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
> +}

The passed-in pointer should be const, and the extra parentheses around the
end_of_stack() call are no longer needed (since it's proper C code now).
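
Something along these lines, shown here as a userspace model so that it
compiles on its own. struct task_model, its stack[] array and this
end_of_stack() are hypothetical stand-ins for the kernel's task_struct
and helper; in particular, it is an assumption that end_of_stack() can
take a const pointer.

```c
#include <assert.h>

#define STACK_END_MAGIC	0x57AC6E9DUL

#if defined(__GNUC__)
#define unlikely(x)	__builtin_expect(!!(x), 0)
#else
#define unlikely(x)	(x)
#endif

/* Hypothetical stand-in for task_struct: just a small stack area. */
struct task_model {
	unsigned long stack[64];	/* stack[0] models the far end of the stack */
};

/* Model helper: address of the word that carries the magic value. */
static inline const unsigned long *end_of_stack(const struct task_model *p)
{
	return &p->stack[0];
}

/* Const parameter, and no extra parentheses around the call. */
static inline int task_stack_end_corrupted(const struct task_model *p)
{
	return unlikely(*end_of_stack(p) != STACK_END_MAGIC);
}
```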

Thanks,

Ingo


Re: [PATCH] iommu/arm-smmu: request pcie devices to enable ACS

2016-06-14 Thread Will Deacon
On Tue, Jun 14, 2016 at 11:11:36AM +0800, Wei Chen wrote:
> On 13 June 2016 at 20:45, Will Deacon  wrote:
> > On Mon, Jun 13, 2016 at 05:20:17PM +0800, Wei Chen wrote:
> >> The PCIe ACS capability will affect the layout of iommu groups.
> >> Generally speaking, if the path from root port to the PCIe device
> >> is ACS enabled, the iommu will create a single iommu group for this
> >> PCIe device. If all PCIe devices on the path are ACS enabled then
> >> Linux can determine this path is ACS enabled.
> >>
> >> Linux use two PCIe configuration registers to determine the ACS
> >> status of PCIe devices:
> >> ACS Capability Register and ACS Control Register.
> >>
> >> The first register is used to check the implementation of ACS function
> >> of a PCIe device, the second register is used to check the enable status
> >> of ACS function. If one PCIe device has implemented and enabled the ACS
> >> function then Linux will determine this PCIe device enabled ACS.
> >>
> >> From the Chapter:6.12 of PCI Express Base Specification Revision 3.1a,
> >> we can find that when a PCIe device implements ACS function, the enable
> >> status is set to disabled by default and can be enabled by ACS-aware
> >> software.
> >>
> >> ACS will affect the iommu groups topology, so, the iommu driver is
> >> ACS-aware software. This patch adds a call to pci_request_acs() to the
> >> arm-smmu driver to enable the ACS function in PCIe devices that support
> >> it.
> >>
> >> Signed-off-by: Wei Chen 
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 2 ++
> >>  drivers/iommu/arm-smmu.c| 4 +++-
> >>  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > Thanks, queued for 4.8 w/ Robin and Eric's reviewed-by tags and the minor
> > commit wording change.
> >
> 
> Thanks, I will post a v2 patch to include above changes.

:/ As above, I've already queued this.

Will


Re: [PATCH v3 4/6] drm/panel: simple: Add support for Samsung LSN122DL01-C01 2560x1600 panel

2016-06-14 Thread Thierry Reding
On Mon, Jun 13, 2016 at 10:00:45AM -0700, Doug Anderson wrote:
> Yakir,
> 
> On Sat, Jun 11, 2016 at 7:56 PM, Yakir Yang  wrote:
> > The Samsung LSN122DL01-C01 is an 12.2" 2560x1600 (WQXGA) TFT-LCD panel
> > connected using eDP interfaces.
> >
> > Signed-off-by: Yakir Yang 
> > ---
> > Changes in v3:
> > - Correct the size of panel_desc to active area 262mmx164mm (Emil, Stéphane)
> >
> > Changes in v2: None
> >
> >  drivers/gpu/drm/panel/panel-simple.c | 25 +
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
> > index 2d40a21..17cc973 100644
> > --- a/drivers/gpu/drm/panel/panel-simple.c
> > +++ b/drivers/gpu/drm/panel/panel-simple.c
> > @@ -1246,6 +1246,28 @@ static const struct panel_desc qd43003c0_40 = {
> > .bus_format = MEDIA_BUS_FMT_RGB888_1X24,
> >  };
> >
> > +static const struct drm_display_mode samsung_lsn122dl01_c01_mode = {
> > +   .clock = 271560,
> > +   .hdisplay = 2560,
> > +   .hsync_start = 2560 + 48,
> > +   .hsync_end = 2560 + 48 + 32,
> > +   .htotal = 2560 + 48 + 32 + 80,
> > +   .vdisplay = 1600,
> > +   .vsync_start = 1600 + 2,
> > +   .vsync_end = 1600 + 2 + 5,
> > +   .vtotal = 1600 + 2 + 5 + 57,
> > +   .vrefresh = 60,
> > +};
> > +
> > +static const struct panel_desc samsung_lsn122dl01_c01 = {
> > +   .modes = &samsung_lsn122dl01_c01_mode,
> > +   .num_modes = 1,
> > +   .size = {
> > +   .width = 262,
> > +   .height = 164,
> 
> Earlier you said that the active area of this panel was:
> 
> > Display area 262.656(H) X 164.16(V) (12.2”diagonal)
> 
> In other panels I looked at the EDID tended to round numbers, not
> truncate them.  For instance the Starry panel that I sent the patch
> for says in the manual "262.7712 (H) x 164.232 (V)" but then the EDID
> says "263 x 164".
> 
> That would mean your width should be 263 mm, not 262 mm.

Yes, rounding is what I've also applied to all panels that I added.
While it isn't documented I hope that other panels did round, rather
than truncate, as well.
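
To spell out the difference with the numbers quoted above, here is a
small sketch. mm_rounded() and mm_truncated() are hypothetical helpers
for illustration only; panel-simple itself just stores the integer
values.

```c
#include <assert.h>

/* Round a non-negative size in millimetres to the nearest integer. */
static unsigned int mm_rounded(double mm)
{
	return (unsigned int)(mm + 0.5);
}

/* Drop the fractional part of a non-negative size in millimetres. */
static unsigned int mm_truncated(double mm)
{
	return (unsigned int)mm;
}
```

For the 262.656 mm x 164.16 mm active area, rounding gives 263 x 164,
while truncation gives 262 x 164.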

Thierry




Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression

2016-06-14 Thread Minchan Kim
On Wed, Jun 08, 2016 at 11:58:11AM +0300, Kirill A. Shutemov wrote:
> On Wed, Jun 08, 2016 at 04:41:37PM +0800, Huang, Ying wrote:
> > "Huang, Ying"  writes:
> > 
> > > "Kirill A. Shutemov"  writes:
> > >
> > >> On Mon, Jun 06, 2016 at 10:27:24AM +0800, kernel test robot wrote:
> > >>> 
> > >>> FYI, we noticed a -6.3% regression of unixbench.score due to commit:
> > >>> 
> > >>> commit 5c0a85fad949212b3e059692deecdeed74ae7ec7 ("mm: make faultaround 
> > >>> produce old ptes")
> > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> > >>> master
> > >>> 
> > >>> in testcase: unixbench
> > >>> on test machine: lituya: 16 threads Haswell High-end Desktop (i7-5960X 
> > >>> 3.0G) with 16G memory
> > >>> with following parameters: 
> > >>> cpufreq_governor=performance/nr_task=1/test=shell8
> > >>> 
> > >>> 
> > >>> Details are as below:
> > >>> -->
> > >>> 
> > >>> 
> > >>> =
> > >>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
> > >>>   
> > >>> gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench
> > >>> 
> > >>> commit: 
> > >>>   4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
> > >>>   5c0a85fad949212b3e059692deecdeed74ae7ec7
> > >>> 
> > >>> 4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de 
> > >>>  -- 
> > >>>fail:runs  %reproductionfail:runs
> > >>>| | |
> > >>>   3:4  -75%:4 
> > >>> kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
> > >>>  %stddev %change %stddev
> > >>>  \  |\  
> > >>>  14321 .  0%  -6.3%  13425 .  0%  unixbench.score
> > >>>1996897 .  0%  -6.1%1874635 .  0%  
> > >>> unixbench.time.involuntary_context_switches
> > >>>  1.721e+08 .  0%  -6.2%  1.613e+08 .  0%  
> > >>> unixbench.time.minor_page_faults
> > >>> 758.65 .  0%  -3.0% 735.86 .  0%  unixbench.time.system_time
> > >>> 387.66 .  0%  +5.4% 408.49 .  0%  unixbench.time.user_time
> > >>>5950278 .  0%  -6.2%5583456 .  0%  
> > >>> unixbench.time.voluntary_context_switches
> > >>
> > >> That's weird.
> > >>
> > >> I don't understand why the change would reduce number or minor faults.
> > >> It should stay the same on x86-64. Rise of user_time is puzzling too.
> > >
> > > unixbench runs in fixed time mode.  That is, the total time to run
> > > unixbench is fixed, but the work done varies.  So the minor_page_faults
> > > change may reflect only the work done.
> > >
> > >> Hm. Is reproducible? Across reboot?
> > >
> > 
> > And FYI, there is no swap setup for test, all root file system including
> > benchmark files are in tmpfs, so no real page reclaim will be
> > triggered.  But it appears that active file cache reduced after the
> > commit.
> > 
> > 111331 ±  1% -13.3%  96503 ±  0%  meminfo.Active
> >  27603 ±  1% -43.9%  15486 ±  0%  meminfo.Active(file)
> > 
> > I think this is the expected behavior of the commit?
> 
> Yes, it's expected.
> 
> After the change faularound would produce old pte. It means there's more
> chance for these pages to be on inactive lru, unless somebody actually
> touch them and flip accessed bit.

Hmm, tmpfs pages should be on the anonymous LRU list, and the VM shouldn't
scan the anonymous LRU list on a swapless system, so I really wonder why
the active file LRU has shrunk.


Re: [PATCH v2 1/1] Staging: comedi: dmm32at: fix BIT macro issue.

2016-06-14 Thread Ian Abbott

On 14/06/16 06:53, Ravishankar Karkala Mallikarjunayya wrote:

This Replace all occurences of (1<
---
Changes V1 -> V2:
- BIT macros added(suggested by Ian Abbott)
-i.e.DMM32AT_AI_CFG_SCINT(x), DMM32AT_CTRL_PAGE(x)
---
  drivers/staging/comedi/drivers/dmm32at.c | 98 
  1 file changed, 50 insertions(+), 48 deletions(-)



Thanks!

Reviewed-by: Ian Abbott 

--
-=( Ian Abbott @ MEV Ltd.E-mail:  )=-
-=(  Web: http://www.mev.co.uk/  )=-


Re: [PATCH v2 4/5]nbd: make nbd device wait for its users.

2016-06-14 Thread Markus Pargmann
On Thursday 02 June 2016 13:25:00 Pranay Kr. Srivastava wrote:
> When a timeout occurs or a recv fails, then
> instead of abruplty killing nbd block device
> wait for it's users to finish.
> 
> This is more required when filesystem(s) like
> ext2 or ext3 don't expect their buffer heads to
> disappear while the filesystem is mounted.
> 
> Each open of a nbd device is refcounted, while
> the userland program [nbd-client] doing the
> NBD_DO_IT ioctl would now wait for any other users
> of this device before invalidating the nbd device.
> 
> Signed-off-by: Pranay Kr. Srivastava 
> ---
>  drivers/block/nbd.c | 58 
> +
>  1 file changed, 58 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index d1d898d..4da40dc 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -70,10 +70,13 @@ struct nbd_device {
>  #if IS_ENABLED(CONFIG_DEBUG_FS)
>   struct dentry *dbg_dir;
>  #endif
> + atomic_t inuse;
>   /*
>*This is specifically for calling sock_shutdown, for now.
>*/
>   struct work_struct ws_shutdown;
> + struct kref users;
> + struct completion user_completion;
>  };
>  
>  #if IS_ENABLED(CONFIG_DEBUG_FS)
> @@ -104,6 +107,7 @@ static DEFINE_SPINLOCK(nbd_lock);
>   * Shutdown function for nbd_dev work struct.
>   */
>  static void nbd_ws_func_shutdown(struct work_struct *);
> +static void nbd_kref_release(struct kref *);
>  
>  static inline struct device *nbd_to_dev(struct nbd_device *nbd)
>  {
> @@ -682,6 +686,8 @@ static void nbd_reset(struct nbd_device *nbd)
>   nbd->flags = 0;
>   nbd->xmit_timeout = 0;
>   INIT_WORK(&nbd->ws_shutdown, nbd_ws_func_shutdown);
> + init_completion(&nbd->user_completion);
> + kref_init(&nbd->users);
>   queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, nbd->disk->queue);
>   del_timer_sync(&nbd->timeout_timer);
>  }
> @@ -815,6 +821,14 @@ static int __nbd_ioctl(struct block_device *bdev, struct 
> nbd_device *nbd,
>   kthread_stop(thread);
>  
>   sock_shutdown(nbd);
> + /*
> +  * kref_init initializes with ref count as 1,
> +  * nbd_client, or the user-land program executing
> +  * this ioctl will make the refcount to 2[at least]
> +  * so subtracting 2 from refcount.
> +  */
> + kref_sub(&nbd->users, 2, nbd_kref_release);

Why don't you use a kref_put?

> + wait_for_completion(&nbd->user_completion);
>   mutex_lock(&nbd->tx_lock);
>   nbd_clear_que(nbd);
>   kill_bdev(bdev);
> @@ -865,13 +879,56 @@ static int nbd_ioctl(struct block_device *bdev, fmode_t 
> mode,
>  
>   return error;
>  }
> +static void nbd_kref_release(struct kref *kref_users)
> +{
> + struct nbd_device *nbd = container_of(kref_users, struct nbd_device,
> + users);

Not indented to opening bracket.

> + pr_debug("Releasing kref [%s]\n", __func__);
> + atomic_set(&nbd->inuse, 0);
> + complete(&nbd->user_completion);
> +
> +}
> +
> +static int nbd_open(struct block_device *bdev, fmode_t mode)
> +{
> + struct nbd_device *nbd_dev = bdev->bd_disk->private_data;
> +
> + if (kref_get_unless_zero(&nbd_dev->users))
> + atomic_set(&nbd_dev->inuse, 1);
> +
> + pr_debug("Opening nbd_dev %s. Active users = %u\n",
> + bdev->bd_disk->disk_name,
> + atomic_read(&nbd_dev->users.refcount) - 1);

Indent to opening bracket.

> + return 0;
> +}
> +
> +static void nbd_release(struct gendisk *disk, fmode_t mode)
> +{
> + struct nbd_device *nbd_dev = disk->private_data;
> + /*
> + *kref_init initializes ref count to 1, so we
> + *we check for refcount to be 2 for a final put.
> + *
> + *kref needs to be re-initialized just here as the
> + *other process holding it must see the ref count as 2.
> + */
> + if (atomic_read(&nbd_dev->inuse))
> + kref_put(&nbd_dev->users,  nbd_kref_release);

What is this inuse atomic for? Everyone that releases the nbd device
will need to execute a kref_put().
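
To illustrate the point about balanced puts, here is a minimal user-space
sketch (not the nbd code) of a refcount where every successful open takes a
reference and every release drops one, so no separate inuse flag is needed.
The names struct ref, ref_get_unless_zero(), ref_put(), dev_open() and
dev_close() are hypothetical stand-ins modelled on the kernel's kref helpers
and built on C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct kref. */
struct ref {
	atomic_int count;
};

/* Hypothetical device: 'released' records that the last put happened. */
struct dev {
	struct ref users;
	bool released;
};

/* Like kref_init(): the initial reference belongs to the creator. */
static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the object is still alive. */
static bool ref_get_unless_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; call release() when the last one goes away. */
static bool ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (atomic_fetch_sub(&r->count, 1) == 1) {
		release(r);
		return true;
	}
	return false;
}

static void dev_release(struct ref *r)
{
	/* Open-coded container_of(): recover the device from its member. */
	struct dev *d = (struct dev *)((char *)r - offsetof(struct dev, users));

	d->released = true;
}

/* Every successful open is paired with exactly one close. */
static bool dev_open(struct dev *d)
{
	return ref_get_unless_zero(&d->users);
}

static void dev_close(struct dev *d)
{
	ref_put(&d->users, dev_release);
}
```

With this pairing the owner drops its initial reference with a single
ref_put() and the release callback fires only after the last opener has
closed the device.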

Best Regards,

Markus

> +
> + pr_debug("Closing nbd_dev %s. Active users = %u\n",
> + disk->disk_name,
> + atomic_read(&nbd_dev->users.refcount) - 1);
> +}
>  
>  static const struct block_device_operations nbd_fops = {
>   .owner =THIS_MODULE,
>   .ioctl =nbd_ioctl,
>   .compat_ioctl = nbd_ioctl,
> + .open = nbd_open,
> + .release =  nbd_release
>  };
>  
> +
>  static void nbd_ws_func_shutdown(struct work_struct *ws_nbd)
>  {
>   struct nbd_device *nbd_dev = container_of(ws_nbd, struct nbd_device,
> @@ -1107,6 +1164,7 @@ static int __init nbd_init(void)
>   disk->fops = &nbd_fops;
>   disk->private_data = &nbd_dev[i];
>   sprintf(disk->disk_name, "nbd%d", i);

[PATCH] alarmtimer: fixed comments describing structure fields

2016-06-14 Thread Pratyush Patel
Updated struct alarm and struct alarm_timer descriptions.

Signed-off-by: Pratyush Patel 
---
 include/linux/alarmtimer.h | 6 +++---
 kernel/time/alarmtimer.c   | 1 -
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/alarmtimer.h b/include/linux/alarmtimer.h
index 52f3b7d..9d80312 100644
--- a/include/linux/alarmtimer.h
+++ b/include/linux/alarmtimer.h
@@ -26,10 +26,10 @@ enum alarmtimer_restart {
  * struct alarm - Alarm timer structure
  * @node:  timerqueue node for adding to the event list this value
  * also includes the expiration time.
- * @period:Period for recuring alarms
+ * @timer: hrtimer used to schedule events while running
  * @function:  Function pointer to be executed when the timer fires.
- * @type:  Alarm type (BOOTTIME/REALTIME)
- * @enabled:   Flag that represents if the alarm is set to fire or not
+ * @type:  Alarm type (BOOTTIME/REALTIME).
+ * @state: Flag that represents if the alarm is set to fire or not.
  * @data:  Internal data value.
  */
 struct alarm {
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index e840ed8..c3aad68 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -30,7 +30,6 @@
  * struct alarm_base - Alarm timer bases
  * @lock:  Lock for syncrhonized access to the base
  * @timerqueue:Timerqueue head managing the list of events
- * @timer: hrtimer used to schedule events while running
  * @gettime:   Function to read the time correlating to the base
  * @base_clockid:  clockid for the base
  */
-- 
2.7.4



Re: [RFC PATCH V2 1/2] ACPI/PCI: Match PCI config space accessors against platfrom specific ECAM quirks

2016-06-14 Thread Duc Dang
On Mon, Jun 13, 2016 at 10:51 PM, Dongdong Liu  wrote:
> Hi Duc
>
> On 2016/6/14 4:57, Duc Dang wrote:
>>
>> On Mon, Jun 13, 2016 at 8:47 AM, Christopher Covington
>>  wrote:
>>>
>>> Hi Dongdong,
>>>
>>> On 06/13/2016 09:02 AM, Dongdong Liu wrote:

 diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
 index d3c3e85..49612b3 100644
 --- a/drivers/acpi/pci_mcfg.c
 +++ b/drivers/acpi/pci_mcfg.c
 @@ -22,6 +22,10 @@
   #include 
   #include 
   #include 
 +#include 
 +
 +/* Root pointer to the mapped MCFG table */
 +static struct acpi_table_mcfg *mcfg_table;

   /* Structure to hold entries from the MCFG table */
   struct mcfg_entry {
 @@ -35,6 +39,38 @@ struct mcfg_entry {
   /* List to save mcfg entries */
   static LIST_HEAD(pci_mcfg_list);

 +extern struct pci_cfg_fixup __start_acpi_mcfg_fixups[];
 +extern struct pci_cfg_fixup __end_acpi_mcfg_fixups[];
 +
 +struct pci_ecam_ops *pci_mcfg_get_ops(struct acpi_pci_root *root)
 +{
 + int bus_num = root->secondary.start;
 + int domain = root->segment;
 + struct pci_cfg_fixup *f;
 +
 + if (!mcfg_table)
 + return &pci_generic_ecam_ops;
 +
 + /*
 +  * Match against platform specific quirks and return corresponding
 +  * CAM ops.
 +  *
 +  * First match against PCI topology  then use OEM ID
 and
 +  * OEM revision from MCFG table standard header.
 +  */
 + for (f = __start_acpi_mcfg_fixups; f < __end_acpi_mcfg_fixups;
 f++) {
 + if ((f->domain == domain || f->domain ==
 PCI_MCFG_DOMAIN_ANY) &&
 + (f->bus_num == bus_num || f->bus_num ==
 PCI_MCFG_BUS_ANY) &&
 + (!strncmp(f->oem_id, mcfg_table->header.oem_id,
 +   ACPI_OEM_ID_SIZE)) &&
 + (!strncmp(f->oem_table_id,
 mcfg_table->header.oem_table_id,
 +   ACPI_OEM_TABLE_ID_SIZE)))
>>>
>>>
>>> This would just be a small convenience, but if the character count used
>>> here were
>>>
>>> min(strlen(f->oem_id), ACPI_OEM_ID_SIZE)
>>>
>>> then the parameters to DECLARE_ACPI_MCFG_FIXUP macro could be substrings
>>> and
>>> wouldn't need to be padded out to the full length.
>>>
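
A minimal sketch of the prefix-match idea suggested above, assuming a
6-byte OEM ID field that is not necessarily NUL-terminated. OEM_ID_SIZE and
oem_id_matches() are hypothetical names used only for illustration; they
stand in for ACPI_OEM_ID_SIZE and the open-coded strncmp() in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for ACPI_OEM_ID_SIZE. */
#define OEM_ID_SIZE 6

/*
 * Compare only as many characters as the quirk entry provides, capped at
 * the size of the table field, so that a quirk ID such as "QCOM" does not
 * have to be padded out to the full field length.
 */
static bool oem_id_matches(const char *quirk_id, const char *table_id)
{
	size_t n = strlen(quirk_id);

	if (n > OEM_ID_SIZE)
		n = OEM_ID_SIZE;

	return strncmp(quirk_id, table_id, n) == 0;
}
```

One consequence of this design is that an empty quirk ID matches every
table, so a real implementation would probably want to reject it.
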
 + return f->ops;
 + }
 + /* No quirks, use ECAM */
 + return &pci_generic_ecam_ops;
 +}
>>>
>>>
 diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
 index 7d63a66..088a1da 100644
 --- a/include/linux/pci-acpi.h
 +++ b/include/linux/pci-acpi.h
 @@ -25,6 +25,7 @@ static inline acpi_status
 pci_acpi_remove_pm_notifier(struct acpi_device *dev)
   extern phys_addr_t acpi_pci_root_get_mcfg_addr(acpi_handle handle);

   extern phys_addr_t pci_mcfg_lookup(u16 domain, struct resource
 *bus_res);
 +extern struct pci_ecam_ops *pci_mcfg_get_ops(struct acpi_pci_root
 *root);

   static inline acpi_handle acpi_find_root_bridge_handle(struct pci_dev
 *pdev)
   {
 @@ -72,6 +73,25 @@ struct acpi_pci_root_ops {
int (*prepare_resources)(struct acpi_pci_root_info *info);
   };

 +struct pci_cfg_fixup {
 + struct pci_ecam_ops *ops;
 + char *oem_id;
 + char *oem_table_id;
 + int domain;
 + int bus_num;
 +};
 +
 +#define PCI_MCFG_DOMAIN_ANY  -1
 +#define PCI_MCFG_BUS_ANY -1
 +
 +/* Designate a routine to fix up buggy MCFG */
 +#define DECLARE_ACPI_MCFG_FIXUP(ops, oem_id, oem_table_id, dom, bus) \
 + static const struct pci_cfg_fixup   \
 + __mcfg_fixup_##oem_id##oem_table_id##dom##bus   \
>>>
>>>
>>> I'm not entirely sure that this is the right fix--I'm pretty blindly
>>> following a GCC documentation suggestion [1]--but removing the first two
>>> preprocessor concatenation operators "##" solved the following build
>>> error
>>> for me.
>>>
>>> include/linux/pci-acpi.h:90:2: error: pasting "__mcfg_fixup_" and
>>> ""QCOM"" does not give a valid preprocessing token
>>>__mcfg_fixup_##oem_id##oem_table_id##dom##bus   \
>>
>>
>> I think the problem is gcc is not happy with quoted string when
>> processing these tokens
>> (""QCOM"", the extra "" are added by gcc). So should we not concat
>> string tokens and
>> use the fixup definition in v1 of this RFC:
>> /* Designate a routine to fix up buggy MCFG */
>> #define DECLARE_ACPI_MCFG_FIXUP(ops, oem_id, rev, dom, bus) \
>>  static const struct pci_cfg_fixup
>> __mcfg_fixup_##system##dom##bus\
>>   __used __attribute__((__section__(".acpi_fixup_mcfg"), \
>>  aligned((sizeof(void *) =   \
>>  { ops, oem_id, rev, dom, bus };
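
The build error discussed above comes from pasting a quoted string into an
identifier. A minimal sketch of one way around it, assuming the macro is
handed a bare identifier instead of a string literal: the identifier is
pasted into the variable name with ## and turned into the string with #.
struct fixup and DECLARE_FIXUP() are hypothetical names, not the ones from
either version of the RFC.

```c
/* Hypothetical, cut-down fixup entry. */
struct fixup {
	const char *oem_id;
	int domain;
	int bus_num;
};

/*
 * 'tag' must be a bare identifier (QCOM, not "QCOM"):
 *   fixup_##tag  pastes it into a valid identifier,
 *   #tag         produces the string literal for the table entry.
 * 'dom' and 'bus' must be plain non-negative integer literals here,
 * since they are pasted into the name as well.
 */
#define DECLARE_FIXUP(tag, dom, bus)				\
	static const struct fixup fixup_##tag##_##dom##_##bus =	\
		{ #tag, dom, bus }

/* Two entries for the same OEM ID get distinct names. */
DECLARE_FIXUP(QCOM, 0, 1);
DECLARE_FIXUP(QCOM, 0, 2);
```

Because domain and bus are part of the generated name, several entries for
the same OEM ID can coexist in one translation unit.
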
>
>
> V1 fixup exist the redefinition error when compiling mutiple

Re: [RFC PATCH 1/3] pci, acpi: Match PCI config space accessors against platfrom specific ECAM quirks.

2016-06-14 Thread Tomasz Nowicki

Hi Arnd,

Sorry for the late response. Please see comments inline.

On 02.06.2016 17:19, Arnd Bergmann wrote:

On Thursday, June 2, 2016 3:35:34 PM CEST Tomasz Nowicki wrote:

On 02.06.2016 14:32, Arnd Bergmann wrote:

On Thursday, June 2, 2016 2:07:43 PM CEST Tomasz Nowicki wrote:

On 02.06.2016 13:42, Arnd Bergmann wrote:

On Thursday, June 2, 2016 10:41:01 AM CEST Tomasz Nowicki wrote:

+struct pci_ecam_ops *pci_mcfg_get_ops(struct acpi_pci_root *root)
+{
+   int bus_num = root->secondary.start;
+   int domain = root->segment;
+   struct pci_cfg_fixup *f;
+
+   if (!mcfg_table)
+   return &pci_generic_ecam_ops;
+
+   /*
+* Match against platform specific quirks and return corresponding
+* CAM ops.
+*
+* First match against PCI topology  then use OEM ID and
+* OEM revision from MCFG table standard header.
+*/
+   for (f = __start_acpi_mcfg_fixups; f < __end_acpi_mcfg_fixups; f++) {
+   if ((f->domain == domain || f->domain == PCI_MCFG_DOMAIN_ANY) &&
+   (f->bus_num == bus_num || f->bus_num == PCI_MCFG_BUS_ANY) &&
+   (!strncmp(f->oem_id, mcfg_table->header.oem_id,
+ ACPI_OEM_ID_SIZE)) &&
+   (f->oem_revision == mcfg_table->header.oem_revision))
+   return f->ops;
+   }
+   /* No quirks, use ECAM */
+   return &pci_generic_ecam_ops;
+}
+
int pci_mcfg_lookup(struct acpi_pci_root *root)


Can you explain the use of pci_ecam_ops instead of pci_ops here?



I wanted to get associated bus_shift and use it to setup configuration
region properly before calling pci_ecam_create. Please see next patch.



I see. It feels really odd to do it this way though, since having a
nonstandard bus_shift essentially means not using anything resembling
ECAM to start with.

I realize that a lot of the host bridges are not ECAM, but because
of this, it would be more logical to have their own pci_ops instead
of pci_ecam_ops.


Well, we have bus_shift there to express bus shift differentiation. So I
would say we should change just structure name to prevent misunderstanding.


I'm not really convinced here. We use the bus_shift for two
completely different things in the end: for sizing the MMIO window
that gets mapped by ACPI and for the pci_ecam_map_bus() function
that isn't actually used for the typical fixups that override the
pci_ops.


Since we overwrite the whole pci_ecam_ops structure (next patch):
-   cfg = pci_ecam_create(&root->device->dev, &cfgres, bus_res,
- &pci_generic_ecam_ops);
+   cfg = pci_ecam_create(&root->device->dev, &cfgres, bus_res, ops);

IMO bus_shift is used in the right way. If anybody decides to put a
different bus_shift there, they also need to implement map_bus and use
a bus_shift there that matches the quirk's requirements. Obviously we can
use the standard pci_ecam_map_bus() as map_bus, but only when the quirk
actually calls for it, as the ThunderX one does.
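
For reference, a minimal sketch of the two places where bus_shift matters
in this discussion: the config-space offset computed by a map_bus-style
helper and the size of the mapped window. It assumes the devfn shift is
bus_shift - 8, which is how I recall the generic ECAM helper working;
ecam_offset() and ecam_window_size() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of a config register inside the mapped window.
 * Standard ECAM uses bus_shift = 20, giving 4 KiB per function.
 */
static uint64_t ecam_offset(unsigned int bus_shift, unsigned int bus,
			    unsigned int devfn, unsigned int where)
{
	unsigned int devfn_shift;

	/* devfn is 8 bits wide, so the bus field must start at bit 8 or above. */
	assert(bus_shift >= 8);
	devfn_shift = bus_shift - 8;

	return ((uint64_t)bus << bus_shift) |
	       ((uint64_t)devfn << devfn_shift) | where;
}

/* Size of the MMIO window needed to cover nr_buses buses. */
static uint64_t ecam_window_size(unsigned int bus_shift, unsigned int nr_buses)
{
	return (uint64_t)nr_buses << bus_shift;
}
```

A quirk with a non-standard bus_shift changes both values at once, which is
why the two uses are hard to separate.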




I see now that this sneaks in an .init callback for the quirk
through the backdoor, by adding it to the pci_ecam_ops. I think
that is not good: if the idea is to have the config space access
be adapted to various quirks that is one thing, but if we actually
need a function to be called for the quirk we should do just that
and have it be obvious. That function can then override the
pci_ops.


Actually we do not need to call a function for each quirk. At the same
time we already have the .init callback adapted to the configuration space
access quirk, so there is only a small amount of code duplication.
On the other hand, I understand that the .init call should be more explicit.
Any suggestions are much appreciated.


Thanks,
Tomasz


Re: [PATCH 2/4] mtd: nand: implement two pairing scheme

2016-06-14 Thread George Spelvin
Boris Brezillon wrote:
> On 12 Jun 2016 16:24:53 George Spelvin wrote:
>> Boris Brezillon wrote:
>> My problem is that I don't really understand MLC programming.

> I came to the same conclusion: we really have these 2 cases in the
> wild, which makes it even more complicated to define a standard
> behavior.

I did find a useful study of the issue: "Program Interference in MLC NAND
Flash Memory: Characterization, Modeling, and Mitigation"

https://users.ece.cmu.edu/~omutlu/pub/flash-programming-interference_iccd13.pdf

It describes the write-disturb-precompensation technique, and also
shows how the two-stage programming works.  (Although the fact that the
"least significant bit" is the *largest* voltage difference and is shown
on the *left* makes no sense at all.)

Looking at the demonstrated programming sequence, it looks like
it should be possible to probe for the bit assignment.  If you have
a half-programmed page, then any bits programmed to "0" are actually
sitting close to the threshold between the two middle voltage levels.

So you'll get a lot of errors reading them as "1", but the interesting
part is the read-back of the unprogrammed bit.

If the chip is using the binary sequence, you'll read either 10 or 01.
If the chip is using the Gray-code sequence, you'll read 10 or 00.

Basically, you read both pages and see which bit combination never
appears.  That is the combination that corresponds to the highest voltage
level.
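
The probe described above can be sketched as follows. This is a minimal
illustration that assumes error-free read-back of the two paired pages; real
data from a half-programmed page would contain bit errors, so an actual
probe would have to count occurrences and apply a threshold.
missing_bit_combo() and its encoding of the result are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan two paired pages bit by bit and record which of the four
 * (upper, lower) bit combinations occur. Returns the single combination
 * that never appears, encoded as (upper << 1) | lower, or -1 if zero or
 * more than one combination is missing.
 */
static int missing_bit_combo(const uint8_t *lower, const uint8_t *upper,
			     size_t len)
{
	unsigned int seen = 0;
	unsigned int missing;
	int combo;

	for (size_t i = 0; i < len; i++) {
		for (int bit = 0; bit < 8; bit++) {
			unsigned int lo = (lower[i] >> bit) & 1u;
			unsigned int hi = (upper[i] >> bit) & 1u;

			seen |= 1u << ((hi << 1) | lo);
		}
	}

	/* Exactly one of the four bits must be clear in 'seen'. */
	missing = ~seen & 0xfu;
	if (missing == 0 || (missing & (missing - 1)) != 0)
		return -1;

	for (combo = 0; combo < 4; combo++)
		if (missing & (1u << combo))
			return combo;

	return -1;
}
```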

Another interesting paper is "Read Disturb Errors in MLC NAND Flash
Memory: Characterization, Mitigation, and Recovery"
https://users.ece.cmu.edu/~omutlu/pub/flash-read-disturb-errors_dsn15.pdf

That talks about tricks that do as you observe: increase read errors to start
(in order to decrease read disturb, and thus read errors later).

>> It's more considering it to have 16K pages that can be accessed in 
>> half-pages.

> Yes, I know, but it's not really easy to fake that at the NAND level,
> because programming 2 pages still requires 2 page program operation.
> The MTD user could detect that the pairing scheme always exposes 2
> consecutive non-paired pages, but as you've seen, this condition does
> not necessarily imply the 'pair coupling' constraint, and we don't want
> to increase the min_io_size value if it's not really necessary.

Ideally, it would be nice to separate the "SLC hack" from the "later
write failures can corrupt earlier data" workaround.

First, you get the latter working on SLC flash.  Then you add MLC, and
make MLC another reason why it can happen.

But I'm not certain this is actually necessary.  Could listing 4 pages
rather than 2 as in other data sheets just be an editing or translation
error?  Maybe someone got confused about "in the same row" when they
wrote that clarifying example.

> I'm just realizing this is actually a non-issue for the solution we
> developed with Ricard.  As I said, it's unsafe to partially write a
> block in MLC mode, so the only sane way is either to write a block in
> SLC mode, or atomically write a block in MLC mode, and that's what
> we're doing with our 'UBI LEB consolidation' approach.  I'm pretty sure
> the problem described in the Hynix datasheet does not happen when only
> writing in SLC mode.  So, even if the pairing scheme does not account
> for this extra 'coupling' constraint, we should be safe.

I can't see any reason why it would affect MLC and not SLC.


Re: [PATCH v2] sched: unlikely corrupted stack end

2016-06-14 Thread kbuild test robot
Hi,

[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.7-rc3 next-20160614]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/WANG-Chao/sched-unlikely-corrupted-stack-end/20160614-162711
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 4.9.0
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=ia64 

All error/warnings (new ones prefixed by >>):

   warning: (FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && KMEMCHECK && 
LOCKDEP) selects FRAME_POINTER which has unmet direct dependencies 
(DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN || 
MN10300 || METAG) || ARCH_WANT_FRAME_POINTERS)
   warning: (FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && KMEMCHECK && 
LOCKDEP) selects FRAME_POINTER which has unmet direct dependencies 
(DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN || 
MN10300 || METAG) || ARCH_WANT_FRAME_POINTERS)
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/uapi/linux/capability.h:16,
from include/linux/capability.h:15,
from include/linux/sched.h:15,
from arch/ia64/kernel/asm-offsets.c:9:
   include/linux/sched.h: In function 'task_stack_end_corrupted':
>> arch/ia64/include/asm/ptrace.h:37:29: error: 'IA64_TASK_SIZE' undeclared 
>> (first use in this function)
#define IA64_RBS_OFFSET   ((IA64_TASK_SIZE + IA64_THREAD_INFO_SIZE + 31) & 
~31)
^
   include/linux/compiler.h:170:42: note: in definition of macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
 ^
>> arch/ia64/include/asm/thread_info.h:74:57: note: in expansion of macro 
>> 'IA64_RBS_OFFSET'
#define end_of_stack(p) (unsigned long *)((void *)(p) + IA64_RBS_OFFSET)
^
>> include/linux/sched.h:3006:20: note: in expansion of macro 'end_of_stack'
 return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
   ^
   arch/ia64/include/asm/ptrace.h:37:29: note: each undeclared identifier is 
reported only once for each function it appears in
#define IA64_RBS_OFFSET   ((IA64_TASK_SIZE + IA64_THREAD_INFO_SIZE + 31) & 
~31)
^
   include/linux/compiler.h:170:42: note: in definition of macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
 ^
>> arch/ia64/include/asm/thread_info.h:74:57: note: in expansion of macro 
>> 'IA64_RBS_OFFSET'
#define end_of_stack(p) (unsigned long *)((void *)(p) + IA64_RBS_OFFSET)
^
>> include/linux/sched.h:3006:20: note: in expansion of macro 'end_of_stack'
 return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
   ^
>> arch/ia64/include/asm/ptrace.h:37:46: error: 'IA64_THREAD_INFO_SIZE' 
>> undeclared (first use in this function)
#define IA64_RBS_OFFSET   ((IA64_TASK_SIZE + IA64_THREAD_INFO_SIZE + 31) & 
~31)
 ^
   include/linux/compiler.h:170:42: note: in definition of macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
 ^
>> arch/ia64/include/asm/thread_info.h:74:57: note: in expansion of macro 
>> 'IA64_RBS_OFFSET'
#define end_of_stack(p) (unsigned long *)((void *)(p) + IA64_RBS_OFFSET)
^
>> include/linux/sched.h:3006:20: note: in expansion of macro 'end_of_stack'
 return unlikely(*(end_of_stack(p)) != STACK_END_MAGIC);
   ^
   make[2]: *** [arch/ia64/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [sub-make] Error 2

vim +/end_of_stack +3006 include/linux/sched.h

  2990   * When the stack grows up, this is the highest address.
  2991   * Beyond that position, we corrupt data on the next page.
  2992   */
  2

Re: [RFC][PATCH 1/8] rtmutex: Deboost before waking up the top waiter

2016-06-14 Thread Juri Lelli
Hi,

I've got only nitpicks for the changelog. Otherwise the patch looks good
to me (and yes, without it bw inheritance would be a problem).

On 07/06/16 21:56, Peter Zijlstra wrote:
> From: Xunlei Pang 
> 
> We should deboost before waking the high-prio task, such that
> we don't run two tasks with the same "state"(priority, deadline,
  ^
space

> sched_class, etc) during the period between the end of wake_up_q()
> and the end of rt_mutex_adjust_prio().
> 
> As "Peter Zijlstra" said:
> Its semantically icky to have the two tasks running off the same

s/Its/It's/

> state and practically icky when you consider bandwidth inheritance --
> where the boosted task wants to explicitly modify the state of the
> booster. In that latter case you really want to unboost before you
> let the booster run again.
> 
> But this however can lead to prio-inversion if current would get
> preempted after the deboost but before waking our high-prio task,
> hence we disable preemption before doing deboost, and enabling it

s/enabling/re-enable/

> after the wake up is over.
> 
> The patch fixed the logic, and introduced rt_mutex_postunlock()

s/The/This/
s/fixed/fixes/
s/introduced/introduces/

> to do some code refactor.
> 
> Most importantly however; this change ensures pointer stability for
> the next patch, where we have rt_mutex_setprio() cache a pointer to
> the top-most waiter task. If we, as before this change, do the wakeup
> first and then deboost, this pointer might point into thin air.
> 
> Cc: Steven Rostedt 
> Cc: Ingo Molnar 
> Cc: Juri Lelli 
> Suggested-by: Peter Zijlstra 
> [peterz: Changelog]
> Signed-off-by: Xunlei Pang 
> Signed-off-by: Peter Zijlstra (Intel) 
> Link: 
> http://lkml.kernel.org/r/1461659449-19497-1-git-send-email-xlp...@redhat.com

Do we have any specific tests for this set? I'm running mine.

Best,

- Juri

> ---
> 
>  kernel/futex.c  |5 ++---
>  kernel/locking/rtmutex.c|   28 
>  kernel/locking/rtmutex_common.h |1 +
>  3 files changed, 27 insertions(+), 7 deletions(-)
> 
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1336,9 +1336,8 @@ static int wake_futex_pi(u32 __user *uad
>* scheduled away before the wake up can take place.
>*/
>   spin_unlock(&hb->lock);
> - wake_up_q(&wake_q);
> - if (deboost)
> - rt_mutex_adjust_prio(current);
> +
> + rt_mutex_postunlock(&wake_q, deboost);
>  
>   return 0;
>  }
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1390,12 +1390,32 @@ rt_mutex_fastunlock(struct rt_mutex *loc
>   } else {
>   bool deboost = slowfn(lock, &wake_q);
>  
> - wake_up_q(&wake_q);
> + rt_mutex_postunlock(&wake_q, deboost);
> + }
> +}
> +
>  
> - /* Undo pi boosting if necessary: */
> - if (deboost)
> - rt_mutex_adjust_prio(current);
> +/*
> + * Undo pi boosting (if necessary) and wake top waiter.
> + */
> +void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost)
> +{
> + /*
> +  * We should deboost before waking the top waiter task such that
> +  * we don't run two tasks with the 'same' priority. This however
> +  * can lead to prio-inversion if we would get preempted after
> +  * the deboost but before waking our high-prio task, hence the
> +  * preempt_disable.
> +  */
> + if (deboost) {
> + preempt_disable();
> + rt_mutex_adjust_prio(current);
>   }
> +
> + wake_up_q(wake_q);
> +
> + if (deboost)
> + preempt_enable();
>  }
>  
>  /**
> --- a/kernel/locking/rtmutex_common.h
> +++ b/kernel/locking/rtmutex_common.h
> @@ -111,6 +111,7 @@ extern int rt_mutex_finish_proxy_lock(st
>  extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct 
> hrtimer_sleeper *to);
>  extern bool rt_mutex_futex_unlock(struct rt_mutex *lock,
> struct wake_q_head *wqh);
> +extern void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost);
>  extern void rt_mutex_adjust_prio(struct task_struct *task);
>  
>  #ifdef CONFIG_DEBUG_RT_MUTEXES
> 
> 


Re: [PATCH v2 0/4] clocksource: rockchip/timer: Support rktimer for rk3399

2016-06-14 Thread Caesar Wang


On 2016-06-14 12:00, Huang, Tao wrote:

Hi Daniel:
On 2016-06-13 21:06, Daniel Lezcano wrote:

On Tue, Jun 07, 2016 at 12:54:29PM +0800, Caesar Wang wrote:

This series patches had been tested on rockchip inside kernel.
In order to support the rk3399 SoC timer and turn off interrupts and IPIs to
save power in idle.

For my personnal information, are the arch_timer in the same power domain
than the CPU ? IOW, what is the 'always-on' property in the DT ?

Yes. In our SoC design, all arch (generic) timer in the same power
domain of CPU core. So if one CPU core power down, the arch (generic)
timer will lose it's state and stop working.
While rk timer maybe in peri power domain or pmu power domain, so the
timer will still work when CPU power down.

But before RK3399, all SoCs with CPU power domain, do not support auto
power down while cpu idle. So the arch timer can be seem as always on,
i.e. we don't need a broadcast timer at all.


Okay, it still boots on rk3288 and other SoCs, even though many SoCs
haven't used the broadcast timer.

Yes, unfortunately the SoC design on rk3288 and the previous ones do not
allow to use a cpuidle driver with cpu/cluster power down, so obviously the
broadcast timer is pointless on these boards :)


You are right.


History version:
v1:
https://lkml.org/lkml/2016/5/25/186

Easy to test for my borad.
localhost / # cat /proc/interrupts
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5
1:  0  0  0  0  0  0 GICv3  
29 Edge  arch_timer
...
5:  0  0  0  0  0  0 GICv3 
113 Level rk_timer
..

localhost / # cat /proc/timer_list | grep event_handler
get "event_handler:  hrtimer_interrupt"
event_handler:  tick_handle_oneshot_broadcast
event_handler:  hrtimer_interrupt

What are you trying to demonstrate here ? There are no interrupts for both
arch_timer and rk_timer.


My god!! let's forget it now!
Sorry for forgetting what happened.
---

Re-picked them up for my board since I'm doing other things to run a 
single cpu.


localhost / # cat /proc/interrupts
   CPU0
  1:  0 GICv3  29 Edge  arch_timer
  2:807 GICv3  30 Edge  arch_timer
  5:712 GICv3 113 Level rk_timer




I don't know. Maybe Caesar do something wrong :(
This is my output:
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5

...
   2:   2911   1967   1588   1608   1295   1606
GICv3  30 Edge  arch_timer
   5:578637684626161165
GICv3 113 Level rk_timer





--
caesar wang | software engineer | w...@rock-chip.com




[PATCH v3 12/14] regulator: pwm: Retrieve correct voltage

2016-06-14 Thread Boris Brezillon
The continuous PWM voltage regulator is caching the voltage value in
the ->volt_uV field. While most of the time this value should reflect the
real voltage, sometimes it can be slightly different if the PWM device
rounded the set_duty_cycle request.
Moreover, this value is not valid until someone has modified the regulator
output.

Remove the ->volt_uV field and always rely on the PWM state to calculate
the regulator output.

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
Mark,

I know you already added your Tested-by/Acked-by tags on this patch
but this version has slightly changed and is now making use of the
pwm_get_relative_duty_cycle() helper instead of manually converting
the absolute duty_cycle value into a relative one.
---
 drivers/regulator/pwm-regulator.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/pwm-regulator.c 
b/drivers/regulator/pwm-regulator.c
index 2000118..80d083f 100644
--- a/drivers/regulator/pwm-regulator.c
+++ b/drivers/regulator/pwm-regulator.c
@@ -35,9 +35,6 @@ struct pwm_regulator_data {
struct regulator_ops ops;
 
int state;
-
-   /* Continuous voltage */
-   int volt_uV;
 };
 
 struct pwm_voltages {
@@ -135,8 +132,13 @@ static int pwm_regulator_is_enabled(struct regulator_dev 
*dev)
 static int pwm_regulator_get_voltage(struct regulator_dev *rdev)
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
+   int min_uV = rdev->constraints->min_uV;
+   int diff = rdev->constraints->max_uV - min_uV;
+   struct pwm_state pstate;
 
-   return drvdata->volt_uV;
+   pwm_get_state(drvdata->pwm, &pstate);
+
+   return min_uV + pwm_get_relative_duty_cycle(&pstate, diff);
 }
 
 static int pwm_regulator_set_voltage(struct regulator_dev *rdev,
@@ -162,8 +164,6 @@ static int pwm_regulator_set_voltage(struct regulator_dev 
*rdev,
return ret;
}
 
-   drvdata->volt_uV = min_uV;
-
/* Delay required by PWM regulator to settle to the new voltage */
usleep_range(ramp_delay, ramp_delay + 1000);
 
-- 
2.7.4
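
The voltage computation in the patch above can be modelled in user space as
follows. This is a minimal sketch, assuming the relative duty cycle helper
scales duty/period to the requested range with round-to-nearest;
relative_duty_cycle() and get_voltage() are hypothetical names, not the
kernel functions.

```c
#include <stdint.h>

/*
 * Scale duty_ns / period_ns to the range [0, scale], rounding to the
 * nearest integer. Returns 0 for an unconfigured (zero) period.
 */
static unsigned int relative_duty_cycle(uint64_t duty_ns, uint64_t period_ns,
					unsigned int scale)
{
	if (!period_ns)
		return 0;

	return (unsigned int)((duty_ns * scale + period_ns / 2) / period_ns);
}

/*
 * Derive the regulator output from the current PWM state instead of a
 * cached value: 0% duty maps to min_uV, 100% duty maps to max_uV.
 */
static int get_voltage(int min_uV, int max_uV, uint64_t duty_ns,
		       uint64_t period_ns)
{
	unsigned int diff = (unsigned int)(max_uV - min_uV);

	return min_uV + (int)relative_duty_cycle(duty_ns, period_ns, diff);
}
```

Because the result is derived from the PWM state, it also reflects any
rounding the PWM device applied to the requested duty cycle.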



Re: [RESEND PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version

2016-06-14 Thread Will Deacon
On Mon, Jun 13, 2016 at 03:41:54PM -0600, Al Stone wrote:
> This is a resend only: Ping?  Last ping was 26 May; there has been zero
> response since then.  Already have one ACK from Lorenzo; another from an
> arm64 maintainer would be really helpful.

I thought there were outstanding comments on v4 of this?

http://lkml.kernel.org/r/571e699b.9090...@linaro.org

Will


[PATCH v3 14/14] regulator: pwm: Document pwm-dutycycle-unit and pwm-dutycycle-range

2016-06-14 Thread Boris Brezillon
Document the pwm-dutycycle-unit and pwm-dutycycle-range properties.

Signed-off-by: Boris Brezillon 
Acked-by: Brian Norris 
---
 .../devicetree/bindings/regulator/pwm-regulator.txt   | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/devicetree/bindings/regulator/pwm-regulator.txt 
b/Documentation/devicetree/bindings/regulator/pwm-regulator.txt
index ed936f0..9fbc7b1 100644
--- a/Documentation/devicetree/bindings/regulator/pwm-regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/pwm-regulator.txt
@@ -34,6 +34,18 @@ Only required for Voltage Table Mode:
First cell is voltage in microvolts (uV)
Second cell is duty-cycle in percent (%)
 
+Optional properties for Continuous mode:
+- pwm-dutycycle-unit:  Integer value encoding the duty cycle unit. If not
+   defined, <100> is assumed, meaning that
+   pwm-dutycycle-range contains values expressed in
+   percent.
+
+- pwm-dutycycle-range: Should contain 2 entries. The first entry is encoding
+   the dutycycle for regulator-min-microvolt and the
+   second one the dutycycle for regulator-max-microvolt.
+   Duty cycle values are expressed in pwm-dutycycle-unit.
+   If not defined, <0 100> is assumed.
+
 NB: To be clear, if voltage-table is provided, then the device will be used
 in Voltage Table Mode.  If no voltage-table is provided, then the device will
 be used in Continuous Voltage Mode.
@@ -48,6 +60,13 @@ Continuous Voltage Example:
regulator-min-microvolt = <1016000>;
regulator-max-microvolt = <1114000>;
regulator-name = "vdd_logic";
+   /* unit == per-mille */
+   pwm-dutycycle-unit = <1000>;
+   /*
+* Inverted PWM logic, and the duty cycle range is limited
+* to 30%-70%.
+*/
+   pwm-dutycycle-range = <700 300>; /* */
};
 
 Voltage Table Example:
-- 
2.7.4
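
The binding text above can be read as a linear map between the two
pwm-dutycycle-range entries. A minimal sketch under that assumption, with
the duty cycle already expressed in pwm-dutycycle-unit; duty_to_uV() is a
hypothetical helper, not the driver function.

```c
#include <stdint.h>

/*
 * Map a duty cycle (in pwm-dutycycle-unit) to a voltage by linear
 * interpolation between the duty cycle for min_uV and the one for max_uV.
 * The range may be inverted (min_uV_duty > max_uV_duty). The caller is
 * expected to pass a duty value inside the range.
 */
static int duty_to_uV(unsigned int duty, unsigned int min_uV_duty,
		      unsigned int max_uV_duty, int min_uV, int max_uV)
{
	uint64_t diff_uV = (uint64_t)(max_uV - min_uV);
	unsigned int offset, span;

	if (max_uV_duty < min_uV_duty) {
		/* Inverted logic: a smaller duty cycle means a higher voltage. */
		offset = min_uV_duty - duty;
		span = min_uV_duty - max_uV_duty;
	} else {
		offset = duty - min_uV_duty;
		span = max_uV_duty - min_uV_duty;
	}

	/* Degenerate range: nothing to interpolate. */
	if (!span)
		return min_uV;

	/* Round to the nearest microvolt. */
	return min_uV + (int)((offset * diff_uV + span / 2) / span);
}
```

With the values from the example node (unit 1000, range <700 300>), a duty
cycle of 700 gives regulator-min-microvolt and 300 gives
regulator-max-microvolt.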



[PATCH v3 13/14] regulator: pwm: Support extra continuous mode cases

2016-06-14 Thread Boris Brezillon
The continuous mode allows one to declare a PWM regulator without having
to declare the voltage <-> dutycycle association table. It works fine as
long as your voltage(dutycycle) function is linear, but also has the
following constraints:

- dutycycle for min_uV = 0%
- dutycycle for max_uV = 100%
- dutycycle for min_uV < dutycycle for max_uV

While the linearity constraint is acceptable for now, we sometimes need to
restrict the PWM range (to limit the maximum/minimum voltage for
example) or have a min_uV_dutycycle > max_uV_dutycycle (this could be
tweaked with PWM polarity, but not all PWMs support inverted polarity).

Add the pwm-dutycycle-range and pwm-dutycycle-unit DT properties to define
such constraints. If those properties are not defined, the PWM regulator
uses the default pwm-dutycycle-range = <0 100> and
pwm-dutycycle-unit = <100> values (existing behavior).

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
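As a side note, the linear voltage <-> dutycycle mapping described above
can be sketched in plain user-space C. This is only an illustration under
stated assumptions: struct duty_range, duty_to_uV() and uV_to_duty() are
hypothetical names, not the driver's API, and the caller is assumed to
pass values that lie inside the configured range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container mirroring the two DT properties. */
struct duty_range {
    unsigned int min_uV_duty; /* duty cycle producing min_uV */
    unsigned int max_uV_duty; /* duty cycle producing max_uV */
    unsigned int unit;        /* value representing 100% (informational) */
};

/* Unsigned division rounded to the closest integer. */
static uint64_t div_round_closest(uint64_t n, uint64_t d)
{
    return (n + d / 2) / d;
}

/* Convert a duty cycle (expressed in r->unit) to microvolts. */
static int duty_to_uV(const struct duty_range *r, int min_uV, int max_uV,
                      unsigned int duty)
{
    bool inverted = r->max_uV_duty < r->min_uV_duty;
    unsigned int span = inverted ? r->min_uV_duty - r->max_uV_duty
                                 : r->max_uV_duty - r->min_uV_duty;
    unsigned int off = inverted ? r->min_uV_duty - duty
                                : duty - r->min_uV_duty;

    assert(span != 0 && max_uV > min_uV);
    return min_uV + (int)div_round_closest(
            (uint64_t)off * (uint64_t)(max_uV - min_uV), span);
}

/* Convert a requested voltage to a duty cycle (expressed in r->unit). */
static unsigned int uV_to_duty(const struct duty_range *r, int min_uV,
                               int max_uV, int uV)
{
    bool inverted = r->max_uV_duty < r->min_uV_duty;
    unsigned int span = inverted ? r->min_uV_duty - r->max_uV_duty
                                 : r->max_uV_duty - r->min_uV_duty;
    unsigned int off;

    assert(max_uV > min_uV && uV >= min_uV && uV <= max_uV);
    off = (unsigned int)div_round_closest(
            (uint64_t)(uV - min_uV) * span, (uint64_t)(max_uV - min_uV));
    return inverted ? r->min_uV_duty - off : r->min_uV_duty + off;
}
```

With pwm-dutycycle-unit = <1000>, pwm-dutycycle-range = <700 300> and a
1016000..1114000 uV regulator, a duty cycle of 500 maps to the midpoint
of the voltage range in this sketch, and the inverse call maps it back.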
 drivers/regulator/pwm-regulator.c | 90 +++
 1 file changed, 81 insertions(+), 9 deletions(-)

diff --git a/drivers/regulator/pwm-regulator.c 
b/drivers/regulator/pwm-regulator.c
index 80d083f..fa1c74c 100644
--- a/drivers/regulator/pwm-regulator.c
+++ b/drivers/regulator/pwm-regulator.c
@@ -21,6 +21,12 @@
 #include 
 #include 
 
+struct pwm_continuous_reg_data {
+   unsigned int min_uV_dutycycle;
+   unsigned int max_uV_dutycycle;
+   unsigned int dutycycle_unit;
+};
+
 struct pwm_regulator_data {
/*  Shared */
struct pwm_device *pwm;
@@ -28,6 +34,9 @@ struct pwm_regulator_data {
/* Voltage table */
struct pwm_voltages *duty_cycle_table;
 
+   /* Continuous mode info */
+   struct pwm_continuous_reg_data continuous;
+
/* regulator descriptor */
struct regulator_desc desc;
 
@@ -132,31 +141,77 @@ static int pwm_regulator_is_enabled(struct regulator_dev 
*dev)
 static int pwm_regulator_get_voltage(struct regulator_dev *rdev)
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
+   unsigned int min_uV_duty = drvdata->continuous.min_uV_dutycycle;
+   unsigned int max_uV_duty = drvdata->continuous.max_uV_dutycycle;
+   unsigned int duty_unit = drvdata->continuous.dutycycle_unit;
int min_uV = rdev->constraints->min_uV;
-   int diff = rdev->constraints->max_uV - min_uV;
+   int max_uV = rdev->constraints->max_uV;
+   int diff_uV = max_uV - min_uV;
struct pwm_state pstate;
+   unsigned int diff_duty;
+   unsigned int voltage;
 
pwm_get_state(drvdata->pwm, &pstate);
 
-   return min_uV + pwm_get_relative_duty_cycle(&pstate, diff);
+   voltage = pwm_get_relative_duty_cycle(&pstate, duty_unit);
+
+   /*
+* The dutycycle for min_uV might be greater than the one for max_uV.
+* This is happening when the user needs an inversed polarity, but the
+* PWM device does not support inversing it in hardware.
+*/
+   if (max_uV_duty < min_uV_duty) {
+   voltage = min_uV_duty - voltage;
+   diff_duty = min_uV_duty - max_uV_duty;
+   } else {
+   voltage = voltage - min_uV_duty;
+   diff_duty = max_uV_duty - min_uV_duty;
+   }
+
+   voltage = DIV_ROUND_CLOSEST_ULL((u64)voltage * diff_uV, diff_duty);
+
+   return voltage + min_uV;
 }
 
 static int pwm_regulator_set_voltage(struct regulator_dev *rdev,
-   int min_uV, int max_uV,
-   unsigned *selector)
+int req_min_uV, int req_max_uV,
+unsigned int *selector)
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
+   unsigned int min_uV_duty = drvdata->continuous.min_uV_dutycycle;
+   unsigned int max_uV_duty = drvdata->continuous.max_uV_dutycycle;
+   unsigned int duty_unit = drvdata->continuous.dutycycle_unit;
unsigned int ramp_delay = rdev->constraints->ramp_delay;
-   unsigned int req_diff = min_uV - rdev->constraints->min_uV;
+   int min_uV = rdev->constraints->min_uV;
+   int max_uV = rdev->constraints->max_uV;
+   int diff_uV = max_uV - min_uV;
struct pwm_state pstate;
-   unsigned int diff;
+   unsigned int diff_duty;
+   unsigned int dutycycle;
int ret;
 
pwm_init_state(drvdata->pwm, &pstate);
-   diff = rdev->constraints->max_uV - rdev->constraints->min_uV;
 
-   /* We pass diff as the scale to get a uV precision. */
-   pwm_set_relative_duty_cycle(&pstate, req_diff, diff);
+   /*
+* The dutycycle for min_uV might be greater than the one for max_uV.
+* This is happening when the user needs an inversed polarity, but the
+* PWM device does not support inversing it in hardware.
+*/
+   if (max_uV_duty < min_uV_duty)
+ 

Re: [RFC 06/18] limits: present RLIMIT_CPU and RLIMIT_RTTIMER current status

2016-06-14 Thread Alexey Dobriyan
On Mon, Jun 13, 2016 at 10:44 PM, Topi Miettinen  wrote:
> Present current cputimer status in /proc/self/limits.

> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -650,8 +650,30 @@ static int proc_pid_limits(struct seq_file *m, struct 
> pid_namespace *ns,
> +   switch (i) {
> +   case RLIMIT_RTTIME:
> +   case RLIMIT_CPU:
> +   if (rlim[i].rlim_max == RLIM_INFINITY)
> +   seq_printf(m, "%-20s\n", "-");
> +   else {
> +   unsigned long long utime, ptime;
> +   unsigned long psecs;
> +   struct task_cputime cputime;
> +
> +   thread_group_cputimer(task, &cputime);
> +   utime = cputime_to_expires(cputime.utime);
> +   ptime = utime + 
> cputime_to_expires(cputime.stime);
> +   psecs = cputime_to_secs(ptime);
> +   if (i == RLIMIT_RTTIME)
> +   psecs *= USEC_PER_SEC;
> +   seq_printf(m, "%-20lu\n", psecs);
> +   }
> +   break;

Let's keep rlimits file for rlimits.


[PATCH v3 07/14] pwm: sti: Add support for hardware readout

2016-06-14 Thread Boris Brezillon
Implement ->get_state() to provide support for initial state retrieval.

Signed-off-by: Boris Brezillon 
---
 drivers/pwm/pwm-sti.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/drivers/pwm/pwm-sti.c b/drivers/pwm/pwm-sti.c
index 92abbd5..6300d3e 100644
--- a/drivers/pwm/pwm-sti.c
+++ b/drivers/pwm/pwm-sti.c
@@ -238,6 +238,43 @@ static void sti_pwm_disable(struct pwm_chip *chip, struct 
pwm_device *pwm)
mutex_unlock(&pc->sti_pwm_lock);
 }
 
+static void sti_pwm_get_state(struct pwm_chip *chip,
+ struct pwm_device *pwm,
+ struct pwm_state *state)
+{
+   struct sti_pwm_chip *pc = to_sti_pwmchip(chip);
+   unsigned int regval, prescaler;
+   int ret;
+
+   /* The clock has to be enabled to access PWM registers */
+   ret = clk_enable(pc->clk);
+   if (ret) {
+   dev_err(chip->dev, "Failed to enable PWM clk");
+   return;
+   }
+
+   regmap_field_read(pc->prescale_high, &regval);
+   prescaler = regval << 4;
+   regmap_field_read(pc->prescale_low, &regval);
+   prescaler |= regval;
+   state->period = DIV_ROUND_CLOSEST_ULL((u64)(prescaler + 1) *
+ NSEC_PER_SEC *
+ (pc->cdata->max_pwm_cnt + 1),
+ pc->clk_rate);
+
+   regmap_read(pc->regmap, STI_DS_REG(pwm->hwpwm), &regval);
+   state->duty_cycle = DIV_ROUND_CLOSEST_ULL((u64)(regval + 1) *
+ state->period,
+ pc->cdata->max_pwm_cnt + 1);
+
+   regmap_field_read(pc->pwm_en, &regval);
+   state->enabled = regval;
+
+   state->polarity = PWM_POLARITY_NORMAL;
+
+   clk_disable(pc->clk);
+}
+
 static void sti_pwm_free(struct pwm_chip *chip, struct pwm_device *pwm)
 {
struct sti_pwm_chip *pc = to_sti_pwmchip(chip);
@@ -249,6 +286,7 @@ static const struct pwm_ops sti_pwm_ops = {
.config = sti_pwm_config,
.enable = sti_pwm_enable,
.disable = sti_pwm_disable,
+   .get_state = sti_pwm_get_state,
.free = sti_pwm_free,
.owner = THIS_MODULE,
 };
-- 
2.7.4



[PATCH v3 11/14] regulator: pwm: Properly initialize the ->state field

2016-06-14 Thread Boris Brezillon
The ->state field is currently initialized to 0, thus referencing the
voltage selector at index 0, which might not reflect the current
voltage value.
If possible, retrieve the current voltage selector from the PWM state,
else return -EINVAL.

Signed-off-by: Boris Brezillon 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
Mark,

I know you already added your Acked-by tag on this patch but this
version has slightly changed and is now making use of the
pwm_get_relative_duty_cycle() helper instead of manually converting
the absolute duty_cycle value into a relative one.
---
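For illustration only, the lookup described above (match the duty cycle
read back from the PWM against the voltage table, report an error when
nothing matches) can be modelled in user-space C. struct mock_voltage and
find_selector() are hypothetical stand-ins for the driver's table and
helper, and the duty cycle is assumed to be expressed in percent.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical table entry: a voltage and the duty cycle (in %) for it. */
struct mock_voltage {
    unsigned int uV;
    unsigned int dutycycle;
};

/*
 * Return the index of the entry whose duty cycle equals @dutycycle,
 * or -EINVAL when the current PWM state matches no table entry.
 */
static int find_selector(const struct mock_voltage *table, size_t n,
                         unsigned int dutycycle)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].dutycycle == dutycycle)
            return (int)i;
    }

    return -EINVAL;
}
```

Only an exact match is accepted in this sketch, so a bootloader that left
the PWM at a duty cycle outside the table yields -EINVAL rather than a
guessed selector.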
 drivers/regulator/pwm-regulator.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/regulator/pwm-regulator.c 
b/drivers/regulator/pwm-regulator.c
index ad75360..2000118 100644
--- a/drivers/regulator/pwm-regulator.c
+++ b/drivers/regulator/pwm-regulator.c
@@ -48,10 +48,31 @@ struct pwm_voltages {
 /**
  * Voltage table call-backs
  */
+static void pwm_regulator_init_state(struct regulator_dev *rdev)
+{
+   struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
+   struct pwm_state pwm_state;
+   unsigned int dutycycle;
+   int i;
+
+   pwm_get_state(drvdata->pwm, &pwm_state);
+   dutycycle = pwm_get_relative_duty_cycle(&pwm_state, 100);
+
+   for (i = 0; i < rdev->desc->n_voltages; i++) {
+   if (dutycycle == drvdata->duty_cycle_table[i].dutycycle) {
+   drvdata->state = i;
+   return;
+   }
+   }
+}
+
 static int pwm_regulator_get_voltage_sel(struct regulator_dev *rdev)
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
 
+   if (drvdata->state < 0)
+   pwm_regulator_init_state(rdev);
+
return drvdata->state;
 }
 
@@ -203,6 +224,7 @@ static int pwm_regulator_init_table(struct platform_device 
*pdev,
return ret;
}
 
+   drvdata->state  = -EINVAL;
drvdata->duty_cycle_table   = duty_cycle_table;
memcpy(&drvdata->ops, &pwm_regulator_voltage_table_ops,
   sizeof(drvdata->ops));
-- 
2.7.4



[PATCH v3 10/14] regulator: pwm: Switch to the atomic PWM API

2016-06-14 Thread Boris Brezillon
Use the atomic API wherever appropriate and get rid of pwm_apply_args()
call (the reference period and polarity are now explicitly set when
calling pwm_apply_state()).

We also make use of the pwm_set_relative_duty_cycle() helper to ease
relative to absolute duty_cycle conversion.

Note that changes introduced by commit fd786fb0276a ("regulator: pwm:
Try to avoid voltage error in duty cycle calculation") are no longer
needed because pwm_set_relative_duty_cycle() takes care of all rounding
approximation for us.

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Acked-by: Laxman Dewangan 
Tested-by: Heiko Stuebner 
---
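To make the rounding argument above concrete, here is a small user-space
comparison of the removed two-step calculation with a single
round-to-closest division. duty_old() and duty_new() are hypothetical
names; duty_new() only imitates the arithmetic of a round-to-closest
helper and is not the kernel function itself. Unlike the removed code,
the sketch does the multiplication in 64 bits.

```c
#include <stdint.h>

/*
 * Removed logic: use the exact quotient when it exists, otherwise fall
 * back to a percent-based approximation that truncates twice.
 */
static unsigned int duty_old(unsigned int period, unsigned int req_diff,
                             unsigned int diff)
{
    uint64_t req_period = (uint64_t)req_diff * period;

    if (req_period % diff == 0)
        return (unsigned int)(req_period / diff);

    return (period / 100) * ((req_diff * 100) / diff);
}

/* Replacement logic: one division, rounded to the closest integer. */
static unsigned int duty_new(unsigned int period, unsigned int req_diff,
                             unsigned int diff)
{
    return (unsigned int)(((uint64_t)req_diff * period + diff / 2) / diff);
}
```

For a 3333 ns period and a request halfway through a 98000 uV range, the
exact duty is 1666.5 ns. The fallback branch of duty_old() lands about
16 ns below that value, while duty_new() stays within half a nanosecond.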
 drivers/regulator/pwm-regulator.c | 38 ++
 1 file changed, 10 insertions(+), 28 deletions(-)

diff --git a/drivers/regulator/pwm-regulator.c 
b/drivers/regulator/pwm-regulator.c
index 524b43f..ad75360 100644
--- a/drivers/regulator/pwm-regulator.c
+++ b/drivers/regulator/pwm-regulator.c
@@ -59,16 +59,14 @@ static int pwm_regulator_set_voltage_sel(struct 
regulator_dev *rdev,
 unsigned selector)
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
-   struct pwm_args pargs;
-   int dutycycle;
+   struct pwm_state pstate;
int ret;
 
-   pwm_get_args(drvdata->pwm, &pargs);
+   pwm_init_state(drvdata->pwm, &pstate);
+   pwm_set_relative_duty_cycle(&pstate,
+   drvdata->duty_cycle_table[selector].dutycycle, 100);
 
-   dutycycle = (pargs.period *
-   drvdata->duty_cycle_table[selector].dutycycle) / 100;
-
-   ret = pwm_config(drvdata->pwm, dutycycle, pargs.period);
+   ret = pwm_apply_state(drvdata->pwm, &pstate);
if (ret) {
dev_err(&rdev->dev, "Failed to configure PWM: %d\n", ret);
return ret;
@@ -126,34 +124,18 @@ static int pwm_regulator_set_voltage(struct regulator_dev 
*rdev,
 {
struct pwm_regulator_data *drvdata = rdev_get_drvdata(rdev);
unsigned int ramp_delay = rdev->constraints->ramp_delay;
-   struct pwm_args pargs;
unsigned int req_diff = min_uV - rdev->constraints->min_uV;
+   struct pwm_state pstate;
unsigned int diff;
-   unsigned int duty_pulse;
-   u64 req_period;
-   u32 rem;
int ret;
 
-   pwm_get_args(drvdata->pwm, &pargs);
+   pwm_init_state(drvdata->pwm, &pstate);
diff = rdev->constraints->max_uV - rdev->constraints->min_uV;
 
-   /* First try to find out if we get the iduty cycle time which is
-* factor of PWM period time. If (request_diff_to_min * pwm_period)
-* is perfect divided by voltage_range_diff then it is possible to
-* get duty cycle time which is factor of PWM period. This will help
-* to get output voltage nearer to requested value as there is no
-* calculation loss.
-*/
-   req_period = req_diff * pargs.period;
-   div_u64_rem(req_period, diff, &rem);
-   if (!rem) {
-   do_div(req_period, diff);
-   duty_pulse = (unsigned int)req_period;
-   } else {
-   duty_pulse = (pargs.period / 100) * ((req_diff * 100) / diff);
-   }
+   /* We pass diff as the scale to get a uV precision. */
+   pwm_set_relative_duty_cycle(&pstate, req_diff, diff);
 
-   ret = pwm_config(drvdata->pwm, duty_pulse, pargs.period);
+   ret = pwm_apply_state(drvdata->pwm, &pstate);
if (ret) {
dev_err(&rdev->dev, "Failed to configure PWM: %d\n", ret);
return ret;
-- 
2.7.4



[PATCH v3 08/14] pwm: sti: Avoid glitches on already running PWMs

2016-06-14 Thread Boris Brezillon
The current logic will disable the PWM clk even if a PWM was left
enabled by the bootloader (because it's controlling a critical device
like a regulator for example).
Keep the PWM clk enabled if at least one PWM is enabled to avoid any
glitches.

Signed-off-by: Boris Brezillon 
---
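The enable-count bookkeeping of the probe-time sequence in this patch can
be followed with a tiny user-space model. struct mock_clk and the mock_*
helpers are hypothetical and only track a counter; they are not the
common clk framework API.

```c
#include <stdbool.h>

/* Mock clock that only tracks how many enable references are held. */
struct mock_clk {
    int enable_count;
};

static void mock_clk_enable(struct mock_clk *clk)
{
    clk->enable_count++;
}

static void mock_clk_disable(struct mock_clk *clk)
{
    clk->enable_count--;
}

/*
 * Replay the probe-time sequence: take one reference for probing, one
 * more per PWM found running, then drop the probe reference. Returns
 * the number of references left once probe is done.
 */
static int probe_enable_count(const bool *pwm_enabled, int npwm)
{
    struct mock_clk clk = { 0 };
    int i;

    mock_clk_enable(&clk);      /* enable half of a prepare+enable call */

    for (i = 0; i < npwm; i++) {
        if (pwm_enabled[i])
            mock_clk_enable(&clk);
    }

    mock_clk_disable(&clk);     /* drop the probe-time reference */

    return clk.enable_count;
}
```

In this model the clock ends up with one reference per PWM that was
already running, and with none when every PWM is off, so the clock is
never gated while a bootloader-enabled PWM is driving something.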
 drivers/pwm/pwm-sti.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/pwm/pwm-sti.c b/drivers/pwm/pwm-sti.c
index 6300d3e..5bda51d 100644
--- a/drivers/pwm/pwm-sti.c
+++ b/drivers/pwm/pwm-sti.c
@@ -340,7 +340,7 @@ static int sti_pwm_probe(struct platform_device *pdev)
struct sti_pwm_compat_data *cdata;
struct sti_pwm_chip *pc;
struct resource *res;
-   int ret;
+   int i, ret;
 
pc = devm_kzalloc(dev, sizeof(*pc), GFP_KERNEL);
if (!pc)
@@ -391,7 +391,7 @@ static int sti_pwm_probe(struct platform_device *pdev)
return -EINVAL;
}
 
-   ret = clk_prepare(pc->clk);
+   ret = clk_prepare_enable(pc->clk);
if (ret) {
dev_err(dev, "failed to prepare clock\n");
return ret;
@@ -409,6 +409,16 @@ static int sti_pwm_probe(struct platform_device *pdev)
return ret;
}
 
+   /*
+* Keep the PWM clk enabled if some PWMs appear to be up and
+* running.
+*/
+   for (i = 0; i < pc->chip.npwm; i++) {
+   if (pwm_is_enabled(&pc->chip.pwms[i]))
+   clk_enable(pc->clk);
+   }
+   clk_disable(pc->clk);
+
platform_set_drvdata(pdev, pc);
 
return 0;
-- 
2.7.4



[PATCH v3 09/14] regulator: pwm: Adjust PWM config at probe time

2016-06-14 Thread Boris Brezillon
The PWM attached to a PWM regulator device might have been previously
configured by the bootloader.
Make sure the bootloader and linux config are in sync, and adjust the PWM
config if that's not the case.

Signed-off-by: Boris Brezillon 
Acked-by: Mark Brown 
Acked-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
 drivers/regulator/pwm-regulator.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/regulator/pwm-regulator.c 
b/drivers/regulator/pwm-regulator.c
index ab3cc02..524b43f 100644
--- a/drivers/regulator/pwm-regulator.c
+++ b/drivers/regulator/pwm-regulator.c
@@ -285,11 +285,9 @@ static int pwm_regulator_probe(struct platform_device 
*pdev)
return ret;
}
 
-   /*
-* FIXME: pwm_apply_args() should be removed when switching to the
-* atomic PWM API.
-*/
-   pwm_apply_args(drvdata->pwm);
+   ret = pwm_adjust_config(drvdata->pwm);
+   if (ret)
+   return ret;
 
regulator = devm_regulator_register(&pdev->dev,
&drvdata->desc, &config);
-- 
2.7.4



[PATCH v3 05/14] pwm: rockchip: Avoid glitches on already running PWMs

2016-06-14 Thread Boris Brezillon
The current logic will disable the PWM clk even if the PWM was left
enabled by the bootloader (because it's controlling a critical device
like a regulator for example).
Keep the PWM clk enabled if the PWM is enabled to avoid any glitches.

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
 drivers/pwm/pwm-rockchip.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
index c72b419..dd8ca86 100644
--- a/drivers/pwm/pwm-rockchip.c
+++ b/drivers/pwm/pwm-rockchip.c
@@ -319,7 +319,7 @@ static int rockchip_pwm_probe(struct platform_device *pdev)
if (IS_ERR(pc->clk))
return PTR_ERR(pc->clk);
 
-   ret = clk_prepare(pc->clk);
+   ret = clk_prepare_enable(pc->clk);
if (ret)
return ret;
 
@@ -342,6 +342,10 @@ static int rockchip_pwm_probe(struct platform_device *pdev)
dev_err(&pdev->dev, "pwmchip_add() failed: %d\n", ret);
}
 
+   /* Keep the PWM clk enabled if the PWM appears to be up and running. */
+   if (!pwm_is_enabled(pc->chip.pwms))
+   clk_disable(pc->clk);
+
return ret;
 }
 
@@ -349,6 +353,20 @@ static int rockchip_pwm_remove(struct platform_device 
*pdev)
 {
struct rockchip_pwm_chip *pc = platform_get_drvdata(pdev);
 
+   /*
+* Disable the PWM clk before unpreparing it if the PWM device is still
+* running. This should only happen when the last PWM user left it
+* enabled, or when nobody requested a PWM that was previously enabled
+* by the bootloader.
+*
+* FIXME: Maybe the core should disable all PWM devices in
+* pwmchip_remove(). In this case we'd only have to call
+* clk_unprepare() after pwmchip_remove().
+*
+*/
+   if (pwm_is_enabled(pc->chip.pwms))
+   clk_disable(pc->clk);
+
clk_unprepare(pc->clk);
 
return pwmchip_remove(&pc->chip);
-- 
2.7.4



[PATCH v3 01/14] pwm: Add an helper to prepare a new PWM state

2016-06-14 Thread Boris Brezillon
The pwm_init_state() helper prepares a new state object containing the
current PWM state except for the polarity and period fields which are
set to the reference values (those in pwm_args).
This is particularly useful for PWM users who want to apply a new
duty-cycle expressed relatively to the reference period without
changing the enable state.

Signed-off-by: Boris Brezillon 
Tested-by: Heiko Stuebner 
---
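A user-space model of the behaviour described above may help when
reviewing: it starts from the current state, overrides period and
polarity with the reference values, and clears the duty cycle. All
mock_* types and mock_init_state() are hypothetical stand-ins, not the
kernel definitions.

```c
#include <stdbool.h>

enum mock_polarity {
    MOCK_POLARITY_NORMAL,
    MOCK_POLARITY_INVERSED,
};

/* Reference configuration (what the board description asks for). */
struct mock_args {
    unsigned int period;
    enum mock_polarity polarity;
};

/* Current hardware state. */
struct mock_state {
    unsigned int period;
    unsigned int duty_cycle;
    enum mock_polarity polarity;
    bool enabled;
};

struct mock_pwm {
    struct mock_args args;
    struct mock_state state;
};

/* Prepare a state to be tweaked and applied later. */
static void mock_init_state(const struct mock_pwm *pwm,
                            struct mock_state *state)
{
    /* Start from the current state so ->enabled is preserved. */
    *state = pwm->state;

    /* Then switch to the reference period and polarity. */
    state->period = pwm->args.period;
    state->polarity = pwm->args.polarity;

    /* The old duty cycle may exceed the new period, so reset it. */
    state->duty_cycle = 0;
}
```

If the bootloader left a 5000 ns period with a 4000 ns duty cycle and the
reference period is 1000 ns, keeping the old duty cycle would describe an
impossible state, which is why the sketch (like the helper it imitates)
zeroes it.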
 include/linux/pwm.h | 33 +
 1 file changed, 33 insertions(+)

diff --git a/include/linux/pwm.h b/include/linux/pwm.h
index 17018f3..a100f6e 100644
--- a/include/linux/pwm.h
+++ b/include/linux/pwm.h
@@ -148,6 +148,39 @@ static inline void pwm_get_args(const struct pwm_device 
*pwm,
 }
 
 /**
+ * pwm_init_state() - prepare a new state to be applied with pwm_apply_state()
+ * @pwm: PWM device
+ * @state: state to fill with the prepared PWM state
+ *
+ * This function prepares a state that can later be tweaked and applied
+ * to the PWM device with pwm_apply_state(). This is a convenient function
+ * that first retrieves the current PWM state and then replaces the period
+ * and polarity fields with the reference values defined in pwm->args.
+ * Once the function returns, you can adjust the ->enabled and ->duty_cycle
+ * fields according to your needs before calling pwm_apply_state().
+ *
+ * ->duty_cycle is initially set to zero to avoid cases where the current
+ * ->duty_cycle value exceeds the pwm_args->period one, which would trigger
+ * an error if the user calls pwm_apply_state() without adjusting ->duty_cycle
+ * first.
+ */
+static inline void pwm_init_state(const struct pwm_device *pwm,
+ struct pwm_state *state)
+{
+   struct pwm_args args;
+
+   /* First get the current state. */
+   pwm_get_state(pwm, state);
+
+   /* Then fill it with the reference config */
+   pwm_get_args(pwm, &args);
+
+   state->period = args.period;
+   state->polarity = args.polarity;
+   state->duty_cycle = 0;
+}
+
+/**
  * struct pwm_ops - PWM controller operations
  * @request: optional hook for requesting a PWM
  * @free: optional hook for freeing a PWM
-- 
2.7.4



[PATCH v3 06/14] pwm: rockchip: Add support for atomic update

2016-06-14 Thread Boris Brezillon
Implement the ->apply() function to add support for atomic update.

Signed-off-by: Boris Brezillon 
Tested-by: Heiko Stuebner 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
---
 drivers/pwm/pwm-rockchip.c | 84 --
 1 file changed, 43 insertions(+), 41 deletions(-)

diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
index dd8ca86..ef89df1 100644
--- a/drivers/pwm/pwm-rockchip.c
+++ b/drivers/pwm/pwm-rockchip.c
@@ -47,10 +47,12 @@ struct rockchip_pwm_regs {
 struct rockchip_pwm_data {
struct rockchip_pwm_regs regs;
unsigned int prescaler;
+   bool supports_polarity;
const struct pwm_ops *ops;
 
void (*set_enable)(struct pwm_chip *chip,
-  struct pwm_device *pwm, bool enable);
+  struct pwm_device *pwm, bool enable,
+  enum pwm_polarity polarity);
void (*get_state)(struct pwm_chip *chip, struct pwm_device *pwm,
  struct pwm_state *state);
 };
@@ -61,7 +63,8 @@ static inline struct rockchip_pwm_chip 
*to_rockchip_pwm_chip(struct pwm_chip *c)
 }
 
 static void rockchip_pwm_set_enable_v1(struct pwm_chip *chip,
-  struct pwm_device *pwm, bool enable)
+  struct pwm_device *pwm, bool enable,
+  enum pwm_polarity polarity)
 {
struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
u32 enable_conf = PWM_CTRL_OUTPUT_EN | PWM_CTRL_TIMER_EN;
@@ -91,14 +94,15 @@ static void rockchip_pwm_get_state_v1(struct pwm_chip *chip,
 }
 
 static void rockchip_pwm_set_enable_v2(struct pwm_chip *chip,
-  struct pwm_device *pwm, bool enable)
+  struct pwm_device *pwm, bool enable,
+  enum pwm_polarity polarity)
 {
struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
u32 enable_conf = PWM_OUTPUT_LEFT | PWM_LP_DISABLE | PWM_ENABLE |
  PWM_CONTINUOUS;
u32 val;
 
-   if (pwm_get_polarity(pwm) == PWM_POLARITY_INVERSED)
+   if (polarity == PWM_POLARITY_INVERSED)
enable_conf |= PWM_DUTY_NEGATIVE | PWM_INACTIVE_POSITIVE;
else
enable_conf |= PWM_DUTY_POSITIVE | PWM_INACTIVE_NEGATIVE;
@@ -166,7 +170,6 @@ static int rockchip_pwm_config(struct pwm_chip *chip, 
struct pwm_device *pwm,
struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
unsigned long period, duty;
u64 clk_rate, div;
-   int ret;
 
clk_rate = clk_get_rate(pc->clk);
 
@@ -182,69 +185,66 @@ static int rockchip_pwm_config(struct pwm_chip *chip, 
struct pwm_device *pwm,
div = clk_rate * duty_ns;
duty = DIV_ROUND_CLOSEST_ULL(div, pc->data->prescaler * NSEC_PER_SEC);
 
-   ret = clk_enable(pc->clk);
-   if (ret)
-   return ret;
-
writel(period, pc->base + pc->data->regs.period);
writel(duty, pc->base + pc->data->regs.duty);
-   writel(0, pc->base + pc->data->regs.cntr);
-
-   clk_disable(pc->clk);
-
-   return 0;
-}
-
-static int rockchip_pwm_set_polarity(struct pwm_chip *chip,
-struct pwm_device *pwm,
-enum pwm_polarity polarity)
-{
-   /*
-* No action needed here because pwm->polarity will be set by the core
-* and the core will only change polarity when the PWM is not enabled.
-* We'll handle things in set_enable().
-*/
 
return 0;
 }
 
-static int rockchip_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
+static int rockchip_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+ struct pwm_state *state)
 {
struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
+   struct pwm_state curstate;
+   bool enabled;
int ret;
 
+   pwm_get_state(pwm, &curstate);
+   enabled = curstate.enabled;
+
ret = clk_enable(pc->clk);
if (ret)
return ret;
 
-   pc->data->set_enable(chip, pwm, true);
+   if (state->polarity != curstate.polarity && enabled) {
+   pc->data->set_enable(chip, pwm, false, state->polarity);
+   enabled = false;
+   }
 
-   return 0;
-}
+   ret = rockchip_pwm_config(chip, pwm, state->duty_cycle, state->period);
+   if (ret) {
+   if (enabled != curstate.enabled)
+   pc->data->set_enable(chip, pwm, !enabled,
+state->polarity);
 
-static void rockchip_pwm_disable(struct pwm_chip *chip, struct pwm_device *pwm)
-{
-   struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
+   goto out;
+   }
+
+   if (state->enabled != enabled)
+   pc->data->set_enable(chip, pwm, s

[PATCH v3 00/14] regulator: pwm: various improvements

2016-06-14 Thread Boris Brezillon
Hello,

This patch series aims at adding two important features to the
pwm-regulator driver.

The first one is the support for 'smooth handover' between the
bootloader and the kernel. This is mainly solving problems we have when
the PWM is controlling a critical regulator (like the one powering the
DDR chip). Currently, when the PWM regulator acquires the PWM device it
assumes it was off and it's safe to change the configuration before
enabling it, which can generate glitches on the PWM signal which in turn
generate glitches on the output voltage.
To solve that we've introduced support for hardware readout to the
PWM framework, so that the PWM regulator driver can adjust the PWM
at probe time and avoid glitches.
Atomic update is also helping in this regard.

Patch 1 is adding convenient helpers (at the PWM level) that will be
used by the PWM regulator driver.
Patches 2 to 7 are preparing everything on the PWM driver side to make
hardware readout available to all platforms using the PWM regulator
driver (rockchip and sti).
Patches 8 to 11 are making use of the atomic update and hardware readout
features to implement this smooth handover.

The second feature we add to the driver is the capability of using
a sub duty_cycle range in continuous mode. By default the regulator
is assuming that min_uV is achieved with a 0% dutycycle and max_uV
with a 100% dutycycle, but this is not necessarily true.
Moreover, in some cases (when the PWM device does not support
polarity inversion), we might have min_uV at 100% and max_uV at 0%.
Hence the addition of new properties to the existing DT bindings.
The feature is added in patches 12 and 13.

Best Regards,

Boris

Changes since v2:
- add Heiko's Tested-by
- split patch 1 in 2 patches
- rework the documentation
- rename pwm_prepare_new_state() into pwm_init_state()
- make pwm_set_relative_duty_cycle() return an error code when scale
  or duty_cycle are inconsistent

Changes since v1:
- dropped already applied patches
- added R-b/A-b/T-b tags
- s/readl/readl_relaxed/ in patch 3 (as suggested by Brian)
- fixed pwm-regulator DT binding doc
- added some comments in the code
- replaced pwm_get_state() + if (state.enabled) by if (pwm_is_enabled())

Boris Brezillon (14):
  pwm: Add an helper to prepare a new PWM state
  pwm: Add two helpers to ease relative duty cycle manipulation
  pwm: rockchip: Fix period and duty_cycle approximation
  pwm: rockchip: Add support for hardware readout
  pwm: rockchip: Avoid glitches on already running PWMs
  pwm: rockchip: Add support for atomic update
  pwm: sti: Add support for hardware readout
  pwm: sti: Avoid glitches on already running PWMs
  regulator: pwm: Adjust PWM config at probe time
  regulator: pwm: Switch to the atomic PWM API
  regulator: pwm: Properly initialize the ->state field
  regulator: pwm: Retrieve correct voltage
  regulator: pwm: Support extra continuous mode cases
  regulator: pwm: Document pwm-dutycycle-unit and pwm-dutycycle-range

 .../bindings/regulator/pwm-regulator.txt   |  19 +++
 drivers/pwm/pwm-rockchip.c | 178 +++--
 drivers/pwm/pwm-sti.c  |  52 +-
 drivers/regulator/pwm-regulator.c  | 160 +-
 include/linux/pwm.h|  89 +++
 5 files changed, 407 insertions(+), 91 deletions(-)

-- 
2.7.4



[PATCH v3 04/14] pwm: rockchip: Add support for hardware readout

2016-06-14 Thread Boris Brezillon
Implement the ->get_state() function to expose initial state.

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
 drivers/pwm/pwm-rockchip.c | 67 ++
 1 file changed, 67 insertions(+)

diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
index 68d72ce..c72b419 100644
--- a/drivers/pwm/pwm-rockchip.c
+++ b/drivers/pwm/pwm-rockchip.c
@@ -51,6 +51,8 @@ struct rockchip_pwm_data {
 
void (*set_enable)(struct pwm_chip *chip,
   struct pwm_device *pwm, bool enable);
+   void (*get_state)(struct pwm_chip *chip, struct pwm_device *pwm,
+ struct pwm_state *state);
 };
 
 static inline struct rockchip_pwm_chip *to_rockchip_pwm_chip(struct pwm_chip 
*c)
@@ -75,6 +77,19 @@ static void rockchip_pwm_set_enable_v1(struct pwm_chip *chip,
writel_relaxed(val, pc->base + pc->data->regs.ctrl);
 }
 
+static void rockchip_pwm_get_state_v1(struct pwm_chip *chip,
+ struct pwm_device *pwm,
+ struct pwm_state *state)
+{
+   struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
+   u32 enable_conf = PWM_CTRL_OUTPUT_EN | PWM_CTRL_TIMER_EN;
+   u32 val;
+
+   val = readl_relaxed(pc->base + pc->data->regs.ctrl);
+   if ((val & enable_conf) == enable_conf)
+   state->enabled = true;
+}
+
 static void rockchip_pwm_set_enable_v2(struct pwm_chip *chip,
   struct pwm_device *pwm, bool enable)
 {
@@ -98,6 +113,53 @@ static void rockchip_pwm_set_enable_v2(struct pwm_chip 
*chip,
writel_relaxed(val, pc->base + pc->data->regs.ctrl);
 }
 
+static void rockchip_pwm_get_state_v2(struct pwm_chip *chip,
+ struct pwm_device *pwm,
+ struct pwm_state *state)
+{
+   struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
+   u32 enable_conf = PWM_OUTPUT_LEFT | PWM_LP_DISABLE | PWM_ENABLE |
+ PWM_CONTINUOUS;
+   u32 val;
+
+   val = readl_relaxed(pc->base + pc->data->regs.ctrl);
+   if ((val & enable_conf) != enable_conf)
+   return;
+
+   state->enabled = true;
+
+   if (!(val & PWM_DUTY_POSITIVE))
+   state->polarity = PWM_POLARITY_INVERSED;
+}
+
+static void rockchip_pwm_get_state(struct pwm_chip *chip,
+  struct pwm_device *pwm,
+  struct pwm_state *state)
+{
+   struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
+   unsigned long clk_rate;
+   u64 tmp;
+   int ret;
+
+   ret = clk_enable(pc->clk);
+   if (ret)
+   return;
+
+   clk_rate = clk_get_rate(pc->clk);
+
+   tmp = readl_relaxed(pc->base + pc->data->regs.period);
+   tmp *= pc->data->prescaler * NSEC_PER_SEC;
+   state->period = DIV_ROUND_CLOSEST_ULL(tmp, clk_rate);
+
+   tmp = readl_relaxed(pc->base + pc->data->regs.duty);
+   tmp *= pc->data->prescaler * NSEC_PER_SEC;
+   state->duty_cycle = DIV_ROUND_CLOSEST_ULL(tmp, clk_rate);
+
+   pc->data->get_state(chip, pwm, state);
+
+   clk_disable(pc->clk);
+}
+
 static int rockchip_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
   int duty_ns, int period_ns)
 {
@@ -170,6 +232,7 @@ static void rockchip_pwm_disable(struct pwm_chip *chip, 
struct pwm_device *pwm)
 }
 
 static const struct pwm_ops rockchip_pwm_ops_v1 = {
+   .get_state = rockchip_pwm_get_state,
.config = rockchip_pwm_config,
.enable = rockchip_pwm_enable,
.disable = rockchip_pwm_disable,
@@ -177,6 +240,7 @@ static const struct pwm_ops rockchip_pwm_ops_v1 = {
 };
 
 static const struct pwm_ops rockchip_pwm_ops_v2 = {
+   .get_state = rockchip_pwm_get_state,
.config = rockchip_pwm_config,
.set_polarity = rockchip_pwm_set_polarity,
.enable = rockchip_pwm_enable,
@@ -194,6 +258,7 @@ static const struct rockchip_pwm_data pwm_data_v1 = {
.prescaler = 2,
.ops = &rockchip_pwm_ops_v1,
.set_enable = rockchip_pwm_set_enable_v1,
+   .get_state = rockchip_pwm_get_state_v1,
 };
 
 static const struct rockchip_pwm_data pwm_data_v2 = {
@@ -206,6 +271,7 @@ static const struct rockchip_pwm_data pwm_data_v2 = {
.prescaler = 1,
.ops = &rockchip_pwm_ops_v2,
.set_enable = rockchip_pwm_set_enable_v2,
+   .get_state = rockchip_pwm_get_state_v2,
 };
 
 static const struct rockchip_pwm_data pwm_data_vop = {
@@ -218,6 +284,7 @@ static const struct rockchip_pwm_data pwm_data_vop = {
.prescaler = 1,
.ops = &rockchip_pwm_ops_v2,
.set_enable = rockchip_pwm_set_enable_v2,
+   .get_state = rockchip_pwm_get_state_v2,
 };
 
 static const struct of_device_id rockchip_pwm_dt_ids[] = {
-- 
2.7.4



[PATCH v3 02/14] pwm: Add two helpers to ease relative duty cycle manipulation

2016-06-14 Thread Boris Brezillon
The PWM framework expects PWM users to configure the duty cycle in
nanoseconds, but most users just want to express this duty cycle
relatively to the period value (i.e. duty_cycle = 33% of the period).
Add the pwm_{get,set}_relative_duty_cycle() helpers to ease this kind
of conversion.

Signed-off-by: Boris Brezillon 
Tested-by: Heiko Stuebner 
---
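For readers who want to try the conversion outside the kernel, the
arithmetic can be mirrored in user-space C. struct rel_state and the two
functions below are hypothetical names that imitate the behaviour
described above (round to closest, -EINVAL on an inconsistent scale);
they are not the kernel helpers.

```c
#include <errno.h>
#include <stdint.h>

/* Minimal state: both fields are expressed in nanoseconds. */
struct rel_state {
    unsigned int period;
    unsigned int duty_cycle;
};

/* Absolute -> relative: duty_cycle expressed on a 0..scale range. */
static unsigned int get_relative_duty(const struct rel_state *s,
                                      unsigned int scale)
{
    if (!s->period)
        return 0;

    return (unsigned int)(((uint64_t)s->duty_cycle * scale +
                           s->period / 2) / s->period);
}

/* Relative -> absolute: fill duty_cycle from duty/scale of the period. */
static int set_relative_duty(struct rel_state *s, unsigned int duty,
                             unsigned int scale)
{
    /* Reject a zero scale and a duty cycle above 100% of the scale. */
    if (!scale || duty > scale)
        return -EINVAL;

    s->duty_cycle = (unsigned int)(((uint64_t)duty * s->period +
                                    scale / 2) / scale);
    return 0;
}
```

With a 3333 ns period, asking for 33 out of 100 stores 1100 ns (the exact
value is 1099.89 ns), and reading it back on the same scale returns 33
again, so the percent value survives a round trip in this sketch.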
 include/linux/pwm.h | 56 +
 1 file changed, 56 insertions(+)

diff --git a/include/linux/pwm.h b/include/linux/pwm.h
index a100f6e..05e4ea4 100644
--- a/include/linux/pwm.h
+++ b/include/linux/pwm.h
@@ -181,6 +181,62 @@ static inline void pwm_init_state(const struct pwm_device 
*pwm,
 }
 
 /**
+ * pwm_get_relative_duty_cycle() - Get a relative duty_cycle value
+ * @state: PWM state to extract the duty_cycle from
+ * @scale: target scale of the relative duty cycle
+ *
+ * This function converts the absolute duty_cycle stored in @state
+ * (expressed in nanoseconds) into a value relative to the period.
+ * For example if you want to get the duty_cycle expressed in percent,
+ * call:
+ *
+ * pwm_get_state(pwm, &state);
+ * duty = pwm_get_relative_duty_cycle(&state, 100);
+ */
+static inline unsigned int
+pwm_get_relative_duty_cycle(const struct pwm_state *state, unsigned int scale)
+{
+   if (!state->period)
+   return 0;
+
+   return DIV_ROUND_CLOSEST_ULL((u64)state->duty_cycle * scale,
+state->period);
+}
+
+/**
+ * pwm_set_relative_duty_cycle() - Set a relative duty_cycle value
+ * @state: PWM state to fill
+ * @duty_cycle: relative duty_cycle value
+ * @scale: scale in which @duty_cycle is expressed
+ *
+ * This function converts a relative duty_cycle into an absolute one
+ * (expressed in nanoseconds), and puts the result in state->duty_cycle.
+ *
+ * For example if you want to configure a 50% duty_cycle, call:
+ *
+ * pwm_init_state(pwm, &state);
+ * pwm_set_relative_duty_cycle(&state, 50, 100);
+ * pwm_apply_state(pwm, &state);
+ *
+ * This function returns -EINVAL if @duty_cycle and/or @scale are
+ * inconsistent (@scale == 0 or @duty_cycle > @scale).
+ */
+static inline int
+pwm_set_relative_duty_cycle(struct pwm_state *state, unsigned int duty_cycle,
+			    unsigned int scale)
+{
+	/* Make sure @scale is > 0 and @duty_cycle <= @scale */
+	if (!scale || duty_cycle > scale)
+		return -EINVAL;
+
+	state->duty_cycle = DIV_ROUND_CLOSEST_ULL((u64)duty_cycle *
+						  state->period,
+						  scale);
+
+	return 0;
+}
+
+/**
  * struct pwm_ops - PWM controller operations
  * @request: optional hook for requesting a PWM
  * @free: optional hook for freeing a PWM
-- 
2.7.4
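
[Annotation] The conversion done by the two helpers in the patch above can
be tried outside the kernel. The following is only a userspace sketch: the
struct pwm_state_sketch type, the div_round_closest_u64() helper and the
function names are stand-ins invented for illustration, and only the
arithmetic is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct pwm_state,
 * reduced to the two fields the helpers touch. Times are in nanoseconds.
 */
struct pwm_state_sketch {
	unsigned int period;
	unsigned int duty_cycle;
};

/*
 * Unsigned division rounded to the closest integer. This plays the role
 * that DIV_ROUND_CLOSEST_ULL() plays in the patch.
 */
uint64_t div_round_closest_u64(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d / 2) / d;
}

/* Absolute duty cycle (ns) -> value relative to the period, in @scale units. */
unsigned int get_relative_duty_cycle(const struct pwm_state_sketch *state,
				     unsigned int scale)
{
	if (!state->period)
		return 0;

	return (unsigned int)div_round_closest_u64(
			(uint64_t)state->duty_cycle * scale, state->period);
}

/* Relative duty cycle in @scale units -> absolute duty cycle (ns). */
int set_relative_duty_cycle(struct pwm_state_sketch *state,
			    unsigned int duty_cycle, unsigned int scale)
{
	/* Same sanity check as the patch: scale > 0 and duty_cycle <= scale. */
	if (!scale || duty_cycle > scale)
		return -EINVAL;

	state->duty_cycle = (unsigned int)div_round_closest_u64(
			(uint64_t)duty_cycle * state->period, scale);
	return 0;
}
```

With a period of 3000 ns, setting 50 out of 100 gives a duty cycle of
1500 ns, and reading it back with a scale of 100 gives 50 again. The cast
to a 64-bit type before the multiplication is what keeps
duty_cycle * period from overflowing 32 bits.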



[PATCH v3 03/14] pwm: rockchip: Fix period and duty_cycle approximation

2016-06-14 Thread Boris Brezillon
The current implementation always rounds down the duty and period
values, while it would be better to round them to the closest integer.

These changes are needed in preparation for atomic update support to
prevent a period/duty cycle drift when executing the
'pwm_get_state() / modify / pwm_apply_state()' sequence several times.

Say you have an expected period of 3.333 us and a clk rate of
112.666667 MHz -- the clock frequency doesn't divide evenly,
so the period (stashed in nanoseconds) shrinks when we convert to
the register value and back, as follows:

  pwm_apply_state(): register = period * 112666667 / 1000000000;
  pwm_get_state(): period = register * 1000000000 / 112666667;

or in other words:

  period = period * 112666667 / 1000000000 * 1000000000 / 112666667;

which yields a sequence like:

  3333 -> 3328
  3328 -> 3319
  3319 -> 3310
  3310 -> 3301
  3301 -> 3292
  3292 -> ... (etc) ...

With this patch, we'd see instead:

  period = div_round_closest(period * 112666667, 1000000000) *
           1000000000 / 112666667;

which yields a stable sequence:

  3333 -> 3337
  3337 -> 3337
  3337 -> ... (etc) ...

Signed-off-by: Boris Brezillon 
Reviewed-by: Brian Norris 
Tested-by: Brian Norris 
Tested-by: Heiko Stuebner 
---
 drivers/pwm/pwm-rockchip.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
index 7d9cc90..68d72ce 100644
--- a/drivers/pwm/pwm-rockchip.c
+++ b/drivers/pwm/pwm-rockchip.c
@@ -114,12 +114,11 @@ static int rockchip_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 	 * default prescaler value for all practical clock rate values.
 	 */
 	div = clk_rate * period_ns;
-	do_div(div, pc->data->prescaler * NSEC_PER_SEC);
-	period = div;
+	period = DIV_ROUND_CLOSEST_ULL(div,
+				       pc->data->prescaler * NSEC_PER_SEC);
 
 	div = clk_rate * duty_ns;
-	do_div(div, pc->data->prescaler * NSEC_PER_SEC);
-	duty = div;
+	duty = DIV_ROUND_CLOSEST_ULL(div, pc->data->prescaler * NSEC_PER_SEC);
 
 	ret = clk_enable(pc->clk);
 	if (ret)
-- 
2.7.4
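
[Annotation] The drift described in the commit message above can be
reproduced with plain integer arithmetic. This is a userspace sketch, not
driver code: the function names are made up, the prescaler is left out, and
the clock rate of 112666667 Hz used in the note below is an assumption,
picked because it reproduces the sequences listed in the commit message
(3333 -> 3328 -> 3319 when truncating, 3333 -> 3337 -> 3337 when rounding
to closest).

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* Period in ns -> register ticks, truncating (behaviour before the patch). */
uint64_t period_to_reg_floor(uint64_t period_ns, uint64_t clk_hz)
{
	return period_ns * clk_hz / NSEC_PER_SEC_SKETCH;
}

/* Period in ns -> register ticks, rounded to closest (behaviour after). */
uint64_t period_to_reg_closest(uint64_t period_ns, uint64_t clk_hz)
{
	return (period_ns * clk_hz + NSEC_PER_SEC_SKETCH / 2) /
	       NSEC_PER_SEC_SKETCH;
}

/* Register ticks -> period in ns; truncating in both cases. */
uint64_t reg_to_period(uint64_t reg, uint64_t clk_hz)
{
	assert(clk_hz != 0);
	return reg * NSEC_PER_SEC_SKETCH / clk_hz;
}
```

Calling reg_to_period(period_to_reg_floor(p, clk), clk) repeatedly, starting
from p = 3333 with clk = 112666667, loses a few nanoseconds on every round
trip. The period_to_reg_closest() variant lands on 3337 after the first
round trip and stays there.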



[PATCH] more mapcount page as kpage could reduce total replacement times than fewer mapcount one in probability.

2016-06-14 Thread zhouxianrong
From: z00281421 

Selecting the page with the higher mapcount as the kpage is likely to
reduce the total number of replacements, compared with selecting the
page with the lower mapcount, when ksmd later scans and replaces
among forked pages.

Signed-off-by: z00281421 
---
 mm/ksm.c |   15 +++
 1 file changed, 15 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index 4786b41..17a238c 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1094,6 +1094,21 @@ static struct page *try_to_merge_two_pages(struct rmap_item *rmap_item,
 {
 	int err;
 
+	/*
+	 * select more mapcount page as kpage
+	 */
+	if (page_mapcount(page) < page_mapcount(tree_page)) {
+		struct page *tmp_page;
+		struct rmap_item *tmp_rmap_item;
+
+		tmp_page = page;
+		page = tree_page;
+		tree_page = tmp_page;
+		tmp_rmap_item = rmap_item;
+		rmap_item = tree_rmap_item;
+		tree_rmap_item = tmp_rmap_item;
+	}
+
 	err = try_to_merge_with_ksm_page(rmap_item, page, NULL);
 	if (!err) {
 		err = try_to_merge_with_ksm_page(tree_rmap_item,
-- 
1.7.9.5



Re: [RESEND PATCH v5 0/1] ARM64: ACPI: Update documentation for latest specification version

2016-06-14 Thread Will Deacon
On Tue, Jun 14, 2016 at 10:13:31AM +0100, Will Deacon wrote:
> On Mon, Jun 13, 2016 at 03:41:54PM -0600, Al Stone wrote:
> > This is a resend only: Ping?  Last ping was 26 May; there has been zero
> > response since then.  Already have one ACK from Lorenzo; another from an
> > arm64 maintainer would be really helpful.
> 
> I thought there were outstanding comments on v4 of this?
> 
> http://lkml.kernel.org/r/571e699b.9090...@linaro.org

Hmm, that's weird. You sent a v5 on 25 April:

http://archive.arm.linux.org.uk/lurker/message/20160425.212126.fe36116c.en.html

but I can't find that in either my ARM inbox or my gmail inbox that's
subscribed to the lists. I guess Catalin can pick this up for 4.8.

Will


Re: [very-RFC 0/8] TSN driver for the kernel

2016-06-14 Thread Henrik Austad
On Mon, Jun 13, 2016 at 09:32:10PM +0200, Richard Cochran wrote:
> On Mon, Jun 13, 2016 at 03:00:59PM +0200, Henrik Austad wrote:
> > On Mon, Jun 13, 2016 at 01:47:13PM +0200, Richard Cochran wrote:
> > > Which driver is that?
> > 
> > drivers/net/ethernet/renesas/
> 
> That driver is merely a PTP capable MAC driver, nothing more.
> Although AVB is in the device name, the driver doesn't implement
> anything beyond the PTP bits.

Yes, I think they do the rest from userspace, not sure though :)

> > What is the rationale for no new sockets? To avoid cluttering? or do 
> > sockets have a drawback I'm not aware of?
> 
> The current raw sockets will work just fine.  Again, there should be an
> application that sits in between with the network socket and the audio
> interface.

So loop data from kernel -> userspace -> kernelspace and finally back to 
userspace and the media application? I agree that you need a way to pipe 
the incoming data directly from the network to userspace for those TSN 
users that can handle it. But again, for media-applications that don't know 
(or care) about AVB, it should be fed to ALSA/v4l2 directly and not jump 
between kernel and userspace an extra round.

I get the point of not including every single audio/video encoder in the 
kernel, but raw audio should be piped directly to alsa. V4L2 has a way of 
piping encoded video through the system and to the media application (in 
order to support cameras that do encoding). The same approach should be 
doable for AVB, no? (someone from alsa/v4l2 should probably comment on 
this)

> > Why is configfs wrong?
> 
> Because the application will use the already existing network and
> audio interfaces to configure the system.

Configuring this via the audio-interface is going to be a challenge since 
you need to configure the stream through the network before you can create 
the audio interface. If not, you will have to either drop data or block the 
caller until the link has been fully configured.

This is actually the reason why configfs is used in the series now, as it 
allows userspace to figure out all the different attributes and configure 
the link before letting ALSA start pushing data.

> > > Lets take a look at the big picture.  One aspect of TSN is already
> > > fully supported, namely the gPTP.  Using the linuxptp user stack and a
> > > modern kernel, you have a complete 802.1AS-2011 solution.
> > 
> > Yes, I thought so, which is also why I have put that to the side and why 
> > I'm using ktime_get() for timestamps at the moment. There's also the issue 
> > of hooking the time into ALSA/V4L2
> 
> So lets get that issue solved before anything else.  It is absolutely
> essential for TSN.  Without the synchronization, you are only playing
> audio over the network.  We already have software for that.

Yes, I agree, presentation-time and local time need to be handled 
properly. The same for adjusting sample-rate etc. This is a lot of work, so 
I hope you can understand why I started out with a simple approach to spark 
a discussion before moving on to the larger bits.

> > > 2. A user space audio application that puts it all together, making
> > >use of the services in #1, the linuxptp gPTP service, the ALSA
> > >services, and the network connections.  This program will have all
> > >the knowledge about packet formats, AV encodings, and the local HW
> > >capabilities.  This program cannot yet be written, as we still need
> > >some kernel work in the audio and networking subsystems.
> > 
> > Why?
> 
> Because user space is right place to place the knowledge of the myriad
> formats and options.

See response above, better to let anything but uncompressed raw data trickle 
through.

> > the whole point should be to make it as easy for userspace as 
> > possible. If you need to tailor each individual media application to use 
> > AVB, it is not going to be very useful outside pro-Audio. Sure, there will 
> > be challenges, but one key element here should be to *not* require 
> > upgrading every single media application.
> > 
> > Then, back to the suggestion of adding a TSN_SOCKET (which you didn't like, 
> > but can we agree on a term "raw interface to TSN", and mode of transport 
> > can be defined later? ), was to let those applications that are TSN-aware 
> > to do what they need to do, whether it is controlling robots or media 
> > streams.
> 
> First you say you don't want to upgrade media applications, but then
> you invent a new socket type.  That is a contradiction in terms.

Hehe, no, bad phrasing on my part. I want *both* (hence the shim-interface) 
:)

> Audio apps already use networking, and they already use the audio
> subsystem.  We need to help them get their job done by providing the
> missing kernel interfaces.  They don't need extra magic buffering in the
> kernel.  They already can buffer audio data by themselves.

Yes, I know some audio apps "use networking", I can stream netradio, I can 
use jack to connec

Re: [PATCH V9 09/11] ARM64/PCI: ACPI support for legacy IRQs parsing and consolidation with DT code

2016-06-14 Thread Lorenzo Pieralisi
On Mon, Jun 13, 2016 at 01:01:35PM -0700, Duc Dang wrote:
> On Mon, Jun 13, 2016 at 3:40 AM, Lorenzo Pieralisi
>  wrote:
> >
> > On Fri, Jun 10, 2016 at 06:36:12PM -0500, Bjorn Helgaas wrote:
> > > On Fri, Jun 10, 2016 at 09:55:17PM +0200, Tomasz Nowicki wrote:
> > > > To enable PCI legacy IRQs on platforms booting with ACPI, arch code
> > > > should include ACPI specific callbacks that parse and set-up the
> > > > device IRQ number, equivalent to the DT boot path. Owing to the current
> > > > ACPI core scan handlers implementation, ACPI PCI legacy IRQs bindings
> > > > cannot be parsed at device add time, since that would trigger ACPI scan
> > > > handlers ordering issues depending on how the ACPI tables are defined.
> > >
> > > Uh, OK :)  I can't figure out exactly what the problem is here -- I
> > > don't know where to look if I wanted to fix the scan handler ordering
> > > issues, and I don't know how I could tell if it would ever be safe to
> > > move this from driver probe-time back to device add-time.
> >
> > Right, the commit log could have been more informative.
> >
> > pcibios_add_device() was added in:
> >
> > commit d1e6dc91b532 ("arm64: Add architectural support for PCI")
> >
> > whose commit log does not specify why legacy IRQ parsing should
> > be done at pcibios_add_device() either, so honestly we had to
> > do with the information we have at hand.
> >
> > > I also notice that x86 and ia64 call acpi_pci_irq_enable() even later,
> > > when the driver *enables* the device.  Is there a reason you didn't do
> > > it at the same time as x86 and ia64?  This is another of those pcibios
> > > hooks that really don't do anything arch-specific, so I can imagine
> > > refactoring this somehow, someday.
> >
> > Yes, with [1], that was the goal, that stopped because [1] does not
> > work on x86.
> >
> > Only DT platform(s) affected by this change are all platforms relying on
> > drivers/pci/host/pci-xgene.c (others rely on pci_fixup_irqs() that
> > should be removed too), if on those platforms probing IRQs at device
> > enable time works ok I can update this patch (it can be done through [1]
> > once we figure out what to do with it on x86) and move the IRQ set-up at
> > pcibios_enable_device() time.
> >
> > @Duc: any feedback on this ?
> 
> Hi Lorenzo,
> 
> The changes to add pcibios_alloc_irq works fine on X-Gene PCIe
> 
> I also tried to remove pcibios_alloc_irq and move its code into
> pcibios_enable_device
> after pci_enable_resource call and legacy IRQ also works.

Thank you, that was important to check.

> Can you also point me to the discussion thread or some info. about the
> issue on x86 with [1]?
> I want to check if there is any more test case I need to verify.

http://marc.info/?l=linux-pci&m=145091187918297&w=2

I do not think there is much we can do, unless we convert pci_fixup_irqs
as per [1] but keep the x86 code as-is (for now). There is a way to do it:
it is just a matter of taking Matt's series and refactoring it. There are
potential problems, though (not on xgene, but on other ARM platforms
where the legacy IRQ routing is done through pci_fixup_irqs()).

Lorenzo

> Regards,
> Duc Dang.
> 
> >
> > Thanks,
> > Lorenzo
> >
> > [1] http://www.spinics.net/lists/linux-pci/msg45950.html
> >
> > > Did we have this conversation before?  It seems vaguely familiar, so I
> > > apologize if you already explained this once.
> > >
> > > > To solve this problem and consolidate FW PCI legacy IRQs parsing in
> > > > one single pcibios callback (pending final removal), this patch moves
> > > > DT PCI IRQ parsing to the pcibios_alloc_irq() callback (called by
> > > > PCI core code at device probe time) and adds ACPI PCI legacy IRQs
> > > > parsing to the same callback too, so that FW PCI legacy IRQs parsing
> > > > is confined in one single arch callback that can be easily removed
> > > > when code parsing PCI legacy IRQs is consolidated and moved to core
> > > > PCI code.
> > > >
> > > > Signed-off-by: Tomasz Nowicki 
> > > > Suggested-by: Lorenzo Pieralisi 
> > > > ---
> > > >  arch/arm64/kernel/pci.c | 11 ---
> > > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> > > > index d5d3d26..b3b8a2c 100644
> > > > --- a/arch/arm64/kernel/pci.c
> > > > +++ b/arch/arm64/kernel/pci.c
> > > > @@ -51,11 +51,16 @@ int pcibios_enable_device(struct pci_dev *dev, int 
> > > > mask)
> > > >  }
> > > >
> > > >  /*
> > > > - * Try to assign the IRQ number from DT when adding a new device
> > > > + * Try to assign the IRQ number when probing a new device
> > > >   */
> > > > -int pcibios_add_device(struct pci_dev *dev)
> > > > +int pcibios_alloc_irq(struct pci_dev *dev)
> > > >  {
> > > > -   dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
> > > > +   if (acpi_disabled)
> > > > +   dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
> > > > +#ifdef CONFIG_ACPI
> > > > +   else
> > > > +   return acpi_pci_irq_enable(dev);
> > > > +#end

Re: [PATCH v2 4/5]nbd: make nbd device wait for its users.

2016-06-14 Thread Pranay Srivastava
Hi Markus,

On Tue, Jun 14, 2016 at 2:29 PM, Markus Pargmann  wrote:
>
> On Thursday 02 June 2016 13:25:00 Pranay Kr. Srivastava wrote:
> > When a timeout occurs or a recv fails, then
> > instead of abruptly killing nbd block device
> > wait for its users to finish.
> >
> > This is more required when filesystem(s) like
> > ext2 or ext3 don't expect their buffer heads to
> > disappear while the filesystem is mounted.
> >
> > Each open of a nbd device is refcounted, while
> > the userland program [nbd-client] doing the
> > NBD_DO_IT ioctl would now wait for any other users
> > of this device before invalidating the nbd device.
> >
> > Signed-off-by: Pranay Kr. Srivastava 
> > ---
> >  drivers/block/nbd.c | 58 
> > +
> >  1 file changed, 58 insertions(+)
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index d1d898d..4da40dc 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -70,10 +70,13 @@ struct nbd_device {
> >  #if IS_ENABLED(CONFIG_DEBUG_FS)
> >   struct dentry *dbg_dir;
> >  #endif
> > + atomic_t inuse;
> >   /*
> >*This is specifically for calling sock_shutdown, for now.
> >*/
> >   struct work_struct ws_shutdown;
> > + struct kref users;
> > + struct completion user_completion;
> >  };
> >
> >  #if IS_ENABLED(CONFIG_DEBUG_FS)
> > @@ -104,6 +107,7 @@ static DEFINE_SPINLOCK(nbd_lock);
> >   * Shutdown function for nbd_dev work struct.
> >   */
> >  static void nbd_ws_func_shutdown(struct work_struct *);
> > +static void nbd_kref_release(struct kref *);
> >
> >  static inline struct device *nbd_to_dev(struct nbd_device *nbd)
> >  {
> > @@ -682,6 +686,8 @@ static void nbd_reset(struct nbd_device *nbd)
> >   nbd->flags = 0;
> >   nbd->xmit_timeout = 0;
> >   INIT_WORK(&nbd->ws_shutdown, nbd_ws_func_shutdown);
> > + init_completion(&nbd->user_completion);
> > + kref_init(&nbd->users);
> >   queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, nbd->disk->queue);
> >   del_timer_sync(&nbd->timeout_timer);
> >  }
> > @@ -815,6 +821,14 @@ static int __nbd_ioctl(struct block_device *bdev, 
> > struct nbd_device *nbd,
> >   kthread_stop(thread);
> >
> >   sock_shutdown(nbd);
> > + /*
> > +  * kref_init initializes with ref count as 1,
> > +  * nbd_client, or the user-land program executing
> > +  * this ioctl will make the refcount to 2[at least]
> > +  * so subtracting 2 from refcount.
> > +  */
> > + kref_sub(&nbd->users, 2, nbd_kref_release);
>
> Why don't you use a kref_put?

Ok, so I'll try to explain as I've understood the problem.

When the module is loaded the kref is initialized to 1.

Suppose now, someone has started nbd-client [nbdC-1] , then this
nbd-client will increase the ref count to 2. So far so good...

Now let's say this device is being shut down via nbd-client [nbdC-2].

nbdC-1 will drop the refcount by two. It has to do this in NBD_DO_IT,
since the device file will not be closed until after the ioctl is over,
and it will then wait_for_completion.

nbdC-2 now closes its use of the device file; this brings the refcount
to zero, and the completion is triggered, so nbdC-1 is completed.

Now we don't want to trigger kref_put when nbdC-1 closes the device
file, so kref_put needs to be conditional in this regard; that is what
inuse is used for.
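
[Annotation] The counting in the explanation above can be traced with a
small userspace model. This is only a sketch of one reading of that
explanation: struct users_sketch and its helpers are hypothetical, they
model neither the real kref API nor locking, and the assumption that
nbdC-2's open takes a reference of its own is inferred from the statement
that its close brings the count to zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the "users" refcount plus the completion flag. */
struct users_sketch {
	int refcount;
	bool completed;
};

/* Like kref_init() in the explanation: the count starts at 1. */
void users_init(struct users_sketch *u)
{
	u->refcount = 1;
	u->completed = false;
}

/* One more opener of the device file. */
void users_get(struct users_sketch *u)
{
	assert(u->refcount > 0);
	u->refcount++;
}

/*
 * Drop @n references at once. When the count reaches zero the waiter is
 * released, which stands in for complete() in the release callback.
 */
void users_sub(struct users_sketch *u, int n)
{
	assert(n > 0 && u->refcount >= n);
	u->refcount -= n;
	if (u->refcount == 0)
		u->completed = true;
}
```

Tracing the scenario: init gives 1, nbdC-1 opening gives 2, nbdC-2 opening
gives 3, the drop by two in NBD_DO_IT gives 1 with the waiter still blocked,
and nbdC-2 closing gives 0, which releases the waiter.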


>
> > + wait_for_completion(&nbd->user_completion);
> >   mutex_lock(&nbd->tx_lock);
> >   nbd_clear_que(nbd);
> >   kill_bdev(bdev);
> > @@ -865,13 +879,56 @@ static int nbd_ioctl(struct block_device *bdev, 
> > fmode_t mode,
> >
> >   return error;
> >  }
> > +static void nbd_kref_release(struct kref *kref_users)
> > +{
> > + struct nbd_device *nbd = container_of(kref_users, struct nbd_device,
> > + users);
>
> Not indented to opening bracket.
>
> > + pr_debug("Releasing kref [%s]\n", __func__);
> > + atomic_set(&nbd->inuse, 0);
> > + complete(&nbd->user_completion);
> > +
> > +}
> > +
> > +static int nbd_open(struct block_device *bdev, fmode_t mode)
> > +{
> > + struct nbd_device *nbd_dev = bdev->bd_disk->private_data;
> > +
> > + if (kref_get_unless_zero(&nbd_dev->users))
> > + atomic_set(&nbd_dev->inuse, 1);
> > +
> > + pr_debug("Opening nbd_dev %s. Active users = %u\n",
> > + bdev->bd_disk->disk_name,
> > + atomic_read(&nbd_dev->users.refcount) - 1);
>
> Indent to opening bracket.
>
> > + return 0;
> > +}
> > +
> > +static void nbd_release(struct gendisk *disk, fmode_t mode)
> > +{
> > + struct nbd_device *nbd_dev = disk->private_data;
> > + /*
> > + *kref_init initializes ref count to 1, so we
> > + *we check for refcount to be 2 for a final put.
> > + *
> > + *kref needs to be re-initialized just here as the
> > + *other process holding it must 

Re: [PATCH 2/4] mtd: nand: implement two pairing scheme

2016-06-14 Thread Boris Brezillon
On 14 Jun 2016 05:07:26 -0400
"George Spelvin"  wrote:

> Boris Brezillon wrote:
> > On 12 Jun 2016 16:24:53 George Spelvin wrote:  
> >> Boris Brezillon wrote:
> >> My problem is that I don't really understand MLC programming.  
> 
> > I came to the same conclusion: we really have these 2 cases in the
> > wild, which makes it even more complicated to define a standard
> > behavior.  
> 
> I did find a useful study of the issue: "Program Interference in MLC NAND
> Flash Memory: Characterization, Modeling, and Mitigation"
> 
> https://users.ece.cmu.edu/~omutlu/pub/flash-programming-interference_iccd13.pdf
> 
> It describes the write-disturb-precompensation technique, and also
> shows how the two-stage programming works.  (Although the fact that the
> "least significant bit" is the *largest* voltage difference and is shown
> on the *left* makes no sense at all.)
> 
> Looking at the demonstrated programming sequence, it looks like
> it should be possible to probe for the bit assignment.  If you have
> a half-programmed page, then any bits programmed to "0" are actually
> sitting close to the threshold between the two middle voltage levels.
> 
> So you'll get a lot of errors reading them as "1", but the interesting
> part is the read-back of the unprogrammed bit.
> 
> If the chip is using the binary sequence, you'll read either 10 or 01.
> If the chip is using the Gray-code sequence, you'll read 10 or 00.
> 
> Basically, you read both pages and see which bit combination never
> appears.  That is the combination that corresponds to the highest voltage
> level.
> 
> Another interesting paper is "Read Disturb Errors in MLC NAND Flash
> Memory: Characterization, Mitigation, and Recovery"
> https://users.ece.cmu.edu/~omutlu/pub/flash-read-disturb-errors_dsn15.pdf
> 
> That talks about tricks that do as you observe: increase read error to start.
> (In order to decrease read disturb, and thus read errors later.)

Thanks a lot for sharing your thoughts along with all these references,
that's really useful. I'll carefully read all of them.

> 
> >> It's more considering it to have 16K pages that can be accessed in 
> >> half-pages.  
> 
> > Yes, I know, but it's not really easy to fake that at the NAND level,
> > because programming 2 pages still requires 2 page program operation.
> > The MTD user could detect that the pairing scheme always exposes 2
> > consecutive non-paired pages, but as you've seen, this condition does
> > not necessarily imply the 'pair coupling' constraint, and we don't want
> > to increase the min_io_size value if it's not really necessary.  
> 
> Ideally, it would be nice to separate the "SLC hack" from the "later
> write failures can corrupt earlier data" workaround.
> 
> First, you get the latter working on SLC flash.

When you say SLC flash, you're talking about MLC NANDs operating in SLC
mode, right?

> Then you add MLC, and
> make MLC another reason why it can happen.
> 
> But I'm not certain this is actually necessary.  Could listing 4 pages
> rather than 2 as in other data sheets just be an editing or translation
> error?  Maybe someone got confused about "in the same row" when they
> wrote that clarifying example.

Yes, that's what I supposed. I'll try to test that on a real device.

> 
> > I'm just realizing this is actually a non-issue for the solution we
> > developed with Ricard.  As I said, it's unsafe to partially write a
> > block in MLC mode, so the only sane way is either to write a block in
> > SLC mode, or atomically write a block in MLC mode, and that's what
> > we're doing with our 'UBI LEB consolidation' approach.  I'm pretty sure
> > the problem described in the Hynix datasheet does not happen when only
> > writing in SLC mode.  So, even if the pairing scheme does not account
> > for this extra 'coupling' constraint, we should be safe.  
> 
> I can't see any reason why it would affect MLC and not SLC.

That's something we'll have to check on a real NAND exposing this
constraint (I'll try to find a board with one of these NANDs), but if
that's really the case, and programming page 1 can really spoil page 0
even if they're not sharing the same cells, then that's a big problem.

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: [PATCH v2 2/2] memory: tegra: tegra124-emc: Add missing of_node_put

2016-06-14 Thread Thierry Reding
On Mon, Jan 25, 2016 at 10:57:48PM +0530, Amitoj Kaur Chawla wrote:
> for_each_child_of_node performs an of_node_get on each iteration, so
> to break out of the loop an of_node_put is required.
> 
> Found using Coccinelle. The semantic patch used for this is as follows:
> 
> // <smpl>
> @@
> expression e;
> local idexpression n;
> @@
> 
>  for_each_child_of_node(..., n) {
>... when != of_node_put(n)
>when != e = n
> (
>return n;
> |
> +  of_node_put(n);
> ?  return ...;
> )
>...
>  }
> // </smpl>
> Signed-off-by: Amitoj Kaur Chawla 
> ---
> Changes in v2:
> -Modified the note underneath ---
> 
> There is an extra of_node_put() before a continue in the same file on
> line 1001 which should be deleted too. Julia Lawall has already sent a patch 
> to delete this but if preferred I can send one patch to do both the changes.
> 
>  drivers/memory/tegra/tegra124-emc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Applied, thanks.

Thierry
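
[Annotation] The reference-count rule in the quoted commit message can be
shown with a userspace mock. Everything here is hypothetical: struct
node_sketch and the two functions merely imitate an iterator that takes a
reference on the child it returns and drops the reference on the previous
one, which is the behaviour the commit message attributes to
for_each_child_of_node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node; not the kernel's struct device_node. */
struct node_sketch {
	int refcount;
	int value;
};

/*
 * Mock iterator step: drop the reference held on @prev (if any), then take
 * a reference on the next child and return it. Returns NULL past the end.
 */
struct node_sketch *next_child_sketch(struct node_sketch *children, size_t n,
				      struct node_sketch *prev)
{
	size_t idx = prev ? (size_t)(prev - children) + 1 : 0;

	if (prev)
		prev->refcount--;
	if (idx >= n)
		return NULL;

	children[idx].refcount++;
	return &children[idx];
}

/* Returns 1 if a child with @wanted exists. Leaves every refcount balanced. */
int find_value_sketch(struct node_sketch *children, size_t n, int wanted)
{
	struct node_sketch *child;

	for (child = next_child_sketch(children, n, NULL); child;
	     child = next_child_sketch(children, n, child)) {
		if (child->value == wanted) {
			/*
			 * Leaving the loop early: the iterator will not run
			 * again, so the reference it took must be dropped here.
			 */
			child->refcount--;
			return 1;
		}
	}

	/* Normal termination: the iterator already dropped the last reference. */
	return 0;
}
```

Without the decrement before the early return, the matching child would
keep a count of 1 forever, which is the leak the patch fixes. The opposite
mistake, dropping the reference before a continue, would make the iterator
drop it a second time on the next step.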




Re: [PATCH 1/2] memory: tegra: delete unneeded of_node_put

2016-06-14 Thread Thierry Reding
On Fri, Oct 09, 2015 at 07:47:40PM +0200, Julia Lawall wrote:
> for_each_child_of_node performs an of_node_put on each iteration, so
> putting an of_node_put before a continue results in a double put.
> 
> The semantic match that finds this problem is as follows
> (http://coccinelle.lip6.fr):
> 
> // <smpl>
> @@
> expression root,e;
> local idexpression child;
> iterator name for_each_child_of_node;
> @@
> 
>  for_each_child_of_node(root, child) {
>... when != of_node_get(child)
> *  of_node_put(child);
>...
> *  continue;
> }
> // </smpl>
> 
> Signed-off-by: Julia Lawall 
> 
> ---
>  drivers/memory/tegra/mc.c   |4 +---
>  drivers/memory/tegra/tegra124-emc.c |4 +---
>  2 files changed, 2 insertions(+), 6 deletions(-)

Applied, thanks.

Thierry



