On Fri, May 08, 2015 at 11:02:28PM -0400, Rik van Riel wrote:
> On 05/08/2015 09:14 PM, Linus Torvalds wrote:
> > On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote:
> >>
> >> However, for persistent memory, all of the files will be "in memory".
> >
> > Yes. However, I doubt you will find a very
On Fri, May 8, 2015 at 8:02 PM, Rik van Riel wrote:
>
> The TLB performance bonus of accessing the large files with
> large pages may make it worthwhile to solve that hard problem.
Very few people can actually measure that TLB advantage on systems
with good TLBs.
It's largely a myth, fed by som
On 05/08/2015 09:14 PM, Linus Torvalds wrote:
> On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote:
>>
>> However, for persistent memory, all of the files will be "in memory".
>
> Yes. However, I doubt you will find a very sane rw filesystem that
> then also makes them contiguous and aligns them
On Fri, May 8, 2015 at 9:59 AM, Rik van Riel wrote:
>
> However, for persistent memory, all of the files will be "in memory".
Yes. However, I doubt you will find a very sane rw filesystem that
then also makes them contiguous and aligns them at 2MB boundaries.
Anything is possible, I guess, but t
> "Linus" == Linus Torvalds writes:
Linus> On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote:
>>
>> Now go and look at your /home or /data/ or /work areas, where the
>> endusers are actually keeping their day to day work. Photos, mp3,
>> design files, source code, object code littered aro
On 05/08/2015 11:54 AM, Linus Torvalds wrote:
> On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote:
>>
>> Now go and look at your /home or /data/ or /work areas, where the
>> endusers are actually keeping their day to day work. Photos, mp3,
>> design files, source code, object code littered aroun
On Fri, May 08, 2015 at 08:54:06AM -0700, Linus Torvalds wrote:
> However, the big files in that list are almost immaterial from a
> caching standpoint.
.git/objects/pack/* caching matters a lot, though...
On Fri, May 8, 2015 at 7:40 AM, John Stoffel wrote:
>
> Now go and look at your /home or /data/ or /work areas, where the
> endusers are actually keeping their day to day work. Photos, mp3,
> design files, source code, object code littered around, etc.
However, the big files in that list are alm
On 05/08/2015 10:05 AM, Ingo Molnar wrote:
> * Rik van Riel wrote:
>> Memory trends point in one direction, file size trends in another.
>>
>> For persistent memory, we would not need 4kB page struct pages
>> unless memory from a particular area was in small files AND those
>> files were being
> "Ingo" == Ingo Molnar writes:
Ingo> * Rik van Riel wrote:
>> The disadvantage is pretty obvious too: 4kB pages would no longer be
>> the fast case, with an indirection. I do not know how much of an
>> issue that would be, or whether it even makes sense for 4kB pages to
>> continue bein
* Rik van Riel wrote:
> The disadvantage is pretty obvious too: 4kB pages would no longer be
> the fast case, with an indirection. I do not know how much of an
> issue that would be, or whether it even makes sense for 4kB pages to
> continue being the fast case going forward.
I strongly disa
On 05/07/2015 03:11 PM, Ingo Molnar wrote:
> Stable, global page-struct descriptors are a given for real RAM, where
> we allocate a struct page for every page in nice, large, mostly linear
> arrays.
>
> We'd really need that for pmem too, to get the full power of struct
> page: and that means
On Fri, May 08, 2015 at 11:26:01AM +0200, Ingo Molnar wrote:
>
> * Al Viro wrote:
>
> > On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote:
> >
> > > So if code does iov_iter_get_pages_alloc() on a user address that
> > > has a real struct page behind it - and some other code does a
* Al Viro wrote:
> On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote:
>
> > So if code does iov_iter_get_pages_alloc() on a user address that
> > has a real struct page behind it - and some other code does a
> > regular get_user_pages() on it, we'll have two sets of struct page
> >
On Fri, May 08, 2015 at 07:37:59AM +0200, Ingo Molnar wrote:
> same as iov_iter_get_pages(), except that pages array is allocated
> (kmalloc if possible, vmalloc if that fails) and left for caller to
> free. Lustre and NFS ->direct_IO() switched to it.
>
> Signed-off-by: Al V
Al,
I was wondering about the struct page rules of
iov_iter_get_pages_alloc(), used in various places. There's no
documentation whatsoever in lib/iov_iter.c, nor in
include/linux/uio.h, and the changelog that introduced it only says:
commit 91f79c43d1b54d7154b118860d81b39bad07dfff
Author: A
On Thu, May 07, 2015 at 09:53:13PM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar wrote:
>
> > > Is handling kernel pagefault on the vmemmap completely out of the
> > > picture? So we would carve out a chunk of kernel address space
> > > for those pfn and use it for vmemmap and handle pagefault
On Thu, May 7, 2015 at 10:43 AM, Linus Torvalds
wrote:
> On Thu, May 7, 2015 at 9:03 AM, Dan Williams wrote:
>>
>> Ok, I'll keep thinking about this and come back when we have a better
>> story about passing mmap'd persistent memory around in userspace.
>
> Ok. And if we do decide to go with your
* Ingo Molnar wrote:
> > Is handling kernel pagefault on the vmemmap completely out of the
> > picture? So we would carve out a chunk of kernel address space
> > for those pfn and use it for vmemmap and handle pagefault on it.
>
> That's pretty clever. The page fault doesn't even have to do
* Jerome Glisse wrote:
> > So I think the main value of struct page is if everyone on the
> > system sees the same struct page for the same pfn - not just the
> > temporary IO instance.
> >
> > The idea of having very temporary struct page arrays misses the
> > point I think: if struct page
On Thu, May 7, 2015 at 11:40 AM, Ingo Molnar wrote:
>
> * Dan Williams wrote:
>
>> On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote:
>> > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
>> >> What is the primary thing that is driving this need? Do we have a very
>> >> conc
On Thu, May 07, 2015 at 09:11:07PM +0200, Ingo Molnar wrote:
>
> * Dave Hansen wrote:
>
> > On 05/07/2015 10:42 AM, Dan Williams wrote:
> > > On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote:
> > >> * Dan Williams wrote:
> > >>
> > >> So is there anything fundamentally wrong about creating s
* Dave Hansen wrote:
> On 05/07/2015 10:42 AM, Dan Williams wrote:
> > On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote:
> >> * Dan Williams wrote:
> >>
> >> So is there anything fundamentally wrong about creating struct
> >> page backing at mmap() time (and making sure aliased mmaps share
* Dan Williams wrote:
> On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote:
> > On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
> >> What is the primary thing that is driving this need? Do we have a very
> >> concrete example?
> >
> > FYI, I plan to implement RAID accele
On 05/07/2015 10:42 AM, Dan Williams wrote:
> On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote:
>> * Dan Williams wrote:
>> So is there anything fundamentally wrong about creating struct page
>> backing at mmap() time (and making sure aliased mmaps share struct
>> page arrays)?
>
> Something l
* Dan Williams wrote:
> > That looks like a layering violation and a mistake to me. If we
> > want to do direct (sector_t -> sector_t) IO, with no serialization
> > worries, it should have its own (simple) API - which things like
> > hierarchical RAID or RDMA APIs could use.
>
> I'm wrapped
On Thu, May 7, 2015 at 9:03 AM, Dan Williams wrote:
>
> Ok, I'll keep thinking about this and come back when we have a better
> story about passing mmap'd persistent memory around in userspace.
Ok. And if we do decide to go with your kind of "__pfn" type, I'd
probably prefer that we encode the ty
On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote:
>
> * Dan Williams wrote:
>
>> > Anyway, I did want to say that while I may not be convinced about
>> > the approach, I think the patches themselves don't look horrible.
>> > I actually like your "__pfn_t". So while I (very obviously) have
>> >
* Dan Williams wrote:
> > Anyway, I did want to say that while I may not be convinced about
> > the approach, I think the patches themselves don't look horrible.
> > I actually like your "__pfn_t". So while I (very obviously) have
> > some doubts about this approach, it may be that the most
On Thu, May 07, 2015 at 06:18:07PM +0200, Christoph Hellwig wrote:
> On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
> > What is the primary thing that is driving this need? Do we have a very
> > concrete example?
>
> FYI, I plan to implement RAID acceleration using nvdimms, and
On Thu, May 7, 2015 at 9:18 AM, Christoph Hellwig wrote:
> On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
>> What is the primary thing that is driving this need? Do we have a very
>> concrete example?
>
> FYI, I plan to implement RAID acceleration using nvdimms, and I plan to
>
On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
> What is the primary thing that is driving this need? Do we have a very
> concrete example?
FYI, I plan to implement RAID acceleration using nvdimms, and I plan to
use pages for that. The code just merged for 4.1 can easily support
On Thu, May 7, 2015 at 8:58 AM, Linus Torvalds
wrote:
> On Thu, May 7, 2015 at 8:40 AM, Dan Williams wrote:
>>
>> blkdev_get(FMODE_EXCL) is the protection in this case.
>
> Ugh. That looks like a horrible nasty big hammer that will bite us
> badly some day. Since you'd have to hold it for the who
On Thu, May 7, 2015 at 8:40 AM, Dan Williams wrote:
>
> blkdev_get(FMODE_EXCL) is the protection in this case.
Ugh. That looks like a horrible nasty big hammer that will bite us
badly some day. Since you'd have to hold it for the whole IO. But I
guess it at least works.
Anyway, I did want to say
On Thu, May 7, 2015 at 7:42 AM, Ingo Molnar wrote:
>
> * Ingo Molnar wrote:
>
>> [...]
>>
>> For anything more complex, that maps any of this storage to
>> user-space, or exposes it to higher level struct page based APIs,
>> etc., where references matter and it's more of a cache with
>> potential
On Thu, May 7, 2015 at 8:00 AM, Linus Torvalds
wrote:
> On Wed, May 6, 2015 at 7:36 PM, Dan Williams wrote:
>>
>> My pet concrete example is covered by __pfn_t. Referencing persistent
>> memory in an md/dm hierarchical storage configuration. Setting aside
>> the thrash to get existing block use
On Wed, May 6, 2015 at 7:36 PM, Dan Williams wrote:
>
> My pet concrete example is covered by __pfn_t. Referencing persistent
> memory in an md/dm hierarchical storage configuration. Setting aside
> the thrash to get existing block users to do "bvec_set_page(page)"
> instead of "bvec->page = pag
* Ingo Molnar wrote:
> [...]
>
> For anything more complex, that maps any of this storage to
> user-space, or exposes it to higher level struct page based APIs,
> etc., where references matter and it's more of a cache with
> potentially multiple users, not an IO space, the natural API is
> s
* Dan Williams wrote:
> > What is the primary thing that is driving this need? Do we have a
> > very concrete example?
>
> My pet concrete example is covered by __pfn_t. Referencing
> persistent memory in an md/dm hierarchical storage configuration.
> Setting aside the thrash to get existi
On Wed, May 6, 2015 at 5:19 PM, Linus Torvalds
wrote:
> On Wed, May 6, 2015 at 4:47 PM, Dan Williams wrote:
>>
>> Conceptually better, but certainly more difficult to audit if the fake
>> struct page is initialized in a subtle way that breaks when/if it
>> leaks to some unwitting context.
>
> May
On Wed, May 6, 2015 at 4:47 PM, Dan Williams wrote:
>
> Conceptually better, but certainly more difficult to audit if the fake
> struct page is initialized in a subtle way that breaks when/if it
> leaks to some unwitting context.
Maybe. It could go either way, though. In particular, with the
"dyn
On Wed, May 6, 2015 at 3:10 PM, Linus Torvalds
wrote:
> On Wed, May 6, 2015 at 1:04 PM, Dan Williams wrote:
>>
>> The motivation for this change is persistent memory and the desire to
>> use it not only via the pmem driver, but also as a memory target for I/O
>> (DAX, O_DIRECT, DMA, RDMA, etc) in
On Wed, May 6, 2015 at 1:04 PM, Dan Williams wrote:
>
> The motivation for this change is persistent memory and the desire to
> use it not only via the pmem driver, but also as a memory target for I/O
> (DAX, O_DIRECT, DMA, RDMA, etc) in other parts of the kernel.
I detest this approach.
I'd muc
On Wed, May 06, 2015 at 04:04:53PM -0400, Dan Williams wrote:
> Changes since v1 [1]:
>
> 1/ added include/asm-generic/pfn.h for the __pfn_t definition and helpers.
>
> 2/ added kmap_atomic_pfn_t()
>
> 3/ rebased on v4.1-rc2
>
> [1]: http://marc.info/?l=linux-kernel&m=142653770511970&w=2
>
> -
Changes since v1 [1]:
1/ added include/asm-generic/pfn.h for the __pfn_t definition and helpers.
2/ added kmap_atomic_pfn_t()
3/ rebased on v4.1-rc2
[1]: http://marc.info/?l=linux-kernel&m=142653770511970&w=2
---
A lead-in note: this looks scarier than it is. Most of the code thrash
is autom
45 matches