karl3@writeme.com wrote:
> karl3@writeme.com wrote:
> > ( >( >(
> > (Pdb) p input.shape
> > torch.Size([1, 6, 16384])
> > (Pdb) p weight[0:16].T.shape
> > torch.Size([16384, 16])
> > input @ weight
> > rows @ cols
> > so one row of input is [0,0,:]
> > then one col of weight.T is [:,0]
> > these are dotted.
> > now, weight.T is dense on the first dimension. so ideally we'd make stripes 
> > across the other dimension. uhhhhhhhhhhhhhhhhhhhhhhhhh
> > [_,_] @ [_,_]
> > [_,_] @ [_,_]
> > rows x cols
> > rows of left times cols of right
> > so the output is just a concatenation differently on each side. the rows of 
> > the left can be treated independently. the cols of the right can be treated 
> > independently.
> > [_,_] @ [_,_]
> > [_,_] @ [_,_]
> > rows x cols
> > the dense portion of the second operand is the dimension that is broadcast. 
> > so, since it's rows @ cols, the dense portion of the second operand would 
> > be the columns. the columns are dense.
> > well that's frustrating. this would work better if they were stored 
> > sideways. if i get all of one col and none of the next it doesn't really 
> > help me. it's not summed inside the dot product. this may still be an 
> > appropriate thing to do, but it would involve math elsewise in the operator 
> > graph rather than right here
> > looks like http does actually support sparse requests: 
> > https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Range_requests#mult...
> i wonder what a server would do if i sent a huge sparsity document for a 
> range header. there must be some kind of maximum (there might not be) ... 
> likely other issues too
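
a small sketch of the rows @ cols reasoning quoted above, mostly to check it. numpy stands in for torch here, and the shapes, the names x and w_t, and the split point are made up for illustration (small integers so the comparisons are exact). it shows that rows of the left operand and cols of the right operand can each be handled independently, while a slice along the shared inner dimension only gives a partial product that has to be summed with the rest somewhere else.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(-3, 4, size=(6, 8))    # stand-in for input[0]: (rows, inner)
w_t = rng.integers(-3, 4, size=(8, 4))  # stand-in for weight.T: (inner, cols)

full = x @ w_t                          # reference result, shape (6, 4)

# cols of the right operand are independent:
# fetching only some cols gives exactly those output cols
cols = [1, 3]
assert np.array_equal(x @ w_t[:, cols], full[:, cols])

# rows of the left operand are independent in the same way
rows = [0, 5]
assert np.array_equal(x[rows] @ w_t, full[rows])

# a slice along the inner (summed) dimension is only a partial product;
# the pieces must be added together outside the individual matmuls
k_split = 3
partial_a = x[:, :k_split] @ w_t[:k_split]
partial_b = x[:, k_split:] @ w_t[k_split:]
assert np.array_equal(partial_a + partial_b, full)
```

so a fetch that returns part of one dense col is still usable, but only as one term of a sum, which matches the "math elsewise in the operator graph" note.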

i think the underlying store operates in sizes of mmap pages.

so ideally that would backpropagate somehow to the slicing operator and i could 
also use the surrounding data that is in the same page. maybe a separate 
function to fetch the smallest encompassing slice--[someday
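
a rough sketch of what such a helper might look like, combined with the multi-range header idea from earlier in the thread. page_aligned_ranges and range_header are hypothetical names, not part of any existing library, and the 4096-byte default is an assumption (mmap.PAGESIZE from the standard library could be passed instead). spans are (start, stop) byte offsets with stop exclusive.

```python
def page_aligned_ranges(spans, length, page_size=4096):
    """Widen each (start, stop) span to page boundaries and merge the results.

    stop is exclusive; the output is clamped to the object's total length.
    """
    merged = []
    for start, stop in sorted(spans):
        lo = (start // page_size) * page_size               # round down to a page
        hi = min(-(-stop // page_size) * page_size, length)  # round up, then clamp
        if merged and lo <= merged[-1][1]:
            # touches or overlaps the previous span: extend it
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [tuple(span) for span in merged]


def range_header(spans):
    """Format spans as an HTTP Range header value (both ends inclusive)."""
    return "bytes=" + ", ".join(f"{lo}-{hi - 1}" for lo, hi in spans)


wanted = [(100, 200), (5000, 5010), (40000, 40010)]
aligned = page_aligned_ranges(wanted, length=100000)
print(aligned)                # [(0, 8192), (36864, 40960)]
print(range_header(aligned))  # bytes=0-8191, 36864-40959
```

merging neighbouring pages also keeps the header shorter, which may matter if servers do cap the number of ranges they accept.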
