On Mon, Jun 4, 2012 at 7:02 PM, Anton Blanchard <[email protected]> wrote:
>
> I blame Mikey for this. He elevated my slightly dubious testcase:
>
> # dd if=/dev/zero of=/dev/null bs=1M count=10000
>
> to benchmark status. And naturally we need to be number 1 at creating
> zeros. So lets improve __clear_user some more.
>
> As Paul suggests we can use dcbz for large lengths. This patch gets
> the destination cacheline aligned then uses dcbz on whole cachelines.
>
> Before:
> 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s
>
> After:
> 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s
>
> 39 GB/s, a new record.
>
> Signed-off-by: Anton Blanchard <[email protected]>

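For anyone following along, the approach described in the quoted text (get the destination cacheline aligned, then clear whole cachelines) can be sketched roughly in portable C. This is only an illustration and not the patch itself: clear_sketch and zero_cacheline are hypothetical names, memset stands in for the dcbz instruction, and the 128-byte line size is an assumption that real code would have to take from the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 128-byte cachelines. Real code must query the CPU. */
#define CACHELINE_BYTES 128

/* Stand-in for a single dcbz: zero one whole, aligned cacheline. */
static void zero_cacheline(unsigned char *line)
{
    /* The line-at-a-time path is only valid on an aligned address. */
    assert(((uintptr_t)line & (CACHELINE_BYTES - 1)) == 0);
    memset(line, 0, CACHELINE_BYTES);
}

/* Hypothetical helper, not the kernel's __clear_user. */
void clear_sketch(void *dst, size_t len)
{
    unsigned char *p = dst;

    /* Bytes needed to reach the next cacheline boundary (0 if aligned). */
    size_t head = (size_t)(-(uintptr_t)p & (CACHELINE_BYTES - 1));
    if (head > len)
        head = len;

    /* 1. Clear the unaligned head with ordinary stores. */
    memset(p, 0, head);
    p += head;
    len -= head;

    /* 2. Clear whole cachelines, one "dcbz" per line. */
    while (len >= CACHELINE_BYTES) {
        zero_cacheline(p);
        p += CACHELINE_BYTES;
        len -= CACHELINE_BYTES;
    }

    /* 3. Clear whatever is left after the last whole line. */
    memset(p, 0, len);
}
```

The head and tail are handled with plain stores because the line-clearing step only makes sense on whole, aligned cachelines; short requests never reach the loop at all.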
Besides the comments from Segher, feel free to add:

Tested-by: Olof Johansson <[email protected]>
Acked-by: Olof Johansson <[email protected]>

Didn't help performance all that much on pa6t, but it didn't go down.
Too low on cycles to actually analyze why at this time.

-Olof
