> On 9 May 2022, at 17:41, r...@zedat.fu-berlin.de wrote:
>
> Barry Scott <ba...@barrys-emacs.org> writes:
>> Why use tiny chunks? You can read 4KiB as fast as 100 bytes
>
> When optimizing code, it helps to be aware of the orders of
> magnitude.
That is true and well known to me; now show how what I said is wrong.

The OS is going to DMA at least 4 KiB, and with read-ahead more like 64 KiB. So I can get
that into Python's memory on the same time scale as 1 byte, because it is the setup of the
I/O that is expensive, not the bytes transferred. (A rough timing sketch is appended below
the quoted text.)

Barry

> Code that is more cache-friendly is faster, that is,
> code that holds data in a single region of memory and that uses
> regular patterns of access. Chandler Carruth talked about this,
> and I made some notes when watching the video of his talk:
>
>   CPUS HAVE A HIERARCHICAL CACHE SYSTEM
>   (from a 2014 talk by Chandler Carruth)
>
>   One cycle on a 3 GHz processor                1 ns
>   L1 cache reference                          0.5 ns
>   Branch mispredict                             5 ns
>   L2 cache reference                            7 ns                14x L1 cache
>   Mutex lock/unlock                            25 ns
>   Main memory reference                       100 ns                20x L2, 200x L1
>   Compress 1K bytes with Snappy             3,000 ns
>   Send 1K bytes over 1 Gbps network        10,000 ns     0.01 ms
>   Read 4K randomly from SSD               150,000 ns     0.15 ms
>   Read 1 MB sequentially from memory      250,000 ns     0.25 ms
>   Round trip within same datacenter       500,000 ns      0.5 ms
>   Read 1 MB sequentially from SSD       1,000,000 ns        1 ms    4x memory
>   Disk seek                            10,000,000 ns       10 ms    20x datacenter RT
>   Read 1 MB sequentially from disk     20,000,000 ns       20 ms    80x memory, 20x SSD
>   Send packet CA->Netherlands->CA     150,000,000 ns      150 ms
>
> Remember how recently people here talked about how you cannot
> copy text from a video? Then, how did I do it? Turns out, for my
> operating system, there's a screen OCR program! So I did this OCR
> and then manually corrected a few wrong characters, and was done!

--
https://mail.python.org/mailman/listinfo/python-list
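To put rough numbers on the per-call cost, here is a minimal, self-contained sketch (my own
illustration, not code from the thread; the 16 MiB scratch file and the 100-byte / 4 KiB /
64 KiB chunk sizes are arbitrary choices). Because the file has just been written it will
normally still sit in the OS page cache, so the timings mostly reflect the per-read() syscall
and Python overhead rather than actual disk transfers; that is exactly the setup cost being
discussed, and the total time tracks the number of calls, not the number of bytes.

    import os
    import tempfile
    import time

    def time_reads(path, chunk_size):
        """Read the whole file in chunk_size pieces; return (seconds, read calls)."""
        calls = 0
        start = time.perf_counter()
        # buffering=0: each read() goes straight to the OS, no Python-level buffer
        with open(path, "rb", buffering=0) as f:
            while f.read(chunk_size):
                calls += 1
        return time.perf_counter() - start, calls

    def main():
        # 16 MiB of zeros is plenty to show the per-call overhead.
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(b"\0" * (16 * 1024 * 1024))
            path = tmp.name
        try:
            for chunk in (100, 4096, 65536):
                elapsed, calls = time_reads(path, chunk)
                print(f"chunk={chunk:6d}  calls={calls:7d}  total={elapsed:.3f}s")
        finally:
            os.unlink(path)

    if __name__ == "__main__":
        main()

Reading in 100-byte chunks needs roughly 170,000 read() calls for this file versus about
4,000 at 4 KiB, which is where the difference in total time comes from.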
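On the quoted point about cache-friendly access patterns, here is a small pure-Python
illustration (again mine, not from the talk or the thread). It sums the same contiguous
array twice, once walking it in order and once through a shuffled index list; the work is
identical, only the access pattern differs. Interpreter overhead blunts the effect compared
with compiled code, but the regular walk is still noticeably faster on typical hardware.

    import array
    import random
    import time

    N = 2_000_000
    data = array.array("q", range(N))   # ~16 MB of contiguous 64-bit ints
    seq = list(range(N))                # indices in ascending order
    rnd = seq[:]
    random.shuffle(rnd)                 # same indices, scattered order

    def timed(label, indices):
        # Sum data[i] for every index; only the order of access changes.
        start = time.perf_counter()
        total = 0
        for i in indices:
            total += data[i]
        print(f"{label:10s} {time.perf_counter() - start:.3f}s  sum={total}")

    timed("sequential", seq)            # regular walk through one memory region
    timed("shuffled", rnd)              # same work, irregular access pattern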