The other interesting question is which algorithm is used to find the 
pattern in each line.
In general, bytes.Contains can fall back to Rabin-Karp. But as the pattern 
here is the word "test", which is only 4 bytes long,
a brute-force search is used, with SIMD (SSE/AVX) instructions where 
available. So the naive Go approach gives you very fast execution. The main 
things are to set up your scanner with a large buffer, to minimize the 
number of file system reads, and to avoid the newbie error of working with 
strings rather than []byte, which forces the code to do vast numbers of 
unnecessary and expensive allocations.

On Saturday, 7 May 2022 at 22:53:54 UTC+1 Amnon wrote:

> p.s. If you changed the above code to use strings rather than []byte 
> it would run many times slower due to the cost of allocation.
>
> On Saturday, 7 May 2022 at 22:49:08 UTC+1 Amnon wrote:
>
>> How about something like 
>>
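>> // grep writes every line of r containing pat to w.
>> // Needs imports: bufio, bytes, io.
>> func grep(pat []byte, r io.Reader, w io.Writer) error {
>>     scanner := bufio.NewScanner(r)
>>     for scanner.Scan() {
>>         // scanner.Bytes() has the line terminator stripped.
>>         if bytes.Contains(scanner.Bytes(), pat) {
>>             if _, err := w.Write(scanner.Bytes()); err != nil {
>>                 return err
>>             }
>>             if _, err := w.Write([]byte{'\n'}); err != nil {
>>                 return err
>>             }
>>         }
>>     }
>>
>>     return scanner.Err()
>> }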
>>
>> and for extra speed, just allocate a bigger buffer to the scanner...
>>
>> On Saturday, 7 May 2022 at 21:46:33 UTC+1 Jan Mercl wrote:
>>
>>> On Sat, May 7, 2022 at 10:24 PM Constantine Vassilev <ths...@gmail.com> 
>>> wrote: 
>>>
>>> > I need to write a program that reads STDIN and should output every 
>>> > line that contains a search word "test" to STDOUT. 
>>>
>>> Piping the data through grep(1) would be my first option. 
>>>
>>
