The other interesting question is which algorithm is used to find the pattern in each line. For longer patterns, bytes.Contains can fall back to Rabin-Karp. But since the pattern here is the word "test", which is only 4 bytes long, a brute-force search is used instead, with SSE-type instructions where available. So the naive Go approach already gives you very fast execution. The main things are to set up your scanner with a large buffer, to minimize the number of file system reads, and to avoid the newbie error of working with strings rather than []byte, which forces the code to do vast numbers of unnecessary and expensive allocations.
On Saturday, 7 May 2022 at 22:53:54 UTC+1 Amnon wrote:

> p.s. If you changed the above code to use strings rather than []byte
> it would run many times slower due to the cost of allocation.
>
> On Saturday, 7 May 2022 at 22:49:08 UTC+1 Amnon wrote:
>
>> How about something like
>>
>> func grep(pat []byte, r io.Reader, w io.Writer) error {
>>     scanner := bufio.NewScanner(r)
>>     for scanner.Scan() {
>>         if (bytes.Contains(scanner.Bytes(), pat)) {
>>             w.Write(scanner.Bytes())
>>         }
>>     }
>>
>>     return scanner.Err()
>> }
>>
>> and for extra speed, just allocate a bigger buffer to the scanner...
>>
>> On Saturday, 7 May 2022 at 21:46:33 UTC+1 Jan Mercl wrote:
>>
>>> On Sat, May 7, 2022 at 10:24 PM Constantine Vassilev <ths...@gmail.com>
>>> wrote:
>>>
>>> > I need to write a program that reads STDIN and should output every
>>> > line that contains a search word "test" to STDOUT.
>>>
>>> Piping the data through grep(1) would be my first option.