Hello Paul

Do you mean that by undefining $/ and reading with <$fh>, we can read the whole file into memory at once? Yes, that would be faster because we don't need to read the file line by line, which increases disk I/O.
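
To check my understanding, here is a minimal sketch of what I think you mean. The temp file and its contents are made up just so the snippet runs on its own, and I use a plain '<' open here rather than your layer:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\nalpha\n";
close $tmp or die $!;

# Slurp: with $/ undefined, a single <$fh> returns the whole file.
my $content = do {
    local $/;    # undefined only inside this block
    open my $fh, '<', $path or die "open $path: $!";
    <$fh>;
};

# Split the in-memory string into lines afterwards.
my @lines = split /\n/, $content;
print scalar(@lines), " lines read in one go\n";
```

Using "local $/" instead of "$/ = undef" keeps the change from leaking into other code that still expects line-by-line reads.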

Two more questions:
1. What is the "truss" command?
2. What does the "<:mmap" syntax mean?

Thank you.


On 15.01.2022 15:45, Paul Procacci wrote:
Hey Jon,

The most glaringly obvious thing I can point out is that, at least in
your perl routine (and probably in the other languages), most of your
time is spent context switching while reading from the disk.

Now, my perl version is indeed faster, but one has to ask oneself:
was .015193256 seconds really worth the effort?  /shrug   -- If this
is for the financial industry, perhaps, but then they'd just have
written it in C.  Otherwise, probably not.
Also note, there are other ways to speed this up even further, but at
that point it isn't really worth the time.  We're talking a couple of
microseconds at best.  I've included my version for your reference.

Before closing: I happen to like micro benchmarks, whether or not you
think 'I know this benchmark is maybe meaningless', as your site says.

They can absolutely be useful.  I personally think sometimes they are
and other times not so much.  It just depends on the context.

Your perl source (doit) .. my perl source (doit2):
# ./doit2.pl [1] | md5
786be54356a5832dcd1148c18de71fc8
root@nas:~ # ./doit.pl [2] | md5
786be54356a5832dcd1148c18de71fc8

# truss -c ./doit.pl [2]
<!--snip-->

syscall                     seconds   calls  errors
read                    0.036828813    4140       0

<!--snip-->
                      ------------- ------- -------
                        0.037821821    5227     284

# truss -c ./doit2.pl [1]
<!--snip-->

syscall                     seconds   calls  errors
read                    0.000245121      19       0
<!--snip-->
                      ------------- ------- -------
                        0.022628565     804      59

-------------------------------------
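use strict;
use warnings;

# Slurp mode: with $/ undefined, <$fh> returns the whole file in one read.
$/ = undef;

# Build a lookup hash of stopwords, one per line.
my %stopwords = do {
        open my $fh, '<:mmap', 'stopwords.txt' or die $!;
        map { $_ => 1 } split /\n/, <$fh>;
};

# Count every word that is not a stopword.
my %count = do {
        my %res;
        open my $fh, '<:mmap', 'words.txt' or die $!;
        $res{$_}++ for grep { !$stopwords{$_} } split /\n/, <$fh>;
        %res;
};

# Print the 20 most frequent words, highest count first.
my $i = 0;
for (sort { $count{$b} <=> $count{$a} } keys %count) {
    last if $i++ >= 20;
    print "$_ -> $count{$_}\n";
}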
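
If you want to compare the two reading strategies on your own machine, something like the following rough sketch with the core Benchmark module could be a starting point. The generated input, the sub names by_line and slurp, and the line count are all made up for illustration, and by_line is only my guess at a typical line-by-line loop, not your actual doit source. Timings will vary by system, so treat the table it prints as indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Hypothetical input: 50,000 short lines written to a temp file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "word", $_ % 1000, "\n" for 1 .. 50_000;
close $tmp or die $!;

# Count lines by reading one record at a time.
sub by_line {
    open my $fh, '<', $path or die "open $path: $!";
    my $n = 0;
    $n++ while <$fh>;
    return $n;
}

# Count lines by slurping the whole file, then splitting.
sub slurp {
    local $/;    # slurp mode, limited to this sub
    open my $fh, '<', $path or die "open $path: $!";
    my @lines = split /\n/, <$fh>;
    return scalar @lines;
}

# Sanity check: both strategies must agree before timing them.
die "results differ" unless by_line() == slurp();

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, { by_line => \&by_line, slurp => \&slurp });
```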

On Sat, Jan 15, 2022 at 12:37 AM Jon Smart <j...@smartown.nl> wrote:

Hello,

May I share the results of my benchmark for perl5, ruby, and scala?
https://blog.cloudcache.net/benchmark-for-scala-ruby-and-perl/

Any suggestions for improving it are welcome.

Thanks.

--
__________________

:(){ :|:& };:

Links:
------
[1] http://doit2.pl
[2] http://doit.pl

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

