pgbench-ycsb

2018-07-19 Thread a . bykov
Hello, hackers.

It might be a good idea to let users test their applications with pgbench
under different realistic loads, so that they can see what is likely to
happen in production.

YCSB (Yahoo! Cloud Serving Benchmark) was taken as the concept. The YCSB
tests were originally designed to facilitate performance comparisons of
different cloud data serving systems, and they take into account
different application workloads, such as:
  workload A - the application does a lot of reads (50%) and
               updates (50%).
  workload B - the application does reads in 95% of cases and updates
               in 5% of cases.
  workload C - models the behavior of a read-only application.
  workload E - the application requests several neighboring tuples in
               95% of cases and does updates in 5% of cases.

The patch implements these workloads so that pgbench can run them as
builtin scripts, for example:
  pgbench -b ycsb-A
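
As a rough illustration of the mixes listed above, the following Python
sketch picks an operation from a weight table using one inclusive draw,
much like the patch's random(1, :total_weight). The names WORKLOADS,
pick_operation and simulate are made up for this example and are not part
of pgbench or of the patch. The draw is compared against cumulative
weights with <=, so that a weight of 50 owns exactly 50 of the 100
possible values. This only models the selection logic; it does not talk
to a database.

```python
import random

# Operation shares per workload, in percent, as listed in the message above.
# "scan" stands for the short range reads of workload E.
WORKLOADS = {
    "ycsb-A": {"read": 50, "update": 50},
    "ycsb-B": {"read": 95, "update": 5},
    "ycsb-C": {"read": 100},
    "ycsb-E": {"scan": 95, "update": 5},
}


def pick_operation(mix, draw):
    """Map a draw in 1..sum(mix.values()) to an operation name.

    Uses cumulative thresholds with <=, so an operation with weight w
    owns exactly w of the possible draw values.
    """
    threshold = 0
    for name, weight in mix.items():
        threshold += weight
        if draw <= threshold:
            return name
    raise ValueError("draw is outside 1..total weight")


def simulate(workload, transactions, seed=0):
    """Count how often each operation is chosen for a workload."""
    rng = random.Random(seed)
    mix = WORKLOADS[workload]
    total = sum(mix.values())
    counts = dict.fromkeys(mix, 0)
    for _ in range(transactions):
        # randint is inclusive on both ends, like pgbench's random().
        counts[pick_operation(mix, rng.randint(1, total))] += 1
    return counts
```

Calling simulate("ycsb-B", 100000) should report roughly 95% reads and 5%
updates; the exact counts depend on the seed.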

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..cd884af937 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -506,6 +506,153 @@ static const BuiltinScript builtin_script[] =
 		"",
 		"\\set aid random(1, " CppAsString2(naccounts) " * :scale)\n"
 		"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+	},
+	{
+		"ycsb-A",
+		"",
+		"\\set write_weight 0\n"
+		"\\set read_weight 50\n"
+		"\\set scan_weight 0\n"
+		"\\set update_weight 50\n"
+		"\\set total_weight 100\n"
+		"\\set operation random(1,:total_weight)\n"
+		"\\set parameter 2\n"
+		"\\if (:operation < :write_weight)\n"
+		"\\set aid abs(hash(random_zipfian(1, " CppAsString2(naccounts)", :parameter)))%"CppAsString2(naccounts)"\n"
+		"\\set bid random(1,"CppAsString2(nbranches)")\n"
+		"\\set tid random(1,"CppAsString2(ntellers)")\n"
+		"\\set delta random(-5000, 5000)\n"
+		"INSERT into pgbench_accounts (bid, aid) VALUES (:bid, :aid);\n"
+		"\\elif (:operation < :read_weight+:write_weight)\n"
+		"\\set read abs(hash(random_zipfian(1, 10,:parameter)))%" CppAsString2(naccounts)"\n"
+		"SELECT * from pgbench_accounts where aid = :read;\n"
+		"\\elif (:operation < :total_weight-:update_weight)\n"
+		"\\set scan abs(hash(random_zipfian(1, 10, :parameter)))%" CppAsString2(naccounts)"\n"
+		"\\set scanlimit random(2, 10)\n"
+		"SELECT * from pgbench_accounts where abalance>:scan limit :scanlimit\n"
+		"\\elif (:operation < :total_weight)\n"
+		"\\set aid abs(hash(random_zipfian(1, " CppAsString2(naccounts)", :parameter)))%" CppAsString2(naccounts)"\n"
+		"\\set delta random(-5000, 5000)\n"
+		"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid\n"
+		"\\endif"
+ 	},
+	{
+		"ycsb-B",
+		"",
+		"\\set write_weight 0\n"
+		"\\set read_weight 95\n"
+		"\\set scan_weight 0\n"
+		"\\set update_weight 5\n"
+		"\\set total_weight 100\n"
+		"\\set operation random(1,:total_weight)\n"
+		"\\set parameter 2\n"
+		"\\if (:operation < :write_weight)\n"
+		"\\set aid abs(hash(random_zipfian(1, " CppAsString2(naccounts)", :parameter)))%"CppAsString2(naccounts)"\n"
+		"\\set bid random(1,"CppAsString2(nbranches)")\n"
+		"\\set tid random(1,"CppAsString2(ntellers)")\n"
+		"\\set delta random(-5000, 5000)\n"
+		"INSERT into pgbench_accounts (bid, aid) VALUES (:bid, :aid);\n"
+		"\\elif (:operation < :read_weight+:write_weight)\n"
+		"\\set read abs(hash(random_zipfian(1, 10,:parameter)))%" CppAsString2(naccounts)"\n"
+		"SELECT * from pgbench_accounts where aid = :read;\n"
+		"\\elif (:operation < :total_weight-:update_weight)\n"
+		"\\set scan abs(hash(random_zipfian(1, 10, :parameter)))%" CppAsString2(naccounts)"\n"
+		"\\set scanlimit random(2, 10)\n"
+		"SELECT * from pgbench_accounts where abalance>:scan limit :scanlimit\n"
+		"\\elif (:operation < :total_weight)\n"
+		"\\set aid abs(hash(random_zipfian(1, " CppAsString2(naccounts)", :parameter)))%" CppAsString2(naccounts)"\n"
+		"\\set delta random(-5000, 5000)\n"
+		"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid\n"
+		"\\endif"
+	},
+	{
+		"ycsb-C",
+		"",
+		"\\set write_weight 0\n"
+		"\\set read_weight 100\n"
+		"\\set scan_weight 0\n"
+		"\\set update_weight 0\n"
+		"\\set total_weight 100\n"
+		"\\set operation random(1,:total_weight)\n"
+		"\\set parameter 2\n"
+		"\\if (:operation < :write_weight)\n"
+		"\\set aid abs(hash(random_zipfian(1, " CppAsString2(naccounts)", :parameter)))%"CppAsString2(naccounts)"\n"
+		"\\set bid random(1,"CppAsString2(nbranches)")\n"
+		"\\set tid random(1,"CppAsString2(ntellers)")\n"
+		"\\set delta random(-5000, 5000)\n"
+		"INSERT into pgbench_accounts (bid, aid) VALUES (:bid, :aid);\n"
+		"\\elif (:operation < :read_weight+:write_weight)\n"
+		"\\set read abs(hash(random_zipfian(1, 10,:parameter)))%" CppAsString2(naccounts)"\n"
+		"SELECT * from pgbench_accounts where aid = :read;\n"
+		"\\elif (:operation < :total_weight-:update_weight)\n"
+		"\\set scan abs(hash(random_zipfian(

Re: pgbench-ycsb

2018-07-19 Thread a . bykov

On 2018-07-19 16:50, Dmitry Dolgov wrote:
On Thu, 19 Jul 2018 at 15:36, Fabien COELHO wrote:

Hello Anthony,

> applications with pgbench under different real-life-like load. So that
> they will be able to see what's going to happen on production.
>
> YCSB (Yahoo! Cloud Serving Benchmark) was taken as a concept. YCSB tests
> were originally designed to facilitate performance comparisons of
> different cloud data serving systems and it takes into account different
> application workloads like:
> workload A - assumes that application do a lot of reads(50%) and
> updates(50%).
> workload B - case when application do 95% of cases reads
> and 5% updates
> workload C - models behavior of read-only application.
> workload E - the workload of the applications which in 95% of cases
> requests for several neighboring tuples and in 5% of cases - does
> updates.
>
> In the patch those workloads were implemented to be executed by pgbench:
> pgbench -b ycsb-A

Could you provide a link to the specification?

I cannot find something simple, and I was kind of hoping to avoid diving
into the source code of the java tool on github:-) In particular, I'm
looking for a description of the expected underlying schema and its size
(scale) parameters.


There are the description files for different workloads, like [1] (with
the custom amount of records, of course) and the schema [2]. Would this
information be enough?

[1]: https://github.com/brianfrankcooper/YCSB/blob/master/workloads/workloada
[2]: https://github.com/brianfrankcooper/YCSB/blob/master/jdbc/src/main/resources/sql/create_table.sql


Hi.
Thanks for your feedback, I'll fix it soon.
Actually, I used the article "Brian F. Cooper, Adam Silberstein, Erwin Tam,
Raghu Ramakrishnan and Russell Sears. Benchmarking Cloud Serving Systems
with YCSB. ACM Symposium on Cloud Computing (SoCC), Indianapolis, IN, USA,
2010".

It is available here:
https://github.com/brianfrankcooper/YCSB/wiki/Papers-and-Presentations

But maybe the article is more complicated than your example.

--
Anthony Bykov



Re: pgbench-ycsb

2018-07-22 Thread a . bykov

On 2018-07-22 16:56, Fabien COELHO wrote:
Just to clarify - if I understand Anthony correctly, this proposal is not
about implementing exactly YCSB as it is, but more about using zipfian
distribution for an id in the regular pgbench table structure in
conjunction with read/write balance to simulate something similar to it.


Ok, I misunderstood. My 0.02€: If it does not implement YCSB, and the
point is not to implement YCSB, then do not call it YCSB:-)

Maybe there could be other simpler builtins to use non uniform
distributions: {zipf,exp,...}-{simple,select} and default values
(exp_param, zipf_param?) for the random distribution parameters.

  \set id random_zipfian(1, 10*:scale, :zipf_param)
  \set val random(-5000, 5000)
  UPDATE pgbench_whatever ...;

Then

  pgbench -b zipf-se@1 -b zipf-si@1 [ -D zipf_param=1.1 ... ] -T 1 ...
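
To give a feeling for what a Zipfian parameter such as zipf_param=1.1
does to the key distribution, here is a small Python sketch. It assumes
the usual bounded Zipfian shape in which value k is drawn with
probability proportional to 1 / k**s, so that 1 is drawn 2**s times as
often as 2. The function names are invented for this example, and
pgbench's random_zipfian() uses its own sampling algorithm; only the
resulting shape is meant to be comparable.

```python
import random
from itertools import accumulate


def zipf_probabilities(n, s):
    """Probability of each value 1..n when P(k) is proportional to 1 / k**s."""
    weights = [1.0 / k ** s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def make_zipf_sampler(n, s, seed=0):
    """Return a function that draws values in 1..n with Zipfian skew s."""
    rng = random.Random(seed)
    # Precompute the cumulative distribution once; each draw is then a
    # binary search inside random.choices().
    cumulative = list(accumulate(zipf_probabilities(n, s)))
    values = range(1, n + 1)

    def draw():
        return rng.choices(values, cum_weights=cumulative, k=1)[0]

    return draw
```

With n = 10 and s = 1.1, more than a third of all draws should land on
the value 1, which is the kind of hot spot the non-uniform builtins would
exercise.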


And probably instead of implementing the exact YCSB workload inside
pgbench, it makes more sense to add PostgreSQL Jsonb as one of the options
into the framework itself (I was in the middle of it a few years ago, but
then was distracted by some interesting benchmarking results).


Sure.


Hello,
thank you for your interest. I'm still improving this idea and the patch,
and I'm very happy about the discussion we are having. It really helps.

The idea was to implement the workloads as close to YCSB as possible
using pgbench.

So the schema it should be applied to is the default schema generated by
pgbench -i (pgbench_accounts).
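
Since the target is the default pgbench_accounts table, the skewed rank
has to be turned into an existing aid. The patch does this with
abs(hash(random_zipfian(...))) % naccounts, which scatters the hot ranks
over the key space instead of clustering them at the low ids. The Python
sketch below shows the same idea under a few assumptions of its own: the
name scatter_key is hypothetical, hashlib.blake2b is only a stand-in for
pgbench's hash(), and the result is shifted by one and multiplied by the
scale factor because aid in pgbench_accounts runs from 1 to
100000 * scale.

```python
import hashlib

NACCOUNTS = 100000  # rows per scale unit in pgbench_accounts


def scatter_key(rank, scale=1):
    """Map a popularity rank (1 = hottest) to an aid in 1..NACCOUNTS * scale.

    hashlib.blake2b is only a stand-in for pgbench's hash() function; the
    concrete values differ, the scattering idea is the same.
    """
    digest = hashlib.blake2b(str(rank).encode(), digest_size=8).digest()
    hashed = int.from_bytes(digest, "big")
    # The modulo yields 0..N-1; adding 1 moves it into the 1..N aid range.
    return hashed % (NACCOUNTS * scale) + 1
```

The mapping is deterministic, so the same rank always hits the same row,
and two ranks may occasionally collide on one aid.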

--
Anthony Bykov