I've been doing some comparative testing of different cloud platforms' hosting of PostgreSQL. One of the deficiencies in this effort is that the only benchmarking tool I have is pgbench, which doesn't reflect the kinds of workloads people would want to run on cloud hosting. Don't get me wrong, pgbench does everything you could imagine with its simple TPC-B-style benchmark, including statistics and sampling. But the core benchmark is still something which doesn't look much like the kind of Rails and Django apps I deal with on a daily basis.
There's also TPCC-js, which is more sophisticated, but is ultimately still a transactional, back-office OLTP benchmark.
So I'm thinking of developing a "webapp" benchmark. Here's what I see as concepts for such a benchmark:
- Read-mostly
- No multi-statement transactions
- Defined "users" concept with logins and new user registration
- Needs a "sessions" table which is frequently updated
- Read-write, read-only and session database connections should be separable, in order to test load-balancing optimization.
- Queries counting, sorting and modifying content
- Measured unit of work is the "user session", which would contain some content lookups and minor updates ("likes"); a rough sketch of what I have in mind follows below.
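To make that concrete, here is a minimal sketch of the sort of schema and per-session work I'm imagining. This is purely illustrative; every table, column, and parameter name below is a placeholder, not part of pgbench or any existing tool, and the details would certainly change in a real spec.

    -- Hypothetical schema sketch; all names are illustrative.
    CREATE TABLE users (
        user_id    bigserial PRIMARY KEY,
        login      text NOT NULL UNIQUE,
        pwhash     text NOT NULL,
        created_on timestamptz NOT NULL DEFAULT now()
    );

    CREATE TABLE sessions (
        session_id uuid PRIMARY KEY,
        user_id    bigint NOT NULL REFERENCES users (user_id),
        last_seen  timestamptz NOT NULL DEFAULT now()
    );

    CREATE TABLE content (
        content_id bigserial PRIMARY KEY,
        author_id  bigint NOT NULL REFERENCES users (user_id),
        posted_on  timestamptz NOT NULL DEFAULT now(),
        body       text NOT NULL
    );

    CREATE TABLE likes (
        content_id bigint NOT NULL REFERENCES content (content_id),
        user_id    bigint NOT NULL REFERENCES users (user_id),
        liked_on   timestamptz NOT NULL DEFAULT now(),
        PRIMARY KEY (content_id, user_id)
    );

    -- One "user session" unit of work, as separate single-statement
    -- transactions ($n are driver-supplied parameters):
    UPDATE sessions SET last_seen = now() WHERE session_id = $1;

    SELECT c.content_id, c.body, count(l.user_id) AS like_count
      FROM content c
      LEFT JOIN likes l USING (content_id)
     WHERE c.author_id = $2
     GROUP BY c.content_id, c.body, c.posted_on
     ORDER BY c.posted_on DESC
     LIMIT 20;

    INSERT INTO likes (content_id, user_id) VALUES ($3, $4)
        ON CONFLICT DO NOTHING;  -- ignore repeat likes (PostgreSQL 9.5+)

The point of keeping each statement in its own transaction is that the read-only lookups, the session touch, and the writes could then be routed to different connections, per the load-balancing item above.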
Now, one of the big questions is whether we should base this benchmark on the idea of a social networking (SN) site. I think we should; SN sites test a number of things, including locking and joins which might not be exercised by other types of applications (and aren't by pgbench). What do you think? Does anyone other than me want to work on this?
What about a spec-compliant TPC-W?
Jan published one written in PHP several years ago; it can be obtained on pgfoundry: http://pgfoundry.org/projects/tpc-w-php/
Yah, this is even worse than pgbench as far as being a real benchmark is concerned. No FKs, no joins. And yet uses multi-statement transactions.
There are reasons TPC-W has been an obsolete benchmark for a decade now.
I think this website/project can be helpful, Josh: http://oltpbenchmark.com/wiki/index.php?title=Main_Page, especially the Epinions and SEATS workloads.
Oh! Thank you! I'll dig into that; it looks like exactly what we need.
Feh. Java! Why did it have to be Java?
Also, sadly, the benchmarks I'm interested in seem to be incomplete. For example, there's no Wikipedia trace file.
What do you mean by trace file? I remember running various workloads against PG a year or so ago.
Yeah, I'll need to give it some testing time; apparently the file structure is just very opaque. My real concern is that if I don't like some of the workloads (that is, if they're unbalanced or don't scale well), I won't be able to fix them. But test first, fix later ...
The structure isn't optimal, as it's designed to be generic both workload- and database-wise. The good stuff is usually at workloads/*/procedures/*.java
Maybe some wiki made with Django? I'm thinking about using an existing open source project, something like https://github.com/django-wiki/django-wiki. Using Django would also allow comparing PostgreSQL with other DBMSes.
I think that using a framework with an ORM may be a little unfair (since it uses only basic features), but it will give better results, closer to more common, real-world workloads.
Yeah, I started down the road of building a benchmark out of some common Django projects. The problem with that is that unless I want to devote 5-6 webservers to running the benchmark, it becomes a benchmark *of Django* rather than of the Postgres service.
In an age when we can have servers with hundreds of gigabytes of memory, terabytes of SSD, and dozens of cores for about $10K (the lower end of the possibilities, yes) talking to browsers on broadband with persistent connections, perhaps we should contemplate designing apps to the VT-100/*nix database/RS-232 paradigm of times past. After all, all those so-called OO coders remain in love with the dataObject/actionObject structure, which is not a bit different from the COBOL/VSAM structure their granddaddies used. Some kinds of Back to the Future actually make sense; we can now do what Dr. Codd had in mind with hardware he didn't have.
A test suite should acknowledge such a structure, rather than promoting the disconnected structure of vanilla HTTP.
Um, did you have a point with this?
Yeah. Conventional web benchmarks are really measuring the latency among JavaScript, the webserver, and the database server. We need not do it that way any more, at least for a significant number of application types. HTTP was always a bad paradigm for OLTP applications, and with current-day infrastructure, we can do away with the disconnectedness.
Building a benchmark for yesterday's paradigm only serves to support yesterday's paradigm.
Um ... yeah. You do that. Post a link when you have something.
I think you should look into linkbench.
Thanks, Baron! However, I'm not mainly looking to benchmark graph performance.
I'm not convinced your suggested set of measurements reflects typical web apps.
* Server-side sessions are rare in my experience, with signed or encrypted cookies being the preferred alternative for almost any situation. (Sites with server-side sessions always seem frustratingly unusable, with the session timing out at inopportune times; the implementation goes against the grain of HTTP.)
* "likes" should generally be implemented asynchronously, perhaps via a webserver log which is read in in small batches, so a 1 like per transaction under load would be unusual.
* The login action needs to reflect a usual user table rather than a database user. Very, very few webapps map app users to db users.
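For what it's worth, the batching described in the second point can be a single statement. This is just a sketch under the assumption that the webserver log has already been loaded into a staging table; likes_staging and the column names are made up for illustration and match nothing real:

    -- Hypothetical batched ingestion of likes from a staging table.
    WITH batch AS (
        DELETE FROM likes_staging
        RETURNING content_id, user_id
    )
    INSERT INTO likes (content_id, user_id)
    SELECT content_id, user_id FROM batch
    ON CONFLICT DO NOTHING;  -- skip duplicate likes (PostgreSQL 9.5+)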
Lawrence: you seem to have an axe to grind which has little to do with my post. No thanks.
DeleteI think it's a great idea. One thing to add to the list: common pagination techniques, with abusing limit/offset being number 1. That's where most of my unexpected performance variation (and corresponding headaches) comes from.
If I end up writing my own benchmark, I'd want to hold off on pagination until the second version. It's definitely a real problem, but it's also very hard to set up paginated data in a synthetically generated benchmark.
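For concreteness, the pattern being flagged above and the usual keyset alternative look roughly like this. The queries are illustrative only, written against the hypothetical content table sketched earlier, and aren't taken from any existing benchmark:

    -- OFFSET pagination: the cost grows with the page number, because all
    -- of the skipped rows still have to be read and sorted.
    SELECT content_id, body
      FROM content
     ORDER BY posted_on DESC, content_id DESC
     LIMIT 20 OFFSET 4000;

    -- Keyset ("seek") pagination: filter on the last row of the previous
    -- page instead, so each page is a small index range scan.
    SELECT content_id, body
      FROM content
     WHERE (posted_on, content_id) < ($1, $2)
     ORDER BY posted_on DESC, content_id DESC
     LIMIT 20;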
You can use something like tcpcopy: https://github.com/session-replay-tools/tcpcopy and a real log from a Rails/Django MySQL intercept: https://github.com/session-replay-tools/mysql-replay-module
Note: I haven't used it yet, but it seems promising!
Came across this, perhaps it has something useful:
https://github.com/oltpbenchmark/oltpbench