Yii Framework Forum: AR query performance - Yii Framework Forum

Jump to content

Page 1 of 1
  • You cannot start a new topic
  • You cannot reply to this topic

AR query performance

#1 User is offline   mindplay 

  • Advanced Member
  • PipPipPip
  • Yii
  • Group: Members
  • Posts: 397
  • Joined: 03-September 09
  • Location:New York

Posted 11 May 2012 - 10:18 PM

Programmers (and ORMs) usually optimize their query-strategy for as few queries and roundtrips as possible - we sometimes forget that the database server may in fact be on a different physical machine, and suddenly the data-volume may influence the response-time more than the number of queries.

Join-statements merely for the sake of getting all the data in one round-trip actually don't make much sense, when you really think about it - you're often transfering and traversing loads of duplicate data, putting more load on the network and adding memory overhead.

That's not really why we have join-statements in the first-place - they exist for situations where you need to obtain a product of some sort - e.g. something involving sum, avg, count, min or max functions. Join queries are not really an optimization-tool, but most ORMs treat them as such.

But there is in fact a way to execute multiple queries in one round-trip, at least with the mysqli-extension:

http://php.net/manua...multi-query.php

This does not appear to be possible with PDO, which, I'm guess, is why nobody uses it.

But think about it - if you have 3 customers and 50 orders, and you join those, selecting all the fields, you're actually sending each customer's data more than 16 times repeatedly. This data has to be serialized by the database-server, transported over the network (which is worse if your database server is on a different physical server from your application-server), then unserialized by the database-client, and then subsequently you have to traverse all that data in PHP (or some of it, but all of it has to go into memory) and weed out the non-unique customer records.

Does that sound efficient?

Much for efficient and simple would be a batch of two queries that select the 50 orders and 3 customers separately.

Do you think that's possible?
0

#2 User is offline   Da:Sourcerer 

  • Master Member
  • PipPipPipPip
  • Yii
  • Group: Members
  • Posts: 1,162
  • Joined: 30-March 11
  • Location:Berlin, Germany

Posted 12 May 2012 - 07:08 AM

Quick note without having read your entire post: At least MySQL allows the traffic to be compressed, which is most useful if client and server are not on the same (physical) machine.
programmer /ˈprəʊgramə/, noun: a device that converts ►coffee into ►code
0

#3 User is offline   Da:Sourcerer 

  • Master Member
  • PipPipPipPip
  • Yii
  • Group: Members
  • Posts: 1,162
  • Joined: 30-March 11
  • Location:Berlin, Germany

Posted 12 May 2012 - 07:15 AM

Ok, got it. I think a lot of this is already covered by the lazy-loading approach Yii's current AR implementing is using. So the JOINs will mostly kick in if you explicitly fore Yii to use eager loading via ->with(...).

As for the data duplication: I do not think this can be entirely avoided. But it sounds like you would profit from identity maps.
programmer /ˈprəʊgramə/, noun: a device that converts ►coffee into ►code
0

#4 User is offline   mindplay 

  • Advanced Member
  • PipPipPip
  • Yii
  • Group: Members
  • Posts: 397
  • Joined: 03-September 09
  • Location:New York

Posted 13 May 2012 - 10:33 AM

That's sort of my point. You can choose between lazy-loading with many round-trips - or eager-loading with data-duplication.

With the correct APIs, this could be avoided - you could have one round-trip (eager loading) but no data-duplication. Unfortunately, the multi-query facility does not seem to be the answer, as you can't predict what else you need to load without another round-trip - since the queries are independent queries with individual result-sets. At least not as far as I can figure.

This brings me back to what I've been thinking for some time now - that DBMS APIs really are designed for database administrators (or DBA tools like graphical front-ends) and not for software. Where are the APIs that were designed for software? Mainstream DMBS do not have any... odd right? being that DBMS are much more frequently used for back-end persistence for software, than for DBA front-end tools...
0

#5 User is offline   Karl Zilles 

  • Junior Member
  • Pip
  • Yii
  • Group: Members
  • Posts: 48
  • Joined: 07-June 11
  • Location:Los Angeles, United States

Posted 15 May 2012 - 12:21 PM

Doesn't the default mechanism that the current Yii uses already address your data throughput issue?

Unless you use the "together" flag on a with query:

Quote

tracing the application's queries shows that when CActiveDataProvider gets data, it handles 1:n related tables a bit different. AR gets the data from the parent table and 1:1 related child tables with one query joining all those 1:1 child tables. 1:n child tables are not part of this query. AR gets data from those with subsequent queries, one query per 1:n child table, specifying the rows it wants with a WHERE PK IN (id1, id2, ...) clause.


So, true, it takes 2 or maybe 3 queries to get your results instead of the one you're talking about, but there is no duplication of data, and you're avoiding the n query problem.
0

#6 User is offline   mindplay 

  • Advanced Member
  • PipPipPip
  • Yii
  • Group: Members
  • Posts: 397
  • Joined: 03-September 09
  • Location:New York

Posted 17 May 2012 - 09:27 PM

Well, not quite - I'm talking also about the number of roundtrips. You send one request, and you have to wait for the response before you can construct the next query and send that - and so forth.

The waiting is pointless - you already know your data-access pattern in advance; unfortunately DBMS, their APIs, and SQL at large, were not designed for software, but for database administrators and DBA tools. Actually I saw somebody else pointing that out in an article today:

http://saucelabs.com...oodbye-couchdb/

Quote

It’s 2012, and most queries are run from code rather than by a human sitting at a console. Why are we still querying our databases by constructing strings of code in a language most closely related to freaking COBOL, which after being constructed have to be parsed for every single query?
SQL in its natural habitat
Things like SQL injection attacks simply should not exist. They’re a consequence of thinking of your database API as a programming language instead of a protocol, and it’s just nuts that vulnerabilities still result from this poorly thought out 1970s design today.


Although I thought this article was largely garbage, this particular observation is one I've been pointing out a lot lately. Your ORM is actually a code-generator. What's worse, your ORM generates SQL, and your DBMS then parses that code and compiles it into an "execution plan", which is really a program. But the truth of the matter is, the entire process is really pointless, because your software does not sporadically come up with new queries on it's own - your applications have predictable data-access patterns, with the possible exception of a few applications that allow arbitrary query-construction. Applications such as DBA tools. Heh.
0

#7 User is offline   Da:Sourcerer 

  • Master Member
  • PipPipPipPip
  • Yii
  • Group: Members
  • Posts: 1,162
  • Joined: 30-March 11
  • Location:Berlin, Germany

Posted 18 May 2012 - 03:31 AM

Hm, I'm not sure that actually holds ... What will be sent over the wire is not the raw "SELECT * FROM ..." but a binary protocol containing already (pre-)parsed queries. Then, there are prepared statements that further reduce parsing time. Predictable access patterns are covered by some quite good query caches etc. Honestly, I don't quite see where you're going here. Your charge against SQL sounds really like a charge against n-tier application design.
programmer /ˈprəʊgramə/, noun: a device that converts ►coffee into ►code
0

Share this topic:


Page 1 of 1
  • You cannot start a new topic
  • You cannot reply to this topic

1 User(s) are reading this topic
0 members, 1 guests, 0 anonymous users