Cross-table Querying

Hi guys, here’s a question. What’s the best way to access data in Yii when you’re bringing data back from many tables? For example, let’s say I’m building a shopping cart and need to show a product list. In this product list I need to show some product data, as well as a couple of fields from the category, a couple of fields from the manufacturer, a couple of fields from the image, and some fields from the product info tables. So in total there are 5 tables I’m getting data out of, but I don’t need ALL the data from all the tables. Also, once the data is retrieved I will need to add some logic to do things like create the SEO-friendly name to pass to the urlManager class to generate the URL. Where would I put such logic? It doesn’t seem that Active Record is the most appropriate solution here, unless I somehow create an Active Record based on a query where I specify which fields need to be retrieved from which tables…

I think what I’m talking about here is a pretty common scenario for websites that primarily output stuff.

Any help and suggestions would be much appreciated.

AR should fit perfectly for this kind of work. While you may think writing an AR class for every relevant table is very troublesome, you will find it is actually very convenient to use. It also helps you separate your business logic from the rest of the code very well. You usually only need to consider a non-AR solution when you need to fetch thousands of rows of data or your SQL involves complex aggregations.

So can you actually create an ActiveRecord based on 5 different tables? For example, I want to show the customer first name and last name, order number, order date, order total and count of order products, and I want to show all of this in a single list. How would I do this with Active Record in Yii? I don’t want to retrieve all the data about the customer or the order, just 2 out of 20 fields, and show all of this in a paged list.

Also, in another example, let’s say I have a customer full name in a few places, which is a derived field computed as firstname . ' ' . lastname, and I don’t want to replicate this logic. I want to use it on both the customers list and the orders list. How would I go about that? This is a simple example, but let’s say I need to create something out of 4 fields with more complex logic (as I do for products when creating SEO-friendly names by stripping some text, etc.). How would you recommend doing that? Should I create a static method somewhere with the logic and call it from all the Active Records, so that the logic is not replicated in 2 places?

I guess Qiang’s advice is to use AR freely, without worrying about fields that are loaded but not needed.

So create 5 Active Record classes and add getter methods to them for customizing the output (e.g. getFullName, which gives you the name and surname), and so on.

Don’t be afraid of fields that are loaded and not used: a custom SQL query that returns only the needed data will not improve the speed of your application very much.

Just an example: if you ask for the name and surname from a user table that has 50 fields, the DB spends a lot of time finding the row that holds the requested user, and then very little time retrieving the other data.

If you want to improve speed, the best advice is to split the user table into 2 tables in a one-to-one relation: the "frequently accessed" data and the "rarely accessed" data.

That will give you an advantage. I tested it on a table of images that held some data such as size, type and name, plus a blob (sometimes a photo as large as 10 MB). Saving the blob in a different table improved the speed, because the blob was rarely accessed.

So don’t be afraid of big AR or small AR: it is not AR that makes the real difference in performance, it is the DB. Better to think about good tables and small tables. It is always the DB that does the hard work of retrieving data; AR classes are just a representation.

I know what you mean, and that’s something I usually do when I’m not dealing with a lot of data, but in a serious application where I have tables with 50 fields and I only require 2 of them, it really makes little sense to load all 50. Also, in my example it’s only 3 tables of say 50 fields each, but what happens when you have 8 tables and you’re doing some aggregate view in which you need data from across a few tables? I also have tables that have a 1:1 relationship, such as tables for internationalization (e.g. Product Name, Description in a specific language) and tables for bundled products (which is once again a separate table). How do I handle all of that in Active Record?

I mean it makes little sense from a performance perspective to retrieve 150 fields from 3 tables when all you need is up to 5 fields from each (i.e. 15 out of 150). Also, I don’t need to retrieve the big Product Description field stored as text because I’m not showing it in the list, so why on earth would I want to load it? Remember all of this has to live IN MEMORY when there’s no reason for it!

The "just use ActiveRecord and don’t worry about it" advice would work for a small app where you have 10 columns in a database, but not for a large, data-centric application.

I’m sure this is a situation that is frequently encountered in large data-centric applications and this is the same problem that arises when you do any OO programming with serious databases.

My thinking was to create a view perhaps or just a query and then create an AR based on that. For any derived fields what I would then do is create static functions in the most appropriate place and use them out of the getters to achieve code reuse.
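A minimal sketch of the "static function used from a getter" idea, in plain PHP so it runs without the framework. NameHelper, seoName, ProductRow and the name property are all hypothetical names; in a Yii application ProductRow would be an Active Record class instead of a plain class, and the real SEO rules would likely be more involved.

```php
<?php
// Hypothetical helper that keeps derived-field logic in one place.
class NameHelper
{
    // Turns free text into a lowercase, hyphen-separated URL segment.
    public static function seoName($text)
    {
        $slug = strtolower(trim($text));
        // Replace every run of characters other than a-z and 0-9 with one hyphen.
        $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
        // Drop hyphens left over at either end.
        return trim($slug, '-');
    }
}

// Plain class standing in for a model; assumed here for illustration only.
class ProductRow
{
    public $name;

    // Getter that delegates to the shared static helper.
    public function getSeoName()
    {
        return NameHelper::seoName($this->name);
    }
}
```

Any other class that needs the same rule can call NameHelper::seoName() from its own getter, so the logic lives in one place.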

Qiang, would be good to get your input on this, as I’m sure it’s a common problem people come across when writing anything more complex than a simple blog.

Bump… Anyone? Qiang? I mean surely this is a common problem encountered?

In this case, you can specify the ‘select’ option in the query criteria to retrieve only the specified column data.
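As a rough illustration of the ‘select’ option, the sketch below only builds the criteria array, so it runs as plain PHP. It assumes Yii 1.x, and the Product model, its category and manufacturer relations, and every column name are hypothetical.

```php
<?php
// Builds a criteria array of the kind Yii 1.x accepts in find()/findAll().
// Model, relation and column names below are assumptions for illustration.
function productListCriteria()
{
    return array(
        // Columns to load from the product table itself.
        'select' => 'id, name, price',
        // Related tables to join, each limited to the columns the list shows.
        'with' => array(
            'category' => array('select' => 'name'),
            'manufacturer' => array('select' => 'name'),
        ),
    );
}

// Inside a Yii 1.x application this would be used roughly as:
//   $products = Product::model()->findAll(productListCriteria());
```

The relations named under 'with' would have to be declared in the model’s relations() method for the query to work.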

Thanks Qiang… So is what you’re saying that I would create a separate AR class for this ViewModel (at least that’s what they call it in ASP.NET MVC, because it’s specific to the View) and then in the select just specify which fields I need? What about calculated fields, what would you recommend? Should I create a static method in my Customer model, something like getFullName($firstName, $lastName), so it can be called as Customer::getFullName from both the CustomerModel and the OrderSummaryViewModel?

Please note I’m using Customer, Order and FullName as just a simple example of the problem, because in my situation I actually need to take like 5 fields and do some more complex logic on them to produce the output.

I would really appreciate some guidance on this from the experts as this is a common OO problem when it comes to data-centric applications, where the logic is usually pretty simple, but there is a lot of data.

From what I’ve seen, for heavily data-centric applications Microsoft in the past recommended the approach where the data and the logic are separated into Data Transfer Objects and Business Components which act on those Data Transfer Objects, which is the opposite of Active Record where the data and the logic are bundled together.

I’ve never tried this, but maybe this is possible with AR:

create a db view with the necessary fields and use AR to read this view…?

Yeah, that’s an option. The question is how to handle situations where your Model and your ViewModel need to perform the same logic (e.g. get the Full Name from First Name and Last Name).

Use the same base class for both and implement those methods there.
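A minimal sketch of that suggestion, written as plain PHP so it runs on its own. PersonNameRecord, Customer, OrderSummary and the property names are hypothetical; in Yii 1.x the base class would presumably extend CActiveRecord and the properties would come from the table or view columns.

```php
<?php
// Hypothetical shared base class that holds the derived-field logic once.
abstract class PersonNameRecord
{
    public $first_name;
    public $last_name;

    // Joins first and last name; trim() covers the case where one is empty.
    public function getFullName()
    {
        return trim($this->first_name . ' ' . $this->last_name);
    }
}

// Stand-in for the full customer model.
class Customer extends PersonNameRecord
{
}

// Stand-in for the list-specific model (for example, one backed by a db view).
class OrderSummary extends PersonNameRecord
{
    public $order_number;
}
```

Both classes inherit getFullName(), so the customers list and the orders list share one implementation.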

Hmm, good thinking, won’t always work for complex scenarios, but good thinking indeed.