Please note: this article is based entirely on work done by my colleague, a developer who has neither an account on this site nor the time to write it up, so I am writing it on his behalf. I will try to forward him any comments that appear, but I can't promise if or when he will be able to answer. I am far less experienced with Oracle, so I probably won't be able to help myself.
The approach currently used in COciSchema, i.e. querying ALL_OBJECTS, ALL_TAB_COLUMNS and ALL_CONS_COLUMNS, is probably correct, but it is very slow on Oracle databases. Our experiments showed that in some rare situations (a very slow, non-localhost server) the schema-analysing query can run for more than a minute, and in isolated cases for up to twenty minutes. This causes, for example, Gii to fail permanently with a timeout error if the PHP script execution time limit is too low. This situation was described, for example, in this post.
The same approach works very well with other RDBMSs (like MySQL), as they are properly optimised for such queries. Unfortunately, it fails on Oracle running on a slow server.
One possible solution is to introduce a new set of tables that hold information on the tables and views used in a project. This is a kind of caching on the database side.
The idea is as follows. We take the original SQL query that COciSchema uses to read a table's schema. It is fired on each AR use, and since it is one of the slowest queries in Oracle, it makes AR nearly unusable. Instead of running this slow query every time, we run it only when the real table schema changes, and instead of returning its results, we store them in our own set of tables, which cache the real table schemas. Then we change COciSchema so that it reads table schemas from these so-called technical tables. This speeds up the whole process significantly, because Active Record now queries an ordinary table, which is very fast.
This looks much like using the schemaCachingDuration property, but has some advantages:
Here are example SQL statements that build these tables:
    create table WWW_TAB_COLUMNS
    (
      column_id    NUMBER(10) not null,
      table_name   VARCHAR2(50) not null,
      column_name  VARCHAR2(50) not null,
      data_type    VARCHAR2(100) not null,
      nullable     CHAR(1),
      data_default VARCHAR2(100),
      key          CHAR(1)
    );
    comment on column WWW_TAB_COLUMNS.nullable is 'Y/N';
    comment on column WWW_TAB_COLUMNS.key is 'P/NULL';
    alter table WWW_TAB_COLUMNS
      add constraint PK_WWW_TAB_COLUMNS primary key (TABLE_NAME, COLUMN_NAME);

    create table WWW_TABLES
    (
      table_name VARCHAR2(50) not null
    );
    alter table WWW_TABLES
      add constraint PK_WWW_TABLES primary key (TABLE_NAME);

    create table WWW_TAB_CONS
    (
      table_name        VARCHAR2(50) not null,
      column_name       VARCHAR2(50) not null,
      position          NUMBER(10) not null,
      r_constraint_name VARCHAR2(50) not null,
      table_ref         VARCHAR2(50) not null,
      column_ref        VARCHAR2(200) not null
    );
    alter table WWW_TAB_CONS
      add constraint PK_WWW_TAB_CONS primary key (TABLE_NAME, COLUMN_NAME);
    alter table WWW_TAB_CONS
      add constraint FK_WWW_TAB_CONS foreign key (TABLE_NAME)
      references WWW_TABLES (TABLE_NAME);
After creating these tables and modifying the COciSchema class definition, you can insert any table's schema into them. Below are example SQL statements for this. As stated earlier, they are much the same as the ones originally used in COciSchema.
SQL for retrieving column data:
    SELECT a.column_name,
           a.data_type ||
           case
               when data_precision is not null
                   then '(' || a.data_precision ||
                        case when a.data_scale > 0 then ',' || a.data_scale else '' end
                        || ')'
               when data_type = 'DATE' then ''
               else '(' || to_char(a.data_length) || ')'
           end as data_type,
           a.nullable,
           a.data_default,
           ( SELECT D.constraint_type
             FROM user_CONS_COLUMNS C
             inner join user_constraints D On D.constraint_name = C.constraint_name
             Where C.table_name = A.TABLE_NAME
               and C.column_name = A.column_name
               and D.constraint_type = 'P') as Key
    FROM user_TAB_COLUMNS A
    WHERE A.TABLE_NAME = 'TABLE_NAME' -- replace with the actual table name
    ORDER by a.column_id
SQL for retrieving references:
    SELECT D.constraint_type,
           C.COLUMN_NAME,
           C.position,
           D.r_constraint_name,
           E.table_name as table_ref,
           f.column_name as column_ref
    FROM ALL_CONS_COLUMNS C
    inner join ALL_constraints D
            on D.OWNER = C.OWNER and D.constraint_name = C.constraint_name
    left join ALL_constraints E
           on E.OWNER = D.r_OWNER and E.constraint_name = D.r_constraint_name
    left join ALL_cons_columns F
           on F.OWNER = E.OWNER and F.constraint_name = E.constraint_name
          and F.position = c.position
    WHERE C.OWNER = '{$schemaName}'
      and C.table_name = '{$name}'
      and D.constraint_type = 'R'
    order by d.constraint_name, c.position
This step is essential. The solution speeds up Active Record by making it read table schemas from the newly created set of tables instead of querying Oracle for them each time (one of the slowest operations in Oracle). Therefore, every change to a real table's schema must be reflected in the tables above.
After filling the tables with schema data, you need to alter some parts of the COciSchema class, either by overwriting the original code or by creating your own extension based on it:
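The refresh step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the original project: the function name `refreshColumnCache` is hypothetical, it assumes a Yii 1.1 `CDbConnection` connected to Oracle that returns upper-case result keys (as COciSchema itself relies on), and it reuses the column query shown earlier with `column_id` added and the table name bound as a parameter.

```php
<?php
/**
 * Hypothetical helper, not part of Yii or of the original project.
 * Re-reads the column metadata of one table from Oracle's data dictionary
 * and stores it in WWW_TABLES / WWW_TAB_COLUMNS.
 */
function refreshColumnCache(CDbConnection $db, $tableName)
{
    $tableName = strtoupper($tableName);
    $params = array(':t' => $tableName);

    // The article's column query, plus column_id, with a bound table name.
    $selectSql = <<<EOD
SELECT a.column_id, a.column_name,
       a.data_type ||
       case
           when a.data_precision is not null
               then '(' || a.data_precision ||
                    case when a.data_scale > 0 then ',' || a.data_scale else '' end
                    || ')'
           when a.data_type = 'DATE' then ''
           else '(' || to_char(a.data_length) || ')'
       end as data_type,
       a.nullable, a.data_default,
       (SELECT d.constraint_type
          FROM user_cons_columns c
         INNER JOIN user_constraints d ON d.constraint_name = c.constraint_name
         WHERE c.table_name = a.table_name
           AND c.column_name = a.column_name
           AND d.constraint_type = 'P') as key
  FROM user_tab_columns a
 WHERE a.table_name = :t
 ORDER BY a.column_id
EOD;
    $rows = $db->createCommand($selectSql)->queryAll(true, $params);

    $transaction = $db->beginTransaction();
    try {
        // Register the table once; WWW_TAB_CONS has a foreign key to WWW_TABLES.
        $known = $db->createCommand('SELECT COUNT(*) FROM www_tables WHERE table_name = :t')
            ->queryScalar($params);
        if (!$known) {
            $db->createCommand('INSERT INTO www_tables (table_name) VALUES (:t)')
                ->execute($params);
        }

        // Replace any previously cached column rows for this table.
        $db->createCommand('DELETE FROM www_tab_columns WHERE table_name = :t')
            ->execute($params);

        $insert = $db->createCommand(
            'INSERT INTO www_tab_columns'
            . ' (column_id, table_name, column_name, data_type, nullable, data_default, key)'
            . ' VALUES (:id, :t, :col, :type, :nullable, :def, :key)'
        );
        foreach ($rows as $row) {
            $default = $row['DATA_DEFAULT'];
            if (is_resource($default)) {
                $default = stream_get_contents($default);
            }
            $insert->execute(array(
                ':id' => $row['COLUMN_ID'],
                ':t' => $tableName,
                ':col' => $row['COLUMN_NAME'],
                ':type' => $row['DATA_TYPE'],
                ':nullable' => $row['NULLABLE'],
                // WWW_TAB_COLUMNS.data_default is VARCHAR2(100): longer defaults are cut.
                ':def' => $default === null ? null : substr($default, 0, 100),
                ':key' => $row['KEY'],
            ));
        }
        $transaction->commit();
    } catch (Exception $e) {
        $transaction->rollback();
        throw $e;
    }
}
```

It would be called after each schema change, for example `refreshColumnCache(Yii::app()->db, 'MY_TABLE');` (the table name is a placeholder). WWW_TAB_CONS could presumably be refreshed in the same way from the references query.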
    protected function findColumns($table)
    {
        list($schemaName,$tableName) = $this->getSchemaTableName($table->name);

        // Read column metadata from the cache table instead of the data dictionary.
        $sql=<<<EOD
    SELECT Upper(COLUMN_NAME) as COLUMN_NAME,
           Upper(DATA_TYPE) as DATA_TYPE,
           NULLABLE,
           DATA_DEFAULT,
           KEY
    FROM www_tab_columns
    Where Upper(table_name) = Upper('{$tableName}')
    ORDER by column_id
    EOD;

        $command=$this->getDbConnection()->createCommand($sql);

        if(($columns=$command->queryAll())===array()){
            return false;
        }

        foreach($columns as $column)
        {
            $c=$this->createColumn($column);

            $table->columns[$c->name]=$c;
            if($c->isPrimaryKey)
            {
                if($table->primaryKey===null)
                    $table->primaryKey=$c->name;
                else if(is_string($table->primaryKey))
                    $table->primaryKey=array($table->primaryKey,$c->name);
                else
                    $table->primaryKey[]=$c->name;
                /*if(strpos(strtolower($column['Extra']),'auto_increment')!==false)
                    $table->sequenceName='';*/
            }
        }
        return true;
    }

    protected function findConstraints($table)
    {
        // Read foreign key metadata from the cache table.
        $sql=<<<EOD
    SELECT upper(COLUMN_NAME) As COLUMN_NAME,
           upper(TABLE_REF) As TABLE_REF,
           upper(COLUMN_REF) As COLUMN_REF
    FROM WWW_TAB_CONS
    WHERE upper(TABLE_NAME) = upper('{$table->name}')
    Order By POSITION
    EOD;

        $command=$this->getDbConnection()->createCommand($sql);
        foreach($command->queryAll() as $row)
        {
            $name = $row["COLUMN_NAME"];
            $table->foreignKeys[$name]=array($row["TABLE_REF"], $row["COLUMN_REF"]);
            if(isset($table->columns[$name]))
                $table->columns[$name]->isForeignKey=true;
        }
    }

    protected function findTableNames($schema='')
    {
        $sql='SELECT upper(table_name) as TABLE_NAME FROM www_tables';

        $command=$this->getDbConnection()->createCommand($sql);
        $rows=$command->queryAll();
        $names=array();

        foreach($rows as $row)
        {
            $names[]=$row['TABLE_NAME'];
        }

        return $names;
    }
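If you go the extension route rather than editing the framework file, one way to make Yii 1.1 (1.1.6 or later) use the extended class is the `driverMap` property of `CDbConnection`. The fragment below is a configuration sketch only: the class name `CachedOciSchema`, the connection string and the credentials are placeholders, and it assumes the class file lives somewhere Yii can autoload it (for example `protected/components`).

```php
<?php
// protected/config/main.php (fragment); all values are placeholders.
return array(
    // Assumption: CachedOciSchema.php is stored in protected/components.
    'import' => array(
        'application.components.*',
    ),
    'components' => array(
        'db' => array(
            'class' => 'CDbConnection',
            'connectionString' => 'oci:dbname=//localhost:1521/XE',
            'username' => 'app_user',
            'password' => 'secret',
            // Setting driverMap replaces the default map, so list every
            // driver this connection needs; here only "oci" is used.
            'driverMap' => array(
                'oci' => 'CachedOciSchema', // hypothetical class extending COciSchema
            ),
        ),
    ),
);
```

`CachedOciSchema` would then simply be declared as `class CachedOciSchema extends COciSchema` and contain the three overridden methods, leaving the framework's own COciSchema untouched across Yii upgrades.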
This is one possible solution. It was designed for our own project, so it might not satisfy other developers' needs or may require some slight changes. On the other hand, we were able to speed up Active Record roughly tenfold. For example, selecting 300 records (160 columns) took 0.24 seconds, while before (with Yii's original built-in COciSchema) the same query took around 2.45 seconds. Both tests were run without any caching component.
Total 6 comments
@yktoo - you could create a GitHub ticket and even submit a change request with your patch...
@trejder - about param binding in PDO OCI - there is a difference between named parameters, which are passed to the Oracle engine internally with OCIBindByName, and a query without params. Also, the Oracle query runner can cache execution plans for queries with named params and reuse them, which should speed up running similar queries. So using params is rather recommended...
Actually, after a quite simple optimization to COciSchema->findColumns() I managed to reduce time it takes to read in the metadata by about 50%.
For some reason the existing query uses a scalar subquery to determine whether a column belongs to the primary key, which is inefficient compared to an outer join.
Could anyone please update the method to use the following, completely equivalent, query, which makes a significant improvement with regard to execution time?
I use CFileCache. It works well in both Windows and Linux. I still can delete the cache files when I think necessary.
Using many queries and retrieving a lot of data is not a problem. The metadata query is made just once per table.
Caching metadata is always a good idea. But not caching result sets. Yes, when your data changes very often, caching is unnecessary and slows down performance, as you said. But this will happen even on MySQL or Postgres.
@rickgrana: First of all -- I'm no longer working on the project where I used Oracle. I'm not quite sure right now, but the problem was that we weren't able to find a good cache component for PHP running under both Windows and Linux (we were developing a localhost application to be deployed on many customers' servers).
Second of all -- using a cache was not a good idea at all in my project. It was meant to handle hundreds (if not thousands) of small SQL queries, exchanging millions of small chunks of data per second, with data changing very dynamically every second. If I'm not mistaken, using a cache in such a project would only slow down performance.
I was told many times that a cache isn't always the best solution in every project.
Sorry, but this article is unnecessary.
Just use cache - Yes it works (ALWAYS works!). Maybe you are misusing cache...
Cache will make queries EVEN faster than yours (The metadata will only be requested once and cached. Next queries will not request the metadata anymore).
Also Gii works - Yes , sometimes it fails the first time, but with cache, next attempts will work very well.
@sx9: Are you asking whether just adding parameter binding will speed up your queries (without using the above solution at all), or whether we tested our solution with parameter binding?
The answer to the second question is that this solution should speed up any query passed to the Oracle database. Parameter binding is done on the PHP side (in either Yii or PDO) and the result is a query quite similar to one not using parameter binding. I.e. whether you use it or not, it should not influence the speed when using our solution.
As for the first question: I can't say whether using parameter binding is faster than not using it, because I haven't had the opportunity to test it and because it is not really related to this article! :)