PHP objects vs arrays performance myth

During discussions with someone from Zend who is reviewing the proposed PHP annotations feature, the age-old "objects are slower than arrays" myth came up.

In some cases objects are actually faster than arrays, but generally they’re about as fast. There are even cases where arrays can be a painful performance bottleneck (think Drupal).

To prove my point, I hammered out a quick benchmark, which I’m posting here in case anyone cares.

My test case is based on the usage patterns of my annotation manager, so it involves a small amount of construction by a "factory" class, a lot of passing objects (vs arrays) around, and some access (mostly read) to their properties from various contexts.





<?php

define('NUM_INSTANCES', 10);
define('NUM_TESTS', 10000);

class TestObject
{
  public $a;
  public $b;
  public $c;

  public function __construct($a, $b, $c)
  {
    $this->a = $a;
    $this->b = $b;
    $this->c = $c;
  }
}

class Test
{
  public function getObjects()
  {
    $a = array();
    for ($i=0; $i<NUM_INSTANCES; $i++)
      $a[] = new TestObject('a','b','c');
    return $a;
  }

  public function getArrays()
  {
    $a = array();
    for ($i=0; $i<NUM_INSTANCES; $i++)
      $a[] = $this->buildArray('a', 'b', 'c');
    return $a;
  }

  public function buildArray($a, $b, $c)
  {
    $array = array();
    $array['a'] = $a;
    $array['b'] = $b;
    $array['c'] = $c;
    return $array;
  }

  public function useObject($o)
  {
    $a = $o->a;
    $b = $o->b;
    $o->c = 'xxx';
  }

  public function useArray($o)
  {
    $a = $o['a'];
    $b = $o['b'];
    $o['c'] = 'xxx';
  }
}

$test = new Test;

// Benchmark with objects:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getObjects();
  foreach ($x as $y)
    $test->useObject($y);
}
echo "\nObject time = " . (microtime(true)-$start) . "\n\n";

// Benchmark with arrays:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getArrays();
  foreach ($x as $y)
    $test->useArray($y);
}
echo "\nArray time = " . (microtime(true)-$start) . "\n\n";



This is probably not a fair comparison. The issue is in useArray() and useObject(). In useArray(), the write to $o['c'] forces a full copy of the array, while in useObject() only an object reference is copied.

While I have never read the PHP engine code, based on my experience I have the following impressions:

  • The implementation of a PHP object is close to that of an array (which may explain why their performance is close).

  • When assigning an array variable to another variable (including passing an array as a function parameter), PHP does not copy the whole array until the elements are modified through the new variable.

  • The memory used by PHP objects and arrays both depends on the string length of their property/key names.

PHP does not copy arrays until you modify them. Internally they are handled copy-on-write: the assignment only copies a reference, and an actual copy is created only at the moment you modify one.
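The copy-on-write behaviour is easy to observe with memory_get_usage() — a minimal sketch (the array size and variable names are arbitrary):

```php
<?php
// Build a reasonably large array.
$original = range(1, 100000);

$before = memory_get_usage();
$copy = $original;          // assignment: no element data is duplicated yet
$afterAssign = memory_get_usage();

$copy[0] = -1;              // first write: PHP now separates the two arrays
$afterWrite = memory_get_usage();

// The assignment costs almost nothing; the write triggers the real copy.
echo "assign cost: " . ($afterAssign - $before) . " bytes\n";
echo "write cost:  " . ($afterWrite - $afterAssign) . " bytes\n";
```

The write cost should dwarf the assignment cost, and $original is left untouched by the write to $copy.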

If you change useArray($o) to useArray(&$o), you will actually see a small increase in array time. PHP is full of little surprises like that :wink:
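The signature change is easy to try. A sketch — useArrayByValue and useArrayByRef are hypothetical stand-ins for the useArray() above:

```php
<?php
// Passing by value: thanks to copy-on-write, no element data is copied
// unless the function actually writes to the array.
function useArrayByValue($o) {
    $a = $o['a'];        // read: no copy
    $o['c'] = 'xxx';     // write: triggers a real copy inside the function
}

// Passing by reference: the write no longer triggers a copy, but PHP has
// its own bookkeeping for references — which is why this can come out
// slightly slower in practice.
function useArrayByRef(&$o) {
    $a = $o['a'];
    $o['c'] = 'xxx';     // mutates the caller's array
}

$arr = ['a' => 1, 'b' => 2, 'c' => 3];
useArrayByValue($arr);
echo $arr['c'], "\n";    // unchanged: 3
useArrayByRef($arr);
echo $arr['c'], "\n";    // mutated: xxx
```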

They are close — they both use hash tables.

If arrays are less performant, it’s probably due to lookups, and the fact that arrays work with variant keys - the keys can be strings or numbers, and there are some slight type-handling differences for the two, which must result in some amount of overhead.

With objects, the keys are always strings.
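A sketch of how you might probe the key-type difference — the key names and iteration counts here are arbitrary, and the timings will vary by machine and PHP version:

```php
<?php
define('N', 200000);

// Same values, once under integer keys and once under string keys.
$intKeys = [];
for ($i = 0; $i < 100; $i++) $intKeys[$i] = $i;

$strKeys = [];
for ($i = 0; $i < 100; $i++) $strKeys['k' . $i] = $i;

// Time repeated lookups against each.
$start = microtime(true);
for ($i = 0; $i < N; $i++) $x = $intKeys[42];
$intTime = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < N; $i++) $x = $strKeys['k42'];
$strTime = microtime(true) - $start;

printf("int keys: %.4fs  string keys: %.4fs\n", $intTime, $strTime);
```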

Another fun fact, regarding constants — you might think it’s more overhead to resolve the namespaced MyClass::MY_CONST than the global MY_CONST, because MyClass::MY_CONST requires two consecutive lookups. Maybe the implementations differ, or maybe it’s because the global namespace is already full of constants at startup that need to be searched — in any case, MY_CONST takes about twice the CPU power of MyClass::MY_CONST.
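That claim is easy to re-test on your own PHP version — a sketch, where MyClass and the constant value are made up for illustration:

```php
<?php
define('MY_CONST', 123);

class MyClass
{
    const MY_CONST = 123;
}

define('N', 1000000);

// Time repeated resolution of the global constant...
$start = microtime(true);
for ($i = 0; $i < N; $i++) $x = MY_CONST;
$globalTime = microtime(true) - $start;

// ...versus the class constant.
$start = microtime(true);
for ($i = 0; $i < N; $i++) $x = MyClass::MY_CONST;
$classTime = microtime(true) - $start;

printf("MY_CONST: %.4fs  MyClass::MY_CONST: %.4fs\n", $globalTime, $classTime);
```

Results will differ across PHP versions — newer engines cache constant lookups, so the gap described above may have shrunk or reversed.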

I have lots of weird benchmarks laying around. If I’m about to write code that makes heavy use of a particular feature, I often benchmark first - and the results are often quite surprising :slight_smile:

I found something even weirder. If you run the array test before the object test, you get a slower array time and a faster object time instead of the other way around.




// Benchmark with arrays:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getArrays();
  foreach ($x as $y)
    $test->useArray($y);
}
echo "\nArray time = " . (microtime(true)-$start) . "\n\n";

// Benchmark with objects:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getObjects();
  foreach ($x as $y)
    $test->useObject($y);
}
echo "\nObject time = " . (microtime(true)-$start) . "\n\n";



RATHER THAN




// Benchmark with objects:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getObjects();
  foreach ($x as $y)
    $test->useObject($y);
}
echo "\nObject time = " . (microtime(true)-$start) . "\n\n";

// Benchmark with arrays:
$start = microtime(true);
for ($i=0; $i<NUM_TESTS; $i++)
{
  $x = $test->getArrays();
  foreach ($x as $y)
    $test->useArray($y);
}
echo "\nArray time = " . (microtime(true)-$start) . "\n\n";



Maybe the difference is just PHP overhead, not an object-vs-array difference.


I actually did some more runs, and the results vary: sometimes arrays are faster, sometimes objects. But changing the order did make objects come out faster more often.

You should run these separately… and multiple times.

Good point.
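One way to follow that advice within a single script is to repeat each measurement and take the median — a sketch, where benchmark() is a made-up helper and the workload is a placeholder:

```php
<?php
// Repeat a benchmark several times and report the median, which is less
// sensitive to warm-up effects and scheduler noise than a single run.
function benchmark(callable $fn, int $runs = 5): float
{
    $times = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $fn();
        $times[] = microtime(true) - $start;
    }
    sort($times);
    return $times[intdiv(count($times), 2)]; // median
}

$median = benchmark(function () {
    $a = [];
    for ($i = 0; $i < 10000; $i++) $a[] = $i;
});
printf("median: %.6fs\n", $median);
```

Running each test in a separate PHP process is still the cleaner option, since it also rules out memory-state carryover between the two tests.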

Another reason why objects should be faster than arrays: every time you add a new item to an array, PHP has to allocate memory for the hash-table entry as well as the new value. With an object, all of its "indexes" (members) are predefined and can be allocated in one shot.
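The memory side of that argument can be checked with memory_get_usage() — a sketch, where the Point class and the counts are made up, and the exact numbers depend heavily on the PHP version:

```php
<?php
// Hypothetical class with the same three fields as the arrays below.
class Point
{
    public $a;
    public $b;
    public $c;
}

$before = memory_get_usage();
$objects = [];
for ($i = 0; $i < 1000; $i++) {
    $o = new Point;
    $o->a = $i; $o->b = $i; $o->c = $i;
    $objects[] = $o;
}
$objectBytes = memory_get_usage() - $before;

$before = memory_get_usage();
$arrays = [];
for ($i = 0; $i < 1000; $i++) {
    // Dynamic values, so the engine cannot fold the literal into one
    // shared constant array and skew the measurement.
    $arrays[] = ['a' => $i, 'b' => $i, 'c' => $i];
}
$arrayBytes = memory_get_usage() - $before;

printf("objects: %d bytes, arrays: %d bytes\n", $objectBytes, $arrayBytes);
```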

There are plenty of reasons why arrays could be slower than objects. The difference is probably so marginal that it barely makes sense to fuss over it in the first place, but I’ve heard the argument so many times that I just wanted to dispute the myth.

I’m always sceptical about how meaningful these “synthetic” benchmarks really are. You need to know a lot about PHP’s internals to avoid falling into a trap and only measuring the result of some hidden optimization technique. Take, for example, the concept of a zval in PHP (described nicely here). Knowing this, it’s easy to come up with a benchmark that leads to wrong assumptions about the speed of variable assignments. And I’m sure there are more clever things happening under the hood.

So I would say: ask a PHP core programmer. They probably know best in which situations arrays or objects are faster. :)