Array search performance

by alf

30 10 2008

You have probably often searched for a value in an array using the in_array() function. While reading php|architect’s Guide to PHP Security by Ilia Alshanetsky, I noticed that searching on values can be slower than searching on keys. I wanted to compare the performance of three ways of checking a value against a white list, and it is striking how much slower in_array() is. Look at the result:

fga@brian:~/tmp$ php5 test_array.php
With in_array :
111
time : 0.349022865295 sec
With isset on keys :
111
time : 0.000214099884033 sec
With array_key_exists :
111
time : 0.00021505355835 sec

In this benchmark, on an array containing 700 000 elements, searching on keys is more than 1000 times faster than in_array() (about 1600 times in the run above).

Here is the code of the test :

<?php
// array to store white list as values
$t = array();
// array to store white list as keys
$u = array();

// elements count
$max = 700000;

$first = $last = $middle = '';
for($i=$max-1;$i>=0;$i--){
        // values are random strings
        $id = uniqid(rand());
        $t[] = $id;
        $u[$id] = '';
        // we store the first, last and middle values
        // to search for them later
        switch($i) {
                case $max-1:
                        $first = $id;
                break;
                case 0:
                        $last = $id;
                break;
                case $max/2:
                        $middle = $id;
                break;
        }
}

echo "With in_array :\n";
$start_time = microtime(true);
echo in_array($first,$t);
echo in_array($last,$t);
echo in_array($middle,$t);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";

echo "With isset on keys :\n";
$start_time = microtime(true);
echo isset($u[$first]);
echo isset($u[$last]);
echo isset($u[$middle]);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";

echo "With array_key_exists :\n";
$start_time = microtime(true);
echo array_key_exists($first,$u);
echo array_key_exists($last,$u);
echo array_key_exists($middle,$u);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";
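
If the white list already exists as a list of values, one way to get the faster key lookup is to flip it once with array_flip() and then test with isset(). The following is only a minimal sketch; the list contents and the isAllowed() helper are made up for illustration:

```php
<?php
// Hypothetical white list stored as values
$allowed = array('red', 'green', 'blue');

// Flip it once: the values become keys, so later lookups are hash lookups
$allowedKeys = array_flip($allowed);

// Hypothetical helper: true if $value is a key of the flipped white list
function isAllowed($value, array $allowedKeys)
{
    return isset($allowedKeys[$value]);
}

var_dump(isAllowed('green', $allowedKeys));
var_dump(isAllowed('pink', $allowedKeys));
```

The flip itself walks the whole array, so this only pays off when the same list is searched many times.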


Autocomplete Ajax search with Dojo and Zend Framework

by regilero

26 10 2008

With the new Zend Framework 1.6 we have these nice Dojo widgets.

New things lack documentation most of the time. So if you want to build something really useful, like one of these nice autocomplete search comboboxes, this example could save you a lot of time.
We assume you already have Dojo installed and activated in your views, and that ACL checks are done elsewhere, in your controller plugins for example.

First let’s see the HTML code (in your view):
<script type="text/javascript">
    dojo.require("dojo.parser");
    dojo.require("dojox.data.QueryReadStore");
    dojo.require("dijit.form.ComboBox");
    dojo.require("dijit.form.FilteringSelect");
    dojo.require("custom.FindAutoCompleteReadStore");
    dojo.require("dijit.form.Form");
    dojo.require("dijit.form.Button");
</script>
<form id="Find_Form" action="/module/foo/edit" method="get" dojoType="dijit.form.Form">
<div dojoType="custom.FindAutoCompleteReadStore" jsId="NameStore" url="/module/foo/find/format/json" requestMethod="get"></div>
<label for="id" class="optional">Search for a name:</label>
<span class="formelement"><select name="id" id="FindByName" hasDownArrow="" store="NameStore" size="25" tabindex="99" autocomplete="1" dojoType="dijit.form.FilteringSelect" pageSize="10" ></select></span>
<span class="actionbuttons"><input id="Find_go" name="Find_go" value="Go:" type="submit" label="go:" dojoType="dijit.form.Button" /></span>
</form>

As we can see, you’ll need an additional custom JavaScript module:
custom.FindAutoCompleteReadStore
It is really simple to write: create your custom directory at the same level as the dojo and dijit directories, and create FindAutoCompleteReadStore.js in it like this:
dojo.provide("custom.FindAutoCompleteReadStore");
dojo.require("dojox.data.QueryReadStore");
dojo.declare("custom.FindAutoCompleteReadStore", dojox.data.QueryReadStore, {
    fetch:function (request) {
        request.serverQuery = { Find:request.query.name };
        // call superclass fetch
        return this.inherited("fetch", arguments);
    }
});

Now you’ll need to serve the Ajax query (sent by the Dojo store linked to our FilteringSelect or ComboBox): /module/foo/find/format/json
This is the method ‘findAction’ of the controller ‘foo’ in the module ‘module’.
But first let’s look at the preDispatch function of this controller, where we handle the format/json instruction to switch to Ajax mode:

public function preDispatch()
{
    $contextSwitch = $this->_helper->getHelper('contextSwitch');
    $contextSwitch->setAutoJsonSerialization( true );
    $contextSwitch->addActionContext('find', 'json');
    $contextSwitch->initContext();
}

So now let’s write the find function:

public function findAction()
{
    // handle filtering of received data
    // a '*' typed by the user becomes the SQL wildcard '%'
    $replacer = new Zend_Filter_PregReplace('/\*/', '%');
    // emulate alpha+num filter with some more characters enabled
    //**** http://www.regular-expressions.info/unicode.html ****
    // \p{L} --> letters of any language
    // \p{N} --> numeric chars of any language
    // \s --> whitespace
    // \x{0027} : APOSTROPHE
    // \x{002C} : COMMA
    // \x{0025} : PERCENT SIGN
    // \x{002D} : HYPHEN / MINUS
    // \x{005F} : UNDERSCORE
    // \. : DOT
    // (PCRE reads only two hex digits after a bare \x, hence the \x{...} form)
    $mylimit = new Zend_Filter_PregReplace('/[^\p{L}\p{N}\s\x{0027}\x{002C}\x{002D}\x{005F}\x{0025}\.]/u', '');
    $filters = array(
        '*'     => 'StringTrim'
        ,'Find' => array(
            'StripNewlines'
            ,$replacer
            ,$mylimit
            ,'StripTags'
        )
        ,'start' => 'Int'
        ,'count' => 'Int'
    );
    $validators = array();
    $input = new Zend_Filter_Input($filters, $validators, $_GET);
    $find = $input->getUnescaped('Find');
    if (empty($find)) $find = '%';
    $start = intval($input->getUnescaped('start'));
    if (empty($start)) $start = 0;
    $count = intval($input->getUnescaped('count'));
    if (empty($count)) $count = 3;
    // get the model, here you should adjust with the way you work
    // then make your query with limits
    $this->_modeltable = new My_Zend_Db_Table_Foo($this->db);
    $fieldid = 'my_id_field';
    $fieldident = 'my_name_field';
    $select = $this->_modeltable->select();
    $db = $this->_modeltable->getAdapter();
    $select->where($db->quoteInto($db->quoteIdentifier($fieldident).' LIKE ?', $find));
    $select->limit($count, $start);
    $rows = $this->_modeltable->fetchAll($select);
    $rowsarray = $rows->toArray();
    // build an id => name map for the autocomplete helper
    $finalarray = array();
    foreach ($rowsarray as $row)
    {
        $key = $row[$fieldid];
        $finalarray[$key] = $row[$fieldident];
    }
//Zend_Debug::dump($finalarray);
//die(__METHOD__);
    $this->_helper->autoCompleteDojo($finalarray);
}
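
To see what the filter chain on ‘Find’ is meant to do without loading Zend Framework, here is a rough plain-PHP equivalent. It is only a sketch: sanitizeSearchTerm() is a hypothetical name, and the character class is written with the \x{..} escape form, which is an assumption about the intent of the pattern above (letters, digits, whitespace, apostrophe, comma, hyphen, underscore, percent and dot are kept).

```php
<?php
// Hypothetical stand-alone version of the 'Find' filter chain
function sanitizeSearchTerm($raw)
{
    $term = trim($raw);                                  // StringTrim
    $term = str_replace(array("\r", "\n"), '', $term);   // StripNewlines
    $term = str_replace('*', '%', $term);                // '*' becomes the SQL wildcard
    // drop every character that is not explicitly allowed
    $term = preg_replace('/[^\p{L}\p{N}\s\x{27}\x{2C}\x{2D}\x{5F}\x{25}.]/u', '', $term);
    // preg_replace() returns null on invalid UTF-8; treat that like an empty term
    if ($term === null || $term === '') {
        return '%';
    }
    return $term;
}

echo sanitizeSearchTerm("  O'Br*  "), "\n"; // prints O'Br%
```

Because ‘<’ and ‘>’ are not in the allowed set, tags are already neutralised before a StripTags step would run.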

And that should be enough, phew.
But… there is one remaining problem. We put the autocomplete search inside a form, and we wanted the ‘go’ button to send a request to something like this:

/module/foo/edit/id/1245 OR /module/foo/edit?id=1245

But we’ll have something like:

/module/foo/edit?id=THE NAME

too bad…

To get it done I had to change one thing in the Zend Framework library, in the Zend/Controller/Action/Helper/AutoCompleteDojo.php helper:

62 public function prepareAutoCompletion($data, $keepLayouts = false)
63 {
64     $items = array();
65     foreach ($data as $key => $value) {
66         $items[] = array('label' => $value, 'name' => $value, 'key' => $key);
67     }
68     $final = array(
69         'identifier' => 'key',
70         'items' => $items,
71     );
72     return $this->encodeJson($final, $keepLayouts);
73 }

On line 66, ‘key’ is added to each item, and on line 69 ‘identifier’ is set to ‘key’ instead of ‘name’. ‘identifier’ is used by the Dojo FilteringSelect to decide which field is submitted with the form; for more info see the Dojo book page and search for ‘abbreviation’. There is also a Zend Framework bug report about this, where you can find other solutions or information on how it will be fixed later.
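
To make the shape of the response concrete, here is a plain-PHP sketch that builds the same structure as the modified helper and prints it as JSON. It does not use Zend Framework; buildDojoItems() and the sample names are made up for illustration:

```php
<?php
// Hypothetical stand-alone version of the modified prepareAutoCompletion()
function buildDojoItems(array $data)
{
    $items = array();
    foreach ($data as $key => $value) {
        // 'key' carries the record id, 'name' and 'label' the displayed text
        $items[] = array('label' => $value, 'name' => $value, 'key' => $key);
    }
    return array('identifier' => 'key', 'items' => $items);
}

// Made-up id => name map, like $finalarray in findAction()
$payload = buildDojoItems(array(1245 => 'Alice', 1246 => 'Bob'));
echo json_encode($payload), "\n";
// prints {"identifier":"key","items":[{"label":"Alice","name":"Alice","key":1245},{"label":"Bob","name":"Bob","key":1246}]}
```

With ‘identifier’ pointing at ‘key’, the value submitted by the form is the id (1245) rather than the displayed name.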



Drupal 5 modules release !

by pounard

6 10 2008

While working on a large Drupal site project, I had to write some custom modules fitting my needs. After some months, I finally have stable releases for most of them, so I intend to release them to the community.

read more



Forcing HTTP/1.0 Apache response when PHP is there…

by regilero

29 08 2008

Recently I had to force an HTTP/1.0 response with Apache because of a bad Java parser.

This parser/browser was asking for HTTP/1.1 responses but didn’t understand chunked transfer encoding, and so it gave me a nice SAX exception: “content not allowed in prolog”. So, well, I won’t fix this #$*%! code. Better to try talking to this special User Agent in HTTP/1.0; it might handle that better. Here’s what chunked content looks like. See the 306c hexadecimal length code before the body of the response?

HTTP/1.1 207 Multi-Status
(... lots of headers, but no length one ...)
Content-Type: text/xml; charset="utf-8"

306c
<?xml version="1.0" encoding="utf-8"?>
(... here the content ...)

So I have the user agent of this Java HTTP Client, let’s call it ‘NoobieJavaParser’.

I simply wrote this in my Apache virtualhost config file:

BrowserMatch "^NoobieJavaParser" nokeepalive force-response-1.0 downgrade-1.0

And that should be enough. In fact it is not, because of a very old PHP bug (I saw a first bug report from 2004).

PHP builds its $_SERVER variable by reading the Apache environment, and PHP does not want any dot in the names it parses. The ‘downgrade-1.0’ env name seems malicious to PHP. So in PHP the env setting looks like this:

echo $_SERVER['downgrade-1_0'];
-> 1

See, the dot is now a ‘_’. It should not hurt anyone, except that PHP changed this env name in Apache as well. So when Apache sends the response, it no longer cares about this downgrade-1_0 setting.
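
The same dot-to-underscore rewriting can be observed from the command line with parse_str(), which applies PHP’s usual rule for external variable names. This is only an illustration of the naming rule, assuming it is the same one applied to the Apache environment; it does not touch Apache at all:

```php
<?php
// Parse a query-string style input whose names contain dots
parse_str('downgrade-1.0=1&force-response-1.0=1', $vars);

// The dots in the names have been replaced by underscores
print_r(array_keys($vars));
```

The hyphens survive; only the dot is replaced.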

If you want Apache to behave as intended, i.e. to send HTTP/1.0 responses to this ‘NoobieJavaParser’ User Agent, you must set the Apache env again from PHP, with something like this:

// restore the dotted names that Apache actually looks at
if (!empty($_SERVER['downgrade-1_0'])) {
        apache_setenv('downgrade-1.0', 'true');
}
if (!empty($_SERVER['force-response-1_0'])) {
        apache_setenv('force-response-1.0', 'true');
}

Ugly, but it’s PHP’s fault, and there is no more chunked content after that. The nice thing is that apache_setenv does not change $_SERVER, so PHP still does not see this malicious dot.



Migrating a SPIP site to Drupal 1/3

by Pierre

11 08 2008
As part of my work, I had to redesign a site containing a large volume of data. The core of this redesign was migrating the site from one technology to another. The original technology is the SPIP CMS, the target one is the Drupal CMS. read more


Drupal and filters: beware of caches (and especially of absolute URLs)!

by Pierre

12 06 2008
In the course of my misadventures with Drupal, the Image module and the various behaviours of the caches, I finally found the real cause of my problems. To recap my previous post, my problem was that some images displayed in my Drupal content had a hard-coded URL, which caused errors when the site (and therefore the domain name) changed. To understand the problem you need to know the context, so here is what I have: read more


Wrapping text for Zend Pdf

by alf

4 06 2008

A common issue in Zend_Pdf is wrapping text in a box. I’ve found partial solutions, such as wrapping the text every 80 characters, but the line width varies with the font and the width of each character. Since we can’t rely on the character count unless we use a monospaced font, we have to wrap the text on the real box width.

Among these partial solutions, I found a function which computes the real width of a string according to the font and the font size. By putting the pieces together, I wrote my getWrappedText() method, which returns a string with \n in the right places:

protected function getWrappedText($string, Zend_Pdf_Style $style, $max_width)
{
    $wrappedText = '';
    $lines = explode("\n", $string);
    foreach ($lines as $line) {
        $words = explode(' ', $line);
        $wrappedLine = '';
        foreach ($words as $word) {
            // the line as it would look with the next word appended
            $candidate = ($wrappedLine === '') ? $word : $wrappedLine.' '.$word;
            /* if adding the new word doesn't make the line wider than
            $max_width, we add the word; a single word wider than the
            box is kept on a line of its own */
            if ($wrappedLine === ''
                || $this->widthForStringUsingFontSize($candidate
                    , $style->getFont()
                    , $style->getFontSize()) < $max_width) {
                $wrappedLine = $candidate;
            } else {
                // the line is full: flush it and start a new one
                $wrappedText .= $wrappedLine."\n";
                $wrappedLine = $word;
            }
        }
        $wrappedText .= $wrappedLine."\n";
    }
    return $wrappedText;
}
/**
 * found here, not sure of the author :
 * http://devzone.zend.com/article/2525-Zend_Pdf-tutorial#comments-2535
 */
 protected function widthForStringUsingFontSize($string, $font, $fontSize)
 {
     $drawingString = iconv('UTF-8', 'UTF-16BE//IGNORE', $string);
     $characters = array();
     for ($i = 0; $i < strlen($drawingString); $i++) {
         $characters[] = (ord($drawingString[$i++]) << 8 ) | ord($drawingString[$i]);
     }
     $glyphs = $font->glyphNumbersForCharacters($characters);
     $widths = $font->widthsForGlyphs($glyphs);
     $stringWidth = (array_sum($widths) / $font->getUnitsPerEm()) * $fontSize;
     return $stringWidth;
 }

Then you can draw the text easily:

$y = 700;
$lines = explode("\n",$this->getWrappedText($text,$style_text,400)) ;
foreach($lines as $line)
{
    $page2->drawText($line, 140, $y);
    $y-=15;
}
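
The wrapping logic can be tried without Zend_Pdf by passing the width measurement in as a callback. The sketch below is a stand-alone variant, not the method above: wrapByWidth() is a hypothetical name, and strlen stands in for the real font-based width, so the widths are character counts rather than points.

```php
<?php
// Greedy word wrap; $measure is any callable returning the width of a string
function wrapByWidth($text, $maxWidth, $measure)
{
    $out = array();
    foreach (explode("\n", $text) as $line) {
        $current = '';
        foreach (explode(' ', $line) as $word) {
            // the line as it would look with the next word appended
            $candidate = ($current === '') ? $word : $current . ' ' . $word;
            if ($current === '' || call_user_func($measure, $candidate) <= $maxWidth) {
                $current = $candidate;
            } else {
                // line is full: flush it and start a new one with this word
                $out[] = $current;
                $current = $word;
            }
        }
        $out[] = $current;
    }
    return $out;
}

$lines = wrapByWidth("the quick brown fox", 10, 'strlen');
print_r($lines); // two lines: "the quick" and "brown fox"
```

Swapping 'strlen' for a callback that wraps widthForStringUsingFontSize() would give the font-aware behaviour.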


Apache Virtualhost generator

by alf

2 04 2008

I currently don’t have much time to work on Wisss :-( However, I will write a tiny DSL and a generator that produces a virtualhost file (independently of Wisss, which already generates a vhost). I’ve already done this with a shell script, but the new tool will be more powerful and easier to maintain, for a few hours of work.

The initial need is for my own server, but it will also be useful for my company. The goal is to provide our best practices for vhosts in a tool.



Vertimus 1.0

by Stéphane

22 03 2008

GNOME translation teams,
Vertimus is the perfect tool to follow each translation, translate, proofread and enhance the quality of contributions.

https://launchpad.net/vertimus

===================================
Overview of changes in Vertimus 1.0
===================================
tab: vertimus-1-0 (2008-03-21)

* XHTML and CSS pass the W3C validator
* New tool to download all PO files of a release
* Show information about other modules with the same name in the module page (Javascript)
* Move old files in a backup directory
* Add some maintenance scripts
* Display the authorized extensions in module page
* Add a search tool
* Reduce require_once calls
* Add information about APC installation
* FIX Backup of files
* FIX No error message when the extension file isn’t valid
* FIX Web site title translation
* FIX call to date_default_timezone_set
* Fix #189903 - RSS feed error

New and last release written in PHP! Take it while it’s hot! The next version will be in Python, to integrate features from Damned-lies and Transifex.



Performance of a PHP application with APC

by Stéphane

9 03 2008

To evaluate the performance of Vertimus with an opcode cache like APC, I used Xdebug and KCachegrind.
The results are really interesting. Without APC, the index page has a total time cost of 191 032: the Zend Framework requires many classes, and PHP is not really fast at parsing and executing this code:

Vertimus without APC

and with APC, the total time cost is only 123 904:
Vertimus with APC

The CPU load is reduced by about 35%, but you need a bit of memory to store the cache data (30 MB by default). The results were obtained with APC 3.0.16 and the following configuration:
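
The reduction can be recomputed from the two total time costs quoted above. A small sketch of the arithmetic, with variable names chosen for illustration:

```php
<?php
// Total time costs reported by KCachegrind for the index page
$withoutApc = 191032;
$withApc    = 123904;

// Relative reduction, in percent
$reduction = ($withoutApc - $withApc) / $withoutApc * 100;
printf("%.1f%%\n", $reduction); // prints 35.1%
```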

;;;;;;;;;;;;;;;;;;
; APC ;
;;;;;;;;;;;;;;;;;;
extension=apc.so
apc.enabled=1
apc.shm_segments=1
apc.shm_size=30
apc.ttl=7200
apc.user_ttl=7200