Howto install OpenOffice 3 on Ubuntu 8.10 Intrepid

by pounard

4 11 2008
Since Ubuntu does not provide supported packages for OpenOffice 3.0, here is a simple method to install OOo3.


Prototype JavaScript Framework: JavaScript made easy

by sylvain

3 11 2008
Prototype.js makes it easier to use Ajax and to manipulate the DOM in your web applications. This article introduces the library and links to many external resources.


OpenERP: a quick overview

by sylvain

3 11 2008
OpenERP is an ERP distributed free of charge under a free license (GPL). Developed to meet the complex and evolving needs of fast-growing companies, it is both flexible and powerful. With more than 200 modules, its very broad functional scope covers most business needs.


6 nice things not well enough known about PostgreSQL

by regilero

30 10 2008

With the new PostgreSQL server versions in place (8.2 and 8.3), and more generally with the 8.x series, some nice features have been added. Let’s have a short look at some interesting ones:

1) WITH (FILLFACTOR=50) in CREATE TABLE instructions (since 8.2):
FILLFACTOR is 100% by default, which is a good setting for tables that mostly see INSERTs (and SELECTs). But when you know that you’ll run a lot of UPDATEs on your rows, you should decrease this factor. That way some free space is reserved in each page next to your inserted rows. This space is then used as a work zone when you UPDATE a row. And the magic effect is that the new row version is not written at the end of the table but near the original row, in the same page. See the PostgreSQL documentation page for details.

2) RETURNING on INSERT INTO to get your inserted Id (since 8.2):
The classical way to get your ‘last insert Id’ in PostgreSQL has always been to use currval(SEQUENCE).
This is correct and safe, as PRIMARY KEYs are usually backed by SEQUENCEs with DEFAULT nextval(SEQUENCE), and currval returns the last value set by nextval in the current session (other concurrent sessions cannot interfere with it). But that’s not easy to understand for newbies, and very bad examples using max(id) can always be found by googling around. Now you can add a RETURNING MyId clause to your INSERT query, and the result of your insert won’t be the row OID anymore but your Id (or anything else if you want). Consult the PostgreSQL documentation page for details.
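As a rough illustration of point 2, here is a minimal Python sketch, assuming a DB-API cursor such as the one psycopg2 provides. The customers table and its id and name columns are hypothetical names made up for the example.

```python
def insert_customer(cursor, name):
    """Insert a row and return the id generated for it.

    `cursor` is assumed to be a DB-API cursor connected to PostgreSQL 8.2+.
    The table and column names are hypothetical.
    """
    cursor.execute(
        "INSERT INTO customers (name) VALUES (%s) RETURNING id",
        (name,),
    )
    # RETURNING makes the INSERT produce a result row, like a SELECT would.
    return cursor.fetchone()[0]
```

The id comes back in the same round trip as the INSERT, so no separate currval() query is needed.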

3) TOAST FIELDS:
TOAST means ‘The Oversized-Attribute Storage Technique’. You can store up to 1 GB in one field of your row. Such a big value won’t be saved in the same physical file as the other columns: another file is created to store it. The PostgreSQL documentation page is still the best reference.
If you worry about the size of your tables and of the physical files on your filesystem, you should not. Your tables are always split into files of at most 1 GB (the default segment size), and TOAST values are stored in their own files.

4) TABLE INHERITANCE:
You can define a table B as a child of table A. Queries on table A will then return rows from both A and B.
With the ONLY keyword you can restrict a query on A to A’s own rows. A can have several child tables (B, C, D, etc.). Indexes are built table by table, and are therefore smaller. This is quite powerful, but you’ll have some problems with constraints. UNIQUE constraints, for example, are enforced per table: you cannot ensure that rows of A+B+C+D will not share the same value for this ‘UNIQUE’ constraint. Setting referential integrity from one of these tables to a table Z is easy (but must be done for each table). But setting the reverse relation from Z to A+B+C+D isn’t possible. You should really look at the PostgreSQL documentation page, as always.

5) TABLE PARTITIONING:
One of the most powerful things you can do with INHERITANCE is table PARTITIONING. Using TABLESPACEs you can define several different physical storage locations for your databases. TABLESPACEs can easily be used for a database, a table, or even an index (or the WAL sync log). This is fine. You can use several storage devices with different characteristics, each adapted to your different needs (capacity, speed, sync/async, etc.). But combined with INHERITANCE this becomes even more powerful:
Define table A as an empty table.
Define tables B and C as child tables of A, and use different tablespaces for B and C. You then have a virtual table A with its content spread over different storage devices (or not: you could put the TABLESPACEs on the same storage, but you’d lose most of the power of the thing). Your benefits? Smaller indexes, on different devices, which can work in parallel. You get the same problems with constraints as in point 4), but this is not a problem for all tables, and for a huge table this TABLESPACE splitting could be a cool thing to study. Have a look at the PostgreSQL documentation page. One last point: you’ll have to define how the rows are split between the different tables (ranges, domains, or anything else), and you may have to look at RULEs as well, even with simple INHERITANCE, because an INSERT, for example, should be done on the child table, so an INSERT on the main table has to be redirected to the right child.
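To make the A/B/C layout of point 5 a bit more concrete, here is a small Python sketch that only builds the CREATE TABLE statements for the child tables. The table names, tablespace names and CHECK expressions are hypothetical, and identifiers are not quoted, so treat it as an illustration rather than a migration tool.

```python
def partition_ddl(parent, partitions):
    """Build CREATE TABLE statements for child tables of `parent`.

    `partitions` is a list of (child_name, tablespace, check_expression)
    tuples. Names are inserted as-is (no quoting), which is only
    acceptable for trusted, hand-written input.
    """
    statements = []
    for child, tablespace, check in partitions:
        statements.append(
            "CREATE TABLE %s (CHECK (%s)) INHERITS (%s) TABLESPACE %s;"
            % (child, check, parent, tablespace)
        )
    return statements


# Hypothetical example: one child per year, each on its own tablespace.
for statement in partition_ddl("measurements", [
    ("measurements_2007", "slow_disk", "logdate < DATE '2008-01-01'"),
    ("measurements_2008", "fast_disk", "logdate >= DATE '2008-01-01'"),
]):
    print(statement)
```

The CHECK constraint on each child documents which rows belong there. Routing INSERTs on the parent to the right child still has to be done separately, with RULEs for example.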

6) NOTIFY/LISTEN:
PostgreSQL has a built-in feature for the Observer/Observable design pattern. You can NOTIFY something with an SQL command, and at the end of your transaction (or immediately if you’re not in a transaction) the other SQL sessions which have registered for this notification with LISTEN will receive it (see the documentation). This is useful with server processes (‘while true’ processes), for example a CLI process in PHP using the built-in pg library, though not with PDO at the moment. There are also a Java example and examples in Python (the demo2a/b files).



Array search performance

by alf

30 10 2008

You have probably often searched for a value in an array using the in_array() function. Reading php|architect’s Guide to PHP Security by Ilia Alshanetsky, I noticed that searching on values can be slower than searching on keys. I wanted to compare the performance of three methods for checking a value against a white list, and it’s amazing how much slower in_array() is. Look at the result:

fga@brian:~/tmp$ php5 test_array.php
With in_array :
111
time : 0.349022865295 sec
With isset on keys :
111
time : 0.000214099884033 sec
With array_key_exists :
111
time : 0.00021505355835 sec

Searching on keys is 1000 to 10000 times faster according to this benchmark (about 1600 times in the run above), on an array containing 700 000 elements.

Here is the code of the test :

<?php
// array to store white list as values
$t = array();
// array to store white list as keys
$u = array();

// elements count
$max = 700000;

$first = $last = $middle = '';
for($i=$max-1;$i>=0;$i--){
        // values are random strings
        $id = uniqid(rand());
        $t[] = $id;
        $u[$id] = '';
        // we store the first, last and middle values
        // to search for them later
        switch($i) {
                case $max-1:
                        $first = $id;
                break;
                case 0:
                        $last = $id;
                break;
                case $max/2:
                        $middle = $id;
                break;
        }
}

echo "With in_array :\n";
$start_time = microtime(true);
echo in_array($first,$t);
echo in_array($last,$t);
echo in_array($middle,$t);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";

echo "With isset on keys :\n";
$start_time = microtime(true);
echo isset($u[$first]);
echo isset($u[$last]);
echo isset($u[$middle]);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";

echo "With array_key_exists :\n";
$start_time = microtime(true);
echo array_key_exists($first,$u);
echo array_key_exists($last,$u);
echo array_key_exists($middle,$u);
echo "\n";
echo "time : " . (microtime(true)-$start_time) . " sec";
echo "\n";
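The same effect shows up outside PHP. As a loose analogy (not a port of the benchmark above), here is a Python sketch comparing a membership test on a list, which scans the values, with a membership test on a set, which hashes the key. The element count and the names are arbitrary choices for the illustration.

```python
import timeit


def build_whitelists(count):
    """Return the same white list as a list (values) and as a set (keys)."""
    values = ["id%d" % i for i in range(count)]
    return values, set(values)


def time_lookup(container, needle, repeat=100):
    """Time `repeat` membership tests of `needle` in `container`."""
    return timeit.timeit(lambda: needle in container, number=repeat)


whitelist_list, whitelist_set = build_whitelists(100000)
needle = whitelist_list[-1]  # last element: worst case for the linear scan

print("list lookup: %f sec" % time_lookup(whitelist_list, needle))
print("set lookup : %f sec" % time_lookup(whitelist_set, needle))
```

The exact numbers depend on the machine, but the list lookup grows with the size of the white list while the set lookup stays roughly constant.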


Database performance in Web applications

by Stéphane

29 10 2008

It’s more efficient to connect a Web application to its database through
a Unix domain socket than through a TCP/IP one (reduced overhead), so
I’ll explain the required configuration for the following pairs:
1 - TurboGears/SQLAlchemy
2 - Django/PostgreSQL
3 - Django/MySQL

1 - TurboGears/SA

SQLObject is dead, isn’t it? So with SQLAlchemy, the syntax is:

sqlalchemy.dburi="postgres:///dbname?user=mydbuser&password=XXXXXX" ([1])
http://docs.turbogears.org/1.0/DatabasePostgres

2 - Django/PostgreSQL

You just need to define DATABASE_ENGINE = 'postgresql_psycopg2' and DATABASE_NAME. Leave the DATABASE_HOST setting empty to use a Unix domain socket (UDS).
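For reference, a settings.py fragment along those lines could look like the sketch below. It uses the old-style DATABASE_* setting names this post is written for; the database name, user and password are placeholders.

```python
# settings.py (old-style DATABASE_* settings; values are placeholders)
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'mydb'          # hypothetical database name
DATABASE_USER = 'mydbuser'      # hypothetical PostgreSQL role
DATABASE_PASSWORD = 'XXXXXX'
DATABASE_HOST = ''              # empty: connect through the Unix domain socket
DATABASE_PORT = ''              # empty: use the default port
```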

3 - Django/MySQL

Create a database in UTF8, either with default-character-set = utf8
under the [mysqld] section of the my.cnf file, or with an explicit
'create database bla charset=utf8;'

In settings.py:


DATABASE_HOST = '/var/run/mysqld/mysqld.sock'
DATABASE_OPTIONS = {
'read_default_file': '/etc/mysql/my.cnf',
'init_command': 'SET storage_engine=INNODB'
}

A - Note about PostgreSQL

When the user isn’t the same one who runs the process, you must edit the PostgreSQL configuration (/etc/postgresql/8.3/main/pg_hba.conf):
# "local" is for Unix domain socket connections only
local   database user          md5
local   all      all           ident sameuser

You must create a user with an encrypted password (passwords are encrypted by default).

$ createuser username
$ psql
postgres=# ALTER USER username WITH ENCRYPTED PASSWORD 'my_password';

If you want to be sure, remove the lines with ‘host’ to deny nonlocal connections.



Autocomplete Ajax search with Dojo and Zend Framework

by regilero

26 10 2008

With the new Zend Framework 1.6 we have these nice Dojo widgets.

New things lack documentation most of the time. So if you want to build something really useful, like these nice autocomplete search comboboxes, this example could save you a lot of time.
We assume you have Dojo already installed and activated in your views, and that ACL checks are done elsewhere, in your controller plugins for example.

First let’s see HTML code (in your view):
<script type="text/javascript">
    dojo.require("dojo.parser");
    dojo.require("dojox.data.QueryReadStore");
    dojo.require("dijit.form.ComboBox");
    dojo.require("dijit.form.FilteringSelect");
    dojo.require("custom.FindAutoCompleteReadStore");
    dojo.require("dijit.form.Form");
    dojo.require("dijit.form.Button");
</script>
<form id="Find_Form" action="/module/foo/edit" method="get" dojoType="dijit.form.Form">
<div dojoType="custom.FindAutoCompleteReadStore" jsId="NameStore" url="/module/foo/find/format/json" requestMethod="get"></div>
<label for="id" class="optional">Search for a name:</label>
<span class="formelement"><select name="id" id="FindByName" hasDownArrow="" store="NameStore" size="25" tabindex="99" autocomplete="1" dojoType="dijit.form.FilteringSelect" pageSize="10" ></select></span>
<span class="actionbuttons"><input id="Find_go" name="Find_go" value="Go:" type="submit" label="go:" dojoType="dijit.form.Button" /></span>
</form>

As you can see, you’ll need an additional custom JS module:
custom.FindAutoCompleteReadStore
It is really simple to write: create your custom directory at the same level as the dojo or dijit directories, and create FindAutoCompleteReadStore.js like this:
dojo.provide("custom.FindAutoCompleteReadStore");
dojo.require("dojox.data.QueryReadStore");
dojo.declare("custom.FindAutoCompleteReadStore", dojox.data.QueryReadStore, {
    fetch:function (request) {
        request.serverQuery = { Find:request.query.name };
        // call the superclass fetch
        return this.inherited("fetch", arguments);
    }
});

Now you’ll need to serve the requested Ajax query (requested by the Dojo store linked with our FilteringSelect or Combobox) : /module/foo/find/format/json
This is the method ‘findAction’ in the Controller ‘foo’ on module ‘module’.
But first let’s see the preDispatch function of this controller where we handle the format/json instruction to switch in Ajax mode:

public function preDispatch()
{
    $contextSwitch = $this->_helper->getHelper('contextSwitch');
    $contextSwitch->setAutoJsonSerialization( true );
    $contextSwitch->addActionContext('find', 'json');
    $contextSwitch->initContext();
}

So now let’s write the find function:

public function findAction()
{
    // filter the received data: turn the user's * wildcard into SQL %
    $replacer = new Zend_Filter_PregReplace('/\*/', '%');
    // emulate an alpha+num filter with some more characters enabled
    // see http://www.regular-expressions.info/unicode.html
    // \p{L} : letters of any language
    // \p{N} : numeric chars of any language
    // \s    : whitespace
    // \x27  : APOSTROPHE
    // \x2C  : COMMA
    // \x25 and % : PERCENT SIGN (escaped and literal)
    // \x2D  : HYPHEN / MINUS
    // \x5F  : UNDERSCORE
    // \.    : DOT
    // (PCRE reads only two hex digits after \x, so \x0027 would not
    // match an apostrophe; use \x27 instead)
    $mylimit = new Zend_Filter_PregReplace('/[^\p{L}\p{N}\s\x27\x2C\x2D\x5F\x25%\.]/u', '');
    $filters = array(
        '*'     => 'StringTrim'
        ,'Find' => array(
            'StripNewlines'
            ,$replacer
            ,$mylimit
            ,'StripTags'
        )
        ,'start' => 'Int'
        ,'count' => 'Int'
    );
    $validators = array();
    $input = new Zend_Filter_Input($filters, $validators, $_GET);
    $find = $input->getUnescaped('Find');
    if (empty($find)) $find = '%';
    $start = intval($input->getUnescaped('start'));
    if (empty($start)) $start = 0;
    $count = intval($input->getUnescaped('count'));
    if (empty($count)) $count = 3;
    // get the model; adjust this to the way you work,
    // then build your query with limits
    $this->_modeltable = new My_Zend_Db_Table_Foo($this->db);
    $fieldid = 'my_id_field';
    $fieldident = 'my_name_field';
    $select = $this->_modeltable->select();
    $db = $this->_modeltable->getAdapter();
    $select->where($db->quoteInto($db->quoteIdentifier($fieldident) . ' LIKE ?', $find));
    $select->limit($count, $start);
    $rows = $this->_modeltable->fetchAll($select);
    $rowsarray = $rows->toArray();
    // build an id => name array for the autocomplete helper
    $finalarray = array();
    foreach ($rowsarray as $row)
    {
        $key = $row[$fieldid];
        $finalarray[$key] = $row[$fieldident];
    }
//Zend_Debug::dump($finalarray);
//die(__METHOD__);
    $this->_helper->autoCompleteDojo($finalarray);
}

And that should be enough, phew.
But… there’s one remaining problem after that. We put the search autocomplete inside a form, and we wanted the ‘go’ button to send a request to something like this:

/module/foo/edit/id/1245 OR /module/foo/edit?id=1245

But we’ll have something like:

/module/foo/edit?id=THE NAME

too bad…

To get it done I had to change one thing in Zend Framework library on the Zend/Controller/Action/Helper/AutoCompleteDojo.php Helper:

62 public function prepareAutoCompletion($data, $keepLayouts = false)
63 {
64     $items = array();
65     foreach ($data as $key => $value) {
66         $items[] = array('label' => $value, 'name' => $value, 'key' => $key);
67     }
68     $final = array(
69         'identifier' => 'key',
70         'items' => $items,
71     );
72     return $this->encodeJson($final, $keepLayouts);
73 }

On line 66, ‘key’ is added to each item, and on line 69 ‘identifier’ is set to ‘key’ instead of ‘name’. ‘identifier’ is used by the Dojo FilteringSelect to decide which field is submitted with the form; for more info see the Dojo book page and search for ‘abbreviation’. There is also a Zend Framework bug report about this, where you can find other solutions or information on how it will be fixed later.
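To show the shape of the JSON the store receives after this change, here is a small Python sketch that mirrors the patched helper. It is only an illustration of the payload, not Zend Framework code, and the sample id and name are made up.

```python
import json


def prepare_auto_completion(data):
    """Build the dojo.data-style payload from an {id: name} mapping.

    Mirrors the patched PHP helper: each item carries its 'key', and
    'identifier' points at 'key' so the widget submits the id.
    """
    items = [
        {"label": value, "name": value, "key": key}
        for key, value in data.items()
    ]
    return json.dumps({"identifier": "key", "items": items})


# Hypothetical row: id 1245, name "Dupont"
print(prepare_auto_completion({1245: "Dupont"}))
```

With ‘identifier’ pointing at ‘key’, the form submits the id (1245 here) while the user still sees and searches on the name.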



Pylons, xmlrpc and doctest

by kiorky

21 10 2008

I’m currently developing an application around the XML-RPC protocol at work.

We are using Pylons for the framework part, and I spent this afternoon setting up a testing environment for doctests.

This test is a proof of concept: the code is extracted from our internal application, and it’s just a starter for you. The whole thing works with a few tweaks.


controllers/mycontroller.py, a simple controller doing simple stuff

class MyController(XMLRPCController):                                                                                                                                      
    """controller."""                                                                                                                                                   

    def index(self):
        return '\_o<'

 

lib/base.py, Please add the XMLRPCController import

lib/base.py:from pylons.controllers import WSGIController, XMLRPCController

 

Then we are all set to continue with the tests.

First of all, the doctest boilerplate:

tests/test_doctest_files.py

import doctest                                                                   
from doctest import DocFileSuite                                                      

from myproject.tests import setUp, tearDown
flags = (doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE | doctest.REPORT_ONLY_FIRST_FAILURE)

def test_suite():
    return DocFileSuite(
        "test.txt",     
        setUp = setUp,  
        tearDown = tearDown,
        optionflags = flags 
    )

 

setUp and tearDown play a central role, as they initialise the application.

As we can't use paste.fixture.TestApp objects with XML-RPC, because TestApp does not bind to a real network address, the idea is to:

  • launch the server somewhere in a thread
  • use it later, as usual, through xmlrpclib
  • declare it as a global to make the doctests easier to write
  • add a wrapper around url_for so that generated URLs point at the host and port the server is bound to
tests/__init__.py
                                                                                                                                                                                                                                                                            
import os                                                                                                                                                                                                                                                                        
import sys                                                                                                                                                                                                                                                                       
import re                                                                                                                                                                                                                                                                        
import threading                                                                                                                                                                                                                                                                 
from ConfigParser import ConfigParser                                                                                                                                                                                                                                            
from unittest import TestCase                                                                                                                                                                                                                                                    

import paste.fixture
import paste.script.appinstall
from paste.deploy import loadapp
from paste.httpserver import serve
from routes.util import url_for

here_dir = os.path.dirname(os.path.abspath(__file__))
conf_dir = os.path.dirname(os.path.dirname(here_dir))
test_file = os.path.join(here_dir, 'test.ini')

cmd = paste.script.appinstall.SetupCommand('setup-app')

cmd.run([test_file])

def setUp(test, *args, **kwargs):
    print "\t-----------------------------------------------------------------"
    print "\t---    Setting up database test environment, please stand by. ---"
    print "\t-----------------------------------------------------------------"
    config = ConfigParser()
    config.read(
        os.path.join(os.path.dirname(sys.argv[0]), '..', 'etc', 'config.ini')
    )

    infos = ConfigParser()
    infos.read(test_file)
    sinfos = infos._sections['server:main']
    wsgiapp = loadapp('config:test.ini', relative_to = here_dir)
    server = test.globs['server'] = serve(wsgiapp,
                                 sinfos['host'],
                                 sinfos['port'],
                                 socket_timeout=1,
                                 start_loop=False,
                                )
    t = threading.Thread(target=server.serve_forever)
    t.setDaemon(True)
    t.start()
    test.globs['app'] = paste.fixture.TestApp(wsgiapp)
    def url_for_wrapper(*args, **kwargs):
        lkwargs = {'protocol': 'http' ,'host':  "%s:%s" % (server.server_name, server.server_port)}
        lkwargs.update(kwargs)
        return url_for(*args, **lkwargs)
    test.globs['url_for'] = url_for_wrapper
    test.globs['url_for_orig'] = url_for

def tearDown(test):
    test.globs['server'].server_close()

class TestController(TestCase):
    def __init__(self, *args, **kwargs):
        wsgiapp = loadapp('config:test.ini', relative_to = here_dir)
        self.app = paste.fixture.TestApp(wsgiapp)
        TestCase.__init__(self, *args, **kwargs)


 

And finally, let's play with our doctest

tests/test.txt

>>> create_url = url_for(controller='mycontroller')

>>> import xmlrpclib
>>> s = xmlrpclib.Server(create_url)
>>> s.index()
'\\_o<'
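If you want to try the "server in a thread, client through xmlrpclib" idea without a Pylons project, here is a stand-alone sketch. It assumes Python 3 and uses only the standard library (xmlrpc.server and xmlrpc.client instead of the Pylons XMLRPCController and xmlrpclib), so it is an analogy of the setup above, not the same code.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def index():
    """Same toy method as the controller above."""
    return '\\_o<'


def start_server():
    """Start an XML-RPC server in a daemon thread and return it."""
    # Port 0 lets the operating system pick a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(index)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server


def stop_server(server):
    """Stop the serve_forever loop and release the socket."""
    server.shutdown()
    server.server_close()


server = start_server()
host, port = server.server_address
proxy = xmlrpc.client.ServerProxy("http://%s:%d/" % (host, port))
result = proxy.index()
stop_server(server)
print(result)
```

Binding to port 0 avoids clashes when several test runs happen on the same machine; the real port is read back from server.server_address, much like the url_for wrapper does above.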




[OFF] What if I were a philosophy teacher?

by pounard

19 10 2008

.. I think I would give Chinese quotations as essay topics!

Yes indeed: while wandering around the net, I spotted one or two that I find quite right, but everyone is free to have their own opinion on the subject.

Small note: they are all from Confucius (surprising, eh!).



JMeter, improving performance of a Plone web site

by toutpt

13 10 2008

Last week I made a push to improve the performance of a Plone-based web site. For performance testing I used JMeter, because I had watched Using open source tools for performance testing.

JMeter is really nice to use. Just launch its proxy, point your browser at it, and run through your test. Then you save it as XML and you can edit the test. So you can log in (it supports cookies), create content (with a ‘once only’ logic controller), browse content, and stress your server.

What I have learned from this about Plone is:

  • Do not use brains or any object in templates, or you will not be able to cache your logic code in the RAM cache. Use dicts that contain every string ready to be displayed in the templates.
  • How to use the RAM cache.
  • I can store acl_users in the RAM cache, and I was surprised to see the difference. Over 5 tabs visited, I hit the cache 278 times …
  • Archetypes is damn slow (about one second to set some attributes of an object in a BTree and reindex it).
  • CMFPlone.utils.createObjectByType does a reindexObject.
  • Do not add any index to the portal_catalog; use the binding done by archetype_tool to be able to use another index. I’m adding about one catalog tool per custom content type.
  • A query on the portal_catalog can take one second if you have, for example, a list of 100 paths (query['path'] = ['/first/path', '/second/path']) and more than 100 000 entries.

I learned many other things during the last week, and I’m now using stress tests during development.
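The "dicts of ready-to-display strings" advice can be sketched in plain Python. This is not the Plone RAM cache API: the Item class, the search function and the module-level dict used as a cache are all stand-ins invented for the illustration.

```python
class Item(object):
    """Stand-in for a catalog brain (hypothetical, not the Plone API)."""

    def __init__(self, title, url):
        self.title = title
        self.url = url


calls = []


def search(path):
    """Stand-in for an expensive catalog query; records each call."""
    calls.append(path)
    return [Item("Home", path + "/home"), Item("News", path + "/news")]


_cache = {}


def listing_for(path):
    """Return template-ready dicts for `path`, querying only once."""
    if path not in _cache:
        # Store plain strings only: they are safe to keep in a cache,
        # unlike brains or content objects tied to a request.
        _cache[path] = [
            {"title": str(item.title), "url": str(item.url)}
            for item in search(path)
        ]
    return _cache[path]


first = listing_for("/site")
second = listing_for("/site")
print(second, len(calls))
```

The second call is served from the cache, so the expensive query runs only once. A real cache would also need an invalidation rule, which is left out here.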