Bulk import of data into Plone using Talend + csvreplicata

Typical use case: you need to migrate an existing web site to Plone, and this web site stores its content in a relational database.

Talend ( http://www.talend.com ) is a well-known open-source ETL tool able to read, convert and store data from/to almost any format. Unfortunately, it cannot access a ZODB (the object database Plone stores its content in), so it cannot write into Plone directly.

Csvreplicata ( http://www.makina-corpus.org/project/csvreplicata and http://plone.org/products/csvreplicata ) is a Plone product able to import/export any Archetypes-based Plone content (including folder/sub-folder structures).

The principle is simple: we use Talend to read the original data and to produce csvreplicata-compliant CSV files, then we import them using csvreplicata.

Here are the steps:

- Using Talend, access the content tables of the relational database, either directly or through an intermediary format (XML, CSV, ...).

- Create a Talend job containing: the appropriate input component to read the original data (its schema can be guessed automatically by Talend), a tFileOutputDelimited component where you declare the target schema (basically the list of fields you want to fill in your AT content), and a tMap component to map the input schema to the output schema.

- Run the job.

- Import the result into your Plone site using csvreplicata (make sure you insert the appropriate csvreplicata header at the beginning of the file).
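
To make the shape of such a job more concrete, here is a minimal Python sketch of the same read / map / write flow, outside Talend. It is only an illustration under assumptions, not what Talend generates: the `articles` table, its columns, the target field names (`id`, `title`, `description`, `text`) and the `;` delimiter are all hypothetical, and the csvreplicata header lines are deliberately not produced here.

```python
import csv
import io
import sqlite3

# Hypothetical target schema: the AT fields we want to fill, in output order.
TARGET_FIELDS = ["id", "title", "description", "text"]

# Hypothetical mapping target field -> source column (the role tMap plays).
FIELD_MAP = {
    "id": "slug",
    "title": "headline",
    "description": "summary",
    "text": "body",
}


def export_rows(conn, out, delimiter=";"):
    """Read the source table and write one delimited row per record.

    Returns the number of rows written. No header line is written.
    """
    writer = csv.writer(
        out, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    count = 0
    for row in conn.execute("SELECT * FROM articles"):
        values = []
        for field in TARGET_FIELDS:
            value = row[FIELD_MAP[field]]
            # Write NULL columns as empty strings.
            values.append("" if value is None else value)
        writer.writerow(values)
        count += 1
    return count


def demo():
    """Build a tiny in-memory source database and export it."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    conn.execute(
        "CREATE TABLE articles (slug TEXT, headline TEXT, summary TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT INTO articles VALUES (?, ?, ?, ?)",
        ("welcome", "Welcome", None, "<p>Hello; world</p>"),
    )
    out = io.StringIO()
    count = export_rows(conn, out)
    conn.close()
    return count, out.getvalue()
```

Quoting every field keeps values that contain the delimiter (like the `;` in the sample body) intact. In the real job, Talend's input component, tMap and tFileOutputDelimited take care of these three stages for you.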

Easy, clean, and fast!

A few tricks:

- you can paste your csvreplicata header into a file, and use the tFileCopy component to initialize your CSV output before inserting the content,

- if your original content contains cross links, you can easily convert them into the corresponding valid Plone URLs using a simple tReplace component,

- file attachments and images can be imported using csvreplicata too; you just need to put them into a zip file,

- encoding is not a problem (just select the right one in both Talend and csvreplicata).
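
The header, cross-link and encoding tricks can be sketched in a few lines of Python, to show the idea independently of tFileCopy and tReplace. Everything specific here is an assumption made for illustration: the old-site link pattern `page.php?id=N`, the id-to-path lookup table, the file names, and the `latin-1` source encoding. The header file is treated as opaque text; its actual content has to be whatever csvreplicata expects for your content types.

```python
import re
from pathlib import Path

# Hypothetical pattern for links on the old site, e.g. "page.php?id=42".
OLD_LINK = re.compile(r"page\.php\?id=(\d+)")


def rewrite_links(text, id_to_path):
    """Replace old-site links with Plone paths; unknown ids are left as-is."""
    def repl(match):
        return id_to_path.get(match.group(1), match.group(0))
    return OLD_LINK.sub(repl, text)


def build_import_file(header_path, rows_path, target_path, id_to_path,
                      source_encoding="latin-1", target_encoding="utf-8"):
    """Start the output with the header file, then append the converted rows.

    The rows are decoded with source_encoding and the result is written
    with target_encoding, so the import file has a single known encoding.
    """
    header = Path(header_path).read_text(encoding=target_encoding)
    rows = Path(rows_path).read_text(encoding=source_encoding)
    if not header.endswith("\n"):
        header += "\n"
    Path(target_path).write_text(
        header + rewrite_links(rows, id_to_path), encoding=target_encoding
    )
```

A call such as `build_import_file("header.csv", "rows.csv", "import.csv", {"42": "/news/welcome"})` would then produce a file ready to be imported, with the same encoding selected on the csvreplicata side.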

Thanks for taking the time to write this up. I learned a lot about Plone. I am excited to see more comparisons.

You can read this article
http://bygsoft.wordpress.com/2010/01/18/howto-argouml-and-archgenxml/