betabug... Sascha Welter

22 November 2011

Notes on Zope 2 migration

Remove it, replace it, renew it

Have an old Zope site around? Want to migrate it? Have to decide what to do with it? Given that Zope appears to be dropping in hotness a bit lately, these points come up more often now. There have been a few questions about this on the #zope IRC channel lately. So I'm writing down a few thoughts and FAQs on it.


So, what do you want to do?

Obviously starting with a good description of what you are trying to accomplish would be good. It's not always how it works though, you might just have been handed this old server and been told to "do something about it". Now you're looking at your options to see what you can actually do. On other days I've started out with one task description and had to change plans along the way, based on facts and dead ends. I've had to write small helper products, fix old bugs, semi-disable features. Other times "get the data out" is all of the description. In all these cases, start by checking what you really have and find out where you can go from there. Do you still need to be able to edit stuff? Do you only want an archive? Give yourself a description, and when asking for help, share your task description with me or other helpers.

What do you have?

"We have Zope" is often not good enough a description of the status quo. Zope is an application server and getting your data "out" involves knowing what application is there and holds your data. Even for which technology was used in whatever application there are a few options.

In quite a few cases, people asked to do the migration or transition are unsure about what they have. The classic story being an administrator with no prior exposure to Zope being handed a black box server machine. Here are some places to check from my own experience with being faced with an unknown "wonder box".

Find out the Zope version

Ask me for support and one of the very first questions I will ask you is what Zope version you have. So when you ask support questions or describe what your target is, you might as well include the Zope version that is used on your server right away.

But then, you might not know it right away. The first and easiest place to look is inside the /control_panel in the ZMI. To get into the ZMI you usually need an administrator account. If you don't have one, but you have access to the file system, you can set up a new admin account.

If you don't have access to the ZMI, you can look into the "Instance" folder structure. The script bin/zopectl usually has a pointer to where the Zope code is pulled from... and that should help you to find out which version you have. Typically you will get a version number like 2.7 or 2.10. Zope versions below 2 are not out in the wild any more. You might find Zope 3, but it's a totally different beast and pretty much everything in this article does not apply to it.
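As a rough sketch of that lookup: the snippet below scans the text of a zopectl script for shell variable assignments that point at the Zope code. The variable names (ZOPE_HOME, SOFTWARE_HOME, PYTHON) and the sample script are assumptions based on what such scripts often look like; your zopectl may be laid out differently, so treat this as a starting point only.

```python
import re

# Matches simple shell assignments such as ZOPE_HOME="/opt/Zope-2.9.8"
ASSIGNMENT = re.compile(r'^\s*(?:export\s+)?([A-Z_]+)=["\']?([^"\'\s]+)["\']?')


def find_zope_paths(script_text, names=("ZOPE_HOME", "SOFTWARE_HOME", "PYTHON")):
    """Return {variable: value} for the interesting assignments in the script."""
    found = {}
    for line in script_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and match.group(1) in names:
            found[match.group(1)] = match.group(2)
    return found


# Made-up example of what a zopectl script might contain (an assumption).
sample = '''#! /bin/sh
PYTHON="/usr/local/bin/python2.4"
ZOPE_HOME="/opt/Zope-2.9.8"
INSTANCE_HOME="/var/zope/instance1"
SOFTWARE_HOME="/opt/Zope-2.9.8/lib/python"
export INSTANCE_HOME SOFTWARE_HOME
'''

paths = find_zope_paths(sample)
for name in sorted(paths):
    print(name, "=", paths[name])
```

Quite often the version is already visible in the path itself. If it isn't, the directory the path points to is the next place to look.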

Check the HTML

To continue to find out what you have, check the HTML source that is generated. For example a Plone site will usually mention so in the "generator" meta-tag (and quite often in a "powered by" comment at the bottom of the page). Even if it's not Plone and even if someone went out of their way to hide a CMS app used, there might be hints that would tell me what it is, maybe in something as small as a link to a CSS or other helper files.
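If you have a lot of pages to check, a small script can pull out the "generator" meta-tag for you. This is a minimal sketch using only the Python 3 standard library, meant to run on your own machine and not on the old server. The sample page is made up for illustration; in practice you would feed in the HTML you fetched from the site.

```python
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Collect the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "generator":
            self.generators.append(attributes.get("content") or "")


def find_generators(html_text):
    finder = GeneratorFinder()
    finder.feed(html_text)
    return finder.generators


# Made-up sample page, only for demonstration.
sample_page = """<html><head>
<meta name="generator" content="Plone - http://plone.org" />
<link rel="stylesheet" href="/portal_css/base.css" />
</head><body></body></html>"""

generators = find_generators(sample_page)
print(generators)
```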

Check in the ZODB itself, using the "ZMI"

When you log in to the "Zope Management Interface" - by adding /manage to the end of the URL of the app - you get to see the objects that are in the database. The "meta types" of the objects are displayed along with the ids. This can give you a hint on what basis the application was built.
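On a big site, clicking through the ZMI gets tedious, so here is a sketch that counts meta types over a whole tree. It assumes folder-like objects offer objectValues() and that objects carry a meta_type attribute, as Zope 2 folders and items normally do. The Fake classes are stand-ins so the snippet runs without Zope; on a real server you would pass in the root object (for example the app object you get in a "zopectl debug" session) instead.

```python
def count_meta_types(container, counts=None):
    """Recursively count objects per meta type below a folder-like object."""
    if counts is None:
        counts = {}
    for obj in container.objectValues():
        # In Zope, look at the unwrapped object so that objectValues is not
        # simply acquired from the parent folder. Plain objects have no aq_base.
        base = getattr(obj, "aq_base", obj)
        meta_type = getattr(base, "meta_type", "unknown")
        counts[meta_type] = counts.get(meta_type, 0) + 1
        if hasattr(base, "objectValues"):
            count_meta_types(obj, counts)
    return counts


# Stand-ins for Zope objects, only for demonstration.
class FakeObject(object):
    def __init__(self, meta_type):
        self.meta_type = meta_type


class FakeFolder(object):
    meta_type = "Folder"

    def __init__(self, children):
        self.children = children

    def objectValues(self):
        return self.children


root = FakeFolder([
    FakeObject("Page Template"),
    FakeObject("Script (Python)"),
    FakeFolder([FakeObject("DTML Method"), FakeObject("Page Template")]),
])

meta_type_counts = count_meta_types(root)
for meta_type in sorted(meta_type_counts):
    print(meta_type, meta_type_counts[meta_type])
```

A long list of unfamiliar meta types usually points to filesystem based Products, which you can then look for in the Instance.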

Check in the "Products" folder of the "Instance"

A typical Zope installation has a folder structure called the "Instance". Inside that is a folder called "Products". Inside that might be a bunch of other folders, so-called Zope Products or "filesystem based products". You can get information on what's used from the names and included documentation. Be aware though that there might be various Products installed and not used. You might have to cross check with what's in the database.

As a rule of thumb on a maintained system, you probably need all that is in the Products folder. On systems that were handled by various admins over time, I've seen a lot of products installed that nobody really used and that are not relevant to the migration task at hand. There is no quick and obvious way to find out though.
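To get an overview to cross check against, a sketch like the following lists the product folders and picks up a version string where one is lying around. That products ship a version.txt file is an assumption (many do, some don't), and the demo folder names are invented.

```python
import os
import tempfile


def list_products(products_dir):
    """Return {product name: version string or None} for a Products folder."""
    products = {}
    for name in sorted(os.listdir(products_dir)):
        path = os.path.join(products_dir, name)
        if not os.path.isdir(path):
            continue
        version = None
        # Assumption: the product keeps its version in a small text file.
        for candidate in ("version.txt", "VERSION.txt"):
            version_file = os.path.join(path, candidate)
            if os.path.isfile(version_file):
                with open(version_file) as handle:
                    version = handle.read().strip()
                break
        products[name] = version
    return products


# Demo on a throwaway directory with invented product names.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "ExampleForum"))
    os.makedirs(os.path.join(demo_dir, "ExampleBlog"))
    with open(os.path.join(demo_dir, "ExampleForum", "version.txt"), "w") as handle:
        handle.write("1.2.3\n")
    demo_products = list_products(demo_dir)

print(demo_products)
```

Comparing these names with the meta types that actually show up in the database is one way to spot Products that are installed but unused.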

Find out the technology used for the application

There were always a variety of options for people building applications on Zope. What you have could be one of them exclusively, but much more likely it would be a mix. You even might have multiple applications running on your application server, made with different technology each. Or you might have something totally stock, but extended using one of the other options.

TTW

Something that happened a lot during the first years of Zope were applications that were developed with so called "Through The Web" technology. Here templates and scripts (and other building blocks) are added from the ZMI in a web browser into the database. They are then configured and code is added right in the browser. An application might consist totally of those elements, but it also might be combined with objects that are based on filesystem based Products.

Some typical building blocks were:

  • DTML Methods
  • Page Templates
  • Python scripts (labelled "Script (Python)")

If you see a lot of these, you might have an application that was built TTW. In this mode, everything can be in the database: The data itself as well as the code, the templates, and images.

TTW applications are sometimes easy to port to newer Zope versions, since they use really simple technology. The hard parts would be where the makers went beyond TTW and combined it with other technologies. Another problem is that TTW applications often got messy real quick, you could be in for a puzzle if you need to find out how the app actually works.

External methods

Applications built TTW ("through the web") have some security limitations. One method to get around those limitations is to use "External methods". These are represented in the database as "External Method" objects and in the file system Zope Instance there is a folder called "Extensions" where the actual Python files are stored. A typical pattern was to have a TTW application with some advanced parts handled by External methods. Depending on what your migration task actually is, you might not care about what's in there or you might have to hunt down what the code in the external methods was doing.

Zope Products, home grown

Your application might have been developed with more "normal" Python tools on the file system, in so called Zope "Products". As described above, the code for these is found in the Instance/Products directory. These products are basically Python modules with some extra added stuff to comply with the way the Zope application server wants to do things. The kind of objects that are used inside the Zope app are defined as classes, based on a range of base classes.

In the ZODB, visible from the ZMI, you will find so called "instances" of these products. They have small icons and are defined by "meta type" strings.

If you have a custom application, there might be one or there might be many such product folders inside the Instance/Products folder. Some of them might be the custom parts, specially coded for the app you're looking at, while others were stock products that were widely used.

Widely used stock Zope Products

There were (and still are!) a bunch of readily available products to run on Zope: CMS, forum, blog, whatever software. Some offer a lot of functionality and they "are your application" itself, others are smaller tools used - or they are "base" objects that your app might be using internally.

When you identify them, sometimes you can find newer versions (helpful if your task is to move to a newer Zope version). Some of them are still fully maintained, some have been abandoned, things that are usual all over the software world. In some projects I have patched old, unmaintained products myself, to move beyond obstacles in going to newer Zope versions.

Plone

Actually Plone falls into the last category of readily available Zope applications, but given its widespread use, it is a special case. Plone is of course still around, so you can find plenty of community, advice and help on upgrading to newer versions (if that is your task). You can even hand the task off completely to a service provider, pay some money and get it all done.

A Mix

From my experience, this is quite likely what you might find on an unknown Zope application server install: Some well known helper products used, maybe a CMS or some blog software, combined with some tailormade stuff. I bet some admin at some point threw in at least a couple of TTW "Python scripts" to solve some small day-to-day problem.

Database used

ZODB

Zope usually uses the "built in" ZODB (Zope Object Database). The last few years have given more attention to "NoSQL" databases, but the ZODB has been around for a long time as a "NoSQL" and "Object Database". Lots of programmers who come from an SQL background are confused by it: Is the application code in the DB? (It might be, but usually it isn't.) How do I query/search this thing? (Using Python code.) Can I simply dump the contents? (Usually not that easy.) ... and other such questions. Put simply, the ZODB stores instances of Python classes (pickled objects) in a tree structure. To access these objects, you walk the tree as if it were a structure of Python objects in memory. The "real data" is usually in the object's attributes, but the arrangement, the "structure" of the objects, is quite often part of the data too.

Underlying storage engines in ZODB

The ZODB can have a couple of different backends where data is stored.

Filestorage

This is what you'll have most frequently. The database is inside a single file called "Data.fs". This is a storage where data is always appended, the file always grows, never shrinks in size.

Directory Storage

There used to be an implementation where instead of a single file you had a folder and file structure that represented your DB. This has been out of fashion for a long, long time and I doubt that you will find it out there on a running system.

Relstorage

Some attempts were made to implement the ZODB on top of a standard SQL database. Relstorage is a fairly recent one that actually delivered and is in use. So you have a small, small chance to find a ZODB that stores its data in an SQL DB. This doesn't mean that you will find tables and records with your actual, readable data inside (which you could dream of dumping out), but rather what is inside the SQL DB would be the Python pickles that the ZODB would otherwise have stored in the Data.fs file.

SQL databases

In addition to the ZODB, some projects may use a database of the SQL family. This means that there will be a ZODB which usually holds some normal Zope objects, and some of these objects access the SQL DB. One hint for such a setup would be if you find "ZSQL Methods" in your ZODB. These methods link SQL queries with Zope code.

If your application uses an SQL database, you probably need both the ZODB (the Data.fs file) and the SQL dump to get all the data together.

Export files

Sometimes people get handed data "exported" from Zope (actually from within Zope's ZMI). These usually come in files ending in .zexp (there used to be also an xml export option, but it has variously been working or broken over time). Note that these .zexp files are basically useless for almost all tasks having to do with migrating a Zope application. They were meant to export data and reimport it into exactly the same Zope version, given the same Zope Products installed in exactly the same versions as what you started from.

If someone hands you only a .zexp file, you need to demand the real thing. As an alternative you might be able to build a Zope server environment if you get the exact Zope version used and the list of which products were used. If you don't have this information and absolutely need the data in the zexp file, then you're in for some major detective work.

What do you want to do with it?

So what is it you were asked to do?

Shut it down and uninstall Zope

You just need to get rid of it? Shut it down and throw it all away? Easy enough:

  • Stop the Zope server (through the ZMI /control_panel or from the zopectl script on the file system, maybe with some special command built into your OS prepackaged Zope)
  • Remove the Zope "instance" folder, which usually holds the application code and the database (the database might be in another instance, shared through the ZEO server)
  • Locate the Zope code itself and remove it (or uninstall it using your package system)
  • Remove any startup scripts in the OS

Move it to a newer Zope version

Sometimes all you need to do is to get to the newest and latest minor Zope version (2.10.X) of whatever major Zope version your application runs on (2.10 in that example). Usually that is no trouble, do a good backup and go ahead.

Other times you need to move up that one major version number in Zope (say from 2.9 to 2.10), getting past some obstacle or bug. Both of these versions are old, but maybe it's all you need. Or there might be another reason to do this: Often if you want to go to the latest and greatest Zope 2 version, going through the intermediary versions is safer and easier.

So to go to the next major version, check the Changelog of the next version, looking out for some of the bigger "steps" in the history of the Zope project. For example between 2.8 and 2.9, "versions" got lost. Or at 2.10 there were big changes in how Zope Page Templates handle Unicode data. You might find a key technology of that old, old app not being supported any more (at least not without some custom work). You might also find that some "Products" need to be upgraded along with Zope.

A special case for that is Products that are not fully "supported" any more (meaning that whoever wrote them doesn't look after them all the time or might not be around for getting it over the version bump). For quite a few of these, patches float around on the net. In other cases I checked what was wrong and fixed stuff myself for obscure Products - it's only Python code in the end.

Move it to the newest Zope version

Here it gets interesting: If you want to keep stuff around, you might want to move your application to the latest and newest Zope version. Yes, Zope is still around and new, maintained versions are published. Not everything is easily backward compatible though. The Python world has moved on in the last few years and the Zope code with it. You might be in for some rewriting, for some bugfixing or just for some re-packaging.

If you don't know your way around too much, and are not even too certain what is inside the app in front of you, I would humbly suggest investing the extra time and going version by version (2.9 to 2.10, then 2.10 to 2.11, ... to 2.13). It might seem to take longer, but it might be much faster than hitting 2.13 right away and stopping dead because "something doesn't work" and you have no idea what's wrong.
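For planning purposes, the step-by-step path can be written down as a tiny helper. The list of Zope 2 release lines here is typed in by hand and is an assumption of this sketch; trim or extend it to the versions you actually care about. Note that the versions are kept as strings in an ordered list, because comparing them as numbers would put 2.10 before 2.9.

```python
# Hand-maintained, ordered list of Zope 2 release lines (an assumption).
ZOPE2_RELEASE_LINES = ["2.7", "2.8", "2.9", "2.10", "2.11", "2.12", "2.13"]


def upgrade_steps(current, target, lines=ZOPE2_RELEASE_LINES):
    """Return the (from, to) pairs for going from current to target one line at a time."""
    start = lines.index(current)
    end = lines.index(target)
    if end < start:
        raise ValueError("target %s is older than current %s" % (target, current))
    return list(zip(lines[start:end], lines[start + 1:end + 1]))


steps = upgrade_steps("2.9", "2.13")
for old, new in steps:
    print("upgrade %s -> %s, then test" % (old, new))
```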

If you want to go all the way to the newest Zope version, you will also want to look into the way Products are "packaged" or "eggified", as is now usual in the Python world. In any case, the way you install the newer Zope might be different to what you are used to, the newest Zopes are usually installed with "buildout", so customizing a buildout script for your application will be part of the job.

Move it to a newer server (or to some other server)

If you want to move your old Zope installation to newer hardware, without upgrading to newer Zope versions, that's basically one of the easiest tasks around. Some admins stop at their new servers not having the right, old Python versions. That's not really a problem though: You are not obliged to run Zope with the "System Python". In fact it's good practice to have a Python install that is used only for Zope (and on the latest Zope versions you would use virtualenv too). So go ahead and install that Python 2.4 version you might need for your old app, right in a private user directory.

If your OS doesn't have a pre-packaged Python of that version any more, don't worry, you can compile it yourself. You might need a bit of handholding on newer Mac OS X versions for that, but it's manageable.

The same goes if you are used to installing Zope from your package system: Don't worry if it's not there any more. Especially for old Zope versions, all you need is a tarball and some plain "configure && make && make install". For the latest Zope versions the system has changed (mostly people use buildout there).

If you want to move to newer hardware and at the same time to a newer Zope version, I'd suggest moving the setup in small steps up the versions, preferably on the new machine. If I expect things to get messy in some of the steps, I will suggest building the exact same setup on the new hardware first till it works, then moving on step by step. Otherwise you might not know which change made things break.

Get the contents out, in order to keep things around

If you simply want to build an archive of the site, sometimes you can just spider it into a static site. You will have to take special measures for parts of the site that were accessed through forms or when URLs with POST variables were used.

If the site was made for editing stuff or if it was dynamic in other ways, you will want to cut things out in a sensible way - but that's outside of a Zope topic really.

Get the contents out, in order to move to something else

If you want to migrate the content to a different kind of system / framework, what you usually need to do is to write some Python code to write out the actual data into an (intermediary) form that you can use. Since Zope usually builds on ZODB, and since ZODB has a tree-like structure, most of the time you will end up with something that recurses through the tree of objects and writes out the data depending on what kind of objects it finds.
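A minimal sketch of such a recursive export, writing JSON: it assumes folder-like objects offer objectItems() and that objects carry a meta_type, and it picks an exporter function per meta type. The Fake classes, the "Example Document" meta type and its body attribute are invented for the demo; on a real system you would write one exporter per meta type you actually find. The json module is not available on the very old Pythons that old Zopes run on, so there you would need some other way to write the data out.

```python
import json


def export_generic(obj):
    """Fallback exporter: keep only the title."""
    return {"title": getattr(obj, "title", "")}


def export_document(obj):
    """Exporter for the invented 'Example Document' type."""
    return {"title": obj.title, "body": obj.body}


# meta type -> exporter function; extend this for your own object types.
EXPORTERS = {"Example Document": export_document}


def walk(container, path=""):
    """Yield one record per object below container, depth first."""
    for object_id, obj in container.objectItems():
        # In Zope, inspect the unwrapped object to avoid acquired attributes.
        base = getattr(obj, "aq_base", obj)
        object_path = path + "/" + object_id
        meta_type = getattr(base, "meta_type", "unknown")
        record = {"path": object_path, "meta_type": meta_type}
        record.update(EXPORTERS.get(meta_type, export_generic)(base))
        yield record
        if hasattr(base, "objectItems"):
            for child in walk(obj, object_path):
                yield child


# Stand-ins for Zope objects, only for demonstration.
class FakeDocument(object):
    meta_type = "Example Document"

    def __init__(self, title, body):
        self.title = title
        self.body = body


class FakeFolder(object):
    meta_type = "Folder"

    def __init__(self, title, children):
        self.title = title
        self.children = children

    def objectItems(self):
        return sorted(self.children.items())


root = FakeFolder("Site", {
    "about": FakeDocument("About", "Hello"),
    "news": FakeFolder("News", {"first": FakeDocument("First post", "Text")}),
})

records = list(walk(root))
json_text = json.dumps(records, indent=2)
print(json_text)
```

Keeping the path in every record preserves the tree structure, which is often part of the data.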

Obvious candidates for intermediary data formats would be XML or JSON, assuming that your new environment knows what to do with them. You also might write stuff to SQL tables. Either I would keep track of relationships myself there (might get confusing) or I would try to press SQLAlchemy into service, to build me my relationships on my objects.

If the Zope app involves an SQL DB next to the ZODB, you would obviously have to mix that data into your output too.

How to do backups?

Maybe this should have been at the start of the article, but assuming that you are a seasoned sysadmin, you already know that taking a backup is the first thing you will do.

Filesystem application code

In case of filesystem based Zope Products code, you want the "Products" folder in the instance. You might also need the "Extensions" folder in the instance. If you're there, you could do a backup of the complete instance folder anyway.

ZODB database file

A special case is the Data.fs file in the instance. This is where the ZODB stores its contents. There is strong evidence that this file does not like to be backed up with rsync. It's fine with cp or other simpler tools. Incremental backups of the Data.fs file were usually done with a script called repozo.py. You can find that inside the source directory of your Zope install.
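For the simple "cp" style of backup, a sketch like this makes a timestamped full copy. The naming scheme and the paths are my own assumptions, and the demo uses a throwaway file instead of a real Data.fs. For regular or incremental backups of a busy database the dedicated script is the better tool.

```python
import os
import shutil
import tempfile
import time


def backup_data_fs(source, backup_dir):
    """Copy source into backup_dir under a timestamped name and return the new path."""
    if not os.path.isdir(backup_dir):
        os.makedirs(backup_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(backup_dir, "Data.fs." + stamp)
    shutil.copy2(source, target)  # copy2 also keeps the modification time
    return target


# Demo with a throwaway file standing in for the real Data.fs.
with tempfile.TemporaryDirectory() as demo_dir:
    source = os.path.join(demo_dir, "Data.fs")
    with open(source, "wb") as handle:
        handle.write(b"pretend this is a ZODB file")
    copy_path = backup_data_fs(source, os.path.join(demo_dir, "backup"))
    with open(copy_path, "rb") as handle:
        backup_matches = handle.read() == b"pretend this is a ZODB file"

print(os.path.basename(copy_path), backup_matches)
```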

Note that there might be multiple ZODB data files, as the ZODB could mount them at various points into the tree structure.

The ZODB data file can also be outside of your normal Zope instance, if a technology called ZEO was used. In that case there is a "ZEO instance" which holds the Data.fs file and there is one or more Zope client instances that hook up to the ZEO. Obviously you want the Data.fs out of the ZEO instance. You will likely find files called "Data.fs" in the other instances too, usually older and much smaller, unused either from the automatic install or from when the instance was converted to a ZEO setup.
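To see which of several Data.fs files is the one that matters, it helps to list them all with size and modification time. A small sketch follows; the directory layout in the demo (a "zeo" and a "client1" instance, each with a var folder) is invented, so point the function at wherever your instances live.

```python
import os
import tempfile


def find_data_files(top, filename="Data.fs"):
    """Return (size, mtime, path) for every matching file below top, largest first."""
    found = []
    for directory, _subdirs, files in os.walk(top):
        if filename in files:
            path = os.path.join(directory, filename)
            info = os.stat(path)
            found.append((info.st_size, info.st_mtime, path))
    return sorted(found, reverse=True)


# Demo on a throwaway directory with an invented instance layout.
with tempfile.TemporaryDirectory() as demo_dir:
    for instance, size in (("zeo", 1000), ("client1", 10)):
        var_dir = os.path.join(demo_dir, instance, "var")
        os.makedirs(var_dir)
        with open(os.path.join(var_dir, "Data.fs"), "wb") as handle:
            handle.write(b"x" * size)
    data_files = find_data_files(demo_dir)

for size, mtime, path in data_files:
    print(size, path)
```

The big, recently modified file is usually the live database, while the small, old ones are the leftovers described above.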

Extra SQL databases

Don't forget to back up any SQL databases too. They might be used in addition to the ZODB Data.fs or (if the setup used Relstorage) they might be the ZODB storage themselves.

Shameless Self-Plug

I'm a freelancer with my own company (Betabug Sirius) working amongst other areas in the Zope field. If you have a migration project of an old Zope application, I might be able to help or even to handle the job for you. Feel free to get in touch to discuss the task!

Posted by betabug at 10:22 | Comments (0) | Trackbacks (0)