[elephant-devel] SQL Back-end ready --- sending with links removed in hopes it goes through

Sat Oct 15 15:58:16 UTC 2005

Dear Elephant Devel team,

    Here is my good-enough-to-be-released extension of elephant to use
CL-SQL as a back-end (with PostGres).

Below is the documentation that I wrote (It's in the texinfo, but I
thought you might like to read it directly, so I pasted it here.)
Please find attached a tar file containing all of the new files, and
additionally a file containing the cvs diff of the files that
already existed in the repository.

    If your interested in this, you should probably read the
documentation below.  The code change is so great that 
I doubt you can see much from the diff; the best way to really analyze
would be to create a separate branch in CVS and 
allow me to commit my stuff there, so that one can easily checkout the
whole branch.

    In a nutshell, I have extended Elephant to utilized CL-SQL (on top
of PostGres) as a back-end in addition to 
BerkeleyDB.  Some people may think this pointless; others may be pleased
with the more permissive licensing of 
PostGres, or see this as a step to supporting, for example, SQLite.
Addtionally, this version is a "mutli-repository" version
in that many repositories can be open at the same time, and data can be
migrated between them, without regard to 
the implementation strategy that underlies the repository.

    I commend the Elephant developers on the good set of tests they had;
I have expanded them to allow testing
on multiple repository types and to test migration.

    I know the current owners of Elephant are looking for a new owner.
That, and the fact that some people might
like Elephant the way it is and hate the apparent complexity that I have
added, or might really like what I have done,
creates a complicated set of questions we have to answer:

    1)  Who will own Elephant?
    2)  Should this be the next version of Elephant, or should it be a
fork (that is, a completely different project, maybe
"Bignose" or something)?

    I don't know how many people use Elephant or are on this list.  I
enjoyed it until I hit the BerkeleyDB licensing
restriction.  I definitely plan to use this work that I have done in a
website moving forward, and will be maintaining
it, one way or the other.

    I have not yet offered to own Elephant, since I am not an expert
LISP coder, have never managed a large open-source
project before, don't know who is using it, don't plan to pay for any
lisp system, and don't know how much time is 
involved in maintaining such a package, and am not an expert on
BerkeleyDB.  However, if nobody else has volunteered
and Ben can't do it, I suppose that I must be better than nothing at
all.

    Please express some sort of opinion.  I spent 5 solid weeks on this;
which in hindsight was probably a three-week 
waste of time compared to just implementing my own serializer, which
would have served my purpose.
However, now that it is presentable, I certainly hope that someone in
addition to myself will benefit from it.

5 SQL back-end 

      * SQL-Introduction: The design and status of the SQL back-end
        extention. 
      * Extention Status: The current status of the SQL back-end
        extention. 
      * Back-compatibility: Issues if you have already been using
        Elephant 
      * Multi-repository Operation: Specifying repositories 
      * Setting up PostGres: An example 
      * Repository Migration: How to move objects from one repository to
        another 
      * 5.1 SQL-Introduction

Although originally designed as an interface to the BerkeleyDB system,
the original Elephant system has been experimenetally extended to
support the use of relational database management systems as the
implementation of the persistent store. This relies on Kevin Rosenberg's
CL-SQL interface to relational systems. 

Although the BerkeleyDB system is an ideal object store for LISP
objects, one might prefer the licensing of a different system. For
example, at the time of this writing, it is my interpretation that one
cannot use the BerkeleyDB system behind a public website
http://www.sleepycat.com/download/licensinginfo.shtml#redistribute
unless one releases the entire web application as open source. 

The PostGres DBMS has no such restriction. Elephant itself is released
under the GPL. It is somewhat debatable if the GPL allows one to
construct to construct a non-open-source web application but the
preponderance of opinion appears to be that it does. Thefore using
Elephant and the other GPLed software that it depends upon allows one to
host a a non open-source web application. This might be a reason to use
Elephant on PostGres rather than Elephant on BerkeleyDB. 

Other reasons to use a relational database system might include:
familiarity with those systems, the fact that some part of your
application needs to use the truly relational aspects of those systems,
preference for the tools associated with those systems, etc. 

The SQL back-end extention of Elephant provides a function for migrating
data seamlessly between repositories. That is, one can quite easily move
data from a BerkeleyDB repository to a PostGres repository, and vice
versa. In fact, one of the most important aspects of the extention is
that it makes Elephant a multi-repository system, rather than a single
repository system, as addition to allowing different implementation
strategies for those repositories. This offers at least the possiblity
than once can develop using one backend, for example BerkeleyDB, and
then later move to MySQL. 

At the time of this writing, the basic strategy for the SQL
implementation is quite simple. The same serializer used for the
Sleepycat implementation is employed, the byte-string is base64 encoded,
and placed in a single table which is managed by Elephant. 

5.1 SQL-Introduction
Although originally designed as an interface to the BerkeleyDB system,
the original Elephant system has been experimenetally extended to
support the use of relational database management systems as the
implementation of the persistent store. This relies on Kevin Rosenberg's
CL-SQL interface to relational systems. 

Although the BerkeleyDB system is an ideal object store for LISP
objects, one might prefer the licensing of a different system. For
example, at the time of this writing, it is my interpretation that one
cannot use the BerkeleyDB system behind a public website
http://www.sleepycat.com/download/licensinginfo.shtml#redistribute
unless one releases the entire web application as open source. 

The PostGres DBMS has no such restriction. Elephant itself is released
under the GPL. It is somewhat debatable if the GPL allows one to
construct to construct a non-open-source web application but the
preponderance of opinion appears to be that it does. Thefore using
Elephant and the other GPLed software that it depends upon allows one to
host a a non open-source web application. This might be a reason to use
Elephant on PostGres rather than Elephant on BerkeleyDB. 

Other reasons to use a relational database system might include:
familiarity with those systems, the fact that some part of your
application needs to use the truly relational aspects of those systems,
preference for the tools associated with those systems, etc. 

The SQL back-end extention of Elephant provides a function for migrating
data seamlessly between repositories. That is, one can quite easily move
data from a BerkeleyDB repository to a PostGres repository, and vice
versa. In fact, one of the most important aspects of the extention is
that it makes Elephant a multi-repository system, rather than a single
repository system, as addition to allowing different implementation
strategies for those repositories. This offers at least the possiblity
than once can develop using one backend, for example BerkeleyDB, and
then later move to MySQL. 

At the time of this writing, the basic strategy for the SQL
implementation is quite simple. The same serializer used for the
Sleepycat implementation is employed, the byte-string is base64 encoded,
and placed in a single table which is managed by Elephant. 

5.3 Back-compatibility
The CL-SQL based extention is very back-compatible with any existing
Elephant application, except for two items. 

First, the routines “build-btree” and “build-index-btree” should be used
in place of the previous approach to direct calls to make-instance. This
is necessary, because the underlying class of the object depends on what
repository it is stored in. These routines, like make-instance on
persistent objects directly, allow you to specify the store controller
at creation time. However, build-btree and build-index-btree will use
the global *store-controller* if no keyword argument is provided. 

Secondly, in addition to executing:

     (asdf:operate 'asdf:load-op :elephant)

to load elephant, one must execute either or both of:

     (asdf:operate 'asdf:load-op :ele-clsql)
     (asdf:operate 'asdf:load-op :ele-bdb)

depending on whether or not you wish to use the clsql backend or the
BerkeleyDB backend, or both. 

5.4 Multi-repository Operation
Elephant now keeps a small hashtables that maps “database
specifications” into actual database connections. 

If a database spec is a string, it is assumed to be a BerkeleyDB path.
If it is a list, it is a assumed to be a CL-SQL connection
specification. 

The tests now have a function “do-all-tests-spec” that take a spec and
based on its type attempt to open the correct kind of store controller
and perform the tests. 

The routine “get-controller” takes this specifiation. 

The basic strategy is that the “database specification” object is stored
in every persistent object and collection so that the repository can be
found. 

In this way, objects that reside in different repositories can coexist
within the LISP object space, allowing data migration. 

5.5 Setting up PostGres
To set up a PostGres based back end, you should: 

     1. Install postgres and make sure postmaster is running. 
     2. Create a database called “test” and set its permissions to be
        reached by whatever connection specification you intend to use.
        The tests use: 

                  (defvar *testpg-path*
                  '("localhost.localdomain" "test" "postgres" ""))

        meaning that connections must be allowed to the database test,
        user “postgres”, no password, connected from the same machine
        “localhost.localdomain”. (This would be changed to something
        more secure in a real application.) Typically you edit the
        file : pg_hba.conf to enable various kinds of connections in
        postgres. 

     3. Be sure to enable socket connection to postgres when you invoke
        the postmaster. 
     4. Test that you can connect to the database with these credentials
        by running: 

        psql -h 127.0.0.1 -U postgres test 

        Before you attempt to connect with Elephant. 

meaning that connections must be allowed to the database test, user
“postgres”, no password, connected from the same machine
“localhost.localdomain”. (This would be changed to something more secure
in a real application.) 

Furthermore, you must grant practically all creation/read/write
privileges to the user postgres on this schema, so that it can construct
the tables it needs. 

Upon first opening a CL-SQL based store controller, the tables, indexes,
sequences, and so on needed by the Elephant system will be created in
the schema named “test” automatically. 

To run the tests, execute:

     (asdf:operate 'asdf:load-op :elephant)
     (asdf:operate 'asdf:load-op :ele-clsql)
     (asdf:oos 'asdf:load-op :clsql-postgresql-socket)
     (in-package "ELEPHANT-TESTS")
     (do-all-tests-spec *testpg-path*)

This should produce a small number of errors (about 7) for those test
having to do with migration and the BerkeleyDB system specifically. 

If you execute:

     (asdf:operate 'asdf:load-op :ele-bdb)

Then connection to the BerkeleyDB system will be enabled, and you should
be able to execute both

     (do-all-tests-spec *testpg-path*)
     (do-all-tests-spec *testdb-path*)

with no errors in either case. 

At present the system has only been tested under PostGres. Some code
parametrization would be required to work with other databases. 

5.6 Repository Migration
This version of Elephant supports migration betwen store controllers,
whether of the same implementation strategy or not. 

The tests migrate1 - migrate5 are demonstrations of this techinque. 

The functions for performing these migrations are: 

migraten-pobj 

The name of this function is meant to imply that it is destructive of
the object in question, mutating it to point at the new repository. 

Which requies that you provide a copy-function to copy whatever slots
you want from the persistent object as deeply or as shallowly as you
desire. 

Data collections (btree's) can be move with the function: 

migrate 

A simple object that does not inherit from “persistent” but is attached
to a key (on the root) can be copied with the routine 

copy-from-key 

It is hoped that these routines would allow, with some labor, a user to
use one repository, and later decide to start using a different
implementation strategy, and easily migrate the objects to the the new
repository. The old repository could then be abandoned, or multiple
repositories could be used at the same time.

----
Robert L. Read, PhD                                     read &T
robertlread.net
Consider visiting Progressive Engineering:
http://robertlread.net/pe
In Austin: 912-8593                                        "Think
globally, Act locally." -- RBF

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.common-lisp.net/pipermail/elephant-devel/attachments/20051015/3883fc79/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cvs.diff
Type: text/x-patch
Size: 145969 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/elephant-devel/attachments/20051015/3883fc79/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sql-backend.tar.gz
Type: application/x-compressed-tar
Size: 18154 bytes
Desc: not available
URL: <https://mailman.common-lisp.net/pipermail/elephant-devel/attachments/20051015/3883fc79/attachment-0001.bin>