[Gsll-devel] Introducing "Grid Structured Data"

Mon Jan 25 00:24:22 UTC 2010

Some thoughts on the two interfaces (grid, xarray) discussed here ...

I am trying to figure out if we can classify different types of usage of
vector and matrix data.  The classification below is very rough with much
gray area in-between.

At some basic level, collections of numbers are either

   1. vectors and arrays to be processed by numerical algorithms
   2. just collections of numbers that will be parsed or processed by
   semi-numerical algorithms

Packages such as GSL and LAPACK will deal mostly with the first kind.

For other uses, like when dealing with results from multiple experiments, we
are using vectors and arrays as indexed storage with fast access, but there
may not be anything `algebraic' (in the sense of linear algebra) to those
collections.

In this second case, we may choose to process all the numbers in the
collection, or only some arbitrary subset of them.  (In either case,
vectorized processing of those collections may be desired - Tamas has
published a package that does that).

It seems to me that Tamas' (now abandoned) `affi' package, on which `grid'
is built, is a natural fit for case 1 above, while xarray is a natural fit
for case 2 above.

In addition, someone noted that affi is probably faster than xarray (to be
verified), which is of paramount importance for the number-crunching
libraries.  (We first use non-numeric tools at the top level when parsing
the data, which then may pass the data to the number-crunchers in gsll, lla,
where speed is important.)

In that case, the two packages may each have a valid role.  The optimal
outcome would be a unified notation, in which the notation of grid would be
a subset of that of xarray.

Mirko