[pro] [Review Request] Small example of modern application of Common Lisp
Paul Nathan
pnathan.software at gmail.com
Wed Nov 2 01:32:01 UTC 2011
On Tue, Nov 1, 2011 at 12:20 PM, Samium Gromoff <skosyrev at common-lisp.net> wrote:
> On Fri, 28 Oct 2011 11:29:38 -0400, Faré <fahree at gmail.com> wrote:
> > Dear Christian,
> >
> > I'm interested in your web scraping technology in CL.
> >
> > I'd like to build a distributed web proxy that persistently records
> > everything one views, so that you can always read and share the pages
> > you like even when the author dies, the servers are taken off-line,
> > the domain name is bought by someone else, and the new owner puts a
> > new robots.txt that tells archive.org to not display the pages
> > anymore.
> >
> > I don't know if this adventure tempts you, but I think the time is
> > ripe for end-user-controlled peer-to-peer distributed archival and
> > sharing of information. Obvious application, beyond archival, is a
> > distributed facebook/g+ replacement.
>
> I cannot add anything, but express an emphatic agreement.
>
> One important thing, IMO, would be a mathematically-sound, peer-to-peer
> archive authenticity co-verification -- perhaps in the same sense as
> git manages to do it.
>
>
I agree. It's becoming pretty obvious to me that the web is in a state of
constant rot and regrowth (sites go down; other sites come up).
Unfortunately, the rot takes some really valuable pieces of information
with it.
An interesting definition of a website might be: a git repository, where
hyperlinks carry both a file path and the hash of the changeset at which
that file was valid; a 'certified' website might have GPG signatures on
its commits as well.
One interesting application might be an 'archiving browser', which caches
all or most of the sites you visit. Instead of rummaging through Google
trying to reconstruct the search terms that once hit that one site (if
it's still indexed by Google, and if it's still up), you could run a
query against your local archive.
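For the caching side, something like this could work. Drakma, Ironclad,
and flexi-streams are real Quicklisp libraries; the cache layout here is
invented, and search would still have to be layered on top (Montezuma is
one candidate for the indexing):

    (ql:quickload '(:drakma :ironclad :flexi-streams))

    (defvar *cache-dir*
      (merge-pathnames ".web-archive/" (user-homedir-pathname)))

    (defun url-key (url)
      "Stable filename for URL: hex SHA-1 of its UTF-8 bytes."
      (ironclad:byte-array-to-hex-string
       (ironclad:digest-sequence
        :sha1 (flexi-streams:string-to-octets url :external-format :utf-8))))

    (defun archive-page (url)
      "Fetch URL and store its body under *CACHE-DIR*; returns the path."
      (let ((body (drakma:http-request url :force-binary t))
            (path (merge-pathnames (url-key url) *cache-dir*)))
        (ensure-directories-exist path)
        (with-open-file (out path :direction :output
                                  :element-type '(unsigned-byte 8)
                                  :if-exists :supersede)
          (write-sequence body out))
        path))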
As a personal project, I have been contemplating putting together a web
spider/index for better web searching; it would be nice to contribute
components from that to a larger project relating to web storage &
archiving.
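The spider half might start as naively as this (cl-ppcre is real; the
href regex and the queue handling are deliberately simplistic, and a
serious crawler would also want to respect robots.txt):

    (ql:quickload '(:drakma :cl-ppcre))

    (defun extract-links (html)
      "Collect absolute href targets with a naive regex."
      (let (links)
        (cl-ppcre:do-register-groups (href)
            ("href=\"(http[^\"]+)\"" html)
          (push href links))
        (nreverse links)))

    (defun spider (seed &key (limit 100))
      "Breadth-first crawl from SEED, visiting at most LIMIT pages;
    returns the hash-table of visited URLs."
      (let ((queue (list seed))
            (seen (make-hash-table :test #'equal)))
        (loop while (and queue (plusp limit))
              do (let ((url (pop queue)))
                   (unless (gethash url seen)
                     (setf (gethash url seen) t)
                     (decf limit)
                     (let ((html (ignore-errors (drakma:http-request url))))
                       (when (stringp html)
                         (setf queue (append queue (extract-links html))))))))
        seen))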
Regards,
Paul