[Bese-devel] Re: rfc2388

Lou Vanek vanek at acd.net
Mon Jul 17 18:09:34 UTC 2006


Marco Baringer wrote:

> Lou Vanek <vanek at acd.net> writes:
> 
> 
>>Marco Baringer wrote:
>>
>>
>>>Lou Vanek <vanek at acd.net> writes:
>>>
>>>
>>>>if anybody out there is using clisp and needs an rfc2388 mime parser,
>>>>the attached code will probably save you some time.
>>>
>>>does the current parser fail on clisp?
>>
>>On the mime headers i get when running the upload file example,
>>clisp cannot parse 'em, at least on windows. Two reasons below.
>>Third reason: the stream i get from araneida (nonbuffered-character)
>>is not readable via "read-byte", which is why i added the stream-reader
>>function. But i still had problems with the parser, as detailed below.
> 
> 
> yeah, araneida treats http as a text protocol. unfortunately it is
> not :(
> 
> the _solution_ is to fix araneida...btw that's where i think you
> should send this patch.

i don't see how sending rfc2388 code to an http server project helps,
but explaining the problems i'm having with their choice of streams
may be of some use.

on second thought, maybe you're right. Maybe they'll add rfc2388 support
directly into the http server. I didn't get the impression that araneida
was maintained any more, though.


>>The binary parser cannot handle pure-unix line endings,
>>and i believe the binary parser requires two dashes surrounding
>>the boundary string.
> 
> 
> the binary parser doesn't know what a line ending is. it can handle
> CRLF sequences just fine.

but the state machine spends quite a bit of time dealing with line-end
characters (or at least the integers 13 and 10) when parsing the mime headers.
the mime headers that i was getting back from ff were getting lost in
this state machine and (most of the time) not reaching the final state.

it's just my opinion, but i don't think it's too much of a stretch
to support unix line endings in addition to DOS. But that's only
my opinion. It seems to work for me on windows on my wacky setup.
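
A minimal sketch of accepting both line endings on a character stream.
This is not the attached parser (which is a state machine); the function
name read-header-line is hypothetical and only illustrates the idea:

```lisp
;; Hypothetical helper, not part of rfc2388: read one header line from a
;; character stream, accepting either LF or CRLF as the terminator.
(defun read-header-line (stream)
  "Return the next line without its terminator, or NIL at end of file."
  (let ((line (read-line stream nil nil)))
    (when line
      ;; READ-LINE stops at #\Newline; drop a trailing CR in case the
      ;; stream did not coalesce CRLF into a single newline for us.
      (string-right-trim '(#\Return) line))))
```

The same function then works whether or not the implementation has already
collapsed <cr><nl> into <nl>.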


>>clisp coalesces <cr><nl> into just <nl> on windows unless
>>you are able to drop down into reading the stream in binary,
>>which i wasn't able to do. I don't think that's possible in clisp
>>for some types of streams.
> 
> 
> then we need to add an :external-format to araneida.

i'm not having any more trouble supporting both
unix and dos line endings. It doesn't require much code, and the
version-c state machine is more robust.

sam gave an explanation of why he coalesces the line-ending
characters but i can't remember his explanation. It sounded
good at the time. (saw it somewhere on c.l.l.)


> if you're
> treating the http stream as text where do you setup the character
> encoding?

i'm just assuming character data is either ISO 8859-1 or a subset thereof,
which is the way araneida is hard-coded and the mode i start clisp up in.
not a perfect solution, i know, but i don't need a perfect solution.
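
A rough sketch of what that assumption buys: in ISO 8859-1 every character
code below 256 is also the byte value, so a string read from the character
stream can be turned back into octets directly. The function name
latin-1-string-to-octets is hypothetical, and the sketch assumes char-code
returns Unicode code points (as it does in clisp):

```lisp
;; Hypothetical helper: map each character to the octet with the same
;; code, signalling an error for anything outside ISO 8859-1.
(defun latin-1-string-to-octets (string)
  "Return a vector of (unsigned-byte 8) holding the Latin-1 bytes of STRING."
  (map '(vector (unsigned-byte 8))
       (lambda (ch)
         (let ((code (char-code ch)))
           (if (< code 256)
               code
               (error "Character ~S is outside ISO 8859-1." ch))))
       string))
```

This only holds while the stream really is decoded as ISO 8859-1; any other
external format would corrupt non-ascii upload data.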


> one of the main reasons i wrote the new rfc2388 was to deal with
> non-ascii, non-utf data, i can not accept a change to rfc2388 which
> breaks this.

ok. thanks for saying it.
i don't see why having both a binary and character parser hurts, though,
especially if somebody's in the situation where they have a character stream
and they can't change it. And it needs to also work with rfc2046 headers.
But i understand it's easier to support just one parser.


>>i think the binary parser expects the boundary to be
>>both prefixed and suffixed with two dashes, but the
>>mime boundary that i received didn't end with two dashes,
>>and rfc2046 doesn't require it.
> 
> 
> rfc2388 specifies two dashes on either side of the boundary as an
> end-of-data marker. between parts the boundary is only prefixed by the
> dashes. note that rfc2388 and rfc2046 are different standards (albeit
> very similar). rfc2388 does not purport to implement rfc2046.

i didn't know that. Then i'm not receiving rfc2388 mime headers
back from my browsers.


> is there a browser out there sending rfc2046 in place of rfc2388?
> (explorer right?)

every boundary i have inspected so far starts with dashes but ends with
some sort of line ending character (no double-dash).
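
A minimal sketch of the distinction being discussed, assuming the line has
already been read with its terminator removed. The name
classify-boundary-line and the :part / :end return values are hypothetical:

```lisp
;; Hypothetical helper: decide whether LINE is a delimiter between parts
;; ("--" + boundary), the closing delimiter ("--" + boundary + "--"),
;; or ordinary data.
(defun classify-boundary-line (line boundary)
  "Return :PART, :END, or NIL for LINE."
  (let* ((delimiter (concatenate 'string "--" boundary))
         (closing (concatenate 'string delimiter "--"))
         ;; Tolerate trailing whitespace or a stray CR after the delimiter.
         (trimmed (string-right-trim '(#\Space #\Tab #\Return) line)))
    (cond ((string= trimmed closing) :end)
          ((string= trimmed delimiter) :part)
          (t nil))))
```

A boundary line that starts with two dashes and ends with a line ending
comes back as :part; only the last one in the body should come back as :end.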

i spend 90% of my time debugging in ff1.5.x/windows, and 10% of my time
in ie7b3. Given up on opera. Don't have access to mac unless I pull the
ol' SE30 out of the closet. Lost my linux dual boot in a terrible accident.


>>i also don't think the binary parser currently is set
>>up to restrict the size of the upload.
> 
> 
> no, but it is flexible enough to allow you to do that without changing
> the parser itself:
> 
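> ;; space-left-on-disk is a placeholder for however you measure free
> ;; space; leave 1MB of headroom.
> (defvar *maximum-size-limit* (- space-left-on-disk (* 1024 1024)))
> 
> (read-mime binary-stream boundry
>            (lambda (mime-part)
>              ;; one byte counter per mime part
>              (let ((counter 0))
>                (values (lambda (byte)
>                          ;; refuse further data once the limit is passed
>                          (when (< *maximum-size-limit* counter)
>                            (error 'data-overflow))
>                          (collect-byte-somewhere byte)
>                          (incf counter))
>                        (lambda (mime-part) 
>                          mime-part)))))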
> 
> ucw's callback function is in ucw/src/backend/common.lisp.

good to know.
thanks,
-lv



