A bug in function parse-content-type.

Hans Hübner hans.huebner at gmail.com
Sun May 26 06:04:15 UTC 2013


Jingtao,

please refer to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7,
it clearly describes that a media type consists of exactly one type/subtype
indicator followed by optional attribute=value pairs.  The content type
that you have presented is not valid according to these rules.   Neither a
lax parser like the one in CL-HTTP nor the fact that a large site sends
these bogus headers makes them valid.  I do not want to include code in
Hunchentoot that tries to interpret such bogus data.

However, if you cannot get your trading partner to fix their client, I can
offer this solution:

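;; Request subclass used only to dispatch the header clean-up method below.
(defclass request-with-bad-content-type (hunchentoot:request)
  ())

;; Drop a stray second media type ("type/subtype; other/type; ...") from the
;; Content-Type header whenever it is read from such a request.
(defmethod hunchentoot:header-in :around ((name (eql :content-type))
                                          (request request-with-bad-content-type))
  (alexandria:when-let (content-type (call-next-method))
    (ppcre:regex-replace-all "^([^/]+/[^/]+); *[^/]+/[^/;]+" content-type "\\1")))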

You'll then have to pass the :request-class argument when instantiating
your acceptor to make it use the request-with-bad-content-type class.  You
will also want to review the regular expression carefully and maybe profile
your application to see whether you need to cache or otherwise improve
performance.
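
As a rough sketch of both points, here is how the regular expression could
be checked at the REPL and how the acceptor might be created.  The helper
name strip-second-media-type, the variable *acceptor*, the port number, and
the choice of easy-acceptor are assumptions made for illustration; loading
the libraries through Quicklisp is an assumption as well.

```lisp
;; Assumes Hunchentoot and CL-PPCRE are loaded, for example with
;; (ql:quickload '(:hunchentoot :cl-ppcre)), and that
;; REQUEST-WITH-BAD-CONTENT-TYPE is defined as shown earlier.

;; 1. Checking the regular expression.
;; STRIP-SECOND-MEDIA-TYPE is a hypothetical helper, for illustration only.
(defun strip-second-media-type (content-type)
  "Remove a stray second type/subtype that directly follows the first one."
  (values (ppcre:regex-replace-all "^([^/]+/[^/]+); *[^/]+/[^/;]+"
                                   content-type
                                   "\\1")))

;; The header from this thread loses its extra "text/html" part:
(strip-second-media-type
 "application/x-www-form-urlencoded; text/html; charset=UTF-8")
;; => "application/x-www-form-urlencoded; charset=UTF-8"

;; A well-formed header does not match and comes back unchanged:
(strip-second-media-type "text/html; charset=UTF-8")
;; => "text/html; charset=UTF-8"

;; 2. Creating an acceptor that uses the custom request class.
;; The port and the use of EASY-ACCEPTOR are placeholders.
(defvar *acceptor*
  (make-instance 'hunchentoot:easy-acceptor
                 :port 4242
                 :request-class 'request-with-bad-content-type))

(hunchentoot:start *acceptor*)
```

Headers that are broken in other ways (for example, three media types in a
row) are not covered by this expression and would need their own test cases.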

-Hans


On Sun, May 26, 2013 at 5:07 AM, Jingtao Xu <jingtaozf at gmail.com> wrote:

> Hi Hans,
>
> I don't agree with you that this content type header is just bogus.
> As the content-type is sent by the largest B2B/B2C site in China, there
> must be a reason for it.
>
> And if you try cl-http, you can find that cl-http will parse such
> content type correctly.
>
>
> -----------------------------------------------------------------------------
> (parse-mime-content-type-header "application/x-www-form-urlencoded; text/html; charset=UTF-8")
>    ==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
>
> -----------------------------------------------------------------------------
>
> You can find the definition in cl-http/server/headers.lisp
>
> -----------------------------------------------------------------------------
> (define-header-type :content-type-header (:header)
>   :parse-function parse-mime-content-type-header
>   :print-function print-mime-content-type-header)
>
> -----------------------------------------------------------------------------
>
> Even if this content-type header is bogus (actually I don't think so),
> hunchentoot/drakma should parse the header without raising an error
> when a special variable like *accept-bogus-content-type* is true.
>
>
> With Best Regards,
> jingtao.
>
> On Sat, May 25, 2013 at 8:11 PM, Hans Hübner <hans.huebner at gmail.com> wrote:
> > Jingtao,
> >
> > the content-type header "application/x-www-form-urlencoded; text/html;
> > charset=UTF-8" is just bogus.  I do not want to include code that makes
> > Hunchentoot work with clearly broken clients.  Better error reporting
> > would be acceptable, though.
> >
> > -Hans
> >
> >
> > On Sat, May 25, 2013 at 12:38 PM, Jingtao Xu <jingtaozf at gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> I found the content type header that triggers the bug in the message.log
> >> generated by hunchentoot.
> >> It happens when hunchentoot gets the following content type header:
> >>
> >>
> >> -----------------------------------------------------------------------------------------
> >> application/x-www-form-urlencoded; text/html; charset=UTF-8
> >>
> >> -----------------------------------------------------------------------------------------
> >>
> >> I noticed that the function 'get-content-type' in drakma's file
> >> read.lisp also assumes "/" to be a token separator.
> >>
> >> I hope chunga/drakma/hunchentoot could accept such a content type
> >> header without raising an exception.  As Edi said, a new special
> >> variable similar to *accept-bogus-eols* or
> >> *treat-semicolon-as-continuation*, one that only assumes " ,;" to be
> >> token separators, may be a good idea and would solve my problem.
> >>
> >> Anyway, the RFC standard does not fit the real world well.
> >>
> >> Thanks very much.
> >>
> >> With Best Regards,
> >> jingtao.
> >>
> >>
> >> On Thu, May 23, 2013 at 2:01 PM, Edi Weitz <edi at agharta.de> wrote:
> >> > I'm not the maintainer anymore, but my take is that if some Ruby or
> >> > Java client misinterprets the RFC I wouldn't change Hunchentoot's (or
> >> > rather Chunga's) default behavior because of that.  I'd rather
> >> > introduce a new special variable similar to *accept-bogus-eols* or
> >> > *treat-semicolon-as-continuation*.
> >> >
> >> > Just my .02 Euros,
> >> > Edi.
> >> >
> >> >
> >> >
> >> > On Thu, May 23, 2013 at 2:52 AM, Jingtao Xu <jingtaozf at gmail.com> wrote:
> >> >> Hi All,
> >> >>
> >> >> 1. The function `read-name-value-pair' is called by
> >> >> `parse-content-type' in hunchentoot/util.lisp, not by my code.
> >> >> 2. The slash is a token constituent in Java/Ruby implementations, and I
> >> >> think some web clients/servers treat it as a token constituent too,
> >> >>     but I am waiting for the hunchentoot log to give us a live example.
> >> >>
> >> >> With Best Regards,
> >> >> jingtao
> >> >>
> >> >>
> >> >> On Wed, May 22, 2013 at 11:40 PM, Edi Weitz <edi at agharta.de> wrote:
> >> >>> If I'm not mistaken, the slash is a "separator" and thus not a token
> >> >>> constituent according to RFC 2616 which means "path=/foo" is not
> >> >>> legal input for READ-NAME-VALUE-PAIR.
> >> >>>
> >> >>> On Wed, May 22, 2013 at 5:27 PM, Ron Garret <ron at flownet.com> wrote:
> >> >>>> Very likely Jingtao's code is calling READ-NAME-VALUE-PAIR without
> >> >>>> being wrapped in this macro
> >> >>>>
> >> >>>> But there's still a bug in READ-NAME-VALUE-PAIR:
> >> >>>>
> >> >>>> ? (WITH-INPUT-FROM-VECTOR (S (MAP '(VECTOR (UNSIGNED-BYTE 8))
> >> >>>> 'CHAR-CODE "path=/foo"))
> >> >>>>   (chunga:with-character-stream-semantics
> >> >>>>       (CHUNGA:READ-NAME-VALUE-PAIR S)))
> >> >>>> ("path" . "")
> >> >>>>
> >> >>>> On May 22, 2013, at 8:19 AM, Edi Weitz wrote:
> >> >>>>
> >> >>>>> On Wed, May 22, 2013 at 4:18 PM, Ron Garret <ron at flownet.com> wrote:
> >> >>>>>> I found a bug in CHUNGA:READ-NAME-VALUE-PAIR.
> >> >>>>>
> >> >>>>> It's not quite clear to me yet what the bug is supposed to be.
> >> >>>>>
> >> >>>>> The documentation clearly says that calls to READ-NAME-VALUE-PAIR
> >> >>>>> and friends must be wrapped with this macro:
> >> >>>>>
> >> >>>>>  http://weitz.de/chunga/#with-character-stream-semantics
> >> >>>>>
> >> >>>>> (You might argue that this isn't very user-friendly, but Chunga
> >> >>>>> wasn't really intended to be used that way.)
> >> >>>>
> >> >>
> >
> >
>

