From lispercat at gmail.com  Wed Sep 16 20:59:59 2009
From: lispercat at gmail.com (Andrei Stebakov)
Date: Wed, 16 Sep 2009 16:59:59 -0400
Subject: [drakma-devel] drakma:get-cookies fails to parse cookies with commas in values
Message-ID:

One of the web sites started to give me cookies with commas and
drakma:get-cookies just crashes on those requests.
I distilled my case into a small example like this:

(drakma::get-cookies
  '((:CONTENT-TYPE . "text/html; charset=utf-8")
    (:LOCATION . "http://www.test")
    (:SERVER . "Microsoft-IIS/7.0")
    (:CONTENT-LENGTH . "46")
    (:DATE . "Sat, 12 Sep 2009 14:58:04 GMT") (:CONNECTION . "close")
    (:SET-COOKIE
     . "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=6,Direct,placeholder,test.com;")
    (:CACHE-CONTROL . "private"))
  (puri:parse-uri "http://www.test.com"))

It'll throw an exception trying to parse the
"session=6,Direct,placeholder,test.com" pair and will complain about the
commas.
I tried to capture the same page with FF Live Http Headers and it has no
problems with that.
Do you think we could change drakma to be able to digest it as well?

Thank you,
Andrei

From edi at agharta.de  Wed Sep 30 09:09:49 2009
From: edi at agharta.de (Edi Weitz)
Date: Wed, 30 Sep 2009 11:09:49 +0200
Subject: [drakma-devel] drakma:get-cookies fails to parse cookies with commas in values
In-Reply-To:
References:
Message-ID:

On Wed, Sep 16, 2009 at 10:59 PM, Andrei Stebakov wrote:

> One of the web sites started to give me cookies with commas and
> drakma:get-cookies just crashes on those requests.
> I distilled my case into a small example like this:
> (drakma::get-cookies
>   '((:CONTENT-TYPE . "text/html; charset=utf-8")
>     (:LOCATION . "http://www.test")
>     (:SERVER . "Microsoft-IIS/7.0")
>     (:CONTENT-LENGTH . "46")
>     (:DATE . "Sat, 12 Sep 2009 14:58:04 GMT") (:CONNECTION . "close")
>     (:SET-COOKIE
>      . "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT;
> session=6,Direct,placeholder,test.com;")
>     (:CACHE-CONTROL . "private"))
>   (puri:parse-uri "http://www.test.com"))
>
> It'll throw an exception trying to parse
> "session=6,Direct,placeholder,test.com" pair and will complain about the
> commas.
> I tried to capture the same page with FF Live Http Headers and it has no
> problems with that.
> Do you think we could change drakma to be able to digest it as well?

Sorry for the late reply.

I was going to write that IIS sends a wrong header according to the
RFCs, but after re-reading them I now think that one might interpret
them in a different way and that Drakma's general handling of commas
has to be reworked to accommodate this interpretation.

Stay tuned, I'll think about how this can best be achieved.

Edi.

From edi at agharta.de  Wed Sep 30 14:41:06 2009
From: edi at agharta.de (Edi Weitz)
Date: Wed, 30 Sep 2009 16:41:06 +0200
Subject: [drakma-devel] drakma:get-cookies fails to parse cookies with commas in values
In-Reply-To:
References:
Message-ID:

On Wed, Sep 30, 2009 at 11:09 AM, Edi Weitz wrote:

> I was going to write that IIS sends a wrong header according to the
> RFCs, but after re-reading them I now think that one might interpret
> them in a different way and that Drakma's general handling of commas
> has to be reworked to accommodate this interpretation.

No. In the meantime, I think this cookie really looks fishy.

In RFC 2109 (for "Set-Cookie") the syntax is defined as "1#cookie"
which according to the HTTP specification this RFC refers to means a
comma-separated list of values, i.e. if a comma is not quoted, it
separates one Set-Cookie header from the next one. I understand that
this is kind of sloppy already because lots of servers use a syntax
where the date in "expires" uses a comma in the wrong place and Drakma
caters to that. The question is how to deal with commas in general.

Consider this example:

  Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo,bar=baz

If sent by IIS this probably means (?) that the cookie "domain" has an
attribute "session" with the value "foo,bar=baz", right?

But it could also mean (see RFC) that the value of "session" is "foo"
and that there's a second cookie "bar" with the value "baz". In fact,
if Drakma reads two header lines like so

  Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo
  Set-Cookie: bar=baz

it will actually join them with a comma before parsing them (in
accordance with the HTTP RFC).

So, we could probably provide some special variable to make cookie
parsing less restrictive, but I wonder what the exact semantics of
this should be.

Any suggestions?

Thanks,
Edi.

From lispercat at gmail.com  Wed Sep 30 15:35:48 2009
From: lispercat at gmail.com (Andrei Stebakov)
Date: Wed, 30 Sep 2009 11:35:48 -0400
Subject: [drakma-devel] drakma:get-cookies fails to parse cookies with commas in values
In-Reply-To:
References:
Message-ID:

Looks like according to RFC 2109, "=" takes priority over "," so probably
when we encounter something like session=foo,bar=baz, the parser should
analyze sequences on both sides of an "=" character, so in this case comma
becomes a separator of two different pairs.

On Wed, Sep 30, 2009 at 10:41 AM, Edi Weitz wrote:

> On Wed, Sep 30, 2009 at 11:09 AM, Edi Weitz wrote:
>
> > I was going to write that IIS sends a wrong header according to the
> > RFCs, but after re-reading them I now think that one might interpret
> > them in a different way and that Drakma's general handling of commas
> > has to be reworked to accommodate this interpretation.
>
> No. In the meantime, I think this cookie really looks fishy.
>
> In RFC 2109 (for "Set-Cookie") the syntax is defined as "1#cookie"
> which according to the HTTP specification this RFC refers to means a
> comma-separated list of values, i.e. if a comma is not quoted, it
> separates one Set-Cookie header from the next one. I understand that
> this is kind of sloppy already because lots of servers use a syntax
> where the date in "expires" uses a comma in the wrong place and Drakma
> caters to that. The question is how to deal with commas in general.
>
> Consider this example:
>
> Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT;
> session=foo,bar=baz
>
> If sent by IIS this probably means (?) that the cookie "domain" has an
> attribute "session" with the value "foo,bar=baz", right?
>
> But it could also mean (see RFC) that the value of "session" is "foo"
> and that there's a second cookie "bar" with the value "baz". In fact,
> if Drakma reads two header lines like so
>
> Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT;
> session=foo
> Set-Cookie: bar=baz
>
> it will actually join them with a comma before parsing them (in
> accordance with the HTTP RFC).
>
> So, we could probably provide some special variable to make cookie
> parsing less restrictive, but I wonder what the exact semantics of
> this should be.
>
> Any suggestions?
>
> Thanks,
> Edi.

From edi at agharta.de  Wed Sep 30 21:58:32 2009
From: edi at agharta.de (Edi Weitz)
Date: Wed, 30 Sep 2009 23:58:32 +0200
Subject: [drakma-devel] drakma:get-cookies fails to parse cookies with commas in values
In-Reply-To:
References:
Message-ID:

On Wed, Sep 30, 2009 at 5:35 PM, Andrei Stebakov wrote:

> Looks like according to RFC 2109, "=" takes priority over "," so probably
> when we encounter something like session=foo,bar=baz, the parser should
> analyze sequences on both sides of an "=" character, so in this case comma
> becomes a separator of two different pairs.

Ah, that's something I've been missing so far. Can you point to where
exactly this can be found in the RFC? That should make the cookie
parsing code clearer and I should be able to get rid of the comma
workaround which is already in there.

Thanks,
Edi.
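
For illustration only, the heuristic suggested in this thread ("="
takes priority over ",") might be sketched roughly as follows. This is
not Drakma's code: SPLIT-COOKIES and COOKIE-START-P are hypothetical
names invented for this sketch, it uses only standard Common Lisp, and
it assumes that commas inside quoted strings do not need special
treatment. A comma is treated as a cookie separator only if the text
after it, up to the next "=", looks like a bare name.

```lisp
;; Hypothetical helpers, not part of Drakma.

(defun cookie-start-p (string start)
  "True if the text at START (after optional whitespace) looks like `name='."
  (let* ((begin (or (position-if-not (lambda (c) (member c '(#\Space #\Tab)))
                                     string :start start)
                    (length string)))
         (eq-pos (position #\= string :start begin)))
    (and eq-pos
         (> eq-pos begin)
         ;; The candidate name must not contain separators or whitespace.
         (notany (lambda (c) (member c '(#\, #\; #\Space #\Tab #\")))
                 (subseq string begin eq-pos)))))

(defun split-cookies (header)
  "Split HEADER at every comma that is directly followed by `name='."
  (let ((pieces '())
        (piece-start 0))
    (loop for i from 0 below (length header)
          when (and (char= (char header i) #\,)
                    (cookie-start-p header (1+ i)))
            do (push (string-trim " " (subseq header piece-start i)) pieces)
               (setf piece-start (1+ i)))
    (push (string-trim " " (subseq header piece-start)) pieces)
    (nreverse pieces)))

;; The comma in the "expires" date and the commas in the IIS value are
;; left alone, because no bare `name=' follows them:
(split-cookies "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=6,Direct,placeholder,test.com;")

;; Here the second comma is followed by `bar=', so two pieces come back:
(split-cookies "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo,bar=baz")
```

Note that for "session=foo,bar=baz" this sketch picks the two-cookie
reading, which is only one of the two interpretations discussed above;
whether that matches what a given server meant is exactly the open
question in the thread.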