From cmucl-devel at common-lisp.net  Fri Aug 13 00:02:55 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Fri, 13 Aug 2010 00:02:55 -0000
Subject: [cmucl-ticket] [cmucl] #36: file-position broken for utf16 and utf32
In-Reply-To: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
References: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
Message-ID: <061.5e0156e6cdb9048d96f4b167de9513dd@common-lisp.net>

#36: file-position broken for utf16 and utf32
---------------------+------------------------------------------------------
  Reporter:  rtoy    |        Owner:  somebody
      Type:  defect  |       Status:  new
  Priority:  minor   |    Milestone:
 Component:  Core    |      Version:  2010-01
Resolution:          |     Keywords:
---------------------+------------------------------------------------------
Comment(by rtoy):

 One possible solution is to keep track of the number of octets used to
 create each character. This has a relatively high cost because we need to
 save this for each character, for all inputs, but the data is only used
 for file-position. This seems really wasteful of MIPS and memory since
 file-position probably occurs much less often than reading characters.

 Another alternative would be to modify string-encode so that the BOM is
 not included. But that's a bit tricky too. Either we need a new method for
 each external format (that needs it), or we need to add an extra parameter
 to the external format method to say we don't want a BOM. Not too hard to
 do, but it is some work to modify every format for this.

 Or maybe string-encode can take a new argument specifying the ef state.
 But then we would need a new ef function to give us the ef state that
 will guarantee no BOM.

 Or, the most hackish but workable solution: look at the output of
 string-encode. If the first two octets are the BOM, adjust for that.
 Seems doable.
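
 [Illustration] The last option above (inspect the encoder output and
 adjust for a leading BOM) can be sketched as follows. This is a minimal
 sketch in Python, not CMUCL code: `encoded_length_without_bom` is a
 hypothetical helper, Python's `str.encode` stands in for string-encode,
 and only the utf16 case is covered.

```python
import codecs


def encoded_length_without_bom(text, encoding="utf-16"):
    """Return the octet length of `text` once encoded, ignoring a leading BOM.

    Hypothetical helper for illustration only. It encodes the text, then
    checks whether the first two octets are a UTF-16 BOM and, if so,
    subtracts them from the count.
    """
    octets = text.encode(encoding)
    for bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        if octets.startswith(bom):
            # Caveat (why this is a hack): text that genuinely begins with
            # U+FEFF looks identical to an encoder-added BOM here.
            return len(octets) - len(bom)
    return len(octets)
```

 With Python's "utf-16" codec, which always prepends a BOM when encoding,
 `encoded_length_without_bom("abc")` gives 6 rather than 8. The caveat in
 the comment is the reason the approach is only a workaround: the check
 cannot tell a BOM added by the encoder from a U+FEFF that was part of
 the data.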

From cmucl-devel at common-lisp.net  Fri Aug 13 16:51:29 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Fri, 13 Aug 2010 16:51:29 -0000
Subject: [cmucl-ticket] [cmucl] #36: file-position broken for utf16 and utf32
In-Reply-To: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
References: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
Message-ID: <061.2878b8304f49e9ffd231546da9c884fb@common-lisp.net>

#36: file-position broken for utf16 and utf32
---------------------+------------------------------------------------------
  Reporter:  rtoy    |        Owner:  somebody
      Type:  defect  |       Status:  new
  Priority:  minor   |    Milestone:
 Component:  Core    |      Version:  2010-01
Resolution:          |     Keywords:
---------------------+------------------------------------------------------
Comment(by rtoy):

 Keeping track of the octets is probably the only "correct" solution.
 There's no guarantee that the input (octet-to-code) state has any
 relationship to the output (code-to-octet) state, so there may be no
 consistent way to run string-encode correctly.

 Some tests with keeping track of the char lengths indicate that the cost
 is fairly low, at least when reading characters one at a time (but the
 conversion is still done a block at a time and doled out one character at
 a time).
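
 [Illustration] The idea of recording the octet length of each character
 while a block is decoded, and using those lengths to answer
 file-position, can be sketched as follows. This is a minimal sketch in
 Python, not the CMUCL implementation: the names
 `decode_utf16_with_lengths` and `TrackedBuffer` are hypothetical, only
 utf16 is handled, and treating the BOM as part of the starting offset is
 an assumption of this sketch.

```python
import struct


def decode_utf16_with_lengths(octets):
    """Decode a block of UTF-16 octets.

    Returns (bom_len, chars, lengths), where lengths[i] is the number of
    octets that produced chars[i]. Big-endian is assumed when no BOM is
    present (an assumption of this sketch).
    """
    fmt, pos = ">H", 0
    if octets[:2] == b"\xff\xfe":
        fmt, pos = "<H", 2
    elif octets[:2] == b"\xfe\xff":
        fmt, pos = ">H", 2
    bom_len = pos
    chars, lengths = [], []
    while pos + 2 <= len(octets):
        (unit,) = struct.unpack_from(fmt, octets, pos)
        size = 2
        # A high surrogate followed by a low surrogate is one 4-octet character.
        if 0xD800 <= unit <= 0xDBFF and pos + 4 <= len(octets):
            (low,) = struct.unpack_from(fmt, octets, pos + 2)
            if 0xDC00 <= low <= 0xDFFF:
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                size = 4
        chars.append(chr(unit))
        lengths.append(size)
        pos += size
    return bom_len, chars, lengths


class TrackedBuffer:
    """Hands out characters one at a time from a block decoded up front."""

    def __init__(self, octets, base_offset=0):
        bom_len, self.chars, self.lengths = decode_utf16_with_lengths(octets)
        self.start = base_offset + bom_len  # octet offset of the first char
        self.index = 0                      # number of chars handed out

    def read_char(self):
        ch = self.chars[self.index]
        self.index += 1
        return ch

    def file_position(self):
        # Octet offset = start of block + octets of the chars consumed so far.
        return self.start + sum(self.lengths[: self.index])
```

 For a little-endian block holding a BOM, "a", one character outside the
 BMP, and "b", the reported position goes 2, 4, 8, 10 as the three
 characters are read. The lengths are only consulted when file_position
 is called, so the per-character cost while reading is one extra array
 store.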

From cmucl-devel at common-lisp.net  Sun Aug 15 12:55:06 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Sun, 15 Aug 2010 12:55:06 -0000
Subject: [cmucl-ticket] [cmucl] #36: file-position broken for utf16 and utf32
In-Reply-To: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
References: <052.abece47505902f36eb5d5040f3f75545@common-lisp.net>
Message-ID: <061.a7322bf504faa1e63e9389a02938dd57@common-lisp.net>

#36: file-position broken for utf16 and utf32
---------------------+------------------------------------------------------
  Reporter:  rtoy    |        Owner:  somebody
      Type:  defect  |       Status:  closed
  Priority:  minor   |    Milestone:
 Component:  Core    |      Version:  2010-01
Resolution:  fixed   |     Keywords:
---------------------+------------------------------------------------------
Changes (by rtoy):

  * status:  new => closed
  * resolution:  => fixed

Comment:

 Fixed by using an array to hold the octet length of each character. Tests
 show a very small change in speed (about a 1% increase in time).

From cmucl-devel at common-lisp.net  Sun Aug 15 13:05:35 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Sun, 15 Aug 2010 13:05:35 -0000
Subject: [cmucl-ticket] [cmucl] #41: empty unwind-protect
In-Reply-To: <054.4521ad1eef2fd171dcedc56236913e87@common-lisp.net>
References: <054.4521ad1eef2fd171dcedc56236913e87@common-lisp.net>
Message-ID: <063.13786a6cc48ca73ef3640937365d5bb2@common-lisp.net>

#41: empty unwind-protect
--------------------------+-------------------------------------------------
  Reporter:  heller       |        Owner:  somebody
      Type:  enhancement  |       Status:  new
  Priority:  minor        |    Milestone:
 Component:  Core         |      Version:  20a
Resolution:               |     Keywords:
--------------------------+-------------------------------------------------
Comment(by rtoy):

 Yes, it does generate quite a bit. Are you expecting the compiler to check
 that the protected form is safe and to elide the unwind-protect?