File I/O Metrics

Robert Goldman rpgoldman at sift.net
Fri Oct 21 20:47:04 UTC 2022


I don't know what data you are reading, but is there any chance that the
difference is that when you read text in Lisp as ISO-8859-1, Lisp is
actually decoding each byte into a Unicode character, whereas in Java
you are just slamming raw bytes into memory?

Maybe this is relevant? 
https://stackoverflow.com/questions/979932/read-unicode-text-files-with-java

I don't use Java myself, so I can't say, and I don't have access to your 
data, but it does seem like the Java code is doing something simpler 
than the Lisp code.

What happens if you change your Lisp code so that `read-sequence` reads
bytes (element type `(unsigned-byte 8)`) instead of `character`s?
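
The converse experiment would be to make the Java side do the decoding
too, so both programs perform comparable work. Something along these
lines might serve as a starting point. Treat it as a sketch that has not
been run against your files: the names `ReadComparison`, `countBytes`,
and `countChars` are made up for illustration, and the only library
pieces assumed are the standard `InputStreamReader` and
`StandardCharsets.ISO_8859_1`.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of the original benchmark.
class ReadComparison {

    // Raw read: bytes are copied into the buffer with no decoding step.
    static long countBytes(String path, int bufferSize) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            byte[] buff = new byte[bufferSize];
            long total = 0;
            int n;
            while ((n = in.read(buff)) != -1) {
                total += n;
            }
            return total;
        }
    }

    // Decoding read: each byte is converted to a char as ISO-8859-1,
    // which is closer to what the Lisp version with a character buffer does.
    static long countChars(String path, int bufferSize) throws IOException {
        try (Reader in = new InputStreamReader(
                new FileInputStream(path), StandardCharsets.ISO_8859_1)) {
            char[] buff = new char[bufferSize];
            long total = 0;
            int n;
            while ((n = in.read(buff)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args[0];               // file to read
        int bufferSize = 16 * 1024 * 1024;   // 16M, as in the original test

        long start = System.nanoTime();
        long bytes = countBytes(path, bufferSize);
        double rawSecs = (System.nanoTime() - start) / 1e9;

        start = System.nanoTime();
        long chars = countChars(path, bufferSize);
        double decodedSecs = (System.nanoTime() - start) / 1e9;

        System.out.printf("raw bytes:     %d in %.2f secs%n", bytes, rawSecs);
        System.out.printf("decoded chars: %d in %.2f secs%n", chars, decodedSecs);
    }
}
```

Note that the second pass will probably find the file in the operating
system's cache, so it is worth running the program more than once before
comparing the two timings.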

On 21 Oct 2022, at 13:43, Garrett Dangerfield wrote:

> I don't want to cause a firestorm here, but I was doing some simple
> benchmarks on file I/O between Java, ABCL, and SBCL and I'm a bit
> shocked, honestly.
>
> Reading a 2.5M file in 16M chunks (using iso-8859-1):
> - abcl takes a tad over 1 second
> - sbcl takes 0.04 seconds
>
> Reading a 5.8G file in 16M chunks (using iso-8859-1 for Lisp; for Java
> it's just bytes):
> - abcl takes...too long, I gave up
> - sbcl takes between 20 and 21 seconds
> - Java takes 1.5 seconds
>
> These are all run on the same computer using the same files, etc.
>
> What's up with this?  Thoughts?  I'd heard that SBCL should be as fast
> as C under at least some circumstances.  I'd wager that C is at least
> as fast as Java (probably faster).
>
> Thanks,
> Garrett Dangerfield. (he/him/his)
>
> P.S. Don't get me wrong, I *LOVE* Lisp, I'm trying to get away from
> Java as fast as I can (the syntax is killing me slowly).  I've used
> ABCL in projects before (it was wonderful, Java doesn't handle XML
> well).
>
> Lisp code:
>   (with-open-file (stream "/media/danger/OS/temp/jars.txt"
>                           :external-format :iso-8859-1) ; great_expectations.iso
>     (let ((size (file-length stream))
>           (buffer-size (* 16 1024 1024))) ; 16M
>       (time
>        (loop with buffer = (make-array buffer-size
>                                        :element-type 'character)
>              for n-characters = (read-sequence buffer stream)
>              while (< 0 n-characters)))))
>
> Java code:
> private static final int BUFFER_SIZE = 16 * 1024 * 1024;
>
> // Needs java.io.FileInputStream and java.io.InputStream.
> try (InputStream in = new FileInputStream(
>         "/media/danger/OS/temp/great_expectations.iso")) {
>     byte[] buff = new byte[BUFFER_SIZE];
>     int chunkLen = -1;
>     long start = System.currentTimeMillis();
>     // Raw byte reads: no character decoding happens here.
>     while ((chunkLen = in.read(buff)) != -1) {
>         System.out.println("chunkLen = " + chunkLen);
>     }
>     double duration = System.currentTimeMillis() - start;
>     duration /= 1000;
>     System.out.println(String.format("it took %,.2f secs", duration));
> } catch (Exception e) {
>     e.printStackTrace(System.out);
> } finally {
>     System.out.println("Done.");
> }


Robert P. Goldman
Research Fellow
Smart Information Flow Technologies (d/b/a SIFT, LLC)

319 N. First Ave., Suite 400
Minneapolis, MN 55401

Voice:	(612) 326-3934
Email:    rpgoldman at SIFT.net