Parsing big XML files with klacks and sbcl
Mark Janssen
mpc.janssen at gmail.com
Tue May 29 16:33:57 UTC 2018
I am trying to parse a big XML file (several GB) and I am
using klacks because of the size.
However, there seems to be a leak during parsing: memory use
increases continuously until SBCL runs out of memory.
What am I missing?
Regards,
Mark
Some info:
$ sbcl --version
SBCL 1.4.8
The script:
(ql:quickload 'cxml)
(defparameter *src* (cxml:make-source (pathname "huge.xml")))
(loop while t do
  (klacks:consume *src*))
Output of (gc) and (room t) after breaking into the debugger:
0] (gc)
; No debug variables for current frame: using EVAL instead of EVAL-IN-FRAME.
NIL
0] (room t)
; No debug variables for current frame: using EVAL instead of EVAL-IN-FRAME.
Dynamic space usage is: 231,363,600 bytes.
Immobile space usage is: 15,866,480 bytes (116,720 bytes overhead).
Read-only space usage is: 0 bytes.
Static space usage is: 704 bytes.
Control stack usage is: 9,648 bytes.
Binding stack usage is: 2,064 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.
Summary of spaces: dynamic immobile static
CONS:
198,982,960 bytes, 12,436,435 objects, 100% dynamic.
CODE:
13,755,264 bytes, 22,368 objects, 100% immobile, 0% dynamic.
SIMPLE-VECTOR:
10,533,136 bytes, 80,217 objects, 100% dynamic.
INSTANCE:
7,169,776 bytes, 126,568 objects, 2% immobile, 98% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-64:
3,423,232 bytes, 1,867 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-8:
2,874,208 bytes, 39,494 objects, 100% dynamic.
SIMPLE-BASE-STRING:
2,031,264 bytes, 40,187 objects, 100% dynamic.
SYMBOL:
1,778,320 bytes, 37,048 objects, 0% static, 67% immobile, 33% dynamic.
BIGNUM:
1,327,760 bytes, 40,630 objects, 100% dynamic.
SIMPLE-CHARACTER-STRING:
1,156,736 bytes, 13,489 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-32:
888,800 bytes, 24,915 objects, 100% dynamic.
FDEFN:
663,360 bytes, 20,730 objects, 0% static, 100% immobile.
CLOSURE:
595,344 bytes, 16,722 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-16:
588,608 bytes, 4,832 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-31:
243,616 bytes, 4 objects, 100% dynamic.
SIMPLE-ARRAY-SIGNED-BYTE-8:
196,208 bytes, 6,131 objects, 100% dynamic.
FUNCALLABLE-INSTANCE:
160,368 bytes, 4,149 objects, 44% immobile, 56% dynamic.
SIMPLE-BIT-VECTOR:
44,544 bytes, 100 objects, 100% dynamic.
SIMPLE-ARRAY-SIGNED-BYTE-16:
15,072 bytes, 208 objects, 100% dynamic.
SIMPLE-ARRAY-SIGNED-BYTE-32:
8,096 bytes, 194 objects, 100% dynamic.
VALUE-CELL:
5,584 bytes, 349 objects, 100% dynamic.
SIMPLE-ARRAY-FIXNUM:
2,960 bytes, 7 objects, 100% dynamic.
ARRAY-HEADER:
2,208 bytes, 28 objects, 100% dynamic.
RATIO:
1,024 bytes, 32 objects, 100% dynamic.
DOUBLE-FLOAT:
704 bytes, 44 objects, 100% dynamic.
WEAK-POINTER:
448 bytes, 14 objects, 100% dynamic.
SAP:
256 bytes, 16 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-2:
96 bytes, 2 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-FIXNUM:
80 bytes, 3 objects, 100% dynamic.
COMPLEX-DOUBLE-FLOAT:
64 bytes, 2 objects, 100% dynamic.
COMPLEX:
32 bytes, 1 object, 100% dynamic.
COMPLEX-SINGLE-FLOAT:
32 bytes, 2 objects, 100% dynamic.
SIMD-PACK:
32 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-NIL:
32 bytes, 2 objects, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-4:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-7:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-15:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-UNSIGNED-BYTE-63:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-SIGNED-BYTE-64:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-SINGLE-FLOAT:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-DOUBLE-FLOAT:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-COMPLEX-SINGLE-FLOAT:
16 bytes, 1 object, 100% dynamic.
SIMPLE-ARRAY-COMPLEX-DOUBLE-FLOAT:
16 bytes, 1 object, 100% dynamic.
Summary total:
246,450,368 bytes, 12,916,800 objects.
Top 10 dynamic instance types:
COMPILED-DEBUG-FUN 1,534,912 bytes, 23,983 objects.
COMPILED-DEBUG-FUN-EXTERNAL 1,468,160 bytes, 22,940 objects.
COMPILED-DEBUG-INFO 986,016 bytes, 20,542 objects.
DEFINITION-SOURCE-LOCATION 241,056 bytes, 7,533 objects.
FAST-METHOD-CALL 239,952 bytes, 4,999 objects.
SLOT-INFO 220,272 bytes, 4,589 objects.
VOP-PARSE 190,624 bytes, 851 objects.
VOP-INFO 183,456 bytes, 819 objects.
FUN-TYPE 155,520 bytes, 1,620 objects.
ARG-INFO 140,928 bytes, 1,468 objects.
Other types 1,691,632 bytes, 36,184 objects.
Dynamic instance total 7,052,528 bytes, 125,528 objects.
Top 10 immobile instance types:
LAYOUT 119,616 bytes, 1,068 objects.
PACKAGE 4,224 bytes, 33 objects.
Immobile instance total 123,840 bytes, 1,101 objects.
Top 10 static instance types:
Static instance total 0 bytes, 0 objects.