<div dir="ltr">Howdy,<div><br></div><div style>I wonder if those of you who have worked with threads might have a quick look to see if I am doing something stupid.</div><div style><br></div><div style><a href="https://lsw2.googlecode.com/svn/branches/bona/util/jargrep.lisp">https://lsw2.googlecode.com/svn/branches/bona/util/jargrep.lisp</a><br>
</div><div style><br></div><div style>The situation is that I want to do stuff (like look for matches to a regular expression) in 240k files, which comprise 52 GB of data.</div><div style><br></div><div style>I am running on a VM allocated 5 CPUs, each with three cores.</div>
<div style><br></div><div style>Because the disk subsystem isn't very fast at the moment, I decided to approach this by breaking up the 240k files into 15 parts and putting each part in a jar file.</div><div style><br></div>
<div style>The code mentioned above looks for a regular expression (two methods for two different regex handlers: Java and dk.brics.automaton).</div><div style><br></div><div style>It is invoked something like:</div><div style>
<br></div><div style><div>(jar-map-threads-automaton-find</div><div style> regex</div><div> (generate-filename-sequence "/data/jars/15/file#.jar" 2 0 14))</div><div><br></div><div style>This spawns off 15 threads that go at it for around a minute. As they find hits, they save them in a Lisp hash table keyed by the entry name in the jar file, which is unique across all the jar files.</div>
<div style><br></div><div style>The result of running this is about (and there's the rub) 20 key-value pairs in the hash table (I had read that ABCL hash tables are thread-safe). The problem is that different runs of this code on the same data produce different numbers of key-value pairs, between 13 and 24!</div>
<div style><br></div><div style>I'm not sure whether I'm just not doing this the right way, in which case it would be very helpful to get an explanation of why not, or there's a problem somewhere in the implementation.</div>
<div style><br></div><div style>Any ideas would be greatly appreciated.</div><div style><br></div><div style>Best,</div><div style>Alan</div><div style><br></div><div style><br></div><div style><div>(LISP-IMPLEMENTATION-VERSION)</div>
<div>"1.2.0-dev-svn-14436M" </div>
<div>"Java_HotSpot(TM)_64-Bit_Server_VM-Oracle_Corporation-1.7.0_21-b11" </div>
<div>"amd64-Linux-3.8.0-30-generic" </div></div><div><br></div></div><div style><br></div></div>