I want to load a large (~20 MB) log file into a byte array, which I then use for searching, editing, and so on.
My problem is that loading the log file is really fast (< 2 seconds) as long as the file is about 10 MB or smaller. But as soon as the file is larger than 15 MB, the process seems to hang and nothing happens for about 5 minutes (I then kill the process). Memory should not be the problem, as the program runs on a server (with 2 GB RAM) and I reserve about 300 MB for this process alone (via the command-line arguments "-Xms300m -Xmx300m" at startup).
Here is the code snippet of the method I use to read the file:
// Imports needed by this method; "logger" is a field of the enclosing class (not shown).
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.BufferOverflowException;

public static byte[] read2array(String file) throws FileNotFoundException, OutOfMemoryError {
    // fast way to get file from file system to byte[]
    InputStream in = null;
    boolean getFileAgain = true; // if an exception occurred, get the file again
    byte[] out = new byte[0];
    while (getFileAgain) {
        getFileAgain = false; // to step out if everything is OK
        try {
            in = new BufferedInputStream(new FileInputStream(file));
            // the length of the buffer can vary
            int bufLen = 10000 * 1024;
            byte[] buf = new byte[bufLen];
            byte[] tmp = null;
            int len = 0;
            while ((len = in.read(buf, 0, bufLen)) != -1) {
                // extend array: old content plus the bytes just read
                tmp = new byte[out.length + len];
                // copy data
                System.arraycopy(out, 0, tmp, 0, out.length);
                System.arraycopy(buf, 0, tmp, out.length, len);
                out = tmp;
                tmp = null;
            }
        } catch (BufferOverflowException be) {
            logger.info("BufferOverflowException: " + be.toString() + " - try again!");
            // reset values to get the file again
            out = new byte[0];
            getFileAgain = true;
        } catch (OutOfMemoryError me) {
            logger.info("OutOfMemoryError: " + me.toString());
            throw me;
        } catch (IOException ie) {
            // log instead of swallowing silently; the bytes read so far are returned
            logger.info("IOException: " + ie.toString());
        } finally {
            // always close the stream
            if (in != null) {
                try {
                    in.close();
                } catch (Exception e) {
                    // nothing useful can be done if close() fails
                }
            }
        }
    }
    return out;
}
Anyone there who can help me - this would really be a cracker!
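For comparison, here is a minimal sketch of a read that sizes the array once from the file length instead of growing it on every read. It is not the code from the post above; the class name WholeFileReader is made up for illustration, and it assumes the file is not being appended to while it is read (a log file that is still being written would only be read up to the length seen at the start).

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of the original post.
final class WholeFileReader {

    // Reads the whole file into one byte[] that is allocated exactly once.
    static byte[] read2array(String file) throws IOException {
        File f = new File(file);
        long size = f.length();
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single byte[]: " + size);
        }
        byte[] out = new byte[(int) size];
        // FileInputStream throws FileNotFoundException if the file does not exist.
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            // readFully blocks until the array is full or throws EOFException.
            in.readFully(out);
        }
        return out;
    }
}
```

With this shape there is one allocation of the final size, so the peak memory use is roughly the file size rather than the old array plus the new array plus the read buffer.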
Do you really need the whole file in memory at the same time? Even if you could load the whole file at once, you would still be examining small portions at any given time.
There are several workarounds for this problem. Here are some:
1) Redesign, so that you are working on a 'window view' of the file at any one time (i.e. only load a segment and do the work on it).
2) Write a program that takes the file and splits it into several 'binned' files. Binning works like sorting a truckload of mixed fruit by broad type: all apples, regardless of variety, go in the apple bin, and so on. The same goes for the binned files: if you have some high-level discriminators, you can easily do the binning and then work on the sub-pieces one at a time. Then redesign your code to work with the bins.
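The 'window view' idea from option 1 can be sketched roughly as follows: the file is read in fixed-size windows and each window is searched for a byte pattern, so only one window is in memory at a time. The class name WindowedSearch and the parameter windowSize are made up for illustration; this is a sketch under those assumptions, not a tuned implementation.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper illustrating a windowed search.
final class WindowedSearch {

    // Returns the file offset of the first occurrence of pattern, or -1 if absent.
    static long indexOf(String file, byte[] pattern, int windowSize) throws IOException {
        if (pattern.length == 0) return 0;
        int overlap = pattern.length - 1;
        if (windowSize <= overlap) {
            throw new IllegalArgumentException("window must be larger than the pattern");
        }
        byte[] window = new byte[windowSize];
        int filled = 0;  // number of valid bytes in the window
        long base = 0;   // file offset of window[0]
        try (InputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(window, filled, window.length - filled)) != -1) {
                filled += n;
                int hit = find(window, filled, pattern);
                if (hit >= 0) return base + hit;
                // Keep the last (pattern.length - 1) bytes so that a match
                // spanning two windows is not missed.
                int keep = Math.min(overlap, filled);
                System.arraycopy(window, filled - keep, window, 0, keep);
                base += filled - keep;
                filled = keep;
            }
        }
        return -1;
    }

    // Naive search in data[0..len); returns the index of the first match or -1.
    private static int find(byte[] data, int len, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= len; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) continue outer;
            }
            return i;
        }
        return -1;
    }
}
```

The overlap between consecutive windows is the part that is easy to forget: without it, a search term that straddles a window boundary would never be found.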
Thanks a lot for your code - it also works with larger files now (I tested with a file of about 20 MB).
@all
But the application still hangs with larger files (> 30 MB) - at least I now know that this is always due to a BufferOverflowException!
The program is at the beginning of this topic (in my first post). It has changed slightly because of the new read2array method provided above, but the problem is still the same!
Buffer overflows? With the code I listed? My test was on a 40+ mebibyte file. I just tried it with a 124,796,928-byte (119+ mebibyte) file, with no issues whatsoever; it completed (again, on a really slow box) in 7.543 seconds.
@jensenje & all
OK, I am sorry: the problem (BufferOverflowException) does not occur in the code listed by jensenje, but afterwards, when I try to make a String out of the byte[] that is returned from the function. I use the following statement to get the String:
String wholeFileString = new String(read2array(fileName));
(where read2array is the function listed above from jensenje)
The BufferOverflowException occurs within this statement! Maybe a solution would be to use another function to transform the byte[] to a String, but which?
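One alternative to new String(byte[]) is to decode the file through a Reader in modest chunks with an explicitly named charset, instead of decoding one huge array with the platform default. The sketch below shows that idea; the class name FileToString is made up, the right charset for the log file is an assumption the caller has to supply, and whether this avoids the BufferOverflowException reported above has not been tested.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper, not the code discussed in the thread.
final class FileToString {

    // Decodes the file chunk by chunk and collects the text in a StringBuilder.
    static String readAll(String file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64 * 1024];
        try (Reader in = new InputStreamReader(new FileInputStream(file), charset)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

If the byte[] is already in memory, the constructor new String(bytes, charset) at least removes the dependence on the platform default encoding. Either way the resulting String still holds the whole file, so the memory concerns raised elsewhere in this thread remain.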
If you need to send the file, use a buffered input stream. If you need to edit the file, you could use a temporary file and a RandomAccessFile; you only need to manage the part you are currently showing.
Alternatively, your application can manage a window of the data, or a file cache, in order to work with the whole file.
I can help you if you send some data.
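As a rough illustration of the RandomAccessFile suggestion, the sketch below reads only a slice of the file starting at a given offset, which is enough to fill whatever part is currently shown. The class name FileWindow and its parameters are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper illustrating random access to part of a file.
final class FileWindow {

    // Returns up to maxLength bytes starting at offset; fewer near the end of the file.
    static byte[] read(String file, long offset, int maxLength) throws IOException {
        if (offset < 0 || maxLength < 0) {
            throw new IllegalArgumentException("offset and maxLength must not be negative");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long remaining = Math.max(0, raf.length() - offset);
            int len = (int) Math.min(maxLength, remaining);
            byte[] out = new byte[len];
            raf.seek(offset);
            raf.readFully(out);
            return out;
        }
    }
}
```

For example, FileWindow.read(fileName, new java.io.File(fileName).length() - 65536, 65536) would fetch roughly the last 64 KB of a file that is at least that large, without touching the rest.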
I wouldn't really rely on such a large String. Is there something in the String class in terms of functionality that you need? You will certainly NOT want to load/convert such a large byte array into a String. Post what you want to do with the data after it is loaded, and I bet some better suggestions could come from that.
The thing is, I want to be able to search for a given text in the whole log file (or maybe just the last 10000 lines). To do this I currently use the indexOf method of the String class. That is the main reason I use a String. Also, I want to display the file (if requested, the whole file!!!) in an SWT text field, so that the user can navigate through this multi-line text field, and therefore I guess I have to use a String.
If anyone has any other suggestions on how to do the search or the text display, I would really be interested. A small code example would be a great help!
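For the "search only the last 10000 lines" case, one possible sketch is to stream the file line by line, keep only the most recent lines in a bounded queue, and search those. The class name TailSearch and its parameters are made up for illustration, and the charset of the log file is an assumption the caller must supply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: searches the last maxLines lines of a text file.
final class TailSearch {

    // Returns the lines among the last maxLines lines that contain needle, in file order.
    static List<String> findInLastLines(String file, Charset charset, int maxLines, String needle)
            throws IOException {
        List<String> hits = new ArrayList<>();
        if (maxLines <= 0) return hits;
        Deque<String> tail = new ArrayDeque<>();
        try (BufferedReader in = Files.newBufferedReader(Paths.get(file), charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Drop the oldest line once the queue holds maxLines entries.
                if (tail.size() == maxLines) tail.removeFirst();
                tail.addLast(line);
            }
        }
        for (String line : tail) {
            if (line.contains(needle)) hits.add(line);
        }
        return hits;
    }
}
```

Memory use is bounded by the last maxLines lines rather than the whole file. The same queue of lines could also be joined into the text that is displayed, so the text field never receives more than that tail unless the user explicitly asks for the full file.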