Java Programming [Archive] - How to load large files
This topic has 26 replies on 2 pages.    1 | 2 | Next »
nocomm
Posts:86
Registered: 7/9/03
How to load large files   
Oct 23, 2003 6:12 AM

 
Hi all!

I want to load a large (~20mb) log file into a byte array, which I then use for searching, editing, ....
My problem is that loading the log file is really fast (< 2 seconds) as long as the file is about 10mb or smaller. But as soon as the file is larger than 15mb the process seems to hang, and nothing happens for about 5 minutes (I then kill the process). Memory should not be the problem, as the program runs on a server (with 2GB RAM) and I reserve about 300mb for this process alone (via the command line arguments "-Xms300m -Xmx300m" at startup).

Here is the code snippet of the method I use to read the file:

// imports needed by this method:
// import java.io.*;
// import java.nio.BufferOverflowException;

public static byte[] read2array(String file) throws FileNotFoundException, OutOfMemoryError {
    // fast way to get file from file system to byte[]
    InputStream in = null;
    boolean getFileAgain = true; // if an exception occurred, get file again
    byte[] out = new byte[0];

    // note: this loop retries without limit for as long as a BufferOverflowException keeps occurring
    while (getFileAgain) {
        getFileAgain = false; // to step out if everything is OK
        try {
            in = new BufferedInputStream(new FileInputStream(file));
            // the length of a buffer can vary
            int bufLen = 10000 * 1024;
            byte[] buf = new byte[bufLen];
            byte[] tmp = null;
            int len = 0;
            while ((len = in.read(buf, 0, bufLen)) != -1) {
                // extend array
                tmp = new byte[out.length + len];
                // copy data
                System.arraycopy(out, 0, tmp, 0, out.length);
                System.arraycopy(buf, 0, tmp, out.length, len);
                out = tmp;
                tmp = null;
            }
        } catch (BufferOverflowException be) {
            logger.info("BufferOverflowException: "
                    + be.toString() + " - try again!");
            // reset values to get file again
            out = new byte[0];
            getFileAgain = true;
        } catch (OutOfMemoryError me) {
            logger.info("OutOfMemoryError: "
                    + me.toString());
            throw me;
        } catch (IOException ie) {
            // I/O errors (including FileNotFoundException) are silently ignored here
        } finally {
            // always close the stream
            if (in != null)
                try {
                    in.close();
                } catch (Exception e) {}
        }
    }
    return out;
}

Anyone there who can help me - this would really be a cracker!

nocomm
 
plumone
Posts:402
Registered: 7/17/03
Re: How to load large files   
Oct 23, 2003 6:42 AM (reply 1 of 26)  (In reply to original post )

 
Do you really need the whole file in memory at the same time? Even if you could load the whole file at once, you would still be examining small portions at any given time.

There are several workaround approaches to this problem. Here are some:

1) redesign, so you are working on a 'window view' of the file at any one time (ie only load a segment and do the work).

2) write a program that takes the file and splits it into several 'binned' files. binning would be something along the line of if you had a truck load of mixed fruit and you needed to separate by broad types. In this case all apples regardless of variety would go in the apple bin, and so on. The same thing with the binned files. If you have some high level discriminators then you can easily do the binning and then work on the sub pieces one at a time. Then redesign your code to work with the bins

3) redesign, consider using a db approach.
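
The 'window view' idea from point 1 could be sketched roughly as follows. This is only an illustration, not code from the thread: the class name WindowedSearch, the method indexOf and the windowSize parameter are all assumed names. It scans the file one window at a time and carries the last few bytes over to the next window so that a match straddling a window boundary is not missed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class WindowedSearch {

    /** Returns the file offset of the first occurrence of pattern, or -1 if absent. */
    static long indexOf(Path file, byte[] pattern, int windowSize) throws IOException {
        if (pattern.length == 0) {
            return 0;
        }
        int overlap = pattern.length - 1;            // bytes carried into the next window
        byte[] window = new byte[Math.max(windowSize, pattern.length) + overlap];
        long windowStart = 0;                        // file offset of window[0]
        int filled = 0;                              // number of valid bytes in window
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(window, filled, window.length - filled)) != -1) {
                filled += n;
                // check every start position where the whole pattern fits
                for (int i = 0; i + pattern.length <= filled; i++) {
                    if (matchesAt(window, i, pattern)) {
                        return windowStart + i;
                    }
                }
                // keep the unchecked tail so a match across the boundary is still found
                int keep = Math.min(overlap, filled);
                System.arraycopy(window, filled - keep, window, 0, keep);
                windowStart += filled - keep;
                filled = keep;
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, int start, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[start + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }
}
```

Memory use stays at roughly one window regardless of file size; the price is that the search works on bytes, so the search text has to be encoded with the same charset as the log.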

 
andrew_malcolm
Posts:2,120
Registered: 11/12/01
Re: How to load large files   
Oct 23, 2003 6:58 AM (reply 2 of 26)  (In reply to #1 )

 
Can you use RandomAccessFile to load data from any point in the file as you need it?
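
As a rough sketch of that suggestion (the names FileWindow and readWindow are made up for illustration), a segment can be read from an arbitrary offset like this. It assumes offset and maxLen are non-negative, and it clips the segment to the end of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileWindow {

    /** Reads up to maxLen bytes starting at offset; returns fewer bytes near the end of the file. */
    static byte[] readWindow(String fileName, long offset, int maxLen) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "r")) {
            long remaining = Math.max(0L, raf.length() - offset);
            int len = (int) Math.min((long) maxLen, remaining);
            byte[] window = new byte[len];
            raf.seek(offset);          // jump straight to the requested position
            raf.readFully(window);     // fill the whole window or fail with EOFException
            return window;
        }
    }
}
```

Calling readWindow(fileName, 0, 64 * 1024) would return the first 64 KB; asking for a window that starts at or beyond the end of the file yields an empty array.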
 
plumone
Posts:402
Registered: 7/17/03
Re: How to load large files   
Oct 23, 2003 7:31 AM (reply 3 of 26)  (In reply to #2 )

 
> Can you use RandomAccessFile to load data from any point in the file as you need it?


Yep, that's what I was heading towards with #1...
 
jensenje
Posts:426
Registered: 11/1/00
Re: How to load large files   
Oct 23, 2003 9:20 AM (reply 4 of 26)  (In reply to original post )

 
Holy code boy, just do this:
public byte[] read2Array(String fileName) throws IOException {
	InputStream is = new FileInputStream(fileName);
	ByteArrayOutputStream baos = new ByteArrayOutputStream();
	byte[] buf = new byte[4096];
	for(int len=-1;(len=is.read(buf))!=-1;)
		baos.write(buf,0,len);
	baos.flush();
	is.close();
	baos.close();
	return baos.toByteArray();
}

I ran this against a file of 43704320 bytes with -Xmx384m on a really slow box and it took 3419 millis.
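
A possible variation, shown only as a sketch: when the file size is known up front, the array can be allocated once and filled with DataInputStream.readFully, which avoids growing an intermediate buffer. The class name ReadWholeFile is an assumption, and the code assumes the file does not change size while it is being read (a log that is still being written may violate that).

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadWholeFile {

    /** Reads the complete file into a byte array sized from File.length(). */
    static byte[] read2Array(String fileName) throws IOException {
        File file = new File(fileName);
        long size = file.length();
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single byte[]: " + size + " bytes");
        }
        byte[] data = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            // readFully blocks until the array is full, or throws EOFException if the file is shorter
            in.readFully(data);
        }
        return data;
    }
}
```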
 
nocomm
Posts:86
Registered: 7/9/03
Re: How to load large files   
Oct 23, 2003 10:15 PM (reply 5 of 26)  (In reply to #4 )

 
@jensenje

Thanks a lot for your code - it works also with larger files now (I tested with a file of about 20mb).

@all
But anyway, the application still hangs with larger files (> 30 mb) - at least I know now that this is always due to a BufferOverflowException!

Can anyone give me a hint how to avoid this?

thanx again
nocomm
 
nocomm
Posts:86
Registered: 7/9/03
Re: How to load large files   
Oct 24, 2003 4:38 AM (reply 6 of 26)  (In reply to #5 )

 
Is there anyone having an idea how to avoid the BufferOverflowException?

I read a lot of other articles and posts here, but unfortunately no real answer.

But I still hope....

nocomm
 
andrew_malcolm
Posts:2,120
Registered: 11/12/01
Re: How to load large files   
Oct 24, 2003 4:41 AM (reply 7 of 26)  (In reply to #6 )

 
I recommend (as always) - write a small program that illustrates the problem, post it here, and I'm sure someone will figure out what's wrong.
 
nocomm
Posts:86
Registered: 7/9/03
Re: How to load large files   
Oct 24, 2003 4:44 AM (reply 8 of 26)  (In reply to #7 )

 
The program is right at the beginning of this topic (in my first post). It has changed slightly because of the new read2array method provided above, but the problem is still the same!

nocomm
 
jensenje
Posts:426
Registered: 11/1/00
Re: How to load large files   
Oct 24, 2003 8:51 AM (reply 9 of 26)  (In reply to #5 )

 
Buffer overflows? With the code I listed? My test was on a 40+ mebibyte file. I just tried it with a 124796928-byte (119+ mebibyte) file, with no issues whatsoever; it completed (again on a really slow box) in 7.543 seconds.
 
jschell
Posts:36,985
Registered: 11/3/97
Re: How to load large files   
Oct 24, 2003 1:07 PM (reply 10 of 26)  (In reply to #6 )

 
> Is there anyone having an idea how to avoid the BufferOverflowException?

As in java.nio.BufferOverflowException? You are using something in the nio package?

And where does it occur (printStackTrace?)
 
nocomm
Posts:86
Registered: 7/9/03
Re: How to load large files   
Oct 26, 2003 10:53 PM (reply 11 of 26)  (In reply to #5 )

 
@jensenje & all
OK, I am sorry: the problem (BufferOverflowException) does not occur in the code listed by jensenje, but afterwards, when I try to make a String out of the byte[] that the function returns. I use the following statement to get the String:

String wholeFileString = new String(read2array(fileName));
(where read2array is the function listed above from jensenje)

Within this statement the BufferOverflow occurs! Maybe a solution would be to use another function to transform the byte[] to String, but which?

Thanks for your help again
nocomm
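
One alternative that might be worth trying is to decode the file through a Reader with an explicitly named charset, instead of new String(byte[]), which uses the platform default charset. Whether this avoids the BufferOverflowException reported above is not confirmed anywhere in this thread; the sketch below only shows the technique. The class name DecodeFile is assumed, and the right charset for the log has to be supplied by the caller.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

public class DecodeFile {

    /** Decodes the whole file into a String, one buffer of characters at a time. */
    static String readAsString(String fileName, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buf = new char[8192];
        try (Reader reader = new InputStreamReader(new FileInputStream(fileName), charset)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);   // append only the characters actually read
            }
        }
        return text.toString();
    }
}
```

The result is still one very large String, so the memory concerns about holding the whole file apply to this version as well.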
 
barbyware
Posts:197
Registered: 12/1/98
Re: How to load large files   
Oct 27, 2003 12:32 AM (reply 12 of 26)  (In reply to original post )

 
Hi,

I think that you do not need whole file.

If you need to send the file, use a buffered input stream. If you need to edit the file, you could use a temporary file and a RandomAccessFile, and only handle the part that is being shown.
In other applications you can manage a window of the data or a file cache in order to work with the whole file.
I can help you if you send some data.
 
jensenje
Posts:426
Registered: 11/1/00
Re: How to load large files   
Oct 27, 2003 8:13 AM (reply 13 of 26)  (In reply to #11 )

 
I wouldn't really rely on such a large String. Is there something in the String class in terms of functionality that you need? You will certainly NOT want to load/convert such a large byte array into a String. Post what you want to do with the data after loaded and I bet some better suggestions could come from that.
 
nocomm
Posts:86
Registered: 7/9/03
Re: How to load large files   
Oct 27, 2003 8:57 AM (reply 14 of 26)  (In reply to #13 )

 
The thing is, I want to be able to search for a given text in the whole log file (or maybe just the last 10000 lines). To do this I currently use the indexOf method of the String class. That is the main reason I use a String. Also, I want to display the file (the whole file, if requested!) in an SWT multi-line text field so that the user can navigate through it, and for that I guess I have to use a String.

If anyone has any other suggestions, how to do the search or the text displaying, I would really be interested - maybe it is also possible that a small code example will be posted - that would be a great help!

nocomm
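
A minimal sketch of how the search and the "last 10000 lines" case might be handled without building one huge String: read the log line by line and call indexOf on each line. The names LogSearch, findLines and lastLines are invented for this example, the caller has to pass the log's charset, and a search text that spans a line break will not be found this way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class LogSearch {

    /** Returns the 1-based numbers of all lines that contain needle. */
    static List<Integer> findLines(Path log, String needle, Charset charset) throws IOException {
        List<Integer> hits = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(log, charset)) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                if (line.indexOf(needle) >= 0) {
                    hits.add(lineNo);
                }
            }
        }
        return hits;
    }

    /** Returns at most maxLines lines from the end of the file, in file order. */
    static List<String> lastLines(Path log, int maxLines, Charset charset) throws IOException {
        if (maxLines <= 0) {
            return new ArrayList<>();
        }
        Deque<String> tail = new ArrayDeque<>();
        try (BufferedReader reader = Files.newBufferedReader(log, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (tail.size() == maxLines) {
                    tail.removeFirst();   // drop the oldest line once the window is full
                }
                tail.addLast(line);
            }
        }
        return new ArrayList<>(tail);
    }
}
```

Only the matching line numbers or the last maxLines lines are held in memory, so the text field could be filled with just that tail rather than the complete file.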
 