participate


Reflections & Reference Objects - PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???
<<   Back to Forum  |   Give us Feedback
This topic has 25 replies on 2 pages.    1 | 2 | Next »
tony_lapaso
Posts:37
Registered: 4/7/97
PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 10:31 AM

 
Hello all,

At Sun's suggestion, I wanted to get some feedback from the developer community regarding an implementation enhancement to the StringBuffer class. Please tell me what you think. Here's the issue:

The current implementation of StringBuffer maintains an internal char[] to hold the characters. StringBuffer will automatically increase the size of its char[] as new characters are added. It does this by allocating a new char[] whose size is ALWAYS at least twice that of the previous char[]. This behavior can be hugely wasteful, especially for large StringBuffers. Here's an example:

StringBuffer sb = new StringBuffer(50000);
System.out.println(sb.capacity());
for(int i = 0; i < 50001; ++i)
   sb.append('M');
System.out.println(sb.capacity());


This code inserts 50,001 characters into a StringBuffer whose length is 50,000 characters. The StringBuffer class responds by DOUBLING the size of its internal character array (actually, it doubles the current size plus one) to accommodate the 50,001st character. In the above example we end up with a character array of 100,002 when only 50,001 are needed.

What I am proposing is that the StringBuffer implementation be modified so that it does not automatically and wastefully double the size of the internal char[]. I believe we need a more refined, controlled mechanism for increasing the internal char[] size. Simply doubling the char[] is a very primitive, wasteful and brut force approach (IMHO).

I don't want to discuss possible solutions at this point, I simply want your feedback as to whether or not a more elegant approach is needed.

Thanks.

 
skanjo
Posts:116
Registered: 6/23/97
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 12:28 PM (reply 1 of 25)  (In reply to original post )

 
I agree. The waste of resources should be avoided whenever possible.
 
Enygma42
Posts:181
Registered: 6/12/01
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 12:37 PM (reply 2 of 25)  (In reply to #1 )

 
I was just thinking this, although I wasn't aware that it doubles the original capacity.
I was wondering why there isn't a double argument constructor like Vector's:
public StringBuffer(int initialCapacity, int capacityIncrement);
 
schapel
Posts:2,586
Registered: 6/22/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 12:51 PM (reply 3 of 25)  (In reply to original post )

 
Multiplying the storage capacity by a constant (2 in the case of doubling) is needed to ensure that adding N characters isn't an O(N^2) operation. It's exactly this doubling in capacity that makes a StringBuffer so much faster than a String when you need to build up a string in a loop.

Perhaps StringBuffer could be changed so that it's more like ArrayList, which increases storage by 50% each time it runs out of space. However, people have complained that ArrayList is nearly twice as slow as Vector, which doubles in capacity just like StringBuffer usually does.

As is often the case, it's a tradeoff between space and time -- saving storage space will mean that programs run more slowly.
 
IanMechura
Posts:589
Registered: 2/9/01
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 1:45 PM (reply 4 of 25)  (In reply to #3 )

 
How come it dosent just get the size of the string that
is being appended to it and increase the buffer size
according to that+1. I guess something like this:

 public void append(String appStr)
 {
	int nSize = capacity+(appStr.length()+1);
	/* then set the char[] to this size instead of             wasting proc speed.
	 */
 }

Just an idea.
 
Mcohen
Posts:31
Registered: 7/27/98
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 5:12 PM (reply 5 of 25)  (In reply to #4 )

 
This reminds me of another problem with the StringBuffer. Suppose you use a StringBuffer in your toString method, like this:
void String toString()
{
    StringBuffer sb = new StringBuffer();
    sb.append(....);
    sb.append(....);
    ....
    return sb.toString();
}

To make this method more efficient you may want to remove the allocation that happens everytime this method is called. You can do this by sharing the StringBuffer object across calls to this method:
void String toString()
{
    sb.setLength(0);
    sb.append(....);
    sb.append(....);
    ....
    return sb.toString();
}
private final StringBuffer sb = new StringBuffer();

This does save an allocation, however, interanally the StringBuffer will re-allocate the internal char[] when you set the length to 0. This defeats the purpose of saving the allocation.

Perhaps the StringBuffer should keep the char array around and re-use it, instead of re-allocating it? Does this make any sense?
 
tony_lapaso
Posts:37
Registered: 4/7/97
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 8, 2001 10:42 PM (reply 6 of 25)  (In reply to #5 )

 
You are 100% correct, but what you describe is a flat out bug. I recently submitted this bug (ID 4524848) and am waiting for it to appear in the "Bug Parade".

The issue surrounding the doubling of the size of the char[], however, is not a bug. It's simply a very inefficient implementation (IMHO).

Thanks.
 
schapel
Posts:2,586
Registered: 6/22/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 7:38 AM (reply 7 of 25)  (In reply to #4 )

 
How come it dosent just get the size of the string
that
is being appended to it and increase the buffer size
according to that+1. I guess something like this:

public void append(String appStr)
{
int nSize = capacity+(appStr.length()+1);
/* then set the char[] to this size instead of
wasting proc speed.
*/
}

Just an idea.

Because that would mean appending characters to a StringBuffer in a loop would take O(N^2) time instead of O(N) time. Read my post above.
 
schapel
Posts:2,586
Registered: 6/22/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 7:43 AM (reply 8 of 25)  (In reply to #5 )

 
Perhaps the StringBuffer should keep the char array
around and re-use it, instead of re-allocating it?
Does this make any sense?

Not really. When you call toString() on a StringBuffer, the String uses the char[] that the StringBuffer was using. The next time you want to use that StringBuffer, it will have to create a new char[] anyway because it can't change the one that it's sharing with the String.
 
schapel
Posts:2,586
Registered: 6/22/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 7:46 AM (reply 9 of 25)  (In reply to #6 )

 
The issue surrounding the doubling of the size of the
char[], however, is not a bug. It's simply a very
inefficient implementation (IMHO).

To be precise, it's inefficient in terms of the memory used, but very efficient in terms of the time used. As I point out above, there's a time-space tradeoff. If you make StringBuffer more memory efficient, it will become slower.

Because more people complain about Java being slow than about Java taking too much memory, that's probably the way StringBuffer should probably be implemented. Perhaps there could be a StringBuffer constructor added that takes a "growth factor" such as 50% or 20% so that a programmer could choose a slower but more memory efficient StringBuffer growth pattern.
 
dubwai
Posts:13,976
Registered: 29/11/00
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 7:47 AM (reply 10 of 25)  (In reply to #7 )

 
schapel, I see your point here.

As was mentioned in another post Vector has a second constructor that allows a capacity increment argument.

Are you suggesting that in general we shouldn't use that constuctor with values larger than 1?

I'm not trying to be being argumentative. I take your advice very seriously.
 
hlindqvi
Posts:41
Registered: 7/3/00
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 9:46 AM (reply 11 of 25)  (In reply to original post )

 
Maybe you could make yourself a NativeStringBuffer using the new CharBuffer. That should be a preformance boost, or?
 
DrClap
Posts:39,424
Registered: 4/30/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 9:50 AM (reply 12 of 25)  (In reply to #11 )

 
Perhaps there should be more constructors for StringBuffer to allow finer control over its allocation. Just think of all the threads we could have arguing about which was the most efficient!
 
schapel
Posts:2,586
Registered: 6/22/99
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Nov 9, 2001 10:28 AM (reply 13 of 25)  (In reply to #10 )

 
schapel, I see your point here.

As was mentioned in another post Vector has a second
constructor that allows a capacity increment
argument.

Are you suggesting that in general we shouldn't use
that constuctor with values larger than 1?

I'm not trying to be being argumentative. I take your
advice very seriously.

No, I'm not saying that... if you do use Vector and use the capacityIncrement, then you need to be aware that adding many elements to the Vector one at a time could become an O(N^2) operation rather than an O(N) operation. If you're using a Vector with thousands of elements, using capacityIncrement could be orders of magnitude slower than letting the Vector double in size each time it needs to grow.

The default behavior of a data structure that uses an array to store elements and that can have elements added to it (Vector, ArrayList, StringBuffer, BitSet, ...) should be to grow in size by a significant factor (2 or 1.5 for all of the data structures mentioned) each time it needs to become larger. When you add N elements, it will take O(N) time and O(N) wasted space.

It makes sense to allow a programmer to fine-tune this growth to a different factor, or to cause the data structure to grow by a constant number of elements instead. Depending on exactly how the data structure will be used, and whether memory or execution time is being optimized, any one of those growth patterns will be most efficient. A smaller growth factor will cause it to take more time to add elements, but will result in less wasted space. Using a capacity increment will cause the time to explode to O(N^2) but the wasted space will be only O(1).
 
vy_ho
Posts:214
Registered: 3/26/01
Re: PROPOSED StringBuffer ENHANCEMENT -- COMMENTS???   
Aug 12, 2002 10:33 AM (reply 14 of 25)  (In reply to #13 )

 
Yes, definitely a new approach is needed. I ran into a
problem with this already, and it seems to be a big problem
if this can not be resolved. One solution that I am
thinking which may not be the best or even possible (but
seem very feasible) is to have an array of pointers to
fixed (or variable if you want more complex) pieces of memories.

The array of pointers can grow double up just like now.

Each item in the array points to a fixed piece of memory.

This approach:

-very fast appending: because no coping of more memory is
needed(every time you double, you have to copy the content
to a new one, and destroy the old one, very slow, a big
problem right now).

- Very efficient use of memory space: I have 512 Megs of physical RAM, but can
not have a 40 Megs string. Pitiful for modern computer
(shouldn't it be able to allocate much more than 512 megs of
ram String?) Why more efficient? Because it can use small
pieces of memory instead of a large continuous one. (of
course, a good operating system may do this well, and we
don't need this, but with Microsoft's 97% of desktop and
their screw up OS, you have to invent a better solution).

- Fast indexing. Just need an
extra operation to find which pointers it is. Simply a
divide a nd a mod.

- Need to give users control of increment size

Ofcourse, there could be a variation of this that provide even better architecture.

 
This topic has 25 replies on 2 pages.    1 | 2 | Next »
Back to Forum
 
Read the Developer Forums Code of Conduct

Click to email this message Email this Topic

Edit this Topic
  
 
 
Forums Statistics

About Sun forums
  • Oracle Forums is a large collection of user generated discussions. It is here to help you ask questions, find answers, and participate in discussions.

    Check out our guide on Getting started with Oracle Forums for a full walkthrough of how to best leverage the benefits of this community.

Powered by Jive Forums