For the prototype 2.2 compiler, what is the best way to serialize enums?
I looked at the Enum.java class and there is only a readResolve method. That makes sense for a singleton, but which method is used for the actual reading and writing? Are these methods going to write both the String name field and the int ordinal field?
Enumerations will have special handling for serialization. The serial form will include the enum class (not the class of the instance - that might have been specialized) and the NAME of the enumeration constant, and nothing else. The prototype doesn't currently do any of this.
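To illustrate the serial form described above, here is a minimal sketch written against enums as they eventually shipped in Java 5 and later, not against the prototype (which, as noted, does not do this yet). The class EnumRoundTrip and its roundTrip helper are made up for this example. Because only the enum class and the constant's name are written, reading the value back yields the existing constant rather than a new object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumRoundTrip {
    enum Color { RED, GREEN, BLUE }

    // Hypothetical helper: serializes a value to a byte array and reads it back.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Color.GREEN);
        // The deserialized reference is the very same constant, not a copy.
        System.out.println(copy == Color.GREEN); // prints "true"
    }
}
```

No readObject or writeObject method is needed on the enum itself; the identity check with == keeps working across a round trip.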
Hi Neal, and thanks for the answer. I'm glad to see the NAME being used in serialization.
Though not specific to serialization, I have a request. If you're going to include a method to support serialization that converts from the NAME to the enum, I would like the method to be publicly accessible. What I have in mind is something like this (example not tested):
public enum Color
{
    RED, GREEN, BLUE;

    // Returns the constant whose name equals s, or throws if there is none.
    public static Color parse(String s)
    {
        // Color.VALUES: the list of all constants, as used in this post
        // with the prototype 2.2 compiler.
        for (Color c : Color.VALUES)
        {
            if (c.toString().equals(s))
            {
                return c;
            }
        }
        throw new IllegalArgumentException("Unknown Color string " + s + ".");
    }
}
I'm not certain that throwing an exception is the best thing, and if one is thrown, maybe it should be a new class of exception like "InvalidEnumStringException".
Why would I like this public? So I don't have the parse method boilerplate in most of my enums. Here's a real-world example based on converting our Java application for system management of CompactPCI/AdvancedTCA from J2SE to the prototype generic 2_2 compiler. The application has 450 classes plus an additional 70 enum classes. Since we were refactoring anyway, we decided to place all the parsers inside the enum classes. It turns out that 46 of our enum classes need parsers. Here is the breakdown of how the parsers are used:
60% for strings in XML files to enums (note that these are not java.lang.Enum's)
10% for command-line strings to enums
30% for pre-Java IP protocols to enums
I was surprised about the percentage of XML to enum parsers because we are NOT an XML intensive application.
Though not a "must have", it would be nice to have the parse method be part of J2SE 1.5 so I don't have all this duplicate code around (our app has 46 copies of the boilerplate). We strive for "concise is nice".
Though not specific to serialization, I have a request.
If you're going to include a method to support
serialization that converts from the NAME to the enum,
I would like the method to be publicly accessible. What I
You're referring to the 'valueOf()' method, which is in fact present and emitted in the current prototype 2.2 compiler. See
http://jcp.org/aboutJava/communityprocess/jsr/tiger/enum.html
as well as the javadoc created for an enum class by sinjdoc:
http://cscott.net/Projects/GJ
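For readers who want to see the shape of this, here is a small sketch using valueOf as it exists in released Java (5 and later); the prototype's exact behavior may differ. The class ValueOfDemo and the helpers parseOrDefault and parse are hypothetical names invented for this example. The compiler-generated valueOf(String) matches the constant name exactly and throws IllegalArgumentException for an unknown name, and Enum.valueOf(Class, String) gives one shared entry point that could stand in for a parse method copied into each enum.

```java
class ValueOfDemo {
    enum Color { RED, GREEN, BLUE }

    // Hypothetical helper: returns the named constant, or fallback when
    // no constant has exactly that name.
    static Color parseOrDefault(String s, Color fallback) {
        try {
            return Color.valueOf(s);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    // Hypothetical generic helper: one parser that works for any enum type.
    static <E extends Enum<E>> E parse(Class<E> type, String s) {
        return Enum.valueOf(type, s);
    }

    public static void main(String[] args) {
        System.out.println(Color.valueOf("RED"));               // prints "RED"
        System.out.println(parseOrDefault("PURPLE", Color.BLUE)); // prints "BLUE"
        System.out.println(parse(Color.class, "GREEN"));        // prints "GREEN"
    }
}
```

Note that the match is case-sensitive, so strings coming from XML or the command line may need normalizing before the lookup.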
Are you saying that there will be no way to properly
serialize the (non-final) 'lastComment' field?
Yes, that's exactly what I'm saying. If it was serialized, what would we do with it when deserializing? Enumeration constants are UNIQUE. You can't have more than one instance of any particular enumerator.
In any case, I think this should be documented with big red letters: "all fields are effectively transient" or something like that.
Wow, you're right - that's not apparent at all. Maybe the compiler should automatically warn about any non-final fields? I'm not even sure why mutable fields should be allowed at all.
But surely if the enumeration object does not already
exist it wouldn't hurt to initialize the singleton
properly.
But what do you mean by "properly"? Sun is suggesting that "properly" means as defined in the class file loaded into the JVM holding the reference to the enum constant. Are you suggesting it should be as defined in the class file that was loaded into the JVM that did the serialization?
Neal is saying: what do we do if we deserialize two copies of the same enum (same class and name) but with different values of these instance fields? It's a can of worms, and I am pretty sure Sun is right on this one.
In any case, I think this should be documented with
big red letters: "all fields are effectively
transient" or something like that.
Or (because not everyone reads all the documentation) require all instance fields to have the "transient" modifier, and generate compiler errors (or at the very least compiler warnings) if they do not?
But it's not exactly the "transient" semantics, either, since the field won't be reset to null, but rather to whatever value the constructor gives it (if the instance didn't previously exist) or the pre-existing value (if it did).
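The "pre-existing value" case can be sketched as follows, using enum serialization as it shipped in Java 5 and later. The Ticket enum and the toBytes/fromBytes helpers are invented for this example; only the field name 'lastComment' comes from this thread. The field is changed after the constant is serialized, and the deserialized reference shows the later value because the field was never written to the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumFieldDemo {
    enum Ticket {
        OPEN, CLOSED;
        String lastComment = "initial"; // mutable, non-final field
    }

    // Hypothetical helper: serializes a value to a byte array.
    static byte[] toBytes(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: reads one object back from a byte array.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Ticket.OPEN.lastComment = "before serialization";
        byte[] saved = toBytes(Ticket.OPEN);
        Ticket.OPEN.lastComment = "changed afterwards";

        Ticket restored = (Ticket) fromBytes(saved);
        // The stream carried only the class and the name, so the field holds
        // whatever the live constant holds now.
        System.out.println(restored.lastComment); // prints "changed afterwards"
    }
}
```

In a fresh JVM the same bytes would instead yield "initial", the value assigned when the constant is created.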
It may seem that the only way this works without surprise is if all fields of enumerations are final. But even this can be surprising:
Consider an enumeration of stock market symbols: SUNW, RHAT, LNUX, etc.
Each one has a field 'price' with a stock price for the symbol.
First, let's make price final. The constructor looks up the current stock price for the symbol to initialize the field. Serialization causes problems, despite the 'final' modifier: after deserialization, does 'price' refer to the current price, the price when the class was serialized, or some other price? Surprisingly, the only thing I can guarantee (with the current approach to enumeration serialization) is that it is not the price when it was serialized. Instead it is either the current price (if the enumeration instances had not yet been created when deserialization occurred) or some other earlier price (the price at the time the enumeration instances were created).
If we make price non-final, then we can define a routine to update all the prices. Again, prices after deserialization will be either current or out-of-date prices, but not the prices when the instances were serialized. Calling the update routine allows us to restore sanity to our world: an argument for allowing non-final fields.
Marking price as transient would seem to indicate that the price fields should get reset to 'null' to indicate that the price was not saved by serialization. However, in this instance the transient modifier has no effect, since all fields are initialized by the constructor, and the prices are jumbled as above.
Finally, suppose my example were a little more complicated, and I had a spreadsheet of calculated information based on the current stock prices to save as well. Since stock prices are not actually being serialized/saved, it is obvious that my complicated calculations will be all off after deserialization, since the prices will have changed.
Now here's the catch: what if I have a spreadsheet based on today's prices, and I want to compare it to a spreadsheet based on yesterday's prices, which I serialized yesterday? When I deserialize yesterday's spreadsheet I'm either going to screw up today's spreadsheet (if I insist that serializing and restoring fields of enumerations is the Thing To Do) or yesterday's spreadsheet (with the current approach to enumeration serialization).
I can't really think of a way out of this dilemma other than by adding lots of red ink to the official description of enumerations, warning of these hazards and advocating caution in using either non-final fields or final fields initialized by non-constants. I don't think that banning either of these is the correct solution to the problem, but I do think that caution is needed. If anyone can think of some hook to add to the serialization process that allows a more intuitive semantics, I'd love to hear about it.
I think your example is one of a misuse of enums rather than a problem with enums. The ticker symbol is one attribute of a 'stock'; its price and its type (common/preferred/option/bond/fund/...) are others. It doesn't seem appropriate to use an enum to represent the whole object; use an enum to represent just the symbol (or not -- you don't want to update your Stock class every time there's an IPO).
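One way to read that suggestion, as a rough sketch only: keep the changing data in an ordinary Serializable class and use an enum just for a fixed attribute such as the type. The names StockDemo, Stock, StockType and roundTrip are invented for this example, and the symbol is kept as a plain String so that no class needs updating when there is an IPO. With this layout the price is part of the object's serial form, so it survives a round trip unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StockDemo {
    // A fixed, closed set of values: a natural fit for an enum.
    enum StockType { COMMON, PREFERRED, OPTION, BOND, FUND }

    // Ordinary value class: all three fields are written to the stream.
    static final class Stock implements Serializable {
        private static final long serialVersionUID = 1L;
        final String symbol;
        final StockType type;
        final double price; // double keeps the sketch short

        Stock(String symbol, StockType type, double price) {
            this.symbol = symbol;
            this.type = type;
            this.price = price;
        }
    }

    // Hypothetical helper: serializes a Stock and reads it back.
    static Stock roundTrip(Stock stock) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(stock);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Stock) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Stock yesterday = roundTrip(new Stock("SUNW", StockType.COMMON, 3.5));
        // The price is the one that was saved, and the type is still the enum constant.
        System.out.println(yesterday.symbol + " " + yesterday.type + " " + yesterday.price);
    }
}
```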
I agree with dhall. Your examples are simply not valid uses for enumerations. Frankly, I don't care how serialization for enums works, but I'm quite content to say that only final fields should be allowed in an enum. I guess that could be extended to include transient fields for flexibility (i.e. some sort of internal caching) as long as it doesn't interfere with the obvious and intuitive semantics that should be associated with an enum type.
Most importantly, I want to get compiler warnings or errors when these conditions are violated, because it indicates an obvious mistake in the code.
You missed my point: transient fields don't behave "as expected" either, because they are not zeroed by deserialization. And final fields might not behave "as expected" either.
I wasn't arguing that non-final fields ought to be allowed in enumerations, per se: I was trying to describe several situations where fields (even final fields) wouldn't behave as the programmer "expects" -- and thus situations that should either trigger a compiler error, a compiler warning, or (at least) a boldface paragraph in the language/feature description.
Fields in general in enumerations may lead to surprising serialization behavior.