participate


Reflections & Reference Objects - Hashtable vs HashMap
<<   Back to Forum  |   Give us Feedback
This topic has 10 replies on 1 page.
dpz
Posts:459
Registered: 11/20/97
Hashtable vs HashMap   
Feb 2, 2004 3:33 PM

 
I've been hearing everybody championing HashMaps over Hashtables becuause they're unsynchronized, therefor much faster for quite some time. I tried to benchmark the two and I got results which are puzzling. The test does not try to measure the performance directly, instead it compares the two. The test has lots of other stuff going on but that stuff should be identical for each.


Here's my test code
import java.util.*;
 
public class t
    {
    static final int ITER = 500000;
 
    static void doHash (Map t)
        {
        for (int i=0; i<ITER; ++i)
            {
            Integer in = new Integer(i);
            t.put(in, in);
            }
        for (int i=0; i<ITER; ++i)
            {
            Integer in = new Integer(i);
            Object o = t.get(in);
            if (!(o instanceof Integer)) // Do something with the result
                 System.out.println(i+" wasn't an Integer");
            }
        }
 
 
    public static void main (String[] argv)
        {
        // Warm up hotspot
        for (int i=0; i<3; ++i)
            {
            doHash(new Hashtable());
            doHash(new HashMap());
            }
 
        // Time the calls
        for (int i=0; i<10; ++i)
            {
            Map t = new Hashtable();
            long start = System.currentTimeMillis();
            doHash(t);
            System.out.println("Hashtable "+(System.currentTimeMillis()-start));
 
            t = new HashMap();
            start = System.currentTimeMillis();
            doHash(t);
            System.out.println("HashMap "+(System.currentTimeMillis()-start));
            }
        }
    }

I ran my test with -server, as it gave better times (as expected)

java -server -cp . t


On Red Hat Fedora, running an PIII, 700MHz 512MByte ram
java version "1.4.2_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)

Hashtable 2461
HashMap 4104
Hashtable 2502
HashMap 4609
Hashtable 3174
HashMap 4062
Hashtable 2788
HashMap 5572
Hashtable 2516
HashMap 4094
Hashtable 2471
HashMap 4094
Hashtable 2482
HashMap 4026
Hashtable 2444
HashMap 4091
Hashtable 2460
HashMap 4063
Hashtable 2480
HashMap 4081




On Mac OSX 10.3, on a G3 450MHz 768MByte ram
java version "1.4.1_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-99)
Java HotSpot(TM) Client VM (build 1.4.1_01-27, mixed mode)

Hashtable 4361
HashMap 5705
Hashtable 4366
HashMap 5640
Hashtable 4188
HashMap 5807
Hashtable 4313
HashMap 5793
Hashtable 4054
HashMap 5648
Hashtable 4197
HashMap 5749
Hashtable 4428
HashMap 5692
Hashtable 4110
HashMap 5955
Hashtable 4282
HashMap 5552
Hashtable 4216
HashMap 5776



On W2K on 1 PIII 1000MHz, 1GByte ram
java version "1.4.2_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_01-b06)
Java HotSpot(TM) Client VM (build 1.4.2_01-b06, mixed mode)

Hashtable 1502
HashMap 2945
Hashtable 1462
HashMap 2954
Hashtable 1492
HashMap 2855
Hashtable 1392
HashMap 2954
Hashtable 1462
HashMap 2944
Hashtable 1452
HashMap 2934
Hashtable 1472
HashMap 2954
Hashtable 1503
HashMap 3134
Hashtable 1422
HashMap 3005
Hashtable 1432
HashMap 2994


It looks like the Hashtable is considerably faster than the Hashmap
 
silk.m
Posts:2,829
Registered: 9/1/03
Re: Hashtable vs HashMap   
Feb 2, 2004 3:50 PM (reply 1 of 10)  (In reply to original post )

 
let me just guess quickly that it could be because of ...

1. possible gc when it cleans up the unused Ints at somepoint
after de-referencing of the old map.

2. different default size amonts ...

3. ?


i think that 1 is the most likely cause of the results though ...
 
dpz
Posts:459
Registered: 11/20/97
Re: Hashtable vs HashMap   
Feb 2, 2004 3:54 PM (reply 2 of 10)  (In reply to #1 )

 
I changed the order inside the loop of Hashtable and HashMap but got identical results. This would seem to invalidate the gc argument.
 
silk.m
Posts:2,829
Registered: 9/1/03
Re: Hashtable vs HashMap   
Feb 2, 2004 3:59 PM (reply 3 of 10)  (In reply to original post )

 
hm.. and it seems i'm wrong about number2.
 
_dnoyeB
Posts:1,350
Registered: 6/20/03
Re: Hashtable vs HashMap   
Feb 2, 2004 6:11 PM (reply 4 of 10)  (In reply to original post )

 
try again without garbage collection
-Xnoclassgc

and without hotspot
-Xint
 
_dnoyeB
Posts:1,350
Registered: 6/20/03
Re: Hashtable vs HashMap   
Feb 2, 2004 6:19 PM (reply 5 of 10)  (In reply to original post )

 
Though I dont know for certain, i think hashtable is not synchronized for the heck of it. probably they are internally two different beasts.

synchronization is not that much these days. Especially when the hotspot knows you have but 1 thread and can ignore the synchronization if it so chooses.

another test is to use a syhcnronized wrapper on the HashMap and see how slow it is in comparison to an unsynchronized map.
 
lexzeus
Posts:438
Registered: 2/17/03
Re: Hashtable vs HashMap   
Feb 2, 2004 7:19 PM (reply 6 of 10)  (In reply to #5 )

 
Hashtable extends java.util.Dictionary, and HashMap extends java.util.AbstractMap
I think Sun have a reason not to have Hashtable subclass or wrap the HashMap with synchronized modifier..
Have you tried to benchmark Collections.synchronizedMap(new HashMap())? Perhaps you can also post the result here..
 
UlrikaJ
Posts:5,965
Registered: 5/17/03
Re: Hashtable vs HashMap   
Feb 3, 2004 12:29 AM (reply 7 of 10)  (In reply to #6 )

 
You use very special input data - a sequence from 0 to 500000 inserted in that order. This case seems to favour Hashtable. In a more 'natural' situation I'm sure HashMap beats Hashtable consistently.

To check this I've modified your test code:

1. Create both types of Map with the same initial capacity and load factor.
2. Remove the object creation overhead from the test loops. They can be created once and held in an array.
3. Separate the loading (put) from the retrival (get) and clock separately. The loading is likely to take much longer so the retrival time will 'drown' in it.

I now instead generate 500000 RANDOM Integers. This is favourable for hash tables in two ways. First the Integers will be loaded in a random order. This basically shouldn't matter but I think it does because of the different ways hash tables may handle growth. Second the hashCode method of Integer will return a number over the full integer range thus randomizing the object positions in the underlying hash table. This is the important factor.

The test shows that HashMap's is equally fast in both cases, wheras Hashtable differs a lot. In the more 'natural' case with random input data HashMap is faster.

So my conclussion is - use HashMap. It shows consistent performance with very different input data sets and it's faster than Hashtable exept in extreme cases.
// To start
T t = new T();
//
public class T    {
    private final int ITER = 500000;
    private float LOAD_FACTOR = 0.75F;
    private int INIT_CAPACITY = 100;
    private final Object[] keys = new Object[ITER]; // hold keys
    private Random rnd = new Random(13L);   // seeded random generator
    public T() {
        main();
    }
    private void doHashLoad (Map t) {
        for (int i=0; i<ITER; ++i) {
            Object in = keys[i];
            t.put(in, in);
        }
    }
    private void doHashRetrieve (Map t) {        
        for (int i=0; i<ITER; ++i) {
            Object in = keys[i];
            Object o = t.get(in);
            if (o==null)
                System.out.println("error");
        }
    }
    
    private void main () {
        for (int i=0; i<ITER; ++i) { // create keys
//            keys[i] = new Integer(i); // Very special case good for Hashtable
            keys[i] = new Integer(rnd.nextInt()); // More 'natural' case good for all Maps
        }
        for (int i=0; i<3; ++i) {         // Warm up hotspot
            doHashLoad(new Hashtable());
            doHashLoad(new HashMap());
        }
        // Time the calls
        for (int i=0; i<3; ++i) {
            Map t = new Hashtable(INIT_CAPACITY,LOAD_FACTOR);
            long start = System.currentTimeMillis();
            doHashLoad(t);
            System.out.println("Hashtable Load "+(System.currentTimeMillis()-start));
            start = System.currentTimeMillis();
            doHashRetrieve(t);
            System.out.println("Hashtable Retrieve "+(System.currentTimeMillis()-start));
            //
            t = new HashMap(INIT_CAPACITY,LOAD_FACTOR);
            start = System.currentTimeMillis();
            doHashLoad(t);
            System.out.println("HashMap Load "+(System.currentTimeMillis()-start));
            start = System.currentTimeMillis();
            doHashRetrieve(t);
            System.out.println("HashMap Retrieve "+(System.currentTimeMillis()-start));
            System.out.println("---");
        }
    }
}    
 
ekupcik
Posts:163
Registered: 8/7/97
Re: Hashtable vs HashMap   
Feb 3, 2004 2:55 AM (reply 8 of 10)  (In reply to #7 )

 
There is at least one bugreport about this topic and the answer from SUN is something like this: Hashtable performs better when you use a sequence of numbers as keys but can perform quite bad if the keys follow some special patterns while HashMap has an overall better performance in the real word and doesn't have "any" problems with special patterns or weak hashCode() implementations (but of course you are still in trouble if hashCode() returns 0 for each object)
 
UlrikaJ
Posts:5,965
Registered: 5/17/03
Re: Hashtable vs HashMap   
Feb 3, 2004 4:54 AM (reply 9 of 10)  (In reply to #8 )

 
Hashtable
performs better when you use a sequence of numbers as
keys but can perform quite bad if the keys follow
some special patterns while HashMap has an overall
better performance in the real word and doesn't have
"any" problems with special patterns or weak
hashCode() implementations

I've done some further testing and I can confirm the above. Both Maps perform better when the Integers are loaded and retrieved in the same order. Hashtable a little better than HashMap. When the Integers are retrieved at random there's a slight advantage for HashMap.
 
matfud
Posts:2,410
Registered: 9/8/98
Re: Hashtable vs HashMap   
Feb 3, 2004 5:41 AM (reply 10 of 10)  (In reply to #9 )

 
Another few things to point out.

HashMap (and HashSet) perform double hashing using the following code
 /**
     * Returns a hash value for the specified object.  In addition to 
     * the object's own hashCode, this method applies a "supplemental
     * hash function," which defends against poor quality hash functions.
     * This is critical because HashMap uses power-of two length 
     * hash tables.<p>
     *
     * The shift distances in this function were chosen as the result
     * of an automated search over the entire four-dimensional search space.
     */
    static int hash(Object x) {
        int h = x.hashCode();
 
        h += ~(h << 9);
        h ^=  (h >>> 14);
        h +=  (h << 4);
        h ^=  (h >>> 10);
        return h;
    }


This partly accounts for the speed of HashMap wrt to Hashtable. There is more overhead on put and
get operations but those operations allow for a better distribution of the data over the buckets. This
can in turn lead to HashMap being faster then Hashtable esp. when hashCodes are poorly distributed.

(this was shown in the tests that UJ and the OP did)

The second issue to do with synchronization. Synchronization is fairly low cost on most hardware
architectures especially when contention is low. (the tests you use only have thread). This is quite
a complex issue as most of the time multiple threads should not be accessing an unsynchronized
map. However Maps/tables are often used as a constituent of more complex structures and it can often be
more efficient (in the presence of multiple threads) to perform synchronization on the higher level
structure (Maps) rather then on the high level structure and Hashtable.

There is also a fundamental difference between the two classes. HashMaps allows null for keys and
values. Hashtable does not.

matfud.
 
This topic has 10 replies on 1 page.
Back to Forum
 
Read the Developer Forums Code of Conduct

Click to email this message Email this Topic

Edit this Topic
  
 
 
Forums Statistics
    Users Online : 26
  • Guests : 129

About Sun forums
  • Sun Forums is a large collection of user generated discussions. It is here to help you ask questions, find answers, and participate in discussions.

    Check out our guide on Getting started with Sun Forums for a full walkthrough of how to best leverage the benefits of this community.

Powered by Jive Forums