分享

The WeakHashMap Class

 moonboat 2010-10-22

The WeakHashMap Class

07/09/2001

WeakHashMap is a type of Map which differs from other Maps in more than just having a different implementation.

WeakHashMap uses weak references to hold its keys, making it one of the few classes able to respond to the fluctuating memory requirements of the JVM. This can make WeakHashMap unpredictable at times, unless you know exactly what you are doing with it. In the following sections I examine how best to use WeakHashMap, and why WeakHashMap behaves the way it does.

Using WeakHashMap to Avoid Memory Leaks

How many times have you used a Map to store extra information about objects? Perhaps subclassing of those objects were disallowed, but you needed to record extra state information for each object; or there were many different types of objects, and you needed to have a central repository where information was easily accessible about any one of the objects.

Here's a simple example of a class which allows any object to be registered with it, and which keeps track of two items of information about that object, the elapsed time since the object was registered, and a separate counter.


public class ExtraInformation
{
static Map ExtraInfo = new HashMap();

long registrationTime;
int countSomething = 0;

public static void registerObject(Object o)
{
ExtraInfo.put(o, new ExtraInformation())
}

public ExtraInformation()
{
//record time when first registered
registrationTime = System.currentTimeMillis();
}

public static void incrementCount(Object o)
{
((ExtraInformation) ExtraInfo.get(o)).incrementCount++;
}

public static int getCount(Object o)
{
return ((ExtraInformation) ExtraInfo.get(o)).incrementCount;
}

public static long getElapseTimeMillis(Object o)
{
return System.currentTimeMillis() -
((ExtraInformation) ExtraInfo.get(o)).registrationTime;
}
}

Related Reading

Java performance tuningJava Performance Tuning
By Jack Shirazi
Table of Contents
Index
Sample Chapter
Full Description
Read Online -- Safari

If you are alert, or thinking about the title of this section, you may have immediately noticed that a deregistration method is missing. Without a deregistration method, the internal Map in the ExtraInformation class keeps on growing, and retains a reference to the object key. This is a classic memory leak; objects never get dereferenced, so the garbage collector can never reclaim them. So lets add the deregistration method:

public static void deregisterObject(Object o)
{
  ExtraInfo.remove(o);
}

Now we can avoid the memory leak. But what if the users of this class forget to call the deregister method? What if there is no particular time when it would be appropriate to call the deregister method? Well, we can put a big warning in the documentation -- "WARNING: MEMORY LEAK IF DEREGISTRATION IS NOT PERFORMED!". Or we can try to be nicer and handle the situation ourselves.

Specifically, we want to handle the situation where the deregisterObject() method is not called, but the registered object is no longer referenced anywhere in the application, except by our internal Map. This situation is precisely what WeakHashMap was designed for. The keys entered into a WeakHashMap are held using WeakReferences. This allows the garbage collector to collect the keys if there are no other strong references to those keys elsewhere in the application (for more details about strong and weak references in Java, see the "Further resources" section at the end of this article). After the keys have been collected, the values are also dereferenced from the WeakHashMap, and can also be garbage collected if they are not referred to elsewhere in the application.

To enable our ExtraInformation class to work with the garbage collector in avoiding memory leaks, the change is simple -- make the internal Map a WeakHashMap:

static Map ExtraInfo = new WeakHashMap();

Comment on this articleSend your questions to Jack.
Post your comments

Also in Java Design and Performance Optimization:

Micro-Tuning Step-by-Step

Tuning JDBC: Measuring JDBC performance

Faster List Iteration with RandomAccess Interface

With this one change, we have now enabled our ExtraInformation class to automatically release objects to the garbage collector once those objects are no longer referenced anywhere else in the application. (There can be other classes using weak references in a similar way without interfering with this functionality. No amount of weak references to an object will prevent that object from being eligible for collection by the garbage collector).

Using WeakHashMap as a Cache

The previous section looked at using WeakHashMap to avoid memory leaks. WeakHashMap can also be used fairly straightforwardly as a cache that can be cleared automatically when JVM memory is low. The only difference is in needing the keys to be objects which can be equal to other keys of the same class. This is most easily explained with some examples:

  • Unique keys. If the keys were of the Object class, each key would be unique and could not evaluate equal to another key. Consequently, once you no longer retained a reference to the Object key, there would be no way of re-accessing values, short of enumerating all keys or values (this is true for any Map). Since a cache normally works by allowing values to be re-accessed, this would be a severe problem.

  • Equal Keys. If the keys are objects like Strings or Integers, then any particular key value could be reaccessed by creating a new key which evaluates as equal (according to the equal() method for that class) to the original key. This allows the original key to have no further strong references to it, thus enabling it to be sucessfully used as a WeakHashMap key.

    For example, a cache with the functionality of a sparse array (see the "Sparse Arrays" sidebar) is easy to implement:

public class SparseArrayCache {
  protected WeakHashMap map = new WeakHashMap();

  /* This put() will return any previous value already at index i. */
  public Object put(int i, Object o) {
    map.put(new Integer(i), o);
  }

  /* This get() will return the object at index i, or null if no
   object was put there, or if the object was garbage collected. */
  public Object get(int i) {
    return map.get(new Integer(i));
  }

  public void remove(int i, Object o) {
    map.put(new Integer(i), o);
  }
}


However, using WeakHashMap to implement a cache has a real problem with lack of control. Most builders of caches usually find that they want to control how and when objects are removed from the cache. For example, a least-recently-used algorithm tries to retain elements in the cache that are being used the most often and removes those elements that have not been used recently. But the elements of a WeakHashMap cache could be cleared at any time by the JVM, and there is no selectivity. Normally all the elements which can be cleared, are cleared in one go. This takes control away from the cache-builder. A more useful cache would be signalled by the JVM that memory was needed and would then be allowed to select which elements to clear. It is possible to take more control over cache clearing with your own implementation directly using WeakReferences, but that wouldn't use a WeakHashMap, so I'll leave that discussion to another article.

How WeakHashMap works

You don't need to know the internals of WeakHashMaps to use them. But the implementation is interesting, and studying the it can make you aware of some of the curious performance consequences of working with WeakHashMaps.

The keys in a WeakHashMap are WeakReference objects. The object passed as the key to a WeakHashMap is stored as the referent of the WeakReference object, and the value is the standard Map value. (The object returned by calling Reference.get() is termed the referent of the Reference object.) A comparison with HashMap can help:


HashMap

Map h = new HashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;

WeakHashMap

Map h = new WeakHashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;

Conceptually, this is similar to inserting a line before the put() call like this:

key = new WeakReferenkey(key);

The key is referenced directly by the HashMap.

The key is not referenced directly by the WeakHashMap. Instead, a WeakReference object is referenced directly by the WeakHashMap, and the key is referenced weakly from the WeakReference object.

The value is referenced directly by the HashMap.

The value is referenced directly by the HashMap.

The key is not garbage collectable, since the map contains a strong reference to the key. The key could be obtained by iterating over the keys of the HashMap.

The key is garbage collectable, as nothing else in the application refers to it, and the WeakReference only holds the key weakly. Iterating over the keys of the WeakHashMap might obtain the key, but might not, if the key has been garbage collected.

The value is similarly not garbage collectable.

The value is not directly garbage collectable. However, when the key is collected by the garbage collector, the WeakReference object is subsequently removed from the WeakHashMap as a key, thus making the value garbage collectable too.

The (1.2- and 1.3-version) WeakHashMap implementation wraps a HashMap for its underlying Map implementation, and wraps keys with WeakReferences (actually a WeakReference subclass) before putting the keys into the underlying HashMap. The WeakHashMap uses its own ReferenceQueue object so that it is notified of keys that have been garbage collected, thus allowing the timely removal of the WeakReference objects and the correponding values. The queue is checked whenever the Map is altered. If you have not worked with Reference objects and ReferenceQueues before, this can be a little confusing, so I'll work through an example. The following example adds a key-value pair to the WeakHashMap, assumes that the key is garbage collected, and records the subsequent procedure followed by the WeakHashMap.

  1. A key value pair is added to the Map:

    aWeakHashMap.put(key, value);

    This results in the addition of a WeakReference key added to the WeakHashMap, with the original key held as the referent of the WeakReference object. You could do the equivalent using a HashMap like this:

    ReferenceQueue Queue = new ReferenceQueue();
    MyWeakReference RefKey = new MyWeakReference(key, Queue);
    aHashMap.put(RefKey, value);

    (For the equivalence code, I've used a subclass of WeakReference, as I'll need to override the WeakReference.equals() for equal key access in the subsequent points to work correctly.) Note that at this stage the referent of the WeakReference just created is the original key; i.e., the following statement would output "true":

    System.out.println(RefKey.get() == key);
  2. At this point, you could access the value from the WeakHashMap using the original key, or another key which is equal to the original key; i.e., the following statements would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == value);
    System.out.println(aWeakHashMap.get(key) == value);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey2 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey2) == value);
    System.out.println(aHashMap.get(RefKey) == value);

    Note that in order to get this equivalence, we need to implement equals() and hashcode() in the MyWeakReference class, so that equal referents make equal MyWeakReference objects. This is necessary so that the MyWeakReference wrapped keys evaluate as equal keys in Maps. The code is available from the source (see "Further Resources" at the end of this article). The equals() method returns true if the MyWeakReference objects are identical or if their referents are equal.

  3. We now null out the reference to the original key:

    key = null;

    After some time, the garbage collector detects that the key is no longer referenced anywhere else in the application, and clears the WeakReference key. In the equivalent code using the HashMap from this point on, the WeakReference we created has a null referent; i.e., the following statement would now output "true":

    System.out.println(RefKey.get() == null);

    Maintaining a reference to the WeakReference object (in the RefKey variable) doesn't affect clearing the referent. In the WeakHashMap, the WeakReference object key is also strongly referenced from the map, but its referent, the original key, is cleared.

  4. The garbage collector adds the WeakReference that it recently cleared into its ReferenceQueue; that queue is the ReferenceQueue object, which was passed in to the constructor of the WeakReference.

  5. Trying to retrieve the value using a key equal to the original key would now return null. (To try this it is necessary to use a key equal to the original key; we no longer have access to the original key, otherwise it could not have been garbage collected). The following statement would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == null);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey3 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey3) == null);

  6. However, at the moment the WeakReference and the value objects are still strongly referenced by the Map. That is where the ReferenceQueue comes in. Recall that when the garbage collector clears the WeakReference, it adds the WeakReference into the ReferenceQueue. Now that it is in the ReferenceQueue, we need to have it processed. In the case of the WeakHashMap, the WeakReference stays in the ReferenceQueue until the WeakHashMap is altered in some way, e.g. by calling put(), remove(), or clear(), etc. Once one of the mutator methods have been called, the WeakHashMap runs through its ReferenceQueue, removing all WeakReference objects from the queue and also removing each WeakReference object as a key in its internal map, thus simultaneously dereferencing the value. In the following example, I use a dummy object to force queue processing without making any real changes to the WeakHashMap:

    aWeakHashMap.put(DUMMY_OBJ, DUMMY_OBJ);

    The equivalent code using the HashMap does not need a dummy object, but we need to carry out the equivalent queue processing:

    MyWeakReference aRef;
    while ((aRef = (MyWeakReference) Queue.poll()) != null)
    {
    aHashMap.remove(aRef);
    }

    As you see, we take each WeakReeference out of the queue, and remove it from the Map. This also releases the corresponding value object, and both the WeakReference object and the value object can now be garbage collected if there are no other strong references to them.

Reference objects with String literal referents

Note that if you use a string literal as a key to a WeakHashMap, or the referent to a Reference object, it will not necessarily be garbage collected when the application no longer references it. For example, consider the code:

String s = "hello";
WeakHashMap h = new WeakHashMap();
h.put(s,"xyz");
s = null;

You might expect that the string "hello" can now be garbage collected, since we have nulled the reference to it. However, a string created as a literal is created at compile time, and held in a string pool in the class file. The JVM normally holds these strings internally in its own string pool after the class has been loaded. Consequently, the JVM retains a strong reference to the String object, and it will not be garbage collected until the JVM releases it from the string pool; that may be never, or it may be when the class is garbage collected, or some other time. If you want to use a String object as a key to a WeakHashMap, ensure it is created at runtime, e.g.:

String s1 = new String("hello");
String s2 = (new StringBuffer()).append("hello").toString();

This string does not get put into the JVM string pool, and so can be garbage collected when the application no longer holds strong references to it. Note that calling String.intern() on a string will also put it into the internal JVM string pool, giving rise to the same issues as literal strings. Similarly, other objects which the JVM could retain a strong reference to, such as Class objects, may also not be garbage collected when there are no longer any strong references to them from the application.

Some Consequences of the WeakHashMap Implementation

  • Reference clearing is automatic; consequently, there is no need to worry about achieving some sort of corrupt state if you try to access an object and the garbage collector is clearing keys at the same time. You will either get the object or you won't.

  • The values are not released until the WeakHashMap is altered. This is specific to the implementation, and was done to avoid getting ConcurrentModificationExceptions. If the ReferenceQueue were processed on an accessor, this could be from an Iterator accessing elements. Then even if there was only one thread running, it would be possible for the map to be altered as you iterated through it, giving a surprising ConcurrentModificationException. Specifically one of the mutator methods, put(), remove(), or clear(), need to be called directly or indirectly (e.g. from putAll()) for the values to be released by the WeakHashMap. If you do not call any mutator methods after populating the WeakHashMap, the values and WeakReference objects will never be dereferenced.

  • WeakHashMap (at least up to version 1.3) wraps an internal HashMap. This means that practically every call to the WeakHashMap has one extra level of indirection it must go through (e.g. WeakHashMap.get() calls HashMap.get()), which can be a significant performance overhead. This is a specific choice of the implementation.

  • Every call to get() creates a new WeakReference object to enable equality testing of keys in the internal HashMap. Although these are small short-lived objects, if get() was used intensively this could generate a heavy performance overhead. Again this is a specific choice of the implementation. The WeakHashMap could implement a hash table directly in the class and avoid the need for extra WeakReference objects.

  • Unlike many other collections, WeakHashMap cannot maintain a count of elements, since keys can be cleared at any time by the garbage collector without immediately notifying the WeakHashMap. This means that seemingly simple methods such as isEmpty() and size() have more complicated implementations than for most collections. Specifically, size() actually iterates thorugh the keys, counting those that have not been cleared. Consequently size() is an operation that takes time proportional to the size of the WeakHashMap. Similarly, isEmpty() iterates through the collection looking for a non-null key. This produces the perverse result that a WeakHashMap which is empty, due to having all its keys cleared, requires more time for isEmpty() to return than a similar WeakHashMap that is not empty.

Further Resources

The source code for testing code equivalent to WeakHashMap is in Test.java.

Strong and weak references:

Java performance tuning

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.

    本站是提供个人知识管理的网络存储空间,所有内容均由用户发布,不代表本站观点。请注意甄别内容中的联系方式、诱导购买等信息,谨防诈骗。如发现有害或侵权内容,请点击一键举报。
    转藏 分享 献花(0

    0条评论

    发表

    请遵守用户 评论公约

    类似文章 更多