Why overriding equals() means you must also override hashCode(), explained in detail

liang1234_ 2019-01-26


Starting from the hashCode() and equals() methods of the Object class:

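I recently read the source code of the Object class and came away with a deeper understanding of hashCode() and equals(). The reason why overriding equals() requires overriding hashCode() is easier to understand when you start from the root, the Object class.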

First, the Object source for hashCode() and equals():

    public native int hashCode();

    public boolean equals(Object obj) {
        return (this == obj);
    }

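From the code alone we can see that hashCode() is a native method. For a plain Object, the value it returns is typically derived from the object's internal address (the Javadoc below says so, and also notes that this technique is not required), while equals() uses == to check whether two references point to the same object. So in Object the two methods agree with each other: both are based on object identity. There is one more key point; look at the Javadoc of hashCode() in the Object class: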

/**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hash tables such as those provided by
     * {@link java.util.HashMap}.
     * <p>
     * The general contract of {@code hashCode} is:
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application, the {@code hashCode} method
     *     must consistently return the same integer, provided no information
     *     used in {@code equals} comparisons on the object is modified.
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     * <li>If two objects are equal according to the {@code equals(Object)}
     *     method, then calling the {@code hashCode} method on each of
     *     the two objects must produce the same integer result.
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the {@code hashCode} method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hash tables.
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class {@code Object} does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java™ programming language.)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.lang.System#identityHashCode
     */
    public native int hashCode();

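In short, the general contract of hashCode is:

1. During one execution of a Java application, calling hashCode() on the same object more than once must consistently return the same integer, provided no information used in equals comparisons on the object is modified. The integer need not stay the same from one execution of the application to another.
2. If two objects are equal according to equals(Object), calling hashCode() on each of them must produce the same integer result.
3. If two objects are unequal according to equals(java.lang.Object), calling hashCode() on each of them is not required to produce distinct integer results. However, the programmer should be aware that producing distinct results for unequal objects may improve the performance of hash tables.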

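To put the second and third points more simply: hashCode() and equals() must stay consistent. If equals() returns true, the two objects' hashCode() values must be the same. If equals() returns false, the hash codes are allowed to be the same, but that hurts hash table performance, so in general we should make them differ. This makes it clear why overriding equals() requires overriding hashCode().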

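Suppose a class overrides equals() so that two objects are equal when their fields are equal. If hashCode() is not overridden, it still returns Object's identity-based value, which will almost certainly differ between two distinct objects. We then have objects that are equal according to equals() but have different hash codes, which violates the hashCode contract. Later on we will see the serious problems this causes in the collections framework.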
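The situation just described can be sketched as follows. This is only an illustration: the Person class, its id field, and the sample value "ID-001" are hypothetical names, not taken from any real code base.

```java
import java.util.HashSet;
import java.util.Set;

public class EqualsOnlyDemo {
    // Hypothetical class: equality is based on the id field only.
    static final class Person {
        private final String id;

        Person(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            return id.equals(((Person) obj).id);
        }
        // hashCode() is deliberately NOT overridden, so Object's version is used.
    }

    public static void main(String[] args) {
        Person p1 = new Person("ID-001");
        Person p2 = new Person("ID-001");

        System.out.println(p1.equals(p2)); // true

        // Identity-based values; they will almost always differ.
        System.out.println(p1.hashCode() + " vs " + p2.hashCode());

        Set<Person> set = new HashSet<>();
        set.add(p1);
        // Usually false: p2 hashes differently, so the set cannot find it.
        System.out.println(set.contains(p2));
    }
}
```

The last two lines of output are not fixed, because identity hash codes vary from run to run; only the first line is guaranteed.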

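What overriding achieves:

If you override equals() (driven by a requirement: for example, in a Person class you may treat two objects as equal when their ID-card numbers match, regardless of the other fields), you must also override hashCode() so that it stays consistent with that notion of equality. If you override neither, the methods inherited from Object are guaranteed to be consistent with each other.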

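1. The rule that overriding equals() requires overriding hashCode() matters mainly for hash-based collections such as HashSet and the hash-based Map implementations. The collections framework can only store objects (that is, references to objects; primitive values are autoboxed).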

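When an element is added to a HashSet, the HashSet calls that object's hashCode() method to get its hash code, and uses that value to decide where the object is stored. Put simply, HashSet treats two elements as equal when they compare equal with equals() and their hashCode() return values are also equal. If two elements compare equal with equals() but return different hashCode() values, HashSet does not treat them as duplicates (they usually end up in different positions), so both are added successfully.

The same holds for Map. In implementations such as Hashtable (a name that breaks the naming convention, left over from JDK 1.0) and HashMap, the stored data are <key, value> pairs. Both key and value are objects, wrapped in a Map.Entry, so every element of the collection is a Map.Entry object. The criterion for two keys being equal is again that equals() returns true and the hashCode() values are equal. For values, being equal under equals() is enough.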
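A minimal sketch of the key-lookup behaviour described above, assuming a hypothetical Person key whose equality depends only on an id field (the class, field names, and sample values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyLookupDemo {
    // Hypothetical key class: two Persons are equal when their ids match.
    static final class Person {
        private final String id;
        private final String name;

        Person(String id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            return id.equals(((Person) obj).id);
        }

        @Override
        public int hashCode() {
            // Uses exactly the field that equals() compares.
            return id.hashCode();
        }
    }

    // Stores a value under one key object and reads it back with another, equal one.
    static String lookup() {
        Map<Person, String> map = new HashMap<>();
        map.put(new Person("ID-001", "Alice"), "first");
        return map.get(new Person("ID-001", "Alice B."));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // first
    }
}
```

Because both methods are overridden consistently, a different but equal key object finds the entry. If hashCode() were removed from Person, the same lookup would usually return null.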

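A few side notes: (1) For two objects, == compares references (addresses); to compare them by content, use equals() (overridden as needed).

(2) Whenever you override equals(), override hashCode() as well.

(3) As a rule, equal objects are required to have the same hash code.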

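Terminology: hash (hashing); a Map is also known as an associative array or a dictionary.

2. The collection classes all override toString(). The String class overrides equals() and hashCode(), so it compares by value.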
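The difference between == and the value-based equals()/hashCode() of String can be seen in a small sketch. The compare helper is a hypothetical name introduced only for this example:

```java
public class StringEqualityDemo {
    // Returns {a == b, a.equals(b), a.hashCode() == b.hashCode()}.
    static boolean[] compare(String a, String b) {
        return new boolean[] { a == b, a.equals(b), a.hashCode() == b.hashCode() };
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same characters.
        String a = new String("hello");
        String b = new String("hello");

        boolean[] r = compare(a, b);
        System.out.println(r[0]); // false: different references
        System.out.println(r[1]); // true: same character content
        System.out.println(r[2]); // true: equal strings have equal hash codes
    }
}
```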

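Using HashSet to verify that both methods need to be overridden

The program provides three classes, A, B and C, each of which overrides one or both of equals() and hashCode().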

public class A {
    @Override
    public boolean equals(Object obj) {
        return true;
    }
}

public class B {
    @Override
    public int hashCode() {
        return 1;
    }
}

public class C {
    @Override
    public int hashCode() {
        return 2;
    }
    @Override
    public boolean equals(Object obj) {
        return true;
    }
}

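import java.util.HashSet;
import java.util.Set;

public class HashSetTest {
    public static void main(String[] args) {
        // Add two instances of each class and see which ones the set keeps.
        Set<Object> hashSet = new HashSet<>();
        hashSet.add(new A());
        hashSet.add(new A());
        hashSet.add(new B());
        hashSet.add(new B());
        hashSet.add(new C());
        hashSet.add(new C());
        for (Object hs : hashSet) {
            System.out.println(hs);
        }
        // HashSet has an overridden toString() (inherited from AbstractCollection)
//        System.out.println(hashSet);
    }
}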

The output is:

cn.edu.uestc.collection.B@1
cn.edu.uestc.collection.B@1
cn.edu.uestc.collection.C@2
cn.edu.uestc.collection.A@3f84246a
cn.edu.uestc.collection.A@18a9fa9c

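The output shows that both A objects were added (equals() returns true, but their hash codes differ) and both B objects were added (same hash code, but equals() returns false), while only one C was kept. Both methods must therefore be overridden together; otherwise the no-duplicates property of Set is broken.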
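The same experiment can be checked by counting set sizes instead of reading the printed output. The sketch below re-declares B and C as nested classes so that it is self-contained; sizeAfterAddingTwo is a hypothetical helper name. Class A is left out because its result depends on identity hash codes, which are not fixed.

```java
import java.util.HashSet;
import java.util.Set;

public class SetSizeDemo {
    // Overrides hashCode() only: same bucket, but equals() is still identity.
    static class B {
        @Override
        public int hashCode() {
            return 1;
        }
    }

    // Overrides both: same hash code and equals() always returns true.
    static class C {
        @Override
        public int hashCode() {
            return 2;
        }

        @Override
        public boolean equals(Object obj) {
            return true;
        }
    }

    // Adds both objects to a fresh HashSet and reports how many were kept.
    static int sizeAfterAddingTwo(Object first, Object second) {
        Set<Object> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwo(new B(), new B())); // 2
        System.out.println(sizeAfterAddingTwo(new C(), new C())); // 1
    }
}
```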

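Principles for overriding hashCode()

(1) Calling hashCode() on the same object multiple times should return the same value;

(2) When two objects compare equal with equals(), their hashCode() methods should return equal (int) values;

(3) Every field (member variable) that equals() uses as a comparison criterion should be used to compute the hash code.

How to compute the hash code of a field: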

// f is a field of the given type
boolean                   hashCode = (f ? 0 : 1)
byte, short, char, int    hashCode = (int) f
long                      hashCode = (int) (f ^ (f >>> 32))
float                     hashCode = Float.floatToIntBits(f)
double                    long l = Double.doubleToLongBits(f); hashCode = (int) (l ^ (l >>> 32))
reference type            hashCode = f.hashCode()
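Since Java 8 the wrapper classes expose static hashCode methods that implement the same formulas for long, float and double, so the manual expressions can be cross-checked. A small sketch follows; hashLong and hashDouble are hypothetical helper names. Note that Boolean.hashCode uses 1231 and 1237 rather than 0 and 1; any fixed pair of distinct values works.

```java
public class FieldHashDemo {
    // Manual formula for a long field.
    static int hashLong(long f) {
        return (int) (f ^ (f >>> 32));
    }

    // Manual formula for a double field: convert to bits, then fold like a long.
    static int hashDouble(double f) {
        long l = Double.doubleToLongBits(f);
        return (int) (l ^ (l >>> 32));
    }

    public static void main(String[] args) {
        long l = 1234567890123L;
        double d = 3.14;
        float f = 2.5f;

        System.out.println(hashLong(l) == Long.hashCode(l));                 // true
        System.out.println(hashDouble(d) == Double.hashCode(d));             // true
        System.out.println(Float.floatToIntBits(f) == Float.hashCode(f));    // true
        System.out.println(Boolean.hashCode(true));                          // 1231
        System.out.println(Boolean.hashCode(false));                         // 1237
    }
}
```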

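Add up the hash codes computed for each field and return the sum. To avoid accidental collisions from plain addition (the individual values differ but the sums happen to match), multiply each field's hash code by a prime before adding, for example:

return f1.hashCode() * 17 + (int) f2 * 13;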
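Putting the per-field rules together, a complete class might look like the sketch below. The Employee class and its fields are hypothetical. Instead of a separate prime per field, it uses the widely seen variant with a seed of 17 and a running multiplier of 31, which is a swapped-in convention rather than the exact formula above.

```java
import java.util.Objects;

public class CombineFieldsDemo {
    // Hypothetical class used only for illustration.
    static final class Employee {
        private final String name;
        private final int age;
        private final double salary;

        Employee(String name, int age, double salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Employee)) return false;
            Employee other = (Employee) obj;
            return age == other.age
                    && Double.compare(salary, other.salary) == 0
                    && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Every field compared in equals() contributes to the hash code.
            int result = 17;
            result = 31 * result + Objects.hashCode(name);
            result = 31 * result + age;
            result = 31 * result + Double.hashCode(salary);
            return result;
        }
    }

    public static void main(String[] args) {
        Employee a = new Employee("Ann", 30, 1000.0);
        Employee b = new Employee("Ann", 30, 1000.0);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

When hand-tuning is not needed, Objects.hash(name, age, salary) from java.util is a shorter alternative, at the cost of boxing the primitive arguments.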

Look at the String source code to see how hashCode() is implemented:

  /**
     * Returns a hash code for this string. The hash code for a
     * <code>String</code> object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using <code>int</code> arithmetic, where <code>s[i]</code> is the
     * <i>i</i>th character of the string, <code>n</code> is the length of
     * the string, and <code>^</code> indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;
            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
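As a worked example of the formula in the Javadoc above: for "ab", with 'a' = 97 and 'b' = 98, the hash is 97 * 31 + 98 = 3105. The sketch below recomputes the polynomial by hand and compares it with String.hashCode(); polyHash is a hypothetical helper name, not part of the JDK.

```java
public class StringHashDemo {
    // Recomputes s[0]*31^(n-1) + ... + s[n-1] using int arithmetic.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("ab"));                          // 3105
        System.out.println("ab".hashCode());                         // 3105
        System.out.println(polyHash("hello") == "hello".hashCode()); // true
        System.out.println(polyHash(""));                            // 0
    }
}
```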