这里有一些关于JPA实体的讨论,以及应该为JPA实体类使用哪些hashCode()/equals()实现。它们中的大多数(如果不是全部)依赖于Hibernate,但是我想中立地讨论它们的jpa实现(顺便说一下,我使用的是EclipseLink)。

所有可能的实现都有其自身的优点和缺点:

hashCode()/equals()契约一致性(不可变性)用于列表/集操作 是否可以检测到相同的对象(例如来自不同会话的对象,来自惰性加载数据结构的动态代理) 实体在分离(或非持久化)状态下是否正确运行

在我看来,有三种选择:

Do not override them; rely on Object.equals() and Object.hashCode() hashCode()/equals() work cannot identify identical objects, problems with dynamic proxies no problems with detached entities Override them, based on the primary key hashCode()/equals() are broken correct identity (for all managed entities) problems with detached entities Override them, based on the Business-Id (non-primary key fields; what about foreign keys?) hashCode()/equals() are broken correct identity (for all managed entities) no problems with detached entities

我的问题是:

我是否错过了一个选择和/或赞成/反对的观点? 你选择了什么,为什么?

更新1:

通过“hashCode()/equals()是坏的”,我的意思是连续的hashCode()调用可能返回不同的值,这(当正确实现时)在对象API文档的意义上不是坏的,但是当试图从Map、Set或其他基于哈希的集合中检索更改的实体时,会导致问题。因此,JPA实现(至少是EclipseLink)在某些情况下不能正确工作。

更新2:

谢谢你的回答——大部分问题都很有质量。 不幸的是,我仍然不确定哪种方法最适合实际应用程序,或者如何确定最适合我的应用程序的方法。所以,我将保持这个问题的开放性,希望有更多的讨论和/或意见。


当前回答

Jakarta Persistence 3.0,第4.12节写道:

相同抽象模式类型的两个实体当且仅当它们具有相同的主键值时相等。

我看不出为什么Java代码的行为应该有所不同。

If the entity class is in a so called "transient" state, i.e. it's not yet persisted and it has no identifier, then the hashCode/equals methods can not return a value, they ought to blow up, ideally implicitly with a NullPointerException when the method attempts to traverse the ID. Either way, this will effectively stop application code from putting a non-managed entity into a hash-based data structure. In fact, why not go one step further and blow up if the class and identifier are equal, but other important attributes such as the version are unequal (IllegalStateException)! Fail-fast in a deterministic way is always the preferred option.

警告:也要记录下爆发行为。文档本身很重要,但它也希望能够阻止初级开发人员在未来对您的代码做一些愚蠢的事情(他们倾向于压制发生NullPointerException的地方,他们最不关心的是副作用,lol)。

哦,总是使用getClass()而不是instanceof。equals方法要求对称性。如果b等于a,那么a必须等于b。对于子类,instanceof打破了这种关系(a不是b的实例)。

尽管我个人总是使用getClass(),即使在实现非实体类(类型是状态,所以子类添加状态,即使子类是空的或只包含行为),只有当类是final时,instanceof才会很好。但实体类必须不是最终的(§2.1),所以我们真的别无选择。

Some folks may not like getClass(), because of the persistence provider's proxy wrapping the object. This might have been a problem in the past, but it really shouldn't be. A provider not returning different proxy classes for different entities, well, I'd say that's not a very smart provider lol. Generally, we shouldn't solve a problem until there is a problem. And, it seems like Hibernate's own documentation doesn't even see it worthwhile mentioning. In fact, they elegantly use getClass() in their own examples (see this).

Lastly, if one has an entity subclass that is an entity, and the inheritance mapping strategy used is not the default ("single table"), but configured to be a "joined subtype", then the primary key in that subclass table will be the same as the superclass table. If the mapping strategy is "table per concrete class", then the primary key may be the same as in the superclass. An entity subclass is very likely to be adding state and therefore just as likely to be logically a different thing. But an equals implementation using instanceof can not necessarily and secondarily rely on the ID only, as we saw may be the same for different entities.

在我看来,instanceof在非final Java类中根本没有位置。对于持久实体来说尤其如此。

其他回答

我使用类EntityBase和继承到我所有的JPA实体,这对我来说非常好。

/**
 * @author marcos.oliveira
 */
@MappedSuperclass
public abstract class EntityBase<TId extends Serializable> implements Serializable{
    /**
     *
     */
    private static final long serialVersionUID = 1L;

    @Id
    @Column(name = "id", unique = true, nullable = false)
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    protected TId id;



    public TId getId() {
        return this.id;
    }

    public void setId(TId id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return (super.hashCode() * 907) + Objects.hashCode(getId());//this.getId().hashCode();
    }

    @Override
    public String toString() {
        return super.toString() + " [Id=" + id + "]";
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        EntityBase entity = (EntityBase) obj;
        if (entity.id == null || id == null) {
            return false;
        }
        return Objects.equals(id, entity.id);
    }
}

参考:https://thorben-janssen.com/ultimate-guide-to-implementing-equals-and-hashcode-with-hibernate/

If you have a business key, then you should use that for equals and hashCode. If you don't have a business key, you should not leave it with the default Object equals and hashCode implementations because that does not work after you merge and entity. You can use the entity identifier in the equals method only if the hashCode implementation returns a constant value, like this: @Entity public class Book implements Identifiable<Long> { @Id @GeneratedValue private Long id; private String title; @Override public boolean equals(Object o) { if (this == o) return true; if (!(o instanceof Book)) return false; Book book = (Book) o; return getId() != null && Objects.equals(getId(), book.getId()); } @Override public int hashCode() { return getClass().hashCode(); } //Getters and setters omitted for brevity }

看看GitHub上的这个测试用例,它证明了这个解决方案很有魅力。

我同意Andrew的回答。我们在应用程序中做同样的事情,但不是将uuid存储为VARCHAR/CHAR,而是将其分割为两个长值。请参阅UUID.getLeastSignificantBits()和UUID.getMostSignificantBits()。

还有一件事需要考虑,对UUID. randomuuid()的调用非常慢,因此您可能希望只在需要时才惰性地生成UUID,例如在持久化期间或调用equals()/hashCode()期间

@MappedSuperclass
public abstract class AbstractJpaEntity extends AbstractMutable implements Identifiable, Modifiable {

    private static final long   serialVersionUID    = 1L;

    @Version
    @Column(name = "version", nullable = false)
    private int                 version             = 0;

    @Column(name = "uuid_least_sig_bits")
    private long                uuidLeastSigBits    = 0;

    @Column(name = "uuid_most_sig_bits")
    private long                uuidMostSigBits     = 0;

    private transient int       hashCode            = 0;

    public AbstractJpaEntity() {
        //
    }

    public abstract Integer getId();

    public abstract void setId(final Integer id);

    public boolean isPersisted() {
        return getId() != null;
    }

    public int getVersion() {
        return version;
    }

    //calling UUID.randomUUID() is pretty expensive, 
    //so this is to lazily initialize uuid bits.
    private void initUUID() {
        final UUID uuid = UUID.randomUUID();
        uuidLeastSigBits = uuid.getLeastSignificantBits();
        uuidMostSigBits = uuid.getMostSignificantBits();
    }

    public long getUuidLeastSigBits() {
        //its safe to assume uuidMostSigBits of a valid UUID is never zero
        if (uuidMostSigBits == 0) {
            initUUID();
        }
        return uuidLeastSigBits;
    }

    public long getUuidMostSigBits() {
        //its safe to assume uuidMostSigBits of a valid UUID is never zero
        if (uuidMostSigBits == 0) {
            initUUID();
        }
        return uuidMostSigBits;
    }

    public UUID getUuid() {
        return new UUID(getUuidMostSigBits(), getUuidLeastSigBits());
    }

    @Override
    public int hashCode() {
        if (hashCode == 0) {
            hashCode = (int) (getUuidMostSigBits() >> 32 ^ getUuidMostSigBits() ^ getUuidLeastSigBits() >> 32 ^ getUuidLeastSigBits());
        }
        return hashCode;
    }

    @Override
    public boolean equals(final Object obj) {
        if (obj == null) {
            return false;
        }
        if (!(obj instanceof AbstractJpaEntity)) {
            return false;
        }
        //UUID guarantees a pretty good uniqueness factor across distributed systems, so we can safely
        //dismiss getClass().equals(obj.getClass()) here since the chance of two different objects (even 
        //if they have different types) having the same UUID is astronomical
        final AbstractJpaEntity entity = (AbstractJpaEntity) obj;
        return getUuidMostSigBits() == entity.getUuidMostSigBits() && getUuidLeastSigBits() == entity.getUuidLeastSigBits();
    }

    @PrePersist
    public void prePersist() {
        // make sure the uuid is set before persisting
        getUuidLeastSigBits();
    }

}

如果你想对你的set使用equals()/hashCode(),也就是说同一个实体只能出现一次,那么只有一个选项:选项2。这是因为根据定义,实体的主键永远不会改变(如果有人确实更新了它,它就不再是同一个实体了)

您应该从字面上理解:因为equals()/hashCode()是基于主键的,所以在设置主键之前,您不能使用这些方法。所以你不应该把实体放到集合里,直到它们被赋主键。(是的,uuid和类似的概念可能有助于早期分配主键。)

Now, it's theoretically also possible to achieve that with Option 3, even though so-called "business-keys" have the nasty drawback that they can change: "All you'll have to do is delete the already inserted entities from the set(s), and re-insert them." That is true - but it also means, that in a distributed system, you'll have to make sure, that this is done absolutely everywhere the data has been inserted to (and you'll have to make sure, that the update is performed, before other things occur). You'll need a sophisticated update mechanism, especially if some remote systems aren't currently reachable...

只有当集合中的所有对象都来自同一个Hibernate会话时,才可以使用选项1。Hibernate文档在13.1.3章中非常清楚地说明了这一点。考虑对象同一性:

Within a Session the application can safely use == to compare objects. However, an application that uses == outside of a Session might produce unexpected results. This might occur even in some unexpected places. For example, if you put two detached instances into the same Set, both might have the same database identity (i.e., they represent the same row). JVM identity, however, is by definition not guaranteed for instances in a detached state. The developer has to override the equals() and hashCode() methods in persistent classes and implement their own notion of object equality.

它继续主张选择3:

这里有一个警告:永远不要使用数据库标识符来实现相等。使用由唯一的、通常是不可变的属性组合而成的业务键。如果将瞬态对象持久化,则数据库标识符将更改。如果瞬态实例(通常与分离实例一起)保存在Set中,更改hashcode将破坏Set的契约。

这是真的,如果你

不能提前分配id(例如使用uuid) 当对象处于瞬态时,你肯定想把它们放到集合中。

否则,您可以自由选择选项2。

然后它提到了相对稳定性的需求:

业务键的属性不必像数据库主键那样稳定;只要对象在同一集合中,你就必须保证稳定性。

这是正确的。我所看到的实际问题是:如果你不能保证绝对的稳定性,你如何能够保证“只要对象在同一个集合中”的稳定性。我可以想象一些特殊的情况(比如只在对话中使用集合,然后将其丢弃),但我会质疑这种方法的一般实用性。


短版:

选项1只能用于单个会话中的对象。 如果可以,使用选项2。(尽早分配PK,因为在分配PK之前你不能在集合中使用对象。) 如果你能保证相对的稳定性,你可以使用选项3。但是要小心。

Jakarta Persistence 3.0,第4.12节写道:

相同抽象模式类型的两个实体当且仅当它们具有相同的主键值时相等。

我看不出为什么Java代码的行为应该有所不同。

If the entity class is in a so called "transient" state, i.e. it's not yet persisted and it has no identifier, then the hashCode/equals methods can not return a value, they ought to blow up, ideally implicitly with a NullPointerException when the method attempts to traverse the ID. Either way, this will effectively stop application code from putting a non-managed entity into a hash-based data structure. In fact, why not go one step further and blow up if the class and identifier are equal, but other important attributes such as the version are unequal (IllegalStateException)! Fail-fast in a deterministic way is always the preferred option.

警告:也要记录下爆发行为。文档本身很重要,但它也希望能够阻止初级开发人员在未来对您的代码做一些愚蠢的事情(他们倾向于压制发生NullPointerException的地方,他们最不关心的是副作用,lol)。

哦,总是使用getClass()而不是instanceof。equals方法要求对称性。如果b等于a,那么a必须等于b。对于子类,instanceof打破了这种关系(a不是b的实例)。

尽管我个人总是使用getClass(),即使在实现非实体类(类型是状态,所以子类添加状态,即使子类是空的或只包含行为),只有当类是final时,instanceof才会很好。但实体类必须不是最终的(§2.1),所以我们真的别无选择。

Some folks may not like getClass(), because of the persistence provider's proxy wrapping the object. This might have been a problem in the past, but it really shouldn't be. A provider not returning different proxy classes for different entities, well, I'd say that's not a very smart provider lol. Generally, we shouldn't solve a problem until there is a problem. And, it seems like Hibernate's own documentation doesn't even see it worthwhile mentioning. In fact, they elegantly use getClass() in their own examples (see this).

Lastly, if one has an entity subclass that is an entity, and the inheritance mapping strategy used is not the default ("single table"), but configured to be a "joined subtype", then the primary key in that subclass table will be the same as the superclass table. If the mapping strategy is "table per concrete class", then the primary key may be the same as in the superclass. An entity subclass is very likely to be adding state and therefore just as likely to be logically a different thing. But an equals implementation using instanceof can not necessarily and secondarily rely on the ID only, as we saw may be the same for different entities.

在我看来,instanceof在非final Java类中根本没有位置。对于持久实体来说尤其如此。