在Java 8中,我如何使用流API通过检查每个对象的属性的清晰度来过滤一个集合?

例如,我有一个Person对象列表,我想删除同名的人,

persons.stream().distinct();

将对Person对象使用默认的相等性检查,所以我需要这样的东西,

persons.stream().distinct(p -> p.getName());

不幸的是,distinct()方法没有这样的重载。如果不修改Person类内部的相等检查,是否可以简洁地做到这一点?


当前回答

Set<YourPropertyType> set = new HashSet<>();
list
        .stream()
        .filter(it -> set.add(it.getYourProperty()))
        .forEach(it -> ...);

其他回答

在我的情况下,我需要控制什么是前一个元素。然后,我创建了一个有状态的Predicate,我在其中控制前一个元素是否与当前元素不同,在这种情况下,我保留了它。

public List<Log> fetchLogById(Long id) {
    return this.findLogById(id).stream()
        .filter(new LogPredicate())
        .collect(Collectors.toList());
}

public class LogPredicate implements Predicate<Log> {

    private Log previous;

    public boolean test(Log atual) {
        boolean isDifferent = previouws == null || verifyIfDifferentLog(current, previous);

        if (isDifferent) {
            previous = current;
        }
        return isDifferent;
    }

    private boolean verifyIfDifferentLog(Log current, Log previous) {
        return !current.getId().equals(previous.getId());
    }

}

有一种更简单的方法,使用带有自定义比较器的TreeSet。

persons.stream()
    .collect(Collectors.toCollection(
      () -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName())) 
));

实现这一点最简单的方法是跳到sort特性上,因为它已经提供了一个可选的Comparator,可以使用元素的属性创建。然后你必须过滤掉重复项,这可以使用一个状态完备的Predicate来完成,它使用的事实是,对于一个已排序的流,所有相等的元素是相邻的:

Comparator<Person> c=Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
    Person previous;
    public boolean test(Person p) {
      if(previous!=null && c.compare(previous, p)==0)
        return false;
      previous=p;
      return true;
    }
})./* more stream operations here */;

当然,一个有状态的Predicate不是线程安全的,但是如果你需要,你可以把这个逻辑移到一个Collector中,让流在使用你的Collector时处理线程安全。这取决于你想如何处理你在问题中没有告诉我们的不同元素流。

我做了一个通用版本:

private <T, R> Collector<T, ?, Stream<T>> distinctByKey(Function<T, R> keyExtractor) {
    return Collectors.collectingAndThen(
            toMap(
                    keyExtractor,
                    t -> t,
                    (t1, t2) -> t1
            ),
            (Map<R, T> map) -> map.values().stream()
    );
}

每年的例子:

Stream.of(new Person("Jean"), 
          new Person("Jean"),
          new Person("Paul")
)
    .filter(...)
    .collect(distinctByKey(Person::getName)) // return a stream of Person with 2 elements, jean and Paul
    .map(...)
    .collect(toList())
Here is the example
public class PayRoll {

    private int payRollId;
    private int id;
    private String name;
    private String dept;
    private int salary;


    public PayRoll(int payRollId, int id, String name, String dept, int salary) {
        super();
        this.payRollId = payRollId;
        this.id = id;
        this.name = name;
        this.dept = dept;
        this.salary = salary;
    }
} 

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class Prac {
    public static void main(String[] args) {

        int salary=70000;
        PayRoll payRoll=new PayRoll(1311, 1, "A", "HR", salary);
        PayRoll payRoll2=new PayRoll(1411, 2    , "B", "Technical", salary);
        PayRoll payRoll3=new PayRoll(1511, 1, "C", "HR", salary);
        PayRoll payRoll4=new PayRoll(1611, 1, "D", "Technical", salary);
        PayRoll payRoll5=new PayRoll(711, 3,"E", "Technical", salary);
        PayRoll payRoll6=new PayRoll(1811, 3, "F", "Technical", salary);
        List<PayRoll>list=new ArrayList<PayRoll>();
        list.add(payRoll);
        list.add(payRoll2);
        list.add(payRoll3);
        list.add(payRoll4);
        list.add(payRoll5);
        list.add(payRoll6);


        Map<Object, Optional<PayRoll>> k = list.stream().collect(Collectors.groupingBy(p->p.getId()+"|"+p.getDept(),Collectors.maxBy(Comparator.comparingInt(PayRoll::getPayRollId))));


        k.entrySet().forEach(p->
        {
            if(p.getValue().isPresent())
            {
                System.out.println(p.getValue().get());
            }
        });



    }
}

Output:

PayRoll [payRollId=1611, id=1, name=D, dept=Technical, salary=70000]
PayRoll [payRollId=1811, id=3, name=F, dept=Technical, salary=70000]
PayRoll [payRollId=1411, id=2, name=B, dept=Technical, salary=70000]
PayRoll [payRollId=1511, id=1, name=C, dept=HR, salary=70000]