spring-data-jpa: JPA save operation inserts duplicate records

svujldwt  posted on 2022-11-10  in  Spring

I wrote a parser that reads data from a JSON file.
data.properties:

{
  "Data Science & Analytics": [
    {
      "Data Mining": [
        "Data Entry",
        "Data Mining",
        "Data Scraping",
        "Online Research"
      ]
    }
  ]
}

We then need to persist these objects into three entities: Category ("Data Science & Analytics" in this case) -> Speciality ("Data Mining" in this case) -> Skill (["Data Entry", "Data Mining", "Data Scraping", "Online Research"]). A category contains several specialities, a speciality contains several skills, and a skill can belong to several specialities. This means there is a OneToMany relation between Category and Speciality, and a ManyToMany relation between Speciality and Skill.
Category.java:

@Entity
@Table(name = "categories")
@Getter
@Setter
@Builder
@AllArgsConstructor

public class Category extends Auditable<Category>{
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
...
    @OneToMany(
            fetch = FetchType.LAZY,
            cascade = {
                    CascadeType.MERGE,
                    CascadeType.PERSIST
            },
            orphanRemoval = true,
            mappedBy = "category")
    @JsonIgnore
    private Set<Speciality> specialities;
...
}

Speciality.java

@Entity
@Table(name = "specialities")
@Getter
@Setter
@Builder
@AllArgsConstructor
public class Speciality extends Auditable<Speciality>{
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
...
    @ManyToOne(
            fetch = FetchType.LAZY,
            cascade = {
                    CascadeType.MERGE,
                    CascadeType.PERSIST
            }
    )
    private Category category;
...
    @ManyToMany(
            fetch = FetchType.LAZY,
            cascade = {
                    CascadeType.MERGE,
                    CascadeType.PERSIST
            },
            targetEntity=Skill.class,
            mappedBy = "specialities")
    @JsonIgnore
    private Set<Skill> skills = new HashSet<>();
...
}

Skill.java

@Entity
@Table(name = "skills")
@Getter
@Setter
@Builder
@AllArgsConstructor

public class Skill extends Auditable<Skill> {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
...
    @JoinTable( name = "skill_speciality",
            joinColumns = @JoinColumn(name = "fk_skill"),
            inverseJoinColumns = @JoinColumn(name = "fk_speciality"),
            uniqueConstraints = @UniqueConstraint(columnNames = {
                    "fk_skill", "fk_speciality" })
    )
    @ManyToMany(
            fetch = FetchType.EAGER,
            cascade = {
                    CascadeType.ALL
            },
            targetEntity = Speciality.class
    )
    @JsonIgnore
    private Set<Speciality> specialities = new HashSet<>();
...
}

When I run the following parse function, I find duplicate records inserted into the skills table when specialityRepository.save(specialityObj); executes.
Parser function:

public void parseSkillsTags(
            String filename,
            CategoryRepository categoryRepository,
            SpecialityRepository specialityRepository,
            SkillRepository skillRepository) {

        jsonHelper = new JSONHelper();
        JSONObject jsonObject = jsonHelper.parse(filename);
        if (jsonObject != null) {
            Iterator<String> categoriesIt = jsonObject.keys();

            while (categoriesIt.hasNext()) {
                String categoryName = categoriesIt.next();
                s_logger.info("parseSkillsTags:category:" + categoryName);

                Category categoryObj = new Category();
                categoryObj.setName(categoryName);

                JSONArray specialitiesArray = (JSONArray) jsonObject.get(categoryName);
                s_logger.info("specialitiesArray length:" + specialitiesArray.length());
                for (int i = 0; i < specialitiesArray.length(); i++) {
                    JSONObject specialityJsonObj = specialitiesArray.getJSONObject(i);
                    s_logger.info("index:" + i);
                    Iterator<String> specialityIt = specialityJsonObj.keys();

                    while (specialityIt.hasNext()) {
                        String specialityName = specialityIt.next();
                        s_logger.info("parseSkillsTags:speciality:" + specialityName);

                        Speciality specialityObj = new Speciality();
                        specialityObj.setName(specialityName);

                        specialityObj.setCategory(categoryObj);

                        // category
                        categoryObj.addSpeciality(specialityObj);

                        JSONArray skillsArray = specialityJsonObj.getJSONArray(specialityName);
                        for (int j = 0; j < skillsArray.length(); j++) {
                            String skillName = skillsArray.getString(j);
                            s_logger.info("parseSkillsTags:skill:" + skillName);

                            Skill skillObj = new Skill();
                            skillObj.setName(skillName);

                            // speciality
                            specialityObj.addSkill(skillObj);

                            // skill
                            skillObj.addSpeciality(specialityObj);

                            // To be fix here.
                            specialityRepository.save(specialityObj);

                        }

                        categoryRepository.save(categoryObj);

                    }
                }
            }
        } else {
            s_logger.error("skills conf is missing.");
            return;
        }
    }

The first time the save call is hit, "Data Entry" is inserted into skills. The second time, "Data Mining" is inserted into skills. The third time, both "Data Mining" and "Data Scraping" are inserted.

I am not sure why this happens. Do you have any thoughts? Thanks in advance.

cnjp1d6j  #1

Changing the following code fixes this issue.

                            // To be fix here.
                            specialityRepository.save(specialityObj);

                        }

                        categoryRepository.save(categoryObj);

To

                        }
                        // Fix it.
                        specialityRepository.save(specialityObj);
                        categoryRepository.save(categoryObj);

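The reason "Data Mining" is inserted twice is that each time the save call inside the loop is hit, the in-memory entities need to be synced with the table again. The skill objects added in earlier iterations are not synced, so they appear to be treated as new and are inserted again. To keep the save inside the loop, we would need to add a findById/findByName lookup to sync them first.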
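The lookup mentioned above could look roughly like the following sketch. It is plain Java with a `HashMap` standing in for the skills table; `SkillRegistry`, `SkillRow` and `findOrCreate` are hypothetical names used only for illustration and are not part of Spring Data. With a real repository, the equivalent would be a `findByName` query method on `SkillRepository`, which the question's code does not show.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a skills table keyed by name (not a Spring Data API).
class SkillRegistry {
    // Minimal row: a generated id plus the skill name.
    static final class SkillRow {
        final long id;
        final String name;

        SkillRow(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    private final Map<String, SkillRow> rowsByName = new HashMap<>();
    private long nextId = 1;

    // Plays the role of a findByName query.
    Optional<SkillRow> findByName(String name) {
        return Optional.ofNullable(rowsByName.get(name));
    }

    // Reuse the stored row when the name is already known; insert it otherwise.
    SkillRow findOrCreate(String name) {
        return findByName(name).orElseGet(() -> {
            SkillRow row = new SkillRow(nextId++, name);
            rowsByName.put(name, row);
            return row;
        });
    }

    // Number of distinct skills stored so far.
    int size() {
        return rowsByName.size();
    }
}
```

Calling `findOrCreate("Data Mining")` twice returns the same row in this sketch, so a skill shared by several specialities would be stored only once.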
We also move the specialityRepository save operation out of the loop for performance: we add all the skill objects to the speciality first, and then save them once.
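To illustrate why the position of the save call matters, here is a small plain-Java simulation. It does not use JPA at all. `FakeStore` is a hypothetical stand-in that assumes a save on an already stored parent inserts id-less children as copies, without writing generated ids back to the caller's objects. That assumption is only one plausible way to reproduce the duplicates described in the question, not a statement about what the persistence provider does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation; FakeStore is hypothetical and is not a JPA API.
class SaveLoopDemo {
    static class Skill {
        Long id;            // stays null until the caller's object is given an id
        final String name;

        Skill(String name) {
            this.name = name;
        }
    }

    static class Speciality {
        Long id;
        final List<Skill> skills = new ArrayList<>();
    }

    static class FakeStore {
        final List<String> skillRows = new ArrayList<>();
        private long nextId = 1;

        // Assumption: the first save assigns ids to the caller's objects.
        // Later saves insert id-less children as copies and leave the
        // caller's objects untouched, so they look new again next time.
        void save(Speciality speciality) {
            boolean firstSave = (speciality.id == null);
            if (firstSave) {
                speciality.id = nextId++;
            }
            for (Skill skill : speciality.skills) {
                if (skill.id == null) {
                    skillRows.add(skill.name);
                    long generated = nextId++;
                    if (firstSave) {
                        skill.id = generated; // id reaches the caller's object
                    }
                }
            }
        }
    }

    // Adds the skills to one speciality, saving per skill or once at the end.
    static List<String> run(boolean saveInsideLoop, String... names) {
        FakeStore store = new FakeStore();
        Speciality speciality = new Speciality();
        for (String name : names) {
            speciality.skills.add(new Skill(name));
            if (saveInsideLoop) {
                store.save(speciality);
            }
        }
        if (!saveInsideLoop) {
            store.save(speciality);
        }
        return store.skillRows;
    }
}
```

Under that assumption, `run(false, ...)` with the four skills from the JSON file stores four rows, while `run(true, ...)` stores "Data Mining" again on the third call, matching the order of inserts reported in the question.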
