I am writing a parser that reads data from a JSON file.
data.properties:
{
  "Data Science & Analytics": [
    {
      "Data Mining": [
        "Data Entry",
        "Data Mining",
        "Data Scraping",
        "Online Research"
      ]
    }
  ]
}
These objects then need to be persisted into three entities: Category ("Data Science & Analytics" in this case) -> Speciality ("Data Mining" in this case) -> Skill (["Data Entry", "Data Mining", "Data Scraping", "Online Research"]). A category contains several specialities, a speciality contains several skills, and a skill can belong to several specialities. This means there is a OneToMany relation between Category and Speciality, and a ManyToMany relation between Speciality and Skill.
Category.java:
@Entity
@Table(name = "categories")
@Getter
@Setter
@Builder
@AllArgsConstructor
public class Category extends Auditable<Category> {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    ...
    @OneToMany(
        fetch = FetchType.LAZY,
        cascade = {
            CascadeType.MERGE,
            CascadeType.PERSIST
        },
        orphanRemoval = true,
        mappedBy = "category")
    @JsonIgnore
    private Set<Speciality> specialities;
    ...
}
Speciality.java:
@Entity
@Table(name = "specialities")
@Getter
@Setter
@Builder
@AllArgsConstructor
public class Speciality extends Auditable<Speciality> {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    ...
    @ManyToOne(
        fetch = FetchType.LAZY,
        cascade = {
            CascadeType.MERGE,
            CascadeType.PERSIST
        }
    )
    private Category category;
    ...
    @ManyToMany(
        fetch = FetchType.LAZY,
        cascade = {
            CascadeType.MERGE,
            CascadeType.PERSIST
        },
        targetEntity = Skill.class,
        mappedBy = "specialities")
    @JsonIgnore
    private Set<Skill> skills = new HashSet<>();
    ...
}
Skill.java:
@Entity
@Table(name = "skills")
@Getter
@Setter
@Builder
@AllArgsConstructor
public class Skill extends Auditable<Skill> {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    ...
    @JoinTable(
        name = "skill_speciality",
        joinColumns = @JoinColumn(name = "fk_skill"),
        inverseJoinColumns = @JoinColumn(name = "fk_speciality"),
        uniqueConstraints = @UniqueConstraint(columnNames = {
            "fk_skill", "fk_speciality" })
    )
    @ManyToMany(
        fetch = FetchType.EAGER,
        cascade = {
            CascadeType.ALL
        },
        targetEntity = Speciality.class
    )
    @JsonIgnore
    private Set<Speciality> specialities = new HashSet<>();
    ...
}
When I run the following parse function, I find duplicate records inserted into the skills table each time specialityRepository.save(specialityObj); executes.
Parser function:
public void parseSkillsTags(
        String filename,
        CategoryRepository categoryRepository,
        SpecialityRepository specialityRepository,
        SkillRepository skillRepository) {
    jsonHelper = new JSONHelper();
    JSONObject jsonObject = jsonHelper.parse(filename);
    if (jsonObject != null) {
        // Top-level keys are category names.
        Iterator<String> categoriesIt = jsonObject.keys();
        while (categoriesIt.hasNext()) {
            String categoryName = categoriesIt.next();
            s_logger.info("parseSkillsTags:category:" + categoryName);
            Category categoryObj = new Category();
            categoryObj.setName(categoryName);
            JSONArray specialitiesArray = (JSONArray) jsonObject.get(categoryName);
            s_logger.info("specialitiesArray length:" + specialitiesArray.length());
            for (int i = 0; i < specialitiesArray.length(); i++) {
                JSONObject specialityJsonObj = specialitiesArray.getJSONObject(i);
                s_logger.info("index:" + i);
                // Each key of this object is a speciality name.
                Iterator<String> specialityIt = specialityJsonObj.keys();
                while (specialityIt.hasNext()) {
                    String specialityName = specialityIt.next();
                    s_logger.info("parseSkillsTags:speciality:" + specialityName);
                    Speciality specialityObj = new Speciality();
                    specialityObj.setName(specialityName);
                    specialityObj.setCategory(categoryObj);
                    // category
                    categoryObj.addSpeciality(specialityObj);
                    JSONArray skillsArray = specialityJsonObj.getJSONArray(specialityName);
                    for (int j = 0; j < skillsArray.length(); j++) {
                        String skillName = skillsArray.getString(j);
                        s_logger.info("parseSkillsTags:skill:" + skillName);
                        Skill skillObj = new Skill();
                        skillObj.setName(skillName);
                        // speciality
                        specialityObj.addSkill(skillObj);
                        // skill
                        skillObj.addSpeciality(specialityObj);
                        // To be fixed here: this save runs once per skill.
                        specialityRepository.save(specialityObj);
                    }
                    categoryRepository.save(categoryObj);
                }
            }
        }
    } else {
        s_logger.error("skills conf is missing.");
        return;
    }
}
The first time that save is hit, "Data Entry" is inserted into skills. The second time, "Data Mining" is inserted into skills. The third time, both "Data Mining" and "Data Scraping" are inserted, so "Data Mining" ends up in the table twice.
I am not sure why this happens. Do you have any thoughts? Thanks in advance.
1 Answer
Changing the following code fixes this issue.
To
The reason "Data Mining" is inserted twice is that each time that statement is hit, the entity state needs to be synced with the table. That means adding a findById/findByName lookup, so that a skill that already exists is reused instead of being inserted again.
The specialityRepository save operation is moved out of the loop for performance reasons: all skill objects are added to the speciality first, and then they are saved once.
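The lookup-before-insert idea can be sketched without Spring or JPA. In this sketch SkillRow, InMemorySkillStore, and FindOrCreateSketch are hypothetical stand-ins invented for illustration (they are not the entities or repositories from the question). In the real project the lookup would presumably be a repository query such as the findByName mentioned above, which is an assumption about how that repository is declared.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a persisted skill row; only id and name matter here.
class SkillRow {
    final long id;
    final String name;

    SkillRow(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical in-memory stand-in for a skill repository (not Spring Data).
class InMemorySkillStore {
    private final Map<String, SkillRow> byName = new LinkedHashMap<>();
    private long nextId = 1;
    int insertCount = 0; // counts simulated INSERT statements

    Optional<SkillRow> findByName(String name) {
        return Optional.ofNullable(byName.get(name));
    }

    SkillRow insert(String name) {
        SkillRow row = new SkillRow(nextId++, name);
        byName.put(name, row);
        insertCount++;
        return row;
    }
}

class FindOrCreateSketch {
    // Look the skill up by name first; insert only when it is missing.
    static SkillRow findOrCreate(InMemorySkillStore store, String name) {
        return store.findByName(name).orElseGet(() -> store.insert(name));
    }

    // Collect every skill of one speciality inside the loop and return the
    // full set, so the caller can save the speciality once after the loop.
    static Set<SkillRow> collectSkills(InMemorySkillStore store, List<String> names) {
        Set<SkillRow> skills = new LinkedHashSet<>();
        for (String name : names) {
            skills.add(findOrCreate(store, name));
        }
        return skills;
    }
}
```

Calling collectSkills twice with the same four skill names leaves the store with four rows, because the second pass finds each name and reuses the existing row rather than inserting it again.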