CSV导入到MySQL中被截断-数据比列多

d8tt03nd  于 2023-09-28  发布在  Mysql
关注(0)|答案(1)|浏览(116)

对不起,如果它似乎已经回答了很多次,但花了很多时间试图弄清楚。我有一个从CSV到MySQL的复杂导入。我的CSV文件是在不同的软件中生成的,更改它是不可能的。
我的导入代码:

LOAD DATA INFILE 'krupka_lezaky.csv'
INTO TABLE table_lezaky_krupka5 
    FIELDS TERMINATED BY ',' 
    ENCLOSED BY '"' 
    LINES TERMINATED BY '\n' 
    IGNORE 1 LINES
    (@var1, @var2, NAZEV, DOPLNEK, @var5, @var6, @var7, @var8, @var9, @var10, @var11, @var12)
SET 
    ID_ZBOZI = REPLACE(@var1, '"', ''), -- Remove double quotes
    ID_CISELNIK = REPLACE(@var2, '"', ''), -- Remove double quotes
    MNOZSTVI = REPLACE(REPLACE(@var5, ',', '.'), '"', ''), -- Remove double quotes and commas
    NCENA_BD = REPLACE(REPLACE(@var6, ',', '.'), '"', ''),
    NCENA_SD = REPLACE(REPLACE(@var7, ',', '.'), '"', ''),
    PCENA_SD = REPLACE(REPLACE(@var8, ',', '.'), '"', ''),
    PCENA_RP = REPLACE(REPLACE(@var9, ',', '.'), '"', ''),
    DAT_PRIJEM = STR_TO_DATE(@var10, '%d.%m.%Y %H:%i:%s'),
    DAT_VYDEJ = STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'),
    KOD_VZP = REPLACE(@var12, '"', ''), -- Remove double quotes
    ID_PROVOZOVNA = 4; -- Set the fixed value for the 13th column

我的表格格式:

我的表代码:

CREATE TABLE `table_lezaky_krupka5` (
  `ID_ZBOZI` int NOT NULL,
  `ID_CISELNIK` varchar(5) DEFAULT NULL,
  `NAZEV` varchar(200) DEFAULT NULL,
  `DOPLNEK` varchar(200) DEFAULT NULL,
  `MNOZSTVI` decimal(10,4) DEFAULT NULL,
  `NCENA_BD` decimal(10,4) DEFAULT NULL,
  `NCENA_SD` decimal(10,4) DEFAULT NULL,
  `PCENA_SD` decimal(10,4) DEFAULT NULL,
  `PCENA_RP` decimal(10,4) DEFAULT NULL,
  `DAT_PRIJEM` datetime DEFAULT NULL,
  `DAT_VYDEJ` datetime DEFAULT NULL,
  `KOD_VZP` varchar(20) DEFAULT NULL,
  `ID_PROVOZOVNA` int NOT NULL,
  PRIMARY KEY (`ID_ZBOZI`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

我的csv文件:

"ID_ZBOZI","ID_CISELNIK","NAZEV","DOPLNEK","MNOZSTVI","NCENA_BD","NCENA_SD","PCENA_SD","PCENA_RP","DAT_PRIJEM","DAT_VYDEJ","KOD_VZP"
"11311782","*","5PreveMax Imunit nukleotidy+betaglukan","tbl 60+20","1,0000","238,5000","274,2800","365,0000","365,0000","21.07.2022 07:34:57","14.07.2022 14:32:58",""
"15100001","1","ACYLCOFFIN","POR TBL NOB 10","6,0000","307,0100","337,7300","470,0000","470,0000","20.07.2022 08:08:38","10.02.2023 11:07:09","9"
"15103351","1","AERIUS 5 MG","TBL 30X5MG","1,0000","109,9100","120,9000","160,0000","160,0000","27.07.2022 07:55:40","26.07.2022 07:55:05","26329"
"15144399","1","AESCULUS HIPPOCASTANUM","GRA 4GM 5CH","2,0000","148,3200","163,1600","230,0000","230,0000","06.01.2021 08:30:50","","62741"
"15100094","1","AGOLUTIN","INJ 5X2ML/60MG","1,0000","55,7400","61,3100","70,0000","70,0000","11.02.2022 08:13:57","07.02.2022 14:58:29","186149"
"15103137","*","Alpa bylinný masážní gel kaštan ","100ml","1,0000","35,5700","43,0400","65,0000","65,0000","23.12.2022 10:15:38","22.12.2022 09:14:17",""
"15102019","*","Alpa emulze s arnikou ","100ml","2,0000","81,1200","98,1600","138,0000","138,0000","15.02.2023 07:51:23","14.02.2023 10:51:55",""
"15104744","1","AMPRILAN 5","TBL 30X5MG","1,0000","23,6000","25,9600","34,0000","34,0000","23.12.2022 10:15:25","22.12.2022 16:29:00","23962"
"15120270","1","ANALERGIN NEO","5MG TBL FLM 90 II","1,0000","195,7900","215,3700","283,0000","283,0000","23.09.2022 06:48:38","22.09.2022 11:23:23","201946"

错误代码:
Error Code: 1262. Row 1 was truncated; it contained more data than there were input columns
我想我的问题是CSV文件的格式,我们使用逗号作为小数点分隔符,然后我的十进制值也得到了双引号。我试图绕过它,使用替换来获得正确的格式,但仍然得到错误。任何帮助都将不胜感激,谢谢。

ikfrs5lh

ikfrs5lh1#

谢谢你们的帮助,我终于找到了解决办法。有两个问题

  1. CSV文件中的一些日期为空,因此str_to_date导致错误,使条件有帮助(至少这是我认为的原因)
    1.我resaved在UTF-8编码的CSV文件,似乎原来的文件是在ANSI编码
    我的代码是这样的:
LOAD DATA INFILE 'hranik_lezaky.csv'
INTO TABLE table_lezaky_hranik
CHARACTER SET latin1 -- Assuming ANSI encoding is similar to ISO 8859-1 (Latin-1)
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 LINES
(@var1, @var2, NAZEV, DOPLNEK, @var5, @var6, @var7, @var8, @var9, @var10, @var11, @var12)
SET 
    ID_ZBOZI = @var1,
    ID_CISELNIK = @var2,
    MNOZSTVI = REPLACE(@var5, ',', '.'),
    NCENA_BD = REPLACE(@var6, ',', '.'),
    NCENA_SD = REPLACE(@var7, ',', '.'),
    PCENA_SD = REPLACE(@var8, ',', '.'),
    PCENA_RP = REPLACE(@var9, ',', '.'),
    DAT_PRIJEM = IF(@var11 <> '', STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'), NULL),
    DAT_VYDEJ = IF(@var11 <> '', STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'), NULL),
    KOD_VZP = @var12,
    ID_PROVOZOVNA = 1; -- Set the fixed value for the 13th column

最后,我设法导入数据,现在我将尝试在此代码中更改导入CSV文件的编码
编辑//完成,谢谢:)

相关问题