Linux bash sed: replacing a date-time with a Unix timestamp

Posted by az31mfrm on 2022-11-02 in Linux

I'm no expert in bash or Linux, but I need to preprocess some financial data (OHLC data) in JSON syntax, as shown below:

$ data='
[
{ "t": "2022-09-01T00:00:00", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": "2022-09-01T00:00:15", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": "2022-09-01T00:00:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": "2022-09-01T00:02:45", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": "2022-09-01T00:03:00", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": "2022-09-01T00:03:45", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": "2022-09-01T00:04:45", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": "2022-09-01T00:05:00", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": "2022-09-01T00:05:15", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": "2022-09-01T00:05:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]'

I want to use the date command to convert the time field ("t") from its current format to a Unix timestamp, like this:

new_data=
[
{ "t": 1661974200, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": 1661974215, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": 1661974245, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": 1661974365, "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": 1661974380, "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": 1661974425, "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": 1661974485, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": 1661974500, "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": 1661974515, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": 1661974545, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

With some help from Google, I tried running this command:

$ echo "$data" | sed "s/\"([0-9]+-[0-9]+-[0-9]+T[0-9]+:[0-9]+:[0-9]+(\.[0-9]*Z)?)\"/$(date --date=\1 +'%s')/g"

But every record in the output gets the same timestamp!

output=
[
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": 1666733400, "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": 1666733400, "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": 1666733400, "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

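Finally, after many failed attempts, I realized that the replacement part of sed does not pass the matched substring to date; it only passes the literal string "\1".
Can anyone please guide me on how to solve this? Thanks, guys.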
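
The behaviour described above comes from the order of evaluation: the shell expands `$(...)` inside a double-quoted string before sed starts, so date runs once and only ever sees the argument `--date=1` (the backslash in `\1` is consumed by the shell). GNU date appears to read a bare `1` as a time of day today, which would explain the identical value on every line. The following is a minimal sketch of that expansion order; `FIXED` and the sample input are placeholders, not part of the original command. Separately, the pattern as quoted relies on `( )` and `+`, which GNU sed treats as grouping and repetition only with `-E`.

```shell
#!/bin/bash
# Inside $( ... ) the unquoted \1 is ordinary shell quoting: it becomes "1".
seen_by_date=$(printf '%s' --date=\1)
echo "$seen_by_date"

# The complete sed script is built by the shell before sed is invoked,
# so the command substitution contributes one fixed string.
script="s/\"[0-9T:-]*\"/$(echo FIXED)/g"
echo "$script"

# Every match is therefore replaced by the same text.
replaced=$(echo '"2022-09-01T00:00:00" "2022-09-01T00:00:15"' | sed "$script")
echo "$replaced"
```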

Answer 1 (mzmfm0qo)

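I suggest processing the file line by line. Also, keep in mind that parsing or processing JSON in the shell is not recommended. A simple script such as the one below may help: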


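#!/bin/bash

# Read the file "data" line by line, keeping leading whitespace intact.
while IFS= read -r line; do
 if [[ $line == {* ]]; then
  # The third whitespace-separated field is the timestamp; strip its quotes and comma.
  timestamp=$(echo "$line" | awk '/t/ {t=$3; gsub(/[,"]/, "", t); print t }')
  # Convert to seconds since the epoch (GNU date, local time zone).
  epoch=$(date --date="$timestamp" +'%s')
  # Swap the timestamp for the epoch value; the surrounding quotes are kept.
  echo "$line" | sed "s/$timestamp/$epoch/"
 else
  echo "$line"
 fi
done < data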

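This is a basic (and inelegant) solution, but it lets you avoid dealing with regular expressions. Note that in the example above, data is a text file, not an environment variable.
So, for example, if the data file contains: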

[
{ "t": "2022-09-01T00:00:00", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": "2022-09-01T00:00:15", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": "2022-09-01T00:00:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": "2022-09-01T00:02:45", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": "2022-09-01T00:03:00", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": "2022-09-01T00:03:45", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": "2022-09-01T00:04:45", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": "2022-09-01T00:05:00", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": "2022-09-01T00:05:15", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": "2022-09-01T00:05:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

the output when running the script is:

[
{ "t": "1661986800", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  },
{ "t": "1661986815", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  },
{ "t": "1661986845", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  },
{ "t": "1661986965", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  },
{ "t": "1661986980", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  },
{ "t": "1661987025", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  },
{ "t": "1661987085", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  },
{ "t": "1661987100", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  },
{ "t": "1661987115", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 },
{ "t": "1661987145", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  },
...
]

Answer 2 (iezvtpos)

Using GNU sed and GNU date:

$ new_data=$(sed -E 's/"/#/g;s/([^:]*[^#]*#)([^#]*)(.*)/echo "\1$(date -d \2 +'%s')\3"/e;s/#/"/g' <<< "$data")
$ echo "$new_data"

[
{ "t": "1661986800", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  },
{ "t": "1661986815", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  },
{ "t": "1661986845", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  },
{ "t": "1661986965", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  },
{ "t": "1661986980", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  },
{ "t": "1661987025", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  },
{ "t": "1661987085", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  },
{ "t": "1661987100", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  },
{ "t": "1661987115", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 },
{ "t": "1661987145", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  },
...
]
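
The one-liner above depends on the `e` flag of GNU sed's `s` command: when a substitution is made, the resulting pattern space is run as a shell command and replaced by that command's output. A simplified single-line sketch of the same mechanism follows. It is not the answerer's exact command: it wraps the pieces in single quotes instead of swapping `"` for `#`, so it assumes the data contains no single quotes, and it adds `-u` so the result does not depend on the local time zone. The `sed_script`, `line`, and `converted` names are illustrative. Because the data ends up inside a shell command, this approach should only be used on trusted input.

```shell
#!/bin/bash
# Keep the sed script in a quoted here-document so bash leaves it untouched.
IFS= read -r sed_script <<'EOF'
s/^(.*"t": )"([0-9T:-]+)"(.*)$/echo '\1'$(date -u -d '\2' +%s)'\3'/e
EOF

line='{ "t": "2022-09-01T00:00:00", "o": 1.3800 }'

# After the substitution the pattern space holds:
#   echo '{ "t": '$(date -u -d '2022-09-01T00:00:00' +%s)', "o": 1.3800 }'
# The e flag runs that text with the shell and keeps its output.
converted=$(sed -E "$sed_script" <<< "$line")
echo "$converted"
```

Lines without a `"t": "..."` field do not match, so no command is executed for them and they are printed as they are.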
