通过Perl从xls阅读日期到csv

6tr1vspr  于 2023-03-23  发布在  Perl
关注(0)|答案(3)|浏览(115)

我有一批excel文件,里面的行像

1/13/04 21

我试图将它们转换为.csv,但发现该行被转换为

36537,21

事实证明,这是Excel存储规则的副作用。Excel应该将日期存储为自1900年1月1日以来的天数。根据该规则,这是错误的整数,对应于2001年1月12日,而不是2004年1月13日(这是1/13/04表示的日期)。

  • Excel怎么会犯这样的错误?
  • 我怎样才能得到原始的未格式化的值,而避开这里的转换呢?

这是一个代码的草图:

my $xlsparser = Spreadsheet::ParseExcel->new();
my $xlsbook = $xlsparser->Parse('xls_test.xls');
my $xls = $xlsbook->{Worksheet}[0];
my $csv = '';

# then a loop over rows and columns with...
  my $cell = $xls->get_cell( $row, $col );
  $cellcon = $cell->unformatted();
  $csv .= $cellcon;

如果我的解释不够清楚,或者你不能重现这个问题,这里有一个最小的数据集和脚本,可以为我重现它:
https://dl.dropboxusercontent.com/u/58760/softwareGrr/xls_example.plhttps://dl.dropboxusercontent.com/u/58760/softwareGrr/junk.xls

fxnxkyjh

fxnxkyjh1#

如果你想在perl中将Excel日期序列值格式36537,21转换为时间/日期变量,那么你可以使用自己的函数来转换日期。

sub date2excelvalue {
  my($day1, $month, $year, $hour, $min, $sec) = @_;
  my @cumul_d_in_m = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
  my $doy = $cumul_d_in_m[$month - 1] + $day1;

  #
  full years + your day
  for my $y(1900..$year) {
    if ($y == $year) {
      if ($month <= 2) {

        #
        dont add manually extra date
        if inJanuary or February
        last;
      }
      if ((($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900)) {
        $doy++;#
        leap year
      }
    } else {#
      full years
      $doy += 365;
      if ((($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900)) {
        $doy++;#
        leap year
      }

    }
  }#
  end
  for y# calculate second parts as a fraction of 86400 seconds
  my $excel_decimaltimepart = 0;
  my $total_seconds_from_time = ($hour * 60 * 60 + $min * 60 + $sec);
  if ($total_seconds_from_time == 86400) {
    $doy++;#
    just add a day
  } else {#
    add decimal in excel
    $excel_decimaltimepart = $total_seconds_from_time / (86400);
    $excel_decimaltimepart = ~s / 0\. //;
  }
  return "$doy\.$excel_decimaltimepart";

}

sub excelvalue2date {
  my($excelvalueintegerpart, $excelvaluedecimalpart) = @_;
  my @cumul_d_in_m = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
  my @cumul_d_in_m_leap = (0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366);
  my @cumul_d_in_m_selected;
  my($day1, $month, $year, $hour, $min, $sec);
  $day1 = 0;#
  all days all years
  my $days_in_year;
  my $acumdays_per_month;
  my $daysinmonth;
  my $day;

  #
  full years + your day
  for my $y(1900. .3000) {
    my $leap_year = 0;#
    leap year
    my $leap_year_mask = 0;#
    leap year
    if ((($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900)) {
      $leap_year = 1;#
      leap year
      @cumul_d_in_m_selected = @cumul_d_in_m_leap;

    } else {
      $leap_year = 0;#
      leap year
      @cumul_d_in_m_selected = @cumul_d_in_m;
    }

    if (($day1 + (365 + $leap_year)) > $excelvalueintegerpart) {

      #
      found this year $y
      $year = $y;
      print "year $y\n";

      $days_in_year = $excelvalueintegerpart - $day1;
      $acumdays_per_month = 0;
      print "excelvalueintegerpart  $excelvalueintegerpart\n";
      print "day1  $day1\n";
      print "daysinyear $days_in_year\n";
      for my $i(0..$# cumul_d_in_m) {
        if ($i == $# cumul_d_in_m) {
          $month = $i + 1;#
          month 12 December
          $day = $days_in_year - $cumul_d_in_m_selected[$i];
          last;

        } else {

          if (($days_in_year > ($cumul_d_in_m_selected[$i])) && ($days_in_year <= ($cumul_d_in_m_selected[$i + 1]))) {
            $month = $i + 1;
            $day = $days_in_year - $cumul_d_in_m_selected[$i];
            last;
          }

        }

      }#
      end
      for $i months

      # end year
      last;

    } else {#
      full years
      $day1 += (365 + $leap_year);
    }

  }#
  end
  for years interger part comparator

  my $total_seconds_inaday;
  $total_seconds_inaday = "0\.$excelvaluedecimalpart" * 86400;

  $sec = $total_seconds_inaday;
  $hour = int($sec / (60 * 60));
  $sec -= $hour * (60 * 60);
  $min = int($sec / 60);
  $sec -= $min * (60);
  $sec = int($sec);
  return ($day, $month, $year, $hour, $min, $sec);

}
my $excelvariable = date2excelvalue(1, 3, 2018, 14, 14, 30);
print "Excel variable: $excelvariable\n";
my($integerpart, $decimalwithoutzero) = ($1, $2) if ($excelvariable = ~m / (\d + )\.(\d + ) / );
my($day1, $month, $year, $hour, $min, $sec) = excelvalue2date($integerpart, $decimalwithoutzero);
print "Excel Date from value: $day1, $month, $year, $hour, $min, $sec\n";

好好享受吧!

ar7v8xwq

ar7v8xwq2#

有问题的线路是

$cellcon = $cell->unformatted();

除非有人能给出更好的解释,否则我将把这看作是一个bug。

$cellcon = $cell->Value;
nnsrf1az

nnsrf1az3#

use DateTime::Format::Excel;

my $datetime = DateTime::Format::Excel->parse_datetime( $cell->value() );

相关问题