Output file names not being named correctly in Bash

nc1teljy posted on 2023-10-16 in Linux

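The file names have the following format: 4digitnumber_S_R1_001.fastq.gz. To give you an example: 3145_S2_R1_001.fastq.gz. I'm trying to get my output file names to exclude the _R1_001 part, but they keep including the full file name. I don't know why I'm not getting the output file name format I want. Here is my code: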

#!/bin/bash

# Set the input directory containing .fastq.gz files
input_dir="/storage/home/user/work/fastq_data"

# Set the output directory for .sam and summary files
output_dir="/storage/home/user/work/fastq_data/gene_alignment"

# Create the output directory if it doesn't exist
mkdir -p "$output_dir"

# Get the full path of the output directory
output_dir="/storage/home/user/work/fastq_data/gene_alignment"

# Use find to iterate over .fastq.gz files in the input directory
find "${input_dir}" -type f -name "*.fastq.gz" | while read -r input_file; do
  # Remove the directory path and extension to get the base file name
  base_name=$(basename "$input_file" .fastq.gz)

  # Remove "_R_001" from the base_name using awk
  new_name=$(echo "$base_name" | awk -F '_R_001' '{print $1}')

  # Construct the output file names
  output_sam="${output_dir}/${new_name}.sam"
  summary_file="${output_dir}/${new_name}.summary"

  # Run HISAT2
  hisat2 -p 8 -x musculus_index -U "${input_file}" -S "${output_sam}" --dta-cufflinks 2>&1 | tee "${summary_file}"

  echo "Processed ${input_file}"
done

echo "Processing complete."

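I've tried various ways of setting new_name for my output file names, using sed, "${base_name/_R_001/}", and so on. I don't know where I went wrong. I'd appreciate any suggestions. Thanks.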

f8rj6qna1#

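So I got a (correct) warning, because I had provided an answer that got the job done but was somewhat wrong. So I deleted the whole disaster and tried again. I'm sure the senior contributors on this platform can improve on it, but this is my best shot. I added a check that prints a message when a path is incorrect. You can ignore anything that isn't useful to you. Hope it helps!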

#!/bin/bash

# Set the input directory containing .fastq.gz files
input_dir="/storage/home/user/work/fastq_data"

# Set the output directory for .sam and summary files
output_dir="/storage/home/user/work/fastq_data/gene_alignment"

#Ignore this if you want: I always like to add a little message 
#to let me know whether I typed my paths in wrong :)

#Confirm input path exists
if [ -d "$input_dir" ]; then
        echo "$input_dir found. Continuing..."
else 
        echo "$input_dir does not exist. Exiting..."
        exit 1
fi

#Confirm *fastq.gz files are in input_dir
if find "$input_dir" -type f -name "*.fastq.gz" -print -quit | grep -q .; then
    echo "Files with .fastq.gz extension exist in $input_dir"
else
    echo "No files with .fastq.gz extension found in $input_dir"
    exit 1
fi

#Confirm output directory path exists
if [ -d "$output_dir" ]; then
        echo "$output_dir found. Continuing..."
else 
        echo "$output_dir does not exist. Creating new..."
        mkdir -p "$output_dir"
fi

#Again, your choice. I prefer for loops over while loops. Totally your call
#Glob directly instead of parsing ls, so paths with spaces survive
for file in "${input_dir}"/*.fastq.gz; do
  filename=$(basename "$file")
  #The file names contain _R1_001 (not _R_001), so that is what must be removed
  new_name=$(echo "$filename" | sed 's/_R1_001//')
  output_sam="${output_dir}/${new_name/.fastq.gz/.sam}"
  summary_file="${output_dir}/${new_name/.fastq.gz/.summary}"
  #Pass the loop variable: input_file is never set in this script
  hisat2 -p 8 -x musculus_index -U "${file}" -S "${output_sam}" --dta-cufflinks 2>&1 | tee "${summary_file}"
  echo "Processed ${file}"
done

echo "Processing complete."
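
For what it's worth, the renaming step can be tested on its own, without running HISAT2. The example name 3145_S2_R1_001.fastq.gz contains _R1_001, so a pattern of _R_001 never matches and the name passes through unchanged. Below is a minimal sketch using only Bash parameter expansion; the function name sample_name is hypothetical (it is not from the question), and it assumes the suffix is always _R, a single digit, then _001.

```bash
#!/bin/bash

# Hypothetical helper (not from the original post): derive the sample name
# from a FASTQ path by dropping the directory, the .fastq.gz extension,
# and a trailing _R<digit>_001 suffix (assumed naming scheme).
sample_name() {
  local name
  name=$(basename "$1" .fastq.gz)      # e.g. 3145_S2_R1_001
  # ${name%pattern} strips the shortest matching suffix. [0-9] is a glob
  # character class, so both _R1_001 and _R2_001 are removed.
  printf '%s\n' "${name%_R[0-9]_001}"
}

# Why the question's attempt changed nothing: _R_001 does not occur in the name.
base_name="3145_S2_R1_001"
echo "${base_name/_R_001/}"    # prints 3145_S2_R1_001 (no match, unchanged)
echo "${base_name/_R1_001/}"   # prints 3145_S2

sample_name "/storage/home/user/work/fastq_data/3145_S2_R1_001.fastq.gz"   # prints 3145_S2
```

Inside the loop this would be used as new_name=$(sample_name "$file"), after which the .sam and .summary extensions can be appended directly. Names without the suffix come back with only the extension removed.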
