[BUG?] Unable to recognize fq.gz file as a gzip file #8

@hellopeccat

Hi there,

I confirmed the completeness and format of my input files:

zuoxiaotian@a203fast-PowerEdge-T640:~$ file /mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R2.fq.gz 
/mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R2.fq.gz: gzip compressed data, original size modulo 2^32 627952
zuoxiaotian@a203fast-PowerEdge-T640:~$ head -c 10 /mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R1.fq.gz | xxd
00000000: 1f8b 0800 0000 0000 00ff 
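
Note that `file` and the first ten bytes only confirm the gzip header, not the rest of the stream. A minimal sketch of a fuller check, assuming `gzip` is on the PATH (the `check_gz` helper name is made up for illustration). It only tells whether `gzip` itself can decode each file end to end, so a pass here does not rule out whatever the multithreaded reader in the stack trace trips on:

```shell
#!/usr/bin/env bash
# check_gz FILE...  (hypothetical helper, not part of BBMap)
# Runs `gzip -t` on every file so the whole compressed stream is decoded,
# rather than inspecting only the first few header bytes.
check_gz() {
  local f status=0
  for f in "$@"; do
    if gzip -t -- "$f" 2>/dev/null; then
      echo "OK   $f"
    else
      echo "FAIL $f"
      status=1
    fi
  done
  # Non-zero exit status if any file failed the integrity test.
  return "$status"
}

# Example call (paths taken from this report):
#   check_gz /mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R1.fq.gz \
#            /mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R2.fq.gz
```

Checking both R1 and R2 this way also covers the gap in the output above, where `file` was run on R2 and `xxd` on R1.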

When I run the following command:

bbmap.sh \
  ref=/mnt/16T_1/zuo/bbmap_mapping/plasmid/plasmid_representatives.fna \
  in=/mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R1.fq.gz \
  in2=/mnt/16T_2/metatranscriptome/MbPL202401629/2_cleandata/L2_se_clean_R2.fq.gz \
  out=/mnt/16T_1/zuo/bbmap_mapping/metaG/L2_se_filtered.sam \
  pairedonly=t idfilter=1 threads=40 -Xmx40g

I get the following error:

Exception in thread "BGZF-InputProducer" java.lang.AssertionError: Not a gzip file: 0, 0
	at stream.bam.BgzfInputStreamMT2.readNextBlock(BgzfInputStreamMT2.java:215)
	at stream.bam.BgzfInputStreamMT2.producerLoop(BgzfInputStreamMT2.java:135)
	at stream.bam.BgzfInputStreamMT2.access$0(BgzfInputStreamMT2.java:127)
	at stream.bam.BgzfInputStreamMT2$1.run(BgzfInputStreamMT2.java:109)
	at java.base/java.lang.Thread.run(Thread.java:840)

I tried setting unpigz=t, but that did not resolve the error. Any help would be appreciated. Thanks in advance.

-------↓ The latest update ↓-------
This is strange: when I run the same command with threads=1, it seems to work. The problem may be specific to using multiple threads when building the SAM file.
