
    Human Whole Genome Sequencing

    Overview

    Human whole genome sequencing (hWGS) enables researchers to catalog the genetic constitution of an individual and to capture all variant types in a single assay: single-nucleotide variations (SNVs), insertions and deletions (InDels), copy number variations (CNVs), and large structural variants (SVs). Equipped with the Illumina NovaSeq 6000 system, Novogene is capable of sequencing up to 280,000 human genomes per year at the lowest cost per genome. With the addition of Oxford Nanopore PromethION and PacBio Sequel systems, Novogene also provides long-read hWGS services that characterize the human genome more completely and accurately and recover regions that short-read sequencing misses, especially highly polymorphic and highly repetitive regions. With extensive experience in whole genome sequencing and advanced bioinformatics capabilities, Novogene delivers large projects with quick turnaround times and high-quality results.

    Service Specifications

    Applications

  • Genetic disease study
  • Cancer research
  • Human population evolution
  • DNA biomarkers
  • Pharmacogenomics

    Advantages

  • State-of-the-art NGS technologies: Novogene is a world leader in sequencing capacity, running Illumina HiSeq and NovaSeq 6000 Systems.
  • Highest data quality: We guarantee a Q30 score ≥ 80%, exceeding Illumina’s official guarantee of ≥ 75%. See our data example.
  • Extraordinary informatics expertise: Novogene uses its cutting-edge bioinformatics pipeline and internationally recognized, best-in-class software to provide customers with highly reliable, publication-ready data.
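
    The Q30 figure quoted above is the share of bases whose Phred quality score is at least 30. As a minimal sketch of how such a percentage could be computed from FASTQ quality strings, assuming Phred+33 encoding (the function name `q30_fraction` is hypothetical and is not part of any Novogene pipeline):

```python
def q30_fraction(quality_strings, offset=33, threshold=30):
    """Return the fraction of bases with Phred quality >= threshold.

    quality_strings: iterable of FASTQ quality lines (assumed Phred+33).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Phred score is the ASCII code minus the encoding offset.
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


# 'I' is Q40, '?' is Q30, '5' is Q20, '#' is Q2 under Phred+33.
example = ["IIII#", "??II5"]
print(f"Q30 = {q30_fraction(example):.0%}")
```

    A run would meet the stated guarantee if this value is at least 0.80 across all reads.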
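
    Sample Requirements

    | Platform Type         | Sample Type            | Amount (Qubit®)                             | Purity |
    |-----------------------|------------------------|---------------------------------------------|--------|
    | Illumina NovaSeq 6000 | Genomic DNA            | ≥ 200 ng                                    | OD260/280 = 1.8-2.0 |
    | Illumina NovaSeq 6000 | Genomic DNA (PCR free) | ≥ 1.5 μg                                    | OD260/280 = 1.8-2.0 |
    | Illumina NovaSeq 6000 | Genomic DNA from FFPE  | ≥ 0.8 μg                                    | OD260/280 = 1.8-2.0 |
    | PacBio Sequel I/II    | HMW Genomic DNA        | ≥ 10 μg (for Sequel I); ≥ 30 μg (for Sequel II) | OD260/280 = 1.8-2.0; OD260/230 = 2.0-2.2; fragments should be ≥ 30 Kb for Sequel I, ≥ 60 Kb for Sequel II |
    | Nanopore PromethION   | HMW Genomic DNA        | ≥ 10 μg                                     | OD260/280 = 1.8-2.0; OD260/230 = 2.0-2.2; fragments should be ≥ 30 Kb |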

    Sequencing Parameters and Analysis Contents

    | Platform Type                | Illumina NovaSeq 6000 | PacBio Sequel I/II | Nanopore PromethION |
    |------------------------------|-----------------------|--------------------|---------------------|
    | Read Length                  | Paired-end 150 bp | average > 10 Kb for Sequel I; average > 15 Kb for Sequel II | average > 17 Kb |
    | Recommended Sequencing Depth | For rare diseases: 30-50× | For genetic diseases: 10-20× | For genetic diseases: 10-20× |
    |                              | For tumor tissues: 50×, adjacent normal tissues and blood 30× | For tumor tissues: ≥20× | For tumor tissues: ≥20× |
    | Standard Data Analysis       | Data quality control | Data quality control | Data quality control |
    |                              | Alignment with reference genome | Sequence alignment | Sequence alignment |
    |                              | SNP/InDel/SV/CNV detection | Structural variant (SV) detection | Structural variant (SV) detection |
    |                              | Somatic SNP/InDel/SV/CNV detection (tumor-normal paired samples) | | |
    |                              | Variation annotation | | |

    Note: For detailed information, please refer to the Service Specifications and contact us for customized requests.
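
    The recommended depths above translate into a data volume per sample. As a rough sketch, average depth equals total sequenced bases divided by genome size, so the number of paired-end reads needed can be estimated as follows. The genome size of 3.1 Gb is an assumption for illustration, the function name is hypothetical, and the estimate ignores duplicates, unmapped reads, and uneven coverage:

```python
import math


def read_pairs_for_depth(depth, genome_size_bp=3_100_000_000, read_length_bp=150):
    """Estimate read pairs needed for a target average depth.

    depth * genome_size gives the total bases required; each pair
    contributes 2 * read_length bases (e.g. 2 x 150 bp for PE150).
    """
    bases_needed = depth * genome_size_bp
    return math.ceil(bases_needed / (2 * read_length_bp))


for target in (30, 50):
    pairs = read_pairs_for_depth(target)
    gigabases = target * 3_100_000_000 / 1e9
    print(f"{target}x: about {pairs:,} read pairs ({gigabases:.0f} Gb raw data)")
```

    Under these assumptions, 30× coverage corresponds to roughly 93 Gb of raw data per sample.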

    Project Workflow

    Genomic sequencing identifies WNK2 as a driver in hepatocellular carcinoma and a risk factor for early recurrence (Zhou et al., 2019)

    Background:

    Hepatocellular carcinoma (HCC) is a relatively common type of cancer with rising incidence and mortality rates. Although advances in the treatment and management of patients with HCC have improved survival rates, HCC still has a high rate of early recurrence. This study aimed to systematically define genomic alterations in Chinese patients with HCC and to identify mutations associated with early tumor recurrence in those patients.

    Sampling & Sequencing Strategy:

    Sampling:
    • 182 Chinese primary HCC samples

    Sequencing Strategy:
    • Human whole genome sequencing (49 cases), whole exome sequencing (18 cases), and targeted region sequencing (115 cases) on Illumina platforms (PE150)

    Results & Conclusion

    Using WGS, this study described the genomic landscape, including somatic SNVs/InDels, CNVs, and SVs, and identified five prominent mutational signatures in 49 Chinese patients with HCC (Figure 1). Through WGS, WES, and targeted sequencing of 182 primary HCC samples, the results suggest that WNK2, RUNX1T1, CTNNB1, TSC1, and TP53 may play roles in HCC invasion and metastasis, and that WNK2 had the most significant difference in mutation frequency (Figure 2). Biofunctional investigations revealed a tumor-suppressor role for WNK2; its inactivation led to ERK1/2 signaling activation in HCC cells, tumor-associated macrophage infiltration, and tumor growth and metastasis. This study describes the genomic events that characterize Chinese HCCs and identifies WNK2 as a driver of HCC that is associated with early tumor recurrence after curative resection.

    Figure 1. Genomic alterations and mutational signatures in 49 Chinese primary HCCs.

    Figure 2. The mutational spectrum in HCCs with or without early recurrence.

    Reference: Zhou SL, Zhou ZJ, Hu ZQ, et al. Genomic Sequencing Identifies WNK2 as a Driver in Hepatocellular Carcinoma and a Risk Factor for Early Recurrence[J]. Journal of Hepatology, 2019, doi: 10.1016/j.jhep.2019.07.014.

    Characteristics of genomic alterations of lung adenocarcinoma in young never-smokers (Luo et al., 2018)

    Background:

    Non-small-cell lung cancer (NSCLC) has been recognized as a highly heterogeneous disease with phenotypic and genotypic diversity in each subgroup. While never-smoker patients with NSCLC have been well studied through next-generation sequencing, the potentially unique molecular features of young never-smoker patients with NSCLC remain largely unknown.

    Sampling & Sequencing Strategy:

    Sampling:
    • 36 never-smoker patients with lung adenocarcinoma (LUAD)

    Sequencing Strategy:
    • Human whole genome sequencing on Illumina platform (PE150)

    Results & Conclusion

    Besides the well-known gene mutations, the study identified several potential lung cancer-associated gene mutations that had rarely been reported (e.g., HOXA4 and MST1). Lung cancer-related copy number variations (e.g., EGFR and CDKN2A) were enriched, and lung cancer-related structural variations (e.g., EML4-ALK and KIF5B-RET) were commonly observed. Notably, new fusion partners of ALK (SMG6-ALK) and RET (JMJD1C-RET) were found. Furthermore, a high prevalence of potentially targetable genomic alterations was observed in the cohort. Finally, germline mutations in BPIFB1, CHD4, PARP1, NUDT1, RAD52, and MFI2 were significantly enriched in the young never-smoker patients with LUAD compared with the in-house noncancer database (p<0.05). This study provides a detailed mutational portrait of LUAD occurring in young never-smokers and gives insights into the molecular pathogenesis of this distinct subgroup of NSCLC.

    Figure 3. Mutation landscape of lung adenocarcinoma in young never-smoker patients.

    Reference: Luo WX, Tian PW, Wang Y, et al. Characteristics of genomic alterations of lung adenocarcinoma in young never-smokers[J]. International Journal of Cancer, 2018, 143, 1696‒1705.

    Genetic alterations in esophageal tissues from squamous dysplasia to carcinoma (Liu et al., 2017)

    Background:

    Esophageal squamous cell carcinoma (ESCC) is the most common subtype of esophageal cancer. Little is known about the genetic changes that occur in esophageal cells during the development of ESCC. This study performed next-generation sequence analyses of esophageal nontumor, intraepithelial neoplasia (IEN), and ESCC tissues from the same patients to track genetic changes during tumor development.

    Sampling & Sequencing Strategy:

    Sampling:
    • 227 esophageal tissue samples from 70 patients with ESCC undergoing resection

    Sequencing Strategy:
    • Human whole genome sequencing (7 cases), whole exome sequencing (18 cases), and targeted region sequencing (45 cases) on Illumina platforms (PE150)

    Results & Conclusion

    The study revealed significant similarities in the types and frequency of mutations between IEN and ESCC (Figure 4), including similarity in the DNA damage mutation signature. Phylogenetic and clonal analysis also identified mutations in the CCND1, CDKN2A, and FGFR1 genes as early driver events. However, the number of non-overlapping SNVs in tissues taken from the same individuals indicated that the various lesions formed independently and that clonal expansion of mutations was independent in each. As this study shows, combining multiple NGS applications provides novel approaches for exploring early diagnostics and treatments for cancer.

    Figure 4. The mutation variation landscape of ESCC, IEN, and simple hyperplasia (ESSH) from whole genome sequencing and whole exome sequencing.

    Reference: Liu X, Zhang M, Ying SM, et al. Genetic alterations in esophageal tissues from squamous dysplasia to carcinoma[J]. Gastroenterology, 2017, 153: 166‒177.


    Sequencing error rate distribution

    Note: The x-axis represents position in reads, and the y-axis represents the average error rate of bases of all reads at a position.
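
    A plot like this can be derived from base quality scores, since a Phred score Q corresponds to an error probability of 10^(-Q/10). The following is a minimal sketch, assuming Phred+33 quality strings; the function name is hypothetical and this is not necessarily how the figure on this page was produced:

```python
def mean_error_by_position(quality_strings, offset=33):
    """Average per-base error probability at each read position.

    Reads may differ in length; each position is averaged over the
    reads that are long enough to cover it.
    """
    sums = []
    counts = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(sums):
                sums.append(0.0)
                counts.append(0)
            q = ord(ch) - offset
            sums[i] += 10 ** (-q / 10)  # Phred score to error probability
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


# Position 0 is Q40 in both reads; position 1 is Q20 in the first read only.
rates = mean_error_by_position(["I5", "I"])
print([f"{r:.4%}" for r in rates])
```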

    GC content distribution

    Note: The x-axis is position in reads, and the y-axis is percentage of each type of bases (A, T, G, C); different bases are distinguishable by different colors.
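
    The per-position base composition described in this note can be tallied directly from read sequences. A minimal sketch follows; the function name is hypothetical, and ambiguous bases such as N are counted in the denominator but not reported:

```python
from collections import Counter


def base_percent_by_position(reads):
    """Percentage of A, T, G and C at each read position."""
    counters = []
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counters):
                counters.append(Counter())
            counters[i][base] += 1
    result = []
    for counter in counters:
        total = sum(counter.values())
        result.append({b: 100.0 * counter[b] / total for b in "ATGC"})
    return result


for pos, pct in enumerate(base_percent_by_position(["ACGT", "AAGT"])):
    print(pos, pct)
```

    In a well-balanced library the A/T and G/C curves are expected to track each other closely along the read.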

    Sequencing depth & coverage distribution

    Note: Average sequencing depth (bar plot) and coverage (dot-line plot) in each chromosome. The x-axis represents chromosome; the left y-axis is the average depth; the right y-axis is the coverage (proportion of covered bases).
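
    The two quantities in this plot can be computed from per-base depth values for each chromosome. A minimal sketch, assuming the depths are already available as a list of integers per chromosome (for example, parsed from a per-base depth report); the function name and the toy data are illustrative only:

```python
def depth_and_coverage(per_base_depth):
    """Return (average depth, fraction of positions with depth > 0)."""
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("no positions supplied")
    average_depth = sum(per_base_depth) / n
    coverage = sum(1 for d in per_base_depth if d > 0) / n
    return average_depth, coverage


# Toy example: four positions on each of two hypothetical chromosomes.
depths = {"chr1": [0, 10, 20, 30], "chr2": [30, 30, 30, 30]}
for chrom, values in depths.items():
    avg, cov = depth_and_coverage(values)
    print(f"{chrom}: average depth {avg:.1f}x, coverage {cov:.0%}")
```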

    SNP detection

    | Sample         | Sample_1 | Sample_2 | Sample_3 | Sample_4 | Sample_5 | Sample_6 |
    |----------------|----------|----------|----------|----------|----------|----------|
    | CDS            | 22318    | 22343    | 22271    | 22702    | 22654    | 22418    |
    | Synonymous SNP | 11342    | 11375    | 11329    | 11439    | 11387    | 11376    |
    | missense SNP   | 10335    | 10340    | 10334    | 10643    | 10649    | 10400    |
    | stopgain       | 77       | 81       | 72       | 87       | 87       | 8.30     |
    | stoploss       | 14       | 13       | 11       | 12       | 12       | 10       |
    | unknown        | 558      | 541      | 536      | 531      | 528      | 501      |
    | intronic       | 1263778  | 1261992  | 1262435  | 1259099  | 1262095  | 1271575  |
    | UTR3           | 25167    | 25134    | 25496    | 25396    | 25462    | 25510    |
    | UTR5           | 5568     | 5562     | 5644     | 5767     | 5829     | 5702     |
    | splicing       | 84       | 85       | 84       | 86       | 90       | 96       |
    | ncRNA exonic   | 11867    | 11818    | 11734    | 11628    | 11697    | 11760    |
    | ncRNA intronic | 205360   | 205028   | 200363   | 199813   | 200397   | 205018   |
    | ncRNA splicing | 66       | 66       | 58       | 61       | 64       | 60       |
    | upstream       | 22383    | 22339    | 22230    | 22648    | 22744    | 22708    |
    | downstream     | 23565    | 23544    | 23515    | 23221    | 23235    | 23557    |
    | intergenic     | 2119447  | 2115048  | 2110391  | 2091107  | 2098406  | 2138433  |
    | Total          | 3700477  | 3693838  | 3685038  | 3662384  | 3673519  | 3727684  |

    Circos

    Note:
    Novogene provides the Circos plot only when CNV analysis has been carried out. The figure consists of seven rings, from outer to inner.
    (1) The outer circle (the first ring) shows chromosome information.
    (2) The second ring represents the read coverage in histogram style. Each bar is the average coverage of a 0.5 Mbp region.
    (3) The third ring represents InDel density in scatter style. Each black dot is the number of InDels in a 1 Mbp window.
    (4) The fourth ring represents SNP density in scatter style. Each green dot is the number of SNPs in a 1 Mbp window.
    (5) The fifth ring represents the proportion of homozygous SNPs (orange) and heterozygous SNPs (grey) in histogram style. Each bar is calculated from a 1 Mbp region.
    (6) The sixth ring represents the CNV inference. Red means gain, and green means loss.
    (7) The innermost ring represents the SV inference in exonic and splicing regions: TRA (orange), INS (green), DEL (grey), DUP (pink), and INV (blue).
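
    The density rings described above rely on counting variants in fixed-size windows along each chromosome. A minimal sketch of that binning step, assuming 0-based positions on a single chromosome (the function name is hypothetical, and plotting is left to a tool such as Circos):

```python
from collections import Counter


def window_counts(positions, window=1_000_000):
    """Count variants per fixed-size window.

    Returns a dict mapping window index to variant count; window i
    spans positions [i * window, (i + 1) * window).
    """
    return dict(Counter(pos // window for pos in positions))


snp_positions = [10, 999_999, 1_000_000, 2_500_000]
for index, count in sorted(window_counts(snp_positions).items()):
    print(f"{index} to {index + 1} Mbp: {count} SNPs")
```

    The same function applies to the InDel ring, and a 500,000 bp window would match the coverage ring's 0.5 Mbp resolution.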

    Heatmap of significantly mutated genes


    Linkage analysis

    Note: The upper x-axis is the chromosome number; the lower x-axis is the genetic position in centimorgans (cM). The y-axis is the LOD score.
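
    For readers unfamiliar with the y-axis, a LOD score is the base-10 logarithm of the likelihood ratio between linkage at a given recombination fraction θ and no linkage (θ = 0.5). The sketch below covers only the simplest textbook case, two-point analysis with phase-known meioses, and is an illustration rather than the multipoint method used to generate the plot; the function name is hypothetical:

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where k is the number of recombinants among n informative meioses.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = recombinants, meioses
    log_linked = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


print(f"LOD = {lod_score(1, 10, 0.1):.2f}")
```

    By convention, a LOD score of 3 or more is usually taken as evidence for linkage.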