High throughput and accurate genotyping for large-scale screening of multiple targets

You no longer need to rely only on the known markers in an array to make agricultural decisions. With Khufu, you can identify both known and de novo markers at a fraction of the cost.

The key to unlocking the whole genome for genetic improvement

Khufu Analysis Services

Khufu is a data analysis platform that delivers genomics solutions at a fraction of the standard cost using low-coverage whole genome sequencing. From the identification of functional variation linked to traits of interest to producing high-quality genome-wide genotyping and even simply calling and imputing custom SNP sets, Khufu unlocks the potential of your breeding or research program. 

With all Khufu Analysis services, we offer High Throughput DNA Extraction, 96-well plate Library Prep, and Sequencing.*

There are 3 levels of Khufu services that are customizable to suit your breeding or research needs.

Khufu Core Linear

Layers low-pass short-read sequencing data with a reference genome for genotyping to produce 100K+ de novo SNP calls. Imputation is provided using a panel-free method that compares haplotypes within the population. The genotyping approach may be customized in one of the following ways:
    • Calling an array with a known set of markers
    • Parent-guided genotyping: filtering for homozygous polymorphic SNP calls in the parents
    • Calling only de novo SNPs across the genome
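To illustrate the parent-guided option above, here is a minimal sketch of keeping only SNPs where both parents are homozygous and carry different alleles. It is not the Khufu implementation. It assumes single-letter calls in which A, C, G, or T means homozygous and any other code (for example R or N) means heterozygous or missing; the record layout, sample names, and function names are hypothetical.

```python
from typing import Dict, List

# Assumption: a single nucleotide letter denotes a homozygous call.
HOMOZYGOUS = {"A", "C", "G", "T"}


def is_parent_informative(call_p1: str, call_p2: str) -> bool:
    """True when both parents are homozygous and carry different alleles."""
    return (
        call_p1 in HOMOZYGOUS
        and call_p2 in HOMOZYGOUS
        and call_p1 != call_p2
    )


def parent_guided_filter(snps: List[Dict], parent1: str, parent2: str) -> List[Dict]:
    """Keep SNP records whose parental calls are homozygous and polymorphic."""
    return [
        snp for snp in snps
        if is_parent_informative(snp["calls"][parent1], snp["calls"][parent2])
    ]


# Hypothetical example records: one dict per SNP, calls keyed by sample name.
snps = [
    {"id": "snp1", "calls": {"P1": "A", "P2": "G", "F2_001": "R"}},  # kept
    {"id": "snp2", "calls": {"P1": "A", "P2": "A", "F2_001": "A"}},  # parents identical
    {"id": "snp3", "calls": {"P1": "R", "P2": "G", "F2_001": "G"}},  # P1 heterozygous
    {"id": "snp4", "calls": {"P1": "N", "P2": "T", "F2_001": "T"}},  # P1 missing
]

kept = parent_guided_filter(snps, "P1", "P2")
print([snp["id"] for snp in kept])  # ['snp1']
```

Filtering on the parents first means every retained marker can be traced to one parent or the other in the progeny.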
Output:
* a hapmap file with SNP calls (default is ACGT format)
* an imputed hapmap file (.Ihapmap file extension)

Optional Customization: Talk to us if you want us to produce the marker calls in the same format as an array that you already receive, such as A/B format.
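As a rough illustration of what converting nucleotide calls to an array-style A/B format can involve, the sketch below recodes a genotype using a per-marker allele designation. It assumes two-letter diploid calls such as "AG" and assumes that the A and B alleles for each marker come from your own array documentation; the function name and the "--" placeholder for missing calls are hypothetical, and the format Khufu actually delivers may differ.

```python
def to_ab(call: str, allele_a: str, allele_b: str) -> str:
    """Recode a two-letter nucleotide genotype (e.g. "AG") as AA, AB, or BB.

    allele_a and allele_b are the nucleotides designated A and B for this
    marker (assumed to be supplied from the array's own documentation).
    """
    mapping = {allele_a: "A", allele_b: "B"}
    if len(call) != 2 or any(base not in mapping for base in call):
        return "--"  # hypothetical placeholder for missing or unexpected calls
    # Sort so that heterozygotes are always written "AB", never "BA".
    return "".join(sorted(mapping[base] for base in call))


# Example marker where nucleotide A is allele A and nucleotide G is allele B.
print(to_ab("AA", "A", "G"))  # AA
print(to_ab("GA", "A", "G"))  # AB
print(to_ab("GG", "A", "G"))  # BB
print(to_ab("NN", "A", "G"))  # --
```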

Khufu Trait Discovery

After running Khufu Core for SNPs, we layer in your phenotypes to identify traits of interest, such as disease, drought or pest resistance, size, and coloring. Each trait in each population is a separate add-on. For each trait, you may choose one of two different analysis methods to be run:

QTLvar: recommended for breeding populations to identify markers for rapid introduction of beneficial traits into a plant or animal population. Details for this method are available in Korani et al. 2021 (https://www.mdpi.com/2073-4395/11/11/2201).

Output:

* a text file with coordinates and statistical information about each SNP
* an RDS file that can be uploaded to khufu_var, an interactive online tool for manual filtering and visualization: https://w-korani.shinyapps.io/khufu_var3/
* a PDF file with graphs plotting the difference in SNP allele frequency, created in khufu_var
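For readers unfamiliar with the "difference in SNP allele frequency" plotted in the PDF, the sketch below shows the general idea for a single SNP: compute the alternate-allele frequency in two phenotype groups and subtract. This is a generic illustration, not the QTLvar method itself (see Korani et al. 2021 for that). The 0/1/2 genotype coding, the group names, and the function names are assumptions made for the example.

```python
from typing import List, Optional


def alt_allele_frequency(genotypes: List[Optional[int]]) -> Optional[float]:
    """Alternate-allele frequency from diploid genotypes coded as 0, 1, or 2
    copies of the alternate allele; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(called) / (2 * len(called))


def frequency_difference(
    group_a: List[Optional[int]], group_b: List[Optional[int]]
) -> Optional[float]:
    """Allele-frequency difference (group_a minus group_b) at one SNP."""
    freq_a = alt_allele_frequency(group_a)
    freq_b = alt_allele_frequency(group_b)
    if freq_a is None or freq_b is None:
        return None
    return freq_a - freq_b


# Hypothetical genotypes at one SNP for two phenotype groups.
resistant = [2, 2, 1, 2, None]   # 7 alt alleles out of 8 called alleles
susceptible = [0, 0, 1, 0, 0]    # 1 alt allele out of 10 called alleles

delta = frequency_difference(resistant, susceptible)
print(f"{delta:.3f}")  # 0.775
```

A SNP with a difference near zero behaves the same in both groups, while a large absolute difference marks a region worth inspecting further.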

 

QTLvar Example Reports

Hawkhap II: recommended for breeding or diversity populations to identify haplotype regions linked to the trait of interest.

Output:
* a text file with coordinates and statistical information about each SNP and haplotype region.
* a PDF file with graphs plotting the haplotype-phenotype association for each region.

 

Hawkhap II Example

KhufuPAN – Founder Pangenome Graph

Maximizes breeding in a population by generating HiFi data from your population’s Founders to construct a custom KhufuPAN Graph. The Founders’ variants are assigned to a single linear reference to maintain reproducibility and reduce the complexity of structural variants in genotyping calls. This KhufuPAN Graph is layered into subsequent Khufu Core services to generate 2 to 3 times as many callable markers as the standard Khufu Core, as well as structural variants. See the example below.

Output:
* a text file with coordinates and statistical information about each variant (SNPs & structural variants)
* an RDS file that can be uploaded to khufu_var, an interactive online tool for manual filtering and visualization: https://w-korani.shinyapps.io/khufu_var3/
* a PDF file with graphs plotting the difference in SNP allele frequency, created in khufu_var

Any additional analysis beyond the standard Khufu services listed above will be charged as an add-on project. This includes items such as running the data with a new or second reference genome, rerunning reports with new criteria, and producing an additional report beyond our standard outputs.

Frequently Asked Questions

The Frequently Asked Questions (FAQ) section provides quick answers to common inquiries, helping visitors find essential information about our services, policies, and procedures without needing to contact support.

Khufu uses low-pass whole genome sequencing to call SNPs and map traits. The average sequencing coverage is <3x per sample. At this coverage, Khufu can call not only pre-identified markers of interest but also additional SNPs across the genome. An array limits you to a previously developed panel, which may or may not have been customized to your specific population, and it will only ever call its predetermined 3,000-50,000 markers. With Khufu, you can expect from 100,000 to more than 1,000,000 markers.
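As a back-of-the-envelope check on what "<3x" coverage means, mean sequencing depth can be estimated as number of reads × read length ÷ genome size. The sketch below applies that standard formula. The 150 bp read length and 1 Gb genome size are illustrative assumptions, not Khufu specifications, and the function names are hypothetical.

```python
import math


def mean_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Illustrative assumptions only.
GENOME_SIZE_BP = 1_000_000_000  # a 1 Gb genome
READ_LENGTH_BP = 150            # a common short-read length

print(mean_coverage(10_000_000, READ_LENGTH_BP, GENOME_SIZE_BP))  # 1.5
print(reads_needed(2.0, READ_LENGTH_BP, GENOME_SIZE_BP))          # 13333334
```

For paired-end data, count each mate as one read when using these functions.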

Khufu Comparison Chart

Khufu maps short DNA reads to a reference genome and identifies genetic factors, specifically single-nucleotide polymorphisms (SNPs), that can be used to select for beneficial traits and introduce them into existing populations.

For most projects we deliver our report and finalized SNP calls within six to ten weeks of receiving your samples. For re-analysis projects, we deliver our report within two weeks of receiving your raw sequence data.

Marker counts will vary based on population size and genome size. Below are some marker count examples from actual projects with various goals:

| Species | Genome Size | SNPs |
| --- | --- | --- |
| Peanut | 2.54 Gb | 31,134 – 112,531 |
| Reptile | 1.59 Gb | 1,527,667 |
| Plum | 1.43 Gb | 170,978 |
| Common Bean | 587 Mb | 891,465 |
| Chicken | 1.05 Gb | 297,712 |
| Cacao | 337 Mb | 2,131,768 |
| Pistachio | 499 Mb | 1,615,336 |
| Coffee | 990 Mb | 843,969 |
| Pear | 446 Mb | 577,482 |
| Cannabis | 854 Mb | 1,553,866 |

Yes. If you provide us with the markers of interest from your array, then we can call those with Khufu. We can also customize your reports so they match the format you are used to getting. Please contact us and provide us with an example of what you need, and we can match it.

We recommend a minimum of 96 samples, and there is no maximum. The right number depends on the population you are working with and your goal for the results. For genomic selection, we can accept samples from the entire population. If you are doing a genomic verification of a crop, a population, or an ingredient, then a random selection of a smaller portion of your group will work.

The Khufu Trait Mapping services can be used on any species that has a reference genome. If your species does not have a reference genome assembled yet, or you would like one specific to your population, then we can work with you to assemble one.

The Khufu Co-developer Team

Josh Clevenger

Josh Clevenger, PhD, is a Faculty Investigator at the HudsonAlpha Institute for Biotechnology. This role, coupled with his experience as a research scientist on the nut science team at Mars Wrigley, positions him perfectly to genetically improve crops by leveraging genomics technologies and computational tools.

Walid Korani

Walid Korani, PhD, is a computational biologist at the HudsonAlpha Institute for Biotechnology. His post-doctoral research in peanut breeding and his work as a bioinformatician for a bovine breeding company gave him expertise in both plant and animal genomics and in bioinformatics analysis.

REQUEST A QUOTE

Please fill out the form below to provide the necessary details for your quote. This will help us understand your needs and provide an accurate estimate.

Get More Information

Fill out the form below or email our team at khufudatainfo@hudsonalpha.org to get more information or schedule time to meet with a specialist.

*We can only guarantee sequencing and analysis results on samples we extract, prep and sequence.