Q: Does the AFC dataset contain healthy (normal) samples only?
A: No. It contains individuals affected by disease phenotypes as well as individuals who are free of known genetic disease. Where family structures were provided, only the parental/founder genomes were used, not the children, to reduce family bias.

Q: What are we filtering against? I.e., what specifications must a sample meet to be entered into the AFC?
A: Allele Frequency Community currently excludes panels (i.e. only exomes & whole genomes are used). In addition, certain exomes & genomes are excluded from the Allele Frequency Community resource: it keeps only high-quality datasets, keeps only parents/founders in family units, excludes tumor genomes, and excludes duplicates.
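
The exclusion rules listed above can be pictured as a simple eligibility check. The following is only an illustrative sketch, not the actual AFC pipeline: the `Sample` record, its field names, and the `fingerprint` key used to spot duplicates are all hypothetical, and the sketch assumes that individuals submitted without a family structure are flagged as founders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # All fields are hypothetical; the real submission metadata is not described here.
    sample_id: str
    assay: str          # "panel", "exome", or "genome"
    high_quality: bool  # outcome of some upstream quality assessment
    is_founder: bool    # parent/founder, or an individual without family structure
    is_tumor: bool
    fingerprint: str    # assumed key that is identical for duplicate samples


def eligible_samples(samples):
    """Return the samples that pass the exclusion rules, in input order."""
    seen_fingerprints = set()
    kept = []
    for s in samples:
        if s.assay not in {"exome", "genome"}:
            continue  # panels are excluded
        if not s.high_quality or not s.is_founder or s.is_tumor:
            continue  # low quality, children in family units, tumor genomes
        if s.fingerprint in seen_fingerprints:
            continue  # duplicate of a sample that was already kept
        seen_fingerprints.add(s.fingerprint)
        kept.append(s)
    return kept
```

In this sketch the first occurrence of a duplicated sample is kept and later copies are dropped; how duplicates are actually detected and resolved is not stated in this FAQ.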

Q: Regarding ethnicity: how do we capture this? Will we re-annotate the donated samples for ethnicity in future releases?
A: Self-identified ethnic background was provided by a subset of the founders. This information is not required for submission, but has been extremely valuable and appreciated for development of a classifier that can impute ethnic subpopulations for other individuals represented in the database. This, in turn, allows Allele Frequency Community to provide community members with ethnic-group-specific allele frequency information. To protect patient privacy, this information is provided only for subgroups that are represented by >=100 individuals.

Q: How does the dataset compare to ExAC?
A: ExAC is a spectacular resource, with extremely high value to the community. High-quality public resources like ExAC, CG Diversity and Exome Variant Server are incorporated into Allele Frequency Community database builds. The key unique benefits of the Allele Frequency Community are:

  1. it already has over 13,000 (ca. May 2016) opted-in whole genomes, while whole genomes are not well-represented in other public allele frequency resources at this point (the “E” in ExAC and EVS stands for “exome”),
  2. Allele Frequency Community grows over time as it is used,
  3. Allele Frequency Community is extremely diverse, representing over 100 countries of origin at launch and growing over time.

ExAC has just over 60,000 exomes, while Allele Frequency Community already has (ca. May 2016) over 130,000 exomes/genomes, of which >13,000 are whole genomes.

Q: Once we opt in, any exomes or genomes that we upload will be put through the AFC quality filters, and if they pass, the variants will be added to the AFC dataset. Is that correct?
A: Yes

Q: Am I right in thinking that only allele frequencies are available publicly via AFC?
A: Yes

Q: Are variants stored in the AFC database, even if not publicly accessible?
A: The VCF-level data are stored in your private, HIPAA & Safe Harbor certified QIAGEN account in a secure, private, hosted IT infrastructure. By opting in for AFC, you permit anonymous computation across these VCFs (in addition to the other 80,000+ exomes & whole genomes opted-in by others) to compute the anonymized, pooled Allele Frequency Community statistics when the Allele Frequency Community database is updated.
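
To make "anonymized, pooled statistics" concrete, here is a minimal sketch of how a pooled allele frequency at a single site could be computed from per-sample genotypes. It is an assumption-laden illustration, not the AFC implementation: the function name is hypothetical, samples are assumed diploid, and the `AC`/`AN`/`AF` labels simply follow the common VCF convention for allele count, allele number, and allele frequency.

```python
def pooled_allele_frequency(alt_allele_counts, ploidy=2):
    """Pool per-sample alternate-allele counts at one site.

    alt_allele_counts: one entry per sample, the number of alternate
    alleles carried (0, 1, or 2 for a diploid call), or None for a no-call.
    Returns a dict with AC, AN, and AF, or None if no sample was called.
    """
    called = [c for c in alt_allele_counts if c is not None]
    allele_number = ploidy * len(called)  # total alleles observed
    if allele_number == 0:
        return None
    allele_count = sum(called)            # alternate alleles observed
    return {
        "AC": allele_count,
        "AN": allele_number,
        "AF": allele_count / allele_number,
    }
```

Note that the output carries no sample identifiers: only the pooled counts and the resulting frequency leave the function.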

Q: If variants are stored, are they stored in a per individual basis? (i.e. is it possible to tell which set of variants came from any one individual?)
A: In your private, secure account, the answer is “yes”, but more importantly, in the Allele Frequency Community database which is available to third parties, the answer is “no”: only anonymized, pooled statistics are shared in the context of the Allele Frequency Community.

Q: Is there any phenotype data stored against the variants? You’ve said that ethnicity is being recorded, anything else?
A: Phenotype data may be stored in your private account, but these data are not released in the Allele Frequency Community resource. Some founders have elected to provide self-identified ethnicity information for the purposes of building a classifier algorithm that is used to provide anonymous, pooled allele frequency statistics within sub-populations that are represented by at least 100 patients in the opted-in dataset. As an additional patient privacy safeguard, no ethnic subpopulation statistics will be provided for subpopulations that are not represented by at least 100 individuals.
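
The 100-individual safeguard described above amounts to suppressing any subpopulation that is too small to report. A minimal sketch follows, assuming diploid samples and a hypothetical input layout (a mapping from subpopulation label to the number of individuals and their pooled alternate-allele count); the function and constant names are illustrative only.

```python
MIN_GROUP_SIZE = 100  # threshold stated in this FAQ


def reportable_subpopulation_frequencies(groups, min_group_size=MIN_GROUP_SIZE):
    """Return allele frequencies only for sufficiently large subpopulations.

    groups: mapping of label -> (n_individuals, alt_allele_count).
    Subpopulations with fewer than min_group_size individuals are omitted
    entirely, so nothing about them appears in the output.
    """
    reported = {}
    for label, (n_individuals, alt_allele_count) in groups.items():
        if n_individuals < min_group_size:
            continue  # suppressed: group too small to report safely
        reported[label] = alt_allele_count / (2 * n_individuals)  # diploid assumed
    return reported
```

For example, with groups of 100, 99, and 250 individuals, only the first and third would appear in the result.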

Q: Is any other information apart from the variant recorded in AFC?
A: AFC records the variant itself and its anonymized, pooled frequency / count information, computed across all opted-in samples that pass a series of quality exclusion filters. In the future, we also intend to provide ethnic subpopulation frequencies once a population reaches a minimum threshold size of 100 patients.

Q: Is there any way to go from a variant frequency in AFC back to an individual, or to a specific analysis in IVA?
A: No, the Allele Frequency Community only provides anonymized, pooled allele frequency statistics, and does not link back to specific individuals in any way.

