Custom Reference Genomes in IGV

How to define a custom reference genome in IGV 2.11.0 and later

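Software version

IGV version: 2.13.0

Requirements

I made some modifications to the b37 reference genome FASTA file and need to browse it in IGV, together with a track of RefSeq gene annotations.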

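Background

Version differences

IGV 2.11.0 and later use a JSON file to specify and load a reference genome.
The .genome format used by earlier versions is deprecated, and the loading route through Genomes -> Create .genome File… has been removed.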

Official documentation

Format description: https://github.com/igvteam/igv/wiki/JSON-Genome-Format
Property description: https://github.com/igvteam/igv.js/wiki/Reference-Genome

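Parameters

In the custom reference genome JSON file for IGV 2.11.0 and later, only fastaURL is required; every other field is optional.
Any URL can point to an online resource or to a local path.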

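  • id: identifier of the reference genome, optional. To use the BLAT feature, this must be one of a specific set of genome IDs; the reason is explained in the BLAT section below.
  • name: descriptive name, optional. This is the genome name shown in the Genome drop-down list.
  • fastaURL: URL of the reference genome FASTA file, required. It can be online, such as a FASTA file hosted by UCSC or another database, or local, such as the path of a FASTA file on a server.
  • indexURL: URL of the FASTA index (.fai) file, optional. If no .fai file is provided, the whole FASTA file is loaded at once.
  • cytobandURL: URL of a cytoBand file in UCSC format, optional. It is used to draw the chromosome ideogram. cytoBand.txt.gz can be found under UCSC goldenPath, for example the one for hg19. UCSC documents the cytoBand file format.
  • tracks: a list of tracks loaded together with the reference genome, optional. An example is the RefSeq Genes annotation of the default hg19 genome. The IGV GitHub repository documents the track format.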
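As an illustration of the fields above, here is a minimal Python sketch that assembles a genome JSON for a modified b37 FASTA. The helper name build_genome_config, every /data/ref/... path, the id "b37_modified", and the local RefSeq track file are hypothetical placeholders, not values from IGV or from this article; only the field names come from the list above.

```python
import json


def build_genome_config(fasta, index=None, cytoband=None,
                        genome_id=None, name=None, tracks=None):
    """Return a dict laid out like an IGV genome JSON.

    Only fastaURL is mandatory; optional fields are left out when empty.
    """
    if not fasta:
        raise ValueError("fastaURL is required")
    fields = [
        ("id", genome_id),
        ("name", name),
        ("fastaURL", fasta),
        ("indexURL", index),
        ("cytobandURL", cytoband),
        ("tracks", tracks),
    ]
    return {key: value for key, value in fields if value}


# All paths below are hypothetical examples of local files.
config = build_genome_config(
    fasta="/data/ref/b37_modified.fa",
    index="/data/ref/b37_modified.fa.fai",
    genome_id="b37_modified",
    name="Human (b37, modified)",
    tracks=[{
        "name": "Refseq Genes",
        "format": "refgene",
        "url": "/data/ref/refGene.sorted.txt.gz",
        "indexURL": "/data/ref/refGene.sorted.txt.gz.tbi",
    }],
)

# Write this text to a .json file and load that file in IGV.
print(json.dumps(config, indent=2))
```

Leaving out empty optional fields keeps the file close to the minimal form in which only fastaURL is present.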

Loading

Once the JSON file is written, load the custom reference genome in IGV with: Genomes -> Load Genome from File… -> select the reference genome JSON file
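Before loading the file, it can help to check that it parses and that its local paths exist. The following is a rough sketch, not part of IGV: the function names are made up for illustration, the set of schemes treated as remote is an assumption, and local paths are assumed to be absolute.

```python
import json
import os
from urllib.parse import urlparse

URL_KEYS = ("fastaURL", "indexURL", "cytobandURL", "aliasURL")


def is_remote(location):
    """Treat a location as remote when it carries a network URL scheme."""
    return urlparse(location).scheme in ("http", "https", "ftp", "gs", "s3")


def check_genome_json(path):
    """Return a list of problems found in a genome JSON file (empty if none)."""
    try:
        with open(path, encoding="utf-8") as handle:
            config = json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read {path}: {exc}"]
    if not isinstance(config, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    if not config.get("fastaURL"):
        problems.append("fastaURL is missing (the only required field)")

    # Collect every location named at the top level and inside tracks.
    locations = [(key, config[key]) for key in URL_KEYS if config.get(key)]
    for i, track in enumerate(config.get("tracks", [])):
        for key in ("url", "indexURL"):
            if track.get(key):
                locations.append((f"tracks[{i}].{key}", track[key]))

    # Remote URLs are not fetched; only local paths are checked.
    for label, location in locations:
        if not is_remote(location) and not os.path.exists(location):
            problems.append(f"{label}: local file not found: {location}")
    return problems
```

Calling check_genome_json on the path of the JSON file returns an empty list when nothing suspicious is found; remote URLs are deliberately not downloaded.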

JSON example

For details on the example, see: IGV reference genome (JSON)

{
  "id": "hg38",
  "name": "Human (GRCh38/hg38)",
  "fastaURL": "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa",
  "indexURL": "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai",
  "cytobandURL": "https://s3.amazonaws.com/igv.org.genomes/hg38/annotations/cytoBandIdeo.txt.gz",
  "aliasURL": "https://s3.amazonaws.com/igv.org.genomes/hg38/hg38_alias.tab",
  "chromosomeOrder": [
    "chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8",
    "chr9", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16",
    "chr17", "chr18", "chr19", "chr20", "chr21", "chr22", "chrX", "chrY"
  ],
  "tracks": [
    {
      "name": "Refseq Genes",
      "format": "refgene",
      "url": "https://s3.amazonaws.com/igv.org.genomes/hg38/ncbiRefSeq.sorted.txt.gz",
      "indexURL": "https://s3.amazonaws.com/igv.org.genomes/hg38/ncbiRefSeq.sorted.txt.gz.tbi"
    },
    {
      "name": "Gencode v24 genes",
      "format": "gtf",
      "url": "https://s3.amazonaws.com/igv.org.genomes/hg19/gencode.v24.genes.gtf.gz"
    }
  ]
}

BLAT in IGV

BLAT
Reference genome
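The official documentation above shows that IGV's BLAT feature essentially turns the query sequence into a request and sends it to the UCSC web service for analysis.
If a parameter in that request is wrong, BLAT stops working; one example is the db parameter that selects the database. Testing showed that the id in the JSON file is used as the db parameter of the request.
The Hosting genomes part of the Reference genome page links to https://igv.org/genomes/genomes.tsv, which lists the JSON file of each genome bundled with IGV along with its genome ID (the value of the db parameter and of the id in the JSON file).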
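To make the role of db concrete, here is a rough sketch of how such a request might be formed. The endpoint and the parameter names userSeq, type, and output are assumptions based on UCSC's public hgBlat service, not something taken from IGV's source or confirmed by this article; the exact request IGV builds may differ. The query sequence is an arbitrary placeholder.

```python
from urllib.parse import urlencode

# Assumed UCSC BLAT endpoint; IGV's actual configured URL may differ.
UCSC_BLAT = "https://genome.ucsc.edu/cgi-bin/hgBlat"


def blat_url(sequence, genome_id):
    """Build a BLAT query URL in which the genome JSON id is passed as db."""
    query = urlencode({
        "userSeq": sequence,
        "type": "DNA",
        "db": genome_id,   # must be a database name the service knows
        "output": "json",
    })
    return f"{UCSC_BLAT}?{query}"


# A custom id such as "b37_modified" would be sent as db unchanged,
# which is why BLAT fails for ids the service does not recognise.
print(blat_url("ACGTACGTACGTACGTACGTACGT", "hg38"))
```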

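  • If you only want to download the reference genome files bundled with IGV to local disk, so that you do not wait for them to load every time, and you also want to keep BLAT working, the id in the JSON file must be the genome ID listed in genomes.tsv.
  • If your species or reference genome is not bundled with IGV, or you want to call BLAT on a local server, see this article: 如何在IGV上使用BLAT搜索非模式物种 (How to use BLAT in IGV to search non-model species)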
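For the first case, one possible approach is to take the bundled JSON and point its URLs at local copies while leaving id untouched. The sketch below assumes the files have already been downloaded into a single directory under their original file names; the function name localize and the directory /data/igv/hg38 are hypothetical.

```python
import copy
import os
from urllib.parse import urlparse

URL_KEYS = ("fastaURL", "indexURL", "cytobandURL", "aliasURL")


def localize(config, local_dir):
    """Return a copy of config whose URLs point at same-named files in local_dir.

    id and name are kept as they are, so BLAT still receives the bundled genome ID.
    """
    def to_local(url):
        filename = os.path.basename(urlparse(url).path)
        return os.path.join(local_dir, filename)

    result = copy.deepcopy(config)
    for key in URL_KEYS:
        if key in result:
            result[key] = to_local(result[key])
    for track in result.get("tracks", []):
        for key in ("url", "indexURL"):
            if key in track:
                track[key] = to_local(track[key])
    return result


bundled = {
    "id": "hg38",
    "name": "Human (GRCh38/hg38)",
    "fastaURL": "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa",
    "indexURL": "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai",
}
local = localize(bundled, "/data/igv/hg38")
print(local["id"], local["fastaURL"])
```

The function only rewrites paths; downloading the files is left to whatever tool you normally use.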

https://igv.org/genomes/genomes.tsv

Columns: Name (for menu), URL to .genome or .json file, genome ID
A. thaliana (TAIR 10) https://igv.org/genomes/json/tair10.json tair10
C. elegans (ce11) https://igv.org/genomes/json/ce11.json ce11
Chicken (GRCg6a / galGal6) https://igv.org/genomes/json/galGal6.json galGal6
Gallus gallus (GCF_016699485.2) https://igv.org/genomes/json/GCF_016699485.2/GCF_016699485.2.json GCF_016699485.2
Chimp (panTro4) https://igv.org/genomes/json/panTro4.json panTro4
Coprinopsis cinerea okayama7#130 (GCA_000182895.1) https://igv.org/genomes/json/GCA_000182895.1/GCA_000182895.1_dt.json GCA_000182895.1
Cow (bosTau8) https://igv.org/genomes/json/bosTau8.json bosTau8
Cow (bosTau9) https://igv.org/genomes/json/bosTau9.json bosTau9
D. melanogaster (dm6) https://igv.org/genomes/json/dm6.json dm6
D. melanogaster (dm3) https://igv.org/genomes/json/dm3.json dm3
D. melanogaster (r5.9) https://igv.org/genomes/json/dmel_r5.9.json dmel_r5.9
Dog (canFam3) https://igv.org/genomes/json/canFam3.json canFam3
Dog (UU_Cfam_GSD_1.0/canFam4) https://igv.org/genomes/json/canFam4.json canFam4
Dog (canFam5) https://igv.org/genomes/json/canFam5.json canFam5
Human (1kg, b37+decoy) https://igv.org/genomes/json/b37_1kg.json 1kg_v37
Human (T2T CHM13-v2.0/hs1) https://igv.org/genomes/json/hs1.json hs1
Human (T2T CHM13-v1.1) https://igv.org/genomes/json/chm13v1.1.json chm13v1.1
Human (hg18) https://igv.org/genomes/json/hg18.json hg18
Human (hg19) https://igv.org/genomes/json/hg19.json hg19
Human (hg38) https://igv.org/genomes/json/hg38.json hg38
Human (hg38 1kg/GATK) https://igv.org/genomes/json/hg38_1kg.json hg38_1kg
Macaca fascicularis 6.0 (GCA_011100615.1) https://igv.org/genomes/json/GCA_011100615.1.json GCA_011100615.1
Mouse mm10 https://igv.org/genomes/json/mm10.json mm10
Mouse mm9 https://igv.org/genomes/json/mm9.json mm9
Mouse mm39 https://igv.org/genomes/json/mm39.json mm39
Rat (rn6) https://igv.org/genomes/json/rn6.json rn6
Rat (rn7) https://igv.org/genomes/json/rn7.json rn7
S. cerevisiae (sacCer3) https://igv.org/genomes/json/sacCer3.json sacCer3
Zebrafish (GRCz10/danRer10) https://igv.org/genomes/json/danRer10.json danRer10
Zebrafish (GRCz11/danRer11) https://igv.org/genomes/json/danRer11.json danRer11
SARS-CoV-2 https://igv.org/genomes/json/ASM985889v3.json ASM985889v3
S. pombe (ASM294v2) https://igv.org/genomes/json/ASM294v2.json ASM294v2
Gorilla (gorGor4) https://igv.org/genomes/json/gorGor4.json gorGor4
Gorilla (gorGor6) https://igv.org/genomes/json/gorGor6.json gorGor6
Bonobo (MPI-EVA panpan1.1/panPan2) https://igv.org/genomes/json/panPan2.json panPan2
Pig (SGSC Sscrofa11.1/susScr11) https://igv.org/genomes/json/susScr11.json susScr11
S. purpuratus (Baylor 2.1/strPur2) https://igv.org/genomes/json/strPur2.json strPur2
S. purpuratus (Spur5.0) https://igv.org/genomes/json/Spur5.0.json Spur5.0
P. miniata (Pmin3.0) https://igv.org/genomes/json/Pmin3.0.json Pmin3.0
L. variegatus (Lvar3.0) https://igv.org/genomes/json/Lvar3.0.json Lvar3.0
O. sativa IRGSP-1.0 (GCF_001433935.1) https://igv.org/genomes/json/GCF_001433935.1.json GCF_001433935.1
A. gambia (Pest AgamP3) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/AgamP3.genome AgamP3
Autographa californica MNPV https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_001623.genome NC_001623
Bacillus Subtilis str. 168 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_000964.genome NC_000964
Banana (M. Balbisiana PKWv1) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/MusaBalbisianaPKWv1.genome MusaBalbisianaPKWv1
Banana (Musa acuminata) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/MusaAcuminata.genome MusaAcuminata
C. alibcans (SC5314 A21) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/ca21.genome ca21
C. elegans (WS241) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/ws241.genome ws241
C. elegans (WS235) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/ws235.genome ws235
C. elegans (WS245) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/WS245.genome WS245
C. elegans (ce10) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/ce10.genome ce10
Chicken (galGal4) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/galGal4.genome galGal4
Chicken (galGal5) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/galGal5.genome galGal5
Chimp (panTro3) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/panTro3.genome panTro3
Chimp (panTro5) https://s3.amazonaws.com/igv.org.genomes/panTro5/panTro5.genome panTro5
Chimp (panTro6) https://s3.amazonaws.com/igv.org.genomes/panTro6/panTro6.genome panTro6
Cat (felCat5) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/felCat5.genome felCat5
Cow (bosTau7) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/bosTau7.genome bosTau7
E. coli K-12 MG1655 (NC_000913.2) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_000913.2.genome NC_000913.2
E. coli K-12 (NC_000913.3) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_000913.3.gbk NC_000913.3
Ferret (MusPutFur1.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/MusPutFur1.0.genome MusPutFur1.0
Francisella tularensis (NC_008601) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_008601.gbk NC_008601
Glycine max (v8.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/gmax8.genome gmax8
Glycine max (Wm82.a2.v1) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/gmax10.genome gmax10
Helicobacter hepaticus https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_004917.genome NC_004917
HIV-1 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_001802.genome NC_001802
HIV-2 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_001722.genome NC_001722
Human Adenovirus C https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_001405.genome NC_001405
Human Herpesvirus 4, Type 1 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/HHV4_Type1.genome HHV4_Type1
Human Herpesvirus 4, Type 2 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/HHV4_Type2.genome HHV4_Type2
Human Mito (NC_012920) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_012920.1.gbk NC_012920
Human respiratory synctial virus https://s3.amazonaws.com/igv.broadinstitute.org/genomes/M74568.genome M74568
Macaca fascicularis (CE_1.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/CE_1.0.genome CE_1.0
Macaca fascicularis (5.0) https://s3.amazonaws.com/igv.org.genomes/Macaca_fascicularis_5.0/Macaca_fascicularis_5.0.genome GCF_000364345.1
Mouse (129S1/SvImJ) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/SvImJ.genome 129S1/SvImJ
Mouse mm8 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/mm8.genome mm8
Mycobacterium TB (CD1551) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_002755.genome NC_002755
N. Meningitidis (FAM18) https://s3.amazonaws.com/igv.broadinstitute.org/test/genomes/NC_008767.genome NC_008767
N. Meningitidis (MC58) https://s3.amazonaws.com/igv.broadinstitute.org/test/genomes/NC_003112.genome NC_003112
N. Meningitidis (Z2491) https://s3.amazonaws.com/igv.broadinstitute.org/test/genomes/NC_003116.genome NC_003116
P. falciparum 3D7 (V9.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/Pf3D7_v9.0.genome Pf3D7_v9.0
Plasmodium (3D7 V24) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/PlasmoDB_24.genome Plasmodium_24
Rabbit (oryCun2.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/oryCun2.0.genome oryCun2.0
Rat (rn5) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/rn5.genome rn5
Rhesus (rheMac3) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/rheMac3.genome rheMac3
Rhesus (rheMac8) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/rheMac8.genome rheMac8
Rhesus (Mmul_10/rheMac10) https://s3.amazonaws.com/igv.org.genomes/rheMac10/rheMac10.genome rheMac10
S. cerevisiae (Y55) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/Y55.genome Y55
S. pombe (ASM294v2) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/ASM294v2.genome ASM294v2
S. sclerotiorum (sclerotiorum) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/sclerotiorum.genome sclerotiorum
Salmonella enterica str. 14028S https://s3.amazonaws.com/igv.broadinstitute.org/genomes/NC_016856.genome NC_016856
Salmo salar (ICSASG_v2) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/GCF_000233375.1.genome GCF_000233375.1
Sheep (Ovis Aries v3.1) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/oviAri3.genome oviAri3
Sus Scrofa (Sscrofa10.2/susScr3) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/susScr3.genome susScr3
T. brucei (427 v4.2) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/tb427_4.2.genome tb427_4.2
T. brucei (927 v5.0) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/tbrucei927_5.0.genome tbrucei927_5.0
T. brucei gambiense https://s3.amazonaws.com/igv.broadinstitute.org/genomes/tbgambi.genome tbgambi
Tomato (2.31) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/SL2.31.genome SL2.31
Tomato (2.40) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/SL2.40.genome SL2.40
Trout (CCAF000000000) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/CCAF000000000.genome CCAF000000000
V. vitifera https://s3.amazonaws.com/igv.broadinstitute.org/genomes/vvinifera.genome vvitifera
X. tropicalis v9.0 https://s3.amazonaws.com/igv.broadinstitute.org/genomes/xenTro9.genome xenTro9
X. laevis (7.1) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/laevis_7.1.genome laevis_7.1
Zea mays (AGPv3.31) https://s3.amazonaws.com/igv.broadinstitute.org/genomes/AGPv3.31.genome AGPv3.31
C. reinhardtii (CC-503 v5.5) https://s3.amazonaws.com/igv.org.genomes/Creinhardtii_CC-503_v5.0/2019-03-21_Creinhardtii_CC-503_v5.5.genome 2019-03-21_Creinhardtii_CC-503_v5.5
C. reinhardtii (CC-503 v5.6) https://s3.amazonaws.com/igv.org.genomes/C.reinhardtii_CC-503_v5.6+/2019-03-21_Creinhardtii_CC-503_v5.6.genome 2019-03-21_Creinhardtii_CC-503_v5.6
C. reinhardtii (CC-503 v5.6 Cp+Mt v4.4) https://s3.amazonaws.com/igv.org.genomes/Creinhardtii_CC-503_v5.6/2019-03-21_Creinhardtii_CC-503_v5.6_Cp%2BMt_v4.4.genome 2019-03-21_Creinhardtii_CC-503_v5.6_Cp+Mt_v4.4
Ixodes scapularis JCVI_ISG_i3_1.0 (GCA_000208615.1) https://s3.amazonaws.com/igv.org.genomes/tick/GCA_000208615.1/GCA_000208615.1.genome GCA_000208615.1
Ixodes scapularis ISE6_asm2.2_deduplicated (GCA_002892825.2) https://s3.amazonaws.com/igv.org.genomes/tick/GCA_000208615.1/GCA_000208615.1.genome GCA_002892825.2
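The listing above can also be searched programmatically. This sketch assumes the file is tab-separated with three columns in the order shown; the function name genome_ids is made up, and the sample text is just three rows copied from the table, with the header row included.

```python
def genome_ids(tsv_text):
    """Map genome ID -> (menu name, URL) from genomes.tsv-style text."""
    table = {}
    for line in tsv_text.splitlines():
        cells = [cell.strip() for cell in line.split("\t")]
        if len(cells) < 3:
            continue
        name, url, genome_id = cells[:3]
        # Skip the header row and anything else without a URL in column 2.
        if not url.startswith("http"):
            continue
        table[genome_id] = (name, url)
    return table


sample = (
    "Name (for menu)\turl to .genome or .json file\tgenome ID\n"
    "Human (1kg, b37+decoy)\thttps://igv.org/genomes/json/b37_1kg.json\t1kg_v37\n"
    "Human (hg19)\thttps://igv.org/genomes/json/hg19.json\thg19\n"
    "Human (hg38)\thttps://igv.org/genomes/json/hg38.json\thg38\n"
)
ids = genome_ids(sample)
print("hg38" in ids, "b37_modified" in ids)
```

A check of this kind shows quickly whether the id you put in a custom JSON matches one of the bundled genome IDs.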

Update history

2025.03.26

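  1. The descriptions of id and name in the Parameters section were wrong and have been corrected.
  2. Replaced the JSON example with the official example.
  3. Added information on BLAT in IGV.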