Motif enrichment with pycisTarget using mouse liver ChIP-seq regions
[1]:
%matplotlib inline
import pycistarget
pycistarget.__version__
pycisTarget is a Python module for performing motif enrichment analysis and deriving genome-wide cistromes using cisTarget (Herrmann et al., 2012; Imrichova et al., 2015). De novo cistromes can also be derived (via Homer (Heinz et al., 2010)), and pycisTarget includes a novel approach to derive differentially enriched motifs and cistromes between one or more groups of regions, named Differentially Enriched Motifs (DEM).
0. Getting your input region sets
pycisTarget takes as input a dictionary with the region set names as keys and the regions (as PyRanges objects) as values. In this tutorial we will use 4 region sets, which correspond to the top 5K ChIP-seq peaks of Hnf4a, Foxa1, Cebpa and Onecut1 in the mouse liver (Ballester et al., 2014). We can easily read the data into the correct format using a dictionary comprehension.
[2]:
import pyranges as pr
import os
path_to_region_sets = '/staging/leuven/stg_00002/lcb/cbravo/Liver/Multiome/pycistopic/GEMSTAT/ChIP/All_summits'
region_sets_files = ['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K.bed', 'Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K.bed', 'Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K.bed', 'Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K.bed']
region_sets = {x.replace('.bed', ''):pr.read_bed(os.path.join(path_to_region_sets, x)) for x in region_sets_files}
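As a minimal sketch of the resulting dictionary structure, the keys are simply the file names without the `.bed` extension. The helper `short_label` below is hypothetical (it is not part of pycisTarget) and only illustrates how shorter labels, such as the TF name, could be derived if preferred:

```python
# File names as used in this tutorial.
region_sets_files = [
    'Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K.bed',
    'Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K.bed',
    'Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K.bed',
    'Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K.bed',
]

def short_label(file_name):
    """Hypothetical helper: keep only the TF name before the first underscore."""
    return file_name.split('_', 1)[0]

# Keys produced by the dictionary comprehension in the cell above.
full_labels = [x.replace('.bed', '') for x in region_sets_files]
# Alternative, shorter keys (an assumption, not used in the rest of this tutorial).
short_labels = [short_label(x) for x in region_sets_files]

print(full_labels[0])
print(short_labels)
```

The rest of this tutorial uses the full labels, so switching to the short ones would require adapting the names passed to the result functions.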
Apart from the cisTarget method, pycisTarget includes wrapper functions to use Homer (for de novo motif enrichment) and a new implementation relying on statistical testing between sets of regions using Cluster-Buster scores (DEM). We will first describe how to perform motif enrichment and form cistromes using cisTarget.
1. cisTarget
A. Creating cisTarget databases
To run cisTarget you will need to provide a ranking database (that is, a feather file with a dataframe with motifs as rows, genomic regions as columns and their ranked position [based on cis-regulatory module (CRM) score (Frith et al., 2003)] as values). We provide those databases for human (hg38, hg19), mouse (mm10, mm9) and fly (dm3, dm6) at https://resources.aertslab.org/cistarget/.
In addition, if you want to use other regions or genomes to build your databases, we provide a step-by-step tutorial and scripts at https://github.com/aertslab/create_cisTarget_databases. Below you can find the basic steps to do so:
[ ]:
%%bash
#### Variables
# Note: bash does not allow spaces around '=' in variable assignments
genome_fasta='PATH_TO_GENOME_FASTA'
region_bed='PATH_TO_BED_FILE_WITH_GENOMIC_REGIONS_FOR_DATABASE'
region_fasta='PATH_TO_FASTA_FILE_WITH_GENOMIC_REGIONS_FOR_DATABASE'
database_suffix='SUFFIX_FOR_DATABASE_FILE'
path_to_motif_collection='PATH_TO_MOTIF_COLLECTION_IN_CLUSTER_BUSTER_FORMAT'
motif_list='PATH_TO_FILE_WITH_MOTIFS_TO_SCORE'
n_cpu='NUMBER_OF_CORES'
#### Get fasta sequences
module load BEDTools # In our system, load BEDTools
bedtools getfasta -fi ${genome_fasta} -bed ${region_bed} > ${region_fasta}
#### Activate environment
my_conda_initialize # In our system, initialize conda
conda activate /staging/leuven/stg_00002/lcb/ghuls/software/miniconda3/envs/create_cistarget_databases
#### Set ${create_cistarget_databases_dir} to https://github.com/aertslab/create_cisTarget_databases
create_cistarget_databases_dir='/staging/leuven/stg_00002/lcb/ghuls/software/create_cisTarget_databases'
#### Score the motifs
${create_cistarget_databases_dir}/create_cistarget_motif_databases.py \
-f ${region_fasta} \
-M ${path_to_motif_collection} \
-m ${motif_list} \
-o ${database_suffix} \
-t ${n_cpu} \
-l \
-s 555
#### Create rankings
motifs_vs_regions_scores_feather='PATH_TO_MOTIFS_VS_REGIONS_SCORES_DATABASE'
${create_cistarget_databases_dir}/convert_motifs_or_tracks_vs_regions_or_genes_scores_to_rankings_cistarget_dbs.py -i ${motifs_vs_regions_scores_feather} -s 555
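To make the relation between the scores database and the ranking database concrete, the following toy sketch converts a small motifs-by-regions score table into per-motif rankings. The scores, motif names and region names are made up, ranks are assumed to start at 0 for the best-scoring region, and ties are resolved by column order, which is a simplification of what the actual script does:

```python
# Toy CRM scores: motifs as rows, regions as columns (values are made up).
regions = ['region_1', 'region_2', 'region_3', 'region_4']
scores = {
    'motif_A': [0.2, 3.1, 0.0, 1.4],
    'motif_B': [2.5, 0.1, 0.7, 0.3],
}

def scores_to_rankings(score_row):
    """Return the rank of each region (0 = highest score) for one motif."""
    # Region indices sorted from highest to lowest score.
    order = sorted(range(len(score_row)), key=lambda i: score_row[i], reverse=True)
    ranks = [0] * len(score_row)
    for rank, region_idx in enumerate(order):
        ranks[region_idx] = rank
    return ranks

rankings = {motif: scores_to_rankings(row) for motif, row in scores.items()}

for motif, ranks in rankings.items():
    print(motif, dict(zip(regions, ranks)))
```

cisTarget works on these rankings, whereas DEM (described later) works directly on the scores.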
B. Running cisTarget
For running cisTarget there are some relevant parameters:
ctx_db: Path to the cisTarget database to use, or a preloaded cisTargetDatabase object. In this tutorial we will use the precomputed mm10 database (using SCREEN regions), available at https://resources.aertslab.org/cistarget/.
region_sets: The input sets of regions
specie: Species to which the region coordinates and the database belong. To annotate motifs to TFs using cisTarget annotations, possible values are ‘mus_musculus’, ‘homo_sapiens’ or ‘drosophila_melanogaster’. With any other value, motifs will not be annotated to a TF unless a customized annotation is provided.
fraction_overlap: Minimum overlap fraction (in any direction) to map input regions to regions in the database. Default: 0.4.
auc_threshold: Threshold to calculate the AUC. For human and mouse we recommend setting it to 0.005 (default), for fly to 0.01.
nes_threshold: Minimum NES for a motif to be considered significantly enriched. Default: 3.0
rank_threshold: Percentage of regions to use as maximum rank to take into account for the region enrichment recovery curve. By default, we use 5% of the total number of regions in the database.
annotation: Annotation to use to form the cistromes. Default: [‘Direct_annot’, ‘Motif_similarity_annot’, ‘Orthology_annot’, ‘Motif_similarity_and_Orthology_annot’]. Since we are using the clustered motif database, we will not use motif similarity annotations (which rely only on Tomtom q-values), because motif similarity is already implicit in the clusters.
annotation_version : Motif collection version. Here we use the clustered v10 database (‘v10nr_clust’).
path_to_motif_annotations : File with motif annotations. These files are available at https://resources.aertslab.org/cistarget/motif2tf .
n_cpu: Number of cpus to use during calculations.
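To give an intuition for how the AUC and NES relate, here is a simplified sketch of a recovery-curve AUC and its normalization across motifs. It is not pycisTarget's implementation: the function names, the toy ranks, the single rank cutoff and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def recovery_auc(ranks_of_set, rank_cutoff):
    """Normalized area under the recovery curve up to rank_cutoff.

    ranks_of_set: ranks (0 = best) of the regions in the input set for one motif.
    """
    hits = [r for r in ranks_of_set if r < rank_cutoff]
    # A region recovered at rank r contributes to every position from r to the cutoff.
    area = sum(rank_cutoff - r for r in hits)
    max_area = rank_cutoff * len(ranks_of_set)
    return area / max_area

def normalized_enrichment_scores(auc_per_motif):
    """NES = (AUC - mean AUC over all motifs) / standard deviation of the AUCs."""
    values = list(auc_per_motif.values())
    mu, sigma = mean(values), pstdev(values)
    return {motif: (auc - mu) / sigma for motif, auc in auc_per_motif.items()}

# Toy example: ranks of a 4-region set in a hypothetical database of 1000 regions.
toy_ranks = {
    'motif_A': [0, 1, 2, 3],          # region set ranked at the very top
    'motif_B': [10, 400, 700, 900],
    'motif_C': [300, 500, 20, 800],
    'motif_D': [600, 45, 350, 999],
}
rank_cutoff = 50  # top 5% of the 1000 toy regions

aucs = {motif: recovery_auc(ranks, rank_cutoff) for motif, ranks in toy_ranks.items()}
nes = normalized_enrichment_scores(aucs)
best_motif = max(nes, key=nes.get)
print(aucs)
print(best_motif, round(nes[best_motif], 2))
```

With only four motifs the NES is bounded well below 3, so no toy motif would pass the default nes_threshold; real databases contain thousands of motifs, which is what makes the normalization informative.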
[3]:
# Load cistarget functions
from pycistarget.motif_enrichment_cistarget import *
[5]:
# Run, using precomputed database
cistarget_dict = run_cistarget(ctx_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.rankings.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
auc_threshold = 0.005,
nes_threshold = 3.0,
rank_threshold = 0.05,
annotation = ['Direct_annot', 'Orthology_annot'],
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
n_cpu = 4,
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:14:15,645 cisTarget INFO Reading cisTarget database
(ctx_internal_ray pid=30473) 2022-08-04 09:14:41,873 cisTarget INFO Running cisTarget for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K which has 4924 regions
(ctx_internal_ray pid=30476) 2022-08-04 09:14:41,925 cisTarget INFO Running cisTarget for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K which has 4715 regions
(ctx_internal_ray pid=30475) 2022-08-04 09:14:42,008 cisTarget INFO Running cisTarget for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K which has 5019 regions
(ctx_internal_ray pid=30474) 2022-08-04 09:14:42,100 cisTarget INFO Running cisTarget for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K which has 3777 regions
(ctx_internal_ray pid=30473) 2022-08-04 09:14:54,544 cisTarget INFO Annotating motifs for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30474) 2022-08-04 09:14:54,715 cisTarget INFO Annotating motifs for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30476) 2022-08-04 09:14:55,520 cisTarget INFO Annotating motifs for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30475) 2022-08-04 09:14:56,065 cisTarget INFO Annotating motifs for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30473) 2022-08-04 09:14:57,050 cisTarget INFO Getting cistromes for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30474) 2022-08-04 09:14:57,333 cisTarget INFO Getting cistromes for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30476) 2022-08-04 09:14:58,260 cisTarget INFO Getting cistromes for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(ctx_internal_ray pid=30475) 2022-08-04 09:14:58,938 cisTarget INFO Getting cistromes for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:15:02,779 cisTarget INFO Done!
[5]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/cisTarget/cisTarget_dict.pkl', 'wb') as f:
    pickle.dump(cistarget_dict, f)
C. Exploring cisTarget results
We can load the results for exploration.
[6]:
# Load
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/cisTarget/cisTarget_dict.pkl', 'rb') as infile:
    cistarget_dict = pickle.load(infile)
To visualize motif enrichment results, we can use the cistarget_results() function:
[7]:
cistarget_results(cistarget_dict, name='Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[7]:
Motif | Region_set | Direct_annot | Orthology_annot | NES | AUC | Rank_at_max | Motif_hits | |
---|---|---|---|---|---|---|---|---|
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe, Cebpb, Cebpd, Cebpg, Hlf, Cebpa | Cebpe, Hes2, Cebpb, Ep300, Cebpd, Cebpg, Gatad2a, Cebpa, Dbp | 29.343196 | 0.097521 | 55485.0 | 2661 | |
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 24.940297 | 0.083402 | 55526.0 | 2148 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 18.255857 | 0.061966 | 55529.0 | 1913 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 13.044653 | 0.045254 | 55394.0 | 1454 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 11.735067 | 0.041054 | 55112.0 | 1303 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 10.872791 | 0.038289 | 55512.0 | 1477 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 10.672929 | 0.037648 | 55521.0 | 1433 | |
taipale_tf_pairs__GCM1_CEBPB_MTRSGGGNNNNNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gcm1, Cebpb | 10.544153 | 0.037235 | 9621.0 | 496 | |
taipale_tf_pairs__GCM1_CEBPB_MTRSGGGNNNNNNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gcm1, Cebpb | 9.960710 | 0.035364 | 7159.0 | 372 | |
taipale_tf_pairs__ATF4_CEBPB_NNATGAYGCAAYN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb, Atf4 | 9.523663 | 0.033963 | 5333.0 | 266 | |
taipale_tf_pairs__ATF4_CEBPD_NGATGATGCAATNN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpd, Atf4 | 9.413217 | 0.033608 | 16238.0 | 417 | |
taipale_tf_pairs__CEBPG_ATF4_NNATGAYGCAAT_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Atf4, Cebpg | 8.875060 | 0.031883 | 52734.0 | 974 | |
taipale_tf_pairs__TEAD4_CEBPD_RGWATGYNNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 8.650575 | 0.031163 | 55459.0 | 1425 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Atf4, Cebpg | Myc, Cebpg, Atf3, Ddit3, Atf4 | 8.590335 | 0.030970 | 55477.0 | 1098 | |
taipale_tf_pairs__GCM1_CEBPB_ATRSGGGNNNNTTRCGYAAN_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gcm1, Cebpb | 8.588150 | 0.030963 | 7559.0 | 322 | |
taipale_tf_pairs__ATF4_TEF_RNMTGATGCAATN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Atf4, Tef | 8.358196 | 0.030225 | 49968.0 | 905 | |
transfac_pro__M12588 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Ddit3 | 8.168929 | 0.029618 | 55436.0 | 1160 | |
taipale_tf_pairs__TEAD4_CEBPD_NTTRCGYAANNNNNNNNRGWATGY_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 7.872247 | 0.028667 | 13534.0 | 492 | |
taipale_tf_pairs__FLI1_CEBPB_RNCGGANNTTGCGCAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Fli1, Cebpb | 7.740382 | 0.028244 | 7500.0 | 308 | |
taipale_tf_pairs__TEAD4_CEBPD_NTTRCGYAANNNNNNNRGWATGY_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 7.629948 | 0.027890 | 32982.0 | 874 | |
taipale_tf_pairs__TEAD4_CEBPD_NTTRCGYAANNNNNNRGWATGY_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 7.377786 | 0.027081 | 45260.0 | 1110 | |
taipale_tf_pairs__ETV5_CEBPD_NSCGGANNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpd, Etv5 | 7.323667 | 0.026908 | 55501.0 | 1091 | |
taipale_tf_pairs__FLI1_CEBPD_RNCGGANNTTGCGCAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Fli1, Cebpd | 7.000281 | 0.025871 | 36615.0 | 805 | |
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 6.981791 | 0.025811 | 55483.0 | 1449 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Hlf, Dbp, Tef, Nfil3 | Tef, Gm4125, Hlf, Nfil3, Dbp | 6.623988 | 0.024664 | 55311.0 | 1326 | |
taipale_tf_pairs__TEAD4_CEBPB_NTTRCGYAANNNNNNGGAATGY_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpb | 6.117250 | 0.023039 | 13175.0 | 395 | |
taipale_tf_pairs__ERF_CEBPD_RSMGGAANTTGCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Erf, Cebpd | 6.019584 | 0.022726 | 25715.0 | 612 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 6.005248 | 0.022680 | 55285.0 | 1061 | |
cisbp__M01819 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Nfil3 | NaN | 5.540409 | 0.021189 | 55322.0 | 941 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpa | Ddit3 | 5.461645 | 0.020936 | 55427.0 | 1010 | |
taipale_tf_pairs__ETV2_CEBPD_RSCGGANNTTGCGYAAN_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpd, Etv2 | 5.365398 | 0.020628 | 55474.0 | 951 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 5.216519 | 0.020150 | 49441.0 | 747 | |
tfdimers__MD00123 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | E2f1, Sox17 | 5.136474 | 0.019894 | 55251.0 | 814 | |
cisbp__M00808 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Mypop | NaN | 4.597916 | 0.018166 | 23477.0 | 367 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf, Tef | 4.246750 | 0.017040 | 55392.0 | 1381 | |
transfac_pro__M01872 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Dbp | NaN | 3.938958 | 0.016053 | 53619.0 | 813 | |
transfac_pro__M05469 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Sall1 | 3.874759 | 0.015847 | 54755.0 | 619 | |
swissregulon__hs__GMEB2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gmeb2 | 3.757689 | 0.015472 | 30443.0 | 404 | |
hocomoco__GMEB2_HUMAN.H11MO.0.D | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gmeb2 | 3.088250 | 0.013325 | 18673.0 | 293 | |
metacluster_39.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Hnf4a | Hnf4a, Zfy2, Zfy1, Nr2f6, Mixl1, Hnf4g, Rxrg, Zfp644 | 3.080836 | 0.013301 | 52527.0 | 692 | |
taipale_tf_pairs__ETV2_TEF_RSCGGAWNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Etv2, Tef | 3.076843 | 0.013289 | 24472.0 | 338 | |
taipale_tf_pairs__ELK1_TEF_NSCGGAWNTTACGTAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Elk1, Tef | 3.043136 | 0.013181 | 37781.0 | 578 |
This table can also be easily exported to an HTML file:
[8]:
out_file = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/cisTarget/Cebpa_motif_enricment.html'
# Use a context manager so the output file is closed after writing
with open(out_file, 'w') as f:
    cistarget_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].motif_enrichment.to_html(f, escape=False, col_space=80)
You can also access the regions enriched for each motif. You will find two entries in motif_hits (and similarly in cistromes): ‘Region_set’ contains the coordinates as in the input regions, and ‘Database’ contains the coordinates as in the database:
[9]:
cistarget_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].motif_hits['Region_set']['metacluster_46.4'][0:10]
[9]:
['chr7:88310722-88311223',
'chr4:132078352-132078853',
'chr7:16525901-16526402',
'chr6:99266056-99266557',
'chr1:20820207-20820708',
'chr15:58214791-58215292',
'chr7:99181713-99182214',
'chr7:46719487-46719988',
'chr13:49681875-49682376',
'chr5:150599840-150600341']
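The two coordinate systems exist because input regions are mapped to database regions based on their overlap (the fraction_overlap parameter). A minimal sketch of such a check is shown below. Interpreting "in any direction" as "relative to either of the two regions" is an assumption, the function names are hypothetical, and the database intervals are made up; only the input region comes from the output above.

```python
def overlap_fractions(start_a, end_a, start_b, end_b):
    """Return the overlap as a fraction of interval A and as a fraction of interval B."""
    overlap = max(0, min(end_a, end_b) - max(start_a, start_b))
    return overlap / (end_a - start_a), overlap / (end_b - start_b)

def maps_to(start_a, end_a, start_b, end_b, fraction_overlap=0.4):
    """True if the overlap reaches the threshold relative to either interval."""
    frac_a, frac_b = overlap_fractions(start_a, end_a, start_b, end_b)
    return max(frac_a, frac_b) >= fraction_overlap

# Input region chr7:88310722-88311223 against two made-up database regions on chr7.
input_start, input_end = 88310722, 88311223
good_hit = maps_to(input_start, input_end, 88310900, 88311250)  # 323 bp shared
poor_hit = maps_to(input_start, input_end, 88311200, 88311700)  # 23 bp shared

print(good_hit, poor_hit)
```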
To access cistromes (only available if motifs have been annotated):
[10]:
cistarget_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].cistromes['Region_set']['Cebpa_(2809r)'][0:10]
[10]:
['chr7:88310722-88311223',
'chr4:132078352-132078853',
'chr7:16525901-16526402',
'chr6:99266056-99266557',
'chr1:20820207-20820708',
'chr15:58214791-58215292',
'chr7:99181713-99182214',
'chr7:46719487-46719988',
'chr13:49681875-49682376',
'chr5:150599840-150600341']
You can easily export cistromes to a bed file:
[11]:
from pycistarget.utils import *
cebpa_cistrome = cistarget_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].cistromes['Region_set']['Cebpa_(2809r)']
cebpa_cistrome_pr = pr.PyRanges(region_names_to_coordinates(cebpa_cistrome))
cebpa_cistrome_pr.to_bed(path='/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/cisTarget/cebpa_cistrome_example.bed')
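For readers who want to see what this conversion involves, the sketch below parses region names of the form `chr:start-end` into BED-style fields using only the standard library. It is not the implementation of region_names_to_coordinates; `parse_region_name` and `to_bed_lines` are hypothetical helpers written for illustration.

```python
import re

# Region names in this tutorial look like 'chr7:88310722-88311223'.
REGION_PATTERN = re.compile(r'^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$')

def parse_region_name(region_name):
    """Split a 'chr:start-end' string into (chromosome, start, end)."""
    match = REGION_PATTERN.match(region_name)
    if match is None:
        raise ValueError(f'Unexpected region name format: {region_name!r}')
    return match['chrom'], int(match['start']), int(match['end'])

def to_bed_lines(region_names):
    """Format region names as tab-separated BED lines (chrom, start, end)."""
    return ['\t'.join(map(str, parse_region_name(name))) for name in region_names]

example_regions = ['chr7:88310722-88311223', 'chr4:132078352-132078853']
bed_lines = to_bed_lines(example_regions)
print('\n'.join(bed_lines))
```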
2. DEM
A. Creating your DEM databases
To run DEM you will need to provide a CRM scores database (that is, a feather file with a dataframe with motifs as rows, genomic regions as columns and their cis-regulatory module (CRM) score (Frith et al., 2003) as values). We provide those databases for human (hg38, hg19), mouse (mm10, mm9) and fly (dm3, dm6) at https://resources.aertslab.org/cistarget/.
In addition, if you want to use other regions or genomes to build your databases, we provide a step-by-step tutorial and scripts at https://github.com/aertslab/create_cisTarget_databases. The steps are the same as for creating a cisTarget database, without running the last step for ranking the regions. Below you can find the basic steps to do so:
[12]:
%%bash
#### Variables
# Note: bash does not allow spaces around '=' in variable assignments
genome_fasta='PATH_TO_GENOME_FASTA'
region_bed='PATH_TO_BED_FILE_WITH_GENOMIC_REGIONS_FOR_DATABASE'
region_fasta='PATH_TO_FASTA_FILE_WITH_GENOMIC_REGIONS_FOR_DATABASE'
database_suffix='SUFFIX_FOR_DATABASE_FILE'
path_to_motif_collection='PATH_TO_MOTIF_COLLECTION_IN_CLUSTER_BUSTER_FORMAT'
motif_list='PATH_TO_FILE_WITH_MOTIFS_TO_SCORE'
n_cpu='NUMBER_OF_CORES'
#### Get fasta sequences
module load BEDTools # In our system, load BEDTools
bedtools getfasta -fi ${genome_fasta} -bed ${region_bed} > ${region_fasta}
#### Activate environment
my_conda_initialize # In our system, initialize conda
conda activate /staging/leuven/stg_00002/lcb/ghuls/software/miniconda3/envs/create_cistarget_databases
#### Set ${create_cistarget_databases_dir} to https://github.com/aertslab/create_cisTarget_databases
create_cistarget_databases_dir='/staging/leuven/stg_00002/lcb/ghuls/software/create_cisTarget_databases'
#### Score the motifs
${create_cistarget_databases_dir}/create_cistarget_motif_databases.py \
-f ${region_fasta} \
-M ${path_to_motif_collection} \
-m ${motif_list} \
-o ${database_suffix} \
-t ${n_cpu} \
-l \
-s 555
B. Running DEM
For running DEM there are some relevant parameters:
dem_db: Path to the DEM database to use, or a preloaded DEMDatabase object (using the same region sets to be analyzed)
region_sets: The input sets of regions
specie: Species to which the region coordinates and the database belong. To annotate motifs to TFs using cisTarget annotations, possible values are ‘mus_musculus’, ‘homo_sapiens’ or ‘drosophila_melanogaster’. With any other value, motifs will not be annotated to a TF unless a customized annotation is provided.
contrasts: Type of contrast to perform. If ‘Other’, background regions will be taken from the other region sets; if ‘Shuffle’, the background will consist of the scores on shuffled input sequences. You can also provide a list specifying the contrasts to make. We will show some examples of these modalities below. When using ‘Shuffle’, the Cluster-Buster path, the genome fasta and the path to the folder with the motifs to score (in Cluster-Buster format) have to be provided.
fraction_overlap: Minimum overlap fraction (in any direction) to map input regions to regions in the database. Default: 0.4.
max_bg_regions: Maximum number of background regions to use. Default: None (all regions).
adjpval_thr: Maximum adjusted p-value to select motifs. Default: 0.05
log2fc_thr: Minimum log2 fold change between the region set and the background to consider the motif differentially enriched. Default: 1.
mean_fg_thr: Minimum mean CRM value in the foreground (region set) to consider the motif differentially enriched. Default: 0
motif_hit_thr: Minimum CRM value to consider a region a motif hit. If None (default), an optimal threshold will be calculated per motif by comparing foreground and background.
annotation_version : Motif collection version. Here we use the clustered v10 database (‘v10nr_clust’).
path_to_motif_annotations : File with motif annotations. These files are available at https://resources.aertslab.org/cistarget/motif2tf .
motif_annotation: Annotation to use to form the cistromes. Here we will only use the direct and orthology annotation as example. Default: [‘Direct_annot’, ‘Motif_similarity_annot’, ‘Orthology_annot’, ‘Motif_similarity_and_Orthology_annot’]
n_cpu: Number of cpus to use during calculations.
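The interplay of adjpval_thr, log2fc_thr and mean_fg_thr can be sketched as a simple filter. The three motifs and their values are copied from the Cebpa results table in section C, whose Log2FC column is consistent with log2(Mean_fg / Mean_bg). The function names and the exact comparison operators are assumptions, and the statistical test that produces the adjusted p-values is not reproduced here.

```python
import math

# (Mean_fg, Mean_bg, Adjusted_pval) copied from the Cebpa results table in section C.
motif_stats = {
    'taipale_tf_pairs__ATF4_TEF_RNMTGATGCAATN_CAP': (0.480499, 0.040211, 0.000017),
    'homer__ATTGCGCAAC_CEBP': (2.547318, 0.584035, 0.0),
    'transfac_pro__M05469': (0.759713, 0.372943, 0.000002),
}

def log2_fold_change(mean_fg, mean_bg):
    """Log2 ratio of the mean CRM score in foreground versus background."""
    return math.log2(mean_fg / mean_bg)

def select_motifs(stats, adjpval_thr=0.05, log2fc_thr=1, mean_fg_thr=0):
    """Keep motifs passing all three thresholds (inclusive comparisons are an assumption)."""
    selected = []
    for motif, (mean_fg, mean_bg, adj_pval) in stats.items():
        if (adj_pval <= adjpval_thr
                and log2_fold_change(mean_fg, mean_bg) >= log2fc_thr
                and mean_fg >= mean_fg_thr):
            selected.append(motif)
    return selected

kept_default = select_motifs(motif_stats)                # mean_fg_thr = 0
kept_strict = select_motifs(motif_stats, mean_fg_thr=1)  # require a stronger foreground signal

print(kept_default)
print(kept_strict)
```

Under these assumptions, raising mean_fg_thr removes motifs whose fold change is high only because both foreground and background scores are close to zero.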
[6]:
# Load DEM functions
from pycistarget.motif_enrichment_dem import *
[7]:
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = 'Other',
name = 'DEM',
fraction_overlap = 0.4,
max_bg_regions = 500,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 0,
motif_hit_thr = None,
cluster_buster_path = None,
path_to_genome_fasta = None,
path_to_motifs = None,
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
n_cpu = 4,
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:15:22,876 DEM INFO Reading DEM database
2022-08-04 09:17:26,334 DEM INFO Creating contrast groups
(DEM_internal_ray pid=1603) 2022-08-04 09:17:33,557 DEM INFO Computing DEM for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=1605) 2022-08-04 09:17:33,648 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=1606) 2022-08-04 09:17:33,672 DEM INFO Computing DEM for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=1604) 2022-08-04 09:17:33,791 DEM INFO Computing DEM for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:17:46,089 DEM INFO Forming cistromes
2022-08-04 09:17:46,411 DEM INFO Done!
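With contrasts = 'Other', each region set is compared against regions taken from the remaining sets. The sketch below shows one way such contrast groups could be built, including the max_bg_regions subsampling. It is not pycisTarget's code: the function name, the removal of foreground regions from the background, the fixed seed and the toy region names are all assumptions.

```python
import random

def build_other_contrasts(region_sets, max_bg_regions=None, seed=555):
    """For each region set, pair its regions with a background drawn from the other sets."""
    rng = random.Random(seed)
    contrasts = {}
    for name, foreground in region_sets.items():
        fg = set(foreground)
        # Pool the regions of all other sets, dropping any that are also in the foreground.
        pooled = {r for other, regs in region_sets.items() if other != name for r in regs}
        background = sorted(pooled - fg)
        # Optionally subsample the background to at most max_bg_regions regions.
        if max_bg_regions is not None and len(background) > max_bg_regions:
            background = rng.sample(background, max_bg_regions)
        contrasts[name] = (sorted(fg), background)
    return contrasts

# Toy region sets with made-up region names.
toy_sets = {
    'Cebpa': ['r1', 'r2', 'r3'],
    'Foxa1': ['r3', 'r4', 'r5'],
    'Hnf4a': ['r6', 'r7'],
}
toy_contrasts = build_other_contrasts(toy_sets, max_bg_regions=3)
for name, (fg, bg) in toy_contrasts.items():
    print(name, 'foreground:', fg, 'background:', bg)
```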
[14]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_B.pkl', 'wb') as f:
    pickle.dump(DEM_dict, f)
C. Exploring DEM results
We can load the results for exploration.
[15]:
# Load
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_B.pkl', 'rb') as infile:
    DEM_dict = pickle.load(infile)
To visualize motif enrichment results, we can use the DEM_results() function:
[16]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[16]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
taipale_tf_pairs__ATF4_TEF_RNMTGATGCAATN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tef, Atf4 | 3.57887 | 0.000017 | 0.480499 | 0.040211 | 1.150 | 496.0 | |
taipale_tf_pairs__CEBPG_ATF4_NNATGAYGCAAT_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpg, Atf4 | 3.574535 | 0.000001 | 0.522313 | 0.043842 | 1.500 | 521.0 | |
taipale_tf_pairs__GCM1_CEBPB_MTRSGGGNNNNNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gcm1, Cebpb | 3.409684 | 0.033062 | 0.172734 | 0.016254 | 0.487 | 343.0 | |
taipale_tf_pairs__TEAD4_CEBPD_NTTRCGYAANNNNNNRGWATGY_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 3.361205 | 0.0 | 0.420022 | 0.040874 | 1.550 | 493.0 | |
tfdimers__MD00123 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | E2f1, Sox17 | 2.677306 | 0.000008 | 0.536723 | 0.083908 | 2.150 | 476.0 | |
taipale_tf_pairs__TEAD4_CEBPD_NTTRCGYAANNNNNNNRGWATGY_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 2.605569 | 0.000032 | 0.32604 | 0.053569 | 2.450 | 239.0 | |
taipale_tf_pairs__TEAD4_CEBPD_RGWATGYNNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tead4, Cebpd | 2.596006 | 0.0 | 0.64943 | 0.107413 | 0.395 | 1433.0 | |
taipale_tf_pairs__ERF_CEBPD_RSMGGAANTTGCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpd, Erf | 2.277135 | 0.034172 | 0.243821 | 0.050302 | 1.040 | 310.0 | |
tfdimers__MD00288 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hmga1b, Sry, Hmga2 | 2.157576 | 0.009955 | 0.300005 | 0.067241 | 2.420 | 235.0 | |
taipale_tf_pairs__FLI1_CEBPD_RNCGGANNTTGCGCAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Fli1, Cebpd | 2.149682 | 0.000405 | 0.325289 | 0.073308 | 1.290 | 386.0 | |
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 2.124851 | 0.0 | 2.547318 | 0.584035 | 2.380 | 2299.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Hlf, Cebpd, Cebpe, Cebpg, Cebpb, Cebpa | Hes2, Cebpe, Cebpd, Ep300, Cebpg, Gatad2a, Dbp, Cebpb, Cebpa | 2.058157 | 0.0 | 2.910528 | 0.698883 | 1.980 | 3165.0 | |
taipale_tf_pairs__ETV5_CEBPD_NSCGGANNTTRCGYAAN_CAP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Etv5, Cebpd | 1.986943 | 0.0 | 0.600852 | 0.151579 | 0.903 | 949.0 | |
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 1.885549 | 0.0 | 1.106676 | 0.299512 | 1.800 | 1293.0 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 1.869025 | 0.0 | 1.894034 | 0.518508 | 1.170 | 2173.0 | |
taipale_tf_pairs__ETV2_CEBPD_RSCGGANNTTGCGYAAN_CAP_repr | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Etv2, Cebpd | 1.676178 | 0.0 | 0.64632 | 0.20224 | 1.330 | 817.0 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 1.644374 | 0.0 | 1.911942 | 0.611602 | 0.887 | 2782.0 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpg, Atf4 | Ddit3, Atf3, Cebpg, Atf4, Myc | 1.639006 | 0.0 | 1.453472 | 0.466677 | 1.900 | 1389.0 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf, Tef | 1.586788 | 0.0 | 1.297947 | 0.432102 | 1.550 | 1661.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.483739 | 0.0 | 2.501856 | 0.894566 | 2.160 | 2614.0 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Dbp, Hlf, Tef, Nfil3 | Gm4125, Hlf, Tef, Dbp, Nfil3 | 1.436273 | 0.0 | 1.31915 | 0.487453 | 1.330 | 1852.0 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.434853 | 0.0 | 2.007923 | 0.742699 | 1.810 | 2326.0 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpa | Ddit3 | 1.352718 | 0.0 | 1.434168 | 0.561554 | 1.480 | 1778.0 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 1.293588 | 0.0 | 1.138097 | 0.46427 | 1.040 | 1102.0 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 1.19817 | 0.0 | 2.465139 | 1.074376 | 1.570 | 3141.0 | |
tfdimers__MD00232 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Taf6, Cebpb, Tbp | 1.131818 | 0.007504 | 0.442282 | 0.201831 | 0.765 | 791.0 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 1.083923 | 0.0 | 1.799703 | 0.848999 | 1.010 | 2847.0 | |
transfac_pro__M05469 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Sall1 | 1.026501 | 0.000002 | 0.759713 | 0.372943 | 2.070 | 707.0 |
This table can also be easily exported to an HTML file:
[17]:
out_file = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/Cebpa_motif_enricment.html'
# Use a context manager so the output file is closed after writing
with open(out_file, 'w') as f:
    DEM_dict.motif_enrichment['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].to_html(f, escape=False, col_space=80)
You can also access the regions enriched for each motif. You will find two entries in motif_hits (and similarly in cistromes): ‘Region_set’ contains the coordinates as in the input regions, and ‘Database’ contains the coordinates as in the database:
[18]:
DEM_dict.motif_hits['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['homer__ATTGCGCAAC_CEBP'][0:10]
[18]:
['chr4:53196410-53196911',
'chr9:95477249-95477750',
'chr17:53580191-53580692',
'chr1:106267982-106268483',
'chr5:99283569-99284070',
'chr8:22054603-22055104',
'chr5:102537694-102538195',
'chr4:48132714-48133215',
'chr4:156124035-156124536',
'chr15:59643719-59644220']
To access cistromes (only available if motifs have been annotated):
[19]:
DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(3360r)'][0:10]
[19]:
['chr4:53196410-53196911',
'chr9:25570286-25570787',
'chr9:95477249-95477750',
'chr1:106267982-106268483',
'chr5:99283569-99284070',
'chr4:76344051-76344552',
'chr8:22054603-22055104',
'chr17:53580191-53580692',
'chr5:102537694-102538195',
'chr12:7978369-7978870']
What is the length of this cistrome? We will compare how this changes with different settings below:
[20]:
len(DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(3360r)'])
[20]:
3360
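Because the cistrome name embeds the number of regions (for example 'Cebpa_(3360r)'), the key changes whenever the settings change, so hard-coding it is fragile. A small sketch for looking up the key by TF name is shown below. `find_cistrome_keys` is a hypothetical helper, and the 'Cebpb_(100r)' entry in the example list is made up for illustration.

```python
import re

def find_cistrome_keys(cistrome_names, tf_name):
    """Return {cistrome key: number of regions} for keys of the form '<TF>_(<n>r)'."""
    pattern = re.compile(r'^' + re.escape(tf_name) + r'_\((\d+)r\)$')
    found = {}
    for key in cistrome_names:
        match = pattern.match(key)
        if match:
            found[key] = int(match.group(1))
    return found

# 'Cebpa_(3360r)' is the key used above; 'Cebpb_(100r)' is a made-up example.
example_keys = ['Cebpa_(3360r)', 'Cebpb_(100r)']
cebpa_keys = find_cistrome_keys(example_keys, 'Cebpa')
print(cebpa_keys)
```

In practice, the list of names would come from the keys of the cistromes dictionary for the region set of interest.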
You can easily export cistromes to a bed file:
[21]:
from pycistarget.utils import *
cebpa_cistrome = DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(3360r)']
cebpa_cistrome_pr = pr.PyRanges(region_names_to_coordinates(cebpa_cistrome))
cebpa_cistrome_pr.to_bed(path='/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/cebpa_cistrome_example.bed')
D. Advanced usage
1. Thresholding on the mean foreground signal
Above you may have noticed some motifs with high Log2FC values but low signal in both foreground and background. To avoid them, you can set a threshold on the mean CRM value in the foreground with mean_fg_thr. Here we will set it to 1:
[8]:
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = 'Other',
name = 'DEM',
fraction_overlap = 0.4,
max_bg_regions = 500,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 1,
motif_hit_thr = None,
n_cpu = 4,
cluster_buster_path = None,
path_to_genome_fasta = None,
path_to_motifs = None,
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:18:24,202 DEM INFO Reading DEM database
2022-08-04 09:18:47,321 DEM INFO Creating contrast groups
(DEM_internal_ray pid=3083) 2022-08-04 09:18:54,925 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=3084) 2022-08-04 09:18:54,911 DEM INFO Computing DEM for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=3085) 2022-08-04 09:18:54,941 DEM INFO Computing DEM for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=3086) 2022-08-04 09:18:55,092 DEM INFO Computing DEM for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:19:07,185 DEM INFO Forming cistromes
2022-08-04 09:19:07,446 DEM INFO Done!
You will observe now that these motifs are gone:
[23]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[23]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 2.124851 | 0.0 | 2.547318 | 0.584035 | 2.380 | 2299.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Hlf, Cebpd, Cebpe, Cebpg, Cebpb, Cebpa | Hes2, Cebpe, Cebpd, Ep300, Cebpg, Gatad2a, Dbp, Cebpb, Cebpa | 2.058157 | 0.0 | 2.910528 | 0.698883 | 1.980 | 3165.0 | |
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 1.885549 | 0.0 | 1.106676 | 0.299512 | 1.800 | 1293.0 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 1.869025 | 0.0 | 1.894034 | 0.518508 | 1.170 | 2173.0 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 1.644374 | 0.0 | 1.911942 | 0.611602 | 0.887 | 2782.0 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpg, Atf4 | Ddit3, Atf3, Cebpg, Atf4, Myc | 1.639006 | 0.0 | 1.453472 | 0.466677 | 1.900 | 1389.0 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf, Tef | 1.586788 | 0.0 | 1.297947 | 0.432102 | 1.550 | 1661.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.483739 | 0.0 | 2.501856 | 0.894566 | 2.160 | 2614.0 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Dbp, Hlf, Tef, Nfil3 | Gm4125, Hlf, Tef, Dbp, Nfil3 | 1.436273 | 0.0 | 1.31915 | 0.487453 | 1.330 | 1852.0 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.434853 | 0.0 | 2.007923 | 0.742699 | 1.810 | 2326.0 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpa | Ddit3 | 1.352718 | 0.0 | 1.434168 | 0.561554 | 1.480 | 1778.0 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 1.293588 | 0.0 | 1.138097 | 0.46427 | 1.040 | 1102.0 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 1.19817 | 0.0 | 2.465139 | 1.074376 | 1.570 | 3141.0 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 1.083923 | 0.0 | 1.799703 | 0.848999 | 1.010 | 2847.0 |
The Cebpa cistrome has the same length:
[24]:
len(DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(3360r)'])
[24]:
3360
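The effect of mean_fg_thr can be illustrated with a small stand-alone filter. The sketch below is not pycisTarget code. It assumes that Log2FC is the log2 ratio of Mean_fg to Mean_bg, which agrees with the table above (for example, log2(2.547318 / 0.584035) is about 2.12), and that thresholds are applied with >=. The motif hypothetical_low_signal is invented to represent a motif with a high fold change but low signal.

```python
import math

# (motif, Mean_fg, Mean_bg): the first two rows come from the table above,
# the third is a made-up low-signal motif used only for illustration.
motifs = [
    ("homer__ATTGCGCAAC_CEBP", 2.547318, 0.584035),
    ("transfac_pro__M00621", 1.799703, 0.848999),
    ("hypothetical_low_signal", 0.20, 0.02),
]


def filter_motifs(rows, log2fc_thr=1.0, mean_fg_thr=0.0):
    """Keep motifs passing both the fold-change and the foreground-mean cutoffs."""
    kept = []
    for name, mean_fg, mean_bg in rows:
        log2fc = math.log2(mean_fg / mean_bg)
        if log2fc >= log2fc_thr and mean_fg >= mean_fg_thr:
            kept.append((name, log2fc))
    return kept


without_fg_filter = filter_motifs(motifs, mean_fg_thr=0.0)
with_fg_filter = filter_motifs(motifs, mean_fg_thr=1.0)

print([name for name, _ in without_fg_filter])
print([name for name, _ in with_fg_filter])
```

The invented motif has a tenfold enrichment, so it passes the Log2FC cutoff, but it is dropped once the mean foreground score must reach 1.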
And save this object:
[25]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_D1.pkl', 'wb') as f:
pickle.dump(DEM_dict, f)
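A saved object can be read back with pickle.load. The round trip below is a minimal sketch that uses a plain dictionary and a temporary directory as stand-ins (both are assumptions for illustration); loading a real DEM object additionally requires pycistarget to be importable in the session, since pickle needs the class definition to rebuild the object.

```python
import os
import pickle
import tempfile

# Stand-in for a DEM result object (hypothetical content)
results = {"name": "DEM", "n_regions": 3360}

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = os.path.join(tmp, "DEM_dict_example.pkl")

    # Save
    with open(pkl_path, "wb") as f:
        pickle.dump(results, f)

    # Load
    with open(pkl_path, "rb") as f:
        reloaded = pickle.load(f)

print(reloaded)
```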
2. Using a fixed threshold for the motif hits
You may have also noticed that cistromes are larger than those obtained with Homer or cisTarget. Their size largely depends on your background (a cistrome is formed by the regions that are more enriched for that motif than the background). You can also set a fixed threshold for calling a motif hit with motif_hit_thr. Here we will set it to 3.
[9]:
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = 'Other',
name = 'DEM',
fraction_overlap = 0.4,
max_bg_regions = 500,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 1,
motif_hit_thr = 3,
n_cpu = 4,
cluster_buster_path = None,
path_to_genome_fasta = None,
path_to_motifs = None,
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:19:30,235 DEM INFO Reading DEM database
2022-08-04 09:19:53,620 DEM INFO Creating contrast groups
(DEM_internal_ray pid=19721) 2022-08-04 09:20:01,295 DEM INFO Computing DEM for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=19720) 2022-08-04 09:20:01,401 DEM INFO Computing DEM for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=19722) 2022-08-04 09:20:01,378 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=19723) 2022-08-04 09:20:01,544 DEM INFO Computing DEM for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:20:13,338 DEM INFO Forming cistromes
2022-08-04 09:20:13,567 DEM INFO Done!
You will notice now that the number of motif hits per motif is generally lower.
[27]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[27]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 2.124851 | 0.0 | 2.547318 | 0.584035 | 3.0 | 1940.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Hlf, Cebpd, Cebpe, Cebpg, Cebpb, Cebpa | Hes2, Cebpe, Cebpd, Ep300, Cebpg, Gatad2a, Dbp, Cebpb, Cebpa | 2.058157 | 0.0 | 2.910528 | 0.698883 | 3.0 | 2340.0 | |
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 1.885549 | 0.0 | 1.106676 | 0.299512 | 3.0 | 780.0 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 1.869025 | 0.0 | 1.894034 | 0.518508 | 3.0 | 1379.0 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 1.644374 | 0.0 | 1.911942 | 0.611602 | 3.0 | 1377.0 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpg, Atf4 | Ddit3, Atf3, Cebpg, Atf4, Myc | 1.639006 | 0.0 | 1.453472 | 0.466677 | 3.0 | 846.0 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf, Tef | 1.586788 | 0.0 | 1.297947 | 0.432102 | 3.0 | 775.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.483739 | 0.0 | 2.501856 | 0.894566 | 3.0 | 1911.0 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Dbp, Hlf, Tef, Nfil3 | Gm4125, Hlf, Tef, Dbp, Nfil3 | 1.436273 | 0.0 | 1.31915 | 0.487453 | 3.0 | 689.0 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.434853 | 0.0 | 2.007923 | 0.742699 | 3.0 | 1315.0 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Ddit3, Cebpa | Ddit3 | 1.352718 | 0.0 | 1.434168 | 0.561554 | 3.0 | 851.0 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 1.293588 | 0.0 | 1.138097 | 0.46427 | 3.0 | 778.0 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 1.19817 | 0.0 | 2.465139 | 1.074376 | 3.0 | 1736.0 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 1.083923 | 0.0 | 1.799703 | 0.848999 | 3.0 | 1110.0 |
The length of the cistromes is lower too:
[28]:
len(DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(2488r)'])
[28]:
2488
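Why a stricter threshold shrinks both the hit lists and the cistrome can be sketched as follows. All scores below are invented, the comparison score > thr is an assumption about how hits are called, and forming a cistrome as the union of the hits of all motifs annotated to a TF is a simplification for illustration rather than pycisTarget's implementation.

```python
# Hypothetical CRM scores per motif and region (values are made up)
scores = {
    "motif_A": {"chr1:100-600": 4.2, "chr2:100-600": 2.1, "chr3:100-600": 3.5},
    "motif_B": {"chr1:100-600": 1.2, "chr2:100-600": 2.8, "chr3:100-600": 0.4},
}


def motif_hits(scores, thr):
    """Regions whose score exceeds thr, listed per motif."""
    return {
        motif: sorted(region for region, s in region_scores.items() if s > thr)
        for motif, region_scores in scores.items()
    }


def cistrome(hits, motifs):
    """Union of the hit regions of all motifs linked to one TF."""
    regions = set()
    for motif in motifs:
        regions.update(hits[motif])
    return sorted(regions)


hits_lenient = motif_hits(scores, thr=2.0)
hits_strict = motif_hits(scores, thr=3.0)

cistrome_lenient = cistrome(hits_lenient, ["motif_A", "motif_B"])
cistrome_strict = cistrome(hits_strict, ["motif_A", "motif_B"])

print(len(cistrome_lenient), len(cistrome_strict))
```

With the lenient threshold every region is hit by at least one motif; with the stricter one, the region that only had moderate scores drops out of the cistrome.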
Let’s save this object:
[29]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_D2.pkl', 'wb') as f:
pickle.dump(DEM_dict, f)
3. Using a shuffled background
You may not have a background (for example, if you only have a single ChIP-seq experiment). In that case you can use shuffled regions (from your input) as background by setting contrasts to ‘Shuffle’. You will need to have Cluster-Buster installed to use this option.
[10]:
os.putenv('CBUST_HOME','/data/leuven/software/biomed/skylake_centos7/2018a/software/Cluster-Buster/20220421-GCCcore-6.4.0')
os.environ["PATH"] += os.pathsep + '/data/leuven/software/biomed/skylake_centos7/2018a/software/Cluster-Buster/20220421-GCCcore-6.4.0/bin:'
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = 'Shuffle',
name = 'DEM',
max_bg_regions = 100,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 2.5, #You may need to increase the detection threshold here, otherwise you may see a lot of G repeats
n_cpu = 4,
fraction_overlap = 0.4,
cluster_buster_path = '/data/leuven/software/biomed/skylake_centos7/2018a/software/Cluster-Buster/20220421-GCCcore-6.4.0/bin/cbust',
path_to_genome_fasta = '/staging/leuven/res_00001/genomes/mus_musculus/mm10_ucsc/fasta/mm10.fa',
path_to_motifs = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/cluster_buster/',
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:20:27,317 DEM INFO Reading DEM database
2022-08-04 09:20:52,439 DEM INFO Creating contrast groups
2022-08-04 09:20:52,443 DEM INFO Generating and scoring shuffled background
2022-08-04 09:20:58,295 Cluster-Buster INFO Scoring sequences
2022-08-04 09:22:07,487 Cluster-Buster INFO Done!
2022-08-04 09:22:07,543 DEM INFO Generating and scoring shuffled background
2022-08-04 09:22:12,910 Cluster-Buster INFO Scoring sequences
2022-08-04 09:22:41,457 Cluster-Buster INFO Done!
2022-08-04 09:22:41,512 DEM INFO Generating and scoring shuffled background
2022-08-04 09:22:46,153 Cluster-Buster INFO Scoring sequences
2022-08-04 09:23:12,511 Cluster-Buster INFO Done!
2022-08-04 09:23:12,567 DEM INFO Generating and scoring shuffled background
2022-08-04 09:23:15,634 Cluster-Buster INFO Scoring sequences
2022-08-04 09:23:43,120 Cluster-Buster INFO Done!
(DEM_internal_ray pid=23238) 2022-08-04 09:23:50,847 DEM INFO Computing DEM for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=23242) 2022-08-04 09:23:50,912 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=23240) 2022-08-04 09:23:50,956 DEM INFO Computing DEM for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=23239) 2022-08-04 09:23:51,128 DEM INFO Computing DEM for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:24:04,530 DEM INFO Forming cistromes
2022-08-04 09:24:04,839 DEM INFO Done!
Let’s see the results now:
[11]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[11]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 2.532608 | 0.0 | 2.547322 | 0.440243 | 1.56 | 2771.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg, Cebpa, Cebpb, Cebpd, Cebpe, Hlf | Ep300, Cebpg, Cebpa, Cebpb, Hes2, Gatad2a, Cebpd, Dbp, Cebpe | 2.422879 | 0.0 | 2.910528 | 0.542766 | 1.60 | 3416.0 | |
transfac_pro__M12588 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Ddit3 | 1.860392 | 0.0 | 2.645379 | 0.728541 | 1.43 | 3269.0 | |
transfac_pro__M09737 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Zfp644 | 1.824506 | 0.0 | 2.768588 | 0.781677 | 1.88 | 2792.0 | |
swissregulon__hs__EZH2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Ezh2 | 1.775441 | 0.0 | 2.676924 | 0.781943 | 1.58 | 3074.0 | |
hocomoco__SMAD3_HUMAN.H11MO.0.B | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Smad3 | 1.712154 | 0.0 | 2.745296 | 0.837876 | 2.06 | 2667.0 | |
transfac_pro__M12659 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Rora | 1.707419 | 0.0 | 2.525895 | 0.773448 | 1.31 | 3244.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.639925 | 0.0 | 2.501861 | 0.80278 | 1.41 | 3217.0 | |
transfac_pro__M01721 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Pura | NaN | 1.609624 | 0.0 | 2.591059 | 0.849048 | 1.55 | 3244.0 | |
swissregulon__hs__CUX1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cux1 | 1.588562 | 0.0 | 2.901514 | 0.964761 | 1.74 | 3205.0 | |
transfac_pro__M09729 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Creb3 | 1.524944 | 0.0 | 2.594553 | 0.901589 | 2.39 | 2112.0 | |
transfac_pro__M12689 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Zbtb39 | 1.392545 | 0.0 | 3.151474 | 1.200372 | 2.25 | 2946.0 | |
transfac_pro__M09763 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Nfic | 1.387777 | 0.0 | 2.675388 | 1.022408 | 1.75 | 2961.0 | |
transfac_pro__M12352 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Zbtb44 | 1.345064 | 0.0 | 2.532602 | 0.996925 | 1.34 | 3376.0 | |
hocomoco__HAND1_HUMAN.H11MO.1.D | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hand1 | 1.243063 | 0.0 | 2.755776 | 1.164246 | 2.42 | 2452.0 | |
transfac_pro__M09746 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hnrnpul1 | 1.166264 | 0.0 | 2.930853 | 1.305911 | 2.32 | 2685.0 | |
transfac_pro__M09726 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Gatad2a | 1.07463 | 0.0 | 2.907862 | 1.380632 | 2.13 | 2904.0 |
In this case the Cebpa cistrome is slightly larger (3378 regions, compared with 3360 with the default settings):
[13]:
len(DEM_dict.cistromes['Region_set']['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K']['Cebpa_(3378r)'])
[13]:
3378
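The idea behind a shuffled background is that it keeps the base composition of the input while scrambling any motif instances. A minimal sketch of one such scheme, a simple permutation of the nucleotides of each sequence, is shown below. Whether pycisTarget shuffles in exactly this way is an assumption (other schemes, such as dinucleotide-preserving shuffles, exist); the helper name permute_nucleotides and the example sequence are hypothetical.

```python
import random
from collections import Counter


def permute_nucleotides(seq, seed=None):
    """Return a random permutation of the nucleotides in seq."""
    rng = random.Random(seed)
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


# Made-up foreground sequence containing two copies of a CEBP-like site
foreground = "ATTGCGCAACGGTTACGATTGCGCAAC"
background = permute_nucleotides(foreground, seed=0)

# Same length and base composition, different order
print(Counter(foreground) == Counter(background))
```

Because composition is preserved, low-complexity signal can survive shuffling, which is in line with the comment in the call above about raising mean_fg_thr to avoid G repeats.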
Let’s save this object:
[15]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_D3.pkl', 'wb') as f:
pickle.dump(DEM_dict, f)
4. Specifying contrasts
You may want to make specific contrasts between region sets. You can do this by passing a list to contrasts (each element is one contrast: its first slot is the foreground and its second slot the background). For example, here we will perform two contrasts: 1) Cebpa versus Onecut1 and 2) Cebpa versus Onecut1 and Hnf4a.
[16]:
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = [[['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'], ['Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K']], [['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'], ['Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K', 'Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K']]],
name = 'DEM',
fraction_overlap = 0.4,
max_bg_regions = 500,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 1,
motif_hit_thr = 3,
n_cpu = 4,
cluster_buster_path = None,
path_to_genome_fasta = None,
path_to_motifs = None,
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:24:59,453 DEM INFO Reading DEM database
2022-08-04 09:25:21,915 DEM INFO Creating contrast groups
(DEM_internal_ray pid=24473) 2022-08-04 09:25:29,749 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=24472) 2022-08-04 09:25:29,796 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K_Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:25:40,199 DEM INFO Forming cistromes
2022-08-04 09:25:40,293 DEM INFO Done!
Let’s see the results for the contrast against Onecut1:
[17]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K')
[17]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 2.210142 | 0.0 | 1.106676 | 0.239167 | 3.0 | 780.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpg, Cebpa, Cebpb, Cebpd, Cebpe, Hlf | Ep300, Cebpg, Cebpa, Cebpb, Hes2, Gatad2a, Cebpd, Dbp, Cebpe | 2.018376 | 0.0 | 2.910528 | 0.718423 | 3.0 | 2340.0 | |
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 1.823287 | 0.0 | 2.547322 | 0.719813 | 3.0 | 1940.0 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Tef, Hlf | 1.760758 | 0.0 | 1.297947 | 0.383015 | 3.0 | 775.0 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 1.74152 | 0.0 | 1.894033 | 0.566419 | 3.0 | 1379.0 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 1.720374 | 0.0 | 1.911941 | 0.580217 | 3.0 | 1377.0 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Atf4, Ddit3, Cebpg | Cebpg, Myc, Atf4, Ddit3, Atf3 | 1.665894 | 0.0 | 1.45347 | 0.45806 | 3.0 | 846.0 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Dbp, Tef, Nfil3, Hlf | Gm4125, Nfil3, Dbp, Tef, Hlf | 1.62854 | 0.0 | 1.319148 | 0.426633 | 3.0 | 689.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.514219 | 0.0 | 2.501861 | 0.875866 | 3.0 | 1911.0 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.40163 | 0.0 | 2.007918 | 0.76 | 3.0 | 1315.0 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpa, Ddit3 | Ddit3 | 1.395899 | 0.0 | 1.434166 | 0.544995 | 3.0 | 851.0 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 1.206516 | 0.0 | 1.138095 | 0.493152 | 3.0 | 778.0 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 1.061128 | 0.0 | 2.465138 | 1.181435 | 3.0 | 1736.0 | |
transfac_pro__M12588 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Ddit3 | 1.050354 | 0.0 | 2.645379 | 1.27732 | 3.0 | 1772.0 | |
tfdimers__MD00518 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | NaN | Pou5f1, Myb | 1.030543 | 0.0 | 1.110707 | 0.54372 | 3.0 | 642.0 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K_VS_Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 1.0292 | 0.0 | 1.799702 | 0.881821 | 3.0 | 1110.0 |
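The contrast names are long, so it can help to build them programmatically. The helper below is hypothetical (it is not part of pycisTarget); it simply reproduces the naming pattern visible in the log above, where the foreground and background set names are each joined with underscores and separated by _VS_.

```python
def contrast_label(foreground, background):
    """Rebuild the contrast name pattern seen in the DEM log."""
    return "_".join(foreground) + "_VS_" + "_".join(background)


cebpa = "Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K"
onecut1 = "Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K"
hnf4a = "Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K"

# Same structure as the contrasts argument: [foreground sets, background sets]
contrasts = [
    [[cebpa], [onecut1]],
    [[cebpa], [onecut1, hnf4a]],
]

labels = [contrast_label(fg, bg) for fg, bg in contrasts]
for label in labels:
    print(label)
```

Assuming the results are keyed by these names, as they are for the first contrast, passing labels[1] to DEM_dict.DEM_results should show the contrast against Onecut1 and Hnf4a.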
Let’s save this object:
[18]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_D4.pkl', 'wb') as f:
pickle.dump(DEM_dict, f)
5. Balancing promoter content
Finally, it is possible to balance the proportion of promoters between foreground and background, to avoid an overrepresentation of promoter sequence signal. You only need to provide the promoter annotation.
[34]:
# Retrieve promoter annotation from biomart
import pybiomart as pbm
promoter_space = 500
dataset = pbm.Dataset(name='mmusculus_gene_ensembl', host='http://nov2020.archive.ensembl.org/')
annot = dataset.query(attributes=['chromosome_name', 'transcription_start_site', 'strand', 'external_gene_name', 'transcript_biotype'])
annot.columns = ['Chromosome', 'Start', 'Strand', 'Gene', 'Transcript_type']
annot['Chromosome'] = annot['Chromosome'].astype('str')
filterf = annot['Chromosome'].str.contains('CHR|GL|JH|MT')
annot = annot[~filterf]
annot['Chromosome'] = annot['Chromosome'].str.replace(r'(\b\S)', r'chr\1', regex=True)  # regex=True is required in recent pandas versions
annot = annot[annot.Transcript_type == 'protein_coding']
annot = annot.dropna(subset = ['Chromosome', 'Start'])
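To make the role of promoter_space concrete, the sketch below flags regions that overlap a promoter window. It assumes that a promoter is defined as the TSS plus and minus promoter_space base pairs, which is an assumption about how the annotation is used rather than a statement about pycisTarget's internals; the helper names, TSS positions and regions are all made up.

```python
def promoter_window(tss, promoter_space=500):
    """Window of promoter_space bp on each side of the TSS (clipped at 0)."""
    return max(0, tss - promoter_space), tss + promoter_space


def overlaps_promoter(region, tss_by_chrom, promoter_space=500):
    """True if (chrom, start, end) overlaps any promoter window on its chromosome."""
    chrom, start, end = region
    for tss in tss_by_chrom.get(chrom, []):
        p_start, p_end = promoter_window(tss, promoter_space)
        if start < p_end and end > p_start:
            return True
    return False


# Hypothetical TSS positions and regions
tss_by_chrom = {"chr1": [10_000], "chr2": [50_000]}
regions = [
    ("chr1", 9_800, 10_300),   # spans the chr1 TSS
    ("chr1", 20_000, 20_500),  # far from any TSS
    ("chr2", 50_400, 50_900),  # overlaps the edge of the chr2 window
]

promoter_flags = [overlaps_promoter(r, tss_by_chrom) for r in regions]
promoter_fraction = sum(promoter_flags) / len(regions)
print(promoter_flags, round(promoter_fraction, 2))
```

Comparing this fraction between a foreground and a candidate background gives a quick impression of how unbalanced the promoter content is before any balancing is applied.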
[39]:
DEM_dict = DEM(dem_db = '/staging/leuven/stg_00002/icistarget-data/make_rankings/v10_clust/CTX_mm10/CTX_mm10_SCREEN3_no_bg_with_mask/CTX_mm10_SCREEN3_no_bg_with_mask.regions_vs_motifs.scores.v2.feather',
region_sets = region_sets,
specie = 'mus_musculus',
contrasts = 'Other',
name = 'DEM',
fraction_overlap = 0.4,
max_bg_regions = 500,
adjpval_thr = 0.05,
log2fc_thr = 1,
mean_fg_thr = 1,
motif_hit_thr = None,
genome_annotation= annot, # Add genome_annotation
promoter_space = 500,
cluster_buster_path = None,
path_to_genome_fasta = None,
path_to_motifs = None,
annotation_version = 'v10nr_clust',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/cbravo/cluster_motif_collection_V10_no_desso_no_factorbook/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
motif_annotation = ['Direct_annot', 'Orthology_annot'],
n_cpu = 4,
tmp_dir = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget/tmp',
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
2022-08-04 09:39:55,535 DEM INFO Reading DEM database
2022-08-04 09:40:17,093 DEM INFO Creating contrast groups
(DEM_internal_ray pid=33197) 2022-08-04 09:40:26,597 DEM INFO Computing DEM for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=33194) 2022-08-04 09:40:26,722 DEM INFO Computing DEM for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=33195) 2022-08-04 09:40:26,721 DEM INFO Computing DEM for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(DEM_internal_ray pid=33196) 2022-08-04 09:40:26,824 DEM INFO Computing DEM for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
2022-08-04 09:40:37,990 DEM INFO Forming cistromes
2022-08-04 09:40:38,190 DEM INFO Done!
Let’s see the results now:
[40]:
DEM_dict.DEM_results('Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K')
[40]:
Motif | Contrast | Direct_annot | Orthology_annot | Log2FC | Adjusted_pval | Mean_fg | Mean_bg | Motif_hit_thr | Motif_hits | |
---|---|---|---|---|---|---|---|---|---|---|
dbtfbs__HLF_HepG2_ENCSR528PSI_merged_N1 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hlf | 2.195518 | 0.0 | 1.106676 | 0.241604 | 2.250 | 1058.0 | |
homer__ATTGCGCAAC_CEBP | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpb | NaN | 2.173246 | 0.0 | 2.547322 | 0.56477 | 2.400 | 2284.0 | |
metacluster_46.4 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg, Cebpa, Cebpb, Cebpd, Cebpe, Hlf | Ep300, Cebpg, Cebpa, Cebpb, Hes2, Gatad2a, Cebpd, Dbp, Cebpe | 2.116813 | 0.0 | 2.910528 | 0.671039 | 1.900 | 3225.0 | |
transfac_pro__M04761 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Hsf1 | 1.891702 | 0.0 | 1.911941 | 0.515247 | 1.060 | 2598.0 | |
swissregulon__hs__CEBPB | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Cebpb | 1.859936 | 0.0 | 1.894033 | 0.521784 | 2.150 | 1760.0 | |
metacluster_46.5 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Tef, Hlf | 1.847229 | 0.0 | 1.297947 | 0.360733 | 2.030 | 1364.0 | |
metacluster_156.3 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Dbp, Tef, Nfil3, Hlf | Gm4125, Nfil3, Dbp, Tef, Hlf | 1.667336 | 0.0 | 1.319148 | 0.415313 | 0.984 | 2184.0 | |
metacluster_156.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Atf4, Ddit3, Cebpg | Cebpg, Myc, Atf4, Ddit3, Atf3 | 1.575318 | 0.0 | 1.45347 | 0.48774 | 1.680 | 1523.0 | |
cisbp__M01815 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.504281 | 0.0 | 2.501861 | 0.88192 | 1.830 | 2843.0 | |
swissregulon__mm__Cebpe | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpe | NaN | 1.482787 | 0.0 | 2.007918 | 0.718427 | 1.650 | 2475.0 | |
transfac_pro__M04829 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Stat3 | 1.471578 | 0.0 | 1.138095 | 0.410383 | 0.715 | 1557.0 | |
metacluster_46.2 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpa, Ddit3 | Ddit3 | 1.37845 | 0.0 | 1.434166 | 0.551626 | 0.479 | 2922.0 | |
transfac_pro__M01869 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpg | NaN | 1.172004 | 0.0 | 2.465138 | 1.094039 | 2.310 | 2403.0 | |
transfac_pro__M12588 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | NaN | Ddit3 | 1.073521 | 0.0 | 2.645379 | 1.256973 | 2.270 | 2430.0 | |
transfac_pro__M00621 | Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K | Cebpd | NaN | 1.047998 | 0.0 | 1.799702 | 0.870406 | 1.220 | 2576.0 |
Let’s save this object:
[41]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/DEM/DEM_dict_D5.pkl', 'wb') as f:
pickle.dump(DEM_dict, f)
3. Homer
First we need to load the functions needed for Homer:
[42]:
# Load homer functions
from pycistarget.motif_enrichment_homer import *
A. Running Homer
For running Homer there are some relevant parameters:
homer_path: Path to the Homer executables. Homer also has to be accessible from the PATH of the Python session.
region_sets: The input sets of regions
outdir: Output directory
genome: Genome assembly (equivalent to the genome parameter in Homer). Several species and genomes are supported, including human (hg18, hg19, hg38) and mouse (mm8, mm9, mm10), among others. Alternatively, it can be a path to custom genome fasta files.
size: Fragment size to use for motif finding (by default, ‘given’, which is the whole region).
mask: Whether to mask repeat regions
denovo: Whether to perform de novo motif discovery. This will increase the running time considerably. If running de novo motif enrichment, you can use meme with a motif collection of interest to identify potential TFs linked to de novo motifs. If False, Homer will only be run for known motifs.
length: Motif length for the de novo motif discovery.
n_cpu: Number of cores to use
meme_path: Path to the MEME executables. MEME also has to be accessible from the PATH of the Python session.
meme_collection_path : Path to the motif collection in meme format. We recommend using the cisTarget motif collection.
annotation_version : Motif collection version. Here we use the unclustered v10 database (‘v10’).
path_to_motif_annotations : File with motif annotations. These files are available at https://resources.aertslab.org/cistarget/motif2tf .
cistrome_annotation : Annotations to assign motifs to TFs (direct, and/or by motif similarity or orthology)
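Several of these parameters map directly onto the findMotifsGenome.pl command line that appears in the run_homer log further down. The helper below is a hypothetical sketch that rebuilds that command for the settings used in this tutorial (size='given', mask=True, length='8,10,12'); it is not the wrapper's own code, it only covers the options visible in that log, and the paths are placeholders.

```python
import os


def build_homer_command(homer_path, bed_path, genome, out_path,
                        size="given", length="8,10,12", mask=True):
    """Assemble the findMotifsGenome.pl argument list for one region set."""
    cmd = [
        os.path.join(homer_path, "findMotifsGenome.pl"),
        bed_path,
        genome,
        out_path,
        "-preparsedDir", out_path,
        "-size", str(size),
        "-len", length,
    ]
    if mask:
        cmd.append("-mask")
    cmd.append("-keepFiles")
    return cmd


# Placeholder paths, for illustration only
cmd = build_homer_command(
    "/path/to/homer/bin/",
    "/path/to/outdir/regions_bed/Cebpa.bed",
    "mm10",
    "/path/to/outdir/Cebpa",
)
print(" ".join(cmd))
```

Building the command as a list rather than a single string makes it straightforward to hand over to subprocess.run without shell quoting issues.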
[43]:
# Set correct path to run HOMER
import os
os.putenv('HOMER_HOME','/data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a')
os.environ["PATH"] += os.pathsep + '/data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin:'
homer_path='/data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin/'
# Choose the output directory for the results
outdir='/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/'
# Select your genome
genome='mm10'
# Set correct path to MEME for de novo motif annotation - Only needed if using de novo annotation!
# We have tomtom installed in our image, so we don't need to add additional paths
meme_collection_path = '/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/scenicplus_motif_collection.meme'
meme_path='/opt/meme/bin/'
# Run
homer_dict=run_homer(homer_path,
region_sets,
outdir,
genome,
size='given',
mask=True,
denovo=True,
length='8,10,12',
n_cpu=4,
meme_path = meme_path,
meme_collection_path = meme_collection_path,
annotation_version = 'v10',
path_to_motif_annotations = '/staging/leuven/stg_00002/lcb/icistarget/data/motif2tf_project/motif_to_tf_db_data/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl',
cistrome_annotation = ['Direct_annot', 'Orthology_annot'],
_temp_dir='/scratch/leuven/313/vsc31305/ray_spill')
(homer_ray pid=33838) 2022-08-04 09:41:08,646 Homer INFO Running Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(homer_ray pid=33838) 2022-08-04 09:41:08,647 Homer INFO Running Homer for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K with /data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin/findMotifsGenome.pl /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/regions_bed/Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K.bed mm10 /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K -preparsedDir /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K -size given -len 8,10,12 -mask -keepFiles
(homer_ray pid=33839) 2022-08-04 09:41:08,790 Homer INFO Running Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K
(homer_ray pid=33839) 2022-08-04 09:41:08,791 Homer INFO Running Homer for Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K with /data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin/findMotifsGenome.pl /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/regions_bed/Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K.bed mm10 /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K -preparsedDir /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Hnf4a_ERR235763_summits_order_by_score_extended_250bp_top5K -size given -len 8,10,12 -mask -keepFiles
(homer_ray pid=33840) 2022-08-04 09:41:08,786 Homer INFO Running Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K
(homer_ray pid=33840) 2022-08-04 09:41:08,786 Homer INFO Running Homer for Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K with /data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin/findMotifsGenome.pl /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/regions_bed/Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K.bed mm10 /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K -preparsedDir /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Foxa1_ERR235786_summits_order_by_score_extended_250bp_top5K -size given -len 8,10,12 -mask -keepFiles
(homer_ray pid=33841) 2022-08-04 09:41:08,879 Homer INFO Running Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K
(homer_ray pid=33841) 2022-08-04 09:41:08,880 Homer INFO Running Homer for Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K with /data/leuven/software/biomed/haswell_centos7/2018a/software/HOMER/4.10.4-foss-2018a/bin/findMotifsGenome.pl /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/regions_bed/Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K.bed mm10 /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K -preparsedDir /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K -size given -len 8,10,12 -mask -keepFiles
(homer_ray pid=33838) 2022-08-04 10:38:13,434 Homer INFO Annotating motifs for Onecut1_ERR235752_summits_order_by_score_extended_250bp_top5K
(homer_ray pid=33838) 2022-08-04 10:38:13,434 Homer INFO Annotating known motifs
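The log above shows how the `run_homer` arguments end up in the HOMER command line: `size`, `length` and `mask` become `-size`, `-len` and `-mask`, and the per-region-set output folder is also used as `-preparsedDir`. As a rough sketch of that mapping, the hypothetical helper below rebuilds a command of the same shape. It is inferred from the logged commands only and is not pycisTarget's actual implementation; the paths in the usage lines are placeholders.

```python
import os
import shlex


def build_homer_command(homer_path, bed_path, genome, out_dir,
                        size="given", length="8,10,12", mask=True):
    """Assemble a findMotifsGenome.pl call shaped like the logged ones.

    Hypothetical helper for illustration; the argument order follows the log.
    """
    cmd = [
        os.path.join(homer_path, "findMotifsGenome.pl"),
        bed_path,                  # regions to analyse (BED file)
        genome,                    # genome assembly, e.g. 'mm10'
        out_dir,                   # HOMER output folder for this region set
        "-preparsedDir", out_dir,  # the log reuses the output folder here
        "-size", str(size),
        "-len", length,
    ]
    if mask:
        cmd.append("-mask")
    cmd.append("-keepFiles")
    return cmd


# Placeholder paths, for illustration only
cmd = build_homer_command("/path/to/homer/bin", "regions_bed/Cebpa.bed",
                          "mm10", "Homer/Cebpa")
print(shlex.join(cmd))
```

Keeping the command as a list (rather than a single string) makes it safe to pass to `subprocess.run` without shell quoting issues; `shlex.join` is used only to print it in a readable form.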
[44]:
# Save
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Homer_dict.pkl', 'wb') as f:
    pickle.dump(homer_dict, f)
B. Exploring Homer results
We can load the results for exploration.
[4]:
# Load
import pickle
with open('/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial_old/pycistarget_tutorial/Homer/Homer_dict.pkl', 'rb') as infile:
    homer_dict = pickle.load(infile)
To visualize motif enrichment results, we can use the homer_results() function:
[45]:
homer_results(homer_dict, 'Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K', results='known')
[45]:
Homer Known Motif Enrichment Results (/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K)
Total Target Sequences = 4595, Total Background Sequences = 44731
Rank | Name | P-value | log P-pvalue | q-value (Benjamini) | # Target Sequences with Motif | % of Targets Sequences with Motif | # Background Sequences with Motif | % of Background Sequences with Motif | Motif File | SVG |
1 | CEBP(bZIP)/ThioMac-CEBPb-ChIP-Seq(GSE21512)/Homer | 1e-1326 | -3.054e+03 | 0.0000 | 2719.0 | 59.17% | 5045.7 | 11.28% | motif file (matrix) | svg | |
2 | HLF(bZIP)/HSC-HLF.Flag-ChIP-Seq(GSE69817)/Homer | 1e-669 | -1.541e+03 | 0.0000 | 2216.0 | 48.23% | 6249.0 | 13.97% | motif file (matrix) | svg | |
3 | NFIL3(bZIP)/HepG2-NFIL3-ChIP-Seq(Encode)/Homer | 1e-646 | -1.489e+03 | 0.0000 | 1926.0 | 41.92% | 4777.9 | 10.68% | motif file (matrix) | svg | |
4 | CEBP:AP1(bZIP)/ThioMac-CEBPb-ChIP-Seq(GSE21512)/Homer | 1e-532 | -1.227e+03 | 0.0000 | 2013.0 | 43.81% | 6161.9 | 13.78% | motif file (matrix) | svg | |
5 | Atf4(bZIP)/MEF-Atf4-ChIP-Seq(GSE35681)/Homer | 1e-335 | -7.727e+02 | 0.0000 | 947.0 | 20.61% | 2008.9 | 4.49% | motif file (matrix) | svg | |
6 | Chop(bZIP)/MEF-Chop-ChIP-Seq(GSE35681)/Homer | 1e-217 | -5.017e+02 | 0.0000 | 684.0 | 14.89% | 1569.1 | 3.51% | motif file (matrix) | svg | |
7 | PPARa(NR),DR1/Liver-Ppara-ChIP-Seq(GSE47954)/Homer | 1e-94 | -2.176e+02 | 0.0000 | 1548.0 | 33.69% | 9195.6 | 20.56% | motif file (matrix) | svg | |
8 | HNF4a(NR),DR1/HepG2-HNF4a-ChIP-Seq(GSE25021)/Homer | 1e-86 | -1.996e+02 | 0.0000 | 875.0 | 19.04% | 4230.7 | 9.46% | motif file (matrix) | svg | |
9 | RARa(NR)/K562-RARa-ChIP-Seq(Encode)/Homer | 1e-82 | -1.908e+02 | 0.0000 | 2930.0 | 63.76% | 22170.5 | 49.58% | motif file (matrix) | svg | |
10 | COUP-TFII(NR)/K562-NR2F1-ChIP-Seq(Encode)/Homer | 1e-82 | -1.893e+02 | 0.0000 | 1820.0 | 39.61% | 11860.6 | 26.52% | motif file (matrix) | svg | |
11 | Erra(NR)/HepG2-Erra-ChIP-Seq(GSE31477)/Homer | 1e-77 | -1.791e+02 | 0.0000 | 2441.0 | 53.12% | 17634.5 | 39.44% | motif file (matrix) | svg | |
12 | Atf1(bZIP)/K562-ATF1-ChIP-Seq(GSE31477)/Homer | 1e-75 | -1.727e+02 | 0.0000 | 988.0 | 21.50% | 5313.0 | 11.88% | motif file (matrix) | svg | |
13 | RXR(NR),DR1/3T3L1-RXR-ChIP-Seq(GSE13511)/Homer | 1e-70 | -1.612e+02 | 0.0000 | 1508.0 | 32.82% | 9607.9 | 21.49% | motif file (matrix) | svg | |
14 | PPARE(NR),DR1/3T3L1-Pparg-ChIP-Seq(GSE13511)/Homer | 1e-69 | -1.606e+02 | 0.0000 | 1324.0 | 28.81% | 8093.4 | 18.10% | motif file (matrix) | svg | |
15 | COUP-TFII(NR)/Artia-Nr2f2-ChIP-Seq(GSE46497)/Homer | 1e-60 | -1.396e+02 | 0.0000 | 1942.0 | 42.26% | 13734.6 | 30.71% | motif file (matrix) | svg | |
16 | THRb(NR)/Liver-NR1A2-ChIP-Seq(GSE52613)/Homer | 1e-60 | -1.383e+02 | 0.0000 | 3476.0 | 75.65% | 28793.7 | 64.39% | motif file (matrix) | svg | |
17 | EAR2(NR)/K562-NR2F6-ChIP-Seq(Encode)/Homer | 1e-57 | -1.319e+02 | 0.0000 | 1612.0 | 35.08% | 10954.3 | 24.50% | motif file (matrix) | svg | |
18 | FOXA1(Forkhead)/LNCAP-FOXA1-ChIP-Seq(GSE27824)/Homer | 1e-56 | -1.290e+02 | 0.0000 | 1531.0 | 33.32% | 10302.8 | 23.04% | motif file (matrix) | svg | |
19 | Foxa2(Forkhead)/Liver-Foxa2-ChIP-Seq(GSE25694)/Homer | 1e-55 | -1.282e+02 | 0.0000 | 1140.0 | 24.81% | 7047.0 | 15.76% | motif file (matrix) | svg | |
20 | FOXM1(Forkhead)/MCF7-FOXM1-ChIP-Seq(GSE72977)/Homer | 1e-55 | -1.273e+02 | 0.0000 | 1361.0 | 29.62% | 8884.2 | 19.87% | motif file (matrix) | svg | |
21 | FOXA1(Forkhead)/MCF7-FOXA1-ChIP-Seq(GSE26831)/Homer | 1e-54 | -1.266e+02 | 0.0000 | 1303.0 | 28.36% | 8410.2 | 18.81% | motif file (matrix) | svg | |
22 | NF1-halfsite(CTF)/LNCaP-NF1-ChIP-Seq(Unpublished)/Homer | 1e-49 | -1.137e+02 | 0.0000 | 2064.0 | 44.92% | 15343.3 | 34.31% | motif file (matrix) | svg | |
23 | Hnf6b(Homeobox)/LNCaP-Hnf6b-ChIP-Seq(GSE106305)/Homer | 1e-48 | -1.115e+02 | 0.0000 | 1061.0 | 23.09% | 6649.3 | 14.87% | motif file (matrix) | svg | |
24 | Fox:Ebox(Forkhead,bHLH)/Panc1-Foxa2-ChIP-Seq(GSE47459)/Homer | 1e-47 | -1.085e+02 | 0.0000 | 1393.0 | 30.32% | 9471.4 | 21.18% | motif file (matrix) | svg | |
25 | Foxa3(Forkhead)/Liver-Foxa3-ChIP-Seq(GSE77670)/Homer | 1e-41 | -9.616e+01 | 0.0000 | 525.0 | 11.43% | 2717.6 | 6.08% | motif file (matrix) | svg | |
26 | ERRg(NR)/Kidney-ESRRG-ChIP-Seq(GSE104905)/Homer | 1e-39 | -9.200e+01 | 0.0000 | 1081.0 | 23.53% | 7124.7 | 15.93% | motif file (matrix) | svg | |
27 | HNF6(Homeobox)/Liver-Hnf6-ChIP-Seq(ERP000394)/Homer | 1e-39 | -9.154e+01 | 0.0000 | 695.0 | 15.13% | 4035.8 | 9.03% | motif file (matrix) | svg | |
28 | NF1(CTF)/LNCAP-NF1-ChIP-Seq(Unpublished)/Homer | 1e-36 | -8.447e+01 | 0.0000 | 561.0 | 12.21% | 3110.7 | 6.96% | motif file (matrix) | svg | |
29 | Cux2(Homeobox)/Liver-Cux2-ChIP-Seq(GSE35985)/Homer | 1e-33 | -7.773e+01 | 0.0000 | 565.0 | 12.30% | 3223.4 | 7.21% | motif file (matrix) | svg | |
30 | FOXK1(Forkhead)/HEK293-FOXK1-ChIP-Seq(GSE51673)/Homer | 1e-31 | -7.316e+01 | 0.0000 | 1244.0 | 27.07% | 8869.1 | 19.83% | motif file (matrix) | svg | |
31 | Foxo3(Forkhead)/U2OS-Foxo3-ChIP-Seq(E-MTAB-2701)/Homer | 1e-30 | -7.123e+01 | 0.0000 | 961.0 | 20.91% | 6495.8 | 14.53% | motif file (matrix) | svg | |
32 | Foxf1(Forkhead)/Lung-Foxf1-ChIP-Seq(GSE77951)/Homer | 1e-30 | -7.027e+01 | 0.0000 | 1065.0 | 23.18% | 7391.7 | 16.53% | motif file (matrix) | svg | |
33 | Esrrb(NR)/mES-Esrrb-ChIP-Seq(GSE11431)/Homer | 1e-30 | -6.954e+01 | 0.0000 | 854.0 | 18.59% | 5638.3 | 12.61% | motif file (matrix) | svg | |
34 | Foxo1(Forkhead)/RAW-Foxo1-ChIP-Seq(Fan_et_al.)/Homer | 1e-29 | -6.757e+01 | 0.0000 | 2024.0 | 44.05% | 16056.3 | 35.91% | motif file (matrix) | svg | |
35 | FoxD3(forkhead)/ZebrafishEmbryo-Foxd3.biotin-ChIP-seq(GSE106676)/Homer | 1e-29 | -6.715e+01 | 0.0000 | 1095.0 | 23.83% | 7711.9 | 17.25% | motif file (matrix) | svg | |
36 | TR4(NR),DR1/Hela-TR4-ChIP-Seq(GSE24685)/Homer | 1e-29 | -6.699e+01 | 0.0000 | 227.0 | 4.94% | 951.9 | 2.13% | motif file (matrix) | svg | |
37 | Atf7(bZIP)/3T3L1-Atf7-ChIP-Seq(GSE56872)/Homer | 1e-28 | -6.676e+01 | 0.0000 | 612.0 | 13.32% | 3736.8 | 8.36% | motif file (matrix) | svg | |
38 | FoxL2(Forkhead)/Ovary-FoxL2-ChIP-Seq(GSE60858)/Homer | 1e-26 | -6.134e+01 | 0.0000 | 976.0 | 21.24% | 6817.8 | 15.25% | motif file (matrix) | svg | |
39 | FOXK2(Forkhead)/U2OS-FOXK2-ChIP-Seq(E-MTAB-2204)/Homer | 1e-26 | -6.051e+01 | 0.0000 | 829.0 | 18.04% | 5598.5 | 12.52% | motif file (matrix) | svg | |
40 | FOXP1(Forkhead)/H9-FOXP1-ChIP-Seq(GSE31006)/Homer | 1e-24 | -5.736e+01 | 0.0000 | 596.0 | 12.97% | 3756.4 | 8.40% | motif file (matrix) | svg | |
41 | HNF1b(Homeobox)/PDAC-HNF1B-ChIP-Seq(GSE64557)/Homer | 1e-24 | -5.683e+01 | 0.0000 | 227.0 | 4.94% | 1029.0 | 2.30% | motif file (matrix) | svg | |
42 | Nur77(NR)/K562-NR4A1-ChIP-Seq(GSE31363)/Homer | 1e-24 | -5.662e+01 | 0.0000 | 302.0 | 6.57% | 1542.7 | 3.45% | motif file (matrix) | svg | |
43 | Tlx?(NR)/NPC-H3K4me1-ChIP-Seq(GSE16256)/Homer | 1e-23 | -5.371e+01 | 0.0000 | 563.0 | 12.25% | 3553.7 | 7.95% | motif file (matrix) | svg | |
44 | FXR(NR),IR1/Liver-FXR-ChIP-Seq(Chong_et_al.)/Homer | 1e-22 | -5.209e+01 | 0.0000 | 526.0 | 11.45% | 3286.4 | 7.35% | motif file (matrix) | svg | |
45 | CEBP:CEBP(bZIP)/MEF-Chop-ChIP-Seq(GSE35681)/Homer | 1e-21 | -5.030e+01 | 0.0000 | 225.0 | 4.90% | 1072.8 | 2.40% | motif file (matrix) | svg | |
46 | MYB(HTH)/ERMYB-Myb-ChIPSeq(GSE22095)/Homer | 1e-20 | -4.812e+01 | 0.0000 | 1867.0 | 40.63% | 15162.3 | 33.91% | motif file (matrix) | svg | |
47 | AMYB(HTH)/Testes-AMYB-ChIP-Seq(GSE44588)/Homer | 1e-20 | -4.716e+01 | 0.0000 | 1704.0 | 37.08% | 13677.8 | 30.59% | motif file (matrix) | svg | |
48 | Atf2(bZIP)/3T3L1-Atf2-ChIP-Seq(GSE56872)/Homer | 1e-20 | -4.605e+01 | 0.0000 | 435.0 | 9.47% | 2667.4 | 5.97% | motif file (matrix) | svg | |
49 | Hnf1(Homeobox)/Liver-Foxa2-Chip-Seq(GSE25694)/Homer | 1e-19 | -4.539e+01 | 0.0000 | 203.0 | 4.42% | 968.3 | 2.17% | motif file (matrix) | svg | |
50 | Nr5a2(NR)/Pancreas-LRH1-ChIP-Seq(GSE34295)/Homer | 1e-17 | -4.031e+01 | 0.0000 | 845.0 | 18.39% | 6173.7 | 13.81% | motif file (matrix) | svg | |
51 | PR(NR)/T47D-PR-ChIP-Seq(GSE31130)/Homer | 1e-16 | -3.890e+01 | 0.0000 | 2163.0 | 47.07% | 18282.6 | 40.88% | motif file (matrix) | svg | |
52 | THRb(NR)/HepG2-THRb.Flag-ChIP-Seq(Encode)/Homer | 1e-16 | -3.808e+01 | 0.0000 | 675.0 | 14.69% | 4775.1 | 10.68% | motif file (matrix) | svg | |
53 | RORa(NR)/Liver-Rora-ChIP-Seq(GSE101115)/Homer | 1e-16 | -3.762e+01 | 0.0000 | 206.0 | 4.48% | 1065.5 | 2.38% | motif file (matrix) | svg | |
54 | HIC1(Zf)/Treg-ZBTB29-ChIP-Seq(GSE99889)/Homer | 1e-15 | -3.551e+01 | 0.0000 | 2044.0 | 44.48% | 17277.3 | 38.64% | motif file (matrix) | svg | |
55 | RORgt(NR)/EL4-RORgt.Flag-ChIP-Seq(GSE56019)/Homer | 1e-15 | -3.530e+01 | 0.0000 | 189.0 | 4.11% | 970.8 | 2.17% | motif file (matrix) | svg | |
56 | RORgt(NR)/EL4-RORgt.Flag-ChIP-Seq(GSE56019)/Homer | 1e-15 | -3.530e+01 | 0.0000 | 189.0 | 4.11% | 970.8 | 2.17% | motif file (matrix) | svg | |
57 | CUX1(Homeobox)/K562-CUX1-ChIP-Seq(GSE92882)/Homer | 1e-15 | -3.526e+01 | 0.0000 | 700.0 | 15.23% | 5054.8 | 11.30% | motif file (matrix) | svg | |
58 | STAT5(Stat)/mCD4+-Stat5-ChIP-Seq(GSE12346)/Homer | 1e-15 | -3.516e+01 | 0.0000 | 448.0 | 9.75% | 2956.4 | 6.61% | motif file (matrix) | svg | |
59 | Nanog(Homeobox)/mES-Nanog-ChIP-Seq(GSE11724)/Homer | 1e-14 | -3.413e+01 | 0.0000 | 3333.0 | 72.54% | 30023.7 | 67.14% | motif file (matrix) | svg | |
60 | NF1:FOXA1(CTF,Forkhead)/LNCAP-FOXA1-ChIP-Seq(GSE27824)/Homer | 1e-14 | -3.336e+01 | 0.0000 | 106.0 | 2.31% | 435.7 | 0.97% | motif file (matrix) | svg | |
61 | Gata4(Zf)/Heart-Gata4-ChIP-Seq(GSE35151)/Homer | 1e-14 | -3.334e+01 | 0.0000 | 1025.0 | 22.31% | 7946.7 | 17.77% | motif file (matrix) | svg | |
62 | ARE(NR)/LNCAP-AR-ChIP-Seq(GSE27824)/Homer | 1e-14 | -3.315e+01 | 0.0000 | 333.0 | 7.25% | 2075.8 | 4.64% | motif file (matrix) | svg | |
63 | Stat3+il21(Stat)/CD4-Stat3-ChIP-Seq(GSE19198)/Homer | 1e-14 | -3.260e+01 | 0.0000 | 798.0 | 17.37% | 5968.8 | 13.35% | motif file (matrix) | svg | |
64 | GRE(NR),IR3/RAW264.7-GRE-ChIP-Seq(Unpublished)/Homer | 1e-13 | -3.173e+01 | 0.0000 | 317.0 | 6.90% | 1973.6 | 4.41% | motif file (matrix) | svg | |
65 | c-Jun-CRE(bZIP)/K562-cJun-ChIP-Seq(GSE31477)/Homer | 1e-13 | -3.156e+01 | 0.0000 | 367.0 | 7.99% | 2371.5 | 5.30% | motif file (matrix) | svg | |
66 | Gata2(Zf)/K562-GATA2-ChIP-Seq(GSE18829)/Homer | 1e-13 | -3.098e+01 | 0.0000 | 703.0 | 15.30% | 5189.8 | 11.61% | motif file (matrix) | svg | |
67 | Pitx1(Homeobox)/Chicken-Pitx1-ChIP-Seq(GSE38910)/Homer | 1e-13 | -3.048e+01 | 0.0000 | 3218.0 | 70.03% | 29003.9 | 64.86% | motif file (matrix) | svg | |
68 | BMYB(HTH)/Hela-BMYB-ChIP-Seq(GSE27030)/Homer | 1e-12 | -2.952e+01 | 0.0000 | 1584.0 | 34.47% | 13184.0 | 29.48% | motif file (matrix) | svg | |
69 | GRE(NR),IR3/A549-GR-ChIP-Seq(GSE32465)/Homer | 1e-12 | -2.926e+01 | 0.0000 | 195.0 | 4.24% | 1082.2 | 2.42% | motif file (matrix) | svg | |
70 | CLOCK(bHLH)/Liver-Clock-ChIP-Seq(GSE39860)/Homer | 1e-12 | -2.912e+01 | 0.0000 | 615.0 | 13.38% | 4482.0 | 10.02% | motif file (matrix) | svg | |
71 | STAT1(Stat)/HelaS3-STAT1-ChIP-Seq(GSE12782)/Homer | 1e-12 | -2.908e+01 | 0.0000 | 370.0 | 8.05% | 2441.6 | 5.46% | motif file (matrix) | svg | |
72 | Gata1(Zf)/K562-GATA1-ChIP-Seq(GSE18829)/Homer | 1e-12 | -2.885e+01 | 0.0000 | 629.0 | 13.69% | 4609.7 | 10.31% | motif file (matrix) | svg | |
73 | GATA3(Zf)/iTreg-Gata3-ChIP-Seq(GSE20898)/Homer | 1e-12 | -2.850e+01 | 0.0000 | 1424.0 | 30.99% | 11740.5 | 26.25% | motif file (matrix) | svg | |
74 | MITF(bHLH)/MastCells-MITF-ChIP-Seq(GSE48085)/Homer | 1e-12 | -2.806e+01 | 0.0000 | 1002.0 | 21.81% | 7911.6 | 17.69% | motif file (matrix) | svg | |
75 | Nkx6.1(Homeobox)/Islet-Nkx6.1-ChIP-Seq(GSE40975)/Homer | 1e-11 | -2.704e+01 | 0.0000 | 2208.0 | 48.05% | 19204.7 | 42.95% | motif file (matrix) | svg | |
76 | AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer | 1e-11 | -2.673e+01 | 0.0000 | 3262.0 | 70.99% | 29611.6 | 66.22% | motif file (matrix) | svg | |
77 | Gata6(Zf)/HUG1N-GATA6-ChIP-Seq(GSE51936)/Homer | 1e-11 | -2.656e+01 | 0.0000 | 920.0 | 20.02% | 7228.9 | 16.17% | motif file (matrix) | svg | |
78 | ZNF416(Zf)/HEK293-ZNF416.GFP-ChIP-Seq(GSE58341)/Homer | 1e-11 | -2.593e+01 | 0.0000 | 1355.0 | 29.49% | 11206.0 | 25.06% | motif file (matrix) | svg | |
79 | Nr5a2(NR)/mES-Nr5a2-ChIP-Seq(GSE19019)/Homer | 1e-11 | -2.564e+01 | 0.0000 | 610.0 | 13.28% | 4532.3 | 10.14% | motif file (matrix) | svg | |
80 | RAR:RXR(NR),DR5/ES-RAR-ChIP-Seq(GSE56893)/Homer | 1e-10 | -2.530e+01 | 0.0000 | 175.0 | 3.81% | 985.9 | 2.20% | motif file (matrix) | svg | |
81 | NFY(CCAAT)/Promoter/Homer | 1e-10 | -2.522e+01 | 0.0000 | 850.0 | 18.50% | 6649.9 | 14.87% | motif file (matrix) | svg | |
82 | IRF4(IRF)/GM12878-IRF4-ChIP-Seq(GSE32465)/Homer | 1e-10 | -2.522e+01 | 0.0000 | 483.0 | 10.51% | 3459.8 | 7.74% | motif file (matrix) | svg | |
83 | TRPS1(Zf)/MCF7-TRPS1-ChIP-Seq(GSE107013)/Homer | 1e-10 | -2.517e+01 | 0.0000 | 1824.0 | 39.70% | 15624.7 | 34.94% | motif file (matrix) | svg | |
84 | USF1(bHLH)/GM12878-Usf1-ChIP-Seq(GSE32465)/Homer | 1e-10 | -2.517e+01 | 0.0000 | 498.0 | 10.84% | 3587.7 | 8.02% | motif file (matrix) | svg | |
85 | Hoxa9(Homeobox)/ChickenMSG-Hoxa9.Flag-ChIP-Seq(GSE86088)/Homer | 1e-10 | -2.490e+01 | 0.0000 | 2411.0 | 52.47% | 21268.7 | 47.56% | motif file (matrix) | svg | |
86 | THRa(NR)/C17.2-THRa-ChIP-Seq(GSE38347)/Homer | 1e-10 | -2.466e+01 | 0.0000 | 488.0 | 10.62% | 3515.4 | 7.86% | motif file (matrix) | svg | |
87 | Stat3(Stat)/mES-Stat3-ChIP-Seq(GSE11431)/Homer | 1e-10 | -2.463e+01 | 0.0000 | 569.0 | 12.38% | 4207.2 | 9.41% | motif file (matrix) | svg | |
88 | SF1(NR)/H295R-Nr5a1-ChIP-Seq(GSE44220)/Homer | 1e-10 | -2.395e+01 | 0.0000 | 553.0 | 12.03% | 4088.4 | 9.14% | motif file (matrix) | svg | |
89 | STAT4(Stat)/CD4-Stat4-ChIP-Seq(GSE22104)/Homer | 1e-10 | -2.373e+01 | 0.0000 | 1020.0 | 22.20% | 8228.9 | 18.40% | motif file (matrix) | svg | |
90 | NPAS2(bHLH)/Liver-NPAS2-ChIP-Seq(GSE39860)/Homer | 1e-10 | -2.356e+01 | 0.0000 | 1117.0 | 24.31% | 9116.3 | 20.39% | motif file (matrix) | svg | |
91 | RORg(NR)/Liver-Rorc-ChIP-Seq(GSE101115)/Homer | 1e-10 | -2.348e+01 | 0.0000 | 137.0 | 2.98% | 731.6 | 1.64% | motif file (matrix) | svg | |
92 | Isl1(Homeobox)/Neuron-Isl1-ChIP-Seq(GSE31456)/Homer | 1e-10 | -2.318e+01 | 0.0000 | 1824.0 | 39.70% | 15717.4 | 35.15% | motif file (matrix) | svg | |
93 | AP-1(bZIP)/ThioMac-PU.1-ChIP-Seq(GSE21512)/Homer | 1e-9 | -2.297e+01 | 0.0000 | 730.0 | 15.89% | 5660.4 | 12.66% | motif file (matrix) | svg | |
94 | Hoxa13(Homeobox)/ChickenMSG-Hoxa13.Flag-ChIP-Seq(GSE86088)/Homer | 1e-9 | -2.260e+01 | 0.0000 | 2190.0 | 47.66% | 19244.5 | 43.04% | motif file (matrix) | svg | |
95 | LXRE(NR),DR4/RAW-LXRb.biotin-ChIP-Seq(GSE21512)/Homer | 1e-9 | -2.234e+01 | 0.0000 | 87.0 | 1.89% | 398.7 | 0.89% | motif file (matrix) | svg | |
96 | JunB(bZIP)/DendriticCells-Junb-ChIP-Seq(GSE36099)/Homer | 1e-9 | -2.189e+01 | 0.0000 | 545.0 | 11.86% | 4077.1 | 9.12% | motif file (matrix) | svg | |
97 | Fra1(bZIP)/BT549-Fra1-ChIP-Seq(GSE46166)/Homer | 1e-9 | -2.148e+01 | 0.0000 | 542.0 | 11.80% | 4063.6 | 9.09% | motif file (matrix) | svg | |
98 | Tgif2(Homeobox)/mES-Tgif2-ChIP-Seq(GSE55404)/Homer | 1e-9 | -2.103e+01 | 0.0000 | 2919.0 | 63.53% | 26453.9 | 59.16% | motif file (matrix) | svg | |
99 | Fosl2(bZIP)/3T3L1-Fosl2-ChIP-Seq(GSE56872)/Homer | 1e-9 | -2.096e+01 | 0.0000 | 348.0 | 7.57% | 2430.2 | 5.43% | motif file (matrix) | svg | |
100 | PGR(NR)/EndoStromal-PGR-ChIP-Seq(GSE69539)/Homer | 1e-9 | -2.083e+01 | 0.0000 | 291.0 | 6.33% | 1964.1 | 4.39% | motif file (matrix) | svg | |
101 | Atf3(bZIP)/GBM-ATF3-ChIP-Seq(GSE33912)/Homer | 1e-9 | -2.073e+01 | 0.0000 | 637.0 | 13.86% | 4914.7 | 10.99% | motif file (matrix) | svg | |
102 | Bcl11a(Zf)/HSPC-BCL11A-ChIP-Seq(GSE104676)/Homer | 1e-8 | -2.016e+01 | 0.0000 | 801.0 | 17.43% | 6389.9 | 14.29% | motif file (matrix) | svg | |
103 | Sp1(Zf)/Promoter/Homer | 1e-8 | -2.013e+01 | 0.0000 | 280.0 | 6.09% | 1889.5 | 4.23% | motif file (matrix) | svg | |
104 | KLF3(Zf)/MEF-Klf3-ChIP-Seq(GSE44748)/Homer | 1e-8 | -1.998e+01 | 0.0000 | 534.0 | 11.62% | 4038.1 | 9.03% | motif file (matrix) | svg | |
105 | LXH9(Homeobox)/Hct116-LXH9.V5-ChIP-Seq(GSE116822)/Homer | 1e-8 | -1.959e+01 | 0.0000 | 1325.0 | 28.84% | 11203.2 | 25.05% | motif file (matrix) | svg | |
106 | Foxh1(Forkhead)/hESC-FOXH1-ChIP-Seq(GSE29422)/Homer | 1e-8 | -1.932e+01 | 0.0000 | 640.0 | 13.93% | 4987.4 | 11.15% | motif file (matrix) | svg | |
107 | BATF(bZIP)/Th17-BATF-ChIP-Seq(GSE39756)/Homer | 1e-8 | -1.927e+01 | 0.0000 | 633.0 | 13.78% | 4927.2 | 11.02% | motif file (matrix) | svg | |
108 | NPAS(bHLH)/Liver-NPAS-ChIP-Seq(GSE39860)/Homer | 1e-8 | -1.910e+01 | 0.0000 | 1502.0 | 32.69% | 12880.2 | 28.80% | motif file (matrix) | svg | |
109 | Smad3(MAD)/NPC-Smad3-ChIP-Seq(GSE36673)/Homer | 1e-8 | -1.885e+01 | 0.0000 | 2622.0 | 57.06% | 23643.1 | 52.87% | motif file (matrix) | svg | |
110 | Usf2(bHLH)/C2C12-Usf2-ChIP-Seq(GSE36030)/Homer | 1e-8 | -1.870e+01 | 0.0000 | 354.0 | 7.70% | 2534.2 | 5.67% | motif file (matrix) | svg | |
111 | NFAT(RHD)/Jurkat-NFATC1-ChIP-Seq(Jolma_et_al.)/Homer | 1e-8 | -1.847e+01 | 0.0000 | 871.0 | 18.96% | 7085.4 | 15.84% | motif file (matrix) | svg | |
112 | c-Myc(bHLH)/LNCAP-cMyc-ChIP-Seq(Unpublished)/Homer | 1e-7 | -1.796e+01 | 0.0000 | 449.0 | 9.77% | 3363.8 | 7.52% | motif file (matrix) | svg | |
113 | bHLHE41(bHLH)/proB-Bhlhe41-ChIP-Seq(GSE93764)/Homer | 1e-7 | -1.778e+01 | 0.0000 | 1041.0 | 22.66% | 8664.1 | 19.38% | motif file (matrix) | svg | |
114 | Max(bHLH)/K562-Max-ChIP-Seq(GSE31477)/Homer | 1e-7 | -1.726e+01 | 0.0000 | 670.0 | 14.58% | 5326.2 | 11.91% | motif file (matrix) | svg | |
115 | Lhx3(Homeobox)/Neuron-Lhx3-ChIP-Seq(GSE31456)/Homer | 1e-7 | -1.691e+01 | 0.0000 | 1494.0 | 32.51% | 12918.5 | 28.89% | motif file (matrix) | svg | |
116 | Fra2(bZIP)/Striatum-Fra2-ChIP-Seq(GSE43429)/Homer | 1e-7 | -1.679e+01 | 0.0000 | 458.0 | 9.97% | 3476.7 | 7.77% | motif file (matrix) | svg | |
117 | BMAL1(bHLH)/Liver-Bmal1-ChIP-Seq(GSE39860)/Homer | 1e-7 | -1.674e+01 | 0.0000 | 1680.0 | 36.56% | 14684.7 | 32.84% | motif file (matrix) | svg | |
118 | SCL(bHLH)/HPC7-Scl-ChIP-Seq(GSE13511)/Homer | 1e-7 | -1.673e+01 | 0.0000 | 3445.0 | 74.97% | 31958.2 | 71.47% | motif file (matrix) | svg | |
119 | Six1(Homeobox)/Myoblast-Six1-ChIP-Chip(GSE20150)/Homer | 1e-7 | -1.656e+01 | 0.0000 | 279.0 | 6.07% | 1959.8 | 4.38% | motif file (matrix) | svg | |
120 | JunD(bZIP)/K562-JunD-ChIP-Seq/Homer | 1e-7 | -1.636e+01 | 0.0000 | 125.0 | 2.72% | 733.3 | 1.64% | motif file (matrix) | svg | |
121 | Gfi1b(Zf)/HPC7-Gfi1b-ChIP-Seq(GSE22178)/Homer | 1e-7 | -1.621e+01 | 0.0000 | 666.0 | 14.49% | 5330.1 | 11.92% | motif file (matrix) | svg | |
122 | CRX(Homeobox)/Retina-Crx-ChIP-Seq(GSE20012)/Homer | 1e-7 | -1.612e+01 | 0.0000 | 2000.0 | 43.53% | 17772.6 | 39.74% | motif file (matrix) | svg | |
123 | Hoxd12(Homeobox)/ChickenMSG-Hoxd12.Flag-ChIP-Seq(GSE86088)/Homer | 1e-6 | -1.562e+01 | 0.0000 | 1611.0 | 35.06% | 14094.8 | 31.52% | motif file (matrix) | svg | |
124 | TEAD4(TEA)/Tropoblast-Tead4-ChIP-Seq(GSE37350)/Homer | 1e-6 | -1.555e+01 | 0.0000 | 814.0 | 17.71% | 6691.7 | 14.96% | motif file (matrix) | svg | |
125 | Sox10(HMG)/SciaticNerve-Sox3-ChIP-Seq(GSE35132)/Homer | 1e-6 | -1.547e+01 | 0.0000 | 1638.0 | 35.65% | 14358.1 | 32.11% | motif file (matrix) | svg | |
126 | Sp5(Zf)/mES-Sp5.Flag-ChIP-Seq(GSE72989)/Homer | 1e-6 | -1.541e+01 | 0.0000 | 924.0 | 20.11% | 7700.7 | 17.22% | motif file (matrix) | svg | |
127 | CRE(bZIP)/Promoter/Homer | 1e-6 | -1.520e+01 | 0.0000 | 256.0 | 5.57% | 1800.6 | 4.03% | motif file (matrix) | svg | |
128 | bHLHE40(bHLH)/HepG2-BHLHE40-ChIP-Seq(GSE31477)/Homer | 1e-6 | -1.517e+01 | 0.0000 | 324.0 | 7.05% | 2372.7 | 5.31% | motif file (matrix) | svg | |
129 | n-Myc(bHLH)/mES-nMyc-ChIP-Seq(GSE11431)/Homer | 1e-6 | -1.479e+01 | 0.0000 | 671.0 | 14.60% | 5431.6 | 12.15% | motif file (matrix) | svg | |
130 | Tgif1(Homeobox)/mES-Tgif1-ChIP-Seq(GSE55404)/Homer | 1e-6 | -1.438e+01 | 0.0000 | 2738.0 | 59.59% | 25052.9 | 56.03% | motif file (matrix) | svg | |
131 | p63(p53)/Keratinocyte-p63-ChIP-Seq(GSE17611)/Homer | 1e-6 | -1.437e+01 | 0.0000 | 401.0 | 8.73% | 3057.4 | 6.84% | motif file (matrix) | svg | |
132 | Olig2(bHLH)/Neuron-Olig2-ChIP-Seq(GSE30882)/Homer | 1e-6 | -1.430e+01 | 0.0000 | 1863.0 | 40.54% | 16571.2 | 37.06% | motif file (matrix) | svg | |
133 | KLF6(Zf)/PDAC-KLF6-ChIP-Seq(GSE64557)/Homer | 1e-6 | -1.402e+01 | 0.0000 | 976.0 | 21.24% | 8245.6 | 18.44% | motif file (matrix) | svg | |
134 | TEAD(TEA)/Fibroblast-PU.1-ChIP-Seq(Unpublished)/Homer | 1e-6 | -1.400e+01 | 0.0000 | 650.0 | 14.15% | 5275.3 | 11.80% | motif file (matrix) | svg | |
135 | TEAD2(TEA)/Py2T-Tead2-ChIP-Seq(GSE55709)/Homer | 1e-6 | -1.399e+01 | 0.0000 | 515.0 | 11.21% | 4071.0 | 9.10% | motif file (matrix) | svg | |
136 | Zac1(Zf)/Neuro2A-Plagl1-ChIP-Seq(GSE75942)/Homer | 1e-6 | -1.394e+01 | 0.0000 | 2229.0 | 48.51% | 20116.5 | 44.99% | motif file (matrix) | svg | |
137 | KLF14(Zf)/HEK293-KLF14.GFP-ChIP-Seq(GSE58341)/Homer | 1e-5 | -1.378e+01 | 0.0000 | 1492.0 | 32.47% | 13077.6 | 29.25% | motif file (matrix) | svg | |
138 | GRHL2(CP2)/HBE-GRHL2-ChIP-Seq(GSE46194)/Homer | 1e-5 | -1.374e+01 | 0.0000 | 379.0 | 8.25% | 2886.3 | 6.45% | motif file (matrix) | svg | |
139 | Tbx5(T-box)/HL1-Tbx5.biotin-ChIP-Seq(GSE21529)/Homer | 1e-5 | -1.367e+01 | 0.0000 | 2751.0 | 59.87% | 25227.6 | 56.42% | motif file (matrix) | svg | |
140 | Sp2(Zf)/HEK293-Sp2.eGFP-ChIP-Seq(Encode)/Homer | 1e-5 | -1.341e+01 | 0.0000 | 1325.0 | 28.84% | 11526.7 | 25.78% | motif file (matrix) | svg | |
141 | DMRT1(DM)/Testis-DMRT1-ChIP-Seq(GSE64892)/Homer | 1e-5 | -1.337e+01 | 0.0000 | 270.0 | 5.88% | 1963.9 | 4.39% | motif file (matrix) | svg | |
142 | Otx2(Homeobox)/EpiLC-Otx2-ChIP-Seq(GSE56098)/Homer | 1e-5 | -1.331e+01 | 0.0000 | 687.0 | 14.95% | 5638.7 | 12.61% | motif file (matrix) | svg | |
143 | Bcl6(Zf)/Liver-Bcl6-ChIP-Seq(GSE31578)/Homer | 1e-5 | -1.319e+01 | 0.0000 | 1320.0 | 28.73% | 11492.7 | 25.70% | motif file (matrix) | svg | |
144 | Six2(Homeobox)/NephronProgenitor-Six2-ChIP-Seq(GSE39837)/Homer | 1e-5 | -1.297e+01 | 0.0000 | 947.0 | 20.61% | 8031.7 | 17.96% | motif file (matrix) | svg | |
145 | Jun-AP1(bZIP)/K562-cJun-ChIP-Seq(GSE31477)/Homer | 1e-5 | -1.290e+01 | 0.0000 | 236.0 | 5.14% | 1689.6 | 3.78% | motif file (matrix) | svg | |
146 | Ronin(THAP)/ES-Thap11-ChIP-Seq(GSE51522)/Homer | 1e-5 | -1.289e+01 | 0.0000 | 43.0 | 0.94% | 189.8 | 0.42% | motif file (matrix) | svg | |
147 | Hoxa11(Homeobox)/ChickenMSG-Hoxa11.Flag-ChIP-Seq(GSE86088)/Homer | 1e-5 | -1.259e+01 | 0.0000 | 2008.0 | 43.70% | 18074.6 | 40.42% | motif file (matrix) | svg | |
148 | ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer | 1e-5 | -1.243e+01 | 0.0000 | 1567.0 | 34.10% | 13871.1 | 31.02% | motif file (matrix) | svg | |
149 | NFE2L2(bZIP)/HepG2-NFE2L2-ChIP-Seq(Encode)/Homer | 1e-5 | -1.206e+01 | 0.0000 | 67.0 | 1.46% | 361.8 | 0.81% | motif file (matrix) | svg | |
150 | NeuroG2(bHLH)/Fibroblast-NeuroG2-ChIP-Seq(GSE75910)/Homer | 1e-5 | -1.200e+01 | 0.0000 | 1526.0 | 33.21% | 13510.7 | 30.21% | motif file (matrix) | svg | |
151 | Elk1(ETS)/Hela-Elk1-ChIP-Seq(GSE31477)/Homer | 1e-5 | -1.200e+01 | 0.0000 | 582.0 | 12.67% | 4748.4 | 10.62% | motif file (matrix) | svg | |
152 | TEAD3(TEA)/HepG2-TEAD3-ChIP-Seq(Encode)/Homer | 1e-5 | -1.195e+01 | 0.0000 | 1060.0 | 23.07% | 9134.2 | 20.43% | motif file (matrix) | svg | |
153 | Mef2b(MADS)/HEK293-Mef2b.V5-ChIP-Seq(GSE67450)/Homer | 1e-5 | -1.188e+01 | 0.0000 | 730.0 | 15.89% | 6095.8 | 13.63% | motif file (matrix) | svg | |
154 | E-box(bHLH)/Promoter/Homer | 1e-5 | -1.183e+01 | 0.0000 | 94.0 | 2.05% | 564.7 | 1.26% | motif file (matrix) | svg | |
155 | GSC(Homeobox)/FrogEmbryos-GSC-ChIP-Seq(DRA000576)/Homer | 1e-5 | -1.182e+01 | 0.0000 | 987.0 | 21.48% | 8463.0 | 18.93% | motif file (matrix) | svg | |
156 | Pdx1(Homeobox)/Islet-Pdx1-ChIP-Seq(SRA008281)/Homer | 1e-4 | -1.142e+01 | 0.0000 | 883.0 | 19.22% | 7523.4 | 16.82% | motif file (matrix) | svg | |
157 | Rbpj1(?)/Panc1-Rbpj1-ChIP-Seq(GSE47459)/Homer | 1e-4 | -1.130e+01 | 0.0000 | 1426.0 | 31.03% | 12609.3 | 28.20% | motif file (matrix) | svg | |
158 | TEAD1(TEAD)/HepG2-TEAD1-ChIP-Seq(Encode)/Homer | 1e-4 | -1.125e+01 | 0.0000 | 906.0 | 19.72% | 7745.4 | 17.32% | motif file (matrix) | svg | |
159 | MafA(bZIP)/Islet-MafA-ChIP-Seq(GSE30298)/Homer | 1e-4 | -1.117e+01 | 0.0000 | 848.0 | 18.45% | 7214.7 | 16.13% | motif file (matrix) | svg | |
160 | c-Myc(bHLH)/mES-cMyc-ChIP-Seq(GSE11431)/Homer | 1e-4 | -1.110e+01 | 0.0000 | 480.0 | 10.45% | 3871.8 | 8.66% | motif file (matrix) | svg | |
161 | MNT(bHLH)/HepG2-MNT-ChIP-Seq(Encode)/Homer | 1e-4 | -1.103e+01 | 0.0000 | 989.0 | 21.52% | 8527.8 | 19.07% | motif file (matrix) | svg | |
162 | Bach2(bZIP)/OCILy7-Bach2-ChIP-Seq(GSE44420)/Homer | 1e-4 | -1.100e+01 | 0.0000 | 202.0 | 4.40% | 1452.8 | 3.25% | motif file (matrix) | svg | |
163 | RAR:RXR(NR),DR5/ES-RAR-ChIP-Seq(GSE56893)/Homer | 1e-4 | -1.100e+01 | 0.0000 | 46.0 | 1.00% | 225.8 | 0.50% | motif file (matrix) | svg | |
164 | KLF5(Zf)/LoVo-KLF5-ChIP-Seq(GSE49402)/Homer | 1e-4 | -1.096e+01 | 0.0000 | 1155.0 | 25.14% | 10080.3 | 22.54% | motif file (matrix) | svg | |
165 | CTCF(Zf)/CD4+-CTCF-ChIP-Seq(Barski_et_al.)/Homer | 1e-4 | -1.089e+01 | 0.0000 | 142.0 | 3.09% | 959.5 | 2.15% | motif file (matrix) | svg | |
166 | Rfx6(HTH)/Min6b1-Rfx6.HA-ChIP-Seq(GSE62844)/Homer | 1e-4 | -1.082e+01 | 0.0001 | 1097.0 | 23.87% | 9546.9 | 21.35% | motif file (matrix) | svg | |
167 | NeuroD1(bHLH)/Islet-NeuroD1-ChIP-Seq(GSE30298)/Homer | 1e-4 | -1.080e+01 | 0.0001 | 839.0 | 18.26% | 7151.2 | 15.99% | motif file (matrix) | svg | |
168 | Lhx2(Homeobox)/HFSC-Lhx2-ChIP-Seq(GSE48068)/Homer | 1e-4 | -1.074e+01 | 0.0001 | 962.0 | 20.94% | 8293.7 | 18.55% | motif file (matrix) | svg | |
169 | ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer | 1e-4 | -1.071e+01 | 0.0001 | 1189.0 | 25.88% | 10415.6 | 23.29% | motif file (matrix) | svg | |
170 | GFY-Staf(?,Zf)/Promoter/Homer | 1e-4 | -1.042e+01 | 0.0001 | 61.0 | 1.33% | 337.3 | 0.75% | motif file (matrix) | svg | |
171 | MafK(bZIP)/C2C12-MafK-ChIP-Seq(GSE36030)/Homer | 1e-4 | -1.032e+01 | 0.0001 | 242.0 | 5.27% | 1810.3 | 4.05% | motif file (matrix) | svg | |
172 | MafF(bZIP)/HepG2-MafF-ChIP-Seq(GSE31477)/Homer | 1e-4 | -1.030e+01 | 0.0001 | 260.0 | 5.66% | 1965.0 | 4.39% | motif file (matrix) | svg | |
173 | Tbx20(T-box)/Heart-Tbx20-ChIP-Seq(GSE29636)/Homer | 1e-4 | -1.024e+01 | 0.0001 | 260.0 | 5.66% | 1967.2 | 4.40% | motif file (matrix) | svg | |
174 | Mef2c(MADS)/GM12878-Mef2c-ChIP-Seq(GSE32465)/Homer | 1e-4 | -1.017e+01 | 0.0001 | 383.0 | 8.34% | 3045.1 | 6.81% | motif file (matrix) | svg | |
175 | HRE(HSF)/HepG2-HSF1-ChIP-Seq(GSE31477)/Homer | 1e-4 | -1.000e+01 | 0.0001 | 185.0 | 4.03% | 1335.1 | 2.99% | motif file (matrix) | svg | |
176 | Tcf3(HMG)/mES-Tcf3-ChIP-Seq(GSE11724)/Homer | 1e-4 | -9.844e+00 | 0.0001 | 289.0 | 6.29% | 2231.6 | 4.99% | motif file (matrix) | svg | |
177 | HIF-1b(HLH)/T47D-HIF1b-ChIP-Seq(GSE59937)/Homer | 1e-4 | -9.825e+00 | 0.0001 | 981.0 | 21.35% | 8526.0 | 19.07% | motif file (matrix) | svg | |
178 | Sox17(HMG)/Endoderm-Sox17-ChIP-Seq(GSE61475)/Homer | 1e-4 | -9.734e+00 | 0.0001 | 727.0 | 15.82% | 6178.9 | 13.82% | motif file (matrix) | svg | |
179 | Hoxd10(Homeobox)/ChickenMSG-Hoxd10.Flag-ChIP-Seq(GSE86088)/Homer | 1e-4 | -9.669e+00 | 0.0002 | 1066.0 | 23.20% | 9329.7 | 20.86% | motif file (matrix) | svg | |
180 | MafB(bZIP)/BMM-Mafb-ChIP-Seq(GSE75722)/Homer | 1e-4 | -9.638e+00 | 0.0002 | 415.0 | 9.03% | 3351.1 | 7.49% | motif file (matrix) | svg | |
181 | LEF1(HMG)/H1-LEF1-ChIP-Seq(GSE64758)/Homer | 1e-4 | -9.625e+00 | 0.0002 | 706.0 | 15.36% | 5991.2 | 13.40% | motif file (matrix) | svg | |
182 | Hoxd13(Homeobox)/ChickenMSG-Hoxd13.Flag-ChIP-Seq(GSE86088)/Homer | 1e-4 | -9.580e+00 | 0.0002 | 1414.0 | 30.77% | 12615.0 | 28.21% | motif file (matrix) | svg | |
183 | ZNF264(Zf)/HEK293-ZNF264.GFP-ChIP-Seq(GSE58341)/Homer | 1e-4 | -9.488e+00 | 0.0002 | 644.0 | 14.02% | 5431.0 | 12.15% | motif file (matrix) | svg | |
184 | PU.1:IRF8(ETS:IRF)/pDC-Irf8-ChIP-Seq(GSE66899)/Homer | 1e-4 | -9.329e+00 | 0.0002 | 204.0 | 4.44% | 1514.6 | 3.39% | motif file (matrix) | svg | |
185 | ZNF7(Zf)/HepG2-ZNF7.Flag-ChIP-Seq(Encode)/Homer | 1e-3 | -9.060e+00 | 0.0003 | 565.0 | 12.30% | 4731.7 | 10.58% | motif file (matrix) | svg | |
186 | IRF2(IRF)/Erythroblas-IRF2-ChIP-Seq(GSE36985)/Homer | 1e-3 | -8.990e+00 | 0.0003 | 111.0 | 2.42% | 747.5 | 1.67% | motif file (matrix) | svg | |
187 | VDR(NR),DR3/GM10855-VDR+vitD-ChIP-Seq(GSE22484)/Homer | 1e-3 | -8.981e+00 | 0.0003 | 226.0 | 4.92% | 1713.0 | 3.83% | motif file (matrix) | svg | |
188 | ZNF341(Zf)/EBV-ZNF341-ChIP-Seq(GSE113194)/Homer | 1e-3 | -8.842e+00 | 0.0003 | 717.0 | 15.60% | 6136.6 | 13.72% | motif file (matrix) | svg | |
189 | Sox15(HMG)/CPA-Sox15-ChIP-Seq(GSE62909)/Homer | 1e-3 | -8.491e+00 | 0.0005 | 1062.0 | 23.11% | 9371.4 | 20.96% | motif file (matrix) | svg | |
190 | DMRT6(DM)/Testis-DMRT6-ChIP-Seq(GSE60440)/Homer | 1e-3 | -8.281e+00 | 0.0006 | 224.0 | 4.87% | 1718.4 | 3.84% | motif file (matrix) | svg | |
191 | RARg(NR)/ES-RARg-ChIP-Seq(GSE30538)/Homer | 1e-3 | -8.132e+00 | 0.0007 | 50.0 | 1.09% | 285.3 | 0.64% | motif file (matrix) | svg | |
192 | Zfp281(Zf)/ES-Zfp281-ChIP-Seq(GSE81042)/Homer | 1e-3 | -8.015e+00 | 0.0007 | 266.0 | 5.79% | 2094.6 | 4.68% | motif file (matrix) | svg | |
193 | Tcf7(HMG)/GM12878-TCF7-ChIP-Seq(Encode)/Homer | 1e-3 | -7.987e+00 | 0.0008 | 371.0 | 8.07% | 3028.9 | 6.77% | motif file (matrix) | svg | |
194 | ELF1(ETS)/Jurkat-ELF1-ChIP-Seq(SRA014231)/Homer | 1e-3 | -7.983e+00 | 0.0008 | 525.0 | 11.43% | 4422.0 | 9.89% | motif file (matrix) | svg | |
195 | Klf9(Zf)/GBM-Klf9-ChIP-Seq(GSE62211)/Homer | 1e-3 | -7.977e+00 | 0.0008 | 393.0 | 8.55% | 3226.1 | 7.21% | motif file (matrix) | svg | |
196 | PBX2(Homeobox)/K562-PBX2-ChIP-Seq(Encode)/Homer | 1e-3 | -7.865e+00 | 0.0008 | 764.0 | 16.63% | 6630.3 | 14.83% | motif file (matrix) | svg | |
197 | Elk4(ETS)/Hela-Elk4-ChIP-Seq(GSE31477)/Homer | 1e-3 | -7.806e+00 | 0.0009 | 550.0 | 11.97% | 4660.3 | 10.42% | motif file (matrix) | svg | |
198 | ZNF415(Zf)/HEK293-ZNF415.GFP-ChIP-Seq(GSE58341)/Homer | 1e-3 | -7.789e+00 | 0.0009 | 515.0 | 11.21% | 4342.0 | 9.71% | motif file (matrix) | svg | |
199 | Eomes(T-box)/H9-Eomes-ChIP-Seq(GSE26097)/Homer | 1e-3 | -7.772e+00 | 0.0009 | 1892.0 | 41.18% | 17331.1 | 38.76% | motif file (matrix) | svg | |
200 | Hoxd11(Homeobox)/ChickenMSG-Hoxd11.Flag-ChIP-Seq(GSE86088)/Homer | 1e-3 | -7.760e+00 | 0.0009 | 2033.0 | 44.24% | 18692.6 | 41.80% | motif file (matrix) | svg | |
201 | ETS(ETS)/Promoter/Homer | 1e-3 | -7.730e+00 | 0.0009 | 353.0 | 7.68% | 2878.7 | 6.44% | motif file (matrix) | svg | |
202 | Srebp2(bHLH)/HepG2-Srebp2-ChIP-Seq(GSE31477)/Homer | 1e-3 | -7.695e+00 | 0.0010 | 148.0 | 3.22% | 1084.3 | 2.42% | motif file (matrix) | svg | |
203 | ZNF322(Zf)/HEK293-ZNF322.GFP-ChIP-Seq(GSE58341)/Homer | 1e-3 | -7.661e+00 | 0.0010 | 301.0 | 6.55% | 2417.6 | 5.41% | motif file (matrix) | svg | |
204 | CDX4(Homeobox)/ZebrafishEmbryos-Cdx4.Myc-ChIP-Seq(GSE48254)/Homer | 1e-3 | -7.646e+00 | 0.0010 | 784.0 | 17.06% | 6830.2 | 15.27% | motif file (matrix) | svg | |
205 | Tbet(T-box)/CD8-Tbet-ChIP-Seq(GSE33802)/Homer | 1e-3 | -7.618e+00 | 0.0010 | 1047.0 | 22.79% | 9293.9 | 20.78% | motif file (matrix) | svg | |
206 | Nrf2(bZIP)/Lymphoblast-Nrf2-ChIP-Seq(GSE37589)/Homer | 1e-3 | -7.562e+00 | 0.0011 | 56.0 | 1.22% | 338.4 | 0.76% | motif file (matrix) | svg | |
207 | ERE(NR),IR3/MCF7-ERa-ChIP-Seq(Unpublished)/Homer | 1e-3 | -7.537e+00 | 0.0011 | 285.0 | 6.20% | 2280.4 | 5.10% | motif file (matrix) | svg | |
208 | ZNF189(Zf)/HEK293-ZNF189.GFP-ChIP-Seq(GSE58341)/Homer | 1e-3 | -7.515e+00 | 0.0011 | 811.0 | 17.65% | 7090.5 | 15.86% | motif file (matrix) | svg | |
209 | p73(p53)/Trachea-p73-ChIP-Seq(PRJNA310161)/Homer | 1e-3 | -7.506e+00 | 0.0011 | 62.0 | 1.35% | 385.8 | 0.86% | motif file (matrix) | svg | |
210 | Sox6(HMG)/Myotubes-Sox6-ChIP-Seq(GSE32627)/Homer | 1e-3 | -7.496e+00 | 0.0011 | 1529.0 | 33.28% | 13874.2 | 31.03% | motif file (matrix) | svg | |
211 | Dlx3(Homeobox)/Kerainocytes-Dlx3-ChIP-Seq(GSE89884)/Homer | 1e-3 | -7.447e+00 | 0.0012 | 608.0 | 13.23% | 5213.9 | 11.66% | motif file (matrix) | svg | |
212 | Sox4(HMG)/proB-Sox4-ChIP-Seq(GSE50066)/Homer | 1e-3 | -7.446e+00 | 0.0012 | 866.0 | 18.85% | 7608.4 | 17.01% | motif file (matrix) | svg | |
213 | Meis1(Homeobox)/MastCells-Meis1-ChIP-Seq(GSE48085)/Homer | 1e-3 | -7.439e+00 | 0.0012 | 1699.0 | 36.97% | 15506.0 | 34.68% | motif file (matrix) | svg | |
214 | NF-E2(bZIP)/K562-NFE2-ChIP-Seq(GSE31477)/Homer | 1e-3 | -7.334e+00 | 0.0013 | 66.0 | 1.44% | 419.1 | 0.94% | motif file (matrix) | svg | |
215 | EBF2(EBF)/BrownAdipose-EBF2-ChIP-Seq(GSE97114)/Homer | 1e-3 | -7.319e+00 | 0.0013 | 900.0 | 19.59% | 7935.3 | 17.75% | motif file (matrix) | svg | |
216 | Hoxb4(Homeobox)/ES-Hoxb4-ChIP-Seq(GSE34014)/Homer | 1e-3 | -7.311e+00 | 0.0013 | 196.0 | 4.27% | 1507.7 | 3.37% | motif file (matrix) | svg | |
217 | Bach1(bZIP)/K562-Bach1-ChIP-Seq(GSE31477)/Homer | 1e-3 | -7.284e+00 | 0.0014 | 65.0 | 1.41% | 412.5 | 0.92% | motif file (matrix) | svg | |
218 | ISRE(IRF)/ThioMac-LPS-Expression(GSE23622)/Homer | 1e-3 | -7.252e+00 | 0.0014 | 69.0 | 1.50% | 444.2 | 0.99% | motif file (matrix) | svg | |
219 | PAX5(Paired,Homeobox),condensed/GM12878-PAX5-ChIP-Seq(GSE32465)/Homer | 1e-3 | -7.185e+00 | 0.0015 | 126.0 | 2.74% | 912.9 | 2.04% | motif file (matrix) | svg | |
220 | Hoxa10(Homeobox)/ChickenMSG-Hoxa10.Flag-ChIP-Seq(GSE86088)/Homer | 1e-3 | -7.068e+00 | 0.0017 | 552.0 | 12.01% | 4720.5 | 10.56% | motif file (matrix) | svg | |
221 | GABPA(ETS)/Jurkat-GABPa-ChIP-Seq(GSE17954)/Homer | 1e-3 | -7.056e+00 | 0.0017 | 908.0 | 19.76% | 8029.9 | 17.96% | motif file (matrix) | svg | |
222 | Reverb(NR),DR2/RAW-Reverba.biotin-ChIP-Seq(GSE45914)/Homer | 1e-3 | -7.013e+00 | 0.0017 | 148.0 | 3.22% | 1103.6 | 2.47% | motif file (matrix) | svg | |
223 | NRF1(NRF)/MCF7-NRF1-ChIP-Seq(Unpublished)/Homer | 1e-2 | -6.900e+00 | 0.0019 | 125.0 | 2.72% | 912.0 | 2.04% | motif file (matrix) | svg | |
224 | HOXB13(Homeobox)/ProstateTumor-HOXB13-ChIP-Seq(GSE56288)/Homer | 1e-2 | -6.844e+00 | 0.0020 | 937.0 | 20.39% | 8317.2 | 18.60% | motif file (matrix) | svg | |
225 | TFE3(bHLH)/MEF-TFE3-ChIP-Seq(GSE75757)/Homer | 1e-2 | -6.841e+00 | 0.0020 | 98.0 | 2.13% | 687.9 | 1.54% | motif file (matrix) | svg | |
226 | HOXA2(Homeobox)/mES-Hoxa2-ChIP-Seq(Donaldson_et_al.)/Homer | 1e-2 | -6.826e+00 | 0.0021 | 107.0 | 2.33% | 762.8 | 1.71% | motif file (matrix) | svg | |
227 | Mef2d(MADS)/Retina-Mef2d-ChIP-Seq(GSE61391)/Homer | 1e-2 | -6.751e+00 | 0.0022 | 170.0 | 3.70% | 1300.0 | 2.91% | motif file (matrix) | svg | |
228 | Elf4(ETS)/BMDM-Elf4-ChIP-Seq(GSE88699)/Homer | 1e-2 | -6.733e+00 | 0.0022 | 1009.0 | 21.96% | 9003.1 | 20.13% | motif file (matrix) | svg | |
229 | HRE(HSF)/Striatum-HSF1-ChIP-Seq(GSE38000)/Homer | 1e-2 | -6.468e+00 | 0.0029 | 238.0 | 5.18% | 1906.5 | 4.26% | motif file (matrix) | svg | |
230 | ETV1(ETS)/GIST48-ETV1-ChIP-Seq(GSE22441)/Homer | 1e-2 | -6.467e+00 | 0.0029 | 1317.0 | 28.66% | 11943.3 | 26.71% | motif file (matrix) | svg | |
231 | ETS1(ETS)/Jurkat-ETS1-ChIP-Seq(GSE17954)/Homer | 1e-2 | -6.407e+00 | 0.0031 | 1018.0 | 22.15% | 9114.2 | 20.38% | motif file (matrix) | svg | |
232 | Lhx1(Homeobox)/EmbryoCarcinoma-Lhx1-ChIP-Seq(GSE70957)/Homer | 1e-2 | -6.319e+00 | 0.0033 | 976.0 | 21.24% | 8725.2 | 19.51% | motif file (matrix) | svg | |
233 | Cdx2(Homeobox)/mES-Cdx2-ChIP-Seq(GSE14586)/Homer | 1e-2 | -6.311e+00 | 0.0033 | 602.0 | 13.10% | 5228.3 | 11.69% | motif file (matrix) | svg | |
234 | BORIS(Zf)/K562-CTCFL-ChIP-Seq(GSE32465)/Homer | 1e-2 | -6.242e+00 | 0.0036 | 176.0 | 3.83% | 1369.9 | 3.06% | motif file (matrix) | svg | |
235 | ZNF675(Zf)/HEK293-ZNF675.GFP-ChIP-Seq(GSE58341)/Homer | 1e-2 | -6.186e+00 | 0.0037 | 145.0 | 3.16% | 1102.3 | 2.46% | motif file (matrix) | svg | |
236 | PRDM1(Zf)/Hela-PRDM1-ChIP-Seq(GSE31477)/Homer | 1e-2 | -6.151e+00 | 0.0039 | 509.0 | 11.08% | 4379.3 | 9.79% | motif file (matrix) | svg | |
237 | RUNX(Runt)/HPC7-Runx1-ChIP-Seq(GSE22178)/Homer | 1e-2 | -6.004e+00 | 0.0045 | 726.0 | 15.80% | 6403.3 | 14.32% | motif file (matrix) | svg | |
238 | IRF8(IRF)/BMDM-IRF8-ChIP-Seq(GSE77884)/Homer | 1e-2 | -5.995e+00 | 0.0045 | 308.0 | 6.70% | 2553.9 | 5.71% | motif file (matrix) | svg | |
239 | Nkx2.5(Homeobox)/HL1-Nkx2.5.biotin-ChIP-Seq(GSE21529)/Homer | 1e-2 | -5.967e+00 | 0.0046 | 1985.0 | 43.20% | 18402.0 | 41.15% | motif file (matrix) | svg | |
240 | Sox9(HMG)/Limb-SOX9-ChIP-Seq(GSE73225)/Homer | 1e-2 | -5.931e+00 | 0.0047 | 901.0 | 19.61% | 8050.8 | 18.00% | motif file (matrix) | svg | |
241 | Myf5(bHLH)/GM-Myf5-ChIP-Seq(GSE24852)/Homer | 1e-2 | -5.875e+00 | 0.0050 | 716.0 | 15.58% | 6319.0 | 14.13% | motif file (matrix) | svg | |
242 | ZNF317(Zf)/HEK293-ZNF317.GFP-ChIP-Seq(GSE58341)/Homer | 1e-2 | -5.788e+00 | 0.0054 | 99.0 | 2.15% | 721.4 | 1.61% | motif file (matrix) | svg | |
243 | ZNF652/HepG2-ZNF652.Flag-ChIP-Seq(Encode)/Homer | 1e-2 | -5.701e+00 | 0.0059 | 254.0 | 5.53% | 2081.5 | 4.65% | motif file (matrix) | svg | |
244 | GATA(Zf),IR3/iTreg-Gata3-ChIP-Seq(GSE20898)/Homer | 1e-2 | -5.520e+00 | 0.0070 | 140.0 | 3.05% | 1080.8 | 2.42% | motif file (matrix) | svg | |
245 | GATA:SCL(Zf,bHLH)/Ter119-SCL-ChIP-Seq(GSE18720)/Homer | 1e-2 | -5.515e+00 | 0.0070 | 121.0 | 2.63% | 916.1 | 2.05% | motif file (matrix) | svg | |
246 | SCRT1(Zf)/HEK293-SCRT1.eGFP-ChIP-Seq(Encode)/Homer | 1e-2 | -5.506e+00 | 0.0071 | 320.0 | 6.96% | 2686.8 | 6.01% | motif file (matrix) | svg | |
247 | GLIS3(Zf)/Thyroid-Glis3.GFP-ChIP-Seq(GSE103297)/Homer | 1e-2 | -5.499e+00 | 0.0071 | 1233.0 | 26.83% | 11233.4 | 25.12% | motif file (matrix) | svg | |
248 | MyoD(bHLH)/Myotube-MyoD-ChIP-Seq(GSE21614)/Homer | 1e-2 | -5.429e+00 | 0.0076 | 775.0 | 16.87% | 6906.2 | 15.44% | motif file (matrix) | svg | |
249 | STAT6(Stat)/Macrophage-Stat6-ChIP-Seq(GSE38377)/Homer | 1e-2 | -5.323e+00 | 0.0084 | 539.0 | 11.73% | 4711.0 | 10.54% | motif file (matrix) | svg | |
250 | Klf4(Zf)/mES-Klf4-ChIP-Seq(GSE11431)/Homer | 1e-2 | -5.310e+00 | 0.0085 | 360.0 | 7.83% | 3061.5 | 6.85% | motif file (matrix) | svg | |
251 | Atoh1(bHLH)/Cerebellum-Atoh1-ChIP-Seq(GSE22111)/Homer | 1e-2 | -5.298e+00 | 0.0085 | 1075.0 | 23.39% | 9750.5 | 21.80% | motif file (matrix) | svg | |
252 | CHR(?)/Hela-CellCycle-Expression/Homer | 1e-2 | -5.291e+00 | 0.0086 | 614.0 | 13.36% | 5410.9 | 12.10% | motif file (matrix) | svg | |
253 | E2F4(E2F)/K562-E2F4-ChIP-Seq(GSE31477)/Homer | 1e-2 | -5.244e+00 | 0.0089 | 312.0 | 6.79% | 2627.8 | 5.88% | motif file (matrix) | svg | |
254 | Fli1(ETS)/CD8-FLI-ChIP-Seq(GSE20898)/Homer | 1e-2 | -5.241e+00 | 0.0089 | 1064.0 | 23.16% | 9651.0 | 21.58% | motif file (matrix) | svg | |
255 | Tcf12(bHLH)/GM12878-Tcf12-ChIP-Seq(GSE32465)/Homer | 1e-2 | -5.237e+00 | 0.0089 | 949.0 | 20.65% | 8562.8 | 19.15% | motif file (matrix) | svg | |
256 | RUNX1(Runt)/Jurkat-RUNX1-ChIP-Seq(GSE29180)/Homer | 1e-2 | -5.198e+00 | 0.0092 | 1010.0 | 21.98% | 9144.0 | 20.45% | motif file (matrix) | svg | |
257 | IRF1(IRF)/PBMC-IRF1-ChIP-Seq(GSE43036)/Homer | 1e-2 | -5.180e+00 | 0.0094 | 138.0 | 3.00% | 1074.9 | 2.40% | motif file (matrix) | svg | |
258 | Tbr1(T-box)/Cortex-Tbr1-ChIP-Seq(GSE71384)/Homer | 1e-2 | -5.134e+00 | 0.0098 | 1267.0 | 27.57% | 11593.1 | 25.93% | motif file (matrix) | svg | |
259 | HOXA1(Homeobox)/mES-Hoxa1-ChIP-Seq(SRP084292)/Homer | 1e-2 | -5.116e+00 | 0.0099 | 240.0 | 5.22% | 1982.0 | 4.43% | motif file (matrix) | svg | |
260 | PRDM15(Zf)/ESC-Prdm15-ChIP-Seq(GSE73694)/Homer | 1e-2 | -5.075e+00 | 0.0103 | 967.0 | 21.04% | 8747.8 | 19.56% | motif file (matrix) | svg | |
261 | NRF(NRF)/Promoter/Homer | 1e-2 | -5.030e+00 | 0.0107 | 168.0 | 3.66% | 1343.5 | 3.00% | motif file (matrix) | svg | |
262 | Sox3(HMG)/NPC-Sox3-ChIP-Seq(GSE33059)/Homer | 1e-2 | -5.028e+00 | 0.0107 | 1644.0 | 35.78% | 15216.4 | 34.03% | motif file (matrix) | svg | |
263 | GFY(?)/Promoter/Homer | 1e-2 | -4.958e+00 | 0.0114 | 74.0 | 1.61% | 532.5 | 1.19% | motif file (matrix) | svg | |
264 | Ap4(bHLH)/AML-Tfap4-ChIP-Seq(GSE45738)/Homer | 1e-2 | -4.951e+00 | 0.0115 | 1176.0 | 25.59% | 10744.9 | 24.03% | motif file (matrix) | svg | |
265 | Mef2a(MADS)/HL1-Mef2a.biotin-ChIP-Seq(GSE21529)/Homer | 1e-2 | -4.920e+00 | 0.0118 | 357.0 | 7.77% | 3056.2 | 6.83% | motif file (matrix) | svg | |
266 | HEB(bHLH)/mES-Heb-ChIP-Seq(GSE53233)/Homer | 1e-2 | -4.911e+00 | 0.0118 | 1716.0 | 37.34% | 15922.1 | 35.61% | motif file (matrix) | svg | |
267 | Tcfcp2l1(CP2)/mES-Tcfcp2l1-ChIP-Seq(GSE11431)/Homer | 1e-2 | -4.874e+00 | 0.0123 | 162.0 | 3.53% | 1296.0 | 2.90% | motif file (matrix) | svg | |
268 | ZNF519(Zf)/HEK293-ZNF519.GFP-ChIP-Seq(GSE58341)/Homer | 1e-2 | -4.820e+00 | 0.0129 | 191.0 | 4.16% | 1556.2 | 3.48% | motif file (matrix) | svg | |
269 | EBF1(EBF)/Near-E2A-ChIP-Seq(GSE21512)/Homer | 1e-2 | -4.795e+00 | 0.0132 | 954.0 | 20.76% | 8650.9 | 19.35% | motif file (matrix) | svg | |
270 | ZBTB18(Zf)/HEK293-ZBTB18.GFP-ChIP-Seq(GSE58341)/Homer | 1e-2 | -4.760e+00 | 0.0136 | 526.0 | 11.45% | 4629.2 | 10.35% | motif file (matrix) | svg | |
271 | WT1(Zf)/Kidney-WT1-ChIP-Seq(GSE90016)/Homer | 1e-2 | -4.750e+00 | 0.0137 | 530.0 | 11.53% | 4667.7 | 10.44% | motif file (matrix) | svg | |
272 | TCFL2(HMG)/K562-TCF7L2-ChIP-Seq(GSE29196)/Homer | 1e-2 | -4.750e+00 | 0.0137 | 94.0 | 2.05% | 707.6 | 1.58% | motif file (matrix) | svg | |
273 | EWS:ERG-fusion(ETS)/CADO_ES1-EWS:ERG-ChIP-Seq(SRA014231)/Homer | 1e-2 | -4.662e+00 | 0.0148 | 711.0 | 15.47% | 6368.3 | 14.24% | motif file (matrix) | svg | |
274 | MyoG(bHLH)/C2C12-MyoG-ChIP-Seq(GSE36024)/Homer | 1e-2 | -4.632e+00 | 0.0152 | 1042.0 | 22.68% | 9501.2 | 21.25% | motif file (matrix) | svg |
[46]:
homer_results(homer_dict, 'Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K', results='denovo')
[46]:
Homer de novo Motif Results (/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K/)
Known Motif Enrichment Results | Gene Ontology Enrichment Results
If Homer is having trouble matching a motif to a known motif, try copy/pasting the matrix file into STAMP
More information on motif finding results: HOMER | Description of Results | Tips
Total target sequences = 4595
Total background sequences = 44741
* - possible false positive
Rank | P-value | log P-value | % of Targets | % of Background | STD(Bg STD) | Best Match/Details | Motif File |
1 | 1e-1721 | -3.963e+03 | 62.81% | 9.40% | 55.1bp (152.9bp) | NFIL3(bZIP)/HepG2-NFIL3-ChIP-Seq(Encode)/Homer(0.918) More Information | Similar Motifs Found | motif file (matrix) | |
2 | 1e-196 | -4.516e+02 | 22.00% | 7.81% | 123.0bp (154.6bp) | Ddit3::Cebpa/MA0019.1/Jaspar(0.770) More Information | Similar Motifs Found | motif file (matrix) | |
3 | 1e-133 | -3.076e+02 | 38.02% | 21.94% | 116.4bp (147.3bp) | PPARa(NR),DR1/Liver-Ppara-ChIP-Seq(GSE47954)/Homer(0.938) More Information | Similar Motifs Found | motif file (matrix) | |
4 | 1e-85 | -1.964e+02 | 38.48% | 25.29% | 129.6bp (152.7bp) | FOXM1(Forkhead)/MCF7-FOXM1-ChIP-Seq(GSE72977)/Homer(0.922) More Information | Similar Motifs Found | motif file (matrix) | |
5 | 1e-50 | -1.152e+02 | 49.84% | 38.95% | 131.9bp (150.3bp) | NFY(CCAAT)/Promoter/Homer(0.860) More Information | Similar Motifs Found | motif file (matrix) | |
6 | 1e-48 | -1.121e+02 | 27.73% | 18.78% | 135.3bp (146.2bp) | Hnf4a/MA0114.3/Jaspar(0.793) More Information | Similar Motifs Found | motif file (matrix) | |
7 | 1e-46 | -1.080e+02 | 46.79% | 36.36% | 134.5bp (150.6bp) | HIC1(Zf)/Treg-ZBTB29-ChIP-Seq(GSE99889)/Homer(0.760) More Information | Similar Motifs Found | motif file (matrix) | |
8 | 1e-46 | -1.061e+02 | 13.56% | 7.43% | 128.4bp (152.5bp) | HNF1b(Homeobox)/PDAC-HNF1B-ChIP-Seq(GSE64557)/Homer(0.893) More Information | Similar Motifs Found | motif file (matrix) | |
9 | 1e-36 | -8.461e+01 | 3.29% | 0.95% | 128.6bp (150.7bp) | Atf4(bZIP)/MEF-Atf4-ChIP-Seq(GSE35681)/Homer(0.838) More Information | Similar Motifs Found | motif file (matrix) | |
10 | 1e-30 | -7.016e+01 | 14.67% | 9.33% | 141.9bp (154.7bp) | Arid5a/MA0602.1/Jaspar(0.788) More Information | Similar Motifs Found | motif file (matrix) | |
11 | 1e-28 | -6.528e+01 | 9.95% | 5.75% | 124.4bp (150.8bp) | USF2/MA0526.2/Jaspar(0.906) More Information | Similar Motifs Found | motif file (matrix) | |
12 | 1e-24 | -5.561e+01 | 10.71% | 6.63% | 135.6bp (145.7bp) | Stat5a::Stat5b/MA0519.1/Jaspar(0.932) More Information | Similar Motifs Found | motif file (matrix) | |
13 | 1e-22 | -5.213e+01 | 4.87% | 2.35% | 131.7bp (147.1bp) | BMYB(HTH)/Hela-BMYB-ChIP-Seq(GSE27030)/Homer(0.629) More Information | Similar Motifs Found | motif file (matrix) | |
14 | 1e-18 | -4.252e+01 | 60.41% | 53.91% | 146.0bp (147.8bp) | FOXA1(Forkhead)/LNCAP-FOXA1-ChIP-Seq(GSE27824)/Homer(0.652) More Information | Similar Motifs Found | motif file (matrix) | |
15 | 1e-16 | -3.718e+01 | 16.69% | 12.48% | 134.3bp (151.3bp) | IRF4(IRF)/GM12878-IRF4-ChIP-Seq(GSE32465)/Homer(0.689) More Information | Similar Motifs Found | motif file (matrix) | |
16 | 1e-15 | -3.603e+01 | 0.50% | 0.05% | 115.6bp (123.1bp) | Hnf6b(Homeobox)/LNCaP-Hnf6b-ChIP-Seq(GSE106305)/Homer(0.616) More Information | Similar Motifs Found | motif file (matrix) | |
17 | 1e-14 | -3.451e+01 | 0.28% | 0.01% | 141.8bp (77.8bp) | Tbet(T-box)/CD8-Tbet-ChIP-Seq(GSE33802)/Homer(0.681) More Information | Similar Motifs Found | motif file (matrix) | |
18 | 1e-12 | -2.929e+01 | 1.33% | 0.44% | 149.2bp (150.5bp) | Atf1/MA0604.1/Jaspar(0.842) More Information | Similar Motifs Found | motif file (matrix) | |
19 * | 1e-10 | -2.488e+01 | 0.48% | 0.08% | 135.2bp (154.5bp) | Creb3l2/MA0608.1/Jaspar(0.689) More Information | Similar Motifs Found | motif file (matrix) | |
20 * | 1e-10 | -2.455e+01 | 0.15% | 0.00% | 149.3bp (10.0bp) | Barx1(Homeobox)/Stomach-Barx1.3xFlag-ChIP-Seq(GSE69483)/Homer(0.643) More Information | Similar Motifs Found | motif file (matrix) | |
21 * | 1e-9 | -2.109e+01 | 0.35% | 0.05% | 130.0bp (168.5bp) | YY2/MA0748.1/Jaspar(0.741) More Information | Similar Motifs Found | motif file (matrix) | |
22 * | 1e-8 | -1.978e+01 | 0.15% | 0.01% | 135.1bp (83.5bp) | Tcf21(bHLH)/ArterySmoothMuscle-Tcf21-ChIP-Seq(GSE61369)/Homer(0.774) More Information | Similar Motifs Found | motif file (matrix) | |
23 * | 1e-6 | -1.520e+01 | 0.39% | 0.09% | 145.0bp (161.8bp) | PB0134.1_Hnf4a_2/Jaspar(0.701) More Information | Similar Motifs Found | motif file (matrix) | |
24 * | 1e-6 | -1.391e+01 | 0.13% | 0.01% | 138.0bp (93.0bp) | Ap4(bHLH)/AML-Tfap4-ChIP-Seq(GSE45738)/Homer(0.602) More Information | Similar Motifs Found | motif file (matrix) | |
25 * | 1e-6 | -1.388e+01 | 0.17% | 0.02% | 136.8bp (104.8bp) | Nobox/MA0125.1/Jaspar(0.705) More Information | Similar Motifs Found | motif file (matrix) | |
26 * | 1e-4 | -1.001e+01 | 0.13% | 0.01% | 121.9bp (131.2bp) | MZF1/MA0056.1/Jaspar(0.700) More Information | Similar Motifs Found | motif file (matrix) | |
27 * | 1e-4 | -9.576e+00 | 0.11% | 0.01% | 152.2bp (71.8bp) | RXR(NR),DR1/3T3L1-RXR-ChIP-Seq(GSE13511)/Homer(0.659) More Information | Similar Motifs Found | motif file (matrix) |
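One way to read the table above is to turn the two percentage columns into a fold enrichment and to convert the log P-value column to base 10. The sketch below uses the first three rows copied from the table. The helper names are hypothetical, and it assumes that Homer reports natural-log P-values, which is consistent with 1e-1721 being paired with -3.963e+03. Fold enrichment is not a statistic reported by Homer; it is only a convenient summary.

```python
import math

# (rank, log P-value, % of targets, % of background), copied from the table above
rows = [
    (1, -3.963e+03, 62.81, 9.40),
    (2, -4.516e+02, 22.00, 7.81),
    (3, -3.076e+02, 38.02, 21.94),
]


def fold_enrichment(pct_targets, pct_background):
    """Ratio of target to background frequency (hypothetical helper)."""
    return pct_targets / pct_background


def ln_to_log10(log_p):
    """Convert a natural-log P-value to log10 (assumes Homer uses ln)."""
    return log_p / math.log(10)


for rank, log_p, pct_t, pct_bg in rows:
    print(rank, round(fold_enrichment(pct_t, pct_bg), 2), round(ln_to_log10(log_p), 1))
```

A motif found in many targets but also in much of the background (for example rank 14, 60.41% versus 53.91%) has a fold enrichment close to 1, even when its P-value is small.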
You can also access the regions in which each enriched motif was found (use known_motif_hits for known motifs and denovo_motif_hits for de novo motifs):
[47]:
homer_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].known_motif_hits['CEBP(bZIP)/ThioMac-CEBPb-ChIP-Seq(GSE21512)/Homer'][0:10]
[47]:
['chr10:89748570-89749071',
'chr10:111335980-111336481',
'chr4:45495781-45496282',
'chr19:30170213-30170714',
'chr10:121129224-121129725',
'chr2:103492434-103492935',
'chr2:26600492-26600993',
'chr4:145280844-145281345',
'chr13:81329746-81330247',
'chr13:96742830-96743331']
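The hits are returned as plain 'chr:start-end' strings, so they can be inspected without any extra dependency. The following is a minimal sketch in plain Python, using three of the region names printed above; parse_region is a hypothetical helper and not part of pycisTarget.

```python
def parse_region(name):
    """Split a 'chr:start-end' region name into (chrom, start, end)."""
    chrom, _, span = name.rpartition(':')
    start, end = span.split('-')
    return chrom, int(start), int(end)


# Region names copied from the output above
hits = [
    'chr10:89748570-89749071',
    'chr10:111335980-111336481',
    'chr4:45495781-45496282',
]

parsed = [parse_region(name) for name in hits]
# Width of each region in base pairs (end minus start)
widths = [end - start for _, start, end in parsed]
print(parsed)
print(widths)
```

All three regions have the same width, as expected for summits extended by a fixed number of base pairs on each side.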
To access cistromes, use known_cistromes for cistromes based on known motifs and denovo_cistromes for cistromes based on de novo motifs:
[52]:
homer_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].denovo_cistromes['Cebpa_(2886r)'][0:10]
[52]:
['chr10:89748570-89749071',
'chr10:111335980-111336481',
'chr8:70544122-70544623',
'chr19:30170213-30170714',
'chr10:121129224-121129725',
'chr2:103492434-103492935',
'chr2:26600492-26600993',
'chr4:145280844-145281345',
'chr1:193289929-193290430',
'chr13:81329746-81330247']
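Because both motif hits and cistromes are lists of region names, they can be compared with ordinary set operations. The sketch below compares only the two ten-region slices printed above (the known CEBP motif hits and the de novo Cebpa cistrome), so the numbers describe those slices and not the full region sets; the variable names are illustrative.

```python
# First ten known-motif hits, copied from the output of cell [47]
known_hits = [
    'chr10:89748570-89749071', 'chr10:111335980-111336481',
    'chr4:45495781-45496282', 'chr19:30170213-30170714',
    'chr10:121129224-121129725', 'chr2:103492434-103492935',
    'chr2:26600492-26600993', 'chr4:145280844-145281345',
    'chr13:81329746-81330247', 'chr13:96742830-96743331',
]
# First ten regions of the de novo cistrome, copied from the output of cell [52]
denovo_cistrome = [
    'chr10:89748570-89749071', 'chr10:111335980-111336481',
    'chr8:70544122-70544623', 'chr19:30170213-30170714',
    'chr10:121129224-121129725', 'chr2:103492434-103492935',
    'chr2:26600492-26600993', 'chr4:145280844-145281345',
    'chr1:193289929-193290430', 'chr13:81329746-81330247',
]

shared = set(known_hits) & set(denovo_cistrome)
only_known = set(known_hits) - set(denovo_cistrome)
only_denovo = set(denovo_cistrome) - set(known_hits)
# Jaccard index: shared regions divided by all distinct regions
jaccard = len(shared) / len(set(known_hits) | set(denovo_cistrome))
print(len(shared), sorted(only_known), sorted(only_denovo), round(jaccard, 2))
```

The same pattern applies to the full lists returned by known_motif_hits, denovo_motif_hits, known_cistromes and denovo_cistromes.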
You can easily export cistromes to a BED file:
[53]:
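# Import only the helper needed here rather than everything in pycistarget.utils
from pycistarget.utils import region_names_to_coordinates
# Select the de novo Cebpa cistrome (a list of 'chr:start-end' region names)
cebpa_cistrome = homer_dict['Cebpa_ERR235722_summits_order_by_score_extended_250bp_top5K'].denovo_cistromes['Cebpa_(2886r)']
cebpa_cistrome_pr = pr.PyRanges(region_names_to_coordinates(cebpa_cistrome))
cebpa_cistrome_pr.to_bed(path='/staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/pycistarget_tutorial/Homer/cebpa_cistrome_example.bed')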
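If pyranges is not available, a minimal three-column BED file can also be produced with the standard library alone. The sketch below is an illustration and not how pycisTarget exports cistromes: regions_to_bed_lines is a hypothetical helper, the coordinates are assumed to be BED-style already (they are written unchanged), and the output goes to an in-memory buffer instead of a real path.

```python
import io


def regions_to_bed_lines(region_names):
    """Turn 'chr:start-end' names into sorted, tab-separated BED lines."""
    records = []
    for name in region_names:
        chrom, _, span = name.rpartition(':')
        start, end = span.split('-')
        records.append((chrom, int(start), int(end)))
    # Sort by chromosome name (as text), then by numeric start and end
    records.sort()
    return [f"{chrom}\t{start}\t{end}" for chrom, start, end in records]


# Three regions copied from the cistrome output above
cistrome = [
    'chr10:89748570-89749071',
    'chr10:111335980-111336481',
    'chr8:70544122-70544623',
]

# Write to an in-memory buffer; replace with open(path, 'w') for a real file
buffer = io.StringIO()
buffer.write("\n".join(regions_to_bed_lines(cistrome)) + "\n")
bed_text = buffer.getvalue()
print(bed_text)
```

Chromosome names are sorted as text here, so 'chr10' comes before 'chr8'; tools that expect a particular chromosome order may need a different sort key.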