Contact: Simon H. Rasmussen
The rank-file should be either an ordered list of IDs (from the most down-regulated to the most up-regulated gene) or a list of pairs, each consisting of an ID and a differential expression metric such as log fold change.

This is an example:
ENST00000375050           -2.744688
ENST00000376423           -2.551617
ENST00000262795           -2.018950
ENST00000252603           -1.963149
ENST00000380607           -1.946855
ENST00000378524           -1.904424
ENST00000296674           -1.862955
ENST00000357115           -1.834129
ENST00000262967           -1.804530
ENST00000331029           -1.803831
ENST00000005905           -1.770742
ENST00000368653           -1.769864
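
As an illustration of the format above, here is a minimal Python sketch that reads rank-file lines in either form. It is not part of cWords: the function names are hypothetical, and it assumes that ID and value are separated by whitespace, as in the example. When every line carries a value, the IDs are sorted from most down- to most up-regulated; otherwise the order of the file is kept.

```python
from typing import Iterable, List, Optional, Tuple

Entry = Tuple[str, Optional[float]]


def parse_rank_lines(lines: Iterable[str]) -> List[Entry]:
    """Parse lines of the form 'ID' or 'ID value' (whitespace-separated, an assumption)."""
    entries: List[Entry] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            entries.append((fields[0], None))  # ID-only line
        elif len(fields) == 2:
            entries.append((fields[0], float(fields[1])))  # ID plus metric
        else:
            raise ValueError(f"expected 'ID' or 'ID value', got: {line!r}")
    return entries


def ranked_ids(entries: List[Entry]) -> List[str]:
    """Return IDs from most down- to most up-regulated.

    If every entry has a value, sort ascending by value;
    otherwise trust the order in which the IDs were listed.
    """
    if entries and all(value is not None for _, value in entries):
        entries = sorted(entries, key=lambda entry: entry[1])
    return [gene_id for gene_id, _ in entries]


example = """\
ENST00000375050           -2.744688
ENST00000376423           -2.551617
ENST00000262795           -2.018950
"""
entries = parse_rank_lines(example.splitlines())
print(ranked_ids(entries))
```

A line with more than two fields raises a ValueError, which makes stray columns visible before the file is submitted.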

For the statistical methods to have power, there should be at least 1000 IDs in a rank-file.
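
If the differential expression values are already available in a script, a rank-file could be produced along these lines. This is only a sketch: the function names are hypothetical, the tab separator and six-decimal formatting are assumptions rather than requirements of cWords, and the threshold simply restates the minimum of 1000 IDs mentioned above.

```python
from typing import Dict, List

MIN_IDS = 1000  # minimum number of IDs recommended in the text above


def build_rank_lines(log_fold_changes: Dict[str, float]) -> List[str]:
    """Format ID/value pairs, ordered from most down- to most up-regulated."""
    ordered = sorted(log_fold_changes.items(), key=lambda item: item[1])
    # Tab separator and six decimals are assumptions made for this sketch.
    return [f"{gene_id}\t{value:.6f}" for gene_id, value in ordered]


def has_enough_ids(lines: List[str], minimum: int = MIN_IDS) -> bool:
    """Check the rank-file size against the recommended minimum."""
    return len(lines) >= minimum


values = {
    "ENST00000376423": -2.551617,
    "ENST00000375050": -2.744688,
}
lines = build_rank_lines(values)
if not has_enough_ids(lines):
    print(f"only {len(lines)} IDs; at least {MIN_IDS} are recommended")
```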

The sequence file should conform to FASTA format. Its IDs should map directly to the IDs in the rank-file, and each ID line should contain exactly one ID and nothing else.
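
The two requirements on ID lines can be checked before uploading. The sketch below is an illustration, not part of cWords; the function name is hypothetical, and it assumes that ID lines start with ">" as in standard FASTA. It reports ID lines that contain anything besides a single ID, and sequence IDs that do not occur in the rank-file.

```python
from typing import Iterable, List, Set, Tuple


def check_fasta_ids(fasta_lines: Iterable[str], rank_ids: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (malformed ID lines, sequence IDs missing from the rank-file)."""
    malformed: List[str] = []
    unmatched: List[str] = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not an ID line
        fields = line[1:].split()
        if len(fields) != 1:
            malformed.append(line.rstrip("\n"))  # empty ID or extra text after it
        elif fields[0] not in rank_ids:
            unmatched.append(fields[0])
    return malformed, unmatched


rank_ids = {"ENST00000375050", "ENST00000376423"}
fasta = [
    ">ENST00000375050",
    "ACGTACGT",
    ">ENST00000376423 extra description",  # more than one token on the ID line
    "TTGACCA",
    ">ENST00000999999",  # made-up ID, absent from rank_ids
    "GGCCTTAA",
]
malformed, unmatched = check_fasta_ids(fasta, rank_ids)
print(malformed)
print(unmatched)
```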

We currently supply sequences for 4 species. All sequences are from Ensembl release 64, and the gene ID mapping files are from release 69 (assemblies: hg19, mm9, dm3, ce).
You can use different types of IDs (in rank-files or sequence files) for the different species.

ID support:
Human: Ensembl gene and transcript IDs, HGNC symbols, RefSeq IDs, UCSC IDs and IPI IDs.
Mouse: Ensembl gene and transcript IDs and MGI symbols.
D. melanogaster: Ensembl gene and transcript IDs and FlyBase symbols.
C. elegans: Ensembl transcript IDs and RefSeq IDs.


More information is available here

Paper:
Simon H Rasmussen, Anders Jacobsen and Anders Krogh. cWords - systematic microRNA regulatory motif discovery from mRNA expression data. Silence (2013), 4:2