Commit fe8a14830f9ee90bc20fbcf09cafff90686bce0d

Authored by Jean-Michel Garant
1 parent 6235bcc9
Exists in stable-0.3 and in 1 other branch master

Documentation

Showing 4 changed files with 252 additions and 67 deletions   Show diff stats
@@ -4,19 +4,63 @@ @@ -4,19 +4,63 @@
4 **G4RNA SCREENER MANUAL** 4 **G4RNA SCREENER MANUAL**
5 ========================= 5 =========================
6 6
  7 +1. [Name](#name)
  8 +2. [Synopsis](#synopsis)
  9 +3. [Screen.py](#screen.py)
  10 + 1. [Usage](#screen-usage)
  11 + 2. [Description](#screen-description)
  12 + 3. [Arguments](#screen-arguments)
  13 + 4. [Fasta Example](#screen-fasta-example)
  14 +4. [Merge.py](#merge.py)
  15 + 1. [Usage](#merge-usage)
  16 + 2. [Description](#merge-description)
  17 + 3. [Arguments](#merge-arguments)
  18 + 4. [Example](#merge-example)
  19 +
7 ## **NAME** 20 ## **NAME**
8 21
9 G4RNA Screener - The nucleic acid screener for RNA G-quadruplexes 22 G4RNA Screener - The nucleic acid screener for RNA G-quadruplexes
10 23
11 ## **SYNOPSIS** 24 ## **SYNOPSIS**
12 25
13 - ./screen.py [-?|--help] [-V|--version] 26 +```
  27 +./screen.py [-h|--help] [-V|--version]
  28 +
  29 +./screen.py <path>[.fa|.fas|.fasta]
  30 + [-a|--ann <path>.pkl]
  31 + [-c|--columns <list>]
  32 + [-w|--window <int>]
  33 + [-s|--step <int>]
  34 + [-b|--bedgraph]
  35 + [-e|--error]
  36 + [-v|--verbose]
  37 +```
  38 +
  39 +```
  40 +./merge.py [-h|--help] [-V|--version]
  41 +
  42 +./merge.py <path>[.tsv|.csv|.txt]
  43 + [--cGcC [<float>]]
  44 + [--G4H [<float>]]
  45 + [--G4NN [<float>]]
  46 + [-w|--window <int>]
  47 + [-s|--step <int>]
  48 + [-a|--aggregate <str>]
  49 +```
  50 +
  51 +> Combination using pipeline
  52 +> ```./screen.py <path>.fa | ./merge.py -```
  53 +> example:
  54 +> ```./screen.py example.fasta | ./merge.py - --G4NN > example.tsv```
  55 +
  56 +**SCREEN.PY**
  57 +=============
14 58
15 - ./screen.py [-a|--ann <path>.pkl] [-c|--columns <list>] [-w|--window <int>]  
16 - [-s|--step <int>] [-b|--bedgraph]  
17 - [-f|--fasta <path>.fas] [-e|--error] [-v|--verbose] 59 +## **SCREN USAGE**
18 60
19 -## **DESCRIPTION** 61 + screen.py [-h] FASTA [-a ANN] [-w INT] [-s INT] [-c [...]] [-b] [-v] [-V] [-e]
  62 +
  63 +## **SCREEN DESCRIPTION**
20 64
21 Score nucleic acid using an artificial neural network classifier (G4NN) that was 65 Score nucleic acid using an artificial neural network classifier (G4NN) that was
22 trained on sequences found in the [G4RNA database](http://scottgroup.med. 66 trained on sequences found in the [G4RNA database](http://scottgroup.med.
@@ -24,27 +68,23 @@ usherbrooke.ca/G4RNA/). It also provides the previously described: @@ -24,27 +68,23 @@ usherbrooke.ca/G4RNA/). It also provides the previously described:
24 [G4Hunter score](https://www.ncbi.nlm.nih.gov/pubmed/26792894) and 68 [G4Hunter score](https://www.ncbi.nlm.nih.gov/pubmed/26792894) and
25 [cG/cC score](https://www.ncbi.nlm.nih.gov/pubmed/24121682) if specified. 69 [cG/cC score](https://www.ncbi.nlm.nih.gov/pubmed/24121682) if specified.
26 70
27 -## **OPTIONS** 71 +## **SCREEN ARGUMENTS**
28 72
29 -**_-?, --help_** 73 +**_FASTA Positional Argument_** = - (default STDIN)
30 74
31 -> Show usage and exit. 75 +> Path to the fasta file to analyze. Will support string value as long as it
  76 +respects the fasta format. Use "-" to feed standard input.
32 77
33 -**_-V, --version_** 78 +**_-h, --help_**
34 79
35 -> Show version information and exit. 80 +> Show usage and exit.
36 81
37 **_-a, --ann_** = G4RNA_2016-11-07.pkl 82 **_-a, --ann_** = G4RNA_2016-11-07.pkl
38 83
39 > Path to the pickled ANN (.pkl) which will provide the program a particular 84 > Path to the pickled ANN (.pkl) which will provide the program a particular
40 pattern to evaluate each sequences or windows of sequences. 85 pattern to evaluate each sequences or windows of sequences.
41 86
42 -**_-f, --fasta_** = STDIN  
43 -  
44 -> Path to the fasta file to analyze. Will support string value as long as it  
45 -respects the fasta format. Use "STDIN" to feed standard input to -f argument.  
46 -  
47 -**_-w, --window_** 87 +**_-w, --window_** = 60
48 88
49 > Length of the sliding window that is used to analyze long sequences. 89 > Length of the sliding window that is used to analyze long sequences.
50 90
@@ -108,7 +148,19 @@ gene,description,chromosome,strand,... @@ -108,7 +148,19 @@ gene,description,chromosome,strand,...
108 > + G4NN: _Score obtained through G4NN (Must be specified when 148 > + G4NN: _Score obtained through G4NN (Must be specified when
109 enumerating columns in list)_ 149 enumerating columns in list)_
110 150
111 -###### Fasta example 151 +**_-v, --verbose_**
  152 +
  153 +> Rudimental verbose
  154 +
  155 +**_-V, --version_**
  156 +
  157 +> Show version information and exit.
  158 +
  159 +**_-e, --error_**
  160 +
  161 +> Raise and display error and warnings
  162 +
  163 +###### Screen Fasta Examples
112 164
113 Any fasta file can support columns: description,length,sequence,cGcC,G4H,G4NN 165 Any fasta file can support columns: description,length,sequence,cGcC,G4H,G4NN
114 ```html 166 ```html
@@ -151,3 +203,84 @@ tgacggctattatcgcatgatcGGCGGCGGCGGGGcggcggcgatatcgt @@ -151,3 +203,84 @@ tgacggctattatcgcatgatcGGCGGCGGCGGGGcggcggcgatatcgt
151 >ENST00000380152 chromosome:GRCh38:11:106073501:106100710:1 203 >ENST00000380152 chromosome:GRCh38:11:106073501:106100710:1
152 ACGTTTACGGCTATTCGTATGCTTGTGCTATTGCATGTACTG 204 ACGTTTACGGCTATTCGTATGCTTGTGCTATTGCATGTACTG
153 ``` 205 ```
  206 +
  207 +**MERGE.PY**
  208 +============
  209 +
  210 +## **MERGE USAGE**
  211 +
  212 + merge.py [-h] TSV [--cGcC [FLOAT]] [--G4H [FLOAT]] [--G4NN [FLOAT]] [-w INT] [-s INT] [-a STR]
  213 +
  214 +## **MERGE DESCRIPTION**
  215 +
  216 +Transforms the tabular output of `screen.py` in order to merge overlapping
  217 +regions that are scored above threshold(s). The omission of using --cGcC, --G4H
  218 +and --G4NN argument wil merge all overlapping windows back into the original
  219 +sequences. The -w,--window and -s,--step arguments must be the same used for
  220 +the `screen.py`.
  221 +
  222 +## **MERGE ARGUMENTS**
  223 +
  224 +**_TSV Positional Argument_** = - (default STDIN)
  225 +
  226 +> Path to the tabular separated values file to analyze. Use "-" to feed
  227 +standard input.
  228 +
  229 +**_-h, --help_**
  230 +
  231 +> Show usage and exit.
  232 +
  233 +**_--cGcC_** = 4.5
  234 +
  235 +> Use consecutive G over consecutive C score to determine positive windows
  236 +above threshold and merge them.
  237 +
  238 +**_--G4H_** = 0.9
  239 +
  240 +> Use G4Hunter score to determine positive windows above threshold and merge
  241 +them.
  242 +
  243 +**_--G4NN_** = 0.5
  244 +
  245 +> Use G4 neural network score to determine positive windows above threshold and
  246 +merge them.
  247 +
  248 +**_-w, --window_** = 60
  249 +
  250 +> Length of the sliding window that was used to analyze long sequences.
  251 +
  252 +**_-a, --aggregation_** = list (only available with python 2.7.15rc1 +)
  253 +
  254 +> Scores of merged windows will be aggregated using an aggregation function:
  255 +(choose from 'max', 'min', 'median', 'mean', 'std', 'sem')
  256 +
  257 +**_-s, --step_** = 10
  258 +
  259 +> Step size that defines the overlap between the windows sequences.
  260 +
  261 +**_-v, --verbose_**
  262 +
  263 +> Show version information and exit.
  264 +
  265 +**_-e, --error_**
  266 +
  267 +> Raise and display error and warnings
  268 +
  269 +
  270 +###### Merge Examples
  271 +
  272 +`screen.py` output
  273 +```
  274 +./screen.py spinach.fas -w 60 -s 10
  275 + description sequence start cGcC G4H G4NN
  276 +1 Spinach aptamer (false negative example) GGACGCGACCGAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGU 1 2.1875 0.3 0.209915966961
  277 +2 Spinach aptamer (false negative example) GAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCG 11 2.2 0.283333333333 0.146798576587
  278 +3 Spinach aptamer (false negative example) AGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCG 21 1.88235294118 0.233333333333 0.154917456609
  279 +4 Spinach aptamer (false negative example) CAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCGCGUC 31 1.26666666667 0.0555555555556 0.118000916901
  280 +```
  281 +
  282 +`screen.py | merge.py` output
  283 +```
  284 +./screen.py spinach.fas -w 60 -s 10 | ./merge.py - -w 60 -s 10
  285 +1 Spinach aptamer (false negative example) GGACGCGACCGAAAUGGUGAAGGACGGGUCCAGUGCGAAACACGCACUGUUGAGUAGAGUGUGAGCUCCGUAACUGGUCGCGUC 1 [2.1875, 2.2, 1.88235294118, 1.2666666666700002] [0.3, 0.283333333333, 0.233333333333, 0.0555555555556] [0.209915966961, 0.14679857658700002, 0.154917456609, 0.118000916901]
  286 +```
@@ -34,7 +34,7 @@ library_name (recommended version) @@ -34,7 +34,7 @@ library_name (recommended version)
34 ##### From environment 34 ##### From environment
35 35
36 ```bash 36 ```bash
37 -python2.7 (2.7.12) 37 +python2.7 (2.7.15rc1)
38 ``` 38 ```
39 39
40 ##### From Python 40 ##### From Python
@@ -34,6 +34,8 @@ class float_range(object): @@ -34,6 +34,8 @@ class float_range(object):
34 Object that defines a range of float that is authorized or returns the 34 Object that defines a range of float that is authorized or returns the
35 authorized range in error. Used to validate score threshold input by user. 35 authorized range in error. Used to validate score threshold input by user.
36 """ 36 """
  37 + # float range is a function that is needed to have any threshold of float in
  38 + # supported range of scores
37 def __init__(self, start, end): 39 def __init__(self, start, end):
38 self.start = start 40 self.start = start
39 self.end = end 41 self.end = end
@@ -102,12 +104,14 @@ def merge_g4rna(df, window=60, step=10, @@ -102,12 +104,14 @@ def merge_g4rna(df, window=60, step=10,
102 score_aggregation=list): 104 score_aggregation=list):
103 """ 105 """
104 """ 106 """
  107 + # filter windows with users thresholds
105 if cGcC: 108 if cGcC:
106 df = df[ df.cGcC >= cGcC ].dropna() 109 df = df[ df.cGcC >= cGcC ].dropna()
107 if G4H: 110 if G4H:
108 df = df[ df.G4H >= G4H ].dropna() 111 df = df[ df.G4H >= G4H ].dropna()
109 if G4NN: 112 if G4NN:
110 df = df[ df.G4NN >= G4NN ].dropna() 113 df = df[ df.G4NN >= G4NN ].dropna()
  114 + # aggregate functions are defined in this dictionnary
111 agg_fct = { 115 agg_fct = {
112 # 'description':"".join, 116 # 'description':"".join,
113 'gene_symbol':'max', 117 'gene_symbol':'max',
@@ -134,25 +138,32 @@ def merge_g4rna(df, window=60, step=10, @@ -134,25 +138,32 @@ def merge_g4rna(df, window=60, step=10,
134 # overlap is the length that sequential windows should share 138 # overlap is the length that sequential windows should share
135 overlap = window-step 139 overlap = window-step
136 pd.set_option('display.max_colwidth', -1) 140 pd.set_option('display.max_colwidth', -1)
  141 + # the merge is dependant on the sequence columns
137 if 'sequence' in df.columns: 142 if 'sequence' in df.columns:
138 - for ite in range(0,len(df)):  
139 - print any(df.sequence.str[:(overlap+step*ite)].eq(  
140 - df.sequence.str[step:].shift(1))) 143 + for ite in range(0,len(df)):#max iterations is dataframe length
  144 + # check if any windows still needs some merging
141 if any(df.sequence.str[:(overlap+step*ite)].eq( 145 if any(df.sequence.str[:(overlap+step*ite)].eq(
142 df.sequence.str[step:].shift(1))) is False: 146 df.sequence.str[step:].shift(1))) is False:
143 break 147 break
  148 + # merge overlapping window by appending the [step] first
  149 + # nucleotides to the next window
144 df.loc[ 150 df.loc[
145 df.sequence.str[:(overlap+step*ite)].eq( 151 df.sequence.str[:(overlap+step*ite)].eq(
146 df.sequence.str[step:].shift(1)) 152 df.sequence.str[step:].shift(1))
147 , 'sequence'] = df.sequence.str[:step].shift(1) + \ 153 , 'sequence'] = df.sequence.str[:step].shift(1) + \
148 df.sequence.str[:] 154 df.sequence.str[:]
  155 + # description column is used when available which should prevent the
  156 + # eventual problems caused by a user that submits two sequences that
  157 + # share some subsequences
149 if 'description' in df.columns: 158 if 'description' in df.columns:
150 df_grouped = df.groupby( 159 df_grouped = df.groupby(
151 [df.description,df.sequence.str[:overlap]], 160 [df.description,df.sequence.str[:overlap]],
152 sort=False, 161 sort=False,
153 as_index=False) 162 as_index=False)
154 - print "******",{k:agg_fct[k] for k in df.columns.drop(['description'])}  
155 - return df_grouped.agg({k:agg_fct[k] for k in df.columns.drop(['description'])}).reindex(columns=df.columns) 163 + # grouping the resulting windows using the first [:overlap] nts
  164 + return df_grouped.agg(
  165 + {k:agg_fct[k] for k in df.columns.drop(['description'])}
  166 + ).reindex(columns=df.columns)
156 else: 167 else:
157 df_grouped = df.groupby( 168 df_grouped = df.groupby(
158 df.sequence.str[-overlap:], 169 df.sequence.str[-overlap:],
@@ -211,6 +222,18 @@ def arguments(): @@ -211,6 +222,18 @@ def arguments():
211 help="Use G4NN score threshold to determine positive windows "\ 222 help="Use G4NN score threshold to determine positive windows "\
212 "(default: 0.5)", 223 "(default: 0.5)",
213 metavar="FLOAT") 224 metavar="FLOAT")
  225 + # windows length
  226 + parser.add_argument("-w", "--window",
  227 + type=int,
  228 + default=60,
  229 + help="Windows length of input",
  230 + metavar="INT")
  231 + # step length
  232 + parser.add_argument("-s","--step",
  233 + type=int,
  234 + default=10,
  235 + help="Step lenght of input",
  236 + metavar="INT")
214 # aggregation function for scores 237 # aggregation function for scores
215 parser.add_argument("-a", "--aggregation", 238 parser.add_argument("-a", "--aggregation",
216 #nargs="+", 239 #nargs="+",
@@ -219,8 +242,19 @@ def arguments(): @@ -219,8 +242,19 @@ def arguments():
219 help="Aggregation function to pool scores of merged windows "\ 242 help="Aggregation function to pool scores of merged windows "\
220 "(default: list)", 243 "(default: list)",
221 metavar="STR") 244 metavar="STR")
  245 + parser.add_argument("-V", "--version",
  246 + action="version",
  247 + version="%(prog)s 0.3")
  248 + # useful for debug, not meant for users
  249 + parser.add_argument("-e", "--error",
  250 + action="store_true",
  251 + default=False,
  252 + help="Raise errors and exceptions")
  253 + # needed to to have default help display
  254 + if len(sys.argv[1:])==0:
  255 + parser.print_help()
  256 + parser.exit()
222 args = parser.parse_args() 257 args = parser.parse_args()
223 -# args.aggregation = [ aggr if ( aggr != 'list' ) else list for aggr in args.aggregation ]  
224 if args.cGcC == None: 258 if args.cGcC == None:
225 args.cGcC = 4.5 259 args.cGcC = 4.5
226 if args.G4H == None: 260 if args.G4H == None:
@@ -232,11 +266,21 @@ def arguments(): @@ -232,11 +266,21 @@ def arguments():
232 def main(): 266 def main():
233 args = arguments() 267 args = arguments()
234 g4rna_frame = pd.read_csv(args.tsv, sep='\t', index_col=0) 268 g4rna_frame = pd.read_csv(args.tsv, sep='\t', index_col=0)
235 - merge_g4rna(g4rna_frame,  
236 - 60, 10,  
237 - args.cGcC, args.G4H, args.G4NN,  
238 - args.aggregation).to_csv(  
239 - path_or_buf=sys.stdout, sep='\t') 269 + try:
  270 + merge_g4rna(g4rna_frame,
  271 + args.window, args.step,
  272 + args.cGcC, args.G4H, args.G4NN,
  273 + args.aggregation).to_csv(
  274 + path_or_buf=sys.stdout, sep='\t')
  275 + except:
  276 + # raise python error calls if -e --error is used
  277 + if args.error:
  278 + raise
  279 + # custom error message
  280 + else:
  281 + parser.print_usage()
  282 + sys.stderr.write(parser.prog+': error: '\
  283 + 'An option is missing, incorrect or not authorized\n')
240 284
241 if __name__ == '__main__': 285 if __name__ == '__main__':
242 main() 286 main()
@@ -239,7 +239,7 @@ def arguments(): @@ -239,7 +239,7 @@ def arguments():
239 parser.add_argument("-a", "--ann", 239 parser.add_argument("-a", "--ann",
240 type=argparse.FileType('r'), 240 type=argparse.FileType('r'),
241 default=os.path.dirname(__file__)+"/G4RNA_2016-11-07.pkl", 241 default=os.path.dirname(__file__)+"/G4RNA_2016-11-07.pkl",
242 - help="Supply a picled ANN (default: G4RNA_2016-11-07.pkl)") 242 + help="Supply a pickled ANN (default: G4RNA_2016-11-07.pkl)")
243 # length of segmentation of long sequences into analysis windows 243 # length of segmentation of long sequences into analysis windows
244 parser.add_argument("-w", "--window", 244 parser.add_argument("-w", "--window",
245 type=int, 245 type=int,
@@ -253,13 +253,6 @@ def arguments(): @@ -253,13 +253,6 @@ def arguments():
253 default=10, 253 default=10,
254 help="Step length between windows (default: 10)", 254 help="Step length between windows (default: 10)",
255 metavar="INT") 255 metavar="INT")
256 - # bedgraph will generate the required header, compatible UCSC genome browser  
257 - parser.add_argument("-b", "--bedgraph",  
258 - action="store_true",  
259 - default=False,  
260 - help="Display output as BedGraph, user must provides columns")  
261 - ## TODO use choices of three scores as bedgraph options which will  
262 - ## select columns for the user, must include verifications  
263 # columns to generate are provided by a space delimited list 256 # columns to generate are provided by a space delimited list
264 parser.add_argument("-c", "--columns", 257 parser.add_argument("-c", "--columns",
265 nargs="+", 258 nargs="+",
@@ -292,16 +285,61 @@ def arguments(): @@ -292,16 +285,61 @@ def arguments():
292 "cGcC G4H G4NN). "\ 285 "cGcC G4H G4NN). "\
293 "To browse available columns use: -c list", 286 "To browse available columns use: -c list",
294 metavar="") 287 metavar="")
  288 + # bedgraph will generate the required header, compatible UCSC genome browser
  289 + parser.add_argument("-b", "--bedgraph",
  290 + action="store_true",
  291 + default=False,
  292 + help="Display output as BedGraph, user must provides columns")
  293 + ## TODO use choices of three scores as bedgraph options which will
  294 + ## select columns for the user, must include verifications
295 # verbose option is very rudimental 295 # verbose option is very rudimental
296 parser.add_argument("-v", "--verbose", 296 parser.add_argument("-v", "--verbose",
297 action="store_true", 297 action="store_true",
298 default=False, 298 default=False,
299 help="Verbose output with operations when completed") 299 help="Verbose output with operations when completed")
  300 + parser.add_argument("-V", "--version",
  301 + action="version",
  302 + version="%(prog)s 0.3")
300 # useful for debug, not meant for users 303 # useful for debug, not meant for users
301 parser.add_argument("-e", "--error", 304 parser.add_argument("-e", "--error",
302 action="store_true", 305 action="store_true",
303 default=False, 306 default=False,
304 help="Raise errors and exceptions") 307 help="Raise errors and exceptions")
  308 + # needed to to have default help display
  309 + if len(sys.argv[1:])==0:
  310 + parser.print_help()
  311 + parser.exit()
  312 + # adapted help for columns listing
  313 + if "list" in sys.argv:
  314 + splitted_help = parser.format_help().split(
  315 + " Columns to display (default: description sequence\n start cGcC G4H G4NN). To browse available columns use:\n -c list")
  316 + sys.stdout.write("\n\t".join([splitted_help[0],
  317 + "Available columns:",
  318 + "all \t\tAll of the following except description\n",
  319 + "description\t\tDescription as available in fasta (Default)",
  320 + "gene_symbol\t\tGene symbol",
  321 + "mrnaAcc \t\tRefSeq mRNA accession number",
  322 + "protAcc \t\tRefseq protein accession number",
  323 + "gene_stable_id\t\tEnsembl gene stable ID",
  324 + "transcript_stable_id\tEnsembl transcript stable ID",
  325 + "full_name \t\tGene full name (From HGNC)",
  326 + "HGNC_id \t\tHGNC numeric ID",
  327 + "identifier \t\tIdentifier",
  328 + "source \t\tSource of the data",
  329 + "genome_assembly\t\tGenome build version",
  330 + "chromosome \t\tChromosome",
  331 + "start \t\tStart position (Default)",
  332 + "end \t\tEnd position",
  333 + "strand \t\tCoding strand",
  334 + "range \t\tInitial chromosomic range",
  335 + "length \t\tLength of sequence analyzed",
  336 + "sequence \t\tSequence analyzed (Default)",
  337 + "cGcC \t\tcGcC score (Default)",
  338 + "G4H \t\tG4Hunter score (Default)",
  339 + "G4NN \t\tG4NN score of similitude (Default)",
  340 + " \t\t(must be specified to use ANN)",
  341 + splitted_help[1]]))
  342 + sys.exit()
305 return parser 343 return parser
306 344
307 def main(): 345 def main():
@@ -311,36 +349,6 @@ def main(): @@ -311,36 +349,6 @@ def main():
311 parser = arguments() 349 parser = arguments()
312 args = parser.parse_args() 350 args = parser.parse_args()
313 # custom help message to list columns choices 351 # custom help message to list columns choices
314 - if args.columns == ["list"]:  
315 - splitted_help = parser.format_help().split(  
316 - " Columns to display (default: description sequence\n start cGcC G4H G4NN). To browse available columns use:\n -c list")  
317 - print("\n\t".join([splitted_help[0],  
318 - "Available columns:",  
319 - "description\t\tDescription as available in fasta (Default)",  
320 - "all \t\tAll of the following except description\n",  
321 - "gene_symbol\t\tGene symbol",  
322 - "mrnaAcc \t\tRefSeq mRNA accession number",  
323 - "protAcc \t\tRefseq protein accession number",  
324 - "gene_stable_id\tEnsembl gene stable ID",  
325 - "transcript_stable_id\tEnsembl transcript stable ID",  
326 - "full_name \t\tGene full name (From HGNC)",  
327 - "HGNC_id \t\tHGNC numeric ID",  
328 - "identifier \t\tIdentifier",  
329 - "source \t\tSource of the data",  
330 - "genome_assembly\tGenome build version",  
331 - "chromosome \t\tChromosome",  
332 - "start \t\tStart position",  
333 - "end \t\tEnd position",  
334 - "strand \t\tCoding strand",  
335 - "range \t\tInitial chromosomic range",  
336 - "length \t\tLength of sequence analyzed",  
337 - "sequence \t\tSequence analyzed",  
338 - "cGcC \t\tcGcC score",  
339 - "G4H \t\tG4Hunter score",  
340 - "G4NN \t\tG4NN score of similitude",  
341 - " \t\t(must be specified to use ANN)",  
342 - splitted_help[1]]))  
343 - sys.exit()  
344 # restrictive verifications for bedgraph options 352 # restrictive verifications for bedgraph options
345 if args.bedgraph and ( 353 if args.bedgraph and (
346 len(args.columns) != 4\ 354 len(args.columns) != 4\