There are three parts to the Unicorn pipeline:
1. Unicorn algorithm: produces the integrated network;
2. CliXO: takes the networks as input and infers the ontologies;
3. Unicorn_measurement: automatically measures the performance (recall, precision and F-score).

Here we provide part 1 and part 3. You also need to download part 2 (the CliXO algorithm [1]) from https://github.com/mhk7/clixo_0.3.

First, download all the programs and unzip them under the same workspace.

For these three parts:

1. Unicorn algorithm
%%Unicorn algorithm to produce integrated networks
%inputs:
%   term: the input term index, e.g. GO:xxxxxxx
%   wb: the branch (sub-ontology) the input term belongs to: 1--BP, 2--CC, 3--MF
%outputs:
%   the integrated network and the filtered original networks are produced and stored in the INT_xxxxxxx folder,
%   e.g. Output_0071840_ind_cca_ALL.txt
%demo:
%   Unicorn('GO:0071840',1);
%   After running, please find the files in the INT_0071840 folder in the current workspace

2. CliXO algorithm
%%For this part, please refer to [1] for more details about CliXO
%inputs:
%   The output file name of Unicorn, e.g. Output_0071840_ind_cca_ALL.txt
%   Alpha, e.g. 0.01
%   Beta, e.g. 0.5
%outputs:
%   The ontology structure and annotations, e.g. go_bp_0071840_intermediate_ind_cca_ALL_0.01_0.5.txt
%demo:
%   ./clixo Output_0071840_ind_cca_ALL.txt 0.01 0.5 > go_bp_0071840_intermediate_ind_cca_ALL_0.01_0.5.txt

3. Unicorn_measurement
%%This measurement program processes the result files of the CliXO algorithm.
%%These result files must be stored in the 'results' folder
%%and must follow the naming rules below, e.g. 'go_bp_0071840_intermediate_ind_cca_ALL_0.01_0.5.txt'
%input:
%   term_idx: the index of the GO term, e.g. '0071840' for GO:0071840
%output:
%   the results are stored in Unicorn_measurement/exp_results/0071840_results.mat
%   The struct 'scores' in this '.mat' file contains:
%   'term': term name,
%   'recall': recall matrix,
%   'precision': precision matrix,
%   'f_measure': F-score matrix,
%   'row_pat': labels of rows for the above three matrices,
%   'col_pat': labels of columns for the above three matrices
%demo:
%   Unicorn_measurement('0071840');

Important notes:
To make sure Unicorn and Unicorn_measurement run smoothly, please strictly follow our naming rules and put the files into the correct folders. Sorry for the inconvenience.

a) The output file of CliXO / input file of Unicorn_measurement should be named as follows:
   File name format: go_p1_p2_intermediate_p3_p4_p5_p6.txt
   p1: bp for BP sub-ontology, cc for CC sub-ontology, mf for MF sub-ontology;
   p2: term index, e.g. 0071840 for GO:0071840;
   p3: network name, e.g. ind_cca: integrated network, ori_net1-4: DRYGIN, SMD, YeastNet, BioGRID;
   p4: range, e.g. ALL for whole ontology, KI for training part, Dhh for the leave-out part;
   p5: alpha value, e.g. 0.01;
   p6: beta value, e.g. 0.5.

b) The output file of CliXO / input file of Unicorn_measurement should be placed under Unicorn_measurement/results/.

c) The Unicorn folder and the Unicorn_measurement folder should be placed under the same workspace.

Demos:

1. running the Unicorn algorithm
   change workspace to Unicorn/ and run the Unicorn function as follows:
   Unicorn('GO:1901363',3); % term GO:1901363 is in the MF sub-ontology, i.e. wb = 3
   the output files will then be stored under Unicorn/INT_1901363/

2. running the CliXO algorithm
   ./clixo Output_1901363_ind_cca_ALL.txt 0.01 0.5 > go_mf_1901363_intermediate_ind_cca_ALL_0.01_0.5.txt

3. running the Unicorn_measurement algorithm
   store the output file of CliXO under Unicorn_measurement/results/, then change workspace to Unicorn_measurement/ and run the function as follows:
   Unicorn_measurement('1901363')

Reference:
[1] Kramer M, Dutkowski J, Yu M, et al. Inferring gene ontologies from pairwise similarity data. Bioinformatics, 2014, 30(12): i34-i42.
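
Appendix (illustrative only): the naming rule in Important notes a) is easy to get wrong by hand, so a small shell helper may be useful. The sketch below is not part of Unicorn or CliXO. The function name clixo_output_name is hypothetical, and the only validation it performs is the p1 check against bp, cc and mf, taken from the naming rule.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Unicorn or CliXO).
# Builds a file name of the form go_p1_p2_intermediate_p3_p4_p5_p6.txt.
clixo_output_name() {
    branch=$1   # p1: bp, cc or mf
    term=$2     # p2: term index without the "GO:" prefix, e.g. 1901363
    net=$3      # p3: network name, e.g. ind_cca
    range=$4    # p4: range, e.g. ALL
    alpha=$5    # p5: alpha value passed to CliXO, e.g. 0.01
    beta=$6     # p6: beta value passed to CliXO, e.g. 0.5

    # Reject anything that is not one of the three sub-ontology codes.
    case "$branch" in
        bp|cc|mf) ;;
        *) echo "p1 must be bp, cc or mf" >&2; return 1 ;;
    esac

    printf 'go_%s_%s_intermediate_%s_%s_%s_%s.txt\n' \
        "$branch" "$term" "$net" "$range" "$alpha" "$beta"
}

# Example: prints go_mf_1901363_intermediate_ind_cca_ALL_0.01_0.5.txt
clixo_output_name mf 1901363 ind_cca ALL 0.01 0.5
```

With such a helper, the CliXO output could be redirected to "$(clixo_output_name mf 1901363 ind_cca ALL 0.01 0.5)", which keeps the file name consistent with the alpha and beta values used on the command line.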