The downloadable zip file contains the following files:
Scripts:
MInetworkcalculate.pl
Randomly selects sequences from each year's sequence data and builds a substitution profile for all 312 H3N2 sites: a residue identical to the previous year's gets the value 0, otherwise the value 10. A total of 2,000 repeats were calculated to ensure sufficient statistics.
MI (mutual information) values were calculated with the CLR software.
The main parameters of CLR: bins = 12, spline = 3.
predict.pl
Finds the sites that mutated in the "induction year" and belong to the positive selection sites. Using the network information, computes the number of links among the candidate sites.
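The 0/10 encoding described for MInetworkcalculate.pl can be illustrated with a minimal Python sketch. It is not the Perl script itself: the function name, the input layout (one aligned sequence per year) and the toy sequences are assumptions made for illustration, and the random sampling over 2,000 repeats is left out.

```python
def substitution_profile(seq_by_year, same=0, changed=10):
    """Encode year-to-year residue changes per site.

    seq_by_year: dict mapping year -> aligned protein sequence (equal lengths).
    Returns a dict mapping year -> list of values, one per site, for every
    year except the first, which has no previous year to compare against.
    """
    years = sorted(seq_by_year)
    profile = {}
    for prev, curr in zip(years, years[1:]):
        a, b = seq_by_year[prev], seq_by_year[curr]
        if len(a) != len(b):
            raise ValueError(f"sequences for {prev} and {curr} differ in length")
        # 0 where the residue is unchanged, 10 where it differs.
        profile[curr] = [same if x == y else changed for x, y in zip(a, b)]
    return profile


# Toy 5-site sequences (hypothetical, not real H3N2 data).
toy = {1968: "QKLPG", 1969: "QKLPG", 1970: "QNLPD"}
profile = substitution_profile(toy)
```

With the toy input, the 1970 row marks the second and fifth sites as changed and every other site as unchanged.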
Data files:
H3N2pro
Directory with H3N2 protein sequences from different years
H5N1pro
Directory with H5N1 protein sequences from different years
H3N2allfrequence.txt
Annual frequency of each site in H3N2
H5N1allfrequence.txt
Annual frequency of each site in H5N1
MImatrix2009.list
MI value matrix list (site transition network) calculated by Aracne using the sequence information from 1968 to 2007.
clrmatrix
Converts sequences into a mutation matrix
Residue same as in the previous year: 0
Residue different from the previous year: 10
predict_result_2009.txt
Predicted results from 1972 to 2009. Format: [Fixed_year] [Predict_year] [Residue] [Predict_score]
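A small sketch of how one record in the four-field format above might be read in Python. It assumes the fields are separated by whitespace and that the square brackets are only notation; the sample line is made up and does not come from predict_result_2009.txt.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    fixed_year: int
    predict_year: int
    residue: str      # kept as text, since the exact residue notation is not specified
    score: float


def parse_prediction(line):
    """Parse one whitespace-separated record (assumed layout)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    fixed, predicted, residue, score = fields
    return Prediction(int(fixed), int(predicted), residue, float(score))


# Hypothetical record, for illustration only.
example = parse_prediction("1968 1972 145 3.5")
```

Keeping the residue field as a string avoids guessing whether the file stores a site number or a residue label.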
Software used:
CLR (Context Likelihood of Relatedness)
The CLR algorithm infers regulatory interactions between transcription factors and their targets using a compendium of gene expression profiles.
The main parameters of CLR: bins = 12, spline = 3. In our network, the threshold is 2.
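To show how a threshold of 2 turns pairwise scores into a network, and how links among candidate sites can then be counted, here is a minimal Python sketch. The function names, the site numbers and the scores are hypothetical. Treating a score equal to the threshold as a link, and treating links as undirected, are both assumptions.

```python
from itertools import combinations


def build_network(scores, threshold=2.0):
    """Keep site pairs whose score reaches the threshold.

    scores: dict mapping (site_a, site_b) -> pairwise score.
    Returns a set of undirected edges, each stored as a frozenset.
    """
    edges = set()
    for (a, b), score in scores.items():
        # Assumption: a score equal to the threshold counts as a link.
        if a != b and score >= threshold:
            edges.add(frozenset((a, b)))
    return edges


def count_links(edges, candidates):
    """Count the edges that connect two candidate sites."""
    unique = sorted(set(candidates))
    return sum(1 for a, b in combinations(unique, 2) if frozenset((a, b)) in edges)


# Hypothetical scores between site pairs (not taken from MImatrix2009.list).
toy_scores = {(145, 156): 3.1, (145, 189): 1.2, (156, 189): 2.0, (50, 145): 4.0}
edges = build_network(toy_scores)
links = count_links(edges, [145, 156, 189])
```

In this toy case three pairs pass the threshold, and two of them connect candidate sites.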