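While processing protein-protein interaction data downloaded from string-db, I found that each interaction between proteins A and B is recorded twice, once in each direction (AB and BA).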
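Also, strings are compared character by character: whichever string has the larger ASCII code at the first position is the larger string, and later characters are looked at only when all earlier ones are equal.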
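The comparison rule can be checked directly. The snippet below is a minimal sketch in Python, which compares strings by code point (the same as ASCII order for ASCII text); the sample strings and the helper name `ordered_pair` are made up for illustration and are not taken from the STRING data.

```python
# Python compares str values position by position, by code point.
first_char_decides = "B1" > "A9"       # 'B' > 'A', so '1' vs '9' is never examined
tie_moves_on = "ENSP1" < "ENSP2"       # first four characters tie, so '1' vs '2' decides
prefix_is_smaller = "AB" < "ABC"       # a proper prefix sorts before the longer string
upper_before_lower = "Z" < "a"         # ASCII 90 < ASCII 97


def ordered_pair(a: str, b: str) -> tuple:
    """Return the two identifiers with the smaller string first (hypothetical helper)."""
    return (a, b) if a <= b else (b, a)


# Both directions of the same interaction map to the same ordered pair.
same_key = ordered_pair("A", "B") == ordered_pair("B", "A")
```

Because the order is total and deterministic, putting the smaller identifier first gives AB and BA the same representation.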
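For each row, put the two protein identifiers in sorted order and join them into a new column, temp, so that AB and BA produce the same value.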
Then drop the rows whose temp value is duplicated, and each interaction is left only once.
Finally, delete the temp column.
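The steps above could be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the original code: the column names `protein1`, `protein2` and `combined_score`, the `|` separator (assumed never to occur inside an identifier), the function name `dedupe_pairs`, and the sample rows are all chosen for the example.

```python
def dedupe_pairs(rows):
    """Keep the first row seen for each unordered (protein1, protein2) pair."""
    seen = set()
    unique = []
    for row in rows:
        row = dict(row)  # copy, so the caller's rows are not modified
        # Build the temp key: identifiers in sorted order, joined by "|".
        row["temp"] = "|".join(sorted((row["protein1"], row["protein2"])))
        if row["temp"] in seen:
            continue  # the same pair was already kept in the other direction
        seen.add(row["temp"])
        unique.append(row)
    # Delete the temp column once deduplication is done.
    for row in unique:
        del row["temp"]
    return unique


# Made-up sample data: A-B appears in both directions, A-C only once.
rows = [
    {"protein1": "A", "protein2": "B", "combined_score": 900},
    {"protein1": "B", "protein2": "A", "combined_score": 900},
    {"protein1": "A", "protein2": "C", "combined_score": 700},
]
unique_rows = dedupe_pairs(rows)
```

With a pandas DataFrame the same idea would be to add the temp column, call `drop_duplicates(subset="temp")`, and then `drop(columns="temp")`.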
STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they stem from computational prediction, from knowledge transfer between organisms, and from interactions aggregated from other (primary) databases.

Data Sources: Interactions in STRING are derived from five main sources: Genomic Context Predictions, High-throughput Lab Experiments, (Conserved) Co-Expression, Automated Textmining, and Previous Knowledge in Databases.

Coverage: The STRING database currently covers 67'592'464 proteins from 14'094 organisms.