| | UPr100 | URef50 | Pfam | Rep-MSA | Seed Seq | Residues |
|---|---|---|---|---|---|---|
YceB (DUF1439) | 2536 | 261 | 333 | 104 | P0AB26 | 1–186 |
DUF4403 (N & C) a | 926 | 144 | 200 | 118 | J3A9I6 | 42–489 |
DUF2140 | 1613 | 258 | 316 | 114 | Q039F2 | 1–205 |
Rv871c (DUF2993) b | 5783 | 749 | 956 | 124 | I6WZH9 | 1–270 |
Takeout | 4377 | 485 | 1258 | 113 | Q9VBV3 | 1–249 |
P47 | 380 | 135 | 47 | 110 | 6EKT_A | 11–437 |
OrfX2 | 18 | 4 | ″ c | 105 | Q6RI02 | 1–750 |
AsmA (N-term: 1–180) | 12,029 | 2825 | 2069 | 184 | P28249 | 1–180 |
Chorein_N (1–115) | 6222 | 2160 | 2810 | 118 | HH cons. d | 1–115 |
TamB (N-term: 1–150) | 12,677 | 3525 | 2410 | 102 e | P39321 | 1–150 |
Mdm31p (131–382) | 1030 | 194 | 747 | 136 | P38880 | 131–382 |
- Columns: “UPr100”: UniProt (all sequences); “URef50”: UniRef (sequences clustered at > 50% identity); “Pfam”: number of sequences in all representative proteomes; “Rep-MSA”: number of sequences in the “representative MSA” produced by HHpred after three iterations of PSI-BLAST. These MSAs were used for all pairwise comparisons.
- a: MSA made to whole proteins, then split into N = 42–287 (n = 113) and C = 288–489 (n = 111).
- b: MSA made to whole proteins, then used either in full or restricted to the section including residues 1–130 of Rv087c.
- c: All OrfX2 sequences are included in the P47 family.
- d: For Chorein_N, the seed sequence was the Tuebingen Toolkit’s consensus: FESLIADFLTKTIGKYIEDLDVNSVSVSLWNGNVQLKNLQVKKDACSAFNLPVIISKGILKTLEVEVPWKSIKTDPFKIKIKGLHIISQPQTVFVFDAEQYDLKKKEHRKEIIDR.
- e: An alternative TamB (N-term) MSA was created with one iteration (n = 104).
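
Splitting a whole-protein MSA by seed residue ranges, as described for DUF4403 (N = 42–287 and 288–489), can be sketched as below. This is a minimal illustration, not the procedure actually used: the function name `split_msa_by_seed_residues`, the dict-based MSA representation, the toy sequences, and the choice to drop rows that are entirely gaps within a segment are all assumptions made here.

```python
def split_msa_by_seed_residues(msa, seed_id, ranges, seed_start=1, gap_chars="-."):
    """Cut an alignment into segments defined by seed residue numbers.

    msa        : dict mapping sequence name -> aligned sequence (equal lengths)
    seed_id    : name of the seed row in `msa`
    ranges     : dict mapping segment label -> (first_residue, last_residue)
    seed_start : residue number of the first non-gap character in the seed row
    (All names here are hypothetical, chosen for illustration.)
    """
    seed = msa[seed_id]

    # Map each seed residue number to its alignment column, skipping gaps.
    column_of = {}
    residue_number = seed_start
    for column, char in enumerate(seed):
        if char not in gap_chars:
            column_of[residue_number] = column
            residue_number += 1

    parts = {}
    for label, (first, last) in ranges.items():
        lo, hi = column_of[first], column_of[last]
        segment = {name: seq[lo:hi + 1] for name, seq in msa.items()}
        # Assumption: rows with no residues in this segment are discarded.
        parts[label] = {
            name: seq
            for name, seq in segment.items()
            if any(c not in gap_chars for c in seq)
        }
    return parts


# Toy alignment (invented sequences, not taken from the table above).
toy_msa = {
    "seed": "MK-TLLVAG",
    "homA": "MKQTL--AG",
    "homB": "-----LVAG",
}
parts = split_msa_by_seed_residues(toy_msa, "seed", {"N": (1, 4), "C": (5, 8)})
for label, rows in parts.items():
    print(label, len(rows), rows)
```

Residue numbers are mapped to alignment columns through the seed row, so gaps in the seed do not shift the cut points. With real data the ranges would be given in seed coordinates, and `seed_start` would be set to the first seed residue present in the alignment.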