A New Family of Predicted Krüppel-Like Factor Genes and Pseudogenes in Placental Mammals

  • Jimin Pei

    [email protected]

    Affiliation Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America

  • Nick V. Grishin

    Affiliations Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America, Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America

Abstract

Krüppel-like factors (KLF) and specificity proteins (SP) constitute a family of zinc-finger-containing transcription factors that play important roles in a wide range of processes including differentiation and development of various tissues. The human genome possesses 17 KLF genes (KLF1–KLF17) and nine SP genes (SP1–SP9) with diverse functions. We used sequence similarity searches and gene synteny analysis to identify a new putative KLF gene/pseudogene named KLF18 that is present in most of the placental mammals with sequenced genomes. KLF18 is a chromosomal neighbor of the KLF17 gene and is likely a product of its duplication. Phylogenetic analyses revealed that mammalian predicted KLF18 proteins and KLF17 proteins experienced elevated rates of evolution and are grouped with KLF1/KLF2/KLF4 and non-mammalian KLF17. Predicted KLF18 proteins maintain conserved features in the zinc fingers of the SP/KLF family, while possessing repeats of a unique sequence motif in their N-terminal regions. No expression data have been reported for KLF18, suggesting that it either has highly restricted expression patterns and specialized functions, or could have become a pseudogene in extant placental mammals. Besides KLF18 genes/pseudogenes, we identified several KLF18-like genes such as Zfp352, Zfp352-like, and Zfp353 in the genomes of mouse and rat. These KLF18-like genes do not possess introns inside their coding regions, and gene expression data indicate that some of them may function in early embryonic development. They represent further expansions of KLF members in the murine lineage, most likely resulting from several events of retrotransposition and local gene duplication starting from an ancient spliced mRNA of KLF18.

Introduction

Krüppel-like factors (KLF) and specificity proteins (SP) are an important family of transcription factors (SP/KLF family) under extensive research [1–3]. They possess three DNA-binding C2H2-type zinc finger domains, each of which contains two conserved cysteines and two conserved histidines for zinc binding. The three zinc finger domains and the linkers between them are well conserved in the SP/KLF family, with a cysteine-histidine pattern of “CX4CX12HX3HX7CX4CX12HX3HX7CX2CX12HX3H” (Xn: separation of n residues). The separations between the first, second, and third cysteine pairs are four residues, four residues, and two residues, respectively. Such a pattern together with the number of C2H2 domains (three) appears to be a unique feature of SP/KLF members in mammalian genomes compared to the patterns of other known C2H2-domain-containing proteins (based on a survey of human and mouse C2H2-domain-containing proteins from the SysZNF database [4]). For example, the EGR2 protein has three zinc fingers but exhibits a different set of residue separations between cysteine pairs (4, 2, 2 residue separations compared to 4, 4, 2 residue separations in SP/KLF members). Wilms’ tumor protein possesses four C2H2 domains: three of them share the same pattern as that of SP/KLF proteins (4, 4, 2 residue separations between cysteine pairs), and a C-terminal fourth C2H2 domain has a four-residue separation between the cysteines. The SP/KLF family proteins mostly recognize and bind GC-rich regions such as GC boxes and GT boxes (CACCC boxes). The structure of KLF4 zinc finger domains in complex with DNA [5] revealed conserved residues responsible for specific DNA interactions. Among them are three invariant arginines that use their guanidinium groups to form critical hydrogen bonds with three guanine bases and contribute the most to the DNA-binding specificity of KLFs. In contrast to the high sequence conservation in zinc fingers, the N-terminal regions of KLFs exhibit great sequence variation [3,6]. These regions contain short sequence motifs that mediate the interaction between KLFs and other proteins such as transcription coactivators and corepressors.
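The cysteine-histidine spacing described above can be written as a regular expression. The following is a minimal sketch in Python, not the authors' procedure: the helper names (finger, synthetic) are hypothetical, and the test sequences are artificial strings built only to have the stated spacings, not real proteins.

```python
import re

def finger(cc_gap):
    """Regex for one C2H2 finger: C-X(cc_gap)-C-X12-H-X3-H."""
    return f"C.{{{cc_gap}}}C.{{12}}H.{{3}}H"

LINKER = ".{7}"  # the X7 linker between consecutive fingers

# Cysteine-pair separations quoted in the text: 4, 4, 2 for SP/KLF; 4, 2, 2 for EGR2.
SP_KLF = re.compile(LINKER.join(finger(gap) for gap in (4, 4, 2)))
EGR_LIKE = re.compile(LINKER.join(finger(gap) for gap in (4, 2, 2)))

def synthetic(gaps, filler="A"):
    """Build an artificial sequence (assumption: toy data) with the given spacings."""
    fingers = [
        "C" + filler * g + "C" + filler * 12 + "H" + filler * 3 + "H"
        for g in gaps
    ]
    return (filler * 7).join(fingers)

klf_like = synthetic((4, 4, 2))
egr_like = synthetic((4, 2, 2))

klf_hit = SP_KLF.search(klf_like) is not None
egr_hit = SP_KLF.search(egr_like) is not None
```

Such a pattern only checks spacing of the zinc-binding residues; it says nothing about the DNA-contacting arginines or overall sequence similarity, which the searches in this study rely on.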

The SP/KLF family proteins regulate a diverse array of cellular processes in development, differentiation, and cell death. Some human SP/KLF members have been associated with various diseases [6]. A total of 17 KLF genes (KLF1–KLF17) and nine SP genes (SP1–SP9) are currently annotated in the human genome. Compared to KLF proteins, SPs are characterized by a unique cysteine-rich motif (“CXCPXC”, Buttonhead box) in the region N-terminal to the zinc fingers. Some phylogenetic analyses of SP/KLF family proteins [2,7] suggest that SPs form a monophyletic group and are more closely related to a subset of KLF proteins (e.g., KLF9/KLF13/KLF14/KLF16) than to the other KLF proteins. Members of the SP/KLF family differ in their tissue expression patterns and their functions. Some KLF genes, such as KLF3, KLF9 and KLF10, have broad expression patterns, while other members are expressed in restricted tissues. For example, human KLF1 (also named erythroid KLF, or EKLF) is mostly expressed in erythroid cells and regulates their differentiation.

Of the 17 human KLF genes, three members (KLF1, KLF14, and KLF16) appear to be mammalian-specific. The other human KLF genes have orthologs in other vertebrates such as chicken, frog, and teleost fish. SP/KLF members have also been identified in metazoans outside vertebrates, although the number of SP/KLF genes in these species is smaller [7,8]. Two rounds of whole-genome duplication in the ancestor of vertebrates [9] may partly explain the increased number of SP/KLF genes in vertebrates. A more recent whole-genome duplication event in the ancestor of teleost fish could have resulted in highly similar KLF pairs in Danio rerio, such as KLF5a/KLF5b and KLF15a/KLF15b.

The most recent mammalian gene assigned to the SP/KLF family, KLF17 [10], was first discovered as a germ cell-specific gene encoding zinc finger protein 393 (Zfp393) in mouse [11]. Human and mouse KLF17 proteins exhibit less sequence similarity compared to other orthologous KLF protein pairs, suggesting that KLF17 has undergone rapid evolution in the mammalian lineage [10]. The relatively high sequence divergence of KLF17s compared to known KLF proteins delayed the inference of KLF17 as a KLF member [10]. Based on gene synteny, KLF17 was also proposed to exist in non-mammalian vertebrates [12]. Specifically, mammalian, chicken, and frog KLF17 genes are sandwiched by the SLC6A9 gene and the DMAP1 gene [12]. Likewise, a fish KLF17 ortholog can be inferred based on the syntenic similarity of a fish gene [13] (NCBI Gene ID: 65238, previously proposed to be KLF4 [14] and later annotated as Klf4b in the NCBI gene database) to KLF17 genes in other vertebrates.

We combined sequence similarity searches, multiple sequence alignment, phylogenetic reconstruction, and gene synteny analysis for computational identification of new KLF genes/pseudogenes in mammals. We predicted a novel KLF gene or pseudogene, named KLF18, in most of the placental mammals with sequenced genomes including human and mouse. Mammalian KLF18 and KLF17 are chromosomal neighbors, and their inferred protein products form a monophyletic group to the exclusion of other known KLF proteins, suggesting that KLF18 arose from a local gene duplication of KLF17. We propose that KLF18 retrotransposition and local gene duplications resulted in further expansion of KLF members in the murine genomes of mouse and rat, giving rise to the highly diversified Zfp352, Zfp352l, and Zfp353 genes [15].

Materials and Methods

Mammalian genome selection

To study the distribution of predicted KLF18 genes/pseudogenes, we examined 44 mammalian genomes available in the UCSC genome browser [16] as of December 2012. For three species (sheep, hedgehog, and tenrec), we used their latest genome assemblies from NCBI, which have higher sequencing coverage than their UCSC versions. In addition, we also analyzed NCBI genome assemblies of three recently sequenced mammalian species from the Afrotheria superorder, which, like the Xenarthra superorder, is underrepresented compared to the other two placental mammalian superorders (Euarchontoglires and Laurasiatheria). The total number of mammalian genomes analyzed is 47, consisting of 43 placental mammals and four non-placental mammals (Table S1).

Protein similarity searches and phylogenetic analyses of predicted KLF proteins

BLAST [17] was used to search for close homologs of KLF proteins, starting with known human KLF proteins, against the nr database at NCBI (e-value cutoff: 1e-10). Multiple sequence alignment of KLF proteins was made by MAFFT [18] (options: --localpair --maxiterate 1000). For phylogenetic analyses, we selected a set of SP/KLF proteins consisting of known human KLF proteins (KLF1–KLF17) and SP proteins (SP1–SP9), three non-mammalian KLF17s (from zebrafish, frog and chicken), three predicted KLF18s (from human, mouse and rat), mouse and rat Zfp352 proteins and their close homologs, and the human Wilms’ tumor protein (WT1) as an out-group (WT1 contains four zinc fingers, three of which exhibit the same pattern as SP/KLF members and have a similar set of DNA-binding specificity residues as SP/KLF proteins). The MOLPHY package [19] was used for phylogenetic reconstruction for the zinc finger regions of these proteins. The JTT amino acid substitution model [20] was used in MOLPHY. The local estimates of bootstrap percentages were obtained by the RELL method [21] (-R option in the ProtML program of MOLPHY). For this dataset, we also used MrBayes [22] to run a Bayesian inference of phylogeny using a mixed amino acid substitution model with the invgamma (invariant sites + gamma distribution of rate variation) option. A total of 300,000 generations were performed, and the first 150,000 generations (50%) were discarded as burn-in. A consensus tree was obtained for the remaining generations with the sample frequency set to one sample per 100 generations. We also applied MOLPHY to a larger dataset of SP/KLF zinc finger regions consisting of known human and zebrafish SP/KLFs, a larger set of predicted KLF18 proteins, and the close homologs of mouse and rat Zfp352 proteins.
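The sampling numbers quoted for the Bayesian run can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the function name and its parameters are hypothetical, and it counts only samples taken at the stated frequency (if the sampler also records the starting state, the totals shift by one).

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for an MCMC run."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded

# Values stated in the text: 300,000 generations, one sample per 100
# generations, first 50% discarded as burn-in.
total, discarded, retained = retained_samples(300_000, 100, 0.5)
```

Under these assumptions, the consensus tree would summarize on the order of 1,500 sampled trees.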

Detection of predicted KLF18 genes/pseudogenes

Translated BLAT [23] was used to search for KLF18 genes/pseudogenes in UCSC genomes, and TBLASTN [24] was used for the NCBI genomes. Their chromosomal locations were further confirmed by BLAT/TBLASTN searches of KLF17 and DMAP1, two genes neighboring the KLF18 locus. For a few species, the pseudogene status of KLF18 was inferred based on the presence of premature stop codons inside the regions encoding zinc fingers. Pre-calculated gene prediction results available in the UCSC genome browser, mostly by GENSCAN [25], were examined in regions corresponding to the predicted KLF18 genes. For genomes where such predictions are not available, FGENESH [26] was applied to predict KLF18 genes. TBLASTN was further used to search for missing pieces of zinc finger regions for some predicted KLF18 genes. The gene prediction results are shown in Table S1, and the predicted KLF18 protein sequences are available in Figure S2.
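The premature stop codon criterion can be illustrated with a simple in-frame scan. The sketch below is in Python and is not the pipeline used in this study: the function name is hypothetical, the input is assumed to be a coding sequence already in the correct reading frame, and the example sequences are made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear stop codons

def premature_stops(cds, frame=0):
    """Return 0-based offsets of in-frame stop codons before the final codon."""
    seq = cds.upper()
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    # The last codon is excluded, since a terminal stop is expected.
    return [i for i, codon in codons[:-1] if codon in STOP_CODONS]

# Toy sequences (assumption: invented for illustration only).
intact = "ATGTGCGGGTGA"        # ATG TGC GGG TGA
disrupted = "ATGTGCTAAGGGTGA"  # ATG TGC TAA GGG TGA

intact_hits = premature_stops(intact)
disrupted_hits = premature_stops(disrupted)
```

A real analysis would also need to handle frame shifts and exon boundaries, which this scan ignores.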

Results and Discussion

KLF18 is a new predicted KLF gene/pseudogene in many of the placental mammals

BLAST sequence similarity searches using zinc finger domains of known human KLF proteins identified several predicted proteins from rabbit annotated as “PREDICTED: mCG120027-like” with e-values (less than 1e-20) comparable to those of other SP/KLF family proteins. For example, a BLAST search using the human KLF17 protein as the query found a rabbit mCG120027-like protein (GenBank: XP_002715727.1) with an e-value of 5e-25 (score: 118 bits), which is comparable to or better than the e-values of certain known KLF proteins (e.g., human KLF8 with an e-value of 1e-24 and human SP4 with an e-value of 4e-24) and better than the e-values of other zinc-finger-containing proteins such as Wilms’ tumor protein (e.g., human Wilms’ tumor protein with an e-value of 6e-19). These predicted rabbit proteins have three C-terminal zinc finger domains with the same cysteine-histidine pattern (“CX4CX12HX3HX7CX4CX12HX3HX7CX2CX12HX3H”) that is a distinct feature of SP/KLF family proteins. The names of these rabbit proteins indicate orthology to a predicted mouse protein named mCG120027 (GenBank: EDL30545.1). Potential orthologs of mCG120027 from cow (GenBank: DAA31138.1) and a primate, Otolemur garnettii (GenBank: XP_003801272.1), both derived from predicted genes, were also among the top BLAST hits of human KLF proteins. Examination of the chromosomal locations of these predicted genes revealed that they have conserved gene synteny, as they are all neighbors of KLF17 genes in the corresponding genomes and are oriented in a tail-to-tail fashion relative to the KLF17 genes. Similarity searches against genome sequences of 47 mammals by translated BLAT/BLAST and gene predictions by GENSCAN and FGENESH (see Materials and Methods) identified a predicted mCG120027-like gene or pseudogene downstream of KLF17 in most of the placental mammals (Figure 1 and Table S1). We name these new predicted genes/pseudogenes (putative orthologs of mouse mCG120027) KLF18.

Figure 1. Chromosome localization and gene synteny of KLF17 and KLF18 in vertebrate genomes.

Chromosome (Chr) or scaffold (Sca.) numbers are shown to the left of the gene order diagrams, with ‘r’ after the number denoting the reverse strand. In most of the placental mammalian genomes, KLF17 and KLF18 are neighbors arranged in a tail-to-tail fashion, and they are flanked by three upstream genes (abbreviations: B: B4GALT2; C: CCDC24; and S: SLC6A9) and three downstream genes (abbreviations: D: DMAP1; E: ERI3; and R: RNF220). Such a gene environment for KLF17 is mostly preserved in non-mammalian vertebrates including chicken, frog, and zebrafish. Copy number expansions of KLF18 (the numbers of expanded genes shown beside the brackets) were observed in rat, guinea pig, and rabbit. The aardvark KLF18 with pseudogene evidence is shown with a dashed frame. The tree on the left shows the phylogeny of mammals and other vertebrates. The four major groups (superorders) of placental mammals are shown in circles - E: Euarchontoglires; L: Laurasiatheria; A: Afrotheria; and X: Xenarthra.

https://doi.org/10.1371/journal.pone.0081109.g001

Is KLF18 a pseudogene or a protein-coding gene?

KLF18 was predicted to be a protein-coding gene with zinc finger regions for the majority of the examined placental mammals with sequenced genomes (36 out of 43 genomes, Table S1). KLF18 pseudogenes were inferred for four genomes of placental mammals (pig, hedgehog, tenrec, and aardvark) based on premature stop codon mutations or deteriorated zinc fingers. KLF18 zinc finger regions were not detected in only three out of the 43 placental mammal genomes. These three genomes (two primates: mouse lemur and bushbaby, and rock hyrax from the Afrotheria group) have low genome sequencing coverage (less than 3 fold) [27]. Verification of KLF18’s presence in them may need genome sequences of higher quality. For mouse lemur and rock hyrax, we did find regions with significant similarity to the N-terminal regions (containing a repeated motif, described below) of other predicted KLF18 proteins (Table S1 and Figure S2).

Despite the prevalence of KLF18 as a predicted protein-coding gene in the majority of the placental mammals analyzed, sequence database searches did not find evidence of gene expression such as cDNA and expressed sequence tags (ESTs) for these predicted KLF18 genes. Recently, techniques such as RNA-seq [28] and ribosome profiling [29] have greatly expanded the data of gene expression. RNA-seq-based data (including ENCODE RNA-seq datasets [30]) supporting KLF18 expression were not found at the UCSC genome browser [16]. We also searched the NCBI Sequence Read Archive (SRA) for potential transcripts of human KLF18 and only found a few spurious hits. The lack of expression data suggests that some or all of these predicted KLF18 genes may not be expressed and may have become pseudogenes. Pseudogene evidence was available for a couple of genomes such as hedgehog and aardvark, as premature stop codons were detected inside the regions corresponding to the C2H2 zinc fingers. However, for the majority of the placental mammal genomes examined, KLF18 was predicted to be a protein-coding gene by GENSCAN or FGENESH, and the predicted coding regions lack deterioration signals commonly found in pseudogenes such as frame shifts and premature stop codons. Moreover, the predicted KLF18 proteins exhibit conserved features in the zinc finger regions as compared to known KLF proteins (Figure 2, and see Figure S1 for the alignment of zinc fingers of all predicted KLF18 proteins from the analyzed genomes). In particular, the zinc-binding cysteines and histidines are mostly preserved. One exception is the last zinc-binding position in the mouse KLF18 (predicted protein mCG120027) (Figure 2), where the histidine is replaced by a cysteine residue. In a broad C2H2 zinc finger consensus sequence, both histidine and cysteine are allowed at such a position, and thus this change may not affect the zinc-binding potential of mCG120027 if it is translated from the mouse predicted KLF18 gene.

Figure 2. Multiple sequence alignment of the three zinc fingers of select KLF proteins and Wilms’ tumor proteins.

Two new KLF groups (predicted KLF18 proteins and Zfp352/Zfp352l/Zfp353) are shown above the red line. Known KLF members below the red line are grouped according to frequently well-supported clusters found in various phylogenetic studies. Conserved cysteines and histidines involved in metal binding are on black background. Three conserved DNA base-interacting arginines are shaded in magenta. Three negatively charged residues interacting with the three arginines are shown with dark shading, with connections for interaction pairs shown above the alignment. Negatively charged residues (aspartate and glutamate) and positively charged residues (lysine, arginine, and histidine) are in dark red and blue, respectively. Insertion and deletion events are highlighted in cyan. Species name abbreviations are: bt, Bos taurus (cow); cf, Canis familiaris (domestic dog); ch, Choloepus hoffmanni (two-toed sloth); cj, Callithrix jacchus (common marmoset); cp, Cavia porcellus (guinea pig); dn, Dasypus novemcinctus (nine-banded armadillo); dr, Danio rerio (zebrafish); ec, Equus caballus (horse); hs, Homo sapiens (human); la, Loxodonta africana (African savanna elephant); mm, Mus musculus (mouse); oc, Oryctolagus cuniculus (rabbit); rn, Rattus norvegicus (rat); sa, Sorex araneus (common shrew); tb, Tupaia belangeri (tree shrew); tm, Trichechus manatus latirostris (the Florida manatee); xt, Xenopus tropicalis (Western clawed frog). Species names are colored as follows - black: Euarchontoglires; red: Laurasiatheria; green: Afrotheria; magenta: Xenarthra; and blue: non-mammalian vertebrates.

https://doi.org/10.1371/journal.pone.0081109.g002

Besides metal-binding residues, the other parts of the zinc finger domains of the newly predicted KLF18 proteins are also well conserved compared to known KLFs (Figure 2 and Figure S1). Of interest, the three arginine residues contributing most to the specific interactions with DNA base pairs are preserved in predicted KLF18 proteins, as in other SP/KLF members (Figure 2 and Figure S1). These arginines (two in the second zinc finger and one in the third zinc finger, on magenta background in Figure 2) use their side-chain guanidinium groups to make double hydrogen bond interactions with three guanine bases in the consensus GC box/GT box motifs (GGCG or GGTG) [5]. Three negatively charged residues (two aspartic acids and one glutamic acid) that help orient the guanidinium groups of these arginines are also largely preserved in predicted KLF18 proteins (Figure 2 and Figure S1). Therefore, it is likely that predicted KLF18 proteins, if translated, are capable of binding DNA and recognizing DNA motifs such as the GC box and GT box, like known KLF proteins.

The coding potential of the predicted exon regions encoding the three zinc fingers of KLF18 was probed by the program PhyloCSF [31] for ten species (human, mouse, rat, guinea pig, rabbit, cow, horse, dog, elephant, and armadillo). PhyloCSF aims to distinguish protein-coding regions from non-coding regions based on codon substitution frequencies and does not rely on sources of similarity to other proteins. PhyloCSF gave a positive score of 305.9 decibans (a score of N decibans corresponds to a difference of 10^(0.1*N) fold), suggesting that the protein-coding model is ~10^30 times more likely than the non-coding model. Such a score supports the hypothesis that KLF18, or at least its ancestral form, is a protein-coding gene. If KLF18 is still active in extant mammals, the lack of expression data on KLF18 suggests very low expression levels or tightly controlled spatial or temporal expression patterns.
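The conversion from decibans to a likelihood ratio can be verified directly. A minimal sketch in Python follows; the function name is hypothetical, and only the score of 305.9 decibans comes from the text.

```python
import math

def decibans_to_fold(score_db):
    """Convert a score in decibans to a fold difference: 10 ** (0.1 * N)."""
    return 10 ** (0.1 * score_db)

score = 305.9                      # PhyloCSF score reported in the text
fold = decibans_to_fold(score)     # likelihood ratio, coding vs non-coding
exponent = math.log10(fold)        # 0.1 * 305.9 = 30.59
```

The exponent of about 30.6 is consistent with the rounded statement that the coding model is roughly 10^30 times more likely.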

A unique repeated motif in predicted KLF18 proteins

KLF proteins possess N-terminal regions that are highly variable compared to the zinc finger regions [3,6]. Closely related KLFs often share certain short sequence motifs for protein-protein interactions inside these regions. For example, KLF3, KLF8, and KLF12 contain a CtBP-binding site with a sequence consensus of “PXDLS” [3,6,32,33], whereas a different closely related group of KLFs (KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16) contain a Sin3A-binding motif that adopts an alpha-helical structure [6,34–36]. Like other KLFs, predicted KLF18 proteins typically possess a long N-terminal region (most of them longer than 300 amino acids, Figure S2). Such regions share little sequence similarity to N-terminal regions of known SP/KLF family members. One interesting feature of such regions in predicted KLF18 proteins is the presence of a unique repeated motif exhibiting the pattern of “[YC]x[sE][QH]” (x: any amino acid, s: a small residue such as Gly, Ala, Ser, Thr, Asp, Asn and Pro, Figure S2). For example, the human and mouse predicted KLF18 proteins have 50 and 14 copies of such repeats, respectively (Figure S2). The first position of this four-residue motif has a preference for tyrosine (Y) with less frequent occurrence of cysteine (C), while the last position of this motif is mostly glutamine (Q). Residue preferences were also observed in positions before and after the motif. For example, the three positions before the conserved tyrosine are most frequently occupied by Q, T, and L, respectively (see sequence logo in Figure 3). Consecutive appearance of a 14-residue segment, consisting of the [YC]x[sE][QH] motif, five residues before it, and five residues after it, is very common, especially in primate species, e.g. human (Figure S2).
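The motif pattern and the 14-residue segments can be sketched with a regular expression. The Python example below is an illustration under stated assumptions, not the authors' code: the small-residue set is taken from the list in the text, overlapping matches are counted (the text does not say how overlaps were handled), and the test sequence is a made-up string, not a KLF18 protein.

```python
import re

SMALL = "GASTDNP"  # small residues listed in the text (assumed one-letter codes)
# Lookahead makes matches zero-width so that overlapping motifs are all reported.
MOTIF = re.compile(rf"(?=([YC].[{SMALL}E][QH]))")

def motif_starts(protein):
    """Return 0-based start positions of [YC]x[sE][QH] matches."""
    return [m.start() for m in MOTIF.finditer(protein)]

def extended_segments(protein, flank=5):
    """Extend each 4-residue match by `flank` residues on both sides."""
    segments = []
    for start in motif_starts(protein):
        lo, hi = start - flank, start + 4 + flank
        if lo >= 0 and hi <= len(protein):  # skip matches too close to an end
            segments.append(protein[lo:hi])
    return segments

toy = "AAQTLYPGQAAAAAQTLYSEHAAAAA"  # hypothetical sequence for illustration
starts = motif_starts(toy)
segments = extended_segments(toy)
```

Segments of this kind are the sort of input that a sequence logo program would take, as described for Figure 3.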

Figure 3. Sequence logo of the repeated segments in the N-terminal regions of predicted KLF18 proteins.

Four-residue sequence segments matching the pattern “[YC]x[sE][QH]” were extracted from predicted KLF18 proteins. These segments were extended by five residues both N-terminally and C-terminally to obtain segments of 14 residues. The sequence logo was generated from the extended segments by the program WebLogo [44].

https://doi.org/10.1371/journal.pone.0081109.g003

Searches of the human proteome with the motif pattern ([YC]x[sE][QH]) found very few proteins with a high density of this motif (motif density is defined as the number of motifs divided by protein length). Although some cysteine-rich keratins have high densities of this motif, their motifs have cysteines in the first position as opposed to mostly tyrosine in predicted KLF18 proteins. Another protein with a high density of this motif is RNA-binding protein 14 (GenBank: NP_006319.1). This protein possesses the “[GS]Y[GS]” motifs often found in proteins from RNA granules [37]. The [YC]x[sE][QH] motifs in this protein overlap with the [GS]Y[GS] motifs, with the residue before the tyrosine being a small residue such as glycine or serine. However, the [YC]x[sE][QH] motif in the predicted KLF18 proteins is different from the [GS]Y[GS] motif, since the residue before the first position is often a large hydrophobic residue such as leucine (Figure 3 and Figure S2). As repeated patterns in proteins, such as leucine-rich repeats, HEAT repeats, and beta helices, are often involved in protein-protein interactions, the repeats in the N-terminal regions of predicted KLF18 proteins may also be responsible for interacting with other proteins, and they may serve to recruit transcription coactivators/corepressors to specific chromosomal locations. However, PSI-BLAST [17] and HHpred [38] searches for several of these repeat regions (from human, horse, and elephant) did not yield hits with significant scores to known structures.
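Motif density as defined above (number of motifs divided by protein length) is simple to compute and rank. This Python sketch makes the same assumptions as any regex reading of the pattern: the small-residue set follows the list given in the text, overlapping matches are counted, and the proteome here is a hypothetical dictionary of toy sequences rather than real human proteins.

```python
import re

# [YC]x[sE][QH], with s taken as G, A, S, T, D, N, P (assumption from the text).
MOTIF = re.compile(r"(?=[YC].[GASTDNPE][QH])")

def motif_density(protein):
    """Number of motif matches divided by protein length (0.0 if empty)."""
    if not protein:
        return 0.0
    return len(MOTIF.findall(protein)) / len(protein)

# Hypothetical toy "proteome" used only to show the ranking step.
toy_proteome = {
    "repeat_rich": "LYAGQLYSEQ",
    "sparse": "MKLYAGQAAAAAAAAAAAAA",
    "none": "MKLLAAAA",
}

ranked = sorted(toy_proteome, key=lambda name: motif_density(toy_proteome[name]),
                reverse=True)
```

Because density normalizes by length, a short protein with a few motifs can outrank a long protein with many, which is worth keeping in mind when interpreting such a ranking.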

The origin of KLF18

We found KLF18 in species from all four major groups (superorders) [39] of the placental mammals: Euarchontoglires (including primates such as human and marmoset, rodents such as mouse and rat, and lagomorphs such as rabbit), Laurasiatheria (such as cow, horse, dog, and microbat), Afrotheria (such as elephant and manatee), and Xenarthra (such as armadillo and sloth) (Figure 1, Figure 2, Figure S1, Figure S2, and Table S1). However, sequence similarity searches and gene predictions did not reveal such a gene/pseudogene in non-mammalian vertebrates. KLF18 was also not found in non-placental mammals (marsupials and monotremes) despite the availability of several genomes of marsupials and the platypus, a monotreme (Table S1). The near-universal presence of KLF18 in placental mammals but not in other genomes suggests that it may have originated in the last common ancestor of extant placental mammals.

A gene structure feature shared by KLF18, KLF17 and most of the other mammalian KLF genes (except KLF14) is an intron between the coding regions of the first zinc finger and the last two zinc fingers. The intronless KLF14 gene is believed to be a product of retrotransposition from its close homolog KLF16 [40]. The presence of an intron in the predicted KLF18 genes suggests that KLF18 was not generated by retrotransposition. On the other hand, the closeness of KLF18 to KLF17 in chromosomal location (Figure 1) suggests that KLF18 could have resulted from a local gene duplication of KLF17. This scenario of KLF18 origin is also supported by phylogenetic analyses (Figure 4, Figure S3, and Figure S4, described below), as mammalian KLF17 and KLF18 form a well-supported group to the exclusion of other KLF proteins.

Figure 4. A phylogenetic tree of SP/KLF proteins with the human Wilms’ tumor protein (human_WT1) as an out-group.

Branch support values of 80 and above are in bold. Each protein node is denoted by the species name followed by the protein name.

https://doi.org/10.1371/journal.pone.0081109.g004

Phylogenetic positioning of KLF18 and KLF17

Previous phylogenetic studies consistently identified several well-supported groups of vertebrate KLF proteins, such as KLF1/KLF2/KLF4, KLF3/KLF8/KLF12, KLF6/KLF7, KLF10/KLF11, and KLF9/KLF13/KLF14/KLF16 [2,3,6,7,10,13]. However, the relationships among some of these KLF groups and the positions of some KLF members are not consistently recovered in different studies. The positioning of mammalian KLF17 is not consistent in several phylogenetic studies [6,7,10,13]. Due to elevated evolutionary rates [10], mammalian KLF17s tend to form long branches in phylogenetic reconstructions. In contrast, a recent phylogenetic study revealed that non-mammalian KLF17 members do not form long branches, and they are grouped with KLF1/KLF2/KLF4 [13].

We carried out a maximum likelihood phylogenetic reconstruction (see Materials and Methods) for the zinc finger regions of known human KLF proteins, several vertebrate KLF17 proteins, and some predicted KLF18 proteins and their derivatives (Zfp352, Zfp352l, and Zfp353, described below). In this phylogenetic tree generated by MOLPHY [19], mammalian KLF17s and predicted KLF18 proteins all lie within the group of KLF1/KLF2/KLF4 (Figure 4), like the non-mammalian KLF17s. Mammalian KLF17s, predicted KLF18 proteins, and KLF18 derivatives have much longer branch lengths than other KLF members. Phylogenetic reconstructions by Bayesian analysis using the same dataset and by maximum likelihood on a larger dataset yielded similar results (Figure S3 and Figure S4).

Copy number expansions of KLF18

KLF18 has expanded in several genomes of the Glires (rodents and lagomorphs) group, including rat, guinea pig, and rabbit (Figure 1). In each of these genomes, highly similar copies of predicted KLF18 genes were found, suggesting that their copy number expansions have occurred recently and independently. The rat genome has four copies of KLF18 on chromosome 5, three of which are near KLF17 (Figure 1). Interestingly, predicted rat KLF18 proteins exhibit a two-residue deletion in each of the first two zinc fingers (Figure 2). Both of these deletions occur between the conserved zinc-binding cysteines (Figure 2). The resulting shorter separations (changed from four residues to two residues) between these cysteines are still allowed in a general zinc finger motif. Such two-residue separations are common for C2H2 zinc fingers, e.g., in the third zinc finger of the SP/KLF family members (Figure 2) as well as in members of the SNAIL family [41]. As insertions and deletions rarely occur in zinc fingers of SP/KLF proteins, the deletions in the rat KLF18 are consistent with its elevated evolutionary rate manifested by the longer branch length (Figure 4).

The rabbit genome has six highly similar tandem predicted KLF18 genes near the KLF17 gene. For guinea pig, we did not identify predicted KLF18 genes in the assembly scaffold (Scaffold 165) that contains KLF17 and its surrounding genes (Figure 1). However, on a separate scaffold (Scaffold 635), we found at least 20 tandem repeats of predicted KLF18 genes. The highly repeated nature of this genome region may have posed challenges for its assembly in the guinea pig genome.

Expansion starting KLF members the the murine genomes by retrotransposition and local generate duplication

The top BLAST hits of predictive KLF18 proteins include several mouse and rat proteins named Zfp352, Zfp353, and Zfp352-like in appendix to known KLF proteins. The mRNA of the mouse Zfp352 gene (NCBI Generate ID: 236537, previously named 2czf48) been first discovered inches a slide embryonic 2-cell cDNA library [42]. Coward Zfp353 (NCBI Gene BADGE: 234203), with highest sequence similarity toward Zfp352, was later discovered as a gene with expression restricted to lung [15]. The miss of introns within the encoding regions of Zfp352 and Zfp353, coupled with aforementioned presence of nearby LINE order, raised possibility that these genies are produce away two consecutive retrotransposition events [15]. Zfp352 has an intron on to 5’ untranslated region, while Zfp353 does not have any introns at all. It was proposed such Zfp353 can a product in retrotransposition from the mRNA of Zfp352, and Zfp352 is adenine product of retrotransposition from and mRNA of an unknown genome [15]. Both mouse Zfp352 and Zfp353 encode KLF-like proteins with three C-terminal zinc fingers (Figure 2).

Several close homologs of mouse Zfp352, also without introns in coding regions, were discovered in rat, but not in other mammalian genomes including non-murine rodents. Therefore, it is likely that Zfp352 originated in the ancestor of the Murinae subfamily. Mouse Zfp352 and rat Zfp352 (NCBI Gene ID: 502968) have conserved gene synteny, as both of them are flanked by the upstream Dmrta1 gene and the downstream Elavl2 gene (Figure 5A). Two predicted genes encoding close homologs of rat Zfp352 are located close to the Zfp352 gene (Figure 5A). One of them is called Zfp352l (NCBI Gene ID: 298232). Zfp352l and Zfp352 are direct neighbors and are oriented in a tail-to-tail fashion (Figure 5A). The other rat predicted gene (named Zfp352lb here, NCBI Gene ID: 298233) is a direct neighbor of Zfp352l and has the same orientation as Zfp352 (Figure 5A). Rat Zfp352l and Zfp352lb, as close homologs of Zfp352, were likely generated by local gene duplication events.
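The relative orientations described for the rat locus can be expressed with a small strand-based classifier. This is a minimal sketch under stated assumptions: the `Gene` class and `orientation` function are hypothetical names, the coordinates are placeholders rather than real genomic positions, and the absolute strands are arbitrary (only the relative arrangement follows the text).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # placeholder coordinate, NOT a real genomic position
    strand: str  # "+" or "-"


def orientation(a, b):
    """Classify two genes on the same chromosome by their strands.

    Genes are ordered by coordinate. Same strand means tandem (head-to-tail);
    "+" followed by "-" means the 3' ends face each other (tail-to-tail);
    "-" followed by "+" means the 5' ends face each other (head-to-head).
    """
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        return "tandem"
    return "tail-to-tail" if left.strand == "+" else "head-to-head"


# Toy layout consistent with the description: Zfp352 and Zfp352l are direct
# neighbors arranged tail-to-tail, and Zfp352lb lies next to Zfp352l with the
# same orientation as Zfp352. Coordinates and absolute strands are assumed.
zfp352 = Gene("Zfp352", 100, "+")
zfp352l = Gene("Zfp352l", 200, "-")
zfp352lb = Gene("Zfp352lb", 300, "+")

print(orientation(zfp352, zfp352l))    # Zfp352 vs Zfp352l
print(orientation(zfp352l, zfp352lb))  # Zfp352l vs Zfp352lb
```

The mirror-image layout (all strands flipped and the gene order reversed) gives the same classifications, so the sketch does not depend on which strand Zfp352 actually occupies.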

Figure 5. Gene synteny and a model of evolution for Zfp352, Zfp352l, and Zfp353.

(A) Gene synteny of Zfp352, Zfp352l, and Zfp353 in the mouse and rat genomes. (B) A model of the expansion of KLF members in the mouse genome. LGD and RTS are abbreviations for local gene duplication and retrotransposition, respectively. UrZfp352 represents the ancestral gene of extant Zfp352 and Zfp352l.

https://doi.org/10.1371/journal.pone.0081109.g005

TBLASTN searches using the mouse Zfp352 protein as the query against the mouse genome sequences also revealed a region near the mouse Zfp352 locus that encodes a predicted gene. Similar to rat Zfp352l, this mouse predicted gene is a direct neighbor of Zfp352, and the two are arranged in a tail-to-tail fashion (Figure 5A). Therefore, this predicted mouse gene should be an ortholog of the rat gene Zfp352l and is thus named mouse Zfp352l. Although mouse Zfp352l is currently listed as a pseudogene (MGI:3650768, NCBI Gene ID: 619842) in the MGI database and the NCBI gene database, it has evidence of native expression. Its NCBI UniGene record (Mm.484218) includes one cDNA clone (RIKEN clone 7420403B16, GenBank: AK135677.1) and two ESTs (GenBank: CJ052470.1 and BB706967.1), all of which are from cDNA libraries of fertilized eggs. Interestingly, the mouse Zfp352 gene was found to be expressed in the two-cell stage of early embryonic development (cDNA GenBank: AF290196.1; EST GenBank: AA414357.1, AA422810.1, and AI642873.1). These limited expression data suggest that mouse Zfp352 and Zfp352l may encode KLF proteins that function in early embryonic development. We did not find the counterpart of rat Zfp352lb in the mouse genome (Figure 5A), suggesting that Zfp352lb either has been lost in the mouse lineage or is an invention in the rat genome.

Several lines of evidence suggest that KLF18 is the parent gene that gave rise to Zfp352/Zfp352l (intronless in coding regions) by retrotransposition of an ancestral spliced KLF18 mRNA. First, the closest KLF homologs of Zfp352 and Zfp352l are predicted KLF18 proteins. Second, Zfp352 and Zfp352l proteins are grouped with predicted KLF18 proteins in phylogenetic analyses (Figure 4, Figure S3, and Figure S4). Third, predicted KLF18 proteins, Zfp352, and Zfp352l share repeats featuring the common sequence motif [YC]x[sE][QH] that are not found in other KLF proteins (Figure S2). Inference of the ancestral KLF18 mRNA suggests that KLF18 was an actively expressed gene (transcribed and spliced to intronless mRNA) in the ancestor of mouse and rat.
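Repeats of this kind can be located with a regular expression. The following is a minimal sketch that uses the expression given in the Figure S2 legend, "[YC]x[GASTDNPE][QH]", translated to Python syntax (x becomes the wildcard `.`). The function name `find_repeats` is hypothetical, and the input is a made-up toy string rather than a real KLF18 or Zfp352 sequence.

```python
import re

# Regular expression from the Figure S2 legend, with "x" (any single
# residue) written as "." in Python regex syntax.
REPEAT_MOTIF = re.compile(r"[YC].[GASTDNPE][QH]")


def find_repeats(protein):
    """Return (0-based start, matched tetrapeptide) for each
    non-overlapping motif match, scanning left to right."""
    return [(m.start(), m.group()) for m in REPEAT_MOTIF.finditer(protein)]


# Toy sequence (hypothetical): three motif instances and a spacer.
toy = "YAGQ" + "CPSH" + "AAAA" + "YTEQ"

for start, hit in find_repeats(toy):
    print(start, hit)
```

Because `finditer` reports non-overlapping matches, the count is a lower bound when motif instances overlap; a lookahead-based pattern would be needed to enumerate overlapping occurrences.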

Four new KLF gene/pseudogene members were discovered in the mouse genome: KLF18, Zfp352, Zfp352l, and Zfp353. Their chromosomal locations (Figure 5A) and gene structures suggest that they originated by local gene duplication (LGD) or retrotransposition (RTS) at various stages of evolution since the ancestor of placental mammals. The proposed model of KLF expansion in the mouse genome is illustrated in Figure 5B. In this model, the chromosomally close Zfp352 and Zfp352l, both intronless in their coding regions, are most likely the results of a local gene duplication of an ancestral gene, named UrZfp352. The ancestral gene UrZfp352 likely resulted from the retrotransposition of the spliced mRNA of the ancestral KLF18 gene. KLF18 itself, being chromosomally close to KLF17, is probably a product of local gene duplication that occurred in the ancestor of placental mammals. Zfp353, a mouse-specific gene, is not present in rat. Zfp353 is not chromosomally close to Zfp352 (Figure 5A). Its high similarity to Zfp352 and intronless gene structure suggest that Zfp353 arose recently in the ancestor of mouse via retrotransposition of the Zfp352 mRNA [15] (Figure 5B).
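The proposed chain of events can be written down as a small parent map, which makes the inferred history of each gene easy to trace. This is an illustrative sketch only: the dictionary `MODEL` and the function `trace` are hypothetical names, the event labels follow the LGD/RTS abbreviations of the Figure 5 legend, and the duplication of UrZfp352 is simplified so that both Zfp352 and Zfp352l are recorded as its descendants.

```python
# child -> (parent, mechanism), following the model described in the text.
# LGD: local gene duplication; RTS: retrotransposition.
MODEL = {
    "KLF18": ("KLF17", "LGD"),
    "UrZfp352": ("KLF18", "RTS"),
    "Zfp352": ("UrZfp352", "LGD"),
    "Zfp352l": ("UrZfp352", "LGD"),
    "Zfp353": ("Zfp352", "RTS"),
}


def trace(gene):
    """Return the list of (parent, mechanism, child) events leading from
    the oldest recorded ancestor down to `gene`."""
    events = []
    while gene in MODEL:
        parent, mechanism = MODEL[gene]
        events.append((parent, mechanism, gene))
        gene = parent
    return events[::-1]


for parent, mechanism, child in trace("Zfp353"):
    print(f"{parent} --{mechanism}--> {child}")
```

Tracing Zfp353 in this representation passes through two retrotransposition events, one from the ancestral KLF18 mRNA and one from the Zfp352 mRNA, which is the two-step scenario discussed above.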

While the KLF18-derived Zfp352, Zfp352l, and Zfp353 genes have been found to be expressed in certain tissues such as early embryos and lung [15,43], no expression data have been reported for the predicted KLF18 genes. The gene or pseudogene status of KLF18 remains to be experimentally examined. Our analyses suggest that KLF18 could still be an active protein-coding gene in some extant mammals, as supported by consistent protein-coding gene predictions by GENSCAN or FGENESH in the majority of available genomes of placental mammals, conservation of zinc finger motifs including zinc-binding and DNA-binding residues, and the favorable score in the protein coding potential analysis. The current unavailability of KLF18 expression data suggests that KLF18 may perform specialized functions with a tight spatial or temporal expression pattern. In the opposite scenario of KLF18 being a pseudogene, it represents an interesting case in which an ancestrally active parent gene (KLF18) gave rise to currently active derived genes (Zfp352, Zfp352l, and Zfp353) through retrotransposition, while the parent itself became a pseudogene in extant placental mammals.

Supporting Information

Table S1.

Predicted KLF18 genes in mammalian genomes. Most of the genomes are from the UCSC genome browser, except for a few NCBI genomes that are not in UCSC or are in a later version than the UCSC genomes. In the "Notes of KLF18 prediction" column, genomes where predicted KLF18 was not found are marked by red “N/A”, and inferred KLF18 pseudogenes with stop codons inside zinc finger regions are also marked in red.

https://doi.org/10.1371/journal.pone.0081109.s001

(XLSX)

Figure S1.

Alignment of zinc fingers from 36 predicted KLF18 proteins. Conserved cysteines and histidines involved in metal binding of zinc fingers are on black background. Three conserved DNA base-interacting arginines are shaded in red. Three negatively-charged residues interacting with the three arginines are colored in dark grey. Substitutions in these conserved positions are colored red. Insertions and deletions are highlighted in cyan. Color coding of species names is as follows - black: Euarchontoglires; dark: Laurasiatheria; green: Afrotheria; and magenta: Xenarthra.

https://doi.org/10.1371/journal.pone.0081109.s002

(PDF)

Figure S2.

Sequences of predicted KLF18 proteins (section 1) and Zfp352/Zfp352l/Zfp353 proteins (section 2). Repeats matching the regular expression of “[YC]x[GASTDNPE][QH]” (x: any single letter) are highlighted in cyan. Zinc finger regions are highlighted in magenta.

https://doi.org/10.1371/journal.pone.0081109.s003

(PDF)

Figure S3.

A phylogenetic tree of representative SP/KLF proteins with the human Wilms’ tumor protein (human_WT1) as an out-group. This tree was generated by MrBayes. Each protein node is denoted by its species name followed by the protein name.

https://doi.org/10.1371/journal.pone.0081109.s004

(PDF)

Figure S4.

A phylogenetic tree of SP/KLF proteins generated by MOLPHY. Each protein is denoted by its species name abbreviation followed by the protein name. Species name abbreviations are: bt, Bos taurus (cow); cf, Canis familiaris (domestic dog); ch, Choloepus hoffmanni (two-toed sloth); cj, Callithrix jacchus (common marmoset); cp, Cavia porcellus (guinea pig); dn, Dasypus novemcinctus (nine-banded armadillo); do, Dipodomys ordii (kangaroo rat); dr, Danio rerio (zebrafish); ec, Equus caballus (horse); hs, Homo sapiens (human); la, Loxodonta africana (African savannah elephant); mm, Mus musculus (mouse); oc, Oryctolagus cuniculus (rabbit); rn, Rattus norvegicus (rat); sa, Sorex araneus (common shrew); tb, Tupaia belangeri (tree shrew); xt, Xenopus tropicalis (Western clawed frog).

https://doi.org/10.1371/journal.pone.0081109.s005

(PDF)

Acknowledgments

We would like to thank Lisa Kinch for critical reading of the manuscript.

Author Contributions

Conceived and designed the experiments: JP NG. Performed the experiments: JP. Analyzed the data: JP. Wrote the manuscript: JP.

References

  1. Pearson R, Fleetwood J, Eaton S, Crossley M, Bao S (2008) Kruppel-like transcription factors: a functional family. Int J Biochem Cell Biol 40: 1996-2001. doi:https://doi.org/10.1016/j.biocel.2007.07.018. PubMed: 17904406.
  2. Suske G, Bruford E, Philipsen S (2005) Mammalian SP/KLF transcription factors: bring in the family. Genomics 85: 551-556. doi:https://doi.org/10.1016/j.ygeno.2005.01.005. PubMed: 15820306.
  3. Kaczynski J, Cook T, Urrutia R (2003) Sp1- and Kruppel-like transcription factors. Genome Biol 4: 206. doi:https://doi.org/10.1186/gb-2003-4-2-206. PubMed: 12620113.
  4. Ding G, Lorenz P, Kreutzer M, Li Y, Thiesen HJ (2009) SysZNF: the C2H2 zinc finger gene database. Nucleic Acids Res 37: D267-D273. doi:https://doi.org/10.1093/nar/gkn782. PubMed: 18974185.
  5. Schuetz A, Nana D, Rose C, Zocher G, Milanovic M et al. (2011) The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation. Cell Mol Life Sci 68: 3121-3131. doi:https://doi.org/10.1007/s00018-010-0618-x. PubMed: 21290164.
  6. McConnell BB, Yang VW (2010) Mammalian Kruppel-like factors in health and diseases. Physiol Rev 90: 1337-1381. doi:https://doi.org/10.1152/physrev.00058.2009. PubMed: 20959618.
  7. Shimeld SM (2008) C2H2 zinc finger genes of the Gli, Zic, KLF, SP, Wilms' tumour, Huckebein, Snail, Ovo, Spalt, Odd, Blimp-1, Fez and related gene families from Branchiostoma floridae. Dev Genes Evol 218: 639-649.
  8. Seetharam A, Bai Y, Stuart GW (2010) A survey of well conserved families of C2H2 zinc-finger genes in Daphnia. BMC Genomics 11: 276. doi:https://doi.org/10.1186/1471-2164-11-276. PubMed: 20433734.
  9. Meyer A, Schartl M (1999) Gene and genome duplications in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions. Curr Opin Cell Biol 11: 699-704. doi:https://doi.org/10.1016/S0955-0674(99)00039-3. PubMed: 10600714.
  10. van Vliet J, Crofts LA, Quinlan KG, Czolij R, Perkins AC et al. (2006) Human KLF17 is a new member of the Sp/KLF family of transcription factors. Genomics 87: 474-482. doi:https://doi.org/10.1016/j.ygeno.2005.12.011. PubMed: 16460907.
  11. Yan W, Burns KH, Ma L, Matzuk MM (2002) Identification of Zfp393, a germ cell-specific gene encoding a novel zinc finger protein. Mech Dev 118: 233-239. doi:https://doi.org/10.1016/S0925-4773(02)00258-7. PubMed: 12351194.
  12. Antin PB, Pier M, Sesepasara T, Yatskievych TA, Darnell DK (2010) Embryonic expression of the chicken Kruppel-like (KLF) transcription factor gene family. Dev Dyn 239: 1879-1887. doi:https://doi.org/10.1002/dvdy.22318. PubMed: 20503383.
  13. Chen Z, Lei T, Chen X, Zhang J, Yu A et al. (2010) Porcine KLF gene family: Structure, mapping, and phylogenetic analysis. Genomics 95: 111-119. doi:https://doi.org/10.1016/j.ygeno.2009.11.001. PubMed: 19941950.
  14. Oates AC, Pratt SJ, Vail B, Yan Y, Ho RK et al. (2001) The zebrafish klf gene family. Blood 98: 1792-1801. doi:https://doi.org/10.1182/blood.V98.6.1792. PubMed: 11535513.
  15. Chen HH, Liu TY, Huang CJ, Choo KB (2002) Generation of two homologous and intronless zinc-finger protein genes, zfp352 and zfp353, with different expression patterns by retrotransposition. Genomics 79: 18-23. doi:https://doi.org/10.1006/geno.2001.6664. PubMed: 11827453.
  16. Karolchik D, Hinrichs AS, Kent WJ (2012) The UCSC Genome Browser. Curr Protoc Bioinformatics Chapter 1: Unit 1.4.
  17. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402. doi:https://doi.org/10.1093/nar/25.17.3389. PubMed: 9254694.
  18. Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059-3066. doi:https://doi.org/10.1093/nar/gkf436. PubMed: 12136088.
  19. Adachi J, Hasegawa M (1996) MOLPHY version 2.3, programs for molecular phylogenetics based on maximum likelihood. Computer Science Monographs 28. The Institute of Statistical Mathematics. pp. 1-150.
  20. Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 8: 275-282. PubMed: 1633570.
  21. Hasegawa M, Kishino H, Saitou N (1991) On the maximum likelihood method in molecular phylogenetics. J Mol Evol 32: 443-445. doi:https://doi.org/10.1007/BF02101285. PubMed: 1904100.
  22. Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754-755. doi:https://doi.org/10.1093/bioinformatics/17.8.754. PubMed: 11524383.
  23. Kent WJ (2002) BLAT--the BLAST-like alignment tool. Genome Res 12: 656-664. doi:https://doi.org/10.1101/gr.229202. Article published online before March 2002. PubMed: 11932250.
  24. Gertz EM, Yu YK, Agarwala R, Schäffer AA, Altschul SF (2006) Composition-based statistics and translated nucleotide searches: improving the TBLASTN module of BLAST. BMC Biol 4: 41. doi:https://doi.org/10.1186/1741-7007-4-41. PubMed: 17156431.
  25. Burge CB, Karlin S (1998) Finding the genes in genomic DNA. Curr Opin Struct Biol 8: 346-354. doi:https://doi.org/10.1016/S0959-440X(98)80069-9. PubMed: 9666331.
  26. Salamov AA, Solovyev VV (2000) Ab initio gene finding in Drosophila genomic DNA. Genome Res 10: 516-522. doi:https://doi.org/10.1101/gr.10.4.516. PubMed: 10779491.
  27. Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ et al. (2011) A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478: 476-482. doi:https://doi.org/10.1038/nature10530. PubMed: 21993624.
  28. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63. doi:https://doi.org/10.1038/nrg2484. PubMed: 19015660.
  29. Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324: 218-223. doi:https://doi.org/10.1126/science.1168978. PubMed: 19213877.
  30. ENCODE Project Consortium (2011) A user's guide to the encyclopedia of DNA elements (ENCODE). PLOS Biol 9: e1001046. PubMed: 21526222.
  31. Lin MF, Jungreis I, Kellis M (2011) PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics 27: i275-i282. doi:https://doi.org/10.1093/bioinformatics/btr209. PubMed: 21685081.
  32. Turner J, Crossley M (1998) Cloning and characterization of mCtBP2, a co-repressor that associates with basic Kruppel-like factor and other mammalian transcriptional regulators. EMBO J 17: 5129-5140. doi:https://doi.org/10.1093/emboj/17.17.5129. PubMed: 9724649.
  33. van Vliet J, Turner J, Crossley M (2000) Human Kruppel-like factor 8: a CACCC-box binding protein that associates with CtBP and represses transcription. Nucleic Acids Res 28: 1955-1962. doi:https://doi.org/10.1093/nar/28.9.1955. PubMed: 10756197.
  34. Kaczynski J, Zhang JS, Ellenrieder V, Conley A, Duenes T et al. (2001) The Sp1-like protein BTEB3 inhibits transcription via the basic transcription element box by interacting with mSin3A and HDAC-1 co-repressors and competing with Sp1. J Biol Chem 276: 36749-36756. doi:https://doi.org/10.1074/jbc.M105831200. PubMed: 11477107.
  35. Zhang JS, Moncrieffe MC, Kaczynski J, Ellenrieder V, Prendergast FG et al. (2001) A conserved alpha-helical motif mediates the interaction of Sp1-like transcriptional repressors with the corepressor mSin3A. Mol Cell Biol 21: 5041-5049. doi:https://doi.org/10.1128/MCB.21.15.5041-5049.2001. PubMed: 11438660.
  36. Kaczynski JA, Conley AA, Fernandez Zapico M, Delgado SM, Zhang JS et al. (2002) Functional analysis of basic transcription element (BTE)-binding protein (BTEB) 3 and BTEB4, a novel Sp1-like protein, reveals a subfamily of transcriptional repressors for the BTE site of the cytochrome P4501A1 gene promoter. Biochem J 366: 873-882. PubMed: 12036432.
  37. Kato M, Han TW, Xie S, Shi K, Du X et al. (2012) Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell 149: 753-767. doi:https://doi.org/10.1016/j.cell.2012.04.017. PubMed: 22579281.
  38. Söding J, Biegert A, Lupas AN (2005) The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33: W244-W248. doi:https://doi.org/10.1093/nar/gki162. PubMed: 15980461.
  39. Asher RJ, Helgen KM (2010) Nomenclature and placental mammal phylogeny. BMC Evol Biol 10: 102. doi:https://doi.org/10.1186/1471-2148-10-102. PubMed: 20406454.
  40. Parker-Katiraee L, Carson AR, Yamada T, Arnaud P, Feil R et al. (2007) Identification of the imprinted KLF14 transcription factor undergoing human-specific accelerated evolution. PLOS Genet 3: e65. doi:https://doi.org/10.1371/journal.pgen.0030065. PubMed: 17480121.
  41. Nieto MA (2002) The snail superfamily of zinc-finger transcription factors. Nat Rev Mol Cell Biol 3: 155-166. doi:https://doi.org/10.1038/nrm757. PubMed: 11994736.
  42. Choo KB, Chen HH, Cheng WT, Chang HS, Wang M (2001) In silico mining of EST databases for novel pre-implantation embryo-specific zinc finger protein genes. Mol Reprod Dev 59: 249-255. doi:https://doi.org/10.1002/mrd.1029. PubMed: 11424210.
  43. Liu TY, Chen HH, Lee KH, Choo KB (2003) Display of different modes of transcription by the promoters of an early embryonic gene, Zfp352, in preimplantation embryos and in somatic cells. Mol Reprod Dev 64: 52-60. doi:https://doi.org/10.1002/mrd.10218. PubMed: 12420299.
  44. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188-1190. doi:https://doi.org/10.1101/gr.849004. PubMed: 15173120.