GPATCH2L

GPATCH2L (G-Patch Domain Containing 2 Like) is a protein that in humans is encoded by the GPATCH2L gene, located at 14q24.3.[5] The human GPATCH2L mRNA (NM_017926) is 14,021 base pairs long, and the gene spans 62,422 nt at chr14: 76,151,922 - 76,214,343.[6] GPATCH2L is on the positive strand. IFT43 is the gene directly before GPATCH2L on the positive strand, and LOC105370575 is an uncharacterized gene on the negative strand that is approximately one and a half times the size of GPATCH2L. Known aliases for GPATCH2L include C14orf118, FLJ20689, FLJ10033, and KIAA1152. The gene contains 28 distinct introns (27 gt-ag, 1 gc-ag) and produces 17 different mRNAs: 14 alternatively spliced variants and 3 unspliced forms.[7] It has 5 probable alternative promoters, 7 validated polyadenylation sites, and 6 predicted promoters of varying lengths.
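
The gene span quoted above can be checked from the two coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the usual convention for NCBI Gene displays); the function name `span_length` is illustrative and does not come from any cited tool.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval given 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of GPATCH2L on chr14 as quoted in the text.
gene_span = span_length(76_151_922, 76_214_343)
print(gene_span)  # 62422
```

Under a 0-based, half-open convention (as in BED files) the same interval would be written with a start of 76,151,921, and the length would be `end - start` without the added 1.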

GPATCH2L
Identifiers
Aliases: GPATCH2L, C14orf118, G-patch domain containing 2 like
External IDs: MGI: 1917623; HomoloGene: 9942; GeneCards: GPATCH2L
Orthologs
Entrez: 55668 (human), 70373 (mouse)
Ensembl: ENSG00000089916 (human), ENSMUSG00000021254 (mouse)
UniProt: Q9NWQ4 (human), Q6PE65 (mouse)
RefSeq (mRNA): NM_027405, NM_001324488
RefSeq (protein): NP_001311417, NP_081681
Location (UCSC): Chr 14: 76.15 – 76.25 Mb (human), Chr 12: 86.29 – 86.34 Mb (mouse)
PubMed search: [3] (human), [4] (mouse)

Transcript variants

NCBI Gene lists 23 transcript variants of human GPATCH2L. Transcript variant 1 is the most common, and the variants differ in the exons they use.

Exon usages of 23 transcript variants of GPATCH2L Homo sapiens from NCBI Gene.[6]
23 Transcript Variants of GPATCH2L Homo sapiens from NCBI Gene.[6]
Name | Accession Number | # of Exons | Size (bp)
Transcript Variant 1 | NM_017926 | 10 | 14,021
Transcript Variant 2 | NM_017972 | 9 | 12,396
Transcript Variant 3, non-coding RNA | NR_110314 | 10 | 12,511
Transcript Variant 4 | NM_001322026 | 7 | 3,263
Transcript Variant 5 | NM_001322027 | 4 | 4,485
Transcript Variant 6 | NM_001322028 | 10 | 1,797
Transcript Variant 7 | NM_001322029 | 9 | 1,825
Transcript Variant 8 | NM_001322030 | 10 | 3,232
Transcript Variant 9 | NM_001322031 | 3 | 5,278
Transcript Variant 10 | NM_001322032 | 7 | 3,014
Transcript Variant X1 | XM_017021427 | 11 | 14,388
Transcript Variant X2 | XM_017021428 | 11 | 14,373
Transcript Variant X3 | XM_017021429 | 11 | 3,472
Transcript Variant X4 | XM_006720191 | 10 | 14,146
Transcript Variant X5 | XM_017021430 | 11 | 3,457
Transcript Variant X6 | XM_017021431 | 11 | 2,927
Transcript Variant X7 | XM_017021432 | 11 | 2,924
Transcript Variant X8 | XM_017021433 | 10 | 2,685
Transcript Variant X9 | XR_001750414 | 10 | 14,302
Transcript Variant X10 | XR_001750415 | 10 | 2,841
Transcript Variant X11 | XR_001750416 | 10 | 3,386
Transcript Variant X12 | XR_001750417 | 12 | 1,758
Transcript Variant X13 | XR_001750418 | 12 | 14,488

Protein

The human GPATCH2L protein (NP_060396) consists of 482 amino acids and has a molecular weight of 54,260 Da and a predicted isoelectric point of 8.77.[8] It has 17 isoforms, of which isoform 1 is the most common. Every human GPATCH2L isoform has a GPATCH2L domain, but no other significant smaller repeats were found, as can be seen in the schematic illustration below. Immunohistochemical staining of human prostate tissue with the GPATCH2L antibody HPA018856 shows distinct positivity in glandular cells, according to The Human Protein Atlas.[9]

Schematic illustration of the human GPATCH2L protein, created with the Illustrator for Biological Sequences (IBS) tool from GPS, showing the domain, disordered region, nuclear localization signal (NLS) regions, and the phosphorylation, phosphothreonine, O-GalNAc (mucin-type) glycosylation, O-(beta)-GlcNAc, N-glycosylation, and O-glycosylation sites.[10]
17 isoforms of GPATCH2L Homo sapiens protein from NCBI Gene.[6]
Name | Accession Number | Size (aa)
Isoform 1 | NP_060396 | 482
Isoform 2 | NP_060442 | 477
Isoform 4 | NP_001308955 | 434
Isoform 5 | NP_001308956 | 304
Isoform 6 | NP_001308957 | 447
Isoform 7 | NP_001308958 | 446
Isoform 8 | NP_001308959 | 469
Isoform 9 | NP_001308960 | 271
Isoform 10 | NP_001308961 | 402
Isoform X1 | XP_016876916 | 495
Isoform X2 | XP_016876917 | 490
Isoform X3 | XP_016876918 | 482
Isoform X4 | XP_006720254 | 477
Isoform X5 | XP_016876919 | 477
Isoform X6 | XP_016876920 | 460
Isoform X7 | XP_016876921 | 459
Isoform X8 | XP_016876922 | 442

Secondary and tertiary structure

Two figures show the tertiary structure of the human GPATCH2L protein predicted by AlphaFold.[11] Four alpha-helix bundles and two beta sheets are observable, and these are annotated on the conceptual translation.


The GPATCH2L domain annotated in yellow is shown on the predicted tertiary structure from AlphaFold[11] by using iCn3D viewer from NCBI.[6]

Interacting proteins

GPATCH2L human protein is known to interact with KRR1, DDX10, and NOL6 within the nucleolus and nucleus.

Predicted interacting proteins of GPATCH2L Homo sapiens from STRING.[12]
Name | Full Name | Function | Cell's Compartment | Experimental Validation | String-db Score
KRR1 | KRR1 small subunit processome component homolog | 1) Nucleolar protein required for rRNA synthesis and ribosomal assembly. 2) Enables RNA and protein binding. 3) Required for 40S ribosome biogenesis in the nucleolus. | Nucleolus | Experiments: 1) Detected by two-hybrid array assay. 2) Detected by affinity chromatography technology assay. 3) Inferred by author. 4) Detected by tandem affinity purification assay. | 0.650
DDX10 | Probable ATP-dependent RNA helicase DDX10 | 1) Promotes AIM2-inflammasome activation by maintaining AIM2 protein stability. 2) Promotes human lung carcinoma proliferation via U3 small nucleolar ribonucleoprotein IMP4. | Nucleus | Experiments: 1) Detected by two-hybrid array assay. 2) Inferred by author. 3) Detected by tandem affinity purification assay. | 0.548
NOL6 | Nucleolar protein 6 | A nucleolar RNA-associated protein. 1) Related to ribosome biogenesis in endometrial cancer. 2) Promotes the proliferation and migration of endometrial cancer cells by regulating TWIST1 expression. | Nucleus | Experiments: 1) Detected by two-hybrid array assay. 2) Inferred by author. 3) Detected by tandem affinity purification assay. | 0.527
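
STRING combined scores lie between 0 and 1, and the STRING interface conventionally groups them with cutoffs of 0.15 (low), 0.4 (medium), 0.7 (high), and 0.9 (highest confidence). The sketch below applies those cutoffs to the three scores in the table; the function name `string_confidence` is illustrative, and the cutoffs are STRING's usual defaults rather than values stated in this article.

```python
def string_confidence(score: float) -> str:
    """Map a STRING combined score (0..1) to its conventional confidence band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("STRING combined scores lie between 0 and 1")
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below low-confidence cutoff"


# Scores taken from the table of interacting proteins.
scores = {"KRR1": 0.650, "DDX10": 0.548, "NOL6": 0.527}
bands = {name: string_confidence(s) for name, s in scores.items()}
print(bands)
```

With these cutoffs, all three listed partners fall in the medium-confidence band.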

Gene level regulation

Promoter

The human GPATCH2L gene has a promoter (GXP_207451) located at chr14: 76,150,912 - 76,151,972.[13] The promoter is 1,061 bp long.
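
The promoter length, and its position relative to the gene start given in the lead section, can be worked out from the coordinates. This is a minimal sketch that assumes 1-based inclusive coordinates and, importantly, that the Genomatix promoter coordinates and the NCBI gene coordinates refer to the same genome assembly, which the sources cited here do not confirm. All names in the code are illustrative.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def overlap_length(a: tuple, b: tuple) -> int:
    """Number of positions shared by two 1-based, inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


PROMOTER = (76_150_912, 76_151_972)  # GXP_207451, as quoted above
GENE = (76_151_922, 76_214_343)      # gene span from the lead section (assumed same assembly)

promoter_len = interval_length(*PROMOTER)
upstream = GENE[0] - PROMOTER[0]         # promoter bases before the gene start
shared = overlap_length(PROMOTER, GENE)  # promoter bases inside the gene span

print(promoter_len, upstream, shared)  # 1061 1010 51
```

Under these assumptions the promoter extends 1,010 bp upstream of the annotated gene start and overlaps the first 51 bp of the gene.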

Transcription factor binding sites

In the table below, 5 transcription factors (KLFS, HOMF, SP1F, ZF02, and NFKB) are predicted to bind within a conserved section of the transcriptional regulatory region. Unlike the other 19 transcription factors in the table, MAZF is specifically active only in cartilage and skeletal tissues.[13] PLAG is specifically active only in bone marrow cells, the digestive system, embryonic structures, the endocrine system, germ cells, and the hematopoietic system. Most of the transcription factors are active in the ovary, lung, brain, prostate, and bone marrow cells, the tissues that show the highest values in the RNA-seq data from the Gene database record at NCBI.[6]

The multiple sequence alignment of mammalian GPATCH2L sequences shows that NFKB is the most conserved transcription factor binding site.
Detailed information about 20 transcription factors in human GPATCH2L from Genomatix.[13]
Element | Description/Full Name | Best Matrix Score | Number of Binding Sites in the Region
AP1F | AP1, Activating protein 1 | 0.903 | 1
NR2F | Nuclear receptor subfamily 2 factors | 0.826 | 5
LHXF | Lim homeodomain factors | 0.906 | 3
STAT | Signal transducer and activator of transcription | 0.896 | 2
HIFF | Hypoxia inducible factor, bHLH/PAS protein family | 0.989 | 5
HESF | Vertebrate homologues of enhancer of split complex | 0.985 | 2
KLFS | Krueppel like transcription factors | 0.912 | 13
GLIF | GLI zinc finger family | 0.914 | 5
PLAG | Pleomorphic adenoma gene | 0.845 | 4
SP1F | GC-Box factors SP1/GC | 0.855 | 9
EGRF | EGR/nerve growth factor-induced protein C & related factors | 0.930 | 4
HOMF | Homeodomain transcription factors | 0.980 | 5
CAAT | CCAAT binding factors | 0.926 | 1
AP2F | Activator protein 2 | 0.917 | 1
ZF02 | C2H2 zinc finger transcription factors 2 | 0.932 | 3
RXRF | RXR heterodimer binding sites | 0.850 | 9
SMAD | Vertebrate SMAD family of transcription factors | 0.994 | 2
CREB | cAMP-responsive element binding proteins | 0.844 | 6
NFKB | Nuclear factor kappa B/c-rel | 0.928 | 4
MAZF | Myc associated zinc fingers | 1.000 | 3

Transcript level regulation

Expression Pattern

RNA-seq was performed on tissue samples from 95 human individuals representing 27 different tissues to determine the tissue specificity of protein-coding genes; the data are available at NCBI.[6] For human GPATCH2L mRNA, the RNA-seq data show high expression in bone marrow, testis, and brain tissue. Tissues with low expression are the pancreas, liver, and salivary glands.

NCBI GEO profile across all tissues


Human GPATCH2L mRNA level in lung carcinoma and all normal tissues, including lung tissue, on average from NCBI GEO.[14]

A microarray-assessed tissue expression pattern (GDS596) from NCBI GEO shows markedly different expression of human GPATCH2L across tissues.[14] Expression in cerebellum, fetal brain, bone marrow, ovary, prostate, and lung tissues, which is high in the RNA-seq data, is extremely low in GDS596. However, expression in liver, pancreas, salivary gland, and fetal liver tissues remains low in both data sets.

A graph in NCBI GEO can be interpreted as follows: a 'single channel' sample is a hybridization in which cDNA obtained from one biosource is combined with the array.[14] This method is typically used for membrane (filter) arrays with radionucleotide labels and for high-density oligonucleotide arrays with fluorescent labels. This experiment type measures gene expression as scaled/normalized signal count values, which correspond to "Value" in the tables below and in the figures.

Three highest values among 158 samples in a microarray-assessed tissue expression pattern (GDS596) in GPATCH2L Homo sapiens from NCBI GEO.[14]
Sample/Tissue | Title | Value | Rank
GSM19012 / (Superior Cervical Ganglion) | 3AJZ02081478b_Superior_Cervical_Ganglion | 1408.7 | 81
GSM19014 / (Skeletal Muscle) | 3AJZ02083092b_Skeletal_Muscle_Psoas | 819.2 | 83
GSM19009 / (Dorsal Root Ganglion) | 3ARS02080736e_DRG | 787.4 | 81
Three lowest values among 158 samples in a microarray-assessed tissue expression pattern (GDS596) in GPATCH2L Homo sapiens from NCBI GEO.[14]
Sample/Tissue | Title | Value | Rank
GSM18875 / (PB - CD 56 + NK cells) | 3AMH02082109_PB_CD56NKCells | 10.1 | 19
GSM18969 / (Cardiac Myocytes) | 3AJZ02053107_CardiacMyocytes | 10.1 | 12
GSM18881 / (PB - CD 19 + B cells) | 3AMH02082107_PB_CD19BCells | 10.5 | 28

Predicted stem-loops and miRNA targeting

The predicted nucleic acid secondary structure of the human GPATCH2L 5' UTR shows one stem-loop, together with an in-frame stop codon, the start codon, and exon boundaries. In this stem-loop, a g and a g are not paired; however, both are conserved in the multiple sequence alignment of the stem-loop region. In the 3' UTR, 10 stem-loops are predicted, and these are shown enlarged in a separate figure. Although stem-loops 3-3, 3-6, and 3-9 show unusual structures (3-3: CCTT; 3-6: CAT and TTC; 3-9: GTG), every nucleotide is conserved in the corresponding multiple sequence alignment. Notably, stem-loop 3-8 includes hsa-miRNA-205, and every nucleotide of hsa-miRNA-205 is conserved in the multiple sequence alignment.



Protein level regulation

Immunohistochemistry (IHC)

GPATCH2L protein is highly expressed in bone marrow tissue, according to immunohistochemical staining of human hematopoietic cells with the GPATCH2L antibody HPA018856 from The Human Protein Atlas.[9] GPATCH2L protein has also been shown to be highly expressed in respiratory epithelial cells of the human bronchus and in endometrial stroma and glandular cells of the human endometrium.

Protein localization and abundance

Human GPATCH2L protein is mainly localized to the nucleoplasm, according to GPATCH2L antibody staining from The Human Protein Atlas and Thermo Fisher Scientific.[9][15] The protein is also predicted to be nuclear (82.6%), according to PSORT II[16] and the pI/MW tool from Expasy.[8] Compared with other human proteins, human GPATCH2L is isoleucine-poor (I-), serine-rich (S+), and arginine-rich (R+).[17] The post-translational modification sites (O-GalNAc (mucin-type) glycosylation, O-(beta)-GlcNAc, N-glycosylation, O-glycosylation, and phosphorylation) are annotated on the conceptual translation. The conceptual translation figures include only 1,560 bp of the mRNA and the 482 amino acids.

Human GPATCH2L isoform 1 (NM_017926.4) annotated conceptual translation 1
Human GPATCH2L isoform 1 (NM_017926.4) annotated conceptual translation 2

Homology and evolution

Orthologs and paralogs

Human GPATCH2L has orthologs in mammals, reptiles, birds, amphibians, fish, and invertebrates (including Arthropoda and Mollusca). The query cover, sequence identity, and sequence similarity values (%) decrease as the orthologs become more distant from Homo sapiens, as in the invertebrates. However, frogs are unusual in that they have a very low query cover (36.8% - 38.0%) and sequence identity (32.9% - 35.4%). No GPATCH2L homologs in fungi or bacteria could be found using NCBI HomoloGene.[18] The paralogs of human GPATCH2L were found using NCBI HomoloGene. In the tables below, MYA stands for "million years ago", and the corrected divergence [m] is calculated as m = 100 × (-ln(S)), where S is the sequence similarity expressed as a fraction (the percentage divided by 100).
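
The corrected divergence formula above can be checked against the tabulated values. The sketch below assumes that the similarity percentage is first converted to a fraction before the natural logarithm is taken, which is the reading that reproduces the numbers in the tables; the function name `corrected_divergence` is illustrative.

```python
import math


def corrected_divergence(similarity_percent: float) -> float:
    """Corrected divergence m = 100 * (-ln(S)), with S = similarity as a fraction."""
    if not 0 < similarity_percent <= 100:
        raise ValueError("similarity must be in the range (0, 100]")
    return -100.0 * math.log(similarity_percent / 100.0)


# Similarity values (%) taken from the ortholog and paralog tables.
examples = [("Mus musculus", 90.0), ("Xenopus laevis", 44.1), ("GPATCH1", 17.2)]
for name, similarity in examples:
    print(f"{name}: m = {corrected_divergence(similarity):.1f}")
# Mus musculus: m = 10.5
# Xenopus laevis: m = 81.9
# GPATCH1: m = 176.0
```

Because the logarithm grows without bound as similarity approaches zero, m can exceed 100 for very distant sequences, as it does for several invertebrate orthologs and most paralogs.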

20 different orthologs of GPATCH2L that include Mammalia, Reptiles, Birds, Amphibians, Fish, and Invertebrates.
GPATCH2L | Genus, Species | Common Name | Taxonomic Group | Divergence Date (MYA) | Accession Number | Query Cover | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) | Corrected Divergence [m]
Mammalia | Homo sapiens | Human | Primates | 0 | NP_060396.2 | 100.0% | 482 | 100.0% | 100.0% | 0
Mammalia | Mus musculus | House Mouse | Rodentia | 90 | XP_006516282.1 | 100.0% | 490 | 86.1% | 90.0% | 10.5
Mammalia | Ursus arctos horribilis | Grizzly Bear | Carnivora | 94 | XP_026375234.1 | 93.8% | 479 | 93.8% | 95.4% | 4.7
Mammalia | Equus caballus | Horse | Perissodactyla | 94 | XP_023483915.1 | 94.8% | 482 | 86.1% | 90.0% | 10.5
Reptiles | Chelonia mydas | Green Sea Turtle | Testudines | 318 | XP_007064235.1 | 85.8% | 485 | 85.8% | 90.3% | 10.2
Reptiles | Crocodylus porosus | Saltwater Crocodile | Crocodilia | 318 | XP_019407441.1 | 84.2% | 486 | 84.2% | 89.3% | 11.3
Aves | Dromaius novaehollandiae | Emu | Casuariiformes | 318 | XP_025976214.1 | 83.7% | 484 | 83.7% | 89.5% | 11.1
Aves | Falco cherrug | Saker Falcon | Falconiformes | 318 | XP_014139614.1 | 85.2% | 484 | 85.2% | 89.9% | 10.6
Amphibians | Xenopus laevis | African Clawed Frog | Anura | 352 | XP_018120469.1 | 36.8% | 481 | 32.9% | 44.1% | 81.9
Amphibians | Xenopus tropicalis | Western Clawed Frog | Anura | 352 | XP_004914844.1 | 37.4% | 481 | 33.7% | 46.5% | 76.6
Amphibians | Eleutherodactylus coqui | Common coquí (Frog) | Anura | 352 | KAG9484589.1 | 38.0% | 478 | 35.4% | 48.5% | 72.4
Fish | Paramormyrops kingsleyae | Elephantfish | Osteoglossiformes (bony fish) | 433 | XP_023683992.1 | 58.8% | 483 | 52.0% | 60.9% | 49.6
Fish | Oncorhynchus tshawytscha | Chinook Salmon | Salmoniformes (bony fish) | 433 | XP_024285887.1 | 56.2% | 481 | 50.6% | 61.3% | 48.9
Fish | Danio rerio | Zebrafish | Cypriniformes (Zebrafish) | 433 | XP_009293252.1 | 35.5% | 489 | 32.1% | 43.3% | 83.7
Fish | Carcharodon carcharias | Great White Shark | Lamniformes (sharks and rays) | 465 | XP_041071062.1 | 50.9% | 484 | 45.7% | 56.0% | 58.0
Invertebrates | Trachymyrmex septentrionalis | Ant | Arthropoda | 736 | XP_018345408.1 | 29.9% | 485 | 24.3% | 33.2% | 110.3
Invertebrates | Chionoecetes opilio | Snow Crab | Arthropoda | 736 | KAG0728066.1 | 31.3% | 478 | 21.0% | 32.1% | 113.6
Invertebrates | Amphibalanus amphitrite | Acorn Barnacle | Arthropoda | 736 | XP_043205896.1 | 30.1% | 489 | 23.8% | 34.9% | 105.3
Invertebrates | Owenia fusiformis | Tubeworm | Annelida | 736 | CAC9666901.1 | 33.3% | 480 | 27.8% | 41.0% | 89.2
Invertebrates | Pomacea canaliculata | Channeled Applesnail | Mollusca | 736 | XP_025105792.1 | 31.0% | 489 | 25.4% | 38.2% | 96.2
Invertebrates | Octopus sinensis | Asian Common Octopus | Mollusca | 736 | XP_036357334.1 | 34.2% | 462 | 26.2% | 36.6% | 100.5
6 paralogs of GPATCH2L Homo sapiens.
Paralogs [Homo sapiens] | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) | Corrected Divergence [m]
GPATCH2L | NP_060396.2 | 482 | 100.0% | 100.0% | 0
GPATCH2 | NP_060510.1 | 528 | 31.3% | 43.6% | 83.0
GPATCH1 | NP_060495 | 931 | 9.8% | 17.2% | 176.0
GPATCH3 | NP_071361 | 525 | 13.3% | 22.8% | 147.8
GPATCH4 | NP_056405 | 375 | 6.8% | 11.2% | 218.9
GPATCH8 | NP_001002909 | 1502 | 8.1% | 12.3% | 209.6
GPATCH11 | NP_777591 | 525 | 9.2% | 18.4% | 169.3

Evolution

GPATCH2L evolves more slowly than fibrinogen alpha chain but faster than cytochrome c.[19] In the unrooted tree of GPATCH2L protein, only one mammal (human) is included, because the mammalian sequences are very closely related to each other and would appear as many short branches. Among the invertebrates, Arthropoda shows the longest branch, meaning that it has diverged the most.
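
One common way to compare evolutionary rates of this kind is to plot corrected divergence against divergence date and compare the slopes. The sketch below fits a line through the origin to a few rows of the ortholog table. It is an assumption that the evolutionary graph was built this way, since the article does not state the method; the fibrinogen alpha chain and cytochrome c values needed for the actual comparison are not given here and are therefore not included. The function name `rate_through_origin` is illustrative.

```python
def rate_through_origin(points):
    """Least-squares slope k of y = k * x for a sequence of (x, y) pairs."""
    sum_xx = sum(x * x for x, _ in points)
    if sum_xx == 0:
        raise ValueError("at least one point needs a non-zero x value")
    sum_xy = sum(x * y for x, y in points)
    return sum_xy / sum_xx


# (divergence date in MYA, corrected divergence m) for selected orthologs
# from the table: mouse, sea turtle, African clawed frog, elephantfish, ant.
GPATCH2L_POINTS = [(90, 10.5), (318, 10.2), (352, 81.9), (433, 49.6), (736, 110.3)]

rate = rate_through_origin(GPATCH2L_POINTS)
print(f"approximate rate: {rate:.3f} units of m per million years")
```

A protein with a steeper slope on the same axes accumulates changes faster, which is the sense in which GPATCH2L is described as lying between cytochrome c and fibrinogen alpha chain.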

GPATCH2L Homo sapiens evolutionary graph
An unrooted tree of GPATCH2L protein, including Mammal, Bird, Reptile, Arthropoda, Mollusca, Annelida, Amphibians, and Fish, was created by using LIRMM.[20]

Distant homologs

The most closely related organisms that do or do not have GPATCH2L are as follows. In reptiles, GPATCH2L homologs were found in crocodile, turtle, snake, lizard, and gecko using NCBI BLAST,[21] while no homologs were found in skink, chameleon, and iguana. In amphibians, GPATCH2L homologs were found in frog, toad, and salamander using NCBI BLAST,[21] while other types of amphibians, such as caecilian, microsauria, and labyrinthodontia, do not contain the GPATCH2L gene. In invertebrates, NCBI BLAST[21] shows that scallop, starfish, octopus, and spider have the GPATCH2L gene, but sponge and jellyfish do not. Lastly, there are no GPATCH2L homologs in fungi and bacteria, according to NCBI BLAST.[21]

The analysis of amino acids in GPATCH2L human with distant homologs [ant, honeybee, acorn barnacle, octopus, and snail].

Function and biochemistry

The function of GPATCH2L is still unknown; however, the paralog GPATCH3 has been shown to participate in the innate immune response in mammals.[22] GPATCH3 negatively regulates the RLR-mediated innate antiviral response by disrupting VISA signalosome assembly.[23] It has also been shown to participate in ocular and craniofacial development.[24]

Clinical significance

An SNP (rs935332) within the human GPATCH2L region is related to scleroderma renal crisis (SRC), according to the validation cohort.[25] Immunostaining of renal biopsy sections demonstrated an increase in tubular expression of GPATCH2L, despite the absence of any genetic replication for the associated SNP.[25]

Retinitis Pigmentosa 24 is one of the diseases that is associated with this gene.[5] The expression of GPATCH2L in cancer is as follows: A few cases of pancreatic cancers exhibited strong immunoreactivity, while malignant lymphomas, colorectal, breast, and prostate cancers were negative or weakly stained.[26]

GPATCH2 is overexpressed in the great majority of breast cancer cases; it encodes a nuclear factor that may be important for tumor growth in breast cancer and for spermatogenesis.[27] An interaction between hPrp43 (an RNA-dependent ATPase) and the GPATCH2 protein greatly enhances the ATPase activity of hPrp43 and causes a growth-promoting effect on mammalian cells.[28] Since northern blot analyses of normal human organs suggest that GPATCH2 may be a novel cancer/testis antigen, targeting GPATCH2 or inhibiting the interaction between hPrp43 and GPATCH2 could be a therapeutic approach for breast cancer.[28]

References

  1. GRCh38: Ensembl release 89: ENSG00000089916 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000021254 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "GPATCH2L". www.genecards.org. GeneCards.
  6. "GPATCH2L G-patch domain containing 2 like [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  7. "AceView entry on FLJ20689". AceView.
  8. "GPATCH2L pI/MW". Expasy.
  9. "Tissue expression of GPATCH2L - Staining in prostate - The Human Protein Atlas". www.proteinatlas.org.
  10. "IBS: Illustrator for Biological Sequences". ibs.biocuckoo.org.
  11. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk.
  12. "STRING entry on GPATCH2L". string-db.org.
  13. "Genomatix entry on human GPATCH2L".
  14. "NCBI GEO entry on human GPATCH2L GDS596". www.ncbi.nlm.nih.gov.
  15. "Anti-GPATCH2L Antibodies | Invitrogen". www.thermofisher.com.
  16. "PSORT II Prediction". psort.hgc.jp.
  17. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk.
  18. "Home - Protein - NCBI". www.ncbi.nlm.nih.gov.
  19. "TimeTree :: The Timescale of Life". www.timetree.org.
  20. "LIRMM". www.lirmm.fr.
  21. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov.
  22. Li M, Liu C, Xu X, Liu Y, Jiang Z, Li Y, et al. (November 2020). "Grass carp (Ctenopharyngodon idella) GPATCH3 initiates IFN 1 expression via the activation of STING-IRF7 signal axis". Developmental and Comparative Immunology. 112: 103781. doi:10.1016/j.dci.2020.103781. PMID 32645337. S2CID 220465136.
  23. Nie Y, Ran Y, Zhang HY, Huang ZF, Pan ZY, Wang SY, Wang YY (April 2017). "GPATCH3 negatively regulates RLR-mediated innate antiviral responses by disrupting the assembly of VISA signalosome". PLOS Pathogens. 13 (4): e1006328. doi:10.1371/journal.ppat.1006328. PMC 5407853. PMID 28414768.
  24. Ferre-Fernández JJ, Aroca-Aguilar JD, Medina-Trillo C, Bonet-Fernández JM, Méndez-Hernández CD, Morales-Fernández L, et al. (April 2017). "Whole-Exome Sequencing of Congenital Glaucoma Patients Reveals Hypermorphic Variants in GPATCH3, a New Gene Involved in Ocular and Craniofacial Development". Scientific Reports. 7 (1): 46175. Bibcode:2017NatSR...746175F. doi:10.1038/srep46175. PMC 5387416. PMID 28397860. S2CID 28275432.
  25. Stern EP, Guerra SG, Chinque H, Acquaah V, González-Serna D, Ponticos M, et al. (November 2020). "Analysis of Anti-RNA Polymerase III Antibody-positive Systemic Sclerosis and Altered GPATCH2L and CTNND2 Expression in Scleroderma Renal Crisis". The Journal of Rheumatology. 47 (11): 1668–1677. doi:10.3899/jrheum.190945. PMID 32173657. S2CID 212728058.
  26. "The expression of GPATCH2L in cancer". The Human Protein Atlas.
  27. Lin ML, Fukukawa C, Park JH, Naito K, Kijima K, Shimo A, et al. (August 2009). "Involvement of G-patch domain containing 2 overexpression in breast carcinogenesis". Cancer Science. 100 (8): 1443–1450. doi:10.1111/j.1349-7006.2009.01185.x. PMID 19432882. S2CID 205235010.
  28. Lin ML, Fukukawa C, Park JH, Naito K, Kijima K, Shimo A, et al. (August 2009). "Involvement of G-patch domain containing 2 overexpression in breast carcinogenesis". Cancer Science. 100 (8): 1443–1450. doi:10.1111/j.1349-7006.2009.01185.x. PMID 19432882. S2CID 205235010.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.