Hello,
I am desperately trying to parse the following text. My goal is for each element enclosed in {{ }} to become an item in a list.
I am using the findall function for this, but without success.
Thanks in advance for your help.
Code:
re.findall("{{ ?.*|\s* ?}}",snp_page)
The text to parse:
This SNP, a variant in the [[BRCA1]] gene, is 1 of 25 SNPs reported to represent independently minor, but cumulatively significant, increased risk for [[breast cancer]]. {{PMID|17341484}}
Code:
{{Rsnum |rsid=1799966 |Gene=BRCA1 |Chromosome=17 |position=43071077 |Orientation=minus |ReferenceAllele=A |MissenseAllele=G |GMAF=0.3274 |Assembly=GRCh38 |GenomeBuild=38.1 |dbSNPBuild=141 |geno1=(A;A) |geno2=(A;G) |geno3=(G;G) |Gene_s=BRCA1 }}
{{ population diversity | geno1=(A;A) | geno2=(A;G) | geno3=(G;G) | CEU | 45.5 | 46.4 | 8.2 | HCB | 46.3 | 41.9 | 11.8 | JPT | 53.1 | 38.9 | 8.0 | YRI | 63.3 | 35.4 | 1.4 | ASW | 50.9 | 45.6 | 3.5 | CHB | 46.3 | 41.9 | 11.8 | CHD | 29.4 | 55.0 | 15.6 | GIH | 31.6 | 50.0 | 18.4 | LWK | 62.7 | 32.7 | 4.5 | MEX | 41.4 | 43.1 | 15.5 | MKK | 60.9 | 35.9 | 3.2 | TSI | 41.2 | 51.5 | 7.2 | HapMapRevision=28 }}
For details of all 25 SNPs in this group, along with the two methods used to calculate overall risk estimates for [[breast cancer]], refer to the SNPedia [[breast cancer]] entry.
For this particular SNP, the risk (minor) allele is (G).
Code:
{{ClinVar |rsid=1799966 |Reversed=1 |FwdREF=A |FwdALT=G,T |REF=T |ALT=A,C |RSPOS=41223094 |CHROM=17 |GMAF=0.3274 |dbSNPBuildID=89 |SSR=0 |SAO=0 |VP=0x05016800000017051f100101 |GENEINFO=BRCA1:672 |GENE_NAME=BRCA1 |GENE_ID=672 |WGT=0 |VC=SNV |CLNALLE=1; 2 |CLNHGVS=NC_000017.10:g.41223094T>A; NC_000017.10:g.41223094T>C |CLNSIG=2 |CLNCUI= |CLNACC= RCV000031194.2; RCV000048673.2; RCV000034753.1; RCV000048672.2 |Tags=RV;PM;PMC;SLO;VLD;G5A;G5;HD;GNO;KGPhase1;KGPilot123;KGPROD;OTHERKG;PH3;LSD |CAF=0.6726; 0.3274 |CLNDBN=Breast-ovarian cancer, familial 1; Familial cancer of breast; not provided |CLNDSDB=GeneReviews:MedGen:OMIM:Orphanet; GeneReviews:MedGen:OMIM:SNOMED_CT |CLNDSDBID=NBK1247:C2676676:604370:145; NBK1247:C0346153:114480:254843006 |COMMON=1 |Disease=Breast-ovarian cancer; Familial cancer of breast; not provided }}
{{PMID Auto |PMID=18559551 |Title=Pathway analysis of single-nucleotide polymorphisms potentially associated with glioblastoma multiforme susceptibility using random forests. }}
{{GET Evidence |gene=BRCA1 |aa_change=Ser1634Gly |aa_change_short=S1634G |impact=not reviewed |qualified_impact=Insufficiently evaluated not reviewed |inheritance=unknown |quality_scores=Array |dbsnp_id=rs1799966 |overall_frequency_n=3203 |overall_frequency_d=10758 |overall_frequency=0.297732 |n_genomes=3 |n_genomes_annotated=0 |n_haplomes=3 |n_articles=0 |n_articles_annotated=0 |qualityscore_in_silico=1 |qualitycomment_in_silico=Y |gene_in_genetests=Y |genetests_testable=Y |genetests_reviewed=Y |nblosum100=2 |autoscore=2 |webscore=N }}
{{on chip | 23andMe v1}}
{{on chip | 23andMe v2}}
{{on chip | 23andMe v3}}
{{on chip | 23andMe v4}}
{{on chip | FTDNA2}}
{{on chip | FTDNA}}
{{on chip | Illumina Human 1M}}
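
One possible approach, offered as a sketch rather than a definitive answer: in the pattern from the question, the top-level `|` splits the expression into two alternatives (`{{ ?.*` or `\s* ?}}`), and `.*` is greedy and stops at line breaks. A non-greedy group between escaped braces, compiled with `re.DOTALL`, may behave closer to what is wanted. The names `TEMPLATE_RE` and `extract_templates` are hypothetical, the sample string is a shortened stand-in for `snp_page`, and the sketch assumes templates are never nested inside one another.

```python
import re

# Non-greedy capture of everything between "{{" and "}}".
# re.DOTALL lets "." match newlines, so multi-line templates are captured too.
# Assumption: templates are not nested ({{ a {{ b }} }} would be cut short).
TEMPLATE_RE = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)


def extract_templates(text):
    """Return the content of each {{ ... }} block, in order, with outer whitespace removed."""
    return [content.strip() for content in TEMPLATE_RE.findall(text)]


# Shortened stand-in for the page text shown in the question.
snp_page = (
    "increased risk for [[breast cancer]]. {{PMID|17341484}}\n"
    "{{Rsnum\n|rsid=1799966\n|Gene=BRCA1\n}}{{ population diversity\n| geno1=(A;A)\n}}\n"
    "{{on chip | 23andMe v1}}"
)

templates = extract_templates(snp_page)
for item in templates:
    print(repr(item))
```

Each list item can then be split on `|` if the individual fields are needed. For pages that do nest templates, a dedicated wikitext parser would probably be more robust than a regular expression.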