Dotnet discussion:

Regular expression: filter out everything except one expression


Topic:

Dotnet

  1. #1
    Regular member (registered October 2004, 92 messages, 70 points)
    Hello,

    Calling all regular expression experts. I need to write a regular expression to filter URLs. It has to express a negation.

    Goal: let every URL through except those of the form http://test/admin* and http://test/login.aspx

    Test set:
    1. http://test/admin
    2. http://test/admin/
    3. http://test/admin_client/
    4. http://test/admin/login.aspx
    5. http://test/login.aspx
    6. http://test/toto.aspx
    7. http://test/titi/tata.aspx

    Expected result: only URLs 6 and 7 of the test set should be accepted.

    Regular expression: http://(.*)/(?!(admin(.*)))(?!login.aspx)(.*)

    => URL 4 is fine: it is correctly filtered out. However, I am struggling with the other admin URLs (1, 2 and 3) in the test set.
    I am using http://regexlib.com/RETester.aspx to test my regular expressions easily.

    Thanks in advance for your help!

  2. #2
    Regular member (registered October 2004, 92 messages, 70 points)
    I got the solution on another forum: http://regexadvice.com/forums/69876/...ead.aspx#69876
    Ty:

    http://([^/]*)/(?!admin.*)(?!login.aspx).*

    I've cleaned up the lookaheads a bit, just to get rid of the unnecessary parentheses and make them more readable.

    To see what was going wrong with your pattern, I'll explain it a section at a time:

    http:// - all OK here

    (.*)/ - this is where things start to go wrong. I think you intend to match everything up to the next "/" character, i.e. to skip the "domain" part of the URL. However, '.*' means "match zero or more of any character, matching as many as possible". Because the regex engine works through the pattern one operator at a time, it does what you say and matches everything from after the "http://" to the end of the string (or to the end of the line if there are multiple lines and you are not using the "singleline" or "dot matches newline" option).

    At this point, it tries to match the "/" but is at the end of the string, so it has to backtrack, releasing one character at a time from those matched by the '.*' until it finds a "/" character. As you can see, this effectively searches for the LAST "/" in the string. In your test cases #2 and #3, that is the last character of the URL. What follows are the negative lookaheads, which succeed while matching nothing (at the end of the string the lookahead's contents cannot match, and the negation turns that failure into a "succeed"), and another ".*", which is quite happy to match nothing at all.

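    In test cases #1, #4 and #5, the backtracking leaves "admin" or "login.aspx" immediately after the last "/", so one of the negative lookaheads rejects that position. In #4 the engine then backtracks to the earlier "/", where it finds "admin" and is rejected again. No other "/" is left to try, so the pattern works for those cases.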

    For test cases #6 and #7, the negative lookaheads both succeed and so you get the match.
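
The greedy behaviour described above can be reproduced with a short script. This is only a sketch: it uses Python's `re` module as a stand-in for the .NET regex engine (an assumption; greedy quantifiers and negative lookaheads behave the same way for this pattern), and the helper name `trace` is made up for illustration.

```python
import re

# Original pattern from the question: greedy '.*' before the slash.
ORIGINAL = re.compile(r"http://(.*)/(?!(admin(.*)))(?!login.aspx)(.*)")

# The seven test URLs from the question, in order.
URLS = [
    "http://test/admin",
    "http://test/admin/",
    "http://test/admin_client/",
    "http://test/admin/login.aspx",
    "http://test/login.aspx",
    "http://test/toto.aspx",
    "http://test/titi/tata.aspx",
]

def trace(pattern, urls):
    """Hypothetical helper: return (url, matched, group 1) for each URL."""
    rows = []
    for url in urls:
        m = pattern.search(url)
        rows.append((url, m is not None, m.group(1) if m else None))
    return rows

greedy_rows = trace(ORIGINAL, URLS)
for number, (url, matched, captured) in enumerate(greedy_rows, start=1):
    verdict = "accepted" if matched else "rejected"
    print(number, url, verdict, repr(captured))
```

For URLs 2 and 3 the first group swallows the whole path up to the last "/" (for example `test/admin`), which is why they slip through, while URLs 1, 4 and 5 are rejected.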

    The "typical" correction for the greediness of the '.*' operator is to use '.*?', which means "match zero or more of any character, matching as few as possible". However, this doesn't work in this case because of the way the regex engine actually handles laziness.

    When it sees '.*?/', the engine first lets the '.' match nothing and checks whether what follows can match, in this case the '/' of the pattern. If that fails, the engine goes back to the '.' and lets it match one more character, which it almost always can. This carries on until it reaches the first "/" in (say) test case #2 (after the "http://test" part). Now the '/' matches, so the engine moves on in the pattern and gets to the first negative lookahead. Its contents match the "admin" part, so the negative lookahead returns a "fail".

    The regex engine then backtracks to see if there is some other path that will lead to a match. That means it backs off the "/" it has matched and returns to the '.*?'. As we've reached this as a result of a failure further on, the engine uses the '.' to match the "/" character, and the process described in the previous paragraph starts all over again, this time consuming the "admin" characters until it gets to the "/" at the end. We are now in the situation where neither lookahead's contents can match, so both succeed; the final '.*' can also succeed, and an overall match is declared.
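
To see the lazy variant fail in the way just described, here is a small sketch, again with Python's `re` module standing in for the .NET engine (an assumption). The name `first_group` is hypothetical and the pattern is simply the '.*?' variant discussed above.

```python
import re

# Lazy variant: '.*?' instead of '.*' before the slash.
LAZY = re.compile(r"http://(.*?)/(?!admin.*)(?!login.aspx).*")

def first_group(url):
    """Hypothetical helper: what '(.*?)' captured, or None if no match."""
    m = LAZY.search(url)
    return m.group(1) if m else None

lazy_captures = {
    url: first_group(url)
    for url in (
        "http://test/admin",
        "http://test/admin/",
        "http://test/admin_client/",
        "http://test/titi/tata.aspx",
    )
}
for url, captured in lazy_captures.items():
    print(url, "->", repr(captured))
```

After the lookahead fails at the first "/", the lazy group is allowed to grow past it, so for `http://test/admin/` it ends up capturing `test/admin` and the URL is still accepted.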

    My solution involves explicitly matching all non-"/" characters and then the "/" character. The regex has no alternate path it can use to backtrack past this first "/", so the negative lookaheads are forced to operate on the required part of every test case and return the required results.
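
A quick check of the corrected pattern against the test set from the question might look like this. It is a sketch under the same assumption as before (Python's `re` module standing in for .NET), and `is_allowed` is a made-up name.

```python
import re

# Corrected pattern from the answer: '[^/]*' cannot cross a slash.
FIXED = re.compile(r"http://([^/]*)/(?!admin.*)(?!login.aspx).*")

URLS = [
    "http://test/admin",
    "http://test/admin/",
    "http://test/admin_client/",
    "http://test/admin/login.aspx",
    "http://test/login.aspx",
    "http://test/toto.aspx",
    "http://test/titi/tata.aspx",
]

def is_allowed(url):
    """Hypothetical filter: True when the URL passes the pattern."""
    return FIXED.search(url) is not None

allowed = [url for url in URLS if is_allowed(url)]
# The first group can only ever hold the host part now.
hosts = {FIXED.search(url).group(1) for url in allowed}

print(allowed)
print(hosts)
```

Only URLs 6 and 7 remain in `allowed`, and the captured group is always `test`. Note that the `.` in `login.aspx` is left unescaped, as in the thread, so it matches any character; `login\.aspx` would be the stricter spelling.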

    Susan
    Thanks to Susan :-)

This discussion is resolved.
