XSD Restriction pattern 2

Status
Not open for further replies.

ClulessChris

IS-IT--Management
Jan 27, 2003
890
GB
Hi,
I'm trying to write a regex pattern for my schema. I want to prohibit the use of certain characters ("&'#) & --
I have the following pattern that works to a point:

[a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]*

I'm having trouble restricting the hyphen '-' to a single char as I need to exclude double hyphen '--'

I've tried a few things, such as

[a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]*[-]?

but it's now driving me crazy. Any help would be most welcome.
 
I think probably your solution will require grouping. I am going to use a simpler pattern that I have used to restrict to a hyphenated data name, that excludes adjacent hyphens, as my example (I found this a while back via Google search). You can expand the set of characters [A-Za-z0-9] as needed for your situation.

Code:
^[A-Za-z0-9]+(-[A-Za-z0-9]+)*

The parentheses define a group: a hyphen followed by one or more non-hyphen characters. The group can appear zero or more times. The beginning of the RegEx defines the same set of non-hyphen characters, which can be repeated one or more times. So there is no way for a hyphen to appear adjacent to another hyphen, or at the beginning or end (which was a requirement for me).
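As a quick check of that description, here is a minimal Python sketch. The function name is_hyphenated_name is made up for illustration, and re.fullmatch is used on the assumption that the whole value has to match, as it does for an XSD pattern facet.

```python
import re

# Same structure as the pattern above: one or more non-hyphen characters,
# then zero or more groups of "hyphen + one or more non-hyphen characters".
HYPHENATED_NAME = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")


def is_hyphenated_name(value):
    """Return True if the whole value matches the hyphenated-name pattern."""
    return HYPHENATED_NAME.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ["abc", "abc-def", "abc--def", "-abc", "abc-"]:
        print(repr(sample), is_hyphenated_name(sample))
```

Single hyphens between runs of characters pass; doubled, leading, or trailing hyphens do not.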

I hope this helps...

Tom Morrison
Hill Country Software
 
\-([^\-]|$) will match a hyphen that is followed by a character that is not a hyphen or by the end of the string.

Based on your first pattern:
Code:
([a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]|\-([^\-]|$))*
 
Both, thanks for your input, this has taken up the majority of my day.
k5tm, I tried grouping, in several variations. I think it was the greedy " that was the root of my problem, but I just couldn't find a way around it.
atlopes, many thanks. I've left the office for the day but will give that a go tomorrow and let you know how I get on.

thanks again.
Chris
 
atlopes, doesn't the [^\-] allow characters from the unwanted set to follow a hyphen?

Tom Morrison
Hill Country Software
 
atlopes, at first test that pattern works; however, as k5tm points out, it allows the unwanted characters to follow a hyphen.
But it's still a step forward from what I had yesterday.
 
Yes, Tom's remark was spot on.

A new alternative, better tested this time:
Code:
^([a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]|-([a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]|$))+$
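To see the difference between the two attempts, here is a small Python sketch that runs both against a few sample values. It assumes a regex engine where ^ and $ are anchors (as in a typical regular-expression editor, which is not how XSD treats them); the helper accepts and the sample values are illustrative only.

```python
import re

# Allowed character set from the original post.
ALLOWED = r"[a-zA-Z\s\d_\. !$%^*()_+={}|:;,?~/£@\\]"

# First attempt: a hyphen may be followed by ANY non-hyphen character.
FIRST = re.compile("(" + ALLOWED + r"|\-([^\-]|$))*")

# Revised attempt: a hyphen may only be followed by an allowed character
# or by the end of the string.
REVISED = re.compile("^(" + ALLOWED + "|-(" + ALLOWED + "|$))+$")


def accepts(pattern, value):
    """Return True if the pattern matches the entire value."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ['a-b', 'a--b', 'a-"b', 'a-#', 'a-']:
        print(repr(sample), accepts(FIRST, sample), accepts(REVISED, sample))
```

Both patterns reject a double hyphen, but only the revised one also rejects a prohibited character placed directly after a hyphen.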
 
Fantastic!

I've made a slight amendment to include characters I'd missed from the pattern in the OP and removed the anchors, but this seems just the ticket.

Code:
<xs:pattern value="([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|-([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|$))+"/>
 
One further amendment: I found that the last pattern I posted was rejecting a hyphen at the end of the string.

So I now have:
Code:
<xs:pattern value="([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|-([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|$))+[-$]?"/>
 
Without the anchors, the |$ inside the group following the hyphen becomes redundant, so you can also get rid of it.
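One way to check this, sketched in Python. It assumes that XSD treats $ as a literal character (XSD patterns have no anchors and always match the whole value), so the literal is escaped as \$ for Python. Since $ is already in the allowed set, the extra alternative should add nothing. The names and the small test alphabet are made up for illustration.

```python
import re
from itertools import product

# Allowed character set from the amended pattern above.
ALLOWED = r"[a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]"

# With the "|$" alternative, written as a literal dollar sign (assumed XSD meaning).
WITH_DOLLAR = re.compile("(" + ALLOWED + "|-(" + ALLOWED + r"|\$))+[-$]?")

# The same pattern with the "|$" alternative removed.
WITHOUT_DOLLAR = re.compile("(" + ALLOWED + "|-" + ALLOWED + ")+[-$]?")


def patterns_agree(alphabet='a-$"', max_len=4):
    """Compare both patterns on every string over the alphabet up to max_len."""
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            value = "".join(chars)
            with_match = WITH_DOLLAR.fullmatch(value) is not None
            without_match = WITHOUT_DOLLAR.fullmatch(value) is not None
            if with_match != without_match:
                return False
    return True


if __name__ == "__main__":
    print(patterns_agree())
```

This is only a brute-force comparison over short strings, not a proof, but it covers the hyphen and dollar-sign combinations in question.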

Tried finally with an XSD editor (not just with a Regular Expression editor): it's your corrections that make this work.
 
A final question: do you pretend to allow a single hyphen as the value for the element / attribute?
 
Sorry I don't understand the question. A single hyphen is allowed, what do you mean 'pretend'?
 
Given your pattern
Code:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="root">
    <xs:complexType>
      <xs:attribute name="testPattern">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:pattern value="([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|-([a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]|$))+[-$]?"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>
testPattern="-" is not allowed, whereas testPattern="+", for instance, is (a - must either be followed by some other allowed character, or be preceded by some other allowed character and placed at the end of the string). This may or may not be what you "pretend" (sorry about the terminology; English is not my native language and I might be following a false friend here).
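These cases can be checked outside an XSD editor with a short Python sketch. It is only an approximation of the schema's behaviour: it assumes XSD treats $ as a literal character and matches the whole value, so the literal is escaped as \$ and re.fullmatch supplies the anchoring. The function name is_valid is hypothetical.

```python
import re

# Allowed character set copied from the pattern in the schema above.
ALLOWED = r"[a-zA-Z\s\d_\. !$%^*\[\]()_+={}|:;,?~/£€@\\]"

# Python rendering of the schema pattern; \$ stands for XSD's literal "$".
PATTERN = re.compile("(" + ALLOWED + "|-(" + ALLOWED + r"|\$))+[-$]?")


def is_valid(value):
    """Return True if the whole value matches the pattern."""
    return PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ["-", "+", "a-", "-a", "a-b", "a--b", 'a"b']:
        print(repr(sample), is_valid(sample))
```

A lone hyphen fails because the repeated group must match at least once and a hyphen cannot start a match unless something allowed follows it; the optional trailing [-$] only applies after that group has matched.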
 
Just about the English: Pretend means "to do as if", I think you meant "specify" rather than pretend, atlopes.

There is a single word for both things in German, for example: "vorgeben" in one context can mean to specify and in another means to pretend, in the sense of specifying something untrue; the full verb then is "vorgeben zu sein (oder haben)", which actually makes it another verb via the auxiliary verbs "haben" or "sein". Maybe something similar is at work in your native language, too, but I would guess you're rather Spanish or Mexican perhaps? I may be wrong there, thinking of Lopez.

So the question likely is whether the specification should allow a value consisting of only a single "-" to match the pattern. Maybe really very unimportant, as it would be a rare corner case.

Bye, Olaf.
 
Olaf,

Olaf said:
I may be wrong there

You left out Portuguese - which I think has a good chance of being correct.

Thanks for the interesting German lesson. I am several decades removed from my days when I actually could converse in German. I never learned this type of auxiliary verb.

Tom Morrison
Hill Country Software
 
Yes, I was really following a false friend: in Portuguese, "pretender" (a verb) means not only "to pretend", but also "to intend". I hope to not forget it in the future, because looking back I think I used it wrongly more than once.

As for the name, when you find a surname that looks like Spanish but ends in "-es" instead of "-ez", then most likely it's a Portuguese name (like "Gomes", instead of "Gomez", "Lopes" instead of "Lopez", and so on - but they share the same semantics, meaning "the son of", like the "-son" suffix in English, or "-ic" in Slavic languages, ...).
 
So it's about the same, OK.

Tom, auxiliary verbs are not a German-only thing; the term exists in English, too: be, have, will, do... So German has that kind of compound verb, too, not always put together into one word.
And nice to know. So "z" vs "s" is about the same difference as in US English vs UK English ;)

Bye, Olaf.
 