PHP RegEx: a Pattern to Validate the Second Level Domain

Note: this is a theoretical question about the PHP flavor of regex, not a practical question about validation in PHP. I am merely using domain names for lack of a better example.

"Second Level Domain" refers to the combination of letters, numbers, period signs, and/or dashes that is placed between http:// or http://www. and .com (.co, .info, etc.).

I am only interested in second level domains that use the English version of the Latin alphabet.

This pattern:

[A-Za-z0-9.-]+

matches valid domain names, such as stackoverflow, StackOverflow, stackoverflow.co (as in stackoverflow.co.uk), stack-overflow, or stackoverflow123.

However, the same pattern would also match something like stack...overflow, stack---over--flow, ........ , -------- , or even . and -.
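
To make the problem concrete, here is a minimal sketch that runs the pattern above against a few of these strings. The helper name matchesLoosePattern is hypothetical, and the ^ ... $ anchors plus the D modifier are my own additions so that the whole string has to match; the character class itself is the one from the question.

```php
<?php
// Hypothetical helper: wraps the question's character class in anchors so the
// entire string must consist of letters, digits, dots, and dashes.
// The D modifier keeps $ from matching just before a trailing newline.
function matchesLoosePattern(string $node): bool
{
    return preg_match('/^[A-Za-z0-9.-]+$/D', $node) === 1;
}

$samples = ['stackoverflow', 'stack-overflow', 'stack...overflow', '--------', '.', '-'];

foreach ($samples as $sample) {
    // Every sample is reported as a match, including the unwanted ones.
    printf("%-18s %s\n", $sample, matchesLoosePattern($sample) ? 'match' : 'no match');
}
```

The character class only restricts which symbols may appear; it says nothing about their order, which is why strings made entirely of dots or dashes get through.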

How can that pattern be rewritten, to indicate that period signs and dashes, even though they can be used multiple times in a node,

  • cannot be used without other symbols,
  • cannot be placed twice or more side by side with each other,
  • and cannot be placed in the beginning or end of the node?

Thank you in advance!

2 Answers

  1. Marcy

    2019-11-16

    I think something like this should do the trick:

    ^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$
    

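    What this tries to do:

    ^ and $ anchor the match to the beginning and the end of the string

    [a-zA-Z0-9]+ matches one or more letters or digits
    [.-] matches either a dot or a hyphen right after them

    ( ... )* repeats the group above zero or more times

    the final [a-zA-Z0-9]+ requires one or more letters or digits at the end, so the string can never end with a dot or hyphen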
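
    As a rough check of the breakdown above, the pattern can be run through preg_match against the examples from the question. This is only a sketch: the function name isValidNode is hypothetical, and the D modifier is an addition of mine so that $ does not accept a trailing newline.

    ```php
    <?php
    // Hypothetical helper around the pattern from this answer.
    // Each dot or hyphen must be preceded and followed by a letter or digit.
    function isValidNode(string $node): bool
    {
        return preg_match('/^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$/D', $node) === 1;
    }

    $samples = [
        'stackoverflow', 'stackoverflow.co', 'stack-overflow', 'stackoverflow123',
        'stack...overflow', 'stack---over--flow', '........', '.', '-',
        '-stack', 'stack-',
    ];

    foreach ($samples as $sample) {
        printf("%-20s %s\n", $sample, isValidNode($sample) ? 'valid' : 'invalid');
    }
    ```

    Note that a mixed pair such as stack.-overflow is rejected as well, because the separator class [.-] matches exactly one character between two alphanumeric runs.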

  2. Mark

    2019-11-16

    Assuming that you are looking for a regex that does not allow two consecutive . or - characters, you can use:

    ^[a-zA-Z0-9]+([-.][a-zA-Z0-9]+)*$
    

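
    This pattern puts the repeated group after the first alphanumeric run, whereas the pattern in the first answer puts it before the last one. A small sketch can compare the two on the question's examples; the constant and function names here are hypothetical, and the D modifier is an assumption added so that $ only matches at the very end of the string.

    ```php
    <?php
    // Pattern from the first answer: repeated "run + separator" groups, then a final run.
    const PATTERN_GROUP_FIRST = '/^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$/D';
    // Pattern from this answer: a first run, then repeated "separator + run" groups.
    const PATTERN_GROUP_LAST  = '/^[a-zA-Z0-9]+([-.][a-zA-Z0-9]+)*$/D';

    // Hypothetical helper: true when both patterns give the same verdict for $node.
    function patternsAgree(string $node): bool
    {
        $first = preg_match(PATTERN_GROUP_FIRST, $node) === 1;
        $last  = preg_match(PATTERN_GROUP_LAST, $node) === 1;
        return $first === $last;
    }

    $samples = ['stackoverflow', 'stackoverflow.co', 'stack-overflow',
                'stack...overflow', 'stack---over--flow', '.', '-', 'stack-'];

    foreach ($samples as $sample) {
        $verdict = preg_match(PATTERN_GROUP_LAST, $sample) === 1 ? 'valid' : 'invalid';
        printf("%-20s %-8s %s\n", $sample, $verdict, patternsAgree($sample) ? 'same' : 'differs');
    }
    ```

    On these samples the two patterns give the same result, so the choice between them is mostly a matter of which reads more naturally.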
