In the previous matching tag examples, my input has matched p and H2 tags, and the Regex finds them just fine. However, there's nothing in the regular expression itself that requires the opening and closing tags to match. I'm going to add a test that shows that unmatched tags will still pass this Regex, and then see if I can figure out how to require them to match. I seem to remember that there's a way to do that with the Regex class. Here's the new test:
[Test] public void InvalidXmlNotHandledYet() {
  Regex r = new Regex("<(?<prefix>.*)>(?<body>.*)</(?<suffix>.*)>");
  Match m = r.Match("<p>this is a para</H2>");
  Assert(m.Success);
  AssertEquals("p",m.Groups["prefix"].Value);
  AssertEquals("H2",m.Groups["suffix"].Value);
}
Just as expected, the same Regex matches a p followed by an H2. Not what we really want, but we want to be sure we understand what our code does. This test now motivates the next extension, to a Regex that does force the tags to match. I'm not sure we will need this; we may already have gone beyond our current need for regular expressions, but my mission here is to learn as much as I can, in a reasonable time, about how Regex works. Now I'll have to search the Help a bit. Hold on...
The documentation seems to suggest that you can have named backreferences, using \k. I'll write a test. Hold on again... All right! Worked almost the first time: just a simple mistake away from perfect. Here's the new test:
[Test] public void Backreference() {
  Regex r = new Regex("<(?<prefix>.*)>(?<body>.*)</\\k<prefix>.*>");
  Match m = r.Match("<p>this is a para</p>");
  Assert(m.Success);
  m = r.Match("<p>this is a para</H2>");
  Assert(!m.Success);
}
In this test, notice that we had to type \\k to get the \k into the expression. This is because C# strings, like strings in most languages, already use the backslash to escape newlines and other special characters. We have to type two of them to get one backslash into the string. The amazing thing is that I actually remembered to do that the first time! The mistake? I left the word suffix there instead of saying \k<prefix>, as was my intent.
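C# also offers verbatim string literals, written with a leading @, in which a backslash stands for itself. The sketch below is only an illustration of that alternative: it writes the same pattern both ways and checks the two inputs from the test. The class name BackreferenceSketch and the helper TagsMatch are hypothetical names made up for this example, not part of the code in the text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, invented for this illustration.
public static class BackreferenceSketch
{
    // Regular string literal: the backslash must be doubled to reach the Regex.
    public const string Escaped = "<(?<prefix>.*)>(?<body>.*)</\\k<prefix>.*>";

    // Verbatim string literal: the backslash is taken literally, so one is enough.
    public const string Verbatim = @"<(?<prefix>.*)>(?<body>.*)</\k<prefix>.*>";

    // True when the input contains an opening tag whose closing tag
    // starts with the same text, as enforced by the \k<prefix> backreference.
    public static bool TagsMatch(string input)
    {
        return new Regex(Verbatim).IsMatch(input);
    }
}
```

With these definitions, BackreferenceSketch.Escaped and BackreferenceSketch.Verbatim are the same string, TagsMatch("<p>this is a para</p>") returns true, and TagsMatch("<p>this is a para</H2>") returns false. Which spelling to use is a matter of taste; the verbatim form tends to be easier to read once a pattern contains several backslashes.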