[regex-coach] Two Problems
Snow Squall
snow_squall at hotmail.com
Wed Nov 24 19:05:39 UTC 2004
Hello, first-time poster here.
I have two issues that I can't seem to track down.
1. I'm looking for a rule that will catch the following. I want to find
all of my web pages that have some snippet of code before the
<!DOCTYPE. My <!DOCTYPE should start the HTML on my web pages, but I've
seen individuals sneak in code like this:
<!-- saved from -->
<!DOCTYPE HTML Public ...etc...
So is there a regex construct that will fail if any characters are found
before <!DOCTYPE?
2. Second, I'm looking to find ONLY the PDFs inside the test.com domain. I
want to match a pattern like http://www.test.com/snow/squall/index.pdf. I
know to start my regex with http://www\.test\.com, but how do I ignore all the
directory stuff and key in on the .pdf extension?
Thanks
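
For illustration, here is a minimal sketch of both checks. It uses Python's
re module rather than Regex Coach itself, and it assumes each page is read
into a single string and each URL is tested on its own. The function and
constant names are made up for this example. The \A anchor is also part of
Perl-compatible syntax, so the same idea should carry over, but treat that
as something to verify in the tool.

```python
import re

# Assumption: the whole page is one string. \A anchors at the very start of
# the string, so any leading text (such as a comment) prevents a match.
STARTS_WITH_DOCTYPE = re.compile(r"\A<!DOCTYPE", re.IGNORECASE)

# Assumption: each URL is tested by itself. [^\s?#]* skips over whatever
# directory path follows the host, and \.pdf\Z requires the URL to end with
# the .pdf extension (case-insensitively, so .PDF also counts).
TEST_COM_PDF = re.compile(
    r"\Ahttp://www\.test\.com/[^\s?#]*\.pdf\Z", re.IGNORECASE
)


def starts_with_doctype(page: str) -> bool:
    """Return True only if nothing at all precedes <!DOCTYPE."""
    return STARTS_WITH_DOCTYPE.match(page) is not None


def is_test_com_pdf(url: str) -> bool:
    """Return True only for .pdf URLs under http://www.test.com/."""
    return TEST_COM_PDF.match(url) is not None
```

With these, starts_with_doctype("<!-- saved from -->\n<!DOCTYPE HTML ...")
is False, and is_test_com_pdf("http://www.test.com/snow/squall/index.pdf")
is True. The first check is deliberately strict: even leading whitespace
counts as characters before <!DOCTYPE.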