.. without breaking anything. It sounds easy enough, but there are some odd quirks
that make it more challenging than I want to handle :) See the full spec for
the way I see it...
Deliverables:
Easy one for those who know regexes well! If I have a big long string that basically
contains a messy HTML page, I want to strip out:
1: All HTML comments
2: All JavaScript comments
3: All whitespace that can be removed
Here's the HTML comments regex:
$buffer = preg_replace('/<!--.*?-->/s', '', $buffer);
Problem 1: If you follow that and you know HTML, you'll know that it's great
unless you're inside a tag pair that uses comments to protect its content from
older browsers that don't understand it (the most common, I guess, being {script}
and {style} tags).
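A rough sketch of one way round Problem 1, not a finished answer: match whole {script} and {style} blocks first and hand them back untouched, so the comment pattern only ever fires outside them. The function name strip_html_comments is made up for illustration, and it assumes the blocks are properly closed.

```php
<?php
// Hypothetical helper (name is illustrative): remove HTML comments, but skip
// anything that sits inside <script>...</script> or <style>...</style>.
function strip_html_comments(string $html): string
{
    // Alternative 1 (group 1): a whole script/style block, kept as-is.
    // Alternative 2: an HTML comment anywhere else, dropped.
    $pattern = '~(<(script|style)\b[^>]*>.*?</\2\s*>)|<!--.*?-->~is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Group 1 is only present when a script/style block matched.
            return (isset($m[1]) && $m[1] !== '') ? $m[1] : '';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

Called as $buffer = strip_html_comments($buffer);. Note that this would also remove IE conditional comments, which may or may not count as breaking something.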
Problem 2: JavaScript comments that might appear after a line of code, not
on their own line - e.g.:
var problemVar; // this variable is commented
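For Problem 2, one possible approach (a sketch only) is to let the regex match quoted strings before it looks for comments, so that a // inside a string such as a URL is left alone. The name strip_js_comments is hypothetical, and the sketch assumes it is given only the JavaScript itself; it does not understand regex literals or template literals, so a // inside one of those would still be cut.

```php
<?php
// Hypothetical helper (name is illustrative): drop // and /* */ comments from
// a chunk of JavaScript, leaving single- and double-quoted strings untouched.
function strip_js_comments(string $js): string
{
    // Group 1: a quoted string, including backslash escapes (kept).
    // Then a /* block */ comment, then a // line comment (both dropped).
    $pattern = '~("(?:\\\\.|[^"\\\\])*"|\'(?:\\\\.|[^\'\\\\])*\')|/\*.*?\*/|//[^\r\n]*~s';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            if (isset($m[1]) && $m[1] !== '') {
                return $m[1]; // string literal: keep it exactly as written
            }
            // A block comment becomes one space so neighbouring tokens do not
            // fuse together; a line comment is removed (its newline stays).
            return strncmp($m[0], '/*', 2) === 0 ? ' ' : '';
        },
        $js
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $js;
}
```

With the example line above, strip_js_comments would leave "var problemVar; " (the trailing space is left for the whitespace step to deal with).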
Problem 3: Whitespace that appears inside {pre} tags needs to remain
untouched (I don't know if there are any others like that).
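Problem 3 could perhaps be handled by splitting the page on {pre} blocks and collapsing whitespace only in the pieces between them. This is a minimal sketch under that assumption; collapse_whitespace is an invented name, and it only shrinks runs of whitespace to a single space rather than removing every removable character. {textarea} and {script} contents most likely want the same protection, which this sketch does not give them.

```php
<?php
// Hypothetical helper (name is illustrative): collapse runs of whitespace to a
// single space everywhere except inside <pre>...</pre> blocks.
function collapse_whitespace(string $html): string
{
    // The capture group makes preg_split return the <pre> blocks as well,
    // so the result alternates: outside text, <pre> block, outside text, ...
    $parts = preg_split(
        '~(<pre\b[^>]*>.*?</pre\s*>)~is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return $html; // PCRE error: leave the page alone
    }

    foreach ($parts as $i => $part) {
        // Even indexes are the text between <pre> blocks; odd indexes are the
        // captured <pre> blocks themselves and are left untouched.
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace('~\s+~', ' ', $part) ?? $part;
        }
    }

    return implode('', $parts);
}
```

Used as $buffer = collapse_whitespace($buffer); after the comments have been stripped.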
Those are the problems that I can think of; there might be more - notice the
brief says "without breaking anything"! :)
Nice regex problems if anyone fancies them anyway :)
1) Complete and fully-functional working program(s) in executable form as well
as complete source code of all work done.
2) Complete ownership and distribution copyrights to all work purchased.