Build a basic CSS Rules Validator that validates the content of a CSS file based on the following rules:
SELECTOR1 {
PROPERTY1: VALUE;
PROPERTY2: VALUE;
}
SELECTOR2 {
PROPERTY1: VALUE;
PROPERTY2: VALUE;
}
Example of valid CSS rules:
body {
background-color: #FF0000;
}
.foo {
color: Black;
border-style: solid;
}
Example of invalid CSS rules:
body {{
background-color: #FF0000
}
.foo {
color: Black
border style: solid;
}
P.S.: Make it extensible enough so that the next exercise can enhance the rules validator further.
I think this can be done with a single regular expression.
Yet I'm not so sure, and anyway it will be painful. Maybe a combination...
Anyway, in your valid CSS code example, the 1st line has no semicolon. Is that valid per your specs? I don't think it's valid CSS. And what about newlines: do they have to be respected, or can everything be on a single line?
Last edited by rolf (January 10 2012)
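For what it's worth, the single-regular-expression idea floated above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the exercise statement: property names are taken to be word characters and hyphens, values may not contain ":", ";", "{" or "}" (so something like a URL value would be rejected), and every declaration must end in ";". The names RULE, STYLESHEET and is_valid_css are made up for this sketch.

```python
import re

# One rule set: selector, "{", zero or more "property: value;" pairs, "}".
# Assumption: values never contain ':', ';', '{' or '}'.
RULE = (
    r"\s*[^\s{};][^{};]*\{"                      # selector with at least one non-space char
    r"(?:\s*[\w-]+\s*:\s*[^\s:;{}][^:;{}]*;)*"   # declarations, each ending in ";"
    r"\s*\}"                                     # closing brace
)
# A stylesheet is any number of rule sets followed by optional whitespace.
STYLESHEET = re.compile(r"(?:" + RULE + r")*\s*\Z")

def is_valid_css(text):
    """Return True if text is a sequence of well-formed rule sets."""
    return STYLESHEET.match(text) is not None

VALID = "body {\nbackground-color: #FF0000;\n}\n.foo {\ncolor: Black;\nborder-style: solid;\n}\n"
INVALID = "body {{\nbackground-color: #FF0000\n}\n.foo {\ncolor: Black\nborder style: solid;\n}\n"
```

The obvious drawback is that a single match only answers yes or no; it cannot say which line is wrong, and it gets harder to extend with each new rule.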
You're right, I fixed the semicolon.
I don't think there's a problem with newlines, as everything is delimited accordingly. So no, you don't have to respect newlines; they were just for presentation purposes.
Edit: Thinking about it further, newlines can be helpful when debugging, since the validator can then report the number of the invalid line. Otherwise it'll always be "Line 1 invalid".
Last edited by xterm (January 10 2012)
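To illustrate the point about line numbers: if a validator knows the character offset at which it gave up, the line can be recovered by counting the newlines before that offset. A minimal sketch, where line_of and the sample string are invented for the example:

```python
def line_of(text, offset):
    """Return the 1-based line number of the character at `offset`."""
    # Every newline before the offset pushes the position one line down.
    return text.count("\n", 0, offset) + 1

css = "body {\n  color: Black\n  border style: solid;\n}\n"
bad = css.index("border style")  # offset of the offending declaration
```

With everything on a single line there are no newlines to count, so every error would be reported on line 1.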
You need to keep in mind that this is a starter exercise that will grow; that's why I mentioned that the implementation should be flexible enough for enhancements.
What about indentation? Is whitespace in front of properties significant?
No, same as new lines.
Whitespace is not an issue.
Well, here's my solution to the problem (signed up to post it :) ):
It's in Python (currently using the 2.x syntax, but the conversion to 3.x is simple: use print() instead of print, and write "except ParseException as e" instead of "except ParseException, e").
class ParseException(Exception):
    """
    Convenient way of indicating errors to be reported to the user
    """
    def __init__(self, reason, line):
        self.reason = reason
        self.line = line

    def __str__(self):
        return "Parse error on line %d: %s" % (self.line, self.reason)

def parse_string(text):
    # TODO do this without loading the entire file into memory
    """
    State machine that splits up the incoming text into a logical form
    that can be more easily processed later.

    Returns a dict which maps selectors to their bodies. Selectors
    can be any string; whitespace (including newlines) before or after
    elements (selector, property, value) is ignored. The body of a
    selector consists of a list of dicts which describe each property
    affecting this selector.

    Some metadata is included, such as which line things occur on, for
    better error reporting later.

    May error out if the file is malformed at a very basic level, such
    as misplaced or mismatched braces. The selector and body syntax
    itself is not checked; this is delegated to users of this function.
    """
    # possible states:
    # selector: we start out here and enter the body on an opening
    #           brace; we come back to it after a closing brace
    #
    # key: we enter here after an opening brace in a selector. we exit
    #      upon reading a ":" character
    #
    # value: we enter from key after a ":" character. we exit
    #        either at "}" or ";". I'm assuming it's legal for the
    #        last (or single) statement in a css body not to be
    #        terminated by a semi-colon (not semi-column)
    state = "selector"
    result = {}
    buf = ""
    cur_line = 1
    cur_selector = None
    cur_key = None
    for char in text:
        if char == "{":
            if state != "selector":
                raise ParseException("illegal '{' inside a selector body", cur_line)
            cur_selector = buf.strip()
            buf = ""
            if cur_selector not in result:
                result[cur_selector] = []
            state = "key"
        elif char == "\n":
            # All end of lines show up as "\n" even on windows as long
            # as the file is opened in text mode
            cur_line += 1
            buf += char
        elif char == "}":
            if state == "selector":
                raise ParseException("illegal '}' inside a selector", cur_line)
            if state == "key" and buf.strip() != "":
                raise ParseException("illegal '}' inside a property", cur_line)
            if buf.strip() != "":
                # there's a dangling key:value that hasn't been
                # inserted yet
                # TODO line reporting here isn't very accurate
                # (multiline property definition?), consider a better
                # strategy maybe?
                # the value is stripped here too, to match the ";" branch
                result[cur_selector].append({'property': cur_key, 'value': buf.strip(), 'line': cur_line})
            buf = ""
            state = "selector"
        elif char == ";":
            if state != "value" or buf.strip() == "":
                raise ParseException("Illegal ';'", cur_line)
            result[cur_selector].append({'property': cur_key, 'value': buf.strip(), 'line': cur_line})
            buf = ""
            state = "key"
        elif char == ":":
            if state != "key" or buf.strip() == "":
                raise ParseException("Illegal ':'", cur_line)
            state = "value"
            cur_key = buf.strip()
            buf = ""
        else:
            buf += char
    return result

if __name__ == "__main__":
    import sys, pprint
    pp = pprint.PrettyPrinter(indent=4)
    if len(sys.argv) < 2:
        print "Usage: %s <file to parse> [<other file> [...]]" % sys.argv[0]
    for fname in sys.argv[1:]:
        print "Parsing %s" % fname
        try:
            pp.pprint(parse_string(open(fname).read()))
        except ParseException, e:
            print "Invalid!"
            print e
        except IOError, e:
            print "Cannot read file!"
            print e
Here is the result when run on all three examples:
$ python cssvalidator.py test_file.css test_file2.css test_file3.css
Parsing test_file.css
{ 'SELECTOR1': [ { 'line': 2, 'property': 'PROPERTY1', 'value': 'VALUE'},
{ 'line': 3, 'property': 'PROPERTY2', 'value': 'VALUE'}],
'SELECTOR2': [ { 'line': 7, 'property': 'PROPERTY1', 'value': 'VALUE'},
{ 'line': 8, 'property': 'PROPERTY2', 'value': 'VALUE'}]}
Parsing test_file2.css
{ '.foo': [ { 'line': 6, 'property': 'color', 'value': 'Black'},
{ 'line': 7, 'property': 'border-style', 'value': 'solid'}],
'body': [{ 'line': 2,
'property': 'background-color',
'value': '#FF0000'}]}
Parsing test_file3.css
Invalid!
Parse error on line 1: illegal '{' inside a selector body
This parser essentially massages the input into a form which is easy to validate later in other ways (specific rules about valid selector/property/value specifications, for example). It only checks for very basic validity itself.
I ran it on a pretty large CSS file from a project I'm currently working on and it worked just fine. If anyone has any examples which break my code (either by getting it to error out on valid input or by not doing so on invalid input), let me know :)
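As a rough idea of what that later validation stage could look like, here is a sketch that walks a dict shaped like the parser output shown above and applies a list of rule functions to each declaration. None of this is part of the posted solution: the rule functions, the RULES list, validate and the PROPERTY_NAME pattern are all assumptions made up for illustration.

```python
import re

# Assumed rule: a property name is letters, digits and hyphens only.
PROPERTY_NAME = re.compile(r"^-?[A-Za-z][A-Za-z0-9-]*$")

def check_property_name(selector, decl):
    """Return an error message for a malformed property name, else None."""
    if not PROPERTY_NAME.match(decl["property"]):
        return "invalid property name %r" % decl["property"]
    return None

def check_value_not_empty(selector, decl):
    """Return an error message for an empty value, else None."""
    if decl["value"].strip() == "":
        return "empty value for %r" % decl["property"]
    return None

# New rules are added by appending another function to this list.
RULES = [check_property_name, check_value_not_empty]

def validate(parsed, rules=RULES):
    """Apply every rule to every declaration; return sorted (line, selector, message) tuples."""
    errors = []
    for selector, decls in parsed.items():
        for decl in decls:
            for rule in rules:
                message = rule(selector, decl)
                if message is not None:
                    errors.append((decl["line"], selector, message))
    return sorted(errors)

# Hand-written sample in the same shape as the parser output above.
sample = {
    ".foo": [
        {"line": 6, "property": "color", "value": "Black"},
        {"line": 7, "property": "border style", "value": "solid"},
    ]
}
```

Keeping each rule as a small function that sees one declaration at a time is one way to leave room for the follow-up exercises.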
Raja,
Thank you very much for posting this, you actually made my day saying that you signed up to post it. I do expect however that you stick around since we do this quite a lot.
Just make sure you introduce yourself in the members introduction topic!
Your solution is interesting, your documentation is outright awesome and your TODOs are right on the spot!
Well, sorry if this ends up ruining your day, but apparently I was already registered though under a different name way back in 2009. This is me: http://www.lebgeeks.com/forums/viewtopic.php?id=5245
I guess the forum software didn't catch it as I'm using a different email now. Think I should re-post in the introductions?
hey there, nice solution!
TODO do this without loading the entire file into memory
If I understand correctly, you just have to replace
pp.pprint(parse_string(open(fname).read()))
with
pp.pprint(parse_string(open(fname)))
Since a file is its own iterator, you don't need to call the read() function.
Not necessarily. I added that TODO as a quick afterthought, but I think if you iterate over a file you get lines, not characters, so it might be a bit more involved than that. I'll have to check and get back to you.
Edit: Indeed, I just tried it and iterating over a file directly will yield lines and not single characters.
The following code could be used though to wrap it:
def file_wrapper(fname):
    fh = open(fname)
    while True:
        char = fh.read(1)
        if char == '':
            break
        yield char
And then you get:
pp.pprint(parse_string(file_wrapper(fname)))
That would work, but it only matters for really large files where loading them into memory would be a problem, and since we're talking about CSS files here, that's quite unlikely (I doubt there are CSS files that are >100MB in size; that would be stupid).
Last edited by raja (January 13 2012)
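A possible variation on the wrapper above, in case the one-character reads ever become a bottleneck: read the file in fixed-size chunks and yield the characters of each chunk. This is only a sketch; iter_chars and chunk_size are names invented here, and the function takes an already opened text-mode file object rather than a file name.

```python
import io

def iter_chars(fh, chunk_size=4096):
    """Yield one character at a time from an open text-mode file object."""
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            # read() returns an empty string at end of file
            break
        for char in chunk:
            yield char

# An in-memory stand-in for a real file, read four characters at a time.
sample = io.StringIO(u"a {\n  b: c;\n}\n")
chars = list(iter_chars(sample, chunk_size=4))
```

Memory use stays bounded by chunk_size, and the parser still sees the input one character at a time.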