I want to write a lexical analyzer for Python from scratch, but I don't know where or how to begin. To start, I'll assume the Python program is passed to the analyzer as a set of strings. The analyzer should figure out where each new line begins and which leading whitespace it needs to look at. How do I detect new lines in Python source code? I've read Python's lexical specification, and I understand that indentation can be resolved with a stack-based approach, but I can't figure out how to go about it. Is it just a regular-expression check for '\n', or is there a more algorithmic way to determine this?
I purposely don't want to use tools like lex, yacc, or flex.
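
For context, here is a minimal sketch of the stack-based idea from the lexical specification, written by hand without any generator tools. The function names (`split_physical_lines`, `indent_width`, `layout_tokens`) and the token tuples are made up for illustration. It assumes the whole program is available as one string, treats `\n`, `\r\n`, and a lone `\r` as line terminators, and expands tabs to the next multiple of eight columns. It deliberately ignores line joining (backslashes, open brackets) and multi-line strings, so it is only a starting point, not a complete lexer.

```python
import re

# Line terminators: \r\n must be tried before a lone \r.
_LINE_END = re.compile(r"\r\n|\r|\n")


def split_physical_lines(source):
    """Split source text into physical lines, without their terminators."""
    lines = _LINE_END.split(source)
    # A trailing terminator leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def indent_width(line):
    """Return the column of the first non-whitespace character."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # A tab advances to the next multiple of eight columns.
            col = (col // 8 + 1) * 8
        else:
            break
    return col


def layout_tokens(source):
    """Emit INDENT/DEDENT/LINE/NEWLINE tuples using an indentation stack."""
    stack = [0]   # indentation widths of the currently open blocks
    tokens = []
    last_lineno = 0
    for lineno, line in enumerate(split_physical_lines(source), start=1):
        last_lineno = lineno
        stripped = line.strip()
        # Blank and comment-only lines do not affect indentation.
        if not stripped or stripped.startswith("#"):
            continue
        width = indent_width(line)
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", lineno))
        else:
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", lineno))
            if width != stack[-1]:
                raise IndentationError(f"line {lineno}: inconsistent dedent")
        tokens.append(("LINE", lineno, stripped))
        tokens.append(("NEWLINE", lineno))
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", last_lineno))
    return tokens
```

Calling `layout_tokens("if x:\n    y = 1\nz = 2\n")` yields an `INDENT` before the second line and a `DEDENT` before the third. A real lexer would then split each `LINE` payload into identifiers, operators, and literals instead of passing it through whole.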