Note: Argh! Why the editor doesn’t just auto convert the arrow symbols into html entities I will never know. For now, imagine that the results in the table match the description in the paragraphs.
I despise Lua patterns. Then again, I sort of dislike Regular Expressions too, but it feels like Regular Expressions are a lot more powerful, and I suppose that is because they are. Lua doesn’t implement Regular Expressions because, as I understand it, the code for a full implementation would likely be larger than all of Lua’s standard libraries put together. Sure, there are libraries one can get, but I would rather not force myself into that position. It’s bad enough that I’m implementing Amateura to run inside another interpreted language; I would rather not have a dependency on a library as well.
Moving on now…
I’ve made some progress with the Lexer method version 2. This time around I’m straying away from iterating through the inputted source string… I’m hoping this will speed up things dramatically as well as make things easier for me later. I’ve run into a snag though, specifically dealing with Lua Patterns… Perhaps I just don’t understand Patterns. I barely grasp the advanced regular expressions out there (some are quite impressive!) but finding a pattern that turns any length of numbers into a single value while ignoring numbers within special symbols is a nightmare.
| [^<>][" .. numbers .. "]+[^<>] |
Now I know I can use the “built-in” classes for numbers, however there are reasons why I’m using my own strings. The above pattern, at any rate, is supposed to go through the string and find any numbers that aren’t beside a < or >. There are a few issues with this pattern. The ^ character at the start of a character set inverts it, so the set matches any character except the ones listed. That includes letters, which means not only “1337” will be found but also “142G”. Furthermore, if I replace the “numbers” string with the “alphas” string (just a string composed of a-z and A-Z) then it will, for the most part, work (except for it finding LOL3 instead of just LOL).
That isn’t the issue that is giving me the most frustration though. If I put alphas instead of numbers in the pattern, it mostly works fine (besides catching the numbers beside it), however using numbers in the pattern produces strange results due to the fact that my symbols are numbers. Given the string if LOL == 5557, replacing spaces and the if keyword produces the following result: <7><3>LOL<3>==<3>5557. Only certain lex symbols will have the delimiters added (the arrows), but they will be removed before validation occurs. The only reason they are there is to distinguish a previously placed symbol from a number that was in the source string. Here are the results when either or both patterns (alphas/numbers) are run against the string.
| Pattern | Result |
| --- | --- |
| Numbers | LOL==1 |
| Letters | 2==1337 |
| Both | 2==1 |
The input string for these results was if LOL == 1337. 1 is the symbol for a group of numbers and 2 is the symbol for a group of letters. Judging by the results above you might conclude that it is working; however, that is not the case. With the input string if L == 8, neither pattern works: the if and the spaces are converted correctly, but the L and the 8 remain unchanged.
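One possible reading of both symptoms, sketched here rather than confirmed against the real lexer: each `[^<>]` in the pattern has to consume an actual character. That means a match needs at least three characters (one on each side plus at least one in the middle), and the two outer characters can be anything that isn't an arrow. The sketch assumes `numbers` is simply the ten digits, since the post never shows it. The `tag_numbers` helper and its `<1>` output are hypothetical; it is just one way to skip digits that are already wrapped in arrows by passing a function to `string.gsub`.

```lua
-- Assumption: the real `numbers` string is not shown in the post.
local numbers = "0123456789"
local pattern = "[^<>][" .. numbers .. "]+[^<>]"

-- The outer [^<>] sets swallow the first and last character of the match.
print(string.match("<3>1337", pattern))  --> 1337
print(string.match("<3>142G", pattern))  --> 142G
-- Fewer than three usable characters: nothing can match.
print(string.match("<3>8", pattern))     --> nil

-- Hypothetical alternative: optionally capture the arrows around a digit run
-- and leave the text alone when both arrows are present.
local function tag_numbers(s)
  local digit_run = "(<?)([" .. numbers .. "]+)(>?)"
  return (s:gsub(digit_run, function(open, digits, close)
    if open == "<" and close == ">" then
      return nil  -- nil tells gsub to keep the original match (e.g. <3>)
    end
    return open .. "<1>" .. close  -- "<1>" as the number symbol is assumed
  end))
end

print(tag_numbers("<7><3>L<3>==<3>8"))   --> <7><3>L<3>==<3><1>
```

This would still misfire on source text that genuinely contains something like `<5>`, so it is a stopgap at best while the arrows are in use.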
Tonight (or rather, this morning; it is 12:18 am at the moment) I will be modifying the code slightly so I can do away with inserting separators around certain symbols. Instead, I’m going to try to build a lexed string rather than running replacements on the given string. I may use a table (which could take up a large amount of memory, depending on how long the given line of source code is) or I’ll simply keep appending the found symbols to a string. My only concern is making sure my patterns find and add symbols in order. Perhaps I should do an in-depth study of Lua Patterns as well.
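For what it’s worth, here is a minimal sketch of how the "build it in order" idea might look, under several assumptions. The `rules` list and the `lex` function are hypothetical names. The symbol numbers are taken from the examples above (1 for digits, 2 for letters, 3 for a space, 7 for `if`). Anything no rule recognises, such as `==`, is simply passed through, and the built-in `%d`, `%a` and `%s` classes stand in for the custom strings. The order concern is handled by anchoring every pattern with `^` and calling `string.find` from the current position, so only a match that starts exactly there is accepted.

```lua
-- Hypothetical rule list: { anchored pattern, symbol to emit }.
-- Earlier rules win, so the keyword is tried before generic letters.
local rules = {
  { "^if%f[%A]", "7" },  -- `if`, but only when no letter follows it
  { "^%s+",      "3" },  -- run of whitespace
  { "^%d+",      "1" },  -- run of digits
  { "^%a+",      "2" },  -- run of letters
}

local function lex(source)
  local out, pos = {}, 1
  while pos <= #source do
    local stop
    for _, rule in ipairs(rules) do
      -- With `^`, string.find only matches starting at `pos`.
      local _, e = string.find(source, rule[1], pos)
      if e then
        out[#out + 1] = rule[2]
        stop = e
        break
      end
    end
    if not stop then
      -- Assumption: unrecognised characters are copied through unchanged.
      out[#out + 1] = string.sub(source, pos, pos)
      stop = pos
    end
    pos = stop + 1
  end
  return out
end

print(table.concat(lex("if L == 8"), " "))  --> 7 3 2 3 = = 3 1
```

Because every symbol is its own table entry, no arrows are needed to tell a symbol from a digit in the source, and a single `table.concat` at the end avoids building lots of intermediate strings.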