TODOs 01/04/2010 10pm
- Added back boundary together with ^|\W\|\b + phrases + $|\W
- Removed word boundary from the regex, not sure of the impact
- Is there a need to escape parentheses, full stops etc.?
- Testing - long texts with tags, scripts etc., sizeable sample of terms
- Tagged item requirement in sec 2.20, use :term: in title and use the same in content, don't use tag? May need a list of valid tags to match
- Exceptions (while parsing html, while replacing text (regex), error retrieving cache, invalid channel)
- Product channel required instead of inferred (can be changed).
- Case sensitivity - will cause issues with look-behind and the or condition
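
Rough sketch for the boundary and escaping questions above, using only java.util.regex. The class and method names (TermPatternSketch, forPhrase, contains) are made up for illustration and are not the actual code. Assumptions: Pattern.quote is enough to make parentheses and full stops literal, and the "^|\W ... $|\W" idea can be written as negative lookarounds ((?<!\w) and (?!\w)) so the boundary character is not consumed. A plain \b would not match after a phrase ending in ")" when a space follows, which may be why it caused trouble.

```java
import java.util.regex.Pattern;

class TermPatternSketch {

    // Builds a pattern that matches the phrase only when it is not glued to
    // a word character on either side (start/end of input also counts).
    static Pattern forPhrase(String phrase) {
        // Pattern.quote wraps the phrase in \Q...\E, so "(", ")" and "." are literal.
        String literal = Pattern.quote(phrase);
        // Assumption: lookarounds stand in for the "^|\W" and "$|\W" parts.
        String regex = "(?<!\\w)" + literal + "(?!\\w)";
        // Case handling is a flag here, so the phrase itself is not rewritten.
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    static boolean contains(String text, String phrase) {
        return forPhrase(phrase).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(contains("Pay by A.P.R. (fixed) today", "A.P.R. (fixed)")); // true
        System.out.println(contains("the APR rate", "apr"));                          // true
        System.out.println(contains("APRIL showers", "apr"));                         // false
    }
}
```

Because the lookarounds are zero-width, two terms separated by a single space can both match, which would not be the case if the pattern consumed the \W on each side.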
TODOs 01/04/2010
- Refactor into smaller classes and methods
- Loading of cache needs revisiting
- If time permits look into OSCache for loading and refresh
- Appending tags, attributes
- Calling replace function
- Using Jericho parser
Load
- Servlet
- Own class, singleton
- Caching
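
One possible shape for the "own class, singleton" option, as a sketch only. TermCache, reload and getTerms are hypothetical names; how the terms are actually fetched (servlet init, OSCache, something else) is left out on purpose. The assumption is that readers should always see a complete snapshot, so reload swaps in a new unmodifiable map instead of editing the old one.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the term -> definition map.
final class TermCache {

    // Eagerly created single instance; the JVM guarantees safe publication.
    private static final TermCache INSTANCE = new TermCache();

    // volatile so a reload on one thread is visible to readers on others.
    private volatile Map<String, String> terms = Collections.emptyMap();

    private TermCache() {
        // no outside instantiation
    }

    static TermCache getInstance() {
        return INSTANCE;
    }

    // Replaces the whole snapshot with a defensive, read-only copy.
    void reload(Map<String, String> fresh) {
        terms = Collections.unmodifiableMap(new LinkedHashMap<>(fresh));
    }

    Map<String, String> getTerms() {
        return terms;
    }

    public static void main(String[] args) {
        TermCache cache = TermCache.getInstance();
        cache.reload(Collections.singletonMap("APR", "Annual percentage rate"));
        System.out.println(cache.getTerms());
    }
}
```

A servlet could call reload from its init method and again on a refresh request; that wiring is not shown here.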
Regular expression
- looping through HashMap for terms
- Regex
- Cleanup definition (strip html tags)? Since we have the parser already
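
Sketch of the "loop through the map, one regex per term" approach, to make the trade-offs visible. Everything named here (TermReplacerSketch, annotate, stripTags) is hypothetical, and the abbr/title output markup is an assumption, not the required format. stripTags is a crude regex stand-in; the parser already in use would be the better tool for cleaning definitions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TermReplacerSketch {

    // Crude stand-in for parser-based cleanup: drops anything that looks like a tag.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "").trim();
    }

    // Wraps each occurrence of each term in an abbr element carrying its definition.
    static String annotate(String text, Map<String, String> terms) {
        String result = text;
        for (Map.Entry<String, String> entry : terms.entrySet()) {
            Pattern pattern = Pattern.compile(
                    "(?<!\\w)" + Pattern.quote(entry.getKey()) + "(?!\\w)",
                    Pattern.CASE_INSENSITIVE);
            // Definition goes into an attribute, so tags are stripped and quotes escaped.
            String title = stripTags(entry.getValue()).replace("\"", "&quot;");
            String open = "<abbr title=\"" + title + "\">";
            // $0 re-inserts the matched text so the original casing is kept;
            // quoteReplacement protects any "$" or "\" in the definition.
            String replacement = Matcher.quoteReplacement(open) + "$0"
                    + Matcher.quoteReplacement("</abbr>");
            result = pattern.matcher(result).replaceAll(replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("APR", "<b>Annual</b> percentage rate");
        System.out.println(annotate("The apr is low.", terms));
    }
}
```

Known weakness of this loop: a later term can match inside markup or a title attribute inserted for an earlier term. Replacing only within text segments handed over by the parser would avoid that, at the cost of more plumbing.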