Automatic quotation mark curling


Thangalin <thangalin@...>
 

Hi all,

I've been developing a Java-based desktop text editor (KeenWrite) to help write a novel. One of the editor's features is automatic curling of straight quotes. The code correctly curls the quotes in most of the test cases in the following file:

https://github.com/DaveJarvis/keenquotes/blob/main/src/test/resources/com/whitemagicsoftware/keenquotes/smartypants.txt

The algorithm is quick, small (64 KB), single-pass (mostly), and capable of handling XHTML documents. Other solutions, such as Stanford's NLP and John Gruber's PHP-based "SmartyPants" algorithm, have many drawbacks (including size, speed, missing API hooks, and incorrect results). My implementation can't handle some edge cases because the algorithm lacks grammatical context. I'd like to replace KeenQuotes with a more robust solution.
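
To illustrate the kind of edge case that needs grammatical context, here is a minimal sketch of a context-free, single-pass curler. It is not the KeenQuotes code: the class name NaiveCurler and its one rule (a quote opens only after whitespace or an opening bracket) are assumptions made up for this example.

```java
/**
 * Hypothetical illustration only; this is NOT how KeenQuotes works.
 * Curls straight quotes by looking at the preceding character alone.
 */
public final class NaiveCurler {
  private NaiveCurler() {}

  public static String curl(final String text) {
    final StringBuilder out = new StringBuilder(text.length());

    // Treat the start of the text as if it followed whitespace.
    char prev = ' ';

    for (int i = 0; i < text.length(); i++) {
      final char c = text.charAt(i);

      // Assumed rule: a quote opens after whitespace or an opening bracket.
      final boolean opening =
        Character.isWhitespace(prev) || prev == '(' || prev == '[';

      if (c == '"') {
        // Left or right double quotation mark.
        out.append(opening ? '\u201C' : '\u201D');
      } else if (c == '\'') {
        // Left or right single quotation mark (the right one doubles as
        // the apostrophe).
        out.append(opening ? '\u2018' : '\u2019');
      } else {
        out.append(c);
      }

      prev = c;
    }

    return out.toString();
  }
}
```

With this rule, "don't" comes out with a proper apostrophe, but a leading elision such as 'Twas is curled as an opening single quote rather than an apostrophe. Telling the two apart requires knowing that the word is a contraction, which a character-level rule cannot see.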

Can GATE be used to curl straight quotes? That is, could it pass all the tests listed in the "smartypants.txt" file, including the ambiguous cases?

Thank you.