Non-NaNoWriMo Open Source Thoughts

I should really be focusing on my NaNoWriMo project, but... I had some thoughts related to open source grammar checking.

I know there's LanguageTool, but I'm hoping to create something lightweight and native to Linux. The catch is that what I'm thinking of is a not-quite-a-grammar checker.

The Not-Quite-a-Grammar Checker Proposal

What I'm thinking of is using a transition matrix holding the probabilities of what kind of word should come next, given the kind of word that came before. Each row would look something like:

[NOUN SINGULAR, NOUN PLURAL, VERB SINGULAR, VERB PLURAL, ADVERB, ADJECTIVE, SENTENCE END, COMMA, etc.]

If the next word or punctuation mark doesn't match what the matrix considers likely, the algorithm could highlight the word and say “This looks wrong. Should it be a VERB/NOUN?”
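A minimal sketch of that check in Python, assuming the words have already been tagged. The table, its made-up probabilities, the 0.05 threshold, and the name flag_unlikely are all hypothetical placeholders for illustration, not a finished design:

```python
# Hypothetical toy table: P(next tag | previous tag).
# The probabilities are invented for illustration, not learned from a corpus.
TRANSITIONS = {
    "START": {"NOUN_SINGULAR": 0.5, "NOUN_PLURAL": 0.3, "ADJECTIVE": 0.2},
    "ADJECTIVE": {"NOUN_SINGULAR": 0.6, "NOUN_PLURAL": 0.4},
    "NOUN_SINGULAR": {"VERB_SINGULAR": 0.7, "COMMA": 0.2, "SENTENCE_END": 0.1},
    "NOUN_PLURAL": {"VERB_PLURAL": 0.7, "COMMA": 0.2, "SENTENCE_END": 0.1},
    "VERB_SINGULAR": {"ADVERB": 0.4, "SENTENCE_END": 0.6},
    "VERB_PLURAL": {"ADVERB": 0.4, "SENTENCE_END": 0.6},
    "ADVERB": {"SENTENCE_END": 1.0},
}


def flag_unlikely(tagged_words, transitions, threshold=0.05):
    """Return (word, tag, suggestions) for each word whose tag is
    improbable after the previous tag."""
    flagged = []
    previous = "START"
    for word, tag in tagged_words:
        row = transitions.get(previous, {})
        # Only judge the word if we know something about the previous tag.
        if row and row.get(tag, 0.0) < threshold:
            # Suggest the two most likely tags for this position.
            suggestions = sorted(row, key=row.get, reverse=True)[:2]
            flagged.append((word, tag, suggestions))
        previous = tag
    return flagged


sentence = [
    ("dogs", "NOUN_PLURAL"),
    ("barks", "VERB_SINGULAR"),
    (".", "SENTENCE_END"),
]
for word, tag, suggestions in flag_unlikely(sentence, TRANSITIONS):
    print(f"'{word}' looks wrong, should it be a {'/'.join(suggestions)}?")
```

With this toy table, "barks" gets flagged because a singular verb never follows a plural noun in it. A real table would need smoothing so rare but valid sequences aren't flagged constantly.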

I'm thinking of using the Brown Corpus for tagging. For training, I'm planning on stealing public domain works to build the transition matrix.
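Building the matrix could come down to counting which tag follows which and normalizing each row. Here's a rough sketch in Python, assuming the training text has already been turned into (word, tag) pairs (for example by a tagger trained on the Brown Corpus). The function name, the tiny tag set, and the three training sentences are made up for illustration:

```python
from collections import Counter, defaultdict


def build_transition_matrix(tagged_sentences):
    """Count tag-to-tag transitions and turn each row into probabilities."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        previous = "START"
        for _word, tag in sentence:
            counts[previous][tag] += 1
            previous = tag
        # Record that the sentence ended after the last tag.
        counts[previous]["SENTENCE_END"] += 1

    matrix = {}
    for previous, row in counts.items():
        total = sum(row.values())
        matrix[previous] = {tag: n / total for tag, n in row.items()}
    return matrix


# Hypothetical hand-tagged training data; a real run would use far more text.
training = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB"), ("loudly", "ADV")],
]
matrix = build_transition_matrix(training)
print(matrix["NOUN"])  # {'VERB': 1.0}
```

Nested dictionaries keep the sketch short. A dense array indexed by tag number would probably be more compact for a real tag set.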

With this, not-quite-a-grammar checking could be run locally, without needing a separate server or adding other runtimes to an open source project. It wouldn't be as good as paid online subscription services, but a not-quite-a-grammar checker combined with spell check and some style checking should come close.

Back to NaNoWriMo

For my NaNoWriMo project, I have an idea of where I'm going. It's a YA dystopian romance. I also came up with some side stories and key points: where the main character learns something about themself, the small victories, and the all-hope-is-lost moment.

Feel free to let me know what you think. I'm @kmwallio@writing.exchange and @kmwallio on Twitter.

Day 4 of #100DaysToOffload