User:OrenBochman/bots

From mediawiki.org

Some bot Ideas

Rule Based Bots

  1. Phonologist. Use TTS code from MBROLA etc. to add IPA, SAMPA and MBROLA phonetic data in registered languages.
    1. IPA to SAMPA etc. conversion.
    2. QA and confidence tests against existing IPA.
    3. Compound-word processing mode.
    4. String matching algorithm to map text n-grams to IPA n-grams (space, phone, phone, phone).
    5. Production rule extraction from the above (as per paper).
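
The IPA to SAMPA conversion step could start as a plain lookup table. The sketch below is only an illustration: the table covers a handful of English symbols, the name `ipa_to_sampa` is hypothetical, and a real bot would need the full mapping for each registered language.

```python
# Partial IPA -> SAMPA table (assumption: English-oriented subset only).
IPA_TO_SAMPA = {
    "ʃ": "S", "ʒ": "Z", "θ": "T", "ð": "D", "ŋ": "N",
    "ə": "@", "ɛ": "E", "ɪ": "I", "ʊ": "U", "æ": "{",
    "ɑ": "A", "ɔ": "O", "ː": ":", "ˈ": '"', "ˌ": "%",
}


def ipa_to_sampa(ipa: str) -> str:
    """Convert an IPA string to SAMPA one character at a time.

    ASCII characters pass through unchanged; an unknown non-ASCII
    symbol raises ValueError so that gaps in the table are noticed.
    """
    out = []
    for ch in ipa:
        if ch in IPA_TO_SAMPA:
            out.append(IPA_TO_SAMPA[ch])
        elif ch.isascii():
            out.append(ch)
        else:
            raise ValueError(f"no SAMPA mapping for {ch!r}")
    return "".join(out)


print(ipa_to_sampa("θɪŋk"))
```

Raising on unmapped symbols, rather than silently dropping them, fits the QA item above: a failed conversion is a signal that the existing IPA needs a closer look.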

Mine Feedback loop

  1. Mine for data in wikis.
  2. Get all he.wiktionary entries and add them to en.wiktionary, plus orthography.
  3. Edit terms and store them there.
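
As a rough sketch of the first mining step, the MediaWiki API's `list=allpages` query can enumerate main-namespace entries on a wiki. The helper name `allpages_url` and the default batch size are assumptions for illustration. The function only builds the request URL; it does not send it.

```python
from typing import Optional
from urllib.parse import urlencode


def allpages_url(lang: str, continue_from: Optional[str] = None, limit: int = 50) -> str:
    """Build a MediaWiki API URL listing main-namespace pages of a Wiktionary.

    `lang` is the wiki's language code (e.g. "he"); `continue_from` is the
    `apcontinue` value returned by the previous batch, if any.
    """
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": 0,   # main namespace, where dictionary entries live
        "aplimit": limit,
        "format": "json",
    }
    if continue_from is not None:
        params["apcontinue"] = continue_from
    return f"https://{lang}.wiktionary.org/w/api.php?" + urlencode(params)


print(allpages_url("he"))
```

A bot would fetch this URL, read the page titles from the JSON response, and repeat with the returned continuation value until the listing is exhausted.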

Template Labeler & Checker

  1. Add an ID or MD5 hash to mark template boundaries.
  2. Detect template mistakes and mark them with categories.
    1. e.g. orphan tags/bad tidy code.
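
One possible way to sketch both ideas together: scan wikitext for `{{` and `}}`, hash each balanced template span with MD5, and report unmatched delimiters as orphans. The function name `label_templates` is hypothetical, and the sketch deliberately ignores triple-brace parameters (`{{{x}}}`), `<nowiki>` sections, and HTML tags, all of which a real checker would have to handle.

```python
import hashlib


def label_templates(wikitext: str):
    """Return (labels, orphans) for a wikitext string.

    labels  : list of (start, end, md5_hex) for each balanced {{...}} span,
              in the order the spans are closed (inner templates first)
    orphans : sorted offsets of unmatched "{{" or "}}" delimiters
    """
    stack, labels, orphans = [], [], []
    i = 0
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            stack.append(i)          # remember where this template opens
            i += 2
        elif pair == "}}":
            if stack:
                start = stack.pop()
                body = wikitext[start:i + 2]
                digest = hashlib.md5(body.encode("utf-8")).hexdigest()
                labels.append((start, i + 2, digest))
            else:
                orphans.append(i)    # closing braces with no opener
            i += 2
        else:
            i += 1
    orphans.extend(stack)            # openers that were never closed
    return labels, sorted(orphans)


labels, orphans = label_templates("{{a|{{b}}}} text }}")
print(labels, orphans)
```

The hash identifies a template by its exact text, so any edit inside the template changes its label; a stable ID would need a different scheme.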