Typology of Text Manipulation (for Center for Literary Computing, WVU):

Alan Sondheim sondheim at panix.com
Mon Jun 14 05:26:17 CEST 2004




Thanks Sandy Baldwin -

We need to implement the following, with a simple interface if possible:

Please note many of the following overlap -

Input: Text of any sort
          keep spacings / eliminate spacings
          keep tabs / eliminate tabs

Input placed within: single file
                     array
                     hash
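
A minimal sketch of the input stage in Python, under stated assumptions: "eliminate spacings" is read here as collapsing runs of spaces to one, "eliminate tabs" as replacing each tab with a space, "array" as a list of lines and "hash" as a dict keyed by line number. The function names are hypothetical; other readings are just as possible.

```python
import re

def prepare(text, keep_spacing=True, keep_tabs=True):
    """Optionally normalise tabs and runs of spaces (assumed readings)."""
    if not keep_tabs:
        text = text.replace("\t", " ")          # each tab becomes one space
    if not keep_spacing:
        text = re.sub(r" {2,}", " ", text)      # collapse runs of spaces
    return text

def as_array(text):
    """Hold the text as a list of lines."""
    return text.split("\n")

def as_hash(text):
    """Hold the text as a dict: line number (from 1) -> line."""
    return {n: line for n, line in enumerate(text.split("\n"), start=1)}
```

Keeping the text as one string corresponds to the "single file" case; the list and dict forms make line-level manipulations easier.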

Output: single file of manipulated text
         with original file intact
        "core dump" of file with protocols / processes attached

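One possible sketch of the output stage in Python: the source file is only read, the manipulated text goes to a new file, and a small record of the steps applied is written alongside it. Reading the "core dump" as a JSON log of process names is an assumption, and `write_output` and its parameters are hypothetical.

```python
import json
from pathlib import Path

def write_output(src, transforms, out_path, log_path):
    """Apply (name, function) pairs in order; never write to src."""
    text = Path(src).read_text(encoding="utf-8")
    steps = []
    for name, func in transforms:
        text = func(text)
        steps.append(name)                      # record which process ran
    Path(out_path).write_text(text, encoding="utf-8")
    record = {"source": str(src), "steps": steps}
    Path(log_path).write_text(json.dumps(record, indent=2), encoding="utf-8")
    return text
```

A call might look like `write_output("in.txt", [("upper", str.upper)], "out.txt", "out.log")`, where the file names are placeholders.
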
Manipulations: substitution (similar to awk)
               word for word
               letter for letter
               line for line
               etc.
               In other words: Y for X
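
The "Y for X" substitutions could be sketched in Python roughly as follows. The function names and the dict-based mapping are assumptions made for illustration, not part of the specification above.

```python
import re

def substitute_words(text, mapping):
    """Replace whole words only: each key X becomes mapping[X]."""
    if not mapping:
        return text
    alternatives = "|".join(re.escape(word) for word in mapping)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)

def substitute_letters(text, mapping):
    """Replace single characters; keys must be one character long."""
    return text.translate(str.maketrans(mapping))

def substitute_lines(text, mapping):
    """Replace any line that exactly equals a key."""
    return "\n".join(mapping.get(line, line) for line in text.split("\n"))
```

For example, `substitute_words("the cat sat", {"cat": "dog"})` gives "the dog sat" and leaves a word such as "category" untouched.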

               reverse line (similar to rev)
               reverse file
               reverse words (words in reverse "reverse in words")
               reversed words ("sdrow ni esrever")
               reversed lines (etc.) (similar to tac)
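
The five reversals above can be sketched in Python as below. The function names are hypothetical, and the word-level versions split on whitespace, so runs of spaces are not preserved.

```python
def reverse_each_line(text):
    """Reverse the characters of every line (similar to rev)."""
    return "\n".join(line[::-1] for line in text.split("\n"))

def reverse_file(text):
    """Reverse the whole text, character by character."""
    return text[::-1]

def reverse_word_order(line):
    """'words in reverse' -> 'reverse in words'."""
    return " ".join(line.split()[::-1])

def reverse_each_word(line):
    """'words in reverse' -> 'sdrow ni esrever'."""
    return " ".join(word[::-1] for word in line.split())

def reverse_line_order(text):
    """Last line first (similar to tac)."""
    return "\n".join(text.split("\n")[::-1])
```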

Eliminations:  first instance of word / letter / line kept < rest eliminated
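
One reading of this elimination, sketched in Python: keep the first occurrence of each unit and drop every later repeat. That reading is an assumption, and `keep_first` is a hypothetical name.

```python
def keep_first(items):
    """Return items in order, dropping anything already seen."""
    seen = set()
    kept = []
    for item in items:
        if item not in seen:
            seen.add(item)
            kept.append(item)
    return kept

# The same function serves words, letters and lines:
words = " ".join(keep_first("a rose is a rose is a rose".split()))
letters = "".join(keep_first("mississippi"))
lines = "\n".join(keep_first("x\ny\nx".split("\n")))
```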

Functions (Doublings etc.): Given X, then f(X)
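
A small Python sketch of "Given X, then f(X)", with doubling as the example function. The helper names are hypothetical, and any function from string to string could stand in for f.

```python
def apply_to_words(text, f):
    """Apply f to every whitespace-separated word."""
    return " ".join(f(word) for word in text.split())

def double_letters(text):
    """'abc' -> 'aabbcc'."""
    return "".join(ch * 2 for ch in text)

def double_words(text):
    """'to be' -> 'to to be be'."""
    return apply_to_words(text, lambda word: word + " " + word)
```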

Fields: Reordering fields (similar to awk)
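
Field reordering might look like this in Python. Note the assumptions: indices count from 0 (awk's $1 is index 0 here), `reorder_fields` is a hypothetical name, and indices beyond the last field are silently skipped.

```python
def reorder_fields(line, order, sep=None):
    """Emit the fields of line in the given order.

    sep=None splits on any whitespace and rejoins with one space.
    """
    fields = line.split(sep)
    out_sep = " " if sep is None else sep
    return out_sep.join(fields[i] for i in order if i < len(fields))
```

For example, `reorder_fields("one two three", [2, 0, 1])` gives "three one two".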

Filtering lines / words / letters: (similar to grep)
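
A grep-like filter at the three levels could be sketched in Python as follows. The function names and the `invert` parameter (loosely modelled on grep -v) are assumptions for illustration.

```python
import re

def filter_lines(text, pattern, invert=False):
    """Keep lines matching the regular expression (or not, if invert)."""
    rx = re.compile(pattern)
    return "\n".join(
        line for line in text.split("\n")
        if bool(rx.search(line)) != invert
    )

def filter_words(text, pattern):
    """Keep only the words matching the regular expression."""
    rx = re.compile(pattern)
    return " ".join(word for word in text.split() if rx.search(word))

def filter_letters(text, keep):
    """Keep only the characters that appear in keep."""
    return "".join(ch for ch in text if ch in keep)
```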

Randomizing: Generating (lines / words / letters) > texts
              Various Grammars
              Various Lists (nouns / verbs / adverbs / etc.)
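
Generation from a grammar and word lists can be sketched in Python as below. The toy grammar, the word lists and the name `generate` are all invented for illustration; a real tool would load its own grammars and lists.

```python
import random

NOUNS = ["text", "file", "line"]
VERBS = ["folds", "splits", "reverses"]
ADVERBS = ["slowly", "twice", "backwards"]

# Each symbol maps to a list of possible productions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "ADV"]],
    "N": [[noun] for noun in NOUNS],
    "V": [[verb] for verb in VERBS],
    "ADV": [[adverb] for adverb in ADVERBS],
}

def generate(symbol, grammar, rng):
    """Expand symbol recursively, choosing productions at random."""
    if symbol not in grammar:
        return [symbol]                 # a terminal word
    words = []
    for part in rng.choice(grammar[symbol]):
        words.extend(generate(part, grammar, rng))
    return words

sentence = " ".join(generate("S", GRAMMAR, random.Random()))
```

Passing a seeded `random.Random(n)` makes a run repeatable, which may matter if a generated text has to be reproduced later.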

Randomizing: Filtering (ability to set parameters - for example:
              randomizing field order
              randomizing elimination order)
              etc.
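
The two parameterised examples above might be sketched in Python like this. The `fraction` parameter, the function names and the use of a caller-supplied random generator are assumptions.

```python
import random

def shuffle_fields(line, rng):
    """Return the whitespace-separated fields in random order."""
    fields = line.split()
    rng.shuffle(fields)
    return " ".join(fields)

def eliminate_randomly(items, fraction, rng):
    """Drop roughly the given fraction of items, chosen at random.

    The surviving items keep their original order.
    """
    n_drop = int(len(items) * fraction)
    drop = set(rng.sample(range(len(items)), n_drop))
    return [item for i, item in enumerate(items) if i not in drop]
```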

Topological: Folding Texts
             Splitting Texts
             Joining Texts (end to end / side by side)
             Internal block removal (see emacs)
             "Crumpling" Texts

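Some of the topological operations above can be sketched in Python. Folding is read here as wrapping at word boundaries to a fixed width, which is only one interpretation; "crumpling" is left out because the list does not define it. All function names are hypothetical.

```python
import textwrap
from itertools import zip_longest

def fold(text, width):
    """Wrap text to the given width at word boundaries."""
    return textwrap.fill(text, width=width)

def split_at(text, n):
    """Split into the first n lines and the remainder."""
    lines = text.split("\n")
    return "\n".join(lines[:n]), "\n".join(lines[n:])

def join_end_to_end(a, b):
    """Place b after a."""
    return a + "\n" + b

def join_side_by_side(a, b, sep="\t"):
    """Pair up lines of a and b; the shorter text is padded with ''."""
    pairs = zip_longest(a.split("\n"), b.split("\n"), fillvalue="")
    return "\n".join(left + sep + right for left, right in pairs)

def remove_block(text, start, end):
    """Remove lines start..end-1 (counting from 0)."""
    lines = text.split("\n")
    return "\n".join(lines[:start] + lines[end:])
```
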
Numeric and code: Translating texts into hex, binary, octal, digital
                    f(X) on hex, binary, octal, digital
                  Translating full text file / inodes etc.
                  Codes - see /usr/games Unix/Linux directory:
                    Morse / figlets
                    Caesar (rot 13 as special example)
                    pig / bcd / ppt / banner
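
A Python sketch of the numeric translations and of the Caesar code. It assumes the text is encoded as UTF-8 bytes before translation; `to_base`, `map_bytes` and `caesar` are hypothetical names, and only unaccented Latin letters are shifted.

```python
def to_base(text, fmt):
    """Render each byte with a format spec: '02x', '08b', '03o' or 'd'."""
    return " ".join(format(byte, fmt) for byte in text.encode("utf-8"))

def map_bytes(text, f):
    """f(X) on the numeric form: apply f to each byte, modulo 256."""
    return bytes(f(byte) % 256 for byte in text.encode("utf-8"))

def caesar(text, shift):
    """Shift A-Z and a-z by shift places; leave everything else alone."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)
```

With a shift of 13 the function is its own inverse, which is what makes rot13 the special case: `caesar(caesar(text, 13), 13)` returns the original text.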

Later: Embedded Babel etc. translation
       Pictographic translation (Dongba, hieroglyphic, etc.)

_



