I’m working on something silly: a project to take words from structured content (Twitter posts) and use freely available data sources to create new types of structured content (poems, images, and potentially music). My motivation is my work, where business types frequently take a statistically invalid sampling of data, enforce arbitrary structure on it, and make decisions based on the ensuing self-fulfilling prophecies. This project, like those “analyses”, has absolutely no value aside from the dubiously artistic, but if I get something working I’ll open source it anyway.

My plan is to write it in Python, with whichever modules are necessary (probably many; I don’t have much time or inclination to roll my own). As the design progresses I’ll add more here, especially if things come out as hilariously as expected. Right now, the high level is thus:

Read a Twitter stream

  1. Follow some folks, a hashtag, popular posts, newest posts, etc.
  2. Strip out common words. There are many lists of the top 100 most common words; one of those will probably do for my purposes.
  3. Store whatever words are left over with relative frequency in some data source or other.
  4. Potentially grab images from the stream for input into later image generation, depending on whether the stream source is something I can trust to only have images I can use.
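
As a rough sketch of steps 2 and 3, assuming the posts have already been fetched as plain strings (the Twitter side is left out entirely): the stop-word set below is a tiny made-up stand-in for a real top-100 list, and `word_frequencies` is just a name I picked for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word set; a real top-100 list would replace it.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "i", "my", "on", "for"}

def word_frequencies(posts, stop_words=STOP_WORDS):
    """Return {word: relative frequency} for the words left after stop-word removal."""
    counts = Counter()
    for post in posts:
        # Lowercase, then keep runs of letters with an optional inner apostrophe
        # (so "don't" stays one word, and "#rain" or "@bob" lose their prefix).
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", post.lower()):
            if word not in stop_words:
                counts[word] += 1
    total = sum(counts.values())
    # Relative frequency is the count divided by the number of remaining words.
    return {word: n / total for word, n in counts.items()} if total else {}

freqs = word_frequencies(["The cat and the hat", "A cat in the rain"])
print(freqs)
```

A plain dict is enough to experiment with; the "data source or other" could later be anything that stores a word and a number.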

Create an image

  1. Take the words and do a CC image search for them.
  2. Use a randomized sampling of the results to create a mosaic of… something.
  3. Potentially use an image from the Twitter stream as the mosaic template, or just as a color source to help find images that blend well.
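
One possible way to do the "color source" part of step 3, sketched without any imaging library: treat every image as a list of RGB tuples, reduce each candidate to its average color, and pick the candidate nearest to the color you want. The function names and the sample tile colors are made up for illustration, and squared RGB distance is only a crude stand-in for how well two images blend.

```python
def average_color(pixels):
    """Mean (r, g, b) of a non-empty sequence of RGB tuples."""
    pixels = list(pixels)
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def closest_tile(target, tile_colors):
    """Return the name of the tile whose average color is nearest to target.

    tile_colors maps a tile name to its average (r, g, b).
    """
    def squared_distance(color):
        return sum((a - b) ** 2 for a, b in zip(color, target))
    return min(tile_colors, key=lambda name: squared_distance(tile_colors[name]))

# Made-up average colors for two search results.
tiles = {"sunset": (200, 80, 40), "ocean": (20, 60, 180)}
print(closest_tile((210, 90, 50), tiles))
```

Repeating that lookup for each cell of the template image would give the mosaic layout.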

Create a poem

  1. Take highly unique words and do a lyric search for them.
  2. Take lines from the resulting song list where the word is used to create a stanza.
  3. Most lyrics are trite, so I expect huge success from this with occasional hilarity.
  4. Repeat for as many words as we care to, perhaps driven by the number of tweets found in a particular update.
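
Step 2 could look something like the sketch below. It assumes the lyric search has already happened and handed back a dict of song title to lyrics text; that shape, the `stanza_for` name, and the placeholder lyrics are all my own inventions, since I haven’t picked a lyrics source yet.

```python
import re

def stanza_for(word, lyrics_by_song, max_lines=4):
    """Collect up to max_lines lyric lines containing word, one line per song."""
    # Whole-word, case-insensitive match, so "rain" does not match "rainbow".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    lines = []
    for lyrics in lyrics_by_song.values():
        for line in lyrics.splitlines():
            line = line.strip()
            if line and pattern.search(line) and line not in lines:
                lines.append(line)
                break  # take at most one line from each song
        if len(lines) == max_lines:
            break
    return lines

# Placeholder lyrics, not taken from any real song.
songs = {
    "Song A": "placeholder line about rain\nsomething unrelated",
    "Song B": "rainbow placeholder\nanother line with Rain in it",
    "Song C": "no match in this one",
}
print("\n".join(stanza_for("rain", songs)))
```

Taking one line per song keeps a stanza from being a straight quote of a single chorus.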

Create a song

  1. Take the words over time and create a set of chord transitions.
  2. Base the transitions on word frequency, mapped to the frequency of “typical” musical transitions.
  3. Note density might be based on the number of tweets per update or the like.
  4. If it ends up being a known group of people, each person might be assigned an instrument.
  5. End result should be sheet music, a MIDI file, or (heavens forbid) an actual WAV of whatever godawful noise this makes.
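
Here is one possible reading of steps 1 and 2, as a sketch only. For each chord I list the candidate next chords from "most typical" to "least typical", and a word’s frequency rank picks from that list, so common words produce common transitions. The table is an illustrative guess in C major, not measured data, and `chord_progression` is a hypothetical name.

```python
# Illustrative only: next chords ordered from "most typical" to "least typical".
# The ordering is an assumption, not something derived from real music data.
NEXT_CHORDS = {
    "C": ["G", "F", "Am", "Dm"],
    "G": ["C", "Am", "F", "Dm"],
    "F": ["C", "G", "Dm", "Am"],
    "Am": ["F", "Dm", "G", "C"],
    "Dm": ["G", "F", "Am", "C"],
}

def chord_progression(words, frequencies, start="C"):
    """Map a sequence of words to chords using each word's frequency rank."""
    # Rank 0 is the most frequent word.
    ranked = sorted(frequencies, key=frequencies.get, reverse=True)
    rank_of = {word: i for i, word in enumerate(ranked)}
    chords = [start]
    for word in words:
        options = NEXT_CHORDS[chords[-1]]
        # Unknown or very rare words fall through to the least typical transition.
        rank = rank_of.get(word, len(options) - 1)
        chords.append(options[min(rank, len(options) - 1)])
    return chords

freqs = {"cat": 0.5, "hat": 0.3, "rain": 0.2}
print(chord_progression(["cat", "rain", "hat"], freqs))
```

Turning the chord list into sheet music or a MIDI file would be a separate step with whatever module ends up doing the job.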

If you want to follow the project and have your tweets be part of the project, follow @FAPTIPSBot! It’ll be a hoot.

