Got CHUNK?: ruby

As a follow-on to Picasync, I've started work on a library to interface with Google Documents, so that Google can be used to manage a website's text and articles in addition to its image galleries.

Google Documents provides users with a familiar, intuitive word processing UI, revision history and drafts, and document folders... at the expense of lots of hpricot cannon-fodder.

The process:

Find the document on Google and :fetch its content
Strip script and form tags, plus all tag attributes (including inline styles)
Convert to textile (c/o James Stewart's html2textile ruby script)
Escape left-over html
Convert back to [spartan? hopefully] html from textile via RedCloth (with some house-keeping regex that really needs expanding upon)
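As a rough illustration of the stripping and escaping steps above, here is a stdlib-only sketch. It is not Gdocsync's code: the library leans on hpricot, whereas this stand-in uses plain regexes, and the module and method names are hypothetical. The textile and RedCloth steps are left out because they depend on third-party scripts.

```ruby
# Hypothetical stand-in for the clean-up steps; names are illustrative only.
module CleanupSketch
  # Drop <script> and <form> elements along with everything inside them.
  def self.strip_script_and_form(html)
    html.gsub(%r{<(script|form)\b[^>]*>.*?</\1\s*>}mi, '')
  end

  # Reduce each opening tag to its bare name, discarding attributes
  # such as style="..." (self-closing tags keep their trailing slash).
  def self.strip_attributes(html)
    html.gsub(/<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(\/?)>/) do
      "<#{$1}#{$2.empty? ? '' : ' /'}>"
    end
  end

  # Escape any angle brackets that survive the textile conversion,
  # so stray tags show up as text instead of markup.
  def self.escape_tags(text)
    text.gsub('<', '&lt;').gsub('>', '&gt;')
  end
end

html = '<div style="color:red"><p class="x">Hi <b>there</b></p>' \
       '<script type="text/javascript">alert(1)</script>' \
       '<form action="/x"><input name="q" /></form></div>'

cleaned = CleanupSketch.strip_attributes(CleanupSketch.strip_script_and_form(html))
puts cleaned
puts CleanupSketch.escape_tags('h1. Title <blink>x</blink>')
```

Regexes like these will trip over attribute values that contain a ">" character, which is one reason a real parser is the safer tool for documents of unknown construction.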

i.e.

doc = Gdocsync::Document.find_by_title("Foo", :fetch).clothe

or, skip RedCloth (but still escape tatterdemalion tags after textile conversion):

doc = Gdocsync::Document.find_by_title("Foo", :fetch).textile

With safe_html, the user gets pretty much what they see on Google (however that document might be constructed), but you also get stuck with inline styles and tag soup -- only form tags and script tags (which Google Docs doesn't appear to allow anyway) are stripped:

doc = Gdocsync::Document.find_by_title("Foo", :fetch).safe_html

You also have recourse to the raw html body with no modifications beyond Google's own processing (the 'raw' document is stored in the object and used as the starting point for the previous methods):

doc = Gdocsync::Document.find_by_title("Foo", :fetch).raw

This can all be easily tied to database tables via a rake task. The following snippet loops through Properties, looks up each one's document on Google and updates the database record. This example matches the document title against the object's title field, which is not very smart -- in the next few days I will be making use of Google document directories and tightening things up.

namespace :google do

  task :docs => :environment do
    require '../gdocsync/lib/gdocsync'

    Property.find(:all).each do |property|
      doc = Gdocsync::Document.find_by_title(property.title, :fetch)
      Property.update(property.id, :description => doc.clothe)
    end

  end

end

$ rake google:docs

Gdocsync git repo (early days).

A nascent Ruby module for interfacing with Picasa and mirroring user albums locally (initially conceived so I could programmatically delete the multitude of galleries I'd spawned while hacking around trying to get stuff to work).


Picasync::Album.find(:all).each {|album|
 album.delete!
}

%w(one two three).each do |title|
 album = Picasync::Album.new(title)
 album.create!
end

Easy. There's no ability to upload images via the API yet, although the module's purpose is to farm out uploading and CMS tasks to Picasa's UI anyhow, so that Picasa can essentially power a site's galleries (but without hotlinking or hitting their feed on every page load).

Albums are synced locally via Picasync::Sync::All.new & Picasync::Sync::CSV.new, which fetch files to a single directory, hashing the file names and generating a couple of CSVs for image sizes, captions and parent albums. Still to add: table migrations and automatic CSV imports.
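To make the "hash the file names, write a CSV" idea concrete, here is a minimal sketch using only Ruby's standard library. The struct, its fields, the column names and the choice of MD5 are all assumptions for illustration; Picasync's own layout may well differ.

```ruby
require 'csv'
require 'digest/md5'

# Hypothetical record for one mirrored image; the fields are assumed.
MirroredImage = Struct.new(:url, :caption, :album_id, :width, :height)

# Derive a flat file name from the remote URL so every image can live
# in a single directory; the original extension is kept, lower-cased.
def hashed_name(url)
  Digest::MD5.hexdigest(url) + File.extname(url).downcase
end

# Build CSV text with one row per image, keyed by the hashed file name.
def images_csv(images)
  CSV.generate do |csv|
    csv << %w[file album_id caption width height]
    images.each do |img|
      csv << [hashed_name(img.url), img.album_id, img.caption, img.width, img.height]
    end
  end
end

images = [
  MirroredImage.new('http://example.com/albums/1/Beach.JPG',
                    'Low tide, "north" side', 'album-1', 800, 600)
]
puts images_csv(images)
```

Letting the CSV library do the quoting means captions with commas or quotes survive a later import intact.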

It's also simple to fetch files arbitrarily, be it a whole album or a particular image in a set (although things aren't properly tied together with CSV generation yet):


Picasync::Image.mirror(:album, album.id)
Picasync::Image.mirror(image.id, album.id)

Uses Google's ClientLogin authentication scheme.
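For anyone unfamiliar with ClientLogin, the exchange is small: POST the account credentials as a form, get back a few key=value lines, and send the Auth value in an Authorization header on later requests. The sketch below only builds and parses the pieces (no network call), and its method names are hypothetical rather than Picasync's.

```ruby
require 'uri'

CLIENT_LOGIN_URL = 'https://www.google.com/accounts/ClientLogin'

# Form-encoded body for the login POST. 'lh2' is the ClientLogin
# service name for Picasa Web Albums; source identifies the client app.
def client_login_body(email, password, source)
  URI.encode_www_form(
    'accountType' => 'GOOGLE',
    'Email'       => email,
    'Passwd'      => password,
    'service'     => 'lh2',
    'source'      => source
  )
end

# The response is plain text, one key=value pair per line
# (SID, LSID and Auth). Split on the first "=" only.
def parse_client_login(body)
  body.each_line
      .map { |line| line.strip.split('=', 2) }
      .select { |pair| pair.size == 2 }
      .to_h
end

# Header to attach to subsequent API requests.
def auth_header(token)
  { 'Authorization' => "GoogleLogin auth=#{token}" }
end

sample = "SID=abc\nLSID=def\nAuth=tok=en123\n"
tokens = parse_client_login(sample)
puts auth_header(tokens['Auth']).inspect
```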

It's a work in progress and definitely not a drop-in solution for end users, but you can grab (and contribute to) the code on Github.

See the Readme for the methods I've got around to adding.


albums = Picasync::Album.find(:all, :images)
albums.each do |album|
  puts album.title
  album.images.each {|image|
    puts image.medium
    puts image.caption
  }
end

Sunday, 2 March 2008

DRYing up CMS with Google

Saturday, 1 March 2008

Ruby Interface for Google Picasa API