William Vambenepe
Expanding on “twitter with a brain”
Chuck Shotton recently made a compelling case (”Twitter with a Brain“) for Twitter tools to allow the user to change the protocol endpoint. That is, instead of always going to twitter.com, you can tell your Twitter client to send all requests to myTweetInterceptor.me.com. Why would you do this? You should read his blog entry, but in short his point is that the intermediary can add all kinds of new features that neither the Twitter client nor Twitter itself support. As always in computer science, a new level of decoupling adds opportunities for extensions (and breakage too, of course).
I fully agree with what he writes and I would very much like to see his call to action answered. In fact, I want more than what he is asking for. So here is my call to action:
1) It’s not just Twitter
Why just Twitter? This should be true for any client using any protocol. Why not also the APIs for the various Google and Yahoo services? The APIs for the other social networks beyond Twitter? For shopping sites like Amazon and EBay? Etc. And of course to all the various Cloud providers out there. Just because I am using the Amazon EC2 API it doesn’t mean I necessarily want the requests to go straight to Amazon. Client tools should always make the endpoint configurable, period.
2) It’s not just the clients, it’s also (and especially) the third party sites
Chuck’s examples are about features that the Twitter clients could provide but don’t, so an intermediary would be an easy way to hack support for them (others presumably include modifying the client – if open source -, writing a plug-in for it – it there is such mechanism -, or running a network interceptor on the local client – unless the protocol is encrypted-).
That’s nice and I’d love to see this, but the big deal for me is less with clients and more with third party sites. You know, all these sites that ask for your Twitter login/password. Or those that ask for your GMail/Yahoo account info to retrieve a list of your contacts. I never grant these requests, but I would consider it if they allowed me to tell them what endpoint URL to use. For example, rather than using my Twitter login to go straight to twitter.com, they would use a login/password that I create and talk to twitterIntermediary.vambenepe.com. The requests would be in the exact same shape as what they send today to Twitter, just directed to another URL. There, I could have a proxy that only allows some requests (e.g. “update twitter background image” but not “send update”) and forwards them using my real Twitter credentials. Or, for email accounts, I could have a proxy that allows requests that read my address book but not those that read my mails. The goal here is not to add features, it is to delegate trust in a fine-grained (and audited) manner. This, to me, is the burning need, rather than a 3rd place to implement Twitter lists.
I would probably write these proxies using a PaaS platform like the Google App Engine. Or maybe even Yahoo Pipes. I have long struggled to think of use cases for which Yahoo Pipes hits the sweetspot, and this may well be it. Especially if people write modules to handle specific APIs (e.g. a “Twitter API” module that shows all operations and lets you enable/disable them one by one in a pipe). The one thing missing would be a way for a pipe to keep a log of its invocations, for auditing.
You want access to my email and social network accounts? Give me the ability to filter you requests and you’ll get access. If it’s blind trust you want, I am afraid I have a very limited supply.
[Note: I wanted to add this as a comment on Chuck's blog, but he doesn't seem to allow them: "go start your own blog and/or shut up and eat your vegetables" is his recommendation. Since I already have my own blog, I guess I don't have to eat my vegetables if I don't want to. I just hope my kids don't learn about this rule or they'll be blogging in no time.]
[UPDATED 2009/11/30: WRT to Chuck's request, it looks like it's being done already. But no luck with the third party sites so far, which is what I most want to see.]
Can your hypervisor radio for air support?
As I was reading about Microsoft Azure recently, a military analogy came to my mind. Hypervisors are tanks. Application development and runtime platforms compose the air force.
Tanks (and more generally the mechanization of ground forces) transformed war in the 20th century. They multiplied the fighting capabilities of individuals and changed the way war was fought. A traditional army didn’t stand a chance against a mechanized one. More importantly, a mechanized army that used the new tools with the old mindset didn’t stand a chance against a similarly equipped army that had rethought its strategy to take advantage of the new capabilities. Consider France at the beginning of WWII, where tanks were just canons on wheels, spread evenly along the front line to support ground troops. Contrast this with how Germany, as part of the Blitzkrieg, used tanks and radios to create highly mobile – and yet coordinated – units that caused havoc in the linear French defense.
Exercise for the reader who wants to push the analogy further:
- Describe how Blitzkrieg-style mobility of troops (based on tanks and motorized troop transports) compares to Live Migration of virtual machines.
- Describe how the use of radios by these troops compares to the use of monitoring and control protocols to frame IT management actions.
Tanks (hypervisor) were a game-changer in a world of foot soldiers (dedicated servers).
But no matter how good your tanks are, you are at a disadvantage if the other party achieves air superiority. A less sophisticated/numerous ground force that benefits from strong air support is likely to prevail over a stronger ground force with no such support. That’s what came to my mind as I read about how Azure plans to cover the IaaS layer, but in the context of an application-and-data-centric approach. Where hypervisors are not left to fend for themselves based on the limited view of the horizion from the periscope of their turrets but rather orchestrated, supported (and even deployed) from the air, from the application platform.

(Yes, I am referring to the Azure vision as it was presented at PDC09, not necessarily the currently available bits.)
Does your Cloud vendor/provider need an air force?
Exercise for the reader who wants to push the analogy to the stratosphere:
- Describe how business logic/process, business transaction management and business intelligence are equivalent to satellites, surveying the battlefield and providing actionable intelligence.
The new Cloud stack (”military-cloud complex” version):

[Note: I have no expertise in military history (or strategy) beyond high school classes about WWI and WWII, plus a couple of history books and a few war movies. My goal here is less to be accurate on military concerns (though I hope to be) than to draw an analogy which may be meaningful to fellow IT management geeks who share my level of (in)expertise in military matters. This is just yet another way in which I try to explain that, for Clouds as for plain old IT management, "it's the application, stupid".]
Review of Fujitsu’s IaaS Cloud API submission to DMTF
Things are heating up in the DMTF Cloud incubator. Back in September, VMWare submitted its vCloud API (or rather a “reader’s digest” version of it) to the group. Last week, the group released a white paper titled “Interoperable Clouds”. And a second submission, from Fujitsu, was made last week and publicly announced today.
The Fujitsu submission is called an “API design”. What this means is that it doesn’t tell you anything about what things look like on the wire. It could materialize as another “XML over HTTP” protocol (with or without SOAP wrapper), but it could just as well be implemented as a binary RPC protocol. It’s really more of an esquisse of a resource model than a remote API. The only invocation-related aspect of the document is that it defines explicit operations on various resources (though not their input and outputs). This suggest that the most obvious mapping would be to some XML/HTTP RPC protocol (SOAPy or not). In that sense, it stands out a bit from the more recent Cloud API proposals that take a “RESTful” rather than RPC approach. But in these days of enthusiastic REST-washing I am pretty sure a determined designer could produce a RESTful-looking (but contorted) set of resources that would channel the operations in the specification as HTTP-like verbs on these resources.
Since there are few protocol aspects to this “API design”, if we are to compare it to other “Cloud APIs”, it’s really the resource model that’s worth evaluating. The obvious comparison is to the EC2 model as it provides a pretty similar set of infrastructure resources (it’s entirely focused on the IaaS layer). It lacks EC2 capabilities around availability, security and monitoring. But it adds to the EC2 resource model the notions of VDC (”virtual data center”, a container of IaaS resources), VSYS (see below) and a lightly-defined EFM (Extended Function Module) concept which intends to encompass all kinds of network/security appliances (and presumably makes up for the lack of security groups).
The heart of the specification is the VSYS and its accompanying VSYS Descriptor. We are encouraged to think of the VSYS Descriptor as an extension of OVF that lets you specify this kind of environment:

Example content for a VSYS Descriptor
By forcing the initial VSYS instance to be based on a VSYS Descriptor, but then allowing the VSYS to drift away from the descriptor via direct management actions, the specification takes a middle-of-the-road approach to the “model-based versus procedural” debate. Disciples of the procedural approach will presumably start from a very generic and unconstrained VSYS Descriptor and, from there, script their way to happiness. Model geeks will look for a way to keep the system configuration in sync with a VSYS Descriptor.
How this will work is completely undefined. There is supposed to be a getVSYSConfiguration() operation which “returns the configuration information on the VSYS” but there is no format/content proposed for the response payload. Is this supposed to return every single config file, every setting (OS, MW, application) on all the servers in the VSYS? Surely not. But what then is it supposed to return? The specification defines five VSYS attributes (VSYSID, creator, createTime, description and baseDescriptor) so I know what getSYSAttributes() returns. But leaving getVSYSConfiguration() undefined is like handing someone an airplane maintenance manual that simply reads “put the right part in the right place”. A similar feature is also left as an exercise to the reader in section that sketches an “external configuration service”. We are provided with a URL convention to address the service, but zero information about the format and content of the configuration instructions provided to the VServer.
EC2 has a keypair access mechanism for Linux instances and a clumsy password-retrieval system for Windows instances. The Fujitsu proposal adopts the lowest common denominator (actually the greatest common divisor, but that’s a lost rhetorical cause): random password generation/retrieval for everyone.
I also noticed the statement that a VServer must be “implemented as a virtual machine” which is an unnecessary constraint/assumption. The opposite statement is later made for EFMs, which “can be implemented in various ways (e.g. run on virtual machines or not)”, so I don’t want to read too much into the “hypervisor-required” VServer statement which probably just needs an editorial clean-up.
From a political perspective this specification looks more like a case of “can I play with you? I brought some marbles” than a more aggressive “listen everybody, we’re playing soccer now and I am the captain”. In other words, this may not be as much an attempt to shape the outcome of the incubator as much as to contribute to its work and position Fujitsu as a respected member whose participation needs to be acknowledged.
While this is an alternative submission to the vCloud API, I don’t think VMWare will feel very challenged by it. The specification’s core (VSYS Descriptor) intends to build on OVF, which should be music to VMWare’s ears (it’s the model, not the protocol, which is strategic). And it is light enough on technical details that it will be pretty easy for vCloud to claim that it, indeed, aligns with the intent of this “design”.
All in all, it is good to see companies take the time to write down what they expect out of the DMTF work. And it’s refreshing to see genuine single-company contributions rather than pre-negotiated documents by a clique. Whether they look more like implementable specifications of position paper, they all provide good input to the DMTF Cloud incubator.
Desirable technical characteristics of PaaS
PaaS can most dramatically improve the IT experience in four areas:
- Hosting/operations efficiency
- Application-centric management
- Development productivity
- Security
To do so, there are technical characteristics that PaaS frameworks should eventually exhibit. These are not technical characteristics of a given PaaS container, they are shared characteristics that go across all container types, no matter what the operational capabilities of the containers are.
Here is a rough and unorganized list of the desirable characteristics (meta-capabilities) of PaaS Cloud containers:
- An application component model that supports deployment/configuration across all PaaS container types.
- Explicit interactions/invocations between application components (resilient connections between component: infrastructure-level retry/reroute)
- Uniform and consistent request tracking across all components. Ability to intercept component-to-component communication.
- Short-term (or externally persisted) state so that all instances can be quickly redirected out of any one node.
- Subset of platform management interface exposed to consumer, along with out of the box application management. Application metrics consolidated at application level rather than node level.
- Consistent, model-based application management interface across all container types. Hooks for component code to provide its manageability in the same framework.
- Minimal footprint of any container node for limited patching requirements.
- Assistance for debugging platform-hosted code (see this entry).
- No encroachment of container technology on application contract (e.g. no forced URL structure).
- Application uniformly scalable to the limit of the underlying hardware (no imposed partitioning).
- Shared authentication / authorization / auditing across containers.
- Minimum contract/interface exposed by each container.
- Governance of application services, aligned (in model/protocols) with the container management interfaces.
- [UPDATE: need to add metering+billing as William Louth pointed out in a comment]
This applies across the board to public, private and hybrid PaaS. The distinctions between these delivery models are real but at a different level. The important thing is that the PaaS administrator is different from the application administrator in all cases. On the other hand, most of these technical characteristics are not achievable for lower-level Cloud resources (like virtual hosts and low-level storage) which is why the IaaS form of Cloud leaves the Cloud promise only partially fulfilled.
Enumeration of PaaS container types
What do we need from an on-demand application platform for enterprise software? Here is a proposed classification of container types on which you can create/scale/manage applications. Your application is made of modules that run on these containers (e.g. one app may have two “synchronous web request” modules, one “structured data service” module and one “scheduled job” module). All tied together using an application component model that dovetails with the capabilities of the different containers. The proposed containers map to the following capabilities:
- Synchronous web request processing
- Long-lived (persisted) process execution with introspectable/declarative flow
- Event processing
- Persistence (a few different types: structured vs. buckets/files vs. object cache)
- Batch/background/scheduled/queued processing
- Possibly some advanced Web/mobile UI (portals, human task flows…)
- User management (self-contained or as an abstraction layer for other systems)
- [UPDATED 2009/11/19: I should have included pluggable protocol/format adapters.]
While new applications may be built to run purely on such a platform, you will not be able to run your IT solely on it anytime in the foreseeable future. So if you are thinking about it from the perspective of the entire universe of virtualization containers needed to support your IT system, you’ll need to add lower-level container types, such as:
- Guest hosts (typically from hypervisors)
- Low-level (e.g. block) storage
- Networking between them
Another way to think about it is that Cloud/Utility computing is about having pools of resources dynamically allocated. There needs to be few enough pools that each pool is large enough and used across enough consumers to derive efficiencies from the act of sharing. But there needs to be enough pool types that applications have a complete infrastructure to run on that lets them abstracts away what is not business logic. This list is an attempt at this middle ground.
This is a classification of containers, not a description. There are many different ways to realize each of them. The synchronous web request processing could look like a servlet engine, a Python handler or a PHP page. The persisted execution could be a BPEL engine or some other state machine definition/execution engine. The structured data interface could look like SQL, XQuery or SPARQL, etc… In addition, more specialized application infrastructure elements (e.g. video streaming, analytics) might enter the picture for some applications.
I hope to see the discussion move beyond “IaaS vs. PaaS” towards talking about more specific container types that are needed/supported by the different virtualization stacks. My application doesn’t care if the file container is considered IaaS or PaaS.
The next post will list desired characteristics of the PaaS environment (meta-capabilities that go across all the container types in the first list but may not be available from the lower-level containers in the second list).
Would you like some management with that appliance?
Andi Mann recently wrote an interesting post about virtual appliances . He uses the domain name pleasediscuss.com for his blog so I figured I’d do just that. More specifically, I have three comments on his article.
Opaque or transparent appliance
Andi’s concerns about the security and management problems posed by virtual appliances are real, but he seems to assume that the content of the appliance is necessarily opaque to the customer and under the responsibility of the appliance provider. Why can’t a virtual appliance be transparent in the sense that the customer is able to efficiently manage at least some aspects of the software installed on it? “You can’t put agents on most virtual appliances, they don’t come with WMI, and most have only a GUI for management” says Andi. Why can’t an appliance come with an agent (especially in these days of consolidation where many vendors provide many layers of the stack – hypervisor / OS / application container / application / management tools – including their agent)? Why can’t it implement a standard management API (most servers nowadays implement WBEM, WS-Management and/or IPMI pre-boot – on the motherboard – which is a lot more challenging to do than supporting a similar protocol in a virtual appliance). Andi is really criticizing the current offering more than the virtual appliance model per se and in this I can join him.
Let me put it differently, since this is probably just a question of definition: what would Andi call a virtual appliance that does expose management APIs for its infrastructure (e.g. WS-Management for the OS, JMX for the java stack) or that comes with an agent (HP, IBM, BMC, Oracle…) installed on it?
Such an appliance (let’s call it a “transparent virtual appliance” for now) doesn’t provide all the commonly claimed benefits of an appliance (zero config/admin) but as Andi points out these benefits come with major intrinsic drawbacks. A transparent virtual appliance still drastically simplifies installation (especially useful for test/dev/demo/POC). It doesn’t entirely free you of monitoring and configuration but at least it provides you with a very consistent and controlled starting point, manageable from the start (no need to subsequently install an agent). In addition, it can be made “just enough” (just enough OS, just enough app server…) to require a lot less maintenance than an application stack that you assemble yourself out of generic parts. We’ll always have trade offs between how optimized/customized it is versus how uniform your overall environment can be, but I don’t see the use of an appliance as a delivery mechanism as necessarily cornering you into a completely opaque situation, from a management perspective.
Those who attended Oracle Open World a few weeks ago were treated to an example of such an appliance, if they attended any of the sessions that covered Oracle’s Appliance Builder (the main one was, I believe, Virtualizing Oracle Fusion Middleware in the Modern Data Center, in case you have access to the Open World On Demand replay and slides). I believe it’s probably the same content that @jayfry3 was shown when he tweeted about “Oracle is demoing their private cloud self-service app”. These appliances are not at all opaque from a management perspective. To the contrary, they are highly manageable, coming with an Enterprise Manager agent installed that can manage everything in the appliance (and when that “everything” doesn’t include the OS, it’s because there isn’t one thanks to JRockit Virtual Edition, making things slimmer, faster, safer and more manageable). And of course the OVM-based environment in which you deploy these appliances is also managed by Enterprise Manager. OK, my point here wasn’t to go into marketing mode, but this is cool stuff and an example of what virtual appliances should be. BTW, this was also demonstrated during Hasan Rizvi’s keynote at OpenWorld, including the management of these systems through Enterprise Manager.
In the long run it’s irrelevant
As with all things computer-related, the issue is going to get blurrier and then irrelevant . The great thing about software is that there is no solid line. In this case, we will eventually get more customized appliances (via appliance builders or model-driven appliance generation) blurring the line between installed software and appliance-based software.
Waiting for PaaS
Towards the end of his post, Andi paints an optimistic vision of the future: “I also think that virtual appliances have a bright future – but in some ways I continue to see them as a beta version of what could (or should) come next. By adding in capabilities for responsible and accountable management, they could form the basis of more fully-functional virtual service management containers. These in turn could form the basis of elastic, mobile, network-deployed, responsible cloud appliances that deliver complete end-to-end service management without regard to physical location or domain of control.”
I mostly agree with this vision, though when I describe it it is in the guise of a PaaS platform. Where your appliance (which today goes from the OS all the way to the app) has shrunk to an application template that you deploy in the PaaS environment (rather than in a hypervisor). If/when the underlying PaaS environment has reached the right level of management automation you get all the benefits of an appliance while maintaining the consistency of your environment and its adherence to your management policies (because the environment is the PaaS platform and its management is driven from your policies).
[As is often the case, this started as a comment (on Andi's blog) and quickly outgrew that environment, leading to this new post. Plus, Andi's blog is brand new and seems to be well worth spreading the word about (Andi himself is under-marketing it).]
OWL news you can use
The W3C released OWL 2 today. Most readers of this blog are IT management people (whether they call it “cloud computing” or “boring old system management”) and don’t follow RDF, OWL, SPARQL etc too closely (if at all). Yet there is a lot of potential value in using these technologies for IT management, so I thought it might be helpful to provide some practical resources on the topic. I have selected articles that cover the special (some may say “twisted”) approach of using OWL and its friends for validation rather than just inference, as this use case is very relevant to IT management.
- Dave Reynolds’ overview of OWL2, with a focus on vocabularies and validation.
- A good series of articles from Clark&Parsia, starting with this one.
- SPIN (which I blogged about before), from the TopQuadrant guys, who produce the TopBraid tool (their free edition is a great way to get started using Semantic Web technologies).
Of course you can also go to the W3C standard itself, starting with the overview of OWL 2.
Just so you don’t feel lonely if you decide to explore this path, have a look at Elastra’s sexy technology stack. ECML, EDML and EMML are all defined as OWL ontologies.
Yes you can read the OSGi specification
You know what I like the best about OSGi? That it doesn’t put the bar too high for architects. At first I was a bit intimidated by the size (338 pages for the “core specification”, 862 pages for the “service compendium”) and the fact that I had to look up “compendium”. But then they put me right at ease:
“Architects should focus on the introduction of each subject. This introduction contains a general overview of the subject, the requirements that influenced its design, and a short description of its operation as well as the entities that are used. The introductory sections require knowledge of Java concepts like classes and interfaces, but should not require coding experience.”
I am like so totally overqualified for my job. Hell, I even know what packages are.
(from the recently released OSGi version 4.2.)
Missing out on the OCCI fun
As a recovering “design by committee” offender I have to be careful when lurking near standards groups mailing list, for fear my instincts may take over and I might join the fray. But tonight a few tweets containing alluring words like “header” and “metadata” got the better of me and sent me plowing through a long and heated discussion thread in the OGF OCCI mailing list archive.
I found the discussion fascinating, both from a technical perspective and a theatrical perspective.
Technically, the discussion is about whether to use HTTP headers to carry “metadata” (by which I think they mean everything that’s not part of the business payload, e.g. an OVF document or other domain-specific payload). I don’t have enough context on the specific proposal to care to express my opinion on its merits, but what I find very interesting is that this shines another light on the age-old issue of how to carry non-payload info when designing a protocol. Whatever you call these data fields, you have to specify (by decreasing order of architectural importance):
- How you deal with unknown fields: mustUnderstand or mustIgnore semantics.
- How you keep them apart (prevent two people defining fields by the same name, telling different versions apart).
- How you parse their content (and are they all parsed in the same manner or is it specific to each field).
- Where they go.
SOAP provides one set of answers.
- You can tag each one with a mustUnderstand attribute to force any consumer who doesn’t understand them to fault.
- They are namespace-qualified.
- They are XML-formatted.
- They go at the top of the XML doc, in a section called the SOAP header.
You may agree or not with the approach SOAP took, but it’s important to realize that at its core SOAP is just this: the answer (in the form of the SOAP processing model) to these simple questions (here is more about the SOAP processing model and the abuses it has suffered if you’re interested). WSDL is something else. The WS-* stack is also something else. It’s probably too late to rescue SOAP from these associations, but I wanted to point this out for the record.
Whatever you answer to the four “non-payload data fields” questions above, there are many practical concerns that you have to consider when validating your proposal. They may not all be relevant to your use case, but then explicitly decide that they are not. They are things like:
- Performance
- Ability to process in a stream-based system
- Ease of development (tool support, runtime accessibility…)
- Ease of debugging
- Field length limitations
- Security
- Ability to structure the data in the fields
- Ability to use different transports (way overplayed in SOAP, but not totally irrelevant either)
- Ability to survive intermediaries / proxies
Now leaving the technology aside, this OCCI email thread is also interesting from a human and organizational perspective. Another take on the good old Commedia dell standarte. Again, I don’t have enough context in the history of this specific group to have an opinion about the dynamics. I’ll just say that things are a bit more “free-flowing” than when people like my friend Dave Snelling were in charge in OGF. In any case, it’s great that the debate is taking place in public. If it had been a closed discussion they probably would not have benefited from Tim Bray dropping in to share his experience. On the plus side, they would have avoided my pontifications…
There should be a word for this (Blog/Twitter edition)
I enjoyed finishing reading The Atlantic with Barbara Wallraff’s “Word Fugitives” column every month. Until earlier this year, when it was replaced with Jeffrey Goldberg’s attempts at humor. For old time sake, I am borrowing the “Word Fugitive” format and applying it to the world of blogs and tweets. Here is a list of blog/twitter situations for which “there should be a word”.
#1 The ego-crushing realization, in the course of a face to face conversation covering topics you’ve written about, that the other person has not read your blog/tweets on this. Even though the first thing they told you when you met 10 minutes earlier is that they love your blog.
Candidate: followimp (from Shlomo).
#2 Conversely when someone brings up in the conversation something you wrote and had forgotten you did (maybe we need two words here, one if you are happy to be reminded of this and one if you’d rather not have been).
Candidates: twegreat and twegrets, respectively (from Shlomo).
#3 Seeing the corner of the blogo-twitto-sphere where you hang out light up in response to someone’s post even though you wrote up the same thing two years ago. At least you were trying to explain the same thing, but your brilliance went unnoticed.
Candidate: deja-lu.
#4 The frustrating (for system modelers at least) intermixing of data (your text) and metadata (e.g. the identification of the tweet you are responding to) in Tweeter conversations.
Candidate: metamess.
#5 (This one comes from @Beaker) The art of carving up tweets from others to be able to retweet them in 140 characters.
Hoff has a suggestion: Twexter (Twitter + Dexter).
#6 The art of guessing early the Twitter #hashtag that will emerge as a winner for a given topic.
Candidate: foretweetude.
#7 The frustration of having too many blog drafts and no time to write them up.
Candidate: blocrastination. And Neil WD offered logjam in the comments.
#8 (added on 2009/10/22 after seeing this) The feeling of nakedness one has while his/her blog is offline.
Candidate: e-vanescence.
Cloud platform patching conundrum: PaaS has it much worse than IaaS and SaaS
The potential user impact of changes (e.g. patches or config changes) made on the Cloud infrastructure (by the Cloud provider) is a sore point in the Cloud value proposition (see Hoff’s take for example). You have no control over patching/config actions taken by the provider, any of which could potentially affect you. In a traditional data center, you can test the various changes on specific applications; you don’t have to apply them at the same time on all servers; and you can even decide to skip some infrastructure patches not relevant to your application (”if it aint’ broken…”). Not so in a Cloud environment, where you may not even know about a change until after the fact. And you have no control over the timing and the roll-out of the patch, so that some of your instances may be running on patched nodes and others may not (good luck with troubleshooting that).
Unfortunately, this is even worse for PaaS than IaaS. Simply because you seat on a lot more infrastructure that is opaque to you. In a IaaS environment, the only thing that can change is the hardware (rarely a cause of problem) and the hypervisor (or equivalent Cloud OS). In a PaaS environment, it’s all that plus whatever flavor of OS and application container is used. Depending on how streamlined this all is (just enough OS/AS versus a traditional deployment), that’s potentially a lot of code and configuration. Troubleshooting is also somewhat easier in a IaaS setup because the error logs are localized (or localizable) to a specific instance. Not necessarily so with PaaS (and even if you could localize the error, you couldn’t guarantee that your troubleshooting test runs on the same node anyway).
In a way, PaaS is squeezed between IaaS and SaaS on this. IaaS gets away with a manageable problem because the opaque infrastructure is not too thick. For SaaS it’s manageable too because the consumer is typically either a human (who is a lot more resilient to change) or a very simple and well-understood interface (e.g. IMAP or some Web services). Contrast this with PaaS where the contract is that of an application container (e.g. JEE, RoR, Django).There are all kinds of subtle behaviors (e.g, timing/ordering issues) that are not part of the contract and can surface after a patch: for example, a bug in the application that was never found because before the patch things always happened in a certain order that the application implicitly – and erroneously – relied on. That’s exactly why you always test your key applications today even if the OS/AS patch should, in theory, not change anything for the application. And it’s not just patches that can do that. For example, network upgrades can introduce timing changes that surface new issues in the application.
And it goes both ways. Just like you can be hurt by the Cloud provider patching things, you can be hurt by them not patching things. What if there is an obscure bug in their infrastructure that only affects your application. First you have to convince them to troubleshoot with you. Then you have to convince them to produce (or get their software vendor to produce) and deploy a patch.
So what are the solutions? Is PaaS doomed to never go beyond hobbyists? Of course not. The possible solutions are:
- Write a bug-free and high-performance PaaS infrastructure from the start, one that never needs to be changed in any way. How hard could it be? ;-)
- More realistically, narrowly define container types to reduce both the contract and the size of the underlying implementation of each instance. For example, rather than deploying a full JEE+SOA container componentize the application so that each component can deploy in a small container (e.g. a servlet engine, a process management engine, a rule engine, etc). As a result, the interface exposed by each container type can be more easily and fully tested. And because each instance is slimmer, it requires fewer patches over time.
- PaaS providers may give their users some amount of visibility and control over this. For example, by announcing upgrades ahead of time, providing updated nodes to test on early and allowing users to specify “freeze” periods where nothing changes (unless an urgent security patch is needed, presumably). Time for a Cloud “refresh” in ITIL/ITSM-land?
- The PaaS providers may also be able to facilitate debugging of infrastructure-related problem. For example by stamping the logs with a version ID for the infrastructure on the node that generated the log entry. And the ability to request that a test runs on a node with the same version. Keeping in mind that in a SOA / Composite world, the root cause of a problem found on one node may be a configuration change on a different node…
Some closing notes:
- Another incarnation of this problem is likely to show up in the form of PaaS certification. We should not assume that just because you use a PaaS you are the developer of the application. Why can’t I license an ISV app that runs on GAE? But then, what does the ISV certify against? A given PaaS provider, e.g. Google? A given version of the PaaS infrastructure (if there is such a thing… Google advertises versions of the GAE SDK, but not of the actual GAE runtime)? Or maybe a given PaaS software stack, e.g. the Oracle/Microsoft/IBM/VMWare/JBoss/etc, meaning that any Cloud provider who uses this software stack is certified?
- I have only discussed here changes to the underlying platform that do not change the contract (or at least only introduce backward-compatible changes, i.e. add APIs but don’t remove any). The matter of non-compatible platform updates (and version coexistence) is also a whole other ball of wax, one that comes with echoes of SOA governance discussions (because in PaaS we are talking about pure software contracts, not hardware or hardware-like contracts). Another area in which PaaS has larger challenges than IaaS.
- Finally, for an illustration of how a highly focused and specialized container cuts down on the need for config changes, look at this photo from earlier today during the presentation of JRockit Virtual Edition at Oracle Open World. This slide shows (in font size 3, don’t worry you’re not supposed to be able to read), the list of configuration files present on a normal Linux instance, versus a stripped-down (”JeOS”) Linux, versus JRockit VE.

By the way, JRockit VE is very interesting and the environment today is much more favorable than when BEA first did it, but that’s a topic for another post.
[UPDATED 2009/10/22: For more on this (in an EC2-centric context) see section 4 ("service problem resolution") of this IBM paper. It ends with "another possible direction is to develop new mechanisms or APIs to enable cloud users to directly and automatically query and correlate application level events with lower level hardware information to better identify the root cause of the problem".]
PaaS as a satisfying and success-ready hobbyist plaform
I don’t know anyone in Silicon Valley who can code and doesn’t fantasize about writing an accidental killer app. One that gets designed during a long layover in DEN and implemented in a rainy weekend (El Nino is my VC). One that was only supposed to meet the needs of a few friends and is used by half of the world a few months/years later.
I am not talking about seasoned entrepreneurs, who have a network, discipline, resources and enough experience to know that it takes a lot more than a cool idea. Rather, about programming hobbyists (who may of may not be programmers in their day jobs),
By definition, hobbyists only do things that are satisfying. In the rarefied air of Silicon Valley, it also helps if there is a conceivable “upside” to dream about. Platform as a Service (PaaS, e.g. Google App Engine) provides both to software-oriented hobbyist. And make it very cheap (borderline free), which doesn’t hurt.
Satisfying
In a well-crafted PaaS environment, the development process and the result are both satisfying. I am not a Google shrill, but GAE is a fair example. The barrier to entry is very low (the download is less than 10MB and contains all you need to get started). In an hour you have an application running locally. In an hour and 5 minutes you have it deployed and accessible on the web for all. And yet this ease of bootstrap does not come at the cost of too many longer-term limitations (now that the environment has gown a bit from the original limitations and provides scheduled and background jobs). Unlike Yahoo Pipes, for which the first impression is “nifty!” and the second is “gimme a textual representation of my pipe now!”.
Beyond the easy ramp-up, the main source of satisfaction developing in a PaaS environment is that you spend 99% of your time working on the application. Not the OS, not the firewall, not the application container, not the database. Not to mention having to deal with your co-lo provider or the leased line for the servers in the basement. If you are a hobbyist with only a few spare hours per week, that’s a make or break deal. It also means that you have a fighting chance of developing a secure application because you are responsible for a much smaller surface of attack.
Eons ago (in computing time), Visual Basic was the name of the game for these people. More recent was the rise of PHP. It dramatically lowered the barrier to adoption and provided a quick route to a working web application. I know several non-professional developers (e.g. web designers) who are scared of any “normal” programming language but happily write PHP (often of equivalent complexity BTW). Combine this with the wide availability of ISP-managed PHP environments and you get close to what GAE gives you. At the risk of adding to the annoying trend of retroactively cloudifying everything, I think of ISP-hosted PHP as the first generation of PaaS. But it is focused on “show what’s in the DB” scenarios rather than service-centric / mash-up / web 2.0 integration. And even for DB-centric scenarios, by and large PHP coders don’t want to think too much about model and queries (and it shows). I think Google decided to go with Python rather than the easy route of aping the hosted PHP environments in large part to avoid hitting such ceilings down the road. Not surprisingly, PHP support is currently the most requested GAE feature, ahead of Perl and Ruby. Lets see if Google tries to get the PHP community on board or prefers to stay clear of such PaaS legacy (already!).
Ready for success
In the unlikely event that your application catches fire and sees wide adoption (which is not impossible, especially if well integrated in a social network), what are you going to do about it? Keeping in mind two constraints: first, this is a part-time hobby of yours. Second, don’t dream of riches. We are talking about an influx of facebookers or twitterers here, with no intention to pay for anything. But click they will. If you were going to answer: “I get funding and hire a real IT staff” then think again. You most likely won’t get funding for your toy app without revenue potential. And even if you do, by the time you have it it’s too late and people have moved on because you were not there when the spotlight was on you.
With a PaaS-based application you have a fighting chance. If the spike is short enough, you may not even hit the limit of the free quota. If it does, you have the choice of whether you are willing to pay to support the extra traffic or not. No change in code required (though it may be advised anyway, if your app wasn’t architected for efficient scaling – PaaS doesn’t entirely take this off your hands).
That “sudden spike” story is a commonly-invoked use case for EC2. And it’s probably true for a start-up with an IT staff (of at least one full-time person). But despite Amazon’s efforts (and other providers such as RightScale) this type of scaling is something you have to architect for and putting it in place takes away from the time you spend coding application features. It also means that you are responsible for more infrastructure (OS and application container at least). Not to mention that IaaS providers don’t usually offer free resources for limited usage, the way Google does (I suspect 99% of GAE apps never get over this quota). Even if a small EC2 instance is not very expensive (though it adds up over time if you keep it up for that occasional user), the difference between “free” and “cheap” is significant. As an application provider you’d like for this not to be the case with your users, but as a consumer of infrastructure service you’re on the other side of the deal, aren’t you?
There is a reason why suburbia-bound SUVs are advertised crossing mountain streams. The “I could if I wanted to” line has appeal. For the software hobbyist, knowing that your application won’t crash if it happened to meet success (even if only for a couple of days, e.g. the Slashdot effect) is a good feeling (”I could if *they* wanted to”). In truth this occasion is rare (and likely to end up like this), but you are ready for the eventuality. And if there are enough such hobbyists, then statistically some will encounter it.
The provider’s upside
That last point brings the topic of the PaaS provider’s upside in this. I have read several critical comments arguing that no company will rewrite their application for GAE (true) and that no start-up will write their new code for it either because of the risk of lock-in (also true: “being bought by Google” is not a bad outcome but “has to be bought by Google” is a bad exit strategy). But I think this misses the point of casting such a wide nest and starting with creating a great tool for hobbyists.
After all, Google has made a great business monetizing millions of small sites none of which makes much money by itself. At the very least least this can grow the web and, symbiotically, Google. With two possible upsides:
- Some of these hobbyist applications may actually take off and Google becomes their natural partner/godfather (including managing their user accounts). For example, wouldn’t it be nice for Google if Craigslist or Twitter was running on GAE?
- The platform eventually evolves into something that makes sense for start-ups, SMBs and/or enterprises to use. Google works out the kinks with less demanding users first.
Interesting times
Two closing thoughts, which I’ll leave undeveloped for now:
- There is an especially good synergy between mobile apps and PaaS. Once you get past the restaurant tip calculators, many mobile apps need a server-hosted sidekick to do the heavy lifting of gathering/storing/transforming data. As a hobbyist, you want to spend most of your time making you mobile app cool. Which leaves even less time for administrating server components. On the server side, you are even less likely to want to deal with anything but application logic. PaaS is especially attractive in these scenarios. Google and Microsoft have to navigate these waters carefully but look for some synergy/integration stories around GAE + Android and Azure + Windows Mobile respectively. Not clear what Apple’s story is here or if they think they need one. If it surfaces as an issue then we have yet another reason to restart the “Apple buys Adobe” rumor. Or maybe Sanjiva will get a middle-of-the-night call from Steve Jobs…
- A platform to build/run your application is one thing. A way to reach users is another (arguably much more critical). Things like mobile app stores (especially Apple’s of course), Facebook and next generation app stores. But this goes beyond the scope of this post.
Just to be clear, I am not in any way suggesting that PaaS is only for hobbyists. I am just saying that right now it is a great tool for them, the best way for an individual programmer to have fun and have an impact. This doesn’t take away from the value that PaaS will eventually deliver to larger organizations.
[UPDATED 2009/10/4: Microsoft Azure apparently supports PHP.]
The future (2006 version), has arrived
Remember 2006? Things were starting to fall into place for IT management integration and automation:
- SDD was already on its way to cleanly describe/package/manage the lifecycle of simple and composite applications alike,
- the first version of SML came out to capture all the relevant constraints of complex and composite systems and open the door to “desired-state management”,
- the CMDBf effort was started to seamlessly integrate all sources of configuration and provide a bird-eye view of your entire IT infrastructure, and
- the WSDM/WS-Management convergence/reconciliation was announced and promised to free management consoles from supporting many resource discovery, collection and control mechanisms and from having platform/library dependencies between the manager and its targets.
It looked like we were a year or two from standardization on all these and another year or two from shipping implementations. Things were looking good.
Good news: the schedule was respected. SDD, SML and CMDBf are now all standards (at OASIS, W3C and DMTF respectively). And today the Eclipse COSMOS project announced the release of COSMOS 1.1 which implements them all. The WSDM/WS-Management convergence is the only one that didn’t quite go according to the plan but it is about to come out as a standard too (in a pared-down form).
Bad news: nobody cares. We’ve moved on to “private clouds”.
Having been involved with these specifications in various degrees (a little bit on SDD, a fair amount on SML and a lot on CMDBf and WSDM/WS-Management) I am not as detached as my sarcastic tone may suggest. But as they say in action movies, “don’t let sentiments get in the way of the mission”.
There is still a chance to reuse parts of this stack (e.g. the CMDBf query language) and there are lessons to learn from our errors. The over-promising, the technical misjudgments, the political bickering, the lack of concrete customer validation, etc. To some extent this work was also victim of collateral damages from the excesses of WS-* (I am looking at you WS-Addressing). We also failed to notice the rise of the hypervisor in our peripheral vision.
I tried to capture some important lessons in this post-mortem. For the edification of the cloud generation. I also see a pendulum in action. Where we over-engineered I now see some under-engineering (overly granular interaction models, overemphasis on the virtual machine as the unit of everything, simplistic constraint models, underestimation of config/patching issues…). Things will come around and may eventually look familiar (suggested exercise: compare PubSubHubBub with WS-Notification).
As long as each iteration gets us closer to the goal things are good.
See you in 2012. Same place, same day, same time.
On Twitter
I created the @vambenepe Twitter account a while ago to reserve the username. Yesterday I posted three tweets, so I guess I am now “on Twitter”, in case anybody cares. We’ll see where this goes. @jamesurquhart gave me a kind (but intimidating) welcome and @Beaker hasn’t called me a “jackass” yet, so things are looking good. BTW, is it just me or has Cisco assembled a top-notch good cop / bad cop team? I hope I manage my blog-to-twitter expansion as well as they did.
The Cloud stuff is where the fun is, but if this Twitter thing is going to be of any use for real work I need to find who to follow in the IT management, application management and systems modeling areas. Any suggestion beyond @cote, @MouthOfOpenNMS, @dmcclure, @puppetmasterd and @theitskeptic (I feel like I am just Twitterifying my blogroll)?
And even then, finding people to follow seems to be the easy part. It took me about 20 minutes last night to realize that I am not going to read all the tweets (and I currently only follow 18 people). Worst case I’ll just track the direct mentions of my handle and some occasional hastags during interesting announcements. And scan the rest once a week. I assume that’s what the Twitter natives like @cote do as well (I seeded my list by picking names I recognized from his 1,130-long follow list). Advice?
The other issue is the 140 characters limit of course, but this should be easier to get used to. In the Apple/Palm tweet last night (about how this might show us what enforcement options standard bodies have) I wanted to invoke Stalin’s dismissive “The Pope! How many divisions has he got?” quote by replacing the pope with the USB Implementers Forum (USB-IF). But no room left unless I sacrificed the image of a working group chair breaking the knee cap of an offending implementer (which, as an ex-WG chair myself, I see some upside to).
Is it bad form to post multi-part tweets? How about, say, 50 parts? I need a protocol to guarantee delivery and order on top of the Twitter API. Maybe REST-* can help me… ;-)
I also wanted to ping Andy Updegrove with the hope that he’d comment on the USB-IF letter (he has looked at the iPhone before, but not this specific issue) for an authoritative opinion. But he doesn’t seem to be on Twitter. The nerve!
And then there is the “follower” thing, which I guess I am now supposed to start obsessing about (folks, if I don’t have a hundred followers by week end the kitten gets it).
In the real world, there are a few people who return my emails and occasionally agree to have lunch with me, but that’s a far cry from calling them “followers”. Even my wife would spit her coffee if I referred to her as my “follower”. But on Twitter, I just posted three tweets yesterday and I already feel like a religious guru with my 24 “followers”.
Jokes aside (on the cult-leader overtones of the word “follower”), the fact that these people are identified is a nice improvement over blog subscribers (who, to me, are just an occasional number within the user-agent field in my Apache httpd logs), at least until they comment/email. Nice to “see” you.
One more step in the slippery slope towards total egomania. Blog > Twitter > Live webcam of the inside of my stomach.
Thoughts on the “Simple Cloud API”
PHP developers with Cloud aspirations rejoice! Zend has announced a PHP toolkit (called the Simple Cloud API project) to abstract and access application-level Cloud services. This is not just YACA (yet another Cloud API), as there are interesting differences between this and all the other Cloud toolkits out there.
First it’s PHP, which was not covered by the existing toolkits. Considering how many web applications are written in PHP (including the one that serves this very blog) this may seem strange, until you realize that most Cloud toolkits out there are focused on provisioning/managing low-level compute resources of the IaaS kind. Something that is far out of PHP’s sweetspot and much more practically handled with Java, Python, Ruby or some .NET language accessible via PowerShell.
Which takes us to the second, and arguably most interesting, characteristic of this toolkit: it is focused on application-level Cloud services (files, documents and queues for now) rather than infrastructure-level. In other word, it’s the first (to my knowledge) PaaS toolkit.
I also notice that Zend has gotten endorsements from IBM, Microsoft, Nirvanix, Rackspace and GoGrid. The first two especially seem to have impressed InfoWorld. Let’s keep in mind that at this point all we are talking about are canned quotes in a press release. Which rank only above politician campaign promises as predictor of behavior. In any case that can’t be the full extent of IBM and Microsoft’s response to the VMWare/Cisco push on IaaS standards. But it may suggest that their response will move the battlefield to include PaaS, which would be a smart move.
Now for a few more acerbic comments:
- It has “simple” in its name, like SOAP (as Pete Lacey famously lampooned). In the long term this tends to negatively correlate with simplicity, just like the presence of “democratic” in the official name of a country does not bode well for actual democracy.
- Please, don’t shorten “Simple Cloud API” to SCA which is already claimed in a (potentially) closely related field.
- Reuven Cohen is technically correct to see it as “a way to create other higher level programmatic API interfaces such as REST or SOAP using an easy, yet portable PHP programming environment”. But pay attention to how many turtles are on this pile: the native provider API, the adapter to the “simple cloud API”, the SOAP or REST remote API and the consuming application’s native API. How much real isolation are you getting when you build your house on such a wobbly foundation
[UPDATE: Comments from someone in the know: a programmer working on adding Azure support for this Simple Cloud API project.]
Look Ma, no hypervisor!
Encouraged by hypervisor vendors, the confusion between virtualization and Cloud Computing is rampant. In the industry, the term “virtualization” (and its corollary, “virtual machine”) is used in so many different ways that it has lost all usefulness. For a recent example, read the introduction of this SNIA/OGF white paper (on Cloud Storage) which asserts that “the new technology underlying this is the system virtual machine that allows multiple instances of an operating system and associated applications to run on single physical machine. Delivering this over the network, on demand, is termed Infrastructure as a Service (IaaS)”.
In fact, even IaaS-type Cloud services don’t imply the use of hypervisors.
We need to decouple the Cloud interface/contract (e.g. “what are the types of resources that can I provision on demand? hosts, app servers, storage capacity, app services…”) from the underlying implementation (e.g. “are hypervisors used by the Cloud provider?”). At the risk of spelling out things that may be obvious to many readers of this blog, here is a simplified matrix of Cloud Computing systems, designed to illustrate that all combinations of interface and implementation are possible and in many cases even reasonable.
IaaS interface PaaS interface Hypervisor used Yes! (see #1) Yes! (see #2) Hypervisor not used Yes! (see #3) Yes! (see #4)#1: IaaS interface, hypervisor-based implementation
This is a very common approach these days, both in public Clouds (EC2, Rackspace and presumably at some point the VMWare vCloud Express service providers) and private Clouds (Citrix, Sun, Oracle, Eucalyptus, VMWare…). Basically, you take a bunch of servers, put hypervisors on all of them and make VMs running on these hypervisors available to the Cloud customers.
But despite its predominance, this is not the only path to a Cloud, not even to an IaaS (e.g. “x86 hosts on demand”) Cloud. The following three other scenarios are all valid too.
#2: PaaS interface, hypervisor-based implementation
This is the road SpringSource has been on, first with Cloud Foundry (using AWS EC2 which is based on the Xen hypervisor) and presumably soon on top of VMWare.
#3: IaaS interface, no hypervisor in the implementation
Let’s remember that the utility computing vision (before the term fell in desuetude in favor of “cloud”) has been around before x86 hypervisors were so common. Take Loudcloud as an illustration. They were building what is now called a “public Cloud” starting back in 1999 and not using any hypervisor. Just bare metal provisioning and advanced provisioning automation software. Then they sold the hosting part to EDS (now HP) and only kept the software, under the name Opsware (now HP too, incidentally). That software was meant to create what we now call a “private Cloud”. See this old DCML announcement as one example of the Opsware vision. And no hypervisor was harmed in the making of this movie.
At the current point in time, the hardware (e.g. multiple cores, shared memory) and software (hypervisors, legacy apps) environment is such that hypervisor-based solutions seem to have an edge over those based on automated provisioning/configuration alone. But these things tend to change quickly in our industry… Especially if you factor in non-technical considerations like compliance, fear of data leakage and the risk of having the hardware underlying your application seized because of an investigation involving another tenant…
And this is not going into finner techno-philosophical points about the different types of hypervisors. Not to mention mainframe LPARs… One could build a hypervisor-free IaaS solution on these.
To some extent, you may even put the “pwned” machines (in a botnet) in this “IaaS with no hypervisor” category (with the small difference that what’s being made available is an x86 with an OS, typically Windows, already installed). If you factor out externalities (like the FBI breaking down your front door at 6:00AM) this approach has claims as the most cost-effective form of Cloud computing available today… Solaris zones are another example of possible foundation for a hypervisor-free IaaS-like offering (here too, with an OS rather than a “raw host” as the interface).
#4: PaaS interface, no hypervisor in the implementation
In the public sphere, this corresponds to Google App Engine.
In the private sphere, several companies have built it themselves on top of WebLogic, by adding some level of “on-demand” application provisioning in order to streamline the relationship between the IT group running the servers and the business groups who want to deploy applications on them. Something that one should ideally be able to buy rather than build.
Waiting for the question to become irrelevant
Like most deeply-ingrained confusions, the conflation of virtualization and Cloud Computing won’t be dispelled as much as made irrelevant. The four categories enumerated in this post are a point-in-time view of a continuously evolving system. What may start today as a bundle of a hypervisor, an OS and an app server may become a somewhat monolithic “PaaS engine” over time as the components are more tightly integrated. That “engine” may have memory isolation mechanisms that look a lot like a hypervisor. But it may not be able to host a generic OS. In the same way that whales don’t have fingers and toes and yet they are still very much apparent in their skeleton.
[UPDATED 2009/10/8: A real-life example of #3! On-demand servers via bare metal provisioning (via Sam). No hypervisor in the picture.]
REST-*: good specs, bad branding?
In an earlier post, I argued for standardization of some basic REST-inspired mechanisms for the narrow goal of supporting control interfaces for different forms of Cloud Computing. As I was doing so, I noticed the first report of something called REST-*, introduced by RedHat’s Mark Little and I ended my post by wondering whether we were talking about the same thing or not.
Now that more information has emerged it seems pretty clear that we are not.
Mark Little understands transactions very well. No argument. He is not happy with some aspects of how they are supported over SOAP. Fine. He thinks it can be done better (at least for 80% of the cases and with lower barriers to entry) directly on top of HTTP (no envelope). Fine. He would like this to be standardized so that middleware stacks can interoperate. Fine. Same applies for pub/sub and p2p messaging, the other initial project out of the REST-* effort. All good.
Where it all goes wrong is the attempt to get on the REST bandwagon. REST is not the only proper way to write distributed applications. It’s a good way to do it for a specific (through arguably very large) set of distributed applications. One that may not include financial trading or RFID-enabled inventory tracking. More specifically, REST might not be the appropriate approach for all parts of all distributed applications. Working on smoothly connecting the REST and non-REST parts is interesting. Working on forcing the non-REST parts under the REST mantle less so.
By REST here I mean REST-the-architectural-style (narrowly defined), not REST-the-brand (much more broadly defined). Even if your work does not fall under the umbrella of REST-the-architectural-style, you may choose to position it under REST-the-brand as a pragmatic calculation (like a police department might pragmatically include a plasma TV in the “terrorism preparation” accounting category). In the “pros” category, positioning it as REST gives you instantaneous press coverage. In the “cons” category, it gives you instantaneous twitter coverage (of the kind that Steve Jones reports). All in all, it seems like a bad bargain to me if you want to get things done. But Bill Burke (who works with Mark on this) has chosen to accept it: “I really don’t care in the end if any of the architectural principles of Roy’s thesis are broken as long these requirements are met”. As a side note, the REST-* announcement puts this comment by Bill on Roy’s blog in context…
In any case, the way the proposed umbrella organization is shaping up is also giving me concerns. Less about some nefarious intent than about a certain tone-deafness regarding how it comes across. I am not talking about details such as the REST-* moniker, the fact that http://rest-star.org is just a facade that redirects to http://www.jboss.org/reststar or the fact that their blog feed uses RSS rather than Atom (way to get the REST crowd on your side). Rather I am thinking of statements like “Red Hat, as the founder of REST-*, gets a permanent seat on the board. All other board members must be elected by the overall membership once a year”. Which suggests (probably incorrectly) more arrogance than even Microsoft and IBM combined were able to muster when setting up WS-I (modulo the Sun snub). Speaking of Sun, if the JCP (and Sun’s position in it) is the model that RedHat has in mind it might be helpful to point out to them that Sun invented the language after all…
All in all, the specifications Mark and team have in mind may make perfect sense, but they way they are going about it leaves me highly skeptical.
[UPDATE 2009/9/17: More REST-* skepticism. But it looks like Mark and Bill are taking it in stride, acknowledging a less-than-optimal execution and trying to fix things. I doubt this specific initiative can be salvaged, but I think a lot of the goals are good and need to be realized. Though my intuition is that it is more likely to get done in a piecemeal fashion, distributed between specialized communities (e.g. the Cloud people, the messaging/AMQP people...) who take on, in a very practical way, the portions most relevant to their needs. Whether all the pieces then get pulled together in one place with a nice bow is not important right now.]
[UPDATED 2009/9/18: Changes!]
Monitoringforge.org vs. the ghost of openmanagement.org
There is a new site to promote open source “IT monitoring and network management” solutions. It appears to be an initiative from GroundWork. There are many very interesting open source projects in this area and anything that helps lower the cost of finding and filtering them is good. But when the GroundWork VP of marketing says, in an interview, that “this is the first attempt to really create a neighborhood for all these projects to be represented in that is truly neutral” it brings back to mind the Open Management Consortium, announced in May 2006 and apparently defunct (the WayBack Machine has not been able to snapshot the site since February 2008). Back in the days it billed itself as a way “to help advance the promotion, adoption, development and integration of open source systems /network management software”, which sounds pretty similar.
What happened to it and what will keep monitoringforge.org from meeting a similar fate?
Cloud Data Management Interface (CDMI) draft released
Have you developed “Cloud API fatigue” from seeing too many IaaS “Cloud APIs” lately? Are you starting to wonder how many different ways there can possibly be to launch a virtual machine via an HTTP POST? Are you wondering why everybody else seems to equate Cloud computing with on-demand server instances?
If yes, then CDMI will come as a breath of fresh air. This specification (just a draft at this point) is a rare example of a different beast. Coming out of SNIA, it endeavors to standardize the way storage resources are managed and accessed in a Cloud environment. They call this DaaS (Data storage as a Service).
The specification has two components (which may benefit from being separated in two specifications at some point). One (called “control paths”) is an interface to manage a data storage service. That interface is expected to work across many forms of data storage from block storage (like AWS EBS) to filesystems (e.g. NFS) to object stores with a CRUD interface (similar to the WebDAV volumes of the Sun API). It also mentions a “simple table space storage” storage form, but that part is pretty fuzzy.
The second component of CDMI (called “data paths”) only applies to the CRUD object store and it describes a RESTful interface for accessing it. This figure from the specification does a good job of illustrating the two different APIs in the specification (and the different types of storage envisioned).

One of the most interesting sections in the document describes the way in which the authors envision the ability to export the storage resources provisioned/managed through CDMI to other Cloud APIs. They illustrate it in an example involving OCCI (see also this joint white paper). This is very interesting and another sign that we need a shared RESTful resource control framework for Cloud computing as a first layer of standardization. One of the reasons I used to justify this claim two weeks ago was that “there will not be one API that provides control of [all the different forms of Cloud Computing], but they can share a base protocol that will make life a lot easier for developers. These Clouds won’t be isolated, developers will use them as a continuum.” One week later, this draft specification illustrates the point very well.
[As a somewhat related side note, this interesting post about what it takes to provide a large-scale resilient data service (the Google App Engine data store). And more about the Google File System in general.]
Toolkits to wrap and bridge Cloud management protocols
Cloud development toolkits like Libcloud (for Python) and jcloud (for Java) have been around for some time, but over the last two months they have been joined by several other open source contenders. They all claim to abstract the on-the-wire Cloud management protocols sufficiently to let you access different Clouds via the same code; while at the same time providing objects in your programming language of choice and saving you the trouble of dealing with on-the-wire messages. By focusing on interoperability, they slot themselves below the larger role of a “Cloud broker” (which also deals with tasks like transfer and choice). Here is the list, starting with the more recent contenders:
- CloudLoop (here is the announcement) – focused on storage
- OpenNebula (here is the announcement)
- Dasein (here is the announcement)
- jclouds
- Libcloud
DeltaCloud shares the same goal of translating between different Cloud management protocols but they present their own interface as yet another Cloud REST API/protocol rather than a language-specific toolkit. More along the lines of what UCI is trying to do (not sure what’s up with that project, I recorded my skepticism earlier and am still waiting to be pleasantly surprised).
Of course there are also programming toolkits that are specific to one Cloud provider. They are language-specific wrappers around one Cloud management protocol. AWS protocols (EC2, S3, etc…) represent the most common case, for example amazon-ec2 (a Ruby Gem), Power-EC2Dream (in C# which gives it the tantalizing advantage of being invokable via PowerShell) and typica (for Java). For Clouds beyond AWS, check out the various RightScale Ruby Gems.
The main point of this entry was to list the cross-Cloud development toolkits in the bullet list above. But if you’re in the mood for some pontification you can keep reading.
For some reason, what used to be called “protocols” is often called “APIs” in Cloud settings. Witness the Sun Cloud “API” or the vCloud “API” which only define XML formats for on-the-wire messages. I have never heard of CIM/XML over HTTP, WSDM or WS-Management being referred as APIs though they occupy a very similar place. They are usually considered “protocols”.
It’s a just question of definition whether an on-the-wire protocol (rather than a language-specific set of objects/methods) qualifies as an “Application Programming Interface”. It’s not an “interface” in the Java sense of the term. But I can “program” against it so it could go either way. On this blog I have gone along with the “API” term because that seemed widely used, though in verbal conversations I have tended to stick to “protocol”. One problem with “API” is that it pushes you towards mixing the “what” and the “how” and not respecting the protocol/model dichotomy.
Where is becomes relevant is when you start to see language-specific APIs for Cloud control pop-up as listed above. You now have two classes of things called “API” and it gets a bit confusing. Is it time to bring back the “protocol” term for on-the-wire definitions?
As a developer, whether you’re better off eating your Cloud noodles using chopsticks (on-the-wire protocol definitions) or a fork (language-specific APIs) is an important decision that will stay with you and may come back to bit you (e.g. when the interfaces are versioned). There is a place for both of course, but if we are to learn anything from WS-* it’s that we went way too far in the “give me a java stub” direction. Which doesn’t mean there is no room for them, but be careful how far from the wire semantics you get. It become even trickier when your stub tries not jsut to bridge between XML and Java but also to smooth out the differences between several on-the-wire protocols, as the toolkits above do. The hope, of course, is that there will eventually be enough standardization of on-the-wire protocols to make this a moot point.


