While at the UC, you can learn more about these APIs at the Server Road Ahead sessions, the EDN sessions, the Advanced ArcGIS Online sessions and also at the Java SIG.
Monday, June 18, 2007
Tuesday, June 5, 2007
You have your business engine all set up. Your processing algorithms have been optimized to the hilt. Your data model is as scalable as any. And now you are publishing your data to the WWW. Well, the good news is you can still do more - all with plain old HTTP.
Herein I list 3 simple ways you can leverage HTTP to help you better serve your content:
Cache-Control is by far the most widely used of all HTTP headers, and for good reason. You generate your response and you set a Cache-Control response header with a validity period. The clients, the intermediaries and the web infrastructure at large all work overtime for you, caching your content until the time that you have tagged it valid.
This of course works best for static resources or for such dynamic resources whose validity you can reasonably predict before hand.
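The idea above can be sketched in a few lines of Java. The helper name and the servlet usage shown in the comment are illustrative assumptions, not from any particular framework:

```java
// Sketch: building a Cache-Control value for a response whose validity
// period you can predict up front. Purely illustrative names.
public class CacheControlDemo {
    // "public" lets shared intermediaries cache it too; max-age is the
    // validity period in seconds.
    public static String cacheControlFor(long maxAgeSeconds) {
        return "public, max-age=" + maxAgeSeconds;
    }

    public static void main(String[] args) {
        // In a servlet you would do something like:
        // response.setHeader("Cache-Control", cacheControlFor(3600));
        System.out.println(cacheControlFor(3600)); // public, max-age=3600
    }
}
```

Once that header is set, every cache between you and the user can answer repeat requests without touching your server.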
This is one of the most powerful but, unfortunately, not-frequently-used techniques. So your content is such that you can't reasonably predict its validity period, which means #1 doesn't work for you. Your next best buddy is the ETag.
This is how it works: You generate your content and you set the HTTP header ETag (entity tag). The ETag represents the state of your resource. Even if one bit of the resource content changes, so does its ETag. You can think of the ETag as a simple hash of your content. Ok, so you have set the ETag and sent the response. The next time the client tries to access the same URL, it will send an If-None-Match request header set to the same value as that of your ETag. Now you can either regenerate the content or you may have it cached on your server; if the ETag of your content matches the If-None-Match, it implies that the content has not changed. The client has indicated to you thru the If-None-Match that it already has this content. So what do you do - send nothing! Yes - simply set the response status to HTTP 304 (Not Modified) and the size of the content that you send this time is exactly 0. At a minimum, you gain in saved bandwidth (read: performance), but if you have cached the content on your server, you also gain from saved computation.
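The ETag / If-None-Match handshake described above can be sketched as a pair of helpers. The hash choice (MD5) and the method names are illustrative assumptions - any digest that changes with the content will do:

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the ETag handshake: hash the body to tag the resource state,
// then compare the client's If-None-Match against the current tag.
public class EtagDemo {
    // A simple ETag: a hash of the response body, quoted per HTTP.
    public static String etagFor(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return "\"" + new BigInteger(1, digest).toString(16) + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present
        }
    }

    // True when the client's If-None-Match matches the current ETag,
    // i.e. you can answer 304 Not Modified with an empty body.
    public static boolean notModified(String ifNoneMatch, String currentEtag) {
        return ifNoneMatch != null && ifNoneMatch.equals(currentEtag);
    }
}
```

On each request, compare the incoming If-None-Match value against etagFor(body); when notModified(...) returns true, set status 304 and send no body at all.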
With #1 and #2 you benefit by not having to resend your content in certain situations. But even when you can't get away from having to send content, you can still gain in bandwidth by simply compressing it. But of course you want to compress content only for clients that you know can decompress it and even then you need to tell the client that the content you sent is compressed and it needs to decompress it.
No hassle - HTTP makes it fairly straightforward. If the client understands gzip, it sends an Accept-Encoding request header with the string gzip in it. If you (the server) read this header and find the string gzip, you gzip your content and set the response header Content-Encoding: gzip. This tells the client that the content is gzipped and it must decompress it before providing it to the user.
It's normal for gzip to compress text content by upwards of 70%, and given how easy it is to compress content, you should be compressing yours right about now.
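The negotiation above boils down to two small steps, sketched here with java.util.zip; the helper names are illustrative:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Sketch: compress only when the client's Accept-Encoding advertises gzip.
public class GzipDemo {
    public static boolean clientAcceptsGzip(String acceptEncoding) {
        return acceptEncoding != null && acceptEncoding.contains("gzip");
    }

    // Gzip a response body; the server must then also send
    // "Content-Encoding: gzip" so the client knows to decompress it.
    public static byte[] gzip(byte[] body) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(body);
        }
        return buf.toByteArray();
    }
}
```

If clientAcceptsGzip(request header) is false, send the body uncompressed and omit Content-Encoding.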
Note that #2 + #3 put together have problems in IE 6 and you might have to take that into account.
But all in all, optimizing the delivery of your content with HTTP is simple yet powerful, and your web application can only benefit from it.
Friday, May 25, 2007
ETag / If-None-Match gives you the benefit of not having to send unmodified content repeatedly (HTTP 304). IE6 handles this well.
gzip gives you the benefit of compressing the content that you send. IE6 handles this well too.
gzip + If-None-Match should give you the combined benefit of sending compressed content when you must and not sending content at all if it's not modified. Well, as you might have already guessed, IE6 does not handle this well. If your content is gzipped and you send an ETag header as well, IE6 does not send an If-None-Match on subsequent requests - which of course means that you can't leverage HTTP 304.
So if you are servicing IE6 clients, beware that it supports either compression or ETags but not both.
Thankfully, this has been fixed in IE7. Firefox of course just works.
Friday, May 18, 2007
Link: Sam Ruby on Google Maps
Of course, the rest of the iceberg was that Google had simply tiled the Earth. In so doing, they converted a single web service (call me with a bunch of information, and I will provide you with a custom result) into a large number of individually addressable, cacheable, and scalable web resources. That's it. The more the resources, the more the opportunity to use Cache-Control headers, to use ETags, and to distribute and load-balance the system.
In the same article, Sam also talks about how the web is not a service but a space. And in today's world, adding more "space" will scale your system far more than implementing a state-of-the-art service with the most optimal algorithm. Processor speeds have flattened. Today it's about dual-cores, quad-cores, (your-budget)-cores. The more cores your program can use to get the job done, the more scalable your system will be.
As Brian Goetz puts it: "Tasks must be amenable to parallelization". Parallelization comes for free with every resource you add. So add more resources and see your system scale.
Sunday, May 13, 2007
Day 1 was primarily scripting; day 2 was hardcore Java. Days 3 and 4 saw sessions on the entire gamut of Java technologies - from garbage collection to mashups. I discuss some of them here.
Blueprints for Mashups
This was arguably the most informative session for me by far at this year's JavaOne. Kudos to Greg, Mark and Sean for putting together a to-the-point, practical and readily-usable session. Personally, this session validated the REST and JSON concepts I had gathered over the past few months. I'll recommend this session to anybody interested in building mashups, REST services, AJAX and a whole lot more. (No, they haven't paid me to say this.)
- Server-side proxy: Clients always send requests to the server that is hosting the page. The server in turn acts as a proxy, sends your request to the (remote) mashup service and returns the data it gets from the mashup service to you.
- Dynamic script tags: Browsers allow script tags to communicate with cross domain servers. This opens up the opportunity for you to issue requests to any mashup service by generating dynamic script tags.
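For the server-side proxy approach, one detail worth sketching is that the proxy should forward requests only to known mashup services, or it becomes an open relay. This guard is a hypothetical illustration, not from the session:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical whitelist for a server-side mashup proxy: only forward
// to hosts we have explicitly registered.
public class ProxyGuard {
    private final Set<String> allowedHosts;

    public ProxyGuard(String... hosts) {
        this.allowedHosts = new HashSet<String>(Arrays.asList(hosts));
    }

    public boolean isAllowed(String targetUrl) {
        try {
            String host = new URI(targetUrl).getHost();
            return host != null && allowedHosts.contains(host);
        } catch (Exception e) {
            return false; // malformed URL: refuse to proxy it
        }
    }
}
```

The proxy servlet would check isAllowed(...) before opening a connection to the remote mashup service and streaming its response back.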
Various options are available for securing your REST services:
- User tokens
- Session based hash
- URL based API key
- Authentication headers
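As a hypothetical illustration of the "URL based API key" option above, the key can travel as a query parameter and be extracted server-side; the parameter name apikey is an assumption:

```java
// Hypothetical extraction of an API key carried in the query string,
// e.g. /service?q=redlands&apikey=abc123
public class ApiKeyCheck {
    public static String extractKey(String query) {
        if (query == null) return null;
        for (String pair : query.split("&")) {
            if (pair.startsWith("apikey=")) {
                return pair.substring("apikey=".length());
            }
        }
        return null; // no key supplied
    }
}
```

The extracted key would then be matched against the keys you have issued before serving the request.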
- Use namespaces
- Use CSS for applying styles
- Setting the innerHTML property is easier / better than DOM manipulation
- A server-side service
- A client-side CSS for applying styles
- Document the API
- Create simple examples enabling a simple cut-and-paste approach to learning your API
If there was one objective of this session, it was to clear up the GC myths out there. Some of the finest Java minds were refuting the urban legends, and when they talk, you listen:
- Object allocation is cheap. Reclaiming young objects is also cheap.
- Small, short-lived immutable objects are good. Large, long-lived mutable objects are bad.
- Nulling references rarely helps - except when it comes to arrays.
- Avoid finalizers - in most cases there are better alternatives possible.
- Avoid calling System.gc() - except between well-defined application phases and when the load on your system is low.
- Object pooling is not required in today's VMs. It is a legacy of older VMs. Exceptions are objects that are expensive to allocate / initialize or objects that represent scarce resources.
- Consider using reference-objects for limited interactivity with the garbage collector.
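The "nulling helps for arrays" point above deserves a concrete example (a well-known idiom, not taken from the session itself): a class that manages its own array must clear slots it no longer logically owns, or the garbage collector cannot reclaim the popped objects:

```java
import java.util.Arrays;

// A stack backed by an array it grows itself. Without the nulling line
// in pop(), every popped object would stay reachable via elements[].
public class SimpleStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    public void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size);
        }
        elements[size++] = e;
    }

    public Object pop() {
        if (size == 0) throw new IllegalStateException("empty");
        Object result = elements[--size];
        elements[size] = null; // drop the stale reference for the GC
        return result;
    }

    public int size() { return size; }
}
```

In ordinary code that doesn't manage its own storage, letting variables fall out of scope is all the "nulling" you need.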
Certain memory-leak pitfalls:
- Objects in wrong scope
- Lapsed listeners
- Metadata mismanagement
They strongly advocated FindBugs for finding such pitfalls as well as other potential bugs.
Generics and Collections
Generics have been around for a while and most Java programmers have some understanding of them. There have been many criticisms of the erasure-based implementation of generics - since parameter types are not reified, constructs that require type information at runtime don't work well. However, erasure allows two very important benefits - migration compatibility and piecewise generification of existing code. Just these benefits make erasure a necessary evil.
Huge additions have been made to the Collections framework in Java 5 and 6. So much so that many recommend that programmers never use arrays - Collections all the way!
The concurrency classes introduced in Java 5 are a great new toolset for the Java programmer. New concurrency-related annotations such as @GuardedBy, etc. are being considered. One important aspect to keep in mind - imposing locking requirements on external code is asking for trouble. Ensure that you make your code threadsafe yourself. Performance penalties incurred by synchronized constructs are overblown.
Immutable objects are your friends - final is the new private! They are automatically threadsafe. Object creation is cheap. Aim for less and less mutable state.
Performance is now a function of parallelization - write code such that it can use more cores to get the job done faster. So to improve scalability find serialization in your code and break it up:
- Hold locks for less time
- Use more than one lock
- Consider using ThreadLocal for heavyweight mutable objects that don't need to be shared
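The ThreadLocal suggestion above can be sketched with the classic SimpleDateFormat case - a mutable, non-threadsafe object that each thread gets its own copy of, so no locking is needed. The class name here is illustrative:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Each thread lazily gets its own SimpleDateFormat instance, so the
// formatter can be used without any synchronization.
public class PerThreadFormat {
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            new ThreadLocal<SimpleDateFormat>() {
                @Override
                protected SimpleDateFormat initialValue() {
                    return new SimpleDateFormat("yyyy-MM-dd");
                }
            };

    public static String format(Date d) {
        return FORMAT.get().format(d); // no lock held, no contention
    }
}
```

Compare this with a single shared, synchronized formatter: the ThreadLocal version removes the lock entirely at the cost of one instance per thread.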
Builder pattern allows you to construct objects cleanly in a valid state with both required and optional parameters. The basic premise can be expressed in code as such:
MyObject myobj = new MyObject.Builder(requiredParam1, requiredParam2)
        .optionalParam1(opt1)
        .optionalParam2(opt2)
        .build();
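A minimal Builder that makes the line above compile might look like this; the field names and String types are illustrative assumptions:

```java
// Builder pattern sketch: required parameters go through the Builder
// constructor, optional ones through chained setters; the object is
// constructed once, fully initialized, in build().
public class MyObject {
    private final String required1;
    private final String required2;
    private final String optional1;
    private final String optional2;

    private MyObject(Builder b) {
        this.required1 = b.required1;
        this.required2 = b.required2;
        this.optional1 = b.optional1;
        this.optional2 = b.optional2;
    }

    public String describe() {
        return required1 + "," + required2 + "," + optional1 + "," + optional2;
    }

    public static class Builder {
        private final String required1;
        private final String required2;
        private String optional1 = "";
        private String optional2 = "";

        public Builder(String required1, String required2) {
            this.required1 = required1;
            this.required2 = required2;
        }
        public Builder optionalParam1(String v) { optional1 = v; return this; }
        public Builder optionalParam2(String v) { optional2 = v; return this; }
        public MyObject build() { return new MyObject(this); }
    }
}
```

Because the only constructor is private and takes the Builder, the object can never be observed half-initialized, and it stays immutable after build().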
Generics - avoid raw types in new code. Don't ignore compiler warnings. Annotate local variables with @SuppressWarnings rather than the entire method / class. Use bounded wildcards to increase the applicability of APIs. 3 take-home points for parameterized methods:
- Parameterize using <? extends T> for reading from a Collection
- Parameterize using <? super T> for writing to a Collection
- Parameterize using <T> for methods that both read and write
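The three rules above can be seen in a single copy method: the source is only read from, so it takes <? extends T>; the destination is only written to, so it takes <? super T>:

```java
import java.util.List;

// The "producer extends, consumer super" idiom in one method:
// src produces Ts (read-only here), dst consumes Ts (write-only here).
public class Wildcards {
    public static <T> void copy(List<? super T> dst, List<? extends T> src) {
        for (T item : src) {
            dst.add(item);
        }
    }
}
```

The wildcards widen the method's applicability: you can copy a List<Integer> into a List<Number> or List<Object>, which a plain copy(List<T>, List<T>) signature would reject.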
Java EE 5 Blueprints
Filter.doFilter() may run in a different thread than the Servlet.service() method. The Servlet API does not use NIO; however, it is possible to implement it yourself. New annotations may make deployment descriptors optional.
The JSP and JSF EL have been unified. # is now reserved in JSP 2.1. The javax.faces.ViewState hidden field (for client-side state) will be standardized; include this field in your AJAX postback. Application-wide configuration of JSF resource bundles will be possible in faces-config.xml. The <f:verbatim> tag is no longer needed to interleave HTML and JSF content. @PreDestroy annotations will be supported for JSF managed beans.
WADL - Web Application Description Language. As a colleague of mine put it - it's WSDL for REST! And IMO that's almost what it is. WADL comes with good intent - a formal definition of your REST resources allows you to automatically stub out code in your favorite language and aids testing. However, WADL has many rough edges - overloaded / loosely typed query parameters, security, non-standard content negotiation, et al. - and I don't see myself using it anytime soon until those have been addressed.
Tuesday, May 8, 2007
Nothing to "Wow!"
If you like showtime, there was none. No big announcements, no new features, no gennext projects. But IMO that was a good thing. Java has seen many evolutions, many innovations. It's time to make it just a little better, a little more robust, a little more performant. Java is a mature technology now and any radical change might actually be shaky news for the massive Java community.
Oh so Groovy
As I mentioned - scripting languages were the talk of the day. And Groovy is one of them. It has been around for a while, and the rich set of features it offers is testimony to that - Java-like syntax, pre-compiled or runtime compilation, "duck typing", annotation support and a whole pageful of other features.
I wonder what has led to this sudden wave of dynamic languages. Wasn't it only yesterday that everything-should-be-strongly-typed was the hallowed rule of programming? Maybe it's the advent of mashups. Maybe the uptake in AJAX. Or maybe we just need something new to keep us interested!
Web as a platform
"Integrated rich clients" - or what we know as mashups - are the killer apps of the day. Things (technologies, techniques, hacks) that allow for mashups to be assembled with minimal code will be on the victory parade. More code is executed on the client but most of that code is still sent to the client by the server. No wonder scripting is big today and getting bigger. But what I do wonder is where does that leave EJBs / ESBs?
JSR 311 - Java REST API
I had questioned the need for frameworks to build RESTful services earlier. And after having sat thru a session explaining the new Java REST API spec, I am no closer to getting an answer. Ok, so they showed a few annotations which got rid of a few lines of code. But does the same simplicity hold if I have even a slightly complicated URI pattern? Or will it work if I don't use the Accept header for content negotiation and use query params or path extensions instead? And from what I saw, it seemed they were generating XML and JSON from Java code. Aren't JSPs (or any template technology) a whole lot easier for that?
I got Closure
Neal Gafter gave a presentation on Closures in Java. Very well done, no gimmicks, to the point. A great presentation for a potentially very powerful feature in the Java language. If Neal needs more voices to move this up the JCP and into Sun's plans for Java SE 7, count me at +1.
Take home point
Like 'em or hate 'em - scripting is in town.
Saturday, April 28, 2007
Many REST frameworks such as Restlets and simple web are gaining popularity of late. There's also a new JSR to give Java developers a new API to build RESTful web services. It's quite natural that as more people "get" REST they look for ways to simplify building their next REST web application / service.
And while I admittedly have yet to fully "get" REST, whatever I have got so far is certainly not by using frameworks that let me implement REST by writing POJOs, but rather by extensive use of the POW (Plain Old Web) and POGS (Plain Old Google Search) - and by applying the principles I learned using POS (Plain Old Servlets).
This is not a criticism of these frameworks. Maybe I still haven't understood their role. And maybe once I do understand their goodness, I will start using them myself.
But my question is this - why are servlets not good enough? Sure they have their limitations. But rather than have yet another framework or a brand new API, why not have a JSR to fix the servlets and JSPs themselves? (A good start would be to not enable sessions for JSPs by default.) After all isn't it the convenience of using frameworks galore that has kept the larger community from understanding the goodness of HTTP? Isn't it the same convenience that has made it a common practice to use (bloated) sessions?
You have to wet your feet to tread the waters. You have to get your hands dirty in HTTP to implement REST.
Thursday, April 19, 2007
Stefan Tilkov and Sean Gillies respond to my previous post about modelling operations in REST. There's also an interesting discussion on this on the rest-discuss yahoo group.
First of all, I think my example wasn't a good one. While the operations I had in mind were operations alright, they weren't state-changing ones - yet I realize that my example definitely implies changing state. And I myself would advocate a PUT, or at worst a POST, in that case. The case I am making is for operations that don't change state but are like queries on a given resource.
I agree with both Stefan and Sean. "Resources, not Objects," as Sean puts it. In the REST world, a person does not walk but s/he reaches a location. And if I were designing a REST API for a new system, I would most certainly use that approach.
But if I have existing APIs or SOAP web services such that verbs like walk are firmly instilled in the verbiage of my user community, it might be a difficult proposition for me to suddenly introduce a new vocabulary for the same set of operations. The user community sees them as operations and not in terms of the resulting resources (location). Legacy wins over technical correctness.
Tuesday, April 17, 2007
A common way of designing a REST system is to recognize the resources and associate a logical hierarchy of URLs to them and perform the traditional CRUD operations by using the well-known HTTP methods (GET, POST, PUT and DELETE).
However, many systems have operations which don't quite fit into the CRUD paradigm. Say I have a repository of persons. And say I have an id associated with every person, which allows me to have a nice URL for each person:
http://example.com/persons/1
This URL obviously GETs me the details pertaining to person 1. Now what if I wanted this person to walk? I can think of 3 approaches to designing this scenario:
- Operations as resources: Which means I can have URLs such as http://example.com/persons/1/walk. But operations such as walk and talk are obviously not resources, and designing it this way might be considered unRESTful(?).
- Operation as a query parameter: This leads to URLs such as:
http://example.com/persons/1?operation=talk
A drawback of this approach comes to the fore if you need other parameters to perform an operation. So, for instance, if you had to specify what the person should say when s/he talks, you'll end up with URLs such as these:
http://example.com/persons/1?operation=talk&what=Hello
As you can see, with this approach you end up having resources with overloaded operations and parameters. And you have to be aware of these combinations yourself and also explain them to your users.
- Matrix URIs: Specify the operations using matrix-like URIs. With this, your URLs look as such:
http://example.com/persons/1;talk
And you can specify the operation parameters using the traditional query string:
http://example.com/persons/1;talk?what=Hello
With this approach you have 3 distinct ways of representing 3 different things - slashes (/) for hierarchical resources, semi-colons (;) for operations and query strings (?param1=val1&param2=val2) for parameters.
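On the server side, the three delimiters would have to be split apart by hand, since most web frameworks don't parse matrix URIs natively. A hypothetical sketch, not from the original post:

```java
// Splits a matrix-style request into resource, operation, and query,
// e.g. "persons/1;talk?what=Hello" -> ("persons/1", "talk", "what=Hello").
public class MatrixUri {
    public final String resource;  // hierarchical part before ';'
    public final String operation; // part after ';', or null
    public final String query;     // part after '?', or null

    public MatrixUri(String pathAndQuery) {
        String path = pathAndQuery;
        int q = pathAndQuery.indexOf('?');
        if (q >= 0) {
            path = pathAndQuery.substring(0, q);
            query = pathAndQuery.substring(q + 1);
        } else {
            query = null;
        }
        int semi = path.indexOf(';');
        if (semi >= 0) {
            resource = path.substring(0, semi);
            operation = path.substring(semi + 1);
        } else {
            resource = path;
            operation = null;
        }
    }
}
```

A dispatcher could then GET the resource when operation is null, and invoke the named query-like operation otherwise.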
Many questions. Any answers?
Sunday, April 15, 2007
The ArcGIS Server Java ADF supports accessing the ArcGIS Server over the internet (http) as well as locally (dcom). Internet access uses SOAP. Local access can work with the server objects directly over DCOM or you can also issue SOAP calls over DCOM.
For local access, the ADF gives preference to SOAP over DCOM while it accesses the ArcObjects directly over DCOM only when the functionality is not available thru SOAP. There are 2 primary reasons why SOAP / DCOM is preferred to ArcObjects / DCOM:
- Performance
- Code reuse
When you work with ArcObjects / DCOM, the server gives you access to remote ArcObjects proxies (such as ILayerDescription, etc.). Every method call on these proxies is a remote method call - so methods such as ILayerDescription.getID() are remote method calls.
On the other hand, when you work with SOAP / DCOM, only the methods defined in the WSDLs are remote calls. Once you have made those remote calls, you get access to local value objects (such as LayerDescription, etc.) - so methods such as LayerDescription.getLayerID() are local method calls.
As you can infer, ArcObjects / DCOM elicits more "chattiness" with the server than SOAP / DCOM. And the reduced number of remote calls in case of SOAP / DCOM obviously translates to better performance over the lifetime of your application.
If you look at the various functionalities supported by the ADF, such as AGSGeocodeFunctionality, etc., the same functionality classes are used for both internet and local access. This was possible because these functionalities were implemented by using the SOAP interface to the server. The transport is HTTP in the case of internet access and DCOM in the case of local access, but the code remains the same, allowing us to reuse the same functionality implementation in both cases.
Capabilities such as the EditingTask which are not available with SOAP have obviously been implemented by using ArcObjects / DCOM.
In summary, if your functionality can be implemented by using the SOAP interface you should use it. The richness of ArcObjects / DCOM is of course always available to you in cases where SOAP does not suffice.
Friday, April 13, 2007
There is a wealth of material out there on REST but very little that actually explains it succinctly enough for you to, well, pitch it to your manager in the elevator. Looks like someone has tried to do that and done a very good job at it:
Link: REST: the quick pitch
With REST, every piece of information has its own URL.
I'll use some of David's material myself and highlight the key REST concepts as bullet points:
- [Of course] Everything is a URL: And what does that mean? Immediately all your information is readily accessible to everyone. It is cache-able, bookmark-able, search-able, link-able - basically it's intrinsically web enabled.
- Think resources: With REST it helps if you design your system as a repository of resources. Not services. Not as a data provider - but resources.
- URLs matter: You might argue that if it's machines that are calling into my REST resources, how does the niceness of URLs matter? Well, given that URLs point to representations of resources, and representations can be human-readable text formats or browser-readable HTML, your REST URLs are no longer just a privilege of machines. So URLs matter. Avoid query parameters as much as possible. You have a better chance of being indexed by search engines if you avoid 'em. Your implementation becomes easier. Refactoring is smoother.
- POSTs are ok: In the ideal world all HTTP clients and servers would allow PUT and DELETE. But the world doesn't come to a standstill without these methods. Many have done just fine using POST and so would you.
- Requesting content type in URLs is also ok: Again, in the ideal world, clients and servers could do content negotiation. And again, many have done just fine by specifying the format in the URL path or as a query parameter and so would you.
- Consider JSON: JSON is simple. Parsing JSON is simpler. You don't even need to parse it if you are consuming it in a browser. You still want to serve XML given the huge support for it but JSON support is spreading every day and you'll benefit if you're a part of it.
- Use HTTP as a platform: HTTP is not just a protocol. It's a platform. It already provides services such as caching, security (of course more could be done here), standardized status codes - benefit from them.
Is that all to it? Hardly. There's literally a whole science behind it. But that will do for now.
Thursday, April 5, 2007
Like many others, inspired by Pete Lacey's S Stands for Simple, late last year I began to look into REST and, by extension, into HTTP, status codes, web caching, et al. In a nutshell, I went back to the basics and discovered the wealth that I had turned a blind eye to, what with the latest and greatest frameworks "abstracting out" what constitutes the web from me.
Suddenly the stateless nature of HTTP transformed from being a limitation to a virtue. The status codes weren't just numbers but a means of communication (in some cases even the lack of it - 304, anyone?). Caching wasn't something I needed to build but something I needed to learn how to use (ETag, If-None-Match, Cache-Control headers, what not). Ditto with security. URLs ceased being just names - they are a language.
It took all of that for me to realize that it's not the next WS* standard that will help me develop the next state-of-the-art web service but it's the existing goodness in HTTP, it's what makes the web work today, it's what brought you to this page and what enabled me to publish this page to the world.
Having relatively recently discovered REST, I find it simple and natural. Simple is good. Natural is good. It uses the existing web / HTTP infrastructure not merely as a protocol but as a platform. And it fits into this Web 2.0 thingy to a tee: issue AJAX requests (actually it's more like AJAX without the X), receive JSON responses. There's your secret sauce for building mashups.
This is not to say that I suddenly shunt everything that is SOAP and just do REST all the way. Far from it. SOAP has served me very well and I like it and I'll continue to use it. Something that lets me use pure Java / .NET while working with a piece of software half a world away from me is too precious to be ignored.
I believe that SOAP and REST are not contradictory but complementary. They have their own usages and users, and they will coexist. And I'll continue to use them both as per my application needs. Horses for courses.
I rest my case.