Any time I work with data that lives somewhere other than the server hosting the Drupal website, there's a high chance web services will be involved.
Where web services fail
Any task that adds latency to the user's experience because of a web service call is undesirable. When users or content are stored externally and the site relies on web services to store data or verify user credentials, there are a few key hooks where the user could notice a slowdown:
- User login - hook_user_login
- User save - hook_user_update/hook_user_insert/hook_user_presave
- Comment creation - hook_comment_insert
- Node creation - hook_node_insert
Take the example of updating an external database with node details every time a node is inserted, and assume a slow connection that takes perhaps 10 seconds per node. Clicking save on the node page would leave the author with 10 seconds of painful waiting. An alternative to running on each node insert would, of course, be to make the database transactions occur on cron.
The default cron time limit is 240 seconds (4 minutes), which in our hypothetical situation allows for 24 nodes to be updated externally (provided no other cron tasks need to run). What happens if we have 30 nodes that are required to be updated?
Cron will time out before it finishes, and in our hypothetical situation the last 6 nodes won't be updated in that run.
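The arithmetic above can be checked with a few lines of code. This is only a back-of-the-envelope sketch in Python, not Drupal code: the function name `cron_capacity` is made up for illustration, and it assumes every node takes exactly the same time and that nothing else runs during cron.

```python
def cron_capacity(time_limit_s, seconds_per_node, pending_nodes):
    """Return (nodes updated, nodes left over) for a single cron run."""
    # How many whole updates fit inside the time limit.
    fits = time_limit_s // seconds_per_node
    updated = min(fits, pending_nodes)
    return updated, pending_nodes - updated


updated, left_over = cron_capacity(240, 10, 30)
print(updated, left_over)  # 24 6
```

With a 240 second limit and 10 seconds per node, 24 of the 30 nodes fit and 6 are left over.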
Now take the example of user login, where the site either authenticates credentials or updates user profiles. Having to query a slow web service and wait for its response can diminish the user experience. Even though such services are supposed to be always on, there is a non-zero probability that one will eventually be unavailable, however good the SLA.
Do users simply get denied authentication, or do their details just not get updated?
The Queue API
A much underrated and lesser-known Drupal API is the Queue API.
The queue system allows placing items in a queue and processing them later. The system tries to ensure that only one consumer can process an item.
By putting all of the tasks we need to process into a queue and working through them one by one, we can ensure they all get taken care of and cron doesn't time out. This means that any hook_cron implementation we expect to take a really long time can be moved into a queue and will still be processed.
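To make the idea concrete, here is a small model of the pattern in Python. It is a sketch of the concept, not the Drupal Queue API: `SimpleQueue` and `run_worker` are hypothetical names, time is simulated rather than measured, and the model leaves out item leasing and persistence. In Drupal 7, for instance, the corresponding pieces would be `DrupalQueue::get()` to obtain a queue and `hook_cron_queue_info()` to register a worker.

```python
from collections import deque


class SimpleQueue:
    """Minimal FIFO queue standing in for a persistent Drupal queue."""

    def __init__(self):
        self._items = deque()

    def create_item(self, data):
        self._items.append(data)

    def claim_item(self):
        # Returns the oldest item, or None when the queue is empty.
        return self._items.popleft() if self._items else None

    def number_of_items(self):
        return len(self._items)


def run_worker(queue, worker, time_budget_s, cost_per_item_s):
    """Process items until the queue is empty or the budget is spent.

    Time is simulated: each item is assumed to cost cost_per_item_s seconds.
    """
    elapsed = 0
    processed = []
    while elapsed + cost_per_item_s <= time_budget_s:
        item = queue.claim_item()
        if item is None:
            break
        worker(item)
        processed.append(item)
        elapsed += cost_per_item_s
    return processed


# 30 nodes are queued when they are inserted, instead of being sent right away.
queue = SimpleQueue()
for nid in range(1, 31):
    queue.create_item({"nid": nid})

sent = []  # stands in for the slow external update
first_run = run_worker(queue, sent.append, time_budget_s=240, cost_per_item_s=10)
second_run = run_worker(queue, sent.append, time_budget_s=240, cost_per_item_s=10)
print(len(first_run), len(second_run), queue.number_of_items())  # 24 6 0
```

The point of the design is that the worker stops when its budget runs out instead of timing out, and whatever is left simply waits in the queue for the next run.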
Thinking outside Drupal
Although Drupal is, of course, the answer to all of life's problems, sometimes it just isn't.
It can be tempting to think of Drupal as the hammer for every nail-like problem. Sometimes, for cases with huge computational requirements, it's best to keep Drupal completely out of the picture. In our example of user details being managed outside of Drupal and updated every time a user logs in, a better method may be to write a script that transfers the data from the external server to the Drupal server. The data can then be imported into a database table and processed by Drupal. Alternatively, the script may update the Drupal database directly.
Not only does this cut down any latency caused by data transfer, but if the web service is inaccessible, the data is still available locally. When service resumes, the updates will resume too, and users can continue logging in regardless of external downtime.
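The shape of such a transfer script can be sketched as follows. This is a minimal illustration in Python under stated assumptions: `sync_profiles`, `load_profile` and the dictionary used as the local store are all hypothetical stand-ins for the real script and database table, and the external service is assumed to signal an outage by raising `ConnectionError`.

```python
def sync_profiles(fetch_remote, local_store):
    """Copy remote profiles into the local store.

    Returns True on success. If the service is down, the local data is
    left untouched and False is returned.
    """
    try:
        remote = fetch_remote()
    except ConnectionError:
        return False
    local_store.update(remote)
    return True


def load_profile(local_store, username):
    """Read a profile at login time from local data only."""
    return local_store.get(username)


def service_up():
    return {"alice": {"mail": "alice@example.com"}}


def service_down():
    raise ConnectionError("external service unavailable")


local_store = {}
sync_profiles(service_up, local_store)    # scheduled run while the service is up
sync_profiles(service_down, local_store)  # later run during an outage
print(load_profile(local_store, "alice"))
```

Because login only ever reads the local copy, an outage delays updates but does not block users.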
These are but two ways of thinking outside Drupal. There are numerous other novel ways to work with/around Drupal and web services. It's just up to the site administrator to find the best way that suits the problem at hand.