Wednesday, 14 September 2016

Apache Sling Pipes - a tool for extract-transform-load operations

When working with large volumes of enterprise content, you have probably run into situations where some part of the content has to be updated with a particular property, or where a property or node was edited incorrectly and you are asked to find every affected node and correct it. Problems like these become cumbersome as the amount of data grows. The usual approach is to write a Groovy script or some Java code that traverses the content and changes the necessary nodes. To avoid rewriting this kind of one-off code again and again, and to offer a scalable way of dealing with such content issues, the Apache Sling project came up with Sling Pipes.

What is a pipe

It's a tool with which you can load nodes from the content tree, perform some operation on them, and either retrieve an output or modify the nodes. The aim is to provide reusable blocks, called pipes, that can be configured for any operation on content.
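To make the idea concrete, here is a rough sketch in plain Java of what "load nodes, do something, return or modify" could look like. It does not use the Sling Pipes API: Node, Pipe, find, write and then are hypothetical names made up for this illustration, and the content tree is just an in-memory object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class PipeSketch {

    /** Hypothetical stand-in for a content node: a path, properties and children. */
    static final class Node {
        final String path;
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String path) { this.path = path; }

        Node with(String key, String value) { properties.put(key, value); return this; }

        Node child(Node c) { children.add(c); return this; }
    }

    /** Hypothetical pipe: takes one input node and produces output nodes. */
    interface Pipe {
        List<Node> getOutput(Node input);

        /** Chains two pipes: every output of this pipe becomes an input of the next. */
        default Pipe then(Pipe next) {
            return input -> {
                List<Node> result = new ArrayList<>();
                for (Node n : getOutput(input)) {
                    result.addAll(next.getOutput(n));
                }
                return result;
            };
        }
    }

    /** Reader pipe: walks the subtree under the input and keeps matching nodes. */
    static Pipe find(Predicate<Node> filter) {
        return input -> {
            List<Node> found = new ArrayList<>();
            collect(input, filter, found);
            return found;
        };
    }

    private static void collect(Node node, Predicate<Node> filter, List<Node> found) {
        if (filter.test(node)) {
            found.add(node);
        }
        for (Node c : node.children) {
            collect(c, filter, found);
        }
    }

    /** Writer pipe: sets a property on its input and passes the node on. */
    static Pipe write(String key, String value) {
        return input -> {
            input.properties.put(key, value);
            return List.of(input);
        };
    }

    public static void main(String[] args) {
        Node root = new Node("/content")
            .child(new Node("/content/a").with("template", "old"))
            .child(new Node("/content/b").with("template", "new"));

        // Find every node with the wrong value, then correct it.
        Pipe fix = find(n -> "old".equals(n.properties.get("template")))
            .then(write("template", "new"));

        for (Node n : fix.getOutput(root)) {
            System.out.println(n.path + " -> " + n.properties.get("template"));
        }
        // prints "/content/a -> new"
    }
}
```

The point of the sketch is only the shape: small blocks that each take an input and give an output, and that can be chained and reused with different configuration instead of writing a new script for every content fix.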

 getInput  +---+---+   getOutput
           |       |
      +----> Pipe  +---->
           |       |