Showing posts with label Sling. Show all posts

Thursday, 12 November 2015

Finding out what components my page is using

There are a few ways to do this. One way is to go into the Felix console at http://localhost:4502/system/console/bundles, go under Main, and click the "Sling Resource Resolver" link. Enter the URL you want to know more about and press Enter. It will give you a list of possible pages the URL resolves to. Open CRXDE or the Eclipse plug-in, find the page, and from there find where the page or template is located.


A quicker way is to add ?debug=layout to the end of the url of the page in question. It will display where every component on your page is located.

URL Mapping and Deep Linking

A strength of Sling is the automatic and predictable way that resources are structured and exposed in a logical, ordered manner. For example, AEM by default allows page access through the /content/ directory. A typical page URL would be accessed with a browser visit to /content/projectname/pagename/subpagename.html. The minimal effort in maintaining this structure has its benefits, but changing that structure to fulfill business needs can be troublesome. Thankfully, AEM provides a number of tools and techniques to do so. This text will explore some of those options.
Root Mapping: The most common change is the need to hide the /content/ node or at least offer top-level pages at the root level. So, instead of a user typing “http://mycompany.com/content/myproject/home.html,” they should be able to type “http://mycompany.com/” to reach the home page.
Normally, in a production environment, this is configured in a front-facing web server like Apache, with some configuration in the httpd.conf file using mod_rewrite. But it can also be done in AEM with some OSGi configuration:
Open Tools
Open OSGi Console
Search for and open Day CQ Root Mapping
Change the Traget (sic) Path property to wherever you'd like the redirect to point
Tip 1: If you like the old CQ5.5 dev landing page compared to AEM5.6, change projects.html to welcome.html
Tip 2: Other mechanisms for remapping URLs, which are discussed in this article, will break author instances because they interfere with things like authentication. Use the method above, which is safe, to remap the root node
Page Redirects: Sometimes pages are meant to simply be ways of grouping subpages, and not locations that a user should land on directly. These kinds of pages can be set up with the redirect component. A good example is the /content/geometrixx sample page.
To recreate this page redirect, either set it up when creating the page in AEM’s site builder (under New Page -> Advanced -> Redirect), or just change the properties of that page in CRX:
expand the content/{page} node
Click the ./jcr:content node
Change the sling:resourceType property to be foundation/components/redirect
Add a new String property called redirectTarget
Give redirectTarget a value of the new page (e.g. /content/myproject/mypage/mysubpage)
Tip 1: If a site is being built in one language, there’s no need to mimic the /en/ structure shown in the geometrixx example.
Tip 2: The sling extension (e.g. .html) will automatically carry over to the redirect page, and shouldn’t be specified
sling:Mapping Redirects: AEM’s most powerful URL-manipulation feature, a set of redirect components, is often overlooked because application servers don’t traditionally have this ability. But CQ’s built-in Vanity URL support, namespace mangling, and internal tools all make use of this ability. A custom application can too, for instance by remapping all /content requests to /content/myproject:
In CRX, navigate to the /etc/map node.
Open (or create) the http node of type sling:Folder
Create a node called content of type sling:Mapping
Add a property called sling:match and give it a value of localhost.4502/content/
Add a property called sling:internalRedirect and give it a value of /content/myproject
Now navigate to http://localhost:4502/content/mypage.html. The browser will display the URL that was entered, but internally AEM will be referencing the mapped directory structure.
Another way to test how CQ is handling mapping is by opening the Sling Resource Resolver tool. Log in with normal admin credentials (the default being admin/admin), enter http://localhost:4502/content/mypage.html into the test box, click Resolve and examine the “path” property that is output below the test box.
Problem: Remapping all of /content will also remap /content/dam, breaking current links
Solution: Override that particular mapping to point to itself
Within CRX, navigate to the /etc/map/http node.
Create a node called content_dam of type sling:Mapping
Add a property called sling:match and give it a value of localhost.4502/content/dam/
Add a property called sling:internalRedirect and give it a value of /content/dam/
This node will point to itself, and override the above /content/ mapping. The mapping rule resolution I believe is determined by string length and can be manipulated with sling:OrderedFolder. If readers have experience with this, please add comments below. It hasn’t been necessary to experiment with this feature for my projects.
The above problem will be more apparent if, instead of /content/ being mapped to the content directory, the root directory is also mapped.
Tip: Do not remap the root directory using this technique. There is a better technique above
While I recommend against remapping the root directory on author instances, as it will break things like authentication and CRX, it can still be done. In which case, the following directories also need to be remapped to point to themselves:
/apps/
/content/dam/
/etc/
/home/
/libs/
/tmp/
/var/
This should allow most applications and features of AEM to still work. But some monitoring of the app should take place to make sure other locations are not inappropriately being remapped.
Tip: If there’s a question about which URLs are being accessed by browsers or CRX, a tool like Fiddler can monitor and report on it.
Dynamic Redirects: Thankfully, the sling:match property is a regular expression, and the sling:internalRedirect property can reference its capture groups, so URL mapping can be more dynamic. For example, top-level links can be remapped to strip off the “.html” extension.
Within /etc/map/http
Create a new node named posts of type sling:Mapping
Add a property called sling:match with value localhost.4502/posts/([^/]+)$
Add a property called sling:internalRedirect with value /content/myproject/posts/$1.html
This will allow posts to be accessed without an extension. There are other ways of performing this particular task, such as adding a GET.jsp entry to the components, but mapping redirects are the best way to not interfere with the rest of the Sling resolution logic.
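To see what the two property values above do to a request, here is a minimal sketch that replays them with plain java.util.regex. It is not Sling's resolver: the class name PostsMappingSketch, the map method, and the simplified "host:port/path" input string are assumptions made for illustration only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: simulates a single sling:Mapping entry with java.util.regex.
final class PostsMappingSketch {

    // Returns the internal path if the request matches, or empty if the mapping does not apply.
    static Optional<String> map(String slingMatch, String internalRedirect, String hostAndPath) {
        // Assumption: the match is anchored at the start of the "host:port/path" string.
        Matcher m = Pattern.compile("^" + slingMatch).matcher(hostAndPath);
        if (!m.find()) {
            return Optional.empty();
        }
        // $1 in the redirect is replaced by the first capture group of the match.
        return Optional.of(m.replaceFirst(internalRedirect));
    }

    public static void main(String[] args) {
        String match = "localhost.4502/posts/([^/]+)$";
        String redirect = "/content/myproject/posts/$1.html";
        System.out.println(map(match, redirect, "localhost:4502/posts/my-first-post"));
        System.out.println(map(match, redirect, "localhost:4502/posts/a/b"));
    }
}
```

The second call returns an empty result because the pattern allows only one path segment after /posts/ before the end of the string.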
Deep Linking: When taking advantage of dynamic sling:Mappings with regular expressions, URL deep-linking can be implemented without the use of URL parameters or javascript. In some cases, SEO (Search Engine Optimization) experts will advise that sub-pages be reached from URLs and get loaded on the initial request. Other needs like analytics engines might preclude the use of hash tags or URL parameters. In this case, AEM can leverage URL mappings and selectors to pass information to a component and preload content.
In the posts node above, for instance, we can instead change the mapping from going to post sub pages to the posts page directly. Here, assume there is some component inside the post page that loads properties from its child page. But it needs to know which child page is being accessed.
Add a property called sling:match with value localhost.4502/posts/([^/]+)$
Add a property called sling:internalRedirect with value /content/myproject/posts.$1.html
The mapping above will take a request to /posts/child-page-name, and remap it to /posts.child-page-name.html. Essentially, this is the same as a posts.html call, with the addition of a component now being able to use the sling Resource API to extract that selector. The code within such a component would look something like this:
String defaultPostName = "";
// Default the post based on page selectors,
// e.g. posts.{post-name}.html
String selectorPath = currentPage.adaptTo(Resource.class).getResourceMetadata().getResolutionPathInfo();
// The path info looks like ".{post-name}.html", so splitting on "."
// yields ["", "{post-name}", "html"].
String[] selectors = selectorPath.split("\\.");
if (selectors.length > 2) {
    defaultPostName = selectors[1];
}
Tip 1: The dollar sign ($) above tells the regex “this is the end of the string.” AEM therefore won’t map any URL with additional characters in the URL. However, URL parameters and hash tags will be ignored for the purposes of mapping, and don’t need to be considered
Tip 2: The technique demonstrated above will extract selectors from a page. Multiple selectors can be stacked with additional /([^/]+) references in the sling:match property, and $2, $3… variable references in the sling:internalRedirect property
Tip 3: Selectors can also be added to components using sling:include tags. To get a parent resource containing that information, use componentContent.getResource() in place of the currentPage.adaptTo(Resource.class) reference above
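When several selectors are stacked as described in Tip 2, the component needs all of them rather than just the first. The sketch below is a standalone version of the split logic from the JSP snippet. It assumes the resolution path info has the form ".{selector1}.{selector2}.html" and that no selector contains a dot. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: pulls every selector out of a resolution path info string.
final class SelectorSketch {

    static List<String> selectors(String resolutionPathInfo) {
        // ".2015.my-post.html" splits into ["", "2015", "my-post", "html"].
        String[] parts = resolutionPathInfo.split("\\.");
        if (parts.length <= 2) {
            return Collections.emptyList(); // only an extension, no selectors
        }
        // Drop the leading empty token and the trailing extension.
        return Arrays.asList(parts).subList(1, parts.length - 1);
    }

    public static void main(String[] args) {
        System.out.println(selectors(".2015.my-post.html"));
        System.out.println(selectors(".html"));
    }
}
```

With a mapping such as sling:match localhost.4502/posts/([^/]+)/([^/]+)$ and sling:internalRedirect /content/myproject/posts.$1.$2.html, a request for /posts/2015/my-post would reach the component with both values available as selectors.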
Summary: CQ offers a number of techniques to remap pages and URLs. While most of this can and should be done on a proxy web server, some powerful mapping tools are available directly in CQ. The topic is only superficially covered here, and any shared experiences or tips from readers are welcome in the comments below.

Wednesday, 11 November 2015

How is resource resolution done in Sling?

The following shows how a URL is resolved and mapped to a resource.

Consider the URL
GET – www.mywebsite.com/products/product1.printable.a4.html/a/b?x=12
Here the type of request will be HTTP GET request
We can break it down into its composite parts:

Sling's Request Processing Revisited

In Apache Sling each (http) request is mapped onto a JCR resource, i.e. a repository node. This is very different from other web frameworks you might be familiar with, say, Struts or Rails. In these frameworks a request is mapped onto a controller, i.e. a url really addresses application code, so the application developer usually implements some application logic that retrieves model data and passes it on to the view.
Just like Rails or Struts, Sling implements a model-view-controller architecture. However, in Sling a request addresses a piece of content. The mapping between request and model (data, content) is accomplished through the url so there is no need for further custom mapping logic.


Node selection
So, how does this work in detail? Consider an http GET request for the url:

/content/corporate/jobs/developer.html

First, Sling will look in the repository for a file located at exactly this location. If such a file is found, it will be streamed into the response as is. This behavior allows you to use Sling as a web server and store your web application's binary data in the repository as well.

However, if there is no file to be found Sling will look for a repository node located at:

/content/corporate/jobs/developer

(i.e. it drops the file extension). If this node cannot be found Sling will return the http code 404.
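The lookup order described above can be sketched in a few lines. This is a simplified model, not Sling's resolver: the repository is represented by a plain set of paths, and the class and method names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of node selection: exact path first, then the path without its extension.
final class NodeSelectionSketch {

    static Optional<String> resolve(Set<String> repositoryPaths, String requestPath) {
        // 1. A file stored at exactly the requested location wins.
        if (repositoryPaths.contains(requestPath)) {
            return Optional.of(requestPath);
        }
        // 2. Otherwise cut everything from the first dot in the last path segment.
        int lastSlash = requestPath.lastIndexOf('/');
        int dot = requestPath.indexOf('.', lastSlash + 1);
        if (dot >= 0) {
            String nodePath = requestPath.substring(0, dot);
            if (repositoryPaths.contains(nodePath)) {
                return Optional.of(nodePath);
            }
        }
        // 3. Nothing found: the caller would answer with a 404.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> repo = new HashSet<>(Arrays.asList("/content/corporate/jobs/developer"));
        System.out.println(resolve(repo, "/content/corporate/jobs/developer.html"));
        System.out.println(resolve(repo, "/content/corporate/jobs/manager.html"));
    }
}
```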

Script folders
The scripts that Sling uses to process http requests are stored in subfolders of "/apps". Those subfolders are usually of type nt:folder, but that's not a requirement.

Script selection
Nodes can have a special property named "sling:resourceType" that determines the resource type. Let us consider the simplest case (using the example request URL from above) and assume that the resource type is, say, "hr/job". The selected script will then be "/apps/hr/job/job.esp" (the last part of the resource type will have to be the file name). This works for GET requests and URLs ending in ".html".
Requests using other request methods, say POST, will cause Sling to look for the script at "/apps/hr/job/job.POST.esp". Request URLs ending in something else than ".html", say ".pdf", will make Sling look at "/apps/hr/job/job.pdf.esp". The convention to distinguish the two cases is that http methods are all uppercase and the extension of the request is all lowercase.
In a content-centric application the same content (aka nodes) must often be displayed in different variations, e.g. as a teaser view and as a detail view. In Sling this is achieved through selectors. The selector is specified in the URL, for example:

/content/corporate/jobs/developer.detail.html

For this URL Sling would locate the script at "/apps/hr/job/job.detail.esp"
If the selected resource has no special resource type a script will be looked up based on the content path. For example, the script for /content/corporate/jobs.html will be searched in /apps/corporate.
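The naming convention from the examples above can be written down as a small function. This sketch only covers the cases the text describes (a resource type under /apps, at most one selector, ".esp" scripts). The class and method names are hypothetical, and the method does not check whether the script actually exists.

```java
// Hypothetical helper: builds the script path implied by the naming convention above.
final class ScriptPathSketch {

    static String scriptPath(String resourceType, String method, String selector, String extension) {
        // The last segment of the resource type is the base file name, e.g. "job" for "hr/job".
        String label = resourceType.substring(resourceType.lastIndexOf('/') + 1);
        StringBuilder name = new StringBuilder(label);
        if (selector != null && !selector.isEmpty()) {
            name.append('.').append(selector);
        }
        // HTTP methods are upper case and only appear for non-GET requests.
        if (!"GET".equals(method)) {
            name.append('.').append(method);
        }
        // Request extensions are lower case and only appear when they are not "html".
        if (!"html".equals(extension)) {
            name.append('.').append(extension);
        }
        return "/apps/" + resourceType + "/" + name + ".esp";
    }

    public static void main(String[] args) {
        System.out.println(scriptPath("hr/job", "GET", null, "html"));
        System.out.println(scriptPath("hr/job", "POST", null, "html"));
        System.out.println(scriptPath("hr/job", "GET", null, "pdf"));
        System.out.println(scriptPath("hr/job", "GET", "detail", "html"));
    }
}
```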

Script engine
The ".esp" extension of the scripts used in the examples above indicates the script engine to use. ".esp" stands for ECMAScript and internally uses Rhino, Mozilla's JavaScript engine. Other supported extensions are ".rb" for JRuby scripts, ".jsp" for JSPs or ".jst" for client-side execution (".jst" denotes JavaScript template).

Some interesting special cases
The examples above describe rendering nodes as html or as a pdf. However, there are also some built-in renderers for json and txt. The corresponding node presentations are located at (using the example from above):

/content/corporate/jobs/developer.json
and
/content/corporate/jobs/developer.txt

respectively.
For http error handling (404 or 500) Sling will look for a script at "/apps/sling/servlet/errorhandler/404.esp" and 500.esp, respectively.


More on script selection
If you need to find out more about the details of script selection in Sling, have a look at Sling ticket 387, where developer Felix Meschberger explains a lot more about the script resolution process.

SLING 387:

According to the findings in the dev list thread "Simplifying script paths and names?" at [1] I would now like to propose the implementation of this change in script/servlet path resolution:

Note: This issue talks about scripts. But as servlets are mirrored into the virtual Resource Tree accessible through the ResourceResolver, servlets are treated exactly the same as scripts (or vice-versa actually). So the discussion applies to servlets as well as to scripts.

(1) Script Location

Scripts to handle the processing of a resource are looked up in a single location:

     {scriptPathPrefix}/{resourceTypePath}

Where {scriptPathPrefix} is an absolute path prefix (as per ResourceResolver.getSearchPath()) to get absolute paths and {resourceTypePath} is the resource type converted to a path. If the {resourceTypePath} is actually an absolute path, the {scriptPathPrefix} is not used.

Example: Given the search path [ "/apps", "/libs" ] and a resource type of sling:sample, the following locations will be searched for scripts:

     * /aps/sling/script
     * /libs/sling/script


(2) Within the location(s) found through above mechanism a script is searched whose script name matches the pattern
     {resourceTypeLabel}.{selectorString}.{requestMethod}.{requestExtension}.{scriptExtension}

where the fields have the following meaning:

     {resourceTypeLabel} - the last segment of the {resourceTypePath} (see above)
                    This part is required. Only scripts whose name starts with this name are considered
     {selectorString} - the selector string as per RequestPathInfo.getSelectorString
                    This part is optional. The more selectors of the selector string match, the
                    better.
     {requestMethod}
                    The request method name. This is optional for GET or HEAD requests
                    and is required for non-GET/non-HEAD requests
     {requestExtension}
                    The extension of the request. This is optional.
     {scriptExtension}
                     The extension indicating the script language. Not used for selecting
                     the script but for selecting the ScriptEngine. This is of course not existing
                     for servlets.

If multiple scripts would apply for a given request, the script with the best match is selected. Generally speaking a match is better if it is more specific. More in detail, a match with more selector matches is better than a match with less selector matches, regardless of any request extension or method name match.

For example, consider a request to resource /foo/bar.print.a4.html of type sling:sample. Assuming we have the following list of scripts in the correct location:

   (1) sample.esp
   (2) sample.GET.esp
   (3) sample.GET.html.esp
   (4) sample.html.esp
   (5) sample.print.esp
   (6) sample.print.a4.esp
   (7) sample.print.html.esp
   (8) sample.print.GET.html.esp
   (9) sample.print.a4.html.esp
   (10) sample.print.a4.GET.html.esp

It would probably be (10) - (9) - (6) - (8) - (7) - (5) - (3) - (4) - (2) - (1). Note that (6) is a better match than (8) because it matches more selectors even though (8) has a method name and extension match where (6) does not.

If there is a catch, e.g. between print.esp and print.jsp, the first script in the listing would be selected (of course, there should not be a catch...)
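The ordering in the example can be reproduced with a simple weighting. The sketch below is an illustration of the rules as stated in this ticket, not Sling's actual implementation. The weights (4 per matched selector, 2 for an extension match, 1 for a method match) are an assumption chosen so that selector matches always dominate, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ranking of script names of the form label[.selectors][.METHOD][.ext].scriptExt
final class ScriptRankingSketch {

    // Returns a weight for a script that can serve the request, or -1 if it cannot.
    static int weight(String script, String label, List<String> selectors,
                      String method, String extension) {
        String[] parts = script.split("\\.");
        if (parts.length < 2 || !parts[0].equals(label)) {
            return -1; // the name must start with the resource type label
        }
        int end = parts.length - 1; // last token is the script extension, not used for selection
        int i = 1;
        int matched = 0;
        while (i < end && matched < selectors.size() && parts[i].equals(selectors.get(matched))) {
            matched++;
            i++;
        }
        boolean methodMatch = i < end && parts[i].equals(method);
        if (methodMatch) i++;
        boolean extensionMatch = i < end && parts[i].equals(extension);
        if (extensionMatch) i++;
        if (i != end) {
            return -1; // leftover tokens that do not match this request
        }
        boolean getOrHead = method.equals("GET") || method.equals("HEAD");
        if (!getOrHead && !methodMatch) {
            return -1; // the method name is required for non-GET/non-HEAD requests
        }
        return matched * 4 + (extensionMatch ? 2 : 0) + (methodMatch ? 1 : 0);
    }

    // Applicable scripts, best match first; ties keep their listing order (stable sort).
    static List<String> rank(List<String> scripts, String label, List<String> selectors,
                             String method, String extension) {
        List<String> applicable = new ArrayList<>();
        for (String s : scripts) {
            if (weight(s, label, selectors, method, extension) >= 0) {
                applicable.add(s);
            }
        }
        applicable.sort((a, b) -> Integer.compare(
                weight(b, label, selectors, method, extension),
                weight(a, label, selectors, method, extension)));
        return applicable;
    }
}
```

Calling rank with the ten scripts listed above, label "sample", selectors "print" and "a4", method "GET" and extension "html" yields the order (10), (9), (6), (8), (7), (5), (3), (4), (2), (1).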

Sub-task: Update Servlet Resolution Description (Resolved)
To be clear: This new mechanism replaces the script resolution mechanism of today. As such this change is not backwards compatible and existing applications will have to be adapted.
Bertrand Delacretaz added a comment - 18/Apr/08 13:41
> Given the search path [ "/apps", "/libs" ] and a resource type of sling:sample, the following locations will be
> searched for scripts:

> * /aps/sling/script
> * /libs/sling/script

I think this should be

/apps/sling/sample
/lib/sling/sample

And I'm not sure what the initial */ means in your example, I thought the search paths were absolute.
Felix Meschberger added a comment - 18/Apr/08 13:53
the initial "*" in the lines is just a numbering symbol. It has no code significance. And yes, your fixes are correct.
David Nuescheler added a comment - 25/Apr/08 15:38
i think it is important that this change was originally suggested to
make the simple cases as simple and intuitive as possible for
the user of sling and not to come up with something that is really
easy and consistent to map for the sling implementation.

let me try to explain with an example:
as a user of sling i would like to have my app in /apps/myapp and lets say i have a node of resourceType "myapp/homepage" at "/content/myapp".

i would like to be able to structure my applications as follows:

(1) /apps/myapp/homepage/homepage.esp (or html.esp or GET.esp)
(2) /apps/myapp/homepage/edit.esp (or edit.html.esp)
(3) /apps/myapp/homepage/header/highlight.jpg.esp
(4) /apps/myapp/homepage/header/selected.jpg.esp
(5) /apps/myapp/homepage/header/small.jpg.esp

where

/content/myapp.html -> (1)
/content/myapp.edit.html -> (2)
/content/myapp.header.highlight.jpg -> (3)
/content/myapp.header.selected.jpg -> (4)
/content/myapp.header.small.jpg -> (5)

i think it is important that we avoid unnecessary repetition at any point
and we would allow for enough flexibility in the /apps directory allow
the user to come up with something short, distinct and meaningful.

I haven't found out how to hook a script to POST with a "delete" selector, tried POST.delete.html.esp and various other options but that didn't work.

Looking at the code, I think that might not work at all, and I didn't find tests related to this.
The correct script path would be .../delete/POST.esp

Request Selectors are ignored for non-GET/HEAD requests. Hence the POST-related test is expected to fail. This is related to issue (III) in [1] stating that not all request methods are equal.

Hence, I suggest to for the moment comment out these tests with a comment stating this situation and not implementing that support. Reason for this is, that a resource URL should address the resource (and at most give some hint for a specific representation, such as html or txt) but not include an operation such as delete.
                            
   * Refactored resolution of error handler servlets/scripts to use new mechanism
   * Removed unused methods and classes
   * Renamed helper classes to reflect their functionality

Closing this issue for now. Errors in the new implementation should be reported in new issues.

Currently, working with selectors requires you to put scripts in
subfolders, for example

/apps/foo/html.esp
/apps/foo/someselector/html.esp

and worse, all GET scripts which produce html are named html.esp,
which can be confusing when editing them.

We talked about this with David and Felix, here's a proposal for
simplifying those names in the "happy case", while keeping the current
conventions to resolve potential name conflicts where needed. Comments
welcome.

= Proposal =

The following variants should be accepted for script names, examples:

a) sling:resourceType=foo, request = bar.html

Sling searches for the following scripts and uses the first one found:

  /apps/foo/html.esp
  /apps/foo/foo.esp

The only change is that the script used for html rendering can
optionally be named foo.esp, to avoid having many scripts called
"html.esp" which is not practical when opening many of them in an
editor or IDE.

b) sling:resourceType=foo, request = bar.selector.html

The following scripts can be used to process this request, the first
one found being used:

/apps/foo/selector/html.esp
/apps/foo/selector.html.esp (same but with dots instead of a subfolder)
/apps/foo/selector.esp
/apps/foo/html.esp (not specific to the selector)
/apps/foo/foo.esp (not specific either)

In the "happy case" people would then just have those two scripts to
handle the above cases:

/apps/foo/foo.esp
/apps/foo/selector.esp

Protocol   Host     Content path         Selector(s)      Extension   Suffix   Param(s)
http://    myhost   /products/product1   .printable.a4    .html       /a/b     ?x=12
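The breakdown above can be approximated with plain string handling. This is a sketch under a simplifying assumption: the content path itself contains no dots, so the first dot starts the selectors. Sling decides where the content path ends by looking at the repository, which this example does not do. The class and field names are hypothetical.

```java
// Hypothetical decomposition of a request path into the parts named in the breakdown above.
final class UrlPartsSketch {
    final String contentPath;
    final String selectors;
    final String extension;
    final String suffix;
    final String query;

    private UrlPartsSketch(String contentPath, String selectors, String extension,
                           String suffix, String query) {
        this.contentPath = contentPath;
        this.selectors = selectors;
        this.extension = extension;
        this.suffix = suffix;
        this.query = query;
    }

    // Expects the part of the URL after the host, e.g. "/products/product1.printable.a4.html/a/b?x=12".
    static UrlPartsSketch parse(String pathAndQuery) {
        String path = pathAndQuery;
        String query = "";
        int q = path.indexOf('?');
        if (q >= 0) {
            query = path.substring(q + 1);
            path = path.substring(0, q);
        }
        int firstDot = path.indexOf('.');
        if (firstDot < 0) {
            return new UrlPartsSketch(path, "", "", "", query);
        }
        String contentPath = path.substring(0, firstDot);
        String rest = path.substring(firstDot + 1); // "printable.a4.html/a/b"
        String suffix = "";
        int slash = rest.indexOf('/');
        if (slash >= 0) {
            suffix = rest.substring(slash);
            rest = rest.substring(0, slash);
        }
        // The last dot-separated token is the extension, everything before it is selectors.
        int lastDot = rest.lastIndexOf('.');
        String selectors = lastDot < 0 ? "" : rest.substring(0, lastDot);
        String extension = rest.substring(lastDot + 1);
        return new UrlPartsSketch(contentPath, selectors, extension, suffix, query);
    }

    public static void main(String[] args) {
        UrlPartsSketch p = parse("/products/product1.printable.a4.html/a/b?x=12");
        System.out.println(p.contentPath + " | " + p.selectors + " | " + p.extension
                + " | " + p.suffix + " | " + p.query);
    }
}
```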