Searching Confluence content using the GSA
Here is an extract from an email I sent that discusses a few options:
- Utilizing GSA's search algorithms (recommended if your license count is high enough):
- The easiest option is to configure the GSA to crawl Confluence's web interface. The GSA can crawl any website, internal or external, and its crawler is excellent. For anonymous Confluence spaces this is no problem at all. For authenticated spaces (user/password), you can create a "GSAuser" account that has access to those spaces and enter its credentials in the GSA admin panel.
- The more controlled but more hands-on option is to "feed" content into the GSA through the GSA Feed API, writing the "feeders" in Java or another language.
- Utilizing Confluence's search algorithms:
You would write a small web application that receives the GSA's queries, calls Confluence's programmatic search API, and returns the hits in Google's OneBox results format. The interaction is: User (browser or application) <> GSA <> YourOneBoxProvider <> Confluence. You do not have to change much on the Confluence side.
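Both programmatic options above come down to handing the GSA some XML: a feed document for the Feed API, or a results document for a OneBox provider. Here is a rough Python sketch of building each. The function names, the "confluence" datasource name and the example URL are made up for illustration, and the element names (gsafeed, record, OneBoxResults, MODULE_RESULT and so on) are as I recall them from Google's feed and OneBox documentation, so check them against the docs for your GSA version. Fetching the pages or search hits from Confluence, and sending the XML to the GSA, are left out.

```python
import xml.etree.ElementTree as ET

# Prolog as recalled from Google's feed examples; verify against your GSA docs.
FEED_PROLOG = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">\n'
)


def build_feed_xml(datasource, pages):
    """Build a content feed from (url, title, body) tuples (hypothetical helper)."""
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "incremental"
    group = ET.SubElement(root, "group")
    for url, title, body in pages:
        # One record per Confluence page; the content is sent as plain text.
        record = ET.SubElement(group, "record", url=url, mimetype="text/plain")
        ET.SubElement(record, "content").text = f"{title}\n{body}"
    return FEED_PROLOG + ET.tostring(root, encoding="unicode")


def build_onebox_xml(provider, hits):
    """Build a OneBox response from (url, title) search hits (hypothetical helper)."""
    root = ET.Element("OneBoxResults")
    ET.SubElement(root, "provider").text = provider
    for url, title in hits:
        # One MODULE_RESULT per hit returned by Confluence's search.
        result = ET.SubElement(root, "MODULE_RESULT")
        ET.SubElement(result, "U").text = url
        ET.SubElement(result, "Title").text = title
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    page = ("http://wiki.example.com/display/DOC/Home", "Home", "Welcome to the wiki")
    print(build_feed_xml("confluence", [page]))
    print(build_onebox_xml("Confluence search", [page[:2]]))
```

In the feed case your feeder would push the document to the appliance; in the OneBox case your web application would return it as the response to the GSA's request.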
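Network security: make sure that the GSA does not provide a "backdoor" into the "protected" spaces. For example, content indexed through the "GSAuser" account should not show up in results for users who cannot see those spaces in Confluence. Hope this helps. Do feel free to ask more questions.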
