Fedora, GSearch, Solr [UPDATED]
I am at the Fedora Commons Red Island Repository Institute this week to learn more about the Fedora Commons repository, which should help with my digital repository work at WGBH. For the last week or so, I have been struggling to get Fedora and Solr to play nice with each other so that we can do really interesting searches, faceted browsing, and more.
The secret, I have discovered, is to start with a pre-populated Solr index (I don't know if this is strictly necessary, but it solved one of our major errors) and to run Solr from its included Jetty engine (again, possibly not necessary, but I don't have the Java application experience to delve into complex configurations).
[UPDATE]
Here are some step-by-step directions to get GSearch and Fedora to play nice together:
###gsearch###

1) Download GSearch 2.1.1 and copy the fedoragsearch.war file to $TOMCAT_HOME/webapps
2) Restart Tomcat to unpack the WAR
3) Edit configvalues.xml:
     Update soap.deploy.hostport, .user, and .pass
     Lines 29-31: replace "basic" with "solr" (in vi, :29,31s/basic/solr/g)
     Line 242: set solr.index.1.indexbase and .indexdir
4) ant -f configvalues.xml configOnWebServer
5) cd $TOMCAT_HOME/webapps/fedoragsearch/WEB-INF/classes/
6) cp -R configBase/updater configDemoOnSolr
7) Append to configDemoOnSolr/fedoragsearch.properties:
     fedoragsearch.updaterNames = BasicUpdaters
8) Edit configDemoOnSolr/index/DemoOnSolr/demoFoxmlToSolr.xslt as appropriate. Copy this file to config/index/DemoOnSolr.

###SOLR###

9) Download Solr 1.2 and unpack it to $FEDORA_HOME/solr-1.2
10) mkdir $FEDORA_HOME/solr
11) cp -R solr-1.2/example/solr/* $FEDORA_HOME/solr
12) cp solr-1.2/dist/apache-solr-1.2.0.war $FEDORA_HOME/solr/solr.war
13) Edit conf/schema.xml to reflect the schema choices you made in demoFoxmlToSolr.xslt
14) Create $TOMCAT_HOME/conf/Catalina/localhost/solr.xml
15) Restart Tomcat to start Solr
16) Use Solr to initialize the indexes
     a) Create a compatible Solr ingest XML file by running one of your FOXML files through your demoFoxmlToSolr.xslt file (maybe not necessary?), e.g.:
          xsltproc -o $FEDORA_HOME/tmp.xml tomcat/webapps/fedoragsearch/WEB-INF/classes/config/index/DemoOnSolr/demoFoxmlToSolr.xslt data/objects/2008/0808/14/47/wgbh_100
     b) cp $FEDORA_HOME/solr-1.2/example/exampledocs/post.sh $FEDORA_HOME/solr
     c) Edit post.sh to change the URL (e.g. URL=http://localhost:8080/solr/update)
     d) ./post.sh $FEDORA_HOME/tmp.xml && rm $FEDORA_HOME/tmp.xml

###DOES IT WORK?###

17) Go to http://localhost:8080/fedoragsearch/rest -> browseIndex. Does your object exist?
18) Check that the GSearch -> Solr integration works [updateIndex fromPid]
19) Import your current FOXML files with [updateIndex fromFoxmlFiles]
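Step 14 does not say what goes into solr.xml. Here is a minimal sketch of one way to write it, assuming the usual Tomcat context-fragment layout for a Solr webapp (a docBase pointing at the WAR, plus a solr/home environment entry). The function name write_solr_context is hypothetical, the paths simply follow steps 10-12, and the exact attributes your setup needs may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of GSearch or Solr): writes a Tomcat
# context fragment that points Tomcat at the solr.war copied in step 12
# and at the Solr home directory created in steps 10-11.
# Usage: write_solr_context "$TOMCAT_HOME" "$FEDORA_HOME"
write_solr_context() {
    tomcat_home=$1
    fedora_home=$2
    context_dir="$tomcat_home/conf/Catalina/localhost"

    # Create the per-host context directory if Tomcat has not done so yet.
    mkdir -p "$context_dir" || return 1

    # docBase is the WAR from step 12; solr/home is the directory that
    # holds conf/schema.xml (step 13). The paths are assumptions taken
    # from the steps above, so adjust them to your layout.
    cat > "$context_dir/solr.xml" <<EOF
<Context docBase="$fedora_home/solr/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="$fedora_home/solr" override="true"/>
</Context>
EOF
}
```

Calling write_solr_context "$TOMCAT_HOME" "$FEDORA_HOME" and then restarting Tomcat (step 15) should make Solr available under /solr, which is the path the post.sh URL in step 16c expects.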
This still doesn't take advantage of the JMS capabilities of Fedora 3.0, unfortunately; that's the next challenge.